
API

Proxy API endpoints

The following endpoints are all for use with Semcache operating in proxy mode, i.e. forwarding requests to your desired LLM provider.

OpenAI format

POST /v1/chat/completions

Request

  • Method: POST

  • Headers:

    | Header Name | Value | Required | Description |
    | --- | --- | --- | --- |
    | Content-Type | application/json | yes | Specifies that the body is JSON |
    | x-llm-proxy-upstream | https://full_path_to_desired_upstream.com/path | no | Overrides the default upstream associated with this endpoint |
    | x-llm-proxy-host | https://host_to_override_default.com | no | Overrides only the host part of the URL |
    | x-llm-proxy-prompt | $.json_path_of_prompt_field | no | Overrides the default prompt location |
  • Body (application/json):

      { "model": "gpt-4o", "messages": [{"role": "user", "content": "prompt?"}]}
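As a sketch, here is how a client might build a request to this endpoint through a local Semcache proxy. The proxy address and API key are placeholders, not part of Semcache itself; headers such as Authorization pass through to the upstream (see Headers below).

```python
import json
import urllib.request

# Assumed local Semcache proxy address; adjust to your deployment.
body = {
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "What is the capital of France?"}],
}

req = urllib.request.Request(
    "http://localhost:8080/v1/chat/completions",
    data=json.dumps(body).encode("utf-8"),
    headers={
        "Content-Type": "application/json",
        # Forwarded unchanged to the upstream, which needs it for auth.
        "Authorization": "Bearer YOUR_OPENAI_API_KEY",
    },
    method="POST",
)

# urllib.request.urlopen(req) would send the request; on a cache miss the
# proxy forwards it to the default upstream associated with this route.
```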

Anthropic format

POST /v1/messages

Request

  • Method: POST

  • Headers:

    | Header Name | Value | Required | Description |
    | --- | --- | --- | --- |
    | Content-Type | application/json | yes | Specifies that the body is JSON |
    | x-llm-proxy-upstream | https://full_path_to_desired_upstream.com/path | no | Overrides the default upstream associated with this endpoint |
    | x-llm-proxy-host | https://host_to_override_default.com | no | Overrides only the host part of the URL |
    | x-llm-proxy-prompt | $.json_path_of_prompt_field | no | Overrides the default prompt location |
  • Body (application/json):

      {
        "model": "claude-opus-4-20250514",
        "max_tokens": 1024,
        "messages": [
          {"role": "user", "content": "Hello, world"}
        ]
      }
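A sketch of sending this body through the proxy while overriding only the upstream host with x-llm-proxy-host. The proxy address and the gateway host below are placeholders; x-api-key and anthropic-version are standard Anthropic headers forwarded to the upstream.

```python
import json
import urllib.request

body = {
    "model": "claude-opus-4-20250514",
    "max_tokens": 1024,
    "messages": [{"role": "user", "content": "Hello, world"}],
}

# Assumed local Semcache proxy address; the host override is also a placeholder.
req = urllib.request.Request(
    "http://localhost:8080/v1/messages",
    data=json.dumps(body).encode("utf-8"),
    headers={
        "Content-Type": "application/json",
        # Optional: keep the default path but send to a different host,
        # e.g. a self-hosted Anthropic-compatible gateway.
        "x-llm-proxy-host": "https://anthropic-gateway.example.com",
        "x-api-key": "YOUR_ANTHROPIC_API_KEY",  # forwarded to the upstream
        "anthropic-version": "2023-06-01",
    },
    method="POST",
)
```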

Generic format

POST /semcache/v1/chat/completions

Request

  • Method: POST

  • Headers:

    | Header Name | Value | Required | Description |
    | --- | --- | --- | --- |
    | Content-Type | application/json | yes | Specifies that the body is JSON |
    | x-llm-proxy-upstream | https://full_path_to_desired_upstream.com/path | yes | Sets the upstream to forward requests to |
    | x-llm-proxy-host | https://host_to_override_default.com | no | Overrides only the host part of the URL |
    | x-llm-proxy-prompt | $.json_path_of_prompt_field | yes | Sets the JSONPath of the cache key |
  • Body (application/json):

      { "query": "string to use as key for cache lookup"}
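Because this route has no default upstream or prompt location, both x-llm-proxy-upstream and x-llm-proxy-prompt must be set on every request. A sketch, in which the Semcache address and the upstream URL are placeholders:

```python
import json
import urllib.request

body = {"query": "string to use as key for cache lookup"}

# Both the Semcache address and the upstream URL below are placeholders.
req = urllib.request.Request(
    "http://localhost:8080/semcache/v1/chat/completions",
    data=json.dumps(body).encode("utf-8"),
    headers={
        "Content-Type": "application/json",
        # Required: full URL the request is forwarded to on a cache miss.
        "x-llm-proxy-upstream": "https://api.example.com/v1/answer",
        # Required: JSONPath of the field whose value is the cache key.
        "x-llm-proxy-prompt": "$.query",
    },
    method="POST",
)
```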

Headers

For guidance on when to set the x-llm-* headers, refer to LLM Providers & Tools.

All other headers sent to the proxy are forwarded unchanged to the upstream on outgoing calls. This means you may need to include authentication headers, or other metadata headers your upstream requires to properly understand the request.

Outgoing request body

In the event of a cache miss, the incoming request body is sent as-is to the proxy upstream.

Response

In the event of a cache miss, we will return the upstream's response unmodified.

In the event of a cache hit, we will return the stored value matched to the key specified by x-llm-proxy-prompt (or the default associated with the specific route). If the stored value came from a different LLM provider, your client must be able to handle that provider's response format.

Cache aside API endpoints

We also expose endpoints that let you use Semcache in a cache-aside manner.

Write to cache

PUT /semcache/v1/put

Request

  • Method: PUT

  • Headers:

    | Header Name | Value | Required | Description |
    | --- | --- | --- | --- |
    | Content-Type | application/json | yes | Specifies that the body is JSON |
  • Body (application/json):

      { "key": "What is the capital of France?", "data": "Paris"}
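A minimal cache-aside write, sketched with Python's standard library. The local Semcache address is an assumption; adjust it to your deployment.

```python
import json
import urllib.request

# Assumed local Semcache address; adjust to your deployment.
req = urllib.request.Request(
    "http://localhost:8080/semcache/v1/put",
    data=json.dumps(
        {"key": "What is the capital of France?", "data": "Paris"}
    ).encode("utf-8"),
    headers={"Content-Type": "application/json"},
    method="PUT",
)

# urllib.request.urlopen(req) performs the write; a 200 status confirms
# the entry was stored or updated.
```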

Response

  • Status Codes:

    | Code | Meaning | When It Occurs |
    | --- | --- | --- |
    | 200 | OK | Cache entry was successfully written or updated |
    | 500 | Internal Server Error | An unexpected server error occurred |

Read from cache

POST /semcache/v1/get

Request

  • Method: POST

  • Headers:

    | Header Name | Value | Required | Description |
    | --- | --- | --- | --- |
    | Content-Type | application/json | yes | Specifies that the body is JSON |
  • Body (application/json):

      { "key": "What is the capital of France?"}

  • Status Codes:

    | Code | Meaning | When It Occurs |
    | --- | --- | --- |
    | 200 | OK | A matching cache entry was found and returned |
    | 404 | Not Found | No corresponding cache entry was found |
    | 500 | Internal Server Error | An unexpected server error occurred |
  • Body (application/json):

     "Paris"
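A sketch of a cache-aside read. The local Semcache address, and the read path shown, are assumptions; adjust both to your deployment.

```python
import json
import urllib.request

# Assumed local Semcache address and read path; adjust to your deployment.
req = urllib.request.Request(
    "http://localhost:8080/semcache/v1/get",
    data=json.dumps({"key": "What is the capital of France?"}).encode("utf-8"),
    headers={"Content-Type": "application/json"},
    method="POST",
)

# urllib.request.urlopen(req) would perform the lookup: a 200 response body
# holds the cached value (e.g. "Paris"), while an HTTPError with code 404
# signals a miss, after which you would call the LLM and PUT the result back.
```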