LLM Providers & Tools

Semcache is an HTTP proxy that works with a range of LLM providers and tools. It can also be configured to work with your own custom APIs.

Three Ways to Configure Providers

1. Preconfigured Default Routes

Semcache provides built-in routes for major LLM providers. Simply point your existing SDK to Semcache's base URL - no additional configuration needed. Each provider has a dedicated endpoint that automatically routes to the correct upstream API. See the Providers section below for specific examples.

2. Header-Based Provider Control

Use HTTP headers to override routing behavior while keeping existing API specifications:

x-llm-proxy-host

Override the upstream host while keeping the provider's API format.

For example, if you call the OpenAI route but set x-llm-proxy-host: https://api.deepseek.com, Semcache uses the same path and JSONPath configured for OpenAI, but sends the request to the DeepSeek host instead.

x-llm-proxy-upstream

Override the complete upstream URL for custom endpoints.

x-llm-prompt

Specify where to find the prompt in your request body using JSONPath syntax (illustrated in the sketch after this list):

  • $.messages[-1].content - Last message content (OpenAI/Anthropic default)
  • $.input.text - Custom field location
  • $.prompt - Simple prompt field
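
To make these JSONPath expressions concrete, here is a plain-Python sketch of what each path selects from a request body (the body shown is illustrative, not a required schema):

# Example OpenAI-style request body
body = {
    "model": "gpt-4",
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "What is the capital of France?"},
    ],
}

# $.messages[-1].content -> content of the last message (the user prompt)
prompt = body["messages"][-1]["content"]
print(prompt)  # "What is the capital of France?"

# $.prompt would instead select body["prompt"] in a body like {"prompt": "..."},
# and $.input.text would select body["input"]["text"].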

Order of operations

  • x-llm-proxy-host and x-llm-prompt take precedence over the defaults associated with each provider route
  • x-llm-proxy-upstream overrides both the defaults and anything specified in x-llm-proxy-host
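
As an illustration of this precedence, the following sketch sends both override headers to the OpenAI-format route; per the rules above, the request is forwarded to the URL named by x-llm-proxy-upstream (the endpoints shown are hypothetical):

import requests

# Both override headers are set; x-llm-proxy-upstream wins and the request is
# forwarded to the full URL it names (hypothetical endpoint for illustration).
response = requests.post(
    "http://semcache-host-here:8080/v1/chat/completions",  # Replace with your Semcache host
    headers={
        "Authorization": "Bearer your-api-key",
        "x-llm-proxy-host": "https://api.deepseek.com",
        "x-llm-proxy-upstream": "https://your-custom-llm.com/v1/chat/completions",
    },
    json={"messages": [{"role": "user", "content": "Hello"}]},
)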

Examples

Using x-llm-proxy-host to route to DeepSeek:

from openai import OpenAI

client = OpenAI(
    api_key="your-deepseek-key",
    base_url="http://semcache-host-here:8080",  # Replace with your Semcache host
    default_headers={
        "x-llm-proxy-host": "https://api.deepseek.com"
    }
)

Using x-llm-proxy-upstream and x-llm-prompt for custom LLMs:

curl -X POST http://semcache-host-here:8080/semcache/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer your-api-key" \
  -H "x-llm-proxy-upstream: https://your-custom-llm.com/api/v1/generate" \
  -H "x-llm-prompt: $.input.text" \
  -d '{
    "input": {
      "text": "What is the capital of France?",
      "max_tokens": 100
    }
  }'

3. Custom Generic Endpoint

For custom LLMs or providers we haven't implemented yet, use the generic endpoint /semcache/v1/chat/completions with appropriate headers:

import requests

response = requests.post(
    "http://semcache-host-here:8080/semcache/v1/chat/completions",
    headers={
        "Authorization": "Bearer your-custom-api-key",
        "Content-Type": "application/json",
        "x-llm-proxy-upstream": "https://your-llm-api.com/v1/complete",
        "x-llm-prompt": "$.query"
    },
    json={
        "query": "Explain quantum computing",
        "temperature": 0.7,
        "max_tokens": 500
    }
)

Note that you will need to set both x-llm-proxy-upstream and x-llm-prompt in order to use this endpoint.
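
Since Semcache acts as a transparent proxy, the response should be read exactly as you would read the upstream provider's response. Continuing the requests example above, a minimal sketch:

# Inspect the proxied (or cached) response; the body's shape is whatever your
# upstream API returns.
print(response.status_code)
print(response.json())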

Available Routes

Route                           Provider    Purpose
/v1/chat/completions            OpenAI      Default OpenAI format
/chat/completions               OpenAI      Alternative OpenAI format
/v1/messages                    Anthropic   Anthropic Claude API
/semcache/v1/chat/completions   Generic     Custom providers

Providers

These are the providers we have created default endpoints for. Remember, you can configure any provider that uses HTTP via the custom generic endpoint.

OpenAI

from openai import OpenAI

client = OpenAI(
    api_key="your-openai-key",
    base_url="http://localhost:8080"  # Point to Semcache
)

response = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "Hello!"}]
)
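
Because Semcache is a semantic cache, a follow-up request whose prompt is similar in meaning may be answered from the cache rather than forwarded to OpenAI. Whether a particular rephrasing hits depends on your Semcache configuration, so treat the following as an illustrative sketch:

# A semantically similar follow-up; depending on your similarity settings this
# may be served from the cache instead of being forwarded to OpenAI.
cached_response = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "Hi there!"}]
)
print(cached_response.choices[0].message.content)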

Anthropic

import anthropic

client = anthropic.Anthropic(
    api_key="your-anthropic-key",
    base_url="http://localhost:8080"  # Point to Semcache
)

response = client.messages.create(
    model="claude-3-sonnet-20240229",
    max_tokens=1000,
    messages=[{"role": "user", "content": "Hello!"}]
)

DeepSeek

from openai import OpenAI

client = OpenAI(
    api_key="your-deepseek-key",
    base_url="http://semcache-host-here:8080",  # Replace with your Semcache host
    default_headers={
        "x-llm-proxy-host": "https://api.deepseek.com"
    }
)

response = client.chat.completions.create(
    model="deepseek-chat",
    messages=[{"role": "user", "content": "Write a Python function"}]
)

Mistral

from openai import OpenAI

client = OpenAI(
    api_key="your-mistral-key",
    base_url="http://semcache-host-here:8080",  # Replace with your Semcache host
    default_headers={
        "x-llm-proxy-host": "https://api.mistral.ai"
    }
)

response = client.chat.completions.create(
    model="mistral-large-latest",
    messages=[{"role": "user", "content": "Explain machine learning"}]
)

Tools

The following examples show how to configure popular tools to use Semcache as an HTTP proxy.

LiteLLM

import litellm

# Configure LiteLLM to use Semcache as proxy
litellm.api_base = "http://semcache-host-here:8080"  # Replace with your Semcache host

# Use with different providers
response = litellm.completion(
    model="gpt-4",
    messages=[{"role": "user", "content": "Hello"}],
    headers={"x-llm-proxy-host": "https://api.deepseek.com"}
)

LangChain

from langchain_openai import ChatOpenAI

# Standard OpenAI through Semcache
llm = ChatOpenAI(
    model="gpt-4o",
    openai_api_base="http://semcache-host-here:8080",  # Replace with your Semcache host
    openai_api_key="your-openai-key"
)

# Custom provider through Semcache
llm_custom = ChatOpenAI(
    model="gpt-4o",
    openai_api_base="http://semcache-host-here:8080",  # Replace with your Semcache host
    openai_api_key="your-provider-key",
    default_headers={
        "x-llm-proxy-host": "https://api.your-provider.com"
    }
)

response = llm.invoke("What is semantic caching?")
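
Both clients are invoked the same way; the reply text of the returned message is available on .content. A short usage sketch:

# The custom-provider client is used identically; .content holds the reply text.
custom_response = llm_custom.invoke("What is semantic caching?")
print(response.content)
print(custom_response.content)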