Custom Compatible
Connect PRX to any LLM API that follows the OpenAI Chat Completions format. Works with LiteLLM, vLLM, Groq, Mistral, xAI, Venice, Vercel AI, Cloudflare AI, HuggingFace Inference, and any other OpenAI-compatible service.
Prerequisites
- A running LLM API that implements the OpenAI Chat Completions format (
/v1/chat/completionsor/chat/completions) - An API key (if required by the service)
Quick Setup
1. Identify Your Endpoint
Determine the base URL and authentication method for your API. For example:
- Groq:
https://api.groq.com/openai/v1 - Mistral:
https://api.mistral.ai/v1 - xAI:
https://api.x.ai/v1 - Local vLLM:
http://localhost:8000/v1 - LiteLLM proxy:
http://localhost:4000
2. Configure
[default]
provider = "compatible"
model = "your-model-name"
[providers.compatible]
api_key = "${YOUR_API_KEY}"
api_url = "https://api.your-provider.com/v1"3. Verify
prx doctor modelsBuilt-in Compatible Providers
PRX includes pre-configured aliases for popular OpenAI-compatible services:
| Provider Name | Aliases | Base URL | Auth Style |
|---|---|---|---|
| Venice | venice | https://api.venice.ai | Bearer |
| Vercel AI | vercel, vercel-ai | https://api.vercel.ai | Bearer |
| Cloudflare AI | cloudflare, cloudflare-ai | https://gateway.ai.cloudflare.com/v1 | Bearer |
| Groq | groq | https://api.groq.com/openai/v1 | Bearer |
| Mistral | mistral | https://api.mistral.ai/v1 | Bearer |
| xAI | xai, grok | https://api.x.ai/v1 | Bearer |
| Qianfan | qianfan, baidu | https://aip.baidubce.com | Bearer |
| Synthetic | synthetic | https://api.synthetic.com | Bearer |
| OpenCode Zen | opencode, opencode-zen | https://opencode.ai/zen/v1 | Bearer |
| LiteLLM | litellm, lite-llm | configurable | Bearer |
| vLLM | vllm, v-llm | configurable | Bearer |
| HuggingFace | huggingface, hf | configurable | Bearer |
Configuration Reference
| Field | Type | Default | Description |
|---|---|---|---|
api_key | string | optional | API authentication key |
api_url | string | required | Base URL of the API endpoint |
model | string | required | Model name/ID to use |
auth_style | string | "bearer" | Authentication header style (see below) |
Authentication Styles
| Style | Header Format | Usage |
|---|---|---|
bearer | Authorization: Bearer <key> | Most providers (default) |
x-api-key | x-api-key: <key> | Some Chinese providers |
custom | Custom header name | Special cases |
Features
Automatic Endpoint Detection
PRX automatically appends /chat/completions to your base URL. You do not need to include the endpoint path:
# Correct - PRX appends /chat/completions
api_url = "https://api.groq.com/openai/v1"
# Also correct - explicit path works too
api_url = "https://api.groq.com/openai/v1/chat/completions"Responses API Fallback
For providers that support OpenAI's newer Responses API, PRX can fall back to /v1/responses when /v1/chat/completions returns a 404. This is enabled by default but can be disabled for providers that do not support it (e.g., GLM/Zhipu).
Native Tool Calling
Tools are sent in OpenAI's standard function-calling format:
{
"type": "function",
"function": {
"name": "tool_name",
"description": "Tool description",
"parameters": { "type": "object", "properties": {...} }
}
}The provider supports tool_choice: "auto" and properly deserializes structured tool_calls responses.
Vision Support
For vision-capable models, images embedded in messages as [IMAGE:data:image/png;base64,...] markers are automatically converted to the OpenAI vision format with image_url content blocks.
Streaming Support
The compatible provider supports SSE streaming for real-time token delivery. Stream events are parsed incrementally with support for:
delta.contenttext chunksdelta.tool_callsfor incremental tool call construction[DONE]marker detection- Graceful timeout handling
System Message Merging
Some providers (e.g., MiniMax) reject role: system messages. PRX can automatically merge system message content into the first user message. This is enabled by default for known incompatible providers.
HTTP/1.1 Forced Mode
Some providers (notably DashScope/Qwen) require HTTP/1.1 instead of HTTP/2. PRX automatically detects these endpoints and forces HTTP/1.1 for connection reliability.
Reasoning Content Fallback
For reasoning models that return output in reasoning_content instead of content, PRX automatically falls back to extract the reasoning text.
Advanced Configuration
Local LLM Server (vLLM, llama.cpp, etc.)
[default]
provider = "compatible"
model = "meta-llama/Llama-3.1-8B-Instruct"
[providers.compatible]
api_url = "http://localhost:8000/v1"
# No api_key needed for local serversLiteLLM Proxy
[default]
provider = "litellm"
model = "gpt-4o"
[providers.litellm]
api_key = "${LITELLM_API_KEY}"
api_url = "http://localhost:4000"Multiple Custom Providers
Use the model router to configure multiple compatible providers:
[default]
provider = "openrouter"
model = "anthropic/claude-sonnet-4"
[[model_routes]]
pattern = "groq/*"
provider = "compatible"
api_url = "https://api.groq.com/openai/v1"
api_key = "${GROQ_API_KEY}"
[[model_routes]]
pattern = "mistral/*"
provider = "compatible"
api_url = "https://api.mistral.ai/v1"
api_key = "${MISTRAL_API_KEY}"Troubleshooting
Connection refused
Ensure the API endpoint is reachable:
curl -v https://api.your-provider.com/v1/models401 Unauthorized
- Verify your API key is correct
- Check the authentication style matches your provider (Bearer vs x-api-key)
- Some providers require additional headers; use a named provider alias if available
"role: system" rejected
If your provider does not support system messages, PRX should handle this automatically for known providers. For custom endpoints, this is a provider limitation. Workaround: include system instructions in the first user message.
Streaming not working
Not all OpenAI-compatible APIs support streaming. If streaming fails, PRX falls back to non-streaming mode automatically.
Model not found
Verify the exact model name/ID your provider expects. Different providers use different naming conventions:
- Groq:
llama-3.3-70b-versatile - Mistral:
mistral-large-latest - xAI:
grok-2
Check your provider's documentation for the correct model identifiers.