# LLM Providers
PRX connects to large language models through providers -- pluggable backends that implement the `Provider` trait. Each provider handles authentication, request formatting, streaming, and error classification for a specific LLM API.
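The actual trait is internal to PRX; as a rough sketch (all type and method names here are hypothetical), a provider backend reduces to a stable identifier plus a completion call that can fail in provider-specific ways:

```rust
// Hypothetical sketch of a provider backend interface; PRX's real
// Provider trait may differ in method names and signatures.
pub struct Request {
    pub model: String,
    pub prompt: String,
}

pub struct Response {
    pub text: String,
}

#[derive(Debug)]
pub enum ProviderError {
    Auth(String),      // e.g. invalid API key (401/403)
    RateLimited,       // HTTP 429
    Transient(String), // 5xx, network timeouts
}

pub trait Provider {
    /// Stable identifier, e.g. "anthropic" or "ollama".
    fn name(&self) -> &str;
    /// Send a completion request and return the full response.
    fn complete(&self, req: &Request) -> Result<Response, ProviderError>;
}
```

Keeping authentication and error mapping behind one trait is what lets the fallback and routing layers below treat every backend interchangeably.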
PRX ships with 9 built-in providers, an OpenAI-compatible endpoint for third-party services, and infrastructure for fallback chains and intelligent routing.
## Capability Matrix
| Provider | Key Models | Streaming | Vision | Tool Use | OAuth | Self-hosted |
|---|---|---|---|---|---|---|
| Anthropic | Claude Opus 4, Claude Sonnet 4 | Yes | Yes | Yes | Yes (Claude Code) | No |
| OpenAI | GPT-4o, o1, o3 | Yes | Yes | Yes | No | No |
| Google Gemini | Gemini 2.0 Flash, Gemini 1.5 Pro | Yes | Yes | Yes | Yes (Gemini CLI) | No |
| OpenAI Codex | Codex models | Yes | No | Yes | Yes | No |
| GitHub Copilot | Copilot Chat models | Yes | No | Yes | Yes (Device Flow) | No |
| Ollama | Llama 3, Mistral, Qwen, any GGUF | Yes | Depends on model | Yes | No | Yes |
| AWS Bedrock | Claude, Titan, Llama | Yes | Depends on model | Depends on model | AWS IAM | No |
| GLM | GLM-4, Zhipu, Minimax, Moonshot, Qwen, Z.AI | Yes | Depends on model | Depends on model | Yes (Minimax/Qwen) | No |
| OpenRouter | 200+ models from multiple vendors | Yes | Depends on model | Depends on model | No | No |
| Custom Compatible | Any OpenAI-compatible API | Yes | Depends on endpoint | Depends on endpoint | No | Yes |
## Quick Configuration

Providers are configured in `~/.config/openprx/config.toml` (or `~/.openprx/config.toml`). At minimum, set the default provider and supply an API key:
```toml
# Select the default provider and model
default_provider = "anthropic"
default_model = "anthropic/claude-sonnet-4-6"
default_temperature = 0.7

# API key (can also be set via ANTHROPIC_API_KEY env var)
api_key = "sk-ant-..."
```

For self-hosted providers like Ollama, specify the endpoint:
```toml
default_provider = "ollama"
default_model = "llama3:70b"
api_url = "http://localhost:11434"
```

Each provider resolves its API key from the following sources, in order:
1. The `api_key` field in `config.toml`
2. A provider-specific environment variable (e.g., `ANTHROPIC_API_KEY`, `OPENAI_API_KEY`)
3. The generic `API_KEY` environment variable
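The lookup order above can be sketched as follows (the function name and signature are hypothetical, not PRX's actual API):

```rust
use std::env;

// Resolve an API key for a provider: the config value wins, then the
// provider-specific env var (e.g. ANTHROPIC_API_KEY), then generic API_KEY.
fn resolve_api_key(config_key: Option<&str>, provider_env: &str) -> Option<String> {
    if let Some(k) = config_key {
        if !k.is_empty() {
            return Some(k.to_string());
        }
    }
    env::var(provider_env)
        .ok()
        .or_else(|| env::var("API_KEY").ok())
}
```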
See Environment Variables for the full list of supported variables.
## Fallback Chains with ReliableProvider

PRX wraps provider calls in a `ReliableProvider` layer that provides:
- Automatic retry with exponential backoff for transient failures (5xx, 429 rate limits, network timeouts)
- Fallback chains -- when the primary provider fails, requests are automatically routed to the next provider in the chain
- Non-retryable error detection -- client errors like invalid API keys (401/403) and unknown models (404) fail fast without wasting retries
Configure reliability in the `[reliability]` section:

```toml
[reliability]
max_retries = 3
fallback_providers = ["openai", "gemini"]
```

When the primary provider (e.g., Anthropic) returns a transient error, PRX retries up to `max_retries` times with backoff. If all retries are exhausted, the request falls through to the first fallback provider. The chain continues until a response succeeds or all providers are exhausted.
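The retry-then-fallback behavior can be sketched roughly like this (a simplified model, not PRX's implementation; real code would also sleep with exponential backoff between attempts):

```rust
#[derive(Debug, Clone, PartialEq)]
enum CallError {
    Transient, // 5xx, 429, timeouts: worth retrying
    Fatal,     // 401/403/404: skip retries, move to next provider
}

// Try each provider in order; retry transient errors up to max_retries
// times per provider, but fail over immediately on fatal errors.
fn call_with_fallback<F>(providers: &[&str], max_retries: u32, mut call: F) -> Option<String>
where
    F: FnMut(&str) -> Result<String, CallError>,
{
    for provider in providers {
        let mut attempts = 0;
        loop {
            match call(provider) {
                Ok(resp) => return Some(resp),
                Err(CallError::Fatal) => break, // next provider in the chain
                Err(CallError::Transient) if attempts < max_retries => {
                    attempts += 1;
                    // exponential backoff sleep would go here
                }
                Err(CallError::Transient) => break, // retries exhausted
            }
        }
    }
    None
}
```

With `max_retries = 3`, a provider that keeps returning transient errors is tried four times in total (one initial call plus three retries) before the chain moves on.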
## Error Classification

The `ReliableProvider` classifies errors into two categories:
- Retryable: HTTP 5xx, 429 (rate limit), 408 (timeout), network errors
- Non-retryable: HTTP 4xx (except 429/408), invalid API keys, unknown models, malformed responses
Non-retryable errors skip retries and immediately fall through to the next provider, avoiding wasted latency.
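The status-code split above might look like the following (a sketch only; the real classifier also inspects network errors and response bodies):

```rust
#[derive(Debug, PartialEq)]
enum Retryability {
    Retryable,
    NonRetryable,
}

// Classify an HTTP status code per the rules above: 5xx, 429, and 408
// are retryable; every other 4xx fails fast.
fn classify_status(status: u16) -> Retryability {
    match status {
        429 | 408 => Retryability::Retryable,
        500..=599 => Retryability::Retryable,
        400..=499 => Retryability::NonRetryable,
        _ => Retryability::NonRetryable, // unexpected codes: don't retry blindly
    }
}
```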
## Router Integration
For advanced multi-model setups, PRX supports a heuristic LLM router that selects the optimal provider and model per request based on:
- Capability scoring -- matches query complexity to model strengths
- Elo rating -- tracks model performance over time
- Cost optimization -- prefers cheaper models for simple queries
- Latency weighting -- factors in response time
- KNN semantic routing -- uses historical query embeddings for similarity-based routing
- Automix escalation -- starts with a cheap model and escalates to a premium model when confidence is low
```toml
[router]
enabled = true
knn_enabled = true

[router.automix]
enabled = true
confidence_threshold = 0.7
premium_model_id = "anthropic/claude-sonnet-4-6"
```

See Router Configuration for full details.
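Automix escalation, in spirit: the cheap model answers first, and if its confidence score falls below the threshold, the query is re-run on the premium model. In this sketch the cheap-model ID and the confidence score are hypothetical; the threshold and premium model ID mirror the config above.

```rust
// Pick a model based on the cheap model's confidence in its own answer.
// Below the threshold, escalate to the configured premium model.
fn pick_model(confidence: f64, threshold: f64) -> &'static str {
    if confidence >= threshold {
        "cheap-model" // hypothetical first-pass model
    } else {
        "anthropic/claude-sonnet-4-6" // premium_model_id from [router.automix]
    }
}
```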