Automix Router

The Automix router optimizes for cost by starting every query with a cheap model and escalating to a premium model only when the initial response's confidence is below a threshold.

How It Works

Initial query -- send the query to the cheap model
Confidence check -- evaluate the response's confidence score
Escalate if needed -- if confidence is below the threshold, re-query with the premium model
Return -- return the first confident response
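
The steps above can be sketched as a small routing loop. This is a minimal illustration, not the router's actual implementation: `route`, `Completion`, and the `call_model` and `score_confidence` callables are hypothetical names introduced here. This page does not say what happens when the premium response is also below the threshold, so the sketch assumes the last response is returned once `max_escalations` is exhausted.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Completion:
    model: str         # model that produced the response
    text: str          # response text
    confidence: float  # confidence score assigned to the response


def route(
    query: str,
    call_model: Callable[[str, str], str],     # hypothetical: (model, query) -> response text
    score_confidence: Callable[[str], float],  # hypothetical: response text -> score
    cheap_model: str = "anthropic/claude-haiku",
    premium_model: str = "anthropic/claude-opus-4-6",
    confidence_threshold: float = 0.7,
    max_escalations: int = 1,
) -> Completion:
    # Initial query: always start with the cheap model.
    text = call_model(cheap_model, query)
    result = Completion(cheap_model, text, score_confidence(text))

    escalations = 0
    # Confidence check and escalation: re-query the premium model while the
    # score is below the threshold and the escalation budget is not used up.
    while result.confidence < confidence_threshold and escalations < max_escalations:
        text = call_model(premium_model, query)
        result = Completion(premium_model, text, score_confidence(text))
        escalations += 1

    # Return: the first confident response, or the last one tried (assumption).
    return result
```

Passing the model call and the scorer as parameters keeps the loop independent of any particular provider client, which also makes it easy to exercise with stubs.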

Confidence Scoring

Confidence is assessed based on:

Self-reported confidence in the response
Presence of hedging language ("I'm not sure", "could be")
Token-level entropy of the response
Tool call success rate
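
One way the four signals listed above could be combined into a single score is a weighted average. This page does not specify how Automix weighs them, so everything in this sketch is an assumption: the function names, the phrase list, the weights, and the choice to normalize each signal to the range 0 to 1 (higher meaning more confident).

```python
import math
from typing import Sequence

# Illustrative phrase list; the real list is not documented here.
HEDGING_PHRASES = ("i'm not sure", "could be", "might be")


def hedging_score(text: str) -> float:
    """Return 0.0 if any hedging phrase appears in the text, else 1.0."""
    lowered = text.lower()
    return 0.0 if any(phrase in lowered for phrase in HEDGING_PHRASES) else 1.0


def entropy_score(token_distributions: Sequence[Sequence[float]]) -> float:
    """Return 1 minus the mean normalized entropy of per-token distributions."""
    if not token_distributions:
        return 1.0
    normalized = []
    for dist in token_distributions:
        entropy = -sum(p * math.log(p) for p in dist if p > 0)
        # Normalize by the maximum entropy for this many outcomes.
        max_entropy = math.log(len(dist)) if len(dist) > 1 else 1.0
        normalized.append(entropy / max_entropy)
    return 1.0 - sum(normalized) / len(normalized)


def confidence(
    self_reported: float,
    text: str,
    token_distributions: Sequence[Sequence[float]],
    tool_calls_ok: int,
    tool_calls_total: int,
    weights: Sequence[float] = (0.4, 0.2, 0.2, 0.2),  # hypothetical weights
) -> float:
    """Weighted average of the four signals, each in [0, 1]."""
    # With no tool calls at all, treat the tool signal as neutral-positive.
    tool_score = tool_calls_ok / tool_calls_total if tool_calls_total else 1.0
    signals = (
        self_reported,
        hedging_score(text),
        entropy_score(token_distributions),
        tool_score,
    )
    return sum(w * s for w, s in zip(weights, signals)) / sum(weights)
```

The resulting score is what would be compared against `confidence_threshold`.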

Configuration

```toml
[router]
strategy = "automix"

[router.automix]
enabled = true
confidence_threshold = 0.7
cheap_model = "anthropic/claude-haiku"
premium_model = "anthropic/claude-opus-4-6"
max_escalations = 1
```

Cost Savings

In typical usage, Automix routes 60-80% of queries to the cheap model, achieving significant cost savings while maintaining quality for complex queries.
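
The size of the saving depends on the price gap between the two models. The sketch below estimates it under two assumptions not stated on this page: the quoted fraction is the share of queries answered by the cheap model without escalation, and an escalated query pays for both the cheap and the premium call. The function names and cost parameters are hypothetical; plug in per-query costs from your own pricing.

```python
def expected_cost_per_query(
    cheap_cost: float, premium_cost: float, cheap_fraction: float
) -> float:
    """Average cost per query when every query hits the cheap model first."""
    if not 0.0 <= cheap_fraction <= 1.0:
        raise ValueError("cheap_fraction must be between 0 and 1")
    # Every query pays the cheap call; escalated queries also pay the premium call.
    return cheap_cost + (1.0 - cheap_fraction) * premium_cost


def savings_vs_premium_only(
    cheap_cost: float, premium_cost: float, cheap_fraction: float
) -> float:
    """Fractional saving relative to sending every query to the premium model."""
    blended = expected_cost_per_query(cheap_cost, premium_cost, cheap_fraction)
    return 1.0 - blended / premium_cost
```

Under these assumptions the router only saves money when `cheap_cost < cheap_fraction * premium_cost`; otherwise the extra cheap call on escalated queries outweighs the benefit.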
