Auto-router
Set model to "haimaker/auto" and the auto-router picks the right model for each request. It looks at what the request actually needs (vision, tool use, long context) and matches keywords in the prompt against rules you define.
You configure it in the dashboard -- set up a model pool, add routing rules, and assign the router to your API keys. No code changes required beyond swapping the model name.
Quick start
cURL

```bash
curl https://api.haimaker.ai/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "haimaker/auto",
    "messages": [{"role": "user", "content": "Write a Python function to sort a list"}]
  }'
```

Python

```python
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_API_KEY",
    base_url="https://api.haimaker.ai/v1"
)

response = client.chat.completions.create(
    model="haimaker/auto",
    messages=[{"role": "user", "content": "Write a Python function to sort a list"}]
)
print(response.choices[0].message.content)
```

Node.js

```javascript
import OpenAI from 'openai';

const client = new OpenAI({
  apiKey: 'YOUR_API_KEY',
  baseURL: 'https://api.haimaker.ai/v1'
});

const response = await client.chat.completions.create({
  model: 'haimaker/auto',
  messages: [{ role: 'user', content: 'Write a Python function to sort a list' }]
});
console.log(response.choices[0].message.content);
```
The response model field contains the actual model that handled the request (e.g., moonshotai/kimi-k2.5), not haimaker/auto.
How routing works
The router evaluates requests in three layers, in order:
1. Capability detection
The router inspects the request and filters the model pool to only models that can handle it:
| Detected capability | Request signal | Pool filter |
|---|---|---|
| Vision | Message contains image_url content | supports_vision |
| Tool use | Request has tools or functions | supports_function_calling |
| Structured output | response_format.type is json_schema | supports_response_schema |
| Audio input | Message contains input_audio content | supports_audio_input |
| PDF input | Message contains file or document content | supports_pdf_input |
| Web search | Tool with type web_search or web_search_preview | supports_web_search |
| Long context | Estimated token count exceeds model limit | max_input_tokens (with 10% safety buffer) |
This always runs first. If a keyword rule matches but the target model can't handle the request (e.g., the prompt says "python" but includes an image, and the coding model doesn't support vision), that rule is skipped.
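The capability checks above can be sketched as a small filter over an OpenAI-style request dict. This is illustrative only -- the function names and the pool's capabilities field are assumptions, and the real router additionally runs a token estimate for the long-context check:

```python
def required_capabilities(request):
    """Infer which capabilities an OpenAI-style request needs."""
    caps = set()
    for message in request.get("messages", []):
        content = message.get("content")
        if not isinstance(content, list):
            continue  # plain-string content carries no special parts
        for part in content:
            kind = part.get("type")
            if kind == "image_url":
                caps.add("supports_vision")
            elif kind == "input_audio":
                caps.add("supports_audio_input")
            elif kind in ("file", "document"):
                caps.add("supports_pdf_input")
    if request.get("tools") or request.get("functions"):
        caps.add("supports_function_calling")
    for tool in request.get("tools") or []:
        if tool.get("type") in ("web_search", "web_search_preview"):
            caps.add("supports_web_search")
    if (request.get("response_format") or {}).get("type") == "json_schema":
        caps.add("supports_response_schema")
    return caps


def filter_pool(pool, caps):
    # Keep only models that advertise every required capability
    return [m for m in pool if caps <= set(m["capabilities"])]
```

A request carrying an image and tools, for example, narrows the pool to models flagged with both supports_vision and supports_function_calling.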
2. Keyword and capability rules
Each rule has:
- Keywords matched against the last user message
- Keyword categories (pre-built bundles like "Code & Development")
- Required capabilities (optional -- the rule only fires if the request needs those capabilities)
- A target model
The router counts how many of each rule's keywords appear in the last user message. The rule with the most matches wins. If two rules tie on match count, the one with higher priority (lower order number) wins.
Keywords that also appear in the system message are automatically deweighted -- they count at 25% of their normal value. This prevents tool descriptions and agent instructions in the system prompt from triggering false keyword matches on every turn. A rule needs at least 0.5 effective score to trigger at all.
This means a prompt like "evaluate this code, debug the const" routes to the coding model (3 matches: "code", "debug", "const") rather than the reasoning model (1 match: "evaluate"), even if the reasoning rule has higher priority.
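The scoring described above can be sketched as follows. This is a hedged illustration, assuming word-boundary regex matching and the 25% system-message deweight; the rule fields (keywords, order, target) are assumptions about shape, not the real schema:

```python
import re


def score_rule(keywords, user_msg, system_msg="", system_weight=0.25):
    """Count keyword hits in the last user message, deweighting
    keywords that also appear in the system message to 25%."""
    score = 0.0
    for kw in keywords:
        pattern = re.compile(rf"\b{re.escape(kw)}\b", re.IGNORECASE)
        if pattern.search(user_msg):
            score += system_weight if pattern.search(system_msg) else 1.0
    return score


def pick_rule(rules, user_msg, system_msg=""):
    """Highest effective score wins; ties go to the lower order number.
    A rule needs at least 0.5 effective score to fire at all."""
    best = None
    for rule in sorted(rules, key=lambda r: r["order"]):
        s = score_rule(rule["keywords"], user_msg, system_msg)
        if s >= 0.5 and (best is None or s > best[0]):
            best = (s, rule)
    return best[1] if best else None
```

Iterating in priority order with a strict greater-than comparison is what makes the lower order number win a tie.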
3. Default model
If nothing matches, the default model handles the request. Pick something general-purpose here.
Setting up a router
1. Create a router
Go to Auto-Router in the dashboard and click Create New. A default configuration is created with sensible routing rules that you can customize.
2. Configure rules
Each rule maps a set of keywords or capabilities to a target model. The dashboard provides pre-built keyword categories:
| Category | What it matches |
|---|---|
| Code & Development | python, javascript, const, async, await, debug, api, react, npm, and more |
| Complex Reasoning | analyze, evaluate, step-by-step, pros and cons, strategy |
| Simple & Conversational | hello, what is, define, summarize, translate |
| Creative Writing | story, poem, creative, narrative, blog post |
| Data & Analysis | csv, json, chart, statistics, regression, dashboard |
| Math & Science | equation, calculus, proof, physics, chemistry, probability |
Select one or more categories, optionally add your own custom keywords on top, and pick the target model. Rules are ordered by priority -- drag them up or down to reorder.
3. Add capability-based rules
You can also create rules that trigger based on what the request contains, independent of keywords. For example:
- Route all image requests to a vision-capable model
- Route all function-calling requests to a model with strong tool use
- Route structured output requests to a model that handles JSON schemas well
Set the required capabilities on a rule and leave keywords empty to match on capability alone.
The reasoning capability is special. There's no way to auto-detect "this prompt needs a reasoning model" from the request structure, so it always passes the capability check. Use it as a label alongside keywords -- e.g., a rule with keywords "analyze, evaluate" and capability "reasoning" routes analytical prompts to a reasoning model.
4. Assign to API keys
In the dashboard, edit an API key and select your auto-router from the dropdown. Any request from that key using model: "haimaker/auto" will use your router configuration.
Keyword matching
Keywords are matched against the last user message only. Since each API call carries the full conversation history, the latest message reflects the current intent.
Matching uses word boundaries, not substring search. The keyword "python" matches "write python code" but not "pythonic". Matching is case-insensitive.
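As an illustration, this is the behavior a word-boundary regex gives you (the router's actual implementation may differ):

```python
import re


def keyword_matches(keyword, text):
    # Word-boundary, case-insensitive match
    return re.search(rf"\b{re.escape(keyword)}\b", text, re.IGNORECASE) is not None


keyword_matches("python", "Write Python code")    # True
keyword_matches("python", "a pythonic approach")  # False
```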
When you select keyword categories, the category's keywords are expanded and merged with any custom keywords you add. The expanded set is stored for matching, while the original category selection is preserved so the dashboard can reconstruct it.
Testing your configuration
The dashboard includes a test sandbox. Type a sample prompt and see which model the router would select, which rule matched, and what capabilities were detected. This calls the simulate endpoint without making an actual LLM request, so it's free and fast.
Use the Test sandbox tab in your router's configuration page. Type a prompt and click Test.
Observability
The response includes headers for debugging:
| Header | Value |
|---|---|
| x-auto-router-rule-id | ID of the matched rule, or "default" |
| x-auto-router-reason | keyword-match, capability-fallback, or default |
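You can pull these out with a small helper; the function name is ours, and the commented usage assumes the openai Python SDK's with_raw_response accessor:

```python
def routing_info(headers):
    """Summarize the auto-router debug headers from a header mapping."""
    return {
        "rule_id": headers.get("x-auto-router-rule-id", "default"),
        "reason": headers.get("x-auto-router-reason", "default"),
    }


# With the openai SDK, raw headers are exposed via with_raw_response:
#   raw = client.chat.completions.with_raw_response.create(
#       model="haimaker/auto",
#       messages=[{"role": "user", "content": "hi"}],
#   )
#   print(routing_info(raw.headers))
#   response = raw.parse()  # the parsed completion object
```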
The spend log metadata field records routing decisions:
```json
{
  "auto_routed_from": "haimaker/auto",
  "auto_routed_model": "moonshotai/kimi-k2.5",
  "auto_routing_trigger": "rule:abc-123",
  "auto_routing_keyword": "python"
}
```
Cost tracking and rate limits apply to the resolved model, not haimaker/auto.
Limits and edge cases
No recursive routing. You can't add haimaker/auto to a router's model pool or use it as a rule target. The API rejects this, and there's a runtime check as a safety net.
One router per key. Each API key can have one auto-router assigned. If no router is assigned, requests to haimaker/auto return an error.
Context length safety buffer. Token estimation uses a 10% safety margin. A model is only considered if estimated_tokens < max_input_tokens * 0.9. This avoids routing to a model that then fails at the provider with a context length error.
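The buffer check amounts to the following (a sketch; the function name is ours):

```python
def fits_context(estimated_tokens, max_input_tokens, buffer=0.10):
    # A model qualifies only if the estimate fits under the safety margin
    return estimated_tokens < max_input_tokens * (1 - buffer)


fits_context(115_000, 128_000)  # True:  115,000 < 115,200
fits_context(120_000, 128_000)  # False: 120,000 > 115,200
```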
Caching. Router configurations are cached for 60 seconds. Changes you make in the dashboard take up to a minute to take effect.