Auto-router

Send model: "haimaker/auto" and the auto-router picks the right model for each request. It looks at what the request actually needs (vision, tool use, long context) and matches keywords in the prompt against rules you define.

You configure it in the dashboard -- set up a model pool, add routing rules, and assign the router to your API keys. No code changes required beyond swapping the model name.

Quick start

curl https://api.haimaker.ai/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "haimaker/auto",
    "messages": [{"role": "user", "content": "Write a Python function to sort a list"}]
  }'

The model field in the response contains the actual model that handled the request (e.g., moonshotai/kimi-k2.5), not haimaker/auto.
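In code, the only change is the model name; the routed model comes back in the response body. A minimal Python sketch (the response dict below is a hand-written sample of the shape, not live API output):

```python
def build_request(prompt: str) -> dict:
    """Build a chat completion payload that opts into auto-routing."""
    return {
        "model": "haimaker/auto",
        "messages": [{"role": "user", "content": prompt}],
    }

def resolved_model(response: dict) -> str:
    """The response's model field reports the model that actually served the request."""
    return response["model"]

payload = build_request("Write a Python function to sort a list")

# Abridged sample response; a real one comes back from the API.
sample = {"model": "moonshotai/kimi-k2.5", "choices": []}
print(resolved_model(sample))  # the routed model, not "haimaker/auto"
```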

How routing works

The router evaluates requests in three layers, in order:

1. Capability detection

The router inspects the request and filters the model pool to only models that can handle it:

| Detected capability | Request signal | Pool filter |
| --- | --- | --- |
| Vision | Message contains image_url content | supports_vision |
| Tool use | Request has tools or functions | supports_function_calling |
| Structured output | response_format.type is json_schema | supports_response_schema |
| Audio input | Message contains input_audio content | supports_audio_input |
| PDF input | Message contains file or document content | supports_pdf_input |
| Web search | Tool with type web_search or web_search_preview | supports_web_search |
| Long context | Estimated token count exceeds model limit | max_input_tokens (with 10% safety buffer) |

This always runs first. If a keyword rule matches but the target model can't handle the request (e.g., the prompt says "python" but includes an image, and the coding model doesn't support vision), that rule is skipped.
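The capability filter can be pictured as a pure function over the model pool. A rough sketch, covering only a few of the signals from the table, with made-up model metadata (real metadata lives in the router configuration):

```python
def detect_capabilities(request: dict) -> set:
    """Infer required capabilities from the request shape (subset of the signals)."""
    needed = set()
    for message in request.get("messages", []):
        content = message.get("content")
        if isinstance(content, list):
            for part in content:
                if part.get("type") == "image_url":
                    needed.add("supports_vision")
                elif part.get("type") == "input_audio":
                    needed.add("supports_audio_input")
    if request.get("tools") or request.get("functions"):
        needed.add("supports_function_calling")
    response_format = request.get("response_format") or {}
    if response_format.get("type") == "json_schema":
        needed.add("supports_response_schema")
    return needed

def filter_pool(pool: list, request: dict) -> list:
    """Keep only models that advertise every capability the request needs."""
    needed = detect_capabilities(request)
    return [m for m in pool if needed <= m["capabilities"]]

# Hypothetical pool entries for illustration.
pool = [
    {"model": "coding-model", "capabilities": {"supports_function_calling"}},
    {"model": "vision-model", "capabilities": {"supports_vision", "supports_function_calling"}},
]
request = {"messages": [{"role": "user", "content": [
    {"type": "image_url", "image_url": {"url": "https://example.com/cat.png"}},
]}]}
print([m["model"] for m in filter_pool(pool, request)])  # only the vision-capable model survives
```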

2. Keyword and capability rules

Each rule has:

  • Keywords matched against the last user message
  • Keyword categories (pre-built bundles like "Code & Development")
  • Required capabilities (optional -- the rule only fires if the request needs those capabilities)
  • A target model

The router counts how many of each rule's keywords appear in the last user message. The rule with the most matches wins. If two rules tie on match count, the one with higher priority (lower order number) wins.

Keywords that also appear in the system message are automatically deweighted — they count at 25% of their normal value. This prevents tool descriptions and agent instructions in the system prompt from triggering false keyword matches on every turn. A rule needs at least 0.5 effective score to trigger at all.

This means a prompt like "evaluate this code, debug the const" routes to the coding model (3 matches: "code", "debug", "const") rather than the reasoning model (1 match: "evaluate"), even if the reasoning rule has higher priority.
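The scoring described above can be sketched as follows. This is a simplification (categories and capability checks are omitted), but it captures the match counting, the 25% system-prompt deweighting, the 0.5 threshold, and the priority tie-break:

```python
import re

def score_rule(keywords, user_msg: str, system_msg: str = "") -> float:
    """Count keyword hits in the last user message, deweighting keywords
    that also appear in the system prompt to 25% of their value."""
    score = 0.0
    for kw in keywords:
        pattern = r"\b" + re.escape(kw.lower()) + r"\b"
        if re.search(pattern, user_msg.lower()):
            in_system = bool(re.search(pattern, system_msg.lower()))
            score += 0.25 if in_system else 1.0
    return score

def pick_rule(rules, user_msg: str, system_msg: str = "") -> str:
    """Highest score wins; ties break toward the lower order number.
    Rules below the 0.5 effective-score threshold never fire."""
    best = None
    for rule in rules:
        s = score_rule(rule["keywords"], user_msg, system_msg)
        if s < 0.5:
            continue
        if best is None or (s, -rule["order"]) > (best[0], -best[1]["order"]):
            best = (s, rule)
    return best[1]["target"] if best else "default-model"

rules = [
    {"order": 1, "keywords": ["analyze", "evaluate"], "target": "reasoning-model"},
    {"order": 2, "keywords": ["code", "debug", "const"], "target": "coding-model"},
]
print(pick_rule(rules, "evaluate this code, debug the const"))  # coding-model (3 hits vs 1)
```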

3. Default model

If nothing matches, the default model handles the request. Pick something general-purpose here.

Setting up a router

1. Create a router

Go to Auto-Router in the dashboard and click Create New. A default configuration is created with sensible routing rules that you can customize.

2. Configure rules

Each rule maps a set of keywords or capabilities to a target model. The dashboard provides pre-built keyword categories:

| Category | What it matches |
| --- | --- |
| Code & Development | python, javascript, const, async, await, debug, api, react, npm, and more |
| Complex Reasoning | analyze, evaluate, step-by-step, pros and cons, strategy |
| Simple & Conversational | hello, what is, define, summarize, translate |
| Creative Writing | story, poem, creative, narrative, blog post |
| Data & Analysis | csv, json, chart, statistics, regression, dashboard |
| Math & Science | equation, calculus, proof, physics, chemistry, probability |

Select one or more categories, optionally add your own custom keywords on top, and pick the target model. Rules are ordered by priority -- drag them up or down to reorder.

3. Add capability-based rules

You can also create rules that trigger based on what the request contains, independent of keywords. For example:

  • Route all image requests to a vision-capable model
  • Route all function-calling requests to a model with strong tool use
  • Route structured output requests to a model that handles JSON schemas well

Set the required capabilities on a rule and leave keywords empty to match on capability alone.

info

The reasoning capability is special. There's no way to auto-detect "this prompt needs a reasoning model" from the request structure, so it always passes the capability check. Use it as a label alongside keywords -- e.g., a rule with keywords "analyze, evaluate" and capability "reasoning" routes analytical prompts to a reasoning model.

4. Assign to API keys

In the dashboard, edit an API key and select your auto-router from the dropdown. Any request from that key using model: "haimaker/auto" will use your router configuration.

Keyword matching

Keywords are matched against the last user message only. Since each API call carries the full conversation history, the latest message reflects the current intent.

Matching uses word boundaries, not substring search. The keyword "python" matches "write python code" but not "pythonic". Matching is case-insensitive.
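In regex terms this is word-boundary-anchored, case-insensitive matching. A sketch of the behavior (not the actual implementation):

```python
import re

def matches(keyword: str, text: str) -> bool:
    """Whole-word, case-insensitive keyword match."""
    return re.search(r"\b" + re.escape(keyword) + r"\b", text, re.IGNORECASE) is not None

print(matches("python", "Write Python code"))  # True
print(matches("python", "make it pythonic"))   # False: no word boundary after "python"
```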

When you select keyword categories, the category's keywords are expanded and merged with any custom keywords you add. The expanded set is stored for matching, while the original category selection is preserved so the dashboard can reconstruct it.

Testing your configuration

The dashboard includes a test sandbox. Type a sample prompt and see which model the router would select, which rule matched, and what capabilities were detected. This calls the simulate endpoint without making an actual LLM request, so it's free and fast.

Use the Test sandbox tab in your router's configuration page. Type a prompt and click Test.

Observability

The response includes headers for debugging:

| Header | Value |
| --- | --- |
| x-auto-router-rule-id | ID of the matched rule, or "default" |
| x-auto-router-reason | keyword-match, capability-fallback, or default |

The spend log metadata field records routing decisions:

{
  "auto_routed_from": "haimaker/auto",
  "auto_routed_model": "moonshotai/kimi-k2.5",
  "auto_routing_trigger": "rule:abc-123",
  "auto_routing_keyword": "python"
}

Cost tracking and rate limits apply to the resolved model, not haimaker/auto.

Limits and edge cases

No recursive routing. You can't add haimaker/auto to a router's model pool or use it as a rule target. The API rejects this, and there's a runtime check as a safety net.

One router per key. Each API key can have one auto-router assigned. If no router is assigned, requests to haimaker/auto return an error.

Context length safety buffer. Token estimation uses a 10% safety margin. A model is only considered if estimated_tokens < max_input_tokens * 0.9. This avoids routing to a model that then fails at the provider with a context length error.
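The buffer check reduces to a one-line comparison (illustrative only; real token estimation depends on the tokenizer):

```python
def fits_context(estimated_tokens: int, max_input_tokens: int, buffer: float = 0.10) -> bool:
    """A model qualifies only if the estimate stays under 90% of its input limit."""
    return estimated_tokens < max_input_tokens * (1 - buffer)

print(fits_context(100_000, 128_000))  # True: 100,000 < 115,200
print(fits_context(120_000, 128_000))  # False: 120,000 >= 115,200
```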

Caching. Router configurations are cached for 60 seconds. Changes you make in the dashboard take up to a minute to take effect.