Upgrading from v1

This page is for anyone who set up an auto-router before the v2 rollout, or who built against the keyword-based rules API. If you're new to the auto-router, skip this and read the overview instead.

The short version

  • Rules are no longer keyword lists. A rule is now a set of example prompts, and incoming requests match by similarity. "Fix this bug in my code" matches a coding rule even though it contains none of the old keywords.
  • The router now learns from your traffic. A daily analysis finds repeated prompt patterns going to expensive models and proposes cheaper routes. Proposals either wait for your approval in the Review Queue or, for the most clear-cut cases, apply automatically with a revert button.
  • The long-context fallback got cheaper. When your default model can't fit a request, the router used to pick the model with the largest context window, which was usually also the most expensive one. It now picks the cheapest model that fits. Most users will see their fallback spend drop and nothing else change.
  • Routing is still deterministic. The same prompt routes to the same model, every time. No LLM is involved in the request path; the learning happens offline.

Why we replaced keywords

Keyword matching breaks in ways that are hard to patch. A prompt that mentions "error" or "node" in passing would trip a coding rule even when the request was about something else entirely. Agent system prompts were the worst offenders: their tool descriptions are full of words like "api" and "function", so we had to add scoring hacks to discount them, and those hacks created their own surprises. Counting words tells you very little about what a prompt is actually asking for.

Example matching judges the whole prompt at once. You show the router what a coding request looks like, and it routes things that look like that. When it gets one wrong, the fix is direct: add the misrouted prompt as an example to the right rule, or tighten a threshold. No keyword list to maintain.
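As a rough illustration of matching by similarity, the sketch below scores a request against a rule using cosine similarity to that rule's example prompts. Everything in it is an assumption for illustration: the three-dimensional vectors are made up and stand in for whatever embedding the router actually uses, and taking the best score across a rule's examples is one plausible way to combine them, not a documented detail.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def rule_similarity(request_vec, example_vecs):
    """Best similarity between the request and any example (assumed aggregation)."""
    return max((cosine(request_vec, ex) for ex in example_vecs), default=0.0)

# Made-up embeddings of a coding rule's example prompts.
coding_examples = [[0.9, 0.1, 0.0], [0.8, 0.2, 0.1]]

# Made-up embeddings of two incoming requests.
bug_fix_request = [0.85, 0.15, 0.05]  # stands in for "Fix this bug in my code"
recipe_request = [0.0, 0.2, 0.9]      # stands in for an unrelated request

coding_score = rule_similarity(bug_fix_request, coding_examples)
recipe_score = rule_similarity(recipe_request, coding_examples)
```

With these toy vectors the bug-fix request scores well above the default 0.80 threshold and the unrelated request scores far below it, even though no keyword is ever inspected.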

What happened to your existing router

Your default model, API key assignments, and capability-based rules all carried over unchanged. Keyword rules were converted like this:

Rules with recent traffic were converted automatically. We took real prompts that each rule matched over the previous weeks and used them as that rule's example prompts. These rules show a migrated badge in the rule list. They should route the same traffic they did before, usually with fewer false positives.

Rules with no recent traffic could not be converted. There was nothing to seed examples from. These rules show up with an empty example list and won't match anything until you open them in the rule editor and write a few example prompts yourself. Until then, their traffic goes to your default model. (Rules that matched on capabilities alone, with no keywords, are unaffected.)

Either way, spend ten minutes in the Test Sandbox after the upgrade. Run a few prompts that should match each rule and a few that shouldn't. The sandbox shows the similarity score against every rule, so if something routes the wrong way, you'll see exactly why and which threshold to adjust.

Breaking API changes

If you manage routers through the management API, these are the changes that will break existing code:

| What | v1 | v2 |
| --- | --- | --- |
| Rule payloads | keywords, keyword_categories | example_prompts (+ optional match_threshold). Sending the old fields returns a 400. |
| Keyword categories | GET /auto-router/keyword-categories | Removed. Returns 404. |
| Simulate response | reason: "keyword-match", matched_keyword | reason: "example-match", similarity, plus a per-rule rule_similarities breakdown |
| Spend-log metadata | auto_routing_keyword, auto_routing_effective_score | auto_routing_similarity, auto_routing_rule_source |
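For code that builds rule payloads, the change amounts to dropping the two keyword fields and supplying examples instead. The helper below is a minimal sketch of that conversion. The field names keywords, keyword_categories, example_prompts, and match_threshold come from this page; the helper itself and the other payload fields shown (name, model) are hypothetical and only illustrate that unrelated fields pass through untouched.

```python
OLD_FIELDS = ("keywords", "keyword_categories")

def to_v2_rule_payload(v1_payload, example_prompts, match_threshold=None):
    """Return a copy of a v1 rule payload with keyword fields replaced by examples."""
    if not example_prompts:
        # A rule with no examples will not match anything in v2.
        raise ValueError("at least one example prompt is required")
    payload = {k: v for k, v in v1_payload.items() if k not in OLD_FIELDS}
    payload["example_prompts"] = list(example_prompts)
    if match_threshold is not None:
        payload["match_threshold"] = match_threshold
    return payload

# Hypothetical v1 payload; "name" and "model" are illustrative fields only.
v1_rule = {
    "name": "coding",
    "model": "example-code-model",
    "keywords": ["error", "function"],
    "keyword_categories": ["programming"],
}

v2_rule = to_v2_rule_payload(
    v1_rule,
    ["Fix this bug in my code", "Why does this function return None?"],
    match_threshold=0.80,
)
```

The original dict is left unmodified, so the same helper can be run over a stored set of v1 payloads before anything is sent to the API.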

Two additive changes worth knowing about: rule responses now include source (manual, mined, or migrated) and enabled, and router responses include the capture_enabled and auto_apply_enabled settings.

If you have dashboards or queries that read auto_routing_keyword from spend logs, they'll come back empty for new requests. Switch them to auto_routing_trigger (unchanged) or the new similarity fields.
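If a query has to cover both old and new spend-log rows during the transition, one approach is to read the new similarity fields first and fall back to the keyword field only for older rows. The sketch below assumes each row's metadata arrives as a plain dict keyed by the field names above; the function name, the "version" label, and the sample values (including the trigger value "rule") are made up for illustration.

```python
def routing_summary(metadata):
    """Summarise one spend-log row's routing metadata, whether v1 or v2."""
    summary = {"trigger": metadata.get("auto_routing_trigger"), "version": None}
    if metadata.get("auto_routing_similarity") is not None:
        # New rows carry the similarity fields.
        summary["version"] = "v2"
        summary["similarity"] = metadata["auto_routing_similarity"]
        summary["rule_source"] = metadata.get("auto_routing_rule_source")
    elif metadata.get("auto_routing_keyword"):
        # Older rows only carry the keyword field.
        summary["version"] = "v1"
        summary["keyword"] = metadata["auto_routing_keyword"]
    return summary

# Made-up rows for illustration.
old_row = {"auto_routing_trigger": "rule", "auto_routing_keyword": "error"}
new_row = {
    "auto_routing_trigger": "rule",
    "auto_routing_similarity": 0.87,
    "auto_routing_rule_source": "migrated",
}

old_summary = routing_summary(old_row)
new_summary = routing_summary(new_row)
```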

Behavior changes to watch for

Fallback model selection. This is the one silent behavior change. If a request is too large for your default model, v1 sent it to your largest-context model; v2 sends it to the cheapest model that fits. If you were (perhaps without realizing it) relying on the fallback to reach a specific flagship model, add an explicit rule for that traffic instead.
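The difference between the two fallback strategies can be sketched in a few lines. This is an illustration under stated assumptions, not the router's implementation: the model names, context sizes, and prices are invented, and cost is reduced to a single hypothetical per-million-token figure.

```python
from dataclasses import dataclass

@dataclass
class Model:
    name: str
    context_window: int   # tokens
    cost_per_mtok: float  # hypothetical price per million tokens

def v1_fallback(models, request_tokens):
    """v1 behaviour: the largest-context model that fits the request."""
    fitting = [m for m in models if m.context_window >= request_tokens]
    return max(fitting, key=lambda m: m.context_window, default=None)

def v2_fallback(models, request_tokens):
    """v2 behaviour: the cheapest model that fits the request."""
    fitting = [m for m in models if m.context_window >= request_tokens]
    return min(fitting, key=lambda m: m.cost_per_mtok, default=None)

# Invented catalogue for illustration.
catalogue = [
    Model("small", 32_000, 0.5),
    Model("mid-long", 200_000, 3.0),
    Model("flagship-long", 1_000_000, 15.0),
]

old_choice = v1_fallback(catalogue, 150_000)
new_choice = v2_fallback(catalogue, 150_000)
```

For a 150,000-token request, the v1 strategy lands on flagship-long while the v2 strategy lands on mid-long. If the flagship is what that traffic actually needs, an explicit rule is the way to keep it there.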

Match strictness is now tunable. A keyword rule fired whenever its keywords appeared in the prompt; an example rule fires when similarity crosses its threshold (default 0.80). If a migrated rule matches less than it used to, lower its threshold a little or add more varied examples. If it matches too much, raise the threshold.

Tie-breaking changed. v1 picked the rule with the most keyword hits. v2 picks the rule with the highest similarity; rule order only breaks exact ties.
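Put together, the threshold and tie-breaking behavior described above can be sketched as follows. The function takes per-rule similarity scores as given, for example as read from the Test Sandbox. The rule names and scores are made up, and treating a score exactly equal to the threshold as a match is an assumption.

```python
DEFAULT_THRESHOLD = 0.80  # default stated in this guide

def pick_rule(rules, similarities):
    """Return the winning rule name, or None if the default model should serve.

    rules: list of dicts in rule order, each with "name" and an optional
           "match_threshold".
    similarities: mapping of rule name to similarity score.
    """
    best_name, best_score = None, -1.0
    for rule in rules:
        score = similarities.get(rule["name"], 0.0)
        threshold = rule.get("match_threshold", DEFAULT_THRESHOLD)
        if score < threshold:
            continue  # below this rule's threshold: the rule does not fire
        if score > best_score:  # strict ">" keeps the earlier rule on exact ties
            best_name, best_score = rule["name"], score
    return best_name

# Made-up rules, listed in rule order.
rules = [
    {"name": "coding"},
    {"name": "support", "match_threshold": 0.70},
    {"name": "legal"},
]

highest_wins = pick_rule(rules, {"coding": 0.86, "support": 0.91, "legal": 0.50})
tie_goes_to_first = pick_rule(rules, {"coding": 0.90, "support": 0.90})
no_match = pick_rule(rules, {"coding": 0.79, "support": 0.60})
```

Here highest_wins is "support", tie_goes_to_first is "coding", and no_match is None, meaning the request would fall through to the default model.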

New things in the dashboard

Three tabs and two settings you haven't seen before:

  • Review Queue: rule proposals mined from your traffic, each with sample prompts, request volume, the judge's reasoning, and projected monthly savings. Accept or reject.
  • Changelog: a history of applied, accepted, and reverted proposals, with one-click revert for anything the tuner created. If a learned rule ever sends traffic somewhere you don't like, this is the undo button.
  • Traffic: your top prompt templates over the last 30 days and which model serves each. Useful for spotting high-volume traffic still on the default model.
  • Settings → Traffic capture: the learning features work by storing the normalized last user message of routed requests (capped at 2KB, deduplicated, deleted after 30 days, used only for your own router). Turn this off and the router stops learning; routing itself is unaffected.
  • Settings → Auto-apply: on by default. Turn it off and every proposal waits in the Review Queue; nothing changes without your click.

Upgrade checklist

  1. Open your router and check each rule. Rules with the migrated badge came over with examples; rules with an empty example list need you to write some before they'll match.
  2. Run your typical prompts through the Test Sandbox and adjust thresholds where matches look wrong.
  3. If you call the management API: replace keywords/keyword_categories with example_prompts in rule payloads, and drop any call to the keyword-categories endpoint.
  4. Update anything reading auto_routing_keyword or auto_routing_effective_score from spend logs.
  5. Decide whether you want traffic capture and auto-apply on. Both default to on; both have off switches in Settings.

Questions or a router that didn't convert cleanly? Reach us at support@haimaker.ai.