Skip to main content

v1.80.8-stable - Introducing A2A Agent Gateway

Krrish Dholakia
CEO, LiteLLM
Ishaan Jaff
CTO, LiteLLM

Deploy this version

docker run litellm
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
docker.litellm.ai/berriai/litellm:v1.80.8-stable

Key Highlights


Agent Gateway (A2A)


This release introduces A2A Agent Gateway for LiteLLM, allowing you to invoke and manage A2A agents with the same controls you have for LLM APIs.

As a LiteLLM Gateway Admin, you can now do the following:

  • Request/Response Logging - Every agent invocation is logged to the Logs page with full request and response tracking.
  • Access Control - Control which Team/Key can access which agents.

As a developer, you can continue using the A2A SDK, all you need to do is point you A2AClient to the LiteLLM proxy URL and your API key.

Works with the A2A SDK:

from a2a.client import A2AClient

client = A2AClient(
base_url="http://localhost:4000", # Your LiteLLM proxy
api_key="sk-1234" # LiteLLM API key
)

response = client.send_message(
agent_id="my-agent",
message="What's the status of my order?"
)

Get started with Agent Gateway here: Agent Gateway Documentation


Customer (End User) Usage UI

Users can now filter usage statistics by customers, providing the same granular filtering capabilities available for teams and organizations.

Details:

  • Filter usage analytics, spend logs, and activity metrics by customer ID
  • View customer-level breakdowns alongside existing team and user-level filters
  • Consistent filtering experience across all usage and analytics views

New Providers and Endpoints

New Providers (5 new providers)

ProviderSupported LiteLLM EndpointsDescription
Z.AI (Zhipu AI)/v1/chat/completions, /v1/responses, /v1/messagesBuilt-in support for Zhipu AI GLM models
RAGFlow/v1/chat/completions, /v1/responses, /v1/messages, /v1/vector_storesRAG-based chat completions with vector store support
PublicAI/v1/chat/completions, /v1/responses, /v1/messagesOpenAI-compatible provider via JSON config
Google Cloud Chirp3 HD/v1/audio/speech, /v1/audio/speech/streamText-to-speech with Google Cloud Chirp3 HD voices

New LLM API Endpoints (2 new endpoints)

EndpointMethodDescriptionDocumentation
/v1/agents/invokePOSTInvoke A2A agents through the AI GatewayAgent Gateway
/cursor/chat/completionsPOSTCursor BYOK endpoint - accepts Responses API input, returns Chat Completions outputCursor Integration

New Models / Updated Models

New Model Support (33 new models)

ProviderModelContext WindowInput ($/1M tokens)Output ($/1M tokens)Features
OpenAIgpt-5.1-codex-max400K$1.25$10.00Reasoning, vision, PDF input, responses API
Azureazure/gpt-5.1-codex-max400K$1.25$10.00Reasoning, vision, PDF input, responses API
Anthropicclaude-opus-4-5200K$5.00$25.00Computer use, reasoning, vision
Bedrockglobal.anthropic.claude-opus-4-5-20251101-v1:0200K$5.00$25.00Computer use, reasoning, vision
Bedrockamazon.nova-2-lite-v1:01M$0.30$2.50Reasoning, vision, video, PDF input
Bedrockamazon.titan-image-generator-v2:0--$0.008/imageImage generation
Fireworksfireworks_ai/deepseek-v3p2164K$1.20$1.20Function calling, response schema
Fireworksfireworks_ai/kimi-k2-instruct-0905262K$0.60$2.50Function calling, response schema
DeepSeekdeepseek/deepseek-v3.2164K$0.28$0.40Reasoning, function calling
Mistralmistral/mistral-large-3256K$0.50$1.50Function calling, vision
Azure AIazure_ai/mistral-large-3256K$0.50$1.50Function calling, vision
Moonshotmoonshot/kimi-k2-0905-preview262K$0.60$2.50Function calling, web search
Moonshotmoonshot/kimi-k2-turbo-preview262K$1.15$8.00Function calling, web search
Moonshotmoonshot/kimi-k2-thinking-turbo262K$1.15$8.00Function calling, web search
OpenRouteropenrouter/deepseek/deepseek-v3.2164K$0.28$0.40Reasoning, function calling
Databricksdatabricks/databricks-claude-haiku-4-5200K$1.00$5.00Reasoning, function calling
Databricksdatabricks/databricks-claude-opus-4200K$15.00$75.00Reasoning, function calling
Databricksdatabricks/databricks-claude-opus-4-1200K$15.00$75.00Reasoning, function calling
Databricksdatabricks/databricks-claude-opus-4-5200K$5.00$25.00Reasoning, function calling
Databricksdatabricks/databricks-claude-sonnet-4200K$3.00$15.00Reasoning, function calling
Databricksdatabricks/databricks-claude-sonnet-4-1200K$3.00$15.00Reasoning, function calling
Databricksdatabricks/databricks-gemini-2-5-flash1M$0.30$2.50Function calling
Databricksdatabricks/databricks-gemini-2-5-pro1M$1.25$10.00Function calling
Databricksdatabricks/databricks-gpt-5400K$1.25$10.00Function calling
Databricksdatabricks/databricks-gpt-5-1400K$1.25$10.00Function calling
Databricksdatabricks/databricks-gpt-5-mini400K$0.25$2.00Function calling
Databricksdatabricks/databricks-gpt-5-nano400K$0.05$0.40Function calling
Vertex AIvertex_ai/chirp-$30.00/1M chars-Text-to-speech (Chirp3 HD)
Z.AIzai/glm-4.6200K$0.60$2.20Function calling
Z.AIzai/glm-4.5128K$0.60$2.20Function calling
Z.AIzai/glm-4.5v128K$0.60$1.80Function calling, vision
Z.AIzai/glm-4.5-flash128KFreeFreeFunction calling
Vertex AIvertex_ai/bge-large-en-v1.5---BGE Embeddings

Features

Bug Fixes

  • Bedrock

    • Fix extra_headers in messages API bedrock invoke - PR #17271
    • Fix Bedrock models in model map - PR #17419
    • Make Bedrock converse messages respect modify_params as expected - PR #17427
    • Fix Anthropic beta headers for Bedrock imported Qwen models - PR #17467
    • Preserve usage from JSON response for OpenAI provider in Bedrock - PR #17589
  • SambaNova

    • Fix acompletion throws error with SambaNova models - PR #17217
  • General

    • Fix AttributeError when metadata is null in request body - PR #17306
    • Fix 500 error for malformed request - PR #17291
    • Respect custom LLM provider in header - PR #17290
    • Replace deprecated .dict() with .model_dump() in streaming_handler - PR #17359

LLM API Endpoints

Features

Bugs

  • General
    • Fix streaming error validation - PR #17242
    • Add length validation for empty tool_calls in delta - PR #17523

Management Endpoints / UI

Features

  • New Login Page

  • Customer (End User) Usage

  • Virtual Keys

    • Standardize API Key vs Virtual Key in UI - PR #17325
    • Add User Alias Column to Internal User Table - PR #17321
    • Delete Credential Enhancements - PR #17317
  • Models + Endpoints

    • Show all credential values on Edit Credential Modal - PR #17397
    • Change Edit Team Models Shown to Match Create Team - PR #17394
    • Support Images in Compare UI - PR #17562
  • Callbacks

  • Management Routes

    • Allow admin viewer to access global tag usage - PR #17501
    • Allow wildcard routes for nonproxy admin (SCIM) - PR #17178
    • Return 404 when a user is not found on /user/info - PR #16850
  • OCI Configuration

    • Enable Oracle Cloud Infrastructure configuration via UI - PR #17159

Bugs

  • UI Fixes

    • Fix Request and Response Panel JSONViewer - PR #17233
    • Adding Button Loading States to Edit Settings - PR #17236
    • Fix Various Text, button state, and test changes - PR #17237
    • Fix Fallbacks Immediately Deleting before API resolves - PR #17238
    • Remove Feature Flags - PR #17240
    • Fix metadata tags and model name display in UI for Azure passthrough - PR #17258
    • Change labeling around Vertex Fields - PR #17383
    • Remove second scrollbar when sidebar is expanded + tooltip z index - PR #17436
    • Fix Select in Edit Membership Modal - PR #17524
    • Change useAuthorized Hook to redirect to new Login Page - PR #17553
  • SSO

    • Fix the generic SSO provider - PR #17227
    • Clear SSO integration for all users - PR #17287
    • Fix SSO users not added to Entra synced team - PR #17331
  • Auth / JWT

    • JWT Auth - Allow using regular OIDC flow with user info endpoints - PR #17324
    • Fix litellm user auth not passing issue - PR #17342
    • Add other routes in JWT auth - PR #17345
    • Fix new org team validate against org - PR #17333
    • Fix litellm_enterprise ensure imported routes exist - PR #17337
    • Use organization.members instead of deprecated organization field - PR #17557
  • Organizations/Teams

    • Fix organization max budget not enforced - PR #17334
    • Fix budget update to allow null max_budget - PR #17545

AI Integrations (2 new integrations)

Logging (1 new integration)

New Integration

Improvements & Fixes

Guardrails (1 new integration)

New Integration

  • Generic Guardrail API
    • Generic Guardrail API - allows guardrail providers to add INSTANT support for LiteLLM w/out PR to repo - PR #17175
    • Guardrails API V2 - user api key metadata, session id, specify input type (request/response), image support - PR #17338
    • Guardrails API - add streaming support - PR #17400
    • Guardrails API - support tool call checks on OpenAI /chat/completions, OpenAI /responses, Anthropic /v1/messages - PR #17459
    • Guardrails API - new structured_messages param - PR #17518
    • Correctly map a v1/messages call to the anthropic unified guardrail - PR #17424
    • Support during_call event type for unified guardrails - PR #17514

Improvements & Fixes

Secret Managers

  • CyberArk

    • Allow setting SSL verify to false - PR #17433
  • General

    • Make email and secret manager operations independent in key management hooks - PR #17551

Spend Tracking, Budgets and Rate Limiting

  • Rate Limiting

    • Parallel Request Limiter with /messages - PR #17426
    • Allow using dynamic rate limit/priority reservation on teams - PR #17061
    • Dynamic Rate Limiter - Fix token count increases/decreases by 1 instead of actual count + Redis TTL - PR #17558
  • Spend Logs

    • Deprecate spend/logs & add spend/logs/v2 - PR #17167
    • Optimize SpendLogs queries to use timestamp filtering for index usage - PR #17504
  • Enforce User Param

    • Enforce support of enforce_user_param to OpenAI post endpoints - PR #17407

MCP Gateway

  • MCP Configuration

    • Remove URL format validation for MCP server endpoints - PR #17270
    • Add stack trace to MCP error message - PR #17269
  • MCP Tool Results

    • Preserve tool metadata in CallToolResult - PR #17561

Agent Gateway (A2A)

  • Agent Invocation

    • Allow invoking agents through AI Gateway - PR #17440
    • Allow tracking request/response in "Logs" Page - PR #17449
  • Agent Access Control

    • Enforce Allowed agents by key, team + add agent access groups on backend - PR #17502
  • Agent Gateway UI


Performance / Loadbalancing / Reliability improvements

  • Audio/Speech Performance

    • Fix /audio/speech performance by using shared_sessions - PR #16739
  • Memory Optimization

    • Prevent memory leak in aiohttp connection pooling - PR #17388
    • Lazy-load utils to reduce memory + import time - PR #17171
  • Database

    • Update default database connection number - PR #17353
    • Update default proxy_batch_write_at number - PR #17355
    • Add background health checks to db - PR #17528
  • Proxy Caching

    • Fix proxy caching between requests in aiohttp transport - PR #17122
  • Session Management

    • Fix session consistency, move Lasso API version away from source code - PR #17316
    • Conditionally pass enable_cleanup_closed to aiohttp TCPConnector - PR #17367
  • Vector Store

    • Fix vector store configuration synchronization failure - PR #17525

Documentation Updates

  • Provider Documentation

    • Add Azure AI Foundry documentation for Claude models - PR #17104
    • Document responses and embedding API for GitHub Copilot - PR #17456
    • Add gpt-5.1-codex-max to OpenAI provider documentation - PR #17602
    • Update Instructions For Phoenix Integration - PR #17373
  • Guides

    • Add guide on how to debug gateway error vs provider error - PR #17387
    • Agent Gateway documentation - PR #17454
    • A2A Permission management documentation - PR #17515
    • Update docs to link agent hub - PR #17462
  • Projects

    • Add Google ADK and Harbor to projects - PR #17352
    • Add Microsoft Agent Lightning to projects - PR #17422
  • Cleanup

    • Cleanup: Remove orphan docs pages and Docusaurus template files - PR #17356
    • Remove source .env from docs - PR #17466

Infrastructure / CI/CD

  • Helm Chart

  • Docker

    • Add retry logic to apk package installation in Dockerfile.non_root - PR #17596
    • Chainguard fixes - PR #17406
  • OpenAPI Schema

    • Refactor add_schema_to_components to move definitions to components/schemas - PR #17389
  • Security

    • Fix security vulnerability: update mdast-util-to-hast to 13.2.1 - PR #17601
    • Bump jws from 3.2.2 to 3.2.3 - PR #17494

New Contributors

  • @weichiet made their first contribution in PR #17242
  • @AndyForest made their first contribution in PR #17220
  • @omkar806 made their first contribution in PR #17217
  • @v0rtex20k made their first contribution in PR #17178
  • @hxomer made their first contribution in PR #17207
  • @orgersh92 made their first contribution in PR #17316
  • @dannykopping made their first contribution in PR #17313
  • @rioiart made their first contribution in PR #17333
  • @codgician made their first contribution in PR #17278
  • @epistoteles made their first contribution in PR #17277
  • @kothamah made their first contribution in PR #17368
  • @flozonn made their first contribution in PR #17371
  • @richardmcsong made their first contribution in PR #17389
  • @matt-greathouse made their first contribution in PR #17384
  • @mossbanay made their first contribution in PR #17380
  • @mhielpos-asapp made their first contribution in PR #17376
  • @Joilence made their first contribution in PR #17367
  • @deepaktammali made their first contribution in PR #17357
  • @axiomofjoy made their first contribution in PR #16611
  • @DevajMody made their first contribution in PR #17445
  • @andrewtruong made their first contribution in PR #17439
  • @AnasAbdelR made their first contribution in PR #17490
  • @dominicfeliton made their first contribution in PR #17516
  • @kristianmitk made their first contribution in PR #17504
  • @rgshr made their first contribution in PR #17130
  • @dominicfallows made their first contribution in PR #17489
  • @irfansofyana made their first contribution in PR #17467
  • @GusBricker made their first contribution in PR #17191
  • @OlivverX made their first contribution in PR #17255
  • @withsmilo made their first contribution in PR #17585

Full Changelog

View complete changelog on GitHub