
3 posts tagged with "admin ui"


v1.59.8-stable

Krrish Dholakia
CEO, LiteLLM
Ishaan Jaffer
CTO, LiteLLM

Get a 7-day free trial for LiteLLM Enterprise here.

no call needed

New Models / Updated Models

  1. New OpenAI /image/variations endpoint BETA support Docs
  2. Topaz API support on OpenAI /image/variations BETA endpoint Docs
  3. Deepseek - r1 support w/ reasoning_content (Deepseek API, Vertex AI, Bedrock)
  4. Azure - Add azure o1 pricing See Here
  5. Anthropic - handle -latest tag in model for cost calculation
  6. Gemini-2.0-flash-thinking - add model pricing (it’s 0.0) See Here
  7. Bedrock - add stability sd3 model pricing See Here (s/o Marty Sullivan)
  8. Bedrock - add us.amazon.nova-lite-v1:0 to model cost map See Here
  9. TogetherAI - add new together_ai llama3.3 models See Here

LLM Translation

  1. LM Studio -> fix async embedding call
  2. GPT-4o models - fix response_format translation
  3. Bedrock nova - expand supported document types to include .md, .csv, etc. Start Here
  4. Bedrock - docs on IAM role based access for bedrock - Start Here
  5. Bedrock - cache IAM role credentials when used
  6. Google AI Studio (gemini/) - support gemini 'frequency_penalty' and 'presence_penalty'
  7. Azure O1 - fix model name check
  8. WatsonX - ZenAPIKey support for WatsonX Docs
  9. Ollama Chat - support json schema response format Start Here
  10. Bedrock - return correct bedrock status code and error message if error during streaming
  11. Anthropic - support nested json schema on anthropic calls
  12. OpenAI - metadata param preview support (see the sketch after this list)
    1. SDK - enable via litellm.enable_preview_features = True
    2. PROXY - enable via litellm_settings::enable_preview_features: true
  13. Replicate - retry completion response on status=processing
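
A minimal sketch of the metadata preview flag from item 12 above - the model name and metadata values here are illustrative:

import litellm

# opt in to preview features before the call (item 12.1)
litellm.enable_preview_features = True

response = litellm.completion(
    model="gpt-4o",  # illustrative model
    messages=[{"role": "user", "content": "Hello"}],
    metadata={"user_id": "user-123"},  # illustrative metadata, forwarded only in preview mode
)

On the proxy, the equivalent opt-in is litellm_settings::enable_preview_features: true (item 12.2).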

Spend Tracking Improvements

  1. Bedrock - QA asserts all bedrock regional models have same supported_ as base model
  2. Bedrock - fix bedrock converse cost tracking w/ region name specified
  3. Spend Logs reliability fix - handle the case where the user-passed request body is an int instead of a string
  4. Ensure ‘base_model’ cost tracking works across all endpoints
  5. Fixes for Image generation cost tracking
  6. Anthropic - fix anthropic end user cost tracking
  7. JWT / OIDC Auth - add end user id tracking from jwt auth

Management Endpoints / UI

  1. Allow a team member to become admin post-add (UI + endpoints)
  2. New edit/delete button for updating team membership on UI
  3. If team admin - show all team keys
  4. Model Hub - clarify cost of models is per 1M tokens
  5. Invitation Links - fix invalid url generated
  6. New - SpendLogs Table Viewer - allows proxy admin to view spend logs on UI
    1. New spend logs - allow proxy admin to ‘opt in’ to logging request/response in spend logs table - enables easier abuse detection
    2. Show country of origin in spend logs
    3. Add pagination + filtering by key name/team name
  7. /key/delete - allow team admin to delete team keys
  8. Internal User ‘view’ - fix spend calculation when team selected
  9. Model Analytics is now available on the Free tier
  10. Usage page - shows days when spend = 0, and rounds spend on charts to 2 significant figures
  11. Public Teams - allow admins to expose teams for new users to ‘join’ on UI - Start Here
  12. Guardrails
    1. set/edit guardrails on a virtual key
    2. Allow setting guardrails on a team
    3. Set guardrails on team create + edit page
  13. Support temporary budget increases on /key/update - new temp_budget_increase and temp_budget_expiry fields (see the sketch after this list) - Start Here
  14. Support writing new key alias to AWS Secret Manager - on key rotation Start Here
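
A hedged sketch of item 13's temporary budget fields on /key/update - the field names come from this release; the URL, admin key, and values are illustrative:

import requests

# temporarily raise a key's budget until the expiry time (values illustrative)
response = requests.post(
    "http://localhost:4000/key/update",
    headers={"Authorization": "Bearer sk-1234"},  # proxy admin key
    json={
        "key": "sk-...",  # the virtual key to update
        "temp_budget_increase": 100,
        "temp_budget_expiry": "2025-02-15T00:00:00Z",  # assumed timestamp format
    },
)
print(response.json())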

Helm

  1. add securityContext and pull policy values to migration job (s/o https://github.com/Hexoplon)
  2. allow specifying envVars on values.yaml (see the sketch below)
  3. new helm lint test
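
A minimal values.yaml sketch for item 2 - the variable shown is illustrative and the exact key nesting is an assumption; check the chart for your version:

# values.yaml
envVars:
  LITELLM_LOG: "DEBUG"  # illustrative env var injected into the proxy container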

Logging / Guardrail Integrations

  1. Log the used prompt when prompt management is used. Start Here
  2. Support s3 logging with team alias prefixes - Start Here
  3. Prometheus Start Here
    1. fix litellm_llm_api_time_to_first_token_metric not populating for bedrock models
    2. emit remaining team budget metric on a regular basis (even when no call is made) - allows for more stable metrics on Grafana/etc.
    3. add key and team level budget metrics
    4. emit litellm_overhead_latency_metric
    5. Emit litellm_team_budget_reset_at_metric and litellm_api_key_budget_remaining_hours_metric
  4. Datadog - support logging spend tags to Datadog. Start Here
  5. Langfuse - fix logging request tags, read from standard logging payload
  6. GCS - don’t truncate payload on logging
  7. New GCS Pub/Sub logging support Start Here
  8. Add AIM Guardrails support Start Here

Security

  1. New Enterprise SLA for patching security vulnerabilities. See Here
  2. Hashicorp - support using vault namespace for TLS auth. Start Here
  3. Azure - DefaultAzureCredential support

Health Checks

  1. Cleanup pricing-only model names from wildcard route list - prevent bad health checks
  2. Allow specifying a health check model for wildcard routes - https://docs.haimaker.ai/docs/proxy/health#wildcard-routes
  3. New 'health_check_timeout' param with a default 1-minute upper bound, preventing a bad model's health check from hanging and causing pod restarts (see the config sketch after this list). Start Here
  4. Datadog - add Datadog service health check + expose new /health/services endpoint. Start Here
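
A hedged config sketch for items 2 and 3 - placing health_check_model and health_check_timeout under model_info is an assumption; confirm against the linked docs:

model_list:
  - model_name: openai/*
    litellm_params:
      model: openai/*
      api_key: os.environ/OPENAI_API_KEY
    model_info:
      health_check_model: openai/gpt-4o-mini  # concrete model used to health check this wildcard route
      health_check_timeout: 60  # seconds; 1-minute default upper bound per item 3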

Performance / Reliability improvements

  1. 3x increase in RPS - moving to orjson for reading request body
  2. LLM Routing speedup - using cached get model group info
  3. SDK speedup - using cached get model info helper - reduces CPU work to get model info
  4. Proxy speedup - only read request body 1 time per request
  5. Infinite loop detection scripts added to codebase
  6. Bedrock - pure async image transformation requests
  7. Cooldowns - cool down a single-deployment model group if 100% of calls fail in high traffic - prevents an o1 outage from impacting other calls
  8. Response Headers - return the following (see the sketch after this list)
    1. x-litellm-timeout
    2. x-litellm-attempted-retries
    3. x-litellm-overhead-duration-ms
    4. x-litellm-response-duration-ms
  9. ensure duplicate callbacks are not added to proxy
  10. Requirements.txt - bump certifi version
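
A small sketch reading item 8's new response headers off a proxy call - the URL, key, and model are illustrative:

import requests

response = requests.post(
    "http://localhost:4000/v1/chat/completions",
    headers={"Authorization": "Bearer sk-1234"},
    json={"model": "gpt-4o", "messages": [{"role": "user", "content": "hi"}]},
)

# new timing/retry headers returned by the proxy
print(response.headers.get("x-litellm-attempted-retries"))
print(response.headers.get("x-litellm-overhead-duration-ms"))
print(response.headers.get("x-litellm-response-duration-ms"))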

General Proxy Improvements

  1. JWT / OIDC Auth - new enforce_rbac param, allows proxy admin to prevent any unmapped yet authenticated jwt tokens from calling the proxy. Start Here
  2. fix custom openapi schema generation for customized Swagger docs
  3. Request Headers - support reading x-litellm-timeout param from request headers. Enables model timeout control when using Vercel's AI SDK + LiteLLM Proxy (see the sketch after this list). Start Here
  4. JWT / OIDC Auth - new role based permissions for model authentication. See Here
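
A sketch of item 3's per-request timeout header - the key, URL, and model are illustrative; the header value is assumed to be the timeout in seconds:

import requests

response = requests.post(
    "http://localhost:4000/v1/chat/completions",
    headers={
        "Authorization": "Bearer sk-1234",
        "x-litellm-timeout": "30",  # cap this request's model timeout at 30s
    },
    json={"model": "gpt-4o", "messages": [{"role": "user", "content": "hi"}]},
)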

Complete Git Diff

This is the diff between v1.57.8-stable and v1.59.8-stable.

Use this to see the changes in the codebase.

Git Diff

v1.59.0

Krrish Dholakia
CEO, LiteLLM
Ishaan Jaffer
CTO, LiteLLM

Get a 7-day free trial for LiteLLM Enterprise here.

no call needed

UI Improvements

[Opt In] Admin UI - view messages / responses

You can now view messages and response logs on Admin UI.

How to enable it - add store_prompts_in_spend_logs: true to your proxy_config.yaml

Once this flag is enabled, your messages and responses will be stored in the LiteLLM_Spend_Logs table.

general_settings:
  store_prompts_in_spend_logs: true

DB Schema Change

Added messages and responses to the LiteLLM_Spend_Logs table.

By default this is not logged. If you want messages and responses to be logged, you need to opt in with this setting:

general_settings:
  store_prompts_in_spend_logs: true

v1.56.4

Krrish Dholakia
CEO, LiteLLM
Ishaan Jaffer
CTO, LiteLLM

deepgram, fireworks ai, vision, admin ui, dependency upgrades

New Models

Deepgram Speech to Text

New Speech to Text support for Deepgram models. Start Here

from litellm import transcription
import os

# set api keys
os.environ["DEEPGRAM_API_KEY"] = ""
audio_file = open("/path/to/audio.mp3", "rb")

response = transcription(model="deepgram/nova-2", file=audio_file)

print(f"response: {response}")

Fireworks AI - Vision support for all models

LiteLLM supports document inlining for Fireworks AI models. This is useful for models that are not vision models but still need to parse documents, images, etc. If the model is not a vision model, LiteLLM will add #transform=inline to the url of the image_url. See Code
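
A hedged sketch of document inlining - the model id and document URL are illustrative; LiteLLM appends #transform=inline to the url for you when the model is not a vision model:

import os
from litellm import completion

os.environ["FIREWORKS_AI_API_KEY"] = ""

# non-vision Fireworks model receiving a document via image_url;
# LiteLLM rewrites the url with #transform=inline so Fireworks inlines it
response = completion(
    model="fireworks_ai/accounts/fireworks/models/llama-v3p1-8b-instruct",  # illustrative
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "What is in this document?"},
                {"type": "image_url", "image_url": {"url": "https://example.com/doc.pdf"}},
            ],
        }
    ],
)
print(response.choices[0].message.content)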

Proxy Admin UI

  • Test Key Tab displays model used in response
  • Test Key Tab renders content in .md, .py (any code/markdown format)

Dependency Upgrades

Bug Fixes