
v1.63.14-stable

Krrish Dholakia
CEO, LiteLLM
Ishaan Jaffer
CTO, LiteLLM

These are the changes since v1.63.11-stable.

This release brings:

  • LLM Translation Improvements (MCP Support and Bedrock Application Profiles)
  • Perf improvements for Usage-based Routing
  • Streaming guardrail support via websockets
  • Azure OpenAI client perf fix (from previous release)

Docker Run LiteLLM Proxy

docker run \
  -e STORE_MODEL_IN_DB=True \
  -p 4000:4000 \
  docker.litellm.ai/berriai/litellm:main-v1.63.14-stable.patch1
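
Once the proxy is up, any OpenAI-compatible client can talk to it. A minimal sketch using the OpenAI Python SDK, assuming a model named gpt-4o is configured on the proxy and "sk-1234" stands in for your master key:

from openai import OpenAI

# Point the client at the local proxy instead of api.openai.com
client = OpenAI(base_url="http://localhost:4000", api_key="sk-1234")

response = client.chat.completions.create(
    model="gpt-4o",  # must match a model configured on the proxy
    messages=[{"role": "user", "content": "Hello from the proxy!"}],
)
print(response.choices[0].message.content)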

Demo Instance

Here's a Demo Instance to test changes:

New Models / Updated Models

  • Azure gpt-4o - fixed pricing to latest global pricing - PR
  • o1-pro - add pricing + model information - PR
  • Azure AI - mistral 3.1 small pricing added - PR
  • Azure - gpt-4.5-preview pricing added - PR

LLM Translation

  1. New LLM Features
  • Bedrock: Support bedrock application inference profiles Docs
    • Infer aws region from bedrock application profile id - (arn:aws:bedrock:us-east-1:...)
  • Ollama - support calling via /v1/completions Get Started (see the sketch after this list)
  • Bedrock - support us.deepseek.r1-v1:0 model name Docs
  • OpenRouter - OPENROUTER_API_BASE env var support Docs
  • Azure - add audio model parameter support - Docs
  • OpenAI - PDF File support Docs
  • OpenAI - o1-pro Responses API streaming support Docs
  • [BETA] MCP - Use MCP Tools with LiteLLM SDK Docs
  2. Bug Fixes
  • Voyage: prompt token on embedding tracking fix - PR
  • Sagemaker - Fix ‘Too little data for declared Content-Length’ error - PR
  • OpenAI-compatible models - fix issue when calling openai-compatible models w/ custom_llm_provider set - PR
  • VertexAI - Embedding ‘outputDimensionality’ support - PR
  • Anthropic - return consistent json response format on streaming/non-streaming - PR
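
The Ollama /v1/completions support above also applies to the SDK's text-completion entry point. A minimal sketch, assuming a local Ollama server on the default port with a llama3 model pulled (model name and base URL are illustrative):

import litellm

# Route a classic text-completion call to a local Ollama model
response = litellm.text_completion(
    model="ollama/llama3",
    prompt="Say hello in one sentence.",
    api_base="http://localhost:11434",
)
print(response.choices[0].text)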

Spend Tracking Improvements

  • litellm_proxy/ - support reading the LiteLLM response cost header from the proxy when using the client SDK (see the sketch after this list)
  • Reset Budget Job - fix budget reset error on keys/teams/users PR
  • Streaming - Prevents final chunk w/ usage from being ignored (impacted bedrock streaming + cost tracking) PR
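
A minimal sketch of the cost-header behavior above, calling the proxy through the SDK's litellm_proxy/ provider; reading the cost from response._hidden_params follows LiteLLM's documented cost-tracking pattern, and the model name and key are placeholders:

import litellm

# The SDK now picks up the proxy's response-cost header on these calls
response = litellm.completion(
    model="litellm_proxy/gpt-4o",
    messages=[{"role": "user", "content": "hi"}],
    api_base="http://localhost:4000",
    api_key="sk-1234",
)
print("cost:", response._hidden_params.get("response_cost"))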

UI

  1. Users Page
    • Feature: Control default internal user settings PR
  2. Icons:
    • Feature: Replace external "artificialanalysis.ai" icons with local SVGs PR
  3. Sign In/Sign Out
    • Fix: Default login when default_user_id user does not exist in DB PR

Logging Integrations

  • Support post-call guardrails for streaming responses Get Started
  • Arize Get Started
    • fix invalid package import PR
    • migrate to using StandardLoggingPayload for metadata, ensuring spans land successfully PR
    • fix logging to just log the LLM I/O PR
    • Dynamic API Key/Space param support Get Started
  • StandardLoggingPayload - log litellm_model_name in the payload, making it clear which model name was sent to the API provider Get Started (see the example after this list)
  • Prompt Management - Allow building custom prompt management integration Get Started
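
A minimal sketch of reading the new litellm_model_name field from a custom callback; the class and hook names follow LiteLLM's custom-logger interface, while the exact payload lookup is an assumption based on the note above:

import litellm
from litellm.integrations.custom_logger import CustomLogger

class ModelNameLogger(CustomLogger):
    def log_success_event(self, kwargs, response_obj, start_time, end_time):
        # standard_logging_object carries the StandardLoggingPayload
        payload = kwargs.get("standard_logging_object") or {}
        print("model sent to provider:", payload.get("litellm_model_name"))

litellm.callbacks = [ModelNameLogger()]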

Performance / Reliability improvements

  • Redis Caching - add 5s default timeout, prevents hanging redis connection from impacting llm calls PR (see the sketch after this list)
  • Allow disabling all spend updates / writes to DB - patch to allow disabling all spend updates to DB with a flag PR
  • Azure OpenAI - correctly re-use azure openai client, fixes perf issue from previous Stable release PR
  • Azure OpenAI - uses litellm.ssl_verify on Azure/OpenAI clients PR
  • Usage-based routing - Wildcard model support Get Started
  • Usage-based routing - Support batch writing increments to redis - reduces latency to match ‘simple-shuffle’ PR
  • Router - show reason for model cooldown on ‘no healthy deployments available error’ PR
  • Caching - cap in-memory cache items at 1MB - prevents OOM errors when large image URLs are sent through the proxy PR
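
The Redis timeout above applies to SDK caching as well. A minimal sketch of enabling Redis-backed response caching, assuming a local Redis on the default port:

import litellm
from litellm.caching import Cache

# Redis connections now carry a 5s default timeout, so a hanging
# Redis instance can no longer stall LLM calls.
litellm.cache = Cache(type="redis", host="localhost", port=6379)

response = litellm.completion(
    model="gpt-4o",
    messages=[{"role": "user", "content": "hi"}],
    caching=True,
)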

General Improvements

  • Passthrough Endpoints - support returning api-base on pass-through endpoints Response Headers Docs
  • SSL - support reading ssl security level from env var - Allows user to specify lower security settings Get Started
  • Credentials - only poll Credentials table when STORE_MODEL_IN_DB is True PR
  • Image URL Handling - new architecture doc on image url handling Docs
  • OpenAI - bump to pip install "openai==1.68.2" PR
  • Gunicorn - security fix - bump gunicorn==23.0.0 PR

Complete Git Diff

Here's the complete git diff

v1.63.11-stable

Krrish Dholakia
CEO, LiteLLM
Ishaan Jaffer
CTO, LiteLLM

These are the changes since v1.63.2-stable.

This release is primarily focused on:

  • [Beta] Responses API Support
  • Snowflake Cortex Support, Amazon Nova Image Generation
  • UI - Credential Management, re-use credentials when adding new models
  • UI - Test Connection to LLM Provider before adding a model

Known Issues

  • 🚨 Known issue on Azure OpenAI - We don't recommend upgrading if you use Azure OpenAI. This version failed our Azure OpenAI load test

Docker Run LiteLLM Proxy

docker run \
  -e STORE_MODEL_IN_DB=True \
  -p 4000:4000 \
  docker.litellm.ai/berriai/litellm:main-v1.63.11-stable

Demo Instance

Here's a Demo Instance to test changes:

New Models / Updated Models

  • Image Generation support for Amazon Nova Canvas Getting Started (see the sketch after this list)
  • Add pricing for Jamba new models PR
  • Add pricing for Amazon EU models PR
  • Add Bedrock Deepseek R1 model pricing PR
  • Update Gemini pricing: Gemma 3, Flash 2 thinking update, LearnLM PR
  • Mark Cohere Embedding 3 models as Multimodal PR
  • Add Azure Data Zone pricing PR
    • LiteLLM tracks cost for azure/eu and azure/us models
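
A minimal sketch of the new Nova Canvas image generation, assuming AWS credentials and region are configured in the environment; the Bedrock model ID shown is illustrative, so check the Getting Started guide for the exact name:

import litellm

response = litellm.image_generation(
    prompt="A watercolor lighthouse at dusk",
    model="bedrock/amazon.nova-canvas-v1:0",
)
# Bedrock returns image data; inspect response.data[0] for the result
print(response.data[0])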

LLM Translation

  1. New Endpoints
  2. New LLM Providers
  3. New LLM Features
  4. Bug Fixes
  • OpenAI: Return code, param and type on bad request error More information on litellm exceptions
  • Bedrock: Fix converse chunk parsing to only return empty dict on tool use PR
  • Bedrock: Support extra_headers PR (see the sketch after this list)
  • Azure: Fix Function Calling Bug & Update Default API Version to 2025-02-01-preview PR
  • Azure: Fix AI services URL PR
  • Vertex AI: Handle HTTP 201 status code in response PR
  • Perplexity: Fix incorrect streaming response PR
  • Triton: Fix streaming completions bug PR
  • Deepgram: Support bytes.IO when handling audio files for transcription PR
  • Ollama: Fix "system" role has become unacceptable error PR
  • All Providers (Streaming): Fix String data: stripped from entire content in streamed responses PR
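
To illustrate the Bedrock extra_headers support above, a minimal sketch; the header itself is hypothetical and the model ID is illustrative:

import litellm

response = litellm.completion(
    model="bedrock/anthropic.claude-3-sonnet-20240229-v1:0",
    messages=[{"role": "user", "content": "hi"}],
    # hypothetical custom header, e.g. for tracing through a gateway
    extra_headers={"x-trace-id": "my-trace-123"},
)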

Spend Tracking Improvements

  1. Support Bedrock converse cache token tracking Getting Started
  2. Cost Tracking for Responses API Getting Started (see the sketch after this list)
  3. Fix Azure Whisper cost tracking Getting Started
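
A minimal sketch of the new (beta) Responses API route whose costs are now tracked; the model and parameters are illustrative:

import litellm

response = litellm.responses(
    model="openai/o1-pro",
    input="Write a haiku about proxies.",
    max_output_tokens=100,
)
print(response)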

UI

Re-Use Credentials on UI

You can now onboard LLM provider credentials on the LiteLLM UI. Once these credentials are added, you can re-use them when adding new models Getting Started

Test Connections before adding models

Before adding a model, you can test the connection to the LLM provider to verify you have set up your API Base + API Key correctly

General UI Improvements

  1. Add Models Page
    • Allow adding Cerebras, Sambanova, Perplexity, Fireworks, Openrouter, TogetherAI Models, Text-Completion OpenAI on Admin UI
    • Allow adding EU OpenAI models
    • Fix: Instantly show edit + deletes to models
  2. Keys Page
    • Fix: Instantly show newly created keys on Admin UI (don't require refresh)
    • Fix: Allow clicking into Top Keys when showing a user's Top API Keys
    • Fix: Allow Filter Keys by Team Alias, Key Alias and Org
    • UI Improvements: Show 100 Keys Per Page, Use full height, increase width of key alias
  3. Users Page
    • Fix: Show correct count of internal user keys on Users Page
    • Fix: Metadata not updating in Team UI
  4. Logs Page
    • UI Improvements: Keep expanded log in focus on LiteLLM UI
    • UI Improvements: Minor improvements to logs page
    • Fix: Allow internal user to query their own logs
    • Allow switching off storing Error Logs in DB Getting Started
  5. Sign In/Sign Out

Security

  1. Support for Rotating Master Keys Getting Started
  2. Fix: Internal User Viewer Permissions, don't allow internal_user_viewer role to see Test Key Page or Create Key Button More information on role based access controls
  3. Emit audit logs on All user + model Create/Update/Delete endpoints Getting Started
  4. JWT
    • Support multiple JWT OIDC providers Getting Started
    • Fix JWT access with Groups not working when team is assigned All Proxy Models access
  5. Using K/V pairs in 1 AWS Secret Getting Started

Logging Integrations

  1. Prometheus: Track Azure LLM API latency metric Getting Started
  2. Athina: Added tags, user_feedback and model_options to additional_keys which can be sent to Athina Getting Started (see the sketch below)
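
A minimal sketch of forwarding the new Athina fields via request metadata; the field names mirror the release note, though the exact plumbing may differ:

import litellm

litellm.success_callback = ["athina"]  # requires ATHINA_API_KEY in the env

response = litellm.completion(
    model="gpt-4o",
    messages=[{"role": "user", "content": "hi"}],
    metadata={
        "tags": ["production"],
        "user_feedback": "thumbs_up",
        "model_options": {"temperature": 0.2},
    },
)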

Performance / Reliability improvements

  1. Redis + litellm router - Fix Redis cluster mode for litellm router PR

General Improvements

  1. OpenWebUI Integration - display thinking tokens
  • Guide on getting started with LiteLLM x OpenWebUI. Getting Started
  • Display thinking tokens on OpenWebUI (Bedrock, Anthropic, Deepseek) Getting Started

Complete Git Diff

Here's the complete git diff