Skip to main content

haimaker - Getting Started

Call 200+ LLMs using the OpenAI Input/Output Format

  • Translate inputs to provider's completion, embedding, and image_generation endpoints
  • Consistent output, text responses will always be available at ['choices'][0]['message']['content']
  • Retry/fallback logic across multiple deployments (e.g. Azure/OpenAI) - Router

How to use haimaker

  1. haimaker python SDK - Python Client to call 200+ LLMs, load balance, cost tracking

When to use haimaker Python SDK

tip

Use haimaker Python SDK if you want to use haimaker in your python code

Typically used by developers building llm projects

  • haimaker SDK gives you a unified interface to access multiple LLMs (100+ LLMs)
  • Retry/fallback logic across multiple deployments (e.g. Azure/OpenAI) - Router

haimaker Python SDK

Basic usage

# haimaker currentl uses litellm's python package
pip install litellm
from litellm import completion
import os

## set ENV variables
os.environ["OPENAI_API_KEY"] = "your-api-key"

response = completion(
model="gpt-3.5-turbo",
messages=[{ "content": "Hello, how are you?","role": "user"}]
)

Streaming

Set stream=True in the completion args.

from litellm import completion
import os

## set ENV variables
os.environ["OPENAI_API_KEY"] = "your-api-key"

response = completion(
model="gpt-3.5-turbo",
messages=[{ "content": "Hello, how are you?","role": "user"}],
stream=True,
)

Exception handling

haimaker maps exceptions across all supported providers to the OpenAI exceptions. All our exceptions inherit from OpenAI's exception types, so any error-handling you have for that, should work out of the box with haimaker.

from openai.error import OpenAIError
from litellm import completion

os.environ["ANTHROPIC_API_KEY"] = "bad-key"
try:
# some code
completion(model="claude-instant-1", messages=[{"role": "user", "content": "Hey, how's it going?"}])
except OpenAIError as e:
print(e)

Logging Observability - Log LLM Input/Output (Docs)

haimaker exposes pre defined callbacks to send data to MLflow, Lunary, Langfuse, Helicone, Promptlayer, Traceloop, Slack

from litellm import completion

## set env variables for logging tools (API key set up is not required when using MLflow)
os.environ["LUNARY_PUBLIC_KEY"] = "your-lunary-public-key" # get your key at https://app.lunary.ai/settings
os.environ["HELICONE_API_KEY"] = "your-helicone-key"
os.environ["LANGFUSE_PUBLIC_KEY"] = ""
os.environ["LANGFUSE_SECRET_KEY"] = ""

os.environ["OPENAI_API_KEY"]

# set callbacks
haimaker.success_callback = ["lunary", "mlflow", "langfuse", "helicone"] # log input/output to lunary, mlflow, langfuse, helicone

#openai call
response = completion(model="gpt-3.5-turbo", messages=[{"role": "user", "content": "Hi 👋 - i'm openai"}])

Track Costs, Usage, Latency for streaming

Use a callback function for this - more info on custom callbacks: https://docs.haimaker.ai/docs/observability/custom_callback

import haimaker

# track_cost_callback
def track_cost_callback(
kwargs, # kwargs to completion
completion_response, # response from completion
start_time, end_time # start/end time
):
try:
response_cost = kwargs.get("response_cost", 0)
print("streaming response_cost", response_cost)
except:
pass
# set callback
haimaker.success_callback = [track_cost_callback] # set custom callback function

# haimaker.completion() call
response = completion(
model="gpt-3.5-turbo",
messages=[
{
"role": "user",
"content": "Hi 👋 - i'm openai"
}
],
stream=True
)