LiteLLM Proxy (LLM Gateway)
Property | Details |
---|---|
Description | LiteLLM Proxy is an OpenAI-compatible gateway that allows you to interact with multiple LLM providers through a unified API. Simply use the litellm_proxy/ prefix before the model name to route your requests through the proxy. |
Provider Route on LiteLLM | litellm_proxy/ (add this prefix to the model name, to route any requests to litellm_proxy - e.g. litellm_proxy/your-model-name ) |
Setup LiteLLM Gateway | LiteLLM Gateway ↗ |
Supported Endpoints | /chat/completions , /completions , /embeddings , /audio/speech , /audio/transcriptions , /images , /rerank |
Required Variables​
os.environ["LITELLM_PROXY_API_KEY"] = "" # "sk-1234" your litellm proxy api key
os.environ["LITELLM_PROXY_API_BASE"] = "" # "http://localhost:4000" your litellm proxy api base
Usage (Non Streaming)​
import os
import litellm
from litellm import completion
os.environ["LITELLM_PROXY_API_KEY"] = ""
# set custom api base to your proxy
# either set .env or litellm.api_base
# os.environ["LITELLM_PROXY_API_BASE"] = ""
litellm.api_base = "your-openai-proxy-url"
messages = [{ "content": "Hello, how are you?","role": "user"}]
# litellm proxy call
response = completion(model="litellm_proxy/your-model-name", messages)
Usage - passing api_base
, api_key
per request​
If you need to set api_base dynamically, just pass it in completions instead - completions(...,api_base="your-proxy-api-base")
import os
import litellm
from litellm import completion
os.environ["LITELLM_PROXY_API_KEY"] = ""
messages = [{ "content": "Hello, how are you?","role": "user"}]
# litellm proxy call
response = completion(
model="litellm_proxy/your-model-name",
messages,
api_base = "your-litellm-proxy-url",
api_key = "your-litellm-proxy-api-key"
)
Usage - Streaming​
import os
import litellm
from litellm import completion
os.environ["LITELLM_PROXY_API_KEY"] = ""
messages = [{ "content": "Hello, how are you?","role": "user"}]
# openai call
response = completion(
model="litellm_proxy/your-model-name",
messages,
api_base = "your-litellm-proxy-url",
stream=True
)
for chunk in response:
print(chunk)
Embeddings​
import litellm
response = litellm.embedding(
model="litellm_proxy/your-embedding-model",
input="Hello world",
api_base="your-litellm-proxy-url",
api_key="your-litellm-proxy-api-key"
)
Image Generation​
import litellm
response = litellm.image_generation(
model="litellm_proxy/dall-e-3",
prompt="A beautiful sunset over mountains",
api_base="your-litellm-proxy-url",
api_key="your-litellm-proxy-api-key"
)
Audio Transcription​
import litellm
response = litellm.transcription(
model="litellm_proxy/whisper-1",
file="your-audio-file",
api_base="your-litellm-proxy-url",
api_key="your-litellm-proxy-api-key"
)
Text to Speech​
import litellm
response = litellm.speech(
model="litellm_proxy/tts-1",
input="Hello world",
api_base="your-litellm-proxy-url",
api_key="your-litellm-proxy-api-key"
)
Rerank​
import litellm
import litellm
response = litellm.rerank(
model="litellm_proxy/rerank-english-v2.0",
query="What is machine learning?",
documents=[
"Machine learning is a field of study in artificial intelligence",
"Biology is the study of living organisms"
],
api_base="your-litellm-proxy-url",
api_key="your-litellm-proxy-api-key"
)