How to Deploy LiteLLM Proxy on a VPS for a Unified OpenAI-Compatible API
When AI projects start to grow, you quickly end up with many models, many providers, and too many API keys to manage safely. You can deploy LiteLLM Proxy on a VPS to solve this problem with a single OpenAI‑compatible gateway that sits between your apps and all your models.
LiteLLM works like a control plane: your apps talk to one simple endpoint, and LiteLLM handles routing, provider selection, rate limits, and cost tracking in the background. This guide shows how to deploy LiteLLM Proxy on a VPS with Docker, connect cloud or self‑hosted models, and test everything using an OpenAI‑compatible client.
If you want to host models or gateways on powerful dedicated hardware, check the AI Hosting Environment for GPU and enterprise‑grade setups.
What LiteLLM Proxy Does
You can treat LiteLLM as a control plane that sits between:
- Upstream models: OpenAI, Anthropic, Azure, Bedrock, vLLM, Ollama, and other OpenAI‑compatible backends.
- Your applications: anything that already speaks the OpenAI API, such as LangChain apps, custom backends, or internal tools.
With this architecture, LiteLLM Proxy on a VPS lets you:
- Use one OpenAI‑style endpoint instead of many different SDKs and URLs.
- Rotate provider keys and change models without touching application code.
- Control costs by setting budgets, logging spend, and issuing per‑team keys from one place.
For a deeper strategy view of how enterprises do this at scale, check the Enterprise LLM Hosting tutorial.
Prerequisites to Deploy LiteLLM Proxy on a VPS
To deploy LiteLLM Proxy on a VPS smoothly, you need a fresh Linux VPS running Ubuntu 22.04 or newer with at least 2 GB of RAM, plus root access or a non‑root user with sudo privileges.
If you need a clean and cost‑effective server for this, you can start with an affordable Linux VPS from PerLod.
Step 1: Prepare Your VPS for LiteLLM Proxy Deployment
First, you must update your local packages and install the required tools:
sudo apt update && sudo apt upgrade -y
sudo apt install ca-certificates curl gnupg lsb-release -y
Then, install Docker Engine and the Docker Compose plugin:
# Add Docker’s official GPG key
sudo mkdir -p /etc/apt/keyrings
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | \
sudo gpg --dearmor -o /etc/apt/keyrings/docker.gpg
# Set up Docker repo
echo \
"deb [arch=$(dpkg --print-architecture) \
signed-by=/etc/apt/keyrings/docker.gpg] \
https://download.docker.com/linux/ubuntu \
$(lsb_release -cs) stable" | \
sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
# Install Docker and Compose
sudo apt update
sudo apt install docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin -y
Add your user to the Docker group so you can run containers without sudo:
sudo usermod -aG docker $USER
newgrp docker
Your server is now ready for the LiteLLM Proxy deployment.
Step 2: Create LiteLLM Proxy Project Folder and Config
You must create a working directory and move into it:
mkdir -p ~/litellm-proxy
cd ~/litellm-proxy
Then, create the config.yaml file with the command below. This example exposes:
- One OpenAI model, gpt-4o-mini.
- One self‑hosted OpenAI‑compatible backend, for example, vLLM or another gateway.
nano config.yaml
Paste this minimal config and edit the values with yours:
model_list:
  - model_name: gpt-4o-mini
    litellm_params:
      model: openai/gpt-4o-mini
      api_key: os.environ/OPENAI_API_KEY
  - model_name: local-llm
    litellm_params:
      # Self-hosted OpenAI-compatible backend
      model: openai/local-llm
      api_base: http://your-self-hosted-llm:8000/v1
      api_key: os.environ/LOCAL_LLM_API_KEY
general_settings:
  master_key: sk-your-master-key-1234
  database_url: postgresql://llmproxy:dbpassword9090@db:5432/litellm
- Set OPENAI_API_KEY and LOCAL_LLM_API_KEY only in the .env file.
- Replace api_base with the real address of your own backend.
- Use any secure, long string you create yourself as the master_key.
- Set database_url based on your own Postgres settings.
You can generate a random string for the master_key with the command below, then add the sk- prefix used in the sample key:
openssl rand -base64 32
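The same step can be done from Python. This is a minimal sketch, assuming you want the sk- prefix shown in the sample key (to my understanding LiteLLM expects master keys to start with sk-, but check the official documentation). The function name generate_master_key is hypothetical.

```python
import secrets


def generate_master_key(num_bytes: int = 32) -> str:
    """Return a random URL-safe key with the "sk-" prefix used in this guide."""
    # token_urlsafe draws num_bytes of randomness from the OS CSPRNG
    return "sk-" + secrets.token_urlsafe(num_bytes)


print(generate_master_key())
```

Paste the printed value into config.yaml and docker-compose.yml in place of sk-your-master-key-1234.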
When you deploy LiteLLM Proxy on a VPS with this config, you will route both cloud and self‑hosted models through one OpenAI‑compatible endpoint.
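Before starting the stack, it can help to confirm that every os.environ/NAME reference in the config has a matching environment variable. The sketch below mirrors the model_list above as a Python dict so it needs no YAML parser. The helper names are hypothetical and this is not part of LiteLLM itself.

```python
import os

ENV_PREFIX = "os.environ/"

# Mirrors the model_list from config.yaml, written as a dict for illustration.
CONFIG = {
    "model_list": [
        {
            "model_name": "gpt-4o-mini",
            "litellm_params": {
                "model": "openai/gpt-4o-mini",
                "api_key": "os.environ/OPENAI_API_KEY",
            },
        },
        {
            "model_name": "local-llm",
            "litellm_params": {
                "model": "openai/local-llm",
                "api_base": "http://your-self-hosted-llm:8000/v1",
                "api_key": "os.environ/LOCAL_LLM_API_KEY",
            },
        },
    ]
}


def referenced_env_vars(config: dict) -> dict:
    """Map each logical model name to the env var names its params reference."""
    result = {}
    for entry in config.get("model_list", []):
        params = entry.get("litellm_params", {})
        result[entry["model_name"]] = [
            value[len(ENV_PREFIX):]
            for value in params.values()
            if isinstance(value, str) and value.startswith(ENV_PREFIX)
        ]
    return result


def missing_env_vars(config: dict, environ=os.environ) -> list:
    """Return referenced env var names that are unset or empty, sorted."""
    needed = {name for names in referenced_env_vars(config).values() for name in names}
    return sorted(name for name in needed if not environ.get(name))


print(referenced_env_vars(CONFIG))
```

Calling missing_env_vars(CONFIG) on the host lists the variables you still need to define in the .env file.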
Step 3: Create Docker Compose YAML File for Proxy and Postgres
At this point, you can create the Docker Compose file so you can deploy LiteLLM Proxy with a Postgres database and container health checks:
nano docker-compose.yml
Use this example:
services:
  litellm:
    image: ghcr.io/berriai/litellm:main-stable
    container_name: litellm_proxy
    # Pass the mounted config file to the proxy
    command: ["--config", "/app/config.yaml", "--port", "4000"]
    ports:
      - "4000:4000"
    volumes:
      - ./config.yaml:/app/config.yaml
    environment:
      # Database for keys, logs, UI
      DATABASE_URL: postgresql://llmproxy:dbpassword9090@db:5432/litellm
      STORE_MODEL_IN_DB: "True"
      # Provider keys pulled from host .env
      OPENAI_API_KEY: ${OPENAI_API_KEY}
      LOCAL_LLM_API_KEY: ${LOCAL_LLM_API_KEY}
      # Master key for Admin and virtual keys
      LITELLM_MASTER_KEY: sk-your-master-key-1234
      # Optional UI login
      UI_USERNAME: admin
      UI_PASSWORD: strong-admin-password
    depends_on:
      - db
    healthcheck:
      test: [ "CMD-SHELL", "wget --no-verbose --tries=1 http://localhost:4000/health/liveliness || exit 1" ]
      interval: 30s
      timeout: 10s
      retries: 3
      start_period: 40s
  db:
    image: postgres:16
    container_name: litellm_db
    restart: always
    environment:
      POSTGRES_DB: litellm
      POSTGRES_USER: llmproxy
      POSTGRES_PASSWORD: dbpassword9090
    ports:
      - "5432:5432"
    volumes:
      - postgres_data:/var/lib/postgresql/data
    healthcheck:
      test: [ "CMD-SHELL", "pg_isready -d litellm -U llmproxy" ]
      interval: 1s
      timeout: 5s
      retries: 10
volumes:
  postgres_data:
    driver: local
This lets you deploy LiteLLM Proxy on a VPS with Postgres for key storage, spend tracking, and models added via UI.
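A common mistake in this file is a DATABASE_URL that no longer matches the POSTGRES_* values after you change the defaults. The sketch below compares the two using only the standard library. The function name check_database_url is hypothetical, and it assumes the password contains no characters that need percent-encoding.

```python
from urllib.parse import urlsplit


def check_database_url(database_url: str, postgres_env: dict, service_name: str = "db") -> list:
    """Return a list of mismatches between DATABASE_URL and the POSTGRES_* values."""
    parts = urlsplit(database_url)
    problems = []
    if parts.scheme not in ("postgresql", "postgres"):
        problems.append(f"unexpected scheme: {parts.scheme!r}")
    if parts.username != postgres_env.get("POSTGRES_USER"):
        problems.append("user does not match POSTGRES_USER")
    if parts.password != postgres_env.get("POSTGRES_PASSWORD"):
        problems.append("password does not match POSTGRES_PASSWORD")
    # Inside the Compose network, the host is the service name, not localhost
    if parts.hostname != service_name:
        problems.append(f"host should be the Compose service name {service_name!r}")
    if parts.path.lstrip("/") != postgres_env.get("POSTGRES_DB"):
        problems.append("database name does not match POSTGRES_DB")
    return problems


# Values copied from the example Compose file
postgres_env = {
    "POSTGRES_DB": "litellm",
    "POSTGRES_USER": "llmproxy",
    "POSTGRES_PASSWORD": "dbpassword9090",
}
print(check_database_url("postgresql://llmproxy:dbpassword9090@db:5432/litellm", postgres_env))
```

An empty list means the URL and the database service agree.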
Step 4: Add .env File with Provider Keys
Create a simple .env file with the command below:
nano .env
Fill in with your keys:
OPENAI_API_KEY=sk-your-openai-key
LOCAL_LLM_API_KEY=sk-your-local-llm-key-or-token
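Docker Compose reads this .env file from the project directory and substitutes the values into ${OPENAI_API_KEY} and ${LOCAL_LLM_API_KEY} in docker-compose.yml, which passes them to the container.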
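To catch an empty or missing key before starting the stack, you could run a small check like the one below. It is a simplified sketch: the parser handles only plain KEY=VALUE lines, blank lines, and # comments, not the full .env syntax that Docker Compose supports, and the helper names are hypothetical.

```python
REQUIRED_KEYS = ("OPENAI_API_KEY", "LOCAL_LLM_API_KEY")


def parse_env(text: str) -> dict:
    """Parse simple KEY=VALUE lines; skip blank lines and # comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding quotes, if any
        values[key.strip()] = value.strip().strip("'\"")
    return values


def missing_keys(text: str, required=REQUIRED_KEYS) -> list:
    """Return the required keys that are absent or empty."""
    values = parse_env(text)
    return [key for key in required if not values.get(key)]


sample = "OPENAI_API_KEY=sk-your-openai-key\nLOCAL_LLM_API_KEY=\n"
print(missing_keys(sample))
```

To check the real file, pass the contents of .env to missing_keys instead of the sample string.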
If you want to run larger self‑hosted models like 70B behind this gateway, you can check this guide on Running a 70B Model on One Server to evaluate hardware.
Step 5: Start the LiteLLM Proxy Stack
Use the following command to start the stack:
docker compose up -d
Check the container status and follow the proxy logs:
docker compose ps
docker logs -f litellm_proxy
You should see LiteLLM listening on http://0.0.0.0:4000 and connecting to the Postgres database. At this point, the unified gateway is up and ready.
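The proxy can take a little while to become ready after the containers start. The sketch below polls the /health/liveliness endpoint used in the Compose health check until it answers with HTTP 200. The function names, attempt count, and delay are illustrative assumptions.

```python
import time
import urllib.request


def http_ok(url: str, timeout: float = 5.0) -> bool:
    """Return True if a GET to url answers with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except OSError:
        # Covers connection failures and urllib's URLError/HTTPError
        return False


def wait_until_ready(url, attempts=10, delay=3.0, probe=http_ok, sleep=time.sleep) -> bool:
    """Poll url until probe succeeds or the attempts run out."""
    for attempt in range(attempts):
        if probe(url):
            return True
        if attempt < attempts - 1:
            sleep(delay)
    return False


# Example call (replace YOUR_VPS_IP with your server address):
# wait_until_ready("http://YOUR_VPS_IP:4000/health/liveliness")
```

The probe and sleep parameters are injectable so the loop can be exercised without a running server.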
Step 6: Access the LiteLLM Proxy Admin UI and Swagger
To access the admin panel, open your browser and go to:
http://YOUR_VPS_IP:4000/ui
To access the Swagger API docs, go to:
http://YOUR_VPS_IP:4000/
Log in to the admin UI with the UI_USERNAME and UI_PASSWORD values you set in docker-compose.yml.
From there, you can:
- Generate new virtual API keys.
- See logs and spend per key.
- Add models via the UI when STORE_MODEL_IN_DB=True.
This turns your setup into a full gateway for teams.
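Virtual keys can also be created over HTTP instead of through the UI. The sketch below builds a request for LiteLLM's /key/generate endpoint using only the standard library. The endpoint and the models, max_budget, and duration fields are taken from my reading of the LiteLLM documentation, so treat them as assumptions and verify them against the version you run. The function name build_key_request is hypothetical.

```python
import json
import urllib.request


def build_key_request(base_url, master_key, models, max_budget=None, duration=None):
    """Build (but do not send) a POST request that asks the proxy for a virtual key."""
    payload = {"models": list(models)}
    if max_budget is not None:
        payload["max_budget"] = max_budget  # spend limit for the key (assumed field)
    if duration is not None:
        payload["duration"] = duration  # key lifetime such as "30d" (assumed field)
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/key/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {master_key}",
        },
        method="POST",
    )


req = build_key_request(
    "http://YOUR_VPS_IP:4000",
    "sk-your-master-key-1234",
    ["gpt-4o-mini"],
    max_budget=10,
    duration="30d",
)
print(req.full_url)
# To send it: urllib.request.urlopen(req) and read the JSON body of the response.
```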
Test Your LiteLLM Proxy Deployment with curl
You can test the OpenAI‑compatible endpoint using one of the logical model names, like gpt-4o-mini:
curl http://YOUR_VPS_IP:4000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer sk-your-master-key-1234" \
  -d '{
    "model": "gpt-4o-mini",
    "messages": [
      {"role": "user", "content": "Hello from LiteLLM proxy on a VPS!"}
    ]
  }'
If you get a normal Chat Completion JSON response, your setup is working.
You can also use the local-llm model:
curl http://YOUR_VPS_IP:4000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer sk-your-master-key-1234" \
  -d '{
    "model": "local-llm",
    "messages": [
      {"role": "user", "content": "Reply from the self-hosted model"}
    ]
  }'
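A successful call to either model returns a Chat Completion object in the OpenAI format, with the reply text under choices[0].message.content. The sketch below pulls that text out of a raw JSON body and raises if the proxy returned an error instead. The sample body is a trimmed, made-up illustration and not real output from the proxy.

```python
import json


def extract_reply(response_text: str) -> str:
    """Return the assistant text from a Chat Completion JSON body."""
    data = json.loads(response_text)
    if "error" in data:
        raise RuntimeError(f"proxy returned an error: {data['error']}")
    choices = data.get("choices") or []
    if not choices:
        raise ValueError("response has no choices")
    return choices[0]["message"]["content"]


# Illustrative body only; real responses carry more fields (id, usage, and so on)
SAMPLE_BODY = json.dumps({
    "object": "chat.completion",
    "model": "gpt-4o-mini",
    "choices": [
        {
            "index": 0,
            "message": {"role": "assistant", "content": "Hello!"},
            "finish_reason": "stop",
        }
    ],
})

print(extract_reply(SAMPLE_BODY))
```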
Test Your LiteLLM Proxy Deployment from Python
Because the proxy is OpenAI‑compatible, any OpenAI client can point to it by changing only base_url and the key.
pip install openai
Then use this script for Python 3.10 and higher:
from openai import OpenAI

client = OpenAI(
    base_url="http://YOUR_VPS_IP:4000/v1",
    api_key="sk-your-master-key-1234",
)

resp = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Test message via LiteLLM proxy"}],
)

print(resp.choices[0].message.content)
If this prints a response, you have successfully deployed LiteLLM Proxy on a VPS as a drop‑in OpenAI‑compatible gateway.
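Because both logical model names are served from the same endpoint, switching between them only means changing the model string. The sketch below tries a list of model names in order and returns the first reply. It is a client-side illustration with a hypothetical function name, not a LiteLLM feature, and it accepts any object that exposes chat.completions.create, such as the OpenAI client created above.

```python
def first_successful_reply(client, models, prompt):
    """Try each logical model in order; return (model, reply) for the first that works."""
    errors = {}
    for model in models:
        try:
            resp = client.chat.completions.create(
                model=model,
                messages=[{"role": "user", "content": prompt}],
            )
            return model, resp.choices[0].message.content
        except Exception as exc:  # broad on purpose: any failure moves on to the next model
            errors[model] = exc
    raise RuntimeError(f"all models failed: {errors}")


# Example call with the client from the script above:
# model, reply = first_successful_reply(client, ["gpt-4o-mini", "local-llm"], "ping")
```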
Conclusion
Deploying LiteLLM Proxy on a VPS is a simple move that gives your team a unified OpenAI‑compatible API, even if you use many providers and a mix of cloud and self‑hosted models. LiteLLM acts as a control plane between your apps and models so that you can focus on product features rather than keys, routing, and policy logic.
We hope you enjoy this guide. For more detailed information, you can check the official LiteLLM Docker Page.
FAQs
Can I add more providers after deploying LiteLLM Proxy?
Yes, add more entries under model_list in the config.yaml file and restart the container.
Can I skip Postgres in the LiteLLM Proxy deployment?
Yes. Remove the db service, the depends_on entry, the DATABASE_URL and STORE_MODEL_IN_DB variables, and the database_url line in config.yaml to run without a database, but you lose the UI‑based model and key management.
How do I manage teams and budgets in LiteLLM Proxy deployment?
Enable the database, set a master_key, and use the Admin UI plus key‑generation endpoints to create scoped keys and track spend.