How to Deploy LiteLLM Proxy on a VPS for a Unified OpenAI-Compatible API

When AI projects start to grow, you quickly end up with many models, many providers, and too many API keys to manage safely. You can deploy LiteLLM Proxy on a VPS to solve this problem with a single OpenAI‑compatible gateway that sits between your apps and all your models.

LiteLLM works like a control plane: your apps talk to one simple endpoint, and LiteLLM handles routing, provider selection, rate limits, and cost tracking in the background. This guide shows how to deploy LiteLLM Proxy on a VPS with Docker, connect cloud or self‑hosted models, and test everything using an OpenAI‑compatible client.

If you want to host models or gateways on powerful dedicated hardware, check the AI Hosting Environment for GPU and enterprise‑grade setups.

What LiteLLM Proxy Does

You can treat LiteLLM as a control plane that sits between:

  • Upstream models: OpenAI, Anthropic, Azure, Bedrock, vLLM, Ollama, and other OpenAI‑compatible backends.
  • Your applications: anything that already speaks the OpenAI API, such as LangChain apps, custom backends, or internal tools.

With this architecture, a LiteLLM Proxy deployed on a VPS lets you:

  • Use one OpenAI‑style endpoint instead of many different SDKs and URLs.
  • Rotate provider keys and change models without touching application code.
  • Control costs by setting budgets, logging spend, and issuing per‑team keys from one place.
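
One way to get the "no code changes" benefit in practice is to keep the gateway URL, key, and model name out of application code. The sketch below is a minimal illustration, not part of LiteLLM: the environment variable names (LITELLM_BASE_URL, LITELLM_API_KEY, LITELLM_MODEL), the default URL, and the helper itself are assumptions you can rename freely.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class GatewaySettings:
    base_url: str
    api_key: str
    model: str


def load_gateway_settings(env=None):
    """Read gateway settings from an environment-style mapping.

    The variable names are hypothetical; pick whatever fits your app.
    """
    env = os.environ if env is None else env
    return GatewaySettings(
        base_url=env.get("LITELLM_BASE_URL", "http://localhost:4000/v1"),
        api_key=env["LITELLM_API_KEY"],  # raises KeyError if missing
        model=env.get("LITELLM_MODEL", "gpt-4o-mini"),
    )


# Example with an explicit mapping instead of the real environment
settings = load_gateway_settings({
    "LITELLM_BASE_URL": "http://YOUR_VPS_IP:4000/v1",
    "LITELLM_API_KEY": "sk-your-virtual-key",
    "LITELLM_MODEL": "local-llm",
})
print(settings.model)
```

With a layout like this, switching an app from one model to another is a deployment setting rather than a code change.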

For a deeper strategy view of how enterprises do this at scale, check the Enterprise LLM Hosting tutorial.

Prerequisites to Deploy LiteLLM Proxy on a VPS

To deploy LiteLLM Proxy on a VPS smoothly, you need a fresh Linux VPS running Ubuntu 22.04 or newer with at least 2 GB RAM. Also, you need root or a non‑root user with sudo privileges.
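
If you want to confirm the memory requirement from a script, the sketch below parses the content of /proc/meminfo. It is only an illustration: the helper names are hypothetical, and the 10% tolerance is an assumption, since a "2 GB" VPS usually reports slightly less than 2 GiB of usable memory.

```python
def total_ram_gib(meminfo_text):
    """Return MemTotal from /proc/meminfo content, in GiB."""
    for line in meminfo_text.splitlines():
        if line.startswith("MemTotal:"):
            kib = int(line.split()[1])  # the kernel reports this value in kB
            return kib / (1024 * 1024)
    raise ValueError("MemTotal not found")


def meets_minimum(meminfo_text, minimum_gib=2.0, tolerance=0.9):
    """True if total RAM is at least minimum_gib * tolerance."""
    return total_ram_gib(meminfo_text) >= minimum_gib * tolerance


# Sample content; on the VPS you would read the real file instead
sample = "MemTotal:        2015376 kB\nMemFree:          812344 kB\n"
print(round(total_ram_gib(sample), 2), meets_minimum(sample))
```

On the server itself, you could call meets_minimum(open("/proc/meminfo").read()).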

If you need a clean and cost‑effective server for this, you can start with an affordable Linux VPS from PerLod.

Step 1: Prepare Your VPS for LiteLLM Proxy Deployment

First, you must update your local packages and install the required tools:

sudo apt update && sudo apt upgrade -y
sudo apt install ca-certificates curl gnupg lsb-release -y

Then, install Docker Engine and the Docker Compose plugin:

# Add Docker’s official GPG key
sudo mkdir -p /etc/apt/keyrings
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | \
sudo gpg --dearmor -o /etc/apt/keyrings/docker.gpg

# Set up Docker repo
echo \
  "deb [arch=$(dpkg --print-architecture) \
  signed-by=/etc/apt/keyrings/docker.gpg] \
  https://download.docker.com/linux/ubuntu \
  $(lsb_release -cs) stable" | \
sudo tee /etc/apt/sources.list.d/docker.list > /dev/null

# Install Docker and Compose
sudo apt update
sudo apt install docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin -y

Add your user to the Docker group so you can run containers without sudo:

sudo usermod -aG docker $USER
newgrp docker

Now your server is ready to deploy LiteLLM Proxy on a VPS.
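
Before moving on, you may want to confirm that both the docker CLI and the Compose plugin respond. The following is a small sketch using only the Python standard library; docker_status is a hypothetical helper, and it relies on the `docker compose version` subcommand provided by the Compose plugin installed above.

```python
import shutil
import subprocess


def docker_status(which=shutil.which, run=subprocess.run):
    """Report whether the docker CLI and the Compose plugin respond."""
    if which("docker") is None:
        return {"docker": False, "compose": False}
    # "docker compose version" exits with 0 when the plugin is installed
    compose = run(
        ["docker", "compose", "version"],
        capture_output=True,
        text=True,
    )
    return {"docker": True, "compose": compose.returncode == 0}


print(docker_status())
```

If either value is False, it is worth re-running the install commands before continuing.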

Step 2: Create LiteLLM Proxy Project Folder and Config

You must create a working directory and move into it:

mkdir -p ~/litellm-proxy
cd ~/litellm-proxy

Then, create the config.yaml file with the command below. This example exposes:

  • One OpenAI model, gpt-4o-mini.
  • One self‑hosted OpenAI‑compatible backend, for example, vLLM or another gateway.

nano config.yaml

Paste this minimal config and edit the values with yours:

model_list:
  - model_name: gpt-4o-mini
    litellm_params:
      model: openai/gpt-4o-mini
      api_key: os.environ/OPENAI_API_KEY

  - model_name: local-llm
    litellm_params:
      # Self-hosted OpenAI-compatible backend
      model: openai/local-llm
      api_base: http://your-self-hosted-llm:8000/v1
      api_key: os.environ/LOCAL_LLM_API_KEY

general_settings:
  master_key: sk-your-master-key-1234
  database_url: postgresql://llmproxy:dbpassword9090@db:5432/litellm

  • Set OPENAI_API_KEY and LOCAL_LLM_API_KEY only in the .env file.
  • Replace api_base with the real address of your own backend.
  • Use any secure, long string you create yourself as the master_key.
  • Set database_url based on your own Postgres settings.

You can generate the master_key by using the command below:

openssl rand -base64 32
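
If openssl is not available, a rough Python equivalent is sketched below. It is an assumption-based example: generate_master_key is a hypothetical helper, and the "sk-" prefix simply mirrors the placeholder key used in the example config.

```python
import base64
import secrets


def generate_master_key(num_bytes=32, prefix="sk-"):
    """Return prefix + base64 of num_bytes cryptographically random bytes."""
    raw = secrets.token_bytes(num_bytes)
    return prefix + base64.b64encode(raw).decode("ascii")


key = generate_master_key()
print(len(key))
```

Whichever method you use, store the result in a password manager and keep it out of version control.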

When you deploy LiteLLM Proxy on a VPS with this config, you will route both cloud and self‑hosted models through one OpenAI‑compatible endpoint.
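
The os.environ/NAME values in config.yaml tell LiteLLM to read a secret from an environment variable instead of storing it in the file. The sketch below only illustrates that convention on a plain dictionary. It is not LiteLLM's actual implementation, and resolve_env_refs is a hypothetical helper.

```python
import os

ENV_PREFIX = "os.environ/"


def resolve_env_refs(value, env=None):
    """Recursively replace 'os.environ/NAME' strings with env values."""
    env = os.environ if env is None else env
    if isinstance(value, dict):
        return {k: resolve_env_refs(v, env) for k, v in value.items()}
    if isinstance(value, list):
        return [resolve_env_refs(v, env) for v in value]
    if isinstance(value, str) and value.startswith(ENV_PREFIX):
        name = value[len(ENV_PREFIX):]
        if name not in env:
            raise KeyError(f"environment variable {name} is not set")
        return env[name]
    return value  # plain values pass through unchanged


entry = {
    "model_name": "gpt-4o-mini",
    "litellm_params": {
        "model": "openai/gpt-4o-mini",
        "api_key": "os.environ/OPENAI_API_KEY",
    },
}
resolved = resolve_env_refs(entry, {"OPENAI_API_KEY": "sk-your-openai-key"})
print(resolved["litellm_params"]["api_key"])
```

The practical consequence is that config.yaml can be shared or committed without exposing provider keys, as long as the environment supplies them.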

Step 3: Create Docker Compose YAML File for Proxy and Postgres

At this point, you can create the Docker Compose file so you can deploy LiteLLM Proxy with a Postgres database and container health checks:

nano docker-compose.yml

Use this example:

# No top-level "version" key: Compose v2 treats it as obsolete and ignores it

services:
  litellm:
    image: ghcr.io/berriai/litellm:main-stable
    container_name: litellm_proxy
    ports:
      - "4000:4000"
    volumes:
      - ./config.yaml:/app/config.yaml
    environment:
      # Database for keys, logs, UI
      DATABASE_URL: postgresql://llmproxy:dbpassword9090@db:5432/litellm
      STORE_MODEL_IN_DB: "True"

      # Provider keys pulled from host .env
      OPENAI_API_KEY: ${OPENAI_API_KEY}
      LOCAL_LLM_API_KEY: ${LOCAL_LLM_API_KEY}

      # Master key for Admin and virtual keys
      LITELLM_MASTER_KEY: sk-your-master-key-1234

      # Optional UI login
      UI_USERNAME: admin
      UI_PASSWORD: strong-admin-password
    depends_on:
      db:
        # Wait for the Postgres healthcheck below before starting LiteLLM
        condition: service_healthy
    healthcheck:
      test: [ "CMD-SHELL", "wget --no-verbose --tries=1 http://localhost:4000/health/liveliness || exit 1" ]
      interval: 30s
      timeout: 10s
      retries: 3
      start_period: 40s

  db:
    image: postgres:16
    container_name: litellm_db
    restart: always
    environment:
      POSTGRES_DB: litellm
      POSTGRES_USER: llmproxy
      POSTGRES_PASSWORD: dbpassword9090
    ports:
      # Bind to localhost only so Postgres is not exposed to the internet
      - "127.0.0.1:5432:5432"
    volumes:
      - postgres_data:/var/lib/postgresql/data
    healthcheck:
      test: [ "CMD-SHELL", "pg_isready -d litellm -U llmproxy" ]
      interval: 1s
      timeout: 5s
      retries: 10

volumes:
  postgres_data:
    driver: local

This lets you deploy LiteLLM Proxy on a VPS with Postgres for key storage, spend tracking, and models added via UI.

Step 4: Add .env File with Provider Keys

Create a simple .env file with the command below:

nano .env

Fill it in with your keys:

OPENAI_API_KEY=sk-your-openai-key
LOCAL_LLM_API_KEY=sk-your-local-llm-key-or-token

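Docker Compose reads these values from the .env file in the project directory and substitutes them into the ${OPENAI_API_KEY} and ${LOCAL_LLM_API_KEY} placeholders in docker-compose.yml, which passes them to the container.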
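
An empty or missing value in .env is easy to overlook, because the container still starts and only fails when a request reaches the provider. The following sketch checks for that with a deliberately simplified KEY=VALUE parser. It does not cover the full .env syntax that Docker Compose accepts, and both helper names are hypothetical.

```python
REQUIRED_KEYS = ("OPENAI_API_KEY", "LOCAL_LLM_API_KEY")


def parse_env(text):
    """Parse simple KEY=VALUE lines; skips blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(text, required=REQUIRED_KEYS):
    """Return the required keys that are absent or empty."""
    values = parse_env(text)
    return [k for k in required if not values.get(k)]


sample = "OPENAI_API_KEY=sk-your-openai-key\nLOCAL_LLM_API_KEY=\n"
print(missing_keys(sample))
```

On the server, you could pass the content of the real file, for example missing_keys(open(".env").read()).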

If you want to run larger self‑hosted models like 70B behind this gateway, you can check this guide on Running a 70B Model on One Server to evaluate hardware.

Step 5: Start the LiteLLM Proxy Stack

Use the following command to start the stack:

docker compose up -d

Check the containers’ status:

docker compose ps
docker logs -f litellm_proxy

You should see LiteLLM listening on http://0.0.0.0:4000 and connecting to the Postgres database. At this point, the unified gateway is up and ready.
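
The proxy can take a little while to start, so a script that runs right after `docker compose up -d` may need to wait for it. Below is a minimal polling sketch using only the standard library. The helper names, attempt count, and delay are assumptions; the URL you pass in would be the same /health/liveliness endpoint that the Compose healthcheck uses.

```python
import time
import urllib.request


def is_alive(url, timeout=5):
    """Return True if the URL answers with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except OSError:  # covers URLError, HTTPError, refused connections, timeouts
        return False


def wait_until_ready(url, attempts=10, delay=3.0, probe=is_alive, sleep=time.sleep):
    """Poll until the probe succeeds; return the attempt number or None."""
    for attempt in range(1, attempts + 1):
        if probe(url):
            return attempt
        if attempt < attempts:
            sleep(delay)
    return None
```

For example, wait_until_ready("http://localhost:4000/health/liveliness") returns the attempt number on which the proxy answered, or None if it never did.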

Step 6: Access the LiteLLM Proxy Admin UI and Swagger

To access the admin panel, open your browser and go to:

http://YOUR_VPS_IP:4000/ui

To access the Swagger API documentation, go to:

http://YOUR_VPS_IP:4000/

Log in to the admin panel with the UI_USERNAME and UI_PASSWORD values you set in the Docker Compose YAML file.

From there, you can:

  • Generate new virtual API keys.
  • See logs and spend per key.
  • Add models via the UI when STORE_MODEL_IN_DB=True.

This turns your setup into a full gateway for teams.
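
Virtual keys can also be created over HTTP instead of through the UI. The sketch below builds such a request without sending it. It assumes the proxy's /key/generate endpoint and the models, max_budget, and duration fields; verify these against the LiteLLM documentation for the version you run. The helper name build_key_request is hypothetical.

```python
import json
import urllib.request


def build_key_request(base_url, master_key, models, max_budget=None, duration=None):
    """Build (but do not send) a POST request to /key/generate."""
    payload = {"models": list(models)}
    if max_budget is not None:
        payload["max_budget"] = max_budget  # assumed field: spend limit
    if duration is not None:
        payload["duration"] = duration  # assumed field: lifetime, e.g. "30d"
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/key/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {master_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


req = build_key_request(
    "http://YOUR_VPS_IP:4000",
    "sk-your-master-key-1234",
    models=["gpt-4o-mini"],
    max_budget=10.0,
    duration="30d",
)
print(req.full_url)
```

To send it, you could pass req to urllib.request.urlopen and read the JSON response, which is expected to contain the new key. Applications would then use that key in place of the master key.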

Test Your LiteLLM Proxy Deployment with curl

You can test the OpenAI‑compatible endpoint using one of the logical model names, like gpt-4o-mini:

curl http://YOUR_VPS_IP:4000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer sk-your-master-key-1234" \
  -d '{
    "model": "gpt-4o-mini",
    "messages": [
      {"role": "user", "content": "Hello from LiteLLM proxy on a VPS!"}
    ]
  }'

If you get a normal Chat Completion JSON response, your setup is working.

You can also use the local-llm model:

curl http://YOUR_VPS_IP:4000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer sk-your-master-key-1234" \
  -d '{
    "model": "local-llm",
    "messages": [
      {"role": "user", "content": "Reply from the self-hosted model"}
    ]
  }'
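
The same two checks can be scripted without extra packages. The sketch below mirrors the curl calls above using only the Python standard library: it builds one request per logical model name and shows how to read the reply text from a Chat Completions body. The helper names are hypothetical, and the sample body is trimmed to the fields the code reads.

```python
import json
import urllib.request


def build_chat_request(base_url, api_key, model, prompt):
    """Build (but do not send) a POST request to /v1/chat/completions."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def extract_reply(response_body):
    """Return the assistant text from a Chat Completions JSON body."""
    data = json.loads(response_body)
    return data["choices"][0]["message"]["content"]


# One request per logical model name from config.yaml
requests_to_send = [
    build_chat_request("http://YOUR_VPS_IP:4000", "sk-your-master-key-1234", name, "ping")
    for name in ("gpt-4o-mini", "local-llm")
]

# A trimmed sample body, only to show the fields extract_reply reads
sample_body = '{"choices": [{"message": {"role": "assistant", "content": "pong"}}]}'
print(extract_reply(sample_body))
```

To run the check for real, you could send each request with urllib.request.urlopen(req, timeout=60) and pass resp.read() to extract_reply.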

Test Your LiteLLM Proxy Deployment from Python

Because the proxy is OpenAI‑compatible, any OpenAI client can point to it by changing only base_url and the key.

pip install openai

Then use this script for Python 3.10 and higher:

from openai import OpenAI

client = OpenAI(
    base_url="http://YOUR_VPS_IP:4000/v1",
    api_key="sk-your-master-key-1234",
)

resp = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Test message via LiteLLM proxy"}],
)

print(resp.choices[0].message.content)

If this prints a response, you have successfully deployed LiteLLM Proxy on a VPS as a drop‑in OpenAI‑compatible gateway.

Conclusion

Deploying LiteLLM Proxy on a VPS is a simple move that gives your team a unified OpenAI‑compatible API, even if you use many providers and a mix of cloud and self‑hosted models. LiteLLM acts as a control plane between your apps and models so that you can focus on product features rather than keys, routing, and policy logic.

We hope you enjoy this guide. For more detailed information, you can check the official LiteLLM Docker Page.

FAQs

Can I add more providers after deploying LiteLLM Proxy?

Yes, just add more entries under model_list in the config.yaml file and restart the container.

Can I skip Postgres in the LiteLLM Proxy deployment?

Yes, remove the db service, the depends_on entry that refers to it, and the DATABASE_URL, STORE_MODEL_IN_DB, and database_url settings to run without a database, but you lose the UI‑based model and key management.

How do I manage teams and budgets in LiteLLM Proxy deployment?

Enable the database, set a master_key, and use the Admin UI plus key‑generation endpoints to create scoped keys and track spend.
