Qdrant Optimization Guide: Snapshots, RAM Tuning, and Secure Public Access

If you already have Qdrant running on your Linux server, the next step is to make it production-ready. That means tuning RAM usage, taking proper snapshots, and setting up safe network access so the service can handle real traffic.

This guide covers each of those areas in turn. If you haven’t set up Qdrant yet, you can follow this guide on Qdrant installation on a Linux server.

Prerequisites for How to Optimize Qdrant

Before starting, you must make sure:

  • Qdrant is installed and running via Docker or binary.
  • You have sudo or root access on your Linux VPS.
  • curl is installed for API calls.
  • You’re using Ubuntu 22.04 or newer.

Part 1: Understand How Qdrant Uses Memory

Before you start tuning, you need to understand what Qdrant actually does with your RAM. Qdrant has two main storage modes:

  • In-Memory Storage: All vectors live in RAM. Fastest possible search, but expensive on large collections.
  • Memmap on-disk Storage: Vectors live on disk and are accessed via OS page cache. When you have enough RAM, it’s almost as fast as in-memory.

Important note: Tools like htop or top can be misleading. If they show Qdrant using 10 GB of RAM, it doesn’t mean the service requires 10 GB to run. With memmap storage, much of that figure is disk data cached in memory, which the operating system can reclaim when other processes need it. Unused RAM is wasted RAM, so the cache grows to use what’s available.

Check Current Memory Usage:

# Check Qdrant process memory
ps aux | grep qdrant

# Check Docker container memory
docker stats qdrant --no-stream

Check Your Collections Info:

curl http://localhost:6333/collections

Part 2: How to Optimize Qdrant Memory | RAM Tuning

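This is the core of optimizing Qdrant for production. You have three real scenarios to choose from, depending on your setup. Each example below creates a new collection with PUT; the request fails if my_collection already exists, so use a fresh name or update the existing collection’s parameters instead.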

Scenario A: Low RAM, Need Speed (Quantization with On-Disk)

This is the most common setup for a VPS with limited memory. You store the original vectors on disk but keep compressed (quantized) vectors in RAM. Scalar quantization compresses each float32 vector component to int8, which cuts vector memory by about 4x while typically retaining around 99% of search accuracy, depending on your data.

curl -X PUT http://localhost:6333/collections/my_collection \
  -H 'Content-Type: application/json' \
  -d '{
    "vectors": {
      "size": 768,
      "distance": "Cosine",
      "on_disk": true
    },
    "quantization_config": {
      "scalar": {
        "type": "int8",
        "always_ram": true
      }
    }
  }'

This setup does the following:

  • "on_disk": true: Original vectors stay on disk.
  • "always_ram": true: Quantized vectors stay in RAM for fast search.
  • Result: Much lower RAM usage, fast search, and minimal accuracy loss.
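
To see where the 4x figure comes from, the arithmetic can be sketched in shell. This is a rough estimate under stated assumptions: vector_mib is a hypothetical helper, float32 components take 4 bytes, int8 components take 1 byte, and the HNSW index, payload, and other overhead are ignored.

```shell
#!/usr/bin/env bash
# Rough size of raw vector data in MiB: vectors * dimensions * bytes per component.
# Hypothetical helper; ignores the HNSW index, payload, and other overhead.
vector_mib() {
  local vectors=$1 dims=$2 bytes_per_component=$3
  echo $(( vectors * dims * bytes_per_component / 1024 / 1024 ))
}

# 1 million 768-dimensional vectors
vector_mib 1000000 768 4   # float32 originals
vector_mib 1000000 768 1   # int8 quantized copies
```

For 1 million 768-dimensional vectors this prints 2929 and 732: roughly 2.9 GiB of original vectors that can stay on disk, and about 0.7 GiB of quantized vectors held in RAM.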

Scenario B: Very Low RAM, Precision Matters

If you need high accuracy but have very little RAM, put both vectors and the HNSW index on disk:

curl -X PUT http://localhost:6333/collections/my_collection \
  -H 'Content-Type: application/json' \
  -d '{
    "vectors": {
      "size": 768,
      "distance": "Cosine",
      "on_disk": true
    },
    "hnsw_config": {
      "on_disk": true
    }
  }'

Note: This setup depends heavily on your disk speed. For good results, use an SSD with at least 50,000 IOPS. Slow HDDs will make this very sluggish.

Scenario C: High RAM, Max Performance

If you have enough RAM, for example, 16 GB+ with moderate data, keep everything in memory and use quantization only for speed, not memory savings:

curl -X PUT http://localhost:6333/collections/my_collection \
  -H 'Content-Type: application/json' \
  -d '{
    "vectors": {
      "size": 768,
      "distance": "Cosine"
    },
    "quantization_config": {
      "scalar": {
        "type": "int8",
        "always_ram": true
      }
    }
  }'

For more detailed information, see the official Qdrant performance optimization guide.

Set RAM Limits in Docker

To prevent Qdrant from using all your server memory, you can set a hard limit in your docker-compose.yml:

version: '3.8'
services:
  qdrant:
    image: qdrant/qdrant:latest
    restart: unless-stopped
    ports:
      - "127.0.0.1:6333:6333"
      - "127.0.0.1:6334:6334"
    volumes:
      - ./qdrant_storage:/qdrant/storage
      - ./config.yaml:/qdrant/config/production.yaml
    environment:
      - QDRANT__CONFIG_PATH=/qdrant/config/production.yaml
    deploy:
      resources:
        limits:
          memory: 8G
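
After the container starts, it may be worth confirming that the limit was actually applied, since older docker-compose v1 releases ignored the deploy section unless run with --compatibility. A minimal sketch; bytes_to_gib and show_memory_limit are hypothetical helpers, and the container name is an assumption you should replace with the name docker ps shows:

```shell
#!/usr/bin/env bash
# Convert a byte count to whole GiB (hypothetical helper).
bytes_to_gib() {
  echo $(( $1 / 1024 / 1024 / 1024 ))
}

# Print the memory limit Docker applied to a container; 0 bytes means "no limit".
# The container name is an assumption: pass whatever `docker ps` shows.
show_memory_limit() {
  local container=$1 limit_bytes
  limit_bytes=$(docker inspect --format '{{.HostConfig.Memory}}' "$container") || return 1
  echo "$container memory limit: $limit_bytes bytes ($(bytes_to_gib "$limit_bytes") GiB)"
}

bytes_to_gib 8589934592   # the 8G limit from the Compose file, in bytes
```

Calling show_memory_limit with your container name should report 8589934592 bytes (8 GiB) when the limit above is in effect.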

Tune Segment Count for Latency vs Throughput

For low latency, set the number of segments equal to your CPU core count so that a single query can be searched across all cores in parallel:

curl -X PUT http://localhost:6333/collections/my_collection \
  -H 'Content-Type: application/json' \
  -d '{
    "vectors": {"size": 768, "distance": "Cosine"},
    "optimizers_config": {
      "default_segment_number": 8
    }
  }'

For high throughput, use fewer, larger segments so that each query touches fewer indexes and more queries can run side by side:

curl -X PUT http://localhost:6333/collections/my_collection \
  -H 'Content-Type: application/json' \
  -d '{
    "vectors": {"size": 768, "distance": "Cosine"},
    "optimizers_config": {
      "default_segment_number": 2,
      "max_segment_size": 5000000
    }
  }'
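
Rather than hard-coding the segment count, the low-latency value can be derived from the machine. A sketch, assuming the collection already exists and that changing it through the collection update request (PATCH) suits your setup; segment_payload and apply_segment_count are hypothetical helper names:

```shell
#!/usr/bin/env bash
# Build the JSON body that sets default_segment_number to the given value.
segment_payload() {
  printf '{"optimizers_config": {"default_segment_number": %d}}\n' "$1"
}

# Send the setting to an existing collection (not called here; needs a running Qdrant).
# Arguments: collection name, API key. The URL matches the examples in this guide.
apply_segment_count() {
  local collection=$1 api_key=$2
  curl -sf -X PATCH "http://localhost:6333/collections/$collection" \
    -H 'Content-Type: application/json' \
    -H "api-key: $api_key" \
    -d "$(segment_payload "$(nproc)")"
}

segment_payload 8
```

On an 8-core server, apply_segment_count my_collection YOUR_API_KEY would send the same body that segment_payload 8 prints.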

Part 3: Qdrant Snapshots and Backup

This is where a lot of self-hosted vector database setups fail. People configure Qdrant perfectly, then lose everything because they never set up backups. Knowing how to optimize Qdrant isn’t just about performance; it’s also about making sure your data survives.

Qdrant snapshots are .tar archive files that contain all data and configuration for a collection at a specific point in time. They are not the same as filesystem-level snapshots.

First, create a Snapshot for one collection:

curl -X POST http://localhost:6333/collections/my_collection/snapshots \
  -H 'api-key: YOUR_API_KEY'

You’ll get a response like:

{
  "result": {
    "name": "my_collection-2025-06-02-10-30-00.snapshot",
    "creation_time": "2025-06-02T10:30:00",
    "size": 52428800
  },
  "status": "ok"
}

List all snapshots with the command below:

curl http://localhost:6333/collections/my_collection/snapshots \
  -H 'api-key: YOUR_API_KEY'

Download a snapshot file with:

curl http://localhost:6333/collections/my_collection/snapshots/my_collection-2025-06-02-10-30-00.snapshot \
  -H 'api-key: YOUR_API_KEY' \
  --output my_collection_backup.snapshot
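
Because a snapshot is a tar archive, a downloaded file can be sanity-checked before you rely on it. A minimal sketch; is_valid_snapshot is a hypothetical helper, and a readable archive is not proof that a restore will succeed:

```shell
#!/usr/bin/env bash
# Return 0 if the file exists, is non-empty, and tar can list its contents.
is_valid_snapshot() {
  local file=$1
  [ -s "$file" ] || return 1
  tar -tf "$file" > /dev/null 2>&1
}

# Example: check the file downloaded above, if it is present.
if [ -e my_collection_backup.snapshot ]; then
  if is_valid_snapshot my_collection_backup.snapshot; then
    echo "snapshot archive is readable"
  else
    echo "snapshot is empty or not a readable archive" >&2
  fi
fi
```

A check like this catches the common case where a failed request saved an empty file instead of the archive.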

Next, create a full storage snapshot with the command below:

curl -X POST http://localhost:6333/snapshots \
  -H 'api-key: YOUR_API_KEY'

This creates a snapshot of your entire Qdrant storage, including collection aliases.

To restore a snapshot from a local file, you can use:

curl -X PUT http://localhost:6333/collections/my_collection/snapshots/recover \
  -H 'Content-Type: application/json' \
  -H 'api-key: YOUR_API_KEY' \
  -d '{
    "location": "file:///qdrant/snapshots/my_collection/my_collection-2025-06-02-10-30-00.snapshot",
    "priority": "snapshot"
  }'

Note: Use "priority": "snapshot" when restoring into a new collection. Otherwise Qdrant prefers the existing (empty) collection data over the snapshot, and you end up with an empty collection.

To restore a full storage snapshot at startup, you can run:

./qdrant --storage-snapshot /snapshots/full-snapshot-2025-06-02-10-30-00.snapshot

Automate Qdrant Snapshots with Cron

To automate the backup process, create a simple backup script:

sudo nano /usr/local/bin/qdrant-backup.sh

Paste the following content into the file with your values:

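#!/bin/bash
# Qdrant snapshot backup script
set -euo pipefail  # Stop on errors, unset variables, and failed pipeline stages

QDRANT_URL="http://localhost:6333"
API_KEY="YOUR_API_KEY"
BACKUP_DIR="/var/backups/qdrant"
DATE=$(date +%Y-%m-%d_%H-%M)
COLLECTIONS=("my_collection" "my_other_collection")

mkdir -p "$BACKUP_DIR/$DATE"

for COLLECTION in "${COLLECTIONS[@]}"; do
  # Create snapshot and read its name from the JSON response
  # (-f makes curl fail on HTTP errors instead of passing an error body along)
  SNAPSHOT_NAME=$(curl -sf -X POST "$QDRANT_URL/collections/$COLLECTION/snapshots" \
    -H "api-key: $API_KEY" | \
    python3 -c "import sys, json; print(json.load(sys.stdin)['result']['name'])")

  # Download snapshot
  curl -sf "$QDRANT_URL/collections/$COLLECTION/snapshots/$SNAPSHOT_NAME" \
    -H "api-key: $API_KEY" \
    --output "$BACKUP_DIR/$DATE/${COLLECTION}.snapshot"

  # Remove the server-side copy so snapshots do not pile up in Qdrant's snapshot directory
  curl -sf -X DELETE "$QDRANT_URL/collections/$COLLECTION/snapshots/$SNAPSHOT_NAME" \
    -H "api-key: $API_KEY" > /dev/null

  echo "Backed up $COLLECTION -> $BACKUP_DIR/$DATE/${COLLECTION}.snapshot"
done

# Delete backup directories older than 7 days (never the backup root itself)
find "$BACKUP_DIR" -mindepth 1 -maxdepth 1 -type d -mtime +7 -exec rm -rf {} +

echo "Backup complete: $DATE"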

Make it executable and add to cron:

sudo chmod +x /usr/local/bin/qdrant-backup.sh

# Edit crontab
sudo crontab -e

Add this line to run the backup every day at 2 AM:

0 2 * * * /usr/local/bin/qdrant-backup.sh >> /var/log/qdrant-backup.log 2>&1

Where Qdrant Stores Snapshots

By default, Qdrant stores snapshots at /qdrant/snapshots inside Docker. You can change this in your config.yaml:

storage:
  snapshots_path: /your/custom/snapshots/path
  temp_path: /tmp

Store Qdrant Snapshots in S3 (Optional)

If you want off-site backups, which is highly recommended for production, Qdrant supports S3-compatible storage directly as of v1.10.0:

storage:
  snapshots_config:
    snapshots_storage: s3

    s3_config:
      bucket: your-bucket-name
      region: us-east-1
      access_key: your_access_key
      secret_key: your_secret_key
      # For MinIO or other S3-compatible storage:
      endpoint_url: https://your-minio-server:9000

Part 4: How to Optimize Qdrant Network Security

By default, Qdrant binds to 0.0.0.0, which means it listens on all network interfaces, including your public IP. Anyone who finds port 6333 open on your server can read or delete your data.

Follow the steps below to lock down network access:

Step 1: Bind Qdrant to Localhost Only

This is the most important security step for Qdrant. For a binary (non-Docker) install, edit your config.yaml file:

service:
  host: 127.0.0.1   # Only accept connections from localhost
  http_port: 6333
  grpc_port: 6334
  api_key: "your_strong_api_key_here"
  read_only_api_key: "your_read_only_key_here"

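With Docker, restrict access through the published port instead. Leave Qdrant listening on 0.0.0.0 inside the container, because Docker forwards published ports to the container’s network interface rather than to its loopback address, and bind the host side of the mapping to 127.0.0.1:

docker run -d \
  --name qdrant \
  -p 127.0.0.1:6333:6333 \
  -p 127.0.0.1:6334:6334 \
  -v "$(pwd)/qdrant_storage:/qdrant/storage" \
  -e QDRANT__SERVICE__API_KEY=your_strong_api_key \
  qdrant/qdrant:latest

With -p 127.0.0.1:6333:6333, the port is only accessible from within the same server, not from the internet.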

Step 2: Add an API Key

You must always set an API key, even if Qdrant is private:

service:
  api_key: "your_strong_api_key_here"
  read_only_api_key: "a_separate_read_only_key"

To use the API key in requests:

curl http://localhost:6333/collections \
  -H 'api-key: your_strong_api_key_here'
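
The placeholder keys above should be replaced with long random values. One way to generate them, sketched under the assumption that either openssl or /dev/urandom is available; generate_api_key is a hypothetical helper:

```shell
#!/usr/bin/env bash
# Print a random 64-character hexadecimal string (32 random bytes).
generate_api_key() {
  if command -v openssl > /dev/null 2>&1; then
    openssl rand -hex 32
  else
    # Fallback: read 32 bytes from the kernel and hex-encode them
    head -c 32 /dev/urandom | od -An -tx1 | tr -d ' \n'
    echo
  fi
}

# Generate two different keys: one for full access, one for read-only access.
echo "api_key: $(generate_api_key)"
echo "read_only_api_key: $(generate_api_key)"
```

Keep the two keys different so that a leaked read-only key cannot be used to modify data.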

Step 3: Block Port 6333 with UFW

Even if you’ve bound Qdrant to localhost, you can block the port at the firewall level as a second layer of security. Keep in mind that ports published by Docker bypass UFW rules, so for Docker deployments the 127.0.0.1 port binding from Step 1 remains the primary protection:

# Allow SSH first (important: do this before enabling UFW)
sudo ufw allow ssh

# Enable UFW if not already enabled
sudo ufw enable

# Block Qdrant ports from external access
sudo ufw deny 6333/tcp
sudo ufw deny 6334/tcp

# Check rules
sudo ufw status

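If you have a trusted app server on the same network and need to allow it specifically, insert the allow rule ahead of the deny rules, because UFW applies the first rule that matches. Qdrant must also listen on an interface that server can reach, not only on 127.0.0.1:

# Allow only from a specific IP (inserted at position 1 so it is checked before the deny rule)
sudo ufw insert 1 allow from 10.0.0.5 to any port 6333 proto tcp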

Step 4: Put Qdrant Behind Nginx

If your app needs to expose Qdrant queries to external clients, for example, a public-facing search API, never expose Qdrant directly. Instead, put Nginx in front of it:

Install Nginx:

sudo apt update
sudo apt install nginx -y

Create a new Nginx config:

sudo nano /etc/nginx/sites-available/qdrant

Paste this config:

server {
    listen 443 ssl;
    server_name qdrant.yourdomain.com;

    ssl_certificate /etc/letsencrypt/live/qdrant.yourdomain.com/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/qdrant.yourdomain.com/privkey.pem;

    # Only proxy the /collections API; the read-only key keeps this exposure read-only
    location /collections {
        # Validate API key
        if ($http_api_key != "your_read_only_key") {
            return 403;
        }
        proxy_pass http://127.0.0.1:6333;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    }

    # Block everything else
    location / {
        return 403;
    }
}

# Redirect HTTP to HTTPS
server {
    listen 80;
    server_name qdrant.yourdomain.com;
    return 301 https://$host$request_uri;
}

Get an SSL certificate with Certbot first, because Nginx refuses to load a config that points to certificate files that do not exist yet:

sudo apt install certbot python3-certbot-nginx -y
sudo certbot certonly --nginx -d qdrant.yourdomain.com

Then enable the config:

sudo ln -s /etc/nginx/sites-available/qdrant /etc/nginx/sites-enabled/
sudo nginx -t
sudo systemctl reload nginx

Step 5: Enable TLS Directly in Qdrant

If you prefer not to use Nginx, Qdrant supports TLS natively. You can use:

service:
  enable_tls: true
  api_key: "your_strong_key"

tls:
  cert: /path/to/cert.pem
  key: /path/to/key.pem

Or via Docker:

docker run -d \
  --name qdrant \
  -p 6333:6333 \
  -e QDRANT__SERVICE__ENABLE_TLS=true \
  -e QDRANT__SERVICE__API_KEY=your_key \
  -e QDRANT__TLS__CERT=/tls/cert.pem \
  -e QDRANT__TLS__KEY=/tls/key.pem \
  -v $(pwd)/tls:/tls \
  qdrant/qdrant:latest

Part 5: When to Keep Qdrant Fully Private

Not every deployment needs Qdrant exposed to the public internet. In most cases, the most secure option is to keep it completely private and let your app layer handle all external requests.

Keep Qdrant private when:

  • Your app server is on the same machine as Qdrant.
  • You’re using a private network or VPN between services.
  • You’re running a RAG pipeline where only your backend queries Qdrant.
  • You don’t want to manage public-facing certificates and API exposure.

Architecture for private setup:

Internet → Your App (public port 443) → Qdrant (127.0.0.1:6333, private)

Your app handles auth, rate limiting, and business logic, then queries Qdrant internally over localhost. This is the safest and most common production pattern.

Only expose Qdrant publicly when:

  • You need multiple services on different servers to query Qdrant directly.
  • You’re building a shared search infrastructure for a team.
  • Your app is on a different server with no private network option.

In those cases, use the Nginx reverse proxy setup from Part 4, always with HTTPS and API key validation.

Part 6: Complete Qdrant Config YAML File

Here is a full production-ready configuration file that brings everything together. This is what an optimized Qdrant setup looks like in a real config.yaml:

# /path/to/qdrant/config/production.yaml

log_level: INFO

storage:
  # Path where Qdrant stores data
  storage_path: /qdrant/storage

  # Path for snapshots
  snapshots_path: /qdrant/snapshots

  # Temp path for snapshot creation
  temp_path: /tmp

  # Optional S3 snapshot storage
  # snapshots_config:
  #   snapshots_storage: s3
  #   s3_config:
  #     bucket: my-qdrant-backups
  #     region: us-east-1
  #     access_key: KEY
  #     secret_key: SECRET

service:
  # Bind to localhost only for a binary install. In Docker, set this to 0.0.0.0 and
  # restrict access with the 127.0.0.1 port mapping in the Compose file instead.
  host: 127.0.0.1
  http_port: 6333
  grpc_port: 6334

  # Always set an API key in production
  api_key: "replace_with_a_strong_random_key"
  read_only_api_key: "replace_with_a_separate_read_only_key"

  # Enable CORS only if your frontend queries Qdrant directly (not recommended)
  enable_cors: false

  # Enable TLS if you're not using a reverse proxy
  # enable_tls: true

# Optional TLS config
# tls:
#   cert: /tls/cert.pem
#   key: /tls/key.pem

Load this config in Docker Compose:

version: '3.8'
services:
  qdrant:
    image: qdrant/qdrant:latest
    restart: unless-stopped
    ports:
      - "127.0.0.1:6333:6333"
      - "127.0.0.1:6334:6334"
    volumes:
      - qdrant_storage:/qdrant/storage
      - qdrant_snapshots:/qdrant/snapshots
      - ./config/production.yaml:/qdrant/config/production.yaml
    environment:
      - QDRANT__CONFIG_PATH=/qdrant/config/production.yaml
    deploy:
      resources:
        limits:
          memory: 8G
    ulimits:
      nofile:
        soft: 10000
        hard: 10000

volumes:
  qdrant_storage:
  qdrant_snapshots:

The ulimits setting raises the open-file limit, which prevents the "too many open files" error that can occur with large collections.

Part 7: Verify Your Qdrant Optimization Setup

Run these checks to confirm everything is working correctly.

Check Qdrant is running:

curl http://localhost:6333/healthz

Expected output:

healthz check passed

Test API key works:

# This should be rejected (no key)
curl http://localhost:6333/collections

# This should work (with key)
curl http://localhost:6333/collections \
  -H 'api-key: your_strong_api_key_here'
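
The two requests can be turned into a pass/fail check. A sketch with assumptions: http_status, auth_is_enforced, and check_qdrant_auth are hypothetical helpers, and a request without a key is expected to return 401 or 403 while a request with a valid key returns 200.

```shell
#!/usr/bin/env bash
# Print only the HTTP status code of a request (extra curl arguments are passed through).
http_status() {
  curl -s -o /dev/null -w '%{http_code}' "$@"
}

# Succeed when the keyless request was refused and the keyed request worked.
auth_is_enforced() {
  local without_key=$1 with_key=$2
  case "$without_key" in
    401|403) ;;
    *) return 1 ;;
  esac
  [ "$with_key" = "200" ]
}

# Not run automatically; call this on the server where Qdrant listens.
check_qdrant_auth() {
  local url="http://localhost:6333/collections" key=$1
  auth_is_enforced "$(http_status "$url")" "$(http_status "$url" -H "api-key: $key")"
}
```

On the server, check_qdrant_auth your_strong_api_key_here && echo "API key is enforced" reports success only when both conditions hold.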

Check Port is Not Exposed Externally: From another machine or using an online port scanner, confirm port 6333 is closed:

nc -zv YOUR_SERVER_IP 6333

Check Snapshot Location:

# Inside Docker
docker exec qdrant ls /qdrant/snapshots/

# Or check your mounted volume
ls /path/to/qdrant_snapshots/

Check Collection Info:

curl http://localhost:6333/collections/my_collection \
  -H 'api-key: your_strong_api_key_here'

Look at the config section in the response; it shows the on_disk and quantization settings you applied.
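
To avoid reading the full JSON by eye, the relevant fields can be pulled out with a small filter. A sketch, assuming python3 is installed (the backup script already relies on it), a single unnamed vector configuration, and that the settings sit under result.config in the response; show_storage_settings is a hypothetical helper:

```shell
#!/usr/bin/env bash
# Read a collection-info JSON document on stdin and print the storage settings.
# Assumes the fields sit under result.config; missing keys print "None".
show_storage_settings() {
  python3 -c '
import json, sys
config = json.load(sys.stdin).get("result", {}).get("config", {})
vectors = config.get("params", {}).get("vectors", {})
print("vectors.on_disk:", vectors.get("on_disk"))
print("hnsw.on_disk:", config.get("hnsw_config", {}).get("on_disk"))
print("quantization:", json.dumps(config.get("quantization_config")))
'
}
```

Pipe the output of the curl command above into show_storage_settings to get a three-line summary.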

When to Upgrade Your Qdrant Server

If you’re noticing the following things, a small VPS is no longer the right option:

  • Slow query times even with quantization enabled.
  • Frequent memory pressure, OOM kills, and swap usage.
  • Growing collections with millions of vectors.
  • High concurrent query load.

At this stage, moving to dedicated server hosting gives you predictable performance, dedicated CPU and RAM, and much faster NVMe storage for on-disk collections.

Moving from a shared VPS to a dedicated server is one of the best performance improvements you can make for a production Qdrant deployment, especially when your collections start growing past a few million vectors.

For AI-specific workloads involving embedding generation, reranking, or LLM inference alongside Qdrant, you should also look into purpose-built AI hosting infrastructure that gives you GPU access alongside your vector database.

Conclusion

At this point, you have learned how to optimize Qdrant for your specific RAM and disk situation, build a reliable snapshot and backup habit before you need it, and make sure Qdrant is never accidentally left open to the internet.

We hope you enjoy this guide.

FAQs

Does Qdrant have a built-in automatic snapshot scheduler?

No, Qdrant does not have a built-in scheduler for snapshots on self-hosted setups. You need to use the API with a system cron job, as shown in this guide. Qdrant Cloud has a built-in backup scheduler.

Is 0.0.0.0 dangerous for Qdrant?

Yes, if you expose port 6333 publicly without an API key, anyone can access your data. Always bind to 127.0.0.1 or use a firewall rule to block external access.

What happens if Qdrant crashes without a snapshot?

Qdrant persists changes to disk through a write-ahead log, so your data usually survives a clean shutdown and most process crashes. But if there’s a hardware failure or disk corruption and no snapshot, you can lose data. Always take regular snapshots.
