How to Install pgvector on PostgreSQL for a Self-Hosted RAG Stack
If you are building a RAG app and you already run PostgreSQL, you do not need to set up a new database to store embeddings. You can install pgvector on PostgreSQL and get vector search working in minutes, right inside the database you already know.
This guide shows you how to install the extension, create vector-enabled tables, insert embeddings, run similarity searches, and add indexes for performance.
Table of Contents
What is pgvector and Why Use It?
pgvector is an open-source PostgreSQL extension that adds a native vector data type and similarity search operators to your database. Instead of sending your embeddings to a separate vector database service, you store them as a column right next to your other data, documents, users, products, whatever you have.
For most RAG workloads at small-to-medium scale, this is the simplest and most cost-effective method. You get ACID transactions, joins, your existing backup and monitoring stack, and no extra system to operate.
The pgvector project supports exact and nearest neighbor search, L2 distance, cosine distance, inner product, and more.
What You Need for pgvector Setup
Before you start, make sure you have:
- A Linux VPS running Ubuntu 22.04 or newer. This guide uses Ubuntu; you can easily buy a reliable Linux VPS from PerLod if you need one.
- PostgreSQL 14, 15, 16, or 17 is already installed.
- sudo access on the server.
- Basic familiarity with psql.
To check your PostgreSQL version, you can run:
psql --version
Method 1: Install pgvector on PostgreSQL via APT Repository
The easiest way to install pgvector on PostgreSQL on Ubuntu is through the official PGDG apt repository. The package name follows the pattern:
postgresql-{VERSION}-pgvector
To install pgvector on PostgreSQL, you can use:
sudo apt update
sudo apt install postgresql-17-pgvector -y #PostgreSQL 17
sudo apt install postgresql-16-pgvector -y #PostgreSQL 16
sudo apt install postgresql-15-pgvector -y #PostgreSQL 15
sudo apt install postgresql-14-pgvector -y #PostgreSQL 14
If you are not sure which PGDG repo is set up on your server, run the setup script first:
sudo apt install postgresql-common -y
sudo /usr/share/postgresql-common/pgdg/apt.postgresql.org.sh
Then install the pgvector package for your version.
Enable the Vector Extension in Your Database
Once the package is installed, you need to enable the extension inside each database where you want to use it.
First, connect as the postgres superuser:
sudo -u postgres psql
Then connect to your target database, replace mydb with your actual database name:
\c mydb
Now enable the extension:
CREATE EXTENSION vector;
Verify it is active:
\dx
You should see the vector in the list. You can also check the version:
SELECT extversion FROM pg_extension WHERE extname = 'vector';
The Most Common Mistake in pgvector: Wrong Extension Name
This is the most common issue everyone makes for the first time. The project is called pgvector, but the SQL extension name is vector, not pgvector.
If you run this:
-- ❌ WRONG: this will fail
CREATE EXTENSION pgvector;
You will get an error like:
ERROR: could not open extension control file ".../pgvector.control": No such file or directory
The correct command is always like this:
-- ✅ CORRECT
CREATE EXTENSION vector;
The same rule applies when using CREATE EXTENSION IF NOT EXISTS:
CREATE EXTENSION IF NOT EXISTS vector;
Method 2: Install pgvector on PostgreSQL via Docker
If you prefer using Docker, the official pgvector Docker image is the quickest way. It includes PostgreSQL with pgvector already compiled and ready:
docker run -d \
--name rag-postgres \
-e POSTGRES_USER=raguser \
-e POSTGRES_PASSWORD=securepassword \
-e POSTGRES_DB=ragdb \
-p 5432:5432 \
pgvector/pgvector:pg16
Connect to it with the command below:
docker exec -it rag-postgres psql -U raguser -d ragdb
Then enable the extension as shown above:
CREATE EXTENSION vector;
Create a Vector-Enabled Table
Now that the extension is active, you can create a table with a vector column. The number in parentheses is the dimension count of your embeddings; this must match the output size of your embedding model.
For example, text-embedding-3-small from OpenAI outputs 1536 dimensions. all-MiniLM-L6-v2 outputs 384 dimensions.
CREATE TABLE documents (
id BIGSERIAL PRIMARY KEY,
content TEXT NOT NULL,
source TEXT,
embedding vector(1536)
);
If you already have a table and want to add a vector column, you can use:
ALTER TABLE documents ADD COLUMN embedding vector(1536);
Dimension limit note: For HNSW and IVFFlat indexes, the maximum supported dimensions are 2000 for vector (full precision) and 4000 for halfvec (half precision). Most popular embedding models used for RAG, like text-embedding-3-small at 1536 dimensions, fit within this limit without any issue.
Insert Embeddings into the Database
You can insert rows with embeddings using standard SQL. Vectors are written as a string of comma-separated floats inside square brackets:
INSERT INTO documents (content, source, embedding)
VALUES
('PostgreSQL is a powerful relational database.', 'docs', '[0.12, 0.45, 0.78, ...]'),
('pgvector adds vector search to Postgres.', 'blog', '[0.33, 0.21, 0.66, ...]');
In a real RAG pipeline, you must generate these embeddings from a model, like OpenAI or a local sentence-transformers model, and pass them in from your application code. Here is a simple Python example using psycopg2:
import psycopg2
import openai
conn = psycopg2.connect(
host="localhost",
port=5432,
user="raguser",
password="securepassword",
dbname="ragdb"
)
cur = conn.cursor()
# Generate an embedding with OpenAI
response = openai.embeddings.create(
model="text-embedding-3-small",
input="PostgreSQL is a powerful relational database."
)
embedding = response.data[0].embedding # list of 1536 floats
# Insert into the table
cur.execute(
"INSERT INTO documents (content, source, embedding) VALUES (%s, %s, %s)",
("PostgreSQL is a powerful relational database.", "docs", embedding)
)
conn.commit()
cur.close()
conn.close()
Run a Similarity Search
pgvector adds three distance operators you can use directly in SQL, including:
| Operator | Metric | Use Case |
|---|---|---|
| <-> | L2 / Euclidean distance | General nearest-neighbor |
| <=> | Cosine distance | Semantic similarity, NLP |
| <#> | Negative inner product | Dot-product similarity |
Cosine similarity search, which is best for RAG:
SELECT id, content, source,
1 - (embedding <=> '[0.12, 0.45, 0.78, ...]') AS similarity
FROM documents
ORDER BY embedding <=> '[0.12, 0.45, 0.78, ...]'
LIMIT 5;
L2 distance search:
SELECT id, content
FROM documents
ORDER BY embedding <-> '[0.12, 0.45, 0.78, ...]'
LIMIT 5;
Filter by metadata and then rank by similarity:
SELECT id, content
FROM documents
WHERE source = 'blog'
ORDER BY embedding <=> '[0.12, 0.45, 0.78, ...]'
LIMIT 5;
This kind of combined metadata with vector query is one of the biggest advantages of keeping everything in PostgreSQL; you do it in a single SQL query.
Index Your Vectors for Faster Queries
Without an index, every search scans the entire table. This is fine for development or small datasets, but for production, you need an index.
pgvector supports two index types:
HNSW (Hierarchical Navigable Small World): Recommended
HNSW is generally preferred for most production use cases. It builds a graph structure that allows fast and accurate nearest-neighbor searches. Build time is slower than IVFFlat, but query performance is better.
-- For cosine distance (most common for RAG)
CREATE INDEX ON documents
USING hnsw (embedding vector_cosine_ops);
-- For L2 distance
CREATE INDEX ON documents
USING hnsw (embedding vector_l2_ops);
-- For inner product
CREATE INDEX ON documents
USING hnsw (embedding vector_ip_ops);
Also, you can tune HNSW parameters:
CREATE INDEX ON documents
USING hnsw (embedding vector_cosine_ops)
WITH (m = 16, ef_construction = 64);
- m: number of connections per layer. The default is 16, and the higher number has better recall but with more memory.
- ef_construction: search width during index build. The default is 64, and the higher value has better recall but with slower build.
Set ef_search at Query Time in HNSW: To improve recall during queries, you can set this session variable:
SET hnsw.ef_search = 100; -- default is 40
IVFFlat: Faster to Build, Good for Large Datasets
IVFFlat divides data into clusters and searches within them. Build time is faster than HNSW, and it has a smaller memory footprint.
-- Build the index AFTER inserting data (not before)
CREATE INDEX ON documents
USING ivfflat (embedding vector_cosine_ops)
WITH (lists = 100);
The lists value controls the number of clusters. A common rule is to use sqrt(number of rows) as a starting point.
Important Note: For IVFFlat, always insert your data first and then build the index. Building on an empty table produces a poor-quality index.
Build a Basic Retrieval Pipeline with pgvector
Here is how everything fits together in a real RAG pipeline. The idea is simple:
- A user asks a question.
- You embed that question using your embedding model.
- You query the database for the most similar document chunks.
- You pass those chunks as context to your LLM.
import psycopg2
import openai
def get_relevant_context(question: str, top_k: int = 5) -> list[str]:
conn = psycopg2.connect(
host="localhost", port=5432,
user="raguser", password="securepassword", dbname="ragdb"
)
cur = conn.cursor()
# Embed the user's question
response = openai.embeddings.create(
model="text-embedding-3-small",
input=question
)
query_embedding = response.data[0].embedding
# Run similarity search
cur.execute("""
SELECT content
FROM documents
ORDER BY embedding <=> %s::vector
LIMIT %s
""", (query_embedding, top_k))
rows = cur.fetchall()
cur.close()
conn.close()
return [row[0] for row in rows]
# Use it
context_chunks = get_relevant_context("How does pgvector work?")
context_text = "\n\n".join(context_chunks)
# Send to LLM
answer = openai.chat.completions.create(
model="gpt-4o-mini",
messages=[
{"role": "system", "content": "Answer based on the context below.\n\n" + context_text},
{"role": "user", "content": "How does pgvector work?"}
]
)
print(answer.choices[0].message.content)
When Is pgvector Enough and When It’s Not
For most teams, pgvector at first is enough. It works well when:
- Your dataset is in the single-digit millions of vectors.
- You want to store vectors alongside your relational data, such as documents, users, and metadata, in the same tables.
- You already run PostgreSQL and do not want to operate another system.
- You care about transactional consistency between your data and embeddings.
- You need SQL joins and filters mixed with vector similarity in one query.
You should think about a dedicated vector DB when:
- You are at hundreds of millions or billions of vectors with strict latency requirements.
- You need a very high query-per-second throughput across a distributed cluster.
- You need specialized features like multi-tenant isolation, hybrid BM25+vector ranking, or advanced hardware acceleration.
If you are unsure which vector database method fits your workload, check out this guide on the Best Vector Database for RAG.
Verify Your pgvector Setup
Run these quick checks to confirm everything is working:
-- Check extension is installed
SELECT extname, extversion FROM pg_extension WHERE extname = 'vector';
-- Check your table structure
\d documents
-- Count rows
SELECT COUNT(*) FROM documents;
-- Run a test similarity query
SELECT id, content
FROM documents
ORDER BY embedding <=> '[0.1, 0.2, 0.3, ...]'
LIMIT 3;
Conclusion
Setting up pgvector on PostgreSQL is one of the quickest ways when building a self-hosted RAG stack. You skip the overhead of a new database system and keep everything in one place. Once you install pgvector on PostgreSQL, you can create a vector column, insert embeddings from your model, and run similarity searches directly in SQL.
Just remember that the extension name in SQL is vector, not pgvector.
For the actual server to host your PostgreSQL and RAG stack, PerLod’s AI-ready server hosting plans are designed for exactly this kind of workload.
FAQs
What is the correct command to enable pgvector in PostgreSQL?
CREATE EXTENSION vector; This is the correct command. The project is called pgvector, but the extension name is vector.
What PostgreSQL versions does pgvector support?
pgvector supports PostgreSQL 12 and above.
Do I need an index to use pgvector?
No, but without an index, every query does a full table scan. For more than a few thousand rows, add an HNSW index.