Vector Databases

Why You Need a Vector Database

Once you have embedding vectors, you need somewhere to store and search them.

The naive approach: keep all vectors in memory, compute cosine similarity against every vector for each query, return the Top-K most similar. This works with a few hundred vectors, but at millions of documents, brute-force search is too slow.
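To make the naive approach concrete, here is a minimal pure-Python sketch of brute-force Top-K search. The function names and the toy 3-dimensional vectors are made up for illustration; real embeddings have hundreds or thousands of dimensions.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def brute_force_top_k(query, vectors, k):
    """Score every stored vector against the query and return the k best (id, score) pairs."""
    scored = [(doc_id, cosine_similarity(query, vec)) for doc_id, vec in vectors.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]

# Hypothetical toy data: ids mapped to 3-dimensional vectors.
vectors = {
    "refund": [0.9, 0.1, 0.0],
    "hours": [0.1, 0.9, 0.1],
    "warranty": [0.7, 0.2, 0.6],
}
top = brute_force_top_k([1.0, 0.0, 0.0], vectors, k=2)
print(top)
```

Every query touches every stored vector, so the cost grows linearly with the collection size. That is the part an index is meant to avoid.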

Vector databases are purpose-built for this — efficient storage and search of high-dimensional vectors.

Core Concepts

Approximate Nearest Neighbor (ANN)

At scale, vector databases usually skip exact search because scanning every vector is too slow. Instead they use Approximate Nearest Neighbor (ANN) search: the results may not include the absolute closest vector, but they are very close and come back far faster.

For RAG, this is perfectly fine — you don't need the exact #1 match, you need "roughly the most relevant top few."
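One common way to quantify "roughly the most relevant" is recall@k: the share of the exact Top-K that the approximate search also returned. The sketch below assumes you already have both result lists; the document ids are hypothetical.

```python
def recall_at_k(exact_ids, approx_ids):
    """Fraction of the exact top-k results that the approximate search also returned."""
    exact = set(exact_ids)
    if not exact:
        return 0.0
    return len(exact & set(approx_ids)) / len(exact)

exact_top5 = ["d1", "d2", "d3", "d4", "d5"]   # hypothetical ground truth from a full scan
approx_top5 = ["d1", "d3", "d2", "d7", "d5"]  # hypothetical ANN result for the same query
recall = recall_at_k(exact_top5, approx_top5)
print(recall)
```

Here the ANN result misses one of the five true neighbors, so recall@5 is 0.8. Order within the Top-K is ignored by this metric.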

Index Types

Vector databases use indexes to speed up search:

HNSW (Hierarchical Navigable Small World)

Most commonly used index type
Fast search, high accuracy
Higher memory usage
Suitable for most scenarios

IVF (Inverted File Index)

Clusters vectors first, searches only relevant clusters
Better memory efficiency
Typically somewhat slower or less accurate than HNSW at comparable settings
Suitable for very large-scale data
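The IVF idea above (cluster first, then search only the relevant clusters) can be sketched in pure Python. This is a simplified illustration, not how any particular database implements it: the centroids are fixed by hand rather than learned with k-means, Euclidean distance is used for brevity, and `nprobe` is the name assumed here for the number of clusters scanned per query.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def build_ivf(vectors, centroids):
    """Assign each vector id to the inverted list of its nearest centroid."""
    lists = {i: [] for i in range(len(centroids))}
    for doc_id, vec in vectors.items():
        nearest = min(range(len(centroids)),
                      key=lambda i: squared_distance(vec, centroids[i]))
        lists[nearest].append(doc_id)
    return lists

def ivf_search(query, vectors, centroids, lists, k, nprobe):
    """Scan only the nprobe clusters whose centroids are closest to the query."""
    order = sorted(range(len(centroids)),
                   key=lambda i: squared_distance(query, centroids[i]))
    candidates = [doc_id for i in order[:nprobe] for doc_id in lists[i]]
    candidates.sort(key=lambda doc_id: squared_distance(query, vectors[doc_id]))
    return candidates[:k]

# Hypothetical toy data: two hand-picked centroids and four 2-dimensional vectors.
centroids = [[0.0, 0.0], [10.0, 10.0]]
vectors = {"a": [0.5, 0.2], "b": [1.0, 1.5], "c": [9.0, 9.5], "d": [10.5, 10.0]}
lists = build_ivf(vectors, centroids)
result = ivf_search([0.8, 1.0], vectors, centroids, lists, k=2, nprobe=1)
print(lists, result)
```

With `nprobe=1` only the cluster around the first centroid is scanned, so "c" and "d" are never compared against the query. Raising `nprobe` trades speed for recall.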

Flat (Brute Force)

Exact search, scans all vectors
Most accurate but slowest
Only for small datasets or accuracy benchmarks

Popular Vector Databases

Chroma

Best choice for getting started quickly.

pip install chromadb

import chromadb

# In-memory client; data is lost when the process exits
client = chromadb.Client()
collection = client.create_collection(
    name="knowledge_base",
    metadata={"hnsw:space": "cosine"}  # use cosine distance for the HNSW index
)

# Add documents
collection.add(
    ids=["doc1", "doc2", "doc3"],
    documents=[
        "Refund policy: Full refund available within 30 days of purchase.",
        "Support hours: Monday to Friday, 9 AM to 6 PM.",
        "Product warranty is one year from date of purchase.",
    ],
    metadatas=[
        {"source": "policy.md"},
        {"source": "faq.md"},
        {"source": "warranty.md"},
    ]
)

# Search
results = collection.query(
    query_texts=["How do I get a refund?"],
    n_results=2
)

print(results["documents"])
# [['Refund policy: Full refund available within 30 days...', ...]]

Chroma advantages:

Built-in embedding (defaults to all-MiniLM-L6-v2)
Minimal API, runs in a few lines
Supports local storage and in-memory mode
Great for prototypes and small projects

pgvector

Vector search extension for PostgreSQL. The natural choice if you're already using PostgreSQL.

-- Enable the extension (once per database)
CREATE EXTENSION IF NOT EXISTS vector;

CREATE TABLE documents (
    id SERIAL PRIMARY KEY,
    content TEXT,
    embedding vector(1536)  -- must match your embedding model's dimension
);

INSERT INTO documents (content, embedding)
VALUES ('Refund policy...', '[0.1, 0.2, ...]');

-- <=> is cosine distance: smaller values mean more similar
SELECT content, embedding <=> '[0.15, 0.22, ...]' AS distance
FROM documents
ORDER BY embedding <=> '[0.15, 0.22, ...]'
LIMIT 5;
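On the application side, the embedding has to be turned into the bracketed text form used in the SQL above, and it helps to know what the distance column means. The sketch below is a hedged illustration in plain Python: `to_vector_literal` and `cosine_distance` are hypothetical helper names, and the second function only mirrors the usual definition of cosine distance (1 minus cosine similarity) so you can sanity-check values locally.

```python
import math

def to_vector_literal(values):
    """Format a list of numbers as a bracketed vector literal, e.g. '[0.1,0.2,0.3]'."""
    return "[" + ",".join(str(float(v)) for v in values) + "]"

def cosine_distance(a, b):
    """1 - cosine similarity: 0 means same direction, 1 orthogonal, 2 opposite."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm

literal = to_vector_literal([0.1, 0.2, 0.3])
orthogonal = cosine_distance([1.0, 0.0], [0.0, 1.0])
print(literal, orthogonal)
```

In practice you would pass the literal as a bound query parameter through your database driver rather than pasting it into the SQL string.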

Advantages:

Integrates with existing PostgreSQL infrastructure
SQL queries — combine vector search with regular queries
Supports metadata filtering
Low operational overhead

Other Options

Database	Type	Notes
Pinecone	Cloud-hosted	Fully managed, zero ops, pay-per-use
Weaviate	Open-source/Cloud	Feature-rich, hybrid search support
Qdrant	Open-source/Cloud	Rust-based, excellent performance
Milvus	Open-source	Large-scale, distributed architecture
FAISS	Library (not DB)	By Meta, pure vector search library

How to Choose

Need	Recommendation
Quick prototype	Chroma
Already using PostgreSQL	pgvector
Don't want to self-host	Pinecone
Need high performance	Qdrant or Milvus
Need hybrid search	Weaviate
Just need a search library	FAISS

For most projects, start with Chroma — it's simple enough to let you focus on RAG core logic. Migrate to pgvector or Qdrant when you need production-grade deployment.

Metadata Filtering

Vector search finds "most semantically similar," but sometimes you need additional conditions:

# Search only within a specific source
results = collection.query(
    query_texts=["How do I get a refund?"],
    n_results=3,
    where={"source": "policy.md"}
)

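# Combined filters
# Range operators such as $gte compare numbers, not strings, so this assumes
# each document was added with a numeric "updated_at" field (a hypothetical
# YYYYMMDD integer, not part of the earlier example data).
results = collection.query(
    query_texts=["How do I get a refund?"],
    n_results=3,
    where={
        "$and": [
            {"source": "policy.md"},
            {"updated_at": {"$gte": 20240101}}
        ]
    }
)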

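Conceptually, metadata filtering is "filter first, search second": narrow the candidate set, then run vector search within it. Some engines apply the filter during or after the index lookup instead, but the effect is the same kind of scoped query. This is essential for multi-tenant, multi-category knowledge bases.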
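The filter-then-search flow can be illustrated without any database. This is a minimal sketch under stated assumptions: records are plain dicts, the filter supports only exact-match conditions, and `filtered_search` is a hypothetical name rather than an API of any library mentioned here.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def filtered_search(query, records, where, k):
    """Keep records whose metadata matches every key in `where`, then rank by similarity."""
    candidates = [
        r for r in records
        if all(r["metadata"].get(key) == value for key, value in where.items())
    ]
    candidates.sort(key=lambda r: cosine_similarity(query, r["embedding"]), reverse=True)
    return [r["id"] for r in candidates[:k]]

# Hypothetical toy records with 2-dimensional embeddings.
records = [
    {"id": "doc1", "embedding": [0.9, 0.1], "metadata": {"source": "policy.md"}},
    {"id": "doc2", "embedding": [0.95, 0.05], "metadata": {"source": "faq.md"}},
    {"id": "doc3", "embedding": [0.2, 0.8], "metadata": {"source": "policy.md"}},
]
hits = filtered_search([1.0, 0.0], records, where={"source": "policy.md"}, k=2)
print(hits)
```

Note that "doc2" is the closest vector to the query overall, yet it never appears in the results because the metadata filter removes it before ranking.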

Key Takeaways

Vector databases are purpose-built for storing and searching high-dimensional vectors. They use ANN algorithms for efficient similarity search.
HNSW is the most common index type — fast, accurate, suitable for most scenarios.
Start prototyping with Chroma; for production, consider pgvector (if you already use PostgreSQL) or Qdrant.
Metadata filtering + vector search is the standard combo. Filter to narrow the scope, then run semantic search.