Storage Layer

Vector Databases

Pinecone, Qdrant, Weaviate, pgvector, and Chroma — choosing and operating a vector store for production RAG

The three operations: store → index → query
The production requirement: filtered ANN search
The scale range: millions to billions
Contents
  1. What is a vector DB
  2. pgvector: Postgres-native
  3. Qdrant: filterable ANN
  4. Weaviate: schema + search
  5. Pinecone: managed
  6. Chroma: developer-first
  7. Choosing and operating
01 — Definition

What Is a Vector Database?

A vector database stores high-dimensional embedding vectors alongside metadata, and provides approximate nearest neighbor (ANN) search — find the top-k vectors most similar to a query vector.

Compared to traditional DBs: relational databases filter and sort on exact values; a vector database ranks items by geometric distance between embeddings, surfacing "semantically similar" results.

Core Operations

Insert: store vector + metadata + optional text.
Query: find top-k nearest by cosine/dot/L2.
Filter: restrict search by metadata.
Delete/Update: remove or modify stored vectors.
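A minimal sketch of these operations using exact (brute-force) cosine search; a real vector DB replaces the linear scan with an ANN index, but the insert/query/filter semantics are the same. The store layout and helper name here are illustrative, not any product's API:

```python
import numpy as np

def cosine_top_k(store, query_vec, k=3, metadata_filter=None):
    """Exact top-k cosine search over an in-memory store.
    `store` is a list of (id, vector, metadata) tuples."""
    # Filter step: restrict candidates by metadata before scoring
    candidates = [
        (doc_id, vec) for doc_id, vec, meta in store
        if metadata_filter is None
        or all(meta.get(key) == val for key, val in metadata_filter.items())
    ]
    # Query step: score every remaining vector and keep the top k
    scored = []
    for doc_id, vec in candidates:
        sim = float(np.dot(query_vec, vec) /
                    (np.linalg.norm(query_vec) * np.linalg.norm(vec)))
        scored.append((doc_id, sim))
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]

# Insert step: four vectors with metadata
store = [
    ("a", np.array([1.0, 0.0]), {"category": "finance"}),
    ("b", np.array([0.9, 0.1]), {"category": "finance"}),
    ("c", np.array([0.0, 1.0]), {"category": "tech"}),
    ("d", np.array([0.7, 0.7]), {"category": "tech"}),
]
hits = cosine_top_k(store, np.array([1.0, 0.0]), k=2,
                    metadata_filter={"category": "finance"})
```

The linear scan is O(n) per query; ANN indices trade a little recall for sublinear search time.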

⚠️ A vector database is NOT just a vector search library. Production vector DBs add: persistence, horizontal scaling, real-time insert/delete, filtered search, multi-tenancy, backups, and access control.
02 — Postgres-native

pgvector: Postgres-Native

pgvector: open-source Postgres extension. Adds a vector column type and HNSW/IVFFlat indices.

Why use it: you already have Postgres, SQL joins across vector and relational data, ACID transactions, existing auth/backup infrastructure.

Supports: cosine, L2, inner product similarity. HNSW index (v0.5+) for fast ANN. IVFFlat for memory efficiency.

Example: pgvector Setup and Query

-- Enable extension
CREATE EXTENSION IF NOT EXISTS vector;

-- Create table with vector column
CREATE TABLE documents (
    id BIGSERIAL PRIMARY KEY,
    content TEXT,
    embedding vector(1536),  -- OpenAI text-embedding-3-small
    metadata JSONB,
    created_at TIMESTAMPTZ DEFAULT NOW()
);

-- Create HNSW index for fast ANN
CREATE INDEX ON documents USING hnsw (embedding vector_cosine_ops)
WITH (m = 32, ef_construction = 200);

-- Semantic search with metadata filter
SELECT content, metadata,
       1 - (embedding <=> $1::vector) AS similarity
FROM documents
WHERE metadata->>'category' = 'finance'  -- filter first
  AND created_at > NOW() - INTERVAL '30 days'
ORDER BY embedding <=> $1::vector
LIMIT 10;
For most teams already on Postgres, pgvector is the right default up to ~5M vectors. The query planner, connection pooling (PgBouncer), and monitoring you already have apply directly.
03 — Filterable ANN

Qdrant: Filterable ANN

Qdrant: Rust-based vector DB, designed from scratch for filtered ANN search. Open-source + managed cloud.

Key differentiator: payload filtering is first-class. Rather than post-filtering a result list (which kills recall), Qdrant's filterable HNSW adds extra links to the graph and evaluates the filter during traversal, switching to a payload-index-driven scan when the filter is highly selective.

Supports: dense vectors, sparse vectors (for BM25-style lexical matching), and multi-vectors (ColBERT-style late interaction). Named vectors allow multiple embeddings per document.

Example: Qdrant Python Client

from qdrant_client import QdrantClient
from qdrant_client.models import (
    Distance, VectorParams, PointStruct,
    Filter, FieldCondition, MatchValue,
)

client = QdrantClient(url="http://localhost:6333")

# Create collection
client.create_collection(
    "documents",
    vectors_config=VectorParams(size=1536, distance=Distance.COSINE),
)

# Insert with payload
client.upsert("documents", points=[
    PointStruct(id=1, vector=embedding,
                payload={"category": "finance", "date": "2024-01-15"}),
])

# Filtered ANN search
results = client.search(
    "documents",
    query_vector=query_embedding,
    query_filter=Filter(must=[
        FieldCondition(key="category", match=MatchValue(value="finance")),
    ]),
    limit=10,
)

Qdrant vs pgvector for Filtered Search

Scenario | pgvector | Qdrant
Low cardinality filter (2 categories) | Good | Excellent
High cardinality filter (1000 users) | Degrades | Maintains recall
Complex nested filters | SQL | Payload filter JSON
Pre-filter selectivity <1% | ANN recall drops | Optimized
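Why post-filtering degrades results can be shown with a small synthetic simulation: when only 1% of points pass the filter, filtering a top-k list after the search returns far fewer than k hits, while filtering before the search keeps all k. The sizes and seed below are arbitrary:

```python
import numpy as np

rng = np.random.default_rng(0)
n, dim, k = 10_000, 32, 10
vectors = rng.normal(size=(n, dim))
vectors /= np.linalg.norm(vectors, axis=1, keepdims=True)
# Only ~1% of points match the metadata filter (highly selective)
matches_filter = rng.random(n) < 0.01

query = vectors[0]
scores = vectors @ query  # cosine similarity (unit vectors)

# Post-filtering: take top-k by similarity, THEN drop non-matching points
post_ids = np.argsort(-scores)[:k]
post_hits = [i for i in post_ids if matches_filter[i]]

# Pre-filtering: restrict to matching points, THEN take top-k
pre_candidates = np.flatnonzero(matches_filter)
pre_hits = pre_candidates[np.argsort(-scores[pre_candidates])[:k]]

print(len(post_hits), len(pre_hits))  # post-filtering returns far fewer than k
```

A real ANN index has the same failure mode, which is why filter-aware graph traversal (Qdrant) or SQL pre-filtering (pgvector, when the planner cooperates) matters at high filter selectivity.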
04 — Schema + Search

Weaviate: Schema + Search

Weaviate: open-source vector DB with GraphQL API, built-in text/image vectorizers, and multi-modal support.

Schema-first: define classes with properties and vectorizer config. Weaviate handles embedding generation if you configure a module.

Hybrid search: built-in BM25 + vector fusion with a single API call. Alpha parameter controls dense vs sparse weight.
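The alpha-weighted fusion can be sketched as follows. This mirrors the idea of Weaviate's relative score fusion (min-max normalize each result list, then combine with alpha), not its exact implementation; the function name and inputs are illustrative:

```python
def hybrid_fuse(bm25_scores, vector_scores, alpha=0.7):
    """Alpha-weighted fusion of min-max-normalized score dicts.
    alpha = 0 -> pure BM25, alpha = 1 -> pure vector."""
    def normalize(scores):
        lo, hi = min(scores.values()), max(scores.values())
        span = (hi - lo) or 1.0  # avoid divide-by-zero on a single score
        return {doc: (s - lo) / span for doc, s in scores.items()}

    bm25 = normalize(bm25_scores)
    vec = normalize(vector_scores)
    docs = set(bm25) | set(vec)  # a doc may appear in only one list
    return sorted(
        ((doc, alpha * vec.get(doc, 0.0) + (1 - alpha) * bm25.get(doc, 0.0))
         for doc in docs),
        key=lambda pair: pair[1], reverse=True,
    )

ranked = hybrid_fuse(
    bm25_scores={"d1": 12.0, "d2": 3.0},
    vector_scores={"d2": 0.9, "d3": 0.8},
    alpha=0.7,
)
```

With alpha = 0.7, "d2" wins because it appears in both lists; normalization keeps the raw BM25 scale (unbounded) from dominating the bounded cosine scores.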

Multi-tenancy: built-in tenant isolation — each tenant gets its own HNSW index shard. Critical for SaaS applications.

Example: Weaviate Python Client with Hybrid Search

import weaviate
from weaviate.classes.query import Filter

client = weaviate.connect_to_local()
collection = client.collections.get("Document")

# Hybrid search (BM25 + vector)
results = collection.query.hybrid(
    query="transformer attention mechanism",
    alpha=0.7,  # 0 = pure BM25, 1 = pure vector
    limit=10,
    filters=Filter.by_property("category").equal("ml"),
)

for obj in results.objects:
    print(obj.properties["content"][:100])
05 — Managed, Serverless

Pinecone: Managed, Serverless

Pinecone: fully managed, serverless vector DB. No infrastructure to operate. Scales automatically.

Pods vs Serverless: pods = dedicated compute (predictable latency), serverless = pay per query (scales to zero)

Strong at: multi-tenancy via namespaces, metadata filtering, hybrid search (sparse + dense)

Limitations: no SQL joins, no self-hosting, more expensive at high query volumes vs self-hosted

Example: Pinecone Serverless

from pinecone import Pinecone, ServerlessSpec

pc = Pinecone(api_key="YOUR_API_KEY")

# Create serverless index
pc.create_index(
    "my-index",
    dimension=1536,
    metric="cosine",
    spec=ServerlessSpec(cloud="aws", region="us-east-1"),
)
index = pc.Index("my-index")

# Upsert vectors with metadata
index.upsert(
    vectors=[{
        "id": "doc_1",
        "values": embedding,
        "metadata": {"category": "tech", "date": "2024-01"},
    }],
    namespace="tenant_123",  # namespace for multi-tenancy
)

# Query with filter
results = index.query(
    vector=query_embedding,
    top_k=10,
    filter={"category": {"$eq": "tech"}},
    namespace="tenant_123",
)
06 — Developer-first

Chroma: Developer-First

Chroma: Python-native, embeddable vector DB. Runs in-process (no server) or as a server. Built for rapid prototyping and small-to-medium scale.

Built-in embedding: configure an embedding function once; Chroma calls it automatically on insert and query

Persistent or in-memory: ephemeral for testing, persistent (SQLite-backed) storage for development, server mode for production

Example: Chroma Integration

import chromadb
from chromadb.utils.embedding_functions import OpenAIEmbeddingFunction

client = chromadb.PersistentClient(path="./chroma_db")
ef = OpenAIEmbeddingFunction(api_key="YOUR_API_KEY",
                             model_name="text-embedding-3-small")
collection = client.get_or_create_collection("docs", embedding_function=ef)

# Add documents — Chroma handles embedding
collection.add(
    documents=["RAG is retrieval-augmented generation...",
               "Transformers use attention mechanisms..."],
    ids=["doc1", "doc2"],
    metadatas=[{"topic": "rag"}, {"topic": "architecture"}],
)

# Query — Chroma handles query embedding
results = collection.query(
    query_texts=["how does RAG work?"],
    n_results=3,
    where={"topic": "rag"},
)
07 — Decision Guide

Choosing and Operating a Vector DB

Vector DB Decision Guide

Situation | Recommendation
Already on Postgres, <5M vectors | pgvector
Need complex filtering at scale | Qdrant
Multi-tenant SaaS product | Weaviate or Pinecone (namespaces)
No infra team, serverless preferred | Pinecone Serverless
Prototyping / development | Chroma
>100M vectors, on-prem | Milvus or Qdrant distributed

Operational Patterns

Index warm-up: run representative queries after an HNSW index is built so caches are populated before it serves production traffic.
Hot/cold segmentation: recent vectors in a hot index, older vectors in a cold one.
Embedding cache: avoid re-embedding identical texts.
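The embedding-cache pattern can be sketched as a small content-keyed dictionary; the `embed_fn` stub and model name below are placeholders, not a real API client:

```python
import hashlib

class EmbeddingCache:
    """Cache embeddings keyed by a hash of (model, text) so identical
    chunks are never re-embedded."""

    def __init__(self, embed_fn, model="text-embedding-3-small"):
        self._embed_fn = embed_fn
        self._model = model
        self._cache = {}
        self.misses = 0  # number of actual embed_fn calls

    def _key(self, text):
        # Include the model name: the same text embeds differently per model
        return hashlib.sha256(f"{self._model}\x00{text}".encode()).hexdigest()

    def embed(self, text):
        key = self._key(text)
        if key not in self._cache:
            self.misses += 1
            self._cache[key] = self._embed_fn(text)
        return self._cache[key]

# Demo with a stub embedder (a real one would call an embedding API)
cache = EmbeddingCache(embed_fn=lambda text: [float(len(text))])
cache.embed("hello")
cache.embed("hello")   # served from cache, no second embed_fn call
cache.embed("world!")
```

In production the dictionary would typically be Redis or a database table, keyed the same way, so re-ingesting unchanged documents costs nothing.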

Monitoring

Track query latency (P50/P95/P99), recall (replay ground-truth queries regularly), index size growth, and failed queries
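Recall monitoring boils down to comparing the ANN index's answers against exact brute-force search on a held-out query set. A minimal sketch, with a simulated ANN result standing in for a real index:

```python
import numpy as np

def recall_at_k(ann_ids, exact_ids):
    """Fraction of ground-truth nearest neighbors the ANN index returned."""
    return len(set(ann_ids) & set(exact_ids)) / len(exact_ids)

# Ground truth: exact (brute-force) top-10 for one query
rng = np.random.default_rng(1)
vectors = rng.normal(size=(1000, 16))
query = rng.normal(size=16)
scores = vectors @ query
exact_top10 = np.argsort(-scores)[:10]

# Simulate an ANN index that missed one true neighbor (-1 is a dummy id)
ann_top10 = list(exact_top10[:9]) + [-1]
print(recall_at_k(ann_top10, exact_top10))  # 0.9
```

In practice you would run this over a fixed sample of real queries on a schedule and alert when recall@k drifts below target, since HNSW recall degrades silently as data distribution shifts.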

Tools Grid

Tool | Type | Highlights
pgvector | Postgres extension | HNSW + IVFFlat indices, SQL joins, ACID
Qdrant | Rust vector DB | Filtered ANN, sparse vectors, multi-vector
Weaviate | GraphQL vector DB | Schema-first, hybrid search, multi-tenancy
Pinecone | Managed serverless | No ops, auto-scaling, namespaces
Chroma | Python embeddable | In-process or server, auto-embedding
Milvus | Distributed vector DB | Kubernetes-native, petabyte scale