Choosing a Vector DB

The vector database market is crowded and loud. The good news: for most teams the decision is simple, and the most common mistake is adopting a heavyweight dedicated database far too early.

The first question: do you even need one?

Skip a vector database if:

You have fewer than roughly 10k–50k vectors. An in-memory library like FAISS, or even a plain NumPy array, can run exact search over a corpus that size in milliseconds.
You already run Postgres and your scale is moderate. The pgvector extension adds vector columns and ANN indexes to the database you already operate, back up, and secure.
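To make the small-corpus case concrete, here is a minimal sketch of exact (brute-force) cosine search over a NumPy array. The function name `exact_top_k`, the corpus size, and the embedding dimension are illustrative assumptions, and the vectors are random stand-ins for real embeddings.

```python
import numpy as np

def exact_top_k(query, vectors, k=5):
    """Return (indices, scores) of the k rows most cosine-similar to query."""
    # Normalize both sides so a dot product equals cosine similarity.
    q = query / np.linalg.norm(query)
    v = vectors / np.linalg.norm(vectors, axis=1, keepdims=True)
    scores = v @ q
    k = min(k, len(scores))
    # argpartition selects the top k without sorting the whole array,
    # then only those k candidates are sorted by score.
    top = np.argpartition(-scores, k - 1)[:k]
    top = top[np.argsort(-scores[top])]
    return top, scores[top]

# Hypothetical corpus: 10,000 random 128-dimensional "embeddings".
rng = np.random.default_rng(0)
corpus = rng.normal(size=(10_000, 128)).astype(np.float32)

# A query that is a slightly perturbed copy of row 42.
query = corpus[42] + 0.01 * rng.normal(size=128).astype(np.float32)
ids, sims = exact_top_k(query, corpus, k=5)
```

Because every vector is scored, there is no index to build or tune and no recall loss. That simplicity is the main argument for staying in-process at this scale.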

You genuinely need a dedicated vector database when you have millions of vectors or more, demanding latency targets, or frequent updates, or when you want someone else to handle operations.

The categories

Category	Examples	Best when
Postgres extension	pgvector, pgvecto.rs	You already use Postgres; moderate scale
Embedded / library	FAISS, Chroma, LanceDB	Prototypes, local apps, small corpora
Dedicated open-source	Qdrant, Weaviate, Milvus	Large scale, self-hosted, want control
Fully managed	Pinecone, managed Qdrant/Weaviate	Want zero ops, predictable scaling

The engines differ less than the marketing implies — most use HNSW under the hood. Decide on operational fit, not benchmark screenshots.

Features that actually matter

Metadata filtering

Real queries are rarely pure similarity. You want “the most relevant chunks from this user’s documents, written this year.” That means filtering on structured metadata alongside the vector search.

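# Illustrative, engine-agnostic call: `collection` and `embed` are placeholders,
# and the exact filter syntax varies by database.
results = collection.query(
    query_vector=embed("quarterly revenue"),  # embed the query text
    top_k=5,                                  # number of results to return
    filter={"user_id": "u_123", "year": {"$gte": 2024}},  # this user's documents, 2024 onward
)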

How the engine combines the filter with the ANN search matters a lot. Pre-filtering (restrict candidates to rows that match the filter, then search) preserves recall but can be slower. Post-filtering (search, then drop non-matches) is fast but can return fewer than top_k results when the filter is selective. For heavily filtered workloads, test this behavior explicitly.
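The difference in result counts is easy to reproduce in a toy setting. The sketch below uses exact dot-product scoring as a stand-in for the ANN step, so it shows only how many results each strategy returns, not their relative speed. The function names and the 2% filter selectivity are assumptions chosen for illustration.

```python
import numpy as np

def top_k(scores, k):
    """Indices of the k highest scores, best first."""
    return np.argsort(-scores)[:k]

def post_filter_search(query, vectors, mask, k):
    # Search the whole corpus first, then drop hits that fail the filter.
    hits = top_k(vectors @ query, k)
    return [int(i) for i in hits if mask[i]]

def pre_filter_search(query, vectors, mask, k):
    # Restrict to rows that pass the filter, then search only those.
    candidates = np.flatnonzero(mask)
    best = top_k(vectors[candidates] @ query, k)
    return [int(i) for i in candidates[best]]

rng = np.random.default_rng(1)
vectors = rng.normal(size=(1_000, 32))
query = rng.normal(size=32)

# Hypothetical metadata filter: only 20 of 1,000 rows belong to this user.
mask = np.zeros(1_000, dtype=bool)
mask[rng.choice(1_000, size=20, replace=False)] = True

pre = pre_filter_search(query, vectors, mask, k=5)
post = post_filter_search(query, vectors, mask, k=5)
print(len(pre), len(post))
```

Pre-filtering returns a full five results whenever at least five rows match. Post-filtering returns only as many of the global top five as happen to pass the filter, which for a selective filter is often few or none. Engines that post-filter typically compensate by over-fetching, so check what yours does.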

Hybrid search

Vector search is strong on meaning but weak on exact terms — product codes, error IDs, names, rare jargon. Keyword search (BM25) is the opposite. Hybrid search runs both and fuses the rankings (commonly with Reciprocal Rank Fusion).
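The fusion step is small enough to sketch directly. Reciprocal Rank Fusion scores each document as the sum of 1 / (k + rank) over the lists it appears in. The constant k = 60 used here is the conventional default, and the document IDs are made up for illustration.

```python
from collections import defaultdict

def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of doc IDs into a single ranking.

    Each document's score is the sum of 1 / (k + rank) over every
    list it appears in, with ranks starting at 1.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))

# Hypothetical outputs of a vector search and a BM25 keyword search.
vector_hits = ["doc_a", "doc_b", "doc_c"]
keyword_hits = ["doc_c", "doc_a", "doc_d"]

fused = reciprocal_rank_fusion([vector_hits, keyword_hits])
print(fused)  # ['doc_a', 'doc_c', 'doc_b', 'doc_d']
```

RRF uses only ranks, never raw scores, so it needs no normalization between cosine similarities and BM25 scores. Documents that both retrievers rank well rise to the top.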

Hybrid search measurably improves retrieval quality and is worth prioritizing — many RAG quality problems are really “we used pure vector search.”

Operational checklist

Beyond search, weigh: how upserts and deletes are handled, persistence and backup, horizontal scaling, security and multi-tenant isolation, and total cost (managed pricing can climb fast at scale).

A decision walkthrough

< 50k vectors, static? Use FAISS or an in-process library. Done.
Already on Postgres, moderate scale? Use pgvector. Done.
Millions of vectors, or strict latency, or heavy updates? Adopt a dedicated store.
Want zero operational burden? Pick a managed service.
Self-hosting, want control and cost efficiency? Qdrant, Weaviate, or Milvus.
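The walkthrough can also be written as a small function, which makes the order of the questions explicit. This is only a sketch: the function and parameter names are hypothetical, "moderate scale" is assumed to mean under one million vectors with no strict latency or heavy-update requirement, and the final fallback covers a case the walkthrough does not address.

```python
def recommend_store(n_vectors, static=False, on_postgres=False,
                    strict_latency=False, heavy_updates=False,
                    want_managed=False):
    """Map the walkthrough's questions to a category of vector store."""
    # Step 1: small, static corpus.
    if n_vectors < 50_000 and static:
        return "in-process library"

    # Assumption: "millions of vectors" starts at 1,000,000.
    demanding = n_vectors >= 1_000_000 or strict_latency or heavy_updates

    # Step 2: already on Postgres and the workload is moderate.
    if on_postgres and not demanding:
        return "pgvector"

    # Steps 3-5: a dedicated store, managed or self-hosted.
    if demanding:
        return "managed service" if want_managed else "self-hosted dedicated store"

    # Not covered by the walkthrough; starting small is an assumption.
    return "in-process library"

print(recommend_store(10_000, static=True))          # in-process library
print(recommend_store(300_000, on_postgres=True))    # pgvector
print(recommend_store(5_000_000, want_managed=True)) # managed service
```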

In every case, require metadata filtering and hybrid search — they matter more than raw single-query speed.

Key takeaways

Most teams should start with pgvector or an embedded library and adopt a dedicated vector database only when measured scale or latency demands it. The engines are more alike than different — choose on operational fit. Prioritize metadata filtering and hybrid search: they drive real-world retrieval quality far more than benchmark latency numbers.