Pick the wrong vector database and your RAG is slow, expensive, or — most painfully — silently wrong. Pick the right one and the storage layer disappears as a concern. The good news in 2026 is that the leading options are all genuinely production-ready; the question is which fits your operating model, your corpus, and the regulatory regime your data lives under.
This guide compares the six vector stores Indian enterprises actually shortlist — Pinecone, Milvus, Weaviate, Qdrant, pgvector, OpenSearch — on the dimensions that matter (managed vs self-hosted, scale ceiling, hybrid search, governance, India residency), explains hybrid search and HNSW, and lays out a deployment pattern that works for DPDP-bound workloads.
What a Vector Database Actually Does
A vector database stores high-dimensional embeddings — numeric representations of documents, images, audio, or any content — and supports approximate nearest neighbour (ANN) search to find the items most semantically related to a query embedding. It is the retrieval layer underneath RAG, semantic search, recommendation, and most production AI search.
The hard part is not storage; it is keeping recall (do the right items show up?) high while keeping latency (how fast?) and cost (how cheap per million vectors?) acceptable as the corpus grows. Every vector database you will consider takes a different position on that triangle.
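To make that triangle concrete, here is a brute-force cosine search in Python: the exact baseline every ANN index approximates. A minimal sketch; corpus size, dimension, and names are illustrative, and it assumes embeddings have already been produced by whatever model you use.

```python
import numpy as np

# Hypothetical corpus: 10,000 documents embedded as 384-dim vectors.
rng = np.random.default_rng(42)
corpus = rng.normal(size=(10_000, 384)).astype(np.float32)
corpus /= np.linalg.norm(corpus, axis=1, keepdims=True)  # unit-normalise

def exact_top_k(query: np.ndarray, k: int = 5) -> np.ndarray:
    """Brute-force cosine search: the exact answer an ANN index approximates."""
    query = query / np.linalg.norm(query)
    scores = corpus @ query        # cosine similarity on unit vectors
    return np.argsort(-scores)[:k]  # indices of the k closest documents

query = rng.normal(size=384).astype(np.float32)
print(exact_top_k(query))
```

Exact search touches every vector on every query. ANN indexes such as HNSW accept a small recall loss in exchange for sublinear query time, which is the entire value proposition of a dedicated engine.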
The Six Options Indian Enterprises Shortlist
Pinecone
Fully managed, serverless option. Strong default choice when you want zero operational burden and predictable performance. Pricing scales with vectors and queries. Available in multiple regions including Mumbai for India residency. Best fit: teams that value time-to-production and managed reliability over the cost or sovereignty advantages of self-hosting.
Milvus (and Zilliz, the managed Milvus)
Open-source, designed for very large scale. Strong support for hybrid search, real-time updates, and rich metadata filtering. Self-host on Kubernetes or use Zilliz Cloud for a managed deployment. Best fit: large corpora, multi-tenant SaaS, teams that want open-source insurance with optional managed convenience.
Weaviate
Open-source with a generous module ecosystem (vectoriser modules, generative modules, hybrid-search-first design). Cloud and self-host options. Strong on developer experience. Best fit: teams that want open-source with first-class hybrid search and modular composition.
Qdrant
Open-source, written in Rust, with a strong focus on performance and resource efficiency. Cloud and self-host options. Sharp filtering capabilities and a clean API. Best fit: teams that prioritise performance per rupee and want straightforward self-hosting.
pgvector (Postgres extension)
Vector storage as a Postgres extension. Mature, transactional, sits next to your relational data. Excellent for corpora under tens of millions of vectors with modest QPS. Best fit: teams whose data already lives in Postgres, want one fewer system to operate, and do not yet need the scale of a dedicated engine.
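As a rough sketch of what pgvector looks like from application code, assuming a hypothetical `documents` table with a `vector(384)` embedding column and psycopg2 as the driver (your schema, dimensions, and driver will differ):

```python
import psycopg2

# Assumes a table created roughly as:
#   CREATE EXTENSION IF NOT EXISTS vector;
#   CREATE TABLE documents (id bigserial PRIMARY KEY,
#                           content text,
#                           embedding vector(384));
# Table, column, and DSN are illustrative.
conn = psycopg2.connect("dbname=app user=app")

def semantic_top_k(query_embedding: list[float], k: int = 5):
    with conn.cursor() as cur:
        # <=> is pgvector's cosine-distance operator; smaller is closer.
        cur.execute(
            "SELECT id, content FROM documents "
            "ORDER BY embedding <=> %s::vector LIMIT %s",
            (str(query_embedding), k),
        )
        return cur.fetchall()
```

The appeal is exactly what the prose says: one transactional system, with vectors living next to the relational rows they describe.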
OpenSearch (and Elasticsearch) with vector search
Both engines support vector indexes alongside their mature keyword search. If you already run OpenSearch or Elasticsearch for logs, security, or product search, adding a vector index to the same engine is operationally cheap. Best fit: teams with existing OpenSearch / Elastic estates that want hybrid search in one engine.
Side-by-Side Comparison
| Engine | Hosted / Self | Open source | Best for |
|---|---|---|---|
| Pinecone | Managed only | No | Zero-ops, managed reliability |
| Milvus | Both (Zilliz managed) | Yes (Apache-2) | Very large scale; multi-tenant SaaS |
| Weaviate | Both | Yes (BSD) | Module ecosystem, developer experience |
| Qdrant | Both | Yes (Apache-2) | Performance per rupee; self-hosting |
| pgvector | Self-host (or via managed Postgres) | Yes | Postgres-native, small-to-mid corpora |
| OpenSearch | Both | Yes (Apache-2) | Existing OpenSearch / Elastic estates |
Hybrid Search Is the Default in 2026
Pure vector search misses queries where exact terms matter (a model number, a person's surname, a regulatory citation). Pure keyword search misses queries where meaning matters more than vocabulary. Hybrid search — combining BM25 keyword scores with vector similarity — outperforms either approach alone on most enterprise corpora and is the default pattern in production RAG.
Weaviate and OpenSearch ship hybrid out of the box. Milvus, Qdrant, and Pinecone support hybrid via sparse-and-dense indexing or score fusion at the application layer. pgvector users typically combine pgvector with Postgres full-text search and a re-ranking step. Whichever engine you pick, plan for hybrid from day one.
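If your engine does not fuse scores natively, Reciprocal Rank Fusion (RRF) at the application layer is a common and simple choice. A minimal sketch, assuming you already have two best-first ID lists from a BM25 query and an ANN query run in parallel:

```python
def reciprocal_rank_fusion(keyword_ids: list[str],
                           vector_ids: list[str],
                           k: int = 60) -> list[str]:
    """Fuse two ranked lists with Reciprocal Rank Fusion (RRF).

    Each input is a list of document IDs ordered best-first.
    k=60 is the conventional RRF constant.
    """
    scores: dict[str, float] = {}
    for ranking in (keyword_ids, vector_ids):
        for rank, doc_id in enumerate(ranking):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank + 1)
    return sorted(scores, key=scores.get, reverse=True)

# e.g. fused = reciprocal_rank_fusion(bm25_top_50, ann_top_50)
```

RRF needs only ranks, not comparable scores, which is why it survives the mismatch between BM25 scores and cosine similarities without any calibration step.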
Indexes — HNSW, IVF, and Why It Usually Doesn't Matter
HNSW (Hierarchical Navigable Small World) is the dominant ANN index across every major vector database in 2026. Strong recall, low latency, predictable behaviour. IVF (Inverted File) variants trade a little recall for memory efficiency at very large scale. Most engines let you choose, and the defaults are usually right.
Tune the index only when measurements show you need to. Engineering teams routinely waste weeks optimising HNSW parameters for corpora where the bottleneck is somewhere else entirely.
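For orientation, the knobs in question are few. A hedged sketch using the open-source hnswlib library; the parameter values below are illustrative defaults, not recommendations for your corpus:

```python
import hnswlib
import numpy as np

dim, n = 384, 10_000
data = np.random.default_rng(0).normal(size=(n, dim)).astype(np.float32)

index = hnswlib.Index(space="cosine", dim=dim)
# M: graph connectivity (memory vs recall);
# ef_construction: build-time search depth (build cost vs recall).
index.init_index(max_elements=n, M=16, ef_construction=200)
index.add_items(data, np.arange(n))

index.set_ef(64)  # query-time search depth: higher = better recall, slower
labels, distances = index.knn_query(data[:1], k=10)
```

Three parameters, each a straightforward recall-versus-resources dial. If your retrieval quality problem does not move when you sweep them, the bottleneck is elsewhere (usually chunking, embeddings, or the query itself).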
India Residency — Where to Host
For DPDP-bound workloads and BFSI sectoral expectations, vector storage residency in India matters as much as primary database residency. Three viable patterns:
- Self-host on AWS Mumbai, Azure India, or GCP Mumbai/Delhi. Milvus, Qdrant, Weaviate, OpenSearch, or pgvector — your VPC, your operational control. The right pattern for high-sensitivity workloads.
- Pinecone with regional deployment. Pinecone offers Mumbai-region deployment for India residency without operating the engine yourself.
- Hybrid. Sensitive data in self-hosted vector storage; non-sensitive corpora on a managed service. The split mirrors what most Indian enterprises already do for primary databases.
Whichever pattern, document the data flow end-to-end — what gets embedded, where embeddings live, who can query, what happens on a deletion request. See our DPDP Act AI compliance guide for the obligations the architecture has to support.
Vector Storage and Your Lakehouse
Embeddings are derived data. The source of truth lives in your data lakehouse — gold tables, document stores, knowledge bases. The vector database is an optimised retrieval index on top of that source. The architectural discipline that prevents silent failure:
- Reindex on source change. If gold changes, the vectors must be regenerated. Without this pipeline, your RAG slowly diverges from the truth.
- Tenant isolation at the index level. Multi-tenant systems must enforce tenant boundaries in the vector store, not just at filter time. Filter-time isolation breaks under prompt injection and bugs in application code.
- Provenance metadata. Every vector carries the source document ID, version, and access policy. Retrieval respects them.
- Deletion propagation. DPDP deletion requests must reach the vector store, not just the primary database. Build the pipeline (a minimal sketch follows this list); do not rely on manual hygiene.
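A minimal sketch of what deletion propagation can look like. The interfaces here are hypothetical placeholders, not any particular client library; the point is the shape of the pipeline, with provenance metadata making the vector-side delete a filter operation.

```python
from typing import Protocol

class VectorStore(Protocol):
    """Hypothetical minimal interface the deletion pipeline needs."""
    def delete_by_source(self, source_doc_id: str) -> int: ...

class PrimaryDB(Protocol):
    """Hypothetical primary-database interface."""
    def delete_document(self, doc_id: str) -> None: ...
    def record_erasure(self, doc_id: str, vectors_removed: int) -> None: ...

def handle_erasure_request(doc_id: str, db: PrimaryDB, store: VectorStore) -> None:
    """Propagate a DPDP erasure request to both stores and log the outcome."""
    # Provenance metadata (source document ID on every vector) is what
    # makes this a single filtered delete rather than a manual hunt.
    removed = store.delete_by_source(doc_id)
    db.delete_document(doc_id)
    db.record_erasure(doc_id, vectors_removed=removed)
```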
How to Actually Pick
The decision in plain English:
- Small corpus, Postgres already in your stack? → pgvector. Don't add a system you don't need.
- OpenSearch or Elastic already in production? → vector index in the same engine. Operational simplicity wins.
- Want managed and predictable, fine paying for it? → Pinecone. India region available.
- Open-source, large scale, multi-tenant? → Milvus. Self-host or Zilliz managed.
- Open-source, hybrid-first, rich modules? → Weaviate.
- Performance per rupee, lean self-hosting? → Qdrant.
Whichever you pick, benchmark on your own corpus and queries before committing. Vendor leaderboards rarely match real workloads, and the right answer for a 1M-vector corpus is rarely the right answer for a 1B-vector corpus in the same enterprise.
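Benchmarking on your own corpus mostly means measuring recall against exact ground truth. A small sketch, assuming you have run both a brute-force scan and the candidate engine over the same query set:

```python
import numpy as np

def recall_at_k(ann_results: np.ndarray, exact_results: np.ndarray) -> float:
    """Fraction of ground-truth neighbours the ANN index actually returned.

    Both arrays are (n_queries, k) matrices of document IDs:
    exact_results from a brute-force scan, ann_results from the
    engine under test.
    """
    hits = sum(
        len(set(ann_row) & set(exact_row))
        for ann_row, exact_row in zip(ann_results, exact_results)
    )
    return hits / exact_results.size
```

Measure latency and cost alongside recall, on your queries, at your corpus size; that triple is the whole comparison.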
What Will Change Through 2026
Three trends. First, the lakehouse table formats (Iceberg, Hudi, Delta) are adding native vector support, which will compress the architectural gap between the warehouse layer and the vector store for some use cases. Second, GraphRAG and structured retrieval are reducing the number of pure-vector queries in many enterprise stacks, with vectors complementing graph traversal rather than replacing it. Third, embedding models continue to improve faster than the storage engines do, so an evaluation harness that lets you swap embedding models without rebuilding the application is a higher-return investment than tuning an HNSW parameter.
Pick a vector store that fits your operating model today, and assume you will be re-evaluating every 18–24 months as the layer matures. The right architecture decouples the application from the storage choice — embeddings change, indexes change, engines change, but the agent that reads them keeps working.