Semantic Search with pgvector Inside Supabase PostgreSQL
Why pgvector Eliminates a Separate Vector Database
Most RAG tutorials tell you to set up Pinecone, Weaviate, or ChromaDB alongside your application database. This creates two data stores to maintain, two billing accounts, and additional latency on every query. If you are already using Supabase PostgreSQL, the pgvector extension lets you store and query embeddings directly in your existing database.
Enabling pgvector in Supabase
-- Run once in your Supabase SQL Editor
CREATE EXTENSION IF NOT EXISTS vector;
Now you can store embeddings as a native column type.
Creating a Vector-Enabled Table
CREATE TABLE documents (
id UUID DEFAULT gen_random_uuid() PRIMARY KEY,
content TEXT NOT NULL,
embedding vector(768) -- 768 dimensions for text-embedding-3-small
);
-- Create an approximate nearest-neighbor index for fast similarity queries
CREATE INDEX ON documents USING ivfflat (embedding vector_cosine_ops)
WITH (lists = 100);
Generating and Storing Embeddings
import { GoogleGenerativeAI } from "@google/generative-ai";
const genAI = new GoogleGenerativeAI(process.env.GEMINI_API_KEY!);
const model = genAI.getGenerativeModel({ model: "text-embedding-004" });
async function embedAndStore(text: string) {
const result = await model.embedContent(text);
const embedding = result.embedding.values;
await supabase.from("documents").insert({
content: text,
embedding: JSON.stringify(embedding), // pgvector accepts JSON array
});
}
Running a Semantic Search Query
async function semanticSearch(query: string, limit = 5) {
const queryEmbedding = await embed(query);
const { data } = await supabase.rpc("match_documents", {
query_embedding: queryEmbedding,
match_threshold: 0.78,
match_count: limit,
});
return data;
}
The Postgres function (match_documents) uses 1 - (embedding <=> query_embedding) as the cosine similarity score.
When to Use pgvector vs a Dedicated Vector DB
- pgvector: Best when your dataset is under 1M documents, you want a single infrastructure bill, and your queries are sub-100ms (the ivfflat index handles this comfortably).
- Dedicated vector DB (Pinecone/Weaviate): Best when you have tens of millions of vectors requiring ANN search with under 10ms P99 latency, or complex hybrid filtered searches at massive scale.
Building a custom RAG pipeline for your application? Let's talk →
Frequently Asked Questions
Q:How many vectors can pgvector handle efficiently?
With an ivfflat index, pgvector handles up to ~1M vectors well with sub-100ms query times. For 10M+ vectors, consider Pinecone or Qdrant.
Q:Which embedding model works best with pgvector in Supabase?
Google text-embedding-004 (768 dims) and OpenAI text-embedding-3-small (1536 dims) both work well. Smaller dimension models are faster and cheaper to store.
Related Engineering Notes
Related Project Cases
Working on something similar?
Let's collaborate to design custom PCB schematics, write deterministic FreeRTOS threads, or configure secure Next.js databases.