DataDec 21, 2023Updated Jun 11, 20268 min read

PostgreSQL as a vector database: a beginner's guide to pgvector

PostgreSQL works as a vector database with the pgvector extension. A side-by-side speed test against Pinecone, the tables LangChain creates for you, and how to host it for free on Supabase.

Dave EbbelaarSenior AI Engineer

TL;DR

Yes, PostgreSQL works as a vector database. The pgvector extension adds vector storage and similarity search to a regular Postgres database, so your embeddings sit in a table next to the original documents and you can query both with SQL. In my side-by-side test, the same similarity search over a chunked ebook took 0.53 seconds on Pinecone and 0.29 seconds on pgvector, roughly twice as fast, largely because the network hop dominates query time and Postgres can run close to your application. LangChain's PGVector class sets up the tables for you from a connection string, one for collections and one for embeddings with the source documents. To get started, enable the vector extension on a free Supabase database, run Postgres locally, or use a managed service like Azure Database for PostgreSQL.

The same similarity search that took 0.53 seconds on Pinecone came back in 0.29 seconds from Postgres with pgvector. Twice as fast, on a database I was already using for everything else. That experiment is why I moved all of my generative AI projects to pgvector, and why I think most people building applications with large language models are better off with Postgres than with a dedicated vector database like Pinecone or Weaviate.

This post is the beginner version of that argument. It covers what pgvector is, how I set up the speed test, what LangChain creates in your database when you connect it, and how to get a Postgres vector database running for free.

What pgvector is

pgvector is an extension for PostgreSQL that adds vector storage and similarity search to a normal Postgres database. Your embeddings live in a table, right next to the documents they were created from, and you query them the same way you query anything else in Postgres.

If you've built anything with retrieval augmented generation, you know the architecture. Chunk your documents, embed the chunks, store the vectors, retrieve the closest ones at query time, and pass them to the model. The vector database is one box in that diagram. pgvector's pitch is that the box can be a database you already know how to run, instead of a separate service with its own console and its own data model.

Why I moved off Pinecone

I spent my first year of building LLM applications on Pinecone. Vector databases were a new concept to me back then, Pinecone was put forward as the vector database for AI applications, and I built all of my client projects around it.

The problems showed up when those projects started moving from proof of concept to production. At that stage, managing the data that's active in the vector database matters more and more, and I kept hitting issues that forced me to reconsider my data pipelines along with the database itself. While researching alternatives I found an article where other people described the same problems and their switch to pgvector.

That got my attention because I already live in Postgres. Most of my client projects that need a database have one open in DataGrip, usually Postgres or something close to it. The vectors were the only data living somewhere else.

My first objection was the obvious one. Pinecone is a dedicated vector database, so surely it's faster. Then I found a comparison published by Supabase showing pgvector outperforming Pinecone. LangChain also ships a PGVector class that works well, so there wasn't much standing in the way of an experiment.

The speed test, side by side

The setup is deliberately simple. I took an ebook, A Christmas Carol, as a single txt file and split it into chunks. On the Pinecone side, I connected with my API key and ran a small bit of logic that loads the demo index if it exists and creates it if it doesn't. In the Pinecone data explorer you can then see the vectors, the source, and the text of each chunk. Standard Pinecone workflow so far.

To measure anything useful, I wrote a helper that runs a similarity search a configurable number of times and averages the duration. The query was the first line of the book, so the correct result is obvious. The search should return the opening chunk.

Pinecone returned the right chunk in 0.53 seconds.

Then the same thing through pgvector. A collection name, a connection string to my Postgres database, the same query through a similar helper. Same chunk back, 0.29 seconds. There's some variation between runs, so I kept going. pgvector dropped to 0.23 on a re-run; Pinecone came back at 0.5. Roughly twice as fast every time I ran it, and that's held consistently across the applications I'm building. For me, that defeated the purpose of paying for something dedicated.

One honest caveat is that I'm not a Pinecone expert. It could be that I'm missing knowledge about how to work with it effectively, and these are basic examples chosen because they're relatable. But from a speed and simplicity perspective, pgvector is the clear winner for me.

The bottleneck is the network, not the algorithm

A general-purpose database beats a dedicated one here for a boring reason. You talk to all of these services through APIs. The implementation and speed of the search algorithm is rarely the bottleneck. The network connection the data travels over is.

Postgres is open source, so you can put the database right next to wherever your application runs and cut that latency down. Pinecone is always a round trip to someone else's infrastructure.

What LangChain creates in your database

Connect LangChain's PGVector class to a Postgres database, give it a collection name and a connection string, and it creates two tables for you automatically. The first holds your collections, the name you chose plus a UUID. The second holds the embeddings, each row with the collection ID, the vector itself, and the original document.

So you end up with a traditional structured SQL database that happens to contain your vectors, and that visibility is the first thing I liked about this setup. Open it in DataGrip and you can see exactly what data your application is working with. You can query the embeddings, but also the documents and the collection IDs.

Adding data is just as visible. I loaded a second txt file under a new collection name, refreshed the database, and watched the embedding table grow from 102 rows to a lot more. That gives me a much better overview of what's in my applications, and my clients' applications, than the Pinecone console, which greets you with a random vector by default. You can query by ID there, but I don't store those IDs anywhere. Keeping them would mean running a separate database alongside Pinecone, which defeats the purpose.

Some of this probably comes down to namespaces, and configuring Pinecone with namespaces the way I use collections here might solve it. I'll grant that. My criteria were speed and simplicity, and on both counts I'd rather open a database and see what's going on.

Stuck building prototypes?

Get our production stack

What you're missing is the architecture, evaluation, and judgment to ship a real project.

See Curriculum

Searching across collections

Out of the box, LangChain's PGVector runs a similarity search against one collection at a time. I built a custom PGVector service with a custom similarity search with scores that queries across every collection in the database. You can steal it; the repository is linked in the video description.

The point of it is control. I can delete collection one and it's gone instantly. Delete collection two, same thing. Re-upload the documents and the collection is back, while one simple query keeps searching through whatever is currently loaded. Adding and replacing data in the vector store becomes a set of ordinary database operations.

The real work starts after the proof of concept

The same pattern shows up in every project. In the beginning there's a fixed set of documents. You load them, run some tests, and validate that the idea works. That moment gets treated as the finish line, and it's actually where the real testing starts. From that point on, the job is upgrading and improving the dataset much more than proving the application.

This is where the pgvector setup earns its keep, because dataset iteration is just collection management. I've gone a step further and built a GUI around the delete-collection and update-collection functions, using an Azure storage account with webhooks, so everything in the vector store can be managed without touching code.

That data discipline is a big part of what separates a demo from production AI systems. The retrieval layer is only useful if you can inspect and update what's in it.

How to set up a Postgres vector database

Three options, depending on where you are.

The fastest free start is Supabase, which I really like. You get a managed Postgres database on the free tier, you enable the vector extension, and you're done.

You can also run Postgres locally, which is how I ran this experiment, or put it on whatever server you already deploy to.

For most of my client projects I use a managed PostgreSQL database through Microsoft Azure, because that's where I deploy the applications anyway. Everything ends up hosted in one place.

If you're building LLM applications on a dedicated vector database like Pinecone or Weaviate right now, my recommendation is to at least give pgvector a shot and see if it works for you. Plain similarity search like this is the entry point. When a single retriever stops being enough, the next step is hybrid search for RAG, which combines vector search with keyword search and reranking. And if you'd rather watch the whole experiment than read about it, the PostgreSQL as a vector database tutorial walks through every step on screen.

FAQ

Can PostgreSQL be used as a vector database?

Yes. The pgvector extension adds a vector type and similarity search to Postgres, so embeddings can be stored and queried in the same database as the rest of your application data. For most LLM applications, that covers everything a vector database needs to do.

Is pgvector faster than Pinecone?

In my tests, yes. The same similarity search took 0.29 seconds versus 0.53 seconds, roughly twice as fast. The main reason is network locality, since Postgres can run close to your application. Supabase published a comparison that reached the same conclusion. This isn't a formal benchmark, so test it on your own workload.

Do I need LangChain to use pgvector?

No. pgvector works from any Postgres client. LangChain's PGVector class is convenient because it creates the collection and embedding tables automatically from a connection string, and it gives you a ready-made similarity search interface.

Where should I host a Postgres vector database?

Supabase's free tier with the vector extension enabled is the cheapest way to start. Running Postgres locally or on your own server works too. For client work I use a managed PostgreSQL database on Microsoft Azure, since my applications already deploy there.

When would a dedicated vector database still make sense?

If your setup depends on features I didn't need, it might. My issues with Pinecone could well be solvable with namespaces, and I said as much when I switched. I moved because pgvector was faster and simpler for the applications I build, not because dedicated vector databases are useless.