PGVectorStore is an implementation of a LangChain vectorstore using postgres as the backend.
This notebook goes over how to use the PGVectorStore API.
The code lives in an integration package called: langchain-postgres.
Setup
This package requires a PostgreSQL database with thepgvector extension.
You can run the following command to spin up a container for a pgvector enabled Postgres instance:
Install
Install the integration library,langchain-postgres.
Set your Postgres values
Set your Postgres values to test the functionality in this notebook against a Postgres instance.Initialization
PGEngine Connection Pool
One of the requirements and arguments to establish PostgreSQL as a vector store is aPGEngine object. The PGEngine configures a shared connection pool to your Postgres database. This is an industry best practice to manage number of connections and to reduce latency through cached database connections.
PGVectorStore can be used with the asyncpg and psycopg3 drivers.
To create a PGEngine using PGEngine.from_connection_string() you need to provide:
url: Connection string using thepostgresql+asyncpgdriver.
PGEngine using PGEngine.from_engine() you need to provide:
engine: An object ofAsyncEngine
Initialize a table
ThePGVectorStore class requires a database table. The PGEngine engine has a helper method ainit_vectorstore_table() that can be used to create a table with the proper schema for you.
See Create a custom Vector Store or Create a Vector Store using existing table for customizing the schema.
Optional Tip: 💡
You can also specify a schema name by passingschema_name wherever you pass table_name. Eg:
Create an embedding class instance
You can use any LangChain embeddings model.Initialize a default PGVectorStore
Use the default table schema to connect to the vectorstore. See Create a custom Vector Store or Create a Vector Store using existing table for customizing the schema.Manage vector store
Add documents
Add documents to the vector store. Metadata is stored in a JSON column, see “Create a custom Vector Store” to store metadata to be used for filters.Add texts
Add text directly to the vectorstore, if not structured as a Document.Delete documents
Documents can be deleted using ids.Query vector store
Search for documents
Use a natural language query to search for similar documents.Search for documents by vector
Search for similar documents using a vector embedding.Add a Index
Speed up vector search queries by applying a vector index. Learn more about vector indexes. Indexes will use a default index name if a name is not provided. To add multiple indexes, different index names are required.Re-index
Rebuild an index using the data stored in the index’s table, replacing the old copy of the index. Some index types may require re-indexing after a considerable amount of new data is added.Drop an index
Remove a vector index.Create a custom Vector Store
Customize the vectorstore with special column names or with custom metadata columns.ainit_vectorstore_table
- Use fields
content_column,embedding_column,metadata_columns,metadata_json_column,id_columnto rename the columns. - Use the
Columnclass to create custom id or metadata columns. A Column is defined by a name and data type. Any Postgres data type can be used. - Use
store_metadatato create a JSON column to store extra metadata.
Optional Tip: 💡
To use non-uuid ids, you must customize the id column:PGVectorStore
- Use fields
content_column,embedding_column,metadata_columns,metadata_json_column,id_columnto rename the columns. ignore_metadata_columnsto ignore columns that should not be used for Document metadata. This is helpful when using a preexisting table, where all data columns are not necessary.- Use a different
distance_strategyfor the similarity calculation during vector search. - Use
index_query_optionsto tune local index parameters during vector search.
Search for documents with metadata filter
A Vector Store can take advantage of relational data to filter similarity searches. The vectorstore supports a set of filters that can be applied against the metadata fields of the documents. See the migration guide for details on how to migrate to use metadata columns.PGVectorStore currently supports the following operators and all Postgres data types.
| Operator | Meaning/Category |
|---|---|
| $eq | Equality (==) |
| $ne | Inequality (!=) |
| $lt | Less than (<) |
| $lte | Less than or equal (<=) |
| $gt | Greater than (>) |
| $gte | Greater than or equal (>=) |
| $in | Special Cased (in) |
| $nin | Special Cased (not in) |
| $between | Special Cased (between) |
| $exists | Special Cased (is null) |
| $like | Text (like) |
| $ilike | Text (case-insensitive like) |
| $and | Logical (and) |
| $or | Logical (or) |
Create a Vector Store using existing table
A Vector Store can be built up on an existing table. Assuming there’s a pre-existing table in PG DB:products, which stores product details for an eComm venture.
Here is how this table mapped to PGVectorStore:
-
id_column="product_id": ID column uniquely identifies each row in the products table. -
content_column="description": Thedescriptioncolumn contains text descriptions of each product. This text is used by theembedding_serviceto create vectors that go in embedding_column and represent the semantic meaning of each description. -
embedding_column="embed": Theembedcolumn stores the vectors created from the product descriptions. These vectors are used to find products with similar descriptions. -
metadata_columns=["name", "category", "price_usd", "quantity", "sku", "image_url"]: These columns are treated as metadata for each product. Metadata provides additional information about a product, such as its name, category, price, quantity available, SKU (Stock Keeping Unit), and an image URL. This information is useful for displaying product details in search results or for filtering and categorization. -
metadata_json_column="metadata": Themetadatacolumn can store any additional information about the products in a flexible JSON format. This allows for storing varied and complex data that doesn’t fit into the standard columns.
-
Optional: If the
embedcolumn is newly created or has different dimensions than supported by embedding model, it is required to one-time add the embeddings for the old records, like this:ALTER TABLE products ADD COLUMN embed vector(768) DEFAULT NULL -
For new records, added via
VectorStoreembeddings are automatically generated.
Clean up
⚠️ WARNING: this can not be undone Drop the vector store table.Usage for retrieval-augmented generation
For guides on how to use this vector store for retrieval-augmented generation (RAG), see the following sections:API reference
For detailed documentation of all VectorStore features and configurations head to the API reference: python.langchain.com/api_reference/postgres/v2/langchain_postgres.v2.vectorstores.PGVectorStore.htmlConnect these docs programmatically to Claude, VSCode, and more via MCP for real-time answers.