Upstash Vector is a serverless vector database designed for working with vector embeddings. The LangChain integration is a wrapper around the upstash-vector package. That Python package uses the Upstash Vector REST API behind the scenes.
Installation
Create a free vector database from the Upstash console with the desired dimensions and distance metric. You can then create an UpstashVectorStore instance by:
- Providing the environment variables UPSTASH_VECTOR_URL and UPSTASH_VECTOR_TOKEN
- Giving them as parameters to the constructor
- Passing an Upstash Vector Index instance to the constructor
An Embeddings instance is required to turn the given texts into embeddings. Here we use OpenAIEmbeddings as an example.
Another way to use UpstashVectorStore is to create an Upstash Vector index by selecting a model and passing embedding=True. In this configuration, documents or queries are sent to Upstash as text and embedded there.
To try this configuration, create the store with embedding=True and run the rest of the tutorial.
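The construction options described in this section can be sketched with one small helper. This is a minimal sketch, not the integration's official example: `store_kwargs` and `make_store` are hypothetical helper names, and the keyword arguments `index`, `index_url`, `index_token` and `embedding` are assumed to match the `UpstashVectorStore` constructor in `langchain_community`. When neither an index nor explicit credentials are given, the constructor is assumed to fall back to reading the credentials from the environment.

```python
from typing import Any, Dict, Optional


def store_kwargs(embedding: Any, index: Any = None,
                 url: Optional[str] = None, token: Optional[str] = None) -> Dict[str, Any]:
    """Pick constructor arguments for one of the three setup options (hypothetical helper)."""
    # `embedding` is either an Embeddings instance or True (text is embedded by Upstash).
    kwargs: Dict[str, Any] = {"embedding": embedding}
    if index is not None:
        # Option: pass an existing upstash_vector Index instance.
        kwargs["index"] = index
    elif url and token:
        # Option: pass the credentials as constructor parameters.
        kwargs["index_url"] = url
        kwargs["index_token"] = token
    # Otherwise no credentials are passed; the store is assumed to read them
    # from the environment.
    return kwargs


def make_store(embedding: Any, index: Any = None,
               url: Optional[str] = None, token: Optional[str] = None):
    """Build the store; the import is lazy so the helper above works without the package."""
    from langchain_community.vectorstores.upstash import UpstashVectorStore

    return UpstashVectorStore(**store_kwargs(embedding, index, url, token))


# Usage (requires langchain-community, upstash-vector and, for the first line, langchain-openai):
# from langchain_openai import OpenAIEmbeddings
# store = make_store(OpenAIEmbeddings())   # embeddings computed client-side
# store = make_store(True)                 # index created with a model; Upstash embeds the text
```

Keeping the argument selection separate from the constructor call makes it easy to see which option is in effect before any request is sent.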
Load documents
Load an example text file and split it into chunks that can be turned into vector embeddings.
Inserting documents
The vector store embeds text chunks using the embedding object and batch inserts them into the database. This returns an array of the ids of the inserted vectors.
Querying
The database can be queried using a vector or a text prompt. If a text prompt is used, it is first converted into an embedding and then queried. The k parameter specifies how many results to return from the query.
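The load, insert and query steps described so far can be sketched as a few thin helpers. This is an illustrative sketch under assumptions: the helper names are hypothetical, `store` is an already constructed UpstashVectorStore, and the loader and splitter (`TextLoader`, `CharacterTextSplitter`) are one possible choice rather than a requirement of the integration.

```python
def load_chunks(path, chunk_size=1000, chunk_overlap=0):
    """Load a text file and split it into chunk Documents."""
    # Lazy imports: only needed when this helper is actually called.
    from langchain_community.document_loaders import TextLoader
    from langchain_text_splitters import CharacterTextSplitter

    documents = TextLoader(path).load()
    splitter = CharacterTextSplitter(chunk_size=chunk_size, chunk_overlap=chunk_overlap)
    return splitter.split_documents(documents)


def insert_chunks(store, chunks):
    """Embed and batch insert the chunks; returns the ids of the inserted vectors."""
    return store.add_documents(chunks)


def query_text(store, prompt, k=5):
    """Query with a text prompt; the store turns it into an embedding first."""
    return store.similarity_search(prompt, k=k)


def query_vector(store, vector, k=5):
    """Query with an embedding vector directly."""
    return store.similarity_search_by_vector(vector, k=k)


# Usage, with a hypothetical file path:
# chunks = load_chunks("example.txt")
# ids = insert_chunks(store, chunks)
# for doc in query_text(store, "some question", k=5):
#     print(doc.page_content)
```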
Querying with score
The score of the query can be included for every result. The score returned in query requests is a normalized value between 0 and 1, where 1 indicates the highest similarity and 0 the lowest, regardless of the similarity function used. For more information, see the docs.
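Because the score is normalized to the range 0 to 1, it can be compared against a fixed threshold. The sketch below assumes `store` is an existing UpstashVectorStore and that `similarity_search_with_score` returns `(document, score)` pairs; the helper name and the `min_score` parameter are hypothetical and only illustrate one way to use the score.

```python
def search_with_min_score(store, prompt, k=5, min_score=0.0):
    """Return (document, score) pairs whose normalized score is at least min_score."""
    results = store.similarity_search_with_score(prompt, k=k)
    # Scores lie between 0 (least similar) and 1 (most similar).
    return [(doc, score) for doc, score in results if score >= min_score]


# Usage:
# for doc, score in search_with_min_score(store, "some question", k=5, min_score=0.8):
#     print(score, doc.page_content)
```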
Namespaces
Namespaces can be used to separate different types of documents. This can increase the efficiency of the queries since the search space is reduced. When no namespace is provided, the default namespace is used.
Metadata Filtering
Metadata can be used to filter the results of a query. You can refer to the docs to see more complex ways of filtering.
Getting info about vector database
You can get information about your database, such as the distance metric and the dimension, using the info function.
When an insert happens, indexing takes place in the database. While this is happening, new vectors cannot be queried. pendingVectorCount represents the number of vectors that are currently being indexed.
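Since freshly inserted vectors cannot be queried until indexing finishes, one option is to poll the info function until nothing is pending. This is a sketch under assumptions: `wait_until_indexed` is a hypothetical helper, and the Python result object is assumed to expose the pending count as the attribute `pending_vector_count` (the snake_case form of pendingVectorCount).

```python
import time


def wait_until_indexed(store, timeout_s=30.0, poll_s=0.5):
    """Poll store.info() until no vectors are pending, then return the info object."""
    deadline = time.monotonic() + timeout_s
    while True:
        info = store.info()
        # Assumed attribute name; it mirrors pendingVectorCount in the text above.
        if info.pending_vector_count == 0:
            return info
        if time.monotonic() >= deadline:
            raise TimeoutError("vectors are still being indexed")
        time.sleep(poll_s)


# Usage:
# info = wait_until_indexed(store)
# print(info)  # includes details such as the dimension and distance metric
```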
Deleting vectors
Vectors can be deleted by their ids.
Clearing the vector database
This will clear the vector database.
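Deleting by id and clearing the whole database can be sketched as two thin wrappers. The wrapper names are hypothetical; the sketch assumes `store` is an existing UpstashVectorStore whose `delete` method accepts `ids` and `delete_all` keyword arguments.

```python
def delete_by_ids(store, ids):
    """Delete specific vectors, for example the ids returned when they were inserted."""
    store.delete(ids=list(ids))


def clear_store(store):
    """Remove every vector from the database. This cannot be undone."""
    store.delete(delete_all=True)


# Usage:
# delete_by_ids(store, inserted_ids)
# clear_store(store)
```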