Skip to main content
Xata is a serverless data platform, based on PostgreSQL. It provides a type-safe TypeScript/JavaScript SDK for interacting with your database, and a UI for managing your data. Xata has a native vector type, which can be added to any table, and supports similarity search. LangChain inserts vectors directly to Xata, and queries it for the nearest neighbors of a given vector, so that you can use all the LangChain Embeddings integrations with Xata.

Setup

Install the Xata CLI

npm install @xata.io/cli -g

Create a database to be used as a vector store

In the Xata UI create a new database. You can name it whatever you want, but for this example we’ll use langchain. Create a table, again you can name it anything, but we will use vectors. Add the following columns via the UI:
  • content of type “Text”. This is used to store the Document.pageContent values.
  • embedding of type “Vector”. Use the dimension used by the model you plan to use (1536 for OpenAI).
  • any other columns you want to use as metadata. They are populated from the Document.metadata object. For example, if in the Document.metadata object you have a title property, you can create a title column in the table and it will be populated.

Initialize the project

In your project, run:
xata init
and then choose the database you created above. This will also generate a xata.ts or xata.js file that defines the client you can use to interact with the database. See the Xata getting started docs for more details on using the Xata JavaScript/TypeScript SDK.

Usage

npm
npm install @langchain/openai @langchain/community @langchain/core

Example: Q&A chatbot using OpenAI and Xata as vector store

This example uses the VectorDBQAChain to search the documents stored in Xata and then pass them as context to the OpenAI model, in order to answer the question asked by the user.
import { XataVectorSearch } from "@langchain/community/vectorstores/xata";
import { OpenAIEmbeddings, OpenAI } from "@langchain/openai";
import { BaseClient } from "@xata.io/client";
import { VectorDBQAChain } from "@langchain/classic/chains";
import { Document } from "@langchain/core/documents";

// First, follow set-up instructions at
// https://js.langchain.com/docs/modules/data_connection/vectorstores/integrations/xata

// if you use the generated client, you don't need this function.
// Just import getXataClient from the generated xata.ts instead.
const getXataClient = () => {
  if (!process.env.XATA_API_KEY) {
    throw new Error("XATA_API_KEY not set");
  }

  if (!process.env.XATA_DB_URL) {
    throw new Error("XATA_DB_URL not set");
  }
  const xata = new BaseClient({
    databaseURL: process.env.XATA_DB_URL,
    apiKey: process.env.XATA_API_KEY,
    branch: process.env.XATA_BRANCH || "main",
  });
  return xata;
};

export async function run() {
  const client = getXataClient();

  const table = "vectors";
  const embeddings = new OpenAIEmbeddings();
  const store = new XataVectorSearch(embeddings, { client, table });

  // Add documents
  const docs = [
    new Document({
      pageContent: "Xata is a Serverless Data platform based on PostgreSQL",
    }),
    new Document({
      pageContent:
        "Xata offers a built-in vector type that can be used to store and query vectors",
    }),
    new Document({
      pageContent: "Xata includes similarity search",
    }),
  ];

  const ids = await store.addDocuments(docs);

  // eslint-disable-next-line no-promise-executor-return
  await new Promise((r) => setTimeout(r, 2000));

  const model = new OpenAI();
  const chain = VectorDBQAChain.fromLLM(model, store, {
    k: 1,
    returnSourceDocuments: true,
  });
  const response = await chain.invoke({ query: "What is Xata?" });

  console.log(JSON.stringify(response, null, 2));

  await store.delete({ ids });
}

Example: Similarity search with a metadata filter

This example shows how to implement semantic search using LangChain.js and Xata. Before running it, make sure to add an author column of type String to the vectors table in Xata.
import { XataVectorSearch } from "@langchain/community/vectorstores/xata";
import { OpenAIEmbeddings } from "@langchain/openai";
import { BaseClient } from "@xata.io/client";
import { Document } from "@langchain/core/documents";

// First, follow set-up instructions at
// https://js.langchain.com/docs/modules/data_connection/vectorstores/integrations/xata
// Also, add a column named "author" to the "vectors" table.

// if you use the generated client, you don't need this function.
// Just import getXataClient from the generated xata.ts instead.
const getXataClient = () => {
  if (!process.env.XATA_API_KEY) {
    throw new Error("XATA_API_KEY not set");
  }

  if (!process.env.XATA_DB_URL) {
    throw new Error("XATA_DB_URL not set");
  }
  const xata = new BaseClient({
    databaseURL: process.env.XATA_DB_URL,
    apiKey: process.env.XATA_API_KEY,
    branch: process.env.XATA_BRANCH || "main",
  });
  return xata;
};

export async function run() {
  const client = getXataClient();
  const table = "vectors";
  const embeddings = new OpenAIEmbeddings();
  const store = new XataVectorSearch(embeddings, { client, table });
  // Add documents
  const docs = [
    new Document({
      pageContent: "Xata works great with LangChain.js",
      metadata: { author: "Xata" },
    }),
    new Document({
      pageContent: "Xata works great with LangChain",
      metadata: { author: "LangChain" },
    }),
    new Document({
      pageContent: "Xata includes similarity search",
      metadata: { author: "Xata" },
    }),
  ];
  const ids = await store.addDocuments(docs);

  // eslint-disable-next-line no-promise-executor-return
  await new Promise((r) => setTimeout(r, 2000));

  // author is applied as pre-filter to the similarity search
  const results = await store.similaritySearchWithScore("xata works great", 6, {
    author: "LangChain",
  });

  console.log(JSON.stringify(results, null, 2));

  await store.delete({ ids });
}

Connect these docs programmatically to Claude, VSCode, and more via MCP for real-time answers.
I