Overview
This overview covers text-based embedding models. LangChain does not currently support multimodal embeddings.
How it works
- Vectorization — The model encodes each input string as a high-dimensional vector.
- Similarity scoring — Vectors are compared using mathematical metrics to measure how closely related the underlying texts are.
Similarity metrics
Several metrics are commonly used to compare embeddings (see the sketch after this list):
- Cosine similarity — measures the angle between two vectors.
- Euclidean distance — measures the straight-line distance between points.
- Dot product — measures how much one vector projects onto another.
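To make these metrics concrete, here is a small sketch over plain number arrays. The helper functions are illustrative only, not part of LangChain:

```typescript
// Dot product: how much one vector projects onto another.
function dotProduct(a: number[], b: number[]): number {
  return a.reduce((sum, ai, i) => sum + ai * b[i], 0);
}

// Euclidean distance: straight-line distance between two points.
function euclideanDistance(a: number[], b: number[]): number {
  return Math.sqrt(a.reduce((sum, ai, i) => sum + (ai - b[i]) ** 2, 0));
}

// Cosine similarity: cosine of the angle between two vectors.
function cosineSimilarity(a: number[], b: number[]): number {
  const norm = (v: number[]) => Math.sqrt(dotProduct(v, v));
  return dotProduct(a, b) / (norm(a) * norm(b));
}

console.log(cosineSimilarity([1, 0], [1, 0])); // 1 (same direction)
console.log(cosineSimilarity([1, 0], [0, 1])); // 0 (orthogonal)
```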
Interface
LangChain provides a standard interface for text embedding models (e.g., OpenAI, Cohere, Hugging Face) via the Embeddings interface. Two main methods are available:
- embedDocuments(documents: string[]) → number[][]: Embeds a list of documents.
- embedQuery(text: string) → number[]: Embeds a single query.
The interface allows queries and documents to be embedded with different strategies, though most providers handle them the same way in practice.
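For instance, a minimal sketch using the OpenAI integration (any provider class implementing the Embeddings interface works the same way; the model name is illustrative):

```typescript
import { OpenAIEmbeddings } from "@langchain/openai";

const embeddings = new OpenAIEmbeddings({ model: "text-embedding-3-small" });

// embedDocuments: one vector per input string.
const docVectors = await embeddings.embedDocuments([
  "LangChain provides a standard interface for embeddings.",
  "Vectors can be compared with similarity metrics.",
]);

// embedQuery: a single vector for a query string.
const queryVector = await embeddings.embedQuery("What is an embedding?");

console.log(docVectors.length);  // 2
console.log(queryVector.length); // embedding dimensionality
```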
Install and use
OpenAI
Install dependencies, add environment variables, and instantiate the model:
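A minimal sketch, assuming the @langchain/openai package (the model name is one illustrative choice):

```bash
npm install @langchain/openai
```

```bash
export OPENAI_API_KEY="your-api-key"
```

```typescript
import { OpenAIEmbeddings } from "@langchain/openai";

const embeddings = new OpenAIEmbeddings({
  model: "text-embedding-3-small",
});
```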
Azure
Install dependencies, add environment variables, and instantiate the model:
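A minimal sketch, assuming the AzureOpenAIEmbeddings class from @langchain/openai; the exact environment variables expected can vary by package version:

```bash
npm install @langchain/openai
```

```bash
export AZURE_OPENAI_API_KEY="your-api-key"
export AZURE_OPENAI_API_INSTANCE_NAME="your-instance"
export AZURE_OPENAI_API_EMBEDDINGS_DEPLOYMENT_NAME="your-deployment"
export AZURE_OPENAI_API_VERSION="2024-02-01"
```

```typescript
import { AzureOpenAIEmbeddings } from "@langchain/openai";

// Reads the Azure configuration from the environment variables above.
const embeddings = new AzureOpenAIEmbeddings();
```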
AWS
Install dependencies, add environment variables, and instantiate the model:
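A minimal sketch, assuming the BedrockEmbeddings class from @langchain/aws; the region and model ID shown are illustrative:

```bash
npm install @langchain/aws
```

```bash
export AWS_ACCESS_KEY_ID="your-access-key"
export AWS_SECRET_ACCESS_KEY="your-secret-key"
```

```typescript
import { BedrockEmbeddings } from "@langchain/aws";

const embeddings = new BedrockEmbeddings({
  region: "us-east-1",
  model: "amazon.titan-embed-text-v1",
});
```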
Google Gemini
Install dependencies, add environment variables, and instantiate the model:
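A minimal sketch, assuming the @langchain/google-genai package (the model name is illustrative):

```bash
npm install @langchain/google-genai
```

```bash
export GOOGLE_API_KEY="your-api-key"
```

```typescript
import { GoogleGenerativeAIEmbeddings } from "@langchain/google-genai";

const embeddings = new GoogleGenerativeAIEmbeddings({
  model: "text-embedding-004",
});
```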
Google Vertex
Install dependencies, add environment variables, and instantiate the model:
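A minimal sketch, assuming the @langchain/google-vertexai package; Vertex AI typically authenticates through Application Default Credentials rather than an API key, and the model name is illustrative:

```bash
npm install @langchain/google-vertexai
```

```bash
export GOOGLE_APPLICATION_CREDENTIALS="/path/to/service-account.json"
```

```typescript
import { VertexAIEmbeddings } from "@langchain/google-vertexai";

const embeddings = new VertexAIEmbeddings({
  model: "text-embedding-004",
});
```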
MistralAI
Install dependencies, add environment variables, and instantiate the model:
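A minimal sketch, assuming the @langchain/mistralai package:

```bash
npm install @langchain/mistralai
```

```bash
export MISTRAL_API_KEY="your-api-key"
```

```typescript
import { MistralAIEmbeddings } from "@langchain/mistralai";

const embeddings = new MistralAIEmbeddings({
  model: "mistral-embed",
});
```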
Cohere
Install dependencies, add environment variables, and instantiate the model:
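A minimal sketch, assuming the @langchain/cohere package (the model name is illustrative):

```bash
npm install @langchain/cohere
```

```bash
export COHERE_API_KEY="your-api-key"
```

```typescript
import { CohereEmbeddings } from "@langchain/cohere";

const embeddings = new CohereEmbeddings({
  model: "embed-english-v3.0",
});
```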
Ollama
Install dependencies and instantiate the model:
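A minimal sketch, assuming the @langchain/ollama package and a locally running Ollama server; the model name is illustrative and must be pulled first (e.g., ollama pull nomic-embed-text):

```bash
npm install @langchain/ollama
```

```typescript
import { OllamaEmbeddings } from "@langchain/ollama";

const embeddings = new OllamaEmbeddings({
  model: "nomic-embed-text",
  baseUrl: "http://localhost:11434", // default local Ollama endpoint
});
```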
Caching
Embeddings can be stored or temporarily cached to avoid recomputing them. Caching embeddings can be done using CacheBackedEmbeddings. This wrapper stores embeddings in a key-value store: the text is hashed, and the hash is used as the key in the cache.
The main supported way to initialize a CacheBackedEmbeddings is fromBytesStore. It takes the following parameters:
- underlyingEmbeddings: The embedder to use for embedding.
- documentEmbeddingStore: Any BaseStore for caching document embeddings.
- options.namespace: (optional, defaults to "") The namespace to use for the document cache. Helps avoid collisions (e.g., set it to the embedding model name).
All integrations
Alibaba Tongyi
Azure OpenAI
Baidu Qianfan
Amazon Bedrock
ByteDance Doubao
Cloudflare Workers AI
Cohere
DeepInfra
Fireworks
Google Generative AI
Google Vertex AI
Gradient AI
HuggingFace Inference
IBM watsonx.ai
Jina
Llama CPP
Minimax
MistralAI
Mixedbread AI
Nomic
Ollama
OpenAI
Pinecone
Prem AI
Tencent Hunyuan
TensorFlow
TogetherAI
HuggingFace Transformers
Voyage AI
ZhipuAI