You are currently on a page documenting the use of Ollama models as text completion models. Many popular models available on Ollama are chat completion models. You may be looking for this page instead.
For detailed documentation of all Ollama features and configuration options, please refer to the API reference.
Overview
Integration details
Ollama allows you to run open-source large language models, such as Llama 3, locally. Ollama bundles model weights, configuration, and data into a single package, defined by a Modelfile. It optimizes setup and configuration details, including GPU usage. This example goes over how to use LangChain to interact with an Ollama-run Llama 2 7b instance. For a complete list of supported models and model variants, see the Ollama model library.

| Class | Package | Local | Serializable | PY support | Downloads | Version |
|---|---|---|---|---|---|---|
| Ollama | @langchain/ollama | ✅ | ❌ | ✅ | | |
Setup
To access Ollama models you’ll need to follow these instructions to install Ollama, and install the @langchain/ollama integration package.
Credentials
If you want to get automated tracing of your model calls, you can also set your LangSmith API key by uncommenting the lines below:
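A minimal sketch, assuming the standard LangSmith environment variable names (adjust them if your LangSmith setup differs):

```typescript
// Uncomment these lines to enable LangSmith tracing for your model calls.
// process.env.LANGSMITH_TRACING = "true";
// process.env.LANGSMITH_API_KEY = "your-api-key";
```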
Installation

The LangChain Ollama integration lives in the @langchain/ollama package:
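For example, with npm (installing @langchain/core alongside it is an assumption here, for the message types used in the sketches below; use your preferred package manager):

```bash
npm install @langchain/ollama @langchain/core
```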
Instantiation
Now we can instantiate our model object and generate chat completions:
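A minimal sketch, assuming a locally running Ollama server with the llama3 model already pulled (the model name and parameters are illustrative):

```typescript
import { ChatOllama } from "@langchain/ollama";

const llm = new ChatOllama({
  // Any model you have pulled locally, e.g. via `ollama pull llama3`.
  model: "llama3",
  temperature: 0,
  maxRetries: 2,
});
```

Invocation

You can then invoke the model with a list of messages; the translation prompt below is just an example:

```typescript
const aiMsg = await llm.invoke([
  [
    "system",
    "You are a helpful assistant that translates English to French. Translate the user sentence.",
  ],
  ["human", "I love programming."],
]);

console.log(aiMsg.content);
```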
Multimodal models
Ollama supports open-source multimodal models like LLaVA in versions 0.1.15 and up. You can bind base64-encoded image data to multimodal-capable models to use as context, like this:
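A minimal sketch, assuming the llava model is pulled locally and that ./hotdog.jpg is a hypothetical local image file:

```typescript
import * as fs from "node:fs/promises";
import { ChatOllama } from "@langchain/ollama";
import { HumanMessage } from "@langchain/core/messages";

// Read a local image and base64-encode it (the file path is illustrative).
const imageData = await fs.readFile("./hotdog.jpg");

const llm = new ChatOllama({
  model: "llava",
  temperature: 0,
});

const res = await llm.invoke([
  new HumanMessage({
    content: [
      { type: "text", text: "What is in this image?" },
      {
        type: "image_url",
        image_url: {
          url: `data:image/jpeg;base64,${imageData.toString("base64")}`,
        },
      },
    ],
  }),
]);

console.log(res.content);
```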
API reference
For detailed documentation of all Ollama features and configurations, head to the API reference.