CTranslate2

CTranslate2 is a C++ and Python library for efficient inference with Transformer models. The project implements a custom runtime that applies performance optimization techniques such as weight quantization, layer fusion, and batch reordering to accelerate Transformer models and reduce their memory usage on CPU and GPU. A full list of features and supported models is included in the project’s repository. To start, please check out the official quickstart guide.

Installation and Setup

Install the Python package:

pip install ctranslate2
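Installing the package is usually not enough on its own: a Transformer checkpoint generally has to be converted to the CTranslate2 format before it can be loaded. The sketch below builds the command line for the `ct2-transformers-converter` tool that ships with the package and runs it only if the tool is on the `PATH`. The model name `gpt2`, the output directory `gpt2-ct2`, and the helper names are illustrative assumptions, and the converter additionally needs the `transformers` package and PyTorch installed.

```python
import shlex
import shutil
import subprocess


def build_convert_command(model_name, output_dir, quantization="int8"):
    """Return the argument list for the CTranslate2 Transformers converter CLI."""
    return [
        "ct2-transformers-converter",
        "--model", model_name,            # Hugging Face model name or local path
        "--output_dir", output_dir,       # directory that will hold the converted model
        "--quantization", quantization,   # weight quantization applied during conversion
    ]


def convert(model_name, output_dir, quantization="int8"):
    """Run the converter, failing early if the CLI is not installed."""
    cmd = build_convert_command(model_name, output_dir, quantization)
    if shutil.which(cmd[0]) is None:
        raise RuntimeError("ct2-transformers-converter not found; is ctranslate2 installed?")
    subprocess.run(cmd, check=True)


# Show the command that would be executed (hypothetical model and directory names).
print(shlex.join(build_convert_command("gpt2", "gpt2-ct2")))
```

Calling `convert("gpt2", "gpt2-ct2")` would then write the converted, quantized model into `gpt2-ct2`, assuming the dependencies above are present.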

LLMs

See a usage example.

from langchain_community.llms import CTranslate2
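As a minimal sketch of how the wrapper might be configured, the helper below collects the constructor arguments and imports the class lazily, so the module can be loaded even where `langchain_community` is not installed. The parameter names (`model_path`, `tokenizer_name`, `device`, `compute_type`) reflect the wrapper as we understand it and should be checked against the installed version; the helper names and the paths in the usage note are hypothetical.

```python
def ctranslate2_llm_kwargs(model_dir, tokenizer_name, device="cpu", compute_type="int8"):
    """Collect keyword arguments for the LangChain CTranslate2 wrapper."""
    return {
        "model_path": model_dir,           # directory containing the converted model
        "tokenizer_name": tokenizer_name,  # Hugging Face tokenizer matching the model
        "device": device,                  # "cpu" or "cuda"
        "compute_type": compute_type,      # e.g. "int8" on CPU
    }


def load_llm(model_dir, tokenizer_name, device="cpu", compute_type="int8"):
    """Build the LLM; requires langchain_community and ctranslate2 at call time."""
    from langchain_community.llms import CTranslate2  # imported lazily on purpose

    return CTranslate2(
        **ctranslate2_llm_kwargs(model_dir, tokenizer_name, device, compute_type)
    )
```

With a converted model in a directory such as `gpt2-ct2`, usage would look like `llm = load_llm("gpt2-ct2", "gpt2")` followed by `llm.invoke("Once upon a time")`.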
