Overview
This guide will help you get started with vLLM chat models, which leverage the langchain-openai package. For detailed documentation of all ChatOpenAI features and configurations, head to the API reference.
Integration details
| Class | Package | Local | Serializable | JS support |
|---|---|---|---|---|
| ChatOpenAI | langchain_openai | ✅ | beta | ❌ |
Model features
Specific model features, such as tool calling, support for multi-modal inputs, and support for token-level streaming, will depend on the hosted model.
Setup
See the vLLM docs for server setup. To access vLLM models through LangChain, you'll need to install the langchain-openai integration package.
Credentials
Authentication will depend on the specifics of the inference server. To enable automated tracing of your model calls, set your LangSmith API key in the LANGSMITH_API_KEY environment variable and set LANGSMITH_TRACING to true.
Installation
The LangChain vLLM integration can be accessed via the langchain-openai package, which can be installed with pip install -qU langchain-openai.
Instantiation
Now we can instantiate our model object and generate chat completions.
Invocation
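The instantiation and invocation steps can be sketched roughly as follows. This is a minimal illustration, not the page's original example: it assumes a vLLM OpenAI-compatible server is already running at http://localhost:8000/v1, that the server does not enforce an API key, and that the model name shown is only a placeholder for whatever model your server hosts. The helper names build_chat_model, build_messages, and main are hypothetical.

```python
def build_chat_model(base_url: str, model: str):
    """Create a ChatOpenAI client pointed at a vLLM OpenAI-compatible server."""
    # Imported here so the message helper below can be used without the package.
    # Requires: pip install langchain-openai
    from langchain_openai import ChatOpenAI

    return ChatOpenAI(
        model=model,        # must match a model served by your vLLM instance
        base_url=base_url,  # the server's OpenAI-compatible endpoint
        api_key="EMPTY",    # placeholder; assumes the server does not check keys
        max_tokens=32,
        temperature=0,
    )


def build_messages(text: str) -> list[tuple[str, str]]:
    """Build a (role, content) message list, a format accepted by invoke()."""
    return [
        ("system", "You are a helpful assistant that translates English to Italian."),
        ("human", text),
    ]


def main() -> None:
    # Assumed values: adjust the URL and model name to your own deployment.
    llm = build_chat_model("http://localhost:8000/v1", "mosaicml/mpt-7b")
    response = llm.invoke(build_messages("I love programming."))
    print(response.content)
```

With a server running, calling main() sends one chat completion request and prints the reply text. If your inference server requires authentication, pass a real key instead of the placeholder.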
API reference
For detailed documentation of all features and configurations exposed via langchain-openai, head to the API reference: python.langchain.com/api_reference/openai/chat_models/langchain_openai.chat_models.base.ChatOpenAI.html
Refer to the vLLM documentation as well.