Overview
This guide covers how to use the LangChain `RunPod` LLM class to interact with text generation models hosted on RunPod Serverless.
Setup
- Install the package (see the setup sketch after this list).
- Deploy an LLM Endpoint: follow the setup steps in the RunPod Provider Guide to deploy a compatible text generation endpoint on RunPod Serverless and get its Endpoint ID.
- Set Environment Variables: make sure `RUNPOD_API_KEY` and `RUNPOD_ENDPOINT_ID` are set.
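A minimal setup sketch, assuming the package is published on PyPI as `langchain-runpod` and that you either export the two environment variables in your shell or set them interactively in Python:

```python
# Assumed install command (run in your shell):
#   pip install -U langchain-runpod

import getpass
import os

# Prompt for credentials only if they are not already set in the environment.
if not os.environ.get("RUNPOD_API_KEY"):
    os.environ["RUNPOD_API_KEY"] = getpass.getpass("RunPod API key: ")
if not os.environ.get("RUNPOD_ENDPOINT_ID"):
    os.environ["RUNPOD_ENDPOINT_ID"] = input("RunPod endpoint ID: ")
```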
Instantiation
Initialize the `RunPod` class. You can pass model-specific parameters via `model_kwargs` and configure polling behavior.
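A minimal instantiation sketch. `endpoint_id` and `model_kwargs` come from this guide; the polling parameter names (`poll_interval`, `max_polling_attempts`) are assumptions, so check the class signature in the linked source for the exact names:

```python
import os

from langchain_runpod import RunPod

llm = RunPod(
    # Endpoint ID from the RunPod console (set above via RUNPOD_ENDPOINT_ID).
    endpoint_id=os.environ["RUNPOD_ENDPOINT_ID"],
    # Model-specific generation parameters forwarded to the endpoint handler.
    model_kwargs={"max_new_tokens": 256, "temperature": 0.7},
    # Polling configuration (assumed parameter names).
    poll_interval=1.0,
    max_polling_attempts=120,
)
```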
Invocation
Use the standard LangChain `.invoke()` and `.ainvoke()` methods to call the model. Streaming is also supported via `.stream()` and `.astream()` (simulated by polling the RunPod `/stream` endpoint).
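A short synchronous sketch, assuming the `llm` instance created above; the prompts are illustrative:

```python
# Single synchronous call: returns the generated text as a string.
response = llm.invoke("Explain what RunPod Serverless is in one sentence.")
print(response)

# Simulated streaming: chunks are produced by polling the endpoint's /stream route.
for chunk in llm.stream("Write a haiku about GPUs."):
    print(chunk, end="", flush=True)
```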
Async Usage
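The async methods mirror the synchronous API. A minimal sketch, assuming the same `llm` instance:

```python
import asyncio

async def main() -> None:
    # Asynchronous single call.
    result = await llm.ainvoke("Summarize the RunPod Serverless workflow.")
    print(result)

    # Asynchronous (simulated) streaming.
    async for chunk in llm.astream("List three uses for serverless GPUs."):
        print(chunk, end="", flush=True)

asyncio.run(main())
```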
Chaining
The LLM integrates seamlessly with LangChain Expression Language (LCEL) chains, as shown in the sketch below.
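A minimal LCEL sketch, assuming the `llm` instance from the instantiation step; the prompt wording is illustrative:

```python
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import PromptTemplate

prompt = PromptTemplate.from_template("Answer concisely: {question}")

# Compose prompt -> LLM -> string parser with the LCEL pipe operator.
chain = prompt | llm | StrOutputParser()

print(chain.invoke({"question": "What is RunPod Serverless?"}))
```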
Endpoint Considerations
- Input: The endpoint handler should expect the prompt string within `{"input": {"prompt": "...", ...}}` (see the handler sketch after this list).
- Output: The handler should return the generated text within the `"output"` key of the final status response (e.g., `{"output": "Generated text..."}` or `{"output": {"text": "..."}}`).
- Streaming: For simulated streaming via the `/stream` endpoint, the handler must populate the `"stream"` key in the status response with a list of chunk dictionaries, like `[{"output": "token1"}, {"output": "token2"}]`.
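For illustration, a minimal RunPod serverless handler sketch that matches these expectations; `generate_text` is a hypothetical placeholder for your model call, and enabling streaming chunks depends on how you configure your handler:

```python
import runpod

def generate_text(prompt: str) -> str:
    # Hypothetical placeholder: call your model here.
    return f"Echo: {prompt}"

def handler(job):
    # The RunPod LLM class sends {"input": {"prompt": "...", ...}}.
    prompt = job["input"]["prompt"]
    text = generate_text(prompt)
    # The returned value becomes the "output" key of the final status response.
    return {"text": text}

runpod.serverless.start({"handler": handler})
```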
API reference
For detailed documentation of the `RunPod` LLM class, parameters, and methods, refer to the source code or the generated API reference (if available).
Link to source code: https://github.com/runpod/langchain-runpod/blob/main/langchain_runpod/llms.py