TruLens is an open-source package that provides instrumentation and evaluation tools for large language model (LLM) based applications. This page covers how to use TruLens to evaluate and track LLM apps built on LangChain.
Installation and Setup
Install the trulens-eval Python package.
Quickstart
See the integration details in the TruLens documentation.
Tracking
Once you’ve created your LLM chain, you can use TruLens for evaluation and tracking. TruLens has a number of out-of-the-box Feedback Functions, and is also an extensible framework for LLM evaluation. Create the feedback functions:
Chains
After you’ve set up Feedback Function(s) for evaluating your LLM, you can wrap your application with TruChain to get detailed tracing, logging and evaluation of your LLM app. Note: the code for chain creation is in the TruLens documentation.
Evaluation
Now you can explore your LLM-based application! Doing so will help you understand how your LLM application is performing at a glance. As you iterate new versions of your LLM application, you can compare their performance across all of the different quality metrics you’ve set up. You’ll also be able to view evaluations at a record level, and explore the chain metadata for each record.