Evaluation how-to guides

These guides answer “How do I…?” questions. They are goal-oriented and concrete, meant to help you complete a specific task. For conceptual explanations, see the Conceptual guide. For end-to-end walkthroughs, see the Tutorials. For comprehensive descriptions of every class and function, see the API Reference.

Offline evaluation

Evaluate and improve your application by testing it against a dataset; a minimal end-to-end sketch follows the list below.

Run an evaluation

Define an evaluator

Configure the data

Configure an evaluation job
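
The pieces above (a target to run, an evaluator, a dataset, and a job configuration) come together in a single `evaluate` call. The following is a minimal sketch assuming the LangSmith Python SDK (`pip install langsmith`) and an existing dataset named `my-qa-dataset`; the target function and evaluator are placeholder examples, not part of the SDK.

```python
from langsmith import Client

client = Client()  # reads LANGSMITH_API_KEY from the environment

def target(inputs: dict) -> dict:
    # The application under test; replace with a call into your app.
    return {"answer": "LangSmith is a platform for LLM observability."}

def exact_match(outputs: dict, reference_outputs: dict) -> dict:
    # A custom evaluator: compares the app's output to the reference answer.
    return {
        "key": "exact_match",
        "score": int(outputs["answer"] == reference_outputs["answer"]),
    }

results = client.evaluate(
    target,
    data="my-qa-dataset",          # the dataset to evaluate against
    evaluators=[exact_match],      # evaluators that score each run
    experiment_prefix="baseline",  # groups runs under a named experiment
)
```

Each guide in the list covers one of these pieces in more depth.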

Unit testing

Unit test your system to identify bugs and regressions.
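
As a sketch of what a LangSmith-backed unit test can look like, assuming the SDK's pytest plugin is available (it ships with the `langsmith` package) and a hypothetical `my_app.generate_sql` function under test:

```python
import pytest

from my_app import generate_sql  # hypothetical module under test

@pytest.mark.langsmith  # logs this test case to LangSmith as an experiment
def test_generate_sql_mentions_users_table():
    query = generate_sql("How many users signed up last week?")
    assert "users" in query.lower()
```

Run it with `pytest` as usual; if the plugin is active, results are also recorded in LangSmith so regressions can be tracked across commits.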

Online evaluation

Evaluate and monitor your system's live performance on production data.

Automatic evaluation

Set up evaluators that automatically run for all experiments against a dataset.

Analyzing experiment results

Use the UI and the API to understand your experiment results.
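
For programmatic analysis, the object returned by `evaluate` can be converted to a pandas DataFrame in recent SDK versions. Continuing from the offline-evaluation sketch above (and assuming pandas is installed):

```python
df = results.to_pandas()  # one row per example: inputs, outputs, and scores
print(df.columns)
print(df.head())
```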

Dataset management

Manage the LangSmith datasets that your evaluations use.
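
Datasets can also be created programmatically. A minimal sketch with the Python SDK, where the dataset name, description, and example are placeholders:

```python
from langsmith import Client

client = Client()

dataset = client.create_dataset(
    "my-qa-dataset",
    description="Questions and reference answers for offline evaluation.",
)
client.create_examples(
    inputs=[{"question": "What is LangSmith?"}],
    outputs=[{"answer": "LangSmith is a platform for LLM observability."}],
    dataset_id=dataset.id,
)
```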

Annotation queues and human feedback

Collect feedback from subject matter experts and users to improve your applications.
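
Feedback collected from reviewers can also be attached to runs programmatically. A sketch assuming you already have a run ID (for example, from an annotation queue); the key, score, and ID below are placeholders:

```python
from langsmith import Client

client = Client()

client.create_feedback(
    run_id="00000000-0000-0000-0000-000000000000",  # placeholder run ID
    key="correctness",  # feedback dimension
    score=1.0,          # 1.0 = correct, 0.0 = incorrect
    comment="Verified by a subject matter expert.",
)
```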

