Opik

Created by comet-ml

Debug, evaluate, and monitor LLM applications, RAG systems, and agentic workflows.

About

Opik is an open-source platform for evaluating, testing, and monitoring LLM applications. It traces LLM calls during development and in production, and lets developers annotate those calls with feedback scores through the Python SDK or the UI. For evaluation, Opik provides datasets, experiments, and LLM-as-a-judge metrics that cover hallucination detection, moderation, and RAG evaluation. It also integrates with CI/CD pipelines and offers production monitoring dashboards.
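To make "tracing a call and annotating it with a feedback score" concrete, here is a minimal standard-library sketch of the idea. It does not use the Opik SDK. The names `traced`, `add_feedback`, `Trace`, `TRACES`, and `fake_llm` are hypothetical and chosen only for illustration, and the in-memory list stands in for whatever storage a real platform would use.

```python
import functools
import time
from dataclasses import dataclass, field


@dataclass
class Trace:
    """One recorded call: name, inputs, output, duration, and feedback scores."""
    name: str
    inputs: dict
    output: object = None
    duration_s: float = 0.0
    feedback: dict = field(default_factory=dict)


# In-memory store for the sketch; a real platform would persist traces.
TRACES = []


def traced(func):
    """Record every call to `func` as a Trace (hypothetical helper, not Opik's API)."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        trace = Trace(name=func.__name__, inputs={"args": args, "kwargs": kwargs})
        start = time.perf_counter()
        try:
            trace.output = func(*args, **kwargs)
            return trace.output
        finally:
            # Store the trace even if the wrapped call raised.
            trace.duration_s = time.perf_counter() - start
            TRACES.append(trace)
    return wrapper


def add_feedback(trace, metric, score):
    """Attach a named feedback score to a recorded trace."""
    trace.feedback[metric] = score


@traced
def fake_llm(prompt):
    # Stand-in for a real model call so the example runs offline.
    return f"echo: {prompt}"


answer = fake_llm("What is Opik?")
add_feedback(TRACES[-1], "relevance", 0.9)
```

After this runs, `TRACES` holds one entry for `fake_llm` with its input, output, timing, and a `relevance` score, which is roughly the kind of record a tracing UI would display.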

Key Features

  • LLM call tracing during development and production
  • Integrations with LangChain, LlamaIndex, OpenAI, and more
  • Prompt playground for experimenting with prompts and models
  • Production monitoring dashboards for tracking feedback and tokens
  • LLM-as-a-judge metrics for automated evaluation
  • 6,411 GitHub stars

Use Cases

  • Engineering prompts in a playground environment
  • Automating LLM application evaluation
  • Debugging and monitoring LLM applications
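As a rough illustration of automated evaluation, the sketch below runs a small dataset through an application and scores each answer with a judge function. It is plain Python rather than Opik's evaluation API. `judge_stub` is a hypothetical placeholder that uses a substring check where a real LLM-as-a-judge metric would call a model, and `app_under_test`, `dataset`, and `run_experiment` are likewise assumed names.

```python
from statistics import mean


def judge_stub(question, answer, expected):
    """Stand-in for an LLM judge: 1.0 if the expected text appears in the answer."""
    return 1.0 if expected.lower() in answer.lower() else 0.0


def app_under_test(question):
    # Hypothetical application returning canned answers so the example runs offline.
    canned = {
        "Capital of France?": "The capital is Paris.",
        "2 + 2?": "5",
    }
    return canned.get(question, "")


# A tiny dataset of questions with reference answers.
dataset = [
    {"question": "Capital of France?", "expected": "Paris"},
    {"question": "2 + 2?", "expected": "4"},
]


def run_experiment(app, items, metric):
    """Run every dataset item through the app and score the answer with the metric."""
    rows = []
    for item in items:
        answer = app(item["question"])
        score = metric(item["question"], answer, item["expected"])
        rows.append({**item, "answer": answer, "score": score})
    return rows


results = run_experiment(app_under_test, dataset, judge_stub)
average_score = mean(row["score"] for row in results)
```

Keeping the metric as a plain callable means the substring check can be swapped for a model-backed judge without changing the experiment loop.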