Phoenix

Created by Arize-ai

Provides an open-source AI observability platform designed for experimentation, evaluation, and troubleshooting of LLM applications.

About

Phoenix is an open-source AI observability platform designed for experimentation, evaluation, and troubleshooting of LLM applications. It offers tracing, evaluation, dataset management, experiment tracking, a playground for prompt engineering, and prompt management. Its vendor- and language-agnostic design supports popular frameworks such as LlamaIndex and LangChain, along with LLM providers such as OpenAI and Bedrock, making it suitable for a range of deployment environments: local machines, Jupyter notebooks, containers, or the cloud.

Key Features

  • Leverage LLMs to benchmark application performance using response and retrieval evals.
  • Track and evaluate changes to prompts, LLMs, and retrieval.
  • Trace LLM application runtime behavior using OpenTelemetry-based instrumentation.
  • Create versioned datasets of examples for experimentation, evaluation, and fine-tuning.
  • Optimize prompts, compare models, adjust parameters, and replay traced LLM calls.

Use Cases

  • Benchmarking and evaluating the performance of different LLMs and prompts.
  • Troubleshooting and debugging LLM applications.
  • Experimenting with and fine-tuning LLM models and prompts.