Integrates Langfuse observability and prompt management into your development workflow for debugging and monitoring AI systems.
The Langfuse skill lets developers monitor and debug AI applications directly from Claude Code or Codex CLI by connecting to Langfuse via the Model Context Protocol (MCP). It provides real-time access to AI traces, exceptions, and session data, making it easy to identify performance bottlenecks, troubleshoot failed interactions, and inspect LLM inputs and outputs. Beyond observability, the skill also manages production prompts and evaluation datasets, supporting a closed-loop workflow for iterating on and deploying AI features safely.
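As a rough sketch, connecting a coding agent such as Claude Code to Langfuse over MCP typically means registering a Langfuse MCP server in the agent's MCP configuration. The exact server command, package name, and environment-variable names below are assumptions and should be checked against the Langfuse and Claude Code documentation:

```json
{
  "mcpServers": {
    "langfuse": {
      "command": "npx",
      "args": ["-y", "mcp-server-langfuse"],
      "env": {
        "LANGFUSE_PUBLIC_KEY": "pk-lf-...",
        "LANGFUSE_SECRET_KEY": "sk-lf-...",
        "LANGFUSE_BASEURL": "https://cloud.langfuse.com"
      }
    }
  }
}
```

The keys are the same project API keys used by the Langfuse SDKs; self-hosted deployments would point the base URL at their own instance.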
Key Features
1. Manage prompt versions and labels across staging and production
2. Identify and debug exceptions with full stack-trace context
3. Create and maintain evaluation datasets and test cases
4. Monitor performance metrics, including latency and token usage
5. Query and inspect AI traces and LLM generations in real time
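To make the latency and token-usage monitoring above concrete, here is a minimal sketch of how such metrics could be aggregated across the LLM generations in a single trace. The trace structure (`start_time`/`end_time` timestamps and a `usage` dict per generation) is a simplified assumption modeled loosely on what Langfuse records, not the skill's actual output format:

```python
from datetime import datetime, timedelta

def summarize_trace(generations):
    """Aggregate latency and token usage across a trace's LLM generations.

    Each generation is assumed to carry start/end timestamps and a usage
    dict -- a simplified stand-in for per-generation observability data.
    """
    total_latency = timedelta()
    total_tokens = 0
    slowest = None  # (generation name, latency) of the slowest call
    for gen in generations:
        latency = gen["end_time"] - gen["start_time"]
        total_latency += latency
        total_tokens += gen["usage"]["input_tokens"] + gen["usage"]["output_tokens"]
        if slowest is None or latency > slowest[1]:
            slowest = (gen["name"], latency)
    return {
        "total_latency_s": total_latency.total_seconds(),
        "total_tokens": total_tokens,
        "slowest_call": slowest[0] if slowest else None,
    }

# Example: a trace with two generations, one clearly slower than the other
t0 = datetime(2024, 1, 1, 12, 0, 0)
trace = [
    {"name": "classify", "start_time": t0, "end_time": t0 + timedelta(seconds=1.2),
     "usage": {"input_tokens": 300, "output_tokens": 20}},
    {"name": "answer", "start_time": t0, "end_time": t0 + timedelta(seconds=4.5),
     "usage": {"input_tokens": 900, "output_tokens": 450}},
]
summary = summarize_trace(trace)
```

In practice the skill surfaces this kind of summary directly from Langfuse data, so the "slowest call" and token-heavy generations can be found without manual aggregation.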
Use Cases
1. Root-cause analysis of AI failures by retrieving specific trace and session logs
2. Performance optimization by identifying high-latency LLM calls and token-heavy generations
3. Streamlining prompt-engineering workflows by promoting versions from staging to production via the CLI
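The staging-to-production promotion workflow can be pictured as moving a label between prompt versions. The toy registry below illustrates that model in memory; it is not the Langfuse API, and all class and method names here are hypothetical:

```python
class PromptRegistry:
    """Toy model of label-based prompt deployment: each label (e.g. 'staging',
    'production') points at exactly one version of a named prompt. Promotion
    moves the 'production' label to whatever version 'staging' points at."""

    def __init__(self):
        self.versions = {}  # prompt name -> {version number: prompt text}
        self.labels = {}    # prompt name -> {label: version number}

    def create_version(self, name, text, label="staging"):
        versions = self.versions.setdefault(name, {})
        version = len(versions) + 1  # versions are numbered sequentially
        versions[version] = text
        self.labels.setdefault(name, {})[label] = version
        return version

    def promote(self, name, from_label="staging", to_label="production"):
        """Point to_label at the version currently carrying from_label."""
        version = self.labels[name][from_label]
        self.labels[name][to_label] = version
        return version

    def get(self, name, label="production"):
        return self.versions[name][self.labels[name][label]]

registry = PromptRegistry()
registry.create_version("support-bot", "You are a helpful agent.", label="production")
registry.create_version("support-bot", "You are a concise, helpful agent.")  # staging
promoted = registry.promote("support-bot")  # production now serves version 2
```

Because the label moves rather than the prompt text, a bad promotion can be rolled back by re-pointing the label at an earlier version, which is what makes this pattern safe for iterating on deployed prompts.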