Manages human annotations and quality scores for Langfuse traces directly within the Claude Code environment.
The Langfuse Annotation Manager skill empowers developers to streamline their LLM evaluation workflows by managing human-in-the-loop annotations. It provides a comprehensive suite of tools to create, update, and delete trace scores across numeric, categorical, and boolean data types. By enabling users to quickly identify pending traces and export annotation data to JSON or CSV, it bridges the gap between observability and actionable quality improvements for AI applications.
Key Features
- Identify pending traces that require human review, filtered by timeframe
- Create numeric, categorical, and boolean scores for traces
- Update or delete existing trace scores and comments
- Export annotation data to JSON and CSV formats for analysis
- List available score configurations and types
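The three score types and the two export formats can be sketched with plain Python. This is a minimal, self-contained illustration, not the skill's actual implementation: the record fields (`traceId`, `name`, `value`, `dataType`, `comment`) are assumptions modeled loosely on Langfuse's public score schema, and the data is made up.

```python
import csv
import io
import json

# Hypothetical in-memory score records -- field names are assumptions
# modeled on Langfuse's score schema, not the skill's wire format.
scores = [
    {"traceId": "t-1", "name": "quality", "value": 0.9,
     "dataType": "NUMERIC", "comment": "fluent answer"},
    {"traceId": "t-2", "name": "intent", "value": "refund",
     "dataType": "CATEGORICAL", "comment": None},
    {"traceId": "t-3", "name": "hallucinated", "value": False,
     "dataType": "BOOLEAN", "comment": "grounded in context"},
]

def export_json(records):
    """Serialize score records to a pretty-printed JSON string."""
    return json.dumps(records, indent=2)

def export_csv(records):
    """Flatten score records into CSV with a fixed header row."""
    buf = io.StringIO()
    writer = csv.DictWriter(
        buf, fieldnames=["traceId", "name", "value", "dataType", "comment"])
    writer.writeheader()
    writer.writerows(records)
    return buf.getvalue()
```

Keeping one row per score (rather than one row per trace) makes the CSV robust when a trace carries several scores of different types.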
Use Cases
- Managing large-scale annotation workflows to find and review unrated traces
- Generating labeled datasets for fine-tuning by exporting verified scores
- Conducting manual QA and human-in-the-loop evaluation of LLM outputs
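The fine-tuning use case amounts to joining human scores back onto their traces and keeping only the ones that clear a quality bar. The sketch below assumes hypothetical trace fields (`input`, `output`), a score named `quality`, and an illustrative 0.8 threshold; none of these come from the skill itself.

```python
import json

# Hypothetical traces and human-verified scores (illustrative data).
traces = {
    "t-1": {"input": "Summarize the ticket.",
            "output": "Customer wants a refund."},
    "t-2": {"input": "Classify intent.", "output": "refund"},
}
verified = [
    {"traceId": "t-1", "name": "quality", "value": 0.9},
    {"traceId": "t-2", "name": "quality", "value": 0.4},
]

def to_jsonl(traces, scores, min_quality=0.8):
    """Keep traces whose human quality score clears the bar and emit
    prompt/completion pairs as JSON Lines, one record per line."""
    lines = []
    for s in scores:
        if s["name"] == "quality" and s["value"] >= min_quality:
            t = traces[s["traceId"]]
            lines.append(json.dumps(
                {"prompt": t["input"], "completion": t["output"]}))
    return "\n".join(lines)
```

Here only the 0.9-scored trace survives the filter, so the output is a single JSONL record.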