Can I automate the evaluation process?

Yes. With the --no-user-score flag, the skill runs agent-only evaluations that require no interactive user input.

Where does the skill store its data?

All evaluation data, including scoring rubrics, logs, and sample files, are stored locally within the plugin's docs/evaluation directory.

How are the quality scores calculated?

Summaries are evaluated on four criteria: Completeness, Conciseness, Actionability, and Accuracy. Each is worth 2.5 points, totaling a maximum score of 10.
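
The arithmetic behind the rubric can be sketched in a few lines of Python. This is only an illustration of the 4 x 2.5 scoring described above, not the skill's actual implementation: the `RubricScore` class, its field names, and the range check are all assumptions made for the example.

```python
from dataclasses import dataclass

# From the rubric: each of the four criteria is worth up to 2.5 points.
MAX_PER_CRITERION = 2.5


@dataclass
class RubricScore:
    """Hypothetical container for one evaluation (names are illustrative)."""
    completeness: float
    conciseness: float
    actionability: float
    accuracy: float

    def total(self) -> float:
        """Sum the four criteria, rejecting values outside 0..2.5."""
        parts = (self.completeness, self.conciseness,
                 self.actionability, self.accuracy)
        for value in parts:
            if not 0.0 <= value <= MAX_PER_CRITERION:
                raise ValueError(
                    f"criterion score {value} is outside 0..{MAX_PER_CRITERION}"
                )
        return sum(parts)


score = RubricScore(completeness=2.0, conciseness=2.5,
                    actionability=1.5, accuracy=2.5)
print(score.total())  # 8.5
```

A summary that earns full marks on every criterion reaches the maximum of 10.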

What is the purpose of the TLDR Feedback skill?

It is a development tool that audits and improves the quality of the /tldr command by scoring its output against a standardized rubric.

TLDR Feedback & Evaluation

Name: TLDR Feedback & Evaluation
Author: itsdevcoffee

Learning & Documentation

Evaluates and audits the quality of TLDR summaries against standardized rubrics to track and improve AI-generated documentation.

The TLDR Feedback skill provides a structured framework for grading the performance of summarization tasks within Claude Code. It automates a multi-step evaluation workflow that scores summaries based on completeness, conciseness, actionability, and accuracy. By maintaining a centralized evaluation log and generating detailed sample benchmarks, this skill enables developers to quantify the effectiveness of their AI agents and iterate on prompt engineering or documentation strategies with data-driven insights.

Key Features

01 Headless evaluation mode for programmatic benchmarking

02 Automated version-controlled evaluation logging and metrics tracking

03 Detailed analysis reporting with specific recommendations for improvement

04 Standardized 10-point scoring rubric based on four key quality dimensions

05 Interactive user feedback collection for human-in-the-loop scoring

06 1 GitHub star

Use Cases

01 Maintaining a historical log of documentation quality for long-term improvement

02 Benchmarking the accuracy of AI-generated project summaries

03 Quantifying the impact of prompt changes on summary actionability

Install with 🐟 Skill.Fish

npx skillfish add itsdevcoffee/devcoffee-agent-skills feedback
