What metrics are used for evaluation?

The skill uses defined acceptance metrics, such as pass rates, defect counts, and cycle times, to determine whether a pattern is effective.
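To make the three metrics concrete, here is a minimal sketch of how they might be computed from a set of trial runs. The `TrialRun` record, its field names, and the `summarize` helper are hypothetical and for illustration only; the listing does not describe how the skill stores or aggregates its results.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class TrialRun:
    """One recorded run (hypothetical shape, not the skill's own format)."""
    passed: bool           # did the run meet its acceptance check?
    defects: int           # defects found in the run's output
    cycle_seconds: float   # wall-clock time taken by the run


def summarize(runs):
    """Aggregate a list of runs into the three acceptance metrics."""
    if not runs:
        raise ValueError("need at least one run to compute metrics")
    return {
        "pass_rate": sum(r.passed for r in runs) / len(runs),
        "defect_count": sum(r.defects for r in runs),
        "mean_cycle_seconds": mean(r.cycle_seconds for r in runs),
    }


runs = [TrialRun(True, 0, 10.0), TrialRun(False, 2, 20.0)]
print(summarize(runs))
```

Running the same `summarize` over a baseline set and a trial set gives two dictionaries that can be compared metric by metric.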

How does it handle patterns that don't show improvement?

It follows a strict rule: a pattern is never adopted without measurable improvement. When a pattern shows none, the skill suggests either revising or rejecting it.
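The rule above can be sketched as a small decision function. This is an illustration under stated assumptions, not the skill's actual logic: the metric names, the `HIGHER_IS_BETTER` table, and the choice to map mixed or unchanged results to "revise" and pure regressions to "reject" are all hypothetical.

```python
# Assumed direction of improvement for each metric (hypothetical names).
HIGHER_IS_BETTER = {
    "pass_rate": True,
    "defect_count": False,
    "mean_cycle_seconds": False,
}


def decide(baseline, trial):
    """Return 'adopt', 'revise', or 'reject' for a candidate pattern."""
    improved, regressed = [], []
    for name, higher_is_better in HIGHER_IS_BETTER.items():
        delta = trial[name] - baseline[name]
        if not higher_is_better:
            delta = -delta  # normalize so positive always means better
        if delta > 0:
            improved.append(name)
        elif delta < 0:
            regressed.append(name)

    if improved and not regressed:
        return "adopt"    # measurable improvement, no regression
    if regressed and not improved:
        return "reject"   # only got worse
    return "revise"       # mixed results or no measurable change


baseline = {"pass_rate": 0.8, "defect_count": 5, "mean_cycle_seconds": 30.0}
trial = {"pass_rate": 0.9, "defect_count": 3, "mean_cycle_seconds": 25.0}
print(decide(baseline, trial))
```

A real evaluation would likely also require a minimum effect size or repeated trials before treating a difference as measurable; this sketch compares raw values only.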

Does this skill work in restricted environments?

Yes. It includes fallback mechanisms for unavailable tools and marks unsupported checks as 'SKIP' instead of failing hard.
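One way such a fallback could work is to probe for a tool before invoking it. The sketch below is an assumption about the general technique, not the skill's implementation; `run_check` and the status strings other than 'SKIP' are hypothetical.

```python
import shutil
import subprocess


def run_check(name, command):
    """Run one check; report SKIP when its tool is not installed."""
    # shutil.which returns None if the executable is not on PATH.
    if shutil.which(command[0]) is None:
        return (name, "SKIP")
    result = subprocess.run(command, capture_output=True)
    return (name, "PASS" if result.returncode == 0 else "FAIL")


# A tool that is not installed yields SKIP instead of raising an error.
print(run_check("lint", ["definitely-not-a-real-tool-xyz123", "--version"]))
```

Reporting SKIP as a distinct status keeps a missing tool from being mistaken for either a pass or a regression.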

What is the primary purpose of the learn-eval skill?

It is designed to evaluate and validate newly learned patterns by comparing them against baseline performance metrics to ensure measurable improvement.

Learning Evaluation & Benchmarking

Name: Learning Evaluation & Benchmarking
Author: oabdelmaksoud


Security & Testing

Validates learned AI patterns through systematic benchmarking, quality scoring, and regression testing.

The ecc-cmd-learn-eval skill validates candidate learnings derived from agent tasks. Users define specific acceptance metrics, such as pass rates and cycle times, and run controlled trials that compare baseline performance against the new pattern. Only patterns that demonstrate measurable improvement are adopted, and regressions are flagged immediately for review, which helps maintain high-quality autonomous workflows.

Key Features

01 Quantitative pattern benchmarking

02 Regression detection and flagging

03 Customizable acceptance metrics (pass rate, cycle time)

04 Automated trial vs. baseline comparisons

05 Decision-based workflow (adopt/revise/reject)

06 0 GitHub stars

Use Cases

01 Ensuring quality control in autonomous agent execution

02 Running comparative benchmarks for workflow optimizations

03 Testing the efficiency of newly extracted coding patterns


Install with 🐟 Skill.Fish

npx skillfish add oabdelmaksoud/agention ecc-cmd-learn-eval

For use in Claude.ai and ChatGPT
