Do I need to manually review the test transcripts?

No, Skill Tester uses specialized sub-agents to analyze transcripts and provide structured evaluations with direct evidence from tool calls.

Can Skill Tester handle user interactions during tests?

Yes, it adopts a defined persona from your scenarios to answer questions and interact with the runners naturally as a real user would.

How does Skill Tester compare skill performance?

It spawns two runners with the target skill enabled and one baseline runner without the skill, then compares their ability to meet specific acceptance criteria.
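
As a rough illustration of this comparison, the sketch below averages acceptance-criteria pass rates for the skill-enabled runners and the baseline runner. The `RunnerResult` type, its fields, and the `compare` function are hypothetical names invented for this example. Skill Tester's grading is done by sub-agents, and its internal result format is not described here.

```python
from dataclasses import dataclass


@dataclass
class RunnerResult:
    """Hypothetical record of one runner's graded outcome."""
    name: str
    skill_enabled: bool
    criteria: dict[str, bool]  # acceptance criterion -> met or not


def pass_rate(result: RunnerResult) -> float:
    """Fraction of acceptance criteria this runner met (0.0 if none defined)."""
    if not result.criteria:
        return 0.0
    return sum(result.criteria.values()) / len(result.criteria)


def compare(results: list[RunnerResult]) -> dict[str, float]:
    """Average pass rate of skill-enabled runners versus baseline runners."""
    skill = [pass_rate(r) for r in results if r.skill_enabled]
    base = [pass_rate(r) for r in results if not r.skill_enabled]
    skill_avg = sum(skill) / len(skill) if skill else 0.0
    base_avg = sum(base) / len(base) if base else 0.0
    return {"skill": skill_avg, "baseline": base_avg, "delta": skill_avg - base_avg}
```

A positive `delta` in this sketch would suggest the skill helped on the given criteria. A value near zero would suggest it added little over the baseline.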

What is required to run a skill test?

You need scenario files prepared by a tool like skill-test-designer, typically located in your ~/.claude/skill-tests/ directory.
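
To check which scenario files are available before a run, a minimal sketch like the following lists every file under a scenario directory. The helper name `find_scenarios` is hypothetical. The scenario file format and extension are not specified here, so it simply returns all regular files.

```python
from pathlib import Path


def find_scenarios(root: Path) -> list[Path]:
    """Return all regular files under root (recursively), sorted by path.

    Returns an empty list if root does not exist or is not a directory.
    """
    if not root.is_dir():
        return []
    return sorted(p for p in root.rglob("*") if p.is_file())
```

Calling `find_scenarios(Path.home() / ".claude" / "skill-tests")` would list whatever is in the typical location mentioned above.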

Skill Tester

Name: Skill Tester
Author: servitola

Security & Testing

Automates the evaluation of Claude Code skills by executing test scenarios and comparing the results against a baseline run without the skill.

Skill Tester verifies how well a Claude Code skill works through automated scenario testing. It manages parallel runners, plays a user persona to interact with the skill under test, and runs a baseline instance at the same time for comparison. Specialized grader agents then analyze the detailed execution transcripts and give structured feedback on acceptance criteria, skill compliance, and value-add, so you can judge whether a custom skill delivers reliable, high-quality results.

Key Features

01 Comprehensive reporting with evidence-based passing/failing marks

02 Role-playing user persona for natural interaction during test execution

03 Automated parallel execution of skill-enabled and baseline runners

04 Detailed compliance checking against SKILL.md documentation

05 12 GitHub stars

06 AI-powered grading of transcripts using specific acceptance criteria

Use Cases

01 Benchmarking skill-enhanced performance against standard model behavior

02 Verifying the effectiveness of newly developed Claude Code skills

03 Regression testing existing skills after prompt updates or logic changes

Install with 🐟 Skill.Fish

npx skillfish add servitola/dotfiles skill-tester

For use in Claude.ai and ChatGPT