How does this differ from standard unit testing?

Unlike unit tests, which check isolated functions, this skill evaluates the entire system path from the user's perspective and identifies issues that emerge only when layers interact.

Does it support reporting to project management tools?

Yes, the skill can synthesize findings and automatically create prioritized issues and projects in Linear via the Linear MCP integration.
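
As a rough illustration of what "prioritized" could mean here, the sketch below ranks findings by impact and turns them into plain issue drafts. The `Finding` fields, the 1 to 5 severity scale, and `to_issue_drafts` are hypothetical names invented for this example; the listing does not document the skill's actual data model, and no Linear API or MCP call is made.

```python
from dataclasses import dataclass


@dataclass
class Finding:
    title: str
    layer: str              # e.g. "UX", "Code", "AI", "Infra"
    severity: int           # hypothetical scale: 1 (minor) to 5 (blocks the goal)
    journeys_affected: int  # how many traced journeys hit this finding


def to_issue_drafts(findings):
    """Rank findings by severity, then by reach, and return plain dicts."""
    ranked = sorted(
        findings,
        key=lambda f: (f.severity, f.journeys_affected),
        reverse=True,
    )
    return [
        {"rank": i, "title": f"[{f.layer}] {f.title}", "severity": f.severity}
        for i, f in enumerate(ranked, start=1)
    ]


findings = [
    Finding("Tooltip overlaps save button", "UX", 2, 1),
    Finding("Settings not persisted after reload", "Infra", 5, 3),
    Finding("Validation error swallowed by API handler", "Code", 4, 2),
]
drafts = to_issue_drafts(findings)
```

The drafts are ordinary dictionaries, so they can be reviewed by a person before anything is sent to a tracker.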

Can I define custom evaluation goals?

Yes. The skill includes a library of goals for common product phases, and you can also provide custom goal statements, which the skill decomposes into success criteria and layer weights.
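
One way to picture the result of that decomposition is a small record holding the goal statement, its success criteria, and a weight per layer. This is a minimal sketch under assumptions: the `Goal` class, the rule that weights sum to 1, and `weighted_score` are illustrative and not taken from the skill itself. The sketch only represents a decomposed goal; it does not perform the decomposition.

```python
from dataclasses import dataclass


@dataclass
class Goal:
    statement: str
    success_criteria: list[str]
    layer_weights: dict[str, float]  # assumed to sum to 1.0

    def __post_init__(self):
        total = sum(self.layer_weights.values())
        if abs(total - 1.0) > 1e-9:
            raise ValueError(f"layer weights must sum to 1.0, got {total}")


def weighted_score(goal, layer_scores):
    """Combine per-layer scores (0.0 to 1.0) using the goal's layer weights."""
    return sum(
        weight * layer_scores.get(layer, 0.0)
        for layer, weight in goal.layer_weights.items()
    )


goal = Goal(
    statement="A new user can save a configuration and see it after reload",
    success_criteria=[
        "Save button is reachable without scrolling",
        "Saved values are returned by the next page load",
    ],
    layer_weights={"UX": 0.4, "Code": 0.3, "AI": 0.1, "Infra": 0.2},
)
score = weighted_score(goal, {"UX": 1.0, "Code": 0.5, "AI": 1.0, "Infra": 0.0})
```

A layer missing from the scores counts as 0.0 here, which makes an unevaluated layer lower the total instead of being silently ignored.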

What is Goal-Driven Evaluation?

It is a methodology that focuses on whether a user can achieve a specific outcome, tracing the journey through the frontend, backend, and infrastructure to find the origin of failures.

Goal-Driven E2E Evaluator

Author: mberto10

Security and Testing

Evaluates user goals across UX, code, AI, and infrastructure layers to identify and resolve root causes of system failures.

Overview

This skill provides a systematic framework for end-to-end evaluation based on specific user goals rather than on features tested in isolation. It traces user journeys through the entire stack, including the frontend (UX), application logic (Code), LLM interactions (AI), and backend persistence (Infrastructure), to pinpoint where a process breaks down. It is suited to debugging complex workflows such as onboarding or multi-step configuration. It also includes a goal library for standardizing assessments and generates reports with prioritized recommendations to improve product success rates.
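
The idea of tracing one journey step through the layers can be sketched as follows. This is a simplified heuristic, not the skill's documented logic: it assumes each layer yields a pass/fail check and treats the deepest failing layer as the likely origin, since a failure in persistence tends to surface as a symptom higher up. `LayerCheck`, `LAYERS`, and `origin_of_failure` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Optional

# Assumed ordering from the user-facing surface down to persistence.
LAYERS = ("UX", "Code", "AI", "Infra")


@dataclass
class LayerCheck:
    layer: str
    ok: bool
    detail: str = ""


def origin_of_failure(checks) -> Optional[LayerCheck]:
    """Return the failing check from the deepest layer, or None if all pass."""
    failing = [c for c in checks if not c.ok]
    if not failing:
        return None
    return max(failing, key=lambda c: LAYERS.index(c.layer))


# One journey step: "user saves a configuration and reloads the page".
checks = [
    LayerCheck("UX", False, "saved value missing after reload"),
    LayerCheck("Code", True, "API handler returned success"),
    LayerCheck("AI", True, "no LLM call on this step"),
    LayerCheck("Infra", False, "write never reached the database"),
]
origin = origin_of_failure(checks)
```

In this example the UX failure is reported as a symptom, while the Infra check is flagged as the probable origin.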

Key Features

Cross-layer finding synthesis with detailed layer traces
Automated browser-based journey tracing and friction identification
Pre-defined goal library for onboarding, activation, and adoption
Direct integration with Linear for automated issue creation
Multi-layer root cause analysis across UX, Code, AI, and Infra

Use Cases

Debugging complex data persistence issues in configuration workflows
Evaluating the fidelity of AI-generated outputs across system layers
Auditing user onboarding flows to find where users drop off

Install with 🐟 Skill.Fish

npx skillfish add mberto10/mberto-compound goal-driven-evaluation

For use in Claude.ai and ChatGPT
