What kinds of error patterns does it detect?

It categorizes errors into groups like retrieval augmentation failures, prompt construction issues, formatting errors, and semantic misunderstandings of the domain.
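To make the grouping concrete, here is a minimal Python sketch of tallying already-labelled errors by the four categories named above. It is illustrative only: the enum, the `tally_errors` helper, and the record IDs are hypothetical, and the skill itself targets R and may categorize errors quite differently.

```python
from collections import Counter
from enum import Enum


class ErrorCategory(Enum):
    # The four groups named in the answer above; identifiers are illustrative.
    RETRIEVAL_AUGMENTATION = "retrieval augmentation failure"
    PROMPT_CONSTRUCTION = "prompt construction issue"
    FORMATTING = "formatting error"
    SEMANTIC = "semantic misunderstanding"


def tally_errors(labelled_errors):
    """Count errors per category, most frequent first.

    `labelled_errors` is a hypothetical list of (record_id, ErrorCategory)
    pairs; how the skill assigns categories is not specified in this page.
    """
    counts = Counter(category for _, category in labelled_errors)
    return counts.most_common()


# Made-up example data, for illustration only.
example = [
    ("doc-1", ErrorCategory.FORMATTING),
    ("doc-2", ErrorCategory.SEMANTIC),
    ("doc-3", ErrorCategory.FORMATTING),
]
summary = tally_errors(example)
for category, n in summary:
    print(f"{category.value}: {n}")
```

Sorting by frequency puts the dominant failure mode first, which is usually the one worth investigating before the others.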

How does Review Iteration differ from Log Iteration?

Review Iteration is a read-only tool used to analyze and understand results before deciding on changes, whereas Log Iteration is used to write a permanent record of what happened after the decision is made.

Does this skill work with any programming language?

This specific skill is optimized for R environments using the targets package, though the diagnostic logic can be adapted for other pipeline frameworks.

Can this skill fix my code automatically?

No, this is a strictly read-only analysis tool. It identifies root causes and suggests next steps, but it does not modify files or source code.

Review Iteration

Name: Review Iteration
Author: estebandegetau

Data Science and ML

Performs structured, read-only analysis of pipeline results to diagnose model failures and map error patterns without modifying files.

The Review Iteration skill provides a specialized diagnostic layer for R-based data pipelines, filling the gap between code execution and human understanding. It automatically gathers context from codebooks, pipeline results via the targets package, and strategy documents to perform deep-dive analyses on behavioral tests and zero-shot evaluations. By categorizing errors into specific domains like retrieval augmentation or semantic reasoning, it helps developers identify the root causes of poor model performance and provides ranked suggestions for high-leverage codebook improvements.

Key Features

01 Error categorization using H&K frameworks for precise failure diagnosis

02 Zero-shot evaluation metrics mapping against predefined strategy targets

03 Detailed behavioral test analysis for JSON validity and classification accuracy

04 1 GitHub star

05 Automated context gathering from R targets and YAML codebooks

06 Ablation ranking to identify the most impactful codebook components
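As a rough illustration of the behavioral checks and target mapping listed above, the following Python sketch computes a JSON validity rate and a classification accuracy, then compares both against thresholds. It is not the skill's implementation: the function names, metric names, and target values are assumptions made for the example, and the skill itself works on R `targets` results.

```python
import json


def json_validity_rate(raw_outputs):
    """Fraction of raw model outputs (strings) that parse as JSON."""
    if not raw_outputs:
        return 0.0
    valid = 0
    for text in raw_outputs:
        try:
            json.loads(text)
            valid += 1
        except json.JSONDecodeError:
            pass  # unparseable output counts as invalid
    return valid / len(raw_outputs)


def accuracy(predicted, expected):
    """Share of predictions that exactly match the expected labels."""
    if len(predicted) != len(expected):
        raise ValueError("predicted and expected must have the same length")
    if not expected:
        return 0.0
    return sum(p == e for p, e in zip(predicted, expected)) / len(expected)


def compare_to_targets(metrics, targets):
    """Map each target metric to (value, target, passed)."""
    return {
        name: (metrics[name], target, metrics[name] >= target)
        for name, target in targets.items()
    }


# Made-up outputs and hypothetical thresholds, for illustration only.
outputs = ['{"label": "a"}', "not json", '{"label": "b"}', '{"label": "c"}']
metrics = {
    "json_validity": json_validity_rate(outputs),
    "accuracy": accuracy(["a", "b", "c"], ["a", "b", "x"]),
}
targets = {"json_validity": 0.95, "accuracy": 0.6}
report = compare_to_targets(metrics, targets)
for name, (value, target, passed) in report.items():
    print(f"{name}: {value:.2f} (target {target:.2f}) {'PASS' if passed else 'FAIL'}")
```

Keeping the metric computation separate from the threshold comparison means the same metrics can be checked against different strategy targets without recomputing them.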

Use Cases

01 Identifying the worst-performing data segments in a zero-shot classification task

02 Diagnosing why a specific pipeline stage failed behavioral thresholds

03 Comparing current model performance against previous iterations to track progress
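The first use case, finding the worst-performing segments, can be sketched in a few lines of Python. This is a minimal illustration under assumed inputs: the `(segment, predicted, expected)` record shape and the segment names are hypothetical, not the skill's actual data model.

```python
from collections import defaultdict


def accuracy_by_segment(records):
    """Return (segment, accuracy) pairs sorted from worst to best.

    `records` is a hypothetical iterable of (segment, predicted, expected)
    tuples taken from a zero-shot classification run.
    """
    totals = defaultdict(lambda: [0, 0])  # segment -> [correct, total]
    for segment, predicted, expected in records:
        totals[segment][0] += int(predicted == expected)
        totals[segment][1] += 1
    scored = [(seg, correct / total) for seg, (correct, total) in totals.items()]
    return sorted(scored, key=lambda item: item[1])


# Made-up records, for illustration only.
records = [
    ("segment_a", "x", "x"),
    ("segment_a", "y", "x"),
    ("segment_b", "x", "x"),
    ("segment_b", "y", "y"),
    ("segment_c", "x", "y"),
]
ranking = accuracy_by_segment(records)
for segment, acc in ranking:
    print(f"{segment}: {acc:.2f}")
```

Sorting ascending puts the weakest segment first, so the top of the list is where a review would start.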

Install with 🐟 Skill.Fish

npx skillfish add estebandegetau/fiscal-shocks review-iteration

Download the skill for use in Claude.ai and ChatGPT.