When should I start a fresh session with Claude Code?

The skill recommends starting a fresh session after major phase transitions, such as moving from the implementation phase to the testing or documentation phase, to clear unnecessary context.

How does this skill help optimize development costs?

It provides a routing framework that assigns simple tasks like refactoring to Haiku, standard implementation to Sonnet, and complex architectural analysis to Opus, ensuring you only pay for the intelligence you need.
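As a rough illustration of that routing idea, the sketch below maps task kinds to model tiers. The task-kind labels, the `ROUTES` table, and the `route` function are hypothetical names chosen for this example, not part of the skill itself; the fallback to Sonnet is likewise an assumption.

```python
from enum import Enum


class Tier(Enum):
    HAIKU = "haiku"    # simple, low-reasoning tasks
    SONNET = "sonnet"  # standard implementation work
    OPUS = "opus"      # complex architectural analysis


# Hypothetical task taxonomy; the skill may categorize tasks differently.
ROUTES = {
    "refactor": Tier.HAIKU,
    "implementation": Tier.SONNET,
    "architecture": Tier.OPUS,
}


def route(task_kind: str) -> Tier:
    """Return the tier for a task kind, falling back to Sonnet (an assumed default)."""
    return ROUTES.get(task_kind, Tier.SONNET)


print(route("refactor").value)
print(route("architecture").value)
```

A lookup table keeps the routing policy in one place, so changing which tier handles a task kind does not touch the calling code.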

What is an eval-first loop?

An eval-first loop involves writing tests that capture desired behavior and baseline failures before implementation starts. This creates a clear success metric for the AI agent to work toward.
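A minimal sketch of such a loop, assuming a toy `slugify` feature: the eval cases are written first, run against an unimplemented stub to record baseline failures, then run again after implementation. The function names, the `EVALS` cases, and the `run_evals` helper are all hypothetical and only illustrate the ordering.

```python
# Step 1: write the evals before any implementation exists.
EVALS = [
    ("Hello World", "hello-world"),
    ("  Eval  First ", "eval-first"),
]


def slugify_stub(title: str) -> str:
    """Placeholder standing in for the not-yet-written feature."""
    raise NotImplementedError


def run_evals(fn) -> list:
    """Run every eval case against fn and return a list of failure messages."""
    failures = []
    for given, expected in EVALS:
        try:
            actual = fn(given)
        except Exception as exc:
            failures.append(f"{given!r}: raised {type(exc).__name__}")
            continue
        if actual != expected:
            failures.append(f"{given!r}: expected {expected!r}, got {actual!r}")
    return failures


# Step 2: record the baseline; every case should fail against the stub.
baseline = run_evals(slugify_stub)


# Step 3: implement, then rerun the same evals as the success metric.
def slugify(title: str) -> str:
    return "-".join(title.lower().split())


after = run_evals(slugify)
print(len(baseline), len(after))
```

The baseline run confirms that the evals can actually fail, which guards against tests that pass vacuously.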

What is the 15-minute unit rule in Agentic Engineering?

It is a decomposition strategy where every task is broken down into small, independently verifiable units that take approximately 15 minutes to complete, ensuring clear focus and reducing the risk of model drift.
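One way to picture the rule is as a plan of small units, each with its own check and a single dominant risk. The sketch below is an assumption about how such a plan could be represented; the `Unit` fields, the `MAX_MINUTES` threshold, and the example tasks are hypothetical.

```python
from dataclasses import dataclass

MAX_MINUTES = 15  # approximate budget from the rule; treat as a guideline


@dataclass
class Unit:
    name: str
    estimate_minutes: int
    verify_with: str     # how completion is independently checked
    dominant_risk: str   # the single main risk this unit carries


def oversized(units):
    """Return the names of units that exceed the budget and should be split further."""
    return [u.name for u in units if u.estimate_minutes > MAX_MINUTES]


plan = [
    Unit("Add slug column migration", 10, "migration test passes", "data loss on rollback"),
    Unit("Build full search feature", 90, "end-to-end search test", "scope too broad"),
]

print(oversized(plan))
```

Any unit flagged here is a candidate for further decomposition before handing it to the agent.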

How should I review AI-generated code using this skill?

Focus your review on invariants, edge cases, error boundaries, and security assumptions. Avoid wasting time on code style if automated formatters and linters are already in place.

Agentic Engineering

Name: Agentic Engineering
Author: affaan-m

GitHub stars: 172,650
Category: Productivity & Workflow

Orchestrates AI-driven development through eval-first loops, granular task decomposition, and cost-optimized model routing.

Agentic Engineering is a specialized skill designed to transform Claude Code into a high-performance autonomous developer by establishing a rigorous methodology for AI-human collaboration. It moves beyond simple prompting by implementing an 'eval-first' execution loop, where completion criteria and tests are defined before coding begins. The skill provides a framework for breaking complex features into 15-minute verifiable units, intelligently routing tasks between model tiers (Haiku, Sonnet, Opus) based on complexity, and focusing human oversight on high-risk architectural invariants rather than stylistic minutiae.

Key Features

01. 172,650 GitHub stars

02. Intelligent Model Routing: Optimize costs by assigning tasks to Haiku, Sonnet, or Opus based on reasoning requirements.

03. 15-Minute Task Decomposition: Break complex workflows into small, verifiable units, each with a single dominant risk.

04. Eval-First Execution: Define capability and regression tests before starting any implementation work.

05. Session Strategy Management: Guidelines on when to continue or refresh sessions to maintain optimal context and performance.

06. Risk-Centric Review Focus: Prioritize edge cases, security assumptions, and error boundaries during AI code reviews.

Use Cases

01. Standardizing software quality through rigorous automated evaluation and human-in-the-loop risk controls.

02. Managing complex feature implementations where AI agents perform the majority of the coding work.

03. Optimizing API costs and performance by matching model tiers to task difficulty.
