How does this skill calculate sample size?

It uses the conventional defaults of 80% statistical power and a 5% significance level (95% confidence), combined with your baseline conversion rate and minimum detectable effect (MDE), to estimate how many users each variant needs.
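As a rough illustration of what such an estimate involves, here is a minimal sketch using the textbook normal-approximation formula for comparing two conversion rates. The function name, its parameters, and the choice of a two-sided test with a relative MDE are assumptions made for this example; the skill itself may compute the figure differently.

```python
from math import ceil, sqrt
from statistics import NormalDist


def sample_size_per_variant(baseline, mde_rel, alpha=0.05, power=0.80):
    """Approximate users needed per variant for a two-sided two-proportion test.

    baseline: control conversion rate, e.g. 0.10 for 10%
    mde_rel:  relative minimum detectable effect, e.g. 0.10 for a 10% lift
    (Hypothetical helper, not part of the skill.)
    """
    p1 = baseline
    p2 = baseline * (1 + mde_rel)  # conversion rate we want to be able to detect
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 for alpha = 0.05
    z_power = NormalDist().inv_cdf(power)          # about 0.84 for 80% power
    p_bar = (p1 + p2) / 2
    numerator = (
        z_alpha * sqrt(2 * p_bar * (1 - p_bar))
        + z_power * sqrt(p1 * (1 - p1) + p2 * (1 - p2))
    ) ** 2
    return ceil(numerator / (p2 - p1) ** 2)


# Example: 10% baseline, detect a 10% relative lift (10% -> 11%)
print(sample_size_per_variant(0.10, 0.10))
```

Halving the MDE roughly quadruples the required sample, which is why a realistic MDE matters more than any other input.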

What are guardrail metrics in an A/B test?

Guardrail metrics are secondary metrics, such as revenue, retention, or page load speed, that the skill helps you monitor to ensure your experiment doesn't accidentally damage core business performance while chasing a primary goal.

Can I use this for low-traffic websites?

Yes, the skill identifies low-traffic scenarios (under 1,000 users/day) and recommends qualitative alternatives like moderated testing or user interviews if statistical significance cannot be reached within a reasonable timeframe.
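To make the traffic check concrete, the sketch below turns a required sample size into a test duration and flags the low-traffic case. The 1,000 users/day threshold comes from the answer above; the 8-week ceiling for a "reasonable timeframe", the rounding to whole weeks, and all names are assumptions for illustration only.

```python
from math import ceil

LOW_TRAFFIC_USERS_PER_DAY = 1_000  # threshold quoted in the answer above
MAX_WEEKS = 8  # assumed cutoff for a "reasonable timeframe" (hypothetical)


def plan_duration(n_per_variant, variants, daily_users):
    """Estimate how long a test must run; hypothetical helper, not the skill's API."""
    total_users = n_per_variant * variants
    days = ceil(total_users / daily_users)
    weeks = ceil(days / 7)  # run whole weeks so every weekday is covered equally
    return {
        "days": days,
        "weeks": weeks,
        "low_traffic": daily_users < LOW_TRAFFIC_USERS_PER_DAY,
        "feasible": weeks <= MAX_WEEKS,
    }


# Example: 14,751 users per variant, control plus one variant, 500 users/day
print(plan_duration(14_751, 2, 500))
```

When `feasible` comes back false, that is the point at which qualitative methods such as moderated testing or user interviews become the better option.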

Does it help with multi-variant testing?

Yes, it provides guidance on the multiple comparisons problem and recommends statistical corrections like Bonferroni or Bayesian approaches to maintain the integrity of your results when testing more than two variants.
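For reference, the Bonferroni correction simply divides the significance level by the number of comparisons against control. The following is a minimal sketch with a hypothetical function name and made-up p-values, not output from the skill.

```python
def bonferroni_significant(p_values, alpha=0.05):
    """Return, for each variant-vs-control p-value, whether it stays significant
    after a Bonferroni correction (hypothetical helper for illustration)."""
    threshold = alpha / len(p_values)  # stricter per-comparison threshold
    return [p < threshold for p in p_values]


# Three variants compared against control; illustrative p-values only.
# The threshold becomes 0.05 / 3, so only the first comparison passes.
print(bonferroni_significant([0.01, 0.03, 0.04]))  # [True, False, False]
```

Without the correction all three comparisons would look significant at 0.05, which is exactly the false-positive inflation the multiple comparisons problem describes.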

A/B Test Planner

Author: mohitagw15856
Stars: 295
Category: Analytics and Monitoring

Designs statistically rigorous A/B tests and experiment plans to ensure product changes deliver trustworthy results.

This skill transforms Claude into an expert product experimenter, helping product managers and developers design scientifically sound A/B tests for features, UI changes, and pricing models. It guides users through formulating directional hypotheses, calculating required sample sizes and durations, and defining critical guardrail metrics to protect core business health. By providing a structured framework for experiment design, it helps teams avoid common pitfalls like p-hacking and insufficient traffic, resulting in a complete execution and interpretation guide for data-driven decision-making.

Key Features

01 Results interpretation framework with ship, iterate, or reject criteria

02 Comprehensive test plan output including primary and guardrail metrics

03 Standardized hypothesis generation using data-driven templates

04 Automated sample size and duration estimation based on traffic and baseline rates

05 295 GitHub stars

06 Statistical guidance to prevent p-hacking and account for weekly seasonality

Use Cases

01 Interpreting inconclusive test results to determine whether to iterate or abandon a feature

02 Calculating the required sample size and duration for a new onboarding flow experiment

03 Setting up guardrail metrics to protect core revenue while testing a UI redesign
