Can it detect if my experiment setup was flawed?

Yes. It checks for Sample Ratio Mismatch (SRM) to confirm that randomization worked correctly, and it validates whether the sample size was large enough for your Minimum Detectable Effect (MDE).
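
As an illustration of what an SRM check can look like, here is a minimal sketch using a chi-squared goodness-of-fit test on the two arm sizes. It is not the skill's own script: the function name `srm_check`, the 50/50 default split, and the strict 0.001 threshold are assumptions chosen for the example.

```python
import math


def srm_check(n_control, n_variant, expected_ratio=0.5, alpha=0.001):
    """Chi-squared goodness-of-fit test (1 degree of freedom) for a two-arm split.

    expected_ratio is the share of traffic the control arm was meant to get.
    alpha=0.001 is an assumed, deliberately strict threshold for this sketch.
    """
    total = n_control + n_variant
    exp_control = total * expected_ratio
    exp_variant = total * (1 - expected_ratio)
    chi2 = ((n_control - exp_control) ** 2 / exp_control
            + (n_variant - exp_variant) ** 2 / exp_variant)
    # With 1 degree of freedom, the chi-squared survival function
    # reduces to erfc(sqrt(x / 2)), so no external library is needed.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value, p_value < alpha


chi2, p_value, mismatch = srm_check(5000, 5500)
print(f"chi2={chi2:.2f}, p={p_value:.2g}, SRM suspected: {mismatch}")
```

A flagged mismatch suggests the assignment or logging pipeline should be investigated before any conversion result is trusted.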

Does this skill support raw data files?

Yes, it can process CSV, Excel, and analytics exports directly by generating and running Python scripts to perform statistical calculations.

What statistical tests does it perform?

It primarily uses two-tailed z-tests and chi-squared tests to compute p-values, and reports 95% confidence intervals for conversion data.
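
For reference, a two-tailed z-test on two conversion rates can be sketched as below. This is an assumed, standard-library-only illustration rather than the code the skill generates; `two_proportion_ztest` and its parameters are hypothetical names, and the sketch does not guard against zero conversions or empty arms.

```python
import math
from statistics import NormalDist


def two_proportion_ztest(conv_a, n_a, conv_b, n_b, confidence=0.95):
    """Two-tailed z-test for the difference between two conversion rates.

    Returns (z, p_value, (ci_low, ci_high)) for the difference p_b - p_a.
    """
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    diff = p_b - p_a

    # Pooled standard error: used for the test statistic under H0 (p_a == p_b).
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se_pooled = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = diff / se_pooled
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))

    # Unpooled standard error: used for the confidence interval.
    se_unpooled = math.sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)
    z_crit = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    ci = (diff - z_crit * se_unpooled, diff + z_crit * se_unpooled)
    return z, p_value, ci


z, p_value, ci = two_proportion_ztest(1000, 10000, 1100, 10000)
print(f"z={z:.3f}, p={p_value:.4f}, 95% CI=({ci[0]:.4f}, {ci[1]:.4f})")
```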

How does it handle non-significant results?

The skill evaluates the trend and power of the test to recommend whether you should extend the duration for more data or stop the test if a meaningful difference is unlikely.
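
One way to reason about "extend or stop" is to compare the sample collected so far with the sample a test would need to detect the MDE. The sketch below uses the standard normal-approximation formula for two proportions; the function names, the 5% significance level, and the 80% power target are assumptions for illustration, not the skill's documented logic.

```python
import math
from statistics import NormalDist


def required_sample_per_arm(baseline_rate, mde_abs, alpha=0.05, power=0.80):
    """Approximate users needed per arm to detect an absolute lift of mde_abs.

    Uses n = (z_alpha + z_beta)^2 * (p1(1-p1) + p2(1-p2)) / mde_abs^2.
    """
    p1 = baseline_rate
    p2 = baseline_rate + mde_abs
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-tailed critical value
    z_beta = NormalDist().inv_cdf(power)
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return math.ceil((z_alpha + z_beta) ** 2 * variance / mde_abs ** 2)


def is_underpowered(n_per_arm, baseline_rate, mde_abs):
    """Hypothetical helper: True if the test has fewer users than required."""
    return n_per_arm < required_sample_per_arm(baseline_rate, mde_abs)


needed = required_sample_per_arm(baseline_rate=0.10, mde_abs=0.01)
print(f"Needed per arm: {needed}; underpowered at 8,000: "
      f"{is_underpowered(8000, 0.10, 0.01)}")
```

If the current sample is well below the required figure, extending the test is the more defensible choice; if the sample already exceeds it and the result is still flat, a meaningful difference is unlikely.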

A/B Test Analysis Pro

Name: A/B Test Analysis Pro
Author: KoryakinYurij


Data Science & ML

Evaluates experiment results with statistical rigor to provide clear ship, extend, or stop recommendations based on data.

This skill empowers product teams and developers to interpret A/B test results accurately by automating complex statistical calculations including p-values, confidence intervals, and sample size validation. It identifies critical experimentation pitfalls like Sample Ratio Mismatch (SRM), novelty effects, and underpowered tests while balancing primary conversion metrics against vital guardrails. By translating raw data into actionable business logic, it helps users decide whether to roll out a feature to 100% of traffic or return to the drawing board.

Key Features

01 0 GitHub stars

02 Sample size and power analysis to ensure experiment validity

03 Standardized analysis summaries with clear ship/extend/stop recommendations

04 Sample Ratio Mismatch (SRM) detection to flag randomization issues

05 Automated statistical significance and p-value calculation via Python scripts

06 Guardrail metric monitoring to prevent unintended negative side effects

Use Cases

01 Determining if a new UI variant significantly improved conversion rates compared to the control

02 Validating if a completed experiment ran long enough to overcome weekly seasonality and novelty effects

03 Analyzing raw CSV export data from analytics tools to generate a formal experiment post-mortem


Install with 🐟 Skill.Fish

npx skillfish add koryakinyurij/self-sustain-system ab-test-analysis

For use in Claude.ai and ChatGPT
