Design A/B tests with proper sample sizes and statistical power
claude install community/ab-test-designer
Experiment design assistant: calculate required sample sizes, estimate test duration, define primary and guardrail metrics, and produce experiment specifications.
This is the actual SKILL.md file that powers this skill. Copy it to install.
---
name: ab-test-designer
description: |
  Design statistically valid A/B tests: calculate sample size, estimate
  duration, pick primary and guardrail metrics, and output a ready-to-run
  experiment spec. Trigger on "A/B test", "experiment design", "sample
  size", "statistical power", "ship an experiment".
allowed-tools:
- Read
- Write
- Edit
- Bash(python3 *)
---
# A/B Test Designer
Produce experiment specifications that survive statistical review. The
output: a document that states the hypothesis, the primary metric, the
minimum detectable effect, required sample size, duration, and guardrails.
## Prerequisites
- Baseline conversion rate or mean for the primary metric
- Traffic volume estimate (daily users or events per arm)
- Team alignment on what a meaningful lift looks like
## Steps
1. **Write the hypothesis in one sentence.** Form: "Changing X will
increase Y by at least Z percent because of mechanism M." Vague
hypotheses produce vague experiments.
2. **Pick a single primary metric.** Conversion rate, revenue per
visitor, or retention at day 7. Everything else is secondary or a
guardrail.
3. **Define guardrails.** Metrics that must NOT regress: page load
time, error rate, unsubscribe rate. Set non-inferiority margins.
4. **Calculate sample size.** For a two-proportion test at 80 percent
power and alpha 0.05:
```python
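import math
from statsmodels.stats.power import NormalIndPower
from statsmodels.stats.proportion import proportion_effectsize

baseline, mde = 0.10, 0.02  # example values: 10% baseline, +2 point absolute MDE
effect = proportion_effectsize(baseline + mde, baseline)  # Cohen's h
n = math.ceil(NormalIndPower().solve_power(effect_size=effect, power=0.8, alpha=0.05))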
```
Round up. This is per arm.
5. **Estimate duration.** Sample size divided by daily traffic per arm,
rounded up to full weeks to avoid day-of-week bias. Minimum one week.
6. **Define the analysis plan before launch.** Exact statistical test,
segmentation cuts, and stopping rules. Lock it. Checking results
repeatedly and stopping at the first significant reading, without a
pre-specified sequential correction, inflates false positive rates.
7. **Write the experiment spec.** Hypothesis, metric, MDE, sample size,
duration, guardrails, analysis plan, owner, end date. Share for
review before flipping the flag.
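Steps 4 and 5 can be sketched together using only the standard library. This is a minimal approximation, not the statsmodels implementation: it uses the arcsine (Cohen's h) effect size with the usual normal-approximation formula, so results may differ slightly from `NormalIndPower`. The function names `sample_size_per_arm` and `duration_days` and the example inputs are illustrative assumptions, and `mde` is treated as an absolute difference in proportions.

```python
import math
from statistics import NormalDist


def sample_size_per_arm(baseline, mde, alpha=0.05, power=0.8):
    """Approximate per-arm n for a two-sided, two-proportion z-test.

    `mde` is an absolute lift (0.02 means +2 percentage points).
    """
    # Cohen's h: difference of arcsine-transformed proportions.
    h = 2 * math.asin(math.sqrt(baseline + mde)) - 2 * math.asin(math.sqrt(baseline))
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    # Factor of 2 because the variance of the difference covers two arms.
    return math.ceil(2 * ((z_alpha + z_power) / h) ** 2)


def duration_days(n_per_arm, daily_users_per_arm):
    """Days needed to reach n_per_arm, rounded up to whole weeks (minimum 7)."""
    days = math.ceil(n_per_arm / daily_users_per_arm)
    return max(7, math.ceil(days / 7) * 7)


# Hypothetical inputs: 10% baseline, +2 point MDE, 500 users per arm per day.
n = sample_size_per_arm(baseline=0.10, mde=0.02)
print(n, duration_days(n, daily_users_per_arm=500))
```

Rounding to whole weeks happens after the day count is computed, so a test that needs eight days is scheduled for fourteen.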
## Anti-patterns
- Running until the result looks good (peeking)
- Measuring every metric and cherry-picking the significant one
- Ending early because one arm has a hot first day
- Ignoring novelty effects on changes visible to existing users
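To see why peeking is listed first, a small A/A simulation helps. The sketch below is an illustration under simplifying assumptions: there is no true effect, each look adds one equally sized batch, and the test statistic is modeled directly as a cumulative sum of standard normal increments rather than from raw conversion data. The function name and defaults are hypothetical.

```python
import random
from statistics import NormalDist


def false_positive_rates(n_sims=2000, n_looks=10, alpha=0.05, seed=0):
    """Return (rate when stopping at any significant look, rate at final look only)."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    peeking_hits = 0
    final_hits = 0
    for _ in range(n_sims):
        cumulative = 0.0
        crossed = False
        z = 0.0
        for look in range(1, n_looks + 1):
            # Under the null, each batch contributes an N(0, 1) increment.
            cumulative += rng.gauss(0.0, 1.0)
            z = cumulative / math_sqrt(look)
            if abs(z) > z_crit:
                crossed = True  # a peeker would have stopped and shipped here
        peeking_hits += int(crossed)
        final_hits += int(abs(z) > z_crit)
    return peeking_hits / n_sims, final_hits / n_sims


def math_sqrt(x):
    """Tiny helper so the sketch needs no extra imports."""
    return x ** 0.5


peeking_rate, fixed_rate = false_positive_rates()
print(f"peeking: {peeking_rate:.3f}, fixed horizon: {fixed_rate:.3f}")
```

The fixed-horizon rate should land near the nominal alpha, while the peeking rate comes out several times higher, even though both columns describe the same simulated experiments.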
## Output
- One-page experiment spec in Markdown
- Sample size and duration calculation (reproducible script)
- Primary metric + guardrails defined before launch
- Locked analysis plan with explicit stopping rules
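One way to keep the spec reproducible is to render it from structured fields. The following is a sketch only: the `ExperimentSpec` class, the `render_spec` function, and all example values are hypothetical, and the field list simply mirrors step 7.

```python
from dataclasses import dataclass


@dataclass
class ExperimentSpec:
    hypothesis: str
    primary_metric: str
    mde: str
    sample_size_per_arm: int
    duration_days: int
    guardrails: list[str]
    analysis_plan: str
    owner: str
    end_date: str


def render_spec(spec: ExperimentSpec) -> str:
    """Render the spec as a one-page Markdown document."""
    lines = [
        "# Experiment Spec",
        f"- **Hypothesis:** {spec.hypothesis}",
        f"- **Primary metric:** {spec.primary_metric}",
        f"- **MDE:** {spec.mde}",
        f"- **Sample size (per arm):** {spec.sample_size_per_arm}",
        f"- **Duration:** {spec.duration_days} days",
        "- **Guardrails:**",
    ]
    lines += [f"  - {g}" for g in spec.guardrails]
    lines += [
        f"- **Analysis plan:** {spec.analysis_plan}",
        f"- **Owner:** {spec.owner}",
        f"- **End date:** {spec.end_date}",
    ]
    return "\n".join(lines)


# Example values are placeholders, not recommendations.
example = ExperimentSpec(
    hypothesis="Changing X will increase Y by at least Z percent because of mechanism M.",
    primary_metric="Conversion rate",
    mde="+2 percentage points (absolute)",
    sample_size_per_arm=4000,
    duration_days=14,
    guardrails=["Page load time", "Error rate", "Unsubscribe rate"],
    analysis_plan="Two-proportion z-test at the fixed end date; no interim stopping.",
    owner="TBD",
    end_date="TBD",
)
print(render_spec(example))
```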
mkdir -p ~/.claude/skills/ab-test-designer
Save the file as ~/.claude/skills/ab-test-designer/SKILL.md
Resulting file structure:
~/.claude/
  skills/
    ab-test-designer/
      SKILL.md  <-- skill definition
Skills are loaded automatically by Claude Code when you start a new session. The skill name and description in the frontmatter determine when Claude triggers it.