A/B Test Calculator
About A/B Test Calculator
The A/B Test Statistical Significance Calculator determines whether the difference in conversion rates between your control (A) and variant (B) is statistically significant or could be due to random chance. Enter the number of visitors and conversions for each variant, and the calculator computes the p-value, z-score, relative lift, and 95% confidence intervals using a two-tailed two-proportion z-test with pooled proportion — the standard method for A/B test analysis in digital experimentation.
Product managers, UX designers, marketing analysts, and growth engineers use A/B testing to make data-driven decisions about website changes, email subject lines, ad copy, pricing pages, onboarding flows, and more. However, simply comparing conversion rates without statistical validation can lead to false positives — declaring a winner when the difference is actually due to random variation in small samples. This calculator helps you determine whether you have collected enough data to make a confident decision.
The calculator implements the normal approximation to the binomial distribution using the Abramowitz and Stegun polynomial approximation for the standard normal CDF (accurate to within 1.5×10⁻⁷). The pooled proportion combines both samples' data to estimate the true conversion rate under the null hypothesis of no difference. Statistical significance is declared at the conventional α = 0.05 threshold (p < 0.05), corresponding to a 95% confidence level. The 95% confidence intervals use the standard 1.96 standard error multiplier for each variant independently.
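The computation described above can be sketched in a few lines of Python. The function names (`norm_cdf`, `ab_test`) are illustrative rather than taken from the tool's source, and the polynomial shown is the common Abramowitz & Stegun 26.2.17 variant of the standard normal CDF approximation:

```python
import math

def norm_cdf(x: float) -> float:
    """Standard normal CDF via the Abramowitz & Stegun 26.2.17
    polynomial approximation (absolute error on the order of 1e-7)."""
    if x < 0:
        return 1.0 - norm_cdf(-x)
    t = 1.0 / (1.0 + 0.2316419 * x)
    poly = t * (0.319381530 + t * (-0.356563782 + t * (1.781477937
           + t * (-1.821255978 + t * 1.330274429))))
    return 1.0 - (math.exp(-x * x / 2.0) / math.sqrt(2.0 * math.pi)) * poly

def ab_test(visitors_a: int, conv_a: int, visitors_b: int, conv_b: int):
    """Two-tailed two-proportion z-test with pooled proportion."""
    p_a = conv_a / visitors_a
    p_b = conv_b / visitors_b
    # Pooled proportion: the best estimate of the shared conversion rate
    # under the null hypothesis that A and B perform identically.
    p_pool = (conv_a + conv_b) / (visitors_a + visitors_b)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / visitors_a + 1 / visitors_b))
    z = (p_b - p_a) / se
    p_value = 2.0 * (1.0 - norm_cdf(abs(z)))  # two-tailed
    return z, p_value
```

For example, conversion rates of 10% vs 13% on 1,000 visitors per variant give z ≈ 2.10 and p ≈ 0.036, just crossing the 0.05 threshold.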
Key Features
- Two-tailed two-proportion z-test with pooled proportion for accurate A/B test analysis
- p-value displayed to 4 decimal places with "<0.001" for very significant results
- Z-score output showing the number of standard deviations separating the two conversion rates
- Relative lift calculation: percentage improvement of variant B over control A
- 95% confidence intervals for both control and variant conversion rates
- Clear significance verdict: statistically significant (p < 0.05) or not significant
- Color-coded result display — green for significant, yellow for inconclusive results
- Instant recalculation with each input change — no page refresh required
Frequently Asked Questions
What does "statistically significant" mean in an A/B test?
Statistical significance means the observed difference in conversion rates between A and B is unlikely to have occurred by random chance alone. At p < 0.05, a difference this large would occur less than 5% of the time if there were truly no difference between the variants. This does not guarantee the variant will perform the same in production; it means your sample data provides sufficient evidence to reject the hypothesis that both variants perform identically.
What is the p-value and how should I interpret it?
The p-value is the probability of observing a difference as large as (or larger than) the one measured, assuming there is truly no difference between A and B. A p-value of 0.03 means that if the two variants actually performed identically, a difference this large would arise by chance only 3% of the time; it is not the probability that the null hypothesis is true. Lower p-values indicate stronger evidence against the null hypothesis. The conventional threshold for declaring significance is p < 0.05.
What is the z-score in an A/B test?
The z-score measures how many standard deviations the observed difference in conversion rates is from zero (no difference). A z-score above 1.96 or below -1.96 corresponds to p < 0.05 (two-tailed). Larger absolute z-scores indicate stronger evidence of a real difference between variants.
What sample size do I need for a reliable A/B test?
As a rule of thumb, you need at least 100 conversions per variant before the statistical test is reliable. For low-conversion-rate pages (under 2%), you may need thousands of visitors per variant. The minimum detectable effect (MDE) — the smallest lift you care about — also determines sample size: smaller desired lifts require larger samples. Use a sample size calculator before starting a test.
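To illustrate how sample size scales with the minimum detectable effect, here is the standard two-proportion planning formula at 95% confidence and 80% power. This is a generic pre-test calculation, not part of the calculator itself, and the function name is hypothetical:

```python
import math

def sample_size_per_variant(base_rate: float, mde_rel: float,
                            alpha_z: float = 1.96, power_z: float = 0.8416):
    """Approximate visitors needed per variant to detect a relative lift
    of `mde_rel` over `base_rate` (alpha_z: 95% two-sided confidence,
    power_z: 80% power). Standard two-proportion formula."""
    p1 = base_rate
    p2 = base_rate * (1 + mde_rel)
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    n = (alpha_z + power_z) ** 2 * variance / (p2 - p1) ** 2
    return math.ceil(n)
```

For a 2% baseline and a +10% relative MDE this works out to roughly 80,000 visitors per variant, which is why low-conversion pages need so much traffic; a +50% MDE needs only a few thousand.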
What is conversion rate lift and how is it calculated?
Lift is the relative percentage improvement of variant B over control A. It is calculated as (rate_B - rate_A) / rate_A × 100. A lift of +15% means variant B converts 15% more visitors than control A. Note that lift can be positive even when the result is not statistically significant — always check the p-value before acting on lift numbers.
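The lift formula above is a one-liner in code; the helper name is illustrative:

```python
def relative_lift(rate_a: float, rate_b: float) -> float:
    """Relative lift of variant B over control A, as a percentage."""
    return (rate_b - rate_a) / rate_a * 100
```

For example, rates of 10% (A) and 11.5% (B) give a lift of +15%; a variant that converts worse than control yields a negative lift.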
What are 95% confidence intervals in this context?
The 95% confidence interval for each conversion rate is a range that would contain the true conversion rate in 95% of repeated experiments. If the confidence intervals for A and B do not overlap, this is a strong visual indicator of statistical significance; note, however, that overlapping intervals do not by themselves rule out significance, so the z-test remains the more precise check. They are calculated using the standard 1.96 × standard error formula for each variant independently.
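The interval computation is equally compact. The sketch below implements the standard Wald interval that the 1.96 × standard error description implies; the function name is hypothetical:

```python
import math

def wald_ci_95(conversions: int, visitors: int):
    """95% Wald confidence interval for a conversion rate:
    rate +/- 1.96 * sqrt(rate * (1 - rate) / visitors)."""
    rate = conversions / visitors
    se = math.sqrt(rate * (1 - rate) / visitors)
    return rate - 1.96 * se, rate + 1.96 * se
```

For 100 conversions out of 1,000 visitors this gives an interval of roughly (8.1%, 11.9%) around the observed 10% rate.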
When should I stop an A/B test?
Stop the test when: (1) you have reached your pre-determined minimum sample size, AND (2) the p-value is below 0.05 (or whatever significance threshold you set before the test). Peeking at results repeatedly and stopping early when you see p < 0.05 inflates the false positive rate — this is called "p-hacking." Decide the stopping rule before the test begins.
What if my result is "not statistically significant"?
A non-significant result does not prove that A and B perform the same; it means you have insufficient evidence to declare a winner. You may need to collect more data, or the true effect size may be too small to detect at your current traffic levels. Consider running the test longer, or revisit the hypothesis and test a change with a potentially larger impact.