About

Frequentist A/B tests answer the wrong question. A p-value is the probability of observing data at least as extreme as yours if no effect exists; it is not the probability that variant B actually beats variant A. Misreading p < 0.05 as "95% chance B is better" is a statistical error that costs organizations real revenue. This calculator instead uses Bayesian inference with a Beta-Binomial conjugate model. Each variant's conversion rate θ gets a Beta posterior, Beta(α₀ + c, β₀ + n − c), where c is the number of conversions, n is the number of visitors, and α₀ and β₀ are the prior parameters. Monte Carlo sampling (100,000 draws) then estimates the probability that each variant wins and its expected loss directly.

Expected loss is the metric that matters for decision-making. It quantifies how much conversion rate you expect to give up if you ship a variant and it turns out to be the worse one. A variant with a 99% probability to win and 0.001% expected loss is a safe call. A variant with an 80% probability to win and 2.5% expected loss deserves more data. The tool defaults to a uniform prior, Beta(1, 1), which treats every conversion rate as equally plausible. Adjust the prior parameters if you have historical baseline data. Note: the model assumes independent Bernoulli trials with a fixed conversion probability per variant. It does not account for time-varying effects, novelty bias, or segment interactions.

Formulas

The Bayesian A/B framework uses a Beta-Binomial conjugate model. Given a prior Beta(α₀, β₀) and observed data (c conversions from n visitors), the posterior distribution over the conversion rate θ is:

$$ \theta \mid c, n \sim \mathrm{Beta}(\alpha_0 + c,\ \beta_0 + n - c) $$
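
The conjugate update above amounts to two additions. A minimal sketch in Python, assuming the default uniform prior; the names `BetaParams` and `posterior` are illustrative and are not taken from the calculator itself:

```python
from typing import NamedTuple


class BetaParams(NamedTuple):
    """Shape parameters of a Beta distribution (hypothetical helper type)."""
    alpha: float
    beta: float


def posterior(conversions: int, visitors: int,
              prior_alpha: float = 1.0, prior_beta: float = 1.0) -> BetaParams:
    """Conjugate update: Beta prior plus binomial data gives a Beta posterior."""
    if not 0 <= conversions <= visitors:
        raise ValueError("conversions must be between 0 and visitors")
    # Successes are added to alpha, failures (visitors - conversions) to beta.
    return BetaParams(prior_alpha + conversions,
                      prior_beta + visitors - conversions)


print(posterior(conversions=50, visitors=1000))
# BetaParams(alpha=51.0, beta=951.0)
```

No sampling or numerical integration is needed for this step; the data only shift the two shape parameters.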

The Beta probability density function is:

$$ f(\theta; \alpha, \beta) = \frac{\theta^{\alpha - 1}(1 - \theta)^{\beta - 1}}{B(\alpha, \beta)} $$

where B(α, β) = Γ(α)Γ(β) / Γ(α + β) is the Beta function. The probability that variant B beats variant A is computed via Monte Carlo sampling:

$$ P(\theta_B > \theta_A) \approx \frac{1}{N} \sum_{i=1}^{N} I\left(\theta_B^{(i)} > \theta_A^{(i)}\right) $$
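
The estimator above can be sketched with the standard library alone. This is an illustration under stated assumptions, not the calculator's actual code: the function name `prob_b_beats_a`, the fixed seed, and the example posteriors (50/1000 and 65/1000 conversions under a uniform prior) are all hypothetical.

```python
import random


def prob_b_beats_a(post_a, post_b, draws=100_000, seed=0):
    """Monte Carlo estimate of P(theta_B > theta_A).

    post_a and post_b are (alpha, beta) tuples for each variant's posterior.
    """
    rng = random.Random(seed)  # seeded so repeated runs give the same estimate
    wins = 0
    for _ in range(draws):
        theta_a = rng.betavariate(*post_a)
        theta_b = rng.betavariate(*post_b)
        wins += theta_b > theta_a  # indicator: 1 if B's draw beats A's draw
    return wins / draws


# Hypothetical data: A = 50/1000, B = 65/1000, uniform Beta(1, 1) prior.
print(prob_b_beats_a((51, 951), (66, 936)))
```

Each pair of draws is one plausible "world"; the fraction of worlds in which B comes out ahead is the probability to win.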

Expected loss for choosing variant A is:

$$ E[L_A] \approx \frac{1}{N} \sum_{i=1}^{N} \max\left(\theta_B^{(i)} - \theta_A^{(i)},\ 0\right) $$

Where θ = true conversion rate for a variant, θ^(i) = the i-th Monte Carlo draw of θ from its posterior, α₀ = prior alpha parameter (shape 1), β₀ = prior beta parameter (shape 2), c = observed conversions, n = total visitors, N = number of Monte Carlo samples, I(·) = indicator function returning 1 if its condition is true and 0 otherwise.
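
Expected loss for both variants can be estimated from the same stream of posterior draws. A rough sketch, assuming posteriors are passed as (alpha, beta) tuples; `expected_losses` and the example numbers are illustrative only:

```python
import random


def expected_losses(post_a, post_b, draws=100_000, seed=0):
    """Return (E[L_A], E[L_B]) as absolute conversion-rate fractions."""
    rng = random.Random(seed)
    loss_a = loss_b = 0.0
    for _ in range(draws):
        theta_a = rng.betavariate(*post_a)
        theta_b = rng.betavariate(*post_b)
        loss_a += max(theta_b - theta_a, 0.0)  # what A gives up when B is better
        loss_b += max(theta_a - theta_b, 0.0)  # what B gives up when A is better
    return loss_a / draws, loss_b / draws


# Hypothetical data: A = 50/1000, B = 65/1000, uniform Beta(1, 1) prior.
loss_a, loss_b = expected_losses((51, 951), (66, 936))
print(f"E[L_A] = {loss_a:.4%}, E[L_B] = {loss_b:.4%}")
```

A useful sanity check is that E[L_A] − E[L_B] equals the difference of the posterior means, since max(x, 0) − max(−x, 0) = x for every draw.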

Reference Data

| Metric | Definition | Decision Threshold | Notes |
|---|---|---|---|
| Probability to Win | P(θ_B > θ_A) | ≥ 95% | Most common stopping rule |
| Expected Loss (A) | E[max(θ_B − θ_A, 0)] | ≤ 0.1% absolute | Risk of choosing A when B is better |
| Expected Loss (B) | E[max(θ_A − θ_B, 0)] | ≤ 0.1% absolute | Risk of choosing B when A is better |
| 95% Credible Interval | Central interval containing 95% of posterior mass | Narrower is better | Not the same as a confidence interval |
| Posterior Mean | α / (α + β) | - | Shrinks toward prior with small samples |
| Posterior Variance | αβ / [(α + β)²(α + β + 1)] | - | Decreases with more data |
| Relative Uplift | (θ_B − θ_A) / θ_A × 100% | Context-dependent | Computed from posterior means |
| Prior: Uniform | Beta(1, 1) | Default | No prior knowledge assumed |
| Prior: Jeffreys | Beta(0.5, 0.5) | Alternative | Minimally informative, invariant prior |
| Prior: Informed | Beta(α₀, β₀) | Custom | Use historical data to set parameters |
| Sample Size Guidance | - | ≥ 100 conversions per variant | Rule of thumb for stable posteriors |
| Minimum Detectable Effect | Smallest Δθ the test can resolve | 1-5% relative | Depends on sample size |
| Monte Carlo Error | ∝ 1/√N | ≈ 0.1% at 100k samples | Increase samples for precision |
| Conjugacy | Beta prior × Binomial likelihood → Beta posterior | - | Closed-form update, exact |
| Stopping Rule | Expected loss below threshold | 0.01-0.5% | Unlike frequentist, peeking is valid |
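
The closed-form rows of the table (posterior mean, posterior variance, and relative uplift from posterior means) need no sampling. A small sketch with hypothetical helper names and example posteriors:

```python
def beta_mean(alpha, beta):
    """Mean of Beta(alpha, beta)."""
    return alpha / (alpha + beta)


def beta_variance(alpha, beta):
    """Variance of Beta(alpha, beta)."""
    total = alpha + beta
    return alpha * beta / (total ** 2 * (total + 1))


def relative_uplift_pct(post_a, post_b):
    """Relative uplift of B over A, in percent, from posterior means."""
    mean_a = beta_mean(*post_a)
    mean_b = beta_mean(*post_b)
    return (mean_b - mean_a) / mean_a * 100


# Hypothetical posteriors: A = Beta(51, 951), B = Beta(66, 936).
print(f"mean A     = {beta_mean(51, 951):.4f}")
print(f"variance A = {beta_variance(51, 951):.2e}")
print(f"uplift     = {relative_uplift_pct((51, 951), (66, 936)):.1f}%")
```

With these example numbers the uplift works out to 15/51, or about 29.4%.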

Frequently Asked Questions

Frequentist tests compute a p-value: the probability of observing your data (or something more extreme) assuming no difference exists between variants. This does not directly answer "Is B better than A?" Bayesian testing computes the full posterior distribution of each variant's conversion rate, then directly calculates P(θ_B > θ_A). You can also peek at results at any time without inflating error rates - a critical advantage over frequentist sequential testing, which requires a correction (Bonferroni, alpha-spending) for multiple looks.
The default uniform prior Beta(1, 1) assigns equal probability density to all conversion rates from 0% to 100%. This is appropriate when you have no prior knowledge. If your baseline conversion rate is known (e.g., 5%), you can set an informed prior such as Beta(5, 95), which centers the prior at 5% with the weight of approximately 100 pseudo-observations. The Jeffreys prior Beta(0.5, 0.5) is an alternative non-informative prior that is invariant under reparameterization. With sufficient data (hundreds of conversions), the choice of prior has negligible impact on results.
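
One way to turn a known baseline into prior parameters is to pick a prior mean and a pseudo-observation count, as in the Beta(5, 95) example. The helper name `informed_prior` below is hypothetical, and the weight of 100 pseudo-observations is a judgment call rather than a rule:

```python
def informed_prior(baseline_rate, pseudo_observations):
    """Return (alpha0, beta0) with mean baseline_rate and the given weight."""
    if not 0.0 < baseline_rate < 1.0:
        raise ValueError("baseline_rate must be strictly between 0 and 1")
    alpha0 = baseline_rate * pseudo_observations
    beta0 = (1.0 - baseline_rate) * pseudo_observations
    return alpha0, beta0


alpha0, beta0 = informed_prior(0.05, 100)  # approximately Beta(5, 95)

# Hypothetical data: 500 conversions from 10,000 visitors.
c, n = 500, 10_000
mean_uniform = (1 + c) / (2 + n)                      # posterior mean, Beta(1, 1) prior
mean_informed = (alpha0 + c) / (alpha0 + beta0 + n)   # posterior mean, informed prior
print(f"{mean_uniform:.5f} vs {mean_informed:.5f}")
```

With this much data the two posterior means differ only in the fourth decimal place, which illustrates why the prior stops mattering once conversions number in the hundreds.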
Unlike frequentist tests, Bayesian tests allow valid inference at any sample size. The recommended stopping criterion is expected loss. When the expected loss of choosing the leading variant drops below a business-meaningful threshold (commonly 0.1% to 0.5% of absolute conversion rate), you can stop. A probability to win above 95% is a common but less rigorous alternative. Avoid stopping based on posterior means alone - two variants can have nearly identical means but very different expected losses depending on distribution overlap.
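
The stopping criterion can be written as a small decision function. This is a sketch under assumptions: losses are absolute conversion-rate fractions (0.001 means 0.1 percentage points), the threshold is a business choice, and the name `should_stop` and the sample inputs are illustrative.

```python
def should_stop(loss_a, loss_b, threshold=0.001):
    """Return the variant to ship ("A" or "B"), or None to keep collecting data.

    loss_a and loss_b are the expected losses of choosing A and B respectively.
    """
    # The leading variant is the one with the smaller expected loss.
    if loss_a <= loss_b:
        leader, leader_loss = "A", loss_a
    else:
        leader, leader_loss = "B", loss_b
    return leader if leader_loss < threshold else None


print(should_stop(loss_a=0.0153, loss_b=0.0004))  # B
print(should_stop(loss_a=0.0040, loss_b=0.0030))  # None
```

In the second call neither variant's risk is below the threshold, so the test keeps running even though B is nominally ahead.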
For two variants with Beta posteriors, a closed-form solution exists using the regularized incomplete Beta function. However, Monte Carlo sampling generalizes trivially to more than two variants, to computing expected loss, to computing arbitrary quantiles and credible intervals, and to more complex models (e.g., revenue per visitor with Gamma-Poisson models). With N samples, the Monte Carlo standard error of a probability estimate p is √(p(1 − p)/N), which is at most 0.5/√N. At 100,000 samples that is about 0.16% in the worst case (p = 0.5), which is more than sufficient for decision-making.
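
The precision of a Monte Carlo probability estimate follows from the binomial standard error. A minimal sketch; `mc_standard_error` is an illustrative name:

```python
import math


def mc_standard_error(p, draws):
    """Standard error of a Monte Carlo estimate of a probability p."""
    return math.sqrt(p * (1.0 - p) / draws)


# Worst case is p = 0.5; estimates near 0 or 1 are more precise.
for p in (0.5, 0.9, 0.99):
    print(f"p = {p}: SE = {mc_standard_error(p, 100_000):.4%}")
```

Because the error scales with 1/√N, halving it requires four times as many draws.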
The posterior variance of a Beta(α, β) distribution is αβ / [(α+β)²(α+β+1)]. As sample size n increases, α + β grows, and the variance shrinks roughly in proportion to 1/n. With 100 visitors and 5 conversions under the default uniform prior, the 95% credible interval for θ spans roughly 2% to 11%. With 10,000 visitors and 500 conversions, it narrows to approximately 4.6% to 5.4%. As a rule of thumb, aim for at least 100 conversions per variant for stable, actionable posteriors.
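
A central credible interval can be approximated from sorted posterior draws, which avoids any dependency beyond the standard library. The function name `credible_interval` is hypothetical, and the result carries Monte Carlo noise in the last digit or so:

```python
import random


def credible_interval(alpha, beta, mass=0.95, draws=100_000, seed=0):
    """Approximate central credible interval of Beta(alpha, beta) by sampling."""
    rng = random.Random(seed)
    samples = sorted(rng.betavariate(alpha, beta) for _ in range(draws))
    tail = (1.0 - mass) / 2.0
    lower = samples[int(tail * draws)]             # empirical lower-tail quantile
    upper = samples[int((1.0 - tail) * draws) - 1]  # empirical upper-tail quantile
    return lower, upper


# Uniform prior: 5/100 gives Beta(6, 96); 500/10,000 gives Beta(501, 9501).
for a, b in ((6, 96), (501, 9501)):
    lo, hi = credible_interval(a, b)
    print(f"Beta({a}, {b}): {lo:.2%} to {hi:.2%}")
```

An exact alternative is the inverse CDF of the Beta distribution from a statistics library; sampling is used here only to keep the sketch self-contained.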
This calculator models binary outcomes (converted vs. not converted) using the Beta-Binomial conjugate model. It is appropriate for click-through rate, sign-up rate, purchase rate, or any binary event. For continuous metrics like revenue per visitor or session duration, you would need a different model (e.g., Gamma-Poisson for count data, Normal-Normal for continuous). The Beta model should not be applied to non-binary data by discretizing - this loses information and introduces bias.
With zero conversions and a Beta(1,1) prior, the posterior becomes Beta(1, 1+n), which has a mean of 1/(2+n). For example, with 1000 visitors and 0 conversions, the posterior mean is approximately 0.1%. The model handles this gracefully - it does not divide by zero or produce undefined results. The posterior simply concentrates near zero, which correctly reflects the data. This is an advantage over the frequentist approach where 0 conversions creates complications with confidence interval formulas.
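
The zero-conversion arithmetic can be checked exactly with rational numbers. A small sketch, assuming integer prior parameters; `posterior_mean` is an illustrative name:

```python
from fractions import Fraction


def posterior_mean(conversions, visitors, prior_alpha=1, prior_beta=1):
    """Exact posterior mean (alpha0 + c) / (alpha0 + beta0 + n)."""
    return Fraction(prior_alpha + conversions,
                    prior_alpha + prior_beta + visitors)


mean = posterior_mean(conversions=0, visitors=1000)
print(mean)                 # 1/1002
print(f"{float(mean):.4%}")
```

The denominator is always at least α₀ + β₀, so the computation stays defined even with no visitors at all, in which case it returns the prior mean.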