Free tool · 2026

Conversion Lift Sample Size Calculator

Incrementality testing answers "did this campaign actually cause this revenue?" — the only honest measurement of ad ROI. Plan the test before you spend: how big the sample needs to be, and how long it'll take.

Test design

Plan a Meta Conversion Lift, geo holdout, or any 2-arm test.

Inputs:

  • Baseline (control) conversion rate, %
  • Smallest relative lift you want to be able to detect, %
  • Statistical power: 80% (probability of detecting a real lift; 80% is standard)
  • Confidence (two-sided): 95% (significance threshold; 95% is standard)
  • Daily visitors / users per arm
Results for the example inputs (2.5% baseline, 15% minimum detectable lift):

  • Sample size per group: 29.2K visitors. You'll need this many per arm; total test size: 58.4K.
  • Days to significance: 20 days (19.5 days of accrual per arm, rounded up).
  • Treatment rate (expected): 2.88%, i.e. baseline × (1 + 15%).
  • Minimum detectable absolute lift: +0.37pp (the percentage-point delta the test can detect).
  • Conversions needed per arm: 840 (treatment-side conversion count).

Reasonable test duration (20 days). This is the sweet spot: long enough to capture weekday/weekend patterns, short enough to act on results.

What this is for

Three test designs use the same math. Plan all of them with this tool:

  • Meta Conversion Lift / Snap Brand Lift / X Conversion API lift studies — platform splits your audience randomly into treatment and holdout.
  • Geo holdouts — you turn ads off in selected metros and compare to statistically matched metros where ads continue.
  • User-level holdouts — using a CDP, randomly hold 5-10% of your audience back from all marketing for the test period.

The math:

n per group ≈ ((z_α + z_β)² × variance) / lift²
where variance = p₁(1−p₁) + p₂(1−p₂)
and lift = p₂ − p₁, the absolute (percentage-point) lift

Higher power, tighter significance, smaller minimum detectable lift all push sample size up. Daily traffic per arm decides how long that sample takes to accumulate.
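The formula above can be sketched directly in Python using only the standard library (`statistics.NormalDist` supplies the z-values). The 2.5% baseline and 15% relative lift are the example values from the calculator above:

```python
from math import ceil
from statistics import NormalDist

def sample_size_per_arm(p1, rel_lift, power=0.80, confidence=0.95):
    """Per-arm n for a two-proportion test at a given relative lift."""
    p2 = p1 * (1 + rel_lift)                                  # expected treatment rate
    z_alpha = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # two-sided
    z_beta = NormalDist().inv_cdf(power)
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    abs_lift = p2 - p1                                        # percentage-point delta
    return ceil((z_alpha + z_beta) ** 2 * variance / abs_lift ** 2)

# The worked example above: 2.5% baseline, 15% relative lift
n = sample_size_per_arm(0.025, 0.15)
print(f"{n:,} per arm")  # ~29.2K, matching the calculator
```

Tightening any input (higher power, higher confidence, smaller lift) grows the result, which is the behavior described above.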

Pair this tool with the incrementality testing guide for the methodology details.

Frequently asked

What is incrementality testing?

Incrementality testing answers: 'would this conversion have happened anyway if we hadn't run the ads?' Standard attribution (last-click, MTA) gives the credit to whichever channel touched the user last. Incrementality measures actual causal lift by comparing a treatment group (sees ads) to a holdout (doesn't) and computing the conversion-rate difference. It's the only methodology that catches channels which are cannibalizing organic conversions you'd have earned anyway.

How is this different from a regular A/B test?

The math is identical — both are two-proportion tests. The setup differs: an A/B test compares two creative variants both shown to ad-exposed users. An incrementality test compares ad-exposed users to a group that sees no ads. Sample sizes are usually much larger for incrementality because lift is typically smaller (5-15% incremental lift is common; A/B winners often lift 10-30% relative to the loser).

What's a good minimum detectable lift to design for?

Industry norm: 10-20% relative lift. Smaller is more rigorous but explodes your sample size. If you're running a Meta Conversion Lift test on a large account, 10% is reasonable. For geo holdouts on a smaller account, you might design for 25% to keep the test affordable. Don't design for 5% unless you have very high daily volume — the test will take months.
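The lift² denominator is why a small minimum detectable lift explodes sample size. A quick sketch of the scaling, assuming a hypothetical 2.5% baseline and the same two-proportion formula as in "The math":

```python
from math import ceil
from statistics import NormalDist

def n_per_arm(p1, rel_lift, power=0.80, conf=0.95):
    p2 = p1 * (1 + rel_lift)
    z = NormalDist().inv_cdf
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return ceil((z(1 - (1 - conf) / 2) + z(power)) ** 2 * variance / (p2 - p1) ** 2)

for lift in (0.05, 0.10, 0.15, 0.25):
    print(f"{lift:.0%} MDL -> {n_per_arm(0.025, lift):,} per arm")
# Designing for 5% needs roughly 8-9x the sample of 15%;
# designing for 25% needs well under half of it.
```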

Why 80% statistical power?

Power is the probability of detecting a real lift if one exists. 80% is the industry default (you'll catch 4 out of 5 real lifts). 90% is more conservative but requires roughly a third more sample. 95% power is rare in marketing — it's usually overkill for what we're trying to measure.
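The sample-size cost of power depends only on the (z_α + z_β)² factor in the formula, so the ratio between power levels can be checked directly:

```python
from statistics import NormalDist

z = NormalDist().inv_cdf
z_alpha = z(0.975)  # two-sided 95% confidence

def z_factor(power):
    # (z_alpha + z_beta)^2 -- sample size scales linearly with this
    return (z_alpha + z(power)) ** 2

for power in (0.70, 0.80, 0.90, 0.95):
    rel = z_factor(power) / z_factor(0.80)
    print(f"{power:.0%} power -> {rel:.2f}x the sample of 80% power")
```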

Should I always use 95% confidence?

For directional reads (is this channel incremental?), 95% is standard. For high-stakes irreversible decisions (cutting a whole channel), use 99% to reduce false positives. For early-read 'should I bet on this' decisions, 90% is reasonable since you'll re-test later. Match the confidence threshold to the cost of being wrong.

What if my calculated test duration is longer than 8 weeks?

Three ways to shorten. (1) Increase daily traffic per group — for geo tests, this means picking larger / more metros. (2) Widen the minimum detectable lift — if you only care about lifts above 20%, design for that. (3) Reduce power to 70% — you accept more false negatives but finish faster. If none of those work, the test is impractical for this channel; use MMM (media-mix modeling) for measurement instead.

Run a real incrementality test every quarter.

Floowzy makes geo-holdout planning and result readouts straightforward — joined to your Stripe revenue, so you see actual incremental lift, not platform-reported numbers.