Holdout Campaign

Definition

A holdout campaign is an experiment in which a predefined portion of an eligible audience is intentionally withheld from receiving a marketing treatment (e.g., an ad, email, offer, or on-site personalization). The performance of the treated group is compared to the holdout (control) group to estimate incrementality—what the campaign caused beyond what would have happened anyway.

In marketing, holdouts quantify true business impact across channels, helping leaders separate correlation from causation, validate vendor claims, and allocate budget to programs that create incremental revenue, leads, or retention—rather than activity that would have occurred without the spend.

How to calculate

Define a measurement window long enough to capture conversions (account for attribution lag).

  • Absolute lift (conversion outcomes)
    Lift = p_exposed − p_holdout, where p is the conversion rate.
  • Relative lift
    Relative Lift = (p_exposed − p_holdout) / p_holdout
  • Incremental revenue
    Inc_Rev = Revenue_exposed − Revenue_holdout
  • Incremental ROAS (iROAS)
    iROAS = Inc_Rev / Ad_Spend
  • Incremental profit (if margin known)
    Inc_Profit = (Inc_Rev × Gross_Margin) − Ad_Spend
  • Statistical significance
    Use a two-proportion z-test or logistic regression with a treatment indicator; report confidence intervals for lift or iROAS.
  • Sample size (rule of thumb)
    Determine per-group n via power analysis using baseline conversion p0, expected lift Δ, desired power (e.g., 80%), and alpha (e.g., 5%). Reserve a 5–20% holdout depending on traffic and expected effect.
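The formulas above can be sketched in Python. This is a minimal example, assuming conversion counts and revenue totals per group have already been pulled from reporting; all function and variable names are illustrative:

```python
import math

def lift_metrics(conv_exposed, n_exposed, conv_holdout, n_holdout,
                 revenue_exposed=0.0, revenue_holdout=0.0,
                 ad_spend=0.0, gross_margin=1.0):
    """Core holdout metrics: absolute/relative lift, iROAS, incremental profit."""
    p_e = conv_exposed / n_exposed      # conversion rate, exposed group
    p_h = conv_holdout / n_holdout      # conversion rate, holdout group
    abs_lift = p_e - p_h
    rel_lift = abs_lift / p_h if p_h > 0 else float("nan")
    inc_rev = revenue_exposed - revenue_holdout
    iroas = inc_rev / ad_spend if ad_spend > 0 else float("nan")
    inc_profit = inc_rev * gross_margin - ad_spend
    return {"abs_lift": abs_lift, "rel_lift": rel_lift,
            "inc_rev": inc_rev, "iroas": iroas, "inc_profit": inc_profit}

def two_proportion_z(conv_exposed, n_exposed, conv_holdout, n_holdout):
    """Two-sided two-proportion z-test on conversion rates."""
    p_e = conv_exposed / n_exposed
    p_h = conv_holdout / n_holdout
    p_pool = (conv_exposed + conv_holdout) / (n_exposed + n_holdout)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_exposed + 1 / n_holdout))
    z = (p_e - p_h) / se
    # Normal CDF via math.erf; p-value for a two-sided test
    p_value = 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))
    return z, p_value

def sample_size_per_group(p0, delta, z_alpha=1.96, z_beta=0.84):
    """Approximate per-group n to detect an absolute lift `delta` over
    baseline p0 at 5% alpha and 80% power (normal approximation)."""
    p1 = p0 + delta
    var = p0 * (1 - p0) + p1 * (1 - p1)
    return math.ceil((z_alpha + z_beta) ** 2 * var / delta ** 2)
```

For example, 600 conversions among 10,000 exposed vs. 500 among 10,000 held out gives a 1-point absolute lift (20% relative), and the z-test shows it is unlikely to be noise at conventional thresholds.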

How to utilize

Common use cases:

  • Email & lifecycle: Prove incremental opens, clicks, orders; validate triggered vs. batch sends.
  • Paid media: Measure incremental conversions or revenue from prospecting and retargeting.
  • On-site/app personalization: Quantify net impact of recommendations, banners, and offers.
  • Loyalty & retention: Test whether points, perks, or save-offers change churn behavior.
  • B2B: Evaluate ABM sequences on accounts or buying groups (account-level randomization).

Practical steps:

  1. Choose the unit of randomization (user, household, account, geography). Match it to how exposure happens to avoid contamination.
  2. Define eligibility and exclusions (e.g., recent purchasers, suppression lists).
  3. Randomly assign to treatment vs. holdout and enforce the holdout across all entry points.
  4. Run for a fixed window; set guardrail metrics (e.g., unsubscribes, customer service contacts).
  5. Analyze intention-to-treat (ITT) to keep groups comparable; optionally add covariate adjustment (e.g., CUPED) to reduce variance.
  6. Decide and scale: If lift is significant and profitable, graduate from test to standard practice and schedule periodic re-validation.
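Step 3 is often implemented with deterministic hashing: hashing each unit's id with an experiment-specific salt yields a stable assignment that every channel can recompute independently, which is what makes the holdout enforceable across all entry points. A sketch under those assumptions (names are illustrative):

```python
import hashlib

def assign(unit_id: str, experiment: str, holdout_pct: float = 0.10) -> str:
    """Deterministically bucket a randomization unit (user, account, geo)
    into 'holdout' or 'treatment'. The hash depends only on the experiment
    salt and the unit id, so every orchestration tool computes the same
    assignment without a shared lookup table."""
    digest = hashlib.sha256(f"{experiment}:{unit_id}".encode()).hexdigest()
    bucket = int(digest[:8], 16) / 0xFFFFFFFF  # map hash to [0, 1]
    return "holdout" if bucket < holdout_pct else "treatment"
```

Before launch, audit the realized split over the eligible audience; the observed holdout share should sit close to the configured percentage, and delivery logs should show zero treatments for holdout ids.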

Comparison to similar approaches

For each approach: what it measures, when to use it, its strengths, and its limitations.

  • Holdout campaign (control group). Measures: incremental effect of a campaign vs. no exposure. When to use: direct, person/account-level programs. Strengths: causal, simple to explain, channel-agnostic. Limitations: requires withholding treatment; needs enough volume and time.
  • A/B/n creative tests. Measures: relative performance between variants (all treated). When to use: optimizing subject lines, creatives, bids. Strengths: fast optimization within a campaign. Limitations: does not reveal incrementality vs. doing nothing.
  • Geo-experiments / matched markets. Measures: market-level incrementality. When to use: offline media, OOH, TV, or when user-level randomization isn’t possible. Strengths: lower spillover risk; realistic scale. Limitations: fewer units mean lower power; geographic heterogeneity.
  • Ghost ads / PSA controls. Measures: incrementality in walled gardens via a simulated control. When to use: large ad platforms. Strengths: reduces contamination; platform-native. Limitations: method and platform constraints; black-box elements.
  • Pre-post (time series). Measures: change vs. a historical baseline. When to use: early directional reads. Strengths: quick and data-light. Limitations: confounded by seasonality and external factors.
  • Propensity / synthetic control. Measures: quasi-experimental incrementality. When to use: when randomization isn’t feasible. Strengths: uses observational data. Limitations: sensitive to unobserved bias.
  • Uplift modeling. Measures: predicted individual-level causal effect. When to use: targeting only the “persuadables.” Strengths: improves efficiency of who to treat. Limitations: requires prior experiments and data science maturity.
  • Media mix modeling (MMM). Measures: channel contribution at the aggregate level. When to use: strategic budget planning. Strengths: works under privacy constraints; long-horizon view. Limitations: coarse granularity; complements, rather than replaces, holdouts.

Best practices

  • Randomize at the right level (person, household, account, or geo) to match exposure and avoid cross-contamination.
  • Pre-register the plan: KPIs, window, success thresholds, guardrails, and analysis method.
  • Size for power using realistic baseline rates and minimum detectable effect.
  • Stratify or block on key covariates (e.g., lifecycle stage, value tier) before randomization.
  • Enforce the holdout in all orchestration tools; audit delivery logs for leaks.
  • Respect conversion lags; include late conversions or run a follow-up sensitivity analysis.
  • Use ITT as the primary estimate; add treatment-on-the-treated as a sensitivity check if applicable.
  • Adjust for seasonality and major events; avoid overlapping experiments that create interference.
  • Report uncertainty (CIs, p-values) and business impact (incremental revenue/profit, iROAS).
  • Re-validate periodically; effects decay as markets and audiences change.

Trends

  • Experimentation platforms integrated with CDPs to automate eligibility, assignment, and enforcement across channels.
  • Privacy-preserving measurement via clean rooms, aggregated reporting, and modeled conversions.
  • Sequential and real-time testing with alpha-spending controls for faster reads without inflating false positives.
  • Causal ML and uplift-driven targeting that uses past holdouts to prioritize customers most likely to be influenced.
  • Hybrid MMM + experiment workflows where market-level models are calibrated with ongoing holdouts.
  • Platform-native incrementality tools (e.g., ghost ads) expanding beyond walled gardens and into retail media and CTV.