Identity Resolution

Definition

Identity resolution is the process of recognizing and linking identifiers that belong to the same person, account, or household across devices, channels, and data sources. It consolidates disparate records (e.g., email, mobile advertising ID (MAID), cookie, CRM ID, phone) into persistent profiles or clusters using deterministic rules, probabilistic models, or a hybrid of the two.

Relation to marketing

In marketing, identity resolution connects touchpoints across web, app, offline, and media platforms to enable accurate targeting, frequency control, personalization, and measurement. It helps activation systems and analytics draw from a consistent customer profile, which improves audience quality, reduces wasted spend, and supports closed-loop attribution across teams and tools.

How to calculate

Common program KPIs and formulas:

  • Match rate (%):
    Match Rate = (Resolved_records ÷ Input_records) × 100
  • Precision (positive predictive value):
    Precision = True_Matches ÷ (True_Matches + False_Matches)
  • Recall (sensitivity):
    Recall = True_Matches ÷ (True_Matches + Missed_Matches)
  • F1 score:
    F1 = 2 × (Precision × Recall) ÷ (Precision + Recall)
  • Over-merge rate (%): records incorrectly clustered together
    Over-merge = (False_Merges ÷ Total_Merges) × 100
  • Under-merge (fragmentation) rate (%): same person split across clusters
    Under-merge = (Missed_Links ÷ Potential_Links) × 100
  • Profile completeness (avg fields per profile):
    Completeness = Sum_of_populated_fields ÷ Total_profiles
  • Time to resolution (TTR):
    Average time from ingest of a new identifier to its link in a profile.
  • Active reach lift: difference in conversion rate or ROAS between resolved and unresolved audiences.
  • Cost per resolved profile (CPRP):
    CPRP = (Data_costs + Platform_fees + Ops_costs) ÷ Resolved_profiles
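
A worked example can make the formulas above concrete. The sketch below computes match rate, precision, recall, F1, and CPRP from a labeled evaluation sample. The class and function names and all of the counts and costs are hypothetical, chosen only to illustrate the arithmetic.

```python
from dataclasses import dataclass


@dataclass
class MatchCounts:
    """Counts from a labeled evaluation sample (all values hypothetical)."""
    input_records: int     # records submitted for resolution
    resolved_records: int  # records linked to a profile
    true_matches: int      # predicted links confirmed correct
    false_matches: int     # predicted links that are wrong
    missed_matches: int    # true links the system failed to make


def match_rate(c: MatchCounts) -> float:
    """Match Rate = (Resolved_records / Input_records) x 100."""
    return 100 * c.resolved_records / c.input_records


def precision(c: MatchCounts) -> float:
    return c.true_matches / (c.true_matches + c.false_matches)


def recall(c: MatchCounts) -> float:
    return c.true_matches / (c.true_matches + c.missed_matches)


def f1(c: MatchCounts) -> float:
    p, r = precision(c), recall(c)
    return 2 * p * r / (p + r)


def cost_per_resolved_profile(data_costs: float, platform_fees: float,
                              ops_costs: float, resolved_profiles: int) -> float:
    """CPRP = (Data_costs + Platform_fees + Ops_costs) / Resolved_profiles."""
    return (data_costs + platform_fees + ops_costs) / resolved_profiles


counts = MatchCounts(input_records=10_000, resolved_records=7_200,
                     true_matches=900, false_matches=100, missed_matches=300)

print(f"match rate: {match_rate(counts):.1f}%")
print(f"precision:  {precision(counts):.2f}")
print(f"recall:     {recall(counts):.2f}")
print(f"F1:         {f1(counts):.2f}")
print(f"CPRP:       {cost_per_resolved_profile(12_000, 6_000, 3_600, 7_200):.2f}")
```

Precision and recall need ground-truth labels, so in practice they are usually estimated on a manually reviewed sample rather than on the full graph.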

How to utilize

Typical use cases and patterns:

  • Audience activation: Build warehouse/CDP segments using resolved profiles and sync to CRM, MAP, and ad platforms for targeted outreach, suppressions, and lookalikes.
  • Personalization: Drive consistent content, offers, and pricing across web, app, and email using unified attributes (e.g., LTV tier, lifecycle stage, product affinities).
  • Frequency and reach management: Cap exposure across channels by operating on person- or household-level IDs.
  • Attribution and measurement: Improve path analysis, MTA/MMM inputs, and cohort reporting with linked identifiers.
  • Customer care and CX: Provide agents with consolidated histories for faster resolution and tailored service.
  • Compliance and consent enforcement: Apply consent status and regional policies at the profile level before activation.
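
The consent-enforcement pattern above can be sketched as a filter applied to resolved profiles before any sync. This is a minimal illustration, not a compliance implementation: the `Profile` fields, the purpose name "ads", and the `blocked_regions` parameter are all assumptions made for the example, and a missing consent entry is treated as "not granted".

```python
from dataclasses import dataclass, field


@dataclass
class Profile:
    profile_id: str
    region: str
    # Maps a processing purpose (hypothetical names) to whether it was granted.
    consents: dict = field(default_factory=dict)


def eligible_for_activation(profiles, purpose, blocked_regions=frozenset()):
    """Keep profiles that granted `purpose` and are outside blocked regions.

    A purpose with no recorded consent is treated as not granted.
    """
    return [
        p for p in profiles
        if p.consents.get(purpose, False) and p.region not in blocked_regions
    ]


profiles = [
    Profile("p1", "US", {"ads": True}),
    Profile("p2", "US", {"ads": False}),  # explicit opt-out
    Profile("p3", "EU", {"ads": True}),   # region blocked by policy below
    Profile("p4", "US"),                  # no consent on record
]

audience = eligible_for_activation(profiles, "ads", blocked_regions={"EU"})
print([p.profile_id for p in audience])
```

Defaulting to "not granted" keeps profiles with incomplete consent data out of activation; whether that default is appropriate depends on the applicable policy.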

Implementation steps:

  1. Ingest & normalize identifiers: emails (raw/hashed), phone, CRM IDs, MAIDs, cookies, login IDs, postal data.
  2. Stitching logic: deterministic rules (exact matches, UID2/hashed email, login) plus probabilistic models (name/phone/address similarity, device co-occurrence).
  3. Profile graphing: maintain person and household graphs with survivorship and provenance.
  4. Governance: consent, purpose limitation, retention, and audit trails.
  5. Activation & feedback: expose resolved IDs and attributes to destinations; re-ingest performance to refine rules and models.
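
Steps 1 and 2 can be sketched for the deterministic case as follows: identifiers are normalized, emails are hashed with SHA-256, and records that share any exact identifier are merged with a union-find structure. The record layout (`record_id`, `email`, `phone`, `crm_id`) and the normalization rules are assumptions for illustration; production rules for phone and email normalization are typically stricter.

```python
import hashlib
import re
from collections import defaultdict


def hash_email(email: str) -> str:
    """Trim and lowercase the address, then return its SHA-256 hex digest."""
    return hashlib.sha256(email.strip().lower().encode("utf-8")).hexdigest()


def normalize_phone(phone: str) -> str:
    """Keep digits only (a simplification; no country-code handling)."""
    return re.sub(r"\D", "", phone)


class UnionFind:
    def __init__(self):
        self.parent = {}

    def find(self, x):
        self.parent.setdefault(x, x)
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]  # path halving
            x = self.parent[x]
        return x

    def union(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra != rb:
            self.parent[rb] = ra


def stitch(records):
    """Cluster records that share any exact identifier (deterministic rule)."""
    uf = UnionFind()
    first_seen = {}  # identifier key -> first record id seen with that key
    for rec in records:
        rid = rec["record_id"]
        uf.find(rid)  # register the record even if it links to nothing
        keys = []
        if rec.get("email"):
            keys.append(("email_sha256", hash_email(rec["email"])))
        if rec.get("phone"):
            keys.append(("phone", normalize_phone(rec["phone"])))
        if rec.get("crm_id"):
            keys.append(("crm_id", rec["crm_id"]))
        for key in keys:
            if key in first_seen:
                uf.union(first_seen[key], rid)
            else:
                first_seen[key] = rid
    clusters = defaultdict(set)
    for rec in records:
        clusters[uf.find(rec["record_id"])].add(rec["record_id"])
    return sorted(sorted(c) for c in clusters.values())


records = [
    {"record_id": "r1", "email": " Ana@Example.com ", "crm_id": "C-1"},
    {"record_id": "r2", "email": "ana@example.com", "phone": "+1 (555) 010-0199"},
    {"record_id": "r3", "phone": "15550100199"},
    {"record_id": "r4", "email": "bo@example.com"},
]
print(stitch(records))  # [['r1', 'r2', 'r3'], ['r4']]
```

Here r1 and r2 link through the normalized email and r2 and r3 through the phone number, so all three fall into one cluster transitively. That transitivity is also how over-merges spread, which is one reason to keep edge-level provenance as described in step 3.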

Comparison to similar approaches

Approach | What it does | Strengths | Considerations | Best fit
Identity resolution (person graph) | Links identifiers into person/household profiles | Cross-channel consistency; improves activation & measurement | Requires data quality, consent, and ongoing QA | Cross-channel marketing and analytics
Deduplication | Removes exact/near-duplicate records within a single source | Simple, fast | Limited to one source; no cross-channel stitching | CRM hygiene, list cleanup
Deterministic matching | Exact rules (e.g., same hashed email) | High precision, explainable | Lower recall without logins | Logged-in ecosystems, B2B known contacts
Probabilistic matching | Similarity and co-occurrence models | Higher recall, device bridging | Risk of over-merge; needs monitoring | Cookieless/device bridging, media graphs
MDM (Customer/Party) | Enterprise golden record with survivorship | Strong governance and data stewardship | Often not optimized for adtech/martech activation cadence | Back-office master data, compliance
CDP (profiles & activation) | Stores profiles and activates audiences | Marketer-friendly UI and connectors | May rely on an external ID graph or create a separate one | Rapid activation with built-in stitching
Clean rooms | Privacy-safe data joins across parties | Privacy controls, overlap measurement | Limited PII exposure; activation paths vary | Retail media, partner measurement
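
To illustrate how probabilistic matching differs from the exact rules of deterministic matching, the sketch below scores a pair of records by weighted string similarity and accepts the link only above a threshold. Plain string similarity from Python's `difflib` stands in here for a trained model; the field names, weights, and the 0.85 threshold are assumptions for the example, not recommended values.

```python
from difflib import SequenceMatcher

# Hypothetical field weights; a real model would learn these from labeled pairs.
DEFAULT_WEIGHTS = {"name": 0.6, "address": 0.4}


def similarity(a: str, b: str) -> float:
    """Case-insensitive similarity ratio in [0, 1]."""
    return SequenceMatcher(None, a.strip().lower(), b.strip().lower()).ratio()


def match_score(rec_a: dict, rec_b: dict, weights: dict = DEFAULT_WEIGHTS) -> float:
    """Weighted average of per-field similarities."""
    return sum(w * similarity(rec_a[f], rec_b[f]) for f, w in weights.items())


def is_probable_match(rec_a: dict, rec_b: dict, threshold: float = 0.85) -> bool:
    return match_score(rec_a, rec_b) >= threshold


a = {"name": "Jon Smith", "address": "12 Main St"}
b = {"name": "John Smith", "address": "12 Main Street"}
c = {"name": "Maria Lopez", "address": "98 Oak Ave"}

print(is_probable_match(a, b))  # near-duplicate spelling variants
print(is_probable_match(a, c))  # unrelated records
```

Raising the threshold trades recall for precision, which is the over-merge risk noted in the table: a looser threshold bridges more records but links more people who merely look alike.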

Best practices

  • Adopt a hybrid strategy: use deterministic rules for core links and probabilistic methods to extend reach, with thresholds per use case.
  • Manage identity as a graph: store edges with confidence scores, timestamps, and source provenance; enable rollbacks.
  • Separate person vs. household: define and maintain both where relevant (CTV, direct mail, utilities).
  • Set SLAs and guardrails: precision/recall targets by destination; enforce minimum confidence for sensitive activations.
  • Track consent by identifier and profile: propagate opt-outs and purpose restrictions across edges before sync.
  • Standardize keys and hashing: apply consistent normalization (case, whitespace) before hashing emails/phones, and use salts only when required and shared by every system that must match on the hash.
  • QA continuously: canary tests, shadow scoring, and post-activation audits for over-/under-merge trends.
  • Version rules as code: peer-reviewed changes with automated tests and lineage.
  • Enrich selectively: add third-party data where it clearly improves match quality or completeness; measure incrementality.
  • Plan for decay: handle identifier churn (cookie resets, device changes) with recency weighting and edge expiry.

Future trends

  • Cookieless identity: greater reliance on first-party logins, hashed emails, and publisher/retailer provided IDs.
  • Privacy-enhancing tech: clean rooms, differential privacy, and on-device matching for safer joins.
  • Policy-as-code: automated consent and data-minimization checks embedded in stitch pipelines and connectors.
  • Graph-native analytics: real-time scoring on identity graphs to power triggers and recommendations.
  • Collaborative identity: partner graphs and retail media networks with secure overlap and limited PII movement.
  • AI-assisted stitching: model-generated rules, active learning on ambiguous edges, and anomaly detection in near real time.
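
Returning to the graph-management and decay practices listed under Best practices (edges with confidence scores, timestamps, and source provenance; recency weighting; edge expiry), one possible shape for an edge record is sketched below. The `Edge` fields, the exponential half-life decay, and the default thresholds are assumptions chosen for illustration, not a prescribed model.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class Edge:
    left: str            # identifier on one side of the link
    right: str           # identifier on the other side
    confidence: float    # score assigned when the link was made, in [0, 1]
    source: str          # provenance, e.g. which rule or feed created the edge
    last_seen: datetime  # most recent observation supporting the link


def decayed_confidence(edge: Edge, now: datetime, half_life_days: float = 90.0) -> float:
    """Recency weighting: halve the confidence every `half_life_days`."""
    age_days = max((now - edge.last_seen).total_seconds() / 86_400, 0.0)
    return edge.confidence * 0.5 ** (age_days / half_life_days)


def active_edges(edges, now, min_confidence=0.5, max_age_days=365):
    """Drop expired edges, then drop edges whose decayed score is too low."""
    cutoff = now - timedelta(days=max_age_days)
    return [
        e for e in edges
        if e.last_seen >= cutoff and decayed_confidence(e, now) >= min_confidence
    ]


now = datetime(2024, 1, 1, tzinfo=timezone.utc)
edges = [
    Edge("email:abc", "maid:123", 0.95, "login", now - timedelta(days=1)),
    Edge("cookie:x1", "maid:123", 0.90, "co-occurrence", now - timedelta(days=90)),
    Edge("cookie:x0", "email:abc", 1.00, "login", now - timedelta(days=400)),
]

for e in active_edges(edges, now):
    print(e.left, e.right, e.source, round(decayed_confidence(e, now), 3))
```

Because edges are immutable records with a source and a timestamp, a bad rule can be rolled back by filtering out the edges it produced and re-clustering, rather than by editing merged profiles in place.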