Definition
Identity resolution is the process of recognizing and linking identifiers that belong to the same person, account, or household across devices, channels, and data sources. It consolidates disparate records (e.g., email, mobile advertising ID (MAID), cookie, CRM ID, phone) into persistent profiles or clusters using deterministic rules, probabilistic models, or a hybrid of the two.
Relation to marketing
In marketing, identity resolution connects touchpoints across web, app, offline, and media platforms to enable accurate targeting, frequency control, personalization, and measurement. It ensures that activation systems and analytics draw from a consistent customer profile—improving audience quality, reducing waste, and supporting closed-loop attribution across teams and tools.
How to calculate
Common program KPIs and formulas:
- Match rate (%):
Match Rate = (Resolved_records ÷ Input_records) × 100
- Precision (positive predictive value):
Precision = True_Matches ÷ (True_Matches + False_Matches)
- Recall (sensitivity):
Recall = True_Matches ÷ (True_Matches + Missed_Matches)
- F1 score:
F1 = 2 × (Precision × Recall) ÷ (Precision + Recall)
- Over-merge rate (%): records incorrectly clustered together
Over-merge = (False_Merges ÷ Total_Merges) × 100
- Under-merge (fragmentation) rate (%): same person split across clusters
Under-merge = (Missed_Links ÷ Potential_Links) × 100
- Profile completeness (avg fields per profile):
Completeness = Sum_of_populated_fields ÷ Total_profiles
- Time to resolution (TTR):
Average time from ingest of a new identifier to its link in a profile.
- Active reach lift: difference in conversion rate or ROAS between resolved and unresolved audiences.
- Cost per resolved profile (CPRP):
CPRP = (Data_costs + Platform_fees + Ops_costs) ÷ Resolved_profiles
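The formulas above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function names and the sample counts are hypothetical, and the counts are assumed to come from a labeled evaluation set where true, false, and missed matches have already been tallied.

```python
def safe_div(numerator: float, denominator: float) -> float:
    """Return numerator / denominator, or 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def match_rate(resolved_records: int, input_records: int) -> float:
    """Match rate as a percentage of input records."""
    return safe_div(resolved_records, input_records) * 100


def match_quality(true_matches: int, false_matches: int, missed_matches: int) -> dict:
    """Precision, recall, and F1 from labeled link counts."""
    precision = safe_div(true_matches, true_matches + false_matches)
    recall = safe_div(true_matches, true_matches + missed_matches)
    f1 = safe_div(2 * precision * recall, precision + recall)
    return {"precision": precision, "recall": recall, "f1": f1}


def cost_per_resolved_profile(data_costs: float, platform_fees: float,
                              ops_costs: float, resolved_profiles: int) -> float:
    """CPRP: total program cost divided by resolved profiles."""
    return safe_div(data_costs + platform_fees + ops_costs, resolved_profiles)


# Hypothetical sample counts, for illustration only.
rate = match_rate(resolved_records=8_000, input_records=10_000)
quality = match_quality(true_matches=900, false_matches=100, missed_matches=300)
cprp = cost_per_resolved_profile(5_000, 3_000, 2_000, resolved_profiles=8_000)

print(f"match rate: {rate:.1f}%")
print(f"precision: {quality['precision']:.2f}, recall: {quality['recall']:.2f}, "
      f"F1: {quality['f1']:.3f}")
print(f"CPRP: {cprp:.2f}")
```

Guarding every division against a zero denominator keeps the report usable on empty batches, at the cost of reporting 0.0 where the metric is strictly undefined.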
How to utilize
Typical use cases and patterns:
- Audience activation: Build warehouse/CDP segments using resolved profiles and sync to CRM, MAP, and ad platforms for targeted outreach, suppressions, and lookalikes.
- Personalization: Drive consistent content, offers, and pricing across web, app, and email using unified attributes (e.g., LTV tier, lifecycle stage, product affinities).
- Frequency and reach management: Cap exposure across channels by operating on person- or household-level IDs.
- Attribution and measurement: Improve path analysis, MTA/MMM inputs, and cohort reporting with linked identifiers.
- Customer care and CX: Provide agents with consolidated histories for faster resolution and tailored service.
- Compliance and consent enforcement: Apply consent status and regional policies at the profile level before activation.
Implementation steps:
- Ingest & normalize identifiers: emails (raw/hashed), phone, CRM IDs, MAIDs, cookies, login IDs, postal data.
- Stitching logic: deterministic rules (exact matches, UID2/hashed email, login) plus probabilistic models (name/phone/address similarity, device co-occurrence).
- Profile graphing: maintain person and household graphs with survivorship and provenance.
- Governance: consent, purpose limitation, retention, and audit trails.
- Activation & feedback: expose resolved IDs and attributes to destinations; re-ingest performance to refine rules and models.
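The first two implementation steps (normalize, then stitch deterministically) might look like the sketch below. It assumes a simplified record shape with `record_id`, `email`, and `phone` fields, unsalted SHA-256 hashing, and a union-find structure for clustering; all of these are illustrative choices rather than a prescribed design, and the phone normalization assumes numbers already carry a country code.

```python
import hashlib
from collections import defaultdict


def normalize_email(raw: str) -> str:
    """Trim whitespace and lowercase before hashing."""
    return raw.strip().lower()


def normalize_phone(raw: str) -> str:
    """Keep digits only (assumes the country code is already present)."""
    return "".join(ch for ch in raw if ch.isdigit())


def sha256_hex(value: str) -> str:
    return hashlib.sha256(value.encode("utf-8")).hexdigest()


class UnionFind:
    """Disjoint-set structure: nodes that share an identifier end up in one set."""

    def __init__(self):
        self.parent = {}

    def find(self, x):
        self.parent.setdefault(x, x)
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]  # path halving
            x = self.parent[x]
        return x

    def union(self, a, b):
        root_a, root_b = self.find(a), self.find(b)
        if root_a != root_b:
            self.parent[root_b] = root_a


def stitch(records):
    """Cluster records that share an exact hashed email or hashed phone."""
    uf = UnionFind()
    for rec in records:
        node = ("record", rec["record_id"])
        uf.find(node)  # register records that carry no identifiers
        if rec.get("email"):
            uf.union(node, ("hem", sha256_hex(normalize_email(rec["email"]))))
        if rec.get("phone"):
            uf.union(node, ("phone", sha256_hex(normalize_phone(rec["phone"]))))
    clusters = defaultdict(set)
    for rec in records:
        clusters[uf.find(("record", rec["record_id"]))].add(rec["record_id"])
    return sorted(sorted(members) for members in clusters.values())


# Hypothetical input records.
records = [
    {"record_id": "crm-1", "email": " Ana@Example.com ", "phone": "+1 (555) 010-0001"},
    {"record_id": "web-7", "email": "ana@example.com"},
    {"record_id": "app-3", "phone": "15550100001"},
    {"record_id": "web-9", "email": "bo@example.com"},
]
clusters = stitch(records)
print(clusters)
```

Here `web-7` and `app-3` never share an identifier directly; they land in the same cluster only because `crm-1` bridges them. That transitive behavior is why over-merge monitoring matters even with purely deterministic rules.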
Comparison to similar approaches
Approach | What it does | Strengths | Considerations | Best fit |
---|---|---|---|---|
Identity resolution (person graph) | Links identifiers into person/household profiles | Cross-channel consistency; improves activation & measurement | Requires data quality, consent, and ongoing QA | Cross-channel marketing and analytics |
Deduplication | Removes exact/near-duplicate records within a single source | Simple, fast | Limited to one source; no cross-channel stitching | CRM hygiene, list cleanup |
Deterministic matching | Exact rules (e.g., same hashed email) | High precision, explainable | Lower recall without logins | Logged-in ecosystems, B2B known contacts |
Probabilistic matching | Similarity and co-occurrence models | Higher recall, device bridging | Risk of over-merge; needs monitoring | Cookieless/device bridging, media graphs |
MDM (Customer/Party) | Enterprise golden record with survivorship | Strong governance and data stewardship | Often not optimized for adtech/martech activation cadence | Back-office master data, compliance |
CDP (profiles & activation) | Stores profiles and activates audiences | Marketer-friendly UI and connectors | May rely on an external ID graph or create a separate one | Rapid activation with built-in stitching |
Clean rooms | Privacy-safe data joins across parties | Privacy controls, overlap measurement | Limited PII exposure; activation paths vary | Retail media, partner measurement |
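To make the deterministic and probabilistic rows concrete, the sketch below contrasts an exact-key rule with a weighted similarity score. It is only an illustration: the field names, the equal weights, and the use of Python's `difflib.SequenceMatcher` as the similarity measure are assumptions, and production systems typically use trained models and richer features such as device co-occurrence.

```python
from difflib import SequenceMatcher

# Hypothetical field weights; real systems usually learn these from labeled pairs.
WEIGHTS = {"name": 0.5, "address": 0.5}


def similarity(a: str, b: str) -> float:
    """String similarity in [0, 1]; a missing value on either side scores 0."""
    a, b = a.strip().lower(), b.strip().lower()
    if not a or not b:
        return 0.0
    return SequenceMatcher(None, a, b).ratio()


def deterministic_match(rec_a: dict, rec_b: dict) -> bool:
    """Exact rule: both records carry the same hashed email."""
    key_a, key_b = rec_a.get("hashed_email"), rec_b.get("hashed_email")
    return key_a is not None and key_a == key_b


def probabilistic_score(rec_a: dict, rec_b: dict) -> float:
    """Weighted average of per-field similarities."""
    return sum(
        weight * similarity(rec_a.get(field, ""), rec_b.get(field, ""))
        for field, weight in WEIGHTS.items()
    )


# Hypothetical records with no shared hashed email.
crm = {"name": "Ana Silva", "address": "12 Oak Street"}
web = {"name": "Ana Silvia", "address": "12 Oak St"}
other = {"name": "Bo Chen", "address": "99 Pine Rd, Apt 4"}

near_score = probabilistic_score(crm, web)
far_score = probabilistic_score(crm, other)
print(deterministic_match(crm, web), round(near_score, 3), round(far_score, 3))
```

The deterministic rule finds nothing here because no key is shared, while the similarity score separates the likely pair from the unlikely one. Where to set the acceptance threshold is the precision/recall trade-off described in the table.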
Best practices
- Adopt a hybrid strategy: use deterministic rules for core links and probabilistic methods to extend reach, with thresholds per use case.
- Manage identity as a graph: store edges with confidence scores, timestamps, and source provenance; enable rollbacks.
- Separate person vs. household: define and maintain both where relevant (CTV, direct mail, utilities).
- Set SLAs and guardrails: precision/recall targets by destination; enforce minimum confidence for sensitive activations.
- Track consent by identifier and profile: propagate opt-outs and purpose restrictions across edges before sync.
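- Standardize keys and hashing: apply consistent normalization (case, whitespace) before hashing; use salted hashing for emails/phones when required, keeping in mind that hashes only match across systems that share the same salt.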
- QA continuously: canary tests, shadow scoring, and post-activation audits for over-/under-merge trends.
- Version rules as code: peer-reviewed changes with automated tests and lineage.
- Enrich selectively: add third-party data where it clearly improves match quality or completeness; measure incrementality.
- Plan for decay: handle identifier churn (cookie resets, device changes) with recency weighting and edge expiry.
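Several of these practices (edges with confidence, timestamps, and provenance; per-use-case thresholds; recency weighting and expiry) can be combined in one small data model. The sketch below is one possible shape, assuming exponential decay with a fixed half-life; the `Edge` fields, the half-life, the maximum age, and the threshold values are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

HALF_LIFE_DAYS = 90    # hypothetical: confidence halves every 90 days
MAX_AGE_DAYS = 365     # hypothetical: edges older than this are treated as expired
THRESHOLDS = {"personalization": 0.6, "suppression": 0.9}  # hypothetical per-use-case minimums


@dataclass(frozen=True)
class Edge:
    left_id: str
    right_id: str
    confidence: float    # confidence at observation time, 0..1
    last_seen: datetime  # timezone-aware timestamp of the latest observation
    source: str          # provenance, e.g. which system produced the link


def decayed_confidence(edge: Edge, now: datetime) -> float:
    """Recency-weighted confidence; expired edges score 0."""
    age_days = max((now - edge.last_seen).total_seconds() / 86_400, 0.0)
    if age_days > MAX_AGE_DAYS:
        return 0.0
    return edge.confidence * 0.5 ** (age_days / HALF_LIFE_DAYS)


def usable_edges(edges, use_case: str, now: datetime):
    """Edges whose decayed confidence meets the threshold for a use case."""
    threshold = THRESHOLDS[use_case]
    return [e for e in edges if decayed_confidence(e, now) >= threshold]


now = datetime(2024, 7, 1, tzinfo=timezone.utc)
edges = [
    Edge("hem:a1", "crm:42", 0.95, now, "login"),
    Edge("cookie:c9", "crm:42", 1.00, now - timedelta(days=45), "web_sdk"),
    Edge("maid:m3", "crm:42", 0.95, now - timedelta(days=90), "app_sdk"),
    Edge("cookie:c1", "crm:42", 0.99, now - timedelta(days=400), "web_sdk"),
]

for use_case in THRESHOLDS:
    kept = [e.left_id for e in usable_edges(edges, use_case, now)]
    print(use_case, kept)
```

Computing decay at read time rather than overwriting stored confidence keeps the original observation intact, which makes rollbacks and audits simpler.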
Future trends
- Cookieless identity: greater reliance on first-party logins, hashed emails, and publisher/retailer provided IDs.
- Privacy-enhancing tech: clean rooms, differential privacy, and on-device matching for safer joins.
- Policy-as-code: automated consent and data-minimization checks embedded in stitch pipelines and connectors.
- Graph-native analytics: real-time scoring on identity graphs to power triggers and recommendations.
- Collaborative identity: partner graphs and retail media networks with secure overlap and limited PII movement.
- AI-assisted stitching: model-generated rules, active learning on ambiguous edges, and anomaly detection in near real time.
Related Terms
- Customer Data Platform (CDP)
- Entity Resolution
- Deterministic Matching
- Extract, Transform, Load (ETL)
- Reverse ETL
- Probabilistic Matching
- Identity Graph
- Hashed Email (HEM)
- Mobile Advertising ID (MAID)
- Household Graph
- Data Clean Room
- Consent Management