Definition
A hashed email (HEM) is a cryptographic representation of an email address created by running the address through a one-way hash function (e.g., SHA-256). The output is a fixed-length string (64 hexadecimal characters for SHA-256) that cannot be inverted directly. Because email addresses are guessable, however, an unkeyed hash can still be matched against a dictionary of candidate addresses; safeguards such as secret salts or HMAC keys make that kind of reversal infeasible.
Relation to marketing
HEM is widely used as a privacy-preserving match key across platforms and datasets. It lets marketers link first-party records to ad platforms, publishers, measurement systems, and clean rooms without transmitting raw, personally identifiable email addresses. Typical uses include identity resolution, audience onboarding, cross-channel frequency control, suppression, measurement, and attribution in environments where third-party cookies and mobile IDs are constrained.
How to calculate
Inputs and decisions
- Normalization policy: Define how you standardize emails before hashing. At minimum, trim whitespace and lowercase. Decide (and document) whether to apply domain-specific canonicalization (e.g., handling “+tag” or dot rules), and apply it only if both sides use exactly the same rules.
- Hash function: Prefer SHA-256. Avoid MD5 for new implementations.
- Protection against guessing: Because the space of email addresses is guessable, use HMAC-SHA-256 with a secret key, or hashing with a secret salt, when you do not need open interoperability or when the counterpart can share the secret. For open-ecosystem matches (e.g., to ad platforms), unsalted SHA-256 is often required; mitigate the residual risk with contractual and process controls.
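A normalization policy can be captured as a small, versioned object so that both sides apply identical rules. The following is a minimal sketch, not a standard: the names `NormalizationPolicy`, `strip_plus_tag`, and `strip_dots_for` are hypothetical, and the optional canonicalization steps should stay off unless the counterpart applies exactly the same ones.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NormalizationPolicy:
    """Hypothetical, versioned description of how emails are standardized."""
    version: str = "v1"
    strip_plus_tag: bool = False              # drop "+tag" from the local part
    strip_dots_for: frozenset = frozenset()   # domains where local-part dots are ignored


def normalize_email(email_raw: str,
                    policy: NormalizationPolicy = NormalizationPolicy()) -> str:
    """Trim and lowercase, then apply any optional rules the policy enables."""
    email = email_raw.strip().lower()
    local, sep, domain = email.rpartition("@")
    if not sep or not local or not domain:
        raise ValueError("not an email address")
    if policy.strip_plus_tag:
        local = local.split("+", 1)[0]
    if domain in policy.strip_dots_for:
        local = local.replace(".", "")
    return f"{local}@{domain}"


# Default policy: trim + lowercase only.
baseline = normalize_email("  Jane.Doe+News@Example.COM ")

# Stricter policy, usable only if the counterpart applies the same rules.
strict = NormalizationPolicy(version="v2", strip_plus_tag=True,
                             strip_dots_for=frozenset({"example.com"}))
canonical = normalize_email("  Jane.Doe+News@Example.COM ", strict)
```

Because the two policies produce different normalized strings for the same input, they also produce different hashes, which is why the policy version should travel with every HEM file.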
Minimal, widely interoperable workflow
- Normalize: email_normalized = trim(lowercase(email_raw))
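- Hash: hem = SHA-256(email_normalized), computed over the UTF-8 bytes of the string
- Encode the digest as hex (agree on letter case; lowercase is common); store/transmit the hex string.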
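The steps above can be sketched in Python with the standard library's `hashlib`. The function name `hashed_email` is illustrative, and the UTF-8 encoding and lowercase hex output are assumptions that should be confirmed with each counterpart.

```python
import hashlib


def hashed_email(email_raw: str) -> str:
    """Normalize (trim + lowercase), hash with SHA-256, and return lowercase hex."""
    email_normalized = email_raw.strip().lower()
    digest = hashlib.sha256(email_normalized.encode("utf-8"))
    return digest.hexdigest()


# Inputs that differ only in case or surrounding whitespace map to the same HEM.
hem_a = hashed_email("Jane.Doe@Example.com")
hem_b = hashed_email("  jane.doe@example.com ")
print(hem_a == hem_b, len(hem_a))
```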
More privacy-protective workflow (when both sides can coordinate)
- Normalize as above.
- Compute HMAC: hem = HMAC-SHA-256(secret_key, email_normalized)
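- Rotate keys, and use per-partner keys to prevent linkage across partners. Rotating a key changes every HEM derived from it, so lists must be re-hashed and re-exchanged under the new key.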
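A keyed variant of the same workflow can be sketched with Python's `hmac` module. The partner names and the randomly generated keys below are placeholders for illustration; in practice keys would come from a key-management system and be shared with each partner over a secure channel.

```python
import hashlib
import hmac
import secrets


def keyed_hem(secret_key: bytes, email_raw: str) -> str:
    """Return HMAC-SHA-256 of the normalized email as lowercase hex."""
    email_normalized = email_raw.strip().lower()
    return hmac.new(secret_key, email_normalized.encode("utf-8"),
                    hashlib.sha256).hexdigest()


# Hypothetical per-partner keys (demo values only).
partner_keys = {
    "partner_a": secrets.token_bytes(32),
    "partner_b": secrets.token_bytes(32),
}

hem_for_a = keyed_hem(partner_keys["partner_a"], "jane.doe@example.com")
hem_for_b = keyed_hem(partner_keys["partner_b"], "jane.doe@example.com")

# The same address yields unrelated values per partner, so the two partners
# cannot join their files on the HEM column.
print(hem_for_a != hem_for_b)
```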
Verification
- Hash a known test email using your policy and confirm equality with counterpart outputs. Version and log your normalization and hashing policy so future changes don’t break matches.
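One way to run this check is to hash a shared list of test emails and compare the results with the values the counterpart reports. This is a sketch under assumptions: the `POLICY` record, the helper names, and the simulated counterpart (which omits lowercasing) are all hypothetical.

```python
import hashlib

# Hypothetical policy record; log it alongside every batch of HEMs.
POLICY = {"version": "v1", "normalize": "trim+lowercase",
          "hash": "sha256", "encoding": "utf-8/hex"}


def hem(email_raw: str) -> str:
    """Unsalted SHA-256 HEM under the policy above."""
    return hashlib.sha256(email_raw.strip().lower().encode("utf-8")).hexdigest()


def find_mismatches(test_emails, counterpart_hems):
    """Return test emails whose HEM is missing from, or differs from, the counterpart's."""
    return [e for e in test_emails if counterpart_hems.get(e) != hem(e)]


test_emails = ["alice@example.com", "  Bob@Example.com "]

# Simulated counterpart that trims but forgets to lowercase.
counterpart = {e: hashlib.sha256(e.strip().encode("utf-8")).hexdigest()
               for e in test_emails}

mismatches = find_mismatches(test_emails, counterpart)
print(POLICY["version"], mismatches)
```

Including mixed-case and padded addresses in the test set matters here: an all-lowercase test email would pass even though the two policies differ.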
How to utilize
- Audience onboarding: Convert customer lists to HEM and upload to platforms that accept hashed identifiers to build or refresh custom audiences.
- Identity resolution: Use HEM as a deterministic key to join records across CRM, CDP, analytics, and ad platforms.
- Suppression and compliance: Share HEM-based suppression lists with partners without exposing raw emails.
- Frequency and reach management: Coordinate caps and deduplicate exposure across channels by matching HEMs in clean rooms.
- Attribution and measurement: Join impression/click logs to conversion files via HEM inside a privacy-safe environment.
- Look-alike modeling and enrichment: Seed modeling with HEM-mapped audiences where platforms support it.
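As an illustration of the suppression use case, the sketch below filters a customer list against a set of HEMs received from a partner, so neither side needs the other's plaintext addresses. The function names and sample addresses are hypothetical, and unsalted SHA-256 with trim and lowercase normalization is assumed on both sides.

```python
import hashlib


def hem(email_raw: str) -> str:
    """Unsalted SHA-256 HEM with trim + lowercase normalization."""
    return hashlib.sha256(email_raw.strip().lower().encode("utf-8")).hexdigest()


def apply_suppression(customer_emails, suppressed_hems):
    """Keep only customers whose HEM does not appear on the suppression list."""
    return [e for e in customer_emails if hem(e) not in suppressed_hems]


# The partner sends only hashes, never raw addresses.
suppressed_hems = {hem("opted.out@example.com")}

customers = ["Opted.Out@example.com ", "active@example.com"]
mailable = apply_suppression(customers, suppressed_hems)
print(mailable)
```

The same set-membership pattern underlies deterministic joins for identity resolution and measurement, with the HEM acting as the join key.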
Comparison to similar approaches
| Approach | What it is | Strengths | Limitations | Typical use | 
|---|---|---|---|---|
| Hashed Email (HEM) | One-way hash of normalized email | Deterministic match, privacy-preserving vs raw email, cookie-independent | Vulnerable to guessing if unsalted; requires consistent normalization | Onboarding, suppression, identity resolution | 
| Raw Email | Plaintext email address | Highest match rate and portability | High privacy and regulatory risk; restricted sharing | Internal systems, consented communications | 
| Phone Hash | Hash of phone number | Useful where phones are prevalent | Formatting variance; similar guessability risks | Onboarding, identity resolution | 
| MAID | Mobile advertising ID (IDFA/GAID) | Built for ads; device-level | Availability declining; consent constraints | Mobile attribution (where allowed) | 
| First-Party Cookie/ID | Site/app-scoped identifier | Strong within a domain/app | Poor cross-site portability | On-site personalization, analytics | 
| Publisher/Platform UID | Proprietary user IDs | High match within ecosystem | Walled-garden lock-in | Within a single platform | 
| Clean Room Join Keys | Encrypted/computed keys for joins | Strong privacy with compute-in-place | Setup overhead; requires partners | Measurement, reach, overlap | 
Best practices
- Document normalization policy and keep it consistent; version any change.
- Prefer SHA-256; avoid legacy MD5 for new work.
- Use HMAC or salted hashes when cross-party interoperability is not required; use per-partner keys.
- Minimize raw PII exposure: Hash at the edge; restrict access to plaintext emails to the smallest set of services and people.
- Encrypt in transit and at rest; implement key management with rotation and audit trails.
- Store only what you need: Retain normalized plaintext only if operationally required; otherwise keep just the HEM.
- Validate inputs: Check email syntax before hashing (e.g., against the RFC 5322 addr-spec grammar); reject nulls and placeholders.
- Multi-hash strategies: When interoperating broadly, you may store both SHA-256 (hex) and HMAC-SHA-256 variants, clearly labeled.
- Contractual controls: Specify hashing policies, allowed uses, retention, and deletion SLAs with partners.
- Testing and monitoring: Maintain test vectors, spot-check match rates, and alert on unexpected drift.
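Two of these practices, input validation and match-rate monitoring, lend themselves to small guard functions. The sketch below is illustrative only: the regular expression is a deliberately loose shape check rather than full RFC validation, and the placeholder list and the 0.05 tolerance are hypothetical values to tune per dataset.

```python
import re
from typing import Optional

# Loose shape check: something@something.tld with no spaces (not full RFC validation).
_SIMPLE_EMAIL = re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+")

# Hypothetical placeholder values seen in dirty data.
PLACEHOLDERS = frozenset({"test@test.com", "noemail@example.com"})


def is_hashable_email(value: Optional[str]) -> bool:
    """Return True if the value looks like a real email worth hashing."""
    if value is None:
        return False
    candidate = value.strip().lower()
    if not candidate or candidate in PLACEHOLDERS:
        return False
    return _SIMPLE_EMAIL.fullmatch(candidate) is not None


def match_rate_drifted(current_rate: float, baseline_rate: float,
                       tolerance: float = 0.05) -> bool:
    """Flag a match rate that moved more than `tolerance` from its baseline."""
    return abs(current_rate - baseline_rate) > tolerance


rows = [None, "", "test@test.com", "no-at-sign", " Jane@Example.com "]
valid_rows = [r for r in rows if is_hashable_email(r)]
print(valid_rows, match_rate_drifted(0.50, 0.62))
```

A sudden drop in match rate after a deployment is often the first visible sign that a normalization or hashing change slipped in without a policy version bump.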
Future trends
- Post-cookie ecosystem: Broader reliance on deterministic, consented identifiers such as HEM for audience building and measurement.
- Clean room adoption: More joins will occur via privacy-preserving computation with ephemeral, encrypted match keys rather than sharing raw HEMs.
- Per-partner cryptography: Migration from unsalted SHA-256 to HMAC or derived keys by partner to reduce linkage risk.
- Stronger compliance expectations: Clearer regulatory guidance on pseudonymous identifiers, with tighter consent, purpose limits, and retention controls.
- Interoperable frameworks: Growth of standardized schemas and normalization policies to reduce match friction across platforms.
Related Terms
- Identity Resolution
- Audience Onboarding
- Deterministic Matching
- Clean Room
- SHA-256
- HMAC
- Salt
- Personally Identifiable Information (PII)
- Customer Data Platform (CDP)
- Suppression List
