Configure deterministic identity matching and stitching
This task produces a functioning identity graph that links anonymous behavioral events to authenticated profile records, enabling the real-time customer profile to present a unified view of a customer across all channels. The output is verifiable in the Profile UI: a single profile record that shows events from both pre-login and post-login sessions, attributes from multiple source datasets, and segment membership derived from the combined data.
Deterministic vs. probabilistic stitching. Deterministic stitching fires when two records share an exact, declared identity value: for example, the same ECID value appearing in both an anonymous browsing event and a registration event. No inference is required; the match is exact. Probabilistic stitching (device graph, co-op) is a separate, optional layer that infers identity links from behavioral signals when no shared key exists. This task covers deterministic configuration only; probabilistic methods are out of scope because they introduce additional complexity and require vendor-specific data-sharing agreements.
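The exact-match behavior described above can be sketched in a few lines of Python. This is a minimal illustration, not any platform's actual implementation: the function name stitch, the record ids, and the sample identity values are all hypothetical. Records that share at least one identical (namespace, value) pair are placed in the same cluster using a union-find structure, so links are transitive (an anonymous event links to a CRM row through the registration event that carries both identifiers).

```python
from collections import defaultdict


def stitch(records):
    """Group record ids that share at least one exact identity value.

    `records` maps a record id to a set of (namespace, value) pairs.
    Returns a list of sets of record ids, one set per linked cluster.
    All names here are illustrative assumptions, not a vendor API.
    """
    parent = {rid: rid for rid in records}

    def find(x):
        # Walk to the cluster root, shortening the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    # Remember the first record seen for each identity; any later record
    # carrying the same exact identity is linked to it. No fuzzy matching.
    first_seen = {}
    for rid, identities in records.items():
        for identity in identities:
            if identity in first_seen:
                union(first_seen[identity], rid)
            else:
                first_seen[identity] = rid

    clusters = defaultdict(set)
    for rid in records:
        clusters[find(rid)].add(rid)
    return list(clusters.values())


# Hypothetical sample data: one visitor who later registers, plus a stranger.
records = {
    "anon_pageview": {("ECID", "ecid-111")},
    "registration": {("ECID", "ecid-111"), ("Email", "pat@example.com")},
    "crm_row": {("Email", "pat@example.com")},
    "other_visitor": {("ECID", "ecid-999")},
}
clusters = stitch(records)
```

In this sample, the pre-login page view, the registration event, and the CRM row end up in one cluster, while the unrelated visitor stays in a cluster of its own because it shares no identity value with the others.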
Namespace hierarchy. Platforms that implement identity resolution generally use a namespace concept to prevent false matches between identifiers from different systems (e.g., a CRM customer ID of "12345" should not match an email-system message ID of "12345"). Declaring the correct namespace for each identity field is therefore a data-quality gate, not just a labeling exercise. In Adobe Experience Platform (AEP), standard namespaces (ECID, Email, Phone) are pre-provisioned; custom namespaces (CRM ID, Loyalty ID) must be created in the Identity Service admin UI before they can be referenced in schemas.
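To make the data-quality gate concrete, the following Python sketch models an identity as a (namespace, value) pair and refuses namespaces that have not been declared. It is an illustration under stated assumptions only: the Identity type, the helper functions, and the namespace codes "CRMID" and "MessageID" are hypothetical and do not correspond to a real platform API.

```python
from typing import NamedTuple


class Identity(NamedTuple):
    """An identity value qualified by the namespace it belongs to."""
    namespace: str
    value: str


# Standard namespaces named in the text; treated as always available.
STANDARD_NAMESPACES = {"ECID", "Email", "Phone"}

# Custom namespaces must be registered before use (hypothetical registry).
custom_namespaces = set()


def register_custom_namespace(code):
    """Declare a custom namespace so identities may reference it."""
    custom_namespaces.add(code)


def make_identity(namespace, value):
    """Build an identity, rejecting namespaces that were never declared."""
    if namespace not in STANDARD_NAMESPACES | custom_namespaces:
        raise ValueError(f"namespace {namespace!r} has not been created")
    return Identity(namespace, value)


register_custom_namespace("CRMID")
register_custom_namespace("MessageID")

crm_id = make_identity("CRMID", "12345")
message_id = make_identity("MessageID", "12345")

# Same raw value, different namespaces: the identities are not equal,
# so an exact-match stitcher would not link them.
same_person = crm_id == message_id

# Referencing an undeclared namespace fails fast instead of matching loosely.
try:
    make_identity("LoyaltyID", "L-778")
    rejected = False
except ValueError:
    rejected = True
```

Comparing the whole pair rather than the bare value is the design choice that matters here: the raw string "12345" is identical in both systems, yet the two identities never compare equal.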
Merge rules. The identity graph produces a set of linked fragments; the profile merge rule determines how fragments from different datasets are combined when they conflict (e.g., two datasets report different values for homeAddress.city). The default rule is timestamp-ordered "last-write-wins": the most recently updated value is kept. Custom merge rules can instead prefer specific datasets by priority or use union semantics. The selected merge rule decides which attribute values appear in the merged profile record, and therefore which values any segment built on those attributes will evaluate.
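The difference between a last-write-wins rule and a rule that prefers specific datasets can be shown with a small Python sketch. The fragment layout (dataset, updated_at, attributes), the function names, and the sample values are assumptions made for illustration; they are not the platform's storage format or API.

```python
def merge_last_write_wins(fragments):
    """Combine fragments so the most recently updated value wins."""
    merged = {}
    # Apply oldest first; later fragments overwrite conflicting attributes.
    # ISO 8601 date strings sort correctly as plain strings.
    for fragment in sorted(fragments, key=lambda f: f["updated_at"]):
        merged.update(fragment["attributes"])
    return merged


def merge_dataset_preference(fragments, preference):
    """Combine fragments so datasets earlier in `preference` win conflicts."""
    rank = {name: position for position, name in enumerate(preference)}
    merged = {}
    # Apply the least preferred dataset first so preferred ones overwrite it.
    for fragment in sorted(
        fragments, key=lambda f: rank[f["dataset"]], reverse=True
    ):
        merged.update(fragment["attributes"])
    return merged


# Hypothetical fragments for one stitched profile.
fragments = [
    {
        "dataset": "crm",
        "updated_at": "2024-01-10",
        "attributes": {"homeAddress.city": "Austin", "loyaltyTier": "gold"},
    },
    {
        "dataset": "web",
        "updated_at": "2024-03-02",
        "attributes": {"homeAddress.city": "Denver"},
    },
]

by_timestamp = merge_last_write_wins(fragments)
by_preference = merge_dataset_preference(fragments, ["crm", "web"])
```

With these sample fragments, the timestamp-based merge keeps the newer web value for homeAddress.city, while the preference-based merge keeps the CRM value. Attributes that appear in only one fragment, such as loyaltyTier, survive under either rule.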
Parallel viability (moderate). Deterministic identity resolution is available in Segment (Personas Identity Graph), Snowflake (identity join tables with Hightouch or dbt), and most major CDPs. The namespace + exact-match pattern is common to all of them; what varies is the UI surface for declaring namespaces and validating the graph. Phase 3 will document Segment Identity Graph configuration and a Snowflake identity resolution SQL pattern.