ComposableStack.AI CDP
Operational task · module2 · status: complete

Configure XDM schemas and profile-enabled datasets

Design XDM schemas for both profile and event data by selecting appropriate field groups, defining primary and secondary identity fields with namespace assignments, enabling schemas for profile ingestion, and creating linked datasets that hydrate the real-time customer profile.

This task produces two reusable schemas — one for identity-bearing profile attributes, one for timestamped behavioral events — and a pair of datasets linked to those schemas that feed the real-time customer profile. The output is an operational data model that all downstream tasks (streaming ingestion, batch ingestion, segmentation, query service) can reference by name. Getting this right at design time avoids costly schema migrations later and ensures that all ingested data can be unified into a single profile view.

Schema class selection. The XDM Individual Profile class is appropriate for records that represent a persistent entity: CRM records, loyalty profiles, consent preferences. The XDM ExperienceEvent class is appropriate for time-stamped interaction records: page views, product views, purchases, call-center interactions. Both classes can coexist in the same profile store — the identity graph links them by shared identity namespaces.

Identity namespace hierarchy. Each schema must designate exactly one Primary Identity field. The choice of primary identity determines how profile fragments are stitched: if different channels use different primary identifiers, the identity graph creates links between them when a shared secondary identifier is observed. For a Profile schema, email is typically primary because it is stable and human-readable; for an ExperienceEvent schema, ECID is typically primary because it is available from the first anonymous page view. Both fields should be present on both schema classes to enable stitching.
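
The stitching behaviour described above can be illustrated with a small union-find over (namespace, value) pairs. This is a minimal sketch of the general idea, not AEP's identity graph implementation; the fragment names, the sample identifiers, and the `stitch` helper are all hypothetical, and every fragment is assumed to carry at least one identity.

```python
from collections import defaultdict


def stitch(fragments):
    """Group fragments that share any (namespace, value) identity pair."""
    parent = {}

    def find(x):
        # Register unseen nodes as their own root, then walk up with path halving.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    # Every identity observed on the same fragment belongs to the same person.
    for frag in fragments:
        ids = list(frag["identities"].items())
        for other in ids:
            union(ids[0], other)

    # Collect fragment names under the root of their first identity.
    groups = defaultdict(list)
    for frag in fragments:
        first_id = next(iter(frag["identities"].items()))
        groups[find(first_id)].append(frag["name"])
    return sorted(sorted(names) for names in groups.values())


# Hypothetical fragments: a CRM record keyed on email that also carries an ECID,
# and two anonymous web events keyed on ECID only.
fragments = [
    {"name": "crm_record", "identities": {"Email": "ana@example.com", "ECID": "ecid-111"}},
    {"name": "web_event_1", "identities": {"ECID": "ecid-111"}},
    {"name": "web_event_2", "identities": {"ECID": "ecid-222"}},
]
profiles = stitch(fragments)
```

Here the CRM record and the first web event end up in one group because they share an ECID, while the second web event stays separate until some record links `ecid-222` to a known identifier. This is why the text recommends carrying both email and ECID on both schema classes.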

Profile activation. Enabling the Profile toggle on a schema is a one-way action: once the schema is enabled and saved, it cannot be disabled again. From that point, data ingested against the schema can contribute to the Real-time Customer Profile. Datasets must also be enabled for Profile individually at the dataset level; a profile-enabled schema does not automatically make every dataset linked to it profile-active.
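
The two-level gate described above (schema toggle plus dataset toggle) can be modelled as a simple predicate. The classes and names below are hypothetical and exist only to illustrate the rule; they are not part of any AEP SDK.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Schema:
    name: str
    profile_enabled: bool


@dataclass(frozen=True)
class Dataset:
    name: str
    schema: Schema
    profile_enabled: bool


def contributes_to_profile(dataset: Dataset) -> bool:
    """A dataset feeds the profile only if both its schema and the dataset itself are enabled."""
    return dataset.schema.profile_enabled and dataset.profile_enabled


# Hypothetical example: one enabled schema with two datasets, only one of them enabled.
crm_schema = Schema("CRM Profile Schema", profile_enabled=True)
crm_live = Dataset("CRM Profile Dataset", crm_schema, profile_enabled=True)
crm_staging = Dataset("CRM Staging Dataset", crm_schema, profile_enabled=False)
```

Under this model `crm_live` contributes to the profile and `crm_staging` does not, even though both are linked to the same profile-enabled schema.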

Parallel viability (high). The pattern (semantic data model, identity key designation, profile store activation) recurs across modern CDP stacks. Segment Protocols defines event and trait schemas; in Snowflake, schema-on-write table definitions fix the identity join keys; Hightouch or Census maps CRM columns to activation audiences. Phase 3 will document equivalent Segment Protocols configuration and Snowflake identity table design.

Side-by-side implementations

Adobe Experience Platform (AEP) · confidence 85% · Auto-drafted, pending review

In AEP Schemas, create two schemas: one based on XDM Individual Profile (to answer "who is this customer?") and one based on XDM ExperienceEvent (to answer "what does this customer do?"). For the Profile schema, add the field groups Demographic Details, Personal Contact Details, and Preference Details, then create a custom field group with an identification object containing emailId (Primary Identity, Email namespace) plus two secondary identities, ecid (ECID namespace) and mobilenr (Phone namespace). For the ExperienceEvent schema, add Web Details, Commerce Details, and Environment Details, plus a custom field group with ecid as Primary Identity (ECID namespace). Enable the Profile toggle on both schemas, then create a dataset linked to each schema in the Datasets UI and enable Profile on each dataset. Datasets with Profile enabled contribute records to the Real-time Customer Profile upon successful batch or streaming ingestion.
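
The same UI steps can also be expressed as request payloads. The sketch below only builds the JSON bodies as Python dictionaries and makes no network calls. The field names follow the public Schema Registry and Catalog API documentation as recalled here and should be verified against the current docs before use; the tenant namespace, the field group placeholder, the schema ID placeholder, and all helper names are hypothetical.

```python
TENANT = "_mytenant"  # hypothetical tenant namespace; yours will differ
PROFILE_CLASS = "https://ns.adobe.com/xdm/context/profile"
EVENT_CLASS = "https://ns.adobe.com/xdm/context/experienceevent"


def schema_payload(title, class_ref, field_group_refs):
    """Body for creating a schema: the class plus its field groups, composed via allOf."""
    return {
        "type": "object",
        "title": title,
        "allOf": [{"$ref": class_ref}] + [{"$ref": ref} for ref in field_group_refs],
    }


def identity_descriptor(schema_id, source_property, namespace, is_primary):
    """Body for an identity descriptor that marks one field as an identity."""
    return {
        "@type": "xdm:descriptorIdentity",
        "xdm:sourceSchema": schema_id,
        "xdm:sourceVersion": 1,
        "xdm:sourceProperty": source_property,
        "xdm:namespace": namespace,
        "xdm:property": "xdm:code",
        "xdm:isPrimary": is_primary,
    }


def enable_profile_patch():
    """JSON Patch body that adds the union tag, the API counterpart of the Profile toggle."""
    return [{"op": "add", "path": "/meta:immutableTags", "value": ["union"]}]


def dataset_payload(name, schema_id):
    """Body for a dataset linked to a schema and tagged for Profile and identity use."""
    return {
        "name": name,
        "schemaRef": {
            "id": schema_id,
            "contentType": "application/vnd.adobe.xed+json;version=1",
        },
        "tags": {"unifiedProfile": ["enabled:true"], "unifiedIdentity": ["enabled:true"]},
    }


# Placeholders: real $id values come back from the Schema Registry.
profile_schema = schema_payload(
    "Profile Schema", PROFILE_CLASS, ["<custom-identification-field-group-$id>"]
)
schema_id = "<schema-$id-returned-by-the-registry>"
descriptors = [
    identity_descriptor(schema_id, f"/{TENANT}/identification/emailId", "Email", True),
    identity_descriptor(schema_id, f"/{TENANT}/identification/ecid", "ECID", False),
    identity_descriptor(schema_id, f"/{TENANT}/identification/mobilenr", "Phone", False),
]
profile_dataset = dataset_payload("Profile Dataset", schema_id)
```

Keeping the payloads as plain data makes it easy to check invariants, such as exactly one primary identity per schema, before anything is sent to the platform.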

Capability: Identity Resolution

Sources

  • source.tech-training-module2-ex2
  • source.tech-training-module2-ex3
  • source.experienceleague-adobe-com.en-docs-experience-platform-xdm-home-2026
Snowflake · confidence 85% · Auto-drafted, pending review

In a composable CDP built on Snowflake, data modeling lives in dbt. A dbt model (or a raw Snowflake table) for profile data defines columns equivalent to XDM Individual Profile fields — email, ECID, loyalty ID, mobile number, and demographic attributes. Identity columns (email, phone, loyalty_id) are designated as join keys in the dbt model rather than declared via a formal identity namespace registry. Dataset provisioning is table or view creation in Snowflake; no separate dataset-creation UI step is required. The profile-toggle equivalent is whether a table is included in the identity resolution dbt model and queried by the reverse-ETL tool (Hightouch, Census) for audience activation.
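
The join-key approach can be sketched with two tables and a join. The example below uses Python's built-in sqlite3 module as a stand-in for Snowflake so that it runs anywhere; the table names, column names, and sample rows are hypothetical, and a real project would define the equivalent tables as dbt models.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # in-memory stand-in for a Snowflake database

# Profile table: identity columns plus attributes. Event table: keyed on ecid.
conn.executescript(
    """
    CREATE TABLE profiles (
        email         TEXT PRIMARY KEY,
        ecid          TEXT,
        loyalty_id    TEXT,
        mobile_number TEXT,
        first_name    TEXT
    );
    CREATE TABLE events (
        event_id   TEXT PRIMARY KEY,
        ecid       TEXT NOT NULL,
        event_type TEXT NOT NULL,
        event_ts   TEXT NOT NULL
    );
    INSERT INTO profiles VALUES
        ('ana@example.com', 'ecid-111', 'L-001', '+15550100', 'Ana');
    INSERT INTO events VALUES
        ('e1', 'ecid-111', 'page_view', '2024-01-01T10:00:00Z'),
        ('e2', 'ecid-111', 'purchase',  '2024-01-01T10:05:00Z'),
        ('e3', 'ecid-222', 'page_view', '2024-01-02T09:00:00Z');
    """
)

# ecid acts as the join key: events resolve to a known profile when the key matches.
resolved = conn.execute(
    """
    SELECT e.event_id, p.email
    FROM events AS e
    LEFT JOIN profiles AS p ON p.ecid = e.ecid
    ORDER BY e.event_id
    """
).fetchall()
```

Events `e1` and `e2` resolve to the known profile, while `e3` stays anonymous (its email is NULL) because no profile row carries `ecid-222`. There is no toggle here: a table takes part in identity resolution simply by being joined in the model.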

Capability: Identity Resolution

Sources

  • source.docs-getdbt-com.building-a-dbt-project-models
  • source.docs-snowflake-com.user-guide-tables-creating
Hightouch · confidence 85% · Auto-drafted, pending review

In Hightouch, a "profile dataset" is a Source Model: a SQL SELECT statement run against a Snowflake table that produces one row per profile with identity fields and attributes. The model designates a primary key (equivalent to AEP's Primary Identity field) used to match and deduplicate records at the destination. Schema design is handled upstream in dbt: dbt source.yml defines expected column types and constraints (equivalent to XDM field groups), and dbt models JOIN identity and attribute tables into the unified profile shape Hightouch reads. There is no separate profile-toggle step; any Snowflake table with an identity column is immediately queryable by a Hightouch model. Hightouch reads a single unified model combining profile attributes (the dbt Individual Profile model) and aggregated event counts (the dbt ExperienceEvent model), rather than managing a separate schema hierarchy per class as AEP's XDM pattern does.
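
A unified model of this kind might look like the query below: one row per profile, keyed on email, with event counts aggregated from the event table. As a sketch it runs against sqlite3 instead of Snowflake, and the table names, columns, and sample rows are hypothetical. The final check mirrors the requirement that the model's primary key be unique.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # in-memory stand-in for a Snowflake database
conn.executescript(
    """
    CREATE TABLE individual_profile (email TEXT, ecid TEXT, first_name TEXT);
    CREATE TABLE experience_event (ecid TEXT, event_type TEXT);
    INSERT INTO individual_profile VALUES
        ('ana@example.com', 'ecid-111', 'Ana'),
        ('ben@example.com', 'ecid-333', 'Ben');
    INSERT INTO experience_event VALUES
        ('ecid-111', 'page_view'),
        ('ecid-111', 'purchase'),
        ('ecid-222', 'page_view');
    """
)

# The SELECT a source model could wrap: profile attributes plus aggregated event counts.
MODEL_SQL = """
SELECT
    p.email,
    p.first_name,
    COUNT(e.ecid) AS event_count,
    SUM(CASE WHEN e.event_type = 'purchase' THEN 1 ELSE 0 END) AS purchase_count
FROM individual_profile AS p
LEFT JOIN experience_event AS e ON e.ecid = p.ecid
GROUP BY p.email, p.first_name
ORDER BY p.email
"""
model_rows = conn.execute(MODEL_SQL).fetchall()

# The primary key (email) must identify exactly one row per profile.
primary_keys = [row[0] for row in model_rows]
primary_key_is_unique = len(primary_keys) == len(set(primary_keys))
```

The LEFT JOIN keeps profiles that have no events (Ben gets zero counts), and the GROUP BY collapses the joined events back to one row per email so the primary key stays unique.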

Capability: Identity Resolution

Sources

  • source.hightouch-com.docs-getting-started-models

Task-level sources

  • technical-training/module2/index.md
  • technical-training/module2/ex2.md
  • technical-training/module2/ex3.md
