Ingest streaming events via HTTP API or SDK
Configure an XDM-mapped streaming endpoint or SDK launch rule to deliver behavioral events to the platform in real time, verifying arrival through dataset monitoring.
This task produces a live event stream flowing from a web or mobile property into a platform dataset with sub-second latency. The output is a profile-enabled dataset whose record count grows, and whose batch-status indicators update, as users interact with the property. Streaming ingestion is the primary path for capturing real-time behavioral signals that must be available for segmentation and personalization within seconds of occurrence.
Endpoint configuration. Streaming ingestion requires a named endpoint (a Datastream in AEP, a Source slug in Segment) linked to a destination dataset and schema. The endpoint validates each inbound event against the target schema; events that fail validation are quarantined in an error dataset rather than silently dropped. Selecting the correct sandbox and enabling the "Profile" toggle on both the schema and the dataset are prerequisites: without the toggle, events land in storage but never hydrate profile records.
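The validate-then-quarantine behavior described above can be sketched in a few lines. This is a minimal illustration, not the platform's validator: the `REQUIRED_FIELDS` contract, the field names, and the `route` helper are all hypothetical, and a real XDM schema check is far richer than a type lookup.

```python
from typing import Any

# Hypothetical schema contract: field name -> expected Python type.
# A real XDM schema carries much more (nesting, enums, formats).
REQUIRED_FIELDS: dict[str, type] = {
    "eventType": str,
    "timestamp": str,
    "identity": dict,
}


def validate(event: dict[str, Any]) -> list[str]:
    """Return a list of problems; an empty list means the event passes."""
    problems = []
    for field, expected in REQUIRED_FIELDS.items():
        if field not in event:
            problems.append(f"missing field: {field}")
        elif not isinstance(event[field], expected):
            problems.append(f"wrong type for {field}")
    return problems


def route(events: list[dict[str, Any]]):
    """Split events into accepted records and quarantined records.

    Quarantined records keep the original event plus the reasons it failed,
    so nothing is silently dropped.
    """
    accepted, quarantined = [], []
    for event in events:
        problems = validate(event)
        if problems:
            quarantined.append({"event": event, "errors": problems})
        else:
            accepted.append(event)
    return accepted, quarantined
```

Keeping the failure reasons next to the rejected event mirrors the point made above: a quarantined record stays inspectable instead of disappearing.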
SDK vs. direct HTTP. An SDK (Web SDK, Mobile SDK) handles authentication, batching, retry on failure, and XDM serialization automatically. A direct HTTP API call gives more control for server-side or non-browser environments but requires the implementer to manage authentication tokens, construct the XDM payload explicitly, and handle HTTP errors. Both approaches result in the same inbound event shape; the choice is driven by deployment context.
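To make the direct-HTTP responsibilities concrete, the sketch below builds the request by hand and wraps sending in a simple retry loop. Treat it as an illustration under assumptions: the envelope layout (`header`, `body`, `xdmMeta`, `xdmEntity`, `schemaRef`) reflects a common shape for XDM streaming payloads and should be checked against the current API reference, while the endpoint URL, IDs, token, and both function names are hypothetical.

```python
import json
import time
import urllib.error
import urllib.request


def build_streaming_request(endpoint_url, schema_id, dataset_id, org_id,
                            event, token=None):
    """Construct (but do not send) a POST request carrying one XDM event."""
    schema_ref = {
        "id": schema_id,
        # Assumed content type for a full XDM schema reference.
        "contentType": "application/vnd.adobe.xed-full+json;version=1",
    }
    payload = {
        "header": {
            "schemaRef": schema_ref,
            "imsOrgId": org_id,
            "datasetId": dataset_id,
        },
        "body": {
            "xdmMeta": {"schemaRef": schema_ref},
            "xdmEntity": event,
        },
    }
    headers = {"Content-Type": "application/json"}
    if token:
        # With direct HTTP, token management is the implementer's job.
        headers["Authorization"] = f"Bearer {token}"
    return urllib.request.Request(
        endpoint_url,
        data=json.dumps(payload).encode("utf-8"),
        headers=headers,
        method="POST",
    )


def send_with_retry(send, request, attempts=3, base_delay=0.5,
                    sleep=time.sleep):
    """Call send(request), retrying transient failures with exponential backoff.

    HTTP 429 and 5xx responses and network errors are retried; other HTTP
    errors (for example a 400 from a malformed payload) are raised at once.
    """
    for attempt in range(attempts):
        try:
            return send(request)
        except urllib.error.HTTPError as err:
            retryable = err.code == 429 or err.code >= 500
            if not retryable or attempt == attempts - 1:
                raise
        except urllib.error.URLError:
            if attempt == attempts - 1:
                raise
        sleep(base_delay * 2 ** attempt)
```

In use, `send_with_retry(urllib.request.urlopen, request)` would perform the actual call. An SDK hides all of this, which is the trade-off the paragraph above describes.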
Verification. The Dataset Activity dashboard shows total record counts and a time-series of recent batches. Each batch carries a Batch ID that can be used to retrieve detailed error information if records fail validation. The AEP Debugger extension (for SDK-based deployments) allows real-time inspection of edge network requests before they reach the dataset.
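The checks described above (is the record count growing, and which Batch IDs need a closer look) can be sketched as two small helpers. The batch record shape used here (`id`, `status`, `recordCount`) is hypothetical and chosen only for illustration; the fields actually returned by the platform's monitoring views or APIs may differ.

```python
def summarise_batches(batches):
    """Total the records in successful batches and list failed Batch IDs."""
    records_ingested = sum(
        b["recordCount"] for b in batches if b["status"] == "success"
    )
    failed_batch_ids = [b["id"] for b in batches if b["status"] == "failed"]
    return {
        "records_ingested": records_ingested,
        "failed_batch_ids": failed_batch_ids,
    }


def is_growing(record_counts):
    """Return True if successive polls never decrease and the last exceeds the first."""
    if len(record_counts) < 2:
        return False
    never_decreases = all(
        later >= earlier
        for earlier, later in zip(record_counts, record_counts[1:])
    )
    return never_decreases and record_counts[-1] > record_counts[0]
```

The failed Batch IDs are the starting point for retrieving detailed error information; the growth check is a coarse signal that events are still arriving.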
Parallel viability (high). Most cloud data platforms provide an equivalent streaming ingest path: Segment's HTTP Source, Snowflake's Snowpipe, BigQuery's streaming insert API, or Kafka Connect. The structural pattern (endpoint, schema contract, dataset destination, monitoring) is universal. Phase 3 will document Segment HTTP Source configuration as a parallel implementation.
Side-by-side implementations
Parallel implementation not yet available.
Snowflake Snowpipe provides continuous, serverless ingestion of micro-batched files staged in S3, Azure Blob, or GCS into Snowflake tables with roughly minute-level latency. For true real-time, SDK-level streaming, Snowpipe Streaming (via the Snowflake Ingest SDK or the Snowflake Kafka connector configured for Snowpipe Streaming) writes rows through streaming channels directly into the target table, with latency on the order of seconds. Client-side event capture uses a JavaScript SDK, such as Segment Analytics.js or a custom first-party SDK, that sends events to a serverless proxy, which forwards them to a Snowpipe Streaming client or to a Kafka topic consumed by the Snowflake connector. Verification uses Snowflake's QUERY_HISTORY or a SELECT COUNT(*) on the target table with a timestamp filter to confirm records are arriving in near-real time.
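The timestamp-filtered count used for verification can be sketched against any DB-API cursor. The example below uses Python's built-in `sqlite3` purely as a stand-in so it runs anywhere; the table name `events`, the column `received_at`, and the helper name are hypothetical. Against Snowflake, the same query would go through the Snowflake Python connector, whose default parameter placeholder style is assumed to differ from SQLite's `?`, so the placeholder would need adjusting.

```python
import sqlite3
from datetime import datetime, timedelta, timezone


def count_recent_rows(cursor, table, ts_column, window_seconds, now=None):
    """Count rows whose timestamp falls within the last window_seconds."""
    # Identifiers cannot be bound as query parameters, so restrict them to
    # plain alphanumeric/underscore names before interpolating.
    for name in (table, ts_column):
        if not name.replace("_", "").isalnum():
            raise ValueError(f"unsafe identifier: {name!r}")
    now = now or datetime.now(timezone.utc)
    cutoff = (now - timedelta(seconds=window_seconds)).isoformat()
    cursor.execute(
        f"SELECT COUNT(*) FROM {table} WHERE {ts_column} >= ?", (cutoff,)
    )
    return cursor.fetchone()[0]


if __name__ == "__main__":
    # Stand-in database with ISO-8601 text timestamps (hypothetical data).
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE events (id INTEGER, received_at TEXT)")
    stamp = datetime.now(timezone.utc).isoformat()
    conn.execute("INSERT INTO events VALUES (1, ?)", (stamp,))
    print(count_recent_rows(conn.cursor(), "events", "received_at", 60))
```

Running the count twice, a few seconds apart, and seeing it rise is a simple signal that records are still arriving.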
Capability: Reverse-ETL (CDW-to-Destination Sync)
Parallel implementation not yet available.
Task-level sources
- technical-training/module2/index.md
- technical-training/module2/ex1.md
- technical-training/module2/ex2.md