CDP Recommendation Agent | ComposableStack.AI

Configuring a Kafka source connector for real-time event streaming produces a continuously running pipeline that translates Kafka topic messages into XDM-formatted experience events and delivers them to a downstream data platform's streaming ingestion endpoint. Kafka is the source of the events here; in Kafka Connect terms, the component that delivers them is a sink connector. The primary outputs are a registered connector instance in Kafka Connect, a verified RUNNING status for that connector, and an active dataflow in the target platform showing incoming records.
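As a rough illustration of what a "verified RUNNING status" can mean, the sketch below inspects the JSON that Kafka Connect returns from GET /connectors/<name>/status and reports every part that is not RUNNING. The function name and the sample payloads are hypothetical; only the connector/tasks/state layout comes from the Kafka Connect REST API.

```python
from typing import Any, Dict, List


def not_running(status: Dict[str, Any]) -> List[str]:
    """Return labels for the connector and tasks whose state is not RUNNING."""
    problems: List[str] = []

    connector_state = status.get("connector", {}).get("state", "UNKNOWN")
    if connector_state != "RUNNING":
        problems.append(f"connector:{connector_state}")

    tasks = status.get("tasks", [])
    if not tasks:
        # A connector with no tasks is registered but is not moving any data.
        problems.append("tasks:none")
    for task in tasks:
        state = task.get("state", "UNKNOWN")
        if state != "RUNNING":
            problems.append(f"task-{task.get('id')}:{state}")

    return problems


# Hypothetical status payloads in the shape returned by the status endpoint.
healthy = {
    "name": "aep-sink",
    "connector": {"state": "RUNNING", "worker_id": "10.0.0.1:8083"},
    "tasks": [{"id": 0, "state": "RUNNING", "worker_id": "10.0.0.1:8083"}],
}
degraded = {
    "name": "aep-sink",
    "connector": {"state": "RUNNING", "worker_id": "10.0.0.1:8083"},
    "tasks": [{"id": 0, "state": "FAILED", "worker_id": "10.0.0.1:8083"}],
}

if __name__ == "__main__":
    print(not_running(healthy))
    print(not_running(degraded))
```

Checking the tasks as well as the connector matters because a connector can report RUNNING while one of its tasks has failed and stopped delivering records.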

Key decisions include: which Kafka topics to consume (and whether to map one topic to one dataset or fan across multiple), whether to enable schema validation at the connector layer or rely on the ingestion endpoint for validation, and how to handle consumer group offsets and replay behavior for error recovery. Distributed mode (started with connect-distributed.sh, as opposed to standalone mode) is strongly recommended for production because it enables fault tolerance and horizontal scaling across multiple Kafka Connect worker nodes.
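The offset and replay decision can be sketched as follows, assuming the connector keeps Kafka Connect's default consumer group name for sink connectors (connect-<connector name>) and that the connector is stopped before offsets are reset. The helper names, connector name, topic, and broker address are hypothetical; the flags are those of the standard kafka-consumer-groups.sh tool.

```python
from typing import List


def sink_consumer_group(connector_name: str) -> str:
    """Default consumer group that Kafka Connect uses for a sink connector."""
    return f"connect-{connector_name}"


def replay_command(
    connector_name: str,
    topic: str,
    bootstrap: str = "localhost:9092",  # assumption: local broker
    execute: bool = False,
) -> List[str]:
    """Build a kafka-consumer-groups.sh call that rewinds the sink's offsets.

    With execute=False the tool only previews the new offsets (--dry-run).
    """
    return [
        "bin/kafka-consumer-groups.sh",
        "--bootstrap-server", bootstrap,
        "--group", sink_consumer_group(connector_name),
        "--topic", topic,
        "--reset-offsets",
        "--to-earliest",
        "--execute" if execute else "--dry-run",
    ]


if __name__ == "__main__":
    # Preview a full replay of a hypothetical topic for a hypothetical connector.
    print(" ".join(replay_command("aep-sink", "call-center-events")))
```

Defaulting to a dry run is a deliberate choice here: rewinding to the earliest offset redelivers every retained message, so the downstream dataset has to tolerate duplicates before the reset is executed.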

This task has close parallels across CDP architectures because Kafka is an infrastructure-layer technology that is independent of the downstream platform. The AEP Sink Connector is AEP-specific, but the functional equivalent for a Snowflake-based composable CDP is the Snowflake Kafka Connector, which streams topic messages directly into Snowflake tables. For dbt-managed pipelines, the upstream source is still Kafka, but the ingestion lands in a staging table that dbt then transforms. Teams evaluating composable alternatives should note that the schema contract (XDM vs. raw JSON vs. Avro) shifts depending on which connector is used, so schema governance decisions made at this stage have downstream implications for the data transformation layer.

Side-by-side implementations

Adobe Experience Platform (AEP) · confidence 85% · Auto-drafted, pending review

AEP's Kafka ingestion path requires two components to be configured in tandem. First, in AEP Sources (Sources Catalog → HTTP API), create an HTTP API Source Connector: create a new account with XDM Compatible mode enabled, link it to an existing dataset (e.g., Demo System - Event Dataset for Call Center), and capture the Streaming Endpoint URL (format: https://dcs.adobedc.net/collection/<hash>). Second, on the Kafka Connect side, edit connect-distributed.properties to set key.converter.schemas.enable=false and value.converter.schemas.enable=false, configure plugin.path to point to a connectors directory, place the AEP Sink Connector JAR (streaming-connect-sink-0.0.14-java-11.jar) in that directory, and start Kafka Connect with bin/connect-distributed.sh. Once running, use the Kafka Connect REST API (POST Create AEP Sink Connector) to register the connector instance with the Streaming Endpoint URL set as the aep.endpoint property. Verify with GET Available Kafka Connect connectors and GET Check Kafka Connect Connector Status to confirm the connector is in RUNNING state.
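The registration step can be sketched in Python rather than as a Postman request. This is a minimal sketch under stated assumptions: Kafka Connect's REST API is reachable on its default port 8083, the connector class name is the one published in Adobe's open-source streaming connector project (verify it against the JAR you deploy), and the connector name and topic are hypothetical. The <hash> placeholder in the endpoint is left as-is and must be replaced with the value captured from AEP.

```python
import json
import urllib.request
from typing import Any, Dict

CONNECT_URL = "http://localhost:8083"  # assumption: default Kafka Connect REST port


def build_aep_sink_payload(name: str, topics: str, endpoint: str) -> Dict[str, Any]:
    """Build the JSON body for POST /connectors."""
    return {
        "name": name,
        "config": {
            # Assumption: class name from Adobe's streaming connector project.
            "connector.class": "com.adobe.platform.streaming.sink.impl.AEPSinkConnector",
            "topics": topics,
            "tasks.max": "1",
            # Streaming Endpoint URL captured from the HTTP API source in AEP.
            "aep.endpoint": endpoint,
        },
    }


def register_connector(payload: Dict[str, Any], connect_url: str = CONNECT_URL) -> Dict[str, Any]:
    """POST the payload to Kafka Connect and return the parsed response."""
    request = urllib.request.Request(
        f"{connect_url}/connectors",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


if __name__ == "__main__":
    payload = build_aep_sink_payload(
        name="aep-sink",              # hypothetical connector name
        topics="call-center-events",  # hypothetical topic
        endpoint="https://dcs.adobedc.net/collection/<hash>",
    )
    # Printed only; call register_connector(payload) against a live worker.
    print(json.dumps(payload, indent=2))
```

Keeping payload construction separate from the HTTP call lets the configuration be reviewed or version-controlled before anything is sent to the Connect cluster.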

Capability: Audience Segmentation

Snowflake · confidence 85% · Auto-drafted, pending review

The Snowflake Kafka Connector (a Kafka Connect plugin available from Confluent Hub or Maven Central) streams Kafka topic messages into Snowflake tables with sub-minute latency. Setup involves two sides. On the Snowflake side, create a role, user, warehouse, and target table; if Snowpipe Streaming is the ingestion method, the connector opens its own ingestion channels against that table. On the Kafka Connect side, place the Snowflake Kafka Connector JAR in the plugin.path directory, configure connect-distributed.properties with key.converter.schemas.enable=false and value.converter.schemas.enable=false, and register the connector instance via the Kafka Connect REST API with the snowflake.url.name, snowflake.user.name, snowflake.private.key, snowflake.database.name, snowflake.schema.name, and topics properties. The setup closely mirrors the AEP Sink Connector: it uses the same distributed Kafka Connect infrastructure and REST API registration pattern, and differs mainly in the connector JAR and in how the connector authenticates (a Snowflake key pair rather than a streaming endpoint URL).
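For comparison with the AEP panel, here is a minimal sketch of the registration payload for the Snowflake connector, built from the properties listed above. The account URL, user, database, schema, topic, connector name, and the environment variable holding the private key are all hypothetical; the connector class is the one documented for the Snowflake Kafka Connector.

```python
import json
import os
from typing import Any, Dict, List

REQUIRED_KEYS = [
    "snowflake.url.name",
    "snowflake.user.name",
    "snowflake.private.key",
    "snowflake.database.name",
    "snowflake.schema.name",
    "topics",
]


def build_snowflake_sink_payload(name: str, settings: Dict[str, str]) -> Dict[str, Any]:
    """Build the JSON body for POST /connectors from the listed properties."""
    config = {
        "connector.class": "com.snowflake.kafka.connector.SnowflakeSinkConnector",
        "tasks.max": "1",
    }
    config.update(settings)
    return {"name": name, "config": config}


def missing_keys(payload: Dict[str, Any]) -> List[str]:
    """List required properties that are absent or empty in the payload."""
    config = payload.get("config", {})
    return [key for key in REQUIRED_KEYS if not config.get(key)]


if __name__ == "__main__":
    payload = build_snowflake_sink_payload(
        "snowflake-sink",  # hypothetical connector name
        {
            "snowflake.url.name": "myaccount.snowflakecomputing.com:443",  # hypothetical
            "snowflake.user.name": "KAFKA_CONNECT_USER",                   # hypothetical
            # Read the key from the environment instead of hard-coding it.
            "snowflake.private.key": os.environ.get("SNOWFLAKE_PRIVATE_KEY", ""),
            "snowflake.database.name": "CDP_RAW",                          # hypothetical
            "snowflake.schema.name": "EVENTS",                             # hypothetical
            "topics": "call-center-events",                                # hypothetical
        },
    )
    print("missing:", missing_keys(payload))
    # Avoid printing the private key; show only the property names being sent.
    print(json.dumps(sorted(payload["config"].keys()), indent=2))
```

The resulting body is posted to the same /connectors endpoint as the AEP payload, which is what makes the two setups comparable at the Kafka Connect layer.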

Capability: Reverse-ETL (CDW-to-Destination Sync)

Hightouch

Parallel implementation not yet available.

