Operational task · module12 · status: complete

Connect a BigQuery data source and ingest GA4 event data

Authenticate a cloud data warehouse via OAuth, map its event table fields to a target schema, configure a delta-based ingestion schedule, and launch the dataflow to land GA4 behavioral data in the platform.

GA4 exports behavioral event data to BigQuery in a well-defined table structure, making BigQuery a common staging layer for organizations that want to combine web analytics with their customer profile data. Connecting that staging layer to a CDP requires OAuth-based authentication to the cloud project, field-level mapping from the source table's schema to the platform's canonical event schema, and a scheduling strategy that pulls only new or changed rows on each run rather than reloading the full table.

The OAuth credential setup is often the most friction-heavy step: practitioners need GCP project-level permissions to create an OAuth application, must navigate the consent-screen configuration, and must explicitly authorize the BigQuery API scope to obtain a valid refresh token. Once credentials are in place, field mapping is the analytical judgment task: deciding which GA4 dimensions map to which target-schema paths (XDM paths in AEP) determines whether the ingested data can later be joined to profile records for identity resolution. The choice of delta field determines ingestion efficiency: a fine-grained timestamp column whose values increase with each new row enables true incremental loads, while a coarse-grained or missing delta field forces full table scans.
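The delta-field idea can be sketched as a simple watermark filter. This is a minimal illustration, not any connector's actual implementation: `select_delta_rows` and the in-memory `rows` list are hypothetical, and `event_timestamp` is assumed to be the GA4 export column holding microseconds since the epoch.

```python
def select_delta_rows(rows, last_watermark, delta_field="event_timestamp"):
    """Return only rows newer than the previous run, plus the new watermark.

    Hypothetical helper: a real connector would push this filter down
    to the warehouse as a WHERE clause instead of filtering in memory.
    """
    new_rows = [r for r in rows if r[delta_field] > last_watermark]
    # If nothing new arrived, keep the old watermark unchanged.
    new_watermark = max((r[delta_field] for r in new_rows), default=last_watermark)
    return new_rows, new_watermark


# Illustrative rows; timestamps are made-up microsecond values.
rows = [
    {"event_name": "page_view", "event_timestamp": 1_700_000_000_000_000},
    {"event_name": "scroll", "event_timestamp": 1_700_000_005_000_000},
    {"event_name": "purchase", "event_timestamp": 1_700_000_009_000_000},
]

first_batch, watermark = select_delta_rows(rows, last_watermark=1_700_000_000_000_000)
second_batch, watermark_after = select_delta_rows(rows, last_watermark=watermark)
```

The first run picks up only the two rows newer than the starting watermark, and the second run finds nothing new. The strict `>` comparison is a design choice: it avoids reloading the boundary row, but it can miss late-arriving rows that share the watermark's timestamp, which is one reason a coarse delta field (for example a date) works poorly.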

Parallel viability: High parallelism. BigQuery-to-CDP ingestion is a vendor-agnostic pattern. Composable equivalents include BigQuery → dbt transformations → Snowflake → Segment (Connections) or BigQuery → Fivetran/Airbyte → any warehouse with a CDP connector. The AEP BigQuery Source Connector adds value specifically when the destination schema is AEP's XDM model and the data needs to participate in AEP's Real-time Customer Profile; teams using a different profile store or analytical warehouse should evaluate the composable path on its merits.

Side-by-side implementations

Adobe Experience Platform (AEP) · confidence 85% · Auto-drafted, pending review

Practitioners configure the BigQuery Source Connector in AEP by first setting up a GCP OAuth application: they configure the OAuth consent screen, then generate a clientId, clientSecret, and refreshToken (the refresh token via the OAuth 2.0 Playground with the BigQuery API scope). In the AEP Sources UI they create a new BigQuery connection, supplying projectId, clientId, clientSecret, and refreshToken. They then select the target BigQuery dataset and table (e.g., a GA4 export table), preview the schema, and map source fields to XDM paths, for example: GA_ID → identityMap.gaid, customerID → _tenantId.identification.core.loyaltyId, Page → web.webPageDetails.name, Device → environment.type, timestamp → timestamp. A delta field (the timestamp column) is designated to enable incremental loads. The dataflow is then named and launched, and AEP runs scheduled ingestion pulls at the configured interval.
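To make the mapping step concrete, here is a rough sketch that applies the example mappings above to a single source row. The helper names (`set_path`, `map_row`) and the sample row are hypothetical, and the sketch treats every XDM path as plain nested keys; real XDM fields such as identityMap have more structure than a single string value, so this illustrates the idea rather than what the AEP mapper produces.

```python
# Source column -> dotted target path, taken from the example mappings above.
FIELD_MAP = {
    "GA_ID": "identityMap.gaid",
    "customerID": "_tenantId.identification.core.loyaltyId",
    "Page": "web.webPageDetails.name",
    "Device": "environment.type",
    "timestamp": "timestamp",
}


def set_path(target, dotted_path, value):
    """Write value into a nested dict, creating intermediate levels as needed."""
    keys = dotted_path.split(".")
    node = target
    for key in keys[:-1]:
        node = node.setdefault(key, {})
    node[keys[-1]] = value


def map_row(row, field_map=FIELD_MAP):
    """Build a nested record from a flat row; unmapped columns are dropped."""
    record = {}
    for source_field, target_path in field_map.items():
        if source_field in row:
            set_path(record, target_path, row[source_field])
    return record


# Made-up sample row for illustration only.
row = {
    "GA_ID": "GA1.2.12345.67890",
    "customerID": "L-42",
    "Page": "home",
    "Device": "mobile",
    "timestamp": "2024-01-01T00:00:00Z",
    "internal_flag": True,  # not in FIELD_MAP, so it is not carried over
}
record = map_row(row)
```

Here `record["web"]["webPageDetails"]["name"]` holds the page name and the loyalty ID sits under the `_tenantId` namespace, which is the field later joins against profile records would rely on.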

Capability: Reverse-ETL (CDW-to-Destination Sync)

Sources

  • source.tech-training-module12-ex3
  • source.tech-training-module12-ex4
  • source.experienceleague-adobe-com.en-docs-experience-platform-sources-bigquery-2026

Snowflake · confidence 85% · Auto-drafted, pending review

The composable equivalent of AEP's BigQuery Source Connector is Fivetran (or Airbyte) configured with a BigQuery source connector that replicates the GA4 export tables from BigQuery into Snowflake. Setup involves creating a GCP Service Account with the BigQuery Data Viewer role, providing the project ID and dataset name to Fivetran, selecting the date-sharded events_* tables, and configuring a daily or continuous sync cadence. In Snowflake, the replicated GA4 events land in a raw schema; a dbt model normalizes the nested GA4 arrays (event_params, user_properties) using LATERAL FLATTEN. XDM field mapping is replaced by dbt column aliases and type casts that produce the same semantic fields (loyaltyId, page name, timestamp). The equivalent of the delta field is dbt's incremental materialization, filtering on the event_date column.
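The normalization that the dbt model performs can be illustrated outside SQL. The sketch below is Python rather than the LATERAL FLATTEN query itself, and `flatten_event_params` is a hypothetical helper; it assumes the GA4 export shape in which each event_params entry has a key and a value record with typed slots (string_value, int_value, float_value, double_value).

```python
VALUE_SLOTS = ("string_value", "int_value", "float_value", "double_value")


def flatten_event_params(event):
    """Pivot the nested event_params array into top-level columns."""
    flat = {k: v for k, v in event.items() if k != "event_params"}
    for param in event.get("event_params", []):
        value = param.get("value", {})
        # Use the first typed slot that is populated for this parameter.
        for slot in VALUE_SLOTS:
            if value.get(slot) is not None:
                flat[param["key"]] = value[slot]
                break
    return flat


# Made-up event for illustration only.
event = {
    "event_name": "page_view",
    "event_date": "20240101",
    "event_params": [
        {"key": "page_title", "value": {"string_value": "Home", "int_value": None}},
        {"key": "ga_session_id", "value": {"int_value": 123}},
    ],
}
flat = flatten_event_params(event)
```

After flattening, `page_title` and `ga_session_id` sit beside `event_name` and `event_date` as ordinary columns, which is the shape the column aliases and type casts then work from.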

Capability: Reverse-ETL (CDW-to-Destination Sync)

Sources

  • source.docs-snowflake-com.kafka-connector-overview
  • source.docs-getdbt-com.incremental-models

Hightouch

Parallel implementation not yet available.

Task-level sources

  • technical-training/module12/index.md
  • technical-training/module12/ex3.md
  • technical-training/module12/ex4.md
  • technical-training/module12/summary.md
