CDP Recommendation Agent | ComposableStack.AI

This task produces a named derived dataset containing the pre-computed results of a complex analytical query, ready for consumption by BI tools, downstream activation systems, or data science pipelines without incurring per-query execution cost. The output is a persistent dataset with a stable table name, a creation timestamp, and a row count that can be refreshed on a schedule. This pattern is the bridge between exploratory SQL analysis and production reporting pipelines.

CTAS as a pre-computation pattern. Create Table As Select (CTAS) is a standard SQL construct that stores a query's result set as a new table. In the context of platform query services, CTAS serves two additional purposes: it decouples query execution time from BI tool timeout limits (since the result is pre-materialized), and it creates a versioned snapshot of data at a specific point in time that can be compared with future snapshots to detect changes. Each CTAS execution overwrites the previous result unless a naming convention (e.g., appending a date suffix) is used to create a historical series.

Scheduling strategies. The appropriate refresh cadence depends on the downstream use case. Daily snapshots are sufficient for campaign targeting and executive dashboards. Hourly or sub-hourly snapshots are appropriate for operational monitoring (churn signals, support ticket volumes). Near-real-time views are better served by streaming queries or live BI connections to the raw dataset, bypassing CTAS entirely. The Query Service API's Schedules endpoint supports cron-style schedule expressions; the UI provides a visual schedule builder.

BI tool integration. Once a CTAS dataset exists, any PostgreSQL-compatible BI tool can connect using the Query Service connection string and query the derived table as if it were a native PostgreSQL table. AEP Query Service does not require a separate JDBC driver — the standard libpq or JDBC PostgreSQL driver is sufficient. Reports built against derived datasets are faster and more reliable than reports that query raw event datasets directly, because the aggregation has already been computed.

Parallel viability (high). CTAS and scheduled query materialization are universal patterns: Snowflake Dynamic Tables, BigQuery Materialized Views, dbt models, Databricks Delta Live Tables. The API-based query management pattern (submit → poll status → retrieve results) is also universal across cloud data warehouse APIs. Phase 3 will document equivalent Snowflake task scheduling and dbt model run configuration.

Schedule and export query results as derived datasets

Side-by-side implementations

Task-level sources

How is this implementation?