AggregatingMergeTree stores pre-aggregated values and/or aggregate states that are automatically merged during background compaction.
Use it with:
SimpleAggregateFunction for simple rollups (store merged values directly)AggregateFunction for complex aggregations (store aggregate states and merge them at read time)import { OlapTable, ClickHouseEngines, SimpleAggregated, Aggregated, DateTime} from "@514labs/moose-lib"; interface DailyStats { date: DateTime; user_id: string; // SimpleAggregateFunction(sum, UInt64) total_views: number & SimpleAggregated<"sum", number>; // SimpleAggregateFunction(max, Float64) max_score: number & SimpleAggregated<"max", number>; // SimpleAggregateFunction(anyLast, DateTime) last_activity: DateTime & SimpleAggregated<"anyLast", DateTime>;} const dailyStats = new OlapTable<DailyStats>("daily_stats", { engine: ClickHouseEngines.AggregatingMergeTree, orderByFields: ["date", "user_id"]}); interface MetricsById { id: string; /** * AggregateFunction(avg, Float64) * Stores the result of avgState(rating) and is merged with avgMerge(...) */ avg_rating: number & Aggregated<"avg", [number]>; /** * AggregateFunction(uniqExact, String) * Stores the result of uniqExactState(user_id) and is merged with uniqExactMerge(...) */ daily_uniques: number & Aggregated<"uniqExact", [string]>;} const metricsById = new OlapTable<MetricsById>("metrics_by_id", { engine: ClickHouseEngines.AggregatingMergeTree, orderByFields: ["id"],});Review the ClickHouse aggregate functions documentation for the complete list of supported functions.
Pre-aggregating with AggregatingMergeTree can reduce query latency by orders of magnitude for dashboard queries, since aggregation happens at insert time rather than query time.
A common pattern is populating an AggregatingMergeTree table from a materialized view:
import { OlapTable, MaterializedView, ClickHouseEngines, UInt64, SimpleAggregated, DateTime, sql} from "@514labs/moose-lib"; // Source: raw eventsinterface PageView { timestamp: DateTime; user_id: string; page: string; duration_ms: number;} const pageViews = new OlapTable<PageView>("page_views", { orderByFields: ["timestamp", "user_id"],}); // Target: daily aggregatesinterface DailyPageStats { date: DateTime; user_id: string; view_count: UInt64 & SimpleAggregated<"sum", UInt64>; total_duration: number & SimpleAggregated<"sum", number>; max_duration: number & SimpleAggregated<"max", number>;} const targetTable = new OlapTable<DailyPageStats>("daily_page_stats", { engine: ClickHouseEngines.AggregatingMergeTree, orderByFields: ["date", "user_id"],}); const mv = new MaterializedView<DailyPageStats>({ materializedViewName: "mv_daily_page_stats", selectTables: [pageViews], targetTable: targetTable, selectStatement: sql` SELECT toStartOfDay(${pageViews.columns.timestamp}) AS date, ${pageViews.columns.user_id} AS user_id, count() AS view_count, sum(${pageViews.columns.duration_ms}) AS total_duration, max(${pageViews.columns.duration_ms}) AS max_duration FROM ${pageViews} GROUP BY date, user_id `,});SimpleAggregateFunction column typesAggregateFunction / aggregate statesimport { OlapTable, MaterializedView, ClickHouseEngines, UInt64, SimpleAggregated, DateTime, sql} from "@514labs/moose-lib"; // Source: raw eventsinterface PageView { timestamp: DateTime; user_id: string; page: string; duration_ms: number;} const pageViews = new OlapTable<PageView>("page_views", { orderByFields: ["timestamp", "user_id"],}); // Target: daily aggregatesinterface DailyPageStats { date: DateTime; user_id: string; view_count: UInt64 & SimpleAggregated<"sum", UInt64>; total_duration: number & SimpleAggregated<"sum", number>; max_duration: number & SimpleAggregated<"max", number>;} const targetTable = new OlapTable<DailyPageStats>("daily_page_stats", { engine: ClickHouseEngines.AggregatingMergeTree, orderByFields: ["date", "user_id"],}); const mv = new MaterializedView<DailyPageStats>({ materializedViewName: "mv_daily_page_stats", selectTables: [pageViews], targetTable: targetTable, selectStatement: sql` SELECT toStartOfDay(${pageViews.columns.timestamp}) AS date, ${pageViews.columns.user_id} AS user_id, count() AS view_count, sum(${pageViews.columns.duration_ms}) AS total_duration, max(${pageViews.columns.duration_ms}) AS max_duration FROM ${pageViews} GROUP BY date, user_id `,});