Modeling Materialized Views

Overview

Materialized views are write-time transformations in ClickHouse. A static SELECT populates a destination table from one or more sources. You query the destination like any other table. The MaterializedView class wraps ClickHouse MATERIALIZED VIEW and keeps the SELECT explicit. When you edit the destination schema in code and update the SELECT accordingly, Moose applies the corresponding DDL, orders dependent updates, and backfills in dev mode, so the pipeline stays consistent as you iterate.

Materialized Views: ClickHouse vs. Other Databases

Requires two parts: a SELECT transformation and a destination table schema

Transformation is write-time: runs on INSERT into source table(s) and writes to the destination table

SQL‑native pipelines: MV 'triggers' live inside ClickHouse, so pipelines are defined and executed entirely in the database — no external orchestrator needed. Cascading MVs require correct DDL order.

In local dev, Moose Migrate generates and applies DDL to your local database.

On change, Moose Migrate:

Generates and applies destination table DDL when you update the schema in code

Applies DDL in dependency order across views and tables

Backfills in dev mode when the MaterializedView is created or updated

Hot‑reloads the view and destination table locally and keeps APIs in sync

Roadmap: inferring schema from SELECT

Today, destination schemas are declared in code and kept in sync manually with your SELECT. Moose Migrate coordinates DDL and dependencies when you make those changes. A future enhancement will infer the destination schema from the SELECT and update it automatically.

Dependency awareness

This dependency awareness is critical for cascading materialized views. Moose Migrate orders DDL across views and tables to avoid failed migrations and partial states.

Basic Usage

BasicUsage.ts

import { MaterializedView, OlapTable, ClickHouseEngines, sql } from "@514labs/moose-lib";import { sourceTable } from "path/to/SourceTable"; // or a view // Define the schema of the transformed rows-- this is static and it must match the results of your SELECT. It also represents the schema of your entire destination table.interface TargetSchema {  id: string;  average_rating: number;  num_reviews: number;}export const mv = new MaterializedView<TargetSchema>({  // The transformation to run on the source table  selectStatement: sql`  SELECT    ${sourceTable.columns.id},    avg(${sourceTable.columns.rating}) AS average_rating,     count(*) AS num_reviews     FROM ${sourceTable}    GROUP BY ${sourceTable.columns.id}  `,  // Reference to the source table(s) that the SELECT reads from  selectTables: [sourceTable],    // Creates a new OlapTable named "target_table" where the transformed rows are written to.  targetTable: {    name: "target_table",    orderByFields: ["id"],  },  // The name of the materialized view in ClickHouse  materializedViewName: "mv_to_target_table", });

Materialized View is like a trigger

The ClickHouse MATERIALIZED VIEW object acts like a trigger: on new inserts into the source table(s), it runs the SELECT and writes the transformed rows to the destination.

Quick Reference

ViewOptions.ts

import type { Sql } from "@514labs/moose-lib"; interface MaterializedViewConfig<T> {  // Static SELECT that computes the destination rows  selectStatement: string | Sql;   // Tables/views the query reads from  selectTables: (OlapTable<any> | View)[];   // Name of the ClickHouse MATERIALIZED VIEW object  materializedViewName: string;   // Destination table where materialized rows are stored  targetTable?:    | OlapTable<T>    | {        name: string;        engine?: ClickHouseEngines;        orderByFields?: (keyof T & string)[];      };   /** @deprecated prefer targetTable */  tableName?: string;  /** @deprecated prefer targetTable */  engine?: ClickHouseEngines;  /** @deprecated prefer targetTable */  orderByFields?: (keyof T & string)[];}

Modeling the Target Table

The destination table is where the transformed rows are written by the materialized view. You can model it in two ways:

Basic Transformation, Cleaning, Filtering, Denormalization

Create a narrower, query-optimized table from a wide source. Apply light transforms (cast, rename, parse) at write time.

Denormalization.ts

import { MaterializedView, sql } from "@514labs/moose-lib"; interface Dest { id: string; value: number; created_at: string } new MaterializedView<Dest>({  selectStatement: sql`    SELECT id, toInt32(value) AS value, created_at    FROM ${sourceTable}    WHERE active = 1  `,  selectTables: [sourceTable],  targetTable: { name: "proj_table" },  materializedViewName: "mv_to_proj_table",});

Aggregations

Fan-in Patterns

When you have multiple sources that you want to merge into a single destination table, its best to create an OlapTable and reference it in each MV that needs to write to it:

FanIn.ts

interface DailyCounts { day: string; user_id: string; events: number } // Create the destination table explicitlyconst daily = new OlapTable<DailyCounts>("daily_counts", {  engine: ClickHouseEngines.SummingMergeTree,  orderByFields: ["day", "user_id"],});  // MV 1 - write to the daily_counts tableconst webStmt = sql`SELECT toDate(ts) AS day, user_id, 1 AS events FROM ${webEvents}`;const mv1 = new MaterializedView<DailyCounts>({  selectStatement: webStmt,  selectTables: [webEvents],  targetTable: daily,  materializedViewName: "mv_web_to_daily_counts",}); // MV 2 - write to the daily_counts tableconst mobileStmt = sql`SELECT toDate(ts) AS day, user_id, 1 AS events FROM ${mobileEvents}`;const mv2 = new MaterializedView<DailyCounts>({  selectStatement: mobileStmt,  selectTables: [mobileEvents],  targetTable: daily,  materializedViewName: "mv_mobile_to_daily_counts",});

Blue/green schema migrations

Create a new table for a breaking schema change and use an MV to copy data from the old table; when complete, switch reads to the new table and drop just the MV and old table.

Blue/green table migrations

For more information on how to use materialized views to perform blue/green schema migrations, see the Schema Versioning guide.

Defining the transformation

The selectStatement (TypeScript) or select_statement (Python) is a static SQL query that Moose runs to transform data from your source table(s) into rows for the destination table.

Recommended: ClickHouse SQL Reference

Transformations are defined as ClickHouse SQL queries. We strongly recommend using the ClickHouse SQL reference and functions overview to help you develop your transformations.

Use the Moose sql template to interpolate tables and columns safely. This gives type-checked column references and prevents runtime parameters. Reference tables and columns via objects in your project (e.g., ${sourceTable}, ${sourceTable.columns.id}) rather than string literals.

Transformation.ts

import { MaterializedView, sql } from "@514labs/moose-lib"; interface Dest { id: string; name: string; day: string } const transformation = sql`  SELECT    ${users.columns.id}        AS id,    ${users.columns.name}      AS name,    toDate(${events.columns.ts}) AS day  FROM ${events}  JOIN ${users} ON ${events.columns.user_id} = ${users.columns.id}  WHERE ${events.columns.active} = 1`; new MaterializedView<Dest>({  selectStatement: transformation,  selectTables: [events, users],  targetTable: { name: "user_activity_by_day" },  materializedViewName: "mv_user_activity_by_day",});

The columns returned by your SELECT must exactly match the destination table schema.

Use column aliases (AS target_column_name) to align names.
All destination columns must be present in the SELECT, or the materialized view won't be created. Adjust your transformation or table schema so they match.

Advanced: Writing SELECT statements to Aggregated tables

Go to the Advanced: Writing SELECT statements to Aggregated tables section for more details.

Backfill Destination Tables

In dev mode, when a MaterializedView is created or updated, Moose automatically backfills the destination table by running your SELECT.

In production, backfill is not automatic to avoid duplicate data if new rows arrive during the backfill. See Backfill new Materialized Views for details.

S3Queue sources are not backfilled

Materialized views that source from S3Queue tables are not backfilled automatically. S3Queue tables only process new files added to S3 after the table is created - there is no historical data to backfill from. The MV will start populating as new files arrive in S3.

You can see the SQL that Moose will run to backfill the destination table when you generate the Migration Plan.

During dev mode, as soon as you save the MaterializedView, Moose will run the backfill and you can see the results in the destination table by querying it in your local ClickHouse instance.

Query Destination Tables

You can query the destination table like any other table.

Advanced: Querying Aggregated tables

Go to the Querying Aggregated tables section for more details on how to query Aggregated tables.

Advanced: Aggregations + Materialized Views

This section dives deeper into advanced patterns and tradeoffs when building aggregated materialized views.

Target Tables with `AggregatingMergeTree`

When using an AggregatingMergeTree target table, you must use the AggregateFunction type to model the result of the aggregation functions:

AggTransform.ts

import { MaterializedView, ClickHouseEngines, Aggregated, sql } from "@514labs/moose-lib";  interface MetricsById {  id: string;  /**   * Result of avgState(events.rating)   *   - avgState(number) returns number, so model the type as number   *   - Aggregated arg type is [number] because the column (events.rating) is a number   *   - Aggregated function name is "avg"   */  avg_rating: number & Aggregated<"avg", [number]>;  /**   * Result of uniqExactState(events.user_id)  *  - uniqExact returns an integer; use number & ClickHouseInt<"uint64"> for precision  *  - Aggregated arg type is [string] because the column (events.user_id) is a string  *  - Aggregated function name is "uniqExact"  */  daily_uniques: number & ClickHouseInt<"uint64"> & Aggregated<"uniqExact", [string]>;} //  All Aggregate Functions in this query have a [functionName][State]() suffix.const stmt = sql`  SELECT    ${events.columns.id} AS id,    avgState(${events.columns.rating})      AS avg_rating,    uniqExactState(${events.columns.user_id}) AS daily_uniques  FROM ${events}  GROUP BY ${events.columns.id}`;  new MaterializedView<MetricsById>({  selectStatement: stmt,  selectTables: [events],  targetTable: {    name: "metrics_by_id",    engine: ClickHouseEngines.AggregatingMergeTree,    orderByFields: ["id"],  },  materializedViewName: "mv_metrics_by_id",});

Common mistakes

Using avg()/uniqExact() in the SELECT instead of avgState()/uniqExactState()
Forgetting to annotate the schema with Aggregated<...> (TypeScript) or AggregateFunction(...) (Python) so the target table can be created correctly
Mismatch between GROUP BY keys in your SELECT and the orderByFields (TypeScript) or order_by_fields (Python) of your target table

Modeling columns with `AggregateFunction`

Pattern: U & Aggregated<"agg_func_name", [Types]>
U is the read-time type (e.g., number, string)
agg_func_name is the aggregation name (e.g., avg, uniqExact)
Types are the argument types. These are the types of the columns that are being aggregated.

FunctionToTypeMapping.ts

number & Aggregated<"avg", [number]> // avgState(col: number)number & ClickHouseInt<"uint64"> & Aggregated<"uniqExact", [string]> // uniqExactState(col: string)number & ClickHouseInt<"uint64"> & Aggregated<"count", []> // countState(col: any) string & Aggregated<"argMax", [string, Date]> // argMaxState(col: string, value: Date)string & Aggregated<"argMin", [string, Date]> // argMinState(col: string, value: Date) number & Aggregated<"corr", [number, number]> // corrState(col1: number, col2: number)

Writing SELECT statements to Aggregated tables

When you write to an AggregatingMergeTree table, you must add a State suffix to the aggregation functions in your SELECT statement.

AggTransform.ts

import { MaterializedView, ClickHouseEngines, Aggregated, sql } from "@514labs/moose-lib"; interface MetricsById {  id: string;  avg_rating: number & Aggregated<"avg", [number]>;  total_reviews: number & Aggregated<"sum", [number]>;} const aggStmt = sql`  SELECT    ${reviews.columns.id} AS id,    avgState(${reviews.columns.rating}) AS avg_rating,    countState(${reviews.columns.id})   AS total_reviews  FROM ${reviews}  GROUP BY ${reviews.columns.id}`; const mv = new MaterializedView<MetricsById>({  selectStatement: aggStmt,  selectTables: [reviews],  targetTable: {    name: "metrics_by_id",    engine: ClickHouseEngines.AggregatingMergeTree,    orderByFields: ["id"],  },  materializedViewName: "mv_metrics_by_id",});

Warning:

Why states? Finalized values (e.g., avg()) are not incrementally mergeable. Storing states lets ClickHouse maintain results efficiently as new data arrives. Docs: https://clickhouse.com/docs/en/sql-reference/aggregate-functions/index and https://clickhouse.com/docs/en/sql-reference/aggregate-functions/combinators#-state

Querying Aggregated Tables

When you query a table with an AggregatingMergeTree engine, you must use aggregate functions with the Merge suffix (e.g., avgMerge) or rely on Moose's Aggregated typing plus sql to auto-finalize at query time (TypeScript only).

QueryAgg.ts

import { sql } from "@514labs/moose-lib"; // Auto-finalized via Aggregated + sqlconst cols = mv.targetTable.columns; // mv from earlier Agg exampleconst autoFinalized = sql`  SELECT ${cols.avg_rating}, ${cols.total_reviews}  FROM ${mv.targetTable}  WHERE ${cols.id} = '123'`; // Manual finalization (explicit ...Merge)const manual = sql`  SELECT    avgMerge(avg_rating)   AS avg_rating,    countMerge(total_reviews) AS total_reviews  FROM metrics_by_id  WHERE id = '123'`;

Choosing the right engine

Overview: Which engine should I use?

Use MergeTree for copies/filters/enrichment without aggregation semantics.
Use SummingMergeTree when all measures are additive, and you want compact, eventually-consistent sums.
Use AggregatingMergeTree for non-additive metrics and advanced functions; store states and finalize on read.
Use ReplacingMergeTree for dedup/upserts or as an idempotent staging layer before rollups.

Modeling Materialized Views

Overview

Materialized Views: ClickHouse vs. Other Databases

Requires two parts: a SELECT transformation and a destination table schema

Transformation is write-time: runs on INSERT into source table(s) and writes to the destination table

SQL‑native pipelines: MV 'triggers' live inside ClickHouse, so pipelines are defined and executed entirely in the database — no external orchestrator needed. Cascading MVs require correct DDL order.

In local dev, Moose Migrate generates and applies DDL to your local database.

On change, Moose Migrate:

Generates and applies destination table DDL when you update the schema in code

Applies DDL in dependency order across views and tables

Backfills in dev mode when the MaterializedView is created or updated

Hot‑reloads the view and destination table locally and keeps APIs in sync

Roadmap: inferring schema from SELECT

Dependency awareness

This dependency awareness is critical for cascading materialized views. Moose Migrate orders DDL across views and tables to avoid failed migrations and partial states.

Basic Usage

BasicUsage.ts

import { MaterializedView, OlapTable, ClickHouseEngines, sql } from "@514labs/moose-lib";import { sourceTable } from "path/to/SourceTable"; // or a view // Define the schema of the transformed rows-- this is static and it must match the results of your SELECT. It also represents the schema of your entire destination table.interface TargetSchema {  id: string;  average_rating: number;  num_reviews: number;}export const mv = new MaterializedView<TargetSchema>({  // The transformation to run on the source table  selectStatement: sql`  SELECT    ${sourceTable.columns.id},    avg(${sourceTable.columns.rating}) AS average_rating,     count(*) AS num_reviews     FROM ${sourceTable}    GROUP BY ${sourceTable.columns.id}  `,  // Reference to the source table(s) that the SELECT reads from  selectTables: [sourceTable],    // Creates a new OlapTable named "target_table" where the transformed rows are written to.  targetTable: {    name: "target_table",    orderByFields: ["id"],  },  // The name of the materialized view in ClickHouse  materializedViewName: "mv_to_target_table", });

BasicUsage.ts

import { MaterializedView, OlapTable, ClickHouseEngines, sql } from "@514labs/moose-lib";import { sourceTable } from "path/to/SourceTable"; // or a view // Define the schema of the transformed rows-- this is static and it must match the results of your SELECT. It also represents the schema of your entire destination table.interface TargetSchema {  id: string;  average_rating: number;  num_reviews: number;}export const mv = new MaterializedView<TargetSchema>({  // The transformation to run on the source table  selectStatement: sql`  SELECT    ${sourceTable.columns.id},    avg(${sourceTable.columns.rating}) AS average_rating,     count(*) AS num_reviews     FROM ${sourceTable}    GROUP BY ${sourceTable.columns.id}  `,  // Reference to the source table(s) that the SELECT reads from  selectTables: [sourceTable],    // Creates a new OlapTable named "target_table" where the transformed rows are written to.  targetTable: {    name: "target_table",    orderByFields: ["id"],  },  // The name of the materialized view in ClickHouse  materializedViewName: "mv_to_target_table", });

Materialized View is like a trigger

The ClickHouse MATERIALIZED VIEW object acts like a trigger: on new inserts into the source table(s), it runs the SELECT and writes the transformed rows to the destination.

Quick Reference

ViewOptions.ts

import type { Sql } from "@514labs/moose-lib"; interface MaterializedViewConfig<T> {  // Static SELECT that computes the destination rows  selectStatement: string | Sql;   // Tables/views the query reads from  selectTables: (OlapTable<any> | View)[];   // Name of the ClickHouse MATERIALIZED VIEW object  materializedViewName: string;   // Destination table where materialized rows are stored  targetTable?:    | OlapTable<T>    | {        name: string;        engine?: ClickHouseEngines;        orderByFields?: (keyof T & string)[];      };   /** @deprecated prefer targetTable */  tableName?: string;  /** @deprecated prefer targetTable */  engine?: ClickHouseEngines;  /** @deprecated prefer targetTable */  orderByFields?: (keyof T & string)[];}

ViewOptions.ts

import type { Sql } from "@514labs/moose-lib"; interface MaterializedViewConfig<T> {  // Static SELECT that computes the destination rows  selectStatement: string | Sql;   // Tables/views the query reads from  selectTables: (OlapTable<any> | View)[];   // Name of the ClickHouse MATERIALIZED VIEW object  materializedViewName: string;   // Destination table where materialized rows are stored  targetTable?:    | OlapTable<T>    | {        name: string;        engine?: ClickHouseEngines;        orderByFields?: (keyof T & string)[];      };   /** @deprecated prefer targetTable */  tableName?: string;  /** @deprecated prefer targetTable */  engine?: ClickHouseEngines;  /** @deprecated prefer targetTable */  orderByFields?: (keyof T & string)[];}

Modeling the Target Table

The destination table is where the transformed rows are written by the materialized view. You can model it in two ways:

Basic Transformation, Cleaning, Filtering, Denormalization

Create a narrower, query-optimized table from a wide source. Apply light transforms (cast, rename, parse) at write time.

Denormalization.ts

import { MaterializedView, sql } from "@514labs/moose-lib"; interface Dest { id: string; value: number; created_at: string } new MaterializedView<Dest>({  selectStatement: sql`    SELECT id, toInt32(value) AS value, created_at    FROM ${sourceTable}    WHERE active = 1  `,  selectTables: [sourceTable],  targetTable: { name: "proj_table" },  materializedViewName: "mv_to_proj_table",});

Denormalization.ts

import { MaterializedView, sql } from "@514labs/moose-lib"; interface Dest { id: string; value: number; created_at: string } new MaterializedView<Dest>({  selectStatement: sql`    SELECT id, toInt32(value) AS value, created_at    FROM ${sourceTable}    WHERE active = 1  `,  selectTables: [sourceTable],  targetTable: { name: "proj_table" },  materializedViewName: "mv_to_proj_table",});

Aggregations

Fan-in Patterns

When you have multiple sources that you want to merge into a single destination table, its best to create an OlapTable and reference it in each MV that needs to write to it:

FanIn.ts

interface DailyCounts { day: string; user_id: string; events: number } // Create the destination table explicitlyconst daily = new OlapTable<DailyCounts>("daily_counts", {  engine: ClickHouseEngines.SummingMergeTree,  orderByFields: ["day", "user_id"],});  // MV 1 - write to the daily_counts tableconst webStmt = sql`SELECT toDate(ts) AS day, user_id, 1 AS events FROM ${webEvents}`;const mv1 = new MaterializedView<DailyCounts>({  selectStatement: webStmt,  selectTables: [webEvents],  targetTable: daily,  materializedViewName: "mv_web_to_daily_counts",}); // MV 2 - write to the daily_counts tableconst mobileStmt = sql`SELECT toDate(ts) AS day, user_id, 1 AS events FROM ${mobileEvents}`;const mv2 = new MaterializedView<DailyCounts>({  selectStatement: mobileStmt,  selectTables: [mobileEvents],  targetTable: daily,  materializedViewName: "mv_mobile_to_daily_counts",});

FanIn.ts

interface DailyCounts { day: string; user_id: string; events: number } // Create the destination table explicitlyconst daily = new OlapTable<DailyCounts>("daily_counts", {  engine: ClickHouseEngines.SummingMergeTree,  orderByFields: ["day", "user_id"],});  // MV 1 - write to the daily_counts tableconst webStmt = sql`SELECT toDate(ts) AS day, user_id, 1 AS events FROM ${webEvents}`;const mv1 = new MaterializedView<DailyCounts>({  selectStatement: webStmt,  selectTables: [webEvents],  targetTable: daily,  materializedViewName: "mv_web_to_daily_counts",}); // MV 2 - write to the daily_counts tableconst mobileStmt = sql`SELECT toDate(ts) AS day, user_id, 1 AS events FROM ${mobileEvents}`;const mv2 = new MaterializedView<DailyCounts>({  selectStatement: mobileStmt,  selectTables: [mobileEvents],  targetTable: daily,  materializedViewName: "mv_mobile_to_daily_counts",});

Blue/green schema migrations

Create a new table for a breaking schema change and use an MV to copy data from the old table; when complete, switch reads to the new table and drop just the MV and old table.

Blue/green table migrations

For more information on how to use materialized views to perform blue/green schema migrations, see the Schema Versioning guide.

Defining the transformation

The selectStatement (TypeScript) or select_statement (Python) is a static SQL query that Moose runs to transform data from your source table(s) into rows for the destination table.

Recommended: ClickHouse SQL Reference

Transformations are defined as ClickHouse SQL queries. We strongly recommend using the ClickHouse SQL reference and functions overview to help you develop your transformations.

Transformation.ts

import { MaterializedView, sql } from "@514labs/moose-lib"; interface Dest { id: string; name: string; day: string } const transformation = sql`  SELECT    ${users.columns.id}        AS id,    ${users.columns.name}      AS name,    toDate(${events.columns.ts}) AS day  FROM ${events}  JOIN ${users} ON ${events.columns.user_id} = ${users.columns.id}  WHERE ${events.columns.active} = 1`; new MaterializedView<Dest>({  selectStatement: transformation,  selectTables: [events, users],  targetTable: { name: "user_activity_by_day" },  materializedViewName: "mv_user_activity_by_day",});

Transformation.ts

import { MaterializedView, sql } from "@514labs/moose-lib"; interface Dest { id: string; name: string; day: string } const transformation = sql`  SELECT    ${users.columns.id}        AS id,    ${users.columns.name}      AS name,    toDate(${events.columns.ts}) AS day  FROM ${events}  JOIN ${users} ON ${events.columns.user_id} = ${users.columns.id}  WHERE ${events.columns.active} = 1`; new MaterializedView<Dest>({  selectStatement: transformation,  selectTables: [events, users],  targetTable: { name: "user_activity_by_day" },  materializedViewName: "mv_user_activity_by_day",});

The columns returned by your SELECT must exactly match the destination table schema.

Use column aliases (AS target_column_name) to align names.
All destination columns must be present in the SELECT, or the materialized view won't be created. Adjust your transformation or table schema so they match.

Advanced: Writing SELECT statements to Aggregated tables

Go to the Advanced: Writing SELECT statements to Aggregated tables section for more details.

Backfill Destination Tables

In dev mode, when a MaterializedView is created or updated, Moose automatically backfills the destination table by running your SELECT.

In production, backfill is not automatic to avoid duplicate data if new rows arrive during the backfill. See Backfill new Materialized Views for details.

S3Queue sources are not backfilled

You can see the SQL that Moose will run to backfill the destination table when you generate the Migration Plan.

During dev mode, as soon as you save the MaterializedView, Moose will run the backfill and you can see the results in the destination table by querying it in your local ClickHouse instance.

Query Destination Tables

You can query the destination table like any other table.

Advanced: Querying Aggregated tables

Go to the Querying Aggregated tables section for more details on how to query Aggregated tables.

Advanced: Aggregations + Materialized Views

This section dives deeper into advanced patterns and tradeoffs when building aggregated materialized views.

Target Tables with `AggregatingMergeTree`

When using an AggregatingMergeTree target table, you must use the AggregateFunction type to model the result of the aggregation functions:

AggTransform.ts

import { MaterializedView, ClickHouseEngines, Aggregated, sql } from "@514labs/moose-lib";  interface MetricsById {  id: string;  /**   * Result of avgState(events.rating)   *   - avgState(number) returns number, so model the type as number   *   - Aggregated arg type is [number] because the column (events.rating) is a number   *   - Aggregated function name is "avg"   */  avg_rating: number & Aggregated<"avg", [number]>;  /**   * Result of uniqExactState(events.user_id)  *  - uniqExact returns an integer; use number & ClickHouseInt<"uint64"> for precision  *  - Aggregated arg type is [string] because the column (events.user_id) is a string  *  - Aggregated function name is "uniqExact"  */  daily_uniques: number & ClickHouseInt<"uint64"> & Aggregated<"uniqExact", [string]>;} //  All Aggregate Functions in this query have a [functionName][State]() suffix.const stmt = sql`  SELECT    ${events.columns.id} AS id,    avgState(${events.columns.rating})      AS avg_rating,    uniqExactState(${events.columns.user_id}) AS daily_uniques  FROM ${events}  GROUP BY ${events.columns.id}`;  new MaterializedView<MetricsById>({  selectStatement: stmt,  selectTables: [events],  targetTable: {    name: "metrics_by_id",    engine: ClickHouseEngines.AggregatingMergeTree,    orderByFields: ["id"],  },  materializedViewName: "mv_metrics_by_id",});

AggTransform.ts

import { MaterializedView, ClickHouseEngines, Aggregated, sql } from "@514labs/moose-lib";  interface MetricsById {  id: string;  /**   * Result of avgState(events.rating)   *   - avgState(number) returns number, so model the type as number   *   - Aggregated arg type is [number] because the column (events.rating) is a number   *   - Aggregated function name is "avg"   */  avg_rating: number & Aggregated<"avg", [number]>;  /**   * Result of uniqExactState(events.user_id)  *  - uniqExact returns an integer; use number & ClickHouseInt<"uint64"> for precision  *  - Aggregated arg type is [string] because the column (events.user_id) is a string  *  - Aggregated function name is "uniqExact"  */  daily_uniques: number & ClickHouseInt<"uint64"> & Aggregated<"uniqExact", [string]>;} //  All Aggregate Functions in this query have a [functionName][State]() suffix.const stmt = sql`  SELECT    ${events.columns.id} AS id,    avgState(${events.columns.rating})      AS avg_rating,    uniqExactState(${events.columns.user_id}) AS daily_uniques  FROM ${events}  GROUP BY ${events.columns.id}`;  new MaterializedView<MetricsById>({  selectStatement: stmt,  selectTables: [events],  targetTable: {    name: "metrics_by_id",    engine: ClickHouseEngines.AggregatingMergeTree,    orderByFields: ["id"],  },  materializedViewName: "mv_metrics_by_id",});

Common mistakes

Using avg()/uniqExact() in the SELECT instead of avgState()/uniqExactState()
Forgetting to annotate the schema with Aggregated<...> (TypeScript) or AggregateFunction(...) (Python) so the target table can be created correctly
Mismatch between GROUP BY keys in your SELECT and the orderByFields (TypeScript) or order_by_fields (Python) of your target table

Modeling columns with `AggregateFunction`

Pattern: U & Aggregated<"agg_func_name", [Types]>
U is the read-time type (e.g., number, string)
agg_func_name is the aggregation name (e.g., avg, uniqExact)
Types are the argument types. These are the types of the columns that are being aggregated.

FunctionToTypeMapping.ts

number & Aggregated<"avg", [number]> // avgState(col: number)number & ClickHouseInt<"uint64"> & Aggregated<"uniqExact", [string]> // uniqExactState(col: string)number & ClickHouseInt<"uint64"> & Aggregated<"count", []> // countState(col: any) string & Aggregated<"argMax", [string, Date]> // argMaxState(col: string, value: Date)string & Aggregated<"argMin", [string, Date]> // argMinState(col: string, value: Date) number & Aggregated<"corr", [number, number]> // corrState(col1: number, col2: number)

Pattern: U & Aggregated<"agg_func_name", [Types]>
U is the read-time type (e.g., number, string)
agg_func_name is the aggregation name (e.g., avg, uniqExact)
Types are the argument types. These are the types of the columns that are being aggregated.

FunctionToTypeMapping.ts

number & Aggregated<"avg", [number]> // avgState(col: number)number & ClickHouseInt<"uint64"> & Aggregated<"uniqExact", [string]> // uniqExactState(col: string)number & ClickHouseInt<"uint64"> & Aggregated<"count", []> // countState(col: any) string & Aggregated<"argMax", [string, Date]> // argMaxState(col: string, value: Date)string & Aggregated<"argMin", [string, Date]> // argMinState(col: string, value: Date) number & Aggregated<"corr", [number, number]> // corrState(col1: number, col2: number)

Writing SELECT statements to Aggregated tables

When you write to an AggregatingMergeTree table, you must add a State suffix to the aggregation functions in your SELECT statement.

AggTransform.ts

import { MaterializedView, ClickHouseEngines, Aggregated, sql } from "@514labs/moose-lib"; interface MetricsById {  id: string;  avg_rating: number & Aggregated<"avg", [number]>;  total_reviews: number & Aggregated<"sum", [number]>;} const aggStmt = sql`  SELECT    ${reviews.columns.id} AS id,    avgState(${reviews.columns.rating}) AS avg_rating,    countState(${reviews.columns.id})   AS total_reviews  FROM ${reviews}  GROUP BY ${reviews.columns.id}`; const mv = new MaterializedView<MetricsById>({  selectStatement: aggStmt,  selectTables: [reviews],  targetTable: {    name: "metrics_by_id",    engine: ClickHouseEngines.AggregatingMergeTree,    orderByFields: ["id"],  },  materializedViewName: "mv_metrics_by_id",});

AggTransform.ts

import { MaterializedView, ClickHouseEngines, Aggregated, sql } from "@514labs/moose-lib"; interface MetricsById {  id: string;  avg_rating: number & Aggregated<"avg", [number]>;  total_reviews: number & Aggregated<"sum", [number]>;} const aggStmt = sql`  SELECT    ${reviews.columns.id} AS id,    avgState(${reviews.columns.rating}) AS avg_rating,    countState(${reviews.columns.id})   AS total_reviews  FROM ${reviews}  GROUP BY ${reviews.columns.id}`; const mv = new MaterializedView<MetricsById>({  selectStatement: aggStmt,  selectTables: [reviews],  targetTable: {    name: "metrics_by_id",    engine: ClickHouseEngines.AggregatingMergeTree,    orderByFields: ["id"],  },  materializedViewName: "mv_metrics_by_id",});

Warning:

Querying Aggregated Tables

QueryAgg.ts

import { sql } from "@514labs/moose-lib"; // Auto-finalized via Aggregated + sqlconst cols = mv.targetTable.columns; // mv from earlier Agg exampleconst autoFinalized = sql`  SELECT ${cols.avg_rating}, ${cols.total_reviews}  FROM ${mv.targetTable}  WHERE ${cols.id} = '123'`; // Manual finalization (explicit ...Merge)const manual = sql`  SELECT    avgMerge(avg_rating)   AS avg_rating,    countMerge(total_reviews) AS total_reviews  FROM metrics_by_id  WHERE id = '123'`;

QueryAgg.ts

import { sql } from "@514labs/moose-lib"; // Auto-finalized via Aggregated + sqlconst cols = mv.targetTable.columns; // mv from earlier Agg exampleconst autoFinalized = sql`  SELECT ${cols.avg_rating}, ${cols.total_reviews}  FROM ${mv.targetTable}  WHERE ${cols.id} = '123'`; // Manual finalization (explicit ...Merge)const manual = sql`  SELECT    avgMerge(avg_rating)   AS avg_rating,    countMerge(total_reviews) AS total_reviews  FROM metrics_by_id  WHERE id = '123'`;

Choosing the right engine

Overview: Which engine should I use?

Use MergeTree for copies/filters/enrichment without aggregation semantics.
Use SummingMergeTree when all measures are additive, and you want compact, eventually-consistent sums.
Use AggregatingMergeTree for non-additive metrics and advanced functions; store states and finalize on read.
Use ReplacingMergeTree for dedup/upserts or as an idempotent staging layer before rollups.