CollapsingMergeTree
CollapsingMergeTree is a MergeTree-family engine that collapses pairs of rows during background merges using a required sign column (where 1 is a “state” row and -1 is a “cancel” row). The merge behavior is defined by ClickHouse.
When to Use
- Track object state changes without running ClickHouse UPDATEs
- Model “current state” from a stream of changes (append-only writes)
- Reduce storage by collapsing old states during merges
Usage
CollapsingTable.ts
import { Key, OlapTable, ClickHouseEngines, Int8 } from "@514labs/moose-lib"; interface UserActivity { userId: Key<string>; pageViews: number; duration: number; sign: Int8; // Required: 1 = state row, -1 = cancel row} const userActivity = new OlapTable<UserActivity>("user_activity", { engine: ClickHouseEngines.CollapsingMergeTree, sign: "sign", orderByFields: ["userId"],});Configuration Options
| Option | Type | Description |
|---|---|---|
orderByFields | string[] | Sorting key used by ClickHouse for grouping rows |
sign | string | Name of the Int8 sign column (1 = state, -1 = cancel) |
Example: Modeling State Changes
To update an object, write two rows:
- A cancel row (
sign = -1) that matches the prior state’s sorting key - A state row (
sign = 1) with the new state
Collapsing Behavior
By default, collapsing happens during ClickHouse background merges. To return fully-collapsed results, ClickHouse documents two common approaches:
- Aggregate with sign (recommended for analytics queries)
- Use
FINALat read time (less efficient; typically avoid on large scans)
import { sql } from "@514labs/moose-lib"; // Example: compute fully-collapsed metrics via sign-aware aggregationconst collapsed = sql` SELECT userId, sum(pageViews * sign) AS pageViews, sum(duration * sign) AS duration FROM user_activity GROUP BY userId HAVING sum(sign) > 0`;ClickHouse behavior
For full details (including the required Sign column, algorithm, and FINAL semantics), see the ClickHouse docs: CollapsingMergeTree table engine.
See Also
- VersionedCollapsingMergeTree — Collapsing with an explicit version column
- MergeTree — Append-only baseline engine
- Replicated Engines — High availability variants