Schema Versioning
Viewing:
Overview
This guide walks through a common pattern for migrating breaking schema changes in an OLAP database, by creating a new table with the new schema and migrating data from the old table to the new one with a materialized view.
Mental model
- Why versions: Breaking changes require safe coordination amongst the upstream producers and downstream consumers for a given table. Give the new table the same logical name, but add a
version
so Moose creates a separate physical table (e.g.,events_0_1
). This allows old and new to coexist while dependencies can upgrade on their own schedule. - Why materialized views: In ClickHouse, a MATERIALIZED VIEW is a trigger on insert. It can both backfill the new table and continuously transform new rows inserted into the old table—perfect for moving data during a schema migration. This is a more efficient way to migrate data than a traditional ETL process because it pushes everything down to the ClickHouse engine.
MooseTip:
Setting config.version
on an OlapTable
changes only the underlying table name (suffixes dots with underscores). Your code still refers to the logical table you exported.
High-level workflow
Versioned migration flow
Create the new schema for the new table
Create the new table with the same logical name, but set config.version to e.g. "0.1"
Create a materialized view that SELECTs from the old table and writes into the new table (its targetTable)
Let the view backfill and keep migrating new inserts automatically
Cut over your readers/writers to the new versioned table, then drop the old one when ready
Example: rename a column and add a new one
Assume the original events
table has a name
column. We want to:
- rename
name
→eventName
- add a new
type
column with a default
Original table (version 0.0)
import { Key, OlapTable } from "@514labs/moose-lib";
interface EventV0 {
id: Key<string>;
name: string;
createdAt: Date;
}
export const events = new OlapTable<EventV0>("events", { version: "0.0" });
from typing import Annotated
from pydantic import BaseModel
from moose_lib import OlapTable, Key
class EventV0(BaseModel):
id: Annotated[str, Key]
name: str
created_at: str # datetime in your format
events = OlapTable[EventV0]("events", config=OlapTableConfig(version="0.0"))
New table (bump to version 0.1)
Create a new table with the same logical name, but set version: "0.1"
. Moose will create events_0_1
in ClickHouse.
import { Key, OlapTable } from "@514labs/moose-lib";
interface EventV1 {
id: Key<string>;
eventName: string; // renamed
type: string; // new
createdAt: Date;
}
export const eventsV1 = new OlapTable<EventV1>("events", { version: "0.1" });
from typing import Annotated
from pydantic import BaseModel
from moose_lib import OlapTable, Key
class EventV1(BaseModel):
id: Annotated[str, Key]
event_name: str
type: str
created_at: str
events_v1 = OlapTable[EventV1]("events", config=OlapTableConfig(version="0.1"))
Create the materialized view to migrate data
Create a materialized view that:
- SELECTs from the old table (
events_v0
) - maps fields to the new schema
- writes into the versioned target table (
events_v1
)
Warning:
Pass the versioned OlapTable
instance as targetTable
. If you only pass a tableName
, Moose will create an unversioned target.
import { MaterializedView, sql } from "@514labs/moose-lib";
import { events } from "../tables/events"; // old table
import { eventsV1, EventV1 } from "../tables/events_v01"; // new versioned table
export const migrateEventsToV01 = new MaterializedView<EventV1>({
materializedViewName: "mv_events_to_0_1",
selectTables: [events],
selectStatement: sql`
SELECT
${events.columns.id} as id,
${events.columns.name} as eventName,
'user' as type,
${events.columns.createdAt} as createdAt
FROM ${events}
`,
targetTable: eventsV1,
orderByFields: ["id"],
});
from moose_lib import MaterializedView, MaterializedViewOptions
from app.tables.events import events
from app.tables.events_v01 import events_v1, EventV1
migrate_events_to_v01 = MaterializedView[EventV1](MaterializedViewOptions(
materialized_view_name="mv_events_to_0_1",
select_statement=(
f"SELECT id as id, name as event_name, 'user' as type, created_at as created_at FROM {events.name}"
),
select_tables=[events],
table_name=events_v1,
order_by_fields=["id"],
))
What happens when you export this view:
- Moose creates the versioned table if needed
- Moose creates the MATERIALIZED VIEW and immediately runs a one-time backfill (
INSERT INTO ... SELECT ...
) - ClickHouse keeps the view active: any new inserts into
events
automatically flow intoevents_0_1
Cutover and cleanup
- Update readers to query the new table export (
eventsV1
). - Update writers/streams to produce to the new table if applicable.
- After verifying parity and retention windows, drop the old table and the migration view.
Notes and tips
- Use semantic versions like
0.1
,1.0
,1.1
. Moose will renderevents_1_1
as the physical name. - Keep the migration view simple and deterministic. If you need complex transforms, prefer explicit SQL in the
selectStatement
. - Very large backfills can take time. Consider deploying during low-traffic windows.