
Building on CDC-Enabled ClickHouse

This guide walks you through setting up a Moose project on top of an existing ClickHouse database that uses CDC (Change Data Capture) pipelines—ClickPipes, PeerDB, Debezium, or similar tools—to replicate data from external sources.

What You'll Learn

By the end of this guide, you'll understand how to:

  1. Bootstrap a Moose project from an existing ClickHouse database
  2. Work with externally managed tables that CDC services control
  3. Develop locally with realistic sample data from production
  4. Keep your models in sync as upstream schemas evolve

Prerequisites

  • A ClickHouse database with CDC-replicated tables
  • Your ClickHouse connection URL (e.g., https://user:pass@host:8443/database)
  • Node.js 20+ (TypeScript) or Python 3.10+ (Python)

Step 1: Initialize Your Project

The fastest way to get started is with moose init --from-remote. This single command:

  • Creates a new Moose project
  • Connects to your ClickHouse and introspects all tables
  • Generates type-safe models for every table
  • Automatically detects PeerDB-managed tables (via _peerdb_* columns) and marks them as EXTERNALLY_MANAGED. For tables managed by other CDC tools (ClickPipes, Debezium, etc.), set lifeCycle: LifeCycle.EXTERNALLY_MANAGED manually
  • Saves your connection config to moose.config.toml
  • Stores credentials securely in your OS keychain

moose init my-analytics-app --from-remote "https://user:pass@host:8443/database" --language typescript

You'll see output like:

Created my-analytics-app from typescript-empty template
Success Project created at my-analytics-app
Connecting to remote ClickHouse...
Introspecting tables in 'database'...
Config Wrote [dev.remote_clickhouse] to moose.config.toml (host: host, database: database)
Keychain Stored credentials securely for project 'my-analytics-app'
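
To make the PeerDB detection rule from the list above concrete, here is a minimal sketch of a check based on the _peerdb_ column prefix. The helper name looksPeerDbManaged and the TableColumns shape are hypothetical and exist only for illustration; this is not Moose's actual implementation.

```typescript
// Hypothetical shape: a table name plus the column names found by introspection.
interface TableColumns {
  name: string;
  columns: string[];
}

// Assumption: a table counts as PeerDB-managed if any column starts with "_peerdb_".
function looksPeerDbManaged(table: TableColumns): boolean {
  return table.columns.some((column) => column.startsWith("_peerdb_"));
}

const users: TableColumns = {
  name: "users",
  columns: ["id", "email", "_peerdb_synced_at", "_peerdb_is_deleted", "_peerdb_version"],
};

const analyticsEvents: TableColumns = {
  name: "analytics_events",
  columns: ["id", "event_type", "timestamp"],
};

console.log(looksPeerDbManaged(users)); // true
console.log(looksPeerDbManaged(analyticsEvents)); // false
```

A table replicated by ClickPipes or Debezium would not match a prefix check like this, which is why those tables need the lifecycle set by hand.
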
What About Interactive Setup?

If you don't want to put credentials in the command, run without the URL:

moose init my-analytics-app --from-remote --language typescript

Moose will prompt you for host, username, password, and database interactively.
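
The connection URL bundles the same four values the interactive prompt asks for. As a rough illustration, the sketch below splits a URL into those parts using the standard URL class. The helper parseConnectionUrl and the RemoteConnection type are hypothetical names, and the fallback values are assumptions; Moose's own parsing may differ.

```typescript
// Hypothetical container for the values the interactive prompt collects.
interface RemoteConnection {
  host: string;
  port: number;
  database: string;
  username: string;
  password: string;
  useSsl: boolean;
}

function parseConnectionUrl(raw: string): RemoteConnection {
  const url = new URL(raw);
  const useSsl = url.protocol === "https:";
  // Assumption: fall back to the standard HTTP(S) ports when the URL omits one.
  const fallbackPort = useSsl ? 443 : 80;
  return {
    host: url.hostname,
    port: url.port ? Number(url.port) : fallbackPort,
    database: url.pathname.replace(/^\//, ""),
    // "default" mirrors the default username offered by the credentials prompt.
    username: decodeURIComponent(url.username) || "default",
    password: decodeURIComponent(url.password),
    useSsl,
  };
}

const parsed = parseConnectionUrl("https://user:pass@host:8443/database");
console.log(parsed.host, parsed.port, parsed.database); // host 8443 database
```
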

After initialization, navigate into your project and install dependencies:

cd my-analytics-app
npm install

Step 2: Understanding Externally Managed Tables

Open your project and look at the generated files. You'll find your tables split into two categories:

Regular Tables (in your main file)

Tables that Moose fully manages—you control the schema, and Moose creates/migrates them:

app/index.ts
import { OlapTable } from "@514labs/moose-lib";

export interface analytics_events {
  id: string;
  event_type: string;
  timestamp: string;
}

export const AnalyticsEventsTable = new OlapTable<analytics_events>("analytics_events", {
  orderByFields: ["timestamp", "id"],
});

Externally Managed Tables (in a separate file)

Tables where an external process (your CDC pipeline) owns the schema. Moose generates these in a dedicated file:

app/externalModels.ts
// AUTO-GENERATED FILE. DO NOT EDIT.
// This file will be replaced when you run `moose db pull`.

import typia from "typia";
import { OlapTable, LifeCycle, ClickHouseEngines } from "@514labs/moose-lib";

export interface users {
  id: string & typia.tags.Format<"uuid">;
  email: string;
  name: string;
  created_at: string & typia.tags.Format<"date-time">;
  // PeerDB metadata columns (added by CDC)
  _peerdb_synced_at: string & typia.tags.Format<"date-time">;
  _peerdb_is_deleted: number;
  _peerdb_version: number;
}

export const UsersTable = new OlapTable<users>("users", {
  orderByFields: ["id"],
  engine: ClickHouseEngines.ReplacingMergeTree,
  ver: "_peerdb_version",
  lifeCycle: LifeCycle.EXTERNALLY_MANAGED,  // <-- Key difference
});
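
The generated table uses ReplacingMergeTree with _peerdb_version as its version column and id as its sorting key. In ClickHouse, that engine eventually keeps only the highest-version row per sorting key, but duplicates can be visible until background merges run. The sketch below emulates in plain TypeScript what a deduplicated read should return. It assumes that a non-zero _peerdb_is_deleted marks a soft-deleted row, and latestLiveRows is a hypothetical helper, not part of moose-lib.

```typescript
// Trimmed-down row shape based on the generated `users` model.
interface UserRow {
  id: string;
  email: string;
  _peerdb_is_deleted: number;
  _peerdb_version: number;
}

// Keep the highest-version row per id (later rows win ties), then drop soft-deleted rows.
// Assumption: a non-zero _peerdb_is_deleted marks a row deleted at the source.
function latestLiveRows(rows: UserRow[]): UserRow[] {
  const latest = new Map<string, UserRow>();
  for (const row of rows) {
    const seen = latest.get(row.id);
    if (!seen || row._peerdb_version >= seen._peerdb_version) {
      latest.set(row.id, row);
    }
  }
  return Array.from(latest.values()).filter((row) => row._peerdb_is_deleted === 0);
}

const replicated: UserRow[] = [
  { id: "a", email: "old@example.com", _peerdb_is_deleted: 0, _peerdb_version: 1 },
  { id: "a", email: "new@example.com", _peerdb_is_deleted: 0, _peerdb_version: 2 },
  { id: "b", email: "gone@example.com", _peerdb_is_deleted: 0, _peerdb_version: 1 },
  { id: "b", email: "gone@example.com", _peerdb_is_deleted: 1, _peerdb_version: 2 },
];

console.log(latestLiveRows(replicated).map((row) => row.email)); // [ 'new@example.com' ]
```

Views and APIs built on a table like this usually need the same treatment on the SQL side, for example by reading with FINAL or by selecting the latest version per key.
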

What Does EXTERNALLY_MANAGED Mean?

The lifeCycle: LifeCycle.EXTERNALLY_MANAGED setting tells Moose:

| Moose Behavior | Regular Tables | Externally Managed |
|---|---|---|
| Creates table in production | Yes | No |
| Runs migrations | Yes | No |
| Generates type-safe models | Yes | Yes |
| Allows building views/APIs on top | Yes | Yes |
| You edit the schema | Yes | No (regenerated by db pull) |

In short: Moose gives you all the developer experience benefits (types, autocomplete, views, APIs) without touching the tables your CDC pipeline owns.

Don't Edit External Models Directly

The externalModels.ts / external_models.py file is regenerated every time you run moose db pull. Any manual changes will be overwritten. If you need to customize how a table is modeled, move it to your main file and remove EXTERNALLY_MANAGED.


Step 3: Local Development Setup

Now let's configure how Moose handles these external tables during local development. Open your moose.config.toml:

moose.config.toml
# This was auto-generated by moose init --from-remote
[dev.remote_clickhouse]
host = "your-clickhouse-host.example.com"
port = 8443
database = "production_db"
use_ssl = true
protocol = "http"

Creating Local Mirror Tables

When you run moose dev, Moose creates the schema for externally managed tables in your local ClickHouse, but they start out empty. To populate them with sample data from your remote ClickHouse for local development, enable local mirrors:

moose.config.toml
[dev.externally_managed.tables]
# Create local copies of external tables
create_local_mirrors = true

# Seed with sample data from remote (0 = empty tables)
sample_size = 1000

# Re-seed on every moose dev start (false = only if missing)
refresh_on_startup = false

Configuration Options Explained

| Option | Default | Description |
|---|---|---|
| create_local_mirrors | false | When true, Moose creates local tables matching your external table schemas |
| sample_size | 0 | Number of rows to copy from remote for each table. Set to 0 for schema-only (empty tables) |
| refresh_on_startup | false | When true, drops and recreates mirrors on every moose dev start. When false, only creates if missing |

When Does Seeding Happen?

Understanding when data is pulled is important for your workflow:

| Scenario | What Happens |
|---|---|
| First moose dev run | Mirror tables created, sample_size rows seeded from remote |
| Subsequent runs (refresh_on_startup = false) | Nothing: existing local data preserved |
| Subsequent runs (refresh_on_startup = true) | Tables dropped, recreated, and reseeded |
| Remote unreachable | Tables created from local model definitions, no data seeded |
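
The scenarios above can be read as a small decision function. The sketch below encodes them using the documented option names and defaults. The planMirror function and its action labels are hypothetical, and the table does not say what happens when refresh_on_startup is true while the remote is unreachable; the sketch simply treats that case like the "Remote unreachable" row.

```typescript
// Mirrors the options under [dev.externally_managed.tables].
interface MirrorConfig {
  createLocalMirrors: boolean; // create_local_mirrors
  sampleSize: number; // sample_size
  refreshOnStartup: boolean; // refresh_on_startup
}

// Documented defaults for the three options.
const defaults: MirrorConfig = { createLocalMirrors: false, sampleSize: 0, refreshOnStartup: false };

// Hypothetical labels for the outcomes described in the scenario table.
type MirrorAction =
  | "mirroring-disabled"
  | "keep-existing"
  | "create-from-local-models"
  | "create-empty"
  | "create-and-seed"
  | "recreate-and-seed";

function planMirror(config: MirrorConfig, mirrorExists: boolean, remoteReachable: boolean): MirrorAction {
  if (!config.createLocalMirrors) return "mirroring-disabled";
  if (mirrorExists && !config.refreshOnStartup) return "keep-existing";
  if (!remoteReachable) return "create-from-local-models"; // no data seeded
  if (config.sampleSize === 0) return "create-empty"; // schema only
  return mirrorExists ? "recreate-and-seed" : "create-and-seed";
}

const devConfig: MirrorConfig = { ...defaults, createLocalMirrors: true, sampleSize: 1000 };

console.log(planMirror(devConfig, false, true)); // create-and-seed (first run)
console.log(planMirror(devConfig, true, true)); // keep-existing (later runs)
console.log(planMirror({ ...devConfig, refreshOnStartup: true }, true, true)); // recreate-and-seed
console.log(planMirror(devConfig, false, false)); // create-from-local-models
```
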

Materialized Views and Sample Data

Here's something important to understand: Materialized Views only process new data as it arrives.

In production, your CDC pipeline continuously inserts data, which triggers your materialized views. But locally, you have static seeded data—the MV won't retroactively process it.
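
ClickHouse materialized views behave like insert triggers: they see each newly inserted block, not the rows already sitting in the source table. The toy model below illustrates that behavior in plain TypeScript. The BaseTable class is a hypothetical stand-in for illustration only and does not use ClickHouse or moose-lib.

```typescript
// Toy model of a base table whose attached views only see newly inserted batches.
class BaseTable<Row> {
  private rows: Row[] = [];
  private views: Array<(batch: Row[]) => void> = [];

  insert(batch: Row[]): void {
    this.rows.push(...batch);
    // Only the new batch is handed to attached views, never the existing rows.
    for (const view of this.views) view(batch);
  }

  attachView(view: (batch: Row[]) => void): void {
    this.views.push(view);
  }

  count(): number {
    return this.rows.length;
  }
}

interface EventRow {
  event_type: string;
}

const events = new BaseTable<EventRow>();

// Data that is already in the table before the view exists.
events.insert([{ event_type: "click" }, { event_type: "view" }]);

// The "materialized view": a running count per event type.
const countsByType = new Map<string, number>();
events.attachView((batch) => {
  for (const row of batch) {
    countsByType.set(row.event_type, (countsByType.get(row.event_type) ?? 0) + 1);
  }
});

// A row that arrives after the view is attached.
events.insert([{ event_type: "click" }]);

console.log(events.count()); // 3
console.log(countsByType.get("click")); // 1
console.log(countsByType.get("view")); // undefined
```

The base table holds three rows, but the view has only counted the one inserted after it was attached.
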

Solutions for local MV development:

  1. Seed enough data - Set sample_size high enough to have meaningful test data in your base tables

  2. Use refresh_on_startup = true - This re-inserts data on each startup, triggering MVs (but slower startup)

  3. Manually trigger with moose seed - Insert test data while moose dev is running (requires the local dev server to be up):

    moose seed clickhouse --limit 100

  4. Test MVs with direct inserts - During development, insert test rows manually to trigger MV logic

Production Behavior

In production, this isn't an issue—CDC continuously streams data, and your MVs process it in real-time.


Step 4: Running Local Development

With your config set up, start the dev server:

moose dev

First Run with Credentials

If credentials aren't in your keychain (e.g., you manually edited the config), Moose prompts you:

Credentials Remote ClickHouse credentials required:
            Host:     your-clickhouse-host.example.com
            Database: production_db
Enter username (default: default)> your_username
Enter password> ********
Keychain Stored credentials securely for project 'my-analytics-app'

Credentials are stored in your OS keychain and reused automatically on subsequent runs—no additional configuration needed.

What Happens on Startup

  1. Local infrastructure starts — Docker containers for ClickHouse, Redpanda, etc.
  2. External tables detected — credentials resolved from OS keychain (you'll be prompted if missing)
  3. Remote schema compared — local mirrors created if create_local_mirrors = true
  4. Data seeded (if sample_size > 0 and remote is reachable)
  5. Dev server starts at http://localhost:4000

Developing Locally

Now you can build on top of your CDC data:

  • Create views that aggregate or transform external table data
  • Build APIs that query across managed and external tables
  • Test queries against realistic sample data
  • Iterate quickly with hot-reload—no production impact

Step 5: Syncing Schema Changes

CDC pipelines evolve. Your DBA adds columns, the CDC service updates metadata fields, or new tables appear. When this happens, your local models need to sync.

Manual Sync

Run moose db pull to refresh your external models:

moose db pull

This:

  • Connects to your remote ClickHouse (using saved credentials)
  • Introspects current schemas for all externally managed tables
  • Regenerates externalModels.ts / external_models.py
  • Adds any new tables that appeared in the remote database

Example: A New Column Appears

Say your CDC pipeline starts syncing a new phone column to the users table. Running moose db pull prints:

Connecting to remote ClickHouse...
Introspecting remote tables...
External models refreshed (3 table(s))

Your external models file now includes the new column:

app/externalModels.ts
export interface users {
  id: string;
  email: string;
  name: string;
  phone: string;  // <-- New column!
  created_at: string;
  // ...
}

TypeScript immediately catches any code that now needs updating.
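
As an illustration of what the compiler catches, consider code that constructs a users row, such as a test fixture. The sketch below uses a trimmed copy of the interface; makeUser is a hypothetical helper and not something Moose generates.

```typescript
// Trimmed copy of the regenerated interface, including the new column.
interface users {
  id: string;
  email: string;
  name: string;
  phone: string; // new column
  created_at: string;
}

// Hypothetical fixture builder written before `phone` existed.
function makeUser(): users {
  return {
    id: "00000000-0000-0000-0000-000000000001",
    email: "ada@example.com",
    name: "Ada",
    // Removing the next line makes tsc fail with a missing-property error for `phone`.
    phone: "+1-555-0100",
    created_at: "2024-01-01T00:00:00Z",
  };
}

const user = makeUser();
console.log(user.phone); // +1-555-0100
```

Code that only reads existing columns keeps compiling; code that builds full rows has to be updated before it type-checks again.
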

Automatic Sync on Dev Start

For active development where schemas change frequently, auto-sync on startup:

moose.config.toml
[http_server_config]
on_first_start_script = "moose db pull"

[watcher_config]
# Prevent reload loop from generated file changes
ignore_patterns = ["app/externalModels.ts"]

Now every moose dev starts by pulling the latest schemas.


Complete Configuration Reference

Here's a full moose.config.toml for a CDC-based project:

moose.config.toml
language = "typescript"

# Remote ClickHouse connection (auto-generated by moose init --from-remote)
[dev.remote_clickhouse]
host = "abc123.us-east-1.aws.clickhouse.cloud"
port = 8443
database = "production"
use_ssl = true
protocol = "http"

# Local development with external tables
[dev.externally_managed.tables]
create_local_mirrors = true
sample_size = 500
refresh_on_startup = false

# Auto-sync schemas on startup
[http_server_config]
on_first_start_script = "moose db pull"

# Don't trigger reloads on generated files
[watcher_config]
ignore_patterns = ["app/externalModels.ts"]

Quick Reference

| Task | Command |
|---|---|
| Start new project from existing ClickHouse | moose init my-app --from-remote <URL> --language <typescript|python> |
| Sync external models after schema change | moose db pull |
| Start local development | moose dev |
| Seed more data locally | moose seed clickhouse --limit 1000 |

Next Steps

  • External Tables Reference - Deep dive into EXTERNALLY_MANAGED lifecycle
  • Materialized Views - Build real-time aggregations on CDC data
  • APIs & Web Apps - Expose your data through type-safe endpoints
