External Tables
Overview
External tables allow you to connect Moose to database tables that are managed outside of your application. This is essential when working with:
- CDC (Change Data Capture) services like ClickPipes, Debezium, or AWS DMS
- Legacy database tables managed by other teams
- Third-party data sources with controlled schema evolution
When to Use External Tables
External Table Use Cases
- CDC Services: the schema is controlled by services like ClickPipes, Debezium, or AWS DMS
- Legacy Integration: connecting to existing tables managed by other teams or systems
- Third-party Data: working with data sources where you don't control schema evolution
- Strict Change Management: environments with formal database change approval processes
Configuration
In TypeScript, set lifeCycle: LifeCycle.EXTERNALLY_MANAGED to tell Moose not to modify the table schema:
import { OlapTable, LifeCycle } from "@514labs/moose-lib";

interface CdcUserData {
  id: string;
  name: string;
  email: string;
  updated_at: Date;
}

// Connect to CDC-managed table
const cdcUserTable = new OlapTable<CdcUserData>("cdc_users", {
  lifeCycle: LifeCycle.EXTERNALLY_MANAGED
});
In Python, set life_cycle=LifeCycle.EXTERNALLY_MANAGED to tell Moose not to modify the table schema:
from moose_lib import OlapTable, OlapConfig, LifeCycle
from pydantic import BaseModel
from datetime import datetime

class CdcUserData(BaseModel):
    id: str
    name: str
    email: str
    updated_at: datetime

# Connect to CDC-managed table
cdc_user_table = OlapTable[CdcUserData]("cdc_users", OlapConfig(
    life_cycle=LifeCycle.EXTERNALLY_MANAGED
))
Getting Models for External Tables
New project: initialize from your existing ClickHouse
If you don’t yet have a Moose project, use init-from-remote to bootstrap models from your existing ClickHouse:
moose init my-project --from-remote <YOUR_CLICKHOUSE_URL> --language <typescript|python>
What happens:
- Moose introspects your database and generates table models in your project.
- If Moose detects ClickPipes (or other CDC-managed) tables, it marks those as EXTERNALLY_MANAGED and writes them into a dedicated external models file:
  - TypeScript: app/externalModels.ts
  - Python: app/external_models.py
- This is a best-effort detection to separate CDC-managed tables from those you may want Moose to manage in code.
How detection works (ClickPipes/PeerDB example):
- Moose looks for PeerDB-specific fields that indicate CDC ownership and versions, such as _peerdb_synced_at, _peerdb_is_deleted, _peerdb_version, and related metadata columns.
- When these are present, the table will be marked EXTERNALLY_MANAGED and emitted into the external models file automatically.
import { OlapTable, LifeCycle, ClickHouseEngines } from "@514labs/moose-lib";
import type { ClickHouseInt, ClickHouseDecimal, ClickHousePrecision, ClickHouseDefault } from "@514labs/moose-lib";
import typia from "typia";

export interface foo {
  id: string & typia.tags.Format<"uuid">;
  name: string;
  description: string | undefined;
  status: string;
  priority: number & ClickHouseInt<"int32">;
  is_active: boolean;
  metadata: string | undefined;
  tags: string[];
  score: (string & ClickHouseDecimal<10, 2>) | undefined;
  large_text: string | undefined;
  created_at: string & typia.tags.Format<"date-time"> & ClickHousePrecision<6>;
  updated_at: string & typia.tags.Format<"date-time"> & ClickHousePrecision<6>;
  // PeerDB metadata columns that mark the table as CDC-managed
  _peerdb_synced_at: string & typia.tags.Format<"date-time"> & ClickHousePrecision<9> & ClickHouseDefault<"now64()">;
  _peerdb_is_deleted: number & ClickHouseInt<"int8">;
  _peerdb_version: number & ClickHouseInt<"int64">;
}

export const FooTable = new OlapTable<foo>("foo", {
  orderByFields: ["id"],
  engine: ClickHouseEngines.ReplacingMergeTree,
  ver: "_peerdb_version",
  settings: { index_granularity: "8192" },
  lifeCycle: LifeCycle.EXTERNALLY_MANAGED,
});
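The detection described above can be approximated with a simple column-name check. The following Python sketch is an illustration only, not Moose's actual implementation: the function names, the shape of the `tables` input, and the rule that a single marker column is enough are all assumptions made for this example.

```python
# Marker columns named in the text above; Moose may also check other metadata columns.
PEERDB_MARKER_COLUMNS = frozenset(
    {"_peerdb_synced_at", "_peerdb_is_deleted", "_peerdb_version"}
)


def looks_cdc_managed(column_names):
    """Return True if any PeerDB marker column is present (assumed rule)."""
    return not PEERDB_MARKER_COLUMNS.isdisjoint(column_names)


def split_by_ownership(tables):
    """Split {table_name: [column names]} into (managed, external) name lists."""
    managed, external = [], []
    for name, columns in tables.items():
        (external if looks_cdc_managed(columns) else managed).append(name)
    return managed, external


# Hypothetical introspection result: "foo" carries PeerDB columns, "events" does not.
tables = {
    "foo": ["id", "name", "_peerdb_synced_at", "_peerdb_is_deleted", "_peerdb_version"],
    "events": ["id", "timestamp", "payload"],
}
managed, external = split_by_ownership(tables)
print(managed, external)  # ['events'] ['foo']
```

Because the real detection is best-effort, a table that lacks these columns but is still owned by another system needs to be marked EXTERNALLY_MANAGED by hand.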
Existing project: mark additional external tables
If there are other tables in your DB that are not CDC-managed but you want Moose to treat as external (not managed by code):
- Mark them as external in code
// TypeScript: in a file you control (not the external file yet)
const table = new OlapTable<MySchema>("my_table", {
  lifeCycle: LifeCycle.EXTERNALLY_MANAGED
});

# Python: in a file you control (not the external file yet)
table = OlapTable[MySchema](
    "my_table",
    OlapConfig(
        life_cycle=LifeCycle.EXTERNALLY_MANAGED
    )
)
- Move them into the external models file
  - Move the model definitions to your external file (app/externalModels.ts or app/external_models.py).
  - Ensure your root file loads the external models through a single import:
    - TypeScript: add import "./externalModels"; in your app/index.ts file.
    - Python: add from external_models import * in your app/main.py file.
This keeps truly external tables out of your managed code path, while still making them available locally (and in tooling) without generating production DDL.
Important Considerations
Do not edit external data models
EXTERNALLY_MANAGED tables reflect schemas owned by your CDC/DBA/ETL processes. Do not change their field shapes in code.
If you accidentally edited an external model, revert to the source of truth by running DB Pull: /moose/olap/db-pull.
Local vs production behavior
Locally, externally managed tables are created/kept in sync in your development ClickHouse so you can develop against them and seed data. See Seed (ClickHouse) in the CLI: /moose/moose-cli#seed-clickhouse.
Moose will not apply schema changes to EXTERNALLY_MANAGED tables in production. If you edit these table models in code, those edits will not produce DDL operations in the migration plan (they will not appear in plan.yaml).
For more on how migration plans are generated and what shows up in plan.yaml, see /moose/olap/planned-migrations.
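To make the production behavior concrete, here is a toy model of a planner that skips externally managed tables. It is a sketch under stated assumptions, not Moose's planner: `LifeCycle` here is a local stand-in enum (not imported from moose_lib), and `TableModel` and `plan_column_additions` are hypothetical names that only handle added columns.

```python
from dataclasses import dataclass
from enum import Enum


class LifeCycle(Enum):
    # Local stand-in for illustration; not the moose_lib enum.
    FULLY_MANAGED = "FULLY_MANAGED"
    EXTERNALLY_MANAGED = "EXTERNALLY_MANAGED"


@dataclass(frozen=True)
class TableModel:
    name: str
    columns: dict  # column name -> ClickHouse type, as declared in code
    life_cycle: LifeCycle


def plan_column_additions(code_models, deployed_columns):
    """Emit ADD COLUMN statements, ignoring externally managed tables."""
    operations = []
    for model in code_models:
        if model.life_cycle is LifeCycle.EXTERNALLY_MANAGED:
            continue  # edits to external models never become DDL
        existing = deployed_columns.get(model.name, {})
        for column, column_type in model.columns.items():
            if column not in existing:
                operations.append(
                    f"ALTER TABLE {model.name} ADD COLUMN {column} {column_type}"
                )
    return operations


# Both models gained a column in code, but only the managed table yields DDL.
models = [
    TableModel("events", {"id": "String", "source": "String"}, LifeCycle.FULLY_MANAGED),
    TableModel("cdc_users", {"id": "String", "nickname": "String"}, LifeCycle.EXTERNALLY_MANAGED),
]
deployed = {"events": {"id": "String"}, "cdc_users": {"id": "String"}}
plan = plan_column_additions(models, deployed)
print(plan)
```

In this toy run the plan contains a single statement for `events`; the edit to `cdc_users` is silently dropped, which is why an edited external model can drift from the real table without any warning in the plan.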
Staying in sync with remote schema
For EXTERNALLY_MANAGED tables, keep your code in sync with the live database by running DB Pull. You can do it manually or automate it in dev.
moose db pull --connection-string <YOUR_CLICKHOUSE_URL>
Keep external models fresh
Use DB Pull to regenerate your external models file from the remote schema. To run it automatically during development, see the script hooks in the local development guide.
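One way to automate the pull is a small wrapper script invoked from your own dev tooling. The sketch below assumes the `moose` CLI is on your PATH and reads the connection string from an environment variable; the variable name `MOOSE_REMOTE_CLICKHOUSE_URL` and both function names are hypothetical choices for this example, and only the `moose db pull --connection-string` command comes from the section above.

```python
import os
import subprocess


def build_db_pull_command(connection_string):
    """Build the argv list for `moose db pull` with the given connection string."""
    if not connection_string:
        raise ValueError("a ClickHouse connection string is required")
    return ["moose", "db", "pull", "--connection-string", connection_string]


def run_db_pull(env_var="MOOSE_REMOTE_CLICKHOUSE_URL"):
    """Run DB Pull using the URL stored in env_var (hypothetical variable name)."""
    command = build_db_pull_command(os.environ.get(env_var, ""))
    # Passing a list (no shell) avoids quoting problems in the connection string.
    return subprocess.run(command, check=False).returncode
```

Calling `run_db_pull()` returns the CLI's exit code, so a dev script can stop early when the pull fails. Keeping the connection string in the environment also keeps credentials out of the repository.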