Syncing External Tables
Viewing:
What this is
Use moose db pull
to refresh the definitions of tables you marked as EXTERNALLY_MANAGED
from a live ClickHouse instance. It reads your code to find external tables, fetches their remote schemas, regenerates one external models file, and creates a small git commit if anything changed. If new external tables were added remotely (e.g., new CDC streams), they are added to the external models file as part of the same run.
When to use it
- External tables changed remotely: a DBA, CDC, or ETL pipeline updated schema.
- Keep types in sync: update generated models without touching fully-managed tables.
- Safe by design: does not modify the database or your managed models.
MooseTip:
Requirements
- Tables are defined with
lifeCycle: EXTERNALLY_MANAGED
life_cycle=LifeCycle.EXTERNALLY_MANAGED
- A ClickHouse connection string (native or HTTP/S)
Connection strings
db pull
accepts both native and HTTP(S) URLs. Native strings are automatically converted to HTTP(S) with the appropriate ports.
Examples:
# Native (auto-converted to HTTPS + 8443)
moose db pull --connection-string "clickhouse://explorer@play.clickhouse.com:9440/default"
# HTTPS (explicit database via query param)
moose db pull --connection-string "https://play.clickhouse.com/?user=explorer&database=default"
# Local HTTP
moose db pull --connection-string "http://localhost:8123/?user=default&database=default"
What gets written
app/externalModels.ts
app/external_models.py
db pull
treats this file as the single source of truth for EXTERNALLY_MANAGED
tables. It introspects the remote schema, updates existing external tables, and adds any newly detected external tables here. It does not modify models elsewhere in your codebase.
Structure for external tables
Keep all external tables in this file and import it once from your root (app/index.ts
app/main.py
).
Important:
- The file is overwritten on every run (or at the path passed via
--file-path
). - If you customize the path, ensure your root file imports it so Moose loads your external models.
How it works
When you run db pull
the CLI does the following:
- Loads your project’s infrastructure map and identifies tables marked as
EXTERNALLY_MANAGED
. - Connects to the remote ClickHouse specified by
--connection-string
and introspects the live schemas for those tables. - Regenerates a single external models file that mirrors the remote schema.
- Adds any newly detected external tables from the remote database to the generated file so your code stays in sync as sources evolve.
- Does not change any fully managed tables, your
app/index.ts
app/main.py
, or the database itself. - Creates a small git commit if the generated file changed, so you can review and share the update.
Example output
// AUTO-GENERATED FILE. DO NOT EDIT.
// This file will be replaced when you run `moose db pull`.
// ...typed table definitions matching remote EXTERNALLY_MANAGED tables...
# AUTO-GENERATED FILE. DO NOT EDIT.
# This file will be replaced when you run `moose db pull`.
# ...pydantic models matching remote EXTERNALLY_MANAGED tables...
Command
moose db pull --connection-string <STRING> [--file-path <OUTPUT_PATH>]
- —connection-string: Required. ClickHouse URL (native or HTTP/S)
- —file-path: Optional. Override the default output file. The file at this path will be regenerated (overwritten) on each run.
Typical Use Cases
Remote schema changed; update local types
Your DBA, CDC pipeline (e.g., ClickPipes), or ETL job updated a table’s schema. To keep your code accurate and type-safe, refresh your external models so queries, APIs, and materialized views reference the correct columns and types.
moose db pull --connection-string <YOUR_CLICKHOUSE_URL>
This updates only EXTERNALLY_MANAGED
models and leaves managed code untouched.
Automatically run on dev startup (keep local fresh)
In active development, schemas can drift faster than you commit updates. Running db pull
on dev startup helps ensure your local code matches the live schema you depend on.
export REMOTE_CLICKHOUSE_URL="clickhouse://<user>:<password>@<host>:<port>/<database>"
Add to moose.config.toml
:
[http_server_config]
on_first_start_script = "moose db pull --connection-string $REMOTE_CLICKHOUSE_URL"
This runs once when the dev server first starts. To run after code reloads, use on_reload_complete_script
. If you run this frequently, prefer HTTP(S) URLs and cache credentials via env/secrets to avoid friction.
New project from an existing DB
If you’re starting with an existing ClickHouse database, bootstrap code with init --from-remote
, then use db pull
over time to keep external models fresh:
moose init my-project --from-remote $REMOTE_CLICKHOUSE_URL --language <typescript|python>
See the full guide
Review the full getting started guide to learn more about how to bootstrap a new Moose OLAP project from an existing ClickHouse DB.
A new CDC/external table appeared; add it to code
Your CDC pipeline created a new table (or exposed a new stream). Pull to add the new table to your external models file automatically.
moose db pull --connection-string <YOUR_CLICKHOUSE_URL>
The regenerated external models file will now include the newly discovered external table.
Troubleshooting
- No changes written: Ensure tables are actually marked as
EXTERNALLY_MANAGED
and names match remote. - Unsupported types: The CLI will list tables with unsupported types; they’re skipped in the generated file.
- Auth/TLS errors: Verify scheme/ports (8123 or 8443) and credentials; try HTTPS if native URL fails.
- Git commit issues: The command attempts a lightweight commit; commit manually if your working tree is dirty.
Related
- External Tables: concepts and configuration
- Initialize from ClickHouse: bootstrap projects from an existing DB
- Supported Types: mapping and constraints