Database Tables
Tables are the foundation of Moose's data storage layer, designed to integrate with your entire data pipeline. They store data ingested from your APIs and streams, and they power the analytics queries behind your downstream transformations, APIs, and more.
Overview
Tables in Moose provide a type-safe way to define your database schema directly in your code, eliminating the tedious database administration tasks that typically interrupt your development flow.
Working with OLAP Tables
1. Define your table schema as a data model: create a type definition that describes your table structure
2. Instantiate your table: create a typed table instance from your model with one line of code
3. Configure storage options: optionally specify indexing, ordering, and other optimizations for your query patterns
4. Iterate on your schema: add, edit, or remove fields, and tables update automatically
Basic Example
import { Key, OlapTable } from "@514labs/moose-lib";

interface MyFirstTable {
  id: Key<string>;
  name: string;
  age: number;
}

// Create a table named "first_table"
const myTable = new OlapTable<MyFirstTable>("first_table");
from moose_lib import Key, OlapTable
from pydantic import BaseModel

class MyFirstTable(BaseModel):
    id: Key[str]
    name: str
    age: int

# Create a table named "first_table"
my_table = OlapTable[MyFirstTable]("first_table")
Benefits:
- ✅ Boilerplate CREATE/ALTER TABLE statements handled for you
- ✅ Automatic type mapping between your code and database types
- ✅ Built-in validation before storage
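To make the first benefit concrete, here is a rough sketch of the kind of DDL Moose derives from a model like the one above. This is a hypothetical illustration, not moose_lib's actual code generation; the real type mapping and engine configuration are more complete.

```python
from datetime import datetime

# Hypothetical mapping from model field types to ClickHouse column types
TYPE_MAP = {str: "String", int: "Int64", float: "Float64", bool: "Bool", datetime: "DateTime"}

def create_table_sql(name: str, fields: dict, order_by: list) -> str:
    """Render a CREATE TABLE statement for a simple flat schema."""
    cols = ", ".join(f"`{col}` {TYPE_MAP[t]}" for col, t in fields.items())
    return (
        f"CREATE TABLE IF NOT EXISTS {name} ({cols}) "
        f"ENGINE = MergeTree ORDER BY ({', '.join(order_by)})"
    )

ddl = create_table_sql("first_table", {"id": str, "name": str, "age": int}, ["id"])
# ddl: CREATE TABLE IF NOT EXISTS first_table (`id` String, `name` String, `age` Int64)
#      ENGINE = MergeTree ORDER BY (id)
```

With Moose, statements like this are generated and kept in sync with your model for you.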
Creating Tables
There are two main approaches to creating tables: instantiating a standalone OlapTable (as shown above), or creating one as part of an ingestion pipeline.
Creating Tables in Ingestion Pipelines
For end-to-end data flows, create tables as part of an ingestion pipeline:
import { IngestPipeline, Key } from "@514labs/moose-lib";
// Define your schema
interface UserEvent {
  id: Key<string>;
  userId: string;
  timestamp: Date;
  eventType: string;
}

// Create a complete ingestion pipeline with a table
const eventsPipeline = new IngestPipeline<UserEvent>("user_events", {
  ingest: true, // Creates a REST API endpoint at POST localhost:4000/ingest/user_events
  stream: true, // Creates a Kafka/Redpanda topic
  table: {      // Creates and configures the table named "user_events"
    orderByFields: ["id", "timestamp"],
    deduplicate: true
  }
});

// Access the table component when needed
const eventsTable = eventsPipeline.table;
from datetime import datetime

from moose_lib import IngestPipeline, IngestPipelineConfig, Key, OlapTableConfig
from pydantic import BaseModel

class UserEvent(BaseModel):
    id: Key[str]
    user_id: str
    timestamp: datetime
    event_type: str

events_pipeline = IngestPipeline[UserEvent]("user_events", IngestPipelineConfig(
    ingest=True,  # Creates a REST API endpoint at POST localhost:4000/ingest/user_events
    stream=True,  # Creates a Kafka/Redpanda topic
    table=OlapTableConfig(  # Creates and configures the table named "user_events"
        orderByFields=["id", "timestamp"],
        deduplicate=True
    )
))

# Access the table component when needed:
events_table = events_pipeline.get_table()
This integrated approach automatically creates the necessary infrastructure and connects all components, ensuring data flows smoothly from API to database with full type checking at every step.
Querying Tables
In TypeScript, Moose provides a sql tagged template literal for writing SQL queries against your tables, with support for interpolating table names, column names, and values. In Python, you can use f-strings to build queries against your tables in the same way.
This is useful for:
- Transforming your data in your database and saving the results as regular views or materialized views.
- Querying tables from your Consumption API route handlers to expose your data to client applications that are running outside of Moose.
import { Key, OlapTable, sql } from "@514labs/moose-lib";

interface MyTableSchema {
  id: Key<string>;
  name: string;
  value: number;
}

const myTable = new OlapTable<MyTableSchema>("my_table");

const query = sql`
  SELECT ${myTable.id}, ${myTable.name}, ${myTable.value} FROM ${myTable}
`;

// When Moose runs this query, it will automatically convert the interpolated
// values into the correct SQL syntax for your database:
// SELECT id, name, value FROM my_table
This lets you write SQL that feels like TypeScript - get autocomplete for your table columns and catch typos at dev time instead of waiting for runtime.
from moose_lib import Key, MooseClient, OlapTable
from pydantic import BaseModel

class MyTableSchema(BaseModel):
    id: Key[str]
    name: str
    value: int

my_table = OlapTable[MyTableSchema]("my_table")

client = MooseClient()
query = client.query.execute(
    f"SELECT id, name, value FROM {my_table.name}"  # Use an f-string to interpolate the table name
)
Configuring Table Indexes
You must configure table indexing using one of these approaches:
- Define at least one Key in your table schema
- Specify orderByFields in the table config
- Use both (all Key fields must come first in the orderByFields array)
Primary Key Only
import { OlapTable, Key } from "@514labs/moose-lib";
// Approach 1: Using primary key only
interface Record1 {
  id: Key<string>; // Primary key field
  field1: string;
  field2: number;
}

const table1 = new OlapTable<Record1>("table1"); // id is the primary key
from moose_lib import Key, OlapTable
from pydantic import BaseModel

class Record1(BaseModel):
    id: Key[str]  # Primary key field
    field1: str
    field2: int

table1 = OlapTable[Record1]("table1")  # id is the primary key
Order By Fields Only
// Approach 2: Using orderByFields only
interface SchemaWithoutPrimaryKey {
  field1: string;
  field2: number;
  field3: Date;
}

const tableWithOrderByFieldsOnly = new OlapTable<SchemaWithoutPrimaryKey>("table2", {
  orderByFields: ["field1", "field2"] // Specify ordering without primary key
});
Leverage the OlapTableConfig class to configure your table:
from moose_lib import Key, OlapTable, OlapTableConfig
from pydantic import BaseModel
from datetime import datetime

class SchemaWithoutPrimaryKey(BaseModel):
    field1: str
    field2: int
    field3: datetime

table2 = OlapTable[SchemaWithoutPrimaryKey]("table2", OlapTableConfig(
    orderByFields=["field1", "field2"]  # Specify ordering without primary key
))
Using Both Primary Key and Order By Fields
// Approach 3: Using both (primary key must be first)
interface SchemaWithKey {
  id: Key<string>; // Primary key field
  field1: string;
  field2: number;
}

const tableWithKeyAndOrderByFields = new OlapTable<SchemaWithKey>("table3", {
  orderByFields: ["id", "field1"] // Primary key must be first
});
from moose_lib import Key, OlapTable, OlapTableConfig
from pydantic import BaseModel

class SchemaWithKey(BaseModel):
    id: Key[str]
    field1: str
    field2: int

table3 = OlapTable[SchemaWithKey]("table3", OlapTableConfig(
    orderByFields=["id", "field1"]  # Primary key must be first
))
Using Multiple Primary Keys
interface MultiKeyRecord {
  key1: Key<string>;
  key2: Key<number>;
  field1: string;
}

const multiKeyTable = new OlapTable<MultiKeyRecord>("multi_key_table", {
  orderByFields: ["key1", "key2", "field1"] // Multiple keys must come first
});
from moose_lib import Key, OlapTable, OlapTableConfig
from pydantic import BaseModel

class MultiKeyRecord(BaseModel):
    key1: Key[str]
    key2: Key[int]
    field1: str

multi_key_table = OlapTable[MultiKeyRecord]("multi_key_table", OlapTableConfig(
    orderByFields=["key1", "key2", "field1"]  # Multiple keys must come first
))
Deduplication
Keep only the latest record per key:
interface Record {
  id: Key<string>;
}

const table = new OlapTable<Record>("table", {
  orderByFields: ["id"],
  deduplicate: true
});
from moose_lib import Key, OlapTable, OlapTableConfig
from pydantic import BaseModel

class Record(BaseModel):
    id: Key[str]

table = OlapTable[Record]("table", OlapTableConfig(
    orderByFields=["id"],
    deduplicate=True
))
Deduplication Caveats
Deduplication uses ClickHouse’s ReplacingMergeTree engine for eventual deduplication, meaning duplicate records may not be removed immediately. This process can take time and is not guaranteed to be immediate. For more details, see the ClickHouse documentation.
Invalid Configurations
// Error: No primary key or orderByFields
interface BadRecord1 {
  field1: string;
  field2: number;
}

const badTable1 = new OlapTable<BadRecord1>("bad_table1");

// Error: Primary key not first in orderByFields
interface BadRecord2 {
  id: Key<string>;
  field1: string;
}

const badTable2 = new OlapTable<BadRecord2>("bad_table2", {
  orderByFields: ["field1", "id"] // Wrong order - primary key must be first
});

// Error: Nullable field in orderByFields
interface BadRecord3 {
  id: Key<string>;
  field1: string;
  field2?: number;
}

const badTable3 = new OlapTable<BadRecord3>("bad_table3", {
  orderByFields: ["id", "field2"] // Can't have a nullable field in orderByFields
});
from moose_lib import Key, OlapTable, OlapTableConfig
from pydantic import BaseModel
from typing import Optional

class BadRecord1(BaseModel):
    field1: str
    field2: int

bad_table1 = OlapTable[BadRecord1]("bad_table1")  # Error: no primary key or orderByFields

class BadRecord2(BaseModel):
    id: Key[str]
    field1: str

bad_table2 = OlapTable[BadRecord2]("bad_table2", OlapTableConfig(
    orderByFields=["field1", "id"]  # Wrong order - primary key must be first
))

class BadRecord3(BaseModel):
    id: Key[str]
    field1: str
    field2: Optional[int] = None

bad_table3 = OlapTable[BadRecord3]("bad_table3", OlapTableConfig(
    orderByFields=["id", "field2"]  # Can't have a nullable field in orderByFields
))
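The rules behind these errors can be summarized in a small validator. This is a hypothetical sketch of the checks for illustration, not moose_lib's actual implementation:

```python
def validate_table_config(key_fields: list, order_by_fields: list, nullable_fields: set):
    """Raise ValueError for the invalid configurations shown above."""
    if not key_fields and not order_by_fields:
        raise ValueError("define at least one Key or specify orderByFields")
    if order_by_fields:
        # All Key fields must come first, in order
        if order_by_fields[: len(key_fields)] != key_fields:
            raise ValueError("Key fields must come first in orderByFields")
        for field in order_by_fields:
            if field in nullable_fields:
                raise ValueError(f"nullable field {field!r} cannot be in orderByFields")

# Valid: key first, no nullable fields in the ordering
validate_table_config(["id"], ["id", "field1"], set())
```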
Development Workflow
Local Development with Hot Reloading
One of the powerful features of Moose is its integration with the local development server:
- Start your local development server with moose dev
- When you define or modify an OlapTable in your code and save the file:
  - The changes are automatically detected
  - The TypeScript compiler plugin processes your schema definitions
  - The infrastructure is updated in real-time to match your code changes
  - Your tables are immediately available for testing
For example, if you add a new field to your schema:
// Before
interface BasicSchema {
  id: Key<string>;
  name: string;
}

// After adding a field
interface BasicSchema {
  id: Key<string>;
  name: string;
  createdAt: Date; // New field
}
# Before
class BasicSchema(BaseModel):
    id: Key[str]
    name: str

# After adding a field
class BasicSchema(BaseModel):
    id: Key[str]
    name: str
    created_at: datetime  # New field
The Moose framework will:
- Detect the change when you save the file
- Update the table schema in the local ClickHouse instance
- Make the new field immediately available for use
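Conceptually, the framework diffs the old and new schemas and applies the difference. The following is a simplified sketch of that idea for additive changes only, not the actual migration engine:

```python
from datetime import datetime

# Hypothetical type mapping used for illustration
TYPE_MAP = {str: "String", int: "Int64", datetime: "DateTime"}

def schema_diff_to_ddl(table: str, old_fields: dict, new_fields: dict) -> list:
    """Emit ALTER TABLE statements for columns added to the model."""
    stmts = []
    for col, col_type in new_fields.items():
        if col not in old_fields:
            stmts.append(f"ALTER TABLE {table} ADD COLUMN `{col}` {TYPE_MAP[col_type]}")
    return stmts

before = {"id": str, "name": str}
after = {"id": str, "name": str, "created_at": datetime}
stmts = schema_diff_to_ddl("basic_schema", before, after)
# stmts == ["ALTER TABLE basic_schema ADD COLUMN `created_at` DateTime"]
```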
Verifying Your Tables
You can verify your tables were created correctly using:
# List all tables in your local environment
moose ls
Connecting to your local ClickHouse instance
You can connect to your local ClickHouse instance with your favorite database client. Your credentials are located in your .moose/config.toml file:
[clickhouse_config]
db_name = "local"
user = "panda"
password = "pandapass"
use_ssl = false
host = "localhost"
host_port = 18123
native_port = 9000
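As one example, with moose dev running you could query the ClickHouse HTTP interface directly using these credentials. This is a standard-library-only sketch; the host, port, and credentials are taken from the config above:

```python
import base64
import urllib.parse
import urllib.request

# Build a query against the ClickHouse HTTP interface (host_port = 18123)
params = urllib.parse.urlencode({"query": "SHOW TABLES FROM local"})
url = f"http://localhost:18123/?{params}"

request = urllib.request.Request(url)
credentials = base64.b64encode(b"panda:pandapass").decode()  # user:password from config.toml
request.add_header("Authorization", f"Basic {credentials}")

# Requires a running `moose dev` instance:
# with urllib.request.urlopen(request) as response:
#     print(response.read().decode())
```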