Compression Codecs

Moose lets you specify ClickHouse compression codecs per column to reduce storage size and improve query performance. Different codecs suit different data types, and you can chain multiple codecs together.

When to use compression codecs

  • Time series data: Use Delta or DoubleDelta for timestamps and monotonically increasing values
  • Floating point metrics: Use Gorilla codec for sensor data, temperatures, and other float values
  • Text and JSON: Use ZSTD with compression levels (1-22) for large strings and JSON
  • High cardinality data: Combine a specialized codec with a general-purpose compressor (e.g., Delta followed by LZ4, written as "Delta, LZ4")

Basic Usage

import { OlapTable, Key, DateTime, ClickHouseCodec, UInt64 } from "@514labs/moose-lib";
 
interface Metrics {
  id: Key<string>;
  // Delta codec for timestamps (monotonically increasing)
  timestamp: DateTime & ClickHouseCodec<"Delta, LZ4">;
 
  // Gorilla codec for floating point sensor data
  temperature: number & ClickHouseCodec<"Gorilla, ZSTD(3)">;
 
  // DoubleDelta for counters and metrics
  request_count: number & ClickHouseCodec<"DoubleDelta, LZ4">;
 
  // ZSTD for text/JSON with compression level
  log_data: Record<string, any> & ClickHouseCodec<"ZSTD(3)">;
  user_agent: string & ClickHouseCodec<"ZSTD(3)">;
 
  // Compress array elements
  tags: string[] & ClickHouseCodec<"LZ4">;
  event_ids: UInt64[] & ClickHouseCodec<"ZSTD(1)">;
}
 
export const MetricsTable = new OlapTable<Metrics>("Metrics", {
  orderByFields: ["id", "timestamp"]
});

Codec Chains

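You can chain multiple codecs together. On write, ClickHouse applies the codecs in the order listed (left to right), so a specialized codec such as Delta or Gorilla should come before a general-purpose compressor such as LZ4 or ZSTD.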

import { DateTime, ClickHouseCodec } from "@514labs/moose-lib";
 
interface Events {
  // Delta compress timestamps, then apply LZ4
  timestamp: DateTime & ClickHouseCodec<"Delta, LZ4">;
 
  // Gorilla for floats, then ZSTD for extra compression
  value: number & ClickHouseCodec<"Gorilla, ZSTD(3)">;
}
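
To see why a specialized codec helps the general-purpose one that follows it, here is a minimal sketch of delta encoding in plain TypeScript. It is a simplified model for illustration only, not ClickHouse's actual implementation; the function names `deltaEncode` and `deltaDecode` are made up for this example.

```typescript
// Simplified model of delta encoding (illustrative, not ClickHouse's implementation).
function deltaEncode(values: number[]): number[] {
  // Keep the first value as-is; store each later value as the
  // difference from its predecessor.
  return values.map((v, i) => (i === 0 ? v : v - values[i - 1]));
}

function deltaDecode(deltas: number[]): number[] {
  // Rebuild the original values with a running sum.
  const out: number[] = [];
  let prev = 0;
  for (const d of deltas) {
    prev += d;
    out.push(prev);
  }
  return out;
}

// Timestamps (seconds since epoch) sampled every 10 seconds.
const timestamps = [1700000000, 1700000010, 1700000020, 1700000030];
const deltas = deltaEncode(timestamps);
console.log(deltas); // [ 1700000000, 10, 10, 10 ]
console.log(deltaDecode(deltas)); // the original timestamps
```

After the Delta step, most of the column is small, repetitive numbers, which is the kind of input LZ4 or ZSTD compresses well. That is the reasoning behind chains such as "Delta, LZ4".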

Combining with Other Annotations

Codecs work alongside other ClickHouse annotations:

import { Key, DateTime, UInt64, ClickHouseCodec, ClickHouseDefault, ClickHouseTTL } from "@514labs/moose-lib";
 
interface UserEvents {
  id: Key<string>;
  timestamp: DateTime & ClickHouseCodec<"Delta, LZ4">;
 
  // Codec + Default value
  status: string & ClickHouseDefault<"'pending'"> & ClickHouseCodec<"ZSTD(3)">;
 
  // Codec + TTL
  email: string & ClickHouseTTL<"timestamp + INTERVAL 30 DAY"> & ClickHouseCodec<"ZSTD(3)">;
 
  // Codec + Numeric type
  event_count: UInt64 & ClickHouseCodec<"DoubleDelta, LZ4">;
}

Syncing from Remote Tables

When using moose init --from-remote to introspect existing ClickHouse tables, Moose automatically captures codec definitions and generates the appropriate annotations in your data models.
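
As a rough illustration of that mapping, the sketch below turns a codec clause into the matching annotation text. It assumes the clause arrives in the form `CODEC(...)`; the helpers `codecExpression` and `toAnnotation` are hypothetical and are not Moose's actual introspection code.

```typescript
// Hypothetical helpers for illustration; not part of Moose.

// Extract the inner expression from a clause such as "CODEC(Delta(4), LZ4)".
// Returns null when the input is not a CODEC(...) clause.
function codecExpression(clause: string): string | null {
  const match = /^\s*CODEC\((.*)\)\s*$/i.exec(clause);
  return match ? match[1].trim() : null;
}

// Build the TypeScript type text for a column, adding a ClickHouseCodec
// annotation only when a codec clause is present.
function toAnnotation(tsType: string, clause: string): string {
  const expr = codecExpression(clause);
  return expr === null ? tsType : `${tsType} & ClickHouseCodec<"${expr}">`;
}

console.log(toAnnotation("DateTime", "CODEC(Delta(4), LZ4)"));
// DateTime & ClickHouseCodec<"Delta(4), LZ4">
console.log(toAnnotation("string", ""));
// string
```

Note that the annotation carries only the inner expression, without the CODEC() wrapper.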

Notes

  • Codec expressions must be valid ClickHouse codec syntax (without the CODEC() wrapper)
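  • ClickHouse may normalize codecs by adding default parameters (e.g., Delta becomes Delta(4) on a 4-byte column), so the codec reported by ClickHouse can differ textually from the one you wrote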
  • Moose applies codec changes via migrations using ALTER TABLE ... MODIFY COLUMN
  • Not all codecs work with all data types; ClickHouse validates the combination when the table is created
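
The notes above can be sketched in plain TypeScript. Both helpers are hypothetical: `modifyColumnCodecSql` shows the general shape of an ALTER TABLE ... MODIFY COLUMN statement (the SQL Moose actually generates may differ in quoting and detail), and `normalizeCodec` fills in defaults for just two codecs, assuming Delta defaults to the column type's byte width and ZSTD defaults to level 1.

```typescript
// Hypothetical helpers for illustration; not part of Moose.

// The annotation holds the bare expression, so the CODEC() wrapper is added here.
function modifyColumnCodecSql(table: string, column: string, type: string, codec: string): string {
  return "ALTER TABLE `" + table + "` MODIFY COLUMN `" + column + "` " + type + " CODEC(" + codec + ")";
}

// Add default parameters so that "Delta" and "Delta(4)" compare equal.
// Only Delta and ZSTD are handled; other codecs pass through unchanged.
function normalizeCodec(expr: string, typeBytes: number): string {
  return expr
    .split(/,(?![^()]*\))/) // split on commas that are not inside parentheses
    .map((part) => part.trim())
    .map((part) => {
      if (part === "Delta") return `Delta(${typeBytes})`;
      if (part === "ZSTD") return "ZSTD(1)";
      return part;
    })
    .join(", ");
}

console.log(modifyColumnCodecSql("Metrics", "user_agent", "String", "ZSTD(3)"));
// ALTER TABLE `Metrics` MODIFY COLUMN `user_agent` String CODEC(ZSTD(3))
console.log(normalizeCodec("Delta, LZ4", 4) === normalizeCodec("Delta(4), LZ4", 4));
// true
```

Comparing normalized forms rather than raw strings is one way to avoid treating "Delta" and "Delta(4)" as a codec change.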