
Static Report Generation

TL;DR
  • Static reports are pre-generated snapshots (PDF, Excel, CSV) that capture data at a specific point in time
  • They remain essential for compliance, auditability, and stakeholder workflows despite interactive dashboards
  • Moose + ClickHouse provides a developer-first approach: define TypeScript models, get auto-generated infrastructure

This guide will:

  1. Help you understand when static reports are the right choice
  2. Walk through a complete implementation with file ingestion, materialized views, and PDF generation
  3. Explain the underlying concepts so you can adapt the approach to your needs
Jump to a section:

  • Business Context: understand when and why static reports matter
  • The Scenario: the Apex Retail daily sales consolidation example
  • Start Building: setup, data models, ingestion, and PDF export

How to use this guide

Different starting points depending on what you need:

Business leaders and operations teams: Read Parts 1-2 for the business case and use cases. Skim Part 3 for a concrete example, then hand off to developers.

Developers: Skip to Part 4 and start building. Parts 1-2 have business context if you need it later.

Data engineers evaluating this approach: Parts 1-2 cover the business case, Part 3 is a concrete scenario, Parts 4-8 are implementation, Part 9 explains the underlying concepts.

Prerequisites

  • Node.js 20+ (required for pdfmake 0.3)
  • Docker Desktop running
  • pnpm (npm install -g pnpm)
  • Basic TypeScript and SQL knowledge

Verify your environment

Before starting, confirm your tools are ready:

node --version   # Should be v20.x.x or higher
docker --version
docker info > /dev/null 2>&1 && echo "Docker is running" || echo "Docker is NOT running"

If Docker isn't running, start Docker Desktop before proceeding.

Tips for success

Keep moose dev running throughout the tutorial. Use a dedicated terminal tab or window. If you accidentally close it, run moose dev again from the project directory.

First startup takes longer. Docker images download on first run (2-5 minutes depending on connection). Subsequent starts are much faster.

Services need time to initialize. After creating new APIs or views, wait 20-30 seconds before testing. Watch the moose dev terminal for compilation messages.

File watching is automatic. Save a file and the system rebuilds—no manual restart needed. Type errors block startup, so fix them before proceeding.

Logs are invaluable for debugging. All logs are stored in ~/.moose/YYYY-MM-DD-cli.log. Watch in real-time:

tail -f ~/.moose/$(date +%Y-%m-%d)-cli.log

Verify data at each stage. Don't assume it worked—validate with queries. Test APIs with curl before running scripts that depend on them.


Part 1: What are static reports?

Static reports are pre-generated snapshots of data, delivered as fixed artifacts—most commonly PDF or Excel files, though CSV exports are also common.

These reports capture what the data looked like at a specific point in time, with a defined and unchanging scope, metric logic, and layout. Once generated, a static report remains frozen until it is explicitly regenerated.

Key characteristics

  • Pre-generated snapshots: delivered as fixed artifacts (PDF, Excel, CSV)
  • Point-in-time capture: defined, unchanging scope and metric logic
  • Stable reference: frozen documents preferred by many stakeholders over interactive dashboards
  • Auditability support: enable sign-off workflows and compliance requirements
  • Portable handoffs: act as handoff layers for Finance, Operations, and Compliance

Why static reports matter

Despite the proliferation of interactive dashboards and real-time analytics tools, static reports remain essential to business operations:

| Reason | Business Value |
|---|---|
| Stable reference points | Stakeholders prefer frozen documents they can review, annotate, and archive rather than dashboards that show different numbers each time |
| Reduced version confusion | When everyone references the same PDF from the same date, there's no ambiguity about which numbers are being discussed |
| Compliance and auditability | Regulatory requirements often mandate immutable records that can be retrieved and verified years later |
| Universal accessibility | Not everyone has system access—many stakeholders need a PDF they can email or an Excel file they can manipulate offline |
| Period-close reliability | Financial close processes depend on definitive snapshots that won't change after sign-off |

Part 2: Business use cases

Understanding where static reports create value helps organizations prioritize their data infrastructure investments.

Selling data assets to third parties

Static reports enable revenue generation through data monetization:

Data archival and retrieval:

  • Medical patient record retrieval for healthcare providers and researchers
  • Mortgage records and refinancing documentation for financial institutions
  • Historical transaction records for audits and legal proceedings

Data broker services:

  • Medical research records for pharmaceutical companies and academic institutions
  • Vehicle history reports (CarFax-style) for automotive transactions
  • Credit history reports for lending decisions
  • Purchase history for marketing and customer analytics
  • Real estate transaction history for property valuations

Compliance and regulated reporting

For organizations in regulated industries, static reports provide critical capabilities:

  • Immutable, versioned records that support approval workflows and retention policies
  • External sharing capabilities without requiring system access
  • Reproducibility through stored run metadata including dataset version, query parameters, pipeline run timestamp, and result hash
  • Easy cross-department distribution while maintaining audit trails
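To make the reproducibility point concrete, here is a minimal sketch of how run metadata might be captured alongside a generated report. The `ReportRunMetadata` shape and the `buildRunMetadata`/`hashResult` helpers are hypothetical names for illustration, not part of any library:

```typescript
import { createHash } from "node:crypto";

// Hypothetical run-metadata record: everything needed to prove that a
// regenerated report was produced from the same inputs.
interface ReportRunMetadata {
  datasetVersion: string;              // e.g., a snapshot or table version ID
  queryParams: Record<string, string>; // the parameters the report ran with
  pipelineRunTimestamp: string;        // ISO-8601 generation time
  resultHash: string;                  // SHA-256 of the serialized result set
}

// Hash the canonical JSON serialization of the result rows so two runs
// over the same data can be compared byte-for-byte.
export function hashResult(rows: unknown[]): string {
  return createHash("sha256").update(JSON.stringify(rows)).digest("hex");
}

export function buildRunMetadata(
  datasetVersion: string,
  queryParams: Record<string, string>,
  rows: unknown[]
): ReportRunMetadata {
  return {
    datasetVersion,
    queryParams,
    pipelineRunTimestamp: new Date().toISOString(),
    resultHash: hashResult(rows),
  };
}
```

Storing a record like this next to each artifact lets an auditor verify years later that a re-run produced the identical result set.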

Analyst workflows

Many analysts and business users prefer working with exported data rather than interactive dashboards:

  • Exporting curated datasets (not raw data) into Excel or Google Sheets
  • Faster pivot table creation and data exploration
  • Reconciliation workflows comparing multiple data sources
  • Annotation and commentary that lives alongside the data

Automations and operational handoffs

Static reports serve as predictable, machine-readable inputs to downstream processes:

  • Email-to-ticket systems that process report contents
  • Slack integrations that distribute key metrics
  • RPA (Robotic Process Automation) tools that consume standardized formats
  • System migrations enabling side-by-side comparison between old and new systems

Understanding the challenges

Before investing in static report generation capabilities, understand the technical challenges:

| Challenge | Impact |
|---|---|
| Data discovery | Teams spend hours finding duplicates, missing fields, and inconsistent definitions before reports can be generated |
| Data cleaning | Format inconsistencies, NULL handling, and duplicate entities cause report errors |
| Multiple formats | Same logical data arrives in different structures from different sources |
| Validation and trust | Without structured UAT and golden records, errors reach executives or regulators |
| Timing and orchestration | Reports generated before data is complete deliver flawed results |

Evaluating solutions

When evaluating static report generation solutions, consider:

| Capability | What to Look For |
|---|---|
| Data quality | Automated profiling, duplicate detection, validation rules |
| Format support | Native handling of Excel, PDF, CSV, JSON, XML |
| Audit trail | Complete lineage from source to report with timestamps |
| Scalability | OLAP-based processing for large datasets |
| Orchestration | Dependency management, freshness checks, retry logic |
| Self-service | User-friendly export interfaces with consistent formatting |

Why Moose for static reports?

Moose addresses these challenges with a developer-first approach:

| Traditional Approach | With Moose |
|---|---|
| Manual ClickHouse setup | Data model → auto infrastructure |
| Custom ingestion code | Auto-generated /ingest API |
| Hand-written DDL | Type-safe schema from TypeScript |
| Slow iteration cycles | Hot reload, local-first development |
| Manual API boilerplate | Type-safe Consumption APIs |
| Risky production changes | Lifecycle management prevents accidents |

Part 3: The scenario

Apex Retail daily sales consolidation

You're a data engineer at Apex Retail, a chain with 12 stores across 4 regions. Each night, store POS (point-of-sale) systems export the day's sales as CSV or JSON files.

The current process: An analyst downloads files from each store, copies data into Excel, and emails a summary to regional managers. Takes 2 hours every morning. When the analyst is sick, reports don't go out.

What you'll build: A pipeline that ingests store uploads, stores them in ClickHouse, pre-aggregates daily summaries, and generates PDF reports, all defined in a few hundred lines of TypeScript.

Why automate?

| Factor | Manual | Automated |
|---|---|---|
| Time per report | 2 hours | < 1 minute |
| Annual analyst hours | 500+ | ~10 (maintenance) |
| Error rate | 5-10% | 0% |
| Availability | One person | 24/7 |
| Audit trail | None | Git history |
| Scalability | Linear | Constant |

Sound familiar?

  • Reports late when the analyst is out
  • "The numbers don't match" disputes
  • Hours debugging Excel formulas
  • Can't pull last Tuesday's numbers quickly

Why not existing tools?

| Approach | Problem |
|---|---|
| Excel macros | No version control, breaks when formats change |
| BI dashboards | Good for exploration, PDF export still needs code |
| Scheduled SQL | Raw SQL gets unmaintainable |
| Manual scripts | No streaming, no type safety |

This approach:

  • Type-safe models catch schema mismatches at compile time
  • Streaming ingestion via Redpanda
  • Materialized views make queries instant
  • Git history tracks all changes
  • Same inputs produce same outputs

Architecture

[Architecture diagram: store file uploads → file loader → /ingest/sales API → Redpanda stream → ClickHouse sales table → DailySalesSummary materialized view → /consumption/sales-report API → PDF report]

End result

  1. File loader script for CSV/JSON ingestion
  2. Pre-computed daily aggregations (millisecond queries)
  3. Parameterized JSON API endpoint
  4. PDF export via local script or API endpoint

Part 4: Setup and data model

Step 1: Install Moose CLI

bash -i <(curl -fsSL https://fiveonefour.com/install.sh) moose

Verify:

moose --version

Step 2: Initialize project

moose init sales-reports typescript-empty
cd sales-reports
pnpm install

Moose uses pnpm workspaces. npm install won't work.

Step 3: Start dev server

Stop other Moose projects first

If you have other Moose projects running, stop them before continuing. Multiple Moose instances will conflict on ports 4000, 9000, and 18123. Use docker ps to check for running containers and stop them if needed.

Open a new terminal window (or terminal tab) and run:

cd sales-reports
moose dev
Important

moose dev runs continuously and doesn't exit. Keep this terminal open for the entire tutorial. You'll run all other commands (curl, scripts, etc.) in a separate terminal.

The dev server auto-reloads on file changes. First run may auto-upgrade the CLI to match moose-lib version.

Wait for services to start

First startup takes 2-5 minutes while Docker images download. Subsequent starts are faster (10-30 seconds).

Watch the terminal for these indicators that services are ready:

  • ✓ ClickHouse is ready
  • ✓ Redpanda is ready
  • HTTP server started on port 4000

Verify services

In your other terminal, test the services:

# Test the API server
curl http://localhost:4000/health

Should return JSON with healthy and unhealthy arrays listing service status.

curl http://localhost:18123/ping

Should return Ok.

If checks fail, wait another 20-30 seconds. Common startup issues:

  • Connection refused: Services still initializing—wait longer
  • Docker errors: Ensure Docker Desktop is running (check menu bar icon on Mac)

What started:

| Service | Port | Description |
|---|---|---|
| HTTP Server | 4000 | API endpoints |
| ClickHouse | 18123 | Columnar database for analytics |
| Redpanda | 19092 | Message queue for data streaming (Kafka-compatible) |
| Temporal | 8080 | Scheduled task runner |

All in Docker. No cloud accounts needed.

Debugging tip

ClickHouse has a web UI at http://localhost:18123/play for running queries interactively—useful for debugging.

Step 4: Define data model

Create app/ingest/SalesTransaction.ts:

import { IngestPipeline } from "@514labs/moose-lib";

// Define the data model - matches store POS export format
export interface SalesTransaction {
  // Unique identifier
  transactionId: string; // e.g., "TXN-W001-20250115-0001"

  // Timestamp - the star of time-series analytics
  saleTimestamp: Date;

  // Store & location
  storeId: string; // e.g., "STORE-W001"
  region: string;  // West, East, Central, South

  // Product info
  productId: string;
  productCategory: string; // Electronics, Clothing, Food, Home, Sports

  // Customer (optional - from loyalty card)
  customerId: string;

  // Transaction amounts
  quantity: number;
  unitPrice: number;
  totalAmount: number;
}

// IMPORTANT: Always include the type parameter <SalesTransaction> - it's required
// for the compiler plugin to generate the correct schema
export const salesPipeline = new IngestPipeline<SalesTransaction>("sales", {
  ingestApi: true,
  stream: true,
  table: {
    // Time-first ordering for efficient range queries
    // The first field becomes the primary key in ClickHouse
    orderByFields: ["saleTimestamp", "transactionId"],
  },
});

The IngestPipeline must include the type parameter. Without it, Moose reports: "Supply the type param T so that the schema is inserted by the compiler plugin."

This declaration creates:

  • ClickHouse table matching the interface
  • Redpanda stream
  • REST endpoint at /ingest/sales

Step 5: Export models

Update app/index.ts:

export * from "./ingest/SalesTransaction";
Important

Every new model, view, or API must be exported from app/index.ts. Forgetting this is a common mistake—without the export, tables and endpoints won't be created, and there won't be any errors in the logs.

Verify

Check the moose dev terminal for:

✓ Detected model changes
✓ Created table: sales
✓ Created stream: sales
✓ Registered API endpoint: /ingest/sales

If not visible, check that the file saved and look for TypeScript errors.

From that one interface, Moose inferred the ClickHouse schema, created the table with time-first ordering, set up a Redpanda topic, and generated a REST endpoint with validation.

Step 6: Test ingestion

curl -X POST http://localhost:4000/ingest/sales \
  -H "Content-Type: application/json" \
  -d '{
    "transactionId": "TXN-W001-20250115-0001",
    "saleTimestamp": "2025-01-15T10:30:00Z",
    "storeId": "STORE-W001",
    "region": "West",
    "productId": "PROD-ELE-001",
    "productCategory": "Electronics",
    "customerId": "CUST-1001",
    "quantity": 2,
    "unitPrice": 299.99,
    "totalAmount": 599.98
  }'

Returns SUCCESS.

The data flows through Redpanda into ClickHouse.

Verify in ClickHouse (credentials are dev defaults, don't use in production):

curl -s "http://localhost:18123/?user=panda&password=pandapass" \
  -d "SELECT * FROM local.sales FORMAT Pretty"

Part 5: File ingestion

Stores upload CSV or JSON files rather than calling APIs directly. Here's a file loader.

Step 1: Install dependencies

pnpm add papaparse date-fns
pnpm add -D @types/papaparse ts-node

ts-node lets you run TypeScript files directly without compiling first.

Step 2: Create file loader

Create scripts/load-file.ts:

import Papa from "papaparse";
import fs from "fs";
import path from "path";

// Note: saleTimestamp can be string or Date because papaparse dynamicTyping
// converts ISO date strings to Date objects
interface SalesTransaction {
  transactionId: string;
  saleTimestamp: string | Date;
  storeId: string;
  region: string;
  productId: string;
  productCategory: string;
  customerId: string | null; // null for customers without loyalty cards
  quantity: number;
  unitPrice: number;
  totalAmount: number;
}

async function loadFile(filePath: string): Promise<void> {
  const ext = path.extname(filePath).toLowerCase();
  const content = fs.readFileSync(filePath, "utf-8");

  let records: SalesTransaction[];

  if (ext === ".csv") {
    const { data } = Papa.parse<SalesTransaction>(content, {
      header: true,
      dynamicTyping: true,
    });
    // Filter empty rows and normalize fields
    records = data
      .filter((r) => r.transactionId)
      .map((r) => ({
        ...r,
        // papaparse dynamicTyping converts Date objects - convert back to ISO string
        saleTimestamp: r.saleTimestamp instanceof Date
          ? r.saleTimestamp.toISOString()
          : r.saleTimestamp,
        // papaparse dynamicTyping converts empty CSV fields to null
        // but the API expects empty strings for optional string fields
        customerId: r.customerId ?? "",
      }));
  } else if (ext === ".json") {
    const parsed = JSON.parse(content);
    const rawRecords: SalesTransaction[] = Array.isArray(parsed) ? parsed : [parsed];
    // Apply same normalization as CSV: filter empty rows and normalize fields
    records = rawRecords
      .filter((r) => r.transactionId)
      .map((r) => ({
        ...r,
        // Normalize null customerId to empty string (API expects string, not null)
        customerId: r.customerId ?? "",
      }));
  } else {
    throw new Error(`Unsupported file format: ${ext}`);
  }

  console.log(`Parsed ${records.length} records from ${filePath}`);

  // Send records to ingest API (collecting errors for partial failure handling)
  const errors: Array<{ index: number; error: string }> = [];
  for (let i = 0; i < records.length; i++) {
    const response = await fetch("http://localhost:4000/ingest/sales", {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify(records[i]),
    });

    if (!response.ok) {
      errors.push({ index: i, error: response.statusText });
    }
  }

  if (errors.length > 0) {
    console.error(`Failed to ingest ${errors.length} records:`, errors);
  }
  console.log(`Successfully loaded ${records.length - errors.length} of ${records.length} records`);
}

// CLI execution
const filePath = process.argv[2];
if (!filePath) {
  console.error("Usage: pnpm exec ts-node scripts/load-file.ts <path-to-file>");
  process.exit(1);
}

loadFile(filePath).catch(console.error);

This sends records one at a time. For production, batch them.
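As a sketch of what batching could look like: rather than assuming the ingest endpoint accepts arrays, this sends fixed-size chunks of requests concurrently. The `chunk` helper, `BATCH_SIZE`, and `loadBatched` are illustrative names, not part of Moose:

```typescript
// Generic helper: split an array into consecutive batches of at most `size`.
function chunk<T>(items: T[], size: number): T[][] {
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

const BATCH_SIZE = 50; // arbitrary; tune for your environment

// Fire one chunk of POSTs concurrently, wait for the chunk to settle, then
// move to the next chunk so the dev server isn't overwhelmed.
// Returns the number of records that failed to ingest.
async function loadBatched(records: object[]): Promise<number> {
  let failed = 0;
  for (const batch of chunk(records, BATCH_SIZE)) {
    const results = await Promise.allSettled(
      batch.map((record) =>
        fetch("http://localhost:4000/ingest/sales", {
          method: "POST",
          headers: { "Content-Type": "application/json" },
          body: JSON.stringify(record),
        }).then((res) => {
          if (!res.ok) throw new Error(res.statusText);
        })
      )
    );
    failed += results.filter((r) => r.status === "rejected").length;
  }
  return failed;
}
```

Promise.allSettled (rather than Promise.all) keeps one failed record from aborting the rest of the chunk, matching the partial-failure handling in the loader above.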

Step 3: Create sample data

Create sample-data/west-store-001.csv:

transactionId,saleTimestamp,storeId,region,customerId,productId,productCategory,quantity,unitPrice,totalAmount
TXN-W001-20250115-0001,2025-01-15T09:15:00Z,STORE-W001,West,CUST-1001,PROD-ELE-001,Electronics,2,299.99,599.98
TXN-W001-20250115-0002,2025-01-15T09:32:00Z,STORE-W001,West,CUST-1002,PROD-CLO-015,Clothing,1,49.99,49.99
TXN-W001-20250115-0003,2025-01-15T10:05:00Z,STORE-W001,West,,PROD-FOO-042,Food,3,12.99,38.97
TXN-W001-20250115-0004,2025-01-15T10:45:00Z,STORE-W001,West,CUST-1003,PROD-HOM-008,Home,1,89.99,89.99
TXN-W001-20250115-0005,2025-01-15T11:20:00Z,STORE-W001,West,CUST-1004,PROD-SPO-022,Sports,2,34.99,69.98
TXN-W001-20250115-0006,2025-01-15T12:00:00Z,STORE-W001,West,,PROD-ELE-055,Electronics,1,149.99,149.99
TXN-W001-20250115-0007,2025-01-15T13:30:00Z,STORE-W001,West,CUST-1005,PROD-CLO-033,Clothing,3,29.99,89.97
TXN-W001-20250115-0008,2025-01-15T14:15:00Z,STORE-W001,West,CUST-1001,PROD-FOO-018,Food,5,8.99,44.95
TXN-W001-20250115-0009,2025-01-15T15:45:00Z,STORE-W001,West,CUST-1006,PROD-HOM-041,Home,1,199.99,199.99
TXN-W001-20250115-0010,2025-01-15T16:30:00Z,STORE-W001,West,,PROD-ELE-012,Electronics,1,599.99,599.99

Empty customerId = no loyalty card.

Step 4: Load sample data

mkdir -p sample-data
# Create the CSV file above, then:
pnpm exec ts-node scripts/load-file.ts sample-data/west-store-001.csv
Script shortcut

You can also add a script to package.json: "load-file": "ts-node scripts/load-file.ts" and run pnpm run load-file sample-data/west-store-001.csv.

Output:

Parsed 10 records from sample-data/west-store-001.csv
Successfully loaded 10 of 10 records

Verify:

curl -s "http://localhost:18123/?user=panda&password=pandapass" \
  -d "SELECT count() FROM local.sales"
Note on duplicates

Running the file loader multiple times creates duplicate records—ClickHouse doesn't enforce unique constraints. To start fresh, stop moose dev and remove the project's Docker volumes. Caution: docker volume prune deletes all unused volumes on your machine. Instead, use docker volume ls to list volumes, identify the ones for your project (typically containing clickhouse or redpanda), and remove them selectively with docker volume rm <volume_name>.

Step 5: Add validation (recommended)

Production data is messy. Add validation to catch issues early.
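A minimal sketch of such a validation pass, assuming the SalesTransaction field names from Part 4 (the specific rules and the `validateTransaction` helper are illustrative, not prescribed by Moose):

```typescript
// Shape of a record as parsed from a file, before any guarantees hold.
interface RawRecord {
  transactionId?: string;
  saleTimestamp?: string;
  quantity?: number;
  unitPrice?: number;
  totalAmount?: number;
}

// Returns a list of problems; an empty list means the record is safe to send.
export function validateTransaction(r: RawRecord): string[] {
  const errors: string[] = [];
  if (!r.transactionId) errors.push("missing transactionId");
  if (!r.saleTimestamp || isNaN(Date.parse(r.saleTimestamp))) {
    errors.push("saleTimestamp is not a parseable date");
  }
  if (typeof r.quantity !== "number" || r.quantity <= 0) {
    errors.push("quantity must be a positive number");
  }
  if (typeof r.unitPrice !== "number" || r.unitPrice < 0) {
    errors.push("unitPrice must be non-negative");
  }
  // Cross-field check: total should equal quantity x unit price
  // (small tolerance for floating-point rounding).
  if (
    typeof r.totalAmount === "number" &&
    typeof r.quantity === "number" &&
    typeof r.unitPrice === "number" &&
    Math.abs(r.totalAmount - r.quantity * r.unitPrice) > 0.01
  ) {
    errors.push("totalAmount does not equal quantity * unitPrice");
  }
  return errors;
}
```

Run this in the file loader before POSTing, and log rejected records to a quarantine file instead of silently dropping them.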


Part 6: Time-series aggregation

Create a materialized view that pre-aggregates daily summaries.

How ClickHouse materialized views work

  • Views execute on INSERT, not on a schedule
  • Data aggregates per batch, not globally
  • Background merges consolidate eventually, but always use GROUP BY for accurate results
  • SummingMergeTree only works for sum() and count()

Step 1: Create materialized view

Create app/views/DailySalesSummary.ts:

import { MaterializedView, sql } from "@514labs/moose-lib";
import { salesPipeline } from "../ingest/SalesTransaction";

// Define the aggregated data model
export interface DailySalesSummary {
  saleDate: Date;
  region: string;
  productCategory: string;
  transactionCount: number;
  totalQuantity: number;
  totalRevenue: number;
}

// Get table reference from pipeline (with null check for TypeScript)
const salesTable = salesPipeline.table;
if (!salesTable) {
  throw new Error("Sales table not configured in pipeline");
}

// Create materialized view with auto-created target table
export const dailySalesMV = new MaterializedView<DailySalesSummary>({
  tableName: "DailySalesSummary",
  materializedViewName: "DailySalesSummary_MV",
  orderByFields: ["saleDate", "region", "productCategory"],
  selectTables: [salesTable],
  selectStatement: sql`
    SELECT
      toStartOfDay(saleTimestamp) AS saleDate,
      region,
      productCategory,
      count() AS transactionCount,
      sum(quantity) AS totalQuantity,
      sum(totalAmount) AS totalRevenue
    FROM ${salesTable}
    GROUP BY saleDate, region, productCategory
  `,
});

toStartOfDay() truncates to daily granularity. ORDER BY saleDate first for efficient date-range queries. Always use GROUP BY when querying.

Step 2: Update exports

Add to app/index.ts:

export * from "./ingest/SalesTransaction";
export * from "./views/DailySalesSummary";

Step 3: Verify

curl -s "http://localhost:18123/?user=panda&password=pandapass" \
  -d "SHOW TABLES FROM local"

Should show both sales and DailySalesSummary.

On each INSERT to sales, the materialized view (MV) runs and groups transactions by day, collapsing potentially millions of rows into one per day/region/category. Results land in DailySalesSummary. Since MVs aggregate per-batch, you need GROUP BY when querying.

Performance difference:

| Target | Rows | Query time |
|---|---|---|
| Raw transactions (1 year) | ~10M | 2-10 sec |
| Daily summaries (1 year) | ~1,500 | 10-50 ms |

That's 100x faster because aggregation happens once at insert time, not on every query.

Step 4: Query aggregated data

Always use GROUP BY:

curl -s "http://localhost:18123/?user=panda&password=pandapass" -d "
  SELECT
    saleDate,
    region,
    sum(transactionCount) AS transactions,
    sum(totalRevenue) AS revenue
  FROM local.DailySalesSummary
  GROUP BY saleDate, region
  ORDER BY saleDate, region
  FORMAT Pretty
"

ClickHouse time functions

| Function | Output | Use case |
|---|---|---|
| toStartOfHour(ts) | 2025-01-15 10:00:00 | Intraday |
| toStartOfDay(ts) | 2025-01-15 | Daily |
| toStartOfWeek(ts, 1) | 2025-01-13 | Weekly (Monday) |
| toStartOfMonth(ts) | 2025-01-01 | Monthly |
| toHour(ts) | 10 | Hour-of-day |
| toDayOfWeek(ts) | 3 (Wed) | Day-of-week |

Part 7: Report API

Build a consumption API with parameterized date ranges.

Step 1: Create API

Create app/apis/salesReport.ts:

import { Api } from "@514labs/moose-lib";
import { dailySalesMV } from "../views/DailySalesSummary";

// Define query parameters
export interface SalesReportParams {
  startDate: string; // Format: YYYY-MM-DD
  endDate: string;   // Format: YYYY-MM-DD
  region?: string;   // Optional filter
}

// Define response interface
export interface SalesReportResponse {
  saleDate: string;
  region: string;
  productCategory: string;
  transactionCount: number;
  totalQuantity: number;
  totalRevenue: number;
}

// Create the API endpoint
export const salesReportApi = new Api<SalesReportParams, SalesReportResponse[]>(
  "sales-report",
  async ({ startDate, endDate, region }, { client, sql }) => {
    // Access the target table from the materialized view
    const targetTable = dailySalesMV.targetTable;

    // Build query with optional region filter
    // Note: We use separate queries for clarity. Production code could
    // consolidate these using conditional query building.
    let query;
    if (region) {
      query = sql`
        SELECT
          saleDate,
          region,
          productCategory,
          sum(transactionCount) AS transactionCount,
          sum(totalQuantity) AS totalQuantity,
          sum(totalRevenue) AS totalRevenue
        FROM ${targetTable}
        WHERE saleDate BETWEEN ${startDate} AND ${endDate}
          AND region = ${region}
        GROUP BY saleDate, region, productCategory
        ORDER BY saleDate DESC, totalRevenue DESC
      `;
    } else {
      query = sql`
        SELECT
          saleDate,
          region,
          productCategory,
          sum(transactionCount) AS transactionCount,
          sum(totalQuantity) AS totalQuantity,
          sum(totalRevenue) AS totalRevenue
        FROM ${targetTable}
        WHERE saleDate BETWEEN ${startDate} AND ${endDate}
        GROUP BY saleDate, region, productCategory
        ORDER BY saleDate DESC, totalRevenue DESC
      `;
    }

    const result = await client.query.execute<SalesReportResponse>(query);
    return result.json();
  }
);

Step 2: Update exports

Add to app/index.ts:

export * from "./ingest/SalesTransaction";
export * from "./views/DailySalesSummary";
export * from "./apis/salesReport";

This creates a REST endpoint at /consumption/sales-report with startDate, endDate, and optional region parameters. It queries the MV (fast), uses parameterized SQL (no injection risk), and returns typed JSON.

PDF generator and dashboards both hit this endpoint.
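As an example of a downstream consumer, a TypeScript client might build the request URL like this. The `buildReportUrl` helper is hypothetical; only the endpoint path and parameter names come from the API above:

```typescript
// Build a /consumption/sales-report URL with properly encoded parameters.
export function buildReportUrl(
  base: string,
  params: { startDate: string; endDate: string; region?: string }
): string {
  const qs = new URLSearchParams({
    startDate: params.startDate,
    endDate: params.endDate,
  });
  // region is optional; omit it entirely when not filtering
  if (params.region) qs.set("region", params.region);
  return `${base}/consumption/sales-report?${qs.toString()}`;
}

// Usage (requires moose dev running):
// const rows = await fetch(
//   buildReportUrl("http://localhost:4000", {
//     startDate: "2025-01-15",
//     endDate: "2025-01-15",
//   })
// ).then((r) => r.json());
```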

Step 3: Test

curl "http://localhost:4000/consumption/sales-report?startDate=2025-01-01&endDate=2025-01-31"

A successful response returns JSON like this:

[
  {
    "saleDate": "2025-01-15",
    "region": "West",
    "productCategory": "Electronics",
    "transactionCount": 3,
    "totalQuantity": 4,
    "totalRevenue": 1349.96
  },
  {
    "saleDate": "2025-01-15",
    "region": "West",
    "productCategory": "Clothing",
    "transactionCount": 2,
    "totalQuantity": 4,
    "totalRevenue": 139.96
  }
]

If you get an empty array [], verify data was ingested in Parts 4-5 and that the date range matches your data.

Filter by region:

curl "http://localhost:4000/consumption/sales-report?startDate=2025-01-15&endDate=2025-01-15&region=West"

Part 8: PDF export

Transform JSON into PDF. You'll implement two complementary approaches:

  1. Local script - For batch generation, scheduled jobs, and local development
  2. API endpoint - For on-demand generation and external system integration

Implement both to support the full range of PDF generation use cases.

Approach A: Local script

Step 1: Install pdfmake

pnpm add pdfmake
pnpm add -D @types/pdfmake
Node.js requirement

This guide uses pdfmake 0.3.x which requires Node.js 20 LTS (Long-Term Support) or higher. The API changed significantly from version 0.2.x—see the pdfmake 0.3 documentation for details.

Step 2: Create PDF generator

Create scripts/generate-pdf-report.ts:

import { TDocumentDefinitions } from "pdfmake/interfaces";

// Use require for pdfmake as it doesn't have proper ESM exports
const pdfmake = require("pdfmake");

// Use standard PDF fonts (no external files needed)
// Note: In pdfmake 0.3, these are built-in standard fonts
const fonts = {
  Roboto: {
    normal: "Helvetica",
    bold: "Helvetica-Bold",
    italics: "Helvetica-Oblique",
    bolditalics: "Helvetica-BoldOblique",
  },
};

// Register fonts with pdfmake (required for pdfmake 0.3)
pdfmake.addFonts(fonts);

// Report data interface (matches API response)
interface SalesReportRow {
  saleDate: string;
  region: string;
  productCategory: string;
  transactionCount: number;
  totalQuantity: number;
  totalRevenue: number;
}

async function fetchReportData(
  startDate: string,
  endDate: string
): Promise<SalesReportRow[]> {
  const url = `http://localhost:4000/consumption/sales-report?startDate=${startDate}&endDate=${endDate}`;
  const response = await fetch(url);

  if (!response.ok) {
    throw new Error(`API request failed: ${response.statusText}`);
  }

  return response.json();
}

function aggregateByRegion(
  data: SalesReportRow[]
): Map<string, { revenue: number; transactions: number }> {
  const regionMap = new Map<string, { revenue: number; transactions: number }>();

  for (const row of data) {
    const existing = regionMap.get(row.region) || { revenue: 0, transactions: 0 };
    regionMap.set(row.region, {
      revenue: existing.revenue + row.totalRevenue,
      transactions: existing.transactions + row.transactionCount,
    });
  }

  return regionMap;
}

function aggregateByCategory(
  data: SalesReportRow[]
): Map<string, { revenue: number; transactions: number }> {
  const categoryMap = new Map<string, { revenue: number; transactions: number }>();

  for (const row of data) {
    const existing = categoryMap.get(row.productCategory) || {
      revenue: 0,
      transactions: 0,
    };
    categoryMap.set(row.productCategory, {
      revenue: existing.revenue + row.totalRevenue,
      transactions: existing.transactions + row.transactionCount,
    });
  }

  return categoryMap;
}

function formatCurrency(amount: number): string {
  return `$${amount.toLocaleString("en-US", { minimumFractionDigits: 2, maximumFractionDigits: 2 })}`;
}

async function generatePdf(
  data: SalesReportRow[],
  reportDate: string,
  outputPath: string
): Promise<void> {
  // Calculate totals
  const totalRevenue = data.reduce((sum, row) => sum + row.totalRevenue, 0);
  const totalTransactions = data.reduce(
    (sum, row) => sum + row.transactionCount,
    0
  );

  // Aggregate by region and category
  const regionData = aggregateByRegion(data);
  const categoryData = aggregateByCategory(data);

  // Sort regions by revenue descending
  const sortedRegions = Array.from(regionData.entries()).sort(
    (a, b) => b[1].revenue - a[1].revenue
  );

  // Sort categories by revenue descending
  const sortedCategories = Array.from(categoryData.entries()).sort(
    (a, b) => b[1].revenue - a[1].revenue
  );

  const docDefinition: TDocumentDefinitions = {
    content: [
      // Header
      { text: "APEX RETAIL", style: "companyName" },
      { text: "Daily Sales Summary", style: "reportTitle" },
      { text: reportDate, style: "reportDate" },
      {
        canvas: [
          { type: "line", x1: 0, y1: 5, x2: 515, y2: 5, lineWidth: 1 },
        ],
      },

      // Executive Summary
      {
        text: "EXECUTIVE SUMMARY",
        style: "sectionHeader",
        margin: [0, 15, 0, 10],
      },
      {
        columns: [
          {
            text: [
              { text: "Total Revenue: ", bold: true },
              formatCurrency(totalRevenue),
            ],
          },
          {
            text: [
              { text: "Transactions: ", bold: true },
              totalTransactions.toLocaleString(),
            ],
          },
        ],
        margin: [0, 0, 0, 15],
      },

      // Revenue by Region
      {
        text: "REVENUE BY REGION",
        style: "sectionHeader",
        margin: [0, 10, 0, 10],
      },
      {
        table: {
          headerRows: 1,
          widths: ["*", "auto", "auto", "auto"],
          body: [
            [
              { text: "Region", style: "tableHeader" },
              { text: "Revenue", style: "tableHeader" },
              { text: "Transactions", style: "tableHeader" },
              { text: "Share", style: "tableHeader" },
            ],
            ...sortedRegions.map(([region, stats]) => [
              region,
              { text: formatCurrency(stats.revenue), alignment: "right" as const },
              { text: stats.transactions.toLocaleString(), alignment: "right" as const },
              {
                text: `${totalRevenue > 0 ? ((stats.revenue / totalRevenue) * 100).toFixed(1) : "0.0"}%`,
                alignment: "right" as const,
              },
            ]),
          ],
        },
        layout: "lightHorizontalLines",
      },

      // Revenue by Category
      {
        text: "REVENUE BY CATEGORY",
        style: "sectionHeader",
        margin: [0, 20, 0, 10],
      },
      {
        table: {
          headerRows: 1,
          widths: ["*", "auto", "auto", "auto"],
          body: [
            [
              { text: "Category", style: "tableHeader" },
              { text: "Revenue", style: "tableHeader" },
              { text: "Transactions", style: "tableHeader" },
              { text: "Share", style: "tableHeader" },
            ],
            ...sortedCategories.map(([category, stats]) => [
              category,
              { text: formatCurrency(stats.revenue), alignment: "right" as const },
              { text: stats.transactions.toLocaleString(), alignment: "right" as const },
              {
                text: `${totalRevenue > 0 ? ((stats.revenue / totalRevenue) * 100).toFixed(1) : "0.0"}%`,
                alignment: "right" as const,
              },
            ]),
          ],
        },
        layout: "lightHorizontalLines",
      },
    ],

    footer: {
      text: `Generated: ${new Date().toISOString()} | Apex Retail Confidential`,
      style: "footer",
      margin: [40, 0],
    },

    styles: {
      companyName: { fontSize: 24, bold: true, alignment: "center" },
      reportTitle: {
        fontSize: 16,
        alignment: "center",
        margin: [0, 5, 0, 5],
      },
      reportDate: { fontSize: 12, alignment: "center", color: "gray" },
      sectionHeader: { fontSize: 12, bold: true },
      tableHeader: { bold: true, fillColor: "#eeeeee" },
      footer: { fontSize: 8, color: "gray", alignment: "center" },
    },
    defaultStyle: { font: "Roboto", fontSize: 10 },
    pageMargins: [40, 40, 40, 60],
  };

  // Create PDF using pdfmake 0.3 API
  const pdf = pdfmake.createPdf(docDefinition);

  // Use the promise-based write() method (pdfmake 0.3+)
  await pdf.write(outputPath);
  console.log(`PDF report saved to ${outputPath}`);
}

// Main execution
async function main() {
  const startDate = process.argv[2] || "2025-01-01";
  const endDate = process.argv[3] || "2025-12-31";
  const outputPath = process.argv[4] || "apex-retail-report.pdf";

  console.log(`Fetching report data from ${startDate} to ${endDate}...`);

  try {
    const data = await fetchReportData(startDate, endDate);

    if (data.length === 0) {
      console.log("No data found for the specified date range.");
      return;
    }

    console.log(`Generating PDF with ${data.length} data rows...`);
    await generatePdf(data, `${startDate} to ${endDate}`, outputPath);
  } catch (error) {
    console.error("Error generating report:", error);
    process.exit(1);
  }
}

main();

Step 3: Generate report

With moose dev running, first verify the API returns data:

curl "http://localhost:4000/consumption/sales-report?startDate=2025-01-15&endDate=2025-01-15"

If the API returns valid JSON (not an empty []), generate the PDF:

pnpm exec ts-node scripts/generate-pdf-report.ts 2025-01-15 2025-01-15 january-15-report.pdf

Expected output:

Fetching report data from 2025-01-15 to 2025-01-15...
Generating PDF with 5 data rows...
PDF report saved to january-15-report.pdf
Test first

Always test the API with curl before running the PDF generator. If you see "No data found" or the script hangs, verify the API returns data for your date range.

Step 4: Verify

Actually open the PDF file to verify formatting and data accuracy—the console output won't show layout issues. The PDF should contain:

  • Header with company name and date range
  • Executive summary (revenue, transactions)
  • Revenue by region with percentages
  • Revenue by category with percentages
  • Footer with timestamp

The PDF is generated from pre-aggregated data, so it stays fast regardless of raw transaction count.

Approach B: API endpoint

For production use cases, you may want to generate PDFs on-demand via an API endpoint. This approach:

  • Returns PDF binary directly in the HTTP response
  • Allows external systems to request reports via HTTP
  • Supports direct download via curl or browser
Why WebApp instead of Api?

The standard Api class cannot return binary data—it hardcodes Content-Type: application/json and JSON-serializes all responses. The WebApp class provides direct access to the HTTP handler, allowing you to set custom headers and write binary data.

Step 1: Create the PDF API endpoint

Create app/apis/salesReportPdf.ts:

import { WebApp, getMooseUtils } from "@514labs/moose-lib";
import { IncomingMessage, ServerResponse } from "http";
import { parse as parseUrl } from "url";
import { TDocumentDefinitions } from "pdfmake/interfaces";
import { dailySalesMV } from "../views/DailySalesSummary";

// Use require for pdfmake as it doesn't have proper ESM exports
const pdfmake = require("pdfmake");

// Use standard PDF fonts (no external files needed)
const fonts = {
  Roboto: {
    normal: "Helvetica",
    bold: "Helvetica-Bold",
    italics: "Helvetica-Oblique",
    bolditalics: "Helvetica-BoldOblique",
  },
};
pdfmake.addFonts(fonts);

// Report data interface (matches materialized view output)
interface SalesReportRow {
  saleDate: string;
  region: string;
  productCategory: string;
  transactionCount: number;
  totalQuantity: number;
  totalRevenue: number;
}

function formatCurrency(amount: number): string {
  return `$${amount.toLocaleString("en-US", { minimumFractionDigits: 2, maximumFractionDigits: 2 })}`;
}

// Generate PDF buffer using pdfmake 0.3 getBuffer() method
async function generatePdfBuffer(
  data: SalesReportRow[],
  reportDateRange: string
): Promise<Buffer> {
  // Calculate totals
  const totalRevenue = data.reduce((sum, row) => sum + row.totalRevenue, 0);
  const totalTransactions = data.reduce(
    (sum, row) => sum + row.transactionCount,
    0
  );

  // Aggregate by region
  const regionMap = new Map<string, { revenue: number; transactions: number }>();
  for (const row of data) {
    const existing = regionMap.get(row.region) || { revenue: 0, transactions: 0 };
    regionMap.set(row.region, {
      revenue: existing.revenue + row.totalRevenue,
      transactions: existing.transactions + row.transactionCount,
    });
  }
  const sortedRegions = Array.from(regionMap.entries()).sort(
    (a, b) => b[1].revenue - a[1].revenue
  );

  // Aggregate by category
  const categoryMap = new Map<string, { revenue: number; transactions: number }>();
  for (const row of data) {
    const existing = categoryMap.get(row.productCategory) || {
      revenue: 0,
      transactions: 0,
    };
    categoryMap.set(row.productCategory, {
      revenue: existing.revenue + row.totalRevenue,
      transactions: existing.transactions + row.transactionCount,
    });
  }
  const sortedCategories = Array.from(categoryMap.entries()).sort(
    (a, b) => b[1].revenue - a[1].revenue
  );

  const docDefinition: TDocumentDefinitions = {
    content: [
      { text: "APEX RETAIL", style: "companyName" },
      { text: "Daily Sales Summary", style: "reportTitle" },
      { text: reportDateRange, style: "reportDate" },
      {
        canvas: [
          { type: "line", x1: 0, y1: 5, x2: 515, y2: 5, lineWidth: 1 },
        ],
      },
      {
        text: "EXECUTIVE SUMMARY",
        style: "sectionHeader",
        margin: [0, 15, 0, 10],
      },
      {
        columns: [
          {
            text: [
              { text: "Total Revenue: ", bold: true },
              formatCurrency(totalRevenue),
            ],
          },
          {
            text: [
              { text: "Transactions: ", bold: true },
              totalTransactions.toLocaleString(),
            ],
          },
        ],
        margin: [0, 0, 0, 15],
      },
      {
        text: "REVENUE BY REGION",
        style: "sectionHeader",
        margin: [0, 10, 0, 10],
      },
      {
        table: {
          headerRows: 1,
          widths: ["*", "auto", "auto", "auto"],
          body: [
            [
              { text: "Region", style: "tableHeader" },
              { text: "Revenue", style: "tableHeader" },
              { text: "Transactions", style: "tableHeader" },
              { text: "Share", style: "tableHeader" },
            ],
            ...sortedRegions.map(([region, stats]) => [
              region,
              { text: formatCurrency(stats.revenue), alignment: "right" as const },
              { text: stats.transactions.toLocaleString(), alignment: "right" as const },
              {
                text: `${totalRevenue > 0 ? ((stats.revenue / totalRevenue) * 100).toFixed(1) : "0.0"}%`,
                alignment: "right" as const,
              },
            ]),
          ],
        },
        layout: "lightHorizontalLines",
      },
      {
        text: "REVENUE BY CATEGORY",
        style: "sectionHeader",
        margin: [0, 20, 0, 10],
      },
      {
        table: {
          headerRows: 1,
          widths: ["*", "auto", "auto", "auto"],
          body: [
            [
              { text: "Category", style: "tableHeader" },
              { text: "Revenue", style: "tableHeader" },
              { text: "Transactions", style: "tableHeader" },
              { text: "Share", style: "tableHeader" },
            ],
            ...sortedCategories.map(([category, stats]) => [
              category,
              { text: formatCurrency(stats.revenue), alignment: "right" as const },
              { text: stats.transactions.toLocaleString(), alignment: "right" as const },
              {
                text: `${totalRevenue > 0 ? ((stats.revenue / totalRevenue) * 100).toFixed(1) : "0.0"}%`,
                alignment: "right" as const,
              },
            ]),
          ],
        },
        layout: "lightHorizontalLines",
      },
    ],
    footer: {
      text: `Generated: ${new Date().toISOString()} | Apex Retail Confidential`,
      style: "footer",
      margin: [40, 0],
    },
    styles: {
      companyName: { fontSize: 24, bold: true, alignment: "center" },
      reportTitle: {
        fontSize: 16,
        alignment: "center",
        margin: [0, 5, 0, 5],
      },
      reportDate: { fontSize: 12, alignment: "center", color: "gray" },
      sectionHeader: { fontSize: 12, bold: true },
      tableHeader: { bold: true, fillColor: "#eeeeee" },
      footer: { fontSize: 8, color: "gray", alignment: "center" },
    },
    defaultStyle: { font: "Roboto", fontSize: 10 },
    pageMargins: [40, 40, 40, 60],
  };

  const pdf = pdfmake.createPdf(docDefinition);
  // pdfmake 0.3 provides promise-based getBuffer()
  return pdf.getBuffer();
}

const handler = async (req: IncomingMessage, res: ServerResponse) => {
  const url = parseUrl(req.url || "", true);
  const startDate = url.query.startDate as string;
  const endDate = url.query.endDate as string;
  const region = url.query.region as string | undefined;

  // Validate required parameters
  if (!startDate || !endDate) {
    res.writeHead(400, { "Content-Type": "application/json" });
    res.end(JSON.stringify({ error: "startDate and endDate query parameters are required" }));
    return;
  }

  // Validate date format (YYYY-MM-DD)
  const dateRegex = /^\d{4}-\d{2}-\d{2}$/;
  if (!dateRegex.test(startDate) || !dateRegex.test(endDate)) {
    res.writeHead(400, { "Content-Type": "application/json" });
    res.end(JSON.stringify({ error: "Dates must be in YYYY-MM-DD format" }));
    return;
  }

  // Validate date order
  if (startDate > endDate) {
    res.writeHead(400, { "Content-Type": "application/json" });
    res.end(JSON.stringify({ error: "startDate must be before or equal to endDate" }));
    return;
  }

  try {
    // Query ClickHouse directly using Moose utilities
    const { client, sql } = await getMooseUtils();
    const targetTable = dailySalesMV.targetTable;

    // Build query with optional region filter
    let query;
    if (region) {
      query = sql`
        SELECT
          saleDate,
          region,
          productCategory,
          sum(transactionCount) AS transactionCount,
          sum(totalQuantity) AS totalQuantity,
          sum(totalRevenue) AS totalRevenue
        FROM ${targetTable}
        WHERE saleDate BETWEEN ${startDate} AND ${endDate}
          AND region = ${region}
        GROUP BY saleDate, region, productCategory
        ORDER BY saleDate DESC, totalRevenue DESC
      `;
    } else {
      query = sql`
        SELECT
          saleDate,
          region,
          productCategory,
          sum(transactionCount) AS transactionCount,
          sum(totalQuantity) AS totalQuantity,
          sum(totalRevenue) AS totalRevenue
        FROM ${targetTable}
        WHERE saleDate BETWEEN ${startDate} AND ${endDate}
        GROUP BY saleDate, region, productCategory
        ORDER BY saleDate DESC, totalRevenue DESC
      `;
    }

    const result = await client.query.execute<SalesReportRow>(query);
    const data = await result.json() as SalesReportRow[];

    if (data.length === 0) {
      res.writeHead(404, { "Content-Type": "application/json" });
      res.end(JSON.stringify({ error: "No data found for the specified date range" }));
      return;
    }

    // Generate PDF buffer
    const pdfBuffer = await generatePdfBuffer(data, `${startDate} to ${endDate}`);

    // Return binary PDF response with appropriate headers
    res.writeHead(200, {
      "Content-Type": "application/pdf",
      "Content-Disposition": `attachment; filename="sales-report-${startDate}-to-${endDate}.pdf"`,
      "Content-Length": pdfBuffer.length,
    });
    res.end(pdfBuffer);
  } catch (error) {
    console.error("PDF generation error:", error);
    res.writeHead(500, { "Content-Type": "application/json" });
    res.end(JSON.stringify({ error: "Failed to generate PDF" }));
  }
};

// Create WebApp with custom mount path
// Note: Cannot use reserved paths like /consumption or /ingest
export const salesReportPdfApi = new WebApp("salesReportPdf", handler, {
  mountPath: "/reports",
  metadata: {
    description: "Generate sales report as PDF",
  },
});
Mount path restrictions

WebApp mount paths cannot use reserved paths like /consumption, /ingest, or /api. Use a custom path like /reports instead.

Step 2: Update exports

Add the new API to app/index.ts:

export * from "./ingest/SalesTransaction";
export * from "./views/DailySalesSummary";
export * from "./apis/salesReport";
export * from "./apis/salesReportPdf";

Step 3: Test the PDF endpoint

With moose dev running, generate a PDF directly via curl:

curl "http://localhost:4000/reports?startDate=2025-01-15&endDate=2025-01-15" > report.pdf

You can also filter by region:

curl "http://localhost:4000/reports?startDate=2025-01-15&endDate=2025-01-15&region=West" > west-report.pdf

The endpoint returns:

  • 200 OK with PDF binary on success
  • 400 Bad Request if startDate or endDate is missing
  • 404 Not Found if no data exists for the date range
  • 500 Internal Server Error on unexpected failures

When to use each approach

With both approaches implemented, you have flexibility for different scenarios:

Approach | Best For
Local script (Approach A) | Batch generation, scheduled jobs, local development, one-off reports
API endpoint (Approach B) | On-demand generation, external consumers, integration with other systems, user-triggered downloads

Both approaches use the same pdfmake library and produce identical PDF output. The key differences are:

  • Local script fetches data from the JSON API, then generates a PDF file on disk
  • API endpoint queries ClickHouse directly and returns the PDF binary in the HTTP response

Having both gives you the best of both worlds: scripts for automation and batch processing, plus an API for real-time requests.


Part 9: How it works

Why ClickHouse is fast

ClickHouse is columnar. It stores data differently from row-oriented databases like PostgreSQL.

Row storage (PostgreSQL):

Record 1: [txn_id=1, timestamp=..., store=W001, region=West, amount=599.98]
Record 2: [txn_id=2, timestamp=..., store=W001, region=West, amount=49.99]
Record 3: [txn_id=3, timestamp=..., store=W002, region=East, amount=89.99]

For SELECT SUM(amount) FROM sales WHERE region='West':

  1. Read every row from disk
  2. Check region
  3. Add matching amounts
  4. Repeat millions of times

You read 100% of data to use 2% of it.

Column storage (ClickHouse):

txn_ids:    [1, 2, 3, ...]
timestamps: [2025-01-15T09:00, 2025-01-15T09:30, ...]
stores:     [W001, W001, W002, ...]
regions:    [West, West, East, ...]
amounts:    [599.98, 49.99, 89.99, ...]

Same query in ClickHouse:

  1. Read only region column
  2. Read only amount column
  3. Scan both in parallel
  4. Skip everything else

Read 2% of data, use 100% of it. 10-100x faster.
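The columnar idea can be sketched in a few lines of TypeScript. This is a toy model, not ClickHouse internals: each field lives in its own array, and a filtered sum touches only the two columns it needs.

```typescript
// Toy column store: each field is its own array (illustrative only).
const regions = ["West", "West", "East"];
const amounts = [599.98, 49.99, 89.99];

// SELECT sum(amount) WHERE region = 'West' reads only these two columns;
// txn_ids, timestamps, and stores are never touched.
function sumWhere(region: string): number {
  let total = 0;
  for (let i = 0; i < regions.length; i++) {
    if (regions[i] === region) total += amounts[i];
  }
  return Math.round(total * 100) / 100; // round away float noise
}

console.log(sumWhere("West")); // 649.97
```

In a real columnar engine the same shape applies at disk level: the region and amount columns are read sequentially and in parallel, and every other column stays untouched.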

Why materialized views beat batch jobs

Traditional: scheduled job (cron, Airflow) aggregates data hourly/daily.

Raw Data → [Wait for schedule] → Batch Job → Aggregated Table

Problems: data goes stale between runs, batch job rescans everything, failures leave gaps until next run, scaling means bigger windows or incremental logic.

ClickHouse MVs:

Raw Data → [INSERT triggers MV] → Aggregated Table (immediate)
  1. INSERT into sales
  2. MV's SELECT runs on new rows immediately
  3. Results go into DailySalesSummary
  4. Synchronous: MV updated by the time INSERT returns

Data stays current, no separate batch system to manage, scales with write volume, and a failed INSERT doesn't leave the MV in a bad state.

Batch aggregation

MVs process batches, not global data. Multiple inserts for the same day/region/category create multiple rows.

Two inserts for West/Electronics:

Batch 1 (morning): 50 transactions → MV inserts [saleDate=2025-01-15, region=West, category=Electronics, count=50, revenue=5000]
Batch 2 (evening): 30 transactions → MV inserts [saleDate=2025-01-15, region=West, category=Electronics, count=30, revenue=3000]

Two rows for the same day/region/category. This is why GROUP BY is required:

-- Wrong: May return multiple rows per day/region
SELECT * FROM DailySalesSummary WHERE saleDate = '2025-01-15'

-- Correct: Consolidates all batches into final aggregates
SELECT saleDate, region, sum(transactionCount), sum(totalRevenue)
FROM DailySalesSummary
GROUP BY saleDate, region

SummingMergeTree auto-combines during background merges, but GROUP BY is still safe practice since merges are async.
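The consolidation that GROUP BY performs can be sketched in application code. This hypothetical helper mirrors the Map-based aggregation pattern used elsewhere in this guide: partial per-batch rows are summed by key into one row per (saleDate, region).

```typescript
interface SummaryRow {
  saleDate: string;
  region: string;
  transactionCount: number;
  totalRevenue: number;
}

// Collapse per-batch partial aggregates into one row per (saleDate, region),
// like `GROUP BY saleDate, region` with sum() does in ClickHouse.
function consolidate(rows: SummaryRow[]): SummaryRow[] {
  const byKey = new Map<string, SummaryRow>();
  for (const row of rows) {
    const key = `${row.saleDate}|${row.region}`;
    const existing = byKey.get(key);
    if (existing) {
      existing.transactionCount += row.transactionCount;
      existing.totalRevenue += row.totalRevenue;
    } else {
      byKey.set(key, { ...row });
    }
  }
  return Array.from(byKey.values());
}

// The two batches from the example above collapse into a single row:
const merged = consolidate([
  { saleDate: "2025-01-15", region: "West", transactionCount: 50, totalRevenue: 5000 },
  { saleDate: "2025-01-15", region: "West", transactionCount: 30, totalRevenue: 3000 },
]);
console.log(merged[0].transactionCount, merged[0].totalRevenue); // 80 8000
```

Doing this in SQL via GROUP BY is cheaper and simpler, which is why the guide recommends it, but the logic is the same.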

When to use what

Strategy | When | Example
Raw queries | Ad-hoc, <1M rows | Debugging
Materialized view | Repeated queries, predictable patterns | Dashboards
Pre-aggregated API | External consumers | PDF reports

If you run the same aggregation twice, make an MV.

Time-series patterns

Order by time first:

orderByFields: ["saleTimestamp", "transactionId"]  // Good
orderByFields: ["transactionId", "saleTimestamp"]  // Bad for time queries

Time-first ORDER BY means date-range queries skip irrelevant blocks.
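Why sorting by time helps can be shown with a toy model: in a time-sorted array, a date-range query locates its bounds by binary search instead of scanning every entry. ClickHouse's sparse primary index skips whole block granules rather than individual rows, but the principle is the same.

```typescript
// Sorted-by-time data (stand-in for time-first ORDER BY).
const timestamps = [1, 3, 5, 7, 9, 11, 13, 15];

// Index of the first element >= target (classic lower-bound binary search).
function lowerBound(arr: number[], target: number): number {
  let lo = 0, hi = arr.length;
  while (lo < hi) {
    const mid = (lo + hi) >> 1;
    if (arr[mid] < target) lo = mid + 1;
    else hi = mid;
  }
  return lo;
}

// Entries with 5 <= t < 12 are found without touching the rest:
const start = lowerBound(timestamps, 5);
const end = lowerBound(timestamps, 12);
console.log(timestamps.slice(start, end)); // [5, 7, 9, 11]
```

With the bad ordering (transactionId first), timestamps are effectively unsorted and every entry must be checked.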

Time granularity:

Granularity | Function | Rows/year | Use
Hourly | toStartOfHour() | ~8,760 | Intraday
Daily | toStartOfDay() | ~365 | Standard
Weekly | toStartOfWeek() | ~52 | Trends
Monthly | toStartOfMonth() | ~12 | Executive

Finer = more rows = slower but more detail.

Multiple levels:

Raw transactions → Hourly MV → Daily MV → Monthly MV

Each level 10-30x smaller than the previous.
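The granularity functions in the table above can be approximated in TypeScript to make the trade-off concrete. This sketch assumes UTC timestamps; ClickHouse's toStartOf* functions are timezone-aware, so treat this as an illustration only.

```typescript
// Truncate an ISO timestamp to a given granularity (UTC only — a rough
// stand-in for ClickHouse's toStartOfHour/toStartOfDay/toStartOfMonth).
function truncate(iso: string, granularity: "hour" | "day" | "month"): string {
  const d = new Date(iso);
  if (granularity === "hour") d.setUTCMinutes(0, 0, 0);
  if (granularity === "day") d.setUTCHours(0, 0, 0, 0);
  if (granularity === "month") {
    d.setUTCDate(1);
    d.setUTCHours(0, 0, 0, 0);
  }
  return d.toISOString();
}

console.log(truncate("2025-01-15T09:37:12Z", "hour"));  // 2025-01-15T09:00:00.000Z
console.log(truncate("2025-01-15T09:37:12Z", "day"));   // 2025-01-15T00:00:00.000Z
console.log(truncate("2025-01-15T09:37:12Z", "month")); // 2025-01-01T00:00:00.000Z
```

Every timestamp that truncates to the same bucket collapses into one aggregate row, which is why coarser granularities produce so many fewer rows.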

How Moose helps

The patterns above—columnar storage, materialized views, time-first ordering—are what make ClickHouse fast. But implementing them correctly requires expertise: writing DDL, managing schemas, configuring streams, and keeping everything in sync.

Moose handles this complexity so you can focus on your data model:

ClickHouse Best Practice | Without Moose | With Moose
Columnar schema design | Hand-write DDL, manage migrations | Define TypeScript interface, schema auto-generated
Time-first ordering | Configure ORDER BY manually | Set orderByFields in pipeline config
Materialized views | Write CREATE MATERIALIZED VIEW DDL | Declare MaterializedView with typed query
Streaming ingestion | Set up Kafka/Redpanda, write consumers | Auto-configured with IngestPipeline
Type-safe APIs | Build REST endpoints manually | Api class with typed params and responses

This is what we covered in Why Moose for static reports?—a developer-first approach that gives you ClickHouse performance without ClickHouse complexity.


Wrap-up

What you built

CSV/JSON Files → Ingest API → ClickHouse Table → Materialized View → API → PDF

Components

  • SalesTransaction model: type-safe schema that creates the ClickHouse table
  • File loader: parses CSV/JSON, sends to the ingest API
  • DailySalesSummary MV: pre-aggregates at insert time
  • sales-report API: date-range queries with JSON response
  • PDF export (two options): local script, or API endpoint returning binary PDF

Before vs after

Before: Analyst downloads 12 CSVs, copies to Excel, writes formulas, formats Word doc, emails PDF. 2+ hours daily. Breaks when analyst is out.

After: POS uploads to API, ClickHouse aggregates, API serves any date range, PDF generates in under a second. Runs unattended. Git tracks every change.

In this scenario, that's around 500 hours of manual Excel work eliminated (2 hours/day × 250 days). Your actual savings depend on your current process. The code is deterministic, so the same inputs produce the same outputs. And since it's just querying a database, adding more stores doesn't change how long reports take to generate.

Performance

Operation | Row DB | Moose + ClickHouse
Transaction scan | Seconds-minutes | Milliseconds
Daily aggregation | 10-30+ sec | <100ms
Report generation | Minutes | <1 sec

Next steps

1. Add dimension tables: customer, product, and store reference data
2. Create additional MVs: hourly, weekly, and monthly aggregations
3. Add Excel export: use ExcelJS for spreadsheet output
4. Schedule delivery: automate with Temporal workflows
5. Deploy to production: use Fiveonefour hosting


Reference

Sample queries

Daily revenue by region:

SELECT
  saleDate,
  region,
  sum(totalRevenue) AS revenue,
  sum(transactionCount) AS transactions
FROM local.DailySalesSummary
GROUP BY saleDate, region
ORDER BY saleDate DESC, revenue DESC;

Hourly pattern (raw data):

SELECT
  toHour(saleTimestamp) AS hour_of_day,
  count() AS transactions,
  sum(totalAmount) AS revenue
FROM local.sales
GROUP BY hour_of_day
ORDER BY hour_of_day;

Top categories:

SELECT
  productCategory,
  sum(totalRevenue) AS revenue,
  sum(transactionCount) AS transactions
FROM local.DailySalesSummary
WHERE saleDate BETWEEN '2025-01-01' AND '2025-01-31'
GROUP BY productCategory
ORDER BY revenue DESC
LIMIT 5;

Project structure

  • app/
    • index.ts
    • ingest/
      • SalesTransaction.ts
    • views/
      • DailySalesSummary.ts
    • apis/
      • salesReport.ts
      • salesReportPdf.ts
  • scripts/
    • load-file.ts
    • generate-pdf-report.ts
  • sample-data/
    • west-store-001.csv
  • moose.config.toml
  • package.json
  • tsconfig.json
File | Purpose
app/index.ts | Export all modules
app/ingest/SalesTransaction.ts | IngestPipeline definition
app/views/DailySalesSummary.ts | MaterializedView
app/apis/salesReport.ts | Consumption API (JSON)
app/apis/salesReportPdf.ts | PDF API (binary response)
scripts/load-file.ts | CSV/JSON file loader
scripts/generate-pdf-report.ts | PDF generator (local script)
sample-data/west-store-001.csv | Sample transactions

Configuration

moose.config.toml:

[project]
name = "sales-reports"
language = "typescript"

[features]
olap = true
streaming = true

[clickhouse_config]
db_name = "local"
host = "localhost"
host_port = 18123
native_port = 9000
user = "panda"
password = "pandapass"
use_ssl = false

Troubleshooting

Setup

EUNSUPPORTEDPROTOCOL: Use pnpm install, not npm install.

pnpm build warnings: You'll see "Ignored build scripts" warnings—these are safe to ignore. They're security warnings about packages that want to run build scripts.

Supply the type param T: Add type parameter: new IngestPipeline<SalesTransaction>(...).

Primary key must be prefix of sorting key: First field in orderByFields becomes primary key.

Port 4000, 9000, or 18123 in use:

# Find what's using the port
lsof -i :4000
lsof -i :9000
lsof -i :18123
 
# Kill the process if needed
kill -9 <PID>

Stop other Moose projects before starting a new one. Port 9000 is the ClickHouse native port and commonly conflicts with other Moose instances.

ClickHouse connection refused: Wait 20-30 seconds for containers (longer on first startup). Verify:

curl http://localhost:18123/ping

Check Docker:

docker ps | grep clickhouse

Docker not starting: Run docker info. On Mac/Windows, check Docker Desktop is running (menu bar icon). If Docker stops mid-tutorial, all services fail with "connection refused" errors.

Model not detected: Check file saved, export in app/index.ts, TypeScript errors in terminal. Type errors block the dev server from processing changes.

TypeScript errors blocking startup: The moose dev terminal shows compilation errors. Fix all errors before proceeding—the system won't process new models or APIs until code compiles.

TypeScript errors persist after fixing code: If errors continue to appear with stale line numbers after you've fixed the code, restart moose dev. The TypeScript compiler may cache compiled code, and a full restart clears the cache.

Data

CSV date parsing fails: Use ISO 8601 format (YYYY-MM-DDTHH:MM:SSZ).

Wrong Format | Correct Format
01/15/2025 (US format) | 2025-01-15T09:00:00Z
15-01-2025 (EU format) | 2025-01-15T09:00:00Z
2025-1-15 (no leading zeros) | 2025-01-15T09:00:00Z
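If your source files arrive with US-style dates, a small normalizer can convert them before ingestion. This is a hypothetical helper, not part of the project above; adapt the regex to your actual source format.

```typescript
// Convert "MM/DD/YYYY" or "MM/DD/YYYY HH:MM" (US format) to ISO 8601 UTC.
// Hypothetical helper — validate against your real source data before use.
function usDateToIso(input: string): string {
  const match = input.match(/^(\d{2})\/(\d{2})\/(\d{4})(?:\s+(\d{2}):(\d{2}))?$/);
  if (!match) throw new Error(`Unrecognized date: ${input}`);
  const [, month, day, year, hour = "00", minute = "00"] = match;
  return `${year}-${month}-${day}T${hour}:${minute}:00Z`;
}

console.log(usDateToIso("01/15/2025 09:00")); // 2025-01-15T09:00:00Z
console.log(usDateToIso("01/15/2025"));       // 2025-01-15T00:00:00Z
```

Throwing on unrecognized input is deliberate: a bad date should fail loudly at load time rather than become a silently wrong row in ClickHouse.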

Ingestion error: Verify JSON matches interface:

# Test with minimal valid payload
curl -X POST http://localhost:4000/ingest/sales \
  -H "Content-Type: application/json" \
  -d '{"transactionId":"TEST-001","saleTimestamp":"2025-01-15T10:00:00Z","storeId":"STORE-001","region":"West","productId":"PROD-001","productCategory":"Electronics","customerId":"","quantity":1,"unitPrice":10.00,"totalAmount":10.00}'

Type mismatches: quantity must be number not "1", totalAmount must be number not "10.00".
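Type mismatches usually come from CSV parsing, since CSV parsers return every field as a string. A sketch of explicit coercion before POSTing (field names match the ingestion payload shown above; the helper itself is hypothetical):

```typescript
// Raw CSV rows arrive with every field as a string.
interface RawCsvRow {
  transactionId: string;
  quantity: string;    // "1"
  unitPrice: string;   // "10.00"
  totalAmount: string; // "10.00"
}

// Coerce numeric fields so the payload matches the ingest schema,
// and reject rows that don't parse rather than sending NaN.
function toPayload(row: RawCsvRow) {
  const quantity = Number(row.quantity);
  const unitPrice = Number(row.unitPrice);
  const totalAmount = Number(row.totalAmount);
  if ([quantity, unitPrice, totalAmount].some(Number.isNaN)) {
    throw new Error(`Non-numeric field in row ${row.transactionId}`);
  }
  return { ...row, quantity, unitPrice, totalAmount };
}

const payload = toPayload({
  transactionId: "TEST-001",
  quantity: "1",
  unitPrice: "10.00",
  totalAmount: "10.00",
});
console.log(typeof payload.quantity); // number
```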

Duplicate IDs: ClickHouse doesn't enforce uniqueness. Check:

SELECT transactionId, count() AS cnt
FROM local.sales
GROUP BY transactionId
HAVING cnt > 1

Deduplicate with GROUP BY or subquery. FINAL only works with ReplacingMergeTree/CollapsingMergeTree.

Queries

MV returns unexpected results: MVs aggregate per-batch. Use GROUP BY:

-- Wrong (may show partial aggregates)
SELECT * FROM local.DailySalesSummary

-- Correct (consolidates partial aggregates)
SELECT
  saleDate,
  region,
  sum(transactionCount) AS transactionCount,
  sum(totalRevenue) AS totalRevenue
FROM local.DailySalesSummary
GROUP BY saleDate, region

API returns empty: Check date format (YYYY-MM-DD not MM-DD-YYYY). Verify data exists for the range.

Query slow: Check you're querying the MV not raw table. Check date filter uses indexed column. Run EXPLAIN.

PDF

Not generating: Check API accessible, fetch errors in output, pdfmake installed. PDF generation should complete in under 2 seconds—if it hangs, the API is probably not responding.

Empty/malformed: Check API response has data, date range has transactions.

Test the API first:

curl "http://localhost:4000/consumption/sales-report?startDate=2025-01-15&endDate=2025-01-15"

If this returns empty [], verify data was ingested and the date range is correct.

Cleanup

Stopping the dev server: Press Ctrl+C in the moose dev terminal. Docker containers stop automatically.

Starting over completely: Stop moose dev, delete the project directory, and run moose init again.

Reset data but keep code: Stop moose dev, then remove the project's Docker volumes. Caution: docker volume prune deletes all unused volumes on your machine—not just this project's. Use docker volume ls to list volumes, identify yours (typically containing clickhouse or redpanda), and remove them with docker volume rm <volume_name>.

Data persists across restarts: If you stop and restart moose dev, your ingested data is still there—it's stored in Docker volumes.

Resources

  • Moose docs
  • OLAP guide
  • Materialized views
  • GitHub
  • Quickstart
