
Handling Failed Migrations

One of the main benefits of the Moose local development environment is that you can detect breaking schema changes before they reach production. This is especially useful for catching incompatible data type changes: you change a column’s data type, and the generated migration cannot cast the existing data to the new type.

This page describes how to recover from a failed migration in dev and gives a playbook for safely achieving the desired type change.

What happened

You changed a column’s data type on a table that already has data. The dev migration attempted an in-place ALTER, and ClickHouse created a mutation that failed (incompatible cast, nullability constraint, missing default, etc.).

Symptoms:

  • Failed migration in dev
  • A stuck mutation on the table
  • Reverting the type in your code alone doesn’t help until the mutation is cleared

Quick recovery (dev)

Follow these steps to get unblocked quickly.

View the terminal logs to see the failing mutation

In your terminal, you should see a message like this:

⢹ Processing Infrastructure changes from file watcher
  ~ Table events:
      Column changes:
        ~ value: String -> Float64
  Applying: ALTER TABLE events MODIFY COLUMN value Float64
  ClickHouse mutation created: mutation_id='00000001-0000-4000-8000-000000000123'
  Error: Code: 368. Conversion failed: cannot parse 'abc' as Float64 (column: value)
  Status: mutation failed; table may be partially transformed

Copy the mutation ID from the terminal logs; you’ll use it in the next step to kill the mutation.

Kill the mutation

  • If you have the mutation_id:
KILL MUTATION WHERE mutation_id = '<id>';
  • If you didn’t capture the ID, find it and kill by table:
SELECT mutation_id, command, is_done, latest_fail_reason
FROM system.mutations
WHERE database = currentDatabase() AND table = '<your_table>'
ORDER BY create_time DESC;
 
KILL MUTATION WHERE database = currentDatabase() AND table = '<your_table>';

ClickHouse ALTERs are implemented as asynchronous mutations and are not transactional. If a mutation fails midway, some parts may have been rewritten while others were not, leaving the table partially transformed. The failed mutation also remains queued until you kill it. Clear the mutation first, then proceed.
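After the kill, you can confirm that nothing is left in the queue; any row returned here is a mutation still pending against the table (substitute your table name):

```sql
-- Pending mutations only; after a successful KILL MUTATION this returns no rows.
SELECT mutation_id, command, latest_fail_reason
FROM system.mutations
WHERE database = currentDatabase()
  AND table = '<your_table>'
  AND is_done = 0;
```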

Coming Soon: Automatic mutation killing

Soon, Moose will automatically generate a local DDL plan that kills the failed mutation and rolls back the partial transformation applied to the data before the failure occurred.

Revert your code to match the current DB schema

  • Change the column type in code back to the previous (working) type
  • Save your changes and let moose dev resync; you should then be able to query the table again
MooseTip:

If the table only has disposable dev data, you can also TRUNCATE TABLE <db>.<table> or drop/recreate the table and let moose dev rebuild it. Only do this in dev.

Safely achieving the desired type change

Instead of editing the column type in place, you can add a new column with the target type and backfill the data. This is the recommended approach.

Add a new column + backfill

app/tables/events.ts
enum StatusEnum {
  "active" = "active",
  "inactive" = "inactive",
}
 
interface Event {
  id: Key<string>;
  name: string;
  createdAt: Date;
  status: string;
  status_v2: StatusEnum; // New field - will trigger migration
}

Then, generate a plan to add the new column and backfill the data.

moose generate migration --url <YOUR_PRODUCTION_DATABASE_URL> --save --token <MOOSE_ADMIN_API_KEY>

Open the generated /migrations/plan.yaml file. You’ll see the AddTableColumn operation to add the new column. Right after it, you can add a RawSql operation to backfill the data. Here you can write an ALTER TABLE statement to update the new column with the data from the old column:

migrations/plan.yaml
- AddTableColumn:
    table: "events"
    column:
      name: "status_v2"
      data_type:
        Nullable:
          nullable: "StatusEnum"
      default: null
- RawSql:
    sql:
      - "ALTER TABLE events UPDATE status_v2 = if(status IN ('active', 'inactive'), status, NULL) WHERE status_v2 IS NULL"
    description: "Backfill status_v2 from status"

Then, when writing to the table, double write to both columns.

This lets all surrounding processes and applications that rely on the old column keep working; you can then deprecate the old column and rename the new one when you are ready.
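The double-write can live in whatever code builds rows for insertion. A minimal TypeScript sketch, assuming the status/status_v2 columns from the example above (the helper name and types are illustrative, not a Moose API):

```typescript
// Statuses the new typed column accepts.
type Status = "active" | "inactive";

interface IncomingEvent {
  id: string;
  name: string;
  createdAt: Date;
  status: string; // legacy free-form column
}

// Row shape during the double-write window: legacy column plus typed replacement.
interface EventRow extends IncomingEvent {
  status_v2: Status | null;
}

// Populate both columns from the same source value. Values that don't fit the
// enum become NULL instead of failing the insert, mirroring the backfill.
function toDoubleWriteRow(event: IncomingEvent): EventRow {
  const status_v2 =
    event.status === "active" || event.status === "inactive"
      ? (event.status as Status)
      : null;
  return { ...event, status_v2 };
}
```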

Later, deprecate the old column and rename the new column

Once the backfill is complete and you are ready to deprecate the old column, rename the new column to the old column’s name, applying the change in a new, subsequent PR.

In your code, you can rename the column and deprecate the old column:

app/tables/events.ts
interface Event {
  id: Key<string>;
  name: string;
  createdAt: Date;
  status_old: string; // rename status to status_old
  status: StatusEnum; // rename status_v2 to status
}

Initially you’ll see two DeleteTableColumn operations, followed by two AddTableColumn operations.

IMPORTANT: Replace all four generated DeleteTableColumn and AddTableColumn operations with the following:

migrations/plan.yaml
- RenameTableColumn:
    table: "events"
    before_column_name: "status"
    after_column_name: "status_old"
- RenameTableColumn:
    table: "events"
    before_column_name: "status_v2"
    after_column_name: "status"

Dropping the old column

Once the old column is no longer needed, you can drop it in a third PR.

migrations/plan.yaml
- DropTableColumn:
    table: "events"
    column_name: "status_old"

Common breaking cases

  • String -> Int/Float: can fail on non-numeric rows; prefer toInt64OrNull(...)/toFloat64OrNull(...) + backfill
  • Nullable(T) -> T (NOT NULL): fails if any NULLs exist and no default is provided; backfill then drop nullability
  • Narrowing types (e.g., Int64 -> Int32): fails if values overflow; validate and transform first
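Before attempting any of these in place, you can count the rows that would fail the cast; if every query returns 0, the change is at least data-compatible. A sketch (the events table and value column are illustrative):

```sql
-- String -> Float64: non-empty values that do not parse as a number
SELECT count() FROM events WHERE value != '' AND toFloat64OrNull(value) IS NULL;

-- Nullable(T) -> T: rows that would violate NOT NULL
SELECT count() FROM events WHERE value IS NULL;

-- Int64 -> Int32: values that would overflow the narrower type
SELECT count() FROM events WHERE value NOT BETWEEN -2147483648 AND 2147483647;
```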

Related

Read about migration planning and how to use it to safely manage schema changes in production.