Welcome to Aurora
bash -i <(curl -fsSL https://fiveonefour.com/install.sh) aurora,moose
What is Aurora?
Aurora is a set of tools that make your chat, copilot, or BI tool fluent in data engineering. Use the CLI to set up MCP servers with the tools you need in the clients you use. Create new data engineering projects with Moose-managed, ClickHouse-based infrastructure, or use these agents with your existing data infrastructure.
Quickstart Guides
Core features
Aurora offers a suite of MCP tools and agents for data engineering workflows: standing up infrastructure, building and testing pipelines, and managing deployments.
CLI based deployment
- Three steps in the CLI to chat with ClickHouse (sketched below)
- Five minutes to build a full-stack OLAP project
- Template-based new projects for building your own infrastructure
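A minimal sketch of that three-step flow. The install command is the real one from the top of this page; the aurora setup invocation is illustrative only, so check the CLI reference for the exact subcommand and flags:

# 1. Install the Aurora (and optionally Moose) CLIs
bash -i <(curl -fsSL https://fiveonefour.com/install.sh) aurora,moose

# 2. Hypothetical syntax: register Aurora's MCP server, with your ClickHouse
#    credentials, in the chat client or IDE you use
aurora setup --connection clickhouse --client claude-desktop

# 3. Open your client and start chatting with your tables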
Client agnostic
- Data engineering in your IDE
- Data analytics and analytics engineering in your chat client
- BI in your BI tool of choice
Infrastructure agnostic, opinionated stack available
- Opinionated OLAP deployments with Moose: optimized ClickHouse development and deployment (see the sketch after this list)
- Direct integration with your existing architecture: DuckDB, Snowflake, Databricks
- Integration with your enterprise: metadata, CI/CD, logging, and more
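For the opinionated path, a minimal sketch of standing up a Moose-managed ClickHouse project locally, assuming the Moose CLI's init and dev commands and an arbitrary project name (verify the exact arguments against the Moose docs):

# Scaffold a new Moose project (TypeScript template; name is arbitrary)
moose init my-olap-app typescript
cd my-olap-app && npm install

# Start the local dev stack: Moose provisions and manages ClickHouse for you,
# reloading your data models as you edit them
moose dev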
Context-aware data engineering agents
- Full-stack context: code, logs, data, docs
- Self-improving feedback loops
- Embedded metadata for continuity
Enterprise ready by default
- Each agent follows a context gathering → implementation → testing → documentation workflow
- Easily configurable governance defaults: SDLC, data quality, reporting, privacy, and security practices
- Learns your policies with minimal context, enforces them automatically
Why Aurora exists
LLMs don't have the tools or context they need to be good data engineers, or to make good data engineering cyborgs.
LLM tools are hard to set up, and usually small in scope
It takes me minutes to set up an MCP server, and I have to hold my breath every time I want to use it.
Shallow context makes bad data
If I'm just looking at the data, or just the code, or just the logs, or just the docs, I'm not going to make good decisions. I need all of that context.
I don't want to start from scratch to use AI
I've already got an entire data infrastructure, why should I start from scratch to use AI?
Brittle data engineering agents make for annoying coworkers
I want my agents to be easy to prompt, to be able to test their work, and to be able to fit within my SDLC.
The DIY Approach
How would I prompt my LLM to create a new egress API on top of ClickHouse? Roughly like this (steps 3, 5, 7, and 9 are sketched in the shell session after the list):
1. Use an LLM to create a SQL query representing the data I want to expose.
2. Iterate on that until I'm happy with the query and ready to create the egress API.
3. Write a SQL query to get the DDL for that database.
4. Open my IDE to wherever my egress API code lives, and feed its LLM the DDL as context.
5. Manually add sample data as further context.
6. Manually add code examples as further context.
7. Replicate my ClickHouse database locally.
8. Prompt chat to create the egress API.
9. Manually test the egress API.
10. Deploy the egress API.
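Done by hand, steps 3, 5, 7, and 9 look something like this, assuming a source table mydb.events and an egress API served locally on port 8000 (both are placeholders):

# Step 3: pull the DDL to paste into the IDE's chat as context
clickhouse client --query "SHOW CREATE TABLE mydb.events"

# Step 5: pull a few sample rows as further context
clickhouse client --query "SELECT * FROM mydb.events LIMIT 5 FORMAT JSONEachRow"

# Step 7: replicate ClickHouse locally to develop against
docker run -d --name ch-local -p 8123:8123 -p 9000:9000 clickhouse/clickhouse-server

# Step 9: manually test the generated egress API
curl "http://localhost:8000/api/events?limit=10"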
The Aurora Approach
How would I use Aurora to create a new egress API on top of ClickHouse? Four steps (sketched below):
1. Install Aurora and Moose.
2. Create a project from my data in ClickHouse.
3. Prompt Aurora to create the egress API from the given business requirements. The necessary context is gathered automatically, and Aurora uses it to build and test the API.
4. Deploy.
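As a sketch of the same flow: the install command is the real one from the top of this page, while the aurora subcommand and flag shown here are placeholders for whatever the CLI reference specifies:

# 1. Install Aurora and Moose
bash -i <(curl -fsSL https://fiveonefour.com/install.sh) aurora,moose

# 2. Hypothetical syntax: scaffold a Moose project from the tables already in ClickHouse
aurora init my-egress-project --from-clickhouse

# 3. In your chat client or IDE, prompt Aurora with the business requirements;
#    it gathers DDL, sample data, and code examples itself, then builds and tests the API

# 4. Deploy the resulting Moose project as you would any other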
What jobs can you do with Aurora?
Ad hoc analytics
Give your LLM client a way to chat with your data in ClickHouse, Databricks, Snowflake, or wherever it lives
Analytics Engineering
Agents that can build new data products, materialized views and egress methods
Data Engineering
Have agents build and test end-to-end data pipelines
Data Wrangling
Agents that interact with your data systems (DuckDB, Databricks, Snowflake, and more) to create scripts, clean data, and prepare it for use
Data Migration
Automated creation of data pipelines to migrate data from legacy systems to a modern data backend
Data quality, governance and reporting
Agents that can help you enforce data quality, governance, and reporting during development and at runtime