Data Engineering
A complete stack for building data pipelines: ETL, data warehousing, orchestration, and data quality.
Who This Bundle Is For
Data engineers and analysts building data processing pipelines.
What's Included
MCP Servers
PostgreSQL — OLTP database. Transactions, data source.
ClickHouse — OLAP database for analytics. Fast aggregations on large datasets.
SQLite — lightweight database for local development and testing.
Airflow — pipeline orchestration. DAGs, scheduling, monitoring.
Skills
Airflow DAG Builder — create DAGs for task orchestration.
Change Data Capture — capture changes from sources.
BigQuery Partitioning — optimize table partitioning.
Agents
Data Engineer — build reliable data pipelines.
Database Optimizer — optimize queries and schemas.
Analytics Reporter — create analytical reports.
How to Use
- Define your data sources
- Create a DAG for ETL processes
- Set up CDC for incremental loading
- Optimize queries with Database Optimizer
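To make the CDC step above concrete, here is a minimal sketch of applying captured changes to a target. The event shape (`op`, `id`, `row`) and the `apply_changes` helper are assumptions made for illustration; they are not the format produced by the Change Data Capture skill or by any particular CDC tool.

```python
def apply_changes(target, events):
    """Apply ordered change events to `target`, a dict keyed by primary key."""
    for event in events:
        op, key = event["op"], event["id"]
        if op in ("insert", "update"):
            # Inserts and updates both carry the full new row (assumed).
            target[key] = event["row"]
        elif op == "delete":
            # Deleting a missing key is a no-op, so replays stay safe.
            target.pop(key, None)
        else:
            raise ValueError(f"unknown op: {op!r}")
    return target


# Hypothetical change stream for an `orders` table.
events = [
    {"op": "insert", "id": 1, "row": {"status": "new"}},
    {"op": "insert", "id": 2, "row": {"status": "new"}},
    {"op": "update", "id": 1, "row": {"status": "paid"}},
    {"op": "delete", "id": 2},
]
replica = apply_changes({}, events)
print(replica)
```

Events must be applied in commit order. Because each event overwrites or removes a row by key, replaying the same batch after a failure leaves the target unchanged.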
Example Prompt
Create an Airflow DAG for an ETL pipeline:
- Source: PostgreSQL (orders, products, users)
- Sink: ClickHouse (data warehouse)
- Schedule: every hour
- Logic: incremental loading by updated_at
- Alerts: Slack on errors
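The "incremental loading by updated_at" line in the prompt is usually implemented as a watermark: each run copies only rows whose `updated_at` is newer than the last value seen. The sketch below shows the idea, using in-memory SQLite on both sides so it runs anywhere. In the pipeline described here the source would be PostgreSQL, the sink ClickHouse, and the function would run inside an hourly Airflow task. The `incremental_load` helper and the `orders` columns are hypothetical.

```python
import sqlite3


def incremental_load(source, sink, table, watermark):
    """Copy rows changed after `watermark`; return the new watermark."""
    # `table` is interpolated directly, so it must come from trusted config.
    rows = source.execute(
        f"SELECT id, amount, updated_at FROM {table} "
        "WHERE updated_at > ? ORDER BY updated_at",
        (watermark,),
    ).fetchall()
    # Replace by primary key so a re-run does not duplicate rows.
    sink.executemany(
        f"INSERT OR REPLACE INTO {table} (id, amount, updated_at) "
        "VALUES (?, ?, ?)",
        rows,
    )
    sink.commit()
    return rows[-1][2] if rows else watermark


source = sqlite3.connect(":memory:")
sink = sqlite3.connect(":memory:")
for db in (source, sink):
    db.execute(
        "CREATE TABLE orders "
        "(id INTEGER PRIMARY KEY, amount REAL, updated_at TEXT)"
    )
source.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, 10.0, "2024-01-01T00:00:00"), (2, 25.5, "2024-01-01T01:00:00")],
)
source.commit()

# First run: everything is newer than the initial watermark.
watermark = incremental_load(source, sink, "orders", "1970-01-01T00:00:00")

# A later run picks up only the row touched since the previous watermark.
source.execute(
    "UPDATE orders SET amount = 12.0, updated_at = '2024-01-01T02:00:00' "
    "WHERE id = 1"
)
source.commit()
watermark = incremental_load(source, sink, "orders", watermark)
```

A strict `updated_at > watermark` comparison can miss rows committed later with a timestamp equal to the watermark. A common mitigation is to re-read a small overlap window and rely on the keyed replace to absorb the duplicates.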
Data Pipeline Architecture
┌────────────┐     ┌────────────┐     ┌────────────┐
│ PostgreSQL │     │   MySQL    │     │    API     │
│   (OLTP)   │     │   (OLTP)   │     │  Sources   │
└─────┬──────┘     └─────┬──────┘     └─────┬──────┘
      │                  │                  │
      └──────────────────┼──────────────────┘
                         │
                  ┌──────▼──────┐
                  │   Airflow   │
                  │  (Extract)  │
                  └──────┬──────┘
                         │
                  ┌──────▼──────┐
                  │  Transform  │
                  │    (dbt)    │
                  └──────┬──────┘
                         │
                  ┌──────▼──────┐
                  │ ClickHouse  │
                  │   (OLAP)    │
                  └──────┬──────┘
                         │
                  ┌──────▼──────┐
                  │ Dashboards  │
                  │ (Metabase)  │
                  └─────────────┘
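The architecture above is a dependency graph, which is the same thing an orchestrator resolves before running tasks. As a rough sketch using only Python's standard library (not Airflow's API), the stages can be written as a mapping from each stage to the stages it depends on. The stage names are illustrative labels taken from the diagram.

```python
from graphlib import TopologicalSorter

# Each stage maps to the set of stages that must finish before it starts.
pipeline = {
    "extract": {"postgresql", "mysql", "api_sources"},
    "transform": {"extract"},
    "clickhouse": {"transform"},
    "dashboards": {"clickhouse"},
}

# static_order() yields every stage after all of its dependencies.
order = list(TopologicalSorter(pipeline).static_order())
for stage in order:
    print(stage)
```

The three sources come first, in no guaranteed order relative to each other, followed by extract, transform, clickhouse, and dashboards. `TopologicalSorter` raises `CycleError` if the graph contains a cycle, which is why orchestrators require the graph to be acyclic.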
Results
- Reliable data pipelines
- Near-real-time analytics (as fresh as the load schedule or CDC stream allows)
- Optimized queries
- Data quality monitoring
