Data Engineering Directory — Warehouse, Ingestion, dbt, Orchestration

Storage Layer

Snowflake, BigQuery, Databricks, ClickHouse — cloud warehouses vs lakehouses vs OLAP, with honest cost modeling.

Table Formats & Lakehouse (coming soon)

Apache Iceberg, Delta Lake, Apache Hudi — when you need a table format layer above Parquet and how each maps to your query engine.


Ingestion Layer

Ingestion & ELT (coming soon)

Fivetran, Airbyte, Stitch, Meltano — managed vs open source ingestion and when the price difference is actually justified.


Stream Processing (coming soon)

Apache Kafka, Apache Flink, AWS Kinesis — when batch is no longer enough and how to pick a streaming architecture that doesn't become a maintenance burden.


Transformation Layer

Transformation (coming soon)

dbt, SQLMesh, Coalesce — the transformation layer that most data stacks get wrong until it's painful. When dbt is the obvious choice and when it isn't.


Reverse ETL & Activation (coming soon)

Census, Hightouch, Polytomic — syncing warehouse data back to operational tools and why this unlocks the rest of your stack.

Activation Layer

Orchestration Layer

Pipeline Orchestration (coming soon)

Airflow, Prefect, Dagster, Mage — pipeline orchestration and the hidden operational cost most teams underestimate when choosing between managed and self-hosted.


Need a data engineering specialist?

Find a vetted consultant who specializes in warehouse architecture, dbt implementation, or data pipeline design — browse by specialty or get matched.
