Transformation: dbt
Overview
This directory contains a data transformation pipeline that:
Takes data from Iceberg tables in the landing zone
Transforms it using dbt (data build tool)
Creates analytics-ready tables in staging and mart schemas
The pipeline runs as an AWS ECS Fargate task using a Docker container.
How It Works
Infrastructure Components
AWS Athena: SQL query engine for data transformation
Amazon S3: Hosts the Iceberg tables for both source and transformed data
AWS Glue: Provides the catalog for Iceberg tables
Amazon ECS: Orchestrates the dbt container execution
Amazon ECR: Stores the dbt Docker container image
Terraform: Provisions and manages all infrastructure
Project Structure
Data Transformation Flow
The pipeline follows these transformation layers:
Sources: Raw data from landing tables created by ingestion pipelines
Staging: Initial cleaning, type conversion, deduplication and renaming
Mart: Final models organized by business domain, ready for analytics and reporting
Sources
Sources are defined in the sources/ folder and reference the landing tables created by the ingestion pipelines:
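A minimal sketch of what such a source definition could look like is shown below. The file name, source name, schema, and table name are all placeholders chosen for illustration, not values from this project; the script writes to a temporary directory so it does not touch the real sources/ folder.

```shell
#!/usr/bin/env bash
# Write an example dbt source definition to a temporary sources/ folder.
# Every name below (file, source, schema, table) is a hypothetical placeholder.
SOURCES_DIR="$(mktemp -d)/sources"
mkdir -p "$SOURCES_DIR"

cat > "$SOURCES_DIR/example_source.yml" <<'YAML'
version: 2

sources:
  - name: landing               # hypothetical source name
    schema: dev_landing         # hypothetical Glue database holding the landing tables
    tables:
      - name: example_table     # hypothetical landing table
YAML

# Show the generated file
cat "$SOURCES_DIR/example_source.yml"
```

With a definition like this, a staging model would read the table through dbt's source function, for example `{{ source('landing', 'example_table') }}`.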
You can generate this file automatically using the BoringData CLI:
Models Structure
The dbt models follow a layered architecture pattern:
Each folder in the models directory corresponds to a distinct schema in Athena:
models/staging/ ➡️ <environment>_staging schema in Athena
models/mart/ ➡️ <environment>_mart schema in Athena
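The folder-to-schema convention above can be expressed as a small helper. This is only an illustration of the naming rule: the function name and the example environment values are assumptions, and the actual schema resolution is done by dbt's own configuration.

```shell
#!/usr/bin/env bash
# Map a models/ subfolder to the Athena schema it is built into,
# following the <environment>_<layer> convention.
# Function name and example environments are illustrative only.
schema_for_layer() {
  local environment="$1"   # e.g. dev or prod (hypothetical values)
  local model_dir="$2"     # e.g. models/staging/ or models/mart
  local layer
  layer="${model_dir%/}"   # drop a trailing slash, if any
  layer="${layer##*/}"     # keep only the last path component
  printf '%s_%s\n' "$environment" "$layer"
}

schema_for_layer dev models/staging/   # prints dev_staging
schema_for_layer dev models/mart/      # prints dev_mart
```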
Development Guide
Option 1: Execute dbt Locally
For rapid development with local dbt execution:
Set up your environment:
Configure dbt profile: create or update ~/.dbt/profiles.yml with:
Run dbt commands:
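The three steps above could look roughly like the sketch below. It assumes the dbt-athena adapter (installed here as `dbt-athena-community`, which may differ from what this project pins), and every value in the profile (profile name, region, bucket, schema) is a placeholder. To avoid overwriting an existing ~/.dbt/profiles.yml, the script writes to a temporary directory unless a target directory is passed as the first argument, and it only runs dbt when `RUN_DBT=1` is set.

```shell
#!/usr/bin/env bash
# 1. Environment setup (assumption: a virtualenv with the dbt-athena adapter):
#      python -m venv .venv && . .venv/bin/activate
#      pip install dbt-athena-community

# 2. Write a profile. Defaults to a temp dir; pass ~/.dbt to write the real one.
PROFILE_DIR="${1:-$(mktemp -d)}"

write_profile() {
  mkdir -p "$1"
  cat > "$1/profiles.yml" <<'YAML'
my_dbt_project:                  # hypothetical; must match `profile:` in dbt_project.yml
  target: dev
  outputs:
    dev:
      type: athena
      region_name: eu-west-1                             # placeholder region
      s3_staging_dir: s3://my-athena-results-bucket/dbt/ # placeholder bucket for query results
      database: awsdatacatalog                           # Glue data catalog
      schema: dev                                        # placeholder default schema
      threads: 4
YAML
}

write_profile "$PROFILE_DIR"
echo "Wrote $PROFILE_DIR/profiles.yml"

# 3. Run dbt against that profile (skipped unless explicitly enabled).
if [ "${RUN_DBT:-0}" = "1" ] && command -v dbt >/dev/null 2>&1; then
  dbt debug --profiles-dir "$PROFILE_DIR"   # check the connection
  dbt run   --profiles-dir "$PROFILE_DIR"   # build the models
  dbt test  --profiles-dir "$PROFILE_DIR"   # run the tests
fi
```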
Option 2: Execute in AWS ECS Fargate
Once your template is deployed to AWS, you can run dbt in the cloud environment:
This will trigger an ECS Fargate task to execute the specified dbt command and store results in Iceberg.
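For readers who want to see what triggering such a task involves, here is a hedged sketch using the plain AWS CLI. It is not this project's own wrapper command. The container name `dbt`, the environment variable names, and the network settings are assumptions, and whether the command override should start with `dbt` depends on the image's entrypoint. The AWS call only runs when `RUN_TASK=1` is set.

```shell
#!/usr/bin/env bash
# Build the containerOverrides JSON that replaces the container command.
# Naive quoting: arguments must not contain double quotes or backslashes.
build_overrides() {
  local container_name="$1"; shift
  local json_args="" arg
  for arg in "$@"; do
    json_args+="${json_args:+, }\"${arg}\""
  done
  printf '{"containerOverrides": [{"name": "%s", "command": [%s]}]}\n' \
    "$container_name" "$json_args"
}

# Hypothetical container name "dbt", running only the staging models.
OVERRIDES="$(build_overrides dbt dbt run --select staging)"
echo "$OVERRIDES"

# ECS_CLUSTER, TASK_DEFINITION, SUBNET_ID and SECURITY_GROUP_ID are
# placeholders you would take from your own deployed infrastructure.
if [ "${RUN_TASK:-0}" = "1" ]; then
  aws ecs run-task \
    --cluster "$ECS_CLUSTER" \
    --task-definition "$TASK_DEFINITION" \
    --launch-type FARGATE \
    --network-configuration "awsvpcConfiguration={subnets=[$SUBNET_ID],securityGroups=[$SECURITY_GROUP_ID],assignPublicIp=ENABLED}" \
    --overrides "$OVERRIDES"
fi
```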
Deployment
For manual deployment:
This process:
Builds the Docker image locally
Pushes it to ECR
The next time you trigger an ECS task, it will use the latest image.
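The build-and-push process described above can be sketched with standard Docker and AWS CLI commands. The account ID, region, and repository name are placeholders, and this is not necessarily how the project's own deployment command is implemented. The Docker and AWS calls only run when `DEPLOY=1` is set.

```shell
#!/usr/bin/env bash
# Compose the full ECR image URI from its parts (tag defaults to "latest").
ecr_image_uri() {
  local account_id="$1" region="$2" repository="$3" tag="${4:-latest}"
  printf '%s.dkr.ecr.%s.amazonaws.com/%s:%s\n' \
    "$account_id" "$region" "$repository" "$tag"
}

# Placeholder account, region and repository name.
REGION="eu-west-1"
IMAGE_URI="$(ecr_image_uri 123456789012 "$REGION" dbt-transform)"
echo "$IMAGE_URI"

if [ "${DEPLOY:-0}" = "1" ]; then
  REGISTRY="${IMAGE_URI%%/*}"   # registry host, i.e. everything before the first slash
  # Authenticate Docker against ECR, then build and push the image.
  aws ecr get-login-password --region "$REGION" \
    | docker login --username AWS --password-stdin "$REGISTRY"
  docker build -t "$IMAGE_URI" .
  docker push "$IMAGE_URI"
fi
```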
Common Commands
Resources