A Python tool for generating dbt (data build tool) projects with configurable scale and complexity for testing purposes.
Disclaimer: This is a community-driven project initially set up with the help of GitHub Copilot and is not officially affiliated with dbt Labs. The tool is designed to help users create dbt projects for testing and learning purposes. The project is tested mainly with dbt-fabric but should work with other dbt adapters as well.
GenerateDBT creates complete dbt projects with models, macros, and seed data that can be used to test dbt functionality across different data platforms (Snowflake, BigQuery, Postgres, Microsoft Fabric, Databricks, etc.). The generated code is platform-agnostic and follows dbt best practices.
- Configurable Scale: Choose the number of models and macros to generate
- Complexity Levels: Simple, medium, or complex model patterns
- Model Layers: Staging, intermediate, and mart models following dbt conventions
- Utility Macros: String, date, aggregation, and data quality macros
- Documentation: Auto-generated `schema.yml` and README files
- Seed Data: Sample CSV files for testing
- Platform-Agnostic: Works with any dbt-supported database
```bash
git clone https://github.com/MartinHofpower/GenerateDBT.git
cd GenerateDBT
pip install -e .
```

Generate a default dbt project:

```bash
generate-dbt
```

This creates a project with:
- 10 models (staging, intermediate, and marts)
- 5 macros
- 3 seed data files
- Medium complexity
- Output to `./generated_dbt_project`
Generate a simple project:
```bash
generate-dbt --complexity simple
```

Generate a complex project with more models:

```bash
generate-dbt --num-models 20 --num-macros 10 --complexity complex
```

Generate to a specific directory:

```bash
generate-dbt --output-dir ./my_test_project --project-name my_dbt_test
```

A fully customized example:

```bash
generate-dbt \
  --num-models 15 \
  --num-macros 8 \
  --num-seeds 5 \
  --complexity complex \
  --output-dir ./test_project \
  --project-name fabric_test \
  --max-dependencies 4 \
  --no-intermediate
```

| Option | Default | Description |
|---|---|---|
| `--num-models` | 10 | Number of models to generate |
| `--num-macros` | 5 | Number of macros to generate |
| `--complexity` | `medium` | Complexity level (`simple`, `medium`, `complex`) |
| `--output-dir` | `./generated_dbt_project` | Output directory |
| `--project-name` | `test_dbt_project` | Name of the dbt project |
| `--num-seeds` | 3 | Number of seed data files |
| `--max-dependencies` | 3 | Max dependencies per model |
| `--no-staging` | `False` | Skip staging models |
| `--no-intermediate` | `False` | Skip intermediate models |
| `--no-marts` | `False` | Skip mart models |
```
generated_dbt_project/
├── dbt_project.yml        # Project configuration
├── README.md              # Generated project documentation
├── models/
│   ├── schema.yml         # Model documentation and tests
│   ├── staging/           # Staging models (light transformations)
│   │   └── stg_*.sql
│   ├── intermediate/      # Intermediate transformations
│   │   └── int_*.sql
│   └── marts/             # Business-level models
│       └── (fct_*.sql, dim_*.sql)
├── macros/                # Reusable SQL macros
│   ├── string_utils.sql
│   ├── date_utils.sql
│   └── ...
├── seeds/                 # Sample CSV data
│   ├── raw_data_1.csv
│   └── ...
└── tests/                 # Custom test directory
```
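The macro files contain small, reusable SQL helpers. As a rough idea of what lands in `macros/`, here is a minimal sketch of a string-utility macro; the name `clean_string` and its body are illustrative assumptions, not the tool's exact output:

```sql
-- Illustrative sketch only: the kind of helper the generator places under macros/.
-- clean_string is an assumed name, not necessarily what the tool emits.
{% macro clean_string(column_name) %}
    -- trim whitespace and lowercase the value for consistent comparisons
    lower(trim({{ column_name }}))
{% endmacro %}
```

A model would call it inline, e.g. `select {{ clean_string('customer_name') }} as customer_name from ...`.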
Simple:

- Basic SELECT statements
- Minimal transformations
- Simple macros with single operations

Medium:

- CTEs (Common Table Expressions)
- Basic joins and aggregations
- Macros with conditional logic
- Data quality checks

Complex (see the illustrative sketch after this list):

- Multiple CTEs and complex joins
- Window functions
- Incremental materializations
- Advanced macros with loops
- Comprehensive data quality frameworks
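To make the levels concrete, here is a minimal sketch of what a model at the medium-to-complex end might look like. The table and column names (`stg_orders`, `stg_customers`, `order_total`, and so on) are assumptions for illustration, not the generator's literal output:

```sql
-- Illustrative sketch only: the shape of a medium/complex generated model.
with orders as (
    select * from {{ ref('stg_orders') }}
),

customers as (
    select * from {{ ref('stg_customers') }}
),

joined as (
    select
        customers.customer_id,
        orders.order_id,
        orders.order_total,
        -- window function, typical of the "complex" level
        row_number() over (
            partition by customers.customer_id
            order by orders.order_date
        ) as order_rank
    from orders
    inner join customers
        on orders.customer_id = customers.customer_id
)

select
    customer_id,
    count(order_id)  as order_count,
    sum(order_total) as lifetime_value
from joined
group by customer_id
```

A simple-level model drops the joins and window functions and keeps a single flat `select`, while a complex-level model may additionally be materialized incrementally.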
- Navigate to the project:

  ```bash
  cd generated_dbt_project
  ```

- Install dbt and adapter:

  ```bash
  pip install dbt-core dbt-<your-adapter>
  ```

  Replace `<your-adapter>` with your platform (e.g., `dbt-snowflake`, `dbt-bigquery`, `dbt-fabric`, `dbt-postgres`).

- Configure `profiles.yml`: Create or update `~/.dbt/profiles.yml` with your database credentials:

  ```yaml
  test_dbt_project:
    outputs:
      dev:
        type: <adapter_type>
        # Add your connection details
    target: dev
  ```

- Test connection:

  ```bash
  dbt debug
  ```

- Load seed data:

  ```bash
  dbt seed
  ```

- Run models:

  ```bash
  dbt run
  ```

- Run tests (a sketch of a custom singular test follows this list):

  ```bash
  dbt test
  ```

- Generate and view docs:

  ```bash
  dbt docs generate
  dbt docs serve
  ```
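`dbt test` runs the tests declared in the generated `schema.yml` as well as any SQL files placed in the `tests/` directory. A singular test is simply a query that should return zero rows; the sketch below is only an illustration, and the `fct_orders` model and `order_total` column are assumed names rather than guaranteed output of the generator:

```sql
-- Illustrative sketch only: a singular test that could be placed in tests/.
-- dbt treats any rows returned by this query as test failures.
-- fct_orders and order_total are hypothetical names.
select
    order_id,
    order_total
from {{ ref('fct_orders') }}
where order_total < 0
```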
This tool generates platform-agnostic dbt code that works with any supported adapter:
- Snowflake: `pip install dbt-snowflake`
- BigQuery: `pip install dbt-bigquery`
- Postgres: `pip install dbt-postgres`
- Redshift: `pip install dbt-redshift`
- Microsoft Fabric: `pip install dbt-fabric`
- Databricks: `pip install dbt-databricks`

Simply install the appropriate adapter and configure your `profiles.yml` accordingly.
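Platform independence in dbt generally comes from sticking to portable SQL and dbt's built-in cross-database macros, which compile to adapter-specific syntax at build time. The following is a hedged illustration of that pattern, not a literal excerpt from the generated code; the `stg_customers` model and its columns are assumptions:

```sql
-- Illustrative sketch only: cross-database macros keep the SQL portable.
-- dbt compiles each macro to the correct syntax for the configured adapter.
select
    {{ dbt.concat(['first_name', "' '", 'last_name']) }} as full_name,
    {{ dbt.date_trunc('month', 'signup_date') }}         as signup_month,
    {{ dbt.current_timestamp() }}                         as loaded_at
from {{ ref('stg_customers') }}
```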
- Testing dbt on new platforms: Quickly generate test projects for Microsoft Fabric, Databricks, or other platforms
- Learning dbt: Study example projects with various patterns
- Training: Create sample projects for teaching dbt concepts
- Performance testing: Generate large projects to test performance
- Debugging: Create reproducible test cases for dbt issues
- Python 3.7+
- PyYAML
- Click
```bash
git clone https://github.com/MartinHofpower/GenerateDBT.git
cd GenerateDBT
pip install -e .
```

Contributions are welcome! Please feel free to submit a Pull Request.
This project is licensed under the MIT License - see the LICENSE file for details.
- Built for testing dbt projects across various data platforms
- Follows dbt best practices and conventions
- Inspired by the need for flexible, scalable dbt testing scenarios