feat: Improve reproducibility and templating using mise #13

Bafbi · 2025-11-28T21:06:18Z

Description

This PR implements a "zero-friction" repository setup using mise to automate dependency management, environment configuration, and infrastructure tasks. It addresses the need for reproducible environments and dynamic configuration for dbt and prefect.

Related Issue

Closes #11

Key Changes

Mise Integration (mise.toml):
- Defined tools (uv, opentofu, gcloud) and tasks to standardize development workflows.
- Added tasks for infrastructure management: infra:init, infra:plan, infra:outputs.
- Added tasks for configuration rendering: dbt:render_profiles, prefect:render_configs.
- Added tasks for GCP auth and setup: gcloud:auth, gcloud:create_sa_key.
Scripts & Automation (scripts/):
- sync_env: Automates .env creation from .env.example, preserving existing values.
- render_template: A Jinja2-based renderer that injects environment variables and Terraform outputs into configuration files.
- setup_prefect_blocks.py: Automates the creation of Prefect blocks (GCP credentials, BigQuery targets).
- get_git_env.sh: Exports git metadata as environment variables for use in templates.
Templating:
- dbt/profiles.tpl.yml: Dynamic dbt profile template supporting dev/prod environments via env vars or Terraform outputs.
- prefect.tpl.yml: Dynamic Prefect configuration template.
Infrastructure:
- Updated Terraform configuration in infrastructure/ to align with the new directory structure and outputs.
- Added profiles.toml for Prefect profile management.
Documentation:
- Updated README.md with instructions on using mise and uv for local development.
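The `sync_env` behaviour described above (create `.env` from `.env.example` while preserving existing values) can be sketched roughly as follows. This is a minimal, standard-library illustration and not the script from this PR, which depends on `python-dotenv`. The function names `parse_env` and `sync_env`, the choice to append unknown keys at the end, and the sample values `my-project` and `CUSTOM_FLAG` are assumptions made for the example.

```python
def parse_env(text: str) -> dict[str, str]:
    """Parse KEY=VALUE lines, skipping blank lines and comments."""
    values: dict[str, str] = {}
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("#") or "=" not in stripped:
            continue
        key, _, value = stripped.partition("=")
        values[key.strip()] = value.strip()
    return values


def sync_env(example_text: str, current_text: str) -> str:
    """Return new .env content: the example's layout with the user's values."""
    current = parse_env(current_text)
    out: list[str] = []
    for line in example_text.splitlines():
        stripped = line.strip()
        if stripped and not stripped.startswith("#") and "=" in stripped:
            key = stripped.partition("=")[0].strip()
            if key in current:
                # Keep the value the user has already set.
                line = f"{key}={current[key]}"
        out.append(line)
    # Assumption: keys that only exist in the user's .env are kept at the end.
    example_keys = parse_env(example_text).keys()
    out.extend(f"{k}={v}" for k, v in current.items() if k not in example_keys)
    return "\n".join(out) + "\n"


example = "GCLOUD_PROJECT=\n# region hint\nGCLOUD_REGION=\nPREFECT_API_URL=\n"
current = "GCLOUD_PROJECT=my-project\nCUSTOM_FLAG=1\n"
merged = sync_env(example, current)
print(merged)
```

Working from the example file's layout keeps comments and ordering in sync with `.env.example`, while user-supplied values are never overwritten.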

Verification Steps

Setup Environment:
```
mise install
mise run sync_env
```
Infrastructure (Optional/Dry-run):
```
mise run infra:init
mise run infra:plan
```

Render Configurations:

mise run dbt:render_profiles
# Check dbt/profiles.yml content
mise run prefect:render_configs
# Check prefect.yml content

Run Tests:
```
uv run prefect_flows/test.py
```

Impact

Developers now need mise installed to work with the repo effectively.
dbt/profiles.yml and prefect.yml are now generated artifacts and should not be manually edited (they are git-ignored).

Copilot

Pull request overview

This PR introduces mise as a zero-friction automation tool to standardize development workflows and improve reproducibility across the repository. It replaces the previous Python-based infrastructure/setup_profiles/ module with simpler, more maintainable scripts that leverage Jinja2 templating for dynamic configuration generation.

Key Changes:

- Implemented mise task runner with comprehensive task definitions for infrastructure, configuration rendering, and GCP operations
- Added template-based configuration system using Jinja2 for dbt profiles and prefect configs
- Created automation scripts for environment synchronization, template rendering, Prefect block setup, and Git metadata extraction

Reviewed changes

Copilot reviewed 27 out of 29 changed files in this pull request and generated 9 comments.

Show a summary per file

| File | Description |
| --- | --- |
| `mise.toml` | Defines tools, tasks, and environment variables for the mise-based workflow |
| `scripts/sync_env` | Python script to synchronize .env files from .env.example, preserving user values |
| `scripts/render_template` | Jinja2 template renderer that injects env vars and Terraform outputs |
| `scripts/setup_prefect_blocks.py` | Automated Prefect block creation from rendered dbt profiles |
| `scripts/get_git_env.sh` | Bash script to export Git metadata as environment variables |
| `dbt/profiles.tpl.yml` | Jinja2 template for dbt profiles with environment-based configuration |
| `prefect.tpl.yml` | Jinja2 template for Prefect deployment configuration |
| `dbt/profiles.yml` | Generated dbt profiles file (should be git-ignored) |
| `prefect.yml` | Generated Prefect configuration (should be git-ignored) |
| `infrastructure/variables.tf` | Updated Terraform variables to use suffix-based naming |
| `infrastructure/main.tf` | Refactored to use local variables for dataset IDs |
| `infrastructure/providers.tf` | Extracted provider configuration to separate file |
| `infrastructure/outputs.tf` | Simplified outputs, removed redundant values |
| `.env.example` | Added template for required environment variables |
| `README.md` | Updated with mise-based workflow instructions |
| `infrastructure/README.md` | Comprehensive documentation of automated and manual workflows |

Files not reviewed (1)

infrastructure/.terraform.lock.hcl: Language not supported

Copilot

Pull request overview

Copilot reviewed 26 out of 28 changed files in this pull request and generated 13 comments.

Files not reviewed (1)

infrastructure/.terraform.lock.hcl: Language not supported

Copilot · 2025-11-29T20:59:11Z

scripts/render_template

@@ -0,0 +1,204 @@
+#!/usr/bin/env -S uv run --script


The requires-python line is missing the closing triple slash comment marker # ///. The inline PEP 723 script metadata format requires both opening # /// script and closing # /// markers.

Suggested change

-#!/usr/bin/env -S uv run --script
+#!/usr/bin/env -S uv run --script
+# /// script

Copilot · 2025-11-29T20:59:11Z

scripts/get_git_env.sh

+  # branch or short ref
+  GIT_BRANCH=$(_safe git rev-parse --abbrev-ref HEAD 2>/dev/null || echo "")
+  if [ "$GIT_BRANCH" = "HEAD" ]; then
+    GIT_BRANCH="$(git describe --all --always 2>/dev/null || echo "${GIT_COMMIT_SHORT}")"


On line 63, `git describe` is not wrapped with the `_safe` function, unlike other git commands. If this command fails (e.g., no git history), it could cause the script to exit due to `set -euo pipefail`. Wrap this with `_safe` for consistency: `GIT_BRANCH="$(_safe git describe --all --always || echo "${GIT_COMMIT_SHORT}")"`.

Suggested change

-    GIT_BRANCH="$(git describe --all --always 2>/dev/null || echo "${GIT_COMMIT_SHORT}")"
+    GIT_BRANCH="$(_safe git describe --all --always || echo "${GIT_COMMIT_SHORT}")"

Copilot · 2025-11-29T20:59:12Z

infrastructure/providers.tf

+  }
+}
+
+provider "google" {}


The provider "google" block is missing required configuration. Without explicit project and region arguments, Terraform will rely on environment variables or default credentials, which may lead to unexpected behavior. Consider adding: project = var.project_id and region = var.region to ensure explicit configuration.

Suggested change

-provider "google" {}
+provider "google" {
+  project = var.project_id
+  region  = var.region
+}

Copilot · 2025-11-29T20:59:12Z

infrastructure/README.md

+5. **Plan Infrastructure:** `mise run infra:plan` shows the changes that will be applied.
+6. **Apply Infrastructure:**
+   ```bash
+   tofu -chdir=infrastructure apply


The documentation instructs users to run tofu -chdir=infrastructure apply directly in step 6, but this bypasses the mise task that sets up the required environment variables (TF_VAR_region, TF_VAR_project_id, TF_VAR_sa_dbt_id). This should be mise run infra:apply to ensure consistency with the automated approach.

Suggested change

-   tofu -chdir=infrastructure apply
+   mise run infra:apply

Copilot · 2025-11-29T20:59:12Z

scripts/setup_prefect_blocks.py

+        if value is None:
+            value = None


The condition if value is None: value = None on lines 90-91 is redundant and has no effect. This should be removed or the logic should be clarified if a different check was intended.

Suggested change

-        if value is None:
-            value = None

Copilot · 2025-11-29T20:59:13Z

infrastructure/outputs.tf

@@ -6,24 +6,4 @@ output "bq_dev_dataset_id" {
 output "bq_prod_dataset_id" {
  description = "BigQuery prod dataset ID"
  value       = google_bigquery_dataset.prod_dataset.dataset_id


Several terraform outputs have been removed (sa_email, project_id, region, sa_key_path) that are still referenced in the template files (e.g., dbt/profiles.tpl.yml uses tf.project_id, tf.region, tf.sa_key_path). While these have fallback environment variables, the templates may fail to render correctly if both terraform outputs and environment variables are missing. Consider adding these outputs back or documenting the required environment variables.

Suggested change

-  value       = google_bigquery_dataset.prod_dataset.dataset_id
+  value       = google_bigquery_dataset.prod_dataset.dataset_id
+}
+output "sa_email" {
+  description = "Service account email"
+  value       = google_service_account.sa.email
+}
+output "project_id" {
+  description = "GCP project ID"
+  value       = var.project_id
+}
+output "region" {
+  description = "GCP region"
+  value       = var.region
+}
+output "sa_key_path" {
+  description = "Path to the service account key file"
+  value       = local.sa_key_path
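The failure mode raised in this comment (a value missing from both the Terraform outputs and the environment) can be illustrated with a small lookup helper. This is a sketch under stated assumptions, not the logic of `scripts/render_template`: the helper names `load_tf_outputs` and `resolve`, the precedence (Terraform output first, environment variable second), and the pairing of output names with environment variable names are all hypothetical. The input shape follows the JSON printed by `tofu output -json`, where each output is an object with a `value` field.

```python
import json
from typing import Mapping


def load_tf_outputs(raw_json: str) -> dict[str, object]:
    """Flatten `tofu output -json` ({"name": {"value": ...}}) to {"name": value}."""
    parsed = json.loads(raw_json) if raw_json.strip() else {}
    return {name: entry["value"] for name, entry in parsed.items()}


def resolve(name: str, env_var: str, tf: Mapping[str, object], env: Mapping[str, str]) -> str:
    """Prefer the Terraform output, fall back to the env var, fail loudly otherwise."""
    if tf.get(name) not in (None, ""):
        return str(tf[name])
    if env.get(env_var):
        return env[env_var]
    raise KeyError(f"'{name}' is not a Terraform output and {env_var} is unset")


# Sample data: only the dataset output exists, as after the outputs.tf cleanup.
tf = load_tf_outputs(
    '{"bq_dev_dataset_id": {"value": "dev_ds", "type": "string", "sensitive": false}}'
)
env = {"GCLOUD_PROJECT": "my-project"}  # hypothetical value

print(resolve("bq_dev_dataset_id", "BQ_DEV_DATASET", tf, env))
print(resolve("project_id", "GCLOUD_PROJECT", tf, env))
```

Raising on a double miss surfaces the problem at render time instead of writing an empty value into `dbt/profiles.yml`.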

Copilot · 2025-11-29T20:59:14Z

.env.example

+GCLOUD_PROJECT=
+# list for available regions: https://cloud.google.com/about/locations ; auto : https://cloud.withgoogle.com/region-picker/
+GCLOUD_REGION=
+PREFECT_API_URL=
+PREFECT_API_KEY=


The .env.example file is missing several environment variables that are referenced in the templates and scripts, including GCP_SA_KEY_PATH, BQ_DEV_DATASET, BQ_PROD_DATASET, and potentially others. Add these missing variables with appropriate comments to guide users in setting up their environment correctly.

Copilot · 2025-11-29T20:59:14Z

mise.toml

+# TF_LOG = { value = "DEBUG" }
+PREFECT_PROFILES_PATH = { value = "profiles.toml" }
+
+GCLOUD_SA_DBT_ID = { value = "dbt-sa" }


The default value for GCLOUD_SA_DBT_ID is "dbt-sa" in mise.toml, but the corresponding Terraform variable sa_dbt_id has a default of "projet-bi-dbt-sa" in infrastructure/variables.tf. This inconsistency could lead to mismatches between the service account created by Terraform and the one expected by other scripts. Align these default values or document the intentional difference.

Suggested change

-GCLOUD_SA_DBT_ID = { value = "dbt-sa" }
+GCLOUD_SA_DBT_ID = { value = "projet-bi-dbt-sa" }

Copilot · 2025-11-29T20:59:14Z

scripts/sync_env

+# requires-python = ">=3.10"
+# dependencies = [
+#     "python-dotenv",
+#     "click",


The dependency 'click' is listed in the script dependencies but is never imported or used. This should be removed to avoid unnecessarily installing unused packages.

Suggested change

-#     "click",

Copilot · 2025-11-29T20:59:15Z

scripts/setup_prefect_blocks.py

@@ -0,0 +1,262 @@
+#!/usr/bin/env -S uv run --script


The requires-python line is missing the closing triple slash comment marker # ///. The inline PEP 723 script metadata format requires both opening # /// script and closing # /// markers.

Suggested change

-#!/usr/bin/env -S uv run --script
+#!/usr/bin/env -S uv run --script
+# /// script

* chore: update asset_base path in dbt configuration and adjust file copy commands in GitHub Actions workflow
* chore: update asset paths in dbt documentation and GitHub Actions workflow for consistency
* chore: standardize asset paths in dbt documentation workflow and update configuration for clarity
* chore: correct asset path in fct__viewings
* chore: minor changes
* chore: update asset_base path in erd_config.yml for consistency
* chore: update IAM role for storage admin and adjust seed script parameters for data generation
* feat: enhance data ingestion process by adding ingestion_date to models and updating GCS paths for Hive partitioning
* fix: correct release date calculation in seed_script to use a 30-day interval for seasons
* feat: implement snapshot strategy in models and update external table definitions for ingestion_date partitioning

Bafbi added 18 commits November 17, 2025 12:28

feat: Add initial configuration and automation tasks for infrastructure setup

6ecced8

refactor: Remove unused Terraform resources and variables for cleaner configuration

a6d15ff

feat: Update environment configuration and add service account key generation task

50e6fdf

refactor(infra): rename dbt service account var and cleanup outputs

772ed8d

feat(dbt): update profiles template to use dynamic env vars

1baf6e0

chore(env): update mise config and env example for dynamic vars

f20d011

feat(prefect): replace static config with template for dynamic git url

2062ccb

feat(infra): use dataset suffix vars and local dataset ids (rework main.tf and variables.tf)

bde4aad

feat(dbt): update template to use env and tf context + add rendered profiles.yml

6799391

feat(scripts): add render_template and setup_prefect_blocks CLIs (render + create Prefect blocks)

9297949

chore(mise): add dbt render & prefect setup tasks; remove deprecated infra:sync_vars task

d1a7d79

feat(prefect): add rendered prefect.yml, update template usage and test flow to use new block names

59638a1

chore(gitignore): stop tracking terraform-outputs.json (now added to .gitignore)

103ed47

chore(gitignore): add terraform-outputs.json to .gitignore

1e67b4a

refactor(infra): remove deprecated infrastructure/setup_profiles module in favor of scripts

c3a753c

docs: update mise workflow

b4cffc8

chore: scripts and ignore tweaks

9ade7bf

fix: some nitpick from code review

fbaf7e9

Bafbi linked an issue Nov 28, 2025 that may be closed by this pull request

Improve reproducibility and tempalating of the repo using mise #11

Open

Bafbi requested review from CyprienKelma and Copilot November 28, 2025 21:06

Copilot started reviewing on behalf of Bafbi November 28, 2025 21:07

Bafbi added this to Summer Media Nov 28, 2025

github-project-automation bot moved this to Todo in Summer Media Nov 28, 2025

Bafbi added the documentation and enhancement labels Nov 28, 2025

Copilot finished reviewing on behalf of Bafbi November 28, 2025 21:09

Copilot AI reviewed Nov 28, 2025

Bafbi and others added 2 commits November 28, 2025 22:20

chore: update .gitignore and untrack templated files

bc6cfe1

chore: apply nitpicks

74014b0

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

Bafbi added 9 commits November 29, 2025 13:28

fix: needing the .env to enable mise

35c7c1f

fix: update environment variable references in mise.toml for consistency

438d16f

fix: replace local variables with inline expressions for dataset IDs in main.tf

fb6255b

fix: update .gitignore to ignore tfvars files and clean up mise.toml formatting

55e47ec

feat: add 'infra:apply' task to run 'opentofu apply' and update usage flags

f7fda74

feat: enhance mise.toml with setup task and improve infra dependencies

31c5834

feat: update task dependencies in mise.toml for improved setup and integration

38f677d

feat: update service account key generation task to use default email from environment variables

d29790f

feat: update dbt and prefect tasks to include additional sources and modify save flag default

1c65a91

Bafbi requested a review from Copilot November 29, 2025 20:53

Copilot started reviewing on behalf of Bafbi November 29, 2025 20:54

Copilot finished reviewing on behalf of Bafbi November 29, 2025 20:57

Copilot AI reviewed Nov 29, 2025

Merge branch 'main' into feat/improve-deployment

1d12e68

Bafbi moved this from Todo to In Progress in Summer Media Nov 29, 2025

Bafbi self-assigned this Nov 29, 2025

Bafbi removed this from Summer Media Nov 29, 2025

CyprienKelma and others added 13 commits December 13, 2025 17:22

feat: Setup Prefect and IaC config to support prod environment (#17)

2fceac6

hotfix: refactor/prod dataset (#18)

114c56a

* feat: Setup Prefect and IaC config to support prod environment
* hotfix: change Prefect deployment shell script

chore: fix description content of fact table

030b7ca

hotfix: update Prefect deployment

6738666

hotfix: update dbt commands to include target specification in setup_profiles and pipeline flows

fdcc637

refactor: minor changes on Prefect deployment scripts

bd19068

chore: add dbt dependency installation step to Prefect deployment scripts

aa32d8a

hotfix: update dbt command execution

74ade0e

chore: update Prefect deployment configuration and refine model descriptions in dbt YAML files

1fe66c5

chore: update .gitignore to include presentation assets and remove obsolete test image

917f8bd

docs: add MOE POC and related documentation (presentation, annexes, graphs, references)

844955f

chore(mise): add typst to tools in mise.toml

e886bb2
