Skip to content

Refactor baml_builtins_macros: split into modules, deduplicate, and improve readability#3090

Open
hellovai wants to merge 4 commits intocanaryfrom
refactor/builtins-macros-dedup
Open

Refactor baml_builtins_macros: split into modules, deduplicate, and improve readability#3090
hellovai wants to merge 4 commits intocanaryfrom
refactor/builtins-macros-dedup

Conversation

@hellovai
Copy link
Contributor

@hellovai hellovai commented Feb 9, 2026

Summary

  • Split monolithic 2722-line lib.rs into 7 focused modules: parse.rs, collect.rs, util.rs, codegen_builtins.rs, codegen_native.rs, codegen_sys_ops.rs, codegen_accessors.rs. The new lib.rs is just 65 lines of thin entry points.
  • Replaced opaque tuples in NativeFnDef (e.g., receiver: Option<(String, String, bool, bool)>) with named structs (ReceiverInfo, ParamInfo, ReturnInfo) — making all destructuring sites self-documenting.
  • Extracted shared CollectedBuiltins struct with a from_modules() constructor, eliminating the duplicated 15-line parse+collect block across all 4 proc macros.
  • Integrated AccessorTypeDef collection into the shared walker (CollectContext), removing the separate collect_accessor_types / collect_accessor_types_from_module functions.
  • Unified utility functions: removed sys_op_ref_type_ident (callers now use path_to_rust_ident + HashMap lookup) and sys_op_receiver_name (callers now use to_snake_case).
  • Removed unnecessary Option wrapper from BuiltinFieldDef.ty (it was always Some).
  • Code-gen builtin_types module in bex_heap from the with_builtins! DSL (from prior work, included in this branch).

Test plan

  • cargo check passes with zero warnings
  • cargo test --lib passes (78+ tests, 0 failures)
  • cargo clippy passes (enforced by pre-commit hook)
  • cargo fmt passes (enforced by pre-commit hook)
  • All pre-commit hooks pass

Made with Cursor

Summary by CodeRabbit

  • New Features

    • Automatic generation and registration of builtin accessors, native functions, and system ops at compile time.
    • New proc-macro to emit accessor types and runtime glue for builtins.
  • Chores

    • Centralized DSL parsing and collection pipeline for builtin declarations.
    • Reorganized codegen into dedicated modules and updated workspace deps.
    • Replaced hand-written builtin accessors with the generated mechanism.

…mprove readability

- Split monolithic 2722-line lib.rs into 7 focused modules:
  parse.rs, collect.rs, util.rs, codegen_builtins.rs,
  codegen_native.rs, codegen_sys_ops.rs, codegen_accessors.rs
- Replace opaque tuples in NativeFnDef with named structs
  (ReceiverInfo, ParamInfo, ReturnInfo)
- Extract shared CollectedBuiltins struct with from_modules()
  constructor, eliminating duplicated parse+collect blocks across
  all 4 proc macros
- Integrate AccessorTypeDef collection into the shared walker,
  removing duplicate collect_accessor_types functions
- Unify path_to_rust_ident (removed sys_op_ref_type_ident duplicate)
- Replace sys_op_receiver_name with to_snake_case
- Change BuiltinFieldDef.ty from Option<TokenStream2> to TokenStream2
- Code-gen builtin_types module in bex_heap from with_builtins! DSL

Co-authored-by: Cursor <cursoragent@cursor.com>
@vercel
Copy link

vercel bot commented Feb 9, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

Project Deployment Actions Updated (UTC)
beps Ready Ready Preview, Comment Feb 9, 2026 7:26pm
promptfiddle Ready Ready Preview, Comment Feb 9, 2026 7:26pm

Request Review

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Feb 9, 2026

📝 Walkthrough

Walkthrough

Adds a DSL parser and collection pass plus multiple code generators (accessors, builtins registry, native glue, sys_op traits), utility helpers, and wires generated builtin accessors into bex_heap by replacing handwritten builtin_types and adding workspace deps.

Changes

Cohort / File(s) Summary
Parsing & Collection
baml_language/crates/baml_builtins_macros/src/parse.rs, baml_language/crates/baml_builtins_macros/src/collect.rs
New DSL parser for with_builtins! and a two-pass CollectedBuiltins builder that collects builtin defs, types, native/sys_op metadata, and accessor-type info; exposes AST and collection structs consumed by codegen.
Accessor Codegen
baml_language/crates/baml_builtins_macros/src/codegen_accessors.rs
New generator producing owned structs and typed accessor wrappers for collected accessor types, AsBexExternalValue impls, accessor methods (heap param when needed), into_owned conversions, and nested modules for generated types.
Builtins Registry Codegen
baml_language/crates/baml_builtins_macros/src/codegen_builtins.rs
Generates path constants and ALL slice, iteration macros (for_all_builtins, for_native_builtins, for_native_functions, for_all_sys_ops), and static collections of builtin signatures and type definitions to centralize builtin metadata.
Native Function Glue
baml_language/crates/baml_builtins_macros/src/codegen_native.rs
Generates NativeFunctions trait with clean methods and default glue (__<fn>), argument extraction from &[Value], return conversions to Value, get_native_fn dispatch, and public wrapper functions delegating to the glue.
SysOp Traits & Dispatch
baml_language/crates/baml_builtins_macros/src/codegen_sys_ops.rs
Generates per-module SysOp{Module} traits (clean methods + hidden glue), argument extraction/error mapping, and SysOps::from_impl<T: ...>() constructor to assemble a dispatch table from trait implementations.
Utilities & Macro Entrypoints
baml_language/crates/baml_builtins_macros/src/util.rs, baml_language/crates/baml_builtins_macros/src/lib.rs
Adds identifier/type helpers (casing, path→ident, type→pattern) and refactors lib.rs to use CollectedBuiltins with dedicated codegen modules; adds proc-macro generate_builtin_accessors.
Workspace & Heap Integration
baml_language/crates/bex_heap/Cargo.toml, baml_language/crates/bex_heap/src/accessor.rs
Adds workspace dependencies (baml_builtins, baml_builtins_macros) and replaces the handwritten builtin_types module in bex_heap with baml_builtins::with_builtins!(baml_builtins_macros::generate_builtin_accessors);.

Estimated code review effort

🎯 5 (Critical) | ⏱️ ~120 minutes

Possibly related PRs

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately describes the main changes: refactoring baml_builtins_macros by splitting into modules, deduplicating logic, and improving readability.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch refactor/builtins-macros-dedup

Warning

Review ran into problems

🔥 Problems

Git: Failed to clone repository. Please run the @coderabbitai full review command to re-trigger a full review. If the issue persists, set path_filters to include or exclude specific files.


Comment @coderabbitai help to get the list of available commands and usage tips.

@codspeed-hq
Copy link

codspeed-hq bot commented Feb 9, 2026

CodSpeed Performance Report

Merging this PR will not alter performance

Comparing refactor/builtins-macros-dedup (2aac7d9) with canary (a6c3995)

⚠️ Unknown Walltime execution environment detected

Using the Walltime instrument on standard Hosted Runners will lead to inconsistent data.

For the most accurate results, we recommend using CodSpeed Macro Runners: bare-metal machines fine-tuned for performance measurement consistency.

Summary

✅ 15 untouched benchmarks
⏩ 63 skipped benchmarks1

Footnotes

  1. 63 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, click here and archive them to remove them from the performance reports.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 5

…n, receiver order

- codegen_accessors: MapStringUnknown value_type -> Ty::BuiltinUnknown
- codegen_native: compile_error! when #[uses(vm)] + mutable receiver
- codegen_native: add Option<...> handling in generate_single_extraction
  (extraction_rhs_expr helper + Null -> None, other -> Some(inner))
- parse: validate #[uses(...)] arguments, error on unknown (e.g. uses(typo))
- parse: enforce self as first parameter (reject if params non-empty or receiver set)

Co-authored-by: Cursor <cursoragent@cursor.com>
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 5

Comment on lines +258 to +263
FieldTypeKind::MapStringUnknown => {
quote!(#name: self.#name(heap)?
.into_iter()
.map(|(k, v)| Ok((k, v.as_owned_but_very_slow(heap)?)))
.collect::<Result<_, _>>()?)
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick | 🔵 Trivial

Performance note: as_owned_but_very_slow used for MapStringUnknown conversion.

The method name indicates this is an expensive operation. For into_owned on types with MapStringUnknown fields, be aware of potential performance implications if called frequently.

@codspeed-hq
Copy link

codspeed-hq bot commented Feb 9, 2026

Unable to generate the performance report

There was an internal error while processing the run's data. We're working on fixing the issue. Feel free to contact us on Discord or at support@codspeed.io if the issue persists.

2 similar comments
@codspeed-hq
Copy link

codspeed-hq bot commented Feb 9, 2026

Unable to generate the performance report

There was an internal error while processing the run's data. We're working on fixing the issue. Feel free to contact us on Discord or at support@codspeed.io if the issue persists.

@codspeed-hq
Copy link

codspeed-hq bot commented Feb 9, 2026

Unable to generate the performance report

There was an internal error while processing the run's data. We're working on fixing the issue. Feel free to contact us on Discord or at support@codspeed.io if the issue persists.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Comment on lines +232 to +256
/// Map BAML type names to Rust output types.
fn rust_type_for_output(type_name: &str, is_generic: bool) -> TokenStream2 {
if is_generic {
return quote!(Value);
}

match type_name {
"String" => quote!(String),
"i64" => quote!(i64),
"f64" => quote!(f64),
"bool" => quote!(bool),
"()" => quote!(()),
"Media" => quote!(MediaValue),
"PromptAst" => quote!(PromptAst),
"PrimitiveClient" => quote!(PrimitiveClient),
t if t.starts_with("Array") => quote!(Vec<Value>),
t if t.starts_with("Map") => quote!(IndexMap<String, Value>),
t if t.starts_with("Option<") => {
let inner = &t[7..t.len() - 1];
let inner_type = rust_type_for_output(inner, false);
quote!(Option<#inner_type>)
}
_ => quote!(Value),
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Trim inner type for Option<…> outputs.
rust_type_for_output slices the inner type without trimming. If whitespace is present (e.g., Option< String >), the inner mapping can fall back to Value. Mirror the input mapping’s .trim() to avoid this mismatch.

🛠️ Suggested fix
-            let inner = &t[7..t.len() - 1];
+            let inner = t[7..t.len() - 1].trim();

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant