Refactor baml_builtins_macros: split into modules, deduplicate, and improve readability#3090
Refactor baml_builtins_macros: split into modules, deduplicate, and improve readability#3090
Conversation
…mprove readability - Split monolithic 2722-line lib.rs into 7 focused modules: parse.rs, collect.rs, util.rs, codegen_builtins.rs, codegen_native.rs, codegen_sys_ops.rs, codegen_accessors.rs - Replace opaque tuples in NativeFnDef with named structs (ReceiverInfo, ParamInfo, ReturnInfo) - Extract shared CollectedBuiltins struct with from_modules() constructor, eliminating duplicated parse+collect blocks across all 4 proc macros - Integrate AccessorTypeDef collection into the shared walker, removing duplicate collect_accessor_types functions - Unify path_to_rust_ident (removed sys_op_ref_type_ident duplicate) - Replace sys_op_receiver_name with to_snake_case - Change BuiltinFieldDef.ty from Option<TokenStream2> to TokenStream2 - Code-gen builtin_types module in bex_heap from with_builtins! DSL Co-authored-by: Cursor <cursoragent@cursor.com>
|
The latest updates on your projects. Learn more about Vercel for GitHub.
|
📝 WalkthroughWalkthroughAdds a DSL parser and collection pass plus multiple code generators (accessors, builtins registry, native glue, sys_op traits), utility helpers, and wires generated builtin accessors into bex_heap by replacing handwritten builtin_types and adding workspace deps. Changes
Estimated code review effort🎯 5 (Critical) | ⏱️ ~120 minutes Possibly related PRs
🚥 Pre-merge checks | ✅ 3✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing touches
🧪 Generate unit tests (beta)
Warning Review ran into problems🔥 ProblemsGit: Failed to clone repository. Please run the Comment |
CodSpeed Performance ReportMerging this PR will not alter performanceComparing
|
baml_language/crates/baml_builtins_macros/src/codegen_native.rs
Outdated
Show resolved
Hide resolved
…n, receiver order - codegen_accessors: MapStringUnknown value_type -> Ty::BuiltinUnknown - codegen_native: compile_error! when #[uses(vm)] + mutable receiver - codegen_native: add Option<...> handling in generate_single_extraction (extraction_rhs_expr helper + Null -> None, other -> Some(inner)) - parse: validate #[uses(...)] arguments, error on unknown (e.g. uses(typo)) - parse: enforce self as first parameter (reject if params non-empty or receiver set) Co-authored-by: Cursor <cursoragent@cursor.com>
| FieldTypeKind::MapStringUnknown => { | ||
| quote!(#name: self.#name(heap)? | ||
| .into_iter() | ||
| .map(|(k, v)| Ok((k, v.as_owned_but_very_slow(heap)?))) | ||
| .collect::<Result<_, _>>()?) | ||
| } |
There was a problem hiding this comment.
🧹 Nitpick | 🔵 Trivial
Performance note: as_owned_but_very_slow used for MapStringUnknown conversion.
The method name indicates this is an expensive operation. For into_owned on types with MapStringUnknown fields, be aware of potential performance implications if called frequently.
Unable to generate the performance reportThere was an internal error while processing the run's data. We're working on fixing the issue. Feel free to contact us on Discord or at support@codspeed.io if the issue persists. |
2 similar comments
Unable to generate the performance reportThere was an internal error while processing the run's data. We're working on fixing the issue. Feel free to contact us on Discord or at support@codspeed.io if the issue persists. |
Unable to generate the performance reportThere was an internal error while processing the run's data. We're working on fixing the issue. Feel free to contact us on Discord or at support@codspeed.io if the issue persists. |
| /// Map BAML type names to Rust output types. | ||
| fn rust_type_for_output(type_name: &str, is_generic: bool) -> TokenStream2 { | ||
| if is_generic { | ||
| return quote!(Value); | ||
| } | ||
|
|
||
| match type_name { | ||
| "String" => quote!(String), | ||
| "i64" => quote!(i64), | ||
| "f64" => quote!(f64), | ||
| "bool" => quote!(bool), | ||
| "()" => quote!(()), | ||
| "Media" => quote!(MediaValue), | ||
| "PromptAst" => quote!(PromptAst), | ||
| "PrimitiveClient" => quote!(PrimitiveClient), | ||
| t if t.starts_with("Array") => quote!(Vec<Value>), | ||
| t if t.starts_with("Map") => quote!(IndexMap<String, Value>), | ||
| t if t.starts_with("Option<") => { | ||
| let inner = &t[7..t.len() - 1]; | ||
| let inner_type = rust_type_for_output(inner, false); | ||
| quote!(Option<#inner_type>) | ||
| } | ||
| _ => quote!(Value), | ||
| } | ||
| } |
There was a problem hiding this comment.
Trim inner type for Option<…> outputs.
rust_type_for_output slices the inner type without trimming. If whitespace is present (e.g., Option< String >), the inner mapping can fall back to Value. Mirror the input mapping’s .trim() to avoid this mismatch.
🛠️ Suggested fix
- let inner = &t[7..t.len() - 1];
+ let inner = t[7..t.len() - 1].trim();
Summary
lib.rsinto 7 focused modules:parse.rs,collect.rs,util.rs,codegen_builtins.rs,codegen_native.rs,codegen_sys_ops.rs,codegen_accessors.rs. The newlib.rsis just 65 lines of thin entry points.NativeFnDef(e.g.,receiver: Option<(String, String, bool, bool)>) with named structs (ReceiverInfo,ParamInfo,ReturnInfo) — making all destructuring sites self-documenting.CollectedBuiltinsstruct with afrom_modules()constructor, eliminating the duplicated 15-line parse+collect block across all 4 proc macros.AccessorTypeDefcollection into the shared walker (CollectContext), removing the separatecollect_accessor_types/collect_accessor_types_from_modulefunctions.sys_op_ref_type_ident(callers now usepath_to_rust_ident+ HashMap lookup) andsys_op_receiver_name(callers now useto_snake_case).Optionwrapper fromBuiltinFieldDef.ty(it was alwaysSome).builtin_typesmodule inbex_heapfrom thewith_builtins!DSL (from prior work, included in this branch).Test plan
cargo checkpasses with zero warningscargo test --libpasses (78+ tests, 0 failures)cargo clippypasses (enforced by pre-commit hook)cargo fmtpasses (enforced by pre-commit hook)Made with Cursor
Summary by CodeRabbit
New Features
Chores