Conversation
📝 WalkthroughWalkthroughImplements intronic-aware normalization in Mapper (new private normalize_intronic used by maybe_normalize), adds Normalizer::normalize_bounded to normalize with explicit boundaries, and updates DataError formatting to include the inner error message; also bumps rust toolchain in Cargo.toml. Changes
Sequence Diagram(s)sequenceDiagram
participant Client
participant Mapper
participant Genomic as GenomicVariant
participant Normalizer as Normalizer<br/>(bounded)
participant Transcript as TranscriptVariant
Client->>Mapper: call maybe_normalize(var, normalize=true)
alt variant spans intron
Mapper->>Mapper: normalize_intronic(var)
Mapper->>Genomic: convert transcript -> genomic variant
Mapper->>Genomic: determine exon/strand and intron bounds
Mapper->>Normalizer: normalize_bounded(genomic_var, bounds)
Normalizer->>Normalizer: check_and_guard -> normalize_alleles(boundary)
Normalizer-->>Mapper: normalized genomic variant
Mapper->>Transcript: map genomic result -> transcript (g_to_c/g_to_n)
Transcript-->>Client: normalized transcript variant
else non-intronic or fallback
Mapper->>Normalizer: normalize(var) (standard path)
Normalizer-->>Client: normalized or original variant
end
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 3✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #257 +/- ##
==========================================
- Coverage 90.38% 89.81% -0.57%
==========================================
Files 18 18
Lines 9606 9527 -79
==========================================
- Hits 8682 8557 -125
- Misses 924 970 +46
🚀 New features to boost your workflow:
|
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@src/mapper/assembly.rs`:
- Around line 330-343: The exons returned by self.provider.get_tx_exons must be
explicitly sorted by alt_start_i before using them to derive strand/intron
bounds; modify the handling of the exons vector (the local exons variable) so
that right after retrieval and before the strand = exons.first() call you sort
the collection by exon.alt_start_i (ascending), then proceed to check for empty
and read .first(); this ensures downstream logic that computes intron boundaries
and uses alt_strand relies on properly ordered exons from both UTA and cdot/json
providers.
🧹 Nitpick comments (2)
src/mapper/assembly.rs (2)
345-362: Boundary defaults to0..i32::MAXif variant doesn't fall within any intron gap.If the loop at lines 353-362 doesn't find an intron containing the variant (e.g., if the variant spans an exon-intron boundary or the exon data is incomplete),
boundaryremains0..i32::MAX. This would allow the variant to shuffle without restriction, which may not be the intended behavior for intronic normalization.Consider returning an error or logging when no appropriate intron boundary is found:
♻️ Suggested improvement for boundary validation
// find intron "containing" the variant to set boundaries let mut boundary = 0..i32::MAX; + let mut found_intron = false; // assumes exons are sorted (by alt_start_i) for i in 0..exons.len().saturating_sub(1) { let prev_exon_end = exons[i].alt_end_i; let next_exon_start = exons[i + 1].alt_start_i; // check if variant is roughly within this gap if var_start >= prev_exon_end && var_end <= next_exon_start { boundary = prev_exon_end..next_exon_start; + found_intron = true; break; } } + if !found_intron { + tracing::debug!( + "Could not find intron boundary for variant {} (range {}..{})", + var, var_start, var_end + ); + }
346-346: Consider using a more specific error instead ofError::General.
Error::Generalprovides minimal context. A more descriptive error would aid debugging when the genomic variant lacks a location range.♻️ Suggested improvement
- let var_range = var_g.loc_range().ok_or(Error::General)?; + let var_range = var_g + .loc_range() + .ok_or_else(|| Error::MissingGenomeIntervalPosition(format!("{}", var_g)))?;
… via roundtrip to genomic loc and back
Summary by CodeRabbit
New Features
Improvements
Documentation
✏️ Tip: You can customize this high-level summary in your review settings.