Fix SourceLink parsing to support both wildcard and exact path mappings#2355
Conversation
- Fixed regex bug that prevented parsing multiple SourceLink mappings - Updated ParseSourceLinkJson to accept both wildcard patterns and exact file paths - Enhanced GetUrlForFilePathUsingSourceLink with two-phase matching (exact then prefix) - Added comprehensive test for both mapping types - Resolves microsoft#2350
Per SourceLink spec rule 2: paths without wildcards should only be used for exact matching, not prefix matching. This change tracks whether each mapping was originally a wildcard pattern and only performs prefix matching for those entries. Previously, a non-wildcard path like 'C:\foo\bar.cs' could incorrectly match as a prefix for 'C:\foo\bar.cs.bak', violating the spec.
|
@ivberg, thank you for submitting this. I am in the process of reviewing this and doing some testing against the SourceLink spec. |
|
@ivberg I pushed an additional commit based on some analysis of the spec. The tests you added pass. Is there any local testing that you might want to do on this before merging? |
marklio
left a comment
There was a problem hiding this comment.
In general, this seems fine. I left a few comments to consider at this time or the future.
| { | ||
| string path = map.Item1; | ||
| string urlReplacement = map.Item2; | ||
| string urlTemplate = map.Item2; |
There was a problem hiding this comment.
SUGGESTION: deconstruct these in the foreach.
|
|
||
| // If no exact match, try prefix matching (wildcard patterns only) | ||
| // Per spec rule 2: only paths that had a wildcard (*) should be used for prefix matching | ||
| foreach (Tuple<string, string, bool> map in _sourceLinkMapping) |
There was a problem hiding this comment.
COMMENT: It's at least a little weird to swing through these twice and operate on disjoint sets. I get there's a simplicity to looking through once for the "exact match" and once for the others. Is this something that can be simplified in the data structure so this code isn't so weird? Some example approaches:
- Two different maps for isWildCard
- Sort for isWildcard first
There was a problem hiding this comment.
I think this is possible, but I'm not sure it's worth it. The set size is usually quite small and so it's likely more complex to the change the data structure to iterate through them one less time.
| /// Supports both wildcard patterns ("path\\*" -> "url/*") and exact path mappings ("path\\file.h" -> "url"). | ||
| /// </summary> | ||
| private List<Tuple<string, string>> ParseSourceLinkJson(IEnumerable<string> sourceLinkContents) | ||
| private List<Tuple<string, string, bool>> ParseSourceLinkJson(IEnumerable<string> sourceLinkContents) |
There was a problem hiding this comment.
QUESTION: Is this a good opportunity to move to ValueTuple (and the built-in syntax for them that supports names?)
|
Thank you for the feedback @marklio. I'm going to merge this one and consider these in the future. |
I did not test. This is semi-automated fix by remote codespace and Claude Sonnet 4.5 fix attempts.