Fix text mapping for special characters #11390
Open
etvorun wants to merge 2 commits into dotnet:main
Pull request overview
Fixes a regression from the combining-mark script check introduced in #6857 that incorrectly split certain emoji/keycap sequences (e.g. 1️⃣), leading to crashes during line breaking/font fallback.
Changes:
- Adds a “script-agnostic combining” classification and an IsSameScript helper for script comparisons.
- Updates font mapping logic to keep script-agnostic marks with their base character while preserving the cross-script combining fallback behavior from #6857.
- Updates DirectWriteForwarder itemization/character attribute analysis to incorporate the new script comparison behavior.
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| src/Microsoft.DotNet.Wpf/src/PresentationCore/MS/internal/FontFace/PhysicalFontFamily.cs | Adjusts font mapping logic to keep script-agnostic combining marks with the base character (emoji/keycap sequences). |
| src/Microsoft.DotNet.Wpf/src/PresentationCore/MS/internal/Classification.cs | Adds script-agnostic combining detection and an IsSameScript helper; updates classification API surface. |
| src/Microsoft.DotNet.Wpf/src/DirectWriteForwarder/CPP/DWriteWrapper/TextAnalyzer.cpp | Adds combining-mark script comparison behavior during analysis and plumbs new classification info. |
| src/Microsoft.DotNet.Wpf/src/DirectWriteForwarder/CPP/DWriteWrapper/IClassification.h | Extends the classification interface with a new out parameter and IsSameScript. |
lindexi reviewed Jan 27, 2026
Fixes #11386
Problem
PR #6857 introduced a script comparison check to prevent combining marks from different scripts from staying with their base character during font fallback. While this fixed a legitimate issue, it inadvertently broke emoji sequences because:
Emoji keycap sequences like "1️⃣" consist of:
- U+0031 (digit one): Common/Digit
- U+FE0F (variation selector-16): Inherited
- U+20E3 (combining enclosing keycap): Common/Symbol
The script check saw these as different scripts and broke the combining relationship. The text itemizer then split the sequence incorrectly, which led to a crash in Line Services.
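To illustrate the failure mode, here is a minimal C++ sketch of a strict script-equality rule applied to the keycap sequence. It is not the WPF code: the `Script` enum, the `CodePoint` struct, `NaiveStaysWithBase`, and `CountRuns` are hypothetical names chosen for this example, and the script labels are simplified from the description above.

```cpp
#include <cstddef>

// Simplified script labels taken from the PR description (not the WPF enum).
enum class Script { CommonDigit, Inherited, CommonSymbol };

struct CodePoint {
    char32_t value;
    Script script;
};

// Keycap sequence for digit one: U+0031, U+FE0F, U+20E3.
constexpr CodePoint kKeycapOne[] = {
    {0x0031, Script::CommonDigit},   // digit one
    {0xFE0F, Script::Inherited},     // variation selector-16
    {0x20E3, Script::CommonSymbol},  // combining enclosing keycap
};

// Hypothetical strict rule: a following character stays with its base
// only when both carry exactly the same script label.
constexpr bool NaiveStaysWithBase(Script base, Script mark) {
    return base == mark;
}

// Counts the runs produced when every failed check starts a new run,
// with the failing character becoming the base of that new run.
constexpr std::size_t CountRuns(const CodePoint* text, std::size_t length) {
    if (length == 0) return 0;
    std::size_t runs = 1;
    std::size_t base = 0;
    for (std::size_t i = 1; i < length; ++i) {
        if (!NaiveStaysWithBase(text[base].script, text[i].script)) {
            ++runs;
            base = i;
        }
    }
    return runs;
}

// Under the strict rule the single keycap cluster is cut into three runs.
static_assert(CountRuns(kKeycapOne, 3) == 3, "keycap sequence is split");
```

A shaping or line-breaking stage that expects the three code points to arrive as one cluster would receive three separate runs under this rule, which is the kind of incorrect split the description refers to.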
Solution
Introduced the concept of script-agnostic combining marks - characters that are designed to modify any base character regardless of script. These include:
The fix ensures these script-agnostic marks always stay with their base character, while the original PR #6857 script check still applies to regular combining marks.
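The decision described here can be sketched in C++ as follows. This is an illustration under stated assumptions, not the code from the PR: the names `IsScriptAgnosticCombining` and `IsSameScript` appear in the PR, but their signatures here are invented, `StaysWithBase` is a hypothetical helper, and the set of script-agnostic marks is assumed to contain only the two marks from the keycap example (U+FE0F and U+20E3).

```cpp
// Simplified script labels for illustration (not the WPF enum).
enum class Script { Latin, Cyrillic, CommonDigit, Inherited, CommonSymbol };

// Assumption: only the two marks from the keycap example are listed here.
// The actual set used by the PR is not reproduced in this sketch.
constexpr bool IsScriptAgnosticCombining(char32_t cp) {
    return cp == 0xFE0F   // variation selector-16
        || cp == 0x20E3;  // combining enclosing keycap
}

// Assumed signature: plain equality of the simplified labels.
constexpr bool IsSameScript(Script a, Script b) {
    return a == b;
}

// Hypothetical helper: decides whether a combining mark stays in the same
// run as its base character.
constexpr bool StaysWithBase(Script baseScript, char32_t mark, Script markScript) {
    // Script-agnostic marks skip the script comparison entirely.
    if (IsScriptAgnosticCombining(mark)) {
        return true;
    }
    // All other marks still go through the script check from #6857.
    return IsSameScript(baseScript, markScript);
}

// The keycap marks stay with the digit even though the labels differ.
static_assert(StaysWithBase(Script::CommonDigit, 0xFE0F, Script::Inherited), "VS16 stays");
static_assert(StaysWithBase(Script::CommonDigit, 0x20E3, Script::CommonSymbol), "keycap stays");
// U+0483 (combining Cyrillic titlo) on a Latin base is still separated.
static_assert(!StaysWithBase(Script::Latin, 0x0483, Script::Cyrillic), "cross-script mark splits");
```

Putting the script-agnostic test first keeps the change additive: marks outside the assumed set take exactly the same path as before.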
Changes
Native Code (DirectWriteForwarder)
- IClassification.h: isExtended out parameter to GetCharAttribute and IsSameScript method
- TextAnalyzer.cpp: isExtended parameter and skip script check for script-agnostic marks
Managed Code (PresentationCore)
- Classification.cs: IsScriptAgnosticCombining() method, updated GetCharAttribute() with isExtended parameter, added IsSameScript() to ClassificationUtility
- PhysicalFontFamily.cs: IsScriptAgnosticCombining for proper emoji sequence handling
Testing
Manual Testing
Test Cases
Risk
Low - The change is additive and only affects the specific case of script-agnostic combining marks. The existing script check from PR #6857 remains in place for all other combining marks.
Related Issues/PRs