Thai combining marks cause width miscalculation in modern terminals (Kitty, Ghostty) #3957

@noomzopendream

Problem

Thai combining marks (vowels and tone marks) are marked as zero-width in CELL_WIDTHS, following Unicode's Mn (Nonspacing Mark) classification. However, modern terminal emulators (tested: Kitty, Ghostty) render Thai text with these characters taking visible width, causing table columns and other layouts to misalign.

Reproduction

from rich.console import Console
from rich.table import Table
from rich.cells import cell_len

console = Console()
table = Table(title='Thai Text Width Test')
table.add_column('Text', width=10)
table.add_column('cell_len')

for text in ['สวัสดี', 'น้ำ', 'กรุงเทพ']:
    table.add_row(text, str(cell_len(text)))

console.print(table)

Width Analysis

| Text | Codepoints | Combining marks | Rich cell_len | Terminal width |
|---|---|---|---|---|
| สวัสดี | 6 | 2 | 4 | 6 |
| น้ำ | 3 | 1 | 2 | 3 |
| กรุงเทพ | 7 | 1 | 6 | 7 |

In every row, the discrepancy between Rich's cell_len and the observed terminal width equals the number of combining marks, which indicates that these terminals render each Thai codepoint as width 1.
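The codepoint and combining-mark counts above can be checked without Rich. The following is a minimal sketch that uses only the standard library's `unicodedata` module. The helper name `analyze` is made up for illustration, and the "zero-width" figure it prints assumes every non-Mn character in the samples occupies one cell, which holds for these three strings but not for text in general.

```python
import unicodedata


def analyze(text):
    """Return (codepoint count, nonspacing-mark count) for a string."""
    # Category "Mn" is Unicode's Nonspacing Mark class.
    marks = sum(1 for ch in text if unicodedata.category(ch) == "Mn")
    return len(text), marks


for text in ["สวัสดี", "น้ำ", "กรุงเทพ"]:
    total, marks = analyze(text)
    # total - marks: width if Mn marks are counted as zero-width.
    # total: width if every codepoint is counted as one cell.
    print(text, total, marks, total - marks, total)
```

Note that ำ (U+0E33) in น้ำ is category Lo, not Mn, so it is not counted as a combining mark here.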

Affected Ranges in CELL_WIDTHS

| Range | Characters | Unicode Category |
|---|---|---|
| U+0E31 | ั | Mn (mai han-akat vowel sign) |
| U+0E34-U+0E3A | ิ ี ึ ื ุ ู ฺ | Mn (vowel signs and phinthu) |
| U+0E47-U+0E4E | ็ ่ ้ ๊ ๋ ์ ํ ๎ | Mn (tone marks and other signs) |
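These ranges can be derived directly from the Unicode character database rather than read off by hand. The sketch below scans the Thai block (U+0E00 to U+0E7F) with `unicodedata` and merges adjacent Mn codepoints into ranges. The function name `thai_nonspacing_ranges` is hypothetical, and the result reflects the Unicode version bundled with the running Python interpreter, not the data Rich ships.

```python
import unicodedata


def thai_nonspacing_ranges():
    """Collect contiguous runs of Mn codepoints in the Thai block."""
    ranges = []
    for cp in range(0x0E00, 0x0E80):
        if unicodedata.category(chr(cp)) != "Mn":
            continue
        if ranges and ranges[-1][1] == cp - 1:
            # Extend the current run by one codepoint.
            ranges[-1] = (ranges[-1][0], cp)
        else:
            # Start a new run.
            ranges.append((cp, cp))
    return ranges


for start, end in thai_nonspacing_ranges():
    print(f"U+{start:04X}-U+{end:04X}")
```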

Environment

  • Rich: 13.7.1
  • Python: 3.14.2
  • Terminals tested:
    • Kitty 0.45.0 — affected
    • Ghostty 1.2.3 — affected
  • Platform: macOS

Proposed Fix

Change Thai combining mark ranges from width 0 to width 1 in _cell_widths.py. This matches actual terminal rendering behavior for Thai script.
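Until such a change lands, the intended width can be approximated on the caller's side. This is only a sketch under stated assumptions: `adjusted_cell_len`, `is_thai_mark`, and `THAI_MARK_RANGES` are hypothetical names, not part of Rich, and the helper assumes the measuring function passed in counts the Thai marks listed above as zero-width. If that function already counts them as one cell, the result would be too large.

```python
from typing import Callable

# Ranges copied from the "Affected Ranges" table in this issue.
THAI_MARK_RANGES = ((0x0E31, 0x0E31), (0x0E34, 0x0E3A), (0x0E47, 0x0E4E))


def is_thai_mark(ch: str) -> bool:
    """True if ch falls in one of the Thai nonspacing-mark ranges."""
    cp = ord(ch)
    return any(lo <= cp <= hi for lo, hi in THAI_MARK_RANGES)


def adjusted_cell_len(text: str, base_len: Callable[[str], int]) -> int:
    """Hypothetical helper: add one cell per Thai mark to base_len(text).

    Assumes base_len treats those marks as zero-width.
    """
    return base_len(text) + sum(1 for ch in text if is_thai_mark(ch))
```

With Rich installed, passing `rich.cells.cell_len` as `base_len` should reproduce the "Terminal width" column for the sample strings, provided the installed version still reports these marks as zero-width. This only helps when measuring strings manually; it does not change how `Table` computes column widths internally.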
