@mattsu2020
Contributor

@mattsu2020 mattsu2020 commented Feb 4, 2026

Summary

This PR enhances the unexpand utility to improve compatibility with GNU coreutils and ensure more predictable behavior.

Motivation

While testing unexpand, I identified discrepancies in behavior compared to GNU coreutils, particularly in handling whitespace and tab conversion. This change aims to reduce those differences and improve overall reliability.
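For readers unfamiliar with the conversion being discussed, here is a minimal sketch of the basic idea: leading blanks are replaced by tabs wherever they reach a tab stop. The function name `unexpand_leading` and its signature are hypothetical and not taken from this PR. It assumes fixed, non-zero tab stops and handles only leading blanks, so it does not claim to reproduce every GNU edge case.

```rust
/// Hypothetical helper (not the PR's code): convert leading spaces to tabs.
/// Assumes `tabsize > 0` and evenly spaced tab stops.
fn unexpand_leading(line: &str, tabsize: usize) -> String {
    let mut out = String::new();
    let mut col = 0usize; // column reached so far
    let mut pending = 0usize; // spaces seen since the last emitted tab
    let mut rest_start = line.len(); // byte offset of the first non-blank

    for (i, c) in line.char_indices() {
        match c {
            ' ' => {
                col += 1;
                pending += 1;
                if col % tabsize == 0 {
                    // The run of spaces reached a tab stop: emit one tab instead.
                    out.push('\t');
                    pending = 0;
                }
            }
            '\t' => {
                // An existing tab jumps to the next stop and absorbs pending spaces.
                col += tabsize - col % tabsize;
                out.push('\t');
                pending = 0;
            }
            _ => {
                rest_start = i;
                break;
            }
        }
    }

    // Spaces that did not reach a tab stop are kept as spaces.
    out.extend(std::iter::repeat(' ').take(pending));
    out.push_str(&line[rest_start..]);
    out
}
```

With a tab size of 8, eight leading spaces collapse into one tab, while four leading spaces are left alone because they never reach a tab stop.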

Changes

Refactored the unexpand logic to align more closely with GNU coreutils behavior.

Improved error handling by returning Result types instead of panicking, enhancing robustness.

Added and updated tests to cover the modified behavior and prevent regressions.
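As an illustration of the error-handling change listed above, the following sketch shows the general pattern of returning a `Result` where code might otherwise panic. The PR text does not say which functions were changed, so `parse_tabstops` and `TabError` are hypothetical names chosen for this example, and tab-stop parsing is only an assumed use case.

```rust
/// Hypothetical error type for this sketch (not from the PR).
#[derive(Debug, PartialEq)]
enum TabError {
    Invalid(String),
    Zero,
    NotAscending,
}

/// Parse a comma-separated tab-stop list such as "4,8,12".
/// Returns an error instead of panicking on bad input.
fn parse_tabstops(s: &str) -> Result<Vec<usize>, TabError> {
    let mut stops: Vec<usize> = Vec::new();
    for word in s.split(',') {
        // `parse` already returns a Result; map its error rather than unwrapping.
        let n: usize = word
            .trim()
            .parse()
            .map_err(|_| TabError::Invalid(word.to_string()))?;
        if n == 0 {
            return Err(TabError::Zero);
        }
        if let Some(&last) = stops.last() {
            if n <= last {
                return Err(TabError::NotAscending);
            }
        }
        stops.push(n);
    }
    Ok(stops)
}
```

A caller can then decide how to report the failure, for example by printing a message and exiting with a non-zero status, instead of the process aborting inside the parser.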

Compatibility

The goal of this change is to match GNU coreutils behavior as closely as possible without introducing breaking changes.

Testing

Added unit and integration tests.

Verified behavior against GNU coreutils.

Confirmed no regressions in existing tests.

Related

#10698

Simplified conditional expressions using inline `if`/`else` expressions for better readability, and improved code formatting by reorganizing multi-line function calls with proper indentation.
…d security

Updated multiple dependency versions across the project including:
- cc from 1.2.52 to 1.2.55
- find-msvc-tools from 0.1.7 to 0.1.9
- flate2 from 1.1.8 to 1.1.9
- iana-time-zone from 0.1.64 to 0.1.65
- libm from 0.2.15 to 0.2.16
- notify-types from 2.0.0 to 2.1.0
- portable-atomic from 1.13.0 to 1.13.1
- portable-atomic-util from 0.2.4 to 0.2.5
- regex-automata from 0.4.13 to 0.4.14
- regex-lite from 0.1.8 to 0.1.9
- windows-sys from 0.59.0 to 0.61.2
- zerocopy from 0.8.33 to 0.8.38

These updates provide bug fixes, performance improvements, and security enhancements.

Simplify the conditional return statement in `utf8_incomplete_tail` using `usize::from()` for cleaner code. Also restructure the control flow in `unexpand_chunk` to eliminate an unnecessary `else` block and improve readability by moving the write operation outside the conditional.
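The `usize::from()` simplification mentioned above relies on the standard `From<bool>` conversion, which maps `false` to 0 and `true` to 1. The sketch below shows that idiom next to one possible way to measure an incomplete UTF-8 sequence at the end of a byte chunk. The actual `utf8_incomplete_tail` is not shown in this conversation, so both functions here (`lone_lead_at_end` and `incomplete_tail_len`) are hypothetical stand-ins with assumed signatures.

```rust
/// Hypothetical: 1 if the chunk ends in a byte with a multi-byte lead
/// pattern (which cannot be complete on its own), otherwise 0.
/// `usize::from(bool)` replaces `if cond { 1 } else { 0 }`.
fn lone_lead_at_end(buf: &[u8]) -> usize {
    usize::from(matches!(buf.last(), Some(&b) if (0xC0..=0xF7).contains(&b)))
}

/// Hypothetical: number of trailing bytes that start a UTF-8 sequence
/// but do not finish it. Returns 0 if the chunk ends cleanly.
fn incomplete_tail_len(buf: &[u8]) -> usize {
    let n = buf.len();
    // A UTF-8 sequence is at most 4 bytes, so look back at most 4.
    for back in 1..=n.min(4) {
        let b = buf[n - back];
        if b & 0b1100_0000 == 0b1000_0000 {
            continue; // continuation byte: keep searching for the lead byte
        }
        let need = match b {
            0x00..=0x7F => 1,
            0xC0..=0xDF => 2,
            0xE0..=0xEF => 3,
            0xF0..=0xF7 => 4,
            _ => return 0, // not a valid lead byte; nothing to carry over
        };
        // Incomplete only if the sequence needs more bytes than are present.
        return if need > back { back } else { 0 };
    }
    0
}
```

A chunked reader could hold back the reported number of bytes and prepend them to the next chunk, so that a multi-byte character split across a buffer boundary is processed as a whole.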
…acter

The unexpand utility was incorrectly skipping tab conversion after encountering an initial non-space character, even when the current column position matched the starting column. This caused inconsistent behavior where blanks were not converted to tabs as expected. The fix adds a check for `state.col == state.scol` so that conversion is skipped only when both conditions are met: the initial non-space character is found AND we're at the
@codspeed-hq

codspeed-hq bot commented Feb 4, 2026

CodSpeed Performance Report

Merging this PR will improve performance by ×4.5

Comparing mattsu2020:unexpand (e0d9e19) with main (c086d43)

Summary

⚡ 3 improved benchmarks
✅ 281 untouched benchmarks
⏩ 38 skipped benchmarks (see footnote 1)

Performance Changes

| Mode | Benchmark | BASE | HEAD | Efficiency |
| --- | --- | --- | --- | --- |
| Simulation | unexpand_large_file[10] | 548.5 ms | 122.8 ms | ×4.5 |
| Simulation | unexpand_many_lines[100000] | 261.6 ms | 58.7 ms | ×4.5 |
| Memory | unexpand_many_lines[100000] | 64.9 KB | 56.8 KB | +14.24% |

Footnotes

  1. 38 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, click here and archive them to remove them from the performance reports.

@github-actions

github-actions bot commented Feb 4, 2026

GNU testsuite comparison:

Skipping an intermittent issue tests/tail/inotify-dir-recreate (passes in this run but fails in the 'main' branch)

@sylvestre
Contributor

One more time: please split this into several PRs.
