Feature request: Binary string diffing support #298

@gschlager

Problem

When comparing binary strings (ASCII-8BIT encoding), super_diff shows unreadable output with escape sequences and control characters. This is common when testing file formats, network protocols, or any binary data.
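
To make the problem concrete, here is a minimal sketch in plain Ruby (no super_diff involved). The two byte strings are invented for illustration and differ in a single byte; `String#inspect` renders the non-printable bytes as `\x..` escapes, which is the kind of text a failure message ends up comparing.

```ruby
# Illustrative data only: two 16-byte fragments resembling a tar magic field.
expected = "ustar  ".b + "\x00".b * 9   # "ustar", two spaces, nine NUL bytes
actual   = "ustar ".b + "\x00".b * 10   # second space replaced by a NUL byte

# String#b returns a copy tagged with the binary (ASCII-8BIT) encoding.
puts expected.encoding == Encoding::ASCII_8BIT

# inspect escapes every non-printable byte, so the one-byte difference
# is buried in a run of escape sequences.
p expected
p actual
```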

Proposed solution

Add built-in support for binary string diffing that:

  1. Detects binary strings - Strings with Encoding::ASCII_8BIT
  2. Displays as hex dump - Shows offset, hex bytes, and printable ASCII:
    00000000: 7573 7461 7220 2000 0000 0000 0000 0000  ustar  .........
    00000010: 0000 0000 0000 0000 0000 0000 0000 0000  ................

  3. Shows context around changes - Only displays a few lines around actual differences (similar to unified diff)
  4. Inspects cleanly - Shows <binary string (512 bytes)> instead of garbled text
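
The four points above can be sketched in plain Ruby as follows. This is not the mini_tarball extension and does not use super_diff's API: the module name `BinaryStringSketch`, its method names, and the `context:` parameter are all hypothetical, chosen only to illustrate the idea.

```ruby
# Hypothetical helper module; names are illustrative, not super_diff's API.
module BinaryStringSketch
  BYTES_PER_LINE = 16

  module_function

  # Point 1: treat a String as binary when it is tagged ASCII-8BIT.
  def binary?(value)
    value.is_a?(String) && value.encoding == Encoding::ASCII_8BIT
  end

  # Point 2: render lines of offset, hex byte pairs, and printable ASCII.
  def hex_dump(data)
    data.bytes.each_slice(BYTES_PER_LINE).each_with_index.map do |chunk, index|
      offset = format("%08x", index * BYTES_PER_LINE)
      hex = chunk.each_slice(2)
                 .map { |pair| pair.map { |byte| format("%02x", byte) }.join }
                 .join(" ")
      ascii = chunk.map { |byte| (0x20..0x7e).cover?(byte) ? byte.chr : "." }.join
      # 39 is the width of a full hex column: 8 groups of 4 digits + 7 spaces.
      "#{offset}: #{hex.ljust(39)}  #{ascii}"
    end
  end

  # Point 3: indexes of the dump lines that differ, plus `context` lines
  # on either side of each difference.
  def lines_to_show(expected_lines, actual_lines, context: 2)
    count = [expected_lines.size, actual_lines.size].max
    changed = (0...count).select { |i| expected_lines[i] != actual_lines[i] }
    changed.flat_map { |i| ((i - context)..(i + context)).to_a }
           .select { |i| i.between?(0, count - 1) }
           .uniq
           .sort
  end

  # Point 4: short placeholder instead of the escaped string contents.
  def inspect_binary(data)
    "<binary string (#{data.bytesize} bytes)>"
  end
end

# Usage: a 32-byte buffer starting with the bytes from the dump in point 2.
header = "ustar  ".b + "\x00".b * 25
puts BinaryStringSketch.hex_dump(header)
puts BinaryStringSketch.inspect_binary(header)
```

Comparing the two dumps line by line keeps the sketch short, but it assumes both strings have the same alignment; an inserted or removed byte shifts every later line, so a real differ would likely need a sequence-based comparison.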

Working implementation

We have a working extension in mini_tarball that we'd be happy to contribute as a PR. It includes:

  • Differs::BinaryString
  • InspectionTreeBuilders::BinaryString
  • OperationTreeBuilders::BinaryString
  • OperationTrees::BinaryString
  • OperationTreeFlatteners::BinaryString

The implementation follows super_diff's architecture and has been working well for us when testing tar archive generation.

Would you be interested in a PR for this feature?
