[LANG-1802] Optimize CharRange.hashCode() to reduce collision rate and improve calculation efficiency #1524
IcoreE wants to merge 4 commits into apache:master
…d improve calculation efficiency
Hello @IcoreE
…cts.hash version and bitwise operation version
Hello @garydgregory, in the PR I have provided the unit test code: CharRangeHashCodeTest.java
# Conflicts: # src/main/java/org/apache/commons/lang3/CharRange.java
…ects.hash vs Bitwise)

1. Background
The CharRange class (package: org.apache.commons.lang3) is a core component of the CharSet utility, and its hashCode() method is invoked whenever instances are stored in hash-based collections such as HashMap or HashSet. The current hashCode() implementation, a linear combination of the fields, produces frequent hash collisions and is slower to compute than necessary, which can degrade the performance of code that relies on it.
2. Problem Description
The current hashCode() implementation has two critical issues (tracked as LANG-1802):
These two logically distinct instances (equals() returns false) have the same hash code, so they land in the same HashMap bucket and lookups have to fall back on equals() comparisons.
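The kind of collision described above can be reproduced with a small standalone sketch. It does not use the Commons Lang classes; it only models a linear-combination hash of the form `83 + start + 7 * end + (negated ? 1 : 0)`. That formula, the class name `LinearHashCollisionDemo`, and the method name `linearHash` are assumptions made for illustration.

```java
// Sketch only: models a linear-combination hash over (start, end, negated).
// The constants 83 and 7 are an assumption about the existing formula.
final class LinearHashCollisionDemo {

    // Hypothetical stand-in for a linear-combination hashCode().
    static int linearHash(char start, char end, boolean negated) {
        return 83 + start + 7 * end + (negated ? 1 : 0);
    }

    public static void main(String[] args) {
        // Raising start by 7 and lowering end by 1 leaves start + 7 * end unchanged.
        int aToZ = linearHash('a', 'z', false);
        int hToY = linearHash('h', 'y', false);
        System.out.println("[a-z] vs [h-y] collide: " + (aToZ == hToY));

        // The negated flag adds 1, the same as raising start by one character.
        int notAToZ = linearHash('a', 'z', true);
        int bToZ = linearHash('b', 'z', false);
        System.out.println("[^a-z] vs [b-z] collide: " + (notAToZ == bToZ));
    }
}
```

Under this model, any two ranges whose `start + 7 * end` sums match share a hash code, which is why nearby ranges collide so easily.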
3. Proposed Solution
Replace the current linear combination implementation with a bitwise splicing + XOR flag scheme, which maximizes the use of 32-bit int space, minimizes collision rate, and optimizes calculation efficiency:
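The PR's own code is not reproduced in this text, so the following is only a minimal sketch of what a bitwise splicing + XOR flag scheme could look like. The class name `PackedHashSketch`, the method name `packedHash`, and the choice of XOR mask are all assumptions, not the PR's actual implementation.

```java
// Sketch only: one possible "splice + XOR flag" hash for (start, end, negated).
final class PackedHashSketch {

    // Hypothetical mask: flips every bit when the range is negated.
    private static final int NEGATED_MASK = 0xFFFFFFFF;

    static int packedHash(char start, char end, boolean negated) {
        // Splice the two 16-bit chars into one int: start in the high half, end in the low half.
        int packed = (start << 16) | end;
        // XOR in the flag; 33 bits of state cannot map to 32 bits without some collisions.
        return negated ? packed ^ NEGATED_MASK : packed;
    }

    public static void main(String[] args) {
        System.out.println(Integer.toHexString(packedHash('a', 'z', false)));
        System.out.println(packedHash('a', 'z', false) == packedHash('h', 'y', false));
        System.out.println(packedHash('a', 'z', true) == packedHash('b', 'z', false));
    }
}
```

In this sketch, two non-negated ranges with different bounds always receive different values, because the spliced int encodes both chars exactly. The cost is one shift, one OR, and at most one XOR, with no multiplication.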