[fix] Bookie Info lost by notification race condition. #20642 #4646
gaozhangmin wants to merge 1 commit into apache:master
Can you share how you tracked down this problem, or is this simply a port of the corresponding logic from Pulsar?
@StevenLuMT
Pull Request Overview
This PR addresses a race condition in BookKeeper's ZKRegistrationClient where bookie information could be lost due to notification timing issues. The fix separates the cache for writable and read-only bookies and ensures sequential processing of ZooKeeper events to prevent race conditions.
- Splits the single `bookieServiceInfoCache` into separate caches for writable and read-only bookies
- Introduces sequential processing of ZooKeeper notifications using a `Sequencer` utility class
- Adds comprehensive test coverage to verify the fix handles network delays during bookie transitions
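The sequential-processing idea in the list above can be illustrated with a small sketch. This is not the `Sequencer` added to `FutureUtils.java` in this PR; the class name `SequencerSketch`, the method name `sequential`, and the "writable"/"readonly" labels are hypothetical and chosen only for illustration. It assumes each notification handler returns a `CompletableFuture`, and chains every new handler onto the previous one so that a slow earlier handler cannot be overtaken by a faster later one.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.function.Supplier;

// Hypothetical sketch, not the PR's implementation.
class SequencerSketch {

    /** Runs asynchronous tasks one at a time, in submission order. */
    static final class Sequencer {
        // Completes when the most recently submitted task has finished.
        private CompletableFuture<Void> tail = CompletableFuture.completedFuture(null);

        synchronized <T> CompletableFuture<T> sequential(Supplier<CompletableFuture<T>> task) {
            // Start the new task only after the previous one finished,
            // regardless of whether it succeeded or failed.
            CompletableFuture<T> result = tail
                    .handle((ignored, error) -> null)
                    .thenCompose(ignored -> task.get());
            // Swallow the outcome so one failed task does not block later ones.
            tail = result.<Void>handle((value, error) -> null);
            return result;
        }
    }

    public static void main(String[] args) {
        Sequencer sequencer = new Sequencer();
        List<String> applied = Collections.synchronizedList(new ArrayList<>());

        // Simulates a slow handler: its work is held back until slowFetch completes.
        CompletableFuture<Void> slowFetch = new CompletableFuture<>();
        CompletableFuture<Void> first =
                sequencer.sequential(() -> slowFetch.thenRun(() -> applied.add("writable")));

        // A later, fast handler; it must still wait for the first one.
        CompletableFuture<Void> second = sequencer.sequential(() -> {
            applied.add("readonly");
            return CompletableFuture.<Void>completedFuture(null);
        });

        slowFetch.complete(null);
        CompletableFuture.allOf(first, second).join();
        System.out.println(applied); // prints [writable, readonly]
    }
}
```

Without the chaining, the second handler would run immediately and its update would be applied before the delayed first one, which is the kind of reordering the PR overview describes.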
Reviewed Changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| ZKRegistrationClient.java | Core fix implementing separate caches and sequential event processing |
| FutureUtils.java | Adds Sequencer utility class for sequential task execution |
| ZKRegistrationClientTest.java | New test to verify race condition fix with network delay simulation |
| FaultInjectableZKRegistrationManager.java | Test utility for fault injection during registration operations |
rerun failure checks
@gaozhangmin Is this different from the fix in #4481?
It's a different case.
27b9b85 to 634bf4f
634bf4f to 40e60d5
rerun failure checks
Following Pulsar's fix in apache/pulsar#20642:
the same race condition also exists in BookKeeper's ZKRegistrationClient.java.