mvcc: fix watch compaction race causing double notifications #21189
Aman-Cool wants to merge 1 commit into etcd-io:main
The choose() function conditionally returned the original unsynced group when len(watchers) < maxWatchers. This caused chooseAll() to modify the original group during iteration, leading to:
- Potential panic from double-delete of watchers
- Race condition causing duplicate compaction notifications

Fix by always returning a copy in choose() and properly handling compacted watcher cleanup in syncWatchers().

Signed-off-by: Aman-Cool <aman017102007@gmail.com>
9e8da9d to eca4fac
@serathius @ahrtr, this PR fixes a watch compaction bug where ...
This PR documents and fixes a correctness issue in the watch compaction path.
Issue
When the number of unsynced watchers was below the sync limit, choose() returned the original watcher group instead of a copy. During compaction, this caused the watcher map to be modified while it was being iterated, which could lead to:
- a panic with the message "removing missing watcher!"

This can occur during normal operation with auto-compaction enabled.
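The aliasing described above can be illustrated with a minimal Go sketch. This is not the etcd code: the `watcher` and `group` types and the `chooseAliased` and `dropCompacted` helpers are hypothetical stand-ins, kept only detailed enough to show how returning the original group lets a later pass mutate the caller's state.

```go
package main

import "fmt"

// watcher and group are simplified, hypothetical stand-ins for the mvcc types.
type watcher struct {
	id     int
	minRev int64
}

type group map[*watcher]struct{}

// chooseAliased mimics the pre-fix behavior described above: below the limit
// it hands back the caller's own group instead of a copy.
func chooseAliased(g group, limit int) group {
	if len(g) < limit {
		return g // same underlying map as g
	}
	out := group{}
	for w := range g {
		if len(out) == limit {
			break
		}
		out[w] = struct{}{}
	}
	return out
}

// dropCompacted deletes every watcher whose minRev is below compactRev from
// the group it is given and reports how many it removed.
func dropCompacted(g group, compactRev int64) int {
	removed := 0
	for w := range g {
		if w.minRev < compactRev {
			delete(g, w)
			removed++
		}
	}
	return removed
}

func main() {
	unsynced := group{
		&watcher{id: 1, minRev: 2}: {},
		&watcher{id: 2, minRev: 9}: {},
	}
	chosen := chooseAliased(unsynced, 10)
	dropCompacted(chosen, 5)
	// Both lengths shrink together because chosen and unsynced are the same map.
	fmt.Println(len(chosen), len(unsynced))
}
```

Any later cleanup step that expects the compacted watcher to still be present in the unsynced group would now find it missing, which is the double-delete scenario the commit message refers to.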
Fix
- choose() now always returns a copy of the watcher group
- chooseAll() no longer deletes watchers while iterating
- Compacted watcher cleanup is handled in syncWatchers()

Result
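The shape of the fix can be sketched in Go as follows. This is an illustration under stated assumptions, not the patch itself: `chooseCopy`, `markCompacted`, and `syncOnce` are hypothetical names, and the real code also sends a compaction response to each watcher, which is omitted here.

```go
package main

import "fmt"

// watcher and group are simplified, hypothetical stand-ins for the mvcc types.
type watcher struct {
	id        int
	minRev    int64
	compacted bool
}

type group map[*watcher]struct{}

// chooseCopy always returns a fresh group holding at most limit watchers,
// so later passes never touch the caller's map.
func chooseCopy(g group, limit int) group {
	out := make(group, len(g))
	for w := range g {
		if len(out) >= limit {
			break
		}
		out[w] = struct{}{}
	}
	return out
}

// markCompacted flags watchers below compactRev and returns them.
// It deliberately deletes nothing while iterating.
func markCompacted(chosen group, compactRev int64) []*watcher {
	var stale []*watcher
	for w := range chosen {
		if w.minRev < compactRev {
			w.compacted = true
			stale = append(stale, w)
		}
	}
	return stale
}

// syncOnce is the single place that removes compacted watchers from the
// unsynced group. It returns how many watchers were removed.
func syncOnce(unsynced group, limit int, compactRev int64) int {
	chosen := chooseCopy(unsynced, limit)
	stale := markCompacted(chosen, compactRev)
	for _, w := range stale {
		delete(unsynced, w)
	}
	return len(stale)
}

func main() {
	unsynced := group{
		&watcher{id: 1, minRev: 2}: {},
		&watcher{id: 2, minRev: 9}: {},
	}
	fmt.Println(syncOnce(unsynced, 10, 5), len(unsynced)) // first pass removes the compacted watcher
	fmt.Println(syncOnce(unsynced, 10, 5), len(unsynced)) // second pass finds nothing left to remove
}
```

Keeping removal in one place means a compacted watcher leaves the unsynced group exactly once, so a second sync pass cannot notify or delete it again.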
The fix eliminates panics and silent watch failures while preserving existing behavior.
All relevant MVCC tests pass, including compaction-related watch tests.