-
Notifications
You must be signed in to change notification settings - Fork 35
Make rebalancer prefer buckets without rw refs #633
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Make rebalancer prefer buckets without rw refs #633
Conversation
Currently one part of the bucket-related code is at the top of file, another one - in the middle. The current commit is called upon to solve this "problem". The commit doesn`t make any changes to the code, just moves it. The diff is ugly, since the default git algorithm matches poorly here, use `git diff --patience` instead. The main motivation is to move the ref-related code before bucket_transfer_start/end(), which will use refs in the upcoming commits. The `check_is_master()` function is moved to the helper section, where it belongs, since it's used by the ref code. Needed for tarantool#351 NO_DOC=refactoring NO_TEST=refactoring
efe8655 to
15c2a62
Compare
And a bunch of other bugs. Seems like a working solution now, ready for review |
a84098a to
5ca27b9
Compare
This commit adds new functions, which are meant to be used instead of the bucket_transfer_start/end() when sending a bucket. This is needed for tarantool#351, where rebalancer will start to prefer the buckets without refs. For that to work we need to mark the bucket as `transfering` and set `rw_lock` far earlier, than `bucket_send` happens(). The first one is required in order to simplify the process of picking buckets and helps not to pick the same one twice. The second one is essential, since we don't want new refs to happen, when we picked the bucket for sending. The explicit check for being master is dropped from the `bucket_send()` function, since it was never needed there, the check is done in the `bucket_refrw_touch()` anyway. The commit also encapsulates the pcalling of `bucket_send_xc` to separate function: `bucket_send_perform()`, which will also be needed for applier of routes to send a bucket. Part of tarantool#351 NO_DOC=refactoring NO_TEST=refactoring
The `rebalancer_in_work` test case enabled the rebalancer even when it was already enabled. Moreover, at the end of the test the rebalancer is disabled, consequently it doesn't work for the subsequent tests, which is strange for the test, which exists exclusively to test the rebalancer. The rebalancer itself isn't needed for existing tests, but it will be required for the tests, added in the next commit, so let's drop enabling/disabling the rebalancer from the test. Needed for tarantool#351 NO_DOC=test
Before this commit the rebalancer picked the first encountered bucket and waited for number of rw refs on it to become 0. This doesn't work reliably, when cluster has very long RW requests, which block bucket rebalancing for hours: the instance picks the same buckets and timeouts, while waiting for no refs. In order to fix that let's prefer buckets without refs. If there's not enough such buckets, take the remainings from the first available buckets, as it worked before. From now on rebalancer routes applier knows all of the buckets, which are going to be sent before these sends start, which is required for fixing tarantool#573 and tarantool#333. Needed for tarantool#333 Needed for tarantool#573 Closes tarantool#351 NO_DOC=bugfix
After the previuos commit the test became pretty flaky and fails with error "Duplicate key exists in unique index 'pk' in space '_bucket'", when trying to force create buckets. Let's properly wait for all buckets to be garbage collected and not just one, as it was before. Follow-up tarantool#351 NO_DOC=test
5ca27b9 to
744a3fe
Compare
Gerold103
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! Yeah, it is nice to get this debt fixed 😁💪.
| exception_guard.ref.rw_lock = false | ||
| end | ||
| bucket_transfer_end(bucket_id) | ||
| local function bucket_send_perform(bucket_id, destination, opts) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you even need bucket_send_perform() now? How about inlining it in bucket_send()? It isn't too big anyway.
| local function bucket_batch_send_prepare(routes) | ||
| local ok, err | ||
| local prepared_buckets = {} | ||
| for _, bids in pairs(routes) do | ||
| for _, bid in ipairs(bids) do | ||
| ok, err = bucket_send_prepare(bid) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This might be problematic. This function will rw-lock potentially unlimited number of buckets. Which is going to be in conflict with the reason why we didn't ever care about allowing rw-requests during sending - because it is usually fast and affects one bucket at a time.
Imagine you need to send 1kk buckets out of 30kk. It is going to be a lot of buckets locked in each replicaset. These buckets might be not having RW refs right now, but might easily start receiving RW requests in a second.
Instead of building the routes so detailed and preparing all the buckets in advance I would recommend to think about an approach with limited batches. So the dispenser still operates on bucket counts. But each worker will independently try to pick and send max N buckets (preferring with no refs) at a time. Like 10 at once. Better make it configurable, I guess.
mrForza
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for the high-quality patch! 🔥 Left some comments below
| return false | ||
| end | ||
| -- Skip buckets, which has already being picked. | ||
| if picked_buckets[bucket_id] then |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did I get it right that we should check picked_buckets[bucket_id] == true because some already picked buckets may be transferred to another replicaset during rebalancer_build_routes_with_buckets?
E.g. We picked all needed bucket from rs1. Then when we go to rs3, some picked buckets from rs1 can arrive here? Maybe we should add a test that checks this behavior?
| end | ||
| -- Skip buckets, which has RW refs. | ||
| local ref = bucket_refrw_touch(bucket_id) | ||
| if not ref then |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can be simplified:
return ref and ref.rw == 0| end | ||
| }) | ||
| for _, bid in ipairs(bucket_ids) do | ||
| assert(not picked_buckets[bid]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why we need this assertion here? picked_buckets can't consists of buckets which has been picked before because we have a necessary check on 2603 line in filter_func.
| end | ||
|
|
||
| local function recovery_bucket_stat(bid) | ||
| local ok, err = check_is_master() | ||
| local function bucket_send_prepare(bid) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: in commit message - transfering. Double 'r' needed
Disclaimer
Closes #351