Skip to content

Conversation

@Serpentian
Copy link
Collaborator

@Serpentian Serpentian commented Jan 22, 2026

Disclaimer

  • Dear reviewers, please don't be afraid by the patch size, I'm just moving the code)
  • Part of Doubled buckets #619 RFC

Closes #351

Currently one part of the bucket-related code is at the top of file,
another one - in the middle. The current commit is called upon to solve
this "problem". The commit doesn`t make any changes to the code, just
moves it. The diff is ugly, since the default git algorithm matches
poorly here, use `git diff --patience` instead.

The main motivation is to move the ref-related code before
bucket_transfer_start/end(), which will use refs in the upcoming
commits. The `check_is_master()` function is moved to the helper
section, where it belongs, since it's used by the ref code.

Needed for tarantool#351

NO_DOC=refactoring
NO_TEST=refactoring
@Serpentian Serpentian marked this pull request as draft January 26, 2026 12:43
@Serpentian Serpentian force-pushed the gh-351-reblancer-takes-buckets-with-no-refs branch from efe8655 to 15c2a62 Compare January 26, 2026 14:46
@Serpentian
Copy link
Collaborator Author

Serpentian commented Jan 26, 2026

  • Fixed incorrect error handling and order of checks in the second commit, now all tests pass without changes there
  • Fixed the same bucket choosen twice in the fourth commit, fixed error handling there

And a bunch of other bugs. Seems like a working solution now, ready for review

@Serpentian Serpentian marked this pull request as ready for review January 26, 2026 14:59
@Serpentian Serpentian force-pushed the gh-351-reblancer-takes-buckets-with-no-refs branch 3 times, most recently from a84098a to 5ca27b9 Compare January 27, 2026 17:01
This commit adds new functions, which are meant to be used instead of
the bucket_transfer_start/end() when sending a bucket.

This is needed for tarantool#351, where rebalancer will start to
prefer the buckets without refs. For that to work we need to mark the
bucket as `transfering` and set `rw_lock` far earlier, than
`bucket_send` happens(). The first one is required in order to simplify
the process of picking buckets and helps not to pick the same one twice.
The second one is essential, since we don't want new refs to happen,
when we picked the bucket for sending.

The explicit check for being master is dropped from the `bucket_send()`
function, since it was never needed there, the check is done in the
`bucket_refrw_touch()` anyway.

The commit also encapsulates the pcalling of `bucket_send_xc` to
separate function: `bucket_send_perform()`, which will also be needed
for applier of routes to send a bucket.

Part of tarantool#351

NO_DOC=refactoring
NO_TEST=refactoring
The `rebalancer_in_work` test case enabled the rebalancer even when
it was already enabled. Moreover, at the end of the test the rebalancer
is disabled, consequently it doesn't work for the subsequent tests,
which is strange for the test, which exists exclusively to test the
rebalancer.

The rebalancer itself isn't needed for existing tests, but it will be
required for the tests, added in the next commit, so let's drop
enabling/disabling the rebalancer from the test.

Needed for tarantool#351

NO_DOC=test
Before this commit the rebalancer picked the first encountered bucket
and waited for number of rw refs on it to become 0. This doesn't work
reliably, when cluster has very long RW requests, which block bucket
rebalancing for hours: the instance picks the same buckets and timeouts,
while waiting for no refs.

In order to fix that let's prefer buckets without refs. If there's not
enough such buckets, take the remainings from the first available buckets,
as it worked before.

From now on rebalancer routes applier knows all of the buckets, which
are going to be sent before these sends start, which is required for
fixing tarantool#573 and tarantool#333.

Needed for tarantool#333
Needed for tarantool#573
Closes tarantool#351

NO_DOC=bugfix
After the previuos commit the test became pretty flaky and fails with
error "Duplicate key exists in unique index 'pk' in space '_bucket'",
when trying to force create buckets.

Let's properly wait for all buckets to be garbage collected and not just
one, as it was before.

Follow-up tarantool#351

NO_DOC=test
@Serpentian Serpentian force-pushed the gh-351-reblancer-takes-buckets-with-no-refs branch from 5ca27b9 to 744a3fe Compare January 27, 2026 17:03
@Serpentian Serpentian assigned kamenkremen and mrForza and unassigned Serpentian Jan 28, 2026
@Serpentian Serpentian requested a review from mrForza January 28, 2026 22:33
Copy link
Collaborator

@Gerold103 Gerold103 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! Yeah, it is nice to get this debt fixed 😁💪.

exception_guard.ref.rw_lock = false
end
bucket_transfer_end(bucket_id)
local function bucket_send_perform(bucket_id, destination, opts)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you even need bucket_send_perform() now? How about inlining it in bucket_send()? It isn't too big anyway.

Comment on lines +944 to +949
local function bucket_batch_send_prepare(routes)
local ok, err
local prepared_buckets = {}
for _, bids in pairs(routes) do
for _, bid in ipairs(bids) do
ok, err = bucket_send_prepare(bid)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This might be problematic. This function will rw-lock potentially unlimited number of buckets. Which is going to be in conflict with the reason why we didn't ever care about allowing rw-requests during sending - because it is usually fast and affects one bucket at a time.

Imagine you need to send 1kk buckets out of 30kk. It is going to be a lot of buckets locked in each replicaset. These buckets might be not having RW refs right now, but might easily start receiving RW requests in a second.

Instead of building the routes so detailed and preparing all the buckets in advance I would recommend to think about an approach with limited batches. So the dispenser still operates on bucket counts. But each worker will independently try to pick and send max N buckets (preferring with no refs) at a time. Like 10 at once. Better make it configurable, I guess.

Copy link
Contributor

@mrForza mrForza left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the high-quality patch! 🔥 Left some comments below

return false
end
-- Skip buckets, which has already being picked.
if picked_buckets[bucket_id] then
Copy link
Contributor

@mrForza mrForza Feb 4, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did I get it right that we should check picked_buckets[bucket_id] == true because some already picked buckets may be transferred to another replicaset during rebalancer_build_routes_with_buckets?

E.g. We picked all needed bucket from rs1. Then when we go to rs3, some picked buckets from rs1 can arrive here? Maybe we should add a test that checks this behavior?

end
-- Skip buckets, which has RW refs.
local ref = bucket_refrw_touch(bucket_id)
if not ref then
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can be simplified:

return ref and ref.rw == 0

end
})
for _, bid in ipairs(bucket_ids) do
assert(not picked_buckets[bid])
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why we need this assertion here? picked_buckets can't consists of buckets which has been picked before because we have a necessary check on 2603 line in filter_func.

end

local function recovery_bucket_stat(bid)
local ok, err = check_is_master()
local function bucket_send_prepare(bid)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: in commit message - transfering. Double 'r' needed

@mrForza mrForza assigned Serpentian and unassigned mrForza Feb 4, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Rebalancer should firstly try to pick buckets which already have no refs

4 participants