slack-19.0: vtorc: improve handling of partial cell topo results #599

Merged
tanjinx merged 6 commits into slack-19.0 from bp-pr17718.slack-19.0 on Feb 10, 2025

Conversation

@timvaillancourt

Description

This PR is an early backport of the v22 bugfix PR vitessio#17718.

Related Issue(s)

Checklist

  • "Backport to:" labels have been added if this change should be back-ported to release branches
  • If this change is to be back-ported to previous releases, a justification is included in the PR description
  • Tests were added or are not required
  • Did the new or modified tests pass consistently locally and on CI?
  • Documentation was added or is not required

Deployment Notes

Signed-off-by: Tim Vaillancourt <tim@timvaillancourt.com>
@timvaillancourt timvaillancourt added the bug (Something isn't working), upstream-backport (An upstream backport), and v22-backport labels Feb 8, 2025
@github-actions github-actions bot added this to the v19.0.7 milestone Feb 8, 2025
@timvaillancourt timvaillancourt marked this pull request as ready for review February 8, 2025 13:07
@timvaillancourt timvaillancourt requested a review from a team as a code owner February 8, 2025 13:07
@timvaillancourt timvaillancourt changed the title from "vtorc: improve handling of partial cell topo results" to "slack-19.0: vtorc: improve handling of partial cell topo results" Feb 8, 2025
Signed-off-by: Tim Vaillancourt <tim@timvaillancourt.com>
Signed-off-by: Tim Vaillancourt <tim@timvaillancourt.com>
Signed-off-by: Tim Vaillancourt <tim@timvaillancourt.com>
Signed-off-by: Tim Vaillancourt <tim@timvaillancourt.com>
Signed-off-by: Tim Vaillancourt <tim@timvaillancourt.com>
@tanjinx tanjinx merged commit b4bbe74 into slack-19.0 Feb 10, 2025
163 of 165 checks passed
@tanjinx tanjinx deleted the bp-pr17718.slack-19.0 branch February 10, 2025 16:35
twthorn pushed a commit that referenced this pull request Mar 17, 2025
…599)

* `vtorc`: improve handling of partial cell topo results

Signed-off-by: Tim Vaillancourt <tim@timvaillancourt.com>

* add unit test

Signed-off-by: Tim Vaillancourt <tim@timvaillancourt.com>

* improve test

Signed-off-by: Tim Vaillancourt <tim@timvaillancourt.com>

* add comments

Signed-off-by: Tim Vaillancourt <tim@timvaillancourt.com>

* move sort to test

Signed-off-by: Tim Vaillancourt <tim@timvaillancourt.com>

* goimports

Signed-off-by: Tim Vaillancourt <tim@timvaillancourt.com>

---------

Signed-off-by: Tim Vaillancourt <tim@timvaillancourt.com>
makinje16 pushed a commit that referenced this pull request Mar 20, 2025
…599)

* `vtorc`: improve handling of partial cell topo results

Signed-off-by: Tim Vaillancourt <tim@timvaillancourt.com>

* add unit test

Signed-off-by: Tim Vaillancourt <tim@timvaillancourt.com>

* improve test

Signed-off-by: Tim Vaillancourt <tim@timvaillancourt.com>

* add comments

Signed-off-by: Tim Vaillancourt <tim@timvaillancourt.com>

* move sort to test

Signed-off-by: Tim Vaillancourt <tim@timvaillancourt.com>

* goimports

Signed-off-by: Tim Vaillancourt <tim@timvaillancourt.com>

---------

Signed-off-by: Tim Vaillancourt <tim@timvaillancourt.com>
tanjinx added a commit that referenced this pull request Mar 24, 2025
…d Journal Events (#585)

* VTGate VStream: Ensure reasonable delivery time for reshard journal event  (vitessio#16639)

Signed-off-by: Malcolm Akinje <malcolm.akinje@gmail.com>
Signed-off-by: Malcolm Akinje <makinje@slack-corp.com>

* Backport sqlparser patch for v15->v19 upgrade: 14763 Fix accepting bind variables in time related function calls (#590)

* Fix accepting bind variables in time related function calls. (vitessio#14763)

Signed-off-by: Manan Gupta <manan@planetscale.com>

* fix test

---------

Signed-off-by: Manan Gupta <manan@planetscale.com>
Co-authored-by: Manan Gupta <35839558+GuptaManan100@users.noreply.github.com>

* Upgrade vitess addons to 0.19.8 (#591)

This upgrade allows us to control whether vtorc raises problems or not
via an environment variable.

Signed-off-by: Eduardo J. Ortega U. <5791035+ejortegau@users.noreply.github.com>

* Use prefix in all vtorc check and recover logs (vitessio#17526) (#592)

This is a backport of vitessio#17526 . Original PR description below:

Description
This is meant to make recovery actions more easily identified from the logs. See vitessio#17465

Signed-off-by: Eduardo J. Ortega U. <5791035+ejortegau@users.noreply.github.com>

* `slack-19.0`: various backports for `vtorc`, part 2 (#596)

* Ensure all topo read calls consider `--topo_read_concurrency` (vitessio#17276)

Signed-off-by: Tim Vaillancourt <tim@timvaillancourt.com>

* Revert "add keyrange support for vtorc clusters_to_watch (#457)"

This reverts commit 45c2199.

* [release-19.0] `vtorc`: require topo for `Healthy: true` in `/debug/health` (vitessio#17129) (vitessio#17351)

Signed-off-by: Tim Vaillancourt <tim@timvaillancourt.com>
Signed-off-by: Manan Gupta <manan@planetscale.com>
Co-authored-by: vitess-bot[bot] <108069721+vitess-bot[bot]@users.noreply.github.com>
Co-authored-by: Tim Vaillancourt <tim@timvaillancourt.com>
Co-authored-by: Manan Gupta <manan@planetscale.com>

* `vtorc`: fetch all tablets from cells once + filter during refresh (vitessio#17388)

Signed-off-by: Tim Vaillancourt <tim@timvaillancourt.com>

* Support KeyRange in `--clusters_to_watch` flag (vitessio#17604)

Signed-off-by: Manan Gupta <manan@planetscale.com>

* missing func

Signed-off-by: Tim Vaillancourt <tim@timvaillancourt.com>

* Add api end point to print the current database state in VTOrc (vitessio#15485)

Signed-off-by: Manan Gupta <manan@planetscale.com>

---------

Signed-off-by: Tim Vaillancourt <tim@timvaillancourt.com>
Signed-off-by: Manan Gupta <manan@planetscale.com>
Co-authored-by: vitess-bot[bot] <108069721+vitess-bot[bot]@users.noreply.github.com>
Co-authored-by: Manan Gupta <manan@planetscale.com>
Co-authored-by: Manan Gupta <35839558+GuptaManan100@users.noreply.github.com>

* `slack-19.0`: `vtorc`: improve handling of partial cell topo results (#599)

* `vtorc`: improve handling of partial cell topo results

Signed-off-by: Tim Vaillancourt <tim@timvaillancourt.com>

* add unit test

Signed-off-by: Tim Vaillancourt <tim@timvaillancourt.com>

* improve test

Signed-off-by: Tim Vaillancourt <tim@timvaillancourt.com>

* add comments

Signed-off-by: Tim Vaillancourt <tim@timvaillancourt.com>

* move sort to test

Signed-off-by: Tim Vaillancourt <tim@timvaillancourt.com>

* goimports

Signed-off-by: Tim Vaillancourt <tim@timvaillancourt.com>

---------

Signed-off-by: Tim Vaillancourt <tim@timvaillancourt.com>

* `slack-19.0`: skip tests that will fail on v15 downgrade testing (#605)

Signed-off-by: Tim Vaillancourt <tim@timvaillancourt.com>

* `slack-19.0`: Add stats for shards watched by VTOrc (#606)

* Add stats for shards watched by VTOrc

Signed-off-by: Tim Vaillancourt <tim@timvaillancourt.com>

* Use len() in make

---------

Signed-off-by: Tim Vaillancourt <tim@timvaillancourt.com>

* Add `GetServerStatus` RPC to use in PRS (vitessio#16022) (#607)

Signed-off-by: Manan Gupta <manan@planetscale.com>
Co-authored-by: Manan Gupta <35839558+GuptaManan100@users.noreply.github.com>

* backport/patch connection pool bug/perf fixes (#604)

* [release-19.0] smartconnpool: do not allow connections to starve (vitessio#17675) (vitessio#17683)

Signed-off-by: Dirkjan Bussink <d.bussink@gmail.com>
Co-authored-by: vitess-bot[bot] <108069721+vitess-bot[bot]@users.noreply.github.com>

* smartconnpool: Better handling for idle expiration (vitessio#17756)

Signed-off-by: Vicent Marti <vmg@strn.cat>

---------

Signed-off-by: Dirkjan Bussink <d.bussink@gmail.com>
Signed-off-by: Vicent Marti <vmg@strn.cat>
Co-authored-by: vitess-bot[bot] <108069721+vitess-bot[bot]@users.noreply.github.com>
Co-authored-by: Vicent Martí <42793+vmg@users.noreply.github.com>
Co-authored-by: Tim Vaillancourt <tim@timvaillancourt.com>

* pool: reopen connection closed by idle timeout (vitessio#17818) (#609)

Signed-off-by: Harshit Gangal <harshit@planetscale.com>
Signed-off-by: Vicent Martí <42793+vmg@users.noreply.github.com>
Co-authored-by: Harshit Gangal <harshit@planetscale.com>
Co-authored-by: Vicent Martí <42793+vmg@users.noreply.github.com>

* VReplication: Support excluding lagging tablets and use this in vstream manager (vitessio#17835) (#612)

* `slack-19.0`: backport v22 VTOrc optimizations, part 2 (#613)

* `vtorc`: remove duplicate instance read from backend (vitessio#17834)

Signed-off-by: Tim Vaillancourt <tim@timvaillancourt.com>

* `vtorc`: add index for `inst.ReadInstanceClusterAttributes` table scan

Signed-off-by: Tim Vaillancourt <tim@timvaillancourt.com>

---------

Signed-off-by: Tim Vaillancourt <tim@timvaillancourt.com>

* Add stats for shards watched by VTOrc, purge stale shards (vitessio#17815) (#616)

* --consolidator-query-waiter-cap to set the max number of waiter for consolidated query (vitessio#17244) (#614)

Signed-off-by: Jun Wang <jun.wang@demonware.net>
Signed-off-by: Tim Vaillancourt <tim@timvaillancourt.com>
Co-authored-by: jwang <121262788+jwangace@users.noreply.github.com>
Co-authored-by: Jun Wang <jun.wang@demonware.net>

* `slack-19.0` backport v22 `vtorc` optimizations + stats, part 3 (#618)

* Remove unused code in discovery queue creation (vitessio#17515)

Signed-off-by: Manan Gupta <manan@planetscale.com>

* vtorc: Cleanup unused code (vitessio#15508)

Signed-off-by: Dirkjan Bussink <d.bussink@gmail.com>

* `vtorc`: cleanup discover queue, add concurrency flag (vitessio#17825)

Signed-off-by: Tim Vaillancourt <tim@timvaillancourt.com>

* `vtorc`: add tablets watched stats

Signed-off-by: Tim Vaillancourt <tim@timvaillancourt.com>

* fix missing merge conflict update

Signed-off-by: Tim Vaillancourt <tim@timvaillancourt.com>

* `vtorc`: skip unnecessary `inst.ReadTablet` in `logic.LockShard(...)`

Signed-off-by: Tim Vaillancourt <tim@timvaillancourt.com>

* `vtorc`: use `errgroup` in keyspace/shard discovery

Signed-off-by: Tim Vaillancourt <tim@timvaillancourt.com>

* fix import

Signed-off-by: Tim Vaillancourt <tim@timvaillancourt.com>

* fix ineffassign

Signed-off-by: Tim Vaillancourt <tim@timvaillancourt.com>

* missing import

Signed-off-by: Tim Vaillancourt <tim@timvaillancourt.com>

* `vtorc`: add stats for discovery workers

Signed-off-by: Tim Vaillancourt <tim@timvaillancourt.com>

* get count from backend

Signed-off-by: Tim Vaillancourt <tim@timvaillancourt.com>

* rm unused map

Signed-off-by: Tim Vaillancourt <tim@timvaillancourt.com>

---------

Signed-off-by: Manan Gupta <manan@planetscale.com>
Signed-off-by: Dirkjan Bussink <d.bussink@gmail.com>
Signed-off-by: Tim Vaillancourt <tim@timvaillancourt.com>
Co-authored-by: Manan Gupta <35839558+GuptaManan100@users.noreply.github.com>
Co-authored-by: Dirkjan Bussink <d.bussink@gmail.com>

* Bp pr 17558 pr 17858.slack19.0 (#615)

* VReplication: Improve error handling in VTGate VStreams (vitessio#17558)

Signed-off-by: Tom Thornton <thomaswilliamthornton@gmail.com>

* Backport vitessio#17858

---------

Signed-off-by: Tom Thornton <thomaswilliamthornton@gmail.com>

* `slack-19.0`: re-backport tweaks from vitessio#17911 (#621)

* fix bug in reverse `if`

Signed-off-by: Tim Vaillancourt <tim@timvaillancourt.com>

* simplify

Signed-off-by: Tim Vaillancourt <tim@timvaillancourt.com>

* add `ReadTabletCountsByShard` test

Signed-off-by: Tim Vaillancourt <tim@timvaillancourt.com>

* use map of map

Signed-off-by: Tim Vaillancourt <tim@timvaillancourt.com>

* capitalize Cell

Signed-off-by: Tim Vaillancourt <tim@timvaillancourt.com>

* gofmt lint

Signed-off-by: Tim Vaillancourt <tim@timvaillancourt.com>

* fix plural in names

Signed-off-by: Tim Vaillancourt <tim@timvaillancourt.com>

---------

Signed-off-by: Tim Vaillancourt <tim@timvaillancourt.com>

* fix releasing the global read lock when mysqlshell backup fails (vitessio#17000) (#623)

Signed-off-by: Renan Rangel <rrangel@slack-corp.com>

* VStream API: allow keyspace-level heartbeats to be streamed (vitessio#16593) (#620)

* VStream API: allow keyspace-level heartbeats to be streamed (vitessio#16593)

Signed-off-by: Malcolm Akinje <makinje@slack-corp.com>

* `slack-19.0` backport v22 `vtorc` optimizations + stats, part 3 (#618)

* Remove unused code in discovery queue creation (vitessio#17515)

Signed-off-by: Manan Gupta <manan@planetscale.com>

* vtorc: Cleanup unused code (vitessio#15508)

Signed-off-by: Dirkjan Bussink <d.bussink@gmail.com>

* `vtorc`: cleanup discover queue, add concurrency flag (vitessio#17825)

Signed-off-by: Tim Vaillancourt <tim@timvaillancourt.com>

* `vtorc`: add tablets watched stats

Signed-off-by: Tim Vaillancourt <tim@timvaillancourt.com>

* fix missing merge conflict update

Signed-off-by: Tim Vaillancourt <tim@timvaillancourt.com>

* `vtorc`: skip unnecessary `inst.ReadTablet` in `logic.LockShard(...)`

Signed-off-by: Tim Vaillancourt <tim@timvaillancourt.com>

* `vtorc`: use `errgroup` in keyspace/shard discovery

Signed-off-by: Tim Vaillancourt <tim@timvaillancourt.com>

* fix import

Signed-off-by: Tim Vaillancourt <tim@timvaillancourt.com>

* fix ineffassign

Signed-off-by: Tim Vaillancourt <tim@timvaillancourt.com>

* missing import

Signed-off-by: Tim Vaillancourt <tim@timvaillancourt.com>

* `vtorc`: add stats for discovery workers

Signed-off-by: Tim Vaillancourt <tim@timvaillancourt.com>

* get count from backend

Signed-off-by: Tim Vaillancourt <tim@timvaillancourt.com>

* rm unused map

Signed-off-by: Tim Vaillancourt <tim@timvaillancourt.com>

---------

Signed-off-by: Manan Gupta <manan@planetscale.com>
Signed-off-by: Dirkjan Bussink <d.bussink@gmail.com>
Signed-off-by: Tim Vaillancourt <tim@timvaillancourt.com>
Co-authored-by: Manan Gupta <35839558+GuptaManan100@users.noreply.github.com>
Co-authored-by: Dirkjan Bussink <d.bussink@gmail.com>

* Bp pr 17558 pr 17858.slack19.0 (#615)

* VReplication: Improve error handling in VTGate VStreams (vitessio#17558)

Signed-off-by: Tom Thornton <thomaswilliamthornton@gmail.com>

* Backport vitessio#17858

---------

Signed-off-by: Tom Thornton <thomaswilliamthornton@gmail.com>

* `slack-19.0`: re-backport tweaks from vitessio#17911 (#621)

* fix bug in reverse `if`

Signed-off-by: Tim Vaillancourt <tim@timvaillancourt.com>

* simplify

Signed-off-by: Tim Vaillancourt <tim@timvaillancourt.com>

* add `ReadTabletCountsByShard` test

Signed-off-by: Tim Vaillancourt <tim@timvaillancourt.com>

* use map of map

Signed-off-by: Tim Vaillancourt <tim@timvaillancourt.com>

* capitalize Cell

Signed-off-by: Tim Vaillancourt <tim@timvaillancourt.com>

* gofmt lint

Signed-off-by: Tim Vaillancourt <tim@timvaillancourt.com>

* fix plural in names

Signed-off-by: Tim Vaillancourt <tim@timvaillancourt.com>

---------

Signed-off-by: Tim Vaillancourt <tim@timvaillancourt.com>

---------

Signed-off-by: Malcolm Akinje <makinje@slack-corp.com>
Signed-off-by: Manan Gupta <manan@planetscale.com>
Signed-off-by: Dirkjan Bussink <d.bussink@gmail.com>
Signed-off-by: Tim Vaillancourt <tim@timvaillancourt.com>
Signed-off-by: Tom Thornton <thomaswilliamthornton@gmail.com>
Signed-off-by: Malcolm Akinje <malcolm.akinje@gmail.com>
Co-authored-by: Tim Vaillancourt <tim@timvaillancourt.com>
Co-authored-by: Manan Gupta <35839558+GuptaManan100@users.noreply.github.com>
Co-authored-by: Dirkjan Bussink <d.bussink@gmail.com>
Co-authored-by: Tom Thornton <thomaswilliamthornton@gmail.com>

* Increase health check channel buffer (vitessio#17821) (#625)

Signed-off-by: Manan Gupta <manan@planetscale.com>
Signed-off-by: Malcolm Akinje <makinje@slack-corp.com>
Co-authored-by: Manan Gupta <35839558+GuptaManan100@users.noreply.github.com>

* VStream: Allow for automatic resume after Reshard across VStreams (vitessio#15393) (#627)

Signed-off-by: Tanjin Xu <tanjin.xu@slack-corp.com>
Co-authored-by: Matt Lord <mattalord@gmail.com>

---------

Signed-off-by: Malcolm Akinje <malcolm.akinje@gmail.com>
Signed-off-by: Malcolm Akinje <makinje@slack-corp.com>
Signed-off-by: Manan Gupta <manan@planetscale.com>
Signed-off-by: Eduardo J. Ortega U. <5791035+ejortegau@users.noreply.github.com>
Signed-off-by: Tim Vaillancourt <tim@timvaillancourt.com>
Signed-off-by: Dirkjan Bussink <d.bussink@gmail.com>
Signed-off-by: Vicent Marti <vmg@strn.cat>
Signed-off-by: Harshit Gangal <harshit@planetscale.com>
Signed-off-by: Vicent Martí <42793+vmg@users.noreply.github.com>
Signed-off-by: Jun Wang <jun.wang@demonware.net>
Signed-off-by: Tom Thornton <thomaswilliamthornton@gmail.com>
Signed-off-by: Renan Rangel <rrangel@slack-corp.com>
Signed-off-by: Tanjin Xu <tanjin.xu@slack-corp.com>
Co-authored-by: Tanjin Xu <109303790+tanjinx@users.noreply.github.com>
Co-authored-by: Manan Gupta <35839558+GuptaManan100@users.noreply.github.com>
Co-authored-by: Eduardo J. Ortega U. <5791035+ejortegau@users.noreply.github.com>
Co-authored-by: Tim Vaillancourt <tim@timvaillancourt.com>
Co-authored-by: vitess-bot[bot] <108069721+vitess-bot[bot]@users.noreply.github.com>
Co-authored-by: Manan Gupta <manan@planetscale.com>
Co-authored-by: Vicent Martí <42793+vmg@users.noreply.github.com>
Co-authored-by: Harshit Gangal <harshit@planetscale.com>
Co-authored-by: Tom Thornton <thomaswilliamthornton@gmail.com>
Co-authored-by: jwang <121262788+jwangace@users.noreply.github.com>
Co-authored-by: Jun Wang <jun.wang@demonware.net>
Co-authored-by: Dirkjan Bussink <d.bussink@gmail.com>
Co-authored-by: Renan Rangel <rvrangel@users.noreply.github.com>
Co-authored-by: Matt Lord <mattalord@gmail.com>
Labels

bug (Something isn't working), upstream-backport (An upstream backport), v22-backport
