[slack-22.0] Add debug logging in reparent tests and disable replica offline/down tests for v19 #794

Merged
tanjinx merged 5 commits into slack-22.0 from v22-verify-tests
Feb 7, 2026

@tanjinx tanjinx commented Feb 6, 2026

Adds logging to capture error details when GetShardReplication fails during external reparent tests. This helps diagnose failures in TestReparentFromOutsideWithNoPrimary by showing the actual error message and command output.

Description

  1. 7333803 - Add debug logging to CheckReparentFromOutside test helper
  2. eefb6bc - Fix test command argument order
  3. b20993f - Skip TestReparentReplicaOffline for vtctld version 19 and below
  4. b0325bd - Fix assertNodeCount to handle GetShardReplication response structure
  5. 0cb249c - Skip TestReparentWithDownReplica for vtctld version 19 and below

Related Issue(s)

Checklist

  • "Backport to:" labels have been added if this change should be back-ported to release branches
  • If this change is to be back-ported to previous releases, a justification is included in the PR description
  • Tests were added or are not required
  • Did the new or modified tests pass consistently locally and on CI?
  • Documentation was added or is not required

Deployment Notes

AI Disclosure

Adds logging to capture error details when GetShardReplication fails
during external reparent tests. This helps diagnose failures in
TestReparentFromOutsideWithNoPrimary by showing the actual error
message and command output.

Co-Authored-By: Claude <svc-devxp-claude@slack-corp.com>
Signed-off-by: Tanjin Xu <tanjin.xu@slack-corp.com>
@tanjinx tanjinx added this to the v22.0.3 milestone Feb 6, 2026

codecov-commenter commented Feb 6, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 69.78%. Comparing base (e3f4453) to head (0cb249c).

Additional details and impacted files
@@             Coverage Diff             @@
##           slack-22.0     #794   +/-   ##
===========================================
  Coverage       69.77%   69.78%           
===========================================
  Files            1605     1605           
  Lines          213999   213999           
===========================================
+ Hits           149325   149340   +15     
+ Misses          64674    64659   -15     

tanjinx and others added 4 commits February 6, 2026 13:32
Signed-off-by: Tanjin Xu <tanjin.xu@slack-corp.com>
The test behavior changed in v20+. In v19 and below, PRS would fail
when a cross-cell replica was offline. In v20+, PRS succeeds because
it only requires same-cell tablets to be reachable with semi-sync
durability.

Skip this test for v19 and below to avoid test failures due to the
behavior difference.

Co-Authored-By: Claude <svc-devxp-claude@slack-corp.com>
Signed-off-by: Tanjin Xu <tanjin.xu@slack-corp.com>
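
A version gate like the one this commit describes might look like the sketch below. The helper names (`majorVersion`, `shouldSkipForVtctld`) and the version-string format are assumptions for illustration. How the test actually obtains the vtctld version is not shown in this PR page.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// majorVersion extracts the leading major number from a version string such
// as "19.0.4" or "v22.0.3-SNAPSHOT" (format assumed for this sketch).
func majorVersion(version string) (int, error) {
	v := strings.TrimPrefix(strings.TrimSpace(version), "v")
	major, _, _ := strings.Cut(v, ".")
	n, err := strconv.Atoi(major)
	if err != nil {
		return 0, fmt.Errorf("cannot parse major version from %q: %w", version, err)
	}
	return n, nil
}

// shouldSkipForVtctld reports whether a test should be skipped because the
// vtctld major version is at or below maxSkippedMajor. An unparseable version
// does not skip, so a parsing problem cannot silently hide a test.
func shouldSkipForVtctld(version string, maxSkippedMajor int) bool {
	major, err := majorVersion(version)
	if err != nil {
		return false
	}
	return major <= maxSkippedMajor
}
```

Inside a test this would be used as `if shouldSkipForVtctld(v, 19) { t.Skip("behavior differs in v19 and below") }`, where `v` is the version string reported by the vtctld binary under test.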
The GetShardReplication command returns a nested JSON structure:
{
  "shard_replication_by_cell": {
    "zone1": {
      "nodes": [...]
    }
  }
}

The assertNodeCount function was incorrectly looking for result["nodes"]
at the top level, causing a panic when trying to call Len() on a nil
reflect.Value. Updated to navigate the correct nested structure.

This fixes the panic in TestReparentFromOutside:
  panic: reflect: call of reflect.Value.Len on zero Value

Co-Authored-By: Claude <svc-devxp-claude@slack-corp.com>
Signed-off-by: Tanjin Xu <tanjin.xu@slack-corp.com>
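
Navigating the nested structure from this commit message could be done along these lines. This is a sketch under assumptions: `shardReplicationResponse` models only the fields shown above, `countNodes` is a hypothetical helper rather than the actual `assertNodeCount`, and the node entries are left undecoded.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// shardReplicationResponse models only the fields shown in the commit
// message; the contents of each node are left opaque.
type shardReplicationResponse struct {
	ShardReplicationByCell map[string]struct {
		Nodes []json.RawMessage `json:"nodes"`
	} `json:"shard_replication_by_cell"`
}

// countNodes decodes GetShardReplication JSON output and returns the number
// of nodes listed for the given cell. A missing cell yields an error rather
// than a panic.
func countNodes(output []byte, cell string) (int, error) {
	var resp shardReplicationResponse
	if err := json.Unmarshal(output, &resp); err != nil {
		return 0, fmt.Errorf("decoding GetShardReplication output: %w", err)
	}
	cellInfo, ok := resp.ShardReplicationByCell[cell]
	if !ok {
		return 0, fmt.Errorf("cell %q not present in response", cell)
	}
	return len(cellInfo.Nodes), nil
}
```

Decoding into a typed struct avoids reflection entirely, so a response with an unexpected shape surfaces as an ordinary error the test can report.
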
The test behavior changed in v20+. In v19 and below, PRS would succeed
when a same-cell replica MySQL was down. In v20+, PRS fails with an
error because it verifies all tablets are reachable before proceeding.

The test expects the v20+ behavior where PRS fails when a tablet is
unreachable. Skip this test for v19 and below to avoid test failures.

Co-Authored-By: Claude <svc-devxp-claude@slack-corp.com>
Signed-off-by: Tanjin Xu <tanjin.xu@slack-corp.com>
@tanjinx tanjinx marked this pull request as ready for review February 7, 2026 00:49
@tanjinx tanjinx requested a review from a team as a code owner February 7, 2026 00:49
@tanjinx tanjinx changed the title from "Add debug logging to CheckReparentFromOutside test helper" to "[slack-22.0] Add debug logging in reparent tests and disable TestReparentReplicaOffline and TestReparentWithDownReplica for v19" Feb 7, 2026
@tanjinx tanjinx changed the title from "[slack-22.0] Add debug logging in reparent tests and disable TestReparentReplicaOffline and TestReparentWithDownReplica for v19" to "[slack-22.0] Add debug logging in reparent tests and disable replica offline/down tests for v19" Feb 7, 2026
@tanjinx tanjinx added the v22 label Feb 7, 2026
@tanjinx tanjinx merged commit 946a513 into slack-22.0 Feb 7, 2026
92 of 95 checks passed
@tanjinx tanjinx deleted the v22-verify-tests branch February 7, 2026 01:11

3 participants