admission: implement 1ms disk bandwidth error correction by angeladietz · Pull Request #163347 · cockroachdb/cockroach

angeladietz · 2026-02-11T15:24:25Z

Will rebase once #163342 is merged, this pr is for the 2nd two commits.

admission: implement 1ms disk bandwidth error correction

Enable disk bandwidth error correction on every tick (1ms when loaded)
rather than once per second. This improves responsiveness of token
adjustment to actual disk I/O.

Key changes:

Change errorAdjustmentInterval from 1 to 0.001 so shouldAdjustForError
returns true on every tick
Add initialized flag to diskTokensError to skip the first call, which
has no prior measurement to compute a delta from
Change adjustDiskTokenError to accept DiskStats directly and use
GetDiskStats for instantaneous collection rather than GetPebbleMetrics
Log potential overadmission when disk write tokens go negative

Epic: CRDB-47073
Informs: #162214
Release note: None

admission: log extended durations of unchanged disk stats

Add observability for cases where disk stats don't change across
consecutive 1ms error correction ticks. This helps assess whether the
1ms tick rate is effective given the kernel's disk stat update
frequency (typically 4ms on Linux).

Introduce staleStatTracker which:

Counts consecutive ticks where BytesRead and BytesWritten are unchanged
Logs (rate-limited) when count exceeds thresholds of 20, 50, or 100
Tracks cumulative counts of runs exceeding each threshold

Epic: CRDB-47073
Informs: #162214
Release note: None

trunk-io · 2026-02-11T15:24:30Z

Merging to master in this repository is managed by Trunk.

To merge this pull request, check the box to the left or comment /trunk merge below.

cockroach-teamcity · 2026-02-11T15:24:44Z

This change is

angeladietz · 2026-02-11T21:31:23Z

@sumeerbhola My last commit admission: log extended durations of unchanged disk stats is to emulate the logging you added in your prototype here, but i think we could get visibility into the same info by adding a histogram. Do you have a preference between my approach and using a histogram? I personally thought a histogram might be unnecessary since its more so for us to get a sense of how useful 1ms error correction is and less so an indicator for cluster health.

sumeerbhola

Regarding that logging, I was just playing around to get a sense whether these stats were updated frequently enough to reflect changes. The results indicated that they are. I don't think we need any observability in this regard, but I'll take a look at what you added.

@sumeerbhola made 1 comment.
Reviewable status: complete! 0 of 0 LGTMs obtained.

sumeerbhola

some simplification etc. comments

@sumeerbhola partially reviewed 3 files and made 4 comments.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @angeladietz).

pkg/util/admission/granter.go line 648 at r2 (raw file):

		cumUnaccountedError := *cumError - *accountedError
		if cumUnaccountedError != 0 {
			*accountedError += cumUnaccountedError

Can we get rid of accountedForError inside kvStoreTokenGranter now that we are accounting for all the error.

pkg/util/admission/granter.go line 856 at r2 (raw file):

		writeTokensPerSec := sg.mu.diskTokensAvailable.writeByteTokens / adjustmentInterval
		if writeTokensPerSec < 0 {
			log.Dev.Infof(context.Background(), "Potential overadmission=%s/s", humanizeutil.IBytes(-writeTokensPerSec))

This was fine for the prototype, but I think we need to plumb it back to the ioLoadListener, so that it is recorded as part of the log statement it outputs every 15s. In case you haven't seen it in escalations, an example is here https://cockroachlabs.atlassian.net/wiki/spaces/CKB/pages/3561455699/Playbook+v24.1+Admission+Control+Overload+Escalations+Dashboard#Logs.

pkg/util/admission/io_load_listener.go line 440 at r2 (raw file):

// corrections. Set to 0.001 (1ms) to enable error correction on every tick when
// loaded.
const errorAdjustmentInterval = 0.001

We should get rid of the shouldAdjustForError method and anything it needed, including this const. We are now deliberately adjusting for the error every time we add to the token bucket.

Enable disk bandwidth error correction on every tick (1ms when loaded) rather than once per second. This improves responsiveness of token adjustment to actual disk I/O. Key changes: - Change errorAdjustmentInterval from 1 to 0.001 so shouldAdjustForError returns true on every tick - Add initialized flag to diskTokensError to skip the first call, which has no prior measurement to compute a delta from - Change adjustDiskTokenError to accept DiskStats directly and use GetDiskStats for instantaneous collection rather than GetPebbleMetrics - Log potential overadmission when disk write tokens go negative Epic: CRDB-47073 Informs: cockroachdb#162214 Release note: None

Add observability for cases where disk stats don't change across consecutive 1ms error correction ticks. This helps assess whether the 1ms tick rate is effective given the kernel's disk stat update frequency (typically 4ms on Linux). Introduce staleStatTracker which: - Counts consecutive ticks where BytesRead and BytesWritten are unchanged - Logs (rate-limited) when count exceeds thresholds of 20, 50, or 100 - Tracks cumulative counts of runs exceeding each threshold Epic: CRDB-47073 Informs: cockroachdb#162214 Release note: None

This function was previously used to ensure error correction only occured at a certain interval, but this logic was recently adjusted so we now error correct on every tick. Thus, this function isn't needed anymore. Epic: CRDB-47073 Informs: cockroachdb#162214 Release note: None

github-actions · 2026-02-16T20:21:43Z

Potential Bug(s) Detected

The three-stage Claude Code analysis has identified potential bug(s) in this PR that may warrant investigation.

Next Steps:
Please review the detailed findings in the workflow run.

Note: When viewing the workflow output, scroll to the bottom to find the Final Analysis Summary.

After you review the findings, please tag the issue as follows:

If the detected issue is real or was helpful in any way, please tag the issue with O-AI-Review-Real-Issue-Found
If the detected issue was not helpful in any way, please tag the issue with O-AI-Review-Not-Helpful

Previously, error correction used accountedForError to track what portion of cumError had already been applied to available tokens, and would only deduct (cumError - accountedForError)/15 on each error tick. This dampening was needed because at coarser error correction intervals, applying the full error immediately caused large token oscillations and under-utilization (cockroachdb#160964). With the recent move to 1ms error correction ticks, the per-tick error is small enough to apply directly without dampening. This removes accountedForError and simplifies errorAdjFunc to subtract the interval error immediately. Epic: CRDB-47073 Informs: cockroachdb#162214 Release note: None

Previously, potential disk overadmission was logged independently in the kvStoreTokenGranter on every interval boundary, separate from the main "IO overload" log line that oncallers use during escalations. Now, the overadmission signal is plumbed through to the diskBandwidthLimiter and included in its SafeFormat output, so it appears in the existing 15s log line alongside the other disk bandwidth stats. This is done by stashing the disk write token balance (before the max(0) reset) in addAvailableTokens on the last tick, returning it via getDiskTokensUsedAndReset, and passing it through to computeElasticTokens where it is stored in diskBandwidthLimiterState. When the balance is negative, SafeFormat appends the overadmission rate. Epic: CRDB-47073 Informs: cockroachdb#162214 Release note: None

angeladietz marked this pull request as ready for review February 12, 2026 18:44

angeladietz requested review from a team as code owners February 12, 2026 18:44

angeladietz requested review from sumeerbhola and removed request for a team February 12, 2026 19:09

sumeerbhola reviewed Feb 13, 2026

View reviewed changes

sumeerbhola requested changes Feb 13, 2026

View reviewed changes

angeladietz added 3 commits February 16, 2026 15:14

angeladietz force-pushed the store-ac-error-correction branch from 4ef2fc8 to 4751fea Compare February 16, 2026 20:15

github-actions bot added the o-AI-Review-Potential-Issue-Detected AI reviewer found potential issue. Never assign manually—auto-applied by GH action only. label Feb 16, 2026

angeladietz added 2 commits February 16, 2026 16:55

angeladietz force-pushed the store-ac-error-correction branch from 4751fea to be9607b Compare February 16, 2026 21:56

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

admission: implement 1ms disk bandwidth error correction#163347

admission: implement 1ms disk bandwidth error correction#163347
angeladietz wants to merge 5 commits intocockroachdb:masterfrom
angeladietz:store-ac-error-correction

angeladietz commented Feb 11, 2026 •

edited

Loading

Uh oh!

trunk-io bot commented Feb 11, 2026

Uh oh!

cockroach-teamcity commented Feb 11, 2026

Uh oh!

angeladietz commented Feb 11, 2026

Uh oh!

sumeerbhola left a comment

Uh oh!

sumeerbhola left a comment

Uh oh!

github-actions bot commented Feb 16, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

angeladietz commented Feb 11, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

trunk-io bot commented Feb 11, 2026

Uh oh!

cockroach-teamcity commented Feb 11, 2026

Uh oh!

angeladietz commented Feb 11, 2026

Uh oh!

sumeerbhola left a comment

Choose a reason for hiding this comment

Uh oh!

sumeerbhola left a comment

Choose a reason for hiding this comment

Uh oh!

github-actions bot commented Feb 16, 2026

Potential Bug(s) Detected

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

angeladietz commented Feb 11, 2026 •

edited

Loading