Skip to content

Add configurable option to scale-down during task run time for cost-based autoscaler#18958

Merged
capistrant merged 3 commits intoapache:masterfrom
Fly-Style:cost-base-autoscaler-scaledown-with-barrier
Jan 28, 2026
Merged

Add configurable option to scale-down during task run time for cost-based autoscaler#18958
capistrant merged 3 commits intoapache:masterfrom
Fly-Style:cost-base-autoscaler-scaledown-with-barrier

Conversation

@Fly-Style
Copy link
Contributor

@Fly-Style Fly-Style commented Jan 27, 2026

To test the cost-based autoscaler in real-world scenarios, we temporarily allow scale-downs during task runtime. In case if we don't want to scale down immediately.
Two configuration options has been added:

  • scaleDownDuringTaskRolloverOnly indicates if task scaling down is limited to periods during task rollovers only. false by default, allows scaling down during normal task run time.
  • scaleDownBarrier, A number of successful scale down attempts which should be skipped to allow prevention of the auto-scaler from scaling down tasks immediately when the computed cost-based metrics fall below this barrier. Setting it to 0 removes the threshold.

There is a parallel PR that tries to implement the most correct behaviour for scaling during task rollover: #18954

This PR has:

  • been self-reviewed.
  • added documentation for new or modified features or behaviors.
  • added unit tests or modified existing tests to cover new code paths, ensuring the threshold for code coverage is met.
  • been tested in a test Druid cluster.

@Fly-Style Fly-Style force-pushed the cost-base-autoscaler-scaledown-with-barrier branch from e689f4e to e26868e Compare January 27, 2026 13:48
@Fly-Style Fly-Style force-pushed the cost-base-autoscaler-scaledown-with-barrier branch from e26868e to 4fd0f86 Compare January 27, 2026 14:06
Copy link
Contributor

@capistrant capistrant left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

serde test needs fixing up it seems. in addition to that left some lower stakes comments questions. Overall I think it is mergeable as is, but I do wonder about best default for scaleDownDuringTaskRolloverOnly in the long term

Comment on lines 173 to 174
} else if (!config.isScaleDownOnTaskRolloverOnly() && optimalTaskCount < currentTaskCount &&
(config.getScaleDownBarrier() == 0 || ++scaleDownCounter == config.getScaleDownBarrier())) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit suggestion: might be easier cognitively to drop the ||?

Suggested change
} else if (!config.isScaleDownOnTaskRolloverOnly() && optimalTaskCount < currentTaskCount &&
(config.getScaleDownBarrier() == 0 || ++scaleDownCounter == config.getScaleDownBarrier())) {
} else if (!config.isScaleDownOnTaskRolloverOnly() && optimalTaskCount < currentTaskCount && ++scaleDownCounter >= config.getScaleDownBarrier()) {

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, good catch!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed: 688833e

|`idleWeight`|The weight of extracted poll idle value in cost function. | No | 0.75 |
|`defaultProcessingRate`|A planned processing rate per task, required for first cost estimations. | No | 1000 |
|`scaleDownBarrier`| A number of successful scale down attempts which should be skipped to prevent the auto-scaler from scaling down tasks immediately. | No | 5 |
|`scaleDownDuringTaskRolloverOnly`| Indicates whether task scaling down is limited to periods during task rollovers only. | No | False |
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it possible that we want this to be true as default, if that is like the production behavior we think will be preferred? I say if that is like the production behavior we think will be preferred because of this in the description:

To test the cost-based autoscaler in real-world scenarios, we temporarily allow scale-downs during task runtime

like instead of temporarily defaulting for testing, default for long term productionalization plans and we can override for folks testing behavior more rapidly?

Copy link
Contributor Author

@Fly-Style Fly-Style Jan 28, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, that's the case. Scale down during task rollover is now in unsatisfactory condition, and the existence of that PR with scaleDownDuringTaskRolloverOnly set to false is required for smooth autoscaler testing.
There is a parallel PR: #18954, I tried to integrate a new way to gradually stop tasks and scale down, but it is very risky and need a very careful review and testing.

Well, by bad, I did not produce a good PR description. I updated the it.

this.idleWeight = Configs.valueOrDefault(idleWeight, DEFAULT_IDLE_WEIGHT);
this.defaultProcessingRate = Configs.valueOrDefault(defaultProcessingRate, DEFAULT_PROCESSING_RATE);
this.scaleDownBarrier = Configs.valueOrDefault(scaleDownBarrier, DEFAULT_SCALE_DOWN_BARRIER);
this.scaleDownDuringTaskRolloverOnly = Configs.valueOrDefault(scaleDownDuringTaskRolloverOnly, false);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: should this follow pattern of default at top of file or are we only doing that for magic numbers with context added in comments?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, did not get your point :)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just wondering if inlining the default when the rest are constants at the top of the file for just this one config is intentional. I think it is no big deal

{
private static final long DEFAULT_SCALE_ACTION_PERIOD_MILLIS = 15 * 60 * 1000; // 15 minutes
private static final long DEFAULT_MIN_TRIGGER_SCALE_ACTION_FREQUENCY_MILLIS = 1200000; // 20 minutes
private static final long DEFAULT_SCALE_ACTION_PERIOD_MILLIS = 10 * 60 * 1000; // 10 minutes
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks like this caused the default serde test to fail. I assume the one below would also fail assertions if not fixed up as well.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed: 688833e

Copy link
Contributor

@capistrant capistrant left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks for the fixing up @Fly-Style. LGTM

@capistrant capistrant merged commit ed0264a into apache:master Jan 28, 2026
40 checks passed
@Fly-Style Fly-Style deleted the cost-base-autoscaler-scaledown-with-barrier branch January 28, 2026 15:39
Comment on lines +96 to +101
Assert.assertEquals(DEFAULT_SCALE_ACTION_PERIOD_MILLIS, config.getScaleActionPeriodMillis());
Assert.assertEquals(DEFAULT_MIN_TRIGGER_SCALE_ACTION_FREQUENCY_MILLIS, config.getMinTriggerScaleActionFrequencyMillis());
Assert.assertEquals(DEFAULT_LAG_WEIGHT, config.getLagWeight(), 0.001);
Assert.assertEquals(DEFAULT_IDLE_WEIGHT, config.getIdleWeight(), 0.001);
Assert.assertEquals(DEFAULT_PROCESSING_RATE, config.getDefaultProcessingRate(), 0.001);
Assert.assertEquals(DEFAULT_SCALE_DOWN_BARRIER, config.getScaleDownBarrier());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rather than importing the constants, it would be nice to verify the numerical values directly here, so that when the constants change, we consciously update the tests as well.

|`idleWeight`|The weight of extracted poll idle value in cost function. | No | 0.75 |
|`defaultProcessingRate`|A planned processing rate per task, required for first cost estimations. | No | 1000 |
|`scaleDownBarrier`| A number of successful scale down attempts which should be skipped to prevent the auto-scaler from scaling down tasks immediately. | No | 5 |
|`scaleDownDuringTaskRolloverOnly`| Indicates whether task scaling down is limited to periods during task rollovers only. | No | False |
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for adding this config, it makes sense to have this option. But I agree with @capistrant in that we would want the default of this flag to be true so that scale down happens only on task rollover by default. But we can fix it up once we have addressed the issues with the task rollover bit.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a reason for this option's existence, actually :D

|`lagWeight`|The weight of extracted lag value in cost function.| No| 0.25|
|`idleWeight`|The weight of extracted poll idle value in cost function. | No | 0.75 |
|`defaultProcessingRate`|A planned processing rate per task, required for first cost estimations. | No | 1000 |
|`scaleDownBarrier`| A number of successful scale down attempts which should be skipped to prevent the auto-scaler from scaling down tasks immediately. | No | 5 |
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This config does not seem to be aligned with the whole cost-based story.
It is a little counterintuitive as it essentially negates the decision taken by the auto-scaler.

IIUC, this has been put in as a hack until we have fixed the issue with scaling on task rollover.

Please remove this config and we should try to evaluate the following instead:

  • increase the window over which we compute the average lag
  • AND/OR add more terms to the cost function
  • AND/OR add a config like minScaleDownDelay (which serves a role similar to the scaleDownBarrier but in terms of time rather than count)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

minScaleDelay idea looks better, I like it more 👍🏻

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants