Increase queue size warning threshold #49

Open

ccutrer wants to merge 1 commit into seime:master from ccutrer:queue-size-warning-threshold

Conversation

@ccutrer
Contributor

@ccutrer ccutrer commented Jul 24, 2025

If you have a LOT of ESPHome devices, you can be in a completely stable state, but this is spamming the log

I currently have 48 active devices, and was getting this warning about once per second in my log, with a queue depth of 53 every time.

If you have a LOT of ESPHome devices, you can be in a completely
stable state, but this is spamming the log

Signed-off-by: Cody Cutrer <cody@cutrer.us>
@seime
Owner

seime commented Jul 26, 2025

The idea was to warn the user about data not being processed in a timely fashion, indicating a performance issue on the OH server or network problems.

A fixed number will be wrong for most users, but I think 2 * num_devices would be suitable. It should be managed by the handler factory.
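A minimal sketch of the 2 * num_devices idea, assuming the handler factory can report how many handlers are active. The names (`QueueThreshold`, `queueWarningThreshold`, the floor of 10) are illustrative, not from the binding:

```java
// Hypothetical sketch: derive the warning threshold from the number of
// active ESPHome handlers instead of a fixed constant.
public class QueueThreshold {
    // Warn when the queue grows beyond twice the number of active devices,
    // with a floor so tiny installations keep a sensible minimum.
    static int queueWarningThreshold(int activeHandlers) {
        return Math.max(10, 2 * activeHandlers);
    }

    public static void main(String[] args) {
        // With 48 devices the threshold becomes 96, so the reported
        // steady-state queue depth of 53 would no longer warn.
        System.out.println(queueWarningThreshold(48)); // prints 96
    }
}
```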

@ccutrer
Contributor Author

ccutrer commented Jul 28, 2025

I like the idea of making the threshold dynamic, but I need to dig into this more. I noticed that when I started openHAB yesterday it spammed the log with a queue size of 102, so maybe there's something more going on that I'm not understanding.

@seime
Owner

seime commented Jul 28, 2025

The queue size is just a rough symptom check; what is more interesting is how long each task sits in the queue. That requires a bit more code. In addition, the thread pool size is currently fixed at 4, which might not be suitable for a large installation such as yours. That's probably another thing to make configurable at the binding level.
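One way to measure per-task queue wait time is to capture a timestamp at submit time and compare it when the task actually starts. This is a hedged sketch, not the binding's executor; all names (`MonitoredExecutor`, `maxWaitMillis`, `worstWaitMillis`) are illustrative:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

// Sketch: warn on how long a task waited in the queue rather than on
// raw queue depth, which can be large but harmless at startup.
public class MonitoredExecutor {
    private final ExecutorService pool;
    private final long maxWaitMillis;
    final AtomicLong worstWaitMillis = new AtomicLong(); // exposed for inspection

    MonitoredExecutor(int threads, long maxWaitMillis) {
        this.pool = Executors.newFixedThreadPool(threads);
        this.maxWaitMillis = maxWaitMillis;
    }

    // Record the enqueue time; the wrapper measures the wait when the
    // worker thread finally picks the task up.
    void submit(Runnable task) {
        final long enqueuedNanos = System.nanoTime();
        pool.submit(() -> {
            long waited = TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - enqueuedNanos);
            worstWaitMillis.accumulateAndGet(waited, Math::max);
            if (waited > maxWaitMillis) {
                System.err.println("Task waited " + waited + " ms before executing");
            }
            task.run();
        });
    }

    void shutdownAndWait() {
        pool.shutdown();
        try {
            pool.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        MonitoredExecutor ex = new MonitoredExecutor(4, 100);
        for (int i = 0; i < 10; i++) {
            ex.submit(() -> { });
        }
        ex.shutdownAndWait();
        System.out.println("worst wait: " + ex.worstWaitMillis.get() + " ms");
    }
}
```

With this shape, a deep-but-fast-draining queue at boot produces no warnings, while a genuinely stalled pool does.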

@ccutrer
Contributor Author

ccutrer commented Jul 28, 2025

Heh, yeah. I have 18 cores on my CPU :). And ESPHome devices are a relatively small fraction of my total smart devices. I hate to add configuration settings just for the sake of it if we can make things better automatically. I also noticed that I only see the warnings while openHAB is first booting, when not only is this binding connecting to all devices almost simultaneously, but ~700 other things are also coming online at the same time. So what would you think about this:

  • automatically sizing the pool to the number of CPUs available, with a minimum of 4
  • not logging this if the start level is less than <some number>, meaning the system is still booting. Probably the "all things started" start level
  • at that point, if I'm still getting spurious log messages, looking at adjusting the threshold to be some multiple of the number of active ESPHome things
  • and possibly adding a throttle to the log messages: don't log again if we've logged within the last 30 seconds
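The first and last bullets above can be sketched briefly. This is a hypothetical illustration (class and field names like `WarnThrottle`, `shouldWarn`, and the 30-second constant are mine, not the binding's):

```java
import java.util.concurrent.atomic.AtomicLong;

// Sketch of two ideas from the list above: CPU-based pool sizing with a
// floor of 4, and rate-limiting the warning to once per 30 seconds.
public class WarnThrottle {
    // Size the pool from available CPUs, never below the current fixed 4.
    static int poolSize() {
        return Math.max(4, Runtime.getRuntime().availableProcessors());
    }

    private static final long THROTTLE_MS = 30_000;
    private static final AtomicLong lastWarnMillis = new AtomicLong();

    // Returns true if enough time has passed since the last warning;
    // the CAS keeps concurrent callers from all logging at once.
    static boolean shouldWarn(long nowMillis) {
        long last = lastWarnMillis.get();
        return nowMillis - last >= THROTTLE_MS
                && lastWarnMillis.compareAndSet(last, nowMillis);
    }
}
```

In real use `shouldWarn(System.currentTimeMillis())` would guard the `logger.warn(...)` call; the start-level check would be a separate guard using whatever start-level API the binding has access to.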

@seime
Owner

seime commented Jul 29, 2025

All suggestions sound reasonable, but I would still like to change the logging from queue size to queue wait time for any given task, as I think it would be more accurate.

The reason I ditched the ThingHandler executor was users complaining about ping timeouts and disconnects. I'm sure you have also increased the ThingHandler thread pool size to avoid "random" delays in OH? The default size (5 or 10?) isn't enough to deal with various bindings doing I/O on congested networks or hard-to-reach devices scattered around. When the monitored executor is advanced enough, my plan is to PR it to OH core so that users can get some information when random delays occur, see which bindings or devices cause the issues, and learn how to improve the situation.

@seime
Owner

seime commented Aug 8, 2025

Possible upcoming improvements in core: openhab/openhab-core#4947
