Implement raw mDNS listener for immediate reconnection #69

Closed
amandel wants to merge 1 commit into seime:master from amandel:feature/reconnect-if-device-comes-online

Contributor

@amandel amandel commented Feb 7, 2026

ESPHome devices that are mostly offline and come online only briefly currently have to wait for the next scheduled reconnect attempt before they can come online. We could make reconnect attempts more frequent to get a faster reaction time. This PR instead listens for the first network action that every ESPHome device performs after startup (its mDNS announcement). We can then immediately respond with a connection attempt and so reduce the reconnect time to a minimum.

  • Add ESPHomeRawMDNSListener to listen for mDNS packets directly on port 5353, bypassing JmDNS to avoid caching issues.
  • Implement manual mDNS packet parsing to detect _esphomelib._tcp.local services and extract hostnames.
  • Update ESPHomeHandlerFactory to track active handlers and dispatch reappearance events based on IP or hostname matching.
  • Add onDeviceReappeared to ESPHomeHandler to reset exponential backoff and trigger immediate connection.
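
The manual packet parsing mentioned in the list above mostly comes down to decoding DNS names, including the compression pointers defined in RFC 1035. The following is a minimal sketch of that step, not the code from this PR; the class `DnsNameReader` and its methods are hypothetical names chosen for illustration.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical helper for illustration; not the PR's actual parser. */
final class DnsNameReader {
    private static final String ESPHOME_SERVICE = "_esphomelib._tcp.local";

    private DnsNameReader() {
    }

    /** Reads a DNS name starting at offset, following compression pointers. */
    static String readName(byte[] packet, int offset) {
        StringBuilder name = new StringBuilder();
        int pos = offset;
        int jumps = 0;
        while (true) {
            if (pos >= packet.length) {
                throw new IllegalArgumentException("truncated name");
            }
            int len = packet[pos] & 0xFF;
            if (len == 0) {
                break; // root label terminates the name
            }
            if ((len & 0xC0) == 0xC0) {
                // Compression pointer: 14-bit offset from the start of the packet.
                if (pos + 1 >= packet.length || ++jumps > 16) {
                    throw new IllegalArgumentException("bad compression pointer");
                }
                pos = ((len & 0x3F) << 8) | (packet[pos + 1] & 0xFF);
                continue;
            }
            if ((len & 0xC0) != 0) {
                throw new IllegalArgumentException("unsupported label type");
            }
            pos++;
            if (pos + len > packet.length) {
                throw new IllegalArgumentException("truncated label");
            }
            if (name.length() > 0) {
                name.append('.');
            }
            name.append(new String(packet, pos, len, StandardCharsets.UTF_8));
            pos += len;
        }
        return name.toString();
    }

    /** True if the name is the ESPHome service type or an instance of it. */
    static boolean isEsphomeName(String name) {
        int start = name.length() - ESPHOME_SERVICE.length();
        return name.regionMatches(true, start, ESPHOME_SERVICE, 0, ESPHOME_SERVICE.length());
    }
}
```

The jump counter guards against pointer loops in malformed packets, which matters when parsing untrusted multicast traffic.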

I did not expect this to become so complicated. This is work in progress - too much AI code I did not check - feedback welcome.

Owner

seime commented Feb 9, 2026

Thanks @amandel.

Initial thoughts:

  1. I believe there are some efforts to fix/improve mDNS handling in OH core. Not sure if this applies to what you are trying to solve here? Or is this a commonly used "pattern" implemented by several bindings with the same needs?
  2. If going forward, should device matching be on deviceId rather than hostname/ip address?

Contributor Author

amandel commented Feb 9, 2026

Hi @seime,

thanks for the feedback.

  1. My initial thought was that I could just collect mDNS events and then trigger the reconnect. It turns out that these events are not delivered (not even at OS level) if nothing has changed, which is the case when a device is simply switched off and on again later. A graceful reboot leads to an mDNS goodbye followed by a new registration, which we can see via JmDNS. But I did not want to rely on a graceful shutdown. Therefore I started to listen directly on the multicast address so that I can capture the mDNS messages myself.
  2. All I have is the info that is shared via mDNS:
     [mdns:182]:   Services:
     [mdns:184]:   - _esphomelib, _tcp, 6053
     [mdns:187]:     TXT: friendly_name = esp-buero
     [mdns:187]:     TXT: version = 2026.1.3
     [mdns:187]:     TXT: mac = 3573bdee6af2
     [mdns:187]:     TXT: platform = ESP32
     [mdns:187]:     TXT: board = esp32dev
     [mdns:187]:     TXT: network = wifi
     [mdns:187]:     TXT: api_encryption = Noise_NNpsk0_25519_ChaChaPoly_SHA256
     [mdns:184]:   - _http, _tcp, 80

The friendly_name is populated with whatever is configured in the esphome yaml.
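
Point 1 above describes listening directly on the multicast address instead of relying on JmDNS events. A rough sketch of how that can look with plain `java.net` is shown here. It assumes IPv4 only and the standard mDNS group 224.0.0.251 on port 5353; the class name, buffer size, and timeout are illustrative choices, and this is not the PR's `ESPHomeRawMDNSListener`.

```java
import java.io.IOException;
import java.net.DatagramPacket;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.MulticastSocket;
import java.net.NetworkInterface;
import java.net.SocketTimeoutException;
import java.net.UnknownHostException;
import java.util.Arrays;
import java.util.function.Consumer;

/** Hypothetical sketch of a raw mDNS receiver (IPv4 only). */
final class RawMdnsSocketSketch {
    static final int MDNS_PORT = 5353;

    private RawMdnsSocketSketch() {
    }

    /** The IPv4 mDNS multicast group 224.0.0.251:5353. */
    static InetSocketAddress mdnsGroup() {
        try {
            InetAddress address = InetAddress.getByAddress(new byte[] {(byte) 224, 0, 0, (byte) 251});
            return new InetSocketAddress(address, MDNS_PORT);
        } catch (UnknownHostException e) {
            throw new IllegalStateException(e); // not expected for a 4-byte address
        }
    }

    /** Passes each received datagram to the callback until the thread is interrupted. */
    static void listen(NetworkInterface nic, Consumer<byte[]> onPacket) throws IOException {
        try (MulticastSocket socket = new MulticastSocket(null)) {
            socket.setReuseAddress(true); // other mDNS stacks may already use port 5353
            socket.bind(new InetSocketAddress(MDNS_PORT));
            socket.joinGroup(mdnsGroup(), nic);
            socket.setSoTimeout(1000); // wake up regularly to check the interrupt flag
            byte[] buffer = new byte[9000];
            while (!Thread.currentThread().isInterrupted()) {
                DatagramPacket packet = new DatagramPacket(buffer, buffer.length);
                try {
                    socket.receive(packet);
                } catch (SocketTimeoutException e) {
                    continue;
                }
                onPacket.accept(Arrays.copyOf(packet.getData(), packet.getLength()));
            }
        }
    }
}
```

Because the socket only receives and never sends queries, it sees unsolicited announcements regardless of any resolver cache state.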

I observed in the logs that we might get more trigger messages than expected, but since we can directly look up the current connection state to decide whether we need to act, this should be fine.
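
The point about duplicate triggers can be illustrated with a small state holder: a reappearance event only acts when the device is currently offline, and in that case it also resets the backoff. This is a sketch under assumptions, not the `ESPHomeHandler` code; the class name, the states, and the delay values are invented for the example.

```java
/** Hypothetical reconnect state; delays and states are illustrative. */
final class ReconnectStateSketch {
    enum State { ONLINE, CONNECTING, OFFLINE }

    private static final long INITIAL_DELAY_MS = 1_000;
    private static final long MAX_DELAY_MS = 60_000;

    private State state = State.OFFLINE;
    private long nextDelayMs = INITIAL_DELAY_MS;

    /** Records a failed attempt and returns the delay before the next scheduled one. */
    synchronized long onConnectFailed() {
        state = State.OFFLINE;
        long delay = nextDelayMs;
        nextDelayMs = Math.min(nextDelayMs * 2, MAX_DELAY_MS); // exponential backoff
        return delay;
    }

    /** Records a successful connection and resets the backoff. */
    synchronized void onConnected() {
        state = State.ONLINE;
        nextDelayMs = INITIAL_DELAY_MS;
    }

    /**
     * Called for every mDNS announcement of this device. Returns true only
     * if the caller should start a connection attempt right now.
     */
    synchronized boolean onDeviceReappeared() {
        if (state != State.OFFLINE) {
            return false; // already online or connecting: ignore duplicate triggers
        }
        nextDelayMs = INITIAL_DELAY_MS;
        state = State.CONNECTING;
        return true;
    }
}
```

With this shape, repeated announcements within a short time cost one state check each and cannot start parallel connection attempts.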

BTW: This mDNS listening shortens the already fast reconnect time even further, compared with the low retry values from before.

Owner

seime commented Feb 9, 2026

Regarding point 2: This is part of the mDNS traffic captured by Wireshark. I have replaced the esphome.name (which is called deviceId in the thing config) with MY_ESPHOME_DOT_NAME_FIELD.

Do you see such mDNS packets?

...
    Answers
        _esphomelib._tcp.local: type PTR, class IN, MY_ESPHOME_DOT_NAME_FIELD._esphomelib._tcp.local
            Name: _esphomelib._tcp.local
            Type: PTR (12) (domain name PoinTeR)
            .000 0000 0000 0001 = Class: IN (0x0001)
            0... .... .... .... = Cache flush: False
            Time to live: 4500 (1 hour, 15 minutes)
            Data length: 18
            Domain Name: MY_ESPHOME_DOT_NAME_FIELD._esphomelib._tcp.local
    Additional records
    [Unsolicited: True]
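
Given a PTR answer like the one in the capture, the deviceId is the instance label in front of the service type. A minimal sketch of that extraction follows; the class and method names are hypothetical and not taken from the binding.

```java
import java.util.Optional;

/** Hypothetical helper: extracts the instance label from a PTR target. */
final class EsphomeInstanceName {
    private static final String SERVICE_SUFFIX = "._esphomelib._tcp.local";

    private EsphomeInstanceName() {
    }

    /**
     * Returns the part before "._esphomelib._tcp.local", or empty if the
     * name is not an instance of the ESPHome service type.
     */
    static Optional<String> deviceId(String ptrTarget) {
        // Tolerate a fully qualified name with a trailing dot.
        String name = ptrTarget.endsWith(".") ? ptrTarget.substring(0, ptrTarget.length() - 1) : ptrTarget;
        int start = name.length() - SERVICE_SUFFIX.length();
        if (start <= 0 || !name.regionMatches(true, start, SERVICE_SUFFIX, 0, SERVICE_SUFFIX.length())) {
            return Optional.empty();
        }
        return Optional.of(name.substring(0, start));
    }
}
```

For example, `EsphomeInstanceName.deviceId("esp-buero._esphomelib._tcp.local")` yields `esp-buero`, while the bare service type `_esphomelib._tcp.local` yields an empty result.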

Contributor Author

amandel commented Feb 9, 2026

Yes I see this. And I missed this because in my setup this is always == hostname.

Contributor Author

amandel commented Feb 9, 2026

Now the deviceId is used. I'm not sure if we should do the same for the ESPHomeBluetoothProxyHandler?

Owner

seime commented Feb 9, 2026

> Now the deviceId is used. I'm not sure if we should do the same for the ESPHomeBluetoothProxyHandler?

Please elaborate

Contributor Author

amandel commented Feb 9, 2026

The esphomeHandlers map currently collects and reconnects only esphome thing types, not bluetooth ones. I'm not sure if we need to take care of both.

Owner

seime commented Feb 10, 2026

The Bluetooth proxy currently doesn't support connecting to devices; for the time being it just passively listens to broadcasts. But when it gets there, it would definitely be useful to have it reconnect configured things as soon as they reappear.

PRs welcome as always :)

Owner

seime commented Feb 14, 2026

I tested your branch, and the device came online in OH within a few seconds after regaining power. Nice!

I noticed this log message:
[INFO ] [ESPHome Raw mDNS Listener] [nal.discovery.ESPHomeRawMDNSListener] - Ignoring mDNS goodbye packet for _esphomelib._tcp.local

Since we don't use it (yet?), maybe set it to DEBUG level?

Another extension to this PR could be an option to switch off the automatic reconnect attempts and rely on this message-based one instead. Or simply recommend setting the reconnect interval values high?

Thanks for contributing!

Contributor Author

amandel commented Feb 15, 2026

Thanks for the feedback. The INFO level was indeed a leftover. We certainly must not trigger a reconnect when the device sends a goodbye. This happens on a proper shutdown, so the device should already have sent a disconnect request. I think there is nothing we can do with this message?
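
For context, an mDNS goodbye is an ordinary resource record announced with a TTL of 0 (RFC 6762), so it can be told apart from a normal announcement by reading the TTL field. The sketch below shows that check in isolation; the class and method names are hypothetical, and the caller is assumed to have already skipped the record's NAME.

```java
import java.nio.ByteBuffer;

/** Hypothetical helper: detects goodbye records by their TTL. */
final class MdnsGoodbyeSketch {
    private MdnsGoodbyeSketch() {
    }

    /**
     * Reads the unsigned 32-bit TTL of a resource record. The offset must
     * point just past the record's NAME, i.e. at the 2-byte TYPE field,
     * which is followed by the 2-byte CLASS and then the TTL.
     */
    static long ttlSeconds(byte[] packet, int offsetAfterName) {
        // ByteBuffer reads big-endian by default, which matches DNS wire order.
        return ByteBuffer.wrap(packet).getInt(offsetAfterName + 4) & 0xFFFFFFFFL;
    }

    /** A record with TTL 0 tells listeners to drop the record: a goodbye. */
    static boolean isGoodbye(long ttlSeconds) {
        return ttlSeconds == 0;
    }
}
```

A listener built this way can log goodbyes at DEBUG level and return early, without touching the reconnect logic.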

I wonder if I should change the Map<ThingUID, ESPHomeHandler> esphomeHandlers to a Map<String /*DeviceId*/, ESPHomeHandler> esphomeHandlers. This makes the lookup simpler but takes more effort to maintain, for example when the DeviceId is changed in the configuration. It also cannot cover multiple instances with the same DeviceId, which might occur because of a faulty configuration or even by intention. Given the number of devices even in large setups, this might be just a micro-optimization that increases complexity. What do you think?
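
To make the trade-off concrete, here is a sketch of the first variant: the map stays keyed by thing UID and a reappearance event scans the values for a matching deviceId. The `Handler` interface is a stand-in for `ESPHomeHandler`, thing UIDs are plain strings here, and all names are illustrative rather than the binding's real API.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical registry keyed by thing UID; lookup by deviceId is a linear scan. */
final class HandlerRegistrySketch {
    /** Stand-in for ESPHomeHandler; only what the dispatch needs. */
    interface Handler {
        String deviceId();

        void onDeviceReappeared();
    }

    private final Map<String, Handler> handlersByThingUid = new ConcurrentHashMap<>();

    void register(String thingUid, Handler handler) {
        handlersByThingUid.put(thingUid, handler);
    }

    void unregister(String thingUid) {
        handlersByThingUid.remove(thingUid);
    }

    /** Notifies every handler configured with this deviceId; returns how many matched. */
    int dispatchReappeared(String deviceId) {
        int notified = 0;
        for (Handler handler : handlersByThingUid.values()) {
            // The deviceId is read at dispatch time, so configuration changes
            // need no extra bookkeeping in the map.
            if (deviceId.equals(handler.deviceId())) {
                handler.onDeviceReappeared();
                notified++;
            }
        }
        return notified;
    }
}
```

The scan is linear in the number of handlers, but it handles duplicate deviceIds and changed configurations without special cases, which is the complexity argument made above.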

We could increase the defaults for the automatic reconnects, but I'm a bit more careful about different types of setups now ;). There might be setups where the mDNS messages do not reach the openHAB instance (some container setups). We could argue that such a setup needs a fix, but we can also keep the current, more fail-safe defaults and suggest changes depending on the setup.

Owner

seime commented Feb 15, 2026

IMHO ESPHome is moving forward at a pace faster than I can keep up with, especially with corner-case configurations and 100% feature parity (never had it and probably never will). Building and maintaining support for features nobody is actually using just slows down support for other future requests.

The codebase is getting increasingly complex, and I think some refactoring/splitting of ESPHomeHandler is due (but after merging open PRs). But if you think it is an important update (I have considered testing the "multidevice" features of ESPHome), I won't object or refuse to merge :)

Contributor Author

amandel commented Feb 15, 2026

I see. This one is mainly to address the use case where devices do not run 24/7, as also described by @magx2 in #68.

So I will not add more complexity and keep it as is. openhab-esphome is a key feature for my openHAB setup.

@amandel amandel marked this pull request as ready for review February 15, 2026 11:32
@amandel amandel force-pushed the feature/reconnect-if-device-comes-online branch from 420060a to edffaaa Compare February 16, 2026 14:39
* Add `ESPHomeRawMDNSListener` to listen for mDNS packets directly on port 5353, bypassing JmDNS to avoid caching issues.
* Implement manual mDNS packet parsing to detect `_esphomelib._tcp.local` services and extract deviceID.
* Update `ESPHomeHandlerFactory` to track active handlers and dispatch reappearance events based on IP or hostname matching.
* Add `onDeviceReappeared` to `ESPHomeHandler` to reset exponential backoff and trigger immediate connection.

Make mDNS code more readable
@amandel amandel force-pushed the feature/reconnect-if-device-comes-online branch from edffaaa to 12774e1 Compare February 16, 2026 14:41
Owner

seime commented Feb 20, 2026

@amandel is this still WIP or ready to merge?

@amandel amandel changed the title from "WIP: Implement raw mDNS listener for immediate reconnection" to "Implement raw mDNS listener for immediate reconnection" Feb 20, 2026
Contributor Author

amandel commented Feb 20, 2026

> @amandel is this still WIP or ready to merge?

Can be merged. I wanted to change the title a while ago, but obviously failed.

@seime seime closed this in #75 Feb 21, 2026
Owner

seime commented Feb 21, 2026

Thanks @amandel
