Conversation
Add test coverage Signed-off-by: Jacob Laursen <jacob-github@vindvejr.dk>
Signed-off-by: Jacob Laursen <jacob-github@vindvejr.dk>
rmichalak
left a comment
There was a problem hiding this comment.
It looks good on paper, however be aware I currently have no ways of testing it live with my alarm system. I'm still on openhab v2 and slowly upgrading my setup to v4 (which triggered me to raise the original issue after a few years of absence in the project).
I need to upgrade my dev env and it can take me a while to do so and be fully up to speed with everything that changed since v2.
|
Actually, there may be more to it... before your fix, these malformed responses where throwing an exception and this was triggering restart immediately. Now, they just return I agree that the changes you made make sense, but perhaps we should also distinguish between 'unknown' response (as in - not handled by this binding, like here) and erroneous response, which we should count and restart the binding if we cross the threshold. This may require more changes and I would need to go down to the protocol details once more, plus I will need to setup my dev environment from scratch 😓 |
|
Also mind that there's an alternative binding for the same alarm system: https://www.openhab.org/addons/bindings/paradoxalarm/ |
I agree, see also #17829 (comment). So in other words, it seems we already had this issue with this existing sanity check: which I stayed consistent with, but expanded to cover all cases, so we avoid throwing And how is it currently handled when e.g. a
I guess there is no rush, or have you had this reported by others as well? If it won't make it for 4.3, it can still be backported into a 4.3.x patch later. |
I don't know yet, I guess I need to check the protocol again and make sure we handle all the proper responses, even if only with
Correct, and that's perhaps why it makes sense to still merge #17829 (limited to receiver thread) as it applies a bandaid on this wound at least (even if not fitting perfectly).
No, I didn't have any reports. Honestly, I'm not sure if anyone is using this binding, apart from me... I guess the one with IP150 module is more popular (and the module itself is cheaper now, than it used to be a few years ago). I may even consider buying it, but since I'm already in this rabbit hole, I would like to leave my extension stable... |
But doesn't that PR suffer from this same problem:
? So maybe it does way too many (potentially unneeded) reconnections because of running into the parser bugs? Maybe it doesn't restart enough, if malformed messages are shorter than 4 characters? And then there is the already mentioned issue that it may hide bugs because it catches all exceptions - which is why I decided to give this PR a shot. Preferable you could test this when you have your development system up running again, although I realize that might be too late for 4.3. Have in mind that we can also include such bugfixes in patch releases. I think as a temporary solution (any maybe no one, even including you, is currently using the 4.2/4.3 version of the binding) this PR could be merged, because it must be better to gracefully handle malformed message (without triggering reconnection) than to let the binding crash, which will not trigger reconnection either, or..? |
The original PR I proposed (#17829) is tested as it's been working in my environment for over three years (on openhab2). I just reapplied it to openhab v4 codebase and as it's very simple in nature, it should work there as well. Agree it's rather rudimentary and we can do better, but it works at least....
What I observed with my alarm system (which may be very specific to my env), is that it starts with a few malformed responses and then the alarm system stops responding at all. So when we receive a malformed message, there's really no going back from this stage. we need to restart the connection. I tried with a few different baud rates and other connection settings but I couldn't achieve long-term stability. I tried exchanging all the components in the middle (cable, rs232 -> usb converter) without any luck. Perhaps some restarts are not needed because of shortcoming of the current implementation and you are right that some of the messages shorter than 4 characters may not trigger restart, but for this there's another safeguard to count the number of responses. If it does not match the number of messages sent (by a threshold), we trigger the restart as well. I may be able to run more tests in January and I think there's really no rush with it... Seems like I may be the only one using this binding (or it's indeed only about my faulty alarm system going south), as otherwise we would have had some reports about binding not working. |
Sorry, I missed this point, that actually it is an old fix and already tested.
👍 Since there doesn't seem to be any rush, do we agree to leave both PR's open for the time being, and then try to "get it right" once you find the time? |
Sure thing. let's do that! |
|
Superceded by #18035. |
This fixes possible
NumberFormatExceptionandIndexOutOfBoundsExceptionfor unexpected input.Also adds test coverage and debug logging of messages that could not be parsed. The coverage is based on the current logic rather than real messages, since I don't know what they look like.
Related to #17829
Resolves #17828