Suggestions: user defined watchdog, listen to log thru MQTT #26

Open
Nabla128k opened this issue Nov 4, 2020 · 3 comments

Comments

@Nabla128k

Hi,
I am trying to run my first pysmartnode device in the wild and I am thinking about a few enhancements:

  • User-serviceable watchdog: a component that runs a timer (on a scale of hours, for example) and subscribes to a specific MQTT topic. If it does not receive a message in time, it performs a soft reboot, or the lowest-level reinit of pysmartnode that is possible. Reason: I tried to write my own sensor component for the BME280 (https://github.com/Nabla128k/pysmartnode/blob/dev/pysmartnode/components/sensors/bme280.py), a beginner's experiment, and it is not reliable (partly maybe the fault of the sensor board itself and the nearly 100% humidity outside, I don't know). And the device is not easily accessible: I have to take the car, drive 20 minutes, walk through the garden, unlock the cabinet, and not forget the laptop in the car. So it would be nice to be able to trigger a reboot/reinitialization remotely as a first try.
    And I think a truly independent hardware watchdog is overkill.

  • for the same reason - possibility to listen, e.g. for some time, to a log trace from a pysmartnode device via MQTT

I am not sure, if I am able to develop such enhancements, so I am posting it here just for consideration and/or maybe to get some guidance/opposing...

@kevinkk525
Owner

kevinkk525 commented Nov 4, 2020

Hi,
thanks for opening this issue.

I'm trying to understand your situation:

  • An ESP8266 in the wild with a BME280
  • Somehow has a wifi connection as you are talking about MQTT
  • Assumption: Wireless network but no internet because you are talking about having to go there with a laptop?

user serviceable watchdog:

Feel free to write such a watchdog. The included one can easily be disabled in favor of a user-defined watchdog. The only purpose of the included watchdog is to reset a hanging ESP8266 unit, which rarely happens now but is still possible. The lowest-level reinit of pysmartnode is a machine.reset(), though.

Question:

Your watchdog should reset if it doesn't receive a message within a specified time. Why? If the device hangs, the current watchdog will reset it. If the MQTT connection is lost, the client will try to reconnect, and you can actually hook into all wifi events. When the MQTT client can't reconnect, it will reconnect the wifi, which triggers a "wifi down" event that you can subscribe to with a callback:

from pysmartnode import config

# Define the callback before registering it, otherwise the name is not bound yet.
def myCallback(mqtt, state):
    print("wifi is", state)

mqtt = config.getMQTT()
mqtt.registerWifiCallback(myCallback)

(Note: I just realized I had some bugs in the wifi callback.. fixed in a22c23a and c7152b8)
Anyway, that's just a suggestion for mqtt because the mqtt client sends ping messages to the broker anyway. So if the goal is just to reset if the connection is unstable, then this would work too.
Or is the goal to reset if there are no publications from the sensor? That could be solved without MQTT by querying the timestamp of the sensor's last reading: bme.getTimestamp(SENSOR_TEMPERATURE)
and then comparing it with your timeout. The timestamp is always that of the last successful reading.
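
A minimal sketch of that timestamp comparison, kept as a pure function so it can be tried off-device. It assumes the timestamp and the current time are plain numbers in the same unit (seconds here); the function name and the choice to treat "no reading yet" as stale are illustrative assumptions, not part of pysmartnode. If the timestamps come from time.ticks_ms(), the subtraction would need time.ticks_diff() instead.

```python
def reading_is_stale(last_timestamp, now, timeout_s):
    """Return True if the last successful reading is older than timeout_s.

    last_timestamp: time of the last successful reading, or None if there
    has been none yet (treated as stale here, which is an assumption).
    now: current time in the same unit as last_timestamp.
    """
    if last_timestamp is None:
        return True
    return (now - last_timestamp) > timeout_s


# Hypothetical on-device use (not verified against pysmartnode):
#   if reading_is_stale(bme.getTimestamp(SENSOR_TEMPERATURE), time.time(), 3600):
#       machine.reset()
```

Keeping the check separate from the reset call makes it easy to log the condition first and only reset once the behavior has been observed for a while.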

BME280

Your module looks good! I'd be happy about a pull request once you feel that it works reliably for you.
I can't see anything in the code that would indicate a reliability problem. What exactly is the problem you're seeing? Many failed readings? Does it just hang somewhere?

I have 3 suggestions for the module:

  • Lines 236 to 238 are not necessary; the reading values are rounded in the background according to the provided arguments "precision_temperature" (etc.).
  • In addition to the last point, you're storing the readings as strings. This will cause all kinds of problems if you use the sensor with other components, because they rely on a temperature sensor returning a float and not a string. (Using a string also prevents the base class ComponentSensor from applying the rounding and offset configuration.)
  • In case of a failed reading you are not updating the stored value. It would be better to _setValue(SENSOR_xxx, None) so that other components querying your sensor recognize an invalid reading.
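
The last two suggestions can be sketched as a small conversion helper: keep readings numeric and map anything unusable to None. The helper name to_reading is hypothetical and not part of pysmartnode; the result would then be what gets passed to _setValue(SENSOR_xxx, ...) in the sensor module.

```python
def to_reading(raw):
    """Convert a raw sensor value to float, or None if it is not usable."""
    try:
        value = float(raw)
    except (TypeError, ValueError):
        # None, malformed strings and similar count as a failed reading.
        return None
    if value != value:
        # NaN is the only float that is not equal to itself.
        return None
    return value
```

With this shape a failed reading overwrites the previous value with None instead of leaving a stale number in place, so components that query the sensor can tell the reading is invalid.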

possibility to listen, e.g. for some time, to a log trace from a pysmartnode device via MQTT

Sorry I can't really follow what you are trying to achieve here. Which device should listen to what log trace?
The logs are sent via MQTT, and on the server you can run SmartServer (https://github.com/kevinkk525/SmartServer) to store all those log messages and then access them via FTP, for example. What exactly are you thinking about?


I am not sure, if I am able to develop such enhancements, so I am posting it here just for consideration and/or maybe to get some guidance/opposing...

Thanks for posting. Discussions always help to improve a project.

@Nabla128k
Author

Oh, thanks for the really detailed answer.
Watchdog: my device has wifi connectivity, and the access point is accessible from outside. But I have no way to trigger a reset of the pysmartnode unit remotely if I want to try to fix faulty readings from the BME280. So I was thinking about sending refresh messages from access point to my watchdog and if I stop sending, node will reboot. The access point is a Raspberry Pi connected via a GSM modem, and the MQTT broker is outside, behind a VPN. So sometimes GSM hangs, sometimes the VPN hangs (I am now trying to detect both on the Pi and re-establish the connections). During these disruptions the node has no access to the MQTT broker, but the wifi connection may be OK (and probably is).
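
For illustration, the refresh-message idea could look roughly like the sketch below. Everything in it (the class name, the injected reset and clock callables) is a hypothetical shape and not pysmartnode API; on the device, reset would presumably be machine.reset, and feed() would be called from whatever callback receives the refresh message. Note that with the setup described above, such a watchdog would also fire during a GSM or VPN outage, because refresh messages stop arriving even though the node itself is fine.

```python
class RefreshWatchdog:
    """Call reset() when feed() has not been called for timeout_s seconds."""

    def __init__(self, timeout_s, reset, clock):
        self._timeout_s = timeout_s
        self._reset = reset      # e.g. machine.reset on the device (assumption)
        self._clock = clock      # callable returning the current time in seconds
        self._last_feed = clock()

    def feed(self):
        """Record that a refresh message has just arrived."""
        self._last_feed = self._clock()

    def check(self):
        """Reset if the timeout has expired; meant to be called periodically."""
        if self._clock() - self._last_feed > self._timeout_s:
            self._reset()
            return True
        return False
```

Injecting the clock and the reset function keeps the logic testable on a PC before it goes into the cabinet.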

BME280: in fact I don't know what's happening. Sometimes it starts showing constant values for all sensors that never change. In other cases it stops sending numeric values (no values at all) and HomeAssistant says the entity is non-numeric. I don't see MQTT messages from the BME, but the other sensor (DS18S20) is OK and the node is alive. A soft reboot sometimes helps, sometimes not. And that is the reason for the last point: making the node log accessible via MQTT (next point).

Log: I just want to look at the node log without a serial connection to the ESP8266, so via MQTT. I didn't know it is already sending the log out via MQTT. I will try SmartServer.

So thanks again and sorry for stealing your time with my amateur approach :-)

@kevinkk525
Owner

kevinkk525 commented Nov 5, 2020

OK, so the device is connected to an MQTT broker through an RPi with a GSM module, and there is no broker on that Pi. That means a lot of messages will get lost if the broker is unreachable.
That also means that all log messages sent over mqtt are lost when the connection is down. I'd suggest thinking about having an mqtt broker on the RPI and creating a bridge to the broker behind the VPN: http://www.steves-internet-guide.com/mosquitto-bridge-configuration/
That way you can run SmartServer on the RPI and have all log messages from the device, even if the GSM and VPN connection are offline for a time.
Note: Only log messages by the logging module are sent over mqtt, not the whole repl/serial output! (but all important things are covered by the logging module).

So I was thinking about sending refresh messages from access point to my watchdog and if I stop sending, node will reboot.

When would you stop sending a refresh message? You would somehow need to recognize faulty readings. So if you use a program that analyzes the published BME messages, then you could use a module on the device that just listens to the topic "reboot" and resets when it receives a message there. But of course a watchdog-like module would work too. However, you could just as well program a module on the device itself to check for faulty BME readings.
And if you think about resetting the device manually, you can access it using webrepl if your VPN is configured so that you can reach the network your RPi and device are in.
Using webrepl you can also look at the repl/serial output manually.
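
A rough sketch of the "listen to topic reboot" module mentioned above. The callback signature (topic, payload, retained), the factory name and the expected payload are assumptions for illustration; how the handler gets subscribed in pysmartnode is not shown here. Ignoring retained messages is a deliberate choice in this sketch, since a retained reboot command would otherwise be delivered again after every restart.

```python
def make_reboot_handler(reset, expected_payload="1"):
    """Build a message handler that calls reset() on a matching payload.

    reset: callable performing the reset, e.g. machine.reset on the device.
    expected_payload: payload that triggers the reset (hypothetical value).
    """
    def on_message(topic, payload, retained=False):
        if retained:
            # A retained command would trigger a reset on every reconnect.
            return False
        if isinstance(payload, bytes):
            payload = payload.decode()
        if payload != expected_payload:
            return False
        reset()
        return True

    return on_message
```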

BME280: Does it say anything in the logs while it shows constant values? like reading errors maybe? (Guess you don't know yet because you haven't saved the logs from mqtt yet)
Non-numeric values: I guess this could be possible because you save string values instead of float in your module.

So thanks again and sorry for stealing your time with my amateur approach :-)

Nah you're not stealing my time. You're showing me how to improve my project to make it better and easier to use :)
