One of my Raspberry Pi 3 Model B+ boards is hitting a “Watchdog timeout” about every 30 minutes. Each time, the piaware service is killed and restarted. Any suggestions? Here are the log entries:
Aug 09 00:18:53 raspberrypi systemd[1]: Cannot find unit for notify message of PID 2867.
Aug 09 00:19:17 raspberrypi systemd[1]: piaware.service: Watchdog timeout (limit 2min)!
Aug 09 00:19:17 raspberrypi systemd[1]: piaware.service: Killing process 29531 (piaware) with signal SIGABRT.
Aug 09 00:19:17 raspberrypi systemd[1]: piaware.service: Killing process 29567 (fa-mlat-client) with signal SIGABRT.
Aug 09 00:19:17 raspberrypi systemd[1]: piaware.service: Killing process 29576 (faup1090) with signal SIGABRT.
Aug 09 00:19:17 raspberrypi systemd[1]: piaware.service: Main process exited, code=killed, status=6/ABRT
Aug 09 00:19:17 raspberrypi systemd[1]: piaware.service: Unit entered failed state.
Aug 09 00:19:17 raspberrypi systemd[1]: piaware.service: Failed with result 'watchdog'.
Aug 09 00:19:47 raspberrypi systemd[1]: piaware.service: Service hold-off time over, scheduling restart.
Aug 09 00:19:47 raspberrypi systemd[1]: Stopped FlightAware ADS-B uploader.
Aug 09 00:19:47 raspberrypi systemd[1]: Started FlightAware ADS-B uploader.
Aug 09 00:19:48 raspberrypi piaware[3043]: creating pidfile /run/piaware/piaware.pid
Aug 09 00:19:48 raspberrypi piaware[3043]: ****************************************************
Aug 09 00:19:48 raspberrypi piaware[3043]: piaware version 8.2 is running, process ID 3043
Aug 09 00:19:48 raspberrypi piaware[3043]: your system info is: Linux raspberrypi 4.19.66-v7+ #1253 SMP Thu Aug 15 11:49:46 BST 2019 armv7l GNU/Linux
Aug 09 00:19:49 raspberrypi piaware[3043]: Connecting to FlightAware adept server at piaware.flightaware.com/1200
Aug 09 00:19:49 raspberrypi piaware[3043]: Connection with adept server at piaware.flightaware.com/1200 established
Aug 09 00:19:49 raspberrypi piaware[3043]: TLS handshake with adept server at piaware.flightaware.com/1200 completed
Aug 09 00:19:50 raspberrypi piaware[3043]: FlightAware server certificate validated
Aug 09 00:19:50 raspberrypi piaware[3043]: encrypted session established with FlightAware
The first two log lines look like cause and effect: going by the “Cannot find unit for notify message” entry, systemd is losing the watchdog reset notify message. But I don’t know why the reset is getting lost or attributed to the wrong PID, and I don’t know why it would take 30 minutes to break.
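For anyone unfamiliar with the mechanism: a service under a systemd watchdog has to keep sending `WATCHDOG=1` datagrams to the socket named in `$NOTIFY_SOCKET`, and systemd has to be able to match the sending PID to the unit. Below is a minimal sketch of that protocol in Python. It is a generic illustration only, not how piaware itself sends its keepalive; the function name `sd_notify` and the `socket_path` override are my own choices for the example.

```python
import os
import socket


def sd_notify(message, socket_path=None):
    """Send one notification datagram to systemd.

    Returns True if a datagram was sent, False if no notify socket is known.
    `socket_path` is a hypothetical override for testing; normally the path
    comes from the NOTIFY_SOCKET environment variable set by systemd.
    """
    path = socket_path or os.environ.get("NOTIFY_SOCKET")
    if not path:
        # Not started by systemd with a notify socket (e.g. run by hand).
        return False
    if path.startswith("@"):
        # A leading "@" denotes a Linux abstract-namespace socket.
        path = "\0" + path[1:]
    with socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM) as sock:
        sock.sendto(message.encode("utf-8"), path)
    return True
```

A service would call `sd_notify("WATCHDOG=1")` well inside the watchdog limit (2 minutes in the log above). If systemd cannot attribute the datagram to the unit, the timer is not reset, which would be consistent with the symptom in the log.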
Linux version 4.19.66-v7+
CPU: ARMv7 Processor [410fd034] revision 4 (ARMv7)
CPU: div instructions available: patching division code
CPU: PIPT / VIPT nonaliasing data cache, VIPT aliasing instruction cache
OF: fdt: Machine model: Raspberry Pi 3 Model B Plus Rev 1.3
Yes… the Watchdog timeouts occur anywhere from 30 minutes to an hour apart, at fairly random intervals. I have four Raspberry Pis with FlightAware dongles, and this is the only one of the four that is affected. The other three are on the same network, same network switch, same Internet router… no Watchdog timeouts. It only affects piaware.service on the one Pi; none of the other services on that Pi are being restarted. As soon as it’s back up, it starts reporting aircraft positions again… then the service gets killed and the cycle starts over.
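To put numbers on "30 minutes to an hour", the gaps between timeouts can be pulled out of the journal. This is a rough sketch, assuming lines in the same format as the log above (for example from `journalctl -u piaware`). The function name `watchdog_gaps` is made up for the example, and the `year` argument is only a placeholder because this timestamp format carries no year.

```python
import re
from datetime import datetime

# Matches e.g. "Aug 09 00:19:17 raspberrypi systemd[1]: piaware.service: Watchdog timeout (limit 2min)!"
WATCHDOG_RE = re.compile(
    r"^(\w{3} \d{2} \d{2}:\d{2}:\d{2}) .*piaware\.service: Watchdog timeout"
)


def watchdog_gaps(lines, year=2000):
    """Return the minutes between consecutive piaware watchdog timeouts.

    `year` is an assumed placeholder; gaps are only meaningful when all
    lines fall within the same calendar year.
    """
    times = []
    for line in lines:
        match = WATCHDOG_RE.match(line)
        if match:
            stamp = f"{year} {match.group(1)}"
            times.append(datetime.strptime(stamp, "%Y %b %d %H:%M:%S"))
    # Pair each timeout with the next one and convert the difference to minutes.
    return [(b - a).total_seconds() / 60 for a, b in zip(times, times[1:])]
```

Feeding it the saved journal lines gives one number per restart cycle, which makes it easier to see whether the interval is truly random or clusters around some period.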
Right. Bring it back to a known-good software state so that we can see whether it’s something specific to your environment (network, etc.) or the result of something done to the image. Maybe try it on a spare SD card.
(Rebuilding is something you should have a plan for anyway: SD cards do fail.)