CFBMC-8277: Node reboots because of a missed BMC heartbeat
Issue
- The node reboots after the BMC heartbeat is missed:
[Node-01: spmgrd: sp.heartbeat.stopped:info]: Have not received a IPMI heartbeat from the Service Processor (SP) in last 600 seconds.
[Node-01: spmgrd: sp.heartbeat.stopped:info]: Have not received a IPMI heartbeat from the Service Processor (SP) in last 600 seconds.
[Node-01: spmgrd: callhome.sp.hbt.missed:notice]: Call home for SP HBT MISSED
[Node-01: spmgrd: callhome.sp.hbt.stopped:alert]: Call home for SP HBT STOPPED
[Node-01: env_mgr: sp.ipmi.lost.shutdown:EMERGENCY]: SP heartbeat stopped and cannot be recovered. To prevent hardware damage and data loss, the system will shut down in 10 minutes.
[Node-01: env_mgr: monitor.shutdown.emergency:EMERGENCY]: Emergency shutdown: Environmental Reason Shutdown (System reboot to recover the BMC)
- BMC is running 13.12
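The event messages above share the form [node: process: event.name:severity]: message. To check a saved log for the same sequence, a minimal sketch along these lines may help. It assumes the messages are available as plain text; the header pattern is inferred from this excerpt only, and the function names (parse_events, heartbeat_events) are hypothetical helpers, not part of any vendor tool.

```python
import re

# Header pattern inferred from the excerpt (an assumption, not a documented format):
#   [node: process: event.name:severity]: message
HEADER = re.compile(
    r"\[(?P<node>[^:\[\]]+): (?P<process>[^:\[\]]+): "
    r"(?P<event>[^:\[\]]+):(?P<severity>[^:\[\]]+)\]:\s*"
)


def parse_events(text):
    """Split log text into event dicts.

    Works whether entries are one per line or run together on a single line,
    because each message is taken as the text up to the next header.
    """
    headers = list(HEADER.finditer(text))
    events = []
    for i, header in enumerate(headers):
        end = headers[i + 1].start() if i + 1 < len(headers) else len(text)
        event = header.groupdict()
        event["message"] = text[header.end():end].strip()
        events.append(event)
    return events


def heartbeat_events(text):
    """Keep only events whose name has an 'sp' component, as in the excerpt."""
    return [
        e for e in parse_events(text)
        if e["event"].startswith("sp.") or ".sp." in e["event"]
    ]


# Sample text copied from the messages quoted in this article.
sample = (
    "[Node-01: spmgrd: sp.heartbeat.stopped:info]: Have not received a IPMI "
    "heartbeat from the Service Processor (SP) in last 600 seconds. "
    "[Node-01: spmgrd: callhome.sp.hbt.stopped:alert]: Call home for SP HBT STOPPED "
    "[Node-01: env_mgr: sp.ipmi.lost.shutdown:EMERGENCY]: SP heartbeat stopped and "
    "cannot be recovered. To prevent hardware damage and data loss, the system will "
    "shut down in 10 minutes. "
    "[Node-01: env_mgr: monitor.shutdown.emergency:EMERGENCY]: Emergency shutdown: "
    "Environmental Reason Shutdown (System reboot to recover the BMC)"
)

for e in heartbeat_events(sample):
    print(e["node"], e["severity"], e["event"])
```

The filter keys on the event name rather than on the message text, so it still matches if the wording of a message differs between releases.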
