CFBMC-2571: ONTAP cluster node reboots unexpectedly in BMC 15.12
Issue
BMC 15.12 (AFF A250, AFF C250, ASA A250, ASA C250 or FAS500 systems)
- ONTAP cluster node reboots unexpectedly:
[node_name: spmgrd: callhome.sp.hbt.stopped:alert]: Call home for SP HBT STOPPED
[node_name: env_mgr: sp.ipmi.lost.shutdown:EMERGENCY]: SP heartbeat stopped and cannot be recovered. To prevent hardware damage and data loss, the system will shut down in 10 minutes.
[node_name: env_mgr: monitor.shutdown.emergency:EMERGENCY]: Emergency shutdown: Environmental Reason Shutdown (System reboot to recover the BMC)
- ONTAP may report hw_assist errors in the event log:
[node01: cf_hwassist: cf.hwassist.missedKeepAlive:error]: HW-assisted takeover missing keep-alive messages from HA partner (node02)
[node01: cf_hwassist: cf.hwassist.recvKeepAlive:info]: hw_assist: Received hw_assist KeepAlive alert from partner(node02)
- BMC system event log shows that the BMC performed a software reset:
Pilot Software reset
Kernel Panic Reboot
Or
FPGA pull BMC whole reset
Pilot FPGA AC cycle
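
The ONTAP event names quoted in the symptoms above can be searched for in a saved copy of the event log. The following is a minimal Python sketch, assuming the log lines follow the bracketed layout shown in this article ([node: source: event.name:severity]: message). The names SIGNATURES, EVENT_RE and find_signatures are hypothetical and are not part of any NetApp tool.

```python
import re

# Event names taken from the symptoms listed in this article.
SIGNATURES = (
    "callhome.sp.hbt.stopped",
    "sp.ipmi.lost.shutdown",
    "monitor.shutdown.emergency",
    "cf.hwassist.missedKeepAlive",
)

# Assumed layout: [node: source: event.name:severity]: message
EVENT_RE = re.compile(
    r"\[(?P<node>[^:\]]+):\s*(?P<source>[^:\]]+):\s*"
    r"(?P<event>[\w.]+):(?P<severity>\w+)\]"
)


def find_signatures(log_text):
    """Return (node, event, severity) for each line that matches a known signature."""
    hits = []
    for line in log_text.splitlines():
        match = EVENT_RE.search(line)
        if match and match.group("event") in SIGNATURES:
            hits.append(
                (match.group("node"), match.group("event"), match.group("severity"))
            )
    return hits
```

A match only indicates that the log contains the same event names as this issue. The BMC system event log entries listed above still need to be checked separately.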