SP heartbeat stopped on FAS8200 or A300
Applies to
- All ONTAP versions
- AFF A300 / FAS8200
Issue
The controller is not responsive. The EMS reports:
[spsm_listener: callhome.sp.hbt.stopped:alert]: Call home for SP HBT STOPPED
[env_mgr: sp.ipmi.lost.shutdown:EMERGENCY]: SP heartbeat stopped and cannot be recovered. To prevent hardware damage and data loss, the system will shut down in 2 minutes.
[env_mgr: sp.ipmi.lost.shutdown:EMERGENCY]: SP heartbeat stopped and cannot be recovered. To prevent hardware damage and data loss, the system will shut down in 2 minutes.
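To check a saved log for this signature, the EMS lines above can be matched by event name. The following is a minimal sketch, assuming the EMS output has been exported to plain text with one event per line in the bracketed format shown above. The function name `find_sp_heartbeat_events` and the regular expression are illustrative assumptions and not part of any ONTAP tooling.

```python
import re

# Event names copied from the EMS messages quoted in this article.
SP_HEARTBEAT_EVENTS = ("callhome.sp.hbt.stopped", "sp.ipmi.lost.shutdown")

# Assumed line shape: "[source: event.name:severity]: message"
EMS_LINE = re.compile(
    r"\[(?P<source>[^:\]]+):\s*(?P<event>[\w.]+):(?P<severity>\w+)\]:\s*(?P<message>.*)"
)


def find_sp_heartbeat_events(lines):
    """Return (event, severity, message) tuples for SP heartbeat events.

    Hypothetical helper: it only recognizes the two event names listed
    in SP_HEARTBEAT_EVENTS and ignores every other line.
    """
    hits = []
    for line in lines:
        match = EMS_LINE.search(line)
        if match and match.group("event") in SP_HEARTBEAT_EVENTS:
            hits.append(
                (match.group("event"), match.group("severity"), match.group("message"))
            )
    return hits


if __name__ == "__main__":
    sample = [
        "[spsm_listener: callhome.sp.hbt.stopped:alert]: Call home for SP HBT STOPPED",
        "[env_mgr: sp.ipmi.lost.shutdown:EMERGENCY]: SP heartbeat stopped and cannot be recovered.",
    ]
    for event, severity, message in find_sp_heartbeat_events(sample):
        print(f"{severity:>10}  {event}  {message}")
```

A match on both event names suggests the log belongs to this issue; a match on only one is worth reviewing manually before drawing conclusions.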
In some cases, the issue is accompanied by the following errors:
Record 2165: Mon Oct 21 22:40:01 2024 [SysFW.notice]: Failed to recover SP<---
Record 2167: Mon Oct 21 22:40:01 2024 [SysFW.notice]: IPMI:Get midplane FRU 0 inventory:failed
Record 2168: Thu Jan 1 00:05:00 1970 [Trap Event.critical]: hwassist post_error (26)
Record 2169: Thu Jan 1 00:05:00 1970 [Trap Event.critical]: SNMP post_error (26)
Record 2170: Mon Oct 21 22:40:02 2024 [SysFW.critical]: IPMI PCI Slot Control failed.
Record 2171: Thu Jan 1 00:05:01 1970 [Trap Event.critical]: hwassist post_error (26)