Node down due to "CPU_Cat_Error" event
Applies to
- AFF A220
- FAS2750
- FAS2720
- FAS2650
- AFF C190
- AFF A800
Issue
- No error messages found in the BMC console logs.
- The BMC events show CPU_Cat_Error just before the heartbeat stops:
Record 1458: Mon Mar 30 13:15:07.660000 2020 [IPMI.notice]: 00ba | 02 | EVT: 0301ffff | CPU_Cat_Error | Assertion Event, "State Asserted"
Record 1459: Mon Mar 30 13:26:15.570000 2020 [BMC.critical]: Heartbeat stopped
- The node reboots unexpectedly due to loss of the SP heartbeat and is power cycled:
Mon Apr 27 11:08:13 +0200 [Node-02: cf_hwassist: cf.hwassist.takeoverTrapRecv:notice]: hw_assist: Received takeover hw_assist alert from partner(Node-01), system_down because power_cycle_via_sp.
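
The pattern described above (a CPU_Cat_Error assertion followed by "Heartbeat stopped") can be searched for in a saved copy of the BMC event log. The following is only a minimal sketch, not a NetApp tool: the record layout is inferred solely from the two sample records in this article, the function names are hypothetical, and the day and month abbreviations are assumed to be in English.

```python
import re
from datetime import datetime

# Record layout assumed from the two sample records in this article:
# "Record <n>: <timestamp> [<source>]: <message>"
RECORD_RE = re.compile(
    r"Record\s+(?P<num>\d+):\s+"
    r"(?P<ts>\w{3} \w{3} +\d+ [\d:.]+ \d{4})\s+"
    r"\[(?P<src>[\w.]+)\]:\s+(?P<msg>.*)"
)


def parse_record(line):
    """Return (record number, timestamp, source, message), or None if no match."""
    m = RECORD_RE.match(line.strip())
    if m is None:
        return None
    ts = datetime.strptime(m["ts"], "%a %b %d %H:%M:%S.%f %Y")
    return int(m["num"]), ts, m["src"], m["msg"]


def find_cat_error_before_heartbeat_stop(lines):
    """Pair each CPU_Cat_Error assertion with the next 'Heartbeat stopped' record.

    Returns a list of (cat_error_record, heartbeat_record, seconds_between).
    """
    pairs = []
    last_cat = None
    for line in lines:
        rec = parse_record(line)
        if rec is None:
            continue  # skip lines that do not look like event records
        num, ts, _src, msg = rec
        if "CPU_Cat_Error" in msg and "Assertion Event" in msg:
            last_cat = rec
        elif "Heartbeat stopped" in msg and last_cat is not None:
            pairs.append((last_cat[0], num, (ts - last_cat[1]).total_seconds()))
            last_cat = None
    return pairs


# The two records quoted in this article, used as sample input.
SAMPLE = [
    'Record 1458: Mon Mar 30 13:15:07.660000 2020 [IPMI.notice]: 00ba | 02 | '
    'EVT: 0301ffff | CPU_Cat_Error | Assertion Event, "State Asserted"',
    "Record 1459: Mon Mar 30 13:26:15.570000 2020 [BMC.critical]: Heartbeat stopped",
]

if __name__ == "__main__":
    for cat_num, hb_num, gap in find_cat_error_before_heartbeat_stop(SAMPLE):
        print(f"CPU_Cat_Error (record {cat_num}) precedes "
              f"Heartbeat stopped (record {hb_num}) by {gap:.2f} s")
```

To check a real log, pass the lines of the saved event log file to find_cat_error_before_heartbeat_stop instead of SAMPLE. Lines that do not match the assumed layout are ignored.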
