AFF A700/FAS9000 rebooted due to a watchdog reset
Applies to
- AFF A700
- FAS9000
Issue
- An L2 watchdog reset causes a node to reboot unexpectedly
- Service Processor (SP) event logs from the node report:
Record 676: Sat Sep 03 07:48:57.485823 2022 [IPMI Event.critical]: NMI
Record 677: Sat Sep 03 07:48:58.692798 2022 [IPMI Event.critical]: L2 watchdog timeout hard reset
Record 678: Sat Sep 03 07:48:58.730798 2022 [Trap Event.critical]: hwassist l2_watchdog_reset (29)
Record 683: Sat Sep 03 07:49:13.116765 2022 [IPMI.notice]: 3f04 | 02 | EVT: 6fc804ff | System_Watchdog | Assertion Event, "Timer interrupt"
Record 684: Sat Sep 03 07:49:13.132351 2022 [IPMI.notice]: 4004 | 02 | EVT: 6fc104ff | System_Watchdog | Assertion Event, "Hard reset"
Record 685: Sat Sep 03 07:49:43.344481 2022 [IPMI.notice]: 4104 | 02 | EVT: 6f02ffff | PCM_Status | Assertion Event, "Fault"
- After the reboot, an error similar to the following may be observed during boot and in EMS:
[cluster-01:mgr.boot.reason_abnormal:EMERGENCY]: System rebooted due to a watchdog reset.
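When reviewing a saved copy of the SP event log, it can help to filter for the watchdog-related records shown above. The following is a minimal sketch, not a NetApp tool: it assumes each record follows the "Record N: <timestamp> [<source>]: <message>" layout seen in the excerpt, and the names `SpEvent`, `parse_sp_event`, `find_watchdog_events`, and `SAMPLE_LINES` are hypothetical helpers introduced only for illustration.

```python
import re
from datetime import datetime
from typing import Iterable, List, NamedTuple, Optional

# Assumed record layout, inferred from the SP event log excerpt in this article:
#   Record <n>: <Day Mon DD HH:MM:SS.ffffff YYYY> [<source>]: <message>
RECORD_RE = re.compile(
    r"^Record (?P<num>\d+): "
    r"(?P<ts>\w{3} \w{3} \d{2} \d{2}:\d{2}:\d{2}\.\d+ \d{4}) "
    r"\[(?P<source>[^\]]+)\]: (?P<message>.*)$"
)


class SpEvent(NamedTuple):
    """One parsed SP event log record (hypothetical helper type)."""
    number: int
    timestamp: datetime
    source: str
    message: str


def parse_sp_event(line: str) -> Optional[SpEvent]:
    """Parse a single log line; return None if it does not match the assumed layout."""
    match = RECORD_RE.match(line.strip())
    if match is None:
        return None
    timestamp = datetime.strptime(match["ts"], "%a %b %d %H:%M:%S.%f %Y")
    return SpEvent(int(match["num"]), timestamp, match["source"], match["message"])


def find_watchdog_events(lines: Iterable[str]) -> List[SpEvent]:
    """Return the parsed records whose message mentions a watchdog (case-insensitive)."""
    events = (parse_sp_event(line) for line in lines)
    return [e for e in events if e is not None and "watchdog" in e.message.lower()]


# Sample input copied from the SP event log excerpt in this article.
SAMPLE_LINES = [
    'Record 676: Sat Sep 03 07:48:57.485823 2022 [IPMI Event.critical]: NMI',
    'Record 677: Sat Sep 03 07:48:58.692798 2022 [IPMI Event.critical]: L2 watchdog timeout hard reset',
    'Record 678: Sat Sep 03 07:48:58.730798 2022 [Trap Event.critical]: hwassist l2_watchdog_reset (29)',
    'Record 683: Sat Sep 03 07:49:13.116765 2022 [IPMI.notice]: 3f04 | 02 | EVT: 6fc804ff | System_Watchdog | Assertion Event, "Timer interrupt"',
    'Record 684: Sat Sep 03 07:49:13.132351 2022 [IPMI.notice]: 4004 | 02 | EVT: 6fc104ff | System_Watchdog | Assertion Event, "Hard reset"',
    'Record 685: Sat Sep 03 07:49:43.344481 2022 [IPMI.notice]: 4104 | 02 | EVT: 6f02ffff | PCM_Status | Assertion Event, "Fault"',
]

if __name__ == "__main__":
    for event in find_watchdog_events(SAMPLE_LINES):
        print(event.number, event.timestamp.isoformat(), event.source, event.message)
```

Against the sample records, the filter keeps records 677, 678, 683, and 684 and skips the NMI and PCM_Status entries. Matching on the word "watchdog" is a deliberately loose choice for this sketch; a stricter filter could match only the "L2 watchdog timeout hard reset" or "l2_watchdog_reset" strings.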