Handling L2 Watchdog Resets on the AFF A320 Platform
Applies to
- AFF A320
Issue
- Node reboots unexpectedly
- Node does not reboot after an unexpected shutdown
BMC logs on the impacted node show the following:
Record 402: Thu May 05 06:20:35.070000 2022 [ASUP.notice]: First notification email | (REBOOT (abnormal)) WARNING | Send failed
Record 403: Thu May 05 06:20:40.640000 2022 [IPMI.notice]: 0076 | 02 | EVT: 6fc302ff | System_Watchdog | Assertion Event, "Power cycle"
Record 404: Thu May 05 06:20:40.640000 2022 [IPMI Event.critical]: L2 watchdog timeout power cycle
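When a saved BMC event log is long, a small script can pick out the watchdog records. The following is a minimal sketch, not a NetApp tool. It assumes every record follows the "Record <n>: <timestamp> [<source>.<severity>]: <message>" layout seen in the three records above, and the function name find_watchdog_records is hypothetical.

```python
import re

# Assumed record layout, inferred only from the sample records above:
# "Record <n>: <timestamp> [<source>.<severity>]: <message>"
RECORD_RE = re.compile(r"^Record (\d+): (.+?) \[([^\]]+)\]: (.*)$")


def find_watchdog_records(lines):
    """Return (record number, timestamp, source, message) for watchdog events."""
    hits = []
    for line in lines:
        match = RECORD_RE.match(line.strip())
        if not match:
            continue  # skip lines that do not look like event records
        number, timestamp, source, message = match.groups()
        # Match on the two strings that appear in the sample watchdog records.
        if "L2 watchdog timeout" in message or "System_Watchdog" in message:
            hits.append((int(number), timestamp, source, message))
    return hits
```

Passing the lines of a saved log to find_watchdog_records returns only the records that mention System_Watchdog or the L2 watchdog timeout, so the ASUP notification record is skipped.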
- If the node reboots, the following errors appear in the EMS log:
Thu May 05 15:33:43 +0800 [netapp: splog_main: mgr.boot.reason_abnormal:EMERGENCY]: System rebooted due to a watchdog reset.
Thu May 05 15:33:43 +0800 [netapp: splog_main: callhome.reboot.watchdog:alert]: Call home for REBOOT (watchdog reset)
- If the node is unable to reboot, system sensors from the BMC may show Attn_Sensor1 as Asserted:
Power_Event | 0x0 | discrete | | na | na | na | na
System_FW_Status | 0x0 | discrete | 0x2f | na | na | na | na
Wrench_Port_Up | 0x0 | discrete | Enabled | na | na | na | na
Attn_Sensor1 | 0x0 | discrete | Asserted | na | na | na | na
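The sensor listing above can also be checked with a script once the output has been captured to text. This is a minimal sketch under stated assumptions: the fields are separated by "|" and the fourth field holds the sensor state, as in the sample lines. The function names parse_sensor_line and asserted_sensors are hypothetical.

```python
def parse_sensor_line(line):
    """Split one pipe-delimited sensor line into a (name, state) pair.

    Assumption: the first field is the sensor name and the fourth field is
    its state, as in the sample output above.
    """
    fields = [field.strip() for field in line.split("|")]
    if len(fields) < 4 or not fields[0]:
        return None  # not a sensor row
    return fields[0], fields[3]


def asserted_sensors(lines, prefix="Attn_Sensor"):
    """Return names of sensors starting with `prefix` whose state is Asserted."""
    names = []
    for line in lines:
        parsed = parse_sensor_line(line)
        if parsed is None:
            continue
        name, state = parsed
        if name.startswith(prefix) and state == "Asserted":
            names.append(name)
    return names
```

A non-empty result from asserted_sensors for the captured output matches the symptom described in this article.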