Handling L2 Watchdog Resets on the FAS2620 / FAS2650 / AFF A200
Applies to
- FAS2620 / FAS2650 / AFF A200
 
Issue
- Node reboots unexpectedly
- Node does not reboot after an unexpected shutdown
- Service Processor logs on the impacted node show the following:
 
Record 454: Mon Feb 08 11:49:20.924775 2021 [IPMI Event.critical]: L2 watchdog timeout hard reset
Record 455: Mon Feb 08 11:49:20.984259 2021 [Trap Event.critical]: hwassist l2_watchdog_reset (29)
Record 456: Mon Feb 08 11:49:23.000822 2021 [SP.critical]: Filer Reboot
- If the node reboots, the following error appears in the EMS log:
 
[cluster-01:mgr.boot.reason_abnormal:EMERGENCY]: System rebooted due to a watchdog reset.
- If the node is unable to reboot, the output of system sensors from the SP may show sensors as unavailable (na) or faulted (Fault):
Sensor Name      | Current    | Unit       | Status     | LCR       | LNC       | UNC       | UCR
-----------------+------------+------------+------------+-----------+-----------+-----------+-----------
SYSTEM:
System_FW_Status | na         | discrete   | na         | na        | na        | na        | na
System_Watchdog  | 0x0        | discrete   |            | na        | na        | na        | na
Wrench_Port_Up   | na         | discrete   | na         | na        | na        | na        | na
CONTROLLER_A:
PCM_Status       | 0x0        | discrete   | Fault      | na        | na        | na        | na
Attn_Sensor1     | 0x0        | discrete   | Asserted   | na        | na        | na        | na
CPU-1_DTS_Temp   | na         | degrees C  | na         | na        | na        | -10.000   | 0.000
CPU-2_DTS_Temp   | na         | degrees C  | na         | na        | na        | -10.000   | 0.000
CPU0_PVCCP       | na         | Volts      | na         | 1.580     | 1.670     | 1.920     | 2.010
CPU1_PVCCP       | na         | Volts      | na         | 1.580     | 1.670     | 1.920     | 2.010
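
The symptoms above can be checked quickly against saved SP output. The following is a minimal Python sketch, not a NetApp tool: it assumes the SP event log and the system sensors output have already been saved as text, and that they follow the layouts shown in the excerpts above. The function names find_watchdog_records and find_problem_sensors are hypothetical and chosen only for illustration.

```python
import re

# Assumed record layout, inferred only from the SP log excerpt above:
#   Record <n>: <timestamp> [<source>.<severity>]: <message>
RECORD_RE = re.compile(r"^Record (\d+): (.+?) \[(.+?)\]: (.*)$", re.MULTILINE)


def find_watchdog_records(sp_log_text):
    """Return (record_number, timestamp, message) for watchdog-related records."""
    hits = []
    for number, timestamp, _source, message in RECORD_RE.findall(sp_log_text):
        if "watchdog" in message.lower():
            hits.append((int(number), timestamp, message.strip()))
    return hits


def find_problem_sensors(sensor_text):
    """Return (sensor_name, status) for rows whose Status is 'na' or 'Fault'."""
    problems = []
    for line in sensor_text.splitlines():
        if "|" not in line:
            continue  # section labels such as "SYSTEM:" and the dashed separator
        fields = [field.strip() for field in line.split("|")]
        if len(fields) < 4 or fields[0] == "Sensor Name":
            continue  # header row or a malformed line
        name, status = fields[0], fields[3]
        if status.lower() in ("na", "fault"):
            problems.append((name, status))
    return problems


if __name__ == "__main__":
    # Sample lines copied from the excerpts in this article.
    sample_log = (
        "Record 454: Mon Feb 08 11:49:20.924775 2021 [IPMI Event.critical]: "
        "L2 watchdog timeout hard reset\n"
        "Record 456: Mon Feb 08 11:49:23.000822 2021 [SP.critical]: Filer Reboot\n"
    )
    sample_sensors = (
        "PCM_Status       | 0x0        | discrete   | Fault      | na\n"
        "Attn_Sensor1     | 0x0        | discrete   | Asserted   | na\n"
    )
    print(find_watchdog_records(sample_log))
    print(find_problem_sensors(sample_sensors))
```

The sketch flags only the two statuses named above (na and Fault). Other states, such as Asserted or an empty Status column, are left for manual review.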
