AFF A400, FAS8700 and FAS8300 reboot with CHASSIS OVER TEMPERATURE SHUTDOWN EMERGENCY due to low memory
Applies to
- ONTAP 9
- AFF A400
- FAS8700 and FAS8300
- BMC FW 13.4 and earlier
Issue
- The node reboots after a shutdown.
- Multiple sensors show a value of na:
PVCCIN_CPU0 | na | Volts | na | na | 0.010 | 0.020 | 2.460 | 2.470 | na
PVCCIN_CPU1 | na | Volts | na | na | 0.010 | 0.020 | 2.460 | 2.470 | na
PVDDQ_ABC | na | Volts | na | na | 0.014 | 0.021 | 1.711 | 1.732 | na
PVDDQ_DEF | na | Volts | na | na | 0.014 | 0.021 | 1.711 | 1.732 | na
PVDDQ_GHI | na | Volts | na | na | 0.014 | 0.021 | 1.711 | 1.732 | na
PVDDQ_KLM | na | Volts | na | na | 0.014 | 0.021 | 1.711 | 1.732 | na
P1V05_PCH | na | Volts | na | na | 0.940 | 0.992 | 1.100 | 1.147 | na
...
CX5_Temp1 | na | degrees C | na | na | 0.000 | 5.000 | 80.000 | 85.000 | na
CX5_Temp2 | na | degrees C | na | na | 0.000 | 5.000 | 80.000 | 85.000 | na
...
RiserL_Temp1 | na | degrees C | na | na | 0.000 | 5.000 | 60.000 | 70.000 | na
RiserL_Temp2 | na | degrees C | na | na | 0.000 | 5.000 | 60.000 | 70.000 | na
RiserM_Temp1 | na | degrees C | na | na | 0.000 | 5.000 | 60.000 | 70.000 | na
RiserM_Temp2 | na | degrees C | na | na | 0.000 | 5.000 | 60.000 | 70.000 | na
RiserM_Temp3 | na | degrees C | na | na | 0.000 | 5.000 | 60.000 | 70.000 | na
RiserM_Temp4 | na | degrees C | na | na | 0.000 | 5.000 | 60.000 | 70.000 | na
RiserR_Temp1 | na | degrees C | na | na | 0.000 | 5.000 | 60.000 | 70.000 | na
RiserR_Temp2 | na | degrees C | na | na | 0.000 | 5.000 | 60.000 | 70.000 | na
RiserR_Temp3 | na | degrees C | na | na | 0.000 | 5.000 | 60.000 | 70.000 | na
RiserR_Temp4 | na | degrees C | na | na | 0.000 | 5.000 | 60.000 | 70.000 | na
CPU0_Temp | na | degrees C | na | na | na | na | 90.000 | 100.000 | na
CPU1_Temp | na | degrees C | na | na | na | na | 90.000 | 100.000 | na
Mezz_Temp1 | na | degrees C | na | na | 0.000 | 5.000 | 80.000 | 85.000 | na
Mezz_Temp2 | na | degrees C | na | na | 0.000 | 5.000 | 54.000 | 57.000 | na
- Multiple temperature sensors show status "nc" in sel elist:
02/17/2022 | 16:57:53 | Temperature LED2_Temp | Lower Non-critical going low | Reading 4 < Threshold 3 degrees C
02/17/2022 | 16:58:00 | Temperature LED1_Temp | Lower Non-critical going low | Reading 4 < Threshold 3 degrees C
02/17/2022 | 17:48:22 | Temperature MP_Temp3 | Lower Non-critical going low | Reading 5 < Threshold 5 degrees C
02/17/2022 | 17:49:40 | Temperature System_Inlet | Lower Non-critical going low | Reading 5 < Threshold 5 degrees C
02/17/2022 | 17:49:55 | Temperature MP_Temp1 | Lower Non-critical going low | Reading 5 < Threshold 5 degrees C
02/17/2022 | 17:50:09 | Temperature MP_Temp1 | Lower Non-critical going low | Reading 6 < Threshold 5 degrees C
02/17/2022 | 17:50:15 | Temperature MP_Temp1 | Lower Non-critical going low | Reading 5 < Threshold 5 degrees C
02/17/2022 | 17:50:40 | Temperature MP_Temp1 | Lower Non-critical going low | Reading 6 < Threshold 5 degrees C
- Example of the notifications shown before the reboot:
Nov 11, 2020 07:00:41 0100 HA Group Notification (CHASSIS POWER DEGRADED: Power Supply Status Critical: PSU1, PSU2.) ERROR
Nov 11, 2020 07:34:52 0100 HA Group Notification (CHASSIS OVER TEMPERATURE SHUTDOWN) EMERGENCY
Nov 11, 2020 10:16:20 0100 HA Group Notification (BATTERY ('Bat Temp' unreadable)) EMERGENCY
Nov 11, 2020 10:16:55 0100 HA Group Notification (BATTERY ('Bat Volt' unreadable)) EMERGENCY
Nov 11, 2020 10:17:07 0100 HA Group Notification (BATTERY ('Bat Curr' unreadable)) EMERGENCY
Nov 11, 2020 10:17:18 0100 HA Group Notification (BATTERY ('Bat Full Cap' unreadable)) EMERGENCY
Nov 11, 2020 10:17:40 0100 HA Group Notification (CHASSIS FAN FRU FAILED: Fan1_1) ERROR
Nov 11, 2020 10:17:54 0100 HA Group Notification (CHASSIS FAN FRU FAILED: Fan2_1) ERROR
Nov 11, 2020 10:18:08 0100 HA Group Notification (CHASSIS FAN FRU FAILED: Fan2_2) ERROR
Nov 11, 2020 10:18:19 0100 HA Group Notification (CHASSIS FAN FRU FAILED: Fan3_1) ERROR
Nov 11, 2020 10:18:30 0100 HA Group Notification (CHASSIS FAN FRU FAILED: Fan3_2) ERROR
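When a saved sensor listing is long, the sensors reporting na can be picked out with a short script. The following is a minimal sketch, not a NetApp tool. It assumes the listing was saved as text in the pipe-separated, 10-column layout shown above, with the sensor name in the first column and the reading in the second. The function name unreadable_sensors and the healthy Example_Temp row in the sample are hypothetical and used only for illustration.

```python
def unreadable_sensors(listing: str) -> list[str]:
    """Return the names of sensors whose reading column is 'na'.

    Assumes each sensor row has 10 pipe-separated columns, as in the
    listing above; rows with any other column count (for example the
    '...' elision lines) are skipped.
    """
    names = []
    for line in listing.splitlines():
        fields = [field.strip() for field in line.split("|")]
        if len(fields) != 10:
            continue  # not a sensor row
        name, reading = fields[0], fields[1]
        if reading == "na":
            names.append(name)
    return names


# Two rows copied from the listing above, one elision line, and one
# hypothetical healthy row (Example_Temp is not from the article).
sample = """\
PVCCIN_CPU0 | na | Volts | na | na | 0.010 | 0.020 | 2.460 | 2.470 | na
...
CX5_Temp1 | na | degrees C | na | na | 0.000 | 5.000 | 80.000 | 85.000 | na
Example_Temp | 24.000 | degrees C | ok | na | 0.000 | 5.000 | 42.000 | 47.000 | na
"""

for sensor in unreadable_sensors(sample):
    print(sensor)
```

A non-empty result covering many voltage and temperature sensors at once matches the symptom described in this article. A single unreadable sensor would more likely point to a different cause.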