Environmental shutdown occurs even though the temperature sensor returns to normal within 2 minutes
Applies to
- ONTAP 9
- Service Processor (SP)
- SP 5.x
Issue
An environmental shutdown occurs even though the temperature sensor returns to normal within 2 minutes.
Example:
- EMS shows the temperature returning to OK, but the system still shuts down 2 minutes later.
21 Nov 2020 03:40:27 [NSPAVM105: alert] env_mgr monitor chassisTemperature warm: describe_toowarm="CPU0 Temp Margin is critical high (88 C)."
21 Nov 2020 03:40:27 [NSPAVM105: emergency] env_mgr monitor shutdown chassisOverTemp: describe_toohot="CPU0 Temp Margin is critical high. System will be shutdown in 2 minutes"
21 Nov 2020 03:40:27 [NSPAVM105: emergency] Chassis temperature is too high..
21 Nov 2020 03:40:37 [NSPAVM105: notice] env_mgr monitor chassisTemperature ok:
21 Nov 2020 03:41:26 [NSPAVM105: emergency] env_mgr callhome chassis overtemp: subject="CHASSIS OVER TEMPERATURE SHUTDOWN"
21 Nov 2020 03:42:28 [NSPAVM105: emergency] statd monitor shutdown emergency: reason="Environmental Reason Shutdown (Temperature critical)"
- SP events show the temperature sensor returning to normal, but the system still shuts down 2 minutes later.
Record 1218: Fri Nov 20 19:40:23 2020 [IPMI.notice]: 6b01 | 02 | EVT: 015758f5 | CPU0_Temp_Margin | Assertion Event, "Upper Non-critical going high"
Record 1219: Fri Nov 20 19:40:23 2020 [IPMI.notice]: 6c01 | 02 | EVT: 015958ff | CPU0_Temp_Margin | Assertion Event, "Upper Critical going high"
Record 1220: Fri Nov 20 19:40:33 2020 [IPMI.notice]: 6d01 | 02 | EVT: 8159ccff | CPU0_Temp_Margin | Deassertion Event, "Upper Critical going high"
Record 1221: Fri Nov 20 19:40:33 2020 [IPMI.notice]: 6e01 | 02 | EVT: 8157ccf5 | CPU0_Temp_Margin | Deassertion Event, "Upper Non-critical going high"
Record 1222: Fri Nov 20 19:40:55 2020 [IPMI.notice]: 6f01 | 02 | EVT: 0301ffff | Attn_Sensor1 | Assertion Event, "State Asserted"
Record 1223: Fri Nov 20 19:41:03 2020 [IPMI.notice]: 7001 | 02 | EVT: 0300ffff | Attn_Sensor1 | Assertion Event, "State Deasserted"
Record 1224: Fri Nov 20 19:42:23 2020 [IPMI.emergency]: triggered OS halt: Temperature critical
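The timing in the SP events above can be checked with a short script. The following is a minimal Python sketch, not a NetApp tool: the helper names parse_records and critical_timeline are hypothetical, and the regular expression assumes the record layout shown in this example. It measures how long the "Upper Critical going high" threshold stayed asserted on CPU0_Temp_Margin and how long after the assertion the OS halt was triggered.

```python
import re
from datetime import datetime

# SP event records copied verbatim from the example above.
SP_EVENTS = '''
Record 1218: Fri Nov 20 19:40:23 2020 [IPMI.notice]: 6b01 | 02 | EVT: 015758f5 | CPU0_Temp_Margin | Assertion Event, "Upper Non-critical going high"
Record 1219: Fri Nov 20 19:40:23 2020 [IPMI.notice]: 6c01 | 02 | EVT: 015958ff | CPU0_Temp_Margin | Assertion Event, "Upper Critical going high"
Record 1220: Fri Nov 20 19:40:33 2020 [IPMI.notice]: 6d01 | 02 | EVT: 8159ccff | CPU0_Temp_Margin | Deassertion Event, "Upper Critical going high"
Record 1221: Fri Nov 20 19:40:33 2020 [IPMI.notice]: 6e01 | 02 | EVT: 8157ccf5 | CPU0_Temp_Margin | Deassertion Event, "Upper Non-critical going high"
Record 1222: Fri Nov 20 19:40:55 2020 [IPMI.notice]: 6f01 | 02 | EVT: 0301ffff | Attn_Sensor1 | Assertion Event, "State Asserted"
Record 1223: Fri Nov 20 19:41:03 2020 [IPMI.notice]: 7001 | 02 | EVT: 0300ffff | Attn_Sensor1 | Assertion Event, "State Deasserted"
Record 1224: Fri Nov 20 19:42:23 2020 [IPMI.emergency]: triggered OS halt: Temperature critical
'''

# Assumed layout: "Record <id>: <weekday month day time year> [IPMI.<severity>]: <rest>"
RECORD_RE = re.compile(
    r"Record (?P<id>\d+): "
    r"(?P<ts>\w{3} \w{3} +\d+ \d{2}:\d{2}:\d{2} \d{4}) "
    r"\[IPMI\.(?P<sev>\w+)\]: (?P<rest>.*)"
)


def parse_records(text):
    """Return (record_id, timestamp, severity, rest) for each matching line."""
    records = []
    for line in text.splitlines():
        m = RECORD_RE.match(line.strip())
        if not m:
            continue
        ts = datetime.strptime(m["ts"], "%a %b %d %H:%M:%S %Y")
        records.append((int(m["id"]), ts, m["sev"], m["rest"]))
    return records


def critical_timeline(records, sensor="CPU0_Temp_Margin"):
    """Find when the critical threshold asserted, deasserted, and when the halt fired."""
    asserted = deasserted = halted = None
    for _id, ts, _sev, rest in records:
        if sensor in rest and '"Upper Critical going high"' in rest:
            if "Deassertion Event" in rest:
                deasserted = ts
            elif "Assertion Event" in rest:
                asserted = ts
        elif "triggered OS halt" in rest:
            halted = ts
    return asserted, deasserted, halted


records = parse_records(SP_EVENTS)
asserted, deasserted, halted = critical_timeline(records)

print("critical asserted for (s):", int((deasserted - asserted).total_seconds()))
print("halt after assertion (s):", int((halted - asserted).total_seconds()))
```

For the records in this example, the critical threshold is asserted for 10 seconds, while the OS halt is triggered 120 seconds after the assertion, which matches the symptom described in this article.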