Emergency shutdown: Environmental Reason Shutdown (Temperature critical) due to MB
Applies to
- ONTAP 9
- Motherboard (MB)
Issue
- One node reports ONTAP event messages indicating that it is operating outside the normal temperature range.
Example:
[node_name-2: env_mgr: monitor.chassisTemperature.warm:alert]: Chassis temperature is too warm: Bat Ambient 2 is warning high (42 C).
[node_name-2: env_mgr: monitor.chassisTemperature.warm:alert]: Chassis temperature is too warm: Bat Ambient 1 is warning high (42 C).
[node_name-2: monitor: monitor.globalStatus.critical:EMERGENCY]: Chassis temperature is too high..
[node_name-2: env_mgr: callhome.chassis.hitemp:error]: Call home for CHASSIS OVER TEMPERATURE
[node_name-2: env_mgr: monitor.temp.unreadable:error]: The controller temperature (RiserM LTemp2) is not readable
[node_name-2: env_mgr: monitor.temp.unreadable:error]: The controller temperature (RiserM RTemp1) is not readable.
- The partner node also reports temperature-related ONTAP event messages.
Example:
[node_name-1: env_mgr: monitor.shutdown.chassisOverTemp:EMERGENCY]: Chassis temperature is too hot: Multiple Temp sensors are too high. System will be shutdown in 2 minutes
[node_name-1: env_mgr: callhome.chassis.overtemp:EMERGENCY]: Call home for CHASSIS OVER TEMPERATURE SHUTDOWN
[node_name-1: env_mgr: monitor.shutdown.emergency:EMERGENCY]: Emergency shutdown: Environmental Reason Shutdown (Temperature critical)
- Node is unable to boot; AUTOBOOT is aborted with a BIOS POST failure.
Example:
Boot Loader version 6.6.4
Copyright (C) 2000-2003 Broadcom Corporation.
Portions Copyright (C) 2002-2022 NetApp, Inc. All Rights Reserved.
ACPI RSDP Found at 0x6f7fe014
BIOS POST Failure(s) detected: PCIe device missing error detected. Abort AUTOBOOT
- BMC events for that node:
Example:
Record 2486: Fri Jan 19 05:50:28.000000 2024 [SysFW.notice]: Device 47/0/0 (SW0-VS1-P40) missing
Record 2487: Fri Jan 19 05:50:28.000000 2024 [SysFW.notice]: Device 74/0/0 (SW0-VS0-P32) missing
Record 2488: Fri Jan 19 05:50:43.000000 2024 [Boot Loader.critical]: Abort Autoboot due to BIOS POST failure.
- Issue remains after reseating the node.
- Issue remains after reseating or swapping the PCIe cards.
- Issue remains after upgrading the firmware (FW) to the latest version; see the steps to update ONTAP's Service Processor (SP) or Baseboard Management Controller (BMC).
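When many log lines have to be compared against the symptoms above, a small script can help with triage. The following is a minimal sketch, not a NetApp tool: it assumes the EMS and BMC lines have already been saved as plain text in exactly the layout shown in the examples, and the names parse_ems, missing_devices and severity_counts are hypothetical helpers introduced only for illustration.

```python
import re
from collections import Counter

# Assumed EMS layout, taken from the examples in this article:
#   [node: source: event.name:severity]: message
EMS_RE = re.compile(
    r"^\[(?P<node>[^:]+): (?P<source>[^:]+): "
    r"(?P<event>[\w.]+):(?P<severity>\w+)\]: (?P<message>.*)$"
)

# Assumed BMC record layout for missing devices, e.g.
#   ... [SysFW.notice]: Device 47/0/0 (SW0-VS1-P40) missing
BMC_MISSING_RE = re.compile(
    r"Device (?P<address>\d+/\d+/\d+) \((?P<name>[^)]+)\) missing"
)


def parse_ems(line):
    """Split one EMS line into its fields, or return None if it does not match."""
    match = EMS_RE.match(line.strip())
    return match.groupdict() if match else None


def missing_devices(lines):
    """Return the device names that BMC records report as missing."""
    names = []
    for line in lines:
        match = BMC_MISSING_RE.search(line)
        if match:
            names.append(match.group("name"))
    return names


def severity_counts(lines):
    """Count parsed EMS events per (node, severity) pair."""
    counts = Counter()
    for line in lines:
        fields = parse_ems(line)
        if fields:
            counts[(fields["node"], fields["severity"])] += 1
    return counts


if __name__ == "__main__":
    ems = [
        "[node_name-1: env_mgr: monitor.shutdown.emergency:EMERGENCY]: "
        "Emergency shutdown: Environmental Reason Shutdown (Temperature critical)",
    ]
    bmc = [
        "Record 2486: Fri Jan 19 05:50:28.000000 2024 [SysFW.notice]: "
        "Device 47/0/0 (SW0-VS1-P40) missing",
    ]
    print(severity_counts(ems))
    print(missing_devices(bmc))
```

Lines that do not match the assumed layout are skipped rather than raising an error, so the script can be run over a mixed log file.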