What do storage system and shelf temperature thresholds mean?
Applies to
- All FAS systems
- All AFF systems
- Environmental checks
Answer
The environment command displays information about the storage system and shelf environment. The top of the display shows the status of each monitored component:
PSU Status ok
Temperature ok
Voltage ok
System Fan Fru 1 ok
System Fan Fru 2 ok
PSU 1 Fan ok
PSU 2 Fan ok
NVRAM5-temperature-3 ok
NVRAM5-battery-3 ok
While this display shows an OK status, more detailed information can be obtained with the following commands:
Data ONTAP 8 7-Mode:
> environment chassis list-sensors temperature
ONTAP Cluster Mode:
::> system node run -node <node name> -command environment chassis list-sensors
::> system node environment sensors show
This will list all the temperature sensors in the chassis of the storage system.
Example:
Temp_Unit_A, temperature, lm81_first, 0, normal, mon: Temperature, 27C, -1C, -1C, 48C, 50C
This shows that this sensor is normal. The current reading on this sensor is 27 degrees C.
The remaining values are the thresholds: low critical and low warning are both -1C, high warning is 48C, and high critical is 50C.
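The field layout described above can be illustrated with a minimal Python sketch that splits the example line into named values. The field order is inferred from this single example, and the names ChassisSensor, parse_celsius, and parse_sensor_line are hypothetical helpers, not part of ONTAP.

```python
from dataclasses import dataclass


@dataclass
class ChassisSensor:
    name: str
    state: str
    reading_c: int
    low_critical_c: int
    low_warning_c: int
    high_warning_c: int
    high_critical_c: int


def parse_celsius(field: str) -> int:
    """Convert a field such as '27C' or '-1C' to an integer number of degrees."""
    return int(field.strip().rstrip("C"))


def parse_sensor_line(line: str) -> ChassisSensor:
    """Parse one comma-separated sensor line (layout assumed from the example)."""
    fields = [f.strip() for f in line.split(",")]
    if len(fields) < 11:
        raise ValueError(f"unexpected sensor line: {line!r}")
    # Assumption: the last five fields are reading, low critical,
    # low warning, high warning, high critical, in that order.
    reading, lo_crit, lo_warn, hi_warn, hi_crit = (
        parse_celsius(f) for f in fields[-5:]
    )
    return ChassisSensor(
        name=fields[0],
        state=fields[4],
        reading_c=reading,
        low_critical_c=lo_crit,
        low_warning_c=lo_warn,
        high_warning_c=hi_warn,
        high_critical_c=hi_crit,
    )


EXAMPLE = ("Temp_Unit_A, temperature, lm81_first, 0, normal, "
           "mon: Temperature, 27C, -1C, -1C, 48C, 50C")
sensor = parse_sensor_line(EXAMPLE)
# Degrees of headroom before the high warning threshold is reached.
headroom_c = sensor.high_warning_c - sensor.reading_c
```

Output from other platforms or ONTAP versions may use a different layout, so check the field positions against real output before relying on a parser like this.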
When a warning level is reached, an AutoSupport is generated. When a critical level is reached, the storage system shuts down and powers off immediately to prevent damage to the system.
As with a power loss on the array, client in-flight data that has not been acknowledged by NVRAM may be lost.
For each shelf that is installed on a system, there will be a display similar to the following:
Temperature Sensor installed element list: 1, 2, 3; with error: none
Shelf temperatures by element:
[1] 38 C (100 F) (ambient) Normal temperature range
[2] 45 C (113 F) Normal temperature range
[3] 44 C (111 F) Normal temperature range
Temperature thresholds by element:
[1] High critical: 50 C (122 F); high warning 40 C (104 F)
Low critical: 0C (32 F); low warning 10 C (50 F)
[2] High critical: 63 C (145 F); high warning 53 C (127 F)
Low critical: 0C (32 F); low warning 10 C (50 F)
[3] High critical: 63 C (145 F); high warning 53 C (127 F)
Low critical: 0C (32 F); low warning 10 C (50 F)
Each shelf has three sensors: sensor 1 is in the shelf, sensor 2 is in the A module, and sensor 3 is in the B module.
This shows that sensors 1, 2, and 3 are installed and no sensor shows an error condition. If one had an error, the number of the sensor would be listed as follows:
Temperature Sensor installed element list: 1, 2, 3; with error: 2
This would show that the A module temperature sensor has a fault of some kind. To determine the exact problem, the rest of the display would have to be analyzed.
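As an illustration of reading the element list line, the following Python sketch extracts the installed and faulted sensor numbers and maps them to the locations described above. The function name parse_element_list is hypothetical, and the handling of several comma-separated error numbers is an assumption, since the examples here show only "none" or a single number.

```python
import re

# Sensor locations as described in this article.
LOCATIONS = {1: "shelf", 2: "A module", 3: "B module"}

ELEMENT_LIST_RE = re.compile(
    r"installed element list:\s*(.*?);\s*with error:\s*(.*)$"
)


def parse_element_list(line: str):
    """Return (installed, errors) as two lists of sensor numbers."""
    match = ELEMENT_LIST_RE.search(line.strip())
    if match is None:
        raise ValueError(f"unexpected element list line: {line!r}")
    installed = [int(n) for n in match.group(1).split(",") if n.strip()]
    error_text = match.group(2).strip()
    if error_text.lower() == "none":
        errors = []
    else:
        # Assumption: multiple faulted sensors would be comma-separated.
        errors = [int(n) for n in error_text.split(",") if n.strip()]
    return installed, errors


installed, errors = parse_element_list(
    "Temperature Sensor installed element list: 1, 2, 3; with error: 2"
)
faulted_locations = [LOCATIONS.get(n, "unknown") for n in errors]
```

For the example line, the faulted sensor is number 2, which corresponds to the A module.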
This portion of the output shows the current temperature of each element and whether that temperature is within the normal range:
Shelf temperatures by element:
[1] 38 C (100 F) (ambient) Normal temperature range
[2] 45 C (113 F) Normal temperature range
[3] 44 C (111 F) Normal temperature range
This portion of the display shows the thresholds at which certain actions occur:
Temperature thresholds by element:
[1] High critical: 50 C (122 F); high warning 40 C (104 F)
Low critical: 0C (32 F); low warning 10 C (50 F)
[2] High critical: 63 C (145 F); high warning 53 C (127 F)
Low critical: 0C (32 F); low warning 10 C (50 F)
[3] High critical: 63 C (145 F); high warning 53 C (127 F)
Low critical: 0C (32 F); low warning 10 C (50 F)
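The threshold lines above come in pairs, one for the high side and one for the low side. A minimal Python sketch for pulling the Celsius values out of such a line might look like the following. The function name parse_threshold_line is hypothetical and the pattern is based only on the lines shown in this article.

```python
import re

# Matches lines such as:
#   [1] High critical: 50 C (122 F); high warning 40 C (104 F)
#   Low critical: 0C (32 F); low warning 10 C (50 F)
THRESHOLD_RE = re.compile(
    r"(high|low)\s+critical:\s*(-?\d+)\s*C.*?"
    r"(?:high|low)\s+warning:?\s*(-?\d+)\s*C",
    re.IGNORECASE,
)


def parse_threshold_line(line: str):
    """Return (side, critical_c, warning_c), where side is 'high' or 'low'."""
    match = THRESHOLD_RE.search(line)
    if match is None:
        raise ValueError(f"unexpected threshold line: {line!r}")
    side = match.group(1).lower()
    return side, int(match.group(2)), int(match.group(3))


high = parse_threshold_line(
    "[1] High critical: 50 C (122 F); high warning 40 C (104 F)"
)
low = parse_threshold_line("Low critical: 0C (32 F); low warning 10 C (50 F)")
```

Only the Celsius values are extracted; the Fahrenheit values in parentheses are ignored.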
If a warning threshold is reached on a shelf, a warning AutoSupport is generated. If a critical threshold is reached, the shelf does not power off. Shelf sensors behave differently from the chassis sensors:
- Shelves will not power down when there is a sensor issue
- A sensor issue on the shelf will never cause a shutdown of the storage system
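The difference between chassis and shelf behavior can be summarized in a short Python sketch that classifies a reading against its four thresholds and looks up the response described in this article. The function names are hypothetical, and treating a reading exactly equal to a threshold as having reached it is an assumption.

```python
def classify(reading_c, low_critical_c, low_warning_c,
             high_warning_c, high_critical_c):
    """Return the threshold level for a reading, in degrees C."""
    # Assumption: a reading equal to a threshold counts as reaching it.
    if reading_c >= high_critical_c:
        return "high critical"
    if reading_c >= high_warning_c:
        return "high warning"
    if reading_c <= low_critical_c:
        return "low critical"
    if reading_c <= low_warning_c:
        return "low warning"
    return "normal"


def expected_response(kind, level):
    """Map a level to the behavior described here; kind is 'chassis' or 'shelf'."""
    if level == "normal":
        return "none"
    if level.endswith("warning"):
        return "AutoSupport generated"
    if kind == "chassis":
        return "immediate shutdown and power-off of the storage system"
    return "shelf stays powered on"


# Chassis example: 27C with thresholds -1C, -1C, 48C, 50C.
chassis_level = classify(27, -1, -1, 48, 50)
# Shelf element 1 example: 38 C with thresholds 0 C, 10 C, 40 C, 50 C.
shelf_level = classify(38, 0, 10, 40, 50)
```

Both example readings fall in the normal range, which matches the status shown in the command output.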
The thresholds for the storage system chassis and for each shelf are fixed and cannot be changed. If a sensor is not reporting, check the firmware versions. If the firmware is up to date, the piece of equipment with the defective sensor must be power cycled or replaced. This may be a motherboard, power supply, shelf, ESH, or expander module.
Additional Information
Common AutoSupport Callhomes
callhome.chassis.hitemp
callhome.chassis.overtemp
callhome.chassis.undertemp