CFSHELF-1826: ses.status.temperatureError (...) temperature error for Temperature sensor 1 in shelf
Issue
- The shelf attention LED is ON on the OPS front panel.
- The AutoSupport ENVIRONMENT output reports a "failure" for that sensor (element [1]). Example:
Channel: 0a
Shelf: 0
SES device path: local access: 0a.00.99
Module type: IOM12; monitoring is active
Shelf status: critical condition
...
Temperature Sensor installed element list: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11; with error: 1
Shelf temperatures by element:
[1] 128 C (262 F) (ambient) Overtemperature failure!
[2] 20 C (68 F) Normal temperature range
...
[11] 31 C (87 F) Normal temperature range
- Example ONTAP event messages:
::> event log show -event shelf
Time Node Severity Event
---------------------------------------------------------------------------
1/2/2025 10:35:00 node_name EMERGENCY monitor.globalStatus.critical: Disk shelf fault.
1/2/2025 10:34:28 node_name ALERT monitor.shelf.fault: Critical fault reported on disk storage shelf attached to channel 0b. Check fans, power supplies, disks, and temperature sensors.
1/2/2025 10:34:19 node_name ERROR ses.status.temperatureError: DS224-12 (S/N SHFHU2048000395) shelf 0 on channel 0c temperature error for Temperature sensor 1: critical status; overtemperature failure. Current temperature: 128 C (262 F). This module is on the front of the shelf on the left, on the OPS panel.
1/2/2025 10:33:52 node_name DEBUG stackmon.shelf.discovery.complete: One or more shelves have been discovered.
...
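
To pick out the affected sensor from a saved copy of the ENVIRONMENT output, the "Shelf temperatures by element" lines can be parsed with a short script. The following is a minimal sketch only: the line format is inferred from the example above rather than from a documented specification, and the function names (parse_shelf_temperatures, failed_sensors) are hypothetical helpers, not part of ONTAP or AutoSupport tooling.

```python
import re

# Matches lines such as "[1] 128 C (262 F) (ambient) Overtemperature failure!".
# Assumption: this layout is taken from the example output in this article.
LINE_RE = re.compile(r"^\[(\d+)\]\s+(-?\d+) C \((-?\d+) F\)\s+(.*)$")


def parse_shelf_temperatures(text):
    """Return a list of (element, celsius, status) tuples found in the text."""
    readings = []
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            element, celsius, _fahrenheit, status = match.groups()
            readings.append((int(element), int(celsius), status.strip()))
    return readings


def failed_sensors(readings):
    """Return the readings whose status does not report a normal range."""
    return [r for r in readings if "Normal temperature range" not in r[2]]


# Sample lines copied from the example output above.
sample = """\
[1] 128 C (262 F) (ambient) Overtemperature failure!
[2] 20 C (68 F) Normal temperature range
[11] 31 C (87 F) Normal temperature range
"""

for element, celsius, status in failed_sensors(parse_shelf_temperatures(sample)):
    print(f"sensor {element}: {celsius} C, {status}")
```

In the sample, only element 1 is flagged, while the other elements report a normal range. This matches the pattern in this issue: a single sensor reports an overtemperature failure and the remaining sensors in the same shelf do not.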