Disk shelf DS460C is reporting intermittent UNDER-TEMPERATURE (UT) alerts
Applies to
- DS460C Shelves
- ONTAP 9
Issue
- The DS460C shelf reports undertemperature failure alerts similar to the following:
[node-n01: dsa_worker5: ses.status.temperatureError:error]: DS460-12 (S/N XXXX) shelf 21 on channel 0b temperature error for Temperature sensor 1: critical status; undertemperature failure. Current temperature: -2 C (28 F). This module is on the front of the shelf on the left, on the OPS panel.
[node-n01: statd: monitor.shelf.fault:alert]: Critical fault reported on disk storage shelf attached to channel 0b. Check fans, power supplies, disks, and temperature sensors.
[node-n01: statd: callhome.shlf.fault:error]: Call home for SHELF_FAULT
- The shelf fault clears after a random amount of time
- No temperature issues are found at the site
- Shelf logs show constant I2C errors from the shelf IOMs
EXAMPLE:
Mon May 1 22:22:22 2024 ( 452+09:23:07.194); 0600002A; M0; HAL; hal; 04; HAL_I2CTgt_Write: bus[1] addr:94 err:5 cerr:5
Mon May 1 22:22:22 2024 ( 452+09:23:07.203); 0600002A; M0; HAL; hal; 04; HAL_I2CTgt_Write: bus[1] addr:94 err:5 cerr:1
Mon May 1 22:22:22 2024 ( 452+09:23:07.203); 0600002A; M0; HAL; hal; 04; HAL_I2CTgt_Write: bus[1] addr:94 err:5 cerr:2
Mon May 1 22:22:22 2024 ( 452+09:23:07.204); 0600002A; M0; HAL; hal; 04; HAL_I2CTgt_Write: bus[1] addr:94 err:5 cerr:3
Mon May 1 22:22:22 2024 ( 452+09:23:07.204); 0600002A; M0; HAL; hal; 04; HAL_I2CTgt_Write: bus[1] addr:94 err:5 cerr:4
Mon May 1 22:22:22 2024 ( 452+09:23:07.204); 0600002A; M0; HAL; hal; 04; HAL_I2CTgt_Write: bus[1] addr:94 err:5 cerr:5
Mon May 1 22:22:22 2024 ( 452+09:23:08.273); 0600002B; M0; HAL; hal; 04; HAL_I2CTgt_Read: bus[1] addr:94 err:5 cerr:1 stuckErr:0
Mon May 1 22:22:22 2024 ( 452+09:23:08.274); 0600002B; M0; HAL; hal; 04; HAL_I2CTgt_Read: bus[1] addr:94 err:5 cerr:2 stuckErr:0
Mon May 1 22:22:22 2024 ( 452+09:23:08.274); 0600002B; M0; HAL; hal; 04; HAL_I2CTgt_Read: bus[1] addr:94 err:5 cerr:3 stuckErr:0
Mon May 1 22:22:22 2024 ( 452+09:23:08.274); 0600002B; M0; HAL; hal; 04; HAL_I2CTgt_Read: bus[1] addr:94 err:5 cerr:4 stuckErr:0
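
When a shelf log is long, it can help to tally the I2C errors instead of reading them line by line. The following is a minimal sketch, not a NetApp tool: it assumes the log lines follow the same layout as the example above, and the names `I2C_LINE` and `count_i2c_errors` are hypothetical. Other firmware versions may format these lines differently.

```python
import re
from collections import Counter

# Pattern modeled only on the sample lines in this article (assumption);
# it matches both HAL_I2CTgt_Write and HAL_I2CTgt_Read entries.
I2C_LINE = re.compile(
    r"HAL_I2CTgt_(?P<op>Read|Write): "
    r"bus\[(?P<bus>\d+)\] addr:(?P<addr>\w+) "
    r"err:(?P<err>\d+) cerr:(?P<cerr>\d+)"
)


def count_i2c_errors(lines):
    """Count I2C error entries per (operation, bus, address)."""
    counts = Counter()
    for line in lines:
        match = I2C_LINE.search(line)
        if match:
            key = (match["op"], int(match["bus"]), match["addr"])
            counts[key] += 1
    return counts


# Two lines copied from the example above, plus one unrelated line.
sample = [
    "Mon May 1 22:22:22 2024 ( 452+09:23:07.194); 0600002A; M0; HAL; hal; 04; "
    "HAL_I2CTgt_Write: bus[1] addr:94 err:5 cerr:5",
    "Mon May 1 22:22:22 2024 ( 452+09:23:08.273); 0600002B; M0; HAL; hal; 04; "
    "HAL_I2CTgt_Read: bus[1] addr:94 err:5 cerr:1 stuckErr:0",
    "some unrelated log line",
]

for key, total in sorted(count_i2c_errors(sample).items()):
    print(key, total)
```

A high count concentrated on a single bus and address, as in the example where every entry is on `bus[1] addr:94`, is consistent with the constant I2C errors described in this article.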