CriticalFanXFruFaultAlert is reported due to chassis fan failure - sensor reports critical
Applies to
- ONTAP 9
- Chassis FAN
Issue
- The following errors are reported for a fan module in the event logs:
[Node1: env_mgr: monitor.chassisFan.stop:error]: Chassis fan contains at least one stopped fan: Fan2_2 (failed)
[Node1: env_mgr: callhome.c.fan.fru.fault:error]: Call home for CHASSIS FAN FRU FAILED: Fan2_2
[Node1: monitor: monitor.globalStatus.critical:EMERGENCY]: One fan has failed: SysFan2 F2.
[Node1: cphmd: hm.alert.raised:alert]: Alert Id = CriticalFan2FruFaultAlert , Alerting Resource = 021XXXXXXXX882 raised by monitor chassis
[Node1: mgwd: callhome.hm.alert.critical:alert]: Call home for Health Monitor process cphm: CriticalFan2FruFaultAlert[021XXXXXXXX882].
- The SP-LATEST-IPMI section of AutoSupport shows critical status for the affected fan:
======================================
hsamcmd --fault-show-all
===============================
tag origin fld fault reason count time
---- ------- ---- ------------- ------ -----
1 0x5 /chassis-1/fan-2 ipmi Fan2_Speed2 lower non-critical 1 Tue May 9 19:20:38 2023
2 0x5 /chassis-1/fan-2 ipmi Fan2_Speed2 lower critical 1 Tue May 9 19:20:38 2023
FAN2:
Fan2_Status | 0x0 | discrete | Ready | na | na | na | na
Fan2_Current | 6.150 | Amps | ok | 0.000 | 0.500 | 10.000 | 12.000
Fan2_Speed1 | 8300.000 | RPM | ok | 1000.000 | 1250.000 | 9750.000 | 10000.000
Fan2_Speed2 | 0.000 | RPM | cr | 1000.000 | 1250.000 | 9750.000 | 10000.000
Fan2_Speed3 | 8300.000 | RPM | ok | 1000.000 | 1250.000 | 9750.000 | 10000.000
Fan2_Speed4 | 8250.000 | RPM | ok | 1000.000 | 1250.000 | 9750.000 | 10000.000
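The sensor listing above can also be scanned programmatically to pick out the failed rotor. The following is only a minimal sketch, not NetApp tooling: the function name `find_faulted_sensors` is hypothetical, and it assumes that the first four pipe-separated fields are the sensor name, reading, unit, and status, and that any status other than `ok` on a non-discrete sensor indicates a fault (`cr` being the critical state shown for Fan2_Speed2).

```python
# Sketch under stated assumptions: field order (name | reading | unit | status | ...)
# is inferred from the sample output in this article, not from a documented format.
SAMPLE = """\
Fan2_Status | 0x0 | discrete | Ready | na | na | na | na
Fan2_Current | 6.150 | Amps | ok | 0.000 | 0.500 | 10.000 | 12.000
Fan2_Speed1 | 8300.000 | RPM | ok | 1000.000 | 1250.000 | 9750.000 | 10000.000
Fan2_Speed2 | 0.000 | RPM | cr | 1000.000 | 1250.000 | 9750.000 | 10000.000
Fan2_Speed3 | 8300.000 | RPM | ok | 1000.000 | 1250.000 | 9750.000 | 10000.000
Fan2_Speed4 | 8250.000 | RPM | ok | 1000.000 | 1250.000 | 9750.000 | 10000.000
"""


def find_faulted_sensors(text):
    """Return (name, reading, unit, status) for each sensor whose status is not 'ok'."""
    faulted = []
    for line in text.splitlines():
        fields = [field.strip() for field in line.split("|")]
        if len(fields) < 4:
            continue  # not a sensor row (blank line, heading such as 'FAN2:')
        name, reading, unit, status = fields[:4]
        if unit == "discrete":
            continue  # assumption: discrete sensors report a state word, not ok/cr
        if status.lower() != "ok":
            faulted.append((name, reading, unit, status))
    return faulted


if __name__ == "__main__":
    for name, reading, unit, status in find_faulted_sensors(SAMPLE):
        print(f"{name}: {reading} {unit} (status {status})")
```

Run against the sample above, the sketch reports only Fan2_Speed2, which matches the sensor named in the hsamcmd fault entries.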