PSUFruFanBadAlert reported in the system
Applies to
- ONTAP 9
- AFF A700s
- Service Processor (SP)
Issue
- PSU is reported as degraded and the following events are seen on both nodes:
[Node-01: env_mgr: monitor.chassisPowerSupply.degraded:notice]: Chassis power supply 1 is degraded: PSU1 Fan is Critical Low (0 RPM)
[Node-01: env_mgr: monitor.chassisPowerSupply.degraded:notice]: Chassis power supply 1 is degraded: PSU1 InPower is Warning High (968 W)
[Node-01: cphmd: hm.alert.raised:alert]: Alert Id = PSUFruFanBadAlert , Alerting Resource = XXXXXXXXXXXXXX raised by monitor chassis
[Node-01: env_mgr: callhome.chassis.ps.degraded:error]: Call home for CHASSIS POWER SUPPLY DEGRADED: PS 1
[Node-01: mgwd: callhome.hm.alert.major:alert]: Call home for Health Monitor process cphm: PSUFruFanBadAlert[XXXXXXXXXXXXXX].
- The output of the system health alert show command shows the following:
Cluster1::> system health alert show
               Node: Node-01
           Alert ID: PSUFruFanBadAlert
           Resource: XXXXXXXXXXXXXX
           Severity: Major
    Indication Time: Tue Feb 28 20:16:33 2023
           Suppress: false
        Acknowledge: false
     Probable Cause: Power Supply Unit PSU1 FRU has a major fan problem.
                     The nodes in this chassis are Node-01.
    Possible Effect: The power supply unit (PSU) might stop functioning if
                     the temperature increases.
 Corrective Actions: 1. Check PSU1 FRU and the fans associated with it.
                     2. Refer to the Hardware specification guide for more information on the
                     position of the power supply unit (PSU)
                     and ways to check or replace it.
                     3. Contact support personnel if the alert persists.
- The sensors of the affected PSU show critical (cr) status in the SP-LATEST-IPMI section of the AutoSupport:
Sensor Reading:
    PSU1_VIN         | 0.000      | Volts      | cr    | na        | 90.000    | 94.000    | 260.000   | 264.000   | na 
    PSU1_IIN         | 0.000      | Amps       | cr    | na        | 0.000     | 0.000     | 14.960    | 16.000    | na 
    PSU1_PIN         | 0.000      | Watts      | cr    | na        | 4.000     | 4.000     | 960.000   | 1020.000  | na 
    PSU1_FAN         | 0.000      | RPM        | cr    | na        | 768.000   | 1248.000  | na        | na        | na 
- The PSU reported in the above alert is marked as BAD in the PLATFORM-SENSORS.XML section of the AutoSupport logs.
- The alerts are seen even after replacing the affected PSU.
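When reviewing the SP-LATEST-IPMI sensor rows shown above, it can help to pick out the critical sensors with a script. The following is a minimal Python sketch, not a NetApp tool. It assumes the rows follow the usual ipmitool-style layout of ten pipe-separated fields (name, reading, unit, status, then six thresholds) and that a status of cr means critical. The function names parse_sensor_rows and critical_sensors are hypothetical.

```python
# Sensor rows copied from the SP-LATEST-IPMI excerpt in this article.
SAMPLE = """\
PSU1_VIN         | 0.000      | Volts      | cr    | na        | 90.000    | 94.000    | 260.000   | 264.000   | na
PSU1_IIN         | 0.000      | Amps       | cr    | na        | 0.000     | 0.000     | 14.960    | 16.000    | na
PSU1_PIN         | 0.000      | Watts      | cr    | na        | 4.000     | 4.000     | 960.000   | 1020.000  | na
PSU1_FAN         | 0.000      | RPM        | cr    | na        | 768.000   | 1248.000  | na        | na        | na
"""


def parse_sensor_rows(text):
    """Parse pipe-separated sensor rows into dictionaries.

    Assumption: each row has exactly ten fields (name, reading, unit,
    status, and six thresholds). Lines with any other shape are skipped.
    """
    rows = []
    for line in text.splitlines():
        fields = [field.strip() for field in line.split("|")]
        if len(fields) != 10:
            continue
        name, reading, unit, status = fields[:4]
        rows.append(
            {"name": name, "reading": reading, "unit": unit, "status": status}
        )
    return rows


def critical_sensors(text):
    """Return the names of sensors whose status field is 'cr'."""
    return [row["name"] for row in parse_sensor_rows(text) if row["status"] == "cr"]


if __name__ == "__main__":
    for name in critical_sensors(SAMPLE):
        print(name)
```

Run against the excerpt above, the script lists all four PSU1 sensors, which matches the critical status described in this article.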
