NetApp Knowledge Base

Multiple fan failure alert with system making noise

Category:
aff-series
Specialty:
hw

Applies to

  • AFF and FAS systems
  • ONTAP 9
  • Disk Shelves

Issue

  • Both nodes frequently report the following alerts in the event logs:

[Node-01: statd: monitor.shelf.fault:debug]: Critical fault reported on disk storage shelf attached to channel 0a. Check fans, power supplies, disks, and temperature sensors.
[Node-01: statd: monitor.fan.failed:debug]: Multiple fans has failed.
[Node-01: env_mgr: monitor.fan.warning:debug]: multiple fans have failed. Replace it to avoid overheating
[Node-01: env_mgr: callhome.c.fan.fru.fault:debug]: Call home for CHASSIS FAN FRU FAILED: Multiple fans have failed

  • The output of storage show fault reveals that one of the power supplies is not being detected:

::> system node run -node * -command storage show fault

Enclosure Status: unrecoverable
Channel: 0a
Shelf: 0
Shelf Type: DS224-12
Product Serial Number: 952240001855
Module Type: IOM12E

Power Supplies:
Element Status         Status Bytes  Status Descriptions
  1: OK                01,00,00,20   RQSTED ON
  2: NOT INSTALLED     05,00,00,20   

Fans:
Element Status         Status Bytes  Status Descriptions
  1: OK                01,02,EC,26   
  2: OK                01,02,EC,26   
  3: NOT INSTALLED     05,00,00,20   
  4: NOT INSTALLED     05,00,00,20   

Input Power Monitor:
Element Status         Status Bytes  Status Descriptions
  1: OK                01,00,29,07   
  2: NOT INSTALLED     05,00,00,00   

Power Crest Factor:
Element Status         Status Bytes  Status Descriptions
  1: OK                01,00,29,07   
  2: NOT INSTALLED     05,00,00,00

  • The SP sensors still cannot report readings for PSU2, even after PSU replacement:

Sensor Name              State          Current    Critical     Warning     Warning    Critical
                                        Reading       Low         Low         High       High
-------------------------------------------------------------------------------------------------
SNMP Bad Fan Count                      MULTI_FAILED
Chassis is Under Temp                       NO
Chassis is Over Temp                        NO
PSU2 Bad                 invalid            --
PSU1 Bad                                 FALSE
PSU2                     invalid            --
PSU1                                      GOOD
PSU2 ON                                     ON
PSU1 ON                                     ON
PSU1 INFO                               FRU_AVAIL
PSU1 INFO                               FRU_AVAIL
PSU1 FRU                                  GOOD
PSU2 FRU                                MULTIFAULT
Partner Status                          A_SIDE_PRESENT
PSU1 Present                            PRESENT  
PSU2 Present             not_available      --
PSU2 5V                  not_available     -- mV       --          --          --          --       
PSU2 12V                 not_available     -- mV       --          --          --          --       
PSU2 5V Curr             not_available     -- mA       --          --          --          --       
PSU2 12V Curr            not_available     -- mA       --          --          --          --       
PSU2 Fan 1               not_available     -- RPM      --          --          --          --       
PSU2 Fan 2               not_available     -- RPM      --          --          --          --       
PSU2 Inlet Temp          not_available     -- C         0 C         5 C        57 C        62 C     
PSU2 Hotspot Temp        not_available     -- C         0 C         5 C        90 C       100 C     
PSU_FAN                                 FAIL_2

  • Because one PSU fan is not detected, the other PSU fans spin faster, which makes the system noisy.
  • The SP/BMC is already on the latest firmware version.
  • Reboot of the SP/BMC does not stop the alerts.
  • The e0M port is not under high traffic, which is the condition described in the KB article "CHASSIS FAN FRU FAILED: Multiple fans have failed even after upgrading SP/BMC".
  • The issue persists after a takeover/giveback of each node, one at a time.
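
The storage show fault output shown earlier can be screened for elements that are not reported as OK. The following is a minimal Python sketch, not a NetApp tool: the function name find_faulted_elements, the regular expressions, and the SAMPLE text are illustrative assumptions based only on the output layout in this article (a section title ending in a colon, then rows of the form "N: STATUS  status bytes"). Other shelf types or ONTAP releases may format the output differently.

```python
import re

# Excerpt of the "storage show fault" output from this article (assumed layout).
SAMPLE = """\
Power Supplies:
Element Status         Status Bytes  Status Descriptions
  1: OK                01,00,00,20   RQSTED ON
  2: NOT INSTALLED     05,00,00,20

Fans:
Element Status         Status Bytes  Status Descriptions
  1: OK                01,02,EC,26
  2: OK                01,02,EC,26
  3: NOT INSTALLED     05,00,00,20
  4: NOT INSTALLED     05,00,00,20
"""

# A section title is a line holding only words followed by a colon, e.g. "Fans:".
SECTION_RE = re.compile(r"^([A-Za-z][A-Za-z ]*):\s*$")

# An element row is "<number>: <status>  <four comma-separated hex bytes>".
ELEMENT_RE = re.compile(
    r"^\s*(\d+):\s+(.+?)\s{2,}([0-9A-Fa-f]{2}(?:,[0-9A-Fa-f]{2}){3})"
)


def find_faulted_elements(output):
    """Return (section, element number, status) for every element not OK."""
    faults = []
    section = None
    for line in output.splitlines():
        title = SECTION_RE.match(line)
        if title:
            section = title.group(1)
            continue
        row = ELEMENT_RE.match(line)
        if row and section is not None:
            number = int(row.group(1))
            status = row.group(2).strip()
            if status != "OK":
                faults.append((section, number, status))
    return faults


if __name__ == "__main__":
    for section, number, status in find_faulted_elements(SAMPLE):
        print(f"{section} element {number}: {status}")
```

For the excerpt above, the sketch flags power supply 2 and fans 3 and 4 as NOT INSTALLED, which matches the symptom described in this article: the second PSU and its fans are not detected.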

 
