NetApp Knowledge Base

HA pair down due to multi-disk failure

Category:
aff-series
Specialty:
hw

Applies to

AFF A250

Issue

  • The node shuts down automatically because two disks in the RAID group have failed while a third disk is still being reconstructed, so the reconstruction is reported as stalled ("reconstruct stalled"):

Aggregate aggregate1 (failed, raid_dp, partial, fast zeroed) (block checksums)
  Plex /aggregate1/plex0 (offline, failed, inactive)
    RAID group /aggregate1/plex0/rg0 (partial, block checksums)

      RAID Disk    Device    HA  SHELF BAY CHAN Pool Type  RPM  Used (MB/blks)    Phys (MB/blks)
      ---------    ------    ------------- ---- ---- ---- ----- --------------    --------------
      dparity     0n.0P2    0n    0   0          0 SSD-NVM   N/A 867680/222126208  867688/222128256 (reconstruct stalled)
      parity      0n.1P2    0n    0   1          0 SSD-NVM   N/A 867680/222126208  867688/222128256 (fast zeroed)
      data    FAILED        N/A                        867680/ -
      data        0n.3P2    0n    0   3          0 SSD-NVM   N/A 867680/222126208  867688/222128256 
      ...
      data        0n.19P2    0n    0   19         0 SSD-NVM   N/A 867680/222126208  867688/222128256 
      data    FAILED        N/A                        867680/ -
      Raid group is missing 2 disks.

Mon Jul 05 2021 09:16:31 GMT [node_name1: statd: monitor.brokendisk.notice:NOTICE]: When two disks are broken in raid_dp volume, the system shuts down automatically every 24 hours to encourage you to replace the disk. If you reboot the system, it will run for another 24 hours before shutting down.

  • Multi-disk panic:

Panic_Message: aggr aggregate1: raid volfsm, fatal multi-disk error.. Raid type - raid_dp Group name plex0/rg0 state DEGRADEDRECONS. 1 disk failed in the gr...

  • WAFL inconsistency error:

Sat Jul 03 02:27:13 +0000 [node_name1: wafl_exempt00: wafl.raid.incons.userdata:error]: WAFL inconsistent: inconsistent user data block at VBN 729078144 (vvbn:69395562 fbn:69395562 level:0) in private inode (fileid:container snapid:0 file_type:6 disk_flags:0xc10000800800143 error:120 raid_set:1) in volume volume_name@vserver:ab0123c4-56de-78fg-9hi0-j123kl45m6n7.
Sat Jul 03 02:27:13 +0000 [node_name1: wafl_exempt00: wafl.incons.userdata.vol:alert]: WAFL inconsistent: volume volume_name@vserver:ab0123c4-56de-78fg-9hi0-j123kl45m6n7 has an inconsistent user data block. Note: Any new Snapshot copies might contain this inconsistency.
Sat Jul 03 02:27:13 +0000 [node_name1: wafl_exempt00: callhome.wafl.inconsistent.user.block:alert]: Call home for WAFL INCONSISTENT USER BLOCK

  • PCIe link errors for NVMe SSDs:

Fri Jun 25 2021 11:30:52 GMT [node_name1: kernel: nvme.link.error:ERROR]: PCIe link initialization error for NVMe SSD in slot 22.
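
When triaging a log bundle for the symptoms listed above, it can help to split each event line into its fields and keep only the events of interest. The following is a minimal Python sketch, not a NetApp tool. It assumes the line layout `<timestamp> [<node>: <process>: <event>:<severity>]: <message>`, which is inferred only from the sample lines in this article. The names `parse_ems_line`, `events_of_interest`, and the `WATCHED_PREFIXES` list are hypothetical and chosen for illustration.

```python
import re
from typing import Iterable, List, Optional

# Assumed layout, inferred only from the sample lines in this article:
#   <timestamp> [<node>: <process>: <event.name>:<SEVERITY>]: <message>
EMS_LINE = re.compile(
    r"^(?P<timestamp>.+?) "
    r"\[(?P<node>[^:\]]+): (?P<process>[^:\]]+): "
    r"(?P<event>[^:\]]+):(?P<severity>[^\]]+)\]: "
    r"(?P<message>.*)$"
)

# Hypothetical watch list: prefixes of the event names quoted in this article.
WATCHED_PREFIXES = (
    "monitor.brokendisk",
    "wafl.raid.incons",
    "wafl.incons",
    "callhome.wafl.inconsistent",
    "nvme.link.error",
)


def parse_ems_line(line: str) -> Optional[dict]:
    """Split one event line into fields; return None if it does not match."""
    match = EMS_LINE.match(line.strip())
    if match is None:
        return None
    fields = match.groupdict()
    # The samples mix "ERROR" and "error", so normalize the case.
    fields["severity"] = fields["severity"].lower()
    return fields


def events_of_interest(lines: Iterable[str]) -> List[dict]:
    """Return parsed events whose name starts with a watched prefix."""
    hits = []
    for line in lines:
        record = parse_ems_line(line)
        if record and record["event"].startswith(WATCHED_PREFIXES):
            hits.append(record)
    return hits


if __name__ == "__main__":
    sample = (
        "Fri Jun 25 2021 11:30:52 GMT [node_name1: kernel: "
        "nvme.link.error:ERROR]: PCIe link initialization error "
        "for NVMe SSD in slot 22."
    )
    for event in events_of_interest([sample]):
        print(event["timestamp"], event["event"], event["severity"])
```

Lines that do not follow the assumed layout, such as the RAID layout output or the truncated panic message, are skipped rather than guessed at, so they still need to be reviewed by hand.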

