NetApp Knowledge Base

Disks fail aggressively in a single disk stack

Category:
metrocluster
Specialty:
metrocluster

Applies to

  • ONTAP 9
  • Fabric MetroCluster
  • ATTO FB7500N

Issue

Multiple disks in the same disk stack fail within a short period of time
Example:
ClusterA::> storage disk show -broken
Original Owner: ClusterB-01
  Checksum Compatibility: block
                                                Drawer                            Usable Physical
    Disk            Outage Reason  HA Shelf Bay  /Slot Chan   Pool  Type    RPM     Size     Size
    --------------- ------------- --- ----- --- ------ ---- ------ ----- ------ -------- --------
    1.51.22         failed        11b    51  22   -/-     A FAILED   SSD      -        -   6.99TB
Original Owner: ClusterB-01
  Checksum Compatibility: block
                                                Drawer                            Usable Physical
    Disk            Outage Reason  HA Shelf Bay  /Slot Chan   Pool  Type    RPM     Size     Size
    --------------- ------------- --- ----- --- ------ ---- ------ ----- ------ -------- --------
    1.51.15         failed        11b    51  15   -/-     A FAILED   SSD      -        -  894.3GB
Original Owner: ClusterA-01
  Checksum Compatibility: block
                                                Drawer                            Usable Physical
    Disk            Outage Reason  HA Shelf Bay  /Slot Chan   Pool  Type    RPM     Size     Size
    --------------- ------------- --- ----- --- ------ ---- ------ ----- ------ -------- --------
    1.51.6          failed         1d    51   6   -/-     B FAILED   SSD      -  894.0GB  894.3GB
    1.51.16         failed         1d    51  16   -/-     B FAILED   SSD      -   6.99TB   6.99TB
    1.51.17         failed         1d    51  17   -/-     B FAILED   SSD      -   6.99TB   6.99TB
    1.51.18         failed         1d    51  18   -/-     B FAILED   SSD      -   6.99TB   6.99TB
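
To confirm that the failures are confined to one stack, the broken-disk rows can be tallied by stack and shelf. The following Python sketch is illustrative only: it assumes the output of `storage disk show -broken` has been saved as text, and that disk names follow the `<stack>.<shelf>.<bay>` pattern seen in the example above. The function name `broken_disks_by_shelf` and the sample variable are hypothetical, not part of any NetApp tooling.

```python
import re
from collections import Counter

# Matches a disk row such as:
#   1.51.22   failed   11b   51  22   -/-   A FAILED   SSD ...
# Assumption: the disk name is <stack>.<shelf>.<bay>.
DISK_ROW = re.compile(r"^\s*(\d+)\.(\d+)\.(\d+)\s+(\S+)\s+(\S+)\s+(\d+)\s+(\d+)\s")


def broken_disks_by_shelf(output: str) -> Counter:
    """Count broken disks per (stack, shelf) in saved CLI output."""
    counts = Counter()
    for line in output.splitlines():
        match = DISK_ROW.match(line)
        if match:  # owner lines, headers, and separators do not match
            stack, shelf = int(match.group(1)), int(match.group(2))
            counts[(stack, shelf)] += 1
    return counts


# Rows copied from the example output above.
SAMPLE_OUTPUT = """\
Original Owner: ClusterB-01
    1.51.22         failed        11b    51  22   -/-     A FAILED   SSD      -        -   6.99TB
    1.51.15         failed        11b    51  15   -/-     A FAILED   SSD      -        -  894.3GB
Original Owner: ClusterA-01
    1.51.6          failed         1d    51   6   -/-     B FAILED   SSD      -  894.0GB  894.3GB
    1.51.16         failed         1d    51  16   -/-     B FAILED   SSD      -   6.99TB   6.99TB
    1.51.17         failed         1d    51  17   -/-     B FAILED   SSD      -   6.99TB   6.99TB
    1.51.18         failed         1d    51  18   -/-     B FAILED   SSD      -   6.99TB   6.99TB
"""

if __name__ == "__main__":
    for (stack, shelf), count in broken_disks_by_shelf(SAMPLE_OUTPUT).items():
        print(f"stack {stack}, shelf {shelf}: {count} broken disk(s)")
```

With the sample rows, all six failed disks fall under stack 1, shelf 51, which matches the symptom described in this article.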
 
 
A significant number of errors are logged in the event log of one of the ATTO bridges
Example:
INFO FC TM Cmd Rcvd: Abort Task Set to LUN:27 on FC Port 1
 
Error counters on the ATTO bridge are increasing
Example:
; Fibre Channel Error Counts
; Port | Link Failures | Sync Loss | Signal Loss | Invalid Tx | Invalid CRC
;==========================================================================
   1                 1           2             0           16          4796
   2                 1           1             0            4             0
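
Whether the counters are still increasing can be checked by comparing two captures of the error-count table taken some time apart. The Python sketch below is a minimal illustration under stated assumptions: the table text has been saved from the bridge, the column order matches the example above, and the second capture (`LATER`) is invented purely for demonstration. The function names are hypothetical.

```python
# Column order as shown in the "Fibre Channel Error Counts" header above.
FIELDS = ("link_failures", "sync_loss", "signal_loss", "invalid_tx", "invalid_crc")


def parse_fc_error_counts(text: str) -> dict:
    """Parse the error-count table into {port: {counter_name: value}}."""
    ports = {}
    for line in text.splitlines():
        if line.lstrip().startswith(";"):
            continue  # header and separator lines begin with ';'
        parts = line.split()
        if len(parts) == 1 + len(FIELDS) and all(p.isdigit() for p in parts):
            ports[int(parts[0])] = dict(zip(FIELDS, map(int, parts[1:])))
    return ports


def counter_deltas(before: dict, after: dict) -> dict:
    """Return only the counters that grew between two captures."""
    deltas = {}
    for port, now in after.items():
        prev = before.get(port, {})
        grown = {
            name: value - prev.get(name, 0)
            for name, value in now.items()
            if value - prev.get(name, 0) > 0
        }
        if grown:
            deltas[port] = grown
    return deltas


# First capture: values from the example above.
EARLIER = """\
; Port | Link Failures | Sync Loss | Signal Loss | Invalid Tx | Invalid CRC
   1                 1           2             0           16          4796
   2                 1           1             0            4             0
"""

# Second capture: HYPOTHETICAL values, made up for this illustration.
LATER = """\
; Port | Link Failures | Sync Loss | Signal Loss | Invalid Tx | Invalid CRC
   1                 1           2             0           16          5120
   2                 1           1             0            4             0
"""

if __name__ == "__main__":
    deltas = counter_deltas(parse_fc_error_counts(EARLIER), parse_fc_error_counts(LATER))
    for port, grown in deltas.items():
        print(f"FC port {port}: {grown}")
```

In this illustration only the Invalid CRC counter on FC port 1 grows between captures, the same port and counter that stand out in the example above.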
