NetApp Knowledge Base

Faulty HBA results in multiple failed disks

Applies to

FAS2650

Issue

  • A node goes down and its partner reports that it has performed a takeover:
CLTFLT:HA Group Notification from Node-01 (CONTROLLER TAKEOVER COMPLETE AUTOMATIC) ALERT
  • Multiple disks are reported as failed:
cluster::> storage disk show -broken
Original Owner:
  Checksum Compatibility: block
                                   Drawer                      Usable Physical
Disk    Outage Reason HA Shelf Bay /Slot  Chan Pool Type RPM   Size   Size
------- ------------- -- ----- --- ------ ---- ---- ---- ----- ------ --------
4.2.4   failed        0a 2     4   -/-    B    NONE SAS  10000 -      1.64TB
4.2.12  failed        0a 2     12  -/-    B    NONE SAS  10000 -      1.64TB
4.2.16  failed        0a 2     16  -/-    B    NONE SAS  10000 -      1.64TB
4.3.4   failed        0b 3     4   -/-    A    NONE SAS  10000 -      1.64TB
4.3.6   failed        0b 3     6   -/-    A    NONE SAS  10000 -      1.64TB
4.3.20  failed        0b 3     20  -/-    A    NONE SAS  10000 -      1.64TB
 
cluster::> node run -node Node-02 sysconfig -r
Aggregate aggr2_cluster_02_SAS (failed, mixed_raid_type, partial, hybrid) (block checksums)
  Plex /aggr2_cluster_02_SAS/plex0 (offline, failed, inactive)
    RAID group /aggr2_cluster_02_SAS/plex0/rg0 (partial, block checksums, raid_dp)
      RAID Disk    Device      HA  SHELF BAY CHAN Pool Type  RPM  Used (MB/blks)    Phys (MB/blks)
      ---------    ------      ------------- ---- ---- ---- ----- --------------    --------------
      dparity     0a.02.0     0a    2   0   SA:A   0   SAS 10000 1713523/3509295616 1716957/3516328368
      parity      0b.03.0     0b    3   0   SA:B   0   SAS 10000 1713523/3509295616 1716957/3516328368
      data    FAILED          N/A                        1713523/ -
      data    FAILED          N/A                        1713523/ -
      data    FAILED          N/A                        1713523/ -
      data    FAILED          N/A                        1713523/ -
      data    FAILED          N/A                        1713523/ -
      data    FAILED          N/A                        1713523/ -
      data    FAILED          N/A                        1713523/ -
      data    FAILED          N/A                        1713523/ -
      Raid group is missing 8 disks.
  • The aggregate is reported as failed and its plex as offline
  • SCSI command check conditions (scsi.cmd.checkCondition) appear in the EMS log before the disk failures:
[Node-02: scsi_cmdblk_strthr_admin: scsi.cmd.checkCondition:error]: Disk device 0b.03.10: Check Condition: CDB 0x2a:a7fc9600:0200: Sense Data SCSI:hardware error -  (0x4 - 0x3 0x0 0x82)(994).
[Node-02: scsi_cmdblk_strthr_admin: scsi.cmd.checkCondition:error]: Disk device 0a.02.6: Check Condition: CDB 0x2a:a7fc9600:0200: Sense Data SCSI:hardware error -  (0x4 - 0x3 0x0 0x8)(1480).
[Node-02: scsi_cmdblk_strthr_admin: scsi.cmd.checkCondition:error]: Disk device 0a.02.8: Check Condition: CDB 0x2a:a7fc9600:0200: Sense Data SCSI:hardware error -  (0x4 - 0x3 0x0 0x8)(1512).
[Node-02: scsi_cmdblk_strthr_admin: scsi.cmd.checkCondition:error]: Disk device 0a.02.10: Check Condition: CDB 0x2a:a7fc9600:0200: Sense Data SCSI:hardware error -  (0x4 - 0x3 0x0 0x8)(1525).
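When many disks report check conditions at once, it can help to group the EMS events by adapter port to see whether the errors are spread across several disks and paths rather than confined to one drive. The following is a minimal Python sketch, not a NetApp tool: the function name summarize_check_conditions and the regular expression are assumptions written against the four sample lines above, and other ONTAP releases may format these messages differently.

```python
import re
from collections import defaultdict

# Assumed message layout, derived only from the sample EMS lines in this article:
#   ... scsi.cmd.checkCondition:error]: Disk device 0b.03.10: Check Condition: ...
CHECK_CONDITION = re.compile(
    r"scsi\.cmd\.checkCondition:\w+\]: Disk device "
    r"(?P<adapter>\w+)\.(?P<shelf>\d+)\.(?P<bay>\d+): Check Condition:"
)


def summarize_check_conditions(lines):
    """Return {adapter port: sorted list of disk devices that logged a check condition}.

    Hypothetical helper: lines that do not match the assumed layout are ignored.
    Disk names are sorted as plain strings, so "0a.02.10" sorts before "0a.02.6".
    """
    disks_by_adapter = defaultdict(set)
    for line in lines:
        match = CHECK_CONDITION.search(line)
        if match is None:
            continue
        disk = f"{match['adapter']}.{match['shelf']}.{match['bay']}"
        disks_by_adapter[match["adapter"]].add(disk)
    return {adapter: sorted(disks) for adapter, disks in disks_by_adapter.items()}
```

Feeding the four EMS lines above into summarize_check_conditions groups three disks under port 0a and one under port 0b. Errors on several disks across both ports within a short window are more consistent with a shared component, such as the HBA named in this article's title, than with independent drive failures, although the log lines alone do not prove that.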
 

