RAID groups failed after switchback in MCC-IP

Last updated

Mar 20, 2024

Category:: metrocluster

Specialty:: metrocluster

Applies to

MCC-IP

Issue

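After a switchback performed during planned maintenance, RAID groups in the MetroCluster IP (MCC-IP) configuration are failed or offline. The aggregate status output shows a failed plex with missing disks, for example: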

Aggregate aggr1 (online, raid4, mirror degraded, fast zeroed) (block checksums)
  Plex /aggr1/plex0 (offline, failed, inactive)
    RAID group /aggr1/plex0/rg0 (partial, block checksums)

      RAID Disk Device          HA  SHELF BAY CHAN Pool Type  RPM  Used (MB/blks)    Phys (MB/blks)
      --------- ------          ------------- ---- ---- ---- ----- --------------    --------------
      parity    FAILED                  N/A                        409828/ -
      data      0b.10.7P1       0b    10  7   SA:A   0   SSD   N/A 409828/839327744  409836/839344128 (fast zeroed)
      data      0b.10.8P1       0b    10  8   SA:A   0   SSD   N/A 409828/839327744  409836/839344128 (fast zeroed)
      data      0b.10.9P1       0b    10  9   SA:A   0   SSD   N/A 409828/839327744  409836/839344128 (fast zeroed)
      data      0b.10.0P1       0b    10  0   SA:A   0   SSD   N/A 409828/839327744  409836/839344128 (fast zeroed)
      data      0b.10.1P1       0b    10  1   SA:A   0   SSD   N/A 409828/839327744  409836/839344128 (fast zeroed)
      data      FAILED                  N/A                        409828/ -
      data      FAILED                  N/A                        409828/ -
      data      0b.10.4P1       0b    10  4   SA:A   0   SSD   N/A 409828/839327744  409836/839344128
      data      FAILED                  N/A                        409828/ -
      data      0b.10.10P1      0b    10  10  SA:A   0   SSD   N/A 409828/839327744  409836/839344128
      Raid group is missing 4 disks.
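
When the status output covers many RAID groups, counting the FAILED rows by hand is error-prone. The following is a minimal sketch, not a NetApp tool, that tallies FAILED rows per RAID group from output pasted in as text. The function name `count_failed_disks` and the abbreviated `SAMPLE` excerpt are hypothetical, and the parsing assumes the row layout shown above.

```python
# Hypothetical helper: count FAILED rows per RAID group in pasted status output.
# Assumes each group starts with a "RAID group <path> ..." line and each disk row
# starts with its role (parity, dparity, or data) followed by the device name.

SAMPLE = """\
    RAID group /aggr1/plex0/rg0 (partial, block checksums)

      RAID Disk Device          HA  SHELF BAY CHAN Pool Type  RPM  Used (MB/blks)    Phys (MB/blks)
      parity    FAILED                  N/A                        409828/ -
      data      0b.10.7P1       0b    10  7   SA:A   0   SSD   N/A 409828/839327744  409836/839344128 (fast zeroed)
      data      FAILED                  N/A                        409828/ -
      data      FAILED                  N/A                        409828/ -
      data      0b.10.4P1       0b    10  4   SA:A   0   SSD   N/A 409828/839327744  409836/839344128
      data      FAILED                  N/A                        409828/ -
"""

DISK_ROLES = ("parity", "dparity", "data")


def count_failed_disks(status_text):
    """Return {raid_group_path: number_of_FAILED_rows}."""
    failed = {}
    current = None
    for line in status_text.splitlines():
        fields = line.split()
        if len(fields) >= 3 and fields[0] == "RAID" and fields[1] == "group":
            # Start of a new RAID group section; the third field is its path.
            current = fields[2]
            failed[current] = 0
        elif (current is not None and len(fields) >= 2
              and fields[0] in DISK_ROLES and fields[1] == "FAILED"):
            failed[current] += 1
    return failed


if __name__ == "__main__":
    for group, count in count_failed_disks(SAMPLE).items():
        print(f"{group}: {count} FAILED")
```

The count for a group should agree with the "Raid group is missing N disks" line that the system prints for it.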

The metrocluster operation history show command reports:

MetroCluster switchover completed with errors in auto heal. Reason: Exceeded the maximum number of retries for the command in phase Heal aggregates

The EMS log (EMS_LOG_FILE.gz) shows the following for one or more drives:

raid_label_io_writeError disk_info="Disk /aggr1/plex0/rg0/0m.i1.1L4P1 Shelf 10 Bay 3
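
To check how many drives logged this event, the compressed EMS log can be searched for the event name. The following is a minimal sketch that assumes the log is a gzip-compressed text file readable line by line. The function name `find_label_write_errors` is hypothetical, and the match is a plain substring search on the event name rather than a parse of the EMS record format.

```python
import gzip


def find_label_write_errors(path, event="raid_label_io_writeError"):
    """Return every line of a gzip-compressed log that mentions the event name."""
    hits = []
    # errors="replace" keeps the scan going if the log contains undecodable bytes.
    with gzip.open(path, "rt", encoding="utf-8", errors="replace") as log:
        for line in log:
            if event in line:
                hits.append(line.rstrip("\n"))
    return hits


if __name__ == "__main__":
    import sys

    # Usage: python scan_ems.py EMS_LOG_FILE.gz
    for entry in find_label_write_errors(sys.argv[1]):
        print(entry)
```

Each returned line carries the disk_info text, so the affected shelf and bay can be read from the matches.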