Missing NVMe drives due to PCIe link initialization error
Applies to
- AFF A250
- ONTAP 9.13.1P1
Issue
- Node loses access to all internal NVMe drives due to a PCIe link initialization error
[Node-02: kernel: nvme.link.error:error]: PCIe link initialization error for NVMe SSD in slot 7.
[Node-02: kernel: nvme.link.error:error]: PCIe link initialization error for NVMe SSD in slot 14.
[Node-02: kernel: nvme.link.error:error]: PCIe link initialization error for NVMe SSD in slot 20.
[Node-02: kernel: nvme.link.error:error]: PCIe link initialization error for NVMe SSD in slot 12.
- Cluster reports a MULTIPLE DISKS MISSING alert
[Node-02: SKL cerror: pcie.stealth.errors:debug]: params: {'pcie_errors': 'IIO1: RPT(22,0,0): PLX PCIE 9797 switch on Controller, Br[9797](24,0,0): RcvErr(P0(1),P1(1),P2(1),P3(1)), Br[9797](24,4,0): RcvErr(P5(1),P6(1)), Br[9797](24,8,0):
[Node-02: config_failed_disk: callhome.disks.missing:error]: Call home for MULTIPLE DISKS MISSING
- Plexes of the aggregates built on the disks in shelf 0 go offline, leaving the aggregates mirror degraded
Aggregate aggr1_Node_02 (online, raid_dp, mirror degraded, fast zeroed) (block checksums)
Plex /aggr1_Node_02/plex0 (offline, failed, inactive, pool0)
RAID group /aggr1_Node_02/plex0/rg0 (partial)
RAID Disk Device HA SHELF BAY CHAN Pool Type RPM Used (MB/blks) Phys (MB/blks)
--------- ------ ------------- ---- ---- ---- ----- -------------- --------------
dparity FAILED N/A 883638/ -
parity FAILED N/A 883638/ -
data FAILED N/A 883638/ -
data FAILED N/A 883638/ -
data FAILED N/A 883638/ -
data FAILED N/A 883638/ -
data FAILED N/A 883638/ -
data FAILED N/A 883638/ -
data FAILED N/A 883638/ -
data FAILED N/A 883638/ -
data FAILED N/A 883638/ -
Raid group is missing 11 disks.
Aggregate aggr0_Node_02 (online, raid_dp, mirror degraded, fast zeroed) (block checksums)
Plex /aggr0_Node_02/plex0 (offline, failed, inactive, pool0)
RAID group /aggr0_Node_02/plex0/rg0 (partial)
RAID Disk Device HA SHELF BAY CHAN Pool Type RPM Used (MB/blks) Phys (MB/blks)
--------- ------ ------------- ---- ---- ---- ----- -------------- --------------
dparity FAILED N/A 63849/ -
parity FAILED N/A 63849/ -
data FAILED N/A 63849/ -
data FAILED N/A 63849/ -
data FAILED N/A 63849/ -
Raid group is missing 5 disks.
- No panic occurred because the node is in a MetroCluster environment, where the surviving plex of each mirrored aggregate keeps it online
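
When many drives go missing at once, it can help to list exactly which slots logged a link error. The following is a minimal sketch, not a NetApp tool: it assumes the event log has been saved as plain text and that the lines match the nvme.link.error excerpts quoted above. The function name slots_with_link_errors and the regular expression are illustrative assumptions, and other ONTAP releases may format these lines differently.

```python
import re
from collections import defaultdict

# Pattern modeled on the EMS excerpts quoted in this article (assumption:
# other ONTAP releases may word or punctuate these lines differently).
LINK_ERROR = re.compile(
    r"\[(?P<node>[^:\]]+): kernel: nvme\.link\.error:error\]: "
    r"PCIe link initialization error for NVMe SSD in slot (?P<slot>\d+)\."
)


def slots_with_link_errors(lines):
    """Return {node: sorted slot numbers} for every nvme.link.error line.

    Hypothetical helper for triage; lines that do not match are ignored.
    """
    found = defaultdict(set)
    for line in lines:
        match = LINK_ERROR.search(line)
        if match:
            found[match.group("node")].add(int(match.group("slot")))
    return {node: sorted(slots) for node, slots in found.items()}


# Sample input copied from the log excerpts in the Issue section.
sample = [
    "[Node-02: kernel: nvme.link.error:error]: PCIe link initialization error for NVMe SSD in slot 7.",
    "[Node-02: kernel: nvme.link.error:error]: PCIe link initialization error for NVMe SSD in slot 14.",
    "[Node-02: kernel: nvme.link.error:error]: PCIe link initialization error for NVMe SSD in slot 20.",
    "[Node-02: kernel: nvme.link.error:error]: PCIe link initialization error for NVMe SSD in slot 12.",
    "[Node-02: config_failed_disk: callhome.disks.missing:error]: Call home for MULTIPLE DISKS MISSING",
]

print(slots_with_link_errors(sample))  # {'Node-02': [7, 12, 14, 20]}
```

Grouping by node makes it easy to see whether the errors are confined to one controller, as in this case, where only Node-02 reports them.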