Node panic during MB reseat of the Partner
Applies to
- AFF C800
- ONTAP 9
Issue
- A node panics during a motherboard reseat of its partner
- The node starts losing access to disks during the reseat of the partner, and the EMS log reports:
Node-01: kernel: nvme.link.error:error]: PCIe link initialization error for NVMe SSD in slot X
Node-01: kernel: nvme.link.error:error]: PCIe link initialization error for NVMe SSD in slot X
Node-01: kernel: nvme.link.error:error]: PCIe link initialization error for NVMe SSD in slot X
Node-01: kernel: nvme.link.error:error]: PCIe link initialization error for NVMe SSD in slot X
Node-01: kernel: nvme.link.error:error]: PCIe link initialization error for NVMe SSD in slot X
Node-01: kernel: nvme.link.error:error]: PCIe link initialization error for NVMe SSD in slot X
Node-01: scsi_cmdblk_strthr_admin: disk.timeout.flush.start:debug]: Aggressive timeout flush started on disk 0n.X
- The panic string reports:
Node-01: splog_main: mgr.stack.string:notice]: Panic string: aggr Node-01_n2_root: raid volfsm, fatal multi-disk error.. Raid type - raid_dp Group name plex0/rg0 state NORMAL. 6 disks failed in the group. Disk 0
Node-01: splog_main: mgr.stack.at:notice]: Panic occurred at: Fri Feb 7 09:28:24 2025
Node-01: splog_main: mgr.stack.proc:notice]: Panic in process: config_thread
- The issue occurs on both nodes, in each case during the reseat of the partner
- A motherboard reseat of the panicked node is required to recover the disks that disappeared
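To check quickly whether a saved EMS log excerpt matches the signature above, the following minimal Python sketch counts `nvme.link.error` events per node and collects any panic strings. It is not a NetApp tool. The function name `summarize_ems` and the regular expressions are assumptions for illustration, derived only from the log lines shown in this article; real EMS output may carry timestamps or other prefixes that need different patterns.

```python
import re
from collections import Counter

# Patterns derived from the sample lines in this article (assumed format).
LINK_ERROR = re.compile(r"(?P<node>[\w.-]+): kernel: nvme\.link\.error:error\]")
PANIC = re.compile(r"mgr\.stack\.string:notice\]: Panic string: (?P<text>.*)")


def summarize_ems(lines):
    """Return (link_errors_per_node, panic_strings) for an iterable of log lines."""
    link_errors = Counter()
    panics = []
    for line in lines:
        line = line.strip()
        match = LINK_ERROR.search(line)
        if match:
            # One event per affected NVMe SSD slot is logged.
            link_errors[match.group("node")] += 1
            continue
        match = PANIC.search(line)
        if match:
            panics.append(match.group("text"))
    return link_errors, panics
```

Several link errors on one node followed by a "fatal multi-disk error" panic string on the same node would be consistent with the symptoms described in this article.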