Multi Disk Panic but after reboot no disks are missing
Applies to
- FAS/AFF
- ONTAP 9
Issue
Node reboots unexpectedly after loss of access to some or all of the node's disks.
1. Loss of all storage adapter paths to a specific shelf or shelf stack is reported in the EMS or console log
Tue Aug 06 15:28:56 -0400 [node1: pmcsas_timeout_2: sas.link.error:error]: Could not recover link on SAS adapter 11a after 110 seconds. Offlining the adapter.
Tue Aug 06 15:28:56 -0400 [node1: pmcsas_timeout_2: sas.link.error:error]: Could not recover link on SAS adapter 11c after 110 seconds. Offlining the adapter.
Tue Aug 06 15:34:23 -0400 [node1: pmcsas_timeout_2: sas.link.error:error]: Could not recover link on SAS adapter 11a after 45 seconds. Offlining the adapter.
Tue Aug 06 15:34:23 -0400 [node1: pmcsas_timeout_2: sas.link.error:error]: Could not recover link on SAS adapter 11c after 45 seconds. Offlining the adapter.
2. EMS/console logs show all the drives missing from a particular shelf
Tue Aug 06 15:28:56 -0400 [node1: config_thread: raid.config.spare.disk.missing:info]: Spare Disk 11c.40.55 Shelf 40 Drawer 5 Slot 7 Bay 55 [NETAPP X377_HLBRE10TA07 NA03] S/N [XXXXXXXX] UID [x.x.x.x] is missing.
Tue Aug 06 15:28:56 -0400 [node1: config_thread: raid.config.filesystem.disk.missing:info]: File system Disk /node1_sata_aggr2/plex0/rg0/11c.40.3 Shelf 40 Drawer 1 Slot 3 Bay 3 [NETAPP X377_HLBRE10TA07 NA03] S/N [XXXXXXXX] UID [x.x.x.x] is missing.
Tue Aug 06 15:28:56 -0400 [node1: config_thread: raid.rg.degraded:notice]: : Raid group /node1_sata_aggr2/plex0/rg0 is degraded
Tue Aug 06 15:28:56 -0400 [node1: config_thread: raid.config.filesystem.disk.missing:info]: File system Disk /node1_sata_aggr2/plex0/rg0/11c.40.5 Shelf 40 Drawer 1 Slot 5 Bay 5 [NETAPP X377_HLBRE10TA07 NA03] S/N [XXXXXXXX] UID [x.x.x.x] is missing.
Tue Aug 06 15:28:56 -0400 [node1: config_thread: raid.config.filesystem.disk.missing:info]: File system Disk /node1_sata_aggr2/plex0/rg0/11c.40.7 Shelf 40 Drawer 1 Slot 7 Bay 7 [NETAPP X377_HLBRE10TA07 NA03] S/N [XXXXXXXX] UID [x.x.x.x] is missing.
Tue Aug 06 15:28:56 -0400 [node1: config_thread: raid.config.filesystem.disk.missing:info]: File system Disk /node1_sata_aggr2/plex0/rg0/11c.40.9 Shelf 40 Drawer 1 Slot 9 Bay 9 [NETAPP X377_HLBRE10TA07 NA03] S/N [XXXXXXXX] UID [x.x.x.x] is missing.
3. Node reports a multidisk error panic alert
Tue Aug 06 15:28:56 -0400 [node1: config_thread: cf.multidisk.fatalProblem:error]: Node encountered a multidisk error or other fatal error while waiting to be taken over. aggr node1_sata_aggr2: raid volfsm, fatal multi-disk error.. Primary Tier: Raid type - raid_tec Group name plex0/rg0 state NORMAL. 23 disks failed in the group. Disk 11c.40.3 Shelf 40 Drawer 1 Slot 3 Bay 3 [NETAPP X377_HLBRE10TA07 NA03] S/N [XXXXXXX] UID [*] error: no valid path to disk.
4. Node reboots due to the panic
SP event log example:
Record 1229: Tue Aug 06 19:29:17.048900 2024 [SP.critical]: Filer Reboots
5. After the reboot, EMS/console logs report that all the drives and drawers have been reinserted
Tue Aug 06 15:52:33 -0400 [node1: dmgr_thread: raid.disk.inserted:info]: Disk 11c.40.8 Shelf 40 Drawer 1 Slot 8 Bay 8 [NETAPP X377_HLBRE10TA07 NA03] S/N [XXXXXXX] UID [x:x:x:x] has been inserted into the system
Tue Aug 06 15:52:33 -0400 [node1: dmgr_thread: raid.disk.inserted:info]: Disk 11a.40.53 Shelf 40 Bay 53 [NETAPP X377_HLBRE10TA07 NA03] S/N [XXXXXXX] UID [x:x:x:x] has been inserted into the system
Tue Aug 06 15:52:33 -0400 [node1: dmgr_thread: raid.disk.inserted:info]: Disk 11a.40.13 Shelf 40 Bay 13 [NETAPP X377_HLBRE10TA07 NA03] S/N [XXXXXXX] UID [x:x:x:x] has been inserted into the system
Tue Aug 06 15:52:33 -0400 [node1: dmgr_thread: raid.disk.inserted:info]: Disk 11c.40.54 Shelf 40 Drawer 5 Slot 6 Bay 54 [NETAPP X377_HLBRE10TA07 NA03] S/N [XXXXXXX] UID [x:x:x:x] has been inserted into the system
Tue Aug 06 15:52:33 -0400 [node1: dmgr_thread: raid.disk.inserted:info]: Disk 11a.40.15 Shelf 40 Bay 15 [NETAPP X377_HLBRE10TA07 NA03] S/N [XXXXXXX] UID [x:x:x:x] has been inserted into the system
Tue Aug 06 15:52:33 -0400 [node1: dmgr_thread: raid.disk.inserted:info]: Disk 11a.40.37 Shelf 40 Bay 37 [NETAPP X377_HLBRE10TA07 NA03] S/N [XXXXXXX] UID [x:x:x:x] has been inserted into the system
Tue Aug 06 15:52:33 -0400 [node1: dmgr_thread: raid.disk.inserted:info]: Disk 11c.40.34 Shelf 40 Drawer 3 Slot 10 Bay 34 [NETAPP X377_HLBRE10TA07 NA03] S/N [XXXXXXX] UID [x:x:x:x] has been inserted into the system
Tue Aug 06 15:52:33 -0400 [node1: dmgr_thread: raid.disk.inserted:info]: Disk 11a.40.43 Shelf 40 Bay 43 [NETAPP X377_HLBRE10TA07 NA03] S/N [XXXXXXX] UID [x:x:x:x] has been inserted into the system
Tue Aug 06 15:52:33 -0400 [node1: dmgr_thread: raid.disk.inserted:info]: Disk 11a.40.41 Shelf 40 Bay 41 [NETAPP X377_HLBRE10TA07 NA03] S/N [XXXXXXX] UID [x:x:x:x] has been inserted into the system
Tue Aug 06 15:52:33 -0400 [node1: dmgr_thread: raid.disk.inserted:info]: Disk 11c.40.56 Shelf 40 Drawer 5 Slot 8 Bay 56 [NETAPP X377_HLBRE10TA07 NA03] S/N [XXXXXXXX] UID [x:x:x:x] has been inserted into the system
Tue Aug 06 15:52:33 -0400 [node1: dmgr_thread: raid.disk.inserted:info]: Disk 11c.40.50 Shelf 40 Drawer 5 Slot 2 Bay 50 [NETAPP X377_HLBRE10TA07 NA03] S/N [XXXXXXXX] UID [x:x:x:x] has been inserted into the system
Tue Aug 06 15:52:33 -0400 [node1: dmgr_thread: raid.disk.inserted:info]: Disk 11c.40.10 Shelf 40 Drawer 1 Slot 10 Bay 10 [NETAPP X377_HLBRE10TA07 NA03] S/N [XXXXXXXX] UID [x:x:x:x] has been inserted into the system
Tue Aug 06 15:52:33 -0400 [node1: dmgr_thread: raid.disk.inserted:info]: Disk 11a.40.35 Shelf 40 Bay 35 [NETAPP X377_HLBRE10TA07 NA03] S/N [XXXXXXX] UID [x:x:x:x] has been inserted into the system
Tue Aug 06 15:52:33 -0400 [node1: dmgr_thread: raid.disk.inserted:info]: Disk 11c.40.52 Shelf 40 Drawer 5 Slot 4 Bay 52 [NETAPP X377_HLBRE10TA07 NA03] S/N [XXXXXXXX] UID [x:x:x:x] has been inserted into the system
Tue Aug 06 15:52:33 -0400 [node1: dmgr_thread: raid.disk.inserted:info]: Disk 11a.40.25 Shelf 40 Bay 25 [NETAPP X377_HLBRE10TA07 NA03] S/N [XXXXXXX] UID [x:x:x:x] has been inserted into the system
Tue Aug 06 15:52:33 -0400 [node1: dmgr_thread: raid.disk.inserted:info]: Disk 11c.40.38 Shelf 40 Drawer 4 Slot 2 Bay 38 [NETAPP X377_HLBRE10TA07 NA03] S/N [XXXXXXXX] UID [x:x:x:x] has been inserted into the system
Tue Aug 06 15:52:33 -0400 [node1: dmgr_thread: raid.disk.inserted:info]: Disk 11a.40.7 Shelf 40 Bay 7 [NETAPP X377_HLBRE10TA07 NA03] S/N [XXXXXXXX] UID [x:x:x:x] has been inserted into the system
Tue Aug 06 15:52:33 -0400 [node1: dmgr_thread: raid.disk.inserted:info]: Disk 11a.40.29 Shelf 40 Bay 29 [NETAPP X377_HLBRE10TA07 NA03] S/N [XXXXXXXX] UID [x:x:x:x] has been inserted into the system
Tue Aug 06 15:52:33 -0400 [node1: dmgr_thread: raid.disk.inserted:info]: Disk 11a.40.21 Shelf 40 Bay 21 [NETAPP X377_HLBRE10TA07 NA03] S/N [XXXXXXXX] UID [x:x:x:x] has been inserted into the system
Tue Aug 06 15:52:33 -0400 [node1: dmgr_thread: raid.disk.inserted:info]: Disk 11c.40.14 Shelf 40 Drawer 2 Slot 2 Bay 14 [NETAPP X377_SEVNE10TA07 NA02] S/N [XXXXXXXX] UID [x:x:x:x] has been inserted into the system
6. The system returns to an optimal state and the node is up
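
The sequence above (link errors, disks reported missing, multidisk panic, disks reinserted after reboot) can be checked against a saved EMS or console log with a small script. The following is a minimal Python sketch, not a NetApp tool: the function names `split_events` and `summarize`, the regular expressions, and the shortened `SAMPLE` text are assumptions made for illustration and are based only on the log format shown in this article. It keys disks by shelf and bay, because the same disk can appear under a different adapter path (for example 11c.40.7 before the panic and 11a.40.7 after the reboot).

```python
import re

# Start of one EMS record, e.g.
# "Tue Aug 06 15:28:56 -0400 [node1: config_thread: raid.rg.degraded:notice]: "
# (pattern is an assumption derived from the excerpts in this article)
EVENT_RE = re.compile(
    r"[A-Z][a-z]{2} [A-Z][a-z]{2} \d{2} \d{2}:\d{2}:\d{2} [+-]\d{4} "
    r"\[[^:\]]+: [^:\]]+: (?P<event>[\w.]+):\w+\]: ?"
)

# Disk name such as "11c.40.7" or "/aggr/plex0/rg0/11c.40.7", followed by "Shelf"
DISK_RE = re.compile(r"Disk (?:\S*/)?\w+\.(?P<shelf>\d+)\.(?P<bay>\d+) Shelf")


def split_events(text):
    """Split a run of (possibly concatenated) EMS records into (event, message) pairs."""
    matches = list(EVENT_RE.finditer(text))
    events = []
    for i, m in enumerate(matches):
        end = matches[i + 1].start() if i + 1 < len(matches) else len(text)
        events.append((m.group("event"), text[m.end():end].strip()))
    return events


def summarize(text):
    """Report whether a multidisk panic was logged and which disks never came back."""
    missing, inserted, panicked = set(), set(), False
    for event, message in split_events(text):
        disk = DISK_RE.search(message)
        if event.endswith(".disk.missing") and disk:
            missing.add((disk["shelf"], disk["bay"]))
        elif event == "raid.disk.inserted" and disk:
            inserted.add((disk["shelf"], disk["bay"]))
        elif event == "cf.multidisk.fatalProblem":
            panicked = True
    return {
        "panicked": panicked,
        "missing": missing,
        "inserted": inserted,
        "still_missing": missing - inserted,
    }


# Shortened records modeled on the excerpts above (model, S/N and UID fields omitted)
SAMPLE = (
    "Tue Aug 06 15:28:56 -0400 [node1: pmcsas_timeout_2: sas.link.error:error]: "
    "Could not recover link on SAS adapter 11c after 110 seconds. Offlining the adapter. "
    "Tue Aug 06 15:28:56 -0400 [node1: config_thread: raid.config.filesystem.disk.missing:info]: "
    "File system Disk /node1_sata_aggr2/plex0/rg0/11c.40.7 Shelf 40 Drawer 1 Slot 7 Bay 7 is missing. "
    "Tue Aug 06 15:28:56 -0400 [node1: config_thread: cf.multidisk.fatalProblem:error]: "
    "Node encountered a multidisk error or other fatal error while waiting to be taken over. "
    "Tue Aug 06 15:52:33 -0400 [node1: dmgr_thread: raid.disk.inserted:info]: "
    "Disk 11a.40.7 Shelf 40 Bay 7 has been inserted into the system"
)

if __name__ == "__main__":
    print(summarize(SAMPLE))
```

An empty `still_missing` set together with `panicked` set to true is consistent with the behavior described in this article. A non-empty set only means that no matching insert record was found in the text supplied, which can also happen when the log excerpt is truncated.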