Node reboot because of Multi-Disk Panic with one or more shelves missing
Applies to
- All FAS/AFF Systems
- Disk Shelves
Issue
- The shelves are powered off.
- The node panics with a Multi-Disk Panic (MDP) and logs a shelf power interrupted message:
PANIC: aggr aggr0: raid volfsm, fatal multi-disk error. raid type raid_dp in SK process config_thread on release 8.2.3P4 on Mon Feb 10 07:52:07 CET 2020
version: 8.2.3P4: Fri May 29 14:06:54 PDT 2015
compile flags: x86_64
Writing panic info to HA mailbox disks.
HA: current time (in sk_msecs) 128169944521 (in sk_cycles) 299073053244098161
CF monitor: takeover will occur on reboot
Mon Feb 10 07:51:19 CET [nodeb: dsa_worker3: callhome_shlf_power_intr_1:error]: params: {'subject': 'SHELF POWER INTERRUPTED'}
- Instead of the panic string, the console logs may report the following:
Waiting to be taken over. REBOOT in 22 seconds.
- Service Processor (SP) Event record:
Record 2200: Mon Feb 10 01:49:54 2020 [Trap Event.critical]: hwassist abnormal_reboot (28)
- If the shelf containing the root aggregate disks, which include the mailbox disks, goes offline, the partner is unable to take over and an Active IQ alert is generated:
HA Group Notification (PARTNER DOWN, TAKEOVER IMPOSSIBLE ) ERROR
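The symptoms listed above can be checked against a saved console or EMS log with a short script. The following is a minimal sketch, not a NetApp tool: the function name `find_signatures`, the `SIGNATURES` table, and the regular expressions are assumptions derived only from the sample messages in this article, and message text may differ between ONTAP releases.

```python
import re

# Hypothetical signature table: each pattern is derived from the sample
# messages shown in this article, not from an official message catalog.
SIGNATURES = {
    "multi_disk_panic": re.compile(r"PANIC: .*fatal multi-disk error"),
    "shelf_power_interrupted": re.compile(
        r"callhome_shlf_power_intr|SHELF POWER INTERRUPTED"
    ),
    "waiting_for_takeover": re.compile(r"Waiting to be taken over"),
    "abnormal_reboot": re.compile(r"hwassist abnormal_reboot"),
    "takeover_impossible": re.compile(r"PARTNER DOWN, TAKEOVER IMPOSSIBLE"),
}


def find_signatures(log_text):
    """Return the sorted names of all signatures found in log_text."""
    return sorted(
        name for name, pattern in SIGNATURES.items() if pattern.search(log_text)
    )


if __name__ == "__main__":
    # Sample lines copied from the Issue section of this article.
    sample = (
        "PANIC: aggr aggr0: raid volfsm, fatal multi-disk error. "
        "raid type raid_dp in SK process config_thread\n"
        "[nodeb: dsa_worker3: callhome_shlf_power_intr_1:error]: "
        "params: {'subject': 'SHELF POWER INTERRUPTED'}\n"
    )
    print(find_signatures(sample))
```

A log that matches both `multi_disk_panic` and `shelf_power_interrupted` is consistent with the scenario described here; a match on `takeover_impossible` suggests that the affected shelf also held the mailbox disks.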