CONTAP-635287: Multi-disk fault on both nodes in HA pair which recovers and no additional issues are seen
Issue
- One node goes down with a panic and the second node takes over.
- Example of the multi-disk panic string:
PANIC: aggr aggrname: raid volfsm, fatal multi-disk error..
- All paths to external shelves go missing:
2025-12-12T16:03:00Z 11896312711239144 [17:0] NVMEOF_INFO: peg_nvmeof_discovery_calculate_is_last_path: fe80::d239:eaff:feb0:d4f5: All paths to shelf 6 (SN: XXXXXXXXXXXXXXX) are down
2025-12-12T16:03:01Z 11896312970733882 [6:0] NVMEOF_INFO: peg_nvmeof_discovery_calculate_is_last_path: fe80::d239:eaff:feb0:d31d: All paths to shelf 3 (SN: XXXXXXXXXXXXXXX) are down
2025-12-12T16:03:01Z 11896313578405878 [4:0] NVMEOF_INFO: peg_nvmeof_discovery_calculate_is_last_path: fe80::d239:eaff:feb0:e405: All paths to shelf 2 (SN: XXXXXXXXXXXXXXX) are down
2025-12-12T16:03:01Z 11896313763032190 [13:0] NVMEOF_INFO: peg_nvmeof_discovery_calculate_is_last_path: fe80::d239:eaff:feb0:e5b1: All paths to shelf 7 (SN: XXXXXXXXXXXXXXX) are down
2025-12-12T16:03:01Z 11896314434659598 [0:0] NVMEOF_INFO: peg_nvmeof_discovery_calculate_is_last_path: fe80::d239:eaff:feb0:d5d1: All paths to shelf 4 (SN: XXXXXXXXXXXXXXX) are down
2025-12-12T16:03:02Z 11896315306690008 [1:0] NVMEOF_INFO: peg_nvmeof_discovery_calculate_is_last_path: fe80::d239:eaff:feb0:d36d: All paths to shelf 1 (SN: XXXXXXXXXXXXXXX) are down
2025-12-12T16:03:02Z 11896315735472858 [22:0] NVMEOF_INFO: peg_nvmeof_discovery_calculate_is_last_path: fe80::d239:eaff:feb0:e4a5: All paths to shelf 5 (SN: XXXXXXXXXXXXXXX) are down
2025-12-12T16:03:02Z 11896316928272420 [6:0] NVMEOF_INFO: peg_nvmeof_discovery_calculate_is_last_path: fe80::d239:eaff:fea1:63ac: All paths to shelf 8 (SN: XXXXXXXXXXXXXXX) are down
- Link down on storage ports can be seen:
[?] Fri Dec 12 10:05:07 -0600 [nodename: kernel: netif.linkDown:info]: Ethernet e5a: Link down, check cable.
[?] Fri Dec 12 10:05:07 -0600 [nodename: kernel: netif.linkDown:info]: Ethernet e5b: Link down, check cable.
- After the panics, both nodes come back online and no additional issues are seen.
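When checking whether a set of logs matches this signature, it can help to list which shelves reported all paths down and which ports logged a link down event. The following is a minimal sketch in Python, not a supported tool. It assumes the log lines have exactly the format shown in the excerpts above, and the names SHELF_DOWN, LINK_DOWN and summarize are hypothetical, chosen only for this illustration.

```python
import re

# Assumed format, taken from the excerpts above:
# "<timestamp> <counter> [n:n] NVMEOF_INFO: ... All paths to shelf N (SN: ...) are down"
SHELF_DOWN = re.compile(
    r"(?P<ts>\d{4}-\d{2}-\d{2}T[\d:]+Z)\s+\d+\s+\[\d+:\d+\]\s+NVMEOF_INFO:.*?"
    r"All paths to shelf (?P<shelf>\d+) \(SN: (?P<sn>[^)]+)\) are down"
)

# Assumed format of the link down event, taken from the excerpts above.
LINK_DOWN = re.compile(
    r"netif\.linkDown:info\]: Ethernet (?P<port>\S+): Link down"
)


def summarize(lines):
    """Return ({shelf number: first timestamp seen}, sorted list of ports)."""
    shelves = {}
    ports = set()
    for line in lines:
        match = SHELF_DOWN.search(line)
        if match:
            # Keep only the first "all paths down" timestamp per shelf.
            shelves.setdefault(int(match.group("shelf")), match.group("ts"))
            continue
        match = LINK_DOWN.search(line)
        if match:
            ports.add(match.group("port"))
    return shelves, sorted(ports)
```

Passing the lines of a saved log file to summarize() returns the shelves and ports that match. If every external shelf is listed and the storage ports show link down events in the same time window, the logs are consistent with the signature described here.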
