NetApp Knowledge Base

CHW-3644: AFF-C800 MDP during partner node reseat

Category: ontap-9
Specialty: HW

Issue

  • The partner node experienced an L2 watchdog reset and has been taken over
  • All disks are properly seated and fully inserted
  • While the partner node is being reseated, the node that has taken over experiences a multi-disk fault:

 
PANIC: aggrname: raid volfsm, fatal multi-disk error..
 
  • PCIe link disabled messages are seen prior to the multi-disk fault:

 
[?]  Wed Nov 19 22:44:47 -0500 [nodename: kernel: nvme.link.disabled.error:error]: PCIe link disabled for NVMe SSD in slot 6 due to excessive errors.
[?]  Wed Nov 19 22:44:47 -0500 [nodename: kernel: nvme.link.disabled.error:error]: PCIe link disabled for NVMe SSD in slot 4 due to excessive errors.
[?]  Wed Nov 19 22:44:47 -0500 [nodename: kernel: nvme.link.disabled.error:error]: PCIe link disabled for NVMe SSD in slot 1 due to excessive errors.
[?]  Wed Nov 19 22:44:47 -0500 [nodename: kernel: nvme.link.disabled.error:error]: PCIe link disabled for NVMe SSD in slot 8 due to excessive errors.
[?]  Wed Nov 19 22:44:47 -0500 [nodename: kernel: nvme.link.disabled.error:error]: PCIe link disabled for NVMe SSD in slot 7 due to excessive errors.
[?]  Wed Nov 19 22:44:47 -0500 [nodename: kernel: nvme.link.disabled.error:error]: PCIe link disabled for NVMe SSD in slot 30 due to excessive errors.

 
 
  • SCSI check conditions are seen prior to the multi-disk fault:

 
Wed Nov 19 22:44:12 -0500 [vama822-03: scsi_cmdblk_strthr_admin: scsi.cmd.checkCondition:error]: Unknown device 0n.1L9998: Check Condition: CDB 0x12: Sense Data SCSI:aborted command -  (0xb - 0x90 0x6 0xfa)(26876).
Wed Nov 19 22:44:12 -0500 [vama822-03: scsi_cmdblk_strthr_admin: scsi.cmd.checkCondition:error]: Unknown device 0n.4L9998: Check Condition: CDB 0x12: Sense Data SCSI:aborted command -  (0xb - 0x90 0x6 0xfa)(26877).
Wed Nov 19 22:44:12 -0500 [vama822-03: scsi_cmdblk_strthr_admin: scsi.cmd.checkCondition:error]: Unknown device 0n.30L9998: Check Condition: CDB 0x12: Sense Data SCSI:aborted command -  (0xb - 0x90 0x6 0xfa)(26876).
Wed Nov 19 22:44:12 -0500 [vama822-03: scsi_cmdblk_strthr_admin: scsi.cmd.checkCondition:error]: Unknown device 0n.8L9998: Check Condition: CDB 0x12: Sense Data SCSI:aborted command -  (0xb - 0x90 0x6 0xfa)(26876).
Wed Nov 19 22:44:12 -0500 [vama822-03: scsi_cmdblk_strthr_admin: scsi.cmd.checkCondition:error]: Unknown device 0n.6L9998: Check Condition: CDB 0x12: Sense Data SCSI:aborted command -  (0xb - 0x90 0x6 0xfa)(26877).
Wed Nov 19 22:44:12 -0500 [vama822-03: scsi_cmdblk_strthr_admin: scsi.cmd.checkCondition:error]: Unknown device 0n.7L9998: Check Condition: CDB 0x12: Sense Data SCSI:aborted command -  (0xb - 0x90 0x6 0xfa)(26877).
Wed Nov 19 22:44:42 -0500 [vama822-03: scsi_cmdblk_strthr_admin: scsi.cmd.checkCondition:error]: Unknown device 0n.6L9998: Check Condition: CDB 0x12: Sense Data SCSI:aborted command -  (0xb - 0x90 0x6 0xfa)(57106).
Wed Nov 19 22:44:42 -0500 [vama822-03: scsi_cmdblk_strthr_admin: scsi.cmd.checkCondition:error]: Unknown device 0n.1L9998: Check Condition: CDB 0x12: Sense Data SCSI:aborted command -  (0xb - 0x90 0x6 0xfa)(57110).
Wed Nov 19 22:44:42 -0500 [vama822-03: scsi_cmdblk_strthr_admin: scsi.cmd.checkCondition:error]: Unknown device 0n.4L9998: Check Condition: CDB 0x12: Sense Data SCSI:aborted command -  (0xb - 0x90 0x6 0xfa)(57111).
Wed Nov 19 22:44:42 -0500 [vama822-03: scsi_cmdblk_strthr_admin: scsi.cmd.checkCondition:error]: Unknown device 0n.8L9998: Check Condition: CDB 0x12: Sense Data SCSI:aborted command -  (0xb - 0x90 0x6 0xfa)(57120).
Wed Nov 19 22:44:42 -0500 [vama822-03: scsi_cmdblk_strthr_admin: scsi.cmd.checkCondition:error]: Unknown device 0n.7L9998: Check Condition: CDB 0x12: Sense Data SCSI:aborted command -  (0xb - 0x90 0x6 0xfa)(57129).
Wed Nov 19 22:44:42 -0500 [vama822-03: scsi_cmdblk_strthr_admin: scsi.cmd.checkCondition:error]: Unknown device 0n.30L9998: Check Condition: CDB 0x12: Sense Data SCSI:aborted command -  (0xb - 0x90 0x6 0xfa)(57133).
 

  • After the reseat, the partner node recovers from the L2 watchdog reset
  • The node that experienced the panic recovers after it boots back up
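
To check whether a given system shows the same pattern, it can help to confirm that the slots named in the PCIe link disabled messages are the same ones reporting SCSI check conditions. The following is a minimal Python sketch, not a NetApp tool. The regular expressions are assumptions derived only from the message shapes shown above, the function name affected_slots is hypothetical, and the mapping of a device name such as 0n.6L9998 to slot 6 is an assumption based on these excerpts.

```python
import re

# Assumption: patterns are inferred from the log excerpts in this article,
# not from a documented EMS message format.
LINK_DISABLED = re.compile(
    r"nvme\.link\.disabled\.error.*?NVMe SSD in slot (\d+)"
)
# Assumption: the number after "0n." is the drive slot (0n.6L9998 -> slot 6).
CHECK_CONDITION = re.compile(
    r"scsi\.cmd\.checkCondition.*?Unknown device 0n\.(\d+)L\d+"
)


def affected_slots(lines):
    """Return (link_disabled_slots, check_condition_slots) as sets of ints."""
    link_slots, scsi_slots = set(), set()
    for line in lines:
        match = LINK_DISABLED.search(line)
        if match:
            link_slots.add(int(match.group(1)))
            continue
        match = CHECK_CONDITION.search(line)
        if match:
            scsi_slots.add(int(match.group(1)))
    return link_slots, scsi_slots


if __name__ == "__main__":
    import sys

    # Usage (hypothetical file name): python slots.py < ems_log.txt
    link, scsi = affected_slots(sys.stdin)
    print("link disabled slots:  ", sorted(link))
    print("check condition slots:", sorted(scsi))
    print("only in one set:      ", sorted(link ^ scsi))
```

Applied to the excerpts above, both sets contain slots 1, 4, 6, 7, 8, and 30, so every drive whose link was disabled also logged check conditions shortly beforehand.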

 
 

