Remote drives fail after DWDM maintenance
Applies to
- MetroCluster IP
- ONTAP 9
Issue
- Errors are observed in the EMS log
- Remote drives are reported as failed
- SYNCMIRROR PLEX FAILED and MULTIPLE DISKS MISSING AutoSupport messages are generated
- EMS log:
Mon Nov 08 12:32:07 +0100 [ClusterA-02: kernel: iscsi.session.stateChanged:notice]: iSCSI session state is changed to Reconnecting for the target iqn.2016-07.com.netapp:  (type: dr_auxiliary, address: 0.0.0.0:65200). Reason: no ping reply after 5 seconds.
Mon Nov 08 12:32:07 +0100 [ClusterA-02: intr: ctl.session.stateChanged:notice]: iSCSI CAM target layer's session state is changed to terminated for the initiator iqn.1994-09.org.freebsd:  (address: 0.0.0.0). Reason: no ping reply after 5 seconds.
Mon Nov 08 12:32:07 +0100 [ClusterA-02: kernel: iscsi.session.stateChanged:notice]: iSCSI session state is changed to Reconnecting for the target iqn.2016-06.com.netapp:  (type: dr_auxiliary, address: 0.0.0.0:65200). Reason: no ping reply after 5 seconds.
Mon Nov 08 12:32:07 +0100 [ClusterA-02: doneq0: scsi.mcc.adt.ioTransportError:error]: mcc_adt[1] - Transport error during execution of command: HA status 0x13: CAM transport status 0x1b : cdb 0x88:00000001bf178980:00000008.
Mon Nov 08 12:32:07 +0100 [ClusterA-02: doneq0: scsi.mcc.adt.ioTransportError:error]: mcc_adt[1] - Transport error during execution of command: HA status 0x13: CAM transport status 0x1b : cdb 0x88:00000001bf178190:00000008.
Mon Nov 08 12:32:07 +0100 [ClusterA-02: scsi_cmdblk_strthr_admin: scsi.cmd.abortedByHost:error]: Disk device 0v.i2.1L21: Command aborted by host adapter: HA status 0x13: cdb 0x88:00000001bf178980:00000008. 
- sysconfig -r output:
Plex /ClusterA-02/plex1 (offline, failed, inactive, pool1)
    RAID group /ClusterA-02/plex1/rg0 (partial, block checksums)
      RAID Disk	Device    	HA  SHELF BAY CHAN Pool Type  RPM  Used (MB/blks)    Phys (MB/blks)
      ---------	------    	------------- ---- ---- ---- ----- --------------    --------------
      dparity 	0m.i2.3L13P1	0m    20  12         1   SSD   N/A 1799343/3685054464 1799351/3685070848 (fast zeroed)
      parity  	0m.i1.0L14P1	0m    20  13         1   SSD   N/A 1799343/3685054464 1799351/3685070848 (fast zeroed)
      data	            FAILED    		N/A                        1799343/ -
      data	            FAILED    		N/A                        1799343/ -
      data	            FAILED    		N/A                        1799343/ -
      data	            FAILED    		N/A                        1799343/ -
      data	            FAILED    		N/A                        1799343/ -
      Raid group is missing 5 disks.
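
When triaging a saved copy of the output above, it can help to tally the EMS events by name and to count the FAILED entries in the RAID group listing. The following is a minimal sketch, not a NetApp tool: the function names are hypothetical, and the regular expression assumes EMS lines follow the same layout as the sample lines shown in this article (timestamp, then `[node: process: event:severity]:`, then the message).

```python
import re
from collections import Counter

# Assumed layout, taken from the sample lines in this article:
# "Mon Nov 08 12:32:07 +0100 [node: process: event.name:severity]: message"
EMS_LINE = re.compile(
    r"^(?P<timestamp>\w{3} \w{3} \d{2} \d{2}:\d{2}:\d{2} [+-]\d{4}) "
    r"\[(?P<node>[^:\]]+): (?P<process>[^:\]]+): "
    r"(?P<event>[\w.]+):(?P<severity>\w+)\]: (?P<message>.*)$"
)


def parse_ems_line(line):
    """Return a dict of EMS fields, or None if the line does not match."""
    match = EMS_LINE.match(line.strip())
    return match.groupdict() if match else None


def summarize_events(lines):
    """Count EMS events per (node, event name, severity); skip other lines."""
    counts = Counter()
    for line in lines:
        record = parse_ems_line(line)
        if record is not None:
            counts[(record["node"], record["event"], record["severity"])] += 1
    return counts


def count_failed_disks(sysconfig_lines):
    """Count RAID group rows whose device column reads FAILED."""
    failed = 0
    for line in sysconfig_lines:
        tokens = line.split()
        # Rows look like: "data  FAILED  N/A  1799343/ -"
        if len(tokens) >= 2 and tokens[1] == "FAILED":
            failed += 1
    return failed
```

Passing the EMS lines above to `summarize_events` groups the `iscsi.session.stateChanged`, `ctl.session.stateChanged`, `scsi.mcc.adt.ioTransportError`, and `scsi.cmd.abortedByHost` events by node, and `count_failed_disks` can be checked against the "Raid group is missing N disks" line in the `sysconfig -r` output.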
