Fabric MetroCluster - Link Reset on ISL Port(s)
Applies to
- Fabric-attached MetroCluster
- ONTAP 9
- Back-end Brocade Switches
Issue
Possible symptoms:
- The EMS log shows a combination of FC-VI adapter disconnection errors and transport errors when accessing disks at the remote site:
[node1: fcvi_cm: fcvi.qlgc.sent.disconnect:notice]: FC-VI adapter: Disconnect request sent on port 1a. QP name = RAID, QP index = 3, Remote node's system id = 123456789.
[node1: fcvi_cm: fcvi.qlgc.sent.disconnect:notice]: FC-VI adapter: Disconnect request sent on port 1a. QP name = MISC, QP index = 4, Remote node's system id = 123456789.
[node1: fcvi_cm: fcvi.qlgc.sent.disconnect:notice]: FC-VI adapter: Disconnect request sent on port 1a. QP name = STREAM, QP index = 5, Remote node's system id = 123456789.
[node1: fcvi_cm: fcvi.qlgc.sent.disconnect:notice]: FC-VI adapter: Disconnect request sent on port 1a. QP name = DRSOM_HB, QP index = 6, Remote node's system id = 123456789.
[node1: ispfcvi2500_main1: fcvi.qlgc.ioErr:error]: FC-VI adapter: FCVI driver on port 1a received IO error. Status = FW detected response error(status code = 0x121), FCVI opcode = Write Request(0x1), QP name = RAID, QP index = 11, Remote node's system id = 123456789.
[node1: fcvi_cm: fcvi.qlgc.sent.disconnect:notice]: FC-VI adapter: Disconnect request sent on port 1a. QP name = STREAM, QP index = 13, Remote node's system id = 123456789.
[node1: fcvi_cm: fcvi.qlgc.sent.disconnect:notice]: FC-VI adapter: Disconnect request sent on port 1a. QP name = DRSOM_HB, QP index = 14, Remote node's system id = 123456789.
[node1: fcvi_cm: fcvi.qlgc.sent.disconnect:notice]: FC-VI adapter: Disconnect request sent on port 1a. QP name = DRC_KILL, QP index = 15, Remote node's system id = 123456789.
[node1: isp2400_timeout_2: fci.device.quiesce:debug]: Adapter 5a encountered a command timeout on Disk device REMOTE_FAB1_SW1:6.126 (0x06070600) LUN 22 cdb 0x2a:5a93b808:01f8 retry: 0 Quiescing the device.
[node1: isp2400_timeout_2: fci.device.timeout:debug]: HBA 5a encountered a device timeout on Disk device REMOTE_FAB1_SW1:6.126 (0x06070600) LUN 30 cdb 0x2a:5a93b818:01e8 retry: 0
- The Brocade switches report that one or more ISL ports have reset:
CHASSIS, WARNING, Brocade6505, Single RDY/Frame Loss detected and recovered on Slot 0,Port 8(12) rdy(0x1)/frame(0x0).
CHASSIS, WARNING, Brocade6505, Multi RDY/Frame Loss detected on Slot 0, Port 8
CHASSIS, WARNING, Brocade6505, Link Reset on Port S0,P10
2334, FID 128, INFO,switch1, Port (ID: 8) QoS is disabled.
- The node may report disks at the remote site as failed:
[node1: disk_server_0: scsi.debug:debug]: shm_setup_for_failure disk REMOTE_FAB1_SW1:9.126L1024 (S/N XXXXXXXXXXXXXX) error 4000h
[node1: config_thread: raid.config.filesystem.disk.failed:error]: File system Disk n1_aggr/plex6/rg0/REMOTE_FAB1_SW1:9.126L1024 Shelf 12 Bay 23 [NETAPP X371_S1643960ATE NA54] S/N [XXXXXXXXXXXXXX] UID [5002538B:01290E60:00000000:00000000:00000000:00000000:00000000:00000000:00000000:00000000] failed.
- LUNs can repeatedly connect and disconnect, and might therefore be inaccessible.
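When several of these symptoms appear together, it can help to count how often each one occurs in a saved log file. The following Python sketch is one possible way to do that. It is not a NetApp or Brocade tool: the symptom labels and the function names (classify_line, summarize) are hypothetical, and the patterns are taken only from the sample messages shown above, so other message variants may need additional patterns.

```python
import re
from collections import Counter

# Substring patterns copied from the sample messages in this article.
# The dictionary keys are illustrative labels, not official event names.
SYMPTOM_PATTERNS = {
    "fcvi_disconnect": re.compile(r"fcvi\.qlgc\.sent\.disconnect"),
    "fcvi_io_error": re.compile(r"fcvi\.qlgc\.ioErr"),
    "disk_timeout": re.compile(r"fci\.device\.(?:quiesce|timeout)"),
    "isl_rdy_frame_loss": re.compile(r"RDY/Frame Loss detected"),
    "isl_link_reset": re.compile(r"Link Reset on Port"),
    "remote_disk_failed": re.compile(r"raid\.config\.filesystem\.disk\.failed"),
}


def classify_line(line):
    """Return the label of the first matching symptom, or None."""
    for label, pattern in SYMPTOM_PATTERNS.items():
        if pattern.search(line):
            return label
    return None


def summarize(lines):
    """Count how many lines match each symptom label."""
    counts = Counter()
    for line in lines:
        label = classify_line(line)
        if label is not None:
            counts[label] += 1
    return counts


if __name__ == "__main__":
    # Shortened copies of log lines quoted in this article.
    sample = [
        "[node1: fcvi_cm: fcvi.qlgc.sent.disconnect:notice]: FC-VI adapter: "
        "Disconnect request sent on port 1a. QP name = RAID, QP index = 3",
        "CHASSIS, WARNING, Brocade6505, Multi RDY/Frame Loss detected on Slot 0, Port 8",
        "CHASSIS, WARNING, Brocade6505, Link Reset on Port S0,P10",
        "2334, FID 128, INFO,switch1, Port (ID: 8) QoS is disabled.",
    ]
    for label, count in summarize(sample).items():
        print(label, count)
```

Node-side matches (FC-VI disconnects, I/O errors, disk timeouts) that occur together with switch-side matches (RDY/Frame Loss, Link Reset) correspond to the combination of symptoms described in this article. Lines that match no pattern, such as the QoS message, are ignored.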