Adapter timeouts causing LUN disconnects
Applies to
- ONTAP 9
- Brocade switch
- Fibre Channel Protocol (FCP)
- Windows Host
Issue
- Windows Cluster failover fails because disks are not visible to the second node or via the second path
- Newly created FC LIFs do not come up with an operational status of up
- Several FC adapters on ONTAP time out, disconnecting multiple hosts
Example:
cluster01::> network fcp adapter show -node node1 -adapter Xa
Error: show failed: Timeout while getting fabric information
cluster01::> network fcp adapter show -node node01 -adapter Xb
Error: show failed: Timeout while getting fabric information
- The MGWD log is flooded with the following timeout messages
Example:
[kern_mgwd:info:2548] 0x83771bf00: 0: ERR: SAN::FCP::ADAPTER_KERNEL: src/tables/san/fcp_adapter_internal.cc:get_imp:95 returning: 418/24 - Timeout while getting fabric information
[kern_mgwd:info:2548] 0x83771bf00: 0: ERR: SAN::FCP::ADAPTER: src/tables/san/fcp_adapter.cc:get_imp:719 returning: 418/24 - Timeout while getting fabric information
[kern_mgwd:info:2548] 0x83771bf00: 0: ERR: NET::VIF::SAN: src/tables/san/net_vif_san.cc:populateFcpPortmap:991 Failed getting the FCP port on node netapp01 for lif lif01: Timeout while getting fabric information
- Bouncing the port from ONTAP helps temporarily, but the issue returns within an hour or two
- Bouncing the port from the switch does not change anything
- Fabric Performance Impact Notifications (FPIN) are in use on the fabric
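To gauge how widespread the timeouts are, the MGWD log messages shown above can be tallied per reporting component. The following is a minimal Python sketch, not a NetApp tool: the function name count_fabric_timeouts and the regular expression are illustrative assumptions based only on the message format in the example lines, and the sample input simply reuses those lines.

```python
import re
from collections import Counter

# Message text taken from the MGWD log examples in this article.
TIMEOUT_TEXT = "Timeout while getting fabric information"

# Assumption: the component name follows "ERR: " and consists of
# upper-case segments joined by "::", as in the example lines.
COMPONENT_RE = re.compile(r"ERR: ([A-Z_]+(?:::[A-Z_]+)*): ")


def count_fabric_timeouts(lines):
    """Count fabric-timeout messages per reporting component."""
    counts = Counter()
    for line in lines:
        if TIMEOUT_TEXT not in line:
            continue  # ignore unrelated log lines
        match = COMPONENT_RE.search(line)
        counts[match.group(1) if match else "unknown"] += 1
    return counts


# Sample input: the three log lines quoted in this article.
sample_lines = [
    "[kern_mgwd:info:2548] 0x83771bf00: 0: ERR: SAN::FCP::ADAPTER_KERNEL: "
    "src/tables/san/fcp_adapter_internal.cc:get_imp:95 returning: 418/24 - "
    "Timeout while getting fabric information",
    "[kern_mgwd:info:2548] 0x83771bf00: 0: ERR: SAN::FCP::ADAPTER: "
    "src/tables/san/fcp_adapter.cc:get_imp:719 returning: 418/24 - "
    "Timeout while getting fabric information",
    "[kern_mgwd:info:2548] 0x83771bf00: 0: ERR: NET::VIF::SAN: "
    "src/tables/san/net_vif_san.cc:populateFcpPortmap:991 Failed getting the "
    "FCP port on node netapp01 for lif lif01: "
    "Timeout while getting fabric information",
]

if __name__ == "__main__":
    for component, count in count_fabric_timeouts(sample_lines).items():
        print(f"{component}: {count}")
```

In practice the function could be fed the lines of a saved MGWD log file instead of the sample list; the location and name of that file depend on how the logs were collected and are not assumed here.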