Adapter timeouts causing LUN disconnects
Applies to
- ONTAP 9
- Brocade switch
- Fibre Channel Protocol (FCP)
- Windows Host
- ESXi Host
- QLogic adapters on storage
- Fabric Performance Impact Notifications (FPIN)
Issue
- Hosts lose access to their LUNs every time a host patch update is applied.
- Newly created FC LIFs do not come up with operational status up.
- The host is unable to connect to the LUNs until a takeover/giveback (TO/GB) is performed on ONTAP.
- All LUNs are disconnected from the hosts after the ONTAP upgrade.
- Rebooting the hosts does not resolve the issue.
- Several adapters on ONTAP are timing out, disconnecting different hosts.
Example:
cluster01::> network fcp adapter show -node node1 -adapter Xa
Error: show failed: Timeout while getting fabric information
cluster01::> network fcp adapter show -node node01 -adapter Xb
Error: show failed: Timeout while getting fabric information
- Timeout messages are observed in MGWD.log:
Example:
[kern_mgwd:info:2548] 0x83771bf00: 0: ERR: SAN::FCP::ADAPTER_KERNEL: src/tables/san/fcp_adapter_internal.cc:get_imp:95 returning: 418/24 - Timeout while getting fabric information
[kern_mgwd:info:2548] 0x83771bf00: 0: ERR: SAN::FCP::ADAPTER: src/tables/san/fcp_adapter.cc:get_imp:719 returning: 418/24 - Timeout while getting fabric information
[kern_mgwd:info:2548] 0x83771bf00: 0: ERR: NET::VIF::SAN: src/tables/san/net_vif_san.cc:populateFcpPortmap:991 Failed getting the FCP port on node netapp01 for lif lif01: Timeout while getting fabric information
- Bouncing the port from ONTAP helps temporarily, but the issue returns within an hour or two
- Bouncing the port from the switch does not change anything
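
When reviewing a saved copy of MGWD.log, it can help to count the timeout messages and list the node and LIF pairs they name. The following is a minimal Python sketch, not a NetApp tool. It assumes the log lines follow the format of the example messages above; the function name find_fabric_timeouts and the regular expression are hypothetical and written only for this illustration.

```python
import re

# Message text taken from the MGWD.log examples above.
TIMEOUT_TEXT = "Timeout while getting fabric information"

# Assumed format of the populateFcpPortmap message, based on the example line.
LIF_PATTERN = re.compile(
    r"Failed getting the FCP port on node (?P<node>\S+) for lif (?P<lif>\S+?):"
)


def find_fabric_timeouts(lines):
    """Return (number of timeout lines, sorted list of (node, lif) pairs)."""
    count = 0
    affected = set()
    for line in lines:
        if TIMEOUT_TEXT not in line:
            continue  # not a fabric-information timeout
        count += 1
        match = LIF_PATTERN.search(line)
        if match:
            # Only the populateFcpPortmap message names a node and a LIF.
            affected.add((match.group("node"), match.group("lif")))
    return count, sorted(affected)
```

The function accepts any iterable of lines, so an open file handle for a local copy of the log can be passed to it directly. A steadily growing count across several adapters would be consistent with the symptoms listed above, but the sketch only summarizes the log and does not diagnose the cause.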