In a SAN environment, a fabric is the transport layer and is essential for reliable communications. When the transport layer is unreliable, hosts may experience issues ranging from high or increased latency to LUN resets to IO timeouts.
In a NetApp SAN, either Ethernet or Fibre Channel can be used for data transport. This workflow covers common Fibre Channel issues.
If the issue is a NetApp LIF (not a port) that is down, see this KB: FCP/FCoE LIF reports operationally down
Link is flapping
A link flapping event is defined as a link that goes down and comes back up (usually immediately) with no user intervention. Usually, but not always, the link will go down again either immediately or shortly thereafter, and the process will repeat.
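The definition above can be turned into a rough check. The following is a minimal sketch, assuming link state changes have already been collected as time-ordered (timestamp, state) pairs. The event format, the function names, the 5-minute window, and the threshold of 3 cycles are all hypothetical values chosen for illustration, not values taken from any switch or controller.

```python
from datetime import datetime, timedelta


def count_flaps(events, window=timedelta(minutes=5)):
    """Return the largest number of down->up cycles seen within any `window`.

    `events` is a list of (datetime, state) tuples sorted by time, where
    state is "up" or "down". This event format is hypothetical.
    """
    # A flap completes when an "up" event directly follows a "down" event.
    flap_times = [
        t
        for (_, prev), (t, cur) in zip(events, events[1:])
        if prev == "down" and cur == "up"
    ]
    best = 0
    start = 0
    for end, t in enumerate(flap_times):
        # Advance the window start until the span fits inside `window`.
        while t - flap_times[start] > window:
            start += 1
        best = max(best, end - start + 1)
    return best


def is_flapping(events, window=timedelta(minutes=5), threshold=3):
    """Treat a link as flapping once `threshold` cycles fall in one window."""
    return count_flaps(events, window) >= threshold
```

A single bounce yields a count of 1 and is not reported as flapping under these assumed defaults. Repeated bounces within the window are.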
Port is down or offline on the switch
The switch reports the status of a port as down.
The switch may log an error stating the reason for the port being offline in the errdump output for Brocade or the show logging log output for Cisco:
Mon Jun 26 2017 09:45:41 GMT+03:00 Error S0,P12(Bp8) user_idx:12 [PID 0x3e0c00] faulted due to SFP validation failure.
Additionally, the switchshow output shows the port offline:
Index Port Address Media Speed State Proto
2 2 3e0200 id N16 No_Light FC
11 11 3e0b00 id N16 No_Light FC
12 12 3e0c00 id 16G Laser_Flt FC
19 19 3e1300 id N16 Laser_Flt FC
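When many ports are involved, the rows above can be filtered programmatically. The following is a minimal sketch, assuming the switchshow text has been saved and that each port row has the seven whitespace-separated columns shown in the header above. The function names are hypothetical, and the assumption that a healthy port reports the state Online should be checked against the output of the switch in question.

```python
def parse_switchshow_rows(text):
    """Parse port rows from saved switchshow text into dictionaries.

    Assumes the column order Index, Port, Address, Media, Speed, State,
    Proto. Any trailing fields on a row are ignored.
    """
    rows = []
    for line in text.splitlines():
        fields = line.split()
        # Port rows start with a numeric index; skip headers and blank lines.
        if len(fields) < 7 or not fields[0].isdigit():
            continue
        index, port, address, media, speed, state, proto = fields[:7]
        rows.append(
            {
                "index": int(index),
                "port": int(port),
                "address": address,
                "media": media,
                "speed": speed,
                "state": state,
                "proto": proto,
            }
        )
    return rows


def ports_not_online(rows):
    """Return rows whose state is anything other than "Online" (assumed)."""
    return [row for row in rows if row["state"] != "Online"]
```

Applied to the sample above, all four ports are returned, and filtering on the state Laser_Flt narrows the list to ports 12 and 19.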
Port is down or offline on the filer
The NetApp controller reports the port simply as offline or link disconnected, for example:
slot 1: Fibre Channel Target Host Adapter 1b (Emulex LPe32000 (LPe32002) rev. 12, <LINK DISCONNECTED>)
slot 0: FC Host Adapter 0c (QLogic 8324 rev. 2, L-port, <OFFLINE (hard)>)
slot 0: FC Host Adapter 0a (QLogic 8324 rev. 2, L-port, <OFFLINE>)
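The adapter lines above carry the status in angle brackets at the end of each line. The following is a minimal sketch for extracting the slot, adapter name, and status from saved output, assuming every adapter line follows the same shape as the three samples. The pattern and function name are hypothetical and were written only against those samples.

```python
import re

# Matches lines such as:
#   slot 0: FC Host Adapter 0c (QLogic 8324 rev. 2, L-port, <OFFLINE (hard)>)
# The shape is inferred from the sample lines only.
ADAPTER_LINE = re.compile(
    r"^slot\s+(\d+):\s+.*?Adapter\s+(\S+)\s+\(.*<([^>]+)>\)\s*$"
)


def parse_adapter_status(text):
    """Return a list of (slot, adapter, status) tuples from saved output."""
    results = []
    for line in text.splitlines():
        match = ADAPTER_LINE.match(line.strip())
        if match:
            slot, adapter, status = match.groups()
            results.append((int(slot), adapter, status))
    return results
```

Lines that do not match the assumed shape are skipped silently, so an empty result does not by itself mean that all ports are healthy.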
Port may not negotiate speed correctly
Ports negotiate a speed different from what is expected. For example, a 16 Gb port may negotiate to 8 Gb.
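A speed mismatch can be checked against the Speed column of the switchshow output. The following is a minimal sketch, assuming the Brocade convention that a value such as N16 means the port negotiated 16 Gb and a value such as 16G means the speed is fixed at 16 Gb. The function names and the expected-speed parameter are hypothetical, and any other token is treated as unknown rather than guessed at.

```python
import re


def parse_speed(token):
    """Split a Speed column token into (gbps, negotiated).

    Assumed convention: "N16" is a negotiated 16 Gb link and "16G" is a
    fixed 16 Gb setting. Returns None for any other token.
    """
    match = re.fullmatch(r"N(\d+)|(\d+)G", token)
    if match is None:
        return None
    if match.group(1) is not None:
        return int(match.group(1)), True
    return int(match.group(2)), False


def negotiated_below(token, expected_gbps):
    """Return True if the port auto-negotiated to less than `expected_gbps`."""
    parsed = parse_speed(token)
    if parsed is None:
        return False
    gbps, negotiated = parsed
    return negotiated and gbps < expected_gbps
```

For the example in the text, a port expected to run at 16 Gb that shows N8 is flagged, while N16 and a fixed 16G are not.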