High latency on FC LUN with IO WQE failure in EMS

Last updated

Dec 10, 2024


Applies to

ONTAP 9
FCP
Brocade switches
Cisco switches

Issue

Workloads on hosts accessing LUNs via FCP incur significant application and client latency.
The EMS log shows frequent IO WQE failures.

Example:

Mon Feb 10 00:28:21 +03 [NODE01: fct_tpd_work_thread_0: fcp.io.status:debug]: STIO Adapter:10a IO WQE failure, Handle 0x2, Type 8, S_ID: 20253, VPI: 18, OX_ID: 263, Status 0x3 Ext_Status 0x1d
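When many of these events are logged, it can help to pull out the fields that identify the affected adapter and initiator. The following is a minimal sketch in Python, assuming the EMS message always follows the layout of the single example above. The regular expression and the function name `parse_wqe_failure` are illustrative and not part of ONTAP.

```python
import re

# Assumption: the pattern is derived only from the one example line in this
# article; other ONTAP releases may format the message differently.
WQE_RE = re.compile(
    r"\[(?P<node>[^:\]]+):.*?fcp\.io\.status:\w+\]:.*?"
    r"Adapter:(?P<adapter>\w+) IO WQE failure, "
    r"Handle (?P<handle>0x[0-9a-fA-F]+), Type (?P<type>\d+), "
    r"S_ID: (?P<s_id>\w+), VPI: (?P<vpi>\d+), OX_ID: (?P<ox_id>\w+), "
    r"Status (?P<status>0x[0-9a-fA-F]+) Ext_Status (?P<ext_status>0x[0-9a-fA-F]+)"
)


def parse_wqe_failure(line):
    """Return the fields of an IO WQE failure event as a dict, or None."""
    match = WQE_RE.search(line)
    return match.groupdict() if match else None


sample = (
    "Mon Feb 10 00:28:21 +03 [NODE01: fct_tpd_work_thread_0: "
    "fcp.io.status:debug]: STIO Adapter:10a IO WQE failure, Handle 0x2, "
    "Type 8, S_ID: 20253, VPI: 18, OX_ID: 263, Status 0x3 Ext_Status 0x1d"
)
event = parse_wqe_failure(sample)
print(event["node"], event["adapter"], event["s_id"], event["status"])
```

Grouping the parsed events by `adapter` and `s_id` shows whether the failures concentrate on one target port or one initiator.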

Unable to access storage virtual machine data LUN on AFF SAN
High latency observed intermittently on FC LUN
On the Brocade switch, porterrshow repeatedly reports link resets due to credit loss, together with C3 Tx discards or timeouts, on a specific port:

/fabos/cliexec/porterrshow:

            frames    enc    crc    crc    too    too    bad    enc   disc   link   loss   loss   frjt   fbsy     c3timeout    pcs  uncor
         tx     rx     in    err  g_eof   shrt   long    eof    out     c3   fail   sync    sig                     tx     rx    err    err
  0:  89.4m 139.8m      0      0      0      0      0      0      0   1.3k      0      0      0      0      0   1.3k      0      0      0
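To check many ports quickly, the counter row can be mapped onto its column names. The sketch below is a rough illustration in Python, assuming the column order shown in the porterrshow header above and assuming that the `k`, `m`, and `g` suffixes stand for decimal thousands, millions, and billions. The names `COLUMNS`, `to_number`, and `parse_port_row` are hypothetical.

```python
# Assumption: column order taken from the two header lines of the sample.
COLUMNS = [
    "frames_tx", "frames_rx", "enc_in", "crc_err", "crc_g_eof", "too_shrt",
    "too_long", "bad_eof", "enc_out", "disc_c3", "link_fail", "loss_sync",
    "loss_sig", "frjt", "fbsy", "c3timeout_tx", "c3timeout_rx", "pcs_err",
    "uncor_err",
]
# Assumption: suffixes are decimal multipliers.
SUFFIXES = {"k": 1_000, "m": 1_000_000, "g": 1_000_000_000}


def to_number(token):
    """Convert a counter such as '1.3k' or '89.4m' to a float."""
    multiplier = SUFFIXES.get(token[-1].lower())
    if multiplier:
        return float(token[:-1]) * multiplier
    return float(token)


def parse_port_row(row):
    """Split one porterrshow data row into (port, {column: value})."""
    port, *values = row.split()
    if len(values) != len(COLUMNS):
        raise ValueError(f"expected {len(COLUMNS)} counters, got {len(values)}")
    counters = dict(zip(COLUMNS, (to_number(v) for v in values)))
    return int(port.rstrip(":")), counters


port, counters = parse_port_row(
    "0: 89.4m 139.8m 0 0 0 0 0 0 0 1.3k 0 0 0 0 0 1.3k 0 0 0"
)
# Report only the Class 3 discard and timeout counters that are non-zero.
suspect = {
    name: value
    for name, value in counters.items()
    if name in ("disc_c3", "c3timeout_tx", "c3timeout_rx") and value > 0
}
print(port, suspect)
```

For the sample row, only `disc_c3` and `c3timeout_tx` are non-zero, which matches the C3 Tx discard symptom described above.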

On the Cisco switch, the Fabriclog output shows link reset events logged for that specific port:

Switch 0; Thu Jul 28 00:00:18 2022 GMT (GMT+0:00)

00:02:11.754993 SCN LR_PORT(0);g=0x266 LR_IN D2,P0 D2,P0 0 NA
00:02:26.934854 SCN LR_PORT(0);g=0x266 LR_OUT D2,P0 D2,P0 0 NA
00:02:39.918129 SCN Port Offline;rsn=0x4,g=0x268 D2,P0 D2,P0 0 NA
00:02:39.918135 *Removing all nodes from port D2,P0 D2,P0 0 NA
00:02:40.770569 SCN LR_PORT(0);g=0x268 D2,P0 D2,P0 0 NA
00:02:40.773044 SCN Port Online; g=0x268,isolated=0 D2,P0 D2,P1 0 NA
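The sequence above (link resets, then port offline, then port online) can be counted automatically when the log is long. The following Python sketch assumes each entry starts with an `HH:MM:SS.microseconds` timestamp, as in the sample, and classifies entries by simple substring matching. The category names and the helpers `split_entries` and `classify` are illustrative only.

```python
import re
from collections import Counter

# Assumption: every entry begins with a timestamp of this exact shape.
TS = r"\d{2}:\d{2}:\d{2}\.\d{6}"


def split_entries(text):
    """Split log text into entries, whether or not line breaks survived."""
    parts = re.split(rf"\s+(?={TS}\s)", text.strip())
    return [p for p in parts if re.match(TS, p)]


def classify(entry):
    """Label an entry by the state change it records."""
    if "LR_PORT" in entry:
        return "link_reset"
    if "Port Offline" in entry:
        return "port_offline"
    if "Port Online" in entry:
        return "port_online"
    return "other"


log = (
    "00:02:11.754993 SCN LR_PORT(0);g=0x266 LR_IN D2,P0 D2,P0 0 NA "
    "00:02:26.934854 SCN LR_PORT(0);g=0x266 LR_OUT D2,P0 D2,P0 0 NA "
    "00:02:39.918129 SCN Port Offline;rsn=0x4,g=0x268 D2,P0 D2,P0 0 NA "
    "00:02:39.918135 *Removing all nodes from port D2,P0 D2,P0 0 NA "
    "00:02:40.770569 SCN LR_PORT(0);g=0x268 D2,P0 D2,P0 0 NA "
    "00:02:40.773044 SCN Port Online; g=0x268,isolated=0 D2,P0 D2,P1 0 NA"
)
entries = split_entries(log)
counts = Counter(classify(e) for e in entries)
print(len(entries), dict(counts))
```

A high `link_reset` count for one port over a short window points at that link rather than at the fabric as a whole.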

fcp adapter stats -node <node_name> -adapter 1a -instance
- Indicates whether there is a protocol-layer issue adjacent to storage
network fcp adapter show on the ONTAP CLI displays low TX power:

Received Optical Power: 570.7 (uWatts)
SFP Transmitted Optical Power: 123.8 (uWatts)
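SFP data sheets usually state optical power limits in dBm rather than microwatts, so converting the reported values makes them easier to compare. The sketch below applies the standard conversion, dBm = 10 * log10(power in mW), to the two readings above. The function name `uw_to_dbm` is hypothetical, and the acceptable range is not given in this article; it depends on the specification of the installed SFP.

```python
import math


def uw_to_dbm(microwatts):
    """Convert optical power from microwatts to dBm (referenced to 1 mW)."""
    if microwatts <= 0:
        raise ValueError("optical power must be positive")
    return 10 * math.log10(microwatts / 1000.0)


rx_dbm = uw_to_dbm(570.7)   # Received Optical Power from the sample
tx_dbm = uw_to_dbm(123.8)   # SFP Transmitted Optical Power from the sample
print(f"RX {rx_dbm:.2f} dBm, TX {tx_dbm:.2f} dBm")
```

For the sample readings this gives roughly -2.4 dBm received and -9.1 dBm transmitted, so the transmit side is several dB weaker than the receive side.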