NetApp Knowledge Base

FCP target and path loss on several blades in a chassis

Category: ontap-9
Specialty: SAN

Applies to

  • ONTAP 9
  • HPE Synergy
  • Brocade Fabric OS 9.1
  • VMware ESXi

Issue

  • Intermittent path and target loss on different ESXi hosts (blades in an HPE Synergy chassis)
  • ONTAP reports that the related initiators are not logged in. The affected initiators and LIFs may change every few minutes.

::*> fcp ping-igroup show -vserver SVM -igroup * -ext-status wwpn-not-logged_in
  (vserver fcp ping-igroup show)
          Igroup                     Logical    Node      Ping     Extended
Vserver   Name        WWPN           Interface  Name      Status   Status
--------- ----------- -------------- ---------- --------- -------- -----------
SVM
          SYNERGYESXGRP1 20:00:xx:xx:xx:xx:xx:29 SVM_fc07 NODEA12 reachable wwpn-not-logged_in
          SYNERGYESXGRP2 20:00:xx:xx:xx:xx:xx:09 SVM_fc07 NODEA12 reachable wwpn-not-logged_in
          SYNERGYESXGRP4 20:00:xx:xx:xx:xx:xx:01 SVM_fc02 NODEA11 reachable wwpn-not-logged_in
          SYNERGYESXGRP4 20:00:xx:xx:xx:xx:xx:01 SVM_fc04 NODEA12 reachable wwpn-not-logged_in
          SYNERGYESXGRP7 20:00:xx:xx:xx:xx:xx:31 SVM_fc02 NODEA11 reachable wwpn-not-logged_in
          SYNERGYESXGRP8 20:00:xx:xx:xx:xx:xx:31 SVM_fc04 NODEA12 reachable wwpn-not-logged_in
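
Because the affected initiators and LIFs can change every few minutes, it may help to group the flagged rows by node and LIF each time the command is run. The following is a minimal sketch, assuming the command output has been saved as text and that data rows carry six whitespace-separated fields as in the example above. The function name group_not_logged_in is hypothetical and not part of ONTAP.

```python
from collections import defaultdict

# Data rows copied from the example output above.
SAMPLE = """\
SVM
          SYNERGYESXGRP1 20:00:xx:xx:xx:xx:xx:29 SVM_fc07 NODEA12 reachable wwpn-not-logged_in
          SYNERGYESXGRP2 20:00:xx:xx:xx:xx:xx:09 SVM_fc07 NODEA12 reachable wwpn-not-logged_in
          SYNERGYESXGRP4 20:00:xx:xx:xx:xx:xx:01 SVM_fc02 NODEA11 reachable wwpn-not-logged_in
          SYNERGYESXGRP4 20:00:xx:xx:xx:xx:xx:01 SVM_fc04 NODEA12 reachable wwpn-not-logged_in
          SYNERGYESXGRP7 20:00:xx:xx:xx:xx:xx:31 SVM_fc02 NODEA11 reachable wwpn-not-logged_in
          SYNERGYESXGRP8 20:00:xx:xx:xx:xx:xx:31 SVM_fc04 NODEA12 reachable wwpn-not-logged_in
"""


def group_not_logged_in(text):
    """Map (node, LIF) to the (igroup, WWPN) pairs flagged wwpn-not-logged_in."""
    by_lif = defaultdict(list)
    for line in text.splitlines():
        fields = line.split()
        # Assumption: data rows have exactly six fields; headers, separators,
        # and the bare vserver line have a different field count.
        if len(fields) != 6 or fields[5] != "wwpn-not-logged_in":
            continue
        igroup, wwpn, lif, node, _ping, _ext = fields
        by_lif[(node, lif)].append((igroup, wwpn))
    return dict(by_lif)


if __name__ == "__main__":
    for (node, lif), rows in sorted(group_not_logged_in(SAMPLE).items()):
        print(node, lif, len(rows), "initiator(s) not logged in")
```

Comparing the grouped result across several runs shows whether the same LIFs are affected repeatedly or whether the affected set moves around.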

  • Sometimes the initiators are reported as logged in, but the related host is still missing the target and has dead paths
  • WQE failures with Ext_Status 0x16 are reported in EMS (event log show):

fcp.io.status: STIO Adapter:2a IO WQE failure, Handle 0x5, Type 8, S_ID: 10902, VPI: 259, OX_ID: 24C, Status 0x3 Ext_Status 0x16
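
To relate such a message to a source device, the S_ID can be split into the three bytes of a 24-bit Fibre Channel address (domain, area, port). The following is a minimal sketch, assuming the S_ID in the message is hexadecimal (as the SID=0x... form in the hung-command messages suggests) and that the message layout matches the example above. The names parse_wqe and decode_fcid are hypothetical.

```python
import re

# Message copied from the example above.
LINE = (
    "fcp.io.status: STIO Adapter:2a IO WQE failure, Handle 0x5, Type 8, "
    "S_ID: 10902, VPI: 259, OX_ID: 24C, Status 0x3 Ext_Status 0x16"
)

# Assumption: fields appear in this order, as in the example message.
WQE_RE = re.compile(
    r"Adapter:(?P<adapter>\w+).*?"
    r"S_ID:\s*(?P<s_id>[0-9A-Fa-f]+).*?"
    r"OX_ID:\s*(?P<ox_id>[0-9A-Fa-f]+).*?"
    r"Status\s+(?P<status>0x[0-9A-Fa-f]+)\s+"
    r"Ext_Status\s+(?P<ext_status>0x[0-9A-Fa-f]+)"
)


def parse_wqe(line):
    """Return the fields of a WQE failure message as a dict, or None."""
    match = WQE_RE.search(line)
    return match.groupdict() if match else None


def decode_fcid(fcid_hex):
    """Split a 24-bit FC address (hex string) into domain, area, and port bytes."""
    value = int(fcid_hex, 16)
    return {
        "domain": (value >> 16) & 0xFF,
        "area": (value >> 8) & 0xFF,
        "port": value & 0xFF,
    }


if __name__ == "__main__":
    fields = parse_wqe(LINE)
    print(fields)
    print(decode_fcid(fields["s_id"]))
```

For the example message this yields domain 1, area 9, port 2. How the area byte maps to a physical switch port depends on the fabric's addressing mode, so treat any such mapping as something to confirm on the switch.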

  • FC target adapter resets occur because command termination hung, and SRAM dumps are generated (likely, but not necessarily, on multiple storage controllers):

::> event log show -severity debug -event *fcp.io.status*hung*|*SRAM*
Time                Node             Severity      Event
------------------- ---------------- ------------- ---------------------------
12/21/2022 12:44:48 NODEA12          DEBUG         scsitarget.fcp.dump: FCP target SRAM dump generated for adapter 2a, fct_tpd_check_hung_commands: Command termination hung. cmd:0xfffff80917a41c60 (state=0xa, flags=0x2,ctio_sent=2/2, RecvExAddr=0x1aec, OX_ID=0x72, RX_ID=0xffff, SID=0x10902)
12/21/2022 12:44:48 NODEA12          DEBUG         fcp.io.status: STIO Adapter:2a, found hung cmd:0xfffff80917a41c60(state=10, flags=0x2, ctio_sent=2/2,RecvExAddr=0x1aec, OX_ID=0x72, RX_ID=0xffff,SID=0x10902, Cmd[28], req_q_free:0)
12/21/2022 11:56:38 NODEA12          DEBUG         fcp.io.status: STIO Adapter:1a, found hung cmd:0xfffff8090d1a4010(state=5, flags=0x0, ctio_sent=1/1,RecvExAddr=0x14ef, OX_ID=0x264, RX_ID=0xffff,SID=0x10902, Cmd[2A], req_q_free:0)
12/21/2022 11:55:51 NODEA11          DEBUG         scsitarget.fcp.dump: FCP target SRAM dump generated for adapter 2a, fct_tpd_check_hung_commands: Command termination hung. cmd:0xfffff80917b392f8 (state=0xa, flags=0x2,ctio_sent=2/3, RecvExAddr=0x146e, OX_ID=0x178, RX_ID=0xffff, SID=0x10c03)
12/21/2022 11:55:46 NODEA11          DEBUG         fcp.io.status: STIO Adapter:2a, found hung cmd:0xfffff80917b392f8(state=7, flags=0x0, ctio_sent=1/2,RecvExAddr=0x146e, OX_ID=0x178, RX_ID=0xffff,SID=0x10c03, Cmd[8A], req_q_free:0)
5 entries were displayed.
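
When many of these events are logged, a per-source summary can show whether the hung commands come from a small set of initiator addresses. The following is a minimal sketch, assuming the event lines have been saved as text in the layout shown above (date, time, node, severity, event name, message). The function name summarize is hypothetical.

```python
import re
from collections import Counter, defaultdict

# Event lines copied from the example output above.
EVENTS = [
    "12/21/2022 12:44:48 NODEA12          DEBUG         scsitarget.fcp.dump: FCP target SRAM dump generated for adapter 2a, fct_tpd_check_hung_commands: Command termination hung. cmd:0xfffff80917a41c60 (state=0xa, flags=0x2,ctio_sent=2/2, RecvExAddr=0x1aec, OX_ID=0x72, RX_ID=0xffff, SID=0x10902)",
    "12/21/2022 12:44:48 NODEA12          DEBUG         fcp.io.status: STIO Adapter:2a, found hung cmd:0xfffff80917a41c60(state=10, flags=0x2, ctio_sent=2/2,RecvExAddr=0x1aec, OX_ID=0x72, RX_ID=0xffff,SID=0x10902, Cmd[28], req_q_free:0)",
    "12/21/2022 11:56:38 NODEA12          DEBUG         fcp.io.status: STIO Adapter:1a, found hung cmd:0xfffff8090d1a4010(state=5, flags=0x0, ctio_sent=1/1,RecvExAddr=0x14ef, OX_ID=0x264, RX_ID=0xffff,SID=0x10902, Cmd[2A], req_q_free:0)",
    "12/21/2022 11:55:51 NODEA11          DEBUG         scsitarget.fcp.dump: FCP target SRAM dump generated for adapter 2a, fct_tpd_check_hung_commands: Command termination hung. cmd:0xfffff80917b392f8 (state=0xa, flags=0x2,ctio_sent=2/3, RecvExAddr=0x146e, OX_ID=0x178, RX_ID=0xffff, SID=0x10c03)",
    "12/21/2022 11:55:46 NODEA11          DEBUG         fcp.io.status: STIO Adapter:2a, found hung cmd:0xfffff80917b392f8(state=7, flags=0x0, ctio_sent=1/2,RecvExAddr=0x146e, OX_ID=0x178, RX_ID=0xffff,SID=0x10c03, Cmd[8A], req_q_free:0)",
]

ADAPTER_RE = re.compile(r"[Aa]dapter[: ](\w+)")
SID_RE = re.compile(r"\bSID=(0x[0-9A-Fa-f]+)")
CMD_RE = re.compile(r"cmd:(0x[0-9A-Fa-f]+)")


def summarize(lines):
    """Return (distinct hung command addresses per SID, SRAM dumps per node/adapter)."""
    hung_cmds = defaultdict(set)
    dumps = Counter()
    for line in lines:
        fields = line.split(None, 5)
        if len(fields) < 6:
            continue  # not an event row (header, separator, or footer)
        _date, _time, node, _severity, event, message = fields
        adapter = ADAPTER_RE.search(message)
        sid = SID_RE.search(message)
        cmd = CMD_RE.search(message)
        if sid and cmd:
            # The same command can appear in both event types; a set
            # keeps each command address only once per SID.
            hung_cmds[sid.group(1)].add(cmd.group(1))
        if event.rstrip(":") == "scsitarget.fcp.dump" and adapter:
            dumps[(node, adapter.group(1))] += 1
    return dict(hung_cmds), dumps


if __name__ == "__main__":
    hung, dumps = summarize(EVENTS)
    for sid, cmds in sorted(hung.items()):
        print(sid, len(cmds), "distinct hung command(s)")
    for (node, adapter), count in sorted(dumps.items()):
        print(node, adapter, count, "SRAM dump(s)")
```

In the example above, all hung commands originate from two SIDs (0x10902 and 0x10c03), and each node generated one SRAM dump on adapter 2a.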

  • On the Brocade switch, c3timeout counters that increase on TX for host ports and on RX for storage ports indicate end-device congestion. To check whether the counters are increasing:
  1. Run statsclear
  2. Wait 15 minutes
  3. Run porterrshow

Example:

FID128:admin> porterrshow 8-13
           frames        enc     crc     crc     too     too     bad     enc    disc    link    loss    loss    frjt    fbsy     c3timeout     pcs      uncor
         tx       rx      in     err     g_eof   shrt    long    eof     out    c3      fail    sync    sig                      tx      rx      err     err
   8:    4.8m    8.4m    0       0       0       0       0       0       0       0       0       0       0       0       0       0       0       0       0
   9:    1.1k    1.1k    0       0       0       0       0       0       0     717       0       0       0       0       0     717       0       0       0
  10:    0       0       0       0       0       0       0       0       0       0       0       0       0       0       0       0       0       0       0
  11:    0       0       0       0       0       0       0       0       0       0       0       0       0       0       0       0       0       0       0
  12:    3.9m    6.8m    0       0       0       0       0       0       0       0       0       0       0       0       0       0       0       0       0
  13:  976       1.2k    0       0       0       0       0       0       0     840       0       0       0       0       0     840       0       0       0
FID128:admin> porterrshow 20-23
           frames        enc     crc     crc     too     too     bad     enc    disc    link    loss    loss    frjt    fbsy     c3timeout     pcs      uncor
         tx       rx      in     err     g_eof   shrt    long    eof     out    c3      fail    sync    sig                      tx      rx      err     err
  20:    6.7m    7.6m    0       0       0       0       0       0       0       0       0       0       0       0       0       0       0       0       0
  21:    3.9m    2.5m    0       0       0       0       0       0       0     518       0       0       0       0       0       0     259       0       0
  22:    3.9m    2.5m    0       0       0       0       0       0       0     974       0       0       0       0       0       0     487       0       0
  23:    6.4m    3.0m    0       0       0       0       0       0       0       0       0       0       0       0       0       0       0       0       0
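
With many ports, the non-zero c3timeout counters are easier to spot programmatically. The following is a minimal sketch, assuming the porterrshow output has been saved as text and that each port row carries the 19 counters in the column order of the header shown above. The column names in COLUMNS and the function names are hypothetical labels chosen for this sketch, not Fabric OS identifiers.

```python
import re

# Port rows copied from the example output above.
SAMPLE = """\
   8:    4.8m    8.4m    0       0       0       0       0       0       0       0       0       0       0       0       0       0       0       0       0
   9:    1.1k    1.1k    0       0       0       0       0       0       0     717       0       0       0       0       0     717       0       0       0
  10:    0       0       0       0       0       0       0       0       0       0       0       0       0       0       0       0       0       0       0
  11:    0       0       0       0       0       0       0       0       0       0       0       0       0       0       0       0       0       0       0
  12:    3.9m    6.8m    0       0       0       0       0       0       0       0       0       0       0       0       0       0       0       0       0
  13:  976       1.2k    0       0       0       0       0       0       0     840       0       0       0       0       0     840       0       0       0
  20:    6.7m    7.6m    0       0       0       0       0       0       0       0       0       0       0       0       0       0       0       0       0
  21:    3.9m    2.5m    0       0       0       0       0       0       0     518       0       0       0       0       0       0     259       0       0
  22:    3.9m    2.5m    0       0       0       0       0       0       0     974       0       0       0       0       0       0     487       0       0
  23:    6.4m    3.0m    0       0       0       0       0       0       0       0       0       0       0       0       0       0       0       0       0
"""

# Assumption: column order as in the header of the example output.
COLUMNS = [
    "frames_tx", "frames_rx", "enc_in", "crc_err", "crc_g_eof", "too_shrt",
    "too_long", "bad_eof", "enc_out", "disc_c3", "link_fail", "loss_sync",
    "loss_sig", "frjt", "fbsy", "c3timeout_tx", "c3timeout_rx", "pcs_err",
    "uncor_err",
]
ROW_RE = re.compile(r"^\s*(\d+):\s+(.*)$")
SCALE = {"k": 1_000, "m": 1_000_000, "g": 1_000_000_000}


def to_int(token):
    """Convert an abbreviated counter such as '1.1k' to an approximate integer."""
    suffix = token[-1].lower()
    if suffix in SCALE:
        return round(float(token[:-1]) * SCALE[suffix])
    return int(token)


def parse_porterrshow(text):
    """Return {port: {column: value}} for every well-formed port row."""
    ports = {}
    for line in text.splitlines():
        match = ROW_RE.match(line)
        if not match:
            continue  # prompt, header, or blank line
        tokens = match.group(2).split()
        if len(tokens) != len(COLUMNS):
            continue  # unexpected layout; skip rather than guess
        ports[int(match.group(1))] = dict(zip(COLUMNS, map(to_int, tokens)))
    return ports


def congested_ports(text):
    """List (port, c3timeout_tx, c3timeout_rx) for ports with a non-zero c3timeout."""
    return [
        (port, row["c3timeout_tx"], row["c3timeout_rx"])
        for port, row in sorted(parse_porterrshow(text).items())
        if row["c3timeout_tx"] or row["c3timeout_rx"]
    ]


if __name__ == "__main__":
    for port, tx, rx in congested_ports(SAMPLE):
        print(f"port {port}: c3timeout tx={tx} rx={rx}")
```

For the example output, this flags ports 9 and 13 (c3timeout TX) and ports 21 and 22 (c3timeout RX). Because statsclear was run first, any non-zero value reflects timeouts within the 15-minute window.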
