FCP target and path loss on several blades in a chassis
Applies to
- ONTAP 9
- HPE Synergy
- Brocade Fabric OS 9.1
- VMware ESXi
Issue
- Sporadic, intermittent path and target loss on different ESXi hosts (blades in the HPE Synergy chassis)
- ONTAP confirms that the related initiators are not logged in; the affected initiators and LIFs may change every few minutes (a cross-check is sketched after the output below)
::*> fcp ping-igroup show -vserver SVM -igroup * -ext-status wwpn-not-logged_in
(vserver fcp ping-igroup show)
           Igroup                                    Logical    Node      Ping       Extended
Vserver    Name            WWPN                      Interface  Name      Status     Status
---------- --------------- ------------------------- ---------- --------- ---------- ------------------
SVM
           SYNERGYESXGRP1  20:00:xx:xx:xx:xx:xx:29   SVM_fc07   NODEA12   reachable  wwpn-not-logged_in
           SYNERGYESXGRP2  20:00:xx:xx:xx:xx:xx:09   SVM_fc07   NODEA12   reachable  wwpn-not-logged_in
           SYNERGYESXGRP4  20:00:xx:xx:xx:xx:xx:01   SVM_fc02   NODEA11   reachable  wwpn-not-logged_in
           SYNERGYESXGRP4  20:00:xx:xx:xx:xx:xx:01   SVM_fc04   NODEA12   reachable  wwpn-not-logged_in
           SYNERGYESXGRP7  20:00:xx:xx:xx:xx:xx:31   SVM_fc02   NODEA11   reachable  wwpn-not-logged_in
           SYNERGYESXGRP8  20:00:xx:xx:xx:xx:xx:31   SVM_fc04   NODEA12   reachable  wwpn-not-logged_in
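The churn can be confirmed by comparing the two views a few minutes apart (a hedged sketch; SVM is the placeholder vserver name from the example above, and the output columns vary by ONTAP release):
::*> vserver fcp initiator show -vserver SVM
::*> fcp ping-igroup show -vserver SVM -igroup * -ext-status wwpn-not-logged_in
Initiator WWPNs that are members of the igroups but absent from the logged-in initiator list at that moment should correspond to the wwpn-not-logged_in extended status.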
- Sometimes initiators are confirmed as logged in, but related host still misses target and has dead paths
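From the host side, the missing targets and dead paths can be confirmed directly on the affected ESXi blade (a hedged example; the grep context size is arbitrary and the field text may vary slightly by ESXi release):
esxcli storage san fc list
esxcli storage core path list | grep -i -B 12 "state: dead"
The first command lists the host HBA WWPNs (to match against the igroup members above); the second prints the path entries that are currently dead.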
- WQE errors with Ext_Status 0x16 in the EMS log (event log show):
fcp.io.status: STIO Adapter:2a IO WQE failure, Handle 0x5, Type 8, S_ID: 10902, VPI: 259, OX_ID: 24C, Status 0x3 Ext_Status 0x16
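These WQE failures can be pulled from EMS with the same kind of query used below for the hung-command events (a hedged example; the wildcard pattern is illustrative):
::> event log show -severity debug -event *fcp.io.status*WQE*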
- FC host bus target adapter resets due to "Command termination hung", with SRAM dumps (likely, but not necessarily, on multiple storage controllers); the adapter state after such a reset can be checked as sketched after the log output below
::> event log show -severity debug -event *fcp.io.status*hung*|*SRAM*
Time                Node             Severity      Event
------------------- ---------------- ------------- ---------------------------
12/21/2022 12:44:48 NODEA12          DEBUG         scsitarget.fcp.dump: FCP target SRAM dump generated for adapter 2a, fct_tpd_check_hung_commands: Command termination hung. cmd:0xfffff80917a41c60 (state=0xa, flags=0x2,ctio_sent=2/2, RecvExAddr=0x1aec, OX_ID=0x72, RX_ID=0xffff, SID=0x10902)
12/21/2022 12:44:48 NODEA12          DEBUG         fcp.io.status: STIO Adapter:2a, found hung cmd:0xfffff80917a41c60(state=10, flags=0x2, ctio_sent=2/2,RecvExAddr=0x1aec, OX_ID=0x72, RX_ID=0xffff,SID=0x10902, Cmd[28], req_q_free:0)
12/21/2022 11:56:38 NODEA12          DEBUG         fcp.io.status: STIO Adapter:1a, found hung cmd:0xfffff8090d1a4010(state=5, flags=0x0, ctio_sent=1/1,RecvExAddr=0x14ef, OX_ID=0x264, RX_ID=0xffff,SID=0x10902, Cmd[2A], req_q_free:0)
12/21/2022 11:55:51 NODEA11          DEBUG         scsitarget.fcp.dump: FCP target SRAM dump generated for adapter 2a, fct_tpd_check_hung_commands: Command termination hung. cmd:0xfffff80917b392f8 (state=0xa, flags=0x2,ctio_sent=2/3, RecvExAddr=0x146e, OX_ID=0x178, RX_ID=0xffff, SID=0x10c03)
12/21/2022 11:55:46 NODEA11          DEBUG         fcp.io.status: STIO Adapter:2a, found hung cmd:0xfffff80917b392f8(state=7, flags=0x0, ctio_sent=1/2,RecvExAddr=0x146e, OX_ID=0x178, RX_ID=0xffff,SID=0x10c03, Cmd[8A], req_q_free:0)
5 entries were displayed.
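After such a reset, the state of the reported target adapter can be verified on the controller (a hedged sketch; node and adapter names are taken from the events above, and the available fields depend on the ONTAP release):
::> network fcp adapter show -node NODEA12 -adapter 2a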
- On the Brocade switches, increasing c3timeout counters on TX for host ports and on RX for storage ports indicate end device congestion (a portstatsshow cross-check is sketched after the example output below):
- Run statsclear
- Wait 15 minutes
- Run porterrshow
Example:
FID128:admin> porterrshow 8-13
       frames      enc   crc   crc   too   too   bad   enc  disc  link  loss  loss  frjt  fbsy c3timeout     pcs uncor
        tx    rx    in   err g_eof  shrt  long   eof   out    c3  fail  sync   sig                tx    rx   err   err
  8:  4.8m  8.4m     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0
  9:  1.1k  1.1k     0     0     0     0     0     0     0   717     0     0     0     0     0   717     0     0     0
 10:     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0
 11:     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0
 12:  3.9m  6.8m     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0
 13:   976  1.2k     0     0     0     0     0     0     0   840     0     0     0     0     0   840     0     0     0
FID128:admin> porterrshow 20-23
       frames      enc   crc   crc   too   too   bad   enc  disc  link  loss  loss  frjt  fbsy c3timeout     pcs uncor
        tx    rx    in   err g_eof  shrt  long   eof   out    c3  fail  sync   sig                tx    rx   err   err
 20:  6.7m  7.6m     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0
 21:  3.9m  2.5m     0     0     0     0     0     0     0   518     0     0     0     0     0     0   259     0     0
 22:  3.9m  2.5m     0     0     0     0     0     0     0   974     0     0     0     0     0     0   487     0     0
 23:  6.4m  3.0m     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0
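As a further cross-check of end device congestion, the buffer-credit counters of the host-facing ports that show c3timeout tx can be reviewed (a hedged sketch; port 9 is just an example port from the output above, and counter names vary by Fabric OS release):
FID128:admin> portstatsshow 9
A steadily increasing tim_txcrd_z (time at zero transmit credit) on such a port suggests the attached device is slow to return buffer credits, which matches the rising disc c3 and c3timeout counters.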
- SFP readings are healthy on the storage and switch ports (TX/RX towards the Synergy ports is also good) as per