VMs are in a hung state and are not accessible from their dedicated host on the VMware end
Applies to
- ONTAP 9
- ESXi Host
- SAN (FC)
Issue
- VMs are not accessible, and I/O errors are seen on the host end.
- Link reset events (i.e. scsitarget_hwpfct_errorReset), along with I/O WQE errors with Extended status 16, are logged on the storage end for the target port.
- Extended status 16 means that the host initiator has sent an abort to clear the current command queue.
<LR d="17Sep2023 20:10:47" n="NetApp" t="1694961647" id="1631641068/405059" p="5" s="Ok" o="fct_tpd_thread_6" vf="" type="0" seq="1012607" >
<scsitarget_hwpfct_errorReset_1
hba="3c"
details="status:0x87800000, status1:0x52004c62, status2:0x610102, DIP:1, RN:1, RDY:1, Dump owner:6"/>
<LR d="17Sep2023 20:10:47" n="NetApp" t="1694961647" id="1631641068/405060" p="5" s="Ok" o="fct_tpd_thread_7" vf="" type="0" seq="1012608" >
<scsitarget_hwpfct_errorReset_1
hba="3d"
details="status:0x87800000, status1:0x52004c62, status2:0x610102, DIP:1, RN:1, RDY:1, Dump owner:6"/>
<LR d="17Sep2023 20:10:47" n="NetApp" t="1694961647" id="1631641068/405061" p="5" s="Ok" o="fct_tpd_thread_7" vf="" type="0" seq="1012609" >
<scsitarget_fct_reset_1
hba="3d"/>
<LR d="17Sep2023 20:10:48" n="NetApp" t="1694961648" id="1631641068/405062" p="5" s="Ok" o="fct_tpd_thread_6" vf="" type="0" seq="1012610" >
<scsitarget_hwpfct_dump_saved_1
adapter="3c"
filename="/etc/log/fctsli_3c_20230917_201047/fct_fw_3c.dmp.gz"/>
<LR d="17Sep2023 20:10:50" n="NetApp" t="1694961650" id="1631641068/405070" p="5" s="Ok" o="fct_tpd_work_thread_0" vf="" type="0" seq="1012618" >
<scsitarget_hwpfct_linkUp_1
hba="3d"/>
<LR d="17Sep2023 20:10:50" n="NetApp" t="1694961650" id="1631641068/405071" p="5" s="Ok" o="fct_tpd_work_thread_0" vf="" type="0" seq="1012619" >
<scsitarget_hwpfct_linkUp_1
hba="3c"/>
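When many of these events are logged, it can help to tally them per adapter. The following is a minimal Python sketch, assuming the EMS excerpt has been saved as plain text in the layout shown above. The function name `summarize_fct_events` and the regular expressions are illustrative assumptions, not part of any NetApp tool.

```python
import re
from collections import Counter

# Matches an event tag such as "<scsitarget_hwpfct_errorReset_1" and
# captures the event name without the trailing "_<number>" suffix.
EVENT_RE = re.compile(r"<(scsitarget_\w+?)_\d+\b")
# Matches the port attribute, written as hba="3c" or adapter="3c".
PORT_RE = re.compile(r'\b(?:hba|adapter)="([^"]+)"')


def summarize_fct_events(text):
    """Count (event name, port) pairs in an EMS log excerpt."""
    counts = Counter()
    current_event = None
    for raw_line in text.splitlines():
        line = raw_line.strip()
        event = EVENT_RE.match(line)
        if event:
            current_event = event.group(1)
        port = PORT_RE.search(line)
        if port and current_event:
            counts[(current_event, port.group(1))] += 1
            # Wait for the next event tag before counting again.
            current_event = None
    return counts
```

Calling `summarize_fct_events(open("ems.log").read())` (the file name is hypothetical) returns a `Counter` keyed by event name and port, which makes it easy to see whether resets cluster on one adapter.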
- In the ASUP, SYSCONFIG-A.XML shows the FC port status as Link Not Connected:
slot 9: Fibre Channel Target Host Adapter 9c
(Emulex LPe32000 (LPe32002) rev. 12, <LINK NOT CONNECTED>)
Board Name: 111-03249
Serial Number: FC72662xxx
Firmware rev: 11.2.219.4
Host Port Addr: 000000
FC Nodename: 50:0a:09:80:80:xx:xx:xx (500a098080xxxxxx)
FC Portname: 50:0a:09:83:a0:xx:xx:xx (500a0983a0xxxxxx)
Connection: No link
Switch Port: Unknown
SFP Vendor Name: FINISAR CORP.
SFP Vendor P/N: FTLF8532P4BCV-EM
SFP Vendor Rev: A
SFP Serial No.: PXQ1Mxx
SFP Connector: LC
SFP Capabilities: 8, 16, 32 Gbit/Sec
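To check every target port in a larger sysconfig capture, the adapter header lines can be scanned for the link state. This is a rough Python sketch, assuming the sysconfig text follows the layout shown above; `ports_without_link` is a hypothetical helper name and the pattern is an assumption based on this one excerpt.

```python
import re

# Header line of an FC target adapter, e.g.
# "slot 9: Fibre Channel Target Host Adapter 9c"
ADAPTER_RE = re.compile(r"Fibre Channel Target Host Adapter (\w+)")


def ports_without_link(sysconfig_text):
    """Return the adapter names whose description reports LINK NOT CONNECTED."""
    ports = []
    current = None
    for line in sysconfig_text.splitlines():
        match = ADAPTER_RE.search(line)
        if match:
            current = match.group(1)
        if current and "LINK NOT CONNECTED" in line:
            ports.append(current)
            # Avoid counting the same adapter twice.
            current = None
    return ports
```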
- The switch port connected to the target port is in the err-disabled state as per the `show interface brief` command, and it reports credit loss events under `show process creditmon credit-loss-events`.
Cisco# show interface gigabit0/7
GigabitEthernet4/45 is down, line protocol is down (err-disabled)
Hardware is Gigabit Ethernet, address is 001b.54aa.c107 (bia 001b.54aa.c107)
MTU 1500 bytes, BW 100000 Kbit, DLY 100 usec,
reliability 234/255, txload 1/255, rxload 1/255
Encapsulation ARPA, loopback not set
Keepalive set (10 sec)
`show process creditmon credit-loss-events`
Module: 04 Credit Loss Events: YES
----------------------------------------------------
| Interface | Total | Timestamp |
| | Events | |
----------------------------------------------------
| fc4/45 | 114 | 1. Mon Sep 18 22:56:17 2023 |
| | | 2. Mon Sep 18 22:56:13 2023 |
| | | 3. Mon Sep 18 22:55:43 2023 |
| | | 4. Mon Sep 18 22:55:42 2023 |
| | | 5. Mon Sep 18 22:55:19 2023 |
| | | 6. Mon Sep 18 22:54:45 2023 |
| | | 7. Mon Sep 18 22:54:30 2023 |
| | | 8. Mon Sep 18 22:54:19 2023 |
| | | 9. Mon Sep 18 22:54:09 2023 |
| | |10. Mon Sep 18 22:53:52 2023 |
----------------------------------------------------
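If the credit loss output covers several interfaces, the per-interface totals can be pulled out of the table for comparison. The following is a small Python sketch, assuming the table layout shown above; `credit_loss_totals` is a hypothetical name, and the pattern only recognizes interface names of the form `fcX/Y`.

```python
import re

# First row of an interface block, e.g.
# "| fc4/45 | 114 | 1. Mon Sep 18 22:56:17 2023 |"
ROW_RE = re.compile(r"^\|\s*(fc\d+/\d+)\s*\|\s*(\d+)\s*\|")


def credit_loss_totals(output):
    """Map each interface to its total credit loss event count."""
    totals = {}
    for line in output.splitlines():
        match = ROW_RE.match(line.strip())
        if match:
            totals[match.group(1)] = int(match.group(2))
    return totals
```

Continuation rows that carry only a timestamp have an empty interface column, so they are skipped and only the row holding the total is read.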
- SFP Tx and Rx power are in the recommended range on the target ports.
- Credit-loss-recovery is due to a complete lack of Tx B2B credits. It has two main causes:
- Severe congestion on the adjacent device. The end device needs to be investigated.
- Physical link problems leading to corrupted/lost B2B credits and frames.
- For problems involving link errors, all of the physical components (SFPs, cables, patch panels, etc.) need to be checked and/or replaced on both ends to isolate the issue further.