"Path redundancy to storage device" and "End path evaluation" alerts on ESXi host
Applies to
- VMware vSphere
- NetApp ONTAP
Issue
- When reviewing the ESXi vmkernel.log, you see events indicating that near-constant path evaluation is occurring for all mounted LUNs:
2022-12-05T16:04:46.349Z cpu8:2099281)StorageDevice: 7059: End path evaluation for device naa.600a098038331464853abcd5635391234
2022-12-05T16:04:46.349Z cpu30:2099276)StorageDevice: 7059: End path evaluation for device naa.600a09803831464853abcd5635394567
2022-12-05T16:04:46.349Z cpu86:2099277)StorageDevice: 7059: End path evaluation for device naa.600a09803831464853abcd563539a789
2022-12-05T16:04:46.349Z cpu47:2099282)StorageDevice: 7059: End path evaluation for device naa.600a09803831464853abcd5635390123
2022-12-05T16:04:46.349Z cpu34:2099278)StorageDevice: 7059: End path evaluation for device naa.600a09803831464853abcd563539cdef
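To gauge how frequent the path evaluation is, the events can be tallied per device. The following is a minimal Python sketch, assuming the log lines have the format shown above. The helper name `count_path_evaluations` and the `sample` list are illustrative only; on a real host the lines would be read from vmkernel.log.

```python
import re
from collections import Counter

# Pattern based on the log lines shown above; captures the device identifier.
END_EVAL_RE = re.compile(r"End path evaluation for device (\S+)")


def count_path_evaluations(lines):
    """Count 'End path evaluation' events per device (hypothetical helper)."""
    counts = Counter()
    for line in lines:
        match = END_EVAL_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Two lines copied from the excerpt above; replace with the contents of
# vmkernel.log when running against a real host.
sample = [
    "2022-12-05T16:04:46.349Z cpu8:2099281)StorageDevice: 7059: End path evaluation for device naa.600a098038331464853abcd5635391234",
    "2022-12-05T16:04:46.349Z cpu30:2099276)StorageDevice: 7059: End path evaluation for device naa.600a09803831464853abcd5635394567",
]

# Print devices ordered by how often they were re-evaluated.
for device, count in count_path_evaluations(sample).most_common():
    print(device, count)
```

A device that appears many times within a short time window is consistent with the near-constant evaluation described in this article.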
- In the vobd log, "Path redundancy to storage device" alerts are seen on the host:
Path redundancy to storage device naa.600a0980xxxxxxxxxxxxxxxxxxx degraded. Path vmhba4:C0:T181:L26 is down. Affected datastores: xxx.
Path redundancy to storage device naa.600a0980xxxxxxxxxxxxxxxxxxx degraded. Path vmhba4:C0:T181:L24 is down. Affected datastores: xxx.
- These events are frequent and are seen along with failing SEND KEY or MAINTENANCE IN SCSI commands (opcode 0xa3):
2022-12-05T22:10:15.259Z cpu50:2100588)VMW_SATP_ALUA: satp_alua_issueCommandOnPath:706: Path (vmhba64:C3:T1:L302) command 0xa3 failed with transient error status Transient storage condition, suggest retry. sense data: 0x6 0x3f 0xe. Waiting for 20 second$
2022-12-05T22:10:15.259Z cpu50:2100588)VMW_SATP_ALUA: satp_alua_issueCommandOnPath:706: Path (vmhba64:C4:T0:L302) command 0xa3 failed with transient error status Transient storage condition, suggest retry. sense data: 0x6 0x3f 0xe. Waiting for 20 second$
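The fields of interest in these messages are the path's runtime name, the SCSI opcode, and the three sense values (sense key, ASC, ASCQ). The sketch below pulls them out of one line, assuming the message format shown above. The function name `parse_satp_failure` is hypothetical, and the two lookup tables are deliberately limited to the values in this excerpt, with names taken from the SCSI Primary Commands (SPC) standard.

```python
import re

# Pattern based on the VMW_SATP_ALUA lines shown above.
SATP_FAIL_RE = re.compile(
    r"Path \((vmhba\d+:C\d+:T\d+:L\d+)\) command (0x[0-9a-fA-F]+) failed"
    r".*?sense data: (0x[0-9a-fA-F]+) (0x[0-9a-fA-F]+) (0x[0-9a-fA-F]+)"
)

# Only the values seen in this excerpt; names per the SPC standard.
SENSE_KEYS = {0x6: "UNIT ATTENTION"}
ASC_ASCQ = {(0x3F, 0x0E): "REPORTED LUNS DATA HAS CHANGED"}


def parse_satp_failure(line):
    """Return a dict describing a failed SATP command, or None if no match."""
    match = SATP_FAIL_RE.search(line)
    if match is None:
        return None
    path = match.group(1)
    opcode, key, asc, ascq = (int(v, 16) for v in match.groups()[1:])
    return {
        "path": path,
        "opcode": opcode,
        "sense_key": SENSE_KEYS.get(key, hex(key)),
        "additional_sense": ASC_ASCQ.get((asc, ascq), f"{asc:#x}/{ascq:#x}"),
    }


# Line copied from the excerpt above, including its truncated ending.
sample_line = (
    "2022-12-05T22:10:15.259Z cpu50:2100588)VMW_SATP_ALUA: "
    "satp_alua_issueCommandOnPath:706: Path (vmhba64:C3:T1:L302) command 0xa3 "
    "failed with transient error status Transient storage condition, suggest "
    "retry. sense data: 0x6 0x3f 0xe. Waiting for 20 second$"
)

print(parse_satp_failure(sample_line))
```

Under these assumptions, the sense data 0x6 0x3f 0xe decodes to a UNIT ATTENTION reporting that the LUN inventory behind the target has changed.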
- Note the runtime names of the paths on which those SCSI commands fail:
vmhba64:C4:T0:L302
vmhba64:C3:T1:L302
- In this example, for LUN ID 302, ESXi is attempting to issue the SCSI commands to targets T0 and T1. If you check the paths actually in use for the LUN, you will see that the target ID is different (T4):
esxcfg-mpath -b
naa.600a098038331464853abcd5635391234 : NETAPP iSCSI Disk (naa.600a098038331464853abcd5635391234)
vmhba64:C3:T4:L302 LUN:302 state:active iscsi Adapter: iqn.1998-01.com.vmware:esxiqn Target: IQN=iqn.1992-08.com.netapp:netappiqn Alias= Session=00023d000004 PortalTag=1037
vmhba64:C4:T4:L302 LUN:302 state:active iscsi Adapter: iqn.1998-01.com.vmware:esxiqn Target: IQN=iqn.1992-08.com.netapp:netappiqn Alias= Session=00023d000005 PortalTag=1036
vmhba64:C5:T4:L302 LUN:302 state:active iscsi Adapter: iqn.1998-01.com.vmware:esxiqn Target: IQN=iqn.1992-08.com.netapp:netappiqn Alias= Session=00023d000006 PortalTag=1035
vmhba64:C0:T4:L302 LUN:302 state:active iscsi Adapter: iqn.1998-01.com.vmware:esxiqn Target: IQN=iqn.1992-08.com.netapp:netappiqn Alias= Session=00023d000001 PortalTag=1032
vmhba64:C6:T4:L302 LUN:302 state:active iscsi Adapter: iqn.1998-01.com.vmware:esxiqn Target: IQN=iqn.1992-08.com.netapp:netappiqn Alias= Session=00023d000007 PortalTag=1034
vmhba64:C1:T4:L302 LUN:302 state:active iscsi Adapter: iqn.1998-01.com.vmware:esxiqn Target: IQN=iqn.1992-08.com.netapp:netappiqn Alias= Session=00023d000002 PortalTag=1042
vmhba64:C7:T4:L302 LUN:302 state:active iscsi Adapter: iqn.1998-01.com.vmware:esxiqn Target: IQN=iqn.1992-08.com.netapp:netappiqn Alias= Session=00023d000008 PortalTag=1033
vmhba64:C2:T4:L302 LUN:302 state:active iscsi Adapter: iqn.1998-01.com.vmware:esxiqn Target: IQN=iqn.1992-08.com.netapp:netappiqn Alias= Session=00023d000003 PortalTag=1038
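The comparison described above (targets named in the failing commands versus targets of the paths actually in use) can be automated. This is a rough Python sketch, assuming runtime names follow the vmhbaN:C:T:L form seen in this article. The function names `targets_by_lun` and `stale_targets` are hypothetical, and the sample strings are shortened excerpts of the log and `esxcfg-mpath -b` output above.

```python
import re

# Runtime name format as seen in this article: adapter:Channel:Target:LUN.
RUNTIME_RE = re.compile(r"(vmhba\d+):C(\d+):T(\d+):L(\d+)")


def targets_by_lun(text):
    """Map (adapter, LUN) to the set of target numbers found in the text."""
    result = {}
    for adapter, _channel, target, lun in RUNTIME_RE.findall(text):
        result.setdefault((adapter, int(lun)), set()).add(int(target))
    return result


def stale_targets(failing_log_text, mpath_output):
    """Return targets referenced by failing commands but absent from mpath output."""
    failing = targets_by_lun(failing_log_text)
    active = targets_by_lun(mpath_output)
    stale = {}
    for key, targets in failing.items():
        unknown = targets - active.get(key, set())
        if unknown:
            stale[key] = unknown
    return stale


# Shortened excerpts of the vmkernel.log lines and esxcfg-mpath -b output above.
failing_log = """
Path (vmhba64:C3:T1:L302) command 0xa3 failed with transient error status
Path (vmhba64:C4:T0:L302) command 0xa3 failed with transient error status
"""
mpath_output = """
vmhba64:C3:T4:L302 LUN:302 state:active iscsi
vmhba64:C4:T4:L302 LUN:302 state:active iscsi
"""

for (adapter, lun), targets in stale_targets(failing_log, mpath_output).items():
    print(adapter, "LUN", lun, "unexpected targets:", sorted(targets))
```

Any target reported here is one that ESXi is still addressing in the failing commands even though no current path to that LUN uses it, which matches the T0/T1 versus T4 mismatch in this example.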