Skip to main content
NetApp Knowledge Base

VMware Atomic Test and Set heartbeats impact connectivity to storage

Views:
3,797
Visibility:
Public
Votes:
6
Category:
ontap-9
Specialty:
virt
Last Updated:

Applies to

  • Data ONTAP    
  • SAN Protocols (iSCSI & FC)
  • VMware vSphere 5.5U2 and up
  • Atomic Test & Set (ATS) Heartbeat

Issue

  • ESXi hosts experience repeated LUN resets.  For example, from the vmkernel.log, events similar to the following are seen:

2022-04-04T12:33:17.949Z: [scsiCorrelator] 5693120512199us: [vob.scsi.scsipath.por] Power-on Reset occurred on naa.600a0980xxxxxxxxxxxxxxxxxxxxxxx
2022-04-04T12:41:45.715Z: [scsiCorrelator] 5693628237688us: [vob.scsi.scsipath.por] Power-on Reset occurred on naa.600a0980xxxxxxxxxxxxxxxxxxxxxxx
2022-04-04T12:41:48.217Z: [scsiCorrelator] 5693630739987us: [vob.scsi.scsipath.por] Power-on Reset occurred on naa.600a0980xxxxxxxxxxxxxxxxxxxxxxx
2022-04-04T12:41:59.394Z: [scsiCorrelator] 5693641916248us: [vob.scsi.scsipath.por] Power-on Reset occurred on naa.600a0980xxxxxxxxxxxxxxxxxxxxxxx

  • In addition to LUN resets, there is also evidence of datastore heartbeat timeouts:

2022-04-04T08:03:16.087Z: [vmfsCorrelator] 5677371531974us: [esx.problem.vmfs.heartbeat.timedout] xxxxxxxx-xxxxxxxx-xxx-xxxxxxxxxxx lun_vmfs_1
2022-04-04T08:03:28.087Z: [vmfsCorrelator] 5677383532045us: [esx.problem.vmfs.heartbeat.timedout] xxxxxxxx-xxxxxxxx-xxx-xxxxxxxxxxx lun_vmfs_2
2022-04-04T08:03:28.087Z: [vmfsCorrelator] 5677383532279us: [esx.problem.vmfs.heartbeat.timedout] xxxxxxxx-xxxxxxxx-xxx-xxxxxxxxxxx lun_vmfs_3
2022-04-04T08:03:28.087Z: [vmfsCorrelator] 5677383532445us: [esx.problem.vmfs.heartbeat.timedout] xxxxxxxx-xxxxxxxx-xxx-xxxxxxxxxxx lun_vmfs_4
2022-04-04T12:33:17.354Z: [vmfsCorrelator] 5693572799758us: [esx.problem.vmfs.heartbeat.timedout] xxxxxxxx-xxxxxxxx-xxx-xxxxxxxxxxx lun_vmfs_5
2022-04-04T12:43:14.401Z: [vmfsCorrelator] 5694169846835us: [esx.problem.vmfs.heartbeat.timedout] xxxxxxxx-xxxxxxxx-xxx-xxxxxxxxxxx lun_vmfs_6

  • When reviewing SCSI commands issued by the host, we find that both ATS (Atomic Test and Set) and SCSI Reservation commands are being issued:

SCSI RESERVE

2022-04-04T08:03:31.105Z cpu1:2097761)NMP: nmp_PathDetermineFailure:3560: SCSI cmd RESERVE failed on path vmhba64:C2:T1:L3, reservation state on device naa.600a0980383136446f3f4e4c5a484355 is unknown.
2022-04-04T08:03:31.118Z cpu1:2097761)NMP: nmp_PathDetermineFailure:3560: SCSI cmd RESERVE failed on path vmhba64:C2:T1:L3, reservation state on device naa.600a0980383136446f3f4e4c5a484355 is unknown.
2022-04-04T08:03:31.118Z cpu19:2097779)NMP: nmp_PathDetermineFailure:3560: SCSI cmd RESERVE failed on path vmhba64:C2:T1:L2, reservation state on device naa.600a0980383136446f3f4e4c5a484356 is unknown.
2022-04-04T08:03:31.131Z cpu1:2097761)NMP: nmp_PathDetermineFailure:3560: SCSI cmd RESERVE failed on path vmhba64:C2:T1:L3, reservation state on device naa.600a0980383136446f3f4e4c5a484355 is unknown.

ATS Compare and Write

2022-04-04T09:35:44.604Z cpu16:2098262)NMP: nmp_ThrottleLogForDevice:3872: Cmd 0x89 (0x45baf02353c0, 2099582) to dev "naa.600a0980xxxxxxxxxxxxxxxxxxxxxxx" on path "vmhba64:C3:T1:L2" Failed: H:0x0 D:0x2 P:0x0 Valid sense data: 0xe 0x1d 0x0. Act:NONE
2022-04-04T12:30:30.386Z cpu27:2098262)NMP: nmp_ThrottleLogForDevice:3872: Cmd 0x89 (0x459adcd68040, 2099582) to dev "naa.600a0980383136446f3f4e4c5a484356" on path "vmhba64:C3:T1:L2" Failed: H:0x0 D:0x2 P:0x0 Valid sense data: 0xe 0x1d 0x0. Act:NONE
2022-04-04T12:33:16.355Z cpu1:2097617)NMP: nmp_ThrottleLogForDevice:3872: Cmd 0x89 (0x45bac31031c0, 2097193) to dev "naa.600a09803831364468244e4c5a47494e" on path "vmhba64:C1:T0:L0" Failed: H:0x5 D:0x0 P:0x0 Invalid sense data: 0x0 0x0 0x0. Act:EVAL
2022-04-04T12:43:13.402Z cpu2:2097617)NMP: nmp_ThrottleLogForDevice:3872: Cmd 0x89 (0x45baf1751140, 2097193) to dev "naa.600a09803831364468244e4c5a47494e" on path "vmhba64:C0:T0:L0" Failed: H:0x5 D:0x0 P:0x0 Invalid sense data: 0x0 0x0 0x0. Act:EVAL

 

 

 

Sign in to view the entire content of this KB article.

New to NetApp?

Learn more about our award-winning Support

NetApp provides no representations or warranties regarding the accuracy or reliability or serviceability of any information or recommendations provided in this publication or with respect to any results that may be obtained by the use of the information or observance of any recommendations provided herein. The information in this document is distributed AS IS and the use of this information or the implementation of any recommendations or techniques herein is a customer's responsibility and depends on the customer's ability to evaluate and integrate them into the customer's operational environment. This document and the information contained herein may be used solely in connection with the NetApp products discussed in this document.
Scan to view the article on your device