MetroCluster VMware ESXi logs show datastores not available
Applies to
- Fabric MetroCluster
- ONTAP 9
- ATTO 7500N/7600N FibreBridge
- VMware ESXi
Issue
- ESXi logs show datastores becoming unavailable approximately every 30 minutes. Access is restored within a short time (about five seconds in the example below):
2024-01-05T12:34:35.389Z info hostd[2115691] [Originator@6876 sub=Vimsvc.ha-eventmgr] Event 1372 : Lost access to volume 61fbad99-55ed7c20-f077-e4434b628350 (NA99B_S98SVM10L9HsanVMWARElogvol01) due to connectivity issues. Recovery attempt is in progress and outcome will be reported shortly.
2024-01-05T12:34:40.580Z info hostd[2115654] [Originator@6876 sub=Vimsvc.ha-eventmgr] Event 1373 : Successfully restored access to volume 61fbad99-55ed7c20-f077-e4434b628350 (NA99B_S98SVM10L9HsanVMWARElogvol01) following connectivity issues.
- Front-end connectivity checks show no issues on the nodes or switches.
- The NetApp node logs show command timeouts, device timeouts, and SCSI commands aborted by the host adapter. The commands succeed on retry:
Tue Jan 16 10:02:43 +0100 [cl01n01: isp2400_timeout_2: fci.device.quiesce:debug]: Adapter 0g encountered a command timeout on Disk device sw1:8.126 (0x08070800) LUN 512 cdb 0x9a:00000000220c9999:0001:0030 retry: 0 Quiescing the device.
Tue Jan 16 10:02:44 +0100 [cl01n01: isp2400_timeout_2: fci.device.timeout:debug]: HBA 0g encountered a device timeout on Disk device sw1:8.126 (0x08070800) LUN 512 cdb 0x9a:00000000220c9999:0001:0030 retry: 0
Tue Jan 16 10:02:44 +0100 [cl01n01: isp2400_intrd: scsi.cmd.abortedByHost:error]: Disk device sw1:8.126L512: Command aborted by host adapter: HA status 0x4: cdb 0x9a:00000000220c9999:0001:0030.
Tue Jan 16 10:02:44 +0100 [cl01n01: slifc_intrd: scsi.cmd.retrySuccess:debug]: Disk device sw1:9.126L512: request successful after retry #1/#0: cdb 0x9a:00000000220c9999:0001:0030 (10648).
- The paths of the disks with errors are all common to one ATTO FibreBridge.
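
The checks above can be partly automated when many log lines need to be reviewed. The following is a minimal Python sketch, not a NetApp or VMware tool. It assumes the log lines look exactly like the excerpts shown in this article. The function names (`outage_durations`, `timeouts_per_device`) and the regular expressions are illustrative and may need adjusting for other log formats. It pairs each ESXi "Lost access" event with the matching "restored" event, and counts ONTAP timeout and abort events per disk device so that the affected devices can then be compared against their paths.

```python
import re
from collections import Counter
from datetime import datetime

# Patterns derived only from the sample lines in this article (assumption).
TS = r"(\d{4}-\d\d-\d\dT[\d:.]+Z)"
LOST_RE = re.compile(TS + r" .*Lost access to volume (\S+) \(")
RESTORED_RE = re.compile(TS + r" .*Successfully restored access to volume (\S+) \(")
EMS_RE = re.compile(
    r"(fci\.device\.(?:quiesce|timeout)|scsi\.cmd\.abortedByHost):"
    r".*?Disk device ([\w-]+:\d+\.\d+)"
)


def parse_ts(text):
    """Parse an ESXi hostd timestamp such as 2024-01-05T12:34:35.389Z."""
    return datetime.strptime(text, "%Y-%m-%dT%H:%M:%S.%fZ")


def outage_durations(lines):
    """Return (volume_uuid, lost_time, seconds_until_restored) tuples.

    Each 'Lost access' event is paired with the next 'restored' event
    for the same volume UUID.
    """
    pending = {}
    outages = []
    for line in lines:
        lost = LOST_RE.search(line)
        if lost:
            pending[lost.group(2)] = parse_ts(lost.group(1))
            continue
        restored = RESTORED_RE.search(line)
        if restored and restored.group(2) in pending:
            start = pending.pop(restored.group(2))
            seconds = (parse_ts(restored.group(1)) - start).total_seconds()
            outages.append((restored.group(2), start, seconds))
    return outages


def timeouts_per_device(lines):
    """Count quiesce, timeout, and abortedByHost events per disk device."""
    counts = Counter()
    for line in lines:
        match = EMS_RE.search(line)
        if match:
            counts[match.group(2)] += 1
    return counts
```

To use the sketch, read the hostd log and the node log into lists of lines and pass them to the two functions. Comparing the start times of consecutive outages shows whether the interval is close to 30 minutes. The device names with the highest counts are the ones to check against the storage path listing, which is not shown in this article, to see whether they share one bridge.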