NFS datastore write issues due to an MTU mismatch between host and storage ports
Applies to
- NFS
- ESXi
- Linux
- MTU
Issue
- A Linux-mounted NFS share hangs during write operations but remains accessible.
- ESXi-mounted NFS datastores remain accessible.
- The issue does not occur when both ends are set to an MTU of 1500.
- On the same device, ports connected to a different switch work without problems at an MTU of 9000.
- The issue occurs when using the VLAN of a0a ports.
- The issue does not occur when using the VLAN of a0b ports.
- All ports have MTU size set to 9000.
- The a0a port, its VLAN ports, and its member ports all report CRC errors:
vifmgr.cluscheck.crcerrors: Port a0a on node NodeA is reporting a high number of observed hardware errors, possibly CRC errors.
vifmgr.cluscheck.crcerrors: Port a0a-101 on node NodeA is reporting a high number of observed hardware errors, possibly CRC errors.
vifmgr.cluscheck.crcerrors: Port e0e on node NodeA is reporting a high number of observed hardware errors, possibly CRC errors.
vifmgr.cluscheck.crcerrors: Port e0g on node NodeA is reporting a high number of observed hardware errors, possibly CRC errors.
- Creating a new VM or running Storage vMotion starts successfully, but the transfer stops after a few MB, and the respective tasks fail with a timeout error a few minutes later.
- vmkernel.log shows the datastore intermittently entering and exiting the All Paths Down (APD) state:
2021-09-28T16:12:33.376Z cpu0:2098712)WARNING: NFS: 337: Lost connection to the server 172.27.143.110 mount point /netapp_test, mounted as 8b08cbd1-435c57cd-0000-000000000000 ("netapp_test")
2021-09-28T16:14:15.781Z cpu0:2098712)NFS: 346: Restored connection to the server 172.27.143.110 mount point /netapp_test, mounted as 8b08cbd1-435c57cd-0000-000000000000 ("netapp_test")
2021-09-28T16:14:15.781Z cpu6:2097603)StorageApdHandler: 507: APD exit event for 0x4314ef9fec20 [8b08cbd1-435c57cd]
2021-09-28T16:14:15.781Z cpu6:2097603)StorageApdHandlerEv: 117: Device or filesystem with identifier [8b08cbd1-435c57cd] has exited the All Paths Down state.
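To gauge how long each APD episode lasts, the "Lost connection" and "Restored connection" entries above can be paired by server and mount point. The following is a minimal Python sketch, not a supported tool: the function name outage_windows and the regular expressions are illustrative assumptions based only on the log format shown in this excerpt, and other ESXi releases may format these lines differently.

```python
import re
from datetime import datetime

# Sample lines copied from the vmkernel.log excerpt in this article.
SAMPLE = '''2021-09-28T16:12:33.376Z cpu0:2098712)WARNING: NFS: 337: Lost connection to the server 172.27.143.110 mount point /netapp_test, mounted as 8b08cbd1-435c57cd-0000-000000000000 ("netapp_test")
2021-09-28T16:14:15.781Z cpu0:2098712)NFS: 346: Restored connection to the server 172.27.143.110 mount point /netapp_test, mounted as 8b08cbd1-435c57cd-0000-000000000000 ("netapp_test")'''

# Assumed patterns, derived from the two message shapes above.
LOST = re.compile(
    r'^(\S+)Z cpu\d+:\d+\)WARNING: NFS: \d+: '
    r'Lost connection to the server (\S+) mount point (\S+),'
)
RESTORED = re.compile(
    r'^(\S+)Z cpu\d+:\d+\)NFS: \d+: '
    r'Restored connection to the server (\S+) mount point (\S+),'
)


def parse_ts(text):
    """Parse the UTC timestamp that precedes the trailing 'Z'."""
    return datetime.strptime(text, "%Y-%m-%dT%H:%M:%S.%f")


def outage_windows(lines):
    """Return (server, mount, seconds) for each lost/restored pair."""
    pending = {}   # (server, mount) -> time the connection was lost
    windows = []
    for line in lines:
        m = LOST.match(line)
        if m:
            ts, server, mount = m.groups()
            # Keep the earliest "lost" time if the message repeats.
            pending.setdefault((server, mount), parse_ts(ts))
            continue
        m = RESTORED.match(line)
        if m:
            ts, server, mount = m.groups()
            start = pending.pop((server, mount), None)
            if start is not None:
                seconds = (parse_ts(ts) - start).total_seconds()
                windows.append((server, mount, seconds))
    return windows


if __name__ == "__main__":
    for server, mount, seconds in outage_windows(SAMPLE.splitlines()):
        print(f"{server} {mount}: {seconds:.3f} s")
```

For the excerpt above, the connection to 172.27.143.110 is lost at 16:12:33.376 and restored at 16:14:15.781, an outage of about 102 seconds. Running the same pairing over a full vmkernel.log can show whether the outages line up with large write transfers.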