Packet loss causing slowness and latency outside of ONTAP
Applies to
ONTAP 9
Issue
- Packet loss identified by multiple duplicate acknowledgements (at least #1, #2, and #3) along with (fast) retransmissions
- Application slowness, high I/O wait times, performance issues, or higher latency are observed
- Throughput may drop to zero, then recover
- If the loss is severe enough, the connection may be disconnected
- ONTAP latency is low and utilization of CPU and disk is lower than normal or expected
- In ONTAP 9.5 or higher, packet loss may be seen increasing
Example: Increasing Rexmit and OOORcv columns for 10.1.2.9 from netstat -anceWCT
node 1:
Proto Recv-Q Send-Q  Rexmit    OOORcv 0-win Local Address     Foreign Address
=-=-=-=-=-=Sat Aug 13 2022, 23:16:01 -0400 BSD-NETSTAT-ANCEWCT 6 lines
tcp4       0      0 6198853 112463181     0 10.1.2.3.3260     10.1.2.9.47254
=-=-=-=-=-=Sat Aug 20 2022, 23:26:09 -0400 BSD-NETSTAT-ANCEWCT 6 lines
tcp4       0      0 1304064  57461127     0 10.1.2.3.3260     10.1.2.9.21933
node 2:
=-=-=-=-=-=Sat Aug 13 2022, 23:15:27 -0400 BSD-NETSTAT-ANCEWCT 6 lines
tcp4       0     48 5768522   3592331     0 10.40.40.27.3260  10.1.2.9.43744
=-=-=-=-=-=Sat Aug 20 2022, 23:25:03 -0400 BSD-NETSTAT-ANCEWCT 6 lines
tcp4       0     48 1366568  11947331     0 10.40.40.27.3260  10.1.2.9.43425
- ifstat has 0 errors on the port of the data LIF
- event log show has no entries indicating CRCs or other local link degradation
- Packet loss causes TCP to perform poorly, creating up to seconds of latency as seen from the user or application end
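
When many netstat -anceWCT snapshots have been collected, it can help to pull out the Rexmit and OOORcv values for one client programmatically. The following is a minimal sketch, not a NetApp tool: it assumes the column order shown in the sample above (Proto, Recv-Q, Send-Q, Rexmit, OOORcv, 0-win, Local Address, Foreign Address) and the "=-=-=" snapshot separator lines. The names TcpRow, parse_netstat, and foreign_host are hypothetical helpers introduced for illustration. Actual output may include additional columns or rows, so adjust the field indexes to match what is captured.

```python
from typing import List, NamedTuple


class TcpRow(NamedTuple):
    """One tcp row from a snapshot (hypothetical structure for this sketch)."""
    snapshot: str   # timestamp taken from the preceding separator line
    rexmit: int     # Rexmit column
    ooorcv: int     # OOORcv column
    zero_win: int   # 0-win column
    local: str      # Local Address, as "ip.port"
    foreign: str    # Foreign Address, as "ip.port"


def parse_netstat(text: str) -> List[TcpRow]:
    """Parse snapshot text, assuming the column order of the sample above."""
    rows: List[TcpRow] = []
    snapshot = ""
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("=-="):
            # Separator line: keep only the timestamp portion.
            snapshot = line.lstrip("=-").split(" BSD-NETSTAT")[0]
        elif line.startswith("tcp"):
            f = line.split()
            # f[0]=Proto f[1]=Recv-Q f[2]=Send-Q f[3]=Rexmit f[4]=OOORcv
            # f[5]=0-win f[6]=Local Address f[7]=Foreign Address
            rows.append(TcpRow(snapshot, int(f[3]), int(f[4]), int(f[5]), f[6], f[7]))
    return rows


def foreign_host(addr: str) -> str:
    """Drop the trailing port from an "a.b.c.d.port" address."""
    return addr.rsplit(".", 1)[0]


# Node 1 rows copied from the sample above.
SAMPLE = """\
Proto Recv-Q Send-Q Rexmit OOORcv 0-win Local Address Foreign Address
=-=-=-=-=-=Sat Aug 13 2022, 23:16:01 -0400 BSD-NETSTAT-ANCEWCT 6 lines
tcp4 0 0 6198853 112463181 0 10.1.2.3.3260 10.1.2.9.47254
=-=-=-=-=-=Sat Aug 20 2022, 23:26:09 -0400 BSD-NETSTAT-ANCEWCT 6 lines
tcp4 0 0 1304064 57461127 0 10.1.2.3.3260 10.1.2.9.21933
"""

if __name__ == "__main__":
    for row in parse_netstat(SAMPLE):
        if foreign_host(row.foreign) == "10.1.2.9":
            print(row.snapshot, "Rexmit:", row.rexmit, "OOORcv:", row.ooorcv)
```

Note that the two snapshots in the sample show different foreign ports, so they are different TCP connections. The sketch therefore lists the values per snapshot rather than subtracting one snapshot from another.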