Why is a workload's latency high when the IOPS are low?
Applies to
- ONTAP 9
- Data ONTAP 7-mode
Answer
- ONTAP responds to requests as they come in, so a workload with few requests can appear to have higher latency while actually responding perfectly fine
- Low-IOPS workloads (e.g., 5 IOPS and 32 kB/s) will:
- Not be in RAM cache, so requests will need to go to disk more often
- Not have a large sample size, so the reported average is not statistically significant (more in Additional Information)
- Not have enough samples to average out any outliers
- To put this another way: low-IOPS workloads are not a problem in the absence of other symptoms (errors, application not responding, network issues, etc.)
- Low IOPS typically means below 500-600 IOPS, but this can vary. Reported latency can reach the seconds or tens-of-seconds range due to latency averaging skew
- Increasing the workload on the volume with low IOPS can further help determine if latency skew is the reason the latency shows an inflated number
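The effect of adding workload can be sketched with simple arithmetic. The snippet below is only an illustration, not an ONTAP tool: it assumes a single slow operation (100 ms) among otherwise fast ones (1 ms), and the function name `mean_latency_ms` and both latency values are hypothetical choices made for the example.

```python
def mean_latency_ms(total_ops, outlier_ms=100.0, typical_ms=1.0):
    """Mean latency when exactly one op is slow and all others are typical.

    outlier_ms and typical_ms are illustrative assumptions, not ONTAP values.
    """
    if total_ops < 1:
        raise ValueError("total_ops must be at least 1")
    return (outlier_ms + (total_ops - 1) * typical_ms) / total_ops


# The same single outlier weighs less as the number of ops in the
# sampling period grows, so the reported average drops toward 1 ms.
for ops in (3, 20, 600):
    print(f"{ops:>4} ops: mean = {mean_latency_ms(ops):.2f} ms")
```

If the reported latency falls in this way as IOPS increase, that is consistent with averaging skew rather than with a real performance problem.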
Additional Information
- Definitions:
- mean: average, or the sum of all instance values divided by number of instances
- median: the instance value in the middle when values are ordered from smallest to largest
- mode: the instance value occurring most often
- In statistics, the mean, median, and mode are used together to judge whether an average is representative of the underlying data
Example 1: Latency observed across 3 instances in a period (say 3 ops in a minute): 1 ms, 100 ms, 1 ms
- mean: (1+100+1)/3=34 ms
- median: 1 ms
- mode: 1 ms
- ONTAP will often give average latency, but in this case, the median and mode show that latency is actually really good
Example 2: Latency observed across 20 instances (7 ops/second): 1 ms, 1 ms, 1 ms, 1 ms, 100 ms, 1 ms, 1 ms ... 1 ms (19 @ 1 ms, 1 @ 100 ms)
- mean: (19+100)/20=5.95 ms
- median: 1 ms
- mode: 1 ms
- In this case, average latency is more accurate than the prior example because we have enough data to have better confidence in the numbers
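The two examples can be reproduced with Python's standard `statistics` module. This is a minimal sketch for checking the arithmetic. The helper name `summarize` is hypothetical, and the sample lists simply restate the latencies from Example 1 and Example 2.

```python
from statistics import mean, median, mode

# Latency samples in milliseconds, taken from the two examples above.
example_1 = [1, 100, 1]          # 3 ops, one outlier
example_2 = [1] * 19 + [100]     # 20 ops, one outlier


def summarize(latencies_ms):
    """Return the mean, median, and mode of a list of latency samples."""
    return {
        "mean": round(mean(latencies_ms), 2),
        "median": median(latencies_ms),
        "mode": mode(latencies_ms),
    }


print("Example 1:", summarize(example_1))
print("Example 2:", summarize(example_2))
```

In both cases the median and mode stay at 1 ms, while the mean moves from 34 ms to 5.95 ms as the sample count grows.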
How to identify a client, network, or ONTAP problem calculating concurrency