High Read or Write Latency due to CPU bottleneck from user workload
- Category: aff-series
- Specialty: perf
- Last Updated: 3/11/2025, 2:43:59 PM
Applies to
- AFF & FAS
- ONTAP 9
Issue
- A Performance Capacity alert may be triggered for a node in Active IQ Unified Manager
- An "IO wait time" alarm is reported in vCenter.
- High CPU utilization (above 80%) is seen on the node.
- Node 1 shows high CPU utilization due to user workload while node 2 is idle, as seen in the output of the nodeshell command sysstat -x 1.
- High read and/or write latency is seen on some or all volumes and/or LUNs.
- Most of the user workload runs on one node, while the partner and other nodes in the cluster are mostly idle, leaving the cluster unbalanced.
- Examples of the alerts that Active IQ Unified Manager may raise for latency and CPU utilization threshold breaches:
Latency value of 12.2 ms/op on Cluster1_N6 has triggered a WARNING event based on threshold setting of 10.0 ms/op
NetApp Node Node-1:kernel:Node-1 is reporting high CPU utilization of 91.1637%, placing the node into warn state
Note: Columns removed to improve readability

Cluster::> node run node1 sysstat -x 1
CPU    NFS  CIFS  HTTP  Total  Net kB/s       Disk kB/s
                                    in   out    read    write
89%  22453     0     0  22463  1491948  8098  664188  2631848
86%  22448     0     0  22478  1492337  8121  607184   658216
95%  24478     0     0  24509  1592134  8106   78844   101992
85%  22453     0     0  23134  1492587  8108  810668  2736420

Cluster::> qos statistics volume latency show
Workload            ID    Latency    Network    Cluster       Data      Disk       QoS     NVRAM
--------------- ------ ---------- ---------- ---------- ---------- --------- --------- ---------
-total-              -   136.49ms    99.00us    70.00us   136.17ms  153.00us       0ms       0ms
vserver1_vol1..   4201   206.05ms   130.00us        0ms   205.88ms   44.00us       0ms       0ms
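
The sample output above can be read with a little arithmetic. The following is a minimal Python sketch, not a NetApp tool: the values are copied by hand from the sysstat and QoS samples, and the names (to_ms, CPU_THRESHOLD_PCT, and so on) are hypothetical. It assumes the 80% CPU figure from the Issue list as the threshold, and it assumes the Data column reflects time spent in the node's data layer, which is consistent with the CPU bottleneck described in the title.

```python
# Values copied by hand from the sample output above; nothing is queried from ONTAP.
CPU_THRESHOLD_PCT = 80  # assumption: the ">80%" figure from the Issue list

# CPU column of the four sysstat -x 1 samples for node1.
sysstat_cpu_pct = [89, 86, 95, 85]

# "-total-" row of the qos statistics volume latency show output.
total_latency = "136.49ms"
components = {
    "Network": "99.00us",
    "Cluster": "70.00us",
    "Data": "136.17ms",
    "Disk": "153.00us",
    "QoS": "0ms",
    "NVRAM": "0ms",
}


def to_ms(value: str) -> float:
    """Convert a latency string such as '136.17ms' or '99.00us' to milliseconds."""
    if value.endswith("us"):
        return float(value[:-2]) / 1000.0
    if value.endswith("ms"):
        return float(value[:-2])
    raise ValueError(f"unrecognized latency unit in {value!r}")


# Average CPU across the samples, and whether every sample breaches the threshold.
avg_cpu = sum(sysstat_cpu_pct) / len(sysstat_cpu_pct)
all_above = all(pct > CPU_THRESHOLD_PCT for pct in sysstat_cpu_pct)

# Find which latency component contributes most to the total.
components_ms = {name: to_ms(raw) for name, raw in components.items()}
dominant = max(components_ms, key=components_ms.get)
share = components_ms[dominant] / to_ms(total_latency)

print(f"average CPU: {avg_cpu:.2f}% (all samples above {CPU_THRESHOLD_PCT}%: {all_above})")
print(f"dominant latency component: {dominant} ({share:.1%} of total)")
```

With these sample values, the average CPU is 88.75% with every sample above 80%, and the Data component accounts for about 99.8% of the total latency, while Network, Cluster, and Disk together contribute well under 1 ms.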