Sudden latency and CPU utilization from workloads going idle to busy resolved by QoS
Applies to
- ONTAP 9
- FAS/AFF systems
Issue
- CPU is high (near or at 100%) constantly; WAFL_Ex may be one of the busiest domains (where data processing happens)
- Hosts report high IO wait times for a particular duration in a day.
- Clients report slowness while accessing multiple shares, and increased latency may be visible in AIQUM for the same timestamps.
- Latency increases suddenly
- NFS may be impacted
- sysstat -M 1 can be used to check the busiest domain.
Example: sysstat output shows CPU rising due to an increase in user workload (columns removed for readability)
Cluster::> node run -node <node> -command sysstat -x 1
 CPU    NFS  CIFS  HTTP  Total       Net kB/s        HDD kB/s
                                    in    out    read   write
 11%   1324     0     0   1324     169    131    5300       0
 28%     72     0     0     72     483    526    4928      12
 53%    175     0     0    175     254    407    5176      24
 23%    143     0     0    143     146     72    4752       0
 12%    230     0     0    230     134    259    5808      24
 40%   5766     0     0   5766     207    720   44336   36956
 53%    108     0     0    108   15698  14391   32340      24
 46%     30     0     0     30   30975  30269   29900       0
 87%  32124     0     0  32124  576397  53287  203513      12
 99%  44334     0     0  44334  659406  45518  256931  251353
 99%  43692     0     0  43692  609739  16930  263599  565448
 99%  44492     0     0  44492  633509  41562  261366  116257
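
The idle-to-busy transition in the sysstat sample above can also be located programmatically. The following is a minimal Python sketch, not a NetApp tool: it assumes the trimmed nine-column layout shown in this article (full sysstat -x output has more columns), and the names parse_rows and first_sustained_busy, the 90% CPU threshold, and the three-sample window are all hypothetical choices made for illustration.

```python
# Sketch only: parses the trimmed nine-column sysstat sample from this article.
# Field names, threshold, and window size are illustrative assumptions.

SAMPLE = """\
11% 1324 0 0 1324 169 131 5300 0
28% 72 0 0 72 483 526 4928 12
53% 175 0 0 175 254 407 5176 24
23% 143 0 0 143 146 72 4752 0
12% 230 0 0 230 134 259 5808 24
40% 5766 0 0 5766 207 720 44336 36956
53% 108 0 0 108 15698 14391 32340 24
46% 30 0 0 30 30975 30269 29900 0
87% 32124 0 0 32124 576397 53287 203513 12
99% 44334 0 0 44334 659406 45518 256931 251353
99% 43692 0 0 43692 609739 16930 263599 565448
99% 44492 0 0 44492 633509 41562 261366 116257
"""

FIELDS = ("cpu_pct", "nfs", "cifs", "http", "total",
          "net_in", "net_out", "disk_read", "disk_write")


def parse_rows(text):
    """Return one dict per data row; header and prompt lines are skipped."""
    rows = []
    for line in text.splitlines():
        parts = line.split()
        # A data row has exactly nine fields and starts with a percentage.
        if len(parts) != len(FIELDS) or not parts[0].endswith("%"):
            continue
        try:
            values = [int(parts[0].rstrip("%"))] + [int(p) for p in parts[1:]]
        except ValueError:
            continue
        rows.append(dict(zip(FIELDS, values)))
    return rows


def first_sustained_busy(rows, cpu_threshold=90, min_samples=3):
    """Index of the first row that starts min_samples consecutive rows
    at or above cpu_threshold, or None if no such run exists."""
    run = 0
    for i, row in enumerate(rows):
        run = run + 1 if row["cpu_pct"] >= cpu_threshold else 0
        if run == min_samples:
            return i - min_samples + 1
    return None


if __name__ == "__main__":
    rows = parse_rows(SAMPLE)
    start = first_sustained_busy(rows)
    if start is None:
        print("No sustained high-CPU period found")
    else:
        row = rows[start]
        print(f"Sustained high CPU starts at sample {start}: "
              f"CPU {row['cpu_pct']}%, NFS ops {row['nfs']}")
```

Requiring several consecutive samples above the threshold avoids flagging a single short spike, such as the 87% sample that precedes the sustained 99% period in the data above.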