High CPU due to user workload causing read, write and other latency

Applies to

ONTAP 9

Issue

CPU utilization is near 100% and may have an impact on read/write latency
High read/write latency is shown at the node level in Active IQ Unified Manager (AIQUM), or on the impacted volumes, with the latency coming from the CPU (D-blade)

The EMS log reports the wafl.cp.toolong error event
Application jobs are inconsistent or take longer than usual
An Active IQ Unified Manager alert can also be seen sometimes:

High CPU utilization Error: cluster1:kernel:node1 on cluster1 is reporting high CPU utilization of 91.1024 %, placing the node into warn state

Workload cannot be reduced

Example: Node 1 has high CPU utilization due to user workload, while the other nodes of the cluster are idle or barely utilized, as seen in the output of the nodeshell sysstat -x 1 command

Note: Columns removed to improve readability

::> node run node1 sysstat -x 1
CPU    NFS   CIFS   HTTP   Total       Net   kB/s    Disk   kB/s   
                                        in    out    read  write   
97%  22453      0      0   22463  1491948   8098  664188 2631848 
91%  22448      0      0   22478  1492337   8121  607184  658216 
94%  22478      0      0   22509  1492134   8106   78844  101992 
96%  22453      0      0   23134  1492587   8108  810668 2736420 
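
The sysstat samples above can be summarized with a short script when many intervals have been captured. The following is a minimal Python sketch, not a NetApp tool: it assumes the output has been pasted into a string, that the first column is the CPU percentage as shown above, and it uses a 90% threshold that is a hypothetical value chosen for illustration only.

```python
# Sample rows copied from the sysstat -x 1 output above (columns already trimmed).
SYSSTAT_SAMPLE = """\
97%  22453      0      0   22463  1491948   8098  664188 2631848
91%  22448      0      0   22478  1492337   8121  607184  658216
94%  22478      0      0   22509  1492134   8106   78844  101992
96%  22453      0      0   23134  1492587   8108  810668 2736420
"""


def cpu_samples(text):
    """Return the CPU percentage from each data row, skipping header lines."""
    samples = []
    for line in text.splitlines():
        fields = line.split()
        # Data rows start with a value such as "97%"; header rows do not.
        if fields and fields[0].endswith("%") and fields[0][:-1].isdigit():
            samples.append(int(fields[0][:-1]))
    return samples


def summarize_cpu(text, threshold=90):
    """Summarize CPU samples; threshold is an illustrative assumption."""
    samples = cpu_samples(text)
    if not samples:
        raise ValueError("no CPU samples found in the supplied text")
    return {
        "mean": sum(samples) / len(samples),
        "min": min(samples),
        "max": max(samples),
        # True only if every captured interval is at or above the threshold.
        "sustained_high": all(s >= threshold for s in samples),
    }


if __name__ == "__main__":
    print(summarize_cpu(SYSSTAT_SAMPLE))
```

For the four samples shown, every interval is above 90%, which points to sustained load rather than a short spike.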

::> qos statistics volume latency show
Workload            ID    Latency    Network    Cluster       Data       Disk        QoS      NVRAM
--------------- ------ ---------- ---------- ---------- ----------  ---------  ---------  ---------
-total-              -   136.49ms    99.00us    70.00us   136.17ms   153.00us        0ms        0ms
vserver1_vol1..   4201   206.05ms   130.00us        0ms   205.88ms    44.00us        0ms        0ms
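
In the output above, almost all of the latency falls in the Data column, which is consistent with the latency coming from the CPU (D-blade) rather than from the network or disks. The following minimal Python sketch shows one way to find the dominant component in a row of this output. It assumes the column order shown above and only the us and ms unit suffixes that appear in this example; the function names are hypothetical and are not part of ONTAP.

```python
# Component columns in the order shown in the output above, after Latency.
COMPONENTS = ["Network", "Cluster", "Data", "Disk", "QoS", "NVRAM"]

# Rows copied from the qos statistics volume latency show output above.
TOTAL_ROW = "-total-              -   136.49ms    99.00us    70.00us   136.17ms   153.00us        0ms        0ms"
VOLUME_ROW = "vserver1_vol1..   4201   206.05ms   130.00us        0ms   205.88ms    44.00us        0ms        0ms"


def to_ms(value):
    """Convert a value such as '99.00us' or '136.49ms' to milliseconds."""
    if value.endswith("us"):
        return float(value[:-2]) / 1000.0
    if value.endswith("ms"):
        return float(value[:-2])
    # Other suffixes are not handled in this sketch.
    raise ValueError(f"unrecognized latency value: {value!r}")


def dominant_component(row):
    """Return (workload, component, share of total latency) for one row."""
    fields = row.split()
    workload = fields[0]           # fields[1] is the workload ID
    total_ms = to_ms(fields[2])    # Latency column
    parts = dict(zip(COMPONENTS, (to_ms(v) for v in fields[3:9])))
    name = max(parts, key=parts.get)
    share = parts[name] / total_ms if total_ms else 0.0
    return workload, name, share


if __name__ == "__main__":
    for row in (TOTAL_ROW, VOLUME_ROW):
        workload, name, share = dominant_component(row)
        print(f"{workload}: {name} accounts for {share:.1%} of latency")
```

For both rows shown, the Data component accounts for more than 99% of the total latency.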