NetApp Knowledge Base

Active IQ Wellness: Up to High Impact - This system is reporting a high average CPU utilization

Category:
ontap-9
Specialty:
perf
 

Applies to

  • ONTAP 9
  • Data ONTAP 8.2 7-Mode 

Answer

Value of reviewing this information:

High CPU utilization is not in itself a problem for ONTAP 9 or Data ONTAP 7-Mode. However, together with throughput, it indicates the current workload of a system, and increases should be monitored. As system utilization (measured by CPU utilization) rises, the latency of I/O operations may increase, which can affect the performance of applications that use the system.

How is this wellness check validated?

For more information on how to validate CPU utilization via CLI for ONTAP 9 and Data ONTAP 7-Mode, see KB: CPU utilization in Data ONTAP: Scheduling and Monitoring
 
The risk is validated via AutoSupport Counter Manager data sent to NetApp in Daily Performance Data Notice AutoSupport messages. 
 

Average CPU utilization is compared across all existing NetApp systems to determine the level of impact for this alert:

  • Values greater than the 99.5th percentile (the top 0.5%) result in a High Risk
  • Values from the 99th to the 99.5th percentile result in a Medium Risk
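
The percentile bands above can be illustrated with a minimal sketch. It assumes a simple percentile rank (the share of fleet values strictly below the system's value). The article does not state how NetApp computes the rank, and the function names `percentile_rank` and `classify_risk` are hypothetical, not part of any NetApp tool.

```python
from bisect import bisect_left
from typing import Optional, Sequence


def percentile_rank(value: float, fleet_values: Sequence[float]) -> float:
    """Return the percentage of fleet values strictly below `value`.

    Assumption: this is one common definition of percentile rank; the
    article does not specify which definition Active IQ uses.
    """
    ordered = sorted(fleet_values)
    if not ordered:
        raise ValueError("fleet_values must not be empty")
    return 100.0 * bisect_left(ordered, value) / len(ordered)


def classify_risk(rank: float) -> Optional[str]:
    """Map a percentile rank to the risk level described in the article."""
    if rank > 99.5:
        return "High"      # above the 99.5th percentile (top 0.5%)
    if rank >= 99.0:
        return "Medium"    # from the 99th to the 99.5th percentile
    return None            # below the 99th percentile: no risk raised
```

For example, `classify_risk(percentile_rank(my_avg_cpu, fleet_avg_cpu))` would label one system, where `my_avg_cpu` and `fleet_avg_cpu` are placeholder inputs supplied by the caller.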
What should I do about the information provided by this Active IQ Wellness rule?  

If you already have a plan for this proactive Active IQ warning, please acknowledge it within your Active IQ dashboard. This ensures that the Wellness warnings you see are only those issues that you do not yet have a plan to address.
 
To address this type of scenario: 

  1. Monitor workload indicators such as throughput (xbps, IOPS) and CPU% (averages and peaks), and track trends so that you can respond and plan before performance is affected. A good starting point is the Performance Management guidance in the ONTAP documentation, or regularly scheduled sysstat -M runs for 7-Mode.
    If monitoring shows latency increasing along with CPU%, plan to reduce or relocate workload as necessary to maintain the expected performance.
  2. It is recommended to use Active IQ Unified Manager to monitor performance.
    Workload latency is your best indicator of probable issues, as it will increase as system load does.
  3. Use Active IQ to review CPU over time:

    [Screenshot: 1103632-1.png]
 
  1. If CPU utilization is consistently above 60% per node (7-Mode, or ONTAP 9.6 and earlier) or 70% per node (ONTAP 9.7 and later), review workloads and relocate them to less busy nodes.
  2. If the utilization is intermittent, or a one-time spike, it may not be necessary to take any action other than continued monitoring.
  3. You can use System Manager or the CLI to move a volume.
  4. For more information, see KB: CPU utilization in Data ONTAP: Scheduling and Monitoring
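
The threshold guidance above can be sketched as a small check over a series of per-node CPU samples. This is an illustration only: the article does not define "consistently", so the `sustained_fraction` parameter (the share of samples that must exceed the limit) is an assumption, and `cpu_threshold` and `assess_node` are hypothetical helper names rather than NetApp functions.

```python
from typing import Sequence, Tuple


def cpu_threshold(release: Tuple[int, int], seven_mode: bool = False) -> float:
    """Return the per-node CPU% limit from the article.

    60% for 7-Mode or ONTAP 9.6 and earlier, 70% for ONTAP 9.7 and later.
    `release` is a (major, minor) tuple such as (9, 7).
    """
    if seven_mode or release < (9, 7):
        return 60.0
    return 70.0


def assess_node(
    samples: Sequence[float],
    release: Tuple[int, int],
    seven_mode: bool = False,
    sustained_fraction: float = 0.8,  # assumed meaning of "consistently"
) -> str:
    """Classify a node from its CPU% samples."""
    if not samples:
        raise ValueError("samples must not be empty")
    limit = cpu_threshold(release, seven_mode)
    over = sum(1 for s in samples if s > limit)
    if over == 0:
        return "ok"
    if over / len(samples) >= sustained_fraction:
        # Utilization is above the limit for most of the window.
        return "review and relocate workloads"
    # Intermittent or one-time spike: keep watching.
    return "continue monitoring"
```

Feeding the function daily averages over several weeks, rather than a handful of raw samples, is closer to the trend-based review the steps describe. Latency should still be checked alongside the result, since the article names it as the best indicator of probable issues.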

Additional Information

Where can I find more information on this topic? 

 

NetApp provides no representations or warranties regarding the accuracy or reliability or serviceability of any information or recommendations provided in this publication or with respect to any results that may be obtained by the use of the information or observance of any recommendations provided herein. The information in this document is distributed AS IS and the use of this information or the implementation of any recommendations or techniques herein is a customer's responsibility and depends on the customer's ability to evaluate and integrate them into the customer's operational environment. This document and the information contained herein may be used solely in connection with the NetApp products discussed in this document.