NetApp Knowledge Base

AIQUM raises a "Cluster Monitoring is stuck" event and fails to gather information from all nodes

Applies to

  • Active IQ Unified Manager (AIQUM) 9.13P1
  • ONTAP 9

Issue

  • AIQUM raises a Cluster Monitoring is stuck event:

Cluster Monitoring is stuck

Monitoring is stuck for the cluster <CLUSTER>. Reason: Monitoring completion time exceeded. Monitoring StartTime: <TIMESTAMP>, Monitoring EndTime: <TIMESTAMP>. Contact AIQUM technical support.

  • AIQUM fails to gather information from all ONTAP clusters
  • ocumserver.log indicates a timeout during data acquisition:

ERROR [oncommand] [reconciliation-1] [<CLUSTER>(incremental@13:50:42.775)|complete] [c.n.d.c.CollectionCompletionNotifier] Timeout occurred while waiting on collection completion listener ClusterSparesEventDetector..EnhancerBySpringCGLIB..a7e633af. Cancelling it so that others can continue.
INFO  [oncommand] [collection-completion-sync-8] [c.n.d.c.r.SnapmirrorTableCompletionListener] SnapmirrorRelationship Table is updated after collection completion for cluster <CLUSTER>
INFO  [oncommand] [reconciliation-1] [<CLUSTER>(incremental@13:50:42.775)|complete] [c.n.d.c.CollectionCompletionNotifier] Collection completion notification for cluster <CLUSTER>(inventory.ontap.fas.Cluster:1910729) finished in 0:15:00.625.
INFO  [oncommand] [reconciliation-1] [<CLUSTER>(incremental@13:50:42.775)] [com.netapp.dfm.collector.LockUtils] Releasing reconciliation-processing lock(java.util.concurrent.locks.ReentrantReadWriteLock$WriteLock@32cddd9e[Locked by thread reconciliation-1]) for 51
ERROR [oncommand] [reconciliation-1] [<CLUSTER>(incremental@13:50:42.775)] [c.n.dfm.collector.OcieJmsListener] Inventory change listener error
java.util.concurrent.TimeoutException: null
    at java.base/java.util.concurrent.FutureTask.get(FutureTask.java:204)
    at deployment.dfm-app.war//com.netapp.dfm.collector.CollectionCompletionNotifier.notifyListeners(CollectionCompletionNotifier.java:320)
    at deployment.dfm-app.war//com.netapp.dfm.collector.OcieJmsListener.reconcileAndNotify(OcieJmsListener.java:1322)
    at deployment.dfm-app.war//com.netapp.dfm.collector.OcieJmsListener.reconcileDataSourceChanges(OcieJmsListener.java:1009)
    at deployment.dfm-app.war//com.netapp.dfm.collector.OcieJmsListener.handleChangeMessage(OcieJmsListener.java:976)
    at deployment.dfm-app.war//com.netapp.dfm.collector.OcieJmsListener$2.run(OcieJmsListener.java:483)
    at deployment.dfm-app.war//com.netapp.dfm.common.metrics.executor.ThreadPoolMonitorExecutor.lambda$submit$1(ThreadPoolMonitorExecutor.java:179)
    at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
    at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
    at deployment.dfm-app.war//com.netapp.dfm.common.metrics.executor.ThreadPoolMonitorExecutor.lambda$execute$0(ThreadPoolMonitorExecutor.java:165)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
    at java.base/java.lang.Thread.run(Thread.java:829)
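
To check how often this pattern occurs in a given ocumserver.log, the excerpt above can be scanned with a short script. The following is a minimal sketch in Python, not a NetApp tool: the function names are hypothetical, and the two patterns are taken from the wording of the excerpt, so they may need adjusting for other AIQUM versions.

```python
import re
from collections import Counter

# Both patterns are assumptions based on the log excerpt above.
TIMEOUT_RE = re.compile(
    r"Timeout occurred while waiting on collection completion listener (?P<listener>\S+)"
)
FINISHED_RE = re.compile(
    r"Collection completion notification for cluster .*? finished in "
    r"(?P<h>\d+):(?P<m>\d{2}):(?P<s>\d{2}(?:\.\d+)?)"
)


def listener_name(raw):
    """Drop the proxy suffix, shown as '..Enhancer...' in the excerpt
    (it may appear with '$$' in a raw log)."""
    return re.split(r"\.\.|\$\$", raw, maxsplit=1)[0]


def summarize_collection_log(lines):
    """Count listener timeouts per listener and collect the
    'finished in H:MM:SS.mmm' durations, converted to seconds."""
    timeouts = Counter()
    durations = []
    for line in lines:
        m = TIMEOUT_RE.search(line)
        if m:
            timeouts[listener_name(m.group("listener"))] += 1
        m = FINISHED_RE.search(line)
        if m:
            seconds = (
                int(m.group("h")) * 3600
                + int(m.group("m")) * 60
                + float(m.group("s"))
            )
            durations.append(seconds)
    return timeouts, durations
```

Passing the open log file to summarize_collection_log returns the timeout count per listener and the list of notification durations in seconds. For the excerpt above, the single duration is 900.625 seconds (0:15:00.625).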

  • journalctl reports many Rate limit exceeded messages:

<AIQUM_HOST> kernel: Rate limit exceeded: IN=eth0 OUT= MAC=<MAC_ADDRESS> SRC=<IP> DST=<AIQUM_IP> LEN=52 TOS=0x02 PREC=0x00 TTL=124 ID=32617 DF PROTO=TCP SPT=64078 DPT=443 WINDOW=8192 RES=0x00 CWR ECE SYN URGP=0 
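
To see which sources and destination ports account for these messages, the kernel log lines can be grouped by their SRC and DPT fields. The sketch below is a hedged illustration in Python: the function names are hypothetical, and it assumes every relevant line contains the literal text "Rate limit exceeded" followed by space-separated KEY=value fields, as in the sample above.

```python
import re
from collections import Counter

# KEY=value tokens; a value may be empty (for example "OUT=").
FIELD_RE = re.compile(r"(\w+)=(\S*)")


def parse_fields(line):
    """Return the KEY=value pairs of one kernel log line as a dict."""
    return dict(FIELD_RE.findall(line))


def count_rate_limited(lines, marker="Rate limit exceeded"):
    """Count marker lines per (source address, destination port)."""
    counts = Counter()
    for line in lines:
        if marker not in line:
            continue  # ignore unrelated kernel messages
        fields = parse_fields(line)
        counts[(fields.get("SRC"), fields.get("DPT"))] += 1
    return counts
```

For example, the output of journalctl -k can be saved to a file and its lines passed to count_rate_limited; calling most_common() on the result then lists the busiest source and port pairs first.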
