MetroCluster monitoring failed in Active IQ Unified Manager with Reason: An internal error has occurred during MetroCluster component refresh, due to hex values in the "Reason for Last Reboot" field
Applies to
- Active IQ Unified Manager 9.6+
- OnCommand Unified Manager 6.x/7.x/9.x
- MetroCluster
Issue
- In the Configuration menu, on the Cluster Data Sources tab, MetroCluster clusters show this message in the Description column: 'Monitoring failed...'
- An event is created with the title 'Event: Cluster Monitoring Failed' and the Trigger Condition: 'Monitoring failed for cluster <MetroCluster-CLUSTER>. Reason: An internal error has occurred during MetroCluster component refresh. Contact technical support.'
A collected support bundle, or the log 'ocum-error.log' viewed directly, contains the following entries:
"ERROR [oncommand] [collection-completion-0] [c.n.d.i.m.MccFabricConfigDiscoveryHandler] MetroCluster monitoring failed for <MetroCluster-CLUSTER>"
ERROR [oncommand] [reconcile-1] [affhosmc.chkd.net(incremental@13:58:23.999)] [c.n.dfm.collector.OcieJmsListener] Error during MetroCluster component monitoring : com.ctc.wstx.exc.WstxIOException: Invalid UTF-8 middle byte 0xb (at char #34185, byte #31999)
The corresponding entries in 'ocumserver.log':
ERROR [oncommand] [reconcile-2] [QSCPS040(incremental@00:22:34.950)] [c.n.dfm.collector.OcieJmsListener] Error during MetroCluster component monitoring : MetroCluster component monitoring failed. Zapi execution failed XML processing error during ZAPI call storage-bridge-get-iter to QSCPS040: Invalid UTF-8 middle byte 0xb (at char #34126, byte #31999)
com.netapp.dfm.ontap.outbound.zapi.OcumMonitoringFailedException: MetroCluster component monitoring failed. Zapi execution failed XML processing error during ZAPI call storage-bridge-get-iter to <metrocluster name>: Invalid UTF-8 middle byte 0xb (at char #34126, byte #31999)
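To check a large log for this signature, the entries above can be matched with a regular expression. The following is a minimal sketch only: the pattern is inferred solely from the sample entries shown here, so other releases may word the message differently, and the function name `find_utf8_failures` is hypothetical.

```python
import re

# Pattern inferred from the sample ocumserver.log entries in this article (an assumption).
PATTERN = re.compile(
    r"ZAPI call (?P<zapi>[\w-]+) to (?P<cluster>[^:]+): "
    r"Invalid UTF-8 middle byte (?P<byte>0x[0-9a-fA-F]+) "
    r"\(at char #(?P<char>\d+), byte #(?P<offset>\d+)\)"
)


def find_utf8_failures(lines):
    """Return one dict per log line that reports an invalid UTF-8 byte in a ZAPI response."""
    hits = []
    for line in lines:
        match = PATTERN.search(line)
        if match:
            hits.append({
                "zapi": match.group("zapi"),
                "cluster": match.group("cluster"),
                "byte": int(match.group("byte"), 16),    # offending byte value
                "char": int(match.group("char")),        # character position reported by the parser
                "offset": int(match.group("offset")),    # byte position reported by the parser
            })
    return hits
```

Passing the lines of 'ocumserver.log' to `find_utf8_failures` returns the ZAPI name and cluster for each matching entry, which helps confirm that `storage-bridge-get-iter` is the failing call.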
The hex values appear in AutoSupport data, or in command output, in the 'Reason for last reboot' field for the bridge in storage-bridge-view, for example:
Critical Hardware Error Detected0xED 0x
or
Critical Software Error Detected0xED 0xB
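The parser error in the logs is consistent with these values. In UTF-8, 0xED is the lead byte of a three-byte sequence, so the bytes that follow it must be continuation bytes in the range 0x80 to 0xBF, and 0x0B is not one. The sketch below reproduces that failure in Python. It assumes the field carries the raw bytes 0xED 0x0B after the message text, which this article does not state explicitly, and the function name `first_invalid_utf8` is hypothetical.

```python
def first_invalid_utf8(data: bytes):
    """Return (offset, reason) for the first undecodable sequence, or None if data is valid UTF-8."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as exc:
        # exc.start is the byte offset where the invalid sequence begins.
        return exc.start, exc.reason
    return None


# Assumption: the message text is followed by the raw bytes 0xED 0x0B.
sample = b"Critical Software Error Detected\xed\x0b"
result = first_invalid_utf8(sample)
if result is not None:
    offset, reason = result
    print(f"Invalid UTF-8 at byte offset {offset}: {reason}")
```

The reported offset points at the 0xED byte, immediately after the message text.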
To confirm, log in to Active IQ (AIQ) and look up the cluster by name, serial number, or system ID:
- Click AutoSupport on the left side of the page.
- Select the node from the drop-down list.
- Click a weekly AutoSupport.
- Open storage-bridge-view.xml.
- Review the 'Reason for last reboot' field to confirm.
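If storage-bridge-view.xml has been saved locally, the manual review above can be supplemented with a quick scan for lines that are not valid UTF-8 or that contain control bytes. This is a rough sketch under stated assumptions: the local file path and the function name `suspicious_lines` are hypothetical, and the scan does not depend on any particular XML element name.

```python
from pathlib import Path

# Tab, line feed and carriage return are the only control characters XML 1.0 allows.
ALLOWED_CONTROLS = {0x09, 0x0A, 0x0D}


def suspicious_lines(path):
    """Yield (line_number, raw_line) for lines with invalid UTF-8 or disallowed control bytes."""
    raw_lines = Path(path).read_bytes().split(b"\n")
    for number, raw in enumerate(raw_lines, start=1):
        try:
            raw.decode("utf-8")
            bad_encoding = False
        except UnicodeDecodeError:
            bad_encoding = True
        has_control = any(b < 0x20 and b not in ALLOWED_CONTROLS for b in raw)
        if bad_encoding or has_control:
            yield number, raw
```

For example, `for number, raw in suspicious_lines("storage-bridge-view.xml"): print(number, raw)` prints each flagged line as a bytes literal, so the offending values stay visible as escapes such as `\xed\x0b`.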