OnCommand Insight poll failure due to internal error for an ONTAP datasource
Applies to
Issue
- Performance data cannot be acquired because performance polls fail with one of the following messages on the data collector's landing page in OnCommand Insight (OCI) or OnCommand System Manager:
Unable to poll performance ... error = Performance Recent Status
Internal error:
com.onaro.sanscreen.acquisition.framework.datasource.DataSourceErrorException: General Error
Or
Data ONTAP API fail: System busy: 7 requests on table "perf_object_get_instances" have been pending for 1678674 seconds. The last completed call took 0 seconds.
- Upon reviewing the storageperformance sample logs for the affected ONTAP cluster's data collector (located within the Error Report in acq folder > storageperformance_datacollectorname > one of the timestamp folders > log_sample.log), error messages such as the following may be observed:
Example:
2021-03-12 17:19:33,895 ERROR [com.onaro.sanscreen.acquisition.datasource.netapp_ontap.NetAppOntapPerformancePackage] datalake collect and report (Poll Count: 1207, Is Macro Poll: false) : [storageperformance] data-collector-name: 1 apis failed: [storageperformance] data-collector-name: perf-object-get-instances(Object : workload) failed: Trying to perform arithmetic between two counters with different cardinality. Counter "read_io_type" has 1 elements, but the other counter "read_io_type" has 10 elements. (1 times)
2021-03-12 17:21:54,206 ERROR [com.onaro.sanscreen.acquisition.datasource.netapp_ontap.builder.ZapiIterBase] Aborting all performance api calls due to: perf-object-instance-list-info-iter(Object : lif) failed: System busy: 7 requests on table "perf_object_instance_list_info" have been pending for 2922550 seconds. The last completed call took 0 seconds.
2022-03-19 01:13:22,377 ERROR [com.onaro.sanscreen.acquisition.datasource.netapp_ontap.NetAppOntapPerformancePackage] datalake collect and report (Poll Count: 10124, Is Macro Poll: false) : [storageperformance] data-collector-name: 15 apis failed: [storageperformance] data-collector-name: perf-object-get-instances(Object : workload) failed: RPC: Remote system error [from mgwd on node "node_name" (VSID: -1) to cm at 127.0.0.1] (1 times)
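When log_sample.log is long, the three failure signatures in the example lines above can be picked out with a short script. The following is a minimal Python sketch, not an OCI tool: the function name classify_errors, the labels in SIGNATURES, and the assumption that every log line follows the "timestamp LEVEL [logger] message" layout seen in the example are all illustrative.

```python
import re

# Hypothetical labels mapped to substrings taken from the example log lines.
SIGNATURES = {
    "counter_cardinality": "different cardinality",
    "system_busy": "System busy:",
    "rpc_remote_error": "RPC: Remote system error",
}

# Assumed layout: "YYYY-MM-DD HH:MM:SS,mmm LEVEL [logger.class] message".
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) "
    r"(?P<level>[A-Z]+) +\[(?P<logger>[^\]]+)\] (?P<msg>.*)$"
)


def classify_errors(lines):
    """Return (timestamp, label) for each ERROR line matching a known signature."""
    hits = []
    for line in lines:
        m = LINE_RE.match(line.strip())
        if m is None or m.group("level") != "ERROR":
            continue  # skip non-matching lines and non-ERROR levels
        for label, needle in SIGNATURES.items():
            if needle in m.group("msg"):
                hits.append((m.group("ts"), label))
                break  # report only the first signature found on a line
    return hits


if __name__ == "__main__":
    # Path is relative to the timestamp folder inside the Error Report.
    with open("log_sample.log", encoding="utf-8", errors="replace") as fh:
        for ts, label in classify_errors(fh):
            print(ts, label)
```

Lines that match none of the three substrings are ignored, so an empty result does not rule out other acquisition errors in the log.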
- Furthermore, when attempting to run a statistics lif show command in the CLI (accessed through the node management LIF of any node in the cluster) against the cluster SVM as shown below, a similar error may also be observed.
Note: The error message should be identical to the highlighted portion obtained from the performance sample log in OCI, although the particular performance-object ZAPI call may be different between the two error messages:
cluster1::> statistics lif show -vserver cluster1
Error: command failed: System busy: 7 requests on table "perf_object_get_instances" have been pending for 1147109 seconds. The last completed call took 0 seconds.
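To compare the "System busy" text from the OCI sample log with the CLI output, it can help to extract the request count, table name, and pending time from each message. The following is a minimal Python sketch; the function name parse_system_busy and the returned dictionary keys are illustrative, and the pattern assumes the message wording is exactly as quoted in this article.

```python
import re
from datetime import timedelta

# Pattern built from the "System busy" wording quoted in this article.
BUSY_RE = re.compile(
    r'System busy: (?P<count>\d+) requests on table "(?P<table>[^"]+)" '
    r"have been pending for (?P<pending>\d+) seconds"
)


def parse_system_busy(text):
    """Extract request count, table name and pending time, or None if absent."""
    m = BUSY_RE.search(text)
    if m is None:
        return None
    return {
        "requests": int(m.group("count")),
        "table": m.group("table"),
        "pending": timedelta(seconds=int(m.group("pending"))),
    }


if __name__ == "__main__":
    cli_error = (
        'Error: command failed: System busy: 7 requests on table '
        '"perf_object_get_instances" have been pending for 1147109 seconds. '
        "The last completed call took 0 seconds."
    )
    info = parse_system_busy(cli_error)
    print(info["table"], info["requests"], info["pending"])
```

For the CLI example above, the pending time of 1147109 seconds works out to roughly 13 days, which makes it easier to see that the requests have been stuck far longer than a single poll interval.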