OnCommand Insight Data Warehouse (7.3.3, 7.3.4) Jobs show Skipped Build, or WebUI becomes unavailable due to Wildfly Scheduler issue
Applies to
OnCommand Insight Data Warehouse (DWH) 7.3.3 and 7.3.4
Issue
In the DWH portal, "Jobs" shows many entries for "Skipped build". You may also receive emails stating that a job failed with "Skipped build".
The DWH SANScreen Server service gradually takes longer to start.
If the issue is not addressed, the number of MySQL table entries grows into the millions. Eventually the underlying deployment files for the "SANScreen Server" service stay in an '.isdeploying' state for hours and then change to a '.failed' state, at which point the OCI or DWH web portal becomes unresponsive and displays an HTTP 404 Not Found error.
Examples of deployment file names in the '.isdeploying' and '.failed' states:
download.war.isdeploying
download-etl.ear.isdeploying
dwh-redirect.war.isdeploying
usermanager.war.isdeploying
--------------------------
download.war.failed
download-etl.ear.failed
dwh-redirect.war.failed
usermanager.war.failed
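The check for these file names can be scripted. The following is a minimal sketch, not a NetApp-provided tool: the function name find_deployment_markers is hypothetical, and because this article does not state where the deployment files live, the directory is passed in as a parameter.

```python
from pathlib import Path

# File name suffixes for the two states named in this article.
MARKER_SUFFIXES = (".isdeploying", ".failed")


def find_deployment_markers(deploy_dir):
    """Return {suffix: sorted file names} for state files found in deploy_dir.

    deploy_dir is an assumption supplied by the caller; this article does
    not give the location of the deployment files.
    """
    found = {suffix: [] for suffix in MARKER_SUFFIXES}
    for entry in Path(deploy_dir).iterdir():
        # Path.suffix is the last dotted component, e.g. ".failed"
        # for "download.war.failed".
        if entry.is_file() and entry.suffix in found:
            found[entry.suffix].append(entry.name)
    return {suffix: sorted(names) for suffix, names in found.items()}
```

Calling find_deployment_markers with the directory that holds the deployment files returns two lists; a non-empty ".failed" list, or ".isdeploying" entries that persist for hours, corresponds to the symptom described above.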
A .failed file can be opened with a text editor. For the issue described in this KB article, the .failed files contain the following text:
"WFLYCTL0063: Composite operation was rolled back"
Once the issue progresses to the point where the SANScreen Server service deployment files take hours to deploy, the <install path>\SANscreen\wildfly\standalone\log\Wildfly.log file begins to show memory-related errors:
2019-XX-XX 07:07:57,350 ERROR [DeploymentScanner-threads - 1] management-operation (OperationContextImpl.java:1216) - <DWH_Name>: Timeout after [5] seconds waiting for service container stability while finalizing an operation. Process must be restarted. Step that first updated the service container was 'deploy' at address '[("deployment" => "download.war")]'
2019-XX-XX 07:14:26,274 WARN [ServerService Thread Pool -- 84] ra (ActiveMQActivation.java:422) - java.lang.OutOfMemoryError: Java heap space
2019-XX-XX 07:28:40,249 ERROR [default I/O-11] listener (ChannelListeners.java:94) - XNIO001007: A channel event listener threw an exception:
java.lang.NoClassDefFoundError: Could not initialize class org.xnio.IoUtils
2019-XX-XX 07:28:22,937 WARN [Thread-75 (ActiveMQ-server-org.apache.activemq.artemis.core.server.impl.ActiveMQServerImpl$3@3ebed9d4)] OrderedExecutorFactory (OrderedExecutorFactory.java:127) - Java heap space: java.lang.OutOfMemoryError: Java heap space
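The text signatures quoted in this article (the rollback message in the .failed files, and the timeout and heap errors in Wildfly.log) can be counted with a short script. This is a sketch under stated assumptions: the function name count_signatures and the signature labels are hypothetical, and the patterns are taken only from the excerpts shown above, so they may not cover every variant of these messages.

```python
import re

# Patterns copied from the excerpts in this article; the labels are
# illustrative names, not product terminology.
SIGNATURES = {
    "rollback": re.compile(r"WFLYCTL0063: Composite operation was rolled back"),
    "deploy_timeout": re.compile(r"waiting for service container stability"),
    "heap_exhausted": re.compile(r"java\.lang\.OutOfMemoryError: Java heap space"),
}


def count_signatures(lines):
    """Count how many lines contain each signature (at most once per line)."""
    counts = {name: 0 for name in SIGNATURES}
    for line in lines:
        for name, pattern in SIGNATURES.items():
            if pattern.search(line):
                counts[name] += 1
    return counts
```

Because the function accepts any iterable of strings, an open file handle can be passed directly, for example `with open(log_path, errors="replace") as f: print(count_signatures(f))`, where log_path points at Wildfly.log or at a .failed file.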