CAIQUM-5746: High memory and CPU utilization in AIQUM due to AIQCASecure database lock issue
Issue
- CPU and memory use are increasing every day in AIQUM
- After a prolonged period of increasing memory usage, accessing AIQUM Web GUI fails with ERR_CONNECTION_REFUSED
- OpenJDK Platform Binary process has high memory usage in the case of Windows
- ConfigAdvisorCLI processes are stuck
- False ActiveIQ Events are triggered. For example: "Verify That Intercluster Peering Is Up"
- /var/log/ocie/recording/AIQCASecureData/collected_log/ConfigAdvisorAIDE/Logs/JobLogs/XXXXXXXXX.log shows error:
Collection from <ONTAP_IP>_<AIQUM>_202311251252.0.files Failed with error - database is locked'
Traceback (most recent call last):
File "peewee.py", line 3237, in execute_sql
sqlite3.OperationalError: database is locked
- This issue can cause an out-of-memory (OOM) condition, and in that case journalctl.txt contains logs like the following.
kernel: oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),cpuset=/,mems_allowed=0,global_oom,task_memcg=/system.slice/ocie.service,task=java,pid=3450748,uid=999
kernel: Out of memory: Killed process 3450748 (java) total-vm:7338364kB, anon-rss:3554460kB, file-rss:0kB, shmem-rss:0kB, UID:999 pgtables:8120kB oom_score_adj:0
systemd[1]: ocie.service: A process of this unit has been killed by the OOM killer.
systemd[1]: ocie.service: Main process exited, code=killed, status=9/KILL
ocie[3458400]: Stopping Active IQ Management Server service: ocie.
systemd[1]: ocie.service: Failed with result 'oom-kill'
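
The log signatures above can be checked for with a short script. The following is a minimal sketch, not a supported NetApp tool: the function name summarize_symptoms and the idea of passing it lines from a job log or from journalctl.txt are assumptions made for illustration, and the patterns are taken only from the excerpts shown in this article.

```python
import re
from typing import Dict, Iterable

# Patterns copied from the log excerpts in this article.
DB_LOCKED = re.compile(r"database is locked")
OOM_KILLED = re.compile(r"Out of memory: Killed process (\d+) \(([^)]+)\)")
OCIE_OOM_FAILED = re.compile(r"ocie\.service: Failed with result 'oom-kill'")


def summarize_symptoms(lines: Iterable[str]) -> Dict[str, object]:
    """Count the symptoms described above in an iterable of log lines.

    Hypothetical helper: the name and the return shape are illustrative only.
    """
    summary: Dict[str, object] = {
        "db_locked_lines": 0,      # lines containing "database is locked"
        "oom_kills": [],           # (pid, process name) of each OOM-killed process
        "ocie_oom_failures": 0,    # times ocie.service failed with 'oom-kill'
    }
    for line in lines:
        if DB_LOCKED.search(line):
            summary["db_locked_lines"] += 1
        match = OOM_KILLED.search(line)
        if match:
            summary["oom_kills"].append((int(match.group(1)), match.group(2)))
        if OCIE_OOM_FAILED.search(line):
            summary["ocie_oom_failures"] += 1
    return summary
```

To use it, open a job log or journalctl.txt from the support bundle (the exact location depends on your environment) and pass the file object to summarize_symptoms. A non-zero db_locked_lines count together with an OOM kill of the java process is consistent with the symptoms listed in this article, but it does not by itself confirm this issue.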