NetApp Knowledge Base

StorageGRID services state changed to unknown due to out of memory

Category:
storagegrid
Specialty:
sgrid

Applies to

  • StorageGRID
  • DDS service (Distributed Data Store)
  • LDR service (Local Distribution Router)
  • SSM service (Server Status Monitor)

Issue

  • The state of StorageGRID services such as DDS, LDR, and SSM on a Storage Node changes to Unknown and recovers after a few minutes.
  • servermanager.log shows that the Cassandra service ended and was restarted:
2021-01-23 12:34:38 +0000 | cassandra | cassandra ended
2021-01-23 12:34:54 +0000 | cassandra | starting cassandra
  • The base OS messages log shows that the Java process (the Cassandra service for StorageGRID) was killed for running out of memory and reaped by oom_reaper:
Jan 23 12:34:22 localhost kernel: [123456.123456] oom_reaper: reaped process 1234 (java), now anon-rss:10347420kB, file-rss:27560kB, shmem-rss:144kB
  • The StorageGRID node reboots due to out-of-memory errors, as recorded in daemon.log:
    Mar 22 13:39:37 localhost wdogd[1691]: OOMM: successfully forked OOM canary process
    Mar 22 13:39:38 localhost wdogd[1691]: OOMM: /usr/bin/storagegrid-oom-recover considering initializing swap file at Tue Mar 22 13:39:38 UTC 2022
    Mar 22 13:39:41 localhost wdogd[1691]: OOMM: Setting up swapspace version 1, size = 1024 MiB (1073737728 bytes)
    Mar 22 13:39:41 localhost wdogd[1691]: OOMM: no label, UUID=
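
The log signatures listed above can be checked for mechanically. The following is a minimal Python sketch, not a NetApp tool: the `classify` function and the pattern names are hypothetical, and the regular expressions are derived only from the excerpts shown in this article, so the exact log formats on a given release may differ.

```python
import re

# Patterns derived from the log excerpts in this article (assumption:
# the real formats may vary between StorageGRID and kernel versions).
OOM_REAPER = re.compile(
    r"oom_reaper: reaped process (?P<pid>\d+) \((?P<name>[^)]+)\), "
    r"now anon-rss:(?P<anon_kb>\d+)kB"
)
CASSANDRA_EVENT = re.compile(
    r"\| cassandra \| (?P<event>cassandra ended|starting cassandra)"
)
WDOG_OOM = re.compile(r"wdogd\[\d+\]: OOMM: (?P<msg>.*)")


def classify(line):
    """Return (kind, details) for a recognised log line, or None."""
    m = OOM_REAPER.search(line)
    if m:
        return ("oom_reaper", {
            "pid": int(m["pid"]),
            "name": m["name"],
            "anon_rss_kb": int(m["anon_kb"]),
        })
    m = CASSANDRA_EVENT.search(line)
    if m:
        return ("cassandra", {"event": m["event"]})
    m = WDOG_OOM.search(line)
    if m:
        return ("wdogd_oom", {"message": m["msg"]})
    return None


if __name__ == "__main__":
    sample = [
        "2021-01-23 12:34:38 +0000 | cassandra | cassandra ended",
        "Jan 23 12:34:22 localhost kernel: [123456.123456] oom_reaper: "
        "reaped process 1234 (java), now anon-rss:10347420kB, "
        "file-rss:27560kB, shmem-rss:144kB",
        "Mar 22 13:39:37 localhost wdogd[1691]: OOMM: successfully forked "
        "OOM canary process",
    ]
    for line in sample:
        print(classify(line))
```

To use it, pass each line of servermanager.log, the base OS messages log, and daemon.log through `classify` and collect the non-`None` results. An `oom_reaper` hit for a `java` process close in time to a `cassandra ended` event matches the pattern described in this article.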

 

