NetApp Knowledge Base

What could trigger an Event-Triggered ASUP for StorageGRID

Applies to

NetApp StorageGRID 11.1 and above

Answer

The following events will cause StorageGRID to dispatch an Event-Triggered ASUP:

  • Unexpected node down
  • Cassandra-related issues
  • Lost objects
  • Low storage space

Additional Information

StorageGRID first introduced event-triggered ASUPs in 11.1, for the following alarms:

| Alarm Code | Name | Service | Description |
|---|---|---|---|
| CAHP (Cstar Issue) | Java Heap Usage Percent | DDS | Triggered if Java cannot perform garbage collection at a rate that leaves enough heap space for the system to function properly. This alarm may indicate a user workload that exceeds the resources available across the system for the DDS key-value store. |
| CASA (Cstar Issue) | Data Store Status | DDS | Raised if the Cassandra data store becomes unavailable. This alarm may also indicate that the DDS service's distributed key-value store (Cassandra database) for a Storage Node requires rebuilding. |
| CDLP (Cstar Issue) | Data Load (percent) | DDS | Triggered for a Storage Node if the free space reserved on object store 0 for object metadata (Metadata Reserved Free Space (CAWM)) has reached capacity. The Cassandra database requires a certain amount of free storage space to perform essential operations such as compaction and repair. These Cassandra operations will be impacted if the metadata load continues to grow. |
| LOST (Lost Objects) | Lost Objects | CMS | Triggered when the StorageGRID Webscale system fails to retrieve a copy of the requested object from anywhere in the system. Lost objects represent a loss of data. |
| NTBR (Insufficient Storage) | Free TableSpace | NMS | Triggered if the rate at which database usage has been changing drops suddenly. |
| NTLR (Cstar Issue) | Repair Completion Status | DDS | If a nodetool repair task for Cassandra stalls, the normal background process of checking for and repairing potential database inconsistencies cannot complete and is retried every hour. |
| SAVP (Insufficient Storage) | Total Usable Space | LDR | If usable space reaches a low threshold, options include expanding the StorageGRID Webscale system or moving object data to archive through an Archive Node. |
| SSTS (Insufficient Storage) | Storage Status | BLDR | If the value of Storage Status is Insufficient Usable Space, there is no more available storage on the Storage Node and data ingests are redirected to other available Storage Nodes. |
| VMFI (Insufficient Storage) | Entries Available | SSM | Indicates that additional storage is required. Contact technical support. |
| VMFR (Insufficient Storage) | Space Available | SSM | If the value of Space Available gets too low (see alarm thresholds), investigate whether log files are growing out of proportion or objects are taking up too much disk space and need to be reduced or deleted. |

In StorageGRID 11.2, the NODE_DOWN-CRITICAL alarm was added.
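
When triaging alarms from a monitoring script, it can help to encode the list above as a lookup table. The following Python sketch is only an illustration and is not part of any StorageGRID API: the names `EVENT_TRIGGERED_ALARMS` and `asup_category` are hypothetical, and the category assigned to NODE_DOWN-CRITICAL is an assumption based on the "Unexpected node down" event listed in the Answer section.

```python
from typing import Optional, Tuple

# Alarm code -> (category, StorageGRID version that introduced it),
# transcribed from the alarm list in this article.
EVENT_TRIGGERED_ALARMS = {
    "CAHP": ("Cstar Issue", (11, 1)),
    "CASA": ("Cstar Issue", (11, 1)),
    "CDLP": ("Cstar Issue", (11, 1)),
    "NTLR": ("Cstar Issue", (11, 1)),
    "LOST": ("Lost Objects", (11, 1)),
    "NTBR": ("Insufficient Storage", (11, 1)),
    "SAVP": ("Insufficient Storage", (11, 1)),
    "SSTS": ("Insufficient Storage", (11, 1)),
    "VMFI": ("Insufficient Storage", (11, 1)),
    "VMFR": ("Insufficient Storage", (11, 1)),
    # Assumption: mapped to the "Unexpected node down" event.
    "NODE_DOWN-CRITICAL": ("Unexpected Node Down", (11, 2)),
}


def asup_category(alarm_code: str, grid_version: Tuple[int, int]) -> Optional[str]:
    """Return the category of an alarm that triggers an ASUP, or None.

    None is returned when the alarm is not in the list, or when the given
    StorageGRID version predates the alarm's introduction.
    """
    entry = EVENT_TRIGGERED_ALARMS.get(alarm_code.upper())
    if entry is None:
        return None
    category, introduced = entry
    return category if grid_version >= introduced else None
```

For example, `asup_category("LOST", (11, 1))` returns `"Lost Objects"`, while `asup_category("NODE_DOWN-CRITICAL", (11, 1))` returns `None` because that alarm was added in 11.2.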
