What could trigger an Event-Triggered ASUP for StorageGRID
Applies to
NetApp StorageGRID 11.1 and above
Answer
The following events will cause StorageGRID to dispatch an Event-Triggered ASUP:
- Unexpected node down
- Cassandra related issues
- Lost objects
- Low storage space
Additional Information
- What are the StorageGRID alerts that trigger AutoSupport automatic case creation
- StorageGRID first introduced event-triggered ASUPs in 11.1, for the following alarms:
Alarm Code | Name | Service | Description |
CAHP (Cstar Issue) | Java Heap Usage Percent | DDS | An alarm is triggered if Java is unable to perform garbage collection at a rate that allows enough heap space for the system to function properly. An alarm may indicate a user workload that exceeds the resources available across the system for the DDS key-value store. |
CASA (Cstar Issue) | Data Store Status | DDS | An alarm is raised if the Cassandra data store becomes unavailable. This alarm may also indicate that the DDS service’s distributed key-value store (Cassandra database) for a Storage Node requires rebuilding. |
CDLP (Cstar Issue) | Data Load (percent) | DDS | This alarm is triggered for a Storage Node if the free space reserved on object store 0 for object metadata (Metadata Reserved Free Space (CAWM)) has reached capacity. The Cassandra database requires a certain amount of free storage space to perform essential operations such as compaction and repair. These Cassandra operations will be impacted if the metadata load continues to grow. |
LOST (Lost Objects) | Lost Objects | CMS | Triggered when the StorageGRID Webscale system fails to retrieve a copy of the requested object from anywhere in the system. Lost objects represent a loss of data. |
NTBR (Insufficient Storage) | Free TableSpace | NMS | This alarm is triggered if there is a sudden drop in how fast database usage has been changing. |
NTLR (Cstar Issue) | Repair Completion Status | DDS | If a nodetool repair task for Cassandra stalls, the normal background process of checking for and repairing potential database inconsistencies cannot complete and is retried every hour. |
SAVP (Insufficient Storage) | Total Usable Space | LDR | If usable space reaches a low threshold, options include expanding the StorageGRID Webscale system or moving object data to archive through an Archive Node. |
SSTS (Insufficient Storage) | Storage Status | BLDR | If the value of Storage Status is Insufficient Usable Space, there is no more available storage on the Storage Node and data ingests are redirected to other available Storage Nodes. |
VMFI (Insufficient Storage) | Entries Available | SSM | This is an indication that additional storage is required. Contact technical support. |
VMFR (Insufficient Storage) | Space Available | SSM | If the value of Space Available gets too low (see alarm thresholds), investigate whether log files are growing out of proportion or whether objects are taking up too much disk space and need to be reduced or deleted. |
In StorageGRID 11.2, the NODE_DOWN-CRITICAL alarm was added.
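For quick reference, the alarm codes listed above can be captured in a small lookup, for example when filtering exported alarm data. This is only a minimal sketch: the names ASUP_TRIGGER_ALARMS and trigger_category are hypothetical and are not part of any StorageGRID API or tool, and pairing NODE_DOWN-CRITICAL with the "Unexpected node down" event is an assumption based on the event list in the Answer section.

```python
# Illustrative lookup built from the alarm table in this article.
# ASUP_TRIGGER_ALARMS and trigger_category are hypothetical names,
# not part of any StorageGRID API.
ASUP_TRIGGER_ALARMS = {
    "CAHP": "Cstar Issue",
    "CASA": "Cstar Issue",
    "CDLP": "Cstar Issue",
    "LOST": "Lost Objects",
    "NTBR": "Insufficient Storage",
    "NTLR": "Cstar Issue",
    "SAVP": "Insufficient Storage",
    "SSTS": "Insufficient Storage",
    "VMFI": "Insufficient Storage",
    "VMFR": "Insufficient Storage",
    # Added in 11.2; the category label here is an assumption.
    "NODE_DOWN-CRITICAL": "Unexpected node down",
}


def trigger_category(alarm_code):
    """Return the trigger category for an alarm code, or None if it is not listed."""
    return ASUP_TRIGGER_ALARMS.get(alarm_code.strip().upper())
```

A code that is not in the table, such as an arbitrary four-letter string, returns None, meaning this article does not list it as an Event-Triggered ASUP trigger.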