Why the StorageGRID Unable to communicate with node alert did not trigger
Applies to
NetApp StorageGRID
Answer
The Unable to communicate with node alert evaluates the following two conditions in Prometheus:
- (count(up != 1) without (job) > 0): Checks if a node is unreachable or offline
- unless on(instance) storagegrid_administratively_down == 1: Checks if the node is administratively down
The administratively down attribute indicates whether the node being down is expected. The attribute is set in the following cases:
- Placed into maintenance mode
- Being decommissioned
- Undergoing a clone operation
- Powered off from the BMC or physically
- Shut down from the CLI
If the administratively down attribute is true, the alert will not trigger.
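The interaction between the two conditions can be illustrated with a small Python model. This is only a sketch of the set logic, not StorageGRID or Prometheus code: the function names, node names, and sample values are hypothetical, and the model assumes each series carries only the instance and job labels.

```python
from typing import Dict, Set, Tuple

# Hypothetical sample of the `up` metric, keyed by (instance, job).
UpSamples = Dict[Tuple[str, str], int]


def unreachable_instances(up: UpSamples) -> Set[str]:
    """Model of `count(up != 1) without (job) > 0`.

    Keeps series whose value is not 1, counts them per instance
    (the job label is dropped), and returns instances with a count > 0.
    """
    counts: Dict[str, int] = {}
    for (instance, _job), value in up.items():
        if value != 1:
            counts[instance] = counts.get(instance, 0) + 1
    return {instance for instance, n in counts.items() if n > 0}


def alerting_instances(up: UpSamples, admin_down: Dict[str, int]) -> Set[str]:
    """Model of `... unless on(instance) storagegrid_administratively_down == 1`.

    Removes every unreachable instance that is flagged as
    administratively down, because that downtime is expected.
    """
    expected_down = {inst for inst, value in admin_down.items() if value == 1}
    return unreachable_instances(up) - expected_down


# Hypothetical data: node-b failed unexpectedly, node-c is in maintenance mode.
sample_up = {
    ("node-a", "node"): 1,
    ("node-b", "node"): 0,
    ("node-c", "node"): 0,
}
sample_admin_down = {"node-b": 0, "node-c": 1}

print(sorted(alerting_instances(sample_up, sample_admin_down)))  # prints ['node-b']
```

In this model node-c is unreachable but produces no alert because its administratively down value is 1, which matches the behavior described above.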
