StorageGRID: How to troubleshoot "Node Down" alert
Applies to
NetApp StorageGRID
Description
This article provides generic troubleshooting steps for a node down event in StorageGRID.
One or more of the following errors may be reported:
- NDDOWN:StorageGRID Notification from XXXXXX(NODE_DOWN-CRITICAL) ERROR
- One or more StorageGRID nodes appear next to a blue icon (Disconnected – Unknown) in the Grid Manager
- Unable to communicate with node alert
For all StorageGRID releases, the blue icon indicates that a node is disconnected for an unknown reason.
In the example, the Storage Node named DC1-S3 has a blue icon. The Connection State on the Node Information panel is Unknown, and the Unable to communicate with node alert is active.
Possible causes for a node down event:
- Maintenance operations such as a scheduled reboot or a software upgrade
- Network issue
- Resource exhaustion (CPU, memory, or disk) caused by a high workload (a very busy grid)
- Hardware issue
- Incorrect management cabling after maintenance
- VMware console not responding
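As a first step in separating a network issue from the other causes listed above, a basic reachability check from a management workstation can help. The following is a minimal sketch of a generic TCP probe, not a StorageGRID tool or API. The function name `tcp_reachable`, the address 192.0.2.10, and the choice of port 22 (SSH) are illustrative assumptions; substitute the node's actual Grid Network IP address and a port that is open in your environment.

```python
import socket


def tcp_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout seconds."""
    try:
        # create_connection resolves the host and attempts a TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, unreachable hosts, and DNS failures.
        return False


if __name__ == "__main__":
    # Placeholder values (assumptions): replace with the affected node's IP and a known-open port.
    node_ip = "192.0.2.10"
    port = 22

    if tcp_reachable(node_ip, port):
        print(f"{node_ip}:{port} is reachable; the node is up at the network level.")
    else:
        print(f"{node_ip}:{port} is not reachable; check network, cabling, and node power state.")
```

A successful connection suggests the node is powered on and reachable, which points away from a network or cabling cause and toward resource exhaustion or a service-level problem. A failed connection does not identify the cause on its own; it only narrows the investigation to network, hardware, cabling, or hypervisor issues.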