Node goes down because its root volume is full of Snapshot copies
Applies to
ONTAP 9
Issue
- Node is taken over by its partner and cluster rings are offline.
cluster1::> storage failover show
                              Takeover
Node           Partner        Possible State Description
-------------- -------------- -------- -------------------------------------
node-01        node-02        true     Connected to node-02. Waiting
                                       for cluster applications to come
                                       online on the local node. Offline
                                       applications: mgmt, vldb, vifmgr,
                                       bcomd, crs.
node-02        node-01        true     Connected to node-01, Partial
                                       giveback
2 entries were displayed.
cluster1::*> cluster ring show
Node      UnitName Epoch    DB Epoch DB Trnxs Master    Online
--------- -------- -------- -------- -------- --------- ---------
node-01   mgmt     -        -        -        -         -
node-02   mgmt     26       26       1983     node-02   master
node-02   vldb     24       24       4379     node-02   master
node-02   vifmgr   15       15       23       node-02   master
node-02   bcomd    15       15       1        node-02   master
node-02   crs      15       15       1        node-02   master
6 entries were displayed.
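The check above can also be scripted against saved CLI output. The following is a minimal sketch, assuming the output of cluster ring show has been captured as plain text; the function name nodes_with_offline_rings and the SAMPLE_RING_OUTPUT variable are hypothetical and are not part of ONTAP.

```python
# Hypothetical helper: scan captured "cluster ring show" text for rings
# that report "-" in the Online column, as node-01 does in this article.
SAMPLE_RING_OUTPUT = """\
Node      UnitName Epoch    DB Epoch DB Trnxs Master    Online
--------- -------- -------- -------- -------- --------- ---------
node-01   mgmt     -        -        -        -         -
node-02   mgmt     26       26       1983     node-02   master
node-02   vldb     24       24       4379     node-02   master
6 entries were displayed.
"""


def nodes_with_offline_rings(ring_output: str) -> dict[str, list[str]]:
    """Map each node to the ring units whose Online column is '-'."""
    offline: dict[str, list[str]] = {}
    for line in ring_output.splitlines():
        fields = line.split()
        # Data rows have exactly 7 whitespace-separated fields; the header
        # splits into more, and the separator row consists only of dashes.
        if len(fields) != 7 or set(fields[0]) == {"-"}:
            continue
        node, unit, online = fields[0], fields[1], fields[6]
        if online == "-":
            offline.setdefault(node, []).append(unit)
    return offline


if __name__ == "__main__":
    print(nodes_with_offline_rings(SAMPLE_RING_OUTPUT))
```

The sketch only reads text that has already been collected. It does not connect to a cluster or run any ONTAP command.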
- The following error is displayed when logging in to the node:
CRITICAL: This node is not healthy because the root volume is low on space
(<10MB). The node can still serve data, but it cannot participate in cluster
operations until this situation is rectified. Free space using the nodeshell or
contact technical support for assistance.
Internal error: Cannot open corrupt replicated database. Automatic recovery
attempt has failed or is disabled. Check the event logs for details. This node
is not fully operational. Contact support personnel for the root volume recovery
procedures.
- The root volume (vol0) of the affected node is full, and its state and size are not reported:
cluster1::> volume show -volume vol0
Vserver   Volume       Aggregate     State      Type Size       Available  Used%
--------- ------------ ------------- ---------- ---- ---------- ---------- -----
node-01   vol0         aggr_n01_root -          RW   -          -          -
node-02   vol0         aggr_n02_root online     RW   54.13GB    31.55GB    38%
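As an illustration of how this symptom could be spotted in saved output, here is a minimal sketch, assuming the volume show -volume vol0 output has been captured as plain text. The function name root_volumes_needing_attention, the SAMPLE_VOLUME_OUTPUT variable, and the 90 percent threshold are assumptions made for the example, not ONTAP defaults.

```python
# Hypothetical helper: flag root volumes whose state is not "online"
# or whose Used% is at or above an assumed threshold.
SAMPLE_VOLUME_OUTPUT = """\
Vserver   Volume       Aggregate     State      Type Size       Available  Used%
--------- ------------ ------------- ---------- ---- ---------- ---------- -----
node-01   vol0         aggr_n01_root -          RW   -          -          -
node-02   vol0         aggr_n02_root online     RW   54.13GB    31.55GB    38%
"""


def root_volumes_needing_attention(volume_output: str, used_threshold: int = 90):
    """Return (vserver, volume, reason) tuples for rows that look unhealthy."""
    flagged = []
    for line in volume_output.splitlines():
        fields = line.split()
        # Skip anything that is not an 8-column data row (header, separator).
        if len(fields) != 8 or fields[0] == "Vserver" or set(fields[0]) == {"-"}:
            continue
        vserver, volume, state, used = fields[0], fields[1], fields[3], fields[7]
        if state != "online" or used == "-":
            flagged.append((vserver, volume, "state or size not reported"))
        elif int(used.rstrip("%")) >= used_threshold:
            flagged.append((vserver, volume, f"used {used}"))
    return flagged


if __name__ == "__main__":
    for row in root_volumes_needing_attention(SAMPLE_VOLUME_OUTPUT):
        print(row)
```

With the sample above, only the node-01 row is flagged, because its state and size columns contain dashes. The node-02 row at 38% stays below the assumed threshold.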