Remove a node from the cluster with RDB offline
Applies to
Issue
- Node removal fails, leaving the RDB rings offline on the node in question:
::*> cluster remove-node -node <node-05>
Warning: This command will remove node "<node-05>" from the cluster. You must remove the failover partner as well. After the node is removed, erase its configuration and initialize all disks by using the "Clean configuration and initialize all disks (4)" option from the boot menu.
Do you want to continue? {y|n}: y
[Job 41674] Job is queued: Cluster remove-node of Node:<node-05> with UUID:1b6d50ea-bc66-11e6-9fb1-39cd3243c3bf.
Error: command failed: [Job 41674] Job failed:
Node "<node-05>" has 1 volumes. Please either move or delete them from the node before removing the node. The volumes are: <vol01>
::*> cluster show
Node                 Health  Eligibility  Epsilon
-------------------- ------- ------------ ------------
node-05              false   false        false
node-06              true    true         false
node-07              true    true         false
3 entries were displayed.
::*> cluster ring show
Node      UnitName Epoch    DB Epoch DB Trnxs Master    Online
--------- -------- -------- -------- -------- --------- ---------
node-05   mgmt     0        84       146      -         offline
node-05   vldb     0        58       262397   -         offline
node-05   vifmgr   0        72       111      -         offline
node-05   bcomd    0        52       111      -         offline
node-05   crs      0        21       1        -         offline
node-06   mgmt     85       85       123594   node-06   master
node-06   vldb     59       59       22306    node-06   master
node-06   vifmgr   73       73       455908   node-06   master
node-06   bcomd    53       53       4        node-06   master
node-06   crs      22       22       1        node-06   master
node-07   mgmt     85       85       123594   node-06   secondary
node-07   vldb     59       59       22306    node-06   secondary
node-07   vifmgr   73       73       455908   node-06   secondary
node-07   bcomd    53       53       4        node-06   secondary
- The volume delete job also remains stuck for a long time:
::*> job show -id 41674 -instance
Job ID: 41674
Owning Vserver: SVM1
Name: Vol Delete
Description: Delete vol01
Priority: High
Node: NODE-05
Affinity: Cluster
Schedule: @now
Queue Time: 07/22 15:09:56
Start Time: -
End Time: -
Drop-dead Time: -
Restarted?: false
State: Unknown
Status Code: 1
Completion String:
Job Type: VOL_DELETE
Job Category: VOPL
UUID: 1c90427c-eaee-11eb-ac6d-00a098dac15e
Execution Progress: -
User Name:
Process: mgwd
Restart Is or Was Delayed?: false
Restart Is Delayed by Module: -
- Deleting the Vserver fails with an error:
::> vserver delete -vserver SVM1
Warning: Unable to list entries on node node-05. RPC: Couldn't make connection [from
mgwd on node "master-node" (VSID: -1) to mgwd at 169.xxx.xxx.xxx]
Warning: When Vserver "SVM1" is deleted,
the following objects are automatically removed as well:
Routes: 1
Do you want to continue? {y|n}: y
[Job 35808] Job is queued: Delete Vserver SVM1.
Warning: Unable to list entries on node node-05. RPC: Couldn't make connection [from
mgwd on node "node-01" (VSID: -1) to mgwd at 169.xxx.xxx.xxx]
[Job 35808]
Error: command failed: [Job 35808] Job failed:
Failed to check file or LUN copy operation status. Reason: "RPC: Couldn't make
connection [from mgwd on node "node-02" (VSID: -3) to mgwd at 169.xxx.xxx.xxx]".
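
As an illustration only, the cluster ring show output above could be scanned for offline RDB units with a short script. The following Python sketch assumes the command output has already been captured as text and that each data row has seven whitespace-separated columns, as in the example above. The function name offline_ring_units and the SAMPLE text are hypothetical and are not part of ONTAP.

```python
def offline_ring_units(ring_output):
    """Return {node: [unit, ...]} for rows whose Online column is 'offline'.

    Assumption: data rows have exactly seven whitespace-separated columns
    (node, unit name, epoch, DB epoch, DB trnxs, master, online state).
    Header, separator, and prompt lines do not match and are skipped.
    """
    offline = {}
    for line in ring_output.splitlines():
        fields = line.split()
        if len(fields) != 7 or fields[-1] != "offline":
            continue
        offline.setdefault(fields[0], []).append(fields[1])
    return offline


# Hypothetical sample, shortened from the ring output shown in this article.
SAMPLE = """\
node UnitName Epoch DB Epoch DB Trnxs Master Online
--------- -------- -------- -------- -------- --------- ---------
node-05 mgmt 0 84 146 - offline
node-05 vldb 0 58 262397 - offline
node-06 mgmt 85 85 123594 node-06 master
node-07 mgmt 85 85 123594 node-06 secondary
"""

if __name__ == "__main__":
    for node, units in offline_ring_units(SAMPLE).items():
        print(f"{node}: offline units = {', '.join(units)}")
```

A node that appears in the result with every unit listed (mgmt, vldb, vifmgr, bcomd, crs) matches the symptom described in this article.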