Migration of a node's root volume/aggregate onto new disks fails due to internal error
Applies to
Issue
- Migration of a node's root volume/aggregate onto new disks fails due to an internal error
Example 1:
Wed Jan 05 10:46:28 +0800 [Node_name: mgwd: migrate.root.failed:error]: Root aggregate migration failed on node Node_name. Reason: Internal error. Failed to offline the volume "vol0". Reason: ..
Wed Jan 05 10:46:28 +0800 [Node_name: mgwd: mgmtgwd.jobmgr.jobcomplete.failure:info]: Job "Migrate root aggregate" [id 4315] (Root aggregate migration job for node Node_name) completed unsuccessfully: Internal error. Failed to offline the volume "vol0". Reason: . (1).
Example 2:
Execution Progress: Complete: Internal error. Failed to verify the new root aggregate status.
Example 3:
Execution Progress: Complete: Internal error. Failed to copy contents from old root to new root volume.
Example 4:
8/19/2024 10:54:37 Cluster-01 INFORMATIONAL mgmtgwd.jobmgr.jobcomplete.failure: Job "Migrate root aggregate" [id 589591] (Root aggregate migration job for node Node-01) completed unsuccessfully: Internal error. Failed to destroy the volume "vol0". Reason: . (1).
8/19/2024 10:54:37 Cluster-01 ERROR migrate.root.failed: Root aggregate migration failed on node Node-01. Reason: Internal error. Failed to destroy the volume "vol0". Reason: ..
Example 5:
Execution Progress: Complete: Internal error. Failed to rename the new root aggregate. Reason: . [1]
Example 6:
Execution Progress: Complete: Timeout: Operation "copy_root_volume_contents_iterator::create_imp()" took longer than 600 seconds to complete
Example 7:
The node's root volume is missing from the clustershell but visible in the nodeshell.
- The new root aggregate is created successfully and the node is healthy, but the root migration job cannot be resumed
- Attempting to resume the migration job fails with the following error:
Internal error. Failed to verify the new root aggregate status.
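The resume failure above can be reproduced from the clustershell roughly as sketched below. The job commands are standard ONTAP cluster commands, but the cluster name is a placeholder and the job id (4315) is illustrative, taken from Example 1; this only demonstrates the symptom, not a fix.

cluster1::> job show -name "Migrate root aggregate"
cluster1::> job resume -id 4315
Internal error. Failed to verify the new root aggregate status.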
