I/O Interruptions during manual LIF migration and aggregate relocation
Applies to
- AFF A900
- ONTAP 9.12.1P14
- Environments using NFS with VMware workloads
- Manual LIF migration
- Manual aggregate relocation (ARL)
Issue
- During planned manual LIF migration and ARL operations in an ONTAP cluster, the following symptoms were observed:
- Brief VM I/O interruptions and VM hangs
- Increased CPU load and high-availability events reported by affected VMs
Audit log
Tue Dec 16 2025 20:08:23 [kern_audit:info:4314] 8003e80000000142:8003e8000000040c :: cluster1:console :: console :: cluster1:admin :: network interface migrate -vserver svm1 -lif svm1_lif2 -destination-node node-01 -destination-port a1a :: Pending
Tue Dec 16 2025 20:08:24 [kern_audit:info:4314] 8003e80000000142:8003e8000000040c :: cluster1:console :: console :: cluster1:admin :: network interface migrate -vserver svm1 -lif svm1_lif2 -destination-node node-01 -destination-port a1a :: Success
Tue Dec 16 2025 20:11:15 [kern_audit:info:4314] 8003e80000000142:8003e80000000521 :: cluster1:console :: console :: cluster1:admin :: storage aggregate relocation start -node node-02 -destination node-01 -aggregate-list aggr2 -relocate-to-higher-version true :: Pending
Tue Dec 16 2025 20:11:15 [kern_audit:info:4314] 8003e80000000142:8003e80000000521 :: cluster1:console :: console :: cluster1:admin :: Question: Warning: Destination is running a... :: Pending
Tue Dec 16 2025 20:11:17 [kern_audit:info:4314] 8003e80000000142:8003e80000000521 :: cluster1:console :: console :: cluster1:admin :: Question: Warning: Destination is running a... : y :: Success
Tue Dec 16 2025 20:11:17 [kern_audit:info:4314] 8003e80000000142:8003e80000000521 :: cluster1:console :: console :: cluster1:admin :: storage aggregate relocation start -node node-02 -destination node-01 -aggregate-list aggr2 -relocate-to-higher-version true :: Success
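The command durations can be verified from the Pending/Success timestamp pairs in the audit log. A minimal sketch, assuming the timestamp format shown in the excerpt above ("Tue Dec 16 2025 20:08:23"):

```python
from datetime import datetime

def audit_timestamp(line):
    # The first five whitespace-separated tokens form the timestamp,
    # e.g. "Tue Dec 16 2025 20:08:23" (format assumed from the example).
    stamp = " ".join(line.split()[:5])
    return datetime.strptime(stamp, "%a %b %d %Y %H:%M:%S")

pending = audit_timestamp("Tue Dec 16 2025 20:08:23 [kern_audit:info:4314] :: Pending")
success = audit_timestamp("Tue Dec 16 2025 20:08:24 [kern_audit:info:4314] :: Success")
print((success - pending).total_seconds())  # 1.0
```

The one-second gap between the Pending and Success records matches the LIF migration duration noted below.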
- LIF migration finished within 1 second
Example:
Tue Dec 16 20:08:24 [node-02: vifmgr: vifmgr.lifBeingRemoved:notice]: LIF svm1_lif2 (on virtual server 3), IP address xxx.xxx.xxx.76, is being removed from node node-02, port a2a.
Tue Dec 16 20:08:24 [node-02: vifmgr: vifmgr.lifmoved.byadmin:notice]: LIF svm1_lif2 (on virtual server 3), IP address xxx.xxx.xxx.76, is being moved to node node-01, port a1a.
Tue Dec 16 20:08:24 [node-01: vifmgr: vifmgr.lifsuccessfullymoved:notice]: LIF svm1_lif2 (on virtual server 3), IP address xxx.xxx.xxx.76, is now hosted on node node-01, port a1a.
- NFS server lock recovery completed without any subsequent errors
Example:
Tue Dec 16 20:08:29 [node-01: nblade2: Nblade.recoveryBegin:notice]: NFS server lock recovery has begun for Vserver "svm1", LIF ID "1027", LIF IP address "xxx.xxx.xxx.76".
- NFS server grace period lasted from 20:08:29 to 20:09:14 (45 seconds)
Example:
Tue Dec 16 20:08:29 [node-01: nblade2: Nblade.graceBegin:notice]: NFS server grace state has begun for Vserver "svm1", LIF ID "1027", LIF IP address "xxx.xxx.xxx.76".
Tue Dec 16 20:09:14 [node-01: nblade2: Nblade.graceEnd:notice]: NFS server grace state has ended for Vserver "svm1", LIF ID "1027", LIF IP address "xxx.xxx.xxx.76".
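The 45-second grace window can be confirmed from the graceBegin/graceEnd timestamps. A minimal sketch, assuming the EMS timestamp layout shown above (day-of-week, month, day, time, no year):

```python
from datetime import datetime

def ems_time(line):
    # Tokens 1-3 are "Dec 16 20:08:29"; the year is absent in the EMS
    # excerpt, so strptime defaults it (fine for computing a delta
    # within the same day).
    stamp = " ".join(line.split()[1:4])
    return datetime.strptime(stamp, "%b %d %H:%M:%S")

begin = ems_time("Tue Dec 16 20:08:29 [node-01: nblade2: Nblade.graceBegin:notice]: grace begun")
end = ems_time("Tue Dec 16 20:09:14 [node-01: nblade2: Nblade.graceEnd:notice]: grace ended")
print((end - begin).total_seconds())  # 45.0
```

During this window the NFS server accepts lock reclaims but may delay new state-creating operations, which is consistent with the brief VM I/O pauses described in the Issue section.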
- ARL finished within 7 seconds
Example:
Tue Dec 16 20:11:17 [node-02: sfo_arl_worker: arl.aggrStart:notice]: Starting relocation of aggregate aggr2 at time 29490708598.
Tue Dec 16 20:11:22 [node-02: sfo_arl_worker: volmigrate.migrating:info]: Migrating volume aggr2 to node-01 (ID xxxxxxxxxx).
Tue Dec 16 20:11:22 [node-02: config_thread: raid.aggregate.relocate:info]: Aggregate aggr2: Prepared for 'relocation' operation.
Tue Dec 16 20:11:23 [node-02: sfo_arl_worker: volmigrate.result:info]: Migration of volume aggr2 to node-01 was successful.
Tue Dec 16 20:11:24 [node-02: sfo_arl_worker: arl.OpFinished:notice]: Aggregate relocation operation from the source node node-02 to the destination node node-01 finished in 7243 milliseconds. 'override-vetoes' set to false, and 'override-destination-checks' set to false.
Tue Dec 16 20:11:23 [node-01: wafl_spcd_main: monitor.volumes.one.ok:debug]: Aggregate aggr2 is OK.
Tue Dec 16 20:11:23 [node-01: config_thread: raid.config.online.req.ok:notice]: Aggregate aggr2 is mounted. Time taken to mount the aggregate was 615 milliseconds.
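The arl.OpFinished event reports the total relocation time directly, so the 7-second figure can be extracted without timestamp arithmetic. A minimal sketch over the message text shown above:

```python
import re

msg = ("Aggregate relocation operation from the source node node-02 to the "
       "destination node node-01 finished in 7243 milliseconds.")

# Pull the millisecond duration out of the arl.OpFinished message.
match = re.search(r"finished in (\d+) milliseconds", msg)
duration_ms = int(match.group(1))
print(duration_ms / 1000)  # 7.243
```

Rounded up, this is the "within 7 seconds" figure cited above; the separate raid.config.online.req.ok event shows the aggregate mount itself took only 615 ms of that total.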
