Unable to Delete Kubernetes Volume Snapshots on ONTAP – “Busy/Locked” Errors with Trident CSI
Applies to
- Kubernetes clusters with Trident-managed PVs/PVCs
- Third-party backup tools (e.g., Commvault) integrated with Kubernetes
Issue
When attempting to delete Kubernetes volume snapshots managed by Trident on a NetApp ONTAP backend, the operation fails with ONTAP returning “busy” or “locked” errors. This results in:
- Stale/orphaned snapshots that cannot be deleted (even manually).
- Backup jobs (e.g., Commvault) failing or stalling because old snapshots cannot be cleaned up; the retained snapshots consume space and prevent new backups from triggering.
- Kubernetes PVC provisioning and other volume operations experiencing delays or timeouts.
- Trident logs showing repeated snapshot delete failures and ONTAP error codes for busy/locked objects.
Example log output:
  Failed to delete snapshot: ONTAP reports snapshot is busy/locked due to dependent clone(s)
  MountVolume.MountDevice failed for volume "pvc-xxxx": rpc error: code = DeadlineExceeded desc = context deadline exceeded
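To confirm that a cluster is showing these symptoms, saved Trident and kubelet log output can be scanned for the two messages in the example log output. The following is a minimal sketch, not a supported tool: the regular expressions are assumptions derived only from the example messages in this article, and the function names (classify_line, summarize) are hypothetical. The exact wording in your logs may differ, so adjust the patterns as needed.

```python
import re
import sys
from collections import Counter

# Assumption: these patterns mirror the example messages in this article.
# Real Trident/kubelet log wording may differ between versions.
PATTERNS = {
    "snapshot_busy": re.compile(r"snapshot.*\b(busy|locked)\b", re.IGNORECASE),
    "mount_timeout": re.compile(r"MountVolume\.MountDevice failed.*DeadlineExceeded"),
}


def classify_line(line):
    """Return the symptom label for the first pattern a log line matches, or None."""
    for label, pattern in PATTERNS.items():
        if pattern.search(line):
            return label
    return None


def summarize(lines):
    """Count how many log lines match each symptom label."""
    counts = Counter()
    for line in lines:
        label = classify_line(line)
        if label is not None:
            counts[label] += 1
    return dict(counts)


if __name__ == "__main__":
    # Read log text from standard input and print the per-symptom counts.
    print(summarize(sys.stdin))
```

Saved as a script, this can read collected log output on standard input. A high snapshot_busy count alongside mount_timeout entries matches the behavior described in this article.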
