System Performance Degradation After Mass LUN Deletion
Applies to
- NetApp AFF-A300
- ONTAP 9.13.1P9 (Cluster Mode)
- iSCSI protocol environments
- Systems experiencing mass LUN deletions and high aggregate utilization
Issue
- A large number of LUNs were deleted (~106TB, ~40% of the aggregate), causing:
  - High CPU utilization (background delete workload spiked to 30%)
  - WAFL_CP (Consistency Point) workload elevated to ~50%
  - Severe latency for client workloads (WAFLSuspOther latency in the hundreds of seconds)
  - Timeouts and failures when accessing volumes for many instances
- Example EMS log:
Mon Dec 01 1:00:00+0000 [Node01:VdomAsyncTh_03:LUN.destroy:notice]: LUN /vol/vol_01/volume-d7s8d9s0-d8s7-7744-7283-875b7b6b9b5b destroyed (UUID:d7s8d9s0-d8s7-7744-7283-875b7b6b9b5b).
- Business impact: No read or write access to volumes for affected clients.
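To gauge how many LUNs were deleted and on which node, LUN.destroy messages can be counted in an exported EMS log. The following is a minimal sketch, not a NetApp tool: the regular expression is derived only from the single example EMS message in the Issue section, and the function names (parse_lun_destroy, count_destroys_by_node) are hypothetical. Other ONTAP releases or export formats may need a different pattern.

```python
import re
from collections import Counter

# Assumption: this pattern is modelled on the one example EMS line in this
# article; it is not an official description of the EMS message format.
LUN_DESTROY_RE = re.compile(
    r"\[(?P<node>[^:\]]+):[^:\]]+:LUN\.destroy:notice\]:\s+"
    r"LUN (?P<path>\S+) destroyed \(UUID:(?P<uuid>[^)]+)\)"
)


def parse_lun_destroy(line):
    """Return node, path, and uuid for a LUN.destroy line, else None."""
    match = LUN_DESTROY_RE.search(line)
    return match.groupdict() if match else None


def count_destroys_by_node(lines):
    """Count LUN.destroy events per node across an iterable of log lines."""
    counts = Counter()
    for line in lines:
        event = parse_lun_destroy(line)
        if event:
            counts[event["node"]] += 1
    return counts


# The example message from this article, used as sample input.
SAMPLE = (
    "Mon Dec 01 1:00:00+0000 [Node01:VdomAsyncTh_03:LUN.destroy:notice]: "
    "LUN /vol/vol_01/volume-d7s8d9s0-d8s7-7744-7283-875b7b6b9b5b destroyed "
    "(UUID:d7s8d9s0-d8s7-7744-7283-875b7b6b9b5b)."
)

if __name__ == "__main__":
    print(parse_lun_destroy(SAMPLE))
    print(count_destroys_by_node([SAMPLE]))
```

To run this against a saved log, pass an open file handle to count_destroys_by_node. A high per-node count within a short time window is consistent with the mass-deletion scenario described here.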
