Kubernetes Pods Stuck in Terminating/BackOff State When NetApp Storage Is Unavailable
Applies to
- NetApp ONTAP (All Flash FAS, AFF-C800, and similar)
- Trident CSI Driver for Kubernetes
- Kubernetes clusters using NetApp persistent volumes
Issue
Kubernetes pods that rely on NetApp persistent storage remain stuck in abnormal states (for example, Terminating or BackOff) and cannot be gracefully deleted or recovered while the backend NetApp storage is offline, for instance after a controller panic leaves LUNs in the nvfail state.
Log Output/Symptom Examples:
- Pods remain in Terminating state for extended periods.
- Pods enter BackOff if volume mount attempts repeatedly fail.
- Cluster events show failed volume unmounts or inaccessible storage.
- Storage controller logs indicate LUNs in nvfail and backend services offline.
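
To confirm the first symptom across a cluster, the following is a minimal sketch rather than a NetApp or Trident tool. It assumes the pod list was exported as JSON with kubectl, and it relies on the fact that a pod shown as Terminating carries a metadata.deletionTimestamp. The function name find_stuck_terminating and the 10-minute threshold are illustrative assumptions, not values from this article.

```python
from datetime import datetime, timedelta, timezone


def find_stuck_terminating(pod_list, now, threshold=timedelta(minutes=10)):
    """Return (namespace, name, overdue) for pods still present after their deletion timestamp.

    pod_list is the parsed output of `kubectl get pods -o json`.
    The threshold is an assumed value; tune it to your grace periods.
    """
    stuck = []
    for pod in pod_list.get("items", []):
        meta = pod.get("metadata", {})
        stamp = meta.get("deletionTimestamp")
        if not stamp:
            continue  # deletion was never requested for this pod
        # Kubernetes serializes timestamps as RFC 3339 in UTC, e.g. 2024-01-01T00:00:00Z
        deadline = datetime.strptime(stamp, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)
        overdue = now - deadline
        if overdue > threshold:
            stuck.append((meta.get("namespace", "default"), meta.get("name"), overdue))
    return stuck
```

One way to use it is to save the output of kubectl get pods --all-namespaces -o json to a file, load it with json.load, and pass the result together with datetime.now(timezone.utc). Pods that appear in the result are candidates for the stuck state described above; the check does not by itself prove that the storage backend is the cause.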
