Upgrading GKE with Anthos fails on Trident storage
Applies to
- Astra Trident
- Google Kubernetes Engine (GKE) cluster with Anthos
Issue
Running gkectl diagnose cluster in preparation for an upgrade reports FAILURE on the storage check, with output similar to:
user@hostname:~$ gkectl diagnose cluster --kubeconfig kubeconfig --cluster-name gke-anthos-cluster
Preparing for the diagnose tool...
Diagnosing the cluster...... DONE
Diagnose result is saved successfully in /home/user/diagnose-user-gke-anthos-cluster-20230130155819.json
- Validation Category: Cluster Healthiness
Checking user cluster and node pools...SUCCESS
Checking user cluster certificates...SUCCESS
Checking cluster object...SUCCESS
...
Checking GKE Hub Membership...SUCCESS
Checking all poddisruptionbudgets...SUCCESS
Checking storage...FAILURE
Reason: 3 storage error(s).
Unhealthy Resources:
PersistentVolume kubernetes.io/csi/csi.trident.netapp.io^pvc-1234abcd-1234-abcd-1234-12345abcde12: virtual disk "kubernetes.io/csi/csi.trident.netapp.io^pvc-1234abcd-1234-abcd-1234-12345abcde12" IS NOT attached to machine "hostname-of-node-01" but IS listed in the Node.Status
PersistentVolume kubernetes.io/csi/csi.trident.netapp.io^pvc-1234abcd-1234-abcd-1234-12345abcde12: virtual disk "kubernetes.io/csi/csi.trident.netapp.io^pvc-1234abcd-1234-abcd-1234-12345abcde12" IS NOT attached to machine "hostname-of-node-02" but IS listed in the Node.Status
PersistentVolume kubernetes.io/csi/csi.trident.netapp.io^pvc-1234abcd-1234-abcd-1234-12345abcde12: virtual disk "kubernetes.io/csi/csi.trident.netapp.io^pvc-1234abcd-1234-abcd-1234-12345abcde12" IS NOT attached to machine "hostname-of-node-03" but IS listed in the Node.Status
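
On a cluster with many volumes, it can help to confirm which reported storage errors involve Trident volumes and which nodes they reference. The following is a minimal sketch, not part of gkectl or Astra Trident: it assumes the diagnose output has been captured as text, and it matches only the message wording shown in the sample above. The function names (find_stale_attachments, is_trident_volume) are hypothetical helpers introduced for illustration.

```python
import re
from typing import List, Tuple

# Assumption: the error wording matches the sample output in this article.
# Other gkectl versions may phrase the message differently.
UNHEALTHY_PV = re.compile(
    r'PersistentVolume (?P<pv>\S+): virtual disk "[^"]+" '
    r'IS NOT attached to machine "(?P<node>[^"]+)" '
    r'but IS listed in the Node\.Status'
)


def find_stale_attachments(diagnose_output: str) -> List[Tuple[str, str]]:
    """Return (persistent_volume, node) pairs flagged by the storage check."""
    return [
        (match.group("pv"), match.group("node"))
        for match in UNHEALTHY_PV.finditer(diagnose_output)
    ]


def is_trident_volume(pv: str) -> bool:
    """True when the volume identifier names the Trident CSI driver."""
    return "csi.trident.netapp.io" in pv


if __name__ == "__main__":
    # Placeholder values copied from the sample output, not real cluster data.
    pv_id = ("kubernetes.io/csi/csi.trident.netapp.io"
             "^pvc-1234abcd-1234-abcd-1234-12345abcde12")
    sample = (
        f'PersistentVolume {pv_id}: virtual disk "{pv_id}" '
        'IS NOT attached to machine "hostname-of-node-01" '
        'but IS listed in the Node.Status\n'
    )
    for pv, node in find_stale_attachments(sample):
        kind = "Trident" if is_trident_volume(pv) else "other"
        print(f"{node}: {pv} ({kind})")
```

The pattern is deliberately strict, so a message with different wording is skipped rather than misread. An empty result therefore means only that nothing matched this wording, not that the storage check passed.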