After OCP upgrade daemonset pods stuck at container creating state
Applies to
- Astra Trident
- Openshift Container Platform (OCP) 4.12.25
Issue
After OCP upgrade from 4.12.7 to 4.12.25, daemonset pods failed to communicate to the trident controller which results daemonset pods stuck at container crating state.
Trident Controller Log:
level=error msg="Trident is initializing, please try again later" handler=AddOrUpdateNode logLayer=rest_frontend node=Node_Name requestID=0b8a7adc-01c5-450e-9a3a-a450492a655f requestSource=REST workflow="trident_rest=logger"
level=error msg="Could not initialize storage driver." error="error initializing ontap-nas driver: could not create Data ONTAP API client: error creating ONTAP API client: error reading SVM details: Post \"https://xx.xx.xx.xx/servlets/netapp.servlets.admin.XMLrequest_filer\": context deadline exceeded (Client.Timeout exceeded while awaiting headers)" logLayer=core requestID=58be7067-9905-4df0-9ebf-b3eb20065630 requestSource=Internal workflow="core=bootstrap"
level=debug msg="Failed storage backend." backendName=netapptrident_dtsp backendUUID= driver=ontap-nas logLayer=core requestID=58be7067-9905-4df0-9ebf-b3eb20065630 requestSource=Internal workflow="core=bootstrap"
Trident Operator Log:
level=error msg="Trident REST interface was not available after 210.00 seconds; err: command terminated with exit code 1; Error: could not get version: Trident is initializing, please try again later (503 Service Unavailable)\ncommand terminated with exit code 1"
Trident-Node Log:
level=warning msg="Could not update Trident controller with node registration, will retry." error="failed during retry for CreateNode: could not log into the Trident CSI Controller: error communicating with Trident CSI Controller; Put \https://TridentCSI_ServiceIP:34571/trident/v1/node/NOde_Name\: dial tcp TridentCSI_ServiceIP:34571: connect: connection refused" increment=10.948125443s logLayer=csi_frontend requestID=ccaf9603-dbc3-4ffb-b1ad-f54b6d569f9c requestSource=Internal workflow="plugin=activate"
readiness probe failed: HTTP probe failed with statuscode: 503