HCC FW upgrade to 2.99 gets stuck when already upgraded node has all drives Failed/Missing
Applies to
- Hybrid Cloud Control (HCC) storage firmware bundle 2.99.2
- Up to Element software 12.3
- Up to management services 2.18
Issue
- While the FW upgrade is still running, one of the already upgraded nodes suddenly fails all the drives out:
driveFailed
- <#> drive(s) with state: "DriveMissing" driveID: <driveID#>.
- If the failing node is sharing slices with the node that is currently upgraded (
nodeMaintenanceMode
), there is no possibility to sync the data - As a result
volumesOffline
will appear for the shared slices - Rebooting/power draining the failing node does not bring the drives back online
- Attempts to force the other node out of maintenance mode does not succeed:
FailedToRecover
- Symlink on failing node is still showing the old package
# ls -l /sf/packages/platform-config
/sf/packages/platform-config -> solidfire-platform-config-<NOT_2.99.2>