SolidFire Node Stuck in Maintenance Mode After Element Upgrade: “DisableMaintenanceMode forceDisable=true” Fails Due to Excess Standby Slices
Applies to
- Element Software Upgrade 12.7 to 12.9
- SF-Series Nodes
- H-Series Nodes
Issue
After a cluster upgrade from Element Software 12.7 to 12.9, a node remains stuck in maintenance mode and cannot exit it, even when the DisableMaintenanceMode API call is issued with the forceDisable=true flag:
API Call: DisableMaintenanceMode
Params: { "forceDisable": true, "nodes": [X] }
Result: { "currentMode": "FailedToRecover", "requestedMode": "Disabled" }
- The node is not holding any primary volumes but still has secondary volumes/standby slices.
- The master service core dumps repeatedly (SIGABRT/SIGKILL):
Restarted MasterService: previous run killed with signal 6 (SIGABRT) core dump coreFileCount=1 servicecorelimit=7
- Active IQ events show:
nodeMaintenanceMode The node with node ID X failed to recover from maintenance mode
Unexpected Exception - xDBSessionExpired Assertion Failed: [!(wtype == DBWatchType::SessionExpired)]
- Attempts to disable standby slice assignments or to force maintenance mode off fail while active standby slices exist.
- The cluster is degraded, and node replacement or upgrade is blocked.
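
For reference, the API call and result described in this section can be sketched in Python as follows. This is a minimal illustration, not a supported tool: it only builds the request body and checks a result for the failure state, and it does not contact a cluster. The method name, the forceDisable and nodes parameters, and the currentMode and requestedMode fields are taken from the call above; the JSON-RPC envelope fields ("method", "params", "id"), the helper names build_disable_request and failed_to_recover, and the node ID 4 are assumptions made for the example.

```python
import json


def build_disable_request(node_ids, force=True, request_id=1):
    """Build a request body for DisableMaintenanceMode.

    The envelope layout (method/params/id) is an assumption for
    illustration; the params mirror the call shown in this article.
    """
    return {
        "method": "DisableMaintenanceMode",
        "params": {"forceDisable": force, "nodes": list(node_ids)},
        "id": request_id,
    }


def failed_to_recover(result):
    """Return True if a result matches the symptom in this article:
    the disable was requested but the node reports FailedToRecover."""
    return (
        result.get("currentMode") == "FailedToRecover"
        and result.get("requestedMode") == "Disabled"
    )


if __name__ == "__main__":
    # Node ID 4 is a placeholder for the affected node (X in the article).
    print(json.dumps(build_disable_request([4])))

    # Sample result copied from the Issue description.
    sample = {"currentMode": "FailedToRecover", "requestedMode": "Disabled"}
    print(failed_to_recover(sample))
```

A result where failed_to_recover returns True corresponds to the condition described here; any other combination of modes is outside the scope of this article.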
