Stuck block or slice sync during upgrade to Element 12.3 after an upgraded node experiences a drive failure
Applies to
- NetApp Element software
- Upgrade to Element 12.3 or above from any lower version
Issue
During the cluster upgrade, a node finishes its upgrade to Element 12.3, has a failed drive, and:
- The cluster master (CM) node is on the older Element version
- unresponsiveService and either blockServiceUnhealthy or sliceServiceUnhealthy alerts are present for the affected drive, but a block or slice sync never occurs, so these errors remain indefinitely
- The cluster UI and Active IQ show one of the following driveFailed errors for the associated drive:
  - 1 drive(s) with state: "SmartHealthFailed" driveID: <drive_ID>.
  - 1 drive(s) with state: "NvmeCapacitance" driveID: <drive_ID>.
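The alert combination above can be expressed as a small check over a list of cluster faults. This is only a sketch. It assumes each fault is a dictionary with a "code" string and a "details" string, and that a driveFailed fault carries the message text quoted above in "details". The function name and the input shape are hypothetical, not taken from this article. The check does not verify the cluster master version or whether a sync is actually stalled, and it does not confirm that the service alerts belong to the same drive.

```python
import re

# Service alert codes named in the symptom list (one of these must be present).
SERVICE_CODES = {"blockServiceUnhealthy", "sliceServiceUnhealthy"}
# Drive states quoted in the driveFailed messages in the symptom list.
DRIVE_STATES = {"SmartHealthFailed", "NvmeCapacitance"}

# Matches text such as: state: "SmartHealthFailed" driveID: 17
STATE_RE = re.compile(r'state:\s*"(\w+)"\s*driveID:\s*(\d+)')


def stuck_sync_drive_ids(faults):
    """Return sorted drive IDs whose faults fit the alert pattern.

    `faults` is assumed to be a list of dicts with "code" and "details"
    keys (a hypothetical shape chosen for this illustration).
    """
    codes = {fault["code"] for fault in faults}
    # Require unresponsiveService plus a block or slice service alert.
    if "unresponsiveService" not in codes or not (codes & SERVICE_CODES):
        return []

    drive_ids = set()
    for fault in faults:
        if fault["code"] != "driveFailed":
            continue
        for state, drive_id in STATE_RE.findall(fault.get("details", "")):
            if state in DRIVE_STATES:
                drive_ids.add(int(drive_id))
    return sorted(drive_ids)
```

A match only suggests that the cluster may be in the state described here. The upgrade status and cluster master version still need to be confirmed separately.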
Additional errors that may require Support intervention:
- 12.2 scenarios: driveHealthFault only
- 12.3 scenarios: both driveHealthFault and driveFailed, and any of the following:
  - NvmeSparesExhausted
  - NvmeMediaErrors
  - NvmeReadOnly
  - NvmeCapacitance
  - NvmePersistentMemory
  - SmartHealthFailed