EC rebalance job fails at high percentage with low amount of failures
Applies to
NetApp StorageGRID
Issue
An EC rebalance job ends in the Failure state at a high completion percentage, even though only a small number of moves failed.
Example:
Job ID : 17947704893988286897
Site : <sitename>
State : Failure
Total Moves : 182
Completed Moves : 180
Failures (retryable) : 2
Failures (non-retryable) : 0
Percentage : 98
Start Time : 2023-11-25 12:12:48 UTC
End Time : 2023-12-19 08:11:18 UTC
Retry Rebalance : Yes
When debugging is enabled on the EC leader, it reports:
Dec 20 13:51:01 <nodename> chunk[188411]: [httpserver.go:151:14a233159cd9c56d] WARNING: PUT 11A24F88-DF02-473B-A591-8B8EB850D259/001A34084B9D1020_0_1 - failed with 503: volume read only
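To check whether a collected debug log contains the same symptom, the lines can be filtered for PUT requests rejected with "503: volume read only". The following Python sketch is illustrative only and is not a NetApp tool. The regular expression is modeled on the single log line shown above, and the function name find_read_only_put_failures is hypothetical. Other log formats are not covered.

```python
import re

# Pattern modeled on the one example log line in this article (an assumption,
# not a documented log format).
PUT_FAILURE = re.compile(
    r"WARNING: PUT (?P<target>\S+) - failed with (?P<status>\d{3}): (?P<reason>.+)$"
)


def find_read_only_put_failures(lines):
    """Return the PUT targets that were rejected with '503: volume read only'."""
    targets = []
    for line in lines:
        match = PUT_FAILURE.search(line)
        if match is None:
            continue  # not a PUT warning in the expected shape
        if match["status"] == "503" and match["reason"].strip() == "volume read only":
            targets.append(match["target"])
    return targets


sample = [
    "Dec 20 13:51:01 <nodename> chunk[188411]: "
    "[httpserver.go:151:14a233159cd9c56d] WARNING: PUT "
    "11A24F88-DF02-473B-A591-8B8EB850D259/001A34084B9D1020_0_1 "
    "- failed with 503: volume read only",
]
print(find_read_only_put_failures(sample))
```

Each returned entry is the target string from the PUT request as it appears in the log, which may help identify which moves hit the read-only condition.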
