SnapVault replication jobs for multiple VM datastores are failing on StorageGRID
Applies to
- NetApp StorageGRID 11.6.0.10
- FabricPool
Issue
- Latency is observed on SnapMirror replication when the volume's tiering policy is set to All.
- Restarting the primary and secondary Admin Nodes improves latency temporarily.
- Diagnostics under StorageGRID GUI > Support > Diagnostics report the following:
- TCP Retransmission
- HTTP 499 errors (typically occur when the client terminates the connection before the server can respond)
- HTTP 500 errors (Internal Server Error; indicates a failure on the server side rather than on the client side)
- Load balancer Request timeouts
- Disk Read and Write Latency
- On checking, the back-end E-Series disk drives were reporting IOP_FAST_TIMEOUT_ERROR
- Cassandra Timeout Requests
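
To judge how much of the failing traffic is client-terminated (499) versus server-side (5xx), the HTTP status codes listed above can be tallied by category. The following is a minimal sketch only: the function name `summarize_status_codes` and the sample list are hypothetical, and it assumes the status codes have already been extracted from whatever log or metric source is available. It does not use any StorageGRID API.

```python
from collections import Counter


def summarize_status_codes(codes):
    """Count HTTP status codes in three buckets: 499, 5xx, and everything else.

    `codes` is any iterable of integer HTTP status codes (hypothetical input,
    assumed to be extracted beforehand from logs or metrics).
    """
    summary = Counter()
    for code in codes:
        if code == 499:
            # Non-standard code: the client closed the connection
            # before the server could respond.
            summary["client_closed_499"] += 1
        elif 500 <= code <= 599:
            # Server-side failures, including 500 Internal Server Error.
            summary["server_error_5xx"] += 1
        else:
            summary["other"] += 1
    return dict(summary)


if __name__ == "__main__":
    # Hypothetical sample, for illustration only.
    sample_codes = [200, 499, 500, 200, 499, 503]
    print(summarize_status_codes(sample_codes))
```

A high share of 499 responses alongside request timeouts would be consistent with clients giving up on slow responses, while a high share of 5xx responses points to failures on the server side.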