StorageGRID Node not responding to network requests due to HA Group FAULT state
Applies to
NetApp StorageGRID
Issue
- Nodes do not respond to client requests even though incoming packets are seen on the node, resulting in client connection timeouts and degraded data transfer.
- From /var/local/log/hagroups.log on the affected node:
Keepalived_vrrp[19619]: Script `ha_check` now returning 1
Keepalived_vrrp[19619]: VRRP_Script(ha_check) failed (exited with status 1)
Keepalived_vrrp[19619]: (TeHDCz2SSD1) Entering FAULT STATE
Keepalived_vrrp[19619]: (TeHDCz2SSD2) Entering FAULT STATE
Keepalived_vrrp[19619]: (TeHDCz2SSD2) removing VIPs.
Keepalived_vrrp[19619]: (TEHDCz2SSDDEV2) removing Virtual Routes
Keepalived_vrrp[19619]: Netlink: error: No such process (3), type=RTM_DELROUTE(25), seq=..., pid=0
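To spot these messages quickly in a long log, a small script can list the VRRP instances that entered FAULT state and the check scripts that failed. The following is a minimal sketch, not a NetApp-provided tool: the function name `summarize_ha_log` is hypothetical, and the patterns assume the message wording shown in the excerpt above.

```python
import re

# Assumption: message wording matches the keepalived excerpt above.
# Matches e.g. "(TeHDCz2SSD2) Entering FAULT STATE" and captures the instance name.
FAULT_RE = re.compile(r"\(([^)]+)\) Entering FAULT STATE")
# Matches e.g. "VRRP_Script(ha_check) failed (exited with status 1)".
SCRIPT_RE = re.compile(r"VRRP_Script\((\w+)\) failed \(exited with status (\d+)\)")


def summarize_ha_log(text):
    """Return (faulted instance names, [(script name, exit status), ...])."""
    faulted = [m.group(1) for m in FAULT_RE.finditer(text)]
    failed = [(m.group(1), int(m.group(2))) for m in SCRIPT_RE.finditer(text)]
    return faulted, failed


# Sample input copied from the log excerpt in this article.
sample = """\
Keepalived_vrrp[19619]: VRRP_Script(ha_check) failed (exited with status 1)
Keepalived_vrrp[19619]: (TeHDCz2SSD1) Entering FAULT STATE
Keepalived_vrrp[19619]: (TeHDCz2SSD2) Entering FAULT STATE
"""

faulted, failed = summarize_ha_log(sample)
print("Instances in FAULT state:", faulted)
print("Failed check scripts:", failed)
```

To run it against a real node, the contents of /var/local/log/hagroups.log could be read with `open(...).read()` and passed to the function. The patterns are searched across the whole text rather than line by line, so the sketch also tolerates log lines that were pasted together without line breaks.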
- Packet capture on the node showed SYN packets received but no SYN/ACK response.
root@xxx:~# tcpdump -i eth2.217 -nn host XX.XX.XX.XX ... (incoming SYN packets, but no SYN/ACK response)
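The same symptom can be confirmed from saved tcpdump text output by counting SYN packets against SYN/ACK replies. The sketch below is illustrative only: the function name `count_handshake_flags` is hypothetical, and the sample lines use made-up documentation addresses (192.0.2.x), not data from the affected node. It relies on tcpdump printing a plain SYN as `Flags [S]` and a SYN/ACK as `Flags [S.]`, and it deliberately ignores other flag combinations (for example SYN packets carrying ECN bits).

```python
import re

# tcpdump prints TCP flags as "Flags [S]" (SYN) or "Flags [S.]" (SYN/ACK).
FLAGS_RE = re.compile(r"Flags \[([^\]]+)\]")


def count_handshake_flags(lines):
    """Return (syn_count, syn_ack_count) for an iterable of tcpdump text lines."""
    syn = syn_ack = 0
    for line in lines:
        match = FLAGS_RE.search(line)
        if not match:
            continue  # not a TCP line, or no flags field printed
        flags = match.group(1)
        if flags == "S":
            syn += 1
        elif flags == "S.":
            syn_ack += 1
    return syn, syn_ack


# Hypothetical capture lines for illustration only (not from the affected node).
sample_capture = [
    "10:00:00.000001 IP 192.0.2.10.51000 > 192.0.2.20.443: Flags [S], seq 1000, win 64240, length 0",
    "10:00:01.000001 IP 192.0.2.10.51000 > 192.0.2.20.443: Flags [S], seq 1000, win 64240, length 0",
    "10:00:03.000001 IP 192.0.2.10.51000 > 192.0.2.20.443: Flags [S], seq 1000, win 64240, length 0",
]

syn, syn_ack = count_handshake_flags(sample_capture)
print(f"SYN: {syn}, SYN/ACK: {syn_ack}")
if syn > 0 and syn_ack == 0:
    print("SYN packets arrive but are never answered.")
```

A capture with repeated SYNs and zero SYN/ACKs is consistent with the node receiving traffic for an address it is no longer answering on, which matches the VIP removal seen in the HA group log.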
