StorageGRID Appliance reboots unexpectedly due to network isolation
Applies to
- NetApp StorageGRID 11.5 and later
- NetApp StorageGRID Appliance
Issue
A StorageGRID Appliance node reboots on its own without any apparent reason.
In the node log (/var/log/storagegrid/nodes/<nodename>.log in base-os), the following can be observed:
[2021-06-14T12:36:14.818704] INFO -- Possible network isolation: Node has no contact with other nodes. If this warning persists, use the /usr/sbin/add_node_ip.py command to tell this node the address of another node in the grid. See the Recovery and Maintenance Guide for details.
[2021-06-14T12:36:14.818919] INFO -- 2021-06-14 12:36:14 +0000 | dynip | Possible network isolation: Node has no contact with other nodes.
[2021-06-14T12:36:30.821317] INFO -- Node service caught SIGTERM
[2021-06-14T12:36:30.841484] INFO -- Node service caught SIGTERM
[2021-06-14T12:36:30.841436] WARN -- Got socket error 4 with message Interrupted system call
Expect a gap of at least 10 minutes between the first logged isolation event and the reboot, which appears in the log as the SIGTERM entries.
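The timing check can be sketched as a small Python script. This is a minimal illustration, not a NetApp tool: the function names are hypothetical, and it assumes every relevant log line starts with a bracketed timestamp in the format shown in the excerpt above.

```python
import re
from datetime import datetime, timedelta

# Bracketed timestamp at the start of a node log line, as in the excerpt.
TIMESTAMP = re.compile(r"^\[(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d+)\]")
ISOLATION = "Possible network isolation"
SIGTERM = "Node service caught SIGTERM"
# Minimum gap stated in this article.
EXPECTED_MIN_GAP = timedelta(minutes=10)


def isolation_to_sigterm_gap(lines):
    """Return the time from the first isolation warning to the first
    SIGTERM logged after it, or None if that pair is not found."""
    first_isolation = None
    for line in lines:
        match = TIMESTAMP.match(line)
        if not match:
            continue  # skip lines without a leading timestamp
        stamp = datetime.strptime(match.group(1), "%Y-%m-%dT%H:%M:%S.%f")
        if first_isolation is None:
            if ISOLATION in line:
                first_isolation = stamp
        elif SIGTERM in line:
            return stamp - first_isolation
    return None


def matches_isolation_pattern(gap):
    """True if the gap is at least the 10 minutes described in this article."""
    return gap is not None and gap >= EXPECTED_MIN_GAP
```

To use it, pass the lines of the node log (for example, an open file object for the log path given above) to isolation_to_sigterm_gap and then pass the result to matches_isolation_pattern. The sketch only looks at the first isolation warning in the input, so a log covering several reboots would need to be split per incident first.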