StorageGRID appliance SG5600 Controller is rebooting unexpectedly
Applies to
- StorageGRID Webscale: 10.x and above
- StorageGRID Appliances (SGA): SG5612 and SG5660
Issue
- The SG5600 Controller is randomly resetting.
- In general, The impacts on the GRID appear to be minimal outside some alarms.
- In very rare situations, two SGAs could experience simultaneous resets and may prevent them from coming back online.
- "
base-os-logs/var/log/storagegrid/nodes/<gridid>.log
" shows following event, this indicates the controller has rebooted.
Example:
Feb 13 01:02:19 localhost logger: [2018-02-13T01:02:19+0000 INSG] Starting system logging: syslog-ng.
Feb 13 01:02:19 localhost logger: [2018-02-13T01:02:19+0000 INSG] Detected new container with persistent data. Initializing container and restoring data
Feb 13 01:02:19 localhost logger: [2018-02-13T01:02:19+0000 INSG] [root@SG:/root] >>> /usr/sbin/restore_container.sh 2>&1
......
Feb 13 01:02:45 localhost logger: [2018-02-13T01:02:45+0000 INSG] #######################################
Feb 13 01:02:45 localhost logger: [2018-02-13T01:02:45+0000 INSG] STARTING SERVICES
Feb 13 01:02:45 localhost logger: [2018-02-13T01:02:45+0000 INSG] #######################################