StorageGRID upgrade to 11.4 gets stuck at the "waiting for reboot" step, or a node reboots but does not fully come back
Applies to
- StorageGRID upgrade to 11.4
- Docker service on container nodes
Issue
Upgrade to 11.4:
- gets stuck at the "waiting for reboot" step
- or, a node reboots for some other reason and does not come back
- services don't start
- node is "blue" in the Grid Manager UI
- normal SSH to port 22 does not connect, but the node can be reached via SSH on port 8022
- SSH into the base OS level on the node
ssh -p 8022
- Check /var/log/syslog for logs similar to the following:
Jan 13 20:06:25 localhost dockerd[784]: time="2020-01-13T20:06:25.225330454Z" level=error msg="failed connecting to containerd" error="failed to dial \"/var/run/docker/containerd/containerd.sock\": context deadline exceeded" module=libcontainerd
Jan 13 20:06:25 localhost dockerd[784]: time="2020-01-13T20:06:25.325657112Z" level=info msg="killing and restarting containerd" module=libcontainerd pid=825
Jan 13 20:06:25 localhost dockerd[784]: time="2020-01-13T20:06:25.426661368Z" level=error msg="containerd did not exit successfully" error="signal: killed" module=libcontainerd
Jan 13 20:06:26 localhost dockerd[784]: panic: runtime error: invalid memory address or nil pointer dereference
Jan 13 20:06:26 localhost dockerd[784]: [signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x561adb97d4d0]
Jan 13 20:06:26 localhost dockerd[784]: goroutine 75 [running]:
Jan 13 20:06:26 localhost dockerd[784]: github.com/containerd/containerd.(*Client).Close(0x0, 0x0, 0x0)
Jan 13 20:06:26 localhost dockerd[784]: #011/build/docker.io-s7YiOg/docker.io-18.09.1+dfsg1/.gopath/src/github.com/containerd/containerd/client.go:536 +0x30
Jan 13 20:06:26 localhost dockerd[784]: github.com/docker/docker/libcontainerd/supervisor.(*remote).monitorDaemon(0xc0008329c0, 0x561adcb3c660, 0xc00016c6c0)
Jan 13 20:06:26 localhost dockerd[784]: #011/build/docker.io-s7YiOg/docker.io-18.09.1+dfsg1/.gopath/src/github.com/docker/docker/libcontainerd/supervisor/remote_daemon.go:321 +0x24f
Jan 13 20:06:26 localhost dockerd[784]: created by github.com/docker/docker/libcontainerd/supervisor.Start
Jan 13 20:06:26 localhost dockerd[784]: #011/build/docker.io-s7YiOg/docker.io-18.09.1+dfsg1/.gopath/src/github.com/docker/docker/libcontainerd/supervisor/remote_daemon.go:90 +0x3b5
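
When the syslog is large, it can be easier to search a copy of it for the two signatures shown in the excerpt above than to read through it by hand. The following is a minimal sketch, not a NetApp-provided tool: the function names `find_dockerd_signatures` and `matches_issue` are hypothetical, and the assumption that a "failed connecting to containerd" error together with the nil pointer panic indicates this issue is based only on the sample log in this article.

```python
# Signature strings copied from the dockerd log excerpt in this article.
CONTAINERD_DIAL_FAILURE = "failed connecting to containerd"
NIL_POINTER_PANIC = (
    "panic: runtime error: invalid memory address or nil pointer dereference"
)


def find_dockerd_signatures(lines):
    """Return (dial_failures, panics): dockerd log lines matching each signature."""
    dial_failures = []
    panics = []
    for line in lines:
        # Only consider messages logged by the dockerd process.
        if "dockerd[" not in line:
            continue
        if CONTAINERD_DIAL_FAILURE in line:
            dial_failures.append(line.rstrip("\n"))
        elif NIL_POINTER_PANIC in line:
            panics.append(line.rstrip("\n"))
    return dial_failures, panics


def matches_issue(lines):
    """Assumption: the issue is indicated when both signatures are present."""
    dial_failures, panics = find_dockerd_signatures(lines)
    return bool(dial_failures) and bool(panics)
```

For example, with the log copied to a workstation, `matches_issue(open("syslog", errors="replace"))` returns `True` only if both kinds of message appear in the file. Treat a match as a hint to compare the full log against the excerpt above, not as a confirmed diagnosis.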