Upgrade from 10.3 to 10.4 stuck and lost connection to virtual Admin node
Applies to
StorageGRID Webscale 10.3.x
Issue
- The overall upgrade process is stuck, and Grid Manager is not accessible.
- The only remaining access to the node is the command line through the VMware vSphere console.
/var/local/log/gdu-server.log
on the Admin Node shows that the entire upgrade is waiting for the Admin Node to finish its own upgrade job:
I, [2018-03-30T04:06:16.736979 0008183] INFO -- gdu-server: This platform base-os is owned by storagegrid. Updating base-os packages
I, [2018-03-30T04:06:17.283990 0008183] INFO -- gdu-server: Starting base-os upgrade. This node will be stopped in 30 seconds.
I, [2018-03-30T04:06:17.564760 0008183] INFO -- gdu-server: Executing command `rm -f /etc/DoNotStartNode` on <IP_Address>
I, [2018-03-30T04:06:17.568264 0008183] INFO -- gdu-server: updategrid completed. Waiting for node to update base-os and reboot
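When only console access is available, it can help to confirm from the log itself that the last recorded step is the "waiting for reboot" message shown above. The following is a minimal sketch, assuming the log lines follow the shape of the excerpt; the function names and the regular expression are illustrative and are not part of StorageGRID.

```python
import re

# Assumed line shape, taken from the excerpt above:
# I, [2018-03-30T04:06:17.568264 0008183] INFO -- gdu-server: message
LINE_RE = re.compile(
    r"^(?P<sev>[A-Z]), \[(?P<ts>\S+)\s+#?(?P<pid>\d+)\]\s+"
    r"(?P<level>[A-Z]+) -- (?P<prog>[^:]+): (?P<msg>.*)$"
)

# Message text copied from the excerpt; other releases may word it differently.
WAITING_TEXT = "Waiting for node to update base-os and reboot"


def last_gdu_message(lines):
    """Return (timestamp, message) for the last parseable line, or None."""
    last = None
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match:
            last = (match.group("ts"), match.group("msg"))
    return last


def is_waiting_for_reboot(lines):
    """True if the most recent parseable entry is the waiting message."""
    last = last_gdu_message(lines)
    return last is not None and WAITING_TEXT in last[1]
```

To use it on the node, pass the lines of the log file, for example `is_waiting_for_reboot(open("/var/local/log/gdu-server.log"))`.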
/var/log/upgrade.log
does not show any errors.
# storagegrid node validate all
returns an error:
-----Error Message-----
ERROR -- <Affected node>: GRID_NETWORK_GATEWAY = 0.0.0.0
ERROR -- <Affected node>: GRID_NETWORK_GATEWAY is not a valid IP address
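The error indicates that a gateway value of 0.0.0.0 is rejected. As a hedged illustration of that kind of check, the sketch below scans configuration text for `GRID_NETWORK_GATEWAY` and flags values that are unparseable or unspecified. The `KEY = VALUE` layout is assumed from the error message, the function names are hypothetical, and this is not the validation logic that `storagegrid node validate` actually runs.

```python
import ipaddress


def check_gateway(value):
    """Return None if the value looks usable as a gateway, else a reason."""
    try:
        addr = ipaddress.ip_address(value.strip())
    except ValueError:
        return "not a parseable IP address"
    if addr.is_unspecified:
        # 0.0.0.0 (or :: for IPv6) parses, but cannot serve as a gateway.
        return "unspecified address (0.0.0.0 or ::)"
    return None


def find_gateway_problems(config_text, key="GRID_NETWORK_GATEWAY"):
    """Return a list of (value, reason) for each bad `key = value` line."""
    problems = []
    for line in config_text.splitlines():
        line = line.strip()
        # Skip blanks, comments, and lines without an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        name, _, value = line.partition("=")
        if name.strip() == key:
            reason = check_gateway(value)
            if reason:
                problems.append((value.strip(), reason))
    return problems
```

For example, `find_gateway_problems("GRID_NETWORK_GATEWAY = 0.0.0.0")` reports the same value that the validation output above complains about.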