NetApp Knowledge Base

Upgrade from 10.3.x to 10.4 stalls

Category:
storagegrid-webscale
Specialty:
sgrid

Applies to

StorageGRID Webscale 10.3.x

Issue

The upgrade stalls at the 'Upgrade Grid Nodes' stage in the GMI (Grid Management Interface):

1085323_1.png

One storage node is shown as gray in the grid topology:

1085323_2.png

'State Changed' alarms are reported for services such as DDS, SSM, and LDR, with a Trigger Value of Administratively Down.

/var/local/log/gdu-server.log on the admin node shows that the overall upgrade is waiting for one storage node to finish its node-scope upgrade job:

I, [2018-03-30T04:05:19.932678 0008183] INFO -- gdu-server: Attempting to upgrade from 10.3.0.4 to 10.4.0...
I, [2018-03-30T04:05:20.924058 0008183] INFO -- gdu-server: Stopping all services
I, [2018-03-30T04:06:15.996732 0008183] INFO -- gdu-server: Backing up resource files
I, [2018-03-30T04:06:16.051361 0008183] INFO -- gdu-server: Packing persistent data
I, [2018-03-30T04:06:16.736361 0008183] INFO -- gdu-server: No config dir found. Could not write version file.
I, [2018-03-30T04:06:16.736722 0008183] INFO -- gdu-server: Upgrade phase 1 is complete.
I, [2018-03-30T04:06:16.736979 0008183] INFO -- gdu-server: This platform base-os is owned by storagegrid. Updating base-os packages
I, [2018-03-30T04:06:17.283990 0008183] INFO -- gdu-server: Starting base-os upgrade. This node will be stopped in 30 seconds.
I, [2018-03-30T04:06:17.564760 0008183] INFO -- gdu-server: Executing command `rm -f /etc/DoNotStartNode` on <IP_Address>
I, [2018-03-30T04:06:17.568264 0008183] INFO -- gdu-server: updategrid completed. Waiting for node to update base-os and reboot
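
The excerpt above ends with the message that marks the stalled state. As a rough illustration only, the following Python sketch parses lines in the format shown and reports whether the most recent entry is that waiting message. The regular expression is inferred from the excerpt alone, and the names `parse_line`, `last_entry`, and `is_waiting_for_reboot` are hypothetical helpers, not part of any StorageGRID tooling.

```python
import re
from datetime import datetime

# Line layout inferred from the gdu-server.log excerpt (an assumption):
#   I, [2018-03-30T04:06:17.568264 0008183] INFO -- gdu-server: message
LINE_RE = re.compile(
    r"^(?P<sev>[A-Z]), \[(?P<ts>\S+)\s+#?(?P<pid>\d+)\]\s+"
    r"(?P<level>[A-Z]+) -- (?P<prog>[^:]+): (?P<msg>.*)$"
)

# Text copied from the final line of the excerpt.
WAIT_MARKER = "Waiting for node to update base-os and reboot"


def parse_line(line):
    """Return a dict for one log line, or None if it does not match."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    return {
        "time": datetime.strptime(m.group("ts"), "%Y-%m-%dT%H:%M:%S.%f"),
        "level": m.group("level"),
        "program": m.group("prog"),
        "message": m.group("msg"),
    }


def last_entry(lines):
    """Return the last line that parses as a log entry, or None."""
    last = None
    for line in lines:
        entry = parse_line(line)
        if entry is not None:
            last = entry
    return last


def is_waiting_for_reboot(lines):
    """True if the most recent entry is the 'waiting for reboot' message."""
    entry = last_entry(lines)
    return entry is not None and WAIT_MARKER in entry["message"]
```

A caller could pass the lines of the log file, for example `is_waiting_for_reboot(open(path).read().splitlines())`, and compare the timestamp of `last_entry(...)` with the current time to estimate how long the upgrade has been waiting.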

The following errors appear in /var/log/upgrade.log on the storage node that is shown as gray in the grid topology:

Setting up pge-updater (10.4.0-20170324.1849.f3f1236) ...
/var/lib/dpkg/info/pge-updater.postinst:16:in `run_cmd': /sbin/pge_image_updater install failed: (RuntimeError)
lsblk: /dev/disk/by-label/pge-actv-root: not a block device
lsblk: /dev/disk/by-label/pge-actv-root: not a block device
    from /var/lib/dpkg/info/pge-updater.postinst:58:in `<main>'
dpkg: error processing package pge-updater (--configure):
 subprocess installed post-installation script returned error exit status 1
......
Errors were encountered while processing:
 pge-updater
E: Sub-process /usr/bin/dpkg returned an error code (1)
Package install completed with exit code 100
Failed to install required base-os packages
Not cleaning up upgrade-stage
Rebooting...
WARNING: Ignoring /sbin/shutdown request while base-os upgrade flag exists.
Complete the base-os upgrade to restore /sbin/shutdown command behaviour
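
To check whether another node's upgrade.log shows the same symptoms, the error strings above can be searched for directly. The following Python sketch does only that. The patterns are copied from the excerpt, while the dictionary keys and the function name `scan_upgrade_log` are hypothetical and chosen for illustration. A match shows that the log resembles the one in this article; it does not identify a cause.

```python
import re

# Patterns copied from the upgrade.log excerpt; the key names are illustrative.
SIGNATURES = {
    "postinst_failed": re.compile(r"pge_image_updater install failed"),
    "not_block_device": re.compile(r"lsblk: \S+: not a block device"),
    "dpkg_error": re.compile(r"dpkg: error processing package \S+"),
    "base_os_failed": re.compile(r"Failed to install required base-os packages"),
    "shutdown_ignored": re.compile(r"Ignoring /sbin/shutdown request"),
}


def scan_upgrade_log(text):
    """Map each signature name to the list of log lines that match it."""
    findings = {}
    for line in text.splitlines():
        for name, pattern in SIGNATURES.items():
            if pattern.search(line):
                findings.setdefault(name, []).append(line.strip())
    return findings
```

Calling `scan_upgrade_log(open("/var/log/upgrade.log").read())` on the affected node returns an empty dictionary when none of the listed strings are present.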
