NetApp Knowledge Base

Upgrade from 10.3.x to 10.4 stalls

Category:
storagegrid-webscale
Specialty:
sgrid

Applies to

StorageGRID Webscale 10.3.x

Issue

The upgrade stalls at the 'Upgrade Grid Nodes' stage in the GMI (Grid Management Interface):

1085323_1.png

One storage node is shown in gray in the grid topology:

1085323_2.png

'State Changed' alarms are reported for services such as DDS, SSM, and LDR, with a Trigger Value of Administratively Down.

/var/local/log/gdu-server.log on the admin node shows that the overall upgrade is waiting for one storage node to finish its node-scope upgrade job:

I, [2018-03-30T04:05:19.932678 0008183] INFO -- gdu-server: Attempting to upgrade from 10.3.0.4 to 10.4.0...
I, [2018-03-30T04:05:20.924058 0008183] INFO -- gdu-server: Stopping all services
I, [2018-03-30T04:06:15.996732 0008183] INFO -- gdu-server: Backing up resource files
I, [2018-03-30T04:06:16.051361 0008183] INFO -- gdu-server: Packing persistent data
I, [2018-03-30T04:06:16.736361 0008183] INFO -- gdu-server: No config dir found. Could not write version file.
I, [2018-03-30T04:06:16.736722 0008183] INFO -- gdu-server: Upgrade phase 1 is complete.
I, [2018-03-30T04:06:16.736979 0008183] INFO -- gdu-server: This platform base-os is owned by storagegrid. Updating base-os packages
I, [2018-03-30T04:06:17.283990 0008183] INFO -- gdu-server: Starting base-os upgrade. This node will be stopped in 30 seconds.
I, [2018-03-30T04:06:17.564760 0008183] INFO -- gdu-server: Executing command `rm -f /etc/DoNotStartNode` on <IP_Address>
I, [2018-03-30T04:06:17.568264 0008183] INFO -- gdu-server: updategrid completed. Waiting for node to update base-os and reboot
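To check whether a node is stuck at this point, the last gdu-server.log entry can be inspected programmatically. The following is a minimal Python sketch, not a NetApp tool: the line pattern is inferred only from the excerpt above, and the names parse_gdu_line, last_event, and is_waiting_for_reboot are hypothetical helpers introduced for illustration.

```python
import re
from datetime import datetime

# Pattern inferred from the log excerpt above (an assumption, not a documented format):
#   I, [<timestamp> <pid>] INFO -- gdu-server: <message>
LINE_RE = re.compile(
    r"^(?P<sev>[A-Z]), \[(?P<ts>\S+)\s+(?P<pid>\S+)\]\s+"
    r"(?P<level>[A-Z]+) -- (?P<prog>[^:]+): (?P<msg>.*)$"
)

# Message text copied from the final line of the excerpt.
WAITING_MARKER = "Waiting for node to update base-os and reboot"


def parse_gdu_line(line):
    """Return a dict for one log line, or None if the line does not match."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    return {
        "time": datetime.strptime(m.group("ts"), "%Y-%m-%dT%H:%M:%S.%f"),
        "level": m.group("level"),
        "message": m.group("msg"),
    }


def last_event(lines):
    """Return the last parseable entry from an iterable of log lines."""
    last = None
    for line in lines:
        entry = parse_gdu_line(line)
        if entry is not None:
            last = entry
    return last


def is_waiting_for_reboot(lines):
    """True if the most recent entry is the 'waiting for reboot' message."""
    entry = last_event(lines)
    return entry is not None and WAITING_MARKER in entry["message"]
```

For example, passing the lines of /var/local/log/gdu-server.log (for instance via open(path).readlines()) to is_waiting_for_reboot returns True when the log ends as shown in the excerpt. Comparing the timestamp from last_event with the current time gives a rough idea of how long the upgrade has been waiting.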

The following errors appear in /var/log/upgrade.log on the storage node that is shown in gray in the grid topology:

Setting up pge-updater (10.4.0-20170324.1849.f3f1236) ...
/var/lib/dpkg/info/pge-updater.postinst:16:in `run_cmd': /sbin/pge_image_updater install failed: (RuntimeError)
lsblk: /dev/disk/by-label/pge-actv-root: not a block device
lsblk: /dev/disk/by-label/pge-actv-root: not a block device
    from /var/lib/dpkg/info/pge-updater.postinst:58:in `<main>'
dpkg: error processing package pge-updater (--configure):
 subprocess installed post-installation script returned error exit status 1
......
Errors were encountered while processing:
 pge-updater
E: Sub-process /usr/bin/dpkg returned an error code (1)
Package install completed with exit code 100
Failed to install required base-os packages
Not cleaning up upgrade-stage
Rebooting...
WARNING: Ignoring /sbin/shutdown request while base-os upgrade flag exists.
Complete the base-os upgrade to restore /sbin/shutdown command behaviour
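The failure indicators in this excerpt can be pulled out of a long upgrade.log with a short script. The sketch below is illustrative only: the patterns are taken from the lines shown above, other failure modes may log different text, and summarize_upgrade_log is a hypothetical helper name.

```python
import re

# Patterns copied from the upgrade.log excerpt above; they are assumptions
# about this one failure mode, not a complete list of upgrade errors.
DPKG_ERROR_RE = re.compile(r"^dpkg: error processing package (?P<pkg>\S+)")
NOT_BLOCK_RE = re.compile(r"^lsblk: (?P<path>\S+): not a block device")
EXIT_CODE_RE = re.compile(r"^Package install completed with exit code (?P<code>\d+)")
BASE_OS_FAILED = "Failed to install required base-os packages"


def summarize_upgrade_log(text):
    """Collect failed packages, bad device paths, and the install exit code."""
    summary = {
        "failed_packages": [],
        "not_block_devices": [],
        "exit_code": None,
        "base_os_failed": False,
    }
    for raw in text.splitlines():
        line = raw.strip()
        m = DPKG_ERROR_RE.match(line)
        if m and m.group("pkg") not in summary["failed_packages"]:
            summary["failed_packages"].append(m.group("pkg"))
        m = NOT_BLOCK_RE.match(line)
        if m and m.group("path") not in summary["not_block_devices"]:
            summary["not_block_devices"].append(m.group("path"))
        m = EXIT_CODE_RE.match(line)
        if m:
            summary["exit_code"] = int(m.group("code"))
        if BASE_OS_FAILED in line:
            summary["base_os_failed"] = True
    return summary
```

Run against the excerpt above, the summary names pge-updater as the failed package and /dev/disk/by-label/pge-actv-root as the path that lsblk rejected, which is the combination that characterizes this issue.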
