- ONTAP 9
- Clustered Data ONTAP 8
This article lists most ONTAP upgrade operational and troubleshooting workflows. However, it is not a comprehensive list.
Use it to narrow your search to the more commonly used troubleshooting KBs, grouped by category.
An ONTAP upgrade is typically a multi-step process. Every node in an ONTAP cluster that is upgraded must go through the following steps:
- Download the ONTAP software package file to the cluster from an accessible Web server.
- Install the ONTAP software package file as the secondary boot image on the storage controller's boot media.
- Set the newly installed boot image as the primary boot image.
- Reboot the storage controller (via the 'storage failover takeover' process) to load the newly installed primary boot image.
After the controller has completed the reboot, ONTAP upgrade tasks are executed automatically.
Once the ONTAP upgrade tasks are completed, the storage controller is effectively upgraded.
To resolve failures that occur during the upgrade process, it is crucial to identify which step in the process described above has failed.
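The per-node steps above can be modeled as an ordered sequence, which makes "identify which step failed" concrete. The following is a minimal Python sketch for illustration only: the step names and the `first_failed_step` helper are hypothetical labels and are not ONTAP identifiers or commands.

```python
from enum import IntEnum
from typing import Dict, Optional


class UpgradeStep(IntEnum):
    """Per-node upgrade steps in the order described above (illustrative names)."""
    DOWNLOAD_PACKAGE = 1      # download the package file from the Web server
    INSTALL_SECONDARY = 2     # install it as the secondary boot image
    SET_DEFAULT_IMAGE = 3     # set the new image as the primary boot image
    REBOOT_VIA_TAKEOVER = 4   # reboot through storage failover takeover
    POST_REBOOT_TASKS = 5     # automatic upgrade tasks after the reboot


def first_failed_step(results: Dict[UpgradeStep, bool]) -> Optional[UpgradeStep]:
    """Return the earliest step that did not succeed, or None if all succeeded.

    A step missing from `results` is treated as not completed, because a later
    step cannot have run if an earlier one never finished.
    """
    for step in UpgradeStep:  # IntEnum members iterate in definition order
        if not results.get(step, False):
            return step
    return None


# Hypothetical observation: the first three steps succeeded, the reboot did not.
observed = {
    UpgradeStep.DOWNLOAD_PACKAGE: True,
    UpgradeStep.INSTALL_SECONDARY: True,
    UpgradeStep.SET_DEFAULT_IMAGE: True,
    UpgradeStep.REBOOT_VIA_TAKEOVER: False,
}
failed = first_failed_step(observed)
print(failed.name if failed else "all steps completed")
```

Recording the outcome per step in this way points directly at the troubleshooting section that applies, for example the download problems or the default boot image setting covered later in this article.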
Methods of performing ONTAP upgrades
With clustered Data ONTAP 8.3 and later, there are two methods of performing ONTAP upgrades non-disruptively:
- Manual Upgrade: Sometimes referred to as a non-disruptive upgrade, or NDU, the manual upgrade process involves many steps run on each storage controller individually. These steps are performed by the storage administrator. Based on the number of storage controllers configured in the cluster and the desired version of ONTAP that is being installed, the steps can be performed individually on each storage controller in a 'rolling upgrade' manner or in parallel using the 'batch upgrade' process.
- Automated Upgrade: Referred to as an automated non-disruptive upgrade, or ANDU, the automated process greatly reduces the number of steps involved in performing a cluster-wide ONTAP upgrade. The automated upgrade system performs all the manual steps required for a cluster-wide ONTAP upgrade. Depending on the number of storage controllers configured in the cluster and the desired version of ONTAP that is being installed, the automated upgrade process performs the upgrade in a 'rolling upgrade' manner or a 'batch upgrade' manner.
Documentation
It is highly recommended to review the ONTAP 'Upgrade and Revert/Downgrade Guide', which fully documents the ONTAP upgrade process, as well as the 'Release Notes', which document the changes in ONTAP for each version. Every ONTAP version has its own 'Upgrade and Revert/Downgrade Guide' and 'Release Notes', which document any version-specific information.
For more information, see the Data ONTAP 8 Product Documentation
For more information, see the ONTAP 9 Product Documentation
Select the appropriate software version on those pages and the following page will contain links to the 'Upgrade and Revert/Downgrade Guide' and 'Release Notes' for that version.
Upgrade Advisor action plan
It is also recommended to generate an Upgrade Advisor action plan. Upgrade Advisor action plans are custom-generated for a given cluster. Visit the NetApp Active IQ Web site to generate these action plans.
For assistance on troubleshooting issues with generating an Upgrade Advisor action plan, review the following KB:
Active IQ - Upgrade Advisor fails to generate
Troubleshooting upgrade problems
Complete wipe and re-initialization of the cluster
There are some limited cases where the storage administrator intends to completely wipe the ONTAP software, destroy all user data on all volumes and install a different version of ONTAP on the storage controller. Instead of using the ONTAP upgrade process to accomplish this, the ONTAP software can be installed, and the disks attached to the storage controller can be completely wiped of data and re-initialized, using the ONTAP special boot menu. Please note that this procedure is disruptive and will wipe the storage controller of ALL data.
For more information, see the following KB article:
How to perform a software installation from the Data ONTAP boot menu
After the software installation is complete, wipe the storage controller of all data using the following KB article:
How to wipe the configuration of a clustered Data ONTAP 8.x node and re-initialize it
Problems with downloading ONTAP Software Package File to the cluster from an accessible Web server
ONTAP uses each storage controller's node management logical interface (LIF) to connect to a reachable Web server in order to download the ONTAP software package file. If there are problems with the 'system image get', 'system image update -package' or 'cluster image package get' commands, this may indicate issues with the following:
- Looking up the IP address for the Web server in DNS:
Verify that DNS servers able to resolve the IP address of the Web server are configured for the admin SVM:
cluster1::> dns show -vserver
Test to see if the Web server hostname is resolvable:
cluster1::> set advanced
cluster1::*> vserver services name-service getxxbyyy gethostbyname -node
- Unable to connect to the web server:
Use the ping utility to ensure that the Web server is accessible from the node management LIF
Workaround: Run the 'system image get' command with IPv6 or run the 'cluster image package get' command with IPv4.
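The two checks above (name resolution first, then connectivity) can also be run from an administrative host as a rough cross-check. The following Python sketch is an illustration under stated assumptions: `diagnose_download_url` is a hypothetical helper, it runs on the host where the script executes and not on the node management LIF, and so a successful result does not prove that the storage controller itself can reach the Web server.

```python
import socket
from urllib.parse import urlsplit


def diagnose_download_url(url, resolve=socket.getaddrinfo,
                          connect=socket.create_connection, timeout=5.0):
    """Classify a download URL as 'invalid-url', 'dns-failure',
    'connect-failure' or 'ok'.

    `resolve` and `connect` default to the standard library calls and can be
    replaced with stand-ins, which keeps the ordering logic easy to exercise.
    """
    parts = urlsplit(url)
    host = parts.hostname
    if host is None:
        return "invalid-url"
    # Fall back to the scheme's well-known port when the URL gives none.
    port = parts.port or (443 if parts.scheme == "https" else 80)

    # Step 1: can the Web server hostname be resolved at all?
    try:
        resolve(host, port)
    except socket.gaierror:
        return "dns-failure"

    # Step 2: does a TCP connection to the Web server succeed?
    try:
        conn = connect((host, port), timeout=timeout)
    except OSError:
        return "connect-failure"
    conn.close()
    return "ok"
```

For example, `diagnose_download_url("http://web.example.com/image.tgz")` (a placeholder URL) returns `"dns-failure"` when the name cannot be resolved, which corresponds to the DNS check, and `"connect-failure"` when the name resolves but the server cannot be reached.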
Cluster image package get fails
If the 'cluster image package get' command to download the ONTAP package file fails, try running the 'system image get' (manual upgrade method) command to see if the package can be downloaded via the manual method. If so, this may indicate a failure with the ONTAP subsystem that manages the automated upgrade method.
To continue with the Automated Nondisruptive Upgrade (ANDU), run the 'cluster image update -version x.x' command, but first save the image to the cluster image repository.
To do this, run the following to move the image from the etc/software directory to the repository:
- Download the system image to the cluster repository:
::*> cluster image package get -url file:///mroot/etc/software/93P7_q_image.tgz
- Check to ensure cluster image repository now shows the ONTAP 9.3P7 image:
::*> cluster image package show-repository
- Check if each node has the image installed:
::> system node package show
- If some nodes are missing the image, log in directly to the management interface of each of those nodes to download the cluster image.
For instance, log in to node02's management LIF:
::> set advanced
::*> cluster image package get -url file:///mroot/etc/software/93P7_q_image.tgz
- Continue with the automated cluster upgrade:
cluster image update -version x.x
While using the manual upgrade method can serve as a workaround for upgrading the cluster, it is recommended to contact NetApp Technical Support for further assistance with troubleshooting the failure with using the automated update method.
Troubleshooting validation warning messages from the 'cluster image validate' command
The 'cluster image validate' (automated upgrade method) command performs a series of cluster-wide checks to ensure that the cluster can be upgraded non-disruptively. Any errors or warnings that the validation operation reports will prevent the automated upgrade from beginning. These must be resolved before continuing the upgrade. Refer to the 'Error-Action' field in the 'cluster image validate' output to identify the corrective action to take to resolve the errors or warnings. The following command can be run once the storage administrator has determined that any remaining errors or warnings can be safely ignored:
cluster1::> cluster image update -ignore-validation-warning true
| Ensure the nodes being updated are running same version of Data ONTAP | Seen during upgrade from 9.3 to 9.x in an MC config | Bug 1142709 |
Troubleshooting default boot image setting
The ONTAP operating system is installed on the boot media device of the storage controller. The default boot media device can store up to two ONTAP software images, one as the primary (default) boot image and the other as the secondary boot image. Typically, when a system boots the default boot image, that image is the active (current) boot image in use. The 'system image show' command lists the information for each boot image and whether it is the default and the current boot image.
cluster1::> system image show
Is Is Install
Node Image Default Current Version Date
-------- ------- ------- ------- ------------------------- -------------------
image1 false false 9.1P4 8/12/2017 09:11:43
image2 true true 9.1P7 8/31/2017 14:34:30
image1 false false 9.1P4 8/12/2017 09:15:21
image2 true true 9.1P7 8/31/2017 14:34:52
4 entries were displayed.
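When this output is collected from many nodes, a short script can flag images that are marked default but are not yet the current (booted) image. The following Python sketch assumes the row layout shown in the sample above; `parse_image_rows` and `default_not_yet_booted` are hypothetical helper names, and the node column is treated as optional because it is not present in the sample rows.

```python
import re
from typing import List, NamedTuple, Optional

# Assumed row layout, taken from the sample output above:
#   [node] imageN <is-default> <is-current> <version> <date> <time>
ROW = re.compile(
    r"^\s*(?:(?P<node>\S+)\s+)?(?P<image>image\d+)\s+"
    r"(?P<default>true|false)\s+(?P<current>true|false)\s+"
    r"(?P<version>\S+)\s+\S+\s+\S+\s*$"
)


class BootImage(NamedTuple):
    node: Optional[str]   # None when no node name has been seen yet
    image: str
    is_default: bool
    is_current: bool
    version: str


def parse_image_rows(text: str) -> List[BootImage]:
    """Parse image rows; headers, separators and the summary line are skipped."""
    images = []
    node = None
    for line in text.splitlines():
        match = ROW.match(line)
        if match is None:
            continue
        # A row without a node name belongs to the most recently seen node.
        node = match.group("node") or node
        images.append(BootImage(
            node,
            match.group("image"),
            match.group("default") == "true",
            match.group("current") == "true",
            match.group("version"),
        ))
    return images


def default_not_yet_booted(images: List[BootImage]) -> List[BootImage]:
    """Images set as default that the controller is not currently running."""
    return [img for img in images if img.is_default and not img.is_current]


SAMPLE = """\
image1 false false 9.1P4 8/12/2017 09:11:43
image2 true true 9.1P7 8/31/2017 14:34:30
image1 false false 9.1P4 8/12/2017 09:15:21
image2 true true 9.1P7 8/31/2017 14:34:52
4 entries were displayed.
"""

images = parse_image_rows(SAMPLE)
print(len(images), "image rows parsed")
print(len(default_not_yet_booted(images)), "default images not yet booted")
```

An image that is default but not current is expected between installing the new image and rebooting the controller. If it persists after the reboot, the controller did not load the new image, which is the situation described below.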
During an upgrade, the ONTAP software package is installed to the non-active boot image, which is then marked as the default boot image. However, this only takes effect after a clean shutdown of the ONTAP operating system during storage failover takeover of the storage controller. The 'Setting default boot image to' message should appear on the console of the storage controller that is being upgraded just prior to ONTAP shutdown. Here is an example of the messages seen:
Waiting for PIDS: 1244.
Setting default boot image to image2... done.
If the 'Setting default boot image to' message never appears, this may indicate that ONTAP was not able to cleanly shut down. The subsequent reboot will not load the image that was set as the default image and the storage controller will not undergo the upgrade. If this occurs, contact NetApp Technical Support for further assistance to determine why the storage controller was not able to cleanly shut down.
Resuming automated upgrades that are paused due to error
The automated update process pauses if it encounters an error. For example, if storage giveback fails for a storage controller for a reason such as a giveback veto, the automated update process shows 'pause-on-error'. The storage administrator must correct the error condition in order to continue the upgrade. Run the 'cluster image show-update-progress' command to identify why the automated update process was paused. The 'Comments' field identifies why the process was paused and may suggest corrective actions to take. Once the corrective action has been taken, the automated update process can be resumed by running the 'cluster image resume-update' command.
Troubleshooting ONTAP upgrade task failures
After a storage controller completes a reboot during an ONTAP upgrade, the system begins upgrading the controller's software configuration so that new software features can be made available once the entire cluster is completely upgraded. These tasks automatically run in the background. When logging into the storage controller after reboot, you may see a SYSTEM MESSAGE similar to the following, which indicates the controller is running these background tasks:
The upgrade of this node is in progress or not completed. The ability to provide
data service to clients is not affected while the upgrade completes. You can
check on the status of the upgrade by running "system node upgrade-revert show"
in advanced privilege mode. The status for this node should be listed as
'complete'. If the upgrade has stopped, you can restart the upgrade by running
"system node upgrade-revert upgrade" in advanced privilege mode. If this command
does not complete the node's upgrade, contact technical support immediately. The
node will be ready for management operations once the upgrade is completed
If these upgrade tasks are interrupted or encounter errors, the following SYSTEM MESSAGE may be observed:
The upgrade is not complete: an upgrade task aborted. This node is not fully
operational. Contact support personnel for the upgrade repair procedure.
One or more upgrade tasks on this node failed. This node is not fully
operational. Contact support personnel for the upgrade repair procedure.
To find the status of these upgrade tasks, run the following advanced privilege level commands:
cluster1::> set advanced
cluster1::*> system node upgrade-revert show
cluster1::*> system node upgrade-revert show -task-status
If there are any failed or aborted upgrade tasks, the following command may be used to restart or re-run those task(s):
cluster1::*> system node upgrade-revert upgrade -node
If the upgrade task continues to fail, then contact NetApp Technical Support for further assistance.
Troubleshooting mixed version message that appears during upgrade
When upgrading ONTAP on a cluster with more than two storage controllers, there is a period during which some storage controllers have completed the upgrade and others have not yet been upgraded. During this period, the cluster is considered to be in a mixed version state. When logging into the cluster, you may see the following SYSTEM MESSAGE displayed:
Warning: The cluster is in a mixed version state. Update all of the nodes to
the same version as soon as possible.
When the cluster is in a mixed version state, it continues to operate and behave as the older installed version, without the new features of the newer ONTAP version. Only once all storage controllers have been successfully upgraded to the new version is the entire cluster considered upgraded, with the new features available for use.
The version of ONTAP software is tracked in 3 ways:
- The version of the ONTAP software that is booted on the storage controller. This can be checked with the following command:
cluster1::> node run -node * -command version
- The effective version of ONTAP that the node configuration has been upgraded to. This can be checked with the following command:
cluster1::> version -node *
- The effective version of ONTAP that the cluster configuration has been upgraded to. This can be checked with the following command:
ONTAP is designed to remain operational and continue serving data during a mixed version state. However, it is not recommended to remain in a mixed version state for longer than the time it takes to upgrade the entire cluster. It is also highly discouraged to make any configuration changes to the cluster while it is in a mixed version state.
The cluster can also enter a mixed version state when a storage controller with a newer version of ONTAP is joined to a cluster that is running an older version. If this occurs, upgrade the rest of the cluster to the newer ONTAP version.
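Whether a cluster is in a mixed version state, and which nodes still need upgrading, can be worked out from a per-node version list. The following Python sketch is an illustration under stated assumptions: the caller has already extracted short version strings such as 9.1P7 from the command output, the node names are hypothetical, and `version_key` only understands the `major.minor[.patch][Pn]` form (other forms raise `ValueError`).

```python
import re
from typing import Dict, List, Tuple

# Assumed short version form: major.minor[.patch][Pn], e.g. 9.1P7 or 8.3.2P5.
VERSION = re.compile(r"^(\d+)\.(\d+)(?:\.(\d+))?(?:P(\d+))?$")


def version_key(version: str) -> Tuple[int, ...]:
    """Turn a version string into a tuple that compares numerically."""
    match = VERSION.match(version.strip())
    if match is None:
        raise ValueError(f"unsupported version format: {version!r}")
    # Missing patch or P-release parts count as 0.
    return tuple(int(part or 0) for part in match.groups())


def is_mixed_version(node_versions: Dict[str, str]) -> bool:
    """True when the nodes do not all report the same version."""
    return len({version_key(v) for v in node_versions.values()}) > 1


def nodes_to_upgrade(node_versions: Dict[str, str]) -> List[str]:
    """Nodes running anything older than the newest version in the cluster."""
    newest = max(version_key(v) for v in node_versions.values())
    return sorted(n for n, v in node_versions.items() if version_key(v) < newest)


# Hypothetical four-node cluster, halfway through an upgrade.
nodes = {
    "node01": "9.1P7",
    "node02": "9.1P7",
    "node03": "9.1P4",
    "node04": "9.1P4",
}
print("mixed version state:", is_mixed_version(nodes))
print("still to upgrade:", nodes_to_upgrade(nodes))
```

Comparing parsed tuples rather than raw strings avoids ordering mistakes such as treating 9.10 as older than 9.9. The same check covers the case where a newly joined controller runs a newer release than the rest of the cluster.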