This article lists most of the ONTAP upgrade operational and troubleshooting workflows; it is not a comprehensive list.
Use it to narrow your search to the more commonly used troubleshooting KBs, grouped by category.
ONTAP upgrades typically consist of a multi-step process. Every node in an ONTAP cluster that is upgraded must go through the following steps:
- Download of the ONTAP software package file to the cluster from an accessible web server.
- Installation of the ONTAP software package file as the secondary boot image on the storage controller's boot media.
- Setting the newly installed boot image as primary boot image.
- Rebooting the storage controller (via the storage failover takeover process) to load the newly installed primary boot image.
- After the controller has completed its reboot, ONTAP upgrade tasks are executed automatically.
- Once the ONTAP upgrade tasks are completed, the storage controller is effectively upgraded.
To resolve failures that occur during the upgrade process, it is crucial to identify which step in the process described above has failed.
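The step sequence above can be treated as a simple ordered model. The following Python sketch is illustrative only: the `UpgradeStep` names and the `first_incomplete_step` helper are hypothetical and are not part of ONTAP or any NetApp tooling. It shows how to find the earliest step a node has not completed, which is where troubleshooting would start.

```python
from enum import IntEnum
from typing import Dict, Optional, Set


class UpgradeStep(IntEnum):
    """Per-node upgrade steps, in the order described in this article."""
    DOWNLOAD_PACKAGE = 1
    INSTALL_SECONDARY_IMAGE = 2
    SET_PRIMARY_IMAGE = 3
    REBOOT_VIA_TAKEOVER = 4
    RUN_UPGRADE_TASKS = 5


def first_incomplete_step(completed: Set[UpgradeStep]) -> Optional[UpgradeStep]:
    """Return the earliest step that is not completed, or None if all are done."""
    for step in UpgradeStep:  # enums iterate in definition order
        if step not in completed:
            return step
    return None


# Hypothetical progress records for two nodes.
node_progress: Dict[str, Set[UpgradeStep]] = {
    "node01": set(UpgradeStep),
    "node02": {UpgradeStep.DOWNLOAD_PACKAGE, UpgradeStep.INSTALL_SECONDARY_IMAGE},
}
stalled = {name: first_incomplete_step(done) for name, done in node_progress.items()}
```

In this example `node01` has finished every step, while `node02` stopped before the new image was set as the primary boot image.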
Methods of Performing ONTAP Upgrades
With Clustered Data ONTAP 8.3 and newer, there are two methods of performing ONTAP upgrades non-disruptively:
- Manual Upgrade: Sometimes referred to as a non-disruptive upgrade, or NDU, the manual upgrade process involves many steps run on each storage controller individually. These steps are performed by the storage administrator. Depending on the number of storage controllers configured in the cluster and the version of ONTAP being installed, the steps can be performed individually on each storage controller in a "rolling upgrade" manner or in parallel using the "batch upgrade" process.
- Automated Upgrade: Referred to as an automated non-disruptive upgrade, or ANDU, the automated process greatly reduces the number of steps involved in a cluster-wide ONTAP upgrade. The automated upgrade system performs all the manual steps required for a cluster-wide ONTAP upgrade. Depending on the number of storage controllers configured in the cluster and the version of ONTAP being installed, the automated upgrade process will perform the upgrade in a "rolling upgrade" manner or a "batch upgrade" manner.
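The difference between the "rolling" and "batch" approaches can be pictured as a difference in how nodes are grouped into upgrade waves. The Python sketch below is a simplified illustration under stated assumptions: the function names are hypothetical, the default of two batches is an assumption, and the code does not reproduce how ONTAP actually schedules nodes. It assumes nodes are organized in HA pairs and that only one node of a pair is upgraded at a time, so its partner can keep serving data.

```python
from typing import List, Tuple

HAPair = Tuple[str, str]


def rolling_order(pairs: List[HAPair]) -> List[List[str]]:
    """Rolling: one node per wave, each upgraded while its partner serves data."""
    return [[node] for pair in pairs for node in pair]


def batch_order(pairs: List[HAPair], batches: int = 2) -> List[List[str]]:
    """Batch: split HA pairs into batches; within a batch, upgrade the first
    node of every pair together, then the partner nodes together."""
    if not pairs:
        return []
    size = -(-len(pairs) // batches)  # ceiling division
    waves: List[List[str]] = []
    for start in range(0, len(pairs), size):
        group = pairs[start:start + size]
        waves.append([first for first, _ in group])
        waves.append([second for _, second in group])
    return waves


# Hypothetical eight-node cluster made of four HA pairs.
cluster = [("n1", "n2"), ("n3", "n4"), ("n5", "n6"), ("n7", "n8")]
rolling_waves = rolling_order(cluster)
batch_waves = batch_order(cluster)
```

With these inputs the rolling model needs eight waves and the batch model needs four, which is why batch upgrades are attractive on larger clusters.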
Documentation
It is highly recommended to review the ONTAP "Upgrade and Revert/Downgrade Guide", which fully documents the ONTAP upgrade process, as well as the "Release Notes", which document the changes in ONTAP for each version. Every ONTAP version has its own "Upgrade and Revert/Downgrade Guide" and "Release Notes" covering version-specific information.
The Data ONTAP 8 documentation can be found here: Data ONTAP 8 Product Documentation
The ONTAP 9 documentation can be found here: ONTAP 9 Product Documentation
Select the appropriate software version on those pages; the page that follows contains links to the "Upgrade and Revert/Downgrade Guide" and "Release Notes" for that version.
Upgrade Advisor Action Plan
It is also recommended to generate an Upgrade Advisor action plan. Upgrade Advisor action plans are custom-generated for a given cluster. Please visit the NetApp Active IQ website to generate them.
For assistance with troubleshooting problems generating an Upgrade Advisor action plan, please review the following KB:
Active IQ - Upgrade Advisor fails to generate
Troubleshooting Upgrade Problems
Complete Wipe And Re-initialization Of The Cluster
In some limited cases, the storage administrator intends to completely wipe the ONTAP software, destroy all user data on all volumes, and install a different version of ONTAP on the storage controller. Instead of using the ONTAP upgrade process to accomplish this, the ONTAP software can be installed and the disks attached to the storage controller can be completely wiped of data and re-initialized using the ONTAP special boot menu. Please note that this procedure is disruptive and will wipe the storage controller of ALL data.
Refer to this KB article:
How to perform a software installation from the Data ONTAP boot menu
After the software installation is complete, wipe the storage controller of all data using this KB article:
How to wipe the configuration of a clustered Data ONTAP 8.x node and re-initialize it
Problems with Downloading ONTAP Software Package File To The Cluster From An Accessible Web Server
ONTAP uses each storage controller's node management logical interface (LIF) to connect to a reachable web server in order to download the ONTAP software package file. If there are problems with the "system image get", "system image update -package
cluster1::> dns show -vserver
Test to see if the web server hostname is resolvable:
cluster1::> set advanced
cluster1::*> vserver services name-service getxxbyyy gethostbyname -node
"system image get"with IPv6 or use
"cluster image package get
Cluster image package get fails
If the "cluster image package get" command fails to download the ONTAP package file, try the "system image get" (manual upgrade method) command to see whether the package can be downloaded via the manual method. If it can, this may indicate a failure in the ONTAP subsystem that manages the automated upgrade method.
To continue with the automated nondisruptive upgrade (ANDU), you must use the "cluster image update -version x.x" command, which requires the image to be saved in the cluster image repository.
To do this, move the image from the etc/software directory to the repository as follows.
Download the system image to the cluster repository:
Check to ensure cluster image repository now shows the ONTAP 9.3P7 image:
Check if each node has the image installed:
If some nodes are missing the image, log directly into the management interface of the affected node to download the cluster image.
For instance, log in to node02's management LIF.
Continue with the automated cluster upgrade:
"cluster image update -version x.x"
While the manual upgrade method can serve as a workaround for upgrading the cluster, it is recommended to contact NetApp Technical Support for further assistance with troubleshooting the failure of the automated update method.
Troubleshooting Validation Warning Messages From "cluster image validate" Command
The "cluster image validate" (automated upgrade method) command performs a series of cluster-wide checks to ensure that the cluster can be upgraded non-disruptively. Any errors or warnings that the validation operation reports will prevent the automated upgrade from beginning. These must be resolved before continuing the upgrade. Please refer to the "Error-Action" field in the "cluster image validate" output to identify the corrective action to take to resolve the errors or warnings. The following command can be run once the storage administrator has determined that any remaining errors or warnings can be safely ignored:
cluster1::> cluster image update -ignore-validation-warning true
|"Ensure the nodes being updated are running same version of Data ONTAP."||seen during upgrade from 9.3 to 9.x in an MC config||Bug 1142709|
Troubleshooting Default Boot Image Setting
The ONTAP operating system is installed on the boot media device of the storage controller. The default boot media device can store up to two ONTAP software images, one as the primary (default) boot image and the other as the secondary boot image. Typically, when a system boots the default boot image, that image is the active (current) boot image in use. The "system image show" command lists the information for each boot image and whether it is the default and current boot image.
During an upgrade, the ONTAP software package is installed to the non-active boot image, which is then marked as the default boot image. However, this only takes effect after a clean shutdown of the ONTAP operating system during storage failover takeover of the storage controller. The "Setting default boot image to" message should appear on the console of the storage controller that is being upgraded just prior to ONTAP shutdown. Here is an example of the messages seen:
If the "Setting default boot image to" message never appears, this may indicate that ONTAP was not able to shut down cleanly. The subsequent reboot will not load the image that was set as the default image, and the storage controller will not undergo the upgrade. If this occurs, please contact NetApp Technical Support for further assistance in determining why the storage controller was not able to shut down cleanly.
Resuming Automated Upgrades That Are Paused Due To Error
The automated update process will pause if it encounters an error. For example, if storage giveback fails for a storage controller for some reason, such as a giveback veto, the automated update process will show "pause-on-error". The storage administrator must correct the error condition in order to continue the upgrade. Use the "cluster image show-update-progress" command to identify why the automated update process was paused. The "Comments" field will identify why the automated update process was paused and possibly suggest corrective actions to take. Once the corrective action has been taken, the automated update process can be resumed by using the "cluster image resume-update" command.
Troubleshooting ONTAP Upgrade Task Failures
After a storage controller completes a reboot during an ONTAP upgrade, the system begins upgrading the controller's software configuration so that new software features can be made available once the entire cluster is completely upgraded. These tasks run automatically in the background. When logging into the storage controller after the reboot, you may see a SYSTEM MESSAGE like this, which indicates the controller is running these background tasks:
If these upgrade tasks were interrupted or encounter errors, the following SYSTEM MESSAGE may be observed:
The status of these upgrade tasks can be found by running the following advanced privilege level commands:
If there are any failed or aborted upgrade tasks, the following command may be used to restart or re-run those tasks:
If the upgrade task continues to fail, then please contact NetApp Technical Support for further assistance.
Troubleshooting Mixed Version Message That Appears During Upgrade
When upgrading ONTAP on clusters with more than two storage controllers, there is a period in which some storage controllers have completed the upgrade and others have yet to be upgraded. During this period the cluster is considered to be in a mixed version state. When logging into the cluster, you may see the following SYSTEM MESSAGE displayed:
When the cluster is in a mixed version state, the cluster continues to operate and behave as the old version installed without the new features of the newer ONTAP version. Only once all storage controllers have successfully upgraded to the new version is the entire cluster considered upgraded and new features are available for use.
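The mixed version rule described above can be expressed in a few lines. The Python sketch below is a simplified illustration: the helper names are hypothetical, the node names and versions are made up, and the parser only handles plain numeric versions such as "9.3" (patch suffixes such as "P7" are not handled). It models the cluster as behaving like the oldest version present until every node reports the same version.

```python
from typing import Dict, Tuple


def parse_version(version: str) -> Tuple[int, ...]:
    """Turn a plain numeric version like '9.10' into (9, 10) for comparison."""
    return tuple(int(part) for part in version.split("."))


def is_mixed_version(node_versions: Dict[str, str]) -> bool:
    """True when the nodes do not all run the same version."""
    return len({parse_version(v) for v in node_versions.values()}) > 1


def effective_cluster_version(node_versions: Dict[str, str]) -> str:
    """In this model the cluster behaves as the oldest version present."""
    return min(node_versions.values(), key=parse_version)


# Hypothetical four-node cluster part-way through an upgrade.
nodes = {"node01": "9.4", "node02": "9.4", "node03": "9.3", "node04": "9.3"}
mixed = is_mixed_version(nodes)
effective = effective_cluster_version(nodes)
```

Comparing parsed tuples rather than raw strings matters once minor versions reach two digits, since "9.10" sorts before "9.9" as text.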
The version of ONTAP software is tracked in 3 ways:
2) The effective version of ONTAP that the node configuration has been upgraded to. This can be checked with the following command:
3) The effective version of ONTAP that the cluster configuration has been upgraded to. This can be checked with the following command:
ONTAP is designed to remain operational and continue serving data during a mixed version state. However, it is not recommended to remain in a mixed version state for longer than the time it takes to upgrade the entire cluster. Making configuration changes to the cluster while it is in a mixed version state is also strongly discouraged.
The cluster can also enter a mixed version state when a storage controller with a newer version of ONTAP is joined to a cluster running an older version. If this occurs, upgrade the rest of the cluster to the newer ONTAP version.