Takeover not possible during ONTAP Upgrade due to Storage failover interconnect error and Disk inventory not exchanged
Applies to
- AFF-A20
- ONTAP 9
- Automated Non-Disruptive Upgrade (ANDU)
Issue
- During ANDU, one of the nodes in the HA pair is upgraded to ONTAP 9.18.1, while the partner node remains at 9.16.1Px.
- The upgrade process pauses with the following "Takeover is not possible" error:
Cluster::> storage failover show
                              Takeover
Node           Partner        Possible State Description
-------------- -------------- -------- -------------------------------------
Node-A         Node-B         false    Waiting for Node-B, Takeover is
                                       not possible: The version of
                                       software running on each node of the
                                       SFO pair is incompatible, Storage
                                       failover interconnect error, NVRAM
                                       log not synchronized, Disk inventory
                                       not exchanged
Node-B         Node-A         false    Waiting for Node-A, Takeover is
                                       not possible: The version of
                                       software running on each node of the
                                       SFO pair is incompatible, Storage
                                       failover interconnect error, NVRAM
                                       log not synchronized, Disk inventory
                                       not exchanged
- The HA interconnect links are up, but the RDMA connection is down:
::*> system ha interconnect status show
Node: Node-A
    Link 0 Status: up
    Link 1 Status: up
    Is Link 0 Active: true
    Is Link 1 Active: true
    IC RDMA Connection: down
Node: Node-B
    Link 0 Status: up
    Link 1 Status: up
    Is Link 0 Active: true
    Is Link 1 Active: true
    IC RDMA Connection: down
2 entries were displayed.
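The symptom above (both links up, RDMA connection down) can be checked from captured command output. The following is a minimal sketch, not a NetApp tool: it assumes the text of `system ha interconnect status show` has already been saved (for example from an SSH session) and that the field labels match the output quoted in this article. The function names `parse_ha_interconnect_status` and `links_up_but_rdma_down` are hypothetical.

```python
import re

# Field labels as they appear in the output quoted in this article.
FIELDS = ("Link 0 Status", "Link 1 Status", "IC RDMA Connection")


def parse_ha_interconnect_status(text):
    """Parse captured 'system ha interconnect status show' text into dicts."""
    nodes = []
    # One match per "Node:" record; re.S lets a record span several lines,
    # so both single-line and multi-line layouts are handled.
    for rec in re.finditer(r"Node:\s*(\S+)(.*?)(?=Node:\s|\Z)", text, re.S):
        entry = {"node": rec.group(1)}
        for field in FIELDS:
            m = re.search(re.escape(field) + r":\s*([A-Za-z]+)", rec.group(2))
            entry[field] = m.group(1).lower() if m else None
        nodes.append(entry)
    return nodes


def links_up_but_rdma_down(nodes):
    """Return node names whose links are both up while RDMA is down."""
    return [
        n["node"]
        for n in nodes
        if n["Link 0 Status"] == "up"
        and n["Link 1 Status"] == "up"
        and n["IC RDMA Connection"] == "down"
    ]


# Sample text copied from the output shown in this article.
SAMPLE = (
    "Node: Node-A Link 0 Status: up Link 1 Status: up "
    "Is Link 0 Active: true Is Link 1 Active: true IC RDMA Connection: down\n"
    "Node: Node-B Link 0 Status: up Link 1 Status: up "
    "Is Link 0 Active: true Is Link 1 Active: true IC RDMA Connection: down\n"
    "2 entries were displayed."
)

if __name__ == "__main__":
    print(links_up_but_rdma_down(parse_ha_interconnect_status(SAMPLE)))
```

A non-empty result means the physical links are healthy but the RDMA session is not established, which matches the condition described in this article.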
- The following errors are seen in the event logs:
[Node-A: cfdisk_config: cf.diskinventory.sendFailed:debug]: params: {'reason': 'HA Interconnect down', 'errorCode': '0'}
[Node-B: cfdisk_config: cf.diskinventory.sendFailed:debug]: params: {'reason': 'HA Interconnect down', 'errorCode': '0'}
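To confirm that both nodes report the same failure reason, the event lines can be tallied per node. This is an illustrative sketch only: it assumes the log text has the exact bracketed layout quoted in this article (node, source, event name, severity, then a `params` dictionary), and the names `EVENT_RE`, `parse_events`, and `send_failures_by_node` are hypothetical.

```python
import ast
import re
from collections import Counter

# Matches "[node: source: event.name:severity]: params: {...}" as quoted above.
EVENT_RE = re.compile(
    r"\[([^:\]]+): ([^:\]]+): ([\w.]+):(\w+)\]: params: (\{.*?\})"
)


def parse_events(text):
    """Extract every event record from the text, even if records share a line."""
    events = []
    for node, source, name, severity, params in EVENT_RE.findall(text):
        events.append(
            {
                "node": node,
                "source": source,
                "event": name,
                "severity": severity,
                # The params field is a Python-style dict literal in this log.
                "params": ast.literal_eval(params),
            }
        )
    return events


def send_failures_by_node(events):
    """Count cf.diskinventory.sendFailed events per (node, reason)."""
    counts = Counter()
    for e in events:
        if e["event"] == "cf.diskinventory.sendFailed":
            counts[(e["node"], e["params"].get("reason"))] += 1
    return counts


# Sample lines copied from the event log shown in this article.
SAMPLE_LOG = (
    "[Node-A: cfdisk_config: cf.diskinventory.sendFailed:debug]: params: "
    "{'reason': 'HA Interconnect down', 'errorCode': '0'}\n"
    "[Node-B: cfdisk_config: cf.diskinventory.sendFailed:debug]: params: "
    "{'reason': 'HA Interconnect down', 'errorCode': '0'}\n"
)

if __name__ == "__main__":
    for (node, reason), count in send_failures_by_node(parse_events(SAMPLE_LOG)).items():
        print(node, reason, count)
```

If every node reports the reason `HA Interconnect down`, the disk inventory exchange is failing because of the interconnect state rather than a disk-level problem.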
- Reseating the HA cables does not change the RDMA connection status.
- Manual takeover fails with the following error:
::> storage failover takeover -ofnode Node-A
Warning: A takeover will be initiated. When the partner node reboots, a giveback will be automatically initiated. Do you want to continue? {y|n}: y
Error: command failed: Failed to initiate takeover. Reason: Disk inventory information not received yet
- An attempt to take over while allowing the disk inventory mismatch and the version mismatch also fails with the following error:
::*> storage failover takeover -ofnode Node-A -allow-disk-inventory-mismatch true -option allow-version-mismatch
Warning: A takeover will be initiated. When the partner node reboots, a giveback will be automatically initiated. Do you want to continue? {y|n}: y
Warning: Initiating a takeover while node cannot see some of the partner disks is not recommended. Do you want to continue? {y|n}: y
Warning: Initiating a takeover while the partner is running a mismatched ONTAP version is not recommended, unless you are performing a nondisruptive upgrade or downgrade. Do you want to continue? {y|n}: y
Error: command failed: Failed to initiate takeover. Reason: Partner is not UP .
