Cluster ports down after ONTAP upgrade or reboot
Applies to
- ONTAP 9
- Automated nondisruptive upgrade (ANDU)
- FAS/AFF/ASA systems with on-board cluster ports
- Switched or switchless clusters
Issue
- Cluster ports show as down during an ONTAP upgrade, after the first reboot into the new ONTAP version. This can also be seen after a planned or unplanned reboot of the node.
- The "storage failover show" command displays the following state description:
Node           Partner        Possible State Description
-------------- -------------- -------- -------------------------------------
node-01        node-02        true     Connected to node-02, Partial
                                       giveback
node-02        node-01        true     Connected to node-01. Waiting for
                                       cluster applications to come online
                                       on the local node. Offline
                                       applications: mgmt, vldb, vifmgr,
                                       bcomd, crs.
- The "cluster image show-update-progress" command shows output similar to the following:
Cluster::*> cluster image show-update-progress
                                             Estimated         Elapsed
Update Phase         Status                   Duration        Duration
-------------------- ----------------- --------------- ---------------
Pre-update checks    completed                00:10:00        00:01:04
Data ONTAP updates   paused-on-error          01:32:00    1 days 02:39
Details:
Node name            Status            Status Description
-------------------- ----------------- --------------------------------------
node-01              waiting
node-02              failed            Error: Node "node-02" is not in
                                       "connected" state after giveback.
                                       Action: Use the "storage failover
                                       show" command to verify that node
                                       "node-02" is in the "connected"
                                       state.
Status: Paused - An error occurred in "Data ONTAP updates" phase. The update cannot continue until the error has been resolved. Resolve all issues, then use the "cluster image resume-update" command to resume the update.
- The output of "cluster ring show" from the node that is up cannot list the entries for the affected node:
::*> cluster ring show
Node      UnitName Epoch    DB Epoch DB Trnxs Master    Online
--------- -------- -------- -------- -------- --------- ---------
node-01   mgmt     6        6        41       node-01   master
node-01   vldb     6        6        3        node-01   master
node-01   vifmgr   6        6        30       node-01   master
node-01   bcomd    6        6        1        node-01   master
node-01   crs      6        6        1        node-01   master
Warning: Unable to list entries on node node-02. RPC: Couldn't make connection [from mgwd on node "node-01"
(VSID: -1) to mgwd at 169.254.200.103]
The rings show offline locally on the affected node:
::*> cluster ring show
Node      UnitName Epoch    DB Epoch DB Trnxs Master    Online
--------- -------- -------- -------- -------- --------- ---------
node-02   mgmt     0        3        1165     -         offline
node-02   vldb     0        3        39       -         offline
node-02   vifmgr   0        3        248      -         offline
node-02   bcomd    0        3        16       -         offline
node-02   crs      0        3        8        -         offline
node-02   mgmt     -        -        -        -         -
6 entries were displayed.
Warning: Unable to list entries on node node-01. RPC: Couldn't make connection [from mgwd on node "node-02"
(VSID: -1) to mgwd at 169.254.200.103]
5 entries were displayed.
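When comparing saved CLI output from several nodes, the two symptoms in the "cluster ring show" output above (rings reported offline, and a warning that a peer node cannot be listed) can be picked out with a small script. The following is a minimal Python sketch, not a NetApp tool: the name summarize_rings is hypothetical, it only parses text that was already captured from the CLI, and it assumes the column layout and the five ring names (mgmt, vldb, vifmgr, bcomd, crs) shown in this article.

```python
import re

# Hypothetical helper for illustration only. It parses text captured from
# "cluster ring show"; it does not connect to ONTAP.
# Assumption: columns are Node, UnitName, Epoch, DB Epoch, DB Trnxs,
# Master, Online, and ring names are the five listed in this article.
RING_ROW = re.compile(
    r"^(?P<node>\S+)\s+(?P<unit>mgmt|vldb|vifmgr|bcomd|crs)\s+"
    r"(?P<epoch>\S+)\s+(?P<db_epoch>\S+)\s+(?P<db_trnxs>\S+)\s+"
    r"(?P<master>\S+)\s+(?P<online>\S+)\s*$"
)

# Matches the warning printed when a peer node cannot be reached over RPC.
UNREACHABLE = re.compile(r"Unable to list entries on node (?P<node>[^\s.]+)\.")


def summarize_rings(output):
    """Return (offline, unreachable) from captured 'cluster ring show' text.

    offline     -- list of (node, ring) tuples whose Online column is "offline"
    unreachable -- list of node names named in "Unable to list entries" warnings
    """
    offline = []
    unreachable = []
    for raw in output.splitlines():
        line = raw.strip()
        row = RING_ROW.match(line)
        if row:
            if row.group("online") == "offline":
                offline.append((row.group("node"), row.group("unit")))
            continue
        warning = UNREACHABLE.search(line)
        if warning:
            unreachable.append(warning.group("node"))
    return offline, unreachable
```

Given the output captured on the affected node, the function would report the five node-02 rings as offline and node-01 as unreachable. Header, separator, and placeholder rows are skipped because they do not match the row pattern or do not carry the "offline" value.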