Waiting for reservations to clear after ANDU panic reboot
Applies to
- ONTAP 9
- Platforms with shared Cluster/HA interconnect ports (see "What AFF, ASA and FAS platforms use shared Cluster and HA Ethernet ports?")
- Automated Nondisruptive Upgrade (ANDU)
Issue
- During an ANDU ONTAP upgrade, the first node is taken over, completes its upgrade, and reboots, but the HA interconnect link is then down
- The takeover node may have panicked or been rebooted at some point after the takeover
- The upgraded node boots only as far as "waiting for reservations to clear", so a giveback is not possible because the node never reaches "Waiting for giveback"
- The ANDU process stops and cannot be resumed because there is no working HA interconnect
- Both physical links of the HA interconnect are UP and the cluster ports are working normally
- Only the RDMA link is down:
cluster::> system ha interconnect status show
Node: node1
Link 0 Status: up
Link 1 Status: up
Is Link 0 Active: true
Is Link 1 Active: true
IC RDMA Connection: down
Warning: Unable to list entries on node "node2". RPC: Couldn't make connection [from mgwd on node "node1" (VSID: -1) to kernel at 169.254.200.103]
Error: show failed: RPC: Couldn't make connection [from mgwd on node "node1" (VSID: -1) to kernel at 169.254.200.103]
- "waiting for reservations to clear" is shown on node2's console
- During troubleshooting, rebooting the down node, rebooting the up node, reseating the interconnect cables, and rebooting the cluster switches have no effect on the RDMA link status
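The symptom pattern described above (both physical links up, RDMA connection down) can be checked mechanically against saved command output. The following is a minimal sketch only, not a NetApp tool: it assumes the output of "system ha interconnect status show" has been captured as plain text in the layout shown in this article, and the names SAMPLE, parse_ic_status, and matches_signature are hypothetical helpers introduced for illustration.

```python
# Sketch: detect "physical links up, RDMA down" in captured CLI output.
# Assumes the "Key: value" layout shown in this article; other ONTAP
# releases may format the output differently.

SAMPLE = """\
Node: node1
Link 0 Status: up
Link 1 Status: up
Is Link 0 Active: true
Is Link 1 Active: true
IC RDMA Connection: down
"""


def parse_ic_status(text):
    """Return a dict of 'Key: value' pairs found in the captured output."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:  # skip lines that contain no colon
            fields[key.strip()] = value.strip()
    return fields


def matches_signature(text):
    """True when both physical links report up but the RDMA connection is down."""
    fields = parse_ic_status(text)
    links_up = (
        fields.get("Link 0 Status") == "up"
        and fields.get("Link 1 Status") == "up"
    )
    rdma_down = fields.get("IC RDMA Connection") == "down"
    return links_up and rdma_down


if __name__ == "__main__":
    print(matches_signature(SAMPLE))
```

A match only indicates that the output resembles the symptom listed here; it does not by itself confirm the cause.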
