NetApp Knowledgebase

What are the ONTAP quorum considerations for two node clusters?

Applies to

  • Clustered Data ONTAP 8
  • ONTAP 9

Answer

In a two-node cluster, a node decides by default whether it can serve data by checking whether its partner is reachable over the interconnect or has been taken over. If the partner is neither responding over the interconnect nor taken over, the node stops serving data even though it is online. This mechanism is called 'cluster HA' and is the default supported configuration.

Cluster HA can be disabled, in which case nodes use quorum voting to determine their status in the cluster and avoid split brain. With cluster HA disabled, a node stops serving data when it cannot communicate with its partner, whether or not the partner is taken over. Only a node that holds epsilon continues to serve data when the partner is unavailable over the cluster network or is taken over.

There is no concept of epsilon when cluster HA is enabled. In that configuration, only a takeover of the partner guarantees access to a healthy node when communication with the partner fails.

Examples:

  • There are 2 nodes in a cluster, both can communicate with their partner, cluster HA is enabled, and storage failover is enabled as well.
    Both nodes should be serving data just fine.
  • Cluster HA is enabled on the cluster, node 2 is taken over as a result of a panic.
  • Cluster HA is disabled on a cluster, node 2 is taken over as a result of maintenance. No node has epsilon.
  • Cluster HA is enabled, node 2 was shut down for maintenance using halt with -inhibit-takeover true.
  • Cluster HA is disabled, node 2 is epsilon and taken over by node 1.
  • Cluster HA is disabled, node 1 is epsilon, and has node 2 taken over.
  • Cluster HA is disabled, node 1 is epsilon, and node 2 is halted without takeover.

In which of the above cases does node 1 serve data?
In which of the above cases is node 2's data also available?
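
To check answers against the rules stated in this article, the decision can be sketched as a small Python model. This is only a toy encoding of the article's wording, not ONTAP's actual quorum implementation; the function name and its parameters are hypothetical and exist only for illustration.

```python
def node_serves_data(cluster_ha: bool,
                     partner_reachable: bool,
                     partner_taken_over: bool,
                     has_epsilon: bool = False) -> bool:
    """Toy model of whether one node in a two-node cluster keeps serving data.

    cluster_ha         -- True if cluster HA is configured
    partner_reachable  -- True if this node can communicate with its partner
    partner_taken_over -- True if this node has taken over its partner
    has_epsilon        -- True if this node holds epsilon (only used when
                          cluster HA is disabled)
    """
    if cluster_ha:
        # Cluster HA enabled: epsilon plays no role. The node serves data
        # if the partner answers over the interconnect or is taken over.
        return partner_reachable or partner_taken_over
    # Cluster HA disabled: quorum voting. A node that cannot communicate
    # with its partner keeps serving data only if it holds epsilon,
    # whether or not the partner is taken over.
    return partner_reachable or has_epsilon


# Cluster HA enabled, partner taken over: the surviving node serves data.
print(node_serves_data(True, partner_reachable=False, partner_taken_over=True))   # True
# Cluster HA enabled, partner silent and not taken over: the node stops serving.
print(node_serves_data(True, partner_reachable=False, partner_taken_over=False))  # False
# Cluster HA disabled, partner unreachable, this node holds epsilon.
print(node_serves_data(False, partner_reachable=False, partner_taken_over=False,
                       has_epsilon=True))                                         # True
```

The model answers only the first question (whether a given node serves data). Whether the partner's data is also available additionally depends on whether a takeover or relocation actually moved that data to the serving node.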

The following example covers a two-node cluster with a broken interconnect, where you need to reseat the motherboard of one node and cannot perform a takeover.
First, move all the data from the node that you want to take down to its partner node by performing a storage aggregate relocation in clustered Data ONTAP 8.2:

::> storage aggregate relocation start -node node0 -destination node1 -aggregate-list aggr1,aggr2

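Alternatively, use volume move to relocate the data. With -perform-validation-only true, as shown here, the command only runs the validation checks; omit that parameter to start the actual move: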

::> volume move start -vserver vs0 -volume volume_test -destination-aggregate dest_aggr -perform-validation-only true

After all the data is moved, cluster HA must be set to false. First confirm that it is currently set to true by running the following command:

::> cluster ha show
High Availability Configured: true

To change HA to false, run the following command:

::> cluster ha modify -configured false

Once cluster HA is disabled, assign epsilon to the node that should survive by running the following commands at the diagnostic privilege level:

::> set -privilege diagnostic

::*> cluster show
Node                 Health  Eligibility  Epsilon
-------------------- ------- ------------ ------------
node0                true    true         false
node1                true    true         false

::*> cluster modify -node node1 -epsilon true

::*> cluster show
Node                 Health  Eligibility  Epsilon
-------------------- ------- ------------ ------------
node0                true    true         false
node1                true    true         true

With epsilon set for node1, halt node0 and reseat the motherboard.
