Troubleshooting Workflow: MHost RDB apps out of quorum
Applies to
- ONTAP 9
Issue
Commands and services might stop functioning or function in a limited capacity when an RDB application is Out-Of-Quorum (that is, 'Local unit offline').
This is typically a transitional state, caused by a network partition or by the health of the remote or local node.
The RDB cluster configuration consists of a defined set of replication sites (nodes), all of which are known to one another. Cluster membership and configuration are stored within the replicated file /var/rdb/_sitelist. All RDB applications or rings (mgwd, vldb, vifmgr, bcomd, and so on) share the sitelist configuration.
_sitelist (cluster configuration data) is automatically replicated within the system. The contents include the following:
- Version
- Cluster UUID
- List of sites
Each site has an ID, a hostname, a pair of cluster IP addresses, and a state (eligible/ineligible). The eligibility setting governs whether the site will participate in quorum formation; this is an administrative choice. Additionally, one site might be designated as holding 'epsilon', an extra partial vote that allows quorum to form with only half the sites. 'Epsilon' is not the same as 'master'. In the two-node HA mode, _sitelist contains the HA_CONFIG attribute; this implies that completely different rules are in effect for quorum handling.
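For example, each node's eligibility and epsilon settings can be checked from the clustershell; the -fields form below is the generic ONTAP CLI syntax and is a sketch only (exact field names can vary by release):
::*> cluster show -fields eligibility,epsilon
A node marked ineligible does not participate in quorum formation for any ring.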
A quorum is a connected majority of like RDB apps, with one instance selected as the master. The master is usually one of the first several instances in the _sitelist. Each replication ring operates completely independently of the other rings. It is normal for different rings to have different masters, though they are typically located on the same node.
A node that is Out of Quorum (OOQ) is not a participating member of a quorum. Either it has not yet participated in quorum formation (for example, it is just booting up), or it has lost contact with the master, either because it has taken itself OOQ or because the master has pushed it OOQ.
In the offline state, databases cannot be written or updated by the master of a quorum. However, a local point-in-time read-only copy of the databases is available. How useful the read-only copy is depends on the specific RDB app. For example, the vldb might continue to answer queries from the N-blade while offline. Consult the owners of the various apps for specifics.
RDB apps compete with the D-blade and N-blade for CPU and I/O cycles. The system is not a real-time system and does not have Service Level Agreements (SLAs are planned for a future release). Therefore, RDB apps occasionally go OOQ on heavily loaded systems. This condition is not a bug.
When a local or remote node is OOQ, CLI commands can fail with 'Local unit offline' in the error message (some commands automatically retry when offline is encountered, without the admin being aware). When this happens, retry the command before digging deeper, as the condition is usually transitory.
If any of these issues occur on the node hosting the master, all apps go offline momentarily until a new master is elected.
Advanced commands:
To investigate the state of the quorum for all rings, use the advanced privilege level command cluster ring show.
::>set advanced
::*> cluster ring show
Node           UnitName Epoch    DB Epoch DB Trnxs Master         Online
-------------- -------- -------- -------- -------- -------------- ---------
csiptc-2240-09 mgmt     88       88       917522   csiptc-2240-09 master
csiptc-2240-09 vldb     90       90       3889     csiptc-2240-09 master
csiptc-2240-09 vifmgr   87       87       308046   csiptc-2240-09 master
csiptc-2240-09 bcomd    86       86       10       csiptc-2240-09 master
csiptc-2240-09 crs      87       87       107      csiptc-2240-09 master
csiptc-2240-10 mgmt     88       88       917522   csiptc-2240-09 secondary
csiptc-2240-10 vldb     90       90       3889     csiptc-2240-09 secondary
csiptc-2240-10 vifmgr   87       87       308046   csiptc-2240-09 secondary
csiptc-2240-10 bcomd    86       86       10       csiptc-2240-09 secondary
csiptc-2240-10 crs      87       87       107      csiptc-2240-09 secondary
10 entries were displayed.
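To narrow the view to a single ring or a single field, the same command accepts filters; the examples below assume the unitname and master field names shown in the column headers above:
::*> cluster ring show -unitname vldb
::*> cluster ring show -fields master
A node whose Online column shows something other than master or secondary for a ring is the place to start investigating.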
The cluster show command displays only the quorum state of mgwd (use cluster ring show and rdb_dump for all rings).
::*> cluster show
Node                 Health  Eligibility  Epsilon
-------------------- ------- ------------ ------------
csiptc-2240-09       true    true         false
csiptc-2240-10       true    true         false
2 entries were displayed.
Systemshell commands:
- Enter diagnostic mode:
set diag
- Enter the systemshell on the appropriate node (you may have to unlock the diag user account first, as shown in the example after this list):
systemshell -node <node-name>
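A typical sequence looks like the following; the security login commands are the standard way to unlock the diag account and set its password, but exact steps can vary by release, and the node name is only an example:
::> set diag
::*> security login unlock -username diag
::*> security login password -username diag
::*> systemshell -node csiptc-2240-09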
To investigate the state of the quorum when mgwd is not running, run rdb_dump from the FreeBSD shell. From any cluster node, use the tool to extract current state information for any or all of the RDB applications. The typical technique is to cat /var/rdb/_sitelist, then use the rdb_dump tool to investigate by directing it at the IP addresses of interest (or at localhost), as sketched after the list below. rdb_dump is capable of showing:
- Overall health
- Transaction flow
- Database versions
- Various components and internals
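A minimal sketch of that technique from the systemshell; the cluster IP address here is hypothetical, so substitute addresses found in your own _sitelist:
% cat /var/rdb/_sitelist
% rdb_dump vldb                # no host given: query localhost
% rdb_dump 169.254.21.34 vldb  # query the vldb instance at a sitelist cluster IP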
Type rdb_dump -h for a list of options. Note that all rdb_dump output is from the point of view of the process being queried.
% rdb_dump -h
rdb_dump [<host>] [options] <unit>*
-h - help
-c [n] - continuous with n sec delay (default 3)
-v - verbose; all options other than 'c'
-e - environment vars
-f - configuration info
-x - internal developer info on selected components
-u - Local Unit
-d - individual database summary
-q - Quorum Mgr
-r - Recovery Mgr
-t - Transaction Mgr
-z - Call exportHealth API to query health at a node.
[<host>] - Name or IP, default localhost.
<unit>* - select from: vldb, management, vifmgr, bcomd, t1, smfpilot (test units).
if omitted, dumps all product units on the host.
Options may be combined, e.g., '-qrtx'.
rdb_dump shows cluster configuration and health information from the perspective of an individual unit.
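For example, to dump the quorum, recovery, and transaction manager state of the vldb unit (options combined as the help text allows), first locally and then at a remote node by name (the node name is from the example cluster above):
% rdb_dump -qrt vldb
% rdb_dump csiptc-2240-10 -qrt vldb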
Notes:
- 'Master' is a dynamic role, and 'epsilon' is a configuration setting. It often occurs that the master and epsilon sites differ.
- The replication groups (vldb, vifmgr, bcomd, mgwd) operate independently. There may be different masters and health information for each. However, the configuration info should be shared.
Configuration
To analyze inter-box issues:
- Check that the environment and configuration (-e and -f) match as expected (see the comparison sketch after this list).
- Check that the various unit instances agree on configuration.
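One way to perform both checks, assuming the two hypothetical cluster IPs below belong to different nodes, is to capture the -e and -f output from each node and compare:
% rdb_dump 169.254.21.34 -ef vldb > /tmp/a
% rdb_dump 169.254.21.35 -ef vldb > /tmp/b
% diff /tmp/a /tmp/b
Any difference in the configuration sections is worth investigating before looking at health.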
Health (by default)
Given the correct configuration, health information will summarize the status of the replication group.
Note: Health obtained from the master is always the most accurate; there is a slight delay in propagating secondary info to other secondaries, but they will come into agreement.
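Because the master's view is authoritative, it can help to direct the health query at the master identified by cluster ring show; -z calls the exportHealth API as described in the help text (the IP is hypothetical):
% rdb_dump 169.254.21.34 -z vldb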
Monitoring
Use -c to continuously monitor a box under normal operation. Also, when rebooting a box, use -c to show the apps as they start and come online.
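For example, to poll the vldb unit every 5 seconds while a node boots (per the -c option in the help text; the interval defaults to 3 seconds if omitted):
% rdb_dump -c 5 vldb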