What does partial giveback mean in Clustered Data ONTAP?
Applies to
- CORE
- Clustered Data ONTAP 8
- Administration
- ONTAP 9.X
Answer
In clustered Data ONTAP, running the storage failover show command sometimes displays output similar to the following:
::*> storage fail show
(storage failover show)
                              Takeover          InterConn
Node           Partner        Enabled Possible  Up        State
-------------- -------------- ------- -------- --------- ------------------
node-01        node-02        true    true     true      connected
node-02        node-01        true    true     true      giveback_partial_connected
2 entries were displayed.
However, the output of the cluster show command displays no issues, and data is served normally:
::*> cluster show
Node                 Health  Eligibility  Epsilon
-------------------- ------- ------------ ------------
node-01              true    true         false
node-02              true    true         false
2 entries were displayed.
A clustered Data ONTAP cluster consists of a series of HA pairs connected through an Ethernet cluster network. In the event of an outage or an operator-initiated takeover, each node in an HA pair fails its storage over to its directly connected partner. When this occurs, the process is exactly the same as in 7-Mode at the physical storage level: the disk reservations change, and temporary ownership of the disks is given to the partner's sysid.
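This change of ownership can be observed from the clustershell. A minimal sketch, assuming the owner and home fields of storage disk show are available in your release (during a takeover, owner temporarily differs from home):
::> storage disk show -fields owner,home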
However, in clustered Data ONTAP, two types of failovers can occur:
- CFO (cluster failover): This is a failover of mroot aggregates.
- SFO (storage failover): This is a failover of data aggregates.
The type of failover is defined by the aggregate option ha_policy:
::> node run local aggr status -v aggr1
           Aggr State           Status                Options
          aggr1 online          raid4, aggr           nosnap=off, raidtype=raid4,
                                32-bit                raidsize=8,
                                                      ignore_inconsistent=off,
                                                      snapmirrored=off,
                                                      resyncsnaptime=60,
                                                      fs_size_fixed=off,
                                                      snapshot_autodelete=on,
                                                      lost_write_protect=on,
                                                      ha_policy=sfo, <-- this is an SFO aggr
                                                      hybrid_enabled=off,
                                                      percent_snapshot_space=5%,
                                                      free_space_realloc=on

                Volumes: datavol1, datavol2

                Plex /aggr1/plex0: online, normal, active
                  RAID group /aggr1/plex0/rg0: normal, block checksums
::> node run local aggr status -v aggr0_node1
           Aggr State           Status                Options
    aggr0_node1 online          raid4, aggr           root, diskroot, nosnap=off,
                                64-bit                raidtype=raid4, raidsize=8,
                                                      ignore_inconsistent=off,
                                                      snapmirrored=off,
                                                      resyncsnaptime=60,
                                                      fs_size_fixed=off,
                                                      snapshot_autodelete=off,
                                                      lost_write_protect=on,
                                                      ha_policy=cfo, <-- this is a CFO aggr
                                                      hybrid_enabled=off,
                                                      percent_snapshot_space=5%,
                                                      free_space_realloc=on

                Volumes: vol0

                Plex /aggr0_node1/plex0: online, normal, active
                  RAID group /aggr0_node1/plex0/rg0: normal, block checksums
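The ha_policy of each aggregate can also be checked directly from the clustershell, without dropping to the nodeshell. A minimal sketch, assuming a release that exposes the ha-policy field of storage aggregate show:
::> storage aggregate show -fields ha-policy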
When a giveback occurs, the storage is homed once again on the node that owns the disks. However, this process can be vetoed under certain conditions, such as:
- CIFS sessions are active
- SnapMirrors are running
- AutoSupports are being generated
- Storage issues occur (such as failed disks)
When a giveback is vetoed, check the event log to find out why:
::> event log show -messagename cf.rsrc.givebackVeto -instance
Example:
::*> event log show -messagename cf.rsrc.givebackVeto -instance
Node: node-02
Sequence#: 22780
Time: 9/21/2011 11:57:13
Severity: ALERT
EMS Severity: SVC_ERROR
Source: cf_main
Message Name: cf.rsrc.givebackVeto
Event: cf.rsrc.givebackVeto: Failover monitor: disk check: giveback cancelled due to active state
Kernel Generation Number: 1316618408
Kernel Sequence Number: 970
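If the veto was raised because of active CIFS sessions, the sessions can be reviewed first to judge the impact of an override. A sketch, assuming at least one CIFS-enabled SVM in the cluster:
::> vserver cifs session show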
If the veto is due to a minor issue (such as AutoSupport, failed disks, or CIFS sessions), run the following command to override the veto and complete the storage giveback:
Note: This is similar to performing a cf giveback -f in 7-Mode:
::> storage failover giveback -fromnode [nodename] -override-vetoes true
If the partner storage system is not in a 'Waiting for giveback' state, ensure that the command specifies -require-partner-waiting false:
::*> storage failover giveback -fromnode node-02 -require-partner-waiting false -override-vetoes true
WARNING: Initiating a giveback with vetoes overridden will result in giveback
proceeding even if the node detects outstanding issues that would make
a giveback dangerous or disruptive. Do you want to continue?
{y|n}: y
Caution: The following commands should not be run without the supervision and approval of NetApp Support.
- Run the storage failover show command with the -instance parameter:
::> storage failover show -instance
- Run the following command to check the status of the partner:
::> node run local cf status
- Run the following command to issue the giveback again:
::> storage failover giveback -fromnode [nodename] -override-vetoes true -require-partner-waiting true
Note: A partial giveback can also be seen while a node is in the process of giving back. Wait a few minutes for this to clear.
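To identify which aggregates are still pending giveback, and why, the per-aggregate giveback status can be checked. A sketch, assuming a release that supports the show-giveback view:
::> storage failover show-giveback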
- In clustered Data ONTAP, run the following command to enable the option to automatically give back storage upon failover:
::> storage failover modify -node * -auto-giveback true
When the option is enabled, the storage is given back automatically after the delay specified by the following option (default value is 300 seconds):
::> node run local options cf.giveback.auto.delay.seconds
cf.giveback.auto.delay.seconds 300
This option will also ignore issues such as disk checks and active CIFS sessions.
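The current auto-giveback setting can be verified from the clustershell; a minimal sketch:
::> storage failover show -fields auto-giveback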
- Run the following advanced level command to check the giveback status:
::> set advanced
::*> storage failover progress-table show
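When finished, return to the admin privilege level:
::*> set admin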
Additional Information