
What does partial giveback mean in Clustered Data ONTAP?


Applies to

  • CORE
  • Clustered Data ONTAP 8
  • Administration
  • ONTAP 9.X

Answer

In clustered Data ONTAP, running the storage failover show command sometimes displays output similar to the following:

::*> storage fail show
  (storage failover show)
                                      Takeover InterConn
Node           Partner        Enabled Possible Up        State
-------------- -------------- ------- -------- --------- ------------------
node-01        node-02        true    true     true      connected
node-02        node-01        true    true     true      giveback_partial_connected
2 entries were displayed.


However, the output of the cluster show command displays no issues, and data is served normally:

::*> cluster show
Node                 Health  Eligibility   Epsilon
-------------------- ------- ------------  ------------
node-01              true    true          false
node-02              true    true          false
2 entries were displayed.


A clustered Data ONTAP cluster consists of a series of HA pairs connected through an Ethernet cluster network. In the event of an outage or an operator-initiated takeover, each node in an HA pair fails its storage over to its directly connected partner. When this occurs, the process is the same as in 7-Mode at the physical storage level: the disk reservations change and temporary ownership of the disks is given to the partner sysid.
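For example, the temporary ownership change can be observed from the nodeshell with the 7-Mode style disk ownership command (exact output varies by release); while a takeover is in effect, the partner's disks are listed with the surviving node as their current owner:

::> node run local disk show -v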

However, in clustered Data ONTAP, two types of failover can occur:

  • CFO (cluster failover): This is a failover of mroot aggregates.
  • SFO (storage failover): This is a failover of data aggregates.

The type of failover is defined by the aggr option ha_policy:

::> node run local aggr status -v aggr1

           Aggr State           Status            Options
          aggr1 online          raid4, aggr       nosnap=off, raidtype=raid4,
                                32-bit            raidsize=8,
                                                  ignore_inconsistent=off,
                                                  snapmirrored=off,
                                                  resyncsnaptime=60,
                                                  fs_size_fixed=off,
                                                  snapshot_autodelete=on,
                                                  lost_write_protect=on,
                                                  ha_policy=sfo, <-- this is an SFO aggr
                                                  hybrid_enabled=off,
                                                  percent_snapshot_space=5%,
                                                  free_space_realloc=on

                Volumes: datavol1, datavol2

                Plex /aggr1/plex0: online, normal, active
                    RAID group /aggr1/plex0/rg0: normal, block checksums

::> node run local aggr status -v aggr0_node1

           Aggr State           Status            Options
    aggr0_node1 online          raid4, aggr       root, diskroot, nosnap=off,
                                64-bit            raidtype=raid4, raidsize=8,
                                                  ignore_inconsistent=off,
                                                  snapmirrored=off,
                                                  resyncsnaptime=60,
                                                  fs_size_fixed=off,
                                                  snapshot_autodelete=off,
                                                  lost_write_protect=on,
                                                  ha_policy=cfo, <-- this is a CFO aggr
                                                  hybrid_enabled=off,
                                                  percent_snapshot_space=5%,
                                                  free_space_realloc=on

                Volumes: vol0

                Plex /aggr0_node1/plex0: online, normal, active
                    RAID group /aggr0_node1/plex0/rg0: normal, block checksums
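
On recent ONTAP releases, the ha_policy of each aggregate can also be checked from the clustershell (assuming the ha-policy field is available in that release):

::> storage aggregate show -fields ha-policy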


When a giveback occurs, the storage is once again homed to the node that owns the disks. The CFO (root) aggregates are given back first, followed by the SFO (data) aggregates.

However, the SFO portion of this process can be vetoed under certain conditions, such as when:

  • CIFS sessions are active
  • SnapMirrors are running
  • AutoSupports are being generated
  • Storage issues occur (such as failed disks)

If the SFO (data) aggregates are not given back, the node remains in the giveback_partial_connected state shown above. When this occurs, check the event log to find out why the giveback was vetoed:

::> event log show -messagename cf.rsrc.givebackVeto -instance

Example:

::*> event log show -messagename cf.rsrc.givebackVeto -instance

                    Node: node-02
               Sequence#: 22780
                    Time: 9/21/2011 11:57:13
                Severity: ALERT
            EMS Severity: SVC_ERROR
                  Source: cf_main
            Message Name: cf.rsrc.givebackVeto
                   Event: cf.rsrc.givebackVeto: Failover monitor: disk check: giveback cancelled due to active state
Kernel Generation Number: 1316618408
  Kernel Sequence Number: 970
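
The aggregates that have not yet been given back, and the status of the last giveback attempt for each, can also be reviewed with the following command (output varies by release):

::> storage failover show-giveback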


If the veto is due to a minor issue (such as AutoSupport, failed disks, or CIFS sessions), run the following command to override the veto and complete the storage giveback:

Note: This is similar to performing a cf giveback -f in 7-Mode:

::> storage failover giveback -fromnode [nodename] -override-vetoes true

If the partner storage system is not in a ‘waiting for giveback’ state, ensure that the command that is run specifies this explicitly:

::*> storage failover giveback -fromnode node-02 -require-partner-waiting false -override-vetoes true

WARNING: Initiating a giveback with vetoes overridden will result in giveback
         proceeding even if the node detects outstanding issues that would make
         a giveback dangerous or disruptive. Do you want to continue?
          {y|n}: y

Caution: The following commands should not be run without the supervision and approval of NetApp Support.

  1. Run the storage failover show command:

::> storage failover show -instance

  2. Run the following command to check the status of the partner:

::> node run local cf status

  3. Run the following command to issue the giveback again:

::> storage failover giveback -fromnode [nodename] -override-vetoes true -require-partner-waiting true

Note: A partial giveback can also be seen while a giveback is in progress. Wait a few minutes for it to clear.

  4. In clustered Data ONTAP, run the following command to enable the option to automatically give back storage after a failover:

::> storage failover modify -node * -auto-giveback true

When the option is enabled, the storage is given back automatically after the delay specified by the following option (default value is 300 seconds):

::> node run local options cf.giveback.auto.delay.seconds

cf.giveback.auto.delay.seconds 300


This option will also ignore issues such as disk checks and active CIFS sessions.
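To confirm the setting from the clustershell, the auto-giveback value can be displayed with the following command (assuming the field name matches the modify parameter above):

::> storage failover show -fields auto-giveback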

  5. Run the following advanced level command to check the giveback status:

::> set advanced
::*> storage failover progress-table show

Additional Information