NetApp Knowledgebase

Why is a warning displayed when attempting to create one or more data volumes in the root aggregate in Data ONTAP Cluster-Mode?

Category:
clustered-data-ontap-8
Specialty:
core

Applies to

Data ONTAP Cluster-Mode

Answer

When attempting to create a volume inside a root aggregate in Data ONTAP Cluster-Mode, a warning similar to the following is reported:

Warning: You are about to create a volume on a root aggregate.  This may cause severe performance or stability problems and therefore is not recommended. Do
         you want to proceed? {y|n}:

Data ONTAP reports this warning for the following reasons:

  • Loss of access during giveback

When performing a giveback in Data ONTAP Cluster-Mode, aggregates are given back sequentially. The root aggregate is given back first. With its root volume back online, the recovering node synchronizes its cluster database and verifies that all its settings are correct, so that it can serve data correctly before the other aggregates are returned. During this time, all volumes in the data aggregates are still served by the partner. The data aggregates then return one by one, and the partial giveback (where some aggregates have been given back and others have not) becomes a full giveback.

However, data volumes in the root aggregate are given back together with the root volume. Until the recovering node is fully healthy and ready to serve data, these volumes are unavailable to clients: the partner node has already released them, and the recovering node cannot serve data until all its cluster-related data is fully verified. This downtime can last several minutes, making a giveback highly disruptive for any data volumes stored in the root aggregate.
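The giveback sequence described above can be illustrated with a small toy model. This is only a sketch for reasoning about the ordering, not ONTAP code: the `Aggregate` class, the function names, and the aggregate and volume names (`aggr0`, `vol0`, `data3`, and so on) are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Aggregate:
    """Hypothetical stand-in for an aggregate owned by the recovering node."""
    name: str
    is_root: bool
    volumes: List[str] = field(default_factory=list)


def giveback_order(aggregates: List[Aggregate]) -> List[str]:
    """Root aggregate first, then data aggregates in their listed order."""
    # sorted() is stable, so data aggregates keep their relative order.
    ordered = sorted(aggregates, key=lambda a: not a.is_root)
    return [a.name for a in ordered]


def volumes_down_during_sync(aggregates: List[Aggregate],
                             root_volume: str = "vol0") -> List[str]:
    """Data volumes that return with the root aggregate.

    In this model they are unavailable while the node synchronizes,
    because the partner has already released them.
    """
    return [v for a in aggregates if a.is_root
            for v in a.volumes if v != root_volume]


# Example layout (all names are assumptions for illustration).
aggrs = [
    Aggregate("aggr1", is_root=False, volumes=["data1", "data2"]),
    Aggregate("aggr0", is_root=True, volumes=["vol0", "data3"]),
]
print(giveback_order(aggrs))
print(volumes_down_during_sync(aggrs))
```

In this layout, only `data3` shares the root aggregate, so it is the only data volume that loses availability while the node synchronizes; `data1` and `data2` stay with the partner until `aggr1` is given back.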

  • File system consistency

Aggregates and volumes can become inconsistent for a number of reasons. For example, suddenly disconnecting a drive stack (loop) from the controller head during active I/O can cause an aggregate to become inconsistent. The root volume is required for a node to function properly. If the root aggregate becomes inconsistent, the node stops serving data until the root volume is made available again. Restoring consistency to the root aggregate is particularly difficult, because the node must be fully offline while the file system is checked.

Because the critical root volume data can be recovered by synchronizing with the cluster, an easy recovery method (much faster than a consistency check) is to destroy the inconsistent aggregate, re-create it, add a root volume, and synchronize it with the cluster to get the crucial data back.

Caution: This operation should be performed only with the assistance of NetApp Support.

If there is a data volume on the root aggregate, wiping and re-creating the aggregate is not an option, because the data volume would be destroyed along with it. Additionally, hosting a data volume on the root aggregate increases the risk of file system inconsistencies on that aggregate.

  • Resource contention

Space

The root volume stores important data and needs a significant amount of space to do so. The replicated database is stored here and must always be up to date for a node to function. Additionally, the root volume is used to store important diagnostic information such as core dumps, packet traces, and logs. If the root volume fills up completely, the node stops serving data and cannot function. To make sure the root volume is large enough, there are minimum size requirements. For more information, see the table of minimum root volume sizes in the Data ONTAP 8.1 System Administration Guide for 7-Mode.

Given these space constraints, the root aggregate may have little space left for data volumes besides the root volume.

Disk I/O contention

The root volume in Cluster-Mode is used to store and update various tables of the replicated database. These tables hold crucial information about the locations of LIFs, volumes, and aggregates, and about the jobs that must run in the cluster. If a root aggregate has very busy data volumes, the disks in the aggregate experience higher latency. When a node is unable to update its copy of the replicated database fast enough, it considers itself unhealthy and stops serving all data until it can catch up. This is extremely disruptive and affects all the volumes on the node, even if the cause lies only with the data volumes stored on the root aggregate.
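The scope of such an outage can be sketched with a toy model. The function name, the latency figures, and the `healthy_limit_ms` threshold are hypothetical; this article does not describe how Data ONTAP actually measures replicated database health. The sketch only illustrates that the impact is node-wide rather than limited to the root aggregate.

```python
from typing import Dict, List


def volumes_not_served(volumes_by_aggregate: Dict[str, List[str]],
                       rdb_update_ms: float,
                       healthy_limit_ms: float) -> List[str]:
    """Return the volumes that stop being served in this toy model.

    If the (hypothetical) replicated database update time exceeds the
    (hypothetical) limit, the node treats itself as unhealthy and stops
    serving every volume it owns, on every aggregate.
    """
    if rdb_update_ms <= healthy_limit_ms:
        return []
    return [v for vols in volumes_by_aggregate.values() for v in vols]


# Example node layout (names are assumptions for illustration).
node = {
    "aggr0": ["vol0", "busy_data"],   # root aggregate with a busy data volume
    "aggr1": ["data1", "data2"],      # ordinary data aggregate
}
print(volumes_not_served(node, rdb_update_ms=250.0, healthy_limit_ms=100.0))
```

Although only `busy_data` competes with the root volume for disk I/O, the model returns every volume on the node once the limit is exceeded, including `data1` and `data2` on the unrelated data aggregate.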
