NetApp Knowledgebase

Active IQ Wellness: Up to High Impact - Aggregate is almost full or Volume vulnerable to becoming 100% Full


Applies to

  • ONTAP 9 
  • Data ONTAP 8.2 7-Mode
  • Data ONTAP 8.1 7-Mode 
  • Data ONTAP 8 7-Mode 

Answer

Value of reviewing this information:

Aggregates that are more than 97% full and contain non-guaranteed volumes may experience an availability impact if used capacity reaches 100%, up to and including write failures to volumes and LUN offline events. Aggregates over 97% capacity may also see increased write latency. This risk is raised so that you can plan proactively and avoid this type of event.

In some situations, such as an aggregate with a mix of guarantees or one where a single volume has had its guarantee disabled due to insufficient overall space, the impact can be limited to a single volume experiencing write failures or offline LUNs, as opposed to every volume within the aggregate. Finally, background deduplication requires 3% free space within an aggregate, as discussed in TR-4476 - NetApp Data Compression, Deduplication, and Data Compaction.

How is this wellness check validated?

The following commands can be used to validate aggregate space usage:

  • ONTAP 9: storage aggregate show-space -fields physical-used,physical-used-percent
  • 7-Mode: aggr status, df -A, df

The following AutoSupport sections are reviewed by Active IQ to determine aggregate capacity usage:

  • ONTAP 9: AGGR-STATUS-S to determine physical space used, not including root aggregates.
    Aggregate : <aggregate_name>
          Feature                                           Used      Used%
          --------------------------------      ----------------      ----- 
          Volume Footprints                                152TB        75%
          Aggregate Metadata                              7.24GB         0%
          Total Used                                       152TB        75%
          Total Physical Used                              150TB        74%
     
  • 7-Mode:
    • If all volumes are Guarantee=None
      <aggregate_name> 217574431488 163743726624 53830704864      75%
      <aggregate_name>/.snapshot          0          0          0       0%
    • If some or all volumes are Guarantee=Volume, the sum of the guaranteed volume sizes from df is subtracted from both the total aggregate size and the used size reported by df -A. The remaining size and used size are then used to calculate remaining space. This can show that the aggregate is effectively full and that a volume with a disabled volume guarantee or a guarantee of none is at risk.
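
The 7-Mode calculation described above can be sketched in Python. This is only a literal reading of the description, not the actual Active IQ implementation: the function names, the assumption that a df -A data line has five whitespace-separated columns (name, total, used, available, capacity), and the handling of a non-positive remaining size are all illustrative choices.

```python
def parse_df_a_line(line):
    """Split one `df -A` data line into (name, total, used, avail).

    Assumes five whitespace-separated columns, as in the sample above;
    the trailing capacity column (e.g. "75%") is ignored and recomputed.
    """
    name, total, used, avail, _capacity = line.split()
    return name, int(total), int(used), int(avail)


def aggregate_used_percent(size, used, guaranteed=0):
    """Percent used after removing space held by guaranteed volumes.

    `guaranteed` is the summed size of Guarantee=Volume volumes (taken
    from `df`), in the same units as `size` and `used`. It is subtracted
    from both values, following the description in this article.
    """
    remaining_size = size - guaranteed
    remaining_used = used - guaranteed
    if remaining_size <= 0:
        # Assumption: nothing is left for non-guaranteed volumes.
        return 100.0
    return max(0.0, 100.0 * remaining_used / remaining_size)


# Hypothetical aggregate name; the numbers come from the sample above.
name, total, used, avail = parse_df_a_line(
    "aggr1 217574431488 163743726624 53830704864      75%"
)
print(name, round(aggregate_used_percent(total, used)))
```

With no guaranteed volumes, the result matches the capacity column that df -A already reports. Passing a non-zero `guaranteed` value applies the subtraction for the mixed-guarantee case.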

The impact level of the Active IQ Wellness rule "Aggregate is almost full" is determined by the following physical capacity utilization levels:

  • Greater than 97%: High Impact
  • 93% to 97%: Medium Impact 

The impact level of the Active IQ Wellness rule "Volume vulnerable to becoming 100% Full" is determined by the following aggregate capacity levels if a mix of guarantees is detected:

  • Greater than 97%: High Impact
  • 93% to 97%: Medium Impact 

What should I do about the information provided by this Active IQ Wellness rule?

If you already have a plan for this proactive Active IQ warning, acknowledge it within your Active IQ dashboard. This ensures that the Wellness warnings you continue to see are only those you do not yet have a plan to address.
 
To address this type of scenario:

  1. Use caution before adding any additional capacity to the system and ensure that you are monitoring existing capacity via any mechanism you have available.
  2. Do not add additional workloads to any aggregate seeing these capacity alerts.
  3. Use Active IQ Unified Manager to monitor storage capacity.
    Configure Active IQ Unified Manager to monitor and alert on aggregate capacity

  4. For additional steps and considerations regarding addressing low capacity, review the troubleshooting section in this KB: Space Usage
    1. Use the Capacity Panel inside Active IQ Unified Manager to identify aggregates with more space that can be used as targets for migration.
    2. Verify that you are moving to like storage (FlashPool to FlashPool, for example). Ideally, move to storage with equal or better performance.
    3. You can use System Manager or CLI to move a volume.
       
  5. Ensure storage efficiency related Active IQ warnings are addressed.
  6. When a volume is at risk because the aggregate is low on available capacity even though it has substantial free physical space, the usual options are to change the guarantee of the other volumes on the aggregate to none, or to shrink the guaranteed volumes that are putting the non-guaranteed volume at risk. This can be done in System Manager or the CLI.

Additional Information

Where can I find more information on this topic?