What is Adaptive QoS and how does it work?
Applies to
- ONTAP 9.3 or newer
- Adaptive QoS (AQoS)
- Active IQ Unified Manager (AIQUM)
- NetApp Service Level Manager (NSLM)
Answer
- Adaptive QoS uses QoS Throughput Floors and Ceilings (Minimum and Maximum throttle limits) to set individual Volume limits.
    - Floors are used to prevent bully workloads from robbing other workloads of their share of resources
        - This keeps workloads from going below a minimum amount of IOPS and/or MB/s
        - Think of a floor as the inverse of a ceiling or traditional QoS policy: it keeps other workloads from taking resources from the workload the policy is applied to
    - Ceilings are used to limit busier workloads so that they do not rob resources from other workloads
- Adaptive QoS is dynamic based upon volume size, meaning a 10 GB volume would have a different Floor and Ceiling than a 10 TB volume.
    - This means the Ceiling is the greatest of 1) Expected IOPS, 2) Peak IOPS, or 3) Absolute Minimum IOPS
        - Note: Expected IOPS and Peak IOPS are each calculated from either used space or allocated space, depending on how the policy is set, so take this into account when calculating limits by hand
    - The Floor is always Expected IOPS, unless that value is below the Absolute Minimum IOPS, in which case the Absolute Minimum IOPS is used
- As with regular QoS, AQoS is a cluster-wide process, as I/O may hit any LIF on any node in the cluster.
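The sizing rules above can be sketched in a few lines of Python. This is an illustrative model only, not ONTAP code: the function name `aqos_limits` and its parameters are hypothetical, 1 TB is assumed to equal 1000 GB, and results are rounded to whole IOPS. The default arguments assume the "value" policy (128 IOPS/TB Expected on allocated space, 512 IOPS/TB Peak on used space, 75 IOPS Absolute Minimum).

```python
def aqos_limits(allocated_tb, used_tb,
                expected_per_tb=128, peak_per_tb=512, absolute_min=75):
    """Return (floor, ceiling) in IOPS for one volume.

    Hypothetical helper: assumes Expected IOPS is based on allocated space
    and Peak IOPS on used space, as in the default policy groups.
    """
    expected = expected_per_tb * allocated_tb   # scales with volume size
    peak = peak_per_tb * used_tb                # scales with data stored
    # Floor: Expected IOPS, but never below the Absolute Minimum IOPS.
    floor = max(expected, absolute_min)
    # Ceiling: the greatest of Expected, Peak, and Absolute Minimum IOPS.
    ceiling = max(expected, peak, absolute_min)
    return round(floor), round(ceiling)


# A 10 GB empty volume: 1.28 Expected IOPS is overridden by the 75 IOPS minimum.
print(aqos_limits(allocated_tb=0.01, used_tb=0.0))   # (75, 75)
# A 1 TB volume holding 0.5 TB: Peak IOPS (256) sets the ceiling.
print(aqos_limits(allocated_tb=1.0, used_tb=0.5))    # (128, 256)
```

The values actually enforced by a cluster should always be read from the system itself; this sketch is only meant for rough planning.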
Terms
| Term | Definition | 
|---|---|
| Adaptive QoS | Dynamic QoS Ceilings and Floors that grow or shrink based upon the used or allocated volume size | 
| Throughput Floor (Minimum) | A guaranteed throughput measurement (IOPS and/or MB/s) that gives floored workloads priority over non-floored workloads | 
| Throughput Ceiling (Maximum) | A hard limit on how many IOPS a volume is allocated (regular QoS) | 
| Expected IOPS | The Throughput Floor value in IOPS per terabyte (unless specified differently) | 
| Absolute minimum IOPS | An IOPS Throughput Floor that is used when Expected IOPS becomes too low; it overrides Expected IOPS or Peak IOPS. Example: a 10 GB volume with the default "value" adaptive QoS policy group will have a Floor of 75 IOPS, not the 1.28 IOPS Expected. | 
| Peak IOPS | | 
| Allocated Space | | 
| Used Space | | 
| Headroom | | 
Note: The calculated value can be seen with the qos workload show -instance command
Cluster::> qos workload show -instance
          Workload Name: aqos1-wid32444
...
     Maximum Throughput: 1425IOPS
- Custom policies can be created with the qos adaptive-policy-group create command
- By default, three adaptive policy groups are created:
| policy-group | expected-iops | peak-iops | absolute-min-iops | expected-iops-allocation | peak-iops-allocation | 
|---|---|---|---|---|---|
| extreme | 6144IOPS/TB | 12288IOPS/TB | 1000IOPS | allocated-space | used-space | 
| performance | 2048IOPS/TB | 4096IOPS/TB | 500IOPS | allocated-space | used-space | 
| value | 128IOPS/TB | 512IOPS/TB | 75IOPS | allocated-space | used-space | 
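One consequence of these defaults is that small volumes are governed by absolute-min-iops rather than expected-iops. The Python sketch below estimates, for each default policy group, the allocated size at which Expected IOPS reaches the absolute minimum. The dictionary and function names are hypothetical, and the arithmetic only divides the table values; it does not query ONTAP.

```python
# Default adaptive policy groups from the table above:
# name -> (expected IOPS per TB, absolute minimum IOPS)
DEFAULT_POLICIES = {
    "extreme": (6144, 1000),
    "performance": (2048, 500),
    "value": (128, 75),
}


def crossover_tb(expected_per_tb, absolute_min):
    """Allocated size (TB) at which Expected IOPS equals the absolute minimum."""
    return absolute_min / expected_per_tb


for name, (expected_per_tb, absolute_min) in DEFAULT_POLICIES.items():
    size = crossover_tb(expected_per_tb, absolute_min)
    print(f"{name}: absolute minimum applies below about {size:.3f} TB allocated")
```

For the "value" policy this works out to roughly 0.586 TB, which is why the 10 GB example earlier lands on 75 IOPS.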
How do QoS floors and ceilings decide when to throttle?
- IOPS from volumes without a floor will be put into the Best Effort queue
- IOPS from volumes with a floor will be put into a deadline queue
- IOPS from volumes with floors get priority over volumes without floors in the D-blade as long as the workload on those volumes is under the floor value.
- IOPS from volumes exceeding the floor are treated the same as IOPS from a volume without a floor.
    - These IOPS queue in a delay center called QoS Minimums
    - IOPS in this queue are treated along with those from volumes without a floor in the Best Effort queue
- Throttling also depends on where CPU utilization sits relative to the CPU headroom optimal point:
    - Below the CPU headroom optimal point, Best Effort IOPS yield to deadline IOPS
    - Above the CPU headroom optimal point on ONTAP 9.6 and earlier, the deadline IOPS may be lower but will still get priority over Best Effort IOPS
    - Above the CPU headroom optimal point on ONTAP 9.7 and later, deadline IOPS get the same values as below the optimal point, and Best Effort IOPS are throttled more heavily
- IOPS from volumes hitting the Throughput Ceiling will be hard throttled at that value.
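The queueing rules above can be summarized as a small decision function. The Python sketch below is a conceptual model only: the function `classify_io` and its string labels are hypothetical, CPU headroom is ignored, and the real scheduler inside ONTAP is more involved than a pair of comparisons.

```python
def classify_io(current_iops, floor=None, ceiling=None):
    """Return the treatment a volume's I/O receives under the rules above.

    floor and ceiling are the volume's Throughput Floor and Ceiling in IOPS,
    or None when not set. Labels are illustrative, not ONTAP identifiers.
    """
    if ceiling is not None and current_iops >= ceiling:
        return "hard-throttled"    # capped at the Throughput Ceiling
    if floor is None:
        return "best-effort"       # no floor: Best Effort queue
    if current_iops < floor:
        return "deadline"          # under the floor: prioritized
    return "best-effort"           # over the floor: same as no floor


print(classify_io(50, floor=75, ceiling=512))    # deadline
print(classify_io(200, floor=75, ceiling=512))   # best-effort
print(classify_io(512, floor=75, ceiling=512))   # hard-throttled
```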
Additional Information
- Supported QoS features per ONTAP version
    - The link opens the General Support section; several sections below it contain additional tables of supported features for QoS and AQoS.
- Documentation on Adaptive QoS.
- Documentation on Throughput Floors
- What is Quality of Service (QoS) in ONTAP?
- Here are some additional examples of space usage and throttling levels for comparison, using the following policy values:
    - Expected IOPS = 128/TB
    - Peak IOPS = 512/TB
    - Absolute Min IOPS = 75
    - Expected IOPS allocation = allocated space
    - Peak IOPS allocation = used space
| Volume Size | Data Stored | QoS Min IOPS (SSD Aggregate only) | QoS Max IOPS | 
|---|---|---|---|
| 1 GB | 0 GB | 75 (Absolute Minimum) | 75 (Absolute Minimum) | 
| 1 TB | 0 TB | 128 (Expected) | 128 (Expected) | 
| 1 TB | 0.1 TB | 128 (Expected) | 128 (Expected) | 
| 1 TB | 0.2 TB | 128 (Expected) | 128 (Expected) | 
| 1 TB | 0.3 TB | 128 (Expected) | 154 (Peak) | 
| 1 TB | 0.5 TB | 128 (Expected) | 256 (Peak) | 
| 1 TB | 1 TB | 128 (Expected) | 512 (Peak) | 
| 2 TB | 2 TB | 256 (Expected) | 1024 (Peak) | 
