Why does the sum of all volume IOPS in an aggregate not match the aggregate IOPS?
Applies to
- OnCommand Unified Manager (OCUM)
- Active IQ Unified Manager (AIQUM)
- ONTAP 9
Answer
- The volume and aggregate (disk) layers are two different layers and are accounted for differently in ONTAP
- The volume level counts only foreground (user) operations, while the aggregate level counts both foreground and background operations
- A workload can therefore generate more or fewer aggregate operations than volume operations
- Examples:
- A 1 GB file may be read in 1 MB chunks at the protocol level (one IOP per 1 MB chunk), but each of those operations requires multiple parallel WAFL read requests on the backend to fulfill
- In the 1 GB file example above, once the first 1 MB chunk is read, readahead predictively fetches the second, third, and subsequent chunks before the user/application requests them to improve read performance (see the sketch after this list)
- Frequently accessed data remains cached in RAM and does not require disk access
- A workload may be all writes, which are buffered in memory and flushed to disk at Consistency Points, with each Consistency Point requiring a differing level of disk access
- Another possibility is that frequently used data is served from cache, causing the volume to report more IOPS than the aggregate
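To make the 1 GB read example above concrete, here is a minimal sketch in plain Python (not ONTAP code) of how one protocol-level read fans out into many backend WAFL block reads, and how readahead shifts disk work ahead of the volume-level counters. The 4 KB block size matches WAFL; the chunk size and readahead depth are assumptions taken from the example.

```python
# Illustrative model only, not ONTAP code. Sizes follow the example above.
FILE_SIZE = 1 * 1024**3       # 1 GB file
PROTOCOL_CHUNK = 1 * 1024**2  # client reads in 1 MB chunks -> 1 volume IOP each
WAFL_BLOCK = 4 * 1024         # WAFL reads 4 KB blocks on the backend
READAHEAD_DEPTH = 2           # assumption: readahead prefetches 2 chunks ahead

volume_iops = FILE_SIZE // PROTOCOL_CHUNK         # ops counted at the volume layer
wafl_reads_per_op = PROTOCOL_CHUNK // WAFL_BLOCK  # backend block reads per volume op

print(f"volume IOPS counted:       {volume_iops}")                      # 1024
print(f"WAFL block reads per IOP:  {wafl_reads_per_op}")                # 256
print(f"total backend block reads: {volume_iops * wafl_reads_per_op}")  # 262144

# Readahead: while chunk N is being returned to the client, chunks
# N+1 .. N+READAHEAD_DEPTH are already being fetched from disk, so disk
# IOPS show up *before* the corresponding volume IOPS in time-aligned graphs.
```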
Why might aggregates have different operation counts than volumes?
- ONTAP performs many background operations that decouple aggregate activity from volume work
- Background workloads such as SnapMirror, WAFL scanners, and deduplication all consume disk IOPS but do not count as user IOPS
- These are triggered by internal configurations or schedules
- They are tracked in the internal qos statistics workloads
- Workloads have different behaviors within ONTAP
- IOPS get mixed together as requests come in
- Example: One volume may serve 4 KB reads, another 1 MB reads, and there may be write workloads that do not reach disk until the next Consistency Point
- One IOP on the network may trigger several WAFL IOPS, or several network IOPS may coalesce into one WAFL IOP (see the sketch after this list)
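As a rough illustration of this fan-out in both directions (one network IOP becoming several WAFL IOPS, and many buffered network writes collapsing into a few Consistency Point disk writes), here is a toy Python model. The workload mix and the coalescing ratio are invented assumptions, not measured ONTAP behavior.

```python
# Toy model of mixed workloads hitting the same aggregate; all numbers invented.
WAFL_BLOCK = 4 * 1024

def backend_reads(io_size: int) -> int:
    """One network read of io_size bytes fans out into one WAFL block
    read per 4 KB block (amplification for large reads)."""
    return max(1, io_size // WAFL_BLOCK)

# Volume A: 1,000 small 4 KB reads -> roughly 1:1 with backend block reads.
# Volume B: 100 large 1 MB reads  -> each fans out into 256 block reads.
reads_a = 1000 * backend_reads(4 * 1024)
reads_b = 100 * backend_reads(1024 * 1024)

# Volume C: 5,000 small network writes buffered in memory; assume the next
# Consistency Point coalesces them into ~200 large sequential disk writes
# (deflation: many volume IOPS -> few disk IOPS).
writes_c_volume = 5000
writes_c_disk = 200  # assumed coalescing ratio, for illustration only

volume_total = 1000 + 100 + writes_c_volume
disk_total = reads_a + reads_b + writes_c_disk
print(f"volume IOPS: {volume_total}, disk IOPS: {disk_total}")
```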
How do different IOP types (read, write, other, and small or large IOPS) affect volume IOP counts compared to disk IOP counts?
- Reads of the same data are served from cache after the first access
- Readahead fetches blocks before they are requested, which can cause more disk IOPS than volume IOPS
- Large reads (> 64 KB) count as one volume IOP but require multiple disk IOPS
- Writes go through Consistency Points and are not written to disk immediately; they are buffered until the next Consistency Point flushes them
- Other IOPS that do not modify file system structures are generally served from cache
- Example: A volume may have 100,000 GETATTR IOPS but not use disk to read the data after the workload's initial load
- Other IOPS that do modify file system structures may require several disk IOPS to update metadata and data blocks (see the sketch after this list)
- Example: A SETATTR that overwrites a file has to:
- Update metadata
- Mark the existing block free
- Write the new data to a new location on disk at the next Consistency Point, because WAFL does not overwrite blocks in place
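The op-type behavior above can be summarized as a rough mapping from operation type to immediate disk IOPS. This is a sketch with assumed costs (for example, three disk IOPS for the SETATTR case: the metadata update, freeing the old block, and the Consistency Point write); real costs vary with caching, block layout, and Consistency Point timing.

```python
# Assumed immediate-disk-IOPS cost per operation type; illustrative only.
DISK_COST = {
    "read_cached": 0,        # served from RAM after the first access
    "read_uncached_4k": 1,   # one backend block read
    "read_uncached_1m": 256, # large read fans out into 4 KB block reads
    "getattr_cached": 0,     # metadata already in cache
    "setattr_overwrite": 3,  # update metadata + free old block + CP write
    "write_buffered": 0,     # deferred to the next Consistency Point
}

def disk_iops(op_counts: dict[str, int]) -> int:
    """Sum the assumed disk cost of a mix of volume-level operations."""
    return sum(DISK_COST[op] * n for op, n in op_counts.items())

# 100,000 cached GETATTRs generate no disk reads at all...
print(disk_iops({"getattr_cached": 100_000}))  # 0
# ...while a handful of metadata-modifying ops still hit disk.
print(disk_iops({"setattr_overwrite": 10}))    # 30
```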
Should we use these outputs in AIQUM to determine whether more workload can be added?
- No. Use the Performance Capacity graphs in AIQUM to see the available headroom at the node and aggregate levels
Additional Information
Example: In the screenshot below, the sum of IOPS across all volumes in the aggregate (under 2,000) does not match the 5,000 aggregate IOPS reported in AIQUM
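As a purely hypothetical reconciliation of numbers like those in the screenshot (the breakdown below is invented for illustration; AIQUM does not expose this decomposition directly), the gap between the volume sum and the aggregate counter could be composed as follows:

```python
# Hypothetical breakdown of a 5,000-IOPS aggregate; all figures invented.
volume_foreground_sum = 1950  # sum of per-volume IOPS shown in AIQUM (< 2,000)

background = {
    "snapmirror_transfers": 1200,
    "deduplication_scanners": 600,
    "wafl_scanners": 400,
    "readahead_and_cp_amplification": 850,
}

aggregate_iops = volume_foreground_sum + sum(background.values())
print(aggregate_iops)  # 5000 -- the aggregate counter exceeds the volume sum
```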