What are some performance considerations for Cloud Volumes ONTAP on AWS?
Applies to
Cloud Volumes ONTAP (CVO) on AWS
Answer
The AWS backend has many options, and there are choices within CVO that can cause issues if not considered:
Disk
- Notes:
- AWS merges sequential I/Os and counts them as a single I/O, but random I/Os are counted individually
- Node limits may be hit
- For smaller instance types, such as r5.2xlarge, there is an instance limitation of 30 minutes of burst credits for disk I/O; after that, the baseline is the maximum for 24 hours
- IOPS for disks such as NVRAM or other OS disks are not accounted for in ONTAP, and must be viewed from the AWS Console/CloudWatch
- Amazon Web Services uses Elastic Block Store (EBS) volumes, which are the type of disk used to back storage for CVO
- General Purpose SSD (gp2) - SSD with scalable IOPS (standard + burst rate)
- Most commonly selected option
- 3 IOPS / GB - scalable performance, 500 GB = 1500 IOPS, 1 TB = 3000 IOPS
- Larger disk size enables better performance
- IOPS are measured at a 16 KiB (16 × 1,024 bytes) I/O size
- When reviewing statit output, take the transfers and look at chain size
- If statit shows 8000 ureads and a chain size of 8, that would be the equivalent of 16,000 IOPS at 16 KiB (8,000 reads × 8 blocks × 4 KiB per block ÷ 16 KiB)
- Note: gp3 disks are new as of December 2020 and work similarly, but their IOPS are not constrained by the size of the EBS volume
- Provisioned IOPS SSD (io1) - SSD with fixed IOPS
- Guaranteed IOPS
- High performance workloads
- More costly than other options
- Throughput Optimized HDD (st1)
- Ideal for streaming data workloads, such as SnapMirror
- Not recommended for general workloads due to burst credits
- Not recommended for FabricPool warm storage tier due to transfer log
- ST1 drives have the same throughput specifications as USB3 drives on local PCs
- EBS Magnetic / Cold HDD (sc1) - traditional spinning media
- Not recommended or supported
- Recommendations:
- Plan to have not only enough disk capacity but also IOPS headroom:
- Burst credits deplete on EBS volumes, and latency will climb when this happens
- Burst credits deplete at the rate they are used, and only a set amount of credits is refilled once the workload is reduced
- Disk size is important because it determines how many IOPS are available at baseline performance
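The sizing arithmetic in this section can be sketched in a few lines of Python. This is only an illustrative sketch, not a NetApp or AWS tool: the function names are hypothetical, the 3 IOPS per GB figure and the statit conversion follow the examples above, and the 4 KiB block size is the one implied by the statit example. The gp2 floor (100 IOPS), ceiling (16,000 IOPS), burst rate (3,000 IOPS), and credit bucket (5.4 million I/O credits) are assumptions taken from AWS's published gp2 model rather than from this article, so verify them against current AWS documentation before relying on the results.

```python
# Illustrative sizing helpers; all names are hypothetical.
WAFL_BLOCK_KIB = 4             # block size implied by the statit example
EBS_IO_KIB = 16                # I/O size used for gp2 IOPS accounting (per the article)
GP2_IOPS_PER_GB = 3            # per the article
GP2_MIN_IOPS = 100             # assumption: AWS-published gp2 floor
GP2_MAX_IOPS = 16_000          # assumption: AWS-published gp2 ceiling
GP2_BURST_IOPS = 3_000         # assumption: AWS-published gp2 burst rate
GP2_CREDIT_BUCKET = 5_400_000  # assumption: AWS-published full credit bucket


def gp2_baseline_iops(size_gb: float) -> int:
    """Baseline IOPS for a gp2 volume of the given size."""
    scaled = size_gb * GP2_IOPS_PER_GB
    return int(min(max(scaled, GP2_MIN_IOPS), GP2_MAX_IOPS))


def statit_to_16k_iops(transfers_per_sec: float, chain_size: float) -> float:
    """Convert statit transfers and chain size into equivalent 16 KiB IOPS."""
    kib_per_sec = transfers_per_sec * chain_size * WAFL_BLOCK_KIB
    return kib_per_sec / EBS_IO_KIB


def gp2_full_burst_seconds(size_gb: float) -> float:
    """Seconds a full credit bucket lasts when driving the volume at burst rate.

    Returns 0.0 when the baseline already meets or exceeds the burst rate,
    because such a volume has no burst headroom to spend.
    """
    baseline = gp2_baseline_iops(size_gb)
    if baseline >= GP2_BURST_IOPS:
        return 0.0
    return GP2_CREDIT_BUCKET / (GP2_BURST_IOPS - baseline)


if __name__ == "__main__":
    print(gp2_baseline_iops(500))        # 500 GB gp2 disk
    print(statit_to_16k_iops(8000, 8))   # the statit example above
    print(gp2_full_burst_seconds(500))   # burst duration under the assumed model
```

Under these assumptions, a 500 GB gp2 disk has a 1,500 IOPS baseline, the statit example works out to 16,000 IOPS at 16 KiB, and a full credit bucket on that 500 GB disk lasts 3,600 seconds at the burst rate. Comparing the converted statit figure against the baseline is one way to judge whether a workload depends on burst credits.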
Instance Type
- AWS instance types can be described as the hardware dedicated to the Cloud Volumes ONTAP install, including CPU, RAM, and network bandwidth
- General Purpose - Balanced resources
- T2 - low cost, burst performance oriented
- M4 - balanced resources for many workloads
- Compute optimized - better for workloads that need more CPU resources (deduplication, compression, compaction)
- C4 - Latest generation Intel Xeon processors
- Memory optimized - better for workloads with large working sets (high file count, complex directory structure, database workloads)
- X1 - Optimized for large-scale, enterprise-class, in-memory applications
- R4 - Optimized for memory-intensive applications
Other Considerations
- Cloud Volumes ONTAP HA functionality
- Cloud Volumes ONTAP does support HA configurations, though there may be performance issues for some workloads. Improvements to this feature are constantly being made.
- NVLOG performance
- Because NVRAM is SSD-backed rather than a physical RAM DIMM, it can become a bottleneck
- Workloads with higher write speed needs require a larger instance size and faster drives to prevent bottlenecks upstream
- Write speed (HA Supported as of ONTAP 9.8, only single node in 9.7)
- Normal - Data is written to NVRAM prior to being committed to disk - this is the safest option and should be used in most cases
- High - Data is simply left in active memory buffers and committed to disk. Because data is NOT written to NVRAM, data could be lost if an unplanned shutdown occurs. This option is not recommended and should only be used for workloads with transient data that can safely be lost
- Usage profile options that are set up when deploying Cloud Volumes ONTAP
- Highest Performance - Recommended for applications requiring lowest latency
- Performance with Efficiency - Good performance with ONTAP storage efficiency (deduplication)
- Shared Tenancy vs. Dedicated Hardware
- Shared tenancy is the most commonly selected option, but you could experience "noisy neighbor" symptoms from other AWS workloads. Dedicated hardware avoids this but is more costly
- Node IOPs limit
- Each backend disk can handle
Additional Information
- Amazon EBS Volume Types
- Amazon EC2 Instance Types
- Performance Characterization of NetApp Cloud Volumes ONTAP for Amazon Web Services
- Performance Characterization of NetApp Cloud Volumes ONTAP for Azure with Application Workloads
- Performance Characterization of NetApp Cloud Volumes ONTAP for Google Cloud