NetApp Knowledge Base

What are Common Performance Terms for Data ONTAP?


Applies to

  • Data ONTAP 8.X
  • ONTAP 9.X


  • Throughput - The rate at which data is actually transmitted over a communication channel; often used interchangeably with, or confused with, Bandwidth
    • Units
      • Ops/sec
      • Bytes/sec
      • MB/sec
      • GB/sec
  • Bandwidth - The maximum possible rate at which data can be transmitted over a communication channel; often used interchangeably with, or confused with, Throughput
    • Units
      • Bytes/sec
      • MB/sec
      • GB/sec
  • Latency - The total time between when an input or command is issued and when the response is received
    • Units
      • seconds (sec)
      • milliseconds (ms)
      • microseconds (us)
  • Utilization - A measurement of the percentage of time in a sample period that a given resource was in use; utilization is a useful performance metric, but for Data ONTAP it should not be the primary metric
    • Units
      • %
  • Bottleneck - The point of congestion in a computing system that impacts performance; there might be more than one bottleneck in an environment
    • NetApp Technical Support looks to address the bottleneck contributing the most to overall latency first
  • Concurrency - Measurement of the parallelism of workload in a computing system
    • The more parallelism there is in a workload, the more simultaneous operations are “in flight” at any point in time
    • This allows the system to be more efficient in processing work, and complete more operations in less time even with the same latency per op as a low concurrency workload
    • Little’s Law shows the relationship between throughput, latency and concurrency in a steady state. Though it looks intuitively easy, it’s quite a remarkable result:
      • Throughput = Concurrency / Latency
      • Latency is controlled by Data ONTAP
      • Concurrency is controlled by the clients/applications
      • To achieve the best throughput, lower the latency, increase the concurrency, or both
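The Little's Law relationship above can be rearranged to estimate how much concurrency a client needs to reach a target throughput at a given latency. The following is a minimal sketch only: the function names and the choice of microseconds as the latency unit are assumptions made for illustration, and the result is a steady-state estimate, not a guarantee of what a real system will deliver.

```python
US_PER_SEC = 1_000_000  # microseconds in one second


def throughput_ops_per_sec(concurrency: int, latency_us: int) -> float:
    """Little's Law: throughput = concurrency / latency (steady state)."""
    if latency_us <= 0:
        raise ValueError("latency must be positive")
    return concurrency * US_PER_SEC / latency_us


def required_concurrency(target_ops_per_sec: int, latency_us: int) -> int:
    """Rearranged form: concurrency = throughput * latency, rounded up."""
    if latency_us <= 0:
        raise ValueError("latency must be positive")
    # Integer ceiling division avoids floating-point rounding surprises.
    return -(-(target_ops_per_sec * latency_us) // US_PER_SEC)


if __name__ == "__main__":
    # 8 operations in flight at 500 us each
    print(throughput_ops_per_sec(8, 500))
    # Operations that must be in flight to sustain 25,000 ops/sec at 500 us
    print(required_concurrency(25_000, 500))
```

Because latency is controlled by Data ONTAP and concurrency by the clients, the second function is the one a client-side tuning exercise would typically use.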


Assume a request that takes 1 millisecond (ms) to complete. An application using one thread with one outstanding read or write operation should achieve 1000 IOPS (1 second, or 1000 ms, divided by 1 ms per request). Theoretically, if the thread count is doubled, the application should be able to achieve 2000 IOPS. If the number of outstanding asynchronous read or write operations for each thread is then also doubled, the application should be able to achieve 4000 IOPS. In practice, request rates do not always scale linearly, because of client-side overhead from task scheduling, context switching, and so forth.

Note: This example shows how to improve throughput by increasing concurrency on the client side, assuming that a latency of 1 ms is already acceptable and cannot be reduced further.
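The arithmetic in the example above can be reproduced with a short script. This is a sketch under the same assumptions as the text: a fixed 1 ms latency and perfectly linear scaling. The function name `estimated_iops` is hypothetical, and the figures are theoretical upper bounds that ignore the client-side overhead mentioned above.

```python
def estimated_iops(threads: int, outstanding_per_thread: int, latency_ms: float) -> float:
    """Theoretical IOPS from Little's Law, with concurrency = threads * outstanding ops."""
    concurrency = threads * outstanding_per_thread
    return concurrency * 1000.0 / latency_ms  # 1000 ms per second


if __name__ == "__main__":
    # (threads, outstanding operations per thread), all at 1 ms per request
    for threads, outstanding in [(1, 1), (2, 1), (2, 2)]:
        iops = estimated_iops(threads, outstanding, latency_ms=1.0)
        print(f"{threads} thread(s) x {outstanding} outstanding op(s): {iops:.0f} IOPS")
```

The three cases correspond to the 1000, 2000, and 4000 IOPS figures in the example.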

  • Randomness - Refers to a workload that is performed in an unpredictable sequence, with no order or pattern
  • Sequentiality - Refers to a workload that is performed in a predetermined, ordered sequence. Many patterns can be detected: forward, backward, skip counts, etc.
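To make the distinction between the two terms concrete, the sketch below labels a list of I/O offsets by the stride between consecutive requests. It is purely illustrative: the function name, the labels, and the interpretation of "skip" as a constant stride other than one block are assumptions, and it does not represent how Data ONTAP detects access patterns.

```python
from typing import Sequence


def classify_pattern(offsets: Sequence[int], block_size: int) -> str:
    """Label an access pattern by the stride between consecutive offsets."""
    if len(offsets) < 2:
        return "unknown"  # not enough requests to see a pattern
    strides = {b - a for a, b in zip(offsets, offsets[1:])}
    if len(strides) != 1 or 0 in strides:
        # Mixed or zero strides: this simple sketch finds no sequential pattern.
        return "random"
    stride = next(iter(strides))
    if stride == block_size:
        return "forward"
    if stride == -block_size:
        return "backward"
    return "skip"  # constant stride that is not exactly one block


if __name__ == "__main__":
    print(classify_pattern([0, 4096, 8192, 12288], 4096))   # forward
    print(classify_pattern([12288, 8192, 4096, 0], 4096))   # backward
    print(classify_pattern([0, 8192, 16384], 4096))         # skip
    print(classify_pattern([4096, 0, 20480, 8192], 4096))   # random
```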


Additional Information

Wikipedia page on Little's Law


