What are the Delay Centers from different performance monitoring tools?
Applies to
- ONTAP 9
- Active IQ Unified Manager (AIQUM)
- OnCommand Unified Manager (OCUM)
- AIQ PAS
Answer
- ONTAP relies on QoS to break down latency and identify the components that are in contention. Each Delay Center represents the delay in its corresponding component.
- Performance monitoring tools have used different terms for each Delay Center. This KB article clarifies the differences and helps correlate the Delay Centers across tools.
| Delay Centers | CLI | AIQUM/OCUM | AIQ PAS | Perfstat | Harvest |
|---|---|---|---|---|---|
| CPU Delay in Protocol Layer | Network | Network Processing | CPU N-blade | CPU_network or CPU_protocol | Network |
| CPU Delay in Network Layer | Network | Network Processing | CPU N-blade | CPU_network | Frontend |
| Delay in Cluster Interconnect | Cluster | Cluster Interconnect | Cluster Interconnect | DELAY_CENTER_CLUSTER_INTERCONNECT | Cluster |
| External Delay | Network | Network | Network | DELAY_CENTER_NETWORK | Network |
| CPU Delay in WAFL Layer | Data | Data Processing | CPU D-blade | CPU_wafl_exempt | Backend |
| Disk Delay | Disk | Aggregate Operations | Disk IO | DELAY_CENTER_DISK_IO | Disk |
| CP Delay | Data | Data Processing | WAFL Susp CP | DELAY_CENTER_WAFL_SUSP_CP | N/A |
| Delay from other suspensions | Data | Data Processing | WAFL Susp Other | DELAY_CENTER_WAFL_SUSP_OTHER | Suspend |
| NVLOG TRANSFER Delay | NVRAM | Data Processing | NVLOG_Transfer | DELAY_CENTER_NVLOG_TRANSFER | NVLog Mirroring |
| QoS Max Delay | QoS Max | QoS Limit Max | QoS Limit | DELAY_CENTER_QOS_LIMIT | Throttle |
| QoS Min Delay | QoS Min | QoS Limit Min | QoS Min | DELAY_CENTER_QOS_MIN_THROUGHPUT | N/A |
| Cloud Delay | Cloud | Cloud Latency | Cloud IO | DELAY_CENTER_CLOUD_IO | Cloud |
| FlexCache Delay | FlexCache | N/A | FlexCache RAL + FlexCache Spinhi | DELAY_CENTER_FLEXCACHE_RAL + DELAY_CENTER_FLEXCACHE_SPINHI | N/A |
| Sync SnapMirror Delay | SM Sync | Sync SnapMirror | Sync Repl | DELAY_CENTER_SYNC_REPL | N/A |
| Volume Activation Delay | VA | Volume Activation | COP | DELAY_CENTER_COP | N/A |
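When correlating output from two tools, the table above can be treated as a lookup structure. The following is a minimal Python sketch of that idea, not part of any NetApp tool: the names `TOOLS`, `DELAY_CENTERS` and `translate` are hypothetical, and the data is simply transcribed from the table, with `None` standing in for N/A.

```python
# Column order of the table above.
TOOLS = ("CLI", "AIQUM/OCUM", "AIQ PAS", "Perfstat", "Harvest")

# One tuple per Delay Center, transcribed from the table above; None marks "N/A".
DELAY_CENTERS = {
    "CPU Delay in Protocol Layer": ("Network", "Network Processing", "CPU N-blade", "CPU_network or CPU_protocol", "Network"),
    "CPU Delay in Network Layer": ("Network", "Network Processing", "CPU N-blade", "CPU_network", "Frontend"),
    "Delay in Cluster Interconnect": ("Cluster", "Cluster Interconnect", "Cluster Interconnect", "DELAY_CENTER_CLUSTER_INTERCONNECT", "Cluster"),
    "External Delay": ("Network", "Network", "Network", "DELAY_CENTER_NETWORK", "Network"),
    "CPU Delay in WAFL Layer": ("Data", "Data Processing", "CPU D-blade", "CPU_wafl_exempt", "Backend"),
    "Disk Delay": ("Disk", "Aggregate Operations", "Disk IO", "DELAY_CENTER_DISK_IO", "Disk"),
    "CP Delay": ("Data", "Data Processing", "WAFL Susp CP", "DELAY_CENTER_WAFL_SUSP_CP", None),
    "Delay from other suspensions": ("Data", "Data Processing", "WAFL Susp Other", "DELAY_CENTER_WAFL_SUSP_OTHER", "Suspend"),
    "NVLOG TRANSFER Delay": ("NVRAM", "Data Processing", "NVLOG_Transfer", "DELAY_CENTER_NVLOG_TRANSFER", "NVLog Mirroring"),
    "QoS Max Delay": ("QoS Max", "QoS Limit Max", "QoS Limit", "DELAY_CENTER_QOS_LIMIT", "Throttle"),
    "QoS Min Delay": ("QoS Min", "QoS Limit Min", "QoS Min", "DELAY_CENTER_QOS_MIN_THROUGHPUT", None),
    "Cloud Delay": ("Cloud", "Cloud Latency", "Cloud IO", "DELAY_CENTER_CLOUD_IO", "Cloud"),
    "FlexCache Delay": ("FlexCache", None, "FlexCache RAL + FlexCache Spinhi", "DELAY_CENTER_FLEXCACHE_RAL + DELAY_CENTER_FLEXCACHE_SPINHI", None),
    "Sync SnapMirror Delay": ("SM Sync", "Sync SnapMirror", "Sync Repl", "DELAY_CENTER_SYNC_REPL", None),
    "Volume Activation Delay": ("VA", "Volume Activation", "COP", "DELAY_CENTER_COP", None),
}


def translate(term, from_tool, to_tool):
    """Return the sorted to_tool terms for every row whose from_tool cell equals term."""
    src = TOOLS.index(from_tool)
    dst = TOOLS.index(to_tool)
    matches = {
        row[dst]
        for row in DELAY_CENTERS.values()
        if row[src] == term and row[dst] is not None
    }
    return sorted(matches)


# A combined CLI column maps to several AIQ PAS Delay Centers.
print(translate("Data", "CLI", "AIQ PAS"))
# → ['CPU D-blade', 'WAFL Susp CP', 'WAFL Susp Other']
```

A term that returns more than one result is a combined Delay Center in the source tool, so each returned component is a candidate for further analysis.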
Notes:
- The CLI command to break down the latency is:
Cluster::> qos statistics volume latency show
- A long delay in a particular Delay Center is a good indicator that the corresponding component is in contention. Further performance analysis should focus on that component.
- The following Delay Centers combine the delays from multiple components, so each of those components needs to be considered:
- Network & Data from CLI
- Network Processing & Data Processing from AIQUM
- CPU N-blade from AIQ PAS
- Delay accounting might differ slightly between protocols, especially in the Network/Protocol related Delay Centers. Refer to the table below.
- ASA makes a difference in Cluster Interconnect. Refer to the table below.
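The triage described in the notes above (find the Delay Center with the longest delay, then expand it if it is a combined one) can be sketched in Python as follows. This is an illustration under assumptions: the function name `dominant_delay_center` is hypothetical, the sample values are invented rather than taken from a real system, and the code does not parse CLI output. The expansion lists are read from the CLI column of the first table.

```python
# Components folded into each combined CLI column, read from the first table.
CLI_COMPONENTS = {
    "Network": [
        "CPU Delay in Protocol Layer",
        "CPU Delay in Network Layer",
        "External Delay",
    ],
    "Data": [
        "CPU Delay in WAFL Layer",
        "CP Delay",
        "Delay from other suspensions",
    ],
}


def dominant_delay_center(breakdown_us):
    """Return (column, latency, components) for the largest contributor.

    breakdown_us maps a CLI column name to its latency in microseconds.
    For a column that is not combined, components is just [column].
    """
    column = max(breakdown_us, key=breakdown_us.get)
    return column, breakdown_us[column], CLI_COMPONENTS.get(column, [column])


# Hypothetical per-column latencies in microseconds (not real output).
sample = {"Network": 120.0, "Cluster": 0.0, "Data": 850.0, "Disk": 300.0, "NVRAM": 40.0}

column, latency, components = dominant_delay_center(sample)
print(column, latency)
# → Data 850.0
print(components)
# → ['CPU Delay in WAFL Layer', 'CP Delay', 'Delay from other suspensions']
```

When the dominant column is Network or Data, a tool that reports the components separately, such as Perfstat or AIQ PAS, is needed to tell which one is actually in contention.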
| Delay Center | Protocol | Perfstat | CLI | AIQUM | AIQ PAS |
|---|---|---|---|---|---|
| CPU Delay in Protocol Layer | NAS | CPU_network | Network | Network Processing | CPU N-blade |
| CPU Delay in Network Layer | NAS | CPU_network | Network | Network Processing | CPU N-blade |
| Delay in Cluster Interconnect | NAS | DELAY_CENTER_CLUSTER_INTERCONNECT | Cluster | Cluster Interconnect | Cluster Interconnect |
| External Delay (R2T, XFER_RDY) | NAS | N/A | N/A | N/A | N/A |
| CPU Delay in Protocol Layer | iSCSI | CPU_protocol | Network | Network Processing | CPU N-blade |
| CPU Delay in Network Layer | iSCSI | CPU_network | Network | Network Processing | CPU N-blade |
| Delay in Cluster Interconnect | iSCSI | N/A | N/A | N/A | N/A |
| Delay in Cluster Interconnect | iSCSI ASA | DELAY_CENTER_CLUSTER_INTERCONNECT | Cluster | Cluster Interconnect | Cluster Interconnect |
| External Delay (R2T, XFER_RDY) | iSCSI | DELAY_CENTER_NETWORK | Network | Network | Network |
| CPU Delay in Protocol Layer | FCP | CPU_protocol | Network | Network Processing | CPU N-blade |
| CPU Delay in Network Layer | FCP | N/A | N/A | N/A | N/A |
| CPU Delay in Network Layer | FCP ASA | CPU_network | Network | Network Processing | CPU N-blade |
| Delay in Cluster Interconnect | FCP | N/A | N/A | N/A | N/A |
| Delay in Cluster Interconnect | FCP ASA | DELAY_CENTER_CLUSTER_INTERCONNECT | Cluster | Cluster Interconnect | Cluster Interconnect |
| External Delay (R2T, XFER_RDY) | FCP | DELAY_CENTER_NETWORK | Network | Network | Network |
Notes:
- Perfstat is the only tool that distinguishes CPU_protocol from CPU_network.
- In other tools, the two are combined in the Network CPU related Delay Centers, such as Network (CLI), Network Processing (AIQUM) and CPU N-blade (AIQ PAS).
- For NAS, all the protocol related overhead occurs in nwk_exempt, just like the network related overhead, so NAS only has CPU_network.
- ASA is an exception for SAN.
- For non-ASA, SAN doesn't have indirect access when the primary path is utilized, so there is no delay from Cluster Interconnect.
- For ASA, indirect access is always expected, so delay from Cluster Interconnect should be expected.
- For ASA FCP, CPU Delay in Network Layer should be expected because, in the backend, indirect access uses the nwk_exempt CPU domain.
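The per-protocol rules in the table and notes above can be condensed into a small Python function. This is a sketch under assumptions: the name `expected_perfstat_counters` is hypothetical, the counter names are the Perfstat names from the table, and the logic only restates the table. It does not query a system.

```python
def expected_perfstat_counters(protocol, asa=False):
    """Return the Perfstat network-side counters expected for a protocol.

    protocol is "NAS", "iSCSI" or "FCP" (case-insensitive).
    asa only applies to SAN protocols and is ignored for NAS.
    """
    proto = protocol.lower()
    if proto == "nas":
        # NAS protocol overhead is accounted in CPU_network; no external delay.
        return {"CPU_network", "DELAY_CENTER_CLUSTER_INTERCONNECT"}
    if proto not in ("iscsi", "fcp"):
        raise ValueError(f"unknown protocol: {protocol!r}")

    # Both SAN protocols report protocol CPU and external (R2T / XFER_RDY) delay.
    counters = {"CPU_protocol", "DELAY_CENTER_NETWORK"}
    if proto == "iscsi" or asa:
        # iSCSI always has CPU_network; FCP has it only on ASA.
        counters.add("CPU_network")
    if asa:
        # On ASA, indirect access is expected, so cluster interconnect delay appears.
        counters.add("DELAY_CENTER_CLUSTER_INTERCONNECT")
    return counters


print(sorted(expected_perfstat_counters("FCP")))
# → ['CPU_protocol', 'DELAY_CENTER_NETWORK']
print(sorted(expected_perfstat_counters("FCP", asa=True)))
# → ['CPU_network', 'CPU_protocol', 'DELAY_CENTER_CLUSTER_INTERCONNECT', 'DELAY_CENTER_NETWORK']
```

A counter that shows delay but is absent from the returned set (for example, cluster interconnect delay on non-ASA FCP) suggests that traffic is not using the primary path.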
Additional Information
| Delay Centers | Definition |
|---|---|
| CPU Delay in Protocol Layer | Delay in handling protocol requests, such as NFS, CIFS, FCP and iSCSI |
| CPU Delay in Network Layer | Delay in network handling, such as converting a TCP frame to a protocol request, or vice versa |
| Delay in Cluster Interconnect | Delay in communication between nodes when there is indirect traffic |
| External Delay | Delay outside of ONTAP, such as iSCSI Ready to Transfer (R2T) or FCP Transfer Ready (XFER_RDY) |
| CPU Delay in WAFL Layer | Delay in WAFL CPU |
| Disk Delay | Delay from fetching the required data from the disk layer |
| CP Delay | Delay from WAFL suspensions on CPs, usually B/b CPs |
| Delay from other suspensions | Delay from WAFL suspensions for reasons other than CPs |
| NVLOG TRANSFER Delay | Delay from nvlogging |
| QoS Max Delay | Delay from QoS Limit |
| QoS Min Delay | Delay from QoS Floor. Volumes with a QoS Floor will not see this delay, but they might starve other volumes that have no QoS Floor |
| Cloud Delay | Delay between ONTAP and the cloud tier on which user data is stored |
| FlexCache Delay | Delay between the FlexCache volume and the Origin volume when ONTAP has to look up or retrieve data from the Origin, or write data to the Origin |
| Sync SnapMirror Delay | Delay from Sync SnapMirror replication |
| Volume Activation Delay | Delay from WAFL scheduling when a node has more than 1,000 FlexVols |
The links below have more detailed explanations of each Delay Center:
