NetApp Knowledge Base

FAQ: SnapMirror Lag

Applies to

  • ONTAP 9
  • SnapMirror

Frequently Asked Questions


 
What is SnapMirror lag time?
  • SnapMirror relationship lag is the difference between when a snapshot is created and the system time on the destination when the next transfer completes
    • For example
      • A snapshot is created at 08:00
      • A scheduled SnapMirror update runs at 12:00
      • The transfer takes 15 minutes to complete
      • Lag time will be 4 hours and 15 minutes
        • How is that calculated? 4 hours elapse between snapshot creation and the start of the transfer, plus 15 minutes for the transfer to complete
  • A SnapMirror or SnapVault relationship's lag time is calculated using:
    • The snapshot timestamp
    • The time on the destination system
    • The amount of time needed to transfer the snapshot from source to destination
  • The term 'lag' is typically associated with performance, with the common perception that lag is the elapsed time since the last successful update
  • While this is not completely incorrect, it does not account for two other factors:
    • The time, based on the clock and timezone, on the source and destination storage controllers
    • The duration of the transfer
  • The time on the source and destination is important because this determines the timestamps on the file system and snapshots.
    • If time is configured incorrectly, timestamps will be inaccurate  
    • Because lag is calculated based on snapshot timestamps, if the time is not correct, lag will not be correct  
  • The duration of the transfer is also often overlooked because of the nature of replication
    • Lag is not measured based only on the time a transfer starts and completes
    • Lag is measured from the time the snapshot is created on the source, plus the duration of the transfer
      • Transfers can be scheduled updates or manual updates
  • Consider the following SnapMirror scenario:
    Source             Destination
    ControllerA:vol_1  ControllerB:vol_1_mir
  1. A scheduled update starts at 12:00pm
  2. A SnapMirror snapshot is created on the Source volume, and a transfer is started 
  3. The transfer takes 45 minutes to complete
  4. The time on the destination system is now 12:46pm
  5. The transfer completed 1 minute ago

If measured during step 5, the lag is 46 minutes, because:

  • 46 minutes have elapsed since the snapshot was created on the source
  • The transfer took 45 minutes, and 1 minute has elapsed since the snapshot was successfully transferred to the destination
  • On the destination, lag is calculated as the difference between:
    • The snapshot creation timestamp
    • The time on the destination, based on the destination storage controller's clock
  • If the time is not configured correctly on the destination or source, the lag time will be incorrect
  • Consider the following scenario:
    Primary         Secondary
    CIFS_SVM:vol_1  CIFS_DR:vol_1_dr
  1. Based on the snapshot policy on vol_1, a snapshot is created at 5 PM
  2. The snapshot is created with the snapmirror-label sv_daily
  3. At 1 AM the following morning, a scheduled SnapMirror update is triggered, configured to replicate any snapshot labeled sv_daily
  4. The transfer takes 30 minutes to complete

The lag in this scenario would be 8 hours 30 minutes, because:

  • At the time of the scheduled SnapMirror update, eight hours had elapsed since the snapshot was created and labeled sv_daily
  • Transferring the snapshot from the source to the destination took 30 minutes

In summary

  • Lag is the difference between the snapshot timestamp and the time on the destination system
  • Lag includes the amount of time needed to transfer a snapshot from source to destination
  • When examined in the context of snapshot timestamp, and the duration of the transfer, "long" lag times are often found to be normal
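The role of clock and timezone in this summary can be illustrated with a short Python sketch. It is not ONTAP code: the function name `lag`, the calendar date, and the two UTC offsets are assumptions chosen for illustration. It reuses the 08:00 snapshot example and shows that, when each timestamp carries its timezone, the difference is the same regardless of how each controller displays local time.

```python
from datetime import datetime, timedelta, timezone


def lag(snapshot_created: datetime, destination_now: datetime) -> timedelta:
    """Difference between the destination clock and the snapshot creation timestamp."""
    return destination_now - snapshot_created


# Hypothetical offsets: source at UTC-05:00, destination at UTC+01:00.
source_tz = timezone(timedelta(hours=-5))
destination_tz = timezone(timedelta(hours=1))

# Snapshot created at 08:00 source local time (arbitrary date).
snapshot_created = datetime(2024, 1, 15, 8, 0, tzinfo=source_tz)

# The transfer completes at 12:15 source local time,
# which the destination controller displays as 18:15.
destination_now = datetime(2024, 1, 15, 18, 15, tzinfo=destination_tz)

print(lag(snapshot_created, destination_now))  # 4:15:00
```

If either timestamp were recorded with the wrong offset, the same subtraction would return a lag that is off by exactly that error.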


How do I troubleshoot SnapMirror lag issues?

How to troubleshoot SnapMirror lag issues

How is SnapMirror lag time calculated?
  • SnapMirror lag time is the difference between when the snapshot was created on the source and the system time on the destination
  • There are additional factors that can affect lag, including 
    • The timestamp on the last successfully transferred snapshot
    • The system time on the destination
    • The amount of time needed to transfer the snapshot from the source to destination
  • Example of how lag is calculated
    • A snapshot is created at 08:00
    • A scheduled SnapMirror update runs at 12:00
    • The transfer takes 15 minutes to complete
    • Lag time will be 4 hours and 15 minutes
      • How is that calculated? 4 hours elapse between snapshot creation and the start of the transfer, plus 15 minutes for the transfer to complete
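The arithmetic in this example can be written out as a minimal Python sketch. The variable names and the calendar date are illustrative assumptions; only the times come from the example above.

```python
from datetime import datetime, timedelta

snapshot_created = datetime(2024, 1, 15, 8, 0)   # snapshot created at 08:00
update_started = datetime(2024, 1, 15, 12, 0)    # scheduled update runs at 12:00
transfer_duration = timedelta(minutes=15)        # transfer takes 15 minutes

# Time the snapshot waited before the transfer started: 4 hours.
wait_before_transfer = update_started - snapshot_created

# Lag when the transfer completes: waiting time plus transfer duration.
lag_at_completion = wait_before_transfer + transfer_duration

# The same value results from comparing the completion time
# with the snapshot creation timestamp directly.
transfer_completed = update_started + transfer_duration
assert lag_at_completion == transfer_completed - snapshot_created

print(lag_at_completion)  # 4:15:00
```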


What is an example of how SnapMirror lag time is calculated?
  • Scenario 1   
    Source             Destination
    ControllerA:vol_1  ControllerB:vol_1_mir
  1. A scheduled update starts at 12:00pm
  2. A SnapMirror snapshot is created on the Source volume, and a transfer is started 
  3. The transfer takes 45 minutes to complete
  4. The time on the destination system is now 12:46pm
  5. The transfer completed 1 minute ago

If measured during step 5, the lag is 46 minutes, because:

  • 46 minutes have elapsed since the snapshot was created on the source
  • The transfer took 45 minutes, and 1 minute has elapsed since the snapshot was successfully transferred to the destination
  • On the destination, lag is calculated as the difference between:
    • The snapshot creation timestamp
    • The time on the destination, based on the destination storage controller's clock
  • If the time is not configured correctly on the destination or source, the lag time will be incorrect
  • Scenario 2
    Primary         Secondary
    CIFS_SVM:vol_1  CIFS_DR:vol_1_dr
  1. Based on the snapshot policy on vol_1, a snapshot is created at 5 PM
  2. The snapshot is created with the snapmirror-label sv_daily
  3. At 1 AM the following morning, a scheduled SnapMirror update is triggered, configured to replicate any snapshot labeled sv_daily
  4. The transfer takes 30 minutes to complete

The lag in this scenario would be 8 hours 30 minutes, because:

  • At the time of the scheduled SnapMirror update, eight hours had elapsed since the snapshot was created and labeled sv_daily
  • Transferring the snapshot from the source to the destination took 30 minutes
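Both scenarios can be checked with a small Python sketch. The helper `lag_at` and the calendar dates are hypothetical; the times are taken from the scenarios. Full dates are used rather than times of day because Scenario 2 crosses midnight, and subtracting 5 PM from 1:30 AM without dates would give the wrong sign.

```python
from datetime import datetime, timedelta


def lag_at(snapshot_created: datetime, measured_at: datetime) -> timedelta:
    """Lag at the moment of measurement, relative to the snapshot creation time."""
    return measured_at - snapshot_created


# Scenario 1: snapshot created at 12:00 PM, lag measured at 12:46 PM.
scenario_1 = lag_at(
    datetime(2024, 1, 15, 12, 0),
    datetime(2024, 1, 15, 12, 46),
)

# Scenario 2: snapshot created at 5 PM, update starts at 1 AM the next day,
# and the transfer takes 30 minutes.
created = datetime(2024, 1, 15, 17, 0)
completed = datetime(2024, 1, 16, 1, 0) + timedelta(minutes=30)
scenario_2 = lag_at(created, completed)

print(scenario_1, scenario_2)  # 0:46:00 8:30:00
```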


Does the time on the source and destination impact lag time?
  • Yes, the system time on both the source and destination clusters must be correct for the lag time to be valid
  • Lag time is calculated from the creation timestamp of the last transferred snapshot and the system time on the destination when the transfer completes
    • If system times are configured incorrectly, timestamps will be inaccurate
    • Inaccurate timestamps will result in an inaccurate lag time
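The effect of an incorrect clock can be sketched in Python. This illustrates the arithmetic only and makes no claim about ONTAP internals; the function `reported_lag`, the calendar date, and the 30-minute clock error are hypothetical.

```python
from datetime import datetime, timedelta


def reported_lag(snapshot_created: datetime,
                 true_now: datetime,
                 destination_clock_error: timedelta) -> timedelta:
    """Lag as it would appear when the destination clock is off by a given error."""
    destination_now = true_now + destination_clock_error
    return destination_now - snapshot_created


snapshot_created = datetime(2024, 1, 15, 8, 0)   # correct source timestamp
true_now = datetime(2024, 1, 15, 12, 15)         # actual time at transfer completion

accurate = reported_lag(snapshot_created, true_now, timedelta(0))
clock_fast = reported_lag(snapshot_created, true_now, timedelta(minutes=30))
clock_slow = reported_lag(snapshot_created, true_now, timedelta(minutes=-30))

print(accurate)    # 4:15:00
print(clock_fast)  # 4:45:00
print(clock_slow)  # 3:45:00
```

A clock error on the source shifts the snapshot timestamp instead, which changes the result by the same amount in the opposite direction.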


Why is Active IQ Unified Manager sending alerts about SnapMirror lag time?
  • Active IQ Unified Manager can be configured to send alerts when SnapMirror lag time exceeds the specified threshold
    • ocumEvtMirrorVaultRelationshipLagWarning Asynchronous Mirror and Vault Lag Warning
    • ocumEvtMirrorVaultRelationshipLagError Asynchronous Mirror and Vault Lag Error
    • ocumEvtSnapMirrorRelationshipLagError Mirror Replication Lag Error
    • ocumEvtSnapMirrorRelationshipLagWarning Mirror Replication Lag Warning
  • Thresholds can be viewed under Settings -> Event Thresholds -> Relationship
    • With the thresholds shown in the screenshot below, a relationship configured to update every 60 minutes would alert as follows
      • Warning (150%): 90+ minutes of lag time
      • Error (250%): 150+ minutes of lag time

[Screenshot: relationship lag thresholds in Active IQ Unified Manager, Warning 150% and Error 250%]
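The percentage thresholds above translate into minutes as in the following Python sketch. The function names are hypothetical and this is not Unified Manager code. Treating "90+" as "greater than or equal to 90" is an assumption.

```python
def lag_thresholds(update_interval_minutes: float,
                   warning_pct: float = 150,
                   error_pct: float = 250) -> tuple[float, float]:
    """Convert percentage thresholds into minutes of lag for a given update interval."""
    warning = update_interval_minutes * warning_pct / 100
    error = update_interval_minutes * error_pct / 100
    return warning, error


def classify(lag_minutes: float, update_interval_minutes: float) -> str:
    """Return 'ok', 'warning', or 'error' (assumes thresholds are inclusive)."""
    warning, error = lag_thresholds(update_interval_minutes)
    if lag_minutes >= error:
        return "error"
    if lag_minutes >= warning:
        return "warning"
    return "ok"


print(lag_thresholds(60))  # (90.0, 150.0)
print(classify(46, 60))    # ok
print(classify(95, 60))    # warning
print(classify(200, 60))   # error
```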


How do I configure lag thresholds in Active IQ Unified Manager?

How to configure lag thresholds in AIQUM for unmanaged protection relationships


NetApp provides no representations or warranties regarding the accuracy or reliability or serviceability of any information or recommendations provided in this publication or with respect to any results that may be obtained by the use of the information or observance of any recommendations provided herein. The information in this document is distributed AS IS and the use of this information or the implementation of any recommendations or techniques herein is a customer's responsibility and depends on the customer's ability to evaluate and integrate them into the customer's operational environment. This document and the information contained herein may be used solely in connection with the NetApp products discussed in this document.