What does "lag" mean for SnapMirror/SnapVault?
Applies to
ONTAP 9
Answer
- A SnapMirror or SnapVault relationship's lag time is calculated using:
  - The snapshot timestamp
  - The time on the destination system
  - The amount of time needed to transfer the snapshot from source to destination
- The term 'lag' is typically associated with performance, with the common perception that lag is the elapsed time since the last successful update
- While this is not completely incorrect, it does not account for two other factors:
  - The time, based on the clock and timezone, on the source and destination storage controllers
  - The duration of the transfer
- The time on the source and destination is important because this determines the timestamps on the file system and snapshots.
- If time is configured incorrectly, timestamps will be inaccurate
- Because lag is calculated based on snapshot timestamps, if the time is not correct, lag will not be correct
- The duration of the transfer is also overlooked because of the nature of replication
- Lag is not measured based only on the time a transfer starts and completes
- Lag is measured from the time the snapshot is created on the source, so it includes the duration of the transfer
- This applies to both scheduled and manual update transfers
- Consider the following SnapMirror scenario:
Source: ControllerA:vol_1
Destination: ControllerB:vol_1_mir
1. A scheduled update starts at 12:00pm
2. A SnapMirror snapshot is created on the source volume, and a transfer is started
3. The transfer takes 45 minutes to complete
4. The time on the destination system is now 12:46pm
5. The transfer completed 1 minute ago
If measured during step 5, the lag is 46 minutes, because:
- 46 minutes have elapsed since the snapshot was created on the source
- The 45-minute transfer is included in that time: the transfer completed only 1 minute ago, but lag is counted from snapshot creation, not from transfer completion
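The arithmetic in this scenario can be sketched in a few lines of Python. This is only an illustration of the calculation described here, not ONTAP's implementation; the function name `lag` and the calendar date are assumptions, and only the times of day come from the scenario.

```python
from datetime import datetime, timedelta


def lag(snapshot_created: datetime, destination_now: datetime) -> timedelta:
    """Destination clock time minus the snapshot creation timestamp."""
    return destination_now - snapshot_created


# Hypothetical calendar date; only the times of day come from the scenario.
snapshot_created = datetime(2024, 1, 1, 12, 0)    # snapshot created at 12:00pm
transfer_completed = snapshot_created + timedelta(minutes=45)
destination_now = datetime(2024, 1, 1, 12, 46)    # destination clock reads 12:46pm

print(lag(snapshot_created, destination_now))     # 0:46:00
print(destination_now - transfer_completed)       # 0:01:00
```

The second value is the "time since the last successful update" that lag is often mistaken for; the first is the lag as defined in this article.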
- On the destination, lag is calculated by finding the difference between:
  - The snapshot creation timestamp
  - The time on the destination, based on the destination storage controller's clock
- If the time is not configured correctly on the destination or source, the lag time will be incorrect
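To see why clock configuration matters, the following Python sketch models a clock error as a fixed offset on either system. It is a simplified model under stated assumptions, not a description of ONTAP internals: the function `reported_lag`, its parameters, and the 10-minute offsets are all hypothetical.

```python
from datetime import datetime, timedelta


def reported_lag(true_created, true_now,
                 source_skew=timedelta(0), destination_skew=timedelta(0)):
    """Lag as it would be computed when one or both clocks are off.

    The snapshot timestamp is stamped by the source clock, and "now" is
    read from the destination clock, so each skew shifts the result.
    """
    stamped_created = true_created + source_skew
    destination_clock = true_now + destination_skew
    return destination_clock - stamped_created


# Hypothetical values: the real elapsed time is 46 minutes.
true_created = datetime(2024, 1, 1, 12, 0)
true_now = datetime(2024, 1, 1, 12, 46)
ten_min = timedelta(minutes=10)

print(reported_lag(true_created, true_now))                            # 0:46:00
print(reported_lag(true_created, true_now, destination_skew=ten_min))  # 0:56:00
print(reported_lag(true_created, true_now, source_skew=ten_min))       # 0:36:00
```

In this model, a destination clock that runs fast inflates the reported lag, while a source clock that runs fast shrinks it, even though the real elapsed time is unchanged.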
- Consider the following scenario:
Primary: CIFS_SVM:vol_1
Secondary: CIFS_DR:vol_1_dr
- Based on the snapshot policy on vol_1, a snapshot is created at 5pm
- The snapshot is created with the snapmirror-label sv_daily
- At 1am the following morning, a scheduled SnapMirror update is triggered; it is configured to replicate any snapshot labeled sv_daily
- The transfer takes 30 minutes to complete
The lag in this scenario, measured when the transfer completes, would be 8 hours 30 minutes, because:
- At the time of the scheduled SnapMirror update, eight hours had elapsed since the snapshot was created and labeled sv_daily
- Transferring the snapshot from the source to the destination took 30 minutes
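The breakdown of this lag into waiting time and transfer time can be sketched as follows. The helper `lag_breakdown` and the calendar dates are assumptions made for illustration; the times of day and the 30-minute transfer come from the scenario.

```python
from datetime import datetime, timedelta


def lag_breakdown(snapshot_created, update_started, transfer_duration):
    """Split the lag at transfer completion into its two components."""
    waiting = update_started - snapshot_created   # snapshot age when the update begins
    total = waiting + transfer_duration           # lag when the transfer completes
    return waiting, transfer_duration, total


# Hypothetical dates: snapshot at 5pm, update at 1am the following morning.
waiting, transfer, total = lag_breakdown(
    snapshot_created=datetime(2024, 1, 1, 17, 0),
    update_started=datetime(2024, 1, 2, 1, 0),
    transfer_duration=timedelta(minutes=30),
)

print(waiting)   # 8:00:00
print(transfer)  # 0:30:00
print(total)     # 8:30:00
```

Most of the lag here comes from the gap between the snapshot schedule and the update schedule, not from the transfer itself.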
In summary
- Lag is the difference between the snapshot timestamp and the time on the destination system
- Lag includes the amount of time needed to transfer a snapshot from source to destination
- When examined in the context of the snapshot timestamp and the duration of the transfer, "long" lag times are often found to be normal
Additional Information