Are long Consistency Points (wafl.cp.toolong) normal?
Applies to
- ONTAP 9
- Data ONTAP 7-mode
Answer
- If there is no performance impact, then yes, an occasional long Consistency Point (CP) is normal and expected.
- Check whether latency is higher than expected.
- If long CPs occur frequently and the cause cannot be determined, open a case to investigate further.
Additional Information
- ONTAP can perform several background tasks during a CP and uses the longer CP time to do so:
- deswizzling
- WAFL scans
- deletions
- When ONTAP is relatively idle, long CPs are expected because they allow more priority for background work or write bursts.
- CPs happen asynchronously from the write path.
- Writes are acknowledged when written to NVRAM on both the local and partner node.
- Latency is unaffected because writes can be buffered until the CP completes.
- This means latency is not impacted even when data stays in RAM for 30 or more seconds because of a low write workload.
- ONTAP 9.15.1 includes enhancements that reduce this background delay. Because long CPs of this kind are expected behavior, an immediate upgrade is not required as a fix.
- If latency rises above expected levels during this time, investigation is needed.
- The long CP is a secondary symptom of a primary symptom such as:
- "wafl.cp.toolong" message seen in ONTAP logs when disk is about to fail or being failed.
- "wafl.cp.toolong" message seen in ONTAP Logs during high disk workload.
- "wafl.cp.toolong" message seen in ONTAP logs for archive, backup, or disaster recovery storage system.
- A CPU bottleneck in the Data Processing (CPU d-blade) delay center.
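The guidance above reduces to one decision: a long CP on its own is expected, while a long CP accompanied by latency above the expected level calls for investigation. A minimal Python sketch of that rule follows. The function name, both parameters, and the example values are hypothetical illustrations, not an ONTAP API; the expected latency has to come from your own baseline.

```python
def triage_long_cp(observed_latency_ms: float, expected_latency_ms: float) -> str:
    """Apply the rule from this article to one long-CP event.

    Both arguments are hypothetical inputs. ONTAP does not supply them in
    this form, and the expected latency must come from your own baseline.
    """
    if observed_latency_ms > expected_latency_ms:
        # Latency rose above the expected level: the long CP is likely a
        # secondary symptom, so look for the primary one.
        return "investigate"
    # No latency impact: an occasional long CP is normal and expected.
    return "normal"


# Illustrative values only.
print(triage_long_cp(observed_latency_ms=1.2, expected_latency_ms=5.0))
print(triage_long_cp(observed_latency_ms=18.0, expected_latency_ms=5.0))
```

The sketch deliberately ignores CP duration and frequency, because the article treats latency, not the length of the CP, as the deciding signal.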
- Example: Messages being logged in the event log (EMS) in different Data ONTAP or ONTAP versions:
- ONTAP 9:
Mon Dec 23 00:20:36 EST [FilerA: wafl_exempt08: wafl.cp.toolong:error]: Aggregate fas_01_DATA_AGGR experienced a long CP... This message occurs when a WAFL(R) consistency point (CP) takes longer than 30 seconds. A CP lasting more than 30 seconds might cause client latency and potentially a client outage.
- Data ONTAP 7-mode:
Mon Feb 22 16:14:08 CLT [FilerA: wafl_CP_proc: wafl.cp.toolong.warning:warning]: params: ...
Mon Feb 22 16:14:08 CLT [FilerA: wafl_CP_proc: wafl.cp.slovol.warning:warning]: params: ...
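To judge whether long CPs are occasional or frequent, it can help to count the events in an exported log. The Python sketch below is one possible way to do that. It assumes the EMS log has been saved as plain text in the bracketed `[node: process: event:severity]` layout shown in the examples above. The function name and the regular expression are illustrative assumptions, not a NetApp tool.

```python
import re
from collections import Counter

# Bracketed EMS header as it appears in the examples above:
# [node: process: event:severity]
EMS_HEADER = re.compile(
    r"\[(?P<node>[^:\]]+):\s*(?P<process>[^:\]]+):\s*"
    r"(?P<event>[\w.]+):(?P<severity>\w+)\]"
)


def count_long_cp_events(lines):
    """Count wafl.cp.toolong events per (node, event, severity)."""
    counts = Counter()
    for line in lines:
        # finditer copes with several messages fused onto one line.
        for match in EMS_HEADER.finditer(line):
            event = match.group("event")
            # Covers both wafl.cp.toolong and wafl.cp.toolong.warning.
            if event.startswith("wafl.cp.toolong"):
                key = (match.group("node"), event, match.group("severity"))
                counts[key] += 1
    return counts


# Sample lines taken from the examples in this article.
sample = [
    "Mon Dec 23 00:20:36 EST [FilerA: wafl_exempt08: wafl.cp.toolong:error]: "
    "Aggregate fas_01_DATA_AGGR experienced a long CP...",
    "Mon Feb 22 16:14:08 CLT [FilerA: wafl_CP_proc: wafl.cp.toolong.warning:warning]: params: ...",
    "Mon Feb 22 16:14:08 CLT [FilerA: wafl_CP_proc: wafl.cp.slovol.warning:warning]: params: ...",
]

for key, total in count_long_cp_events(sample).items():
    print(key, total)
```

The count only shows how often the message is logged. Whether that frequency matters still depends on the latency check described earlier.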
- ONTAP 9: