Recurring “wafl.cp.toolong:error” Events and NFS Write Failures After ONTAP Upgrade
Applies to
- NetApp AFF-A700
- ONTAP 9.14.1P9, 9.15.1P11
- Clusters with NFS workloads and scheduled backups
- Customers experiencing “wafl.cp.toolong:error” events and NFS writev failures (error no. 32) during backup snapshot deletes
Issue
- After upgrading to ONTAP 9.14.1P9, customers observed recurring “wafl.cp.toolong:error” messages in the event log, most frequently during backup windows. The messages persisted after upgrading to 9.15.1P11. These events coincided with NFS clients reporting write failures (error no. 32, “broken pipe”) in their kernel logs.
Fri Aug 12 05:23:26 +0200 [node1:wafl_exempt02:wafl.cp.toolong:error]: Aggregate aggr1_xxxxxxxxx experienced a long CP.
Oct 17 05:24:46 h3020022SAPSLI_74[15390]: Q0IBasisSystem: Operating system call writev failed (error no.32)
- The errors occur when backup software deletes snapshots after nightly backups, causing WAFL to trigger long consistency points (CPs).
- NFS clients experience write failures during these long CPs, but no measurable latency or production impact was reported.
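
To confirm that the long-CP events cluster on particular aggregates, the EMS lines can be tallied with a short script. The following is a minimal sketch, not a NetApp tool: it assumes the log has been exported as plain text with the bracketed `[node:process:event:severity]:` header shown in the sample above, and the function names `parse_ems_line` and `count_long_cps` are hypothetical helpers introduced for illustration.

```python
import re
from collections import Counter

# Matches the bracketed EMS header "[node:process:event:severity]:" and the
# message text that follows it. The timestamp before the bracket is ignored.
EMS_RE = re.compile(
    r"\[(?P<node>[^:\]]+):(?P<process>[^:\]]+):"
    r"(?P<event>[^:\]]+):(?P<severity>[^:\]]+)\]:\s*(?P<message>.*)"
)

# Extracts the aggregate name from the long-CP message text seen in this article.
AGGR_RE = re.compile(r"Aggregate (\S+) experienced a long CP")


def parse_ems_line(line):
    """Return the header fields and message as a dict, or None if no header is found."""
    match = EMS_RE.search(line)
    return match.groupdict() if match else None


def count_long_cps(lines):
    """Count wafl.cp.toolong events per (node, aggregate) pair."""
    counts = Counter()
    for line in lines:
        record = parse_ems_line(line)
        if record is None or record["event"] != "wafl.cp.toolong":
            continue
        aggr = AGGR_RE.search(record["message"])
        if aggr:
            counts[(record["node"], aggr.group(1))] += 1
    return counts


# Sample line taken from the event log excerpt in this article.
sample = [
    "Fri Aug 12 05:23:26 +0200 [node1:wafl_exempt02:wafl.cp.toolong:error]: "
    "Aggregate aggr1_xxxxxxxxx experienced a long CP.",
]

for (node, aggregate), n in count_long_cps(sample).items():
    print(f"{node} {aggregate}: {n} long CP event(s)")
```

Only the bracketed header is parsed, so the tally does not depend on the timestamp format of the exported log. Comparing the counts against the backup schedule is left to the reader.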
