Slow FlexCache operations on the origin volume on workloads like SVN
Applies to
- ONTAP 9.5 and newer
- FlexCache
- NFSv3
- NLM
- FlexCache Cache volumes with read-only NFS Export policies
Issue
- NLM locks hang and queue up behind an outstanding rename operation
- The workload is an SVN repository that hosts code, similar to how Git functions
- High rename latency is seen, in the range of seconds or tens of seconds
- Note: Renames are the main cause, but any operation that modifies the file system (writes, setattrs, deletes, etc.) will cause this
- From strace on Linux clients, the operation that hangs is seen to be a rename
- Multiple accesses occur on the same file at the same time, with modifying operations on the origin volume and read operations on the cache volume
Example: Client output showing the latency issue caused by a hanging NFS rename. The svn task is in state D and nfs_rename appears in the call trace
[ +0.001852] task:svn state:D stack: 0 pid:61131 ppid: 46494 flags:0x00000004
[ +0.000001] Call Trace:
[ +0.000011] ? __rpc_wait_for_completion_task+0x30/0x30 [sunrpc]
[ +0.000001] __schedule+0x3fa/0x740
[ +0.000007] ? __rpc_wait_for_completion_task+0x30/0x30 [sunrpc]
[ +0.000000] schedule+0x4b/0xb0
[ +0.000006] rpc_wait_bit_killable+0x24/0xa0 [sunrpc]
[ +0.000000] __wait_on_bit+0x6e/0xa0
[ +0.000011] out_of_line_wait_on_bit+0x8e/0xb0
[ +0.000001] ? init_wait_var_entry+0x50/0x50
[ +0.000005] __rpc_wait_for_completion_task+0x2d/0x30 [sunrpc]
[ +0.000008] nfs_rename+0xbc/0x2e0 [nfs]
[ +0.000005] vfs_rename+0x681/0x920
[ +0.000004] ? lookup_dcache+0x44/0x70
[ +0.000004] do_renameat2+0x494/0x530
[ +0.000004] ? do_renameat2+0x494/0x530
[ +0.000004] __x64_sys_rename+0x20/0x30
[ +0.000004] do_syscall_64+0x37/0x80
[ +0.000004] entry_SYSCALL_64_after_hwframe+0x44/0xa9
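When many such traces are collected from clients, it can help to pull out the NFS client function each blocked task is waiting in. The following is a minimal sketch, not a NetApp tool: the function name `blocked_nfs_calls` and the regular expression are illustrative assumptions, based only on the frame format seen in the trace above (`symbol+0xoff/0xlen [module]`, with a leading `?` marking an unreliable frame).

```python
import re
from typing import List

# Assumed frame format, taken from the example trace:
#   nfs_rename+0xbc/0x2e0 [nfs]
# An optional leading "? " marks a frame the kernel considers unreliable.
FRAME_RE = re.compile(
    r"(\?\s+)?([A-Za-z_][\w.]*)\+0x[0-9a-f]+/0x[0-9a-f]+(?:\s+\[(\w+)\])?"
)


def blocked_nfs_calls(trace: str) -> List[str]:
    """Return functions from the [nfs] module found in reliable frames."""
    calls = []
    for match in FRAME_RE.finditer(trace):
        unreliable, func, module = match.groups()
        # Skip "?" frames and anything outside the nfs module (e.g. sunrpc).
        if unreliable is None and module == "nfs":
            calls.append(func)
    return calls
```

Passing the trace above to `blocked_nfs_calls` should yield only `nfs_rename`, which matches the rename hang described in this article. The sketch works whether the trace has one frame per line or has been flattened onto a single line.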
Example:
- SVN uses file locking to maintain a file called current (used to track the current SVN repo version) by putting NLM locks on a file called write-lock
- Here is an example of vserver lock show for the write-lock file
Cluster::> vserver lock show -vserver svm0 -volume nfs -path /vol2/svn/write-lock -fields bytelock-offset,client-address,bytelock-length,lock-state
vserver volume lif   lif-id path                 lock-state bytelock-offset bytelock-length      client-address
------- ------ ----- ------ -------------------- ---------- --------------- -------------------- --------------
svm0    nfs    lif_1 1026   /vol2/svn/write-lock granted    0               18446744073709551615 10.1.2.3
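The bytelock-length value 18446744073709551615 is 2^64 - 1, the largest unsigned 64-bit value, so a lock starting at offset 0 with this length can be read as covering the whole file. The sketch below parses one data row of the output above and checks for that case. It is illustrative only: the `ByteLock` class and `parse_lock_row` function are hypothetical names, the column order is assumed to match the example exactly, and paths are assumed to contain no spaces.

```python
from dataclasses import dataclass

# 2**64 - 1 == 18446744073709551615, as shown in the bytelock-length column.
WHOLE_FILE_LENGTH = 2**64 - 1


@dataclass
class ByteLock:
    vserver: str
    volume: str
    lif: str
    lif_id: int
    path: str
    state: str
    offset: int
    length: int
    client: str

    @property
    def covers_whole_file(self) -> bool:
        # A lock from offset 0 with the maximum length spans the entire file.
        return self.offset == 0 and self.length == WHOLE_FILE_LENGTH


def parse_lock_row(row: str) -> ByteLock:
    """Parse one whitespace-separated data row (assumed 9 columns, as in the example)."""
    fields = row.split()
    if len(fields) != 9:
        raise ValueError(f"expected 9 columns, got {len(fields)}")
    return ByteLock(
        vserver=fields[0],
        volume=fields[1],
        lif=fields[2],
        lif_id=int(fields[3]),
        path=fields[4],
        state=fields[5],
        offset=int(fields[6]),
        length=int(fields[7]),
        client=fields[8],
    )
```

For the row shown above, `parse_lock_row(...)` returns a granted lock held by client 10.1.2.3 whose `covers_whole_file` property is true, consistent with SVN locking the entire write-lock file.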