NFS access hangs and CLI commands hang or are slow to respond after upgrading a cluster with FlexCache to ONTAP 9.15.1
Applies to
- ONTAP 9.15.1+
- FlexCache
- NFS
Issue
- After upgrading a node hosting a FlexCache volume to ONTAP 9.15.1+, NFS access hangs and most CLI commands hang or are slow to respond
- The following is also seen in logs
- sktrace.log shows slow SPINVFS_FILEOPS_<OP> entries with "time=<val>" greater than 5000 ms
- SPINVFS_FILEOPS_WRITERPC: spin_writerpc end: tid=(123456:spiniod 3) vnode 0xfffff1234a5678b0 dsid 0xa12b34cd fid 12345, err=0, time=7227 ms
- mgwd.log may log 'SQLite mutex_trace: long delay in enter'
- [kern_mgwd:info:3822] 0xabcdef01234: 0: WARNING: SQL_CONTEXT: sqlite_mutex_leave_traced:src/sql_context.cc:609 SQLite mutex_trace: long delay in enter (6353ms) for mutex 0x123a01234
- EMS may log CTRAN connection closed events
- CsmMpAgentThread: csm.connClose:debug]: Session (req=node:dblade, rsp=node:dblade, uniquifier=0b01234567890123) with Tag CTRAN: Connection from local lif 0 to remote IP is closed.
- Nodes with 48 or more CPU cores and a large number of FlexCache relationships to a large number of remote nodes are more likely to experience this issue
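The log signatures listed above can be checked with a short script once the relevant log lines have been collected. The following is a minimal sketch, not a NetApp tool: the function names are hypothetical, the regular expressions are modeled only on the sample lines shown in this article (real log lines may differ in format), and the 5000 threshold is assumed to be in milliseconds, matching the "time=7227 ms" sample.

```python
import re

# Assumption: patterns are derived from the sample log lines in this article.
# Actual sktrace.log and mgwd.log formats may vary between releases.
SLOW_OP_RE = re.compile(r"(SPINVFS_FILEOPS_\w+):.*?\btime=(\d+)\s*ms")
MUTEX_DELAY_RE = re.compile(r"long delay in enter \((\d+)ms\)")

# Assumption: the article's "greater than 5000" refers to milliseconds.
THRESHOLD_MS = 5000


def find_slow_ops(lines, threshold_ms=THRESHOLD_MS):
    """Return (operation, time_ms) pairs for sktrace-style lines above the threshold."""
    hits = []
    for line in lines:
        match = SLOW_OP_RE.search(line)
        if match and int(match.group(2)) > threshold_ms:
            hits.append((match.group(1), int(match.group(2))))
    return hits


def find_mutex_delays(lines):
    """Return the delay in ms for each mgwd-style 'long delay in enter' line."""
    delays = []
    for line in lines:
        match = MUTEX_DELAY_RE.search(line)
        if match:
            delays.append(int(match.group(1)))
    return delays


if __name__ == "__main__":
    # Sample lines copied from this article.
    sktrace_sample = [
        "SPINVFS_FILEOPS_WRITERPC: spin_writerpc end: tid=(123456:spiniod 3) "
        "vnode 0xfffff1234a5678b0 dsid 0xa12b34cd fid 12345, err=0, time=7227 ms",
    ]
    mgwd_sample = [
        "[kern_mgwd:info:3822] 0xabcdef01234: 0: WARNING: SQL_CONTEXT: "
        "sqlite_mutex_leave_traced:src/sql_context.cc:609 SQLite mutex_trace: "
        "long delay in enter (6353ms) for mutex 0x123a01234",
    ]
    print(find_slow_ops(sktrace_sample))
    print(find_mutex_delays(mgwd_sample))
```

To scan a collected log file instead of the sample lines, pass an open file object to either function, since both accept any iterable of lines.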