Storepool Owner/OpenState exhaustion causes NFSv4 file access failure
Applies to
- ONTAP 9
- NFSv4.0
- NFSv4.1
Issue
- The NFS client is unable to open NFSv4 files; read and write operations hang, while delete operations succeed
- NFS client application crash
- High CPU utilization can be observed
- Commands like "cd", "ls", and "touch" hang
- EMS.log reports different Nblade.nfsV4PoolThreshold errors that led to a Store Pool exhaustion (Nblade.nfsV4PoolExhaust:EMERGENCY):
[node-01: kernel: Nblade.nfsV4PoolThreshold:notice]: NFS Store Pool for OpenState is nearing exhaustion (80% of pool currently in use).
[node-01: kernel: ems.engine.suppressed:debug]: Event 'Nblade.nfsV4PoolThreshold' suppressed 3477 times in last 204 seconds.
[node-01: kernel: Nblade.nfsV4PoolThreshold:notice]: NFS Store Pool for OpenState is nearing exhaustion (90% of pool currently in use).
[node-01: kernel: ems.engine.suppressed:debug]: Event 'Nblade.nfsV4PoolThreshold' suppressed 99337 times in last 61 seconds
[node-01: kernel: Nblade.nfsV4PoolThreshold:notice]: NFS Store Pool for OpenState is nearing exhaustion (99% of pool currently in use).
[node-01: kernel: ems.engine.suppressed:debug]: Event 'Nblade.nfsV4PoolThreshold' suppressed 139821 times in last 61 seconds.
[node-01: kernel: Nblade.nfsV4PoolExhaust:EMERGENCY]: NFS Store Pool for OpenState exhausted. Associated object type is CLUSTER_NODE with UUID: 69b7a0c0-8dea-11ed-bcfe-d000eaa1111d.
Note: Other Store Pool resources can be affected too:
[node-01: kernel: Nblade.nfsV4PoolExhaust:EMERGENCY]: NFS Store Pool for Open exhausted. Associated object type is CLUSTER_NODE with UUID: 69b7a0c0-8dea-11ed-bcfe-d000eaa1111d.
or
node01 EMERGENCY Nblade.nfsV4PoolExhaust: NFS Store Pool for Owner exhausted. Associated object type is CLUSTER_NODE with UUID: 39865935-7a8e-11ef-8ccf-d039eaa50000.
- The following nblade.nfs4SequenceInvalid alerts are seen, pointing to the client IP:
Wed Dec 04 12:16:30 -0800 [node-n1: kernel: nblade.nfs4SequenceInvalid:notice]: NFS client (IP: 172.23.xxx.xvc) sent sequence# 21, but server expected sequence# 20. Server error: BAD_SEQID.
Wed Dec 04 12:19:48 -0800 [node-n1: kernel: nblade.nfs4SequenceInvalid:notice]: NFS client (IP: 172.23.xxx.xvc) sent sequence# 25, but server expected sequence# 24. Server error: BAD_SEQID.
- The top consumer IP is identified by checking the PerClientStorePoolThreshold event or by collecting additional data:
::> event log show -event PerClientStorePoolThreshold
- Identify the data interface (LIF) that the top consumer mounts:
::> network connections active show-clients -remote-address <Topconsumer>
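
When many of the EMS messages listed above have been collected into a text file, a small script can help summarize them. The following Python sketch is only an illustration and not an ONTAP tool: the function name summarize_ems is hypothetical, and the regular expressions are assumptions inferred solely from the sample log lines in this article, so they may need adjusting for other message layouts. It reports the highest usage percentage seen per Store Pool, which pools were reported exhausted, and how many BAD_SEQID notices each client IP produced.

```python
import re
import sys
from collections import Counter

# Assumption: patterns mirror only the sample EMS lines shown in this article.
THRESHOLD_RE = re.compile(
    r"Nblade\.nfsV4PoolThreshold.*?NFS Store Pool for (\w+) "
    r"is nearing exhaustion \((\d+)% of pool currently in use\)",
    re.IGNORECASE,
)
EXHAUST_RE = re.compile(
    r"Nblade\.nfsV4PoolExhaust.*?NFS Store Pool for (\w+) exhausted",
    re.IGNORECASE,
)
BAD_SEQID_RE = re.compile(
    r"nblade\.nfs4SequenceInvalid.*?NFS client \(IP: ([^)]+)\)",
    re.IGNORECASE,
)


def summarize_ems(lines):
    """Return (peak_usage, exhausted_pools, bad_seqid_counts) for EMS lines.

    peak_usage:        dict of pool name -> highest percentage reported
    exhausted_pools:   set of pool names reported as exhausted
    bad_seqid_counts:  Counter of client IP -> number of BAD_SEQID notices
    """
    peak_usage = {}
    exhausted_pools = set()
    bad_seqid_counts = Counter()

    for line in lines:
        match = THRESHOLD_RE.search(line)
        if match:
            pool, pct = match.group(1), int(match.group(2))
            peak_usage[pool] = max(pct, peak_usage.get(pool, 0))
            continue

        match = EXHAUST_RE.search(line)
        if match:
            exhausted_pools.add(match.group(1))
            continue

        match = BAD_SEQID_RE.search(line)
        if match:
            bad_seqid_counts[match.group(1)] += 1

    return peak_usage, exhausted_pools, bad_seqid_counts


if __name__ == "__main__":
    # Reads saved EMS log text from standard input.
    peak, exhausted, bad_seqid = summarize_ems(sys.stdin)
    for pool, pct in sorted(peak.items()):
        print(f"Store Pool {pool}: peak usage {pct}%")
    for pool in sorted(exhausted):
        print(f"Store Pool {pool}: exhausted")
    for ip, count in bad_seqid.most_common():
        print(f"Client {ip}: {count} BAD_SEQID notices")
```

The script could be run against a saved copy of the log, for example by piping the file into it on standard input. The client with the most BAD_SEQID notices is only a hint; the PerClientStorePoolThreshold event remains the documented way to identify the top consumer.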