NetApp Knowledge Base

NFS operations hang or "NFS not responding" errors are reported when the entire FlexGroup reaches 100 percent usage

Category:
fas-systems
Specialty:
CORE

Applies to

  • ONTAP 9
  • FlexGroup
  • NFS

Issue

  • The NFS client's kernel log contains:
    • mount: server <name> not responding, timed out
  • The find command does not respond
  • The client experiences NFS latency
  • The storage aggregate show command returns an error:
cluster::*> storage aggregate show
 
Info: Failed to get the information for aggregate aggr0_node09. Reason: ZSM - failed, status code = 571, extra = Timeout: Operation "ksmfRawZapi_iterator::get_imp()" took longer than 110
seconds to complete [from mgwd on node "node01" (VSID: -1) to kernel at 169.254.33.96], took 109.996s, max 110s [169.254.33.96:951].
Failed to get the information for aggregate node09. Reason: ZSM - failed, status code = 571, extra = Timeout: Operation "ksmfRawZapi_iterator::get_imp()" took longer than 110
seconds to complete [from mgwd on node "node01" (VSID: -1) to kernel at 169.254.33.96], took 109.997s, max 110s [169.254.33.96:951].
 
Aggregate     Size Available Used% State   #Vols  Nodes            RAID Status
--------- -------- --------- ----- ------- ------ ---------------- ------------
aggr0_node09    -         -     - unknown      - node09          -
aggr0_node10 1020GB   49.46GB   95% online       1 node10          raid_dp,normal
node09          -         -     - unknown      - node09          -
node10      527.0TB   148.3TB   72% online      93 node10          raid_dp,normal
  • The cf status command returns an error:
cluster::*> cf status
Takeover
Node           Partner        Possible State Description
-------------- -------------- -------- -------------------------------------
node09        node10        -        Up. Node accessible via HA-IC, but cluster access failed
node10        node09        true     Connected to node09
  • The EMS log contains events such as:
Sun Jan 08 01:20:04 [node09: wafl_exempt14: wafl.vol.fsp.full:error]: volume flexvol__0005@vserver:xxxxxxxx-0a45-11e8-86ae-xxxxxxxxxxxx: insufficient space in FSP wafl_remote_reserve to satisfy a request of 0 holes and 12 overwrites.
 
Sun Jan 08 01:20:30 [node01: kernel: Nblade.nfsLongRunningOp:debug]: Detected a long running network process operation.
The client IP address:port is xxx.xxx.109.64:922.
The local IP address:port is xxx.xxx.207.30:2049.
The protocol requesting the operation is NFS3.
The RPC program number for the operation is 100003.
The protocol procedure for the operation is ReadDirPlus (17).
The disk process UUID is xxxxxxxx926a11e9999b00a0xxxxxxxx.
The Vserver associated with the operation is vserver1.
The UID of the user is 0.
The MSID for the volume is xxxxxxxxxx.
The inode number of the file is 45644.
 
Sun Jan 08 01:21:56 [node01: kernel: Nblade.dBladeNoResponse.NFS:error]: File operation timed out because there was no response from the data-serving node.
Node UUID: xxxxxxxx-ff66-11e9-9b05-xxxxxxxxxxxx,
file operation protocol: NFS,
client IP address: xxx.xxx.109.58,
RPC procedure: 3.
 
Sun Jan 08 01:27:11 [node01: kernel: Nblade.dBladeNoResponse.NFS:error]: File operation timed out because there was no response from the data-serving node.
Node UUID: xxxxxxxx-ff66-11e9-9b05-xxxxxxxxxxxx,
file operation protocol: NFS,
client IP address: xxx.xxx.109.60,
RPC procedure: 17.
 
Sun Jan 08 01:27:36 [node09: wafl_exempt04: wafl.vol.full:alert]: Insufficient space on volume flexvol__0005@vserver:xxxxxxxx-0a45-11e8-86ae-xxxxxxxxxxxx to perform operation. 76.0KB was requested but only 12.0KB was available.
Sun Jan 08 01:28:19 [node09: wafl_exempt06: wafl.vol.fsp.full:error]: volume flexvol__0005@vserver:xxxxxxxx-0a45-11e8-86ae-xxxxxxxxxxxx: insufficient space in FSP wafl_remote_reserve to satisfy a request of 1 holes and 26 overwrites.
 
Sun Jan 08 11:44:26 [node09: kernel: Nblade.nfsLongRunningOp:debug]: Detected a long running network process operation.
The client IP address:port is 10.96.103.108:775.
The local IP address:port is xxx.xxx.207.207:2049.
The protocol requesting the operation is NFS3.
The RPC program number for the operation is 100003.
The protocol procedure for the operation is LookUp (3).
The disk process UUID is xxxxxxxx926a11e9999b00a0xxxxxxxx.
The Vserver associated with the operation is vserver1.
The UID of the user is 0.
The MSID for the volume is xxxxxxxxxx.
The inode number of the file is xxxxxxxx.
 
Sun Jan 08 11:49:31 [node09: kernel: Nblade.nfsLongRunningOp:debug]: Detected a long running network process operation.
The client IP address:port is 10.96.103.108:823.
The local IP address:port is xxx.xxx.207.206:2049.
The protocol requesting the operation is NFS3.
The RPC program number for the operation is 100003.
The protocol procedure for the operation is ReadDirPlus (17).
The disk process UUID is xxxxxxxx926a11e9999b00a0xxxxxxxx.
The Vserver associated with the operation is vserver1.
The UID of the user is 0.
The MSID for the volume is xxxxxxxxxx.
The inode number of the file is xxxxxxxx.
 

Fri Jun 21 01:19:58 -0400 [node09: kernel: Nblade.nfsLongRunningOp:debug]: Detected a long running network process operation. The client IP address:port is xx.xx.xx.xx:808. The local IP address:port is xx.xx.xx.xx:2049. The protocol requesting the operation is NFS3. The RPC program number for the operation is 100003. The protocol procedure for the operation is Write (7). The disk process UUID is xxxxxxxxxxxxxxxx. The Vserver associated with the operation is vserver1. The UID of the user is xxxxxx. The MSID for the volume is xxxxxx. The inode number of the file is xxxxxx.
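
When many EMS lines are collected from several nodes, it can help to count how often each of the events shown above appears per node. The following is a minimal sketch, not a NetApp tool. It assumes the EMS log has been exported as plain text with the bracketed header format seen in the excerpts above. The names summarize_ems, SYMPTOM_EVENTS, EMS_HEADER, and SPACE are hypothetical and chosen only for illustration.

```python
import re
from collections import Counter

# Event names copied from the EMS excerpts in this article.
SYMPTOM_EVENTS = {
    "wafl.vol.full",
    "wafl.vol.fsp.full",
    "Nblade.nfsLongRunningOp",
    "Nblade.dBladeNoResponse.NFS",
}

# Matches the bracketed EMS header, for example:
# "[node09: wafl_exempt14: wafl.vol.fsp.full:error]"
EMS_HEADER = re.compile(
    r"\[(?P<node>[^:\]]+): (?P<process>[^:\]]+): "
    r"(?P<event>[\w.]+):(?P<severity>\w+)\]"
)

# Matches the space figures reported in a wafl.vol.full message.
# Assumption: both values are reported in KB, as in the excerpt above.
SPACE = re.compile(
    r"(?P<requested>[\d.]+)KB was requested "
    r"but only (?P<available>[\d.]+)KB was available"
)


def summarize_ems(lines):
    """Count symptom events per (node, event) and collect shortfalls in KB."""
    counts = Counter()
    shortfalls_kb = []
    for line in lines:
        header = EMS_HEADER.search(line)
        # Skip continuation lines and events that are not listed above.
        if not header or header["event"] not in SYMPTOM_EVENTS:
            continue
        counts[(header["node"], header["event"])] += 1
        space = SPACE.search(line)
        if space:
            shortfalls_kb.append(
                float(space["requested"]) - float(space["available"])
            )
    return counts, shortfalls_kb
```

Passing the lines of an exported log to summarize_ems returns a Counter keyed by (node, event) and a list of requested-minus-available values. Continuation lines such as "The UID of the user is 0." carry no header and are ignored.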
 

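
The storage aggregate show output earlier in this article lists two aggregates in the unknown state. As a rough sketch, the table can be parsed from captured CLI text to list every aggregate that is not online. This assumes the column layout shown in that output (Aggregate, Size, Available, Used%, State, #Vols, Nodes, RAID Status) and a dashed separator line before the data rows. The function names parse_aggregate_rows and unavailable_aggregates are hypothetical.

```python
def parse_aggregate_rows(output):
    """Parse the data rows of captured 'storage aggregate show' text."""
    rows = []
    in_table = False
    for line in output.splitlines():
        if line.startswith("---"):
            in_table = True  # data rows follow the dashed separator
            continue
        if not in_table or not line.strip():
            continue
        fields = line.split()
        if len(fields) < 7:
            continue  # not a data row in the assumed layout
        name, _size, _available, used, state, _vols, node = fields[:7]
        rows.append({
            "aggregate": name,
            "state": state,
            # Used% is "-" when the node did not answer the query.
            "used_pct": int(used.rstrip("%")) if used.endswith("%") else None,
            "node": node,
        })
    return rows


def unavailable_aggregates(rows):
    """Return the names of aggregates whose state is not 'online'."""
    return [row["aggregate"] for row in rows if row["state"] != "online"]
```

Applied to the table in this article, the sketch would report aggr0_node09 and node09, the two aggregates owned by the node that timed out.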
