Performance impact of a full FlexGroup volume
Applies to
- ONTAP 9.1 and later
- FlexGroup
Issue
- Slow response time observed from a FlexGroup volume with all or some constituents above about 90% full.
- Very slow response when running the ls command on the FlexGroup volume.
- VMs where the volume is mounted may hang after login and respond very slowly.
- In some cases, performance of the node degrades to the point that you cannot connect to an SVM; for example, serving data is affected.
- A FlexGroup whose constituents have widely differing space usage can be impacted in a similar way to one whose constituents are all 80-90% or more full.
- OnCommand Unified Manager (OCUM) reports high latency for other operations.
- Potential operation timeout events in the event log:
[xxx: kernel: Nblade.dBladeNoResponse.NFS:error]: File operation timed out because there was no response from the data-serving node. Node UUID: 9xxx, file operation protocol: NFS, client IP address: xx.xx.xx.xx, RPC procedure: 3.
- Insufficient space errors for FlexGroup volumes in the event log (see the event log check after these excerpts):
[wafl_exempt08: wafl.vol.fsp.full:error]: volume amperexxxxxx@vserver:xxxxxx: insufficient space in FSP wafl_remote_reserve to satisfy a request of 0 holes and 27 overwrites.
[wafl_exempt08: wafl.vol.fsp.full:error]: volume xxxxxxx@vserver:xxxxxxxxxxx: insufficient space in FSP wafl_remote_reserve to satisfy a request of 2 holes and 27 overwrites.
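- One way to confirm whether these events were logged on the cluster is to query the event log by message name (the names below are taken from the excerpts above and are wildcarded in case the exact suffix differs on your release):
clstr::> event log show -message-name Nblade.dBladeNoResponse*
clstr::> event log show -message-name wafl.vol.fsp.full*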
Example:
- All constituent member volumes of the fg1 FlexGroup are 96% full:
clstr::> volume show -vserver vs1 -volume-style-extended flexgroup
Vserver   Volume       Aggregate    State      Type       Size  Available Used%
--------- ------------ ------------ ---------- ---- ---------- ---------- -----
vs1       fg1          -            online     RW        500GB    207.5GB   96%
clstr::> volume show -vserver vs1 -volume-style-extended flexgroup-constituent
Vserver   Volume       Aggregate    State      Type       Size  Available Used%
--------- ------------ ------------ ---------- ---- ---------- ---------- -----
vs1       fg1__0001    aggr3        online     RW      31.25GB    12.97GB   96%
vs1       fg1__0002    aggr1        online     RW      31.25GB    12.98GB   96%
vs1       fg1__0003    aggr1        online     RW      31.25GB    13.00GB   96%
vs1       fg1__0004    aggr3        online     RW      31.25GB    12.88GB   96%
vs1       fg1__0005    aggr1        online     RW      31.25GB    13.00GB   96%
vs1       fg1__0006    aggr3        online     RW      31.25GB    12.97GB   96%
vs1       fg1__0007    aggr1        online     RW      31.25GB    13.01GB   96%
vs1       fg1__0008    aggr1        online     RW      31.25GB    13.01GB   96%
vs1       fg1__0009    aggr3        online     RW      31.25GB    12.88GB   96%
vs1       fg1__0010    aggr1        online     RW      31.25GB    30.01GB   96%
vs1       fg1__0011    aggr3        online     RW      31.25GB    12.97GB   96%
vs1       fg1__0012    aggr1        online     RW      31.25GB    13.01GB   96%
vs1       fg1__0013    aggr3        online     RW      31.25GB    12.95GB   96%
vs1       fg1__0014    aggr3        online     RW      31.25GB    12.97GB   96%
vs1       fg1__0015    aggr3        online     RW      31.25GB    12.88GB   96%
vs1       fg1__0016    aggr1        online     RW      31.25GB    13.01GB   96%
16 entries were displayed.
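- To list only the space fields for each constituent, the same information can be pulled with a narrower query (a minimal sketch; it assumes the default fg1__* constituent naming shown above and the standard volume show field names):
clstr::> volume show -vserver vs1 -volume fg1__* -fields size,available,percent-used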
- Run the qos command to confirm that the source of the high latency is the Data layer: QOS commands to monitor volume latency in real time
clstr::> qos statistics volume latency show -vserver vs1 -volume fg1
Workload            ID    Latency    Network    Cluster       Data       Disk    QoS Max    QoS Min      NVRAM
--------------- ------ ---------- ---------- ---------- ---------- ---------- ---------- ---------- ----------
-total-              -   206.06ms     3.27ms     3.10ms    199.7ms        0ms        0ms        0ms        0ms
fg1-wid233..      23350  206.06ms     3.27ms     3.10ms    199.7ms        0ms        0ms        0ms        0ms
-total-              -   347.23ms     5.08ms     2.81ms   334.85ms        0ms        0ms        0ms     2.42ms
fg1-wid233..      23350  347.23ms     5.08ms     2.81ms   334.85ms        0ms        0ms        0ms     2.42ms
-total-              -   308.52ms     4.75ms     2.82ms   296.02ms        0ms        0ms        0ms     2.70ms
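- In the output above, the Data column accounts for nearly all of the total latency (roughly 200-335ms versus single-digit Network and Cluster values), confirming the data layer as the source. To capture a bounded sample instead of a continuously refreshing display, the same command can be limited to a fixed number of iterations (the -iterations parameter is assumed here; verify it is available on your ONTAP release):
clstr::> qos statistics volume latency show -vserver vs1 -volume fg1 -iterations 10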
