NetApp Knowledge Base

What are the best practices for adding disks to an existing aggregate?

Applies to

  • ONTAP 9
  • FAS systems

Answer

Warning:

  • This article applies to hard disk drive (HDD) aggregates. Do not run reallocate on SSD aggregates, FabricPool aggregates, or Cloud Volumes ONTAP (CVO) aggregates.
  • For CVO, create a new aggregate, move the existing volumes to it with volume move, and destroy the original aggregate once it has been vacated.
  • If latency remains high because of disk utilization on the aggregate, even after updating disk firmware and changing efficiency from best effort to background, more disks must be added to the aggregate.
  • For best performance, it is advisable to add a new RAID group of equal size to existing RAID groups.
    • If a new RAID group cannot be added, add at least three disks at the same time to an existing RAID group.
    • This allows the storage system to write new data across multiple disks.
  • A forced reallocate must be done to distribute data evenly across the RAID groups; otherwise, most new writes go to the new disks, resulting in an imbalanced workload.
    • If a reallocate is not done, performance will be worse and statit output will resemble the example below, where some disks run at about 75% utilization (ut%) while others sit near 50%.
    • WAFL eventually rebalances the data on its own, but this can take many months.
::> set advanced
::*> node run -node node_1 statit -b
/* wait 60s */
::*> node run -node node_1 statit -e
...
disk             ut%  xfers  ureads--chain-usecs writes--chain-usecs cpreads-chain-usecs greads--chain-usecs gwrites-chain-usecs
/aggr_data/plex0/rg0:
0a.10.6           32  84.50    0.16   3.65  5014  40.70  58.65   357  43.63  55.17   217   0.00   ....     .   0.00   ....     .
0a.10.8           32  83.93    0.17   3.55  4777  40.51  58.94   356  43.25  55.71   216   0.00   ....     .   0.00   ....     .
0a.10.10          51 111.80   29.66  10.65  1862  26.92  29.12   772  55.22  14.13   677   0.00   ....     .   0.00   ....     .
0a.10.12          52 112.22   30.35  10.71  1825  26.91  29.93   735  54.96  14.16   689   0.00   ....     .   0.00   ....     .
0a.10.14          53 112.81   30.63  10.34  1956  27.08  29.59   777  55.10  14.31   697   0.00   ....     .   0.00   ....     .
0a.10.16          54 114.66   31.85  10.76  1902  27.46  30.05   783  55.34  14.45   680   0.00   ....     .   0.00   ....     .
0a.10.18          53 114.26   30.45  11.23  1781  27.84  30.42   784  55.97  14.68   675   0.00   ....     .   0.00   ....     .
0a.10.20          52 113.79   29.10   8.11  2510  27.69  30.14   744  56.99  14.33   673   0.00   ....     .   0.00   ....     .
0a.10.24          53 116.80   29.56   8.08  2443  28.82  30.73   754  58.41  14.49   657   0.00   ....     .   0.00   ....     .
0a.10.26          54 117.57   31.09   8.67  2353  28.63  30.12   752  57.85  14.49   661   0.00   ....     .   0.00   ....     .
0a.10.28          55 118.71   30.31   9.07  2323  29.45  30.87   752  58.95  14.71   661   0.00   ....     .   0.00   ....     .
0a.10.30          50 106.95   28.86   8.86  2197  24.60  29.18   704  53.49  14.21   668   0.00   ....     .   0.00   ....     .
0a.10.36          78 154.61   48.59  11.54  2426  45.44  39.71   863  50.57  20.24   479   0.00   ....     .   0.00   ....     .
0a.10.38          75 158.05   61.35   8.91  2969  39.69  29.13   914  47.01  15.24   666   0.00   ....     .   0.00   ....     .
0a.10.40          75 156.63   60.31   9.21  2918  39.65  29.75   903  46.67  15.51   680   0.00   ....     .   0.00   ....     .
0a.10.42          75 158.28   60.53   9.48  2803  40.21  29.83   896  47.54  15.47   666   0.00   ....     .   0.00   ....     .
0a.10.44          76 159.14   67.07   7.15  3959  38.21  39.97   682  43.86  19.47   572   0.00   ....     .   0.00   ....     .
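One way to spot this kind of imbalance without reading the table by eye is to parse the ut% column and flag disks that sit well above the median. The following is a minimal Python sketch, not a NetApp tool: the function names and the 15-point margin are hypothetical, and it assumes statit disk rows have the layout shown above (disk name first, ut% second). The sample rows are copied from the output above.

```python
import re
from statistics import median

# A statit disk row starts with a dotted disk name, then ut%, then xfers.
DISK_ROW = re.compile(r"^(\S+\.\S+)\s+(\d+)\s+([\d.]+)\s")

def parse_disk_util(statit_text):
    """Return {disk_name: ut%} for every disk row found in the text."""
    util = {}
    for line in statit_text.splitlines():
        match = DISK_ROW.match(line.strip())
        if match:
            util[match.group(1)] = int(match.group(2))
    return util

def hot_disks(util, margin=15):
    """Disks whose ut% exceeds the median by more than `margin` points.

    The 15-point margin is an arbitrary illustrative threshold.
    """
    mid = median(util.values())
    return sorted(disk for disk, ut in util.items() if ut - mid > margin)

sample = """\
disk             ut%  xfers  ureads--chain-usecs writes--chain-usecs cpreads-chain-usecs greads--chain-usecs gwrites-chain-usecs
/aggr_data/plex0/rg0:
0a.10.6           32  84.50    0.16   3.65  5014  40.70  58.65   357  43.63  55.17   217   0.00   ....     .   0.00   ....     .
0a.10.10          51 111.80   29.66  10.65  1862  26.92  29.12   772  55.22  14.13   677   0.00   ....     .   0.00   ....     .
0a.10.12          52 112.22   30.35  10.71  1825  26.91  29.93   735  54.96  14.16   689   0.00   ....     .   0.00   ....     .
0a.10.14          53 112.81   30.63  10.34  1956  27.08  29.59   777  55.10  14.31   697   0.00   ....     .   0.00   ....     .
0a.10.36          78 154.61   48.59  11.54  2426  45.44  39.71   863  50.57  20.24   479   0.00   ....     .   0.00   ....     .
0a.10.38          75 158.05   61.35   8.91  2969  39.69  29.13   914  47.01  15.24   666   0.00   ....     .   0.00   ....     .
"""

util = parse_disk_util(sample)
print(hot_disks(util))
```

Comparing against the median rather than the mean keeps the lightly loaded disks at the top of the RAID group from skewing the baseline.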
How should a reallocate be done?

Plan any reallocate together with your account team.

FlexVols
  • Forced reallocation ignores the optimization thresholds and completely rewrites the data to disk, unlike the normal reallocation process.
  • Although this improves the layout, routine use of forced reallocation ([-force|-f [true]]) is not a best practice because it places excessive load on the aggregate.
  • Also, because all of the data is optimized, forced reallocation cannot be run against volumes that have existing Snapshot copies unless the physical reallocation method ([ -space-optimized|-p [true] ]) is also used.
    • cluster::> reallocate start -vserver svm0 -path /vol/vol1 -f true -p true
    • Run one job at a time; if performance headroom remains, a second job may be added.
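The Snapshot rule above can be captured in a small helper that builds the command line for each FlexVol. This is only an illustrative Python sketch: the function name and the `has_snapshots` argument are hypothetical, and the command text simply mirrors the example in this article. It does not talk to a cluster.

```python
def forced_reallocate_cmd(vserver, volume, has_snapshots):
    """Build the forced-reallocate command line for one FlexVol.

    Mirrors the article's example command; nothing is executed here.
    """
    cmd = f"reallocate start -vserver {vserver} -path /vol/{volume} -f true"
    if has_snapshots:
        # Volumes with Snapshot copies also need the physical
        # (space-optimized) reallocation method.
        cmd += " -p true"
    return cmd

print(forced_reallocate_cmd("svm0", "vol1", has_snapshots=True))
```

Generating the commands up front makes it easier to review them with the account team and then run them one job at a time.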
FlexGroups
  • Reallocate can be run at the aggregate level, but it is often costly in disk cycles and can take several days or weeks.
  • Volume-level reallocates are not possible for FlexGroups, so an aggregate reallocate is needed:
    • cluster::> storage aggregate reallocation start -once true -aggregate <aggr_name>
Volume Moves
Important points to consider
  • Use Active IQ Unified Manager (Aggregates, then Performance, and Nodes under Performance) to identify the quietest periods, such as after 5 PM or on weekends.
  • Reallocate adds overhead, so this must be accounted for.
    • Each job is estimated to add 10-30% performance overhead, but this is only an estimate and the actual overhead could be higher or lower.
    • In the case of high disk utilization, a more measured approach is to do the busiest volume first, working to the quietest in the aggregate.
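The ordering and overhead guidance above can be turned into a rough planning aid. The Python sketch below makes several assumptions that do not come from NetApp: the per-volume IOPS figures are made-up inputs (in practice they would come from your own monitoring), the 10-30% overhead is treated as additive percentage points of utilization, and the ceiling is a hypothetical target you choose.

```python
def order_busiest_first(volume_iops):
    """Return volume names sorted from busiest to quietest."""
    return sorted(volume_iops, key=volume_iops.get, reverse=True)

def jobs_within_budget(current_util_pct, ceiling_pct, per_job_overhead_pct=30):
    """Estimate how many concurrent reallocate jobs fit under a ceiling.

    Assumes each job adds `per_job_overhead_pct` points of utilization
    (worst case of the 10-30% estimate by default).
    """
    headroom = ceiling_pct - current_util_pct
    if headroom <= 0:
        return 0
    return int(headroom // per_job_overhead_pct)

# Hypothetical per-volume IOPS for one aggregate.
volume_iops = {"vol_a": 1200, "vol_b": 300, "vol_c": 4500}

plan = order_busiest_first(volume_iops)
print(plan)
print(jobs_within_budget(current_util_pct=40, ceiling_pct=80))
```

Using the worst-case 30% figure by default errs on the side of running fewer jobs; the result is a starting point only, since the real overhead may differ.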

NetApp provides no representations or warranties regarding the accuracy or reliability or serviceability of any information or recommendations provided in this publication or with respect to any results that may be obtained by the use of the information or observance of any recommendations provided herein. The information in this document is distributed AS IS and the use of this information or the implementation of any recommendations or techniques herein is a customer's responsibility and depends on the customer's ability to evaluate and integrate them into the customer's operational environment. This document and the information contained herein may be used solely in connection with the NetApp products discussed in this document.