How do WAFL and striping distribute data among disks?
Applies to
- ONTAP 9
- Data ONTAP 8
Answer
Low performance striping
- Typically an administrator would add additional disk space before the file system gets full.
- Over a period of time as files are deleted, the stripes will balance out.
- On a full file system, it is desirable to add multiple drives (as opposed to a single drive) to keep some striping.
- Also, it is possible to do a level 0 dump/restore onto a new file system.
Distributing data among disk drives
- Write Anywhere File Layout (WAFL) sweeps through the disk drives and writes to all empty locations.
- On the first sweep after a new disk drive has been added, the new disk drive will get more writes than the rest.
- But the data is spread evenly to the disks precisely because so much more gets written to the new one on the first sweep.
- Because so much data is going to it, it will not stay empty for long.
- As new WAFL sweeps occur, the basic effect is for data to migrate until all the disk drives are equally full.
- For example:
- You start with five completely full 1 GB drives.
- That is 4.5 GB of data after you subtract the 10% reserve.
- With 5 drives, that is 4.5/5 = 0.9 GB per drive.
- As you add drives, this per-drive figure decreases.
- For instance, with 6 drives that is 4.5/6 = 0.75 GB per drive.
- In the example above, the data is fully balanced once the new drive holds 0.75 GB, which means 0.75 GB of data currently on the old drives must be rewritten and reallocated onto the new drive before the distribution is balanced.
- Note: The more disk drives you have, the smaller the percentage of the data that needs to be moved to reach a balanced state.
- You might have an archival system where no old data is ever deleted.
- In these instances, the distribution would not even out much, but such a system is predominantly read, so write performance is not much of an issue.
- Data distribution can be balanced by doing a full dump/restore, which copies the data around.
- When a file is copied and the original file is deleted, WAFL is given a chance to even out the data distribution as it does the write allocation.
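The arithmetic in the five-drive example above can be reproduced with a short Python sketch. The function names `per_disk_gb` and `fraction_to_move` are hypothetical helpers introduced here for illustration; they are not part of ONTAP. The sketch assumes equally sized drives and an even target distribution.

```python
def per_disk_gb(total_data_gb: float, disk_count: int) -> float:
    """Average amount of data per drive when the data is spread evenly."""
    return total_data_gb / disk_count


def fraction_to_move(old_disks: int, added_disks: int) -> float:
    """Share of the existing data that must land on the new drives
    before every drive holds the same amount (equal-size drives assumed)."""
    return added_disks / (old_disks + added_disks)


# Five full 1 GB drives minus the 10% reserve.
data_gb = 5 * 1.0 * (1 - 0.10)

before = per_disk_gb(data_gb, 5)   # per-drive load with 5 drives
after = per_disk_gb(data_gb, 6)    # per-drive load after adding a 6th drive

print(f"data: {data_gb:.2f} GB, before: {before:.2f} GB, after: {after:.2f} GB")
print(f"share to move, 5 -> 6 drives:   {fraction_to_move(5, 1):.1%}")
print(f"share to move, 20 -> 21 drives: {fraction_to_move(20, 1):.1%}")
```

Comparing the last two lines illustrates the note above: the larger the existing set of drives, the smaller the share of data that has to move when one drive is added.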
Description of write allocation
- The write allocation in the WAFL code keeps a Current Write Location (CWL) pointer for each disk that indicates where the next write will occur.
- The CWL for each disk starts at the beginning of the disks and advances to the end, filling in every unallocated slot.
- WAFL selects which disk to use based on which CWL is behind the others, so the CWLs for all the disks stay close together, which is why the parity disk does not have to seek.
- It is possible for one CWL to get ahead of the others because WAFL writes successive blocks of a single file onto a single disk.
- The end result is that during the first few passes through all the disks, the new disk will have a lot of data written to it because it is completely empty.
- As the old data is removed and new data is written, the data evens out among the disk drives.
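The CWL behavior described above can be illustrated with a toy model in Python. This is a simplified sketch, not the WAFL implementation: the `sweep` function and the boolean slot maps are assumptions made for illustration, and the model ignores parity, RAID groups, and the tendency to place successive blocks of one file on a single disk.

```python
def sweep(disks):
    """Simulate one pass of the CWL pointers over all disks.

    disks: list of equal-length lists of bool, True = slot already allocated.
    The lists are modified in place. Returns blocks written per disk.
    """
    size = len(disks[0])
    cwl = [0] * len(disks)        # one Current Write Location per disk
    written = [0] * len(disks)

    while True:
        # Disks whose CWL has not yet reached the end of the disk.
        active = [d for d in range(len(disks)) if cwl[d] < size]
        if not active:
            break
        # Pick the disk whose CWL is furthest behind the others.
        d = min(active, key=lambda i: cwl[i])
        # Advance past allocated slots to the next free one.
        pos = cwl[d]
        while pos < size and disks[d][pos]:
            pos += 1
        if pos < size:
            disks[d][pos] = True  # fill the unallocated slot
            written[d] += 1
            pos += 1
        cwl[d] = pos
    return written


# Three nearly full disks (free slots at positions 4 and 9) plus one empty new disk.
old = [True, True, True, True, False, True, True, True, True, False]
disks = [list(old), list(old), list(old), [False] * 10]

written = sweep(disks)
print(written)
```

In this model the new disk receives far more writes than the older disks during the first sweep, simply because it has the most unallocated slots for its CWL to fill.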
Reallocation
- While WAFL attempts to evenly lay out the written data, it may be necessary over time to force a reallocation of the data.
- The storage administrator should consult the System Administration Guide for their specific release of Data ONTAP for instructions and caveats to consider when running a reallocation.
Additional Information