Why file transfer speeds differ between small and large files
Applies to
- ONTAP 9
- Data ONTAP operating in 7-Mode
Answer
- On all major operating systems, reading or writing a large number of small files incurs significant operating system overhead.
- This is because more time is spent at the OS level performing lookup, open(), and close() operations on each and every file the client processes (a reproduction sketch follows this list).
- Example: In SMB2, it can take almost a second just to locate and open a single file, given the metadata calls involved: find the parent folder and subfolder, check whether the file exists, open the file (SMB2 Create call), close it, open it again, and finally read it.
- SMB2/3 carries a heavier metadata workload than NFS.
- Linux clients can also cache some file attributes to avoid otherwise needless I/O calls.
- Additional delays can occur when security hardware or software scans each file, either on the network between the client and ONTAP or on the client itself.
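The overhead is easy to reproduce with a short, self-contained sketch. The script below is an illustration, not part of this KB; the payload size, file count, and temporary directory are arbitrary assumptions. It writes the same total amount of data twice, once as thousands of small files and once as a single large file. Pointed at an NFS or SMB mount instead of a local temp directory, the gap widens further, because every extra lookup/open/close becomes a network round trip.

```python
import os
import tempfile
import time

TOTAL_BYTES = 32 * 1024 * 1024      # 32 MB of payload either way (assumed size)
SMALL_SIZE = 4 * 1024               # 4 KB per small file (assumed size)
NUM_FILES = TOTAL_BYTES // SMALL_SIZE


def timed(label, fn):
    start = time.perf_counter()
    fn()
    print(f"{label}: {time.perf_counter() - start:.2f} s")


# Swap the temp directory for a NAS mount (e.g. /mnt/nfs) to measure
# over-the-wire behavior instead of local-disk behavior.
with tempfile.TemporaryDirectory() as root:
    payload = os.urandom(SMALL_SIZE)

    def write_small_files():
        # One lookup + open() + write() + close() per file: metadata dominates.
        for i in range(NUM_FILES):
            with open(os.path.join(root, f"f{i:05d}.dat"), "wb") as f:
                f.write(payload)

    def write_one_large_file():
        # A single open()/close() pair and a stream of sequential writes.
        with open(os.path.join(root, "large.dat"), "wb") as f:
            for _ in range(NUM_FILES):
                f.write(payload)

    timed(f"{NUM_FILES} x 4 KB files", write_small_files)
    timed("1 x 32 MB file", write_one_large_file)
```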
- A large file can be transferred quickly because the transfer is one contiguous operation, and it can take advantage of the following features:
  - nconnect on NFS (multiple TCP connections per mount)
  - 64 KB to 1 MB rsize/wsize mount options for NFS
- SMB3 Large MTU: not the same as jumbo frames, this feature enables 1 MB read/write operation size.
  - SMB3 Multichannel: allows a single session to transfer the file in parallel across multiple TCP connections.
  - TCP window scaling: TCP sliding window sizes have extended into the megabytes since the Windows Vista era (mid-2000s), allowing a large file transfer to utilize more of the bandwidth of a single TCP stream.
- These features do not help with small files, which generally fit in a single TCP stream and are smaller than the default read or write operation size (typically 64 KB).
- Metadata calls are not parallelized; they are processed serially, one file at a time, as the sketch below illustrates.
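To make the contrast concrete, here is a minimal sketch (not ONTAP or client code; the file name, sizes, and stream count are all assumptions) that reads one large file across several threads, each pulling a distinct byte range with 1 MB reads. This loosely mimics what SMB3 Multichannel or NFS nconnect do with parallel TCP connections. A 4 KB file, by contrast, offers nothing to split, and its serial lookup/open/close sequence must finish before any data moves at all.

```python
import os
from concurrent.futures import ThreadPoolExecutor

CHUNK = 1024 * 1024  # 1 MB per read, mirroring SMB3 Large MTU / a large rsize


def read_range(path, offset, length):
    # Each worker opens its own descriptor and reads only its byte range,
    # standing in for one TCP connection of a multichannel-style transfer.
    with open(path, "rb") as f:
        f.seek(offset)
        remaining = length
        while remaining > 0:
            data = f.read(min(CHUNK, remaining))
            if not data:
                break
            remaining -= len(data)


def parallel_read(path, streams=4):
    size = os.path.getsize(path)
    per_stream = -(-size // streams)  # ceiling division
    with ThreadPoolExecutor(max_workers=streams) as pool:
        for i in range(streams):
            offset = i * per_stream
            pool.submit(read_range, path, offset,
                        max(0, min(per_stream, size - offset)))


if __name__ == "__main__":
    path = "large_test.dat"  # hypothetical test file; place it on the share being measured
    with open(path, "wb") as f:
        f.write(os.urandom(64 * 1024 * 1024))  # 64 MB (assumed size)
    parallel_read(path)
    os.remove(path)
```

On a single local disk the threads mostly serialize behind the device; the benefit of splitting one transfer across connections shows up on a high-latency network link, where it keeps more data in flight at once.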
Additional Information
- A Microsoft knowledge base article covers slow write performance for small files.
- Network-based NAS file systems will also be slower than SAN-based or locally attached storage in this scenario.