NDMP backups to fibre channel tape exhibit sporadic low throughput
Applies to
- Two or more ONTAP cluster nodes, with each node connected to fibre channel tape.
- NDMP is running in SVM-scope mode.
- The Data Management Application (DMA) configuration contains the tape device paths of all nodes.
- The DMA is not configured to prefer local device paths; instead, it uses any available tape device to complete a backup or restore.
Issue
- NDMP backup or restore takes longer than usual.
- Compared to previous jobs, the "slow" job exhibits lower throughput.
- The low throughput is sporadic: the same job runs at acceptable throughput on one attempt and exhibits low throughput on another.
- All backups, both fast and slow jobs, are being written to SAN tape devices connected to the NetApp cluster nodes.
- A "dump to null" test of the volume associated with the slow job does not indicate a read throughput bottleneck.
Example:
The volume "data1" on SVM "nasdata" usually backs up at a rate of ~200 MB/sec, but will occasionally back up at ~75 MB/sec.
In the backup log, the jobs that run at higher throughput show b=60 in the dump options:
dmp Sat May 11 17:10:50 IDT 2019 /nasdata/data1(3) Start (Level 0, NDMP:96285)
dmp Sat May 11 17:10:50 IDT 2019 /nasdata/data1(3) Options (b=60, u)
The jobs that run at lower throughput show b=0 in the dump options, and the options also list TCP receive and send buffer sizes:
dmp Sat May 11 01:49:54 IDT 2019 /nasdata/data1(7) Start (Level 0, NDMP:36047)
dmp Sat May 11 01:49:54 IDT 2019 /nasdata/data1(7) Options (b=0, u, TCP recv buffer size = 33580, TCP send buffer size = 33580)
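When many jobs need to be checked, the comparison above can be scripted. The following is a minimal sketch, not a NetApp tool: the function name classify_options_line and the returned field names are hypothetical, and the regular expression assumes that "Options" lines look exactly like the two samples shown above. It reports the b value and whether TCP buffer sizes appear in the dump options, then labels the line according to the two patterns described in this article.

```python
import re

# Assumed layout, derived only from the sample log lines in this article:
#   dmp <timestamp> <path>(<n>) Options (<comma-separated options>)
OPTIONS_RE = re.compile(
    r"^dmp\s+(?P<timestamp>.+?)\s+(?P<path>/\S+)\((?P<seq>\d+)\)"
    r"\s+Options\s+\((?P<opts>.*)\)\s*$"
)


def classify_options_line(line):
    """Return details for a dump 'Options' log line, or None for any other line."""
    match = OPTIONS_RE.match(line.strip())
    if match is None:
        return None  # not an Options line (for example, a Start line)

    opts = match.group("opts")
    b_match = re.search(r"\bb=(\d+)", opts)
    if b_match is None:
        return None  # no b= value present, so nothing to compare

    b_value = int(b_match.group(1))
    tcp_buffers = "TCP recv buffer size" in opts or "TCP send buffer size" in opts

    # Labels mirror the two patterns in this article; anything else is left unclassified.
    if b_value == 0 and tcp_buffers:
        pattern = "low-throughput pattern"
    elif b_value > 0 and not tcp_buffers:
        pattern = "higher-throughput pattern"
    else:
        pattern = "unclassified"

    return {
        "timestamp": match.group("timestamp"),
        "path": match.group("path"),
        "b": b_value,
        "tcp_buffers": tcp_buffers,
        "pattern": pattern,
    }
```

To use the sketch, pass each line of the backup log to classify_options_line and skip the lines for which it returns None. The labels only indicate which of the two option patterns a job matches; they do not measure throughput.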