Backup logs from aborted and/or resumed NDMP operations can cause an ONTAP node's root volume to fill, possibly leading to node panics
Applies to
- ONTAP 9
- Network Data Management Protocol (NDMP) operations, such as ndmpcopy
Issue
- Rapid increase in the used size of a single node's root volume. This can be seen by running the following command periodically:
cluster1::> volume show -vserver cluster1-01
Vserver   Volume       Aggregate    State      Type       Size  Available Used%
--------- ------------ ------------ ---------- ---- ---------- ---------- -----
cluster1-01 vol0       aggr0        online     RW      442.4GB    407.6GB    7%
(Using a node name as the -vserver parameter will return that node's root volume.)
- The backup log located at /mroot/etc/log/backup is filled with messages similar to the following:
Tue Mar 27 00:11:36 EDT 2018 /svm1/vol1 Log_msg (Flush DIRNET for BKP ID=248, type=3 interrupted while waiting for min inflight. Error = Interrupted system call.
The simplest way to access the backup log is through the Service Processor Infrastructure (SPI) interface by clicking the logs link. See KB: How to manually collect logs and copy files from a clustered Data ONTAP storage system (under "Option 1") for assistance working with the SPI.
- The affected node may panic with messages similar to the following:
Process vldb unresponsive for 631 seconds in process nodewatchdog on release 9.2P1 (C)
Note: This panic may be caused by many other issues. This panic alone does not indicate the issue outlined here; make sure to check the node's root volume status as well as the contents of the backup log.
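The two checks described above (root volume growth and the contents of the backup log) could be scripted against text that has already been collected from the node. The following is a minimal Python sketch, not a NetApp tool: the function names, the 5-point growth threshold, and the assumption that the volume show output and the backup log have been saved as plain text are all illustrative. It only parses text in the formats shown in this article.

```python
import re

# One data row of `volume show` output, e.g.
# "cluster1-01 vol0 aggr0 online RW 442.4GB 407.6GB 7%"
# The prompt, header, and dashed separator lines do not match.
VOLUME_ROW = re.compile(
    r"^(\S+)\s+(\S+)\s+\S+\s+\S+\s+\S+\s+\S+\s+\S+\s+(\d+)%$"
)


def parse_used_percent(cli_output):
    """Return {(vserver, volume): used_percent} from saved `volume show` text."""
    used = {}
    for line in cli_output.splitlines():
        match = VOLUME_ROW.match(line.strip())
        if match:
            vserver, volume, pct = match.groups()
            used[(vserver, volume)] = int(pct)
    return used


def grew_rapidly(samples, threshold_points=5):
    """True if Used% rose by at least threshold_points between the first
    and last sample. The default threshold is an arbitrary assumption."""
    return len(samples) >= 2 and samples[-1] - samples[0] >= threshold_points


def count_interrupted_flushes(log_text):
    """Count backup log lines that resemble the message quoted in this article."""
    return sum(
        1
        for line in log_text.splitlines()
        if "Flush DIRNET" in line
        and "interrupted while waiting for min inflight" in line
    )
```

A growing Used% value across samples together with a large count of matching log lines would be consistent with this issue. As the note above says, neither signal alone is conclusive.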