NetApp Knowledge Base

Backup logs from aborted and/or resumed NDMP operations can cause an ONTAP node's root volume to fill, possibly leading to node panics

Category:
ndmp
Specialty:
dp

Applies to

  • ONTAP 9
  • Network Data Management Protocol (NDMP) operations, such as ndmpcopy

Issue

  • Rapid increase in the used size of a single node's root volume. This can be seen by running the following command periodically:

cluster1::> volume show -vserver cluster1-01
Vserver     Volume       Aggregate    State      Type       Size  Available Used%
----------- ------------ ------------ ---------- ---- ---------- ---------- -----
cluster1-01 vol0         aggr0        online     RW      442.4GB    407.6GB    7%

(Using a node name as the -vserver parameter will return that node's root volume)
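The periodic check described above can be scripted once the command output has been captured as text. The following is a minimal sketch, assuming the output has the column layout shown in the example; the function names and the 5-point growth threshold are hypothetical choices for illustration and are not part of ONTAP.

```python
def parse_volume_show(output):
    """Return one dict per data row found in captured `volume show` output."""
    rows = []
    for line in output.splitlines():
        fields = line.split()
        # A data row has 8 whitespace-separated fields and ends in "<digits>%".
        # This skips the prompt line, the header row, and the dashed separator.
        if len(fields) == 8 and fields[7].endswith("%") and fields[7][:-1].isdigit():
            rows.append({
                "vserver": fields[0],
                "volume": fields[1],
                "size": fields[5],
                "available": fields[6],
                "used_pct": int(fields[7][:-1]),
            })
    return rows


def grew_rapidly(used_pct_samples, threshold_pct=5):
    """True if Used% rose by at least threshold_pct between two consecutive samples.

    threshold_pct is an assumed value; choose one that suits the sampling interval.
    """
    pairs = zip(used_pct_samples, used_pct_samples[1:])
    return any(later - earlier >= threshold_pct for earlier, later in pairs)
```

Collecting the `used_pct` value from each run into a list and passing it to `grew_rapidly` gives a simple indication of whether the root volume is filling between samples.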

  • The backup log located at /mroot/etc/log/backup is filled with messages similar to the following:

Tue Mar 27 00:11:36 EDT 2018 /svm1/vol1 Log_msg (Flush DIRNET for BKP ID=248, type=3 interrupted while waiting for min inflight. Error = Interrupted system call.

The simplest way to access the backup log is through the Service Processor Infrastructure (SPI) interface by clicking the logs link. For assistance working with the SPI, see the KB article "How to manually collect logs and copy files from a clustered Data ONTAP storage system" (under "Option 1").
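Once a copy of the backup log has been retrieved, the interrupted-flush messages can be counted per path and backup ID to see how heavily the log is dominated by them. The following is a minimal sketch, assuming every such message follows the format of the example line above; the function name and the regular expression are illustrative assumptions, not a NetApp tool.

```python
import re
from collections import Counter

# Matches the path, backup ID and type in lines like the example above.
FLUSH_INTERRUPTED = re.compile(
    r"(?P<path>\S+) Log_msg \(Flush DIRNET for BKP ID=(?P<bkp_id>\d+), "
    r"type=(?P<type>\d+) interrupted while waiting for min inflight"
)


def count_interrupted_flushes(lines):
    """Count interrupted-flush messages, keyed by (path, backup ID)."""
    counts = Counter()
    for line in lines:
        match = FLUSH_INTERRUPTED.search(line)
        if match:
            counts[(match.group("path"), int(match.group("bkp_id")))] += 1
    return counts
```

The function accepts any iterable of lines, so an open file handle on a local copy of the log can be passed directly, which avoids loading a very large log into memory.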

  • Affected node may panic with messages similar to the following:

Example 1:

Process vldb unresponsive for 631 seconds in process nodewatchdog on release 9.2P1 (C)

Note: This panic may be caused by many other issues.  This panic alone does not indicate the issue outlined here; make sure to check the node's root volume status as well as the contents of the backup log.

Example 2:

Apr 12 15:49:43 [node-02:callhome.mdb.recovery.unsuccessful:EMERGENCY]: Call home for MDB RECOVERY UNSUCCESSFUL FOR THE coresegd WARNING. 
Apr 12 15:51:58 [node-02:callhome.mdb.recovery.unsuccessful:EMERGENCY]: Call home for MDB RECOVERY UNSUCCESSFUL FOR THE mcached WARNING. 
Apr 12 15:54:07 [node-02:spm.vifmgr.process.exit:EMERGENCY]: Logical Interface Manager(VifMgr) with ID 9996 aborted as a result of signal normal exit (1). The subsystem will attempt to restart. 
Apr 12 15:54:09 [node-02:callhome.mdb.recovery.unsuccessful:EMERGENCY]: Call home for MDB RECOVERY UNSUCCESSFUL FOR THE vifmgr WARNING. 
  
Apr 12 16:03:14 [node-02:callhome.mdb.recovery.unsuccessful:EMERGENCY]: Call home for MDB RECOVERY UNSUCCESSFUL FOR THE bcomd WARNING. 
PANIC  : Process vifmgr unresponsive for 630 seconds 
version: 9.4P3: Thu Oct 11 18:25:55 EDT 2018 
conf   : x86_64.optimize 
cpuid = 3 
KDB: stack backtrace: 
  
PANIC: Process vifmgr unresponsive for 630 seconds in process nodewatchdog on release 9.4P3 (C) on Wed Apr 12 16:04:13 KST 2023 
  
Apr 12 16:21:11 [node-02:extCache.rw.replay.canceled:notice]: WAFL external cache replay canceled for aggregate node2_aggr0: Aggregate came online after timeout. 
Apr 12 16:22:21 [node-02:mgmtgwd.rootvolrec.low.space:EMERGENCY]: The root volume on node "node-02" is dangerously low on space. Less than 10 MB of free space remaining. 
Apr 12 16:22:21 [node-02:callhome.root.vol.recovery.reqd:EMERGENCY]: Call home for ROOT VOLUME NOT WORKING PROPERLY: RECOVERY REQUIRED. 
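When reviewing a long EMS log like the one in Example 2, it can help to pull out only the events that point at the root volume. The following is a minimal sketch, assuming each event line has the `[node:event.name:SEVERITY]: message` layout shown above; the two event names are taken from the example, while the function name and regular expression are hypothetical.

```python
import re

# Layout assumed from the example: "Apr 12 16:22:21 [node:event.name:SEVERITY]: message"
EMS_LINE = re.compile(
    r"^(?P<timestamp>\w{3} +\d+ \d\d:\d\d:\d\d) "
    r"\[(?P<node>[^:\]]+):(?P<event>[^:\]]+):(?P<severity>[^\]]+)\]: (?P<message>.*)$"
)

# Event names that appear in Example 2 and refer to the root volume.
ROOT_VOLUME_EVENTS = {
    "mgmtgwd.rootvolrec.low.space",
    "callhome.root.vol.recovery.reqd",
}


def root_volume_events(lines):
    """Return parsed EMS entries whose event name refers to the root volume."""
    found = []
    for line in lines:
        match = EMS_LINE.match(line.strip())
        if match and match.group("event") in ROOT_VOLUME_EVENTS:
            found.append(match.groupdict())
    return found
```

Lines that do not follow the bracketed layout, such as the PANIC and backtrace lines, are skipped rather than treated as errors.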

  • Growth of the backup log causes the root volume to run out of space, which sometimes takes the root aggregate offline.

214G /mroot/etc/log/backup
96G /mroot/etc/log/backup.0
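To judge how much of the root volume the backup logs consume, the sizes in a listing like the one above can be summed. The following is a minimal sketch, assuming each line is a size followed by a path and that the K, M, G, and T suffixes denote binary units (1G = 1024^3 bytes); both the unit interpretation and the function names are assumptions.

```python
# Assumed binary multipliers for the single-letter size suffixes.
UNIT_BYTES = {"K": 1024, "M": 1024**2, "G": 1024**3, "T": 1024**4}


def parse_size(text):
    """Convert a size such as '214G' to bytes (binary units assumed)."""
    number, unit = text[:-1], text[-1].upper()
    return int(float(number) * UNIT_BYTES[unit])


def total_log_bytes(listing):
    """Sum the sizes in a listing whose lines look like '<size> <path>'."""
    total = 0
    for line in listing.splitlines():
        fields = line.split(maxsplit=1)
        if len(fields) == 2:
            total += parse_size(fields[0])
    return total
```

For the two files listed above, the sum is 214G + 96G = 310G, which can then be compared with the root volume size reported by `volume show`.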
