NetApp Knowledge Base

Handling watchdog resets (WDR)

Applies to

Watchdog reset


What is a watchdog reset?

A watchdog is an independent timer that monitors the progress of the main controller running Data ONTAP. Its function is to restart the system automatically if it encounters an unrecoverable error.

The watchdog implemented by NetApp uses a two-level timer, with a different action associated with each level.

  • Level 1: Timeout: The storage appliance attempts to panic and dump the core in response to a non-maskable interrupt. Once an L1 watchdog is successfully issued, the system returns to service and a core file is written, allowing NetApp to determine the root cause of the hang. An L1 watchdog is issued if the timer is not reset within 1.5 seconds.
  • Level 2: Reset: The storage appliance resets through a hard reset signal sent from the timer. An L2 watchdog is issued if the watchdog timer is not reset within two seconds after the L1 watchdog. The L2 watchdog does not generate a core dump.
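
The two-level timing can be illustrated with a small sketch. This is not NetApp's implementation. It only models the thresholds stated above, and it assumes that the two-second L2 window starts at the moment the L1 watchdog fires, so that L2 is reached 3.5 seconds after the last timer reset. The names `WatchdogState` and `watchdog_state` are hypothetical and exist only for this illustration.

```python
from enum import Enum

# Thresholds taken from the article text.
L1_TIMEOUT_S = 1.5  # L1 is issued if the timer is not reset within 1.5 s
L2_DELAY_S = 2.0    # L2 is issued if the timer is still not reset 2 s after L1 (assumed start point)


class WatchdogState(Enum):
    OK = "ok"                  # timer was reset in time
    L1_TIMEOUT = "l1_timeout"  # NMI: panic and core dump attempted
    L2_RESET = "l2_reset"      # hard reset, no core dump


def watchdog_state(seconds_since_last_reset: float) -> WatchdogState:
    """Return the watchdog level reached after this long without a timer reset."""
    if seconds_since_last_reset < 0:
        raise ValueError("elapsed time must be non-negative")
    if seconds_since_last_reset >= L1_TIMEOUT_S + L2_DELAY_S:
        return WatchdogState.L2_RESET
    if seconds_since_last_reset >= L1_TIMEOUT_S:
        return WatchdogState.L1_TIMEOUT
    return WatchdogState.OK
```

For example, under these assumptions `watchdog_state(2.0)` yields `WatchdogState.L1_TIMEOUT`, because the L1 threshold has passed but the L2 window has not yet expired.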

It is not necessary to ‘recover’ from a watchdog timeout or watchdog reset, as both of these events are recovery mechanisms for other failures. The objective instead is to identify the failure(s) that caused the watchdog event.

What is the appropriate response to a watchdog timeout (L1 Watchdog Event)?

A watchdog timeout should be treated just like any other system panic. The associated backtrace and/or the core should be analyzed for the possible root cause(s). A giveback should be performed if necessary.

What is the appropriate response to a watchdog reset (L2 Watchdog Event)?

DO NOT SIMPLY PERFORM A GIVEBACK AND MONITOR. Data collection is required.

Please collect the following data to help diagnose the cause of a watchdog reset:

  • AutoSupport messages
  • Console logs before, during, and after the watchdog event (if possible)
  • ssram log (/etc/log/ssram/ssram.log or /mroot/etc/log/ssram/ssram.log) - FAS62xx, FAS80x0 only
  • On systems with a service processor:
    • system sensors
    • system log
    • events all
    • sp status -d

Note: No hardware should be replaced unless the root cause is a hardware issue based on the available log analysis.
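
The collection list above can be expressed as a simple checklist builder, which may be handy when preparing a support case. This is only a sketch: it does not run any commands or talk to a controller, and the function name `data_to_collect` and its two boolean parameters are hypothetical. The item names, log paths, and service processor commands are copied from the list above.

```python
from typing import List

# Service processor commands and ssram log paths as listed in the article.
SP_COMMANDS = ["system sensors", "system log", "events all", "sp status -d"]
SSRAM_LOG_PATHS = ["/etc/log/ssram/ssram.log", "/mroot/etc/log/ssram/ssram.log"]


def data_to_collect(has_service_processor: bool, has_ssram_log: bool) -> List[str]:
    """Build the checklist of data to gather after a watchdog reset (L2).

    has_ssram_log should be True only for FAS62xx and FAS80x0 systems.
    """
    items = [
        "AutoSupport messages",
        "Console logs before, during, and after the watchdog event (if possible)",
    ]
    if has_ssram_log:
        items.append("ssram log: " + " or ".join(SSRAM_LOG_PATHS))
    if has_service_processor:
        items.extend(f"Service processor output: {cmd}" for cmd in SP_COMMANDS)
    return items


if __name__ == "__main__":
    # Example: a system that has a service processor and an ssram log.
    for item in data_to_collect(has_service_processor=True, has_ssram_log=True):
        print("[ ]", item)
```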

Additional Information

For further assistance, contact NetApp Technical Support and reference this article along with the data collected.
