CONTAP-182745: MGWD crashes due to memory shortage

Category: ontap-9

Specialty: CORE

Issue

The node panics with one of the following mgwd (management gateway daemon) failures:

mgwd becomes unresponsive and the node watchdog triggers a panic.
mgwd runs out of swap space or memory.

Example panic messages:

Process mgwd unresponsive for 182 seconds (mgwd startup: "(55040)") in process nodewatchdog on release 9.13.0P4 (C)
OOM: out of swap space, process mgwd using 1346 MB in process pageout: dom0 on release 9.13.0P4 (C) (C)
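
To triage collected panic strings, the two signatures above can be matched with a short script. The following is a minimal sketch, not a NetApp tool: the function name classify_mgwd_panic is hypothetical, and the patterns are derived only from the two example messages shown here, so other releases may word them differently.

```python
import re

# Patterns are based solely on the two example panic strings in this article
# (assumption: other ONTAP releases may format these messages differently).
WATCHDOG_RE = re.compile(r"Process mgwd unresponsive for (\d+) seconds")
OOM_RE = re.compile(r"OOM: out of swap space, process mgwd using (\d+) MB")


def classify_mgwd_panic(message):
    """Return (kind, value) for a recognized mgwd panic string, else None.

    kind is "watchdog" (value = unresponsive seconds) or
    "oom" (value = memory used by mgwd in MB).
    """
    match = WATCHDOG_RE.search(message)
    if match:
        return ("watchdog", int(match.group(1)))
    match = OOM_RE.search(message)
    if match:
        return ("oom", int(match.group(1)))
    return None
```

Passing each panic string from an event log or core summary to classify_mgwd_panic separates watchdog timeouts from out-of-swap events, which helps confirm that a node matches this issue.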

Some additional observations:

Users may lose access to the cluster during this time.
Depending on which node is being upgraded at the time, applications are reported offline on one node while its HA partner is in partial giveback.
The issue is mostly seen during an automated nondisruptive upgrade (ANDU).

Cluster::> storage failover show
                              Takeover
Node           Partner        Possible State
-------------- -------------- -------- -------------------------------------
Cluster-01     Cluster-02     true     Connected to Cluster-02. Waiting
                                       for cluster applications to come
                                       online on the local node. Offline
                                       applications: mgmt, vifmgr, scsi
                                       blade, clam.
Cluster-02     Cluster-01     true     Connected to Cluster-01, Partial
                                       giveback
2 entries were displayed.
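
When checking several clusters, the offline applications can be pulled out of saved "storage failover show" output programmatically. The sketch below is illustrative only: the function name offline_applications is hypothetical, and it assumes the column layout and the "Offline applications:" wording shown in the output above, with the Takeover Possible column reading true or false.

```python
import re


def offline_applications(output):
    """Map node name -> list of offline applications.

    Assumes the layout shown in this article: a node row has four
    whitespace-separated fields (node, partner, true/false, state), and
    wrapped state text continues on indented lines.
    """
    states = {}
    node = None
    for line in output.splitlines():
        # Skip blank lines, the dashed separator, and the trailing summary.
        if not line.strip() or line.startswith("-") or "entries were displayed" in line:
            continue
        if line[0].isspace():
            # Indented line: continuation of the current node's state text.
            if node is not None:
                states[node] += " " + line.strip()
            continue
        fields = line.split(None, 3)
        if len(fields) == 4 and fields[2] in ("true", "false"):
            node = fields[0]
            states[node] = fields[3].strip()

    result = {}
    for name, state in states.items():
        match = re.search(r"Offline applications:\s*([^.]*)", state)
        if match:
            apps = [a.strip() for a in match.group(1).split(",")]
            result[name] = [a for a in apps if a]
    return result
```

Nodes without an "Offline applications:" entry in their state text, such as a partner in partial giveback, are omitted from the returned mapping.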