NetApp Knowledge Base

SHUTDOWN PENDING (degraded mode) CRITICAL - AutoSupport message

Applies to

  • ONTAP 9
  • callhome.shutdown.pending
  • monitor.brokenDisk
  • HA Group Notification from node_name (SHUTDOWN PENDING (degraded mode)) ALERT

Event Summary

This message occurs when a disk drive fails but there are no suitable spares available for reconstruction.

  • To protect your data, the system enters degraded mode.
  • If the system runs in degraded mode for the set time interval, it halts automatically to prevent a double disk drive failure and possible loss of data.
  • The default timeout is 24 hours. It is controlled by the "raid.timeout" option.
  • If a spare drive becomes available while the system is running in degraded mode, the system immediately begins rebuilding the failed drive.
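As a rough illustration of the timeout described above, the following Python sketch estimates when the automatic halt would be expected and how many hours remain. The function names and timestamps are hypothetical, and the 24-hour default is taken from this article; the value on a given system depends on its "raid.timeout" setting.

```python
from datetime import datetime, timedelta


def shutdown_deadline(degraded_since, timeout_hours=24):
    """Return the time at which the automatic halt would be expected.

    timeout_hours defaults to 24, the default named in this article.
    """
    return degraded_since + timedelta(hours=timeout_hours)


def hours_remaining(degraded_since, now, timeout_hours=24):
    """Return the hours left before the expected halt (never negative)."""
    remaining = shutdown_deadline(degraded_since, timeout_hours) - now
    return max(remaining.total_seconds() / 3600, 0.0)


if __name__ == "__main__":
    # Hypothetical timestamps, for illustration only.
    entered = datetime(2024, 1, 1, 8, 0)
    print(shutdown_deadline(entered))
    print(hours_remaining(entered, datetime(2024, 1, 1, 20, 0)))
```

This is only arithmetic on the timeout. It does not query the system, so the start time must come from the event log.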

Validate

Event Log

event log show -severity * -message-name callhome*

[node1: statd: callhome.shutdown.pending:alert]: Call home for SHUTDOWN PENDING (degraded mode)

event log show -severity * -message-name monitor.brokenDisk*

[node1: statd: monitor.brokenDisk.notice:info]: When two disks are broken in raid_dp volume, the system shuts down automatically every 24 hours to encourage you to replace the disk. If you reboot the system it will run for another 24 hours before shutting down. (The 24 hour timeout may be increased by altering the "raid.timeout" value using the "options" command.)

[node1: statd: monitor.shutdown.brokenDisk.pending:notice]: two data disks in RAID group "/aggregate_name/plex0/rg0" are broken. Halting system in 24 hours.
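When many such messages have to be reviewed, a small script can split each line into its parts. The Python sketch below assumes the bracketed form shown above ([node: process: event:severity]: text); the pattern and the function name parse_ems_line are illustrative and not part of ONTAP.

```python
import re

# Matches lines of the form "[node: process: event.name:severity]: text".
EMS_LINE = re.compile(
    r"^\[(?P<node>[^:\]]+): (?P<process>[^:\]]+): "
    r"(?P<event>[\w.]+):(?P<severity>\w+)\]: (?P<text>.*)$"
)


def parse_ems_line(line):
    """Return a dict of the message fields, or None if the line does not match."""
    match = EMS_LINE.match(line.strip())
    return match.groupdict() if match else None


if __name__ == "__main__":
    sample = (
        "[node1: statd: callhome.shutdown.pending:alert]: "
        "Call home for SHUTDOWN PENDING (degraded mode)"
    )
    fields = parse_ems_line(sample)
    print(fields["event"], fields["severity"])
```

Lines in any other layout return None, so they are skipped rather than misread.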

Command line

To verify aggregate status, run storage aggregate show-status:

RAID group /aggregate_name/plex0/rg1 (double degraded, block checksums)

      RAID Disk  Device    HA   SHELF  BAY  CHAN  Pool  Type  RPM    Used (MB/blks)      Phys (MB/blks)
      ---------  --------  ---  -----  ---  ----  ----  ----  -----  ------------------  ------------------
      dparity    0b.07.12  0b   7      12   SA:B  0     SAS   10000  1713523/3509295616  1716957/3516328368
      parity     0b.07.13  0b   7      13   SA:B  0     SAS   10000  1713523/3509295616  1716957/3516328368
      data       FAILED    N/A                                       1713523/ -
      data       0b.07.15  0b   7      15   SA:B  0     SAS   10000  1713523/3509295616  1716957/3516328368
      data       FAILED    N/A                                       1713523/ -
      data       0b.07.21  0b   7      21   SA:B  0     SAS   10000  1713523/3509295616  1716957/3516328368
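To check saved output like the above for several aggregates, the failed disks can be counted per RAID group. The Python sketch below assumes the layout shown in this article: a "RAID group <name> (<state>, ...)" header followed by rows whose first field is the disk role and whose second field is FAILED for a failed disk. The function name count_failed_disks is hypothetical.

```python
import re

# Header line such as "RAID group /aggr/plex0/rg1 (double degraded, block checksums)".
RAID_GROUP = re.compile(r"RAID group (\S+) \(")
DISK_ROLES = {"data", "parity", "dparity"}


def count_failed_disks(status_text):
    """Return {raid_group_name: number of rows marked FAILED}."""
    failed = {}
    current_group = None
    for line in status_text.splitlines():
        header = RAID_GROUP.search(line)
        if header:
            current_group = header.group(1)
            failed.setdefault(current_group, 0)
            continue
        tokens = line.split()
        if (
            current_group is not None
            and len(tokens) >= 2
            and tokens[0] in DISK_ROLES
            and tokens[1] == "FAILED"
        ):
            failed[current_group] += 1
    return failed


if __name__ == "__main__":
    sample = """\
RAID group /aggregate_name/plex0/rg1 (double degraded, block checksums)
      dparity  0b.07.12  0b  7  12  SA:B  0  SAS 10000 1713523/3509295616 1716957/3516328368
      data FAILED  N/A  1713523/ -
      data     0b.07.15  0b  7  15  SA:B  0  SAS 10000 1713523/3509295616 1716957/3516328368
      data FAILED  N/A  1713523/ -
"""
    print(count_failed_disks(sample))
```

A RAID group with two or more failed rows corresponds to the "double degraded" state in the example above.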

To verify failover status, run storage failover show and check whether the aggregate containing the disk that needs to be reconstructed or evacuated is in a partial giveback state:

storage failover show
                              Takeover
Node             Partner        Possible State Description
--------------   -------------- -------- -------------------------------------
Node-1           Node-2         true     Connected to Node-2, Partial giveback
Node-2           Node-1         true     Connected to Node-1.
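The same check can be scripted against saved command output. The Python sketch below assumes the four-column layout shown above (node, partner, takeover possible, state description) and reports nodes whose state description mentions a partial giveback. The function name nodes_in_partial_giveback is hypothetical.

```python
def nodes_in_partial_giveback(failover_text):
    """Return the nodes whose state description contains 'Partial giveback'."""
    nodes = []
    for line in failover_text.splitlines():
        # Split into at most four fields so the description stays intact.
        fields = line.split(None, 3)
        if len(fields) < 4 or fields[2] not in ("true", "false"):
            continue  # header, separator, or blank line
        node, _partner, _possible, description = fields
        if "partial giveback" in description.lower():
            nodes.append(node)
    return nodes


if __name__ == "__main__":
    sample = """\
                              Takeover
Node             Partner        Possible State Description
--------------   -------------- -------- -------------------------------------
Node-1           Node-2         true     Connected to Node-2, Partial giveback
Node-2           Node-1         true     Connected to Node-1.
"""
    print(nodes_in_partial_giveback(sample))
```

Header and separator lines are skipped because their third field is neither true nor false.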

 

Resolution

  1. If the node is in a partial giveback state, complete the giveback. Refer to Disk does not reconstruct or evacuate when in the partial giveback state.
  2. Replace any failed drives. To check your part status, refer to the KB article DISK FAILED - AutoSupport message.

Note: If you need assistance, contact NetApp Technical Support and reference this article.

 
