Node shutdown with "monitor.shutdown.brokenDisk:EMERGENCY" error
Applies to
- Data ONTAP all versions
- FAS / AFF models
Issue
- The system shuts down with the following error, and no takeover is performed
Example:
Sat Apr 04 19:00:00 JST [Node-01: statd: monitor.brokenDisk.notice:info]: When two disks are broken in raid_dp volume, the system shuts down automatically every 24 hours to encourage you to replace the disk. If you reboot the system it will run for another 24 hours before shutting down. (The 24 hour timeout may be increased by altering the "raid.timeout" value using the "options" command.)
Sat Apr 04 19:00:00 JST [Node-01: statd: monitor.shutdown.brokenDisk:EMERGENCY]: data disk,parity disk in RAID group "/aggr0_n1/plex0/rg0" are broken. Halting system now.
Sat Apr 04 19:00:00 JST [Node-01: monitor: license.check.v2.bootUnavail:debug]: Licensing not available. Boot completed: true, pending halt: true.
Sat Apr 04 19:00:00 JST [Node-01: monitor: license.state.v2.modified:debug]: Licensing state for local node changed from true to false.
Sat Apr 04 19:00:00 JST [Node-01: monitor: license.state.v2.modified:debug]: Licensing state for local node changed from true to false.
Sat Apr 04 19:00:26 JST [Node-01: shutdown_thread0: ha.localNodeShutDown:notice]: Shutdown of the local node has been initiated with inhibit_takeover set to TRUE.
Sat Apr 04 19:00:30 JST [Node-01: shutdown_thread0: kern.shutdown:notice]: System shut down because : "BROKEN DISK".
- Two disks in RAID group "/aggr0_n1/plex0/rg0" have failed, leaving the RAID group in a double degraded state
Example:
Aggregate aggr0_n1 (online, raid_dp, degraded) (block checksums)
Plex /aggr0_n1/plex0 (online, normal, active, pool0)
RAID group /aggr0_n1/plex0/rg0 (double degraded, block checksums)
RAID Disk Device  HA  SHELF BAY CHAN Pool Type RPM   Used (MB/blks)     Phys (MB/blks)
--------- ------  ------------- ---- ---- ---- ----- --------------     --------------
dparity   0d.03.0 0d  3     0   SA:A 0    SAS  10000 1142352/2339537408 1144641/2344225968
parity    FAILED  N/A                                1142352/ -
data      FAILED  N/A                                1142352/ -
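
The two symptoms above can also be checked from saved log and RAID status text. The following is a minimal sketch, not a NetApp tool: it assumes the log lines and RAID group rows have already been collected as plain text in the same layout as the examples in this article, and the function names (find_broken_disk_shutdowns, count_failed_disks) are hypothetical helpers introduced only for illustration.

```python
import re

# Matches the EMERGENCY event in the format shown in the log example.
# Assumption: other releases may word the message differently.
SHUTDOWN_EVENT = re.compile(
    r'monitor\.shutdown\.brokenDisk:EMERGENCY\]:\s*'
    r'(?P<disks>.+?) in RAID group "(?P<rg>[^"]+)"'
)


def find_broken_disk_shutdowns(log_lines):
    """Return a list of (raid_group, [disk roles]) for each shutdown event."""
    hits = []
    for line in log_lines:
        match = SHUTDOWN_EVENT.search(line)
        if match:
            roles = [r.strip() for r in match.group("disks").split(",")]
            hits.append((match.group("rg"), roles))
    return hits


def count_failed_disks(raid_lines):
    """Count RAID group rows whose second column (Device) reads FAILED."""
    count = 0
    for line in raid_lines:
        fields = line.split()
        if len(fields) >= 2 and fields[1] == "FAILED":
            count += 1
    return count


if __name__ == "__main__":
    # Sample input copied from the examples in this article.
    log = [
        'Sat Apr 04 19:00:00 JST [Node-01: statd: '
        'monitor.shutdown.brokenDisk:EMERGENCY]: data disk,parity disk in '
        'RAID group "/aggr0_n1/plex0/rg0" are broken. Halting system now.',
    ]
    raid = [
        "dparity 0d.03.0 0d 3 0 SA:A 0 SAS 10000 "
        "1142352/2339537408 1144641/2344225968",
        "parity FAILED N/A 1142352/ -",
        "data FAILED N/A 1142352/ -",
    ]
    print(find_broken_disk_shutdowns(log))
    print(count_failed_disks(raid))
```

A RAID group named in a shutdown event together with a failed-disk count of two in that group corresponds to the double degraded condition described in this article.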