Node reboots every 24 hours due to a degraded aggregate
Applies to
- ONTAP 9
- FAS/AFF
Issue
- Node reboots every 24 hours because two disks are broken in a raid_dp RAID group
- Event logs show messages similar to the following:
Jun 21 12:51:01 [VDF_SM2_CDOT2-2-02:monitor.raid.brokenDisk:error]: data disk,dparity disk in RAID group "/aggr0/plex0/rg0" are broken.
Jun 21 12:51:01 [VDF_SM2_CDOT2-2-02:monitor.brokenDisk.notice:notice]: When two disks are broken in raid_dp volume, the system shuts down automatically every 24 hours to encourage you to replace the disk. If you reboot the system, it will run for another 24 hours before shutting down.
- The reported aggregate shows a RAID group in a double degraded state:
Aggregate aggr0 (restricted, raid_dp, degraded) (block checksums)
  Plex /aggr0/plex0 (online, normal, active, pool1)
    RAID group /aggr0/plex0/rg0 (double degraded, block checksums)

      RAID Disk Device     HA SHELF BAY  CHAN Pool Type RPM   Used (MB/blks)      Phys (MB/blks)
      --------- ---------- ------------- ---- ---- ---- ----- ------------------- -------------------
      dparity   FAILED     N/A                                9324290/ -
      parity    0m.i1.0L82 0m 12    21        1    FSAS 7200  9324290/19096145920 9342976/19134414848
      data      FAILED     N/A                                9324290/ -
- The reported aggregate was not created by the customer, is not in use, and holds no data
- The aggregate is visible only from the nodeshell
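
The symptoms above can also be checked against saved console output. The following is a minimal Python sketch, not a NetApp tool: it assumes the aggregate status text has been captured in the same layout as the sample in this article, and the function name `find_double_degraded` and both regular expressions are illustrative assumptions derived only from that sample.

```python
import re

# Patterns are assumptions based solely on the sample output in this article.
RAID_GROUP_RE = re.compile(r"RAID group (\S+) \(([^,)]+)")
FAILED_DISK_RE = re.compile(r"^\s*(dparity|parity|data)\s+FAILED\b")


def find_double_degraded(text):
    """Return {raid_group: [roles of FAILED disks]} for double degraded groups."""
    result = {}
    current = None  # RAID group currently being read, if it is double degraded
    for line in text.splitlines():
        group = RAID_GROUP_RE.search(line)
        if group:
            name, state = group.group(1), group.group(2).strip()
            current = name if state == "double degraded" else None
            if current is not None:
                result[current] = []
            continue
        failed = FAILED_DISK_RE.match(line)
        if failed and current is not None:
            result[current].append(failed.group(1))
    return result


# Sample text copied from the output shown in this article.
SAMPLE = """\
Aggregate aggr0 (restricted, raid_dp, degraded) (block checksums)
Plex /aggr0/plex0 (online, normal, active, pool1)
RAID group /aggr0/plex0/rg0 (double degraded, block checksums)
dparity FAILED N/A 9324290/ -
parity 0m.i1.0L82 0m 12 21 1 FSAS 7200 9324290/19096145920 9342976/19134414848
data FAILED N/A 9324290/ -
"""

if __name__ == "__main__":
    for raid_group, roles in find_double_degraded(SAMPLE).items():
        print(f"{raid_group}: FAILED disks = {', '.join(roles)}")
```

A RAID group reported here with two FAILED entries matches the condition described in the `monitor.brokenDisk.notice` message, which is what triggers the 24-hour shutdown cycle.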