What happens when a disk fails and does not have a hot spare installed?
- Category:
- disk-drives
- Specialty:
- 7dot
- Last Updated:
- 1/9/2024, 10:55:07 PM
Applies to
- Data ONTAP 8 7-mode
- Spares
Answer
The following events occur when a disk fails in a filer that is not equipped with a hot spare disk:
- The storage system enters a state called degraded mode. In this state, the RAID feature allows the storage system to continue to run without losing data (although the storage system's performance might be affected).
- The event "callhome.spares.low" is logged in the EMS log.
- If a disk drive fails without a spare to reconstruct on, the system enters "degraded" mode.
- Depending on the RAID group type, the aggregate may go into the "completely degraded" state when a RAID group reaches the following number of missing or failed disks:
  - raid4: the RAID group has one missing or failed disk
  - raid-dp: the RAID group has two missing or failed disks
  - raid-tec: the RAID group has three missing or failed disks
- A mirrored aggregate is considered "completely degraded" if both plexes of the aggregate have missing or failed disks in the same positional RAID group.
- If the system runs in "completely degraded" mode for the defined time interval, it halts automatically to prevent a RAID group integrity failure and possible loss of data.
- The default timeout is usually 24 hours.
Warning: Replace all reported failed disks as soon as possible, as additional disk failures will cause data loss within the affected RAID group.
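The thresholds listed above can be sketched in a few lines of Python. This is only an illustration of the rule, not ONTAP code: the function names and the state labels "normal" and "failed" are assumptions made for this example, while the per-RAID-type disk counts come from the list above.

```python
# Failed-disk count at which a RAID group is "completely degraded",
# taken from the list above.
COMPLETELY_DEGRADED_AT = {"raid4": 1, "raid-dp": 2, "raid-tec": 3}


def raid_group_state(raid_type: str, failed_disks: int) -> str:
    """Classify one RAID group that has no spare to reconstruct onto.

    The labels "normal" and "failed" are illustrative assumptions.
    """
    if failed_disks < 0:
        raise ValueError("failed_disks must not be negative")
    limit = COMPLETELY_DEGRADED_AT[raid_type]
    if failed_disks == 0:
        return "normal"
    if failed_disks < limit:
        return "degraded"
    if failed_disks == limit:
        return "completely degraded"
    # Beyond the limit, the RAID group can no longer protect its data.
    return "failed"


def mirrored_completely_degraded(plex0_failed, plex1_failed) -> bool:
    """Return True if both plexes have failed disks in the same positional RAID group.

    Each argument is a list of failed-disk counts, one entry per RAID group,
    in positional order.
    """
    return any(a > 0 and b > 0 for a, b in zip(plex0_failed, plex1_failed))
```

For example, under these assumptions a raid-dp group with one failed disk is classified as "degraded", and the same group with two failed disks as "completely degraded".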
4. The storage system logs one of the following warning messages in the /etc/messages file and to the system console every hour:
   Parity disk is broken. Halting in m hours.
   Data disk n is broken. Halting in m hours.
   where "n" is the disk ID number, and "m" is the number of hours before the system halts.
5. Immediately before the system halts, a message similar to the following is sent to /etc/messages and the console:
   Sat Oct 29 13:26:42 PDT [statd]: When a disk is broken, the system shuts down automatically every 24 hours to encourage you to replace the disk. If you reboot the system it will run for another 24 hours before shutting down.
6. The system shuts down after the specified time period if it is still running in completely degraded mode. The shutdown ensures that you notice the disk failure. You can restart the storage system without fixing the disk, but the storage system will continue to shut itself off at the specified intervals until the issue is fixed.
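If you collect /etc/messages on a monitoring host, the hourly warnings described above can be picked out with a short script. The following is a minimal sketch in Python, assuming the message text matches the two forms quoted in this article. The function name is hypothetical, and the exact format of the disk ID "n" is not specified here, so the pattern accepts any run of non-space characters.

```python
import re
from typing import Optional, Tuple

# Matches the two hourly warnings quoted in this article. The disk ID format
# is an assumption: any run of non-space characters is accepted.
HALT_WARNING = re.compile(
    r"(?:Parity disk|Data disk (?P<disk>\S+)) is broken\. "
    r"Halting in (?P<hours>\d+) hours?\."
)


def parse_halt_warning(line: str) -> Optional[Tuple[str, int]]:
    """Return (disk, hours_until_halt) for a warning line, or None otherwise.

    The disk is reported as "parity" for the parity-disk form of the message.
    """
    match = HALT_WARNING.search(line)
    if match is None:
        return None
    disk = match.group("disk") or "parity"
    return disk, int(match.group("hours"))
```

A line such as "Data disk 3 is broken. Halting in 12 hours." would yield the disk ID "3" and 12 hours remaining, which a monitoring job could use to raise an alert before the automatic halt.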
Additional Information