NetApp Knowledge Base

What happens when a disk fails and does not have a hot spare installed?

Applies to

  • Data ONTAP 7 and earlier
  • Disk Drive
  • ONTAP 9

Answer

The following events occur when a disk fails in a filer that is not equipped with a hot spare disk:

  1. The storage system enters a state called degraded mode. In this state, RAID allows the storage system to continue to run without losing data, although the storage system's performance might be affected.
    • The event "callhome.spares.low" is logged in the EMS log.
    • Because there is no spare to reconstruct onto, the affected RAID group stays degraded until the failed disk is replaced.
  2. Depending on the RAID group type, the aggregate may go into the "completely degraded" state. An aggregate is in this state when one of its RAID groups meets the condition for its type:
    • raid4 - RAID group has one missing or failed disk
    • raid-dp - RAID group has two missing or failed disks
    • raid-tec - RAID group has three missing or failed disks
    • A mirrored aggregate is considered "completely degraded" if both plexes of the aggregate have missing or failed disks in the same positional RAID group.
  3. The system halts automatically to prevent a RAID group integrity failure and possible loss of data, if it runs in "completely degraded" mode for the defined time interval.
    • The interval is set by the raid.timeout option, which defaults to 24 hours.
Warning: Replace all reported failed disks as soon as possible, because additional disk failures will cause data loss within the affected RAID group.

  4. The storage system logs one of the following warning messages in the /etc/messages file and to the system console every hour:

Parity disk is broken. Halting in m hours.
Data disk n is broken. Halting in m hours.

where "n" is the disk ID number, and "m" is the number of hours before the system halts.
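
For monitoring, the two hourly warnings can be picked out of a log line with a pair of regular expressions. The following is a minimal Python sketch, not a NetApp tool: the function name parse_halt_warning is hypothetical, the patterns assume the messages appear exactly as quoted above with "n" and "m" as whole numbers, and accepting the singular "hour" is an assumption as well.

```python
import re

# Patterns mirror the two message templates quoted in this article.
# Assumption: "n" and "m" are whole numbers; "hours?" also accepts "1 hour".
# search() is used because real log lines may carry a timestamp prefix.
_DATA = re.compile(r"Data disk (\d+) is broken\. Halting in (\d+) hours?\.")
_PARITY = re.compile(r"Parity disk is broken\. Halting in (\d+) hours?\.")


def parse_halt_warning(line):
    """Return (disk_role, disk_id, hours_left), or None if the line does not match.

    disk_id is None for the parity message, which carries no disk number.
    """
    match = _DATA.search(line)
    if match:
        return ("data", int(match.group(1)), int(match.group(2)))
    match = _PARITY.search(line)
    if match:
        return ("parity", None, int(match.group(1)))
    return None
```

A caller could feed each line of /etc/messages to this function and raise an alert whenever the returned hours_left value drops below a chosen threshold.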

  5. Immediately before the system halts, a message similar to the following is sent to /etc/messages and the console:

Sat Oct 29 13:26:42 PDT [statd]: When a disk is broken, the system shuts down automatically every 24 hours to encourage you to replace the disk.If you reboot the system it will run for another 24 hours before shutting down.

  6. The system shuts down after the specified time period if it is still running in completely degraded mode. The shutdown ensures that you notice the disk failure. You can restart the storage system without fixing the disk, but the storage system will continue to shut itself off at the specified intervals until the issue is fixed.
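
The per-RAID-type thresholds and the halt countdown described in the steps above can be summarized in a small model. This is an illustrative Python sketch only: the names raid_group_state and hours_until_halt are hypothetical, mirrored aggregates are not modeled, and the "failed" state for losses beyond the threshold is an interpretation of the data-loss warning above rather than ONTAP terminology.

```python
# Number of missing or failed disks at which a RAID group counts as
# "completely degraded", taken from the list in step 2.
COMPLETELY_DEGRADED_AT = {"raid4": 1, "raid-dp": 2, "raid-tec": 3}

# Default halt interval cited in this article.
DEFAULT_TIMEOUT_HOURS = 24


def raid_group_state(raid_type, failed_disks):
    """Classify a single RAID group by its count of missing or failed disks."""
    limit = COMPLETELY_DEGRADED_AT[raid_type]
    if failed_disks == 0:
        return "normal"
    if failed_disks < limit:
        return "degraded"
    if failed_disks == limit:
        return "completely degraded"
    # Assumption: more failures than the threshold means data loss in the group.
    return "failed"


def hours_until_halt(hours_since_boot, timeout_hours=DEFAULT_TIMEOUT_HOURS):
    """Hours left before the automatic halt while completely degraded.

    The countdown restarts on reboot, so the input is the time since the
    last boot in this state. A result of 0 means the halt is due.
    """
    return max(0, timeout_hours - hours_since_boot)
```

For example, raid_group_state("raid-dp", 1) returns "degraded" while raid_group_state("raid4", 1) returns "completely degraded", which is why a single failure without a spare starts the halt countdown on a raid4 group but not on a raid-dp group.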
