NetApp Knowledge Base

How does Data ONTAP respond when a medium error occurs during a disk reconstruction?

Category:
data-ontap-7
Specialty:
hw

Applies to

Data ONTAP

Answer

  • A media scrub proactively reads all disks to detect and fix media errors before they cause problems during reconstruction or lead to double errors.
  • Although this significantly reduces the number of issues caused by such errors, it cannot prevent all of them.
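
The idea behind a scrub can be pictured on a single stripe with a minimal sketch. It assumes plain single-parity XOR, which is a simplification and not a description of how Data ONTAP implements RAID, and the names `xor_blocks` and `scrub_stripe` are hypothetical. A block that cannot be read is modelled as `None`; one such block can be rebuilt from the rest of the stripe, while two cannot, which is the double-error case mentioned above.

```python
from functools import reduce
from typing import List, Optional


def xor_blocks(blocks: List[bytes]) -> bytes:
    """XOR equal-length blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def scrub_stripe(data: List[Optional[bytes]], parity: bytes) -> List[Optional[bytes]]:
    """Rebuild at most one unreadable block (None) in a single-parity stripe.

    Hypothetical model: raises ValueError when more than one block is
    unreadable, because single parity cannot recover a double error.
    """
    missing = [i for i, block in enumerate(data) if block is None]
    if len(missing) > 1:
        raise ValueError("double error: stripe cannot be rebuilt from single parity")
    repaired = list(data)
    if missing:
        survivors = [block for block in data if block is not None]
        # XOR of all surviving data blocks and the parity yields the lost block.
        repaired[missing[0]] = xor_blocks(survivors + [parity])
    return repaired
```

In this model, a disk that is already being reconstructed counts as one missing block per stripe, so a medium error on a second disk in the same stripe leaves nothing to rebuild from. Finding and repairing such errors ahead of time is what the scrub is for.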

When a medium error occurs during reconstruction, Data ONTAP follows this recovery sequence:

  1. Data ONTAP marks the affected volume or aggregate as 'inconsistent' and ignores the medium error.
  2. Data ONTAP then attempts to start wafliron on that volume or aggregate.
       • If wafliron cannot be started, Data ONTAP tries to restrict (unmount) the volume.
       • If both attempts fail, the storage system panics with a message similar to:

PANIC: raid volfsm: vol vol_8TB_u33: fatal multi-disk error. in process config_thread

  3. On the next boot, the volume that caused the panic is restricted (not mountable).
       • If it is the root volume, the storage system will not boot.
       • If it is a non-root volume, the storage system boots with the volume restricted.

Note: At this point, Data ONTAP does not allow the storage system to boot with this volume online, because the medium error might cause metadata corruption.

  4. The user can either start wafliron manually on the affected volume, or reboot the storage system and run wafl_check on the affected volume.
  5. After the reconstruction completes, a scrub is started to clear any further double errors.
  6. After the scrub completes, the 'ignore medium error mode' is cleared on the volume.
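
The branching in the sequence above can be summarized as a small decision model. This is only an illustrative sketch of the documented outcomes, not Data ONTAP code: `Outcome`, `respond_to_medium_error`, and `state_after_panic_reboot` are hypothetical names, and the two boolean inputs stand in for whether wafliron could be started and whether the volume could be restricted.

```python
from enum import Enum


class Outcome(Enum):
    WAFLIRON_STARTED = "wafliron started on the volume or aggregate"
    VOLUME_RESTRICTED = "volume restricted (unmounted)"
    PANIC = "storage system panics"


def respond_to_medium_error(wafliron_starts: bool, restrict_succeeds: bool) -> Outcome:
    """Model the fallback order: wafliron first, then restrict, then panic."""
    if wafliron_starts:
        return Outcome.WAFLIRON_STARTED
    if restrict_succeeds:
        return Outcome.VOLUME_RESTRICTED
    return Outcome.PANIC


def state_after_panic_reboot(is_root_volume: bool) -> str:
    """Model the next boot after a panic; the affected volume is restricted."""
    if is_root_volume:
        return "storage system does not boot"
    return "storage system boots with the volume restricted"
```

For example, `respond_to_medium_error(False, False)` returns `Outcome.PANIC`, and `state_after_panic_reboot(False)` describes the non-root case in which the system comes up with the volume restricted.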

Unrecoverable data is replaced with zeroed blocks. If that data is ever read, at least some applications will recognize the zeroed blocks as bad data.
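
For applications that do not check for this themselves, a simple scan can list candidate zeroed blocks in a file's contents. The sketch below is an assumption-laden illustration: `zeroed_block_offsets` is a hypothetical helper, and the 4096-byte block size is an assumption that should be adjusted to the layout actually in use. An all-zero block can also be legitimate data, so the result is a list of blocks to inspect, not proof of data loss.

```python
from typing import List

BLOCK_SIZE = 4096  # assumption: 4 KB blocks; adjust for the actual layout


def zeroed_block_offsets(data: bytes, block_size: int = BLOCK_SIZE) -> List[int]:
    """Return byte offsets of full blocks that contain only zero bytes.

    A trailing partial block, if any, is ignored.
    """
    zero_block = bytes(block_size)
    last_start = len(data) - block_size
    return [
        offset
        for offset in range(0, last_start + 1, block_size)
        if data[offset:offset + block_size] == zero_block
    ]
```

A caller would read the file in binary mode and pass its contents to the helper, then compare the reported offsets against what the application expects to find there.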

 
