Media scrub proactively reads all the disks to detect and fix media errors before they cause problems during a reconstruction or result in double errors. Although this significantly reduces the number of issues caused by such errors, it cannot prevent all of them.
The recovery sequence is as follows:
- Data ONTAP marks the volume or aggregate involved as 'inconsistent' and ignores the medium error.
- Next, Data ONTAP attempts to start 'wafliron' on that volume or aggregate.
- If this does not succeed, it tries to restrict (unmount) the volume.
- If both fail, the storage system panics with a message similar to the following:
PANIC: raid volfsm: vol vol_8TB_u33: fatal multi-disk error. in process config_thread
- On next boot, the volume that caused the panic will be restricted (not mountable).
- If it is a root volume, the storage system will not boot.
- If it is a non-root volume, the storage system will boot with the volume restricted.
Note: Data ONTAP will not allow the storage system to boot with this volume online at this point, as the medium error might cause metadata corruption.
- The user can manually start wafliron on the affected volume or reboot the storage system and run wafl_check on the affected volume.
- After the reconstruction completes, a scrub is started to clear any further double errors.
- After the scrub completes, the 'ignore medium error mode' will be cleared on the volume.
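The fallback order in the sequence above can be sketched as a small decision model. This is only an illustration of the logic as described here, not Data ONTAP code: the names `Outcome`, `recover`, and `state_after_panic_reboot` are hypothetical, and the two boolean inputs stand in for whether each step succeeded.

```python
from enum import Enum, auto


class Outcome(Enum):
    """Hypothetical labels for where the recovery sequence ends up."""
    WAFLIRON_RUNNING = auto()
    VOLUME_RESTRICTED = auto()
    PANIC = auto()


def recover(wafliron_started: bool, restrict_succeeded: bool) -> Outcome:
    """Model the fallback order: wafliron first, then restrict, then panic."""
    if wafliron_started:
        return Outcome.WAFLIRON_RUNNING
    if restrict_succeeded:
        return Outcome.VOLUME_RESTRICTED
    # Both attempts failed, so the storage system panics.
    return Outcome.PANIC


def state_after_panic_reboot(is_root_volume: bool) -> str:
    """Model what the next boot looks like after a panic."""
    if is_root_volume:
        return "storage system does not boot"
    return "storage system boots with the volume restricted"


if __name__ == "__main__":
    result = recover(wafliron_started=False, restrict_succeeded=False)
    print(result.name)
    if result is Outcome.PANIC:
        print(state_after_panic_reboot(is_root_volume=False))
```

The model makes the ordering explicit: a panic is reached only when both the wafliron start and the restrict attempt fail.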
The unrecoverable data is replaced with zeroed blocks. If that data is ever read, at least some applications will recognize the zeroed blocks as bad data.
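For applications that do not detect this on their own, a host-side check for all-zero blocks is one possible way to flag suspect regions in a file. The following is a minimal sketch, not a NetApp tool: the function name `zeroed_block_offsets` is hypothetical, and the 4096-byte block size is an assumption that should be adjusted to the layout in use. A zeroed block can also be legitimate data (for example, in a sparse or preallocated file), so a hit is only a hint that the region deserves a closer look.

```python
BLOCK_SIZE = 4096  # assumed block size in bytes; adjust as needed


def zeroed_block_offsets(data: bytes, block_size: int = BLOCK_SIZE) -> list[int]:
    """Return byte offsets of every full block that contains only zero bytes."""
    zero_block = bytes(block_size)
    offsets = []
    # Step through full blocks only; a trailing partial block is ignored.
    for offset in range(0, len(data) - block_size + 1, block_size):
        if data[offset:offset + block_size] == zero_block:
            offsets.append(offset)
    return offsets


if __name__ == "__main__":
    # Three blocks: non-zero, zeroed, non-zero.
    sample = b"\x01" * BLOCK_SIZE + bytes(BLOCK_SIZE) + b"\x02" * BLOCK_SIZE
    print(zeroed_block_offsets(sample))
```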