NetApp Knowledge Base

What is Media Scan on E-Series storage systems?


Applies to

  • E-Series Controller Firmware 6.xx
  • E-Series Controller Firmware 7.xx
  • E-Series Controller Firmware 8.xx

Answer

  • Media scan is a process that, when enabled, runs during idle time to check the physical disks in a volume.
    • It verifies that the sectors are readable and, if Redundancy Check is enabled, also checks RAID parity for consistency.
    • If it finds unreadable sectors or data-parity mismatches, it reports them to the Major Event Log (MEL) so that the user is aware of them.
  • The process runs at a predetermined rate.
    • For example, if a 30-day interval is selected when media scan is enabled (the interval is customizable), the volume is scanned at a rate that would complete one pass in 30 days.
    • However, because media scan only operates during idle time and gives priority to host IO, a pass might take longer than the selected interval.
    • Once a pass completes, the scan automatically starts over, so the drives are continuously checked in the background.
  • The limitation is that an issue is not discovered until the controllers scan the part of the drive that contains the error.
    • Thus, if a drive develops bad sectors or corruption one day after that region was last scanned, the problem is not detected until the scan next reaches that region of the drive (or until the error is found during some other operation).
  • Any performance hit to the host IO is negligible.
    • Media scan pauses to give priority to host IO, but the initial response time might be delayed very slightly while the controller switches from media scan to servicing the IO.
    • For most purposes, this will not be noticeable.
  • Media Scan errors as reported in the MEL

| Reported Error | Description | Result |
|---|---|---|
| Unrecovered media error | The data could not be read on its first attempt or on either of the 2 subsequent retries. | If any of the 3 tries is successful, the data is returned to the host. If the read retries are unsuccessful, error correction is attempted via VDD Repair (except on RAID 0). |
| Recovered media error | The drive could not read the requested data on its first attempt, but succeeded on a subsequent attempt. | Data is written to the drive and verified. |
| Redundancy mismatches | Redundancy errors are found. | The first 10 redundancy mismatches found on a logical drive are reported. Operating system data checking operations should be executed. |
| Unfixable error | The data could not be read, and parity or redundancy information could not be used to regenerate it. | An error is reported. |
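
The scan interval described earlier comes down to simple arithmetic: the volume capacity divided by the interval gives the average rate the scan has to sustain. The sketch below illustrates only that calculation. The function name and the 10 TiB volume size are hypothetical, and it does not model how the controller firmware actually schedules media scan around host IO.

```python
def required_scan_rate_mib_s(volume_gib: float, interval_days: int) -> float:
    """Average rate (MiB/s) needed to cover the whole volume once per interval.

    Hypothetical helper for illustration only; the real scan runs only during
    idle time, so a pass can take longer than the configured interval.
    """
    seconds_per_interval = interval_days * 24 * 60 * 60
    return volume_gib * 1024 / seconds_per_interval


if __name__ == "__main__":
    # Assumed example: a 10 TiB (10240 GiB) volume with a 30-day interval.
    rate = required_scan_rate_mib_s(10240, 30)
    print(f"{rate:.2f} MiB/s")  # about 4 MiB/s averaged over the interval
```

A rate this low relative to typical drive throughput is consistent with the statement that the impact on host IO is negligible. The same arithmetic also shows that a sector which goes bad just after being scanned can wait roughly one full interval before the scan returns to it.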

Additional Information

VDD repair:

  1. The VDD Repair starts by reading the remaining data and the parity from the RAID stripe.

  2. The VDD Repair then uses the stripe's data and parity to calculate the data that resides in the unreadable sector of the drive.

  3. If the data is reconstituted successfully from the data and parity in the rest of the stripe, the read is returned to the host.

  4. If the VDD Repair is successful, then it does a 'Write Verify' SCSI operation. This writes the reconstituted data to the unreadable sector, and then immediately reads it back.

  5. If the VDD Repair fails because the data cannot be reconstituted (due to a bad read on another drive in a RAID 5 group, or a degraded RAID group without enough redundancy), the affected LBA in the RAID volume is marked as an 'Unreadable Sector' (it ends up in the USM log), and an error is returned to the host. At this point, the data at that LBA is lost.

  6. In the background, the Write Verify to the 'bad' sector causes the drive firmware to reallocate the physical sector (transparently to the E-Series controllers).
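
The reconstruction in steps 1 to 3 and the failure case in step 5 can be sketched for a single-parity (RAID 5 style) stripe, where the parity segment is the XOR of the data segments. This is a minimal illustration under that assumption, not the E-Series VDD implementation. The function names and the `None` marker for an unreadable segment are hypothetical, and the Write Verify step is not modeled.

```python
from functools import reduce
from typing import Optional, Sequence


def xor_blocks(blocks: Sequence[bytes]) -> bytes:
    """Byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def repair_segment(stripe: Sequence[Optional[bytes]]) -> Optional[bytes]:
    """Rebuild the one unreadable segment of a single-parity stripe.

    `stripe` holds every data segment plus the parity segment; None marks a
    segment that could not be read (hypothetical convention). Returns the
    rebuilt bytes, or None when reconstruction is not possible.
    """
    missing = [i for i, segment in enumerate(stripe) if segment is None]
    if len(missing) != 1:
        # Either nothing needs repair, or more than one segment is unreadable
        # and single parity does not carry enough redundancy (step 5).
        return None
    survivors = [segment for segment in stripe if segment is not None]
    # XOR of all surviving segments (data + parity) yields the missing one.
    return xor_blocks(survivors)


if __name__ == "__main__":
    d0, d1, d2 = b"\x01\x02", b"\x10\x20", b"\xff\x00"
    parity = xor_blocks([d0, d1, d2])
    print(repair_segment([d0, None, d2, parity]) == d1)   # single bad read
    print(repair_segment([d0, None, None, parity]))       # second bad read
```

With one unreadable segment the rebuilt bytes match the original data. With a second bad read in the same stripe the function returns `None`, which corresponds to the case where the LBA is marked as an unreadable sector.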
