NetApp Knowledge Base

E-Series drive failure - AutoSupport Message

Applies to

NetApp E-Series

Event Summary

SANtricity reports a failed hard disk drive (HDD) or solid-state drive (SSD).

Validate

  1. Log in to SANtricity Storage Manager or SANtricity System Manager
  2. Confirm SANtricity reports one or more of the following error messages in the Recovery Guru:
    • FAILED_DRIVE-Recovery Failure Type Code: 23 Failed Drive
    • FAILED_DRIVE-Recovery Failure Type Code: 23 Failed Physical Disk - Unassigned or Standby Hot Spare
    • DISK_POOL_DRIVE_FAILURE-Recovery Failure Type Code: 443 Failed Disk Pool Drive
  3. Confirm from the Hardware tab that the drive shows as failed
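
The Recovery Guru messages listed in step 2 share a common shape: an event name, the text "Recovery Failure Type Code:", a numeric code, and a description. The following is a minimal Python sketch for matching such lines in saved text, for example when triaging an exported log. The names RecoveryGuruEvent, parse_recovery_guru_line, and is_drive_failure are hypothetical helpers, not part of any NetApp tool, and the line format is assumed to match the three examples above exactly.

```python
import re
from typing import NamedTuple, Optional


class RecoveryGuruEvent(NamedTuple):
    name: str         # e.g. "FAILED_DRIVE"
    code: int         # e.g. 23
    description: str  # e.g. "Failed Drive"


# Assumed format, taken from the three messages listed in this article.
_PATTERN = re.compile(
    r"^(?P<name>[A-Z_]+)-Recovery Failure Type Code:\s*"
    r"(?P<code>\d+)\s+(?P<description>.+)$"
)

# (event name, code) pairs that this article treats as drive failures.
DRIVE_FAILURE_EVENTS = {
    ("FAILED_DRIVE", 23),
    ("DISK_POOL_DRIVE_FAILURE", 443),
}


def parse_recovery_guru_line(line: str) -> Optional[RecoveryGuruEvent]:
    """Return the parsed event, or None if the line does not match."""
    match = _PATTERN.match(line.strip())
    if match is None:
        return None
    return RecoveryGuruEvent(
        name=match["name"],
        code=int(match["code"]),
        description=match["description"].strip(),
    )


def is_drive_failure(event: Optional[RecoveryGuruEvent]) -> bool:
    """True if the event is one of the drive-failure messages listed above."""
    return event is not None and (event.name, event.code) in DRIVE_FAILURE_EVENTS
```

A match here only indicates that the text contains one of the listed messages; the drive state should still be confirmed from the Hardware tab as described in step 3.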

Resolution

  • Contact NetApp Technical Support, as this event requires a drive replacement.
  • If AutoSupport is enabled, this event should already have created a technical support case.

Additional Information

Possible causes for drive failures:

  • A user manually failed the drive via the shell, SMcli, or the GUI.
  • An unrecovered write error / write failure.
    • Failure of a host-initiated write.
    • Failure of an internally initiated write.
  • Failure of a drive firmware download.
  • Exceeding a synthetic predictive failure analysis (SPFA) threshold. In this case the controller detected and reported the errors and initiated the failure.
    • Drive firmware may be a factor here if there are any known issues that would cause premature drive failures.
  • Exceeding a SMART (Self-Monitoring, Analysis and Reporting Technology) failure threshold. In this case the drive itself detected and reported the errors and initiated the failure.

For Legacy/Traditional RAID Volume Groups:

  • Reconstruction to a hot spare should begin automatically when the drive fails if a hot spare is available.
  • Once reconstruction to the hot spare completes, replace the failed drive.
    • As a best practice, wait until the failed drive disappears from the GUI before inserting the new drive.
  • When the new drive is inserted, copy back from the hot spare or reconstruction from parity should begin to the new drive automatically.
  • Once copy back / reconstruction has completed, the system will return to optimal.

For Dynamic Disk Pools:

  • Reconstruction / rebalancing will begin automatically when the drive fails if there is preservation capacity available.
  • Once reconstruction / rebalancing completes, replace the failed drive.
    • As a best practice, wait until the failed drive disappears from the GUI before inserting the new drive.
  • When the new drive is inserted, copy back / rebalancing will begin to the new drive automatically.
  • Once copy back / rebalancing has completed, the system will return to optimal.
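
The replacement sequence for both volume groups and disk pools follows the same order of checks. The following Python sketch restates that order as a small decision function. It is illustrative only: DriveReplacementState and next_step are hypothetical names, the state flags would have to be read manually from the SANtricity GUI, and the sketch does not query or control a storage array.

```python
from dataclasses import dataclass


@dataclass
class DriveReplacementState:
    # Hot spare (volume group) or preservation capacity (disk pool) is available.
    spare_capacity_available: bool
    # Reconstruction / rebalancing onto the spare capacity has finished.
    reconstruction_complete: bool
    # The failed drive is still shown in the GUI.
    failed_drive_listed: bool
    # The replacement drive has been inserted.
    new_drive_inserted: bool
    # Copy back / reconstruction / rebalancing to the new drive has finished.
    copy_back_complete: bool


def next_step(state: DriveReplacementState) -> str:
    """Return the next action, following the order of the steps in this article."""
    if not state.new_drive_inserted:
        if state.spare_capacity_available and not state.reconstruction_complete:
            return "Wait for reconstruction / rebalancing to complete"
        if state.failed_drive_listed:
            return "Remove the failed drive and wait until it disappears from the GUI"
        return "Insert the new drive"
    if not state.copy_back_complete:
        return "Wait for copy back / reconstruction / rebalancing to the new drive"
    return "System should return to optimal"
```

For example, with a hot spare available and reconstruction still running, next_step returns the instruction to wait, which matches the guidance to replace the failed drive only after reconstruction completes.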

If there are any questions or concerns with the details and procedures above, please contact NetApp Technical Support and reference this article for further assistance.
