NetApp Knowledge Base

BIOS updates for memory reliability and the PPR feature

Applies to

  • Platforms:
    • AFF A900 / FAS9500
    • AFF A800 / AFF C800
    • AFF A700 / FAS9000
    • AFF A700s
    • AFF A400 / AFF C400 / FAS8300 / FAS8700
  • Post Package Repair (PPR)

Answer

What products include the PPR feature?
Products                   | BIOS Version | Bundled with ONTAP                      | RFE Report
AFF A700, FAS9000          | 10.9         | 9.5P15, 9.6P12, 9.7P8, 9.8 and later    | 1278330
AFF A700s                  | 12.8         | 9.5P15, 9.6P12, 9.7P9, 9.8 and later    | 1354656
AFF A800                   | 13.10        | 9.5P18, 9.6P15, 9.7P14, 9.8P4 and later | 1371369
AFF A400, FAS8700, FAS8300 | 16.3         | 9.7P12, 9.8P2 and later                 | 1373545
AFF A900, FAS9500          | 18.3         | 9.10.1RC2, 9.10.1 and later             | N/A

Note: For BIOS 10.9+ on AFF A700 and FAS9000, ONTAP support is also required (the releases listed in the table).
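
The version requirements above can be checked mechanically. The following is a minimal Python sketch, not a NetApp tool: the names `MIN_ONTAP`, `parse` and `meets_minimum` are hypothetical, and the comparison rules are assumptions (within a listed release family the patch level must be at least the one shown, release candidates sort before the general release, and any family newer than the newest listed one qualifies as "and later").

```python
import re

# Minimum ONTAP releases per platform, transcribed from the table above.
_ROWS = [
    (("AFF A700", "FAS9000"), ["9.5P15", "9.6P12", "9.7P8", "9.8"]),
    (("AFF A700s",), ["9.5P15", "9.6P12", "9.7P9", "9.8"]),
    (("AFF A800",), ["9.5P18", "9.6P15", "9.7P14", "9.8P4"]),
    (("AFF A400", "FAS8700", "FAS8300"), ["9.7P12", "9.8P2"]),
    (("AFF A900", "FAS9500"), ["9.10.1RC2", "9.10.1"]),
]
MIN_ONTAP = {p: versions for platforms, versions in _ROWS for p in platforms}

# Assumed ordering inside one release family: RC < general release < patch.
_STAGE = {"RC": 0, "": 1, "P": 2}


def parse(version):
    """Split e.g. '9.7P8' into ((9, 7), stage_rank, 8)."""
    m = re.fullmatch(r"(\d+(?:\.\d+)+)(?:(RC|P)(\d+))?", version)
    if m is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    family = tuple(int(part) for part in m.group(1).split("."))
    return family, _STAGE[m.group(2) or ""], int(m.group(3) or 0)


def meets_minimum(platform, version):
    """Return True if `version` satisfies the table row for `platform`."""
    family, stage, number = parse(version)
    minimums = [parse(v) for v in MIN_ONTAP[platform]]
    for min_family, min_stage, min_number in minimums:
        if family == min_family:
            # Same release family: compare stage and patch number.
            return (stage, number) >= (min_stage, min_number)
    # Unlisted family: only families newer than the newest listed one qualify.
    return family > max(m[0] for m in minimums)
```

For example, `meets_minimum("AFF A800", "9.8P3")` is false under these assumptions because the table lists 9.8P4 for that platform, while `meets_minimum("AFF A800", "9.9.1")` is true.
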
What are the BIOS update and Post Package Repair (PPR) enhancements for?

Recent BIOS updates address various memory event handling functions on a per-platform basis. NetApp systems use different Intel CPU chipsets, so each platform has its own BIOS update content.

NetApp is introducing Post Package Repair (PPR) into its products to improve the overall operational experience. PPR is a new memory capability that works in conjunction with newly added ONTAP features. These features allow NetApp to use PPR-enabled memory to proactively address memory issues, reducing the need to replace DIMMs when memory errors are detected. NetApp is also adopting new BIOS updates to improve the handling of memory-related errors (correctable and uncorrectable ECC errors).

  • The newer memory technologies that NetApp uses, beginning with DDR4, include PPR capability.
  • When combined with a PPR-enabled controller and operating system, the system can map out a bad memory row and use a spare row on the DIMM instead.
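
The row-sparing idea described above can be pictured with a small conceptual model. This Python sketch is illustrative only: the class `DimmBank` and its methods are hypothetical, and the real repair is carried out by the memory hardware and BIOS, not by code like this.

```python
class DimmBank:
    """Toy model of one memory bank with a limited pool of spare rows."""

    def __init__(self, spare_rows):
        self.free_spares = list(spare_rows)  # spare rows not yet used
        self.remap = {}                      # failing row -> spare row

    def repair(self, bad_row):
        """Map bad_row to a spare row; return False if no spare is left."""
        if bad_row in self.remap:
            return True  # this row was already repaired
        if not self.free_spares:
            return False  # nothing left to remap to: repair fails
        self.remap[bad_row] = self.free_spares.pop(0)
        return True

    def resolve(self, row):
        """Return the row that actually serves an access to `row`."""
        return self.remap.get(row, row)
```

In this model, a bank created with a single spare can repair one failing row; a second failing row in the same bank cannot be repaired, which corresponds to the case where the DIMM has to be replaced.
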
Why are these updates important and why should I upgrade?

NetApp’s newest systems have substantially more memory capacity and higher memory speed than older models. They use DDR4 memory and have anywhere from 4x to 12x the memory of older systems, while memory quality has remained at a steady level. Because a system contains more DIMMs, its mean time between failures (MTBF) decreases, and memory issues may require more system maintenance.
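
The effect of DIMM count on MTBF can be shown with simple arithmetic. The sketch below assumes independent DIMMs with identical, constant failure rates, in which case the failure rates add and the memory subsystem MTBF is the per-DIMM MTBF divided by the number of DIMMs. The function name and the figures in the example are hypothetical and are not NetApp reliability data.

```python
def memory_mtbf_hours(dimm_mtbf_hours, dimm_count):
    """MTBF of the memory subsystem if any single DIMM failure counts.

    Assumes independent DIMMs with the same constant failure rate, so the
    combined failure rate is dimm_count / dimm_mtbf_hours and the combined
    MTBF is its reciprocal.
    """
    if dimm_count < 1:
        raise ValueError("dimm_count must be at least 1")
    return dimm_mtbf_hours / dimm_count


# Hypothetical per-DIMM MTBF of 1,000,000 hours:
few_dimms = memory_mtbf_hours(1_000_000, 8)    # 125000.0 hours
many_dimms = memory_mtbf_hours(1_000_000, 32)  # 31250.0 hours
```

With the same per-DIMM quality, quadrupling the DIMM count in this model cuts the memory subsystem MTBF to a quarter, which is why per-DIMM repair mechanisms matter more on larger systems.
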

Upgrading the system’s BIOS helps to incrementally reduce both the number of DIMM replacements and the effort spent addressing memory-related failures on the system.

  • As Intel updates its BIOS to add memory testing or memory error handling fixes, NetApp tests these fixes and provides them on the NetApp Support site.
  • BIOS updates are platform-specific, and each revision carries incremental improvements, fixes, or new features, such as PPR functionality. NetApp provides regular updates to improve the overall system experience.
  • Initial PPR functionality is enabled based on the platform (see platform-specific functionality). Future updates will add failure mode detection capabilities and further reduce the need to replace DIMMs.
How will the PPR feature change the behavior of my systems?
  1. When an uncorrectable memory error is encountered, the system will panic.
  2. In an HA configuration, the partner will take over and continue to provide services.
  3. When the system reboots from BIOS, it will begin a PPR memory test.

The PPR test can take several minutes to complete. The results are displayed on the system console.

What action is needed once the PPR test has completed?
  • Replacement not required - If PPR can detect the problematic memory segment, it will repair it.
    • If the system can recover, it will log a message about the event: PPR:Sequence PASS.
    • No further action is required.
  • Replacement required - If the memory fails or cannot be repaired, the system will not boot ONTAP and a DIMM replacement will be required.
    • If the same DIMM experiences a second uncorrectable ECC (UECC) error and panic, you can choose to replace the DIMM. Contact NetApp to order a DIMM replacement.
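
The outcomes above can be summarized as a small decision helper. This is a hedged Python sketch, not a NetApp utility: the function `next_action`, its parameters and its return strings are hypothetical. Only the marker text `PPR:Sequence PASS` comes from this article, and the final fallback branch covers a case the article does not describe.

```python
PASS_MARKER = "PPR:Sequence PASS"  # console message quoted in this article


def next_action(console_output, ontap_booted, uecc_panics_on_dimm):
    """Suggest a follow-up after the PPR test, based on the rules above.

    console_output      -- captured system console text (string)
    ontap_booted        -- whether the node booted ONTAP after the test
    uecc_panics_on_dimm -- UECC panics seen on the same DIMM so far
    """
    if not ontap_booted:
        # Memory failed or could not be repaired.
        return "replace DIMM"
    if uecc_panics_on_dimm >= 2:
        # Second UECC panic on the same DIMM: replacement is optional.
        return "consider replacing DIMM"
    if PASS_MARKER in console_output:
        return "no action required"
    # Assumption: not covered by the article, so defer to a manual review.
    return "review console output"
```

A first UECC panic followed by a console that contains `PPR:Sequence PASS` and a successful ONTAP boot therefore maps to "no action required".
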
What is being planned in future BIOS/PPR updates?

Future updates will add failure mode detection capabilities to further reduce the need to replace DIMMs.

Additional Information

For general information on troubleshooting uncorrectable ECC memory errors, see: How to troubleshoot uncorrectable memory errors on AFF and FAS systems

NetApp provides no representations or warranties regarding the accuracy or reliability or serviceability of any information or recommendations provided in this publication or with respect to any results that may be obtained by the use of the information or observance of any recommendations provided herein. The information in this document is distributed AS IS and the use of this information or the implementation of any recommendations or techniques herein is a customer's responsibility and depends on the customer's ability to evaluate and integrate them into the customer's operational environment. This document and the information contained herein may be used solely in connection with the NetApp products discussed in this document.