NetApp Knowledgebase

BIOS updates for memory reliability and the PPR feature

Applies to

  • AFF A700s
  • AFF A700
  • FAS9000

Answer

What are the BIOS update and Post Package Repair (PPR) enhancements for?

Recent BIOS updates address various memory event handling functions on a per-platform basis. NetApp systems use different Intel CPU chipsets, so each platform has its own BIOS update content.

NetApp is introducing Post Package Repair (PPR) into its products to improve the overall operational experience. PPR is a new memory capability that works in conjunction with newly created ONTAP features. These features allow NetApp to leverage PPR-enabled memory and proactively address memory issues, reducing the need to replace failing DIMMs. NetApp is also adopting new BIOS updates to improve handling of memory-related errors (correctable and uncorrectable ECC errors).

The newer memory technologies that NetApp uses, beginning with DDR4, include PPR capability. When combined with a PPR-enabled controller and operating system, the system can map out a bad memory row and use a spare row on the DIMM instead. As Intel updates its BIOS to add memory testing or memory error handling fixes, NetApp tests these fixes and provides them on the NetApp Support site.
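The row-sparing idea can be sketched with a toy model. The Python below is an illustration only: the `Bank` class, its method names, and the single spare row are hypothetical and do not reflect the actual BIOS, ONTAP, or DRAM implementation, where the repair is carried out in hardware.

```python
class Bank:
    """Toy model of a DRAM bank with a pool of spare rows (hypothetical)."""

    def __init__(self, spare_rows=1):
        self.spares_left = spare_rows
        self.remapped = {}  # faulty row address -> index of the spare used

    def repair(self, row):
        """Map a faulty row onto a spare row.

        Returns True if the row is backed by a spare afterwards,
        False if no spare is left to repair it.
        """
        if row in self.remapped:
            return True  # already repaired earlier
        if self.spares_left == 0:
            return False  # nothing left to map to
        self.remapped[row] = len(self.remapped)
        self.spares_left -= 1
        return True


bank = Bank(spare_rows=1)
first = bank.repair(0x1A2B)   # a spare is available, so the repair succeeds
second = bank.repair(0x3C4D)  # no spare is left, so this row cannot be repaired
```

In this model the second fault is the case where the DIMM would have to be replaced, because the spare capacity is finite.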

Why are these updates important and why should I upgrade?

NetApp’s newest systems have far more memory capacity and higher memory speeds than older models. They use DDR4 memory and have anywhere from 4x to 12x the memory of older systems, but memory quality has remained at a steady-state level. Because each system contains more DIMMs, system mean time between failures (MTBF) decreases, and the system may need more maintenance for memory issues.
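The effect of DIMM count on MTBF can be shown with simple arithmetic. The sketch below assumes that DIMMs fail independently at a constant rate and that any single DIMM failure counts as a system memory event; under that assumption the failure rates add, so the memory MTBF is the per-DIMM MTBF divided by the number of DIMMs. The function name and the figures are hypothetical and are not NetApp reliability data.

```python
def system_memory_mtbf(dimm_mtbf_hours, dimm_count):
    """MTBF of the memory subsystem under the independence assumption."""
    if dimm_count < 1:
        raise ValueError("dimm_count must be at least 1")
    # Failure rates add: N DIMMs produce failures N times as often as one DIMM.
    return dimm_mtbf_hours / dimm_count


# Hypothetical figures, for illustration only.
older = system_memory_mtbf(4_000_000, 8)   # 500000.0 hours
newer = system_memory_mtbf(4_000_000, 32)  # 125000.0 hours
```

With per-DIMM quality unchanged, four times as many DIMMs gives one quarter of the memory MTBF in this model.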

Upgrading the system’s BIOS will help to incrementally reduce the need to replace DIMMs and, with it, the effort spent addressing memory-related failures on the system.

  • BIOS updates are platform-specific, and each revision carries incremental improvements, fixes, or new features - such as PPR functionality.  NetApp provides regular updates to improve the overall system experience.
  • Initial PPR functionality is enabled based on the platform (see platform-specific functionality). Future updates will add failure mode detection capabilities and further reduce the need to replace DIMMs.

How will the PPR feature change the behavior of my systems?

When an uncorrectable memory error is encountered, the system will panic. In an HA configuration, the partner will take over and continue to provide services. When the system reboots from BIOS, it will begin a PPR memory test. The PPR test can take several minutes to check the memory and display the results on the system console.

If PPR can detect the problematic memory segment, it will repair it. If the system can recover, it will report the event in its messages, and no further action is required. If the memory fails or cannot be repaired, the system will not boot ONTAP and a DIMM replacement will be required.
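The two outcomes described above can be summarized as a small decision function. This is a sketch of the documented behavior only, not the firmware logic; the `PprResult` enum and the `next_step` function are hypothetical names introduced for illustration.

```python
from enum import Enum


class PprResult(Enum):
    """Outcome of the PPR memory test at boot (hypothetical names)."""
    REPAIRED = "repaired"
    NOT_REPAIRABLE = "not repairable"


def next_step(result):
    """Return what happens after the PPR memory test finishes."""
    if result is PprResult.REPAIRED:
        # The faulty segment was repaired and the system recovers.
        return "boot ONTAP; event is reported; no further action required"
    # The memory failed the test or could not be repaired.
    return "ONTAP does not boot; replace the DIMM"


after_repair = next_step(PprResult.REPAIRED)
after_failure = next_step(PprResult.NOT_REPAIRABLE)
```

In an HA configuration the partner continues to serve data while the affected node runs through this sequence.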

What is being planned in future BIOS/PPR updates?

Future updates will add failure mode detection capabilities to further reduce the need to replace DIMMs.

What are the product-specific details for BIOS/PPR updates?

Products            BIOS Version   Notes
AFF A700, FAS9000   10.9           Initial PPR feature, Intel IPU 2019.2
                                   Requires fix for ONTAP Bug 1278330
AFF A700s           12.8           Initial PPR feature
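As a convenience, the versions listed above can be turned into a simple check. The sketch below makes two assumptions that the article does not state: that BIOS versions are dotted integers compared numerically, component by component, and that later BIOS revisions retain the PPR feature. The function and variable names are hypothetical.

```python
# BIOS versions that introduce the initial PPR feature, taken from the list above.
MIN_PPR_BIOS = {
    "AFF A700": "10.9",
    "FAS9000": "10.9",
    "AFF A700s": "12.8",
}


def parse_version(text):
    """Turn a dotted version such as '10.9' into a tuple such as (10, 9)."""
    return tuple(int(part) for part in text.split("."))


def has_initial_ppr(platform, bios_version):
    """Return True if the BIOS version is at or above the listed PPR version."""
    if platform not in MIN_PPR_BIOS:
        raise ValueError(f"no PPR data for platform: {platform}")
    return parse_version(bios_version) >= parse_version(MIN_PPR_BIOS[platform])


a700_current = has_initial_ppr("AFF A700", "10.9")
a700_older = has_initial_ppr("AFF A700", "10.8")
```

Comparing parsed tuples rather than raw strings avoids ordering mistakes such as treating "12.10" as lower than "12.8".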
