NetApp Knowledge Base

AFF A800 experiences panic reboot due to Uncorrectable ECC error

Applies to

AFF A800

Issue

  • From SP-LATEST-CONSOLE-LOGS

PANIC: watchdog nmi on cpu 28, hang cpu is 0 in process idle: cpu28 on release 9.10.1P4 (C) on Fri Jul 15 07:34:06 UTC 2022
version: 9.10.1P4: Mon May  9 18:11:44 EDT 2022

  • From SP-LATEST-SYSTEM-EVENT-LOG

Record 1262: Fri Jul 15 07:34:05.800000 2022 [IPMI.notice]: 0439 | 02 | EVT: 6fa10003 | PVDDQ_KLM | Assertion Event, "Uncorrectable ECC"
Record 1263: Fri Jul 15 07:34:05.810000 2022 [IPMI.notice]: 043a | 02 | EVT: 6fa10003 | PVDDQ_KLM | Assertion Event, "Uncorrectable ECC"
Record 1264: Fri Jul 15 07:34:07.770000 2022 [IPMI.notice]: 043b | 02 | EVT: 6fc824ff | System_Watchdog | Assertion Event, "Timer interrupt"
Record 1265: Fri Jul 15 07:34:08.210000 2022 [IPMI Event.critical]: NMI
Record 1266: Fri Jul 15 07:34:08.210000 2022 [IPMI.notice]: 043c | 02 | EVT: 6f00ffff | CriticalInt | Assertion Event, "NMI/Diag Interrupt"
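
As a rough illustration of how the event-log excerpt above could be scanned for the memory fault, here is a minimal Python sketch. It assumes the line layout shown in the excerpt (record number, timestamp, bracketed facility, then either free text or five pipe-separated fields). The names `SEL_PATTERN`, `parse_sel_line`, and `find_uncorrectable_ecc` are hypothetical helpers written for this example and are not part of any NetApp tool.

```python
import re

# Layout assumed from the excerpt above:
#   Record <n>: <timestamp> [<facility>]: <body>
SEL_PATTERN = re.compile(
    r'^Record (?P<record>\d+): (?P<timestamp>.+?) \[(?P<facility>[^\]]+)\]: (?P<body>.*)$'
)

def parse_sel_line(line):
    """Return a dict for one event-log line, or None if the line does not match."""
    match = SEL_PATTERN.match(line.strip())
    if match is None:
        return None
    entry = {
        "record": int(match.group("record")),
        "timestamp": match.group("timestamp"),
        "facility": match.group("facility"),
        "sensor": None,
        "description": match.group("body"),
    }
    # Most records carry five pipe-separated fields; the fourth is the sensor
    # name and the fifth is the event text. Lines such as "NMI" have one field.
    fields = [field.strip() for field in match.group("body").split("|")]
    if len(fields) == 5:
        entry["sensor"] = fields[3]
        entry["description"] = fields[4]
    return entry

def find_uncorrectable_ecc(lines):
    """Collect parsed records whose event text mentions an uncorrectable ECC error."""
    hits = []
    for line in lines:
        entry = parse_sel_line(line)
        if entry is not None and "Uncorrectable ECC" in entry["description"]:
            hits.append(entry)
    return hits

# Sample lines copied from the excerpt above.
SAMPLE_SEL = [
    'Record 1262: Fri Jul 15 07:34:05.800000 2022 [IPMI.notice]: 0439 | 02 | EVT: 6fa10003 | PVDDQ_KLM | Assertion Event, "Uncorrectable ECC"',
    'Record 1264: Fri Jul 15 07:34:07.770000 2022 [IPMI.notice]: 043b | 02 | EVT: 6fc824ff | System_Watchdog | Assertion Event, "Timer interrupt"',
    'Record 1265: Fri Jul 15 07:34:08.210000 2022 [IPMI Event.critical]: NMI',
]

for hit in find_uncorrectable_ecc(SAMPLE_SEL):
    print(hit["record"], hit["timestamp"], hit["sensor"])
```

Comparing the timestamps of the matching records with the watchdog and NMI records that follow shows the ordering seen in this case: the ECC assertions come first, and the watchdog timer interrupt and NMI follow about two seconds later.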

  • From SP-LATEST-RUNTIME

======================
FRU LEDs status
======================
FRU LED ID 1  = BMC Controller Active LED
FRU LED ID 2  = BMC Controller Attention LED
FRU LED ID 3  = BMC System LED
(...)
FRU LED ID 37 = BMC DIMM 16 Attention LED

FRU LED ID 1 is on
FRU LED ID 2 is on
FRU LED ID 3 is on. Set by BMC
(...)
FRU LED ID 37 is on
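
The runtime excerpt above lists LED names and LED states in two separate blocks, so the lit LEDs have to be matched back to their names by ID. The following Python sketch shows one way to do that. It assumes the two line shapes seen in the excerpt (`FRU LED ID <n> = <name>` and `FRU LED ID <n> is <state>`), and it assumes that states other than `on` would be reported in the same form, which the excerpt does not confirm. `lit_leds` is a hypothetical helper name.

```python
import re

NAME_RE = re.compile(r'^FRU LED ID (\d+)\s*=\s*(.+)$')   # "FRU LED ID 37 = BMC DIMM 16 Attention LED"
STATE_RE = re.compile(r'^FRU LED ID (\d+) is (\w+)')     # "FRU LED ID 3 is on. Set by BMC"

def lit_leds(runtime_lines):
    """Map the ID of every LED reported as 'on' to its name."""
    names, states = {}, {}
    for raw in runtime_lines:
        line = raw.strip()
        match = NAME_RE.match(line)
        if match:
            names[int(match.group(1))] = match.group(2).strip()
            continue
        match = STATE_RE.match(line)
        if match:
            states[int(match.group(1))] = match.group(2).lower()
    return {
        led_id: names.get(led_id, "unknown")
        for led_id, state in sorted(states.items())
        if state == "on"
    }

# Sample lines copied from the excerpt above (elided lines left out).
SAMPLE_RUNTIME = [
    "FRU LED ID 1  = BMC Controller Active LED",
    "FRU LED ID 2  = BMC Controller Attention LED",
    "FRU LED ID 3  = BMC System LED",
    "FRU LED ID 37 = BMC DIMM 16 Attention LED",
    "",
    "FRU LED ID 1 is on",
    "FRU LED ID 2 is on",
    "FRU LED ID 3 is on. Set by BMC",
    "FRU LED ID 37 is on",
]

lit = lit_leds(SAMPLE_RUNTIME)
dimm_attention = {led_id: name for led_id, name in lit.items() if "DIMM" in name}
print(dimm_attention)
```

Filtering the lit LEDs for names containing `DIMM` narrows the result to the DIMM attention LED, which in this excerpt is the one for DIMM 16.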

  • From SP-LATEST-CONSOLE-LOGS, showing that the DIMM has been tested and requalified

Running full memory initialization.
PPR:Hard
PPR:Processing 0x0/0x0/0x1/0x0/0x0/0x0/0x0/0xD/0x0
PPR:Pre-PPR Row test PASS. dramMask = 0x0
PPR:Post-PPR Row test PASS. dramMask = 0x0
PPR:Sequence PASS
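
To check the console output above programmatically, a small Python sketch can pull the result of each `PPR:` step and confirm that all of them passed. It assumes the three result lines appear exactly as in the excerpt (`Pre-PPR Row test`, `Post-PPR Row test`, `Sequence`, each followed by a result word). `ppr_summary` and `ppr_passed` are hypothetical helper names.

```python
PPR_STEPS = ("Pre-PPR Row test", "Post-PPR Row test", "Sequence")

def ppr_summary(console_lines):
    """Return {step: result} for each PPR result line found in the console output."""
    results = {}
    for raw in console_lines:
        line = raw.strip()
        if not line.startswith("PPR:"):
            continue
        text = line[len("PPR:"):]
        for step in PPR_STEPS:
            if text.startswith(step):
                rest = text[len(step):].strip()
                # First word after the step name is the result, e.g. "PASS." or "PASS".
                results[step] = rest.split()[0].rstrip(".") if rest else None
    return results

def ppr_passed(console_lines):
    """True only if every expected step is present and reported PASS."""
    summary = ppr_summary(console_lines)
    return all(summary.get(step) == "PASS" for step in PPR_STEPS)

# Sample lines copied from the excerpt above.
SAMPLE_CONSOLE = [
    "Running full memory initialization.",
    "PPR:Hard",
    "PPR:Processing 0x0/0x0/0x1/0x0/0x0/0x0/0x0/0xD/0x0",
    "PPR:Pre-PPR Row test PASS. dramMask = 0x0",
    "PPR:Post-PPR Row test PASS. dramMask = 0x0",
    "PPR:Sequence PASS",
]

print(ppr_summary(SAMPLE_CONSOLE))
print(ppr_passed(SAMPLE_CONSOLE))
```

Requiring all three steps to be present, rather than only checking that no step failed, avoids treating a truncated console log as a successful requalification.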

 

