NetApp Knowledge Base

Even after replacing the DIMM, the event log still displays the DIMM error

Category:
fas-systems
Specialty:
HW
Applies to

  • FAS 2750
  • ONTAP 9.5

Issue

  • DIMM-1 was replaced due to the following panic error.

PANIC: ECC error at DIMM-1: CE-03-1843-24156E02,ADDR 0x79e200080,(Node(0), Memory controller(0), CH(0), DIMM(0), Rank(0), Bank Group(0), Bank(0x2), Row(0xe780), Col(0x0),  Uncorrectable Machine Check Error at CPU11. BDWL_HA0 Error: STATUS<0xbe00000000010090>(Val,UnCor,Enable,MiscV,AddrV,PCC,CorrSts(0),CorrCnt(0),ExtErr(0x1),ErrCode(Channel 0, Read)ErrCode(0x90))MISC<0x0000000040169686>(HaDbBank(0),PE(0),ReqOpcode(0x2),RNID(0),RTID(0xb),HTID(0x4b))ADDR<0x000000079e200080>((0x79e200080)).  in process ECC scrubber on release 9.5 (C) on Sun May  5 20:09:02 JST 2024
version: 9.5:
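
The fields in the panic string above can be pulled apart programmatically. The following is a minimal sketch, not a NetApp tool: the function names are hypothetical, the panic text is truncated after the STATUS field, and the bit positions assume that STATUS follows the Intel machine-check IA32_MCi_STATUS layout (an assumption, although the flag names printed in the log agree with it).

```python
import re

# Panic text copied from this article, truncated after the STATUS field.
PANIC = (
    "PANIC: ECC error at DIMM-1: CE-03-1843-24156E02,ADDR 0x79e200080,"
    "(Node(0), Memory controller(0), CH(0), DIMM(0), Rank(0), Bank Group(0), "
    "Bank(0x2), Row(0xe780), Col(0x0),  Uncorrectable Machine Check Error at CPU11. "
    "BDWL_HA0 Error: STATUS<0xbe00000000010090>"
)

# Assumed bit positions (Intel IA32_MCi_STATUS layout); names follow the log.
STATUS_FLAGS = [
    (63, "Val"), (62, "Over"), (61, "UnCor"), (60, "Enable"),
    (59, "MiscV"), (58, "AddrV"), (57, "PCC"),
]


def decode_status(status):
    """Split a 64-bit status value into flag names and error codes."""
    return {
        "flags": [name for bit, name in STATUS_FLAGS if (status >> bit) & 1],
        "model_specific_code": (status >> 16) & 0xFFFF,  # bits 31:16
        "mca_error_code": status & 0xFFFF,               # bits 15:0
    }


def parse_panic(text):
    """Extract the DIMM label, address, DRAM location, and decoded status."""
    dimm = re.search(r"ECC error at (DIMM-\d+)", text).group(1)
    addr = int(re.search(r"ADDR (0x[0-9a-fA-F]+)", text).group(1), 16)
    status = int(re.search(r"STATUS<(0x[0-9a-fA-F]+)>", text).group(1), 16)
    # "Bank Group" must be tried before "Bank" so the longer label wins.
    pairs = re.findall(
        r"(Bank Group|Bank|Rank|Row|Col|CH)\((0x[0-9a-fA-F]+|\d+)\)", text
    )
    location = {name: int(value, 0) for name, value in pairs}
    return {
        "dimm": dimm,
        "addr": addr,
        "location": location,
        "status": decode_status(status),
    }


result = parse_panic(PANIC)
print(result["dimm"], hex(result["addr"]), result["status"]["flags"])
```

Under the assumed layout, the decoded flags match the list printed in the log (Val, UnCor, Enable, MiscV, AddrV, PCC), and the low 16 bits give the error code 0x90 that the log reports as a read on channel 0.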

  • Even after DIMM-1 is replaced, the system's Fault LED remains lit.
  • The "system controller service-event show" command indicates that DIMM-1 is still in an error state.

Cluster::*> system controller service-event show
                            
Node             ID  Event Location                      Event Description
---------------- --- ----------------------------------  ----------------------
****-01      1   DIMM in slot 1 in Controller A      Uncorrectable DRAM ECC
****-01      2   DIMM in slot 1 in Controller A      DIMM error recorded in SRAM
2 entries were displayed.
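
When this output is captured as text (for example, from a saved console session), the events can be grouped by location to confirm that both entries point at the same DIMM slot. This is a rough sketch with a hypothetical function name. It assumes that columns are separated by two or more spaces, which holds for the sample above but would break if a field itself contained a double space.

```python
import re
from collections import defaultdict

# Output copied from this article.
OUTPUT = """\
Node             ID  Event Location                      Event Description
---------------- --- ----------------------------------  ----------------------
****-01      1   DIMM in slot 1 in Controller A      Uncorrectable DRAM ECC
****-01      2   DIMM in slot 1 in Controller A      DIMM error recorded in SRAM
2 entries were displayed.
"""


def parse_service_events(text):
    """Return one dict per event row; header, rule, and footer lines are skipped."""
    events = []
    for line in text.splitlines():
        fields = re.split(r"\s{2,}", line.strip())
        # A data row has exactly four fields and a numeric event ID.
        if len(fields) == 4 and fields[1].isdigit():
            events.append({
                "node": fields[0],
                "id": int(fields[1]),
                "location": fields[2],
                "description": fields[3],
            })
    return events


events = parse_service_events(OUTPUT)

# Group descriptions by location to see which component the events refer to.
by_location = defaultdict(list)
for event in events:
    by_location[event["location"]].append(event["description"])

for location, descriptions in by_location.items():
    print(location, "->", descriptions)
```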

  • Even after the "system controller service-event delete" command is used to remove these events, the same errors reappear immediately.

::*> system controller service-event delete -event-id *
2 entries were deleted.

  • The BMC command "system fru led show all" shows that the Attention LED for DIMM-1 is on.

BMC ****-01*> system fru led show all
FRU LED ID 1 is off
FRU LED ID 2 is on. Set by BMC
FRU LED ID 3 is on
FRU LED ID 4 is on
FRU LED ID 5 is off
FRU LED ID 6 is off
FRU LED ID 7 is off
FRU LED ID 8 is off
FRU LED ID 9 is off
FRU LED ID 10 is off
FRU LED ID 11 is off
FRU LED ID 12 is off
FRU LED ID 13 is on  
FRU LED ID 14 is off
FRU LED ID 15 is off
FRU LED ID 16 is off
FRU LED ID 17 is off
FRU LED ID 18 is off

BMC ****-01*> system fru led show
<FRU-LED-ID>:
     1  = BMC Locate LED
     2  = BMC System LED
     3  = BMC Controller Attention LED
     4  = BMC Controller Active LED
     5  = BMC SAS Port A Attention LED
     6  = BMC SAS Port B Attention LED
     7  = BMC CNA Port 1 Attention LED
     8  = BMC CNA Port 2 Attention LED
     9  = BMC CNA Port 3 Attention LED
     10 = BMC CNA Port 4 Attention LED
     11 = BMC 10G Port 1 Attention LED
     12 = BMC 10G Port 2 Attention LED
     13 = BMC DIMM Slot 1 Attention LED
     14 = BMC DIMM Slot 2 Attention LED
     15 = BMC NVMEM 1 Attention LED
     16 = BMC BOOT DISK Attention LED
     17 = BMC NV BATTERY Attention LED
     18 = BMC Coin Cell Attention LED
BMC ****-01*>
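
The two BMC outputs above can be cross-referenced to list the lit LEDs by name instead of by numeric ID. The sketch below is illustrative only: the function names are hypothetical, and the parsing assumes the line formats shown in this article ("FRU LED ID <n> is on|off" and "<n> = <name>").

```python
import re

# Output of "system fru led show all", copied from this article.
LED_STATE = """\
FRU LED ID 1 is off
FRU LED ID 2 is on. Set by BMC
FRU LED ID 3 is on
FRU LED ID 4 is on
FRU LED ID 5 is off
FRU LED ID 6 is off
FRU LED ID 7 is off
FRU LED ID 8 is off
FRU LED ID 9 is off
FRU LED ID 10 is off
FRU LED ID 11 is off
FRU LED ID 12 is off
FRU LED ID 13 is on
FRU LED ID 14 is off
FRU LED ID 15 is off
FRU LED ID 16 is off
FRU LED ID 17 is off
FRU LED ID 18 is off
"""

# Legend from "system fru led show", copied from this article.
LED_NAMES = """\
     1  = BMC Locate LED
     2  = BMC System LED
     3  = BMC Controller Attention LED
     4  = BMC Controller Active LED
     5  = BMC SAS Port A Attention LED
     6  = BMC SAS Port B Attention LED
     7  = BMC CNA Port 1 Attention LED
     8  = BMC CNA Port 2 Attention LED
     9  = BMC CNA Port 3 Attention LED
     10 = BMC CNA Port 4 Attention LED
     11 = BMC 10G Port 1 Attention LED
     12 = BMC 10G Port 2 Attention LED
     13 = BMC DIMM Slot 1 Attention LED
     14 = BMC DIMM Slot 2 Attention LED
     15 = BMC NVMEM 1 Attention LED
     16 = BMC BOOT DISK Attention LED
     17 = BMC NV BATTERY Attention LED
     18 = BMC Coin Cell Attention LED
"""


def parse_led_states(text):
    """Map LED ID -> True if the LED is reported as on."""
    return {
        int(m.group(1)): m.group(2) == "on"
        for m in re.finditer(r"FRU LED ID (\d+) is (on|off)", text)
    }


def parse_led_names(text):
    """Map LED ID -> LED name from the legend."""
    pattern = r"^[ \t]*(\d+)[ \t]*=[ \t]*(.+)$"
    return {
        int(m.group(1)): m.group(2).strip()
        for m in re.finditer(pattern, text, re.MULTILINE)
    }


states = parse_led_states(LED_STATE)
names = parse_led_names(LED_NAMES)

# Names of all lit LEDs, then only the Attention LEDs among them.
lit = [names[led_id] for led_id in sorted(states) if states[led_id]]
lit_attention = [name for name in lit if "Attention" in name]
print(lit_attention)
```

For the sample output, the lit Attention LEDs are the controller LED and the DIMM Slot 1 LED, which matches the symptom described in this article.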

  • Reseating or replacing the DIMM again does not resolve the issue.
  • The issue persists after a takeover/giveback and a BMC reboot.
