NetApp Knowledge Base

AFF A700s CECC: Correctable Machine Check Errors being reported against wrong DIMM

Applies to

  • AFF A700s
  • ONTAP 9
    • ONTAP 9.1P17 and earlier
    • ONTAP 9.3P11 and earlier
    • ONTAP 9.4P6 and earlier


The CECC error continues to be reported against the same DIMM even after that DIMM is replaced:

The system health alert show command reports errors similar to the following on the cluster:

Node                  xxxxxx
Monitor               controller
Alert ID              CriticalCECCCountMemErrAlert
Alerting Resource     DIMM-x
Subsystem             Memory
Indication Time       Tue Oct 09 12:24:36 2018
Perceived Severity    Critical
Probable Cause        DIMM_Degraded
Description           The DIMM has degraded, leading to memory errors.

The following are corrective actions:

1. Contact technical support to obtain a new DIMM of the same specification
2. If possible, perform a takeover of this node and bring the node down for maintenance
3. Refer to the DIMM replacement guide for your given hardware platform to replace the DIMM
4. Bring the storage system online

Possible Effect:
Memory issues can lead to a catastrophic system panic, which can lead to data downtime on the node.
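
For readers who collect this output in scripts, the alert fields shown above can be turned into a dictionary. The following is only a minimal sketch: it assumes the output has already been captured as text and that each field name is separated from its value by two or more spaces, as in the sample. The function name `parse_alert_fields` is hypothetical and is not part of ONTAP.

```python
import re

# Sample text copied from the alert output above (placeholders kept as-is).
SAMPLE = """\
Node                  xxxxxx
Monitor               controller
Alert ID              CriticalCECCCountMemErrAlert
Alerting Resource     DIMM-x
Subsystem             Memory
Perceived Severity    Critical
Probable Cause        DIMM_Degraded
"""

def parse_alert_fields(text):
    """Split each 'Field   value' line on the first run of 2+ spaces."""
    fields = {}
    for line in text.splitlines():
        parts = re.split(r"\s{2,}", line.strip(), maxsplit=1)
        if len(parts) == 2:
            fields[parts[0]] = parts[1]
    return fields

alert = parse_alert_fields(SAMPLE)
# True when the alert is the CECC alert described in this article.
is_cecc_alert = alert.get("Alert ID") == "CriticalCECCCountMemErrAlert"
print(alert["Alerting Resource"], is_cecc_alert)
```

Comparing the `Alerting Resource` value before and after a DIMM replacement is one way to confirm that the alert is still raised against the same slot.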

The EMS log displays a message similar to the following, reporting CECC error on the specific DIMM:

[?] Tue Oct 09 12:24:36 IST [xxxx: mgwd:]: Call home for Health Monitor process nphm: CriticalCECCCountMemErrAlert[DIMM-x].
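
The alert name and the affected DIMM can also be pulled out of an EMS line like the one above. This is a rough sketch that assumes the message always ends in the form `AlertName[Resource]`, as in the sample; the function name `extract_alert` is hypothetical.

```python
import re

# EMS line copied from the sample above (placeholders kept as-is).
EMS_LINE = (
    "[?] Tue Oct 09 12:24:36 IST [xxxx: mgwd:]: Call home for Health Monitor "
    "process nphm: CriticalCECCCountMemErrAlert[DIMM-x]."
)

# Assumed pattern: an identifier ending in "Alert", then the resource in brackets.
ALERT_PATTERN = re.compile(r"(\w+Alert)\[([^\]]+)\]")

def extract_alert(line):
    """Return (alert_name, resource), or None if the line has no alert."""
    match = ALERT_PATTERN.search(line)
    if match is None:
        return None
    return match.group(1), match.group(2)

result = extract_alert(EMS_LINE)
print(result)
```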

Normally, the recommended action is to replace this DIMM.
However, the cluster might continue to report errors against the same DIMM even after the replacement.



