CONTAP-65772: Health Monitor issues false positive alerts for CriticalFruMultiFaultAlert process intermittently
Issue
- On some storage systems, sensor readings take a while to update due to technical issues, which results in false positive alerts for CriticalFruMultiFaultAlert, CriticalPSUFruFaultAlert, and other such alerts.
- The following models are potentially affected:
- FAS2520, FAS2552, FAS2554
- When this issue occurs, you might see the following error messages:
[?] Mon Jun 16 13:00:23 +0900 [Node-01: cphmd: hm.alert.raised:alert]: Alert Id = CriticalPSUFruFaultAlert , Alerting Resource = XXXXXXXXXXXXXXX raised by monitor chassis
[?] Mon Jun 16 13:03:29 +0900 [Node-01: mgwd: callhome.hm.alert.critical:alert]: Call home for Health Monitor process cphm: CriticalPSUFruFaultAlert[XXXXXXXXXXXXXXX].
[?] Mon Jun 16 13:10:23 +0900 [Node-01: cphmd: hm.alert.cleared:notice]: Alert Id = CriticalPSUFruFaultAlert , Alerting Resource = XXXXXXXXXXXXXXX cleared by monitor chassis
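The raise-then-clear pattern in the messages above can be spotted in a saved EMS log with a short script. The following is a minimal sketch, not a supported tool. It assumes the log lines look exactly like the samples above, that the year is supplied by the caller (the sample timestamps do not include one), and that an alert cleared within 15 minutes counts as "short-lived". The function name, the regular expression, and the 15-minute threshold are illustrative assumptions, not values defined by Health Monitor.

```python
import re
from datetime import datetime, timedelta

# Pattern derived only from the sample messages above (assumption);
# other EMS message layouts will simply be skipped.
LINE_RE = re.compile(
    r"^\[\?\]\s+\w{3}\s+"
    r"(?P<ts>\w{3}\s+\d{1,2}\s+\d{2}:\d{2}:\d{2}\s+[+-]\d{4})\s+"
    r"\[(?P<node>[^:\]]+): cphmd: hm\.alert\.(?P<event>raised|cleared):\w+\]: "
    r"Alert Id = (?P<alert>\S+) , Alerting Resource = (?P<resource>\S+)"
)


def find_short_lived_alerts(lines, year, max_duration=timedelta(minutes=15)):
    """Pair each 'raised' message with the next matching 'cleared' message.

    Returns a list of ((node, alert_id, resource), duration) tuples for
    alerts that cleared within max_duration. The threshold is a
    hypothetical value chosen for illustration.
    """
    open_alerts = {}
    short_lived = []
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m:
            continue  # e.g. the mgwd call home line is ignored
        # The log timestamp has no year, so the caller must provide one.
        when = datetime.strptime(f"{m['ts']} {year}", "%b %d %H:%M:%S %z %Y")
        key = (m["node"], m["alert"], m["resource"])
        if m["event"] == "raised":
            open_alerts[key] = when
        elif key in open_alerts:
            duration = when - open_alerts.pop(key)
            if duration <= max_duration:
                short_lived.append((key, duration))
    return short_lived
```

A short-lived alert found this way is only a hint that the alert may be a false positive; a genuine fault can also clear quickly, so confirm the hardware state before dismissing it.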