NetApp H610S node reboots unexpectedly with correctable machine check error
Applies to
- NetApp H610S
- NetApp Element software
- All currently supported versions of BIOS
Issue
- A node in an Element cluster logs a nodeOffline event and remains offline for approximately 7 to 15 minutes
- Logs indicate that the node has rebooted unexpectedly
- Entries for "Correctable machine check error" are found in the BMC system event log (SEL) around the time of the nodeOffline event
- Example of BMC SEL events:
SEL Record ID : 0076
Record Type : 02
Timestamp : 04/04/2021 11:21:35
Generator ID : 0001
EvM Revision : 04
Sensor Type : Processor
Sensor Number : a8
Event Type : Sensor-specific Discrete
Event Direction : Assertion Event
Event Data : ac032b
Description : Correctable machine check error
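
The check described above (looking for Correctable machine check error entries near the time of the nodeOffline event) could be sketched as follows for SEL text saved in the "Key : Value" layout shown in the example. This is a minimal illustration, not a NetApp tool: the function names `parse_sel_records` and `correctable_mce_near` are hypothetical, the timestamp is assumed to be MM/DD/YYYY HH:MM:SS, and the 30-minute window is an arbitrary default.

```python
from datetime import datetime, timedelta

# Assumption: SEL timestamps are MM/DD/YYYY HH:MM:SS, as the example suggests.
SEL_TIME_FORMAT = "%m/%d/%Y %H:%M:%S"
TARGET_DESCRIPTION = "correctable machine check error"


def parse_sel_records(text):
    """Split 'Key : Value' SEL text into one dict per record."""
    records, current = [], {}
    for line in text.splitlines():
        if " : " not in line:
            continue  # skip blank or unrelated lines
        key, value = (part.strip() for part in line.split(" : ", 1))
        # A new 'SEL Record ID' line starts the next record.
        if key == "SEL Record ID" and current:
            records.append(current)
            current = {}
        current[key] = value
    if current:
        records.append(current)
    return records


def correctable_mce_near(records, event_time, window_minutes=30):
    """Return correctable machine check records within the window of event_time."""
    window = timedelta(minutes=window_minutes)
    hits = []
    for rec in records:
        # Exact match, so 'Uncorrectable machine check error' is not picked up.
        if rec.get("Description", "").strip().lower() != TARGET_DESCRIPTION:
            continue
        try:
            stamp = datetime.strptime(rec.get("Timestamp", ""), SEL_TIME_FORMAT)
        except ValueError:
            continue  # missing or unparseable timestamp
        if abs(stamp - event_time) <= window:
            hits.append(rec)
    return hits
```

For example, passing the SEL text and the time of the nodeOffline event as a `datetime` returns the matching records, whose `SEL Record ID` and `Event Data` fields can then be quoted when reviewing the reboot.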