StorageGRID Appliance alert Unexpected node reboot due to hardware machine check exception
- Category:
- storagegrid-webscale
- Specialty:
- sgrid
- Last Updated:
- 6/4/2024, 5:13:16 AM
Applies to
NetApp StorageGRID Appliances
Issue
StorageGRID reports the alert:
Unexpected node reboot
After downloading a support bundle, checking the file
base-os-logs/run/mount-tmp/pge-actv-root/var/log/storagegrid_crash_dmesg.DATE.log.gz
shows a hardware machine check exception followed by a kernel panic:
[8526393.622416] Disabling lock debugging due to kernel taint
[8526393.627902] mce: [Hardware Error]: CPU 0: Machine Check Exception: 5 Bank 12: b200003f000100b3
[8526393.636639] mce: [Hardware Error]: RIP !INEXACT! 10:<ffffffff9e4cae04> {native_queued_spin_lock_slowpath+0x54/0x190}
[8526393.647283] mce: [Hardware Error]: TSC 43d23fe1832b9a2
[8526393.652650] mce: [Hardware Error]: PROCESSOR 0:306e4 TIME 1683024361 SOCKET 0 APIC 0 microcode 428
[8526393.661732] mce: [Hardware Error]: Run the above through 'mcelog --ascii'
[8526393.672067] mce: [Hardware Error]: Machine check: Processor context corrupt
[8526393.679162] Kernel panic - not syncing: Fatal machine check
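To make the check above repeatable across several crash logs, the following is a minimal Python sketch that scans log lines for the kernel's MCE summary line and decodes the top flag bits of the bank status value. The function name find_mce_events and the returned dictionary keys are hypothetical, chosen for illustration only. The flag positions assume the Intel IA32_MCi_STATUS register layout (bits 63 to 57), and the value printed after "Machine Check Exception:" is treated as the global machine check status. It is not a replacement for mcelog and does not identify the failing component.

```python
import re

# Matches the kernel's MCE summary line, for example:
# "mce: [Hardware Error]: CPU 0: Machine Check Exception: 5 Bank 12: b200003f000100b3"
MCE_LINE = re.compile(
    r"mce: \[Hardware Error\]: CPU (?P<cpu>\d+): Machine Check(?: Exception)?: "
    r"(?P<mcg>[0-9a-fA-F]+) Bank (?P<bank>\d+): (?P<status>[0-9a-fA-F]{16})"
)

# Assumption: Intel IA32_MCi_STATUS flag bits (bit number -> short name).
STATUS_FLAGS = {
    63: "VAL",    # status register contents are valid
    62: "OVER",   # an earlier error was overwritten
    61: "UC",     # uncorrected error
    60: "EN",     # error reporting was enabled
    59: "MISCV",  # MISC register holds valid data
    58: "ADDRV",  # ADDR register holds valid data
    57: "PCC",    # processor context corrupt
}


def find_mce_events(lines):
    """Return one dict per MCE summary line found in an iterable of log lines."""
    events = []
    for line in lines:
        match = MCE_LINE.search(line)
        if match is None:
            continue
        status = int(match.group("status"), 16)
        flags = [name for bit, name in STATUS_FLAGS.items() if (status >> bit) & 1]
        events.append({
            "cpu": int(match.group("cpu")),
            "bank": int(match.group("bank")),
            "mcg_status": int(match.group("mcg"), 16),
            "status": status,
            "flags": flags,
            "mca_code": status & 0xFFFF,  # low 16 bits: MCA error code
        })
    return events


if __name__ == "__main__":
    sample = [
        "[8526393.627902] mce: [Hardware Error]: CPU 0: Machine Check Exception: "
        "5 Bank 12: b200003f000100b3",
        "[8526393.679162] Kernel panic - not syncing: Fatal machine check",
    ]
    for event in find_mce_events(sample):
        print(event)
```

For the compressed log in the support bundle, the lines can be supplied with Python's standard gzip module, for example gzip.open(path, "rt", errors="replace"). For the status value shown above, the PCC flag is set, which is consistent with the "Processor context corrupt" message in the log.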