Multiple unexpected node reboots on a single node due to watchdog power cycle
- Category: storagegrid-webscale
- Specialty: sgrid
- Last Updated: 3/19/2025, 1:42:16 PM
Applies to
Issue
- Multiple unexpected reboots of a compute node
- From status.json in the /base-os-logs/var/local/gemini-system-status directory:
{
  "id": "91",
  "timestamp": "06-Feb-2025 23:59:36",
  "sensor": "Watchdog 2 Watchdog",
  "event": "Power Cycle",
  "details": "Timer use at expiration = SMS/OS ; Interrupt type = none"
},
- From BMC logs:
Event ID Time Stamp Severity Sensor Name Sensor Type Description
91 Feb/6/2025 23:59:36 [Warning] [Watchdog] [Watchdog 2] Power Cycle(Timer use at expiration: SMS/OS) - Asserted
87 Mar/22/2024 07:47:25 [Warning] [Watchdog] [Watchdog 2] Power Cycle(Timer use at expiration: SMS/OS) - Asserted
84 Oct/20/2023 06:04:11 [Warning] [Watchdog] [Watchdog 2] Power Cycle(Timer use at expiration: SMS/OS) - Asserted
81 Jul/7/2023 09:56:48 [Warning] [Watchdog] [Watchdog 2] Power Cycle(Timer use at expiration: SMS/OS) - Asserted
36 Oct/14/2022 18:31:34 [Warning] [Watchdog] [Watchdog 2] Power Cycle(Timer use at expiration: SMS/OS) - Asserted
12 Jun/22/2021 00:07:33 [Information] [Watchdog] [Watchdog 2] Timer Expired(Timer use at expiration: SMS/OS) - Asserted
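
The BMC log excerpt above can be summarized programmatically when checking how often the watchdog has power cycled a node. The following is a minimal Python sketch, not a NetApp-provided tool. It assumes the log lines have exactly the layout shown above (event ID, timestamp, then three bracketed fields and a description) and that month abbreviations are in English. The names `parse_bmc_events` and `watchdog_power_cycles` are hypothetical helpers introduced only for illustration.

```python
import re
from datetime import datetime

# Sample lines copied from the BMC log excerpt in this article.
BMC_LOG = """\
91 Feb/6/2025 23:59:36 [Warning] [Watchdog] [Watchdog 2] Power Cycle(Timer use at expiration: SMS/OS) - Asserted
87 Mar/22/2024 07:47:25 [Warning] [Watchdog] [Watchdog 2] Power Cycle(Timer use at expiration: SMS/OS) - Asserted
84 Oct/20/2023 06:04:11 [Warning] [Watchdog] [Watchdog 2] Power Cycle(Timer use at expiration: SMS/OS) - Asserted
81 Jul/7/2023 09:56:48 [Warning] [Watchdog] [Watchdog 2] Power Cycle(Timer use at expiration: SMS/OS) - Asserted
36 Oct/14/2022 18:31:34 [Warning] [Watchdog] [Watchdog 2] Power Cycle(Timer use at expiration: SMS/OS) - Asserted
12 Jun/22/2021 00:07:33 [Information] [Watchdog] [Watchdog 2] Timer Expired(Timer use at expiration: SMS/OS) - Asserted
"""

# Assumed line layout: ID, timestamp, [severity], [field], [field], description.
LINE_RE = re.compile(
    r"^(?P<event_id>\d+)\s+"
    r"(?P<timestamp>[A-Za-z]{3}/\d{1,2}/\d{4} \d{2}:\d{2}:\d{2})\s+"
    r"\[(?P<severity>[^\]]+)\]\s+\[[^\]]+\]\s+\[[^\]]+\]\s+"
    r"(?P<description>.+)$"
)


def parse_bmc_events(text):
    """Parse log lines into dicts; lines that do not match are skipped."""
    events = []
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match is None:
            continue
        events.append({
            "id": int(match["event_id"]),
            # %b relies on English month names (C or English locale assumed).
            "time": datetime.strptime(match["timestamp"], "%b/%d/%Y %H:%M:%S"),
            "severity": match["severity"],
            "description": match["description"],
        })
    return events


def watchdog_power_cycles(events):
    """Return only the 'Power Cycle' events, oldest first."""
    cycles = [e for e in events if e["description"].startswith("Power Cycle")]
    return sorted(cycles, key=lambda e: e["time"])


if __name__ == "__main__":
    cycles = watchdog_power_cycles(parse_bmc_events(BMC_LOG))
    print(f"Watchdog power cycles found: {len(cycles)}")
    for previous, current in zip(cycles, cycles[1:]):
        gap_days = (current["time"] - previous["time"]).days
        print(f"Event {current['id']} at {current['time']}: "
              f"{gap_days} days after event {previous['id']}")
```

Filtering on the description prefix keeps the "Timer Expired" informational entry out of the power cycle count, since only the "Power Cycle" entries correspond to the reboots described in this article.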