Chassis power supply shows critical status and SP reboots simultaneously
Applies to
FAS8200
Issue
EMS shows that the chassis internal PSU cannot be read and, at the same time, the SP reboots due to heartbeat loss.
[?] Wed Aug 02 06:47:16 +0900 [node-01: env_mgr: monitor.chassisPowerSupply.degraded:notice]: Chassis power supply 1 is degraded: PSU1 Temperature is Unreadable
[?] Wed Aug 02 06:47:25 +0900 [node-01: power_low_monitor: monitor.chassisPower.degraded:alert]: Chassis power is degraded: Power Supply Status Critical: PSU1.
[?] Wed Aug 02 06:48:35 +0900 [node-01: spsm_listener: sp.heartbeat.stopped:error]: Have not received a IPMI heartbeat from the Service Processor (SP) in last 20 seconds.
[?] Wed Aug 02 06:51:21 +0900 [node-01: spsm_listener: callhome.sp.hbt.missed:notice]: Call home for SP HBT MISSED
[?] Wed Aug 02 06:53:14 +0900 [node-01: spsm_listener: sp.update.status:debug]: params: {'reason': 'sp_startup_notify_servprocd: SP startup handler has been called. '}
[?] Wed Aug 02 06:53:14 +0900 [node-01: spsm_listener: sp.heartbeat.resumed:info]: Received IPMI heartbeat from the Service Processor (SP).
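To confirm that the PSU events and the SP heartbeat loss fall in the same time window, the EMS lines above can be parsed and their timestamps compared. The following is a minimal Python sketch, not a NetApp tool: the function name `parse_ems_line` and the `year` parameter are hypothetical, and the year must be supplied by the caller because the quoted log lines do not contain one.

```python
import re
from datetime import datetime

# Layout assumed from the EMS lines quoted above:
# <timestamp> [<node>: <source>: <event>:<severity>]: <message>
EMS_LINE = re.compile(
    r"(?P<ts>\w{3} \w{3} \d{2} \d{2}:\d{2}:\d{2} [+-]\d{4}) "
    r"\[(?P<node>[^:]+): (?P<source>[^:]+): "
    r"(?P<event>[\w.]+):(?P<severity>\w+)\]: "
    r"(?P<message>.*)"
)


def parse_ems_line(line, year):
    """Return a dict for one EMS line, or None if the line does not match.

    `year` is an assumption supplied by the caller; the log omits it.
    """
    match = EMS_LINE.search(line)
    if match is None:
        return None
    record = match.groupdict()
    # Prepend the assumed year so the timestamp parses to a full date.
    record["ts"] = datetime.strptime(
        f"{year} {record['ts']}", "%Y %a %b %d %H:%M:%S %z"
    )
    return record


def seconds_between(first, second):
    """Elapsed seconds from one parsed EMS record to a later one."""
    return (second["ts"] - first["ts"]).total_seconds()
```

Passing the first `monitor.chassisPowerSupply.degraded` line and the `sp.heartbeat.stopped` line to `parse_ems_line`, then to `seconds_between`, gives the gap between the first PSU warning and the heartbeat loss.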
- After the SP reboot, SP-LATEST-IPMI shows that the PSU status has recovered, but multiple sensors show abnormal status.
Sensor Name | Current | Unit | Status | LCR | LNC | UNC | UCR
-----------------+------------+------------+------------+-----------+-----------+-----------+-----------
CPU0_Temp_Margin | na | degrees C | na | na | na | -5.000 | 0.000
In_Flow_Temp | -55.000 | degrees C | cr | 0.000 | 10.000 | 75.000 | 80.000
CPU_VCC | 0.010 | Volts | cr | 0.708 | 0.747 | 1.348 | 1.426
CPU_1.05V | 0.010 | Volts | cr | 0.892 | 0.941 | 1.154 | 1.203
CPU_VTT | 0.010 | Volts | cr | 0.931 | 0.989 | 1.213 | 1.261
LM56_Temp | na | degrees C | na | 0.000 | 10.000 | 72.000 | 77.000
CPU_1.5V | 0.010 | Volts | cr | 1.271 | 1.348 | 1.649 | 1.727
Bat_1.5V | 1.756 | Volts | cr | 1.280 | 1.348 | 1.649 | 1.727
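The abnormal readings in the table above can also be picked out programmatically. The sketch below is a hypothetical Python helper, not part of ONTAP or the SP firmware. It assumes the pipe-separated layout shown above, that LCR and UCR are the lower and upper critical thresholds, and that a value of `na` means no reading is available.

```python
def parse_sensor_table(text):
    """Turn the pipe-separated sensor listing into a list of dicts."""
    header = None
    rows = []
    for line in text.splitlines():
        if "|" not in line:
            continue  # skips blank lines and the dashed separator row
        cells = [cell.strip() for cell in line.split("|")]
        if header is None:
            header = cells  # first pipe-separated line holds column names
            continue
        rows.append(dict(zip(header, cells)))
    return rows


def to_float(cell):
    """Convert a table cell to float; 'na' and other text become None."""
    try:
        return float(cell)
    except ValueError:
        return None


def describe(row):
    """Compare the current reading with the critical thresholds."""
    reading = to_float(row["Current"])
    lcr = to_float(row["LCR"])
    ucr = to_float(row["UCR"])
    if reading is None:
        return "no reading"
    if lcr is not None and reading < lcr:
        return "below LCR"
    if ucr is not None and reading > ucr:
        return "above UCR"
    return "within critical thresholds"


def abnormal_sensors(text):
    """Map sensor name to a description for every row whose Status is not 'ok'."""
    return {
        row["Sensor Name"]: describe(row)
        for row in parse_sensor_table(text)
        if row["Status"] != "ok"
    }
```

Applied to the table above, `abnormal_sensors` reports each sensor whose status is not `ok` together with which critical threshold, if any, the reading violates.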