AFF-A900 PSU1/PSU2 unstable causing takeovers
Applies to
- AFF-A900
- ONTAP 9
Issue
- PSU1 and/or PSU2 repeatedly trigger alerts that clear shortly afterward:
[Node1: env_mgr: monitor.chassisPowerSupply.degraded:notice]: Chassis power supply 2 is degraded: PSU2 12V Out Volt is Critical Low (2040 mV)
[Node1: power_low_monitor: monitor.chassisPower.degraded:alert]: Chassis power is degraded: Power Supply Status Critical: PSU2.
[Node1: power_low_monitor: callhome.chassis.power:error]: Call home for CHASSIS POWER DEGRADED: Power Supply Status Critical: PSU2.
[Node1: env_mgr: monitor.chassisPowerSupply.ok:info]: Chassis power supply 2 is OK.
- Takeover may occur:
[Node2: cf_main: cf.fsm.takeover.noHeartbeat:alert]: Failover monitor: Takeover initiated after no heartbeat was detected from the partner node.
- SP logs show power alerts triggering constantly:
Record 1230: [IPMI.notice]: 0a30 | 02 | EVT: 0301ffff | PSU1_Fault | Assertion Event, "State Asserted"
Record 1231: [IPMI.notice]: 0a31 | 02 | EVT: 6f02ffff | PSU1_Status | Assertion Event, "Fault"
Record 1232: [IPMI.notice]: 0a32 | 02 | EVT: ef05ffff | PSU1_Status | Deassertion Event, "AC OK"
Record 1233: [IPMI.notice]: 0a33 | 02 | EVT: 6f06ffff | PSU1_Status | Assertion Event, "Fault Signal"
Record 1234: [IPMI.notice]: 0a34 | 02 | EVT: 6f05ffff | PSU1_Status | Assertion Event, "AC OK"
Record 1235: [IPMI.notice]: 0a35 | 02 | EVT: ef06ffff | PSU1_Status | Deassertion Event, "Fault Signal"
Record 1236: [IPMI.notice]: 0a36 | 02 | EVT: 6f04ffff | PSU1_Status | Assertion Event, "DC OK"
Record 1237: [IPMI.notice]: 0a37 | 02 | EVT: ef04ffff | PSU1_Status | Deassertion Event, "DC OK"
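
To judge how often a power supply is flapping, the SP log records shown above can be tallied per sensor and state. The following is a minimal Python sketch, not a NetApp tool: the regular expression is inferred only from the record layout in this excerpt, and the helper names `parse_sp_records` and `count_transitions` are hypothetical. Other firmware versions may format records differently.

```python
import re
from collections import Counter

# Pattern inferred from the SP log excerpt in this article (an assumption);
# it is not an official description of the record format.
RECORD_RE = re.compile(
    r"Record\s+(?P<num>\d+):\s+\[IPMI\.\w+\]:\s+"
    r"(?P<id>[0-9a-f]+)\s+\|\s+(?P<type>[0-9a-f]+)\s+\|\s+"
    r"EVT:\s+(?P<evt>[0-9a-f]+)\s+\|\s+(?P<sensor>\S+)\s+\|\s+"
    r"(?P<kind>Assertion|Deassertion) Event,\s+\"(?P<state>[^\"]+)\""
)


def parse_sp_records(text):
    """Return one dict per record found in text, even when several
    records were pasted onto a single line."""
    return [m.groupdict() for m in RECORD_RE.finditer(text)]


def count_transitions(records):
    """Count occurrences of each (sensor, state, kind) combination."""
    return Counter((r["sensor"], r["state"], r["kind"]) for r in records)


if __name__ == "__main__":
    # Two records copied from the excerpt above, deliberately run together.
    sample = (
        'Record 1232: [IPMI.notice]: 0a32 | 02 | EVT: ef05ffff | '
        'PSU1_Status | Deassertion Event, "AC OK"'
        'Record 1234: [IPMI.notice]: 0a34 | 02 | EVT: 6f05ffff | '
        'PSU1_Status | Assertion Event, "AC OK"'
    )
    for key, n in sorted(count_transitions(parse_sp_records(sample)).items()):
        print(key, n)
```

A sensor whose "AC OK" or "DC OK" state shows many paired assertion and deassertion events over a short record range matches the unstable behavior described in this article.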