CHW-3320: AFF A20 node experiences an environmental shutdown due to unreadable battery sensors
Issue
- Unexpected takeover event
- Environmental shutdown caused by "unreadable" battery sensors
- ONTAP event messages similar to the following example:
... [NODE-01: env_mgr: callhome.battery.failure:EMERGENCY]: Call home for BATTERY ('Bat Temp' unreadable) CRITICAL.
... [NODE-01: env_mgr: callhome.battery.failure:EMERGENCY]: Call home for BATTERY ('Bat Volt' unreadable) CRITICAL.
... [NODE-01: env_mgr: callhome.battery.failure:EMERGENCY]: Call home for BATTERY ('Bat Curr' unreadable) CRITICAL.
... [NODE-01: env_mgr: callhome.battery.failure:EMERGENCY]: Call home for BATTERY ('Bat Full Cap' unreadable) CRITICAL.
... [NODE-01: env_mgr: callhome.battery.failure:EMERGENCY]: Call home for BATTERY ('Bat Charge Curr' unreadable) CRITICAL.
... [NODE-01: env_mgr: callhome.battery.failure:EMERGENCY]: Call home for BATTERY ('Bat Charge Volt' unreadable) CRITICAL.
... [NODE-01: env_mgr: callhome.battery.failure:EMERGENCY]: Call home for BATTERY ('Bat Design Cap' unreadable) CRITICAL.
... [NODE-01: env_mgr: callhome.battery.failure:EMERGENCY]: Call home for BATTERY ('Bat Dstg Cycles' unreadable) CRITICAL.
... [NODE-01: env_mgr: monitor.shutdown.emergency:EMERGENCY]: Emergency shutdown: Environmental Reason Shutdown (Battery voltage unreadable for 5 mins)
The BMC logs may also show entries similar to the following:
[IPMI.emergency]: env_mgr triggers OS halt:Battery voltage unreadable for 5 mins
[IPMI.emergency]: env_mgr triggers OS halt:Battery current unreadable for 5 mins
[Controller.notice]: Appliance user command halt.
[IPMI Event.critical]: System power down
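
When reviewing saved log text, the signature above can be checked with a short script. The following is a minimal sketch, not a NetApp tool: the function name `summarize_battery_events` and the regular expressions are assumptions derived only from the example messages in this article, so other ONTAP or BMC releases may format these lines differently.

```python
import re

# Assumption: patterns are modeled on the example lines in this article only.
SENSOR_RE = re.compile(
    r"callhome\.battery\.failure:EMERGENCY\]: "
    r"Call home for BATTERY \('([^']+)' unreadable\)"
)
SHUTDOWN_RE = re.compile(
    r"monitor\.shutdown\.emergency:EMERGENCY\]: Emergency shutdown: "
    r".*\((Battery \w+ unreadable for \d+ mins)\)"
)
BMC_HALT_RE = re.compile(
    r"env_mgr triggers OS halt:(Battery \w+ unreadable for \d+ mins)"
)


def summarize_battery_events(lines):
    """Collect unreadable-sensor names and shutdown reasons from log lines.

    Hypothetical helper: returns a dict with three lists, each preserving
    first-seen order and containing no duplicates.
    """
    result = {"sensors": [], "ontap_shutdowns": [], "bmc_halts": []}
    checks = (
        ("sensors", SENSOR_RE),
        ("ontap_shutdowns", SHUTDOWN_RE),
        ("bmc_halts", BMC_HALT_RE),
    )
    for line in lines:
        for key, pattern in checks:
            match = pattern.search(line)
            if match and match.group(1) not in result[key]:
                result[key].append(match.group(1))
    return result


def matches_issue(summary):
    """True when unreadable sensors coincide with a shutdown or BMC halt."""
    return bool(summary["sensors"]) and bool(
        summary["ontap_shutdowns"] or summary["bmc_halts"]
    )
```

To use it, pass the lines of a saved log file (for example, `summarize_battery_events(open(path))`, where `path` is whatever file holds the exported messages) and check `matches_issue` on the result. A match only indicates that the log resembles the examples here; it does not confirm the root cause.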