Multiple chassis fan FRUs reported as failed by one node
Applies to
Issue
- The EMS log shows the following messages from a single node:
[node2: env_mgr: monitor.temp.unreadable:error]: The controller temperature (Midplane 4 Temp) is not readable.
[node2: env_mgr: monitor.temp.unreadable:error]: The controller temperature (Midplane 3 Temp) is not readable.
[node2: env_mgr: monitor.temp.unreadable:error]: The controller temperature (Midplane 2 Temp) is not readable.
[node2: env_mgr: monitor.temp.unreadable:error]: The controller temperature (Midplane 1 Temp) is not readable.
[node2: rlm_hbtrcv_non_blocking: sp.update.status:debug]: params: {'reason': 'sp_bootup_notify_servprocd: SP online handler has been called '}
[node2: cf_worker: cf.hwassist.notifyCfgSuccess:debug]: params: {'hwtype': 'SP'}
[node2: monitor: monitor.globalStatus.critical:EMERGENCY]: Chassis temperature is too high..
[node2: env_mgr: monitor.fan.warning:notice]: multiple fans have failed. Replace it to avoid overheating
[node2: env_mgr: callhome.c.fan.fru.fault:error]: Call home for CHASSIS FAN FRU FAILED: Multiple fans have failed
- SP/BMC logs from both nodes show signs of I2C contention in the BMC syslog:
BMC node1 env_mgr[1639]: envd_ses_get_sensor_reading: 324: ENVD_SES: Failed to get SES PG!
BMC node1 last message buffered 25 times
BMC node1 env_mgr[1639]: Payload action: update_fru_state(0, FRU_PSU2)
BMC node1 env_mgr[1639]: Payload action: update_snmp(1, (null))
BMC node1 env_mgr[1639]: Payload action: update_fru_state(0, FRU_PSU2)
BMC node1 env_mgr[1639]: Payload action: update_snmp(1, (null))
BMC node1 env_mgr[1639]: envd_ses_get_sensor_reading: 389:ENVD_SES: Read invalid sensor value. sensor_name :Module B Expander Temp,reading:-20 sensor_status :5
BMC node1 env_mgr[1639]: isvc_send_request: 455: ISVCLIB command(4) rsp timed out
BMC node1 env_mgr[1639]: SES page request returned error (2)
BMC node1 env_mgr[1639]: envd_ses_get_sensor_reading: 324: ENVD_SES: Failed to get SES PG!
BMC node1 last message buffered 1 times
BMC node1 env_mgr[1639]: isvc_send_request: 455: ISVCLIB command(4) rsp timed out
BMC node1 env_mgr[1639]: SES page request returned error (2)
BMC node1 env_mgr[1639]: envd_ses_get_sensor_reading: 324: ENVD_SES: Failed to get SES PG!
- SP/BMC is on the latest version
- Management and data network/LIFs are isolated from each other
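
When reviewing a longer BMC syslog, it can help to tally how often the contention signatures shown above occur on each node. The following is a minimal sketch, not a vendor tool: the function name `count_signatures` and the labels in `SIGNATURES` are hypothetical, the matched substrings are copied from the excerpt above, and the `BMC <node>` line prefix is assumed to look like the excerpt (real syslog lines may carry timestamps or other fields in front).

```python
import re
from collections import Counter

# Substrings copied from the BMC syslog excerpt; the dictionary keys are
# illustrative labels chosen for this sketch, not official event names.
SIGNATURES = {
    "ses_page_failure": "ENVD_SES: Failed to get SES PG!",
    "isvc_timeout": "ISVCLIB command(4) rsp timed out",
    "ses_request_error": "SES page request returned error",
    "invalid_sensor_value": "ENVD_SES: Read invalid sensor value",
}

# Assumption: each line contains "BMC <node-name> " as in the excerpt.
NODE_PATTERN = re.compile(r"\bBMC\s+(\S+)\s")


def count_signatures(lines):
    """Return {node: Counter({label: occurrences})} for the given log lines."""
    counts = {}
    for line in lines:
        match = NODE_PATTERN.search(line)
        if match is None:
            continue  # skip lines that do not follow the assumed prefix
        node = match.group(1)
        for label, needle in SIGNATURES.items():
            if needle in line:
                counts.setdefault(node, Counter())[label] += 1
    return counts
```

For example, `count_signatures(open("bmc_syslog.txt"))` (the file name is a placeholder) returns one counter per node. Lines such as "last message buffered 25 times" are not expanded by this sketch, so the tallies are a lower bound on how often each message was actually logged.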
