"SHELF_FAULT" due to SP loss of ACP communication
Applies to
- ONTAP 9
- Service Processor (SP)
- Baseboard Management Controller (BMC)
Issue
- The system reports the following errors and then recovers automatically.
Fri Apr 24 13:40:15 JST [node: dsa_worker1: ses.status.ACPError:alert]: DS4246 (S/N xxxxxxxxxx) shelf 0 on channel 0a ACP Processor error for SAS shelf ACP processor 2: critical status ; Alternate Control Path hardware failed This module is on the rear of the shelf at the bottom center, on shelf module B.
Fri Apr 24 13:40:24 JST [node: statd: monitor.shelf.fault:alert]: Critical fault reported on disk storage shelf attached to channel 0a. Check fans, power supplies, disks, and temperature sensors.
Fri Apr 24 13:40:24 JST [node: statd: callhome.shlf.fault:error]: Call home for SHELF_FAULT
Fri Apr 24 14:35:24 JST [node: dsa_worker5: ses.status.ACPInfo:info]: DS4246 (S/N xxxxxxxxxx) shelf 0 on channel 0b ACP Processor information for SAS shelf ACP processor 2: normal status.
Fri Apr 24 14:35:33 JST [node: statd: monitor.shelf.fault.ok:notice]: Fault previously reported on disk storage shelf attached to channel 0a has been corrected.
Fri Apr 24 14:36:00 JST [node: monitor: monitor.globalStatus.ok:notice]: The system's global status is normal.
- The following record can be found in SP-LATEST-SYSTEM-EVENT-LOG.
Record 1771: Fri Apr 24 05:33:18 2020 [SP.critical]: Rebooting SP due to loss of ACP comms
- Debug messages about the IOM may also be logged.
Mon May 16 13:09:13 CST [node: acp_worker_thread_0: acp.common.message:debug]: params: {'debug_string': 'IOM 00:A0:98:C6:73:A6 returned an error to SCSI write with opcode 12.Write Error:2, Sense Values: (0 0 0 0).
- Depending on the situation, a (CLUSTER NETWORK DEGRADED) ALERT may be triggered and result in a shutdown.
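
To check whether a set of collected logs shows the same signature, the entries above can be searched for mechanically. The following is a minimal Python sketch, not a NetApp tool: the function name `find_signature`, the regular expression, and the assumption that the EMS log and the SP event log are already available as lists of text lines are all illustrative. The line layout is inferred only from the excerpts in this article.

```python
import re

# EMS line layout as it appears in the excerpts above (an assumption, inferred
# from those samples only):
#   "<timestamp> [<node>: <process>: <event>:<severity>]: <message>"
EMS_LINE = re.compile(
    r"^(?P<timestamp>.+?) \[(?P<node>[^:\]]+): (?P<process>[^:\]]+): "
    r"(?P<event>[\w.]+):(?P<severity>\w+)\]: (?P<message>.*)$"
)

# EMS events that make up the fault part of the signature in this article.
SIGNATURE_EVENTS = {
    "ses.status.ACPError",
    "monitor.shelf.fault",
    "callhome.shlf.fault",
}

# Text of the SP record quoted from SP-LATEST-SYSTEM-EVENT-LOG.
SP_REBOOT_TEXT = "Rebooting SP due to loss of ACP comms"


def find_signature(ems_lines, sp_lines):
    """Report which signature EMS events and SP reboot record are present.

    Hypothetical helper: both arguments are iterables of log lines (strings).
    """
    seen = set()
    for line in ems_lines:
        match = EMS_LINE.match(line.strip())
        if match and match.group("event") in SIGNATURE_EVENTS:
            seen.add(match.group("event"))
    sp_reboot_logged = any(SP_REBOOT_TEXT in line for line in sp_lines)
    return {"ems_events": seen, "sp_reboot_logged": sp_reboot_logged}
```

If all three EMS events are found and `sp_reboot_logged` is true, the logs are consistent with the issue described here. The sketch does not compare timestamps, because the EMS and SP logs may record time in different time zones.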