"SHELF_FAULT" error is reported on systems with internal storage
Applies to
- ONTAP 9
- FAS255x
Issue
- The system reports the following errors and then recovers automatically.
Example:
Fri Apr 24 13:40:15 JST [node: dsa_worker1: ses.status.ACPError:alert]: DS4246 (S/N xxxxxxxxxx) shelf 0 on channel 0a ACP Processor error for SAS shelf ACP processor 2: critical status ; Alternate Control Path hardware failed This module is on the rear of the shelf at the bottom center, on shelf module B.
Fri Apr 24 13:40:24 JST [node: statd: monitor.shelf.fault:alert]: Critical fault reported on disk storage shelf attached to channel 0a. Check fans, power supplies, disks, and temperature sensors.
Fri Apr 24 13:40:24 JST [node: statd: callhome.shlf.fault:error]: Call home for SHELF_FAULT
Fri Apr 24 14:35:24 JST [node: dsa_worker5: ses.status.ACPInfo:info]: DS4246 (S/N xxxxxxxxxx) shelf 0 on channel 0b ACP Processor information for SAS shelf ACP processor 2: normal status.
Fri Apr 24 14:35:33 JST [node: statd: monitor.shelf.fault.ok:notice]: Fault previously reported on disk storage shelf attached to channel 0a has been corrected.
Fri Apr 24 14:36:00 JST [node: monitor: monitor.globalStatus.ok:notice]: The system's global status is normal.
- Alternate Control Path (ACP) is in FAIL status in the STORAGE-FAULT output when the error occurs.
Example:
Enclosure Status: critical
Channel: 0a
Shelf: 0
Shelf Type: DS4246
Product Serial Number: xxxxxxxxxx
Module Type: IOM6E
Vendor Unique Element 85-IOM6E: (ACP)
Element Status Status Bytes Status Descriptions
1 [IOM6E A] : OK 01,10,00,00 INBACP
2 [IOM6E B] : CRITICAL 02,10,00,40 INBACP, FAIL
- The following records can be found in SP-LATEST-SYSTEM-EVENT-LOG.
Example:
Record 1771: Fri Apr 24 05:33:18 2020 [SP.critical]: Rebooting SP due to loss of ACP comms
Record 1772: Thu Jan 1 00:00:37 1970 [IPMI.notice]: 6104 | c0 | OEM: ffff70005100 | ManufId: 150300 | SP Reset Internally
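
To confirm that a saved EMS log shows the same fault-then-recover pattern, the events can be paired programmatically. The following is a minimal sketch, not a NetApp tool: the function names `parse_ems` and `pair_shelf_faults` are hypothetical, the line layout is assumed from the EMS examples above, and the year must be supplied by the caller because the log lines do not include one.

```python
import re
from datetime import datetime

# Assumed line layout, taken from the EMS examples above:
#   Fri Apr 24 13:40:24 JST [node: statd: monitor.shelf.fault:alert]: message
EMS_LINE = re.compile(
    r"^\w{3} (?P<ts>\w{3} +\d{1,2} \d{2}:\d{2}:\d{2}) \w+ "
    r"\[(?P<node>[^:\]]+): (?P<proc>[^:\]]+): "
    r"(?P<event>[\w.]+):(?P<sev>\w+)\]: (?P<msg>.*)$"
)
CHANNEL = re.compile(r"channel (\w+)")


def parse_ems(line, year):
    """Return (timestamp, event, message) for an EMS line, or None."""
    m = EMS_LINE.match(line.strip())
    if m is None:
        return None
    # The log lines carry no year, so the caller has to supply one.
    ts = datetime.strptime(f"{year} {m['ts']}", "%Y %b %d %H:%M:%S")
    return ts, m["event"], m["msg"]


def pair_shelf_faults(lines, year):
    """Pair each monitor.shelf.fault with the next monitor.shelf.fault.ok
    on the same channel; return (channel, raised, cleared, seconds) tuples."""
    open_faults = {}
    pairs = []
    for line in lines:
        parsed = parse_ems(line, year)
        if parsed is None:
            continue
        ts, event, msg = parsed
        ch = CHANNEL.search(msg)
        if ch is None:
            continue
        channel = ch.group(1)
        if event == "monitor.shelf.fault":
            # Keep the earliest unresolved fault per channel.
            open_faults.setdefault(channel, ts)
        elif event == "monitor.shelf.fault.ok" and channel in open_faults:
            raised = open_faults.pop(channel)
            seconds = int((ts - raised).total_seconds())
            pairs.append((channel, raised, ts, seconds))
    return pairs
```

Applied to the EMS example above with `year=2020`, this yields a single pair for channel 0a, with the fault raised at 13:40:24 and cleared at 14:35:33. A short gap that closes without manual intervention is consistent with the automatic recovery described in this article.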