System fails to boot with "Failed to recover SP"
Applies to
- AFF A220 / FAS27x0 / AFF C190
- AFF A200 / FAS26x0
- FAS80x0
- FAS9500
- AFF A300 / FAS8200
- AFF A700 / FAS9000
- AFF A900
- AFF A50 / AFF A30 / AFF A20
- AFF C60 / AFF C30
Issue
- The storage system is rebooted (for example, during an ONTAP upgrade) but fails to boot and halts at the LOADER prompt.
Example:
...
Waiting for SP ...
SP failure. Resetting SP from primary FW. This can take a few minutes
Waiting for SP ...
SP failure. Resetting SP from backup FW. This can take a few minutes
Waiting for SP ...
Failed to recover SP
IPMI PCI Slot Control failed.
IPMI PCI Slot Configuration failed.
Configuring Devices ...
IPMI:Get controller FRU inventory:failed
IPMI:Get midplane FRU 0 inventory:failed
IPMI: Get NVRAM FRU inventory:failed
BIOS POST Failure(s) detected: SP IPMI failure. Abort AUTOBOOT
LOADER-A>BIOS POST Failure(s) detected: Failed to get FRU data. Abort AUTOBOOT
- The Service Processor (SP) event log reports similar failure messages.
Example:
Record 1287: Tue Apr 14 14:34:05.000000 2020 [SysFW.notice]: IPMI:Read midplane FRU common header:timeout - retrying
Record 1288: Tue Apr 14 14:34:10.000000 2020 [SysFW.notice]: IPMI:Read midplane FRU common header:timeout
Record 1289: Tue Apr 14 14:34:13.000000 2020 [SysFW.notice]: Failed to recover SP
Record 1290: Tue Apr 14 14:34:13.000000 2020 [SysFW.critical]: IPMI:Read midplane FRU common header:failed
Record 1291: Sun Jan 01 00:02:58.340000 2017 [Trap Event.critical]: hwassist post_error (26)
Record 1292: Tue Apr 14 14:34:14.000000 2020 [SysFW.critical]: IPMI PCI Slot Control failed.
Record 1293: Sun Jan 01 00:02:59.310000 2017 [Trap Event.critical]: hwassist post_error (26)
Record 1294: Tue Apr 14 14:34:18.000000 2020 [CFE.notice]: Loader time adjust: Set BMC time. Old time: Sun Jan  1 00:03:03 2017. New time: Tue Apr 14 14:34:18 2020.
Record 1295: Tue Apr 14 14:34:18.000000 2020 [Boot Loader.notice]: Received time sync
Record 1296: Tue Apr 14 14:34:20.000000 2020 [Boot Loader.critical]: Abort Autoboot due to BIOS POST failure.
Record 1297: Tue Apr 14 14:34:20.280000 2020 [Trap Event.critical]: hwassist post_error (26)
Record 1298: Tue Apr 14 14:34:24.020000 2020 [IPMI.notice]: 001c | 02 | EVT: 6fc220ff | System_FW_Status | Assertion Event, "Bootloader is running"
- The issue persists after the controller is reseated and with no cable connected to the e0M/SP port
    - (this rules out the excessive SP traffic issue described in Node shuts down and fails to boot due to 'SP IPMI failure')
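
When a saved copy of the SP event log is available, the failure signature shown in the examples above can be checked for with a small script. The following is a minimal sketch, not a NetApp tool: it assumes every log line has the "Record <n>: <timestamp> [<source>.<severity>]: <message>" shape seen in the example, and the names RECORD_RE, SIGNATURES, parse_records and find_signatures are hypothetical. The signature strings are copied from the example output and may not cover every variant of the message.

```python
import re

# Assumed line shape, taken from the example SP event log above:
#   Record <n>: <timestamp> [<source>.<severity>]: <message>
RECORD_RE = re.compile(
    r"^Record (?P<num>\d+): (?P<ts>.+?) "
    r"\[(?P<source>[^\]]+)\.(?P<severity>\w+)\]: (?P<msg>.*)$"
)

# Message fragments copied from the example log (illustrative, not exhaustive).
SIGNATURES = (
    "Failed to recover SP",
    "IPMI PCI Slot Control failed",
    "Abort Autoboot due to BIOS POST failure",
)


def parse_records(text):
    """Return a list of dicts, one per line that matches the assumed shape."""
    records = []
    for line in text.splitlines():
        match = RECORD_RE.match(line.strip())
        if match:
            records.append(match.groupdict())
    return records


def find_signatures(text):
    """Return (record number, severity, signature) for each matching record."""
    hits = []
    for record in parse_records(text):
        for signature in SIGNATURES:
            if signature in record["msg"]:
                hits.append((int(record["num"]), record["severity"], signature))
    return hits


# A few lines copied from the example log above.
SAMPLE = """\
Record 1289: Tue Apr 14 14:34:13.000000 2020 [SysFW.notice]: Failed to recover SP
Record 1291: Sun Jan 01 00:02:58.340000 2017 [Trap Event.critical]: hwassist post_error (26)
Record 1292: Tue Apr 14 14:34:14.000000 2020 [SysFW.critical]: IPMI PCI Slot Control failed.
Record 1296: Tue Apr 14 14:34:20.000000 2020 [Boot Loader.critical]: Abort Autoboot due to BIOS POST failure.
"""

if __name__ == "__main__":
    for number, severity, signature in find_signatures(SAMPLE):
        print(f"Record {number} [{severity}]: {signature}")
```

Matching on plain substrings rather than on full lines keeps the check tolerant of the trailing punctuation and timestamp differences visible in the example. Lines that do not fit the assumed shape are skipped silently.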
 
