StorageGRID node goes into Unknown status and remains there after controller replacement
Applies to
- StorageGRID 11.60.2
- StorageGRID Appliance SG5712
Issue
- Storage node goes into Unknown status
- 7-segment display on E5700SG shows HO and E2800A shows 99
- Unable to access StorageGRID Appliance Installer (https://<APPLIANCE_IP>:8443)
- Reseating or rebooting the controller does not resolve the issue
- Replacing the following parts does not resolve the issue, or the issue recurs soon after it is temporarily resolved:
- E5700SG compute controller
- E2800A storage controller
- SFPs on E5700SG and E2800A
- FC cable used for interconnect between E5700SG and E2800A
- The following errors might be seen in some cases:
- qla2x00_sp_timeout in storagegrid_crash_dmesg.<TIMESTAMP>.log
[108059.672106] Hardware name: Default string Default string/Default string, BIOS 1.04.0 08/08/2018
[108059.680861] RIP: 0010:qla2x00_sp_timeout+0x4c/0x80 [qla2xxx]
- SG-UPDATE-ERROR: Samoa HIC verify failed! in /var/log/messages
StorageGRID-PGE root: [2023-06-14 01:28:19+00:00 SGA] About to check Samoa HIC version and update if necessary
StorageGRID-PGE root: [2023-06-14 01:28:19+00:00 SGA] Samoa Verify chip 0; filespec /lib/firmware/netapp/Samoa_Sandhawk.bin
StorageGRID-PGE kernel: [ 29.187625] EP8324-HILDA-0 can not find function handle
StorageGRID-PGE root: [2023-06-14 01:28:19+00:00 SGA] Samoa file version: 5.5.31.0; personality FC
StorageGRID-PGE root: [2023-06-14 01:28:19+00:00 SGA] Samoa file mac label/base: fffffffffffffffd nic macs {ffffffffffffffff; ffffffffffffffff}
StorageGRID-PGE root: [2023-06-14 01:28:19+00:00 SGA] spi_read_uint32: Fail...
StorageGRID-PGE root: [2023-06-14 01:28:19+00:00 SGA]
StorageGRID-PGE root: [2023-06-14 01:28:19+00:00 SGA] SG-UPDATE-ERROR: Samoa HIC verify failed!
Note: These messages do not always directly indicate this problem.
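When working from collected copies of storagegrid_crash_dmesg.<TIMESTAMP>.log or /var/log/messages, the signatures listed above can be searched for with a small script. The following is a minimal sketch and not a NetApp tool. The names SIGNATURES, scan_lines, and scan_file are hypothetical, and the search strings are copied from the log excerpts in this article. In line with the note above, a match is only a hint and does not confirm this issue.

```python
import re
from pathlib import Path

# Search strings copied from the log excerpts in this article.
# Assumption: a match is a hint only, not a confirmed diagnosis.
SIGNATURES = {
    "qla2x00_sp_timeout": re.compile(r"qla2x00_sp_timeout"),
    "samoa_hic_verify_failed": re.compile(
        r"SG-UPDATE-ERROR: Samoa HIC verify failed!"
    ),
    "spi_read_fail": re.compile(r"spi_read_uint32: Fail"),
}


def scan_lines(lines):
    """Return {signature name: [matching lines]} for an iterable of log lines."""
    hits = {name: [] for name in SIGNATURES}
    for line in lines:
        for name, pattern in SIGNATURES.items():
            if pattern.search(line):
                hits[name].append(line.rstrip("\n"))
    return hits


def scan_file(path):
    """Scan one collected log file; undecodable bytes are replaced, not fatal."""
    with Path(path).open(errors="replace") as fh:
        return scan_lines(fh)
```

For example, scan_file("messages") on a local copy of /var/log/messages returns a dictionary whose lists are empty when none of the signatures are present.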