StorageGRID storage node goes into Unknown status because all 4 HICs are down
Applies to
- StorageGRID 11.6
- SG5700 appliance
Issue
- StorageGRID storage node goes into Unknown status
- No error output in servermanager.log
- No crash dump is generated
- The messages log indicates that all 4 HICs (hic1 - hic4) go down suddenly:
localhost kernel: [19559076.317761] [qed_dbg_parse_attn:10662(hic2-0)]opte (Parity) : mem011_i_mem_prty [address 0x00053000, bit 10] [masked]
localhost kernel: [19559076.339231] [qed_mcp_handle_process_kill:1883(hic2-0)]Process kill counter: 1
localhost kernel: [19559076.349959] [qed_dbg_parse_attn:10662(hic4-0)]opte (Parity) : mem011_i_mem_prty [address 0x00053000, bit 10] [masked]
localhost kernel: [19559076.350055] [qed_mcp_handle_process_kill:1866(hic4-0)]Received a process kill indication
localhost kernel: [19559076.531864] bond0: link status down for interface hic2, disabling it in 200 ms
localhost kernel: [19559076.543859] bond0: link status down for interface hic2, disabling it in 200 ms
localhost kernel: [19559076.553575] qede 0000:42:00.1: Ending qede_remove successfully
localhost kernel: [19559076.571853] bond0: link status down for interface hic2, disabling it in 200 ms
localhost kernel: [19559076.595854] bond0: link status down for interface hic2, disabling it in 200 ms
localhost kernel: [19559076.623867] bond0: link status down for interface hic2, disabling it in 200 ms
localhost kernel: [19559077.916018] [qede_link_update:3836(hic1)]Link is down
localhost kernel: [19559077.967837] bond0: link status down for interface hic4, disabling it in 200 ms
localhost kernel: [19559077.979836] bond0: link status down for interface hic4, disabling it in 200 ms
localhost kernel: [19559077.990998] qede 0000:42:00.3: Ending qede_remove successfully
localhost kernel: [19559078.015288] bond1: link status down for interface hic3, disabling it in 200 ms
localhost kernel: [19559078.017950] qede 0000:42:00.3: firmware: direct-loading firmware qed/qed_init_values_zipped-8.50.16.0.bin
localhost kernel: [19559078.027833] bond1: link status down for interface hic1, disabling it in 200 ms
localhost kernel: [19559078.039965] bond1: link status down for interface hic3, disabling it in 200 ms
localhost kernel: [19559078.062300] bond1: link status down for interface hic3, disabling it in 200 ms
localhost kernel: [19559078.079835] bond0: link status down for interface hic4, disabling it in 200 ms
localhost kernel: [19559078.107835] bond0: link status down for interface hic4, disabling it in 200 ms
localhost kernel: [19559078.115839] bond0: link status down for interface hic4, disabling it in 200 ms
localhost kernel: [19559078.127834] bond1: link status down for interface hic1, disabling it in 200 ms
localhost kernel: [19559078.137130] bond1: link status down for interface hic3, disabling it in 200 ms
localhost kernel: [19559078.143834] bond0: link status down for interface hic4, disabling it in 200 ms
localhost kernel: [19559078.151831] bond0: link status down for interface hic4, disabling it in 200 ms
localhost kernel: [19559078.219861] bond1: now running without any active interface!
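To confirm the same pattern on another node, the kernel messages can be summarized per interface. The following is a minimal Python sketch, not a NetApp tool. It assumes the log lines have exactly the formats shown in the excerpt above; the function name `summarize_hic_events` and the shape of the returned dictionary are hypothetical and chosen only for illustration.

```python
import re
from collections import defaultdict

# Patterns derived only from the message formats in the excerpt above.
# Other driver or kernel versions may log differently (assumption).
BOND_DOWN = re.compile(
    r"\[(?P<ts>\d+\.\d+)\] (?P<bond>bond\d+): "
    r"link status down for interface (?P<hic>hic\d+)"
)
QEDE_DOWN = re.compile(
    r"\[(?P<ts>\d+\.\d+)\] \[qede_link_update:\d+\((?P<hic>hic\d+)\)\]Link is down"
)
PROCESS_KILL = re.compile(
    r"\[(?P<ts>\d+\.\d+)\] "
    r"\[qed_mcp_handle_process_kill:\d+\((?P<hic>hic\d+)-\d+\)\]"
)


def _note_down(entry, ts):
    """Record the earliest kernel timestamp at which the link was reported down."""
    if entry["first_down"] is None or ts < entry["first_down"]:
        entry["first_down"] = ts


def summarize_hic_events(lines):
    """Return a per-interface summary of link-down and process-kill events.

    Hypothetical helper: the result maps each interface name to a dict with
    the earliest link-down timestamp, the bonds that reported it down, and
    whether a qed process-kill message was seen for it.
    """
    summary = defaultdict(
        lambda: {"first_down": None, "bonds": set(), "process_kill": False}
    )
    for line in lines:
        m = BOND_DOWN.search(line)
        if m:
            entry = summary[m["hic"]]
            entry["bonds"].add(m["bond"])
            _note_down(entry, float(m["ts"]))
            continue
        m = QEDE_DOWN.search(line)
        if m:
            _note_down(summary[m["hic"]], float(m["ts"]))
            continue
        m = PROCESS_KILL.search(line)
        if m:
            summary[m["hic"]]["process_kill"] = True
    return dict(summary)


if __name__ == "__main__":
    import sys

    # Usage: python summarize_hic.py < messages
    for hic, info in sorted(summarize_hic_events(sys.stdin).items()):
        print(hic, info)
```

If all four interfaces (hic1 through hic4) appear in the output with link-down timestamps within a few seconds of each other, the node matches the symptom described in this article.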