NetApp Knowledge Base

StorageGRID node connection state is unknown due to faulty network adapter

Category: storagegrid
Specialty: sgrid

Applies to

  • NetApp StorageGRID
  • Bare metal-based Storage Node

Issue

  • The Connection State of a Storage Node is shown as Unknown in the Grid Manager:
Select Nodes > the affected node > Overview:
[Screenshot: connection-state-unknown.png]
  • servermanager.log indicates a network issue:
2021-01-23 12:39:10 +0000 | dynip | Possible network isolation: Node has no contact with other nodes.
 
  • The base OS messages log shows i40e driver errors, and all interfaces of bond0 are reported as link down:
Jan 23 12:33:41 dc1-sn1 kernel: i40e 0000:1a:00.0: HMC error interrupt
Jan 23 12:33:41 dc1-sn1 kernel: i40e 0000:1a:00.0: HMC error info 0x80000090, HMC error data 0x0
Jan 23 12:33:41 dc1-sn1 kernel: i40e 0000:1a:00.1: unhandled interrupt icr0=0x00010000
Jan 23 12:33:41 dc1-sn1 kernel: i40e 0000:1a:00.0: unhandled interrupt icr0=0x00010000
Jan 23 12:33:41 dc1-sn1 kernel: i40e 0000:1a:00.0: device will be reset
Jan 23 12:33:41 dc1-sn1 kernel: i40e 0000:1a:00.1: device will be reset
Jan 23 12:33:41 dc1-sn1 kernel: i40e 0000:1a:00.1: VSI seid 393 Tx ring 128 disable timeout
Jan 23 12:33:41 dc1-sn1 kernel: i40e 0000:1a:00.1: VSI seid 393 Rx ring 128 disable timeout
Jan 23 12:33:41 dc1-sn1 kernel: i40e 0000:1a:00.0: VSI seid 390 Tx ring 0 disable timeout
Jan 23 12:33:41 dc1-sn1 kernel: i40e 0000:1a:00.0: VSI seid 390 Rx ring 0 disable timeout
Jan 23 12:33:41 dc1-sn1 kernel: i40e 0000:1a:00.0: VSI seid 392 Tx ring 128 disable timeout
Jan 23 12:33:41 dc1-sn1 kernel: i40e 0000:1a:00.0: VSI seid 392 Rx ring 128 disable timeout
Jan 23 12:33:41 dc1-sn1 kernel: bond0: link status definitely down for interface eno1, disabling it
Jan 23 12:33:41 dc1-sn1 kernel: device eno1 left promiscuous mode
Jan 23 12:33:41 dc1-sn1 kernel: bond0: now running without any active interface!
Jan 23 12:33:41 dc1-sn1 kernel: bond0: link status definitely down for interface eno2, disabling it
Jan 23 12:33:57 dc1-sn1 kernel: i40e 0000:1a:00.1: PF reset failed, -15
Jan 23 12:33:57 dc1-sn1 kernel: i40e 0000:1a:00.0: PF reset failed, -15
Jan 23 12:34:01 dc1-sn1 kernel: i40e 0000:1a:00.1: Rebuild AdminQ failed, err I40E_ERR_ADMIN_QUEUE_TIMEOUT aq_err OK
Jan 23 12:34:01 dc1-sn1 kernel: i40e 0000:1a:00.0: Rebuild AdminQ failed, err I40E_ERR_ADMIN_QUEUE_TIMEOUT aq_err OK
Jan 23 12:34:01 dc1-sn1 kernel: i40e 0000:1a:00.0: ignoring delete macvlan error on PF, err I40E_ERR_QUEUE_EMPTY, aq_err OK
Jan 23 12:34:17 dc1-sn1 kernel: i40e 0000:1a:00.1: PF reset failed, -15
Jan 23 12:34:17 dc1-sn1 kernel: i40e 0000:1a:00.0: PF reset failed, -15
...
Jan 23 12:39:10 dc1-sn1 journal: Possible network isolation: Node has no contact with other nodes. If this warning persists, use the /usr/sbin/add_node_ip.py command to tell this node the address of another node in the grid. See the Recovery and Maintenance Guide for details.
Jan 23 12:39:10 dc1-sn1 journal: 2021-05-23 13:39:10 +0000 | dynip | Possible network isolation: Node has no contact with other nodes.
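
The log signatures listed above can be checked for mechanically. The following is a minimal Python sketch, not a NetApp tool: the category names, the `classify` and `matches_issue` helpers, and the matching heuristic are all assumptions made for illustration, and the regular expressions are derived only from the excerpts shown in this article.

```python
import re
from collections import Counter

# Patterns derived from the log excerpts in this article.
# The category names are hypothetical labels chosen for this sketch.
SIGNATURES = {
    "hmc_error": re.compile(r"i40e \S+: HMC error interrupt"),
    "pf_reset_failed": re.compile(r"i40e \S+: PF reset failed"),
    "adminq_timeout": re.compile(r"i40e \S+: Rebuild AdminQ failed"),
    "bond_member_down": re.compile(
        r"bond\d+: link status definitely down for interface (\S+),"
    ),
    "bond_no_active": re.compile(
        r"bond\d+: now running without any active interface"
    ),
    "network_isolation": re.compile(r"Possible network isolation"),
}


def classify(lines):
    """Count signature hits and collect bond members reported as down."""
    counts = Counter()
    down_interfaces = []
    for line in lines:
        for name, pattern in SIGNATURES.items():
            match = pattern.search(line)
            if match:
                counts[name] += 1
                if name == "bond_member_down":
                    down_interfaces.append(match.group(1))
    return counts, down_interfaces


def matches_issue(counts):
    """Assumed heuristic: adapter fault, bond without an active
    interface, and a network isolation warning all present."""
    adapter_fault = counts["hmc_error"] > 0 or counts["pf_reset_failed"] > 0
    return (
        adapter_fault
        and counts["bond_no_active"] > 0
        and counts["network_isolation"] > 0
    )


# Sample lines copied from the excerpts in this article.
SAMPLE = [
    "Jan 23 12:33:41 dc1-sn1 kernel: i40e 0000:1a:00.0: HMC error interrupt",
    "Jan 23 12:33:41 dc1-sn1 kernel: bond0: link status definitely down for interface eno1, disabling it",
    "Jan 23 12:33:41 dc1-sn1 kernel: bond0: now running without any active interface!",
    "Jan 23 12:33:41 dc1-sn1 kernel: bond0: link status definitely down for interface eno2, disabling it",
    "Jan 23 12:33:57 dc1-sn1 kernel: i40e 0000:1a:00.1: PF reset failed, -15",
    "2021-01-23 12:39:10 +0000 | dynip | Possible network isolation: Node has no contact with other nodes.",
]

counts, down = classify(SAMPLE)
print(dict(counts))
print("Bond members down:", down)
print("Matches this issue:", matches_issue(counts))
```

To use the sketch against a real node, pass the lines of the base OS messages log and of servermanager.log to `classify` instead of `SAMPLE`. The location of those files depends on the installation and is not stated in this article. A positive result only suggests that the symptoms resemble the ones described here; it does not confirm a faulty adapter.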

 

NetApp provides no representations or warranties regarding the accuracy or reliability or serviceability of any information or recommendations provided in this publication or with respect to any results that may be obtained by the use of the information or observance of any recommendations provided herein. The information in this document is distributed AS IS and the use of this information or the implementation of any recommendations or techniques herein is a customer's responsibility and depends on the customer's ability to evaluate and integrate them into the customer's operational environment. This document and the information contained herein may be used solely in connection with the NetApp products discussed in this document.