Motherboard status degraded after ONTAP upgrade

Applies to

ONTAP 9
Cluster Network Switch

Issue

A health check run after an ONTAP upgrade or node reboot shows the Motherboard subsystem status as degraded.

::> system health status show
Status
---------------
degraded

::> system health subsystem show
Subsystem           Health
------------------- ------------------
SAS-connect         ok
Environment         ok
Memory              ok
Service-Processor   ok
Switch-Health       ok
CIFS-NDO            ok
Motherboard         degraded
IO                  ok
MetroCluster        ok
MetroCluster_Node   ok
FHM-Switch          ok
FHM-Bridge          ok
SAS-connect_Cluster ok
13 entries were displayed.
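When this check is scripted across several clusters, the degraded subsystem can be picked out of captured output programmatically. The following is a minimal Python sketch, assuming the output of `system health subsystem show` has been saved as text in a two-column layout with a dashed separator row. The function name `degraded_subsystems` is hypothetical and is not part of any NetApp tooling.

```python
def degraded_subsystems(output: str) -> list[str]:
    """Return the names of subsystems whose health column is not 'ok'."""
    degraded = []
    in_table = False
    for line in output.splitlines():
        parts = line.split()
        if not in_table:
            # The dashed separator row marks the start of the table body.
            in_table = bool(parts) and all(set(p) == {"-"} for p in parts)
            continue
        # Data rows have exactly two columns; the trailing summary line
        # ("13 entries were displayed.") does not, so it is skipped.
        if len(parts) == 2 and parts[1] != "ok":
            degraded.append(parts[0])
    return degraded


# Abbreviated sample taken from the output shown in this article.
SAMPLE = """\
Subsystem           Health
------------------- ------------------
SAS-connect         ok
Environment         ok
Motherboard         degraded
IO                  ok
SAS-connect_Cluster ok
13 entries were displayed.
"""

print(degraded_subsystems(SAMPLE))  # prints ['Motherboard']
```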

The NodeIfInErrorsWarnAlert health alert is reported for port e0c on node1 and node2.

::> system health alert show
Node: node2
Alert ID: NodeIfInErrorsWarnAlert
Resource: e0c
Severity: Major
Indication Time: Thu Mar 27 18:33:07 2025
Suppress: false
Acknowledge: false
Probable Cause: The percentage of inbound packet errors of node "node2" on interface "e0c" is above the warning threshold.
Possible Effect: Communication from this node to the cluster might be degraded
Corrective Actions:
1) Migrate any cluster LIF that uses this connection to another port connected to a cluster switch. For example, if cluster LIF "clus1" is on port e0a and the other LIF is on e0b, run the following command to move "clus1" to e0b: "network interface migrate -vserver vs1 -lif clus1 -sourcenode node1 -destnode node1 -dest-port e0b"
2) Replace the network cable with a known-good cable. If errors are corrected, stop. No further action is required. Otherwise, continue to Step 3.
3) Move the network cable to another port on the node (if available). Migrate the cluster LIF to the new port. If errors are corrected, contact technical support to troubleshoot the original node port. Otherwise, continue to Step 4.
4) Move the network cable to another available cluster switch port. Migrate the cluster LIF back to the original port. If errors are corrected, contact technical support to troubleshoot the original switch port. If errors persist, contact technical support for further assistance.

Node: node1
Alert ID: NodeIfInErrorsWarnAlert
Resource: e0c
Severity: Major
Indication Time: Thu Mar 27 18:33:01 2025
Suppress: false
Acknowledge: false
Probable Cause: The percentage of inbound packet errors of node "node1" on interface "e0c" is above the warning threshold.
Possible Effect: Communication from this node to the cluster might be degraded
Corrective Actions:
1) Migrate any cluster LIF that uses this connection to another port connected to a cluster switch. For example, if cluster LIF "clus1" is on port e0a and the other LIF is on e0b, run the following command to move "clus1" to e0b: "network interface migrate -vserver vs1 -lif clus1 -sourcenode node1 -destnode node1 -dest-port e0b"
2) Replace the network cable with a known-good cable. If errors are corrected, stop. No further action is required. Otherwise, continue to Step 3.
3) Move the network cable to another port on the node (if available). Migrate the cluster LIF to the new port. If errors are corrected, contact technical support to troubleshoot the original node port. Otherwise, continue to Step 4.
4) Move the network cable to another available cluster switch port. Migrate the cluster LIF back to the original port. If errors are corrected, contact technical support to troubleshoot the original switch port. If errors persist, contact technical support for further assistance.

2 entries were displayed.
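Because the alert text is long, it can help to reduce it to the fields that identify the affected port. The sketch below is one possible way to do that in Python, assuming the `system health alert show` output has been captured as text with the `Node:`, `Alert ID:`, `Resource:`, and `Severity:` fields in that order, as in the output above. The name `summarize_alerts` is hypothetical.

```python
import re

# Matches the leading fields of each alert. \s+ spans spaces or newlines,
# so the pattern works whether the capture is wrapped or on a single line.
ALERT_RE = re.compile(
    r"Node:\s*(?P<node>\S+)\s+"
    r"Alert ID:\s*(?P<alert_id>\S+)\s+"
    r"Resource:\s*(?P<resource>\S+)\s+"
    r"Severity:\s*(?P<severity>\S+)"
)


def summarize_alerts(output: str) -> list[dict[str, str]]:
    """Return one dict per alert with node, alert_id, resource, severity."""
    return [m.groupdict() for m in ALERT_RE.finditer(output)]


# Abbreviated sample based on the alerts shown in this article.
ALERT_SAMPLE = """\
Node: node2
Alert ID: NodeIfInErrorsWarnAlert
Resource: e0c
Severity: Major
Indication Time: Thu Mar 27 18:33:07 2025

Node: node1
Alert ID: NodeIfInErrorsWarnAlert
Resource: e0c
Severity: Major
Indication Time: Thu Mar 27 18:33:01 2025
"""

for alert in summarize_alerts(ALERT_SAMPLE):
    print(alert["node"], alert["resource"], alert["alert_id"], alert["severity"])
```

For the sample above, this yields one record for node2 and one for node1, both on resource e0c.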

NodeIfInErrorsWarnAlert is reported because of an increase in CRC errors on cluster port e0c of both node1 and node2.
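To illustrate what "percentage of inbound packet errors" means, the following Python sketch computes an error rate from two samples of receive counters. This is an assumption-based illustration only: the exact formula and the warning threshold that ONTAP uses are not stated in this article, and the counter values, the tuple layout, and the function name `inbound_error_percent` are all hypothetical.

```python
def inbound_error_percent(prev: tuple[int, int], curr: tuple[int, int]) -> float:
    """Percentage of inbound packets that were errors between two samples.

    prev and curr are (rx_packets, rx_errors) counter pairs taken at two
    points in time (hypothetical layout, not an ONTAP data structure).
    """
    delta_packets = curr[0] - prev[0]
    delta_errors = curr[1] - prev[1]
    if delta_packets <= 0:
        # No traffic (or counters were reset) in the interval.
        return 0.0
    return 100.0 * delta_errors / delta_packets


# Hypothetical counter samples, for illustration only.
earlier = (1_000_000, 10)
later = (1_500_000, 2_510)
print(inbound_error_percent(earlier, later))  # prints 0.5
```

Working from deltas rather than absolute counters matters here because counters accumulate from the time the interface came up, so an old burst of errors would otherwise mask or exaggerate the current rate.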

EMS

The percentage of inbound packet errors of node "node1" on interface "e0c" is above the warning threshold.
The percentage of inbound packet errors of node "node2" on interface "e0c" is above the warning threshold.

[node1: vifmgr: vifmgr.cluscheck.ctdpktloss:alert]: Continued packet loss when pinging from cluster lif node2_clus2 (node node2) to cluster lif node1 (node node1).

[node1: vifmgr: callhome.clus.net.degraded:alert]: Call home for CLUSTER NETWORK DEGRADED: Large MTU Packet Loss - Ping failures detected between node2 ( 169.XXX.XX.217 ) on node2 and node1_clus1 ( 169.XXX.XX.173 ) on node1

ifconfig -v

node2

-- interface e0c (16 hours, 4 minutes, 52 seconds) --

-- interface e0c (8 hours, 25 minutes, 21 seconds) --