NetApp Knowledge Base

CSHM: ClusterSwitchConfig_Alert: System health degraded due to wrong Cluster Switch cabling

Applies to

  • ONTAP 9
  • Cluster Switch Health Monitor (CSHM) AutoSupport message

Issue

  1. System health is degraded
cluster::> system health status show
Status
---------------
degraded
 
  2. Switch-Health subsystem is degraded
cluster::> system health subsystem show
Subsystem         Health
----------------- ------------------
SAS-connect       ok
Environment       ok
Memory            ok
Service-Processor ok
Switch-Health     degraded
CIFS-NDO          ok
Motherboard       ok
IO                ok
MetroCluster      ok
MetroCluster_Node ok
FHM-Switch        ok
FHM-Bridge        ok
SAS-connect_Cluster ok
13 entries were displayed.
 
  3. The following health alerts are logged
cluster::> system health alert show
Node: node01
Resource: node01
Severity: Major
Indication Time: Mon Oct 07 13:19:08 2019
Suppress: false
Acknowledge: false
Probable Cause: One or more nodes are not connected to both cluster switches.
Possible Effect: If one cluster switch fails, "node01" might lose access to the cluster.
Corrective Actions: Ensure the switch "switch02" is connected to the node "node01".

Node: node01
Resource: node02
Severity: Major
Indication Time: Mon Oct 07 13:19:08 2019
Suppress: false
Acknowledge: false
Probable Cause: One or more nodes are not connected to both cluster switches.
Possible Effect: If one cluster switch fails, "node02" might lose access to the cluster.
Corrective Actions: Ensure the switch "switch01" is connected to the node "node02".

2 entries were displayed.

  4. You might also see the following health alerts

cluster::> system health alert show
               Node: cluster-01
           Resource: Ethernet1/1
           Severity: Minor
    Indication Time: Wed Aug 25 05:23:27 2021
           Suppress: false
        Acknowledge: false
     Probable Cause: MTU value "1500" on port "e0e" of node
                     "cluster-01" is improperly set. It
                     should be 9000.
    Possible Effect: Received Ethernet packets that are larger than the
                     configured MTU are dropped, causing data transfer
                     issues.
 Corrective Actions: modify the MTU using the command "network port
                     broadcast-domain modify -ipspace Cluster
                     -broadcast-domain Cluster -mtu <MTU>". To find out the
                     broadcast domain name of the port, execute the command
                     "network port broadcast-domain show".

  5. AutoSupport reports:

HMSCSA:HA Group Notification from cluster1-03 (Health Monitor process cshm: ClusterSwitchConfig_Alert[cluster1-03]) ERROR
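
When many alerts are logged, it can help to reduce the "system health alert show" output to a list of node and switch pairs that need their cabling checked. The following is a minimal sketch, not a NetApp tool: the function names parse_alerts and missing_links are hypothetical, and the parser assumes the output has the same field labels and wording as the examples above (including wrapped continuation lines). The text is assumed to have been captured from the CLI by some other means.

```python
import re

# Field labels as they appear in the alert output shown in this article.
FIELDS = {
    "Node", "Resource", "Severity", "Indication Time", "Suppress",
    "Acknowledge", "Probable Cause", "Possible Effect", "Corrective Actions",
}

# Wording taken from the Corrective Actions text of the cabling alert.
CABLING_RE = re.compile(
    r'Ensure the switch "([^"]+)" is connected to the node "([^"]+)"'
)


def parse_alerts(text):
    """Split captured alert output into one dict per alert record."""
    alerts = []
    current = None
    last_key = None
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines, the CLI prompt line, and the entry-count footer.
        if not line or "::>" in line:
            continue
        if line.endswith(("entries were displayed.", "entry was displayed.")):
            continue
        key, sep, value = line.partition(":")
        key = key.strip()
        if sep and key in FIELDS:
            if key == "Node":  # "Node" is the first field of every record
                current = {}
                alerts.append(current)
            if current is not None:
                current[key] = value.strip()
                last_key = key
        elif current is not None and last_key is not None:
            # Wrapped line: append it to the field it continues.
            current[last_key] += " " + line
    return alerts


def missing_links(alerts):
    """Return (node, switch) pairs named in cabling Corrective Actions."""
    pairs = []
    for alert in alerts:
        match = CABLING_RE.search(alert.get("Corrective Actions", ""))
        if match:
            pairs.append((match.group(2), match.group(1)))
    return pairs


SAMPLE = '''\
cluster::> system health alert show
Node: node01
Resource: node01
Severity: Major
Indication Time: Mon Oct 07 13:19:08 2019
Corrective Actions: Ensure the switch "switch02" is connected to the node "node01".

Node: node01
Resource: node02
Severity: Major
Corrective Actions: Ensure the switch "switch01" is connected to the node "node02".

2 entries were displayed.
'''

if __name__ == "__main__":
    for node, switch in missing_links(parse_alerts(SAMPLE)):
        print(f"check cabling: {node} <-> {switch}")
```

Matching on the alert wording is fragile by design: if a different ONTAP release phrases the Corrective Actions text differently, missing_links returns an empty list rather than guessing.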

 
