CONTAP-301093: SwitchUnreachable_Alert, SwitchSNMPCommunication_Alert, SwitchInfoUpdateFailure_Alert seen during high utilization of e0M
Issue
- ONTAP Ethernet Switch Health Monitor (CSHM) raises some or all of the following alerts: SwitchUnreachable_Alert, SwitchSNMPCommunication_Alert, and SwitchInfoUpdateFailure_Alert.
- If SNMPv3 is enabled for monitoring with CSHM, "failed" messages like the following are seen in messages.log for both switches. Example:
0000002e.12eaff0e 098ae1de Thu Feb 13 2025 20:20:34 +01:00 [kern_network_shelfd:info:1536] 0x80840f900: 0: ERR: dns_sd_lite::networking: get_query_sockets:src/networking.cc:871 Could not create query sockets
...
00000032.00761fb6 0126e281 Tue Sep 03 2024 22:31:26 +02:00 [kern_cshm:info:54360] [Sep 3 22:31:26]: 0x80b10fe00: 0: ERR: CSHM::helpers: snmpwalk:src/tables/cshm_helpers.cc:237 'switch1': Failed to fetch SNMPv3 engine information from sm_snmpEngine: SNMP timeout
...
[kern_cshm:info:29776] [Feb 13 20:20:34]: 0x80b11fb00: 0: ERR: CSHM::snmp: buildCommand:src/tables/cshm_helpers.cc:339 'switch1': Failed to add SNMPv3 engine information from sm_snmpEngine
...
0000002e.12eaff12 098ae1de Thu Feb 13 2025 20:20:34 +01:00 [kern_cshm:info:29776] [Feb 13 20:20:34]: 0x80b11fb00: 0: ERR: CSHM::snmp: buildCommand:src/tables/cshm_helpers.cc:217 'switch1': Failed to build SNMPv3 command
- The node management e0M port(s) reach 100% utilization, with more than 100 MB/s of throughput on a 1 Gb port.
- The e0M ifstat output shows a high amount of "unexpected" traffic received and transmitted. Example(s):
-- interface e0M (102 days, 15 hours, 25 minutes, 42 seconds) --
RECEIVE
Total frames: 275m | Frames/second: 31 | Total bytes: 74803m
Bytes/second: 8435 | Total errors: 0 | Errors/minute: 0
...
TRANSMIT
Total frames: 3726m | Frames/second: 420 | Total bytes: 559g
Bytes/second: 63080 | Total errors: 0 | Errors/minute: 0
...
-- interface e0M (1 day, 10 hours, 9 minutes, 55 seconds) --
RECEIVE
Total frames: 7903k | Frames/second: 64 | Total bytes: 2003m
Bytes/second: 16290 | Total errors: 0 | Errors/minute: 0
...
TRANSMIT
Total frames: 92516k | Frames/second: 752 | Total bytes: 13425m
Bytes/second: 109k | Total errors: 0 | Errors/minute: 0
...
-- interface e0M (8 days, 10 hours, 0 minutes, 19 seconds) --
RECEIVE
Total frames: 44625k | Frames/second: 61 | Total bytes: 11193m
Bytes/second: 15392 | Total errors: 0 | Errors/minute: 0
...
TRANSMIT
Total frames: 547m | Frames/second: 753 | Total bytes: 79676m
Bytes/second: 109k | Total errors: 0 | Errors/minute: 0
...
-- interface e0M (57 days, 9 hours, 53 minutes, 2 seconds) --
RECEIVE
Total frames: 405m | Frames/second: 82 | Total bytes: 176g
Bytes/second: 35513 | Total errors: 0 | Errors/minute: 0
...
TRANSMIT
Total frames: 3778m | Frames/second: 762 | Total bytes: 609g
Bytes/second: 122k | Total errors: 0 | Errors/minute: 0
...
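To compare samples like the ones above, the totals can be reduced to a transmit/receive ratio and an average rate over the counter interval. The following is a minimal Python sketch, not an ONTAP tool. The helper names (parse_count, parse_interval, summarize) are hypothetical, and the k/m/g suffixes are assumed to be decimal multipliers (10^3, 10^6, 10^9); that assumption is consistent with the first sample, where total bytes divided by the interval roughly matches the reported Bytes/second.

```python
import re

# Assumption: ifstat suffixes are decimal multipliers.
SUFFIX = {"": 1, "k": 10**3, "m": 10**6, "g": 10**9}
UNIT_SECONDS = {"day": 86400, "hour": 3600, "minute": 60, "second": 1}


def parse_count(text):
    """Convert a counter such as '559g' or '8435' to an integer."""
    match = re.fullmatch(r"(\d+)([kmg]?)", text.strip())
    if match is None:
        raise ValueError(f"unrecognized counter: {text!r}")
    return int(match.group(1)) * SUFFIX[match.group(2)]


def parse_interval(text):
    """Convert '102 days, 15 hours, 25 minutes, 42 seconds' to seconds."""
    parts = re.findall(r"(\d+) (day|hour|minute|second)s?", text)
    return sum(int(number) * UNIT_SECONDS[unit] for number, unit in parts)


def summarize(interval, rx_total_bytes, tx_total_bytes):
    """Return the TX/RX byte ratio and average byte rates for one sample."""
    seconds = parse_interval(interval)
    rx = parse_count(rx_total_bytes)
    tx = parse_count(tx_total_bytes)
    return {
        "tx_rx_ratio": tx / rx,
        "avg_rx_bytes_per_second": rx / seconds,
        "avg_tx_bytes_per_second": tx / seconds,
    }


if __name__ == "__main__":
    # Values copied from the first e0M sample above.
    print(summarize("102 days, 15 hours, 25 minutes, 42 seconds",
                    "74803m", "559g"))
```

For the first sample, this gives a transmit volume of roughly 7.5 times the receive volume. The per-second figures are averages over the whole interval, so short bursts of high utilization are not visible in them.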
- The network route with the lowest metric is configured in the same subnet as the management port.
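The route condition in the last bullet can be checked offline once the route list and the e0M address are known. The following is a minimal Python sketch using only the standard ipaddress module. The function names, the dictionary layout of a route, and the addresses (taken from the documentation ranges 192.0.2.0/24 and 198.51.100.0/24) are all hypothetical. It also assumes that "configured in the same subnet" means the gateway of the lowest-metric route lies inside the management port's subnet.

```python
import ipaddress


def lowest_metric_route(routes):
    """Return the route entry with the smallest metric."""
    return min(routes, key=lambda route: route["metric"])


def gateway_in_mgmt_subnet(routes, mgmt_address, mgmt_netmask):
    """True if the lowest-metric route's gateway is in the management subnet."""
    mgmt_network = ipaddress.ip_interface(
        f"{mgmt_address}/{mgmt_netmask}"
    ).network
    gateway = ipaddress.ip_address(lowest_metric_route(routes)["gateway"])
    return gateway in mgmt_network


if __name__ == "__main__":
    # Hypothetical route table: two default routes with different metrics.
    example_routes = [
        {"destination": "0.0.0.0/0", "gateway": "192.0.2.1", "metric": 10},
        {"destination": "0.0.0.0/0", "gateway": "198.51.100.1", "metric": 20},
    ]
    # Hypothetical e0M management LIF address and netmask.
    print(gateway_in_mgmt_subnet(example_routes, "192.0.2.10", "255.255.255.0"))
```

A result of True means the preferred route points into the management subnet, which matches the condition described in this article.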