EMS Event "netif.tcp.conn.bad.checksum"
Applies to
ONTAP 9.11.1 and later
Issue
- The following EMS message(s) are displayed:
Mon Sep 26 02:09:05 +0900 [node01: kernel: netif.tcp.conn.bad.checksum:error]: TCP packet with bad checksum detected on port e0c. The packet arrived on connection with source address xx.xx.xx.xx and destination_address xx.xx.x.xx.
- The Bad TCP cksum counter on the port is also incrementing in the output of the following command:
::> system node run -node node_name -command ifstat port_name
-- interface e0c (40 days, 5 hours, 34 minutes, 37 seconds) --
RECEIVE
Total frames: 2047m | Frames/second: 589 | Total bytes: 2625g
Bytes/second: 755k | Total errors: 0 | Errors/minute: 0
...
LRO bytes: 2527g | LRO6 segments: 0 | LRO6 bytes: 0
Bad UDP cksum: 0 | Bad UDP6 cksum: 0 | Bad TCP cksum: 21
Bad TCP6 cksum: 0 | Mcast v6 solicit: 0 | Lagg errors: 0
...
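To confirm that the counter is actually incrementing, two ifstat captures taken some time apart can be compared. The following is a minimal sketch, assuming the output has been saved as text and keeps the "Name: value | Name: value" layout shown above. The function names parse_ifstat_counters and bad_tcp_cksum_delta are hypothetical helpers, not part of ONTAP.

```python
# Sketch only: compares the "Bad TCP cksum" counter between two saved ifstat captures.
# Assumes the "Name: value | Name: value" layout shown in the sample output.

SAMPLE_IFSTAT = """\
-- interface e0c (40 days, 5 hours, 34 minutes, 37 seconds) --
RECEIVE
Total frames: 2047m | Frames/second: 589 | Total bytes: 2625g
Bytes/second: 755k | Total errors: 0 | Errors/minute: 0
LRO bytes: 2527g | LRO6 segments: 0 | LRO6 bytes: 0
Bad UDP cksum: 0 | Bad UDP6 cksum: 0 | Bad TCP cksum: 21
Bad TCP6 cksum: 0 | Mcast v6 solicit: 0 | Lagg errors: 0
"""


def parse_ifstat_counters(text):
    """Return {counter name: raw value string} for every 'Name: value' field."""
    counters = {}
    for line in text.splitlines():
        # Skip the interface header and section titles such as RECEIVE.
        if ":" not in line or line.lstrip().startswith("--"):
            continue
        for field in line.split("|"):
            name, sep, value = field.partition(":")
            if sep and value.strip():
                counters[name.strip()] = value.strip()
    return counters


def bad_tcp_cksum_delta(before, after, counter="Bad TCP cksum"):
    """Return how much the counter grew between two captures.

    Values are kept as strings by the parser because ifstat abbreviates large
    numbers (for example 2047m); int() here assumes the checksum counter is
    still small enough to be printed without a suffix.
    """
    old = int(parse_ifstat_counters(before)[counter])
    new = int(parse_ifstat_counters(after)[counter])
    return new - old


if __name__ == "__main__":
    later = SAMPLE_IFSTAT.replace("Bad TCP cksum: 21", "Bad TCP cksum: 25")
    print(bad_tcp_cksum_delta(SAMPLE_IFSTAT, later))
```

A non-zero delta between captures indicates that bad checksums are still arriving on the port. A static value may only reflect a past event.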
- Before proceeding with the solution, perform the following troubleshooting steps:
  - Check for CRC errors in the ifstat output. If CRC errors are visible, troubleshoot the cable/SFP connected to the port.
  - If no CRC errors are reported in ifstat:
    - If a single client is reported in the EMS messages, investigate that client.
    - Otherwise, if there is no trend in clients, investigate the devices between the client IPs and the LIFs mentioned in the EMS message.
    - Ensure the correct MTU is set on all interfaces (storage, switch, host).
- Note: Collect simultaneous tcpdumps or packet traces from the impacted storage port, the connected switch port, and the client to confirm the TCP bad checksums. There is currently no other known method to rule out contributors or determine the cause.
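
To decide whether a single client or many clients are involved, the EMS messages can be tallied by source address. The following is a minimal sketch, assuming the messages have been exported as text lines in the format shown in the Issue section. The sample addresses are made up (documentation range 192.0.2.0/24), and count_bad_checksum_sources and single_source are hypothetical helper names.

```python
# Sketch only: tallies netif.tcp.conn.bad.checksum events per source address.
# Assumes the EMS message wording shown in the Issue section.
import re
from collections import Counter

EMS_PATTERN = re.compile(
    r"netif\.tcp\.conn\.bad\.checksum.*?detected on port (?P<port>\S+?)\.\s"
    r".*?source address (?P<src>\S+) and destination_address (?P<dst>\S+)"
)

# Made-up sample lines; real addresses will differ.
SAMPLE_EMS = [
    "Mon Sep 26 02:09:05 +0900 [node01: kernel: netif.tcp.conn.bad.checksum:error]: "
    "TCP packet with bad checksum detected on port e0c. The packet arrived on "
    "connection with source address 192.0.2.10 and destination_address 192.0.2.50.",
    "Mon Sep 26 02:19:41 +0900 [node01: kernel: netif.tcp.conn.bad.checksum:error]: "
    "TCP packet with bad checksum detected on port e0c. The packet arrived on "
    "connection with source address 192.0.2.10 and destination_address 192.0.2.50.",
    "Mon Sep 26 03:02:17 +0900 [node01: kernel: netif.tcp.conn.bad.checksum:error]: "
    "TCP packet with bad checksum detected on port e0c. The packet arrived on "
    "connection with source address 192.0.2.11 and destination_address 192.0.2.50.",
    "Mon Sep 26 03:05:00 +0900 [node01: kernel: some.other.event:info]: unrelated",
]


def count_bad_checksum_sources(log_lines):
    """Return a Counter mapping source address -> number of bad-checksum events."""
    counts = Counter()
    for line in log_lines:
        match = EMS_PATTERN.search(line)
        if match:
            counts[match.group("src")] += 1
    return counts


def single_source(counts):
    """Return the only source address seen, or None if there are zero or several."""
    return next(iter(counts)) if len(counts) == 1 else None


if __name__ == "__main__":
    for src, hits in count_bad_checksum_sources(SAMPLE_EMS).most_common():
        print(src, hits)
```

If single_source returns an address, that client is the first thing to investigate. If events are spread across many sources, the devices between the clients and the LIFs are the more likely place to look.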