CRC errors received on a single NIC port
Applies to
- ONTAP 9
- FAS / AFF Systems
- CRC errors reported on a single port
Issue
- Event logs report hardware errors on a physical and/or logical port.
[node-01: vifmgr: vifmgr.cluscheck.crcerrors]: Port a0b on node node-01 is reporting a high number of observed hardware errors, possibly CRC errors
[node-02: vifmgr: vifmgr.cluscheck.crcerrors]: Port e0d on node node-02 is reporting a high number of observed hardware errors, possibly CRC errors
[node-02: vifmgr: vifmgr.cluscheck.hwerrors:alert]: Port e0d on node node-02 is reporting a high number (at least 1 per 1000 packets) of observed hardware errors (CRC, length, alignment, dropped)
[node-02: vifmgr: callhome.clus.net.degraded:alert]: Call home for CLUSTER NETWORK DEGRADED: CRC Errors Detected - High CRC errors detected on port e0d node node-02
- ifstat output shows CRC errors if ONTAP is receiving the errors
- Issue persists after a cable/SFP re-seat and after clearing the counters with ifstat -z on the affected node
RECEIVE
Total frames: 36418m | Frames/second: 23646 | Total bytes: 179t
Bytes/second: 116m | Total errors: 170k | Errors/minute: 7
Total discards: 0 | Discards/minute: 0 | Multi/broadcast: 1686k
Non-primary u/c: 0 | CRC errors: 159k | Long frames: 0
- CRC errors may also be observed on the switch port or on the client side, and clients may see latency due to packet loss, for example:
2022-03-20T17:39:36.443Z cpu36:2098075)WARNING: ScsiDeviceIO: 1498: Device naa.600a09803830574c4d5d53ddf26c4543 performance has deteriorated. I/O latency increased from average value of 18171 microseconds to 1816780 microseconds.
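To put the RECEIVE counters shown above in proportion, the following minimal Python sketch parses the "Name: value" pairs and computes the share of received frames that had CRC errors. It is an illustration only: the function names parse_counters and crc_ratio are hypothetical, and it assumes that the k/m/g/t suffixes in the output are decimal multipliers. The counters are cumulative since the last ifstat -z, so this ratio is not necessarily what the vifmgr events evaluate; comparing two snapshots taken some time apart would give a rate for that interval instead.

```python
import re

# Assumption: k/m/g/t suffixes in ifstat output are decimal multipliers.
SUFFIX = {"": 1, "k": 10**3, "m": 10**6, "g": 10**9, "t": 10**12}

def parse_counters(text):
    """Return {counter name: integer value} from 'Name: value' fields split by '|'."""
    counters = {}
    for line in text.splitlines():
        for field in line.split("|"):
            match = re.fullmatch(r"\s*(.+?):\s*(\d+)([kmgt]?)\s*", field)
            if match:
                name, digits, suffix = match.groups()
                counters[name] = int(digits) * SUFFIX[suffix]
    return counters

def crc_ratio(counters):
    """Fraction of received frames counted as CRC errors (0.0 if no frames)."""
    total = counters.get("Total frames", 0)
    return counters.get("CRC errors", 0) / total if total else 0.0

# RECEIVE block copied from the ifstat output in this article.
sample = """\
Total frames: 36418m | Frames/second: 23646 | Total bytes: 179t
Bytes/second: 116m | Total errors: 170k | Errors/minute: 7
Total discards: 0 | Discards/minute: 0 | Multi/broadcast: 1686k
Non-primary u/c: 0 | CRC errors: 159k | Long frames: 0
"""

counters = parse_counters(sample)
print("CRC errors:", counters["CRC errors"])
print("Total frames:", counters["Total frames"])
print(f"CRC errors per received frame: {crc_ratio(counters):.2e}")
```
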
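Points to remember:
Many switching environments use cut-through switching rather than store-and-forward switching because of its speed. A cut-through switch starts forwarding a frame before it has received the whole frame, so a corrupted frame is passed on to the next hop instead of being dropped.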
- This means that the faulty hardware may not be on the directly connected link
- The CRC error may have originated upstream
- Errors received by ONTAP show up as a non-zero value for CRC errors in ifstat
- If ifstat shows zero CRC errors but the switch port reports CRC errors, the problem may be in the transmit direction: the switch is receiving the errors and ONTAP is not seeing them
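The points above can be summarized as a small decision helper. This is only a sketch of the reasoning in this article, not a diagnostic tool: the function name interpret_crc and its two arguments (the CRC count from ifstat on the ONTAP port and the CRC count from the connected switch port) are hypothetical.

```python
def interpret_crc(ontap_rx_crc, switch_port_crc):
    """Map the two CRC counters to the hints listed under 'Points to remember'."""
    if ontap_rx_crc > 0:
        # ONTAP itself counts bad frames on receive.
        return ("ONTAP is receiving frames with CRC errors; with cut-through "
                "switching the faulty hardware may be upstream of the "
                "directly connected link")
    if switch_port_crc > 0:
        # Only the switch counts bad frames on this link.
        return ("ONTAP is not receiving errors; the switch port sees them, "
                "so the problem may be in the transmit direction")
    return "No CRC errors counted on either side"

print(interpret_crc(159000, 0))
print(interpret_crc(0, 42))
```
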