Data service interruption during MetroCluster switchover with CRC errors on data ports
Applies to
- MetroCluster IP
- Data Switches
Issue
- Data service interruption during MetroCluster switchover and switchback with:
  - Datastores offline
  - Aggregates online
  - Network interfaces up
  - LUNs mapped and online
- CRC errors are seen on both nodes on one side of the MCC array
- CRC errors were already present in ifstat output before the issue occurred
- Example:
Node: Cluster01-n01
-- interface e4a (136 days, 19 hours, 38 minutes, 22 seconds) --
RECEIVE
Total frames: 61123k | Frames/second: 5 | Total bytes: 9351m
Bytes/second: 791 | Total errors: 19203 | Errors/minute: 0
...
CRC errors: 19036 | Runt frames: 0 | Fragment: 0
...
Jumbo: 7 | Error symbol: 0 | Bus overruns: 0
...
Node: Cluster01-n02
-- interface e4a (136 days, 19 hours, 38 minutes, 14 seconds) --
RECEIVE
Total frames: 74898k | Frames/second: 6 | Total bytes: 14696m
Bytes/second: 1243 | Total errors: 11610 | Errors/minute: 0
...
CRC errors: 11532 | Runt frames: 0 | Fragment: 0
...
Jumbo: 11 | Error symbol: 0 | Bus overruns: 0
...
- ONTAP event message for network errors. Example:
ALERT: vifmgr.cluscheck.hwerrors: Port e5a-603 on node node_name is reporting a high number (at least 1 per 1000 packets) of observed hardware errors (CRC, length, alignment, dropped).
- Connection to LUNs from vSphere was lost until switchback was completed.
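The event message above expresses its threshold as errors per 1000 packets, while ifstat reports cumulative counters. The following is a minimal sketch of how the two can be related, not an ONTAP tool. It assumes that the k, m, and g suffixes in ifstat counters are decimal multipliers; the function names are hypothetical.

```python
import re

# Assumption: ifstat counter suffixes are decimal multipliers (k = 1,000).
_SUFFIX = {"": 1, "k": 10**3, "m": 10**6, "g": 10**9}


def parse_counter(text):
    """Convert an ifstat counter such as '61123k' to an integer."""
    match = re.fullmatch(r"(\d+)([kmg]?)", text.strip().lower())
    if match is None:
        raise ValueError(f"unrecognized counter: {text!r}")
    return int(match.group(1)) * _SUFFIX[match.group(2)]


def errors_per_thousand(total_frames, total_errors):
    """Return receive errors per 1000 frames from two ifstat counters."""
    frames = parse_counter(total_frames)
    errors = parse_counter(total_errors)
    if frames == 0:
        return 0.0
    return 1000.0 * errors / frames


# RECEIVE counters copied from the ifstat example for port e4a on each node.
samples = {
    "Cluster01-n01": ("61123k", "19203"),
    "Cluster01-n02": ("74898k", "11610"),
}

for node, (frames, errors) in samples.items():
    print(f"{node}: {errors_per_thousand(frames, errors):.2f} errors per 1000 frames")
```

These counters cover the full 136 days since they were last reset, so the result is a long-term average. Comparing two ifstat samples taken some time apart gives a better picture of the current error rate.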