NetApp Knowledge Base

CRC errors on T6 ports after converting from 40GbE to 100GbE

Category: ontap-9
Specialty: HW

Applies to

  • AFF A800, AFF C800, ASA A800, ASA C800 with onboard T6 ports e0a and e0b (T62100-MEZZ) and T62100-CR (P/N X1146A) NIC in Slot 1
  • AFF A320 onboard T6 ports e0g and e0h (T62100-SABR)
  • IO Module,2p MC IP,40GbE QSFP+,100GbE QSFP28 (P/N X91146A) - T62100-CR
  • 2p 100GbE iWARP QSFP28 NIC (P/N X1146A) - T62100-CR
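As a rough illustration, the onboard port names listed above can be captured in a small lookup table for use when triaging alerts. This is a minimal sketch: the names `ONBOARD_T6_PORTS` and `is_onboard_t6_port` are hypothetical, and the add-in cards (X1146A, X91146A) are left out because this article does not name their ports.

```python
# Onboard T6 ports per platform, transcribed from the "Applies to" list.
# Add-in NICs (X1146A, X91146A) are intentionally omitted: the article
# does not state their port names, so they are not guessed here.
ONBOARD_T6_PORTS = {
    "AFF A800": {"e0a", "e0b"},
    "AFF C800": {"e0a", "e0b"},
    "ASA A800": {"e0a", "e0b"},
    "ASA C800": {"e0a", "e0b"},
    "AFF A320": {"e0g", "e0h"},
}


def is_onboard_t6_port(platform: str, port: str) -> bool:
    """Return True if `port` is an onboard T6 port on `platform`."""
    return port in ONBOARD_T6_PORTS.get(platform, set())
```

A port that is not in the table is not necessarily unaffected; it may belong to one of the add-in cards listed above.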

Issue

  • After T6-based Ethernet ports are converted from 40GbE to 100GbE, a continuously high number of CRC errors is reported because of corrupted Ethernet packets.

[node1 vifmgr: callhome.clus.net.degraded:alert]: Call home for CLUSTER NETWORK DEGRADED: CRC Errors Detected - High CRC errors detected on port e0a node node1

  • Link parameters are not cleared after a 40GbE to 100GbE port conversion, resulting in the generation of malformed packets. 
  • In some cases, the receipt of these corrupted packets can lead to a system disruption (data outage). Examples:
    • VIFMgr crashes, disrupting NAS protocols. Examples of ONTAP event messages reported:
      • [node_name: vifmgr: vifmgr.startup.merge.err:error]: The Logical Interface Manager (VIFMgr) encountered errors during startup.
      • vifmgr.startup.failover.err: VIFMgr encountered errors during startup.
      • vifmgr.dbase.checkerror: VIFMgr experienced an error verifying cluster database consistency. Some LIFs might not be hosted properly as a result.
    • Multiple disks fail due to checksum errors. Examples of ONTAP event messages reported:
      • raidio_thread: raid_tetris_cksum_err_1:notice]: params: {'owner': '', 'disk_info': 'Disk /aggr1/plex4/rg0/0v.i1.2L4P3 Shelf 10 Bay 1 ...
      • io_thread: raid_rg_readerr_repair_parity_1:notice]: params: {'owner': '', 'disk_info': 'Disk /aggr1/plex4/rg0/0v.i1.2L4P3 Shelf 10 Bay 1 ...
      • io_thread: raid_rg_readerr_repair_cksum_stored_1:notice]: params: {'owner': '', 'disk_info': 'Disk /aggr1/plex4/rg0/0v.i1.2L4P3 Shelf 10 Bay 1 ...
      • disk_server_0: disk_checksum_verifyFailed_1:alert]: params: {'diskName': '0v.i1.2L4', 'bno': '13160468', 'vol': '------', 'fileid': '-1', 'block': '0'}
      • hamsg: disk.fail.ssdstats:info]: Disk 0v.i1.2L4 (S6H0NA0TB04199) failed with rated life used 0 %, percent spare blocks 0 %, spare blocks N/A.
      • hamsg: disk.outOfService:notice]: Drive 0v.i1.2L4 (S6H0NA0TB04199): message received. Power-On Hours: 8006, GList Count: 0, Drive Info: Disk 0v.i1.2L4 Shelf 10 Bay 1 ...
      • config_thread: raid.shared.disk.awaiting.done:info]: Received shared disk awaiting done Disk 0v.i1.2L4 Shelf 10 Bay 1 ..., state failing, substate 0x10, partner state failing, partner substate 0x4, partner dblade ID xxxx host type 1 add details receive awaiting done
      • dmgr_thread: raid.notify.on.failure:debug]: Received SDM_NOTIFY_ON_FAILURE for disk uid x:x:x:x, originating sysid 538300608, with failure reason 1 (failed).
      • raid_disk_thread: raid.disk.unload.done:info]: Unload of Disk 0v.i1.2L4 Shelf 10 Bay 1 [NETAPP   X4010S173A1T9NTE NA50] S/N [S6H0NA0TB04199] UID [x:x:x:x] has completed successfully
  • Port speed changes can occur in scenarios such as the following:
    • 40GbE Cluster switch or Ethernet data switches are replaced with 100GbE models
    • Cluster ports are temporarily configured at 40GbE for a storage system upgrade, but the final port speed configuration is 100GbE
    • A switched cluster is converted to a switchless configuration
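
The event messages quoted above can be searched for in collected EMS log text. The following is a minimal sketch, assuming log lines are available as plain strings in the format shown in this article; the function name `classify_line` and the regular expression are illustrative and are not part of any NetApp tool.

```python
import re

# Event names copied from the messages quoted in this article.
SYMPTOM_EVENTS = (
    "callhome.clus.net.degraded",
    "vifmgr.startup.merge.err",
    "vifmgr.startup.failover.err",
    "vifmgr.dbase.checkerror",
    "disk_checksum_verifyFailed_1",
    "raid_tetris_cksum_err_1",
)

# Matches the tail of the call-home text: "... on port e0a node node1".
# Assumption: other ONTAP releases may word this message differently.
CRC_PORT_RE = re.compile(r"CRC errors detected on port (\S+) node (\S+)")


def classify_line(line):
    """Return (event, port, node) for a line naming a known event, else None.

    port and node are None unless the line carries the CRC call-home text.
    """
    for event in SYMPTOM_EVENTS:
        if event in line:
            match = CRC_PORT_RE.search(line)
            port, node = (match.group(1), match.group(2)) if match else (None, None)
            return event, port, node
    return None


sample = (
    "[node1 vifmgr: callhome.clus.net.degraded:alert]: Call home for "
    "CLUSTER NETWORK DEGRADED: CRC Errors Detected - High CRC errors "
    "detected on port e0a node node1"
)
print(classify_line(sample))  # ('callhome.clus.net.degraded', 'e0a', 'node1')
```

A match only indicates that one of the symptoms listed here appears in the log; it does not confirm that a 40GbE to 100GbE conversion is the cause.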

 


NetApp provides no representations or warranties regarding the accuracy or reliability or serviceability of any information or recommendations provided in this publication or with respect to any results that may be obtained by the use of the information or observance of any recommendations provided herein. The information in this document is distributed AS IS and the use of this information or the implementation of any recommendations or techniques herein is a customer's responsibility and depends on the customer's ability to evaluate and integrate them into the customer's operational environment. This document and the information contained herein may be used solely in connection with the NetApp products discussed in this document.