10G LACP bond ports repeatedly flapping after cluster upgrade to 12.2
Applies to
- NetApp Element OS 12.2
- Juniper switches
Issue
After the upgrade to 12.2, the 10G LACP bonds on all nodes repeatedly disconnect from the switches with extreme
frequency.
Excerpt from the switch logs for eth1:
Apr 22 08:30:27 sfdc-qfx3 mib2d[6494]: SNMP_TRAP_LINK_DOWN: ifIndex 689, ifAdminStatus up(1), ifOperStatus down(2), ifName xe-0/0/67
Apr 22 08:30:28 sfdc-qfx3 mib2d[6494]: SNMP_TRAP_LINK_UP: ifIndex 689, ifAdminStatus up(1), ifOperStatus up(1), ifName xe-0/0/67
Apr 22 08:30:38 sfdc-qfx3 mib2d[6494]: SNMP_TRAP_LINK_DOWN: ifIndex 689, ifAdminStatus up(1), ifOperStatus down(2), ifName xe-0/0/67
Apr 22 08:30:39 sfdc-qfx3 mib2d[6494]: SNMP_TRAP_LINK_UP:
The node kern.log shows the following errors:
2021-04-22T12:51:27.217013Z cr-san05-01 kernel: [647443.505835] bnx2x 0000:01:00.0 eth0: NIC Link is Down
2021-04-22T12:51:27.297039Z cr-san05-01 kernel: [647443.585861] bnx2x 0000:01:00.0 eth0: NIC Link is Up, 10000 Mbps full duplex, Flow control: ON - receive & transmit
2021-04-22T12:51:27.467029Z cr-san05-01 kernel: [647443.755850] bnx2x 0000:01:00.1 eth1: NIC Link is Down
2021-04-22T12:51:27.557036Z cr-san05-01 kernel: [647443.845857] bnx2x 0000:01:00.1 eth1: NIC Link is Up, 10000 Mbps full duplex, Flow control: ON - receive & transmit
2021-04-22T12:58:30.635345Z cr-san05-01 kernel: [647866.924166] bnx2x 0000:01:00.0 eth0: NIC Link is Down
2021-04-22T12:58:30.725346Z cr-san05-01 kernel: [647867.014167] bnx2x 0000:01:00.0 eth0: NIC Link is Up, 10000 Mbps full duplex, Flow control: ON - receive & transmit
2021-04-22T12:58:30.895361Z cr-san05-01 kernel: [647867.184182] bnx2x 0000:01:00.1 eth1: NIC Link is Down
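
To gauge how often each interface is flapping, the link-down events in a copy of kern.log can be tallied per interface. The following is a minimal sketch, assuming Python 3 on a workstation and bnx2x log lines in the format shown in the excerpt above. The function name count_link_down_events is hypothetical and is not part of any NetApp tooling.

```python
import re
from collections import Counter

# Matches bnx2x link-state lines in the format seen in the kern.log excerpt,
# e.g. "bnx2x 0000:01:00.0 eth0: NIC Link is Down".
LINK_RE = re.compile(r"bnx2x \S+ (?P<iface>\w+): NIC Link is (?P<state>Up|Down)")


def count_link_down_events(lines):
    """Return a Counter mapping interface name to its number of link-down events."""
    downs = Counter()
    for line in lines:
        match = LINK_RE.search(line)
        if match and match.group("state") == "Down":
            downs[match.group("iface")] += 1
    return downs


# Sample input: four lines copied from the excerpt above.
SAMPLE_LINES = [
    "2021-04-22T12:51:27.217013Z cr-san05-01 kernel: [647443.505835] bnx2x 0000:01:00.0 eth0: NIC Link is Down",
    "2021-04-22T12:51:27.297039Z cr-san05-01 kernel: [647443.585861] bnx2x 0000:01:00.0 eth0: NIC Link is Up, 10000 Mbps full duplex, Flow control: ON - receive & transmit",
    "2021-04-22T12:51:27.467029Z cr-san05-01 kernel: [647443.755850] bnx2x 0000:01:00.1 eth1: NIC Link is Down",
    "2021-04-22T12:51:27.557036Z cr-san05-01 kernel: [647443.845857] bnx2x 0000:01:00.1 eth1: NIC Link is Up, 10000 Mbps full duplex, Flow control: ON - receive & transmit",
]

for iface, count in sorted(count_link_down_events(SAMPLE_LINES).items()):
    print(f"{iface}: {count} link-down event(s)")
```

To run it against a full log, pass an open file object instead of SAMPLE_LINES, since the function accepts any iterable of lines.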