A800 interconnect down with e0a/e0b Fatal parity error (0x10)

Last updated

Feb 20, 2025

Applies to

AFF A800, AFF C800, ASA A800, ASA C800
Dual 40/100G Ethernet T62100-MEZZ
ONTAP 9

Issue

After a node power cycle, reboot, or upgrade, the system is in partial giveback and the interconnect status shows: "RDMA Interconnect is down"
The storage failover status is: "Storage failover interconnect error. NVRAM log not synchronized. Disk inventory not exchanged"
The console logs show: e0a/e0b:Fatal parity error (0x10)
ONTAP and the BMC, BIOS, and T62100 firmware are updated and running on both nodes

EMS logs:

May 02 07:58:09 [node_name:netif.fatal.err:ALERT]: The network device in slot 0 encountered fatal error e0a/e0b.
May 02 07:58:09 [node_name:netif.fatal.err:ALERT]: The network device in slot 0 encountered fatal error e0a/e0b.
May 02 22:49:05 [node_name: kernel: netif.linkDown:info]: Ethernet e0a: Link down, check cable.
May 02 22:49:05 [node_name: kernel: netif.linkDown:info]: Ethernet e0b: Link down, check cable.
May 02 22:49:05 [node_name: intr: rlib.ifconfig.linkEvent:notice]: params: {'ifname': 'e0b', 'eventType': 'DOWN'}
May 02 22:49:05 -0800 [node_name: vifmgr: vifmgr.portdown:notice]: A link down event was received on node node_name, port e0a.
May 02 22:49:05 -0800 [node_name: nvmm_error: nvmm.mirror.offlined:debug]: params: {'mirror': 'HA_PARTNER'}
May 02 22:49:05 -0800 [node_name: vifmgr: vifmgr.clus.linkdown:EMERGENCY]: The cluster port e0a on node node_name has gone down unexpectedly.
May 02 23:00:00 -0800 [node_name: statd: ic.HAInterconnectDown:error]: HA interconnect: Interconnect down for 10 minutes: link0 down
May 02 23:00:00 -0800 [node_name: statd: callhome.hainterconnect.down:alert]: Call home for HA INTERCONNECT DOWN due to link0 down.
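
When checking a saved EMS log against this article, it can help to extract the event name and severity from each entry and look for the two events that characterize the issue. The following is a minimal Python sketch, not a NetApp tool. It assumes one EMS entry per line in the bracketed format shown above. The function names `parse_ems_line` and `matches_signature` are hypothetical, and treating `netif.fatal.err` together with `ic.HAInterconnectDown` as the signature is an assumption drawn only from this excerpt.

```python
import re

# Event names taken from the EMS excerpt in this article; using this pair
# as a "signature" is an illustrative assumption, not an official rule.
SIGNATURE_EVENTS = ("netif.fatal.err", "ic.HAInterconnectDown")

# Matches the bracketed header "[node: source: event:severity]:" and the message after it.
_EMS_HEADER = re.compile(r"\[([^\]]+)\]:\s*(.*)")


def parse_ems_line(line):
    """Return (event, severity, message), or None if no EMS header is found."""
    match = _EMS_HEADER.search(line)
    if match is None:
        return None
    # Header fields are colon-separated; the last two are event name and severity.
    fields = [field.strip() for field in match.group(1).split(":")]
    if len(fields) < 3:
        return None
    return fields[-2], fields[-1], match.group(2)


def matches_signature(lines):
    """Return True if every event in SIGNATURE_EVENTS appears in the given lines."""
    seen = set()
    for line in lines:
        parsed = parse_ems_line(line)
        if parsed is not None:
            seen.add(parsed[0])
    return all(event in seen for event in SIGNATURE_EVENTS)
```

For example, `matches_signature(open("ems.log"))` would report whether both events occur in a file named `ems.log` (a placeholder path). A match only suggests that the log resembles the excerpt above; it does not confirm the root cause.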