AFF A800 goes down with a fatal parity error
Applies to
- AFF A800
- ONTAP 9.14.1
Issue
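The node console reports uncorrectable ECC data errors and a fatal parity error on the t6nex0 adapter, the adapter stops, and cluster port e0a is then reported down.
Example: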
t6nex0: ! [0x08000000] LE parity/CRC error
t6nex0: ! MC0_INT_CAUSE 0x41318 = 0x00000006, E 0x00000007, F 0x00000005
t6nex0: ! [0x00000004] Uncorrectable ECC data error(s)
t6nex0: * [0x00000002] Correctable ECC data error(s)
t6nex0: MC0: 26 uncorrectable ECC data error(s)
t6nex0: MC0: 8 correctable ECC data error(s)
t6nex0: ! PL_PL_INT_CAUSE 0x19430 = 0x00000018, E 0x00000010, F 0x00000010
t6nex0: ! [0x00000010] Fatal parity error
t6nex0: ? [0x00000008]
t6nex0: firmware reports adapter error: Crash (0xc014c010)
t6nex0: encountered fatal error, adapter stopped (0).
Sep 06 12:04:12 [node01:vifmgr.clus.linkdown:EMERGENCY]: The cluster port e0a on node node01 has gone down unexpectedly.
e0a: failed to add mc address 01:00:0c:cc:cc:cc rc=6
e0a: failed to add mc address 01:80:c2:00:00:0e rc=6
Waiting for PIDS: [rpcbind] 1869.
Terminated
.
Kernel thread "tmp_monitor" (pid 0) exited prematurely.
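
To check a saved console log for the same signature, the messages above can be matched with a few regular expressions. The following is a minimal sketch, not a NetApp tool: the function name `summarize_adapter_errors` and the result layout are hypothetical, and the patterns assume the message wording is exactly as in the example output above.

```python
import re

# Patterns derived from the example console output above (assumed wording).
UNCORRECTABLE_RE = re.compile(
    r"^(?P<dev>\w+): MC\d+: (?P<count>\d+) uncorrectable ECC data error\(s\)"
)
FATAL_PARITY_RE = re.compile(
    r"^(?P<dev>\w+): ! \[0x[0-9a-fA-F]+\] Fatal parity error"
)
ADAPTER_STOPPED_RE = re.compile(
    r"^(?P<dev>\w+): encountered fatal error, adapter stopped"
)


def _blank_entry():
    """Return an empty per-adapter record (hypothetical layout)."""
    return {"uncorrectable_ecc": 0, "fatal_parity": False, "adapter_stopped": False}


def summarize_adapter_errors(lines):
    """Collect per-adapter error indicators from console log lines."""
    summary = {}
    for raw in lines:
        line = raw.strip()

        m = UNCORRECTABLE_RE.match(line)
        if m:
            # Sum the uncorrectable ECC counts reported for this adapter.
            entry = summary.setdefault(m["dev"], _blank_entry())
            entry["uncorrectable_ecc"] += int(m["count"])
            continue

        m = FATAL_PARITY_RE.match(line)
        if m:
            summary.setdefault(m["dev"], _blank_entry())["fatal_parity"] = True
            continue

        m = ADAPTER_STOPPED_RE.match(line)
        if m:
            summary.setdefault(m["dev"], _blank_entry())["adapter_stopped"] = True

    return summary
```

Passing the lines of the example above to `summarize_adapter_errors` yields a single entry for `t6nex0` with 26 uncorrectable ECC errors and both the fatal parity and adapter stopped flags set. Lines for correctable errors and for port e0a are ignored by design, since only the adapter failure indicators are of interest here.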