Controller fails to boot, throws POST errors "HA interconnect: Link down on NIC 0 / takeover disabled / The root volume is not up"
Applies to
- FAS2520
- FAS2552
- Data ONTAP 8
- ONTAP 9
Issue
The system fails to boot and reports the following three errors on the console:
- "HA interconnect: Link down on NIC 0"
- "takeover of Node-01 disabled (unsynchronized log)"
- "The root volume is not up. This node is not fully operational."
Example:
[2020-03-09 14:19:10.223] LOADER-B>
[2020-03-09 14:21:56.976] Invalid PCIe device detected below PCIe Root Port(Bus/Dev/Func): 00/1C/00
[2020-03-09 14:21:57.056] Actual Vendor ID and Device ID:FFFF/FFFF
[2020-03-09 14:21:57.087] Expected Vendor ID and Device ID:8086/150E
[2020-03-09 14:21:57.135] Mezzanine Card ID(02 - 10GbE, 03 - FC, 07 - No Dev, others - Resv):07
[2020-03-09 14:21:57.215] BIOS is resetting system...
[2020-03-09 14:22:00.365] Phoenix SecureCore(tm) Server
[2020-03-09 14:22:00.413] Copyright 1985-2008 Phoenix Technologies Ltd.
[2020-03-09 14:22:00.461] All Rights Reserved
[2020-03-09 14:22:00.477] BIOS version: 8.3.0
[2020-03-09 14:22:00.493] Portions Copyright (c) 2008-2014 NetApp, Inc. All Rights Reserved
[2020-03-09 14:22:00.573]
[2020-03-09 14:22:00.573] CPU = 1 Processors Detected, Cores per Processor = 2
[2020-03-09 14:22:00.621] Intel(R) Xeon(R) CPU C3528 @ 1.73GHz
[2020-03-09 14:22:00.701] Testing RAM
[2020-03-09 14:22:01.421] 512MB RAM tested
[2020-03-09 14:22:01.437] 18432MB RAM installed
[2020-03-09 14:22:01.485] 256 KB L2 Cache per Processor Core
[2020-03-09 14:22:01.517] 4096K L3 Cache Detected
[2020-03-09 14:22:01.548] System BIOS shadowed
[2020-03-09 14:22:02.813] USB 2.0: MICRON eUSB DISK
[2020-03-09 14:22:02.844] BIOS is scanning PCI Option ROMs, this may take a few seconds...
[2020-03-09 14:22:11.051] ...................
[2020-03-09 14:22:11.147]
[2020-03-09 14:22:11.309]
[2020-03-09 14:22:11.309] Boot Loader version 4.3
[2020-03-09 14:22:11.341] Copyright (C) 2000-2003 Broadcom Corporation.
[2020-03-09 14:22:11.389] Portions Copyright (C) 2002-2014 NetApp, Inc. All Rights Reserved.
[2020-03-09 14:22:11.468]
[2020-03-09 14:22:11.691] CPU Type: Intel(R) Xeon(R) CPU C3528 @ 1.73GHz
[2020-03-09 14:22:11.949]
[2020-03-09 14:22:11.949]
[2020-03-09 14:22:11.949] Starting AUTOBOOT press Ctrl-C to abort...
[2020-03-09 14:22:14.202] Loading X86_64/freebsd/image2/kernel:0x200000/10088648 0xb9f0c8/4301024 Entry at 0x80271e20
[2020-03-09 14:22:17.211] Loading X86_64/freebsd/image2/platform.ko:0xfba000/1990365 0x11a0000/296352 0x11e85a0/273360
[2020-03-09 14:22:18.091] Starting program at 0x80271e20
[2020-03-09 14:22:19.849] NetApp Data ONTAP 8.3.1P1
[2020-03-09 14:22:47.270] Copyright (C) 1992-2015 NetApp.
[2020-03-09 14:22:47.302] All rights reserved.
[2020-03-09 14:22:47.833] *******************************
[2020-03-09 14:22:47.862] * *
[2020-03-09 14:22:47.894] * Press Ctrl-C for Boot Menu. *
[2020-03-09 14:22:47.926] * *
[2020-03-09 14:22:47.958] *******************************
[2020-03-09 14:23:38.205] original max threads=40, original heap size=41943040
[2020-03-09 14:23:38.269] bip_nitro Virtual Size Limit=166748979 Bytes
[2020-03-09 14:23:38.318] bip_nitro: user memory=2022727680, actual max threads=115, actual heap size=120795955
[2020-03-09 14:23:42.522] ixgbe: e0c: ** JUMBOMBUF DEBUG ** switching to large buffers(9k -> 3k): (sz = 5120)!
[2020-03-09 14:23:42.860] ixgbe: e0d: ** JUMBOMBUF DEBUG ** switching to large buffers(9k -> 3k): (sz = 5120)!
[2020-03-09 14:23:43.193] ixgbe: e0e: ** JUMBOMBUF DEBUG ** switching to large buffers(9k -> 3k): (sz = 5120)!
[2020-03-09 14:23:43.530] ixgbe: e0f: ** JUMBOMBUF DEBUG ** switching to large buffers(9k -> 3k): (sz = 5120)!
[2020-03-09 14:23:50.618] Mar 09 14:23:47 [Node-02:cf.nm.nicTransitionDown:warning]: HA interconnect: Link down on NIC 0.
[2020-03-09 14:23:50.732] Mar 09 14:23:47 [Node-02:cf.rv.notConnected:error]: HA interconnect: Connection for 'cfo_rv' failed.
[2020-03-09 14:23:52.923] WAFL CPLEDGER is enabled. Checklist = 0x7ff841ff
[2020-03-09 14:23:54.585] Mar 09 14:23:51 [Node-02:cf.nm.nicReset:warning]: HA interconnect: Initiating soft reset on card 0 due to rendezvous reset.
[2020-03-09 14:23:55.560] add host 127.0.10.1: gateway 127.0.20.1
[2020-03-09 14:23:56.345] Mar 09 14:23:53 [Node-02:cf.fsm.takeoverOfPartnerDisabled:error]: Failover monitor: takeover of Node-01 disabled (HA interconnect error. Verify that the partner node is running and that the HA
[2020-03-09 14:23:56.554] interconnect cabling is correct, if applicable. For further assistance, contact technical support).
[2020-03-09 14:23:56.666] Mar 09 14:23:53 [Node-02:kern.syslog.msg:notice]: The system was down for 556 seconds
[2020-03-09 14:23:57.082] Node-02
[2020-03-09 14:23:57.242] Mar 09 14:23:53 [Node-02:snmp.agent.msg.access.denied:warning]: Permission denied for SNMPv3 requests from root. Reason: Password is too short (SNMPv3 requires at least 8 characters).
[2020-03-09 14:24:02.343] Mar 09 14:23:59 [Node-02:cf.fsm.takeoverOfPartnerDisabled:error]: Failover monitor: takeover of Node-01 disabled (unsynchronized log).
[2020-03-09 14:24:02.503] Mar 09 14:23:59 [Node-02:cf.fsm.takeoverByPartnerDisabled:error]: Failover monitor: takeover of Node-02 by Node-01 disabled (unsynchronized log).
[2020-03-09 14:24:03.353] Mar 09 14:24:00 [Node-02:cf.fsm.takeoverByPartnerEnabled:notice]: Failover monitor: takeover of Node-02 by Node-01 enabled
[2020-03-09 14:24:03.526] Mar 09 14:24:00 [Node-02:monitor.globalStatus.critical:CRITICAL]: Controller failover of Node-01 is not possible: unsynchronized log.
[2020-03-09 14:24:04.342] Mar 09 14:24:01 [Node-02:cf.fsm.takeoverOfPartnerEnabled:notice]: Failover monitor: takeover of Node-01 enabled
[2020-03-09 14:28:57.716] WARNING: Giving up waiting for mroot
[2020-03-09 14:28:58.229]
[2020-03-09 14:28:58.229] Mon Mar 9 14:28:54 JST 2020
[2020-03-09 14:28:58.263] login: admin
[2020-03-09 14:29:07.028] Password:
[2020-03-09 14:29:10.802] ******************************************************
[2020-03-09 14:29:10.866] * This is a serial console session. Output from this *
[2020-03-09 14:29:10.930] * session is mirrored on the SP console session. *
[2020-03-09 14:29:10.978] ******************************************************
[2020-03-09 14:29:11.042] ***********************
[2020-03-09 14:29:11.074] ** SYSTEM MESSAGES **
[2020-03-09 14:29:11.090] ***********************
[2020-03-09 14:29:11.122]
[2020-03-09 14:29:11.122] The root volume is not up. This node is not fully operational. Contact support
[2020-03-09 14:29:11.202] personnel for further assistance.
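
To check whether a saved console capture shows the same three messages listed under Issue, a small script can scan it for them. The following is a minimal sketch and not a NetApp tool: the names `SIGNATURES`, `find_signatures`, and `matches_issue` are hypothetical, and the patterns are derived only from the message text shown in this article. The node name in the takeover message differs per system, so it is matched loosely.

```python
import re

# Patterns built from the three messages listed under "Issue".
# Assumption: the message wording matches this article's example output.
SIGNATURES = {
    "ha_link_down": re.compile(r"HA interconnect: Link down on NIC 0"),
    # Node names vary between systems, so accept any non-space token.
    "takeover_disabled": re.compile(
        r"takeover of \S+ disabled \(unsynchronized log\)"
    ),
    "root_volume_down": re.compile(r"The root volume is not up"),
}


def find_signatures(console_text):
    """Return {signature name: first matching line} for each signature found."""
    found = {}
    for line in console_text.splitlines():
        for name, pattern in SIGNATURES.items():
            if name not in found and pattern.search(line):
                found[name] = line.strip()
    return found


def matches_issue(console_text):
    """Return True only when all three signatures appear in the capture."""
    return len(find_signatures(console_text)) == len(SIGNATURES)


if __name__ == "__main__":
    # Three lines copied from the example output above.
    sample = "\n".join([
        "[2020-03-09 14:23:50.618] Mar 09 14:23:47 "
        "[Node-02:cf.nm.nicTransitionDown:warning]: "
        "HA interconnect: Link down on NIC 0.",
        "[2020-03-09 14:24:02.343] Mar 09 14:23:59 "
        "[Node-02:cf.fsm.takeoverOfPartnerDisabled:error]: "
        "Failover monitor: takeover of Node-01 disabled (unsynchronized log).",
        "[2020-03-09 14:29:11.122] The root volume is not up. "
        "This node is not fully operational. Contact support",
    ])
    for name, line in find_signatures(sample).items():
        print(name, "->", line)
    print("matches this article:", matches_issue(sample))
```

In practice the text would come from a saved serial or SP console log rather than the inline sample. A match on all three messages only indicates that the symptoms agree with this article; it does not identify the underlying fault.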