ONTAP Select HA experiences random reboots due to network partitioning caused by a flaky Interconnect

Category: ontap-select

Specialty: ontapselect

Applies to

NetApp ONTAP Select
NetApp ONTAP Select Deploy

Issue

An ONTAP Select (OTS) HA pair experiences random reboots due to network partitioning caused by a flaky HA interconnect.
In this example, the first ha.netPartition.other event can be observed as early as 20 September 2021:

Mon Sep 20 18:20:14 +0200 [cluster01-02: nvram_sync: ha.netPartition.other:debug]: Network partition due to other error. Duration 803 msecs, takeover wait 0 msecs; error code 5; status: 0x201081; request id: 162.
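To follow how these events develop over time, the relevant fields can be extracted from the EMS log. The following Python snippet is a minimal sketch, not a NetApp tool: the regular expression is derived only from the single sample line above, so other ONTAP releases or message variants may need adjustments, and the function and field names (parse_partition_event, duration_ms, and so on) are illustrative assumptions.

```python
import re

# Pattern built from the one sample line in this article (assumption:
# other ha.netPartition.other messages follow the same layout).
PARTITION_RE = re.compile(
    r"^(?P<ts>\w{3} \w{3} +\d{1,2} \d{2}:\d{2}:\d{2} [+-]\d{4}) "
    r"\[(?P<node>[^:]+): (?P<proc>[^:]+): "
    r"ha\.netPartition\.other:(?P<sev>\w+)\]: "
    r".*?Duration (?P<duration>\d+) msecs, "
    r"takeover wait (?P<wait>\d+) msecs; "
    r"error code (?P<code>\d+); "
    r"status: (?P<status>0x[0-9a-fA-F]+); "
    r"request id: (?P<req>\d+)\."
)


def parse_partition_event(line):
    """Return a dict of fields for one ha.netPartition.other line, else None."""
    m = PARTITION_RE.match(line.strip())
    if m is None:
        return None
    return {
        "timestamp": m["ts"],  # kept as text; the log line carries no year
        "node": m["node"],
        "process": m["proc"],
        "severity": m["sev"],
        "duration_ms": int(m["duration"]),
        "takeover_wait_ms": int(m["wait"]),
        "error_code": int(m["code"]),
        "status": m["status"],
        "request_id": int(m["req"]),
    }


def partition_events(lines):
    """Collect parsed events from an iterable of log lines, skipping others."""
    events = []
    for line in lines:
        event = parse_partition_event(line)
        if event is not None:
            events.append(event)
    return events


SAMPLE = (
    "Mon Sep 20 18:20:14 +0200 [cluster01-02: nvram_sync: "
    "ha.netPartition.other:debug]: Network partition due to other error. "
    "Duration 803 msecs, takeover wait 0 msecs; error code 5; "
    "status: 0x201081; request id: 162."
)

if __name__ == "__main__":
    print(parse_partition_event(SAMPLE))
```

Feeding the lines of an exported EMS log through partition_events and listing duration_ms per timestamp shows whether the partitions are becoming longer or more frequent.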

The problem gradually became more severe until node cluster01-02 was stuck in a boot loop with the error "Waiting for requisite number of local mailboxes..".
Node cluster01-01 was used to create a single-node cluster so that the customer could continue serving data while a new cluster is set up as the migration target.