NetApp Knowledgebase

Cluster join fails due to vifmgr process failure

Category: ontap-9
Specialty: core

Applies to

  • ONTAP 9.0
  • ONTAP 9.1

Issue

Joining a new node to an ONTAP 9.0 or 9.1 cluster may fail with the following error message:  

Updating LIF Manager ........................Error: Failed to create Default Broadcast domain. Timeout: Operation "vifmgr_broadcast_domain_perform_cluster_join_iterator::create_imp()" took longer than 25 seconds to complete.
 

To confirm the root cause, check the following:

1. Confirm that the join failed at the update-default-broadcast-domain subtask:

cluster::*> set -privilege diagnostic

cluster::*> debug cluster-join show
Task ID              SubTask ID                      Status  Tries Failures
-------------------- ------------------------------- ------- ----- --------
pre-setup            check-unused-cluster-ports      success    20        0
pre-setup            mtu-check                       success    20        0
pre-setup            ping-local                      success    20        0
pre-setup            ping-remote                     success    20        0
pre-setup            mtu-subnet-test                 success    20        0
pre-setup            rpc-check                       success    20        0
pre-setup            capability-check                success    20        0
network-setup        check-node-mgmt-mtu             success    20        0
network-setup        rename-lifs-nodeuuid            success    20        0
network-setup        relabel-lifs                    success    20        0
network-setup        ping-local2                     success    20        0
network-setup        limit-check                     success    20        0
node-check           check-for-mroot                 success    20        0
node-check           ha-mode-check                   success    20        0
node-check           sfo-partner-check               success    20        0
node-check           platform-check                  success    20        0
node-check           license-check                   success    20        0
node-check           join-switchless                 success    20        0
node-check           get-node-time                   success    20        0
node-check           get-node-name                   success    20        0
node-check           check-node-name                 success    20        0
node-check           resolve-aggr-names              success    20        0
node-check           cluster-ha-check                success    20        0
system-initialize    system-initialize               success    20        0
cluster-join         join-site-list                  success    20        0
cluster-join         wait-for-rdb-online             success    20        0
cluster-join         wait-for-rdb-databases          success    20        0
cluster-join         create_cluster_version_entries  success    20        0
cluster-join         upload-capability               success    20        0
system-startup       system-startup                  success    20        0
check-cluster-apps   vldb                            success    20        0
check-cluster-apps   lifmgr                          success    20        0
check-cluster-apps   bcom                            success    20        0
vldb-update          register-aggregates             success    20        0
vldb-update          register-volumes                success    20        0
vifmgr-update        update-default-broadcast-domain failure    20        1
nonshared-clus-setup nonshared-clus-setup            -           0        0
miscellaneous        rename-lifs-nodename            -           0        0
miscellaneous        get-location                    -           0        0
miscellaneous        register-mgwd-dsmfp-service     -           0        0
miscellaneous        file-replication                -           0        0
miscellaneous        subscribe-host-based-keys       -           0        0
miscellaneous        subscribe-systemshell-ssh-keys  -           0        0
miscellaneous        motd-and-banner-join            -           0        0
miscellaneous        nvfail-setup                    -           0        0
miscellaneous        node-http-config                -           0        0
miscellaneous        upload-licenses-v2              -           0        0
miscellaneous        remove-precluster-cert          -           0        0
miscellaneous        bandwidth-check                 -           0        0
miscellaneous        dummy-task                      -           0        0
finished             finished                        failure    20        2
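
If the "debug cluster-join show" output has been captured to a text file, the failed subtasks can be picked out with a short script. The following is a minimal Python sketch, not a NetApp tool: the function name failed_subtasks is hypothetical, and it assumes each data row consists of exactly five whitespace-separated fields (task, subtask, status, tries, failures), as in the output above.

```python
def failed_subtasks(output):
    """Return (task, subtask, failures) for each row whose status is 'failure'.

    Assumes each data row has exactly five whitespace-separated fields:
    task, subtask, status, tries, failures. Header and separator lines
    are skipped because their last field is not a number.
    """
    failed = []
    for line in output.splitlines():
        fields = line.split()
        if len(fields) != 5 or not fields[4].isdigit():
            continue
        task, subtask, status, _tries, failures = fields
        if status == "failure":
            failed.append((task, subtask, int(failures)))
    return failed


# A few rows copied from the output above.
sample = """\
Task ID              SubTask ID                Status Tries Failures
vldb-update          register-volumes          success       20       0
vifmgr-update update-default-broadcast-domain  failure       20       1
nonshared-clus-setup nonshared-clus-setup         -           0       0
finished             finished                  failure       20       2
"""

print(failed_subtasks(sample))
# [('vifmgr-update', 'update-default-broadcast-domain', 1), ('finished', 'finished', 2)]
```

A result that lists vifmgr-update / update-default-broadcast-domain ahead of the final "finished" row corresponds to the failure pattern described in this article.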

 

2. Search /mroot/etc/log/mlog/vifmgr.log for messages labeled 'fg_update_join' in which the number of seconds the transaction has been held is large and increases from one message to the next:

[src/rdb/TM.cc 2916 (0x8127f0100)]: RW-transaction TID <16,17042,17042> held by client 1022 for 81795 seconds (created: 1995072s, now: 2076867s) (label 'fg_update_join').
[src/rdb/TM.cc 2916 (0x811c0ec00)]: RW-transaction TID <16,17042,17042> held by client 1022 for 81805 seconds (created: 1995072s, now: 2076877s) (label 'fg_update_join').
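
The log check in step 2 can be approximated with a script run against a copy of vifmgr.log. The sketch below is illustrative only: the function name stuck_join_transactions is hypothetical, the regular expression is derived solely from the two sample lines above, and the min_seconds default of 3600 is an arbitrary assumption, since this article does not define how large the value must be.

```python
import re
from collections import defaultdict

# Pattern derived from the sample log lines above (an assumption, not a
# documented log format).
PATTERN = re.compile(
    r"RW-transaction TID <(?P<tid>[\d,]+)> held by client (?P<client>\d+) "
    r"for (?P<held>\d+) seconds.*\(label 'fg_update_join'\)"
)


def stuck_join_transactions(log_text, min_seconds=3600):
    """Map each TID to its held-for durations when they keep increasing.

    min_seconds is an arbitrary threshold chosen for illustration.
    """
    held_by_tid = defaultdict(list)
    for line in log_text.splitlines():
        match = PATTERN.search(line)
        if match:
            held_by_tid[match.group("tid")].append(int(match.group("held")))

    stuck = {}
    for tid, held in held_by_tid.items():
        # Require at least two messages with strictly increasing durations.
        increasing = all(a < b for a, b in zip(held, held[1:]))
        if len(held) >= 2 and increasing and held[-1] >= min_seconds:
            stuck[tid] = held
    return stuck


# The two messages quoted in this article.
sample_log = """\
[src/rdb/TM.cc 2916 (0x8127f0100)]: RW-transaction TID <16,17042,17042> held by client 1022 for 81795 seconds (created: 1995072s, now: 2076867s) (label 'fg_update_join').
[src/rdb/TM.cc 2916 (0x811c0ec00)]: RW-transaction TID <16,17042,17042> held by client 1022 for 81805 seconds (created: 1995072s, now: 2076877s) (label 'fg_update_join').
"""

print(stuck_join_transactions(sample_log))
# {'16,17042,17042': [81795, 81805]}
```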
