NetApp Knowledge Base

mgmt cluster app restarting on all nodes after ONTAP upgrade to 9.10.1

Category:
ontap-9
Specialty:
core
Applies to

  • ONTAP
  • Upgrade from certain versions of ONTAP 9.8 or 9.9 to 9.10.1 or later

Issue

  • The mgmt cluster application restarts continuously, flapping between offline and online on all nodes, after one or more nodes are upgraded to ONTAP 9.10.1:

clus1::*> cluster ring show
Node      UnitName Epoch    DB Epoch DB Trnxs Master    Online
--------- -------- -------- -------- -------- --------- ---------
clus1-01  mgmt     0        1208     45       -         offline
clus1-01  vldb     32       32       525821   clus1-01  master
clus1-01  vifmgr   104      104      889174   clus1-01  master
clus1-01  bcomd    32       32       2879     clus1-01  master
clus1-01  crs      32       32       745      clus1-01  master
clus1-02  mgmt     0        1208     45       -         offline
clus1-02  vldb     32       32       525821   clus1-01  secondary
clus1-02  vifmgr   104      104      889174   clus1-01  secondary
clus1-02  bcomd    32       32       2879     clus1-01  secondary
clus1-02  crs      32       32       745      clus1-01  secondary
clus1-03  mgmt     0        1208     45       -         offline
clus1-03  vldb     32       32       525821   clus1-01  secondary
clus1-03  vifmgr   104      104      889174   clus1-01  secondary
clus1-03  bcomd    32       32       2879     clus1-01  secondary
clus1-03  crs      32       32       745      clus1-01  secondary
clus1-04  mgmt     0        1208     45       -         offline
clus1-04  vldb     32       32       525821   clus1-01  secondary
clus1-04  vifmgr   104      104      889174   clus1-01  secondary
clus1-04  bcomd    32       32       2879     clus1-01  secondary
clus1-04  crs      32       32       745      clus1-01  secondary
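
The pattern in the output above (mgmt offline on every node while the other units keep a master) can be checked mechanically. The following is a minimal sketch, assuming the `cluster ring show` output has already been captured as text; the function names `parse_ring_show` and `units_offline_everywhere` are hypothetical helpers, not part of ONTAP. Because mgmt flaps between offline and online, a single capture may not show the problem, so several captures may be needed.

```python
from collections import defaultdict


def parse_ring_show(text):
    """Return {unit: {node: online_state}} from captured `cluster ring show` text."""
    states = defaultdict(dict)
    for line in text.splitlines():
        fields = line.split()
        # Data rows split into exactly 7 fields. The prompt line and the
        # header row split differently, and the separator row starts with "-".
        if len(fields) != 7 or fields[0].startswith("-"):
            continue
        node, unit, online = fields[0], fields[1], fields[6]
        states[unit][node] = online
    return states


def units_offline_everywhere(states):
    """List the units whose Online column reads 'offline' on every node."""
    return sorted(
        unit
        for unit, nodes in states.items()
        if nodes and all(state == "offline" for state in nodes.values())
    )


# Excerpt of the output shown above (first two nodes, two units each).
sample = """\
clus1::*> cluster ring show
Node      UnitName Epoch    DB Epoch DB Trnxs Master    Online
--------- -------- -------- -------- -------- --------- ---------
clus1-01  mgmt     0        1208     45       -         offline
clus1-01  vldb     32       32       525821   clus1-01  master
clus1-02  mgmt     0        1208     45       -         offline
clus1-02  vldb     32       32       525821   clus1-01  secondary
"""

print(units_offline_everywhere(parse_ring_show(sample)))  # ['mgmt']
```

The parser keys on the field count rather than fixed column offsets, which keeps it tolerant of different node-name lengths; it assumes that no column value contains spaces.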

  • Data continues to be served because the other cluster applications remain online. While mgmt is offline, however, nodes cannot see the status of aggregates on other nodes: aggr show reports those aggregates in the state unknown, and messages like the following may appear:

Info: Node clus1-04 that hosts aggregate aggr1 is offline

  • MGWD logs show an SQL insert error for sp_cap_rdb:
[kern_mgwd:info:2136] 0x828839500: SQL error: "INSERT INTO sp_cap_rdb(rowid, _epoch, _tid, [node], [nodeid], [id], [version]) VALUES (-350277020502054092, 828, 141, 'clus1-04', 'xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx', 33, 2);" UNIQUE constraint failed: sp_cap_rdb.node, sp_cap_rdb.id(19)
[kern_mgwd:info:2136] 0x828839500: 0: ERR: SQL_CONTEXT: execute_sql:src/sql_context.cc:836 SQL: failed on connection 0x81efa7308: UNIQUE constraint failed: sp_cap_rdb.node, sp_cap_rdb.id(19), txn: 'saveTxnChanges:sp_cap_rdb create',active_connection: 0x81efa7308, active_thread: 0x828839500, active_label: 'saveTxnChanges:sp_cap_rdb create', stmt: "INSERT INTO sp_cap_rdb(rowid, _epoch, _tid, [node], [nodeid], [id], [version]) VALUES (-350277020502054092, 828, 141, 'clus1-04', 'xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx', 33, 2);"
[kern_mgwd:info:2136] E [src/rdb/sql_local_unit.cc 5116 (0x828839500)]: saveTxnChanges: failed to execute SQL: 'INSERT INTO sp_cap_rdb(rowid, _epoch, _tid, [node], [nodeid], [id], [version]) VALUES (-350277020502054092, 828, 141, 'clus1-04', 'xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx', 33, 2);'.
[kern_mgwd:info:2136] W [src/rdb/sql_local_unit.cc 5288 (0x828839500)]: saveTxnChanges: abandoning due to INTERNAL_ERROR.
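
To confirm the log signature above across a larger log, the lines can be scanned for the UNIQUE constraint failure on sp_cap_rdb. This is a minimal sketch, assuming the MGWD log lines are already available as an iterable of strings; how the log is collected is outside its scope, and `find_constraint_failures` is a hypothetical helper name. The two sample lines are taken from the excerpt above.

```python
import re

# Matches the constraint failure text, e.g.
# "UNIQUE constraint failed: sp_cap_rdb.node, sp_cap_rdb.id"
UNIQUE_FAIL = re.compile(r"UNIQUE constraint failed: (\w+)\.(\w+), \1\.(\w+)")
# Captures the first quoted value in the INSERT's VALUES list (the node name).
NODE_VALUE = re.compile(r"VALUES \([^']*'([^']+)'")


def find_constraint_failures(lines, table="sp_cap_rdb"):
    """Return (table, node) pairs for lines reporting a UNIQUE failure on `table`."""
    hits = []
    for line in lines:
        match = UNIQUE_FAIL.search(line)
        if not match or match.group(1) != table:
            continue
        node = NODE_VALUE.search(line)
        hits.append((match.group(1), node.group(1) if node else None))
    return hits


sample = [
    '[kern_mgwd:info:2136] 0x828839500: SQL error: "INSERT INTO sp_cap_rdb(rowid, _epoch, _tid, '
    "[node], [nodeid], [id], [version]) VALUES (-350277020502054092, 828, 141, 'clus1-04', "
    "'xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx', 33, 2);\" UNIQUE constraint failed: "
    "sp_cap_rdb.node, sp_cap_rdb.id(19)",
    "[kern_mgwd:info:2136] W [src/rdb/sql_local_unit.cc 5288 (0x828839500)]: "
    "saveTxnChanges: abandoning due to INTERNAL_ERROR.",
]

print(find_constraint_failures(sample))  # [('sp_cap_rdb', 'clus1-04')]
```

Only lines that contain the constraint failure text are reported; the follow-on "abandoning due to INTERNAL_ERROR" line is skipped because it does not name the table.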

 
