NetApp Knowledge Base

MGWD crashing/restarting on all nodes after ONTAP upgrade from 9.8 or 9.9 to 9.10.1 or later

Category: ontap-9
Specialty: CORE

Applies to

  • ONTAP
  • Upgrade from certain versions of ONTAP 9.8 or 9.9 to 9.10.1 or later

Issue

  • The mgmt cluster application (MGWD) restarts constantly and flaps between offline and online on all nodes after one or more nodes are upgraded to ONTAP 9.10.1 or later

clus1::*> cluster ring show
Node      UnitName Epoch    DB Epoch DB Trnxs Master    Online
--------- -------- -------- -------- -------- --------- ---------
clus1-01  mgmt     0        1208     45       -         offline
clus1-01  vldb     32       32       525821   clus1-01  master
clus1-01  vifmgr   104      104      889174   clus1-01  master
clus1-01  bcomd    32       32       2879     clus1-01  master
clus1-01  crs      32       32       745      clus1-01  master
clus1-02  mgmt     0        1208     45       -         offline
clus1-02  vldb     32       32       525821   clus1-01  secondary
clus1-02  vifmgr   104      104      889174   clus1-01  secondary
clus1-02  bcomd    32       32       2879     clus1-01  secondary
clus1-02  crs      32       32       745      clus1-01  secondary
clus1-03  mgmt     0        1208     45       -         offline
clus1-03  vldb     32       32       525821   clus1-01  secondary
clus1-03  vifmgr   104      104      889174   clus1-01  secondary
clus1-03  bcomd    32       32       2879     clus1-01  secondary
clus1-03  crs      32       32       745      clus1-01  secondary
clus1-04  mgmt     0        1208     45       -         offline
clus1-04  vldb     32       32       525821   clus1-01  secondary
clus1-04  vifmgr   104      104      889174   clus1-01  secondary
clus1-04  bcomd    32       32       2879     clus1-01  secondary
clus1-04  crs      32       32       745      clus1-01  secondary
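
When the ring output is long, the affected units can be picked out programmatically. The following is a minimal sketch, assuming the text of `cluster ring show` has already been captured in the seven-column layout shown above. The function names `parse_cluster_ring` and `offline_units` are hypothetical helpers, not part of ONTAP or any NetApp tool.

```python
from collections import defaultdict

# Abbreviated sample in the same layout as the output above.
SAMPLE = """\
clus1::*> cluster ring show
Node      UnitName Epoch    DB Epoch DB Trnxs Master    Online
--------- -------- -------- -------- -------- --------- ---------
clus1-01  mgmt     0        1208     45       -         offline
clus1-01  vldb     32       32       525821   clus1-01  master
clus1-02  mgmt     0        1208     45       -         offline
clus1-02  vldb     32       32       525821   clus1-01  secondary
"""


def parse_cluster_ring(text):
    """Parse captured `cluster ring show` text into a list of row dicts."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        # Data rows have exactly seven whitespace-separated columns.
        # This skips the prompt line and the header (which splits into nine
        # words); the dashed separator is rejected by the digit check below.
        if len(fields) != 7 or not fields[2].isdigit():
            continue
        node, unit, epoch, db_epoch, db_trnxs, master, online = fields
        rows.append({
            "node": node,
            "unit": unit,
            "epoch": int(epoch),
            "db_epoch": int(db_epoch),
            "db_trnxs": int(db_trnxs),
            "master": None if master == "-" else master,
            "online": online,
        })
    return rows


def offline_units(rows):
    """Map each unit name to the nodes on which it is reported offline."""
    result = defaultdict(list)
    for row in rows:
        if row["online"] == "offline":
            result[row["unit"]].append(row["node"])
    return dict(result)


if __name__ == "__main__":
    print(offline_units(parse_cluster_ring(SAMPLE)))
```

Because mgmt flaps between offline and online in this issue, a single capture may show it online; running the check against several captures taken over time gives a more reliable picture.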

  • Data continues to be served because the other cluster applications remain online. While mgmt is offline, however, nodes cannot see the status of aggregates on other nodes (aggr show reports those aggregates in state unknown) and may report messages like:

Info: Node clus1-04 that hosts aggregate aggr1 is offline

  • MGWD logs show a SQL INSERT error on sp_cap_rdb:

[kern_mgwd:info:2136] 0x828839500: SQL error: "INSERT INTO sp_cap_rdb(rowid, _epoch, _tid, [node], [nodeid], [id], [version]) VALUES (-350277020502054092, 828, 141, 'clus1-04', 'xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx', 33, 2);" UNIQUE constraint failed: sp_cap_rdb.node, sp_cap_rdb.id(19)
[kern_mgwd:info:2136] 0x828839500: 0: ERR: SQL_CONTEXT: execute_sql:src/sql_context.cc:836 SQL: failed on connection 0x81efa7308: UNIQUE constraint failed: sp_cap_rdb.node, sp_cap_rdb.id(19), txn: 'saveTxnChanges:sp_cap_rdb create',active_connection: 0x81efa7308, active_thread: 0x828839500, active_label: 'saveTxnChanges:sp_cap_rdb create', stmt: "INSERT INTO sp_cap_rdb(rowid, _epoch, _tid, [node], [nodeid], [id], [version]) VALUES (-350277020502054092, 828, 141, 'clus1-04', 'xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx', 33, 2);"
[kern_mgwd:info:2136] E [src/rdb/sql_local_unit.cc 5116 (0x828839500)]: saveTxnChanges: failed to execute SQL: 'INSERT INTO sp_cap_rdb(rowid, _epoch, _tid, [node], [nodeid], [id], [version]) VALUES (-350277020502054092, 828, 141, 'clus1-04', 'xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx', 33, 2);'.
[kern_mgwd:info:2136] W [src/rdb/sql_local_unit.cc 5288 (0x828839500)]: saveTxnChanges: abandoning due to INTERNAL_ERROR.
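
To confirm that a set of MGWD log lines matches this signature, the constraint failure and the offending row can be extracted with regular expressions. This is a rough sketch, assuming log lines in the format shown above and values that contain no embedded commas or parentheses. The function `find_unique_violations` and both patterns are illustrative assumptions, not a NetApp-provided tool.

```python
import re

# Matches e.g. "UNIQUE constraint failed: sp_cap_rdb.node, sp_cap_rdb.id(19)".
CONSTRAINT_RE = re.compile(r"UNIQUE constraint failed: ([\w., ]+?)\((\d+)\)")
# Matches the INSERT statement and captures table, column list, value list.
INSERT_RE = re.compile(r"INSERT INTO (\w+)\(([^)]*)\) VALUES \(([^)]*)\)")

# First log line from the excerpt above, split across literals for width.
SAMPLE_LINE = (
    '[kern_mgwd:info:2136] 0x828839500: SQL error: "INSERT INTO sp_cap_rdb('
    'rowid, _epoch, _tid, [node], [nodeid], [id], [version]) VALUES ('
    "-350277020502054092, 828, 141, 'clus1-04', "
    "'xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx', 33, 2);\" "
    "UNIQUE constraint failed: sp_cap_rdb.node, sp_cap_rdb.id(19)"
)


def find_unique_violations(lines):
    """Return one dict per line that has both an INSERT and a UNIQUE failure."""
    hits = []
    for line in lines:
        constraint = CONSTRAINT_RE.search(line)
        insert = INSERT_RE.search(line)
        if not (constraint and insert):
            continue
        # Strip the [brackets] around column names and the quotes around values.
        columns = [c.strip(" []") for c in insert.group(2).split(",")]
        values = [v.strip(" '") for v in insert.group(3).split(",")]
        row = dict(zip(columns, values))
        hits.append({
            "table": insert.group(1),
            "constraint_columns": [
                c.strip() for c in constraint.group(1).split(",")
            ],
            # Trailing number in parentheses, kept exactly as logged; the
            # article does not state its meaning.
            "code": int(constraint.group(2)),
            "node": row.get("node"),
            "id": row.get("id"),
        })
    return hits


if __name__ == "__main__":
    for hit in find_unique_violations([SAMPLE_LINE]):
        print(hit)
```

Lines that carry only the INSERT statement (the `saveTxnChanges: failed to execute SQL` entry) or neither pattern are skipped, so each failure is reported once per line that names the violated constraint.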

 

 

 
