MGWD crashing/restarting on all nodes after ONTAP upgrade from 9.8 or 9.9 to 9.10.1 or later

Applies to

  • Upgrade from certain versions of ONTAP 9.8 or 9.9 to 9.10.1 or later


  • The mgmt cluster application restarts constantly and flaps between offline and online on all nodes after one or more nodes are upgraded to ONTAP 9.10.1, as seen in cluster ring show:

clus1::*> cluster ring show
Node      UnitName Epoch    DB Epoch DB Trnxs Master    Online
--------- -------- -------- -------- -------- --------- ---------
clus1-01  mgmt     0        1208     45       -         offline
clus1-01  vldb     32       32       525821   clus1-01  master
clus1-01  vifmgr   104      104      889174   clus1-01  master
clus1-01  bcomd    32       32       2879     clus1-01  master
clus1-01  crs      32       32       745      clus1-01  master
clus1-02  mgmt     0        1208     45       -         offline
clus1-02  vldb     32       32       525821   clus1-01  secondary
clus1-02  vifmgr   104      104      889174   clus1-01  secondary
clus1-02  bcomd    32       32       2879     clus1-01  secondary
clus1-02  crs      32       32       745      clus1-01  secondary
clus1-03  mgmt     0        1208     45       -         offline
clus1-03  vldb     32       32       525821   clus1-01  secondary
clus1-03  vifmgr   104      104      889174   clus1-01  secondary
clus1-03  bcomd    32       32       2879     clus1-01  secondary
clus1-03  crs      32       32       745      clus1-01  secondary
clus1-04  mgmt     0        1208     45       -         offline
clus1-04  vldb     32       32       525821   clus1-01  secondary
clus1-04  vifmgr   104      104      889174   clus1-01  secondary
clus1-04  bcomd    32       32       2879     clus1-01  secondary
clus1-04  crs      32       32       745      clus1-01  secondary
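
The check above can be scripted when the ring output has been captured to text. The following is a minimal sketch only: it assumes the seven-column layout shown in the output above, and the function name offline_units and the sample text are hypothetical, not part of ONTAP.

```python
def offline_units(ring_output, unit="mgmt"):
    """Return the nodes whose given ring unit reports 'offline'.

    Assumes the 'cluster ring show' layout shown above, where each data
    row has exactly seven whitespace-separated columns:
    Node, UnitName, Epoch, DB Epoch, DB Trnxs, Master, Online.
    """
    nodes = []
    for line in ring_output.splitlines():
        fields = line.split()
        # The prompt line and the header line do not split into seven
        # fields; the dashed separator does, but its unit name never matches.
        if len(fields) != 7:
            continue
        node, unit_name, online = fields[0], fields[1], fields[6]
        if unit_name == unit and online == "offline":
            nodes.append(node)
    return nodes


# Hypothetical captured output, abbreviated from the example above.
SAMPLE = """\
clus1::*> cluster ring show
Node      UnitName Epoch    DB Epoch DB Trnxs Master    Online
--------- -------- -------- -------- -------- --------- ---------
clus1-01  mgmt     0        1208     45       -         offline
clus1-01  vldb     32       32       525821   clus1-01  master
clus1-02  mgmt     0        1208     45       -         offline
clus1-02  vldb     32       32       525821   clus1-01  secondary
"""

if __name__ == "__main__":
    print(offline_units(SAMPLE))
```

If every node in the cluster is returned for mgmt while the other units still report master or secondary, the output matches the symptom described here.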

  • Data continues to be served because the other cluster applications remain online. However, while mgmt is offline, nodes cannot see the status of aggregates on other nodes (aggr show reports those aggregates in state unknown) and may report messages like:

Info: Node clus1-04 that hosts aggregate aggr1 is offline

  • MGWD logs show a SQL insert error for sp_cap_rdb:
[kern_mgwd:info:2136] 0x828839500: SQL error: "INSERT INTO sp_cap_rdb(rowid, _epoch, _tid, [node], [nodeid], [id], [version]) VALUES (-350277020502054092, 828, 141, 'clus1-04', 'xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx', 33, 2);" UNIQUE constraint failed: sp_cap_rdb.node,
[kern_mgwd:info:2136] 0x828839500: 0: ERR: SQL_CONTEXT: execute_sql:src/ SQL: failed on connection 0x81efa7308: UNIQUE constraint failed: sp_cap_rdb.node,, txn: 'saveTxnChanges:sp_cap_rdb create',active_connection: 0x81efa7308, active_thread: 0x828839500, active_label: 'saveTxnChanges:sp_cap_rdb create', stmt: "INSERT INTO sp_cap_rdb(rowid, _epoch, _tid, [node], [nodeid], [id], [version]) VALUES (-350277020502054092, 828, 141, 'clus1-04', 'xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx', 33, 2);"
[kern_mgwd:info:2136] E [src/rdb/ 5116 (0x828839500)]: saveTxnChanges: failed to execute SQL: 'INSERT INTO sp_cap_rdb(rowid, _epoch, _tid, [node], [nodeid], [id], [version]) VALUES (-350277020502054092, 828, 141, 'clus1-04', 'xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx', 33, 2);'.
[kern_mgwd:info:2136] W [src/rdb/ 5288 (0x828839500)]: saveTxnChanges: abandoning due to INTERNAL_ERROR.
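
To look for this signature in collected MGWD log text, a simple pattern match is usually enough. The sketch below is illustrative only: the function name find_constraint_failures is hypothetical, the log text is assumed to have been read into a string already, and the pattern is based solely on the messages quoted above.

```python
import re

# Matches the "UNIQUE constraint failed: <table>.<column>" text seen in
# the log lines above and captures the table and column names.
CONSTRAINT_RE = re.compile(r"UNIQUE constraint failed: (\w+)\.(\w+)")


def find_constraint_failures(log_text, table="sp_cap_rdb"):
    """Return (table, column) pairs for each log line reporting a
    UNIQUE constraint failure on the given table."""
    hits = []
    for line in log_text.splitlines():
        match = CONSTRAINT_RE.search(line)
        if match and match.group(1) == table:
            hits.append((match.group(1), match.group(2)))
    return hits


# Shortened sample lines modeled on the log excerpt above.
SAMPLE_LOG = (
    '[kern_mgwd:info:2136] 0x828839500: SQL error: "INSERT INTO sp_cap_rdb(...)" '
    "UNIQUE constraint failed: sp_cap_rdb.node,\n"
    "[kern_mgwd:info:2136] W [src/rdb/ 5288 (0x828839500)]: "
    "saveTxnChanges: abandoning due to INTERNAL_ERROR.\n"
)

if __name__ == "__main__":
    print(find_constraint_failures(SAMPLE_LOG))
```

A non-empty result for sp_cap_rdb, together with mgmt being offline in the ring output, is consistent with the symptoms listed in this article.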



