Node is out of CLAM quorum due to cluster network disruption
Applies to
- ONTAP 9
- Broadcom BES-53248 switch purchased from NetApp
Issue
- A CLAM panic occurs on one of the nodes.
PANIC : Received PANIC packet from partner, receiving message is (Coredump and takeover initiated because Connectivity, Liveliness and Availability Monitor (CLAM) has determined this node is out of quorum.)
- Cluster ports on the panicked node (FAS8200) connect to the BES-53248 switches at 10G speed
- Cluster ports on the FAS8700 nodes connect to the BES-53248 switches at 100G speed
- Numerous vifmgr.cluscheck.ctdpktloss alerts are observed before the panic:
[?] Mon Aug 22 12:49:18 +0800 [PERPIS-02: vifmgr: vifmgr.cluscheck.ctdpktloss:alert]: Continued packet loss when pinging from cluster lif PERPIS-02_clus1 (node PERPIS-02) to cluster lif PERPIS-01_clus1 (node PERPIS-01).
[?] Mon Aug 22 12:49:40 +0800 [PERPIS-02: vifmgr: vifmgr.cluscheck.ctdpktloss:alert]: Continued packet loss when pinging from cluster lif PERPIS-02_clus2 (node PERPIS-02) to cluster lif PERPIS-13_clus1 (node PERPIS-10).
[?] Mon Aug 22 12:50:46 +0800 [PERPIS-02: vifmgr: vifmgr.cluscheck.ctdpktloss:alert]: Continued packet loss when pinging from cluster lif PERPIS-02_clus2 (node PERPIS-02) to cluster lif PERPIS-12_clus2 (node PERPIS-09).
[?] Mon Aug 22 12:51:08 +0800 [PERPIS-02: vifmgr: vifmgr.cluscheck.ctdpktloss:alert]: Continued packet loss when pinging from cluster lif PERPIS-02_clus2 (node PERPIS-02) to cluster lif PERPIS-11_clus1 (node PERPIS-08).
[?] Mon Aug 22 12:51:30 +0800 [PERPIS-02: vifmgr: vifmgr.cluscheck.ctdpktloss:alert]: Continued packet loss when pinging from cluster lif PERPIS-02_clus2 (node PERPIS-02) to cluster lif PERPIS-11_clus2 (node PERPIS-08).
[?] Mon Aug 22 12:51:52 +0800 [PERPIS-02: vifmgr: vifmgr.cluscheck.ctdpktloss:alert]: Continued packet loss when pinging from cluster lif PERPIS-02_clus2 (node PERPIS-02) to cluster lif PERPIS-10_clus1 (node PERPIS-07).
[?] Mon Aug 22 12:52:36 +0800 [PERPIS-02: vifmgr: vifmgr.cluscheck.ctdpktloss:alert]: Continued packet loss when pinging from cluster lif PERPIS-02_clus2 (node PERPIS-02) to cluster lif PERPIS-04_clus1 (node PERPIS-04).
[?] Mon Aug 22 12:53:42 +0800 [PERPIS-02: vifmgr: vifmgr.cluscheck.ctdpktloss:alert]: Continued packet loss when pinging from cluster lif PERPIS-02_clus2 (node PERPIS-02) to cluster lif PERPIS-03_clus2 (node PERPIS-03).
[?] Mon Aug 22 12:54:26 +0800 [PERPIS-02: vifmgr: vifmgr.cluscheck.ctdpktloss:alert]: Continued packet loss when pinging from cluster lif PERPIS-02_clus2 (node PERPIS-02) to cluster lif PERPIS-06_clus2 (node PERPIS-06).
[?] Mon Aug 22 12:54:49 +0800 [PERPIS-02: vifmgr: vifmgr.cluscheck.ctdpktloss:alert]: Continued packet loss when pinging from cluster lif PERPIS-02_clus2 (node PERPIS-02) to cluster lif PERPIS-05_clus1 (node PERPIS-05).
- Switch ports associated with the FAS8200 nodes show a large number of OutDropPkts and InDropPkts:
Port             OutOctets     OutUcastPkts     OutMcastPkts     OutBcastPkts      OutDropPkts         Tx Error
--------- ---------------- ---------------- ---------------- ---------------- ---------------- ----------------
0/1        135350867555820      30454105411         10640954          1174996             8160                0
0/2       1115535865614836     153716964951         10640974          1155040           244502                0
0/3        555253727068245     114123567719         10640914          1171842            19740                0
0/4        110609817174084      27294584348         10640974          1175446                7                0
0/5        324792975891383      52311649051         10640974          1175338            14560                0
0/6          4499554844182       7764219226         10640974          1176497                0                0
Port              InOctets      InUcastPkts      InMcastPkts      InBcastPkts       InDropPkts         Rx Error
--------- ---------------- ---------------- ---------------- ---------------- ---------------- ----------------
0/1        184812820859850      31934228208           760131            26093                0                0
0/2        236905412197034      81900354370           760111            46049                1                0
0/3        828708576199875     126752545891           760107            29233           167068                0
0/4        259526455231231      38692873350           760111            25643           124437                0
0/5        200243275302091      40098210067           760110            25751                0                0
0/6         82902559324186      12919563869           760110            24592                0                0
- The BES-53248 switches use the default QoS settings applied by RCF 1.6
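
When many alerts and counters have to be reviewed, the symptoms above can be triaged offline with a small script. The following is a minimal sketch, not a NetApp or Broadcom tool: the function names `count_alerts_by_source_lif` and `ports_with_drops` are hypothetical, and it assumes the EMS messages and switch counter rows have been saved as plain text in the same layout shown in this article (whitespace-separated columns, with the drop counter in the sixth column).

```python
import re
from collections import Counter

# Matches the source and destination cluster LIFs in a ctdpktloss EMS message.
# Assumption: the message text follows the layout shown in this article.
ALERT_RE = re.compile(
    r"vifmgr\.cluscheck\.ctdpktloss:alert\]: Continued packet loss when pinging "
    r"from cluster lif (?P<src>\S+) \(node (?P<src_node>[^)]+)\) "
    r"to cluster lif (?P<dst>\S+) \(node (?P<dst_node>[^)]+)\)"
)

# Data rows start with a slot/port identifier such as 0/3.
PORT_RE = re.compile(r"\d+/\d+")


def count_alerts_by_source_lif(log_lines):
    """Count ctdpktloss alerts per source cluster LIF."""
    counts = Counter()
    for line in log_lines:
        match = ALERT_RE.search(line)
        if match:
            counts[match.group("src")] += 1
    return counts


def ports_with_drops(table_lines, threshold=0):
    """Return {port: drop_count} for rows whose drop counter exceeds threshold.

    Assumption: each data row has seven whitespace-separated fields:
    port, octets, unicast, multicast, broadcast, drops, errors.
    Header and separator rows are skipped.
    """
    drops = {}
    for line in table_lines:
        fields = line.split()
        if len(fields) == 7 and PORT_RE.fullmatch(fields[0]):
            count = int(fields[5])
            if count > threshold:
                drops[fields[0]] = count
    return drops


if __name__ == "__main__":
    sample_rows = [
        "0/2 236905412197034 81900354370 760111 46049 1 0",
        "0/3 828708576199875 126752545891 760107 29233 167068 0",
        "0/5 200243275302091 40098210067 760110 25751 0 0",
    ]
    # Report only ports with more than 1000 dropped packets.
    print(ports_with_drops(sample_rows, threshold=1000))
```

Running the counter check separately against the OutDropPkts and InDropPkts tables, and comparing the result with the LIFs that raise the most alerts, helps confirm whether the drops are concentrated on the 10G ports that serve the FAS8200 nodes.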