Node panics and no longer joins the cluster on reboot
Applies to
- FAS2750 / AFF-A220
- AFF-A250
- MCC-IP (MetroCluster IP)
- MetroCluster Switch RCF upgrade
Issue
- During an RCF upgrade on the switches in a MetroCluster IP (MCC-IP) configuration, one controller experiences a CLAM panic:
Aug 01 17:40:16 [node01:vifmgr.clus.linkdown:EMERGENCY]: The cluster port e0a on node Node-01 has gone down unexpectedly.
Aug 01 17:47:34 [node01:vifmgr.clus.linkdown:EMERGENCY]: The cluster port e0a on node Node-01 has gone down unexpectedly.
Aug 01 17:52:10 [node01:vifmgr.clus.linkdown:EMERGENCY]: The cluster port e0a on node Node-01 has gone down unexpectedly.
Aug 01 18:21:34 [node01:vifmgr.clus.linkdown:EMERGENCY]: The cluster port e0b on node Node-01 has gone down unexpectedly.
PANIC : Received PANIC packet from partner, receiving message is (Coredump and takeover initiated because Connectivity, Liveliness and Availability Monitor (CLAM) has determined this node is out of quorum.)
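When triaging a longer EMS log, it can help to tally the link-down events per cluster port. The following is a minimal sketch, not NetApp tooling: the function name count_linkdown is hypothetical, and the pattern assumes the message format shown in the excerpt above.

```python
import re
from collections import Counter

# Sample lines copied from the EMS excerpt in this article.
LOG = """\
Aug 01 17:40:16 [node01:vifmgr.clus.linkdown:EMERGENCY]: The cluster port e0a on node Node-01 has gone down unexpectedly.
Aug 01 17:47:34 [node01:vifmgr.clus.linkdown:EMERGENCY]: The cluster port e0a on node Node-01 has gone down unexpectedly.
Aug 01 17:52:10 [node01:vifmgr.clus.linkdown:EMERGENCY]: The cluster port e0a on node Node-01 has gone down unexpectedly.
Aug 01 18:21:34 [node01:vifmgr.clus.linkdown:EMERGENCY]: The cluster port e0b on node Node-01 has gone down unexpectedly.
"""

# Assumed format: [<node>:vifmgr.clus.linkdown:<SEVERITY>]: The cluster port <port> on node ...
LINKDOWN = re.compile(
    r"\[(?P<node>[^:\]]+):vifmgr\.clus\.linkdown:[A-Z]+\]: "
    r"The cluster port (?P<port>\S+) on node"
)

def count_linkdown(text):
    """Return a Counter mapping (node, port) to the number of link-down events."""
    return Counter((m["node"], m["port"]) for m in LINKDOWN.finditer(text))

if __name__ == "__main__":
    for (node, port), n in sorted(count_linkdown(LOG).items()):
        print(f"{node} {port}: {n} link-down event(s)")
```

Because the pattern is applied with finditer, it also works when several messages have been pasted onto a single line.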
- After the reboot, cluster ports e0a and e0b are up, but the nodes are not healthy
::*> network port show -role cluster
                                      Auto-Negot  Duplex     Speed (Mbps)
Node   Port   Role         Link   MTU Admin/Oper  Admin/Oper Admin/Oper
------ ------ ------------ ---- ----- ----------- ---------- ------------
node01
       e0a    cluster      up    9000  true/true  full/full   auto/10000
       e0b    cluster      up    9000  true/true  full/full   auto/10000
::> cluster show
Node                 Health  Eligibility   Epsilon
-------------------- ------- ------------  ------------
node01                 false   true          false
node02                 false   true          false
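To check this condition across saved command output, the Health column can be read programmatically. This is a sketch under stated assumptions: unhealthy_nodes is a hypothetical helper, and it assumes the four-column cluster show layout shown above (Node, Health, Eligibility, Epsilon).

```python
# Sample output copied from the "cluster show" listing in this article.
CLUSTER_SHOW = """\
Node                 Health  Eligibility   Epsilon
-------------------- ------- ------------  ------------
node01                 false   true          false
node02                 false   true          false
"""

def unhealthy_nodes(output):
    """Return the names of nodes whose Health column reads 'false'."""
    nodes = []
    for line in output.splitlines():
        fields = line.split()
        # Data rows have four fields and a true/false Health value;
        # the header and separator rows fail this check and are skipped.
        if len(fields) == 4 and fields[1] in ("true", "false"):
            if fields[1] == "false":
                nodes.append(fields[0])
    return nodes

if __name__ == "__main__":
    print(unhealthy_nodes(CLUSTER_SHOW))
```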
 
- storage failover show reports that the node has not started its cluster applications
::> storage failover show
                              Takeover
Node           Partner        Possible State Description
-------------- -------------- -------- -------------------------------------
node01
               node02         true     Connected to node02
node02
               node01         true     Connected to node01.
                                       Waiting for cluster applications to
                                       come online on the local node.
                                       Offline applications: vldb, vifmgr,
                                       bcomd, crs, scsi blade, clam.
2 entries were displayed.
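The list of offline applications wraps across several lines in the State Description column, which makes it awkward to search for. A minimal sketch for pulling it out of captured output follows. The helper name offline_applications is hypothetical, and the code assumes the list starts with "Offline applications:" and ends at the next period, as in the output above.

```python
import re

# Sample text copied from the "storage failover show" output in this article.
FAILOVER_SHOW = """\
node02
               node01         true     Connected to node01.
                                       Waiting for cluster applications to
                                       come online on the local node.
                                       Offline applications: vldb, vifmgr,
                                       bcomd, crs, scsi blade, clam.
"""

def offline_applications(output):
    """Return the offline application names, or an empty list if none are reported."""
    # DOTALL lets the match span the wrapped lines up to the closing period.
    match = re.search(r"Offline applications:(.*?)\.", output, re.DOTALL)
    if not match:
        return []
    # Collapse the wrap whitespace inside each comma-separated name.
    return [" ".join(name.split()) for name in match.group(1).split(",") if name.strip()]

if __name__ == "__main__":
    print(offline_applications(FAILOVER_SHOW))
```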
