Panic due to Cluster ports e0a/e0b going down
Applies to
Issue
- A node panics after losing cluster network communication when cluster ports e0a and e0b go down
- The panic message observed is:
Panic_Message: prod/common/wafl/boot.c:3102: Assertion failure. in SK process shutdown_thread0
- Following the panic, the HA partner performs a takeover. The affected node remains reachable via the HA interconnect (HA-IC) but is not accessible through the cluster network:
::> storage failover show
                              Takeover
Node           Partner        Possible State Description
-------------- -------------- -------- -------------------------------------
Node-01        Node-02        -        Up. Node accessible via HA-IC, but
                                       cluster access failed
Node-02        Node-01        true     Connected to Node-01. Waiting for
                                       cluster applications to come online
                                       on the local node. Offline
                                       applications: mgmt, vldb, vifmgr,
                                       bcomd, crs, scsi blade, clam.
- The node is reachable via the SP, but both cluster ports (e0a/e0b) report no carrier, indicating loss of physical cluster connectivity:
run local -c ifconfig
e0a: flags=8863<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 9000
        options=4ec07bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,TSO6,LRO,VLAN_HWTSO,LINKSTATE,RXCSUM_IPV6,TXCSUM_IPV6,NOMAP>
        ether 00:a0:98:e4:2a:df
        inet 169.254.64.88 netmask 0xffff0000 broadcast 169.254.255.255
        Services: 0x0000000000000002 Vserver ID: -3
        media: Ethernet autoselect (autoselect <full-duplex,rxpause,txpause>)
        status: no carrier <---
e0b: flags=8863<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 9000
        options=4ec07bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,TSO6,LRO,VLAN_HWTSO,LINKSTATE,RXCSUM_IPV6,TXCSUM_IPV6,NOMAP>
        ether 00:a0:98:e4:2a:e0
        inet 169.254.39.109 netmask 0xffff0000 broadcast 169.254.255.255
        Services: 0x0000000000000002 Vserver ID: -3
        media: Ethernet autoselect (autoselect <full-duplex,rxpause,txpause>)
        status: no carrier <---
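
When ifconfig output has been captured to a text file (for example from an SP console session), the "no carrier" check on the cluster ports can be scripted. The following is a minimal sketch in Python, not an ONTAP tool. The function names link_status and ports_without_carrier are hypothetical, the sample text is abbreviated from the output above, and the parser assumes each interface block starts with "<name>: flags=" and contains a "status:" field.

```python
import re

# Start of an interface block, e.g. "e0a: flags=8863<...>".
IFACE_RE = re.compile(r"\b(\w+): flags=")
# Status value up to end of line, ignoring an optional "<---" annotation.
STATUS_RE = re.compile(r"status:\s*(.*?)\s*(?:<-+\s*)?$", re.MULTILINE)


def link_status(ifconfig_text):
    """Return {interface_name: status} parsed from ifconfig-style text."""
    starts = list(IFACE_RE.finditer(ifconfig_text))
    result = {}
    for i, match in enumerate(starts):
        # A block runs from this interface header to the next one (or to the end).
        end = starts[i + 1].start() if i + 1 < len(starts) else len(ifconfig_text)
        block = ifconfig_text[match.start():end]
        status = STATUS_RE.search(block)
        # Strip a trailing period or space left over from pasted output.
        result[match.group(1)] = status.group(1).rstrip(". ") if status else "unknown"
    return result


def ports_without_carrier(ifconfig_text, ports=("e0a", "e0b")):
    """Return the listed ports whose status is exactly 'no carrier'."""
    status = link_status(ifconfig_text)
    return [port for port in ports if status.get(port) == "no carrier"]


# Abbreviated sample based on the output shown above (assumption: saved as text).
sample = (
    "e0a: flags=8863<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 9000 "
    "ether 00:a0:98:e4:2a:df media: Ethernet autoselect status: no carrier. <---\n"
    "e0b: flags=8863<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 9000 "
    "ether 00:a0:98:e4:2a:e0 media: Ethernet autoselect status: no carrier <---\n"
)

print(ports_without_carrier(sample))
```

If both e0a and e0b are returned, the node has lost physical link on every cluster port, which matches the symptom described in this article. The parser accepts both the one-line-per-interface and the multi-line form of the output.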
