Switchless Cluster: e0b Port MTU Mismatch and Packet Loss
Applies to
- FAS2720
- FAS2750
- Switchless Cluster Configuration
Issue
- Cluster Network Degraded alert:
HA Group Notification from Node-01 (CLUSTER NETWORK DEGRADED) ALERT
- Frequent e0b port errors: Identified on node-01 and node-02. EMS reports the following errors.
node-01:
[node-01: vifmgr: vifmgr.reach.noreach:notice]: Network port e0b on node node-01 cannot reach its expected broadcast domain Cluster:Cluster. No other broadcast domains appear to be reachable from this port.
[node-01: vifmgr: vifmgr.reach.ok:notice]: Network port e0b on node node-01 can reach its expected broadcast domain Cluster:Cluster. No other broadcast domains appear to be reachable from this port.
node-02:
[node-02: vifmgr: vifmgr.cluscheck.droppedlarge:alert]: Partial packet loss when pinging from cluster lif node-02_clus2 (node node-02) to cluster lif node-01_clus1 (node node-01).
[node-02: vifmgr: callhome.clus.net.degraded:alert]: Call home for CLUSTER NETWORK DEGRADED: Large MTU Packet Loss - Ping failures detected between node-02_clus2 ( ***.***.***.187 ) on node-02 and node-01_clus1 ( ***.***.***.047 ) on node-01.
- The ifstat output shows no hardware errors on the two cluster ports.
Cluster ports (e0a and e0b):
Total errors: 0
Total discards: 0
CRC errors: 0
- The sysconfig -a logs for both nodes show that all cluster ports are in the up state.
slot 0: 10 Gigabit Ethernet Controller IX5-SFP+
e0a MAC Address: **** (auto-10g_twinax-fd-up)
SFP Vendor: Amphenol
SFP Part Number: 616310000
SFP Serial Number: ****
e0b MAC Address: **** (auto-10g_twinax-fd-up)
SFP Vendor: Amphenol
SFP Part Number: 616310000
SFP Serial Number: ****
Device Type: X550EM
Firmware Version: 1.13-7.
- Replacing the SFP cable on e0b did not resolve the issue.
- Partial packet loss: Detected during internal cluster pings
::*> cluster ping-cluster -node node-01
Host is node-01
Getting addresses from network interface table...
Cluster node-01_clus1 ***.***.***.047 node-01 e0a
Cluster node-01_clus2 ***.***.***.035 node-01 e0b
Cluster node-02_clus1 ***.***.***.085 node-02 e0a
Cluster node-02_clus2 ***.***.***.187 node-02 e0b
Local = ***.***.***.047 ***.***.***.035
Remote = ***.***.***.085 ***.***.***.187
Cluster Vserver Id = 4294967293
Ping status:
....
Basic connectivity succeeds on 4 path(s)
Basic connectivity fails on 0 path(s)
................
Detected 9000 byte MTU on 3 path(s):
Local ***.***.***.035 to Remote ***.***.***.085
Local ***.***.***.047 to Remote ***.***.***.085
Local ***.***.***.047 to Remote ***.***.***.187
Detected 4500 byte MTU on 1 path(s): <--- Detected as MTU 4500. Note: the reverse path, Local ***.***.***.187 to Remote ***.***.***.035, is detected as MTU 9000.
Local ***.***.***.035 to Remote ***.***.***.187
Larger than PMTU communication succeeds on 4 path(s)
RPC status:
2 paths up, 0 paths down (tcp check)
2 paths up, 0 paths down (udp check)
- On the e0b-to-e0b path, the MTU is detected as 4500 in the node-01 to node-02 direction during intra-cluster communication.
In the reverse direction (node-02 to node-01), the MTU is correctly recognized as 9000.
***.***.***.187 (node-02's e0b) → ***.***.***.035 (node-01's e0b) has an MTU of 9000
***.***.***.035 (node-01's e0b) → ***.***.***.187 (node-02's e0b) has an MTU of 4500
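
The asymmetric MTU finding above can be spotted quickly by scanning saved `cluster ping-cluster` output for paths reported below the expected 9000 bytes. The following is a minimal Python sketch, not a NetApp tool: the function name `find_reduced_mtu_paths`, the `expected_mtu` parameter, and the 192.0.2.x addresses in the sample are hypothetical placeholders, and the parser assumes the output layout shown in this article ("Detected N byte MTU on K path(s):" followed by "Local ... to Remote ..." lines).

```python
import re

# Section headers such as "Detected 9000 byte MTU on 3 path(s):".
MTU_HEADER = re.compile(r"Detected (\d+) byte MTU on (\d+) path\(s\):")
# Path lines such as "Local <addr> to Remote <addr>".
PATH_LINE = re.compile(r"^\s*Local (\S+) to Remote (\S+)\s*$")


def find_reduced_mtu_paths(output, expected_mtu=9000):
    """Return (local, remote, mtu) tuples for paths detected below expected_mtu."""
    reduced = []
    current_mtu = None
    for line in output.splitlines():
        header = MTU_HEADER.search(line)
        if header:
            # Start of a new "Detected N byte MTU" section.
            current_mtu = int(header.group(1))
            continue
        path = PATH_LINE.match(line)
        if path and current_mtu is not None:
            if current_mtu < expected_mtu:
                reduced.append((path.group(1), path.group(2), current_mtu))
        else:
            # Any other line ends the current MTU section.
            current_mtu = None
    return reduced


# Hypothetical sample modeled on the output above; addresses are placeholders.
SAMPLE = """\
Detected 9000 byte MTU on 3 path(s):
    Local 192.0.2.35 to Remote 192.0.2.85
    Local 192.0.2.47 to Remote 192.0.2.85
    Local 192.0.2.47 to Remote 192.0.2.187
Detected 4500 byte MTU on 1 path(s):
    Local 192.0.2.35 to Remote 192.0.2.187
Larger than PMTU communication succeeds on 4 path(s)
"""

if __name__ == "__main__":
    for local, remote, mtu in find_reduced_mtu_paths(SAMPLE):
        print(f"Reduced MTU {mtu}: {local} -> {remote}")
```

Running the same check against output collected from each node in turn makes a one-directional reduction, such as the e0b path here, easier to see.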
