Brocade switch port flaps due to faulty cable
Applies to
Brocade switch
Issue
- Switch port state is Online under switchshow:
/fabos/bin/switchshow :
Index Slot Port Address Media Speed State Proto
============================================================
129 4 33 708100 id N32 Online FC F-Port 10:00:00:10:9b:xx:xx:xx
- Under porterrshow, multiple link failure and loss sync errors are reported, along with other media errors such as enc out, crc err, crc g_eof, c3timeout tx, pcs and uncor errors:
/fabos/cliexec/porterrshow :
frames enc crc crc too too bad enc disc link loss loss frjt fbsy c3timeout pcs uncor
tx rx in err g_eof shrt long eof out c3 fail sync sig tx rx err err
100: 14.1k 5.8k 2.2k 964 932 0 0 32 3.9g 6 12 0 4.4k 0 0 0 0 0 0
/fabos/cliexec/porterrshow :
frames enc crc crc too too bad enc disc link loss loss frjt fbsy c3timeout pcs uncor
tx rx in err g_eof shrt long eof out c3 fail sync sig tx rx err err
129: 20.6k 20.7k 0 0 0 0 0 0 0 9 2.5k 0 7.2k 0 0 9 0 1.1k 26.0k
/fabos/cliexec/porterrshow:
frames enc crc crc too too bad enc disc link loss loss frjt fbsy c3timeout pcs uncor
tx rx in err g_eof shrt long eof out c3 fail sync sig tx rx err err
37: 2.3g 6.4g 0 0 0 0 0 0 0 53 2 0 4 0 0 53 0 0 0
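The porterrshow rows above can be hard to read once the column alignment is lost. The following is a minimal sketch that maps one data row onto the header columns and lists the nonzero link-level counters. The column labels in `COLUMNS` are illustrative names derived by joining the two header rows, and the k/m/g suffixes are assumed to be decimal multipliers; neither is taken from a Brocade API.

```python
# Illustrative labels built from the two porterrshow header rows shown above.
COLUMNS = [
    "frames_tx", "frames_rx", "enc_in", "crc_err", "crc_g_eof",
    "too_shrt", "too_long", "bad_eof", "enc_out", "disc_c3",
    "link_fail", "loss_sync", "loss_sig", "frjt", "fbsy",
    "c3timeout_tx", "c3timeout_rx", "pcs_err", "uncor_err",
]

# Assumption: suffixes abbreviate thousands, millions and billions.
SUFFIXES = {"k": 1e3, "m": 1e6, "g": 1e9}


def parse_counter(text):
    """Convert a counter such as '4.4k' or '964' to an approximate integer."""
    text = text.strip().lower()
    if text and text[-1] in SUFFIXES:
        return int(round(float(text[:-1]) * SUFFIXES[text[-1]]))
    return int(text)


def parse_porterrshow_row(line):
    """Return (port_index, {column: value}) for one porterrshow data row."""
    port, _, rest = line.partition(":")
    values = [parse_counter(v) for v in rest.split()]
    if len(values) != len(COLUMNS):
        raise ValueError(f"expected {len(COLUMNS)} counters, got {len(values)}")
    return int(port), dict(zip(COLUMNS, values))


def link_level_errors(counters):
    """Return the nonzero counters that this article calls out."""
    watched = ("enc_out", "crc_err", "crc_g_eof", "link_fail", "loss_sync",
               "loss_sig", "c3timeout_tx", "pcs_err", "uncor_err")
    return {name: counters[name] for name in watched if counters[name] > 0}


row = "129: 20.6k 20.7k 0 0 0 0 0 0 0 9 2.5k 0 7.2k 0 0 9 0 1.1k 26.0k"
port, counters = parse_porterrshow_row(row)
print(port, link_level_errors(counters))
```

Because abbreviated counters are rounded by the switch, the parsed values are approximations and are only suitable for spotting which counters are incrementing.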
- sfpshow reports that the RX power is too low:
RX Power: -15.1 dBm (30.7 uW)
TX Power: -1.7 dBm (678.9 uW)
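As a quick sanity check on the sfpshow readings, the dBm and uW figures can be converted into each other with the standard relation 0 dBm = 1 mW. The sketch below is illustrative only; the 63 uW limit is taken from the MAPS condition `RXP<=63` quoted in the errdump output later in this article, and it is an assumption that the same limit applies to this SFP.

```python
import math


def dbm_to_uw(dbm):
    """Convert optical power from dBm to microwatts (0 dBm = 1 mW = 1000 uW)."""
    return 1000.0 * 10 ** (dbm / 10.0)


def uw_to_dbm(uw):
    """Convert optical power from microwatts to dBm."""
    return 10.0 * math.log10(uw / 1000.0)


# Assumption: limit borrowed from the MAPS condition RXP<=63 (uW) in this article.
RXP_LIMIT_UW = 63.0

rx_uw = dbm_to_uw(-15.1)
tx_uw = dbm_to_uw(-1.7)
rx_below_limit = rx_uw <= RXP_LIMIT_UW

print(f"RX {rx_uw:.1f} uW, TX {tx_uw:.1f} uW, RX below limit: {rx_below_limit}")
```

The converted values differ slightly from the uW figures that sfpshow prints because the dBm value shown by the switch is rounded to one decimal place.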
- portshow output confirms that the port is Online and in the In_Sync state, and that Lr_in is greater than Ols_out, which indicates that the issue is external to the switch port:
portshow 37
portDisableReason: None
[..]
portState: 1 Online
Protocol: FC
portPhys: 6 In_Sync portScn: 32 F_Port
FC Fastwrite: OFF
Interrupts: 48 Link_failure: 2 Frjt: 0
Unknown: 6 Loss_of_sync: 0 Fbsy: 0
Lli: 48 Loss_of_sig: 4
Proc_rqrd: 159 Protocol_err: 0
Timed_out: 0 Invalid_word: 0
Tx_unavail: 0 Invalid_crc: 0
Delim_err: 0 Address_err: 0
Lr_in: 6 Ols_in: 2
Lr_out: 2 Ols_out: 3
Cong_Prim_in: 0
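The Lr_in versus Ols_out comparison described above can be scripted when many ports need checking. This is a minimal sketch, assuming the portshow text has already been captured to a string; the function name and the regular expression are illustrative, and the comparison simply encodes the rule of thumb stated in this article.

```python
import re


def parse_portshow_counters(text):
    """Extract 'Name: integer' pairs from portshow output into a dict."""
    pairs = re.findall(r"([A-Za-z_]+):\s+(\d+)\b", text)
    return {name: int(value) for name, value in pairs}


# Counter lines copied from the portshow 37 output above.
sample = """\
Interrupts: 48 Link_failure: 2 Frjt: 0
Unknown: 6 Loss_of_sync: 0 Fbsy: 0
Lli: 48 Loss_of_sig: 4
Lr_in: 6 Ols_in: 2
Lr_out: 2 Ols_out: 3
"""

counters = parse_portshow_counters(sample)

# Rule of thumb from this article: Lr_in > Ols_out points away from the switch port.
external_to_switch_port = counters["Lr_in"] > counters["Ols_out"]
print(counters["Lr_in"], counters["Ols_out"], external_to_switch_port)
```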
- Under fabriclog, these two ports can be seen flapping:
Fabriclog:
Switch 0; Tue Oct 11 12:34:12 2022 IST (GMT+5:30)
12:34:12.020011 SCN Port Offline;rsn=0x2,g=0x530 D2,P0 D2,P0 15 NA
12:34:12.020017 *Removing all nodes from port D2,P0 D2,P0 15 NA
12:34:12.112102 SCN Port Offline;rsn=0x0,g=0x532 D2,P0 D2,P0 127 NA
12:34:12.112108 *Removing all nodes from port D2,P0 D2,P0 127 NA
12:36:40.840204 SCN LR_PORT(0);g=0x530 D2,P0 D2,P0 15 NA
12:36:40.860941 SCN Port Online; g=0x530,isolated=0 D2,P0 D2,P1 15 NA
12:36:40.861044 Port Elp engaged D2,P1 D2,P0 15 NA
12:36:40.861057 *Removing all nodes from port D2,P0 D2,P0 15 NA
- SFP optical values are within the optimal range on the switch end, as per sfpshow output:
RX Power: -1.7 dBm (681.90uW)
TX Power: -1.5 dBm (701.10 uW)
- The switch logged a C4-5040 message:
2024/09/17-20:13:45:225851 (IST), [C4-5040], 2240091/0, SLOT 1 | CHASSIS | PORT 3/11, INFO, SWITCH, Link loss of sync debouncing event detected: Slot 3/Port 11(122)
- C4-5040 is logged when there is a link reset for the port due to loss of sync. A debounce timer is started to give the loss of sync signal time to clear. If the signal has not cleared when the timer expires, the C4-5040 RASLog message is posted.
- In fabriclog, the port goes offline and an LR_IN occurs on the switch:
Switch 0; Tue Sep 17 20:13:23 2024 IST (GMT+5:30)
20:13:23.722607 SCN Port Offline;rsn=0x2,g=0x16a D2,P0 D2,P0 11 NA
20:13:23.722613 *Removing all nodes from port D2,P0 D2,P0 11 NA
20:13:24.365532 SCN LR_PORT(0);g=0x16a D2,P0 D2,P0 11 NA
20:13:24.423719 SCN Port Online; g=0x16a,isolated=0 D2,P0 D2,P1 11 NA
20:13:24.423926 Port Elp engaged D2,P1 D2,P0 11 NA
20:13:24.423938 *Removing all nodes from port D2,P0 D2,P0 11 NA
20:13:24.424123 SCN Port F_PORT D2,P1 D2,P0 11 NA
20:13:24.538573 SCN LR_PORT(0);g=0x16a LR_IN D2,P0 D2,P0 11 NA
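To quantify a flap from fabriclog entries like the ones above, the time between a "Port Offline" and the next "Port Online" event for the same port can be computed. The sketch below is illustrative: it assumes the port number is the second-to-last field of each SCN line, as in the output shown, and that the capture does not cross midnight.

```python
from datetime import datetime


def flap_durations(lines):
    """Return [(port, seconds_offline)] for each Offline -> Online pair."""
    went_offline = {}
    results = []
    for line in lines:
        fields = line.split()
        # Only state change notification (SCN) lines carry port state events.
        if len(fields) < 3 or "SCN" not in fields:
            continue
        stamp = datetime.strptime(fields[0], "%H:%M:%S.%f")
        port = fields[-2]  # assumption: port is the second-to-last field
        if "Port Offline" in line:
            went_offline[port] = stamp
        elif "Port Online" in line and port in went_offline:
            delta = stamp - went_offline.pop(port)
            results.append((port, delta.total_seconds()))
    return results


log = [
    "Switch 0; Tue Sep 17 20:13:23 2024 IST (GMT+5:30)",
    "20:13:23.722607 SCN Port Offline;rsn=0x2,g=0x16a D2,P0 D2,P0 11 NA",
    "20:13:23.722613 *Removing all nodes from port D2,P0 D2,P0 11 NA",
    "20:13:24.365532 SCN LR_PORT(0);g=0x16a D2,P0 D2,P0 11 NA",
    "20:13:24.423719 SCN Port Online; g=0x16a,isolated=0 D2,P0 D2,P1 11 NA",
]

flaps = flap_durations(log)
for port, seconds in flaps:
    print(f"port {port} was offline for {seconds:.3f} s")
```

Short, repeated offline intervals on the same port are what the STATE_CHG MAPS rule counts per minute.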
- Error counters seen under portstatsshow:
portstatsshow 11
er_bad_os 13
phy_stats_clear_ts 09-13-2024 IST Fri 03:01:23 Timestamp of phy_port stats clear
lgc_stats_clear_ts 09-13-2024 IST Fri 03:01:23 Timestamp of lgc_port stats clear
Lr_in 0 top_int : Number of link resets received
1 bottom_int : Number of link resets received
Link_failure 0 top_int : Number of link failures
1 bottom_int : Number of link failures
Loss_of_sig 0 top_int : Number of instances of signal loss detected
1 bottom_int : Number of instances of signal loss detected
- From the MAPS policy, link fail, loss sig and LR are reported:
MAPS alerts for the port 3/11:
LOSS_SYNC(SyncLoss) -
LF(LFs) 3/11(1)
LOSS_SIGNAL(LOS) 3/11(1)
PE(Errors)
STATE_CHG 3/11(2)
LR(LRs) 3/11(1)
- When looking at sfpshow for the port in older supportsaves, RX Power can be seen going down, although it is still a decent value:
Temperature: 51 Centigrade
Current: 7.786 mAmps
Voltage: 3329.70 mVolts
RX Power: -1.7 dBm (681.90uW)
TX Power: -1.5 dBm (701.10 uW)
12-SEP 2025 20h25
Current: 7.788 mAmps
Voltage: 3318.80 mVolts
RX Power: -1.6 dBm (685.00uW)
TX Power: -1.5 dBm (703.20 uW)
12-SEP 00h05
Current: 7.796 mAmps
Voltage: 3312.40 mVolts
RX Power: -1.6 dBm (694.90uW)
TX Power: -1.5 dBm (703.00 uW)
- Rules defALL_32GSWL_SFPRXP_63 and defALL_OTHER_F_PORTSSTATE_CHG_5 are triggered in the errdump logs, along with loss signal, link failure and frame timeout detected errors:
2022/10/11-12:41:00, [MAPS-1004], 46328, SLOT 1 | FID 128, INFO, XXX, SFP 3/15, Condition=ALL_32GSWL_SFP(RXP<=63), Current Value:[RXP, 0 uW], RuleName=defALL_32GSWL_SFPRXP_63, Dashboard Category=Port Health.
2022/10/11-12:41:00, [MAPS-1004], 46329, SLOT 1 | FID 128, INFO, XXX, SFP 12/15, Condition=ALL_32GSWL_SFP(RXP<=63), Current Value:[RXP, 0 uW], RuleName=defALL_32GSWL_SFPRXP_63, Dashboard Category=Port Health.
2022/10/11-12:43:00, [MAPS-1004], 46330, SLOT 1 | FID 128, INFO, XXX, SFP 3/15, Condition=ALL_32GSWL_SFP(RXP<=63), Current Value:[RXP, 0 uW], RuleName=defALL_32GSWL_SFPRXP_63, Dashboard Category=Port Health.
2022/10/11-12:43:00, [MAPS-1004], 46331, SLOT 1 | FID 128, INFO, XXX, SFP 12/15, Condition=ALL_32GSWL_SFP(RXP<=63), Current Value:[RXP, 0 uW], RuleName=defALL_32GSWL_SFPRXP_63, Dashboard Category=Port Health.
2023/02/20-16:42:20 (IST), [MAPS-1003], 9812, SLOT 2 | FID 128 | PORT 4/33, WARNING, switch, slot4 port33, F-Port 4/33, Condition=ALL_HOST_PORTS(STATE_CHG/min>5), Current Value:[STATE_CHG, 6], RuleName=defALL_HOST_PORTSSTATE_CHG_5, Dashboard Category=Port Health, Quiet Time=None.
2023/02/20-16:42:20 (IST), [MAPS-1003], 9813, SLOT 2 | FID 128 | PORT 4/33, WARNING, switch, slot4 port33, F-Port 4/33, Condition=ALL_OTHER_F_PORTS(STATE_CHG/min>5), Current Value:[STATE_CHG, 6], RuleName=defALL_OTHER_F_PORTSSTATE_CHG_5, Dashboard Category=Port Health, Quiet Time=None.
2022/06/26-03:53:42, [MAPS-1003], 42403, SLOT 2 | FID 128, WARNING, switch, slot6 port9, U-Port 6/9, Condition=ALL_PORTS(LOSS_SIGNAL/min>3), Current Value:[LOSS_SIGNAL, 386 LOS], RuleName=defALL_PORTSLOSS_SIGNAL_3, Dashboard Category=Port Health.
2022/06/26-03:54:42, [MAPS-1003], 42404, SLOT 2 | FID 128, WARNING, switch, slot6 port9, U-Port 6/9, Condition=ALL_PORTS(LOSS_SIGNAL/min>3), Current Value:[LOSS_SIGNAL, 378 LOS], RuleName=defALL_PORTSLOSS_SIGNAL_3, Dashboard Health.
2021/10/23-23:38:47, [MAPS-1003], 55690, SLOT 1 | FID 128, WARNING, Fabric1, slot10 port31, U-Port 10/31, Condition=ALL_PORTS(LF/min>3), Current Value:[LF, 4], RuleName=defALL_PORTSLF_3, Dashboard Category=Port Health.
2021/10/23-04:24:46, [PORT-1003], 53615, SLOT 1 | FID 128, WARNING, Fabric1, Port 223 Faulted because of many Link Failures.
2020/04/08-21:35:56, [C3-1014], 2556, CHASSIS, WARNING, Brocade6510, Link Reset on Port S0,P8(12) vc_no=0 crd(s)lost=12 auto trigger.
2020/04/09-09:03:42, [C3-1014], 2557, CHASSIS, WARNING, Brocade6510, Link Reset on Port S0,P8(12) vc_no=0 crd(s)lost=12 auto trigger.
2023/10/15-01:03:18, [AN-1014], 589, FID 128, INFO, SWITCH, Frame timeout detected, tx port 37 rx port 15, sid 650900, did 632501, timestamp 2023-10-15 01:03:18 .
2023/10/15-01:03:18, [LOG-1000], 594, FID 128, INFO, SWITCH, Previous message repeated 5 time(s).
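When errdump contains many MAPS entries, it helps to tally which rules fired. The following is a rough sketch that extracts the condition, current value and rule name from MAPS lines in the format shown above. The regular expression is an illustration written against these sample lines only, not a documented log grammar.

```python
import re
from collections import Counter

# Illustrative pattern matched against the MAPS-1003/1004 lines in this article.
RULE_RE = re.compile(
    r"\[(?P<msg_id>MAPS-\d+)\].*?"
    r"Condition=(?P<condition>\w+\([^)]*\)).*?"
    r"Current Value:\[(?P<metric>\w+), (?P<value>[^\]]*)\].*?"
    r"RuleName=(?P<rule>\w+)"
)


def parse_maps_line(line):
    """Return a dict of MAPS fields, or None if the line is not a MAPS alert."""
    match = RULE_RE.search(line)
    return match.groupdict() if match else None


lines = [
    "2022/10/11-12:41:00, [MAPS-1004], 46328, SLOT 1 | FID 128, INFO, XXX, "
    "SFP 3/15, Condition=ALL_32GSWL_SFP(RXP<=63), Current Value:[RXP, 0 uW], "
    "RuleName=defALL_32GSWL_SFPRXP_63, Dashboard Category=Port Health.",
    "2021/10/23-23:38:47, [MAPS-1003], 55690, SLOT 1 | FID 128, WARNING, "
    "Fabric1, slot10 port31, U-Port 10/31, Condition=ALL_PORTS(LF/min>3), "
    "Current Value:[LF, 4], RuleName=defALL_PORTSLF_3, "
    "Dashboard Category=Port Health.",
    "2021/10/23-04:24:46, [PORT-1003], 53615, SLOT 1 | FID 128, WARNING, "
    "Fabric1, Port 223 Faulted because of many Link Failures.",
]

alerts = [a for a in map(parse_maps_line, lines) if a is not None]
rule_counts = Counter(a["rule"] for a in alerts)
print(rule_counts)
```

Non-MAPS entries such as PORT-1003 are skipped, so they need to be reviewed separately.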
- Rule defALL_32GSWL_SFPRXP_63 is triggered when the RX power starts to degrade, and is indicative of an upstream issue, i.e. a faulty SFP or cable.
- On the storage end, link break events are reported in the EMS log:
[?] Tue Sep 17 20:13:44 +0530 [NetApp-2: fct_tpd_work_thread_0: scsitarget.slifct.linkBreak:error]: Link break detected on Fibre Channel target HBA 3d with event status 1 , topology type 1, status1 0x0, status2 0x0.
[?] Tue Sep 17 20:13:45 +0530 [NetApp-2: fct_tpd_work_thread_0: scsitarget.hwpfct.linkUp:notice]: Link up on Fibre Channel target adapter 3d.
- Performance archives covering the time of the issue report ITW errors and loss of sync happening at the same time.
- On the switch end too, loss of sig errors are seen.
- This indicates that a loss of signal occurred on the link between the switch and the storage, and that the storage initiated the link reset to recover from that condition.
- This in turn triggered the ITW errors: because the link was being reset at that time, frames arriving on the link were dropped.
- ITW errors indicate a faulty cable.