NetApp Knowledge Base

Brocade switch port flaps due to faulty cable

Category:
fabric-interconnect-and-management-switches
Specialty:
brocade

Applies to

Brocade switch

Issue

  • The switch port state is Online in the switchshow output:

/fabos/bin/switchshow :

Index Slot Port Address Media  Speed        State    Proto
============================================================
129    4   33   708100   id    N32       Online      FC  F-Port  10:00:00:10:9b:xx:xx:xx

 

  • porterrshow reports multiple link failure and loss of sync errors, along with other media errors such as enc out, crc err, crc g_eof, c3timeout tx, pcs err and uncor err:

/fabos/cliexec/porterrshow :
          frames        enc     crc     crc     too     too     bad     enc    disc    link    loss    loss    frjt    fbsy     c3timeout     pcs      uncor
        tx       rx      in     err     g_eof   shrt    long    eof     out    c3      fail    sync    sig                      tx      rx      err     err
100:   14.1k    5.8k    2.2k  964     932       0       0      32       3.9g    6      12       0       4.4k    0       0       0       0       0       0
/fabos/cliexec/porterrshow :
          frames        enc     crc     crc     too     too     bad     enc    disc    link    loss    loss    frjt    fbsy     c3timeout     pcs      uncor
        tx       rx      in     err     g_eof   shrt    long    eof     out    c3      fail    sync    sig                      tx      rx      err     err
129:   20.6k   20.7k    0       0       0       0       0       0       0       9       2.5k    0       7.2k    0       0       9       0       1.1k   26.0k

/fabos/cliexec/porterrshow:
          frames        enc     crc     crc     too     too     bad     enc    disc    link    loss    loss    frjt    fbsy     c3timeout     pcs      uncor
        tx       rx      in     err     g_eof   shrt    long    eof     out    c3      fail    sync    sig                      tx      rx      err     err
  37:    2.3g    6.4g    0       0       0       0       0       0       0      53       2       0       4       0       0      53       0       0       0    
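
The abbreviated counters in porterrshow rows are easier to compare once they are expanded to plain integers. The following is a minimal sketch, not a NetApp or Brocade tool: the column names are taken from the header shown above, and it assumes that the k, m and g suffixes are decimal multipliers (thousand, million, billion). The helper names `parse_count` and `parse_porterrshow_row` are hypothetical.

```python
# Column names follow the porterrshow header shown above (19 counters).
COLUMNS = [
    "frames_tx", "frames_rx", "enc_in", "crc_err", "crc_g_eof",
    "too_shrt", "too_long", "bad_eof", "enc_out", "disc_c3",
    "link_fail", "loss_sync", "loss_sig", "frjt", "fbsy",
    "c3timeout_tx", "c3timeout_rx", "pcs_err", "uncor_err",
]

# Assumption: k, m and g are decimal multipliers.
SUFFIXES = {"k": 1_000, "m": 1_000_000, "g": 1_000_000_000}


def parse_count(token: str) -> int:
    """Convert a token such as '2.5k' or '53' to an approximate integer."""
    token = token.strip().lower()
    if token and token[-1] in SUFFIXES:
        return int(round(float(token[:-1]) * SUFFIXES[token[-1]]))
    return int(token)


def parse_porterrshow_row(line: str) -> tuple[int, dict[str, int]]:
    """Split one data row into (port index, {counter name: value})."""
    port, _, rest = line.partition(":")
    values = [parse_count(tok) for tok in rest.split()]
    if len(values) != len(COLUMNS):
        raise ValueError(f"expected {len(COLUMNS)} counters, got {len(values)}")
    return int(port), dict(zip(COLUMNS, values))


# Data row for port 129, copied from the output above.
row = ("129:   20.6k   20.7k    0       0       0       0       0       0"
       "       0       9       2.5k    0       7.2k    0       0       9"
       "       0       1.1k   26.0k")
port, counters = parse_porterrshow_row(row)

# Show only the non-zero error counters (frame totals are not errors).
noisy = {name: value for name, value in counters.items()
         if value and not name.startswith("frames_")}
print(port, noisy)
```

Because the switch rounds the abbreviated values, the expanded numbers are approximations and are only useful for comparing ports or spotting which counters are incrementing.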


  • sfpshow reports an Rx power value that is too low:

RX Power:    -15.1   dBm (30.7 uW)
TX Power:    -1.7    dBm (678.9 uW)
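
As a worked check of the two units that sfpshow prints, dBm and uW are related by P(dBm) = 10 · log10(P(mW)). The sketch below converts the readings above and compares the Rx value against the 63 uW limit that appears in the MAPS condition `RXP<=63` later in this article. The function names are illustrative only.

```python
import math


def dbm_to_uw(dbm: float) -> float:
    """Convert optical power from dBm to microwatts (0 dBm = 1 mW = 1000 uW)."""
    return 1000.0 * 10 ** (dbm / 10.0)


def uw_to_dbm(uw: float) -> float:
    """Convert optical power from microwatts to dBm."""
    return 10.0 * math.log10(uw / 1000.0)


rx_uw = 30.7   # RX Power from the sfpshow output above
tx_uw = 678.9  # TX Power from the same output

# Threshold taken from the MAPS condition ALL_32GSWL_SFP(RXP<=63).
RXP_THRESHOLD_UW = 63

print(round(uw_to_dbm(rx_uw), 1))  # -15.1
print(round(uw_to_dbm(tx_uw), 1))  # -1.7
print(rx_uw <= RXP_THRESHOLD_UW)   # True
```

The small differences between the printed dBm and uW figures in sfpshow come from rounding the dBm value to one decimal place.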

  • The portshow output confirms that the port is Online and in the In_Sync state. Lr_in is greater than Ols_out, which indicates that the issue is external to the switch port:

portshow 37
portDisableReason: None
[..]
portState: 1      Online   
Protocol: FC
portPhys:  6      In_Sync     portScn:   32     F_Port    
FC Fastwrite: OFF
Interrupts:        48   Link_failure: 2   Frjt:         0
Unknown:           6    Loss_of_sync: 0   Fbsy:         0
Lli:               48   Loss_of_sig:  4   
Proc_rqrd:         159  Protocol_err: 0   
Timed_out:         0    Invalid_word: 0   
Tx_unavail:        0    Invalid_crc:  0
Delim_err:         0    Address_err:  0
Lr_in:             6    Ols_in:       2
Lr_out:            2    Ols_out:      3
Cong_Prim_in:       0
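
The Lr_in versus Ols_out comparison described above can be sketched as a small script. It only encodes the rule of thumb stated in this article (Lr_in greater than Ols_out points outside the switch port) and is not an official diagnostic; the function names are hypothetical, and the sample text is an excerpt of the portshow output above.

```python
import re

# Counter lines copied from the portshow 37 output above.
PORTSHOW_SAMPLE = """\
Interrupts:        48   Link_failure: 2   Frjt:         0
Unknown:           6    Loss_of_sync: 0   Fbsy:         0
Lli:               48   Loss_of_sig:  4
Lr_in:             6    Ols_in:       2
Lr_out:            2    Ols_out:      3
"""


def parse_counters(text: str) -> dict[str, int]:
    """Collect every 'Name: <integer>' pair, including several per line."""
    return {name: int(value)
            for name, value in re.findall(r"(\w+):\s+(\d+)", text)}


def link_reset_hint(counters: dict[str, int]) -> str:
    """Apply the article's rule of thumb to the link reset counters."""
    if counters["Lr_in"] > counters["Ols_out"]:
        return "external to switch port"
    return "inconclusive"


counters = parse_counters(PORTSHOW_SAMPLE)
print(counters["Lr_in"], counters["Ols_out"], link_reset_hint(counters))
```

For the sample above, Lr_in is 6 and Ols_out is 3, so the hint is "external to switch port".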

 

  • The fabriclog shows the two ports (15 and 127 in this example) flapping:

Fabriclog:
Switch 0; Tue Oct 11 12:34:12 2022 IST (GMT+5:30)
12:34:12.020011 SCN Port Offline;rsn=0x2,g=0x530            D2,P0  D2,P0  15    NA
12:34:12.020017 *Removing all nodes from port               D2,P0  D2,P0  15    NA

12:34:12.112102 SCN Port Offline;rsn=0x0,g=0x532            D2,P0  D2,P0  127   NA
12:34:12.112108 *Removing all nodes from port               D2,P0  D2,P0  127   NA
12:36:40.840204 SCN LR_PORT(0);g=0x530                      D2,P0  D2,P0  15    NA
12:36:40.860941 SCN Port Online; g=0x530,isolated=0         D2,P0  D2,P1  15    NA
12:36:40.861044 Port Elp engaged                            D2,P1  D2,P0  15    NA
12:36:40.861057 *Removing all nodes from port               D2,P0  D2,P0  15    NA
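
To quantify a flap, the Offline and Online events in the fabriclog can be paired per port. The following is a rough sketch that assumes the layout shown above: the timestamp is the first field and the port number is the second-to-last field. It parses the time of day only, so it would not handle an outage that spans midnight. The function name `port_outages` is hypothetical.

```python
from datetime import datetime

# Event lines copied from the fabriclog output above.
FABRICLOG_SAMPLE = """\
12:34:12.020011 SCN Port Offline;rsn=0x2,g=0x530            D2,P0  D2,P0  15    NA
12:34:12.020017 *Removing all nodes from port               D2,P0  D2,P0  15    NA
12:34:12.112102 SCN Port Offline;rsn=0x0,g=0x532            D2,P0  D2,P0  127   NA
12:34:12.112108 *Removing all nodes from port               D2,P0  D2,P0  127   NA
12:36:40.840204 SCN LR_PORT(0);g=0x530                      D2,P0  D2,P0  15    NA
12:36:40.860941 SCN Port Online; g=0x530,isolated=0         D2,P0  D2,P1  15    NA
12:36:40.861044 Port Elp engaged                            D2,P1  D2,P0  15    NA
"""


def port_outages(log_text: str) -> list[tuple[int, float]]:
    """Return (port, seconds offline) for each Offline/Online pair."""
    went_down: dict[int, datetime] = {}
    outages: list[tuple[int, float]] = []
    for line in log_text.splitlines():
        fields = line.split()
        if len(fields) < 6 or fields[1:3] != ["SCN", "Port"]:
            continue
        state = fields[3].split(";")[0]
        if state not in ("Offline", "Online"):
            continue
        stamp = datetime.strptime(fields[0], "%H:%M:%S.%f")
        port = int(fields[-2])
        if state == "Offline":
            went_down[port] = stamp
        elif port in went_down:
            seconds = (stamp - went_down.pop(port)).total_seconds()
            outages.append((port, seconds))
    return outages


outages = port_outages(FABRICLOG_SAMPLE)
for port, seconds in outages:
    print(f"port {port} was offline for {seconds:.3f} s")
```

In this excerpt only port 15 comes back online, so port 127 produces no completed outage entry.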

 

  • SFP optical values are within the optimal range at the switch end, as per the sfpshow output:

RX Power:    -1.7    dBm (681.90uW)
TX Power:    -1.5    dBm (701.10 uW)

 

  • The switch logged a C4-5040 message:

    2024/09/17-20:13:45:225851 (IST), [C4-5040], 2240091/0, SLOT 1 | CHASSIS | PORT 3/11, INFO,  SWITCH, Link loss of sync debouncing event detected: Slot 3/Port 11(122)

     

  • C4-5040 is logged when a link reset occurs on the port due to loss of sync. A debounce timer is started to allow the loss of sync signal to clear. If the signal has not cleared when the timer expires, the C4-5040 RASLog message is posted.

  • In the fabriclog, the port goes offline and an LR_IN occurs on the switch:

    Switch 0; Tue Sep 17 20:13:23 2024 IST (GMT+5:30)
    20:13:23.722607 SCN Port Offline;rsn=0x2,g=0x16a            D2,P0  D2,P0  11    NA
    20:13:23.722613 *Removing all nodes from port               D2,P0  D2,P0  11    NA
    20:13:24.365532 SCN LR_PORT(0);g=0x16a                      D2,P0  D2,P0  11    NA
    20:13:24.423719 SCN Port Online; g=0x16a,isolated=0         D2,P0  D2,P1  11    NA
    20:13:24.423926 Port Elp engaged                            D2,P1  D2,P0  11    NA
    20:13:24.423938 *Removing all nodes from port               D2,P0  D2,P0  11    NA
    20:13:24.424123 SCN Port F_PORT                             D2,P1  D2,P0  11    NA
    20:13:24.538573 SCN LR_PORT(0);g=0x16a LR_IN                D2,P0  D2,P0  11    NA


     

  • Error counters are reported under portstatsshow:

portstatsshow 11
er_bad_os               13
phy_stats_clear_ts      09-13-2024 IST Fri 03:01:23     Timestamp of phy_port stats clear
lgc_stats_clear_ts      09-13-2024 IST Fri 03:01:23     Timestamp of lgc_port stats clear
Lr_in           0           top_int : Number of link resets received
                1           bottom_int : Number of link resets received

Link_failure    0           top_int : Number of link failures
                1           bottom_int : Number of link failures

Loss_of_sig     0           top_int : Number of instances of signal loss detected
                1           bottom_int : Number of instances of signal loss detected

 

  • The MAPS policy reports link failure, loss of signal and link reset (LR) events:

MAPS alerts for the port 3/11:

LOSS_SYNC(SyncLoss) -
LF(LFs)             3/11(1)

LOSS_SIGNAL(LOS)    3/11(1)

PE(Errors)
STATE_CHG           3/11(2)

LR(LRs)             3/11(1)

 

  • sfpshow output for the port from older supportsaves shows RX Power decreasing slightly but still at an acceptable value:

Temperature: 51      Centigrade
Current:     7.786   mAmps
Voltage:     3329.70 mVolts
RX Power:    -1.7    dBm (681.90uW)
TX Power:    -1.5    dBm (701.10 uW)

12-SEP 2025 20h25
Current:     7.788   mAmps
Voltage:     3318.80 mVolts
RX Power:    -1.6    dBm (685.00uW)
TX Power:    -1.5    dBm (703.20 uW)

12-SEP 00h05
Current:     7.796   mAmps
Voltage:     3312.40 mVolts
RX Power:    -1.6    dBm (694.90uW)
TX Power:    -1.5    dBm (703.00 uW)
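
How small the decrease is becomes clearer when the change is expressed in dB. This is a worked calculation on the three RX Power readings above; it assumes the samples are listed newest first, which the article does not state explicitly.

```python
import math

# RX Power readings in uW copied from the sfpshow samples above.
# Assumption: the samples are listed newest first.
rx_uw_newest_first = [681.90, 685.00, 694.90]


def change_db(old_uw: float, new_uw: float) -> float:
    """Relative change in optical power, in dB (negative means a drop)."""
    return 10.0 * math.log10(new_uw / old_uw)


oldest, newest = rx_uw_newest_first[-1], rx_uw_newest_first[0]
drop = change_db(oldest, newest)
print(f"{drop:.2f} dB")  # -0.08 dB
```

A change of roughly -0.08 dB is consistent with the observation that the value is trending down while remaining in a healthy range.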

 

  • Rules defALL_32GSWL_SFPRXP_63 and defALL_OTHER_F_PORTSSTATE_CHG_5 are triggered in the errdump logs, along with loss of signal, link failure and frame timeout errors:

2022/10/11-12:41:00, [MAPS-1004], 46328, SLOT 1 | FID 128, INFO, XXX, SFP 3/15, Condition=ALL_32GSWL_SFP(RXP<=63), Current Value:[RXP, 0 uW], RuleName=defALL_32GSWL_SFPRXP_63, Dashboard Category=Port Health.
2022/10/11-12:41:00, [MAPS-1004], 46329, SLOT 1 | FID 128, INFO, XXX, SFP 12/15, Condition=ALL_32GSWL_SFP(RXP<=63), Current Value:[RXP, 0 uW], RuleName=defALL_32GSWL_SFPRXP_63, Dashboard Category=Port Health.
2022/10/11-12:43:00, [MAPS-1004], 46330, SLOT 1 | FID 128, INFO, XXX, SFP 3/15, Condition=ALL_32GSWL_SFP(RXP<=63), Current Value:[RXP, 0 uW], RuleName=defALL_32GSWL_SFPRXP_63, Dashboard Category=Port Health.
2022/10/11-12:43:00, [MAPS-1004], 46331, SLOT 1 | FID 128, INFO, XXX, SFP 12/15, Condition=ALL_32GSWL_SFP(RXP<=63), Current Value:[RXP, 0 uW], RuleName=defALL_32GSWL_SFPRXP_63, Dashboard Category=Port Health.

2023/02/20-16:42:20 (IST), [MAPS-1003], 9812, SLOT 2 | FID 128 | PORT 4/33, WARNING, switch, slot4 port33, F-Port 4/33, Condition=ALL_HOST_PORTS(STATE_CHG/min>5), Current Value:[STATE_CHG, 6], RuleName=defALL_HOST_PORTSSTATE_CHG_5, Dashboard Category=Port Health, Quiet Time=None.
2023/02/20-16:42:20 (IST), [MAPS-1003], 9813, SLOT 2 | FID 128 | PORT 4/33, WARNING, switch, slot4 port33, F-Port 4/33, Condition=ALL_OTHER_F_PORTS(STATE_CHG/min>5), Current Value:[STATE_CHG, 6], RuleName=defALL_OTHER_F_PORTSSTATE_CHG_5, Dashboard Category=Port Health, Quiet Time=None.

2022/06/26-03:53:42, [MAPS-1003], 42403, SLOT 2 | FID 128, WARNING, switch, slot6 port9, U-Port 6/9, Condition=ALL_PORTS(LOSS_SIGNAL/min>3), Current Value:[LOSS_SIGNAL, 386 LOS], RuleName=defALL_PORTSLOSS_SIGNAL_3, Dashboard Category=Port Health.
2022/06/26-03:54:42, [MAPS-1003], 42404, SLOT 2 | FID 128, WARNING, switch, slot6 port9, U-Port 6/9, Condition=ALL_PORTS(LOSS_SIGNAL/min>3), Current Value:[LOSS_SIGNAL, 378 LOS], RuleName=defALL_PORTSLOSS_SIGNAL_3, Dashboard Health.

2021/10/23-23:38:47, [MAPS-1003], 55690, SLOT 1 | FID 128, WARNING, Fabric1, slot10 port31, U-Port 10/31, Condition=ALL_PORTS(LF/min>3), Current Value:[LF, 4], RuleName=defALL_PORTSLF_3, Dashboard Category=Port Health.
2021/10/23-04:24:46, [PORT-1003], 53615, SLOT 1 | FID 128, WARNING, Fabric1, Port 223 Faulted because of many Link Failures.

2020/04/08-21:35:56, [C3-1014], 2556, CHASSIS, WARNING, Brocade6510,  Link Reset on Port S0,P8(12) vc_no=0 crd(s)lost=12 auto trigger.
2020/04/09-09:03:42, [C3-1014], 2557, CHASSIS, WARNING, Brocade6510,  Link Reset on Port S0,P8(12) vc_no=0 crd(s)lost=12 auto trigger.

2023/10/15-01:03:18, [AN-1014], 589, FID 128, INFO, SWITCH, Frame timeout detected, tx port 37 rx port 15, sid 650900, did 632501, timestamp 2023-10-15 01:03:18 .
2023/10/15-01:03:18, [LOG-1000], 594, FID 128, INFO, SWITCH, Previous message repeated 5 time(s).
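
When errdump is long, it can help to tally which MAPS rules fired. The sketch below extracts the condition, current value and rule name from MAPS lines using a regular expression written against the message layout shown above. The pattern and the field names are assumptions based on these samples only and may not cover every MAPS message format.

```python
import re
from collections import Counter

# Pattern inferred from the MAPS-1003/MAPS-1004 samples above.
MAPS_RE = re.compile(
    r"\[(?P<msg_id>MAPS-\d+)\].*?"
    r"Condition=(?P<condition>[^,]+\([^)]*\)), "
    r"Current Value:\[(?P<metric>[^,\]]+), (?P<value>[^\]]+)\], "
    r"RuleName=(?P<rule>[^,]+)"
)

# Two lines copied from the errdump output above.
ERRDUMP_SAMPLE = [
    "2022/10/11-12:41:00, [MAPS-1004], 46328, SLOT 1 | FID 128, INFO, XXX, "
    "SFP 3/15, Condition=ALL_32GSWL_SFP(RXP<=63), Current Value:[RXP, 0 uW], "
    "RuleName=defALL_32GSWL_SFPRXP_63, Dashboard Category=Port Health.",
    "2022/06/26-03:53:42, [MAPS-1003], 42403, SLOT 2 | FID 128, WARNING, "
    "switch, slot6 port9, U-Port 6/9, Condition=ALL_PORTS(LOSS_SIGNAL/min>3), "
    "Current Value:[LOSS_SIGNAL, 386 LOS], RuleName=defALL_PORTSLOSS_SIGNAL_3, "
    "Dashboard Category=Port Health.",
]


def parse_maps_line(line: str):
    """Return the named fields of a MAPS message, or None if it does not match."""
    match = MAPS_RE.search(line)
    return match.groupdict() if match else None


events = [e for e in map(parse_maps_line, ERRDUMP_SAMPLE) if e]
rule_counts = Counter(e["rule"] for e in events)
for rule, count in rule_counts.items():
    print(rule, count)
```

Non-MAPS lines such as the C3-1014 or AN-1014 messages simply return None and are skipped.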

 

  • The rule defALL_32GSWL_SFPRXP_63 is triggered when the Rx power degrades to 63 uW or below (condition RXP<=63), which indicates an upstream issue, i.e. a faulty SFP or cable.
  • On the storage end, link break events are reported in the EMS log:

     [?]  Tue Sep 17 20:13:44 +0530 [NetApp-2: fct_tpd_work_thread_0: scsitarget.slifct.linkBreak:error]: Link break detected on Fibre Channel target HBA 3d with event status 1 , topology type 1, status1 0x0, status2 0x0.
     [?]  Tue Sep 17 20:13:45 +0530 [NetApp-2: fct_tpd_work_thread_0: scsitarget.hwpfct.linkUp:notice]: Link up on Fibre Channel target adapter 3d.
     

     

  • Performance archives covering the time of the issue report ITW errors and loss of sync occurring at the same time.

ITW-ERROR.png

 

  • On the switch end, loss of signal errors are also seen.

  • This indicates that a loss of signal occurred on the link between the switch and the storage, and that the storage initiated a link reset to recover from that condition.

  • The link reset in turn caused the ITW errors: frames arriving on the link while it was being reset were dropped.

  • ITW errors indicate a faulty cable.

