C3 Tx discards and Frame timeout due to hardware issues
Applies to
Brocade Fabric OS
Issue
errdump -a logs frame timeouts, stating which port received the frame (rx) and which port it could not be transmitted on (tx), along with loss-of-signal and oversubscription alerts.
2022/02/09-22:31:04, [AN-1014], 2266, SLOT 2 | FID 128, INFO, switch, Frame timeout detected, tx port 9/7 rx port 9/27, sid c8f907, did 678740, timestamp 2022-02-09 22:31:04 .
2023/12/23-10:47:16 (IST), [MAPS-1003], 274485, SLOT 2 | FID 128 | PORT 11/8, WARNING, switch, IB_INFINI_3185_W_N3P6, F-Port 11/8, Condition=ALL_PORTS(PORT_BANDWIDTH/NONE==OVERSUBSCRIBED), Current Value:[PORT_BANDWIDTH, OVERSUBSCRIBED, (TXQL=914 us, TX=72.8%) ], RuleName=defALL_PORTS_OVERSUBSCRIBED, Dashboard Category=Fabric Performance Impact, Quiet Time=15 min.
2023/12/18-01:35:43 (IST), [MAPS-1003], 270764, SLOT 1 | FID 128 | PORT 10/21, WARNING, switch, U-Port 10/21, Condition=ALL_PORTS(LOSS_SIGNAL/min>5), Current Value:[LOSS_SIGNAL, 8 LOS], RuleName=defALL_PORTSLOSS_SIGNAL_5, Dashboard Category=Port Health, Quiet Time=None.
- The frame timeouts indicate that frames received on the listed Rx ports could not be forwarded out of the designated Tx ports in time and were therefore discarded.
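When many AN-1014 entries are present, it can help to pull out the tx/rx ports and the source and destination IDs programmatically. The following is a minimal sketch in Python, not a Brocade tool; the regular expression is written against the single sample line shown above and is an assumption about the message layout, which may differ between Fabric OS releases.

```python
import re

# Sample line copied from the errdump output above.
LINE = (
    "2022/02/09-22:31:04, [AN-1014], 2266, SLOT 2 | FID 128, INFO, switch, "
    "Frame timeout detected, tx port 9/7 rx port 9/27, sid c8f907, "
    "did 678740, timestamp 2022-02-09 22:31:04 ."
)

# Assumed layout, derived only from the sample line above.
AN1014 = re.compile(
    r"\[AN-1014\].*?tx port (?P<tx>\S+) rx port (?P<rx>[^,\s]+), "
    r"sid (?P<sid>[0-9a-fA-F]+), did (?P<did>[0-9a-fA-F]+)"
)


def parse_frame_timeout(line):
    """Return tx/rx port and sid/did from an AN-1014 line, or None if no match."""
    match = AN1014.search(line)
    return match.groupdict() if match else None


print(parse_frame_timeout(LINE))
```

Counting how often each tx port appears in the parsed results shows which egress port is holding frames the longest.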
IO_FRAME_LOSS events (indicating frame delay) and IO_PERF_IMPACT events are logged in errdump on the switch:
2024/12/02-04:56:04 (IST), [MAPS-1001], 1664654, SLOT 2 | FID 128 | PORT 12/8, CRITICAL, switch, slot12 port8, F-Port 12/8, Condition=ALL_PORTS(DEV_LATENCY_IMPACT/NONE==IO_FRAME_LOSS), Current Value:[DEV_LATENCY_IMPACT, IO_FRAME_LOSS, (174 ms Frame Delay in VC: 2) ], RuleName=ALL_PORTS_IO_FRAME_LOSS_UNQUAR, Dashboard Category=Fabric Performance Impact, Quiet Time=1 day.
2024/12/02-04:56:04 (IST), [MAPS-1003], 1664655, SLOT 2 | FID 128 | PORT 12/45, WARNING, switch, slot12 port45, F-Port 12/45, Condition=ALL_PORTS(DEV_LATENCY_IMPACT/NONE==IO_PERF_IMPACT), Current Value:[DEV_LATENCY_IMPACT, IO_PERF_IMPACT, (39.3% of 10 secs in VC: 3-7) ], RuleName=defALL_PORTS_IO_PERF_IMPACT_UNQUAR_1, Dashboard Category=Fabric Performance Impact, Quiet Time=1 day.
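To see at a glance which ports are raising latency events, the MAPS lines can be grouped by port. This is a hedged sketch: the pattern below is an assumption based on the two sample messages above, and the helper name `events_by_port` is hypothetical.

```python
import re
from collections import defaultdict

# The two distinct MAPS messages shown above, copied verbatim.
LINES = [
    "2024/12/02-04:56:04 (IST), [MAPS-1001], 1664654, SLOT 2 | FID 128 | PORT 12/8, "
    "CRITICAL, switch, slot12 port8, F-Port 12/8, "
    "Condition=ALL_PORTS(DEV_LATENCY_IMPACT/NONE==IO_FRAME_LOSS), "
    "Current Value:[DEV_LATENCY_IMPACT, IO_FRAME_LOSS, (174 ms Frame Delay in VC: 2) ], "
    "RuleName=ALL_PORTS_IO_FRAME_LOSS_UNQUAR, "
    "Dashboard Category=Fabric Performance Impact, Quiet Time=1 day.",
    "2024/12/02-04:56:04 (IST), [MAPS-1003], 1664655, SLOT 2 | FID 128 | PORT 12/45, "
    "WARNING, switch, slot12 port45, F-Port 12/45, "
    "Condition=ALL_PORTS(DEV_LATENCY_IMPACT/NONE==IO_PERF_IMPACT), "
    "Current Value:[DEV_LATENCY_IMPACT, IO_PERF_IMPACT, (39.3% of 10 secs in VC: 3-7) ], "
    "RuleName=defALL_PORTS_IO_PERF_IMPACT_UNQUAR_1, "
    "Dashboard Category=Fabric Performance Impact, Quiet Time=1 day.",
]

# Assumed layout, derived only from the sample lines above.
MAPS = re.compile(
    r"\[(?P<msg>MAPS-\d+)\].*?PORT (?P<port>\d+/\d+), (?P<severity>\w+),"
    r".*?Current Value:\[(?P<metric>\w+), (?P<state>\w+)"
)


def events_by_port(lines):
    """Group (severity, state) pairs by slot/port for every line that matches."""
    grouped = defaultdict(list)
    for line in lines:
        match = MAPS.search(line)
        if match:
            grouped[match["port"]].append((match["severity"], match["state"]))
    return dict(grouped)


print(events_by_port(LINES))
```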
sfpshow reports both Tx and Rx power within the recommended range.
=============
Slot 12/Port 45:
=============
RX Power: -2.7 dBm (532.0uW)
TX Power: -1.5 dBm (711.2 uW)
porterrshow shows the link failure, link reset, C3 discard, and Tx timeout error counters incrementing.
/fabos/cliexec/porterrshow:
             frames    enc    crc    crc    too    too    bad    enc   disc   link   loss   loss   frjt   fbsy     c3timeout    pcs  uncor
          tx     rx     in    err  g_eof   shrt   long    eof    out     c3   fail   sync    sig                     tx     rx    err    err
391:   45.3g   3.4g      0      0      0      0      0      0      0  43.1k      5      0      5      0      0  43.1k      0      0   2.5k
c3-timeout tx:
o The number of transmit class 3 frames discarded at the transmission port due to timeout (platform- and port-specific).
o This indicates an issue with the device connected to the switch.
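The porterrshow row for port 391 can be turned into named counters so that the non-zero error columns stand out. This is a minimal sketch: the column names are taken from the two header lines in the output above, and treating the `k`, `m`, and `g` suffixes as decimal multipliers (thousand, million, billion) is an assumption.

```python
# Column names reconstructed from the two porterrshow header lines above.
COLUMNS = [
    "frames_tx", "frames_rx", "enc_in", "crc_err", "crc_g_eof", "too_shrt",
    "too_long", "bad_eof", "enc_out", "disc_c3", "link_fail", "loss_sync",
    "loss_sig", "frjt", "fbsy", "c3timeout_tx", "c3timeout_rx", "pcs_err",
    "uncor_err",
]

# Data row copied from the output above.
ROW = "391: 45.3g 3.4g 0 0 0 0 0 0 0 43.1k 5 0 5 0 0 43.1k 0 0 2.5k"

# Assumption: suffixes are decimal multipliers.
SUFFIX = {"k": 10**3, "m": 10**6, "g": 10**9}


def to_count(text):
    """Convert a porterrshow value such as '43.1k' to an approximate integer."""
    multiplier = SUFFIX.get(text[-1].lower())
    if multiplier is None:
        return int(text)
    return round(float(text[:-1]) * multiplier)


def parse_row(row):
    """Return (port, {column: count}) for one porterrshow data row."""
    port, *values = row.split()
    return port.rstrip(":"), dict(zip(COLUMNS, map(to_count, values)))


port, counters = parse_row(ROW)
errors = {
    name: value
    for name, value in counters.items()
    if value and not name.startswith("frames_")
}
print(port, errors)
```

For this row the C3 discard count and the c3timeout tx count are identical (43.1k), which suggests that the discards on this port are transmit timeouts.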
sfpshow indicates low receive power.
=============
Port 391:
=============
RX Power: -8.1 dBm (155.8uW)
TX Power: -3.0 dBm (500.6 uW)
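sfpshow reports each power level in both dBm and microwatts. The two are related by dBm = 10 · log10(P / 1 mW), which the sketch below applies to the readings shown in this article. The function name `uw_to_dbm` is hypothetical, and no threshold for "low" power is implied here, since the acceptable range depends on the SFP type.

```python
import math


def uw_to_dbm(microwatts):
    """Convert optical power in microwatts to dBm (decibels relative to 1 mW)."""
    return 10 * math.log10(microwatts / 1000.0)


# Microwatt readings taken from the sfpshow samples in this article.
readings = {
    "12/45 RX": 532.0,
    "12/45 TX": 711.2,
    "391 RX": 155.8,
    "391 TX": 500.6,
}

for name, microwatts in readings.items():
    print(f"{name}: {uw_to_dbm(microwatts):.1f} dBm")
```

Rounded to one decimal place, the results match the dBm values that sfpshow prints next to each microwatt figure.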
- portshow indicates that the remote device has reset the link far more often than the local port has sent OFFLINE primitives: Lr_in (133) greatly exceeds Ols_out (6).
portshow 391
[...]
Lr_in: 133 Ols_in: 5
Lr_out: 7 Ols_out: 6
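The comparison between link resets received and offline primitives sent can be expressed as a small check. This is a sketch under the assumption that the counters appear as `name: value` pairs as in the excerpt above; the helper name `remote_reset_excess` is hypothetical.

```python
import re

# Counter lines copied from the portshow excerpt above.
PORTSHOW = """Lr_in: 133 Ols_in: 5
Lr_out: 7 Ols_out: 6"""


def link_counters(text):
    """Extract 'name: value' integer counters from portshow output."""
    return {name: int(value) for name, value in re.findall(r"(\w+):\s+(\d+)", text)}


def remote_reset_excess(counters):
    """Link resets received beyond the offline primitives this port sent."""
    return counters["Lr_in"] - counters["Ols_out"]


counters = link_counters(PORTSHOW)
print(counters, remote_reset_excess(counters))
```

A large positive difference, as here, points to link resets initiated by the attached device rather than by the switch port.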
- Analyse the end device workload first. If the workload looks normal, proceed to the solution and perform hardware checks on the end device connected to the ports reporting timeouts and frame loss.