Continuous error: InfiniBand retimer programming failed
Applies to
- FAS8080 Controller-IOXM (C-I configuration)
- ONTAP 9
Issue
- The following errors are seen in the event logs:
Fri Oct 07 07:42:24 +0530 [Cluster-01: ib_nap_tx_1: connectx.IbRetimerRetry:debug]: InfiniBand retimer programming is being retried on port ib0a.
Fri Oct 07 07:42:26 +0530 [Cluster-01: ib_nap_tx_2: connectx.IbRetimerRetry:debug]: InfiniBand retimer programming is being retried on port ib0b.
Fri Oct 07 07:42:30 +0530 [Cluster-01: ib_nap_tx_1: callhome.ibretimerprog.fail:EMERGENCY]: Call home for INFINIBAND RETIMER PROGRAMMING FAILURE
Fri Oct 07 07:42:32 +0530 [Cluster-01: ib_nap_tx_2: callhome.ibretimerprog.fail:EMERGENCY]: Call home for INFINIBAND RETIMER PROGRAMMING FAILURE
- The motherboard (MB) was replaced; however, the following alerts are still reported afterward:
HA Group Notification (InfiniBand retimer programming failed on port ib0a) EMERGENCY
Fri Oct 07 07:42:22 +0530 [Cluster-01: ib_nap_tx_1: connectx.IbQsfpDumpCtrl:error]: InfiniBand retimer programming failed on port ib0a due to QSFP register dump error. Dumping registers: control 0x40000050, data 0x0, timeout 0xf0f3000, clock 0xbc4709c9.
Fri Oct 07 07:42:24 +0530 [Cluster-01: ib_nap_tx_2: connectx.IbQsfpDumpCtrl:error]: InfiniBand retimer programming failed on port ib0b due to QSFP register dump error. Dumping registers: control 0x40000050, data 0x0, timeout 0xf0f3000, clock 0xbc4709c9.
Fri Oct 07 08:37:25 +0530 [Cluster-01: ib_nap_tx_1: connectx.IbCableDetected:info]: Detected Active Optical cable of length 5M on InfiniBand port ib0a.
Fri Oct 07 08:37:25 +0530 [Cluster-01: ib_nap_tx_1: connectx.IbRetimerProgrmPass:info]: InfiniBand retimer programming was successful on port ib0a.
Fri Oct 07 08:37:25 +0530 [Cluster-01: ib_nap_tx_2: connectx.IbCableDetected:info]: Detected Active Optical cable of length 5M on InfiniBand port ib0b.
Fri Oct 07 08:37:25 +0530 [Cluster-01: ib_nap_tx_2: connectx.IbRetimerProgrmPass:info]: InfiniBand retimer programming was successful on port ib0b.
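When reviewing a long event log offline, it can help to group the retimer-related messages by port and check whether the most recent event on each port was a successful programming pass. The following is a minimal sketch only, assuming the log lines follow the exact format shown above. The function names (parse_ems_line, events_by_port, last_event_passed) are hypothetical helpers and are not part of ONTAP or any NetApp tool.

```python
import re
from collections import defaultdict

# Assumed line layout, taken from the log excerpts in this article:
# <timestamp> [<node>: <process>: <event>:<severity>]: <message>
EMS_LINE = re.compile(
    r"^(?P<ts>\w{3} \w{3} \d{2} \d{2}:\d{2}:\d{2} [+-]\d{4}) "
    r"\[(?P<node>[^:]+): (?P<proc>[^:]+): (?P<event>[\w.]+):(?P<sev>\w+)\]: "
    r"(?P<msg>.*)$"
)
# Port names such as ib0a / ib0b as they appear in the message text.
PORT = re.compile(r"\bport (ib\d+[a-z])\b")


def parse_ems_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    match = EMS_LINE.match(line.strip())
    if match is None:
        return None
    record = match.groupdict()
    port = PORT.search(record["msg"])
    record["port"] = port.group(1) if port else None
    return record


def events_by_port(lines):
    """Group event names by port, preserving the order of the input lines."""
    grouped = defaultdict(list)
    for line in lines:
        record = parse_ems_line(line)
        # Messages that name no port (for example the call home lines) are skipped.
        if record and record["port"]:
            grouped[record["port"]].append(record["event"])
    return dict(grouped)


def last_event_passed(lines):
    """Map each port to True if its last logged event was a programming pass."""
    return {
        port: events[-1] == "connectx.IbRetimerProgrmPass"
        for port, events in events_by_port(lines).items()
    }


# Sample lines copied from this article, in chronological order.
SAMPLE = [
    "Fri Oct 07 07:42:22 +0530 [Cluster-01: ib_nap_tx_1: connectx.IbQsfpDumpCtrl:error]: InfiniBand retimer programming failed on port ib0a due to QSFP register dump error. Dumping registers: control 0x40000050, data 0x0, timeout 0xf0f3000, clock 0xbc4709c9.",
    "Fri Oct 07 07:42:24 +0530 [Cluster-01: ib_nap_tx_1: connectx.IbRetimerRetry:debug]: InfiniBand retimer programming is being retried on port ib0a.",
    "Fri Oct 07 07:42:30 +0530 [Cluster-01: ib_nap_tx_1: callhome.ibretimerprog.fail:EMERGENCY]: Call home for INFINIBAND RETIMER PROGRAMMING FAILURE",
    "Fri Oct 07 08:37:25 +0530 [Cluster-01: ib_nap_tx_1: connectx.IbCableDetected:info]: Detected Active Optical cable of length 5M on InfiniBand port ib0a.",
    "Fri Oct 07 08:37:25 +0530 [Cluster-01: ib_nap_tx_1: connectx.IbRetimerProgrmPass:info]: InfiniBand retimer programming was successful on port ib0a.",
]

if __name__ == "__main__":
    for port, passed in last_event_passed(SAMPLE).items():
        print(port, "last event was a successful pass" if passed else "last event was not a pass")
```

The sketch keeps timestamps as plain strings and relies on the order of the input lines, because the log format shown here carries no year. A True result only means the last line seen for that port was a pass; it does not by itself confirm that the failures have stopped recurring.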