NetApp Knowledge Base

Failover monitor: takeover of node-01 by node-02 disabled (unsynchronized log) with CRC Errors and Error Symbol

Category:
aff-series
Specialty:
hw
Last Updated:
12/17/2024, 3:56:17 AM

Applies to

  • AFF-C250
  • ONTAP 9
  • BES-53248 Cluster Switch

Issue

  • The following alerts appear frequently in the EMS event log:

Mon Dec 02 01:04:26 -0500 [node-01: wafl_exempt09: mirror.stream.qp.error:debug]: params: {'mirror': 'HA Partner', 'qp_name': 'WAFL', 'error': 'NVMM_ERR_POLL_TIMEOUT'}
Mon Dec 02 01:04:26 -0500 [node-01: mcc_cfd_rnic: mirror.stream.qp.error:debug]: params: {'mirror': 'HA Partner', 'qp_name': 'RAID', 'error': 'NVMM_ERR_STREAM'}
Mon Dec 02 01:04:26 -0500 [node-01: mcc_cfd_rnic: mirror.stream.qp.error:debug]: params: {'mirror': 'HA Partner', 'qp_name': 'MISC', 'error': 'NVMM_ERR_STREAM'}
Mon Dec 02 01:04:26 -0500 [node-01: nvmm_error: rdma.rlib.event.error:debug]: QP wafl event error: client disconnect.
Mon Dec 02 01:04:26 -0500 [node-01: nvmm_error: nvmm.mirror.offlined:debug]: params: {'mirror': 'HA_PARTNER'}
Mon Dec 02 01:04:26 -0500 [node-01: rastrace_dump: rastrace.dump.saved:debug]: A RAS trace dump for module IC instance 0 was stored in /etc/log/rastrace/IC_0_20241202_01:04:26:534541.dmp.
Mon Dec 02 01:04:27 -0500 [node-01: cf_main: cf.fsm.takeoverByPartnerDisabled:error]: Failover monitor: takeover of node-01 by node-02 disabled (unsynchronized log).
Mon Dec 02 01:04:29 -0500 [node-01: nvmm_mirror_sync: nvmm.mirror.state.change:debug]: mirror of sysid 1, partner_type HA Partner, changed state from NVMM_MIRROR_LAYOUT_SYNCING to NVMM_MIRROR_LAYOUT_SYNCED and took 1 msecs. 

Tue Dec 03 12:35:00 -0500 [node-01: monitor: monitor.globalStatus.critical:EMERGENCY]: Controller failover of node-01 is not possible: unsynchronized log. 
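
To pick the failover-related entries out of a longer log export, the line layout seen above (timestamp, then `[node: process: event:severity]:`, then the message) can be split with a regular expression. The following is a minimal sketch, not a NetApp tool: the pattern is inferred only from the sample lines in this article, and the names `parse_ems_line` and `notable_events` are hypothetical.

```python
import re

# Layout inferred from the excerpt above (an assumption, not an official format):
# "<timestamp> [<node>: <process>: <event>:<severity>]: <message>"
EMS_LINE = re.compile(
    r"^\s*(?P<timestamp>.+?) \[(?P<node>[^:\]]+): (?P<process>[^:\]]+): "
    r"(?P<event>[^:\]]+):(?P<severity>[^\]]+)\]: (?P<message>.*)$"
)


def parse_ems_line(line):
    """Split one log line into its fields, or return None if it does not match."""
    match = EMS_LINE.match(line)
    return match.groupdict() if match else None


def notable_events(lines, ignore=("debug",)):
    """Return parsed entries whose severity is not listed in `ignore`."""
    entries = []
    for line in lines:
        entry = parse_ems_line(line)
        if entry and entry["severity"].lower() not in ignore:
            entries.append(entry)
    return entries


# Three lines copied from the excerpt above.
SAMPLE_LOG = [
    "Mon Dec 02 01:04:26 -0500 [node-01: nvmm_error: nvmm.mirror.offlined:debug]: "
    "params: {'mirror': 'HA_PARTNER'}",
    "Mon Dec 02 01:04:27 -0500 [node-01: cf_main: cf.fsm.takeoverByPartnerDisabled:error]: "
    "Failover monitor: takeover of node-01 by node-02 disabled (unsynchronized log).",
    "Tue Dec 03 12:35:00 -0500 [node-01: monitor: monitor.globalStatus.critical:EMERGENCY]: "
    "Controller failover of node-01 is not possible: unsynchronized log.",
]

for entry in notable_events(SAMPLE_LOG):
    print(entry["timestamp"], entry["event"], entry["severity"])
```

Filtering out `debug` entries leaves only the two lines that report takeover as disabled; pass a different `ignore` tuple to keep the mirror-state messages as well.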


 

  • The ifstat output shows CRC errors and error symbols on port e0d:

-- interface  e0d  (86 days, 5 hours, 53 minutes, 54 seconds) --

RECEIVE  
 Total frames:     3080m | Frames/second:     413  | Total bytes:     21281g
 Bytes/second:     2855k | Total errors:      293  | Errors/minute:       0
 Total discards:      0  | Discards/minute:     0  | Multi/broadcast: 45101k
 Non-primary u/c:     0  | Errored frames:      0  | Unsupported Op:      0
 CRC errors:        146  | Runt frames:         0  | Fragment:            1
 Long frames:         0  | Jabber:              0  | Length errors:       0
 Alignment errors:    0  | No buffer:           0  | Pause:               0
 Jumbo:            2376m | Error symbol:      146  | Bus overruns:        0
 Queue drops:         0  | LRO segments:     1246m | LRO bytes:       21082g
 LRO6 segments:       0  | LRO6 bytes:          0  | Bad UDP cksum:       0
 Bad UDP6 cksum:      0  | Bad TCP cksum:       0  | Bad TCP6 cksum:      0
 Mcast v6 solicit:    0  | Lagg errors:         0  | Lacp errors:         0
 Lacp PDU errors:     0
TRANSMIT
 Total frames:     1438m | Frames/second:     193  | Total bytes:       169g
 Bytes/second:    22739  | Total errors:        0  | Errors/minute:       0
 Total discards:      0  | Queue overflow:      0  | Multi/broadcast: 14818k
 Collisions:          0  | Pause:              48  | Jumbo:             340m
 Cfg Up to Downs:     2  | TSO segments:    82774  | TSO bytes:        1409m
 TSO6 segments:       0  | TSO6 bytes:          0  | HW UDP cksums:    7452k
 HW UDP6 cksums:      0  | HW TCP cksums:    1403m | HW TCP6 cksums:      0
 Mcast v6 solicit:    5  | Lagg drops:          0  | Lagg no buffer:      0
 Lagg no entries:     0
DEVICE
 Mcast addresses:     6  | Rx MBuf Sz:       9216
LINK INFO
 Speed:           25000M | Duplex:            full | Flowcontrol:       none
 Media state:     active | Up to downs:      20597 | HW assist:        5655
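
When several ports have to be compared, the `Label: value | Label: value` rows above can be read into a dictionary and checked for nonzero error counters. This is a rough sketch under stated assumptions: the function names are hypothetical, only the row layout shown in this article is handled, and values carrying a `k`, `m`, or `g` suffix are kept as text because their exact scaling is not stated here. Parse each section (RECEIVE, TRANSMIT) separately, since labels such as `Pause` and `Jumbo` appear in both.

```python
def parse_ifstat_section(lines):
    """Map each 'Label: value' cell of one ifstat section to its raw text value."""
    counters = {}
    for line in lines:
        for cell in line.split("|"):
            if ":" not in cell:
                continue
            label, _, value = cell.partition(":")
            counters[label.strip()] = value.strip()
    return counters


def nonzero_counters(counters, names=("CRC errors", "Error symbol", "Total errors")):
    """Return the named counters that are plain integers greater than zero."""
    flagged = {}
    for name in names:
        raw = counters.get(name)
        # Suffixed values such as '3080m' are skipped; their scaling is not assumed.
        if raw is not None and raw.isdigit() and int(raw) > 0:
            flagged[name] = int(raw)
    return flagged


# Rows copied from the RECEIVE section of the e0d output above.
SAMPLE_RECEIVE = [
    " Total frames:     3080m | Frames/second:     413  | Total bytes:     21281g",
    " Bytes/second:     2855k | Total errors:      293  | Errors/minute:       0",
    " CRC errors:        146  | Runt frames:         0  | Fragment:            1",
    " Jumbo:            2376m | Error symbol:      146  | Bus overruns:        0",
]

receive = parse_ifstat_section(SAMPLE_RECEIVE)
print(nonzero_counters(receive))
```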

  • Switch logs show receive errors on switch port 0/1:

Port              InOctets      InUcastPkts      InMcastPkts      InBcastPkts       InDropPkts         Rx Error
--------- ---------------- ---------------- ---------------- ---------------- ---------------- ----------------
0/1          1747429110877       3619349837          7700076           257725            74546             1586
0/2         19941079199680       6150430088          7820711           256177           160171                0
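
The same check can be applied to the switch-side counters. The sketch below assumes the column layout shown in the table above (header columns separated by two or more spaces, one row per port) and uses hypothetical function names; it is an illustration, not a switch utility.

```python
import re


def parse_port_table(text):
    """Parse the counter table into {port: {column name: integer value}}."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    # Split the header on runs of 2+ spaces so "Rx Error" stays one column name.
    header = re.split(r"\s{2,}", lines[0].strip())
    rows = {}
    for ln in lines[1:]:
        if set(ln.strip()) <= {"-", " "}:
            continue  # skip the dashed separator row
        cells = ln.split()
        if len(cells) != len(header):
            continue  # skip rows that do not match the header width
        rows[cells[0]] = dict(zip(header[1:], map(int, cells[1:])))
    return rows


def ports_with_rx_errors(rows):
    """Return {port: Rx Error count} for ports with a nonzero count."""
    return {port: c["Rx Error"] for port, c in rows.items() if c["Rx Error"] > 0}


# Table copied from the switch output above.
SAMPLE_TABLE = """\
Port              InOctets      InUcastPkts      InMcastPkts      InBcastPkts       InDropPkts         Rx Error
--------- ---------------- ---------------- ---------------- ---------------- ---------------- ----------------
0/1          1747429110877       3619349837          7700076           257725            74546             1586
0/2         19941079199680       6150430088          7820711           256177           160171                0
"""

print(ports_with_rx_errors(parse_port_table(SAMPLE_TABLE)))
```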

 

 

 


NetApp provides no representations or warranties regarding the accuracy or reliability or serviceability of any information or recommendations provided in this publication or with respect to any results that may be obtained by the use of the information or observance of any recommendations provided herein. The information in this document is distributed AS IS and the use of this information or the implementation of any recommendations or techniques herein is a customer's responsibility and depends on the customer's ability to evaluate and integrate them into the customer's operational environment. This document and the information contained herein may be used solely in connection with the NetApp products discussed in this document.