EMS reports "cf.fsm.takeoverByPartnerDisabled"
Applies to
- ONTAP 9
- AFF/FAS systems with HA Interconnect cables
Issue
- EMS reports cf.fsm.takeoverByPartnerDisabled due to an unsynchronized log, followed shortly afterwards by messages indicating that failover is enabled again:
Thu Dec 02 03:56:45 +0900 [node01: cf_main: cf.fsm.takeoverByPartnerDisabled:error]: Failover monitor: takeover of node01 by node02 disabled (unsynchronized log).
Thu Dec 02 03:56:48 +0900 [node01: nvmm_mirror_sync: nvmm.mirror.state.change:debug]: mirror of sysid 1, partner_type HA Partner, changed state from NVMM_MIRROR_LAYOUT_SYNCING to NVMM_MIRROR_LAYOUT_SYNCED and took 1 msecs.
Thu Dec 02 03:56:48 +0900 [node01: nvmm_mirror_sync: nvmm.mirror.state.change:debug]: mirror of sysid 1, partner_type HA Partner, changed state from NVMM_MIRROR_LAYOUT_SYNCED to NVMM_MIRROR_SYNCING_START and took 0 msecs.
Thu Dec 02 03:56:48 +0900 [node01: nvmm_mirror_sync: nvmm.mirror.aborting:debug]: mirror of sysid 1, partner_type HA Partner and mirror state NVMM_MIRROR_SYNCING_START is aborted because of reason NVMM_ERR_STREAM_MAP.
Thu Dec 02 03:56:48 +0900 [node01: nvmm_error: nvmm.mirror.aborting:debug]: mirror of sysid 1, partner_type HA Partner and mirror state NVMM_MIRROR_OFFLINE is aborted because of reason NVMM_ABORT_SYNCING_MIRROR.
Thu Dec 02 03:56:48 +0900 [node01: ib_cm_13: rdma.rlib.connected:debug]: misc:HA:P QP is now connected.
Thu Dec 02 03:56:48 +0900 [node01: ib_cm_1: rdma.rlib.connected:debug]: wafl:HA:P QP is now connected.
Thu Dec 02 03:56:48 +0900 [node01: ib_cm_14: rdma.rlib.connected:debug]: raid:HA:P QP is now connected.
Thu Dec 02 03:56:48 +0900 [node01: ib_cm_17: rdma.rlib.connected:debug]: misc:HA:P QP is now connected.
Thu Dec 02 03:56:49 +0900 [node01: cf_main: cf.fsm.takeoverOfPartnerEnabled:notice]: Failover monitor: takeover of node02 enabled
Thu Dec 02 03:56:49 +0900 [node01: cf_main: cf.fsm.takeoverByPartnerEnabled:notice]: Failover monitor: takeover of node01 by node02 enabled
- The interconnect link on the node might be flapping (repeated link down/link up events), for example:
Thu Dec 02 03:56:45 +0900 [node01: kernel: netif.linkDown:info]: Ethernet e0a: Link down, check cable.
Thu Dec 02 03:56:47 +0900 [node01: kernel: netif.linkUp:info]: Ethernet e0a: Link up.
Thu Dec 02 04:20:30 +0900 [node01: kernel: netif.linkDown:info]: Ethernet e0a: Link down, check cable.
Thu Dec 02 04:20:32 +0900 [node01: kernel: netif.linkUp:info]: Ethernet e0a: Link up.
- The HA interconnect port might report a high number of CRC errors:
-- interface e0h (0 days, 1 hours, 3 minutes, 9 seconds) --
RECEIVE
Total frames: 1534m | Frames/second: 3208 | Total bytes: 2236g
Bytes/second: 4674k | Total errors: 1845 | Errors/minute: 0
Total discards: 0 | Discards/minute: 0 | Multi/broadcast: 31966k
Non-primary u/c: 0 | CRC errors: 1843 | Runt frames: 0
Long frames: 2 | Length errors: 101 | Alignment errors: 0
No buffer: 0 | Pause: 0 | Jumbo: 179m
Noproto: 0 | Bus overruns: 0 | LRO segments: 14947m
LRO bytes: 2012g | LRO6 segments: 0 | LRO6 bytes: 0
Bad UDP cksum: 0 | Bad UDP6 cksum: 0 | Bad TCP cksum: 0
Bad TCP6 cksum: 0 | Mcast v6 solicit: 0 | Lagg errors: 0
Lacp errors: 0 | Lacp PDU errors: 0
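
The symptoms above can be cross-checked from an exported EMS log. The following is a minimal Python sketch, not a NetApp tool. It assumes the log lines follow the layout of the excerpts above, pairs each netif.linkDown with the next netif.linkUp on the same port, and lists the cf.fsm.takeoverByPartnerDisabled events that fall within a few seconds of a flap. The function names, the 5-second window, and the year (EMS lines do not carry one, so 2021 is only an assumed value for the sample) are illustrative assumptions.

```python
import re
from datetime import datetime, timedelta

# Layout assumed from the excerpts above:
# "<dow> <mon> <dd> <hh:mm:ss> <tz> [<node>: <process>: <event>:<severity>]: <message>"
EMS_LINE = re.compile(
    r"^\w{3} (?P<mon>\w{3}) (?P<day>\d{2}) (?P<time>\d{2}:\d{2}:\d{2}) "
    r"(?P<tz>[+-]\d{4}) "
    r"\[(?P<node>[^:]+): (?P<proc>[^:]+): (?P<event>[\w.]+):(?P<sev>\w+)\]: "
    r"(?P<msg>.*)$"
)
PORT = re.compile(r"Ethernet (\w+):")


def parse_ems(lines, year):
    """Return (timestamp, event, message) tuples; non-matching lines are skipped.

    The year must be supplied by the caller because the EMS lines omit it.
    """
    events = []
    for line in lines:
        m = EMS_LINE.match(line.strip())
        if not m:
            continue
        stamp = f"{year} {m['mon']} {m['day']} {m['time']} {m['tz']}"
        ts = datetime.strptime(stamp, "%Y %b %d %H:%M:%S %z")
        events.append((ts, m["event"], m["msg"]))
    return events


def link_flaps(events):
    """Pair each netif.linkDown with the next netif.linkUp on the same port."""
    pending = {}  # port -> time of the most recent unmatched link down
    flaps = []
    for ts, event, msg in events:
        port = PORT.search(msg)
        if not port:
            continue
        name = port.group(1)
        if event == "netif.linkDown":
            pending[name] = ts
        elif event == "netif.linkUp" and name in pending:
            flaps.append((name, pending.pop(name), ts))
    return flaps


def takeover_disabled_near_flaps(events, flaps, window=timedelta(seconds=5)):
    """List takeoverByPartnerDisabled events that occur within `window` of a flap."""
    hits = []
    for ts, event, _msg in events:
        if event != "cf.fsm.takeoverByPartnerDisabled":
            continue
        for port, t_down, t_up in flaps:
            if t_down - window <= ts <= t_up + window:
                hits.append((ts, port))
    return hits


# Sample lines copied from the excerpts above.
SAMPLE = [
    "Thu Dec 02 03:56:45 +0900 [node01: cf_main: cf.fsm.takeoverByPartnerDisabled:error]: Failover monitor: takeover of node01 by node02 disabled (unsynchronized log).",
    "Thu Dec 02 03:56:45 +0900 [node01: kernel: netif.linkDown:info]: Ethernet e0a: Link down, check cable.",
    "Thu Dec 02 03:56:47 +0900 [node01: kernel: netif.linkUp:info]: Ethernet e0a: Link up.",
    "Thu Dec 02 03:56:49 +0900 [node01: cf_main: cf.fsm.takeoverByPartnerEnabled:notice]: Failover monitor: takeover of node01 by node02 enabled",
    "Thu Dec 02 04:20:30 +0900 [node01: kernel: netif.linkDown:info]: Ethernet e0a: Link down, check cable.",
    "Thu Dec 02 04:20:32 +0900 [node01: kernel: netif.linkUp:info]: Ethernet e0a: Link up.",
]

if __name__ == "__main__":
    evts = parse_ems(SAMPLE, year=2021)  # assumed year for the sample only
    found = link_flaps(evts)
    for when, port in takeover_disabled_near_flaps(evts, found):
        print(f"{when.isoformat()} takeover disabled during a link flap on {port}")
```

A match only shows that the two events coincide in time. It does not by itself establish that the link flap caused the unsynchronized log, so treat the output as a pointer for where to look next.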