CONTAP-334525: FAS28x0 takeover disabled intermittently for unsynced logs with NVIDIA SN2100 switches
Issue
- One or more nodes experience an unexpected reboot and report a panic similar to:
PANIC: irdma_request_reset:83 requesting pf-reset in process defer_thr
- EMS shows unsynchronized logs, with takeover repeatedly disabled and then re-enabled:
Mon Sep 02 01:15:06 +0100 cf.fsm.takeoverOfPartnerDisabled:error]: Failover monitor: takeover of dmstorage-03 disabled (unsynchronized log).
Mon Sep 02 01:15:09 +0100 cf.fsm.takeoverOfPartnerEnabled:notice]: Failover monitor: takeover of dmstorage-03 enabled
Mon Sep 02 02:01:08 +0100 cf.fsm.takeoverOfPartnerDisabled:error]: Failover monitor: takeover of dmstorage-03 disabled (unsynchronized log).
Mon Sep 02 02:01:14 +0100 cf.fsm.takeoverOfPartnerEnabled:notice]: Failover monitor: takeover of dmstorage-03 enabled
Mon Sep 02 04:28:37 +0100 cf.fsm.takeoverOfPartnerDisabled:error]: Failover monitor: takeover of dmstorage-03 disabled (unsynchronized log).
Mon Sep 02 04:28:40 +0100 cf.fsm.takeoverOfPartnerEnabled:notice]: Failover monitor: takeover of dmstorage-03 enabled
- The cluster is a switched cluster using NVIDIA SN2100 switches
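
To check whether a saved EMS log shows the same disable/enable pattern, the events can be paired up and the length of each takeover-disabled window measured. The following is a minimal sketch, not a NetApp tool. It assumes the log has been exported to plain text in the same layout as the excerpt above. The function name `takeover_gaps` and the `year` parameter are hypothetical, and `year` is needed only because the excerpt's timestamps carry no year.

```python
import re
from datetime import datetime

# Matches the two EMS events from the excerpt. The layout is assumed to be
# "<weekday> <month> <day> <time> <utc offset> ... cf.fsm.takeoverOfPartner<State>: ..."
EVENT_RE = re.compile(
    r"^\w{3} (?P<ts>\w{3} \d{2} \d{2}:\d{2}:\d{2} [+-]\d{4}) "
    r".*?cf\.fsm\.takeoverOfPartner(?P<state>Disabled|Enabled):"
    r".*?takeover of (?P<partner>\S+) (?:disabled|enabled)"
)


def takeover_gaps(lines, year):
    """Return (partner, disabled_at, seconds_disabled) for each disable/enable pair.

    `year` is an assumption supplied by the caller; it does not affect the
    computed durations unless a window spans a year boundary.
    """
    pending = {}  # partner name -> time takeover was disabled
    gaps = []
    for line in lines:
        match = EVENT_RE.search(line.strip())
        if match is None:
            continue  # not one of the two takeover events
        when = datetime.strptime(
            f"{year} {match['ts']}", "%Y %b %d %H:%M:%S %z"
        )
        partner = match["partner"]
        if match["state"] == "Disabled":
            pending[partner] = when
        elif partner in pending:
            start = pending.pop(partner)
            gaps.append((partner, start, (when - start).total_seconds()))
    return gaps


# The six EMS lines from the excerpt above, copied verbatim.
SAMPLE = """\
Mon Sep 02 01:15:06 +0100 cf.fsm.takeoverOfPartnerDisabled:error]: Failover monitor: takeover of dmstorage-03 disabled (unsynchronized log).
Mon Sep 02 01:15:09 +0100 cf.fsm.takeoverOfPartnerEnabled:notice]: Failover monitor: takeover of dmstorage-03 enabled
Mon Sep 02 02:01:08 +0100 cf.fsm.takeoverOfPartnerDisabled:error]: Failover monitor: takeover of dmstorage-03 disabled (unsynchronized log).
Mon Sep 02 02:01:14 +0100 cf.fsm.takeoverOfPartnerEnabled:notice]: Failover monitor: takeover of dmstorage-03 enabled
Mon Sep 02 04:28:37 +0100 cf.fsm.takeoverOfPartnerDisabled:error]: Failover monitor: takeover of dmstorage-03 disabled (unsynchronized log).
Mon Sep 02 04:28:40 +0100 cf.fsm.takeoverOfPartnerEnabled:notice]: Failover monitor: takeover of dmstorage-03 enabled
""".splitlines()

if __name__ == "__main__":
    # 2024 is an assumed year used only to build complete timestamps.
    for partner, start, seconds in takeover_gaps(SAMPLE, year=2024):
        print(f"{partner}: takeover disabled at {start:%H:%M:%S} for {seconds:.0f}s")
```

Short windows of a few seconds that recur at irregular intervals, as in the excerpt, match the intermittent behavior described in this article. A window that never closes would point to a different condition.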