NetApp Knowledge Base

ONTAP Select storage failover shows: Mailbox disks are not healthy

Category:
ontap-select
Specialty:
virt
Last Updated:
6/23/2025, 8:24:40 AM

Applies to

  • ONTAP Select
  • ONTAP Select Deploy
  • Takeover/Giveback
  • Mailbox disks

Issue

  • Takeover is not possible, and storage failover show reports an error for the mailbox disks:
::*> storage failover show
                              Takeover
Node           Partner        Possible State Description
-------------- -------------- -------- -------------------------------------
node-01        node-02        false    Connected to node-02, Takeover
                                       is not possible: Mailbox disks are not healthy
node-02        node-01        false    Connected to node-01, Takeover
                                       is not possible: Mailbox disks are not healthy
2 entries were displayed.
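
To check several clusters for this symptom, the pasted output can be scanned by a script. The following is a minimal sketch, assuming the output of storage failover show has been saved as text. The function names parse_failover_show and nodes_with_unhealthy_mailbox are hypothetical helpers and are not part of ONTAP or any NetApp tool.

```python
import re

# Sample copied from the output shown in this article.
SAMPLE = """\
                              Takeover
Node           Partner        Possible State Description
-------------- -------------- -------- -------------------------------------
node-01        node-02        false    Connected to node-02, Takeover
                                       is not possible: Mailbox disks are not healthy
node-02        node-01        false    Connected to node-01, Takeover
                                       is not possible: Mailbox disks are not healthy
2 entries were displayed.
"""

# A data row starts with node, partner, and a true/false/- takeover flag.
ROW_RE = re.compile(r"^(\S+)\s+(\S+)\s+(true|false|-)\s+(.*)$")


def parse_failover_show(text):
    """Return {node: {"partner", "possible", "description"}} from pasted output."""
    rows = {}
    current = None
    for line in text.splitlines():
        match = ROW_RE.match(line)
        if match:
            node, partner, possible, desc = match.groups()
            rows[node] = {
                "partner": partner,
                "possible": possible == "true",
                "description": desc.strip(),
            }
            current = node
        elif current and line.startswith(" ") and line.strip():
            # Indented lines continue the wrapped State Description column.
            rows[current]["description"] += " " + line.strip()
    return rows


def nodes_with_unhealthy_mailbox(text):
    """List nodes whose description reports unhealthy mailbox disks."""
    return [
        node
        for node, row in parse_failover_show(text).items()
        if "Mailbox disks are not healthy" in row["description"]
    ]


if __name__ == "__main__":
    print(nodes_with_unhealthy_mailbox(SAMPLE))
```

The parser joins the wrapped description lines before matching, because the error text sits on the continuation line rather than on the row that names the node.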
 
  • The mediator mailbox disks show container type broken instead of the expected container type mailbox:

::*> disk show
                     Usable           Disk    Container   Container
Disk                   Size Shelf Bay Type    Type        Name          Owner
---------------- ---------- ----- --- ------- ----------- ---------     --------
NET-1.1             66.93GB     -   - VMDISK  aggregate   node_01_root  node-01
NET-1.2              1.48TB     -   - VMDISK  aggregate   node_01_SAS01 node-01
NET-1.3              1.48TB     -   - VMDISK  aggregate   node_01_SAS01 node-01
NET-1.4             66.93GB     -   - VMDISK  aggregate   node_02_ROOT  node-02
NET-1.6              1007GB     -   - VMDISK  aggregate   node_01_SAS01 node-01
NET-1.7              1.67TB     -   - VMDISK  aggregate   node_01_SAS01 node-01
NET-2.1                   -     -   - VMDISK  broken      -             node-01
NET-2.2                   -     -   - VMDISK  broken      -             node-02
NET-3.1             66.93GB     -   - VMDISK  aggregate   node_02_ROOT  node-02
NET-3.2              1.48TB     -   - VMDISK  aggregate   node_01_SAS01 node-01
NET-3.3              1.48TB     -   - VMDISK  aggregate   node_01_SAS01 node-01
NET-3.4             66.93GB     -   - VMDISK  aggregate   node_01_root  node-01
NET-3.6              1007GB     -   - VMDISK  aggregate   node_01_SAS01 node-01
NET-3.7              1.67TB     -   - VMDISK  aggregate   node_01_SAS01 node-01
14 entries were displayed.
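
The broken entries are easy to miss in a long disk list. The following is a minimal sketch, assuming the disk show output has been saved as text and that every data row has the eight whitespace-separated columns shown above. The function name broken_disks is a hypothetical helper, not an ONTAP command.

```python
# Shortened sample: header plus a few rows copied from the output in this article.
SAMPLE = """\
                     Usable           Disk    Container   Container
Disk                   Size Shelf Bay Type    Type        Name          Owner
---------------- ---------- ----- --- ------- ----------- ---------     --------
NET-1.1             66.93GB     -   - VMDISK  aggregate   node_01_root  node-01
NET-2.1                   -     -   - VMDISK  broken      -             node-01
NET-2.2                   -     -   - VMDISK  broken      -             node-02
NET-3.1             66.93GB     -   - VMDISK  aggregate   node_02_ROOT  node-02
"""


def broken_disks(text):
    """Return (disk, owner) pairs whose Container Type column is 'broken'."""
    found = []
    for line in text.splitlines():
        fields = line.split()
        # Skip the prompt, footer, header, and separator lines.
        if len(fields) != 8 or fields[0] == "Disk" or set(fields[0]) == {"-"}:
            continue
        disk, _size, _shelf, _bay, _disk_type, container_type, _name, owner = fields
        if container_type == "broken":
            found.append((disk, owner))
    return found


if __name__ == "__main__":
    for disk, owner in broken_disks(SAMPLE):
        print(f"{disk} (owner {owner}) is in container type broken")
```

Splitting on whitespace works here only because no column value in this output contains a space. Output with different columns would need a different parser.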

  • The event logs report the following events:
     
    Fri May 02 16:15:47 +0000 [hkdc1otscluster-01: kernel: iscsi.session.stateChanged:notice]: iSCSI session state is changed to Reconnecting for the target iqn.2012-05.local:mailbox.target.select000000 (type: mailbox, address: 10.1.5.221). Reason: no ping reply after 5 seconds.
    Fri May 02 16:15:57 +0000 [hkdc1otscluster-01: kernel: iscsi.session.stateChanged:notice]: iSCSI session state is changed to Reconnecting (destroy sim) for the target iqn.2012-05.local:mailbox.target.select000000 (type: mailbox, address: 10.1.5.221). Reason: session login timed out 2 times.
    Fri May 02 16:15:57 +0000 [hkdc1otscluster-01: fmmbx_instanceWorker: fmmb.discvryHintsInstance:debug]: params: {'action': 'write', 'host': 'Local', 'mailboxId': 'ee4bbcd7-003e-11ee-8400-00a0b8f80de8', 'sequenceNo': '48428796', 'startTime': '51832969502', 'finishTime': '51832969515'}
    Fri May 02 16:15:57 +0000 [hkdc1otscluster-01: fmmbx_instanceWorker: fmmb.lock.disk.remove:info]: Disk 0f.21 removed from Local mailbox set.
    Fri May 02 16:15:57 +0000 [hkdc1otscluster-01: fmmbx_instanceWorker: fmmb.current.lock.disk:info]: Disk 0b.20 is a Local HA mailbox disk.
    Fri May 02 16:15:57 +0000 [hkdc1otscluster-01: fmmbx_instanceWorker: fmmb.current.lock.disk:info]: Disk 0d.24 is a Local HA mailbox disk.
    Fri May 02 16:15:57 +0000 [hkdc1otscluster-01: cf_main: cf.fsm.takeoverOfPartnerDisabled:error]: Failover monitor: takeover of hkdc1otscluster-02 disabled (Mailbox disks are not healthy).
    Fri May 02 16:15:57 +0000 [hkdc1otscluster-01: cf_main: cf.fsm.takeoverByPartnerDisabled:error]: Failover monitor: takeover of hkdc1otscluster-01 by hkdc1otscluster-02 disabled (Mailbox disks are not healthy).
    Fri May 02 16:16:00 +0000 [hkdc1otscluster-01: fmmbx_instanceWorker: fmmb.lock.disk.remove:info]: Disk 0f.22 removed from Partner mailbox set.
    Fri May 02 16:16:00 +0000 [hkdc1otscluster-01: fmmbx_instanceWorker: fmmb.skipReadRetry:debug]: params: {'side': 'Partner', 'mbx_status': 'Mailbox status Backup'}
    Fri May 02 16:16:00 +0000 [hkdc1otscluster-01: monitor: monitor.globalStatus.critical:EMERGENCY]: Controller failover of hkdc1otscluster-02 is not possible: Mailbox disks are not healthy.
    Fri May 02 16:16:13 +0000 [hkdc1otscluster-01: iscsid: iscsi.session.stateChanged:notice]: iSCSI session state is changed to Connected for the target iqn.2012-05.local:mailbox.target.select000000 (type: mailbox, address: 10.1.5.221).
    Fri May 02 16:16:28 +0000 [hkdc1otscluster-01: fmmbx_instanceWorker: fmmb.discvryHintsInstance:debug]: params: {'action': 'write', 'host': 'Local', 'mailboxId': 'ee4bbcd7-003e-11ee-8400-00a0b8f80de8', 'sequenceNo': '48428826', 'startTime': '51833000536', 'finishTime': '51833000545'}
    
    Fri May 02 16:15:57 +0000 [hkdc1otscluster-02: cf_main: cf.fsm.takeoverOfPartnerDisabled:error]: Failover monitor: takeover of hkdc1otscluster-01 disabled (Mailbox disks are not healthy).
    Fri May 02 16:16:00 +0000 [hkdc1otscluster-02: monitor: monitor.globalStatus.critical:EMERGENCY]: Controller failover of hkdc1otscluster-01 is not possible: Mailbox disks are not healthy.
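
The iSCSI session events show when the mailbox target was lost and when it reconnected. The following is a minimal sketch, assuming the log lines keep the format shown above. It extracts the iscsi.session.stateChanged events for mailbox targets and measures the gap between the first disconnect and the next reconnect. The regular expressions and the function names mailbox_session_changes and outage_seconds are illustrative assumptions, not a NetApp-provided parser.

```python
import re
from datetime import datetime

# Three event lines copied from the log excerpt in this article.
SAMPLE_LOG = """\
Fri May 02 16:15:47 +0000 [hkdc1otscluster-01: kernel: iscsi.session.stateChanged:notice]: iSCSI session state is changed to Reconnecting for the target iqn.2012-05.local:mailbox.target.select000000 (type: mailbox, address: 10.1.5.221). Reason: no ping reply after 5 seconds.
Fri May 02 16:15:57 +0000 [hkdc1otscluster-01: kernel: iscsi.session.stateChanged:notice]: iSCSI session state is changed to Reconnecting (destroy sim) for the target iqn.2012-05.local:mailbox.target.select000000 (type: mailbox, address: 10.1.5.221). Reason: session login timed out 2 times.
Fri May 02 16:16:13 +0000 [hkdc1otscluster-01: iscsid: iscsi.session.stateChanged:notice]: iSCSI session state is changed to Connected for the target iqn.2012-05.local:mailbox.target.select000000 (type: mailbox, address: 10.1.5.221).
"""

# Timestamp, then [node: source: event:severity]: message
EVENT_RE = re.compile(
    r"^\s*(?P<ts>\w{3} \w{3} \d{2} \d{2}:\d{2}:\d{2} [+-]\d{4}) "
    r"\[(?P<node>[^:]+): (?P<source>[^:]+): (?P<event>[\w.]+):(?P<severity>\w+)\]: "
    r"(?P<message>.*)$"
)
STATE_RE = re.compile(
    r"iSCSI session state is changed to (?P<state>.+?) for the target "
    r"(?P<target>\S+) \(type: (?P<type>\w+), address: (?P<address>[^)]+)\)"
)


def mailbox_session_changes(log_text):
    """Return (time, node, state, address) for each mailbox session change."""
    changes = []
    for line in log_text.splitlines():
        event = EVENT_RE.match(line)
        if not event or event["event"] != "iscsi.session.stateChanged":
            continue
        state = STATE_RE.search(event["message"])
        if state and state["type"] == "mailbox":
            when = datetime.strptime(event["ts"], "%a %b %d %H:%M:%S %z")
            changes.append((when, event["node"], state["state"], state["address"]))
    return changes


def outage_seconds(changes):
    """Seconds from the first non-Connected state to the next Connected state."""
    start = None
    for when, _node, state, _address in changes:
        if state != "Connected" and start is None:
            start = when
        elif state == "Connected" and start is not None:
            return (when - start).total_seconds()
    return None


if __name__ == "__main__":
    changes = mailbox_session_changes(SAMPLE_LOG)
    for when, node, state, address in changes:
        print(when.isoformat(), node, state, address)
    print("outage seconds:", outage_seconds(changes))
```

For the three sample lines, the session leaves the connected state at 16:15:47 and returns at 16:16:13, a gap of 26 seconds. The takeover-disabled events in the excerpt fall inside that window.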
    
  • The mailbox iSCSI connections to the ONTAP Select Deploy IP address (10.0.0.1 in this example) show as up/up:
::*> storage iscsi-initiator show
                                                                                Status
Node    Type     Label    Target Portal        Target Name                      Admin/Op
----    ----     -------- ------------------   -------------------------------- --------
node-01 mailbox  f7fa0d14-f4c5-11e9-8715-005056bc73cb-mailbox
                          10.0.0.1             iqn.2012-05.local:mailbox.target.select000002
                                                                                up/up
        partner  f7faeaa4-f4c5-11e9-8715-005056bc73cb-partner
                          169.254.23.163:65200 iqn.2012-06.com.bsdctl:target0   up/up
        partner2 f7faeaa4-f4c5-11e9-8715-005056bc73cb-partner2
                          169.254.23.163:65200 iqn.2012-06.com.bsdctl:target0   up/up
node-02 mailbox  f7faeaa4-f4c5-11e9-8715-005056bc73cb-mailbox
                          10.0.0.1             iqn.2012-05.local:mailbox.target.select000002
                                                                                up/up
        partner  f7fa0d14-f4c5-11e9-8715-005056bc73cb-partner
                          169.254.23.12:65200  iqn.2012-06.com.bsdctl:target0   up/up
        partner2 f7fa0d14-f4c5-11e9-8715-005056bc73cb-partner2
                          169.254.23.12:65200  iqn.2012-06.com.bsdctl:target0   up/up
6 entries were displayed.

NetApp provides no representations or warranties regarding the accuracy or reliability or serviceability of any information or recommendations provided in this publication or with respect to any results that may be obtained by the use of the information or observance of any recommendations provided herein. The information in this document is distributed AS IS and the use of this information or the implementation of any recommendations or techniques herein is a customer's responsibility and depends on the customer's ability to evaluate and integrate them into the customer's operational environment. This document and the information contained herein may be used solely in connection with the NetApp products discussed in this document.