NetApp Knowledge Base

HA-pair Panic event and reboot due to faulty power on disk shelf

Category:
ontap-9
Specialty:
HW

Applies to

  • ONTAP 9
  • NS224 Disk shelves

Issue

  • Both nodes in an HA pair encounter a multidisk error and reboot after losing access to their disks:

Sat Nov 25 10:17:05 +0000 [netapp01-01: fmmbx_instanceWorker: cf.multidisk.fatalProblem:error]: Node encountered a multidisk error or other fatal error while waiting to be taken over. Permanent errors on all HA mailbox disks (while marshalling header).

Sat Nov 25 10:17:06 +0000 [netapp01-02: fmmbx_instanceWorker: sk.panic:alert]: Panic String: Permanent errors on all HA mailbox disks (while marshalling header) in SK process fmmbx_instanceWorker on release 9.11.1P8 (C)

  • Link-down alerts are logged for the storage ports connected to the shelf:

Sat Nov 25 10:15:39 +0000 [netapp01-01: kernel: netif.linkDown:info]: Ethernet e10b: Link down, check cable.
Sat Nov 25 10:15:39 +0000 [netapp01-01: intr: netif.linkDown:info]: Ethernet e10b-30: Link down, check cable.
Sat Nov 25 10:15:39 +0000 [netapp01-01: kernel: netif.linkDown:info]: Ethernet e2a: Link down, check cable.
Sat Nov 25 10:15:39 +0000 [netapp01-01: intr: netif.linkDown:info]: Ethernet e2a-30: Link down, check cable.

Sat Nov 25 10:15:39 +0000 [netapp01-02: kernel: netif.linkDown:info]: Ethernet e2a: Link down, check cable.
Sat Nov 25 10:15:39 +0000 [netapp01-02: intr: netif.linkDown:info]: Ethernet e2a-30: Link down, check cable.
Sat Nov 25 10:15:39 +0000 [netapp01-02: kernel: netif.linkDown:info]: Ethernet e10b: Link down, check cable.
Sat Nov 25 10:15:39 +0000 [netapp01-02: intr: netif.linkDown:info]: Ethernet e10b-30: Link down, check cable.

  • Shelf power faults might not appear in the AutoSupport EMS logs
  • Shelf logs report PSU alerts from the power manager:

Sat Nov 25 10:14:42 2023 (  148+23:59:17.135); 030B0060; S1; ENC_MGT; power_manager; 04; PCM 2 local fan power restored
Sat Nov 25 10:14:42 2023 (  148+23:59:17.135); 030B0084; S1; ENC_MGT; power_manager; 02; Clearing PSU AC Missing (non-redundant) alarm
Sat Nov 25 10:14:43 2023 (  148+23:59:18.126); 030B005C; S1; ENC_MGT; power_manager; 04; PCM 2 fault cleared, assume power restored (1600W)
Sat Nov 25 10:14:43 2023 (  148+23:59:18.126); 030B0078; S1; ENC_MGT; power_manager; 02; Clearing PSU Fail (non-redundant) alarm
Sat Nov 25 10:14:51 2023 (  148+23:59:26.123); 030B006F; S1; ENC_MGT; power_manager; 02; PCM 1 DC FAILURE Fault Detected
Sat Nov 25 10:14:51 2023 (  148+23:59:26.123); 030B0072; S1; ENC_MGT; power_manager; 02; Setting FAIL MIN REDUNDANT alarm for PCM 1
Sat Nov 25 10:14:51 2023 (  148+23:59:26.123); 030B005B; S1; ENC_MGT; power_manager; 04; PCM 1 faults indicate loss of power (1600W)
Sat Nov 25 10:14:52 2023 (  148+23:59:27.124); 030B005C; S1; ENC_MGT; power_manager; 04; PCM 1 fault cleared, assume power restored (1600W)
Sat Nov 25 10:14:52 2023 (  148+23:59:27.124); 030B0076; S1; ENC_MGT; power_manager; 02; Clearing PSU Fail (min-redundant) alarm
Sat Nov 25 10:14:55 2023 (  148+23:59:30.135); 030B006F; S1; ENC_MGT; power_manager; 02; PCM 2 PCM FAILURE Fault Detected
Sat Nov 25 10:14:55 2023 (  148+23:59:30.135); 030B0072; S1; ENC_MGT; power_manager; 02; Setting FAIL MIN REDUNDANT alarm for PCM 2
Sat Nov 25 10:14:55 2023 (  148+23:59:30.135); 030B006F; S1; ENC_MGT; power_manager; 02; PCM 2 TURNED OFF Fault Detected
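
The shelf log excerpt above can be scanned programmatically to see which power modules (PCMs) reported faults and when. The following is a minimal Python sketch, not a NetApp tool. It assumes each line is made of seven "; "-separated fields, as in the sample, and that the first field starts with a ctime-style timestamp. The function names (parse_shelf_line, faulted_pcms) and the field names are hypothetical labels chosen for illustration; the meaning of the code, source, and level columns is not stated in this article.

```python
import re
from datetime import datetime

# Matches the module number in messages such as "PCM 1 DC FAILURE Fault Detected".
PCM_RE = re.compile(r"\bPCM (\d+)\b")


def parse_shelf_line(line):
    """Split one shelf log line into a dict, or return None if it does not fit.

    Assumed layout (taken from the sample lines only):
    timestamp (uptime); code; source; subsystem; module; level; message
    """
    fields = line.strip().split("; ", 6)
    if len(fields) != 7:
        return None
    stamp, _, _uptime = fields[0].partition(" (")
    try:
        when = datetime.strptime(stamp.strip(), "%a %b %d %H:%M:%S %Y")
    except ValueError:
        return None
    match = PCM_RE.search(fields[6])
    return {
        "time": when,
        "code": fields[1],
        "module": fields[4],
        "message": fields[6],
        # Some alarm lines name no PCM; those get None here.
        "pcm": int(match.group(1)) if match else None,
    }


def faulted_pcms(lines):
    """Return the sorted PCM numbers with a power_manager 'Fault Detected' line."""
    found = set()
    for line in lines:
        event = parse_shelf_line(line)
        if event is None or event["module"] != "power_manager":
            continue
        if "Fault Detected" in event["message"] and event["pcm"] is not None:
            found.add(event["pcm"])
    return sorted(found)
```

Run against the excerpt in this article, a scan like this would report faults on both PCM 1 and PCM 2 within a few seconds of each other, shortly before the link-down events on the storage ports.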

