NetApp Knowledge Base

CFBRIDGE-414: A watchdog reset on both NSMs causes loss of access to shelf X

Visibility:
Public
Category:
ontap-9
Specialty:
hw
Last Updated:
4/15/2025, 12:46:02 PM

Issue

  • The EMS log reports scsi.cmd.checkCondition errors (SCSI aborted command) on multiple disks, followed by successful retries:

 [Node: scsi_cmdblk_strthr_admin: scsi.cmd.checkCondition:error]: Disk device e5a.11.3.5L0: Check Condition: CDB 0x28:2183d84b:0001: Sense Data SCSI:aborted command -  (0xb - 0x90 0x2 0xfc)(2520).
 [Node: scsi_cmdblk_strthr_admin: scsi.cmd.checkCondition:error]: Disk device e5b.11.0.12L0: Check Condition: CDB 0x28:9ae4029e:0001: Sense Data SCSI:aborted command -  (0xb - 0x90 0x2 0xfc)(2669).
 [Node: scsi_cmdblk_strthr_admin: scsi.cmd.retrySuccess:debug]: Disk device e5a.11.1.3L0: request successful after retry #1/#0: cdb 0x28:de3dfca4:0001 (3538).
 [Node: scsi_cmdblk_strthr_admin: scsi.cmd.retrySuccess:debug]: Disk device e5b.11.0.5L0: request successful after retry #1/#0: cdb 0x28:2183d84e:0001 (3371).
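
The messages above can be summarized with a small script. The following is a minimal sketch in Python, assuming the EMS lines have exactly the layout shown above. The function name parse_check_conditions and the keys of the returned dictionaries are illustrative and not part of ONTAP. It extracts the device, the CDB opcode (0x28 is READ(10) in the SCSI command set) and the sense key (0xb is ABORTED COMMAND), and keeps the remaining bytes in the parentheses uninterpreted. Treating the leading component of the device name (e5a, e5b) as the port is an assumption.

```python
import re
from collections import Counter

# Matches the scsi.cmd.checkCondition lines quoted above (layout assumed).
CHECK_CONDITION = re.compile(
    r"scsi\.cmd\.checkCondition:\w+\]: Disk device (?P<device>\S+): "
    r"Check Condition: CDB (?P<opcode>0x[0-9a-fA-F]+):\S+ "
    r"Sense Data SCSI:(?P<text>.*?) -\s+"
    r"\((?P<key>0x[0-9a-fA-F]+) - (?P<rest>[^)]*)\)"
)


def parse_check_conditions(lines):
    """Return one dict per checkCondition line; other lines are skipped."""
    events = []
    for line in lines:
        match = CHECK_CONDITION.search(line)
        if match is None:
            continue
        events.append({
            "device": match["device"],
            # Assumption: the first dot-separated component is the port.
            "port": match["device"].split(".")[0],
            "opcode": int(match["opcode"], 16),
            "sense_key": int(match["key"], 16),
            "sense_text": match["text"].strip(),
            # Remaining bytes are kept raw; their meaning is not decoded here.
            "extra": [int(b, 16) for b in match["rest"].split()],
        })
    return events


SAMPLE_EMS = [
    "[Node: scsi_cmdblk_strthr_admin: scsi.cmd.checkCondition:error]: Disk device e5a.11.3.5L0: Check Condition: CDB 0x28:2183d84b:0001: Sense Data SCSI:aborted command -  (0xb - 0x90 0x2 0xfc)(2520).",
    "[Node: scsi_cmdblk_strthr_admin: scsi.cmd.checkCondition:error]: Disk device e5b.11.0.12L0: Check Condition: CDB 0x28:9ae4029e:0001: Sense Data SCSI:aborted command -  (0xb - 0x90 0x2 0xfc)(2669).",
    "[Node: scsi_cmdblk_strthr_admin: scsi.cmd.retrySuccess:debug]: Disk device e5a.11.1.3L0: request successful after retry #1/#0: cdb 0x28:de3dfca4:0001 (3538).",
]

events = parse_check_conditions(SAMPLE_EMS)
per_port = Counter(event["port"] for event in events)

for event in events:
    print(event["device"], hex(event["opcode"]), hex(event["sense_key"]), event["sense_text"])
print(dict(per_port))
```

Errors that appear on both ports (here e5a and e5b) at about the same time suggest a problem common to both paths rather than a single failing disk.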
 
  • The node rebooted due to a fatal multi-disk error after disks in shelf 11 were reported missing:

[Node: config_thread: raid.config.filesystem.disk.missing:info]: File system Disk /Node_SSD/plex0/rg2/e5b.11.2.1 Shelf 11 Bay 1 [NETAPP   X4020S173A15TNQF NA55] S/N [XXXXXXXXXXXXXX] UID [36313230:57B16662:00253845:00000002:00000000:00000000:00000000:00000000:00000000:00000000] is missing.
[Node: config_thread: raid.config.filesystem.disk.missing:info]: File system Disk /Node_SSD/plex0/rg2/e5a.11.1.2 Shelf 11 Bay 2 [NETAPP   X4020S173A15TNQF NA55] S/N [XXXXXXXXXXXXXX] UID [36313230:57B16664:00253845:00000002:00000000:00000000:00000000:00000000:00000000:00000000] is missing.
[Node: config_thread: cf.multidisk.fatalProblem:error]: Node encountered a multidisk error or other fatal error while waiting to be taken over. aggr Node_SSD: raid volfsm, fatal multi-disk error..  Raid type - raid_dp Group name plex0/rg2 state NORMAL. 8 disks failed in the group. Disk e5b.11.2.0 Shelf 11 Bay 0 [NETAPP   X4020S173A15TNQF NA55] S/N [XXXXXXXXXXXXX] UID [36313230:57B16661:00253845:00000002:00000000:00000000:00000000:00000000:00000000:00000000] error: no valid path to disk. Disk e5b.11.2.1 Shelf 11 Bay 1 [NETAPP   X4020S173A15TNQF NA55] S/N [XXXXXXXXXXXXX] UID [36313230:57B16662:00253845:00000002:00000000:00000000:00000000:00000000:00000000:00000000] error: disk does not exist. Disk e5a.11.1.2 Shelf 11 Bay 2 [NETAPP   X4020S173A15TNQF NA55] S/N 
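
To see how many disks each RAID group lost, the raid.config.filesystem.disk.missing messages above can be grouped by RAID group. This is a minimal sketch in Python, assuming the line layout shown above. The function name missing_by_raid_group is hypothetical. RAID-DP tolerates two failed disks per RAID group, so a group with more missing disks than that cannot stay online, which matches the "8 disks failed in the group" text in the fatal error.

```python
import re
from collections import defaultdict

# Matches the raid.config.filesystem.disk.missing lines quoted above (layout assumed).
DISK_MISSING = re.compile(
    r"raid\.config\.filesystem\.disk\.missing:\w+\]: File system Disk "
    r"(?P<path>\S+) Shelf (?P<shelf>\d+) Bay (?P<bay>\d+) "
)

RAID_DP_PARITY_DISKS = 2  # RAID-DP is double parity


def missing_by_raid_group(lines):
    """Map 'aggregate/plex/raidgroup' to a list of (shelf, bay, disk) tuples."""
    groups = defaultdict(list)
    for line in lines:
        match = DISK_MISSING.search(line)
        if match is None:
            continue
        # Path looks like /Node_SSD/plex0/rg2/e5b.11.2.1
        *group_parts, disk = match["path"].strip("/").split("/")
        groups["/".join(group_parts)].append(
            (int(match["shelf"]), int(match["bay"]), disk)
        )
    return dict(groups)


SAMPLE_MISSING = [
    "[Node: config_thread: raid.config.filesystem.disk.missing:info]: File system Disk /Node_SSD/plex0/rg2/e5b.11.2.1 Shelf 11 Bay 1 [NETAPP   X4020S173A15TNQF NA55] S/N [XXXXXXXXXXXXXX] UID [36313230:57B16662:00253845:00000002:00000000:00000000:00000000:00000000:00000000:00000000] is missing.",
    "[Node: config_thread: raid.config.filesystem.disk.missing:info]: File system Disk /Node_SSD/plex0/rg2/e5a.11.1.2 Shelf 11 Bay 2 [NETAPP   X4020S173A15TNQF NA55] S/N [XXXXXXXXXXXXXX] UID [36313230:57B16664:00253845:00000002:00000000:00000000:00000000:00000000:00000000:00000000] is missing.",
]

groups = missing_by_raid_group(SAMPLE_MISSING)
for name, disks in groups.items():
    over_limit = len(disks) > RAID_DP_PARITY_DISKS
    print(name, len(disks), "missing; exceeds RAID-DP tolerance:", over_limit)
```

Only two of the missing-disk messages are quoted in this article, so the sample stays within the RAID-DP limit; run against the full EMS log, the count for plex0/rg2 would be expected to reflect all of the failed disks.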

  • The shelf logs show "software watchdog detected fault" in both modules:

--------------------------------------------------------------
Shelflog start time: Sun Mar  9 09:15:18 GMT 2025
Controller Id: XXXXXXXXXXX
Channel: 0x Shelf: 11 Module type: NSM100 Firmware rev: 0305
Shelf product id: NS224NSM100
Shelf Serial Number: XXXXXXXXXXXXX
Module A Serial Number: XXXXXXXXXXX
Log ID: XXXXXXXXXXXXXX
Timestamp: Thu Mar 20 21:54:52 GMT 2025
--------------------------------------------------------------
EVENT LOGS
Timestamp Thu Mar 20 21:54:51 2025
(183+12:51:48.557)
Thu Mar 20 21:54:47 2025 (  183+12:51:45.089); 02000228; M0; HAL; hal; 02; Failure: software watchdog detected fault.
Thu Mar 20 21:54:47 2025 (  183+12:51:45.089); 02000229; M0; HAL; hal; 02; Failure info: Client "bridgeWdgClient" triggered wdg. tNow:3b364bc9h, tLast:3b35fa79h, interval:4e20h, failed:0h.
Thu Mar 20 21:54:47 2025 (  183+12:51:45.089); 02000263; M0; HAL; hal; 04; HAL_ProductCrashAndCoreIt: prior system(pkill -6 bio) status:0 pid:0
Thu Mar 20 21:54:47 2025 (  183+12:51:45.089); 02000263; M0; HAL; hal; 04; HAL_ProductCrashAndCoreIt: post system(pkill -6 bio) status:0 pid:3102921
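
The "Failure info" line above carries the values needed to see why the watchdog fired. The following is a minimal sketch in Python that decodes them. It assumes that the trailing "h" marks hexadecimal values, that the watchdog trips when tNow minus tLast exceeds interval, and that the counter is 32 bits wide; none of this is confirmed by the log itself, and the time unit is not stated. The function name decode_watchdog_info is hypothetical.

```python
import re

# Matches the "Failure info" shelf log line quoted above (layout assumed).
WDG_INFO = re.compile(
    r'Client "(?P<client>[^"]+)" triggered wdg\. '
    r"tNow:(?P<t_now>[0-9a-fA-F]+)h, tLast:(?P<t_last>[0-9a-fA-F]+)h, "
    r"interval:(?P<interval>[0-9a-fA-F]+)h, failed:(?P<failed>[0-9a-fA-F]+)h"
)


def decode_watchdog_info(line):
    """Return the decoded watchdog fields, or None if the line does not match."""
    match = WDG_INFO.search(line)
    if match is None:
        return None
    t_now = int(match["t_now"], 16)
    t_last = int(match["t_last"], 16)
    interval = int(match["interval"], 16)
    # Assumption: 32-bit counter, so mask the difference to handle wraparound.
    elapsed = (t_now - t_last) & 0xFFFFFFFF
    return {
        "client": match["client"],
        "elapsed": elapsed,
        "interval": interval,
        "overdue_by": elapsed - interval,
    }


SAMPLE_LINE = (
    "Thu Mar 20 21:54:47 2025 (  183+12:51:45.089); 02000229; M0; HAL; hal; 02; "
    'Failure info: Client "bridgeWdgClient" triggered wdg. '
    "tNow:3b364bc9h, tLast:3b35fa79h, interval:4e20h, failed:0h."
)

info = decode_watchdog_info(SAMPLE_LINE)
print(info)
# {'client': 'bridgeWdgClient', 'elapsed': 20816, 'interval': 20000, 'overdue_by': 816}
```

Under these assumptions, the client bridgeWdgClient last checked in 20816 time units before the fault, against an allowed interval of 20000.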

 


NetApp provides no representations or warranties regarding the accuracy or reliability or serviceability of any information or recommendations provided in this publication or with respect to any results that may be obtained by the use of the information or observance of any recommendations provided herein. The information in this document is distributed AS IS and the use of this information or the implementation of any recommendations or techniques herein is a customer's responsibility and depends on the customer's ability to evaluate and integrate them into the customer's operational environment. This document and the information contained herein may be used solely in connection with the NetApp products discussed in this document.