NetApp Knowledge Base

StorageGRID storage node stuck upgrading BaseOS due to hardware fault

Applies to

All StorageGRID Appliances

Issue

  • During a StorageGRID upgrade, a storage node is stuck at the Upgrading BaseOS step.
  • SSH into the node and confirm that it is in BaseOS:
    • A green root@SG prompt means the node is in BaseOS.
  • /var/log/syslog reports the following:
    • dockerd[3299]: Error starting daemon: Devices cgroup isn't mounted
    • kernel: [  482.217615] CPU: 15 PID: 0 Comm: swapper/15 Kdump: loaded Tainted: G           OE     4.19.0-18-amd64 #1 Debian 4.19.208-1+ntapB
      kernel: [  482.217616] Hardware name: Default string Default string/Default string, BIOS 0.12.0 01/16/2017
      kernel: [  482.217616] Call Trace:
      kernel: [  482.217619]  <IRQ>
      kernel: [  482.217627]  dump_stack+0x66/0x81
      kernel: [  482.217630]  nmi_cpu_backtrace.cold.4+0x13/0x50
      kernel: [  482.217635]  ? lapic_can_unplug_cpu+0x80/0x80
      kernel: [  482.217640]  nmi_trigger_cpumask_backtrace+0xf9/0x100
      kernel: [  482.217645]  __handle_sysrq.cold.9+0x45/0xf2
      kernel: [  482.217650]  fpgaIsr.cold.5+0xbb/0x168 [fpga_pci]
      kernel: [  482.217655]  __handle_irq_event_percpu+0x46/0x190
      kernel: [  482.217657]  handle_irq_event_percpu+0x30/0x80
      kernel: [  482.217659]  handle_irq_event+0x3c/0x60
      kernel: [  482.217661]  handle_edge_irq+0x97/0x1e0
      kernel: [  482.217665]  handle_irq+0x1f/0x30
      kernel: [  482.217667]  do_IRQ+0x49/0xe0
      kernel: [  482.217671]  common_interrupt+0xf/0xf
      kernel: [  482.217672]  </IRQ>
      kernel: [  482.217677] RIP: 0010:cpuidle_enter_state+0xb9/0x320
      kernel: [  482.217678] Code: e8 5c 5e b2 ff 80 7c 24 0b 00 74 17 9c 58 0f 1f 44 00 00 f6 c4 02 0f 85 3b 02 00 00 31 ff e8 6e ea b7 ff fb 66 0f 1f 44 00 00 <48> b8 ff ff ff ff f3 01 00 00 48 2b 1c 24 ba ff ff ff 7f 48 39 c3
      kernel: [  482.217679] RSP: 0018:ffffaafc4635be90 EFLAGS: 00000246 ORIG_RAX: ffffffffffffffde
      kernel: [  482.217681] RAX: ffff94e6fede2140 RBX: 0000007045edefd5 RCX: 000000000000001f
      kernel: [  482.217682] RDX: 0000007045edefd5 RSI: 0000000040000431 RDI: 0000000000000000
      kernel: [  482.217683] RBP: ffff94e6fedea628 R08: 0000000000000004 R09: 0000000000021a00
      kernel: [  482.217684] R10: 00000cba5b622362 R11: ffff94e6fede1128 R12: 0000000000000004
      kernel: [  482.217684] R13: ffffffff95eb7238 R14: 0000000000000004 R15: 0000000000000000
      kernel: [  482.217691]  do_idle+0x228/0x270
      kernel: [  482.217694]  cpu_startup_entry+0x6f/0x80
      kernel: [  482.217696]  start_secondary+0x1a4/0x200
      kernel: [  482.217699]  secondary_startup_64+0xa4/0xb0

  • /var/log/daemon.log reports the following:
    • ImportError: No module named pge_mgmt_api.helpers.run_cmd
  • /var/log/sga-docker.daemon.log reports the following:
    • /usr/bin/python: No module named netapp.platform_info
      ERROR: Failed to load user data
  • /var/log/messages reports the following:
    • kernel: [21954396.100230] sd 0:0:0:0: [sda] tag#17 FAILED Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
      kernel: [21954396.109232] sd 0:0:0:0: [sda] tag#17 CDB: Read(10) 28 00 01 dc e7 80 00 00 08 00
      kernel: [21954396.123110] sd 0:0:0:0: [sda] tag#18 FAILED Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
      kernel: [21954396.132108] sd 0:0:0:0: [sda] tag#18 CDB: Read(10) 28 00 01 dc e7 80 00 00 08 00
      kernel: [21954396.153728] sd 0:0:0:0: [sda] tag#5 FAILED Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
      kernel: [21954396.162641] sd 0:0:0:0: [sda] tag#5 CDB: Read(10) 28 00 00 ee 78 02 00 00 02 00
      root: [2023-11-25 16:18:25+00:00 PIU] mount: /mnt/pge-inac-part: can't read superblock on /dev/sda2.
      root: [2023-11-25 16:18:25+00:00 PIU] mount /dev/sda2 /mnt/pge-inac-part failed; trying again
      root: [2023-11-25 16:18:26+00:00 PIU] [root@SG:/] >>> mount /dev/sda2 /mnt/pge-inac-part
      root: [2023-11-25 16:18:26+00:00 PIU] mount: /mnt/pge-inac-part: can't read superblock on /dev/sda2.
      root: [2023-11-25 16:18:26+00:00 PIU] mount /dev/sda2 /mnt/pge-inac-part failed; trying again
      root: [2023-11-25 16:18:27+00:00 PIU] [root@SG:/] >>> mount /dev/sda2 /mnt/pge-inac-part
      root: [2023-11-25 16:18:27+00:00 PIU] mount: /mnt/pge-inac-part: can't read superblock on /dev/sda2.
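
The symptoms listed above span several log files, so it can help to check for all of them in one pass. The following is a minimal sketch in Python, not a NetApp tool: the signature labels and the function names (scan_lines, scan_file) are hypothetical, and the patterns are copied only from the excerpts shown in this article. A match indicates that a line resembles one of the reported symptoms; it does not by itself confirm a hardware fault.

```python
import re
import sys

# Patterns derived from the log excerpts in this article.
# The dictionary keys are illustrative labels, not NetApp terminology.
SIGNATURES = {
    "docker_cgroup": re.compile(
        r"Error starting daemon: Devices cgroup isn't mounted"),
    "missing_python_module": re.compile(r"No module named [\w.]+"),
    "scsi_bad_target": re.compile(
        r"\[sd[a-z]+\] tag#\d+ FAILED Result: hostbyte=DID_BAD_TARGET"),
    "superblock_unreadable": re.compile(
        r"can't read superblock on /dev/\S+"),
}


def scan_lines(lines):
    """Return a dict mapping each signature label to its match count."""
    counts = {name: 0 for name in SIGNATURES}
    for line in lines:
        for name, pattern in SIGNATURES.items():
            if pattern.search(line):
                counts[name] += 1
    return counts


def scan_file(path):
    """Scan one log file, tolerating undecodable bytes."""
    with open(path, errors="replace") as handle:
        return scan_lines(handle)


if __name__ == "__main__":
    # Paths are passed on the command line, for example the four
    # log files named in this article.
    for path in sys.argv[1:]:
        print(path, scan_file(path))
```

Assuming the script is saved under a hypothetical name such as scan_logs.py and copied to a host with Python 3, it could be run as `python3 scan_logs.py /var/log/syslog /var/log/daemon.log /var/log/sga-docker.daemon.log /var/log/messages`. Nonzero counts for scsi_bad_target together with superblock_unreadable correspond to the /var/log/messages excerpt above.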
