NetApp Knowledge Base

StorageGRID SG5700 and SG6000 appliances reboot sporadically and are reported as down in the Grid management interface

Applies to

  • NetApp StorageGRID Appliance SG6000
  • NetApp StorageGRID Appliance SG5700

Issue

  • StorageGRID storage nodes sporadically report a blue status (Administratively unknown) as a result of the node rebooting
  • Several alarms (such as CASA, NRLY, and SVST) are raised and cleared intermittently (data store down)
  • The StorageGRID compute controller (i.e., 5700 controller, SG6000 1U node) kernel logs, /var/local/log/messages and/or dmesg.txt, show aborts on one or more of the Fibre Channel links used for inter-appliance communication, followed by a call trace in the qla2xxx abort handler:

Aug  7 08:03:46 SG kernel: 4,2160,410123880,-;qla2xxx [0000:14:00.1]-801c:7: Abort command issued nexus=7:0:0 --  1 2002.
Aug  7 08:03:47 SG kernel: 4,2161,411147898,-;qla2xxx [0000:14:00.1]-801c:7: Abort command issued nexus=7:0:0 --  1 2002.
Aug  7 08:03:48 SG kernel: 4,2162,412171780,-;qla2xxx [0000:14:00.1]-801c:7: Abort command issued nexus=7:0:0 --  1 2002.

Aug  7 08:03:43 SG kernel: 4,2136,406224581,-;Call Trace:
Aug  7 08:03:43 SG kernel: 4,2137,406227012,-; [<ffffffffb9e18609>] ? __schedule+0x239/0x6f0
Aug  7 08:03:43 SG kernel: 4,2138,406232464,-; [<ffffffffb9e18af2>] ? schedule+0x32/0x80
Aug  7 08:03:43 SG kernel: 4,2139,406237574,-; [<ffffffffb9e1be17>] ? schedule_timeout+0x167/0x380
Aug  7 08:03:43 SG kernel: 4,2140,406243546,-; [<ffffffffb98e9220>] ? del_timer_sync+0x50/0x50
Aug  7 08:03:43 SG kernel: 4,2141,406249172,-; [<ffffffffb98ea02a>] ? msleep+0x2a/0x40
Aug  7 08:03:43 SG kernel: 4,2142,406254118,-; [<ffffffffc0a88481>] ? qla2x00_eh_wait_on_command+0x41/0x90 [qla2xxx]
Aug  7 08:03:43 SG kernel: 4,2143,406261651,-; [<ffffffffc0a887ab>] ? qla2xxx_eh_abort+0x2db/0x310 [qla2xxx]
Aug  7 08:03:43 SG kernel: 4,2144,406268491,-; [<ffffffffc02ba4c2>] ? scmd_eh_abort_handler+0x72/0x270 [scsi_mod]
Aug  7 08:03:43 SG kernel: 4,2145,406275762,-; [<ffffffffb989486a>] ? process_one_work+0x18a/0x430
Aug  7 08:03:43 SG kernel: 4,2146,406281734,-; [<ffffffffb9894b5d>] ? worker_thread+0x4d/0x490
Aug  7 08:03:43 SG kernel: 4,2147,406287362,-; [<ffffffffb9894b10>] ? process_one_work+0x430/0x430
Aug  7 08:03:43 SG kernel: 4,2148,406293338,-; [<ffffffffb989abc9>] ? kthread+0xd9/0xf0
Aug  7 08:03:43 SG kernel: 4,2149,406298361,-; [<ffffffffb9e1d4f1>] ? __switch_to_asm+0x41/0x70
Aug  7 08:03:43 SG kernel: 4,2150,406304074,-; [<ffffffffb989aaf0>] ? kthread_park+0x60/0x60
Aug  7 08:03:43 SG kernel: 4,2151,406309530,-; [<ffffffffb9e1d577>] ? ret_from_fork+0x57/0x70

  • The base-os-logs /var/log/syslog on the StorageGRID compute controller also shows I/O errors and failing multipath paths:

Sep  2 20:17:14 localhost kernel: [51563.447905] print_req_error: I/O error, dev sdab, sector 13323229512
Sep  2 20:17:14 localhost kernel: [51563.454244] device-mapper: multipath: Failing path 65:176.

  • The E-Series storage controller (i.e., E2800 controller) major event log indicates that there are Fibre Channel link errors (Event type: 1206):

Event type: 1206
Event category: Error
Description: Fibre channel link errors continue

Note: Not all of the above conditions are necessarily present at the same time.
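
The log signatures listed above can be checked for in bulk rather than by eye. The following is a minimal Python sketch, not a NetApp tool: the function name `summarize` and the three regular expressions are assumptions modeled only on the excerpts shown in this article, and other driver or kernel versions may format these messages differently.

```python
import re
from collections import Counter

# Patterns modeled on the log excerpts in this article (assumption:
# other qla2xxx or kernel versions may word these messages differently).
ABORT_RE = re.compile(
    r"qla2xxx \[(?P<pci>[0-9a-fA-F:.]+)\]-801c:\d+: "
    r"Abort command issued nexus=(?P<nexus>\d+:\d+:\d+)"
)
IO_ERROR_RE = re.compile(r"I/O error, dev (?P<dev>\w+), sector (?P<sector>\d+)")
FAILING_PATH_RE = re.compile(r"multipath: Failing path (?P<path>\d+:\d+)")


def summarize(lines):
    """Count FC aborts, block I/O errors, and failed multipath paths.

    `lines` is any iterable of log lines, for example an open file.
    Returns three Counters keyed by (PCI address, nexus), device name,
    and major:minor path number respectively.
    """
    aborts, io_errors, failed_paths = Counter(), Counter(), Counter()
    for line in lines:
        if (m := ABORT_RE.search(line)):
            aborts[(m["pci"], m["nexus"])] += 1
        elif (m := IO_ERROR_RE.search(line)):
            io_errors[m["dev"]] += 1
        elif (m := FAILING_PATH_RE.search(line)):
            failed_paths[m["path"]] += 1
    return aborts, io_errors, failed_paths
```

To use it, pass an open log file, for instance `summarize(open(path, errors="replace"))` with `path` pointing at a copy of /var/local/log/messages or /var/log/syslog. Repeated aborts against the same PCI address and nexus would be consistent with the Fibre Channel link symptom described here, but the counts alone do not confirm a cause.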

 

