scsi.cmd.abortedByHost:error reported on stretch MetroCluster

Category:: ontap-9

Specialty:: metrocluster

Applies to

ONTAP 9
2-node bridge-attached stretch MetroCluster

Issue

On node Node-01, the following "scsi.cmd.abortedByHost" events are reported for several disk drives reached through path 2b:

Thu Mar 20 16:35:06 +0100 [Node-01: slifc_intrd: scsi.cmd.abortedByHost:error]: Disk device 2b.125L62: Command aborted by host adapter: HA status 0x4: cdb 0xea:0bbc6000:01e0.

Thu Mar 20 16:35:09 +0100 [Node-01: slifc_intrd: scsi.cmd.abortedByHost:error]: Disk device 2b.125L55: Command aborted by host adapter: HA status 0x4: cdb 0x2a:8bb9edf8:0008.
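When many of these events are logged, it can help to tally them per node and adapter to confirm that they all share one path. The following is a minimal sketch, assuming the EMS lines have been exported to text in exactly the layout shown above. The names `parse_abort` and `count_by_adapter` are hypothetical helpers, not part of ONTAP or any NetApp tool.

```python
import re
from collections import Counter

# Matches the scsi.cmd.abortedByHost layout shown in this article.
# The group names (node, device, ha, opcode) are our own labels.
ABORT_RE = re.compile(
    r"\[(?P<node>[^:\]]+): [^:]+: scsi\.cmd\.abortedByHost:error\]: "
    r"Disk device (?P<device>\S+): .*?HA status (?P<ha>0x[0-9a-fA-F]+): "
    r"cdb (?P<opcode>0x[0-9a-fA-F]+):"
)


def parse_abort(line):
    """Return the fields of one abortedByHost event, or None if the line differs."""
    m = ABORT_RE.search(line)
    if m is None:
        return None
    device = m.group("device")
    return {
        "node": m.group("node"),
        "device": device,
        # The adapter is the part of the device name before the first dot.
        "adapter": device.split(".", 1)[0],
        "ha_status": m.group("ha"),
        "opcode": m.group("opcode"),
    }


def count_by_adapter(lines):
    """Count abortedByHost events per (node, adapter) pair."""
    counts = Counter()
    for line in lines:
        event = parse_abort(line)
        if event is not None:
            counts[(event["node"], event["adapter"])] += 1
    return dict(counts)


if __name__ == "__main__":
    sample = [
        "Thu Mar 20 16:35:06 +0100 [Node-01: slifc_intrd: scsi.cmd.abortedByHost:error]: "
        "Disk device 2b.125L62: Command aborted by host adapter: HA status 0x4: cdb 0xea:0bbc6000:01e0.",
        "Thu Mar 20 16:35:09 +0100 [Node-01: slifc_intrd: scsi.cmd.abortedByHost:error]: "
        "Disk device 2b.125L55: Command aborted by host adapter: HA status 0x4: cdb 0x2a:8bb9edf8:0008.",
    ]
    print(count_by_adapter(sample))
```

If every count lands on a single adapter, as it does for the two events above, the aborts point at that path and not at the individual drives.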

FC port 2b of node Node-01 is flapping very frequently, which causes StorageFCAdapterFault_Alert to be raised:

Thu Mar 20 16:36:29 +0100 [Node-01: slifc_asyncd_4: fci.adapter.link.online:info]: Fibre Channel adapter 2b link online.

Thu Mar 20 16:37:10 +0100 [Node-01: slifc_timeout_4: fci.link.error:error]: Could not recover link on Fibre Channel adapter 2b after 30 seconds. Taking the adapter offline.

Thu Mar 20 16:37:10 +0100 [Node-01: dsbridge_admin: bridge.removed:info]: FC-to-SAS bridge 2b.125L0 [ATTO     FibreBridge7600N 4.35] S/N [FB7600N106192] was removed.

Thu Mar 20 16:37:20 +0100 [Node-01: nchmd: hm.alert.raised:alert]: Alert Id = StorageFCAdapterFault_Alert , Alerting Resource = 100000109b4ede02 raised by monitor node-connect

Thu Mar 20 16:51:11 +0100 [Node-01: slifc_asyncd_4: fci.adapter.online:info]: Fibre Channel adapter 2b is now online.

Thu Mar 20 16:51:27 +0100 [Node-01: dsbridge_admin: bridge.discovered:info]: FC-to-SAS bridge 2b.125L0 [ATTO     FibreBridge7600N 4.35] S/N [FB7600N106192] was discovered.
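To quantify the flapping, the time each adapter spent offline can be derived from the log. The sketch below pairs each fci.link.error event (the adapter is taken offline) with the next fci.adapter.online event for the same adapter. It assumes the log layout shown above. The function name `offline_windows` is hypothetical, and because EMS timestamps in this excerpt carry no year, the caller must supply one.

```python
import re
from datetime import datetime

# Splits an EMS line into timestamp (without weekday), event name, and message.
LINE_RE = re.compile(
    r"^\w{3} (?P<ts>\w{3} +\d{1,2} \d{2}:\d{2}:\d{2} [+-]\d{4}) "
    r"\[[^:\]]+: [^:]+: (?P<event>[\w.]+):\w+\]: (?P<msg>.*)$"
)
ADAPTER_RE = re.compile(r"Fibre Channel adapter (\w+)")


def offline_windows(lines, year):
    """Return (adapter, seconds_offline) for each offline/online pair.

    `year` is an assumption supplied by the caller, since the log omits it.
    """
    went_offline = {}
    windows = []
    for line in lines:
        m = LINE_RE.match(line.strip())
        if m is None:
            continue
        event = m.group("event")
        if event not in ("fci.link.error", "fci.adapter.online"):
            continue
        a = ADAPTER_RE.search(m.group("msg"))
        if a is None:
            continue
        adapter = a.group(1)
        when = datetime.strptime(f"{year} {m.group('ts')}", "%Y %b %d %H:%M:%S %z")
        if event == "fci.link.error":
            # Keep the earliest offline time if the event repeats.
            went_offline.setdefault(adapter, when)
        elif adapter in went_offline:
            start = went_offline.pop(adapter)
            windows.append((adapter, (when - start).total_seconds()))
    return windows
```

For the excerpt above, the window runs from 16:37:10 to 16:51:11, so adapter 2b was offline for about 14 minutes. The disk redundancy events later in this article fall inside that window.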

While FC port 2b was down, access to ATTO bridge FB7600N106192 was lost, so the node transitioned to a Mixed-Path configuration and reported the following events:

Thu Mar 20 16:48:44 +0100 [Node-01: svc_queue_thread: callhome.dsk.redun.fault:error]: Call home for DISK REDUNDANCY FAILED

Thu Mar 20 16:49:24 +0100 [Node-01: dsa_disc: ses.multipath.ReqError:alert]: SAS disk shelf detected without a multipath configuration.

Thu Mar 20 16:50:03 +0100 [Node-01: mgwd: callhome.hm.alert.major:alert]: Call home for Health Monitor process nchm: SinglePathToDiskShelf_Alert[2937244207926544976].

The ATTO bridge port statistics show link failures, loss of sync, and CRC errors on FC port 1:

FC Port 2100001086b11d80:

State: up

Speed: 16 Gb/s

Topology: point-to-point

Link Failure Count: 263 <--------------------

Loss of Sync Count: 492019335

CRC Error Count: 10967

LIP Count: 0

Frames In: 17894

Frames Out: 24957655

SFP Vendor: AVAGO

SFP Part Number: AFBR-57G5MZ-ELX

SFP Serial Number: AN2138G016M

SFP Capabilities: 8, 16,
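The same check can be scripted when statistics from several bridge ports have to be compared. This is a minimal sketch, assuming the statistics are available as "Name: value" text like the output above. The names `parse_port_stats` and `nonzero_errors` are hypothetical, and the list of counters treated as errors is our own selection from the fields shown.

```python
import re

# Counters from the output above that we treat as link-health indicators.
ERROR_COUNTERS = (
    "Link Failure Count",
    "Loss of Sync Count",
    "CRC Error Count",
    "LIP Count",
)


def parse_port_stats(text):
    """Parse 'Name: value' lines into a dict, ignoring annotation arrows."""
    stats = {}
    for raw in text.splitlines():
        # Drop trailing markers such as '<--------' added by whoever annotated the output.
        line = re.sub(r"\s*<-+\s*$", "", raw).strip()
        if ":" not in line:
            continue
        key, _, value = line.partition(":")
        stats[key.strip()] = value.strip()
    return stats


def nonzero_errors(stats):
    """Return the error counters whose value is a positive integer."""
    found = {}
    for name in ERROR_COUNTERS:
        value = stats.get(name, "")
        if value.isdigit() and int(value) > 0:
            found[name] = int(value)
    return found
```

Applied to the output above, `nonzero_errors` reports the link failure, loss of sync, and CRC counters and omits the LIP count, which is zero.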

On node Node-02, excessive errors are reported on adapter 1b, along with "scsi.cmd.abortedByHost" errors:

Mon Mar 17 18:20:45 +0100 [Node-02: slifc_intrd: scsi.cmd.abortedByHost:error]: Disk device 1b.125L8: Command aborted by host adapter: HA status 0x4: cdb 0x2a:1da79600:0200.

Mon Mar 17 18:31:03 +0100 [Node-02: slifc_intrd: scsi.cmd.abortedByHost:error]: Disk device 1b.125L3: Command aborted by host adapter: HA status 0x4: cdb 0x28:59974688:0008.

Thu Mar 20 01:27:22 +0100 [Node-02: slifc_intrd: scsi.path.excessiveErrors:error]: Excessive errors encountered by adapter 1b on disk device 1b.125.

Thu Mar 20 01:27:22 +0100 [Node-02: slifc_intrd: scsi.cmd.transportErrorEMSOnly:error]: Disk device 1b.125L30: Transport error during execution of command: HA status 0x9: cdb 0x28:84756688:0088.