NetApp Knowledge Base

After a disk failure, the spare disks fail one after another

Category:
ontap-9
Specialty:
hw

Applies to

  • X343_TA15E1T8A10
  • FAS2750
  • 9.9.1P10

Issue

  1. On 2024/10/1 from 01:10:06 to 01:10:59, Unrecovered read error was reported repeatedly for the disk in Shelf 2 Bay 19.

01Oct2024 01:10:06 disk_ioMediumError diskName="0a.02.19" op="0x28:07df1600:0200" sector="132060944" senseInfo="SCSI:medium error" sCode="Unrecovered read error" disk_info="- If the disk is in a RAID group, the subsystem will attempt to reconstruct unreadable data" sense_key="0x3" sense_code="0x11" qualifier="0x1" fru_failed="0x0" CTime="2496" disk_information="Disk 0a.02.19 Shelf 2 Bay 19 [NETAPP X343_TA15E1T8A10 NA01] S/N [42XXXXXXXXUR] UID [5XXXXXXB:98XXXXEC:00000000:00000000:00000000:00000000:00000000:00000000:00000000:00000000]"
01Oct2024 01:10:06 disk_IO_status deviceName="0a.02.19" ETime="2496" cdb="0x28:07df1600:0200" victimRetryCount="0" retryCount="0" timeoutRetryCount="0" pathRetryCount="0" adapterStatus="0x0" targetStatus="0x2" sSenseKey="SCSI:medium error" sSenseCode="Unrecovered read error" iSenseKey="0x3" iASC="0x11" iASCQ="0x1" pathsTried="1" basicTimeout="5" returnCode="5" disk_information="Disk 0a.02.19 Shelf 2 Bay 19 [NETAPP X343_TA15E1T8A10 NA01] S/N [42XXXXXXXXUR] UID [5XXXXXXB:98XXXXEC:00000000:00000000:00000000:00000000:00000000:00000000:00000000:00000000]"


01Oct2024 01:10:59 disk_ioMediumError diskName="0a.02.19" op="0x28:07df17a8:0008" sector="132061096" senseInfo="SCSI:medium error" sCode="Unrecovered read error" disk_info="- If the disk is in a RAID group, the subsystem will attempt to reconstruct unreadable data" sense_key="0x3" sense_code="0x11" qualifier="0xff" fru_failed="0x0" CTime="32232" disk_information="Disk 0a.02.19 Shelf 2 Bay 19 [NETAPP X343_TA15E1T8A10 NA01] S/N [42XXXXXXXXUR] UID [5XXXXXXB:98XXXXEC:00000000:00000000:00000000:00000000:00000000:00000000:00000000:00000000]"
01Oct2024 01:10:59 disk_IO_status deviceName="0a.02.19" ETime="32232" cdb="0x28:07df17a8:0008" victimRetryCount="6" retryCount="0" timeoutRetryCount="0" pathRetryCount="0" adapterStatus="0x0" targetStatus="0x2" sSenseKey="SCSI:medium error" sSenseCode="Unrecovered read error" iSenseKey="0x3" iASC="0x11" iASCQ="0xff" pathsTried="1" basicTimeout="5" returnCode="5" disk_information="Disk 0a.02.19 Shelf 2 Bay 19 [NETAPP X343_TA15E1T8A10 NA01] S/N [42XXXXXXXXUR] UID [5XXXXXXB:98XXXXEC:00000000:00000000:00000000:00000000:00000000:00000000:00000000:00000000]"
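
The entries above can be read mechanically. The following is a minimal sketch, not a NetApp tool: it assumes each EMS line has the layout "date time event key="value" ..." seen here, and it assumes the op/cdb field reads as opcode:LBA:length in hexadecimal. That reading is consistent with the second disk_ioMediumError entry, where op 0x28:07df17a8:0008 and sector 132061096 describe the same address. Opcode 0x28 is the standard SCSI READ(10) command. The helper names are hypothetical and the sample line is shortened.

```python
import re

# Hypothetical helpers for illustration only.
FIELD_RE = re.compile(r'(\w+)="([^"]*)"')

def parse_ems_line(line):
    """Split one EMS line into (timestamp, event name, field dict)."""
    date, time, event, rest = line.split(" ", 3)
    return f"{date} {time}", event, dict(FIELD_RE.findall(rest))

def decode_cdb(cdb):
    """Split 'opcode:lba:length' (assumed to be hex) into integers."""
    opcode, lba, length = (int(part, 16) for part in cdb.split(":"))
    return opcode, lba, length

# Shortened copy of the first disk_ioMediumError entry above.
line = ('01Oct2024 01:10:06 disk_ioMediumError diskName="0a.02.19" '
        'op="0x28:07df1600:0200" sector="132060944" '
        'sense_key="0x3" sense_code="0x11" qualifier="0x1"')

stamp, event, fields = parse_ems_line(line)
opcode, lba, length = decode_cdb(fields["op"])
sector = int(fields["sector"])

# The reported sector should fall inside the range covered by the read.
in_range = lba <= sector < lba + length
print(event, hex(opcode), lba, length, in_range)
```

Under these assumptions, the failing sector lies inside the 512-sector read that started at LBA 132060672.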

  1. On 2024/10/1 at 01:10:59, a large number of raid_rg_scrub_media_err events were logged for the disk in Shelf 2 Bay 19.

01Oct2024 01:10:59 raid_rg_scrub_media_err owner="" disk_info="Disk /ask_backup01/plex0/rg2/0a.02.19 Shelf 2 Bay 19 [NETAPP X343_TA15E1T8A10 NA01] S/N [42XXXXXXXXUR] UID [5XXXXXXB:98XXXXEC:00000000:00000000:00000000:00000000:00000000:00000000:00000000:00000000]" blockNum="16507618" shelf="2" bay="19" vendor="NETAPP " model="X343_TA15E1T8A10" firmware_revision="NA01" serialno="42XXXXXXXXUR" disk_type="4" disk_rpm="10000" carrier="" site="Local"
01Oct2024 01:10:59 raid_rg_scrub_media_err owner="" disk_info="Disk /ask_backup01/plex0/rg2/0a.02.19 Shelf 2 Bay 19 [NETAPP X343_TA15E1T8A10 NA01] S/N [42XXXXXXXXUR] UID [5XXXXXXB:98XXXXEC:00000000:00000000:00000000:00000000:00000000:00000000:00000000:00000000]" blockNum="16507619" shelf="2" bay="19" vendor="NETAPP " model="X343_TA15E1T8A10" firmware_revision="NA01" serialno="42XXXXXXXXUR" disk_type="4" disk_rpm="10000" carrier="" site="Local"


firmware_revision="NA01" serialno="42XXXXXXXXUR" disk_type="4" disk_rpm="10000" carrier="" site="Local"
01Oct2024 01:10:59 raid_rg_scrub_media_err owner="" disk_info="Disk /ask_backup01/plex0/rg2/0a.02.19 Shelf 2 Bay 19 [NETAPP X343_TA15E1T8A10 NA01] S/N [42XXXXXXXXUR] UID [5XXXXXXB:98XXXXEC:00000000:00000000:00000000:00000000:00000000:00000000:00000000:00000000]" blockNum="16507646" shelf="2" bay="19" vendor="NETAPP " model="X343_TA15E1T8A10" firmware_revision="NA01" serialno="42XXXXXXXXUR" disk_type="4" disk_rpm="10000" carrier="" site="Local"
01Oct2024 01:10:59 raid_rg_scrub_media_err owner="" disk_info="Disk /ask_backup01/plex0/rg2/0a.02.19 Shelf 2 Bay 19 [NETAPP X343_TA15E1T8A10 NA01] S/N [42XXXXXXXXUR] UID [5XXXXXXB:98XXXXEC:00000000:00000000:00000000:00000000:00000000:00000000:00000000:00000000]" blockNum="16507647" shelf="2" bay="19" vendor="NETAPP " model="X343_TA15E1T8A10" firmware_revision="NA01" serialno="42XXXXXXXXUR" disk_type="4" disk_rpm="10000" carrier="" site="Local"
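
To quantify such a burst, events can be tallied per event name and disk position. The sketch below is illustrative only: the function name is hypothetical, the sample lines are shortened copies of the four entries shown above, and the key="value" layout is assumed from this excerpt. For the full scrub, the raid_rg_scrub_summary_media event later in this article reports errors="25".

```python
from collections import Counter
import re

FIELD_RE = re.compile(r'(\w+)="([^"]*)"')

# Shortened sample lines; the real entries carry many more fields.
LOG = """\
01Oct2024 01:10:59 raid_rg_scrub_media_err owner="" blockNum="16507618" shelf="2" bay="19"
01Oct2024 01:10:59 raid_rg_scrub_media_err owner="" blockNum="16507619" shelf="2" bay="19"
01Oct2024 01:10:59 raid_rg_scrub_media_err owner="" blockNum="16507646" shelf="2" bay="19"
01Oct2024 01:10:59 raid_rg_scrub_media_err owner="" blockNum="16507647" shelf="2" bay="19"
"""

def count_events(text):
    """Count events keyed by (event name, shelf, bay)."""
    counts = Counter()
    for line in text.splitlines():
        parts = line.split(" ", 3)
        if len(parts) < 4:
            continue  # skip blank or truncated lines
        fields = dict(FIELD_RE.findall(parts[3]))
        counts[(parts[2], fields.get("shelf"), fields.get("bay"))] += 1
    return counts

counts = count_events(LOG)
for (event, shelf, bay), n in counts.items():
    print(f"{event} shelf={shelf} bay={bay}: {n}")
```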

  1. From 2024/10/1 01:11:01 to 01:11:30, SCSI:aborted command and Unrecovered read error were reported alternately for the disk in Shelf 2 Bay 19.

01Oct2024 01:11:01 scsi_cmd_checkCondition deviceType="Disk" deviceName="0a.02.19" cdb="0x28:07df1880:0008" sSenseKey="SCSI:aborted command" sSenseCode="" iSenseKey="0xb" iASC="0x2f" iASCQ="0x10" iFRU="0x0" DTime="4610" disk_information=""
01Oct2024 01:11:01 scsi_cmd_checkCondition deviceType="Disk" deviceName="0a.02.19" cdb="0x28:07df1870:0008" sSenseKey="SCSI:aborted command" sSenseCode="" iSenseKey="0xb" iASC="0x2f" iASCQ="0x10" iFRU="0x0" DTime="4610" disk_information=""
01Oct2024 01:11:01 scsi_cmd_checkCondition deviceType="Disk" deviceName="0a.02.19" cdb="0x28:07df1868:0008" sSenseKey="SCSI:aborted command" sSenseCode="" iSenseKey="0xb" iASC="0x2f" iASCQ="0x10" iFRU="0x0" DTime="4610" disk_information=""
01Oct2024 01:11:01 scsi_cmd_checkCondition deviceType="Disk" deviceName="0a.02.19" cdb="0x28:07df1878:0008" sSenseKey="SCSI:aborted command" sSenseCode="" iSenseKey="0xb" iASC="0x2f" iASCQ="0x10" iFRU="0x0" DTime="4610" disk_information=""
01Oct2024 01:11:02 disk_ioMediumError diskName="0a.02.19" op="0x28:07df1888:0008" sector="132061320" senseInfo="SCSI:medium error" sCode="Unrecovered read error" disk_info="- If the disk is in a RAID group, the subsystem will attempt to reconstruct unreadable data" sense_key="0x3" sense_code="0x11" qualifier="0x1" fru_failed="0x0" CTime="4911" disk_information="Disk 0a.02.19 Shelf 2 Bay 19 [NETAPP X343_TA15E1T8A10 NA01] S/N [42XXXXXXXXUR] UID [5XXXXXXB:98XXXXEC:00000000:00000000:00000000:00000000:00000000:00000000:00000000:00000000]"
01Oct2024 01:11:02 disk_IO_status deviceName="0a.02.19" ETime="4911" cdb="0x28:07df1888:0008" victimRetryCount="0" retryCount="0" timeoutRetryCount="0" pathRetryCount="0" adapterStatus="0x0" targetStatus="0x2" sSenseKey="SCSI:medium error" sSenseCode="Unrecovered read error" iSenseKey="0x3" iASC="0x11" iASCQ="0x1" pathsTried="1" basicTimeout="5" returnCode="5" disk_information="Disk 0a.02.19 Shelf 2 Bay 19 [NETAPP X343_TA15E1T8A10 NA01] S/N [42XXXXXXXXUR] UID [5XXXXXXB:98XXXXEC:00000000:00000000:00000000:00000000:00000000:00000000:00000000:00000000]"
01Oct2024 01:11:03 disk_ioMediumError diskName="0a.02.19" op="0x28:07df1890:0008" sector="132061328" senseInfo="SCSI:medium error" sCode="Unrecovered read error" disk_info="- If the disk is in a RAID group, the subsystem will attempt to reconstruct unreadable data" sense_key="0x3" sense_code="0x11" qualifier="0xff" fru_failed="0x0" CTime="4594" disk_information="Disk 0a.02.19 Shelf 2 Bay 19 [NETAPP X343_TA15E1T8A10 NA01] S/N [42XXXXXXXXUR] UID [5XXXXXXB:98XXXXEC:00000000:00000000:00000000:00000000:00000000:00000000:00000000:00000000]"


01Oct2024 01:11:28 scsi_cmd_checkCondition deviceType="Disk" deviceName="0a.02.19" cdb="0x28:07df1878:0008" sSenseKey="SCSI:aborted command" sSenseCode="" iSenseKey="0xb" iASC="0x2f" iASCQ="0x10" iFRU="0x0" DTime="32432" disk_information=""
01Oct2024 01:11:28 scsi_cmd_checkCondition deviceType="Disk" deviceName="0a.02.19" cdb="0x28:07df1868:0008" sSenseKey="SCSI:aborted command" sSenseCode="" iSenseKey="0xb" iASC="0x2f" iASCQ="0x10" iFRU="0x0" DTime="32432" disk_information=""
01Oct2024 01:11:28 scsi_cmd_checkCondition deviceType="Disk" deviceName="0a.02.19" cdb="0x28:07df1958:0008" sSenseKey="SCSI:aborted command" sSenseCode="" iSenseKey="0xb" iASC="0x2f" iASCQ="0x10" iFRU="0x0" DTime="4755" disk_information=""
01Oct2024 01:11:29 raid_rg_readerr_repair_data owner="" disk_info="Disk /ask_backup01/plex0/rg2/0a.02.19 Shelf 2 Bay 19 [NETAPP X343_TA15E1T8A10 NA01] S/N [42XXXXXXXXUR] UID [5XXXXXXB:98XXXXEC:00000000:00000000:00000000:00000000:00000000:00000000:00000000:00000000]" blockNum="16507621" vbn="10050326693" shelf="2" bay="19" vendor="NETAPP " model="X343_TA15E1T8A10" firmware_revision="NA01" serialno="42XXXXXXXXUR" disk_type="4" disk_rpm="10000" carrier="" site="Local"
01Oct2024 01:11:30 disk_IO_status deviceName="0a.02.19" ETime="4550" cdb="0x28:07df1960:0008" victimRetryCount="0" retryCount="0" timeoutRetryCount="0" pathRetryCount="0" adapterStatus="0x0" targetStatus="0x2" sSenseKey="SCSI:medium error" sSenseCode="Unrecovered read error" iSenseKey="0x3" iASC="0x11" iASCQ="0xff" pathsTried="1" basicTimeout="5" returnCode="5" disk_information="Disk 0a.02.19 Shelf 2 Bay 19 [NETAPP X343_TA15E1T8A10 NA01] S/N [42XXXXXXXXUR] UID [5XXXXXXB:98XXXXEC:00000000:00000000:00000000:00000000:00000000:00000000:00000000:00000000]"
01Oct2024 01:11:30 disk_IO_status deviceName="0a.02.19" ETime="4559" cdb="0x28:07df1968:0008" victimRetryCount="0" retryCount="0" timeoutRetryCount="0" pathRetryCount="0" adapterStatus="0x0" targetStatus="0x2" sSenseKey="SCSI:medium error" sSenseCode="Unrecovered read error" iSenseKey="0x3" iASC="0x11" iASCQ="0xff" pathsTried="1" basicTimeout="5" returnCode="5" disk_information="Disk 0a.02.19 Shelf 2 Bay 19 [NETAPP X343_TA15E1T8A10 NA01] S/N [42XXXXXXXXUR] UID [5XXXXXXB:98XXXXEC:00000000:00000000:00000000:00000000:00000000:00000000:00000000:00000000]"
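
The numeric iSenseKey values in these entries correspond to standard SCSI sense keys. The lookup below is a small sketch covering only the keys that appear in this article's log excerpts. The function name is hypothetical, and the ASC/ASCQ values are passed through unchanged because some of them may be vendor-specific.

```python
# Standard SCSI sense key names for the keys seen in this article.
SENSE_KEYS = {
    0x2: "NOT READY",
    0x3: "MEDIUM ERROR",
    0x4: "HARDWARE ERROR",
    0xB: "ABORTED COMMAND",
}

def describe_sense(key_hex, asc_hex, ascq_hex):
    """Format the iSenseKey/iASC/iASCQ strings from an EMS entry."""
    key = int(key_hex, 16)
    asc = int(asc_hex, 16)
    ascq = int(ascq_hex, 16)
    name = SENSE_KEYS.get(key, "UNKNOWN")
    return f"{name} (key={key:#x}, asc={asc:#04x}, ascq={ascq:#04x})"

# Values taken from the scsi_cmd_checkCondition and disk_IO_status entries.
print(describe_sense("0xb", "0x2f", "0x10"))
print(describe_sense("0x3", "0x11", "0x1"))
```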

  1. On 2024/10/1 at 01:11:32, the disk in Shelf 2 Bay 19 was identified as faulty and reconstruction started on the disk in Shelf 1 Bay 23. However, Shelf 1 Bay 23 also stopped responding (sas_device_timeout), and the IOM performed power cycles and other recovery actions.

01Oct2024 01:11:32 raid_rg_recons_start disk_info="Disk /ask_backup01/plex0/rg2/0a.01.23 Shelf 1 Bay 23 [NETAPP X343_TA15E1T8A10 NA01] S/N [71XXXXXXXXUR] UID [5000039B:5928131C:00000000:00000000:00000000:00000000:00000000:00000000:00000000:00000000]" owner="" disk="0a.01.23" startBlockNum="5248" shelf="1" bay="23" vendor="NETAPP " model="X343_TA15E1T8A10" firmware_revision="NA01" serialno="71XXXXXXXXUR" disk_type="4" disk_rpm="10000" carrier="" site="Local" aggregate_uuid="68ab4e22-b25f-430e-add8-5fc247c5daca"
01Oct2024 01:11:32 raid_rg_scrub_stopped owner="" rg="/ask_backup01/plex0/rg2" stripe="16507584" duration="11:29.00" aggregate_uuid="68ab4e22-b25f-430e-add8-5fc247c5daca"
01Oct2024 01:11:32 raid_rg_scrub_summary_pi pis="0" suffix="ies" rg="/ask_backup01/plex0/rg2" current="."
01Oct2024 01:11:32 raid_rg_scrub_summary_cksum errors="0" suffix="s" rg="/ask_backup01/plex0/rg2" current="."
01Oct2024 01:11:32 raid_rg_scrub_summary_media errors="25" suffix="s" rg="/ask_backup01/plex0/rg2" current="."
01Oct2024 01:11:32 raid_rg_scrub_summary_lw errors="0" suffix="ies" rg="/ask_backup01/plex0/rg2" current="."
01Oct2024 01:11:33 raid_rg_recons_progress disk_info="Disk /ask_backup01/plex0/rg2/0a.01.23 Shelf 1 Bay 23 [NETAPP X343_TA15E1T8A10 NA01] S/N [71XXXXXXXXUR] UID [5000039B:5928131C:00000000:00000000:00000000:00000000:00000000:00000000:00000000:00000000]" blockNumCurrent="7424" blockNumEnd="438661952" percent="0" duration="0:00:01" shelf="1" bay="23" vendor="NETAPP " model="X343_TA15E1T8A10" firmware_revision="NA01" serialno="71XXXXXXXXUR" disk_type="4" disk_rpm="10000" carrier="" site="Local"
01Oct2024 01:11:34 raid_rg_diskcopy_aborted owner="" rg="/ask_backup01/plex0/rg2" source="0a.02.19" target="0a.01.23" blockNum="5248" duration="0:07.72" reason="Source disk failed." aggregate_uuid="68ab4e22-b25f-430e-add8-5fc247c5daca"
01Oct2024 01:11:39 sas_adapter_debug adapterName="0a" debug_string="Starting powercycle on device 0a.02.19"
01Oct2024 01:11:39 disk_dhm_pitstop_start diskName="0a.02.19" vendorName="NETAPP " productId="X343_TA15E1T8A10" fwVersion="NA01" serialno="42XXXXXXXXUR" device_class_name="SAS_8388608_3MB_DST"
01Oct2024 01:11:39 sas_device_quiesce adapterName="0a" deviceType="disk" deviceName="0a.01.23"
01Oct2024 01:11:42 sas_device_timeout adapterName="0a" deviceType="Disk" deviceName="0a.01.23"
01Oct2024 01:11:42 sas_adapter_debug adapterName="0a" debug_string="Level 0 timeout: Abort task set: 0a.01.23 L0 (0xfffff80779414040,0x28:00017400:0200,0/0)"
01Oct2024 01:11:42 sas_adapter_debug adapterName="0a" debug_string="ABORT TASKSET on device 0a.01.23 L0"
01Oct2024 01:11:44 sas_adapter_debug adapterName="0a" debug_string="Powercycle on device 0a.02.19 complete: status 0"
01Oct2024 01:11:45 sas_adapter_debug adapterName="0a" debug_string="Transition Comparing this: [IOM1] and next: [IOM1] ..."
01Oct2024 01:11:45 sas_adapter_debug adapterName="0a" debug_string="Device 0a.02.19 invalidate debounce - 40"
01Oct2024 01:11:45 sas_adapter_debug adapterName="0a" debug_string="Device 0b.02.19 invalidate debounce - 40"

  1. From 2024/10/1 01:12:11 to 01:14:50, multiple SCSI:not ready (Drive spinning up) errors occurred on the disk in Shelf 1 Bay 23. Commands temporarily succeeded after retries, but SCSI:not ready recurred and shm_setup_for_failure was ultimately logged for the disk.

01Oct2024 01:12:11 scsi_cmd_notReadyCondition deviceType="Disk" deviceName="0b.01.23" cdb="0x2a:0043c000:0200" sSenseKey="SCSI:not ready" sSenseCode="Drive spinning up" iSenseKey="0x2" iASC="0x4" iASCQ="0x1" iFRU="0x0" dTime="9932" disk_information=""
01Oct2024 01:12:11 scsi_cmd_notReadyCondition deviceType="Disk" deviceName="0b.01.23" cdb="0x2a:d196b930:0010" sSenseKey="SCSI:not ready" sSenseCode="Drive spinning up" iSenseKey="0x2" iASC="0x4" iASCQ="0x1" iFRU="0x0" dTime="9920" disk_information=""
01Oct2024 01:12:11 scsi_cmd_notReadyCondition deviceType="Disk" deviceName="0b.01.23" cdb="0x2a:0043c400:0200" sSenseKey="SCSI:not ready" sSenseCode="Drive spinning up" iSenseKey="0x2" iASC="0x4" iASCQ="0x1" iFRU="0x0" dTime="8945" disk_information=""
01Oct2024 01:12:11 scsi_cmd_notReadyCondition deviceType="Disk" deviceName="0b.01.23" cdb="0x2a:0043be00:0200" sSenseKey="SCSI:not ready" sSenseCode="Drive spinning up" iSenseKey="0x2" iASC="0x4" iASCQ="0x1" iFRU="0x0" dTime="9943" disk_information=""
01Oct2024 01:12:11 scsi_cmd_notReadyCondition deviceType="Disk" deviceName="0b.01.23" cdb="0x2a:0043c200:0200" sSenseKey="SCSI:not ready" sSenseCode="Drive spinning up" iSenseKey="0x2" iASC="0x4" iASCQ="0x1" iFRU="0x0" dTime="9951" disk_information=""
01Oct2024 01:12:11 scsi_cmd_notReadyCondition deviceType="Disk" deviceName="0b.01.23" cdb="0x2a:0043c000:0200" sSenseKey="SCSI:not ready" sSenseCode="Drive spinning up" iSenseKey="0x2" iASC="0x4" iASCQ="0x1" iFRU="0x0" dTime="10830" disk_information=""


01Oct2024 01:14:48 scsi_cmd_notReadyConditionEMSOnly deviceType="Disk" deviceName="0a.01.23" cdb="0x2a:01c55000:0200" sSenseKey="SCSI:not ready" sSenseCode="Drive spinning up" iSenseKey="0x2" iASC="0x4" iASCQ="0x1" iFRU="0x0" dTime="17977" disk_information=""
01Oct2024 01:14:49 scsi_cmd_notReadyCondition deviceType="Disk" deviceName="0a.01.23" cdb="0x2a:01c55200:0200" sSenseKey="SCSI:not ready" sSenseCode="Drive spinning up" iSenseKey="0x2" iASC="0x4" iASCQ="0x1" iFRU="0x0" dTime="17986" disk_information=""
01Oct2024 01:14:49 scsi_cmd_notReadyCondition deviceType="Disk" deviceName="0a.01.23" cdb="0x2a:01c55600:0200" sSenseKey="SCSI:not ready" sSenseCode="Drive spinning up" iSenseKey="0x2" iASC="0x4" iASCQ="0x1" iFRU="0x0" dTime="17994" disk_information=""
01Oct2024 01:14:49 scsi_cmd_notReadyCondition deviceType="Disk" deviceName="0a.01.23" cdb="0x2a:01c55400:0200" sSenseKey="SCSI:not ready" sSenseCode="Drive spinning up" iSenseKey="0x2" iASC="0x4" iASCQ="0x1" iFRU="0x0" dTime="18006" disk_information=""
01Oct2024 01:14:50 scsi_cmd_retrySuccess deviceType="Disk" deviceName="0a.01.23" retryCount="1" freeRetryCount="10" cdb="0x2a:01c55000:0200" dTime="19133"
01Oct2024 01:14:50 scsi_debug debug_string="shm_setup_for_failure disk 0a.01.23 (S/N 71XXXXXXXXUR) error 100h"

  1. From 2024/10/1 01:14:53 to 01:14:58, the DHM pitstop (maintenance test) for the disk in Shelf 2 Bay 19 completed with a failed drive self-test. The disk was failed, which triggered an AutoSupport for DISK FAILED.

01Oct2024 01:14:53 dhm_pitstop_complete diskName="0a.02.19" vendorName="NETAPP " productId="X343_TA15E1T8A10" fwVersion="NA01" serialno="42XXXXXXXXUR" result="DHM pitstop's drive self test failed" lbasPerZone="0x800000" actualTimeTaken="0" projectedTimeTaken="0" numZones="3" diskZeroTime="13000" rightSize="0xd12b99ff"
01Oct2024 01:14:55 cf_disk_skipped diskname="0a.02.19" status="adapter error prevents command from being sent to device"
01Oct2024 01:14:58 raid_disk_maint_failed disk_info="Disk 0a.02.19 Shelf 2 Bay 19 [NETAPP X343_TA15E1T8A10 NA01] S/N [42XXXXXXXXUR] UID [5XXXXXXB:98XXXXEC:00000000:00000000:00000000:00000000:00000000:00000000:00000000:00000000]" start_code="f1 5d 12 00" end_code="00 00 00 00" shelf="2" bay="19" vendor="NETAPP " model="X343_TA15E1T8A10" firmware_revision="NA01" serialno="42XXXXXXXXUR" disk_type="4" disk_rpm="10000" carrier="" site="Local"
01Oct2024 01:14:58 raid_config_disk_failed disk_info="Disk 0a.02.19 Shelf 2 Bay 19 [NETAPP X343_TA15E1T8A10 NA01] S/N [42XXXXXXXXUR] UID [5XXXXXXB:98XXXXEC:00000000:00000000:00000000:00000000:00000000:00000000:00000000:00000000]" shelf="2" bay="19" vendor="NETAPP " model="X343_TA15E1T8A10" firmware_revision="NA01" serialno="42XXXXXXXXUR" disk_type="4" disk_rpm="10000" failure_reason="disk maintenance testing failed" carrier="" site="Local"
01Oct2024 01:14:58 callhome_dsk_fault subject="DISK FAILED Shelf 2, Bay 19, Model [X343_TA15E1T8A10], S/N [42XXXXXXXXUR]"

  1. From 2024/10/1 01:18:43 to 01:36:48, SCSI:not ready and SCSI:hardware error continued to be reported for the disk in Shelf 1 Bay 23. Power cycles and other recovery actions by the IOM did not resolve the errors, and shm_setup_for_failure was logged again.

01Oct2024 01:18:43 scsi_cmd_notReadyCondition deviceType="Disk" deviceName="0a.01.23" cdb="0x2a:03794400:0200" sSenseKey="SCSI:not ready" sSenseCode="Drive spinning up" iSenseKey="0x2" iASC="0x4" iASCQ="0x1" iFRU="0x0" dTime="9483" disk_information=""
01Oct2024 01:18:43 scsi_cmd_notReadyCondition deviceType="Disk" deviceName="0a.01.23" cdb="0x2a:03794600:0200" sSenseKey="SCSI:not ready" sSenseCode="Drive spinning up" iSenseKey="0x2" iASC="0x4" iASCQ="0x1" iFRU="0x0" dTime="9489" disk_information=""
01Oct2024 01:18:43 scsi_cmd_notReadyCondition deviceType="Disk" deviceName="0a.01.23" cdb="0x2a:03794800:0200" sSenseKey="SCSI:not ready" sSenseCode="Drive spinning up" iSenseKey="0x2" iASC="0x4" iASCQ="0x1" iFRU="0x0" dTime="8688" disk_information=""
01Oct2024 01:18:43 scsi_cmd_notReadyConditionEMSOnly deviceType="Disk" deviceName="0a.01.23" cdb="0x2a:03794200:0200" sSenseKey="SCSI:not ready" sSenseCode="Drive spinning up" iSenseKey="0x2" iASC="0x4" iASCQ="0x1" iFRU="0x0" dTime="9517" disk_information=""


01Oct2024 01:32:49 scsi_cmd_checkCondition deviceType="Disk" deviceName="0a.01.23" cdb="0x2a:08c8c400:0200" sSenseKey="SCSI:hardware error" sSenseCode="" iSenseKey="0x4" iASC="0x44" iASCQ="0xa3" iFRU="0x0" DTime="1021" disk_information=""
01Oct2024 01:32:49 disk_powercycle deviceType="Disk" deviceName="0a.01.23" disk_information="Disk 0a.01.23 Shelf 1 Bay 23 [NETAPP X343_TA15E1T8A10 NA01] S/N [71XXXXXXXXUR] UID [5000039B:5928131C:00000000:00000000:00000000:00000000:00000000:00000000:00000000:00000000]" reason="a hardware error" command="0x2a:08c8c400:0200"
01Oct2024 01:32:49 scsi_cmd_retrySuccess deviceType="Disk" deviceName="0a.01.23" retryCount="1" freeRetryCount="0" cdb="0x2a:08c8c400:0200" dTime="1056"
01Oct2024 01:32:54 scsi_cmd_checkCondition deviceType="Disk" deviceName="0a.01.23" cdb="0x2a:08cfe200:0200" sSenseKey="SCSI:hardware error" sSenseCode="" iSenseKey="0x4" iASC="0x44" iASCQ="0xa3" iFRU="0x0" DTime="933" disk_information=""
01Oct2024 01:32:54 disk_powercycle deviceType="Disk" deviceName="0a.01.23" disk_information="Disk 0a.01.23 Shelf 1 Bay 23 [NETAPP X343_TA15E1T8A10 NA01] S/N [71XXXXXXXXUR] UID [5000039B:5928131C:00000000:00000000:00000000:00000000:00000000:00000000:00000000:00000000]" reason="a hardware error" command="0x2a:08cfe200:0200"
01Oct2024 01:32:54 scsi_cmd_retrySuccess deviceType="Disk" deviceName="0a.01.23" retryCount="1" freeRetryCount="0" cdb="0x2a:08cfe200:0200" dTime="934"
01Oct2024 01:32:59 scsi_cmd_checkCondition deviceType="Disk" deviceName="0a.01.23" cdb="0x2a:08d93800:0200" sSenseKey="SCSI:hardware error" sSenseCode="" iSenseKey="0x4" iASC="0x44" iASCQ="0xa3" iFRU="0x0" DTime="995" disk_information=""

01Oct2024 01:36:48 scsi_cmd_retrySuccess deviceType="Disk" deviceName="0a.01.23" retryCount="0" freeRetryCount="11" cdb="0x2a:09b15200:0200" dTime="20431"
01Oct2024 01:36:48 scsi_cmd_retrySuccess deviceType="Disk" deviceName="0a.01.23" retryCount="1" freeRetryCount="11" cdb="0x2a:09b14c00:0200" dTime="20716"
01Oct2024 01:36:48 scsi_cmd_retrySuccess deviceType="Disk" deviceName="0a.01.23" retryCount="0" freeRetryCount="11" cdb="0x2a:09b14e00:0200" dTime="20715"
01Oct2024 01:36:48 scsi_debug debug_string="shm_setup_for_failure disk 0a.01.23 (S/N 71XXXXXXXXUR) error 100h"

  1. On 2024/10/1 at 06:36:09, about five hours after the errors began, a Disk Copy from Shelf 1 Bay 23 to Shelf 3 Bay 23 was started. On 2024/10/3 at 08:47:04, Shelf 1 Bay 23 was determined to be faulty and an AutoSupport for FILESYSTEM DISK FAILED was triggered.

01Oct2024 06:36:09 raid_rg_diskcopy_start owner="" rg="/ask_backup01/plex0/rg2" source="0a.01.23" source_serialno="71XXXXXXXXUR" target="0a.03.23" target_serialno="42XXXXXXXXUR" reason="Disk replace was started." aggregate_uuid="68ab4e22-b25f-430e-add8-5fc247c5daca"
01Oct2024 06:36:09 raid_rg_diskcopy_progress source_disk="1.1.23" source_serialno="71XXXXXXXXUR" target_disk="1.3.23" target_serialno="42XXXXXXXXUR" blockNum="0" percent="0" duration="0:00:00"
01Oct2024 06:36:11 raid_spares_media_scrub_stopped owner="" disk_info="Disk 0a.03.23 Shelf 3 Bay 23 [NETAPP X343_TA15E1T8A10 NA01] S/N [42XXXXXXXXUR] UID [5XXXXXXB:98XXXX58:00000000:00000000:00000000:00000000:00000000:00000000:00000000:00000000]" shelf="3" bay="23" vendor="NETAPP " model="X343_TA15E1T8A10" firmware_revision="NA01" serialno="42XXXXXXXXUR" disk_type="4" disk_rpm="10000" carrier="" site="Local"
01Oct2024 06:36:34 cf_disk_skipped diskname="0a.02.19" status="disk failed"
01Oct2024 06:41:11 raid_rg_diskcopy_progress source_disk="1.1.23" source_serialno="71XXXXXXXXUR" target_disk="1.3.23" target_serialno="42XXXXXXXXUR" blockNum="19446912" percent="4" duration="0:05:01"
01Oct2024 06:41:34 cf_disk_skipped diskname="0a.02.19" status="disk failed"
01Oct2024 06:46:11 raid_rg_diskcopy_progress source_disk="1.1.23" source_serialno="71XXXXXXXXUR" target_disk="1.3.23" target_serialno="42XXXXXXXXUR" blockNum="38660736" percent="9" duration="0:10:01"
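
The raid_rg_diskcopy_progress entries allow a rough copy-rate estimate. The arithmetic below is a sketch that uses only the blockNum and duration values from the two later progress events above. The helper name is hypothetical, and the block size is not stated in the log, so the result is left in blocks per second.

```python
def duration_seconds(text):
    """Convert an 'H:MM:SS' duration field to seconds."""
    hours, minutes, seconds = (int(part) for part in text.split(":"))
    return hours * 3600 + minutes * 60 + seconds

# (blockNum, duration) pairs from two raid_rg_diskcopy_progress events.
samples = [(19446912, "0:05:01"), (38660736, "0:10:01")]

(block0, dur0), (block1, dur1) = samples
elapsed = duration_seconds(dur1) - duration_seconds(dur0)
copied = block1 - block0
rate = copied / elapsed

print(f"{copied} blocks in {elapsed} s = {rate:.0f} blocks/s")
```

This only describes the first ten minutes of the copy; the rate over the rest of the run cannot be derived from the entries shown.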


03Oct2024 08:47:04 raid_label_io_writeError disk_info="Disk /ask_backup01/plex0/rg2/0a.01.23 Shelf 1 Bay 23 [NETAPP X343_TA15E1T8A10 NA01] S/N [71XXXXXXXXUR] UID [5000039B:5928131C:00000000:00000000:00000000:00000000:00000000:00000000:00000000:00000000]" error_str="Label read after label write did not match." instanceFile="prod/common/raidv2/label-io.c" instanceId="3318" shelf="1" bay="23" vendor="NETAPP " model="X343_TA15E1T8A10" firmware_revision="NA01" serialno="71XXXXXXXXUR" disk_type="4" disk_rpm="10000" carrier="" site="Local"
03Oct2024 08:47:04 raid_config_filesystem_disk_failed disk_info="Disk /ask_backup01/plex0/rg2/0a.01.23 Shelf 1 Bay 23 [NETAPP X343_TA15E1T8A10 NA01] S/N [71XXXXXXXXUR] UID [5000039B:5928131C:00000000:00000000:00000000:00000000:00000000:00000000:00000000:00000000]" shelf="1" bay="23" vendor="NETAPP " model="X343_TA15E1T8A10" firmware_revision="NA01" serialno="71XXXXXXXXUR" disk_type="4" disk_rpm="10000" failure_reason="disk failed" carrier="" site="Local"
03Oct2024 08:47:04 disk_outOfService diskName="0a.01.23" serialno="71XXXXXXXXUR" reason=": sense information: SCSI:hardware error(0x04), ASC(0x44), ASCQ(0xa3), FRU(0x01)" powerOnHours="18190" glistEntries="0" disk_information="Disk 0a.01.23 Shelf 1 Bay 23 [NETAPP X343_TA15E1T8A10 NA01] S/N [71XXXXXXXXUR] UID [5000039B:5928131C:00000000:00000000:00000000:00000000:00000000:00000000:00000000:00000000]"
03Oct2024 08:47:04 disk_write_failure diskName="0a.01.23" serialno="71XXXXXXXXUR" state="0x5" fa0="0xf4" fa1="0x44" fa2="0xa3" fa3="0x1" disk_information="Disk 0a.01.23 Shelf 1 Bay 23 [NETAPP X343_TA15E1T8A10 NA01] S/N [71XXXXXXXXUR] UID [5000039B:5928131C:00000000:00000000:00000000:00000000:00000000:00000000:00000000:00000000]"
03Oct2024 08:47:04 rastrace_dump_saved module="RAID" instance="0" filename="/etc/log/rastrace/RAID_0_20241003_08:47:04:323851.dmp"
03Oct2024 08:47:04 raid_notify_on_failure disk_uid="5000039B:5928131C:00000000:00000000:00000000:00000000:00000000:00000000:00000000:00000000" originating_sysid="538237732" failure_reason="1" failure_string="failed"
03Oct2024 08:47:04 raid_disk_unload_done disk_info="Disk 0a.01.23 Shelf 1 Bay 23 [NETAPP X343_TA15E1T8A10 NA01] S/N [71XXXXXXXXUR] UID [5000039B:5928131C:00000000:00000000:00000000:00000000:00000000:00000000:00000000:00000000]" shelf="1" bay="23" vendor="NETAPP " model="X343_TA15E1T8A10" firmware_revision="NA01" serialno="71XXXXXXXXUR" disk_type="4" disk_rpm="10000" carrier="" site="Local"
03Oct2024 08:47:04 callhome_fdsk_fault subject="FILESYSTEM DISK FAILED Shelf 1, Bay 23, Model [X343_TA15E1T8A10], S/N [71XXXXXXXXUR]"

  1. From 2024/10/3 08:47:04 to 2024/10/4 00:57:16, reconstruction ran on Shelf 3 Bay 23 and completed at 08:50:09, while cf_disk_skipped messages were logged repeatedly for the already disconnected disks in Shelf 2 Bay 19 and Shelf 1 Bay 23.

03Oct2024 08:47:09 raid_rg_recons_progress disk_info="Disk /ask_backup01/plex0/rg2/0a.03.23 Shelf 3 Bay 23 [NETAPP X343_TA15E1T8A10 NA01] S/N [42XXXXXXXXUR] UID [5XXXXXXB:98XXXX58:00000000:00000000:00000000:00000000:00000000:00000000:00000000:00000000]" blockNumCurrent="432731072" blockNumEnd="438661952" percent="99" duration="0:00:05" shelf="3" bay="23" vendor="NETAPP " model="X343_TA15E1T8A10" firmware_revision="NA01" serialno="42XXXXXXXXUR" disk_type="4" disk_rpm="10000" carrier="" site="Local"
03Oct2024 08:47:09 cf_disk_skipped diskname="0a.02.19" status="disk failed"
03Oct2024 08:49:10 cf_disk_skipped diskname="0a.01.23" status="disk failed"
03Oct2024 08:49:10 cf_disk_skipped diskname="0a.02.19" status="disk failed"
03Oct2024 08:50:09 raid_rg_normal owner="" name="/ask_backup01/plex0/rg2" aggr_UUID="68ab4e22-b25f-430e-add8-5fc247c5daca"
03Oct2024 08:50:09 raid_rg_recons_done disk_info="Disk /ask_backup01/plex0/rg2/0a.03.23 Shelf 3 Bay 23 [NETAPP X343_TA15E1T8A10 NA01] S/N [42XXXXXXXXUR] UID [5XXXXXXB:98XXXX58:00000000:00000000:00000000:00000000:00000000:00000000:00000000:00000000]" owner="" disk="0a.03.23" duration="3:05.08" shelf="3" bay="23" vendor="NETAPP " model="X343_TA15E1T8A10" firmware_revision="NA01" serialno="42XXXXXXXXUR" disk_type="4" disk_rpm="10000" carrier="" site="Local" aggregate_uuid="68ab4e22-b25f-430e-add8-5fc247c5daca"
03Oct2024 08:52:00 raid_rg_media_scrub_start owner="" rg="/ask_backup01/plex0/rg2" aggregate_uuid="68ab4e22-b25f-430e-add8-5fc247c5daca"
03Oct2024 08:52:10 cf_disk_skipped diskname="0a.01.23" status="disk failed"
03Oct2024 08:52:10 cf_disk_skipped diskname="0a.02.19" status="disk failed"
03Oct2024 08:55:10 cf_disk_skipped diskname="0a.01.23" status="disk failed"
03Oct2024 08:55:10 cf_disk_skipped diskname="0a.02.19" status="disk failed"
03Oct2024 08:58:10 cf_disk_skipped diskname="0a.01.23" status="disk failed"


04Oct2024 00:48:16 cf_disk_skipped diskname="0a.01.23" status="disk failed"
04Oct2024 00:48:16 cf_disk_skipped diskname="0a.02.19" status="disk failed"
04Oct2024 00:51:16 cf_disk_skipped diskname="0a.01.23" status="disk failed"
04Oct2024 00:51:16 cf_disk_skipped diskname="0a.02.19" status="disk failed"
04Oct2024 00:54:16 cf_disk_skipped diskname="0a.01.23" status="disk failed"
04Oct2024 00:54:16 cf_disk_skipped diskname="0a.02.19" status="disk failed"
04Oct2024 00:57:16 cf_disk_skipped diskname="0a.01.23" status="disk failed"
04Oct2024 00:57:16 cf_disk_skipped diskname="0a.02.19" status="disk failed"
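
The cf_disk_skipped messages recur at a fixed cadence, which can be confirmed from the timestamps. The sketch below assumes the "DDMonYYYY HH:MM:SS" timestamp layout seen in this log and an English month abbreviation. The function name is hypothetical and the sample lines are copied from the entries above.

```python
from datetime import datetime

STAMP_FORMAT = "%d%b%Y %H:%M:%S"  # e.g. "03Oct2024 08:49:10"

def skip_intervals(lines, disk):
    """Return seconds between consecutive cf_disk_skipped events for a disk."""
    stamps = [
        datetime.strptime(line[:18], STAMP_FORMAT)
        for line in lines
        if " cf_disk_skipped " in line and f'diskname="{disk}"' in line
    ]
    return [(b - a).total_seconds() for a, b in zip(stamps, stamps[1:])]

lines = [
    '03Oct2024 08:49:10 cf_disk_skipped diskname="0a.01.23" status="disk failed"',
    '03Oct2024 08:49:10 cf_disk_skipped diskname="0a.02.19" status="disk failed"',
    '03Oct2024 08:52:10 cf_disk_skipped diskname="0a.01.23" status="disk failed"',
    '03Oct2024 08:52:10 cf_disk_skipped diskname="0a.02.19" status="disk failed"',
    '03Oct2024 08:55:10 cf_disk_skipped diskname="0a.01.23" status="disk failed"',
    '03Oct2024 08:55:10 cf_disk_skipped diskname="0a.02.19" status="disk failed"',
    '03Oct2024 08:58:10 cf_disk_skipped diskname="0a.01.23" status="disk failed"',
]

intervals = skip_intervals(lines, "0a.01.23")
print(intervals)
```

For these sample entries, the message repeats every 180 seconds for the failed disk.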

  1. From 2024/10/4 01:00:59 to 01:12:59, raid_readerr_lw_reconsBadBlk_ckinfo and raid_readerr_lw_reconsGoodBlk_ckinfo were logged repeatedly together with raid_rg_readerr_repair_data, indicating that block repairs were being performed on Shelf 3 Bay 23.

04Oct2024 01:00:59 raid_readerr_lw_dparity_ckinfo owner="" disk_name="/ask_backup01/plex0/rg2/0a.02.12" dbn="554826" stripe_id="0x12" gen_cnt1="0x15555" gen_cnt2="0x0" comp_cksum="0xb617e667" wafl_cxt0="0xab3e4c6a" wafl_cxt1="0x1eb1403" wafl_cxt2="0x1ff002" wafl_cxt3="0x1f87e0" wafl_cxt4="0x0" wafl_cxt5="0x0" wafl_cxt6="0x0" wafl_cxt7="0x0" disk_serialno="42S0A070F4UR"
04Oct2024 01:00:59 raid_readerr_lw_parity_ckinfo owner="" disk_name="/ask_backup01/plex0/rg2/0a.02.13" dbn="554826" stripe_id="0x12" gen_cnt1="0x15555" gen_cnt2="0x0" comp_cksum="0x1bddcf8f" wafl_cxt0="0xb32c6f" wafl_cxt1="0x1d79" wafl_cxt2="0x403" wafl_cxt3="0x410" wafl_cxt4="0x0" wafl_cxt5="0x0" wafl_cxt6="0x0" wafl_cxt7="0x0" disk_serialno="42S0A06GF4UR"
04Oct2024 01:00:59 raid_readerr_lw_data_ckinfo owner="" disk_name="/ask_backup01/plex0/rg2/0a.02.14" dbn="554826" disk_pos="0" stripe_id="0x1259fede" gen_cnt="0x1" comp_cksum="0x9cd49c3e" sto_vbn="7841090378" wafl_cxt0="0x563825" wafl_cxt1="0x65f4" wafl_cxt2="0x403" wafl_cxt3="0x410" wafl_cxt4="0x0" wafl_cxt5="0x0" wafl_cxt6="0x0" wafl_cxt7="0x0" disk_serialno="7150B0N8V4UR"
04Oct2024 01:00:59 raid_readerr_lw_reconsBadBlk_ckinfo owner="" disk_name="/ask_backup01/plex0/rg2/0a.03.23" dbn="554826" comp_cksum="0xe7fd4673" wafl_cxt0="0x563709" wafl_cxt1="0x65f4" wafl_cxt2="0x403" wafl_cxt3="0x410" wafl_cxt4="0x0" wafl_cxt5="0x0" wafl_cxt6="0x0" wafl_cxt7="0x0" iteration="1" disk_serialno="42XXXXXXXXUR"
04Oct2024 01:00:59 raid_readerr_lw_reconsGoodBlk_ckinfo owner="" disk_name="/ask_backup01/plex0/rg2/0a.02.12" dbn="554826" comp_cksum="0xb617e667" wafl_cxt0="0xab3e4c6a" wafl_cxt1="0x1eb1403" wafl_cxt2="0x1ff002" wafl_cxt3="0x1f87e0" wafl_cxt4="0x0" wafl_cxt5="0x0" wafl_cxt6="0x0" wafl_cxt7="0x0" iteration="1" disk_serialno="42S0A070F4UR"
04Oct2024 01:00:59 raid_readerr_lw_consist_check_pass owner="" rg="/ask_backup01/plex0/rg2" stripe_num="554826" iteration="1"
04Oct2024 01:00:59 ems_engine_suppressed emsId="raid.rg.readerr.repair.data" numDrops="3" numSeconds="258570"


04Oct2024 01:12:59 raid_readerr_lw_consist_check_pass owner="" rg="/ask_backup01/plex0/rg2" stripe_num="558383" iteration="1"
04Oct2024 01:12:59 raid_rg_readerr_repair_data owner="" disk_info="Disk /ask_backup01/plex0/rg2/0a.03.23 Shelf 3 Bay 23 [NETAPP X343_TA15E1T8A10 NA01] S/N [42XXXXXXXXUR] UID [5XXXXXXB:98XXXX58:00000000:00000000:00000000:00000000:00000000:00000000:00000000:00000000]" blockNum="558383" vbn="10034377455" shelf="3" bay="23" vendor="NETAPP " model="X343_TA15E1T8A10" firmware_revision="NA01" serialno="42XXXXXXXXUR" disk_type="4" disk_rpm="10000" carrier="" site="Local"
04Oct2024 01:12:59 raid_readerr_lw_reconsBadBlk_ckinfo owner="" disk_name="/ask_backup01/plex0/rg2/0a.03.23" dbn="558381" comp_cksum="0xb011fdbc" wafl_cxt0="0xb36228" wafl_cxt1="0x65e5" wafl_cxt2="0x404" wafl_cxt3="0x410" wafl_cxt4="0x0" wafl_cxt5="0x0" wafl_cxt6="0x0" wafl_cxt7="0x0" iteration="1" disk_serialno="42XXXXXXXXUR"
04Oct2024 01:12:59 raid_readerr_lw_reconsGoodBlk_ckinfo owner="" disk_name="/ask_backup01/plex0/rg2/0a.02.12" dbn="558381" comp_cksum="0x784d4f1d" wafl_cxt0="0x8d2e8fa" wafl_cxt1="0x1f7f949" wafl_cxt2="0x1fe7f8" wafl_cxt3="0x1f87e0" wafl_cxt4="0x0" wafl_cxt5="0x0" wafl_cxt6="0x0" wafl_cxt7="0x0" iteration="1" disk_serialno="42S0A070F4UR"
04Oct2024 01:12:59 raid_readerr_lw_consist_check_pass owner="" rg="/ask_backup01/plex0/rg2" stripe_num="558381" iteration="1"
04Oct2024 01:12:59 raid_pi_diag_error msg="Out of messages" type="checksum mismatch"
04Oct2024 01:12:59 raid_readerr_lw_reconsBadBlk_ckinfo owner="" disk_name="/ask_backup01/plex0/rg2/0a.03.23" dbn="558382" comp_cksum="0xabec5fa6" wafl_cxt0="0xb36229" wafl_cxt1="0x65e5" wafl_cxt2="0x404" wafl_cxt3="0x410" wafl_cxt4="0x0" wafl_cxt5="0x0" wafl_cxt6="0x0" wafl_cxt7="0x0" iteration="1" disk_serialno="42XXXXXXXXUR"
04Oct2024 01:12:59 raid_readerr_lw_reconsGoodBlk_ckinfo owner="" disk_name="/ask_backup01/plex0/rg2/0a.02.12" dbn="558382" comp_cksum="0xd0baf7b7" wafl_cxt0="0x8d2e8b8" wafl_cxt1="0x1f7f949" wafl_cxt2="0x1fe7f8" wafl_cxt3="0x1f87e0" wafl_cxt4="0x0" wafl_cxt5="0x0" wafl_cxt6="0x0" wafl_cxt7="0x0" iteration="1" disk_serialno="42S0A070F4UR"
04Oct2024 01:12:59 raid_readerr_lw_consist_check_pass owner="" rg="/ask_backup01/plex0/rg2" stripe_num="558382" iteration="1"
04Oct2024 01:12:59 raid_readerr_lw_reconsBadBlk_ckinfo owner="" disk_name="/ask_backup01/plex0/rg2/0a.03.23" dbn="558380" comp_cksum="0xe8c81d73" wafl_cxt0="0xb36227" wafl_cxt1="0x65e5" wafl_cxt2="0x404" wafl_cxt3="0x410" wafl_cxt4="0x0" wafl_cxt5="0x0" wafl_cxt6="0x0" wafl_cxt7="0x0" iteration="1" disk_serialno="42XXXXXXXXUR"
04Oct2024 01:12:59 raid_readerr_lw_reconsGoodBlk_ckinfo owner="" disk_name="/ask_backup01/plex0/rg2/0a.02.12" dbn="558380" comp_cksum="0x47e5cf7f" wafl_cxt0="0x8d2d3f4" wafl_cxt1="0x1f7f949" wafl_cxt2="0x1fe7f8" wafl_cxt3="0x1f87e0" wafl_cxt4="0x0" wafl_cxt5="0x0" wafl_cxt6="0x0" wafl_cxt7="0x0" iteration="1" disk_serialno="42S0A070F4UR"
04Oct2024 01:12:59 raid_readerr_lw_consist_check_pass owner="" rg="/ask_backup01/plex0/rg2" stripe_num="558380" iteration="1"

  1. On 2024/10/4 at 01:13:00, Shelf 3 Bay 23 was also determined to be faulty and was disconnected.

04Oct2024 01:12:59 raid_notify_on_failure disk_uid="5XXXXXXB:98XXXX58:00000000:00000000:00000000:00000000:00000000:00000000:00000000:00000000" originating_sysid="538237732" failure_reason="1" failure_string="failed"
04Oct2024 01:12:59 raid_disk_unload_done disk_info="Disk 0a.03.23 Shelf 3 Bay 23 [NETAPP X343_TA15E1T8A10 NA01] S/N [42XXXXXXXXUR] UID [5XXXXXXB:98XXXX58:00000000:00000000:00000000:00000000:00000000:00000000:00000000:00000000]" shelf="3" bay="23" vendor="NETAPP " model="X343_TA15E1T8A10" firmware_revision="NA01" serialno="42XXXXXXXXUR" disk_type="4" disk_rpm="10000" carrier="" site="Local"
04Oct2024 01:13:00 callhome_fdsk_fault subject="FILESYSTEM DISK FAILED Shelf 3, Bay 23, Model [X343_TA15E1T8A10], S/N [42XXXXXXXXUR]"
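
The three callhome events in this article give the overall failure timeline. The sketch below extracts them and computes the time between consecutive disk failures. It assumes the timestamp layout used throughout this log; the regular expression and variable names are illustrative, and the lines are copied from the callhome entries above.

```python
from datetime import datetime
import re

STAMP_FORMAT = "%d%b%Y %H:%M:%S"
CALLHOME_RE = re.compile(r'^(\S+ \S+) callhome_\w+ subject="([^"]*)"')

lines = [
    '01Oct2024 01:14:58 callhome_dsk_fault subject="DISK FAILED Shelf 2, Bay 19, Model [X343_TA15E1T8A10], S/N [42XXXXXXXXUR]"',
    '03Oct2024 08:47:04 callhome_fdsk_fault subject="FILESYSTEM DISK FAILED Shelf 1, Bay 23, Model [X343_TA15E1T8A10], S/N [71XXXXXXXXUR]"',
    '04Oct2024 01:13:00 callhome_fdsk_fault subject="FILESYSTEM DISK FAILED Shelf 3, Bay 23, Model [X343_TA15E1T8A10], S/N [42XXXXXXXXUR]"',
]

# Collect (timestamp, subject) for every callhome event.
events = []
for line in lines:
    match = CALLHOME_RE.match(line)
    if match:
        stamp = datetime.strptime(match.group(1), STAMP_FORMAT)
        events.append((stamp, match.group(2)))

# Time elapsed between consecutive failures.
gaps = [str(b[0] - a[0]) for a, b in zip(events, events[1:])]
print(gaps)
```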
