After a disk failure, the spare disks fail one after another
Applies to
- X343_TA15E1T8A10
- FAS2750
- ONTAP 9.9.1P10
Issue
- On 2024/10/1 from 01:10:06 to 01:10:59, Unrecovered read error occurred repeatedly on the disk in Shelf 2 Bay 19.
01Oct2024 01:10:06 disk_ioMediumError diskName="0a.02.19" op="0x28:07df1600:0200" sector="132060944" senseInfo="SCSI:medium error" sCode="Unrecovered read error" disk_info="- If the disk is in a RAID group, the subsystem will attempt to reconstruct unreadable data" sense_key="0x3" sense_code="0x11" qualifier="0x1" fru_failed="0x0" CTime="2496" disk_information="Disk 0a.02.19 Shelf 2 Bay 19 [NETAPP X343_TA15E1T8A10 NA01] S/N [42XXXXXXXXUR] UID [5XXXXXXB:98XXXXEC:00000000:00000000:00000000:00000000:00000000:00000000:00000000:00000000]"
01Oct2024 01:10:06 disk_IO_status deviceName="0a.02.19" ETime="2496" cdb="0x28:07df1600:0200" victimRetryCount="0" retryCount="0" timeoutRetryCount="0" pathRetryCount="0" adapterStatus="0x0" targetStatus="0x2" sSenseKey="SCSI:medium error" sSenseCode="Unrecovered read error" iSenseKey="0x3" iASC="0x11" iASCQ="0x1" pathsTried="1" basicTimeout="5" returnCode="5" disk_information="Disk 0a.02.19 Shelf 2 Bay 19 [NETAPP X343_TA15E1T8A10 NA01] S/N [42XXXXXXXXUR] UID [5XXXXXXB:98XXXXEC:00000000:00000000:00000000:00000000:00000000:00000000:00000000:00000000]"
・
・
01Oct2024 01:10:59 disk_ioMediumError diskName="0a.02.19" op="0x28:07df17a8:0008" sector="132061096" senseInfo="SCSI:medium error" sCode="Unrecovered read error" disk_info="- If the disk is in a RAID group, the subsystem will attempt to reconstruct unreadable data" sense_key="0x3" sense_code="0x11" qualifier="0xff" fru_failed="0x0" CTime="32232" disk_information="Disk 0a.02.19 Shelf 2 Bay 19 [NETAPP X343_TA15E1T8A10 NA01] S/N [42XXXXXXXXUR] UID [5XXXXXXB:98XXXXEC:00000000:00000000:00000000:00000000:00000000:00000000:00000000:00000000]"
01Oct2024 01:10:59 disk_IO_status deviceName="0a.02.19" ETime="32232" cdb="0x28:07df17a8:0008" victimRetryCount="6" retryCount="0" timeoutRetryCount="0" pathRetryCount="0" adapterStatus="0x0" targetStatus="0x2" sSenseKey="SCSI:medium error" sSenseCode="Unrecovered read error" iSenseKey="0x3" iASC="0x11" iASCQ="0xff" pathsTried="1" basicTimeout="5" returnCode="5" disk_information="Disk 0a.02.19 Shelf 2 Bay 19 [NETAPP X343_TA15E1T8A10 NA01] S/N [42XXXXXXXXUR] UID [5XXXXXXB:98XXXXEC:00000000:00000000:00000000:00000000:00000000:00000000:00000000:00000000]"
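The `op` and `cdb` values in these events can be read as a SCSI command. The following is a minimal sketch, assuming the field is laid out as `opcode:LBA:transfer length` in hexadecimal (an interpretation of the log format, not something this article states). `decode_cdb` is a hypothetical helper name. Opcodes 0x28 and 0x2a are READ(10) and WRITE(10) in the SCSI command set.

```python
# Hypothetical helper: split an EMS op/cdb string into its parts.
# Assumption: the layout is "opcode:LBA:length", all hexadecimal.
SCSI_OPCODES = {0x28: "READ(10)", 0x2A: "WRITE(10)"}


def decode_cdb(cdb: str) -> dict:
    opcode_hex, lba_hex, length_hex = cdb.split(":")
    opcode = int(opcode_hex, 16)  # int() accepts the "0x" prefix with base 16
    return {
        "opcode": SCSI_OPCODES.get(opcode, hex(opcode)),
        "lba": int(lba_hex, 16),
        "blocks": int(length_hex, 16),
    }


# Value taken from the 01:10:59 disk_ioMediumError event above.
print(decode_cdb("0x28:07df17a8:0008"))
```

Under that assumption, the decoded LBA of the 01:10:59 event equals the `sector` value reported on the same line (132061096), which suggests the reading is plausible for these events.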
- On 2024/10/1 at 01:10:59, a large number of raid_rg_scrub_media_err events occurred on the disk in Shelf 2 Bay 19.
01Oct2024 01:10:59 raid_rg_scrub_media_err owner="" disk_info="Disk /ask_backup01/plex0/rg2/0a.02.19 Shelf 2 Bay 19 [NETAPP X343_TA15E1T8A10 NA01] S/N [42XXXXXXXXUR] UID [5XXXXXXB:98XXXXEC:00000000:00000000:00000000:00000000:00000000:00000000:00000000:00000000]" blockNum="16507618" shelf="2" bay="19" vendor="NETAPP " model="X343_TA15E1T8A10" firmware_revision="NA01" serialno="42XXXXXXXXUR" disk_type="4" disk_rpm="10000" carrier="" site="Local"
01Oct2024 01:10:59 raid_rg_scrub_media_err owner="" disk_info="Disk /ask_backup01/plex0/rg2/0a.02.19 Shelf 2 Bay 19 [NETAPP X343_TA15E1T8A10 NA01] S/N [42XXXXXXXXUR] UID [5XXXXXXB:98XXXXEC:00000000:00000000:00000000:00000000:00000000:00000000:00000000:00000000]" blockNum="16507619" shelf="2" bay="19" vendor="NETAPP " model="X343_TA15E1T8A10" firmware_revision="NA01" serialno="42XXXXXXXXUR" disk_type="4" disk_rpm="10000" carrier="" site="Local"
・
・
firmware_revision="NA01" serialno="42XXXXXXXXUR" disk_type="4" disk_rpm="10000" carrier="" site="Local"
01Oct2024 01:10:59 raid_rg_scrub_media_err owner="" disk_info="Disk /ask_backup01/plex0/rg2/0a.02.19 Shelf 2 Bay 19 [NETAPP X343_TA15E1T8A10 NA01] S/N [42XXXXXXXXUR] UID [5XXXXXXB:98XXXXEC:00000000:00000000:00000000:00000000:00000000:00000000:00000000:00000000]" blockNum="16507646" shelf="2" bay="19" vendor="NETAPP " model="X343_TA15E1T8A10" firmware_revision="NA01" serialno="42XXXXXXXXUR" disk_type="4" disk_rpm="10000" carrier="" site="Local"
01Oct2024 01:10:59 raid_rg_scrub_media_err owner="" disk_info="Disk /ask_backup01/plex0/rg2/0a.02.19 Shelf 2 Bay 19 [NETAPP X343_TA15E1T8A10 NA01] S/N [42XXXXXXXXUR] UID [5XXXXXXB:98XXXXEC:00000000:00000000:00000000:00000000:00000000:00000000:00000000:00000000]" blockNum="16507647" shelf="2" bay="19" vendor="NETAPP " model="X343_TA15E1T8A10" firmware_revision="NA01" serialno="42XXXXXXXXUR" disk_type="4" disk_rpm="10000" carrier="" site="Local"
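When there are many such events, it can help to count them per disk rather than read them line by line. The sketch below assumes each EMS line has the shape `timestamp event_name key="value" ...`, as in the excerpts above. `parse_ems` and the regular expressions are illustrative names, and the sample lines are abbreviated copies of the events above with most fields dropped.

```python
import re
from collections import Counter

# Assumed line shape: "01Oct2024 01:10:59 event_name key="value" ..."
LINE_RE = re.compile(r'^(\d{2}[A-Za-z]{3}\d{4} \d{2}:\d{2}:\d{2}) (\S+) ?(.*)$')
FIELD_RE = re.compile(r'(\w+)="([^"]*)"')


def parse_ems(line):
    """Return time, event name and key/value fields, or None for non-event lines."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    stamp, event, rest = m.groups()
    return {"time": stamp, "event": event, "fields": dict(FIELD_RE.findall(rest))}


# Abbreviated copies of the events above; the "・" marks elided lines.
SAMPLE = [
    '01Oct2024 01:10:59 raid_rg_scrub_media_err owner="" blockNum="16507618" shelf="2" bay="19"',
    '・',
    '01Oct2024 01:10:59 raid_rg_scrub_media_err owner="" blockNum="16507647" shelf="2" bay="19"',
]

events = [e for e in map(parse_ems, SAMPLE) if e is not None]
per_bay = Counter(
    (e["fields"]["shelf"], e["fields"]["bay"])
    for e in events
    if e["event"] == "raid_rg_scrub_media_err"
)
print(per_bay)
```

Run against a full EMS log instead of `SAMPLE`, the same counter would show whether the media errors are confined to one bay or spread across the RAID group.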
- From 2024/10/1 01:11:01 to 01:11:30, SCSI: aborted command and Unrecovered read error occurred alternately on the disk in Shelf 2 Bay 19.
01Oct2024 01:11:01 scsi_cmd_checkCondition deviceType="Disk" deviceName="0a.02.19" cdb="0x28:07df1880:0008" sSenseKey="SCSI:aborted command" sSenseCode="" iSenseKey="0xb" iASC="0x2f" iASCQ="0x10" iFRU="0x0" DTime="4610" disk_information=""
01Oct2024 01:11:01 scsi_cmd_checkCondition deviceType="Disk" deviceName="0a.02.19" cdb="0x28:07df1870:0008" sSenseKey="SCSI:aborted command" sSenseCode="" iSenseKey="0xb" iASC="0x2f" iASCQ="0x10" iFRU="0x0" DTime="4610" disk_information=""
01Oct2024 01:11:01 scsi_cmd_checkCondition deviceType="Disk" deviceName="0a.02.19" cdb="0x28:07df1868:0008" sSenseKey="SCSI:aborted command" sSenseCode="" iSenseKey="0xb" iASC="0x2f" iASCQ="0x10" iFRU="0x0" DTime="4610" disk_information=""
01Oct2024 01:11:01 scsi_cmd_checkCondition deviceType="Disk" deviceName="0a.02.19" cdb="0x28:07df1878:0008" sSenseKey="SCSI:aborted command" sSenseCode="" iSenseKey="0xb" iASC="0x2f" iASCQ="0x10" iFRU="0x0" DTime="4610" disk_information=""
01Oct2024 01:11:02 disk_ioMediumError diskName="0a.02.19" op="0x28:07df1888:0008" sector="132061320" senseInfo="SCSI:medium error" sCode="Unrecovered read error" disk_info="- If the disk is in a RAID group, the subsystem will attempt to reconstruct unreadable data" sense_key="0x3" sense_code="0x11" qualifier="0x1" fru_failed="0x0" CTime="4911" disk_information="Disk 0a.02.19 Shelf 2 Bay 19 [NETAPP X343_TA15E1T8A10 NA01] S/N [42XXXXXXXXUR] UID [5XXXXXXB:98XXXXEC:00000000:00000000:00000000:00000000:00000000:00000000:00000000:00000000]"
01Oct2024 01:11:02 disk_IO_status deviceName="0a.02.19" ETime="4911" cdb="0x28:07df1888:0008" victimRetryCount="0" retryCount="0" timeoutRetryCount="0" pathRetryCount="0" adapterStatus="0x0" targetStatus="0x2" sSenseKey="SCSI:medium error" sSenseCode="Unrecovered read error" iSenseKey="0x3" iASC="0x11" iASCQ="0x1" pathsTried="1" basicTimeout="5" returnCode="5" disk_information="Disk 0a.02.19 Shelf 2 Bay 19 [NETAPP X343_TA15E1T8A10 NA01] S/N [42XXXXXXXXUR] UID [5XXXXXXB:98XXXXEC:00000000:00000000:00000000:00000000:00000000:00000000:00000000:00000000]"
01Oct2024 01:11:03 disk_ioMediumError diskName="0a.02.19" op="0x28:07df1890:0008" sector="132061328" senseInfo="SCSI:medium error" sCode="Unrecovered read error" disk_info="- If the disk is in a RAID group, the subsystem will attempt to reconstruct unreadable data" sense_key="0x3" sense_code="0x11" qualifier="0xff" fru_failed="0x0" CTime="4594" disk_information="Disk 0a.02.19 Shelf 2 Bay 19 [NETAPP X343_TA15E1T8A10 NA01] S/N [42XXXXXXXXUR] UID [5XXXXXXB:98XXXXEC:00000000:00000000:00000000:00000000:00000000:00000000:00000000:00000000]"
・
・
01Oct2024 01:11:28 scsi_cmd_checkCondition deviceType="Disk" deviceName="0a.02.19" cdb="0x28:07df1878:0008" sSenseKey="SCSI:aborted command" sSenseCode="" iSenseKey="0xb" iASC="0x2f" iASCQ="0x10" iFRU="0x0" DTime="32432" disk_information=""
01Oct2024 01:11:28 scsi_cmd_checkCondition deviceType="Disk" deviceName="0a.02.19" cdb="0x28:07df1868:0008" sSenseKey="SCSI:aborted command" sSenseCode="" iSenseKey="0xb" iASC="0x2f" iASCQ="0x10" iFRU="0x0" DTime="32432" disk_information=""
01Oct2024 01:11:28 scsi_cmd_checkCondition deviceType="Disk" deviceName="0a.02.19" cdb="0x28:07df1958:0008" sSenseKey="SCSI:aborted command" sSenseCode="" iSenseKey="0xb" iASC="0x2f" iASCQ="0x10" iFRU="0x0" DTime="4755" disk_information=""
01Oct2024 01:11:29 raid_rg_readerr_repair_data owner="" disk_info="Disk /ask_backup01/plex0/rg2/0a.02.19 Shelf 2 Bay 19 [NETAPP X343_TA15E1T8A10 NA01] S/N [42XXXXXXXXUR] UID [5XXXXXXB:98XXXXEC:00000000:00000000:00000000:00000000:00000000:00000000:00000000:00000000]" blockNum="16507621" vbn="10050326693" shelf="2" bay="19" vendor="NETAPP " model="X343_TA15E1T8A10" firmware_revision="NA01" serialno="42XXXXXXXXUR" disk_type="4" disk_rpm="10000" carrier="" site="Local"
01Oct2024 01:11:30 disk_IO_status deviceName="0a.02.19" ETime="4550" cdb="0x28:07df1960:0008" victimRetryCount="0" retryCount="0" timeoutRetryCount="0" pathRetryCount="0" adapterStatus="0x0" targetStatus="0x2" sSenseKey="SCSI:medium error" sSenseCode="Unrecovered read error" iSenseKey="0x3" iASC="0x11" iASCQ="0xff" pathsTried="1" basicTimeout="5" returnCode="5" disk_information="Disk 0a.02.19 Shelf 2 Bay 19 [NETAPP X343_TA15E1T8A10 NA01] S/N [42XXXXXXXXUR] UID [5XXXXXXB:98XXXXEC:00000000:00000000:00000000:00000000:00000000:00000000:00000000:00000000]"
01Oct2024 01:11:30 disk_IO_status deviceName="0a.02.19" ETime="4559" cdb="0x28:07df1968:0008" victimRetryCount="0" retryCount="0" timeoutRetryCount="0" pathRetryCount="0" adapterStatus="0x0" targetStatus="0x2" sSenseKey="SCSI:medium error" sSenseCode="Unrecovered read error" iSenseKey="0x3" iASC="0x11" iASCQ="0xff" pathsTried="1" basicTimeout="5" returnCode="5" disk_information="Disk 0a.02.19 Shelf 2 Bay 19 [NETAPP X343_TA15E1T8A10 NA01] S/N [42XXXXXXXXUR] UID [5XXXXXXB:98XXXXEC:00000000:00000000:00000000:00000000:00000000:00000000:00000000:00000000]"
- On 2024/10/1 at 01:11:32, the disk in Shelf 2 Bay 19 was identified as faulty and reconstruction started on the disk in Shelf 1 Bay 23. The disk in Shelf 1 Bay 23 then also became unresponsive, leading to power cycles and other actions from the IOM.
01Oct2024 01:11:32 raid_rg_recons_start disk_info="Disk /ask_backup01/plex0/rg2/0a.01.23 Shelf 1 Bay 23 [NETAPP X343_TA15E1T8A10 NA01] S/N [71XXXXXXXXUR] UID [5000039B:5928131C:00000000:00000000:00000000:00000000:00000000:00000000:00000000:00000000]" owner="" disk="0a.01.23" startBlockNum="5248" shelf="1" bay="23" vendor="NETAPP " model="X343_TA15E1T8A10" firmware_revision="NA01" serialno="71XXXXXXXXUR" disk_type="4" disk_rpm="10000" carrier="" site="Local" aggregate_uuid="68ab4e22-b25f-430e-add8-5fc247c5daca"
01Oct2024 01:11:32 raid_rg_scrub_stopped owner="" rg="/ask_backup01/plex0/rg2" stripe="16507584" duration="11:29.00" aggregate_uuid="68ab4e22-b25f-430e-add8-5fc247c5daca"
01Oct2024 01:11:32 raid_rg_scrub_summary_pi pis="0" suffix="ies" rg="/ask_backup01/plex0/rg2" current="."
01Oct2024 01:11:32 raid_rg_scrub_summary_cksum errors="0" suffix="s" rg="/ask_backup01/plex0/rg2" current="."
01Oct2024 01:11:32 raid_rg_scrub_summary_media errors="25" suffix="s" rg="/ask_backup01/plex0/rg2" current="."
01Oct2024 01:11:32 raid_rg_scrub_summary_lw errors="0" suffix="ies" rg="/ask_backup01/plex0/rg2" current="."
01Oct2024 01:11:33 raid_rg_recons_progress disk_info="Disk /ask_backup01/plex0/rg2/0a.01.23 Shelf 1 Bay 23 [NETAPP X343_TA15E1T8A10 NA01] S/N [71XXXXXXXXUR] UID [5000039B:5928131C:00000000:00000000:00000000:00000000:00000000:00000000:00000000:00000000]" blockNumCurrent="7424" blockNumEnd="438661952" percent="0" duration="0:00:01" shelf="1" bay="23" vendor="NETAPP " model="X343_TA15E1T8A10" firmware_revision="NA01" serialno="71XXXXXXXXUR" disk_type="4" disk_rpm="10000" carrier="" site="Local"
01Oct2024 01:11:34 raid_rg_diskcopy_aborted owner="" rg="/ask_backup01/plex0/rg2" source="0a.02.19" target="0a.01.23" blockNum="5248" duration="0:07.72" reason="Source disk failed." aggregate_uuid="68ab4e22-b25f-430e-add8-5fc247c5daca"
01Oct2024 01:11:39 sas_adapter_debug adapterName="0a" debug_string="Starting powercycle on device 0a.02.19"
01Oct2024 01:11:39 disk_dhm_pitstop_start diskName="0a.02.19" vendorName="NETAPP " productId="X343_TA15E1T8A10" fwVersion="NA01" serialno="42XXXXXXXXUR" device_class_name="SAS_8388608_3MB_DST"
01Oct2024 01:11:39 sas_device_quiesce adapterName="0a" deviceType="disk" deviceName="0a.01.23"
01Oct2024 01:11:42 sas_device_timeout adapterName="0a" deviceType="Disk" deviceName="0a.01.23"
01Oct2024 01:11:42 sas_adapter_debug adapterName="0a" debug_string="Level 0 timeout: Abort task set: 0a.01.23 L0 (0xfffff80779414040,0x28:00017400:0200,0/0)"
01Oct2024 01:11:42 sas_adapter_debug adapterName="0a" debug_string="ABORT TASKSET on device 0a.01.23 L0"
01Oct2024 01:11:44 sas_adapter_debug adapterName="0a" debug_string="Powercycle on device 0a.02.19 complete: status 0"
01Oct2024 01:11:45 sas_adapter_debug adapterName="0a" debug_string="Transition Comparing this: [IOM1] and next: [IOM1] ..."
01Oct2024 01:11:45 sas_adapter_debug adapterName="0a" debug_string="Device 0a.02.19 invalidate debounce - 40"
01Oct2024 01:11:45 sas_adapter_debug adapterName="0a" debug_string="Device 0b.02.19 invalidate debounce - 40"
- From 2024/10/1 01:12:11 to 01:14:50, multiple SCSI: not ready errors occurred on the disk in Shelf 1 Bay 23. Although it temporarily recovered after retries, SCSI: not ready reoccurred, ultimately leading to shm_setup_for_failure.
01Oct2024 01:12:11 scsi_cmd_notReadyCondition deviceType="Disk" deviceName="0b.01.23" cdb="0x2a:0043c000:0200" sSenseKey="SCSI:not ready" sSenseCode="Drive spinning up" iSenseKey="0x2" iASC="0x4" iASCQ="0x1" iFRU="0x0" dTime="9932" disk_information=""
01Oct2024 01:12:11 scsi_cmd_notReadyCondition deviceType="Disk" deviceName="0b.01.23" cdb="0x2a:d196b930:0010" sSenseKey="SCSI:not ready" sSenseCode="Drive spinning up" iSenseKey="0x2" iASC="0x4" iASCQ="0x1" iFRU="0x0" dTime="9920" disk_information=""
01Oct2024 01:12:11 scsi_cmd_notReadyCondition deviceType="Disk" deviceName="0b.01.23" cdb="0x2a:0043c400:0200" sSenseKey="SCSI:not ready" sSenseCode="Drive spinning up" iSenseKey="0x2" iASC="0x4" iASCQ="0x1" iFRU="0x0" dTime="8945" disk_information=""
01Oct2024 01:12:11 scsi_cmd_notReadyCondition deviceType="Disk" deviceName="0b.01.23" cdb="0x2a:0043be00:0200" sSenseKey="SCSI:not ready" sSenseCode="Drive spinning up" iSenseKey="0x2" iASC="0x4" iASCQ="0x1" iFRU="0x0" dTime="9943" disk_information=""
01Oct2024 01:12:11 scsi_cmd_notReadyCondition deviceType="Disk" deviceName="0b.01.23" cdb="0x2a:0043c200:0200" sSenseKey="SCSI:not ready" sSenseCode="Drive spinning up" iSenseKey="0x2" iASC="0x4" iASCQ="0x1" iFRU="0x0" dTime="9951" disk_information=""
01Oct2024 01:12:11 scsi_cmd_notReadyCondition deviceType="Disk" deviceName="0b.01.23" cdb="0x2a:0043c000:0200" sSenseKey="SCSI:not ready" sSenseCode="Drive spinning up" iSenseKey="0x2" iASC="0x4" iASCQ="0x1" iFRU="0x0" dTime="10830" disk_information=""
・
・
01Oct2024 01:14:48 scsi_cmd_notReadyConditionEMSOnly deviceType="Disk" deviceName="0a.01.23" cdb="0x2a:01c55000:0200" sSenseKey="SCSI:not ready" sSenseCode="Drive spinning up" iSenseKey="0x2" iASC="0x4" iASCQ="0x1" iFRU="0x0" dTime="17977" disk_information=""
01Oct2024 01:14:49 scsi_cmd_notReadyCondition deviceType="Disk" deviceName="0a.01.23" cdb="0x2a:01c55200:0200" sSenseKey="SCSI:not ready" sSenseCode="Drive spinning up" iSenseKey="0x2" iASC="0x4" iASCQ="0x1" iFRU="0x0" dTime="17986" disk_information=""
01Oct2024 01:14:49 scsi_cmd_notReadyCondition deviceType="Disk" deviceName="0a.01.23" cdb="0x2a:01c55600:0200" sSenseKey="SCSI:not ready" sSenseCode="Drive spinning up" iSenseKey="0x2" iASC="0x4" iASCQ="0x1" iFRU="0x0" dTime="17994" disk_information=""
01Oct2024 01:14:49 scsi_cmd_notReadyCondition deviceType="Disk" deviceName="0a.01.23" cdb="0x2a:01c55400:0200" sSenseKey="SCSI:not ready" sSenseCode="Drive spinning up" iSenseKey="0x2" iASC="0x4" iASCQ="0x1" iFRU="0x0" dTime="18006" disk_information=""
01Oct2024 01:14:50 scsi_cmd_retrySuccess deviceType="Disk" deviceName="0a.01.23" retryCount="1" freeRetryCount="10" cdb="0x2a:01c55000:0200" dTime="19133"
01Oct2024 01:14:50 scsi_debug debug_string="shm_setup_for_failure disk 0a.01.23 (S/N 71XXXXXXXXUR) error 100h"
- From 2024/10/1 01:14:53 to 01:14:58, maintenance testing (DHM pitstop) of the disk in Shelf 2 Bay 19 completed with a failed drive self test, the disk was failed, and an AutoSupport for DISK FAILED was triggered.
01Oct2024 01:14:53 dhm_pitstop_complete diskName="0a.02.19" vendorName="NETAPP " productId="X343_TA15E1T8A10" fwVersion="NA01" serialno="42XXXXXXXXUR" result="DHM pitstop's drive self test failed" lbasPerZone="0x800000" actualTimeTaken="0" projectedTimeTaken="0" numZones="3" diskZeroTime="13000" rightSize="0xd12b99ff"
01Oct2024 01:14:55 cf_disk_skipped diskname="0a.02.19" status="adapter error prevents command from being sent to device"
01Oct2024 01:14:58 raid_disk_maint_failed disk_info="Disk 0a.02.19 Shelf 2 Bay 19 [NETAPP X343_TA15E1T8A10 NA01] S/N [42XXXXXXXXUR] UID [5XXXXXXB:98XXXXEC:00000000:00000000:00000000:00000000:00000000:00000000:00000000:00000000]" start_code="f1 5d 12 00" end_code="00 00 00 00" shelf="2" bay="19" vendor="NETAPP " model="X343_TA15E1T8A10" firmware_revision="NA01" serialno="42XXXXXXXXUR" disk_type="4" disk_rpm="10000" carrier="" site="Local"
01Oct2024 01:14:58 raid_config_disk_failed disk_info="Disk 0a.02.19 Shelf 2 Bay 19 [NETAPP X343_TA15E1T8A10 NA01] S/N [42XXXXXXXXUR] UID [5XXXXXXB:98XXXXEC:00000000:00000000:00000000:00000000:00000000:00000000:00000000:00000000]" shelf="2" bay="19" vendor="NETAPP " model="X343_TA15E1T8A10" firmware_revision="NA01" serialno="42XXXXXXXXUR" disk_type="4" disk_rpm="10000" failure_reason="disk maintenance testing failed" carrier="" site="Local"
01Oct2024 01:14:58 callhome_dsk_fault subject="DISK FAILED Shelf 2, Bay 19, Model [X343_TA15E1T8A10], S/N [42XXXXXXXXUR]"
- From 2024/10/1 01:18:43 to 01:36:48, SCSI: not ready and SCSI: hardware error continued to appear on the disk in Shelf 1 Bay 23. Power cycles and other actions performed by the IOM did not improve the situation, and shm_setup_for_failure was recorded.
01Oct2024 01:18:43 scsi_cmd_notReadyCondition deviceType="Disk" deviceName="0a.01.23" cdb="0x2a:03794400:0200" sSenseKey="SCSI:not ready" sSenseCode="Drive spinning up" iSenseKey="0x2" iASC="0x4" iASCQ="0x1" iFRU="0x0" dTime="9483" disk_information=""
01Oct2024 01:18:43 scsi_cmd_notReadyCondition deviceType="Disk" deviceName="0a.01.23" cdb="0x2a:03794600:0200" sSenseKey="SCSI:not ready" sSenseCode="Drive spinning up" iSenseKey="0x2" iASC="0x4" iASCQ="0x1" iFRU="0x0" dTime="9489" disk_information=""
01Oct2024 01:18:43 scsi_cmd_notReadyCondition deviceType="Disk" deviceName="0a.01.23" cdb="0x2a:03794800:0200" sSenseKey="SCSI:not ready" sSenseCode="Drive spinning up" iSenseKey="0x2" iASC="0x4" iASCQ="0x1" iFRU="0x0" dTime="8688" disk_information=""
01Oct2024 01:18:43 scsi_cmd_notReadyConditionEMSOnly deviceType="Disk" deviceName="0a.01.23" cdb="0x2a:03794200:0200" sSenseKey="SCSI:not ready" sSenseCode="Drive spinning up" iSenseKey="0x2" iASC="0x4" iASCQ="0x1" iFRU="0x0" dTime="9517" disk_information=""
・
・
01Oct2024 01:32:49 scsi_cmd_checkCondition deviceType="Disk" deviceName="0a.01.23" cdb="0x2a:08c8c400:0200" sSenseKey="SCSI:hardware error" sSenseCode="" iSenseKey="0x4" iASC="0x44" iASCQ="0xa3" iFRU="0x0" DTime="1021" disk_information=""
01Oct2024 01:32:49 disk_powercycle deviceType="Disk" deviceName="0a.01.23" disk_information="Disk 0a.01.23 Shelf 1 Bay 23 [NETAPP X343_TA15E1T8A10 NA01] S/N [71XXXXXXXXUR] UID [5000039B:5928131C:00000000:00000000:00000000:00000000:00000000:00000000:00000000:00000000]" reason="a hardware error" command="0x2a:08c8c400:0200"
01Oct2024 01:32:49 scsi_cmd_retrySuccess deviceType="Disk" deviceName="0a.01.23" retryCount="1" freeRetryCount="0" cdb="0x2a:08c8c400:0200" dTime="1056"
01Oct2024 01:32:54 scsi_cmd_checkCondition deviceType="Disk" deviceName="0a.01.23" cdb="0x2a:08cfe200:0200" sSenseKey="SCSI:hardware error" sSenseCode="" iSenseKey="0x4" iASC="0x44" iASCQ="0xa3" iFRU="0x0" DTime="933" disk_information=""
01Oct2024 01:32:54 disk_powercycle deviceType="Disk" deviceName="0a.01.23" disk_information="Disk 0a.01.23 Shelf 1 Bay 23 [NETAPP X343_TA15E1T8A10 NA01] S/N [71XXXXXXXXUR] UID [5000039B:5928131C:00000000:00000000:00000000:00000000:00000000:00000000:00000000:00000000]" reason="a hardware error" command="0x2a:08cfe200:0200"
01Oct2024 01:32:54 scsi_cmd_retrySuccess deviceType="Disk" deviceName="0a.01.23" retryCount="1" freeRetryCount="0" cdb="0x2a:08cfe200:0200" dTime="934"
01Oct2024 01:32:59 scsi_cmd_checkCondition deviceType="Disk" deviceName="0a.01.23" cdb="0x2a:08d93800:0200" sSenseKey="SCSI:hardware error" sSenseCode="" iSenseKey="0x4" iASC="0x44" iASCQ="0xa3" iFRU="0x0" DTime="995" disk_information=""
01Oct2024 01:36:48 scsi_cmd_retrySuccess deviceType="Disk" deviceName="0a.01.23" retryCount="0" freeRetryCount="11" cdb="0x2a:09b15200:0200" dTime="20431"
01Oct2024 01:36:48 scsi_cmd_retrySuccess deviceType="Disk" deviceName="0a.01.23" retryCount="1" freeRetryCount="11" cdb="0x2a:09b14c00:0200" dTime="20716"
01Oct2024 01:36:48 scsi_cmd_retrySuccess deviceType="Disk" deviceName="0a.01.23" retryCount="0" freeRetryCount="11" cdb="0x2a:09b14e00:0200" dTime="20715"
01Oct2024 01:36:48 scsi_debug debug_string="shm_setup_for_failure disk 0a.01.23 (S/N 71XXXXXXXXUR) error 100h"
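The mix of conditions on a single disk can be summarized by tallying the `iSenseKey` field per device. The sketch below is illustrative only: `sense_summary` is a hypothetical helper, the sense-key names are the standard SCSI ones (they match the `sSenseKey` strings in the events above), and the sample lines are abbreviated copies of events from this period.

```python
import re
from collections import Counter

# Standard SCSI sense keys that appear in this article's events.
SENSE_KEYS = {
    0x2: "NOT READY",
    0x3: "MEDIUM ERROR",
    0x4: "HARDWARE ERROR",
    0xB: "ABORTED COMMAND",
}
FIELD_RE = re.compile(r'(\w+)="([^"]*)"')


def sense_summary(lines):
    """Count (device, sense key) pairs; lines without iSenseKey are ignored."""
    tally = Counter()
    for line in lines:
        fields = dict(FIELD_RE.findall(line))
        if "iSenseKey" not in fields:
            continue
        key = int(fields["iSenseKey"], 16)
        device = fields.get("deviceName", "?")
        tally[(device, SENSE_KEYS.get(key, hex(key)))] += 1
    return tally


# Abbreviated copies of events above (most fields dropped).
SAMPLE = [
    '01Oct2024 01:18:43 scsi_cmd_notReadyCondition deviceName="0a.01.23" iSenseKey="0x2" iASC="0x4" iASCQ="0x1"',
    '01Oct2024 01:18:43 scsi_cmd_notReadyCondition deviceName="0a.01.23" iSenseKey="0x2" iASC="0x4" iASCQ="0x1"',
    '01Oct2024 01:32:49 scsi_cmd_checkCondition deviceName="0a.01.23" iSenseKey="0x4" iASC="0x44" iASCQ="0xa3"',
    '01Oct2024 01:32:49 scsi_cmd_retrySuccess deviceName="0a.01.23" retryCount="1"',
]

for (device, key), count in sorted(sense_summary(SAMPLE).items()):
    print(device, key, count)
```

Only the sense key is decoded here. The ASC/ASCQ pairs are left as raw values because some of them (for example ASCQ 0xa3) may be vendor-specific.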
- From 2024/10/1 06:36:09 to 2024/10/3 08:47:04, five hours after the error occurred, Shelf 1 Bay 23 was determined to be faulty, Disk Copy to Shelf 3 Bay 23 was completed, and an AutoSupport for DISK FAILED was triggered.
01Oct2024 06:36:09 raid_rg_diskcopy_start owner="" rg="/ask_backup01/plex0/rg2" source="0a.01.23" source_serialno="71XXXXXXXXUR" target="0a.03.23" target_serialno="42XXXXXXXXUR" reason="Disk replace was started." aggregate_uuid="68ab4e22-b25f-430e-add8-5fc247c5daca"
01Oct2024 06:36:09 raid_rg_diskcopy_progress source_disk="1.1.23" source_serialno="71XXXXXXXXUR" target_disk="1.3.23" target_serialno="42XXXXXXXXUR" blockNum="0" percent="0" duration="0:00:00"
01Oct2024 06:36:11 raid_spares_media_scrub_stopped owner="" disk_info="Disk 0a.03.23 Shelf 3 Bay 23 [NETAPP X343_TA15E1T8A10 NA01] S/N [42XXXXXXXXUR] UID [5XXXXXXB:98XXXX58:00000000:00000000:00000000:00000000:00000000:00000000:00000000:00000000]" shelf="3" bay="23" vendor="NETAPP " model="X343_TA15E1T8A10" firmware_revision="NA01" serialno="42XXXXXXXXUR" disk_type="4" disk_rpm="10000" carrier="" site="Local"
01Oct2024 06:36:34 cf_disk_skipped diskname="0a.02.19" status="disk failed"
01Oct2024 06:41:11 raid_rg_diskcopy_progress source_disk="1.1.23" source_serialno="71XXXXXXXXUR" target_disk="1.3.23" target_serialno="42XXXXXXXXUR" blockNum="19446912" percent="4" duration="0:05:01"
01Oct2024 06:41:34 cf_disk_skipped diskname="0a.02.19" status="disk failed"
01Oct2024 06:46:11 raid_rg_diskcopy_progress source_disk="1.1.23" source_serialno="71XXXXXXXXUR" target_disk="1.3.23" target_serialno="42XXXXXXXXUR" blockNum="38660736" percent="9" duration="0:10:01"
・
・
03Oct2024 08:47:04 raid_label_io_writeError disk_info="Disk /ask_backup01/plex0/rg2/0a.01.23 Shelf 1 Bay 23 [NETAPP X343_TA15E1T8A10 NA01] S/N [71XXXXXXXXUR] UID [5000039B:5928131C:00000000:00000000:00000000:00000000:00000000:00000000:00000000:00000000]" error_str="Label read after label write did not match." instanceFile="prod/common/raidv2/label-io.c" instanceId="3318" shelf="1" bay="23" vendor="NETAPP " model="X343_TA15E1T8A10" firmware_revision="NA01" serialno="71XXXXXXXXUR" disk_type="4" disk_rpm="10000" carrier="" site="Local"
03Oct2024 08:47:04 raid_config_filesystem_disk_failed disk_info="Disk /ask_backup01/plex0/rg2/0a.01.23 Shelf 1 Bay 23 [NETAPP X343_TA15E1T8A10 NA01] S/N [71XXXXXXXXUR] UID [5000039B:5928131C:00000000:00000000:00000000:00000000:00000000:00000000:00000000:00000000]" shelf="1" bay="23" vendor="NETAPP " model="X343_TA15E1T8A10" firmware_revision="NA01" serialno="71XXXXXXXXUR" disk_type="4" disk_rpm="10000" failure_reason="disk failed" carrier="" site="Local"
03Oct2024 08:47:04 disk_outOfService diskName="0a.01.23" serialno="71XXXXXXXXUR" reason=": sense information: SCSI:hardware error(0x04), ASC(0x44), ASCQ(0xa3), FRU(0x01)" powerOnHours="18190" glistEntries="0" disk_information="Disk 0a.01.23 Shelf 1 Bay 23 [NETAPP X343_TA15E1T8A10 NA01] S/N [71XXXXXXXXUR] UID [5000039B:5928131C:00000000:00000000:00000000:00000000:00000000:00000000:00000000:00000000]"
03Oct2024 08:47:04 disk_write_failure diskName="0a.01.23" serialno="71XXXXXXXXUR" state="0x5" fa0="0xf4" fa1="0x44" fa2="0xa3" fa3="0x1" disk_information="Disk 0a.01.23 Shelf 1 Bay 23 [NETAPP X343_TA15E1T8A10 NA01] S/N [71XXXXXXXXUR] UID [5000039B:5928131C:00000000:00000000:00000000:00000000:00000000:00000000:00000000:00000000]"
03Oct2024 08:47:04 rastrace_dump_saved module="RAID" instance="0" filename="/etc/log/rastrace/RAID_0_20241003_08:47:04:323851.dmp"
03Oct2024 08:47:04 raid_notify_on_failure disk_uid="5000039B:5928131C:00000000:00000000:00000000:00000000:00000000:00000000:00000000:00000000" originating_sysid="538237732" failure_reason="1" failure_string="failed"
03Oct2024 08:47:04 raid_disk_unload_done disk_info="Disk 0a.01.23 Shelf 1 Bay 23 [NETAPP X343_TA15E1T8A10 NA01] S/N [71XXXXXXXXUR] UID [5000039B:5928131C:00000000:00000000:00000000:00000000:00000000:00000000:00000000:00000000]" shelf="1" bay="23" vendor="NETAPP " model="X343_TA15E1T8A10" firmware_revision="NA01" serialno="71XXXXXXXXUR" disk_type="4" disk_rpm="10000" carrier="" site="Local"
03Oct2024 08:47:04 callhome_fdsk_fault subject="FILESYSTEM DISK FAILED Shelf 1, Bay 23, Model [X343_TA15E1T8A10], S/N [71XXXXXXXXUR]"
- From 2024/10/3 08:47:04 to 2024/10/4 00:57:16, reconstruction started on Shelf 3 Bay 23, but many skip errors were displayed for the already disconnected disks in Shelf 2 Bay 19 and Shelf 1 Bay 23.
03Oct2024 08:47:09 raid_rg_recons_progress disk_info="Disk /ask_backup01/plex0/rg2/0a.03.23 Shelf 3 Bay 23 [NETAPP X343_TA15E1T8A10 NA01] S/N [42XXXXXXXXUR] UID [5XXXXXXB:98XXXX58:00000000:00000000:00000000:00000000:00000000:00000000:00000000:00000000]" blockNumCurrent="432731072" blockNumEnd="438661952" percent="99" duration="0:00:05" shelf="3" bay="23" vendor="NETAPP " model="X343_TA15E1T8A10" firmware_revision="NA01" serialno="42XXXXXXXXUR" disk_type="4" disk_rpm="10000" carrier="" site="Local"
03Oct2024 08:47:09 cf_disk_skipped diskname="0a.02.19" status="disk failed"
03Oct2024 08:49:10 cf_disk_skipped diskname="0a.01.23" status="disk failed"
03Oct2024 08:49:10 cf_disk_skipped diskname="0a.02.19" status="disk failed"
03Oct2024 08:50:09 raid_rg_normal owner="" name="/ask_backup01/plex0/rg2" aggr_UUID="68ab4e22-b25f-430e-add8-5fc247c5daca"
03Oct2024 08:50:09 raid_rg_recons_done disk_info="Disk /ask_backup01/plex0/rg2/0a.03.23 Shelf 3 Bay 23 [NETAPP X343_TA15E1T8A10 NA01] S/N [42XXXXXXXXUR] UID [5XXXXXXB:98XXXX58:00000000:00000000:00000000:00000000:00000000:00000000:00000000:00000000]" owner="" disk="0a.03.23" duration="3:05.08" shelf="3" bay="23" vendor="NETAPP " model="X343_TA15E1T8A10" firmware_revision="NA01" serialno="42XXXXXXXXUR" disk_type="4" disk_rpm="10000" carrier="" site="Local" aggregate_uuid="68ab4e22-b25f-430e-add8-5fc247c5daca"
03Oct2024 08:52:00 raid_rg_media_scrub_start owner="" rg="/ask_backup01/plex0/rg2" aggregate_uuid="68ab4e22-b25f-430e-add8-5fc247c5daca"
03Oct2024 08:52:10 cf_disk_skipped diskname="0a.01.23" status="disk failed"
03Oct2024 08:52:10 cf_disk_skipped diskname="0a.02.19" status="disk failed"
03Oct2024 08:55:10 cf_disk_skipped diskname="0a.01.23" status="disk failed"
03Oct2024 08:55:10 cf_disk_skipped diskname="0a.02.19" status="disk failed"
03Oct2024 08:58:10 cf_disk_skipped diskname="0a.01.23" status="disk failed"
・
・
04Oct2024 00:48:16 cf_disk_skipped diskname="0a.01.23" status="disk failed"
04Oct2024 00:48:16 cf_disk_skipped diskname="0a.02.19" status="disk failed"
04Oct2024 00:51:16 cf_disk_skipped diskname="0a.01.23" status="disk failed"
04Oct2024 00:51:16 cf_disk_skipped diskname="0a.02.19" status="disk failed"
04Oct2024 00:54:16 cf_disk_skipped diskname="0a.01.23" status="disk failed"
04Oct2024 00:54:16 cf_disk_skipped diskname="0a.02.19" status="disk failed"
04Oct2024 00:57:16 cf_disk_skipped diskname="0a.01.23" status="disk failed"
04Oct2024 00:57:16 cf_disk_skipped diskname="0a.02.19" status="disk failed"
- From 2024/10/4 01:00:59 to 01:12:59, raid_readerr_lw_reconsBadBlk_ckinfo and raid_readerr_lw_reconsGoodBlk_ckinfo were logged repeatedly, and raid_rg_readerr_repair_data indicated that repairs were being executed on the disk in Shelf 3 Bay 23.
04Oct2024 01:00:59 raid_readerr_lw_dparity_ckinfo owner="" disk_name="/ask_backup01/plex0/rg2/0a.02.12" dbn="554826" stripe_id="0x12" gen_cnt1="0x15555" gen_cnt2="0x0" comp_cksum="0xb617e667" wafl_cxt0="0xab3e4c6a" wafl_cxt1="0x1eb1403" wafl_cxt2="0x1ff002" wafl_cxt3="0x1f87e0" wafl_cxt4="0x0" wafl_cxt5="0x0" wafl_cxt6="0x0" wafl_cxt7="0x0" disk_serialno="42S0A070F4UR"
04Oct2024 01:00:59 raid_readerr_lw_parity_ckinfo owner="" disk_name="/ask_backup01/plex0/rg2/0a.02.13" dbn="554826" stripe_id="0x12" gen_cnt1="0x15555" gen_cnt2="0x0" comp_cksum="0x1bddcf8f" wafl_cxt0="0xb32c6f" wafl_cxt1="0x1d79" wafl_cxt2="0x403" wafl_cxt3="0x410" wafl_cxt4="0x0" wafl_cxt5="0x0" wafl_cxt6="0x0" wafl_cxt7="0x0" disk_serialno="42S0A06GF4UR"
04Oct2024 01:00:59 raid_readerr_lw_data_ckinfo owner="" disk_name="/ask_backup01/plex0/rg2/0a.02.14" dbn="554826" disk_pos="0" stripe_id="0x1259fede" gen_cnt="0x1" comp_cksum="0x9cd49c3e" sto_vbn="7841090378" wafl_cxt0="0x563825" wafl_cxt1="0x65f4" wafl_cxt2="0x403" wafl_cxt3="0x410" wafl_cxt4="0x0" wafl_cxt5="0x0" wafl_cxt6="0x0" wafl_cxt7="0x0" disk_serialno="7150B0N8V4UR"
04Oct2024 01:00:59 raid_readerr_lw_reconsBadBlk_ckinfo owner="" disk_name="/ask_backup01/plex0/rg2/0a.03.23" dbn="554826" comp_cksum="0xe7fd4673" wafl_cxt0="0x563709" wafl_cxt1="0x65f4" wafl_cxt2="0x403" wafl_cxt3="0x410" wafl_cxt4="0x0" wafl_cxt5="0x0" wafl_cxt6="0x0" wafl_cxt7="0x0" iteration="1" disk_serialno="42XXXXXXXXUR"
04Oct2024 01:00:59 raid_readerr_lw_reconsGoodBlk_ckinfo owner="" disk_name="/ask_backup01/plex0/rg2/0a.02.12" dbn="554826" comp_cksum="0xb617e667" wafl_cxt0="0xab3e4c6a" wafl_cxt1="0x1eb1403" wafl_cxt2="0x1ff002" wafl_cxt3="0x1f87e0" wafl_cxt4="0x0" wafl_cxt5="0x0" wafl_cxt6="0x0" wafl_cxt7="0x0" iteration="1" disk_serialno="42S0A070F4UR"
04Oct2024 01:00:59 raid_readerr_lw_consist_check_pass owner="" rg="/ask_backup01/plex0/rg2" stripe_num="554826" iteration="1"
04Oct2024 01:00:59 ems_engine_suppressed emsId="raid.rg.readerr.repair.data" numDrops="3" numSeconds="258570"
・
・
04Oct2024 01:12:59 raid_readerr_lw_consist_check_pass owner="" rg="/ask_backup01/plex0/rg2" stripe_num="558383" iteration="1"
04Oct2024 01:12:59 raid_rg_readerr_repair_data owner="" disk_info="Disk /ask_backup01/plex0/rg2/0a.03.23 Shelf 3 Bay 23 [NETAPP X343_TA15E1T8A10 NA01] S/N [42XXXXXXXXUR] UID [5XXXXXXB:98XXXX58:00000000:00000000:00000000:00000000:00000000:00000000:00000000:00000000]" blockNum="558383" vbn="10034377455" shelf="3" bay="23" vendor="NETAPP " model="X343_TA15E1T8A10" firmware_revision="NA01" serialno="42XXXXXXXXUR" disk_type="4" disk_rpm="10000" carrier="" site="Local"
04Oct2024 01:12:59 raid_readerr_lw_reconsBadBlk_ckinfo owner="" disk_name="/ask_backup01/plex0/rg2/0a.03.23" dbn="558381" comp_cksum="0xb011fdbc" wafl_cxt0="0xb36228" wafl_cxt1="0x65e5" wafl_cxt2="0x404" wafl_cxt3="0x410" wafl_cxt4="0x0" wafl_cxt5="0x0" wafl_cxt6="0x0" wafl_cxt7="0x0" iteration="1" disk_serialno="42XXXXXXXXUR"
04Oct2024 01:12:59 raid_readerr_lw_reconsGoodBlk_ckinfo owner="" disk_name="/ask_backup01/plex0/rg2/0a.02.12" dbn="558381" comp_cksum="0x784d4f1d" wafl_cxt0="0x8d2e8fa" wafl_cxt1="0x1f7f949" wafl_cxt2="0x1fe7f8" wafl_cxt3="0x1f87e0" wafl_cxt4="0x0" wafl_cxt5="0x0" wafl_cxt6="0x0" wafl_cxt7="0x0" iteration="1" disk_serialno="42S0A070F4UR"
04Oct2024 01:12:59 raid_readerr_lw_consist_check_pass owner="" rg="/ask_backup01/plex0/rg2" stripe_num="558381" iteration="1"
04Oct2024 01:12:59 raid_pi_diag_error msg="Out of messages" type="checksum mismatch"
04Oct2024 01:12:59 raid_readerr_lw_reconsBadBlk_ckinfo owner="" disk_name="/ask_backup01/plex0/rg2/0a.03.23" dbn="558382" comp_cksum="0xabec5fa6" wafl_cxt0="0xb36229" wafl_cxt1="0x65e5" wafl_cxt2="0x404" wafl_cxt3="0x410" wafl_cxt4="0x0" wafl_cxt5="0x0" wafl_cxt6="0x0" wafl_cxt7="0x0" iteration="1" disk_serialno="42XXXXXXXXUR"
04Oct2024 01:12:59 raid_readerr_lw_reconsGoodBlk_ckinfo owner="" disk_name="/ask_backup01/plex0/rg2/0a.02.12" dbn="558382" comp_cksum="0xd0baf7b7" wafl_cxt0="0x8d2e8b8" wafl_cxt1="0x1f7f949" wafl_cxt2="0x1fe7f8" wafl_cxt3="0x1f87e0" wafl_cxt4="0x0" wafl_cxt5="0x0" wafl_cxt6="0x0" wafl_cxt7="0x0" iteration="1" disk_serialno="42S0A070F4UR"
04Oct2024 01:12:59 raid_readerr_lw_consist_check_pass owner="" rg="/ask_backup01/plex0/rg2" stripe_num="558382" iteration="1"
04Oct2024 01:12:59 raid_readerr_lw_reconsBadBlk_ckinfo owner="" disk_name="/ask_backup01/plex0/rg2/0a.03.23" dbn="558380" comp_cksum="0xe8c81d73" wafl_cxt0="0xb36227" wafl_cxt1="0x65e5" wafl_cxt2="0x404" wafl_cxt3="0x410" wafl_cxt4="0x0" wafl_cxt5="0x0" wafl_cxt6="0x0" wafl_cxt7="0x0" iteration="1" disk_serialno="42XXXXXXXXUR"
04Oct2024 01:12:59 raid_readerr_lw_reconsGoodBlk_ckinfo owner="" disk_name="/ask_backup01/plex0/rg2/0a.02.12" dbn="558380" comp_cksum="0x47e5cf7f" wafl_cxt0="0x8d2d3f4" wafl_cxt1="0x1f7f949" wafl_cxt2="0x1fe7f8" wafl_cxt3="0x1f87e0" wafl_cxt4="0x0" wafl_cxt5="0x0" wafl_cxt6="0x0" wafl_cxt7="0x0" iteration="1" disk_serialno="42S0A070F4UR"
04Oct2024 01:12:59 raid_readerr_lw_consist_check_pass owner="" rg="/ask_backup01/plex0/rg2" stripe_num="558380" iteration="1"
- On 2024/10/4 at 01:13:00, Shelf 3 Bay 23 was also determined to be faulty and was disconnected.
04Oct2024 01:12:59 raid_notify_on_failure disk_uid="5XXXXXXB:98XXXX58:00000000:00000000:00000000:00000000:00000000:00000000:00000000:00000000" originating_sysid="538237732" failure_reason="1" failure_string="failed"
04Oct2024 01:12:59 raid_disk_unload_done disk_info="Disk 0a.03.23 Shelf 3 Bay 23 [NETAPP X343_TA15E1T8A10 NA01] S/N [42XXXXXXXXUR] UID [5XXXXXXB:98XXXX58:00000000:00000000:00000000:00000000:00000000:00000000:00000000:00000000]" shelf="3" bay="23" vendor="NETAPP " model="X343_TA15E1T8A10" firmware_revision="NA01" serialno="42XXXXXXXXUR" disk_type="4" disk_rpm="10000" carrier="" site="Local"
04Oct2024 01:13:00 callhome_fdsk_fault subject="FILESYSTEM DISK FAILED Shelf 3, Bay 23, Model [X343_TA15E1T8A10], S/N [42XXXXXXXXUR]"
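To see how the three failures relate in time, the callhome events can be reduced to a short timeline. This is a sketch under the assumption that the `subject` string always has the form shown in the events above. `failures` and `CALLHOME_RE` are hypothetical names, and `%b` parsing assumes English month abbreviations (the default C locale).

```python
import re
from datetime import datetime

# Assumed subject form: "<TYPE> Shelf <n>, Bay <m>, Model [...], S/N [...]"
CALLHOME_RE = re.compile(
    r'^(\S+ \S+) callhome_\w+ subject="(.+?) Shelf (\d+), Bay (\d+),'
)


def failures(lines):
    """Return (time, location, failure type) tuples sorted by time."""
    found = []
    for line in lines:
        m = CALLHOME_RE.match(line)
        if m:
            when = datetime.strptime(m.group(1), "%d%b%Y %H:%M:%S")
            found.append((when, f"Shelf {m.group(3)} Bay {m.group(4)}", m.group(2)))
    return sorted(found)


# The three callhome events quoted in this article.
LOG = [
    '01Oct2024 01:14:58 callhome_dsk_fault subject="DISK FAILED Shelf 2, Bay 19, Model [X343_TA15E1T8A10], S/N [42XXXXXXXXUR]"',
    '03Oct2024 08:47:04 callhome_fdsk_fault subject="FILESYSTEM DISK FAILED Shelf 1, Bay 23, Model [X343_TA15E1T8A10], S/N [71XXXXXXXXUR]"',
    '04Oct2024 01:13:00 callhome_fdsk_fault subject="FILESYSTEM DISK FAILED Shelf 3, Bay 23, Model [X343_TA15E1T8A10], S/N [42XXXXXXXXUR]"',
]

timeline = failures(LOG)
for (when, where, kind), (prev, _, _) in zip(timeline, [timeline[0]] + timeline):
    print(when, where, kind, "+", when - prev)

span = timeline[-1][0] - timeline[0][0]
print("first to last failure:", span)
```

For the events in this article, the three disks were failed within a span of just under three days.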
