CONTAP-82695: System disruption occurs due to medium errors or recovered errors
Issue
- The storage appliance experiences an unexpected reboot with: WAFL hung for <aggregate name>. in SK process wafl_exempt<nn> on release 9.12.1
- Immediately prior to the reboot, the storage appliance reports media errors for one or more drives associated with the aggregate.
- EMS reports:
Fri May 26 01:57:51 +0200 [NODE01: raidio_thread: raid_rg_readerr_repair_data_1:notice]: params: {'site': 'Local', 'disk_rpm': '10000', 'vendor': 'NETAPP ', 'firmware_revision': 'NA03', 'shelf': '51', 'disk_info': 'Disk /aggr_mmo_18sas_03/plex0/rg1/FC_switch_A_1:6.126L1040 Shelf 51 Bay 13 [NETAPP X343_SSKBE1T8A10 NA03] S/N [XXXXXXXX] UID [5000C500:D1818EB7:00000000:00000000:00000000:00000000:00000000:00000000:00000000:00000000]', 'vbn': '9447206211', 'bay': '13', 'carrier': '', 'serialno': 'XXXXXXXX', 'owner': '', 'model': 'X343_SSKBE1T8A10', 'disk_type': '4', 'blockNum': '235420675'}
Fri May 26 01:57:52 +0200 [NODE01: disk_server_0: disk.ioReassignSuccess:notice]: disk FC_switch_A_1:6.126L1040: sector 1883365400 was reassigned (203). Disk FC_switch_A_1:6.126L1040 Shelf 51 Bay 13 [NETAPP X343_SSKBE1T8A10 NA03] S/N [XXXXXXXX] UID [5000C500:D1818EB7:00000000:00000000:00000000:00000000:00000000:00000000:00000000:00000000]
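When reviewing a saved EMS log for this signature, it can help to filter for the two event names shown above. The following is a minimal sketch, not a NetApp tool: the line layout (`<timestamp> [<node>: <process>: <event>:<severity>]: <message>`) is inferred only from the two sample lines in this article, and the function names `parse_ems_line` and `is_media_error_event` are hypothetical helpers introduced for illustration.

```python
import re

# Assumed layout, inferred from the two sample EMS lines in this article:
#   <timestamp> [<node>: <process>: <event>:<severity>]: <message>
EMS_LINE = re.compile(
    r"^(?P<timestamp>.+?) \[(?P<node>[^:]+): (?P<process>[^:]+): "
    r"(?P<event>[^:]+):(?P<severity>\w+)\]: (?P<message>.*)$"
)

# Event names taken verbatim from the sample lines above.
MEDIA_ERROR_EVENTS = {"raid_rg_readerr_repair_data_1", "disk.ioReassignSuccess"}

# Matches the "sector <n> was reassigned" wording of disk.ioReassignSuccess.
REASSIGNED_SECTOR = re.compile(r"sector (\d+) was reassigned")


def parse_ems_line(line):
    """Split one EMS line into its fields; return None if it does not match."""
    match = EMS_LINE.match(line.strip())
    return match.groupdict() if match else None


def is_media_error_event(record):
    """True if the parsed record carries one of the two events shown above."""
    return record is not None and record["event"] in MEDIA_ERROR_EVENTS


def reassigned_sector(record):
    """Return the reassigned sector number as an int, or None if absent."""
    found = REASSIGNED_SECTOR.search(record["message"])
    return int(found.group(1)) if found else None


if __name__ == "__main__":
    sample = (
        "Fri May 26 01:57:52 +0200 [NODE01: disk_server_0: "
        "disk.ioReassignSuccess:notice]: disk FC_switch_A_1:6.126L1040: "
        "sector 1883365400 was reassigned (203)."
    )
    record = parse_ems_line(sample)
    if is_media_error_event(record):
        print(record["timestamp"], record["node"], record["event"],
              reassigned_sector(record))
```

Matching events that appear shortly before the reboot timestamp are consistent with the symptom described here; the sketch only filters log text and does not confirm the root cause.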