ONTAP Select keeps crashing with WAFL inconsistency and "Unrecoverable metadata block"
Applies to
- ONTAP 9
- ONTAP Select
Issue
- ONTAP Select VM repeatedly crashes due to WAFL inconsistencies with a "Panic String: Unrecoverable metadata block"
[?] Tue May 13 14:00:59 +1200 [TAW10ONTSLT-01: wafl_exempt02: wafl.raid.incons.buf:error]: WAFL inconsistent: bad block at VBN 2895176134 (vvbn:0 fbn:6 level:0) in private inode (fileid:4294967295 snapid:0 fixable:1 file_type:1 disk_flags:0x202 error:117 raid_set:1) in volume TAW10ONTSLT_01_DATA. <===========================
[?] Tue May 13 14:00:59 +1200 [TAW10ONTSLT-01: wafl_exempt02: callhome.wafl.inconsistent.block:alert]: Call home for WAFL INCONSISTENT BLOCK <===========================
[?] Tue May 13 14:00:59 +1200 [TAW10ONTSLT-01: wafl_exempt02: cf.fm.localFwTransition:debug]: params: {'prevstate': 'SF_OS_BOOTED', 'newstate': 'SF_DUMPCORE', 'progresscounter': '1'}
[?] Tue May 13 14:00:59 +1200 [TAW10ONTSLT-01: wafl_exempt02: sk.panic:alert]: Panic String: Unrecoverable metadata block (file -1, block 2895176134, fbn 6, level 0, file type 1) in aggregate TAW10ONTSLT_01_DATA. WAFL inconsistent. Contact NetApp technical support. in SK process wafl_exempt02 on release 9.7P23 (C)
- The data aggregate ends up in the "restricted" state
TAW10ONTSLT::> aggr status
Aggregate                Size Available Used% State       #Vols Nodes            RAID Status
-------------------- -------- --------- ----- ---------- ------ ---------------- ------------
TAW10ONTSLT_01_DATA        0B        0B    0% restricted      0 TAW10ONTSLT-01   raid0,
                                                                                 normal
aggr0_TAW10ONTSLT_01  60.22GB    2.92GB   95% online          1 TAW10ONTSLT-01   raid0,
                                                                                 normal
2 entries were displayed.
- Additional log entries show "raid_multierr_bad_block_1:error"
Tue May 13 14:00:59 +1200 [TAW10ONTSLT-01: raidio_thread: raid_multierr_bad_block_1:error]: params: {'owner': '', 'disk_info': 'Disk /TAW10ONTSLT_01_DATA/plex0/rg0/0b.4 S/N [0-2sgNFOu7nZciJvfhw4] UID [0-2sgNFOu7nZciJvfhw4]', 'blockNum': '1196312780', 'volumeBno': '2895175659', 'shelf': '-', 'bay': '-', 'vendor': 'NETAPP ', 'model': 'PHA-DISK ', 'firmware_revision': '0001', 'serialno': '0-2sgNFOu7nZciJvfhw4', 'disk_type': '11', 'disk_rpm': 'N/A', 'carrier': '', 'site': 'Local'}
Tue May 13 14:00:59 +1200 [TAW10ONTSLT-01: raidio_thread: raid_cksum_wc_blkErr_1:notice]: params: {'vol_type': 'aggregate', 'owner': '', 'vol': 'TAW10ONTSLT_01_DATA', 'disk_info': 'Disk /TAW10ONTSLT_01_DATA/plex0/rg0/0b.4 S/N [0-2sgNFOu7nZciJvfhw4] UID [0-2sgNFOu7nZciJvfhw4]', 'blockNum': '1196313263', 'buftreeid': '1613469503', 'ino_type': 'private', 'fileid': '-1', 'snapid': '0', 'bno': '6', 'level': '0', 'stored_buftreeid': '8477', 'stored_fbn': '134217728', 'remaining_stored_context': 'CP count : 6780136, PVBN 2895176134, encrypted flag : 0, key index : 0', 'shelf': '-', 'bay': '-', 'vendor': 'NETAPP ', 'model': 'PHA-DISK ', 'firmware_revision': '0001', 'serialno': '0-2sgNFOu7nZciJvfhw4', 'disk_type': '11', 'disk_rpm': 'N/A', 'carrier': '', 'site': 'Local'}
- Volumes are inaccessible due to the restricted state of the aggregate and the WAFL inconsistency.
- Attempting a SnapMirror restore can also cause the VM to crash.
- Multiple warning alerts are observed on the ESXi host regarding datastore disconnection.
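
To check a saved EMS log for the same signatures, the event names in the excerpts above can be matched with a small script. The following is a minimal sketch, not a NetApp tool: the function name `find_signatures` and the regular expression are illustrative assumptions, and the pattern is based only on the "[node: process: event:severity]:" layout visible in the log lines shown in this article.

```python
import re

# Event names copied from the log excerpts in this article.
SIGNATURES = (
    "wafl.raid.incons.buf",
    "callhome.wafl.inconsistent.block",
    "sk.panic",
    "raid_multierr_bad_block_1",
    "raid_cksum_wc_blkErr_1",
)

# Assumed header layout: "[node: process: event:severity]:"
HEADER = re.compile(
    r"\[(?P<node>[^:\]]+): (?P<process>[^:\]]+): "
    r"(?P<event>[^\]]+):(?P<severity>\w+)\]:"
)


def find_signatures(lines):
    """Return (node, event, severity) for every matching EMS header.

    finditer is used so that several entries fused onto one line
    are all reported.
    """
    hits = []
    for line in lines:
        for match in HEADER.finditer(line):
            if match.group("event") in SIGNATURES:
                hits.append(
                    (match.group("node"),
                     match.group("event"),
                     match.group("severity"))
                )
    return hits
```

Passing the lines of a log file (for example, `find_signatures(open(path))`, where `path` is a hypothetical local copy of the EMS log) returns one tuple per matching entry. An empty list only means that none of these five event names were found in that file.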
