NetApp Knowledge Base

WAFL inconsistency due to a lack of spares and an aggregate left degraded for a prolonged time

Category:
aff-series
Specialty:
hw

Applies to

  • AFF/FAS Systems
  • ONTAP 9

Issue

In the EMS logs:

  • The system has used all suitable spare disks and reports that spares are low

Sat Mar 12 05:17:26 +0200 [node_2: config_thread: raid.rg.spares.low:error]: /aggr2_2/plex0/rg0
Sat Mar 12 05:17:26 +0200 [node_2: config_thread: callhome.spares.low:error]: Call home for SPARES_LOW

  • After the next disk failure, the RAID group is in a degraded state

Mon Apr 04 02:00:00 +0200 [node_2: statd: monitor.raiddp.vol.singleDegraded:error]: data disk in RAID group "/aggr2_2/plex0/rg0" is broken.

  • Disk failures continue

Thu May 05 21:03:07 +0200 [node_2: config_thread: raid.rg.recons.cantStart:error]: The reconstruction cannot start in RAID group /aggr2_2/plex0/rg0: No matching disks available in spare pool, targeting any spare pool

Wed May 04 03:00:00 +0200 [node_2: statd: monitor.brokenDisk.notice:notice]: When two disks are broken in raid_dp volume, the system shuts down automatically every 24 hours to encourage you to replace the disk. If you reboot the system, it will run for another 24 hours before shutting down.

Wed May 04 03:00:00 +0200 [node_2: statd: monitor.shutdown.brokenDisk.pending:notice]: two data disks in RAID group "/aggr2_2/plex0/rg0" are broken. Halting system in 24 hours.

  • Spare disks are provided and reconstruction starts
  • If the RAID group contains other marginal disks, reconstruction cannot rebuild every block and starts marking the unrecoverable blocks as missing

Fri May 06 10:05:51 +0200 [node_2: raidio_thread: raid_multierr_bad_block_1:error]: params: {'disk_rpm': '10000', 'vendor': 'NETAPP  ', 'firmware_revision': 'NA02', 'shelf': '2', 'disk_info': 'Disk /aggr2_2/plex0/rg0/0a.02.23P1 Shelf 2 Bay 23 [NETAPP   X343_SSKBE1T8A10 NA02] S/N [WBN1AJT5NP001] UID [6000C500:BCA9B53B:500A0981:00000001:00000000:00000000:00000000:00000000:00000000:00000000]', 'volumeBno': '1348939177', 'site': 'Local', 'bay': '23', 'carrier': '', 'serialno': 'WBN1AJT5NP001', 'owner': '', 'model': 'X343_SSKBE1T8A10', 'disk_type': '4', 'blockNum': '81428969'}
Fri May 06 10:05:51 +0200 [node_2: raidio_thread: raid_multierr_bad_missingBlk_1:debug]: params: {'owner': '', 'rg': '/aggr2_2/plex0/rg0', 'blockNum': '81428969', 'vbn': '7381173545'}

  • When a client accesses damaged data, an inconsistency alert is triggered

Sun May 15 18:14:30 +0200 [node_2: wafl_exempt01: wafl.raid.incons.userdata:error]: WAFL inconsistent: inconsistent user data block at VBN 3581364492 (vvbn:567776529 fbn:664341713 level:0) in public inode (fileid:96 snapid:0 file_type:15 disk_flags:0x8402 error:120 raid_set:1) in volume node_02_vol@vserver:6456a9ee-6e12-11e8-99f3-01b099c9ade9.
Sun May 15 18:14:30 +0200 [node_2: wafl_exempt01: wafl.incons.userdata.vol:alert]: WAFL inconsistent: volume vol_02_vol@vserver:6456a9ee-6e12-11e8-99f3-01b099c9ade9 has an inconsistent user data block. Note: Any new Snapshot copies might contain this inconsistency.
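The progression above can be traced in a saved EMS log with a small script. The following is a minimal sketch, not a NetApp tool: it assumes every entry has the layout shown in the excerpts (timestamp, then `[node: process: event:severity]:`, then the message), and the names `parse_ems_line`, `parse_params`, `stages_seen`, and the stage labels are hypothetical helpers introduced only for illustration. It also assumes the `params:` payload stays a Python-style dictionary literal, as in the `raid_multierr_bad_block_1` excerpt.

```python
import ast
import re

# Layout assumed from the excerpts above:
#   <timestamp> [<node>: <process>: <event>:<severity>]: <message>
EMS_LINE = re.compile(
    r"^(?P<timestamp>\w{3} \w{3} \d{2} \d{2}:\d{2}:\d{2} [+-]\d{4}) "
    r"\[(?P<node>[^:]+): (?P<process>[^:]+): "
    r"(?P<event>[\w.]+):(?P<severity>\w+)\]: "
    r"(?P<message>.*)$"
)

# Event names copied from the excerpts; the labels are illustrative only.
STAGES = {
    "raid.rg.spares.low": "spares low",
    "monitor.raiddp.vol.singleDegraded": "RAID group degraded",
    "raid.rg.recons.cantStart": "reconstruction cannot start",
    "monitor.shutdown.brokenDisk.pending": "two disks broken",
    "raid_multierr_bad_block_1": "bad block during reconstruction",
    "wafl.raid.incons.userdata": "WAFL inconsistency",
}


def parse_ems_line(line):
    """Split one EMS line into its fields, or return None if it does not match."""
    match = EMS_LINE.match(line.strip())
    return match.groupdict() if match else None


def parse_params(message):
    """Decode a "params: {...}" message into a dict, or return None."""
    prefix = "params: "
    if not message.startswith(prefix):
        return None
    try:
        value = ast.literal_eval(message[len(prefix):])
    except (ValueError, SyntaxError):
        return None
    return value if isinstance(value, dict) else None


def stages_seen(lines):
    """Return the stage labels in order of first appearance in the log."""
    seen = []
    for line in lines:
        record = parse_ems_line(line)
        if record is None:
            continue
        stage = STAGES.get(record["event"])
        if stage is not None and stage not in seen:
            seen.append(stage)
    return seen
```

Passing the lines of a saved log to `stages_seen` lists which of these stages are present, and `parse_params` gives access to fields such as `shelf`, `bay`, and `blockNum` from the bad-block entries so they can be grouped by disk.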

