
What are the important changes to RAID scrub in Data ONTAP 8.3.x or later?

Applies to

  • Clustered Data ONTAP 8.3
  • ONTAP 9

Answer

  • Generally higher CPU and disk utilization may be observed, especially during nighttime hours.
  • A possible cause is the change to the RAID scrub schedule introduced in Data ONTAP 8.3.
    • The default RAID scrub schedule changed in Data ONTAP 8.3: scrubs now run every day.
    • For more information, see Storage raid-options Commands.
      • If no specific value is defined, the default schedule applies.

raid.scrub.schedule 

  • This option specifies the weekly schedule (day, time, and duration) for scrubs started automatically.

    • On a non-AFF system, the default schedule is daily at 1 a.m. for a duration of 4 hours, except on Sunday when it is 12 hours.
    • On an AFF system, the default schedule is weekly at 1 a.m. on Sunday for a duration of 6 hours.
      • By default, scrub runs for 4 hours every day, so the overall scrub runtime is higher and scans complete more frequently than in releases prior to Data ONTAP 8.3.
      • It is expected behavior that the system shows higher CPU and disk activity during this time.
      • If this is an issue during the week, the schedule can be defined to run at specific times and for specific durations, as illustrated by the sketch and the example below.
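
A schedule value combines a duration, a weekday, and a start hour, as in the 240m@tue@2 value used in the example below. The following Python sketch is illustrative only (it is not NetApp code); the exact option syntax, including whether multiple entries can be comma-separated, should be confirmed against the Storage raid-options man page:

# Illustrative sketch: build a raid.scrub.schedule value of the form
# duration@weekday@start_hour, matching the "240m@tue@2" example shown below.
VALID_DAYS = {"mon", "tue", "wed", "thu", "fri", "sat", "sun"}

def scrub_schedule_entry(duration_minutes: int, weekday: str, start_hour: int) -> str:
    """Build one schedule entry, e.g. (240, "tue", 2) -> "240m@tue@2"."""
    if weekday not in VALID_DAYS:
        raise ValueError(f"unknown weekday: {weekday}")
    if not 0 <= start_hour <= 23:
        raise ValueError("start_hour must be between 0 and 23")
    return f"{duration_minutes}m@{weekday}@{start_hour}"

# Assumption: multiple entries are combined with commas to cover several days.
weeknight_schedule = ",".join(scrub_schedule_entry(240, day, 2) for day in ("tue", "thu", "sat"))
print(weeknight_schedule)  # 240m@tue@2,240m@thu@2,240m@sat@2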

 

Example:
  • Use the storage raid-options show command to check the current settings:
cluster::> storage raid-options show -name raid.scrub.schedule
Node     Option                                Value        Constraint
-------- ------------------------------------- ------------ -----------
cluster-01 raid.scrub.schedule                           none
cluster-02 raid.scrub.schedule                           none
2 entries were displayed.
  • Use the storage raid-options modify command to change the schedule as required. In this example, 240m@tue@2 schedules a 240-minute (4-hour) scrub starting at 2 a.m. on Tuesday:
cluster::> storage raid-options modify -node cluster-01 -name raid.scrub.schedule 240m@tue@2
Specified scrub schedule added
  • Use the storage raid-options show command again to verify the change to the schedule (a scripted version of this check is sketched after the example):
cluster::> storage raid-options show -name raid.scrub.schedule
Node     Option                                Value        Constraint
-------- ------------------------------------- ------------ -----------
cluster-01 raid.scrub.schedule              240m@tue@2   none
cluster-02 raid.scrub.schedule                           none
2 entries were displayed.
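
If this check needs to be repeated regularly or across clusters, it can be scripted. The following Python sketch runs the same show command over SSH and extracts the per-node value; the management hostname and the parsing of the tabular output are assumptions for illustration, not part of the KB procedure:

import subprocess

def current_scrub_schedules(cluster: str = "cluster-mgmt") -> dict:
    """Return {node: value} from 'storage raid-options show -name raid.scrub.schedule'."""
    out = subprocess.run(
        ["ssh", cluster, "storage raid-options show -name raid.scrub.schedule"],
        capture_output=True, text=True, check=True,
    ).stdout
    schedules = {}
    for line in out.splitlines():
        if "raid.scrub.schedule" in line:
            fields = line.split()
            # 4 fields -> a value is set; 3 fields -> the Value column is empty (default schedule).
            schedules[fields[0]] = fields[2] if len(fields) > 3 else ""
    return schedules

print(current_scrub_schedules())  # e.g. {'cluster-01': '240m@tue@2', 'cluster-02': ''}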

 

Verification through the event log:
  • You can also verify and search for the related messages using the event log show command (a small parsing sketch follows the examples below):
    • In this example, the scrub resumes at 1 a.m., as per the default schedule.
Cluster-01::> event log show -messagename raid.rg.scrub.resume
[?] Tue May 24 01:00:12 CEST [cluster: config_thread: raid.rg.scrub.resume:notice]: /aggr_ssc_dc1_ds11_b_sata_root/plex0
/rg0: resuming scrub at stripe 578657472 (89% complete)
  • To check when the scrub pauses, search for a suspend message:
    • In this example, the scrub suspends at 5 a.m. after 4 hours of runtime, as per the default schedule.
Cluster-01::> event log show -messagename raid.rg.scrub.suspend
[?] Tue May 24 05:00:01 CEST [cluster: config_thread: raid.scrub.suspended:notice]: Disk scrub suspended. 
  • To check the scrub summary, run:
Cluster-01::> event log show -messagename raid.rg.scrub.summary 
[?] Tue May 24 05:00:01 CEST [cluster: config_thread: raid.rg.scrub.summary.lw:notice]: Scrub found 0 RAID write
signature inconsistencies in /aggr_ssc_dc1_ds11_b_sata_data_01/plex0/rg0.
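
The resume and suspend events can also be correlated to confirm how long the scrub actually ran. A minimal Python sketch, assuming log lines in the format shown above (the year is not part of the timestamp and is supplied separately):

import re
from datetime import datetime

TS_RE = re.compile(r"\[\?\]\s+(\w{3} \w{3} +\d{1,2} \d{2}:\d{2}:\d{2})")

def scrub_runtime_hours(resume_line: str, suspend_line: str, year: int = 2016) -> float:
    """Hours between a raid.rg.scrub.resume event and the matching suspend event."""
    def parse(line: str) -> datetime:
        ts = TS_RE.search(line).group(1)
        return datetime.strptime(f"{ts} {year}", "%a %b %d %H:%M:%S %Y")
    return (parse(suspend_line) - parse(resume_line)).total_seconds() / 3600

resume = "[?] Tue May 24 01:00:12 CEST [cluster: config_thread: raid.rg.scrub.resume:notice]: ..."
suspend = "[?] Tue May 24 05:00:01 CEST [cluster: config_thread: raid.scrub.suspended:notice]: Disk scrub suspended."
print(f"scrub ran for about {scrub_runtime_hours(resume, suspend):.1f} hours")  # ~4.0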

Additional Information

statit will show greads during this time, and the disks may show 100% busy, but this is normal because RAID scrubs run as background activity at the disk level (see the sketch after the output below):

Cluster::> set advanced
Cluster::*> node run -node node_1 -command statit -b
Cluster::*> node run -node node_1 -command statit -e
...
                       Disk Statistics (per second)
        ut% is the percent of time the disk was busy.
        xfers is the number of data-transfer commands issued per second.
        xfers = ureads + writes + cpreads + greads + gwrites
        chain is the average number of 4K blocks per command.
        usecs is the average disk round-trip time per 4K block.

disk             ut%  xfers  ureads--chain-usecs writes--chain-usecs cpreads-chain-usecs greads--chain-usecs gwrites-chain-usecs
/data_aggr1/plex0/rg0:    
0a.00.4           79 280.18    0.00   ....     .   0.00   ....     .   0.00   ....     . 280.18  64.00   60   0.00   ....     .
0a.00.18          84 280.18    0.00   ....     .   0.00   ....     .   0.00   ....     . 280.18  64.00   67   0.00   ....     .
0a.00.10          82 280.18    0.00   ....     .   0.00   ....     .   0.00   ....     . 280.18  64.00   64   0.00   ....     .
0a.00.19          87 280.18    0.00   ....     .   0.00   ....     .   0.00   ....     . 280.18  64.00   71   0.00   ....     .
0a.00.12          86 280.18    0.00   ....     .   0.00   ....     .   0.00   ....     . 280.18  64.00   69   0.00   ....     .
0a.00.17          90 280.18    0.00   ....     .   0.00   ....     .   0.00   ....     . 280.18  64.00   75   0.00   ....     .
0a.00.16          91 280.18    0.00   ....     .   0.00   ....     .   0.00   ....     . 280.18  64.00   77   0.00   ....     .
0a.00.2           91 280.18    0.00   ....     .   0.00   ....     .   0.00   ....     . 280.18  64.00   76   0.00   ....     .
0a.00.3           92 280.18    0.00   ....     .   0.00   ....     .   0.00   ....     . 280.18  64.00   78   0.00   ....     .
0a.00.5           94 280.18    0.00   ....     .   0.00   ....     .   0.00   ....     . 280.18  64.00   82   0.00   ....     .
0a.00.6           95 280.18    0.00   ....     .   0.00   ....     .   0.00   ....     . 280.18  64.00   85   0.00   ....     .
0a.00.7           95 280.18    0.00   ....     .   0.00   ....     .   0.00   ....     . 280.18  64.00   87   0.00   ....     .
0a.00.13          96 280.18    0.00   ....     .   0.00   ....     .   0.00   ....     . 280.18  64.00   89   0.00   ....     .
0a.00.15          96 280.18    0.00   ....     .   0.00   ....     .   0.00   ....     . 280.18  64.00   92   0.00   ....     .
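
To confirm that the high disk utilization comes from the scrub rather than client I/O, the greads share of total transfers can be checked against the xfers formula above. A minimal Python sketch using the counters from disk 0a.00.4 in the output:

def greads_share(ureads: float, writes: float, cpreads: float, greads: float, gwrites: float) -> float:
    """Fraction of xfers that are greads (xfers = ureads + writes + cpreads + greads + gwrites)."""
    xfers = ureads + writes + cpreads + greads + gwrites
    return greads / xfers if xfers else 0.0

# Counters taken from disk 0a.00.4 above: all transfers are greads,
# so the ~80-95% busy time is explained by the background scrub.
share = greads_share(ureads=0.0, writes=0.0, cpreads=0.0, greads=280.18, gwrites=0.0)
print(f"{share:.0%} of transfers are greads")  # 100%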