
What are the important changes to RAID scrub in Data ONTAP 8.3.x or later?

Category:
clustered-data-ontap-8
Specialty:
core

Applies to

  • Clustered Data ONTAP 8.3
  • ONTAP 9

Answer

  • Customers may observe generally higher system utilization after upgrading to Data ONTAP 8.3 or later, especially during night hours.
  • A possible reason is the change to the RAID scrub schedule introduced in Data ONTAP 8.3.
  • For more information, see the Clustered Data ONTAP® 8.3 Release Notes.
  • If no specific value is defined, the default schedule applies.

raid.scrub.schedule 

  • This option specifies the weekly schedule (day, time, and duration) for scrubs started automatically.

    • On a non-AFF system, the default schedule is daily at 1 a.m. for a duration of 4 hours, except on Sunday, when it is 12 hours.
    • On an AFF system, the default schedule is weekly at 1 a.m. on Sunday for a duration of 6 hours.
    • If an empty string ("") is specified as an argument, it deletes the previous scrub schedule and restores the default schedule.
    • One or more schedules can be specified using this option.
    • The syntax is duration[h|m]@weekday@start_time,[duration[h|m]@weekday@start_time,...], where duration is the time period for which the scrub operation is allowed to run, in hours or minutes ('h' or 'm' respectively).
      • If the duration is not specified, the raid.scrub.duration option value will be used as the duration for the schedule. 
      • Weekday is the day on which the scrub is scheduled to start.
      • The valid values are sun, mon, tue, wed, thu, fri, sat. 
      • start_time is the time when a scrub is scheduled to start. It is specified in a 24-hour format.
      • Only the hour (0-23) needs to be specified.
      • For example, options raid.scrub.schedule 240m@tue@2,8h@sat@22 causes the scrub to start every Tuesday at 2 a.m. for 240 minutes (4 hours) and every Saturday at 10 p.m. for 480 minutes (8 hours); a clustered ONTAP equivalent is sketched below.
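      • A minimal sketch of that multi-schedule example in the clustered ONTAP CLI, assuming node cluster-01 (command response omitted):

cluster::> storage raid-options modify -node cluster-01 -name raid.scrub.schedule 240m@tue@2,8h@sat@22
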
  • Example:
    • Use the storage raid-options show command to check the current settings:

cluster::> storage raid-options show -name raid.scrub.schedule
Node     Option                                Value        Constraint
-------- ------------------------------------- ------------ -----------
cluster-01 raid.scrub.schedule                           none
cluster-02 raid.scrub.schedule                           none
2 entries were displayed.

  • Use the storage raid-options modify command to change the schedule as required:

cluster::> storage raid-options modify -node cluster-01 -name raid.scrub.schedule 240m@tue@2
Specified scrub schedule added

  • With the storage raid-options show command, you can verify the change to the schedule:

cluster::> storage raid-options show -name raid.scrub.schedule
Node     Option                                Value        Constraint
-------- ------------------------------------- ------------ -----------
cluster-01 raid.scrub.schedule              240m@tue@2   none
cluster-02 raid.scrub.schedule                           none
2 entries were displayed.

  • To change back to the default schedule, use the modify command again:
    • If you replace an existing schedule with an empty string "", the default scrub schedule will be added automatically.

cluster::> storage raid-options modify -node cluster-01 -name raid.scrub.schedule "" 
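
  • To confirm the reset, run the show command again; illustrative output is shown below, assuming no custom schedule remains on either node (depending on the release, the Value column may be empty, as in the first output above, or show the default schedule):

cluster::> storage raid-options show -name raid.scrub.schedule
Node     Option                                Value        Constraint
-------- ------------------------------------- ------------ -----------
cluster-01 raid.scrub.schedule                           none
cluster-02 raid.scrub.schedule                           none
2 entries were displayed.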

  • Verification through the event log:
    • You can also search for the related messages using the event log show command:
      • In this example, the scrub resumes at 1 a.m., as per the default schedule.

Cluster-01::> event log show -messagename raid.rg.scrub.resume
[?] Tue May 24 01:00:12 CEST [cluster: config_thread: raid.rg.scrub.resume:notice]: /aggr_ssc_dc1_ds11_b_sata_root/plex0/rg0: resuming scrub at stripe 578657472 (89%% complete)

  • To check for the pausing of a scrub, search for a suspend message:
    • In this example, the scrub suspends at 5 a.m., after 4 hours of runtime, as per the default schedule.

Cluster-01::> event log show -messagename raid.rg.scrub.suspend
[?] Tue May 24 05:00:01 CEST [cluster: config_thread: raid.scrub.suspended:notice]: Disk scrub suspended. 

  • To check the scrub summary, run:

Cluster-01::> event log show -messagename raid.rg.scrub.summary 
[?] Tue May 24 05:00:01 CEST [cluster: config_thread: raid.rg.scrub.summary.lw:notice]: Scrub found 0 RAID write signature inconsistencies in /aggr_ssc_dc1_ds11_b_sata_data_01/plex0/rg0.
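
  • Scrub progress can also be checked directly from the nodeshell; a possible check, assuming the 7-Mode-style aggr scrub status command is available on your release:

cluster::> node run -node cluster-01 -command aggr scrub status -v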

  • By default, the scrub runs for 4 hours every day, so the overall scrub runtime is higher and scans complete more frequently than in previous versions of Data ONTAP.
  • Higher disk activity during this time is expected behavior compared to previous releases.
  • If this is an issue during the week, the schedule can be defined to run at specific times and for specific durations, as in the sketch below.
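
  • A possible weekend-only schedule, as a sketch (the values are illustrative; adjust the durations and start times to your own maintenance window):

cluster::> storage raid-options modify -node cluster-01 -name raid.scrub.schedule 6h@sat@1,6h@sun@1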

Additional Information

statit will show greads during this time, and the disks may show 100% busy; this is normal because RAID scrubs run as background activity at the disk level:

Cluster::> set advanced
Cluster::*> node run -node node_1 -command statit -b
Cluster::*> node run -node node_1 -command statit -e
...
                       Disk Statistics (per second)
        ut% is the percent of time the disk was busy.
        xfers is the number of data-transfer commands issued per second.
        xfers = ureads + writes + cpreads + greads + gwrites
        chain is the average number of 4K blocks per command.
        usecs is the average disk round-trip time per 4K block.

disk             ut%  xfers  ureads--chain-usecs writes--chain-usecs cpreads-chain-usecs greads--chain-usecs gwrites-chain-usecs
/data_aggr1/plex0/rg0:    
0a.00.4           79 280.18    0.00   ....     .   0.00   ....     .   0.00   ....     . 280.18  64.00   60   0.00   ....     .
0a.00.18          84 280.18    0.00   ....     .   0.00   ....     .   0.00   ....     . 280.18  64.00   67   0.00   ....     .
0a.00.10          82 280.18    0.00   ....     .   0.00   ....     .   0.00   ....     . 280.18  64.00   64   0.00   ....     .
0a.00.19          87 280.18    0.00   ....     .   0.00   ....     .   0.00   ....     . 280.18  64.00   71   0.00   ....     .
0a.00.12          86 280.18    0.00   ....     .   0.00   ....     .   0.00   ....     . 280.18  64.00   69   0.00   ....     .
0a.00.17          90 280.18    0.00   ....     .   0.00   ....     .   0.00   ....     . 280.18  64.00   75   0.00   ....     .
0a.00.16          91 280.18    0.00   ....     .   0.00   ....     .   0.00   ....     . 280.18  64.00   77   0.00   ....     .
0a.00.2           91 280.18    0.00   ....     .   0.00   ....     .   0.00   ....     . 280.18  64.00   76   0.00   ....     .
0a.00.3           92 280.18    0.00   ....     .   0.00   ....     .   0.00   ....     . 280.18  64.00   78   0.00   ....     .
0a.00.5           94 280.18    0.00   ....     .   0.00   ....     .   0.00   ....     . 280.18  64.00   82   0.00   ....     .
0a.00.6           95 280.18    0.00   ....     .   0.00   ....     .   0.00   ....     . 280.18  64.00   85   0.00   ....     .
0a.00.7           95 280.18    0.00   ....     .   0.00   ....     .   0.00   ....     . 280.18  64.00   87   0.00   ....     .
0a.00.13          96 280.18    0.00   ....     .   0.00   ....     .   0.00   ....     . 280.18  64.00   89   0.00   ....     .
0a.00.15          96 280.18    0.00   ....     .   0.00   ....     .   0.00   ....     . 280.18  64.00   92   0.00   ....     .
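
As a rough, illustrative sanity check on the figures above: 280.18 greads per second with a chain of 64 blocks of 4K each corresponds to roughly 280.18 x 64 x 4 KB ≈ 70 MB/s of scrub read throughput per disk, which is why the ut% values are high even though no user reads (ureads) are occurring.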