How does the deduplication changelog work?

Applies to

Clustered Data ONTAP
Dedupe & Compression

Answer

The changelog records modifications to data blocks. Once deduplication is started, Data ONTAP refers to the changelog to deduplicate the data, and clears the changelog when the deduplication process is complete.
The changelog can hold records of block modifications for up to 8 TB of written data.
This maximum size cannot be changed.
A changelog size of 64 GB means that change logging continues to record new writes until 8 TB worth of data has been written to a volume.
Example:
8 TB of user data = 2147483648 data blocks in a volume (4 KB per block)
Fingerprint size = 32 bytes
Thus, the total changelog size supported = 2147483648 * 32 bytes = 64 GB
- As calculated above, the number of blocks, which depends on the volume size, determines the changelog size, and the maximum is reached with an 8 TB volume.
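
The arithmetic above can be reproduced with a short script. This is a minimal sketch, assuming one 32-byte fingerprint per 4 KB block as in the example; the function and constant names are illustrative only and are not part of Data ONTAP.

```python
# Sketch of the changelog sizing arithmetic from the example above.
# Assumption: one 32-byte fingerprint is logged per 4 KB data block.
BLOCK_SIZE_BYTES = 4 * 1024      # 4 KB per data block
FINGERPRINT_SIZE_BYTES = 32      # size of one fingerprint record

GIB = 1024 ** 3
TIB = 1024 ** 4


def changelog_bytes(data_written_bytes: int) -> int:
    """Return the changelog space needed to log the given amount of written data."""
    blocks = data_written_bytes // BLOCK_SIZE_BYTES
    return blocks * FINGERPRINT_SIZE_BYTES


max_blocks = 8 * TIB // BLOCK_SIZE_BYTES
max_changelog_gib = changelog_bytes(8 * TIB) / GIB
print(max_blocks, max_changelog_gib)  # 2147483648 64.0
```

With the same assumptions, 1 TB of written data would need about 8 GB of changelog space.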

When the changelog is about to exceed this limit, Data ONTAP stops writing modifications to the changelog and writes the following message to the syslog:

Thu Mar 25 04:54:39 GMT [nss-u55/nss-u56: sis.changelog.full:warning]:Change logging metafile on volume [host][vol] is full and can not hold any more fingerprint entries.
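
If syslog messages are collected centrally, this event can be detected with a simple pattern match. The following is a rough sketch that assumes the message layout shown above; the exact format may vary between releases, and the function name and returned keys are hypothetical.

```python
import re

# Assumption: the event appears as "[<source>: sis.changelog.full:<severity>]",
# as in the sample message above. Other releases may format it differently.
CHANGELOG_FULL_RE = re.compile(
    r"\[(?P<source>[^:\]]+):\s*sis\.changelog\.full:(?P<severity>\w+)\]"
)


def parse_changelog_full(line: str):
    """Return source and severity if the line reports a full changelog, else None."""
    match = CHANGELOG_FULL_RE.search(line)
    if match is None:
        return None
    return {"source": match.group("source"), "severity": match.group("severity")}


sample = (
    "Thu Mar 25 04:54:39 GMT [nss-u55/nss-u56: sis.changelog.full:warning]:"
    "Change logging metafile on volume [host][vol] is full and can not hold "
    "any more fingerprint entries."
)
print(parse_changelog_full(sample))
```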

To check the current usage, use the appropriate command:

For 7-Mode:

Run the sis status -l command.
The Last Operation Size and Changelog Usage fields display the current status.

For clustered Data ONTAP:

::>volume efficiency show -vserver <vserver name> -volume <vol name>

To start the deduplication process and clear the changelog, use the appropriate command:

For 7-Mode:

The sis start command starts the deduplication process and clears the changelog.
However, when the changelog is full, running the sis start command alone does not process all modified blocks. In this case, run the sis start -s command to deduplicate all data.

For clustered Data ONTAP:

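Run the following command:

::>volume efficiency start -vserver <vserver name> -volume <vol name>

If the changelog is full, run the following command to deduplicate all data:

::>volume efficiency start -vserver <vserver name> -volume <vol name> -scan-old-data true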
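
For scripted environments, the choice between the two clustered Data ONTAP commands can be sketched as below. This only builds the command string. It assumes that -scan-old-data true plays the same role as sis start -s in 7-Mode, and the helper name and the vs1/vol1 values are hypothetical.

```python
def efficiency_start_command(vserver: str, volume: str, changelog_full: bool) -> str:
    """Build the 'volume efficiency start' command line for a volume.

    When the changelog is full, add -scan-old-data true so that existing
    data is scanned instead of relying only on the changelog.
    """
    command = f"volume efficiency start -vserver {vserver} -volume {volume}"
    if changelog_full:
        command += " -scan-old-data true"
    return command


# Hypothetical names, for illustration only.
print(efficiency_start_command("vs1", "vol1", changelog_full=False))
print(efficiency_start_command("vs1", "vol1", changelog_full=True))
```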

Additional Information

TR-3966 - NetApp Data Compression and Deduplication Deployment and Implementation Guide: Clustered Data ONTAP