How does the deduplication changelog work?
Applies to
- Clustered Data ONTAP
- Dedupe & Compression
Answer
- The changelog records modifications to data blocks. Once deduplication is started, Data ONTAP refers to the changelog to deduplicate the data, and clears the changelog when the deduplication process is complete.
- The changelog can hold fingerprint records for up to 8 TB of modified user data.
- This maximum size cannot be changed.
- The changelog itself can grow to 64 GB, which means that new writes continue to be logged until 8 TB of data has been written to the volume:
8 TB of user data = 2147483648 data blocks in a volume (4 KB per block)
Fingerprint size = 32 bytes
Thus, the total changelog size supported = 2147483648 * 32 bytes = 64 GB
- When it is about to exceed this limit, Data ONTAP stops writing modifications to the changelog and writes the following message to the syslog:
Thu Mar 25 04:54:39 GMT [nss-u55/nss-u56: sis.changelog.full:warning]:Change logging metafile on volume [host][vol] is full and can not hold any more fingerprint entries.
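The 64 GB limit quoted above follows directly from the block size and fingerprint size. A minimal Python sketch of that arithmetic is shown here; the constant and function names are illustrative only, and binary units (1 TB = 1024^4 bytes) are assumed because they reproduce the block count given in this article.

```python
# Constants taken from the article; the names are illustrative only.
BLOCK_SIZE_BYTES = 4 * 1024            # 4 KB per data block
FINGERPRINT_SIZE_BYTES = 32            # one fingerprint record per block
MAX_LOGGED_DATA_BYTES = 8 * 1024**4    # 8 TB of user data (binary units assumed)


def changelog_size_bytes(logged_data_bytes: int) -> int:
    """Return the changelog space needed to fingerprint the given amount of data."""
    blocks = logged_data_bytes // BLOCK_SIZE_BYTES
    return blocks * FINGERPRINT_SIZE_BYTES


max_blocks = MAX_LOGGED_DATA_BYTES // BLOCK_SIZE_BYTES
max_changelog = changelog_size_bytes(MAX_LOGGED_DATA_BYTES)

print(max_blocks)                  # 2147483648
print(max_changelog // 1024**3)    # 64 (GB)
```

The same function can be used to estimate how much changelog space a given amount of newly written data would consume, for example `changelog_size_bytes(1024**4)` for 1 TB of new writes.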
To check the current usage, use the appropriate command:
For 7-Mode:
Run the sis status -l command.
The Last Operation Size and Changelog Usage fields display the current status.
For clustered Data ONTAP:
::>volume efficiency show -vserver <vserver name> -volume <vol name>
To start the deduplication process and clear the changelog, use the appropriate command:
For 7-Mode:
- The sis start command starts the deduplication process and clears the changelog.
- However, running the sis start command alone will not process modified blocks when the changelog is full. Thus, run the sis start -s command to deduplicate all data in this case.
For clustered Data ONTAP:
::>volume efficiency start -vserver <vserver name> -volume <vol name>
::>volume efficiency start -vserver <vserver name> -volume <vol name> -scan-old-data true
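The choice between the two clustered Data ONTAP commands can be sketched as a small Python helper. This is only an illustration: it assumes, by analogy with sis start -s in 7-Mode, that the -scan-old-data true form is the one to use when the changelog is full. The function name and the sample values vs1 and vol1 are hypothetical.

```python
def efficiency_start_command(vserver: str, volume: str, changelog_full: bool) -> str:
    """Build the clustered Data ONTAP command string for starting deduplication.

    Assumption: when the changelog is full, existing data must be rescanned,
    so the -scan-old-data true option is appended.
    """
    cmd = f"volume efficiency start -vserver {vserver} -volume {volume}"
    if changelog_full:
        cmd += " -scan-old-data true"
    return cmd


# Hypothetical SVM and volume names, for illustration only.
print(efficiency_start_command("vs1", "vol1", changelog_full=False))
print(efficiency_start_command("vs1", "vol1", changelog_full=True))
```

The helper only builds the command text; it does not connect to a cluster or run anything.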
Additional Information
TR-3966 - NetApp Data Compression and Deduplication Deployment and Implementation Guide: Clustered Data ONTAP