What is MD5 fingerprinting?
Applies to
- Checksum
- MD5
- ONTAP 9
Answer
- Fingerprinting is a mechanism used by customers who utilize filers for archival and retention to ensure that data at rest hasn't been replaced, edited, or corrupted.
- At a minimum, it involves a filer computing a secure digest on a file and reporting it to a client.
- If a fingerprint is computed, it is cleared out after an hour or after the dump command is executed once, whichever comes first. This ensures that the in-memory value does not persist indefinitely, which could potentially overload the memory.
- The MD5 checksum is stored in memory (RAM) and is not persisted. Once computed, the checksum is held only temporarily and is then cleared.
- The dump can therefore be executed only once per computed fingerprint, because the fingerprint is an in-memory value that must be cleared to avoid overloading the memory. If fingerprints were continuously computed without being dumped or cleared, these contexts would eventually fill the memory. In contrast, core dumps are stored on persistent storage, which is managed by rotating the stored dumps when the maximum core count is reached.
- The customer has MD5 checksum records for files when the data was hosted on "Site A".
- After moving the data to "Site B", they want to validate that the checksums match the records from "Site A".
- End goal: MD5 checksums generated for files on "Site A" should match the MD5 checksums generated for files on "Site B".
Customer output:
PS Z:\> certutil -hashfile COMPLETE_DATA_BACKUP_Rise_Final_Migration_databackup_0_1 MD5
MD5 hash of COMPLETE_DATA_BACKUP_Rise_Final_Migration_databackup_0_1:
28bbdb4cd3969aa76a11acbd9ed90d96
CertUtil: -hashfile command completed successfully.
Our Output:
Data Fingerprint: KLvbTNOWmqdqEay9ntkNlg==
Metadata Fingerprint: Wo5Lm2Q2vxgiovPrTzhlMQ==
To compare the MD5 value that ONTAP reports through volume file fingerprint dump with the client-side value, Base64-decode the Data Fingerprint and write the resulting 16 bytes in hexadecimal:
Data Fingerprint: KLvbTNOWmqdqEay9ntkNlg==
Base64 decoded output (hexadecimal): 28bbdb4cd3969aa76a11acbd9ed90d96
Customer's Output: 28bbdb4cd3969aa76a11acbd9ed90d96
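The conversion above can be reproduced with a short Python sketch that uses only the standard base64 module. The helper names fingerprint_to_hex and fingerprints_match are hypothetical and chosen for illustration. The input is the Data Fingerprint string exactly as reported by ONTAP.

```python
import base64

def fingerprint_to_hex(fingerprint_b64):
    """Decode a Base64 fingerprint and return its bytes as lowercase hex."""
    raw = base64.b64decode(fingerprint_b64, validate=True)
    return raw.hex()

def fingerprints_match(fingerprint_b64, client_md5_hex):
    """Compare an ONTAP Base64 fingerprint with a client-side hex MD5."""
    return fingerprint_to_hex(fingerprint_b64) == client_md5_hex.strip().lower()

# Values taken from the example in this article.
print(fingerprint_to_hex("KLvbTNOWmqdqEay9ntkNlg=="))
# 28bbdb4cd3969aa76a11acbd9ed90d96
print(fingerprints_match("KLvbTNOWmqdqEay9ntkNlg==",
                         "28bbdb4cd3969aa76a11acbd9ed90d96"))
# True
```

Only the Data Fingerprint is comparable with a client-side hash of the file contents. The Metadata Fingerprint covers file attributes and has no certutil equivalent.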
This is expected, as detailed in the field descriptions for the volume file fingerprint dump command:
Data Fingerprint:
The digest value of data of the file. The fingerprint is base64 encoded. This field is not included if the scope is metadata-only.
Metadata Fingerprint:
The digest value of metadata of the file. The metadata fingerprint is calculated for file size, file ctime, file mtime, file crtime, file retention time, file uid, file gid, and file type. The fingerprint is base64 encoded. This field is not included if the scope is data-only.