Dual Commit is only relevant for the data part of an object. The metadata part of each object is always stored as 3 copies per data center.
Dual Commit is a process that occurs during ingest. It is separate from, and runs before, ILM (Information Lifecycle Management) evaluation. The grid acknowledges a successful write to the S3 client only after Dual Commit has been verified as complete. To maximize performance, Dual Commit generally writes the two copies to two distinct Storage Nodes at the site where ingest occurred - this avoids the delay of crossing a WAN link.
The ILM policy then evaluates the object and distributes it across the grid in accordance with the ILM rules. The redundant Dual Commit copy at the ingest site is deleted only after the matching ILM rule is fully satisfied.
For example, an object is ingested at DataCenter1 (DC1) and a 3-copy ILM rule is applied (1 copy in DC1, 1 copy in DC2, and 1 copy in DC3):
- Dual Commit makes two copies at the ingestion site DC1
- The 3-copy rule is matched to this content
- One of the DC1 copies is earmarked/selected by the ILM engine
- One copy is replicated to DC2
- One copy is replicated to DC3
- ILM is marked as satisfied
- The redundant Dual Commit copy in DC1 is removed.
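The sequence above can be sketched as a small simulation. This is only an illustrative model, not StorageGRID code: the function names `dual_commit` and `apply_ilm`, the node labels, and the representation of a rule as a site-to-copy-count mapping are all assumptions made for this example.

```python
def dual_commit(ingest_site):
    """Model the two interim copies written to distinct nodes at the ingest site."""
    # Node names are hypothetical placeholders.
    return [(ingest_site, "node-1"), (ingest_site, "node-2")]


def apply_ilm(interim_copies, rule):
    """Satisfy a rule of the form {site: wanted_copies}.

    Returns (final_placement, removed_copies). Interim copies are reused
    where the rule wants a copy at their site; anything left over is
    treated as redundant and removed only after the rule is satisfied.
    """
    remaining = list(interim_copies)
    placement = []
    for site, wanted in rule.items():
        # Earmark existing interim copies at this site, up to the wanted count.
        reused = [c for c in remaining if c[0] == site][:wanted]
        for copy in reused:
            remaining.remove(copy)
        placement.extend(reused)
        # Replicate any copies that are still missing at this site.
        for i in range(wanted - len(reused)):
            placement.append((site, f"ilm-node-{i + 1}"))
    # At this point the rule is satisfied, so the leftovers can be dropped.
    return placement, remaining


interim = dual_commit("DC1")
final, removed = apply_ilm(interim, {"DC1": 1, "DC2": 1, "DC3": 1})
print("final placement:", final)
print("removed interim copies:", removed)
```

In this model one of the two DC1 copies is kept as the DC1 copy required by the rule, DC2 and DC3 each receive a new copy, and the second DC1 copy is the only one removed.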
As noted above, Dual Commit generally makes both copies at the site where ingest occurred. If that site does not have enough available storage, Dual Commit uses storage at another site.
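The local-first preference with a remote fallback might look like the following sketch. The function `choose_commit_nodes`, its input format, and the choice to keep any available local node and fill only the shortfall from another site are assumptions for illustration; the actual selection logic is not described here.

```python
def choose_commit_nodes(ingest_site, free_nodes_by_site):
    """Pick two distinct Storage Nodes, preferring the ingest site.

    free_nodes_by_site maps a site name to a list of node names that
    currently have space (a hypothetical input format).
    """
    # Prefer nodes at the ingest site to avoid crossing the WAN.
    local_nodes = free_nodes_by_site.get(ingest_site, [])
    chosen = [(ingest_site, node) for node in local_nodes[:2]]

    # Fall back to other sites only for copies that could not be placed locally.
    for site, nodes in free_nodes_by_site.items():
        if site == ingest_site:
            continue
        for node in nodes:
            if len(chosen) == 2:
                break
            chosen.append((site, node))

    if len(chosen) < 2:
        raise RuntimeError("not enough storage for two copies")
    return chosen


# Enough local storage: both copies stay in DC1.
print(choose_commit_nodes("DC1", {"DC1": ["sn1", "sn2", "sn3"], "DC2": ["sn4"]}))
# Local storage exhausted: both copies land at another site.
print(choose_commit_nodes("DC1", {"DC1": [], "DC2": ["sn4", "sn5"]}))
```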
Dual Commit is analogous to the ONTAP concept of logging an incoming write to two NVRAM locations so that the client receives an early acknowledgment before the write is flushed to RAID-4, RAID-DP, or RAID-TEC, depending on the target aggregate. Like NVRAM, Dual Commit in SGWS exists only to protect the data until it can be committed to its final destination. Once that has occurred, the data is flushed out of the 'buffer.'