How does NVFAIL work with LUNs in ONTAP?
Applies to
ONTAP 9
Answer
Overview
- During a forced failback or MetroCluster switchover, databases are vulnerable to corruption because the data on disk can become inconsistent with the data held in the database's internal cache
- In these events, previously acknowledged changes might be discarded, so the contents of the storage array effectively jump backward in time and the database cache no longer reflects the state of the data on disk
- This inconsistency could result in data corruption
- Caching can occur at the application or server layer. For example, an Oracle Real Application Cluster (RAC) configuration with servers active on both a primary and a remote site caches data within the Oracle System Global Area (SGA). A forced switchover operation that results in lost data would put the database at risk of corruption because the blocks stored in the SGA might not match the blocks on disk.
- A less obvious use of caching is at the OS file system layer. Blocks from a mounted NFS file system might be cached in the OS. Alternatively, a clustered file system based on LUNs located at the primary site could be mounted on servers at the remote site, and once again data could be cached.
- A failure of NVRAM, a forced takeover, or forced switchover could result in file system corruption
- ONTAP systems protect databases and operating systems from this scenario with NVFAIL and its associated parameters
- NVFAIL is enabled by default on all volumes with LUNs to prevent issues with the file systems on those LUNs
- NVFAIL should also be considered for NAS volumes that host databases
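
The cache inconsistency described in the overview can be illustrated with a small toy model. This is only a sketch under stated assumptions: `ToyArray`, `ToyHost`, and `forced_switchover` are hypothetical names invented for illustration and do not correspond to ONTAP internals or APIs. The model shows how a host cache can hold a block version that the storage no longer has once acknowledged writes are discarded.

```python
# Toy model only: all class and method names are hypothetical, not ONTAP APIs.

class ToyArray:
    """Storage that acknowledges writes but may lose the most recent ones."""

    def __init__(self):
        self.blocks = {}    # block id -> data currently on disk
        self.history = []   # (block id, previous data) per acknowledged write

    def write(self, block, data):
        # Remember the prior contents so the write can be undone later.
        self.history.append((block, self.blocks.get(block)))
        self.blocks[block] = data

    def forced_switchover(self, lost_writes):
        """Discard the last `lost_writes` acknowledged writes."""
        for _ in range(min(lost_writes, len(self.history))):
            block, previous = self.history.pop()
            if previous is None:
                self.blocks.pop(block, None)
            else:
                self.blocks[block] = previous


class ToyHost:
    """Host (database or OS) that caches every block it writes."""

    def __init__(self, array):
        self.array = array
        self.cache = {}

    def write(self, block, data):
        self.array.write(block, data)
        self.cache[block] = data

    def stale_blocks(self):
        # Blocks whose cached contents no longer match what is on disk.
        return sorted(
            b for b, d in self.cache.items() if self.array.blocks.get(b) != d
        )


array = ToyArray()
host = ToyHost(array)
host.write(1, "v1")
host.write(2, "v1")
host.write(1, "v2")   # acknowledged, but lost in the switchover below

array.forced_switchover(lost_writes=1)
print(host.stale_blocks())   # block 1 is cached as "v2" but is "v1" on disk
```

In this model the host has no way to know that block 1 reverted unless something forces it to discard its cache, which is the kind of situation NVFAIL is intended to guard against.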