Why Upgrade to ESX 6.x? Snapshot handling!
A lot of organisations are still on older versions of VMware vSphere. I regularly see ESXi deployments in the field that run version 4.1, 5.0 or 5.5. When I ask why they do not upgrade, I get all sorts of answers. The most common one is, “I do not see any benefit in the new version”. VMware made a huge improvement to snapshot handling and consolidation in vSphere 6.0, but they didn’t advertise this broadly to the whole community.
In vSphere versions prior to 6.0, the VM is ‘stunned’ while a snapshot is being consolidated and removed. You will notice this in your environment when applications take snapshots regularly, for example backup software such as Veeam Backup & Replication, and especially when highly transactional systems such as databases are running.
When a snapshot is taken, all disk contents at snapshot time are preserved. Future modifications to the disk are logged in a separate snapshot file. Multiple levels of snapshots are possible, and multiple snapshots can be consolidated into a single disk or a snapshot by applying the modifications in each to the previous snapshot or base disk. Once consolidated, intermediate snapshots can be discarded.
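The behaviour described above can be sketched with a toy model. This is only an illustration: the `SnapshotDisk` class, its method names and the dictionary-based "blocks" are hypothetical and say nothing about how VMware stores snapshot files on disk.

```python
class SnapshotDisk:
    """Toy disk: a base image plus a chain of snapshot delta layers."""

    def __init__(self, base):
        self.base = dict(base)   # block number -> contents
        self.deltas = []         # oldest first; each maps block -> contents

    def take_snapshot(self):
        # Freeze the current state; later writes land in a new delta layer.
        self.deltas.append({})

    def write(self, block, data):
        # Writes go to the newest delta, or to the base if no snapshot exists.
        target = self.deltas[-1] if self.deltas else self.base
        target[block] = data

    def read(self, block):
        # The newest layer wins; fall back to the base disk.
        for delta in reversed(self.deltas):
            if block in delta:
                return delta[block]
        return self.base.get(block)

    def consolidate(self):
        # Apply each delta to the base, oldest first, then discard them.
        for delta in self.deltas:
            self.base.update(delta)
        self.deltas = []


disk = SnapshotDisk({0: "boot", 1: "data-v1"})
disk.take_snapshot()
disk.write(1, "data-v2")
disk.take_snapshot()
disk.write(2, "log")
before = [disk.read(b) for b in range(3)]
disk.consolidate()
after = [disk.read(b) for b in range(3)]
```

The VM sees the same contents before and after consolidation; only the number of layers behind the disk changes.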
Snapshot handling for a backup of the VM (prior to version 6)
The process starts by taking a snapshot of the base disk, which becomes read-only; all new writes are now sent to this snapshot (1, 2). After the backup process is finished, the removal of the snapshot starts (3): an algorithm checks whether the snapshot contents can be applied to the base disk within 12 seconds. If this is not the case, a new snapshot is taken and the older snapshot becomes read-only. The older snapshot is consolidated into the base disk (4). The algorithm then checks again whether the consolidation to the base disk can take place within 12 seconds. If this is the case (5), the VM is stunned (paused) and the snapshot contents are applied to the base disk. After all changes have been applied, the snapshot is removed and the VM continues its work (6).
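To get a feeling for why this loop can take several rounds on a busy VM, here is a rough simulation. It assumes a constant commit rate and a constant guest write rate, which real systems do not have; the function name and parameters are made up for this sketch, and only the 12-second threshold comes from the description above.

```python
MAX_STUN_SECONDS = 12.0  # threshold quoted in the text


def consolidate_with_helpers(snapshot_mb, commit_mb_s, write_mb_s, max_passes=50):
    """Return (helper_snapshots_created, final_stun_seconds).

    Raises RuntimeError if the delta never becomes small enough.
    """
    helpers = 0
    size = float(snapshot_mb)
    for _ in range(max_passes):
        commit_time = size / commit_mb_s
        if commit_time <= MAX_STUN_SECONDS:
            # Small enough: stun the VM and apply the remaining delta.
            return helpers, commit_time
        # Too large: new writes go to a helper snapshot while the
        # older snapshot is committed in the background.
        helpers += 1
        size = write_mb_s * commit_time  # data the helper collected meanwhile
    raise RuntimeError("consolidation did not converge")


# A 20 GB snapshot, 100 MB/s commit rate, 20 MB/s of guest writes.
helpers, stun = consolidate_with_helpers(20000, 100, 20)
```

In this model the VM needs two helper snapshots and is finally stunned for 8 seconds. If the guest writes as fast as the storage can commit, the loop never reaches the threshold, which is the situation that hurts on heavily loaded databases.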
With Storage vMotion, all virtual disks of a VM are copied from a source volume to a target volume. While the copy process takes place, the VM keeps using the virtual disks on the source volume. The VM stays live during the migration/copy, so data on the source copy will change. VMware uses a mirror driver to make sure all new writes are sent to the source and target disks concurrently. When the source and target disks are equal, the VM switches from the source to the target disk. After the switch is successful, the source disk is deleted.
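The idea of the mirror driver can be shown with a very small model. This is a simplification written for this post, not VMware's code: disks are Python lists, the copy moves one block per step, and the `storage_vmotion` function and its `writes_during_copy` argument are hypothetical.

```python
def storage_vmotion(source, writes_during_copy):
    """Copy `source` (a list of blocks) to a new target while writes arrive.

    `writes_during_copy` maps a copy step to a (block, data) guest write
    that lands just before that step's block is copied.
    """
    target = [None] * len(source)
    for step in range(len(source)):
        if step in writes_during_copy:
            block, data = writes_during_copy[step]
            # Mirror driver: the write is sent to both disks.
            source[block] = data
            target[block] = data
        target[step] = source[step]  # background copy of one block
    return target


source = ["a", "b", "c", "d"]
# Block 0 is rewritten after it was copied, block 3 before it is copied.
target = storage_vmotion(source, {1: (0, "A"), 2: (3, "D")})
```

Because every write during the copy reaches both disks, source and target are identical when the copy finishes, and the VM can switch over without a second pass.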
In VMware Infrastructure 3 and vSphere 4.x, the virtual machine snapshot delete operation combines the consolidation of the data and the deletion of the file. This caused issues when the snapshot files were removed from the Snapshot Manager but the consolidation failed. The virtual machine was then left running on snapshots, and the user might not notice until the datastore was full of snapshot files.
In vSphere 4.x, an alarm can be created to indicate whether a virtual machine is running in snapshot mode. For more information, see Configuring VMware vCenter Server to send alarms when virtual machines are running from snapshots (1018029).
In vSphere 5.0, enhancements have been made to the snapshot removal. In vSphere 5.0, you are informed via the UI if the consolidation part of a RemoveSnapshot or RemoveAllSnapshots operation has failed. A new option, Consolidate, is available via the Snapshot menu to restart the consolidation.
In vSphere 6.0, the snapshot consolidation process uses the same mirror driver as Storage vMotion. This mirror driver makes it possible to consolidate a snapshot in one pass instead of creating multiple helper snapshots until the calculated commit time is lower than 12 seconds. During the commit/consolidation of a snapshot, the mirror driver writes to the original disk and to a new sparse file (3). If the commit process completes successfully, there is no need to merge the helper file with the new writes into the base disk. This snapshot helper file (sparse file) is deleted (4). If for some reason the original consolidation fails (in rare cases), the process starts again, but there will then be two files to consolidate. With this new snapshot consolidation process there is almost no visible stun time, typically well under a second.
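A toy version of the single-pass commit might look like the following. Everything here is an assumption made for illustration: the function name, the dictionary "disks", and in particular the rule that the merge skips blocks the guest has rewritten in the meantime, which is only there to keep the toy consistent and is not taken from VMware documentation.

```python
def consolidate_single_pass(base, delta, writes_during_merge):
    """Merge `delta` into `base` in one pass while guest writes arrive.

    `writes_during_merge` maps a merge step to a (block, data) guest write.
    Returns the helper (sparse) file, which can be discarded on success.
    """
    helper = {}
    for step, block in enumerate(sorted(delta)):
        if step in writes_during_merge:
            wblock, data = writes_during_merge[step]
            # Mirrored write: goes to the base disk and the sparse helper.
            base[wblock] = data
            helper[wblock] = data
        if block not in helper:
            # Only merge blocks the guest has not rewritten meanwhile.
            base[block] = delta[block]
    return helper


base = {0: "a", 1: "b", 2: "c"}
delta = {1: "B", 2: "C"}
# The guest rewrites block 2 while block 1 is being merged.
helper = consolidate_single_pass(base, delta, {0: (2, "C2")})
```

After the single pass the base disk already holds the newest data for every block, so the helper file has nothing left to contribute and can simply be removed.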
By upgrading to VMware vSphere 6.x, snapshot consolidation nightmares become a thing of the past. The mirror driver works great, especially when you have applications in your environment that rely heavily on VMware snapshots. If you want to see the effect of snapshots prior to version 6 compared with the mirror driver in place, please see this blog from Luca Dell’Oca.
- The Design and Evolution of Live Storage Migration in VMware ESX
- PowerShell Friday: Snapshots
- With vSphere 6, snapshot consolidation issues are a thing of the past!
- Snapshot Consolidation Changes in vSphere 6.0