|
There was a big buzz going around on Twitter this morning about the memory management mechanisms the VMkernel uses when a Guest OS has a limit set below its total assigned memory. The conversation was kicked off by Arnim Van Lieshout's blog post on memory management. This is NOT a good scenario to have in your ESX environment, and I have seen it many times, due either to a lack of education on what setting a limit means or simply to a bad template deployed throughout the environment. I want to start by saying that I am not 100% sure this is exactly what the VMkernel does, but having seen it, troubleshot it, and written rules in enterprise virtualization monitoring products around the behavior, I have a pretty solid base of understanding.
The first question that keeps coming up is "Don't memory management methods only kick in when there is contention?" My answer to this is two-fold. First, I've only seen the VMkernel wait for contention when it is looking at shares to determine priority, not before it executes a method of memory savings/management. Secondly, we need to define "contention". In this particular case, it is when a Guest OS needs more resources than the VMkernel can assign to it at a point in time. This can come from a lack of available resources, or from a forced restriction (like a limit) on how much the VMkernel can give to a guest. That's the one thing about a limit in VMware... The limit is a hard limit. There is no "If someone else isn't using it we will let you go above it"; it's static and absolute.
OK, so let's review what's going on here with a scenario I've seen all too often. I configure my template with 512MB of memory and start deploying a bunch of VMs with 1024MB of memory. For whatever reason, whether it is a bug or a static misconfiguration on the part of the VMware Admin, all my VMs go out with a hard limit of 512MB of memory and an assignment of 1024MB. So as my virtual machine boots up and loads its applications, it runs along happily until it hits that 512MB limit. At this point, the OS and applications don't know anything about a 512MB limit (they think they have the 1024MB assigned to the VM). They are going to keep requesting more and more memory. The VMkernel, whose job it is to assign memory, is simply going to say "No, you have a limit, and you will stick to that".
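The mismatch in this scenario is easy to spot once you have each VM's assigned memory and limit side by side. Here is a minimal Python sketch of that check. The `VmMemoryConfig` record, its field names, and the sample inventory are hypothetical and made up for illustration; they do not come from any VMware API or tool.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class VmMemoryConfig:
    """Hypothetical record; in practice you would fill it from your own inventory export."""
    name: str
    assigned_mb: int           # memory configured for the VM (what the guest sees)
    limit_mb: Optional[int]    # hard memory limit in MB; None means unlimited


def hidden_memory_mb(vm: VmMemoryConfig) -> int:
    """MB the guest believes it has but that sits above the hard limit."""
    if vm.limit_mb is None or vm.limit_mb >= vm.assigned_mb:
        return 0
    return vm.assigned_mb - vm.limit_mb


def misconfigured(vms: List[VmMemoryConfig]) -> List[str]:
    """Names of VMs whose limit is below their assigned memory."""
    return [vm.name for vm in vms if hidden_memory_mb(vm) > 0]


inventory = [
    VmMemoryConfig("web01", assigned_mb=1024, limit_mb=512),   # the bad-template case
    VmMemoryConfig("web02", assigned_mb=1024, limit_mb=None),  # unlimited
    VmMemoryConfig("db01", assigned_mb=2048, limit_mb=2048),   # limit equals assignment
]

print(misconfigured(inventory))         # ['web01']
print(hidden_memory_mb(inventory[0]))   # 512
```

For the template scenario above, `web01` is the VM that will run into trouble: the guest sees 1024MB, but 512MB of it is out of reach.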
The guest simply doesn't know why it is being rejected. As far as it is concerned, it still sees 512MB sitting there unused. This is naturally going to cause a performance hit to your applications, as they are being deprived of memory resources. This is where "contention", by my definition, hits. The Guest OS sees that it should have access and demands more resources. The VMkernel is not allocating any more memory due to the hard limit. This is where things get complicated, and I can ONLY speak from experience and not with 100% certainty about the process, but I have seen and resolved this issue enough times to make me pretty damn sure I am correct. Since the guest keeps pressuring the VMkernel, and the VMkernel has no more memory that it is allowed to assign to the guest, the VMkernel does the next best thing, which is to trigger the balloon driver inside the Guest OS.
As you can see in the previous image, in order to get the Guest OS to actually swap effectively, the VMkernel needs to balloon the full amount of memory so the guest can clean up, keep only the more frequently accessed pages in memory, and move everything else to its OS swap file. When it comes to ballooning, the VMkernel seems to be capable of breaking its memory limit restriction for this purpose. When this happens you will see a HUGE spike in balloon memory for that virtual machine. The balloon will effectively grow to ((memoryAssigned - memoryLimit) + memoryFreedbyBalloon) before it deflates and frees up the OS memory. Within 10 minutes, the process should be complete and the balloon will deflate, leaving the OS able to get more memory from the VMkernel, again, up to the specified limit.
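To put numbers on that formula, here is a small Python sketch using the 1024MB/512MB scenario. The function name and the 128MB "freed by balloon" figure are invented for illustration; the real amount depends on what the guest manages to page out, and this models only my observed behavior, not documented VMkernel internals.

```python
def peak_balloon_mb(assigned_mb: int, limit_mb: int, freed_by_balloon_mb: int) -> int:
    """Peak balloon size per the formula:
    (memoryAssigned - memoryLimit) + memoryFreedbyBalloon.
    """
    if limit_mb >= assigned_mb:
        # No limit below the assignment, so this scenario does not apply.
        # (Ballooning caused by host-level contention is not modeled here.)
        return 0
    return (assigned_mb - limit_mb) + freed_by_balloon_mb


# 1024MB assigned, 512MB limit, and an assumed 128MB freed inside the guest.
print(peak_balloon_mb(1024, 512, 128))  # 640
```

In other words, a VM that can only ever use 512MB may briefly show a balloon larger than its own limit, which is exactly the spike to look for.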
Of course, my workloads are never happy and always want more memory, so this process is going to repeat itself and the OS is going to hit the limit again.
This time, the VMkernel is pretty much out of options. It will try to balloon again, but it will not work. What you will see in ESXTOP or your monitoring tool is that the balloon is going to inflate significantly again. Instead of deflating within 10 minutes, it is going to stay inflated and keep trying to get the guest to page. This causes a major performance hit to your VM, and it is the final warning you get that your VM workload is about to use the VMware swap file. If you don't do something at this point, the Virtual Machine is pretty much toast in terms of performance... in fact, it is already SERIOUSLY degraded.
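The difference between the first (healthy) balloon cycle and this second one comes down to whether the balloon deflates within roughly 10 minutes. A rough Python sketch of that check is below. The sample format, the 256MB threshold, and the function name are all assumptions for illustration; this is not how any particular monitoring product implements its rule.

```python
from typing import List, Tuple


def balloon_stuck(samples: List[Tuple[int, int]],
                  threshold_mb: int,
                  max_minutes: int = 10) -> bool:
    """Return True if the balloon stays at or above threshold_mb for
    longer than max_minutes without deflating.

    samples: (minute, balloon_mb) pairs sorted by time, e.g. values
    collected from your monitoring tool (hypothetical format).
    """
    inflated_since = None
    for minute, balloon_mb in samples:
        if balloon_mb >= threshold_mb:
            if inflated_since is None:
                inflated_since = minute        # start of an inflation episode
            if minute - inflated_since > max_minutes:
                return True                    # inflated too long: raise the alarm
        else:
            inflated_since = None              # balloon deflated, reset the clock
    return False


# First cycle: the balloon spikes, then deflates within 10 minutes.
healthy = [(0, 0), (2, 600), (6, 640), (9, 100), (12, 0)]
# Second cycle: the balloon inflates and stays inflated.
stuck = [(0, 0), (2, 600), (6, 640), (10, 640), (14, 630), (18, 640)]

print(balloon_stuck(healthy, threshold_mb=256))  # False
print(balloon_stuck(stuck, threshold_mb=256))    # True
```

Resetting the clock whenever the balloon drops below the threshold keeps a normal inflate-and-deflate cycle from triggering the alarm.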
Once this happens, you MUST know what you are looking for. VMotioning the VM will have no effect, as the limit follows it. Unless you know how to detect and resolve this issue, the VM (or, as is far too often the case, multiple VMs) will be degraded. If there are too many VMs in this condition on a single ESX Server, the host itself will grind to a halt. After detecting and resolving this issue in a "worst case" scenario, many customers are able to get more than double the amount of VM workloads running on their system. Far too often it actually goes undetected because people either 1) don't know it's an issue or 2) don't know how to effectively troubleshoot the problem. At the end of the day, it is always safer for your infrastructure to lower the assigned amount of memory of a VM vs. messing around with setting limits to restrict them! As most everyone knows, I do design enterprise virtualization software for a living, so if you want to make sure you either detect this problem before it impacts your VMs, or avoid it altogether, make sure you check out vFoglight from Vizioncore, which has rules built into the product to detect and alarm on this exact scenario!
|