Hello guys, I hope you started 2020 as the best year of your life. In this next part of the virtualization tips about resource management in ESXi, I will take a deep dive into Memory Management.
We have two concepts of memory: physical memory & virtual memory. Both of them are block-based, and the blocks are called pages. In the case that physical memory is full (over-committed), the data of the extra virtual pages is stored on disk, which is called memory swapping (see the sketch after this list). There are two types of memory virtualization:
Software-based &
Hardware-assisted.
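To make the swapping idea concrete, here is a minimal Python sketch, purely conceptual and not how the VMkernel is implemented: a tiny pool of physical frames is over-committed and older pages are evicted to a pretend swap file. All names and sizes here are made up for illustration.

```python
# Conceptual sketch only (not ESXi code): what "over-commit" means at page level.
# When free machine pages run out, a victim page is written to a swap file on disk
# so its frame can be reused. PAGE_SIZE, PHYSICAL_FRAMES, etc. are illustrative.

PAGE_SIZE = 4096          # typical x86 page size
PHYSICAL_FRAMES = 4       # tiny "host memory" to force swapping quickly

physical = {}             # page_id -> page contents currently in RAM
swap_file = {}            # page_id -> page contents evicted to disk

def allocate_page(page_id, data):
    """Place a page in physical memory, swapping out an old page if RAM is full."""
    if len(physical) >= PHYSICAL_FRAMES:
        victim_id, victim_data = next(iter(physical.items()))  # naive victim choice
        swap_file[victim_id] = victim_data                     # write victim to disk
        del physical[victim_id]
    physical[page_id] = data

for i in range(6):                             # ask for more pages than RAM can hold
    allocate_page(f"page-{i}", bytes(PAGE_SIZE))

print(sorted(physical), sorted(swap_file))     # last 4 pages in RAM, first 2 swapped
```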
But before everything else, let's look at two general concepts:
1. Regardless of the virtual machines' required memory, the ESXi host takes a piece of physical memory for running its own daemons and agents when the host is normally up and ready to work (and also a dynamic amount of overhead memory for code and data). The remaining total memory is available for assignment to all of the virtual machines via sharing, reservation, or other memory allocation techniques. (The rough sketch after these two concepts puts example numbers on this.)
2. There is an abstraction layer for memory allocation between the ESXi host and its associated VMs. The hypervisor does this without the knowledge of, or any required management activity from, the guest OS of each VM. To improve memory utilization, the ESXi host transfers memory from idle VMs to VMs that need more memory.
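As a rough illustration of the first concept, the following back-of-the-envelope Python sketch subtracts a hypothetical host overhead and per-VM overhead from the installed RAM; the numbers and variable names are my own assumptions, not values from any official VMware sizing formula.

```python
# Back-of-the-envelope sketch, not an official VMware formula: memory left for VMs
# after the host keeps some for its own daemons/agents plus per-VM overhead.
# All numbers below are made-up examples.

host_physical_gb   = 128.0   # installed RAM on the ESXi host
host_overhead_gb   = 4.0     # hypervisor, daemons and agents (varies per host)
per_vm_overhead_gb = 0.15    # VMM code/data per VM (varies with vCPU/RAM config)
vm_count           = 20

available_for_vms = host_physical_gb - host_overhead_gb - per_vm_overhead_gb * vm_count
print(f"Memory usable by guest workloads: ~{available_for_vms:.1f} GB")
```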
For a better understanding of memory allocation in a virtualized environment, I should explain three main definitions:
- Machine Page Number (MPN): the physical memory resource that is available inside the ESXi host (host physical memory).
- Physical Page Number (PPN): memory assigned from the available physical resources of the ESXi host to a virtual machine (VM physical memory).
- Virtual Page Number (VPN): the guest OS virtual memory.
The hypervisor (VMM) is responsible for translating/mapping PPN (physical) to MPN (machine) to provide memory for each VM. The VMM maintains the combined virtual-to-machine page mappings in the shadow page tables, so the PPN2MPN mappings are maintained by the VMM. For both memory virtualization types, the guest OS of each VM maps VPN to PPN, and those tables are kept up to date with the VPN2PPN mappings by the guest OS (no extra computation is needed to maintain the coherency of the shadow page tables). There is also a good point in this method of memory allocation: the server can remap a physical page by changing its PPN2MPN mapping, in a manner that is completely transparent to the VM, as the sketch below shows.
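Here is a minimal Python sketch of the two mapping levels just described, assuming a toy guest page table (VPN2PPN) and a toy VMM table (PPN2MPN). It is not hypervisor code, but it shows why changing a PPN2MPN entry is invisible to the guest.

```python
# Minimal illustration (not hypervisor code) of the two mapping levels described above.
# The guest maintains VPN -> PPN; the VMM maintains PPN -> MPN. Because the guest only
# ever sees PPNs, the VMM can change a PPN -> MPN entry without the guest noticing.

guest_page_table = {0x10: 0x2A, 0x11: 0x2B}       # VPN -> PPN, owned by the guest OS
vmm_page_table   = {0x2A: 0x9001, 0x2B: 0x9002}   # PPN -> MPN, owned by the VMM

def translate(vpn):
    """Combined VPN -> MPN translation, as kept in a shadow page table."""
    ppn = guest_page_table[vpn]
    return vmm_page_table[ppn]

print(hex(translate(0x10)))        # 0x9001

# Transparent remap: the VMM moves the page to another machine frame.
vmm_page_table[0x2A] = 0x7777      # only the PPN -> MPN entry changes
print(hex(translate(0x10)))        # 0x7777 -- the guest's own table is untouched
```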
The TLB (also called the address translation cache) stores the recent VPN2PPN translations. The hardware TLB permits ordinary memory references to execute without additional overhead, so it directly caches the VPN2MPN translations read from the shadow page tables. In the software-based method, the combined VPN2MPN mappings are produced by a software technique and saved in the shadow page tables managed by the hypervisor. In the other (hardware-assisted) method, as VMware says, hardware resources are used to generate the combined mappings from the guest's page tables and the nested page tables maintained by the hypervisor. A toy model of the TLB's role follows below.
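The following toy Python model (not real MMU behavior) illustrates what the TLB buys us: it caches the combined VPN2MPN translation so that most memory references skip the page-table walk entirely. The table contents and cache size are invented for the example.

```python
# Toy model (not real MMU behaviour) of the TLB's role: cache the combined
# VPN -> MPN translation so most accesses skip the page-table walk.

from collections import OrderedDict

shadow_page_table = {0x10: 0x9001, 0x11: 0x9002}   # combined VPN -> MPN mappings

class TinyTLB:
    def __init__(self, capacity=2):
        self.capacity = capacity
        self.entries = OrderedDict()                 # VPN -> MPN, in LRU order

    def lookup(self, vpn):
        if vpn in self.entries:                      # TLB hit: no table walk needed
            self.entries.move_to_end(vpn)
            return self.entries[vpn]
        mpn = shadow_page_table[vpn]                 # TLB miss: walk the (shadow) table
        self.entries[vpn] = mpn
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)         # evict least recently used entry
        return mpn

tlb = TinyTLB()
print(hex(tlb.lookup(0x10)))   # miss, then cached
print(hex(tlb.lookup(0x10)))   # hit
```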
For the calculation of required memory in our datacenter, there are some important points about memory allocation & assignment that we need to be aware of. Let's review them:
1. An ESXi host will provide an extra layer of address translation on top of the underlying physical memory to grant memory access to each VM, because the guest OS needs a zero-based physical address space at the moment of the power-on operation. So basically the host virtualizes its physical memory via this method. VMware also carefully defines it: "The virtual memory of a VM is really a software abstraction layer to provide the illusion of hardware memory for that machine."
2. Consider a VM configured with 4 GB of RAM. Regardless of share, reservation, and limit considerations, the guest OS behaves as if it has an absolutely dedicated 4 GB of memory. But based on many factors, including memory overhead/over-commitment and the level of memory pressure on the ESXi host, it might not really consume exactly the configured memory size.
3. The software MMU has a higher memory overhead requirement than the hardware MMU. Software memory virtualization may sometimes have a performance benefit over the hardware-assisted approach if the workload induces a huge number of TLB misses, but the performance of the hardware MMU has improved since it was first introduced, with extensive caching implemented in the hardware layer. The arithmetic sketch below shows where the TLB-miss cost comes from.
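To see why TLB misses matter so much here, this is the commonly cited back-of-the-envelope arithmetic, assuming 4-level guest and 4-level nested page tables; treat the formula as a rough upper bound, not a measurement of any specific CPU.

```python
# Rough arithmetic behind the TLB-miss remark above. With 4-level paging, a native
# TLB miss costs about 4 memory references; with nested/extended page tables each
# guest page-table reference itself needs a nested walk, which is commonly quoted as
# up to (g+1)*(n+1) - 1 references for g guest levels and n nested levels.

def native_walk(levels=4):
    return levels

def nested_walk(guest_levels=4, nested_levels=4):
    return (guest_levels + 1) * (nested_levels + 1) - 1

print(native_walk())   # 4 references on a native TLB miss
print(nested_walk())   # up to 24 references on a virtualized TLB miss
```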
4. In the case of memory reservation, you must remember that if the VM becomes idle after a while, the host does not reclaim its memory and the VM can retain it. So the VMkernel can only allocate memory to the other VMs (those outside the reserved memory capacity, for example via resource pools) from the remaining portion of its physical memory resources, as in the sketch below.
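A simplified Python sketch of this point follows; it is not the VMkernel's admission-control algorithm, and the VM names and sizes are hypothetical, but it shows how reserved memory stays out of the pool that other VMs compete for.

```python
# Simplified sketch (not the VMkernel algorithm) of point 4: reserved memory stays
# with its VM even when that VM is idle, so other VMs can only be served out of
# the unreserved remainder. The VM names and sizes are hypothetical.

host_memory_mb  = 65536
reservations_mb = {"db-vm": 16384, "app-vm": 8192}   # VMs with memory reservations

reserved_total  = sum(reservations_mb.values())
unreserved_pool = host_memory_mb - reserved_total     # what other VMs compete for

print(f"Reserved (kept even if the owning VM idles): {reserved_total} MB")
print(f"Left for unreserved allocations:             {unreserved_pool} MB")
```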
5. Transparent Page Sharing (TPS) is a wonderful technique for consuming less memory by removing redundant copies of memory pages. So, compared with a traditional physical implementation, a workload running on a VM often consumes less memory. Just know that, as a security measure, inter-VM TPS is disabled by default and page sharing is restricted to intra-VM memory sharing (page sharing occurs only inside each VM, not across them). The sketch below shows the core idea.
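To show the core idea behind TPS, here is a conceptual Python sketch that collapses pages with identical contents into one stored copy via hashing. A real implementation also performs a full byte-by-byte comparison and uses copy-on-write on modification, and with inter-VM TPS disabled the matching scope would be limited to pages of the same VM; the page contents below are invented.

```python
# Conceptual sketch of the idea behind TPS (not VMware's implementation): identical
# page contents are detected by hashing and collapsed to a single shared copy.

import hashlib

def share_pages(pages):
    """Map each page to a single shared copy keyed by its content hash."""
    shared = {}                      # content hash -> one stored copy
    mapping = {}                     # page id -> content hash it now points to
    for page_id, content in pages.items():
        digest = hashlib.sha256(content).hexdigest()
        shared.setdefault(digest, content)
        mapping[page_id] = digest
    return shared, mapping

pages = {
    ("vm1", 0): b"\x00" * 4096,      # zero page in VM 1
    ("vm1", 1): b"\x00" * 4096,      # identical zero page -> shareable (intra-VM)
    ("vm2", 0): b"kernel code",      # different content stays separate
}
shared, mapping = share_pages(pages)
print(f"{len(pages)} logical pages backed by {len(shared)} machine pages")
```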
Obviously, from all of the above we can say that the overall overhead of memory virtualization highly depends on the workload of the VMs and hosts. But you should understand each of the above definitions and tips before designing and calculating the required memory resources of your virtualization infrastructure. If you want to read more details about memory virtualization and the ESXi management mechanisms, please look at the vSphere Documentation about Memory Virtualization.