Wednesday, September 30, 2020

NVMe architecture and its pros and cons for VMware ESXi


NVMe (Non-Volatile Memory Express) is a great technology that can be considered the storage access and transport protocol for the next generation of SSDs (with a PCIe interface). Keep in mind, however, that NVMe is not just a physical connector for flash-based disks; it is also used as a networking protocol. NVMe is an end-to-end standard (with its own command set) designed to reach the highest levels of throughput and performance (IOPS) on storage systems for most of today's enterprise workloads, especially IoT, data mining, and virtualized datacenters. NVMe offers significantly higher performance and lower latency than the legacy SAS and SATA protocols because of its parallel architecture: instead of a single command queue, it supports up to 64K I/O queues with up to 64K commands per queue. As a networking protocol, NVMe enables a high-performance storage networking fabric and provides a common framework for a variety of transports.
Unlike storage protocols designed for mechanical hard disks, NVMe is not just a faster way to attach an SSD, because it can take advantage of multi-processing architectures. NVMe is also a NUMA-optimized, highly scalable storage protocol that connects the host to the memory subsystem to deliver lower latency than other storage devices and protocols, including legacy SAS/SATA SSDs. The technology has many unique features, such as multi-stream writes and over-provisioning, that are very useful in virtualization environments.
NVMe brings many advantages compared to legacy protocols. But why do we need it in a virtualized infrastructure, and what effect does NVMe have on performance, data transmission, and storage in our virtualization and storage infrastructures? As WD put it nicely when describing NVMe features: I/O virtualization, together with namespaces, makes NVMe very interesting for enterprise SAN, hyper-scale server SAN, virtualization, and hyper-convergence use cases. Taking it one step further, SR-IOV allows different VMs to share a single PCIe hardware interface.
Although, as I mentioned previously, NVMe was designed for flash-based disks, the other big difference between NVMe and other types of flash-based storage is how it reaches the processing resources: it communicates between the storage device and the system CPU over high-speed PCIe lanes. Legacy SSDs go through an HBA/RAID controller, while NVMe SSDs are connected to the CPU directly via PCIe. Because of these benefits, NVMe will play an important role in the design of modern datacenters, especially when used as local storage for ESXi hosts. As an important note, never forget that you cannot use NVMe disks as part of a disk array (like RAID), because NVMe devices attach directly over PCIe rather than through a hardware RAID controller (plain PCIe offers no RAID option). As you know, VMware ESXi does not support software RAID, so it is not possible to build an array of NVMe disks into your ESXi storage design. So be careful about when and why you use them inside a vSphere environment!
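As a quick sanity check on a host with local NVMe disks, you can verify how ESXi enumerates them: each NVMe device shows up as its own storage adapter (vmhba) instead of sitting behind a RAID/HBA controller. A minimal sketch, assuming the host already has at least one NVMe device installed; the grep filter is just a convenience and adapter names differ per host:

# esxcli storage core adapter list
# esxcli storage core device list | grep -i nvme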
VMware has officially supported NVMe since vSphere 5.5 (November 2014), but at first only as a separate driver; it became part of the ESXi base image in vSphere 6.0. Through the VMware IOVP (I/O Vendor Partner) program, storage drivers (VIB and binary files) developed by many vendors, such as Dell, Intel, HP, Western Digital, and Samsung, have been certified. VMware also released a dedicated virtual NVMe controller that can be used as virtual hardware inside VMs. It is available for virtual machines with hardware version 13 or higher, so since the release of ESXi 6.5 it has been possible to use the NVMe controller inside virtual machines.
Compared to other types of virtual storage controllers, using this controller in virtual machines significantly reduces the software overhead of processing guest OS I/O. That makes it very useful in many virtualization scenarios, especially VDI environments, because NVMe lets us run more virtual desktops per ESXi host. (Each virtual machine supports up to 4 NVMe controllers and up to 15 devices per controller.) You can list the NVMe devices installed in an ESXi host by running the following esxcli command:

# esxcli nvme device list

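If you want to dig a bit deeper into a specific device, the same esxcli namespace can also show the controller identify data and its namespaces. A small sketch, assuming the NVMe device was enumerated as adapter vmhba2 (a hypothetical name; check the adapter list first, and note that the exact sub-commands and options may vary slightly between ESXi releases):

# esxcli nvme device get -A vmhba2
# esxcli nvme device namespace list -A vmhba2
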
For more information about NVMe technology, you can read the links below. There are also other related features, such as NVMe over Fabrics (NVMe-oF) and NVMe over Fibre Channel (NVMe/FC), that I may write other posts about later ;)

https://kb.vmware.com/s/article/2147714

https://blog.westerndigital.com/nvme-important-data-driven-businesses/

Thursday, September 17, 2020

A simple wrong configuration you may make - Part I

In many virtualization deployments, especially when there is an FW-VA (Firewall Virtual Appliance) whose networking has to be configured, I have seen all traffic, including both LAN and WAN, carried across a single boundary: there is only one virtual switch with the default configuration (just one port group), so all of the existing vNICs are connected to that port group. The sad part of the story is when you see that this vSwitch also has only one connected uplink (vmnic) for its physical connectivity.

 

To describe the problem in more detail: in some unexpected situations you will run into unstable connectivity and many other issues, like one I saw recently, duplicated and mismatched MAC addresses on the firewall's network interfaces! I think you can imagine the origin and cause of the problem. In a topology like the one I described, all LAN, WAN, and DMZ traffic of the virtual firewall is forwarded from a single VM with multiple vNICs to a single vSwitch, and the vSwitch forwards it all through that single uplink. The problem occurs when the (reply) frames come back to the ESXi host from the physical network and the vSwitch cannot tell which vNIC originally sent the data. As you may know, a vSwitch does not perform MAC learning; it already knows which MAC addresses are assigned to the vNICs connected to it, so it does not need ARP information or a CAM table the way a physical switch does. When a frame arrives at the vSwitch, it is the VMkernel's job to check the vNICs connected to that vSwitch and compare their MAC addresses with the destination MAC inside the received frame; if there is no match, the frame is dropped.

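When troubleshooting a situation like this, it helps to see exactly which vSwitch port, uplink, and MAC address each vNIC of the appliance is using. A small sketch using standard esxcli commands; the world ID 12345 is just a placeholder taken from the output of the first command:

# esxcli network vm list
# esxcli network vm port list -w 12345
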
In this situation, the first step is simply to separate the virtual switches (and by doing so you also separate the physical uplinks). Then assign separate port groups (on different vSwitches) for each required firewall interface, according to the network topology design; a minimal esxcli sketch follows. Finally, connect the physical cables directly to the WAN endpoint devices (such as a line terminal or DSL modem) without any other device in between. Now you can see why it is necessary to separate the physical connectivity of firewall virtual appliances ...
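
For example, creating a dedicated vSwitch for the WAN side with its own uplink and port group can look like this. This is a hedged sketch only: the names vSwitch-WAN, PG-WAN and the uplink vmnic2 are placeholders for whatever fits your topology:

# esxcli network vswitch standard add -v vSwitch-WAN
# esxcli network vswitch standard uplink add -v vSwitch-WAN -u vmnic2
# esxcli network vswitch standard portgroup add -v vSwitch-WAN -p PG-WAN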

Sometimes your design may call for just one VSS/VDS, and you do not want to separate traffic onto dedicated virtual switches. In that situation you need to change the failover order of the vmnics for each port group and arrange them to match the corresponding outgoing physical interface, as shown in the sketch below. However, in similar circumstances I have had many issues like the ones I mentioned, so I strongly recommend using separate vSwitches for the LAN, WAN, and DMZ port groups.
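
On a standard vSwitch, that per-port-group failover override can also be set from the command line. Again a sketch with placeholder names (PG-WAN, PG-LAN, vmnic0, vmnic2); on a distributed switch you would do the equivalent from the vSphere Client or PowerCLI:

# esxcli network vswitch standard portgroup policy failover set -p PG-WAN -a vmnic2
# esxcli network vswitch standard portgroup policy failover set -p PG-LAN -a vmnic0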



I will start a new journey soon ...