Thursday, April 18, 2019

NSX Host Preparation Failure reasons



 In the preparation steps of a VMware NSX deployment, after deploying NSX Manager and the controller nodes (at least 3 nodes, or another odd number of controllers), you need to run the "Host Preparation" procedure on the ESXi clusters. Sometimes, after you click "Install" on the chosen cluster, you may run into problems related to communication channel health, and the NSX deployment stops at this step. Today I want to review the causes of these problems:
1. Before making any changes, check the versions of your vSphere suite (the ESXi hosts that belong to the cluster and the vCenter Server) and of the NSX Manager appliance, and review the compatibility of these products.
2. Check the connection between the PSC Lookup Service and NSX Manager. A failure here can be caused by time settings that are not synchronized between the two servers.
Also check port 443, which may be blocked by a firewall. Check the certificates used by vCenter and NSX Manager as well: they may be self-signed certificates, which can cause trouble with SSL connectivity. Each of these can cause vCenter-to-NSX communication problems.
3. It is recommended to design and prepare the vSphere Distributed Switches before deploying NSX in the SDDC, especially for preparing the VXLAN TCP/IP stack (and the VTEP state) in the SDN environment. So you need to attach all hosts of the cluster to the VDS and then deploy NSX on that cluster. Here is an important point to remember: a VDS and a cluster are not forced to have a 1:1 relation (a VDS can span more than one cluster, and a single cluster can be connected to multiple VDSs), but with NSX it is better to keep exactly a 1:1 relation between VDS and cluster.

4. Check the DNS settings of the ESXi management network stack (VMkernel), because the host may fail to resolve names on the network fabric when it downloads the VIB packages from the NSX Manager server.



5. Check the VXLAN configuration: IP pools and vmkNIC status. Try the "Resolve" option in Host Preparation and let the NSX Manager agents (firewall and control plane) come up.



Finally, if you change any of these settings, review every operation that has been logged in the NSX audit logs and the vCenter Events tab.
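The name resolution, port 443 and time synchronization checks from items 2 and 4 can be roughly scripted. The following is only a minimal Python sketch under my own assumptions: it runs on a management workstation in the same management network, so it does not replace checking DNS on the ESXi host itself, and the function names (resolve_host, port_open, clock_skew_seconds, check_endpoints) are hypothetical helpers, not part of any VMware tool. The clock check assumes you already have the text of an HTTP Date header returned by the remote server.

```python
import socket
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def resolve_host(name):
    """Return the sorted IP addresses that `name` resolves to, or [] on failure."""
    try:
        infos = socket.getaddrinfo(name, None)
    except socket.gaierror:
        return []
    # getaddrinfo returns one entry per socket type, so de-duplicate the addresses.
    return sorted({info[4][0] for info in infos})


def port_open(host, port=443, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable networks.
        return False


def clock_skew_seconds(http_date_header, local_now=None):
    """Seconds by which the remote clock (taken from an HTTP Date header) is ahead of local time."""
    remote = parsedate_to_datetime(http_date_header)
    if local_now is None:
        local_now = datetime.now(timezone.utc)
    return (remote - local_now).total_seconds()


def check_endpoints(names, port=443):
    """Map each name to its resolved addresses and whether `port` accepts TCP connections."""
    report = {}
    for name in names:
        addresses = resolve_host(name)
        report[name] = {
            "addresses": addresses,
            "port_open": bool(addresses) and port_open(name, port),
        }
    return report
```

For example, calling check_endpoints(["nsxmgr.example.local", "vcenter.example.local"]) (both names are placeholders for your own NSX Manager and vCenter/PSC FQDNs) returns a small report per server. An empty address list points to the DNS problem from item 4, and "port_open": False points to the firewall problem from item 2.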

Thursday, April 11, 2019

Problem with the Latest ESXi Patch on HP DL380 G7

Hello everybody. I want to share what happened to us this week. We had a bad experience with the latest critical hotfix published by VMware, ESXi600-201903001 (I wrote about this patch in my last post), when we installed it on an HP ProLiant DL380 G7 server running ESXi 6.0 Update 3. This server has a Xeon X5660 2.8 GHz processor. Two hours after a successful upgrade (done with ESXCLI), the server crashed and HA moved all VMs to another host. Sadly, the purple screen showed no further dump information, so if I find out more about the problem, I will explain it later.




Saturday, April 6, 2019

VMSA-2019-0005

VMware has published a security advisory that announces several security issues with critical severity, together with hotfix updates for the affected products: VMware ESXi, Workstation and Fusion. The vulnerabilities are:
1. Virtual USB 1.1 Universal Host Controller Interface (UHCI) out-of-bounds read/write and Time-of-check Time-of-use (TOCTOU)
2. Intel E1000 / E1000E vAdapters out-of-bounds write (Fusion/Workstation)
3. Unauthenticated APIs (Fusion only)

As VMware said, exploitation of these issues may allow an attacker to execute code on the host from a virtual machine.
The CVE numbers of these security problems are:

   CVE-2019-5514
   CVE-2019-5515
   CVE-2019-5518
   CVE-2019-5519
   CVE-2019-5524


For more information about this issue and other recent issues, please refer to VMSA-2019-0005 on the VMware Security Advisories portal.

I will start a new journey soon ...