Saturday, September 28, 2019

ESXCLI Networking - Part 1 (Video Series)

I have finally prepared and uploaded the first part of this video tutorial series. It covers how to work effectively with the ESXCLI command line and its related syntax to manage ESXi networking and communication.

I hope you enjoy it.


Thursday, September 26, 2019

Migrate from Standard vSwitch to Distributed vSwitch


Today one of my students asked me to describe a procedure for safely migrating from a Standard vSwitch (VSS) to a Distributed vSwitch (VDS), so I decided to write a checklist for it. A few days ago I also discussed it in some VMware Community threads: 618874 & 613321
  1. First of all, design and create the vSphere Distributed Switches based on the vSphere networking assessment and the virtual infrastructure requirements, and configure the distributed uplink port group. Set the maximum number of uplinks (VMNICs) that the VDS will require.
  2. Add every ESXi host in the datacenter that needs to be connected to the VDS. Remember that there is no dependency between a host's membership in the VDS and its membership in the existing clusters, so hosts from any cluster can be added to the VDS.
  3. Consider redundancy on the physical uplinks of the existing VSS. Although it is not a prerequisite for VDS migration, I always try to follow this rule for the safety of the migration procedure, so that when we start the major migration operation we can do it without any risk of interrupting network connectivity. If we cannot provide a redundant uplink for the VSS, then this step and the next three should be run in a single pass of the VSS-to-VDS migration wizard.
  4. Move the first set of physical uplinks: assign one of the VMNICs from each host as the first uplink of the VDS. Then check the coexistence of the VSS and VDS and their connectivity.
  5. Create a new distributed port group and migrate the VMkernel ports to the newly assigned dvPortGroup, especially the one designed for the management port. Check the host management connection after a successful migration.
  6. Create the designed port groups for virtual machine communication, then migrate the VMs together with their associated uplinks to keep their connectivity.
  7. Move all the remaining objects, such as the other VMNICs, to the VDS. After a final check of all network communication, you can remove everything related to the VSS if you no longer need it (a quick verification sketch follows this list).
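After the migration, a quick way to confirm from each host's shell that the uplinks and VMkernel ports actually landed on the VDS is with ESXCLI (a minimal sketch; run it on every migrated host):
esxcli network vswitch dvs vmware list
esxcli network ip interface list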
I hope this is useful for you all ;)

Saturday, September 21, 2019

ESXi Networking Management & Troubleshooting by ESXCLI - Part 1

In this series I want to demonstrate how to work with the ESXCLI command-line tool to manage and troubleshoot ESXi network configuration. Let's start step by step, beginning with NIC status and its related syntax:

1. To list all of the physical NICs (pNICs) that belong to the host, with summary information about them:
esxcli network nic list
 
2. Detailed information about a specific pNIC:
esxcli network nic get -n vmnic0

3. Software settings of the pNICs, including VLAN tagging and VXLAN encapsulation:
esxcli network nic software list
 
4. Packet send/receive statistics for the VLANs associated with that pNIC:
esxcli network nic vlan stats get -n vmnic0 

5. Configure pNIC attributes, including speed, Wake-on-LAN settings, duplex mode, and so on:
esxcli network nic set -n vmnic0 -S 1000 -D full 

6. List the VMs with active network ports, along with their world IDs and associated networks:
esxcli network vm list
 
7. Retrieve details of a VM's connected ports by its world ID (-w), including the vSwitch, port group, IP and MAC addresses, and the related uplink:
esxcli network vm port list -w 136666

8. Detailed information about the Distributed vSwitches associated with the host:
esxcli network vswitch dvs vmware list
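As a quick example of how these commands fit together, here is a minimal sketch that checks a NIC, bounces its link, and pins it to 1 Gbps full duplex (vmnic1 is just a placeholder; only do this on a redundant uplink):
esxcli network nic get -n vmnic1
esxcli network nic down -n vmnic1
esxcli network nic set -n vmnic1 -S 1000 -D full
esxcli network nic up -n vmnic1
esxcli network nic list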

In the next post I will show you how to work with ESXCLI for networking in a new video series.

Friday, September 13, 2019

vRealize Network Insight - New Posters

Thanks to VMware, who recently released two fantastic posters about the vRealize Network Insight (vRNI) search engine:

1. Network flows search
2. Virtual Machine search guide



vRNI Flow Search Poster: https://blogs.vmware.com/management/files/2019/09/vRNI-Flow-Search.png
vRNI VM Search Poster

Monday, September 9, 2019

An Example of the Importance of Managing and Controlling Virtual Infrastructure Resources


In one of my projects I ran into a serious problem with a vSphere environment. The issue occurred in the following situation:
First, the VCSA encountered a low disk space problem and suddenly crashed. After increasing the size of the VMDK files and fixing that first problem, I saw that one of the ESXi hosts in the cluster was unreachable (disconnected) and vCenter could not connect to it, although both were reachable from my client system. Over SSH the ESXi host was accessible, but the vCenter Server could not connect to this one host.
All network parameters, storage zoning settings, time settings, and service configurations were the same on every host. Sadly, syslog was not configured and we did not have access to the scratch logs from the period when the issue occurred (I don't know why). Trying to restart all the management agents of the host hung, forcing it with services.sh restart got stuck and nothing really happened, and restarting vpxa and hostd did not fix the issue either.
There was only one error in the summary tab of the disconnected host, which said that vSphere HA was not configured and asked to remove and re-add the host to vCenter, but I could not reconnect it. My only guess is that it was related to the startup sequence of the ESXi hosts and storage systems, because the tech support unit had restarted some of them after running into the problem, so HA automatically tried to migrate the VMs of the offline host to the other online hosts; this is the moment I want to call a "complex disaster". Stuck, I decided to disable HA and DRS in the cluster settings, but nothing changed and the problem persisted. After fixing the VCSA problem I knew that restarting the host might solve the second problem, but because of a VM operation we could not do it. Migration did not work and we were confused.
Then I tried shutting down some non-essential VMs belonging to the disconnected host. After releasing some CPU/RAM resources, this time the management agent restart (services.sh restart) completed successfully.
After that, connecting the VCSA to the problematic ESXi host was possible and the problem was gone for good!
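For reference, the agent restarts mentioned above were the standard ones run from an SSH session on the host (nothing exotic, just the usual commands):
/etc/init.d/hostd restart
/etc/init.d/vpxa restart
services.sh restart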
Afterwards I wrote a procedure for that company's IT department as a virtualization checklist:
1. Pay attention to the logs of your virtual infrastructure assets. Don't forget to keep them locally in a safe repository and also on a syslog server.
2. Always monitor the used and free CPU/memory resources of the cluster, and never exceed their thresholds, because a host failure may lead to consecutive failures.
3. Check the status of the virtual infrastructure management services, including vCenter Server and NSX Manager, and also their disk usage. Execute "df -h" in the CLI or check the status of their VMDKs in the GUI (I explained how to do this in this post); a small sketch follows this checklist.
4. In critical situations or even maintenance operations, always shut down your ESXi hosts first and then the storage systems; when bringing everything back up, start the storage first and then the hosts.
5. Finally, please DO NOT disconnect the VCSA's vNIC from its associated port group if it is part of a Distributed vSwitch. They did that and it made me suffer a lot to reconnect the VCSA. Even if you restore a new backup of the VCSA, do not remove the network connectivity of the failed VCSA until the problem is solved.
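For points 1 and 3 above, a minimal sketch of what I mean (the syslog collector address is a placeholder; the first two commands assume SSH access to the ESXi host, the last one runs in the VCSA bash shell):
esxcli system syslog config set --loghost='udp://192.168.1.50:514'
esxcli system syslog reload
df -h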

I will start a new journey soon ...