Tuesday, May 28, 2019

Cannot access VCSA via WinSCP

One of my students asked me how to connect to the VCSA with file management tools like WinSCP, because after many attempts he kept hitting a fatal error such as "Received too large (x) SFTP packet. Max supported packet size is (y)". It is important to know that, unlike VMware ESXi, the vCenter Server Appliance does not by default allow access to its Linux system files and directories via WinSCP. To enable this type of access, run the following shell command to change the root user's default shell from the Appliance Shell to Bash:

#chsh -s /bin/bash root

This is useful for important file transfers (upload/download), such as the contents of the /var/log folder or generated certificates and their associated keys. You can also revert to the default setting (Appliance Shell) by executing the following command:

#chsh -s /bin/appliancesh root
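
Before and after running chsh, it can help to confirm which login shell root actually has. The following is only a minimal sketch: the helper name login_shell is hypothetical, and it simply extracts the seventh field of a passwd-format line.

```shell
# Print the login shell (7th colon-separated field) of each
# passwd-format line read from standard input.
# "login_shell" is a hypothetical helper name, not a VCSA command.
login_shell() {
    cut -d: -f7
}
```

On the appliance this would typically be fed with "getent passwd root | login_shell". Based on the two chsh commands above, it should print /bin/bash while WinSCP access is enabled and /bin/appliancesh once you revert.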
 

Thursday, May 16, 2019

VCSA Low Disk Space Problem - Part 1


It is a very common problem: some of the vCenter Server's disks fill up, and the lack of free space causes service interruption, usually because of massive log or dump file creation. For instance, if you restart the server, some of its services cannot start, especially the VMware PostgreSQL database in the Appliance deployment type. In practice the vCenter Server fails and cannot handle service requests anymore. In the case of VCSA, if you encounter this issue, you may be able to connect only to the VAMI web page or through shell access, with no vSphere Web Client UI access. So what should you do in this situation?
Let's fix it. First of all, shell access is required:
> shell.set --enabled True
> shell

Then the following command helps you check the used and available space on the disks:
# df -h
The result is something like this:
Then we can investigate the existing and occupied space of the VMDKs. There are two ways to save vCenter: remove some unnecessary files, or increase the size of the VCSA's virtual disks.
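
To narrow down the first option (removing files), here is a rough sketch of two helpers: one flags partitions above a usage threshold from df output, the other lists the largest files under a directory. Both function names and the default threshold of 90% are my own assumptions, not VCSA tooling, and the first helper assumes mount points without spaces.

```shell
# Read `df -P` output on stdin and print "<mount point> <use%>" for every
# filesystem whose usage is at or above the given percentage (default 90).
# Assumes mount points contain no spaces.
full_filesystems() {
    threshold="${1:-90}"
    awk -v limit="$threshold" 'NR > 1 {
        use = $5
        sub(/%/, "", use)
        if (use + 0 >= limit + 0) print $6 " " $5
    }'
}

# List the N largest regular files under a directory (sizes in KiB, largest
# first), staying on one filesystem so other mounts are not counted.
largest_files() {
    dir="$1"
    count="${2:-10}"
    find "$dir" -xdev -type f -exec du -k {} + 2>/dev/null |
        sort -rn |
        head -n "$count"
}
```

For example, "df -P | full_filesystems 85" shows which partitions are close to full, and "largest_files /var/log 20" then shows which files to inspect first on that partition. Check what a file is before deleting it.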

You can get into this situation through unexpectedly growing log or audit files, huge exported tcpdump files saved to local storage, or many netdump files collected during ESXi host crash incidents, so deciding what to do depends heavily on finding the root cause. I had a critical experience with a failed vCenter that had 0% free space on /dev/sda3, and unfortunately I couldn't enable Bash shell access, even with: > com.vmware.shell enable.
So you may need to mount the VCSA's VMDKs on another Linux OS (not as a bootable partition) and clean up some files to free disk space, then boot the VCSA again and start the Bash shell for troubleshooting. Selecting an accurate size for the virtual environment in the early steps of VCSA deployment is very important, because a wrong selection may lead to more inventory logs being saved to disk than the server was sized for, so the disk soon fills up and causes a problem. In some scenarios, after increasing the VMDK size through the vSphere Web Client, you need to restart the VCSA for the change to apply, because on boot it automatically adds the unallocated disk space to the existing logical volumes. But if you want to do it quickly and cannot restart the server, you can also run the following command:

vpxd_servicecfg storage lvm autogrow

So the expected result will be shown like this:

VC_CFG_RESULT=0 
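
If you run the autogrow command from a script, you can test for that result line instead of reading it by eye. This is only a sketch: cfg_succeeded is a hypothetical helper name, and it treats anything other than an exact VC_CFG_RESULT=0 line as a failure.

```shell
# Succeed (exit status 0) only if the text on stdin contains a line that is
# exactly VC_CFG_RESULT=0, ignoring trailing whitespace and carriage returns.
# "cfg_succeeded" is a hypothetical helper name.
cfg_succeeded() {
    tr -d '\r' | sed 's/[[:space:]]*$//' | grep -qx 'VC_CFG_RESULT=0'
}
```

A possible use is "vpxd_servicecfg storage lvm autogrow | cfg_succeeded && df -h", which prints the new sizes only when the grow step reported success.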
It is good to remember that if you deployed an external PSC and the PSC fails because of low disk space, you should execute this command:

/usr/lib/applmgmt/support/scripts/lvm_cfg.sh storage lvm autogrow  
  


Sunday, May 12, 2019

VCSA Backup Failed because of VSAN

The VMware vCenter Server Appliance Management Interface (VAMI) is a very useful web console for VCSA management and maintenance operations, available at HTTPS://VCSA_URL:5480. One of its tools is Backup, with which you can take a backup of vCenter data consisting of Inventory & Configuration (required) and Stats, Events & Tasks (optional).
VMware published VCSA 6.7 (version 6.7.0.14) with these protocols for Backup: FTPS, HTTPS, SCP, FTP, and HTTP, and announced that NFS and SMB would be supported starting with VCSA 6.7 U2 (version 6.7.0.3). We had two big problems with this useful tool. One of them is related to the vSAN Health Service.
 
Whenever the Backup task was started, it stopped immediately and generated a warning about the vSAN Health Service, which seemed to have crashed. (The VCSA management GUI tells us exactly what happened.) Sadly, trying to start this service (even with the --force option) led to another failed attempt, with a result something like this:


So after many retries at starting this service, I decided to check the file structure of this service under /etc/VMware-vsan-health and compare it with a freshly installed vCenter Server.

There are two files that could be related to the cause of this issue. The first is logger.conf, which was completely empty on the troubled vCenter Server (opening it in vi shows nothing), whereas on the healthy VCSA you can see something like the results below:

 


When I checked vsanhealth.properties, it showed that this service communicates over HTTPS, so its connections need an SSL structure. Then I found the second file, fwsecrets, which contains something like two hash strings. So I decided to take the risk and remove this file and logger.conf too (of course after taking a backup). At last, after a few minutes, the next attempt to start the service was successful.
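
Because removing these files is risky, the backup step is worth scripting so it cannot be skipped. Here is a minimal sketch, assuming a hypothetical helper named backup_and_remove that keeps a timestamped copy next to each original before deleting it.

```shell
# Copy each given file to <file>.bak-<timestamp> (preserving mode and times),
# then remove the original. Stops at the first file that cannot be backed up
# and leaves that file in place.
# "backup_and_remove" is a hypothetical helper name.
backup_and_remove() {
    stamp="$(date +%Y%m%d-%H%M%S)"
    for f in "$@"; do
        cp -p "$f" "$f.bak-$stamp" || return 1
        rm -f "$f"
    done
}
```

In the case described here, it would be called with the two files from the service's configuration directory, for example "backup_and_remove /etc/VMware-vsan-health/fwsecrets /etc/VMware-vsan-health/logger.conf". If the service still fails to start afterwards, the .bak copies can be moved back.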

Remember that you always need to check DNS (forward and reverse), NTP, certificates, and the firewall, especially if you set up the vSphere environment with an external Platform Services Controller. I will explain the second problem in another post.

Saturday, May 4, 2019

Creating an NSX logical switch fails

While configuring NSX, when you add a logical switch you may encounter an error and the logical switch creation fails:

"Unable to allocate an available resource for resource type Segment Id Pool"

This problem occurs simply because no Segment ID Pool has been configured for assignment to NSX objects. You only need to edit the Segment ID settings and add a pool ID range for objects (or universal objects). After that, everything works and you can set up the NSX logical switches.
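
When choosing the pool range, keep in mind that NSX for vSphere accepts segment IDs from 5000 to 16777215. The following sketch checks a START-END range before you type it into the Segment ID settings; valid_segment_pool is a hypothetical helper name, not part of NSX.

```shell
# Return success if the argument is a START-END range of segment IDs that
# lies within 5000..16777215 and has START <= END.
# "valid_segment_pool" is a hypothetical helper name.
valid_segment_pool() {
    case "$1" in
        *[!0-9-]* | -* | *- | *-*-* | "") return 1 ;;  # malformed input
        *-*) ;;                                        # looks like START-END
        *) return 1 ;;                                 # no separator at all
    esac
    start="${1%-*}"
    end="${1#*-}"
    [ "$start" -ge 5000 ] && [ "$end" -le 16777215 ] && [ "$start" -le "$end" ]
}
```

For example, "valid_segment_pool 5000-5999 && echo ok" prints ok, while a range such as 4000-4999 is rejected because it starts below 5000.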




I will start a new journey soon ...