Last month I had a bad experience with one of the vSphere environments I manage. A sudden, prolonged power outage in the datacenter caused most of the ESXi hosts in the primary vSphere cluster to reboot. Once that disastrous day was behind us, I noticed by chance that VM migration was failing, and it showed the following compatibility issue:
"A general system error occurred: Connection Refused; The remote service is not running, OR is overloaded, OR a firewall is rejecting connections."
Our operations team decided to investigate every possible cause from scratch, so we went through the following checklist:
- The TCP/IP stack configuration (especially the VMkernel gateway) of each ESXi host
- Disabling and re-enabling vMotion on the related VMkernel adapters
- Disabling and re-enabling the HA feature in the cluster settings
- The built-in firewall status of the ESXi hosts
- Restarting the management services, and even rebooting one of the affected hosts
- The configuration of each networking component in the path
- Removing the ESXi host from the cluster and re-adding it
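Because the error message mentions a refused connection, a quick TCP reachability test can complement the checklist above. The following is only a minimal sketch, not what we actually ran: the function names `port_status` and `check_hosts` are hypothetical, and the port list is an assumption (8000 is the usual vMotion port, 902 and 443 are common host management ports). It has to run from a machine that can reach the relevant VMkernel network, and it only tells you whether something answers on the port, not why a migration fails.

```python
import socket

# Assumed ports for illustration: 8000 (vMotion), 902 and 443 (host management).
PORTS = {"vMotion": 8000, "management-902": 902, "https": 443}


def port_status(host, port, timeout=3.0):
    """Try a TCP connection and return 'open', 'refused', or 'unreachable'."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        # The host answered, but nothing is listening or a firewall rejected it.
        return "refused"
    except OSError:
        # Timeouts, routing problems, name resolution failures, and so on.
        return "unreachable"


def check_hosts(hosts, ports=PORTS):
    """Return a dict mapping (host, port label) to the connection status."""
    return {
        (host, label): port_status(host, port)
        for host in hosts
        for label, port in ports.items()
    }
```

Calling `check_hosts` with the vMotion VMkernel addresses of the source and destination hosts gives a small table of results. A "refused" status points toward a stopped service or a rejecting firewall, while "unreachable" points toward routing or gateway problems.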
Finally, to find more detail, I also checked the log files under /var/log, but sadly nothing there gave me useful information. So I moved on to the next component: the vCenter Server. First I checked the VAMI: the health status and services were OK, and nothing notable appeared in the log files. While reviewing the VCSA details, however, I noticed the current version of this service:
Since none of my troubleshooting had produced a result, I decided to upgrade it to the latest published build (17713310) of the stable version (6.7.0.47000).
After a successful upgrade and a restart of the vCenter Server, the problem was fortunately gone, without any special action on the cluster or ESXi settings. Once again, I strongly recommend taking upgrade and update tasks seriously, because they can fix many unexpected issues with little effort. And don't forget to take a backup before the upgrade and to pilot the installation in your test environment before running it on the production system.