Wednesday, November 30, 2022

VMware VDI (Horizon View) Troubleshooting - Part VI

Before everything, please accept my apologies for not posting for nearly two months; the primary reason for this gap is the recent catastrophic suppression inside my homeland, Iran. During these two months of the Iranian protests, the IRGC military forces, like brutal alien mercenaries, murdered many of my people, including more than 60 children, because they are fighting for their freedom and shouting the "Woman, Life, Freedom" slogan. This revolution has disrupted the lives of all Iranians around the world and plunged us into deep anger and sadness. Thanks to the people of other countries for their support; I hope they keep it up until freedom is achieved for all.

 

After a long time, I decided to continue the VDI troubleshooting series (Part 5 was the last one). Since the introduction of VMware Horizon View, the Instant Clone desktop pool has been one of the three major types of desktop provisioning in this VDI solution. However, the Linked Clone type has been deprecated in recent versions, so the real question of this topic is: which of the remaining types is suitable for your company's VDI deployment, and why? Instant Clone or Full Clone?

For an Instant Clone desktop pool, if you change any of the related vSphere objects, including the cluster, snapshot, master image, and so on, you should schedule a maintenance operation to push a new snapshot of the modified image, because the desktop pool can no longer find the previously defined objects and provisioning operations will stop. This does not stop the already provisioned desktops, because they are generated and registered to the related Horizon Connection Servers. But for the next machines, the pool will prompt errors announcing that it's not possible to provision new desktops, and of course the corresponding VM generation inside the vSphere environment will fail too.

So run the maintenance wizard and fix the issue by adding the new candidate snapshot for future desktop deployments. Note that if you want to modify any vSphere objects that are related to Horizon, you should double-check them before editing.
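The same push-image operation can also be scheduled from PowerCLI. Here is a minimal sketch, assuming the community VMware.Hv.Helper module is installed; the server, pool, golden image, and snapshot names are all hypothetical examples:

```
# Hypothetical names: pool 'Win10-Pool', golden image 'Win10-GI', snapshot 'Snap-2022-11'
Import-Module VMware.VimAutomation.HorizonView
Import-Module VMware.Hv.Helper   # community helper module for Horizon View

Connect-HVServer -Server 'connection-server.example.local'

# Schedule a push-image (maintain) operation that points the pool at the new snapshot
Start-HVPool -Pool 'Win10-Pool' -SchedulePushImage `
    -ParentVM 'Win10-GI' -SnapshotVM 'Snap-2022-11' `
    -LogoffSetting FORCE_LOGOFF
```

Using FORCE_LOGOFF here is just one choice; depending on your users, a scheduled logoff may be gentler.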

Changing the AD credentials or an account expiration will fail desktop generation in the final steps. However, if you change the related OU structure, it will break new desktop provisioning at the computer account (SID) generation step. So, as with vSphere modifications, keep in mind that if you change Active Directory object values, you should fix them immediately in the corresponding desktop pool settings inside the Horizon administration console. In the following error, it prompts that computer account creation has failed.


Although the Instant Clone type is an efficient method for desktop generation, the possible maintenance operations are not that easy. To avoid issues like the ones mentioned above, you can review the following consideration checklist. Most of the items are based on my personal experience in different VDI projects, so it's my pleasure to read about and add your similar experiences:

  1. Review the domain definitions, for example, all related Active Directory objects such as OUs and account credentials. Also, if you provide delegation of control based on a specific OU for VDI desktops, you should check its AD privileges.

  2. Review all the vSphere hierarchy objects related to your VDI: Datacenter, Cluster, VM folder, golden image VM name, and the related snapshot.

  3. Also review the defined permissions in both mentioned environments (Active Directory/vSphere) if you can't find the root cause of the errors or have experienced some new infrastructure changes recently.

  4. If you removed or changed the VM's snapshot manually or through actions like disk consolidation, you should provide a new snapshot based on the required attributes for an Instant Clone desktop pool.

  5. Regardless of the Instant Clone desktop pool's benefits (like faster desktop provisioning and deployment), you should note that it's naturally harder to maintain and troubleshoot issues related to desktops deployed via this method, because of their complex structure. It generates a lot of vSphere objects for the Horizon environment, like the internal template, the replica and parent VMs, and the clone itself. A Full Clone, in contrast, is a fully independent VM generated from a candidate template, while the whole preparation procedure has many manual steps and takes more time.

  6. Check the events in the monitoring section. If you don't find a related error, you have to investigate and dive deep into the log files in the following default path:

    C:\ProgramData\VMware\VDM\logs\
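To speed up that dive, you can filter the Connection Server logs from PowerShell instead of opening them one by one. A minimal sketch, run on the Connection Server itself; the `debug-*` file pattern and the search terms are assumptions you may need to adjust:

```
# Search the three most recent debug logs for provisioning-related errors
Get-ChildItem 'C:\ProgramData\VMware\VDM\logs\debug-*.txt' |
    Sort-Object LastWriteTime -Descending |
    Select-Object -First 3 |
    Select-String -Pattern 'ERROR|Provisioning error' |
    Select-Object -First 20
```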

As the last conceptual point, to truly troubleshoot related issues we should understand deeply what's going on inside a service and its related network connections and communication, and VMware Horizon View is not an exception. As an important use case, I have seen many times that IT staff didn't know how to configure the firewalls between the different sections of a VDI environment, especially when we separate the VLANs/subnets of the desktop pools and the management servers. For example, the Blast Extreme protocol has two sides: the initial connection uses port TCP/UDP 8443, but in the next step, the communication between the VDI client and the desktop/RDSH host uses TCP/UDP 22443. So you should understand the differences in both the setup and the troubleshooting steps.
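A quick way to verify the TCP side of those two Blast Extreme ports from a client subnet is Test-NetConnection; the host names below are examples:

```
# Initial Blast connection point (UAG or Connection Server), TCP 8443
Test-NetConnection -ComputerName 'uag-or-cs.example.local' -Port 8443

# Direct Blast session to the desktop/RDSH host, TCP 22443
Test-NetConnection -ComputerName 'vdi-desktop01.example.local' -Port 22443
```

Note that Test-NetConnection only probes TCP; the UDP path for Blast must still be verified on the firewall rules themselves.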

This post is dedicated to Kian Pirfalak, a 9-year-old boy who was killed by the regime forces. He was a very creative young boy and dreamed of becoming a robotics engineer. R.I.P., our lovely son.

 





Tuesday, September 20, 2022

Bursting of new releases for the vSphere 7.0

 I think the recent release line of vSphere (version 7.0 Update 3) is one of the VMware products with the most patch releases in a short duration (8 versions from October 2021 until now). Regardless of the reasons, which were usually security weaknesses in each release, it brings us to a clear conclusion: every new product, although it has passed many complex security tests and considerations, can still include zero-days and deep breaches inside its architecture. The Log4j vulnerability proved that an old behavior of a service or solution can be a perfect target for hackers, because most of the time an unknown reaction or response to a crafted request may lead the whole system into an unstable state.

In recent years, most Unix/Linux-based OSes and services have been the targets of new attacks and other types of threats like ransomware, and VMware products are not excluded from these critical risks. It's natural to see many new patches in a short time. However, that's not a reason to avoid providing a well-designed, nicely scheduled plan for protecting against new disasters like the announcement of a new vulnerability. If we are forced to check the new build numbers of ESXi or vCenter Server weekly, I think it should be part of our IT staff's primary tasks.
Some releases include many security fixes, like vCenter Server 7.0 U3f:
 

In this post, I want to mention some of the known issues in the recent vSphere 7.0 U3 patch releases:
  1. Security: Encrypted VMs fail to power on in a Trusted Cluster containing an unattested host, in migrating/cloning states, or in an HA/DRS-enabled cluster.
  2. Networking: VM might lose Ethernet traffic after hot-add, hot-remove or storage vMotion.
  3. Networking: IPv6 traffic fails to pass through VMkernel ports using IPsec.
  4. Networking: When upgrading from vSphere 6.7 to vSphere 7.0, high-throughput virtual machines may experience degradation in network performance while NIOC is enabled.
  5. Storage: VOMA check is not supported for NVMe-based VMFS datastores and will fail with an error.
  6. Storage: After recovering from APD/PDL conditions, the VMFS datastore with enabled support for clustered virtual disks might remain inaccessible. The VMkernel log might show multiple "SCSI3 reservation conflict" messages.
  7. vSAN: VMs lose connectivity due to a network outage in the preferred site of a vSAN stretched cluster and stay in an inaccessible state, while they should fail over to the secondary site.
  8. vCLS: System VMs that are added for ensuring healthy operations of the vSphere Cluster Services, might impact cluster and datastore maintenance workflows in vCenter 7.0 U1.
  9. vSphere client: Cross vCenter migration of a VM fails with an error: "The operation is not allowed in the current state".
  10. VM MGMT: The post customization section of the script runs before the guest customization, if you enable Cloud-Init in a Linux Guest OS.
  11. VM MGMT: Deploying an OVF or OVA template from a URL fails with a 403 Forbidden error and also maybe a local OVF deployment containing files with non-ASCII characters in their name might fail with an error.
  12. VM MGMT: You cannot add or modify an existing network adapter on a virtual machine: 

Although this is just part of the whole story, you should read the full VMware vCenter Server 7.0 U3g release notes about the known issues to truly understand which actions and workarounds are required, or which update you should apply to fix them. I always prefer to execute the CLI way instead of GUI methods:
 

 

Wednesday, August 31, 2022

Which level of authentication?


 Have you ever asked yourself which level of authentication is required to secure infrastructure access? Recently, Multi-Factor Authentication (MFA) solutions have become one of the greatest ways of securing any IT infrastructure and also improving identity management platforms. Regardless of other IT security hardening actions, MFA is the most powerful way to strengthen the organizational users' authentication procedure, especially when you mix it with PAM/IAM solutions. But what information is required to understand and select a suitable solution? First of all, we need to decide exactly which level of authentication fits our network architecture, the types of internal or external incoming authentication requests, and the risk level of the corresponding data/services accessed through this method. Based on each authentication level's considerations, you can decide how many AUTH factors and, finally, which solution is best for you:

  1. SFA: Although single-factor is the easiest way, it's ideal for situations where the user has no option to input his/her credentials into the system. For example, inside an industrial environment where no keyboard may exist for authentication, SFA is the best solution. Most SFA deployments rely on biometric factors like face detection, fingerprinting, or eye scanners. So regardless of SFA's nature, it can be a secure way, especially for OT networks.
  2. 2FA: Two-factor authentication is the most popular method because it can easily block more than 90% of threats like brute-forcing, password guessing, and dictionary attacks. Regardless of the possible limitations of any selected solution, I strongly suggest considering the SSO (Single Sign-On) attributes of your 2FA system. I mean that 2FA should be compatible with regular authentication services like Microsoft Active Directory or any common LDAP service, because that can reduce the general complexity of 2FA/MFA systems. Based on your traditional first level of authentication and also SSO/SAML integration, you can select the suitable option for your infrastructure.
  3. MFA: Increasing the level of AUTH will increase the security measures overall. But you shouldn't ignore inhibiting factors, like the users' knowledge of how to efficiently use each authentication level of the configured MFA system. Next, we should have a failover option at each authentication level: for example, a help desk team that can quickly reset TOTP-paired tokens for problematic users, or a backup solution if a user loses their hardware token or loses his/her smartphone with the software (app) token. Consider mixing hardware/software solutions and using an easy-to-use biometric method like an eye scanner inside the hard-working areas of your organization as a replacement for regular TOTP-based tokens.

 Employing all AUTH levels from the same provider can increase integration and compatibility, but it makes the system vulnerable to unknown architectural bugs like zero-days in that company's products. Combining with a 3rd-party solution increases both the security and the complexity of the design, so you should consider all aspects of MFA system maintenance.

 In conclusion, I think that before constructing the final authentication system, you should check the integration rate of each selected AUTH level with similar solutions, and then consider a backup path in case you lose physical/logical access to the provided tokens.

 Congrats, and enjoy your MFA solution.

Monday, August 22, 2022

Configure ESXi SNMPv3 via PowerCLI

In another post, I described configuring SNMPv3 via the VMware ESXi ESXCLI command line. In this post, I want to combine esxcli with PowerCLI cmdlets to make an automated procedure that gets the corresponding ESXi hosts inside the vSphere environment and sets the required SNMPv3 configuration. If you aren't aware of how to connect to vCenter via PowerCLI, read this first.

 As the initial step, I get a list of the ESXi hosts and put them inside a foreach loop, then call esxcli inside PowerCLI.

 

In the next step, I recommend providing arguments including each field of the ESXi SNMPv3 configs. At last, we can run the set command by invoking the filled arguments. Now the configuration has been applied on each ESXi host selected by the condition (via the Where-Object cmdlet).
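The whole procedure described above can be sketched as follows. This assumes an already established Connect-VIServer session; the cluster name, engine ID, trap target, and auth/privacy values are example placeholders you must replace with your own:

```
# Select connected hosts of an example cluster named 'PROD'
$hosts = Get-VMHost |
    Where-Object { $_.Parent.Name -eq 'PROD' -and $_.ConnectionState -eq 'Connected' }

foreach ($esxHost in $hosts) {
    # Get an esxcli object for this host (V2 interface)
    $esxcli = Get-EsxCli -VMHost $esxHost -V2

    # Fill the arguments object for 'esxcli system snmp set'
    $snmpArgs = $esxcli.system.snmp.set.CreateArgs()
    $snmpArgs.enable         = $true
    $snmpArgs.authentication = 'SHA1'
    $snmpArgs.privacy        = 'AES128'
    $snmpArgs.engineid       = '8000000001020304'   # example engine ID
    # v3 trap target: host@port/remote-user/security-level/trap
    $snmpArgs.v3targets      = 'monitor.example.local@162/v3user/priv/trap'

    # Invoke the set command with the filled arguments
    $esxcli.system.snmp.set.Invoke($snmpArgs)
}
```

The SNMPv3 users themselves (with their auth/priv hashes) still have to be defined, as covered in the earlier ESXCLI post.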

You can also check the accuracy of the result by running esxcli system snmp get through the ESXi shell, or $esxcli.system.snmp.get.Invoke() inside the PowerCLI connection.


Sunday, July 31, 2022

Investigation around the vSphere objects via PowerCLI

There was a missing VM inside the cluster, and we couldn't understand what had happened to it or which ESXi host it belonged to. I should mention this was an enterprise environment that sadly had no logging solution such as vRealize Log Insight (vRLI) or a 3rd-party solution like Splunk. So there was no way of sorting, filtering, and searching between thousands of daily logs, just vSphere itself: the Monitor\Events section. But we couldn't find any cause there, and sadly there was no time to inspect the log files of all the ESXi hosts in this cluster to find out what exactly occurred. However, I guessed there had been a wrong VM renaming, suddenly done by a help desk staff member without announcing it to any vSphere admins (although that is a wrong access definition/grant for them, because we should remove this privilege from their permission list). So I decided to inspect the details via PowerCLI by running the Get-VIEvent cmdlet.

However, this problem prompted me to post some use cases for working with this useful PowerCLI cmdlet. In the following, I will show you some practical examples:

1. As the first sample, you can watch the result of all events at the Warning severity level by running this:
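A minimal sketch of this first sample, assuming an active Connect-VIServer session (the one-day window is just an example):

```
# All Warning-severity events from the last day
Get-VIEvent -Types Warning -Start (Get-Date).AddDays(-1) |
    Select-Object CreatedTime, UserName, FullFormattedMessage
```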

 

 

 

2. In the second example, I ran a slightly more complex filter based on the start time, where the Event Type ID matches 'com.vmware.vc.authorization*'. It can also include an ending date with the -Finish parameter.
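A sketch of this second example; note that Get-VIEvent has no native EventTypeId parameter, so the filtering happens client-side with Where-Object (the seven-day window and sample count are examples):

```
# Authorization-related events since a start date, filtered by EventTypeId
Get-VIEvent -Start (Get-Date).AddDays(-7) -MaxSamples 100000 |
    Where-Object { $_.EventTypeId -like 'com.vmware.vc.authorization*' } |
    Select-Object CreatedTime, FullFormattedMessage
```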



3. As the last one, you can see I ran the command against a cluster object named "CLS", where the log message includes a word like "Vm", and the result is shown in PowerShell GridView.
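A sketch of this last example (the sample count is an example value):

```
# Events of the cluster 'CLS' whose message mentions "Vm", shown in GridView
Get-VIEvent -Entity (Get-Cluster -Name 'CLS') -MaxSamples 50000 |
    Where-Object { $_.FullFormattedMessage -match 'Vm' } |
    Select-Object CreatedTime, UserName, FullFormattedMessage |
    Out-GridView -Title 'CLS VM events'
```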


 

There are many other possible ways of mixing and pipelining cmdlets to get the expected results. It just needs a little patience and an understanding of whatever you want to do. I hope your log management system always keeps you in a good situation.

Saturday, July 16, 2022

Desktop Pool deployment failure factors

 
 Have you ever been in the situation of a sudden virtual desktop deployment failure where you couldn't realize what was happening in your VDI environment and related services? As you may know, the first deployment of most VDI solutions like VMware Horizon View is not a complex task for an expert administrator, but understanding the cause of each issue that stops desktop pool provisioning, and subsequently finding a troubleshooting path, is not easy at all.

 While there are many reasons for VDI deployment to stop, I want to investigate some cases of vDesktop provisioning failures and how to achieve a fast resolution, or a method of bringing the provisioning operation back to its normal mode. This is especially true for an Instant Clone desktop pool: because of its complicated architecture compared with the Full Clone type, we can experience deployment failure, and sadly it's not possible to easily change all the settings of this desktop pool. Modifying the golden image requires maintenance actions and running the publish wizard, so any modification can lead to an unstable state of vDesktop generation. So let's check the most common situations:
  1. Accounts: Generally, two types of credentials are required in the construction of an Instant Clone pool: first, an account for accessing the vSphere infrastructure, which can be part of vCenter SSO or any other connected LDAP repository; second, an AD domain account used to join the OS of the deployed virtual machines to the domain. Modifying either of them in any maintenance interval without informing the VDI admin team may stop vDesktop deployment: VM deployment failure (vCenter account) or VDI error (AD account).
  2. Directories: Renaming the VM folder or changing its hierarchy can cause the loss of the reference VM and then fail the new deployment. Although you can still find the new placement path through the publish wizard, it will break the virtual desktop recover option. In this situation, you should know it's not possible to edit the desktop pool easily; thus, as a good recommendation, first define the VM folder structure and precedence, then create the required desktop pools based on the design. Although it's the same story for AD OU changes, it's easier to set the corresponding OU path inside the desktop pool edit section.
  3. Privileges: There are some necessary permissions for successfully creating a virtual desktop. Part one: vCenter privileges on each level of the vSphere object hierarchy, for automatically deploying the virtual machine inside the cluster and putting it in the corresponding VM folder. Part two: AD privileges for computer object creation inside the considered OU. Both procedures have their own required permissions. Always comply with these notes: do not modify the Horizon Connection Server permissions defined in the vSphere environment, and do not change the granted access for the VDI account that is authorized for domain joining. It's good to create a vSphere role with the required privileges and grant it to the Horizon service account. For AD accounts, do not set a higher administrator level than is required; I think it's enough to delegate control with the required AD permissions for the mentioned account at the corresponding OU level.
  4. Defined Assets: If you change some primary vSphere components that are selected as part of Instant Clone desktop deployment, like the cluster and shared datastores, this action may break the line of new vDesktop generation without the possibility of knowing exactly what happened. Of course, you can investigate inside the detailed Horizon logs to know what's going on, for example by checking this path (C:\Program Files\VMware\VMware View\Server\Broker\Logs), but it's a complicated and time-consuming troubleshooting operation. So, as a good recommendation, define a naming pattern for each type of vSphere object and configure them all before starting the VDI construction.
  5. Name Resolution: Whenever you are using FQDNs instead of IP addresses, changing the naming convention or any of the VDI-related DNS records may lead the Connection Server, Domain Controller, event database server, and vCenter Server to lose each other. The best practices for this area tell us to define all servers preferably by their DNS names. For example, it's not possible to change the defined vCenter Server while a related desktop pool exists (which means never!). Now, if you decide to change the network subnets, it's enough to update the DNS records to resolve the vCenter Server address. However, be careful: if you define an alias name or CNAME record for the Horizon Server definition, never wipe them.

 In this post, I tried to mention some of the most probable failure factors. However, there are a lot of reasons for vDesktop provisioning failure that you may encounter in the future, like virtual machine snapshot issues (I think I should speak about them in another post). Before starting a VDI project, it's highly recommended to construct the server and datacenter virtualization infrastructure carefully, with the power of scalability, to avoid unnecessary changes, especially object renaming, changing directory patterns, and so on.

Thursday, June 23, 2022

vCenter Server suddenly loses all host connectivity

Once again, I'm writing a new post to emphasize the importance of vCenter Server availability as the major critical component of all virtualized environments based on VMware products. Recently we had a strange situation where all ESXi hosts from different datacenters (sites) were suddenly disconnected from the vCenter Server, and the reconnect operation was pending at 0% in the Tasks section. When we wanted to explore the sub-objects inside a datacenter level, the operation was delayed, and removing and re-adding them was not possible. As a result of the investigation activities around this catastrophic issue, we checked all of the following related matters:

  1. DNS configuration on both sides, the VCSA and some disconnected ESXi hosts: everything seemed to work correctly.
  2. Resetting the management agents (vpxa and hostd) had no success, and the problem still existed. Also, all related services in the VCSA, like vpxd, could be restarted successfully.
  3. The VCSA guest OS partitions, especially the /storage/archive directory, had enough free space. (I described how to resolve VCSA service interruptions caused by low disk space in two other posts: Part 1 and Part 2.)
  4. There was no reason to suspect recent infrastructure changes like modified firewall rules, because all ESXi hosts from all clusters in each datacenter were involved in this problem. So the rollback to the latest network device configuration brought no success. Also, running the tcpdump command on both sides and watching the results gave us enough evidence that this issue was not related to the network configuration.
  5. Even restoring a backup to a normal state couldn't resolve the mentioned issue, because after booting the vCenter Server, all hosts were still disconnected. Even trying the actions mentioned in item 2 again never brought us back to the normal situation.

Then in the continuous troubleshooting operations, I decided to deep dive into the vCenter log files, especially checking the log files inside this directory: /storage/log/vmware/vpxd.

There were just some info and warning messages; for example, on every resync between the VCSA and all the hosts belonging to a datacenter, it prompted information like this for each ESXi host: "info vpxd [id] [originator … HeartbeatModuleStart - …] Certificate not available, starting hostsync for host: host-id".

However, we checked all the used certificates and found nothing related. So I decided to go back to my zero state: bring up an older restore point with a better situation and do everything I had done once again, with one more important operation: ignore all current ESXi host time settings (even NTP) and synchronize the hardware clock/system time of a selected host exactly with the currently configured time of the vCenter Server. Then I restarted the vpxa/hostd agents again, and after some moments I saw the ESXi object react in the vSphere Web Client. At that moment, we could run the "connect the host" action completely, because it was no longer pending and even popped up the warning that the vpxuser account was not correct: "cannot complete login due to an incorrect username or password". Finally, the connect wizard could complete easily and the ESXi host stayed in its normal state.
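For reference, the host-side part of that operation can be done from the ESXi shell along these lines; the timestamp below is only an example and must match your vCenter's current time:

```
# Compare the host system time and hardware clock with the vCenter time
esxcli system time get
esxcli hardware clock get

# Set the system time manually to match vCenter (example timestamp)
esxcli system time set -y 2022 -M 6 -d 23 -H 10 -m 30 -s 0

# Then restart the management agents
/etc/init.d/vpxa restart
/etc/init.d/hostd restart
```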

So what was the root cause of this disaster? I still couldn't find it. However, I continue to suspect a MAC duplication/mismatch issue between the vCenter and the ESXi hosts. But there are some essential tips that we should always keep in mind:

  1. vCenter is the core component of vSphere management for VDS, cluster, vSAN, and template objects, and is also the primary connection point for other solutions like NSX, Horizon View, and vRealize, or even 3rd-party solutions like virtual machine backup & replication software. Although they are not completely dependent on the vCenter Server for all operations, any interruption of vCenter can cause them to lose some critical sections of their own management actions. For example, all deployed desktops inside the Horizon View environment are always reachable through their connected Horizon Agent, but for generating new desktops, the Connection Server needs to call vCenter to generate them via its template/VM based on the desktop pool type.
  2. Having a backup system and a scheduled job for vCenter Server protection is not enough for the safety of this primary virtualization component. Even a vCenter HA setup cannot guarantee all aspects of availability (as in our case, where the VCHA solution couldn't help us). So we need to always monitor the whole system, including randomly checking possible warnings inside the VAMI interface, checking the detailed log files inside the shell, configuring Syslog, and inspecting the logs regularly.
  3. Some important configurations are easy, but ignoring them is easier! Like DNS, NTP, Syslog, and so on. Never postpone their configuration to another time, because each one can lead our infrastructure to a sudden interruption. Although some other settings like SNMP are a little complex in comparison, we can always use the benefits of automation. Creating scripts with PowerCLI cmdlets is not easy for all administrators, but it's enough to write them just once and use them forever. If possible based on the provided vSphere license, you can use features like Host Profiles to configure the mentioned settings for all managed ESXi hosts too.
  4. vCenter Server restore points (via VAMI or 3rd-party solutions) must be defined based on vSphere infrastructure change intervals, so we need a reliable change management procedure to keep the backup system aligned with any type of modification: changes like removing a host from a cluster, adding a host to a Distributed vSwitch, changing permissions and credentials, and so on.
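As an illustration of tip 3, the basic DNS/NTP/Syslog settings can be pushed to every connected host with a short PowerCLI script. A sketch, assuming an active Connect-VIServer session; the DNS, NTP, and Syslog addresses are example values:

```
foreach ($esxHost in (Get-VMHost | Where-Object ConnectionState -eq 'Connected')) {
    # DNS servers and search domain
    Get-VMHostNetwork -VMHost $esxHost |
        Set-VMHostNetwork -DnsAddress '192.168.1.10','192.168.1.11' -SearchDomain 'example.local'

    # NTP server plus the ntpd service policy and start
    Add-VMHostNtpServer -VMHost $esxHost -NtpServer 'ntp.example.local' -ErrorAction SilentlyContinue
    Get-VMHostService -VMHost $esxHost | Where-Object Key -eq 'ntpd' | ForEach-Object {
        Set-VMHostService -HostService $_ -Policy On
        Start-VMHostService -HostService $_
    }

    # Remote Syslog target
    Set-VMHostSysLogServer -VMHost $esxHost -SysLogServer 'syslog.example.local' -SysLogServerPort 514
}
```

Write it once, keep it in version control, and rerun it after every host addition.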

Tuesday, June 14, 2022

VMware Horizon View - Deployment Considerations - Part 1

In this part, I want to review two important points of the initial Horizon Connection Servers installation: Choosing the deployment location and the role of the LDAP instance.

 

Friday, April 29, 2022

VSAN Deployment Guidelines - Part 1

In this post, I want to emphasize some of the important points of vSAN design by reviewing the most important VMware vSAN documents, including:

  1. vSAN 7.0 Planning and Deployment
  2. VMware VSAN Design Guide

I tried to cover all aspects of planning and design; however, I may have forgotten to mention some of them. Of course, it would make me happy if you find any and remind me in a reply.

  1. Don't ignore using the same server configuration in the vSAN cluster, including identical hardware resources such as processor model, total memory, and identical disk devices. An unbalanced cluster can lead to storage performance reduction and different maintenance durations for each ESXi host. A uniform setup of the hosts also increases stability.
  2. Consider some extra/spare components like disk devices and network adapters for the vSAN cluster's members, or even a physical server with enough capacity, for rebuilding operations in case of a disk device or host failure.
  3. Always try to keep the storage's used space below 70%. With multiple vSAN cluster rebalancing operations (above 80% usage), line-of-business performance (the apps on the VMs) will be affected.
  4. Don't forget to use vSAN sizing tools and check your server platform and storage device models against the VMware Compatibility Guide (VCG) as a prerequisite step of the vSAN setup.
  5. Build an All-Flash architecture for the vSAN cluster if you have any plans for VDI deployment based on VMware Horizon View, even in the future, especially in scenarios where you will deploy Instant Clones. Enabling the deduplication and compression options is highly recommended for this type of desktop pool to save storage space.
  6. Enable vSAN encryption preferably before deploying VMs on vSAN datastores, to reduce the required time for this operation. Although it's possible to enable encryption for both the capacity and cache tiers, compression applies only to the capacity tier.
  7. Preparing more disk groups with fewer disk devices per group can shrink the fault domain affected by any disk failure. However, in your vSAN planning phase, you should weigh these three metrics carefully and choose whichever has the higher priority in your infrastructure: required total capacity, better performance, and higher fault tolerance.
  8. Using passthrough mode for the vSAN disk devices gives better performance than RAID0 mode configured via the server's local array configuration tools. Also, RAID0 requires more configuration and maintenance actions. Lastly, if you have some non-vSAN disks, do not assign them as RDMs for your VMs.
In this post, I focused on the server and storage considerations of vSAN deployment. In the next part of this series, I will discuss the networking section.

Friday, April 22, 2022

Horizon View: Investigation of all states of a virtual desktop

In this video, I discussed and reviewed all the virtual machine states related to a virtual desktop belonging to a Horizon View desktop pool. I tried to analyze most of them, and also the circumstances of their occurrence.

  

Tuesday, April 19, 2022

A simple review on "Exposing Malware in Linux-Based Multi-Cloud Environments" document


Thanks to VMware TAU (Threat Analysis Unit) for publishing a technical report about ransomware, cryptominer, and RAT attacks and techniques. This document focuses on all the recent critical threats against Linux-based multi-cloud environments, like REvil and Defray777.

I asked one of my technical staff (TW: alirahimi681) to review this very useful document. After reading it deeply, he summarized an easy-to-read draft about two categories of threats in this document: ransomware and cryptominers. Then I decided to publish this briefing on my blog for those who may not read the whole original document.

Ransomware families:

VMware TAU analyzed nine ransomware families and characterized their evolution. They started analyzing the different characteristics of the ransomware samples of each of these by looking at the static information extracted from their ELF files. While threats can be a combination of shell scripts, Python scripts, and binaries, this report focuses on the binaries. Binaries are usually the components that carry out the file system encryption in a ransomware attack.

REvil: Also known as Sodinokibi, originally targeted Windows hosts but released a Linux version in spring 2021. Interestingly, this threat relies on the esxcli tools to stop the current ESXi virtual machines. It then encrypts their on-disk images to prevent the recovery of running VMs. Recently, REvil actors have been targeted by a coordinated take-down operation that may impact future variants.

DarkSide: The actors behind DarkSide initially distributed REvil, operating it as a ransomware-as-a-service (RaaS) offering. This ransomware has been used to target a wide variety of organizations; it initially targeted Windows but quickly evolved to include Linux targets, in particular those running on ESXi servers. These servers are usually targeted after the threat actors gain access to a VMware vCenter deployment, often by means of stolen credentials.

BlackMatter: The actors behind BlackMatter made sure to publicly announce that they were not targeting specific verticals, such as healthcare, oil and gas, government, and critical infrastructure companies, possibly following the backlash that the Colonial Pipeline attack created and the unwanted attention that the DarkSide operators received.

Defray777: Defray ransomware is another Linux-based threat that targets ESXi VMs. An interesting property of some of its samples is that it doesn’t strip or tamper with ELF binaries, which makes them easier to analyze. This ransomware family is closely related to RansomEXX to the point that sometimes the two families are considered to be variations of the same threat.

HelloKitty: The actors behind HelloKitty ransomware have achieved notoriety after successfully attacking CD Projekt Red, the makers of the Cyberpunk 2077 video game. It’s a Windows-based threat that evolved and expanded into the Linux world, targeting Linux-based systems and ESXi servers. Like other samples that target ESXi VMs, HelloKitty uses the esxcli tool to stop the VMs currently running before encrypting their files.

ViceSociety: Their malware shows substantial similarities with the HelloKitty ransomware. This ransomware family was responsible for attacking the United Health Centers in the San Joaquin Valley in California, among other targets, which resulted in the leaking of sensitive patient records.

EreBus: This is a relatively older ransomware family. It initially targeted Windows hosts but evolved in 2016 to include a Linux variant. This threat is unique because of its multilingual nature. While the actors behind the ransomware have stopped their activity, it is still an interesting sample that shares some behaviors with other ransomware families.

GonnaCry: This is an open-source ransomware sample written in C and Python. While the Python version is mostly used as a way to showcase some of the behaviors associated with ransomware, the C version has actually been observed in the wild.

ECh0raix: This ransomware targets QNAP NAS storage devices with weak credentials. This family is written in the Go language, and its features are simpler than other ransomware families.

Cryptominer families:

XMRig: Is an open-source miner available for Windows, Linux and macOS. While this miner is not a threat by itself, a variant of this component is often deployed as part of crypto-jacking attacks to perform the mining. XMRig is often used to mine the Monero cryptocurrency, which is the preferred target because it can be mined without needing specialized hardware.

Sysrv: Is a botnet written in Go with cryptomining capabilities that has recently been deployed against K8s pods running WordPress. This threat attempts to spread across the network by performing password dictionary attacks against vulnerable services and by using a database of exploits for known RCE vulnerabilities.

Team TNT: This threat’s actors target open K8s / Docker environments to deploy XMRig cryptominers. To evade detection, this threat hijacks the library loading mechanism to hide specific directories in the /proc file system, which are associated with the processes running the cryptominers.

Mexalz: Mostly customized versions of XMRig; the actors exploit weak credentials to compromise hosts and deploy cryptomining malware.

Omelette: Is a cryptomining worm that exploits known vulnerabilities in Exim, WebLogic, and Confluence to install modified versions of XMRig. It spreads to other systems by exploiting trust relationships.

WatchDog: Represents one of the longest-running cryptomining operations that is written in Go. The name is derived from the presence of a component that is tasked to monitor the execution of the actual cryptomining program, similar to the Linux utility “watchdog.”

Kinsing: The attack exploits Docker API endpoints that are open to the world to download and install a number of shell scripts that aid in persistence and ultimately lead to the download of a cryptomining component.


Tuesday, April 5, 2022

Many vulnerabilities in many products (Spring4Shell)

 

Multiple vulnerabilities have been announced for many VMware products in the last month, especially in the last two weeks. It's very important to consider them all: investigate the workarounds or install the published patches.

Also, most VMware Tanzu products are vulnerable to the recently announced RCE Spring4Shell (CVE-2022-22965), which affects the Spring Core Java Framework and is described in VMSA-2022-0010.




Tuesday, March 22, 2022

Running ESXCLI commands through PowerCLI

ESXCLI is a set of useful namespaces to get (retrieve) or set (change) ESXi host configuration, and it is truly the successor of the ESXCFG commands in the roadmap of VMware CLI development. In addition, PowerCLI is a perfect option for gaining the benefits of automation. Because it's possible to execute ESXi shell commands through PowerCLI, as a network administrator you may need to run some ESXCLI commands via PowerCLI.

In this post, I explain how to achieve this goal with a sample retrieval operation: getting the current Syslog configuration of all the ESXi hosts. If we wanted to do it via esxcli directly, I would have to open the shell on every ESXi host, one by one, and run this command:

esxcli system syslog config get

Instead, I want to get this for all the ESXi hosts (named like VH*) with a foreach loop. First, I need to create a variable ($MyESXs) that holds the required ESXi hosts:
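As a minimal sketch, assuming the hosts are named VH* and an active Connect-VIServer session already exists:

```powershell
# Collect all ESXi hosts whose names start with VH (assumed naming convention)
$MyESXs = Get-VMHost -Name "VH*"

foreach ($esx in $MyESXs) {
    # Get-EsxCli -V2 exposes the host's esxcli namespaces as a .NET object
    $esxcli = Get-EsxCli -VMHost $esx -V2
    # Equivalent of running "esxcli system syslog config get" on the host
    $esx.Name
    $esxcli.system.syslog.config.get.Invoke()
}
```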

Note: Because the first version of Get-EsxCli is deprecated and will be removed soon, it's recommended to use the second version (the -V2 switch).

Of course, you can also change attributes via this method, but you need to create an arguments object (CreateArgs) before executing the final ESXCLI command. Then you can set any required option, like LogDir, LogHost, LogLevel, and so on, in the following way:
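As a sketch, setting the Syslog options with the V2 interface might look like this; the loghost and logdir values are placeholders:

```powershell
# Build the arguments hashtable for "esxcli system syslog config set"
$arguments = $esxcli.system.syslog.config.set.CreateArgs()

# Fill only the options we want to change; untouched keys stay "Unset"
$arguments.loghost = "udp://10.10.10.50:514"   # placeholder Syslog collector
$arguments.logdir  = "/scratch/log"            # placeholder local log directory

# Apply the configuration, then reload the Syslog daemon so it takes effect
$esxcli.system.syslog.config.set.Invoke($arguments)
$esxcli.system.syslog.reload.Invoke()
```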

If the operation is successful, it returns True. However, you may also be familiar with the old-fashioned method of getting and setting some of the advanced configuration via classic PowerCLI cmdlet pipelining. In my case (Syslog), the cmdlets are the same in both the Windows PowerShell and the Linux pwsh-preview environments.
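A sketch of that pipelining approach, using the Syslog.global.logHost advanced setting (the collector address is a placeholder):

```powershell
# Read the current Syslog target of every VH* host
Get-VMHost -Name "VH*" |
    Get-AdvancedSetting -Name "Syslog.global.logHost" |
    Select-Object Entity, Name, Value

# Change it through the same pipeline (the value is a placeholder)
Get-VMHost -Name "VH*" |
    Get-AdvancedSetting -Name "Syslog.global.logHost" |
    Set-AdvancedSetting -Value "udp://10.10.10.50:514" -Confirm:$false
```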
You can also do this by editing and saving a *.ps1 file (a PowerShell script), but in some circumstances like mine, you need to run it in a Linux environment. So I prefer to create the .ps1 file via the touch command, then edit the generated file via nano or vi and add the required content.

At last, you can easily run the corresponding .ps1 file inside your pwsh-preview (PowerShell on the Linux shell).
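A sketch of that Linux-side workflow; the script name and its content are illustrative only:

```shell
# Create the script file (the post uses touch + nano; a heredoc works non-interactively)
touch get-syslog.ps1
cat > get-syslog.ps1 <<'EOF'
# Illustrative PowerCLI script: dump the Syslog config of all VH* hosts
$MyESXs = Get-VMHost -Name "VH*"
foreach ($esx in $MyESXs) {
    (Get-EsxCli -VMHost $esx -V2).system.syslog.config.get.Invoke()
}
EOF

# Run it from the Linux shell (assumes the powershell-preview package is installed)
# pwsh-preview -File ./get-syslog.ps1
```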

Tuesday, March 8, 2022

PowerCLI on my Ubuntu desktop

 

I think many of us may one day need to deploy and work with the useful VMware PowerCLI cmdlets on our Linux-based workstations. As the first step, you need to install Microsoft PowerShell. So let's do it. Remember, you can gain root access by running sudo -i to avoid typing "sudo" on each command line:

1. Keep the OS up-to-date:

# sudo apt-get update

# sudo apt-get upgrade

Keep in mind you can upgrade to the latest official version of your Ubuntu OS via:

# sudo apt-get dist-upgrade

2. You need curl as the primary prerequisite, so run this if you haven't installed it yet:

# sudo apt-get install curl

3. Then we need to add the corresponding Microsoft Repository to get the PowerShell: 

# sudo curl https://packages.microsoft.com/keys/microsoft.asc | sudo apt-key add -

On newer Ubuntu releases, you may encounter warnings about the deprecation of some commands, such as apt-key. Then add the Microsoft package repository:

 

# sudo curl -o /etc/apt/sources.list.d/microsoft.list https://packages.microsoft.com/config/ubuntu/21.04/prod.list

4. Execute the following commands to install and then run Microsoft PowerShell:

# sudo apt-get install powershell-preview

# sudo pwsh-preview  

5. Finally, you have PowerShell. Now it's time to install the latest VMware PowerCLI and even enjoy some other useful SDKs of VMware products, like Horizon View:

> Install-Module -Name VMware.PowerCLI

> Install-Module -Name VMware.vSphere.Sdk

> Import-Module -Name VMware.VimAutomation.HorizonView

In most cases, you need to choose your response for the VMware CEIP (Customer Experience Improvement Program):

> Set-PowerCLIConfiguration -Scope User -ParticipateInCEIP $true

To list all available scripts about the VMware tools and solutions:

> Get-Module *vmware* -ListAvailable 

Basically, you need to connect to the vCenter server to run other required commands:

> Connect-VIServer -Server 10.10.10.100 -User administrator@vsphere.local -Password YourPass

However, you may see an invalid-certificate error. You can skip the certificate-checking procedure, especially for a self-signed certificate.
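A sketch of skipping the check for the current user (acceptable for a lab with self-signed certificates, not for production):

```powershell
# Tell PowerCLI to ignore invalid (e.g. self-signed) vCenter certificates
# for the current user's configuration scope
Set-PowerCLIConfiguration -Scope User -InvalidCertificateAction Ignore -Confirm:$false
```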


 

Still, it's highly recommended to use a valid certificate in the vSphere environment, and ignoring certificate errors is not a reliable practice from a security point of view, so I think it's better to set this action at least to the warning level:

> Set-PowerCLIConfiguration -InvalidCertificateAction Warn

Now it's possible to use the VMware PowerCLI cmdlets through the Ubuntu terminal. For example, you can get the details of the virtual machines whose names contain "DC":

> Get-VM | Where-Object {$_.Name -like "*DC*"} | fl *

I hope this post can be helpful for you.



Thursday, February 24, 2022

Why do we need to deploy Instant Clones?

It's highly recommended to use the Instant Clone Desktop Pool (IC-DP) rather than the Full Clone Desktop Pool (FC-DP). But why? You may think it's an easy question with complex answers, but I believe it's the other way around: it's not a simple question, yet it can have many easy answers. In many projects, I have seen network administrators assume that creating full clones is a better choice than using instant clones because it's easier to deploy and manage. Regardless of whether this idea is true, I think before creating any type of desktop pool, we should plan carefully to understand which type of desktop is more suitable for our VDI environment and can answer our end users' demands.

First of all, I should mention that, in comparison to the FC-DP, management operations related to the IC-DP, including maintenance jobs, updating procedures, and running them as an orchestrated workflow, are more reliable and more schedulable over the overall support duration. For example, consider a situation where you want to update the Horizon Agent on all the desktops that belong to an FC-DP. Which options do we have?

1. Rebuild the whole desktop pool with an updated template (reference template)? Updating the Agent in the golden image takes the same amount of time whether the image is used for an IC-DP or an FC-DP, but honestly, as you know, the reconstruction of all the desktops takes a long time for an FC-DP (maybe even an hour, while it's a matter of seconds for an IC-DP).
2. Update the existing desktops via scripting? If you run the Agent installation in silent mode through a scheduled script or a domain policy, I can answer yes, it's a good method (I will show how to do it in another post). But keep in mind that it requires careful monitoring, because you need to check the correctness of the script execution, the policy application, and the Machines section of the Horizon management console through all steps of this operation.
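As a rough sketch of the scripted option (the share path and installer file name are assumptions; check the Horizon Agent documentation for the supported silent-install switches of your version):

```powershell
# Illustrative scheduled-task / GPO startup script for a silent Agent update.
# /s /v"/qn" are the standard InstallShield silent switches; the MSI property
# REBOOT=ReallySuppress postpones the reboot so you can schedule it yourself.
$installer = "\\fileserver\deploy\VMware-Horizon-Agent-x86_64.exe"   # assumed share path
Start-Process -FilePath $installer -ArgumentList '/s /v"/qn REBOOT=ReallySuppress"' -Wait

# Afterwards, verify the Agent state in the Machines section of the Horizon console.
```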

Let's get back to the initial question: what will satisfy your company and its end users? I would conclude like this: an FC-DP is simpler to build, but the same is not true for long-term maintenance actions, because an IC-DP is much faster in the rebuilding steps. Of course, I agree that the construction of an IC-DP requires more background preparation phases than an FC-DP. So, I have put together the following checklist based on my experiences in different situations. I think if you consider these points before starting a new VDI project, it can be a great one:

1. Complete the cluster configuration before starting the VDI construction: In some situations, you may still be in the process of building the ESXi cluster and implementing the shared storage. If these are not completed yet, or some hosts are not connected to the corresponding LUNs, don't run the IC-DP deployment on them, because under such circumstances the desktop deployment tasks will fail. You should exclude those hosts as candidates for VM (desktop) placement, or remove them from the cluster entirely, until you fix the SAN connectivity. You can also balance the desktops' resource usage by enabling DRS.

2. Assign flash-based volumes as the VDI storage placement: If there is no SSD datastore inside your data center, never try IC-DP creation. Although the IC-DP deployment procedure involves several objects (Master, Internal Template, Replica, Parent) and is not as simple as the FC-DP approach, you can enjoy very fast provisioning by implementing flash-based disks, because they are the best storage candidate for read-intensive workloads. The good news is that if you need multiple IC-DPs, there is no need to dedicate another VM/snapshot combination to each one, and the Connection Server does not deploy the whole combination of the mentioned internal VMs for every pool (only the Parent is different).

3. Define a procedure to update the DP's machines regularly: One of the important differences between the two types of desktop pools is the way their future maintenance operations are handled, like upgrading and patching the OS, updating the installed antivirus, installing new versions of required software, and so on. Instant clones are the better choice for VDI change-management operations, because based on the IC-DP architecture, you can easily update the golden image and then re-deploy the desktops. (Keep its OS and apps updated while the VM is powered on, then generate a new snapshot and simply replace the old one.)

 

4. Less storage is required: IC-DP is based on VMware's just-in-time provisioning technology (like other VMware products in the JMP stack, such as App Volumes) and shares the virtual disks and memory of its parent VM. So, if you want these advantages in your environment, don't overlook the IC-DP. It's good to know that you can keep and protect the user profiles by using VMware Workspace ONE UEM, or Microsoft SCCM's OSD profile-capturing feature, especially whenever you plan to deploy persistent desktops.

5. Identical organizational requirements, meaning the same workloads and the same applications: Whenever your clients need the same apps and services, and a general perspective has been defined for accessing the organizational resources, IC-DP is the best option, especially for non-persistent desktops that you can refresh without any user considerations. Kiosks, and any other type of OS-less workstation that connects to the network through Horizon, are another use case for IC-DP clients, because generally they are not dedicated to particular users and are public devices in our organizations.

VMware keeps improving the instant clone technology; for example, there no longer seems to be a SID-duplication issue. I checked it on the IC-DP members through Windows PowerShell:

Get-ADComputer -Filter {Name -like '*vdi*'} | fl Name, SID, *guid

However, there are still some considerations and restrictions for instant clone technology in each Horizon View version, and I mentioned some of them above. So keep in mind: always extract the details of the organization's requirements for the considered VDI solution, and then start to build your virtual desktop infrastructure.

 

I will start a new journey soon ...