
Creating an Azure DevOps hosted agent image for VMware


I recently came across an announcement from Microsoft stating that they will remove the free grant for hosted agents:

We are temporarily changing the process to acquire Azure Pipelines free grants to address the increasing abuse of hosted agents. By default, new organizations created in Azure DevOps may no longer get a free grant of concurrent pipelines. New users will have to send an email and provide additional information to get free CI/CD. [1]

I love hosted agents because they include all the tools I could possibly need to build any kind of software. For newcomers who want an Azure DevOps organization that can leverage hosted agents, including concurrent pipelines, this limitation is a bummer. Going through the channels to get that feature unblocked can be overwhelming, and a lot of people will simply move on.

This leaves you with one option if you don’t want to go through all of that: hosting the agents yourself (at least 2, because concurrency means 2 agents/workers!).

Thankfully, the process used to build the hosted agents’ virtual machine images is open source.

Now you may ask, where should you host them? The natural choice is Azure, since the project already does all the work for you there. If you are interested, you can tweak the source to use a Shared Image Gallery instead of dumping the image into an Azure Storage account (the default behavior), but that is a topic for another post.

But what if you want to leverage your current virtualization infrastructure, specifically your VMware virtualization infrastructure?
This is what I want to show you today. By no means am I a VMware pro, but I am using my homelab to get a little more versed with it. This was a perfect opportunity for me to learn more about both sides: VMware and how the hosted agents are built.

In today’s post, I will show you how you can leverage the solution to build a VMware VM image that can be used in your infrastructure. To complete this task, you will need the following:

  • ESXi 7 host
  • VMware vCenter
  • Image of Ubuntu 20.04.1 server (optional)

If you don’t have ESXi or vCenter, you can grab yourself a free trial through VMware.

I decided to build the image on a remote ESXi host because I wanted to leverage the infrastructure I already had. Moreover, by using a remote ESXi host, I could build this project continuously. That would need some tweaks in the provisioning, but it is very doable.

Note: This post relies heavily on Packer concepts. If you do not have any Packer experience, consider reading the documentation to familiarize yourself with the concepts.

TL;DR: if you only want to see the code, it is available on my GitHub repository.

Getting started

The first thing you need to do is grab yourself a copy of the hosted agent project on GitHub.

I am using Windows to perform all the operations unless specified otherwise.

Tools install and configuration

You will need to download/install the following tools to be able to complete the end-to-end process:

  • Packer
  • ovftool
    • Make sure you add it to your PATH. It needs to be discoverable in a shell/command line.
  • govc
  • VMware PowerCLI
    • Easily install it by executing the command Install-Module -Name VMware.PowerCLI

Note: You may get the following error while installing VMware PowerCLI:

The following commands are already available on this system: 'Export-VM,Get-VM,Get-VMHost,Move-VM,New-VM,Remove-VM,Restart-VM,Set-VM,Set-VMHost,Start-VM,Stop-VM,Suspend-VM'. This module 'VMware.VimAutomation.Core' may override the existing commands. If you still want to install this module 'VMware.VimAutomation.Core', use -AllowClobber parameter.

This happens when the Hyper-V module is already installed, because both modules export cmdlets with the same names.

Use the AllowClobber parameter to install VMware PowerCLI:

Install-Module -Name VMware.PowerCLI -AllowClobber

Then, in your profile, import the Hyper-V module with an MS prefix so that its cmdlets no longer collide with the PowerCLI ones. You can do this by editing your PowerShell profile (usually located in C:\users\<your_username>\Documents\PowerShell\Microsoft.PowerShell_profile.ps1) and adding the following line:

Import-Module Hyper-V -Prefix MS

Once the module is imported with a prefix, you use the prefixed cmdlets (for example, Get-MSVM instead of Get-VM) to interact with Hyper-V VMs.

VMware

To be able to build on the host, Packer needs to connect to the ESXi host on which it will run the operations. For this, it needs SSH. Enable SSH on the ESXi host you plan on using.

GuestIPHack also needs to be enabled. This allows Packer to infer the IP address of the guest VM via ARP packet inspection.
SSH into the ESXi host and run the command below.

esxcli system settings advanced set -o /Net/GuestIPHack -i 1

We do not need to tweak the firewall for VNC on ESXi 7, as we’ll use a WebSocket to connect to VNC. William Lam has an article about this on his blog.

Let’s code!

In this section, we will start to create the necessary files and code to be able to provision the image on the ESXi host.

We can start by copying the file ubuntu2004.json, located in the images/linux directory, so that we have a base to work from when it comes to the provisioners. In my case, I called the copy ubuntu2004-esxi.json.

Remove everything Azure related

We need to clean up everything that is Azure related, including the builder.

Remove the builder of type azure-arm. Remove all the variables that were used with this builder. They will be replaced with ESXi/vCenter specific variables later on.

We also need to remove anything that relates to the Microsoft Azure Linux Agent (waagent).

The first thing to do is remove the last shell provisioner, which generalizes the Linux image on Azure.

Then, in the file images/linux/scripts/installers/configure-environment.sh, remove anything that relates to waagent.

Converting the Packer file to new format

Packer version 1.5.0 introduced support for HCL2 templates as a beta feature. As of version 1.7.0, HCL2 support is no longer in beta and is the preferred way to write Packer configurations. [1]

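To convert the JSON (v1) configuration file to the HCL2 format, run the following command from the images/linux directory:

packer hcl2_upgrade -output-file=ubuntu2004-esxi.pkr.hcl ubuntu2004-esxi.json

You can now delete the former JSON configuration file, ubuntu2004-esxi.json.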

Variables

We need some variables to be able to configure our image dynamically. The following variables are required for the process.

Variable                         Description
builder_host                     The ESXi host IP/FQDN where the VM image will be created for processing
builder_host_username            The ESXi host username used to log in over SSH
builder_host_password            The ESXi host password used to log in over SSH
builder_host_datastore           The ESXi host datastore to use for caching, images, etc.
builder_host_portgroup           The ESXi host port group (network name)
builder_host_output_dir          The ESXi host datastore folder where the files of the VMs being built will be stored
dockerhub_login                  A Docker Hub username. Used to counter the rate limit imposed by Docker
dockerhub_password               The Docker Hub user’s password
iso_local_path                   The path to the Ubuntu ISO locally (or on a network share)
iso_checksum                     The checksum of the Ubuntu ISO
numvcpus                         The number of vCPUs to use. Defaults to 4
ramsize                          The amount of RAM to use, in MB. Defaults to 16384 (16 GB)
vm_name                          The name of the VM that will be created on the ESXi host
ovftool_deploy_vcenter           The vCenter IP/FQDN
ovftool_deploy_vcenter_username  The vCenter username
ovftool_deploy_vcenter_password  The vCenter password

Adding the VMware builder

The builder I used to create the image is the vmware-iso builder. The boot command was derived from the beautiful community templates available.

As of 20.04.2, Ubuntu Server is only distributed in its live-server flavor, so the legacy-server flavor is no longer available for newer point releases. In this post, I am using the legacy-server flavor of 20.04.1. If you want to use the live-server flavor, the boot command needs to change.

Below is the builder that I’ve set up.

I won’t go into detail on all the options set, as they can be found in the documentation of the builder, but there are a few things I would like to point out.

The iso_urls property starts with a local path for the Ubuntu ISO. Packer caches the image on the ESXi host datastore so that it does not have to be fetched again on every build.

The iso_checksum property takes the SHA-256 checksum of the Ubuntu ISO. You can find it in the SHA256SUMS file on the Ubuntu page where you download the image.
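Before handing the checksum to Packer, it can be worth confirming that the ISO you have on disk actually matches the SHA256SUMS entry. The following is a minimal sketch in Python, not part of the original project; the function names are my own, and it assumes the usual SHA256SUMS layout of one digest and one file name per line.

```python
import hashlib


def sha256_of_file(path, chunk_size=1024 * 1024):
    """Return the hex SHA-256 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def expected_checksum(sha256sums_text, iso_name):
    """Find the digest listed for iso_name in the text of a SHA256SUMS file."""
    for line in sha256sums_text.splitlines():
        parts = line.split()
        # Assumed layout: "<digest> *<file name>"; the "*" marks binary mode.
        if len(parts) == 2 and parts[1].lstrip("*") == iso_name:
            return parts[0].lower()
    return None
```

If sha256_of_file(iso_path) equals expected_checksum(text, iso_name), that digest is the value to use for iso_checksum.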

The builder creates a local HTTP server that hosts the preseed configuration (preseed.cfg) for an unattended install. The preseed file answers the questions you would normally answer in the installer UI.

The remote_output_directory property makes Packer store the VM it creates on the datastore in the specified directory, /build/<image_version>, so that if you have many builds running in parallel, they are all grouped in the same place.

Don’t change the SSH username and password. Once the VM has been created on the host, you connect to it with SSH keys that are set up through cloud-init, and the account will have password authentication locked. If you do change the username and password, please update preseed.cfg accordingly.

This one made me lose a few hours. In Azure, waagent adds the provisioned user to the sudoers file, allowing it to sudo with no password. In our case, this is not done automatically. Usually, if the provisioner needs root privileges, you could pass the password to sudo in the execute_command property (like echo {{ user `ssh_password` }} | sudo -S sh -c '{{ .Vars }} {{ .Path }}'). However, if you have scripts that need to start as the regular user (like homebrew) but use the sudo command internally, you are kind of screwed. There is one way to circumvent this (SUDO_ASKPASS), but going to Rome to go back to Canada ain’t a good way. Below I will shed some more light on how to get this resolved.

Other necessary files

We need other files for the provisioning to properly succeed.

Preseed

The first one, as mentioned above, is the preseed.cfg file. This file is necessary to set up the system in an unattended way. It uses the Debian installer syntax. More info can be found here.

Remember when I said above that we needed to set up the sudoers file so that the user could sudo without a password? This is done, as you can see, at the end of the file.

This file is saved in the images/linux/http folder. The content of the http folder will be served by Packer’s HTTP server when provisioning.

Extra installers

These installers add extra components to the image. They are saved in the images/linux/scripts/installers folder.

vsts agent

This installs the Azure DevOps agent. We want to have it in the image so that we only have to configure it after provisioning, not install it manually.

What I’ve done here is get the latest official (non-preview) release of the agent and save it into /agent.
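To illustrate the "latest official, non-preview release" selection, here is a small Python sketch. It is not the installer script from this post; it assumes the agent releases are listed through the GitHub REST API for the microsoft/azure-pipelines-agent repository, and the function names are hypothetical.

```python
import json
import urllib.request

# Assumption: the agent's releases are published on this GitHub repository.
RELEASES_URL = "https://api.github.com/repos/microsoft/azure-pipelines-agent/releases"


def fetch_releases(url=RELEASES_URL):
    """Download the release list as parsed JSON (requires network access)."""
    with urllib.request.urlopen(url) as response:
        return json.load(response)


def latest_stable_tag(releases):
    """Return the tag of the newest release that is neither a draft nor a prerelease."""
    stable = [r for r in releases if not r.get("prerelease") and not r.get("draft")]
    if not stable:
        return None
    # ISO 8601 timestamps in the same time zone sort correctly as plain strings.
    newest = max(stable, key=lambda r: r["published_at"])
    return newest["tag_name"]
```

Calling latest_stable_tag(fetch_releases()) gives the tag to download; the selection logic is kept separate from the network call so it can be checked offline.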

vmware tools

The VMware Tools are necessary for cloud-init.

Note that I modified the tools section of the PowerShell tests to include VMware in it, thus the last line.

cloud-init

Cloud-init will be used to provision the system upon the first boot. This is necessary to perform certain operations, such as setting the SSH keys for the agent user, configuring the VSTS agent, and so on. It can even be used to configure other things. More on that later.

vCenter specific

If your ESXi host is managed by vCenter, the built VM will still be lingering in the inventory once the provisioning is done. You need to unregister it. This is done using a PowerShell script that is called in a post-processor of the configuration template.

The script was adapted from a community template repository and is saved in the images/linux/scripts/esxi folder.

Post processor

Building the image

To build the image, you can use either the -var parameter, for the user parameters, or a var file (-var-file) to pass to Packer. I opted for a variable file. As such my command looked something like this:

packer build -var-file="path/to/ado-agent-packer-vars.json" ./images/linux/ubuntu2004-esxi.pkr.hcl
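As an illustration of the variable file approach, the sketch below writes a JSON variable file and assembles the same packer build command. It is an assumption-laden example: the values are placeholders, only a subset of the variables is shown, and the helper names are my own.

```python
import json


def write_var_file(path, values):
    """Write Packer variable values to a JSON variable file."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(values, handle, indent=2)


def packer_build_command(var_file, template):
    """Build the argument list for `packer build`, ready for subprocess.run."""
    return ["packer", "build", f"-var-file={var_file}", template]


# Placeholder values only; the keys are variables described in the Variables section.
example_values = {
    "builder_host": "esxi.example.local",
    "builder_host_username": "root",
    "builder_host_datastore": "datastore1",
    "builder_host_portgroup": "VM Network",
    "vm_name": "ado-agent-ubuntu2004",
}
```

Keeping passwords out of a file that may end up in source control is a good reason to pass those with -var or environment variables instead.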

End to end, the build took about 3 hours and 15 minutes.

Cloud-init

As mentioned above, we now have a fully baked image with all the tools needed to build any code, like the hosted agents in Azure DevOps.
But if we were to use the OVF as-is, we would have to log in to the machine and do a bunch of manual configuration. This is not ideal if you want to create multiple agents from the same image with slightly different configurations (hostname!).

We can do this using cloud-init. I won’t go into what cloud-init is, as I believe it’s a bit out of scope for building the image, but I added this section as a how-to for myself, given the struggle I went through to get it to work.

To use cloud-init you need a user-data file and, optionally, a meta-data file. A user-data file basically describes everything you would otherwise do by logging in to the VM to finish configuring it to match your governance.

You can find the user-data documentation here.

For cloud-init to pick up the configuration files (or just the user-data) on VMware, it uses a datasource plugin, OVF, which reads the properties either from the VM’s extraconfig data (in the VMX) or from a customizable vApp’s properties data.

William Lam posted a great explanation of how those properties are passed to the VM:

Basically, there are two ways in which you can provide “custom” key/value pairs to a VM which can then be consumed from within the GuestOS

1. vApp and that constructs the payload in XML form and there’s only a single key called guestinfo.ovfEnv and you’ll need to parse this out. This is how VMware’s OVA/OVF are built and consumed

2. You can define individual key/value using vSphere API by simply updating the ExtraConfig property of a VM which is really just an array, see https://github.com/lamw/vghetto-scripts/blob/master/powershell/GuestInfo.ps1 for example using PowerCLI. This ultimately will manifest itself as key/value pairs within the VMX file (that’s another way but do not recommend manually tweaking it since we’ve got an API do help with that)

If cloud-init expects to parse specific key/value, then option (2) maybe what you need rather than the cleaner option using vApp. The one upside to (2) is that it works with standalone ESXi hosts where as vApp option is a VC feature and requires that you have vCenter Server

I chose to use the vApp options. That required a little bit of ninja work. William does it with code in his repo, but I did it manually. It could definitely be automated.

Tweaking the OVF for vApp options

If you change the OVF file, you need to remove the .mf file, because the checksums will no longer match.

Replace the tag <VirtualHardwareSection> with <VirtualHardwareSection ovf:transport="com.vmware.guestInfo">. This enables the VMware Tools transport for the vApp options.

In the VirtualSystem node, after the VirtualHardwareSection node, add the following. Be sure to choose between seedfrom and user-data, as the former supersedes the latter.

Replace the {{Image Version}} with the version of the image. I like the same versioning scheme as Microsoft, that is YYYYMMDD.BUILD, e.g. 20210411.1.
You are welcome to set the Vendor/ProductUrl/VendorUrl nodes values.
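The transport change and the manifest removal are mechanical, so they are easy to script. Here is a minimal Python sketch of those two steps plus the YYYYMMDD.BUILD version string. It assumes the manifest sits next to the OVF with the same base name and that the VirtualHardwareSection tag has no attributes yet; the function names are hypothetical, and it does not add the vApp properties themselves.

```python
from datetime import date
from pathlib import Path

TRANSPORT_ATTR = 'ovf:transport="com.vmware.guestInfo"'


def enable_guestinfo_transport(ovf_text):
    """Add the guestInfo transport attribute to the VirtualHardwareSection tag."""
    if TRANSPORT_ATTR in ovf_text:
        return ovf_text  # already patched, nothing to do
    return ovf_text.replace(
        "<VirtualHardwareSection>",
        f"<VirtualHardwareSection {TRANSPORT_ATTR}>",
        1,
    )


def patch_ovf(ovf_path):
    """Patch the OVF in place and delete its manifest, whose checksums no longer match."""
    ovf_path = Path(ovf_path)
    text = ovf_path.read_text(encoding="utf-8")
    ovf_path.write_text(enable_guestinfo_transport(text), encoding="utf-8")
    manifest = ovf_path.with_suffix(".mf")  # assumption: same base name as the OVF
    if manifest.exists():
        manifest.unlink()


def image_version(build_number, day=None):
    """Format a version as YYYYMMDD.BUILD, using today's date by default."""
    day = day or date.today()
    return f"{day:%Y%m%d}.{build_number}"
```

A plain string replacement is used instead of an XML parser on purpose: it leaves the namespace prefixes and the rest of the file byte-for-byte as ovftool wrote them.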

Save your OVF file then convert this OVF to OVA. In a shell/command line, run the following:

ovftool <ovf file> <ova file>

Setup for cloud-init

Setup govc

We now need an options template to pass when we create our VM. To extract that template, we use govc. govc needs to be able to connect to vCenter, so make sure the following environment variables are set:

Once set, you can test that it can connect properly: govc about
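If govc about fails, the first thing to rule out is a missing variable. The sketch below checks for the connection variables govc reads from the environment; I am assuming GOVC_URL, GOVC_USERNAME and GOVC_PASSWORD are the ones you rely on (GOVC_INSECURE=1 is additionally useful with self-signed certificates). The helper name is hypothetical.

```python
import os

# Assumption: these are the govc connection variables used in this setup.
REQUIRED_GOVC_VARS = ("GOVC_URL", "GOVC_USERNAME", "GOVC_PASSWORD")


def missing_govc_vars(environ=os.environ):
    """Return the names of required govc variables that are unset or empty."""
    return [name for name in REQUIRED_GOVC_VARS if not environ.get(name)]
```

An empty list means the variables are present; it does not prove the credentials are valid, which is what govc about is for.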

Extract config template

To extract the configuration options template, you can use the command

govc import.spec <ovf file> > ubuntu.json

Create user-data

Here is an example of user-data you can use. Again, customize it to your liking. This is what cloud-init is for.

One thing that is missing in this cloud-config is the command to setup the VSTS (Azure DevOps) agent. You can add a runcmd section to the cloud-config with the following if you wish to configure it in an unattended way on first boot.
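As a rough idea of what such an unattended configuration step involves, the sketch below assembles a config.sh command line for the agent stored in /agent. It is an illustration under assumptions, not the runcmd from this post: the organization URL, token, pool and agent name are placeholders you supply, and the helper name is hypothetical. Keep in mind that config.sh is meant to be run as a regular user, not as root, and that a personal access token placed in user-data is sensitive.

```python
import shlex


def agent_config_command(org_url, pat, pool, agent_name, agent_dir="/agent"):
    """Return a shell command that configures the agent without prompting."""
    args = [
        f"{agent_dir}/config.sh",
        "--unattended",
        "--url", org_url,
        "--auth", "pat",
        "--token", pat,
        "--pool", pool,
        "--agent", agent_name,
        "--acceptTeeEula",
    ]
    # Quote every argument so values containing spaces survive the shell.
    return " ".join(shlex.quote(arg) for arg in args)
```

The resulting string can be used as one entry of a runcmd list, wrapped in whatever mechanism you use to run it as the agent user.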

Convert user-data to base64

You could pass the data as-is to the vApp option, but it’s better to base64-encode it to avoid problems with carriage returns, line feeds and other special characters. In a shell, execute

base64 -w 0 /path/to/user-data
Here, I used my WSL2 shell to convert to base64.
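If you do not have WSL or a Linux shell at hand, the same single-line encoding can be produced with a few lines of Python. This is a minimal sketch; the function name is my own.

```python
import base64


def encode_user_data(path):
    """Return the file's contents as a single-line base64 string (like base64 -w 0)."""
    with open(path, "rb") as handle:
        return base64.b64encode(handle.read()).decode("ascii")
```

Reading the file in binary mode keeps the line endings exactly as they are on disk, so the VM decodes the same bytes you wrote.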

Set the values in the template

Set the value of the user-data key to the value generated in the last step. Also fill in the instance-id and hostname keys.

Fill in the Name property value along with the Network value in the NetworkMapping section.
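Filling in the template can also be scripted. The sketch below assumes the extracted ubuntu.json has a PropertyMapping list of Key/Value entries and a NetworkMapping list of Name/Network entries, which is the layout I would expect from govc import.spec; check it against your own file. The function name is hypothetical.

```python
import json


def fill_import_spec(spec, vm_name, network, properties):
    """Return a copy of the import spec with the name, network and properties set."""
    filled = json.loads(json.dumps(spec))  # simple deep copy of JSON-compatible data
    filled["Name"] = vm_name
    for mapping in filled.get("NetworkMapping", []):
        mapping["Network"] = network
    for prop in filled.get("PropertyMapping", []):
        # Only keys already present in the template are updated.
        if prop["Key"] in properties:
            prop["Value"] = properties[prop["Key"]]
    return filled
```

Load ubuntu.json with json.load, pass a dictionary such as {"hostname": ..., "instance-id": ..., "user-data": ...} as properties, and write the result back with json.dump before importing the OVA.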

Create the Virtual Machine

Create the virtual machine using the command

govc import.ova -options="ubuntu.json" <ova file> 

Power on the virtual machine and clean up

You can start the virtual machine in vCenter or using govc: govc vm.power -on=true <VM Name>. Once it is on, cloud-init will start and provision the VM as you want it to be.

Once the provisioning is done, shut down the VM (if that hasn’t already been done by a cloud-init directive) and clear the vApp option values, as they may contain sensitive data.

Conclusion

As you can see, we can automate building the agent image for VMware ESXi. Note that cloud-init’s network configuration does not work out of the box with VMware, for instance to set static IPs. You will need to override some settings manually; if you want to know how, a quick search will turn up the answer.