vTPM & Native Key Provider

I had a question recently about vTPMs for VMs. The use case also raised questions about vSAN disk encryption. Here are the bullet points from my documentation reading and discussions with others. I’m reasonably certain that all of this information is correct; if I got anything wrong, put it in the comments and I’ll address it.

vTPM overview:

  • vTPM information is stored in the .nvram file for the VM. This file, and a few others, are encrypted with VM Encryption. Encrypting the entire VM (VMDKs) is not required. If you are protecting VMs with a vTPM, be certain your backup methods also capture the .nvram file; otherwise the recovered VM will not boot.
  • vSphere 8 added a vTPM provision policy. This allows vTPM devices to be replaced during clone/deployment. In vSphere 7 the vTPM is cloned along with the VM resulting in identical secrets. To replace the vTPM in vSphere 7 the vTPM must be removed and re-added to replace the secrets. Changing the vTPM will have an impact on any guest functions that rely on the secrets.
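
The .nvram point above can be spot-checked when backups are available as file-level copies. The following is a minimal sketch, assuming a hypothetical layout with one folder per VM under a backup root; the folder and file names are made up for illustration, and image-level backup products may not expose files this way.

```shell
# Hypothetical backup layout: one folder per VM under sample_backup.
backup_root=sample_backup
mkdir -p "$backup_root/vm-a" "$backup_root/vm-b"
touch "$backup_root/vm-a/vm-a.vmx" "$backup_root/vm-a/vm-a.nvram"
touch "$backup_root/vm-b/vm-b.vmx"   # vm-b deliberately lacks a .nvram file

# Record every VM folder that has a .vmx file but no .nvram file.
: > sample_missing.txt
for vmx in "$backup_root"/*/*.vmx; do
  dir=$(dirname "$vmx")
  # ls fails when the glob matches nothing, i.e. no .nvram file is present.
  if ! ls "$dir"/*.nvram >/dev/null 2>&1; then
    echo "$dir" >> sample_missing.txt
  fi
done

cat sample_missing.txt
```

Any folder listed in sample_missing.txt would be a candidate for a closer look before relying on that backup for a vTPM-protected VM.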

Performance:

  • There is no observed performance difference between the Native Key Provider (NKP) and an external provider. There is a chance that a remote KMIP server would be slower due to latency and external calls. The NKP is sufficient to handle anything that is within the vCenter maximums.

Key Backups:

  • vSphere Native Key Provider is backed up as part of the vCenter Server file-based backup. One manual backup needs to be performed on the vCenter or you will see a warning in the console every 24 hours.
  • You can also back up just the NKP data separately as a PKCS #12 file.
  • Keeping a separate copy of the NKP is advised just in case the vCenter cannot be restored.

Host integration:

  • If you have a TPM 2.0 chip on the ESXi hosts, the Key Derivation Key (KDK) from the Key Provider is stored on the hosts and sealed with the physical TPM chip. This allows use of encrypted datastores if the Key Provider is offline (in 7.0 U2 and later). This is HIGHLY recommended to prevent circular dependencies.

ELM & Cross-vCenter vMotion:

  • The Native Key Provider data is NOT shared when vCenters are in Enhanced Linked Mode. If you want to use the same key provider for the linked vCenters, it must be imported (see resources section).
  • Guest VMs that are migrated across will still need access to the original Key Provider to unlock the vTPM files at guest OS boot.
  • VMs with vTPM will still be configured with their original key provider after migration.

vSAN:

  • vSAN has two levels of encryption: data-in-transit and data-at-rest.
    • Data-in-transit encrypts the transmission of data between the hosts.
    • Data-at-rest encrypts the data when it is written to disk.
  • If you encrypt the vSAN datastore you do not need to use VM encryption. Likewise, if you encrypt the VM you do not need to encrypt the datastore. Encrypting the VMs will likely have a negative impact on deduplication and compression ratios.
  • vSAN datastore encryption requires a disk format change. Like other disk format changes this is done in a rolling fashion through each disk group in the cluster.
    • To speed up this process you can allow reduced redundancy during the reformat. This is similar to maintenance mode with the ensure accessibility option.
  • With vSAN 8.0 ESA, encryption must be enabled when the cluster is created and cannot be turned off.
  • Data at rest is encrypted after all other processing, such as deduplication, is performed.
  • VM storage policies on stretched clusters should be evaluated before encryption is enabled with the ‘Allow reduced redundancy’ option. If Secondary (site) failures to tolerate (SFTT) is 0, read operations will happen on the secondary object, which may increase latency. This is not a concern if SFTT is greater than 0 or the ‘Allow reduced redundancy’ option is not selected.

Swapping Key Providers:

  • Changing key providers is a straightforward process and involves a ‘shallow’ re-key. This changes the Key Encryption Key and can be performed without guest downtime.
  • Swapping can happen between any supported key providers.

Horizon with Windows 11 and vTPM:

  • Golden image must not have a vTPM. Horizon should add a vTPM in the provisioning settings.
    • Provisioning a Win11 machine without a vTPM can be done with WinPE or MS Deployment Toolkit
  • The golden image must have Microsoft Virtualization Based Security (VBS) enabled.
  • Instant Clone Mode B, where no parent VMs exist, is not supported with vTPM-enabled machines. Support is planned for a future Horizon release and will require vSphere 7.0 U3f or later.

Resources:

Unassociated vSAN objects

Since mid-2017 I have been aware of an issue with VMware Horizon where files are left behind when a VM is deleted. When Horizon creates a new machine with the same name, a new folder is created with _1 appended to the end (or _2, _3, … if the machine has been deleted multiple times). It seems this has been an issue with Horizon since v6.0, and VMware has a KB article with a workaround (KB2108928).

That workaround isn’t great; it is manual, but it works in a traditional storage environment. An admin would console/SSH into an ESXi host, issue an rm -f command on the offending folders, and be done. My virtual desktop VMs reside on a vSAN. Within a vSAN all the folders and vmdk files are objects; if I were to rm a folder on a vSAN datastore it would not delete the underlying objects, and they would still consume space on the disks. The rm command is not vSAN aware.

I could load up the vSphere web console and delete the directories individually. I could even use the HTML5 interface and select multiple folders for deletion simultaneously. In either case I need to check each individual folder to verify it is no longer in use.

There has to be a better way.

 

Thankfully the vSAN engineers have a command that will list the status of every object stored on the vSAN. This command is aware of which VM is associated with each object. Since the source VM has been deleted, the remaining objects will be unassociated. Be careful with unassociated objects: any template, ISO, txt, ova, or other file that you have placed on your vsanDatastore and that is not mounted or in use by a VM will also be in the unassociated object list.

To get this list, log in to RVC on your vCenter appliance and execute:

vsan.obj_status_report . --print-uuids --print-table

(see https://blogs.vmware.com/vsphere/2014/07/managing-vsan-ruby-vsphere-console.html for info about RVC)

Initial header output:

[Image: 20180713-vsan-obj_status_report-header]

Unassociated objects are below the list of VMs:

[Image: 20180713-vsan-obj_status_report-unassociated]

We copy the list of unassociated objects into the clipboard.

 

Then on an ESXi host in the vSAN cluster we create a new file with vi and paste the contents in and save as unassociated.txt.

(I have not found a more elegant way of doing this; please let me know in the comments if you do. The esxcli vsan namespace commands are not aware of object and VM association.)

We now have a file that has your object UUIDs and some display artifacts.

We do some text processing to remove those artifacts:

cat unassociated.txt | awk '{print $2}' > UUID.txt
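
Because awk '{print $2}' prints the second field of every pasted line, header and border rows can leave non-UUID text in the output. Here is a minimal sketch of a stricter variant, assuming the UUIDs follow the usual 8-4-4-4-12 hexadecimal layout. The sample rows and the sample_ file names are made up for illustration and are not real RVC output.

```shell
# Hypothetical sample of pasted table rows (not real RVC output).
cat > sample_unassociated.txt <<'EOF'
+--------------------------------------+
| Unassociated objects                 |
+--------------------------------------+
| 0a1b2c3d-4e5f-6a7b-8c9d-0e1f2a3b4c5d |
| 11223344-5566-7788-99aa-bbccddeeff00 |
+--------------------------------------+
EOF

# Take the second whitespace-separated field, then keep only lines that
# match the assumed UUID layout in full (-x means whole-line match).
awk '{print $2}' sample_unassociated.txt \
  | grep -Ex '[0-9a-f]{8}(-[0-9a-f]{4}){3}-[0-9a-f]{12}' > sample_UUID.txt

cat sample_UUID.txt
```

The extra grep drops blank lines and header words, so only lines that look like UUIDs reach the next step.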

 

Now we have a file with just the UUID of the unassociated objects.

Time to translate the UUIDs into something we can use to filter and narrow the list to just the objects we would like to remove. We use the objtool command, loop through UUID.txt to get the metadata on each object, and output that to another file:

while read -r UUID ; do echo -e "\nUUID: $UUID" ; /usr/lib/vmware/osfs/bin/objtool getAttr -u "$UUID" | grep -i 'friend\|class\|path'; done < UUID.txt > uuid_status.txt

 

uuid_status.txt now contains four lines per object: UUID, Friendly name, Object Class, and Path:

[Image: 20180713-uuid_status.txt]

That’s not too useful if we want to filter this in a meaningful way.

Let’s make a csv we can ingest into something else (or filter further):

awk '/UUID: /{if (x)print x;x="";}{x=(!x)?$0:x","$0;}END{print x;}' uuid_status.txt |sed 's/UUID: //g' |sed 's/User friendly name://g' |sed 's/Object class: //g' | sed 's/Object path: //g' | sed 's/,$//g' > uuid_status.csv

Thanks to http://www.theunixschool.com/2012/05/awk-join-or-merge-lines-on-finding.html Example 4 for the awk command.

This will create a csv file with each object taking up one line.
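
As an alternative sketch (not the command used above), awk's paragraph mode can do the same join in one step, assuming each object's lines are separated from the next by a blank line and each line starts with a label ending in a colon. The sample data and sample_ file names below are hypothetical, modeled on the four-lines-per-object layout.

```shell
# Hypothetical sample in the four-lines-per-object layout.
cat > sample_status.txt <<'EOF'

UUID: 0a1b2c3d-4e5f-6a7b-8c9d-0e1f2a3b4c5d
User friendly name:VDI-001
Object class: vmnamespace
Object path: /vmfs/volumes/vsan:52aa/VDI-001

UUID: 11223344-5566-7788-99aa-bbccddeeff00
User friendly name:VDI-001.vmdk
Object class: vdisk
Object path: /vmfs/volumes/vsan:52aa/VDI-001/VDI-001.vmdk
EOF

# RS="" turns each blank-line-separated block into one record and FS="\n"
# turns each line into a field. sub() strips the label up to the first colon
# (plus one optional space); modifying a field rejoins the record with OFS.
awk 'BEGIN { RS = ""; FS = "\n"; OFS = "," }
     { for (i = 1; i <= NF; i++) sub(/^[^:]*: ?/, "", $i); print }' \
  sample_status.txt > sample_status.csv

cat sample_status.csv
```

Only the first colon on each line is treated as the label separator, so colons inside the object path are left alone.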

Further filtering needs to be done to only have the UUIDs of objects we truly wish to delete.

 

Positive match filtering:

Since all my VMs are created by Horizon they follow a naming pattern, and the friendly name and path are both based on the VM name, so I have an easy job filtering.

In my case the objects all contain VDI- at the beginning of the namespace or the filename:

grep ',VDI-\|/VDI-' uuid_status.csv > uuid_to_delete.csv

 

Negative match filtering:

If I didn’t have the luxury of positive match filtering I would have to generate my list based on exclusionary patterns.

For example, my App Volumes vmdks are unassociated, so I would filter out the appVolumes folder as well as the apps & writable template folders. If I had an ISO folder I could exclude it. And I would always want to exclude the .vsan.stats object:

grep -v appVolumes uuid_status.csv | grep -v _templates | grep -v ISOs | grep -v vsan.stats > uuid_to_delete.csv

 

(Note: I’m not including the leading / for folder names. If you use “/foldername” as your grep filter it will not match the namespace object. Deleting the namespace object removes vCenter’s access to the object.)

To be more confident combine both positive and negative filtering:

grep ',VDI-\|/VDI-' uuid_status.csv | grep -v appVolumes | grep -v _templates | grep -v ISOs | grep -v vsan.stats > uuid_to_delete.csv
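
To see why combining the two filters is safer, here is a small sketch on made-up data. The UUIDs, object classes, and paths are hypothetical; the last row matches the VDI- pattern but lives under appVolumes, so only the negative filter keeps it off the delete list.

```shell
# Hypothetical CSV rows: UUID, friendly name, object class, path.
cat > sample_filter.csv <<'EOF'
0a1b2c3d-4e5f-6a7b-8c9d-0e1f2a3b4c5d,VDI-001,vmnamespace,/vmfs/volumes/vsan:52aa/VDI-001
11223344-5566-7788-99aa-bbccddeeff00,VDI-001.vmdk,vdisk,/vmfs/volumes/vsan:52aa/VDI-001/VDI-001.vmdk
22334455-6677-8899-aabb-ccddeeff0011,appVolumes,vmnamespace,/vmfs/volumes/vsan:52aa/appVolumes
33445566-7788-99aa-bbcc-ddeeff001122,.vsan.stats,vmnamespace,/vmfs/volumes/vsan:52aa/.vsan.stats
44556677-8899-aabb-ccdd-eeff00112233,VDI-apps.vmdk,vdisk,/vmfs/volumes/vsan:52aa/appVolumes/VDI-apps.vmdk
EOF

# Positive match first, then drop anything that hits an exclusion pattern.
grep -E ',VDI-|/VDI-' sample_filter.csv \
  | grep -v -e appVolumes -e _templates -e ISOs -e 'vsan\.stats' \
  > sample_to_delete.csv

# Quick sanity check: how many rows survived the filters?
echo "kept $(wc -l < sample_to_delete.csv) of $(wc -l < sample_filter.csv) objects"
cat sample_to_delete.csv
```

Comparing the kept count against the total is a cheap way to notice a filter that matched far more, or far less, than expected.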

 

I like to do a visual check of my list, so I load up the csv in Excel and peruse the contents to be sure.

Once we have the final list of objects to delete in uuid_to_delete.csv we need to remove everything except the UUID:

cat uuid_to_delete.csv | awk -F , '{print $1}' > uuid_to_delete.txt

 

Now comes the point of no return. I suggest verifying your backups prior to this step.
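
Before crossing that line, a dry run can help: print each delete command instead of executing it, and read the result. This is a minimal sketch; the objtool path and the delete -u form are the ones used in this post, while the sample UUIDs and sample_ file names are made up. On a real host the input would be uuid_to_delete.txt.

```shell
# Hypothetical sample list standing in for uuid_to_delete.txt.
cat > sample_to_delete.txt <<'EOF'
0a1b2c3d-4e5f-6a7b-8c9d-0e1f2a3b4c5d
11223344-5566-7788-99aa-bbccddeeff00
EOF

# Echo the command for each UUID rather than running it; nothing is deleted.
while read -r UUID; do
  echo "/usr/lib/vmware/osfs/bin/objtool delete -u $UUID"
done < sample_to_delete.txt > sample_dry_run.txt

cat sample_dry_run.txt
```

If the printed commands and their count match what the spreadsheet review showed, the real loop can be run with more confidence.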

Deleting the objects:

while read -r UUID ; do /usr/lib/vmware/osfs/bin/objtool delete -u "$UUID" ; done < uuid_to_delete.txt

 

Yes, a few of these steps could be combined. I separated them for instruction, and to reduce unintended consequences.