vSAN Encryption at Rest & In Transit: What is the difference?

In the past, I’ve written a few posts about vSAN Data-at-Rest Encryption, which became available with vSAN 6.6. You can find those posts here. vSAN 7.0 U1 adds a new encryption option, Data-In-Transit Encryption. So what is the difference? Can I choose only one, or both? Let’s find out.

vSAN Data at Rest Encryption

Data-at-Rest Encryption (D@RE) was designed to do just that: encrypt all your data once it lands on the disks used by vSAN. It works regardless of the Storage Policy you choose, and all data replicas are encrypted at both the cache layer and the capacity layer. One major advantage of Data-at-Rest Encryption over vSphere VM Encryption is that vSAN still lets you encrypt your data while taking advantage of space-saving features such as deduplication and compression. When the data lands in cache, it is encrypted using the Data Encryption Key (DEK). While the data is being destaged to the capacity layer it is decrypted, and this is where deduplication and compression take place. Finally, when the data lands on the capacity devices, it is encrypted once again. It is also important to highlight that the DEK is protected by the Key Encryption Key (KEK), which comes from the Key Management Server (KMS), and this is one of the differences between the two options.
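To see why the decrypt step during destaging matters, here is a minimal sketch of the cache-to-capacity path described above. It is only an illustration: the cipher is a toy SHA-256 keystream XOR standing in for AES (it is not secure), zlib stands in for vSAN's deduplication and compression, and every name in it (toy_keystream_xor, destage, the key value) is hypothetical.

```python
import hashlib
import zlib


def toy_keystream_xor(key: bytes, data: bytes) -> bytes:
    """XOR data with a SHA-256 counter keystream. Toy stand-in for AES, not secure."""
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    # zip() stops at the end of data, so any extra keystream bytes are ignored.
    return bytes(a ^ b for a, b in zip(data, stream))


def destage(cache_block: bytes, dek: bytes) -> bytes:
    """Decrypt a cache block, compress the plaintext, then encrypt the result again."""
    plaintext = toy_keystream_xor(dek, cache_block)
    compressed = zlib.compress(plaintext)
    return toy_keystream_xor(dek, compressed)


dek = b"hypothetical-data-encryption-key"
plaintext = b"vSAN block payload " * 512

cache_block = toy_keystream_xor(dek, plaintext)    # encrypted when it lands in cache
capacity_block = destage(cache_block, dek)         # decrypt, compress, re-encrypt
compressed_ciphertext = zlib.compress(cache_block) # what skipping the decrypt step would give

print(len(plaintext), len(capacity_block), len(compressed_ciphertext))
```

Ciphertext looks random, so compressing it directly saves essentially nothing, while compressing the decrypted data first shrinks this repetitive payload dramatically. That is the reason space savings and encryption can coexist in this design.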

vSAN Data in Transit Encryption

Data-In-Transit (DIT) Encryption completes the end-to-end picture by encrypting data while it travels between hosts. Data-at-Rest Encryption only encrypts the data when it lands on disk, so if someone pulls a disk out of a server, all the data on it is encrypted. But what about other attacks, such as man-in-the-middle attacks? This is where Data-In-Transit Encryption protects the data. The keys used for DIT encryption are managed internally, so there is no need for a KMS. These keys are also rotated much more frequently than D@RE keys. DIT encryption keys are rotated weekly by default, but you can change the interval to anything from every 6 hours to every 7 days. Just like D@RE, DIT encryption works at the vSAN cluster level, so either all the hosts are protected or none are.
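If you script around this setting, it can help to check a proposed rotation interval against the 6-hour to 7-day range mentioned above before applying it. The following is a small sketch; the constant and function names are made up for illustration and are not part of any vSAN API.

```python
from datetime import timedelta

# Bounds taken from the range described in the text; names are hypothetical.
MIN_REKEY_INTERVAL = timedelta(hours=6)
MAX_REKEY_INTERVAL = timedelta(days=7)


def validate_rekey_interval(interval: timedelta) -> timedelta:
    """Return the interval unchanged if it is within bounds, otherwise raise ValueError."""
    if not MIN_REKEY_INTERVAL <= interval <= MAX_REKEY_INTERVAL:
        raise ValueError(
            f"rekey interval {interval} is outside {MIN_REKEY_INTERVAL} .. {MAX_REKEY_INTERVAL}"
        )
    return interval


daily = validate_rekey_interval(timedelta(days=1))
print(daily)
```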

Here is a quick comparison between the two options

FAQ

Can I enable both at the same time?

Yes. You can enable Data at rest and Data in Transit encryption in order to get full protection in your vSAN environment. It is recommended to enable vSAN Data at Rest encryption in the early stages of the cluster to minimize the time for on-disk formatting as there is less data to move around.

What is the performance impact of turning encryption on?

There are a lot of variables that come into play when we talk about performance. However, both vSAN encryption options take advantage of AES-NI to offload operations and reduce any performance hit. Most modern CPUs have AES-NI, but sometimes the feature is not enabled, so make sure to check this at deployment. Also keep in mind that enabling D@RE when the cluster already holds a lot of data will result in large amounts of data being moved, so plan to do this during off hours if possible.

What vSAN License do I need to enable vSAN Encryption?

To enable Data-at-Rest and/or Data-In-Transit Encryption, you will need vSAN Enterprise or vSAN Enterprise Plus licenses. Refer to the licensing guide here.

How do I enable Data-In-Transit Encryption?

Enabling DIT encryption is easy. Within the vCenter UI, select the vSAN cluster > Configure > Services. Data-In-Transit encryption can be enabled with or without Data-at-Rest encryption. This is also where you can change the key rotation schedule for the DIT encryption keys.

@GreatWhiteTec

vSAN Encryption KMS info retrieval

A few years ago I wrote a blog post about “Replacing vCenter with vSAN Encryption Enabled“. For this particular exercise, one key piece of information needed to be retrieved was the kmipClusterId.

A couple of things have changed since then in newer versions of vSAN.

Change #1: ESXCLI commands

An easier way to retrieve this information was added in the form of an esxcli command. This command lets you obtain a lot of information about the state of vSAN encryption and retrieve the hostKeyId, kekID, etc.

esxcli vsan encryption <option> get/list

 

So, based on this addition, you can now get the kmipClusterId needed for vCenter replacement by using esxcli vsan encryption kms list
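If you capture the command output to a file or a variable, pulling a single field out of it is straightforward. The sketch below assumes the output is printed as "Label: value" lines. The sample text and its labels are made up for illustration and are not real esxcli output, so check the labels your own hosts print before relying on them. The function name find_field is hypothetical as well.

```python
def find_field(cli_output: str, field_name: str) -> list[str]:
    """Return every value whose 'Label: value' label matches field_name, ignoring case."""
    values = []
    for line in cli_output.splitlines():
        # Split on the first colon only, so values that contain colons stay intact.
        label, separator, value = line.partition(":")
        if separator and label.strip().lower() == field_name.lower():
            values.append(value.strip())
    return values


# Made-up sample for illustration only; real output and label names may differ.
sample_output = """\
KMS Cluster ID: example-kms-cluster
KMS Server: kms01.example.com
"""

cluster_ids = find_field(sample_output, "kms cluster id")
print(cluster_ids)
```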

As you can see, you can still look for this information in the esx.conf file, which is where the hosts store it in this particular version of vSAN (6.7 P01 – Build 15160138). Which brings me to the second update…

 

Change #2: vSAN Persistence

In vSAN 7.0 and beyond, some changes were made to how this configuration is stored. The encryption information that was previously file based (esx.conf) is now stored in a database. Among other advantages, this provides better concurrency for multiple readers and writers than the file-based esx.conf option.

The good news is that the esxcli vsan encryption command still lets you retrieve the encryption information you need. However, if you try to retrieve it from the esx.conf file, you won’t find it there anymore.

Alternatively, you can retrieve the information directly from the config-store, although that may give you more info than you need. So, I’d just stick to the esxcli commands.

What’s new on vSAN Encryption 6.7 U1?

I’ve written a few blog posts in the past about vSAN Data at Rest Encryption (D@RE). These posts explain how encryption works, and how the keys are handed over to vSphere. Go here for more info.

For vSAN D@RE to work properly, ESXi hosts need to be able to reach the KMS cluster during reboot operations. Yes, hopefully you have a cluster for redundancy, but a single KMS server will still work. This is necessary in order for ESXi hosts within the vSAN cluster to be able to obtain both the Host Encryption Key (let’s call this HEK), and the Key Encryption Key (KEK).

Wait!!! Why do we have to go to KMS again if we already received the keys?!?!

See, the Host Encryption Key and the Key Encryption Key live in a non-persistent state in memory, in the key cache. When a vSAN node (ESXi server) is rebooted, these keys go away (poof…gone). So, when vSAN encryption is enabled and a host is rebooted, it needs to go out to the KMS and get those keys again. You will want to make sure that your hosts can talk to the KMS, and that the KMS has your keys, before you consider rebooting your hosts. Oh yeah, it goes without saying that the KMS should NOT run inside the vSAN cluster it protects, and now you can see why.

Once the HEK is obtained, the host reaches crypto-safe mode, which gives it a good operational state and lets it continue with the boot process, at which point it asks the KMS for the KEK. If the host cannot obtain these keys from the KMS cluster, it will still continue to boot. However, the disks will not be mounted, because the host never reached crypto-safe mode and could not obtain the KEK, so it fails to unwrap the Data Encryption Key (DEK).
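The boot sequence above can be summed up in a few lines of logic. This is a simplified model for reasoning about the outcome, not how ESXi implements it, and the class and parameter names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class BootOutcome:
    booted: bool
    crypto_safe: bool
    disks_mounted: bool


def boot_encrypted_host(kms_reachable: bool, hek_on_kms: bool, kek_on_kms: bool) -> BootOutcome:
    """Model a reboot of a host in a vSAN cluster with encryption enabled."""
    # The key cache lives in memory, so both keys must be fetched again after a reboot.
    crypto_safe = kms_reachable and hek_on_kms
    # The KEK is requested only once crypto-safe mode has been reached.
    has_kek = crypto_safe and kek_on_kms
    # Without the KEK the DEK cannot be unwrapped, so the disks stay unmounted.
    # The host itself finishes booting in every case.
    return BootOutcome(booted=True, crypto_safe=crypto_safe, disks_mounted=has_kek)


healthy = boot_encrypted_host(kms_reachable=True, hek_on_kms=True, kek_on_kms=True)
isolated = boot_encrypted_host(kms_reachable=False, hek_on_kms=True, kek_on_kms=True)
print(healthy)
print(isolated)
```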

When hosts are updated or upgraded via VUM, in most cases the hosts go through a rolling reboot as part of the VUM process. With vSAN 6.7 and prior, rolling reboots of hosts via VUM were allowed irrespective of the state of the connection with the KMS and the availability of keys. As already described, these keys are necessary to properly mount the drives on each host during a reboot.

In vSAN 6.7 Update 1, VMware has added guard rails to prevent the disks of multiple hosts from being unmounted due to lack of connectivity with the KMS or accidental key deletion. During an upgrade operation, VUM will place a host in Enhanced Maintenance Mode (EMM), perform updates, reboot, and exit EMM. If the host cannot reach crypto-safe mode after the reboot, it will not exit EMM, which stalls the VUM run. In this case, the host’s drives are not mounted because it could not reach crypto-safe mode. If the upgrade were allowed to continue, all the other hosts would be upgraded as well, and all the drives within the vSAN datastore would end up unmounted.
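A quick way to picture the guard rail is to simulate a rolling upgrade with the KMS unreachable, once with the check and once without it. This is a toy model, not VUM's logic: the function name, host names, and the simplification that crypto-safe mode depends only on KMS reachability are all assumptions.

```python
def rolling_upgrade(hosts: list[str], kms_reachable: bool, guard_rail: bool = True) -> list[str]:
    """Return the hosts whose vSAN disks end up unmounted after the upgrade run."""
    unmounted = []
    for host in hosts:
        # The host enters maintenance mode, is patched, and reboots at this point.
        crypto_safe = kms_reachable  # simplification: keys are assumed present on the KMS
        if not crypto_safe:
            unmounted.append(host)
            if guard_rail:
                # 6.7 U1 behavior: the host stays in maintenance mode and the run stalls here.
                break
    return unmounted


hosts = ["esx01", "esx02", "esx03", "esx04"]
with_guard = rolling_upgrade(hosts, kms_reachable=False, guard_rail=True)
without_guard = rolling_upgrade(hosts, kms_reachable=False, guard_rail=False)
print(with_guard, without_guard)
```

With the guard rail, the damage stops at the first host that fails to reach crypto-safe mode. Without it, every host in the list loses its disks.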

This new guard rail helps prevent losing all vSAN storage due to connectivity issues or accidental changes to the KMS and key availability. It also highlights the benefits of having an HCI solution embedded in the kernel: the ease of orchestration with other vSphere components and features makes vSAN even more appealing.

Considerations when Enabling vSAN Encryption

In previous posts, I talked about the vSAN Encryption architecture and how to enable the feature. However, aside from the requirements, there are a couple of considerations that should be taken into account before enabling vSAN Encryption.

BIOS Settings:

With most deployments, whether vSphere or vSAN, I’ve noticed that BIOS settings are often overlooked, even though a simple change there can help increase performance. One of those settings is AES-NI. AES-NI was proposed by Intel some time back, and it is essentially a set of [new] instructions (NI) for the Advanced Encryption Standard (AES), hence the acronym AES-NI. What AES-NI does is provide hardware acceleration to applications that use AES for encryption and decryption.

Most modern CPUs (Intel & AMD) support AES-NI, and some BIOS configurations from certain hardware vendors already have AES-NI enabled by default. When considering vSAN Encryption, it is imperative to make sure that AES-NI has been enabled in the BIOS, so that AES operations are offloaded to dedicated CPU instructions, which both strengthens and accelerates the execution of AES applications.

Failure to enable AES-NI while Encryption is enabled may result in a dramatic increase in CPU utilization. In recent versions of vSAN, the Health Check UI detects and alerts when AES-NI has not been enabled. If the BIOS does not have an option to enable AES-NI, the feature is most likely always enabled.
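Outside of the vSAN Health Check, one generic way to confirm that a CPU exposes AES-NI is to look for the aes flag that x86 Linux systems report in /proc/cpuinfo, for example from a Linux live image booted on the same hardware. The sketch below parses that text. It is not an ESXi tool, the function name is hypothetical, and the sample string is made up.

```python
def has_aes_flag(cpuinfo_text: str) -> bool:
    """Return True if any 'flags' line in /proc/cpuinfo-style text lists 'aes'."""
    for line in cpuinfo_text.splitlines():
        name, _, value = line.partition(":")
        # Compare whole tokens so that longer flag names containing 'aes' do not match.
        if name.strip() == "flags" and "aes" in value.split():
            return True
    return False


# Made-up sample in the /proc/cpuinfo layout, for illustration only.
sample_cpuinfo = "model name\t: Example CPU\nflags\t\t: fpu sse2 aes avx\n"
print(has_aes_flag(sample_cpuinfo))

# On a real x86 Linux system you could instead run:
#     with open("/proc/cpuinfo") as f:
#         print(has_aes_flag(f.read()))
```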

Note: This also applies to VM encryption.

 

Available Space

The other consideration is available space. My previous posts talk about the data migration that occurs if vSAN Encryption is enabled after data has already been moved into the vSAN Datastore, due to the disk format task that is required. Although vSAN Encryption does not incur a space overhead for its operation, keep in mind that there needs to be enough available space to evacuate an entire disk group during the configuration process.
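As a rough back-of-the-envelope check, you can compare the used space of each disk group against the free space on all the others. The sketch below does only that arithmetic. It deliberately ignores storage policies, fault domains, and slack space recommendations, so treat a positive number as a first sanity check rather than a guarantee. The function name and capacity figures are made up.

```python
def evacuation_headroom(disk_groups: dict[str, tuple[float, float]]) -> dict[str, float]:
    """Map each disk group to free space on the other groups minus its own used space (TB).

    disk_groups maps a name to (capacity_tb, used_tb). A negative result means the
    group's data would not fit elsewhere in the cluster.
    """
    headroom = {}
    for name, (_, used) in disk_groups.items():
        free_elsewhere = sum(
            capacity - other_used
            for other, (capacity, other_used) in disk_groups.items()
            if other != name
        )
        headroom[name] = free_elsewhere - used
    return headroom


cluster = {"dg1": (10.0, 6.0), "dg2": (10.0, 6.0), "dg3": (10.0, 6.0)}
result = evacuation_headroom(cluster)
print(result)
```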

 

vSAN 6.6 Encryption Configuration

New on vSAN 6.6, vSAN native encryption for data at rest is now available. This feature does not require self-encrypting drives (SEDs). Encryption is supported on both all-flash and hybrid configurations of vSAN, and it is done at the datastore level.

It is important to note that data is encrypted during the de-staging process, which means that all other vSAN features are fully supported, such as deduplication and compression, among others.

Given the multitude of KMS vendors, the setup and configuration of the KMS is not covered in this document. It is a prerequisite that must be completed before enabling encryption on the vSAN datastore.

Requirements for vSAN Encryption:

  • Deploy KMS cluster/server of your choice
  • Add/trust KMS server to vCenter UI
  • vSAN encryption requires on-disk format (ODF) version 5
    • You can upgrade this via Web Client
    • or if you enable Encryption or Deduplication and Compression on an existing vSAN cluster, the ODF gets upgraded to the latest version automatically.
  • When vSAN encryption is enabled all disks are reformatted
    • This is achieved in a rolling manner

 

Initial configuration is done in the VMware vCenter Server user interface of the vSphere Web Client. The KMS cluster is added to vCenter Server and a trust relationship is established. The process for doing this is vendor-specific. Consult your KMS vendor documentation prior to adding the KMS cluster to vCenter.

To add the KMS cluster to vCenter in the vSphere Web Client, click on the vCenter server, click on “Configure” tab, “Key Management Servers”, and click “add KMS”. Enter the information for your specific KMS cluster/server.

 

Once the KMS cluster/server has been added, you will need to establish trust with the KMS server. Follow the instructions from your KMS vendor as they differ from vendor to vendor.

 

After the KMS has been configured, you will see that the connection status and the certificate have green checks, meaning we are ready to move forward.

 

Now, we need to verify that all of the disks in the cluster are on version 5 for on-disk format prior to enabling vSAN encryption, since version 5 is a requirement.

 

 

At this point we are ready to turn encryption on, since we have completed the first three steps.

  • Deploy KMS cluster/server of your choice
  • Add/trust KMS server to vCenter UI
  • vSAN encryption requires on-disk format version 5
  • When vSAN encryption is enabled all disks are reformatted

 

To enable vSAN encryption, click on the vSAN cluster, “Configure” tab, and “General” under the vSAN section, and click “edit”. Here we have the option to erase the disk before use. This will increase the time it will take to do the rolling format of the devices, but it will provide better protection.

 

After you click OK, vSAN will remove one Disk Group at a time, format each device, and recreate the Disk Group once the format has completed. It will then move on to the next Disk Group until all Disk Groups have been recreated and all devices formatted. During this period, data will be evacuated from the Disk Groups, so you will see components resyncing.
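To make the order of operations explicit, here is a tiny sketch that lists the steps for a rolling reformat, one Disk Group at a time. It only produces a plan as text and does not touch any vSAN API. The function and Disk Group names are hypothetical, and the placement of the optional erase step before the format is my assumption.

```python
def rolling_reformat_plan(disk_groups: list[str], erase_first: bool = False) -> list[str]:
    """List the steps of a rolling reformat, finishing one disk group before the next."""
    steps = []
    for disk_group in disk_groups:
        steps.append(f"evacuate {disk_group}")
        steps.append(f"remove {disk_group}")
        if erase_first:
            # Optional "erase disks before use" setting; it lengthens every iteration.
            steps.append(f"erase devices in {disk_group}")
        steps.append(f"format devices in {disk_group}")
        steps.append(f"recreate {disk_group}")
    return steps


plan = rolling_reformat_plan(["dg-a", "dg-b"])
for step in plan:
    print(step)
```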

 

Note: This process can take quite some time depending on the amount of data that needs to be migrated during the rolling reformat, so please plan accordingly.

 

Once vSAN encryption is enabled, you can still disable it later. However, disabling it follows the same procedure: all the drives are reformatted in a rolling manner.

 

New Key Generation

You also have the capability of generating new keys for encryption. There are two modes of rekeying. The first is a shallow rekey, where the data encryption key is wrapped by a new key encryption key. The second is a deep rekey, a complete re-encryption of all data. A deep rekey may take significant time to complete, as all the data has to be rewritten, and it may decrease performance while it runs.
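The difference between the two modes is easier to see in code. The sketch below uses a toy SHA-256 keystream XOR in place of AES (it is not secure) and hypothetical function names. It shows that re-wrapping the DEK leaves the data on disk untouched, while re-encrypting the data means rewriting all of it. Whether a deep rekey also replaces the KEK is not covered here, so the sketch keeps the KEK unchanged in that case.

```python
import hashlib
import os


def toy_xor(key: bytes, data: bytes) -> bytes:
    """XOR data with a SHA-256 counter keystream. Toy stand-in for AES, not secure."""
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


def shallow_rekey(wrapped_dek: bytes, old_kek: bytes, new_kek: bytes) -> bytes:
    """Unwrap the DEK with the old KEK and wrap it with the new one. Data is untouched."""
    dek = toy_xor(old_kek, wrapped_dek)
    return toy_xor(new_kek, dek)


def deep_rekey(wrapped_dek: bytes, kek: bytes, ciphertext: bytes) -> tuple[bytes, bytes]:
    """Generate a new DEK and re-encrypt all data with it. Every block is rewritten."""
    old_dek = toy_xor(kek, wrapped_dek)
    new_dek = os.urandom(32)
    plaintext = toy_xor(old_dek, ciphertext)
    return toy_xor(kek, new_dek), toy_xor(new_dek, plaintext)


data = b"contents of a capacity device"
kek_1, kek_2, dek = os.urandom(32), os.urandom(32), os.urandom(32)
wrapped = toy_xor(kek_1, dek)
on_disk = toy_xor(dek, data)

wrapped_after_shallow = shallow_rekey(wrapped, kek_1, kek_2)
wrapped_after_deep, on_disk_after_deep = deep_rekey(wrapped_after_shallow, kek_2, on_disk)
print(on_disk != on_disk_after_deep)
```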

 

 

Summary of expected behaviors:

  • Enabling vSAN Encryption requires disk reformat with object resyncs.
  • You don’t have to erase all the disks before using native encryption unless you want to reduce the possibility of data leakage and shrink the attack surface. However, erasing adds to the time required to reformat the drives and enable encryption.
  • Enabling vSAN Deduplication and Compression still requires disk reformat with object resyncs whether the Disk Group is encrypted or not.
  • Disabling any of the aforementioned features requires another reformat of the devices along with object resyncs.