2022-08-03 15:33:27

by Babis Chalios

[permalink] [raw]
Subject: [PATCH 0/2] virt: vmgenid: add generation counter

From: Babis Chalios <[email protected]>

Linux recently added support for the VM Generation ID mechanism from
Microsoft. The way this works currently is using the 128-bit blob
provided by the vmgenid device to re-seed the RNG. While this works it
has two main issues, (a) it is inherently racy due to the fact that it
relies on a ACPI notification being delivered and handled and (b) the ID
is unsuitable for exposing to user-space.

This patch-set extends the vmgenid device to introduce a generation
counter, a 32-bit counter which is different every time the unique ID
changes. The addition to the original implementation in QEMU can be
found here:
https://lists.nongnu.org/archive/html/qemu-devel/2022-08/msg00524.html.

The first patch re-works slightly the current vmgenid driver to add a
function that parses an object from the vmgenid device and returns the
physical address of the vmgenid data. The second patch uses that
function to parse additionally the address of the generation counter
from the vmgenid namespace. The counter is then exposed to the
user-space through a misc-device which provides `read` and `mmap`
interfaces.

Babis Chalios (2):
virt: vmgenid: add helper function to parse ADDR
virt: vmgenid: add support for generation counter

Documentation/virt/vmgenid.rst | 120 ++++++++++++++++++++++++++
drivers/virt/vmgenid.c | 151 ++++++++++++++++++++++++++++-----
2 files changed, 251 insertions(+), 20 deletions(-)
create mode 100644 Documentation/virt/vmgenid.rst

--
2.37.1

Amazon Spain Services sociedad limitada unipersonal, Calle Ramirez de Prado 5, 28045 Madrid. Registro Mercantil de Madrid . Tomo 22458 . Folio 102 . Hoja M-401234 . CIF B84570936



2022-08-03 15:34:15

by Babis Chalios

[permalink] [raw]
Subject: [PATCH 2/2] virt: vmgenid: add support for generation counter

From: Babis Chalios <[email protected]>

VM Generation ID provides a means of reseeding kernel's RNG using a
128-bit UUID when a VM fork occurs, thus avoiding issues running
multiple VMs with the exact same RNG state. However, user-space
applications, such as user-space PRNGs and applications that maintain
world-unique data, need a mechanism to handle VM fork events as well.

To handle the user-space use-case, this: <url> qemu patch extends
Microsoft's original vmgenid specification adding an extra page which
holds a single 32-bit generation counter, which increases every time a
VM gets restored from a snapshot.

This patch exposes the generation counter through a character device
(`/dev/vmgenid`) that provides a `read` and `mmap` interface, for
user-space applications to consume. Userspace applications should read
this value before starting a transaction involving cached random bits
and ensure that it has not changed while committing the transaction.

It can be used from qemu using the `-device vmgenid,guid=auto,genctr=42`
parameter to start a VM with a generation counter with value 42.
Reading 4 bytes from `/dev/vmgenid` will return the value 42. Next, use
`savevm my_snapshot` in the monitor to snapshot the VM. Now, start
another VM using `-device vmgenid,guid=auto,genctr=43 -loadvm
my_snapshot`. Reading now from `/dev/vmgenid` will return 43.

Signed-off-by: Babis Chalios <[email protected]>
---
Documentation/virt/vmgenid.rst | 120 +++++++++++++++++++++++++++++++++
drivers/virt/vmgenid.c | 103 +++++++++++++++++++++++++++-
2 files changed, 221 insertions(+), 2 deletions(-)
create mode 100644 Documentation/virt/vmgenid.rst

diff --git a/Documentation/virt/vmgenid.rst b/Documentation/virt/vmgenid.rst
new file mode 100644
index 000000000..61c29e4a7
--- /dev/null
+++ b/Documentation/virt/vmgenid.rst
@@ -0,0 +1,120 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+=======
+VMGENID
+=======
+
+The VM Generation ID (VMGENID) is a feature from Microsoft
+(https://go.microsoft.com/fwlink/?LinkId=260709) supported by multiple
+hypervisor vendors.
+
+Its purpose is to help tackle issues occurying by duplication of the state
+of a Virtual Machine (VM) during events that cause a VM to "return back in
+time", like snapshot and restore. It exposes a generation ID inside the VM so
+that applications that rely on world-wide unique or random data can check if
+that value has changed before committing transactions.
+
+Problem Definition
+------------------
+
+Often in its lifetime, a VM will get snapshotted and later it will be restored
+in that previous state. Moreover, one or more new VMs can be spawned from this
+snapshot. Both scenarios result in one or more VMs running with same RNG state,
+which makes early operations after restore that rely on randomness predictable,
+and thus render them insecure, for example TLS.
+
+Userspace PRNGs, as well as code that caches streams of random bits, to speed
+up latency critical applications, suffer from similar issues.
+
+Apart from concerns related with cryptography, userspace applications operating
+with (what they consider to be) unique data, such as UUIDs, are affected by
+spawning of multiple VMs from the same snapshot.
+
+VMGENID tackles the issue by providing a unique (not random) 128-bits
+identifier every time a VM is restored from a snapshot. The identifier is used
+to reseed the kernel's RNG ensuring that different VMs spawned from the same
+snapshot will observe different streams of random data.
+
+Notice that VMGENID does not eliminate the problem but it significantly reduces
+the window in which the system's RNG will produce identical data across
+different VMs.
+
+Reseeding the kernel's RNG tackles the issue of duplicated random values
+provided by the kernel, however it does little to address the issue of
+userspace applications that use world-unique data. The UUID defined by the
+original VMGENID specification is used to reseed the RNG, so it cannot be
+exposed to the userspace. This class of applications need a separate API which
+they can consume in order to detect VM restore events and adapt accordingly.
+
+In that front, VMGENID has been extended to expose to userspace an additional
+32 bits generation counter, which acts as a notification mechanism for restore
+events. The value of the counter after a VM restore will be different than
+its value when the snapshot was taken in order to signal to userspace that
+a VM restore has occurred.
+
+VMGENID in Linux
+----------------
+
+Linux kernel uses the 128-bits UUID of VMGENID to reseed the RNG every time an
+ACPI notification arrives. Moreover, it exposes the 32-bits generation counter
+through a character device ``/dev/vmgenid``. The device supports ``read()``
+and ``mmap`` for user space applications to monitor restore events:
+
+``read()``:
+Read always returns the first 4 bytes of the page including the generation
+counter. Partial reads and reads in offset other than 0 are not allowed and
+return ``EINVAL``.
+
+``mmap()``:
+It maps a single page in the address space of the userspace application. The
+driver supports ``PROT_READ`` and ``MAP_SHARED``. Mapping with ``PROT_WRITE``
+will result in ``EPERM``, whereas mapping past the first page will result in
+``EINVAL``.
+
+A userspace application that caches random bits from the kernel should ensure
+that the moment it actually wants to consume some of these bits the value of
+the generation counter equals its value when the bits were initially cached.
+For example:
+
+```
+uint32_t *gen_cntr = mmaped_gen_counter();
+uint32_t cached_gen_cntr = *gen_cntr;
+char *secret;
+
+for(;;) {
+ secret = get_secret();
+
+ // All good, not restore has happened.
+ if (cached_gen_cntr == *gen_cntr)
+ break;
+
+ // Generation counter has changed. We need to recreate caches and try again
+
+ cached_gen_cntr = *gen_cntr;
+ barrier();
+
+ // recreate secrets' cache
+ rebuild_cache();
+}
+
+consume_secret(secret);
+
+```
+
+The driver for VMGENID lives under ``drivers/virt/vmgenid.c``.
+
+Using VMGENID
+-------------
+
+https://git.qemu.org/?p=qemu.git;a=blob_plain;f=docs/specs/vmgenid.txt;hb=refs/heads/master
+describes how the VMGENID device can be used. First we start a VM passing the
+parameter `-device vmgenid,guid=auto,genctr=42`. With this the UUID value of
+VMGENID will be populated with a UUID created by qemu and a generation counter
+of 42. Next, we can save the VM state from the monitor using the `savevm`
+command.
+
+Now, we can start another VM from the same snapshot using the `-device
+vmgenid,guid=auto,genctr=43 -loadvm {snapshot}` options. This will update the
+UUID with a new value generated by qemu and 43 for the generation counter in
+memory before resuming the vcpus and then send an appropriate ACPI notification
+to the guest.
diff --git a/drivers/virt/vmgenid.c b/drivers/virt/vmgenid.c
index 0cc2fe0f4..1cb0b3560 100644
--- a/drivers/virt/vmgenid.c
+++ b/drivers/virt/vmgenid.c
@@ -11,6 +11,10 @@
#include <linux/module.h>
#include <linux/acpi.h>
#include <linux/random.h>
+#include "linux/container_of.h"
+#include <linux/miscdevice.h>
+#include <linux/fs.h>
+#include <linux/mm.h>

ACPI_MODULE_NAME("vmgenid");

@@ -19,6 +23,69 @@ enum { VMGENID_SIZE = 16 };
struct vmgenid_state {
u8 *next_id;
u8 this_id[VMGENID_SIZE];
+
+ phys_addr_t gen_cntr_addr;
+ u32 *next_counter;
+
+ int misc_enabled;
+ struct miscdevice misc;
+};
+
+static int vmgenid_mmap(struct file *filep, struct vm_area_struct *vma)
+{
+ struct vmgenid_state *state = filep->private_data;
+
+ if (vma->vm_pgoff || vma_pages(vma) > 1)
+ return -EINVAL;
+
+ if ((vma->vm_flags & VM_WRITE))
+ return -EPERM;
+
+ vma->vm_flags |= VM_DONTEXPAND | VM_DONTDUMP;
+ vma->vm_flags &= ~VM_MAYWRITE;
+
+ return vm_iomap_memory(vma, state->gen_cntr_addr, PAGE_SIZE);
+}
+
+static ssize_t vmgenid_read(struct file *filep, char __user *buff, size_t count,
+ loff_t *offp)
+{
+ struct vmgenid_state *state = filep->private_data;
+
+ if (count == 0)
+ return 0;
+
+ /* We don't allow partial reads */
+ if (count != sizeof(u32))
+ return -EINVAL;
+
+ if (put_user(*state->next_counter, (u32 __user *)buff))
+ return -EFAULT;
+
+ return sizeof(u32);
+}
+
+static int vmgenid_open(struct inode *inode, struct file *filep)
+{
+ struct vmgenid_state *state =
+ container_of(filep->private_data, struct vmgenid_state, misc);
+
+ filep->private_data = state;
+ return 0;
+}
+
+static const struct file_operations fops = {
+ .owner = THIS_MODULE,
+ .open = vmgenid_open,
+ .read = vmgenid_read,
+ .mmap = vmgenid_mmap,
+ .llseek = noop_llseek,
+};
+
+static struct miscdevice vmgenid_misc = {
+ .minor = MISC_DYNAMIC_MINOR,
+ .name = "vmgenid",
+ .fops = &fops,
};

static int parse_vmgenid_address(struct acpi_device *device, acpi_string object_name,
@@ -57,7 +124,7 @@ static int vmgenid_add(struct acpi_device *device)
phys_addr_t phys_addr;
int ret;

- state = devm_kmalloc(&device->dev, sizeof(*state), GFP_KERNEL);
+ state = devm_kzalloc(&device->dev, sizeof(*state), GFP_KERNEL);
if (!state)
return -ENOMEM;

@@ -74,6 +141,27 @@ static int vmgenid_add(struct acpi_device *device)

device->driver_data = state;

+ /* Backwards compatibility. If CTRA is not there we just don't expose
+ * the char device
+ */
+ ret = parse_vmgenid_address(device, "CTRA", &state->gen_cntr_addr);
+ if (ret)
+ return 0;
+
+ state->next_counter = devm_memremap(&device->dev, state->gen_cntr_addr,
+ sizeof(u32), MEMREMAP_WB);
+ if (IS_ERR(state->next_counter))
+ return 0;
+
+ memcpy(&state->misc, &vmgenid_misc, sizeof(state->misc));
+ ret = misc_register(&state->misc);
+ if (ret) {
+ devm_memunmap(&device->dev, state->next_counter);
+ return 0;
+ }
+
+ state->misc_enabled = 1;
+
return 0;
}

@@ -89,6 +177,16 @@ static void vmgenid_notify(struct acpi_device *device, u32 event)
add_vmfork_randomness(state->this_id, sizeof(state->this_id));
}

+static int vmgenid_remove(struct acpi_device *device)
+{
+ struct vmgenid_state *state = device->driver_data;
+
+ if (state->misc_enabled)
+ misc_deregister(&state->misc);
+
+ return 0;
+}
+
static const struct acpi_device_id vmgenid_ids[] = {
{ "VMGENCTR", 0 },
{ "VM_GEN_COUNTER", 0 },
@@ -101,7 +199,8 @@ static struct acpi_driver vmgenid_driver = {
.owner = THIS_MODULE,
.ops = {
.add = vmgenid_add,
- .notify = vmgenid_notify
+ .notify = vmgenid_notify,
+ .remove = vmgenid_remove
}
};

--
2.32.1 (Apple Git-133)

Amazon Spain Services sociedad limitada unipersonal, Calle Ramirez de Prado 5, 28045 Madrid. Registro Mercantil de Madrid . Tomo 22458 . Folio 102 . Hoja M-401234 . CIF B84570936


2022-08-03 15:34:43

by Greg KH

[permalink] [raw]
Subject: Re: [PATCH 2/2] virt: vmgenid: add support for generation counter

On Wed, Aug 03, 2022 at 05:21:27PM +0200, [email protected] wrote:
> From: Babis Chalios <[email protected]>
>
> VM Generation ID provides a means of reseeding kernel's RNG using a
> 128-bit UUID when a VM fork occurs, thus avoiding issues running
> multiple VMs with the exact same RNG state. However, user-space
> applications, such as user-space PRNGs and applications that maintain
> world-unique data, need a mechanism to handle VM fork events as well.
>
> To handle the user-space use-case, this: <url> qemu patch extends
> Microsoft's original vmgenid specification adding an extra page which
> holds a single 32-bit generation counter, which increases every time a
> VM gets restored from a snapshot.
>
> This patch exposes the generation counter through a character device
> (`/dev/vmgenid`) that provides a `read` and `mmap` interface, for
> user-space applications to consume. Userspace applications should read
> this value before starting a transaction involving cached random bits
> and ensure that it has not changed while committing the transaction.
>
> It can be used from qemu using the `-device vmgenid,guid=auto,genctr=42`
> parameter to start a VM with a generation counter with value 42.
> Reading 4 bytes from `/dev/vmgenid` will return the value 42. Next, use
> `savevm my_snapshot` in the monitor to snapshot the VM. Now, start
> another VM using `-device vmgenid,guid=auto,genctr=43 -loadvm
> my_snapshot`. Reading now from `/dev/vmgenid` will return 43.
>
> Signed-off-by: Babis Chalios <[email protected]>
> ---
> Documentation/virt/vmgenid.rst | 120 +++++++++++++++++++++++++++++++++
> drivers/virt/vmgenid.c | 103 +++++++++++++++++++++++++++-
> 2 files changed, 221 insertions(+), 2 deletions(-)
> create mode 100644 Documentation/virt/vmgenid.rst
>
> diff --git a/Documentation/virt/vmgenid.rst b/Documentation/virt/vmgenid.rst
> new file mode 100644
> index 000000000..61c29e4a7
> --- /dev/null
> +++ b/Documentation/virt/vmgenid.rst
> @@ -0,0 +1,120 @@
> +.. SPDX-License-Identifier: GPL-2.0
> +
> +=======
> +VMGENID
> +=======

<snip>

This file is now just floating in the directory, not tied to anything,
so auto-generation of the documentation will not pick it up or link to
it, right?

So, why does this have to be a separate file at all? Why not put this
in the .c file and pull it straight from there so that it keeps in sync
with the code easier?

thanks,

greg k-h

2022-08-03 15:36:14

by Greg KH

[permalink] [raw]
Subject: Re: [PATCH 2/2] virt: vmgenid: add support for generation counter

On Wed, Aug 03, 2022 at 05:21:27PM +0200, [email protected] wrote:
> + /* Backwards compatibility. If CTRA is not there we just don't expose
> + * the char device

Backwards compatibility with what?

> + */
> + ret = parse_vmgenid_address(device, "CTRA", &state->gen_cntr_addr);
> + if (ret)
> + return 0;
> +
> + state->next_counter = devm_memremap(&device->dev, state->gen_cntr_addr,
> + sizeof(u32), MEMREMAP_WB);
> + if (IS_ERR(state->next_counter))
> + return 0;

This too is an error, you can not return with "all is good", right?
Once you try to create this device because the address is present, you
can't just give up and succeed no matter what went wrong, that seems
incorrect.

thanks,

greg k-h

2022-08-03 15:53:21

by Babis Chalios

[permalink] [raw]
Subject: Re: [PATCH 0/2] virt: vmgenid: add generation counter

CC'ing some more people to the discussion.

On 3/8/22 17:21, [email protected] wrote:
> From: Babis Chalios <[email protected]>
>
> Linux recently added support for the VM Generation ID mechanism from
> Microsoft. The way this works currently is using the 128-bit blob
> provided by the vmgenid device to re-seed the RNG. While this works it
> has two main issues, (a) it is inherently racy due to the fact that it
> relies on a ACPI notification being delivered and handled and (b) the ID
> is unsuitable for exposing to user-space.
>
> This patch-set extends the vmgenid device to introduce a generation
> counter, a 32-bit counter which is different every time the unique ID
> changes. The addition to the original implementation in QEMU can be
> found here:
> https://lists.nongnu.org/archive/html/qemu-devel/2022-08/msg00524.html.
>
> The first patch re-works slightly the current vmgenid driver to add a
> function that parses an object from the vmgenid device and returns the
> physical address of the vmgenid data. The second patch uses that
> function to parse additionally the address of the generation counter
> from the vmgenid namespace. The counter is then exposed to the
> user-space through a misc-device which provides `read` and `mmap`
> interfaces.
>
> Babis Chalios (2):
> virt: vmgenid: add helper function to parse ADDR
> virt: vmgenid: add support for generation counter
>
> Documentation/virt/vmgenid.rst | 120 ++++++++++++++++++++++++++
> drivers/virt/vmgenid.c | 151 ++++++++++++++++++++++++++++-----
> 2 files changed, 251 insertions(+), 20 deletions(-)
> create mode 100644 Documentation/virt/vmgenid.rst
>

Amazon Spain Services sociedad limitada unipersonal, Calle Ramirez de Prado 5, 28045 Madrid. Registro Mercantil de Madrid . Tomo 22458 . Folio 102 . Hoja M-401234 . CIF B84570936

2022-08-03 16:20:53

by Greg KH

[permalink] [raw]
Subject: Re: [PATCH 2/2] virt: vmgenid: add support for generation counter

On Wed, Aug 03, 2022 at 05:21:27PM +0200, [email protected] wrote:
> +static const struct file_operations fops = {
> + .owner = THIS_MODULE,
> + .open = vmgenid_open,
> + .read = vmgenid_read,
> + .mmap = vmgenid_mmap,
> + .llseek = noop_llseek,

Where is this new user/kernel api being documented?

See, put it in the code please, no one knows to look in a random file in
Documentation/


> +};
> +
> +static struct miscdevice vmgenid_misc = {
> + .minor = MISC_DYNAMIC_MINOR,
> + .name = "vmgenid",
> + .fops = &fops,
> };
>
> static int parse_vmgenid_address(struct acpi_device *device, acpi_string object_name,
> @@ -57,7 +124,7 @@ static int vmgenid_add(struct acpi_device *device)
> phys_addr_t phys_addr;
> int ret;
>
> - state = devm_kmalloc(&device->dev, sizeof(*state), GFP_KERNEL);
> + state = devm_kzalloc(&device->dev, sizeof(*state), GFP_KERNEL);
> if (!state)
> return -ENOMEM;
>
> @@ -74,6 +141,27 @@ static int vmgenid_add(struct acpi_device *device)
>
> device->driver_data = state;
>
> + /* Backwards compatibility. If CTRA is not there we just don't expose
> + * the char device
> + */
> + ret = parse_vmgenid_address(device, "CTRA", &state->gen_cntr_addr);
> + if (ret)
> + return 0;
> +
> + state->next_counter = devm_memremap(&device->dev, state->gen_cntr_addr,
> + sizeof(u32), MEMREMAP_WB);
> + if (IS_ERR(state->next_counter))
> + return 0;
> +
> + memcpy(&state->misc, &vmgenid_misc, sizeof(state->misc));
> + ret = misc_register(&state->misc);
> + if (ret) {
> + devm_memunmap(&device->dev, state->next_counter);
> + return 0;

Why are you not returning an error? Why is this ok?

And why call devm_memunmap() directly? That kind of defeats the purpose
of using devm_memremap(), right?

thanks,

greg k-h

2022-08-03 17:04:38

by Babis Chalios

[permalink] [raw]
Subject: Re: [PATCH 0/2] virt: vmgenid: add generation counter

(Correctly) CC'ing more people in the discussion. Apologies for this.

On 3/8/22 17:21, [email protected] wrote:
> From: Babis Chalios <[email protected]>
>
> Linux recently added support for the VM Generation ID mechanism from
> Microsoft. The way this works currently is using the 128-bit blob
> provided by the vmgenid device to re-seed the RNG. While this works it
> has two main issues, (a) it is inherently racy due to the fact that it
> relies on a ACPI notification being delivered and handled and (b) the ID
> is unsuitable for exposing to user-space.
>
> This patch-set extends the vmgenid device to introduce a generation
> counter, a 32-bit counter which is different every time the unique ID
> changes. The addition to the original implementation in QEMU can be
> found here:
> https://lists.nongnu.org/archive/html/qemu-devel/2022-08/msg00524.html.
>
> The first patch re-works slightly the current vmgenid driver to add a
> function that parses an object from the vmgenid device and returns the
> physical address of the vmgenid data. The second patch uses that
> function to parse additionally the address of the generation counter
> from the vmgenid namespace. The counter is then exposed to the
> user-space through a misc-device which provides `read` and `mmap`
> interfaces.
>
> Babis Chalios (2):
> virt: vmgenid: add helper function to parse ADDR
> virt: vmgenid: add support for generation counter
>
> Documentation/virt/vmgenid.rst | 120 ++++++++++++++++++++++++++
> drivers/virt/vmgenid.c | 151 ++++++++++++++++++++++++++++-----
> 2 files changed, 251 insertions(+), 20 deletions(-)
> create mode 100644 Documentation/virt/vmgenid.rst
>

Amazon Spain Services sociedad limitada unipersonal, Calle Ramirez de Prado 5, 28045 Madrid. Registro Mercantil de Madrid . Tomo 22458 . Folio 102 . Hoja M-401234 . CIF B84570936

2022-08-03 18:10:16

by Babis Chalios

[permalink] [raw]
Subject: Re: [PATCH 2/2] virt: vmgenid: add support for generation counter



On 3/8/22 17:31, Greg KH wrote:
> CAUTION: This email originated from outside of the organization. Do not click links or open attachments unless you can confirm the sender and know the content is safe.
>
>
>
> On Wed, Aug 03, 2022 at 05:21:27PM +0200, [email protected] wrote:
>> + /* Backwards compatibility. If CTRA is not there we just don't expose
>> + * the char device
> Backwards compatibility with what?
With hypervisor versions that do not support the generation counter, but
do support
the VM generation ID. In this case the hypervisor would expose ADDR but
not CTRA.
>
>> + */
>> + ret = parse_vmgenid_address(device, "CTRA", &state->gen_cntr_addr);
>> + if (ret)
>> + return 0;
>> +
>> + state->next_counter = devm_memremap(&device->dev, state->gen_cntr_addr,
>> + sizeof(u32), MEMREMAP_WB);
>> + if (IS_ERR(state->next_counter))
>> + return 0;
> This too is an error, you can not return with "all is good", right?
> Once you try to create this device because the address is present, you
> can't just give up and succeed no matter what went wrong, that seems
> incorrect.
Same intention as in the response in your other comment. I thought that
it doesn't make sense
to fail the whole ACPI device if registering the misc device fails.

However, seeing that again (even if my thinking is right) if this
devm_memremap fails we should
probably fail because there might be something wrong with the address
the device gave us.
>
> thanks,
>
> greg k-h


Cheers,
Babis
Amazon Spain Services sociedad limitada unipersonal, Calle Ramirez de Prado 5, 28045 Madrid. Registro Mercantil de Madrid . Tomo 22458 . Folio 102 . Hoja M-401234 . CIF B84570936

2022-08-03 18:12:12

by Babis Chalios

[permalink] [raw]
Subject: Re: [PATCH 2/2] virt: vmgenid: add support for generation counter



On 3/8/22 17:30, Greg KH wrote:
> CAUTION: This email originated from outside of the organization. Do not click links or open attachments unless you can confirm the sender and know the content is safe.
>
>
>
> On Wed, Aug 03, 2022 at 05:21:27PM +0200, [email protected] wrote:
>> +static const struct file_operations fops = {
>> + .owner = THIS_MODULE,
>> + .open = vmgenid_open,
>> + .read = vmgenid_read,
>> + .mmap = vmgenid_mmap,
>> + .llseek = noop_llseek,
> Where is this new user/kernel api being documented?
>
> See, put it in the code please, no one knows to look in a random file in
> Documentation/
Agreed! Will move it in the code.
>
>
>> +};
>> +
>> +static struct miscdevice vmgenid_misc = {
>> + .minor = MISC_DYNAMIC_MINOR,
>> + .name = "vmgenid",
>> + .fops = &fops,
>> };
>>
>> static int parse_vmgenid_address(struct acpi_device *device, acpi_string object_name,
>> @@ -57,7 +124,7 @@ static int vmgenid_add(struct acpi_device *device)
>> phys_addr_t phys_addr;
>> int ret;
>>
>> - state = devm_kmalloc(&device->dev, sizeof(*state), GFP_KERNEL);
>> + state = devm_kzalloc(&device->dev, sizeof(*state), GFP_KERNEL);
>> if (!state)
>> return -ENOMEM;
>>
>> @@ -74,6 +141,27 @@ static int vmgenid_add(struct acpi_device *device)
>>
>> device->driver_data = state;
>>
>> + /* Backwards compatibility. If CTRA is not there we just don't expose
>> + * the char device
>> + */
>> + ret = parse_vmgenid_address(device, "CTRA", &state->gen_cntr_addr);
>> + if (ret)
>> + return 0;
>> +
>> + state->next_counter = devm_memremap(&device->dev, state->gen_cntr_addr,
>> + sizeof(u32), MEMREMAP_WB);
>> + if (IS_ERR(state->next_counter))
>> + return 0;
>> +
>> + memcpy(&state->misc, &vmgenid_misc, sizeof(state->misc));
>> + ret = misc_register(&state->misc);
>> + if (ret) {
>> + devm_memunmap(&device->dev, state->next_counter);
>> + return 0;
> Why are you not returning an error? Why is this ok?
>
> And why call devm_memunmap() directly? That kind of defeats the purpose
> of using devm_memremap(), right?
The intention here was to not fail the whole ACPI device if registering
the misc device fails. My rationale
was that the ACPI device is used for other things as well (re-seeding RNG).

>
> thanks,
>
> greg k-h

Cheers,
Babis
Amazon Spain Services sociedad limitada unipersonal, Calle Ramirez de Prado 5, 28045 Madrid. Registro Mercantil de Madrid . Tomo 22458 . Folio 102 . Hoja M-401234 . CIF B84570936

2022-08-04 13:57:44

by Babis Chalios

[permalink] [raw]
Subject: Re: [PATCH 0/2] virt: vmgenid: add generation counter

On 3/8/22 17:21, [email protected] wrote:
> From: Babis Chalios <[email protected]>
>
> Linux recently added support for the VM Generation ID mechanism from
> Microsoft. The way this works currently is using the 128-bit blob
> provided by the vmgenid device to re-seed the RNG. While this works it
> has two main issues, (a) it is inherently racy due to the fact that it
> relies on a ACPI notification being delivered and handled and (b) the ID
> is unsuitable for exposing to user-space.
>
> This patch-set extends the vmgenid device to introduce a generation
> counter, a 32-bit counter which is different every time the unique ID
> changes. The addition to the original implementation in QEMU can be
> found here:
> https://lists.nongnu.org/archive/html/qemu-devel/2022-08/msg00524.html.
>
> The first patch re-works slightly the current vmgenid driver to add a
> function that parses an object from the vmgenid device and returns the
> physical address of the vmgenid data. The second patch uses that
> function to parse additionally the address of the generation counter
> from the vmgenid namespace. The counter is then exposed to the
> user-space through a misc-device which provides `read` and `mmap`
> interfaces.
>
> Babis Chalios (2):
> virt: vmgenid: add helper function to parse ADDR
> virt: vmgenid: add support for generation counter
>
> Documentation/virt/vmgenid.rst | 120 ++++++++++++++++++++++++++
> drivers/virt/vmgenid.c | 151 ++++++++++++++++++++++++++++-----
> 2 files changed, 251 insertions(+), 20 deletions(-)
> create mode 100644 Documentation/virt/vmgenid.rst
>

I am also CCing Michael from Microsoft since he was involved in the
last discussions regarding the Linux driver.

Cheers,
Babis
Amazon Spain Services sociedad limitada unipersonal, Calle Ramirez de Prado 5, 28045 Madrid. Registro Mercantil de Madrid . Tomo 22458 . Folio 102 . Hoja M-401234 . CIF B84570936

2022-08-04 15:30:10

by Jason A. Donenfeld

[permalink] [raw]
Subject: Re: [PATCH 0/2] virt: vmgenid: add generation counter

Hi Babis,

On Wed, Aug 03, 2022 at 05:21:25PM +0200, [email protected] wrote:
> From: Babis Chalios <[email protected]>
>
> Linux recently added support for the VM Generation ID mechanism from
> Microsoft. The way this works currently is using the 128-bit blob
> provided by the vmgenid device to re-seed the RNG. While this works it
> has two main issues, (a) it is inherently racy due to the fact that it
> relies on a ACPI notification being delivered and handled and (b) the ID
> is unsuitable for exposing to user-space.
>
> This patch-set extends the vmgenid device to introduce a generation
> counter, a 32-bit counter which is different every time the unique ID
> changes. The addition to the original implementation in QEMU can be
> found here:
> https://lists.nongnu.org/archive/html/qemu-devel/2022-08/msg00524.html.
>
> The first patch re-works slightly the current vmgenid driver to add a
> function that parses an object from the vmgenid device and returns the
> physical address of the vmgenid data. The second patch uses that
> function to parse additionally the address of the generation counter
> from the vmgenid namespace. The counter is then exposed to the
> user-space through a misc-device which provides `read` and `mmap`
> interfaces.

First, with regards to your mmap interface, it's more likely that this
kind of thing will be eventually folded into my investigations regarding
the RNG and the vDSO (which would make this kind of thing accessible
without needing the file system).

Regarding the counter itself, I don't want to rush into augmenting the
vmgenid mechanism until we've had some conversations with Microsoft. But
also, it seems like you might have missed the extensive previous
discussion about this. There was some tradeoff in efficiency about
mapping this all the way through, as doing so would require the counter
to be in a totally separate page as the main 128-bit ID, versus just
having the kernel manage a separate counter and incur a potential [maybe
acceptable? unclear] race.

Jason

2022-08-04 16:18:14

by Babis Chalios

[permalink] [raw]
Subject: Re: [PATCH 0/2] virt: vmgenid: add generation counter

Hi Jason,

On 8/4/22 4:59 PM, "Jason A. Donenfeld" <[email protected]> wrote:
> CAUTION: This email originated from outside of the organization. Do not click links or open attachments unless you can confirm the sender and know the content is safe.
>
>
>
> Hi Babis,
>
> On Wed, Aug 03, 2022 at 05:21:25PM +0200, [email protected] wrote:
> > From: Babis Chalios <[email protected]>
> >
> > Linux recently added support for the VM Generation ID mechanism from
> > Microsoft. The way this works currently is using the 128-bit blob
> > provided by the vmgenid device to re-seed the RNG. While this works it
> > has two main issues, (a) it is inherently racy due to the fact that it
> > relies on a ACPI notification being delivered and handled and (b) the ID
> > is unsuitable for exposing to user-space.
> >
> > This patch-set extends the vmgenid device to introduce a generation
> > counter, a 32-bit counter which is different every time the unique ID
> > changes. The addition to the original implementation in QEMU can be
> > found here:
> > https://lists.nongnu.org/archive/html/qemu-devel/2022-08/msg00524.html.
> >
> > The first patch re-works slightly the current vmgenid driver to add a
> > function that parses an object from the vmgenid device and returns the
> > physical address of the vmgenid data. The second patch uses that
> > function to parse additionally the address of the generation counter
> > from the vmgenid namespace. The counter is then exposed to the
> > user-space through a misc-device which provides `read` and `mmap`
> > interfaces.
>
> First, with regards to your mmap interface, it's more likely that this
> kind of thing will be eventually folded into my investigations regarding
> the RNG and the vDSO (which would make this kind of thing accessible
> without needing the file system).
>

Interesting, could you share a link to discussions / code regarding this?


> Regarding the counter itself, I don't want to rush into augmenting the
> vmgenid mechanism until we've had some conversations with Microsoft. But
> also, it seems like you might have missed the extensive previous
> discussion about this. There was some tradeoff in efficiency about
> mapping this all the way through, as doing so would require the counter
> to be in a totally separate page as the main 128-bit ID, versus just
> having the kernel manage a separate counter and incur a potential [maybe
> acceptable? unclear] race.
>

Again, would you have pointers to this discussion? I have been following this
thread: https://lkml.org/lkml/2022/3/1/693. I am interested on the take that
the race would be acceptable.

> Jason
>

Cheers,
Babis

Amazon Spain Services sociedad limitada unipersonal, Calle Ramirez de Prado 5, 28045 Madrid. Registro Mercantil de Madrid . Tomo 22458 . Folio 102 . Hoja M-401234 . CIF B84570936

2022-08-10 10:09:42

by Babis Chalios

[permalink] [raw]
Subject: Re: [PATCH 0/2] virt: vmgenid: add generation counter

Hi Jason

On 8/4/22 4:59 PM, "Jason A. Donenfeld" <[email protected]> wrote:
> CAUTION: This email originated from outside of the organization. Do not click links or open attachments unless you can confirm the sender and know the content is safe.
>
>
>
> Hi Babis,
>
> On Wed, Aug 03, 2022 at 05:21:25PM +0200, [email protected] wrote:
> > From: Babis Chalios <[email protected]>
> >
> > Linux recently added support for the VM Generation ID mechanism from
> > Microsoft. The way this works currently is using the 128-bit blob
> > provided by the vmgenid device to re-seed the RNG. While this works it
> > has two main issues, (a) it is inherently racy due to the fact that it
> > relies on a ACPI notification being delivered and handled and (b) the ID
> > is unsuitable for exposing to user-space.
> >
> > This patch-set extends the vmgenid device to introduce a generation
> > counter, a 32-bit counter which is different every time the unique ID
> > changes. The addition to the original implementation in QEMU can be
> > found here:
> > https://lists.nongnu.org/archive/html/qemu-devel/2022-08/msg00524.html.
> >
> > The first patch re-works slightly the current vmgenid driver to add a
> > function that parses an object from the vmgenid device and returns the
> > physical address of the vmgenid data. The second patch uses that
> > function to parse additionally the address of the generation counter
> > from the vmgenid namespace. The counter is then exposed to the
> > user-space through a misc-device which provides `read` and `mmap`
> > interfaces.
>
> First, with regards to your mmap interface, it's more likely that this
> kind of thing will be eventually folded into my investigations regarding
> the RNG and the vDSO (which would make this kind of thing accessible
> without needing the file system).
>
> Regarding the counter itself, I don't want to rush into augmenting the
> vmgenid mechanism until we've had some conversations with Microsoft. But
> also, it seems like you might have missed the extensive previous
> discussion about this. There was some tradeoff in efficiency about
> mapping this all the way through, as doing so would require the counter
> to be in a totally separate page as the main 128-bit ID, versus just
> having the kernel manage a separate counter and incur a potential [maybe
> acceptable? unclear] race.
>
> Jason
>

Just linking here a comment from Michael on the QEMU discussion: https://www.mail-archive.com/[email protected]/msg903695.html

Cheers,
Babis


Amazon Spain Services sociedad limitada unipersonal, Calle Ramirez de Prado 5, 28045 Madrid. Registro Mercantil de Madrid . Tomo 22458 . Folio 102 . Hoja M-401234 . CIF B84570936

2022-08-14 04:01:51

by kernel test robot

[permalink] [raw]
Subject: Re: [PATCH 2/2] virt: vmgenid: add support for generation counter

Hi,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on crng-random/master]
[also build test WARNING on linus/master v5.19 next-20220812]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url: https://github.com/intel-lab-lkp/linux/commits/bchalios-amazon-es/virt-vmgenid-add-generation-counter/20220803-232559
base: git://git.kernel.org/pub/scm/linux/kernel/git/crng/random.git master
reproduce: make htmldocs

If you fix the issue, kindly add following tag where applicable
Reported-by: kernel test robot <[email protected]>

All warnings (new ones prefixed by >>):

>> Documentation/virt/vmgenid.rst:79: WARNING: Inline literal start-string without end-string.
>> Documentation/virt/vmgenid.rst:79: WARNING: Inline emphasis start-string without end-string.
>> Documentation/virt/vmgenid.rst:89: WARNING: Unexpected indentation.
>> Documentation/virt/vmgenid.rst:98: WARNING: Definition list ends without a blank line; unexpected unindent.
>> Documentation/virt/vmgenid.rst:102: WARNING: Inline interpreted text or phrase reference start-string without end-string.
>> Documentation/virt/vmgenid.rst: WARNING: document isn't included in any toctree

vim +79 Documentation/virt/vmgenid.rst

78
> 79 ```
80 uint32_t *gen_cntr = mmaped_gen_counter();
81 uint32_t cached_gen_cntr = *gen_cntr;
82 char *secret;
83
84 for(;;) {
85 secret = get_secret();
86
87 // All good, not restore has happened.
88 if (cached_gen_cntr == *gen_cntr)
> 89 break;
90
91 // Generation counter has changed. We need to recreate caches and try again
92
93 cached_gen_cntr = *gen_cntr;
94 barrier();
95
96 // recreate secrets' cache
97 rebuild_cache();
> 98 }
99
100 consume_secret(secret);
101
> 102 ```
103

--
0-DAY CI Kernel Test Service
https://01.org/lkp