2024-04-18 11:49:12

by Jason A. Donenfeld

[permalink] [raw]
Subject: [PATCH] Revert "vmgenid: emit uevent when VMGENID updates"

This reverts commit ad6bcdad2b6724e113f191a12f859a9e8456b26d. I had
nak'd it, and Greg said on the thread that it links that he wasn't going
to take it either, especially since it's not his code or his tree, but
then, seemingly accidentally, it got pushed up some months later, in
what looks like a mistake, with no further discussion in the linked
thread. So revert it, since it's clearly not intended.

Fixes: ad6bcdad2b67 ("vmgenid: emit uevent when VMGENID updates")
Cc: [email protected]
Cc: Greg Kroah-Hartman <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jason A. Donenfeld <[email protected]>
---
drivers/virt/vmgenid.c | 2 --
1 file changed, 2 deletions(-)

diff --git a/drivers/virt/vmgenid.c b/drivers/virt/vmgenid.c
index b67a28da4702..a1c467a0e9f7 100644
--- a/drivers/virt/vmgenid.c
+++ b/drivers/virt/vmgenid.c
@@ -68,7 +68,6 @@ static int vmgenid_add(struct acpi_device *device)
static void vmgenid_notify(struct acpi_device *device, u32 event)
{
struct vmgenid_state *state = acpi_driver_data(device);
- char *envp[] = { "NEW_VMGENID=1", NULL };
u8 old_id[VMGENID_SIZE];

memcpy(old_id, state->this_id, sizeof(old_id));
@@ -76,7 +75,6 @@ static void vmgenid_notify(struct acpi_device *device, u32 event)
if (!memcmp(old_id, state->this_id, sizeof(old_id)))
return;
add_vmfork_randomness(state->this_id, sizeof(state->this_id));
- kobject_uevent_env(&device->dev.kobj, KOBJ_CHANGE, envp);
}

static const struct acpi_device_id vmgenid_ids[] = {
--
2.44.0



2024-04-18 12:49:15

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: [PATCH] Revert "vmgenid: emit uevent when VMGENID updates"

On Thu, Apr 18, 2024 at 01:48:08PM +0200, Jason A. Donenfeld wrote:
> This reverts commit ad6bcdad2b6724e113f191a12f859a9e8456b26d. I had
> nak'd it, and Greg said on the thread that it links that he wasn't going
> to take it either, especially since it's not his code or his tree, but
> then, seemingly accidentally, it got pushed up some months later, in
> what looks like a mistake, with no further discussion in the linked
> thread. So revert it, since it's clearly not intended.
>
> Fixes: ad6bcdad2b67 ("vmgenid: emit uevent when VMGENID updates")
> Cc: [email protected]
> Cc: Greg Kroah-Hartman <[email protected]>
> Link: https://lore.kernel.org/r/[email protected]
> Signed-off-by: Jason A. Donenfeld <[email protected]>
> ---
> drivers/virt/vmgenid.c | 2 --
> 1 file changed, 2 deletions(-)

Sorry about that, I picked it up thinking I had missed it previously:

Acked-by: Greg Kroah-Hartman <[email protected]>


2024-04-22 07:53:48

by Alexander Graf

[permalink] [raw]
Subject: [REGRESSION] Re: [PATCH] Revert "vmgenid: emit uevent when VMGENID updates"

[Adding CC list of original patch plus regression tracker]

Hi Jason,

On 18.04.24 13:48, Jason A. Donenfeld wrote:
> This reverts commit ad6bcdad2b6724e113f191a12f859a9e8456b26d. I had
> nak'd it, and Greg said on the thread that it links that he wasn't going
> to take it either, especially since it's not his code or his tree, but
> then, seemingly accidentally, it got pushed up some months later, in
> what looks like a mistake, with no further discussion in the linked
> thread. So revert it, since it's clearly not intended.

Reverting this patch creates a user space visible regression compared to
v6.8. Please treat it as such.

I'm slightly confused to see you passionate about this patch after you
ghosted the conversation you referenced:


https://lore.kernel.org/lkml/[email protected]/

The purpose of this uevent is to notify systemd[1][2] (or similar) that
a VM clone event happened, so it can for example regenerate MAC
addresses if it generated them on boot, regenerate its unique machine id
or simply force rerequest a new DHCP lease.

I don't understand how there's any correlation or dependency to
vgetrandom() or anything RNG in this and why getting vgetrandom() merged
upstream is even something to talk about in the same line as this patch [3].

We had a lengthy, constructive conversation with Ted at LPC last year
about the "PRNG and clone" use case and concluded that it's best for
everyone to simply assume the system could be cloned at any point, hence
always force intermix of RDRAND or comparable to any PRNG output. We
since no longer need an event for that case.


Alex

[1] https://github.com/systemd/systemd/issues/26380
[2] https://lore.kernel.org/lkml/ZJGNREN4tLzQXOJr@gardel-login/
[3]
https://lore.kernel.org/lkml/CAHmME9pxc-nO_xa=4+1CnvbnuefbRTJHxM7n817c_TPeoxzu_g@mail.gmail.com/

#regzbot introduced: 3aadf100f93d8081

>
> Fixes: ad6bcdad2b67 ("vmgenid: emit uevent when VMGENID updates")
> Cc: [email protected]
> Cc: Greg Kroah-Hartman <[email protected]>
> Link: https://lore.kernel.org/r/[email protected]
> Signed-off-by: Jason A. Donenfeld <[email protected]>
> ---
> drivers/virt/vmgenid.c | 2 --
> 1 file changed, 2 deletions(-)
>
> diff --git a/drivers/virt/vmgenid.c b/drivers/virt/vmgenid.c
> index b67a28da4702..a1c467a0e9f7 100644
> --- a/drivers/virt/vmgenid.c
> +++ b/drivers/virt/vmgenid.c
> @@ -68,7 +68,6 @@ static int vmgenid_add(struct acpi_device *device)
> static void vmgenid_notify(struct acpi_device *device, u32 event)
> {
> struct vmgenid_state *state = acpi_driver_data(device);
> - char *envp[] = { "NEW_VMGENID=1", NULL };
> u8 old_id[VMGENID_SIZE];
>
> memcpy(old_id, state->this_id, sizeof(old_id));
> @@ -76,7 +75,6 @@ static void vmgenid_notify(struct acpi_device *device, u32 event)
> if (!memcmp(old_id, state->this_id, sizeof(old_id)))
> return;
> add_vmfork_randomness(state->this_id, sizeof(state->this_id));
> - kobject_uevent_env(&device->dev.kobj, KOBJ_CHANGE, envp);
> }
>
> static const struct acpi_device_id vmgenid_ids[] = {




Amazon Development Center Germany GmbH
Krausenstr. 38
10117 Berlin
Geschaeftsfuehrung: Christian Schlaeger, Jonathan Weiss
Eingetragen am Amtsgericht Charlottenburg unter HRB 149173 B
Sitz: Berlin
Ust-ID: DE 289 237 879


2024-04-23 01:24:26

by Jason A. Donenfeld

[permalink] [raw]
Subject: Re: [REGRESSION] Re: [PATCH] Revert "vmgenid: emit uevent when VMGENID updates"

Hi Alexander,

The process here seems weirdly aggressive and sneaky.

On 2023-06-19, I wrote that I didn't want to take this route for
userspace notifications.

Then on 2023-06-28, you wrote to Greg asking him to take it instead of
me. Nine minutes later, Greg said "yea sure." Then he caught up on the
thread and some hours later wrote:

> Wait, no, I'm not the maintainer of this, Jason is. And he already
> rejected it (and based on the changelog text, I would too), so why are
> you asking me a month later to take this?
>
> Work with the maintainer please, don't try to route around them, you
> both know better than this.

Then on 2023-11-14 you wrote to me again asking me to take it, despite
my earlier reservations not changing in the interim. I didn't have a
chance to reply.

Then on 2023-11-30, Greg weirdly took it anyway, with zero discussion
or evidence on the mailing list as to what had happened.

When I noticed what had happened (while working on his driver in the
process of cleaning up/reworking patches that your Amazon employees
sent me that needed work), suspicious that you tried to "route around"
the proper way of getting this done and trick Greg again into taking a
patch that's not his purview, I asked him wtf happened on IRC:

<gregkh> ugh, sorry, I don't remember that. I think Alexander talked
to me at plumbers and said, "hey, please take this virt patch"
<gregkh> but you are right, you NAKed it in that thread, I forgot
that, sorry. Yes, revert it if that's needed.

Greg then ACK'd the revert commit which came with a stable@ marking
and a Fixes: tag (for 6.8, which isn't very old).

So it looks to me like you twice tried to trick Greg into taking this,
succeeded the second time, got caught, and now are trying to make a
regression argument as a means of keeping your sneaky commit in there.
All of this really _really_ rubs me the wrong way, I have to say.

I don't know what holds more weight here -- the predictable regression
argument, or the fact that you snuck nack'd changes into a very very
recent kernel that can still be removed while probably only affecting
you. But I'm obviously not happy about this.

Jason

2024-04-23 06:56:24

by Alexander Graf

[permalink] [raw]
Subject: Re: [REGRESSION] Re: [PATCH] Revert "vmgenid: emit uevent when VMGENID updates"

Hey Jason,

On 23.04.24 03:21, Jason A. Donenfeld wrote:
> Hi Alexander,
>
> The process here seems weirdly aggressive and sneaky.
>
> On 2023-06-19, I wrote that I didn't want to take this route for
> userspace notifications.
>
> Then on 2023-06-28, you wrote to Greg asking him to take it instead of
> me. Nine minutes later, Greg said "yea sure." Then he caught up on the
> thread and some hours later wrote:
>
>> Wait, no, I'm not the maintainer of this, Jason is. And he already
>> rejected it (and based on the changelog text, I would too), so why are
>> you asking me a month later to take this?
>>
>> Work with the maintainer please, don't try to route around them, you
>> both know better than this.
> Then on 2023-11-14 you wrote to me again asking me to take it, despite
> my earlier reservations not changing in the interim. I didn't have a
> chance to reply.
>
> Then on 2023-11-30, Greg weirdly took it anyway, with zero discussion
> or evidence on the mailing list as to what had happened.
>
> When I noticed what had happened (while working on his driver in the
> process of cleaning up/reworking patches that your Amazon employees
> sent me that needed work), suspicious that you tried to "route around"
> the proper way of getting this done and trick Greg again into taking a
> patch that's not his purview, I asked him wtf happened on IRC:
>
> <gregkh> ugh, sorry, I don't remember that. I think Alexander talked
> to me at plumbers and said, "hey, please take this virt patch"
> <gregkh> but you are right, you NAKed it in that thread, I forgot
> that, sorry. Yes, revert it if that's needed.
>
> Greg then ACK'd the revert commit which came with a stable@ marking
> and a Fixes: tag (for 6.8, which isn't very old).
>
> So it looks to me like you twice tried to trick Greg into taking this,
> succeeded the second time, got caught, and now are trying to make a
> regression argument as a means of keeping your sneaky commit in there.
> All of this really _really_ rubs me the wrong way, I have to say.
>
> I don't know what holds more weight here -- the predictable regression
> argument, or the fact that you snuck nack'd changes into a very very
> recent kernel that can still be removed while probably only affecting
> you. But I'm obviously not happy about this.


I'm personally much more concerned about Linux' ability to deal with VM
Clone events than "my personal use case". The group at Amazon you see
working on this is working on AWS Lambda which owns the full host and
guest stack, including Linux on both ends. They could happily patch
their own Linux kernel. Instead, I have managed to get them to do "the
right thing" and work with the Linux upstream community to build a
viable solution that works for everyone.

However, every time they do that, all they get back is vgetrandom()
arguments which are completely irrelevant to the conversation and
deteriorate my efforts to get AWS to work *more* rather than less
upstream. Can we please move this back to a technical discussion and
based on technical grounds determine why sending a notification to user
space when a VM was cloned via uevents is even remotely a bad idea?


Thanks,

Alex




Amazon Development Center Germany GmbH
Krausenstr. 38
10117 Berlin
Geschaeftsfuehrung: Christian Schlaeger, Jonathan Weiss
Eingetragen am Amtsgericht Charlottenburg unter HRB 149173 B
Sitz: Berlin
Ust-ID: DE 289 237 879


2024-04-23 12:24:09

by Lennart Poettering

[permalink] [raw]
Subject: Re: [REGRESSION] Re: [PATCH] Revert "vmgenid: emit uevent when VMGENID updates"

On Di, 23.04.24 03:21, Jason A. Donenfeld ([email protected]) wrote:

Jason!

Can you please explain to me what the precise problem is with the
uevent? It doesn't leak any information about the actual vmgenid, it
just lets userspace know that the machine was cloned,
basically. What's the problem with that? I'd really like to
understand?

There are many usecases for this in the VM world, for example we'd
like to hook things up so that various userspace managed concepts,
such as DHCP leases, MAC addresses are automatically refreshed.

This has no relationship to RNGs or anything like this, it's just an
event we can handle in userspace to trigger address refreshes like
this.

Hence, why is the revert necessary? This was already in a released
kernel, and we have started work on making use of this in systemd, and
afaics this does not compromise the kernel RNG in even the remotest of
ways, hence why is a revert necessary? From my usersace perspective
it's just very very sad, that this simple, trivial interface we wanted
to use, that was in a stable kernel is now gone again.

Can you explain what the problem with this single-line trivial
interface is? I really would like to understand!

Lennart

(BTW: even if the uevent would leak the vmgenid somehow to userspace —
which it does not —, at least on qemu — i.e. one of the most relevant
VM platforms — the vmgenid can be read directly from userspace by
cat'ing /sys/firmware/qemu_fw_cfg/by_name/etc/vmgenid_guid/raw,
i.e. it's not that well protected anyway).

--
Lennart Poettering, Berlin

2024-04-26 11:33:38

by Alexander Graf

[permalink] [raw]
Subject: Re: [REGRESSION] Re: [PATCH] Revert "vmgenid: emit uevent when VMGENID updates"


On 23.04.24 14:23, Lennart Poettering wrote:
> On Di, 23.04.24 03:21, Jason A. Donenfeld ([email protected]) wrote:
>
> Jason!
>
> Can you please explain to me what the precise problem is with the
> uevent? It doesn't leak any information about the actual vmgenid, it
> just lets userspace know that the machine was cloned,
> basically. What's the problem with that? I'd really like to
> understand?
>
> There are many usecases for this in the VM world, for example we'd
> like to hook things up so that various userspace managed concepts,
> such as DHCP leases, MAC addresses are automatically refreshed.
>
> This has no relationship to RNGs or anything like this, it's just an
> event we can handle in userspace to trigger address refreshes like
> this.
>
> Hence, why is the revert necessary? This was already in a released
> kernel, and we have started work on making use of this in systemd, and
> afaics this does not compromise the kernel RNG in even the remotest of
> ways, hence why is a revert necessary? From my usersace perspective
> it's just very very sad, that this simple, trivial interface we wanted
> to use, that was in a stable kernel is now gone again.
>
> Can you explain what the problem with this single-line trivial
> interface is? I really would like to understand!


Jason, ping?

If I don't see technical reasoning from you here, I will assume that you
agree with Lennart and my points of views and send a revert of your
revert shortly to ensure systemd has its uevent still in 6.9.


Alex




Amazon Development Center Germany GmbH
Krausenstr. 38
10117 Berlin
Geschaeftsfuehrung: Christian Schlaeger, Jonathan Weiss
Eingetragen am Amtsgericht Charlottenburg unter HRB 149173 B
Sitz: Berlin
Ust-ID: DE 289 237 879


2024-04-26 12:53:02

by Jason A. Donenfeld

[permalink] [raw]
Subject: Re: [REGRESSION] Re: [PATCH] Revert "vmgenid: emit uevent when VMGENID updates"

I don't think adding UAPI to an individual device driver like this is a
good approach especially considering that the virtio changes we
discussed some time ago will likely augment this and create another
means of a similar notification. And given that this intersects with
other userspace-oriented work I hope to get back to pretty soon, I think
introducing some adhoc mechanism like this adds clutter and isn't the
ideal way forward.

On top of that, your extremely aggressive approach and apparent
disregard for established processes and general sneakiness doesn't
really earn you a lot of points here. There wasn't even an iota of
recognition in your reply that you realize the way you've gone about
this is not fine. That doesn't really sit too well, you know?

2024-04-26 13:47:23

by Babis Chalios

[permalink] [raw]
Subject: Re: [REGRESSION] Re: [PATCH] Revert "vmgenid: emit uevent when VMGENID updates"

Hi Jason,

On 4/26/24 14:52, Jason A. Donenfeld wrote:
> I don't think adding UAPI to an individual device driver like this is a
> good approach especially considering that the virtio changes we
> discussed some time ago will likely augment this and create another
> means of a similar notification. And given that this intersects with
> other userspace-oriented work I hope to get back to pretty soon, I think
> introducing some adhoc mechanism like this adds clutter and isn't the
> ideal way forward.
>

Correct me if I'm wrong, but the virtio changes were meant to mean "please
reseed your PRNGs". That's why we wanted to route them via random.c. We
designed them specifically so that virtio-rng would be only one of the
potential
systems that would emit such notifications, whereas other systems might have
nothing to do with VM events.

With that in mind, could you describe how these events would be useful
to the
use case of Lennart? systemd does not need a notification every time the
system
believes PRNGs need to be reseeded. It explicitly needs a notification
when a VM
was cloned. This has nothing to do with PRNGs and I don't believe random.c,
virtio-rng, or vgetrand() should be responsible for delivering this.

Cheers,
Babis

2024-04-26 14:21:03

by Thorsten Leemhuis

[permalink] [raw]
Subject: Re: [REGRESSION] Re: [PATCH] Revert "vmgenid: emit uevent when VMGENID updates"

On 22.04.24 09:51, Alexander Graf wrote:
> [Adding CC list of original patch plus regression tracker]
>
> On 18.04.24 13:48, Jason A. Donenfeld wrote:
>> This reverts commit ad6bcdad2b6724e113f191a12f859a9e8456b26d. I had
>> nak'd it, and Greg said on the thread that it links that he wasn't going
>> to take it either, especially since it's not his code or his tree, but
>> then, seemingly accidentally, it got pushed up some months later, in
>> what looks like a mistake, with no further discussion in the linked
>> thread. So revert it, since it's clearly not intended.
>
> Reverting this patch creates a user space visible regression compared to
> v6.8.

A theoretical one? Sure! But did any machines actually used in
production break? From my understanding of Linus approach to the "no
regression" rule this is what matters most here.

And even if that was the case: It afaics also matters that the commit
was just in proper releases for a short time frame. Linus thus might
consider the whole situation along the lines of "we really did screw up
here and to fix it we are bending the 'no regressions' rule slightly;
sorry". Things like that iirc have happened in the past, but I might
misremember here.

Linus, if I got you wrong there, please speak up. But right now I'm
inclined to not handle this as a regression and drop it from the tracking.

Ciao, Thorsten

> Please treat it as such.
>
> I'm slightly confused to see you passionate about this patch after you
> ghosted the conversation you referenced:
>
>
> https://lore.kernel.org/lkml/[email protected]/
>
> The purpose of this uevent is to notify systemd[1][2] (or similar) that
> a VM clone event happened, so it can for example regenerate MAC
> addresses if it generated them on boot, regenerate its unique machine id
> or simply force rerequest a new DHCP lease.
>
> I don't understand how there's any correlation or dependency to
> vgetrandom() or anything RNG in this and why getting vgetrandom() merged
> upstream is even something to talk about in the same line as this patch
> [3].
>
> We had a lengthy, constructive conversation with Ted at LPC last year
> about the "PRNG and clone" use case and concluded that it's best for
> everyone to simply assume the system could be cloned at any point, hence
> always force intermix of RDRAND or comparable to any PRNG output. We
> since no longer need an event for that case.
>
>
> Alex
>
> [1] https://github.com/systemd/systemd/issues/26380
> [2] https://lore.kernel.org/lkml/ZJGNREN4tLzQXOJr@gardel-login/
> [3]
> https://lore.kernel.org/lkml/CAHmME9pxc-nO_xa=4+1CnvbnuefbRTJHxM7n817c_TPeoxzu_g@mail.gmail.com/
>
> #regzbot introduced: 3aadf100f93d8081
>
>>
>> Fixes: ad6bcdad2b67 ("vmgenid: emit uevent when VMGENID updates")
>> Cc: [email protected]
>> Cc: Greg Kroah-Hartman <[email protected]>
>> Link: https://lore.kernel.org/r/[email protected]
>> Signed-off-by: Jason A. Donenfeld <[email protected]>
>> ---
>>   drivers/virt/vmgenid.c | 2 --
>>   1 file changed, 2 deletions(-)
>>
>> diff --git a/drivers/virt/vmgenid.c b/drivers/virt/vmgenid.c
>> index b67a28da4702..a1c467a0e9f7 100644
>> --- a/drivers/virt/vmgenid.c
>> +++ b/drivers/virt/vmgenid.c
>> @@ -68,7 +68,6 @@ static int vmgenid_add(struct acpi_device *device)
>>   static void vmgenid_notify(struct acpi_device *device, u32 event)
>>   {
>>       struct vmgenid_state *state = acpi_driver_data(device);
>> -    char *envp[] = { "NEW_VMGENID=1", NULL };
>>       u8 old_id[VMGENID_SIZE];
>>         memcpy(old_id, state->this_id, sizeof(old_id));
>> @@ -76,7 +75,6 @@ static void vmgenid_notify(struct acpi_device
>> *device, u32 event)
>>       if (!memcmp(old_id, state->this_id, sizeof(old_id)))
>>           return;
>>       add_vmfork_randomness(state->this_id, sizeof(state->this_id));
>> -    kobject_uevent_env(&device->dev.kobj, KOBJ_CHANGE, envp);
>>   }
>>     static const struct acpi_device_id vmgenid_ids[] = {
>
>
>
>
> Amazon Development Center Germany GmbH
> Krausenstr. 38
> 10117 Berlin
> Geschaeftsfuehrung: Christian Schlaeger, Jonathan Weiss
> Eingetragen am Amtsgericht Charlottenburg unter HRB 149173 B
> Sitz: Berlin
> Ust-ID: DE 289 237 879
>
>

2024-04-26 20:05:26

by Alexander Graf

[permalink] [raw]
Subject: Re: [REGRESSION] Re: [PATCH] Revert "vmgenid: emit uevent when VMGENID updates"


On 26.04.24 15:43, Babis Chalios wrote:
> Hi Jason,
>
> On 4/26/24 14:52, Jason A. Donenfeld wrote:
>> I don't think adding UAPI to an individual device driver like this is a
>> good approach especially considering that the virtio changes we
>> discussed some time ago will likely augment this and create another
>> means of a similar notification. And given that this intersects with
>> other userspace-oriented work I hope to get back to pretty soon, I think
>> introducing some adhoc mechanism like this adds clutter and isn't the
>> ideal way forward.
>>
>
> Correct me if I'm wrong, but the virtio changes were meant to mean
> "please
> reseed your PRNGs". That's why we wanted to route them via random.c. We
> designed them specifically so that virtio-rng would be only one of the
> potential
> systems that would emit such notifications, whereas other systems
> might have
> nothing to do with VM events.
>
> With that in mind, could you describe how these events would be useful
> to the
> use case of Lennart? systemd does not need a notification every time
> the system
> believes PRNGs need to be reseeded. It explicitly needs a notification
> when a VM
> was cloned. This has nothing to do with PRNGs and I don't believe
> random.c,
> virtio-rng, or vgetrand() should be responsible for delivering this.


I second this. A VM clone event may be one of multiple events that can
cause an RNG reseed event. This is what Babis' patches to propagate such
an event[1] implemented: A new type of multiplexed event that feeds from
multiple sources; most notably *not* from VMGenID.

Due your reluctance to enable user space PRNGs to get any notion of
reseed events [2], we have since abandoned that whole RNG reseed event
approach: Going forward, for our environments, we'll simply mandate that
PRNGs always mix in randomness that is guaranteed different post-clone
(such as RDRAND). You want a clone safe system? Use one that does (fast
and non-broken) RDRAND.

However, VM clone events are useful for other situations as all of us
outlined multiple times in this and previous threads. While you can use
VM clone events as a source for RNG reseed events, you can not use RNG
reseed events (or a snapshot safe RNG source like /dev/random) as
indicator for VM clones, because they will trigger more often and hence
cause undesired side effects. You may want a reseed every 60s, but
surely don't want a new MAC address every 60 seconds, right? :)

Now, theoretically I can see some merit for a single, abstracted event
source for VM clones over a per-driver one. But practically, between
VMGenID on ACPI and Device Tree systems, there are very for platforms
that care about safe VM snapshots and wouldn't "just work". The only one
I can think of atm is s390x. I don't know if an abstraction - like
another driver that simply proxies notifications - would be worth it. Or
if in that case we'd just expand the very same vmgenid driver to that
other one-off platform that happens to run without DT or ACPI.

So, overall, I still don't see any better path forward to get a "VM
cloned" event into systemd than the uevent.

Jason, could you please outline how your "other userspace-oriented work
you hope to get back to soon" would help with the systemd use case?


Alex

[1] https://lore.kernel.org/lkml/[email protected]/
[2] https://lore.kernel.org/lkml/[email protected]/




Amazon Development Center Germany GmbH
Krausenstr. 38
10117 Berlin
Geschaeftsfuehrung: Christian Schlaeger, Jonathan Weiss
Eingetragen am Amtsgericht Charlottenburg unter HRB 149173 B
Sitz: Berlin
Ust-ID: DE 289 237 879


2024-04-29 09:05:49

by Lennart Poettering

[permalink] [raw]
Subject: Re: [REGRESSION] Re: [PATCH] Revert "vmgenid: emit uevent when VMGENID updates"

On Fr, 26.04.24 14:52, Jason A. Donenfeld ([email protected]) wrote:

> I don't think adding UAPI to an individual device driver like this

Does vmgenid really qualify as "an individual device driver"? It's a
pretty generic software interface, implemented by various different
VMMs these days. It is also the only interface I am aware of that
actually exists and would provide the concept right now?

if this was really hyperv specific, then I'd agree it's just an
"individual device driver". But it's widely implemented, for example a
trivial command line switch in qemu.

Hence, for something this generic, and widely deployed with multiple
backend implementations I think we can say it's kinda more of a
subsystem and less of an individual driver, no?

> is a good approach especially considering that the virtio changes we
> discussed some time ago will likely augment this and create another
> means of a similar notification. And given that this intersects with
> other userspace-oriented work I hope to get back to pretty soon, I
> think introducing some adhoc mechanism like this adds clutter and
> isn't the ideal way forward.

If one day a virtio-based equivalent shows up, then I'd be entirely
fine with supporting this in userspace directly too , because virtio
too is a generic thing typically implemented by multiple VMM
backends. From my userspace perspective I see little benefit in the
kernel abstracting over vmgenid and virtio-genid (if that ever
materializes), as a systemd person I am not asking for this kind of
abstraction (in case anyone wonders). A generic ACPI device such as
vmgenid is entirely enough of "generic" for me.

The way we would process the event in userspace in systemd (from a
udev rule) is so generic that it's trivial to match against two
generic interfaces, instead of just one.

And even if there's value in a generic abstraction provided by the
kernel over both vmgenid and a future virtio-based thing: the kernel
patch in question was a *single* line, and our hookup in userspace
could easily be moved over when the day comes, because it's really not
a rocket science level interface. It's a single parameterless event,
how much easier could things get?

I understand that how this all happened wasn't to everyones wishes,
but do we really have to make all of this so complex if it could just
be so simple? Why delay this further, why go back again given the
event, the interface itself is such an utter triviality? Do we really
make such a threatre around a single line change, a single additional
uevent, just because of politics?

Lennart

--
Lennart Poettering, Berlin

2024-05-03 10:14:47

by Babis Chalios

[permalink] [raw]
Subject: Re: [REGRESSION] Re: [PATCH] Revert "vmgenid: emit uevent when VMGENID updates"

Hi Jason,

Friendly ping?

IMHO Lennart, Alex and myself have raised (what I think to be) valid
technical points regarding your concerns about your belief that the
uevent mechanism is an ad-hoc mechanism that you don't consider viable.

Just to summarize:

* Upon VM clone, user space needs to adjust various components (DHCP
leases, MAC addresses, etc.) that have nothing to do with PRNGs.
* The path of exposing the VM clone event via vgetrandom() (or any other
interface of random.c) is simply wrong. The random subsystem is the
natural component to inform about when cached entropy is stale. It
should not be responsible for informing user space about VM clone
events. IOW, "reseed your PRNGs" is not equivalent to "your VM has been
cloned".

Given all this, it would help the discussion if you explained why you
believe random.c should propagate VM clone events and how.

If you don't believe that, could you explain what is the problem with
the proposed uevent mechanism? Personally, I agree with Lennart that
VMGenID is a generic ACPI device built for exactly this purpose. VMGenID
is not an "ad-hoc driver". It is a standard which is supported by most
(all?) major VMMs out there today and its whole purpose is to deliver
inside the VM a notification that it was cloned.

Cheers,
Babis



2024-06-13 16:38:23

by Alexander Graf

[permalink] [raw]
Subject: Re: [REGRESSION] Re: [PATCH] Revert "vmgenid: emit uevent when VMGENID updates"

Hey Jason,

On 29.04.24 11:04, Lennart Poettering wrote:
> On Fr, 26.04.24 14:52, Jason A. Donenfeld ([email protected]) wrote:
>
>> I don't think adding UAPI to an individual device driver like this
> Does vmgenid really qualify as "an individual device driver"? It's a
> pretty generic software interface, implemented by various different
> VMMs these days. It is also the only interface I am aware of that
> actually exists and would provide the concept right now?
>
> if this was really hyperv specific, then I'd agree it's just an
> "individual device driver". But it's widely implemented, for example a
> trivial command line switch in qemu.
>
> Hence, for something this generic, and widely deployed with multiple
> backend implementations I think we can say it's kinda more of a
> subsystem and less of an individual driver, no?
>
>> is a good approach especially considering that the virtio changes we
>> discussed some time ago will likely augment this and create another
>> means of a similar notification. And given that this intersects with
>> other userspace-oriented work I hope to get back to pretty soon, I
>> think introducing some adhoc mechanism like this adds clutter and
>> isn't the ideal way forward.
> If one day a virtio-based equivalent shows up, then I'd be entirely
> fine with supporting this in userspace directly too , because virtio
> too is a generic thing typically implemented by multiple VMM
> backends. From my userspace perspective I see little benefit in the
> kernel abstracting over vmgenid and virtio-genid (if that ever
> materializes), as a systemd person I am not asking for this kind of
> abstraction (in case anyone wonders). A generic ACPI device such as
> vmgenid is entirely enough of "generic" for me.
>
> The way we would process the event in userspace in systemd (from a
> udev rule) is so generic that it's trivial to match against two
> generic interfaces, instead of just one.
>
> And even if there's value in a generic abstraction provided by the
> kernel over both vmgenid and a future virtio-based thing: the kernel
> patch in question was a *single* line, and our hookup in userspace
> could easily be moved over when the day comes, because it's really not
> a rocket science level interface. It's a single parameterless event,
> how much easier could things get?
>
> I understand that how this all happened wasn't to everyones wishes,
> but do we really have to make all of this so complex if it could just
> be so simple? Why delay this further, why go back again given the
> event, the interface itself is such an utter triviality? Do we really
> make such a threatre around a single line change, a single additional
> uevent, just because of politics?


Friendly ping again. We would really like to have a constructive
technical conversation and collaboration on how to make forward progress
with VM clone notifications for user space applications that hold unique
data and hence need to learn about VM clone events, outside of any
randomness semantics.


Thanks,

Alex





Amazon Web Services Development Center Germany GmbH
Krausenstr. 38
10117 Berlin
Geschaeftsfuehrung: Christian Schlaeger, Jonathan Weiss
Eingetragen am Amtsgericht Charlottenburg unter HRB 257764 B
Sitz: Berlin
Ust-ID: DE 365 538 597