2024-03-18 17:41:31

by Marc Zyngier

Subject: Re: [RFC PATCH v2 0/4] arm64: Add PSCI v1.3 SYSTEM_OFF2 support for hibernation

On Mon, 18 Mar 2024 17:26:07 +0000,
David Woodhouse <[email protected]> wrote:
>
> On Mon, 2024-03-18 at 16:57 +0000, Marc Zyngier wrote:
> >
> > >
> > > There *is* a way for a VMM to opt *out* of newer PSCI versions... by
> > > setting a per-vCPU "special" register that actually ends up setting the
> > > PSCI version KVM-wide. Quite why this isn't just a simple KVM_CAP, I
> > > have no idea.
> >
> > Because the expectation is that the VMM can blindly save and restore
> > the guest's state, including the PSCI version.
> > KVM CAPs are just a really bad design pattern for this sort of thing.
>
> Hm, am I missing something here? Does the *guest* get to set the PSCI
> version somehow, and opt into the latest version that it understands
> regardless of what the firmware/host can support?

No. The *VMM* sets the PSCI version by writing to a pseudo register.
It means that when the guest migrates, the VMM saves and restores that
version, and the guest doesn't see any change.
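As an illustration of the save/restore flow described above, here is a minimal sketch in C. It assumes the pseudo register is the one exposed by the arm64 UAPI headers as KVM_REG_ARM_PSCI_VERSION and accessed with KVM_GET_ONE_REG / KVM_SET_ONE_REG on a vCPU fd. The helper names (psci_version, vmm_get_psci_version, vmm_set_psci_version) are hypothetical and not taken from any VMM. The ioctl part only builds on arm64 Linux, so it is guarded.

```c
#include <stdint.h>

/* PSCI packs its version as major in bits [31:16], minor in bits [15:0]. */
static inline uint64_t psci_version(uint32_t major, uint32_t minor)
{
    return ((uint64_t)(major & 0xffffu) << 16) | (uint64_t)(minor & 0xffffu);
}

static inline uint32_t psci_version_major(uint64_t v)
{
    return (uint32_t)((v >> 16) & 0xffffu);
}

static inline uint32_t psci_version_minor(uint64_t v)
{
    return (uint32_t)(v & 0xffffu);
}

#if defined(__linux__) && defined(__aarch64__)
#include <sys/ioctl.h>
#include <linux/kvm.h>

/* Source side of a migration: read the version the guest currently sees.
 * Returns 0 on success, -1 with errno set on failure. */
static inline int vmm_get_psci_version(int vcpu_fd, uint64_t *version)
{
    struct kvm_one_reg reg = {
        .id   = KVM_REG_ARM_PSCI_VERSION,
        .addr = (uint64_t)(uintptr_t)version,
    };
    return ioctl(vcpu_fd, KVM_GET_ONE_REG, &reg);
}

/* Destination side: write the saved value back before running the guest.
 * The write can fail, so the return value should be checked. */
static inline int vmm_set_psci_version(int vcpu_fd, uint64_t version)
{
    struct kvm_one_reg reg = {
        .id   = KVM_REG_ARM_PSCI_VERSION,
        .addr = (uint64_t)(uintptr_t)&version,
    };
    return ioctl(vcpu_fd, KVM_SET_ONE_REG, &reg);
}
#endif
```

A VMM that already saves and restores every register it enumerates gets this for free; the explicit helpers only matter when it wants to pick the value itself.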

The host firmware has nothing to do with it, obviously. This is all
about KVM's own implementation of the "firmware", as seen by the guest.

> Because if not, surely it's just part of the basic shape of the
> machine, like "how many vCPUs does it have". You don't need to be able
> to query it back again.

Nobody needs to do this.

> I don't think we ever aspired to be able to hand an arbitrary KVM fd to
> a userspace VMM and have the VMM be able to drive that VM without
> having any a priori context, did we?

Arbitrary? No. This is actually very specific and pretty well
documented.

Also, to answer your question about why we treat 0.1 differently from
0.2+: 0.1 didn't specify the PSCI SMC/HVC encoding, meaning that KVM
implemented something that was never fully specified. The VMM has to
provide firmware tables that describe that. With 0.2+, there is a
standard encoding for all functions, and the VMM doesn't have to
provide the encoding to the guest.
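To make the "standard encoding" point concrete, here is a small sketch of how PSCI 0.2+ function IDs are laid out. The macro and function names are hypothetical, and the numeric values are quoted from memory of the PSCI specification, so check them against the spec before relying on them. Under 0.1 there is no such table: the IDs are whatever the firmware tables supplied by the VMM say they are.

```c
#include <stdint.h>
#include <stdbool.h>

/* Base of the PSCI 0.2+ function ID range (32-bit calling convention). */
#define PSCI_FN_BASE   0x84000000u
/* Bit set on top of the base for the 64-bit calling convention. */
#define PSCI_FN_64BIT  0x40000000u

/* Function numbers within the range (a subset, for illustration only). */
enum psci_fn_nr {
    PSCI_NR_VERSION      = 0x0,
    PSCI_NR_CPU_SUSPEND  = 0x1,
    PSCI_NR_CPU_OFF      = 0x2,
    PSCI_NR_CPU_ON       = 0x3,
    PSCI_NR_SYSTEM_OFF   = 0x8,
    PSCI_NR_SYSTEM_RESET = 0x9,
};

/* Build the ID the guest places in w0/x0 before issuing SMC or HVC. */
static inline uint32_t psci_fn_id(bool is_64bit, uint32_t nr)
{
    return PSCI_FN_BASE | (is_64bit ? PSCI_FN_64BIT : 0u) | nr;
}
```

Because these IDs are fixed by the specification, a 0.2+ guest only needs to be told which conduit to use, not the function numbers themselves.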

M.

--
Without deviation from the norm, progress is not possible.


2024-03-19 01:22:42

by David Woodhouse

Subject: Re: [RFC PATCH v2 0/4] arm64: Add PSCI v1.3 SYSTEM_OFF2 support for hibernation

On Mon, 2024-03-18 at 17:41 +0000, Marc Zyngier wrote:
> On Mon, 18 Mar 2024 17:26:07 +0000,
> David Woodhouse <[email protected]> wrote:
> >
> > On Mon, 2024-03-18 at 16:57 +0000, Marc Zyngier wrote:
> > >
> > > >
> > > > There *is* a way for a VMM to opt *out* of newer PSCI versions... by
> > > > setting a per-vCPU "special" register that actually ends up setting the
> > > > PSCI version KVM-wide. Quite why this isn't just a simple KVM_CAP, I
> > > > have no idea.
> > >
> > > Because the expectation is that the VMM can blindly save and restore
> > > the guest's state, including the PSCI version.
> > > KVM CAPs are just a really bad design pattern for this sort of thing.
> >
> > Hm, am I missing something here? Does the *guest* get to set the PSCI
> > version somehow, and opt into the latest version that it understands
> > regardless of what the firmware/host can support?
>
> No. The *VMM* sets the PSCI version by writing to a pseudo register.
> It means that when the guest migrates, the VMM saves and restores that
> version, and the guest doesn't see any change.

And when you boot a guest image which has been working for years under
a new kernel+KVM, your guest suddenly experiences a new PSCI version.
As I said, that's not just new optional functions; it's potentially even
returning new error codes to the functions that said guest was already
using.

And when you *hibernate* a guest and then launch it again under a newer
kernel+KVM, it experiences the same incompatibility.

Unless the VMM realises this problem and opts *out* of the newer KVM
behaviour, of course. This is very much unlike how we *normally* expose
new KVM capabilities.
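The opt-out a VMM would need can be sketched as a small policy function. This is only an illustration: choose_psci_version and the notion of a "pinned" version stored with the guest image are hypothetical, and the result would still have to be written to the PSCI version pseudo register before the guest runs.

```c
#include <stdint.h>

/*
 * Pick the PSCI version to expose to a guest.
 *
 * kvm_default: the version KVM reports on a freshly created VM.
 * pinned:      the version this guest image last ran with, or 0 if the
 *              VMM has no record (hypothetical VMM-side metadata).
 *
 * Versions are packed as (major << 16) | minor, so a plain numeric
 * comparison orders them correctly.
 */
static inline uint64_t choose_psci_version(uint64_t kvm_default, uint64_t pinned)
{
    /* Keep the older version the guest already knows, if there is one. */
    if (pinned != 0 && pinned < kvm_default)
        return pinned;

    /* Otherwise accept what this kernel offers; never ask for more. */
    return kvm_default;
}
```

Without a step like this, the guest sees whatever version the running kernel defaults to.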

> > I don't think we ever aspired to be able to hand an arbitrary KVM fd to
> > a userspace VMM and have the VMM be able to drive that VM without
> > having any a priori context, did we?
>
> Arbitrary? No. This is actually very specific and pretty well
> documented.
>
> Also, to answer your question about why we treat 0.1 differently from
> 0.2+: 0.1 didn't specify the PSCI SMC/HVC encoding, meaning that KVM
> implemented something that was never fully specified. The VMM has to
> provide firmware tables that describe that. With 0.2+, there is a
> standard encoding for all functions, and the VMM doesn't have to
> provide the encoding to the guest.

Gotcha. So for that case we were *forced* to do things correctly and
allow userspace to opt in to the capability. While for 0.2 onwards we
got away with this awfulness of silently upgrading the version without
VMM consent.

I was hoping to just follow the existing model of SYSTEM_RESET2 and not
have to touch this awfulness with a barge-pole, but sure, whatever you
want.

