2022-04-07 21:45:18

by Peter Gonda

[permalink] [raw]
Subject: [PATCH v4.1] KVM, SEV: Add KVM_EXIT_SHUTDOWN metadata for SEV-ES

If an SEV-ES guest requests termination, exit to userspace with
KVM_EXIT_SYSTEM_EVENT and a dedicated SEV_TERM type instead of -EINVAL
so that userspace can take appropriate action.

See AMD's GHCB spec section '4.1.13 Termination Request' for more details.

Suggested-by: Sean Christopherson <[email protected]>
Suggested-by: Paolo Bonzini <[email protected]>
Cc: [email protected]
Cc: [email protected]
Signed-off-by: Peter Gonda <[email protected]>

---
V4
* Updated to Sean and Paolo's suggestion of reworking the
kvm_run.system_event struct to ndata and data fields to fix the
padding.
* 4.1 Updated commit description

V3
* Add Documentation/ update.
* Updated other KVM_EXIT_SHUTDOWN exits to clear ndata and set reason
to KVM_SHUTDOWN_REQ.

V2
* Add KVM_CAP_EXIT_SHUTDOWN_REASON check for KVM_CHECK_EXTENSION.

Tested by making an SEV-ES guest call sev_es_terminate() with hardcoded
reason code set and reason code and then observing the codes from the
userspace VMM in the kvm_run.shutdown.data fields.

---
arch/x86/kvm/svm/sev.c | 9 +++++++--
include/uapi/linux/kvm.h | 5 ++++-
2 files changed, 11 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
index 75fa6dd268f0..1a080f3f09d8 100644
--- a/arch/x86/kvm/svm/sev.c
+++ b/arch/x86/kvm/svm/sev.c
@@ -2735,8 +2735,13 @@ static int sev_handle_vmgexit_msr_protocol(struct vcpu_svm *svm)
pr_info("SEV-ES guest requested termination: %#llx:%#llx\n",
reason_set, reason_code);

- ret = -EINVAL;
- break;
+ vcpu->run->exit_reason = KVM_EXIT_SYSTEM_EVENT;
+ vcpu->run->system_event.type = KVM_SYSTEM_EVENT_SEV_TERM |
+ KVM_SYSTEM_EVENT_NDATA_VALID;
+ vcpu->run->system_event.ndata = 1;
+ vcpu->run->system_event.data[1] = control->ghcb_gpa;
+
+ return 0;
}
default:
/* Error, keep GHCB MSR value as-is */
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index 8616af85dc5d..dd1d8167e71f 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -444,8 +444,11 @@ struct kvm_run {
#define KVM_SYSTEM_EVENT_SHUTDOWN 1
#define KVM_SYSTEM_EVENT_RESET 2
#define KVM_SYSTEM_EVENT_CRASH 3
+#define KVM_SYSTEM_EVENT_SEV_TERM 4
+#define KVM_SYSTEM_EVENT_NDATA_VALID (1u << 31)
__u32 type;
- __u64 flags;
+ __u32 ndata;
+ __u64 data[16];
} system_event;
/* KVM_EXIT_S390_STSI */
struct {
--
2.35.1.1178.g4f1659d476-goog


2022-04-08 03:27:33

by Sean Christopherson

[permalink] [raw]
Subject: Re: [PATCH v4.1] KVM, SEV: Add KVM_EXIT_SHUTDOWN metadata for SEV-ES

On Thu, Apr 07, 2022, Peter Gonda wrote:
> If an SEV-ES guest requests termination, exit to userspace with
> KVM_EXIT_SYSTEM_EVENT and a dedicated SEV_TERM type instead of -EINVAL
> so that userspace can take appropriate action.
>
> See AMD's GHCB spec section '4.1.13 Termination Request' for more details.

Maybe it'll be obvious by the lack of compilation errors, but the changelog should
call out the flags => ndata+data shenanigans, otherwise this looks like ABI breakage.

> Suggested-by: Sean Christopherson <[email protected]>
> Suggested-by: Paolo Bonzini <[email protected]>
> Cc: [email protected]
> Cc: [email protected]
> Signed-off-by: Peter Gonda <[email protected]>
>
> ---
> V4
> * Updated to Sean and Paolo's suggestion of reworking the
> kvm_run.system_event struct to ndata and data fields to fix the
> padding.
> * 4.1 Updated commit description
>
> V3
> * Add Documentation/ update.
> * Updated other KVM_EXIT_SHUTDOWN exits to clear ndata and set reason
> to KVM_SHUTDOWN_REQ.
>
> V2
> * Add KVM_CAP_EXIT_SHUTDOWN_REASON check for KVM_CHECK_EXTENSION.
>
> Tested by making an SEV-ES guest call sev_es_terminate() with hardcoded
> reason code set and reason code and then observing the codes from the
> userspace VMM in the kvm_run.shutdown.data fields.
>
> ---
> arch/x86/kvm/svm/sev.c | 9 +++++++--
> include/uapi/linux/kvm.h | 5 ++++-
> 2 files changed, 11 insertions(+), 3 deletions(-)
>
> diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
> index 75fa6dd268f0..1a080f3f09d8 100644
> --- a/arch/x86/kvm/svm/sev.c
> +++ b/arch/x86/kvm/svm/sev.c
> @@ -2735,8 +2735,13 @@ static int sev_handle_vmgexit_msr_protocol(struct vcpu_svm *svm)
> pr_info("SEV-ES guest requested termination: %#llx:%#llx\n",
> reason_set, reason_code);
>
> - ret = -EINVAL;
> - break;
> + vcpu->run->exit_reason = KVM_EXIT_SYSTEM_EVENT;
> + vcpu->run->system_event.type = KVM_SYSTEM_EVENT_SEV_TERM |
> + KVM_SYSTEM_EVENT_NDATA_VALID;
> + vcpu->run->system_event.ndata = 1;
> + vcpu->run->system_event.data[1] = control->ghcb_gpa;
> +
> + return 0;

Kinda silly, but

ret = 0;
break;

would be better so that this flows through the tracepoint. I wouldn't care much
if it didn't result in an unpaired "entry" tracepoint (and I still don't care that
much...).

> }
> default:
> /* Error, keep GHCB MSR value as-is */
> diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
> index 8616af85dc5d..dd1d8167e71f 100644
> --- a/include/uapi/linux/kvm.h
> +++ b/include/uapi/linux/kvm.h
> @@ -444,8 +444,11 @@ struct kvm_run {
> #define KVM_SYSTEM_EVENT_SHUTDOWN 1
> #define KVM_SYSTEM_EVENT_RESET 2
> #define KVM_SYSTEM_EVENT_CRASH 3
> +#define KVM_SYSTEM_EVENT_SEV_TERM 4
> +#define KVM_SYSTEM_EVENT_NDATA_VALID (1u << 31)
> __u32 type;
> - __u64 flags;
> + __u32 ndata;
> + __u64 data[16];
> } system_event;
> /* KVM_EXIT_S390_STSI */
> struct {
> --
> 2.35.1.1178.g4f1659d476-goog
>

2022-04-08 05:08:19

by kernel test robot

[permalink] [raw]
Subject: Re: [PATCH v4.1] KVM, SEV: Add KVM_EXIT_SHUTDOWN metadata for SEV-ES

Hi Peter,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on kvm/master]
[also build test ERROR on v5.18-rc1 next-20220407]
[cannot apply to v4.1]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url: https://github.com/intel-lab-lkp/linux/commits/Peter-Gonda/KVM-SEV-Add-KVM_EXIT_SHUTDOWN-metadata-for-SEV-ES/20220408-050628
base: https://git.kernel.org/pub/scm/virt/kvm/kvm.git master
config: arm64-randconfig-r035-20220408 (https://download.01.org/0day-ci/archive/20220408/[email protected]/config)
compiler: clang version 15.0.0 (https://github.com/llvm/llvm-project c29a51b3a257908aebc01cd7c4655665db317d66)
reproduce (this is a W=1 build):
wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# install arm64 cross compiling tool for clang build
# apt-get install binutils-aarch64-linux-gnu
# https://github.com/intel-lab-lkp/linux/commit/3b310e5891d172b59042783c128f6efcf5bf6198
git remote add linux-review https://github.com/intel-lab-lkp/linux
git fetch --no-tags linux-review Peter-Gonda/KVM-SEV-Add-KVM_EXIT_SHUTDOWN-metadata-for-SEV-ES/20220408-050628
git checkout 3b310e5891d172b59042783c128f6efcf5bf6198
# save the config file to linux build tree
mkdir build_dir
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross W=1 O=build_dir ARCH=arm64 SHELL=/bin/bash arch/arm64/kvm/

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <[email protected]>

All errors (new ones prefixed by >>):

>> arch/arm64/kvm/psci.c:184:26: error: no member named 'flags' in 'struct kvm_run::(unnamed at include/uapi/linux/kvm.h:443:3)'
vcpu->run->system_event.flags = flags;
~~~~~~~~~~~~~~~~~~~~~~~ ^
1 error generated.


vim +184 arch/arm64/kvm/psci.c

e6bc13c8a70eab arch/arm/kvm/psci.c Anup Patel 2014-04-29 163
34739fd95fab3a arch/arm64/kvm/psci.c Will Deacon 2022-02-21 164 static void kvm_prepare_system_event(struct kvm_vcpu *vcpu, u32 type, u64 flags)
4b1238269ed340 arch/arm/kvm/psci.c Anup Patel 2014-04-29 165 {
46808a4cb89708 arch/arm64/kvm/psci.c Marc Zyngier 2021-11-16 166 unsigned long i;
cf5d318865e25f arch/arm/kvm/psci.c Christoffer Dall 2014-10-16 167 struct kvm_vcpu *tmp;
cf5d318865e25f arch/arm/kvm/psci.c Christoffer Dall 2014-10-16 168
cf5d318865e25f arch/arm/kvm/psci.c Christoffer Dall 2014-10-16 169 /*
cf5d318865e25f arch/arm/kvm/psci.c Christoffer Dall 2014-10-16 170 * The KVM ABI specifies that a system event exit may call KVM_RUN
cf5d318865e25f arch/arm/kvm/psci.c Christoffer Dall 2014-10-16 171 * again and may perform shutdown/reboot at a later time that when the
cf5d318865e25f arch/arm/kvm/psci.c Christoffer Dall 2014-10-16 172 * actual request is made. Since we are implementing PSCI and a
cf5d318865e25f arch/arm/kvm/psci.c Christoffer Dall 2014-10-16 173 * caller of PSCI reboot and shutdown expects that the system shuts
cf5d318865e25f arch/arm/kvm/psci.c Christoffer Dall 2014-10-16 174 * down or reboots immediately, let's make sure that VCPUs are not run
cf5d318865e25f arch/arm/kvm/psci.c Christoffer Dall 2014-10-16 175 * after this call is handled and before the VCPUs have been
cf5d318865e25f arch/arm/kvm/psci.c Christoffer Dall 2014-10-16 176 * re-initialized.
cf5d318865e25f arch/arm/kvm/psci.c Christoffer Dall 2014-10-16 177 */
cc9b43f99d5ff4 virt/kvm/arm/psci.c Andrew Jones 2017-06-04 178 kvm_for_each_vcpu(i, tmp, vcpu->kvm)
3781528e3045e7 arch/arm/kvm/psci.c Eric Auger 2015-09-25 179 tmp->arch.power_off = true;
7b244e2be654d9 virt/kvm/arm/psci.c Andrew Jones 2017-06-04 180 kvm_make_all_cpus_request(vcpu->kvm, KVM_REQ_SLEEP);
cf5d318865e25f arch/arm/kvm/psci.c Christoffer Dall 2014-10-16 181
4b1238269ed340 arch/arm/kvm/psci.c Anup Patel 2014-04-29 182 memset(&vcpu->run->system_event, 0, sizeof(vcpu->run->system_event));
4b1238269ed340 arch/arm/kvm/psci.c Anup Patel 2014-04-29 183 vcpu->run->system_event.type = type;
34739fd95fab3a arch/arm64/kvm/psci.c Will Deacon 2022-02-21 @184 vcpu->run->system_event.flags = flags;
4b1238269ed340 arch/arm/kvm/psci.c Anup Patel 2014-04-29 185 vcpu->run->exit_reason = KVM_EXIT_SYSTEM_EVENT;
4b1238269ed340 arch/arm/kvm/psci.c Anup Patel 2014-04-29 186 }
4b1238269ed340 arch/arm/kvm/psci.c Anup Patel 2014-04-29 187

--
0-DAY CI Kernel Test Service
https://01.org/lkp

2022-04-08 05:48:13

by kernel test robot

[permalink] [raw]
Subject: Re: [PATCH v4.1] KVM, SEV: Add KVM_EXIT_SHUTDOWN metadata for SEV-ES

Hi Peter,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on kvm/master]
[also build test ERROR on v5.18-rc1 next-20220407]
[cannot apply to v4.1]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url: https://github.com/intel-lab-lkp/linux/commits/Peter-Gonda/KVM-SEV-Add-KVM_EXIT_SHUTDOWN-metadata-for-SEV-ES/20220408-050628
base: https://git.kernel.org/pub/scm/virt/kvm/kvm.git master
config: riscv-randconfig-c006-20220408 (https://download.01.org/0day-ci/archive/20220408/[email protected]/config)
compiler: clang version 15.0.0 (https://github.com/llvm/llvm-project c29a51b3a257908aebc01cd7c4655665db317d66)
reproduce (this is a W=1 build):
wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# install riscv cross compiling tool for clang build
# apt-get install binutils-riscv64-linux-gnu
# https://github.com/intel-lab-lkp/linux/commit/3b310e5891d172b59042783c128f6efcf5bf6198
git remote add linux-review https://github.com/intel-lab-lkp/linux
git fetch --no-tags linux-review Peter-Gonda/KVM-SEV-Add-KVM_EXIT_SHUTDOWN-metadata-for-SEV-ES/20220408-050628
git checkout 3b310e5891d172b59042783c128f6efcf5bf6198
# save the config file to linux build tree
mkdir build_dir
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross W=1 O=build_dir ARCH=riscv SHELL=/bin/bash

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <[email protected]>

All errors (new ones prefixed by >>):

>> arch/riscv/kvm/vcpu_sbi.c:97:20: error: no member named 'flags' in 'struct kvm_run::(unnamed at include/uapi/linux/kvm.h:443:3)'
run->system_event.flags = flags;
~~~~~~~~~~~~~~~~~ ^
1 error generated.


vim +97 arch/riscv/kvm/vcpu_sbi.c

dea8ee31a03927 Atish Patra 2021-09-27 83
4b11d86571c447 Anup Patel 2022-01-31 84 void kvm_riscv_vcpu_sbi_system_reset(struct kvm_vcpu *vcpu,
4b11d86571c447 Anup Patel 2022-01-31 85 struct kvm_run *run,
4b11d86571c447 Anup Patel 2022-01-31 86 u32 type, u64 flags)
4b11d86571c447 Anup Patel 2022-01-31 87 {
4b11d86571c447 Anup Patel 2022-01-31 88 unsigned long i;
4b11d86571c447 Anup Patel 2022-01-31 89 struct kvm_vcpu *tmp;
4b11d86571c447 Anup Patel 2022-01-31 90
4b11d86571c447 Anup Patel 2022-01-31 91 kvm_for_each_vcpu(i, tmp, vcpu->kvm)
4b11d86571c447 Anup Patel 2022-01-31 92 tmp->arch.power_off = true;
4b11d86571c447 Anup Patel 2022-01-31 93 kvm_make_all_cpus_request(vcpu->kvm, KVM_REQ_SLEEP);
4b11d86571c447 Anup Patel 2022-01-31 94
4b11d86571c447 Anup Patel 2022-01-31 95 memset(&run->system_event, 0, sizeof(run->system_event));
4b11d86571c447 Anup Patel 2022-01-31 96 run->system_event.type = type;
4b11d86571c447 Anup Patel 2022-01-31 @97 run->system_event.flags = flags;
4b11d86571c447 Anup Patel 2022-01-31 98 run->exit_reason = KVM_EXIT_SYSTEM_EVENT;
4b11d86571c447 Anup Patel 2022-01-31 99 }
4b11d86571c447 Anup Patel 2022-01-31 100

--
0-DAY CI Kernel Test Service
https://01.org/lkp

2022-04-08 18:25:54

by Peter Gonda

[permalink] [raw]
Subject: Re: [PATCH v4.1] KVM, SEV: Add KVM_EXIT_SHUTDOWN metadata for SEV-ES

On Thu, Apr 7, 2022 at 8:55 PM Sean Christopherson <[email protected]> wrote:
>
> On Thu, Apr 07, 2022, Peter Gonda wrote:
> > If an SEV-ES guest requests termination, exit to userspace with
> > KVM_EXIT_SYSTEM_EVENT and a dedicated SEV_TERM type instead of -EINVAL
> > so that userspace can take appropriate action.
> >
> > See AMD's GHCB spec section '4.1.13 Termination Request' for more details.
>
> Maybe it'll be obvious by the lack of compilation errors, but the changelog should
> call out the flags => ndata+data shenanigans, otherwise this looks like ABI breakage.

Hmm I am not sure we can do this change anymore given that we have two
call sites using 'flags'

arch/arm64/kvm/psci.c:184
arch/riscv/kvm/vcpu_sbi.c:97

I am not at all familiar with ARM and RISC-V but some quick reading
tells me these archs also require 64-bit alignment on their 64-bit
accesses. If thats correct, should I fix this call sites up by
proceeding with this ndata + data[] change and move whatever they are
assigning to flags into data[0] like I am doing here? It looks like
both of these changes are not in a kernel release so IIUC we can still
fix the ABI here?

>
> > Suggested-by: Sean Christopherson <[email protected]>
> > Suggested-by: Paolo Bonzini <[email protected]>
> > Cc: [email protected]
> > Cc: [email protected]
> > Signed-off-by: Peter Gonda <[email protected]>
> >
> > ---
> > V4
> > * Updated to Sean and Paolo's suggestion of reworking the
> > kvm_run.system_event struct to ndata and data fields to fix the
> > padding.
> > * 4.1 Updated commit description
> >
> > V3
> > * Add Documentation/ update.
> > * Updated other KVM_EXIT_SHUTDOWN exits to clear ndata and set reason
> > to KVM_SHUTDOWN_REQ.
> >
> > V2
> > * Add KVM_CAP_EXIT_SHUTDOWN_REASON check for KVM_CHECK_EXTENSION.
> >
> > Tested by making an SEV-ES guest call sev_es_terminate() with hardcoded
> > reason code set and reason code and then observing the codes from the
> > userspace VMM in the kvm_run.shutdown.data fields.
> >
> > ---
> > arch/x86/kvm/svm/sev.c | 9 +++++++--
> > include/uapi/linux/kvm.h | 5 ++++-
> > 2 files changed, 11 insertions(+), 3 deletions(-)
> >
> > diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
> > index 75fa6dd268f0..1a080f3f09d8 100644
> > --- a/arch/x86/kvm/svm/sev.c
> > +++ b/arch/x86/kvm/svm/sev.c
> > @@ -2735,8 +2735,13 @@ static int sev_handle_vmgexit_msr_protocol(struct vcpu_svm *svm)
> > pr_info("SEV-ES guest requested termination: %#llx:%#llx\n",
> > reason_set, reason_code);
> >
> > - ret = -EINVAL;
> > - break;
> > + vcpu->run->exit_reason = KVM_EXIT_SYSTEM_EVENT;
> > + vcpu->run->system_event.type = KVM_SYSTEM_EVENT_SEV_TERM |
> > + KVM_SYSTEM_EVENT_NDATA_VALID;
> > + vcpu->run->system_event.ndata = 1;
> > + vcpu->run->system_event.data[1] = control->ghcb_gpa;
> > +
> > + return 0;
>
> Kinda silly, but
>
> ret = 0;
> break;
>
> would be better so that this flows through the tracepoint. I wouldn't care much
> if it didn't result in an unpaired "entry" tracepoint (and I still don't care that
> much...).

Ah I'll fix that up.


>
> > }
> > default:
> > /* Error, keep GHCB MSR value as-is */
> > diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
> > index 8616af85dc5d..dd1d8167e71f 100644
> > --- a/include/uapi/linux/kvm.h
> > +++ b/include/uapi/linux/kvm.h
> > @@ -444,8 +444,11 @@ struct kvm_run {
> > #define KVM_SYSTEM_EVENT_SHUTDOWN 1
> > #define KVM_SYSTEM_EVENT_RESET 2
> > #define KVM_SYSTEM_EVENT_CRASH 3
> > +#define KVM_SYSTEM_EVENT_SEV_TERM 4
> > +#define KVM_SYSTEM_EVENT_NDATA_VALID (1u << 31)
> > __u32 type;
> > - __u64 flags;
> > + __u32 ndata;
> > + __u64 data[16];
> > } system_event;
> > /* KVM_EXIT_S390_STSI */
> > struct {
> > --
> > 2.35.1.1178.g4f1659d476-goog
> >

2022-04-09 13:25:20

by Sean Christopherson

[permalink] [raw]
Subject: Re: [PATCH v4.1] KVM, SEV: Add KVM_EXIT_SHUTDOWN metadata for SEV-ES

+Anup and Will

On Fri, Apr 08, 2022, Peter Gonda wrote:
> On Thu, Apr 7, 2022 at 8:55 PM Sean Christopherson <[email protected]> wrote:
> >
> > On Thu, Apr 07, 2022, Peter Gonda wrote:
> > > If an SEV-ES guest requests termination, exit to userspace with
> > > KVM_EXIT_SYSTEM_EVENT and a dedicated SEV_TERM type instead of -EINVAL
> > > so that userspace can take appropriate action.
> > >
> > > See AMD's GHCB spec section '4.1.13 Termination Request' for more details.
> >
> > Maybe it'll be obvious by the lack of compilation errors, but the changelog should
> > call out the flags => ndata+data shenanigans, otherwise this looks like ABI breakage.
>
> Hmm I am not sure we can do this change anymore given that we have two
> call sites using 'flags'
>
> arch/arm64/kvm/psci.c:184
> arch/riscv/kvm/vcpu_sbi.c:97
>
> I am not at all familiar with ARM and RISC-V but some quick reading
> tells me these archs also require 64-bit alignment on their 64-bit
> accesses. If thats correct, should I fix this call sites up by
> proceeding with this ndata + data[] change and move whatever they are
> assigning to flags into data[0] like I am doing here? It looks like
> both of these changes are not in a kernel release so IIUC we can still
> fix the ABI here?

Yeah, both came in for v5.18. Given that there will be multiple paths that need
to set data, it's worth adding a common helper to the dirty work.

Anup and Will,

system_event.flags is broken (at least on x86) due to the prior 'type' field not
being propery padded, e.g. userspace will read/write garbage if the userspace
and kernel compilers pad structs differently.

struct {
__u32 type;
__u64 flags;
} system_event;

Our plan to unhose this is to change the struct as follows and use bit 31 in the
'type' to indicate that ndata+data are valid.

struct {
__u32 type;
__u32 ndata;
__u64 data[16];
} system_event;

Any objection to updating your architectures to use a helper to set the bit and
populate ndata+data accordingly? It'll require a userspace update, but v5.18
hasn't officially released yet so it's not kinda sort not ABI breakage.

2022-04-09 18:04:39

by Paolo Bonzini

[permalink] [raw]
Subject: Re: [PATCH v4.1] KVM, SEV: Add KVM_EXIT_SHUTDOWN metadata for SEV-ES

Queued, thanks. But documentation was missing:

diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
index e7a0dfdc0178..72183ae628f7 100644
--- a/Documentation/virt/kvm/api.rst
+++ b/Documentation/virt/kvm/api.rst
@@ -6088,8 +6088,12 @@ should put the acknowledged interrupt vector into the 'epr' field.
#define KVM_SYSTEM_EVENT_SHUTDOWN 1
#define KVM_SYSTEM_EVENT_RESET 2
#define KVM_SYSTEM_EVENT_CRASH 3
+ #define KVM_SYSTEM_EVENT_SEV_TERM 4
+ #define KVM_SYSTEM_EVENT_NDATA_VALID (1u << 31)
__u32 type;
+ __u32 ndata;
__u64 flags;
+ __u64 data[16];
} system_event;

If exit_reason is KVM_EXIT_SYSTEM_EVENT then the vcpu has triggered
@@ -6099,7 +6103,7 @@ HVC instruction based PSCI call from the vcpu. The 'type' field describes
the system-level event type. The 'flags' field describes architecture
specific flags for the system-level event.

-Valid values for 'type' are:
+Valid values for bits 30:0 of 'type' are:

- KVM_SYSTEM_EVENT_SHUTDOWN -- the guest has requested a shutdown of the
VM. Userspace is not obliged to honour this, and if it does honour
@@ -6112,12 +6116,18 @@ Valid values for 'type' are:
has requested a crash condition maintenance. Userspace can choose
to ignore the request, or to gather VM memory core dump and/or
reset/shutdown of the VM.
+ - KVM_SYSTEM_EVENT_SEV_TERM -- an AMD SEV guest requested termination.
+ The guest physical address of the guest's GHCB is stored in `data[0]`.

Valid flags are:

- KVM_SYSTEM_EVENT_RESET_FLAG_PSCI_RESET2 (arm64 only) -- the guest issued
a SYSTEM_RESET2 call according to v1.1 of the PSCI specification.

+Extra data for this event is stored in the `data[]` array, up to index
+`ndata-1` included, if bit 31 is set in `type`. The data depends on the
+`type` field. There is no extra data if bit 31 is clear or `ndata` is zero.
+
::

/* KVM_EXIT_IOAPIC_EOI */


Paolo


2022-04-12 06:57:29

by Sean Christopherson

[permalink] [raw]
Subject: Re: [PATCH v4.1] KVM, SEV: Add KVM_EXIT_SHUTDOWN metadata for SEV-ES

On Mon, Apr 11, 2022, Marc Zyngier wrote:
> On Fri, 08 Apr 2022 17:56:42 +0100,
> Paolo Bonzini <[email protected]> wrote:
> >
> > Queued, thanks. But documentation was missing:
> >
> > diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
> > index e7a0dfdc0178..72183ae628f7 100644
> > --- a/Documentation/virt/kvm/api.rst
> > +++ b/Documentation/virt/kvm/api.rst
> > @@ -6088,8 +6088,12 @@ should put the acknowledged interrupt vector into the 'epr' field.
> > #define KVM_SYSTEM_EVENT_SHUTDOWN 1
> > #define KVM_SYSTEM_EVENT_RESET 2
> > #define KVM_SYSTEM_EVENT_CRASH 3
> > + #define KVM_SYSTEM_EVENT_SEV_TERM 4
> > + #define KVM_SYSTEM_EVENT_NDATA_VALID (1u << 31)
> > __u32 type;
> > + __u32 ndata;
> > __u64 flags;
> > + __u64 data[16];
> > } system_event;
> >
> > If exit_reason is KVM_EXIT_SYSTEM_EVENT then the vcpu has triggered
> > @@ -6099,7 +6103,7 @@ HVC instruction based PSCI call from the vcpu. The 'type' field describes
> > the system-level event type. The 'flags' field describes architecture
> > specific flags for the system-level event.
> >
> > -Valid values for 'type' are:
> > +Valid values for bits 30:0 of 'type' are:
> >
> > - KVM_SYSTEM_EVENT_SHUTDOWN -- the guest has requested a shutdown of the
> > VM. Userspace is not obliged to honour this, and if it does honour
> > @@ -6112,12 +6116,18 @@ Valid values for 'type' are:
> > has requested a crash condition maintenance. Userspace can choose
> > to ignore the request, or to gather VM memory core dump and/or
> > reset/shutdown of the VM.
> > + - KVM_SYSTEM_EVENT_SEV_TERM -- an AMD SEV guest requested termination.
> > + The guest physical address of the guest's GHCB is stored in `data[0]`.
> >
> > Valid flags are:
> >
> > - KVM_SYSTEM_EVENT_RESET_FLAG_PSCI_RESET2 (arm64 only) -- the guest issued
> > a SYSTEM_RESET2 call according to v1.1 of the PSCI specification.
> >
> > +Extra data for this event is stored in the `data[]` array, up to index
> > +`ndata-1` included, if bit 31 is set in `type`. The data depends on the
> > +`type` field. There is no extra data if bit 31 is clear or `ndata` is zero.
> > +
>
> This has the potential to break userspace as it expects a strict match
> on the whole of 'type', and does not expect to treat it as a bitfield.
>
> Case in point, QEMU:
>
> accel/kvm/kvm-all.c::kvm_cpu_exec()
>
> case KVM_EXIT_SYSTEM_EVENT:
> switch (run->system_event.type) {
>
> CrosVM and kvmtool have similar constructs, and will break as soon as
> KVM_SYSTEM_EVENT_NDATA_VALID is or'ed into 'type'.

Yeah, if we go this route, we'd have to make sure to document that only new types
can use the flag.

2022-04-12 07:06:44

by Will Deacon

[permalink] [raw]
Subject: Re: [PATCH v4.1] KVM, SEV: Add KVM_EXIT_SHUTDOWN metadata for SEV-ES

Hi Sean,

Cheers for the heads-up.

[+Marc and Alex as this looks similar to [1]]

On Fri, Apr 08, 2022 at 05:01:21PM +0000, Sean Christopherson wrote:
> On Fri, Apr 08, 2022, Peter Gonda wrote:
> > On Thu, Apr 7, 2022 at 8:55 PM Sean Christopherson <[email protected]> wrote:
> > > On Thu, Apr 07, 2022, Peter Gonda wrote:
> > > > If an SEV-ES guest requests termination, exit to userspace with
> > > > KVM_EXIT_SYSTEM_EVENT and a dedicated SEV_TERM type instead of -EINVAL
> > > > so that userspace can take appropriate action.
> > > >
> > > > See AMD's GHCB spec section '4.1.13 Termination Request' for more details.
> > >
> > > Maybe it'll be obvious by the lack of compilation errors, but the changelog should
> > > call out the flags => ndata+data shenanigans, otherwise this looks like ABI breakage.
> >
> > Hmm I am not sure we can do this change anymore given that we have two
> > call sites using 'flags'
> >
> > arch/arm64/kvm/psci.c:184
> > arch/riscv/kvm/vcpu_sbi.c:97
> >
> > I am not at all familiar with ARM and RISC-V but some quick reading
> > tells me these archs also require 64-bit alignment on their 64-bit
> > accesses. If thats correct, should I fix this call sites up by
> > proceeding with this ndata + data[] change and move whatever they are
> > assigning to flags into data[0] like I am doing here? It looks like
> > both of these changes are not in a kernel release so IIUC we can still
> > fix the ABI here?
>
> Yeah, both came in for v5.18. Given that there will be multiple paths that need
> to set data, it's worth adding a common helper to the dirty work.
>
> Anup and Will,
>
> system_event.flags is broken (at least on x86) due to the prior 'type' field not
> being propery padded, e.g. userspace will read/write garbage if the userspace
> and kernel compilers pad structs differently.
>
> struct {
> __u32 type;
> __u64 flags;
> } system_event;

On arm64, I think the compiler is required to put the padding between type
and flags so that both the struct and 'flags' are 64-bit aligned [2]. Does
x86 not offer any guarantees on the overall structure alignment?

> Our plan to unhose this is to change the struct as follows and use bit 31 in the
> 'type' to indicate that ndata+data are valid.
>
> struct {
> __u32 type;
> __u32 ndata;
> __u64 data[16];
> } system_event;
>
> Any objection to updating your architectures to use a helper to set the bit and
> populate ndata+data accordingly? It'll require a userspace update, but v5.18
> hasn't officially released yet so it's not kinda sort not ABI breakage.

It's a bit annoying, as we're using the current structure in Android 13 :/
Obviously, if there's no choice then upstream shouldn't worry, but it means
we'll have to carry a delta in crosvm. Specifically, the new 'ndata' field
is going to be unusable for us because it coincides with the padding.

Will

[1] https://lore.kernel.org/r/[email protected]
[2] https://github.com/ARM-software/abi-aa/blob/60a8eb8c55e999d74dac5e368fc9d7e36e38dda4/aapcs64/aapcs64.rst#composite-types

2022-04-12 12:32:22

by Alexandru Elisei

[permalink] [raw]
Subject: Re: [PATCH v4.1] KVM, SEV: Add KVM_EXIT_SHUTDOWN metadata for SEV-ES

Hi,

On Mon, Apr 11, 2022 at 10:12:13AM +0100, Will Deacon wrote:
> Hi Sean,
>
> Cheers for the heads-up.
>
> [+Marc and Alex as this looks similar to [1]]
>
> On Fri, Apr 08, 2022 at 05:01:21PM +0000, Sean Christopherson wrote:
> > On Fri, Apr 08, 2022, Peter Gonda wrote:
> > > On Thu, Apr 7, 2022 at 8:55 PM Sean Christopherson <[email protected]> wrote:
> > > > On Thu, Apr 07, 2022, Peter Gonda wrote:
> > > > > If an SEV-ES guest requests termination, exit to userspace with
> > > > > KVM_EXIT_SYSTEM_EVENT and a dedicated SEV_TERM type instead of -EINVAL
> > > > > so that userspace can take appropriate action.
> > > > >
> > > > > See AMD's GHCB spec section '4.1.13 Termination Request' for more details.
> > > >
> > > > Maybe it'll be obvious by the lack of compilation errors, but the changelog should
> > > > call out the flags => ndata+data shenanigans, otherwise this looks like ABI breakage.
> > >
> > > Hmm I am not sure we can do this change anymore given that we have two
> > > call sites using 'flags'
> > >
> > > arch/arm64/kvm/psci.c:184
> > > arch/riscv/kvm/vcpu_sbi.c:97
> > >
> > > I am not at all familiar with ARM and RISC-V but some quick reading
> > > tells me these archs also require 64-bit alignment on their 64-bit
> > > accesses. If thats correct, should I fix this call sites up by
> > > proceeding with this ndata + data[] change and move whatever they are
> > > assigning to flags into data[0] like I am doing here? It looks like
> > > both of these changes are not in a kernel release so IIUC we can still
> > > fix the ABI here?
> >
> > Yeah, both came in for v5.18. Given that there will be multiple paths that need
> > to set data, it's worth adding a common helper to the dirty work.
> >
> > Anup and Will,
> >
> > system_event.flags is broken (at least on x86) due to the prior 'type' field not
> > being propery padded, e.g. userspace will read/write garbage if the userspace
> > and kernel compilers pad structs differently.
> >
> > struct {
> > __u32 type;
> > __u64 flags;
> > } system_event;
>
> On arm64, I think the compiler is required to put the padding between type
> and flags so that both the struct and 'flags' are 64-bit aligned [2]. Does
> x86 not offer any guarantees on the overall structure alignment?

This is also my understanding. The "Procedure Call Standard for the Arm
64-bit Architecture" [1] has these rules for structs (called "aggregates"):

"An aggregate, where the members are laid out sequentially in memory
(possibly with inter-member padding)" => the field "type" is at offset 0 in
the struct.

"The member alignment of an element of a composite type is the alignment of
that member after the application of any language alignment modifiers to
that member" => "flags" is 8-byte aligned and "type" is 4-byte aligned.

"The alignment of an aggregate shall be the alignment of its most-aligned
member." => struct system_event is 8-byte aligned.

and finally:

"The size of an aggregate shall be the smallest multiple of its alignment
that is sufficient to hold all of its members."

I think this is the only possible layout that satisfies all the above
conditions:

offset 0-3: type (at an 8-byte aligned address in memory)
offset 4-7: padding
offset 8-15: flags

so under all compilers that correctly implement the architecture the struct
will have the same layout.

[1] https://github.com/ARM-software/abi-aa/releases/download/2021Q3/aapcs64.pdf

>
> > Our plan to unhose this is to change the struct as follows and use bit 31 in the
> > 'type' to indicate that ndata+data are valid.
> >
> > struct {
> > __u32 type;
> > __u32 ndata;
> > __u64 data[16];
> > } system_event;
> >
> > Any objection to updating your architectures to use a helper to set the bit and
> > populate ndata+data accordingly? It'll require a userspace update, but v5.18
> > hasn't officially released yet so it's not kinda sort not ABI breakage.
>
> It's a bit annoying, as we're using the current structure in Android 13 :/
> Obviously, if there's no choice then upstream shouldn't worry, but it means
> we'll have to carry a delta in crosvm. Specifically, the new 'ndata' field
> is going to be unusable for us because it coincides with the padding.

Just a thought, but wouldn't such a drastical change be better implemented
as a new exit_reason and a new associated struct?

Thanks,
Alex

>
> Will
>
> [1] https://lore.kernel.org/r/[email protected]
> [2] https://github.com/ARM-software/abi-aa/blob/60a8eb8c55e999d74dac5e368fc9d7e36e38dda4/aapcs64/aapcs64.rst#composite-types

2022-04-12 21:12:36

by Sean Christopherson

[permalink] [raw]
Subject: Re: [PATCH v4.1] KVM, SEV: Add KVM_EXIT_SHUTDOWN metadata for SEV-ES

On Mon, Apr 11, 2022, Alexandru Elisei wrote:
> Hi,
>
> On Mon, Apr 11, 2022 at 10:12:13AM +0100, Will Deacon wrote:
> > Hi Sean,
> >
> > Cheers for the heads-up.
> >
> > [+Marc and Alex as this looks similar to [1]]
> >
> > On Fri, Apr 08, 2022 at 05:01:21PM +0000, Sean Christopherson wrote:
> > > system_event.flags is broken (at least on x86) due to the prior 'type' field not
> > > being propery padded, e.g. userspace will read/write garbage if the userspace
> > > and kernel compilers pad structs differently.
> > >
> > > struct {
> > > __u32 type;
> > > __u64 flags;
> > > } system_event;
> >
> > On arm64, I think the compiler is required to put the padding between type
> > and flags so that both the struct and 'flags' are 64-bit aligned [2]. Does
> > x86 not offer any guarantees on the overall structure alignment?
>
> This is also my understanding. The "Procedure Call Standard for the Arm
> 64-bit Architecture" [1] has these rules for structs (called "aggregates"):

AFAIK, all x86 compilers will pad structures accordingly, but a 32-bit userspace
running against a 64-bit kernel will have different alignment requirements, i.e.
won't pad, and x86 supports CONFIG_KVM_COMPAT=y. And I have no idea what x86's
bizarre x32 ABI does.

> > > Our plan to unhose this is to change the struct as follows and use bit 31 in the
> > > 'type' to indicate that ndata+data are valid.
> > >
> > > struct {
> > > __u32 type;
> > > __u32 ndata;
> > > __u64 data[16];
> > > } system_event;
> > >
> > > Any objection to updating your architectures to use a helper to set the bit and
> > > populate ndata+data accordingly? It'll require a userspace update, but v5.18
> > > hasn't officially released yet so it's not kinda sort not ABI breakage.
> >
> > It's a bit annoying, as we're using the current structure in Android 13 :/
> > Obviously, if there's no choice then upstream shouldn't worry, but it means
> > we'll have to carry a delta in crosvm. Specifically, the new 'ndata' field
> > is going to be unusable for us because it coincides with the padding.

Yeah, it'd be unusuable for existing types. One idea is that we could define the
ABI to be that the RESET and SHUTDOWN types have an implicit ndata=1 on arm64 and
RISC-V. That would allow keeping the flags interpretation and so long as crosvm
doesn't do something stupid like compile with "pragma pack" (does clang even support
that?), there's no delta necessary for Android.

> Just a thought, but wouldn't such a drastical change be better implemented
> as a new exit_reason and a new associated struct?

Maybe? I wasn't aware that arm64/RISC-V picked up usage of "flags" when I
suggested this, but I'm not sure it would have changed anything. We could add
SYSTEM_EVENT2 or whatever, but since there's no official usage of flags, it seems
a bit gratutious.

2022-04-12 22:23:18

by Marc Zyngier

[permalink] [raw]
Subject: Re: [PATCH v4.1] KVM, SEV: Add KVM_EXIT_SHUTDOWN metadata for SEV-ES

On Fri, 08 Apr 2022 17:56:42 +0100,
Paolo Bonzini <[email protected]> wrote:
>
> Queued, thanks. But documentation was missing:
>
> diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
> index e7a0dfdc0178..72183ae628f7 100644
> --- a/Documentation/virt/kvm/api.rst
> +++ b/Documentation/virt/kvm/api.rst
> @@ -6088,8 +6088,12 @@ should put the acknowledged interrupt vector into the 'epr' field.
> #define KVM_SYSTEM_EVENT_SHUTDOWN 1
> #define KVM_SYSTEM_EVENT_RESET 2
> #define KVM_SYSTEM_EVENT_CRASH 3
> + #define KVM_SYSTEM_EVENT_SEV_TERM 4
> + #define KVM_SYSTEM_EVENT_NDATA_VALID (1u << 31)
> __u32 type;
> + __u32 ndata;
> __u64 flags;
> + __u64 data[16];
> } system_event;
>
> If exit_reason is KVM_EXIT_SYSTEM_EVENT then the vcpu has triggered
> @@ -6099,7 +6103,7 @@ HVC instruction based PSCI call from the vcpu. The 'type' field describes
> the system-level event type. The 'flags' field describes architecture
> specific flags for the system-level event.
>
> -Valid values for 'type' are:
> +Valid values for bits 30:0 of 'type' are:
>
> - KVM_SYSTEM_EVENT_SHUTDOWN -- the guest has requested a shutdown of the
> VM. Userspace is not obliged to honour this, and if it does honour
> @@ -6112,12 +6116,18 @@ Valid values for 'type' are:
> has requested a crash condition maintenance. Userspace can choose
> to ignore the request, or to gather VM memory core dump and/or
> reset/shutdown of the VM.
> + - KVM_SYSTEM_EVENT_SEV_TERM -- an AMD SEV guest requested termination.
> + The guest physical address of the guest's GHCB is stored in `data[0]`.
>
> Valid flags are:
>
> - KVM_SYSTEM_EVENT_RESET_FLAG_PSCI_RESET2 (arm64 only) -- the guest issued
> a SYSTEM_RESET2 call according to v1.1 of the PSCI specification.
>
> +Extra data for this event is stored in the `data[]` array, up to index
> +`ndata-1` included, if bit 31 is set in `type`. The data depends on the
> +`type` field. There is no extra data if bit 31 is clear or `ndata` is zero.
> +

This has the potential to break userspace as it expects a strict match
on the whole of 'type', and does not expect to treat it as a bitfield.

Case in point, QEMU:

accel/kvm/kvm-all.c::kvm_cpu_exec()

case KVM_EXIT_SYSTEM_EVENT:
switch (run->system_event.type) {

CrosVM and kvmtool have similar constructs, and will break as soon as
KVM_SYSTEM_EVENT_NDATA_VALID is or'ed into 'type'.

M.

--
Without deviation from the norm, progress is not possible.

2022-04-16 01:50:44

by Sean Christopherson

[permalink] [raw]
Subject: Re: [PATCH v4.1] KVM, SEV: Add KVM_EXIT_SHUTDOWN metadata for SEV-ES

PAOLO!!!!!!

Or maybe I need to try the Beetlejuice trick...

Paolo, Paolo, Paolo.

This is now sitting in kvm/next, which makes RISC-V and arm64 unhappy. Thoughts
on how to proceed?

arch/riscv/kvm/vcpu_sbi.c: In function ‘kvm_riscv_vcpu_sbi_system_reset’:
arch/riscv/kvm/vcpu_sbi.c:97:26: error: ‘struct <anonymous>’ has no member named ‘flags’
97 | run->system_event.flags = flags;
| ^


On Mon, Apr 11, 2022, Sean Christopherson wrote:
> On Mon, Apr 11, 2022, Alexandru Elisei wrote:
> > Hi,
> >
> > On Mon, Apr 11, 2022 at 10:12:13AM +0100, Will Deacon wrote:
> > > Hi Sean,
> > >
> > > Cheers for the heads-up.
> > >
> > > [+Marc and Alex as this looks similar to [1]]
> > >
> > > On Fri, Apr 08, 2022 at 05:01:21PM +0000, Sean Christopherson wrote:
> > > > system_event.flags is broken (at least on x86) due to the prior 'type' field not
> > > > being propery padded, e.g. userspace will read/write garbage if the userspace
> > > > and kernel compilers pad structs differently.
> > > >
> > > > struct {
> > > > __u32 type;
> > > > __u64 flags;
> > > > } system_event;
> > >
> > > On arm64, I think the compiler is required to put the padding between type
> > > and flags so that both the struct and 'flags' are 64-bit aligned [2]. Does
> > > x86 not offer any guarantees on the overall structure alignment?
> >
> > This is also my understanding. The "Procedure Call Standard for the Arm
> > 64-bit Architecture" [1] has these rules for structs (called "aggregates"):
>
> AFAIK, all x86 compilers will pad structures accordingly, but a 32-bit userspace
> running against a 64-bit kernel will have different alignment requirements, i.e.
> won't pad, and x86 supports CONFIG_KVM_COMPAT=y. And I have no idea what x86's
> bizarre x32 ABI does.
>
> > > > Our plan to unhose this is to change the struct as follows and use bit 31 in the
> > > > 'type' to indicate that ndata+data are valid.
> > > >
> > > > struct {
> > > > __u32 type;
> > > > __u32 ndata;
> > > > __u64 data[16];
> > > > } system_event;
> > > >
> > > > Any objection to updating your architectures to use a helper to set the bit and
> > > > populate ndata+data accordingly? It'll require a userspace update, but v5.18
> > > > hasn't officially released yet so it's not kinda sort not ABI breakage.
> > >
> > > It's a bit annoying, as we're using the current structure in Android 13 :/
> > > Obviously, if there's no choice then upstream shouldn't worry, but it means
> > > we'll have to carry a delta in crosvm. Specifically, the new 'ndata' field
> > > is going to be unusable for us because it coincides with the padding.
>
> Yeah, it'd be unusuable for existing types. One idea is that we could define the
> ABI to be that the RESET and SHUTDOWN types have an implicit ndata=1 on arm64 and
> RISC-V. That would allow keeping the flags interpretation and so long as crosvm
> doesn't do something stupid like compile with "pragma pack" (does clang even support
> that?), there's no delta necessary for Android.
>
> > Just a thought, but wouldn't such a drastical change be better implemented
> > as a new exit_reason and a new associated struct?
>
> Maybe? I wasn't aware that arm64/RISC-V picked up usage of "flags" when I
> suggested this, but I'm not sure it would have changed anything. We could add
> SYSTEM_EVENT2 or whatever, but since there's no official usage of flags, it seems
> a bit gratutious.