2023-12-19 14:10:40

by Nina Schoetterl-Glausch

Subject: [PATCH v4 0/4] KVM: s390: Fix minor bugs in STFLE shadowing

v3 -> v4:
* pick up tags (thanks {David, Janosch, Heiko})
* changes to commit messages
* flip lines and add comment (Janosch)

v2 -> v3:
* pick up tags (thanks Claudio)
* reverse Christmas tree

v1 -> v2:
* pick up tags (thanks {Claudio, David})
* drop Fixes tag on cleanup patch, change message (thanks David)
* drop Fixes tag on second patch since the length of the facility list
copied wasn't initially specified and only clarified in later
revisions
* use READ/WRITE_ONCE (thanks {David, Heiko})

Improve the STFLE vsie implementation.
Firstly, fix a bug in identifying whether the guest intends to use
interpretive execution of STFLE for its own guest.
Secondly, decrease the amount of guest memory accessed to the
minimum.
Also do some (optional) cleanups.

Nina Schoetterl-Glausch (4):
KVM: s390: vsie: Fix STFLE interpretive execution identification
KVM: s390: vsie: Fix length of facility list shadowed
KVM: s390: cpu model: Use proper define for facility mask size
KVM: s390: Minor refactor of base/ext facility lists

arch/s390/include/asm/facility.h | 6 +++++
arch/s390/include/asm/kvm_host.h | 2 +-
arch/s390/kernel/Makefile | 2 +-
arch/s390/kernel/facility.c | 21 +++++++++++++++
arch/s390/kvm/kvm-s390.c | 44 ++++++++++++++------------------
arch/s390/kvm/vsie.c | 19 ++++++++++++--
6 files changed, 65 insertions(+), 29 deletions(-)
create mode 100644 arch/s390/kernel/facility.c

Range-diff against v3:
1: de77a2c36786 ! 1: 69599bb38487 KVM: s390: vsie: Fix STFLE interpretive execution identification
@@ arch/s390/kvm/vsie.c: static void retry_vsie_icpt(struct vsie_page *vsie_page)
+ __u32 fac = READ_ONCE(vsie_page->scb_o->fac);

if (fac && test_kvm_facility(vcpu->kvm, 7)) {
-+ fac = fac & 0x7ffffff8U;
retry_vsie_icpt(vsie_page);
++ /*
++ * The facility list origin (FLO) is in bits 1 - 28 of the FLD
++ * so we need to mask here before reading.
++ */
++ fac = fac & 0x7ffffff8U;
if (read_guest_real(vcpu, fac, &vsie_page->fac,
sizeof(vsie_page->fac)))
+ return set_validity_icpt(scb_s, 0x1090U);
2: e4b44c4d2400 ! 2: cba3c32a8db7 KVM: s390: vsie: Fix length of facility list shadowed
@@ Commit message

The length of the facility list accessed when interpretively executing
STFLE is the same as the hosts facility list (in case of format-0)
- When shadowing, copy only those bytes.
- The memory following the facility list need not be accessible, in which
- case we'd wrongly inject a validity intercept.
+ The memory following the facility list doesn't need to be accessible.
+ The current VSIE implementation accesses a fixed length that exceeds the
+ guest/host facility list length and can therefore wrongly inject a
+ validity intercept.
+ Instead, find out the host facility list length by running STFLE and
+ copy only as much as necessary when shadowing.

Acked-by: David Hildenbrand <[email protected]>
Reviewed-by: Claudio Imbrenda <[email protected]>
+ Acked-by: Heiko Carstens <[email protected]>
Signed-off-by: Nina Schoetterl-Glausch <[email protected]>

## arch/s390/include/asm/facility.h ##
@@ arch/s390/include/asm/facility.h: static inline void stfle(u64 *stfle_fac_list,
#endif /* __ASM_FACILITY_H */

## arch/s390/kernel/Makefile ##
-@@ arch/s390/kernel/Makefile: obj-y += sysinfo.o lgr.o os_info.o
+@@ arch/s390/kernel/Makefile: obj-y += sysinfo.o lgr.o os_info.o ctlreg.o
obj-y += runtime_instr.o cache.o fpu.o dumpstack.o guarded_storage.o sthyi.o
obj-y += entry.o reipl.o kdebugfs.o alternative.o
obj-y += nospec-branch.o ipl_vmparm.o machine_kexec_reloc.o unwind_bc.o
@@ arch/s390/kvm/vsie.c: static int handle_stfle(struct kvm_vcpu *vcpu, struct vsie
+ * -> format-0 flcb
+ */
if (fac && test_kvm_facility(vcpu->kvm, 7)) {
- fac = fac & 0x7ffffff8U;
retry_vsie_icpt(vsie_page);
+ /*
+@@ arch/s390/kvm/vsie.c: static int handle_stfle(struct kvm_vcpu *vcpu, struct vsie_page *vsie_page)
+ * so we need to mask here before reading.
+ */
+ fac = fac & 0x7ffffff8U;
+ /*
+ * format-0 -> size of nested guest's facility list == guest's size
+ * guest's size == host's size, since STFLE is interpretatively executed
3: 8b02ac33defb ! 3: 4b52e432d736 KVM: s390: cpu model: Use proper define for facility mask size
@@ Commit message
Note that both values are the same, there is no functional change.

Reviewed-by: Claudio Imbrenda <[email protected]>
+ Reviewed-by: David Hildenbrand <[email protected]>
+ Reviewed-by: Janosch Frank <[email protected]>
Signed-off-by: Nina Schoetterl-Glausch <[email protected]>

## arch/s390/include/asm/kvm_host.h ##
4: a592be823576 = 4: 9e551ba53b14 KVM: s390: Minor refactor of base/ext facility lists
--
2.40.1
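
The reworded commit message of patch 2 says the host facility list length is found by running STFLE and that only that many bytes are copied when shadowing. A minimal userspace sketch of the copy-length idea follows. It assumes the number of doublewords is already known and uses memcpy() as a stand-in for the guest read. SHADOW_DWORDS, fac_copy_len() and shadow_fac_list() are hypothetical names for illustration, not the ones used in the series.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical capacity of the shadow buffer, in doublewords. */
#define SHADOW_DWORDS 256

/* Bytes to copy for a list of nr_dwords doublewords, capped at the buffer. */
size_t fac_copy_len(unsigned int nr_dwords)
{
	size_t len = (size_t)nr_dwords * sizeof(uint64_t);
	size_t cap = SHADOW_DWORDS * sizeof(uint64_t);

	return len < cap ? len : cap;
}

/*
 * Copy only the facility list itself, never the memory following it.
 * dst must provide room for SHADOW_DWORDS doublewords.
 */
size_t shadow_fac_list(uint64_t *dst, const uint64_t *src,
		       unsigned int nr_dwords)
{
	size_t len = fac_copy_len(nr_dwords);

	memcpy(dst, src, len);
	return len;
}

/* Usage: a four-doubleword list touches exactly 32 bytes of the source. */
void demo_shadow(void)
{
	uint64_t guest[4] = { 1, 2, 3, 4 };
	uint64_t shadow[SHADOW_DWORDS] = { 0 };
	size_t len = shadow_fac_list(shadow, guest, 4);

	assert(len == 32);
	assert(shadow[3] == 4 && shadow[4] == 0);
}
```

With a fixed-length copy, the read could run past the end of a short list into memory the guest never made accessible, which is the failure the commit message describes.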



2023-12-19 14:10:49

by Nina Schoetterl-Glausch

Subject: [PATCH v4 4/4] KVM: s390: Minor refactor of base/ext facility lists

Directly use the size of the arrays instead of going through the
indirection of kvm_s390_fac_size().
Don't use magic number for the number of entries in the non hypervisor
managed facility bit mask list.
Make the constraint of that number on kvm_s390_fac_base obvious.
Get rid of implicit double anding of stfle_fac_list.

Reviewed-by: Claudio Imbrenda <[email protected]>
Signed-off-by: Nina Schoetterl-Glausch <[email protected]>
---

Notes:
I think it's nicer this way but it might be needless churn.

arch/s390/kvm/kvm-s390.c | 44 +++++++++++++++++-----------------------
1 file changed, 19 insertions(+), 25 deletions(-)

diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
index 7aa0e668488f..ac8d551f8b32 100644
--- a/arch/s390/kvm/kvm-s390.c
+++ b/arch/s390/kvm/kvm-s390.c
@@ -224,33 +224,25 @@ static int async_destroy = 1;
 module_param(async_destroy, int, 0444);
 MODULE_PARM_DESC(async_destroy, "Asynchronous destroy for protected guests");

-/*
- * For now we handle at most 16 double words as this is what the s390 base
- * kernel handles and stores in the prefix page. If we ever need to go beyond
- * this, this requires changes to code, but the external uapi can stay.
- */
-#define SIZE_INTERNAL 16
-
+#define HMFAI_DWORDS 16
 /*
  * Base feature mask that defines default mask for facilities. Consists of the
  * defines in FACILITIES_KVM and the non-hypervisor managed bits.
  */
-static unsigned long kvm_s390_fac_base[SIZE_INTERNAL] = { FACILITIES_KVM };
+static unsigned long kvm_s390_fac_base[HMFAI_DWORDS] = { FACILITIES_KVM };
+static_assert(ARRAY_SIZE(((long[]){ FACILITIES_KVM })) <= HMFAI_DWORDS);
+static_assert(ARRAY_SIZE(kvm_s390_fac_base) <= S390_ARCH_FAC_MASK_SIZE_U64);
+static_assert(ARRAY_SIZE(kvm_s390_fac_base) <= S390_ARCH_FAC_LIST_SIZE_U64);
+static_assert(ARRAY_SIZE(kvm_s390_fac_base) <= ARRAY_SIZE(stfle_fac_list));
+
 /*
  * Extended feature mask. Consists of the defines in FACILITIES_KVM_CPUMODEL
  * and defines the facilities that can be enabled via a cpu model.
  */
-static unsigned long kvm_s390_fac_ext[SIZE_INTERNAL] = { FACILITIES_KVM_CPUMODEL };
-
-static unsigned long kvm_s390_fac_size(void)
-{
-	BUILD_BUG_ON(SIZE_INTERNAL > S390_ARCH_FAC_MASK_SIZE_U64);
-	BUILD_BUG_ON(SIZE_INTERNAL > S390_ARCH_FAC_LIST_SIZE_U64);
-	BUILD_BUG_ON(SIZE_INTERNAL * sizeof(unsigned long) >
-		sizeof(stfle_fac_list));
-
-	return SIZE_INTERNAL;
-}
+static const unsigned long kvm_s390_fac_ext[] = { FACILITIES_KVM_CPUMODEL };
+static_assert(ARRAY_SIZE(kvm_s390_fac_ext) <= S390_ARCH_FAC_MASK_SIZE_U64);
+static_assert(ARRAY_SIZE(kvm_s390_fac_ext) <= S390_ARCH_FAC_LIST_SIZE_U64);
+static_assert(ARRAY_SIZE(kvm_s390_fac_ext) <= ARRAY_SIZE(stfle_fac_list));

 /* available cpu features supported by kvm */
 static DECLARE_BITMAP(kvm_s390_available_cpu_feat, KVM_S390_VM_CPU_FEAT_NR_BITS);
@@ -3348,13 +3340,16 @@ int kvm_arch_init_vm(struct kvm *kvm, unsigned long type)
 	kvm->arch.sie_page2->kvm = kvm;
 	kvm->arch.model.fac_list = kvm->arch.sie_page2->fac_list;

-	for (i = 0; i < kvm_s390_fac_size(); i++) {
+	for (i = 0; i < ARRAY_SIZE(kvm_s390_fac_base); i++) {
 		kvm->arch.model.fac_mask[i] = stfle_fac_list[i] &
-					      (kvm_s390_fac_base[i] |
-					       kvm_s390_fac_ext[i]);
+					      kvm_s390_fac_base[i];
 		kvm->arch.model.fac_list[i] = stfle_fac_list[i] &
 					      kvm_s390_fac_base[i];
 	}
+	for (i = 0; i < ARRAY_SIZE(kvm_s390_fac_ext); i++) {
+		kvm->arch.model.fac_mask[i] |= stfle_fac_list[i] &
+					       kvm_s390_fac_ext[i];
+	}
 	kvm->arch.model.subfuncs = kvm_s390_available_subfunc;

 	/* we are always in czam mode - even on pre z14 machines */
@@ -5868,9 +5863,8 @@ static int __init kvm_s390_init(void)
 		return -EINVAL;
 	}

-	for (i = 0; i < 16; i++)
-		kvm_s390_fac_base[i] |=
-			stfle_fac_list[i] & nonhyp_mask(i);
+	for (i = 0; i < HMFAI_DWORDS; i++)
+		kvm_s390_fac_base[i] |= nonhyp_mask(i);

 	r = __kvm_s390_init();
 	if (r)
--
2.40.1
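
The refactor in patch 4/4 replaces a size helper with compile-time checks on the arrays themselves and splits the mask computation into one loop per array. The following userspace sketch illustrates both ideas under stated assumptions: the array names, MASK_DWORDS and all bit patterns are hypothetical, and ARRAY_SIZE() is redefined locally because the kernel macro is not available outside the kernel.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical size and contents, not the kernel's real values. */
#define MASK_DWORDS 4
static const uint64_t host_list[MASK_DWORDS] = { 0xff00, 0x0ff0, 0x00ff, 0xf00f };
static const uint64_t fac_base[MASK_DWORDS] = { 0xf000, 0x00f0 };
/* Sized by its initializer, so it may be shorter than the other arrays. */
static const uint64_t fac_ext[] = { 0x0f00 };

/* The size constraints are checked at compile time, next to the arrays. */
static_assert(ARRAY_SIZE(fac_base) <= ARRAY_SIZE(host_list), "base too long");
static_assert(ARRAY_SIZE(fac_ext) <= ARRAY_SIZE(host_list), "ext too long");

/* Single loop: needs both arrays padded (here emulated) to the same length. */
void build_mask_single(uint64_t mask[MASK_DWORDS])
{
	for (size_t i = 0; i < MASK_DWORDS; i++) {
		uint64_t ext = i < ARRAY_SIZE(fac_ext) ? fac_ext[i] : 0;

		mask[i] = host_list[i] & (fac_base[i] | ext);
	}
}

/* Split loops: each loop is bounded by the array it reads. */
void build_mask_split(uint64_t mask[MASK_DWORDS])
{
	for (size_t i = 0; i < ARRAY_SIZE(fac_base); i++)
		mask[i] = host_list[i] & fac_base[i];
	for (size_t i = 0; i < ARRAY_SIZE(fac_ext); i++)
		mask[i] |= host_list[i] & fac_ext[i];
}
```

Both functions produce the same mask because AND distributes over OR: h & (b | e) equals (h & b) | (h & e). The split form simply no longer requires the two arrays to share a length.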


2023-12-19 14:10:47

by Nina Schoetterl-Glausch

Subject: [PATCH v4 1/4] KVM: s390: vsie: Fix STFLE interpretive execution identification

STFLE can be interpretively executed.
This occurs when the facility list designation is unequal to zero.
Perform the check before applying the address mask instead of after.

Fixes: 66b630d5b7f2 ("KVM: s390: vsie: support STFLE interpretation")
Reviewed-by: Claudio Imbrenda <[email protected]>
Acked-by: David Hildenbrand <[email protected]>
Signed-off-by: Nina Schoetterl-Glausch <[email protected]>
---
arch/s390/kvm/vsie.c | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/arch/s390/kvm/vsie.c b/arch/s390/kvm/vsie.c
index 8207a892bbe2..35937911724e 100644
--- a/arch/s390/kvm/vsie.c
+++ b/arch/s390/kvm/vsie.c
@@ -984,10 +984,15 @@ static void retry_vsie_icpt(struct vsie_page *vsie_page)
 static int handle_stfle(struct kvm_vcpu *vcpu, struct vsie_page *vsie_page)
 {
 	struct kvm_s390_sie_block *scb_s = &vsie_page->scb_s;
-	__u32 fac = READ_ONCE(vsie_page->scb_o->fac) & 0x7ffffff8U;
+	__u32 fac = READ_ONCE(vsie_page->scb_o->fac);

 	if (fac && test_kvm_facility(vcpu->kvm, 7)) {
 		retry_vsie_icpt(vsie_page);
+		/*
+		 * The facility list origin (FLO) is in bits 1 - 28 of the FLD
+		 * so we need to mask here before reading.
+		 */
+		fac = fac & 0x7ffffff8U;
 		if (read_guest_real(vcpu, fac, &vsie_page->fac,
 				    sizeof(vsie_page->fac)))
 			return set_validity_icpt(scb_s, 0x1090U);
--
2.40.1
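
The effect of moving the mask in patch 1/4 can be shown with a small userspace sketch. The mask value 0x7ffffff8U is taken from the patch. The helper names and the sample designation 0x00000001 are hypothetical and chosen only because that value is nonzero while its masked origin is zero. The facility 7 check from the real function is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Mask from the patch: keeps the facility list origin (FLO) bits. */
#define FLO_MASK 0x7ffffff8U

/* Before the patch: mask first, then test the masked value for zero. */
bool stfle_interpreted_old(uint32_t fld)
{
	uint32_t fac = fld & FLO_MASK;

	return fac != 0;
}

/* After the patch: test the whole designation, mask only for the address. */
bool stfle_interpreted_new(uint32_t fld, uint32_t *origin)
{
	if (fld == 0)
		return false;
	*origin = fld & FLO_MASK;
	return true;
}

/* Usage: a nonzero designation whose origin is address 0. */
void demo_fld_check(void)
{
	uint32_t origin = 0xffffffffU;

	assert(!stfle_interpreted_old(0x00000001U));
	assert(stfle_interpreted_new(0x00000001U, &origin));
	assert(origin == 0);
}
```

For designations with a nonzero origin both variants agree, so the difference only shows up when the facility list sits at address 0.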


2023-12-19 14:11:21

by Nina Schoetterl-Glausch

Subject: [PATCH v4 3/4] KVM: s390: cpu model: Use proper define for facility mask size

Use the previously unused S390_ARCH_FAC_MASK_SIZE_U64 instead of
S390_ARCH_FAC_LIST_SIZE_U64 for defining the fac_mask array.
Note that both values are the same, there is no functional change.

Reviewed-by: Claudio Imbrenda <[email protected]>
Reviewed-by: David Hildenbrand <[email protected]>
Reviewed-by: Janosch Frank <[email protected]>
Signed-off-by: Nina Schoetterl-Glausch <[email protected]>
---
arch/s390/include/asm/kvm_host.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/s390/include/asm/kvm_host.h b/arch/s390/include/asm/kvm_host.h
index 67a298b6cf6e..52664105a473 100644
--- a/arch/s390/include/asm/kvm_host.h
+++ b/arch/s390/include/asm/kvm_host.h
@@ -818,7 +818,7 @@ struct s390_io_adapter {

 struct kvm_s390_cpu_model {
 	/* facility mask supported by kvm & hosting machine */
-	__u64 fac_mask[S390_ARCH_FAC_LIST_SIZE_U64];
+	__u64 fac_mask[S390_ARCH_FAC_MASK_SIZE_U64];
 	struct kvm_s390_vm_cpu_subfunc subfuncs;
 	/* facility list requested by guest (in dma page) */
 	__u64 *fac_list;
--
2.40.1


2023-12-20 09:56:52

by Christian Borntraeger

Subject: Re: [PATCH v4 1/4] KVM: s390: vsie: Fix STFLE interpretive execution identification

On 19.12.23 at 15:08, Nina Schoetterl-Glausch wrote:
> STFLE can be interpretively executed.
> This occurs when the facility list designation is unequal to zero.
> Perform the check before applying the address mask instead of after.
>
> Fixes: 66b630d5b7f2 ("KVM: s390: vsie: support STFLE interpretation")
> Reviewed-by: Claudio Imbrenda <[email protected]>
> Acked-by: David Hildenbrand <[email protected]>
> Signed-off-by: Nina Schoetterl-Glausch <[email protected]>

Reviewed-by: Christian Borntraeger <[email protected]>

This should not matter in practice, but maybe some weird guest puts this at address 0.
Do we want a unit test for that case?

> ---
> arch/s390/kvm/vsie.c | 7 ++++++-
> 1 file changed, 6 insertions(+), 1 deletion(-)
>
> diff --git a/arch/s390/kvm/vsie.c b/arch/s390/kvm/vsie.c
> index 8207a892bbe2..35937911724e 100644
> --- a/arch/s390/kvm/vsie.c
> +++ b/arch/s390/kvm/vsie.c
> @@ -984,10 +984,15 @@ static void retry_vsie_icpt(struct vsie_page *vsie_page)
> static int handle_stfle(struct kvm_vcpu *vcpu, struct vsie_page *vsie_page)
> {
> struct kvm_s390_sie_block *scb_s = &vsie_page->scb_s;
> - __u32 fac = READ_ONCE(vsie_page->scb_o->fac) & 0x7ffffff8U;
> + __u32 fac = READ_ONCE(vsie_page->scb_o->fac);
>
> if (fac && test_kvm_facility(vcpu->kvm, 7)) {
> retry_vsie_icpt(vsie_page);
> + /*
> + * The facility list origin (FLO) is in bits 1 - 28 of the FLD
> + * so we need to mask here before reading.
> + */
> + fac = fac & 0x7ffffff8U;
> if (read_guest_real(vcpu, fac, &vsie_page->fac,
> sizeof(vsie_page->fac)))
> return set_validity_icpt(scb_s, 0x1090U);

2023-12-20 10:42:22

by Janosch Frank

Subject: Re: [PATCH v4 1/4] KVM: s390: vsie: Fix STFLE interpretive execution identification

On 12/19/23 15:08, Nina Schoetterl-Glausch wrote:
> STFLE can be interpretively executed.
> This occurs when the facility list designation is unequal to zero.
> Perform the check before applying the address mask instead of after.
>
> Fixes: 66b630d5b7f2 ("KVM: s390: vsie: support STFLE interpretation")
> Reviewed-by: Claudio Imbrenda <[email protected]>
> Acked-by: David Hildenbrand <[email protected]>
> Signed-off-by: Nina Schoetterl-Glausch <[email protected]>

Reviewed-by: Janosch Frank <[email protected]>

> ---
> arch/s390/kvm/vsie.c | 7 ++++++-
> 1 file changed, 6 insertions(+), 1 deletion(-)
>
> diff --git a/arch/s390/kvm/vsie.c b/arch/s390/kvm/vsie.c
> index 8207a892bbe2..35937911724e 100644
> --- a/arch/s390/kvm/vsie.c
> +++ b/arch/s390/kvm/vsie.c
> @@ -984,10 +984,15 @@ static void retry_vsie_icpt(struct vsie_page *vsie_page)
> static int handle_stfle(struct kvm_vcpu *vcpu, struct vsie_page *vsie_page)
> {
> struct kvm_s390_sie_block *scb_s = &vsie_page->scb_s;
> - __u32 fac = READ_ONCE(vsie_page->scb_o->fac) & 0x7ffffff8U;
> + __u32 fac = READ_ONCE(vsie_page->scb_o->fac);
>
> if (fac && test_kvm_facility(vcpu->kvm, 7)) {
> retry_vsie_icpt(vsie_page);
> + /*
> + * The facility list origin (FLO) is in bits 1 - 28 of the FLD
> + * so we need to mask here before reading.
> + */
> + fac = fac & 0x7ffffff8U;
> if (read_guest_real(vcpu, fac, &vsie_page->fac,
> sizeof(vsie_page->fac)))
> return set_validity_icpt(scb_s, 0x1090U);


2023-12-20 10:56:03

by Janosch Frank

Subject: Re: [PATCH v4 4/4] KVM: s390: Minor refactor of base/ext facility lists

On 12/19/23 15:08, Nina Schoetterl-Glausch wrote:
> Directly use the size of the arrays instead of going through the
> indirection of kvm_s390_fac_size().
> Don't use magic number for the number of entries in the non hypervisor
> managed facility bit mask list.
> Make the constraint of that number on kvm_s390_fac_base obvious.
> Get rid of implicit double anding of stfle_fac_list.
>
> Reviewed-by: Claudio Imbrenda <[email protected]>
> Signed-off-by: Nina Schoetterl-Glausch <[email protected]>

@Nina: I'm currently still recovering from a cold and hence I'm not
fully able to grasp this patch.

May I drop it and we re-visit it next year for 6.9?