2021-08-15 00:16:10

by Gavin Shan

Subject: [PATCH v4 00/21] Support SDEI Virtualization

This series intends to virtualize the Software Delegated Exception
Interface (SDEI), which is defined by ARM DEN0054A. It allows the
hypervisor to deliver NMI-like events to the guest and is needed by
asynchronous page fault to deliver page-not-present notifications from
the hypervisor to the guest. The specification, the code and the
required QEMU changes can be found at:

https://developer.arm.com/documentation/den0054/latest
https://github.com/gwshan/linux ("kvm/arm64_sdei")
https://github.com/gwshan/qemu ("kvm/arm64_sdei")

The SDEI event is identified by a 32-bit number. Bits[31:24] indicate
the SDEI event properties while bits[23:0] identify the unique event
number. The implementation takes bits[23:22] to indicate the owner of
the SDEI event. For example, SDEI events owned by KVM have these two
bits set to 0b01. Note that the implementation supports only SDEI
events owned by KVM.

The design is pretty straightforward and the implementation simply
follows the SDEI specification, supporting the defined SMCCC interfaces,
except for the IRQ binding ones. Several data structures are introduced.
Some of the objects have to be migrated by the VMM, so their definitions
are split up for the VMM to include the corresponding states for
migration.

struct kvm_sdei_kvm
Associated with the VM and used to track the KVM-exposed SDEI
events and those registered by the guest.
struct kvm_sdei_vcpu
Associated with a vCPU and used to track SDEI event delivery. The
preempted context is saved prior to the delivery and restored
after that.
struct kvm_sdei_event
SDEI events exposed by KVM so that the guest can register and
enable them.
struct kvm_sdei_kvm_event
SDEI events that have been registered by the guest.
struct kvm_sdei_vcpu_event
SDEI events that have been queued to a specific vCPU for delivery.

The series is organized as below:

PATCH[01] Introduces template for smccc_get_argx()
PATCH[02] Introduces the data structures and infrastructure
PATCH[03-14] Supports various SDEI related hypercalls
PATCH[15] Supports SDEI event notification
PATCH[16-17] Introduces ioctl command for migration
PATCH[18-19] Supports SDEI event injection and cancellation
PATCH[20] Exports SDEI capability
PATCH[21] Adds self-test case for SDEI virtualization

Testing
=======

There are additional patches in the following repositories to create
procfs entries, allowing SDEI events to be injected, and a test driver
in the guest to handle the SDEI events. Additional QEMU changes are
also needed so that the guest can detect the SDEI service through the
ACPI table.

https://github.com/gwshan/linux ("kvm/arm64_sdei")
https://github.com/gwshan/qemu ("kvm/arm64_sdei")

The SDEI event is received and handled in the guest after it's injected
through the procfs entries on the host side.

host# /home/gavin/sandbox/qemu.main/build/qemu-system-aarch64 \
-accel kvm -machine virt,gic-version=host \
-cpu host -smp 8,sockets=2,cores=4,threads=1 -m 1024M \
: \
-kernel /home/gavin/sandbox/linux.guest/arch/arm64/boot/Image \
-initrd /home/gavin/sandbox/images/rootfs.cpio.xz \
-append 'earlycon=pl011,mmio,0x9000000'
host# echo > /proc/kvm/kvm-10842/vcpu-0
guest# =========== SDEI Event (CPU#0) ===========
num=0x40400000, arg=0xdabfdabf
SP: 0xffff800011613e90 PC: 0x0 pState: 0x0
Regs:
0000000000002ac4 ffff00001ff947a0 0000000000002ac2 ffff00001ff976c0
0000000000000000 0000000000000000 0000000000000000 0000000000000000
0000000000000000 0000000000000000 0000000000000000 0000000000000000
0000000000000000 0000000000000000 0000000000000000 0000000000000000
0000000000000000 0000000000000000 0000000000000000 ffff80001121a000
ffff8000116199e0 ffff800011619ad8 ffff80001122d8b8 ffff800011619afc
0000000000000000 0000000000000000 ffff800011622140 ffff800011150108
00000000582c0018 ffff800011613e90 ffff800010bd0248
Query context:
x[00]: 0000000000002ac4 errno: 0
x[01]: ffff00001ff947a0 errno: 0
:
x[18]: ffff800010bd01d8 errno: 0
x[19]: fffffffffffffffe errno: -22
x[20]: fffffffffffffffe errno: -22
:
x[30]: fffffffffffffffe errno: -22
host# echo > /proc/kvm/kvm-10842/vcpu-7
guest# =========== SDEI Event (CPU#7) ===========
num=0x40400000, arg=0xdabfdabf
SP: 0xffff800011b73f20 PC: 0x0 pState: 0x0
Regs:
00000000000010d0 ffff00003fde07a0 00000000000010ce 7fffffff1999999a
0000000000000000 0000000000000000 0000000000000000 0000000000000000
0000000000000000 0000000000000000 0000000000000000 0000000000000000
0000000000000000 0000000000000000 0000000000000000 0000000000000000
0000000000000000 0000000000000000 0000000000000000 ffff80001121a000
ffff8000116199e0 ffff800011619ad8 ffff80001122d8b8 ffff800011619afc
0000000000000000 0000000000000000 ffff000020032ac0 0000000000000000
0000000000000000 ffff800011b73f20 ffff800010bd0248
Query context:
x[00]: 00000000000010d0 errno: 0
x[01]: ffff00003fde07a0 errno: 0
:
x[18]: ffff800010bd01d8 errno: 0
x[19]: fffffffffffffffe errno: -22
:
x[30]: fffffffffffffffe errno: -22

Changelog
=========
v4:
* Rebased to v5.14-rc5 (Gavin)
v3:
* Rebased to v5.13-rc1 (Gavin)
* Use Linux data types in kvm_sdei.h (Gavin)
v2:
* Rebased to v5.11-rc6 (Gavin)
* Dropped changes related to SDEI client driver (Gavin)
* Removed support for passthrough SDEI events (Gavin)
* Redesigned data structures (Gavin)
* Implementation is almost rewritten as the data structures
are totally changed (Gavin)
* Added ioctl commands to support migration (Gavin)

Gavin Shan (21):
KVM: arm64: Introduce template for inline functions
KVM: arm64: Add SDEI virtualization infrastructure
KVM: arm64: Support SDEI_VERSION hypercall
KVM: arm64: Support SDEI_EVENT_REGISTER hypercall
KVM: arm64: Support SDEI_EVENT_{ENABLE, DISABLE} hypercall
KVM: arm64: Support SDEI_EVENT_CONTEXT hypercall
KVM: arm64: Support SDEI_EVENT_UNREGISTER hypercall
KVM: arm64: Support SDEI_EVENT_STATUS hypercall
KVM: arm64: Support SDEI_EVENT_GET_INFO hypercall
KVM: arm64: Support SDEI_EVENT_ROUTING_SET hypercall
KVM: arm64: Support SDEI_PE_{MASK, UNMASK} hypercall
KVM: arm64: Support SDEI_{PRIVATE, SHARED}_RESET hypercall
KVM: arm64: Implement SDEI event delivery
KVM: arm64: Support SDEI_EVENT_{COMPLETE, COMPLETE_AND_RESUME}
hypercall
KVM: arm64: Support SDEI event notifier
KVM: arm64: Support SDEI ioctl commands on VM
KVM: arm64: Support SDEI ioctl commands on vCPU
KVM: arm64: Support SDEI event injection
KVM: arm64: Support SDEI event cancellation
KVM: arm64: Export SDEI capability
KVM: selftests: Add SDEI test case

arch/arm64/include/asm/kvm_emulate.h | 1 +
arch/arm64/include/asm/kvm_host.h | 8 +
arch/arm64/include/asm/kvm_sdei.h | 136 ++
arch/arm64/include/uapi/asm/kvm.h | 1 +
arch/arm64/include/uapi/asm/kvm_sdei.h | 86 ++
arch/arm64/kvm/Makefile | 2 +-
arch/arm64/kvm/arm.c | 19 +
arch/arm64/kvm/hyp/exception.c | 7 +
arch/arm64/kvm/hypercalls.c | 18 +
arch/arm64/kvm/inject_fault.c | 27 +
arch/arm64/kvm/sdei.c | 1519 ++++++++++++++++++++
include/kvm/arm_hypercalls.h | 34 +-
include/uapi/linux/kvm.h | 4 +
tools/testing/selftests/kvm/Makefile | 1 +
tools/testing/selftests/kvm/aarch64/sdei.c | 171 +++
15 files changed, 2014 insertions(+), 20 deletions(-)
create mode 100644 arch/arm64/include/asm/kvm_sdei.h
create mode 100644 arch/arm64/include/uapi/asm/kvm_sdei.h
create mode 100644 arch/arm64/kvm/sdei.c
create mode 100644 tools/testing/selftests/kvm/aarch64/sdei.c

--
2.23.0


2021-08-15 00:16:11

by Gavin Shan

Subject: [PATCH v4 01/21] KVM: arm64: Introduce template for inline functions

The inline functions used to get the SMCCC parameters share the same
layout, meaning they can be generated from a template to simplify the
code. Besides, this adds more similar inline functions,
smccc_get_arg{4,5,6,7,8}(), to access more SMCCC arguments, which are
needed by the SDEI virtualization support.

Signed-off-by: Gavin Shan <[email protected]>
---
include/kvm/arm_hypercalls.h | 34 +++++++++++++++-------------------
1 file changed, 15 insertions(+), 19 deletions(-)

diff --git a/include/kvm/arm_hypercalls.h b/include/kvm/arm_hypercalls.h
index 0e2509d27910..ebecb6c68254 100644
--- a/include/kvm/arm_hypercalls.h
+++ b/include/kvm/arm_hypercalls.h
@@ -6,27 +6,21 @@

#include <asm/kvm_emulate.h>

-int kvm_hvc_call_handler(struct kvm_vcpu *vcpu);
-
-static inline u32 smccc_get_function(struct kvm_vcpu *vcpu)
-{
- return vcpu_get_reg(vcpu, 0);
+#define SMCCC_DECLARE_GET_FUNC(type, name, reg) \
+static inline type smccc_get_##name(struct kvm_vcpu *vcpu) \
+{ \
+ return vcpu_get_reg(vcpu, reg); \
}

-static inline unsigned long smccc_get_arg1(struct kvm_vcpu *vcpu)
-{
- return vcpu_get_reg(vcpu, 1);
-}
-
-static inline unsigned long smccc_get_arg2(struct kvm_vcpu *vcpu)
-{
- return vcpu_get_reg(vcpu, 2);
-}
-
-static inline unsigned long smccc_get_arg3(struct kvm_vcpu *vcpu)
-{
- return vcpu_get_reg(vcpu, 3);
-}
+SMCCC_DECLARE_GET_FUNC(u32, function, 0)
+SMCCC_DECLARE_GET_FUNC(unsigned long, arg1, 1)
+SMCCC_DECLARE_GET_FUNC(unsigned long, arg2, 2)
+SMCCC_DECLARE_GET_FUNC(unsigned long, arg3, 3)
+SMCCC_DECLARE_GET_FUNC(unsigned long, arg4, 4)
+SMCCC_DECLARE_GET_FUNC(unsigned long, arg5, 5)
+SMCCC_DECLARE_GET_FUNC(unsigned long, arg6, 6)
+SMCCC_DECLARE_GET_FUNC(unsigned long, arg7, 7)
+SMCCC_DECLARE_GET_FUNC(unsigned long, arg8, 8)

static inline void smccc_set_retval(struct kvm_vcpu *vcpu,
unsigned long a0,
@@ -40,4 +34,6 @@ static inline void smccc_set_retval(struct kvm_vcpu *vcpu,
vcpu_set_reg(vcpu, 3, a3);
}

+int kvm_hvc_call_handler(struct kvm_vcpu *vcpu);
+
#endif
--
2.23.0

2021-08-15 00:16:25

by Gavin Shan

Subject: [PATCH v4 02/21] KVM: arm64: Add SDEI virtualization infrastructure

Software Delegated Exception Interface (SDEI) provides a mechanism for
registering and servicing system events. Those system events are
high-priority events that must be serviced immediately. It's going to
be used by Asynchronous Page Fault (APF) to deliver notifications from
KVM to the guest. Note that SDEI is defined by the ARM DEN0054A
specification.

This introduces the SDEI virtualization infrastructure, where the SDEI
events are registered and manipulated by the guest through hypercalls.
The SDEI event is delivered to one specific vCPU by KVM once it's
raised. This introduces the data structures representing the objects
needed to implement the feature, as highlighted below. As those objects
may need to be migrated by the VMM, these data structures are partially
exported to user space.

* kvm_sdei_event
SDEI events exported from KVM so that the guest is able to register
and manipulate them.
* kvm_sdei_kvm_event
SDEI event that has been registered by the guest.
* kvm_sdei_vcpu_event
SDEI event that has been delivered to the target vCPU.
* kvm_sdei_kvm
Placeholder for the exported and registered SDEI events.
* kvm_sdei_vcpu
Auxiliary object to save the preempted context during SDEI event
delivery.

For now, an error is returned for all SDEI hypercalls. They will be
implemented by the subsequent patches.

Signed-off-by: Gavin Shan <[email protected]>
---
arch/arm64/include/asm/kvm_host.h | 6 +
arch/arm64/include/asm/kvm_sdei.h | 118 +++++++++++++++
arch/arm64/include/uapi/asm/kvm.h | 1 +
arch/arm64/include/uapi/asm/kvm_sdei.h | 60 ++++++++
arch/arm64/kvm/Makefile | 2 +-
arch/arm64/kvm/arm.c | 7 +
arch/arm64/kvm/hypercalls.c | 18 +++
arch/arm64/kvm/sdei.c | 198 +++++++++++++++++++++++++
8 files changed, 409 insertions(+), 1 deletion(-)
create mode 100644 arch/arm64/include/asm/kvm_sdei.h
create mode 100644 arch/arm64/include/uapi/asm/kvm_sdei.h
create mode 100644 arch/arm64/kvm/sdei.c

diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index 41911585ae0c..aedf901e1ec7 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -113,6 +113,9 @@ struct kvm_arch {
/* Interrupt controller */
struct vgic_dist vgic;

+ /* SDEI support */
+ struct kvm_sdei_kvm *sdei;
+
/* Mandated version of PSCI */
u32 psci_version;

@@ -339,6 +342,9 @@ struct kvm_vcpu_arch {
* here.
*/

+ /* SDEI support */
+ struct kvm_sdei_vcpu *sdei;
+
/*
* Guest registers we preserve during guest debugging.
*
diff --git a/arch/arm64/include/asm/kvm_sdei.h b/arch/arm64/include/asm/kvm_sdei.h
new file mode 100644
index 000000000000..b0abc13a0256
--- /dev/null
+++ b/arch/arm64/include/asm/kvm_sdei.h
@@ -0,0 +1,118 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Definitions of various KVM SDEI events.
+ *
+ * Copyright (C) 2021 Red Hat, Inc.
+ *
+ * Author(s): Gavin Shan <[email protected]>
+ */
+
+#ifndef __ARM64_KVM_SDEI_H__
+#define __ARM64_KVM_SDEI_H__
+
+#include <uapi/linux/arm_sdei.h>
+#include <uapi/asm/kvm_sdei.h>
+#include <linux/bitmap.h>
+#include <linux/list.h>
+#include <linux/spinlock.h>
+
+struct kvm_sdei_event {
+ struct kvm_sdei_event_state state;
+ struct kvm *kvm;
+ struct list_head link;
+};
+
+struct kvm_sdei_kvm_event {
+ struct kvm_sdei_kvm_event_state state;
+ struct kvm_sdei_event *kse;
+ struct kvm *kvm;
+ struct list_head link;
+};
+
+struct kvm_sdei_vcpu_event {
+ struct kvm_sdei_vcpu_event_state state;
+ struct kvm_sdei_kvm_event *kske;
+ struct kvm_vcpu *vcpu;
+ struct list_head link;
+};
+
+struct kvm_sdei_kvm {
+ spinlock_t lock;
+ struct list_head events; /* kvm_sdei_event */
+ struct list_head kvm_events; /* kvm_sdei_kvm_event */
+};
+
+struct kvm_sdei_vcpu {
+ spinlock_t lock;
+ struct kvm_sdei_vcpu_state state;
+ struct kvm_sdei_vcpu_event *critical_event;
+ struct kvm_sdei_vcpu_event *normal_event;
+ struct list_head critical_events;
+ struct list_head normal_events;
+};
+
+/*
+ * According to the SDEI specification (v1.0), the event number spans
+ * 32 bits and the lower 24 bits are used as the (real) event number.
+ * We don't expect that many SDEI event numbers in one system, so we
+ * reserve two bits of the 24-bit real event number to indicate the
+ * event type: physical or virtual. One reserved bit is enough for now,
+ * but two bits are reserved for possible extension in future.
+ *
+ * The physical events are owned by the underlying firmware while the
+ * virtual events are used by the VMM and KVM.
+ */
+#define KVM_SDEI_EV_NUM_TYPE_SHIFT 22
+#define KVM_SDEI_EV_NUM_TYPE_MASK 3
+#define KVM_SDEI_EV_NUM_TYPE_PHYS 0
+#define KVM_SDEI_EV_NUM_TYPE_VIRT 1
+
+static inline bool kvm_sdei_is_valid_event_num(unsigned long num)
+{
+ unsigned long type;
+
+ if (num >> 32)
+ return false;
+
+ type = (num >> KVM_SDEI_EV_NUM_TYPE_SHIFT) & KVM_SDEI_EV_NUM_TYPE_MASK;
+ if (type != KVM_SDEI_EV_NUM_TYPE_VIRT)
+ return false;
+
+ return true;
+}
+
+/* Accessors for the registration or enablement states of KVM event */
+#define KVM_SDEI_FLAG_FUNC(field) \
+static inline bool kvm_sdei_is_##field(struct kvm_sdei_kvm_event *kske, \
+ unsigned int index) \
+{ \
+ return !!test_bit(index, (void *)(kske->state.field)); \
+} \
+ \
+static inline bool kvm_sdei_empty_##field(struct kvm_sdei_kvm_event *kske) \
+{ \
+ return bitmap_empty((void *)(kske->state.field), \
+ KVM_SDEI_MAX_VCPUS); \
+} \
+static inline void kvm_sdei_set_##field(struct kvm_sdei_kvm_event *kske, \
+ unsigned int index) \
+{ \
+ set_bit(index, (void *)(kske->state.field)); \
+} \
+static inline void kvm_sdei_clear_##field(struct kvm_sdei_kvm_event *kske, \
+ unsigned int index) \
+{ \
+ clear_bit(index, (void *)(kske->state.field)); \
+}
+
+KVM_SDEI_FLAG_FUNC(registered)
+KVM_SDEI_FLAG_FUNC(enabled)
+
+/* APIs */
+void kvm_sdei_init_vm(struct kvm *kvm);
+void kvm_sdei_create_vcpu(struct kvm_vcpu *vcpu);
+int kvm_sdei_hypercall(struct kvm_vcpu *vcpu);
+void kvm_sdei_destroy_vcpu(struct kvm_vcpu *vcpu);
+void kvm_sdei_destroy_vm(struct kvm *kvm);
+
+#endif /* __ARM64_KVM_SDEI_H__ */
diff --git a/arch/arm64/include/uapi/asm/kvm.h b/arch/arm64/include/uapi/asm/kvm.h
index b3edde68bc3e..e1b200bb6482 100644
--- a/arch/arm64/include/uapi/asm/kvm.h
+++ b/arch/arm64/include/uapi/asm/kvm.h
@@ -36,6 +36,7 @@
#include <linux/types.h>
#include <asm/ptrace.h>
#include <asm/sve_context.h>
+#include <asm/kvm_sdei.h>

#define __KVM_HAVE_GUEST_DEBUG
#define __KVM_HAVE_IRQ_LINE
diff --git a/arch/arm64/include/uapi/asm/kvm_sdei.h b/arch/arm64/include/uapi/asm/kvm_sdei.h
new file mode 100644
index 000000000000..8928027023f6
--- /dev/null
+++ b/arch/arm64/include/uapi/asm/kvm_sdei.h
@@ -0,0 +1,60 @@
+/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
+/*
+ * Definitions of various KVM SDEI event states.
+ *
+ * Copyright (C) 2021 Red Hat, Inc.
+ *
+ * Author(s): Gavin Shan <[email protected]>
+ */
+
+#ifndef _UAPI__ASM_KVM_SDEI_H
+#define _UAPI__ASM_KVM_SDEI_H
+
+#ifndef __ASSEMBLY__
+#include <linux/types.h>
+
+#define KVM_SDEI_MAX_VCPUS 512
+#define KVM_SDEI_INVALID_NUM 0
+#define KVM_SDEI_DEFAULT_NUM 0x40400000
+
+struct kvm_sdei_event_state {
+ __u64 num;
+
+ __u8 type;
+ __u8 signaled;
+ __u8 priority;
+};
+
+struct kvm_sdei_kvm_event_state {
+ __u64 num;
+ __u32 refcount;
+
+ __u8 route_mode;
+ __u64 route_affinity;
+ __u64 entries[KVM_SDEI_MAX_VCPUS];
+ __u64 params[KVM_SDEI_MAX_VCPUS];
+ __u64 registered[KVM_SDEI_MAX_VCPUS/64];
+ __u64 enabled[KVM_SDEI_MAX_VCPUS/64];
+};
+
+struct kvm_sdei_vcpu_event_state {
+ __u64 num;
+ __u32 refcount;
+};
+
+struct kvm_sdei_vcpu_regs {
+ __u64 regs[18];
+ __u64 pc;
+ __u64 pstate;
+};
+
+struct kvm_sdei_vcpu_state {
+ __u8 masked;
+ __u64 critical_num;
+ __u64 normal_num;
+ struct kvm_sdei_vcpu_regs critical_regs;
+ struct kvm_sdei_vcpu_regs normal_regs;
+};
+
+#endif /* !__ASSEMBLY__ */
+#endif /* _UAPI__ASM_KVM_SDEI_H */
diff --git a/arch/arm64/kvm/Makefile b/arch/arm64/kvm/Makefile
index 989bb5dad2c8..eefca8ca394d 100644
--- a/arch/arm64/kvm/Makefile
+++ b/arch/arm64/kvm/Makefile
@@ -16,7 +16,7 @@ kvm-y := $(KVM)/kvm_main.o $(KVM)/coalesced_mmio.o $(KVM)/eventfd.o \
inject_fault.o va_layout.o handle_exit.o \
guest.o debug.o reset.o sys_regs.o \
vgic-sys-reg-v3.o fpsimd.o pmu.o \
- arch_timer.o trng.o\
+ arch_timer.o trng.o sdei.o \
vgic/vgic.o vgic/vgic-init.o \
vgic/vgic-irqfd.o vgic/vgic-v2.o \
vgic/vgic-v3.o vgic/vgic-v4.o \
diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
index e9a2b8f27792..2f021aa41632 100644
--- a/arch/arm64/kvm/arm.c
+++ b/arch/arm64/kvm/arm.c
@@ -150,6 +150,8 @@ int kvm_arch_init_vm(struct kvm *kvm, unsigned long type)

kvm_vgic_early_init(kvm);

+ kvm_sdei_init_vm(kvm);
+
/* The maximum number of VCPUs is limited by the host's GIC model */
kvm->arch.max_vcpus = kvm_arm_default_max_vcpus();

@@ -179,6 +181,8 @@ void kvm_arch_destroy_vm(struct kvm *kvm)

kvm_vgic_destroy(kvm);

+ kvm_sdei_destroy_vm(kvm);
+
for (i = 0; i < KVM_MAX_VCPUS; ++i) {
if (kvm->vcpus[i]) {
kvm_vcpu_destroy(kvm->vcpus[i]);
@@ -333,6 +337,8 @@ int kvm_arch_vcpu_create(struct kvm_vcpu *vcpu)

kvm_arm_pvtime_vcpu_init(&vcpu->arch);

+ kvm_sdei_create_vcpu(vcpu);
+
vcpu->arch.hw_mmu = &vcpu->kvm->arch.mmu;

err = kvm_vgic_vcpu_init(vcpu);
@@ -354,6 +360,7 @@ void kvm_arch_vcpu_destroy(struct kvm_vcpu *vcpu)
kvm_mmu_free_memory_cache(&vcpu->arch.mmu_page_cache);
kvm_timer_vcpu_terminate(vcpu);
kvm_pmu_vcpu_destroy(vcpu);
+ kvm_sdei_destroy_vcpu(vcpu);

kvm_arm_vcpu_destroy(vcpu);
}
diff --git a/arch/arm64/kvm/hypercalls.c b/arch/arm64/kvm/hypercalls.c
index 30da78f72b3b..d3fc893a4f58 100644
--- a/arch/arm64/kvm/hypercalls.c
+++ b/arch/arm64/kvm/hypercalls.c
@@ -139,6 +139,24 @@ int kvm_hvc_call_handler(struct kvm_vcpu *vcpu)
case ARM_SMCCC_TRNG_RND32:
case ARM_SMCCC_TRNG_RND64:
return kvm_trng_call(vcpu);
+ case SDEI_1_0_FN_SDEI_VERSION:
+ case SDEI_1_0_FN_SDEI_EVENT_REGISTER:
+ case SDEI_1_0_FN_SDEI_EVENT_ENABLE:
+ case SDEI_1_0_FN_SDEI_EVENT_DISABLE:
+ case SDEI_1_0_FN_SDEI_EVENT_CONTEXT:
+ case SDEI_1_0_FN_SDEI_EVENT_COMPLETE:
+ case SDEI_1_0_FN_SDEI_EVENT_COMPLETE_AND_RESUME:
+ case SDEI_1_0_FN_SDEI_EVENT_UNREGISTER:
+ case SDEI_1_0_FN_SDEI_EVENT_STATUS:
+ case SDEI_1_0_FN_SDEI_EVENT_GET_INFO:
+ case SDEI_1_0_FN_SDEI_EVENT_ROUTING_SET:
+ case SDEI_1_0_FN_SDEI_PE_MASK:
+ case SDEI_1_0_FN_SDEI_PE_UNMASK:
+ case SDEI_1_0_FN_SDEI_INTERRUPT_BIND:
+ case SDEI_1_0_FN_SDEI_INTERRUPT_RELEASE:
+ case SDEI_1_0_FN_SDEI_PRIVATE_RESET:
+ case SDEI_1_0_FN_SDEI_SHARED_RESET:
+ return kvm_sdei_hypercall(vcpu);
default:
return kvm_psci_call(vcpu);
}
diff --git a/arch/arm64/kvm/sdei.c b/arch/arm64/kvm/sdei.c
new file mode 100644
index 000000000000..ab330b74a965
--- /dev/null
+++ b/arch/arm64/kvm/sdei.c
@@ -0,0 +1,198 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * SDEI virtualization support.
+ *
+ * Copyright (C) 2021 Red Hat, Inc.
+ *
+ * Author(s): Gavin Shan <[email protected]>
+ */
+
+#include <linux/kernel.h>
+#include <linux/kvm_host.h>
+#include <linux/spinlock.h>
+#include <linux/slab.h>
+#include <kvm/arm_hypercalls.h>
+
+static struct kvm_sdei_event_state defined_kse[] = {
+ { KVM_SDEI_DEFAULT_NUM,
+ SDEI_EVENT_TYPE_PRIVATE,
+ 1,
+ SDEI_EVENT_PRIORITY_CRITICAL
+ },
+};
+
+static void kvm_sdei_remove_events(struct kvm *kvm)
+{
+ struct kvm_sdei_kvm *ksdei = kvm->arch.sdei;
+ struct kvm_sdei_event *kse, *tmp;
+
+ list_for_each_entry_safe(kse, tmp, &ksdei->events, link) {
+ list_del(&kse->link);
+ kfree(kse);
+ }
+}
+
+static void kvm_sdei_remove_kvm_events(struct kvm *kvm,
+ unsigned int mask,
+ bool force)
+{
+ struct kvm_sdei_kvm *ksdei = kvm->arch.sdei;
+ struct kvm_sdei_event *kse;
+ struct kvm_sdei_kvm_event *kske, *tmp;
+
+ list_for_each_entry_safe(kske, tmp, &ksdei->kvm_events, link) {
+ kse = kske->kse;
+
+ if (!((1 << kse->state.type) & mask))
+ continue;
+
+ if (!force && kske->state.refcount)
+ continue;
+
+ list_del(&kske->link);
+ kfree(kske);
+ }
+}
+
+static void kvm_sdei_remove_vcpu_events(struct kvm_vcpu *vcpu)
+{
+ struct kvm_sdei_vcpu *vsdei = vcpu->arch.sdei;
+ struct kvm_sdei_vcpu_event *ksve, *tmp;
+
+ list_for_each_entry_safe(ksve, tmp, &vsdei->critical_events, link) {
+ list_del(&ksve->link);
+ kfree(ksve);
+ }
+
+ list_for_each_entry_safe(ksve, tmp, &vsdei->normal_events, link) {
+ list_del(&ksve->link);
+ kfree(ksve);
+ }
+}
+
+int kvm_sdei_hypercall(struct kvm_vcpu *vcpu)
+{
+ u32 func = smccc_get_function(vcpu);
+ bool has_result = true;
+ unsigned long ret;
+
+ switch (func) {
+ case SDEI_1_0_FN_SDEI_VERSION:
+ case SDEI_1_0_FN_SDEI_EVENT_REGISTER:
+ case SDEI_1_0_FN_SDEI_EVENT_ENABLE:
+ case SDEI_1_0_FN_SDEI_EVENT_DISABLE:
+ case SDEI_1_0_FN_SDEI_EVENT_CONTEXT:
+ case SDEI_1_0_FN_SDEI_EVENT_COMPLETE:
+ case SDEI_1_0_FN_SDEI_EVENT_COMPLETE_AND_RESUME:
+ case SDEI_1_0_FN_SDEI_EVENT_UNREGISTER:
+ case SDEI_1_0_FN_SDEI_EVENT_STATUS:
+ case SDEI_1_0_FN_SDEI_EVENT_GET_INFO:
+ case SDEI_1_0_FN_SDEI_EVENT_ROUTING_SET:
+ case SDEI_1_0_FN_SDEI_PE_MASK:
+ case SDEI_1_0_FN_SDEI_PE_UNMASK:
+ case SDEI_1_0_FN_SDEI_INTERRUPT_BIND:
+ case SDEI_1_0_FN_SDEI_INTERRUPT_RELEASE:
+ case SDEI_1_0_FN_SDEI_PRIVATE_RESET:
+ case SDEI_1_0_FN_SDEI_SHARED_RESET:
+ default:
+ ret = SDEI_NOT_SUPPORTED;
+ }
+
+ /*
+ * We don't have return value for COMPLETE or COMPLETE_AND_RESUME
+ * hypercalls. Otherwise, the restored context will be corrupted.
+ */
+ if (has_result)
+ smccc_set_retval(vcpu, ret, 0, 0, 0);
+
+ return 1;
+}
+
+void kvm_sdei_init_vm(struct kvm *kvm)
+{
+ struct kvm_sdei_kvm *ksdei;
+ struct kvm_sdei_event *kse;
+ int i;
+
+ ksdei = kzalloc(sizeof(*ksdei), GFP_KERNEL);
+ if (!ksdei)
+ return;
+
+ spin_lock_init(&ksdei->lock);
+ INIT_LIST_HEAD(&ksdei->events);
+ INIT_LIST_HEAD(&ksdei->kvm_events);
+
+ /*
+ * Populate the defined KVM SDEI events. The whole functionality
+ * will be disabled on any errors.
+ */
+ for (i = 0; i < ARRAY_SIZE(defined_kse); i++) {
+ kse = kzalloc(sizeof(*kse), GFP_KERNEL);
+ if (!kse) {
+ kvm_sdei_remove_events(kvm);
+ kfree(ksdei);
+ return;
+ }
+
+ kse->kvm = kvm;
+ kse->state = defined_kse[i];
+ list_add_tail(&kse->link, &ksdei->events);
+ }
+
+ kvm->arch.sdei = ksdei;
+}
+
+void kvm_sdei_create_vcpu(struct kvm_vcpu *vcpu)
+{
+ struct kvm *kvm = vcpu->kvm;
+ struct kvm_sdei_vcpu *vsdei;
+
+ if (!kvm->arch.sdei)
+ return;
+
+ vsdei = kzalloc(sizeof(*vsdei), GFP_KERNEL);
+ if (!vsdei)
+ return;
+
+ spin_lock_init(&vsdei->lock);
+ vsdei->state.masked = 1;
+ vsdei->state.critical_num = KVM_SDEI_INVALID_NUM;
+ vsdei->state.normal_num = KVM_SDEI_INVALID_NUM;
+ vsdei->critical_event = NULL;
+ vsdei->normal_event = NULL;
+ INIT_LIST_HEAD(&vsdei->critical_events);
+ INIT_LIST_HEAD(&vsdei->normal_events);
+
+ vcpu->arch.sdei = vsdei;
+}
+
+void kvm_sdei_destroy_vcpu(struct kvm_vcpu *vcpu)
+{
+ struct kvm_sdei_vcpu *vsdei = vcpu->arch.sdei;
+
+ if (vsdei) {
+ spin_lock(&vsdei->lock);
+ kvm_sdei_remove_vcpu_events(vcpu);
+ spin_unlock(&vsdei->lock);
+
+ kfree(vsdei);
+ vcpu->arch.sdei = NULL;
+ }
+}
+
+void kvm_sdei_destroy_vm(struct kvm *kvm)
+{
+ struct kvm_sdei_kvm *ksdei = kvm->arch.sdei;
+ unsigned int mask = (1 << SDEI_EVENT_TYPE_PRIVATE) |
+ (1 << SDEI_EVENT_TYPE_SHARED);
+
+ if (ksdei) {
+ spin_lock(&ksdei->lock);
+ kvm_sdei_remove_kvm_events(kvm, mask, true);
+ kvm_sdei_remove_events(kvm);
+ spin_unlock(&ksdei->lock);
+
+ kfree(ksdei);
+ kvm->arch.sdei = NULL;
+ }
+}
--
2.23.0

2021-08-15 00:16:33

by Gavin Shan

Subject: [PATCH v4 03/21] KVM: arm64: Support SDEI_VERSION hypercall

This supports the SDEI_VERSION hypercall by simply returning v1.0.0
when the functionality is supported on the VM and vCPU.

Signed-off-by: Gavin Shan <[email protected]>
---
arch/arm64/kvm/sdei.c | 18 ++++++++++++++++++
1 file changed, 18 insertions(+)

diff --git a/arch/arm64/kvm/sdei.c b/arch/arm64/kvm/sdei.c
index ab330b74a965..aa9485f076a9 100644
--- a/arch/arm64/kvm/sdei.c
+++ b/arch/arm64/kvm/sdei.c
@@ -70,6 +70,22 @@ static void kvm_sdei_remove_vcpu_events(struct kvm_vcpu *vcpu)
}
}

+static unsigned long kvm_sdei_hypercall_version(struct kvm_vcpu *vcpu)
+{
+ struct kvm *kvm = vcpu->kvm;
+ struct kvm_sdei_kvm *ksdei = kvm->arch.sdei;
+ struct kvm_sdei_vcpu *vsdei = vcpu->arch.sdei;
+ unsigned long ret = SDEI_NOT_SUPPORTED;
+
+ if (!(ksdei && vsdei))
+ return ret;
+
+ /* v1.0.0 */
+ ret = (1UL << SDEI_VERSION_MAJOR_SHIFT);
+
+ return ret;
+}
+
int kvm_sdei_hypercall(struct kvm_vcpu *vcpu)
{
u32 func = smccc_get_function(vcpu);
@@ -78,6 +94,8 @@ int kvm_sdei_hypercall(struct kvm_vcpu *vcpu)

switch (func) {
case SDEI_1_0_FN_SDEI_VERSION:
+ ret = kvm_sdei_hypercall_version(vcpu);
+ break;
case SDEI_1_0_FN_SDEI_EVENT_REGISTER:
case SDEI_1_0_FN_SDEI_EVENT_ENABLE:
case SDEI_1_0_FN_SDEI_EVENT_DISABLE:
--
2.23.0

2021-08-15 00:17:01

by Gavin Shan

[permalink] [raw]
Subject: [PATCH v4 04/21] KVM: arm64: Support SDEI_EVENT_REGISTER hypercall

This supports the SDEI_EVENT_REGISTER hypercall, which is used by the
guest to register SDEI events. An SDEI event won't be raised to the
guest or a specific vCPU until it's registered and enabled explicitly.

Only those events that have been exported by KVM can be registered.
After the event is registered successfully, the KVM SDEI event (object)
is created or updated, since the same KVM SDEI event is shared by
multiple vCPUs if it's a private event.

Signed-off-by: Gavin Shan <[email protected]>
---
arch/arm64/kvm/sdei.c | 122 ++++++++++++++++++++++++++++++++++++++++++
1 file changed, 122 insertions(+)

diff --git a/arch/arm64/kvm/sdei.c b/arch/arm64/kvm/sdei.c
index aa9485f076a9..d3ea3eee154b 100644
--- a/arch/arm64/kvm/sdei.c
+++ b/arch/arm64/kvm/sdei.c
@@ -21,6 +21,20 @@ static struct kvm_sdei_event_state defined_kse[] = {
},
};

+static struct kvm_sdei_event *kvm_sdei_find_event(struct kvm *kvm,
+ unsigned long num)
+{
+ struct kvm_sdei_kvm *ksdei = kvm->arch.sdei;
+ struct kvm_sdei_event *kse;
+
+ list_for_each_entry(kse, &ksdei->events, link) {
+ if (kse->state.num == num)
+ return kse;
+ }
+
+ return NULL;
+}
+
static void kvm_sdei_remove_events(struct kvm *kvm)
{
struct kvm_sdei_kvm *ksdei = kvm->arch.sdei;
@@ -32,6 +46,20 @@ static void kvm_sdei_remove_events(struct kvm *kvm)
}
}

+static struct kvm_sdei_kvm_event *kvm_sdei_find_kvm_event(struct kvm *kvm,
+ unsigned long num)
+{
+ struct kvm_sdei_kvm *ksdei = kvm->arch.sdei;
+ struct kvm_sdei_kvm_event *kske;
+
+ list_for_each_entry(kske, &ksdei->kvm_events, link) {
+ if (kske->state.num == num)
+ return kske;
+ }
+
+ return NULL;
+}
+
static void kvm_sdei_remove_kvm_events(struct kvm *kvm,
unsigned int mask,
bool force)
@@ -86,6 +114,98 @@ static unsigned long kvm_sdei_hypercall_version(struct kvm_vcpu *vcpu)
return ret;
}

+static unsigned long kvm_sdei_hypercall_register(struct kvm_vcpu *vcpu)
+{
+ struct kvm *kvm = vcpu->kvm;
+ struct kvm_sdei_kvm *ksdei = kvm->arch.sdei;
+ struct kvm_sdei_vcpu *vsdei = vcpu->arch.sdei;
+ struct kvm_sdei_event *kse = NULL;
+ struct kvm_sdei_kvm_event *kske = NULL;
+ unsigned long event_num = smccc_get_arg1(vcpu);
+ unsigned long event_entry = smccc_get_arg2(vcpu);
+ unsigned long event_param = smccc_get_arg3(vcpu);
+ unsigned long route_mode = smccc_get_arg4(vcpu);
+ unsigned long route_affinity = smccc_get_arg5(vcpu);
+ int index = vcpu->vcpu_idx;
+ unsigned long ret = SDEI_SUCCESS;
+
+ /* Sanity check */
+ if (!(ksdei && vsdei)) {
+ ret = SDEI_NOT_SUPPORTED;
+ goto out;
+ }
+
+ if (!kvm_sdei_is_valid_event_num(event_num)) {
+ ret = SDEI_INVALID_PARAMETERS;
+ goto out;
+ }
+
+ if (!(route_mode == SDEI_EVENT_REGISTER_RM_ANY ||
+ route_mode == SDEI_EVENT_REGISTER_RM_PE)) {
+ ret = SDEI_INVALID_PARAMETERS;
+ goto out;
+ }
+
+ /*
+ * The KVM event could have been created if it's a private event.
+ * We needn't create a KVM event in this case.
+ */
+ spin_lock(&ksdei->lock);
+ kske = kvm_sdei_find_kvm_event(kvm, event_num);
+ if (kske) {
+ kse = kske->kse;
+ index = (kse->state.type == SDEI_EVENT_TYPE_PRIVATE) ?
+ vcpu->vcpu_idx : 0;
+
+ if (kvm_sdei_is_registered(kske, index)) {
+ ret = SDEI_DENIED;
+ goto unlock;
+ }
+
+ kske->state.route_mode = route_mode;
+ kske->state.route_affinity = route_affinity;
+ kske->state.entries[index] = event_entry;
+ kske->state.params[index] = event_param;
+ kvm_sdei_set_registered(kske, index);
+ goto unlock;
+ }
+
+ /* Check if the event number has been registered */
+ kse = kvm_sdei_find_event(kvm, event_num);
+ if (!kse) {
+ ret = SDEI_INVALID_PARAMETERS;
+ goto unlock;
+ }
+
+ /* Create KVM event */
+ kske = kzalloc(sizeof(*kske), GFP_KERNEL);
+ if (!kske) {
+ ret = SDEI_OUT_OF_RESOURCE;
+ goto unlock;
+ }
+
+ /* Initialize KVM event state */
+ index = (kse->state.type == SDEI_EVENT_TYPE_PRIVATE) ?
+ vcpu->vcpu_idx : 0;
+ kske->state.num = event_num;
+ kske->state.refcount = 0;
+ kske->state.route_mode = route_mode;
+ kske->state.route_affinity = route_affinity;
+ kske->state.entries[index] = event_entry;
+ kske->state.params[index] = event_param;
+ kvm_sdei_set_registered(kske, index);
+
+ /* Initialize KVM event */
+ kske->kse = kse;
+ kske->kvm = kvm;
+ list_add_tail(&kske->link, &ksdei->kvm_events);
+
+unlock:
+ spin_unlock(&ksdei->lock);
+out:
+ return ret;
+}
+
int kvm_sdei_hypercall(struct kvm_vcpu *vcpu)
{
u32 func = smccc_get_function(vcpu);
@@ -97,6 +217,8 @@ int kvm_sdei_hypercall(struct kvm_vcpu *vcpu)
ret = kvm_sdei_hypercall_version(vcpu);
break;
case SDEI_1_0_FN_SDEI_EVENT_REGISTER:
+ ret = kvm_sdei_hypercall_register(vcpu);
+ break;
case SDEI_1_0_FN_SDEI_EVENT_ENABLE:
case SDEI_1_0_FN_SDEI_EVENT_DISABLE:
case SDEI_1_0_FN_SDEI_EVENT_CONTEXT:
--
2.23.0

2021-08-15 00:17:17

by Gavin Shan

Subject: [PATCH v4 06/21] KVM: arm64: Support SDEI_EVENT_CONTEXT hypercall

This supports the SDEI_EVENT_CONTEXT hypercall. It's used by the guest
to retrieve the original registers (R0 - R17) in its SDEI event
handler, since those registers can be corrupted during the SDEI event
delivery.

Signed-off-by: Gavin Shan <[email protected]>
---
arch/arm64/kvm/sdei.c | 40 ++++++++++++++++++++++++++++++++++++++++
1 file changed, 40 insertions(+)

diff --git a/arch/arm64/kvm/sdei.c b/arch/arm64/kvm/sdei.c
index b022ce0a202b..b4162efda470 100644
--- a/arch/arm64/kvm/sdei.c
+++ b/arch/arm64/kvm/sdei.c
@@ -270,6 +270,44 @@ static unsigned long kvm_sdei_hypercall_enable(struct kvm_vcpu *vcpu,
return ret;
}

+static unsigned long kvm_sdei_hypercall_context(struct kvm_vcpu *vcpu)
+{
+ struct kvm *kvm = vcpu->kvm;
+ struct kvm_sdei_kvm *ksdei = kvm->arch.sdei;
+ struct kvm_sdei_vcpu *vsdei = vcpu->arch.sdei;
+ struct kvm_sdei_vcpu_regs *regs;
+ unsigned long index = smccc_get_arg1(vcpu);
+ unsigned long ret = SDEI_SUCCESS;
+
+ /* Sanity check */
+ if (!(ksdei && vsdei)) {
+ ret = SDEI_NOT_SUPPORTED;
+ goto out;
+ }
+
+ if (index >= ARRAY_SIZE(vsdei->state.critical_regs.regs)) {
+ ret = SDEI_INVALID_PARAMETERS;
+ goto out;
+ }
+
+ /* Check if the pending event exists */
+ spin_lock(&vsdei->lock);
+ if (!(vsdei->critical_event || vsdei->normal_event)) {
+ ret = SDEI_DENIED;
+ goto unlock;
+ }
+
+ /* Fetch the requested register */
+ regs = vsdei->critical_event ? &vsdei->state.critical_regs :
+ &vsdei->state.normal_regs;
+ ret = regs->regs[index];
+
+unlock:
+ spin_unlock(&vsdei->lock);
+out:
+ return ret;
+}
+
int kvm_sdei_hypercall(struct kvm_vcpu *vcpu)
{
u32 func = smccc_get_function(vcpu);
@@ -290,6 +328,8 @@ int kvm_sdei_hypercall(struct kvm_vcpu *vcpu)
ret = kvm_sdei_hypercall_enable(vcpu, false);
break;
case SDEI_1_0_FN_SDEI_EVENT_CONTEXT:
+ ret = kvm_sdei_hypercall_context(vcpu);
+ break;
case SDEI_1_0_FN_SDEI_EVENT_COMPLETE:
case SDEI_1_0_FN_SDEI_EVENT_COMPLETE_AND_RESUME:
case SDEI_1_0_FN_SDEI_EVENT_UNREGISTER:
--
2.23.0

2021-08-15 00:17:25

by Gavin Shan

Subject: [PATCH v4 07/21] KVM: arm64: Support SDEI_EVENT_UNREGISTER hypercall

This supports SDEI_EVENT_UNREGISTER hypercall. It's used by the
guest to unregister an SDEI event. The SDEI event won't be raised
to the guest or the specific vCPU after it's unregistered
successfully. Note that the SDEI event is automatically disabled
on the guest or the specific vCPU once it's unregistered.
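The unregister rules reduce to a small state machine: refuse while
deliveries are pending, refuse if not registered, otherwise clear both
the enabled and registered flags. A hedged user-space sketch with
illustrative names; the error values mirror the SDEI specification:

```c
#include <assert.h>

/* Illustrative error codes matching the SDEI specification values */
#define SDEI_SUCCESS   0
#define SDEI_DENIED   (-3)
#define SDEI_PENDING  (-5)

struct sdei_unreg_ev {
	int registered;
	int enabled;
	int refcount;   /* events still queued for delivery */
};

/* Unregister refuses pending events and implicitly disables the event */
int sdei_unregister(struct sdei_unreg_ev *ev)
{
	if (ev->refcount)
		return SDEI_PENDING;
	if (!ev->registered)
		return SDEI_DENIED;
	ev->enabled = 0;     /* disabled automatically on unregister */
	ev->registered = 0;
	return SDEI_SUCCESS;
}
```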

Signed-off-by: Gavin Shan <[email protected]>
---
arch/arm64/kvm/sdei.c | 61 +++++++++++++++++++++++++++++++++++++++++++
1 file changed, 61 insertions(+)

diff --git a/arch/arm64/kvm/sdei.c b/arch/arm64/kvm/sdei.c
index b4162efda470..a3ba69dc91cb 100644
--- a/arch/arm64/kvm/sdei.c
+++ b/arch/arm64/kvm/sdei.c
@@ -308,6 +308,65 @@ static unsigned long kvm_sdei_hypercall_context(struct kvm_vcpu *vcpu)
return ret;
}

+static unsigned long kvm_sdei_hypercall_unregister(struct kvm_vcpu *vcpu)
+{
+ struct kvm *kvm = vcpu->kvm;
+ struct kvm_sdei_kvm *ksdei = kvm->arch.sdei;
+ struct kvm_sdei_vcpu *vsdei = vcpu->arch.sdei;
+ struct kvm_sdei_event *kse = NULL;
+ struct kvm_sdei_kvm_event *kske = NULL;
+ unsigned long event_num = smccc_get_arg1(vcpu);
+ int index = 0;
+ unsigned long ret = SDEI_SUCCESS;
+
+ /* Sanity check */
+ if (!(ksdei && vsdei)) {
+ ret = SDEI_NOT_SUPPORTED;
+ goto out;
+ }
+
+ if (!kvm_sdei_is_valid_event_num(event_num)) {
+ ret = SDEI_INVALID_PARAMETERS;
+ goto out;
+ }
+
+ /* Check if the KVM event exists */
+ spin_lock(&ksdei->lock);
+ kske = kvm_sdei_find_kvm_event(kvm, event_num);
+ if (!kske) {
+ ret = SDEI_INVALID_PARAMETERS;
+ goto unlock;
+ }
+
+ /* Check if there are pending events */
+ if (kske->state.refcount) {
+ ret = SDEI_PENDING;
+ goto unlock;
+ }
+
+ /* Check if it has been registered */
+ kse = kske->kse;
+ index = (kse->state.type == SDEI_EVENT_TYPE_PRIVATE) ?
+ vcpu->vcpu_idx : 0;
+ if (!kvm_sdei_is_registered(kske, index)) {
+ ret = SDEI_DENIED;
+ goto unlock;
+ }
+
+ /* The event is disabled when it's unregistered */
+ kvm_sdei_clear_enabled(kske, index);
+ kvm_sdei_clear_registered(kske, index);
+ if (kvm_sdei_empty_registered(kske)) {
+ list_del(&kske->link);
+ kfree(kske);
+ }
+
+unlock:
+ spin_unlock(&ksdei->lock);
+out:
+ return ret;
+}
+
int kvm_sdei_hypercall(struct kvm_vcpu *vcpu)
{
u32 func = smccc_get_function(vcpu);
@@ -333,6 +392,8 @@ int kvm_sdei_hypercall(struct kvm_vcpu *vcpu)
case SDEI_1_0_FN_SDEI_EVENT_COMPLETE:
case SDEI_1_0_FN_SDEI_EVENT_COMPLETE_AND_RESUME:
case SDEI_1_0_FN_SDEI_EVENT_UNREGISTER:
+ ret = kvm_sdei_hypercall_unregister(vcpu);
+ break;
case SDEI_1_0_FN_SDEI_EVENT_STATUS:
case SDEI_1_0_FN_SDEI_EVENT_GET_INFO:
case SDEI_1_0_FN_SDEI_EVENT_ROUTING_SET:
--
2.23.0

2021-08-15 00:17:32

by Gavin Shan

Subject: [PATCH v4 08/21] KVM: arm64: Support SDEI_EVENT_STATUS hypercall

This supports SDEI_EVENT_STATUS hypercall. It's used by the guest
to retrieve a bitmap indicating the SDEI event's state: whether it
is registered, enabled, or running (being delivered).
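The returned bitmap is just the three per-event flags packed into bit
positions 0-2. A minimal model, using the bit positions defined by the
SDEI specification (the function name is illustrative):

```c
#include <assert.h>
#include <stdint.h>

/* Bit positions per the SDEI specification */
#define SDEI_EVENT_STATUS_REGISTERED 0
#define SDEI_EVENT_STATUS_ENABLED    1
#define SDEI_EVENT_STATUS_RUNNING    2

/* Compose the EVENT_STATUS return value from the three per-event flags */
uint64_t sdei_event_status(int registered, int enabled, int running)
{
	uint64_t status = 0;

	if (registered)
		status |= 1UL << SDEI_EVENT_STATUS_REGISTERED;
	if (enabled)
		status |= 1UL << SDEI_EVENT_STATUS_ENABLED;
	if (running)
		status |= 1UL << SDEI_EVENT_STATUS_RUNNING;

	return status;
}
```

An unknown event reports 0, matching the hypercall's behavior when no
KVM event is found.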

Signed-off-by: Gavin Shan <[email protected]>
---
arch/arm64/kvm/sdei.c | 50 +++++++++++++++++++++++++++++++++++++++++++
1 file changed, 50 insertions(+)

diff --git a/arch/arm64/kvm/sdei.c b/arch/arm64/kvm/sdei.c
index a3ba69dc91cb..b95b8c4455e1 100644
--- a/arch/arm64/kvm/sdei.c
+++ b/arch/arm64/kvm/sdei.c
@@ -367,6 +367,54 @@ static unsigned long kvm_sdei_hypercall_unregister(struct kvm_vcpu *vcpu)
return ret;
}

+static unsigned long kvm_sdei_hypercall_status(struct kvm_vcpu *vcpu)
+{
+ struct kvm *kvm = vcpu->kvm;
+ struct kvm_sdei_kvm *ksdei = kvm->arch.sdei;
+ struct kvm_sdei_vcpu *vsdei = vcpu->arch.sdei;
+ struct kvm_sdei_event *kse = NULL;
+ struct kvm_sdei_kvm_event *kske = NULL;
+ unsigned long event_num = smccc_get_arg1(vcpu);
+ int index = 0;
+ unsigned long ret = SDEI_SUCCESS;
+
+ /* Sanity check */
+ if (!(ksdei && vsdei)) {
+ ret = SDEI_NOT_SUPPORTED;
+ goto out;
+ }
+
+ if (!kvm_sdei_is_valid_event_num(event_num)) {
+ ret = SDEI_INVALID_PARAMETERS;
+ goto out;
+ }
+
+ /*
+ * Check if the KVM event exists. None of the flags
+ * will be set if it doesn't exist.
+ */
+ spin_lock(&ksdei->lock);
+ kske = kvm_sdei_find_kvm_event(kvm, event_num);
+ if (!kske) {
+ ret = 0;
+ goto unlock;
+ }
+
+ kse = kske->kse;
+ index = (kse->state.type == SDEI_EVENT_TYPE_PRIVATE) ?
+ vcpu->vcpu_idx : 0;
+ if (kvm_sdei_is_registered(kske, index))
+ ret |= (1UL << SDEI_EVENT_STATUS_REGISTERED);
+ if (kvm_sdei_is_enabled(kske, index))
+ ret |= (1UL << SDEI_EVENT_STATUS_ENABLED);
+ if (kske->state.refcount)
+ ret |= (1UL << SDEI_EVENT_STATUS_RUNNING);
+
+unlock:
+ spin_unlock(&ksdei->lock);
+out:
+ return ret;
+}
+
int kvm_sdei_hypercall(struct kvm_vcpu *vcpu)
{
u32 func = smccc_get_function(vcpu);
@@ -395,6 +443,8 @@ int kvm_sdei_hypercall(struct kvm_vcpu *vcpu)
ret = kvm_sdei_hypercall_unregister(vcpu);
break;
case SDEI_1_0_FN_SDEI_EVENT_STATUS:
+ ret = kvm_sdei_hypercall_status(vcpu);
+ break;
case SDEI_1_0_FN_SDEI_EVENT_GET_INFO:
case SDEI_1_0_FN_SDEI_EVENT_ROUTING_SET:
case SDEI_1_0_FN_SDEI_PE_MASK:
--
2.23.0

2021-08-15 00:17:44

by Gavin Shan

Subject: [PATCH v4 09/21] KVM: arm64: Support SDEI_EVENT_GET_INFO hypercall

This supports SDEI_EVENT_GET_INFO hypercall. It's used by the guest
to retrieve various information about a supported (exposed) event,
including its type, signaled flag, priority, and, for shared events,
the routing mode and affinity.

Signed-off-by: Gavin Shan <[email protected]>
---
arch/arm64/kvm/sdei.c | 76 +++++++++++++++++++++++++++++++++++++++++++
1 file changed, 76 insertions(+)

diff --git a/arch/arm64/kvm/sdei.c b/arch/arm64/kvm/sdei.c
index b95b8c4455e1..5dfa74b093f1 100644
--- a/arch/arm64/kvm/sdei.c
+++ b/arch/arm64/kvm/sdei.c
@@ -415,6 +415,80 @@ static unsigned long kvm_sdei_hypercall_status(struct kvm_vcpu *vcpu)
return ret;
}

+static unsigned long kvm_sdei_hypercall_info(struct kvm_vcpu *vcpu)
+{
+ struct kvm *kvm = vcpu->kvm;
+ struct kvm_sdei_kvm *ksdei = kvm->arch.sdei;
+ struct kvm_sdei_vcpu *vsdei = vcpu->arch.sdei;
+ struct kvm_sdei_event *kse = NULL;
+ struct kvm_sdei_kvm_event *kske = NULL;
+ unsigned long event_num = smccc_get_arg1(vcpu);
+ unsigned long event_info = smccc_get_arg2(vcpu);
+ unsigned long ret = SDEI_SUCCESS;
+
+ /* Sanity check */
+ if (!(ksdei && vsdei)) {
+ ret = SDEI_NOT_SUPPORTED;
+ goto out;
+ }
+
+ if (!kvm_sdei_is_valid_event_num(event_num)) {
+ ret = SDEI_INVALID_PARAMETERS;
+ goto out;
+ }
+
+ /*
+ * Check if the KVM event exists. The event might have been
+ * registered; in that case, fetch the information from the
+ * registered event.
+ */
+ spin_lock(&ksdei->lock);
+ kske = kvm_sdei_find_kvm_event(kvm, event_num);
+ kse = kske ? kske->kse : NULL;
+ if (!kse) {
+ kse = kvm_sdei_find_event(kvm, event_num);
+ if (!kse) {
+ ret = SDEI_INVALID_PARAMETERS;
+ goto unlock;
+ }
+ }
+
+ /* Retrieve the requested information */
+ switch (event_info) {
+ case SDEI_EVENT_INFO_EV_TYPE:
+ ret = kse->state.type;
+ break;
+ case SDEI_EVENT_INFO_EV_SIGNALED:
+ ret = kse->state.signaled;
+ break;
+ case SDEI_EVENT_INFO_EV_PRIORITY:
+ ret = kse->state.priority;
+ break;
+ case SDEI_EVENT_INFO_EV_ROUTING_MODE:
+ case SDEI_EVENT_INFO_EV_ROUTING_AFF:
+ if (kse->state.type != SDEI_EVENT_TYPE_SHARED) {
+ ret = SDEI_INVALID_PARAMETERS;
+ break;
+ }
+
+ if (event_info == SDEI_EVENT_INFO_EV_ROUTING_MODE) {
+ ret = kske ? kske->state.route_mode :
+ SDEI_EVENT_REGISTER_RM_ANY;
+ } else {
+ ret = kske ? kske->state.route_affinity : 0;
+ }
+
+ break;
+ default:
+ ret = SDEI_INVALID_PARAMETERS;
+ }
+
+unlock:
+ spin_unlock(&ksdei->lock);
+out:
+ return ret;
+}
+
int kvm_sdei_hypercall(struct kvm_vcpu *vcpu)
{
u32 func = smccc_get_function(vcpu);
@@ -446,6 +520,8 @@ int kvm_sdei_hypercall(struct kvm_vcpu *vcpu)
ret = kvm_sdei_hypercall_status(vcpu);
break;
case SDEI_1_0_FN_SDEI_EVENT_GET_INFO:
+ ret = kvm_sdei_hypercall_info(vcpu);
+ break;
case SDEI_1_0_FN_SDEI_EVENT_ROUTING_SET:
case SDEI_1_0_FN_SDEI_PE_MASK:
case SDEI_1_0_FN_SDEI_PE_UNMASK:
--
2.23.0

2021-08-15 00:17:58

by Gavin Shan

Subject: [PATCH v4 11/21] KVM: arm64: Support SDEI_PE_{MASK, UNMASK} hypercall

This supports SDEI_PE_{MASK, UNMASK} hypercall. They are used by
the guest to stop or resume delivery of SDEI events to a specific
vCPU.

Signed-off-by: Gavin Shan <[email protected]>
---
arch/arm64/kvm/sdei.c | 35 +++++++++++++++++++++++++++++++++++
1 file changed, 35 insertions(+)

diff --git a/arch/arm64/kvm/sdei.c b/arch/arm64/kvm/sdei.c
index 458695c2394f..3fb33258b494 100644
--- a/arch/arm64/kvm/sdei.c
+++ b/arch/arm64/kvm/sdei.c
@@ -551,6 +551,37 @@ static unsigned long kvm_sdei_hypercall_route(struct kvm_vcpu *vcpu)
return ret;
}

+static unsigned long kvm_sdei_hypercall_mask(struct kvm_vcpu *vcpu,
+ bool mask)
+{
+ struct kvm *kvm = vcpu->kvm;
+ struct kvm_sdei_kvm *ksdei = kvm->arch.sdei;
+ struct kvm_sdei_vcpu *vsdei = vcpu->arch.sdei;
+ unsigned long ret = SDEI_SUCCESS;
+
+ /* Sanity check */
+ if (!(ksdei && vsdei)) {
+ ret = SDEI_NOT_SUPPORTED;
+ goto out;
+ }
+
+ spin_lock(&vsdei->lock);
+
+ /* Check the state */
+ if (mask == vsdei->state.masked) {
+ ret = SDEI_DENIED;
+ goto unlock;
+ }
+
+ /* Update the state */
+ vsdei->state.masked = mask ? 1 : 0;
+
+unlock:
+ spin_unlock(&vsdei->lock);
+out:
+ return ret;
+}
+
int kvm_sdei_hypercall(struct kvm_vcpu *vcpu)
{
u32 func = smccc_get_function(vcpu);
@@ -588,7 +619,11 @@ int kvm_sdei_hypercall(struct kvm_vcpu *vcpu)
ret = kvm_sdei_hypercall_route(vcpu);
break;
case SDEI_1_0_FN_SDEI_PE_MASK:
+ ret = kvm_sdei_hypercall_mask(vcpu, true);
+ break;
case SDEI_1_0_FN_SDEI_PE_UNMASK:
+ ret = kvm_sdei_hypercall_mask(vcpu, false);
+ break;
case SDEI_1_0_FN_SDEI_INTERRUPT_BIND:
case SDEI_1_0_FN_SDEI_INTERRUPT_RELEASE:
case SDEI_1_0_FN_SDEI_PRIVATE_RESET:
--
2.23.0

2021-08-15 00:18:06

by Gavin Shan

Subject: [PATCH v4 13/21] KVM: arm64: Implement SDEI event delivery

This implements kvm_sdei_deliver() to support SDEI event delivery.
The function is called when the request (KVM_REQ_SDEI) is raised.
The following rules are applied according to the SDEI specification:

* x0 - x17 are saved. All of them are cleared except the following
registers:
x0: number of the SDEI event to be delivered
x1: parameter associated with the SDEI event
x2: PC of the interrupted context
x3: PState of the interrupted context

* PC is set to the handler of the SDEI event, which was provided
during its registration. PState is modified accordingly.

* SDEI events with critical priority can preempt those with normal
priority.
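The PState transformation described above can be captured in one pure
function. The PSR_* values below mirror the arm64 uapi ptrace.h
definitions; the function name is illustrative:

```c
#include <assert.h>
#include <stdint.h>

/* PSTATE bits, mirroring arch/arm64/include/uapi/asm/ptrace.h */
#define PSR_MODE_EL1h  0x00000005UL
#define PSR_MODE_MASK  0x0000000fUL
#define PSR_MODE32_BIT 0x00000010UL
#define PSR_F_BIT      0x00000040UL
#define PSR_I_BIT      0x00000080UL
#define PSR_A_BIT      0x00000100UL
#define PSR_D_BIT      0x00000200UL

/*
 * Derive the PState used to enter the SDEI handler from the
 * interrupted context: mask DAIF, force AArch64 EL1h mode.
 */
uint64_t sdei_entry_pstate(uint64_t interrupted)
{
	uint64_t pstate = interrupted;

	pstate |= (PSR_D_BIT | PSR_A_BIT | PSR_I_BIT | PSR_F_BIT);
	pstate &= ~PSR_MODE_MASK;
	pstate |= PSR_MODE_EL1h;
	pstate &= ~PSR_MODE32_BIT;

	return pstate;
}
```

Whatever mode the guest was interrupted in, the handler entry state has
all DAIF exceptions masked and runs in AArch64 EL1h.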

Signed-off-by: Gavin Shan <[email protected]>
---
arch/arm64/include/asm/kvm_host.h | 1 +
arch/arm64/include/asm/kvm_sdei.h | 1 +
arch/arm64/kvm/arm.c | 3 ++
arch/arm64/kvm/sdei.c | 84 +++++++++++++++++++++++++++++++
4 files changed, 89 insertions(+)

diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index aedf901e1ec7..46f363aa6524 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -47,6 +47,7 @@
#define KVM_REQ_RECORD_STEAL KVM_ARCH_REQ(3)
#define KVM_REQ_RELOAD_GICv4 KVM_ARCH_REQ(4)
#define KVM_REQ_RELOAD_PMU KVM_ARCH_REQ(5)
+#define KVM_REQ_SDEI KVM_ARCH_REQ(6)

#define KVM_DIRTY_LOG_MANUAL_CAPS (KVM_DIRTY_LOG_MANUAL_PROTECT_ENABLE | \
KVM_DIRTY_LOG_INITIALLY_SET)
diff --git a/arch/arm64/include/asm/kvm_sdei.h b/arch/arm64/include/asm/kvm_sdei.h
index b0abc13a0256..7f5f5ad689e6 100644
--- a/arch/arm64/include/asm/kvm_sdei.h
+++ b/arch/arm64/include/asm/kvm_sdei.h
@@ -112,6 +112,7 @@ KVM_SDEI_FLAG_FUNC(enabled)
void kvm_sdei_init_vm(struct kvm *kvm);
void kvm_sdei_create_vcpu(struct kvm_vcpu *vcpu);
int kvm_sdei_hypercall(struct kvm_vcpu *vcpu);
+void kvm_sdei_deliver(struct kvm_vcpu *vcpu);
void kvm_sdei_destroy_vcpu(struct kvm_vcpu *vcpu);
void kvm_sdei_destroy_vm(struct kvm *kvm);

diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
index 2f021aa41632..0c3db1ef1ba9 100644
--- a/arch/arm64/kvm/arm.c
+++ b/arch/arm64/kvm/arm.c
@@ -689,6 +689,9 @@ static void check_vcpu_requests(struct kvm_vcpu *vcpu)
if (kvm_check_request(KVM_REQ_VCPU_RESET, vcpu))
kvm_reset_vcpu(vcpu);

+ if (kvm_check_request(KVM_REQ_SDEI, vcpu))
+ kvm_sdei_deliver(vcpu);
+
/*
* Clear IRQ_PENDING requests that were made to guarantee
* that a VCPU sees new virtual interrupts.
diff --git a/arch/arm64/kvm/sdei.c b/arch/arm64/kvm/sdei.c
index 62efee2b67b8..b5d6d1ed3858 100644
--- a/arch/arm64/kvm/sdei.c
+++ b/arch/arm64/kvm/sdei.c
@@ -671,6 +671,90 @@ int kvm_sdei_hypercall(struct kvm_vcpu *vcpu)
return 1;
}

+void kvm_sdei_deliver(struct kvm_vcpu *vcpu)
+{
+ struct kvm *kvm = vcpu->kvm;
+ struct kvm_sdei_kvm *ksdei = kvm->arch.sdei;
+ struct kvm_sdei_vcpu *vsdei = vcpu->arch.sdei;
+ struct kvm_sdei_event *kse = NULL;
+ struct kvm_sdei_kvm_event *kske = NULL;
+ struct kvm_sdei_vcpu_event *ksve = NULL;
+ struct kvm_sdei_vcpu_regs *regs = NULL;
+ unsigned long pstate;
+ int index = 0;
+
+ /* Sanity check */
+ if (!(ksdei && vsdei))
+ return;
+
+ /* The critical event can't be preempted */
+ spin_lock(&vsdei->lock);
+ if (vsdei->critical_event)
+ goto unlock;
+
+ /*
+ * The normal event can be preempted by the critical event.
+ * However, the normal event can't be preempted by another
+ * normal event.
+ */
+ ksve = list_first_entry_or_null(&vsdei->critical_events,
+ struct kvm_sdei_vcpu_event, link);
+ if (!ksve && !vsdei->normal_event) {
+ ksve = list_first_entry_or_null(&vsdei->normal_events,
+ struct kvm_sdei_vcpu_event, link);
+ }
+
+ if (!ksve)
+ goto unlock;
+
+ kske = ksve->kske;
+ kse = kske->kse;
+ if (kse->state.priority == SDEI_EVENT_PRIORITY_CRITICAL) {
+ vsdei->critical_event = ksve;
+ vsdei->state.critical_num = ksve->state.num;
+ regs = &vsdei->state.critical_regs;
+ } else {
+ vsdei->normal_event = ksve;
+ vsdei->state.normal_num = ksve->state.num;
+ regs = &vsdei->state.normal_regs;
+ }
+
+ /* Save registers: x0 -> x17, PC, PState */
+ for (index = 0; index < ARRAY_SIZE(regs->regs); index++)
+ regs->regs[index] = vcpu_get_reg(vcpu, index);
+
+ regs->pc = *vcpu_pc(vcpu);
+ regs->pstate = *vcpu_cpsr(vcpu);
+
+ /*
+ * Inject SDEI event: x0 -> x3, PC, PState. We needn't take lock
+ * for the KVM event as it can't be destroyed because of its
+ * reference count.
+ */
+ for (index = 0; index < ARRAY_SIZE(regs->regs); index++)
+ vcpu_set_reg(vcpu, index, 0);
+
+ index = (kse->state.type == SDEI_EVENT_TYPE_PRIVATE) ?
+ vcpu->vcpu_idx : 0;
+ vcpu_set_reg(vcpu, 0, kske->state.num);
+ vcpu_set_reg(vcpu, 1, kske->state.params[index]);
+ vcpu_set_reg(vcpu, 2, regs->pc);
+ vcpu_set_reg(vcpu, 3, regs->pstate);
+
+ pstate = regs->pstate;
+ pstate |= (PSR_D_BIT | PSR_A_BIT | PSR_I_BIT | PSR_F_BIT);
+ pstate &= ~PSR_MODE_MASK;
+ pstate |= PSR_MODE_EL1h;
+ pstate &= ~PSR_MODE32_BIT;
+
+ vcpu_write_sys_reg(vcpu, regs->pstate, SPSR_EL1);
+ *vcpu_cpsr(vcpu) = pstate;
+ *vcpu_pc(vcpu) = kske->state.entries[index];
+
+unlock:
+ spin_unlock(&vsdei->lock);
+}
+
void kvm_sdei_init_vm(struct kvm *kvm)
{
struct kvm_sdei_kvm *ksdei;
--
2.23.0

2021-08-15 00:18:14

by Gavin Shan

Subject: [PATCH v4 05/21] KVM: arm64: Support SDEI_EVENT_{ENABLE, DISABLE} hypercall

This supports SDEI_EVENT_{ENABLE, DISABLE} hypercall. After an SDEI
event is registered by the guest, it won't be delivered to the guest
until it's enabled. Conversely, the SDEI event won't be raised to
the guest or the specific vCPU once it has been disabled on the
guest or that vCPU.
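The enable/disable rules amount to: refuse unless the event is
registered and the request actually changes state. A hedged user-space
model with illustrative names; the error values mirror the SDEI
specification:

```c
#include <assert.h>

/* Illustrative error codes matching the SDEI specification values */
#define SDEI_SUCCESS  0
#define SDEI_DENIED  (-3)

struct sdei_ev {
	int registered;
	int enabled;
};

/* Enable/disable is refused unless registered and actually changing state */
int sdei_set_enabled(struct sdei_ev *ev, int enable)
{
	if (!ev->registered || ev->enabled == enable)
		return SDEI_DENIED;
	ev->enabled = enable;
	return SDEI_SUCCESS;
}
```

Enabling an already-enabled event (or an unregistered one) returns
SDEI_DENIED, matching the two checks in the hypercall above.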

Signed-off-by: Gavin Shan <[email protected]>
---
arch/arm64/kvm/sdei.c | 68 +++++++++++++++++++++++++++++++++++++++++++
1 file changed, 68 insertions(+)

diff --git a/arch/arm64/kvm/sdei.c b/arch/arm64/kvm/sdei.c
index d3ea3eee154b..b022ce0a202b 100644
--- a/arch/arm64/kvm/sdei.c
+++ b/arch/arm64/kvm/sdei.c
@@ -206,6 +206,70 @@ static unsigned long kvm_sdei_hypercall_register(struct kvm_vcpu *vcpu)
return ret;
}

+static unsigned long kvm_sdei_hypercall_enable(struct kvm_vcpu *vcpu,
+ bool enable)
+{
+ struct kvm *kvm = vcpu->kvm;
+ struct kvm_sdei_kvm *ksdei = kvm->arch.sdei;
+ struct kvm_sdei_vcpu *vsdei = vcpu->arch.sdei;
+ struct kvm_sdei_event *kse = NULL;
+ struct kvm_sdei_kvm_event *kske = NULL;
+ unsigned long event_num = smccc_get_arg1(vcpu);
+ int index = 0;
+ unsigned long ret = SDEI_SUCCESS;
+
+ /* Sanity check */
+ if (!(ksdei && vsdei)) {
+ ret = SDEI_NOT_SUPPORTED;
+ goto out;
+ }
+
+ if (!kvm_sdei_is_valid_event_num(event_num)) {
+ ret = SDEI_INVALID_PARAMETERS;
+ goto out;
+ }
+
+ /* Check if the KVM event exists */
+ spin_lock(&ksdei->lock);
+ kske = kvm_sdei_find_kvm_event(kvm, event_num);
+ if (!kske) {
+ ret = SDEI_INVALID_PARAMETERS;
+ goto unlock;
+ }
+
+ /* Check if there are pending events */
+ if (kske->state.refcount) {
+ ret = SDEI_PENDING;
+ goto unlock;
+ }
+
+ /* Check if it has been registered */
+ kse = kske->kse;
+ index = (kse->state.type == SDEI_EVENT_TYPE_PRIVATE) ?
+ vcpu->vcpu_idx : 0;
+ if (!kvm_sdei_is_registered(kske, index)) {
+ ret = SDEI_DENIED;
+ goto unlock;
+ }
+
+ /* Verify its enablement state */
+ if (enable == kvm_sdei_is_enabled(kske, index)) {
+ ret = SDEI_DENIED;
+ goto unlock;
+ }
+
+ /* Update enablement state */
+ if (enable)
+ kvm_sdei_set_enabled(kske, index);
+ else
+ kvm_sdei_clear_enabled(kske, index);
+
+unlock:
+ spin_unlock(&ksdei->lock);
+out:
+ return ret;
+}
+
int kvm_sdei_hypercall(struct kvm_vcpu *vcpu)
{
u32 func = smccc_get_function(vcpu);
@@ -220,7 +284,11 @@ int kvm_sdei_hypercall(struct kvm_vcpu *vcpu)
ret = kvm_sdei_hypercall_register(vcpu);
break;
case SDEI_1_0_FN_SDEI_EVENT_ENABLE:
+ ret = kvm_sdei_hypercall_enable(vcpu, true);
+ break;
case SDEI_1_0_FN_SDEI_EVENT_DISABLE:
+ ret = kvm_sdei_hypercall_enable(vcpu, false);
+ break;
case SDEI_1_0_FN_SDEI_EVENT_CONTEXT:
case SDEI_1_0_FN_SDEI_EVENT_COMPLETE:
case SDEI_1_0_FN_SDEI_EVENT_COMPLETE_AND_RESUME:
--
2.23.0

2021-08-15 00:18:19

by Gavin Shan

Subject: [PATCH v4 12/21] KVM: arm64: Support SDEI_{PRIVATE, SHARED}_RESET hypercall

This supports SDEI_{PRIVATE, SHARED}_RESET hypercall. They are used
by the guest to purge the private or shared SDEI events that were
previously registered.

Signed-off-by: Gavin Shan <[email protected]>
---
arch/arm64/kvm/sdei.c | 29 +++++++++++++++++++++++++++++
1 file changed, 29 insertions(+)

diff --git a/arch/arm64/kvm/sdei.c b/arch/arm64/kvm/sdei.c
index 3fb33258b494..62efee2b67b8 100644
--- a/arch/arm64/kvm/sdei.c
+++ b/arch/arm64/kvm/sdei.c
@@ -582,6 +582,29 @@ static unsigned long kvm_sdei_hypercall_mask(struct kvm_vcpu *vcpu,
return ret;
}

+static unsigned long kvm_sdei_hypercall_reset(struct kvm_vcpu *vcpu,
+ bool private)
+{
+ struct kvm *kvm = vcpu->kvm;
+ struct kvm_sdei_kvm *ksdei = kvm->arch.sdei;
+ struct kvm_sdei_vcpu *vsdei = vcpu->arch.sdei;
+ unsigned int mask = private ? (1 << SDEI_EVENT_TYPE_PRIVATE) :
+ (1 << SDEI_EVENT_TYPE_SHARED);
+ unsigned long ret = SDEI_SUCCESS;
+
+ /* Sanity check */
+ if (!(ksdei && vsdei)) {
+ ret = SDEI_NOT_SUPPORTED;
+ goto out;
+ }
+
+ spin_lock(&ksdei->lock);
+ kvm_sdei_remove_kvm_events(kvm, mask, false);
+ spin_unlock(&ksdei->lock);
+out:
+ return ret;
+}
+
int kvm_sdei_hypercall(struct kvm_vcpu *vcpu)
{
u32 func = smccc_get_function(vcpu);
@@ -626,8 +649,14 @@ int kvm_sdei_hypercall(struct kvm_vcpu *vcpu)
break;
case SDEI_1_0_FN_SDEI_INTERRUPT_BIND:
case SDEI_1_0_FN_SDEI_INTERRUPT_RELEASE:
+ ret = SDEI_NOT_SUPPORTED;
+ break;
case SDEI_1_0_FN_SDEI_PRIVATE_RESET:
+ ret = kvm_sdei_hypercall_reset(vcpu, true);
+ break;
case SDEI_1_0_FN_SDEI_SHARED_RESET:
+ ret = kvm_sdei_hypercall_reset(vcpu, false);
+ break;
default:
ret = SDEI_NOT_SUPPORTED;
}
--
2.23.0

2021-08-15 00:18:25

by Gavin Shan

Subject: [PATCH v4 10/21] KVM: arm64: Support SDEI_EVENT_ROUTING_SET hypercall

This supports SDEI_EVENT_ROUTING_SET hypercall. It's used by the
guest to set the routing mode and affinity for a registered KVM
event. It's only valid for shared events, and it's not allowed
while the corresponding event is being delivered to the guest.
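The validity rules can be summarized as a single predicate: the event
must be shared, registered, disabled, and have no pending deliveries.
A sketch with illustrative names and SDEI-spec error values:

```c
#include <assert.h>

/* Illustrative error codes matching the SDEI specification values */
#define SDEI_SUCCESS             0
#define SDEI_INVALID_PARAMETERS (-2)
#define SDEI_DENIED             (-3)

enum { SDEI_EV_PRIVATE, SDEI_EV_SHARED };

struct sdei_route_ev {
	int type;
	int registered;
	int enabled;
	int refcount;   /* pending deliveries */
};

/* ROUTING_SET is valid only for shared, registered, disabled, idle events */
int sdei_routing_set_ok(const struct sdei_route_ev *ev)
{
	if (ev->type != SDEI_EV_SHARED)
		return SDEI_INVALID_PARAMETERS;
	if (!ev->registered || ev->enabled || ev->refcount)
		return SDEI_DENIED;
	return SDEI_SUCCESS;
}
```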

Signed-off-by: Gavin Shan <[email protected]>
---
arch/arm64/kvm/sdei.c | 64 +++++++++++++++++++++++++++++++++++++++++++
1 file changed, 64 insertions(+)

diff --git a/arch/arm64/kvm/sdei.c b/arch/arm64/kvm/sdei.c
index 5dfa74b093f1..458695c2394f 100644
--- a/arch/arm64/kvm/sdei.c
+++ b/arch/arm64/kvm/sdei.c
@@ -489,6 +489,68 @@ static unsigned long kvm_sdei_hypercall_info(struct kvm_vcpu *vcpu)
return ret;
}

+static unsigned long kvm_sdei_hypercall_route(struct kvm_vcpu *vcpu)
+{
+ struct kvm *kvm = vcpu->kvm;
+ struct kvm_sdei_kvm *ksdei = kvm->arch.sdei;
+ struct kvm_sdei_vcpu *vsdei = vcpu->arch.sdei;
+ struct kvm_sdei_event *kse = NULL;
+ struct kvm_sdei_kvm_event *kske = NULL;
+ unsigned long event_num = smccc_get_arg1(vcpu);
+ unsigned long route_mode = smccc_get_arg2(vcpu);
+ unsigned long route_affinity = smccc_get_arg3(vcpu);
+ int index = 0;
+ unsigned long ret = SDEI_SUCCESS;
+
+ /* Sanity check */
+ if (!(ksdei && vsdei)) {
+ ret = SDEI_NOT_SUPPORTED;
+ goto out;
+ }
+
+ if (!kvm_sdei_is_valid_event_num(event_num)) {
+ ret = SDEI_INVALID_PARAMETERS;
+ goto out;
+ }
+
+ if (!(route_mode == SDEI_EVENT_REGISTER_RM_ANY ||
+ route_mode == SDEI_EVENT_REGISTER_RM_PE)) {
+ ret = SDEI_INVALID_PARAMETERS;
+ goto out;
+ }
+
+ /* Check if the KVM event has been registered */
+ spin_lock(&ksdei->lock);
+ kske = kvm_sdei_find_kvm_event(kvm, event_num);
+ if (!kske) {
+ ret = SDEI_INVALID_PARAMETERS;
+ goto unlock;
+ }
+
+ /* Validate KVM event state */
+ kse = kske->kse;
+ if (kse->state.type != SDEI_EVENT_TYPE_SHARED) {
+ ret = SDEI_INVALID_PARAMETERS;
+ goto unlock;
+ }
+
+ if (!kvm_sdei_is_registered(kske, index) ||
+ kvm_sdei_is_enabled(kske, index) ||
+ kske->state.refcount) {
+ ret = SDEI_DENIED;
+ goto unlock;
+ }
+
+ /* Update state */
+ kske->state.route_mode = route_mode;
+ kske->state.route_affinity = route_affinity;
+
+unlock:
+ spin_unlock(&ksdei->lock);
+out:
+ return ret;
+}
+
int kvm_sdei_hypercall(struct kvm_vcpu *vcpu)
{
u32 func = smccc_get_function(vcpu);
@@ -523,6 +585,8 @@ int kvm_sdei_hypercall(struct kvm_vcpu *vcpu)
ret = kvm_sdei_hypercall_info(vcpu);
break;
case SDEI_1_0_FN_SDEI_EVENT_ROUTING_SET:
+ ret = kvm_sdei_hypercall_route(vcpu);
+ break;
case SDEI_1_0_FN_SDEI_PE_MASK:
case SDEI_1_0_FN_SDEI_PE_UNMASK:
case SDEI_1_0_FN_SDEI_INTERRUPT_BIND:
--
2.23.0

2021-08-15 00:18:44

by Gavin Shan

Subject: [PATCH v4 15/21] KVM: arm64: Support SDEI event notifier

The owner of an SDEI event, like the asynchronous page fault, needs
to know the state of the injected SDEI event. This introduces a
notifier mechanism to support SDEI event state updates. Note that
the notifier (handler) should survive migration.
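The notifier mechanism is a plain callback stored per event and
invoked on delivery and completion. A self-contained sketch; apart
from the KVM_SDEI_NOTIFY_* states taken from the patch, all names are
illustrative:

```c
#include <assert.h>
#include <stddef.h>

/* Notification states, as introduced by this patch */
enum {
	KVM_SDEI_NOTIFY_DELIVERED,
	KVM_SDEI_NOTIFY_COMPLETED,
};

typedef void (*sdei_notifier_fn)(unsigned long num, unsigned int state);

struct sdei_event {
	unsigned long num;
	sdei_notifier_fn notifier;
};

/* Record the last notification, standing in for the event owner */
unsigned long notified_num;
unsigned int notified_state;
int notified_count;

void record_notifier(unsigned long num, unsigned int state)
{
	notified_num = num;
	notified_state = state;
	notified_count++;
}

/* Invoke the owner's notifier, if one was registered, on a state change */
void sdei_notify(struct sdei_event *ev, unsigned int state)
{
	if (ev->notifier)
		ev->notifier(ev->num, state);
}
```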

Signed-off-by: Gavin Shan <[email protected]>
---
arch/arm64/include/asm/kvm_sdei.h | 12 +++++++
arch/arm64/include/uapi/asm/kvm_sdei.h | 1 +
arch/arm64/kvm/sdei.c | 45 +++++++++++++++++++++++++-
3 files changed, 57 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/include/asm/kvm_sdei.h b/arch/arm64/include/asm/kvm_sdei.h
index 7f5f5ad689e6..19f2d9b91f85 100644
--- a/arch/arm64/include/asm/kvm_sdei.h
+++ b/arch/arm64/include/asm/kvm_sdei.h
@@ -16,6 +16,16 @@
#include <linux/list.h>
#include <linux/spinlock.h>

+struct kvm_vcpu;
+
+typedef void (*kvm_sdei_notifier)(struct kvm_vcpu *vcpu,
+ unsigned long num,
+ unsigned int state);
+enum {
+ KVM_SDEI_NOTIFY_DELIVERED,
+ KVM_SDEI_NOTIFY_COMPLETED,
+};
+
struct kvm_sdei_event {
struct kvm_sdei_event_state state;
struct kvm *kvm;
@@ -112,6 +122,8 @@ KVM_SDEI_FLAG_FUNC(enabled)
void kvm_sdei_init_vm(struct kvm *kvm);
void kvm_sdei_create_vcpu(struct kvm_vcpu *vcpu);
int kvm_sdei_hypercall(struct kvm_vcpu *vcpu);
+int kvm_sdei_register_notifier(struct kvm *kvm, unsigned long num,
+ kvm_sdei_notifier notifier);
void kvm_sdei_deliver(struct kvm_vcpu *vcpu);
void kvm_sdei_destroy_vcpu(struct kvm_vcpu *vcpu);
void kvm_sdei_destroy_vm(struct kvm *kvm);
diff --git a/arch/arm64/include/uapi/asm/kvm_sdei.h b/arch/arm64/include/uapi/asm/kvm_sdei.h
index 8928027023f6..4ef661d106fe 100644
--- a/arch/arm64/include/uapi/asm/kvm_sdei.h
+++ b/arch/arm64/include/uapi/asm/kvm_sdei.h
@@ -23,6 +23,7 @@ struct kvm_sdei_event_state {
__u8 type;
__u8 signaled;
__u8 priority;
+ __u64 notifier;
};

struct kvm_sdei_kvm_event_state {
diff --git a/arch/arm64/kvm/sdei.c b/arch/arm64/kvm/sdei.c
index 1e8e213c9d70..5f7a37dcaa77 100644
--- a/arch/arm64/kvm/sdei.c
+++ b/arch/arm64/kvm/sdei.c
@@ -314,9 +314,11 @@ static unsigned long kvm_sdei_hypercall_complete(struct kvm_vcpu *vcpu,
struct kvm *kvm = vcpu->kvm;
struct kvm_sdei_kvm *ksdei = kvm->arch.sdei;
struct kvm_sdei_vcpu *vsdei = vcpu->arch.sdei;
+ struct kvm_sdei_event *kse = NULL;
struct kvm_sdei_kvm_event *kske = NULL;
struct kvm_sdei_vcpu_event *ksve = NULL;
struct kvm_sdei_vcpu_regs *regs;
+ kvm_sdei_notifier notifier;
unsigned long ret = SDEI_SUCCESS;
int index;

@@ -349,6 +351,13 @@ static unsigned long kvm_sdei_hypercall_complete(struct kvm_vcpu *vcpu,
*vcpu_cpsr(vcpu) = regs->pstate;
*vcpu_pc(vcpu) = regs->pc;

+ /* Notifier */
+ kske = ksve->kske;
+ kse = kske->kse;
+ notifier = (kvm_sdei_notifier)(kse->state.notifier);
+ if (notifier)
+ notifier(vcpu, kse->state.num, KVM_SDEI_NOTIFY_COMPLETED);
+
/* Inject interrupt if needed */
if (resume)
kvm_inject_irq(vcpu);
@@ -358,7 +367,6 @@ static unsigned long kvm_sdei_hypercall_complete(struct kvm_vcpu *vcpu,
* event state as it's not destroyed because of the reference
* count.
*/
- kske = ksve->kske;
ksve->state.refcount--;
kske->state.refcount--;
if (!ksve->state.refcount) {
@@ -746,6 +754,35 @@ int kvm_sdei_hypercall(struct kvm_vcpu *vcpu)
return 1;
}

+int kvm_sdei_register_notifier(struct kvm *kvm,
+ unsigned long num,
+ kvm_sdei_notifier notifier)
+{
+ struct kvm_sdei_kvm *ksdei = kvm->arch.sdei;
+ struct kvm_sdei_event *kse = NULL;
+ int ret = 0;
+
+ if (!ksdei) {
+ ret = -EPERM;
+ goto out;
+ }
+
+ spin_lock(&ksdei->lock);
+
+ kse = kvm_sdei_find_event(kvm, num);
+ if (!kse) {
+ ret = -EINVAL;
+ goto unlock;
+ }
+
+ kse->state.notifier = (unsigned long)notifier;
+
+unlock:
+ spin_unlock(&ksdei->lock);
+out:
+ return ret;
+}
+
void kvm_sdei_deliver(struct kvm_vcpu *vcpu)
{
struct kvm *kvm = vcpu->kvm;
@@ -755,6 +792,7 @@ void kvm_sdei_deliver(struct kvm_vcpu *vcpu)
struct kvm_sdei_kvm_event *kske = NULL;
struct kvm_sdei_vcpu_event *ksve = NULL;
struct kvm_sdei_vcpu_regs *regs = NULL;
+ kvm_sdei_notifier notifier;
unsigned long pstate;
int index = 0;

@@ -826,6 +864,11 @@ void kvm_sdei_deliver(struct kvm_vcpu *vcpu)
*vcpu_cpsr(vcpu) = pstate;
*vcpu_pc(vcpu) = kske->state.entries[index];

+ /* Notifier */
+ notifier = (kvm_sdei_notifier)(kse->state.notifier);
+ if (notifier)
+ notifier(vcpu, kse->state.num, KVM_SDEI_NOTIFY_DELIVERED);
+
unlock:
spin_unlock(&vsdei->lock);
}
--
2.23.0

2021-08-15 00:18:50

by Gavin Shan

Subject: [PATCH v4 17/21] KVM: arm64: Support SDEI ioctl commands on vCPU

This supports ioctl commands on vCPU to manage the various objects.
It's primarily used by the VMM to accomplish live migration. The
ioctl commands introduced by this patch are highlighted below:

* KVM_SDEI_CMD_GET_VEVENT_COUNT
Retrieve the number of SDEI events pending for handling on the
vCPU
* KVM_SDEI_CMD_GET_VEVENT
Retrieve the state of SDEI event, which has been delivered to
the vCPU for handling
* KVM_SDEI_CMD_SET_VEVENT
Populate the SDEI event, which has been delivered to the vCPU
for handling
* KVM_SDEI_CMD_GET_VCPU_STATE
Retrieve vCPU state related to SDEI handling
* KVM_SDEI_CMD_SET_VCPU_STATE
Populate vCPU state related to SDEI handling

Signed-off-by: Gavin Shan <[email protected]>
---
arch/arm64/include/asm/kvm_sdei.h | 1 +
arch/arm64/include/uapi/asm/kvm_sdei.h | 7 +
arch/arm64/kvm/arm.c | 3 +
arch/arm64/kvm/sdei.c | 228 +++++++++++++++++++++++++
4 files changed, 239 insertions(+)

diff --git a/arch/arm64/include/asm/kvm_sdei.h b/arch/arm64/include/asm/kvm_sdei.h
index 8f5ea947ed0e..a997989bab77 100644
--- a/arch/arm64/include/asm/kvm_sdei.h
+++ b/arch/arm64/include/asm/kvm_sdei.h
@@ -126,6 +126,7 @@ int kvm_sdei_register_notifier(struct kvm *kvm, unsigned long num,
kvm_sdei_notifier notifier);
void kvm_sdei_deliver(struct kvm_vcpu *vcpu);
long kvm_sdei_vm_ioctl(struct kvm *kvm, unsigned long arg);
+long kvm_sdei_vcpu_ioctl(struct kvm_vcpu *vcpu, unsigned long arg);
void kvm_sdei_destroy_vcpu(struct kvm_vcpu *vcpu);
void kvm_sdei_destroy_vm(struct kvm *kvm);

diff --git a/arch/arm64/include/uapi/asm/kvm_sdei.h b/arch/arm64/include/uapi/asm/kvm_sdei.h
index 35ff05be3c28..b916c3435646 100644
--- a/arch/arm64/include/uapi/asm/kvm_sdei.h
+++ b/arch/arm64/include/uapi/asm/kvm_sdei.h
@@ -62,6 +62,11 @@ struct kvm_sdei_vcpu_state {
#define KVM_SDEI_CMD_GET_KEVENT_COUNT 2
#define KVM_SDEI_CMD_GET_KEVENT 3
#define KVM_SDEI_CMD_SET_KEVENT 4
+#define KVM_SDEI_CMD_GET_VEVENT_COUNT 5
+#define KVM_SDEI_CMD_GET_VEVENT 6
+#define KVM_SDEI_CMD_SET_VEVENT 7
+#define KVM_SDEI_CMD_GET_VCPU_STATE 8
+#define KVM_SDEI_CMD_SET_VCPU_STATE 9

struct kvm_sdei_cmd {
__u32 cmd;
@@ -71,6 +76,8 @@ struct kvm_sdei_cmd {
__u64 num;
struct kvm_sdei_event_state kse_state;
struct kvm_sdei_kvm_event_state kske_state;
+ struct kvm_sdei_vcpu_event_state ksve_state;
+ struct kvm_sdei_vcpu_state ksv_state;
};
};

diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
index 8d61585124b2..215cdbeb272a 100644
--- a/arch/arm64/kvm/arm.c
+++ b/arch/arm64/kvm/arm.c
@@ -1308,6 +1308,9 @@ long kvm_arch_vcpu_ioctl(struct file *filp,

return kvm_arm_vcpu_finalize(vcpu, what);
}
+ case KVM_ARM_SDEI_COMMAND: {
+ return kvm_sdei_vcpu_ioctl(vcpu, arg);
+ }
default:
r = -EINVAL;
}
diff --git a/arch/arm64/kvm/sdei.c b/arch/arm64/kvm/sdei.c
index bdd76c3e5153..79315b77f24b 100644
--- a/arch/arm64/kvm/sdei.c
+++ b/arch/arm64/kvm/sdei.c
@@ -35,6 +35,25 @@ static struct kvm_sdei_event *kvm_sdei_find_event(struct kvm *kvm,
return NULL;
}

+static struct kvm_sdei_vcpu_event *kvm_sdei_find_vcpu_event(struct kvm_vcpu *vcpu,
+ unsigned long num)
+{
+ struct kvm_sdei_vcpu *vsdei = vcpu->arch.sdei;
+ struct kvm_sdei_vcpu_event *ksve;
+
+ list_for_each_entry(ksve, &vsdei->critical_events, link) {
+ if (ksve->state.num == num)
+ return ksve;
+ }
+
+ list_for_each_entry(ksve, &vsdei->normal_events, link) {
+ if (ksve->state.num == num)
+ return ksve;
+ }
+
+ return NULL;
+}
+
static void kvm_sdei_remove_events(struct kvm *kvm)
{
struct kvm_sdei_kvm *ksdei = kvm->arch.sdei;
@@ -1102,6 +1121,215 @@ long kvm_sdei_vm_ioctl(struct kvm *kvm, unsigned long arg)
return ret;
}

+static long kvm_sdei_get_vevent_count(struct kvm_vcpu *vcpu, int *count)
+{
+ struct kvm_sdei_vcpu *vsdei = vcpu->arch.sdei;
+ struct kvm_sdei_vcpu_event *ksve = NULL;
+ int total = 0;
+
+ list_for_each_entry(ksve, &vsdei->critical_events, link) {
+ total++;
+ }
+
+ list_for_each_entry(ksve, &vsdei->normal_events, link) {
+ total++;
+ }
+
+ *count = total;
+ return 0;
+}
+
+static struct kvm_sdei_vcpu_event *next_vcpu_event(struct kvm_vcpu *vcpu,
+ unsigned long num)
+{
+ struct kvm_sdei_vcpu *vsdei = vcpu->arch.sdei;
+ struct kvm_sdei_event *kse = NULL;
+ struct kvm_sdei_kvm_event *kske = NULL;
+ struct kvm_sdei_vcpu_event *ksve = NULL;
+
+ ksve = kvm_sdei_find_vcpu_event(vcpu, num);
+ if (!ksve)
+ return NULL;
+
+ kske = ksve->kske;
+ kse = kske->kse;
+ if (kse->state.priority == SDEI_EVENT_PRIORITY_CRITICAL) {
+ if (!list_is_last(&ksve->link, &vsdei->critical_events)) {
+ ksve = list_next_entry(ksve, link);
+ return ksve;
+ }
+
+ ksve = list_first_entry_or_null(&vsdei->normal_events,
+ struct kvm_sdei_vcpu_event, link);
+ return ksve;
+ }
+
+ if (!list_is_last(&ksve->link, &vsdei->normal_events)) {
+ ksve = list_next_entry(ksve, link);
+ return ksve;
+ }
+
+ return NULL;
+}
+
+static long kvm_sdei_get_vevent(struct kvm_vcpu *vcpu,
+ struct kvm_sdei_vcpu_event_state *ksve_state)
+{
+ struct kvm_sdei_vcpu *vsdei = vcpu->arch.sdei;
+ struct kvm_sdei_vcpu_event *ksve = NULL;
+
+ /*
+ * If the event number is invalid, the first critical or
+ * normal event is fetched. Otherwise, the next valid event
+ * is returned.
+ */
+ if (!kvm_sdei_is_valid_event_num(ksve_state->num)) {
+ ksve = list_first_entry_or_null(&vsdei->critical_events,
+ struct kvm_sdei_vcpu_event, link);
+ if (!ksve) {
+ ksve = list_first_entry_or_null(&vsdei->normal_events,
+ struct kvm_sdei_vcpu_event, link);
+ }
+ } else {
+ ksve = next_vcpu_event(vcpu, ksve_state->num);
+ }
+
+ if (!ksve)
+ return -ENOENT;
+
+ *ksve_state = ksve->state;
+
+ return 0;
+}
+
+static long kvm_sdei_set_vevent(struct kvm_vcpu *vcpu,
+ struct kvm_sdei_vcpu_event_state *ksve_state)
+{
+ struct kvm *kvm = vcpu->kvm;
+ struct kvm_sdei_vcpu *vsdei = vcpu->arch.sdei;
+ struct kvm_sdei_event *kse = NULL;
+ struct kvm_sdei_kvm_event *kske = NULL;
+ struct kvm_sdei_vcpu_event *ksve = NULL;
+
+ if (!kvm_sdei_is_valid_event_num(ksve_state->num))
+ return -EINVAL;
+
+ kske = kvm_sdei_find_kvm_event(kvm, ksve_state->num);
+ if (!kske)
+ return -ENOENT;
+
+ ksve = kvm_sdei_find_vcpu_event(vcpu, ksve_state->num);
+ if (ksve)
+ return -EEXIST;
+
+ ksve = kzalloc(sizeof(*ksve), GFP_KERNEL);
+ if (!ksve)
+ return -ENOMEM;
+
+ kse = kske->kse;
+ ksve->state = *ksve_state;
+ ksve->kske = kske;
+ ksve->vcpu = vcpu;
+
+ if (kse->state.priority == SDEI_EVENT_PRIORITY_CRITICAL)
+ list_add_tail(&ksve->link, &vsdei->critical_events);
+ else
+ list_add_tail(&ksve->link, &vsdei->normal_events);
+
+ kvm_make_request(KVM_REQ_SDEI, vcpu);
+
+ return 0;
+}
+
+static long kvm_sdei_set_vcpu_state(struct kvm_vcpu *vcpu,
+ struct kvm_sdei_vcpu_state *ksv_state)
+{
+ struct kvm_sdei_vcpu *vsdei = vcpu->arch.sdei;
+ struct kvm_sdei_vcpu_event *critical_ksve = NULL;
+ struct kvm_sdei_vcpu_event *normal_ksve = NULL;
+
+ if (kvm_sdei_is_valid_event_num(ksv_state->critical_num)) {
+ critical_ksve = kvm_sdei_find_vcpu_event(vcpu,
+ ksv_state->critical_num);
+ if (!critical_ksve)
+ return -EINVAL;
+ }
+
+ if (kvm_sdei_is_valid_event_num(ksv_state->normal_num)) {
+ normal_ksve = kvm_sdei_find_vcpu_event(vcpu,
+ ksv_state->normal_num);
+ if (!normal_ksve)
+ return -EINVAL;
+ }
+
+ vsdei->state = *ksv_state;
+ vsdei->critical_event = critical_ksve;
+ vsdei->normal_event = normal_ksve;
+
+ return 0;
+}
+
+long kvm_sdei_vcpu_ioctl(struct kvm_vcpu *vcpu, unsigned long arg)
+{
+ struct kvm *kvm = vcpu->kvm;
+ struct kvm_sdei_kvm *ksdei = kvm->arch.sdei;
+ struct kvm_sdei_vcpu *vsdei = vcpu->arch.sdei;
+ struct kvm_sdei_cmd *cmd = NULL;
+ void __user *argp = (void __user *)arg;
+ bool copy = false;
+ long ret = 0;
+
+ /* Sanity check */
+ if (!(ksdei && vsdei)) {
+ ret = -EPERM;
+ goto out;
+ }
+
+ cmd = kzalloc(sizeof(*cmd), GFP_KERNEL);
+ if (!cmd) {
+ ret = -ENOMEM;
+ goto out;
+ }
+
+ if (copy_from_user(cmd, argp, sizeof(*cmd))) {
+ ret = -EFAULT;
+ goto out;
+ }
+
+ spin_lock(&vsdei->lock);
+
+ switch (cmd->cmd) {
+ case KVM_SDEI_CMD_GET_VEVENT_COUNT:
+ copy = true;
+ ret = kvm_sdei_get_vevent_count(vcpu, &cmd->count);
+ break;
+ case KVM_SDEI_CMD_GET_VEVENT:
+ copy = true;
+ ret = kvm_sdei_get_vevent(vcpu, &cmd->ksve_state);
+ break;
+ case KVM_SDEI_CMD_SET_VEVENT:
+ ret = kvm_sdei_set_vevent(vcpu, &cmd->ksve_state);
+ break;
+ case KVM_SDEI_CMD_GET_VCPU_STATE:
+ copy = true;
+ cmd->ksv_state = vsdei->state;
+ break;
+ case KVM_SDEI_CMD_SET_VCPU_STATE:
+ ret = kvm_sdei_set_vcpu_state(vcpu, &cmd->ksv_state);
+ break;
+ default:
+ ret = -EINVAL;
+ }
+
+ spin_unlock(&vsdei->lock);
+out:
+ if (!ret && copy && copy_to_user(argp, cmd, sizeof(*cmd)))
+ ret = -EFAULT;
+
+ kfree(cmd);
+ return ret;
+}
+
void kvm_sdei_destroy_vcpu(struct kvm_vcpu *vcpu)
{
struct kvm_sdei_vcpu *vsdei = vcpu->arch.sdei;
--
2.23.0

2021-08-15 00:19:02

by Gavin Shan

Subject: [PATCH v4 19/21] KVM: arm64: Support SDEI event cancellation

An injected SDEI event delivers a notification to the guest, but the
event may become unnecessary after it has been injected. This introduces
an API to cancel an injected SDEI event, provided it has not yet been
fired to the guest.

This mechanism will be needed when asynchronous page fault support
is added.
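The cancellation rule above can be summarized with a small standalone model (this is illustrative code with our own names, not the kernel implementation): a queued event can be withdrawn and disappears once its reference count drops to zero, but the event currently taken for delivery is refused, matching the -EINPROGRESS case in kvm_sdei_cancel().

```c
#include <assert.h>

/* Illustrative model of the cancellation rule: refcount counts queued
 * instances; in_delivery is set once the event has been picked up for
 * delivery to the guest and can no longer be withdrawn. */
enum { CANCEL_OK = 0, CANCEL_EINPROGRESS = -1 };

struct vcpu_event_model {
	int refcount;
	int in_delivery;
};

static int cancel_event(struct vcpu_event_model *e)
{
	/* The last queued instance is the one being delivered: too late */
	if (e->refcount <= 1 && e->in_delivery)
		return CANCEL_EINPROGRESS;

	e->refcount--;		/* drop one queued instance */
	return CANCEL_OK;
}
```

With the event queued twice and not yet delivered, both cancellations succeed; once it is in flight with a single reference, cancellation is denied.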

Signed-off-by: Gavin Shan <[email protected]>
---
arch/arm64/include/asm/kvm_sdei.h | 1 +
arch/arm64/kvm/sdei.c | 49 +++++++++++++++++++++++++++++++
2 files changed, 50 insertions(+)

diff --git a/arch/arm64/include/asm/kvm_sdei.h b/arch/arm64/include/asm/kvm_sdei.h
index 51087fe971ba..353744c7bad9 100644
--- a/arch/arm64/include/asm/kvm_sdei.h
+++ b/arch/arm64/include/asm/kvm_sdei.h
@@ -126,6 +126,7 @@ int kvm_sdei_register_notifier(struct kvm *kvm, unsigned long num,
kvm_sdei_notifier notifier);
int kvm_sdei_inject(struct kvm_vcpu *vcpu,
unsigned long num, bool immediate);
+int kvm_sdei_cancel(struct kvm_vcpu *vcpu, unsigned long num);
void kvm_sdei_deliver(struct kvm_vcpu *vcpu);
long kvm_sdei_vm_ioctl(struct kvm *kvm, unsigned long arg);
long kvm_sdei_vcpu_ioctl(struct kvm_vcpu *vcpu, unsigned long arg);
diff --git a/arch/arm64/kvm/sdei.c b/arch/arm64/kvm/sdei.c
index 7c2789cd1421..4f5a582daa97 100644
--- a/arch/arm64/kvm/sdei.c
+++ b/arch/arm64/kvm/sdei.c
@@ -907,6 +907,55 @@ int kvm_sdei_inject(struct kvm_vcpu *vcpu,
return ret;
}

+int kvm_sdei_cancel(struct kvm_vcpu *vcpu, unsigned long num)
+{
+ struct kvm *kvm = vcpu->kvm;
+ struct kvm_sdei_kvm *ksdei = kvm->arch.sdei;
+ struct kvm_sdei_vcpu *vsdei = vcpu->arch.sdei;
+ struct kvm_sdei_kvm_event *kske = NULL;
+ struct kvm_sdei_vcpu_event *ksve = NULL;
+ int ret = 0;
+
+ if (!(ksdei && vsdei)) {
+ ret = -EPERM;
+ goto out;
+ }
+
+ /* Find the vCPU event */
+ spin_lock(&vsdei->lock);
+ ksve = kvm_sdei_find_vcpu_event(vcpu, num);
+ if (!ksve) {
+ ret = -EINVAL;
+ goto unlock;
+ }
+
+ /* Event can't be cancelled if it has been delivered */
+ if (ksve->state.refcount <= 1 &&
+ (vsdei->critical_event == ksve ||
+ vsdei->normal_event == ksve)) {
+ ret = -EINPROGRESS;
+ goto unlock;
+ }
+
+ /* Free the vCPU event if necessary */
+ kske = ksve->kske;
+ ksve->state.refcount--;
+ if (!ksve->state.refcount) {
+ list_del(&ksve->link);
+ kfree(ksve);
+ }
+
+unlock:
+ spin_unlock(&vsdei->lock);
+ if (kske) {
+ spin_lock(&ksdei->lock);
+ kske->state.refcount--;
+ spin_unlock(&ksdei->lock);
+ }
+out:
+ return ret;
+}
+
void kvm_sdei_deliver(struct kvm_vcpu *vcpu)
{
struct kvm *kvm = vcpu->kvm;
--
2.23.0

2021-08-15 00:19:20

by Gavin Shan

Subject: [PATCH v4 21/21] KVM: selftests: Add SDEI test case

This adds an SDEI test case to selftests. The various hypercalls are
issued against the KVM private event (0x40200000) and the test ensures
they complete without error. Note that two vCPUs are started by default
and run the same sequence. It simulates what the SDEI client driver
does, issuing the following hypercalls in order:

SDEI_1_0_FN_SDEI_VERSION (probing SDEI capability)
SDEI_1_0_FN_SDEI_PE_UNMASK (CPU online)
SDEI_1_0_FN_SDEI_PRIVATE_RESET (restart SDEI)
SDEI_1_0_FN_SDEI_SHARED_RESET
SDEI_1_0_FN_SDEI_EVENT_GET_INFO (register event)
SDEI_1_0_FN_SDEI_EVENT_GET_INFO
SDEI_1_0_FN_SDEI_EVENT_GET_INFO
SDEI_1_0_FN_SDEI_EVENT_REGISTER
SDEI_1_0_FN_SDEI_EVENT_ENABLE (enable event)
SDEI_1_0_FN_SDEI_EVENT_DISABLE (disable event)
SDEI_1_0_FN_SDEI_EVENT_UNREGISTER (unregister event)
SDEI_1_0_FN_SDEI_PE_MASK (CPU offline)

Signed-off-by: Gavin Shan <[email protected]>
---
tools/testing/selftests/kvm/Makefile | 1 +
tools/testing/selftests/kvm/aarch64/sdei.c | 171 +++++++++++++++++++++
2 files changed, 172 insertions(+)
create mode 100644 tools/testing/selftests/kvm/aarch64/sdei.c

diff --git a/tools/testing/selftests/kvm/Makefile b/tools/testing/selftests/kvm/Makefile
index 5832f510a16c..33284c468b7d 100644
--- a/tools/testing/selftests/kvm/Makefile
+++ b/tools/testing/selftests/kvm/Makefile
@@ -87,6 +87,7 @@ TEST_GEN_PROGS_x86_64 += kvm_binary_stats_test
TEST_GEN_PROGS_aarch64 += aarch64/debug-exceptions
TEST_GEN_PROGS_aarch64 += aarch64/get-reg-list
TEST_GEN_PROGS_aarch64 += aarch64/vgic_init
+TEST_GEN_PROGS_aarch64 += aarch64/sdei
TEST_GEN_PROGS_aarch64 += demand_paging_test
TEST_GEN_PROGS_aarch64 += dirty_log_test
TEST_GEN_PROGS_aarch64 += dirty_log_perf_test
diff --git a/tools/testing/selftests/kvm/aarch64/sdei.c b/tools/testing/selftests/kvm/aarch64/sdei.c
new file mode 100644
index 000000000000..fadccdfc009c
--- /dev/null
+++ b/tools/testing/selftests/kvm/aarch64/sdei.c
@@ -0,0 +1,171 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * ARM SDEI test
+ *
+ * Copyright (C) 2021 Red Hat, Inc.
+ *
+ * Author(s): Gavin Shan <[email protected]>
+ */
+#define _GNU_SOURCE
+#include <stdio.h>
+
+#include "test_util.h"
+#include "kvm_util.h"
+#include "processor.h"
+#include "asm/kvm_sdei.h"
+#include "linux/arm_sdei.h"
+
+#define NR_VCPUS 2
+
+struct sdei_event {
+ uint32_t cpu;
+ uint64_t version;
+ uint64_t num;
+ uint64_t type;
+ uint64_t priority;
+ uint64_t signaled;
+};
+
+static struct sdei_event sdei_events[NR_VCPUS];
+
+static int64_t smccc(uint32_t func, uint64_t arg0, uint64_t arg1,
+ uint64_t arg2, uint64_t arg3, uint64_t arg4)
+{
+ int64_t ret;
+
+ asm volatile(
+ "mov x0, %1\n"
+ "mov x1, %2\n"
+ "mov x2, %3\n"
+ "mov x3, %4\n"
+ "mov x4, %5\n"
+ "mov x5, %6\n"
+ "hvc #0\n"
+ "mov %0, x0\n"
+ : "=r" (ret) : "r" (func), "r" (arg0), "r" (arg1),
+ "r" (arg2), "r" (arg3), "r" (arg4) :
+ "x0", "x1", "x2", "x3", "x4", "x5");
+
+ return ret;
+}
+
+static inline bool is_error(int64_t ret)
+{
+ if (ret == SDEI_NOT_SUPPORTED ||
+ ret == SDEI_INVALID_PARAMETERS ||
+ ret == SDEI_DENIED ||
+ ret == SDEI_PENDING ||
+ ret == SDEI_OUT_OF_RESOURCE)
+ return true;
+
+ return false;
+}
+
+static void guest_code(int cpu)
+{
+ struct sdei_event *event = &sdei_events[cpu];
+ int64_t ret;
+
+ /* CPU */
+ event->cpu = cpu;
+ event->num = KVM_SDEI_DEFAULT_NUM;
+ GUEST_ASSERT(cpu < NR_VCPUS);
+
+ /* Version */
+ ret = smccc(SDEI_1_0_FN_SDEI_VERSION, 0, 0, 0, 0, 0);
+ GUEST_ASSERT(!is_error(ret));
+ GUEST_ASSERT(SDEI_VERSION_MAJOR(ret) == 1);
+ GUEST_ASSERT(SDEI_VERSION_MINOR(ret) == 0);
+ event->version = ret;
+
+ /* CPU unmasking */
+ ret = smccc(SDEI_1_0_FN_SDEI_PE_UNMASK, 0, 0, 0, 0, 0);
+ GUEST_ASSERT(!is_error(ret));
+
+ /* Reset */
+ ret = smccc(SDEI_1_0_FN_SDEI_PRIVATE_RESET, 0, 0, 0, 0, 0);
+ GUEST_ASSERT(!is_error(ret));
+ ret = smccc(SDEI_1_0_FN_SDEI_SHARED_RESET, 0, 0, 0, 0, 0);
+ GUEST_ASSERT(!is_error(ret));
+
+ /* Event properties */
+ ret = smccc(SDEI_1_0_FN_SDEI_EVENT_GET_INFO,
+ event->num, SDEI_EVENT_INFO_EV_TYPE, 0, 0, 0);
+ GUEST_ASSERT(!is_error(ret));
+ event->type = ret;
+
+ ret = smccc(SDEI_1_0_FN_SDEI_EVENT_GET_INFO,
+ event->num, SDEI_EVENT_INFO_EV_PRIORITY, 0, 0, 0);
+ GUEST_ASSERT(!is_error(ret));
+ event->priority = ret;
+
+ ret = smccc(SDEI_1_0_FN_SDEI_EVENT_GET_INFO,
+ event->num, SDEI_EVENT_INFO_EV_SIGNALED, 0, 0, 0);
+ GUEST_ASSERT(!is_error(ret));
+ event->signaled = ret;
+
+ /* Event registration */
+ ret = smccc(SDEI_1_0_FN_SDEI_EVENT_REGISTER,
+ event->num, 0, 0, SDEI_EVENT_REGISTER_RM_ANY, 0);
+ GUEST_ASSERT(!is_error(ret));
+
+ /* Event enablement */
+ ret = smccc(SDEI_1_0_FN_SDEI_EVENT_ENABLE,
+ event->num, 0, 0, 0, 0);
+ GUEST_ASSERT(!is_error(ret));
+
+ /* Event disablement */
+ ret = smccc(SDEI_1_0_FN_SDEI_EVENT_DISABLE,
+ event->num, 0, 0, 0, 0);
+ GUEST_ASSERT(!is_error(ret));
+
+ /* Event unregistration */
+ ret = smccc(SDEI_1_0_FN_SDEI_EVENT_UNREGISTER,
+ event->num, 0, 0, 0, 0);
+ GUEST_ASSERT(!is_error(ret));
+
+ /* CPU masking */
+ ret = smccc(SDEI_1_0_FN_SDEI_PE_MASK, 0, 0, 0, 0, 0);
+ GUEST_ASSERT(!is_error(ret));
+
+ GUEST_DONE();
+}
+
+int main(int argc, char **argv)
+{
+ struct kvm_vm *vm;
+ int i;
+
+ if (!kvm_check_cap(KVM_CAP_ARM_SDEI)) {
+ pr_info("SDEI not supported\n");
+ return 0;
+ }
+
+ vm = vm_create_default(0, 0, guest_code);
+ ucall_init(vm, NULL);
+
+ for (i = 1; i < NR_VCPUS; i++)
+ vm_vcpu_add_default(vm, i, guest_code);
+
+ for (i = 0; i < NR_VCPUS; i++) {
+ vcpu_args_set(vm, i, 1, i);
+ vcpu_run(vm, i);
+
+ sync_global_from_guest(vm, sdei_events[i]);
+ pr_info("--------------------------------\n");
+ pr_info("CPU: %d\n",
+ sdei_events[i].cpu);
+ pr_info("Version: %ld.%ld (0x%lx)\n",
+ SDEI_VERSION_MAJOR(sdei_events[i].version),
+ SDEI_VERSION_MINOR(sdei_events[i].version),
+ SDEI_VERSION_VENDOR(sdei_events[i].version));
+ pr_info("Event: 0x%08lx\n",
+ sdei_events[i].num);
+ pr_info("Type: %s\n",
+ sdei_events[i].type ? "shared" : "private");
+ pr_info("Signaled: %s\n",
+ sdei_events[i].signaled ? "yes" : "no");
+ }
+
+ return 0;
+}
--
2.23.0

2021-08-15 00:20:29

by Gavin Shan

Subject: [PATCH v4 14/21] KVM: arm64: Support SDEI_EVENT_{COMPLETE, COMPLETE_AND_RESUME} hypercall

This supports the SDEI_EVENT_{COMPLETE, COMPLETE_AND_RESUME} hypercalls.
They are used by the guest to notify, from the handler, that the SDEI
event has been completed. The registers are updated according to the
SDEI specification as below:

* x0 - x17, PC and PState are restored to the values they had in
the interrupted context.

* For the SDEI_EVENT_COMPLETE_AND_RESUME hypercall, an IRQ exception
is additionally injected.
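The completion path always finishes a critical in-flight event before a normal one, reflecting that a critical handler can preempt a normal one, and returns SDEI_DENIED when no event is in flight. A minimal model of that selection (our own names, not the kernel code):

```c
#include <assert.h>

/* Illustrative model: completion pops the critical in-flight event
 * first, then the normal one; completing with nothing in flight is
 * denied, mirroring kvm_sdei_hypercall_complete(). */
enum { MODEL_SUCCESS = 0, MODEL_DENIED = -1 };

struct vcpu_sdei_model {
	int critical_event;	/* nonzero: critical event in flight */
	int normal_event;	/* nonzero: normal event in flight */
};

static int complete_event(struct vcpu_sdei_model *v)
{
	if (v->critical_event) {
		v->critical_event = 0;	/* critical context ends first */
		return MODEL_SUCCESS;
	}
	if (v->normal_event) {
		v->normal_event = 0;
		return MODEL_SUCCESS;
	}
	return MODEL_DENIED;		/* COMPLETE with no event in flight */
}
```

When a critical event preempted a normal one, two COMPLETE calls unwind them in order; a third is denied.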

Signed-off-by: Gavin Shan <[email protected]>
---
arch/arm64/include/asm/kvm_emulate.h | 1 +
arch/arm64/include/asm/kvm_host.h | 1 +
arch/arm64/kvm/hyp/exception.c | 7 +++
arch/arm64/kvm/inject_fault.c | 27 ++++++++++
arch/arm64/kvm/sdei.c | 75 ++++++++++++++++++++++++++++
5 files changed, 111 insertions(+)

diff --git a/arch/arm64/include/asm/kvm_emulate.h b/arch/arm64/include/asm/kvm_emulate.h
index fd418955e31e..923b4d08ea9a 100644
--- a/arch/arm64/include/asm/kvm_emulate.h
+++ b/arch/arm64/include/asm/kvm_emulate.h
@@ -37,6 +37,7 @@ bool kvm_condition_valid32(const struct kvm_vcpu *vcpu);
void kvm_skip_instr32(struct kvm_vcpu *vcpu);

void kvm_inject_undefined(struct kvm_vcpu *vcpu);
+void kvm_inject_irq(struct kvm_vcpu *vcpu);
void kvm_inject_vabt(struct kvm_vcpu *vcpu);
void kvm_inject_dabt(struct kvm_vcpu *vcpu, unsigned long addr);
void kvm_inject_pabt(struct kvm_vcpu *vcpu, unsigned long addr);
diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index 46f363aa6524..1824f7e1f9ab 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -437,6 +437,7 @@ struct kvm_vcpu_arch {
#define KVM_ARM64_EXCEPT_AA32_UND (0 << 9)
#define KVM_ARM64_EXCEPT_AA32_IABT (1 << 9)
#define KVM_ARM64_EXCEPT_AA32_DABT (2 << 9)
+#define KVM_ARM64_EXCEPT_AA32_IRQ (3 << 9)
/* For AArch64: */
#define KVM_ARM64_EXCEPT_AA64_ELx_SYNC (0 << 9)
#define KVM_ARM64_EXCEPT_AA64_ELx_IRQ (1 << 9)
diff --git a/arch/arm64/kvm/hyp/exception.c b/arch/arm64/kvm/hyp/exception.c
index 0418399e0a20..ef458207d152 100644
--- a/arch/arm64/kvm/hyp/exception.c
+++ b/arch/arm64/kvm/hyp/exception.c
@@ -310,6 +310,9 @@ static void kvm_inject_exception(struct kvm_vcpu *vcpu)
case KVM_ARM64_EXCEPT_AA32_DABT:
enter_exception32(vcpu, PSR_AA32_MODE_ABT, 16);
break;
+ case KVM_ARM64_EXCEPT_AA32_IRQ:
+ enter_exception32(vcpu, PSR_AA32_MODE_IRQ, 4);
+ break;
default:
/* Err... */
break;
@@ -320,6 +323,10 @@ static void kvm_inject_exception(struct kvm_vcpu *vcpu)
KVM_ARM64_EXCEPT_AA64_EL1):
enter_exception64(vcpu, PSR_MODE_EL1h, except_type_sync);
break;
+ case (KVM_ARM64_EXCEPT_AA64_ELx_IRQ |
+ KVM_ARM64_EXCEPT_AA64_EL1):
+ enter_exception64(vcpu, PSR_MODE_EL1h, except_type_irq);
+ break;
default:
/*
* Only EL1_SYNC makes sense so far, EL2_{SYNC,IRQ}
diff --git a/arch/arm64/kvm/inject_fault.c b/arch/arm64/kvm/inject_fault.c
index b47df73e98d7..3a8c55867d2f 100644
--- a/arch/arm64/kvm/inject_fault.c
+++ b/arch/arm64/kvm/inject_fault.c
@@ -66,6 +66,13 @@ static void inject_undef64(struct kvm_vcpu *vcpu)
vcpu_write_sys_reg(vcpu, esr, ESR_EL1);
}

+static void inject_irq64(struct kvm_vcpu *vcpu)
+{
+ vcpu->arch.flags |= (KVM_ARM64_EXCEPT_AA64_EL1 |
+ KVM_ARM64_EXCEPT_AA64_ELx_IRQ |
+ KVM_ARM64_PENDING_EXCEPTION);
+}
+
#define DFSR_FSC_EXTABT_LPAE 0x10
#define DFSR_FSC_EXTABT_nLPAE 0x08
#define DFSR_LPAE BIT(9)
@@ -77,6 +84,12 @@ static void inject_undef32(struct kvm_vcpu *vcpu)
KVM_ARM64_PENDING_EXCEPTION);
}

+static void inject_irq32(struct kvm_vcpu *vcpu)
+{
+ vcpu->arch.flags |= (KVM_ARM64_EXCEPT_AA32_IRQ |
+ KVM_ARM64_PENDING_EXCEPTION);
+}
+
/*
* Modelled after TakeDataAbortException() and TakePrefetchAbortException
* pseudocode.
@@ -160,6 +173,20 @@ void kvm_inject_undefined(struct kvm_vcpu *vcpu)
inject_undef64(vcpu);
}

+/**
+ * kvm_inject_irq - inject an IRQ into the guest
+ *
+ * It is assumed that this code is called from the VCPU thread and that the
+ * VCPU therefore is not currently executing guest code.
+ */
+void kvm_inject_irq(struct kvm_vcpu *vcpu)
+{
+ if (vcpu_el1_is_32bit(vcpu))
+ inject_irq32(vcpu);
+ else
+ inject_irq64(vcpu);
+}
+
void kvm_set_sei_esr(struct kvm_vcpu *vcpu, u64 esr)
{
vcpu_set_vsesr(vcpu, esr & ESR_ELx_ISS_MASK);
diff --git a/arch/arm64/kvm/sdei.c b/arch/arm64/kvm/sdei.c
index b5d6d1ed3858..1e8e213c9d70 100644
--- a/arch/arm64/kvm/sdei.c
+++ b/arch/arm64/kvm/sdei.c
@@ -308,6 +308,75 @@ static unsigned long kvm_sdei_hypercall_context(struct kvm_vcpu *vcpu)
return ret;
}

+static unsigned long kvm_sdei_hypercall_complete(struct kvm_vcpu *vcpu,
+ bool resume)
+{
+ struct kvm *kvm = vcpu->kvm;
+ struct kvm_sdei_kvm *ksdei = kvm->arch.sdei;
+ struct kvm_sdei_vcpu *vsdei = vcpu->arch.sdei;
+ struct kvm_sdei_kvm_event *kske = NULL;
+ struct kvm_sdei_vcpu_event *ksve = NULL;
+ struct kvm_sdei_vcpu_regs *regs;
+ unsigned long ret = SDEI_SUCCESS;
+ int index;
+
+ /* Sanity check */
+ if (!(ksdei && vsdei)) {
+ ret = SDEI_NOT_SUPPORTED;
+ goto out;
+ }
+
+ spin_lock(&vsdei->lock);
+ if (vsdei->critical_event) {
+ ksve = vsdei->critical_event;
+ regs = &vsdei->state.critical_regs;
+ vsdei->critical_event = NULL;
+ vsdei->state.critical_num = KVM_SDEI_INVALID_NUM;
+ } else if (vsdei->normal_event) {
+ ksve = vsdei->normal_event;
+ regs = &vsdei->state.normal_regs;
+ vsdei->normal_event = NULL;
+ vsdei->state.normal_num = KVM_SDEI_INVALID_NUM;
+ } else {
+ ret = SDEI_DENIED;
+ goto unlock;
+ }
+
+ /* Restore registers: x0 -> x17, PC, PState */
+ for (index = 0; index < ARRAY_SIZE(regs->regs); index++)
+ vcpu_set_reg(vcpu, index, regs->regs[index]);
+
+ *vcpu_cpsr(vcpu) = regs->pstate;
+ *vcpu_pc(vcpu) = regs->pc;
+
+ /* Inject interrupt if needed */
+ if (resume)
+ kvm_inject_irq(vcpu);
+
+ /*
+ * Update state. We needn't take lock in order to update the KVM
+ * event state as it's not destroyed because of the reference
+ * count.
+ */
+ kske = ksve->kske;
+ ksve->state.refcount--;
+ kske->state.refcount--;
+ if (!ksve->state.refcount) {
+ list_del(&ksve->link);
+ kfree(ksve);
+ }
+
+ /* Make another request if there is pending event */
+ if (!(list_empty(&vsdei->critical_events) &&
+ list_empty(&vsdei->normal_events)))
+ kvm_make_request(KVM_REQ_SDEI, vcpu);
+
+unlock:
+ spin_unlock(&vsdei->lock);
+out:
+ return ret;
+}
+
static unsigned long kvm_sdei_hypercall_unregister(struct kvm_vcpu *vcpu)
{
struct kvm *kvm = vcpu->kvm;
@@ -628,7 +697,13 @@ int kvm_sdei_hypercall(struct kvm_vcpu *vcpu)
ret = kvm_sdei_hypercall_context(vcpu);
break;
case SDEI_1_0_FN_SDEI_EVENT_COMPLETE:
+ has_result = false;
+ ret = kvm_sdei_hypercall_complete(vcpu, false);
+ break;
case SDEI_1_0_FN_SDEI_EVENT_COMPLETE_AND_RESUME:
+ has_result = false;
+ ret = kvm_sdei_hypercall_complete(vcpu, true);
+ break;
case SDEI_1_0_FN_SDEI_EVENT_UNREGISTER:
ret = kvm_sdei_hypercall_unregister(vcpu);
break;
--
2.23.0

2021-08-15 00:20:36

by Gavin Shan

Subject: [PATCH v4 16/21] KVM: arm64: Support SDEI ioctl commands on VM

This supports ioctl commands on the VM to manage the various objects.
It's primarily used by the VMM to accomplish live migration. The ioctl
commands introduced by this patch are listed below:

* KVM_SDEI_CMD_GET_VERSION
Retrieve the version of current implementation
* KVM_SDEI_CMD_SET_EVENT
Add event to be exported from KVM so that guest can register
against it afterwards
* KVM_SDEI_CMD_GET_KEVENT_COUNT
Retrieve number of registered SDEI events
* KVM_SDEI_CMD_GET_KEVENT
Retrieve the state of the registered SDEI event
* KVM_SDEI_CMD_SET_KEVENT
Populate the registered SDEI event
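A VMM-side caller would populate a kvm_sdei_cmd and pass it through the KVM_ARM_SDEI_COMMAND ioctl. A sketch of the GET_VERSION round trip is below; the struct here is a simplified stand-in mirroring only the fields this example touches, and the version encoding (major number in bits[31:16], so v1.0.0 is 1 << 16) follows the handler added by this patch:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Simplified stand-in for the uapi struct kvm_sdei_cmd; the real
 * struct carries a union of the per-command state as well. */
#define KVM_SDEI_CMD_GET_VERSION 0

struct sdei_cmd_model {
	uint32_t cmd;
	uint32_t version;
};

static void prepare_get_version(struct sdei_cmd_model *c)
{
	memset(c, 0, sizeof(*c));
	c->cmd = KVM_SDEI_CMD_GET_VERSION;
	/* A real VMM would now issue:
	 * ioctl(vm_fd, KVM_ARM_SDEI_COMMAND, c);
	 * and read back c->version. */
}

static unsigned int version_major(uint32_t version)
{
	return version >> 16;
}

static unsigned int version_minor(uint32_t version)
{
	return version & 0xffff;
}
```

Decoding the value the handler returns for v1.0.0 yields major 1, minor 0.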

Signed-off-by: Gavin Shan <[email protected]>
---
arch/arm64/include/asm/kvm_sdei.h | 1 +
arch/arm64/include/uapi/asm/kvm_sdei.h | 17 +++
arch/arm64/kvm/arm.c | 3 +
arch/arm64/kvm/sdei.c | 171 +++++++++++++++++++++++++
include/uapi/linux/kvm.h | 3 +
5 files changed, 195 insertions(+)

diff --git a/arch/arm64/include/asm/kvm_sdei.h b/arch/arm64/include/asm/kvm_sdei.h
index 19f2d9b91f85..8f5ea947ed0e 100644
--- a/arch/arm64/include/asm/kvm_sdei.h
+++ b/arch/arm64/include/asm/kvm_sdei.h
@@ -125,6 +125,7 @@ int kvm_sdei_hypercall(struct kvm_vcpu *vcpu);
int kvm_sdei_register_notifier(struct kvm *kvm, unsigned long num,
kvm_sdei_notifier notifier);
void kvm_sdei_deliver(struct kvm_vcpu *vcpu);
+long kvm_sdei_vm_ioctl(struct kvm *kvm, unsigned long arg);
void kvm_sdei_destroy_vcpu(struct kvm_vcpu *vcpu);
void kvm_sdei_destroy_vm(struct kvm *kvm);

diff --git a/arch/arm64/include/uapi/asm/kvm_sdei.h b/arch/arm64/include/uapi/asm/kvm_sdei.h
index 4ef661d106fe..35ff05be3c28 100644
--- a/arch/arm64/include/uapi/asm/kvm_sdei.h
+++ b/arch/arm64/include/uapi/asm/kvm_sdei.h
@@ -57,5 +57,22 @@ struct kvm_sdei_vcpu_state {
struct kvm_sdei_vcpu_regs normal_regs;
};

+#define KVM_SDEI_CMD_GET_VERSION 0
+#define KVM_SDEI_CMD_SET_EVENT 1
+#define KVM_SDEI_CMD_GET_KEVENT_COUNT 2
+#define KVM_SDEI_CMD_GET_KEVENT 3
+#define KVM_SDEI_CMD_SET_KEVENT 4
+
+struct kvm_sdei_cmd {
+ __u32 cmd;
+ union {
+ __u32 version;
+ __u32 count;
+ __u64 num;
+ struct kvm_sdei_event_state kse_state;
+ struct kvm_sdei_kvm_event_state kske_state;
+ };
+};
+
#endif /* !__ASSEMBLY__ */
#endif /* _UAPI__ASM_KVM_SDEI_H */
diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
index 0c3db1ef1ba9..8d61585124b2 100644
--- a/arch/arm64/kvm/arm.c
+++ b/arch/arm64/kvm/arm.c
@@ -1389,6 +1389,9 @@ long kvm_arch_vm_ioctl(struct file *filp,
return -EFAULT;
return kvm_vm_ioctl_mte_copy_tags(kvm, &copy_tags);
}
+ case KVM_ARM_SDEI_COMMAND: {
+ return kvm_sdei_vm_ioctl(kvm, arg);
+ }
default:
return -EINVAL;
}
diff --git a/arch/arm64/kvm/sdei.c b/arch/arm64/kvm/sdei.c
index 5f7a37dcaa77..bdd76c3e5153 100644
--- a/arch/arm64/kvm/sdei.c
+++ b/arch/arm64/kvm/sdei.c
@@ -931,6 +931,177 @@ void kvm_sdei_create_vcpu(struct kvm_vcpu *vcpu)
vcpu->arch.sdei = vsdei;
}

+static long kvm_sdei_set_event(struct kvm *kvm,
+ struct kvm_sdei_event_state *kse_state)
+{
+ struct kvm_sdei_kvm *ksdei = kvm->arch.sdei;
+ struct kvm_sdei_event *kse = NULL;
+
+ if (!kvm_sdei_is_valid_event_num(kse_state->num))
+ return -EINVAL;
+
+ if (!(kse_state->type == SDEI_EVENT_TYPE_SHARED ||
+ kse_state->type == SDEI_EVENT_TYPE_PRIVATE))
+ return -EINVAL;
+
+ if (!(kse_state->priority == SDEI_EVENT_PRIORITY_NORMAL ||
+ kse_state->priority == SDEI_EVENT_PRIORITY_CRITICAL))
+ return -EINVAL;
+
+ kse = kvm_sdei_find_event(kvm, kse_state->num);
+ if (kse)
+ return -EEXIST;
+
+ kse = kzalloc(sizeof(*kse), GFP_KERNEL);
+ if (!kse)
+ return -ENOMEM;
+
+ kse->state = *kse_state;
+ kse->kvm = kvm;
+ list_add_tail(&kse->link, &ksdei->events);
+
+ return 0;
+}
+
+static long kvm_sdei_get_kevent_count(struct kvm *kvm, int *count)
+{
+ struct kvm_sdei_kvm *ksdei = kvm->arch.sdei;
+ struct kvm_sdei_kvm_event *kske = NULL;
+ int total = 0;
+
+ list_for_each_entry(kske, &ksdei->kvm_events, link) {
+ total++;
+ }
+
+ *count = total;
+ return 0;
+}
+
+static long kvm_sdei_get_kevent(struct kvm *kvm,
+ struct kvm_sdei_kvm_event_state *kske_state)
+{
+ struct kvm_sdei_kvm *ksdei = kvm->arch.sdei;
+ struct kvm_sdei_kvm_event *kske = NULL;
+
+ /*
+ * The first entry is fetched if the event number is invalid.
+ * Otherwise, the next entry is fetched.
+ */
+ if (!kvm_sdei_is_valid_event_num(kske_state->num)) {
+ kske = list_first_entry_or_null(&ksdei->kvm_events,
+ struct kvm_sdei_kvm_event, link);
+ } else {
+ kske = kvm_sdei_find_kvm_event(kvm, kske_state->num);
+ if (kske && !list_is_last(&kske->link, &ksdei->kvm_events))
+ kske = list_next_entry(kske, link);
+ else
+ kske = NULL;
+ }
+
+ if (!kske)
+ return -ENOENT;
+
+ *kske_state = kske->state;
+
+ return 0;
+}
+
+static long kvm_sdei_set_kevent(struct kvm *kvm,
+ struct kvm_sdei_kvm_event_state *kske_state)
+{
+ struct kvm_sdei_kvm *ksdei = kvm->arch.sdei;
+ struct kvm_sdei_event *kse = NULL;
+ struct kvm_sdei_kvm_event *kske = NULL;
+
+ /* Sanity check */
+ if (!kvm_sdei_is_valid_event_num(kske_state->num))
+ return -EINVAL;
+
+ if (!(kske_state->route_mode == SDEI_EVENT_REGISTER_RM_ANY ||
+ kske_state->route_mode == SDEI_EVENT_REGISTER_RM_PE))
+ return -EINVAL;
+
+ /* Check if the event number is valid */
+ kse = kvm_sdei_find_event(kvm, kske_state->num);
+ if (!kse)
+ return -ENOENT;
+
+ /* Check if the event has been populated */
+ kske = kvm_sdei_find_kvm_event(kvm, kske_state->num);
+ if (kske)
+ return -EEXIST;
+
+ kske = kzalloc(sizeof(*kske), GFP_KERNEL);
+ if (!kske)
+ return -ENOMEM;
+
+ kske->state = *kske_state;
+ kske->kse = kse;
+ kske->kvm = kvm;
+ list_add_tail(&kske->link, &ksdei->kvm_events);
+
+ return 0;
+}
+
+long kvm_sdei_vm_ioctl(struct kvm *kvm, unsigned long arg)
+{
+ struct kvm_sdei_kvm *ksdei = kvm->arch.sdei;
+ struct kvm_sdei_cmd *cmd = NULL;
+ void __user *argp = (void __user *)arg;
+ bool copy = false;
+ long ret = 0;
+
+ /* Sanity check */
+ if (!ksdei) {
+ ret = -EPERM;
+ goto out;
+ }
+
+ cmd = kzalloc(sizeof(*cmd), GFP_KERNEL);
+ if (!cmd) {
+ ret = -ENOMEM;
+ goto out;
+ }
+
+ if (copy_from_user(cmd, argp, sizeof(*cmd))) {
+ ret = -EFAULT;
+ goto out;
+ }
+
+ spin_lock(&ksdei->lock);
+
+ switch (cmd->cmd) {
+ case KVM_SDEI_CMD_GET_VERSION:
+ copy = true;
+ cmd->version = (1 << 16); /* v1.0.0 */
+ break;
+ case KVM_SDEI_CMD_SET_EVENT:
+ ret = kvm_sdei_set_event(kvm, &cmd->kse_state);
+ break;
+ case KVM_SDEI_CMD_GET_KEVENT_COUNT:
+ copy = true;
+ ret = kvm_sdei_get_kevent_count(kvm, &cmd->count);
+ break;
+ case KVM_SDEI_CMD_GET_KEVENT:
+ copy = true;
+ ret = kvm_sdei_get_kevent(kvm, &cmd->kske_state);
+ break;
+ case KVM_SDEI_CMD_SET_KEVENT:
+ ret = kvm_sdei_set_kevent(kvm, &cmd->kske_state);
+ break;
+ default:
+ ret = -EINVAL;
+ }
+
+ spin_unlock(&ksdei->lock);
+out:
+ if (!ret && copy && copy_to_user(argp, cmd, sizeof(*cmd)))
+ ret = -EFAULT;
+
+ kfree(cmd);
+ return ret;
+}
+
void kvm_sdei_destroy_vcpu(struct kvm_vcpu *vcpu)
{
struct kvm_sdei_vcpu *vsdei = vcpu->arch.sdei;
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index d9e4aabcb31a..8cf41fd4bf86 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -1679,6 +1679,9 @@ struct kvm_xen_vcpu_attr {
#define KVM_XEN_VCPU_ATTR_TYPE_RUNSTATE_DATA 0x4
#define KVM_XEN_VCPU_ATTR_TYPE_RUNSTATE_ADJUST 0x5

+/* Available with KVM_CAP_ARM_SDEI */
+#define KVM_ARM_SDEI_COMMAND _IOWR(KVMIO, 0xce, struct kvm_sdei_cmd)
+
/* Secure Encrypted Virtualization command */
enum sev_cmd_id {
/* Guest initialization commands */
--
2.23.0

2021-08-15 00:20:52

by Gavin Shan

Subject: [PATCH v4 18/21] KVM: arm64: Support SDEI event injection

This supports SDEI event injection by implementing kvm_sdei_inject().
It's called directly by the kernel, or by the VMM through an ioctl
command, to inject an SDEI event into the specified vCPU.
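One detail of the implementation worth spelling out is the "immediate" mode: delivery is refused (-ENOSPC in the patch) when an event of equal or higher priority is already pending, since a critical event can preempt a normal one but not the other way around. A compact model of that admission check (our own names):

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative model of the immediate-delivery check in
 * kvm_sdei_inject(): a critical event is blocked only by a pending
 * critical event; a normal event is blocked by anything pending. */
enum { PRIO_NORMAL = 0, PRIO_CRITICAL = 1 };
enum { INJ_OK = 0, INJ_ENOSPC = -1 };

static int can_inject_immediate(int prio, bool critical_pending,
				bool normal_pending)
{
	if (prio == PRIO_CRITICAL && critical_pending)
		return INJ_ENOSPC;
	if (prio == PRIO_NORMAL && (critical_pending || normal_pending))
		return INJ_ENOSPC;
	return INJ_OK;
}
```

So a critical event can still be injected immediately while a normal one is pending, but not vice versa.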

Signed-off-by: Gavin Shan <[email protected]>
---
arch/arm64/include/asm/kvm_sdei.h | 2 +
arch/arm64/include/uapi/asm/kvm_sdei.h | 1 +
arch/arm64/kvm/sdei.c | 108 +++++++++++++++++++++++++
3 files changed, 111 insertions(+)

diff --git a/arch/arm64/include/asm/kvm_sdei.h b/arch/arm64/include/asm/kvm_sdei.h
index a997989bab77..51087fe971ba 100644
--- a/arch/arm64/include/asm/kvm_sdei.h
+++ b/arch/arm64/include/asm/kvm_sdei.h
@@ -124,6 +124,8 @@ void kvm_sdei_create_vcpu(struct kvm_vcpu *vcpu);
int kvm_sdei_hypercall(struct kvm_vcpu *vcpu);
int kvm_sdei_register_notifier(struct kvm *kvm, unsigned long num,
kvm_sdei_notifier notifier);
+int kvm_sdei_inject(struct kvm_vcpu *vcpu,
+ unsigned long num, bool immediate);
void kvm_sdei_deliver(struct kvm_vcpu *vcpu);
long kvm_sdei_vm_ioctl(struct kvm *kvm, unsigned long arg);
long kvm_sdei_vcpu_ioctl(struct kvm_vcpu *vcpu, unsigned long arg);
diff --git a/arch/arm64/include/uapi/asm/kvm_sdei.h b/arch/arm64/include/uapi/asm/kvm_sdei.h
index b916c3435646..f7a6b2b22b50 100644
--- a/arch/arm64/include/uapi/asm/kvm_sdei.h
+++ b/arch/arm64/include/uapi/asm/kvm_sdei.h
@@ -67,6 +67,7 @@ struct kvm_sdei_vcpu_state {
#define KVM_SDEI_CMD_SET_VEVENT 7
#define KVM_SDEI_CMD_GET_VCPU_STATE 8
#define KVM_SDEI_CMD_SET_VCPU_STATE 9
+#define KVM_SDEI_CMD_INJECT_EVENT 10

struct kvm_sdei_cmd {
__u32 cmd;
diff --git a/arch/arm64/kvm/sdei.c b/arch/arm64/kvm/sdei.c
index 79315b77f24b..7c2789cd1421 100644
--- a/arch/arm64/kvm/sdei.c
+++ b/arch/arm64/kvm/sdei.c
@@ -802,6 +802,111 @@ int kvm_sdei_register_notifier(struct kvm *kvm,
return ret;
}

+int kvm_sdei_inject(struct kvm_vcpu *vcpu,
+ unsigned long num,
+ bool immediate)
+{
+ struct kvm *kvm = vcpu->kvm;
+ struct kvm_sdei_kvm *ksdei = kvm->arch.sdei;
+ struct kvm_sdei_vcpu *vsdei = vcpu->arch.sdei;
+ struct kvm_sdei_event *kse = NULL;
+ struct kvm_sdei_kvm_event *kske = NULL;
+ struct kvm_sdei_vcpu_event *ksve = NULL;
+ int index, ret = 0;
+
+ /* Sanity check */
+ if (!(ksdei && vsdei)) {
+ ret = -EPERM;
+ goto out;
+ }
+
+ if (!kvm_sdei_is_valid_event_num(num)) {
+ ret = -EINVAL;
+ goto out;
+ }
+
+ /* Check the kvm event */
+ spin_lock(&ksdei->lock);
+ kske = kvm_sdei_find_kvm_event(kvm, num);
+ if (!kske) {
+ ret = -ENOENT;
+ goto unlock_kvm;
+ }
+
+ kse = kske->kse;
+ index = (kse->state.type == SDEI_EVENT_TYPE_PRIVATE) ?
+ vcpu->vcpu_idx : 0;
+ if (!(kvm_sdei_is_registered(kske, index) &&
+ kvm_sdei_is_enabled(kske, index))) {
+ ret = -EPERM;
+ goto unlock_kvm;
+ }
+
+ /* Check the vcpu state */
+ spin_lock(&vsdei->lock);
+ if (vsdei->state.masked) {
+ ret = -EPERM;
+ goto unlock_vcpu;
+ }
+
+ /* Check if the event can be delivered immediately */
+ if (immediate) {
+ if (kse->state.priority == SDEI_EVENT_PRIORITY_CRITICAL &&
+ !list_empty(&vsdei->critical_events)) {
+ ret = -ENOSPC;
+ goto unlock_vcpu;
+ }
+
+ if (kse->state.priority == SDEI_EVENT_PRIORITY_NORMAL &&
+ (!list_empty(&vsdei->critical_events) ||
+ !list_empty(&vsdei->normal_events))) {
+ ret = -ENOSPC;
+ goto unlock_vcpu;
+ }
+ }
+
+ /* Check if the vcpu event exists */
+ ksve = kvm_sdei_find_vcpu_event(vcpu, num);
+ if (ksve) {
+ kske->state.refcount++;
+ ksve->state.refcount++;
+ kvm_make_request(KVM_REQ_SDEI, vcpu);
+ goto unlock_vcpu;
+ }
+
+ /* Allocate vcpu event */
+ ksve = kzalloc(sizeof(*ksve), GFP_KERNEL);
+ if (!ksve) {
+ ret = -ENOMEM;
+ goto unlock_vcpu;
+ }
+
+ /*
+ * We should take lock to update KVM event state because its
+ * reference count might be zero. In that case, the KVM event
+ * could be destroyed.
+ */
+ kske->state.refcount++;
+ ksve->state.num = num;
+ ksve->state.refcount = 1;
+ ksve->kske = kske;
+ ksve->vcpu = vcpu;
+
+ if (kse->state.priority == SDEI_EVENT_PRIORITY_CRITICAL)
+ list_add_tail(&ksve->link, &vsdei->critical_events);
+ else
+ list_add_tail(&ksve->link, &vsdei->normal_events);
+
+ kvm_make_request(KVM_REQ_SDEI, vcpu);
+
+unlock_vcpu:
+ spin_unlock(&vsdei->lock);
+unlock_kvm:
+ spin_unlock(&ksdei->lock);
+out:
+ return ret;
+}
+
void kvm_sdei_deliver(struct kvm_vcpu *vcpu)
{
struct kvm *kvm = vcpu->kvm;
@@ -1317,6 +1422,9 @@ long kvm_sdei_vcpu_ioctl(struct kvm_vcpu *vcpu, unsigned long arg)
case KVM_SDEI_CMD_SET_VCPU_STATE:
ret = kvm_sdei_set_vcpu_state(vcpu, &cmd->ksv_state);
break;
+ case KVM_SDEI_CMD_INJECT_EVENT:
+ ret = kvm_sdei_inject(vcpu, cmd->num, false);
+ break;
default:
ret = -EINVAL;
}
--
2.23.0

2021-08-15 00:21:13

by Gavin Shan

Subject: [PATCH v4 20/21] KVM: arm64: Export SDEI capability

The SDEI functionality is now ready to be exported. This adds a
new capability (KVM_CAP_ARM_SDEI) and exports it.

Signed-off-by: Gavin Shan <[email protected]>
---
arch/arm64/kvm/arm.c | 3 +++
include/uapi/linux/kvm.h | 1 +
2 files changed, 4 insertions(+)

diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
index 215cdbeb272a..7d9bbc888ae5 100644
--- a/arch/arm64/kvm/arm.c
+++ b/arch/arm64/kvm/arm.c
@@ -278,6 +278,9 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
case KVM_CAP_ARM_PTRAUTH_GENERIC:
r = system_has_full_ptr_auth();
break;
+ case KVM_CAP_ARM_SDEI:
+ r = 1;
+ break;
default:
r = 0;
}
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index 8cf41fd4bf86..2aa748fd89c7 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -1112,6 +1112,7 @@ struct kvm_ppc_resize_hpt {
#define KVM_CAP_BINARY_STATS_FD 203
#define KVM_CAP_EXIT_ON_EMULATION_FAILURE 204
#define KVM_CAP_ARM_MTE 205
+#define KVM_CAP_ARM_SDEI 206

#ifdef KVM_CAP_IRQ_ROUTING

--
2.23.0

2021-08-15 00:21:37

by Gavin Shan

Subject: Re: [PATCH v4 00/21] Support SDEI Virtualization

On 8/15/21 10:13 AM, Gavin Shan wrote:
> This series intends to virtualize Software Delegated Exception Interface
> (SDEI), which is defined by DEN0054A. It allows the hypervisor to deliver
> NMI-alike event to guest and it's needed by asynchronous page fault to
> deliver page-not-present notification from hypervisor to guest. The code
> and the required qemu changes can be found from:
>
> https://developer.arm.com/documentation/den0054/latest
> https://github.com/gwshan/linux ("kvm/arm64_sdei")
> https://github.com/gwshan/qemu ("kvm/arm64_sdei")
>
> The SDEI event is identified by a 32-bits number. Bits[31:24] are used
> to indicate the SDEI event properties while bits[23:0] are identifying
> the unique number. The implementation takes bits[23:22] to indicate the
> owner of the SDEI event. For example, those SDEI events owned by KVM
> should have these two bits set to 0b01. Besides, the implementation
> supports SDEI events owned by KVM only.
>
> The design is pretty straightforward and the implementation is just
> following the SDEI specification, to support the defined SMCCC interfaces,
> except the IRQ binding stuff. There are several data structures introduced.
> Some of the objects have to be migrated by VMM. So their definitions are
> split up for VMM to include the corresponding states for migration.
>
> struct kvm_sdei_kvm
> Associated with VM and used to track the KVM exposed SDEI events
> and those registered by guest.
> struct kvm_sdei_vcpu
> Associated with vCPU and used to track SDEI event delivery. The
> preempted context is saved prior to the delivery and restored
> after that.
> struct kvm_sdei_event
> SDEI events exposed by KVM so that guest can register and enable.
> struct kvm_sdei_kvm_event
> SDEI events that have been registered by guest.
> struct kvm_sdei_vcpu_event
> SDEI events that have been queued to specific vCPU for delivery.
>
> The series is organized as below:
>
> PATCH[01] Introduces template for smccc_get_argx()
> PATCH[02] Introduces the data structures and infrastructure
> PATCH[03-14] Supports various SDEI related hypercalls
> PATCH[15] Supports SDEI event notification
> PATCH[16-17] Introduces ioctl command for migration
> PATCH[18-19] Supports SDEI event injection and cancellation
> PATCH[20] Exports SDEI capability
> PATCH[21] Adds self-test case for SDEI virtualization
>

[...]

I explicitly copied James Morse and Mark Rutland when posting the series,
but something unknown went wrong. I'm including them here to avoid
reposting the whole series.

Thanks,
Gavin

2021-11-10 00:00:39

by Eric Auger

Subject: Re: [PATCH v4 03/21] KVM: arm64: Support SDEI_VERSION hypercall

Hi Gavin

On 8/15/21 2:13 AM, Gavin Shan wrote:
> This supports SDEI_VERSION hypercall by returning v1.0.0 simply
s/This supports/Add support. I think this is the preferred way to start
the commit msg. Here and elsewhere.
> when the functionality is supported on the VM and vCPU.
Can you explain when the functionality isn't supported on either? From
the infra patch I have the impression that an allocation failure is the
sole cause of lack of support.
>
> Signed-off-by: Gavin Shan <[email protected]>
> ---
> arch/arm64/kvm/sdei.c | 18 ++++++++++++++++++
> 1 file changed, 18 insertions(+)
>
> diff --git a/arch/arm64/kvm/sdei.c b/arch/arm64/kvm/sdei.c
> index ab330b74a965..aa9485f076a9 100644
> --- a/arch/arm64/kvm/sdei.c
> +++ b/arch/arm64/kvm/sdei.c
> @@ -70,6 +70,22 @@ static void kvm_sdei_remove_vcpu_events(struct kvm_vcpu *vcpu)
> }
> }
>
> +static unsigned long kvm_sdei_hypercall_version(struct kvm_vcpu *vcpu)
> +{
> + struct kvm *kvm = vcpu->kvm;
> + struct kvm_sdei_kvm *ksdei = kvm->arch.sdei;
> + struct kvm_sdei_vcpu *vsdei = vcpu->arch.sdei;
> + unsigned long ret = SDEI_NOT_SUPPORTED;
nit: I would remove ret local variable
> +
> + if (!(ksdei && vsdei))
> + return ret;
> +
> + /* v1.0.0 */
> + ret = (1UL << SDEI_VERSION_MAJOR_SHIFT);
> +
> + return ret;
> +}
> +
> int kvm_sdei_hypercall(struct kvm_vcpu *vcpu)
> {
> u32 func = smccc_get_function(vcpu);
> @@ -78,6 +94,8 @@ int kvm_sdei_hypercall(struct kvm_vcpu *vcpu)
>
> switch (func) {
> case SDEI_1_0_FN_SDEI_VERSION:
> + ret = kvm_sdei_hypercall_version(vcpu);
> + break;
> case SDEI_1_0_FN_SDEI_EVENT_REGISTER:
> case SDEI_1_0_FN_SDEI_EVENT_ENABLE:
> case SDEI_1_0_FN_SDEI_EVENT_DISABLE:
>

Eric

2021-11-10 00:00:40

by Eric Auger

Subject: Re: [PATCH v4 01/21] KVM: arm64: Introduce template for inline functions

Hi Gavin,

On 8/15/21 2:13 AM, Gavin Shan wrote:
> The inline functions used to get the SMCCC parameters have same
> layout. It means these functions can be presented by a template,
> to make the code simplified. Besides, this adds more similar inline
> functions like smccc_get_arg{4,5,6,7,8}() to visit more SMCCC arguments,
> which are needed by SDEI virtualization support.
>
> Signed-off-by: Gavin Shan <[email protected]>
> ---
> include/kvm/arm_hypercalls.h | 34 +++++++++++++++-------------------
> 1 file changed, 15 insertions(+), 19 deletions(-)
>
> diff --git a/include/kvm/arm_hypercalls.h b/include/kvm/arm_hypercalls.h
> index 0e2509d27910..ebecb6c68254 100644
> --- a/include/kvm/arm_hypercalls.h
> +++ b/include/kvm/arm_hypercalls.h
> @@ -6,27 +6,21 @@
>
> #include <asm/kvm_emulate.h>
>
> -int kvm_hvc_call_handler(struct kvm_vcpu *vcpu);
> -
> -static inline u32 smccc_get_function(struct kvm_vcpu *vcpu)
> -{
> - return vcpu_get_reg(vcpu, 0);
> +#define SMCCC_DECLARE_GET_FUNC(type, name, reg) \
> +static inline type smccc_get_##name(struct kvm_vcpu *vcpu) \
> +{ \
> + return vcpu_get_reg(vcpu, reg); \
> }
>
> -static inline unsigned long smccc_get_arg1(struct kvm_vcpu *vcpu)
> -{
> - return vcpu_get_reg(vcpu, 1);
> -}
> -
> -static inline unsigned long smccc_get_arg2(struct kvm_vcpu *vcpu)
> -{
> - return vcpu_get_reg(vcpu, 2);
> -}
> -
> -static inline unsigned long smccc_get_arg3(struct kvm_vcpu *vcpu)
> -{
> - return vcpu_get_reg(vcpu, 3);
> -}
> +SMCCC_DECLARE_GET_FUNC(u32, function, 0)
> +SMCCC_DECLARE_GET_FUNC(unsigned long, arg1, 1)
> +SMCCC_DECLARE_GET_FUNC(unsigned long, arg2, 2)
> +SMCCC_DECLARE_GET_FUNC(unsigned long, arg3, 3)
> +SMCCC_DECLARE_GET_FUNC(unsigned long, arg4, 4)
> +SMCCC_DECLARE_GET_FUNC(unsigned long, arg5, 5)
> +SMCCC_DECLARE_GET_FUNC(unsigned long, arg6, 6)
> +SMCCC_DECLARE_GET_FUNC(unsigned long, arg7, 7)
> +SMCCC_DECLARE_GET_FUNC(unsigned long, arg8, 8)
I think I would keep smccc_get_function() and add macros to get the
64-bit args. SMCCC_DECLARE_GET_FUNC is an odd macro name for a function
fetching an arg. I would suggest:

> +#define SMCCC_DECLARE_GET_ARG(reg) \
> +static inline unsigned long smccc_get_arg##reg(struct kvm_vcpu *vcpu) \
> +{ \
> + return vcpu_get_reg(vcpu, reg); \
> }
>
> static inline void smccc_set_retval(struct kvm_vcpu *vcpu,
> unsigned long a0,
> @@ -40,4 +34,6 @@ static inline void smccc_set_retval(struct kvm_vcpu *vcpu,
> vcpu_set_reg(vcpu, 3, a3);
> }
>
> +int kvm_hvc_call_handler(struct kvm_vcpu *vcpu);
> +
spurious change?
> #endif
>
Thanks

Eric

2021-11-10 00:03:55

by Eric Auger

Subject: Re: [PATCH v4 02/21] KVM: arm64: Add SDEI virtualization infrastructure

Hi Gavin,
On 8/15/21 2:13 AM, Gavin Shan wrote:
> Software Delegated Exception Interface (SDEI) provides a mechanism for
> registering and servicing system events. Those system events are high
> priority events, which must be serviced immediately. It's going to be
> used by Asynchronous Page Fault (APF) to deliver notification from KVM
> to guest. It's noted that SDEI is defined by ARM DEN0054A specification.
>
> This introduces SDEI virtualization infrastructure where the SDEI events
> are registered and manuplated by the guest through hypercall. The SDEI
manipulated
> event is delivered to one specific vCPU by KVM once it's raised. This
> introduces data structures to represent the needed objects to implement
> the feature, which is highlighted as below. As those objects could be
> migrated between VMs, these data structures are partially exported to
> user space.
>
> * kvm_sdei_event
> SDEI events are exported from KVM so that guest is able to register
> and manuplate.
manipulate
> * kvm_sdei_kvm_event
> SDEI event that has been registered by guest.
I would recommend revisiting the names. Why "kvm event"? Why not
registered_event instead, which would actually tell what it is? Also, you
have "kvm" twice in the struct name.
> * kvm_sdei_kvm_vcpu
Didn't you mean kvm_sdei_vcpu_event instead?
> SDEI event that has been delivered to the target vCPU.
> * kvm_sdei_kvm
> Place holder of exported and registered SDEI events.
> * kvm_sdei_vcpu
> Auxiliary object to save the preempted context during SDEI event
> delivery.
>
> The error is returned for all SDEI hypercalls for now. They will be
> implemented by the subsequent patches.
>
> Signed-off-by: Gavin Shan <[email protected]>
> ---
> arch/arm64/include/asm/kvm_host.h | 6 +
> arch/arm64/include/asm/kvm_sdei.h | 118 +++++++++++++++
> arch/arm64/include/uapi/asm/kvm.h | 1 +
> arch/arm64/include/uapi/asm/kvm_sdei.h | 60 ++++++++
> arch/arm64/kvm/Makefile | 2 +-
> arch/arm64/kvm/arm.c | 7 +
> arch/arm64/kvm/hypercalls.c | 18 +++
> arch/arm64/kvm/sdei.c | 198 +++++++++++++++++++++++++
> 8 files changed, 409 insertions(+), 1 deletion(-)
> create mode 100644 arch/arm64/include/asm/kvm_sdei.h
> create mode 100644 arch/arm64/include/uapi/asm/kvm_sdei.h
> create mode 100644 arch/arm64/kvm/sdei.c
>
> diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
> index 41911585ae0c..aedf901e1ec7 100644
> --- a/arch/arm64/include/asm/kvm_host.h
> +++ b/arch/arm64/include/asm/kvm_host.h
> @@ -113,6 +113,9 @@ struct kvm_arch {
> /* Interrupt controller */
> struct vgic_dist vgic;
>
> + /* SDEI support */
this comment does not add much. Why not reuse the commit msg explanation?
Here and below.
> + struct kvm_sdei_kvm *sdei;
> +
> /* Mandated version of PSCI */
> u32 psci_version;
>
> @@ -339,6 +342,9 @@ struct kvm_vcpu_arch {
> * here.
> */
>
> + /* SDEI support */
> + struct kvm_sdei_vcpu *sdei;
> +
> /*
> * Guest registers we preserve during guest debugging.
> *
> diff --git a/arch/arm64/include/asm/kvm_sdei.h b/arch/arm64/include/asm/kvm_sdei.h
> new file mode 100644
> index 000000000000..b0abc13a0256
> --- /dev/null
> +++ b/arch/arm64/include/asm/kvm_sdei.h
> @@ -0,0 +1,118 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +/*
> + * Definitions of various KVM SDEI events.
> + *
> + * Copyright (C) 2021 Red Hat, Inc.
> + *
> + * Author(s): Gavin Shan <[email protected]>
> + */
> +
> +#ifndef __ARM64_KVM_SDEI_H__
> +#define __ARM64_KVM_SDEI_H__
> +
> +#include <uapi/linux/arm_sdei.h>
> +#include <uapi/asm/kvm_sdei.h>
> +#include <linux/bitmap.h>
> +#include <linux/list.h>
> +#include <linux/spinlock.h>
> +
> +struct kvm_sdei_event {
> + struct kvm_sdei_event_state state;
> + struct kvm *kvm;
> + struct list_head link;
> +};
> +
> +struct kvm_sdei_kvm_event {
> + struct kvm_sdei_kvm_event_state state;
> + struct kvm_sdei_event *kse;
> + struct kvm *kvm;
can't you reuse the kvm handle in state?
> + struct list_head link;
> +};
> +
> +struct kvm_sdei_vcpu_event {
> + struct kvm_sdei_vcpu_event_state state;
> + struct kvm_sdei_kvm_event *kske;
> + struct kvm_vcpu *vcpu;
> + struct list_head link;
> +};
> +
> +struct kvm_sdei_kvm {
> + spinlock_t lock;
> + struct list_head events; /* kvm_sdei_event */
> + struct list_head kvm_events; /* kvm_sdei_kvm_event */
> +};
> +
> +struct kvm_sdei_vcpu {
> + spinlock_t lock;
> + struct kvm_sdei_vcpu_state state;
could you explain the fields below?
> + struct kvm_sdei_vcpu_event *critical_event;
> + struct kvm_sdei_vcpu_event *normal_event;
> + struct list_head critical_events;
> + struct list_head normal_events;
> +};
> +
> +/*
> + * According to SDEI specification (v1.0), the event number spans 32-bits
> + * and the lower 24-bits are used as the (real) event number. I don't
> + * think we can use that much SDEI numbers in one system. So we reserve
> + * two bits from the 24-bits real event number, to indicate its types:
> + * physical event and virtual event. One reserved bit is enough for now,
> + * but two bits are reserved for possible extension in future.
I think this assumption is worth mentioning in the commit msg.
> + *
> + * The physical events are owned by underly firmware while the virtual
underly?
> + * events are used by VMM and KVM.
> + */
> +#define KVM_SDEI_EV_NUM_TYPE_SHIFT 22
> +#define KVM_SDEI_EV_NUM_TYPE_MASK 3
> +#define KVM_SDEI_EV_NUM_TYPE_PHYS 0
> +#define KVM_SDEI_EV_NUM_TYPE_VIRT 1
> +
> +static inline bool kvm_sdei_is_valid_event_num(unsigned long num)
the name of the function does not really describe what it does. It
actually checks that the SDEI event is a virtual one. Suggest kvm_sdei_is_virtual?
> +{
> + unsigned long type;
> +
> + if (num >> 32)
> + return false;
> +
> + type = (num >> KVM_SDEI_EV_NUM_TYPE_SHIFT) & KVM_SDEI_EV_NUM_TYPE_MASK;
I think the mask is generally applied before shifting. See
include/linux/irqchip/arm-gic-v3.h
> + if (type != KVM_SDEI_EV_NUM_TYPE_VIRT)
> + return false;
> +
> + return true;
> +}
> +
> +/* Accessors for the registration or enablement states of KVM event */
> +#define KVM_SDEI_FLAG_FUNC(field) \
> +static inline bool kvm_sdei_is_##field(struct kvm_sdei_kvm_event *kske, \
> + unsigned int index) \
> +{ \
> + return !!test_bit(index, (void *)(kske->state.field)); \
> +} \
> + \
> +static inline bool kvm_sdei_empty_##field(struct kvm_sdei_kvm_event *kske) \
nit: s/empty/none ?
> +{ \
> + return bitmap_empty((void *)(kske->state.field), \
> + KVM_SDEI_MAX_VCPUS); \
> +} \
> +static inline void kvm_sdei_set_##field(struct kvm_sdei_kvm_event *kske, \
> + unsigned int index) \
> +{ \
> + set_bit(index, (void *)(kske->state.field)); \
> +} \
> +static inline void kvm_sdei_clear_##field(struct kvm_sdei_kvm_event *kske, \
> + unsigned int index) \
> +{ \
> + clear_bit(index, (void *)(kske->state.field)); \
> +}
> +
> +KVM_SDEI_FLAG_FUNC(registered)
> +KVM_SDEI_FLAG_FUNC(enabled)
> +
> +/* APIs */
> +void kvm_sdei_init_vm(struct kvm *kvm);
> +void kvm_sdei_create_vcpu(struct kvm_vcpu *vcpu);
> +int kvm_sdei_hypercall(struct kvm_vcpu *vcpu);
> +void kvm_sdei_destroy_vcpu(struct kvm_vcpu *vcpu);
> +void kvm_sdei_destroy_vm(struct kvm *kvm);
> +
> +#endif /* __ARM64_KVM_SDEI_H__ */
> diff --git a/arch/arm64/include/uapi/asm/kvm.h b/arch/arm64/include/uapi/asm/kvm.h
> index b3edde68bc3e..e1b200bb6482 100644
> --- a/arch/arm64/include/uapi/asm/kvm.h
> +++ b/arch/arm64/include/uapi/asm/kvm.h
> @@ -36,6 +36,7 @@
> #include <linux/types.h>
> #include <asm/ptrace.h>
> #include <asm/sve_context.h>
> +#include <asm/kvm_sdei.h>
>
> #define __KVM_HAVE_GUEST_DEBUG
> #define __KVM_HAVE_IRQ_LINE
> diff --git a/arch/arm64/include/uapi/asm/kvm_sdei.h b/arch/arm64/include/uapi/asm/kvm_sdei.h
> new file mode 100644
> index 000000000000..8928027023f6
> --- /dev/null
> +++ b/arch/arm64/include/uapi/asm/kvm_sdei.h
> @@ -0,0 +1,60 @@
> +/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
> +/*
> + * Definitions of various KVM SDEI event states.
> + *
> + * Copyright (C) 2021 Red Hat, Inc.
> + *
> + * Author(s): Gavin Shan <[email protected]>
> + */
> +
> +#ifndef _UAPI__ASM_KVM_SDEI_H
> +#define _UAPI__ASM_KVM_SDEI_H
> +
> +#ifndef __ASSEMBLY__
> +#include <linux/types.h>
> +
> +#define KVM_SDEI_MAX_VCPUS 512
> +#define KVM_SDEI_INVALID_NUM 0
> +#define KVM_SDEI_DEFAULT_NUM 0x40400000

The motivation behind introducing such a uapi should be clearer (besides
just saying that it aims at migration). As it stands, this justification does
not make it possible to understand whether those structs are well suited. You
should document the migration process, I think.

I would remove _state suffix in all of them.
> +
> +struct kvm_sdei_event_state {
This is not really a state because it cannot be changed by the guest,
right? I would remove _state and just call it kvm_sdei_event
> + __u64 num;
> +
> + __u8 type;
> + __u8 signaled;
> + __u8 priority;
you need some padding to be 64-bit aligned. See in generic or aarch64
kvm.h for instance.
> +};
> +
> +struct kvm_sdei_kvm_event_state {
I would rename into kvm_sdei_registered_event or smth alike
> + __u64 num;
how does this num differ from the event state one?
> + __u32 refcount;
> +
> + __u8 route_mode;
padding also here. See for instance
https://lore.kernel.org/kvm/[email protected]/T/#m7bac2ff2b28a68f8d2196ec452afd3e46682760d

Maybe put the route_mode field and refcount at the end and add one
byte of padding?

Why can't we have a single sdei_event uapi representation where route
mode defaults to unset and refcount defaults to 0 when not registered?

> + __u64 route_affinity;
> + __u64 entries[KVM_SDEI_MAX_VCPUS];
> + __u64 params[KVM_SDEI_MAX_VCPUS];
I would rename entries into ep_address and params into ep_arg.
> + __u64 registered[KVM_SDEI_MAX_VCPUS/64];
maybe add a comment along with KVM_SDEI_MAX_VCPUS that it must be a
multiple of 64 (or a build check)

> + __u64 enabled[KVM_SDEI_MAX_VCPUS/64];
Also you may clarify what this gets used for with a shared event. I guess
this only makes sense for a private event which can be registered by
several EPs?
> +};
> +
> +struct kvm_sdei_vcpu_event_state {
> + __u64 num;
> + __u32 refcount;
how does it differ from num and refcount of the registered event?
padding++
> +};
> +
> +struct kvm_sdei_vcpu_regs {
> + __u64 regs[18];
> + __u64 pc;
> + __u64 pstate;
> +};
> +
> +struct kvm_sdei_vcpu_state {
> + __u8 masked;
padding++
> + __u64 critical_num;
> + __u64 normal_num;
> + struct kvm_sdei_vcpu_regs critical_regs;
> + struct kvm_sdei_vcpu_regs normal_regs;
> +};
> +
> +#endif /* !__ASSEMBLY__ */
> +#endif /* _UAPI__ASM_KVM_SDEI_H */
> diff --git a/arch/arm64/kvm/Makefile b/arch/arm64/kvm/Makefile
> index 989bb5dad2c8..eefca8ca394d 100644
> --- a/arch/arm64/kvm/Makefile
> +++ b/arch/arm64/kvm/Makefile
> @@ -16,7 +16,7 @@ kvm-y := $(KVM)/kvm_main.o $(KVM)/coalesced_mmio.o $(KVM)/eventfd.o \
> inject_fault.o va_layout.o handle_exit.o \
> guest.o debug.o reset.o sys_regs.o \
> vgic-sys-reg-v3.o fpsimd.o pmu.o \
> - arch_timer.o trng.o\
> + arch_timer.o trng.o sdei.o \
> vgic/vgic.o vgic/vgic-init.o \
> vgic/vgic-irqfd.o vgic/vgic-v2.o \
> vgic/vgic-v3.o vgic/vgic-v4.o \
> diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
> index e9a2b8f27792..2f021aa41632 100644
> --- a/arch/arm64/kvm/arm.c
> +++ b/arch/arm64/kvm/arm.c
> @@ -150,6 +150,8 @@ int kvm_arch_init_vm(struct kvm *kvm, unsigned long type)
>
> kvm_vgic_early_init(kvm);
>
> + kvm_sdei_init_vm(kvm);
> +
> /* The maximum number of VCPUs is limited by the host's GIC model */
> kvm->arch.max_vcpus = kvm_arm_default_max_vcpus();
>
> @@ -179,6 +181,8 @@ void kvm_arch_destroy_vm(struct kvm *kvm)
>
> kvm_vgic_destroy(kvm);
>
> + kvm_sdei_destroy_vm(kvm);
> +
> for (i = 0; i < KVM_MAX_VCPUS; ++i) {
> if (kvm->vcpus[i]) {
> kvm_vcpu_destroy(kvm->vcpus[i]);
> @@ -333,6 +337,8 @@ int kvm_arch_vcpu_create(struct kvm_vcpu *vcpu)
>
> kvm_arm_pvtime_vcpu_init(&vcpu->arch);
>
> + kvm_sdei_create_vcpu(vcpu);
> +
> vcpu->arch.hw_mmu = &vcpu->kvm->arch.mmu;
>
> err = kvm_vgic_vcpu_init(vcpu);
> @@ -354,6 +360,7 @@ void kvm_arch_vcpu_destroy(struct kvm_vcpu *vcpu)
> kvm_mmu_free_memory_cache(&vcpu->arch.mmu_page_cache);
> kvm_timer_vcpu_terminate(vcpu);
> kvm_pmu_vcpu_destroy(vcpu);
> + kvm_sdei_destroy_vcpu(vcpu);
>
> kvm_arm_vcpu_destroy(vcpu);
> }
> diff --git a/arch/arm64/kvm/hypercalls.c b/arch/arm64/kvm/hypercalls.c
> index 30da78f72b3b..d3fc893a4f58 100644
> --- a/arch/arm64/kvm/hypercalls.c
> +++ b/arch/arm64/kvm/hypercalls.c
> @@ -139,6 +139,24 @@ int kvm_hvc_call_handler(struct kvm_vcpu *vcpu)
> case ARM_SMCCC_TRNG_RND32:
> case ARM_SMCCC_TRNG_RND64:
> return kvm_trng_call(vcpu);
> + case SDEI_1_0_FN_SDEI_VERSION:
> + case SDEI_1_0_FN_SDEI_EVENT_REGISTER:
> + case SDEI_1_0_FN_SDEI_EVENT_ENABLE:
> + case SDEI_1_0_FN_SDEI_EVENT_DISABLE:
> + case SDEI_1_0_FN_SDEI_EVENT_CONTEXT:
> + case SDEI_1_0_FN_SDEI_EVENT_COMPLETE:
> + case SDEI_1_0_FN_SDEI_EVENT_COMPLETE_AND_RESUME:
> + case SDEI_1_0_FN_SDEI_EVENT_UNREGISTER:
> + case SDEI_1_0_FN_SDEI_EVENT_STATUS:
> + case SDEI_1_0_FN_SDEI_EVENT_GET_INFO:
> + case SDEI_1_0_FN_SDEI_EVENT_ROUTING_SET:
> + case SDEI_1_0_FN_SDEI_PE_MASK:
> + case SDEI_1_0_FN_SDEI_PE_UNMASK:
> + case SDEI_1_0_FN_SDEI_INTERRUPT_BIND:
> + case SDEI_1_0_FN_SDEI_INTERRUPT_RELEASE:
> + case SDEI_1_0_FN_SDEI_PRIVATE_RESET:
> + case SDEI_1_0_FN_SDEI_SHARED_RESET:
> + return kvm_sdei_hypercall(vcpu);
> default:
> return kvm_psci_call(vcpu);
> }
> diff --git a/arch/arm64/kvm/sdei.c b/arch/arm64/kvm/sdei.c
> new file mode 100644
> index 000000000000..ab330b74a965
> --- /dev/null
> +++ b/arch/arm64/kvm/sdei.c
> @@ -0,0 +1,198 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +/*
> + * SDEI virtualization support.
> + *
> + * Copyright (C) 2021 Red Hat, Inc.
> + *
> + * Author(s): Gavin Shan <[email protected]>
> + */
> +
> +#include <linux/kernel.h>
> +#include <linux/kvm_host.h>
> +#include <linux/spinlock.h>
> +#include <linux/slab.h>
> +#include <kvm/arm_hypercalls.h>
> +
> +static struct kvm_sdei_event_state defined_kse[] = {
> + { KVM_SDEI_DEFAULT_NUM,
> + SDEI_EVENT_TYPE_PRIVATE,
> + 1,
> + SDEI_EVENT_PRIORITY_CRITICAL
> + },
> +};
I understand from the above we currently only support a single static (~
platform) SDEI event with num = KVM_SDEI_DEFAULT_NUM. We do not support
bound events. You may add a comment here and maybe in the commit msg.
I would rename the variable into exported_events.
> +
> +static void kvm_sdei_remove_events(struct kvm *kvm)
> +{
> + struct kvm_sdei_kvm *ksdei = kvm->arch.sdei;
> + struct kvm_sdei_event *kse, *tmp;
> +
> + list_for_each_entry_safe(kse, tmp, &ksdei->events, link) {
> + list_del(&kse->link);
> + kfree(kse);
> + }
> +}
> +
> +static void kvm_sdei_remove_kvm_events(struct kvm *kvm,
> + unsigned int mask,
> + bool force)
> +{
> + struct kvm_sdei_kvm *ksdei = kvm->arch.sdei;
> + struct kvm_sdei_event *kse;
> + struct kvm_sdei_kvm_event *kske, *tmp;
> +
> + list_for_each_entry_safe(kske, tmp, &ksdei->kvm_events, link) {
> + kse = kske->kse;
> +
> + if (!((1 << kse->state.type) & mask))
> + continue;
don't you need to hold a lock before looping? What if somebody concurrently
changes the state fields, especially the refcount below?
> +
> + if (!force && kske->state.refcount)
> + continue;
Usually the refcount is used to control the lifetime of the object. The
'force' flag looks wrong in that context. Shouldn't you make sure all
users have released their refcounts and on the last decrement, delete
the object?
> +
> + list_del(&kske->link);
> + kfree(kske);
> + }
> +}
> +
> +static void kvm_sdei_remove_vcpu_events(struct kvm_vcpu *vcpu)
> +{
> + struct kvm_sdei_vcpu *vsdei = vcpu->arch.sdei;
> + struct kvm_sdei_vcpu_event *ksve, *tmp;
> +
> + list_for_each_entry_safe(ksve, tmp, &vsdei->critical_events, link) {
> + list_del(&ksve->link);
> + kfree(ksve);
> + }
> +
> + list_for_each_entry_safe(ksve, tmp, &vsdei->normal_events, link) {
> + list_del(&ksve->link);
> + kfree(ksve);
> + }
> +}
> +
> +int kvm_sdei_hypercall(struct kvm_vcpu *vcpu)
> +{
> + u32 func = smccc_get_function(vcpu);
> + bool has_result = true;
> + unsigned long ret;
> +
> + switch (func) {
> + case SDEI_1_0_FN_SDEI_VERSION:
> + case SDEI_1_0_FN_SDEI_EVENT_REGISTER:
> + case SDEI_1_0_FN_SDEI_EVENT_ENABLE:
> + case SDEI_1_0_FN_SDEI_EVENT_DISABLE:
> + case SDEI_1_0_FN_SDEI_EVENT_CONTEXT:
> + case SDEI_1_0_FN_SDEI_EVENT_COMPLETE:
> + case SDEI_1_0_FN_SDEI_EVENT_COMPLETE_AND_RESUME:
> + case SDEI_1_0_FN_SDEI_EVENT_UNREGISTER:
> + case SDEI_1_0_FN_SDEI_EVENT_STATUS:
> + case SDEI_1_0_FN_SDEI_EVENT_GET_INFO:
> + case SDEI_1_0_FN_SDEI_EVENT_ROUTING_SET:
> + case SDEI_1_0_FN_SDEI_PE_MASK:
> + case SDEI_1_0_FN_SDEI_PE_UNMASK:
> + case SDEI_1_0_FN_SDEI_INTERRUPT_BIND:
> + case SDEI_1_0_FN_SDEI_INTERRUPT_RELEASE:
> + case SDEI_1_0_FN_SDEI_PRIVATE_RESET:
> + case SDEI_1_0_FN_SDEI_SHARED_RESET:
> + default:
> + ret = SDEI_NOT_SUPPORTED;
> + }
> +
> + /*
> + * We don't have return value for COMPLETE or COMPLETE_AND_RESUME
> + * hypercalls. Otherwise, the restored context will be corrupted.
> + */
> + if (has_result)
> + smccc_set_retval(vcpu, ret, 0, 0, 0);
If I understand the above comment, COMPLETE and COMPLETE_AND_RESUME
should have has_result set to false whereas in that case they will
return NOT_SUPPORTED. Is that OK for the context restore?
> +
> + return 1;
> +}
> +
> +void kvm_sdei_init_vm(struct kvm *kvm)
> +{
> + struct kvm_sdei_kvm *ksdei;
> + struct kvm_sdei_event *kse;
> + int i;
> +
> + ksdei = kzalloc(sizeof(*ksdei), GFP_KERNEL);
> + if (!ksdei)
> + return;
> +
> + spin_lock_init(&ksdei->lock);
> + INIT_LIST_HEAD(&ksdei->events);
> + INIT_LIST_HEAD(&ksdei->kvm_events);
> +
> + /*
> + * Populate the defined KVM SDEI events. The whole functionality
> + * will be disabled on any errors.
You should definitely revise your naming conventions. This creates
confusion between exported events and registered events. Why not
simply adopt the spec terminology?
> + */
> + for (i = 0; i < ARRAY_SIZE(defined_kse); i++) {
> + kse = kzalloc(sizeof(*kse), GFP_KERNEL);
> + if (!kse) {
> + kvm_sdei_remove_events(kvm);
> + kfree(ksdei);
> + return;
> + }
Add a comment saying that, even though we currently support a single static
event, we prepare for binding support by building a list of exposed events?

Or maybe simplify the implementation at this stage of the development
assuming a single platform event is supported?
> +
> + kse->kvm = kvm;
> + kse->state = defined_kse[i];
> + list_add_tail(&kse->link, &ksdei->events);
> + }
> +
> + kvm->arch.sdei = ksdei;
> +}
> +
> +void kvm_sdei_create_vcpu(struct kvm_vcpu *vcpu)
> +{
> + struct kvm *kvm = vcpu->kvm;
> + struct kvm_sdei_vcpu *vsdei;
> +
> + if (!kvm->arch.sdei)
> + return;
> +
> + vsdei = kzalloc(sizeof(*vsdei), GFP_KERNEL);
> + if (!vsdei)
> + return;
> +
> + spin_lock_init(&vsdei->lock);
> + vsdei->state.masked = 1;
> + vsdei->state.critical_num = KVM_SDEI_INVALID_NUM;
> + vsdei->state.normal_num = KVM_SDEI_INVALID_NUM;
> + vsdei->critical_event = NULL;
> + vsdei->normal_event = NULL;
> + INIT_LIST_HEAD(&vsdei->critical_events);
> + INIT_LIST_HEAD(&vsdei->normal_events);
> +
> + vcpu->arch.sdei = vsdei;
> +}
> +
> +void kvm_sdei_destroy_vcpu(struct kvm_vcpu *vcpu)
> +{
> + struct kvm_sdei_vcpu *vsdei = vcpu->arch.sdei;
> +
> + if (vsdei) {
> + spin_lock(&vsdei->lock);
> + kvm_sdei_remove_vcpu_events(vcpu);
> + spin_unlock(&vsdei->lock);
> +
> + kfree(vsdei);
> + vcpu->arch.sdei = NULL;
> + }
> +}
> +
> +void kvm_sdei_destroy_vm(struct kvm *kvm)
> +{
> + struct kvm_sdei_kvm *ksdei = kvm->arch.sdei;
> + unsigned int mask = (1 << SDEI_EVENT_TYPE_PRIVATE) |
> + (1 << SDEI_EVENT_TYPE_SHARED);
> +
> + if (ksdei) {
> + spin_lock(&ksdei->lock);
> + kvm_sdei_remove_kvm_events(kvm, mask, true);
> + kvm_sdei_remove_events(kvm);
> + spin_unlock(&ksdei->lock);
> +
> + kfree(ksdei);
> + kvm->arch.sdei = NULL;
> + }
> +}
>
Thanks

Eric

2021-11-10 00:04:19

by Eric Auger

Subject: Re: [PATCH v4 05/21] KVM: arm64: Support SDEI_EVENT_{ENABLE, DISABLE} hypercall

Hi Gavin,

On 8/15/21 2:13 AM, Gavin Shan wrote:
> This supports SDEI_EVENT_{ENABLE, DISABLE} hypercall. After SDEI
> event is registered by guest, it won't be delivered to the guest
> until it's enabled. On the other hand, the SDEI event won't be
> raised to the guest or specific vCPU if it's has been disabled
> on the guest or specific vCPU.
>
> Signed-off-by: Gavin Shan <[email protected]>
> ---
> arch/arm64/kvm/sdei.c | 68 +++++++++++++++++++++++++++++++++++++++++++
> 1 file changed, 68 insertions(+)
>
> diff --git a/arch/arm64/kvm/sdei.c b/arch/arm64/kvm/sdei.c
> index d3ea3eee154b..b022ce0a202b 100644
> --- a/arch/arm64/kvm/sdei.c
> +++ b/arch/arm64/kvm/sdei.c
> @@ -206,6 +206,70 @@ static unsigned long kvm_sdei_hypercall_register(struct kvm_vcpu *vcpu)
> return ret;
> }
>
> +static unsigned long kvm_sdei_hypercall_enable(struct kvm_vcpu *vcpu,
> + bool enable)
> +{
> + struct kvm *kvm = vcpu->kvm;
> + struct kvm_sdei_kvm *ksdei = kvm->arch.sdei;
> + struct kvm_sdei_vcpu *vsdei = vcpu->arch.sdei;
> + struct kvm_sdei_event *kse = NULL;
> + struct kvm_sdei_kvm_event *kske = NULL;
> + unsigned long event_num = smccc_get_arg1(vcpu);
> + int index = 0;
> + unsigned long ret = SDEI_SUCCESS;
> +
> + /* Sanity check */
> + if (!(ksdei && vsdei)) {
> + ret = SDEI_NOT_SUPPORTED;
> + goto out;
> + }
> +
> + if (!kvm_sdei_is_valid_event_num(event_num)) {
I would rename into is_exposed_event_num()
> + ret = SDEI_INVALID_PARAMETERS;
> + goto out;
> + }
> +
> + /* Check if the KVM event exists */
> + spin_lock(&ksdei->lock);
> + kske = kvm_sdei_find_kvm_event(kvm, event_num);
> + if (!kske) {
> + ret = SDEI_INVALID_PARAMETERS;
should be DENIED according to the spec, i.e. nobody registered that event?
> + goto unlock;
> + }
> +
> + /* Check if there is pending events */
does that match the "handler-unregister-pending state" case mentioned
in the spec?
> + if (kske->state.refcount) {
> + ret = SDEI_PENDING;
? SDEI_PENDING is not documented in my copy of the DEN0054A spec. DENIED?
> + goto unlock;
> + }
> +
> + /* Check if it has been registered */
isn't this a duplicate of /* Check if the KVM event exists */ ?
> + kse = kske->kse;
> + index = (kse->state.type == SDEI_EVENT_TYPE_PRIVATE) ?
> + vcpu->vcpu_idx : 0;
> + if (!kvm_sdei_is_registered(kske, index)) {
> + ret = SDEI_DENIED;
> + goto unlock;
> + }
> +
> + /* Verify its enablement state */
> + if (enable == kvm_sdei_is_enabled(kske, index)) {
spec says:
Enabling/disabling an event which is already enabled/disabled is
permitted and has no effect. I guess ret should be OK.
> + ret = SDEI_DENIED;
> + goto unlock;
> + }
> +
> + /* Update enablement state */
> + if (enable)
> + kvm_sdei_set_enabled(kske, index);
> + else
> + kvm_sdei_clear_enabled(kske, index);
> +
> +unlock:
> + spin_unlock(&ksdei->lock);
> +out:
> + return ret;
> +}
> +
> int kvm_sdei_hypercall(struct kvm_vcpu *vcpu)
> {
> u32 func = smccc_get_function(vcpu);
> @@ -220,7 +284,11 @@ int kvm_sdei_hypercall(struct kvm_vcpu *vcpu)
> ret = kvm_sdei_hypercall_register(vcpu);
> break;
> case SDEI_1_0_FN_SDEI_EVENT_ENABLE:
> + ret = kvm_sdei_hypercall_enable(vcpu, true);
> + break;
> case SDEI_1_0_FN_SDEI_EVENT_DISABLE:
> + ret = kvm_sdei_hypercall_enable(vcpu, false);
> + break;
> case SDEI_1_0_FN_SDEI_EVENT_CONTEXT:
> case SDEI_1_0_FN_SDEI_EVENT_COMPLETE:
> case SDEI_1_0_FN_SDEI_EVENT_COMPLETE_AND_RESUME:
>
Eric
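
For readers following the enable/disable discussion above, the per-event
bookkeeping can be sketched as standalone C. The struct layout and names
are illustrative assumptions, not the patch's actual internal API; only
the index computation mirrors the quoted code (one bit per vCPU for
private events, bit 0 for shared ones):

```c
#include <assert.h>
#include <stdint.h>

#define MAX_VCPUS 64

/* Illustrative event state: one registered/enabled bit per index. */
struct sdei_event_state {
	int is_private;      /* SDEI_EVENT_TYPE_PRIVATE vs SHARED */
	uint64_t registered; /* one bit per vCPU (private) or bit 0 (shared) */
	uint64_t enabled;
};

static int event_index(const struct sdei_event_state *e, int vcpu_idx)
{
	/* Same selection as the quoted code: private events are per-vCPU. */
	return e->is_private ? vcpu_idx : 0;
}

static int sdei_event_enable(struct sdei_event_state *e, int vcpu_idx,
			     int enable)
{
	int idx = event_index(e, vcpu_idx);

	if (!(e->registered & (1ULL << idx)))
		return -1; /* SDEI_DENIED: event not registered */

	/* Repeating an enable/disable is permitted and has no effect. */
	if (enable)
		e->enabled |= 1ULL << idx;
	else
		e->enabled &= ~(1ULL << idx);
	return 0; /* SDEI_SUCCESS */
}
```

Note the idempotent enable path, matching the spec excerpt quoted in the
review rather than the DENIED return in the patch.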

2021-11-10 00:05:49

by Eric Auger

Subject: Re: [PATCH v4 04/21] KVM: arm64: Support SDEI_EVENT_REGISTER hypercall

Hi Gavin,
On 8/15/21 2:13 AM, Gavin Shan wrote:
> This supports SDEI_EVENT_REGISTER hypercall, which is used by guest
> to register SDEI events. The SDEI event won't be raised to the guest
> or specific vCPU until it's registered and enabled explicitly.
>
> Only those events that have been exported by KVM can be registered.
> After the event is registered successfully, the KVM SDEI event (object)
> is created or updated because the same KVM SDEI event is shared by
revisit the terminology (KVM SDEI event). The same SDEI registered event
object is shared by multiple vCPUs if it is a private event.
> multiple vCPUs if it's a private event.>
> Signed-off-by: Gavin Shan <[email protected]>
> ---
> arch/arm64/kvm/sdei.c | 122 ++++++++++++++++++++++++++++++++++++++++++
> 1 file changed, 122 insertions(+)
>
> diff --git a/arch/arm64/kvm/sdei.c b/arch/arm64/kvm/sdei.c
> index aa9485f076a9..d3ea3eee154b 100644
> --- a/arch/arm64/kvm/sdei.c
> +++ b/arch/arm64/kvm/sdei.c
> @@ -21,6 +21,20 @@ static struct kvm_sdei_event_state defined_kse[] = {
> },
> };
>
> +static struct kvm_sdei_event *kvm_sdei_find_event(struct kvm *kvm,
> + unsigned long num)
> +{
> + struct kvm_sdei_kvm *ksdei = kvm->arch.sdei;
> + struct kvm_sdei_event *kse;
the 'k' prefix everywhere for your local variables is unneeded.
> +
> + list_for_each_entry(kse, &ksdei->events, link) {
> + if (kse->state.num == num)
> + return kse;
> + }
> +
> + return NULL;
> +}
> +
> static void kvm_sdei_remove_events(struct kvm *kvm)
> {
> struct kvm_sdei_kvm *ksdei = kvm->arch.sdei;
> @@ -32,6 +46,20 @@ static void kvm_sdei_remove_events(struct kvm *kvm)
> }
> }
>
> +static struct kvm_sdei_kvm_event *kvm_sdei_find_kvm_event(struct kvm *kvm,
> + unsigned long num)
> +{
> + struct kvm_sdei_kvm *ksdei = kvm->arch.sdei;
> + struct kvm_sdei_kvm_event *kske;
> +
> + list_for_each_entry(kske, &ksdei->kvm_events, link) {
> + if (kske->state.num == num)
I still don't get the difference between the num of an SDEI event and
the num of a so-called SDEI kvm event. Event numbers are either static
or dynamically created using bind ops, which you do not support. But to
me this is a property of the root exposed SDEI event and not a property
of the registered event. Could you please clarify?
> + return kske;
> + }
> +
> + return NULL;
> +}
> +
> static void kvm_sdei_remove_kvm_events(struct kvm *kvm,
> unsigned int mask,
> bool force)
> @@ -86,6 +114,98 @@ static unsigned long kvm_sdei_hypercall_version(struct kvm_vcpu *vcpu)
> return ret;
> }
>
> +static unsigned long kvm_sdei_hypercall_register(struct kvm_vcpu *vcpu)
> +{
> + struct kvm *kvm = vcpu->kvm;
> + struct kvm_sdei_kvm *ksdei = kvm->arch.sdei;
> + struct kvm_sdei_vcpu *vsdei = vcpu->arch.sdei;
> + struct kvm_sdei_event *kse = NULL;
> + struct kvm_sdei_kvm_event *kske = NULL;
> + unsigned long event_num = smccc_get_arg1(vcpu);
> + unsigned long event_entry = smccc_get_arg2(vcpu);
> + unsigned long event_param = smccc_get_arg3(vcpu);
> + unsigned long route_mode = smccc_get_arg4(vcpu);
> + unsigned long route_affinity = smccc_get_arg5(vcpu);
> + int index = vcpu->vcpu_idx;
> + unsigned long ret = SDEI_SUCCESS;
> +
> + /* Sanity check */
> + if (!(ksdei && vsdei)) {
> + ret = SDEI_NOT_SUPPORTED;
> + goto out;
> + }
> +
> + if (!kvm_sdei_is_valid_event_num(event_num)) {
> + ret = SDEI_INVALID_PARAMETERS;
> + goto out;
> + }
> +
> + if (!(route_mode == SDEI_EVENT_REGISTER_RM_ANY ||
> + route_mode == SDEI_EVENT_REGISTER_RM_PE)) {
> + ret = SDEI_INVALID_PARAMETERS;
> + goto out;
> + }
> +
> + /*
> + * The KVM event could have been created if it's a private event.
> + * We needn't create a KVM event in this case.
s/create a KVM event/to create another KVM event instance
> + */
> + spin_lock(&ksdei->lock);
> + kske = kvm_sdei_find_kvm_event(kvm, event_num);
> + if (kske) {
> + kse = kske->kse;
> + index = (kse->state.type == SDEI_EVENT_TYPE_PRIVATE) ?
> + vcpu->vcpu_idx : 0;
> +
> + if (kvm_sdei_is_registered(kske, index)) {
> + ret = SDEI_DENIED;
> + goto unlock;
> + }
> +
> + kske->state.route_mode = route_mode;
> + kske->state.route_affinity = route_affinity;
> + kske->state.entries[index] = event_entry;
> + kske->state.params[index] = event_param;
> + kvm_sdei_set_registered(kske, index);
> + goto unlock;
> + }
> +
> + /* Check if the event number has been registered */
> + kse = kvm_sdei_find_event(kvm, event_num);
I don't get the comment. find_event looks up exposed events, not
registered events, right? So maybe this is the first thing to check,
i.e. that the num matches one exposed event.
> + if (!kse) {
> + ret = SDEI_INVALID_PARAMETERS;
> + goto unlock;
> + }
> +
> + /* Create KVM event */
> + kske = kzalloc(sizeof(*kske), GFP_KERNEL);
> + if (!kske) {
> + ret = SDEI_OUT_OF_RESOURCE;
> + goto unlock;
> + }
> +
> + /* Initialize KVM event state */
> + index = (kse->state.type == SDEI_EVENT_TYPE_PRIVATE) ?
> + vcpu->vcpu_idx : 0;
> + kske->state.num = event_num;
> + kske->state.refcount = 0;
> + kske->state.route_mode = route_affinity;
> + kske->state.route_affinity = route_affinity;
> + kske->state.entries[index] = event_entry;
> + kske->state.params[index] = event_param;
> + kvm_sdei_set_registered(kske, index);
> +
> + /* Initialize KVM event */
> + kske->kse = kse;
> + kske->kvm = kvm;
> + list_add_tail(&kske->link, &ksdei->kvm_events);
> +
> +unlock:
> + spin_unlock(&ksdei->lock);
> +out:
> + return ret;
> +}
> +
> int kvm_sdei_hypercall(struct kvm_vcpu *vcpu)
> {
> u32 func = smccc_get_function(vcpu);
> @@ -97,6 +217,8 @@ int kvm_sdei_hypercall(struct kvm_vcpu *vcpu)
> ret = kvm_sdei_hypercall_version(vcpu);
> break;
> case SDEI_1_0_FN_SDEI_EVENT_REGISTER:
> + ret = kvm_sdei_hypercall_register(vcpu);
> + break;
> case SDEI_1_0_FN_SDEI_EVENT_ENABLE:
> case SDEI_1_0_FN_SDEI_EVENT_DISABLE:
> case SDEI_1_0_FN_SDEI_EVENT_CONTEXT:
>
Thanks

Eric
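
The validity check questioned above comes down to the event-number
layout from the cover letter: bits[31:24] carry the event properties,
bits[23:0] the unique number, and bits[23:22] the owner, with KVM-owned
events using 0b01. A minimal sketch of that owner check, with assumed
macro names (the actual mask/shift constants are not shown in the
quoted hunks):

```c
#include <assert.h>
#include <stdint.h>

/* Assumed encoding, per the cover letter: bits[23:22] = owner field. */
#define SDEI_EVENT_NUM_OWNER_SHIFT 22
#define SDEI_EVENT_NUM_OWNER_MASK  0x3u
#define SDEI_EVENT_NUM_OWNER_KVM   0x1u /* 0b01: owned by KVM */

static int sdei_event_num_is_kvm_owned(uint32_t num)
{
	uint32_t owner = (num >> SDEI_EVENT_NUM_OWNER_SHIFT) &
			 SDEI_EVENT_NUM_OWNER_MASK;

	/* Only KVM-owned events are exposed by this series. */
	return owner == SDEI_EVENT_NUM_OWNER_KVM;
}
```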

2021-11-10 00:09:09

by Eric Auger

Subject: Re: [PATCH v4 08/21] KVM: arm64: Support SDEI_EVENT_STATUS hypercall



On 8/15/21 2:13 AM, Gavin Shan wrote:
> This supports SDEI_EVENT_STATUS hypercall. It's used by the guest
> to retrieve a bitmap to indicate the SDEI event states, including
> registration, enablement and delivery state.
>
> Signed-off-by: Gavin Shan <[email protected]>
> ---
> arch/arm64/kvm/sdei.c | 50 +++++++++++++++++++++++++++++++++++++++++++
> 1 file changed, 50 insertions(+)
>
> diff --git a/arch/arm64/kvm/sdei.c b/arch/arm64/kvm/sdei.c
> index a3ba69dc91cb..b95b8c4455e1 100644
> --- a/arch/arm64/kvm/sdei.c
> +++ b/arch/arm64/kvm/sdei.c
> @@ -367,6 +367,54 @@ static unsigned long kvm_sdei_hypercall_unregister(struct kvm_vcpu *vcpu)
> return ret;
> }
>
> +static unsigned long kvm_sdei_hypercall_status(struct kvm_vcpu *vcpu)
> +{
> + struct kvm *kvm = vcpu->kvm;
> + struct kvm_sdei_kvm *ksdei = kvm->arch.sdei;
> + struct kvm_sdei_vcpu *vsdei = vcpu->arch.sdei;
> + struct kvm_sdei_event *kse = NULL;
> + struct kvm_sdei_kvm_event *kske = NULL;
> + unsigned long event_num = smccc_get_arg1(vcpu);
> + int index = 0;
> + unsigned long ret = SDEI_SUCCESS;
> +
> + /* Sanity check */
> + if (!(ksdei && vsdei)) {
> + ret = SDEI_NOT_SUPPORTED;
> + goto out;
> + }
> +
> + if (!kvm_sdei_is_valid_event_num(event_num)) {
> + ret = SDEI_INVALID_PARAMETERS;
> + goto out;
> + }
If we were to support bound events, I do not know whether a given
event num could disappear in between that check and the rest of the
code, in which case a lock would be needed?
> +
> + /*
> + * Check if the KVM event exists. None of the flags
> + * will be set if it doesn't exist.
> + */
> + spin_lock(&ksdei->lock);
> + kske = kvm_sdei_find_kvm_event(kvm, event_num);
> + if (!kske) {
> + ret = 0;
> + goto unlock;
> + }
> +
> + index = (kse->state.type == SDEI_EVENT_TYPE_PRIVATE) ?
> + vcpu->vcpu_idx : 0;
> + if (kvm_sdei_is_registered(kske, index))
> + ret |= (1UL << SDEI_EVENT_STATUS_REGISTERED);
> + if (kvm_sdei_is_enabled(kske, index))
> + ret |= (1UL << SDEI_EVENT_STATUS_ENABLED);
> + if (kske->state.refcount)
> + ret |= (1UL << SDEI_EVENT_STATUS_RUNNING);
> +
> +unlock:
> + spin_unlock(&ksdei->lock);
> +out:
> + return ret;
> +}
> +
> int kvm_sdei_hypercall(struct kvm_vcpu *vcpu)
> {
> u32 func = smccc_get_function(vcpu);
> @@ -395,6 +443,8 @@ int kvm_sdei_hypercall(struct kvm_vcpu *vcpu)
> ret = kvm_sdei_hypercall_unregister(vcpu);
> break;
> case SDEI_1_0_FN_SDEI_EVENT_STATUS:
> + ret = kvm_sdei_hypercall_status(vcpu);
> + break;
> case SDEI_1_0_FN_SDEI_EVENT_GET_INFO:
> case SDEI_1_0_FN_SDEI_EVENT_ROUTING_SET:
> case SDEI_1_0_FN_SDEI_PE_MASK:
>
Thanks

Eric
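
The EVENT_STATUS result built in the quoted hunk is just a bitmap of
registered/enabled/running states, with an unregistered event yielding
0 (the "!kske" path). A rough standalone model, with the bit positions
taken from the SDEI_EVENT_STATUS_* usage above:

```c
#include <assert.h>

/* Bit positions as used with SDEI_EVENT_STATUS_* in the quoted hunk. */
#define STATUS_REGISTERED 0
#define STATUS_ENABLED    1
#define STATUS_RUNNING    2

static unsigned long sdei_event_status(int registered, int enabled,
				       int running)
{
	unsigned long ret = 0;

	if (registered)
		ret |= 1UL << STATUS_REGISTERED;
	if (enabled)
		ret |= 1UL << STATUS_ENABLED;
	if (running) /* the patch approximates this with refcount != 0 */
		ret |= 1UL << STATUS_RUNNING;

	return ret;
}
```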

2021-11-10 00:09:34

by Eric Auger

Subject: Re: [PATCH v4 07/21] KVM: arm64: Support SDEI_EVENT_UNREGISTER hypercall



On 8/15/21 2:13 AM, Gavin Shan wrote:
> This supports SDEI_EVENT_UNREGISTER hypercall. It's used by the
> guest to unregister SDEI event. The SDEI event won't be raised to
> the guest or specific vCPU after it's unregistered successfully.
> It's notable the SDEI event is disabled automatically on the guest
> or specific vCPU once it's unregistered successfully.
>
> Signed-off-by: Gavin Shan <[email protected]>
> ---
> arch/arm64/kvm/sdei.c | 61 +++++++++++++++++++++++++++++++++++++++++++
> 1 file changed, 61 insertions(+)
>
> diff --git a/arch/arm64/kvm/sdei.c b/arch/arm64/kvm/sdei.c
> index b4162efda470..a3ba69dc91cb 100644
> --- a/arch/arm64/kvm/sdei.c
> +++ b/arch/arm64/kvm/sdei.c
> @@ -308,6 +308,65 @@ static unsigned long kvm_sdei_hypercall_context(struct kvm_vcpu *vcpu)
> return ret;
> }
>
> +static unsigned long kvm_sdei_hypercall_unregister(struct kvm_vcpu *vcpu)
> +{
> + struct kvm *kvm = vcpu->kvm;
> + struct kvm_sdei_kvm *ksdei = kvm->arch.sdei;
> + struct kvm_sdei_vcpu *vsdei = vcpu->arch.sdei;
> + struct kvm_sdei_event *kse = NULL;
> + struct kvm_sdei_kvm_event *kske = NULL;
> + unsigned long event_num = smccc_get_arg1(vcpu);
> + int index = 0;
> + unsigned long ret = SDEI_SUCCESS;
> +
> + /* Sanity check */
> + if (!(ksdei && vsdei)) {
> + ret = SDEI_NOT_SUPPORTED;
> + goto out;
> + }
> +
> + if (!kvm_sdei_is_valid_event_num(event_num)) {
> + ret = SDEI_INVALID_PARAMETERS;
> + goto out;
> + }
> +
> + /* Check if the KVM event exists */
> + spin_lock(&ksdei->lock);
> + kske = kvm_sdei_find_kvm_event(kvm, event_num);
> + if (!kske) {
> + ret = SDEI_INVALID_PARAMETERS;
> + goto unlock;
> + }
> +
> + /* Check if there is pending events */
> + if (kske->state.refcount) {
> + ret = SDEI_PENDING;
Don't you want to record the fact that the unregistration is
outstanding, to perform subsequent actions? Otherwise nothing will
happen when the currently executing handlers complete?
> + goto unlock;
> + }
> +
> + /* Check if it has been registered */
> + kse = kske->kse;
> + index = (kse->state.type == SDEI_EVENT_TYPE_PRIVATE) ?
> + vcpu->vcpu_idx : 0;
You could have an inline helper for the above, as this is executed in
many functions, even including the code below.
> + if (!kvm_sdei_is_registered(kske, index)) {
> + ret = SDEI_DENIED;
> + goto unlock;
> + }
> +
> + /* The event is disabled when it's unregistered */
> + kvm_sdei_clear_enabled(kske, index);
> + kvm_sdei_clear_registered(kske, index);
> + if (kvm_sdei_empty_registered(kske)) {
a refcount mechanism would be cleaner I think.
> + list_del(&kske->link);
> + kfree(kske);
> + }
> +
> +unlock:
> + spin_unlock(&ksdei->lock);
> +out:
> + return ret;
> +}
> +
> int kvm_sdei_hypercall(struct kvm_vcpu *vcpu)
> {
> u32 func = smccc_get_function(vcpu);
> @@ -333,6 +392,8 @@ int kvm_sdei_hypercall(struct kvm_vcpu *vcpu)
> case SDEI_1_0_FN_SDEI_EVENT_COMPLETE:
> case SDEI_1_0_FN_SDEI_EVENT_COMPLETE_AND_RESUME:
> case SDEI_1_0_FN_SDEI_EVENT_UNREGISTER:
> + ret = kvm_sdei_hypercall_unregister(vcpu);
> + break;
> case SDEI_1_0_FN_SDEI_EVENT_STATUS:
> case SDEI_1_0_FN_SDEI_EVENT_GET_INFO:
> case SDEI_1_0_FN_SDEI_EVENT_ROUTING_SET:
>

2021-11-10 00:10:18

by Eric Auger

Subject: Re: [PATCH v4 09/21] KVM: arm64: Support SDEI_EVENT_GET_INFO hypercall

Hi Gavin,

On 8/15/21 2:13 AM, Gavin Shan wrote:
> This supports SDEI_EVENT_GET_INFO hypercall. It's used by the guest
> to retrieve various information about the supported (exported) events,
> including type, signaled, route mode and affinity for the shared
> events.
>
> Signed-off-by: Gavin Shan <[email protected]>
> ---
> arch/arm64/kvm/sdei.c | 76 +++++++++++++++++++++++++++++++++++++++++++
> 1 file changed, 76 insertions(+)
>
> diff --git a/arch/arm64/kvm/sdei.c b/arch/arm64/kvm/sdei.c
> index b95b8c4455e1..5dfa74b093f1 100644
> --- a/arch/arm64/kvm/sdei.c
> +++ b/arch/arm64/kvm/sdei.c
> @@ -415,6 +415,80 @@ static unsigned long kvm_sdei_hypercall_status(struct kvm_vcpu *vcpu)
> return ret;
> }
>
> +static unsigned long kvm_sdei_hypercall_info(struct kvm_vcpu *vcpu)
> +{
> + struct kvm *kvm = vcpu->kvm;
> + struct kvm_sdei_kvm *ksdei = kvm->arch.sdei;
> + struct kvm_sdei_vcpu *vsdei = vcpu->arch.sdei;
> + struct kvm_sdei_event *kse = NULL;
> + struct kvm_sdei_kvm_event *kske = NULL;
> + unsigned long event_num = smccc_get_arg1(vcpu);
> + unsigned long event_info = smccc_get_arg2(vcpu);
> + unsigned long ret = SDEI_SUCCESS;
> +
> + /* Sanity check */
> + if (!(ksdei && vsdei)) {
> + ret = SDEI_NOT_SUPPORTED;
> + goto out;
> + }
> +
> + if (!kvm_sdei_is_valid_event_num(event_num)) {
> + ret = SDEI_INVALID_PARAMETERS;
> + goto out;
> + }
> +
> + /*
> + * Check if the KVM event exists. The event might have been
> + * registered, we need fetch the information from the registered
s/fetch/to fetch
> + * event in that case.
> + */
> + spin_lock(&ksdei->lock);
> + kske = kvm_sdei_find_kvm_event(kvm, event_num);
> + kse = kske ? kske->kse : NULL;
> + if (!kse) {
> + kse = kvm_sdei_find_event(kvm, event_num);
> + if (!kse) {
> + ret = SDEI_INVALID_PARAMETERS;
This should already have been covered by !kvm_sdei_is_valid_event_num,
I think (although the latter only checks the static event numbers with
the KVM owner mask).
> + goto unlock;
> + }
> + }
> +
> + /* Retrieve the requested information */
> + switch (event_info) {
> + case SDEI_EVENT_INFO_EV_TYPE:
> + ret = kse->state.type;
> + break;
> + case SDEI_EVENT_INFO_EV_SIGNALED:
> + ret = kse->state.signaled;
> + break;
> + case SDEI_EVENT_INFO_EV_PRIORITY:
> + ret = kse->state.priority;
> + break;
> + case SDEI_EVENT_INFO_EV_ROUTING_MODE:
> + case SDEI_EVENT_INFO_EV_ROUTING_AFF:
> + if (kse->state.type != SDEI_EVENT_TYPE_SHARED) {
> + ret = SDEI_INVALID_PARAMETERS;
> + break;
> + }
> +
> + if (event_info == SDEI_EVENT_INFO_EV_ROUTING_MODE) {
> + ret = kske ? kske->state.route_mode :
> + SDEI_EVENT_REGISTER_RM_ANY;
no, if event is not registered (!kske) DENIED should be returned
> + } else {
same here
> + ret = kske ? kske->state.route_affinity : 0;
> + }
> +
> + break;
> + default:
> + ret = SDEI_INVALID_PARAMETERS;
> + }
> +
> +unlock:
> + spin_unlock(&ksdei->lock);
> +out:
> + return ret;
> +}
> +
> int kvm_sdei_hypercall(struct kvm_vcpu *vcpu)
> {
> u32 func = smccc_get_function(vcpu);
> @@ -446,6 +520,8 @@ int kvm_sdei_hypercall(struct kvm_vcpu *vcpu)
> ret = kvm_sdei_hypercall_status(vcpu);
> break;
> case SDEI_1_0_FN_SDEI_EVENT_GET_INFO:
> + ret = kvm_sdei_hypercall_info(vcpu);
> + break;
> case SDEI_1_0_FN_SDEI_EVENT_ROUTING_SET:
> case SDEI_1_0_FN_SDEI_PE_MASK:
> case SDEI_1_0_FN_SDEI_PE_UNMASK:
>
Eric

2021-11-10 00:12:42

by Eric Auger

Subject: Re: [PATCH v4 10/21] KVM: arm64: Support SDEI_EVENT_ROUTING_SET hypercall

Hi Gavin,
On 8/15/21 2:13 AM, Gavin Shan wrote:
> This supports SDEI_EVENT_ROUTING_SET hypercall. It's used by the
> guest to set route mode and affinity for the registered KVM event.
> It's only valid for the shared events. It's not allowed to do so
> when the corresponding event has been raised to the guest.
>
> Signed-off-by: Gavin Shan <[email protected]>
> ---
> arch/arm64/kvm/sdei.c | 64 +++++++++++++++++++++++++++++++++++++++++++
> 1 file changed, 64 insertions(+)
>
> diff --git a/arch/arm64/kvm/sdei.c b/arch/arm64/kvm/sdei.c
> index 5dfa74b093f1..458695c2394f 100644
> --- a/arch/arm64/kvm/sdei.c
> +++ b/arch/arm64/kvm/sdei.c
> @@ -489,6 +489,68 @@ static unsigned long kvm_sdei_hypercall_info(struct kvm_vcpu *vcpu)
> return ret;
> }
>
> +static unsigned long kvm_sdei_hypercall_route(struct kvm_vcpu *vcpu)
> +{
> + struct kvm *kvm = vcpu->kvm;
> + struct kvm_sdei_kvm *ksdei = kvm->arch.sdei;
> + struct kvm_sdei_vcpu *vsdei = vcpu->arch.sdei;
> + struct kvm_sdei_event *kse = NULL;
> + struct kvm_sdei_kvm_event *kske = NULL;
> + unsigned long event_num = smccc_get_arg1(vcpu);
> + unsigned long route_mode = smccc_get_arg2(vcpu);
> + unsigned long route_affinity = smccc_get_arg3(vcpu);
> + int index = 0;
> + unsigned long ret = SDEI_SUCCESS;
> +
> + /* Sanity check */
> + if (!(ksdei && vsdei)) {
> + ret = SDEI_NOT_SUPPORTED;
> + goto out;
> + }
> +
> + if (!kvm_sdei_is_valid_event_num(event_num)) {
> + ret = SDEI_INVALID_PARAMETERS;
> + goto out;
> + }
> +
> + if (!(route_mode == SDEI_EVENT_REGISTER_RM_ANY ||
> + route_mode == SDEI_EVENT_REGISTER_RM_PE)) {
> + ret = SDEI_INVALID_PARAMETERS;
> + goto out;
> + }
Some sanity checking on the affinity arg could be done as well,
according to the affinity description in 5.1.2. The function shall
return INVALID_PARAMETERS in case of an invalid affinity.
> +
> + /* Check if the KVM event has been registered */
> + spin_lock(&ksdei->lock);
> + kske = kvm_sdei_find_kvm_event(kvm, event_num);
> + if (!kske) {
> + ret = SDEI_INVALID_PARAMETERS;
> + goto unlock;
> + }
> +
> + /* Validate KVM event state */
> + kse = kske->kse;
> + if (kse->state.type != SDEI_EVENT_TYPE_SHARED) {
> + ret = SDEI_INVALID_PARAMETERS;
> + goto unlock;
> + }
> +
Event handler is in a state other than: handler-registered.
> + if (!kvm_sdei_is_registered(kske, index) ||
> + kvm_sdei_is_enabled(kske, index) ||
> + kske->state.refcount) {
I am not sure about the refcount role here. Does it make sure the
state is != handler-enabled-and-running or handler-unregister-pending?

I think we would gain in readability if we had a helper to check
whether we are in those states.
> + ret = SDEI_DENIED;
> + goto unlock;
> + }
> +
> + /* Update state */
> + kske->state.route_mode = route_mode;
> + kske->state.route_affinity = route_affinity;
> +
> +unlock:
> + spin_unlock(&ksdei->lock);
> +out:
> + return ret;
> +}
> +
> int kvm_sdei_hypercall(struct kvm_vcpu *vcpu)
> {
> u32 func = smccc_get_function(vcpu);
> @@ -523,6 +585,8 @@ int kvm_sdei_hypercall(struct kvm_vcpu *vcpu)
> ret = kvm_sdei_hypercall_info(vcpu);
> break;
> case SDEI_1_0_FN_SDEI_EVENT_ROUTING_SET:
> + ret = kvm_sdei_hypercall_route(vcpu);
> + break;
> case SDEI_1_0_FN_SDEI_PE_MASK:
> case SDEI_1_0_FN_SDEI_PE_UNMASK:
> case SDEI_1_0_FN_SDEI_INTERRUPT_BIND:
>
Eric
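
The readability helper suggested in the review could look roughly like
this. Names are hypothetical, and a non-zero refcount stands in for the
running/unregister-pending states exactly as in the quoted hunk:

```c
#include <assert.h>

/* Hypothetical per-index event state, reduced to what ROUTING_SET
 * needs to check. */
struct event_state {
	int registered;
	int enabled;
	unsigned int refcount; /* != 0 approximates "handler running" */
};

/* ROUTING_SET is only permitted while the handler is registered but
 * neither enabled nor running. */
static int event_is_registered_and_idle(const struct event_state *s)
{
	return s->registered && !s->enabled && !s->refcount;
}
```

The caller would then return SDEI_DENIED whenever this helper returns
false, making the intended "handler-registered" state explicit.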

2021-11-10 00:18:08

by Eric Auger

Subject: Re: [PATCH v4 11/21] KVM: arm64: Support SDEI_PE_{MASK, UNMASK} hypercall

Hi Gavin,

On 8/15/21 2:13 AM, Gavin Shan wrote:
> This supports SDEI_PE_{MASK, UNMASK} hypercall. They are used by
> the guest to stop the specific vCPU from receiving SDEI events.
>
> Signed-off-by: Gavin Shan <[email protected]>
> ---
> arch/arm64/kvm/sdei.c | 35 +++++++++++++++++++++++++++++++++++
> 1 file changed, 35 insertions(+)
>
> diff --git a/arch/arm64/kvm/sdei.c b/arch/arm64/kvm/sdei.c
> index 458695c2394f..3fb33258b494 100644
> --- a/arch/arm64/kvm/sdei.c
> +++ b/arch/arm64/kvm/sdei.c
> @@ -551,6 +551,37 @@ static unsigned long kvm_sdei_hypercall_route(struct kvm_vcpu *vcpu)
> return ret;
> }
>
> +static unsigned long kvm_sdei_hypercall_mask(struct kvm_vcpu *vcpu,
> + bool mask)
> +{
> + struct kvm *kvm = vcpu->kvm;
> + struct kvm_sdei_kvm *ksdei = kvm->arch.sdei;
> + struct kvm_sdei_vcpu *vsdei = vcpu->arch.sdei;
> + unsigned long ret = SDEI_SUCCESS;
> +
> + /* Sanity check */
> + if (!(ksdei && vsdei)) {
> + ret = SDEI_NOT_SUPPORTED;
> + goto out;
> + }
> +
> + spin_lock(&vsdei->lock);
> +
> + /* Check the state */
> + if (mask == vsdei->state.masked) {
> + ret = SDEI_DENIED;
Are you sure? I don't see this error documented in 5.1.12.

Besides the spec says:
"
This call can be invoked by the client to mask the PE, whether or not
the PE is already masked."
> + goto unlock;
> + }
> +
> + /* Update the state */
> + vsdei->state.masked = mask ? 1 : 0;
> +
> +unlock:
> + spin_unlock(&vsdei->lock);
> +out:
> + return ret;
In case of success the returned value is SUCCESS for UNMASK but not
for MASK (see the table in 5.1.12).

By the way, I have just noticed there is a more recent version of the
spec than the A revision:

ARM_DEN0054C

You should update the cover letter and [PATCH v4 02/21] KVM: arm64: Add
SDEI virtualization infrastructure commit msg


> +}
> +
> int kvm_sdei_hypercall(struct kvm_vcpu *vcpu)
> {
> u32 func = smccc_get_function(vcpu);
> @@ -588,7 +619,11 @@ int kvm_sdei_hypercall(struct kvm_vcpu *vcpu)
> ret = kvm_sdei_hypercall_route(vcpu);
> break;
> case SDEI_1_0_FN_SDEI_PE_MASK:
> + ret = kvm_sdei_hypercall_mask(vcpu, true);
> + break;
> case SDEI_1_0_FN_SDEI_PE_UNMASK:
> + ret = kvm_sdei_hypercall_mask(vcpu, false);
> + break;
> case SDEI_1_0_FN_SDEI_INTERRUPT_BIND:
> case SDEI_1_0_FN_SDEI_INTERRUPT_RELEASE:
> case SDEI_1_0_FN_SDEI_PRIVATE_RESET:
>
Eric
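
Following the spec excerpt quoted in the review ("This call can be
invoked by the client to mask the PE, whether or not the PE is already
masked"), the mask/unmask state update would simply be idempotent. A
minimal sketch with assumed names, dropping the DENIED path of the
quoted hunk:

```c
#include <assert.h>

/* Hypothetical per-vCPU SDEI state, reduced to the masked flag. */
struct vcpu_sdei_state {
	int masked;
};

static void sdei_pe_set_masked(struct vcpu_sdei_state *s, int mask)
{
	/* No state check: repeating the call is permitted and has no
	 * effect, per the spec. (The success return value differs
	 * between MASK and UNMASK per the 5.1.12 table, which a real
	 * implementation would also have to honor.) */
	s->masked = mask ? 1 : 0;
}
```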

2021-11-10 00:18:08

by Eric Auger

Subject: Re: [PATCH v4 12/21] KVM: arm64: Support SDEI_{PRIVATE, SHARED}_RESET hypercall



On 8/15/21 2:13 AM, Gavin Shan wrote:
> This supports SDEI_{PRIVATE, SHARED}_RESET. They are used by the
> guest to purge the private or shared SDEI events, which are registered
to reset all private SDEI event registrations of the calling PE (resp.
PRIVATE or SHARED)
> previously.
>
> Signed-off-by: Gavin Shan <[email protected]>
> ---
> arch/arm64/kvm/sdei.c | 29 +++++++++++++++++++++++++++++
> 1 file changed, 29 insertions(+)
>
> diff --git a/arch/arm64/kvm/sdei.c b/arch/arm64/kvm/sdei.c
> index 3fb33258b494..62efee2b67b8 100644
> --- a/arch/arm64/kvm/sdei.c
> +++ b/arch/arm64/kvm/sdei.c
> @@ -582,6 +582,29 @@ static unsigned long kvm_sdei_hypercall_mask(struct kvm_vcpu *vcpu,
> return ret;
> }
>
> +static unsigned long kvm_sdei_hypercall_reset(struct kvm_vcpu *vcpu,
> + bool private)
> +{
> + struct kvm *kvm = vcpu->kvm;
> + struct kvm_sdei_kvm *ksdei = kvm->arch.sdei;
> + struct kvm_sdei_vcpu *vsdei = vcpu->arch.sdei;
> + unsigned int mask = private ? (1 << SDEI_EVENT_TYPE_PRIVATE) :
> + (1 << SDEI_EVENT_TYPE_SHARED);
> + unsigned long ret = SDEI_SUCCESS;
> +
> + /* Sanity check */
> + if (!(ksdei && vsdei)) {
> + ret = SDEI_NOT_SUPPORTED;
> + goto out;
> + }
> +
> + spin_lock(&ksdei->lock);
> + kvm_sdei_remove_kvm_events(kvm, mask, false);
With the kvm_sdei_remove_kvm_events() implementation, how do you make
sure that events which have a running handler get unregistered once the
handler completes? I only see the refcount check that prevents the "KVM
event object" from being removed.
> + spin_unlock(&ksdei->lock);
> +out:
> + return ret;
> +}
> +
> int kvm_sdei_hypercall(struct kvm_vcpu *vcpu)
> {
> u32 func = smccc_get_function(vcpu);
> @@ -626,8 +649,14 @@ int kvm_sdei_hypercall(struct kvm_vcpu *vcpu)
> break;
> case SDEI_1_0_FN_SDEI_INTERRUPT_BIND:
> case SDEI_1_0_FN_SDEI_INTERRUPT_RELEASE:
> + ret = SDEI_NOT_SUPPORTED;
> + break;
> case SDEI_1_0_FN_SDEI_PRIVATE_RESET:
> + ret = kvm_sdei_hypercall_reset(vcpu, true);
> + break;
> case SDEI_1_0_FN_SDEI_SHARED_RESET:
> + ret = kvm_sdei_hypercall_reset(vcpu, false);
> + break;
> default:
> ret = SDEI_NOT_SUPPORTED;
> }
>
Eric

2021-11-10 11:00:53

by Eric Auger

Subject: Re: [PATCH v4 14/21] KVM: arm64: Support SDEI_EVENT_{COMPLETE, COMPLETE_AND_RESUME} hypercall

Hi Gavin,

On 8/15/21 2:13 AM, Gavin Shan wrote:
> This supports SDEI_EVENT_{COMPLETE, COMPLETE_AND_RESUME} hypercall.
> They are used by the guest to notify the completion of the SDEI
> event in the handler. The registers are changed according to the
> SDEI specification as below:
>
> * x0 - x17, PC and PState are restored to what values we had in
> the interrupted context.
>
> * If it's SDEI_EVENT_COMPLETE_AND_RESUME hypercall, IRQ exception
> is injected.
>
> Signed-off-by: Gavin Shan <[email protected]>
> ---
> arch/arm64/include/asm/kvm_emulate.h | 1 +
> arch/arm64/include/asm/kvm_host.h | 1 +
> arch/arm64/kvm/hyp/exception.c | 7 +++
> arch/arm64/kvm/inject_fault.c | 27 ++++++++++
> arch/arm64/kvm/sdei.c | 75 ++++++++++++++++++++++++++++
> 5 files changed, 111 insertions(+)
>
> diff --git a/arch/arm64/include/asm/kvm_emulate.h b/arch/arm64/include/asm/kvm_emulate.h
> index fd418955e31e..923b4d08ea9a 100644
> --- a/arch/arm64/include/asm/kvm_emulate.h
> +++ b/arch/arm64/include/asm/kvm_emulate.h
> @@ -37,6 +37,7 @@ bool kvm_condition_valid32(const struct kvm_vcpu *vcpu);
> void kvm_skip_instr32(struct kvm_vcpu *vcpu);
>
> void kvm_inject_undefined(struct kvm_vcpu *vcpu);
> +void kvm_inject_irq(struct kvm_vcpu *vcpu);
> void kvm_inject_vabt(struct kvm_vcpu *vcpu);
> void kvm_inject_dabt(struct kvm_vcpu *vcpu, unsigned long addr);
> void kvm_inject_pabt(struct kvm_vcpu *vcpu, unsigned long addr);
> diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
> index 46f363aa6524..1824f7e1f9ab 100644
> --- a/arch/arm64/include/asm/kvm_host.h
> +++ b/arch/arm64/include/asm/kvm_host.h
> @@ -437,6 +437,7 @@ struct kvm_vcpu_arch {
> #define KVM_ARM64_EXCEPT_AA32_UND (0 << 9)
> #define KVM_ARM64_EXCEPT_AA32_IABT (1 << 9)
> #define KVM_ARM64_EXCEPT_AA32_DABT (2 << 9)
> +#define KVM_ARM64_EXCEPT_AA32_IRQ (3 << 9)
> /* For AArch64: */
> #define KVM_ARM64_EXCEPT_AA64_ELx_SYNC (0 << 9)
> #define KVM_ARM64_EXCEPT_AA64_ELx_IRQ (1 << 9)
> diff --git a/arch/arm64/kvm/hyp/exception.c b/arch/arm64/kvm/hyp/exception.c
> index 0418399e0a20..ef458207d152 100644
> --- a/arch/arm64/kvm/hyp/exception.c
> +++ b/arch/arm64/kvm/hyp/exception.c
> @@ -310,6 +310,9 @@ static void kvm_inject_exception(struct kvm_vcpu *vcpu)
> case KVM_ARM64_EXCEPT_AA32_DABT:
> enter_exception32(vcpu, PSR_AA32_MODE_ABT, 16);
> break;
> + case KVM_ARM64_EXCEPT_AA32_IRQ:
> + enter_exception32(vcpu, PSR_AA32_MODE_IRQ, 4);
> + break;
> default:
> /* Err... */
> break;
> @@ -320,6 +323,10 @@ static void kvm_inject_exception(struct kvm_vcpu *vcpu)
> KVM_ARM64_EXCEPT_AA64_EL1):
> enter_exception64(vcpu, PSR_MODE_EL1h, except_type_sync);
> break;
> + case (KVM_ARM64_EXCEPT_AA64_ELx_IRQ |
> + KVM_ARM64_EXCEPT_AA64_EL1):
> + enter_exception64(vcpu, PSR_MODE_EL1h, except_type_irq);
> + break;
> default:
> /*
> * Only EL1_SYNC makes sense so far, EL2_{SYNC,IRQ}
> diff --git a/arch/arm64/kvm/inject_fault.c b/arch/arm64/kvm/inject_fault.c
> index b47df73e98d7..3a8c55867d2f 100644
> --- a/arch/arm64/kvm/inject_fault.c
> +++ b/arch/arm64/kvm/inject_fault.c
> @@ -66,6 +66,13 @@ static void inject_undef64(struct kvm_vcpu *vcpu)
> vcpu_write_sys_reg(vcpu, esr, ESR_EL1);
> }
>
> +static void inject_irq64(struct kvm_vcpu *vcpu)
> +{
> + vcpu->arch.flags |= (KVM_ARM64_EXCEPT_AA64_EL1 |
> + KVM_ARM64_EXCEPT_AA64_ELx_IRQ |
> + KVM_ARM64_PENDING_EXCEPTION);
> +}
> +
> #define DFSR_FSC_EXTABT_LPAE 0x10
> #define DFSR_FSC_EXTABT_nLPAE 0x08
> #define DFSR_LPAE BIT(9)
> @@ -77,6 +84,12 @@ static void inject_undef32(struct kvm_vcpu *vcpu)
> KVM_ARM64_PENDING_EXCEPTION);
> }
>
> +static void inject_irq32(struct kvm_vcpu *vcpu)
> +{
> + vcpu->arch.flags |= (KVM_ARM64_EXCEPT_AA32_IRQ |
> + KVM_ARM64_PENDING_EXCEPTION);
> +}
> +
> /*
> * Modelled after TakeDataAbortException() and TakePrefetchAbortException
> * pseudocode.
> @@ -160,6 +173,20 @@ void kvm_inject_undefined(struct kvm_vcpu *vcpu)
> inject_undef64(vcpu);
> }
>
> +/**
> + * kvm_inject_irq - inject an IRQ into the guest
> + *
> + * It is assumed that this code is called from the VCPU thread and that the
> + * VCPU therefore is not currently executing guest code.
> + */
> +void kvm_inject_irq(struct kvm_vcpu *vcpu)
> +{
> + if (vcpu_el1_is_32bit(vcpu))
> + inject_irq32(vcpu);
> + else
> + inject_irq64(vcpu);
> +}
> +
> void kvm_set_sei_esr(struct kvm_vcpu *vcpu, u64 esr)
> {
> vcpu_set_vsesr(vcpu, esr & ESR_ELx_ISS_MASK);
> diff --git a/arch/arm64/kvm/sdei.c b/arch/arm64/kvm/sdei.c
> index b5d6d1ed3858..1e8e213c9d70 100644
> --- a/arch/arm64/kvm/sdei.c
> +++ b/arch/arm64/kvm/sdei.c
> @@ -308,6 +308,75 @@ static unsigned long kvm_sdei_hypercall_context(struct kvm_vcpu *vcpu)
> return ret;
> }
>
> +static unsigned long kvm_sdei_hypercall_complete(struct kvm_vcpu *vcpu,
> + bool resume)
> +{
> + struct kvm *kvm = vcpu->kvm;
> + struct kvm_sdei_kvm *ksdei = kvm->arch.sdei;
> + struct kvm_sdei_vcpu *vsdei = vcpu->arch.sdei;
> + struct kvm_sdei_kvm_event *kske = NULL;
> + struct kvm_sdei_vcpu_event *ksve = NULL;
> + struct kvm_sdei_vcpu_regs *regs;
> + unsigned long ret = SDEI_SUCCESS;
For the RESUME case you never seem to read the resume_addr arg. How
does it work? I don't get the IRQ injection path. Could you please
explain?
> + int index;
> +
> + /* Sanity check */
> + if (!(ksdei && vsdei)) {
> + ret = SDEI_NOT_SUPPORTED;
> + goto out;
> + }
> +
> + spin_lock(&vsdei->lock);
> + if (vsdei->critical_event) {
> + ksve = vsdei->critical_event;
> + regs = &vsdei->state.critical_regs;
> + vsdei->critical_event = NULL;
> + vsdei->state.critical_num = KVM_SDEI_INVALID_NUM;
> + } else if (vsdei->normal_event) {
> + ksve = vsdei->normal_event;
> + regs = &vsdei->state.normal_regs;
> + vsdei->normal_event = NULL;
> + vsdei->state.normal_num = KVM_SDEI_INVALID_NUM;
> + } else {
> + ret = SDEI_DENIED;
> + goto unlock;
> + }
> +
> + /* Restore registers: x0 -> x17, PC, PState */
> + for (index = 0; index < ARRAY_SIZE(regs->regs); index++)
> + vcpu_set_reg(vcpu, index, regs->regs[index]);
> +
> + *vcpu_cpsr(vcpu) = regs->pstate;
> + *vcpu_pc(vcpu) = regs->pc;
> +
> + /* Inject interrupt if needed */
> + if (resume)
> + kvm_inject_irq(vcpu);
> +
> + /*
> + * Update state. We needn't take lock in order to update the KVM
> + * event state as it's not destroyed because of the reference
> + * count.
> + */
> + kske = ksve->kske;
> + ksve->state.refcount--;
> + kske->state.refcount--;
why double --?
> + if (!ksve->state.refcount) {
why not use a struct kref directly?
> + list_del(&ksve->link);
> + kfree(ksve);
> + }
> +
> + /* Make another request if there is pending event */
> + if (!(list_empty(&vsdei->critical_events) &&
> + list_empty(&vsdei->normal_events)))
> + kvm_make_request(KVM_REQ_SDEI, vcpu);
> +
> +unlock:
> + spin_unlock(&vsdei->lock);
> +out:
> + return ret;
> +}
> +
> static unsigned long kvm_sdei_hypercall_unregister(struct kvm_vcpu *vcpu)
> {
> struct kvm *kvm = vcpu->kvm;
> @@ -628,7 +697,13 @@ int kvm_sdei_hypercall(struct kvm_vcpu *vcpu)
> ret = kvm_sdei_hypercall_context(vcpu);
> break;
> case SDEI_1_0_FN_SDEI_EVENT_COMPLETE:
> + has_result = false;
> + ret = kvm_sdei_hypercall_complete(vcpu, false);
> + break;
> case SDEI_1_0_FN_SDEI_EVENT_COMPLETE_AND_RESUME:
> + has_result = false;
> + ret = kvm_sdei_hypercall_complete(vcpu, true);
> + break;
> case SDEI_1_0_FN_SDEI_EVENT_UNREGISTER:
> ret = kvm_sdei_hypercall_unregister(vcpu);
> break;
>
Eric
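
The register-restore part of the COMPLETE path above (x0-x17, PC and
PState copied back from the context saved at delivery time) can be
modeled in isolation. Structure and function names here are toy
stand-ins for the patch's kvm_sdei_vcpu_regs handling:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define NR_SAVED_REGS 18 /* x0 - x17, per the SDEI spec */

/* Context captured when the event was delivered. */
struct saved_context {
	uint64_t regs[NR_SAVED_REGS];
	uint64_t pc;
	uint64_t pstate;
};

/* Toy vCPU register file. */
struct toy_vcpu {
	uint64_t regs[NR_SAVED_REGS];
	uint64_t pc;
	uint64_t pstate;
};

/* On COMPLETE, the interrupted context is restored wholesale. */
static void sdei_complete_restore(struct toy_vcpu *vcpu,
				  const struct saved_context *ctx)
{
	memcpy(vcpu->regs, ctx->regs, sizeof(vcpu->regs));
	vcpu->pc = ctx->pc;
	vcpu->pstate = ctx->pstate;
}
```

COMPLETE_AND_RESUME would additionally inject an IRQ exception after
this restore, as the quoted kvm_inject_irq() call does.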

2021-11-10 11:03:19

by Eric Auger

Subject: Re: [PATCH v4 13/21] KVM: arm64: Impment SDEI event delivery

s/Impment/Implement in the commit title

On 8/15/21 2:13 AM, Gavin Shan wrote:
> This implement kvm_sdei_deliver() to support SDEI event delivery.
> The function is called when the request (KVM_REQ_SDEI) is raised.
> The following rules are taken according to the SDEI specification:
>
> * x0 - x17 are saved. All of them are cleared except the following
> registered:
s/registered/registers
> x0: number SDEI event to be delivered
s/number SDEI event/SDEI event number
> x1: parameter associated with the SDEI event
user arg?
> x2: PC of the interrupted context
> x3: PState of the interrupted context
>
> * PC is set to the handler of the SDEI event, which was provided
> during its registration. PState is modified accordingly.
>
> * SDEI event with critical priority can preempt those with normal
> priority.
>
> Signed-off-by: Gavin Shan <[email protected]>
> ---
> arch/arm64/include/asm/kvm_host.h | 1 +
> arch/arm64/include/asm/kvm_sdei.h | 1 +
> arch/arm64/kvm/arm.c | 3 ++
> arch/arm64/kvm/sdei.c | 84 +++++++++++++++++++++++++++++++
> 4 files changed, 89 insertions(+)
>
> diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
> index aedf901e1ec7..46f363aa6524 100644
> --- a/arch/arm64/include/asm/kvm_host.h
> +++ b/arch/arm64/include/asm/kvm_host.h
> @@ -47,6 +47,7 @@
> #define KVM_REQ_RECORD_STEAL KVM_ARCH_REQ(3)
> #define KVM_REQ_RELOAD_GICv4 KVM_ARCH_REQ(4)
> #define KVM_REQ_RELOAD_PMU KVM_ARCH_REQ(5)
> +#define KVM_REQ_SDEI KVM_ARCH_REQ(6)
>
> #define KVM_DIRTY_LOG_MANUAL_CAPS (KVM_DIRTY_LOG_MANUAL_PROTECT_ENABLE | \
> KVM_DIRTY_LOG_INITIALLY_SET)
> diff --git a/arch/arm64/include/asm/kvm_sdei.h b/arch/arm64/include/asm/kvm_sdei.h
> index b0abc13a0256..7f5f5ad689e6 100644
> --- a/arch/arm64/include/asm/kvm_sdei.h
> +++ b/arch/arm64/include/asm/kvm_sdei.h
> @@ -112,6 +112,7 @@ KVM_SDEI_FLAG_FUNC(enabled)
> void kvm_sdei_init_vm(struct kvm *kvm);
> void kvm_sdei_create_vcpu(struct kvm_vcpu *vcpu);
> int kvm_sdei_hypercall(struct kvm_vcpu *vcpu);
> +void kvm_sdei_deliver(struct kvm_vcpu *vcpu);
> void kvm_sdei_destroy_vcpu(struct kvm_vcpu *vcpu);
> void kvm_sdei_destroy_vm(struct kvm *kvm);
>
> diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
> index 2f021aa41632..0c3db1ef1ba9 100644
> --- a/arch/arm64/kvm/arm.c
> +++ b/arch/arm64/kvm/arm.c
> @@ -689,6 +689,9 @@ static void check_vcpu_requests(struct kvm_vcpu *vcpu)
> if (kvm_check_request(KVM_REQ_VCPU_RESET, vcpu))
> kvm_reset_vcpu(vcpu);
>
> + if (kvm_check_request(KVM_REQ_SDEI, vcpu))
> + kvm_sdei_deliver(vcpu);
> +
> /*
> * Clear IRQ_PENDING requests that were made to guarantee
> * that a VCPU sees new virtual interrupts.
> diff --git a/arch/arm64/kvm/sdei.c b/arch/arm64/kvm/sdei.c
> index 62efee2b67b8..b5d6d1ed3858 100644
> --- a/arch/arm64/kvm/sdei.c
> +++ b/arch/arm64/kvm/sdei.c
> @@ -671,6 +671,90 @@ int kvm_sdei_hypercall(struct kvm_vcpu *vcpu)
> return 1;
> }
>
> +void kvm_sdei_deliver(struct kvm_vcpu *vcpu)
> +{
> + struct kvm *kvm = vcpu->kvm;
> + struct kvm_sdei_kvm *ksdei = kvm->arch.sdei;
> + struct kvm_sdei_vcpu *vsdei = vcpu->arch.sdei;
> + struct kvm_sdei_event *kse = NULL;
> + struct kvm_sdei_kvm_event *kske = NULL;
> + struct kvm_sdei_vcpu_event *ksve = NULL;
> + struct kvm_sdei_vcpu_regs *regs = NULL;
> + unsigned long pstate;
> + int index = 0;
> +
> + /* Sanity check */
> + if (!(ksdei && vsdei))
> + return;
> +
> + /* The critical event can't be preempted */
move the comment after the spin_lock
> + spin_lock(&vsdei->lock);
> + if (vsdei->critical_event)
> + goto unlock;
> +
> + /*
> + * The normal event can be preempted by the critical event.
> + * However, the normal event can't be preempted by another
> + * normal event.
> + */
> + ksve = list_first_entry_or_null(&vsdei->critical_events,
> + struct kvm_sdei_vcpu_event, link);
> + if (!ksve && !vsdei->normal_event) {
> + ksve = list_first_entry_or_null(&vsdei->normal_events,
> + struct kvm_sdei_vcpu_event, link);
> + }
At this stage of the review the struct kvm_sdei_vcpu_event lifecycle is
not known.

From the dispatcher pseudocode I understand you check

((IsCriticalEvent(E) and !CriticalEventRunning(P, C)) ||
(!IsCriticalEvent(E) and !EventRunning(P, C)))

but I can't check you take care of
IsEnabled(E) and
IsEventTarget(E, P)
IsUnmasked(P)

Either you should squash this with 18/21 or at least you should add comments.
> +
> + if (!ksve)
> + goto unlock;
> +
> + kske = ksve->kske;
> + kse = kske->kse;
> + if (kse->state.priority == SDEI_EVENT_PRIORITY_CRITICAL) {
> + vsdei->critical_event = ksve;
> + vsdei->state.critical_num = ksve->state.num;
> + regs = &vsdei->state.critical_regs;
> + } else {
> + vsdei->normal_event = ksve;
> + vsdei->state.normal_num = ksve->state.num;
> + regs = &vsdei->state.normal_regs;
> + }
> +
> + /* Save registers: x0 -> x17, PC, PState */
> + for (index = 0; index < ARRAY_SIZE(regs->regs); index++)
> + regs->regs[index] = vcpu_get_reg(vcpu, index);
> +
> + regs->pc = *vcpu_pc(vcpu);
> + regs->pstate = *vcpu_cpsr(vcpu);
> +
> + /*
> + * Inject SDEI event: x0 -> x3, PC, PState. We needn't take lock
> + * for the KVM event as it can't be destroyed because of its
> + * reference count.
> + */
> + for (index = 0; index < ARRAY_SIZE(regs->regs); index++)
> + vcpu_set_reg(vcpu, index, 0);
> +
> + index = (kse->state.type == SDEI_EVENT_TYPE_PRIVATE) ?
> + vcpu->vcpu_idx : 0;
> + vcpu_set_reg(vcpu, 0, kske->state.num);
> + vcpu_set_reg(vcpu, 1, kske->state.params[index]);
> + vcpu_set_reg(vcpu, 2, regs->pc);
> + vcpu_set_reg(vcpu, 3, regs->pstate);
> +
> + pstate = regs->pstate;
> + pstate |= (PSR_D_BIT | PSR_A_BIT | PSR_I_BIT | PSR_F_BIT);
> + pstate &= ~PSR_MODE_MASK;
> + pstate |= PSR_MODE_EL1h;
> + pstate &= ~PSR_MODE32_BIT;
> +
> + vcpu_write_sys_reg(vcpu, regs->pstate, SPSR_EL1);
> + *vcpu_cpsr(vcpu) = pstate;
> + *vcpu_pc(vcpu) = kske->state.entries[index];
> +
> +unlock:
> + spin_unlock(&vsdei->lock);
> +}
> +
> void kvm_sdei_init_vm(struct kvm *kvm)
> {
> struct kvm_sdei_kvm *ksdei;
>
Eric

2021-11-10 11:18:51

by Eric Auger

[permalink] [raw]
Subject: Re: [PATCH v4 06/21] KVM: arm64: Support SDEI_EVENT_CONTEXT hypercall

Hi Gavin,

On 8/15/21 2:13 AM, Gavin Shan wrote:
> This supports SDEI_EVENT_CONTEXT hypercall. It's used by the guest
> to retrieved the original registers (R0 - R17) in its SDEI event
> handler. Those registers can be corrupted during the SDEI event
> delivery.
>
> Signed-off-by: Gavin Shan <[email protected]>
> ---
> arch/arm64/kvm/sdei.c | 40 ++++++++++++++++++++++++++++++++++++++++
> 1 file changed, 40 insertions(+)
>
> diff --git a/arch/arm64/kvm/sdei.c b/arch/arm64/kvm/sdei.c
> index b022ce0a202b..b4162efda470 100644
> --- a/arch/arm64/kvm/sdei.c
> +++ b/arch/arm64/kvm/sdei.c
> @@ -270,6 +270,44 @@ static unsigned long kvm_sdei_hypercall_enable(struct kvm_vcpu *vcpu,
> return ret;
> }
>
> +static unsigned long kvm_sdei_hypercall_context(struct kvm_vcpu *vcpu)
> +{
> + struct kvm *kvm = vcpu->kvm;
> + struct kvm_sdei_kvm *ksdei = kvm->arch.sdei;
> + struct kvm_sdei_vcpu *vsdei = vcpu->arch.sdei;
> + struct kvm_sdei_vcpu_regs *regs;
> + unsigned long index = smccc_get_arg1(vcpu);
s/index/param_id to match the spec?
> + unsigned long ret = SDEI_SUCCESS;
> +
> + /* Sanity check */
> + if (!(ksdei && vsdei)) {
> + ret = SDEI_NOT_SUPPORTED;
> + goto out;
> + }
> +
> + if (index > ARRAY_SIZE(vsdei->state.critical_regs.regs)) {
> + ret = SDEI_INVALID_PARAMETERS;
> + goto out;
> + }
I would move the check above after the regs assignment and use regs there (although the ARRAY_SIZE of both register sets is identical).
> +
> + /* Check if the pending event exists */
> + spin_lock(&vsdei->lock);
> + if (!(vsdei->critical_event || vsdei->normal_event)) {
> + ret = SDEI_DENIED;
> + goto unlock;
> + }
> +
> + /* Fetch the requested register */
> + regs = vsdei->critical_event ? &vsdei->state.critical_regs :
> + &vsdei->state.normal_regs;
> + ret = regs->regs[index];
> +
> +unlock:
> + spin_unlock(&vsdei->lock);
> +out:
> + return ret;
> +}
> +
> int kvm_sdei_hypercall(struct kvm_vcpu *vcpu)
> {
> u32 func = smccc_get_function(vcpu);
> @@ -290,6 +328,8 @@ int kvm_sdei_hypercall(struct kvm_vcpu *vcpu)
> ret = kvm_sdei_hypercall_enable(vcpu, false);
> break;
> case SDEI_1_0_FN_SDEI_EVENT_CONTEXT:
> + ret = kvm_sdei_hypercall_context(vcpu);
> + break;
> case SDEI_1_0_FN_SDEI_EVENT_COMPLETE:
> case SDEI_1_0_FN_SDEI_EVENT_COMPLETE_AND_RESUME:
> case SDEI_1_0_FN_SDEI_EVENT_UNREGISTER:
>
Eric

2021-11-10 11:37:46

by Eric Auger

[permalink] [raw]
Subject: Re: [PATCH v4 15/21] KVM: arm64: Support SDEI event notifier

Hi Gavin,

On 8/15/21 2:13 AM, Gavin Shan wrote:
> The owner of the SDEI event, like asynchronous page fault, need
"owner" is not a term used in the SDEI spec
> know the state of injected SDEI event. This supports SDEI event
s/need know the state of injected/to know the state of the injected
> state updating by introducing notifier mechanism. It's notable
a notifier mechanism
> the notifier (handler) should be capable of migration.
I don't understand the last sentence
>
> Signed-off-by: Gavin Shan <[email protected]>
> ---
> arch/arm64/include/asm/kvm_sdei.h | 12 +++++++
> arch/arm64/include/uapi/asm/kvm_sdei.h | 1 +
> arch/arm64/kvm/sdei.c | 45 +++++++++++++++++++++++++-
> 3 files changed, 57 insertions(+), 1 deletion(-)
>
> diff --git a/arch/arm64/include/asm/kvm_sdei.h b/arch/arm64/include/asm/kvm_sdei.h
> index 7f5f5ad689e6..19f2d9b91f85 100644
> --- a/arch/arm64/include/asm/kvm_sdei.h
> +++ b/arch/arm64/include/asm/kvm_sdei.h
> @@ -16,6 +16,16 @@
> #include <linux/list.h>
> #include <linux/spinlock.h>
>
> +struct kvm_vcpu;
> +
> +typedef void (*kvm_sdei_notifier)(struct kvm_vcpu *vcpu,
> + unsigned long num,
> + unsigned int state);
> +enum {
> + KVM_SDEI_NOTIFY_DELIVERED,
> + KVM_SDEI_NOTIFY_COMPLETED,
> +};
> +
> struct kvm_sdei_event {
> struct kvm_sdei_event_state state;
> struct kvm *kvm;
> @@ -112,6 +122,8 @@ KVM_SDEI_FLAG_FUNC(enabled)
> void kvm_sdei_init_vm(struct kvm *kvm);
> void kvm_sdei_create_vcpu(struct kvm_vcpu *vcpu);
> int kvm_sdei_hypercall(struct kvm_vcpu *vcpu);
> +int kvm_sdei_register_notifier(struct kvm *kvm, unsigned long num,
> + kvm_sdei_notifier notifier);
> void kvm_sdei_deliver(struct kvm_vcpu *vcpu);
> void kvm_sdei_destroy_vcpu(struct kvm_vcpu *vcpu);
> void kvm_sdei_destroy_vm(struct kvm *kvm);
> diff --git a/arch/arm64/include/uapi/asm/kvm_sdei.h b/arch/arm64/include/uapi/asm/kvm_sdei.h
> index 8928027023f6..4ef661d106fe 100644
> --- a/arch/arm64/include/uapi/asm/kvm_sdei.h
> +++ b/arch/arm64/include/uapi/asm/kvm_sdei.h
> @@ -23,6 +23,7 @@ struct kvm_sdei_event_state {
> __u8 type;
> __u8 signaled;
> __u8 priority;
> + __u64 notifier;
why is the notifier attached to the exposed event and not to the
registered or even vcpu event? This needs to be motivated.

Also, as commented earlier, I really think we first need to agree on the uapi and get a consensus on it, as it must be right on the first shot. In that prospect, maybe introduce a patch dedicated to the uapi and document it properly, including the way the end user is supposed to use it.

Another way to proceed would be to not support migration at the moment, mature the API, and then introduce migration support later. Would that make sense? For instance, in-kernel ITS emulation was first introduced without migration support.

Thanks

Eric
> };
>
> struct kvm_sdei_kvm_event_state {
> diff --git a/arch/arm64/kvm/sdei.c b/arch/arm64/kvm/sdei.c
> index 1e8e213c9d70..5f7a37dcaa77 100644
> --- a/arch/arm64/kvm/sdei.c
> +++ b/arch/arm64/kvm/sdei.c
> @@ -314,9 +314,11 @@ static unsigned long kvm_sdei_hypercall_complete(struct kvm_vcpu *vcpu,
> struct kvm *kvm = vcpu->kvm;
> struct kvm_sdei_kvm *ksdei = kvm->arch.sdei;
> struct kvm_sdei_vcpu *vsdei = vcpu->arch.sdei;
> + struct kvm_sdei_event *kse = NULL;
> struct kvm_sdei_kvm_event *kske = NULL;
> struct kvm_sdei_vcpu_event *ksve = NULL;
> struct kvm_sdei_vcpu_regs *regs;
> + kvm_sdei_notifier notifier;
> unsigned long ret = SDEI_SUCCESS;
> int index;
>
> @@ -349,6 +351,13 @@ static unsigned long kvm_sdei_hypercall_complete(struct kvm_vcpu *vcpu,
> *vcpu_cpsr(vcpu) = regs->pstate;
> *vcpu_pc(vcpu) = regs->pc;
>
> + /* Notifier */
> + kske = ksve->kske;
> + kse = kske->kse;
> + notifier = (kvm_sdei_notifier)(kse->state.notifier);
> + if (notifier)
> + notifier(vcpu, kse->state.num, KVM_SDEI_NOTIFY_COMPLETED);
> +
> /* Inject interrupt if needed */
> if (resume)
> kvm_inject_irq(vcpu);
> @@ -358,7 +367,6 @@ static unsigned long kvm_sdei_hypercall_complete(struct kvm_vcpu *vcpu,
> * event state as it's not destroyed because of the reference
> * count.
> */
> - kske = ksve->kske;
> ksve->state.refcount--;
> kske->state.refcount--;
> if (!ksve->state.refcount) {
> @@ -746,6 +754,35 @@ int kvm_sdei_hypercall(struct kvm_vcpu *vcpu)
> return 1;
> }
>
> +int kvm_sdei_register_notifier(struct kvm *kvm,
> + unsigned long num,
> + kvm_sdei_notifier notifier)
> +{
> + struct kvm_sdei_kvm *ksdei = kvm->arch.sdei;
> + struct kvm_sdei_event *kse = NULL;
> + int ret = 0;
> +
> + if (!ksdei) {
> + ret = -EPERM;
> + goto out;
> + }
> +
> + spin_lock(&ksdei->lock);
> +
> + kse = kvm_sdei_find_event(kvm, num);
> + if (!kse) {
> + ret = -EINVAL;
> + goto unlock;
> + }
> +
> + kse->state.notifier = (unsigned long)notifier;
> +
> +unlock:
> + spin_unlock(&ksdei->lock);
> +out:
> + return ret;
> +}
> +
> void kvm_sdei_deliver(struct kvm_vcpu *vcpu)
> {
> struct kvm *kvm = vcpu->kvm;
> @@ -755,6 +792,7 @@ void kvm_sdei_deliver(struct kvm_vcpu *vcpu)
> struct kvm_sdei_kvm_event *kske = NULL;
> struct kvm_sdei_vcpu_event *ksve = NULL;
> struct kvm_sdei_vcpu_regs *regs = NULL;
> + kvm_sdei_notifier notifier;
> unsigned long pstate;
> int index = 0;
>
> @@ -826,6 +864,11 @@ void kvm_sdei_deliver(struct kvm_vcpu *vcpu)
> *vcpu_cpsr(vcpu) = pstate;
> *vcpu_pc(vcpu) = kske->state.entries[index];
>
> + /* Notifier */
> + notifier = (kvm_sdei_notifier)(kse->state.notifier);
> + if (notifier)
> + notifier(vcpu, kse->state.num, KVM_SDEI_NOTIFY_DELIVERED);
> +
> unlock:
> spin_unlock(&vsdei->lock);
> }
>

2021-11-10 13:51:43

by Eric Auger

[permalink] [raw]
Subject: Re: [PATCH v4 16/21] KVM: arm64: Support SDEI ioctl commands on VM



On 8/15/21 2:13 AM, Gavin Shan wrote:
> This supports ioctl commands on VM to manage the various objects.
> It's primarily used by VMM to accomplish live migration. The ioctl
> commands introduced by this are highlighted as blow:
below
>
> * KVM_SDEI_CMD_GET_VERSION
> Retrieve the version of current implementation
which implementation, SDEI?
> * KVM_SDEI_CMD_SET_EVENT
> Add event to be exported from KVM so that guest can register
> against it afterwards
> * KVM_SDEI_CMD_GET_KEVENT_COUNT
> Retrieve number of registered SDEI events
> * KVM_SDEI_CMD_GET_KEVENT
> Retrieve the state of the registered SDEI event
> * KVM_SDEI_CMD_SET_KEVENT
> Populate the registered SDEI event
I think we really miss the full picture of what you want to achieve with
those IOCTLs or at least I fail to get it. Please document the UAPI
separately including the structs and IOCTLs.
>
> Signed-off-by: Gavin Shan <[email protected]>
> ---
> arch/arm64/include/asm/kvm_sdei.h | 1 +
> arch/arm64/include/uapi/asm/kvm_sdei.h | 17 +++
> arch/arm64/kvm/arm.c | 3 +
> arch/arm64/kvm/sdei.c | 171 +++++++++++++++++++++++++
> include/uapi/linux/kvm.h | 3 +
> 5 files changed, 195 insertions(+)
>
> diff --git a/arch/arm64/include/asm/kvm_sdei.h b/arch/arm64/include/asm/kvm_sdei.h
> index 19f2d9b91f85..8f5ea947ed0e 100644
> --- a/arch/arm64/include/asm/kvm_sdei.h
> +++ b/arch/arm64/include/asm/kvm_sdei.h
> @@ -125,6 +125,7 @@ int kvm_sdei_hypercall(struct kvm_vcpu *vcpu);
> int kvm_sdei_register_notifier(struct kvm *kvm, unsigned long num,
> kvm_sdei_notifier notifier);
> void kvm_sdei_deliver(struct kvm_vcpu *vcpu);
> +long kvm_sdei_vm_ioctl(struct kvm *kvm, unsigned long arg);
> void kvm_sdei_destroy_vcpu(struct kvm_vcpu *vcpu);
> void kvm_sdei_destroy_vm(struct kvm *kvm);
>
> diff --git a/arch/arm64/include/uapi/asm/kvm_sdei.h b/arch/arm64/include/uapi/asm/kvm_sdei.h
> index 4ef661d106fe..35ff05be3c28 100644
> --- a/arch/arm64/include/uapi/asm/kvm_sdei.h
> +++ b/arch/arm64/include/uapi/asm/kvm_sdei.h
> @@ -57,5 +57,22 @@ struct kvm_sdei_vcpu_state {
> struct kvm_sdei_vcpu_regs normal_regs;
> };
>
> +#define KVM_SDEI_CMD_GET_VERSION 0
> +#define KVM_SDEI_CMD_SET_EVENT 1
> +#define KVM_SDEI_CMD_GET_KEVENT_COUNT 2
> +#define KVM_SDEI_CMD_GET_KEVENT 3
> +#define KVM_SDEI_CMD_SET_KEVENT 4
> +
> +struct kvm_sdei_cmd {
> + __u32 cmd;
> + union {
> + __u32 version;
> + __u32 count;
> + __u64 num;
> + struct kvm_sdei_event_state kse_state;
> + struct kvm_sdei_kvm_event_state kske_state;
> + };
> +};
> +
> #endif /* !__ASSEMBLY__ */
> #endif /* _UAPI__ASM_KVM_SDEI_H */
> diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
> index 0c3db1ef1ba9..8d61585124b2 100644
> --- a/arch/arm64/kvm/arm.c
> +++ b/arch/arm64/kvm/arm.c
> @@ -1389,6 +1389,9 @@ long kvm_arch_vm_ioctl(struct file *filp,
> return -EFAULT;
> return kvm_vm_ioctl_mte_copy_tags(kvm, &copy_tags);
> }
> + case KVM_ARM_SDEI_COMMAND: {
> + return kvm_sdei_vm_ioctl(kvm, arg);
> + }
> default:
> return -EINVAL;
> }
> diff --git a/arch/arm64/kvm/sdei.c b/arch/arm64/kvm/sdei.c
> index 5f7a37dcaa77..bdd76c3e5153 100644
> --- a/arch/arm64/kvm/sdei.c
> +++ b/arch/arm64/kvm/sdei.c
> @@ -931,6 +931,177 @@ void kvm_sdei_create_vcpu(struct kvm_vcpu *vcpu)
> vcpu->arch.sdei = vsdei;
> }
>
> +static long kvm_sdei_set_event(struct kvm *kvm,
> + struct kvm_sdei_event_state *kse_state)
> +{
> + struct kvm_sdei_kvm *ksdei = kvm->arch.sdei;
> + struct kvm_sdei_event *kse = NULL;
> +
> + if (!kvm_sdei_is_valid_event_num(kse_state->num))
> + return -EINVAL;
> +
> + if (!(kse_state->type == SDEI_EVENT_TYPE_SHARED ||
> + kse_state->type == SDEI_EVENT_TYPE_PRIVATE))
> + return -EINVAL;
> +
> + if (!(kse_state->priority == SDEI_EVENT_PRIORITY_NORMAL ||
> + kse_state->priority == SDEI_EVENT_PRIORITY_CRITICAL))
> + return -EINVAL;
> +
> + kse = kvm_sdei_find_event(kvm, kse_state->num);
> + if (kse)
> + return -EEXIST;
> +
> + kse = kzalloc(sizeof(*kse), GFP_KERNEL);
> + if (!kse)
> + return -ENOMEM;
userspace can exhaust memory since there is no limit. There must be a max.

> +
> + kse->state = *kse_state;
> + kse->kvm = kvm;
> + list_add_tail(&kse->link, &ksdei->events);
> +
> + return 0;
> +}
> +
> +static long kvm_sdei_get_kevent_count(struct kvm *kvm, int *count)
> +{
> + struct kvm_sdei_kvm *ksdei = kvm->arch.sdei;
> + struct kvm_sdei_kvm_event *kske = NULL;
> + int total = 0;
> +
> + list_for_each_entry(kske, &ksdei->kvm_events, link) {
> + total++;
> + }
> +
> + *count = total;
> + return 0;
> +}
> +
> +static long kvm_sdei_get_kevent(struct kvm *kvm,
> + struct kvm_sdei_kvm_event_state *kske_state)
> +{
> + struct kvm_sdei_kvm *ksdei = kvm->arch.sdei;
> + struct kvm_sdei_kvm_event *kske = NULL;
> +
> + /*
> + * The first entry is fetched if the event number is invalid.
> + * Otherwise, the next entry is fetched.
why don't we return an error? What is the point returning the next entry?
> + */
> + if (!kvm_sdei_is_valid_event_num(kske_state->num)) {
> + kske = list_first_entry_or_null(&ksdei->kvm_events,
> + struct kvm_sdei_kvm_event, link);
> + } else {
> + kske = kvm_sdei_find_kvm_event(kvm, kske_state->num);
> + if (kske && !list_is_last(&kske->link, &ksdei->kvm_events))
> + kske = list_next_entry(kske, link);
Sorry I don't get why we return the next one?
> + else
> + kske = NULL;
> + }
> +
> + if (!kske)
> + return -ENOENT;
> +
> + *kske_state = kske->state;
> +
> + return 0;
> +}
> +
> +static long kvm_sdei_set_kevent(struct kvm *kvm,
> + struct kvm_sdei_kvm_event_state *kske_state)
> +{
> + struct kvm_sdei_kvm *ksdei = kvm->arch.sdei;
> + struct kvm_sdei_event *kse = NULL;
> + struct kvm_sdei_kvm_event *kske = NULL;
> +
> + /* Sanity check */
> + if (!kvm_sdei_is_valid_event_num(kske_state->num))
> + return -EINVAL;
> +
> + if (!(kske_state->route_mode == SDEI_EVENT_REGISTER_RM_ANY ||
> + kske_state->route_mode == SDEI_EVENT_REGISTER_RM_PE))
> + return -EINVAL;
> +
> + /* Check if the event number is valid */
> + kse = kvm_sdei_find_event(kvm, kske_state->num);
> + if (!kse)
> + return -ENOENT;
> +
> + /* Check if the event has been populated */
> + kske = kvm_sdei_find_kvm_event(kvm, kske_state->num);
> + if (kske)
> + return -EEXIST;
> +
> + kske = kzalloc(sizeof(*kske), GFP_KERNEL);
userspace can exhaust memory since there is no limit
> + if (!kske)
> + return -ENOMEM;
> +
> + kske->state = *kske_state;
> + kske->kse = kse;
> + kske->kvm = kvm;
> + list_add_tail(&kske->link, &ksdei->kvm_events);
> +
> + return 0;
> +}
> +
> +long kvm_sdei_vm_ioctl(struct kvm *kvm, unsigned long arg)
> +{
> + struct kvm_sdei_kvm *ksdei = kvm->arch.sdei;
> + struct kvm_sdei_cmd *cmd = NULL;
> + void __user *argp = (void __user *)arg;
> + bool copy = false;
> + long ret = 0;
> +
> + /* Sanity check */
> + if (!ksdei) {
> + ret = -EPERM;
> + goto out;
> + }
> +
> + cmd = kzalloc(sizeof(*cmd), GFP_KERNEL);
> + if (!cmd) {
> + ret = -ENOMEM;
> + goto out;
> + }
> +
> + if (copy_from_user(cmd, argp, sizeof(*cmd))) {
> + ret = -EFAULT;
> + goto out;
> + }
> +
> + spin_lock(&ksdei->lock);
> +
> + switch (cmd->cmd) {
> + case KVM_SDEI_CMD_GET_VERSION:
> + copy = true;
> + cmd->version = (1 << 16); /* v1.0.0 */
> + break;
> + case KVM_SDEI_CMD_SET_EVENT:
> + ret = kvm_sdei_set_event(kvm, &cmd->kse_state);
> + break;
> + case KVM_SDEI_CMD_GET_KEVENT_COUNT:
> + copy = true;
> + ret = kvm_sdei_get_kevent_count(kvm, &cmd->count);
> + break;
> + case KVM_SDEI_CMD_GET_KEVENT:
> + copy = true;
> + ret = kvm_sdei_get_kevent(kvm, &cmd->kske_state);
> + break;
> + case KVM_SDEI_CMD_SET_KEVENT:
> + ret = kvm_sdei_set_kevent(kvm, &cmd->kske_state);
> + break;
> + default:
> + ret = -EINVAL;
> + }
> +
> + spin_unlock(&ksdei->lock);
> +out:
> + if (!ret && copy && copy_to_user(argp, cmd, sizeof(*cmd)))
> + ret = -EFAULT;
> +
> + kfree(cmd);
> + return ret;
> +}
> +
> void kvm_sdei_destroy_vcpu(struct kvm_vcpu *vcpu)
> {
> struct kvm_sdei_vcpu *vsdei = vcpu->arch.sdei;
> diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
> index d9e4aabcb31a..8cf41fd4bf86 100644
> --- a/include/uapi/linux/kvm.h
> +++ b/include/uapi/linux/kvm.h
> @@ -1679,6 +1679,9 @@ struct kvm_xen_vcpu_attr {
> #define KVM_XEN_VCPU_ATTR_TYPE_RUNSTATE_DATA 0x4
> #define KVM_XEN_VCPU_ATTR_TYPE_RUNSTATE_ADJUST 0x5
>
> +/* Available with KVM_CAP_ARM_SDEI */
> +#define KVM_ARM_SDEI_COMMAND _IOWR(KVMIO, 0xce, struct kvm_sdei_cmd)
> +
> /* Secure Encrypted Virtualization command */
> enum sev_cmd_id {
> /* Guest initialization commands */
>
Eric

2021-11-10 13:58:09

by Eric Auger

[permalink] [raw]
Subject: Re: [PATCH v4 20/21] KVM: arm64: Export SDEI capability



On 8/15/21 2:13 AM, Gavin Shan wrote:
> The SDEI functionality is ready to be exported so far. This adds
> new capability (KVM_CAP_ARM_SDEI) and exports it.

This needs to be documented in kvm/api.rst like the rest of the API.

Eric
>
> Signed-off-by: Gavin Shan <[email protected]>
> ---
> arch/arm64/kvm/arm.c | 3 +++
> include/uapi/linux/kvm.h | 1 +
> 2 files changed, 4 insertions(+)
>
> diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
> index 215cdbeb272a..7d9bbc888ae5 100644
> --- a/arch/arm64/kvm/arm.c
> +++ b/arch/arm64/kvm/arm.c
> @@ -278,6 +278,9 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
> case KVM_CAP_ARM_PTRAUTH_GENERIC:
> r = system_has_full_ptr_auth();
> break;
> + case KVM_CAP_ARM_SDEI:
> + r = 1;
> + break;
> default:
> r = 0;
> }
> diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
> index 8cf41fd4bf86..2aa748fd89c7 100644
> --- a/include/uapi/linux/kvm.h
> +++ b/include/uapi/linux/kvm.h
> @@ -1112,6 +1112,7 @@ struct kvm_ppc_resize_hpt {
> #define KVM_CAP_BINARY_STATS_FD 203
> #define KVM_CAP_EXIT_ON_EMULATION_FAILURE 204
> #define KVM_CAP_ARM_MTE 205
> +#define KVM_CAP_ARM_SDEI 206
>
> #ifdef KVM_CAP_IRQ_ROUTING
>
>
Eric

2021-11-10 14:07:53

by Eric Auger

[permalink] [raw]
Subject: Re: [PATCH v4 18/21] KVM: arm64: Support SDEI event injection



On 8/15/21 2:13 AM, Gavin Shan wrote:
> This supports SDEI event injection by implementing kvm_sdei_inject().
> It's called by kernel directly or VMM through ioctl command to inject
> SDEI event to the specific vCPU.
>
> Signed-off-by: Gavin Shan <[email protected]>
> ---
> arch/arm64/include/asm/kvm_sdei.h | 2 +
> arch/arm64/include/uapi/asm/kvm_sdei.h | 1 +
> arch/arm64/kvm/sdei.c | 108 +++++++++++++++++++++++++
> 3 files changed, 111 insertions(+)
>
> diff --git a/arch/arm64/include/asm/kvm_sdei.h b/arch/arm64/include/asm/kvm_sdei.h
> index a997989bab77..51087fe971ba 100644
> --- a/arch/arm64/include/asm/kvm_sdei.h
> +++ b/arch/arm64/include/asm/kvm_sdei.h
> @@ -124,6 +124,8 @@ void kvm_sdei_create_vcpu(struct kvm_vcpu *vcpu);
> int kvm_sdei_hypercall(struct kvm_vcpu *vcpu);
> int kvm_sdei_register_notifier(struct kvm *kvm, unsigned long num,
> kvm_sdei_notifier notifier);
> +int kvm_sdei_inject(struct kvm_vcpu *vcpu,
> + unsigned long num, bool immediate);
> void kvm_sdei_deliver(struct kvm_vcpu *vcpu);
> long kvm_sdei_vm_ioctl(struct kvm *kvm, unsigned long arg);
> long kvm_sdei_vcpu_ioctl(struct kvm_vcpu *vcpu, unsigned long arg);
> diff --git a/arch/arm64/include/uapi/asm/kvm_sdei.h b/arch/arm64/include/uapi/asm/kvm_sdei.h
> index b916c3435646..f7a6b2b22b50 100644
> --- a/arch/arm64/include/uapi/asm/kvm_sdei.h
> +++ b/arch/arm64/include/uapi/asm/kvm_sdei.h
> @@ -67,6 +67,7 @@ struct kvm_sdei_vcpu_state {
> #define KVM_SDEI_CMD_SET_VEVENT 7
> #define KVM_SDEI_CMD_GET_VCPU_STATE 8
> #define KVM_SDEI_CMD_SET_VCPU_STATE 9
> +#define KVM_SDEI_CMD_INJECT_EVENT 10
>
> struct kvm_sdei_cmd {
> __u32 cmd;
> diff --git a/arch/arm64/kvm/sdei.c b/arch/arm64/kvm/sdei.c
> index 79315b77f24b..7c2789cd1421 100644
> --- a/arch/arm64/kvm/sdei.c
> +++ b/arch/arm64/kvm/sdei.c
> @@ -802,6 +802,111 @@ int kvm_sdei_register_notifier(struct kvm *kvm,
> return ret;
> }
>
> +int kvm_sdei_inject(struct kvm_vcpu *vcpu,
> + unsigned long num,
> + bool immediate)
don't get the immediate param.
> +{
> + struct kvm *kvm = vcpu->kvm;
> + struct kvm_sdei_kvm *ksdei = kvm->arch.sdei;
> + struct kvm_sdei_vcpu *vsdei = vcpu->arch.sdei;
> + struct kvm_sdei_event *kse = NULL;
> + struct kvm_sdei_kvm_event *kske = NULL;
> + struct kvm_sdei_vcpu_event *ksve = NULL;
> + int index, ret = 0;
> +
> + /* Sanity check */
> + if (!(ksdei && vsdei)) {
> + ret = -EPERM;
> + goto out;
> + }
> +
> + if (!kvm_sdei_is_valid_event_num(num)) {
> + ret = -EINVAL;
> + goto out;
> + }
> +
> + /* Check the kvm event */
> + spin_lock(&ksdei->lock);
> + kske = kvm_sdei_find_kvm_event(kvm, num);
> + if (!kske) {
> + ret = -ENOENT;
> + goto unlock_kvm;
> + }
> +
> + kse = kske->kse;
> + index = (kse->state.type == SDEI_EVENT_TYPE_PRIVATE) ?
> + vcpu->vcpu_idx : 0;
> + if (!(kvm_sdei_is_registered(kske, index) &&
> + kvm_sdei_is_enabled(kske, index))) {
> + ret = -EPERM;
> + goto unlock_kvm;
> + }
> +
> + /* Check the vcpu state */
> + spin_lock(&vsdei->lock);
> + if (vsdei->state.masked) {
> + ret = -EPERM;
> + goto unlock_vcpu;
> + }
> +
> + /* Check if the event can be delivered immediately */
> + if (immediate) {
According to the dispatcher pseudocode this should always be checked?
> + if (kse->state.priority == SDEI_EVENT_PRIORITY_CRITICAL &&
> + !list_empty(&vsdei->critical_events)) {
> + ret = -ENOSPC;
> + goto unlock_vcpu;
> + }
> +
> + if (kse->state.priority == SDEI_EVENT_PRIORITY_NORMAL &&
> + (!list_empty(&vsdei->critical_events) ||
> + !list_empty(&vsdei->normal_events))) {
> + ret = -ENOSPC;
> + goto unlock_vcpu;
> + }
> + }
What about shared event dispatching? I don't see the affinity checked
anywhere.
> +
> + /* Check if the vcpu event exists */
> + ksve = kvm_sdei_find_vcpu_event(vcpu, num);
> + if (ksve) {
> + kske->state.refcount++;
> + ksve->state.refcount++;
why this double refcount increment??
> + kvm_make_request(KVM_REQ_SDEI, vcpu);
> + goto unlock_vcpu;
> + }
> +
> + /* Allocate vcpu event */
> + ksve = kzalloc(sizeof(*ksve), GFP_KERNEL);
> + if (!ksve) {
> + ret = -ENOMEM;
> + goto unlock_vcpu;
> + }
> +
> + /*
> + * We should take lock to update KVM event state because its
> + * reference count might be zero. In that case, the KVM event
> + * could be destroyed.
> + */
> + kske->state.refcount++;
> + ksve->state.num = num;
> + ksve->state.refcount = 1;
> + ksve->kske = kske;
> + ksve->vcpu = vcpu;
> +
> + if (kse->state.priority == SDEI_EVENT_PRIORITY_CRITICAL)
> + list_add_tail(&ksve->link, &vsdei->critical_events);
> + else
> + list_add_tail(&ksve->link, &vsdei->normal_events);
> +
> + kvm_make_request(KVM_REQ_SDEI, vcpu);
> +
> +unlock_vcpu:
> + spin_unlock(&vsdei->lock);
> +unlock_kvm:
> + spin_unlock(&ksdei->lock);
> +out:
> + return ret;
> +}
> +
> void kvm_sdei_deliver(struct kvm_vcpu *vcpu)
> {
> struct kvm *kvm = vcpu->kvm;
> @@ -1317,6 +1422,9 @@ long kvm_sdei_vcpu_ioctl(struct kvm_vcpu *vcpu, unsigned long arg)
> case KVM_SDEI_CMD_SET_VCPU_STATE:
> ret = kvm_sdei_set_vcpu_state(vcpu, &cmd->ksv_state);
> break;
> + case KVM_SDEI_CMD_INJECT_EVENT:
> + ret = kvm_sdei_inject(vcpu, cmd->num, false);
> + break;
> default:
> ret = -EINVAL;
> }
>
Eric

2021-11-10 14:12:27

by Eric Auger

[permalink] [raw]
Subject: Re: [PATCH v4 19/21] KVM: arm64: Support SDEI event cancellation



On 8/15/21 2:13 AM, Gavin Shan wrote:
> The injected SDEI event is to send notification to guest. The SDEI
> event might not be needed after it's injected. This introduces API
> to support cancellation on the injected SDEI event if it's not fired
> to the guest yet.
>
> This mechanism will be needed when we're going to support asynchronous
> page fault.

If we are able to manage the migration of an executing SDEI event, why can't
we manage the migration of pending SDEIs?

Eric
>
> Signed-off-by: Gavin Shan <[email protected]>
> ---
> arch/arm64/include/asm/kvm_sdei.h | 1 +
> arch/arm64/kvm/sdei.c | 49 +++++++++++++++++++++++++++++++
> 2 files changed, 50 insertions(+)
>
> diff --git a/arch/arm64/include/asm/kvm_sdei.h b/arch/arm64/include/asm/kvm_sdei.h
> index 51087fe971ba..353744c7bad9 100644
> --- a/arch/arm64/include/asm/kvm_sdei.h
> +++ b/arch/arm64/include/asm/kvm_sdei.h
> @@ -126,6 +126,7 @@ int kvm_sdei_register_notifier(struct kvm *kvm, unsigned long num,
> kvm_sdei_notifier notifier);
> int kvm_sdei_inject(struct kvm_vcpu *vcpu,
> unsigned long num, bool immediate);
> +int kvm_sdei_cancel(struct kvm_vcpu *vcpu, unsigned long num);
> void kvm_sdei_deliver(struct kvm_vcpu *vcpu);
> long kvm_sdei_vm_ioctl(struct kvm *kvm, unsigned long arg);
> long kvm_sdei_vcpu_ioctl(struct kvm_vcpu *vcpu, unsigned long arg);
> diff --git a/arch/arm64/kvm/sdei.c b/arch/arm64/kvm/sdei.c
> index 7c2789cd1421..4f5a582daa97 100644
> --- a/arch/arm64/kvm/sdei.c
> +++ b/arch/arm64/kvm/sdei.c
> @@ -907,6 +907,55 @@ int kvm_sdei_inject(struct kvm_vcpu *vcpu,
> return ret;
> }
>
> +int kvm_sdei_cancel(struct kvm_vcpu *vcpu, unsigned long num)
> +{
> + struct kvm *kvm = vcpu->kvm;
> + struct kvm_sdei_kvm *ksdei = kvm->arch.sdei;
> + struct kvm_sdei_vcpu *vsdei = vcpu->arch.sdei;
> + struct kvm_sdei_kvm_event *kske = NULL;
> + struct kvm_sdei_vcpu_event *ksve = NULL;
> + int ret = 0;
> +
> + if (!(ksdei && vsdei)) {
> + ret = -EPERM;
> + goto out;
> + }
> +
> + /* Find the vCPU event */
> + spin_lock(&vsdei->lock);
> + ksve = kvm_sdei_find_vcpu_event(vcpu, num);
> + if (!ksve) {
> + ret = -EINVAL;
> + goto unlock;
> + }
> +
> + /* Event can't be cancelled if it has been delivered */
> + if (ksve->state.refcount <= 1 &&
> + (vsdei->critical_event == ksve ||
> + vsdei->normal_event == ksve)) {
> + ret = -EINPROGRESS;
> + goto unlock;
> + }
> +
> + /* Free the vCPU event if necessary */
> + kske = ksve->kske;
> + ksve->state.refcount--;
> + if (!ksve->state.refcount) {
> + list_del(&ksve->link);
> + kfree(ksve);
> + }
> +
> +unlock:
> + spin_unlock(&vsdei->lock);
> + if (kske) {
> + spin_lock(&ksdei->lock);
> + kske->state.refcount--;
> + spin_unlock(&ksdei->lock);
> + }
> +out:
> + return ret;
> +}
> +
> void kvm_sdei_deliver(struct kvm_vcpu *vcpu)
> {
> struct kvm *kvm = vcpu->kvm;
>

2021-11-10 14:32:24

by Eric Auger

[permalink] [raw]
Subject: Re: [PATCH v4 00/21] Support SDEI Virtualization

Hi Gavin,

On 8/15/21 2:19 AM, Gavin Shan wrote:
> On 8/15/21 10:13 AM, Gavin Shan wrote:
>> This series intends to virtualize Software Delegated Exception Interface
>> (SDEI), which is defined by DEN0054A. It allows the hypervisor to deliver
>> NMI-alike event to guest and it's needed by asynchronous page fault to
>> deliver page-not-present notification from hypervisor to guest. The code
>> and the required qemu changes can be found from:
>>
>>     https://developer.arm.com/documentation/den0054/latest
>>     https://github.com/gwshan/linux    ("kvm/arm64_sdei")
>>     https://github.com/gwshan/qemu     ("kvm/arm64_sdei")
>>
>> The SDEI event is identified by a 32-bits number. Bits[31:24] are used
>> to indicate the SDEI event properties while bits[23:0] are identifying
>> the unique number. The implementation takes bits[23:22] to indicate the
>> owner of the SDEI event. For example, those SDEI events owned by KVM
>> should have these two bits set to 0b01. Besides, the implementation
>> supports SDEI events owned by KVM only.
>>
>> The design is pretty straightforward and the implementation is just
>> following the SDEI specification, to support the defined SMCCC interfaces,
>> except the IRQ binding stuff. There are several data structures
>> introduced.
>> Some of the objects have to be migrated by VMM. So their definitions are
>> split up for VMM to include the corresponding states for migration.
>>
>>     struct kvm_sdei_kvm
>>        Associated with VM and used to track the KVM exposed SDEI events
>>        and those registered by guest.
>>     struct kvm_sdei_vcpu
>>        Associated with vCPU and used to track SDEI event delivery. The
>>        preempted context is saved prior to the delivery and restored
>>        after that.
>>     struct kvm_sdei_event
>>        SDEI events exposed by KVM so that guest can register and enable.
>>     struct kvm_sdei_kvm_event
>>        SDEI events that have been registered by guest.
>>     struct kvm_sdei_vcpu_event
>>        SDEI events that have been queued to specific vCPU for delivery.
>>
>> The series is organized as below:
>>
>>     PATCH[01]    Introduces template for smccc_get_argx()
>>     PATCH[02]    Introduces the data structures and infrastructure
>>     PATCH[03-14] Supports various SDEI related hypercalls
>>     PATCH[15]    Supports SDEI event notification
>>     PATCH[16-17] Introduces ioctl command for migration
>>     PATCH[18-19] Supports SDEI event injection and cancellation
>>     PATCH[20]    Exports SDEI capability
>>     PATCH[21]    Adds self-test case for SDEI virtualization
>>
>
> [...]
>
> I explicitly copied James Morse and Mark Rutland when posting the series,
> but somehow that went wrong. I'm including them here to avoid
> reposting the whole series.
I don't see James nor Mark included here either

Eric
>
> Thanks,
> Gavin
>
> _______________________________________________
> kvmarm mailing list
> [email protected]
> https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
>

2022-01-11 07:53:03

by Gavin Shan

[permalink] [raw]
Subject: Re: [PATCH v4 01/21] KVM: arm64: Introduce template for inline functions

Hi Eric,

On 11/9/21 11:26 PM, Eric Auger wrote:
> On 8/15/21 2:13 AM, Gavin Shan wrote:
>> The inline functions used to get the SMCCC parameters have same
>> layout. It means these functions can be presented by a template,
>> to make the code simplified. Besides, this adds more similar inline
>> functions like smccc_get_arg{4,5,6,7,8}() to visit more SMCCC arguments,
>> which are needed by SDEI virtualization support.
>>
>> Signed-off-by: Gavin Shan <[email protected]>
>> ---
>> include/kvm/arm_hypercalls.h | 34 +++++++++++++++-------------------
>> 1 file changed, 15 insertions(+), 19 deletions(-)
>>
>> diff --git a/include/kvm/arm_hypercalls.h b/include/kvm/arm_hypercalls.h
>> index 0e2509d27910..ebecb6c68254 100644
>> --- a/include/kvm/arm_hypercalls.h
>> +++ b/include/kvm/arm_hypercalls.h
>> @@ -6,27 +6,21 @@
>>
>> #include <asm/kvm_emulate.h>
>>
>> -int kvm_hvc_call_handler(struct kvm_vcpu *vcpu);
>> -
>> -static inline u32 smccc_get_function(struct kvm_vcpu *vcpu)
>> -{
>> - return vcpu_get_reg(vcpu, 0);
>> +#define SMCCC_DECLARE_GET_FUNC(type, name, reg) \
>> +static inline type smccc_get_##name(struct kvm_vcpu *vcpu) \
>> +{ \
>> + return vcpu_get_reg(vcpu, reg); \
>> }
>>
>> -static inline unsigned long smccc_get_arg1(struct kvm_vcpu *vcpu)
>> -{
>> - return vcpu_get_reg(vcpu, 1);
>> -}
>> -
>> -static inline unsigned long smccc_get_arg2(struct kvm_vcpu *vcpu)
>> -{
>> - return vcpu_get_reg(vcpu, 2);
>> -}
>> -
>> -static inline unsigned long smccc_get_arg3(struct kvm_vcpu *vcpu)
>> -{
>> - return vcpu_get_reg(vcpu, 3);
>> -}
>> +SMCCC_DECLARE_GET_FUNC(u32, function, 0)
>> +SMCCC_DECLARE_GET_FUNC(unsigned long, arg1, 1)
>> +SMCCC_DECLARE_GET_FUNC(unsigned long, arg2, 2)
>> +SMCCC_DECLARE_GET_FUNC(unsigned long, arg3, 3)
>> +SMCCC_DECLARE_GET_FUNC(unsigned long, arg4, 4)
>> +SMCCC_DECLARE_GET_FUNC(unsigned long, arg5, 5)
>> +SMCCC_DECLARE_GET_FUNC(unsigned long, arg6, 6)
>> +SMCCC_DECLARE_GET_FUNC(unsigned long, arg7, 7)
>> +SMCCC_DECLARE_GET_FUNC(unsigned long, arg8, 8)
> I think I would keep smccc_get_function() and add macros to get the
> 64-bit args. SMCCC_DECLARE_GET_FUNC is an odd macro name for a function
> fetching an arg. I would suggest:
>

I agree. The code will be changed accordingly in the next respin.
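For illustration, the agreed direction could be sketched standalone as below. The `struct kvm_vcpu` and `vcpu_get_reg()` here are minimal stand-ins for the real KVM definitions, so this is only a sketch of the macro shape, not the actual header:

```c
#include <stdint.h>

/* Stand-ins for the real KVM types, for illustration only */
struct kvm_vcpu { unsigned long regs[9]; };

static unsigned long vcpu_get_reg(struct kvm_vcpu *vcpu, int reg)
{
	return vcpu->regs[reg];
}

/* The function ID keeps its dedicated, u32-returning accessor */
static inline uint32_t smccc_get_function(struct kvm_vcpu *vcpu)
{
	return vcpu_get_reg(vcpu, 0);
}

/* The 64-bit arguments are generated from a single template */
#define SMCCC_DECLARE_GET_ARG(reg)					\
static inline unsigned long smccc_get_arg##reg(struct kvm_vcpu *vcpu)	\
{									\
	return vcpu_get_reg(vcpu, reg);					\
}

SMCCC_DECLARE_GET_ARG(1)
SMCCC_DECLARE_GET_ARG(2)
SMCCC_DECLARE_GET_ARG(3)
SMCCC_DECLARE_GET_ARG(4)
SMCCC_DECLARE_GET_ARG(5)
SMCCC_DECLARE_GET_ARG(6)
SMCCC_DECLARE_GET_ARG(7)
SMCCC_DECLARE_GET_ARG(8)
```

This keeps the odd "GET_FUNC declares an arg getter" naming problem out of the way while still collapsing the repeated accessors.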

>> +#define SMCCC_DECLARE_GET_ARG(reg) \
>> +static inline unsigned long smccc_get_arg##reg(struct kvm_vcpu *vcpu) \
>> +{ \
>> + return vcpu_get_reg(vcpu, reg); \
>> }
>>
>> static inline void smccc_set_retval(struct kvm_vcpu *vcpu,
>> unsigned long a0,
>> @@ -40,4 +34,6 @@ static inline void smccc_set_retval(struct kvm_vcpu *vcpu,
>> vcpu_set_reg(vcpu, 3, a3);
>> }
>>
>> +int kvm_hvc_call_handler(struct kvm_vcpu *vcpu);
>> +
> spurious change?

I thought the inline functions should come before the exported ones. However,
I don't think it's necessary. I will drop the change in the next respin.

>> #endif
>>

Thanks,
Gavin


2022-01-11 09:20:51

by Gavin Shan

[permalink] [raw]
Subject: Re: [PATCH v4 02/21] KVM: arm64: Add SDEI virtualization infrastructure

Hi Eric,

On 11/9/21 11:45 PM, Eric Auger wrote:
> On 8/15/21 2:13 AM, Gavin Shan wrote:
>> Software Delegated Exception Interface (SDEI) provides a mechanism for
>> registering and servicing system events. Those system events are high
>> priority events, which must be serviced immediately. It's going to be
>> used by Asynchronous Page Fault (APF) to deliver notification from KVM
>> to guest. It's noted that SDEI is defined by ARM DEN0054A specification.
>>
>> This introduces SDEI virtualization infrastructure where the SDEI events
>> are registered and manuplated by the guest through hypercall. The SDEI
> manipulated

Thanks, it will be corrected in the next respin.

>> event is delivered to one specific vCPU by KVM once it's raised. This
>> introduces data structures to represent the needed objects to implement
>> the feature, which is highlighted as below. As those objects could be
>> migrated between VMs, these data structures are partially exported to
>> user space.
>>
>> * kvm_sdei_event
>> SDEI events are exported from KVM so that guest is able to register
>> and manuplate.
> manipulate

Thanks, it will be fixed in the next respin. I'm not sure how the
misspellings survived even though I ran a spelling check with
"scripts/checkpatch.pl --codespell".

>> * kvm_sdei_kvm_event
>> SDEI event that has been registered by guest.
> I would recommend revisiting the names. Why kvm event? Why not
> registered_event instead, which would actually tell what it is? Also you
> have kvm twice in the struct name.

Yep, I think I need to reconsider the struct names. The primary reason
for the current names was to keep the struct names short while still
being easy to identify: "kvm_sdei" is the prefix. How about the
following struct names?

kvm_sdei_event events exported from KVM to userspace
kvm_sdei_kevent events registered (associated) to KVM
kvm_sdei_vevent events associated with vCPU
kvm_sdei_vcpu vCPU context for event delivery

>> * kvm_sdei_kvm_vcpu
> Didn't you mean kvm_sdei_vcpu_event instead?

Yeah, you're correct. I was supposed to explain kvm_sdei_vcpu_event here.

>> SDEI event that has been delivered to the target vCPU.
>> * kvm_sdei_kvm
>> Place holder of exported and registered SDEI events.
>> * kvm_sdei_vcpu
>> Auxiliary object to save the preempted context during SDEI event
>> delivery.
>>
>> The error is returned for all SDEI hypercalls for now. They will be
>> implemented by the subsequent patches.
>>
>> Signed-off-by: Gavin Shan <[email protected]>
>> ---
>> arch/arm64/include/asm/kvm_host.h | 6 +
>> arch/arm64/include/asm/kvm_sdei.h | 118 +++++++++++++++
>> arch/arm64/include/uapi/asm/kvm.h | 1 +
>> arch/arm64/include/uapi/asm/kvm_sdei.h | 60 ++++++++
>> arch/arm64/kvm/Makefile | 2 +-
>> arch/arm64/kvm/arm.c | 7 +
>> arch/arm64/kvm/hypercalls.c | 18 +++
>> arch/arm64/kvm/sdei.c | 198 +++++++++++++++++++++++++
>> 8 files changed, 409 insertions(+), 1 deletion(-)
>> create mode 100644 arch/arm64/include/asm/kvm_sdei.h
>> create mode 100644 arch/arm64/include/uapi/asm/kvm_sdei.h
>> create mode 100644 arch/arm64/kvm/sdei.c
>>
>> diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
>> index 41911585ae0c..aedf901e1ec7 100644
>> --- a/arch/arm64/include/asm/kvm_host.h
>> +++ b/arch/arm64/include/asm/kvm_host.h
>> @@ -113,6 +113,9 @@ struct kvm_arch {
>> /* Interrupt controller */
>> struct vgic_dist vgic;
>>
>> + /* SDEI support */
> does not bring much. Why not reuse the commit msg explanation? Here
> and below.

I would drop the comment in the next respin because I want to avoid too
many comments being embedded in "struct kvm_arch". The struct is already
huge in terms of its number of fields.

>> + struct kvm_sdei_kvm *sdei;
>> +
>> /* Mandated version of PSCI */
>> u32 psci_version;
>>
>> @@ -339,6 +342,9 @@ struct kvm_vcpu_arch {
>> * here.
>> */
>>
>> + /* SDEI support */
>> + struct kvm_sdei_vcpu *sdei;
>> +
>> /*
>> * Guest registers we preserve during guest debugging.
>> *
>> diff --git a/arch/arm64/include/asm/kvm_sdei.h b/arch/arm64/include/asm/kvm_sdei.h
>> new file mode 100644
>> index 000000000000..b0abc13a0256
>> --- /dev/null
>> +++ b/arch/arm64/include/asm/kvm_sdei.h
>> @@ -0,0 +1,118 @@
>> +/* SPDX-License-Identifier: GPL-2.0-only */
>> +/*
>> + * Definitions of various KVM SDEI events.
>> + *
>> + * Copyright (C) 2021 Red Hat, Inc.
>> + *
>> + * Author(s): Gavin Shan <[email protected]>
>> + */
>> +
>> +#ifndef __ARM64_KVM_SDEI_H__
>> +#define __ARM64_KVM_SDEI_H__
>> +
>> +#include <uapi/linux/arm_sdei.h>
>> +#include <uapi/asm/kvm_sdei.h>
>> +#include <linux/bitmap.h>
>> +#include <linux/list.h>
>> +#include <linux/spinlock.h>
>> +
>> +struct kvm_sdei_event {
>> + struct kvm_sdei_event_state state;
>> + struct kvm *kvm;
>> + struct list_head link;
>> +};
>> +
>> +struct kvm_sdei_kvm_event {
>> + struct kvm_sdei_kvm_event_state state;
>> + struct kvm_sdei_event *kse;
>> + struct kvm *kvm;
> can't you reuse the kvm handle in state?

Nope, there is no kvm handle in @state.

>> + struct list_head link;
>> +};
>> +
>> +struct kvm_sdei_vcpu_event {
>> + struct kvm_sdei_vcpu_event_state state;
>> + struct kvm_sdei_kvm_event *kske;
>> + struct kvm_vcpu *vcpu;
>> + struct list_head link;
>> +};
>> +
>> +struct kvm_sdei_kvm {
>> + spinlock_t lock;
>> + struct list_head events; /* kvm_sdei_event */
>> + struct list_head kvm_events; /* kvm_sdei_kvm_event */
>> +};
>> +
>> +struct kvm_sdei_vcpu {
>> + spinlock_t lock;
>> + struct kvm_sdei_vcpu_state state;
> could you explain the fields below?

As defined by the specification, each SDEI event is given a priority:
critical or normal. The priority affects how the SDEI event is delivered:
a critical event can preempt a normal one, but not the other way around.
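This preemption rule can be sketched as a small standalone predicate. The names here are hypothetical; the real delivery path works on the critical_event/normal_event pointers in kvm_sdei_vcpu rather than booleans:

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the two SDEI_EVENT_PRIORITY_* values */
enum sdei_priority { SDEI_PRIO_NORMAL, SDEI_PRIO_CRITICAL };

/*
 * A critical event may preempt a running normal handler, but nothing
 * preempts a critical handler, and a normal event must wait until no
 * handler at all is running.
 */
static bool sdei_event_can_deliver(enum sdei_priority pending,
				   bool critical_running,
				   bool normal_running)
{
	if (critical_running)
		return false;
	if (normal_running)
		return pending == SDEI_PRIO_CRITICAL;
	return true;
}
```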

>> + struct kvm_sdei_vcpu_event *critical_event;
>> + struct kvm_sdei_vcpu_event *normal_event;
>> + struct list_head critical_events;
>> + struct list_head normal_events;
>> +};
>> +
>> +/*
>> + * According to SDEI specification (v1.0), the event number spans 32-bits
>> + * and the lower 24-bits are used as the (real) event number. I don't
>> + * think we can use that much SDEI numbers in one system. So we reserve
>> + * two bits from the 24-bits real event number, to indicate its types:
>> + * physical event and virtual event. One reserved bit is enough for now,
>> + * but two bits are reserved for possible extension in future.
> I think this assumption is worth mentioning in the commit msg.

Sure, I will explain it in the commit log in next respin.

>> + *
>> + * The physical events are owned by underly firmware while the virtual
> underly?

s/underly firmware/firmware/ in the next respin.

>> + * events are used by VMM and KVM.
>> + */
>> +#define KVM_SDEI_EV_NUM_TYPE_SHIFT 22
>> +#define KVM_SDEI_EV_NUM_TYPE_MASK 3
>> +#define KVM_SDEI_EV_NUM_TYPE_PHYS 0
>> +#define KVM_SDEI_EV_NUM_TYPE_VIRT 1
>> +
>> +static inline bool kvm_sdei_is_valid_event_num(unsigned long num)
> the name of the function does not really describe what it does. It
> actually checks whether the SDEI event is a virtual one. Suggest
> kvm_sdei_is_virtual?

The header file is only used by KVM where the virtual SDEI event is the
only concern. However, kvm_sdei_is_virtual() is a better name.

>> +{
>> + unsigned long type;
>> +
>> + if (num >> 32)
>> + return false;
>> +
>> + type = (num >> KVM_SDEI_EV_NUM_TYPE_SHIFT) & KVM_SDEI_EV_NUM_TYPE_MASK;
> I think the mask is generally applied before shifting. See
> include/linux/irqchip/arm-gic-v3.h

Ok, I will adopt the style in next respin.
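A standalone sketch of the mask-before-shift style, with the pre-shifted mask constants in the spirit of the GICv3 header (the exact macro names upstream may end up different):

```c
#include <stdbool.h>

#define KVM_SDEI_EV_NUM_TYPE_SHIFT	22
#define KVM_SDEI_EV_NUM_TYPE_MASK	(3UL << KVM_SDEI_EV_NUM_TYPE_SHIFT)
#define KVM_SDEI_EV_NUM_TYPE_VIRT	(1UL << KVM_SDEI_EV_NUM_TYPE_SHIFT)

/* Apply the pre-shifted mask, then compare against the virtual type */
static bool kvm_sdei_is_virtual(unsigned long num)
{
	/* The event number only spans 32 bits */
	if (num >> 32)
		return false;

	return (num & KVM_SDEI_EV_NUM_TYPE_MASK) == KVM_SDEI_EV_NUM_TYPE_VIRT;
}
```

Note the sketch assumes a 64-bit unsigned long, matching arm64.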

>> + if (type != KVM_SDEI_EV_NUM_TYPE_VIRT)
>> + return false;
>> +
>> + return true;
>> +}
>> +
>> +/* Accessors for the registration or enablement states of KVM event */
>> +#define KVM_SDEI_FLAG_FUNC(field) \
>> +static inline bool kvm_sdei_is_##field(struct kvm_sdei_kvm_event *kske, \
>> + unsigned int index) \
>> +{ \
>> + return !!test_bit(index, (void *)(kske->state.field)); \
>> +} \
>> + \
>> +static inline bool kvm_sdei_empty_##field(struct kvm_sdei_kvm_event *kske) \
> nit: s/empty/none ?

"empty" matches bitmap_empty(), but "none" looks better here :)

>> +{ \
>> + return bitmap_empty((void *)(kske->state.field), \
>> + KVM_SDEI_MAX_VCPUS); \
>> +} \
>> +static inline void kvm_sdei_set_##field(struct kvm_sdei_kvm_event *kske, \
>> + unsigned int index) \
>> +{ \
>> + set_bit(index, (void *)(kske->state.field)); \
>> +} \
>> +static inline void kvm_sdei_clear_##field(struct kvm_sdei_kvm_event *kske, \
>> + unsigned int index) \
>> +{ \
>> + clear_bit(index, (void *)(kske->state.field)); \
>> +}
>> +
>> +KVM_SDEI_FLAG_FUNC(registered)
>> +KVM_SDEI_FLAG_FUNC(enabled)
>> +
>> +/* APIs */
>> +void kvm_sdei_init_vm(struct kvm *kvm);
>> +void kvm_sdei_create_vcpu(struct kvm_vcpu *vcpu);
>> +int kvm_sdei_hypercall(struct kvm_vcpu *vcpu);
>> +void kvm_sdei_destroy_vcpu(struct kvm_vcpu *vcpu);
>> +void kvm_sdei_destroy_vm(struct kvm *kvm);
>> +
>> +#endif /* __ARM64_KVM_SDEI_H__ */
>> diff --git a/arch/arm64/include/uapi/asm/kvm.h b/arch/arm64/include/uapi/asm/kvm.h
>> index b3edde68bc3e..e1b200bb6482 100644
>> --- a/arch/arm64/include/uapi/asm/kvm.h
>> +++ b/arch/arm64/include/uapi/asm/kvm.h
>> @@ -36,6 +36,7 @@
>> #include <linux/types.h>
>> #include <asm/ptrace.h>
>> #include <asm/sve_context.h>
>> +#include <asm/kvm_sdei.h>
>>
>> #define __KVM_HAVE_GUEST_DEBUG
>> #define __KVM_HAVE_IRQ_LINE
>> diff --git a/arch/arm64/include/uapi/asm/kvm_sdei.h b/arch/arm64/include/uapi/asm/kvm_sdei.h
>> new file mode 100644
>> index 000000000000..8928027023f6
>> --- /dev/null
>> +++ b/arch/arm64/include/uapi/asm/kvm_sdei.h
>> @@ -0,0 +1,60 @@
>> +/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
>> +/*
>> + * Definitions of various KVM SDEI event states.
>> + *
>> + * Copyright (C) 2021 Red Hat, Inc.
>> + *
>> + * Author(s): Gavin Shan <[email protected]>
>> + */
>> +
>> +#ifndef _UAPI__ASM_KVM_SDEI_H
>> +#define _UAPI__ASM_KVM_SDEI_H
>> +
>> +#ifndef __ASSEMBLY__
>> +#include <linux/types.h>
>> +
>> +#define KVM_SDEI_MAX_VCPUS 512
>> +#define KVM_SDEI_INVALID_NUM 0
>> +#define KVM_SDEI_DEFAULT_NUM 0x40400000
>
> The motivation behind introducing such a uapi should be clearer (besides
> just telling us this aims at migration). To me atm, this justification does
> not make it possible to understand whether those structs are well suited. You
> should document the migration process I think.
>
> I would remove _state suffix in all of them.

I think so. I will add a document, "Documentation/virt/kvm/arm/sdei.rst", to
explain the design and the corresponding data structs for migration. However,
I would keep the "state" suffix because I use it as an indicator for the
data structs to be migrated. The structs should be renamed accordingly
since they're embedded in their parent structs:

kvm_sdei_event_state
kvm_sdei_kevent_state
kvm_sdei_vevent_state
kvm_sdei_vcpu_state

>> +
>> +struct kvm_sdei_event_state {
> This is not really a state because it cannot be changed by the guest,
> right? I would remove _state and just call it kvm_sdei_event

The name kvm_sdei_event would conflict with the struct of the same name,
defined in include/asm/kvm_sdei.h. Let's keep "_state" as I explained. I use
the suffix as an indicator for structs which need migration, even though
they're not changeable.

>> + __u64 num;
>> +
>> + __u8 type;
>> + __u8 signaled;
>> + __u8 priority;
> you need some padding to be 64-bit aligned. See in generic or aarch64
> kvm.h for instance.

Sure.
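One possible padded layout is sketched below with stdint stand-ins for the `__u*` kernel types; the exact field order and padding upstream may differ:

```c
#include <stdint.h>

/* Sketch of a 64-bit-aligned layout for the exported event state */
struct kvm_sdei_event_state {
	uint64_t num;

	uint8_t  type;
	uint8_t  signaled;
	uint8_t  priority;
	uint8_t  padding[5];	/* keep the struct a multiple of 64 bits */
};
```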

>> +};
>> +
>> +struct kvm_sdei_kvm_event_state {
> I would rename into kvm_sdei_registered_event or smth alike

As above, it will be conflicting with its parent struct, defined
in include/asm/kvm_sdei.h

>> + __u64 num;
> how does this num differ from the event state one?

@num is the same as the one in kvm_sdei_event_state. It's used as an index
to retrieve the corresponding kvm_sdei_event_state. One kvm_sdei_event_state
instance can be referenced by both kvm_sdei_kvm_event_state and
kvm_sdei_vcpu_event_state. That's why we don't embed kvm_sdei_event_state
in them: it avoids duplicated traffic during migration.
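The lookup-by-number idea could be sketched as below. The helper name and the trimmed-down struct are hypothetical; the real code walks the per-VM event list rather than an array:

```c
#include <stdint.h>
#include <stddef.h>

/* Trimmed-down stand-in for the exported event state */
struct kvm_sdei_event_state {
	uint64_t num;
	uint8_t  priority;
};

/*
 * Hypothetical helper: the registered and vCPU event states keep only
 * @num and look the exported state up on demand, so the shared state
 * is migrated exactly once.
 */
static struct kvm_sdei_event_state *
kvm_sdei_find_event_state(struct kvm_sdei_event_state *events,
			  size_t count, uint64_t num)
{
	size_t i;

	for (i = 0; i < count; i++) {
		if (events[i].num == num)
			return &events[i];
	}

	return NULL;
}
```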

>> + __u32 refcount;
>> +
>> + __u8 route_mode;
> padding also here. See for instance
> https://lore.kernel.org/kvm/[email protected]/T/#m7bac2ff2b28a68f8d2196ec452afd3e46682760d
>
> Maybe put the the route_mode field and refcount at the end and add one
> byte of padding?
>
> Why can't we have a single sdei_event uapi representation where route
> mode defaults to unset and refcount defaults to 0 when not registered?
>

Ok. I will fix the padding and alignment in the next respin. The @route_affinity
can be changed on request from the guest. The @refcount helps to prevent the
event from being unregistered while it's still referenced by kvm_sdei_vcpu_event_state.

>> + __u64 route_affinity;
>> + __u64 entries[KVM_SDEI_MAX_VCPUS];
>> + __u64 params[KVM_SDEI_MAX_VCPUS];
> I would rename entries into ep_address and params into ep_arg.

Ok, but what does "ep" mean? My best guess is "entry point".
I'm not sure if you're talking about "PE" here.

>> + __u64 registered[KVM_SDEI_MAX_VCPUS/64];
> maybe add a comment along with KVM_SDEI_MAX_VCPUS that it must be a
> multiple of 64 (or a build check)
>

Sure.

>> + __u64 enabled[KVM_SDEI_MAX_VCPUS/64];
> Also you may clarify what this gets used for a shared event. I guess
> this only makes sense for a private event which can be registered by
> several EPs?

Nope, they're used by both shared and private events. For a shared event,
only bit#0 is used to indicate the state, while one bit per vCPU is
used for a private event. Yes, a private event can be registered
and enabled separately on multiple PEs.
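That bit#0-versus-per-vCPU convention could look roughly like this standalone sketch (helper names are hypothetical; the real code uses the kernel's set_bit()/test_bit() on the state bitmaps):

```c
#include <stdbool.h>
#include <stdint.h>

#define KVM_SDEI_MAX_VCPUS	512

/* Only the bitmap is shown; the real struct carries more fields */
struct kvm_sdei_kvm_event_state {
	uint64_t registered[KVM_SDEI_MAX_VCPUS / 64];
};

/* Shared events track a single state in bit#0; private events track
 * one bit per vCPU, since each PE registers independently. */
static unsigned int kvm_sdei_state_bit(bool shared, unsigned int vcpu)
{
	return shared ? 0 : vcpu;
}

static void kvm_sdei_set_registered(struct kvm_sdei_kvm_event_state *kske,
				    bool shared, unsigned int vcpu)
{
	unsigned int bit = kvm_sdei_state_bit(shared, vcpu);

	kske->registered[bit / 64] |= 1ULL << (bit % 64);
}

static bool kvm_sdei_is_registered(struct kvm_sdei_kvm_event_state *kske,
				   bool shared, unsigned int vcpu)
{
	unsigned int bit = kvm_sdei_state_bit(shared, vcpu);

	return kske->registered[bit / 64] & (1ULL << (bit % 64));
}
```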

>> +};
>> +
>> +struct kvm_sdei_vcpu_event_state {
>> + __u64 num;
>> + __u32 refcount;
> how does it differ from num and refcount of the registered event?
> padding++

About @num and @refcount, please refer to the above explanation. Yes,
I will fix padding in next respin.

>> +};
>> +
>> +struct kvm_sdei_vcpu_regs {
>> + __u64 regs[18];
>> + __u64 pc;
>> + __u64 pstate;
>> +};
>> +
>> +struct kvm_sdei_vcpu_state {
>> + __u8 masked;
> padding++

Ok.

>> + __u64 critical_num;
>> + __u64 normal_num;
>> + struct kvm_sdei_vcpu_regs critical_regs;
>> + struct kvm_sdei_vcpu_regs normal_regs;
>> +};
>> +
>> +#endif /* !__ASSEMBLY__ */
>> +#endif /* _UAPI__ASM_KVM_SDEI_H */
>> diff --git a/arch/arm64/kvm/Makefile b/arch/arm64/kvm/Makefile
>> index 989bb5dad2c8..eefca8ca394d 100644
>> --- a/arch/arm64/kvm/Makefile
>> +++ b/arch/arm64/kvm/Makefile
>> @@ -16,7 +16,7 @@ kvm-y := $(KVM)/kvm_main.o $(KVM)/coalesced_mmio.o $(KVM)/eventfd.o \
>> inject_fault.o va_layout.o handle_exit.o \
>> guest.o debug.o reset.o sys_regs.o \
>> vgic-sys-reg-v3.o fpsimd.o pmu.o \
>> - arch_timer.o trng.o\
>> + arch_timer.o trng.o sdei.o \
>> vgic/vgic.o vgic/vgic-init.o \
>> vgic/vgic-irqfd.o vgic/vgic-v2.o \
>> vgic/vgic-v3.o vgic/vgic-v4.o \
>> diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
>> index e9a2b8f27792..2f021aa41632 100644
>> --- a/arch/arm64/kvm/arm.c
>> +++ b/arch/arm64/kvm/arm.c
>> @@ -150,6 +150,8 @@ int kvm_arch_init_vm(struct kvm *kvm, unsigned long type)
>>
>> kvm_vgic_early_init(kvm);
>>
>> + kvm_sdei_init_vm(kvm);
>> +
>> /* The maximum number of VCPUs is limited by the host's GIC model */
>> kvm->arch.max_vcpus = kvm_arm_default_max_vcpus();
>>
>> @@ -179,6 +181,8 @@ void kvm_arch_destroy_vm(struct kvm *kvm)
>>
>> kvm_vgic_destroy(kvm);
>>
>> + kvm_sdei_destroy_vm(kvm);
>> +
>> for (i = 0; i < KVM_MAX_VCPUS; ++i) {
>> if (kvm->vcpus[i]) {
>> kvm_vcpu_destroy(kvm->vcpus[i]);
>> @@ -333,6 +337,8 @@ int kvm_arch_vcpu_create(struct kvm_vcpu *vcpu)
>>
>> kvm_arm_pvtime_vcpu_init(&vcpu->arch);
>>
>> + kvm_sdei_create_vcpu(vcpu);
>> +
>> vcpu->arch.hw_mmu = &vcpu->kvm->arch.mmu;
>>
>> err = kvm_vgic_vcpu_init(vcpu);
>> @@ -354,6 +360,7 @@ void kvm_arch_vcpu_destroy(struct kvm_vcpu *vcpu)
>> kvm_mmu_free_memory_cache(&vcpu->arch.mmu_page_cache);
>> kvm_timer_vcpu_terminate(vcpu);
>> kvm_pmu_vcpu_destroy(vcpu);
>> + kvm_sdei_destroy_vcpu(vcpu);
>>
>> kvm_arm_vcpu_destroy(vcpu);
>> }
>> diff --git a/arch/arm64/kvm/hypercalls.c b/arch/arm64/kvm/hypercalls.c
>> index 30da78f72b3b..d3fc893a4f58 100644
>> --- a/arch/arm64/kvm/hypercalls.c
>> +++ b/arch/arm64/kvm/hypercalls.c
>> @@ -139,6 +139,24 @@ int kvm_hvc_call_handler(struct kvm_vcpu *vcpu)
>> case ARM_SMCCC_TRNG_RND32:
>> case ARM_SMCCC_TRNG_RND64:
>> return kvm_trng_call(vcpu);
>> + case SDEI_1_0_FN_SDEI_VERSION:
>> + case SDEI_1_0_FN_SDEI_EVENT_REGISTER:
>> + case SDEI_1_0_FN_SDEI_EVENT_ENABLE:
>> + case SDEI_1_0_FN_SDEI_EVENT_DISABLE:
>> + case SDEI_1_0_FN_SDEI_EVENT_CONTEXT:
>> + case SDEI_1_0_FN_SDEI_EVENT_COMPLETE:
>> + case SDEI_1_0_FN_SDEI_EVENT_COMPLETE_AND_RESUME:
>> + case SDEI_1_0_FN_SDEI_EVENT_UNREGISTER:
>> + case SDEI_1_0_FN_SDEI_EVENT_STATUS:
>> + case SDEI_1_0_FN_SDEI_EVENT_GET_INFO:
>> + case SDEI_1_0_FN_SDEI_EVENT_ROUTING_SET:
>> + case SDEI_1_0_FN_SDEI_PE_MASK:
>> + case SDEI_1_0_FN_SDEI_PE_UNMASK:
>> + case SDEI_1_0_FN_SDEI_INTERRUPT_BIND:
>> + case SDEI_1_0_FN_SDEI_INTERRUPT_RELEASE:
>> + case SDEI_1_0_FN_SDEI_PRIVATE_RESET:
>> + case SDEI_1_0_FN_SDEI_SHARED_RESET:
>> + return kvm_sdei_hypercall(vcpu);
>> default:
>> return kvm_psci_call(vcpu);
>> }
>> diff --git a/arch/arm64/kvm/sdei.c b/arch/arm64/kvm/sdei.c
>> new file mode 100644
>> index 000000000000..ab330b74a965
>> --- /dev/null
>> +++ b/arch/arm64/kvm/sdei.c
>> @@ -0,0 +1,198 @@
>> +// SPDX-License-Identifier: GPL-2.0-only
>> +/*
>> + * SDEI virtualization support.
>> + *
>> + * Copyright (C) 2021 Red Hat, Inc.
>> + *
>> + * Author(s): Gavin Shan <[email protected]>
>> + */
>> +
>> +#include <linux/kernel.h>
>> +#include <linux/kvm_host.h>
>> +#include <linux/spinlock.h>
>> +#include <linux/slab.h>
>> +#include <kvm/arm_hypercalls.h>
>> +
>> +static struct kvm_sdei_event_state defined_kse[] = {
>> + { KVM_SDEI_DEFAULT_NUM,
>> + SDEI_EVENT_TYPE_PRIVATE,
>> + 1,
>> + SDEI_EVENT_PRIORITY_CRITICAL
>> + },
>> +};
> I understand from the above we currently only support a single static (~
> platform) SDEI event with num = KVM_SDEI_DEFAULT_NUM. We do not support
> bound events. You may add a comment here and maybe in the commit msg.
> I would rename the variable into exported_events.

Yeah, we may enhance it to allow userspace to add more in the future, but
not now. Ok, I will rename it to @exported_events.

>> +
>> +static void kvm_sdei_remove_events(struct kvm *kvm)
>> +{
>> + struct kvm_sdei_kvm *ksdei = kvm->arch.sdei;
>> + struct kvm_sdei_event *kse, *tmp;
>> +
>> + list_for_each_entry_safe(kse, tmp, &ksdei->events, link) {
>> + list_del(&kse->link);
>> + kfree(kse);
>> + }
>> +}
>> +
>> +static void kvm_sdei_remove_kvm_events(struct kvm *kvm,
>> + unsigned int mask,
>> + bool force)
>> +{
>> + struct kvm_sdei_kvm *ksdei = kvm->arch.sdei;
>> + struct kvm_sdei_event *kse;
>> + struct kvm_sdei_kvm_event *kske, *tmp;
>> +
>> + list_for_each_entry_safe(kske, tmp, &ksdei->kvm_events, link) {
>> + kse = kske->kse;
>> +
>> + if (!((1 << kse->state.type) & mask))
>> + continue;
> don't you need to hold a lock before looping? What if somebody concurrently
> changes the state fields, especially the refcount below?

Yes, the caller holds @kvm->sdei_lock.

>> +
>> + if (!force && kske->state.refcount)
>> + continue;
> Usually the refcount is used to control the lifetime of the object. The
> 'force' flag looks wrong in that context. Shouldn't you make sure all
> users have released their refcounts and on the last decrement, delete
> the object?

@force is used for exceptional cases. For example, the KVM process may be
killed before the event reference count gets a chance to be dropped.

>> +
>> + list_del(&kske->link);
>> + kfree(kske);
>> + }
>> +}
>> +
>> +static void kvm_sdei_remove_vcpu_events(struct kvm_vcpu *vcpu)
>> +{
>> + struct kvm_sdei_vcpu *vsdei = vcpu->arch.sdei;
>> + struct kvm_sdei_vcpu_event *ksve, *tmp;
>> +
>> + list_for_each_entry_safe(ksve, tmp, &vsdei->critical_events, link) {
>> + list_del(&ksve->link);
>> + kfree(ksve);
>> + }
>> +
>> + list_for_each_entry_safe(ksve, tmp, &vsdei->normal_events, link) {
>> + list_del(&ksve->link);
>> + kfree(ksve);
>> + }
>> +}
>> +
>> +int kvm_sdei_hypercall(struct kvm_vcpu *vcpu)
>> +{
>> + u32 func = smccc_get_function(vcpu);
>> + bool has_result = true;
>> + unsigned long ret;
>> +
>> + switch (func) {
>> + case SDEI_1_0_FN_SDEI_VERSION:
>> + case SDEI_1_0_FN_SDEI_EVENT_REGISTER:
>> + case SDEI_1_0_FN_SDEI_EVENT_ENABLE:
>> + case SDEI_1_0_FN_SDEI_EVENT_DISABLE:
>> + case SDEI_1_0_FN_SDEI_EVENT_CONTEXT:
>> + case SDEI_1_0_FN_SDEI_EVENT_COMPLETE:
>> + case SDEI_1_0_FN_SDEI_EVENT_COMPLETE_AND_RESUME:
>> + case SDEI_1_0_FN_SDEI_EVENT_UNREGISTER:
>> + case SDEI_1_0_FN_SDEI_EVENT_STATUS:
>> + case SDEI_1_0_FN_SDEI_EVENT_GET_INFO:
>> + case SDEI_1_0_FN_SDEI_EVENT_ROUTING_SET:
>> + case SDEI_1_0_FN_SDEI_PE_MASK:
>> + case SDEI_1_0_FN_SDEI_PE_UNMASK:
>> + case SDEI_1_0_FN_SDEI_INTERRUPT_BIND:
>> + case SDEI_1_0_FN_SDEI_INTERRUPT_RELEASE:
>> + case SDEI_1_0_FN_SDEI_PRIVATE_RESET:
>> + case SDEI_1_0_FN_SDEI_SHARED_RESET:
>> + default:
>> + ret = SDEI_NOT_SUPPORTED;
>> + }
>> +
>> + /*
>> + * We don't have return value for COMPLETE or COMPLETE_AND_RESUME
>> + * hypercalls. Otherwise, the restored context will be corrupted.
>> + */
>> + if (has_result)
>> + smccc_set_retval(vcpu, ret, 0, 0, 0);
> If I understand the above comment, COMPLETE and COMPLETE_AND_RESUME
> should have has_result set to false whereas in that case they will
> return NOT_SUPPORTED. Is that OK for the context restore?

Nice catch! @has_result needs to be false for COMPLETE and COMPLETE_AND_RESUME.
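The agreed fix could be sketched as a standalone predicate; the enum values here are hypothetical stand-ins for the real SDEI_1_0_FN_* IDs:

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the SDEI function IDs used here */
enum sdei_fn {
	FN_SDEI_VERSION,
	FN_SDEI_EVENT_COMPLETE,
	FN_SDEI_EVENT_COMPLETE_AND_RESUME,
};

/*
 * COMPLETE and COMPLETE_AND_RESUME restore the preempted context, so
 * writing a return value into x0 would corrupt it; every other call
 * reports its result in x0 as usual.
 */
static bool sdei_call_has_result(enum sdei_fn func)
{
	switch (func) {
	case FN_SDEI_EVENT_COMPLETE:
	case FN_SDEI_EVENT_COMPLETE_AND_RESUME:
		return false;
	default:
		return true;
	}
}
```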

>> +
>> + return 1;
>> +}
>> +
>> +void kvm_sdei_init_vm(struct kvm *kvm)
>> +{
>> + struct kvm_sdei_kvm *ksdei;
>> + struct kvm_sdei_event *kse;
>> + int i;
>> +
>> + ksdei = kzalloc(sizeof(*ksdei), GFP_KERNEL);
>> + if (!ksdei)
>> + return;
>> +
>> + spin_lock_init(&ksdei->lock);
>> + INIT_LIST_HEAD(&ksdei->events);
>> + INIT_LIST_HEAD(&ksdei->kvm_events);
>> +
>> + /*
>> + * Populate the defined KVM SDEI events. The whole functionality
>> + * will be disabled on any errors.
> You should definitely revise your naming conventions. This brings
> confusion between exported events and registered events. Why not
> simply adopt the spec terminology?

Yeah, I think so, but I'd argue "defined KVM SDEI events" does follow
the specification, because the SDEI events are defined by the firmware,
as the specification says. We're emulating the firmware in KVM here.

>> + */
>> + for (i = 0; i < ARRAY_SIZE(defined_kse); i++) {
>> + kse = kzalloc(sizeof(*kse), GFP_KERNEL);
>> + if (!kse) {
>> + kvm_sdei_remove_events(kvm);
>> + kfree(ksdei);
>> + return;
>> + }
> Add a comment saying that although we currently support a single static
> event, we prepare for binding support by building a list of exposed events?
>
> Or maybe simplify the implementation at this stage of the development
> assuming a single platform event is supported?

I will add a comment as you suggested in the next respin. Note that another
entry will be added to the defined event array when Async PF is involved.

>> +
>> + kse->kvm = kvm;
>> + kse->state = defined_kse[i];
>> + list_add_tail(&kse->link, &ksdei->events);
>> + }
>> +
>> + kvm->arch.sdei = ksdei;
>> +}
>> +
>> +void kvm_sdei_create_vcpu(struct kvm_vcpu *vcpu)
>> +{
>> + struct kvm *kvm = vcpu->kvm;
>> + struct kvm_sdei_vcpu *vsdei;
>> +
>> + if (!kvm->arch.sdei)
>> + return;
>> +
>> + vsdei = kzalloc(sizeof(*vsdei), GFP_KERNEL);
>> + if (!vsdei)
>> + return;
>> +
>> + spin_lock_init(&vsdei->lock);
>> + vsdei->state.masked = 1;
>> + vsdei->state.critical_num = KVM_SDEI_INVALID_NUM;
>> + vsdei->state.normal_num = KVM_SDEI_INVALID_NUM;
>> + vsdei->critical_event = NULL;
>> + vsdei->normal_event = NULL;
>> + INIT_LIST_HEAD(&vsdei->critical_events);
>> + INIT_LIST_HEAD(&vsdei->normal_events);
>> +
>> + vcpu->arch.sdei = vsdei;
>> +}
>> +
>> +void kvm_sdei_destroy_vcpu(struct kvm_vcpu *vcpu)
>> +{
>> + struct kvm_sdei_vcpu *vsdei = vcpu->arch.sdei;
>> +
>> + if (vsdei) {
>> + spin_lock(&vsdei->lock);
>> + kvm_sdei_remove_vcpu_events(vcpu);
>> + spin_unlock(&vsdei->lock);
>> +
>> + kfree(vsdei);
>> + vcpu->arch.sdei = NULL;
>> + }
>> +}
>> +
>> +void kvm_sdei_destroy_vm(struct kvm *kvm)
>> +{
>> + struct kvm_sdei_kvm *ksdei = kvm->arch.sdei;
>> + unsigned int mask = (1 << SDEI_EVENT_TYPE_PRIVATE) |
>> + (1 << SDEI_EVENT_TYPE_SHARED);
>> +
>> + if (ksdei) {
>> + spin_lock(&ksdei->lock);
>> + kvm_sdei_remove_kvm_events(kvm, mask, true);
>> + kvm_sdei_remove_events(kvm);
>> + spin_unlock(&ksdei->lock);
>> +
>> + kfree(ksdei);
>> + kvm->arch.sdei = NULL;
>> + }
>> +}
>>

Thanks,
Gavin


2022-01-11 09:25:27

by Gavin Shan

[permalink] [raw]
Subject: Re: [PATCH v4 03/21] KVM: arm64: Support SDEI_VERSION hypercall

Hi Eric,

On 11/9/21 11:26 PM, Eric Auger wrote:
> On 8/15/21 2:13 AM, Gavin Shan wrote:
>> This supports SDEI_VERSION hypercall by returning v1.0.0 simply
> s/This supports/Add support/. I think this is the preferred way to start
> the commit msg. Here and elsewhere.

Ok.

>> when the functionality is supported on the VM and vCPU.
> Can you explain when the functionality isn't supported on either? From
> the infra patch I have the impression that an allocation failure is the
> sole cause of lack of support?

Yes, it's the only reason that SDEI isn't supported. I will
mention this in the commit log in the next respin.

>>
>> Signed-off-by: Gavin Shan <[email protected]>
>> ---
>> arch/arm64/kvm/sdei.c | 18 ++++++++++++++++++
>> 1 file changed, 18 insertions(+)
>>
>> diff --git a/arch/arm64/kvm/sdei.c b/arch/arm64/kvm/sdei.c
>> index ab330b74a965..aa9485f076a9 100644
>> --- a/arch/arm64/kvm/sdei.c
>> +++ b/arch/arm64/kvm/sdei.c
>> @@ -70,6 +70,22 @@ static void kvm_sdei_remove_vcpu_events(struct kvm_vcpu *vcpu)
>> }
>> }
>>
>> +static unsigned long kvm_sdei_hypercall_version(struct kvm_vcpu *vcpu)
>> +{
>> + struct kvm *kvm = vcpu->kvm;
>> + struct kvm_sdei_kvm *ksdei = kvm->arch.sdei;
>> + struct kvm_sdei_vcpu *vsdei = vcpu->arch.sdei;
>> + unsigned long ret = SDEI_NOT_SUPPORTED;
> nit: I would remove ret local variable

Ok.

>> +
>> + if (!(ksdei && vsdei))
>> + return ret;
>> +
>> + /* v1.0.0 */
>> + ret = (1UL << SDEI_VERSION_MAJOR_SHIFT);
>> +
>> + return ret;
>> +}
>> +
>> int kvm_sdei_hypercall(struct kvm_vcpu *vcpu)
>> {
>> u32 func = smccc_get_function(vcpu);
>> @@ -78,6 +94,8 @@ int kvm_sdei_hypercall(struct kvm_vcpu *vcpu)
>>
>> switch (func) {
>> case SDEI_1_0_FN_SDEI_VERSION:
>> + ret = kvm_sdei_hypercall_version(vcpu);
>> + break;
>> case SDEI_1_0_FN_SDEI_EVENT_REGISTER:
>> case SDEI_1_0_FN_SDEI_EVENT_ENABLE:
>> case SDEI_1_0_FN_SDEI_EVENT_DISABLE:
>>

Thanks,
Gavin


2022-01-11 09:40:57

by Shannon Zhao

[permalink] [raw]
Subject: Re: [PATCH v4 02/21] KVM: arm64: Add SDEI virtualization infrastructure



On 2021/8/15 8:13, Gavin Shan wrote:
> diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
> index e9a2b8f27792..2f021aa41632 100644
> --- a/arch/arm64/kvm/arm.c
> +++ b/arch/arm64/kvm/arm.c
> @@ -150,6 +150,8 @@ int kvm_arch_init_vm(struct kvm *kvm, unsigned long type)
>
> kvm_vgic_early_init(kvm);
>
> + kvm_sdei_init_vm(kvm);
> +
> /* The maximum number of VCPUs is limited by the host's GIC model */
> kvm->arch.max_vcpus = kvm_arm_default_max_vcpus();
Hi, is it possible to let user space choose whether to enable SDEI,
rather than enabling it by default?

2022-01-11 09:43:45

by Shannon Zhao

[permalink] [raw]
Subject: Re: [PATCH v4 06/21] KVM: arm64: Support SDEI_EVENT_CONTEXT hypercall



On 2021/8/15 8:13, Gavin Shan wrote:
> +static unsigned long kvm_sdei_hypercall_context(struct kvm_vcpu *vcpu)
> +{
> + struct kvm *kvm = vcpu->kvm;
> + struct kvm_sdei_kvm *ksdei = kvm->arch.sdei;
> + struct kvm_sdei_vcpu *vsdei = vcpu->arch.sdei;
> + struct kvm_sdei_vcpu_regs *regs;
> + unsigned long index = smccc_get_arg1(vcpu);
> + unsigned long ret = SDEI_SUCCESS;
> +
> + /* Sanity check */
> + if (!(ksdei && vsdei)) {
> + ret = SDEI_NOT_SUPPORTED;
> + goto out;
> + }
Maybe we could move this common sanity-check code to
kvm_sdei_hypercall to save some lines.

Thanks,
Shannon

2022-01-12 02:19:34

by Gavin Shan

[permalink] [raw]
Subject: Re: [PATCH v4 04/21] KVM: arm64: Support SDEI_EVENT_REGISTER hypercall

Hi Eric,

On 11/9/21 11:50 PM, Eric Auger wrote:
> On 8/15/21 2:13 AM, Gavin Shan wrote:
>> This supports SDEI_EVENT_REGISTER hypercall, which is used by guest
>> to register SDEI events. The SDEI event won't be raised to the guest
>> or specific vCPU until it's registered and enabled explicitly.
>>
>> Only those events that have been exported by KVM can be registered.
>> After the event is registered successfully, the KVM SDEI event (object)
>> is created or updated because the same KVM SDEI event is shared by
> revisit the terminology (KVM SDEI event). The same SDEI registered event
> object is shared by multiple vCPUs if it is a private event.

Yep, I will correct the commit log in the next respin.

>> multiple vCPUs if it's a private event.
>>
>> Signed-off-by: Gavin Shan <[email protected]>
>> ---
>> arch/arm64/kvm/sdei.c | 122 ++++++++++++++++++++++++++++++++++++++++++
>> 1 file changed, 122 insertions(+)
>>
>> diff --git a/arch/arm64/kvm/sdei.c b/arch/arm64/kvm/sdei.c
>> index aa9485f076a9..d3ea3eee154b 100644
>> --- a/arch/arm64/kvm/sdei.c
>> +++ b/arch/arm64/kvm/sdei.c
>> @@ -21,6 +21,20 @@ static struct kvm_sdei_event_state defined_kse[] = {
>> },
>> };
>>
>> +static struct kvm_sdei_event *kvm_sdei_find_event(struct kvm *kvm,
>> + unsigned long num)
>> +{
>> + struct kvm_sdei_kvm *ksdei = kvm->arch.sdei;
>> + struct kvm_sdei_event *kse;
> the 'k' prefix everywhere for your local variable is unneeded.

ok.

>> +
>> + list_for_each_entry(kse, &ksdei->events, link) {
>> + if (kse->state.num == num)
>> + return kse;
>> + }
>> +
>> + return NULL;
>> +}
>> +
>> static void kvm_sdei_remove_events(struct kvm *kvm)
>> {
>> struct kvm_sdei_kvm *ksdei = kvm->arch.sdei;
>> @@ -32,6 +46,20 @@ static void kvm_sdei_remove_events(struct kvm *kvm)
>> }
>> }
>>
>> +static struct kvm_sdei_kvm_event *kvm_sdei_find_kvm_event(struct kvm *kvm,
>> + unsigned long num)
>> +{
>> + struct kvm_sdei_kvm *ksdei = kvm->arch.sdei;
>> + struct kvm_sdei_kvm_event *kske;
>> +
>> + list_for_each_entry(kske, &ksdei->kvm_events, link) {
>> + if (kske->state.num == num)
> I still don't get the diff between the num of an SDEI event vs the num
> of a so-called SDEI kvm event. Event numbers are either static or
> dynamically created using bind ops which you do not support. But to me
> this is a property of the root exposed SDEI event and not a property of
> the registered event. Please could you clarify?

Your understanding is correct. The SDEI events are defined statically,
apart from the bound ones, which we don't support for now. The information
(properties) of one specific SDEI event is scattered across different
objects: the KVM, vCPU and context objects. The SDEI event number (@num)
is the key that ties these scattered pieces of information together.
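
For illustration only, the owner encoding described in the cover letter
(bits[31:24] carry the event properties, bits[23:22] identify the owner,
and KVM-owned events use 0b01) could be checked as sketched below. The
macro and function names are assumptions for this sketch, not the ones
used in the series:

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Sketch of the event-number layout from the cover letter. Names are
 * illustrative; the series uses its own kvm_sdei_* helpers.
 */
#define SDEI_EV_OWNER_SHIFT	22
#define SDEI_EV_OWNER_MASK	(0x3u << SDEI_EV_OWNER_SHIFT)
#define SDEI_EV_OWNER_KVM	(0x1u << SDEI_EV_OWNER_SHIFT)

/* True when the event number falls in the KVM-owned (virtual) range */
static bool sdei_event_is_kvm_owned(uint32_t num)
{
	return (num & SDEI_EV_OWNER_MASK) == SDEI_EV_OWNER_KVM;
}
```

With this layout, a validity check like kvm_sdei_is_valid_event_num()
only needs to test the owner bits before looking up the event list.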

>> + return kske;
>> + }
>> +
>> + return NULL;
>> +}
>> +
>> static void kvm_sdei_remove_kvm_events(struct kvm *kvm,
>> unsigned int mask,
>> bool force)
>> @@ -86,6 +114,98 @@ static unsigned long kvm_sdei_hypercall_version(struct kvm_vcpu *vcpu)
>> return ret;
>> }
>>
>> +static unsigned long kvm_sdei_hypercall_register(struct kvm_vcpu *vcpu)
>> +{
>> + struct kvm *kvm = vcpu->kvm;
>> + struct kvm_sdei_kvm *ksdei = kvm->arch.sdei;
>> + struct kvm_sdei_vcpu *vsdei = vcpu->arch.sdei;
>> + struct kvm_sdei_event *kse = NULL;
>> + struct kvm_sdei_kvm_event *kske = NULL;
>> + unsigned long event_num = smccc_get_arg1(vcpu);
>> + unsigned long event_entry = smccc_get_arg2(vcpu);
>> + unsigned long event_param = smccc_get_arg3(vcpu);
>> + unsigned long route_mode = smccc_get_arg4(vcpu);
>> + unsigned long route_affinity = smccc_get_arg5(vcpu);
>> + int index = vcpu->vcpu_idx;
>> + unsigned long ret = SDEI_SUCCESS;
>> +
>> + /* Sanity check */
>> + if (!(ksdei && vsdei)) {
>> + ret = SDEI_NOT_SUPPORTED;
>> + goto out;
>> + }
>> +
>> + if (!kvm_sdei_is_valid_event_num(event_num)) {
>> + ret = SDEI_INVALID_PARAMETERS;
>> + goto out;
>> + }
>> +
>> + if (!(route_mode == SDEI_EVENT_REGISTER_RM_ANY ||
>> + route_mode == SDEI_EVENT_REGISTER_RM_PE)) {
>> + ret = SDEI_INVALID_PARAMETERS;
>> + goto out;
>> + }
>> +
>> + /*
>> + * The KVM event could have been created if it's a private event.
>> + * We needn't create a KVM event in this case.
> s/create a KVM event/to create another KVM event instance

Ok.

>> + */
>> + spin_lock(&ksdei->lock);
>> + kske = kvm_sdei_find_kvm_event(kvm, event_num);
>> + if (kske) {
>> + kse = kske->kse;
>> + index = (kse->state.type == SDEI_EVENT_TYPE_PRIVATE) ?
>> + vcpu->vcpu_idx : 0;
>> +
>> + if (kvm_sdei_is_registered(kske, index)) {
>> + ret = SDEI_DENIED;
>> + goto unlock;
>> + }
>> +
>> + kske->state.route_mode = route_mode;
>> + kske->state.route_affinity = route_affinity;
>> + kske->state.entries[index] = event_entry;
>> + kske->state.params[index] = event_param;
>> + kvm_sdei_set_registered(kske, index);
>> + goto unlock;
>> + }
>> +
>> + /* Check if the event number has been registered */
>> + kse = kvm_sdei_find_event(kvm, event_num);
> I don't get the comment. find_event looks up for exposed events and not
> registered events, right? So maybe this is the first thing to check, ie.
> the num matches one exposed event.

This should be corrected to:

/* Check if the event has been defined or exposed */

>> + if (!kse) {
>> + ret = SDEI_INVALID_PARAMETERS;
>> + goto unlock;
>> + }
>> +
>> + /* Create KVM event */
>> + kske = kzalloc(sizeof(*kske), GFP_KERNEL);
>> + if (!kske) {
>> + ret = SDEI_OUT_OF_RESOURCE;
>> + goto unlock;
>> + }
>> +
>> + /* Initialize KVM event state */
>> + index = (kse->state.type == SDEI_EVENT_TYPE_PRIVATE) ?
>> + vcpu->vcpu_idx : 0;
>> + kske->state.num = event_num;
>> + kske->state.refcount = 0;
>> + kske->state.route_mode = route_affinity;
>> + kske->state.route_affinity = route_affinity;
>> + kske->state.entries[index] = event_entry;
>> + kske->state.params[index] = event_param;
>> + kvm_sdei_set_registered(kske, index);
>> +
>> + /* Initialize KVM event */
>> + kske->kse = kse;
>> + kske->kvm = kvm;
>> + list_add_tail(&kske->link, &ksdei->kvm_events);
>> +
>> +unlock:
>> + spin_unlock(&ksdei->lock);
>> +out:
>> + return ret;
>> +}
>> +
>> int kvm_sdei_hypercall(struct kvm_vcpu *vcpu)
>> {
>> u32 func = smccc_get_function(vcpu);
>> @@ -97,6 +217,8 @@ int kvm_sdei_hypercall(struct kvm_vcpu *vcpu)
>> ret = kvm_sdei_hypercall_version(vcpu);
>> break;
>> case SDEI_1_0_FN_SDEI_EVENT_REGISTER:
>> + ret = kvm_sdei_hypercall_register(vcpu);
>> + break;
>> case SDEI_1_0_FN_SDEI_EVENT_ENABLE:
>> case SDEI_1_0_FN_SDEI_EVENT_DISABLE:
>> case SDEI_1_0_FN_SDEI_EVENT_CONTEXT:
>>

Thanks,
Gavin


2022-01-12 02:29:30

by Gavin Shan

[permalink] [raw]
Subject: Re: [PATCH v4 05/21] KVM: arm64: Support SDEI_EVENT_{ENABLE, DISABLE} hypercall

Hi Eric,

On 11/10/21 12:02 AM, Eric Auger wrote:
> On 8/15/21 2:13 AM, Gavin Shan wrote:
>> This supports SDEI_EVENT_{ENABLE, DISABLE} hypercall. After SDEI
>> event is registered by guest, it won't be delivered to the guest
>> until it's enabled. On the other hand, the SDEI event won't be
>> raised to the guest or specific vCPU if it's has been disabled
>> on the guest or specific vCPU.
>>
>> Signed-off-by: Gavin Shan <[email protected]>
>> ---
>> arch/arm64/kvm/sdei.c | 68 +++++++++++++++++++++++++++++++++++++++++++
>> 1 file changed, 68 insertions(+)
>>
>> diff --git a/arch/arm64/kvm/sdei.c b/arch/arm64/kvm/sdei.c
>> index d3ea3eee154b..b022ce0a202b 100644
>> --- a/arch/arm64/kvm/sdei.c
>> +++ b/arch/arm64/kvm/sdei.c
>> @@ -206,6 +206,70 @@ static unsigned long kvm_sdei_hypercall_register(struct kvm_vcpu *vcpu)
>> return ret;
>> }
>>
>> +static unsigned long kvm_sdei_hypercall_enable(struct kvm_vcpu *vcpu,
>> + bool enable)
>> +{
>> + struct kvm *kvm = vcpu->kvm;
>> + struct kvm_sdei_kvm *ksdei = kvm->arch.sdei;
>> + struct kvm_sdei_vcpu *vsdei = vcpu->arch.sdei;
>> + struct kvm_sdei_event *kse = NULL;
>> + struct kvm_sdei_kvm_event *kske = NULL;
>> + unsigned long event_num = smccc_get_arg1(vcpu);
>> + int index = 0;
>> + unsigned long ret = SDEI_SUCCESS;
>> +
>> + /* Sanity check */
>> + if (!(ksdei && vsdei)) {
>> + ret = SDEI_NOT_SUPPORTED;
>> + goto out;
>> + }
>> +
>> + if (!kvm_sdei_is_valid_event_num(event_num)) {
> I would rename into is_exposed_event_num()

You recommended kvm_sdei_is_virtual() when you reviewed the following
patch. I think kvm_sdei_is_virtual() is good enough :)

[PATCH v4 02/21] KVM: arm64: Add SDEI virtualization infrastructure

>> + ret = SDEI_INVALID_PARAMETERS;
>> + goto out;
>> + }
>> +
>> + /* Check if the KVM event exists */
>> + spin_lock(&ksdei->lock);
>> + kske = kvm_sdei_find_kvm_event(kvm, event_num);
>> + if (!kske) {
>> + ret = SDEI_INVALID_PARAMETERS;
> should be DENIED according to the spec, ie. nobody registered that event?

Ok.

>> + goto unlock;
>> + }
>> +
>> + /* Check if there is pending events */
> does that match the "handler-unregister-pending state" case mentioned
> in the spec?
>> + if (kske->state.refcount) {
>> + ret = SDEI_PENDING;
> ? not documented in my A spec? DENIED?

Yep, It should be DENIED.

>> + goto unlock;
>> + }
>> +
>> + /* Check if it has been registered */
> isn't duplicate of /* Check if the KVM event exists */ ?

It's not a duplicate check, but the comment here is misleading. I will
correct it to:

/* Check if it has been defined or exposed */

>> + kse = kske->kse;
>> + index = (kse->state.type == SDEI_EVENT_TYPE_PRIVATE) ?
>> + vcpu->vcpu_idx : 0;
>> + if (!kvm_sdei_is_registered(kske, index)) {
>> + ret = SDEI_DENIED;
>> + goto unlock;
>> + }
>> +
>> + /* Verify its enablement state */
>> + if (enable == kvm_sdei_is_enabled(kske, index)) {
> spec says:
> Enabling/disabled an event, which is already enabled/disabled, is
> permitted and has no effect. I guess ret should be OK.

yep, it should be ok.

>> + ret = SDEI_DENIED;
>> + goto unlock;
>> + }
>> +
>> + /* Update enablement state */
>> + if (enable)
>> + kvm_sdei_set_enabled(kske, index);
>> + else
>> + kvm_sdei_clear_enabled(kske, index);
>> +
>> +unlock:
>> + spin_unlock(&ksdei->lock);
>> +out:
>> + return ret;
>> +}
>> +
>> int kvm_sdei_hypercall(struct kvm_vcpu *vcpu)
>> {
>> u32 func = smccc_get_function(vcpu);
>> @@ -220,7 +284,11 @@ int kvm_sdei_hypercall(struct kvm_vcpu *vcpu)
>> ret = kvm_sdei_hypercall_register(vcpu);
>> break;
>> case SDEI_1_0_FN_SDEI_EVENT_ENABLE:
>> + ret = kvm_sdei_hypercall_enable(vcpu, true);
>> + break;
>> case SDEI_1_0_FN_SDEI_EVENT_DISABLE:
>> + ret = kvm_sdei_hypercall_enable(vcpu, false);
>> + break;
>> case SDEI_1_0_FN_SDEI_EVENT_CONTEXT:
>> case SDEI_1_0_FN_SDEI_EVENT_COMPLETE:
>> case SDEI_1_0_FN_SDEI_EVENT_COMPLETE_AND_RESUME:
>>

Thanks,
Gavin


2022-01-12 02:34:11

by Gavin Shan

[permalink] [raw]
Subject: Re: [PATCH v4 06/21] KVM: arm64: Support SDEI_EVENT_CONTEXT hypercall

Hi Eric,

On 11/10/21 7:16 PM, Eric Auger wrote:
> On 8/15/21 2:13 AM, Gavin Shan wrote:
>> This supports SDEI_EVENT_CONTEXT hypercall. It's used by the guest
>> to retrieved the original registers (R0 - R17) in its SDEI event
>> handler. Those registers can be corrupted during the SDEI event
>> delivery.
>>
>> Signed-off-by: Gavin Shan <[email protected]>
>> ---
>> arch/arm64/kvm/sdei.c | 40 ++++++++++++++++++++++++++++++++++++++++
>> 1 file changed, 40 insertions(+)
>>
>> diff --git a/arch/arm64/kvm/sdei.c b/arch/arm64/kvm/sdei.c
>> index b022ce0a202b..b4162efda470 100644
>> --- a/arch/arm64/kvm/sdei.c
>> +++ b/arch/arm64/kvm/sdei.c
>> @@ -270,6 +270,44 @@ static unsigned long kvm_sdei_hypercall_enable(struct kvm_vcpu *vcpu,
>> return ret;
>> }
>>
>> +static unsigned long kvm_sdei_hypercall_context(struct kvm_vcpu *vcpu)
>> +{
>> + struct kvm *kvm = vcpu->kvm;
>> + struct kvm_sdei_kvm *ksdei = kvm->arch.sdei;
>> + struct kvm_sdei_vcpu *vsdei = vcpu->arch.sdei;
>> + struct kvm_sdei_vcpu_regs *regs;
>> + unsigned long index = smccc_get_arg1(vcpu);
> s/index/param_id to match the spec?

Sure, but "reg_id" seems better here, as the parameter indicates the GPR
index that the guest kernel requests to be fetched.

>> + unsigned long ret = SDEI_SUCCESS;
>> +
>> + /* Sanity check */
>> + if (!(ksdei && vsdei)) {
>> + ret = SDEI_NOT_SUPPORTED;
>> + goto out;
>> + }
>> +
>> + if (index > ARRAY_SIZE(vsdei->state.critical_regs.regs)) {
>> + ret = SDEI_INVALID_PARAMETERS;
>> + goto out;
>> + }
> I would move the above after regs = and use regs there (although the
> regs ARRAY_SIZE of both is identical)

Ok.

>> +
>> + /* Check if the pending event exists */
>> + spin_lock(&vsdei->lock);
>> + if (!(vsdei->critical_event || vsdei->normal_event)) {
>> + ret = SDEI_DENIED;
>> + goto unlock;
>> + }
>> +
>> + /* Fetch the requested register */
>> + regs = vsdei->critical_event ? &vsdei->state.critical_regs :
>> + &vsdei->state.normal_regs;
>> + ret = regs->regs[index];
>> +
>> +unlock:
>> + spin_unlock(&vsdei->lock);
>> +out:
>> + return ret;
>> +}
>> +
>> int kvm_sdei_hypercall(struct kvm_vcpu *vcpu)
>> {
>> u32 func = smccc_get_function(vcpu);
>> @@ -290,6 +328,8 @@ int kvm_sdei_hypercall(struct kvm_vcpu *vcpu)
>> ret = kvm_sdei_hypercall_enable(vcpu, false);
>> break;
>> case SDEI_1_0_FN_SDEI_EVENT_CONTEXT:
>> + ret = kvm_sdei_hypercall_context(vcpu);
>> + break;
>> case SDEI_1_0_FN_SDEI_EVENT_COMPLETE:
>> case SDEI_1_0_FN_SDEI_EVENT_COMPLETE_AND_RESUME:
>> case SDEI_1_0_FN_SDEI_EVENT_UNREGISTER:
>>

Thanks,
Gavin


2022-01-12 02:39:18

by Gavin Shan

[permalink] [raw]
Subject: Re: [PATCH v4 07/21] KVM: arm64: Support SDEI_EVENT_UNREGISTER hypercall

Hi Eric,

On 11/10/21 1:05 AM, Eric Auger wrote:
> On 8/15/21 2:13 AM, Gavin Shan wrote:
>> This supports SDEI_EVENT_UNREGISTER hypercall. It's used by the
>> guest to unregister SDEI event. The SDEI event won't be raised to
>> the guest or specific vCPU after it's unregistered successfully.
>> It's notable the SDEI event is disabled automatically on the guest
>> or specific vCPU once it's unregistered successfully.
>>
>> Signed-off-by: Gavin Shan <[email protected]>
>> ---
>> arch/arm64/kvm/sdei.c | 61 +++++++++++++++++++++++++++++++++++++++++++
>> 1 file changed, 61 insertions(+)
>>
>> diff --git a/arch/arm64/kvm/sdei.c b/arch/arm64/kvm/sdei.c
>> index b4162efda470..a3ba69dc91cb 100644
>> --- a/arch/arm64/kvm/sdei.c
>> +++ b/arch/arm64/kvm/sdei.c
>> @@ -308,6 +308,65 @@ static unsigned long kvm_sdei_hypercall_context(struct kvm_vcpu *vcpu)
>> return ret;
>> }
>>
>> +static unsigned long kvm_sdei_hypercall_unregister(struct kvm_vcpu *vcpu)
>> +{
>> + struct kvm *kvm = vcpu->kvm;
>> + struct kvm_sdei_kvm *ksdei = kvm->arch.sdei;
>> + struct kvm_sdei_vcpu *vsdei = vcpu->arch.sdei;
>> + struct kvm_sdei_event *kse = NULL;
>> + struct kvm_sdei_kvm_event *kske = NULL;
>> + unsigned long event_num = smccc_get_arg1(vcpu);
>> + int index = 0;
>> + unsigned long ret = SDEI_SUCCESS;
>> +
>> + /* Sanity check */
>> + if (!(ksdei && vsdei)) {
>> + ret = SDEI_NOT_SUPPORTED;
>> + goto out;
>> + }
>> +
>> + if (!kvm_sdei_is_valid_event_num(event_num)) {
>> + ret = SDEI_INVALID_PARAMETERS;
>> + goto out;
>> + }
>> +
>> + /* Check if the KVM event exists */
>> + spin_lock(&ksdei->lock);
>> + kske = kvm_sdei_find_kvm_event(kvm, event_num);
>> + if (!kske) {
>> + ret = SDEI_INVALID_PARAMETERS;
>> + goto unlock;
>> + }
>> +
>> + /* Check if there is pending events */
>> + if (kske->state.refcount) {
>> + ret = SDEI_PENDING;
> don't you want to record the fact the unregistration is outstanding to
> perform subsequent actions? Otherwise nothing will happen when the
> current executing handlers complete?

It's not necessary. The guest should retry in this case.

>> + goto unlock;
>> + }
>> +
>> + /* Check if it has been registered */
>> + kse = kske->kse;
>> + index = (kse->state.type == SDEI_EVENT_TYPE_PRIVATE) ?
>> + vcpu->vcpu_idx : 0;
> you could have an inline for the above as this is executed in many
> functions. even including the code below.

Ok, it's a good idea.

>> + if (!kvm_sdei_is_registered(kske, index)) {
>> + ret = SDEI_DENIED;
>> + goto unlock;
>> + }
>> +
>> + /* The event is disabled when it's unregistered */
>> + kvm_sdei_clear_enabled(kske, index);
>> + kvm_sdei_clear_registered(kske, index);
>> + if (kvm_sdei_empty_registered(kske)) {
> a refcount mechanism would be cleaner I think.

A refcount doesn't work well here. We need a mapping because a private
SDEI event can be enabled/registered on multiple vCPUs, and we need to
know exactly which vCPUs the private SDEI event is enabled/registered on.
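
To illustrate why a per-vCPU bitmap works where a bare refcount can't,
here is a minimal sketch of registration tracking; the structure and
helper names are hypothetical, not the series' actual kvm_sdei_*
helpers:

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical per-event state: one registration bit per vCPU for a
 * private event (a shared event would only use bit 0). A refcount
 * alone could say "N registrations exist" but not on which vCPUs.
 */
struct sdei_event_track {
	uint64_t registered;	/* bitmap indexed by vcpu_idx */
};

static void sdei_set_registered(struct sdei_event_track *t, int vcpu_idx)
{
	t->registered |= 1ULL << vcpu_idx;
}

static void sdei_clear_registered(struct sdei_event_track *t, int vcpu_idx)
{
	t->registered &= ~(1ULL << vcpu_idx);
}

/* The event object may only be freed once no vCPU holds a registration */
static bool sdei_empty_registered(const struct sdei_event_track *t)
{
	return t->registered == 0;
}
```

This mirrors the unregister path quoted above: each vCPU clears its own
bit, and the object is released only when the whole bitmap is empty.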

>> + list_del(&kske->link);
>> + kfree(kske);
>> + }
>> +
>> +unlock:
>> + spin_unlock(&ksdei->lock);
>> +out:
>> + return ret;
>> +}
>> +
>> int kvm_sdei_hypercall(struct kvm_vcpu *vcpu)
>> {
>> u32 func = smccc_get_function(vcpu);
>> @@ -333,6 +392,8 @@ int kvm_sdei_hypercall(struct kvm_vcpu *vcpu)
>> case SDEI_1_0_FN_SDEI_EVENT_COMPLETE:
>> case SDEI_1_0_FN_SDEI_EVENT_COMPLETE_AND_RESUME:
>> case SDEI_1_0_FN_SDEI_EVENT_UNREGISTER:
>> + ret = kvm_sdei_hypercall_unregister(vcpu);
>> + break;
>> case SDEI_1_0_FN_SDEI_EVENT_STATUS:
>> case SDEI_1_0_FN_SDEI_EVENT_GET_INFO:
>> case SDEI_1_0_FN_SDEI_EVENT_ROUTING_SET:
>>

Thanks,
Gavin


2022-01-12 02:41:18

by Gavin Shan

[permalink] [raw]
Subject: Re: [PATCH v4 08/21] KVM: arm64: Support SDEI_EVENT_STATUS hypercall

Hi Eric,

On 11/10/21 1:12 AM, Eric Auger wrote:
> On 8/15/21 2:13 AM, Gavin Shan wrote:
>> This supports SDEI_EVENT_STATUS hypercall. It's used by the guest
>> to retrieve a bitmap to indicate the SDEI event states, including
>> registration, enablement and delivery state.
>>
>> Signed-off-by: Gavin Shan <[email protected]>
>> ---
>> arch/arm64/kvm/sdei.c | 50 +++++++++++++++++++++++++++++++++++++++++++
>> 1 file changed, 50 insertions(+)
>>
>> diff --git a/arch/arm64/kvm/sdei.c b/arch/arm64/kvm/sdei.c
>> index a3ba69dc91cb..b95b8c4455e1 100644
>> --- a/arch/arm64/kvm/sdei.c
>> +++ b/arch/arm64/kvm/sdei.c
>> @@ -367,6 +367,54 @@ static unsigned long kvm_sdei_hypercall_unregister(struct kvm_vcpu *vcpu)
>> return ret;
>> }
>>
>> +static unsigned long kvm_sdei_hypercall_status(struct kvm_vcpu *vcpu)
>> +{
>> + struct kvm *kvm = vcpu->kvm;
>> + struct kvm_sdei_kvm *ksdei = kvm->arch.sdei;
>> + struct kvm_sdei_vcpu *vsdei = vcpu->arch.sdei;
>> + struct kvm_sdei_event *kse = NULL;
>> + struct kvm_sdei_kvm_event *kske = NULL;
>> + unsigned long event_num = smccc_get_arg1(vcpu);
>> + int index = 0;
>> + unsigned long ret = SDEI_SUCCESS;
>> +
>> + /* Sanity check */
>> + if (!(ksdei && vsdei)) {
>> + ret = SDEI_NOT_SUPPORTED;
>> + goto out;
>> + }
>> +
>> + if (!kvm_sdei_is_valid_event_num(event_num)) {
>> + ret = SDEI_INVALID_PARAMETERS;
>> + goto out;
>> + }
> if we were to support bound events, I do not know if a given even num
> can disapper inbetween that check and the rest of the code, in which
> case a lock would be needed?

For the bound events, it's possible. However, @ksdei->lock can be reused
in that case. Anyway, it's something for the future :)

>> +
>> + /*
>> + * Check if the KVM event exists. None of the flags
>> + * will be set if it doesn't exist.
>> + */
>> + spin_lock(&ksdei->lock);
>> + kske = kvm_sdei_find_kvm_event(kvm, event_num);
>> + if (!kske) {
>> + ret = 0;
>> + goto unlock;
>> + }
>> +
>> + index = (kse->state.type == SDEI_EVENT_TYPE_PRIVATE) ?
>> + vcpu->vcpu_idx : 0;
>> + if (kvm_sdei_is_registered(kske, index))
>> + ret |= (1UL << SDEI_EVENT_STATUS_REGISTERED);
>> + if (kvm_sdei_is_enabled(kske, index))
>> + ret |= (1UL << SDEI_EVENT_STATUS_ENABLED);
>> + if (kske->state.refcount)
>> + ret |= (1UL << SDEI_EVENT_STATUS_RUNNING);
>> +
>> +unlock:
>> + spin_unlock(&ksdei->lock);
>> +out:
>> + return ret;
>> +}
>> +
>> int kvm_sdei_hypercall(struct kvm_vcpu *vcpu)
>> {
>> u32 func = smccc_get_function(vcpu);
>> @@ -395,6 +443,8 @@ int kvm_sdei_hypercall(struct kvm_vcpu *vcpu)
>> ret = kvm_sdei_hypercall_unregister(vcpu);
>> break;
>> case SDEI_1_0_FN_SDEI_EVENT_STATUS:
>> + ret = kvm_sdei_hypercall_status(vcpu);
>> + break;
>> case SDEI_1_0_FN_SDEI_EVENT_GET_INFO:
>> case SDEI_1_0_FN_SDEI_EVENT_ROUTING_SET:
>> case SDEI_1_0_FN_SDEI_PE_MASK:
>>

Thanks,
Gavin


2022-01-12 02:47:24

by Gavin Shan

[permalink] [raw]
Subject: Re: [PATCH v4 09/21] KVM: arm64: Support SDEI_EVENT_GET_INFO hypercall

Hi Eric,

On 11/10/21 1:19 AM, Eric Auger wrote:
> On 8/15/21 2:13 AM, Gavin Shan wrote:
>> This supports SDEI_EVENT_GET_INFO hypercall. It's used by the guest
>> to retrieve various information about the supported (exported) events,
>> including type, signaled, route mode and affinity for the shared
>> events.
>>
>> Signed-off-by: Gavin Shan <[email protected]>
>> ---
>> arch/arm64/kvm/sdei.c | 76 +++++++++++++++++++++++++++++++++++++++++++
>> 1 file changed, 76 insertions(+)
>>
>> diff --git a/arch/arm64/kvm/sdei.c b/arch/arm64/kvm/sdei.c
>> index b95b8c4455e1..5dfa74b093f1 100644
>> --- a/arch/arm64/kvm/sdei.c
>> +++ b/arch/arm64/kvm/sdei.c
>> @@ -415,6 +415,80 @@ static unsigned long kvm_sdei_hypercall_status(struct kvm_vcpu *vcpu)
>> return ret;
>> }
>>
>> +static unsigned long kvm_sdei_hypercall_info(struct kvm_vcpu *vcpu)
>> +{
>> + struct kvm *kvm = vcpu->kvm;
>> + struct kvm_sdei_kvm *ksdei = kvm->arch.sdei;
>> + struct kvm_sdei_vcpu *vsdei = vcpu->arch.sdei;
>> + struct kvm_sdei_event *kse = NULL;
>> + struct kvm_sdei_kvm_event *kske = NULL;
>> + unsigned long event_num = smccc_get_arg1(vcpu);
>> + unsigned long event_info = smccc_get_arg2(vcpu);
>> + unsigned long ret = SDEI_SUCCESS;
>> +
>> + /* Sanity check */
>> + if (!(ksdei && vsdei)) {
>> + ret = SDEI_NOT_SUPPORTED;
>> + goto out;
>> + }
>> +
>> + if (!kvm_sdei_is_valid_event_num(event_num)) {
>> + ret = SDEI_INVALID_PARAMETERS;
>> + goto out;
>> + }
>> +
>> + /*
>> + * Check if the KVM event exists. The event might have been
>> + * registered, we need fetch the information from the registered
> s/fetch/to fetch

Ack.

>> + * event in that case.
>> + */
>> + spin_lock(&ksdei->lock);
>> + kske = kvm_sdei_find_kvm_event(kvm, event_num);
>> + kse = kske ? kske->kse : NULL;
>> + if (!kse) {
>> + kse = kvm_sdei_find_event(kvm, event_num);
>> + if (!kse) {
>> + ret = SDEI_INVALID_PARAMETERS;
> this should have already been covered by !kvm_sdei_is_valid_event_num I
> think (although this latter only checks the since static event num with
> KVM owner mask)

Nope. Strictly speaking, kvm_sdei_find_event() covers the check carried
out by !kvm_sdei_is_valid_event_num(). All the defined (exposed) events
should have a virtual event number :)

>> + goto unlock;
>> + }
>> + }
>> +
>> + /* Retrieve the requested information */
>> + switch (event_info) {
>> + case SDEI_EVENT_INFO_EV_TYPE:
>> + ret = kse->state.type;
>> + break;
>> + case SDEI_EVENT_INFO_EV_SIGNALED:
>> + ret = kse->state.signaled;
>> + break;
>> + case SDEI_EVENT_INFO_EV_PRIORITY:
>> + ret = kse->state.priority;
>> + break;
>> + case SDEI_EVENT_INFO_EV_ROUTING_MODE:
>> + case SDEI_EVENT_INFO_EV_ROUTING_AFF:
>> + if (kse->state.type != SDEI_EVENT_TYPE_SHARED) {
>> + ret = SDEI_INVALID_PARAMETERS;
>> + break;
>> + }
>> +
>> + if (event_info == SDEI_EVENT_INFO_EV_ROUTING_MODE) {
>> + ret = kske ? kske->state.route_mode :
>> + SDEI_EVENT_REGISTER_RM_ANY;
> no, if event is not registered (!kske) DENIED should be returned

I don't think so. According to the specification, there is no DENIED
return value for the GET_INFO hypercall. Either INVALID_PARAMETERS or
NOT_SUPPORTED should be returned from this hypercall :)

>> + } else {
> same here
>> + ret = kske ? kske->state.route_affinity : 0;
>> + }
>> +
>> + break;
>> + default:
>> + ret = SDEI_INVALID_PARAMETERS;
>> + }
>> +
>> +unlock:
>> + spin_unlock(&ksdei->lock);
>> +out:
>> + return ret;
>> +}
>> +
>> int kvm_sdei_hypercall(struct kvm_vcpu *vcpu)
>> {
>> u32 func = smccc_get_function(vcpu);
>> @@ -446,6 +520,8 @@ int kvm_sdei_hypercall(struct kvm_vcpu *vcpu)
>> ret = kvm_sdei_hypercall_status(vcpu);
>> break;
>> case SDEI_1_0_FN_SDEI_EVENT_GET_INFO:
>> + ret = kvm_sdei_hypercall_info(vcpu);
>> + break;
>> case SDEI_1_0_FN_SDEI_EVENT_ROUTING_SET:
>> case SDEI_1_0_FN_SDEI_PE_MASK:
>> case SDEI_1_0_FN_SDEI_PE_UNMASK:
>>

Thanks,
Gavin


2022-01-12 02:54:35

by Gavin Shan

[permalink] [raw]
Subject: Re: [PATCH v4 10/21] KVM: arm64: Support SDEI_EVENT_ROUTING_SET hypercall

Hi Eric,

On 11/10/21 2:47 AM, Eric Auger wrote:
> On 8/15/21 2:13 AM, Gavin Shan wrote:
>> This supports SDEI_EVENT_ROUTING_SET hypercall. It's used by the
>> guest to set route mode and affinity for the registered KVM event.
>> It's only valid for the shared events. It's not allowed to do so
>> when the corresponding event has been raised to the guest.
>>
>> Signed-off-by: Gavin Shan <[email protected]>
>> ---
>> arch/arm64/kvm/sdei.c | 64 +++++++++++++++++++++++++++++++++++++++++++
>> 1 file changed, 64 insertions(+)
>>
>> diff --git a/arch/arm64/kvm/sdei.c b/arch/arm64/kvm/sdei.c
>> index 5dfa74b093f1..458695c2394f 100644
>> --- a/arch/arm64/kvm/sdei.c
>> +++ b/arch/arm64/kvm/sdei.c
>> @@ -489,6 +489,68 @@ static unsigned long kvm_sdei_hypercall_info(struct kvm_vcpu *vcpu)
>> return ret;
>> }
>>
>> +static unsigned long kvm_sdei_hypercall_route(struct kvm_vcpu *vcpu)
>> +{
>> + struct kvm *kvm = vcpu->kvm;
>> + struct kvm_sdei_kvm *ksdei = kvm->arch.sdei;
>> + struct kvm_sdei_vcpu *vsdei = vcpu->arch.sdei;
>> + struct kvm_sdei_event *kse = NULL;
>> + struct kvm_sdei_kvm_event *kske = NULL;
>> + unsigned long event_num = smccc_get_arg1(vcpu);
>> + unsigned long route_mode = smccc_get_arg2(vcpu);
>> + unsigned long route_affinity = smccc_get_arg3(vcpu);
>> + int index = 0;
>> + unsigned long ret = SDEI_SUCCESS;
>> +
>> + /* Sanity check */
>> + if (!(ksdei && vsdei)) {
>> + ret = SDEI_NOT_SUPPORTED;
>> + goto out;
>> + }
>> +
>> + if (!kvm_sdei_is_valid_event_num(event_num)) {
>> + ret = SDEI_INVALID_PARAMETERS;
>> + goto out;
>> + }
>> +
>> + if (!(route_mode == SDEI_EVENT_REGISTER_RM_ANY ||
>> + route_mode == SDEI_EVENT_REGISTER_RM_PE)) {
>> + ret = SDEI_INVALID_PARAMETERS;
>> + goto out;
>> + }
> Some sanity checking on the affinity arg could be made as well according
> to 5.1.2 affinity desc. The fn shall return INVALID_PARAMETER in case
> of invalid affinity.

Yep, you're right. I didn't figure that out. I may put a comment here.
For now, the SDEI client driver in the guest kernel doesn't attempt
to change the routing mode.

/* FIXME: The affinity should be verified */

>> +
>> + /* Check if the KVM event has been registered */
>> + spin_lock(&ksdei->lock);
>> + kske = kvm_sdei_find_kvm_event(kvm, event_num);
>> + if (!kske) {
>> + ret = SDEI_INVALID_PARAMETERS;
>> + goto unlock;
>> + }
>> +
>> + /* Validate KVM event state */
>> + kse = kske->kse;
>> + if (kse->state.type != SDEI_EVENT_TYPE_SHARED) {
>> + ret = SDEI_INVALID_PARAMETERS;
>> + goto unlock;
>> + }
>> +
> Event handler is in a state other than: handler-registered.

They're equivalent as the handler is provided as a parameter when
the event is registered.

>> + if (!kvm_sdei_is_registered(kske, index) ||
>> + kvm_sdei_is_enabled(kske, index) ||
>> + kske->state.refcount) {
> I am not sure about the refcount role here. Does it make sure the state
> is != handler-enabled and running or handler-unregister-pending?
>
> I think we would gain in readibility if we had a helper to check whether
> we are in those states?

@refcount here indicates a pending SDEI event awaiting delivery. In that
case, changing its routing mode is disallowed.

>> + ret = SDEI_DENIED;
>> + goto unlock;
>> + }
>> +
>> + /* Update state */
>> + kske->state.route_mode = route_mode;
>> + kske->state.route_affinity = route_affinity;
>> +
>> +unlock:
>> + spin_unlock(&ksdei->lock);
>> +out:
>> + return ret;
>> +}
>> +
>> int kvm_sdei_hypercall(struct kvm_vcpu *vcpu)
>> {
>> u32 func = smccc_get_function(vcpu);
>> @@ -523,6 +585,8 @@ int kvm_sdei_hypercall(struct kvm_vcpu *vcpu)
>> ret = kvm_sdei_hypercall_info(vcpu);
>> break;
>> case SDEI_1_0_FN_SDEI_EVENT_ROUTING_SET:
>> + ret = kvm_sdei_hypercall_route(vcpu);
>> + break;
>> case SDEI_1_0_FN_SDEI_PE_MASK:
>> case SDEI_1_0_FN_SDEI_PE_UNMASK:
>> case SDEI_1_0_FN_SDEI_INTERRUPT_BIND:
>>

Thanks,
Gavin


2022-01-12 02:58:49

by Gavin Shan

[permalink] [raw]
Subject: Re: [PATCH v4 11/21] KVM: arm64: Support SDEI_PE_{MASK, UNMASK} hypercall

Hi Eric,

On 11/10/21 4:31 AM, Eric Auger wrote:
> On 8/15/21 2:13 AM, Gavin Shan wrote:
>> This supports SDEI_PE_{MASK, UNMASK} hypercall. They are used by
>> the guest to stop the specific vCPU from receiving SDEI events.
>>
>> Signed-off-by: Gavin Shan <[email protected]>
>> ---
>> arch/arm64/kvm/sdei.c | 35 +++++++++++++++++++++++++++++++++++
>> 1 file changed, 35 insertions(+)
>>
>> diff --git a/arch/arm64/kvm/sdei.c b/arch/arm64/kvm/sdei.c
>> index 458695c2394f..3fb33258b494 100644
>> --- a/arch/arm64/kvm/sdei.c
>> +++ b/arch/arm64/kvm/sdei.c
>> @@ -551,6 +551,37 @@ static unsigned long kvm_sdei_hypercall_route(struct kvm_vcpu *vcpu)
>> return ret;
>> }
>>
>> +static unsigned long kvm_sdei_hypercall_mask(struct kvm_vcpu *vcpu,
>> + bool mask)
>> +{
>> + struct kvm *kvm = vcpu->kvm;
>> + struct kvm_sdei_kvm *ksdei = kvm->arch.sdei;
>> + struct kvm_sdei_vcpu *vsdei = vcpu->arch.sdei;
>> + unsigned long ret = SDEI_SUCCESS;
>> +
>> + /* Sanity check */
>> + if (!(ksdei && vsdei)) {
>> + ret = SDEI_NOT_SUPPORTED;
>> + goto out;
>> + }
>> +
>> + spin_lock(&vsdei->lock);
>> +
>> + /* Check the state */
>> + if (mask == vsdei->state.masked) {
>> + ret = SDEI_DENIED;
> are you sure? I don't see this error documented in 5.1.12?
>
> Besides the spec says:
> "
> This call can be invoked by the client to mask the PE, whether or not
> the PE is already masked."

Yep, I think this check can safely be dropped.

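Per the spec wording quoted above, masking an already-masked PE is
permitted, so the handler can store the new state unconditionally. A
minimal sketch under that reading (names assumed, locking omitted; per
Eric's remark below, MASK reports whether this call set the mask rather
than a plain SUCCESS):

```c
#include <stdbool.h>

/* Hypothetical per-vCPU SDEI state; only the mask flag is shown */
struct sdei_vcpu_sketch {
	bool masked;
};

/*
 * PE_MASK is idempotent: no "already in that state" error. It returns
 * 1 when this call set the mask and 0 when the PE was already masked.
 */
static long sdei_pe_mask(struct sdei_vcpu_sketch *s)
{
	long ret = s->masked ? 0 : 1;

	s->masked = true;
	return ret;
}

/* PE_UNMASK simply clears the mask and returns success (0) */
static long sdei_pe_unmask(struct sdei_vcpu_sketch *s)
{
	s->masked = false;
	return 0;
}
```
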
>> + goto unlock;
>> + }
>> +
>> + /* Update the state */
>> + vsdei->state.masked = mask ? 1 : 0;
>> +
>> +unlock:
>> + spin_unlock(&vsdei->lock);
>> +out:
>> + return ret;
> In case of success the returned value is SUCCESS for UNMASK but not for
> MASK (see table in 5.1.12).
>
> By the way I have just noticed there is a more recent of the spec than
> the A:
>
> ARM_DEN0054C
>
> You should update the cover letter and [PATCH v4 02/21] KVM: arm64: Add
> SDEI virtualization infrastructure commit msg
>

Thanks, Eric. You've looked into a newer version of the spec. I will
update the code and the link to the spec accordingly :)

>
>> +}
>> +
>> int kvm_sdei_hypercall(struct kvm_vcpu *vcpu)
>> {
>> u32 func = smccc_get_function(vcpu);
>> @@ -588,7 +619,11 @@ int kvm_sdei_hypercall(struct kvm_vcpu *vcpu)
>> ret = kvm_sdei_hypercall_route(vcpu);
>> break;
>> case SDEI_1_0_FN_SDEI_PE_MASK:
>> + ret = kvm_sdei_hypercall_mask(vcpu, true);
>> + break;
>> case SDEI_1_0_FN_SDEI_PE_UNMASK:
>> + ret = kvm_sdei_hypercall_mask(vcpu, false);
>> + break;
>> case SDEI_1_0_FN_SDEI_INTERRUPT_BIND:
>> case SDEI_1_0_FN_SDEI_INTERRUPT_RELEASE:
>> case SDEI_1_0_FN_SDEI_PRIVATE_RESET:
>>

Thanks,
Gavin


2022-01-12 03:01:39

by Gavin Shan

[permalink] [raw]
Subject: Re: [PATCH v4 12/21] KVM: arm64: Support SDEI_{PRIVATE, SHARED}_RESET hypercall

Hi Eric,

On 11/10/21 4:37 AM, Eric Auger wrote:
> On 8/15/21 2:13 AM, Gavin Shan wrote:
>> This supports SDEI_{PRIVATE, SHARED}_RESET. They are used by the
>> guest to purge the private or shared SDEI events, which are registered
> to reset all private SDEI event registrations of the calling PE (resp.
> PRIVATE or SHARED)

Ok.

>> previously.
>>
>> Signed-off-by: Gavin Shan <[email protected]>
>> ---
>> arch/arm64/kvm/sdei.c | 29 +++++++++++++++++++++++++++++
>> 1 file changed, 29 insertions(+)
>>
>> diff --git a/arch/arm64/kvm/sdei.c b/arch/arm64/kvm/sdei.c
>> index 3fb33258b494..62efee2b67b8 100644
>> --- a/arch/arm64/kvm/sdei.c
>> +++ b/arch/arm64/kvm/sdei.c
>> @@ -582,6 +582,29 @@ static unsigned long kvm_sdei_hypercall_mask(struct kvm_vcpu *vcpu,
>> return ret;
>> }
>>
>> +static unsigned long kvm_sdei_hypercall_reset(struct kvm_vcpu *vcpu,
>> + bool private)
>> +{
>> + struct kvm *kvm = vcpu->kvm;
>> + struct kvm_sdei_kvm *ksdei = kvm->arch.sdei;
>> + struct kvm_sdei_vcpu *vsdei = vcpu->arch.sdei;
>> + unsigned int mask = private ? (1 << SDEI_EVENT_TYPE_PRIVATE) :
>> + (1 << SDEI_EVENT_TYPE_SHARED);
>> + unsigned long ret = SDEI_SUCCESS;
>> +
>> + /* Sanity check */
>> + if (!(ksdei && vsdei)) {
>> + ret = SDEI_NOT_SUPPORTED;
>> + goto out;
>> + }
>> +
>> + spin_lock(&ksdei->lock);
>> + kvm_sdei_remove_kvm_events(kvm, mask, false);
> With the kvm_sdei_remove_kvm_events() implementation, how do you make
> sure that events which have a running handler get unregistered once the
> handler completes? I just see the refcount check that prevents the "KVM
> event object" from being removed.

Good point. I think we need an enhancement here to cancel the pending
events prior to destroying them. I will think about it :)

>> + spin_unlock(&ksdei->lock);
>> +out:
>> + return ret;
>> +}
>> +
>> int kvm_sdei_hypercall(struct kvm_vcpu *vcpu)
>> {
>> u32 func = smccc_get_function(vcpu);
>> @@ -626,8 +649,14 @@ int kvm_sdei_hypercall(struct kvm_vcpu *vcpu)
>> break;
>> case SDEI_1_0_FN_SDEI_INTERRUPT_BIND:
>> case SDEI_1_0_FN_SDEI_INTERRUPT_RELEASE:
>> + ret = SDEI_NOT_SUPPORTED;
>> + break;
>> case SDEI_1_0_FN_SDEI_PRIVATE_RESET:
>> + ret = kvm_sdei_hypercall_reset(vcpu, true);
>> + break;
>> case SDEI_1_0_FN_SDEI_SHARED_RESET:
>> + ret = kvm_sdei_hypercall_reset(vcpu, false);
>> + break;
>> default:
>> ret = SDEI_NOT_SUPPORTED;
>> }
>>

Thanks,
Gavin


2022-01-12 06:35:02

by Gavin Shan

[permalink] [raw]
Subject: Re: [PATCH v4 13/21] KVM: arm64: Impment SDEI event delivery

Hi Eric,

On 11/10/21 6:58 PM, Eric Auger wrote:
> s/Impment/Implement in the commit title
>
> On 8/15/21 2:13 AM, Gavin Shan wrote:
>> This implement kvm_sdei_deliver() to support SDEI event delivery.
>> The function is called when the request (KVM_REQ_SDEI) is raised.
>> The following rules are taken according to the SDEI specification:
>>
>> * x0 - x17 are saved. All of them are cleared except the following
>> registered:
> s/registered/registers
>> x0: number SDEI event to be delivered
> s/number SDEI event/SDEI event number
>> x1: parameter associated with the SDEI event
> user arg?

The commit log will be improved in the next respin.

>> x2: PC of the interrupted context
>> x3: PState of the interrupted context
>>
>> * PC is set to the handler of the SDEI event, which was provided
>> during its registration. PState is modified accordingly.
>>
>> * SDEI event with critical priority can preempt those with normal
>> priority.
>>
>> Signed-off-by: Gavin Shan <[email protected]>
>> ---
>> arch/arm64/include/asm/kvm_host.h | 1 +
>> arch/arm64/include/asm/kvm_sdei.h | 1 +
>> arch/arm64/kvm/arm.c | 3 ++
>> arch/arm64/kvm/sdei.c | 84 +++++++++++++++++++++++++++++++
>> 4 files changed, 89 insertions(+)
>>
>> diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
>> index aedf901e1ec7..46f363aa6524 100644
>> --- a/arch/arm64/include/asm/kvm_host.h
>> +++ b/arch/arm64/include/asm/kvm_host.h
>> @@ -47,6 +47,7 @@
>> #define KVM_REQ_RECORD_STEAL KVM_ARCH_REQ(3)
>> #define KVM_REQ_RELOAD_GICv4 KVM_ARCH_REQ(4)
>> #define KVM_REQ_RELOAD_PMU KVM_ARCH_REQ(5)
>> +#define KVM_REQ_SDEI KVM_ARCH_REQ(6)
>>
>> #define KVM_DIRTY_LOG_MANUAL_CAPS (KVM_DIRTY_LOG_MANUAL_PROTECT_ENABLE | \
>> KVM_DIRTY_LOG_INITIALLY_SET)
>> diff --git a/arch/arm64/include/asm/kvm_sdei.h b/arch/arm64/include/asm/kvm_sdei.h
>> index b0abc13a0256..7f5f5ad689e6 100644
>> --- a/arch/arm64/include/asm/kvm_sdei.h
>> +++ b/arch/arm64/include/asm/kvm_sdei.h
>> @@ -112,6 +112,7 @@ KVM_SDEI_FLAG_FUNC(enabled)
>> void kvm_sdei_init_vm(struct kvm *kvm);
>> void kvm_sdei_create_vcpu(struct kvm_vcpu *vcpu);
>> int kvm_sdei_hypercall(struct kvm_vcpu *vcpu);
>> +void kvm_sdei_deliver(struct kvm_vcpu *vcpu);
>> void kvm_sdei_destroy_vcpu(struct kvm_vcpu *vcpu);
>> void kvm_sdei_destroy_vm(struct kvm *kvm);
>>
>> diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
>> index 2f021aa41632..0c3db1ef1ba9 100644
>> --- a/arch/arm64/kvm/arm.c
>> +++ b/arch/arm64/kvm/arm.c
>> @@ -689,6 +689,9 @@ static void check_vcpu_requests(struct kvm_vcpu *vcpu)
>> if (kvm_check_request(KVM_REQ_VCPU_RESET, vcpu))
>> kvm_reset_vcpu(vcpu);
>>
>> + if (kvm_check_request(KVM_REQ_SDEI, vcpu))
>> + kvm_sdei_deliver(vcpu);
>> +
>> /*
>> * Clear IRQ_PENDING requests that were made to guarantee
>> * that a VCPU sees new virtual interrupts.
>> diff --git a/arch/arm64/kvm/sdei.c b/arch/arm64/kvm/sdei.c
>> index 62efee2b67b8..b5d6d1ed3858 100644
>> --- a/arch/arm64/kvm/sdei.c
>> +++ b/arch/arm64/kvm/sdei.c
>> @@ -671,6 +671,90 @@ int kvm_sdei_hypercall(struct kvm_vcpu *vcpu)
>> return 1;
>> }
>>
>> +void kvm_sdei_deliver(struct kvm_vcpu *vcpu)
>> +{
>> + struct kvm *kvm = vcpu->kvm;
>> + struct kvm_sdei_kvm *ksdei = kvm->arch.sdei;
>> + struct kvm_sdei_vcpu *vsdei = vcpu->arch.sdei;
>> + struct kvm_sdei_event *kse = NULL;
>> + struct kvm_sdei_kvm_event *kske = NULL;
>> + struct kvm_sdei_vcpu_event *ksve = NULL;
>> + struct kvm_sdei_vcpu_regs *regs = NULL;
>> + unsigned long pstate;
>> + int index = 0;
>> +
>> + /* Sanity check */
>> + if (!(ksdei && vsdei))
>> + return;
>> +
>> + /* The critical event can't be preempted */
> move the comment after the spin_lock

Ok.

>> + spin_lock(&vsdei->lock);
>> + if (vsdei->critical_event)
>> + goto unlock;
>> +
>> + /*
>> + * The normal event can be preempted by the critical event.
>> + * However, the normal event can't be preempted by another
>> + * normal event.
>> + */
>> + ksve = list_first_entry_or_null(&vsdei->critical_events,
>> + struct kvm_sdei_vcpu_event, link);
>> + if (!ksve && !vsdei->normal_event) {
>> + ksve = list_first_entry_or_null(&vsdei->normal_events,
>> + struct kvm_sdei_vcpu_event, link);
>> + }
> At this stage of the review the struct kvm_sdei_vcpu_event lifecycle is
> not known.
>

The object (kvm_sdei_vcpu_event) is queued to the target vCPU for dispatch.
Multiple instances of the same SDEI event can be queued to one target vCPU;
in that case, the object is reused.

> From the dispatcher pseudocode I understand you check
>
> ((IsCriticalEvent(E) and !CriticalEventRunning(P, C)) ||
> (!IsCriticalEvent(E) and !EventRunning(P, C)))
>
> but I can't check you take care of
> IsEnabled(E) and
> IsEventTarget(E, P)
> IsUnmasked(P)
>
> Either you should squash with 18/21 or at least you should add comments.

The additional conditions are checked when the event is injected in PATCH[v4 18/21].
I think it's a good idea to squash them.

>> +
>> + if (!ksve)
>> + goto unlock;
>> +
>> + kske = ksve->kske;
>> + kse = kske->kse;
>> + if (kse->state.priority == SDEI_EVENT_PRIORITY_CRITICAL) {
>> + vsdei->critical_event = ksve;
>> + vsdei->state.critical_num = ksve->state.num;
>> + regs = &vsdei->state.critical_regs;
>> + } else {
>> + vsdei->normal_event = ksve;
>> + vsdei->state.normal_num = ksve->state.num;
>> + regs = &vsdei->state.normal_regs;
>> + }
>> +
>> + /* Save registers: x0 -> x17, PC, PState */
>> + for (index = 0; index < ARRAY_SIZE(regs->regs); index++)
>> + regs->regs[index] = vcpu_get_reg(vcpu, index);
>> +
>> + regs->pc = *vcpu_pc(vcpu);
>> + regs->pstate = *vcpu_cpsr(vcpu);
>> +
>> + /*
>> + * Inject SDEI event: x0 -> x3, PC, PState. We needn't take lock
>> + * for the KVM event as it can't be destroyed because of its
>> + * reference count.
>> + */
>> + for (index = 0; index < ARRAY_SIZE(regs->regs); index++)
>> + vcpu_set_reg(vcpu, index, 0);
>> +
>> + index = (kse->state.type == SDEI_EVENT_TYPE_PRIVATE) ?
>> + vcpu->vcpu_idx : 0;
>> + vcpu_set_reg(vcpu, 0, kske->state.num);
>> + vcpu_set_reg(vcpu, 1, kske->state.params[index]);
>> + vcpu_set_reg(vcpu, 2, regs->pc);
>> + vcpu_set_reg(vcpu, 3, regs->pstate);
>> +
>> + pstate = regs->pstate;
>> + pstate |= (PSR_D_BIT | PSR_A_BIT | PSR_I_BIT | PSR_F_BIT);
>> + pstate &= ~PSR_MODE_MASK;
>> + pstate |= PSR_MODE_EL1h;
>> + pstate &= ~PSR_MODE32_BIT;
>> +
>> + vcpu_write_sys_reg(vcpu, regs->pstate, SPSR_EL1);
>> + *vcpu_cpsr(vcpu) = pstate;
>> + *vcpu_pc(vcpu) = kske->state.entries[index];
>> +
>> +unlock:
>> + spin_unlock(&vsdei->lock);
>> +}
>> +
>> void kvm_sdei_init_vm(struct kvm *kvm)
>> {
>> struct kvm_sdei_kvm *ksdei;
>>

Thanks,
Gavin


2022-01-12 06:43:52

by Gavin Shan

[permalink] [raw]
Subject: Re: [PATCH v4 14/21] KVM: arm64: Support SDEI_EVENT_{COMPLETE, COMPLETE_AND_RESUME} hypercall

Hi Eric,

On 11/10/21 6:58 PM, Eric Auger wrote:
> On 8/15/21 2:13 AM, Gavin Shan wrote:
>> This supports SDEI_EVENT_{COMPLETE, COMPLETE_AND_RESUME} hypercall.
>> They are used by the guest to notify the completion of the SDEI
>> event in the handler. The registers are changed according to the
>> SDEI specification as below:
>>
>> * x0 - x17, PC and PState are restored to what values we had in
>> the interrupted context.
>>
>> * If it's SDEI_EVENT_COMPLETE_AND_RESUME hypercall, IRQ exception
>> is injected.
>>
>> Signed-off-by: Gavin Shan <[email protected]>
>> ---
>> arch/arm64/include/asm/kvm_emulate.h | 1 +
>> arch/arm64/include/asm/kvm_host.h | 1 +
>> arch/arm64/kvm/hyp/exception.c | 7 +++
>> arch/arm64/kvm/inject_fault.c | 27 ++++++++++
>> arch/arm64/kvm/sdei.c | 75 ++++++++++++++++++++++++++++
>> 5 files changed, 111 insertions(+)
>>
>> diff --git a/arch/arm64/include/asm/kvm_emulate.h b/arch/arm64/include/asm/kvm_emulate.h
>> index fd418955e31e..923b4d08ea9a 100644
>> --- a/arch/arm64/include/asm/kvm_emulate.h
>> +++ b/arch/arm64/include/asm/kvm_emulate.h
>> @@ -37,6 +37,7 @@ bool kvm_condition_valid32(const struct kvm_vcpu *vcpu);
>> void kvm_skip_instr32(struct kvm_vcpu *vcpu);
>>
>> void kvm_inject_undefined(struct kvm_vcpu *vcpu);
>> +void kvm_inject_irq(struct kvm_vcpu *vcpu);
>> void kvm_inject_vabt(struct kvm_vcpu *vcpu);
>> void kvm_inject_dabt(struct kvm_vcpu *vcpu, unsigned long addr);
>> void kvm_inject_pabt(struct kvm_vcpu *vcpu, unsigned long addr);
>> diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
>> index 46f363aa6524..1824f7e1f9ab 100644
>> --- a/arch/arm64/include/asm/kvm_host.h
>> +++ b/arch/arm64/include/asm/kvm_host.h
>> @@ -437,6 +437,7 @@ struct kvm_vcpu_arch {
>> #define KVM_ARM64_EXCEPT_AA32_UND (0 << 9)
>> #define KVM_ARM64_EXCEPT_AA32_IABT (1 << 9)
>> #define KVM_ARM64_EXCEPT_AA32_DABT (2 << 9)
>> +#define KVM_ARM64_EXCEPT_AA32_IRQ (3 << 9)
>> /* For AArch64: */
>> #define KVM_ARM64_EXCEPT_AA64_ELx_SYNC (0 << 9)
>> #define KVM_ARM64_EXCEPT_AA64_ELx_IRQ (1 << 9)
>> diff --git a/arch/arm64/kvm/hyp/exception.c b/arch/arm64/kvm/hyp/exception.c
>> index 0418399e0a20..ef458207d152 100644
>> --- a/arch/arm64/kvm/hyp/exception.c
>> +++ b/arch/arm64/kvm/hyp/exception.c
>> @@ -310,6 +310,9 @@ static void kvm_inject_exception(struct kvm_vcpu *vcpu)
>> case KVM_ARM64_EXCEPT_AA32_DABT:
>> enter_exception32(vcpu, PSR_AA32_MODE_ABT, 16);
>> break;
>> + case KVM_ARM64_EXCEPT_AA32_IRQ:
>> + enter_exception32(vcpu, PSR_AA32_MODE_IRQ, 4);
>> + break;
>> default:
>> /* Err... */
>> break;
>> @@ -320,6 +323,10 @@ static void kvm_inject_exception(struct kvm_vcpu *vcpu)
>> KVM_ARM64_EXCEPT_AA64_EL1):
>> enter_exception64(vcpu, PSR_MODE_EL1h, except_type_sync);
>> break;
>> + case (KVM_ARM64_EXCEPT_AA64_ELx_IRQ |
>> + KVM_ARM64_EXCEPT_AA64_EL1):
>> + enter_exception64(vcpu, PSR_MODE_EL1h, except_type_irq);
>> + break;
>> default:
>> /*
>> * Only EL1_SYNC makes sense so far, EL2_{SYNC,IRQ}
>> diff --git a/arch/arm64/kvm/inject_fault.c b/arch/arm64/kvm/inject_fault.c
>> index b47df73e98d7..3a8c55867d2f 100644
>> --- a/arch/arm64/kvm/inject_fault.c
>> +++ b/arch/arm64/kvm/inject_fault.c
>> @@ -66,6 +66,13 @@ static void inject_undef64(struct kvm_vcpu *vcpu)
>> vcpu_write_sys_reg(vcpu, esr, ESR_EL1);
>> }
>>
>> +static void inject_irq64(struct kvm_vcpu *vcpu)
>> +{
>> + vcpu->arch.flags |= (KVM_ARM64_EXCEPT_AA64_EL1 |
>> + KVM_ARM64_EXCEPT_AA64_ELx_IRQ |
>> + KVM_ARM64_PENDING_EXCEPTION);
>> +}
>> +
>> #define DFSR_FSC_EXTABT_LPAE 0x10
>> #define DFSR_FSC_EXTABT_nLPAE 0x08
>> #define DFSR_LPAE BIT(9)
>> @@ -77,6 +84,12 @@ static void inject_undef32(struct kvm_vcpu *vcpu)
>> KVM_ARM64_PENDING_EXCEPTION);
>> }
>>
>> +static void inject_irq32(struct kvm_vcpu *vcpu)
>> +{
>> + vcpu->arch.flags |= (KVM_ARM64_EXCEPT_AA32_IRQ |
>> + KVM_ARM64_PENDING_EXCEPTION);
>> +}
>> +
>> /*
>> * Modelled after TakeDataAbortException() and TakePrefetchAbortException
>> * pseudocode.
>> @@ -160,6 +173,20 @@ void kvm_inject_undefined(struct kvm_vcpu *vcpu)
>> inject_undef64(vcpu);
>> }
>>
>> +/**
>> + * kvm_inject_irq - inject an IRQ into the guest
>> + *
>> + * It is assumed that this code is called from the VCPU thread and that the
>> + * VCPU therefore is not currently executing guest code.
>> + */
>> +void kvm_inject_irq(struct kvm_vcpu *vcpu)
>> +{
>> + if (vcpu_el1_is_32bit(vcpu))
>> + inject_irq32(vcpu);
>> + else
>> + inject_irq64(vcpu);
>> +}
>> +
>> void kvm_set_sei_esr(struct kvm_vcpu *vcpu, u64 esr)
>> {
>> vcpu_set_vsesr(vcpu, esr & ESR_ELx_ISS_MASK);
>> diff --git a/arch/arm64/kvm/sdei.c b/arch/arm64/kvm/sdei.c
>> index b5d6d1ed3858..1e8e213c9d70 100644
>> --- a/arch/arm64/kvm/sdei.c
>> +++ b/arch/arm64/kvm/sdei.c
>> @@ -308,6 +308,75 @@ static unsigned long kvm_sdei_hypercall_context(struct kvm_vcpu *vcpu)
>> return ret;
>> }
>>
>> +static unsigned long kvm_sdei_hypercall_complete(struct kvm_vcpu *vcpu,
>> + bool resume)
>> +{
>> + struct kvm *kvm = vcpu->kvm;
>> + struct kvm_sdei_kvm *ksdei = kvm->arch.sdei;
>> + struct kvm_sdei_vcpu *vsdei = vcpu->arch.sdei;
>> + struct kvm_sdei_kvm_event *kske = NULL;
>> + struct kvm_sdei_vcpu_event *ksve = NULL;
>> + struct kvm_sdei_vcpu_regs *regs;
>> + unsigned long ret = SDEI_SUCCESS;
> for the RESUME you never seem to read resume_addr arg? How does it work?
> I don't get the irq injection path. Please could you explain?

The guest kernel uses the COMPLETE and COMPLETE_AND_RESUME hypercalls to notify
KVM that the SDEI event has been acknowledged. The difference between them is
that COMPLETE_AND_RESUME fires the pending interrupts, but COMPLETE doesn't.

>> + int index;
>> +
>> + /* Sanity check */
>> + if (!(ksdei && vsdei)) {
>> + ret = SDEI_NOT_SUPPORTED;
>> + goto out;
>> + }
>> +
>> + spin_lock(&vsdei->lock);
>> + if (vsdei->critical_event) {
>> + ksve = vsdei->critical_event;
>> + regs = &vsdei->state.critical_regs;
>> + vsdei->critical_event = NULL;
>> + vsdei->state.critical_num = KVM_SDEI_INVALID_NUM;
>> + } else if (vsdei->normal_event) {
>> + ksve = vsdei->normal_event;
>> + regs = &vsdei->state.normal_regs;
>> + vsdei->normal_event = NULL;
>> + vsdei->state.normal_num = KVM_SDEI_INVALID_NUM;
>> + } else {
>> + ret = SDEI_DENIED;
>> + goto unlock;
>> + }
>> +
>> + /* Restore registers: x0 -> x17, PC, PState */
>> + for (index = 0; index < ARRAY_SIZE(regs->regs); index++)
>> + vcpu_set_reg(vcpu, index, regs->regs[index]);
>> +
>> + *vcpu_cpsr(vcpu) = regs->pstate;
>> + *vcpu_pc(vcpu) = regs->pc;
>> +
>> + /* Inject interrupt if needed */
>> + if (resume)
>> + kvm_inject_irq(vcpu);
>> +
>> + /*
>> + * Update state. We needn't take lock in order to update the KVM
>> + * event state as it's not destroyed because of the reference
>> + * count.
>> + */
>> + kske = ksve->kske;
>> + ksve->state.refcount--;
>> + kske->state.refcount--;
> why double --?

Each time an SDEI event is queued for delivery, both reference counts are
increased. I guess it's a bit confusing. I will change it in next revision:

ksve->state.refcount: Increased each time the SDEI event is queued for delivery
kske->state.refcount: Increased each time a @ksve is created


>> + if (!ksve->state.refcount) {
> why not using a struct kref directly?

The reason is that kref isn't friendly to userspace. This field (@refcount)
needs to be migrated :)

>> + list_del(&ksve->link);
>> + kfree(ksve);
>> + }
>> +
>> + /* Make another request if there is pending event */
>> + if (!(list_empty(&vsdei->critical_events) &&
>> + list_empty(&vsdei->normal_events)))
>> + kvm_make_request(KVM_REQ_SDEI, vcpu);
>> +
>> +unlock:
>> + spin_unlock(&vsdei->lock);
>> +out:
>> + return ret;
>> +}
>> +
>> static unsigned long kvm_sdei_hypercall_unregister(struct kvm_vcpu *vcpu)
>> {
>> struct kvm *kvm = vcpu->kvm;
>> @@ -628,7 +697,13 @@ int kvm_sdei_hypercall(struct kvm_vcpu *vcpu)
>> ret = kvm_sdei_hypercall_context(vcpu);
>> break;
>> case SDEI_1_0_FN_SDEI_EVENT_COMPLETE:
>> + has_result = false;
>> + ret = kvm_sdei_hypercall_complete(vcpu, false);
>> + break;
>> case SDEI_1_0_FN_SDEI_EVENT_COMPLETE_AND_RESUME:
>> + has_result = false;
>> + ret = kvm_sdei_hypercall_complete(vcpu, true);
>> + break;
>> case SDEI_1_0_FN_SDEI_EVENT_UNREGISTER:
>> ret = kvm_sdei_hypercall_unregister(vcpu);
>> break;
>>

Thanks,
Gavin


2022-01-12 06:48:53

by Gavin Shan

[permalink] [raw]
Subject: Re: [PATCH v4 15/21] KVM: arm64: Support SDEI event notifier

Hi Eric,

On 11/10/21 7:35 PM, Eric Auger wrote:
> On 8/15/21 2:13 AM, Gavin Shan wrote:
>> The owner of the SDEI event, like asynchronous page fault, need
> owner is not a terminology used in the SDEI spec
>> know the state of injected SDEI event. This supports SDEI event
> s/need know the state of injected/to know the state of the injected
>> state updating by introducing notifier mechanism. It's notable
> a notifier mechanism
>> the notifier (handler) should be capable of migration.
> I don't understand the last sentence

Thanks, Eric. The commit log will be improved accordingly in the next
revision.

>>
>> Signed-off-by: Gavin Shan <[email protected]>
>> ---
>> arch/arm64/include/asm/kvm_sdei.h | 12 +++++++
>> arch/arm64/include/uapi/asm/kvm_sdei.h | 1 +
>> arch/arm64/kvm/sdei.c | 45 +++++++++++++++++++++++++-
>> 3 files changed, 57 insertions(+), 1 deletion(-)
>>
>> diff --git a/arch/arm64/include/asm/kvm_sdei.h b/arch/arm64/include/asm/kvm_sdei.h
>> index 7f5f5ad689e6..19f2d9b91f85 100644
>> --- a/arch/arm64/include/asm/kvm_sdei.h
>> +++ b/arch/arm64/include/asm/kvm_sdei.h
>> @@ -16,6 +16,16 @@
>> #include <linux/list.h>
>> #include <linux/spinlock.h>
>>
>> +struct kvm_vcpu;
>> +
>> +typedef void (*kvm_sdei_notifier)(struct kvm_vcpu *vcpu,
>> + unsigned long num,
>> + unsigned int state);
>> +enum {
>> + KVM_SDEI_NOTIFY_DELIVERED,
>> + KVM_SDEI_NOTIFY_COMPLETED,
>> +};
>> +
>> struct kvm_sdei_event {
>> struct kvm_sdei_event_state state;
>> struct kvm *kvm;
>> @@ -112,6 +122,8 @@ KVM_SDEI_FLAG_FUNC(enabled)
>> void kvm_sdei_init_vm(struct kvm *kvm);
>> void kvm_sdei_create_vcpu(struct kvm_vcpu *vcpu);
>> int kvm_sdei_hypercall(struct kvm_vcpu *vcpu);
>> +int kvm_sdei_register_notifier(struct kvm *kvm, unsigned long num,
>> + kvm_sdei_notifier notifier);
>> void kvm_sdei_deliver(struct kvm_vcpu *vcpu);
>> void kvm_sdei_destroy_vcpu(struct kvm_vcpu *vcpu);
>> void kvm_sdei_destroy_vm(struct kvm *kvm);
>> diff --git a/arch/arm64/include/uapi/asm/kvm_sdei.h b/arch/arm64/include/uapi/asm/kvm_sdei.h
>> index 8928027023f6..4ef661d106fe 100644
>> --- a/arch/arm64/include/uapi/asm/kvm_sdei.h
>> +++ b/arch/arm64/include/uapi/asm/kvm_sdei.h
>> @@ -23,6 +23,7 @@ struct kvm_sdei_event_state {
>> __u8 type;
>> __u8 signaled;
>> __u8 priority;
>> + __u64 notifier;
> why is the notifier attached to the exposed event and not to the
> registered or even vcpu event? This needs to be motivated.
>
> Also as commented earlier I really think we first need to agree on the
> uapi and get a consensus on it as it must be right on the 1st shot. In
> that prospect maybe introduce a patch dedicated to the uapi and document
> it properly, including the way the end user is supposed to use it.
>
> Another way to proceed would be to not support migration at the moment,
> mature the API and then introduce migration support later. Would it make
> sense? For instance, in the past in-kernel ITS emulation was first
> introduced without migration support.
>

You're correct that @notifier needs to be migrated. I prefer to drop the
migration support at first, and then add the support when the APIs become
mature. However, the only user of SDEI would be async PF, which is used in
the migration scenario. So I will think about how to reorganize the code and
have a separate patch for the UAPI stuff, including the documentation.


>> };
>>
>> struct kvm_sdei_kvm_event_state {
>> diff --git a/arch/arm64/kvm/sdei.c b/arch/arm64/kvm/sdei.c
>> index 1e8e213c9d70..5f7a37dcaa77 100644
>> --- a/arch/arm64/kvm/sdei.c
>> +++ b/arch/arm64/kvm/sdei.c
>> @@ -314,9 +314,11 @@ static unsigned long kvm_sdei_hypercall_complete(struct kvm_vcpu *vcpu,
>> struct kvm *kvm = vcpu->kvm;
>> struct kvm_sdei_kvm *ksdei = kvm->arch.sdei;
>> struct kvm_sdei_vcpu *vsdei = vcpu->arch.sdei;
>> + struct kvm_sdei_event *kse = NULL;
>> struct kvm_sdei_kvm_event *kske = NULL;
>> struct kvm_sdei_vcpu_event *ksve = NULL;
>> struct kvm_sdei_vcpu_regs *regs;
>> + kvm_sdei_notifier notifier;
>> unsigned long ret = SDEI_SUCCESS;
>> int index;
>>
>> @@ -349,6 +351,13 @@ static unsigned long kvm_sdei_hypercall_complete(struct kvm_vcpu *vcpu,
>> *vcpu_cpsr(vcpu) = regs->pstate;
>> *vcpu_pc(vcpu) = regs->pc;
>>
>> + /* Notifier */
>> + kske = ksve->kske;
>> + kse = kske->kse;
>> + notifier = (kvm_sdei_notifier)(kse->state.notifier);
>> + if (notifier)
>> + notifier(vcpu, kse->state.num, KVM_SDEI_NOTIFY_COMPLETED);
>> +
>> /* Inject interrupt if needed */
>> if (resume)
>> kvm_inject_irq(vcpu);
>> @@ -358,7 +367,6 @@ static unsigned long kvm_sdei_hypercall_complete(struct kvm_vcpu *vcpu,
>> * event state as it's not destroyed because of the reference
>> * count.
>> */
>> - kske = ksve->kske;
>> ksve->state.refcount--;
>> kske->state.refcount--;
>> if (!ksve->state.refcount) {
>> @@ -746,6 +754,35 @@ int kvm_sdei_hypercall(struct kvm_vcpu *vcpu)
>> return 1;
>> }
>>
>> +int kvm_sdei_register_notifier(struct kvm *kvm,
>> + unsigned long num,
>> + kvm_sdei_notifier notifier)
>> +{
>> + struct kvm_sdei_kvm *ksdei = kvm->arch.sdei;
>> + struct kvm_sdei_event *kse = NULL;
>> + int ret = 0;
>> +
>> + if (!ksdei) {
>> + ret = -EPERM;
>> + goto out;
>> + }
>> +
>> + spin_lock(&ksdei->lock);
>> +
>> + kse = kvm_sdei_find_event(kvm, num);
>> + if (!kse) {
>> + ret = -EINVAL;
>> + goto unlock;
>> + }
>> +
>> + kse->state.notifier = (unsigned long)notifier;
>> +
>> +unlock:
>> + spin_unlock(&ksdei->lock);
>> +out:
>> + return ret;
>> +}
>> +
>> void kvm_sdei_deliver(struct kvm_vcpu *vcpu)
>> {
>> struct kvm *kvm = vcpu->kvm;
>> @@ -755,6 +792,7 @@ void kvm_sdei_deliver(struct kvm_vcpu *vcpu)
>> struct kvm_sdei_kvm_event *kske = NULL;
>> struct kvm_sdei_vcpu_event *ksve = NULL;
>> struct kvm_sdei_vcpu_regs *regs = NULL;
>> + kvm_sdei_notifier notifier;
>> unsigned long pstate;
>> int index = 0;
>>
>> @@ -826,6 +864,11 @@ void kvm_sdei_deliver(struct kvm_vcpu *vcpu)
>> *vcpu_cpsr(vcpu) = pstate;
>> *vcpu_pc(vcpu) = kske->state.entries[index];
>>
>> + /* Notifier */
>> + notifier = (kvm_sdei_notifier)(kse->state.notifier);
>> + if (notifier)
>> + notifier(vcpu, kse->state.num, KVM_SDEI_NOTIFY_DELIVERED);
>> +
>> unlock:
>> spin_unlock(&vsdei->lock);
>> }
>>
>

Thanks,
Gavin


2022-01-12 07:03:40

by Gavin Shan

[permalink] [raw]
Subject: Re: [PATCH v4 16/21] KVM: arm64: Support SDEI ioctl commands on VM

Hi Eric,

On 11/10/21 9:48 PM, Eric Auger wrote:
> On 8/15/21 2:13 AM, Gavin Shan wrote:
>> This supports ioctl commands on VM to manage the various objects.
>> It's primarily used by VMM to accomplish live migration. The ioctl
>> commands introduced by this are highlighted as blow:
> below
>>
>> * KVM_SDEI_CMD_GET_VERSION
>> Retrieve the version of current implementation
> which implementation, SDEI?
>> * KVM_SDEI_CMD_SET_EVENT
>> Add event to be exported from KVM so that guest can register
>> against it afterwards
>> * KVM_SDEI_CMD_GET_KEVENT_COUNT
>> Retrieve number of registered SDEI events
>> * KVM_SDEI_CMD_GET_KEVENT
>> Retrieve the state of the registered SDEI event
>> * KVM_SDEI_CMD_SET_KEVENT
>> Populate the registered SDEI event
> I think we really miss the full picture of what you want to achieve with
> those IOCTLs or at least I fail to get it. Please document the UAPI
> separately including the structs and IOCTLs.

The commit log will be improved accordingly in the next revision. Yep, I
will add documentation for the UAPI and IOCTLs :)

>>
>> Signed-off-by: Gavin Shan <[email protected]>
>> ---
>> arch/arm64/include/asm/kvm_sdei.h | 1 +
>> arch/arm64/include/uapi/asm/kvm_sdei.h | 17 +++
>> arch/arm64/kvm/arm.c | 3 +
>> arch/arm64/kvm/sdei.c | 171 +++++++++++++++++++++++++
>> include/uapi/linux/kvm.h | 3 +
>> 5 files changed, 195 insertions(+)
>>
>> diff --git a/arch/arm64/include/asm/kvm_sdei.h b/arch/arm64/include/asm/kvm_sdei.h
>> index 19f2d9b91f85..8f5ea947ed0e 100644
>> --- a/arch/arm64/include/asm/kvm_sdei.h
>> +++ b/arch/arm64/include/asm/kvm_sdei.h
>> @@ -125,6 +125,7 @@ int kvm_sdei_hypercall(struct kvm_vcpu *vcpu);
>> int kvm_sdei_register_notifier(struct kvm *kvm, unsigned long num,
>> kvm_sdei_notifier notifier);
>> void kvm_sdei_deliver(struct kvm_vcpu *vcpu);
>> +long kvm_sdei_vm_ioctl(struct kvm *kvm, unsigned long arg);
>> void kvm_sdei_destroy_vcpu(struct kvm_vcpu *vcpu);
>> void kvm_sdei_destroy_vm(struct kvm *kvm);
>>
>> diff --git a/arch/arm64/include/uapi/asm/kvm_sdei.h b/arch/arm64/include/uapi/asm/kvm_sdei.h
>> index 4ef661d106fe..35ff05be3c28 100644
>> --- a/arch/arm64/include/uapi/asm/kvm_sdei.h
>> +++ b/arch/arm64/include/uapi/asm/kvm_sdei.h
>> @@ -57,5 +57,22 @@ struct kvm_sdei_vcpu_state {
>> struct kvm_sdei_vcpu_regs normal_regs;
>> };
>>
>> +#define KVM_SDEI_CMD_GET_VERSION 0
>> +#define KVM_SDEI_CMD_SET_EVENT 1
>> +#define KVM_SDEI_CMD_GET_KEVENT_COUNT 2
>> +#define KVM_SDEI_CMD_GET_KEVENT 3
>> +#define KVM_SDEI_CMD_SET_KEVENT 4
>> +
>> +struct kvm_sdei_cmd {
>> + __u32 cmd;
>> + union {
>> + __u32 version;
>> + __u32 count;
>> + __u64 num;
>> + struct kvm_sdei_event_state kse_state;
>> + struct kvm_sdei_kvm_event_state kske_state;
>> + };
>> +};
>> +
>> #endif /* !__ASSEMBLY__ */
>> #endif /* _UAPI__ASM_KVM_SDEI_H */
>> diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
>> index 0c3db1ef1ba9..8d61585124b2 100644
>> --- a/arch/arm64/kvm/arm.c
>> +++ b/arch/arm64/kvm/arm.c
>> @@ -1389,6 +1389,9 @@ long kvm_arch_vm_ioctl(struct file *filp,
>> return -EFAULT;
>> return kvm_vm_ioctl_mte_copy_tags(kvm, &copy_tags);
>> }
>> + case KVM_ARM_SDEI_COMMAND: {
>> + return kvm_sdei_vm_ioctl(kvm, arg);
>> + }
>> default:
>> return -EINVAL;
>> }
>> diff --git a/arch/arm64/kvm/sdei.c b/arch/arm64/kvm/sdei.c
>> index 5f7a37dcaa77..bdd76c3e5153 100644
>> --- a/arch/arm64/kvm/sdei.c
>> +++ b/arch/arm64/kvm/sdei.c
>> @@ -931,6 +931,177 @@ void kvm_sdei_create_vcpu(struct kvm_vcpu *vcpu)
>> vcpu->arch.sdei = vsdei;
>> }
>>
>> +static long kvm_sdei_set_event(struct kvm *kvm,
>> + struct kvm_sdei_event_state *kse_state)
>> +{
>> + struct kvm_sdei_kvm *ksdei = kvm->arch.sdei;
>> + struct kvm_sdei_event *kse = NULL;
>> +
>> + if (!kvm_sdei_is_valid_event_num(kse_state->num))
>> + return -EINVAL;
>> +
>> + if (!(kse_state->type == SDEI_EVENT_TYPE_SHARED ||
>> + kse_state->type == SDEI_EVENT_TYPE_PRIVATE))
>> + return -EINVAL;
>> +
>> + if (!(kse_state->priority == SDEI_EVENT_PRIORITY_NORMAL ||
>> + kse_state->priority == SDEI_EVENT_PRIORITY_CRITICAL))
>> + return -EINVAL;
>> +
>> + kse = kvm_sdei_find_event(kvm, kse_state->num);
>> + if (kse)
>> + return -EEXIST;
>> +
>> + kse = kzalloc(sizeof(*kse), GFP_KERNEL);
>> + if (!kse)
>> + return -ENOMEM;
> userspace can exhaust the mem since there is no limit. There must be a max.
>

Ok. I think it's a minor corner case. For now, only one SDEI event is
defined. I will leave it as something to be improved in the future.

>> +
>> + kse->state = *kse_state;
>> + kse->kvm = kvm;
>> + list_add_tail(&kse->link, &ksdei->events);
>> +
>> + return 0;
>> +}
>> +
>> +static long kvm_sdei_get_kevent_count(struct kvm *kvm, int *count)
>> +{
>> + struct kvm_sdei_kvm *ksdei = kvm->arch.sdei;
>> + struct kvm_sdei_kvm_event *kske = NULL;
>> + int total = 0;
>> +
>> + list_for_each_entry(kske, &ksdei->kvm_events, link) {
>> + total++;
>> + }
>> +
>> + *count = total;
>> + return 0;
>> +}
>> +
>> +static long kvm_sdei_get_kevent(struct kvm *kvm,
>> + struct kvm_sdei_kvm_event_state *kske_state)
>> +{
>> + struct kvm_sdei_kvm *ksdei = kvm->arch.sdei;
>> + struct kvm_sdei_kvm_event *kske = NULL;
>> +
>> + /*
>> + * The first entry is fetched if the event number is invalid.
>> + * Otherwise, the next entry is fetched.
> why don't we return an error? What is the point of returning the next entry?

The SDEI events attached to the KVM instance are migrated one by one. Those
attached SDEI events are linked through a linked list:

(1) On !kvm_sdei_is_valid_event_num(kske_state->num), the first SDEI event
in the linked list is retrieved from the source VM and will be restored on
the destination VM.

(2) Otherwise, the next SDEI event in the linked list will be retrieved
from the source VM and restored on the destination VM.

Another option is to introduce an additional struct like below. In this way,
all the attached SDEI events are retrieved and restored at once, and the
memory block used for storing the @kvm_sdei_kvm_event_state array would be
allocated and released by QEMU. Please let me know your preference:

struct xxx {
	__u64 count;
	struct kvm_sdei_kvm_event_state events[];
};

>> + */
>> + if (!kvm_sdei_is_valid_event_num(kske_state->num)) {
>> + kske = list_first_entry_or_null(&ksdei->kvm_events,
>> + struct kvm_sdei_kvm_event, link);
>> + } else {
>> + kske = kvm_sdei_find_kvm_event(kvm, kske_state->num);
>> + if (kske && !list_is_last(&kske->link, &ksdei->kvm_events))
>> + kske = list_next_entry(kske, link);
> Sorry I don't get why we return the next one?

Please refer to the explanation above.

>> + else
>> + kske = NULL;
>> + }
>> +
>> + if (!kske)
>> + return -ENOENT;
>> +
>> + *kske_state = kske->state;
>> +
>> + return 0;
>> +}
>> +
>> +static long kvm_sdei_set_kevent(struct kvm *kvm,
>> + struct kvm_sdei_kvm_event_state *kske_state)
>> +{
>> + struct kvm_sdei_kvm *ksdei = kvm->arch.sdei;
>> + struct kvm_sdei_event *kse = NULL;
>> + struct kvm_sdei_kvm_event *kske = NULL;
>> +
>> + /* Sanity check */
>> + if (!kvm_sdei_is_valid_event_num(kske_state->num))
>> + return -EINVAL;
>> +
>> + if (!(kske_state->route_mode == SDEI_EVENT_REGISTER_RM_ANY ||
>> + kske_state->route_mode == SDEI_EVENT_REGISTER_RM_PE))
>> + return -EINVAL;
>> +
>> + /* Check if the event number is valid */
>> + kse = kvm_sdei_find_event(kvm, kske_state->num);
>> + if (!kse)
>> + return -ENOENT;
>> +
>> + /* Check if the event has been populated */
>> + kske = kvm_sdei_find_kvm_event(kvm, kske_state->num);
>> + if (kske)
>> + return -EEXIST;
>> +
>> + kske = kzalloc(sizeof(*kske), GFP_KERNEL);
> userspace can exhaust the mem since there is no limit

Ok.

>> + if (!kske)
>> + return -ENOMEM;
>> +
>> + kske->state = *kske_state;
>> + kske->kse = kse;
>> + kske->kvm = kvm;
>> + list_add_tail(&kske->link, &ksdei->kvm_events);
>> +
>> + return 0;
>> +}
>> +
>> +long kvm_sdei_vm_ioctl(struct kvm *kvm, unsigned long arg)
>> +{
>> + struct kvm_sdei_kvm *ksdei = kvm->arch.sdei;
>> + struct kvm_sdei_cmd *cmd = NULL;
>> + void __user *argp = (void __user *)arg;
>> + bool copy = false;
>> + long ret = 0;
>> +
>> + /* Sanity check */
>> + if (!ksdei) {
>> + ret = -EPERM;
>> + goto out;
>> + }
>> +
>> + cmd = kzalloc(sizeof(*cmd), GFP_KERNEL);
>> + if (!cmd) {
>> + ret = -ENOMEM;
>> + goto out;
>> + }
>> +
>> + if (copy_from_user(cmd, argp, sizeof(*cmd))) {
>> + ret = -EFAULT;
>> + goto out;
>> + }
>> +
>> + spin_lock(&ksdei->lock);
>> +
>> + switch (cmd->cmd) {
>> + case KVM_SDEI_CMD_GET_VERSION:
>> + copy = true;
>> + cmd->version = (1 << 16); /* v1.0.0 */
>> + break;
>> + case KVM_SDEI_CMD_SET_EVENT:
>> + ret = kvm_sdei_set_event(kvm, &cmd->kse_state);
>> + break;
>> + case KVM_SDEI_CMD_GET_KEVENT_COUNT:
>> + copy = true;
>> + ret = kvm_sdei_get_kevent_count(kvm, &cmd->count);
>> + break;
>> + case KVM_SDEI_CMD_GET_KEVENT:
>> + copy = true;
>> + ret = kvm_sdei_get_kevent(kvm, &cmd->kske_state);
>> + break;
>> + case KVM_SDEI_CMD_SET_KEVENT:
>> + ret = kvm_sdei_set_kevent(kvm, &cmd->kske_state);
>> + break;
>> + default:
>> + ret = -EINVAL;
>> + }
>> +
>> + spin_unlock(&ksdei->lock);
>> +out:
>> + if (!ret && copy && copy_to_user(argp, cmd, sizeof(*cmd)))
>> + ret = -EFAULT;
>> +
>> + kfree(cmd);
>> + return ret;
>> +}
>> +
>> void kvm_sdei_destroy_vcpu(struct kvm_vcpu *vcpu)
>> {
>> struct kvm_sdei_vcpu *vsdei = vcpu->arch.sdei;
>> diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
>> index d9e4aabcb31a..8cf41fd4bf86 100644
>> --- a/include/uapi/linux/kvm.h
>> +++ b/include/uapi/linux/kvm.h
>> @@ -1679,6 +1679,9 @@ struct kvm_xen_vcpu_attr {
>> #define KVM_XEN_VCPU_ATTR_TYPE_RUNSTATE_DATA 0x4
>> #define KVM_XEN_VCPU_ATTR_TYPE_RUNSTATE_ADJUST 0x5
>>
>> +/* Available with KVM_CAP_ARM_SDEI */
>> +#define KVM_ARM_SDEI_COMMAND _IOWR(KVMIO, 0xce, struct kvm_sdei_cmd)
>> +
>> /* Secure Encrypted Virtualization command */
>> enum sev_cmd_id {
>> /* Guest initialization commands */
>>

Thanks,
Gavin


2022-01-12 07:12:58

by Gavin Shan

Subject: Re: [PATCH v4 18/21] KVM: arm64: Support SDEI event injection

Hi Eric,

On 11/10/21 10:05 PM, Eric Auger wrote:
> On 8/15/21 2:13 AM, Gavin Shan wrote:
>> This supports SDEI event injection by implementing kvm_sdei_inject().
>> It's called by kernel directly or VMM through ioctl command to inject
>> SDEI event to the specific vCPU.
>>
>> Signed-off-by: Gavin Shan <[email protected]>
>> ---
>> arch/arm64/include/asm/kvm_sdei.h | 2 +
>> arch/arm64/include/uapi/asm/kvm_sdei.h | 1 +
>> arch/arm64/kvm/sdei.c | 108 +++++++++++++++++++++++++
>> 3 files changed, 111 insertions(+)
>>
>> diff --git a/arch/arm64/include/asm/kvm_sdei.h b/arch/arm64/include/asm/kvm_sdei.h
>> index a997989bab77..51087fe971ba 100644
>> --- a/arch/arm64/include/asm/kvm_sdei.h
>> +++ b/arch/arm64/include/asm/kvm_sdei.h
>> @@ -124,6 +124,8 @@ void kvm_sdei_create_vcpu(struct kvm_vcpu *vcpu);
>> int kvm_sdei_hypercall(struct kvm_vcpu *vcpu);
>> int kvm_sdei_register_notifier(struct kvm *kvm, unsigned long num,
>> kvm_sdei_notifier notifier);
>> +int kvm_sdei_inject(struct kvm_vcpu *vcpu,
>> + unsigned long num, bool immediate);
>> void kvm_sdei_deliver(struct kvm_vcpu *vcpu);
>> long kvm_sdei_vm_ioctl(struct kvm *kvm, unsigned long arg);
>> long kvm_sdei_vcpu_ioctl(struct kvm_vcpu *vcpu, unsigned long arg);
>> diff --git a/arch/arm64/include/uapi/asm/kvm_sdei.h b/arch/arm64/include/uapi/asm/kvm_sdei.h
>> index b916c3435646..f7a6b2b22b50 100644
>> --- a/arch/arm64/include/uapi/asm/kvm_sdei.h
>> +++ b/arch/arm64/include/uapi/asm/kvm_sdei.h
>> @@ -67,6 +67,7 @@ struct kvm_sdei_vcpu_state {
>> #define KVM_SDEI_CMD_SET_VEVENT 7
>> #define KVM_SDEI_CMD_GET_VCPU_STATE 8
>> #define KVM_SDEI_CMD_SET_VCPU_STATE 9
>> +#define KVM_SDEI_CMD_INJECT_EVENT 10
>>
>> struct kvm_sdei_cmd {
>> __u32 cmd;
>> diff --git a/arch/arm64/kvm/sdei.c b/arch/arm64/kvm/sdei.c
>> index 79315b77f24b..7c2789cd1421 100644
>> --- a/arch/arm64/kvm/sdei.c
>> +++ b/arch/arm64/kvm/sdei.c
>> @@ -802,6 +802,111 @@ int kvm_sdei_register_notifier(struct kvm *kvm,
>> return ret;
>> }
>>
>> +int kvm_sdei_inject(struct kvm_vcpu *vcpu,
>> + unsigned long num,
>> + bool immediate)
> don't get the immediate param.

I definitely need a comment to explain @immediate here. It means the
injected SDEI event should be handled and delivered immediately. If either
of the following conditions is met, the injected event can't be delivered
immediately. This mechanism is needed by Async PF, which injects the SDEI
event for the page-not-present notification and expects it to be delivered
immediately.

(a) The injected event is critical and another critical event is being delivered.
(b) The injected event is normal and another critical or normal event is being delivered.

>> +{
>> + struct kvm *kvm = vcpu->kvm;
>> + struct kvm_sdei_kvm *ksdei = kvm->arch.sdei;
>> + struct kvm_sdei_vcpu *vsdei = vcpu->arch.sdei;
>> + struct kvm_sdei_event *kse = NULL;
>> + struct kvm_sdei_kvm_event *kske = NULL;
>> + struct kvm_sdei_vcpu_event *ksve = NULL;
>> + int index, ret = 0;
>> +
>> + /* Sanity check */
>> + if (!(ksdei && vsdei)) {
>> + ret = -EPERM;
>> + goto out;
>> + }
>> +
>> + if (!kvm_sdei_is_valid_event_num(num)) {
>> + ret = -EINVAL;
>> + goto out;
>> + }
>> +
>> + /* Check the kvm event */
>> + spin_lock(&ksdei->lock);
>> + kske = kvm_sdei_find_kvm_event(kvm, num);
>> + if (!kske) {
>> + ret = -ENOENT;
>> + goto unlock_kvm;
>> + }
>> +
>> + kse = kske->kse;
>> + index = (kse->state.type == SDEI_EVENT_TYPE_PRIVATE) ?
>> + vcpu->vcpu_idx : 0;
>> + if (!(kvm_sdei_is_registered(kske, index) &&
>> + kvm_sdei_is_enabled(kske, index))) {
>> + ret = -EPERM;
>> + goto unlock_kvm;
>> + }
>> +
>> + /* Check the vcpu state */
>> + spin_lock(&vsdei->lock);
>> + if (vsdei->state.masked) {
>> + ret = -EPERM;
>> + goto unlock_vcpu;
>> + }
>> +
>> + /* Check if the event can be delivered immediately */
>> + if (immediate) {
> According to the dispatcher pseudo code this should be always checked?

Nope, the spec doesn't require that the event be delivered immediately,
meaning the event can be delayed.

>> + if (kse->state.priority == SDEI_EVENT_PRIORITY_CRITICAL &&
>> + !list_empty(&vsdei->critical_events)) {
>> + ret = -ENOSPC;
>> + goto unlock_vcpu;
>> + }
>> +
>> + if (kse->state.priority == SDEI_EVENT_PRIORITY_NORMAL &&
>> + (!list_empty(&vsdei->critical_events) ||
>> + !list_empty(&vsdei->normal_events))) {
>> + ret = -ENOSPC;
>> + goto unlock_vcpu;
>> + }
>> + }
> What about shared event dispatching. I don't see the afficinity checked
> anywhere?

You're correct. I've ignored the affinity for now for two reasons: (1) I haven't
figured out the mechanism to verify the affinity, as I mentioned before.
(2) Currently, the affinity isn't used by the SDEI client driver in the guest kernel.

>> +
>> + /* Check if the vcpu event exists */
>> + ksve = kvm_sdei_find_vcpu_event(vcpu, num);
>> + if (ksve) {
>> + kske->state.refcount++;
>> + ksve->state.refcount++;
> why this double refcount increment??

Yep, as I explained before, "ksve->state.refcount" should be increased only
when the corresponding vCPU event is created.

>> + kvm_make_request(KVM_REQ_SDEI, vcpu);
>> + goto unlock_vcpu;
>> + }
>> +
>> + /* Allocate vcpu event */
>> + ksve = kzalloc(sizeof(*ksve), GFP_KERNEL);
>> + if (!ksve) {
>> + ret = -ENOMEM;
>> + goto unlock_vcpu;
>> + }
>> +
>> + /*
>> + * We should take lock to update KVM event state because its
>> + * reference count might be zero. In that case, the KVM event
>> + * could be destroyed.
>> + */
>> + kske->state.refcount++;
>> + ksve->state.num = num;
>> + ksve->state.refcount = 1;
>> + ksve->kske = kske;
>> + ksve->vcpu = vcpu;
>> +
>> + if (kse->state.priority == SDEI_EVENT_PRIORITY_CRITICAL)
>> + list_add_tail(&ksve->link, &vsdei->critical_events);
>> + else
>> + list_add_tail(&ksve->link, &vsdei->normal_events);
>> +
>> + kvm_make_request(KVM_REQ_SDEI, vcpu);
>> +
>> +unlock_vcpu:
>> + spin_unlock(&vsdei->lock);
>> +unlock_kvm:
>> + spin_unlock(&ksdei->lock);
>> +out:
>> + return ret;
>> +}
>> +
>> void kvm_sdei_deliver(struct kvm_vcpu *vcpu)
>> {
>> struct kvm *kvm = vcpu->kvm;
>> @@ -1317,6 +1422,9 @@ long kvm_sdei_vcpu_ioctl(struct kvm_vcpu *vcpu, unsigned long arg)
>> case KVM_SDEI_CMD_SET_VCPU_STATE:
>> ret = kvm_sdei_set_vcpu_state(vcpu, &cmd->ksv_state);
>> break;
>> + case KVM_SDEI_CMD_INJECT_EVENT:
>> + ret = kvm_sdei_inject(vcpu, cmd->num, false);
>> + break;
>> default:
>> ret = -EINVAL;
>> }
>>

Thanks,
Gavin


2022-01-12 07:19:38

by Gavin Shan

Subject: Re: [PATCH v4 19/21] KVM: arm64: Support SDEI event cancellation

Hi Eric,

On 11/10/21 10:09 PM, Eric Auger wrote:
> On 8/15/21 2:13 AM, Gavin Shan wrote:
>> The injected SDEI event is to send notification to guest. The SDEI
>> event might not be needed after it's injected. This introduces API
>> to support cancellation on the injected SDEI event if it's not fired
>> to the guest yet.
>>
>> This mechanism will be needed when we're going to support asynchronous
>> page fault.
>
> if we are able to manage the migration of an executing SDEI why can't we
> manage the migration of pending SDEIs?
>

I think the commit log needs to explain the use case more clearly. It's
about Async PF's performance, not migration. In Async PF, the page fault is
handled asynchronously by a worker. The page fault can be completed before
the injected SDEI event for the page-not-present notification is delivered.
In that case, we don't need the overhead caused by the injected SDEI event.

I will think about it and may drop this patch from the series, to detach
SDEI and async PF as much as possible :)


>>
>> Signed-off-by: Gavin Shan <[email protected]>
>> ---
>> arch/arm64/include/asm/kvm_sdei.h | 1 +
>> arch/arm64/kvm/sdei.c | 49 +++++++++++++++++++++++++++++++
>> 2 files changed, 50 insertions(+)
>>
>> diff --git a/arch/arm64/include/asm/kvm_sdei.h b/arch/arm64/include/asm/kvm_sdei.h
>> index 51087fe971ba..353744c7bad9 100644
>> --- a/arch/arm64/include/asm/kvm_sdei.h
>> +++ b/arch/arm64/include/asm/kvm_sdei.h
>> @@ -126,6 +126,7 @@ int kvm_sdei_register_notifier(struct kvm *kvm, unsigned long num,
>> kvm_sdei_notifier notifier);
>> int kvm_sdei_inject(struct kvm_vcpu *vcpu,
>> unsigned long num, bool immediate);
>> +int kvm_sdei_cancel(struct kvm_vcpu *vcpu, unsigned long num);
>> void kvm_sdei_deliver(struct kvm_vcpu *vcpu);
>> long kvm_sdei_vm_ioctl(struct kvm *kvm, unsigned long arg);
>> long kvm_sdei_vcpu_ioctl(struct kvm_vcpu *vcpu, unsigned long arg);
>> diff --git a/arch/arm64/kvm/sdei.c b/arch/arm64/kvm/sdei.c
>> index 7c2789cd1421..4f5a582daa97 100644
>> --- a/arch/arm64/kvm/sdei.c
>> +++ b/arch/arm64/kvm/sdei.c
>> @@ -907,6 +907,55 @@ int kvm_sdei_inject(struct kvm_vcpu *vcpu,
>> return ret;
>> }
>>
>> +int kvm_sdei_cancel(struct kvm_vcpu *vcpu, unsigned long num)
>> +{
>> + struct kvm *kvm = vcpu->kvm;
>> + struct kvm_sdei_kvm *ksdei = kvm->arch.sdei;
>> + struct kvm_sdei_vcpu *vsdei = vcpu->arch.sdei;
>> + struct kvm_sdei_kvm_event *kske = NULL;
>> + struct kvm_sdei_vcpu_event *ksve = NULL;
>> + int ret = 0;
>> +
>> + if (!(ksdei && vsdei)) {
>> + ret = -EPERM;
>> + goto out;
>> + }
>> +
>> + /* Find the vCPU event */
>> + spin_lock(&vsdei->lock);
>> + ksve = kvm_sdei_find_vcpu_event(vcpu, num);
>> + if (!ksve) {
>> + ret = -EINVAL;
>> + goto unlock;
>> + }
>> +
>> + /* Event can't be cancelled if it has been delivered */
>> + if (ksve->state.refcount <= 1 &&
>> + (vsdei->critical_event == ksve ||
>> + vsdei->normal_event == ksve)) {
>> + ret = -EINPROGRESS;
>> + goto unlock;
>> + }
>> +
>> + /* Free the vCPU event if necessary */
>> + kske = ksve->kske;
>> + ksve->state.refcount--;
>> + if (!ksve->state.refcount) {
>> + list_del(&ksve->link);
>> + kfree(ksve);
>> + }
>> +
>> +unlock:
>> + spin_unlock(&vsdei->lock);
>> + if (kske) {
>> + spin_lock(&ksdei->lock);
>> + kske->state.refcount--;
>> + spin_unlock(&ksdei->lock);
>> + }
>> +out:
>> + return ret;
>> +}
>> +
>> void kvm_sdei_deliver(struct kvm_vcpu *vcpu)
>> {
>> struct kvm *kvm = vcpu->kvm;
>>

Thanks,
Gavin


2022-01-12 07:22:45

by Gavin Shan

Subject: Re: [PATCH v4 20/21] KVM: arm64: Export SDEI capability

On 11/10/21 9:55 PM, Eric Auger wrote:
> On 8/15/21 2:13 AM, Gavin Shan wrote:
>> The SDEI functionality is ready to be exported so far. This adds
>> new capability (KVM_CAP_ARM_SDEI) and exports it.
>
> Need to be documented in
> kvm/api.rst
> as the rest of the API
>

Ok, Thanks, Eric :)


>>
>> Signed-off-by: Gavin Shan <[email protected]>
>> ---
>> arch/arm64/kvm/arm.c | 3 +++
>> include/uapi/linux/kvm.h | 1 +
>> 2 files changed, 4 insertions(+)
>>
>> diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
>> index 215cdbeb272a..7d9bbc888ae5 100644
>> --- a/arch/arm64/kvm/arm.c
>> +++ b/arch/arm64/kvm/arm.c
>> @@ -278,6 +278,9 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
>> case KVM_CAP_ARM_PTRAUTH_GENERIC:
>> r = system_has_full_ptr_auth();
>> break;
>> + case KVM_CAP_ARM_SDEI:
>> + r = 1;
>> + break;
>> default:
>> r = 0;
>> }
>> diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
>> index 8cf41fd4bf86..2aa748fd89c7 100644
>> --- a/include/uapi/linux/kvm.h
>> +++ b/include/uapi/linux/kvm.h
>> @@ -1112,6 +1112,7 @@ struct kvm_ppc_resize_hpt {
>> #define KVM_CAP_BINARY_STATS_FD 203
>> #define KVM_CAP_EXIT_ON_EMULATION_FAILURE 204
>> #define KVM_CAP_ARM_MTE 205
>> +#define KVM_CAP_ARM_SDEI 206
>>
>> #ifdef KVM_CAP_IRQ_ROUTING
>>
>>
> Eric
>


2022-01-12 07:24:29

by Gavin Shan

Subject: Re: [PATCH v4 00/21] Support SDEI Virtualization

Hi Eric,

On 11/10/21 10:29 PM, Eric Auger wrote:
> On 8/15/21 2:19 AM, Gavin Shan wrote:
>> On 8/15/21 10:13 AM, Gavin Shan wrote:
>>> This series intends to virtualize Software Delegated Exception Interface
>>> (SDEI), which is defined by DEN0054A. It allows the hypervisor to deliver
>>> NMI-alike event to guest and it's needed by asynchronous page fault to
>>> deliver page-not-present notification from hypervisor to guest. The code
>>> and the required qemu changes can be found from:
>>>
>>>     https://developer.arm.com/documentation/den0054/latest
>>>     https://github.com/gwshan/linux    ("kvm/arm64_sdei")
>>>     https://github.com/gwshan/qemu     ("kvm/arm64_sdei")
>>>
>>> The SDEI event is identified by a 32-bits number. Bits[31:24] are used
>>> to indicate the SDEI event properties while bits[23:0] are identifying
>>> the unique number. The implementation takes bits[23:22] to indicate the
>>> owner of the SDEI event. For example, those SDEI events owned by KVM
>>> should have these two bits set to 0b01. Besides, the implementation
>>> supports SDEI events owned by KVM only.
>>>
>>> The design is pretty straightforward and the implementation is just
>>> following the SDEI specification, to support the defined SMCCC intefaces,
>>> except the IRQ binding stuff. There are several data structures
>>> introduced.
>>> Some of the objects have to be migrated by VMM. So their definitions are
>>> split up for VMM to include the corresponding states for migration.
>>>
>>>     struct kvm_sdei_kvm
>>>        Associated with VM and used to track the KVM exposed SDEI events
>>>        and those registered by guest.
>>>     struct kvm_sdei_vcpu
>>>        Associated with vCPU and used to track SDEI event delivery. The
>>>        preempted context is saved prior to the delivery and restored
>>>        after that.
>>>     struct kvm_sdei_event
>>>        SDEI events exposed by KVM so that guest can register and enable.
>>>     struct kvm_sdei_kvm_event
>>>        SDEI events that have been registered by guest.
>>>     struct kvm_sdei_vcpu_event
>>>        SDEI events that have been queued to specific vCPU for delivery.
>>>
>>> The series is organized as below:
>>>
>>>     PATCH[01]    Introduces template for smccc_get_argx()
>>>     PATCH[02]    Introduces the data structures and infrastructure
>>>     PATCH[03-14] Supports various SDEI related hypercalls
>>>     PATCH[15]    Supports SDEI event notification
>>>     PATCH[16-17] Introduces ioctl command for migration
>>>     PATCH[18-19] Supports SDEI event injection and cancellation
>>>     PATCH[20]    Exports SDEI capability
>>>     PATCH[21]    Adds self-test case for SDEI virtualization
>>>
>>
>> [...]
>>
>> I explicitly copied James Morse and Mark Rutland when posting the series,
>> but something unknown went wrong. I'm including them here to avoid
>> reposting the whole series.
> I don't see James nor Mark included here either
>

Yeah, I used the following command to post the series, but I don't know
why James/Mark were dropped. I'm not sure whether it's a git-send-email
issue. The problem shows up on my laptop from time to time :)

# git send-email --to=<mail0> --cc=<mail1> --cc=<mail2> *.patch

Thanks,
Gavin


2022-01-13 07:02:44

by Gavin Shan

Subject: Re: [PATCH v4 06/21] KVM: arm64: Support SDEI_EVENT_CONTEXT hypercall

Hi Shannon,

On 1/11/22 5:43 PM, Shannon Zhao wrote:
> On 2021/8/15 8:13, Gavin Shan wrote:
>> +static unsigned long kvm_sdei_hypercall_context(struct kvm_vcpu *vcpu)
>> +{
>> +    struct kvm *kvm = vcpu->kvm;
>> +    struct kvm_sdei_kvm *ksdei = kvm->arch.sdei;
>> +    struct kvm_sdei_vcpu *vsdei = vcpu->arch.sdei;
>> +    struct kvm_sdei_vcpu_regs *regs;
>> +    unsigned long index = smccc_get_arg1(vcpu);
>> +    unsigned long ret = SDEI_SUCCESS;
>> +
>> +    /* Sanity check */
>> +    if (!(ksdei && vsdei)) {
>> +        ret = SDEI_NOT_SUPPORTED;
>> +        goto out;
>> +    }
> Maybe we could move these common sanity check codes to kvm_sdei_hypercall to save some lines.
>

Not all hypercalls need this check. For example, COMPLETE/COMPLETE_RESUME/CONTEXT
don't have an SDEI event number as argument. If we really wanted to move this
check into kvm_sdei_hypercall(), we would end up with code like below, with too
many duplicated snippets. I don't think it's better than what we have, if I
fully understand your comment.

switch (...) {
case REGISTER:
if (!(ksdei && vsdei)) {
ret = SDEI_NOT_SUPPORTED;
break;
}

ret = kvm_sdei_hypercall_register(vcpu);
break;
case UNREGISTER:
if (!(ksdei && vsdei)) {
ret = SDEI_NOT_SUPPORTED;
break;
}

ret = kvm_sdei_hypercall_unregister(vcpu);
break;
case CONTEXT:
ret = kvm_sdei_hypercall_context(vcpu);
break;
:
}

Thanks,
Gavin


2022-01-13 07:09:50

by Gavin Shan

Subject: Re: [PATCH v4 02/21] KVM: arm64: Add SDEI virtualization infrastructure

Hi Shannon,

On 1/11/22 5:40 PM, Shannon Zhao wrote:
> On 2021/8/15 8:13, Gavin Shan wrote:
>> diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
>> index e9a2b8f27792..2f021aa41632 100644
>> --- a/arch/arm64/kvm/arm.c
>> +++ b/arch/arm64/kvm/arm.c
>> @@ -150,6 +150,8 @@ int kvm_arch_init_vm(struct kvm *kvm, unsigned long type)
>>       kvm_vgic_early_init(kvm);
>> +    kvm_sdei_init_vm(kvm);
>> +
>>       /* The maximum number of VCPUs is limited by the host's GIC model */
>>       kvm->arch.max_vcpus = kvm_arm_default_max_vcpus();
> Hi, Is it possible to let user space to choose whether enabling SEDI or not rather than enable it by default?
>

It's possible, but what would be the benefit of doing so? I've thought about
it for a while and I don't think it's necessary, at least for now. First of
all, the SDEI event is injected from individual modules in userspace (QEMU)
or in the host kernel (Async PF). If we really want the functionality to be
disabled, the individual modules can accept a parameter indicating whether
SDEI event injection is allowed. In that case, SDEI is enabled by default,
but the individual modules can choose not to use it :)

Thanks,
Gavin


2022-01-13 07:13:44

by Gavin Shan

Subject: Re: [PATCH v4 06/21] KVM: arm64: Support SDEI_EVENT_CONTEXT hypercall

Hi Shannon,

On 1/13/22 3:02 PM, Gavin Shan wrote:
> On 1/11/22 5:43 PM, Shannon Zhao wrote:
>> On 2021/8/15 8:13, Gavin Shan wrote:
>>> +static unsigned long kvm_sdei_hypercall_context(struct kvm_vcpu *vcpu)
>>> +{
>>> +    struct kvm *kvm = vcpu->kvm;
>>> +    struct kvm_sdei_kvm *ksdei = kvm->arch.sdei;
>>> +    struct kvm_sdei_vcpu *vsdei = vcpu->arch.sdei;
>>> +    struct kvm_sdei_vcpu_regs *regs;
>>> +    unsigned long index = smccc_get_arg1(vcpu);
>>> +    unsigned long ret = SDEI_SUCCESS;
>>> +
>>> +    /* Sanity check */
>>> +    if (!(ksdei && vsdei)) {
>>> +        ret = SDEI_NOT_SUPPORTED;
>>> +        goto out;
>>> +    }
>> Maybe we could move these common sanity check codes to kvm_sdei_hypercall to save some lines.
>>
>
> Not all hypercalls need this check. For example, COMPLETE/COMPLETE_RESUME/CONTEXT don't
> have SDEI event number as the argument. If we really want move this check into function
> kvm_sdei_hypercall(), we would have code like below. Too much duplicated snippets will
> be seen. I don't think it's better than what we have if I fully understand your comments.
>

oops... sorry. Please ignore my previous reply. I wrongly thought you were
talking about the check on the SDEI event number. Yes, you're correct that
the check should be moved to kvm_sdei_hypercall().

Thanks,
Gavin


2022-01-26 03:30:33

by Eric Auger

Subject: Re: [PATCH v4 05/21] KVM: arm64: Support SDEI_EVENT_{ENABLE, DISABLE} hypercall

Hi Gavin,

On 1/12/22 3:29 AM, Gavin Shan wrote:
> Hi Eric,
>
> On 11/10/21 12:02 AM, Eric Auger wrote:
>> On 8/15/21 2:13 AM, Gavin Shan wrote:
>>> This supports SDEI_EVENT_{ENABLE, DISABLE} hypercall. After SDEI
>>> event is registered by guest, it won't be delivered to the guest
>>> until it's enabled. On the other hand, the SDEI event won't be
>>> raised to the guest or specific vCPU if it's has been disabled
>>> on the guest or specific vCPU.
>>>
>>> Signed-off-by: Gavin Shan <[email protected]>
>>> ---
>>>   arch/arm64/kvm/sdei.c | 68 +++++++++++++++++++++++++++++++++++++++++++
>>>   1 file changed, 68 insertions(+)
>>>
>>> diff --git a/arch/arm64/kvm/sdei.c b/arch/arm64/kvm/sdei.c
>>> index d3ea3eee154b..b022ce0a202b 100644
>>> --- a/arch/arm64/kvm/sdei.c
>>> +++ b/arch/arm64/kvm/sdei.c
>>> @@ -206,6 +206,70 @@ static unsigned long
>>> kvm_sdei_hypercall_register(struct kvm_vcpu *vcpu)
>>>       return ret;
>>>   }
>>>   +static unsigned long kvm_sdei_hypercall_enable(struct kvm_vcpu *vcpu,
>>> +                           bool enable)
>>> +{
>>> +    struct kvm *kvm = vcpu->kvm;
>>> +    struct kvm_sdei_kvm *ksdei = kvm->arch.sdei;
>>> +    struct kvm_sdei_vcpu *vsdei = vcpu->arch.sdei;
>>> +    struct kvm_sdei_event *kse = NULL;
>>> +    struct kvm_sdei_kvm_event *kske = NULL;
>>> +    unsigned long event_num = smccc_get_arg1(vcpu);
>>> +    int index = 0;
>>> +    unsigned long ret = SDEI_SUCCESS;
>>> +
>>> +    /* Sanity check */
>>> +    if (!(ksdei && vsdei)) {
>>> +        ret = SDEI_NOT_SUPPORTED;
>>> +        goto out;
>>> +    }
>>> +
>>> +    if (!kvm_sdei_is_valid_event_num(event_num)) {
>> I would rename into is_exposed_event_num()
>
> kvm_sdei_is_virtual() has been recommended by you when you reviewed the
> following
> patch. I think kvm_sdei_is_virtual() is good enough :)

argh, is_virtual() then :)

Eric
>
>    [PATCH v4 02/21] KVM: arm64: Add SDEI virtualization infrastructure
>
>>> +        ret = SDEI_INVALID_PARAMETERS;
>>> +        goto out;
>>> +    }
>>> +
>>> +    /* Check if the KVM event exists */
>>> +    spin_lock(&ksdei->lock);
>>> +    kske = kvm_sdei_find_kvm_event(kvm, event_num);
>>> +    if (!kske) {
>>> +        ret = SDEI_INVALID_PARAMETERS;
>> should be DENIED according to the spec, ie. nobody registered that event?
>
> Ok.
>
>>> +        goto unlock;
>>> +    }
>>> +
>>> +    /* Check if there is pending events */
>> does that match the "handler-unregister-pending state" case mentionned
>> in the spec?
>>> +    if (kske->state.refcount) {
>>> +        ret = SDEI_PENDING;
>> ? not documented in my A spec? DENIED?
>
> Yep, It should be DENIED.
>
>>> +        goto unlock;
>>> +    }
>>> +
>>> +    /* Check if it has been registered */
>> isn't duplicate of /* Check if the KVM event exists */ ?
>
> It's not duplicate check, but the comment here seems misleading. I will
> correct this to:
>
>     /* Check if it has been defined or exposed */
>
>>> +    kse = kske->kse;
>>> +    index = (kse->state.type == SDEI_EVENT_TYPE_PRIVATE) ?
>>> +        vcpu->vcpu_idx : 0;
>>> +    if (!kvm_sdei_is_registered(kske, index)) {
>>> +        ret = SDEI_DENIED;
>>> +        goto unlock;
>>> +    }
>>> +
>>> +    /* Verify its enablement state */
>>> +    if (enable == kvm_sdei_is_enabled(kske, index)) {
>> spec says:
>> Enabling/disabled an event, which is already enabled/disabled, is
>> permitted and has no effect. I guess ret should be OK.
>
> yep, it should be ok.
>
>>> +        ret = SDEI_DENIED;
>>> +        goto unlock;
>>> +    }
>>> +
>>> +    /* Update enablement state */
>>> +    if (enable)
>>> +        kvm_sdei_set_enabled(kske, index);
>>> +    else
>>> +        kvm_sdei_clear_enabled(kske, index);
>>> +
>>> +unlock:
>>> +    spin_unlock(&ksdei->lock);
>>> +out:
>>> +    return ret;
>>> +}
>>> +
>>>   int kvm_sdei_hypercall(struct kvm_vcpu *vcpu)
>>>   {
>>>       u32 func = smccc_get_function(vcpu);
>>> @@ -220,7 +284,11 @@ int kvm_sdei_hypercall(struct kvm_vcpu *vcpu)
>>>           ret = kvm_sdei_hypercall_register(vcpu);
>>>           break;
>>>       case SDEI_1_0_FN_SDEI_EVENT_ENABLE:
>>> +        ret = kvm_sdei_hypercall_enable(vcpu, true);
>>> +        break;
>>>       case SDEI_1_0_FN_SDEI_EVENT_DISABLE:
>>> +        ret = kvm_sdei_hypercall_enable(vcpu, false);
>>> +        break;
>>>       case SDEI_1_0_FN_SDEI_EVENT_CONTEXT:
>>>       case SDEI_1_0_FN_SDEI_EVENT_COMPLETE:
>>>       case SDEI_1_0_FN_SDEI_EVENT_COMPLETE_AND_RESUME:
>>>
>
> Thanks,
> Gavin
>

2022-01-26 03:32:15

by Eric Auger

Subject: Re: [PATCH v4 06/21] KVM: arm64: Support SDEI_EVENT_CONTEXT hypercall

Hi Gavin,

On 1/12/22 3:33 AM, Gavin Shan wrote:
> Hi Eric,
>
> On 11/10/21 7:16 PM, Eric Auger wrote:
>> On 8/15/21 2:13 AM, Gavin Shan wrote:
>>> This supports SDEI_EVENT_CONTEXT hypercall. It's used by the guest
>>> to retrieved the original registers (R0 - R17) in its SDEI event
>>> handler. Those registers can be corrupted during the SDEI event
>>> delivery.
>>>
>>> Signed-off-by: Gavin Shan <[email protected]>
>>> ---
>>>   arch/arm64/kvm/sdei.c | 40 ++++++++++++++++++++++++++++++++++++++++
>>>   1 file changed, 40 insertions(+)
>>>
>>> diff --git a/arch/arm64/kvm/sdei.c b/arch/arm64/kvm/sdei.c
>>> index b022ce0a202b..b4162efda470 100644
>>> --- a/arch/arm64/kvm/sdei.c
>>> +++ b/arch/arm64/kvm/sdei.c
>>> @@ -270,6 +270,44 @@ static unsigned long
>>> kvm_sdei_hypercall_enable(struct kvm_vcpu *vcpu,
>>>       return ret;
>>>   }
>>>   +static unsigned long kvm_sdei_hypercall_context(struct kvm_vcpu
>>> *vcpu)
>>> +{
>>> +    struct kvm *kvm = vcpu->kvm;
>>> +    struct kvm_sdei_kvm *ksdei = kvm->arch.sdei;
>>> +    struct kvm_sdei_vcpu *vsdei = vcpu->arch.sdei;
>>> +    struct kvm_sdei_vcpu_regs *regs;
>>> +    unsigned long index = smccc_get_arg1(vcpu);
>> s/index/param_id to match the spec?
>
> Sure, but "reg_id" seems better here. As the parameter indicates the GPR
> index
> to be fetched on request of the guest kernel.
fine with me.
>
>>> +    unsigned long ret = SDEI_SUCCESS;
>>> +
>>> +    /* Sanity check */
>>> +    if (!(ksdei && vsdei)) {
>>> +        ret = SDEI_NOT_SUPPORTED;
>>> +        goto out;
>>> +    }
>>> +
>>> +    if (index > ARRAY_SIZE(vsdei->state.critical_regs.regs)) {
>>> +        ret = SDEI_INVALID_PARAMETERS;
>>> +        goto out;
>>> +    }
>> I would move the above after regs = and use regs there (although the
>> regs ARRAY_SIZE of both is identifical)
>
> Ok.
>
>>> +
>>> +    /* Check if the pending event exists */
>>> +    spin_lock(&vsdei->lock);
>>> +    if (!(vsdei->critical_event || vsdei->normal_event)) {
>>> +        ret = SDEI_DENIED;
>>> +        goto unlock;
>>> +    }
>>> +
>>> +    /* Fetch the requested register */
>>> +    regs = vsdei->critical_event ? &vsdei->state.critical_regs :
>>> +                       &vsdei->state.normal_regs;
>>> +    ret = regs->regs[index];
>>> +
>>> +unlock:
>>> +    spin_unlock(&vsdei->lock);
>>> +out:
>>> +    return ret;
>>> +}
>>> +
>>>   int kvm_sdei_hypercall(struct kvm_vcpu *vcpu)
>>>   {
>>>       u32 func = smccc_get_function(vcpu);
>>> @@ -290,6 +328,8 @@ int kvm_sdei_hypercall(struct kvm_vcpu *vcpu)
>>>           ret = kvm_sdei_hypercall_enable(vcpu, false);
>>>           break;
>>>       case SDEI_1_0_FN_SDEI_EVENT_CONTEXT:
>>> +        ret = kvm_sdei_hypercall_context(vcpu);
>>> +        break;
>>>       case SDEI_1_0_FN_SDEI_EVENT_COMPLETE:
>>>       case SDEI_1_0_FN_SDEI_EVENT_COMPLETE_AND_RESUME:
>>>       case SDEI_1_0_FN_SDEI_EVENT_UNREGISTER:
>>>
>
> Thanks,
> Gavin
>
Eric

2022-01-26 03:34:49

by Eric Auger

Subject: Re: [PATCH v4 06/21] KVM: arm64: Support SDEI_EVENT_CONTEXT hypercall

Hi Gavin,

On 1/13/22 8:13 AM, Gavin Shan wrote:
> Hi Shannon,
>
> On 1/13/22 3:02 PM, Gavin Shan wrote:
>> On 1/11/22 5:43 PM, Shannon Zhao wrote:
>>> On 2021/8/15 8:13, Gavin Shan wrote:
>>>> +static unsigned long kvm_sdei_hypercall_context(struct kvm_vcpu *vcpu)
>>>> +{
>>>> +    struct kvm *kvm = vcpu->kvm;
>>>> +    struct kvm_sdei_kvm *ksdei = kvm->arch.sdei;
>>>> +    struct kvm_sdei_vcpu *vsdei = vcpu->arch.sdei;
>>>> +    struct kvm_sdei_vcpu_regs *regs;
>>>> +    unsigned long index = smccc_get_arg1(vcpu);
>>>> +    unsigned long ret = SDEI_SUCCESS;
>>>> +
>>>> +    /* Sanity check */
>>>> +    if (!(ksdei && vsdei)) {
>>>> +        ret = SDEI_NOT_SUPPORTED;
>>>> +        goto out;
>>>> +    }
>>> Maybe we could move these common sanity check codes to
>>> kvm_sdei_hypercall to save some lines.
>>>
>>
>> Not all hypercalls need this check. For example,
>> COMPLETE/COMPLETE_RESUME/CONTEXT don't
>> have SDEI event number as the argument. If we really want move this
>> check into function
>> kvm_sdei_hypercall(), we would have code like below. Too much
>> duplicated snippets will
>> be seen. I don't think it's better than what we have if I fully
>> understand your comments.
>>
>
> oops... sorry. Please ignore my previous reply. I thought you talk about
> the check on the SDEI event number wrongly. Yes, you're correct that the
> check should be moved to kvm_sdei_hypercall().

even better than my previous proposal then

Eric
>
> Thanks,
> Gavin
>
> _______________________________________________
> kvmarm mailing list
> [email protected]
> https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

2022-01-26 03:34:55

by Eric Auger

Subject: Re: [PATCH v4 06/21] KVM: arm64: Support SDEI_EVENT_CONTEXT hypercall

Hi Gavin,

On 1/13/22 8:02 AM, Gavin Shan wrote:
> Hi Shannon,
>
> On 1/11/22 5:43 PM, Shannon Zhao wrote:
>> On 2021/8/15 8:13, Gavin Shan wrote:
>>> +static unsigned long kvm_sdei_hypercall_context(struct kvm_vcpu *vcpu)
>>> +{
>>> +    struct kvm *kvm = vcpu->kvm;
>>> +    struct kvm_sdei_kvm *ksdei = kvm->arch.sdei;
>>> +    struct kvm_sdei_vcpu *vsdei = vcpu->arch.sdei;
>>> +    struct kvm_sdei_vcpu_regs *regs;
>>> +    unsigned long index = smccc_get_arg1(vcpu);
>>> +    unsigned long ret = SDEI_SUCCESS;
>>> +
>>> +    /* Sanity check */
>>> +    if (!(ksdei && vsdei)) {
>>> +        ret = SDEI_NOT_SUPPORTED;
>>> +        goto out;
>>> +    }
>> Maybe we could move these common sanity check codes to
>> kvm_sdei_hypercall to save some lines.
>>
>
> Not all hypercalls need this check. For example,
> COMPLETE/COMPLETE_RESUME/CONTEXT don't
> have SDEI event number as the argument. If we really want move this
> check into function
> kvm_sdei_hypercall(), we would have code like below. Too much duplicated
> snippets will
> be seen. I don't think it's better than what we have if I fully
> understand your comments.
>
>       switch (...) {
>       case REGISTER:
>            if (!(ksdei && vsdei)) {
>                ret = SDEI_NOT_SUPPORTED;
>                break;
>            }
at least you can use an inline function taking the vcpu as param?

Thanks

Eric
>
>            ret = kvm_sdei_hypercall_register(vcpu);
>            break;
>       case UNREGISTER:
>            if (!(ksdei && vsdei)) {
>                ret = SDEI_NOT_SUPPORTED;
>                break;
>            }
>
>            ret = kvm_sdei_hypercall_unregister(vcpu);
>            break;
>      case CONTEXT:
>            ret = kvm_sdei_hypercall_context(vcpu);
>            break;
>        :
>     }
>
> Thanks,
> Gavin
>
> _______________________________________________
> kvmarm mailing list
> [email protected]
> https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

2022-01-26 06:34:43

by Eric Auger

[permalink] [raw]
Subject: Re: [PATCH v4 07/21] KVM: arm64: Support SDEI_EVENT_UNREGISTER hypercall

Hi Gavin,
On 1/12/22 3:38 AM, Gavin Shan wrote:
> Hi Eric,
>
> On 11/10/21 1:05 AM, Eric Auger wrote:
>> On 8/15/21 2:13 AM, Gavin Shan wrote:
>>> This supports SDEI_EVENT_UNREGISTER hypercall. It's used by the
>>> guest to unregister SDEI event. The SDEI event won't be raised to
>>> the guest or specific vCPU after it's unregistered successfully.
>>> It's notable the SDEI event is disabled automatically on the guest
>>> or specific vCPU once it's unregistered successfully.
>>>
>>> Signed-off-by: Gavin Shan <[email protected]>
>>> ---
>>>   arch/arm64/kvm/sdei.c | 61 +++++++++++++++++++++++++++++++++++++++++++
>>>   1 file changed, 61 insertions(+)
>>>
>>> diff --git a/arch/arm64/kvm/sdei.c b/arch/arm64/kvm/sdei.c
>>> index b4162efda470..a3ba69dc91cb 100644
>>> --- a/arch/arm64/kvm/sdei.c
>>> +++ b/arch/arm64/kvm/sdei.c
>>> @@ -308,6 +308,65 @@ static unsigned long
>>> kvm_sdei_hypercall_context(struct kvm_vcpu *vcpu)
>>>       return ret;
>>>   }
>>>   +static unsigned long kvm_sdei_hypercall_unregister(struct kvm_vcpu
>>> *vcpu)
>>> +{
>>> +    struct kvm *kvm = vcpu->kvm;
>>> +    struct kvm_sdei_kvm *ksdei = kvm->arch.sdei;
>>> +    struct kvm_sdei_vcpu *vsdei = vcpu->arch.sdei;
>>> +    struct kvm_sdei_event *kse = NULL;
>>> +    struct kvm_sdei_kvm_event *kske = NULL;
>>> +    unsigned long event_num = smccc_get_arg1(vcpu);
>>> +    int index = 0;
>>> +    unsigned long ret = SDEI_SUCCESS;
>>> +
>>> +    /* Sanity check */
>>> +    if (!(ksdei && vsdei)) {
>>> +        ret = SDEI_NOT_SUPPORTED;
>>> +        goto out;
>>> +    }
>>> +
>>> +    if (!kvm_sdei_is_valid_event_num(event_num)) {
>>> +        ret = SDEI_INVALID_PARAMETERS;
>>> +        goto out;
>>> +    }
>>> +
>>> +    /* Check if the KVM event exists */
>>> +    spin_lock(&ksdei->lock);
>>> +    kske = kvm_sdei_find_kvm_event(kvm, event_num);
>>> +    if (!kske) {
>>> +        ret = SDEI_INVALID_PARAMETERS;
>>> +        goto unlock;
>>> +    }
>>> +
>>> +    /* Check if there is pending events */
>>> +    if (kske->state.refcount) {
>>> +        ret = SDEI_PENDING;
>> don't you want to record the fact the unregistration is outstanding to
>> perform subsequent actions? Otherwise nothing will hapen when the
>> current executing handlers complete?
>
> It's not necessary. The guest should retry in this case.

I do not understand that from the spec:
6.7 Unregistering an event says

With the PENDING status, the unregister request will be queued until the
event is completed using SDEI_EVENT_COMPLETE .

Also there is a state called "Handler-unregister-pending".

But well I would need to dig further into the spec again :)


>
>>> +        goto unlock;
>>> +    }
>>> +
>>> +    /* Check if it has been registered */
>>> +    kse = kske->kse;
>>> +    index = (kse->state.type == SDEI_EVENT_TYPE_PRIVATE) ?
>>> +        vcpu->vcpu_idx : 0;
>> you could have an inline for the above as this is executed in many
>> functions. even including the code below.
>
> Ok, it's a good idea.
>
>>> +    if (!kvm_sdei_is_registered(kske, index)) {
>>> +        ret = SDEI_DENIED;
>>> +        goto unlock;
>>> +    }
>>> +
>>> +    /* The event is disabled when it's unregistered */
>>> +    kvm_sdei_clear_enabled(kske, index);
>>> +    kvm_sdei_clear_registered(kske, index);
>>> +    if (kvm_sdei_empty_registered(kske)) {
>> a refcount mechanism would be cleaner I think.
>
> A refcount isn't working well. We need a mapping here because the private
> SDEI event can be enabled/registered on multiple vCPUs. We need to know
> the exact vCPUs where the private SDEI event is enabled/registered.

I don't get why you can't increment/decrement the ref count each time
the event is registered/unregistered by a given vcpu to manage its life
cycle? That does not mean you don't need the bitmap to know the actual
mapping.

Thanks

Eric
>
>>> +        list_del(&kske->link);
>>> +        kfree(kske);
>>> +    }
>>> +
>>> +unlock:
>>> +    spin_unlock(&ksdei->lock);
>>> +out:
>>> +    return ret;
>>> +}
>>> +
>>>   int kvm_sdei_hypercall(struct kvm_vcpu *vcpu)
>>>   {
>>>       u32 func = smccc_get_function(vcpu);
>>> @@ -333,6 +392,8 @@ int kvm_sdei_hypercall(struct kvm_vcpu *vcpu)
>>>       case SDEI_1_0_FN_SDEI_EVENT_COMPLETE:
>>>       case SDEI_1_0_FN_SDEI_EVENT_COMPLETE_AND_RESUME:
>>>       case SDEI_1_0_FN_SDEI_EVENT_UNREGISTER:
>>> +        ret = kvm_sdei_hypercall_unregister(vcpu);
>>> +        break;
>>>       case SDEI_1_0_FN_SDEI_EVENT_STATUS:
>>>       case SDEI_1_0_FN_SDEI_EVENT_GET_INFO:
>>>       case SDEI_1_0_FN_SDEI_EVENT_ROUTING_SET:
>>>
>
> Thanks,
> Gavin
>

2022-01-28 00:29:37

by Eric Auger

[permalink] [raw]
Subject: Re: [PATCH v4 02/21] KVM: arm64: Add SDEI virtualization infrastructure

Hi Gavin,

On 1/11/22 10:20 AM, Gavin Shan wrote:
> Hi Eric,
>
> On 11/9/21 11:45 PM, Eric Auger wrote:
>> On 8/15/21 2:13 AM, Gavin Shan wrote:
>>> Software Delegated Exception Interface (SDEI) provides a mechanism for
>>> registering and servicing system events. Those system events are high
>>> priority events, which must be serviced immediately. It's going to be
>>> used by Asynchronous Page Fault (APF) to deliver notification from KVM
>>> to guest. It's noted that SDEI is defined by ARM DEN0054A specification.
>>>
>>> This introduces SDEI virtualization infrastructure where the SDEI events
>>> are registered and manuplated by the guest through hypercall. The SDEI
>> manipulated
>
> Thanks, It will be corrected in next respin.
>
>>> event is delivered to one specific vCPU by KVM once it's raised. This
>>> introduces data structures to represent the needed objects to implement
>>> the feature, which is highlighted as below. As those objects could be
>>> migrated between VMs, these data structures are partially exported to
>>> user space.
>>>
>>>     * kvm_sdei_event
>>>       SDEI events are exported from KVM so that guest is able to
>>> register
>>>       and manuplate.
>> manipulate
>
> Thanks, It will be fixed in next respin. I'm uncertain how the wrong
> spelling are still existing even though I had spelling check with
> "scripts/checkpatch.pl --codespell".
I don't know. I am not used to it :(
>
>>>     * kvm_sdei_kvm_event
>>>       SDEI event that has been registered by guest.
>> I would recomment to revisit the names. Why kvm event? Why not
>> registered_event instead that actually would tell what it its. also you
>> have kvm twice in the struct name.
>
> Yep, I think I need reconsider the struct names. The primary reason
> why I had the names are keeping the struct names short enough while
> being easy to be identified: "kvm_sdei" is the prefix. How about to
> have the following struct names?
Also, kvm_sdei_kvm looks awkward to me. Since it is arch-specific,
couldn't you name it kvm_sdei_arch?
>
>     kvm_sdei_event             events exported from KVM to userspace
>     kvm_sdei_kevent            events registered (associated) to KVM
I still don't find kevent self-explanatory, and it is even confusing
because it makes me think of events exposed by KVM.

To me there are exposed events and registered events and I think it
would be simpler to stick to this terminology. I would rather rename
kevent into registered_event.
>     kvm_sdei_vevent            events associated with vCPU
s/vevent/vcpu_event otherwise sounds like virtual event
>     kvm_sdei_vcpu              vCPU context for event delivery
>
>>>     * kvm_sdei_kvm_vcpu
>> Didn't you mean kvm_sdei_vcpu_event instead?
>
> Yeah, you're correct. I was supposed to explain kvm_sdei_vcpu_event here.
>
>>>       SDEI event that has been delivered to the target vCPU.
>>>     * kvm_sdei_kvm
>>>       Place holder of exported and registered SDEI events.
>>>     * kvm_sdei_vcpu
>>>       Auxiliary object to save the preempted context during SDEI event
>>>       delivery.
>>>
>>> The error is returned for all SDEI hypercalls for now. They will be
>>> implemented by the subsequent patches.
>>>
>>> Signed-off-by: Gavin Shan <[email protected]>
>>> ---
>>>   arch/arm64/include/asm/kvm_host.h      |   6 +
>>>   arch/arm64/include/asm/kvm_sdei.h      | 118 +++++++++++++++
>>>   arch/arm64/include/uapi/asm/kvm.h      |   1 +
>>>   arch/arm64/include/uapi/asm/kvm_sdei.h |  60 ++++++++
>>>   arch/arm64/kvm/Makefile                |   2 +-
>>>   arch/arm64/kvm/arm.c                   |   7 +
>>>   arch/arm64/kvm/hypercalls.c            |  18 +++
>>>   arch/arm64/kvm/sdei.c                  | 198 +++++++++++++++++++++++++
>>>   8 files changed, 409 insertions(+), 1 deletion(-)
>>>   create mode 100644 arch/arm64/include/asm/kvm_sdei.h
>>>   create mode 100644 arch/arm64/include/uapi/asm/kvm_sdei.h
>>>   create mode 100644 arch/arm64/kvm/sdei.c
>>>
>>> diff --git a/arch/arm64/include/asm/kvm_host.h
>>> b/arch/arm64/include/asm/kvm_host.h
>>> index 41911585ae0c..aedf901e1ec7 100644
>>> --- a/arch/arm64/include/asm/kvm_host.h
>>> +++ b/arch/arm64/include/asm/kvm_host.h
>>> @@ -113,6 +113,9 @@ struct kvm_arch {
>>>       /* Interrupt controller */
>>>       struct vgic_dist    vgic;
>>>   +    /* SDEI support */
>> does not bring much. Why not reusing the commit msg explanation? Here
>> and below.
>
> I would drop the comment in next respin because I want to avoid too much
> comments to be embedded into "struct kvm_arch". The struct is already
> huge in terms of number of fields.
Yep I would drop it too.
>
>>> +    struct kvm_sdei_kvm *sdei;
>>> +
>>>       /* Mandated version of PSCI */
>>>       u32 psci_version;
>>>   @@ -339,6 +342,9 @@ struct kvm_vcpu_arch {
>>>        * here.
>>>        */
>>>   +    /* SDEI support */
>>> +    struct kvm_sdei_vcpu *sdei;
>>> +
>>>       /*
>>>        * Guest registers we preserve during guest debugging.
>>>        *
>>> diff --git a/arch/arm64/include/asm/kvm_sdei.h
>>> b/arch/arm64/include/asm/kvm_sdei.h
>>> new file mode 100644
>>> index 000000000000..b0abc13a0256
>>> --- /dev/null
>>> +++ b/arch/arm64/include/asm/kvm_sdei.h
>>> @@ -0,0 +1,118 @@
>>> +/* SPDX-License-Identifier: GPL-2.0-only */
>>> +/*
>>> + * Definitions of various KVM SDEI events.
>>> + *
>>> + * Copyright (C) 2021 Red Hat, Inc.
>>> + *
>>> + * Author(s): Gavin Shan <[email protected]>
>>> + */
>>> +
>>> +#ifndef __ARM64_KVM_SDEI_H__
>>> +#define __ARM64_KVM_SDEI_H__
>>> +
>>> +#include <uapi/linux/arm_sdei.h>
>>> +#include <uapi/asm/kvm_sdei.h>
>>> +#include <linux/bitmap.h>
>>> +#include <linux/list.h>
>>> +#include <linux/spinlock.h>
>>> +
>>> +struct kvm_sdei_event {
>>> +    struct kvm_sdei_event_state        state;
>>> +    struct kvm                *kvm;
>>> +    struct list_head            link;
>>> +};
>>> +
>>> +struct kvm_sdei_kvm_event {
>>> +    struct kvm_sdei_kvm_event_state        state;
>>> +    struct kvm_sdei_event            *kse;
>>> +    struct kvm                *kvm;
>> can't you reuse the kvm handle in state?
>
> Nope, there is no kvm handle in @state.
right mixed names sorry
>
>>> +    struct list_head            link;
>>> +};
>>> +
>>> +struct kvm_sdei_vcpu_event {
>>> +    struct kvm_sdei_vcpu_event_state    state;
>>> +    struct kvm_sdei_kvm_event        *kske;
>>> +    struct kvm_vcpu                *vcpu;
>>> +    struct list_head            link;
>>> +};
>>> +
>>> +struct kvm_sdei_kvm {
>>> +    spinlock_t        lock;
>>> +    struct list_head    events;        /* kvm_sdei_event */
>>> +    struct list_head    kvm_events;    /* kvm_sdei_kvm_event */
>>> +};
>>> +
>>> +struct kvm_sdei_vcpu {
>>> +    spinlock_t                      lock;
>>> +    struct kvm_sdei_vcpu_state      state;
>> could you explain the fields below?
>
> As defined by the specification, each SDEI event is given priority:
> critical
> or normal priority. The priority affects how the SDEI event is delivered.
> The critical event can preempt the normal one, but the reverse thing can't
> be done.
>
>>> +    struct kvm_sdei_vcpu_event      *critical_event;
>>> +    struct kvm_sdei_vcpu_event      *normal_event;
>>> +    struct list_head                critical_events;
>>> +    struct list_head                normal_events;
>>> +};
>>> +
>>> +/*
>>> + * According to SDEI specification (v1.0), the event number spans
>>> 32-bits
>>> + * and the lower 24-bits are used as the (real) event number. I don't
>>> + * think we can use that much SDEI numbers in one system. So we reserve
>>> + * two bits from the 24-bits real event number, to indicate its types:
>>> + * physical event and virtual event. One reserved bit is enough for
>>> now,
>>> + * but two bits are reserved for possible extension in future.
>> I think this assumption is worth to be mentionned in the commit msg.
>
> Sure, I will explain it in the commit log in next respin.
>
>>> + *
>>> + * The physical events are owned by underly firmware while the virtual
>> underly?
>
> s/underly firmware/firmware in next respin.
>
>>> + * events are used by VMM and KVM.
>>> + */
>>> +#define KVM_SDEI_EV_NUM_TYPE_SHIFT    22
>>> +#define KVM_SDEI_EV_NUM_TYPE_MASK    3
>>> +#define KVM_SDEI_EV_NUM_TYPE_PHYS    0
>>> +#define KVM_SDEI_EV_NUM_TYPE_VIRT    1
>>> +
>>> +static inline bool kvm_sdei_is_valid_event_num(unsigned long num)
>> the name of the function does does not really describe what it does. It
>> actually checks the sdei is a virtual one. suggest kvm_sdei_is_virtual?
>
> The header file is only used by KVM where the virtual SDEI event is the
> only concern. However, kvm_sdei_is_virtual() is a better name.
>
>>> +{
>>> +    unsigned long type;
>>> +
>>> +    if (num >> 32)
>>> +        return false;
>>> +
>>> +    type = (num >> KVM_SDEI_EV_NUM_TYPE_SHIFT) &
>>> KVM_SDEI_EV_NUM_TYPE_MASK;
>> I think the mask generally is applied before shifting. See
>> include/linux/irqchip/arm-gic-v3.h
>
> Ok, I will adopt the style in next respin.
>
>>> +    if (type != KVM_SDEI_EV_NUM_TYPE_VIRT)
>>> +        return false;
>>> +
>>> +    return true;
>>> +}
>>> +
>>> +/* Accessors for the registration or enablement states of KVM event */
>>> +#define KVM_SDEI_FLAG_FUNC(field)                       \
>>> +static inline bool kvm_sdei_is_##field(struct kvm_sdei_kvm_event
>>> *kske,       \
>>> +                       unsigned int index)           \
>>> +{                                       \
>>> +    return !!test_bit(index, (void *)(kske->state.field));           \
>>> +}                                       \
>>> +                                       \
>>> +static inline bool kvm_sdei_empty_##field(struct kvm_sdei_kvm_event
>>> *kske) \
>> nit: s/empty/none ?
>
> "empty" is sticky to bitmap_empty(), but "none" here looks better :)
>
>>> +{                                       \
>>> +    return bitmap_empty((void *)(kske->state.field),           \
>>> +                KVM_SDEI_MAX_VCPUS);               \
>>> +}                                       \
>>> +static inline void kvm_sdei_set_##field(struct kvm_sdei_kvm_event
>>> *kske,   \
>>> +                    unsigned int index)           \
>>> +{                                       \
>>> +    set_bit(index, (void *)(kske->state.field));               \
>>> +}                                       \
>>> +static inline void kvm_sdei_clear_##field(struct kvm_sdei_kvm_event
>>> *kske, \
>>> +                      unsigned int index)           \
>>> +{                                       \
>>> +    clear_bit(index, (void *)(kske->state.field));               \
>>> +}
>>> +
>>> +KVM_SDEI_FLAG_FUNC(registered)
>>> +KVM_SDEI_FLAG_FUNC(enabled)
>>> +
>>> +/* APIs */
>>> +void kvm_sdei_init_vm(struct kvm *kvm);
>>> +void kvm_sdei_create_vcpu(struct kvm_vcpu *vcpu);
>>> +int kvm_sdei_hypercall(struct kvm_vcpu *vcpu);
>>> +void kvm_sdei_destroy_vcpu(struct kvm_vcpu *vcpu);
>>> +void kvm_sdei_destroy_vm(struct kvm *kvm);
>>> +
>>> +#endif /* __ARM64_KVM_SDEI_H__ */
>>> diff --git a/arch/arm64/include/uapi/asm/kvm.h
>>> b/arch/arm64/include/uapi/asm/kvm.h
>>> index b3edde68bc3e..e1b200bb6482 100644
>>> --- a/arch/arm64/include/uapi/asm/kvm.h
>>> +++ b/arch/arm64/include/uapi/asm/kvm.h
>>> @@ -36,6 +36,7 @@
>>>   #include <linux/types.h>
>>>   #include <asm/ptrace.h>
>>>   #include <asm/sve_context.h>
>>> +#include <asm/kvm_sdei.h>
>>>     #define __KVM_HAVE_GUEST_DEBUG
>>>   #define __KVM_HAVE_IRQ_LINE
>>> diff --git a/arch/arm64/include/uapi/asm/kvm_sdei.h
>>> b/arch/arm64/include/uapi/asm/kvm_sdei.h
>>> new file mode 100644
>>> index 000000000000..8928027023f6
>>> --- /dev/null
>>> +++ b/arch/arm64/include/uapi/asm/kvm_sdei.h
>>> @@ -0,0 +1,60 @@
>>> +/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
>>> +/*
>>> + * Definitions of various KVM SDEI event states.
>>> + *
>>> + * Copyright (C) 2021 Red Hat, Inc.
>>> + *
>>> + * Author(s): Gavin Shan <[email protected]>
>>> + */
>>> +
>>> +#ifndef _UAPI__ASM_KVM_SDEI_H
>>> +#define _UAPI__ASM_KVM_SDEI_H
>>> +
>>> +#ifndef __ASSEMBLY__
>>> +#include <linux/types.h>
>>> +
>>> +#define KVM_SDEI_MAX_VCPUS    512
>>> +#define KVM_SDEI_INVALID_NUM    0
>>> +#define KVM_SDEI_DEFAULT_NUM    0x40400000
>>
>> The motivation behind introducing such uapi should be clearer (besides
>> just telling this aims at migrating). To me atm, this justification does
>> not make possible to understand if those structs are well suited. You
>> should document the migration process I think.
>>
>> I would remove _state suffix in all of them.
>
> I think so. I will add document "Documentation/virt/kvm/arm/sdei.rst" to
> explain the design and the corresponding data structs for migration.
> However,
> I would keep "state" suffix because I used this field as indicator for
> data structs to be migrated. However, the structs should be named
> accordingly
> since they're embedded to their parent structs:
>
>    kvm_sdei_event_state
>    kvm_sdei_kevent_state
>    kvm_sdei_vevent_state
>    kvm_sdei_vcpu_state

>
>>> +
>>> +struct kvm_sdei_event_state {
>> This is not really a state because it cannot be changed by the guest,
>> right? I would remove _state and just call it kvm_sdei_event
>
> The name kvm_sdei_event will be conflicting with same struct, defined
> in include/asm/kvm_sdei.h. Lets keep "_state" as I explained. I use
> the suffix as indicator to structs which need migration even though
> they're not changeable.
ok
>
>>> +    __u64    num;
>>> +
>>> +    __u8    type;
>>> +    __u8    signaled;
>>> +    __u8    priority;
>> you need some padding to be 64-bit aligned. See in generic or aarch64
>> kvm.h for instance.
>
> Sure.
>
>>> +};
>>> +
>>> +struct kvm_sdei_kvm_event_state {
>> I would rename into kvm_sdei_registered_event or smth alike
anyway the doc explaining the migration process will help here.
>
> As above, it will be conflicting with its parent struct, defined
> in include/asm/kvm_sdei.h
>
>>> +    __u64    num;
>> how does this num differ from the event state one?
>
> @num is same thing to that in kvm_sdei_event_state. It's used as
> index to retrieve corresponding kvm_sdei_event_state. One
> kvm_sdei_event_state
> instance can be dereferenced by kvm_sdei_kvm_event_state and
> kvm_sdei_kvm_vcpu_event_state.
> It's why we don't embed kvm_sdei_event_state in them, to avoid duplicated
> traffic in migration.
>
>>> +    __u32    refcount;
>>> +
>>> +    __u8    route_mode;
>> padding also here. See for instance
>> https://lore.kernel.org/kvm/[email protected]/T/#m7bac2ff2b28a68f8d2196ec452afd3e46682760d
>>
>>
>> Maybe put the the route_mode field and refcount at the end and add one
>> byte of padding?
>>
>> Why can't we have a single sdei_event uapi representation where route
>> mode defaults to unset and refcount defaults to 0 when not registered?
>>
>
> Ok. I will fix the padding and alignment in next respin. The
> @route_affinity
> can be changed on request from the guest. The @refcount helps to prevent
> the
> event from being unregistered if it's still dereferenced by
> kvm_sdei_vcpu_event_state.
>
>>> +    __u64    route_affinity;
>>> +    __u64    entries[KVM_SDEI_MAX_VCPUS];
>>> +    __u64    params[KVM_SDEI_MAX_VCPUS];
>> I would rename entries into ep_address and params into ep_arg.
>
> Ok, but what does "ep" means? I barely guess it's "entry point".
> I'm not sure if you're talking about "PE" here.
ep = entry point
>
>>> +    __u64    registered[KVM_SDEI_MAX_VCPUS/64];
>> maybe add a comment along with KVM_SDEI_MAX_VCPUS that it must be a
>> multiple of 64 (or a build check)
>>
>
> Sure.
>  
>>> +    __u64    enabled[KVM_SDEI_MAX_VCPUS/64];
>> Also you may clarify what this gets used for a shared event. I guess
>> this only makes sense for a private event which can be registered by
>> several EPs?
>
> Nope, they're used by both shared and private events. For shared event,
> the bit#0 is used to indicate the state, while the individual bit is
> used for the private event. Yes, the private event can be registered
> and enabled separately on multiple PEs.
>
>>> +};
>>> +
>>> +struct kvm_sdei_vcpu_event_state {
>>> +    __u64    num;
>>> +    __u32    refcount;
>> how does it differ from num and refcount of the registered event?
>> padding++
>
> About @num and @refcount, please refer to the above explanation. Yes,
> I will fix padding in next respin.
>
>>> +};
>>> +
>>> +struct kvm_sdei_vcpu_regs {
>>> +    __u64    regs[18];
>>> +    __u64    pc;
>>> +    __u64    pstate;
>>> +};
>>> +
>>> +struct kvm_sdei_vcpu_state {
>>> +    __u8                masked;
>> padding++
>
> Ok.
>
>>> +    __u64                critical_num;
>>> +    __u64                normal_num;
>>> +    struct kvm_sdei_vcpu_regs    critical_regs;
>>> +    struct kvm_sdei_vcpu_regs    normal_regs;
>>> +};
>>> +
>>> +#endif /* !__ASSEMBLY__ */
>>> +#endif /* _UAPI__ASM_KVM_SDEI_H */
>>> diff --git a/arch/arm64/kvm/Makefile b/arch/arm64/kvm/Makefile
>>> index 989bb5dad2c8..eefca8ca394d 100644
>>> --- a/arch/arm64/kvm/Makefile
>>> +++ b/arch/arm64/kvm/Makefile
>>> @@ -16,7 +16,7 @@ kvm-y := $(KVM)/kvm_main.o $(KVM)/coalesced_mmio.o
>>> $(KVM)/eventfd.o \
>>>        inject_fault.o va_layout.o handle_exit.o \
>>>        guest.o debug.o reset.o sys_regs.o \
>>>        vgic-sys-reg-v3.o fpsimd.o pmu.o \
>>> -     arch_timer.o trng.o\
>>> +     arch_timer.o trng.o sdei.o \
>>>        vgic/vgic.o vgic/vgic-init.o \
>>>        vgic/vgic-irqfd.o vgic/vgic-v2.o \
>>>        vgic/vgic-v3.o vgic/vgic-v4.o \
>>> diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
>>> index e9a2b8f27792..2f021aa41632 100644
>>> --- a/arch/arm64/kvm/arm.c
>>> +++ b/arch/arm64/kvm/arm.c
>>> @@ -150,6 +150,8 @@ int kvm_arch_init_vm(struct kvm *kvm, unsigned
>>> long type)
>>>         kvm_vgic_early_init(kvm);
>>>   +    kvm_sdei_init_vm(kvm);
>>> +
>>>       /* The maximum number of VCPUs is limited by the host's GIC
>>> model */
>>>       kvm->arch.max_vcpus = kvm_arm_default_max_vcpus();
>>>   @@ -179,6 +181,8 @@ void kvm_arch_destroy_vm(struct kvm *kvm)
>>>         kvm_vgic_destroy(kvm);
>>>   +    kvm_sdei_destroy_vm(kvm);
>>> +
>>>       for (i = 0; i < KVM_MAX_VCPUS; ++i) {
>>>           if (kvm->vcpus[i]) {
>>>               kvm_vcpu_destroy(kvm->vcpus[i]);
>>> @@ -333,6 +337,8 @@ int kvm_arch_vcpu_create(struct kvm_vcpu *vcpu)
>>>         kvm_arm_pvtime_vcpu_init(&vcpu->arch);
>>>   +    kvm_sdei_create_vcpu(vcpu);
>>> +
>>>       vcpu->arch.hw_mmu = &vcpu->kvm->arch.mmu;
>>>         err = kvm_vgic_vcpu_init(vcpu);
>>> @@ -354,6 +360,7 @@ void kvm_arch_vcpu_destroy(struct kvm_vcpu *vcpu)
>>>       kvm_mmu_free_memory_cache(&vcpu->arch.mmu_page_cache);
>>>       kvm_timer_vcpu_terminate(vcpu);
>>>       kvm_pmu_vcpu_destroy(vcpu);
>>> +    kvm_sdei_destroy_vcpu(vcpu);
>>>         kvm_arm_vcpu_destroy(vcpu);
>>>   }
>>> diff --git a/arch/arm64/kvm/hypercalls.c b/arch/arm64/kvm/hypercalls.c
>>> index 30da78f72b3b..d3fc893a4f58 100644
>>> --- a/arch/arm64/kvm/hypercalls.c
>>> +++ b/arch/arm64/kvm/hypercalls.c
>>> @@ -139,6 +139,24 @@ int kvm_hvc_call_handler(struct kvm_vcpu *vcpu)
>>>       case ARM_SMCCC_TRNG_RND32:
>>>       case ARM_SMCCC_TRNG_RND64:
>>>           return kvm_trng_call(vcpu);
>>> +    case SDEI_1_0_FN_SDEI_VERSION:
>>> +    case SDEI_1_0_FN_SDEI_EVENT_REGISTER:
>>> +    case SDEI_1_0_FN_SDEI_EVENT_ENABLE:
>>> +    case SDEI_1_0_FN_SDEI_EVENT_DISABLE:
>>> +    case SDEI_1_0_FN_SDEI_EVENT_CONTEXT:
>>> +    case SDEI_1_0_FN_SDEI_EVENT_COMPLETE:
>>> +    case SDEI_1_0_FN_SDEI_EVENT_COMPLETE_AND_RESUME:
>>> +    case SDEI_1_0_FN_SDEI_EVENT_UNREGISTER:
>>> +    case SDEI_1_0_FN_SDEI_EVENT_STATUS:
>>> +    case SDEI_1_0_FN_SDEI_EVENT_GET_INFO:
>>> +    case SDEI_1_0_FN_SDEI_EVENT_ROUTING_SET:
>>> +    case SDEI_1_0_FN_SDEI_PE_MASK:
>>> +    case SDEI_1_0_FN_SDEI_PE_UNMASK:
>>> +    case SDEI_1_0_FN_SDEI_INTERRUPT_BIND:
>>> +    case SDEI_1_0_FN_SDEI_INTERRUPT_RELEASE:
>>> +    case SDEI_1_0_FN_SDEI_PRIVATE_RESET:
>>> +    case SDEI_1_0_FN_SDEI_SHARED_RESET:
>>> +        return kvm_sdei_hypercall(vcpu);
>>>       default:
>>>           return kvm_psci_call(vcpu);
>>>       }
>>> diff --git a/arch/arm64/kvm/sdei.c b/arch/arm64/kvm/sdei.c
>>> new file mode 100644
>>> index 000000000000..ab330b74a965
>>> --- /dev/null
>>> +++ b/arch/arm64/kvm/sdei.c
>>> @@ -0,0 +1,198 @@
>>> +// SPDX-License-Identifier: GPL-2.0-only
>>> +/*
>>> + * SDEI virtualization support.
>>> + *
>>> + * Copyright (C) 2021 Red Hat, Inc.
>>> + *
>>> + * Author(s): Gavin Shan <[email protected]>
>>> + */
>>> +
>>> +#include <linux/kernel.h>
>>> +#include <linux/kvm_host.h>
>>> +#include <linux/spinlock.h>
>>> +#include <linux/slab.h>
>>> +#include <kvm/arm_hypercalls.h>
>>> +
>>> +static struct kvm_sdei_event_state defined_kse[] = {
>>> +    { KVM_SDEI_DEFAULT_NUM,
>>> +      SDEI_EVENT_TYPE_PRIVATE,
>>> +      1,
>>> +      SDEI_EVENT_PRIORITY_CRITICAL
>>> +    },
>>> +};
>> I understand from the above we currently only support a single static (~
>> platform) SDEI event with num = KVM_SDEI_DEFAULT_NUM. We do not support
>> bound events. You may add a comment here and maybe in the commit msg.
>> I would rename the variable into exported_events.
>
> Yeah, we may enhance it to allow userspace to add more in future, but
> not now. Ok, I will rename it to @exported_events.
>
>>> +
>>> +static void kvm_sdei_remove_events(struct kvm *kvm)
>>> +{
>>> +    struct kvm_sdei_kvm *ksdei = kvm->arch.sdei;
>>> +    struct kvm_sdei_event *kse, *tmp;
>>> +
>>> +    list_for_each_entry_safe(kse, tmp, &ksdei->events, link) {
>>> +        list_del(&kse->link);
>>> +        kfree(kse);
>>> +    }
>>> +}
>>> +
>>> +static void kvm_sdei_remove_kvm_events(struct kvm *kvm,
>>> +                       unsigned int mask,
>>> +                       bool force)
>>> +{
>>> +    struct kvm_sdei_kvm *ksdei = kvm->arch.sdei;
>>> +    struct kvm_sdei_event *kse;
>>> +    struct kvm_sdei_kvm_event *kske, *tmp;
>>> +
>>> +    list_for_each_entry_safe(kske, tmp, &ksdei->kvm_events, link) {
>>> +        kse = kske->kse;
>>> +
>>> +        if (!((1 << kse->state.type) & mask))
>>> +            continue;
>> don't you need to hold a lock before looping? What if somebody concurrently
>> changes the state fields, especially the refcount below?
>
> Yes, the caller holds @kvm->sdei_lock.
>
>>> +
>>> +        if (!force && kske->state.refcount)
>>> +            continue;
>> Usually the refcount is used to control the lifetime of the object. The
>> 'force' flag looks wrong in that context. Shouldn't you make sure all
>> users have released their refcounts and on the last decrement, delete
>> the object?
>
> @force is used for exceptional case. For example, the KVM process is
> killed before the event reference count gets chance to be dropped.
hum not totally convinced here but let's see your next version ;-)
>
>>> +
>>> +        list_del(&kske->link);
>>> +        kfree(kske);
>>> +    }
>>> +}
>>> +
>>> +static void kvm_sdei_remove_vcpu_events(struct kvm_vcpu *vcpu)
>>> +{
>>> +    struct kvm_sdei_vcpu *vsdei = vcpu->arch.sdei;
>>> +    struct kvm_sdei_vcpu_event *ksve, *tmp;
>>> +
>>> +    list_for_each_entry_safe(ksve, tmp, &vsdei->critical_events,
>>> link) {
>>> +        list_del(&ksve->link);
>>> +        kfree(ksve);
>>> +    }
>>> +
>>> +    list_for_each_entry_safe(ksve, tmp, &vsdei->normal_events, link) {
>>> +        list_del(&ksve->link);
>>> +        kfree(ksve);
>>> +    }
>>> +}
>>> +
>>> +int kvm_sdei_hypercall(struct kvm_vcpu *vcpu)
>>> +{
>>> +    u32 func = smccc_get_function(vcpu);
>>> +    bool has_result = true;
>>> +    unsigned long ret;
>>> +
>>> +    switch (func) {
>>> +    case SDEI_1_0_FN_SDEI_VERSION:
>>> +    case SDEI_1_0_FN_SDEI_EVENT_REGISTER:
>>> +    case SDEI_1_0_FN_SDEI_EVENT_ENABLE:
>>> +    case SDEI_1_0_FN_SDEI_EVENT_DISABLE:
>>> +    case SDEI_1_0_FN_SDEI_EVENT_CONTEXT:
>>> +    case SDEI_1_0_FN_SDEI_EVENT_COMPLETE:
>>> +    case SDEI_1_0_FN_SDEI_EVENT_COMPLETE_AND_RESUME:
>>> +    case SDEI_1_0_FN_SDEI_EVENT_UNREGISTER:
>>> +    case SDEI_1_0_FN_SDEI_EVENT_STATUS:
>>> +    case SDEI_1_0_FN_SDEI_EVENT_GET_INFO:
>>> +    case SDEI_1_0_FN_SDEI_EVENT_ROUTING_SET:
>>> +    case SDEI_1_0_FN_SDEI_PE_MASK:
>>> +    case SDEI_1_0_FN_SDEI_PE_UNMASK:
>>> +    case SDEI_1_0_FN_SDEI_INTERRUPT_BIND:
>>> +    case SDEI_1_0_FN_SDEI_INTERRUPT_RELEASE:
>>> +    case SDEI_1_0_FN_SDEI_PRIVATE_RESET:
>>> +    case SDEI_1_0_FN_SDEI_SHARED_RESET:
>>> +    default:
>>> +        ret = SDEI_NOT_SUPPORTED;
>>> +    }
>>> +
>>> +    /*
>>> +     * We don't have return value for COMPLETE or COMPLETE_AND_RESUME
>>> +     * hypercalls. Otherwise, the restored context will be corrupted.
>>> +     */
>>> +    if (has_result)
>>> +        smccc_set_retval(vcpu, ret, 0, 0, 0);
>> If I understand the above comment, COMPLETE and COMPLETE_AND_RESUME
>> should have has_result set to false whereas in that case they will
>> return NOT_SUPPORTED. Is that OK for the context restore?
>
> Nice catch! @has_result needs to be false for COMPLETE and
> COMPLETE_AND_RESUME.
>
>>> +
>>> +    return 1;
>>> +}
>>> +
>>> +void kvm_sdei_init_vm(struct kvm *kvm)
>>> +{
>>> +    struct kvm_sdei_kvm *ksdei;
>>> +    struct kvm_sdei_event *kse;
>>> +    int i;
>>> +
>>> +    ksdei = kzalloc(sizeof(*ksdei), GFP_KERNEL);
>>> +    if (!ksdei)
>>> +        return;
>>> +
>>> +    spin_lock_init(&ksdei->lock);
>>> +    INIT_LIST_HEAD(&ksdei->events);
>>> +    INIT_LIST_HEAD(&ksdei->kvm_events);
>>> +
>>> +    /*
>>> +     * Populate the defined KVM SDEI events. The whole functionality
>>> +     * will be disabled on any errors.
>> You should definitively revise your naming conventions. this brings
>> confusion inbetween exported events and registered events. Why not
>> simply adopt the spec terminology?
>
> Yeah, I think so, but I think "defined KVM SDEI events" is following
> the specification because the SDEI event is defined by the firmware
> as the specification says. We're emulating firmware in KVM here.
>
>>> +     */
>>> +    for (i = 0; i < ARRAY_SIZE(defined_kse); i++) {
>>> +        kse = kzalloc(sizeof(*kse), GFP_KERNEL);
>>> +        if (!kse) {
>>> +            kvm_sdei_remove_events(kvm);
>>> +            kfree(ksdei);
>>> +            return;
>>> +        }
>> Add a comment saying that, although we currently support a single
>> static event, we prepare for binding support by building a list of
>> exposed events?
>>
>> Or maybe simplify the implementation at this stage of the development
>> assuming a single platform event is supported?
>
> I will add a comment as you suggested in the next respin. Note that
> another entry will be added to the defined event array when Async PF
> is involved.
>
>>> +
>>> +        kse->kvm   = kvm;
>>> +        kse->state = defined_kse[i];
>>> +        list_add_tail(&kse->link, &ksdei->events);
>>> +    }
>>> +
>>> +    kvm->arch.sdei = ksdei;
>>> +}
>>> +
>>> +void kvm_sdei_create_vcpu(struct kvm_vcpu *vcpu)
>>> +{
>>> +    struct kvm *kvm = vcpu->kvm;
>>> +    struct kvm_sdei_vcpu *vsdei;
>>> +
>>> +    if (!kvm->arch.sdei)
>>> +        return;
>>> +
>>> +    vsdei = kzalloc(sizeof(*vsdei), GFP_KERNEL);
>>> +    if (!vsdei)
>>> +        return;
>>> +
>>> +    spin_lock_init(&vsdei->lock);
>>> +    vsdei->state.masked       = 1;
>>> +    vsdei->state.critical_num = KVM_SDEI_INVALID_NUM;
>>> +    vsdei->state.normal_num   = KVM_SDEI_INVALID_NUM;
>>> +    vsdei->critical_event     = NULL;
>>> +    vsdei->normal_event       = NULL;
>>> +    INIT_LIST_HEAD(&vsdei->critical_events);
>>> +    INIT_LIST_HEAD(&vsdei->normal_events);
>>> +
>>> +    vcpu->arch.sdei = vsdei;
>>> +}
>>> +
>>> +void kvm_sdei_destroy_vcpu(struct kvm_vcpu *vcpu)
>>> +{
>>> +    struct kvm_sdei_vcpu *vsdei = vcpu->arch.sdei;
>>> +
>>> +    if (vsdei) {
>>> +        spin_lock(&vsdei->lock);
>>> +        kvm_sdei_remove_vcpu_events(vcpu);
>>> +        spin_unlock(&vsdei->lock);
>>> +
>>> +        kfree(vsdei);
>>> +        vcpu->arch.sdei = NULL;
>>> +    }
>>> +}
>>> +
>>> +void kvm_sdei_destroy_vm(struct kvm *kvm)
>>> +{
>>> +    struct kvm_sdei_kvm *ksdei = kvm->arch.sdei;
>>> +    unsigned int mask = (1 << SDEI_EVENT_TYPE_PRIVATE) |
>>> +                (1 << SDEI_EVENT_TYPE_SHARED);
>>> +
>>> +    if (ksdei) {
>>> +        spin_lock(&ksdei->lock);
>>> +        kvm_sdei_remove_kvm_events(kvm, mask, true);
>>> +        kvm_sdei_remove_events(kvm);
>>> +        spin_unlock(&ksdei->lock);
>>> +
>>> +        kfree(ksdei);
>>> +        kvm->arch.sdei = NULL;
>>> +    }
>>> +}
>>>
>
> Thanks,
> Gavin
>
Thanks

Eric

2022-01-28 02:57:31

by Eric Auger

[permalink] [raw]
Subject: Re: [PATCH v4 16/21] KVM: arm64: Support SDEI ioctl commands on VM

Hi Gavin,

On 1/12/22 8:03 AM, Gavin Shan wrote:
> Hi Eric,
>
> On 11/10/21 9:48 PM, Eric Auger wrote:
>> On 8/15/21 2:13 AM, Gavin Shan wrote:
>>> This supports ioctl commands on VM to manage the various objects.
>>> It's primarily used by VMM to accomplish live migration. The ioctl
>>> commands introduced by this are highlighted as blow:
>> below
>>>
>>>     * KVM_SDEI_CMD_GET_VERSION
>>>       Retrieve the version of current implementation
>> which implementation, SDEI?
>>>     * KVM_SDEI_CMD_SET_EVENT
>>>       Add event to be exported from KVM so that guest can register
>>>       against it afterwards
>>>     * KVM_SDEI_CMD_GET_KEVENT_COUNT
>>>       Retrieve number of registered SDEI events
>>>     * KVM_SDEI_CMD_GET_KEVENT
>>>       Retrieve the state of the registered SDEI event
>>>     * KVM_SDEI_CMD_SET_KEVENT
>>>       Populate the registered SDEI event
>> I think we really miss the full picture of what you want to achieve with
>> those IOCTLs or at least I fail to get it. Please document the UAPI
>> separately including the structs and IOCTLs.
>
> The commit log will be improved accordingly in next revision. Yep, I will
> add document for UAPI and IOCTLs :)
>
>>>
>>> Signed-off-by: Gavin Shan <[email protected]>
>>> ---
>>>   arch/arm64/include/asm/kvm_sdei.h      |   1 +
>>>   arch/arm64/include/uapi/asm/kvm_sdei.h |  17 +++
>>>   arch/arm64/kvm/arm.c                   |   3 +
>>>   arch/arm64/kvm/sdei.c                  | 171 +++++++++++++++++++++++++
>>>   include/uapi/linux/kvm.h               |   3 +
>>>   5 files changed, 195 insertions(+)
>>>
>>> diff --git a/arch/arm64/include/asm/kvm_sdei.h
>>> b/arch/arm64/include/asm/kvm_sdei.h
>>> index 19f2d9b91f85..8f5ea947ed0e 100644
>>> --- a/arch/arm64/include/asm/kvm_sdei.h
>>> +++ b/arch/arm64/include/asm/kvm_sdei.h
>>> @@ -125,6 +125,7 @@ int kvm_sdei_hypercall(struct kvm_vcpu *vcpu);
>>>   int kvm_sdei_register_notifier(struct kvm *kvm, unsigned long num,
>>>                      kvm_sdei_notifier notifier);
>>>   void kvm_sdei_deliver(struct kvm_vcpu *vcpu);
>>> +long kvm_sdei_vm_ioctl(struct kvm *kvm, unsigned long arg);
>>>   void kvm_sdei_destroy_vcpu(struct kvm_vcpu *vcpu);
>>>   void kvm_sdei_destroy_vm(struct kvm *kvm);
>>>   diff --git a/arch/arm64/include/uapi/asm/kvm_sdei.h
>>> b/arch/arm64/include/uapi/asm/kvm_sdei.h
>>> index 4ef661d106fe..35ff05be3c28 100644
>>> --- a/arch/arm64/include/uapi/asm/kvm_sdei.h
>>> +++ b/arch/arm64/include/uapi/asm/kvm_sdei.h
>>> @@ -57,5 +57,22 @@ struct kvm_sdei_vcpu_state {
>>>       struct kvm_sdei_vcpu_regs    normal_regs;
>>>   };
>>>   +#define KVM_SDEI_CMD_GET_VERSION        0
>>> +#define KVM_SDEI_CMD_SET_EVENT            1
>>> +#define KVM_SDEI_CMD_GET_KEVENT_COUNT        2
>>> +#define KVM_SDEI_CMD_GET_KEVENT            3
>>> +#define KVM_SDEI_CMD_SET_KEVENT            4
>>> +
>>> +struct kvm_sdei_cmd {
>>> +    __u32                        cmd;
>>> +    union {
>>> +        __u32                    version;
>>> +        __u32                    count;
>>> +        __u64                    num;
>>> +        struct kvm_sdei_event_state        kse_state;
>>> +        struct kvm_sdei_kvm_event_state        kske_state;
>>> +    };
>>> +};
>>> +
>>>   #endif /* !__ASSEMBLY__ */
>>>   #endif /* _UAPI__ASM_KVM_SDEI_H */
>>> diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
>>> index 0c3db1ef1ba9..8d61585124b2 100644
>>> --- a/arch/arm64/kvm/arm.c
>>> +++ b/arch/arm64/kvm/arm.c
>>> @@ -1389,6 +1389,9 @@ long kvm_arch_vm_ioctl(struct file *filp,
>>>               return -EFAULT;
>>>           return kvm_vm_ioctl_mte_copy_tags(kvm, &copy_tags);
>>>       }
>>> +    case KVM_ARM_SDEI_COMMAND: {
>>> +        return kvm_sdei_vm_ioctl(kvm, arg);
>>> +    }
>>>       default:
>>>           return -EINVAL;
>>>       }
>>> diff --git a/arch/arm64/kvm/sdei.c b/arch/arm64/kvm/sdei.c
>>> index 5f7a37dcaa77..bdd76c3e5153 100644
>>> --- a/arch/arm64/kvm/sdei.c
>>> +++ b/arch/arm64/kvm/sdei.c
>>> @@ -931,6 +931,177 @@ void kvm_sdei_create_vcpu(struct kvm_vcpu *vcpu)
>>>       vcpu->arch.sdei = vsdei;
>>>   }
>>>   +static long kvm_sdei_set_event(struct kvm *kvm,
>>> +                   struct kvm_sdei_event_state *kse_state)
>>> +{
>>> +    struct kvm_sdei_kvm *ksdei = kvm->arch.sdei;
>>> +    struct kvm_sdei_event *kse = NULL;
>>> +
>>> +    if (!kvm_sdei_is_valid_event_num(kse_state->num))
>>> +        return -EINVAL;
>>> +
>>> +    if (!(kse_state->type == SDEI_EVENT_TYPE_SHARED ||
>>> +          kse_state->type == SDEI_EVENT_TYPE_PRIVATE))
>>> +        return -EINVAL;
>>> +
>>> +    if (!(kse_state->priority == SDEI_EVENT_PRIORITY_NORMAL ||
>>> +          kse_state->priority == SDEI_EVENT_PRIORITY_CRITICAL))
>>> +        return -EINVAL;
>>> +
>>> +    kse = kvm_sdei_find_event(kvm, kse_state->num);
>>> +    if (kse)
>>> +        return -EEXIST;
>>> +
>>> +    kse = kzalloc(sizeof(*kse), GFP_KERNEL);
>>> +    if (!kse)
>>> +        return -ENOMEM;
>> userspace can exhaust the mem since there is no limit. There must be a
>> max.
>>
>
> Ok. I think it's a minor corner case. For now, there is only one
> defined SDEI event. I leave it as something to be improved in future.
Hum ok, actually this depends on kvm_sdei_is_valid_event_num's
implementation.
>
>>> +
>>> +    kse->state = *kse_state;
>>> +    kse->kvm = kvm;
>>> +    list_add_tail(&kse->link, &ksdei->events);
>>> +
>>> +    return 0;
>>> +}
>>> +
>>> +static long kvm_sdei_get_kevent_count(struct kvm *kvm, int *count)
>>> +{
>>> +    struct kvm_sdei_kvm *ksdei = kvm->arch.sdei;
>>> +    struct kvm_sdei_kvm_event *kske = NULL;
>>> +    int total = 0;
>>> +
>>> +    list_for_each_entry(kske, &ksdei->kvm_events, link) {
>>> +        total++;
>>> +    }
>>> +
>>> +    *count = total;
>>> +    return 0;
>>> +}
>>> +
>>> +static long kvm_sdei_get_kevent(struct kvm *kvm,
>>> +                struct kvm_sdei_kvm_event_state *kske_state)
shouldn't the function return an int instead?
>>> +{
>>> +    struct kvm_sdei_kvm *ksdei = kvm->arch.sdei;
>>> +    struct kvm_sdei_kvm_event *kske = NULL;
>>> +
>>> +    /*
>>> +     * The first entry is fetched if the event number is invalid.
>>> +     * Otherwise, the next entry is fetched.
>> why don't we return an error? What is the point of returning the next entry?
>
> The SDEI events attached to the KVM are migrated one by one. Those
> attached SDEI events are linked through a linked list:
>
>     (1) On !kvm_sdei_is_valid_event_num(kske_state->num), the first
>         SDEI event in the linked list is retrieved from the source VM
>         and will be restored on the destination VM.
>
>     (2) Otherwise, the next SDEI event in the linked list will be retrieved
>         from source VM and restored on the destination VM.

and why not return NULL if the num is incorrect? Why do we return the
1st elem?

Eric
>
> Another option is to introduce an additional struct like below. In
> this way, all the attached SDEI events are retrieved and restored at
> once, and the memory block used for storing @kvm_sdei_kvm_event_state
> should be allocated and released by QEMU. Please let me know your
> preference:
>
>     struct xxx {
>            __u64                              count;
>            struct kvm_sdei_kvm_event_state    events;
>     }
>
>>> +     */
>>> +    if (!kvm_sdei_is_valid_event_num(kske_state->num)) {
>>> +        kske = list_first_entry_or_null(&ksdei->kvm_events,
>>> +                struct kvm_sdei_kvm_event, link);
>>> +    } else {
>>> +        kske = kvm_sdei_find_kvm_event(kvm, kske_state->num);
>>> +        if (kske && !list_is_last(&kske->link, &ksdei->kvm_events))
>>> +            kske = list_next_entry(kske, link);
>> Sorry I don't get why we return the next one?
>
> Please refer to the explanation above.
>
>>> +        else
>>> +            kske = NULL;
>>> +    }
>>> +
>>> +    if (!kske)
>>> +        return -ENOENT;
>>> +
>>> +    *kske_state = kske->state;
>>> +
>>> +    return 0;
>>> +}
>>> +
>>> +static long kvm_sdei_set_kevent(struct kvm *kvm,
>>> +                struct kvm_sdei_kvm_event_state *kske_state)
>>> +{
>>> +    struct kvm_sdei_kvm *ksdei = kvm->arch.sdei;
>>> +    struct kvm_sdei_event *kse = NULL;
>>> +    struct kvm_sdei_kvm_event *kske = NULL;
>>> +
>>> +    /* Sanity check */
>>> +    if (!kvm_sdei_is_valid_event_num(kske_state->num))
>>> +        return -EINVAL;
>>> +
>>> +    if (!(kske_state->route_mode == SDEI_EVENT_REGISTER_RM_ANY ||
>>> +          kske_state->route_mode == SDEI_EVENT_REGISTER_RM_PE))
>>> +        return -EINVAL;
>>> +
>>> +    /* Check if the event number is valid */
>>> +    kse = kvm_sdei_find_event(kvm, kske_state->num);
>>> +    if (!kse)
>>> +        return -ENOENT;
>>> +
>>> +    /* Check if the event has been populated */
>>> +    kske = kvm_sdei_find_kvm_event(kvm, kske_state->num);
>>> +    if (kske)
>>> +        return -EEXIST;
>>> +
>>> +    kske = kzalloc(sizeof(*kske), GFP_KERNEL);
>> userspace can exhaust the mem since there is no limit
>
> Ok.
>
>>> +    if (!kske)
>>> +        return -ENOMEM;
>>> +
>>> +    kske->state = *kske_state;
>>> +    kske->kse   = kse;
>>> +    kske->kvm   = kvm;
>>> +    list_add_tail(&kske->link, &ksdei->kvm_events);
>>> +
>>> +    return 0;
>>> +}
>>> +
>>> +long kvm_sdei_vm_ioctl(struct kvm *kvm, unsigned long arg)
>>> +{
>>> +    struct kvm_sdei_kvm *ksdei = kvm->arch.sdei;
>>> +    struct kvm_sdei_cmd *cmd = NULL;
>>> +    void __user *argp = (void __user *)arg;
>>> +    bool copy = false;
>>> +    long ret = 0;
>>> +
>>> +    /* Sanity check */
>>> +    if (!ksdei) {
>>> +        ret = -EPERM;
>>> +        goto out;
>>> +    }
>>> +
>>> +    cmd = kzalloc(sizeof(*cmd), GFP_KERNEL);
>>> +    if (!cmd) {
>>> +        ret = -ENOMEM;
>>> +        goto out;
>>> +    }
>>> +
>>> +    if (copy_from_user(cmd, argp, sizeof(*cmd))) {
>>> +        ret = -EFAULT;
>>> +        goto out;
>>> +    }
>>> +
>>> +    spin_lock(&ksdei->lock);
>>> +
>>> +    switch (cmd->cmd) {
>>> +    case KVM_SDEI_CMD_GET_VERSION:
>>> +        copy = true;
>>> +        cmd->version = (1 << 16);       /* v1.0.0 */
>>> +        break;
>>> +    case KVM_SDEI_CMD_SET_EVENT:
>>> +        ret = kvm_sdei_set_event(kvm, &cmd->kse_state);
>>> +        break;
>>> +    case KVM_SDEI_CMD_GET_KEVENT_COUNT:
>>> +        copy = true;
>>> +        ret = kvm_sdei_get_kevent_count(kvm, &cmd->count);
>>> +        break;
>>> +    case KVM_SDEI_CMD_GET_KEVENT:
>>> +        copy = true;
>>> +        ret = kvm_sdei_get_kevent(kvm, &cmd->kske_state);
>>> +        break;
>>> +    case KVM_SDEI_CMD_SET_KEVENT:
>>> +        ret = kvm_sdei_set_kevent(kvm, &cmd->kske_state);
>>> +        break;
>>> +    default:
>>> +        ret = -EINVAL;
>>> +    }
>>> +
>>> +    spin_unlock(&ksdei->lock);
>>> +out:
>>> +    if (!ret && copy && copy_to_user(argp, cmd, sizeof(*cmd)))
>>> +        ret = -EFAULT;
>>> +
>>> +    kfree(cmd);
>>> +    return ret;
>>> +}
>>> +
>>>   void kvm_sdei_destroy_vcpu(struct kvm_vcpu *vcpu)
>>>   {
>>>       struct kvm_sdei_vcpu *vsdei = vcpu->arch.sdei;
>>> diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
>>> index d9e4aabcb31a..8cf41fd4bf86 100644
>>> --- a/include/uapi/linux/kvm.h
>>> +++ b/include/uapi/linux/kvm.h
>>> @@ -1679,6 +1679,9 @@ struct kvm_xen_vcpu_attr {
>>>   #define KVM_XEN_VCPU_ATTR_TYPE_RUNSTATE_DATA    0x4
>>>   #define KVM_XEN_VCPU_ATTR_TYPE_RUNSTATE_ADJUST    0x5
>>>   +/* Available with KVM_CAP_ARM_SDEI */
>>> +#define KVM_ARM_SDEI_COMMAND    _IOWR(KVMIO, 0xce, struct kvm_sdei_cmd)
>>> +
>>>   /* Secure Encrypted Virtualization command */
>>>   enum sev_cmd_id {
>>>       /* Guest initialization commands */
>>>
>
> Thanks,
> Gavin
>

2022-01-28 04:25:47

by Eric Auger

[permalink] [raw]
Subject: Re: [PATCH v4 10/21] KVM: arm64: Support SDEI_EVENT_ROUTING_SET hypercall

Hi Gavin,

On 1/12/22 3:54 AM, Gavin Shan wrote:
> Hi Eric,
>
> On 11/10/21 2:47 AM, Eric Auger wrote:
>> On 8/15/21 2:13 AM, Gavin Shan wrote:
>>> This supports SDEI_EVENT_ROUTING_SET hypercall. It's used by the
>>> guest to set route mode and affinity for the registered KVM event.
>>> It's only valid for the shared events. It's not allowed to do so
>>> when the corresponding event has been raised to the guest.
>>>
>>> Signed-off-by: Gavin Shan <[email protected]>
>>> ---
>>>   arch/arm64/kvm/sdei.c | 64 +++++++++++++++++++++++++++++++++++++++++++
>>>   1 file changed, 64 insertions(+)
>>>
>>> diff --git a/arch/arm64/kvm/sdei.c b/arch/arm64/kvm/sdei.c
>>> index 5dfa74b093f1..458695c2394f 100644
>>> --- a/arch/arm64/kvm/sdei.c
>>> +++ b/arch/arm64/kvm/sdei.c
>>> @@ -489,6 +489,68 @@ static unsigned long
>>> kvm_sdei_hypercall_info(struct kvm_vcpu *vcpu)
>>>       return ret;
>>>   }
>>>   +static unsigned long kvm_sdei_hypercall_route(struct kvm_vcpu *vcpu)
>>> +{
>>> +    struct kvm *kvm = vcpu->kvm;
>>> +    struct kvm_sdei_kvm *ksdei = kvm->arch.sdei;
>>> +    struct kvm_sdei_vcpu *vsdei = vcpu->arch.sdei;
>>> +    struct kvm_sdei_event *kse = NULL;
>>> +    struct kvm_sdei_kvm_event *kske = NULL;
>>> +    unsigned long event_num = smccc_get_arg1(vcpu);
>>> +    unsigned long route_mode = smccc_get_arg2(vcpu);
>>> +    unsigned long route_affinity = smccc_get_arg3(vcpu);
>>> +    int index = 0;
>>> +    unsigned long ret = SDEI_SUCCESS;
>>> +
>>> +    /* Sanity check */
>>> +    if (!(ksdei && vsdei)) {
>>> +        ret = SDEI_NOT_SUPPORTED;
>>> +        goto out;
>>> +    }
>>> +
>>> +    if (!kvm_sdei_is_valid_event_num(event_num)) {
>>> +        ret = SDEI_INVALID_PARAMETERS;
>>> +        goto out;
>>> +    }
>>> +
>>> +    if (!(route_mode == SDEI_EVENT_REGISTER_RM_ANY ||
>>> +          route_mode == SDEI_EVENT_REGISTER_RM_PE)) {
>>> +        ret = SDEI_INVALID_PARAMETERS;
>>> +        goto out;
>>> +    }
>> Some sanity checking on the affinity arg could be made as well according
>> to 5.1.2  affinity desc. The fn shall return INVALID_PARAMETER in case
>> of invalid affinity.
>
> Yep, you're right. I didn't figure it out. I may put a comment here.
> For now, the SDEI client driver in the guest kernel doesn't attempt
> to change the routing mode.
>
>     /* FIXME: The affinity should be verified */
>
>>> +
>>> +    /* Check if the KVM event has been registered */
>>> +    spin_lock(&ksdei->lock);
>>> +    kske = kvm_sdei_find_kvm_event(kvm, event_num);
>>> +    if (!kske) {
>>> +        ret = SDEI_INVALID_PARAMETERS;
>>> +        goto unlock;
>>> +    }
>>> +
>>> +    /* Validate KVM event state */
>>> +    kse = kske->kse;
>>> +    if (kse->state.type != SDEI_EVENT_TYPE_SHARED) {
>>> +        ret = SDEI_INVALID_PARAMETERS;
>>> +        goto unlock;
>>> +    }
>>> +
>> Event handler is in a state other than: handler-registered.
>
> They're equivalent as the handler is provided as a parameter when
> the event is registered.
>
>>> +    if (!kvm_sdei_is_registered(kske, index) ||
>>> +        kvm_sdei_is_enabled(kske, index)     ||
>>> +        kske->state.refcount) {
>> I am not sure about the refcount role here. Does it make sure the state
>> is != handler-enabled and running or handler-unregister-pending?
>>
>> I think we would gain in readability if we had a helper to check whether
>> we are in those states?
>
> @refcount here indicates a pending SDEI event for delivery. In this
> case, changing its routing mode is disallowed.
OK. I guess you will document the refcount role somewhere.

Thanks

Eric
>
>>> +        ret = SDEI_DENIED;
>>> +        goto unlock;
>>> +    }
>>> +
>>> +    /* Update state */
>>> +    kske->state.route_mode     = route_mode;
>>> +    kske->state.route_affinity = route_affinity;
>>> +
>>> +unlock:
>>> +    spin_unlock(&ksdei->lock);
>>> +out:
>>> +    return ret;
>>> +}
>>> +
>>>   int kvm_sdei_hypercall(struct kvm_vcpu *vcpu)
>>>   {
>>>       u32 func = smccc_get_function(vcpu);
>>> @@ -523,6 +585,8 @@ int kvm_sdei_hypercall(struct kvm_vcpu *vcpu)
>>>           ret = kvm_sdei_hypercall_info(vcpu);
>>>           break;
>>>       case SDEI_1_0_FN_SDEI_EVENT_ROUTING_SET:
>>> +        ret = kvm_sdei_hypercall_route(vcpu);
>>> +        break;
>>>       case SDEI_1_0_FN_SDEI_PE_MASK:
>>>       case SDEI_1_0_FN_SDEI_PE_UNMASK:
>>>       case SDEI_1_0_FN_SDEI_INTERRUPT_BIND:
>>>
>
> Thanks,
> Gavin
>

2022-01-28 04:57:38

by Eric Auger

[permalink] [raw]
Subject: Re: [PATCH v4 09/21] KVM: arm64: Support SDEI_EVENT_GET_INFO hypercall

Hi Gavin,

On 1/12/22 3:46 AM, Gavin Shan wrote:
> Hi Eric,
>
> On 11/10/21 1:19 AM, Eric Auger wrote:
>> On 8/15/21 2:13 AM, Gavin Shan wrote:
>>> This supports SDEI_EVENT_GET_INFO hypercall. It's used by the guest
>>> to retrieve various information about the supported (exported) events,
>>> including type, signaled, route mode and affinity for the shared
>>> events.
>>>
>>> Signed-off-by: Gavin Shan <[email protected]>
>>> ---
>>>   arch/arm64/kvm/sdei.c | 76 +++++++++++++++++++++++++++++++++++++++++++
>>>   1 file changed, 76 insertions(+)
>>>
>>> diff --git a/arch/arm64/kvm/sdei.c b/arch/arm64/kvm/sdei.c
>>> index b95b8c4455e1..5dfa74b093f1 100644
>>> --- a/arch/arm64/kvm/sdei.c
>>> +++ b/arch/arm64/kvm/sdei.c
>>> @@ -415,6 +415,80 @@ static unsigned long
>>> kvm_sdei_hypercall_status(struct kvm_vcpu *vcpu)
>>>       return ret;
>>>   }
>>>   +static unsigned long kvm_sdei_hypercall_info(struct kvm_vcpu *vcpu)
>>> +{
>>> +    struct kvm *kvm = vcpu->kvm;
>>> +    struct kvm_sdei_kvm *ksdei = kvm->arch.sdei;
>>> +    struct kvm_sdei_vcpu *vsdei = vcpu->arch.sdei;
>>> +    struct kvm_sdei_event *kse = NULL;
>>> +    struct kvm_sdei_kvm_event *kske = NULL;
>>> +    unsigned long event_num = smccc_get_arg1(vcpu);
>>> +    unsigned long event_info = smccc_get_arg2(vcpu);
>>> +    unsigned long ret = SDEI_SUCCESS;
>>> +
>>> +    /* Sanity check */
>>> +    if (!(ksdei && vsdei)) {
>>> +        ret = SDEI_NOT_SUPPORTED;
>>> +        goto out;
>>> +    }
>>> +
>>> +    if (!kvm_sdei_is_valid_event_num(event_num)) {
>>> +        ret = SDEI_INVALID_PARAMETERS;
>>> +        goto out;
>>> +    }
>>> +
>>> +    /*
>>> +     * Check if the KVM event exists. The event might have been
>>> +     * registered, we need fetch the information from the registered
>> s/fetch/to fetch
>
> Ack.
>
>>> +     * event in that case.
>>> +     */
>>> +    spin_lock(&ksdei->lock);
>>> +    kske = kvm_sdei_find_kvm_event(kvm, event_num);
>>> +    kse = kske ? kske->kse : NULL;
>>> +    if (!kse) {
>>> +        kse = kvm_sdei_find_event(kvm, event_num);
>>> +        if (!kse) {
>>> +            ret = SDEI_INVALID_PARAMETERS;
>> this should already have been covered by !kvm_sdei_is_valid_event_num I
>> think (although the latter only checks the single static event num with
>> the KVM owner mask)
>
> Nope. Strictly speaking, kvm_sdei_find_event() covers the check carried
> out by !kvm_sdei_is_valid_event_num(). All the defined (exposed) events
> should have a virtual event number :)
you're right
>
>>> +            goto unlock;
>>> +        }
>>> +    }
>>> +
>>> +    /* Retrieve the requested information */
>>> +    switch (event_info) {
>>> +    case SDEI_EVENT_INFO_EV_TYPE:
>>> +        ret = kse->state.type;
>>> +        break;
>>> +    case SDEI_EVENT_INFO_EV_SIGNALED:
>>> +        ret = kse->state.signaled;
>>> +        break;
>>> +    case SDEI_EVENT_INFO_EV_PRIORITY:
>>> +        ret = kse->state.priority;
>>> +        break;
>>> +    case SDEI_EVENT_INFO_EV_ROUTING_MODE:
>>> +    case SDEI_EVENT_INFO_EV_ROUTING_AFF:
>>> +        if (kse->state.type != SDEI_EVENT_TYPE_SHARED) {
>>> +            ret = SDEI_INVALID_PARAMETERS;
>>> +            break;
>>> +        }
>>> +
>>> +        if (event_info == SDEI_EVENT_INFO_EV_ROUTING_MODE) {
>>> +            ret = kske ? kske->state.route_mode :
>>> +                     SDEI_EVENT_REGISTER_RM_ANY;
>> no, if event is not registered (!kske) DENIED should be returned
>
> I don't think so. According to the specification, there is no DENIED
> return value for the STATUS hypercall. Either INVALID_PARAMETERS or
> NOT_SUPPORTED should be returned from this hypercall :)

Look at table 5.1.10.2, Parameter and Return Values. DENIED is returned
in some cases.

Eric
>
>>> +        } else {
>> same here
>>> +            ret = kske ? kske->state.route_affinity : 0;
>>> +        }
>>> +
>>> +        break;
>>> +    default:
>>> +        ret = SDEI_INVALID_PARAMETERS;
>>> +    }
>>> +
>>> +unlock:
>>> +    spin_unlock(&ksdei->lock);
>>> +out:
>>> +    return ret;
>>> +}
>>> +
>>>   int kvm_sdei_hypercall(struct kvm_vcpu *vcpu)
>>>   {
>>>       u32 func = smccc_get_function(vcpu);
>>> @@ -446,6 +520,8 @@ int kvm_sdei_hypercall(struct kvm_vcpu *vcpu)
>>>           ret = kvm_sdei_hypercall_status(vcpu);
>>>           break;
>>>       case SDEI_1_0_FN_SDEI_EVENT_GET_INFO:
>>> +        ret = kvm_sdei_hypercall_info(vcpu);
>>> +        break;
>>>       case SDEI_1_0_FN_SDEI_EVENT_ROUTING_SET:
>>>       case SDEI_1_0_FN_SDEI_PE_MASK:
>>>       case SDEI_1_0_FN_SDEI_PE_UNMASK:
>>>
>
> Thanks,
> Gavin
>

2022-01-28 06:32:37

by Eric Auger

[permalink] [raw]
Subject: Re: [PATCH v4 14/21] KVM: arm64: Support SDEI_EVENT_{COMPLETE, COMPLETE_AND_RESUME} hypercall

Hi Gavin,

On 1/12/22 7:43 AM, Gavin Shan wrote:
> Hi Eric,
>
> On 11/10/21 6:58 PM, Eric Auger wrote:
>> On 8/15/21 2:13 AM, Gavin Shan wrote:
>>> This supports SDEI_EVENT_{COMPLETE, COMPLETE_AND_RESUME} hypercall.
>>> They are used by the guest to notify the completion of the SDEI
>>> event in the handler. The registers are changed according to the
>>> SDEI specification as below:
>>>
>>>     * x0 - x17, PC and PState are restored to what values we had in
>>>       the interrupted context.
>>>
>>>     * If it's SDEI_EVENT_COMPLETE_AND_RESUME hypercall, IRQ exception
>>>       is injected.
>>>
>>> Signed-off-by: Gavin Shan <[email protected]>
>>> ---
>>>   arch/arm64/include/asm/kvm_emulate.h |  1 +
>>>   arch/arm64/include/asm/kvm_host.h    |  1 +
>>>   arch/arm64/kvm/hyp/exception.c       |  7 +++
>>>   arch/arm64/kvm/inject_fault.c        | 27 ++++++++++
>>>   arch/arm64/kvm/sdei.c                | 75 ++++++++++++++++++++++++++++
>>>   5 files changed, 111 insertions(+)
>>>
>>> diff --git a/arch/arm64/include/asm/kvm_emulate.h
>>> b/arch/arm64/include/asm/kvm_emulate.h
>>> index fd418955e31e..923b4d08ea9a 100644
>>> --- a/arch/arm64/include/asm/kvm_emulate.h
>>> +++ b/arch/arm64/include/asm/kvm_emulate.h
>>> @@ -37,6 +37,7 @@ bool kvm_condition_valid32(const struct kvm_vcpu
>>> *vcpu);
>>>   void kvm_skip_instr32(struct kvm_vcpu *vcpu);
>>>     void kvm_inject_undefined(struct kvm_vcpu *vcpu);
>>> +void kvm_inject_irq(struct kvm_vcpu *vcpu);
>>>   void kvm_inject_vabt(struct kvm_vcpu *vcpu);
>>>   void kvm_inject_dabt(struct kvm_vcpu *vcpu, unsigned long addr);
>>>   void kvm_inject_pabt(struct kvm_vcpu *vcpu, unsigned long addr);
>>> diff --git a/arch/arm64/include/asm/kvm_host.h
>>> b/arch/arm64/include/asm/kvm_host.h
>>> index 46f363aa6524..1824f7e1f9ab 100644
>>> --- a/arch/arm64/include/asm/kvm_host.h
>>> +++ b/arch/arm64/include/asm/kvm_host.h
>>> @@ -437,6 +437,7 @@ struct kvm_vcpu_arch {
>>>   #define KVM_ARM64_EXCEPT_AA32_UND    (0 << 9)
>>>   #define KVM_ARM64_EXCEPT_AA32_IABT    (1 << 9)
>>>   #define KVM_ARM64_EXCEPT_AA32_DABT    (2 << 9)
>>> +#define KVM_ARM64_EXCEPT_AA32_IRQ    (3 << 9)
>>>   /* For AArch64: */
>>>   #define KVM_ARM64_EXCEPT_AA64_ELx_SYNC    (0 << 9)
>>>   #define KVM_ARM64_EXCEPT_AA64_ELx_IRQ    (1 << 9)
>>> diff --git a/arch/arm64/kvm/hyp/exception.c
>>> b/arch/arm64/kvm/hyp/exception.c
>>> index 0418399e0a20..ef458207d152 100644
>>> --- a/arch/arm64/kvm/hyp/exception.c
>>> +++ b/arch/arm64/kvm/hyp/exception.c
>>> @@ -310,6 +310,9 @@ static void kvm_inject_exception(struct kvm_vcpu
>>> *vcpu)
>>>           case KVM_ARM64_EXCEPT_AA32_DABT:
>>>               enter_exception32(vcpu, PSR_AA32_MODE_ABT, 16);
>>>               break;
>>> +        case KVM_ARM64_EXCEPT_AA32_IRQ:
>>> +            enter_exception32(vcpu, PSR_AA32_MODE_IRQ, 4);
>>> +            break;
>>>           default:
>>>               /* Err... */
>>>               break;
>>> @@ -320,6 +323,10 @@ static void kvm_inject_exception(struct kvm_vcpu
>>> *vcpu)
>>>                 KVM_ARM64_EXCEPT_AA64_EL1):
>>>               enter_exception64(vcpu, PSR_MODE_EL1h, except_type_sync);
>>>               break;
>>> +        case (KVM_ARM64_EXCEPT_AA64_ELx_IRQ |
>>> +              KVM_ARM64_EXCEPT_AA64_EL1):
>>> +            enter_exception64(vcpu, PSR_MODE_EL1h, except_type_irq);
>>> +            break;
>>>           default:
>>>               /*
>>>                * Only EL1_SYNC makes sense so far, EL2_{SYNC,IRQ}
>>> diff --git a/arch/arm64/kvm/inject_fault.c
>>> b/arch/arm64/kvm/inject_fault.c
>>> index b47df73e98d7..3a8c55867d2f 100644
>>> --- a/arch/arm64/kvm/inject_fault.c
>>> +++ b/arch/arm64/kvm/inject_fault.c
>>> @@ -66,6 +66,13 @@ static void inject_undef64(struct kvm_vcpu *vcpu)
>>>       vcpu_write_sys_reg(vcpu, esr, ESR_EL1);
>>>   }
>>>   +static void inject_irq64(struct kvm_vcpu *vcpu)
>>> +{
>>> +    vcpu->arch.flags |= (KVM_ARM64_EXCEPT_AA64_EL1     |
>>> +                 KVM_ARM64_EXCEPT_AA64_ELx_IRQ |
>>> +                 KVM_ARM64_PENDING_EXCEPTION);
>>> +}
>>> +
>>>   #define DFSR_FSC_EXTABT_LPAE    0x10
>>>   #define DFSR_FSC_EXTABT_nLPAE    0x08
>>>   #define DFSR_LPAE        BIT(9)
>>> @@ -77,6 +84,12 @@ static void inject_undef32(struct kvm_vcpu *vcpu)
>>>                    KVM_ARM64_PENDING_EXCEPTION);
>>>   }
>>>   +static void inject_irq32(struct kvm_vcpu *vcpu)
>>> +{
>>> +    vcpu->arch.flags |= (KVM_ARM64_EXCEPT_AA32_IRQ |
>>> +                 KVM_ARM64_PENDING_EXCEPTION);
>>> +}
>>> +
>>>   /*
>>>    * Modelled after TakeDataAbortException() and
>>> TakePrefetchAbortException
>>>    * pseudocode.
>>> @@ -160,6 +173,20 @@ void kvm_inject_undefined(struct kvm_vcpu *vcpu)
>>>           inject_undef64(vcpu);
>>>   }
>>>   +/**
>>> + * kvm_inject_irq - inject an IRQ into the guest
>>> + *
>>> + * It is assumed that this code is called from the VCPU thread and
>>> that the
>>> + * VCPU therefore is not currently executing guest code.
>>> + */
>>> +void kvm_inject_irq(struct kvm_vcpu *vcpu)
>>> +{
>>> +    if (vcpu_el1_is_32bit(vcpu))
>>> +        inject_irq32(vcpu);
>>> +    else
>>> +        inject_irq64(vcpu);
>>> +}
>>> +
>>>   void kvm_set_sei_esr(struct kvm_vcpu *vcpu, u64 esr)
>>>   {
>>>       vcpu_set_vsesr(vcpu, esr & ESR_ELx_ISS_MASK);
>>> diff --git a/arch/arm64/kvm/sdei.c b/arch/arm64/kvm/sdei.c
>>> index b5d6d1ed3858..1e8e213c9d70 100644
>>> --- a/arch/arm64/kvm/sdei.c
>>> +++ b/arch/arm64/kvm/sdei.c
>>> @@ -308,6 +308,75 @@ static unsigned long
>>> kvm_sdei_hypercall_context(struct kvm_vcpu *vcpu)
>>>       return ret;
>>>   }
>>>   +static unsigned long kvm_sdei_hypercall_complete(struct kvm_vcpu
>>> *vcpu,
>>> +                         bool resume)
>>> +{
>>> +    struct kvm *kvm = vcpu->kvm;
>>> +    struct kvm_sdei_kvm *ksdei = kvm->arch.sdei;
>>> +    struct kvm_sdei_vcpu *vsdei = vcpu->arch.sdei;
>>> +    struct kvm_sdei_kvm_event *kske = NULL;
>>> +    struct kvm_sdei_vcpu_event *ksve = NULL;
>>> +    struct kvm_sdei_vcpu_regs *regs;
>>> +    unsigned long ret = SDEI_SUCCESS;
>> for the RESUME you never seem to read the resume_addr arg? How does it
>> work? I don't get the irq injection path. Please could you explain?
>
> The guest kernel uses COMPLETE and COMPLETE_AND_RESUME hypercalls to
> notify that the SDEI event has been acknowledged by it. The difference
> between them is that COMPLETE_AND_RESUME fires the pending interrupts,
> but COMPLETE doesn't.
so resume_addr is never used, right?
>
>>> +    int index;
>>> +
>>> +    /* Sanity check */
>>> +    if (!(ksdei && vsdei)) {
>>> +        ret = SDEI_NOT_SUPPORTED;
>>> +        goto out;
>>> +    }
>>> +
>>> +    spin_lock(&vsdei->lock);
>>> +    if (vsdei->critical_event) {
>>> +        ksve = vsdei->critical_event;
>>> +        regs = &vsdei->state.critical_regs;
>>> +        vsdei->critical_event = NULL;
>>> +        vsdei->state.critical_num = KVM_SDEI_INVALID_NUM;
>>> +    } else if (vsdei->normal_event) {
>>> +        ksve = vsdei->normal_event;
>>> +        regs = &vsdei->state.normal_regs;
>>> +        vsdei->normal_event = NULL;
>>> +        vsdei->state.normal_num = KVM_SDEI_INVALID_NUM;
>>> +    } else {
>>> +        ret = SDEI_DENIED;
>>> +        goto unlock;
>>> +    }
>>> +
>>> +    /* Restore registers: x0 -> x17, PC, PState */
>>> +    for (index = 0; index < ARRAY_SIZE(regs->regs); index++)
>>> +        vcpu_set_reg(vcpu, index, regs->regs[index]);
>>> +
>>> +    *vcpu_cpsr(vcpu) = regs->pstate;
>>> +    *vcpu_pc(vcpu) = regs->pc;
>>> +
>>> +    /* Inject interrupt if needed */
>>> +    if (resume)
>>> +        kvm_inject_irq(vcpu);
>>> +
>>> +    /*
>>> +     * Update state. We needn't take lock in order to update the KVM
>>> +     * event state as it's not destroyed because of the reference
>>> +     * count.
>>> +     */
>>> +    kske = ksve->kske;
>>> +    ksve->state.refcount--;
>>> +    kske->state.refcount--;
>> why double --?
>
> Each time an SDEI event is queued for delivery, both reference counts
> are increased. I guess it's a bit confusing. I will change it in the
> next revision:
>
> ksve->state.refcount: Increased each time the SDEI event is queued
>                       for delivery
> kske->state.refcount: Increased each time a @ksve is created
>
>
>>> +    if (!ksve->state.refcount) {
>> why not using a struct kref directly?
>
> The reason is that kref isn't friendly to userspace. This field
> (@refcount) needs to be migrated :)

I will see with next version migration doc

Thanks

Eric
>
>>> +        list_del(&ksve->link);
>>> +        kfree(ksve);
>>> +    }
>>> +
>>> +    /* Make another request if there is pending event */
>>> +    if (!(list_empty(&vsdei->critical_events) &&
>>> +          list_empty(&vsdei->normal_events)))
>>> +        kvm_make_request(KVM_REQ_SDEI, vcpu);
>>> +
>>> +unlock:
>>> +    spin_unlock(&vsdei->lock);
>>> +out:
>>> +    return ret;
>>> +}
>>> +
>>>   static unsigned long kvm_sdei_hypercall_unregister(struct kvm_vcpu
>>> *vcpu)
>>>   {
>>>       struct kvm *kvm = vcpu->kvm;
>>> @@ -628,7 +697,13 @@ int kvm_sdei_hypercall(struct kvm_vcpu *vcpu)
>>>           ret = kvm_sdei_hypercall_context(vcpu);
>>>           break;
>>>       case SDEI_1_0_FN_SDEI_EVENT_COMPLETE:
>>> +        has_result = false;
>>> +        ret = kvm_sdei_hypercall_complete(vcpu, false);
>>> +        break;
>>>       case SDEI_1_0_FN_SDEI_EVENT_COMPLETE_AND_RESUME:
>>> +        has_result = false;
>>> +        ret = kvm_sdei_hypercall_complete(vcpu, true);
>>> +        break;
>>>       case SDEI_1_0_FN_SDEI_EVENT_UNREGISTER:
>>>           ret = kvm_sdei_hypercall_unregister(vcpu);
>>>           break;
>>>
>
> Thanks,
> Gavin
>

2022-01-28 07:49:36

by Eric Auger

[permalink] [raw]
Subject: Re: [PATCH v4 14/21] KVM: arm64: Support SDEI_EVENT_{COMPLETE, COMPLETE_AND_RESUME} hypercall

Hi Gavin,
On 1/12/22 7:43 AM, Gavin Shan wrote:
> Hi Eric,
>
> On 11/10/21 6:58 PM, Eric Auger wrote:
>> On 8/15/21 2:13 AM, Gavin Shan wrote:
>>> This supports SDEI_EVENT_{COMPLETE, COMPLETE_AND_RESUME} hypercall.
>>> They are used by the guest to notify the completion of the SDEI
>>> event in the handler. The registers are changed according to the
>>> SDEI specification as below:
>>>
>>>     * x0 - x17, PC and PState are restored to what values we had in
>>>       the interrupted context.
>>>
>>>     * If it's SDEI_EVENT_COMPLETE_AND_RESUME hypercall, IRQ exception
>>>       is injected.
>>>
>>> Signed-off-by: Gavin Shan <[email protected]>
>>> ---
>>>   arch/arm64/include/asm/kvm_emulate.h |  1 +
>>>   arch/arm64/include/asm/kvm_host.h    |  1 +
>>>   arch/arm64/kvm/hyp/exception.c       |  7 +++
>>>   arch/arm64/kvm/inject_fault.c        | 27 ++++++++++
>>>   arch/arm64/kvm/sdei.c                | 75 ++++++++++++++++++++++++++++
>>>   5 files changed, 111 insertions(+)
>>>
>>> diff --git a/arch/arm64/include/asm/kvm_emulate.h
>>> b/arch/arm64/include/asm/kvm_emulate.h
>>> index fd418955e31e..923b4d08ea9a 100644
>>> --- a/arch/arm64/include/asm/kvm_emulate.h
>>> +++ b/arch/arm64/include/asm/kvm_emulate.h
>>> @@ -37,6 +37,7 @@ bool kvm_condition_valid32(const struct kvm_vcpu
>>> *vcpu);
>>>   void kvm_skip_instr32(struct kvm_vcpu *vcpu);
>>>     void kvm_inject_undefined(struct kvm_vcpu *vcpu);
>>> +void kvm_inject_irq(struct kvm_vcpu *vcpu);
>>>   void kvm_inject_vabt(struct kvm_vcpu *vcpu);
>>>   void kvm_inject_dabt(struct kvm_vcpu *vcpu, unsigned long addr);
>>>   void kvm_inject_pabt(struct kvm_vcpu *vcpu, unsigned long addr);
>>> diff --git a/arch/arm64/include/asm/kvm_host.h
>>> b/arch/arm64/include/asm/kvm_host.h
>>> index 46f363aa6524..1824f7e1f9ab 100644
>>> --- a/arch/arm64/include/asm/kvm_host.h
>>> +++ b/arch/arm64/include/asm/kvm_host.h
>>> @@ -437,6 +437,7 @@ struct kvm_vcpu_arch {
>>>   #define KVM_ARM64_EXCEPT_AA32_UND    (0 << 9)
>>>   #define KVM_ARM64_EXCEPT_AA32_IABT    (1 << 9)
>>>   #define KVM_ARM64_EXCEPT_AA32_DABT    (2 << 9)
>>> +#define KVM_ARM64_EXCEPT_AA32_IRQ    (3 << 9)
>>>   /* For AArch64: */
>>>   #define KVM_ARM64_EXCEPT_AA64_ELx_SYNC    (0 << 9)
>>>   #define KVM_ARM64_EXCEPT_AA64_ELx_IRQ    (1 << 9)
>>> diff --git a/arch/arm64/kvm/hyp/exception.c
>>> b/arch/arm64/kvm/hyp/exception.c
>>> index 0418399e0a20..ef458207d152 100644
>>> --- a/arch/arm64/kvm/hyp/exception.c
>>> +++ b/arch/arm64/kvm/hyp/exception.c
>>> @@ -310,6 +310,9 @@ static void kvm_inject_exception(struct kvm_vcpu
>>> *vcpu)
>>>           case KVM_ARM64_EXCEPT_AA32_DABT:
>>>               enter_exception32(vcpu, PSR_AA32_MODE_ABT, 16);
>>>               break;
>>> +        case KVM_ARM64_EXCEPT_AA32_IRQ:
>>> +            enter_exception32(vcpu, PSR_AA32_MODE_IRQ, 4);
>>> +            break;
>>>           default:
>>>               /* Err... */
>>>               break;
>>> @@ -320,6 +323,10 @@ static void kvm_inject_exception(struct kvm_vcpu
>>> *vcpu)
>>>                 KVM_ARM64_EXCEPT_AA64_EL1):
>>>               enter_exception64(vcpu, PSR_MODE_EL1h, except_type_sync);
>>>               break;
>>> +        case (KVM_ARM64_EXCEPT_AA64_ELx_IRQ |
>>> +              KVM_ARM64_EXCEPT_AA64_EL1):
>>> +            enter_exception64(vcpu, PSR_MODE_EL1h, except_type_irq);
>>> +            break;
>>>           default:
>>>               /*
>>>                * Only EL1_SYNC makes sense so far, EL2_{SYNC,IRQ}
>>> diff --git a/arch/arm64/kvm/inject_fault.c
>>> b/arch/arm64/kvm/inject_fault.c
>>> index b47df73e98d7..3a8c55867d2f 100644
>>> --- a/arch/arm64/kvm/inject_fault.c
>>> +++ b/arch/arm64/kvm/inject_fault.c
>>> @@ -66,6 +66,13 @@ static void inject_undef64(struct kvm_vcpu *vcpu)
>>>       vcpu_write_sys_reg(vcpu, esr, ESR_EL1);
>>>   }
>>>   +static void inject_irq64(struct kvm_vcpu *vcpu)
>>> +{
>>> +    vcpu->arch.flags |= (KVM_ARM64_EXCEPT_AA64_EL1     |
>>> +                 KVM_ARM64_EXCEPT_AA64_ELx_IRQ |
>>> +                 KVM_ARM64_PENDING_EXCEPTION);
>>> +}
>>> +
>>>   #define DFSR_FSC_EXTABT_LPAE    0x10
>>>   #define DFSR_FSC_EXTABT_nLPAE    0x08
>>>   #define DFSR_LPAE        BIT(9)
>>> @@ -77,6 +84,12 @@ static void inject_undef32(struct kvm_vcpu *vcpu)
>>>                    KVM_ARM64_PENDING_EXCEPTION);
>>>   }
>>>   +static void inject_irq32(struct kvm_vcpu *vcpu)
>>> +{
>>> +    vcpu->arch.flags |= (KVM_ARM64_EXCEPT_AA32_IRQ |
>>> +                 KVM_ARM64_PENDING_EXCEPTION);
>>> +}
>>> +
>>>   /*
>>>    * Modelled after TakeDataAbortException() and
>>> TakePrefetchAbortException
>>>    * pseudocode.
>>> @@ -160,6 +173,20 @@ void kvm_inject_undefined(struct kvm_vcpu *vcpu)
>>>           inject_undef64(vcpu);
>>>   }
>>>   +/**
>>> + * kvm_inject_irq - inject an IRQ into the guest
>>> + *
>>> + * It is assumed that this code is called from the VCPU thread and
>>> that the
>>> + * VCPU therefore is not currently executing guest code.
>>> + */
>>> +void kvm_inject_irq(struct kvm_vcpu *vcpu)
>>> +{
>>> +    if (vcpu_el1_is_32bit(vcpu))
>>> +        inject_irq32(vcpu);
>>> +    else
>>> +        inject_irq64(vcpu);
>>> +}
>>> +
>>>   void kvm_set_sei_esr(struct kvm_vcpu *vcpu, u64 esr)
>>>   {
>>>       vcpu_set_vsesr(vcpu, esr & ESR_ELx_ISS_MASK);
>>> diff --git a/arch/arm64/kvm/sdei.c b/arch/arm64/kvm/sdei.c
>>> index b5d6d1ed3858..1e8e213c9d70 100644
>>> --- a/arch/arm64/kvm/sdei.c
>>> +++ b/arch/arm64/kvm/sdei.c
>>> @@ -308,6 +308,75 @@ static unsigned long
>>> kvm_sdei_hypercall_context(struct kvm_vcpu *vcpu)
>>>       return ret;
>>>   }
>>>   +static unsigned long kvm_sdei_hypercall_complete(struct kvm_vcpu
>>> *vcpu,
>>> +                         bool resume)
>>> +{
>>> +    struct kvm *kvm = vcpu->kvm;
>>> +    struct kvm_sdei_kvm *ksdei = kvm->arch.sdei;
>>> +    struct kvm_sdei_vcpu *vsdei = vcpu->arch.sdei;
>>> +    struct kvm_sdei_kvm_event *kske = NULL;
>>> +    struct kvm_sdei_vcpu_event *ksve = NULL;
>>> +    struct kvm_sdei_vcpu_regs *regs;
>>> +    unsigned long ret = SDEI_SUCCESS;
>> for the RESUME you never seem to read resume_addr arg? How does it work?
>> I don't get the irq injection path. Please could you explain?
>
> The guest kernel uses the COMPLETE and COMPLETE_AND_RESUME hypercalls
> to notify that the SDEI event has been acknowledged. The difference
> between them is that COMPLETE_AND_RESUME fires the pending interrupts,
> but COMPLETE doesn't.
>
>>> +    int index;
>>> +
>>> +    /* Sanity check */
>>> +    if (!(ksdei && vsdei)) {
>>> +        ret = SDEI_NOT_SUPPORTED;
>>> +        goto out;
>>> +    }
>>> +
>>> +    spin_lock(&vsdei->lock);
>>> +    if (vsdei->critical_event) {
>>> +        ksve = vsdei->critical_event;
>>> +        regs = &vsdei->state.critical_regs;
>>> +        vsdei->critical_event = NULL;
>>> +        vsdei->state.critical_num = KVM_SDEI_INVALID_NUM;
>>> +    } else if (vsdei->normal_event) {
>>> +        ksve = vsdei->normal_event;
>>> +        regs = &vsdei->state.normal_regs;
>>> +        vsdei->normal_event = NULL;
>>> +        vsdei->state.normal_num = KVM_SDEI_INVALID_NUM;
>>> +    } else {
>>> +        ret = SDEI_DENIED;
>>> +        goto unlock;
>>> +    }
>>> +
>>> +    /* Restore registers: x0 -> x17, PC, PState */
>>> +    for (index = 0; index < ARRAY_SIZE(regs->regs); index++)
>>> +        vcpu_set_reg(vcpu, index, regs->regs[index]);
>>> +
>>> +    *vcpu_cpsr(vcpu) = regs->pstate;
>>> +    *vcpu_pc(vcpu) = regs->pc;
>>> +
>>> +    /* Inject interrupt if needed */
>>> +    if (resume)
>>> +        kvm_inject_irq(vcpu);
>>> +
>>> +    /*
>>> +     * Update state. We needn't take lock in order to update the KVM
>>> +     * event state as it's not destroyed because of the reference
>>> +     * count.
>>> +     */
>>> +    kske = ksve->kske;
>>> +    ksve->state.refcount--;
>>> +    kske->state.refcount--;
>> why double --?

>
> Each time an SDEI event is queued for delivery, both reference counts
> are increased. I guess it's a bit confusing. I will change it in the
> next revision:
>
> ksve->state.refcount: Increased each time an SDEI event is queued for delivery
> kske->state.refcount: Increased each time a @ksve is created
Well, generally this kind of stuff is frowned upon.
>
>
>>> +    if (!ksve->state.refcount) {
>> why not using a struct kref directly?
>
> The reason is that kref isn't friendly to userspace. This field
> (@refcount) needs to be migrated :)
Waiting for the migration documentation to comment further.

Thanks

Eric
>
>>> +        list_del(&ksve->link);
>>> +        kfree(ksve);
>>> +    }
>>> +
>>> +    /* Make another request if there is pending event */
>>> +    if (!(list_empty(&vsdei->critical_events) &&
>>> +          list_empty(&vsdei->normal_events)))
>>> +        kvm_make_request(KVM_REQ_SDEI, vcpu);
>>> +
>>> +unlock:
>>> +    spin_unlock(&vsdei->lock);
>>> +out:
>>> +    return ret;
>>> +}
>>> +
>>>   static unsigned long kvm_sdei_hypercall_unregister(struct kvm_vcpu
>>> *vcpu)
>>>   {
>>>       struct kvm *kvm = vcpu->kvm;
>>> @@ -628,7 +697,13 @@ int kvm_sdei_hypercall(struct kvm_vcpu *vcpu)
>>>           ret = kvm_sdei_hypercall_context(vcpu);
>>>           break;
>>>       case SDEI_1_0_FN_SDEI_EVENT_COMPLETE:
>>> +        has_result = false;
>>> +        ret = kvm_sdei_hypercall_complete(vcpu, false);
>>> +        break;
>>>       case SDEI_1_0_FN_SDEI_EVENT_COMPLETE_AND_RESUME:
>>> +        has_result = false;
>>> +        ret = kvm_sdei_hypercall_complete(vcpu, true);
>>> +        break;
>>>       case SDEI_1_0_FN_SDEI_EVENT_UNREGISTER:
>>>           ret = kvm_sdei_hypercall_unregister(vcpu);
>>>           break;
>>>
>
> Thanks,
> Gavin
>