2022-03-22 08:23:15

by Gavin Shan

[permalink] [raw]
Subject: [PATCH v5 00/22] Support SDEI Virtualization

This series intends to virtualize the Software Delegated Exception Interface
(SDEI), which is defined by DEN0054C (v1.1). It allows the hypervisor to
deliver NMI-like SDEI events to the guest, which is needed by Async PF to
deliver page-not-present notifications from the hypervisor to the guest.
The code and the required QEMU changes can be found at:

https://developer.arm.com/documentation/den0054/c
https://github.com/gwshan/linux ("kvm/arm64_sdei")
https://github.com/gwshan/qemu ("kvm/arm64_sdei")

For the design and migration requirements, please refer to the document
added in PATCH[21/22] of this series. The series is organized as follows:

PATCH[01] Introduces template for smccc_get_argx()
PATCH[02] Adds SDEI virtualization infrastructure
PATCH[03-17] Supports various SDEI hypercalls and event handling
PATCH[18-20] Adds ioctl commands to support migration and configuration
and exports SDEI capability
PATCH[21] Adds SDEI document
PATCH[22] Adds SDEI selftest case

Testing
=======

[1] The selftest case included in this series works as expected. The default
SDEI event, whose number is zero, can be registered, enabled and raised,
and its SDEI event handler is invoked.

[host]# pwd
/home/gavin/sandbox/linux.main/tools/testing/selftests/kvm
[root@virtlab-arm01 kvm]# ./aarch64/sdei

NR_VCPUS: 2 SDEI Event: 0x00000000

--- VERSION
Version: 1.1 (vendor: 0x4b564d)
--- FEATURES
Shared event slots: 0
Private event slots: 0
Relative mode: No
--- PRIVATE_RESET
--- SHARED_RESET
--- PE_UNMASK
--- EVENT_GET_INFO
Type: Private
Priority: Normal
Signaled: Yes
--- EVENT_REGISTER
--- EVENT_ENABLE
--- EVENT_SIGNAL
Handled: Yes
IRQ: No
Status: Registered-Enabled-Running
PC/PSTATE: 000000000040232c 00000000600003c5
Regs: 0000000000000000 0000000000000000
0000000000000000 0000000000000000
--- PE_MASK
--- EVENT_DISABLE
--- EVENT_UNREGISTER

Result: OK

[2] There are additional patches in the following repositories that create
procfs entries, allowing SDEI events to be injected from the host side.
The SDEI client on the guest side registers the default SDEI event,
whose number is zero. Also, QEMU exports the SDEI ACPI table and
supports migration of the SDEI state.

https://github.com/gwshan/linux ("kvm/arm64_sdei")
https://github.com/gwshan/qemu ("kvm/arm64_sdei")

[2.1] Start the guests and migrate the source VM to the destination
VM.

[host]# /home/gavin/sandbox/qemu.main/build/qemu-system-aarch64 \
-accel kvm -machine virt,gic-version=host \
-cpu host -smp 6,sockets=2,cores=3,threads=1 \
-m 1024M,slots=16,maxmem=64G \
: \
-kernel /home/gavin/sandbox/linux.guest/arch/arm64/boot/Image \
-initrd /home/gavin/sandbox/images/rootfs.cpio.xz \
-append earlycon=pl011,mmio,0x9000000 \
:

[host]# /home/gavin/sandbox/qemu.main/build/qemu-system-aarch64 \
-accel kvm -machine virt,gic-version=host \
-cpu host -smp 6,sockets=2,cores=3,threads=1 \
-m 1024M,slots=16,maxmem=64G \
: \
-kernel /home/gavin/sandbox/linux.guest/arch/arm64/boot/Image \
-initrd /home/gavin/sandbox/images/rootfs.cpio.xz \
-append earlycon=pl011,mmio,0x9000000 \
-incoming tcp:0:4444 \
:

[2.2] Check the kernel log on the source VM. The SDEI service is detected
and the default SDEI event (0x0) is registered and enabled.

[guest-src]# dmesg | grep -i sdei
ACPI: SDEI 0x000000005BC80000 000024 \
(v00 BOCHS BXPC 00000001 BXPC 00000001)
sdei: SDEIv1.1 (0x4b564d) detected in firmware.
SDEI TEST: Version 1.1, Vendor 0x4b564d
sdei_init: SDEI event (0x0) registered
sdei_init: SDEI event (0x0) enabled


[2.3] Migrate the source VM to the destination VM. Inject an SDEI event
into the destination VM. The event is raised and handled.

(qemu) migrate -d tcp:localhost:4444

[host]# echo 0 > /proc/kvm/kvm-5360/vcpu-1

[guest-dst]#
=========== SDEI Event (CPU#1) ===========
Event: 0000000000000000 Parameter: 00000000dabfdabf
PC: ffff800008cbb554 PSTATE: 00000000604000c5 SP: ffff800009c7bde0
Regs: 00000000000016ee ffff00001ffd2e28 00000000000016ed 0000000000000001
ffff800016c28000 0000000000000000 0000000000000000 0000000000000000
0000000000000000 0000000000000000 0000000000000000 0000000000000000
0000000000000000 0000000000000000 0000000000000000 0000000000000000
0000000000000000 0000000000000000 0000000000000000 ffff800009399008
ffff8000097d9af0 ffff8000097d99f8 ffff8000093a8db8 ffff8000097d9b18
0000000000000000 0000000000000000 ffff000000339d00 0000000000000000
0000000000000000 ffff800009c7bde0 ffff800008cbb5c4
Context: 00000000000016ee ffff00001ffd2e28 00000000000016ed 0000000000000001
ffff800016c28000 03ffffffffffffff 000000024325db59 ffff8000097de190
ffff00000033a790 ffff800008cbb814 0000000000000a30 0000000000000000

Changelog
=========
v5:
* Rebased to v5.17-rc7 (Gavin)
* Unified the names for the objects, data structures, variables
and functions. The events are now named exposed, registered
and vcpu events. The states that need to be migrated are
put into kvm_sdei_state.h (Eric)
* Added more inline functions to access the SDEI event's properties (Eric)
* Support unregistration pending state (Eric)
* Support v1.1 SDEI specification (Eric)
* Folded the code to inject, deliver and handle SDEI events
from PATCH[v4 13/18/19] into PATCH[v5 13] (Eric)
* Simplified the ioctl interface to access all events at once (Eric/Gavin)
* Improved the reference counting and avoided migrating it. Also,
a limit on memory allocation is added based on it (Eric)
* Changed the return values of the hypercall functions (Eric)
* Validate @ksdei and @vsdei in kvm_sdei_hypercall() (Shannon)
* Added a document to explain how SDEI virtualization and
migration are supported (Eric)
* Improved the selftest case to inject and handle SDEI events (Gavin)
* Improved comments and commit logs (Eric)
* Addressed misc comments from Eric. Hopefully all of them
are covered in v5, as Eric provided lots of comments in
the last round of review (Eric)
v4:
* Rebased to v5.14-rc5 (Gavin)
v3:
* Rebased to v5.13-rc1 (Gavin)
* Use Linux data types in kvm_sdei.h (Gavin)
v2:
* Rebased to v5.11-rc6 (Gavin)
* Dropped changes related to SDEI client driver (Gavin)
* Removed support for passthrough SDEI events (Gavin)
* Redesigned data structures (Gavin)
* The implementation is almost completely rewritten as the data
structures were changed entirely (Gavin)
* Added ioctl commands to support migration (Gavin)

Gavin Shan (22):
KVM: arm64: Introduce template for inline functions
KVM: arm64: Add SDEI virtualization infrastructure
KVM: arm64: Support SDEI_VERSION hypercall
KVM: arm64: Support SDEI_EVENT_REGISTER hypercall
KVM: arm64: Support SDEI_EVENT_{ENABLE, DISABLE} hypercall
KVM: arm64: Support SDEI_EVENT_CONTEXT hypercall
KVM: arm64: Support SDEI_EVENT_UNREGISTER hypercall
KVM: arm64: Support SDEI_EVENT_STATUS hypercall
KVM: arm64: Support SDEI_EVENT_GET_INFO hypercall
KVM: arm64: Support SDEI_EVENT_ROUTING_SET hypercall
KVM: arm64: Support SDEI_PE_{MASK, UNMASK} hypercall
KVM: arm64: Support SDEI_{PRIVATE, SHARED}_RESET
KVM: arm64: Support SDEI_FEATURES hypercall
KVM: arm64: Support SDEI event injection, delivery and cancellation
KVM: arm64: Support SDEI_EVENT_SIGNAL hypercall
KVM: arm64: Support SDEI_EVENT_{COMPLETE,COMPLETE_AND_RESUME}
hypercall
KVM: arm64: Support SDEI event notifier
KVM: arm64: Support SDEI ioctl commands on VM
KVM: arm64: Support SDEI ioctl commands on vCPU
KVM: arm64: Export SDEI capability
KVM: arm64: Add SDEI document
KVM: selftests: Add SDEI test case

Documentation/virt/kvm/api.rst | 10 +
Documentation/virt/kvm/arm/sdei.rst | 325 +++
arch/arm64/include/asm/kvm_emulate.h | 1 +
arch/arm64/include/asm/kvm_host.h | 5 +
arch/arm64/include/asm/kvm_sdei.h | 187 ++
arch/arm64/include/uapi/asm/kvm.h | 1 +
arch/arm64/include/uapi/asm/kvm_sdei_state.h | 101 +
arch/arm64/kvm/Makefile | 2 +-
arch/arm64/kvm/arm.c | 20 +
arch/arm64/kvm/hypercalls.c | 21 +
arch/arm64/kvm/inject_fault.c | 29 +
arch/arm64/kvm/sdei.c | 1900 ++++++++++++++++++
include/kvm/arm_hypercalls.h | 24 +-
include/uapi/linux/arm_sdei.h | 2 +
include/uapi/linux/kvm.h | 4 +
tools/testing/selftests/kvm/Makefile | 1 +
tools/testing/selftests/kvm/aarch64/sdei.c | 525 +++++
17 files changed, 3145 insertions(+), 13 deletions(-)
create mode 100644 Documentation/virt/kvm/arm/sdei.rst
create mode 100644 arch/arm64/include/asm/kvm_sdei.h
create mode 100644 arch/arm64/include/uapi/asm/kvm_sdei_state.h
create mode 100644 arch/arm64/kvm/sdei.c
create mode 100644 tools/testing/selftests/kvm/aarch64/sdei.c

--
2.23.0


2022-03-22 08:23:22

by Gavin Shan

[permalink] [raw]
Subject: [PATCH v5 18/22] KVM: arm64: Support SDEI ioctl commands on VM

This supports ioctl commands on the VM to manage the various objects.
It's primarily used by the VMM to accomplish migration. The ioctl
commands introduced by this patch are as follows; a rough userspace
usage sketch is included after the list:

* KVM_SDEI_CMD_GET_VERSION
Retrieve the version of the current implementation. It's different
from the version of the SDEI specification being followed. This
version indicates which of the functionalities documented in the
SDEI specification are supported.

* KVM_SDEI_CMD_GET_EXPOSED_EVENT_COUNT
Return the total count of exposed events.

* KVM_SDEI_CMD_GET_EXPOSED_EVENT
* KVM_SDEI_CMD_SET_EXPOSED_EVENT
Get or set exposed events.

* KVM_SDEI_CMD_GET_REGISTERED_EVENT_COUNT
Return the total count of registered events.

* KVM_SDEI_CMD_GET_REGISTERED_EVENT
* KVM_SDEI_CMD_SET_REGISTERED_EVENT
Get or set registered events.
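
As an illustration only (not part of this patch), a VMM could save the
exposed events during migration roughly as below. Error handling is
omitted, and the vm_fd variable stands for the VM file descriptor:

    #include <stdlib.h>
    #include <sys/ioctl.h>

    static void save_exposed_events(int vm_fd)
    {
        struct kvm_sdei_cmd cmd = { };
        struct kvm_sdei_exposed_event_state *state;

        /* Ask KVM how many events have been exposed */
        cmd.cmd = KVM_SDEI_CMD_GET_EXPOSED_EVENT_COUNT;
        ioctl(vm_fd, KVM_ARM_SDEI_COMMAND, &cmd);

        /* Retrieve all of them in one call */
        state = calloc(cmd.count, sizeof(*state));
        cmd.cmd = KVM_SDEI_CMD_GET_EXPOSED_EVENT;
        cmd.exposed_event_state = state;
        ioctl(vm_fd, KVM_ARM_SDEI_COMMAND, &cmd);

        /* ... hand the cmd.count entries in "state" to the migration stream ... */
    }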

Signed-off-by: Gavin Shan <[email protected]>
---
arch/arm64/include/asm/kvm_sdei.h | 1 +
arch/arm64/include/uapi/asm/kvm_sdei_state.h | 20 ++
arch/arm64/kvm/arm.c | 3 +
arch/arm64/kvm/sdei.c | 302 +++++++++++++++++++
include/uapi/linux/kvm.h | 3 +
5 files changed, 329 insertions(+)

diff --git a/arch/arm64/include/asm/kvm_sdei.h b/arch/arm64/include/asm/kvm_sdei.h
index 2480ec0e9824..64f00cc79162 100644
--- a/arch/arm64/include/asm/kvm_sdei.h
+++ b/arch/arm64/include/asm/kvm_sdei.h
@@ -179,6 +179,7 @@ int kvm_sdei_inject_event(struct kvm_vcpu *vcpu,
unsigned long num, bool immediate);
int kvm_sdei_cancel_event(struct kvm_vcpu *vcpu, unsigned long num);
void kvm_sdei_deliver_event(struct kvm_vcpu *vcpu);
+long kvm_sdei_vm_ioctl(struct kvm *kvm, unsigned long arg);
void kvm_sdei_destroy_vcpu(struct kvm_vcpu *vcpu);
void kvm_sdei_destroy_vm(struct kvm *kvm);

diff --git a/arch/arm64/include/uapi/asm/kvm_sdei_state.h b/arch/arm64/include/uapi/asm/kvm_sdei_state.h
index b14844230117..2bd6d11627bc 100644
--- a/arch/arm64/include/uapi/asm/kvm_sdei_state.h
+++ b/arch/arm64/include/uapi/asm/kvm_sdei_state.h
@@ -68,5 +68,25 @@ struct kvm_sdei_vcpu_state {
struct kvm_sdei_vcpu_regs_state normal_regs;
};

+#define KVM_SDEI_CMD_GET_VERSION 0
+#define KVM_SDEI_CMD_GET_EXPOSED_EVENT_COUNT 1
+#define KVM_SDEI_CMD_GET_EXPOSED_EVENT 2
+#define KVM_SDEI_CMD_SET_EXPOSED_EVENT 3
+#define KVM_SDEI_CMD_GET_REGISTERED_EVENT_COUNT 4
+#define KVM_SDEI_CMD_GET_REGISTERED_EVENT 5
+#define KVM_SDEI_CMD_SET_REGISTERED_EVENT 6
+
+struct kvm_sdei_cmd {
+ __u32 cmd;
+ union {
+ __u32 version;
+ __u32 count;
+ };
+ union {
+ struct kvm_sdei_exposed_event_state *exposed_event_state;
+ struct kvm_sdei_registered_event_state *registered_event_state;
+ };
+};
+
#endif /* !__ASSEMBLY__ */
#endif /* _UAPI__ASM_KVM_SDEI_STATE_H */
diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
index 00c136a6e8df..ebfd504a1c08 100644
--- a/arch/arm64/kvm/arm.c
+++ b/arch/arm64/kvm/arm.c
@@ -1465,6 +1465,9 @@ long kvm_arch_vm_ioctl(struct file *filp,
return -EFAULT;
return kvm_vm_ioctl_mte_copy_tags(kvm, &copy_tags);
}
+ case KVM_ARM_SDEI_COMMAND: {
+ return kvm_sdei_vm_ioctl(kvm, arg);
+ }
default:
return -EINVAL;
}
diff --git a/arch/arm64/kvm/sdei.c b/arch/arm64/kvm/sdei.c
index 9f1959653318..d9cf494990a9 100644
--- a/arch/arm64/kvm/sdei.c
+++ b/arch/arm64/kvm/sdei.c
@@ -1265,6 +1265,308 @@ void kvm_sdei_create_vcpu(struct kvm_vcpu *vcpu)
vcpu->arch.sdei = vsdei;
}

+static long vm_ioctl_get_exposed_event(struct kvm *kvm,
+ struct kvm_sdei_cmd *cmd)
+{
+ struct kvm_sdei_kvm *ksdei = kvm->arch.sdei;
+ struct kvm_sdei_exposed_event *exposed_event;
+ struct kvm_sdei_exposed_event_state *state;
+ void __user *user_state = (void __user *)(cmd->exposed_event_state);
+ unsigned int count, i;
+ long ret = 0;
+
+ if (!cmd->count)
+ return 0;
+
+ state = kcalloc(cmd->count, sizeof(*state), GFP_KERNEL_ACCOUNT);
+ if (!state)
+ return -ENOMEM;
+
+ i = 0;
+ count = cmd->count;
+ list_for_each_entry(exposed_event, &ksdei->exposed_events, link) {
+ state[i++] = exposed_event->state;
+ if (!--count)
+ break;
+ }
+
+ if (copy_to_user(user_state, state, sizeof(*state) * cmd->count))
+ ret = -EFAULT;
+
+ kfree(state);
+ return ret;
+}
+
+static long vm_ioctl_set_exposed_event(struct kvm *kvm,
+ struct kvm_sdei_cmd *cmd)
+{
+ struct kvm_sdei_kvm *ksdei = kvm->arch.sdei;
+ struct kvm_sdei_exposed_event *exposed_event;
+ struct kvm_sdei_exposed_event_state *state;
+ void __user *user_state = (void __user *)(cmd->exposed_event_state);
+ unsigned int i, j;
+ long ret = 0;
+
+ if (!cmd->count)
+ return 0;
+
+ if ((ksdei->exposed_event_count + cmd->count) > KVM_SDEI_MAX_EVENTS)
+ return -ERANGE;
+
+ state = kcalloc(cmd->count, sizeof(*state), GFP_KERNEL_ACCOUNT);
+ if (!state)
+ return -ENOMEM;
+
+ if (copy_from_user(state, user_state, sizeof(*state) * cmd->count)) {
+ ret = -EFAULT;
+ goto out;
+ }
+
+ for (i = 0; i < cmd->count; i++) {
+ if (!kvm_sdei_is_supported(state[i].num)) {
+ ret = -EINVAL;
+ goto out;
+ }
+
+ if (!kvm_sdei_is_shared(state[i].type) &&
+ !kvm_sdei_is_private(state[i].type)) {
+ ret = -EINVAL;
+ goto out;
+ }
+
+ if (!kvm_sdei_is_critical(state[i].priority) &&
+ !kvm_sdei_is_normal(state[i].priority)) {
+ ret = -EINVAL;
+ goto out;
+ }
+
+ /*
+ * Check if the event has been exposed. The notifier is
+ * allowed to be changed.
+ */
+ exposed_event = find_exposed_event(kvm, state[i].num);
+ if (exposed_event &&
+ (state[i].num != exposed_event->state.num ||
+ state[i].type != exposed_event->state.type ||
+ state[i].signaled != exposed_event->state.signaled ||
+ state[i].priority != exposed_event->state.priority)) {
+ ret = -EEXIST;
+ goto out;
+ }
+
+ /* Avoid the duplicated event */
+ for (j = 0; j < cmd->count; j++) {
+ if (i != j && state[i].num == state[j].num) {
+ ret = -EINVAL;
+ goto out;
+ }
+ }
+ }
+
+ for (i = 0; i < cmd->count; i++) {
+ exposed_event = find_exposed_event(kvm, state[i].num);
+ if (exposed_event) {
+ exposed_event->state = state[i];
+ continue;
+ }
+
+ exposed_event = kzalloc(sizeof(*exposed_event),
+ GFP_KERNEL_ACCOUNT);
+ if (!exposed_event) {
+ ret = -ENOMEM;
+ goto out;
+ }
+
+ exposed_event->state = state[i];
+ exposed_event->kvm = kvm;
+
+ ksdei->exposed_event_count++;
+ list_add_tail(&exposed_event->link, &ksdei->exposed_events);
+ }
+
+out:
+ kfree(state);
+ return ret;
+}
+
+static long vm_ioctl_get_registered_event(struct kvm *kvm,
+ struct kvm_sdei_cmd *cmd)
+{
+ struct kvm_sdei_kvm *ksdei = kvm->arch.sdei;
+ struct kvm_sdei_registered_event *registered_event;
+ struct kvm_sdei_registered_event_state *state;
+ void __user *user_state = (void __user *)(cmd->registered_event_state);
+ unsigned int count, i;
+ long ret = 0;
+
+ if (!cmd->count)
+ return 0;
+
+ state = kcalloc(cmd->count, sizeof(*state), GFP_KERNEL_ACCOUNT);
+ if (!state)
+ return -ENOMEM;
+
+ i = 0;
+ count = cmd->count;
+ list_for_each_entry(registered_event,
+ &ksdei->registered_events, link) {
+ state[i++] = registered_event->state;
+ if (!--count)
+ break;
+ }
+
+ if (copy_to_user(user_state, state, sizeof(*state) * cmd->count))
+ ret = -EFAULT;
+
+ kfree(state);
+ return ret;
+}
+
+static long vm_ioctl_set_registered_event(struct kvm *kvm,
+ struct kvm_sdei_cmd *cmd)
+{
+ struct kvm_sdei_kvm *ksdei = kvm->arch.sdei;
+ struct kvm_sdei_exposed_event *exposed_event;
+ struct kvm_sdei_registered_event *registered_event;
+ struct kvm_sdei_registered_event_state *state;
+ void __user *user_state = (void __user *)(cmd->registered_event_state);
+ unsigned int i, j;
+ long ret = 0;
+
+ if (!cmd->count)
+ return 0;
+
+ if ((ksdei->registered_event_count + cmd->count) > KVM_SDEI_MAX_EVENTS)
+ return -ERANGE;
+
+ state = kcalloc(cmd->count, sizeof(*state), GFP_KERNEL_ACCOUNT);
+ if (!state)
+ return -ENOMEM;
+
+ if (copy_from_user(state, user_state, sizeof(*state) * cmd->count)) {
+ ret = -EFAULT;
+ goto out;
+ }
+
+ for (i = 0; i < cmd->count; i++) {
+ if (!kvm_sdei_is_supported(state[i].num)) {
+ ret = -EINVAL;
+ goto out;
+ }
+
+ if (state[i].route_mode != SDEI_EVENT_REGISTER_RM_ANY &&
+ state[i].route_mode != SDEI_EVENT_REGISTER_RM_PE) {
+ ret = -EINVAL;
+ goto out;
+ }
+
+ /* Check if the event has been exposed */
+ exposed_event = find_exposed_event(kvm, state[i].num);
+ if (!exposed_event) {
+ ret = -ENOENT;
+ goto out;
+ }
+
+ /* Check if the event has been registered */
+ registered_event = find_registered_event(kvm, state[i].num);
+ if (registered_event) {
+ ret = -EEXIST;
+ goto out;
+ }
+
+ /* Avoid the duplicated event */
+ for (j = 0; j < cmd->count; j++) {
+ if (i != j && state[i].num == state[j].num) {
+ ret = -EINVAL;
+ goto out;
+ }
+ }
+ }
+
+ for (i = 0; i < cmd->count; i++) {
+ registered_event = kzalloc(sizeof(*registered_event),
+ GFP_KERNEL_ACCOUNT);
+ if (!registered_event) {
+ ret = -ENOMEM;
+ goto out;
+ }
+
+ exposed_event = find_exposed_event(kvm, state[i].num);
+ registered_event->state = state[i];
+ registered_event->kvm = kvm;
+ registered_event->exposed_event = exposed_event;
+
+ ksdei->registered_event_count++;
+ exposed_event->registered_event_count++;
+ list_add_tail(&registered_event->link,
+ &ksdei->registered_events);
+ }
+
+out:
+ kfree(state);
+ return ret;
+}
+
+long kvm_sdei_vm_ioctl(struct kvm *kvm, unsigned long arg)
+{
+ struct kvm_sdei_kvm *ksdei = kvm->arch.sdei;
+ struct kvm_sdei_cmd *cmd;
+ void __user *argp = (void __user *)arg;
+ long ret = 0;
+
+ if (!ksdei)
+ return -EPERM;
+
+ cmd = kzalloc(sizeof(*cmd), GFP_KERNEL_ACCOUNT);
+ if (!cmd)
+ return -ENOMEM;
+
+ if (copy_from_user(cmd, argp, sizeof(*cmd))) {
+ ret = -EFAULT;
+ goto out;
+ }
+
+ spin_lock(&ksdei->lock);
+
+ switch (cmd->cmd) {
+ case KVM_SDEI_CMD_GET_VERSION:
+ cmd->version = (1 << 16); /* v1.0.0 */
+ if (copy_to_user(argp, cmd, sizeof(*cmd)))
+ ret = -EFAULT;
+ break;
+ case KVM_SDEI_CMD_GET_EXPOSED_EVENT_COUNT:
+ cmd->count = ksdei->exposed_event_count;
+ if (copy_to_user(argp, cmd, sizeof(*cmd)))
+ ret = -EFAULT;
+ break;
+ case KVM_SDEI_CMD_GET_EXPOSED_EVENT:
+ ret = vm_ioctl_get_exposed_event(kvm, cmd);
+ break;
+ case KVM_SDEI_CMD_SET_EXPOSED_EVENT:
+ ret = vm_ioctl_set_exposed_event(kvm, cmd);
+ break;
+ case KVM_SDEI_CMD_GET_REGISTERED_EVENT_COUNT:
+ cmd->count = ksdei->registered_event_count;
+ if (copy_to_user(argp, cmd, sizeof(*cmd)))
+ ret = -EFAULT;
+ break;
+ case KVM_SDEI_CMD_GET_REGISTERED_EVENT:
+ ret = vm_ioctl_get_registered_event(kvm, cmd);
+ break;
+ case KVM_SDEI_CMD_SET_REGISTERED_EVENT:
+ ret = vm_ioctl_set_registered_event(kvm, cmd);
+ break;
+ default:
+ ret = -EINVAL;
+ }
+
+ spin_unlock(&ksdei->lock);
+out:
+
+ kfree(cmd);
+ return ret;
+}
+
void kvm_sdei_destroy_vcpu(struct kvm_vcpu *vcpu)
{
struct kvm *kvm = vcpu->kvm;
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index 507ee1f2aa96..2d11c909ec42 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -2049,4 +2049,7 @@ struct kvm_stats_desc {
/* Available with KVM_CAP_XSAVE2 */
#define KVM_GET_XSAVE2 _IOR(KVMIO, 0xcf, struct kvm_xsave)

+/* Available with KVM_CAP_ARM_SDEI */
+#define KVM_ARM_SDEI_COMMAND _IOWR(KVMIO, 0xd0, struct kvm_sdei_cmd)
+
#endif /* __LINUX_KVM_H */
--
2.23.0

2022-03-22 08:23:36

by Gavin Shan

[permalink] [raw]
Subject: [PATCH v5 08/22] KVM: arm64: Support SDEI_EVENT_STATUS hypercall

This supports the SDEI_EVENT_STATUS hypercall. It's used by the guest
to retrieve the status of the specified SDEI event. A bitmap is
returned to indicate the corresponding status, including the
registration, enablement and running (delivery) state.
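
For reference, the returned bitmap might be consumed by a guest roughly
as below; sdei_event_status() is a hypothetical helper that issues the
hypercall, and the status bits are the ones used in the code below:

    unsigned long status = sdei_event_status(event_num);
    bool registered = status & (1UL << SDEI_EVENT_STATUS_REGISTERED);
    bool enabled    = status & (1UL << SDEI_EVENT_STATUS_ENABLED);
    bool running    = status & (1UL << SDEI_EVENT_STATUS_RUNNING);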

Signed-off-by: Gavin Shan <[email protected]>
---
arch/arm64/kvm/sdei.c | 42 ++++++++++++++++++++++++++++++++++++++++++
1 file changed, 42 insertions(+)

diff --git a/arch/arm64/kvm/sdei.c b/arch/arm64/kvm/sdei.c
index 36eda31e0392..5c43c8912ea1 100644
--- a/arch/arm64/kvm/sdei.c
+++ b/arch/arm64/kvm/sdei.c
@@ -454,6 +454,46 @@ static unsigned long hypercall_unregister(struct kvm_vcpu *vcpu)
return ret;
}

+static unsigned long hypercall_status(struct kvm_vcpu *vcpu)
+{
+ struct kvm *kvm = vcpu->kvm;
+ struct kvm_sdei_kvm *ksdei = kvm->arch.sdei;
+ struct kvm_sdei_exposed_event *exposed_event;
+ struct kvm_sdei_registered_event *registered_event;
+ unsigned long event_num = smccc_get_arg1(vcpu);
+ int index;
+ unsigned long ret = 0;
+
+ if (!kvm_sdei_is_supported(event_num)) {
+ ret = SDEI_INVALID_PARAMETERS;
+ goto out;
+ }
+
+ spin_lock(&ksdei->lock);
+
+ /*
+ * Check if the registered event exists. None of the flags
+ * will be set if it doesn't exist.
+ */
+ registered_event = find_registered_event(kvm, event_num);
+ if (!registered_event)
+ goto unlock;
+
+ exposed_event = registered_event->exposed_event;
+ index = kvm_sdei_vcpu_index(vcpu, exposed_event);
+ if (kvm_sdei_is_registered(registered_event, index))
+ ret |= (1UL << SDEI_EVENT_STATUS_REGISTERED);
+ if (kvm_sdei_is_enabled(registered_event, index))
+ ret |= (1UL << SDEI_EVENT_STATUS_ENABLED);
+ if (registered_event->vcpu_event_count > 0)
+ ret |= (1UL << SDEI_EVENT_STATUS_RUNNING);
+
+unlock:
+ spin_unlock(&ksdei->lock);
+out:
+ return ret;
+}
+
int kvm_sdei_hypercall(struct kvm_vcpu *vcpu)
{
struct kvm *kvm = vcpu->kvm;
@@ -500,6 +540,8 @@ int kvm_sdei_hypercall(struct kvm_vcpu *vcpu)
ret = hypercall_unregister(vcpu);
break;
case SDEI_1_0_FN_SDEI_EVENT_STATUS:
+ ret = hypercall_status(vcpu);
+ break;
case SDEI_1_0_FN_SDEI_EVENT_GET_INFO:
case SDEI_1_0_FN_SDEI_EVENT_ROUTING_SET:
case SDEI_1_0_FN_SDEI_PE_MASK:
--
2.23.0

2022-03-22 08:24:49

by Gavin Shan

[permalink] [raw]
Subject: [PATCH v5 06/22] KVM: arm64: Support SDEI_EVENT_CONTEXT hypercall

This supports the SDEI_EVENT_CONTEXT hypercall. It's used by the guest,
from within the SDEI event handler, to retrieve a register (x0 - x17)
of the interrupted or preempted context. That context is saved prior to
handling the SDEI event and restored afterwards.
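
For illustration only, a guest could fetch one register of the interrupted
or preempted context from within its SDEI event handler roughly as below,
assuming the kernel's SMCCC helper. The argument is the register index
(here 2 for x2), and the return value is either the register's value or
an SDEI error code:

    struct arm_smccc_res res;

    arm_smccc_1_1_hvc(SDEI_1_0_FN_SDEI_EVENT_CONTEXT, 2, &res);
    /* res.a0 holds x2 of the interrupted context on success */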

Signed-off-by: Gavin Shan <[email protected]>
---
arch/arm64/kvm/sdei.c | 30 ++++++++++++++++++++++++++++++
1 file changed, 30 insertions(+)

diff --git a/arch/arm64/kvm/sdei.c b/arch/arm64/kvm/sdei.c
index 4ab58f264992..4488d3f044f2 100644
--- a/arch/arm64/kvm/sdei.c
+++ b/arch/arm64/kvm/sdei.c
@@ -297,6 +297,34 @@ static unsigned long hypercall_enable(struct kvm_vcpu *vcpu, bool enable)
return ret;
}

+static unsigned long hypercall_context(struct kvm_vcpu *vcpu)
+{
+ struct kvm_sdei_vcpu *vsdei = vcpu->arch.sdei;
+ struct kvm_sdei_vcpu_regs_state *regs;
+ unsigned long param_id = smccc_get_arg1(vcpu);
+ unsigned long ret = SDEI_SUCCESS;
+
+ spin_lock(&vsdei->lock);
+
+ /* Check if the pending event exists */
+ if (!vsdei->critical_event && !vsdei->normal_event) {
+ ret = SDEI_DENIED;
+ goto unlock;
+ }
+
+ /* Fetch the requested register */
+ regs = vsdei->critical_event ? &vsdei->state.critical_regs :
+ &vsdei->state.normal_regs;
+ if (param_id < ARRAY_SIZE(regs->regs))
+ ret = regs->regs[param_id];
+ else
+ ret = SDEI_INVALID_PARAMETERS;
+
+unlock:
+ spin_unlock(&vsdei->lock);
+ return ret;
+}
+
int kvm_sdei_hypercall(struct kvm_vcpu *vcpu)
{
struct kvm *kvm = vcpu->kvm;
@@ -333,6 +361,8 @@ int kvm_sdei_hypercall(struct kvm_vcpu *vcpu)
ret = hypercall_enable(vcpu, false);
break;
case SDEI_1_0_FN_SDEI_EVENT_CONTEXT:
+ ret = hypercall_context(vcpu);
+ break;
case SDEI_1_0_FN_SDEI_EVENT_COMPLETE:
case SDEI_1_0_FN_SDEI_EVENT_COMPLETE_AND_RESUME:
case SDEI_1_0_FN_SDEI_EVENT_UNREGISTER:
--
2.23.0

2022-03-22 08:24:49

by Gavin Shan

[permalink] [raw]
Subject: [PATCH v5 13/22] KVM: arm64: Support SDEI_FEATURES hypercall

This supports the SDEI_FEATURES hypercall. It's used by the guest to
retrieve the supported features, which are the number of binding slots
and relative mode for the event handler. Currently, neither of them
is supported.
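
For illustration only, a guest could probe the relative mode feature
roughly as below, assuming the kernel's SMCCC helper. Feature 1 selects
RELATIVE_MODE, matching the switch statement in the code:

    struct arm_smccc_res res;

    arm_smccc_1_1_hvc(SDEI_1_1_FN_SDEI_FEATURES, 1, &res);
    /* res.a0 is 0 here since relative mode isn't supported yet */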

Signed-off-by: Gavin Shan <[email protected]>
---
arch/arm64/kvm/sdei.c | 16 ++++++++++++++++
1 file changed, 16 insertions(+)

diff --git a/arch/arm64/kvm/sdei.c b/arch/arm64/kvm/sdei.c
index 0dec35a0eed1..1e0ca9022eaa 100644
--- a/arch/arm64/kvm/sdei.c
+++ b/arch/arm64/kvm/sdei.c
@@ -662,6 +662,20 @@ static unsigned long hypercall_reset(struct kvm_vcpu *vcpu, bool private)
return ret;
}

+static unsigned long hypercall_features(struct kvm_vcpu *vcpu)
+{
+ unsigned long feature = smccc_get_arg1(vcpu);
+
+ switch (feature) {
+ case 0: /* BIND_SLOTS */
+ return 0;
+ case 1: /* RELATIVE_MODE */
+ return 0;
+ }
+
+ return SDEI_INVALID_PARAMETERS;
+}
+
int kvm_sdei_hypercall(struct kvm_vcpu *vcpu)
{
struct kvm *kvm = vcpu->kvm;
@@ -734,6 +748,8 @@ int kvm_sdei_hypercall(struct kvm_vcpu *vcpu)
ret = hypercall_reset(vcpu, false);
break;
case SDEI_1_1_FN_SDEI_FEATURES:
+ ret = hypercall_features(vcpu);
+ break;
default:
ret = SDEI_NOT_SUPPORTED;
}
--
2.23.0

2022-03-22 08:28:30

by Gavin Shan

[permalink] [raw]
Subject: [PATCH v5 19/22] KVM: arm64: Support SDEI ioctl commands on vCPU

This supports ioctl commands on the vCPU to manage the various objects.
It's primarily used by the VMM to accomplish migration. The ioctl
commands introduced by this patch are as follows; a rough userspace
usage sketch is included after the list:

* KVM_SDEI_CMD_GET_VCPU_EVENT_COUNT
Return the total count of vCPU events, which have been queued
on the target vCPU.

* KVM_SDEI_CMD_GET_VCPU_EVENT
* KVM_SDEI_CMD_SET_VCPU_EVENT
Get or set vCPU events.

* KVM_SDEI_CMD_GET_VCPU_STATE
* KVM_SDEI_CMD_SET_VCPU_STATE
Get or set vCPU state.

* KVM_SDEI_CMD_INJECT_EVENT
Inject SDEI event.
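
As an illustration only (not part of this patch), a VMM could transfer
the vcpu state during migration roughly as below. Error handling is
omitted, and the vcpu_fd variable stands for the vCPU file descriptor:

    struct kvm_sdei_cmd cmd = { };
    struct kvm_sdei_vcpu_state vcpu_state;

    /* Source: read the vcpu state */
    cmd.cmd = KVM_SDEI_CMD_GET_VCPU_STATE;
    cmd.vcpu_state = &vcpu_state;
    ioctl(vcpu_fd, KVM_ARM_SDEI_COMMAND, &cmd);

    /* Destination: restore it after the vcpu events have been set */
    cmd.cmd = KVM_SDEI_CMD_SET_VCPU_STATE;
    cmd.vcpu_state = &vcpu_state;
    ioctl(vcpu_fd, KVM_ARM_SDEI_COMMAND, &cmd);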

Signed-off-by: Gavin Shan <[email protected]>
---
arch/arm64/include/asm/kvm_sdei.h | 1 +
arch/arm64/include/uapi/asm/kvm_sdei_state.h | 9 +
arch/arm64/kvm/arm.c | 3 +
arch/arm64/kvm/sdei.c | 299 +++++++++++++++++++
4 files changed, 312 insertions(+)

diff --git a/arch/arm64/include/asm/kvm_sdei.h b/arch/arm64/include/asm/kvm_sdei.h
index 64f00cc79162..ea4f222cf73d 100644
--- a/arch/arm64/include/asm/kvm_sdei.h
+++ b/arch/arm64/include/asm/kvm_sdei.h
@@ -180,6 +180,7 @@ int kvm_sdei_inject_event(struct kvm_vcpu *vcpu,
int kvm_sdei_cancel_event(struct kvm_vcpu *vcpu, unsigned long num);
void kvm_sdei_deliver_event(struct kvm_vcpu *vcpu);
long kvm_sdei_vm_ioctl(struct kvm *kvm, unsigned long arg);
+long kvm_sdei_vcpu_ioctl(struct kvm_vcpu *vcpu, unsigned long arg);
void kvm_sdei_destroy_vcpu(struct kvm_vcpu *vcpu);
void kvm_sdei_destroy_vm(struct kvm *kvm);

diff --git a/arch/arm64/include/uapi/asm/kvm_sdei_state.h b/arch/arm64/include/uapi/asm/kvm_sdei_state.h
index 2bd6d11627bc..149451c5584f 100644
--- a/arch/arm64/include/uapi/asm/kvm_sdei_state.h
+++ b/arch/arm64/include/uapi/asm/kvm_sdei_state.h
@@ -75,6 +75,12 @@ struct kvm_sdei_vcpu_state {
#define KVM_SDEI_CMD_GET_REGISTERED_EVENT_COUNT 4
#define KVM_SDEI_CMD_GET_REGISTERED_EVENT 5
#define KVM_SDEI_CMD_SET_REGISTERED_EVENT 6
+#define KVM_SDEI_CMD_GET_VCPU_EVENT_COUNT 7
+#define KVM_SDEI_CMD_GET_VCPU_EVENT 8
+#define KVM_SDEI_CMD_SET_VCPU_EVENT 9
+#define KVM_SDEI_CMD_GET_VCPU_STATE 10
+#define KVM_SDEI_CMD_SET_VCPU_STATE 11
+#define KVM_SDEI_CMD_INJECT_EVENT 12

struct kvm_sdei_cmd {
__u32 cmd;
@@ -85,6 +91,9 @@ struct kvm_sdei_cmd {
union {
struct kvm_sdei_exposed_event_state *exposed_event_state;
struct kvm_sdei_registered_event_state *registered_event_state;
+ struct kvm_sdei_vcpu_event_state *vcpu_event_state;
+ struct kvm_sdei_vcpu_state *vcpu_state;
+ __u64 num;
};
};

diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
index ebfd504a1c08..3f532e1c4a95 100644
--- a/arch/arm64/kvm/arm.c
+++ b/arch/arm64/kvm/arm.c
@@ -1387,6 +1387,9 @@ long kvm_arch_vcpu_ioctl(struct file *filp,

return kvm_arm_vcpu_finalize(vcpu, what);
}
+ case KVM_ARM_SDEI_COMMAND: {
+ return kvm_sdei_vcpu_ioctl(vcpu, arg);
+ }
default:
r = -EINVAL;
}
diff --git a/arch/arm64/kvm/sdei.c b/arch/arm64/kvm/sdei.c
index d9cf494990a9..06895ac73c24 100644
--- a/arch/arm64/kvm/sdei.c
+++ b/arch/arm64/kvm/sdei.c
@@ -1567,6 +1567,305 @@ long kvm_sdei_vm_ioctl(struct kvm *kvm, unsigned long arg)
return ret;
}

+static long vcpu_ioctl_get_vcpu_event(struct kvm_vcpu *vcpu,
+ struct kvm_sdei_cmd *cmd)
+{
+ struct kvm_sdei_vcpu *vsdei = vcpu->arch.sdei;
+ struct kvm_sdei_vcpu_event *vcpu_event;
+ struct kvm_sdei_vcpu_event_state *state;
+ void __user *user_state = (void __user *)(cmd->vcpu_event_state);
+ unsigned int count, i;
+ long ret = 0;
+
+ if (!cmd->count)
+ return 0;
+
+ state = kcalloc(cmd->count, sizeof(*state), GFP_KERNEL_ACCOUNT);
+ if (!state)
+ return -ENOMEM;
+
+ i = 0;
+ count = cmd->count;
+ list_for_each_entry(vcpu_event, &vsdei->critical_events, link) {
+ state[i++] = vcpu_event->state;
+ if (!--count)
+ break;
+ }
+
+ if (count) {
+ list_for_each_entry(vcpu_event, &vsdei->normal_events, link) {
+ state[i++] = vcpu_event->state;
+ if (!--count)
+ break;
+ }
+ }
+
+ if (copy_to_user(user_state, state, sizeof(*state) * cmd->count))
+ ret = -EFAULT;
+
+ kfree(state);
+ return ret;
+}
+
+static long vcpu_ioctl_set_vcpu_event(struct kvm_vcpu *vcpu,
+ struct kvm_sdei_cmd *cmd)
+{
+ struct kvm *kvm = vcpu->kvm;
+ struct kvm_sdei_vcpu *vsdei = vcpu->arch.sdei;
+ struct kvm_sdei_exposed_event *exposed_event;
+ struct kvm_sdei_registered_event *registered_event;
+ struct kvm_sdei_vcpu_event *vcpu_event;
+ struct kvm_sdei_vcpu_event_state *state;
+ void __user *user_state = (void __user *)(cmd->vcpu_event_state);
+ unsigned int vcpu_event_count, i, j;
+ long ret = 0;
+
+ if (!cmd->count)
+ return 0;
+
+ state = kcalloc(cmd->count, sizeof(*state), GFP_KERNEL_ACCOUNT);
+ if (!state)
+ return -ENOMEM;
+
+ if (copy_from_user(state, user_state, sizeof(*state) * cmd->count)) {
+ ret = -EFAULT;
+ goto out;
+ }
+
+ vcpu_event_count = vsdei->critical_event_count +
+ vsdei->normal_event_count;
+ for (i = 0; i < cmd->count; i++) {
+ if (!kvm_sdei_is_supported(state[i].num)) {
+ ret = -EINVAL;
+ goto out;
+ }
+
+ /* Check if the event has been exposed */
+ exposed_event = find_exposed_event(kvm, state[i].num);
+ if (!exposed_event) {
+ ret = -ENOENT;
+ goto out;
+ }
+
+ /* Check if the event has been registered */
+ registered_event = find_registered_event(kvm, state[i].num);
+ if (!registered_event) {
+ ret = -ENOENT;
+ goto out;
+ }
+
+ /*
+ * Calculate the total count of the vcpu event instances.
+ * We don't need a new vcpu event instance if one already
+ * exists or if this entry is a duplicate.
+ */
+ vcpu_event = find_vcpu_event(vcpu, state[i].num);
+ if (vcpu_event)
+ continue;
+
+ for (j = 0; j < cmd->count; j++) {
+ if (j != i && state[j].num == state[i].num)
+ break;
+ }
+
+ if (j >= cmd->count || i < j)
+ vcpu_event_count++;
+ }
+
+ /*
+ * Check if the required count of vcpu event instances exceeds
+ * the limit.
+ */
+ if (vcpu_event_count > KVM_SDEI_MAX_EVENTS) {
+ ret = -ERANGE;
+ goto out;
+ }
+
+ for (i = 0; i < cmd->count; i++) {
+ /* The vcpu event might already exist */
+ vcpu_event = find_vcpu_event(vcpu, state[i].num);
+ if (vcpu_event) {
+ vcpu_event->state.event_count += state[i].event_count;
+ continue;
+ }
+
+ vcpu_event = kzalloc(sizeof(*vcpu_event), GFP_KERNEL_ACCOUNT);
+ if (!vcpu_event) {
+ ret = -ENOMEM;
+ goto out;
+ }
+
+ registered_event = find_registered_event(kvm, state[i].num);
+ exposed_event = registered_event->exposed_event;
+
+ vcpu_event->state = state[i];
+ vcpu_event->registered_event = registered_event;
+ vcpu_event->vcpu = vcpu;
+
+ registered_event->vcpu_event_count++;
+ if (kvm_sdei_is_critical(exposed_event->state.priority)) {
+ list_add_tail(&vcpu_event->link,
+ &vsdei->critical_events);
+ vsdei->critical_event_count++;
+ } else {
+ list_add_tail(&vcpu_event->link,
+ &vsdei->normal_events);
+ vsdei->normal_event_count++;
+ }
+ }
+
+out:
+ kfree(state);
+ return ret;
+}
+
+static long vcpu_ioctl_set_vcpu_state(struct kvm_vcpu *vcpu,
+ struct kvm_sdei_cmd *cmd)
+{
+ struct kvm_sdei_vcpu *vsdei = vcpu->arch.sdei;
+ struct kvm_sdei_vcpu_event *critical_vcpu_event = NULL;
+ struct kvm_sdei_vcpu_event *normal_vcpu_event = NULL;
+ struct kvm_sdei_vcpu_state *state;
+ void __user *user_state = (void __user *)(cmd->vcpu_state);
+ long ret = 0;
+
+ state = kzalloc(sizeof(*state), GFP_KERNEL_ACCOUNT);
+ if (!state)
+ return -ENOMEM;
+
+ if (copy_from_user(state, user_state, sizeof(*state))) {
+ ret = -EFAULT;
+ goto out;
+ }
+
+ if (kvm_sdei_is_supported(state->critical_num)) {
+ critical_vcpu_event = find_vcpu_event(vcpu,
+ state->critical_num);
+ if (!critical_vcpu_event) {
+ ret = -EINVAL;
+ goto out;
+ }
+ }
+
+ if (kvm_sdei_is_supported(state->normal_num)) {
+ normal_vcpu_event = find_vcpu_event(vcpu, state->normal_num);
+ if (!normal_vcpu_event) {
+ ret = -EINVAL;
+ goto out;
+ }
+ }
+
+ vsdei->state = *state;
+ vsdei->critical_event = critical_vcpu_event;
+ vsdei->normal_event = normal_vcpu_event;
+
+ /*
+ * Deliver the vCPU events if we don't have a valid handler
+ * running. Otherwise, the vCPU events will be delivered when
+ * the running handler is completed.
+ */
+ if (!vsdei->critical_event && !vsdei->normal_event &&
+ (vsdei->critical_event_count + vsdei->normal_event_count) > 0)
+ kvm_make_request(KVM_REQ_SDEI, vcpu);
+
+out:
+ kfree(state);
+ return ret;
+}
+
+static long vcpu_ioctl_inject_event(struct kvm_vcpu *vcpu,
+ struct kvm_sdei_cmd *cmd)
+{
+ struct kvm *kvm = vcpu->kvm;
+ struct kvm_sdei_vcpu *vsdei = vcpu->arch.sdei;
+ struct kvm_sdei_exposed_event *exposed_event;
+ struct kvm_sdei_registered_event *registered_event;
+ int index;
+
+ if (!kvm_sdei_is_supported(cmd->num))
+ return -EINVAL;
+
+ registered_event = find_registered_event(kvm, cmd->num);
+ if (!registered_event)
+ return -ENOENT;
+
+ exposed_event = registered_event->exposed_event;
+ index = kvm_sdei_vcpu_index(vcpu, exposed_event);
+ if (!kvm_sdei_is_registered(registered_event, index) ||
+ !kvm_sdei_is_enabled(registered_event, index) ||
+ kvm_sdei_is_unregister_pending(registered_event, index))
+ return -EPERM;
+
+ if (vsdei->state.masked)
+ return -EPERM;
+
+ return do_inject_event(vcpu, registered_event, false);
+}
+
+long kvm_sdei_vcpu_ioctl(struct kvm_vcpu *vcpu, unsigned long arg)
+{
+ struct kvm *kvm = vcpu->kvm;
+ struct kvm_sdei_kvm *ksdei = kvm->arch.sdei;
+ struct kvm_sdei_vcpu *vsdei = vcpu->arch.sdei;
+ struct kvm_sdei_cmd *cmd = NULL;
+ void __user *argp = (void __user *)arg;
+ long ret = 0;
+
+ if (!(ksdei && vsdei)) {
+ ret = -EPERM;
+ goto out;
+ }
+
+ cmd = kzalloc(sizeof(*cmd), GFP_KERNEL_ACCOUNT);
+ if (!cmd) {
+ ret = -ENOMEM;
+ goto out;
+ }
+
+ if (copy_from_user(cmd, argp, sizeof(*cmd))) {
+ ret = -EFAULT;
+ goto out;
+ }
+
+ spin_lock(&ksdei->lock);
+ spin_lock(&vsdei->lock);
+
+ switch (cmd->cmd) {
+ case KVM_SDEI_CMD_GET_VCPU_EVENT_COUNT:
+ cmd->count = vsdei->critical_event_count +
+ vsdei->normal_event_count;
+ if (copy_to_user(argp, cmd, sizeof(*cmd)))
+ ret = -EFAULT;
+ break;
+ case KVM_SDEI_CMD_GET_VCPU_EVENT:
+ ret = vcpu_ioctl_get_vcpu_event(vcpu, cmd);
+ break;
+ case KVM_SDEI_CMD_SET_VCPU_EVENT:
+ ret = vcpu_ioctl_set_vcpu_event(vcpu, cmd);
+ break;
+ case KVM_SDEI_CMD_GET_VCPU_STATE:
+ if (copy_to_user(cmd->vcpu_state, &vsdei->state,
+ sizeof(vsdei->state)))
+ ret = -EFAULT;
+ break;
+ case KVM_SDEI_CMD_SET_VCPU_STATE:
+ ret = vcpu_ioctl_set_vcpu_state(vcpu, cmd);
+ break;
+ case KVM_SDEI_CMD_INJECT_EVENT:
+ ret = vcpu_ioctl_inject_event(vcpu, cmd);
+ break;
+ default:
+ ret = -EINVAL;
+ }
+
+ spin_unlock(&vsdei->lock);
+ spin_unlock(&ksdei->lock);
+
+out:
+ kfree(cmd);
+ return ret;
+}
+
void kvm_sdei_destroy_vcpu(struct kvm_vcpu *vcpu)
{
struct kvm *kvm = vcpu->kvm;
--
2.23.0

2022-03-22 08:34:52

by Gavin Shan

[permalink] [raw]
Subject: [PATCH v5 05/22] KVM: arm64: Support SDEI_EVENT_{ENABLE, DISABLE} hypercall

This supports the SDEI_EVENT_{ENABLE, DISABLE} hypercalls. After an SDEI
event is registered by the guest, it won't be delivered to the guest
until it's enabled. An event with unregistration pending can't be
enabled or disabled, as the registered event is going to be destroyed
after the current event is handled.
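
For illustration only, a guest typically registers an event and then
enables it, for instance with the kernel's SMCCC helper. The handler
address and argument below are placeholders:

    struct arm_smccc_res res;

    arm_smccc_1_1_hvc(SDEI_1_0_FN_SDEI_EVENT_REGISTER, event_num,
                      (unsigned long)handler, handler_arg,
                      SDEI_EVENT_REGISTER_RM_ANY, 0, &res);
    if (res.a0 == SDEI_SUCCESS)
        arm_smccc_1_1_hvc(SDEI_1_0_FN_SDEI_EVENT_ENABLE, event_num, &res);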

Signed-off-by: Gavin Shan <[email protected]>
---
arch/arm64/kvm/sdei.c | 49 +++++++++++++++++++++++++++++++++++++++++++
1 file changed, 49 insertions(+)

diff --git a/arch/arm64/kvm/sdei.c b/arch/arm64/kvm/sdei.c
index 2458dc666445..4ab58f264992 100644
--- a/arch/arm64/kvm/sdei.c
+++ b/arch/arm64/kvm/sdei.c
@@ -252,6 +252,51 @@ static unsigned long hypercall_register(struct kvm_vcpu *vcpu)
return ret;
}

+static unsigned long hypercall_enable(struct kvm_vcpu *vcpu, bool enable)
+{
+ struct kvm *kvm = vcpu->kvm;
+ struct kvm_sdei_kvm *ksdei = kvm->arch.sdei;
+ struct kvm_sdei_exposed_event *exposed_event;
+ struct kvm_sdei_registered_event *registered_event;
+ unsigned long event_num = smccc_get_arg1(vcpu);
+ int index;
+ unsigned long ret = SDEI_SUCCESS;
+
+ if (!kvm_sdei_is_supported(event_num)) {
+ ret = SDEI_INVALID_PARAMETERS;
+ goto out;
+ }
+
+ spin_lock(&ksdei->lock);
+
+ /* Check if the registered event exists */
+ registered_event = find_registered_event(kvm, event_num);
+ if (!registered_event) {
+ ret = SDEI_DENIED;
+ goto unlock;
+ }
+
+ /* Check if the event is registered and pending for unregistration */
+ exposed_event = registered_event->exposed_event;
+ index = kvm_sdei_vcpu_index(vcpu, exposed_event);
+ if (!kvm_sdei_is_registered(registered_event, index) ||
+ kvm_sdei_is_unregister_pending(registered_event, index)) {
+ ret = SDEI_DENIED;
+ goto unlock;
+ }
+
+ /* Update the enablement state */
+ if (enable)
+ kvm_sdei_set_enabled(registered_event, index);
+ else
+ kvm_sdei_clear_enabled(registered_event, index);
+
+unlock:
+ spin_unlock(&ksdei->lock);
+out:
+ return ret;
+}
+
int kvm_sdei_hypercall(struct kvm_vcpu *vcpu)
{
struct kvm *kvm = vcpu->kvm;
@@ -282,7 +327,11 @@ int kvm_sdei_hypercall(struct kvm_vcpu *vcpu)
ret = hypercall_register(vcpu);
break;
case SDEI_1_0_FN_SDEI_EVENT_ENABLE:
+ ret = hypercall_enable(vcpu, true);
+ break;
case SDEI_1_0_FN_SDEI_EVENT_DISABLE:
+ ret = hypercall_enable(vcpu, false);
+ break;
case SDEI_1_0_FN_SDEI_EVENT_CONTEXT:
case SDEI_1_0_FN_SDEI_EVENT_COMPLETE:
case SDEI_1_0_FN_SDEI_EVENT_COMPLETE_AND_RESUME:
--
2.23.0

2022-03-22 08:35:39

by Gavin Shan

[permalink] [raw]
Subject: [PATCH v5 09/22] KVM: arm64: Support SDEI_EVENT_GET_INFO hypercall

This supports the SDEI_EVENT_GET_INFO hypercall. It's used by the guest
to retrieve various information about an exposed or registered event,
including its type, signaled flag, priority, routing mode and affinity.
The routing mode and affinity information is only valid for a shared
event that has been registered.
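
For illustration only, a guest could query the type of an event roughly
as below, again assuming the kernel's SMCCC helper:

    struct arm_smccc_res res;

    arm_smccc_1_1_hvc(SDEI_1_0_FN_SDEI_EVENT_GET_INFO, event_num,
                      SDEI_EVENT_INFO_EV_TYPE, &res);
    /* res.a0 holds the event type on success, or an SDEI error code */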

Signed-off-by: Gavin Shan <[email protected]>
---
arch/arm64/kvm/sdei.c | 73 +++++++++++++++++++++++++++++++++++++++++++
1 file changed, 73 insertions(+)

diff --git a/arch/arm64/kvm/sdei.c b/arch/arm64/kvm/sdei.c
index 5c43c8912ea1..4f26e5f70bff 100644
--- a/arch/arm64/kvm/sdei.c
+++ b/arch/arm64/kvm/sdei.c
@@ -494,6 +494,77 @@ static unsigned long hypercall_status(struct kvm_vcpu *vcpu)
return ret;
}

+static unsigned long hypercall_info(struct kvm_vcpu *vcpu)
+{
+ struct kvm *kvm = vcpu->kvm;
+ struct kvm_sdei_kvm *ksdei = kvm->arch.sdei;
+ struct kvm_sdei_exposed_event *exposed_event = NULL;
+ struct kvm_sdei_registered_event *registered_event = NULL;
+ unsigned long event_num = smccc_get_arg1(vcpu);
+ unsigned long event_info = smccc_get_arg2(vcpu);
+ int index;
+ unsigned long ret = SDEI_SUCCESS;
+
+ if (!kvm_sdei_is_supported(event_num)) {
+ ret = SDEI_INVALID_PARAMETERS;
+ goto out;
+ }
+
+ spin_lock(&ksdei->lock);
+
+ /*
+ * Retrieve the information from the registered event if it exists.
+ * Otherwise, fall back to the exposed event if it exists.
+ */
+ registered_event = find_registered_event(kvm, event_num);
+ exposed_event = registered_event ? registered_event->exposed_event :
+ find_exposed_event(kvm, event_num);
+ if (!exposed_event) {
+ ret = SDEI_INVALID_PARAMETERS;
+ goto unlock;
+ }
+
+ /* Retrieve the requested information */
+ switch (event_info) {
+ case SDEI_EVENT_INFO_EV_TYPE:
+ ret = exposed_event->state.type;
+ break;
+ case SDEI_EVENT_INFO_EV_SIGNALED:
+ ret = exposed_event->state.signaled;
+ break;
+ case SDEI_EVENT_INFO_EV_PRIORITY:
+ ret = exposed_event->state.priority;
+ break;
+ case SDEI_EVENT_INFO_EV_ROUTING_MODE:
+ case SDEI_EVENT_INFO_EV_ROUTING_AFF:
+ if (!kvm_sdei_is_shared(exposed_event->state.type)) {
+ ret = SDEI_INVALID_PARAMETERS;
+ break;
+ }
+
+ index = kvm_sdei_vcpu_index(vcpu, exposed_event);
+ if (!registered_event ||
+ !kvm_sdei_is_registered(registered_event, index)) {
+ ret = SDEI_DENIED;
+ break;
+ }
+
+ if (event_info == SDEI_EVENT_INFO_EV_ROUTING_MODE)
+ ret = registered_event->state.route_mode;
+ else
+ ret = registered_event->state.route_affinity;
+
+ break;
+ default:
+ ret = SDEI_INVALID_PARAMETERS;
+ }
+
+unlock:
+ spin_unlock(&ksdei->lock);
+out:
+ return ret;
+}
+
int kvm_sdei_hypercall(struct kvm_vcpu *vcpu)
{
struct kvm *kvm = vcpu->kvm;
@@ -543,6 +614,8 @@ int kvm_sdei_hypercall(struct kvm_vcpu *vcpu)
ret = hypercall_status(vcpu);
break;
case SDEI_1_0_FN_SDEI_EVENT_GET_INFO:
+ ret = hypercall_info(vcpu);
+ break;
case SDEI_1_0_FN_SDEI_EVENT_ROUTING_SET:
case SDEI_1_0_FN_SDEI_PE_MASK:
case SDEI_1_0_FN_SDEI_PE_UNMASK:
--
2.23.0

2022-03-22 08:35:49

by Gavin Shan

[permalink] [raw]
Subject: [PATCH v5 21/22] KVM: arm64: Add SDEI document

This adds a document to explain how the virtualized SDEI service is
implemented and supported.

Signed-off-by: Gavin Shan <[email protected]>
---
Documentation/virt/kvm/arm/sdei.rst | 325 ++++++++++++++++++++++++++++
1 file changed, 325 insertions(+)
create mode 100644 Documentation/virt/kvm/arm/sdei.rst

diff --git a/Documentation/virt/kvm/arm/sdei.rst b/Documentation/virt/kvm/arm/sdei.rst
new file mode 100644
index 000000000000..61213e4b9aea
--- /dev/null
+++ b/Documentation/virt/kvm/arm/sdei.rst
@@ -0,0 +1,325 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+=====================================
+SDEI Virtualization Support for ARM64
+=====================================
+
+Arm specification DEN0054/C defines the Software Delegated Exception
+Interface (SDEI). It provides a mechanism for registering and servicing
+system events from system firmware. The interface is offered by a higher
+exception level to a lower exception level, in other words by secure
+platform firmware to a hypervisor, or by a hypervisor to an OS, or both.
+
+https://developer.arm.com/documentation/den0054/c
+
+KVM/arm64 implements the defined hypercalls in the specification so that
+the system events can be registered and serviced from KVM hypervisor to
+the guest OS.
+
+SDEI Event Management
+=====================
+
+Each SDEI event is identified by a 32-bit number. The lower 24 bits are
+the event number, while the other bits are reserved or used for vendor
+defined events. In the KVM/arm64 implementation, bits 22 and 23 are
+further reserved to identify the events visible to the implementation.
+Value 0x1 should be seen in this field for those KVM/arm64 visible events.
+In the meantime, event number 0x0 is also supported since it is used by
+the SDEI_EVENT_SIGNAL hypercall.
+
+An SDEI event needs to be exposed by the VMM to the hypervisor through
+the ioctl interface before the guest is able to register it. The exposed
+event is represented by ``struct kvm_sdei_exposed_event``. The registered
+event is represented by ``struct kvm_sdei_registered_event``, whose
+instances are created on the SDEI_EVENT_REGISTER hypercall. There is only
+one registered event instance per event in a particular VM, regardless of
+the event type. The registered event can be injected and delivered to one
+specific vcpu in the form of a vcpu event. ``struct kvm_sdei_vcpu_event``
+describes vcpu events. On a particular vcpu, one vcpu event instance can
+be shared by multiple injections of the same event.
+
+Execution enters the SDEI context when a vcpu event is handled. The
+interrupted or preempted context is saved into the vcpu state, which
+is represented by ``struct kvm_sdei_vcpu``. After that, the SDEI event
+handler, which was provided via the SDEI_EVENT_REGISTER hypercall, is
+invoked. The SDEI_EVENT_COMPLETE or SDEI_EVENT_COMPLETE_AND_RESUME
+hypercall must be issued for the SDEI event handler to complete. When
+one of these two hypercalls is received, the interrupted or preempted
+context is restored from the vcpu state and execution resumes there.
+
+When migration happens, the status of a particular SDEI event is not
+deterministic, so migration of the aforementioned structures and objects
+needs to be supported. The migratable information in these objects is
+put into separate data structures, which are the corresponding state
+variants in kvm_sdei_state.h. Besides, ioctl commands are introduced to
+read and write them on the source and destination VM during migration.
+
+IOCTL Commands
+==============
+
+KVM_ARM_SDEI_COMMAND
+--------------------
+
+:Capability: KVM_CAP_ARM_SDEI
+:Type: vm ioctl, vcpu ioctl
+:Parameters: struct kvm_sdei_cmd
+:Returns: 0 on success, < 0 on error
+
+::
+
+ struct kvm_sdei_cmd {
+ __u32 cmd;
+ union {
+ __u32 version;
+ __u32 count;
+ };
+ union {
+ struct kvm_sdei_exposed_event_state *exposed_event_state;
+ struct kvm_sdei_registered_event_state *registered_event_state;
+ struct kvm_sdei_vcpu_event_state *vcpu_event_state;
+ struct kvm_sdei_vcpu_state *vcpu_state;
+ __u64 num;
+ };
+ };
+
+The SDEI ioctl command is identified by KVM_ARM_SDEI_COMMAND and ``cmd``
+in the argument ``struct kvm_sdei_cmd`` provides further command to be
+executed.
+
+KVM_SDEI_CMD_GET_VERSION
+------------------------
+
+:Type: vm ioctl
+:Parameters: struct kvm_sdei_cmd
+:Returns: 0 on success, < 0 on error
+
+On success, the implementation version is returned in ``version`` of
+``struct kvm_sdei_cmd``. This version is different from the version of
+the SDEI specification being followed. It indicates to what extent the
+implementation covers that specification. For example, the SDEI interrupt
+binding events are defined in the SDEI v1.1 specification, but they are
+not supported by the current implementation.
+
+KVM_SDEI_CMD_GET_EXPOSED_EVENT_COUNT
+------------------------------------
+
+:Type: vm ioctl
+:Parameters: struct kvm_sdei_cmd
+:Returns: 0 on success, < 0 on error
+
+This ioctl command is used prior to KVM_SDEI_CMD_GET_EXPOSED_EVENT, to
+prepare ``exposed_event_state`` of ``struct kvm_sdei_cmd`` for that
+command during migration.
+
+On success, the number of exposed events is returned by ``count``
+of ``struct kvm_sdei_cmd``.
+
+KVM_SDEI_CMD_GET_EXPOSED_EVENT
+------------------------------
+
+:Type: vm ioctl
+:Parameters: struct kvm_sdei_cmd, struct kvm_sdei_exposed_event_state
+:Returns: 0 on success, < 0 on error
+
+::
+
+ struct kvm_sdei_exposed_event_state {
+ __u64 num;
+
+ __u8 type;
+ __u8 signaled;
+ __u8 priority;
+ __u8 padding[5];
+ __u64 notifier;
+ };
+
+This ioctl command is used to retrieve the exposed events on the source
+VM during migration.
+
+The number of exposed events to be retrieved is specified by ``count``
+of ``struct kvm_sdei_cmd``. On success, the retrieved exposed events are
+returned by ``exposed_event_state`` of ``struct kvm_sdei_cmd``.
+
+KVM_SDEI_CMD_SET_EXPOSED_EVENT
+------------------------------
+
+:Type: vm ioctl
+:Parameters: struct kvm_sdei_cmd, struct kvm_sdei_exposed_event_state
+:Returns: 0 on success, < 0 on error
+
+This ioctl command is used by the VMM to expose SDEI events and to
+migrate them. Only the ``notifier`` of ``struct kvm_sdei_exposed_event_state``
+may be modified if the specified exposed event already exists.
+
+The number of events to be exposed is specified by ``count`` of
+``struct kvm_sdei_cmd``. The information about the exposed events is
+passed by ``exposed_event_state`` of ``struct kvm_sdei_cmd``.
+
+KVM_SDEI_CMD_GET_REGISTERED_EVENT_COUNT
+---------------------------------------
+
+:Type: vm ioctl
+:Parameters: struct kvm_sdei_cmd
+:Returns: 0 on success, < 0 on error
+
+This ioctl command is used prior to KVM_SDEI_CMD_GET_REGISTERED_EVENT,
+to prepare ``registered_event_state`` of ``struct kvm_sdei_cmd`` for
+that command during migration.
+
+On success, the number of registered events is returned by ``count``
+of ``struct kvm_sdei_cmd``.
+
+KVM_SDEI_CMD_GET_REGISTERED_EVENT
+---------------------------------
+
+:Type: vm ioctl
+:Parameters: struct kvm_sdei_cmd, struct kvm_sdei_registered_event_state
+:Returns: 0 on success, < 0 on error
+
+::
+
+ struct kvm_sdei_registered_event_state {
+ __u64 num;
+
+ __u8 route_mode;
+ __u8 padding[3];
+ __u64 route_affinity;
+ __u64 ep_address[KVM_SDEI_MAX_VCPUS];
+ __u64 ep_arg[KVM_SDEI_MAX_VCPUS];
+ __u64 registered[KVM_SDEI_MAX_VCPUS/64];
+ __u64 enabled[KVM_SDEI_MAX_VCPUS/64];
+ __u64 unregister_pending[KVM_SDEI_MAX_VCPUS/64];
+ };
+
+This ioctl command is used to retrieve the registered events on the
+source VM during migration.
+
+The number of registered events to be retrieved is specified by ``count``
+of ``struct kvm_sdei_cmd``. On success, the retrieved registered events
+are returned by ``registered_event_state`` of ``struct kvm_sdei_cmd``.
+
+KVM_SDEI_CMD_SET_REGISTERED_EVENT
+---------------------------------
+
+:Type: vm ioctl
+:Parameters: struct kvm_sdei_cmd, struct kvm_sdei_registered_event_state
+:Returns: 0 on success, < 0 on error
+
+This ioctl command is used by the VMM to migrate the registered events
+on the destination VM.
+
+The number of events to be registered is specified by ``count`` of
+``struct kvm_sdei_cmd``. The information about the registered events
+is passed by ``registered_event_state`` of ``struct kvm_sdei_cmd``.
+
+KVM_SDEI_CMD_GET_VCPU_EVENT_COUNT
+---------------------------------
+
+:Type: vcpu ioctl
+:Parameters: struct kvm_sdei_cmd
+:Returns: 0 on success, < 0 on error
+
+This ioctl command is used prior to KVM_SDEI_CMD_GET_VCPU_EVENT, to
+prepare ``vcpu_event_state`` of ``struct kvm_sdei_cmd`` for that
+command during migration.
+
+On success, the number of vcpu events is returned by ``count``
+of ``struct kvm_sdei_cmd``.
+
+KVM_SDEI_CMD_GET_VCPU_EVENT
+---------------------------
+
+:Type: vcpu ioctl
+:Parameters: struct kvm_sdei_cmd, struct kvm_sdei_vcpu_event_state
+:Returns: 0 on success, < 0 on error
+
+::
+
+ struct kvm_sdei_vcpu_event_state {
+ __u64 num;
+
+ __u32 event_count;
+ __u32 padding;
+ };
+
+This ioctl command is used to retrieve the vcpu events on the source
+VM during migration.
+
+The number of vcpu events to be retrieved is specified by ``count`` of
+``struct kvm_sdei_cmd``. On success, the retrieved vcpu events are
+returned by ``vcpu_event_state`` of ``struct kvm_sdei_cmd``.
+
+KVM_SDEI_CMD_SET_VCPU_EVENT
+---------------------------
+
+:Type: vcpu ioctl
+:Parameters: struct kvm_sdei_cmd, struct kvm_sdei_vcpu_event_state
+:Returns: 0 on success, < 0 on error
+
+This ioctl command is used by the VMM to migrate the vcpu events on the
+destination VM.
+
+The number of vcpu events to be added is specified by ``count`` of
+``struct kvm_sdei_cmd``. The information about the vcpu events is
+passed by ``vcpu_event_state`` of ``struct kvm_sdei_cmd``.
+
+KVM_SDEI_CMD_GET_VCPU_STATE
+---------------------------
+
+:Type: vcpu ioctl
+:Parameters: struct kvm_sdei_cmd, struct kvm_sdei_vcpu_state
+:Returns: 0 on success, < 0 on error
+
+::
+
+ struct kvm_sdei_vcpu_regs_state {
+ __u64 regs[18];
+ __u64 pc;
+ __u64 pstate;
+ };
+
+ struct kvm_sdei_vcpu_state {
+ __u8 masked;
+ __u8 padding[7];
+ __u64 critical_num;
+ __u64 normal_num;
+ struct kvm_sdei_vcpu_regs_state critical_regs;
+ struct kvm_sdei_vcpu_regs_state normal_regs;
+ };
+
+This ioctl command is used to retrieve the vcpu state on the source VM
+during migration.
+
+On success, the current vcpu state is returned by ``vcpu_state`` of
+``struct kvm_sdei_cmd``.
+
+KVM_SDEI_CMD_SET_VCPU_STATE
+---------------------------
+
+:Type: vcpu ioctl
+:Parameters: struct kvm_sdei_cmd, struct kvm_sdei_vcpu_state
+:Returns: 0 on success, < 0 on error
+
+This ioctl command is used by the VMM to migrate the vcpu state on the
+destination VM.
+
+The vcpu state to be configured is passed by ``vcpu_state`` of
+``struct kvm_sdei_cmd``.
+
+KVM_SDEI_CMD_INJECT_EVENT
+-------------------------
+
+:Type: vcpu ioctl
+:Parameters: struct kvm_sdei_cmd
+:Returns: 0 on success, < 0 on error
+
+This ioctl command is used by the VMM to inject an SDEI event into the
+specified vcpu.
+
+The number of the SDEI event to be injected is passed by ``num`` of
+``struct kvm_sdei_cmd``.
+
+Future Work
+===========
+
+1. Support the routing mode and affinity for the shared events.
+2. Support interrupt binding events.
--
2.23.0

2022-03-22 08:35:49

by Gavin Shan

[permalink] [raw]
Subject: [PATCH v5 14/22] KVM: arm64: Support SDEI event injection, delivery and cancellation

This supports SDEI event injection, delivery and cancellation. The
SDEI event is injected by kvm_sdei_inject_event(). The injected
event can be cancelled by kvm_sdei_cancel_event() before it's
delivered and handled.

A KVM_REQ_SDEI request becomes pending once the SDEI event is injected,
and kvm_sdei_deliver_event() is called to accommodate the request.
The injected SDEI event is delivered and handled in this way. The
execution context is switched as below:

* x0 - x17 are saved. All of them are cleared except the following
registers:

x0: SDEI event number
x1: user argument associated with the SDEI event
x2: PC of the interrupted or preempted context
x3: PSTATE of the interrupted or preempted context

* PC is set to the handler of the SDEI event, which was provided
during its registration. PSTATE is modified according to the
SDEI specification.

* The SDEI event with normal priority can be preempted by one with
critical priority. However, nothing can preempt an SDEI event with
critical priority.
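
For illustration only, a guest handler entered with the register layout
described above roughly corresponds to a C function like the one below.
handle_event() is a placeholder, and control doesn't return from the
COMPLETE hypercall to this function:

    static void sdei_event_handler(unsigned long event_num,  /* x0 */
                                   unsigned long arg,        /* x1 */
                                   unsigned long pc,         /* x2 */
                                   unsigned long pstate)     /* x3 */
    {
        struct arm_smccc_res res;

        handle_event(event_num, arg);

        /* Return to the interrupted or preempted context */
        arm_smccc_1_1_hvc(SDEI_1_0_FN_SDEI_EVENT_COMPLETE,
                          SDEI_EV_HANDLED, &res);
    }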

Signed-off-by: Gavin Shan <[email protected]>
---
arch/arm64/include/asm/kvm_host.h | 1 +
arch/arm64/include/asm/kvm_sdei.h | 4 +
arch/arm64/kvm/arm.c | 3 +
arch/arm64/kvm/sdei.c | 284 ++++++++++++++++++++++++++++++
4 files changed, 292 insertions(+)

diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index 5d37e046a458..e2762d08ab1c 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -46,6 +46,7 @@
#define KVM_REQ_RECORD_STEAL KVM_ARCH_REQ(3)
#define KVM_REQ_RELOAD_GICv4 KVM_ARCH_REQ(4)
#define KVM_REQ_RELOAD_PMU KVM_ARCH_REQ(5)
+#define KVM_REQ_SDEI KVM_ARCH_REQ(6)

#define KVM_DIRTY_LOG_MANUAL_CAPS (KVM_DIRTY_LOG_MANUAL_PROTECT_ENABLE | \
KVM_DIRTY_LOG_INITIALLY_SET)
diff --git a/arch/arm64/include/asm/kvm_sdei.h b/arch/arm64/include/asm/kvm_sdei.h
index 6f58a846d05c..54c730acd298 100644
--- a/arch/arm64/include/asm/kvm_sdei.h
+++ b/arch/arm64/include/asm/kvm_sdei.h
@@ -165,6 +165,10 @@ KVM_SDEI_REGISTERED_EVENT_FUNC(unregister_pending)
void kvm_sdei_init_vm(struct kvm *kvm);
void kvm_sdei_create_vcpu(struct kvm_vcpu *vcpu);
int kvm_sdei_hypercall(struct kvm_vcpu *vcpu);
+int kvm_sdei_inject_event(struct kvm_vcpu *vcpu,
+ unsigned long num, bool immediate);
+int kvm_sdei_cancel_event(struct kvm_vcpu *vcpu, unsigned long num);
+void kvm_sdei_deliver_event(struct kvm_vcpu *vcpu);
void kvm_sdei_destroy_vcpu(struct kvm_vcpu *vcpu);
void kvm_sdei_destroy_vm(struct kvm *kvm);

diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
index 96fcae5beee4..00c136a6e8df 100644
--- a/arch/arm64/kvm/arm.c
+++ b/arch/arm64/kvm/arm.c
@@ -734,6 +734,9 @@ static void check_vcpu_requests(struct kvm_vcpu *vcpu)
if (kvm_check_request(KVM_REQ_VCPU_RESET, vcpu))
kvm_reset_vcpu(vcpu);

+ if (kvm_check_request(KVM_REQ_SDEI, vcpu))
+ kvm_sdei_deliver_event(vcpu);
+
/*
* Clear IRQ_PENDING requests that were made to guarantee
* that a VCPU sees new virtual interrupts.
diff --git a/arch/arm64/kvm/sdei.c b/arch/arm64/kvm/sdei.c
index 1e0ca9022eaa..a24270378305 100644
--- a/arch/arm64/kvm/sdei.c
+++ b/arch/arm64/kvm/sdei.c
@@ -43,6 +43,25 @@ find_registered_event(struct kvm *kvm, unsigned long num)
return NULL;
}

+static struct kvm_sdei_vcpu_event *
+find_vcpu_event(struct kvm_vcpu *vcpu, unsigned long num)
+{
+ struct kvm_sdei_vcpu *vsdei = vcpu->arch.sdei;
+ struct kvm_sdei_vcpu_event *vcpu_event;
+
+ list_for_each_entry(vcpu_event, &vsdei->critical_events, link) {
+ if (vcpu_event->state.num == num)
+ return vcpu_event;
+ }
+
+ list_for_each_entry(vcpu_event, &vsdei->normal_events, link) {
+ if (vcpu_event->state.num == num)
+ return vcpu_event;
+ }
+
+ return NULL;
+}
+
static void remove_all_exposed_events(struct kvm *kvm)
{
struct kvm_sdei_kvm *ksdei = kvm->arch.sdei;
@@ -637,6 +656,76 @@ static unsigned long hypercall_mask(struct kvm_vcpu *vcpu, bool mask)
return ret;
}

+static int do_inject_event(struct kvm_vcpu *vcpu,
+ struct kvm_sdei_registered_event *registered_event,
+ bool immediate)
+{
+ struct kvm_sdei_vcpu *vsdei = vcpu->arch.sdei;
+ struct kvm_sdei_exposed_event *exposed_event;
+ struct kvm_sdei_vcpu_event *vcpu_event;
+ unsigned int vcpu_event_count;
+
+ /*
+ * In some cases, the injected event is expected to be delivered
+ * immediately. However, there are two cases where the injected
+ * event can't be delivered immediately: (a) the injected event is
+ * a critical one, but we already have pending critical events for
+ * delivery. (b) the injected event is a normal one, but we have
+ * pending events for delivery, regardless of their priorities.
+ */
+ exposed_event = registered_event->exposed_event;
+ if (immediate) {
+ vcpu_event_count = vsdei->critical_event_count;
+ if (kvm_sdei_is_normal(exposed_event->state.priority))
+ vcpu_event_count += vsdei->normal_event_count;
+
+ if (vcpu_event_count > 0)
+ return -ENOSPC;
+ }
+
+ /* Check if the vcpu event exists */
+ vcpu_event = find_vcpu_event(vcpu, registered_event->state.num);
+ if (vcpu_event) {
+ vcpu_event->state.event_count++;
+ kvm_make_request(KVM_REQ_SDEI, vcpu);
+ return 0;
+ }
+
+ /* Check if the count of vcpu event instances exceeds the limit */
+ vcpu_event_count = vsdei->critical_event_count +
+ vsdei->normal_event_count;
+ if (vcpu_event_count >= KVM_SDEI_MAX_EVENTS)
+ return -ERANGE;
+
+ /* Allocate the vcpu event */
+ vcpu_event = kzalloc(sizeof(*vcpu_event), GFP_KERNEL_ACCOUNT);
+ if (!vcpu_event)
+ return -ENOMEM;
+
+ /*
+ * We should take lock to update the registered event because its
+ * reference count might be zero. In that case, the registered event
+ * could be released.
+ */
+ vcpu_event->state.num = registered_event->state.num;
+ vcpu_event->state.event_count = 1;
+ vcpu_event->vcpu = vcpu;
+ vcpu_event->registered_event = registered_event;
+
+ registered_event->vcpu_event_count++;
+ if (kvm_sdei_is_critical(exposed_event->state.priority)) {
+ list_add_tail(&vcpu_event->link, &vsdei->critical_events);
+ vsdei->critical_event_count++;
+ } else {
+ list_add_tail(&vcpu_event->link, &vsdei->normal_events);
+ vsdei->normal_event_count++;
+ }
+
+ kvm_make_request(KVM_REQ_SDEI, vcpu);
+
+ return 0;
+}
+
static unsigned long hypercall_reset(struct kvm_vcpu *vcpu, bool private)
{
struct kvm *kvm = vcpu->kvm;
@@ -761,6 +850,201 @@ int kvm_sdei_hypercall(struct kvm_vcpu *vcpu)
return 1;
}

+int kvm_sdei_inject_event(struct kvm_vcpu *vcpu,
+ unsigned long num,
+ bool immediate)
+{
+ struct kvm *kvm = vcpu->kvm;
+ struct kvm_sdei_kvm *ksdei = kvm->arch.sdei;
+ struct kvm_sdei_vcpu *vsdei = vcpu->arch.sdei;
+ struct kvm_sdei_exposed_event *exposed_event = NULL;
+ struct kvm_sdei_registered_event *registered_event = NULL;
+ int index, ret = 0;
+
+ if (!(ksdei && vsdei)) {
+ ret = -EPERM;
+ goto out;
+ }
+
+ if (!kvm_sdei_is_supported(num)) {
+ ret = -EINVAL;
+ goto out;
+ }
+
+ spin_lock(&ksdei->lock);
+
+ /* Check if the registered event exists */
+ registered_event = find_registered_event(kvm, num);
+ if (!registered_event) {
+ ret = -ENOENT;
+ goto unlock_kvm;
+ }
+
+ /* Check if the event has been registered and enabled */
+ exposed_event = registered_event->exposed_event;
+ index = kvm_sdei_vcpu_index(vcpu, exposed_event);
+ if (!kvm_sdei_is_registered(registered_event, index) ||
+ !kvm_sdei_is_enabled(registered_event, index) ||
+ kvm_sdei_is_unregister_pending(registered_event, index)) {
+ ret = -EPERM;
+ goto unlock_kvm;
+ }
+
+ /* Check if the vcpu has been masked off */
+ spin_lock(&vsdei->lock);
+ if (vsdei->state.masked) {
+ ret = -EPERM;
+ goto unlock_vcpu;
+ }
+
+ /* Inject the event */
+ ret = do_inject_event(vcpu, registered_event, immediate);
+
+unlock_vcpu:
+ spin_unlock(&vsdei->lock);
+unlock_kvm:
+ spin_unlock(&ksdei->lock);
+out:
+ return ret;
+}
+
+int kvm_sdei_cancel_event(struct kvm_vcpu *vcpu, unsigned long num)
+{
+ struct kvm *kvm = vcpu->kvm;
+ struct kvm_sdei_kvm *ksdei = kvm->arch.sdei;
+ struct kvm_sdei_vcpu *vsdei = vcpu->arch.sdei;
+ struct kvm_sdei_exposed_event *exposed_event = NULL;
+ struct kvm_sdei_registered_event *registered_event = NULL;
+ struct kvm_sdei_vcpu_event *vcpu_event = NULL;
+ int ret = 0;
+
+ if (!(ksdei && vsdei)) {
+ ret = -EPERM;
+ goto out;
+ }
+
+ spin_lock(&ksdei->lock);
+ spin_lock(&vsdei->lock);
+
+ /* Find the vcpu event */
+ vcpu_event = find_vcpu_event(vcpu, num);
+ if (!vcpu_event) {
+ ret = -EINVAL;
+ goto unlock;
+ }
+
+ /* We can't cancel the event if it has been delivered */
+ if (vcpu_event->state.event_count <= 1 &&
+ (vsdei->critical_event == vcpu_event ||
+ vsdei->normal_event == vcpu_event)) {
+ ret = -EINPROGRESS;
+ goto unlock;
+ }
+
+ /* Destroy the vcpu event instance if needed */
+ registered_event = vcpu_event->registered_event;
+ exposed_event = registered_event->exposed_event;
+ vcpu_event->state.event_count--;
+ if (!vcpu_event->state.event_count)
+ remove_one_vcpu_event(vcpu, vcpu_event);
+
+unlock:
+ spin_unlock(&vsdei->lock);
+ spin_unlock(&ksdei->lock);
+out:
+ return ret;
+}
+
+void kvm_sdei_deliver_event(struct kvm_vcpu *vcpu)
+{
+ struct kvm *kvm = vcpu->kvm;
+ struct kvm_sdei_kvm *ksdei = kvm->arch.sdei;
+ struct kvm_sdei_vcpu *vsdei = vcpu->arch.sdei;
+ struct kvm_sdei_exposed_event *exposed_event;
+ struct kvm_sdei_registered_event *registered_event;
+ struct kvm_sdei_vcpu_event *vcpu_event;
+ struct kvm_sdei_vcpu_regs_state *regs;
+ unsigned long pstate;
+ int index;
+
+ if (!(ksdei && vsdei))
+ return;
+
+ spin_lock(&vsdei->lock);
+
+ /* The critical event can't be preempted */
+ if (vsdei->critical_event)
+ goto unlock;
+
+ /*
+ * The normal event can be preempted by the critical event.
+ * However, the normal event can't be preempted by another
+ * normal event.
+ */
+ vcpu_event = list_first_entry_or_null(&vsdei->critical_events,
+ struct kvm_sdei_vcpu_event, link);
+ if (!vcpu_event && !vsdei->normal_event) {
+ vcpu_event = list_first_entry_or_null(&vsdei->normal_events,
+ struct kvm_sdei_vcpu_event, link);
+ }
+
+ if (!vcpu_event)
+ goto unlock;
+
+ registered_event = vcpu_event->registered_event;
+ exposed_event = registered_event->exposed_event;
+ if (kvm_sdei_is_critical(exposed_event->state.priority)) {
+ vsdei->critical_event = vcpu_event;
+ vsdei->state.critical_num = vcpu_event->state.num;
+ regs = &vsdei->state.critical_regs;
+ } else {
+ vsdei->normal_event = vcpu_event;
+ vsdei->state.normal_num = vcpu_event->state.num;
+ regs = &vsdei->state.normal_regs;
+ }
+
+ /*
+ * Save registers: x0 -> x17, PC, PSTATE. There might be a pending
+ * exception or PC increment request from the last run on this vCPU.
+ * It must be applied before the interrupted context is saved.
+ * Otherwise, a later call to __kvm_adjust_pc() could shift the
+ * entry point passed to the guest by 4 bytes.
+ */
+ __kvm_adjust_pc(vcpu);
+ for (index = 0; index < ARRAY_SIZE(regs->regs); index++)
+ regs->regs[index] = vcpu_get_reg(vcpu, index);
+
+ regs->pc = *vcpu_pc(vcpu);
+ regs->pstate = *vcpu_cpsr(vcpu);
+
+ /*
+ * Inject SDEI event: x0 -> x3, PC, PSTATE. We needn't take the
+ * lock for the registered event as it can't be released thanks
+ * to its reference count.
+ */
+ for (index = 0; index < ARRAY_SIZE(regs->regs); index++)
+ vcpu_set_reg(vcpu, index, 0);
+
+ index = kvm_sdei_vcpu_index(vcpu, exposed_event);
+ vcpu_set_reg(vcpu, 0, registered_event->state.num);
+ vcpu_set_reg(vcpu, 1, registered_event->state.ep_arg[index]);
+ vcpu_set_reg(vcpu, 2, regs->pc);
+ vcpu_set_reg(vcpu, 3, regs->pstate);
+
+ pstate = regs->pstate;
+ pstate |= (PSR_D_BIT | PSR_A_BIT | PSR_I_BIT | PSR_F_BIT);
+ pstate &= ~PSR_MODE_MASK;
+ pstate |= PSR_MODE_EL1h;
+ pstate &= ~PSR_MODE32_BIT;
+
+ vcpu_write_sys_reg(vcpu, regs->pstate, SPSR_EL1);
+ *vcpu_cpsr(vcpu) = pstate;
+ *vcpu_pc(vcpu) = registered_event->state.ep_address[index];
+
+unlock:
+ spin_unlock(&vsdei->lock);
+}
+
void kvm_sdei_init_vm(struct kvm *kvm)
{
struct kvm_sdei_kvm *ksdei;
--
2.23.0

2022-03-22 08:36:51

by Gavin Shan

[permalink] [raw]
Subject: [PATCH v5 22/22] KVM: selftests: Add SDEI test case

This adds an SDEI self-test case where the various hypercalls are issued
to the default event (0x0). The default event is private, signaled and
has normal priority.

By default, two vCPUs are started and the following ioctl commands
or hypercalls are sent to them in sequence, to simulate how they
are used by the VMM and the Linux guest:

kvm_check_cap(KVM_CAP_ARM_SDEI)
KVM_SDEI_CMD_GET_VERSION
KVM_SDEI_CMD_SET_EXPOSED_EVENT (expose event)

SDEI_1_0_FN_SDEI_VERSION
SDEI_1_1_FN_SDEI_FEATURES (SDEI capability probing)
SDEI_1_0_FN_SDEI_SHARED_RESET (restart SDEI)
SDEI_1_0_FN_SDEI_PE_UNMASK (CPU online)

SDEI_1_0_FN_SDEI_EVENT_GET_INFO
SDEI_1_0_FN_SDEI_EVENT_REGISTER (register event)
SDEI_1_0_FN_SDEI_EVENT_ENABLE (enable event)
SDEI_1_1_FN_SDEI_EVENT_SIGNAL (event injection)

SDEI_1_0_FN_SDEI_EVENT_DISABLE (disable event)
SDEI_1_0_FN_SDEI_EVENT_UNREGISTER (unregister event)
SDEI_1_0_FN_SDEI_PE_MASK (CPU offline)

Signed-off-by: Gavin Shan <[email protected]>
---
tools/testing/selftests/kvm/Makefile | 1 +
tools/testing/selftests/kvm/aarch64/sdei.c | 525 +++++++++++++++++++++
2 files changed, 526 insertions(+)
create mode 100644 tools/testing/selftests/kvm/aarch64/sdei.c

diff --git a/tools/testing/selftests/kvm/Makefile b/tools/testing/selftests/kvm/Makefile
index 17c3f0749f05..4e4c03cc316b 100644
--- a/tools/testing/selftests/kvm/Makefile
+++ b/tools/testing/selftests/kvm/Makefile
@@ -105,6 +105,7 @@ TEST_GEN_PROGS_aarch64 += aarch64/get-reg-list
TEST_GEN_PROGS_aarch64 += aarch64/psci_cpu_on_test
TEST_GEN_PROGS_aarch64 += aarch64/vgic_init
TEST_GEN_PROGS_aarch64 += aarch64/vgic_irq
+TEST_GEN_PROGS_aarch64 += aarch64/sdei
TEST_GEN_PROGS_aarch64 += demand_paging_test
TEST_GEN_PROGS_aarch64 += dirty_log_test
TEST_GEN_PROGS_aarch64 += dirty_log_perf_test
diff --git a/tools/testing/selftests/kvm/aarch64/sdei.c b/tools/testing/selftests/kvm/aarch64/sdei.c
new file mode 100644
index 000000000000..2a7d816ce438
--- /dev/null
+++ b/tools/testing/selftests/kvm/aarch64/sdei.c
@@ -0,0 +1,525 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * ARM SDEI test
+ *
+ * Copyright (C) 2022 Red Hat, Inc.
+ *
+ * Author(s): Gavin Shan <[email protected]>
+ */
+
+#define _GNU_SOURCE
+#include <stdio.h>
+#include <stdlib.h>
+#include <unistd.h>
+#include <string.h>
+#include <pthread.h>
+#include <linux/bitmap.h>
+
+#include "kvm_util.h"
+#include "processor.h"
+#include "linux/arm_sdei.h"
+#include "asm/kvm_sdei_state.h"
+
+#define NR_VCPUS 2
+#define SDEI_EVENT_NUM KVM_SDEI_DEFAULT_EVENT
+
+#define VCPU_COMMAND_IDLE 0
+#define VCPU_COMMAND_EXIT 1
+
+struct vcpu_command {
+ const char *name;
+ uint64_t command;
+};
+
+struct sdei_feature {
+ uint16_t shared_slots;
+ uint16_t private_slots;
+ uint8_t relative_mode;
+};
+
+struct sdei_event_info {
+ uint8_t type;
+ uint8_t priority;
+ uint8_t signaled;
+};
+
+struct sdei_event_signal {
+ uint8_t handled;
+ uint8_t irq;
+ uint64_t status;
+ uint64_t pc;
+ uint64_t pstate;
+ uint64_t regs[18];
+};
+
+struct sdei_state {
+ uint64_t command;
+ uint64_t num;
+ uint64_t status;
+ union {
+ uint64_t version;
+ struct sdei_feature feature;
+ struct sdei_event_info info;
+ struct sdei_event_signal signal;
+ };
+
+ uint8_t command_completed;
+};
+
+struct vcpu_state {
+ struct kvm_vm *vm;
+ uint32_t vcpu_id;
+ pthread_t thread;
+ struct sdei_state state;
+};
+
+static struct vcpu_state vcpu_states[NR_VCPUS];
+static struct vcpu_command vcpu_commands[] = {
+ { "VERSION", SDEI_1_0_FN_SDEI_VERSION },
+ { "FEATURES", SDEI_1_1_FN_SDEI_FEATURES },
+ { "PRIVATE_RESET", SDEI_1_0_FN_SDEI_PRIVATE_RESET },
+ { "SHARED_RESET", SDEI_1_0_FN_SDEI_SHARED_RESET },
+ { "PE_UNMASK", SDEI_1_0_FN_SDEI_PE_UNMASK },
+ { "EVENT_GET_INFO", SDEI_1_0_FN_SDEI_EVENT_GET_INFO },
+ { "EVENT_REGISTER", SDEI_1_0_FN_SDEI_EVENT_REGISTER },
+ { "EVENT_ENABLE", SDEI_1_0_FN_SDEI_EVENT_ENABLE },
+ { "EVENT_SIGNAL", SDEI_1_1_FN_SDEI_EVENT_SIGNAL },
+ { "PE_MASK", SDEI_1_0_FN_SDEI_PE_MASK },
+ { "EVENT_DISABLE", SDEI_1_0_FN_SDEI_EVENT_DISABLE },
+ { "EVENT_UNREGISTER", SDEI_1_0_FN_SDEI_EVENT_UNREGISTER },
+};
+
+static inline int64_t smccc(uint32_t func, uint64_t arg0, uint64_t arg1,
+ uint64_t arg2, uint64_t arg3, uint64_t arg4)
+{
+ int64_t ret;
+
+ asm volatile (
+ "mov x0, %1\n"
+ "mov x1, %2\n"
+ "mov x2, %3\n"
+ "mov x3, %4\n"
+ "mov x4, %5\n"
+ "mov x5, %6\n"
+ "hvc #0\n"
+ "mov %0, x0\n"
+ : "=r" (ret) : "r" (func), "r" (arg0), "r" (arg1),
+ "r" (arg2), "r" (arg3), "r" (arg4) :
+ "x0", "x1", "x2", "x3", "x4", "x5");
+
+ return ret;
+}
+
+static inline bool is_error(int64_t status)
+{
+ if (status == SDEI_NOT_SUPPORTED ||
+ status == SDEI_INVALID_PARAMETERS ||
+ status == SDEI_DENIED ||
+ status == SDEI_PENDING ||
+ status == SDEI_OUT_OF_RESOURCE)
+ return true;
+
+ return false;
+}
+
+static void guest_irq_handler(struct ex_regs *regs)
+{
+ int vcpu_id = guest_get_vcpuid();
+ struct sdei_state *state = &vcpu_states[vcpu_id].state;
+
+ WRITE_ONCE(state->signal.irq, true);
+}
+
+static void sdei_event_handler(uint64_t num, uint64_t arg,
+ uint64_t pc, uint64_t pstate)
+{
+ struct sdei_state *state = (struct sdei_state *)arg;
+ uint64_t status;
+
+ status = smccc(SDEI_1_0_FN_SDEI_EVENT_STATUS, num, 0, 0, 0, 0);
+ WRITE_ONCE(state->signal.status, status);
+
+ WRITE_ONCE(state->signal.pc, pc);
+ WRITE_ONCE(state->signal.pstate, pstate);
+
+ status = smccc(SDEI_1_0_FN_SDEI_EVENT_CONTEXT, 0, 0, 0, 0, 0);
+ WRITE_ONCE(state->signal.regs[0], status);
+ status = smccc(SDEI_1_0_FN_SDEI_EVENT_CONTEXT, 1, 0, 0, 0, 0);
+ WRITE_ONCE(state->signal.regs[1], status);
+ status = smccc(SDEI_1_0_FN_SDEI_EVENT_CONTEXT, 2, 0, 0, 0, 0);
+ WRITE_ONCE(state->signal.regs[2], status);
+ status = smccc(SDEI_1_0_FN_SDEI_EVENT_CONTEXT, 3, 0, 0, 0, 0);
+ WRITE_ONCE(state->signal.regs[3], status);
+
+ WRITE_ONCE(state->signal.handled, true);
+ smccc(SDEI_1_0_FN_SDEI_EVENT_COMPLETE_AND_RESUME,
+ num, 0, 0, 0, 0);
+}
+
+static bool sdei_event_wait(struct sdei_state *state,
+ uint64_t timeout_in_seconds)
+{
+ uint64_t limit, count = 0;
+
+ limit = (timeout_in_seconds * 1000000) / 10;
+
+ while (1) {
+ if (READ_ONCE(state->signal.handled))
+ return true;
+
+ if (++count >= limit)
+ return false;
+
+ /*
+ * We issue HVC calls here to ensure the injected
+ * event can be delivered in time.
+ */
+ smccc(SDEI_1_0_FN_SDEI_EVENT_GET_INFO,
+ READ_ONCE(state->num), SDEI_EVENT_INFO_EV_TYPE,
+ 0, 0, 0);
+
+ usleep(10);
+ }
+
+ return false;
+}
+
+static void guest_code(int vcpu_id)
+{
+ struct sdei_state *state;
+ uint64_t command, last_command = -1UL, num, status;
+
+ state = &vcpu_states[vcpu_id].state;
+
+ while (1) {
+ command = READ_ONCE(state->command);
+ if (command == last_command)
+ continue;
+
+ num = READ_ONCE(state->num);
+ switch (command) {
+ case VCPU_COMMAND_IDLE:
+ WRITE_ONCE(state->status, SDEI_SUCCESS);
+ break;
+ case SDEI_1_0_FN_SDEI_VERSION:
+ status = smccc(command, 0, 0, 0, 0, 0);
+ WRITE_ONCE(state->status, status);
+ if (is_error(status))
+ break;
+
+ WRITE_ONCE(state->version, status);
+ break;
+ case SDEI_1_0_FN_SDEI_PRIVATE_RESET:
+ case SDEI_1_0_FN_SDEI_SHARED_RESET:
+ case SDEI_1_0_FN_SDEI_PE_UNMASK:
+ case SDEI_1_0_FN_SDEI_PE_MASK:
+ status = smccc(command, 0, 0, 0, 0, 0);
+ WRITE_ONCE(state->status, status);
+ break;
+ case SDEI_1_1_FN_SDEI_FEATURES:
+ status = smccc(command, 0, 0, 0, 0, 0);
+ WRITE_ONCE(state->status, status);
+ if (is_error(status))
+ break;
+
+ WRITE_ONCE(state->feature.shared_slots,
+ (status & 0xffff0000) >> 16);
+ WRITE_ONCE(state->feature.private_slots,
+ (status & 0x0000ffff));
+ status = smccc(command, 1, 0, 0, 0, 0);
+ WRITE_ONCE(state->status, status);
+ if (is_error(status))
+ break;
+
+ WRITE_ONCE(state->feature.relative_mode, status);
+ break;
+ case SDEI_1_0_FN_SDEI_EVENT_GET_INFO:
+ status = smccc(command, num,
+ SDEI_EVENT_INFO_EV_TYPE, 0, 0, 0);
+ WRITE_ONCE(state->status, status);
+ if (is_error(status))
+ break;
+
+ WRITE_ONCE(state->info.type, status);
+ status = smccc(command, num,
+ SDEI_EVENT_INFO_EV_PRIORITY, 0, 0, 0);
+ WRITE_ONCE(state->status, status);
+ if (is_error(status))
+ break;
+
+ WRITE_ONCE(state->info.priority, status);
+ status = smccc(command, num,
+ SDEI_EVENT_INFO_EV_SIGNALED, 0, 0, 0);
+ if (is_error(status))
+ break;
+
+ WRITE_ONCE(state->info.signaled, status);
+ break;
+ case SDEI_1_0_FN_SDEI_EVENT_REGISTER:
+ status = smccc(command, num,
+ (uint64_t)sdei_event_handler,
+ (uint64_t)state,
+ SDEI_EVENT_REGISTER_RM_ANY, 0);
+ WRITE_ONCE(state->status, status);
+ break;
+ case SDEI_1_0_FN_SDEI_EVENT_ENABLE:
+ case SDEI_1_0_FN_SDEI_EVENT_DISABLE:
+ case SDEI_1_0_FN_SDEI_EVENT_UNREGISTER:
+ status = smccc(command, num, 0, 0, 0, 0);
+ WRITE_ONCE(state->status, status);
+ break;
+ case SDEI_1_1_FN_SDEI_EVENT_SIGNAL:
+ status = smccc(command, num, (uint64_t)state, 0, 0, 0);
+ WRITE_ONCE(state->status, status);
+ if (is_error(status))
+ break;
+
+ if (!sdei_event_wait(state, 5))
+ WRITE_ONCE(state->status, SDEI_DENIED);
+
+ break;
+ case VCPU_COMMAND_EXIT:
+ WRITE_ONCE(state->status, SDEI_SUCCESS);
+ GUEST_DONE();
+ break;
+ default:
+ WRITE_ONCE(state->status, SDEI_INVALID_PARAMETERS);
+ }
+
+ last_command = command;
+ WRITE_ONCE(state->command_completed, true);
+ }
+}
+
+static void *vcpu_thread(void *arg)
+{
+ struct vcpu_state *state = arg;
+
+ vcpu_run(state->vm, state->vcpu_id);
+
+ return NULL;
+}
+
+static bool vcpu_wait(struct kvm_vm *vm, int timeout_in_seconds)
+{
+ unsigned long count, limit;
+ int i;
+
+ count = 0;
+ limit = (timeout_in_seconds * 1000000) / 50;
+ while (1) {
+ for (i = 0; i < NR_VCPUS; i++) {
+ sync_global_from_guest(vm, vcpu_states[i].state);
+ if (!vcpu_states[i].state.command_completed)
+ break;
+ }
+
+ if (i >= NR_VCPUS)
+ return true;
+
+ if (++count > limit)
+ return false;
+
+ usleep(50);
+ }
+
+ return false;
+}
+
+static void vcpu_send_command(struct kvm_vm *vm, uint64_t command)
+{
+ int i;
+
+ for (i = 0; i < NR_VCPUS; i++) {
+ memset(&vcpu_states[i].state, 0,
+ sizeof(vcpu_states[0].state));
+ vcpu_states[i].state.num = SDEI_EVENT_NUM;
+ vcpu_states[i].state.status = SDEI_SUCCESS;
+ vcpu_states[i].state.command = command;
+ vcpu_states[i].state.command_completed = false;
+
+ sync_global_to_guest(vm, vcpu_states[i].state);
+ }
+}
+
+static bool vcpu_check_state(struct kvm_vm *vm)
+{
+ int i, j, ret;
+
+ for (i = 0; i < NR_VCPUS; i++)
+ sync_global_from_guest(vm, vcpu_states[i].state);
+
+ for (i = 0; i < NR_VCPUS; i++) {
+ if (is_error(vcpu_states[i].state.status))
+ return false;
+
+ for (j = 0; j < NR_VCPUS; j++) {
+ ret = memcmp(&vcpu_states[i].state,
+ &vcpu_states[j].state,
+ sizeof(vcpu_states[0].state));
+ if (ret)
+ return false;
+ }
+ }
+
+ return true;
+}
+
+static void vcpu_dump_state(int index)
+{
+ struct sdei_state *state = &vcpu_states[0].state;
+
+ pr_info("--- %s\n", vcpu_commands[index].name);
+ switch (state->command) {
+ case SDEI_1_0_FN_SDEI_VERSION:
+ pr_info(" Version: %ld.%ld (vendor: 0x%lx)\n",
+ SDEI_VERSION_MAJOR(state->version),
+ SDEI_VERSION_MINOR(state->version),
+ SDEI_VERSION_VENDOR(state->version));
+ break;
+ case SDEI_1_1_FN_SDEI_FEATURES:
+ pr_info(" Shared event slots: %d\n",
+ state->feature.shared_slots);
+ pr_info(" Private event slots: %d\n",
+ state->feature.private_slots);
+ pr_info(" Relative mode: %s\n",
+ state->feature.relative_mode ? "Yes" : "No");
+ break;
+ case SDEI_1_0_FN_SDEI_EVENT_GET_INFO:
+ pr_info(" Type: %s\n",
+ state->info.type == SDEI_EVENT_TYPE_SHARED ?
+ "Shared" : "Private");
+ pr_info(" Priority: %s\n",
+ state->info.priority == SDEI_EVENT_PRIORITY_NORMAL ?
+ "Normal" : "Critical");
+ pr_info(" Signaled: %s\n",
+ state->info.signaled ? "Yes" : "No");
+ break;
+ case SDEI_1_1_FN_SDEI_EVENT_SIGNAL:
+ pr_info(" Handled: %s\n",
+ state->signal.handled ? "Yes" : "No");
+ pr_info(" IRQ: %s\n",
+ state->signal.irq ? "Yes" : "No");
+ pr_info(" Status: %s-%s-%s\n",
+ state->signal.status & (1 << SDEI_EVENT_STATUS_REGISTERED) ?
+ "Registered" : "x",
+ state->signal.status & (1 << SDEI_EVENT_STATUS_ENABLED) ?
+ "Enabled" : "x",
+ state->signal.status & (1 << SDEI_EVENT_STATUS_RUNNING) ?
+ "Running" : "x");
+ pr_info(" PC/PSTATE: %016lx %016lx\n",
+ state->signal.pc, state->signal.pstate);
+ pr_info(" Regs: %016lx %016lx %016lx %016lx\n",
+ state->signal.regs[0], state->signal.regs[1],
+ state->signal.regs[2], state->signal.regs[3]);
+ break;
+ }
+
+ if (index == ARRAY_SIZE(vcpu_commands))
+ pr_info("\n");
+}
+
+int main(int argc, char **argv)
+{
+ struct kvm_vm *vm;
+ struct kvm_sdei_cmd cmd;
+ struct kvm_sdei_exposed_event_state exposed_event;
+ uint32_t vcpu_ids[NR_VCPUS];
+ int i, ret;
+
+ if (!kvm_check_cap(KVM_CAP_ARM_SDEI)) {
+ pr_info("SDEI not supported\n");
+ return 0;
+ }
+
+ /* Create VM */
+ for (i = 0; i < NR_VCPUS; i++) {
+ vcpu_states[i].vcpu_id = i;
+ vcpu_ids[i] = i;
+ }
+
+ vm = vm_create_default_with_vcpus(NR_VCPUS, 0, 0,
+ guest_code, vcpu_ids);
+ vm_init_descriptor_tables(vm);
+ vm_install_exception_handler(vm, VECTOR_IRQ_CURRENT,
+ guest_irq_handler);
+
+ ucall_init(vm, NULL);
+
+ /* Ensure the version is v1.0.0 */
+ cmd.cmd = KVM_SDEI_CMD_GET_VERSION;
+ cmd.version = 0;
+ vm_ioctl(vm, KVM_ARM_SDEI_COMMAND, &cmd);
+ if (cmd.version != 0x10000) {
+ pr_info("v%d.%d.%d doesn't match with v1.0.0\n",
+ (cmd.version & 0xFF0000) >> 16,
+ (cmd.version & 0xFF00) >> 8,
+ (cmd.version & 0xFF));
+ return 0;
+ }
+
+ /* Expose the default SDEI event */
+ exposed_event.num = SDEI_EVENT_NUM;
+ exposed_event.type = SDEI_EVENT_TYPE_PRIVATE;
+ exposed_event.priority = SDEI_EVENT_PRIORITY_NORMAL;
+ exposed_event.signaled = 1;
+ exposed_event.notifier = 0;
+ cmd.cmd = KVM_SDEI_CMD_SET_EXPOSED_EVENT;
+ cmd.count = 1;
+ cmd.exposed_event_state = &exposed_event;
+ vm_ioctl(vm, KVM_ARM_SDEI_COMMAND, &cmd);
+
+ /* Start the vCPUs */
+ vcpu_send_command(vm, VCPU_COMMAND_IDLE);
+ for (i = 0; i < NR_VCPUS; i++) {
+ vcpu_states[i].vm = vm;
+ vcpu_args_set(vm, i, 1, i);
+ vcpu_init_descriptor_tables(vm, i);
+
+ ret = pthread_create(&vcpu_states[i].thread, NULL,
+ vcpu_thread, &vcpu_states[i]);
+ TEST_ASSERT(!ret, "Failed to create vCPU-%d pthread\n", i);
+ }
+
+ /* Wait for the idle command to complete */
+ ret = vcpu_wait(vm, 5);
+ TEST_ASSERT(ret, "Timeout to execute IDLE command\n");
+
+ /* Start the tests */
+ pr_info("\n");
+ pr_info(" NR_VCPUS: %d SDEI Event: 0x%08x\n\n",
+ NR_VCPUS, SDEI_EVENT_NUM);
+ for (i = 0; i < ARRAY_SIZE(vcpu_commands); i++) {
+ /*
+ * We depend on the SDEI_1_1_FN_SDEI_EVENT_SIGNAL hypercall
+ * to inject the SDEI event. The number of the injected event
+ * must be zero, so we have to skip the corresponding test
+ * if the SDEI event number isn't zero.
+ */
+ if (SDEI_EVENT_NUM != 0x0 &&
+ vcpu_commands[i].command == SDEI_1_1_FN_SDEI_EVENT_SIGNAL)
+ continue;
+
+ vcpu_send_command(vm, vcpu_commands[i].command);
+ ret = vcpu_wait(vm, 5);
+ if (!ret) {
+ pr_info("%s: Timeout\n", vcpu_commands[i].name);
+ return -1;
+ }
+
+ ret = vcpu_check_state(vm);
+ if (!ret) {
+ pr_info("%s: Fail\n", vcpu_commands[i].name);
+ return -1;
+ }
+
+ vcpu_dump_state(i);
+ }
+
+ /* Terminate the guests */
+ pr_info("\n Result: OK\n\n");
+ vcpu_send_command(vm, VCPU_COMMAND_EXIT);
+ sleep(1);
+
+ return 0;
+}
--
2.23.0

2022-03-22 08:39:04

by Gavin Shan

[permalink] [raw]
Subject: [PATCH v5 01/22] KVM: arm64: Introduce template for inline functions

The inline functions used to get the SMCCC parameters have the same
layout, so they can be generated from a unified template to simplify
the code. Besides, this adds similar inline functions
smccc_get_arg{4,5,6,7,8}() to fetch additional SMCCC arguments, which
are needed by the SDEI virtualization support.
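
For example, SMCCC_DECLARE_GET_ARG(4) expands to the following
inline function:

  static inline unsigned long smccc_get_arg4(struct kvm_vcpu *vcpu)
  {
          return vcpu_get_reg(vcpu, 4);
  }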

Signed-off-by: Gavin Shan <[email protected]>
---
include/kvm/arm_hypercalls.h | 24 ++++++++++++------------
1 file changed, 12 insertions(+), 12 deletions(-)

diff --git a/include/kvm/arm_hypercalls.h b/include/kvm/arm_hypercalls.h
index 0e2509d27910..d5144c852fe4 100644
--- a/include/kvm/arm_hypercalls.h
+++ b/include/kvm/arm_hypercalls.h
@@ -13,20 +13,20 @@ static inline u32 smccc_get_function(struct kvm_vcpu *vcpu)
return vcpu_get_reg(vcpu, 0);
}

-static inline unsigned long smccc_get_arg1(struct kvm_vcpu *vcpu)
-{
- return vcpu_get_reg(vcpu, 1);
-}
-
-static inline unsigned long smccc_get_arg2(struct kvm_vcpu *vcpu)
-{
- return vcpu_get_reg(vcpu, 2);
+#define SMCCC_DECLARE_GET_ARG(reg) \
+static inline unsigned long smccc_get_arg##reg(struct kvm_vcpu *vcpu) \
+{ \
+ return vcpu_get_reg(vcpu, reg); \
}

-static inline unsigned long smccc_get_arg3(struct kvm_vcpu *vcpu)
-{
- return vcpu_get_reg(vcpu, 3);
-}
+SMCCC_DECLARE_GET_ARG(1)
+SMCCC_DECLARE_GET_ARG(2)
+SMCCC_DECLARE_GET_ARG(3)
+SMCCC_DECLARE_GET_ARG(4)
+SMCCC_DECLARE_GET_ARG(5)
+SMCCC_DECLARE_GET_ARG(6)
+SMCCC_DECLARE_GET_ARG(7)
+SMCCC_DECLARE_GET_ARG(8)

static inline void smccc_set_retval(struct kvm_vcpu *vcpu,
unsigned long a0,
--
2.23.0

2022-03-22 08:46:03

by Gavin Shan

[permalink] [raw]
Subject: [PATCH v5 12/22] KVM: arm64: Support SDEI_{PRIVATE, SHARED}_RESET

This supports SDEI_{PRIVATE, SHARED}_RESET. They are used by the
guest to reset the private events on the calling vCPU or the
shared events on all vCPUs.

Signed-off-by: Gavin Shan <[email protected]>
---
arch/arm64/kvm/sdei.c | 31 +++++++++++++++++++++++++++++++
1 file changed, 31 insertions(+)

diff --git a/arch/arm64/kvm/sdei.c b/arch/arm64/kvm/sdei.c
index b2a916724cfa..0dec35a0eed1 100644
--- a/arch/arm64/kvm/sdei.c
+++ b/arch/arm64/kvm/sdei.c
@@ -637,6 +637,31 @@ static unsigned long hypercall_mask(struct kvm_vcpu *vcpu, bool mask)
return ret;
}

+static unsigned long hypercall_reset(struct kvm_vcpu *vcpu, bool private)
+{
+ struct kvm *kvm = vcpu->kvm;
+ struct kvm_sdei_kvm *ksdei = kvm->arch.sdei;
+ struct kvm_sdei_exposed_event *exposed_event;
+ struct kvm_sdei_registered_event *registered_event, *tmp;
+ unsigned long r, ret = SDEI_SUCCESS;
+
+ spin_lock(&ksdei->lock);
+
+ list_for_each_entry_safe(registered_event, tmp,
+ &ksdei->registered_events, link) {
+ exposed_event = registered_event->exposed_event;
+ if (private ^ kvm_sdei_is_shared(exposed_event->state.type))
+ continue;
+
+ r = unregister_one_event(kvm, NULL, registered_event);
+ ret = (r == SDEI_SUCCESS) ? ret : r;
+ }
+
+ spin_unlock(&ksdei->lock);
+
+ return ret;
+}
+
int kvm_sdei_hypercall(struct kvm_vcpu *vcpu)
{
struct kvm *kvm = vcpu->kvm;
@@ -700,8 +725,14 @@ int kvm_sdei_hypercall(struct kvm_vcpu *vcpu)
case SDEI_1_0_FN_SDEI_INTERRUPT_BIND:
case SDEI_1_0_FN_SDEI_INTERRUPT_RELEASE:
case SDEI_1_1_FN_SDEI_EVENT_SIGNAL:
+ ret = SDEI_NOT_SUPPORTED;
+ break;
case SDEI_1_0_FN_SDEI_PRIVATE_RESET:
+ ret = hypercall_reset(vcpu, true);
+ break;
case SDEI_1_0_FN_SDEI_SHARED_RESET:
+ ret = hypercall_reset(vcpu, false);
+ break;
case SDEI_1_1_FN_SDEI_FEATURES:
default:
ret = SDEI_NOT_SUPPORTED;
--
2.23.0

2022-03-22 08:47:44

by Gavin Shan

[permalink] [raw]
Subject: [PATCH v5 10/22] KVM: arm64: Support SDEI_EVENT_ROUTING_SET hypercall

This supports the SDEI_EVENT_ROUTING_SET hypercall. It's used by the
guest to set the routing mode and affinity for shared, registered
events. Configuring the routing mode and affinity for private events
is disallowed. It's also not allowed while the corresponding vCPU
events exist.
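
For illustration, the guest passes the event number, routing mode and
affinity as the hypercall arguments, roughly like this (a sketch based
on the smccc() helper from the selftest earlier in this thread; the
affinity value is a placeholder):

  status = smccc(SDEI_1_0_FN_SDEI_EVENT_ROUTING_SET, event_num,
                 SDEI_EVENT_REGISTER_RM_PE, affinity, 0, 0);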

Signed-off-by: Gavin Shan <[email protected]>
---
arch/arm64/kvm/sdei.c | 62 +++++++++++++++++++++++++++++++++++++++++++
1 file changed, 62 insertions(+)

diff --git a/arch/arm64/kvm/sdei.c b/arch/arm64/kvm/sdei.c
index 4f26e5f70bff..db82ea441eae 100644
--- a/arch/arm64/kvm/sdei.c
+++ b/arch/arm64/kvm/sdei.c
@@ -565,6 +565,66 @@ static unsigned long hypercall_info(struct kvm_vcpu *vcpu)
return ret;
}

+static unsigned long hypercall_route(struct kvm_vcpu *vcpu)
+{
+ struct kvm *kvm = vcpu->kvm;
+ struct kvm_sdei_kvm *ksdei = kvm->arch.sdei;
+ struct kvm_sdei_exposed_event *exposed_event;
+ struct kvm_sdei_registered_event *registered_event;
+ unsigned long event_num = smccc_get_arg1(vcpu);
+ unsigned long route_mode = smccc_get_arg2(vcpu);
+ unsigned long route_affinity = smccc_get_arg3(vcpu);
+ int index = 0;
+ unsigned long ret = SDEI_SUCCESS;
+
+ if (!kvm_sdei_is_supported(event_num)) {
+ ret = SDEI_INVALID_PARAMETERS;
+ goto out;
+ }
+
+ /*
+ * FIXME: The affinity should be verified when it's supported. We
+ * accept anything for now.
+ */
+ if (route_mode != SDEI_EVENT_REGISTER_RM_ANY &&
+ route_mode != SDEI_EVENT_REGISTER_RM_PE) {
+ ret = SDEI_INVALID_PARAMETERS;
+ goto out;
+ }
+
+ spin_lock(&ksdei->lock);
+
+ /* Check if the registered event exists */
+ registered_event = find_registered_event(kvm, event_num);
+ if (!registered_event) {
+ ret = SDEI_INVALID_PARAMETERS;
+ goto unlock;
+ }
+
+ /* Check the registered event is a shared one */
+ exposed_event = registered_event->exposed_event;
+ if (!kvm_sdei_is_shared(exposed_event->state.type)) {
+ ret = SDEI_DENIED;
+ goto unlock;
+ }
+
+ if (!kvm_sdei_is_registered(registered_event, index) ||
+ kvm_sdei_is_enabled(registered_event, index) ||
+ registered_event->vcpu_event_count > 0) {
+ ret = SDEI_DENIED;
+ goto unlock;
+ }
+
+ /* Update the registered event state */
+ registered_event->state.route_mode = route_mode;
+ registered_event->state.route_affinity = route_affinity;
+
+unlock:
+ spin_unlock(&ksdei->lock);
+out:
+ return ret;
+}
+
int kvm_sdei_hypercall(struct kvm_vcpu *vcpu)
{
struct kvm *kvm = vcpu->kvm;
@@ -617,6 +677,8 @@ int kvm_sdei_hypercall(struct kvm_vcpu *vcpu)
ret = hypercall_info(vcpu);
break;
case SDEI_1_0_FN_SDEI_EVENT_ROUTING_SET:
+ ret = hypercall_route(vcpu);
+ break;
case SDEI_1_0_FN_SDEI_PE_MASK:
case SDEI_1_0_FN_SDEI_PE_UNMASK:
case SDEI_1_0_FN_SDEI_INTERRUPT_BIND:
--
2.23.0

2022-03-22 08:48:56

by Gavin Shan

[permalink] [raw]
Subject: [PATCH v5 04/22] KVM: arm64: Support SDEI_EVENT_REGISTER hypercall

This supports the SDEI_EVENT_REGISTER hypercall, which is used by the
guest to register SDEI events. An SDEI event won't be raised until it
has been registered and enabled explicitly.

Only exposed events can be registered. For a shared event, the
registered event instance is created at registration time. For a
private event, the instance may already exist, in which case it isn't
created again.
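
For illustration, a guest registers an event by passing the entry
point, an argument and the routing information, roughly like this
(a sketch based on the smccc() helper from the selftest earlier in
this thread; the handler and argument names are placeholders):

  status = smccc(SDEI_1_0_FN_SDEI_EVENT_REGISTER, event_num,
                 (uint64_t)event_handler, (uint64_t)handler_arg,
                 SDEI_EVENT_REGISTER_RM_ANY, 0);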

Signed-off-by: Gavin Shan <[email protected]>
---
arch/arm64/kvm/sdei.c | 128 ++++++++++++++++++++++++++++++++++++++++++
1 file changed, 128 insertions(+)

diff --git a/arch/arm64/kvm/sdei.c b/arch/arm64/kvm/sdei.c
index 5a3a64cd6e84..2458dc666445 100644
--- a/arch/arm64/kvm/sdei.c
+++ b/arch/arm64/kvm/sdei.c
@@ -14,6 +14,35 @@
#include <kvm/arm_hypercalls.h>
#include <asm/kvm_sdei.h>

+static struct kvm_sdei_exposed_event *
+find_exposed_event(struct kvm *kvm, unsigned long num)
+{
+ struct kvm_sdei_kvm *ksdei = kvm->arch.sdei;
+ struct kvm_sdei_exposed_event *exposed_event;
+
+ list_for_each_entry(exposed_event, &ksdei->exposed_events, link) {
+ if (exposed_event->state.num == num)
+ return exposed_event;
+ }
+
+ return NULL;
+}
+
+static struct kvm_sdei_registered_event *
+find_registered_event(struct kvm *kvm, unsigned long num)
+{
+ struct kvm_sdei_kvm *ksdei = kvm->arch.sdei;
+ struct kvm_sdei_registered_event *registered_event;
+
+ list_for_each_entry(registered_event,
+ &ksdei->registered_events, link) {
+ if (registered_event->state.num == num)
+ return registered_event;
+ }
+
+ return NULL;
+}
+
static void remove_all_exposed_events(struct kvm *kvm)
{
struct kvm_sdei_kvm *ksdei = kvm->arch.sdei;
@@ -126,6 +155,103 @@ static unsigned long hypercall_version(struct kvm_vcpu *vcpu)
0x4b564d;
}

+static unsigned long hypercall_register(struct kvm_vcpu *vcpu)
+{
+ struct kvm *kvm = vcpu->kvm;
+ struct kvm_sdei_kvm *ksdei = kvm->arch.sdei;
+ struct kvm_sdei_exposed_event *exposed_event;
+ struct kvm_sdei_registered_event *registered_event;
+ unsigned long event_num = smccc_get_arg1(vcpu);
+ unsigned long event_ep_address = smccc_get_arg2(vcpu);
+ unsigned long event_ep_arg = smccc_get_arg3(vcpu);
+ unsigned long route_mode = smccc_get_arg4(vcpu);
+ unsigned long route_affinity = smccc_get_arg5(vcpu);
+ int index;
+ unsigned long ret = SDEI_SUCCESS;
+
+ if (!kvm_sdei_is_supported(event_num)) {
+ ret = SDEI_INVALID_PARAMETERS;
+ goto out;
+ }
+
+ if (route_mode != SDEI_EVENT_REGISTER_RM_ANY &&
+ route_mode != SDEI_EVENT_REGISTER_RM_PE) {
+ ret = SDEI_INVALID_PARAMETERS;
+ goto out;
+ }
+
+ spin_lock(&ksdei->lock);
+
+ /*
+ * The registered event may already exist if it's a private
+ * one. We needn't create another registered event instance
+ * in this case.
+ */
+ registered_event = find_registered_event(kvm, event_num);
+ if (registered_event) {
+ exposed_event = registered_event->exposed_event;
+ index = kvm_sdei_vcpu_index(vcpu, exposed_event);
+ if (kvm_sdei_is_registered(registered_event, index) ||
+ kvm_sdei_is_unregister_pending(registered_event, index)) {
+ ret = SDEI_DENIED;
+ goto unlock;
+ }
+
+ registered_event->state.route_mode = route_mode;
+ registered_event->state.route_affinity = route_affinity;
+ registered_event->state.ep_address[index] = event_ep_address;
+ registered_event->state.ep_arg[index] = event_ep_arg;
+ kvm_sdei_set_registered(registered_event, index);
+ goto unlock;
+ }
+
+ /* Check if the exposed event exists */
+ exposed_event = find_exposed_event(kvm, event_num);
+ if (!exposed_event) {
+ ret = SDEI_INVALID_PARAMETERS;
+ goto unlock;
+ }
+
+ /*
+ * Check if the count of registered event instances exceeds
+ * the limit.
+ */
+ if (ksdei->registered_event_count >= KVM_SDEI_MAX_EVENTS) {
+ ret = SDEI_OUT_OF_RESOURCE;
+ goto unlock;
+ }
+
+ /* Allocate the registered event instance */
+ registered_event = kzalloc(sizeof(*registered_event),
+ GFP_KERNEL_ACCOUNT);
+ if (!registered_event) {
+ ret = SDEI_OUT_OF_RESOURCE;
+ goto unlock;
+ }
+
+ /* Initialize the registered event state */
+ index = kvm_sdei_vcpu_index(vcpu, exposed_event);
+ registered_event->state.num = event_num;
+ registered_event->state.route_mode = route_mode;
+ registered_event->state.route_affinity = route_affinity;
+ registered_event->state.ep_address[index] = event_ep_address;
+ registered_event->state.ep_arg[index] = event_ep_arg;
+ registered_event->kvm = kvm;
+ registered_event->exposed_event = exposed_event;
+ registered_event->vcpu_event_count = 0;
+ kvm_sdei_set_registered(registered_event, index);
+
+ /* Add the registered event instance */
+ ksdei->registered_event_count++;
+ exposed_event->registered_event_count++;
+ list_add_tail(&registered_event->link, &ksdei->registered_events);
+
+unlock:
+ spin_unlock(&ksdei->lock);
+out:
+ return ret;
+}
+
int kvm_sdei_hypercall(struct kvm_vcpu *vcpu)
{
struct kvm *kvm = vcpu->kvm;
@@ -153,6 +279,8 @@ int kvm_sdei_hypercall(struct kvm_vcpu *vcpu)
ret = hypercall_version(vcpu);
break;
case SDEI_1_0_FN_SDEI_EVENT_REGISTER:
+ ret = hypercall_register(vcpu);
+ break;
case SDEI_1_0_FN_SDEI_EVENT_ENABLE:
case SDEI_1_0_FN_SDEI_EVENT_DISABLE:
case SDEI_1_0_FN_SDEI_EVENT_CONTEXT:
--
2.23.0

2022-03-22 08:51:08

by Gavin Shan

[permalink] [raw]
Subject: [PATCH v5 20/22] KVM: arm64: Export SDEI capability

The SDEI functionality is now ready to be exported. This adds a
new capability (KVM_CAP_ARM_SDEI) and exports it.
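
For example, a VMM can probe it with the generic KVM_CHECK_EXTENSION
ioctl (a sketch; enable_sdei() stands in for whatever the VMM does
once the capability is present):

  ret = ioctl(vm_fd, KVM_CHECK_EXTENSION, KVM_CAP_ARM_SDEI);
  if (ret > 0)
          enable_sdei();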

Signed-off-by: Gavin Shan <[email protected]>
---
Documentation/virt/kvm/api.rst | 10 ++++++++++
arch/arm64/kvm/arm.c | 3 +++
include/uapi/linux/kvm.h | 1 +
3 files changed, 14 insertions(+)

diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
index 9f3172376ec3..06cf27f37b4d 100644
--- a/Documentation/virt/kvm/api.rst
+++ b/Documentation/virt/kvm/api.rst
@@ -7575,3 +7575,13 @@ The argument to KVM_ENABLE_CAP is also a bitmask, and must be a subset
of the result of KVM_CHECK_EXTENSION. KVM will forward to userspace
the hypercalls whose corresponding bit is in the argument, and return
ENOSYS for the others.
+
+8.35 KVM_CAP_ARM_SDEI
+---------------------
+
+:Capability: KVM_CAP_ARM_SDEI
+:Architectures: arm64
+
+This capability indicates that the SDEI virtual service is supported
+in the host. A VMM can check whether the service is available to enable
+it.
diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
index 3f532e1c4a95..ae3b53dfe88f 100644
--- a/arch/arm64/kvm/arm.c
+++ b/arch/arm64/kvm/arm.c
@@ -282,6 +282,9 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
case KVM_CAP_ARM_PTRAUTH_GENERIC:
r = system_has_full_ptr_auth();
break;
+ case KVM_CAP_ARM_SDEI:
+ r = 1;
+ break;
default:
r = 0;
}
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index 2d11c909ec42..5772385639b8 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -1135,6 +1135,7 @@ struct kvm_ppc_resize_hpt {
#define KVM_CAP_XSAVE2 208
#define KVM_CAP_SYS_ATTRIBUTES 209
#define KVM_CAP_PPC_AIL_MODE_3 210
+#define KVM_CAP_ARM_SDEI 211

#ifdef KVM_CAP_IRQ_ROUTING

--
2.23.0

2022-03-22 20:12:24

by Oliver Upton

[permalink] [raw]
Subject: Re: [PATCH v5 00/22] Support SDEI Virtualization

On Tue, Mar 22, 2022 at 04:06:48PM +0800, Gavin Shan wrote:
> This series intends to virtualize Software Delegated Exception Interface
> (SDEI), which is defined by DEN0054C (v1.1). It allows the hypervisor to
> deliver NMI-alike SDEI event to guest and it's needed by Async PF to
> deliver page-not-present notification from hypervisor to guest. The code
> and the required qemu changes can be found from:
>
> https://developer.arm.com/documentation/den0054/c
> https://github.com/gwshan/linux ("kvm/arm64_sdei")
> https://github.com/gwshan/qemu ("kvm/arm64_sdei")
>
> For the design and migration needs, please refer to the document in
> PATCH[21/22] in this series. The series is organized as below:
>
> PATCH[01] Introduces template for smccc_get_argx()
> PATCH[02] Adds SDEI virtualization infrastructure
> PATCH[03-17] Supports various SDEI hypercalls and event handling
> PATCH[18-20] Adds ioctl commands to support migration and configuration
> and exports SDEI capability
> PATCH[21] Adds SDEI document
> PATCH[22] Adds SDEI selftest case
>
> Testing
> =======
>
> [1] The selftest case included in this series works fine. The default SDEI
> event, whose number is zero, can be registered, enabled, raised. The
> SDEI event handler can be invoked.
>
> [host]# pwd
> /home/gavin/sandbox/linux.main/tools/testing/selftests/kvm
> [root@virtlab-arm01 kvm]# ./aarch64/sdei
>
> NR_VCPUS: 2 SDEI Event: 0x00000000
>
> --- VERSION
> Version: 1.1 (vendor: 0x4b564d)
> --- FEATURES
> Shared event slots: 0
> Private event slots: 0
> Relative mode: No
> --- PRIVATE_RESET
> --- SHARED_RESET
> --- PE_UNMASK
> --- EVENT_GET_INFO
> Type: Private
> Priority: Normal
> Signaled: Yes
> --- EVENT_REGISTER
> --- EVENT_ENABLE
> --- EVENT_SIGNAL
> Handled: Yes
> IRQ: No
> Status: Registered-Enabled-Running
> PC/PSTATE: 000000000040232c 00000000600003c5
> Regs: 0000000000000000 0000000000000000
> 0000000000000000 0000000000000000
> --- PE_MASK
> --- EVENT_DISABLE
> --- EVENT_UNREGISTER
>
> Result: OK
>
> [2] There are additional patches in the following repositories to create
> procfs entries, allowing to inject SDEI event from host side. The
> SDEI client in the guest side registers the SDEI default event, whose
> number is zero. Also, the QEMU exports SDEI ACPI table and supports
> migration for SDEI.
>
> https://github.com/gwshan/linux ("kvm/arm64_sdei")
> https://github.com/gwshan/qemu ("kvm/arm64_sdei")
>
> [2.1] Start the guests and migrate the source VM to the destination
> VM.
>
> [host]# /home/gavin/sandbox/qemu.main/build/qemu-system-aarch64 \
> -accel kvm -machine virt,gic-version=host \
> -cpu host -smp 6,sockets=2,cores=3,threads=1 \
> -m 1024M,slots=16,maxmem=64G \
> : \
> -kernel /home/gavin/sandbox/linux.guest/arch/arm64/boot/Image \
> -initrd /home/gavin/sandbox/images/rootfs.cpio.xz \
> -append earlycon=pl011,mmio,0x9000000 \
> :
>
> [host]# /home/gavin/sandbox/qemu.main/build/qemu-system-aarch64 \
> -accel kvm -machine virt,gic-version=host \
> -cpu host -smp 6,sockets=2,cores=3,threads=1 \
> -m 1024M,slots=16,maxmem=64G \
> : \
> -kernel /home/gavin/sandbox/linux.guest/arch/arm64/boot/Image \
> -initrd /home/gavin/sandbox/images/rootfs.cpio.xz \
> -append earlycon=pl011,mmio,0x9000000 \
> -incoming tcp:0:4444 \
> :
>
> [2.2] Check kernel log on the source VM. The SDEI service is enabled
> and the default SDEI event (0x0) is enabled.
>
> [guest-src]# dmesg | grep -i sdei
> ACPI: SDEI 0x000000005BC80000 000024 \
> (v00 BOCHS BXPC 00000001 BXPC 00000001)
> sdei: SDEIv1.1 (0x4b564d) detected in firmware.
> SDEI TEST: Version 1.1, Vendor 0x4b564d
> sdei_init: SDEI event (0x0) registered
> sdei_init: SDEI event (0x0) enabled
>
>
> (qemu) migrate -d tcp:localhost:4444
>
> [2.3] Migrate the source VM to the destination VM. Inject SDEI event
> to the destination VM. The event is raised and handled.
>
> (qemu) migrate -d tcp:localhost:4444
>
> [host]# echo 0 > /proc/kvm/kvm-5360/vcpu-1
>
> [guest-dst]#
> =========== SDEI Event (CPU#1) ===========
> Event: 0000000000000000 Parameter: 00000000dabfdabf
> PC: ffff800008cbb554 PSTATE: 00000000604000c5 SP: ffff800009c7bde0
> Regs: 00000000000016ee ffff00001ffd2e28 00000000000016ed 0000000000000001
> ffff800016c28000 0000000000000000 0000000000000000 0000000000000000
> 0000000000000000 0000000000000000 0000000000000000 0000000000000000
> 0000000000000000 0000000000000000 0000000000000000 0000000000000000
> 0000000000000000 0000000000000000 0000000000000000 ffff800009399008
> ffff8000097d9af0 ffff8000097d99f8 ffff8000093a8db8 ffff8000097d9b18
> 0000000000000000 0000000000000000 ffff000000339d00 0000000000000000
> 0000000000000000 ffff800009c7bde0 ffff800008cbb5c4
> Context: 00000000000016ee ffff00001ffd2e28 00000000000016ed 0000000000000001
> ffff800016c28000 03ffffffffffffff 000000024325db59 ffff8000097de190
> ffff00000033a790 ffff800008cbb814 0000000000000a30 0000000000000000
>
> Changelog
> =========
> v5:

Next time can you include a link to the cover letter of the previous
patch set? It is extremely helpful for understanding the progress to
date and allows reviewers to see prior feedback.

--
Thanks,
Oliver

2022-03-24 09:34:21

by Oliver Upton

[permalink] [raw]
Subject: Re: [PATCH v5 18/22] KVM: arm64: Support SDEI ioctl commands on VM

On Tue, Mar 22, 2022 at 04:07:06PM +0800, Gavin Shan wrote:
> This supports ioctl commands on VM to manage the various objects.
> It's primarily used by VMM to accomplish migration. The ioctl
> commands introduced by this are highlighted as below:
>
> * KVM_SDEI_CMD_GET_VERSION
> Retrieve the version of current implementation. It's different
> from the version of the followed SDEI specification. This version
> is used to indicates what functionalities documented in the SDEI
> specification have been supported or not supported.

Don't we need a way to set the version as well? KVM is very much
responsible for upholding ABI of older specs. So, if a VMM and guest
expect SDEI v1.1, we can't just forcibly raise it to something else
during a migration.

The PSCI implementation is a great example of how KVM has grown its
implementation in line with a specification, all the while preserving
backwards compatibility.

> * KVM_SDEI_CMD_GET_EXPOSED_EVENT_COUNT
> Return the total count of exposed events.
>
> * KVM_SDEI_CMD_GET_EXPOSED_EVENT
> * KVM_SDEI_CMD_SET_EXPOSED_EVENT
> Get or set exposed event
>
> * KVM_SDEI_CMD_GET_REGISTERED_EVENT_COUNT
> Return the total count of registered events.
>
> * KVM_SDEI_CMD_GET_REGISTERED_EVENT
> * KVM_SDEI_CMD_SET_REGISTERED_EVENT
> Get or set registered event.

Any new UAPI needs to be documented in Documentation/virt/kvm/api.rst

Additionally, we desperately need a better, generic way to save/restore
VM scoped state. IMO, we should only be adding ioctls if we are
affording userspace a meaningful interface. Every save/restore pair of
ioctls winds up wasting precious ioctl numbers and requires userspace
take a change to read/write an otherwise opaque value.

Marc had made some suggestions in this area already that Raghavendra
experimented with [1], and I think its time to meaningfully consider
our options. Basically, KVM_GET_REG_LIST needs to convey whether a
particular register is VM or vCPU state. We only need to save/restore a
VM state register once. That way, userspace doesn't have to care about
the underlying data and the next piece of VM state that comes along
doesn't require an ioctl nr nor VMM participation.

[1]: http://lore.kernel.org/r/[email protected]

--
Thanks,
Oliver

2022-03-24 12:24:00

by Gavin Shan

[permalink] [raw]
Subject: Re: [PATCH v5 01/22] KVM: arm64: Introduce template for inline functions

Hi Oliver,

On 3/23/22 3:42 AM, Oliver Upton wrote:
> On Tue, Mar 22, 2022 at 04:06:49PM +0800, Gavin Shan wrote:
>> The inline functions used to get the SMCCC parameters have same
>> layout. It means these functions can be presented by an unified
>> template, to make the code simplified. Besides, this adds more
>> similar inline functions like smccc_get_arg{4,5,6,7,8}() to get
>> more SMCCC arguments, which are needed by SDEI virtualization
>> support.
>>
>> Signed-off-by: Gavin Shan <[email protected]>
>> ---
>> include/kvm/arm_hypercalls.h | 24 ++++++++++++------------
>> 1 file changed, 12 insertions(+), 12 deletions(-)
>>
>> diff --git a/include/kvm/arm_hypercalls.h b/include/kvm/arm_hypercalls.h
>> index 0e2509d27910..d5144c852fe4 100644
>> --- a/include/kvm/arm_hypercalls.h
>> +++ b/include/kvm/arm_hypercalls.h
>> @@ -13,20 +13,20 @@ static inline u32 smccc_get_function(struct kvm_vcpu *vcpu)
>> return vcpu_get_reg(vcpu, 0);
>> }
>>
>> -static inline unsigned long smccc_get_arg1(struct kvm_vcpu *vcpu)
>> -{
>> - return vcpu_get_reg(vcpu, 1);
>> -}
>> -
>> -static inline unsigned long smccc_get_arg2(struct kvm_vcpu *vcpu)
>> -{
>> - return vcpu_get_reg(vcpu, 2);
>> +#define SMCCC_DECLARE_GET_ARG(reg) \
>> +static inline unsigned long smccc_get_arg##reg(struct kvm_vcpu *vcpu) \
>> +{ \
>> + return vcpu_get_reg(vcpu, reg); \
>> }
>>
>> -static inline unsigned long smccc_get_arg3(struct kvm_vcpu *vcpu)
>> -{
>> - return vcpu_get_reg(vcpu, 3);
>> -}
>> +SMCCC_DECLARE_GET_ARG(1)
>> +SMCCC_DECLARE_GET_ARG(2)
>> +SMCCC_DECLARE_GET_ARG(3)
>> +SMCCC_DECLARE_GET_ARG(4)
>> +SMCCC_DECLARE_GET_ARG(5)
>> +SMCCC_DECLARE_GET_ARG(6)
>> +SMCCC_DECLARE_GET_ARG(7)
>> +SMCCC_DECLARE_GET_ARG(8)
>
> Hmm. What if we specify a single inline function where the caller passes
> the arg # as a parameter? We really just want to abstract away the
> off-by-one difference between GP registers and SMCCC arguments.
>
> Macros generally make me uneasy for template functions, but I may be in
> the vocal minority on this topic :)
>

I think it's a good idea to have smccc_get_arg(unsigned char index).
However, it will cause more code changes because the following functions
are already in use. Anyway, I think it's still worthwhile to pass @index
to select the argument (see the sketch after the list below). I will
change it accordingly in the next respin.

smccc_get_arg1()
smccc_get_arg2()
smccc_get_arg3()
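
For instance, the unified accessor could look like this (a sketch; the
final name and signature are to be settled in the respin):

  static inline unsigned long smccc_get_arg(struct kvm_vcpu *vcpu,
                                            unsigned char index)
  {
          /* SMCCC argument N is carried in GP register xN */
          return vcpu_get_reg(vcpu, index);
  }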

Thanks,
Gavin

2022-03-24 15:45:28

by Gavin Shan

[permalink] [raw]
Subject: Re: [PATCH v5 00/22] Support SDEI Virtualization

Hi Oliver,

On 3/23/22 2:13 AM, Oliver Upton wrote:
> On Tue, Mar 22, 2022 at 04:06:48PM +0800, Gavin Shan wrote:
>> This series intends to virtualize Software Delegated Exception Interface
>> (SDEI), which is defined by DEN0054C (v1.1). It allows the hypervisor to
>> deliver NMI-alike SDEI event to guest and it's needed by Async PF to
>> deliver page-not-present notification from hypervisor to guest. The code
>> and the required qemu changes can be found from:
>>
>> https://developer.arm.com/documentation/den0054/c
>> https://github.com/gwshan/linux ("kvm/arm64_sdei")
>> https://github.com/gwshan/qemu ("kvm/arm64_sdei")
>>
>> For the design and migration needs, please refer to the document in
>> PATCH[21/22] in this series. The series is organized as below:
>>
>> PATCH[01] Introduces template for smccc_get_argx()
>> PATCH[02] Adds SDEI virtualization infrastructure
>> PATCH[03-17] Supports various SDEI hypercalls and event handling
>> PATCH[18-20] Adds ioctl commands to support migration and configuration
>> and exports SDEI capability
>> PATCH[21] Adds SDEI document
>> PATCH[22] Adds SDEI selftest case
>>
>> Testing
>> =======
>>
>> [1] The selftest case included in this series works fine. The default SDEI
>> event, whose number is zero, can be registered, enabled, raised. The
>> SDEI event handler can be invoked.
>>
>> [host]# pwd
>> /home/gavin/sandbox/linux.main/tools/testing/selftests/kvm
>> [root@virtlab-arm01 kvm]# ./aarch64/sdei
>>
>> NR_VCPUS: 2 SDEI Event: 0x00000000
>>
>> --- VERSION
>> Version: 1.1 (vendor: 0x4b564d)
>> --- FEATURES
>> Shared event slots: 0
>> Private event slots: 0
>> Relative mode: No
>> --- PRIVATE_RESET
>> --- SHARED_RESET
>> --- PE_UNMASK
>> --- EVENT_GET_INFO
>> Type: Private
>> Priority: Normal
>> Signaled: Yes
>> --- EVENT_REGISTER
>> --- EVENT_ENABLE
>> --- EVENT_SIGNAL
>> Handled: Yes
>> IRQ: No
>> Status: Registered-Enabled-Running
>> PC/PSTATE: 000000000040232c 00000000600003c5
>> Regs: 0000000000000000 0000000000000000
>> 0000000000000000 0000000000000000
>> --- PE_MASK
>> --- EVENT_DISABLE
>> --- EVENT_UNREGISTER
>>
>> Result: OK
>>
>> [2] There are additional patches in the following repositories to create
>> procfs entries, allowing to inject SDEI event from host side. The
>> SDEI client in the guest side registers the SDEI default event, whose
>> number is zero. Also, the QEMU exports SDEI ACPI table and supports
>> migration for SDEI.
>>
>> https://github.com/gwshan/linux ("kvm/arm64_sdei")
>> https://github.com/gwshan/qemu ("kvm/arm64_sdei")
>>
>> [2.1] Start the guests and migrate the source VM to the destination
>> VM.
>>
>> [host]# /home/gavin/sandbox/qemu.main/build/qemu-system-aarch64 \
>> -accel kvm -machine virt,gic-version=host \
>> -cpu host -smp 6,sockets=2,cores=3,threads=1 \
>> -m 1024M,slots=16,maxmem=64G \
>> : \
>> -kernel /home/gavin/sandbox/linux.guest/arch/arm64/boot/Image \
>> -initrd /home/gavin/sandbox/images/rootfs.cpio.xz \
>> -append earlycon=pl011,mmio,0x9000000 \
>> :
>>
>> [host]# /home/gavin/sandbox/qemu.main/build/qemu-system-aarch64 \
>> -accel kvm -machine virt,gic-version=host \
>> -cpu host -smp 6,sockets=2,cores=3,threads=1 \
>> -m 1024M,slots=16,maxmem=64G \
>> : \
>> -kernel /home/gavin/sandbox/linux.guest/arch/arm64/boot/Image \
>> -initrd /home/gavin/sandbox/images/rootfs.cpio.xz \
>> -append earlycon=pl011,mmio,0x9000000 \
>> -incoming tcp:0:4444 \
>> :
>>
>> [2.2] Check kernel log on the source VM. The SDEI service is enabled
>> and the default SDEI event (0x0) is enabled.
>>
>> [guest-src]# dmesg | grep -i sdei
>> ACPI: SDEI 0x000000005BC80000 000024 \
>> (v00 BOCHS BXPC 00000001 BXPC 00000001)
>> sdei: SDEIv1.1 (0x4b564d) detected in firmware.
>> SDEI TEST: Version 1.1, Vendor 0x4b564d
>> sdei_init: SDEI event (0x0) registered
>> sdei_init: SDEI event (0x0) enabled
>>
>>
>> (qemu) migrate -d tcp:localhost:4444
>>
>> [2.3] Migrate the source VM to the destination VM. Inject SDEI event
>> to the destination VM. The event is raised and handled.
>>
>> (qemu) migrate -d tcp:localhost:4444
>>
>> [host]# echo 0 > /proc/kvm/kvm-5360/vcpu-1
>>
>> [guest-dst]#
>> =========== SDEI Event (CPU#1) ===========
>> Event: 0000000000000000 Parameter: 00000000dabfdabf
>> PC: ffff800008cbb554 PSTATE: 00000000604000c5 SP: ffff800009c7bde0
>> Regs: 00000000000016ee ffff00001ffd2e28 00000000000016ed 0000000000000001
>> ffff800016c28000 0000000000000000 0000000000000000 0000000000000000
>> 0000000000000000 0000000000000000 0000000000000000 0000000000000000
>> 0000000000000000 0000000000000000 0000000000000000 0000000000000000
>> 0000000000000000 0000000000000000 0000000000000000 ffff800009399008
>> ffff8000097d9af0 ffff8000097d99f8 ffff8000093a8db8 ffff8000097d9b18
>> 0000000000000000 0000000000000000 ffff000000339d00 0000000000000000
>> 0000000000000000 ffff800009c7bde0 ffff800008cbb5c4
>> Context: 00000000000016ee ffff00001ffd2e28 00000000000016ed 0000000000000001
>> ffff800016c28000 03ffffffffffffff 000000024325db59 ffff8000097de190
>> ffff00000033a790 ffff800008cbb814 0000000000000a30 0000000000000000
>>
>> Changelog
>> =========
>> v5:
>
> Next time can you include a link to the cover letter of the previous
> patch set? It is extremely helpful for understanding the progress to
> date and allows reviewers to see prior feedback.
>

Yep, I will provide the link to the cover letter of the previous version.
I'm amending it this time:

https://lore.kernel.org/all/[email protected]/

Besides, I don't know what happened to my "git send-email". Some of the
recipients were skipped even though I had put them into the cc list. Let's
amend it here to avoid resending the series.

Thanks,
Gavin


2022-03-25 17:34:10

by Gavin Shan

[permalink] [raw]
Subject: Re: [PATCH v5 18/22] KVM: arm64: Support SDEI ioctl commands on VM

Hi Oliver,

On 3/24/22 1:28 AM, Oliver Upton wrote:
> On Tue, Mar 22, 2022 at 04:07:06PM +0800, Gavin Shan wrote:
>> This supports ioctl commands on VM to manage the various objects.
>> It's primarily used by VMM to accomplish migration. The ioctl
>> commands introduced by this are highlighted as below:
>>
>> * KVM_SDEI_CMD_GET_VERSION
>> Retrieve the version of current implementation. It's different
>> from the version of the followed SDEI specification. This version
>> is used to indicates what functionalities documented in the SDEI
>> specification have been supported or not supported.
>
> Don't we need a way to set the version as well? KVM is very much
> responsible for upholding ABI of older specs. So, if a VMM and guest
> expect SDEI v1.1, we can't just forcibly raise it to something else
> during a migration.
>
> The PSCI implementation is a great example of how KVM has grown its
> implementation in line with a specification, all the while preserving
> backwards compatibility.
>

The only information fed in by the VMM is the exposed events. An event
can't be registered from the guest kernel, or raised from host to guest,
until it has been exposed by the VMM. Besides, the exposed events will
be defined statically in host/KVM as we discussed on PATCH[02/22]. We
also discussed eliminating those ioctl commands. So I think we needn't
add KVM_SDEI_CMD_SET_VERSION. Furthermore, the version is only a
concern for the host itself if the migration can be done through the
firmware pseudo system registers, since migration compatibility
is the only concern for the VMM (QEMU).

Yes, currently the 0.1/0.2/1.0 versions are supported by PSCI. 0.1 is
picked until the VMM asks for 0.2 or 1.0 explicitly. However, it seems
QEMU isn't using PSCI 1.0 yet and more patches may be needed to enable
it.

>> * KVM_SDEI_CMD_GET_EXPOSED_EVENT_COUNT
>> Return the total count of exposed events.
>>
>> * KVM_SDEI_CMD_GET_EXPOSED_EVENT
>> * KVM_SDEI_CMD_SET_EXPOSED_EVENT
>> Get or set exposed event
>>
>> * KVM_SDEI_CMD_GET_REGISTERED_EVENT_COUNT
>> Return the total count of registered events.
>>
>> * KVM_SDEI_CMD_GET_REGISTERED_EVENT
>> * KVM_SDEI_CMD_SET_REGISTERED_EVENT
>> Get or set registered event.
>
> Any new UAPI needs to be documented in Documentation/virt/kvm/api.rst
>
> Additionally, we desperately need a better, generic way to save/restore
> VM scoped state. IMO, we should only be adding ioctls if we are
> affording userspace a meaningful interface. Every save/restore pair of
> ioctls winds up wasting precious ioctl numbers and requires userspace
> take a change to read/write an otherwise opaque value.
>
> Marc had made some suggestions in this area already that Raghavendra
> experimented with [1], and I think its time to meaningfully consider
> our options. Basically, KVM_GET_REG_LIST needs to convey whether a
> particular register is VM or vCPU state. We only need to save/restore a
> VM state register once. That way, userspace doesn't have to care about
> the underlying data and the next piece of VM state that comes along
> doesn't require an ioctl nr nor VMM participation.
>
> [1]: http://lore.kernel.org/r/[email protected]
>

Thanks for the pointer to Raghavendra's series. The firmware pseudo
system registers have been classified into VM-scoped and vCPU-scoped in
the series. I think it fits the SDEI migration requirements very well.
The shared events can even be migrated through the VM-scoped firmware
pseudo system registers. However, I don't plan to support that in the
next revision (v6) as the currently needed events are all private. I may
spend more time going through Raghavendra's series later.

Thanks,
Gavin

2022-03-25 17:58:42

by Gavin Shan

[permalink] [raw]
Subject: Re: [PATCH v5 19/22] KVM: arm64: Support SDEI ioctl commands on vCPU

Hi Oliver,

On 3/24/22 1:55 AM, Oliver Upton wrote:
> On Tue, Mar 22, 2022 at 04:07:07PM +0800, Gavin Shan wrote:
>> This supports ioctl commands on vCPU to manage the various object.
>> It's primarily used by VMM to accomplish migration. The ioctl
>> commands introduced by this are highlighted as below:
>>
>> * KVM_SDEI_CMD_GET_VCPU_EVENT_COUNT
>> Return the total count of vCPU events, which have been queued
>> on the target vCPU.
>>
>> * KVM_SDEI_CMD_GET_VCPU_EVENT
>> * KVM_SDEI_CMD_SET_VCPU_EVENT
>> Get or set vCPU events.
>>
>> * KVM_SDEI_CMD_GET_VCPU_STATE
>> * KVM_SDEI_CMD_SET_VCPU_STATE
>> Get or set vCPU state.
>
> All of this GET/SET stuff can probably be added to KVM_{GET,SET}_ONE_REG
> immediately. Just introduce new registers any time a new event comes
> along. The only event we have at the end of this series is the
> software-signaled event, with async PF coming later right?
>
> Some special consideration is likely necessary to avoid adding a
> register for every u64 chunk of data. I don't think we need to afford
> userspace any illusion of granularity with these, and can probably lump
> it all under one giant pseudoregister.
>

Yes, KVM_{GET,SET}_ONE_REG is the ideal interface for migration. You're
correct that we're only concerned with the software-signaled event and
the one for Async PF.

I didn't look into Raghavendra's series deeply. Actually, a lump of
registers can be avoided once a 2048-byte size is specified in the
register encoding. I think 2048 bytes are enough for now since there
are only two supported events.

In the future, we will probably have a varying number of SDEI events
to be migrated. In that case, we need to add a new bit to the encoding
of the pseudo system register, so that the VMM (QEMU) can support
variable-sized system registers and keep reading and writing these
registers on migration:

PSEUDO_SDEI_ADDR: 64 bits in width
PSEUDO_SDEI_DATA: variable size

PSEUDO_SDEI_ADDR is used to (1) indicate the size of PSEUDO_SDEI_DATA
and (2) select the information to be read or written, for example the
(shared/private) registered events on the VM and vCPU, or the vCPU
state.

PSEUDO_SDEI_DATA holds (1) the retrieved information, or the
information to be written, and (2) flags indicating whether the
current block of information is the last one.
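
As a purely illustrative sketch of that layout (none of these names
exist anywhere, they just make the idea concrete):

#include <linux/types.h>

struct pseudo_sdei_addr {
        __u32   what;   /* which object: registered events, vCPU state, ... */
        __u32   size;   /* size of the PSEUDO_SDEI_DATA payload in bytes    */
};

struct pseudo_sdei_data {
        __u64   flags;      /* bit 0 set on the last block           */
        __u8    payload[];  /* one block of the selected information */
};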

>> * KVM_SDEI_CMD_INJECT_EVENT
>> Inject SDEI event.
>
> What events are we going to allow userspace to inject? IIUC, the
> software-signaled event is an IPI and really under the control of the
> guest. Async PF is entriely under KVM control.
>
> I do agree that having some form of event injection would be great. VM
> providers have found it useful to allow users to NMI their VMs when they
> get wedged. I just believe that userspace should not be able to trigger
> events that have a defined meaning and are under full KVM ownership.
>
> IMO, unless the async PF changes need to go out to userspace, you could
> probably skip event injection for now and only worry about SDEI within a
> VM.
>

I was overthinking the usage of SDEI. I had assumed that SDEI might be
used by emulated devices in the VMM to inject SDEI events, and that
could even be done through PSEUDO_SDEI_{ADDR, DATA} in the future. For
now, only the software-signaled and Async PF events are of concern,
and the Async PF event is always raised by host/KVM. So we don't need
event injection for now, and I will drop this functionality, and in
fact all the ioctl commands and migration support, in the next respin,
as you suggested :)

Thanks,
Gavin

2022-03-25 18:25:15

by Gavin Shan

[permalink] [raw]
Subject: Re: [PATCH v5 19/22] KVM: arm64: Support SDEI ioctl commands on vCPU

Hi Oliver,

On 3/25/22 4:37 PM, Oliver Upton wrote:
> On Fri, Mar 25, 2022 at 03:59:50PM +0800, Gavin Shan wrote:
>> On 3/24/22 1:55 AM, Oliver Upton wrote:
>>> On Tue, Mar 22, 2022 at 04:07:07PM +0800, Gavin Shan wrote:
>>>> This supports ioctl commands on vCPU to manage the various object.
>>>> It's primarily used by VMM to accomplish migration. The ioctl
>>>> commands introduced by this are highlighted as below:
>>>>
>>>> * KVM_SDEI_CMD_GET_VCPU_EVENT_COUNT
>>>> Return the total count of vCPU events, which have been queued
>>>> on the target vCPU.
>>>>
>>>> * KVM_SDEI_CMD_GET_VCPU_EVENT
>>>> * KVM_SDEI_CMD_SET_VCPU_EVENT
>>>> Get or set vCPU events.
>>>>
>>>> * KVM_SDEI_CMD_GET_VCPU_STATE
>>>> * KVM_SDEI_CMD_SET_VCPU_STATE
>>>> Get or set vCPU state.
>>>
>>> All of this GET/SET stuff can probably be added to KVM_{GET,SET}_ONE_REG
>>> immediately. Just introduce new registers any time a new event comes
>>> along. The only event we have at the end of this series is the
>>> software-signaled event, with async PF coming later right?
>>>
>>> Some special consideration is likely necessary to avoid adding a
>>> register for every u64 chunk of data. I don't think we need to afford
>>> userspace any illusion of granularity with these, and can probably lump
>>> it all under one giant pseudoregister.
>>>
>>
>> Yes, KVM_{GET,SET}_ONE_REG is the ideal interface for migration. You're
>> correct we're only concerned by software signaled event and the one for
>> Async PF.
>>
>> I didn't look into Raghavendra's series deeply. Actually, a lump of
>> registers can be avoid after 2048 byte size is specified in its
>> encoding. I think 2048 bytes are enough for now since there are
>> only two supported events.
>
> When I had said 'one giant pseudoregister' I actually meant 'one
> pseudoregister per event', not all of SDEI into a single
> structure. Since most of this is in flux now, it is hard to point
> out what/how we should migrate from conversation alone.
>
> And on the topic of Raghavendra's series, I do not believe it is
> required anymore here w/ the removal of shared events, which I'm
> strongly in favor of.
>
> Let's delve deeper into migration on the next series :)
>

Ok, thanks for your clarification about 'one giant pseudoregister'.
Let's have more discussion about migration on the next revision. To be
clear, I plan to implement the base functionality first, where only
the private event is supported. After it becomes mergeable or is
merged, we can post the add-on series to support migration.

>> In the future, we probably have varied number of SDEI events to
>> be migrated. In that case, we need to add a new bit to the encoding
>> of the pseudo system register, so that VMM (QEMU) can support
>> variable sized system register and keep reading and writing on
>> these registers on migration:
>>
>> PSEUDO_SDEI_ADDR: 64-bits in width
>> PSEUDO_SDEI_DATA: has varied size
>>
>> PSEUDO_SDEI_ADDR is used to (1) Indicate the size of PSEUDO_SDEI_DATA
>> (2) The information to be read/written, for example the (shared/private)
>> registered events on VM and vCPU, VCPU state.
>>
>> PSEUDO_SDEI_DATA is used to (1) Retrieved information or that to be
>> written. (2) Flags to indicate current block of information is the
>> last one or not.
>
> I don't think we have sufficient encoding space in the register ID
> to allow for arbitrary length registers. Any new registers for SDEI will
> need to fit into one of the predefined sizes. Note that we've already
> conditioned userspace to handle registers this way and anything else is
> an ABI change.
>

Ok, I think we need padding in the event structures, to make the event
object 64-byte aligned and the vCPU state 512-byte aligned. 64-byte
and 128-byte registers are already supported.
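
As a rough sketch of what I mean (the struct and its fields are made
up, only the padding idea matters):

struct kvm_sdei_event_state {
        __u64   num;
        __u64   ep_address;
        __u64   ep_arg;
        __u64   state;
        __u8    pad[32];    /* pad the object out to 64 bytes */
};

/* 64 bytes matches a 512-bit (KVM_REG_SIZE_U512) register. */
_Static_assert(sizeof(struct kvm_sdei_event_state) == 64,
               "event state must fill a 64-byte register");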

Thanks,
Gavin

2022-03-25 18:38:47

by Oliver Upton

[permalink] [raw]
Subject: Re: [PATCH v5 19/22] KVM: arm64: Support SDEI ioctl commands on vCPU

On Fri, Mar 25, 2022 at 03:59:50PM +0800, Gavin Shan wrote:
> Hi Oliver,
>
> On 3/24/22 1:55 AM, Oliver Upton wrote:
> > On Tue, Mar 22, 2022 at 04:07:07PM +0800, Gavin Shan wrote:
> > > This supports ioctl commands on vCPU to manage the various object.
> > > It's primarily used by VMM to accomplish migration. The ioctl
> > > commands introduced by this are highlighted as below:
> > >
> > > * KVM_SDEI_CMD_GET_VCPU_EVENT_COUNT
> > > Return the total count of vCPU events, which have been queued
> > > on the target vCPU.
> > >
> > > * KVM_SDEI_CMD_GET_VCPU_EVENT
> > > * KVM_SDEI_CMD_SET_VCPU_EVENT
> > > Get or set vCPU events.
> > >
> > > * KVM_SDEI_CMD_GET_VCPU_STATE
> > > * KVM_SDEI_CMD_SET_VCPU_STATE
> > > Get or set vCPU state.
> >
> > All of this GET/SET stuff can probably be added to KVM_{GET,SET}_ONE_REG
> > immediately. Just introduce new registers any time a new event comes
> > along. The only event we have at the end of this series is the
> > software-signaled event, with async PF coming later right?
> >
> > Some special consideration is likely necessary to avoid adding a
> > register for every u64 chunk of data. I don't think we need to afford
> > userspace any illusion of granularity with these, and can probably lump
> > it all under one giant pseudoregister.
> >
>
> Yes, KVM_{GET,SET}_ONE_REG is the ideal interface for migration. You're
> correct we're only concerned by software signaled event and the one for
> Async PF.
>
> I didn't look into Raghavendra's series deeply. Actually, a lump of
> registers can be avoid after 2048 byte size is specified in its
> encoding. I think 2048 bytes are enough for now since there are
> only two supported events.

When I said 'one giant pseudoregister' I actually meant 'one
pseudoregister per event', not cramming all of SDEI into a single
structure. Since most of this is in flux now, it is hard to pin down
what/how we should migrate from conversation alone.

And on the topic of Raghavendra's series, I do not believe it is
required anymore here w/ the removal of shared events, which I'm
strongly in favor of.

Let's delve deeper into migration on the next series :)

> In the future, we probably have varied number of SDEI events to
> be migrated. In that case, we need to add a new bit to the encoding
> of the pseudo system register, so that VMM (QEMU) can support
> variable sized system register and keep reading and writing on
> these registers on migration:
>
> PSEUDO_SDEI_ADDR: 64-bits in width
> PSEUDO_SDEI_DATA: has varied size
>
> PSEUDO_SDEI_ADDR is used to (1) Indicate the size of PSEUDO_SDEI_DATA
> (2) The information to be read/written, for example the (shared/private)
> registered events on VM and vCPU, VCPU state.
>
> PSEUDO_SDEI_DATA is used to (1) Retrieved information or that to be
> written. (2) Flags to indicate current block of information is the
> last one or not.

I don't think we have sufficient encoding space in the register ID
to allow for arbitrary length registers. Any new registers for SDEI will
need to fit into one of the predefined sizes. Note that we've already
conditioned userspace to handle registers this way and anything else is
an ABI change.

--
Thanks,
Oliver

2022-03-25 19:03:14

by Oliver Upton

[permalink] [raw]
Subject: Re: [PATCH v5 19/22] KVM: arm64: Support SDEI ioctl commands on vCPU

On Tue, Mar 22, 2022 at 04:07:07PM +0800, Gavin Shan wrote:
> This supports ioctl commands on vCPU to manage the various object.
> It's primarily used by VMM to accomplish migration. The ioctl
> commands introduced by this are highlighted as below:
>
> * KVM_SDEI_CMD_GET_VCPU_EVENT_COUNT
> Return the total count of vCPU events, which have been queued
> on the target vCPU.
>
> * KVM_SDEI_CMD_GET_VCPU_EVENT
> * KVM_SDEI_CMD_SET_VCPU_EVENT
> Get or set vCPU events.
>
> * KVM_SDEI_CMD_GET_VCPU_STATE
> * KVM_SDEI_CMD_SET_VCPU_STATE
> Get or set vCPU state.

All of this GET/SET stuff can probably be added to KVM_{GET,SET}_ONE_REG
immediately. Just introduce new registers any time a new event comes
along. The only event we have at the end of this series is the
software-signaled event, with async PF coming later right?

Some special consideration is likely necessary to avoid adding a
register for every u64 chunk of data. I don't think we need to afford
userspace any illusion of granularity with these, and can probably lump
it all under one giant pseudoregister.

> * KVM_SDEI_CMD_INJECT_EVENT
> Inject SDEI event.

What events are we going to allow userspace to inject? IIUC, the
software-signaled event is an IPI and really under the control of the
guest. Async PF is entirely under KVM control.

I do agree that having some form of event injection would be great. VM
providers have found it useful to allow users to NMI their VMs when they
get wedged. I just believe that userspace should not be able to trigger
events that have a defined meaning and are under full KVM ownership.

IMO, unless the async PF changes need to go out to userspace, you could
probably skip event injection for now and only worry about SDEI within a
VM.

--
Thanks,
Oliver

2022-03-25 19:05:06

by Gavin Shan

[permalink] [raw]
Subject: Re: [PATCH v5 18/22] KVM: arm64: Support SDEI ioctl commands on VM

Hi Oliver,

On 3/25/22 3:35 PM, Oliver Upton wrote:
> On Fri, Mar 25, 2022 at 02:59:52PM +0800, Gavin Shan wrote:
>>> The PSCI implementation is a great example of how KVM has grown its
>>> implementation in line with a specification, all the while preserving
>>> backwards compatibility.
>>>
>>
>> The only information feed by VMM is the exposed events. The events
>> can't be registered from guest kernel, and raised from host to guest
>> kernel until it's exposed by VMM.
>
> I would suggest assuming that all SDEI events are exposed by default in
> KVM. We will not require a VMM change to enable events individually.
>

Ok, that is exactly what I did in v4, but the event is exposed by the
VMM in v5. In v6, it will be statically defined again :)

>> Besides, the exposed events will
>> be defined staticly in host/KVM as we discussed on PATCH[02/22]. We
>> also discussed to eliminate those ioctl commands. So I think we needn't
>> to add KVM_SDEI_CMD_SET_VERSION. Further more, the version is only a
>> concern to host itself if the migration can be done through the
>> firmware pseudo system registers since the migration compatibility
>> is the only concern to VMM (QEMU).
>
> This all needs to work just like the KVM_REG_ARM_PSCI_VERSION version,
> I'd recommend taking a look at how we handle that register in KVM.
>

Ok. I will do the necessary investigation :)
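
Roughly, I expect something like the hypothetical sketch below, where
reads report the current version and writes are only accepted for
versions the host can emulate, much as the PSCI version register
behaves. All of the names here (the version constants and
kvm->arch.sdei_version) are placeholders, not existing code:

static u64 kvm_sdei_get_version(struct kvm *kvm)
{
        return kvm->arch.sdei_version;
}

static int kvm_sdei_set_version(struct kvm *kvm, u64 val)
{
        /* Reject anything the host doesn't know how to emulate. */
        if (val != SDEI_VERSION_1_0 && val != SDEI_VERSION_1_1)
                return -EINVAL;

        kvm->arch.sdei_version = val;
        return 0;
}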

>> Yes, Currently, 0.1/0.2/1.0 versions are supported by PSCI. 0.1 is
>> picked until VMM asks for 0.2 and 1.0 explicitly. However, it seems
>> QEMU isn't using 1.0 PSCI yet and maybe more patch is needed to enable
>> it.
>
> As far as how it interacts with KVM, QEMU looks fine. The name of the
> KVM_ARM_VCPU_PSCI_0_2 bit is quite frustrating. It actually implies that
> KVM will enable it highest supported PSCI version. If the feature bit is
> cleared then you only get PSCIv0.1
>
> However, the DT node that QEMU sets up looks a bit crusty. The
> properties for conveying PSCI function IDs were only ever necessary for
> PSCIv0.1. The only property of interest any more is 'method', to convey
> the SMCCC conduit instruction.
>

Ok, thanks again for the further information about the PSCI
implementation. I will go through the code when I have free cycles :)

Thanks,
Gavin

2022-03-25 19:44:57

by Oliver Upton

[permalink] [raw]
Subject: Re: [PATCH v5 18/22] KVM: arm64: Support SDEI ioctl commands on VM

On Fri, Mar 25, 2022 at 02:59:52PM +0800, Gavin Shan wrote:
> > The PSCI implementation is a great example of how KVM has grown its
> > implementation in line with a specification, all the while preserving
> > backwards compatibility.
> >
>
> The only information feed by VMM is the exposed events. The events
> can't be registered from guest kernel, and raised from host to guest
> kernel until it's exposed by VMM.

I would suggest assuming that all SDEI events are exposed by default in
KVM. We will not require a VMM change to enable events individually.

> Besides, the exposed events will
> be defined staticly in host/KVM as we discussed on PATCH[02/22]. We
> also discussed to eliminate those ioctl commands. So I think we needn't
> to add KVM_SDEI_CMD_SET_VERSION. Further more, the version is only a
> concern to host itself if the migration can be done through the
> firmware pseudo system registers since the migration compatibility
> is the only concern to VMM (QEMU).

This all needs to work just like KVM_REG_ARM_PSCI_VERSION; I'd
recommend taking a look at how we handle that register in KVM.

> Yes, Currently, 0.1/0.2/1.0 versions are supported by PSCI. 0.1 is
> picked until VMM asks for 0.2 and 1.0 explicitly. However, it seems
> QEMU isn't using 1.0 PSCI yet and maybe more patch is needed to enable
> it.

As far as how it interacts with KVM goes, QEMU looks fine. The name of
the KVM_ARM_VCPU_PSCI_0_2 bit is quite frustrating. It actually implies
that KVM will enable its highest supported PSCI version. If the feature
bit is cleared then you only get PSCIv0.1.

However, the DT node that QEMU sets up looks a bit crusty. The
properties for conveying PSCI function IDs were only ever necessary for
PSCIv0.1. The only property of interest any more is 'method', to convey
the SMCCC conduit instruction.

--
Thanks,
Oliver

2022-03-25 20:01:49

by Oliver Upton

[permalink] [raw]
Subject: Re: [PATCH v5 01/22] KVM: arm64: Introduce template for inline functions

Hi Gavin,

On Tue, Mar 22, 2022 at 04:06:49PM +0800, Gavin Shan wrote:
> The inline functions used to get the SMCCC parameters have same
> layout. It means these functions can be presented by an unified
> template, to make the code simplified. Besides, this adds more
> similar inline functions like smccc_get_arg{4,5,6,7,8}() to get
> more SMCCC arguments, which are needed by SDEI virtualization
> support.
>
> Signed-off-by: Gavin Shan <[email protected]>
> ---
> include/kvm/arm_hypercalls.h | 24 ++++++++++++------------
> 1 file changed, 12 insertions(+), 12 deletions(-)
>
> diff --git a/include/kvm/arm_hypercalls.h b/include/kvm/arm_hypercalls.h
> index 0e2509d27910..d5144c852fe4 100644
> --- a/include/kvm/arm_hypercalls.h
> +++ b/include/kvm/arm_hypercalls.h
> @@ -13,20 +13,20 @@ static inline u32 smccc_get_function(struct kvm_vcpu *vcpu)
> return vcpu_get_reg(vcpu, 0);
> }
>
> -static inline unsigned long smccc_get_arg1(struct kvm_vcpu *vcpu)
> -{
> - return vcpu_get_reg(vcpu, 1);
> -}
> -
> -static inline unsigned long smccc_get_arg2(struct kvm_vcpu *vcpu)
> -{
> - return vcpu_get_reg(vcpu, 2);
> +#define SMCCC_DECLARE_GET_ARG(reg) \
> +static inline unsigned long smccc_get_arg##reg(struct kvm_vcpu *vcpu) \
> +{ \
> + return vcpu_get_reg(vcpu, reg); \
> }
>
> -static inline unsigned long smccc_get_arg3(struct kvm_vcpu *vcpu)
> -{
> - return vcpu_get_reg(vcpu, 3);
> -}
> +SMCCC_DECLARE_GET_ARG(1)
> +SMCCC_DECLARE_GET_ARG(2)
> +SMCCC_DECLARE_GET_ARG(3)
> +SMCCC_DECLARE_GET_ARG(4)
> +SMCCC_DECLARE_GET_ARG(5)
> +SMCCC_DECLARE_GET_ARG(6)
> +SMCCC_DECLARE_GET_ARG(7)
> +SMCCC_DECLARE_GET_ARG(8)

Hmm. What if we specify a single inline function where the caller passes
the arg # as a parameter? We really just want to abstract away the
off-by-one difference between GP registers and SMCCC arguments.

Macros generally make me uneasy for template functions, but I may be in
the vocal minority on this topic :)
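
Something like the sketch below is what I have in mind (just an
illustration of the idea, not a finished patch):

static inline unsigned long smccc_get_arg(struct kvm_vcpu *vcpu, u8 arg)
{
        /*
         * x0 holds the SMCCC function ID and the call arguments start
         * at x1, so argument N maps to GP register N here.
         */
        return vcpu_get_reg(vcpu, arg);
}

/* e.g. smccc_get_arg(vcpu, 1) replaces smccc_get_arg1(vcpu). */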

--
Thanks,
Oliver