This series virtualizes the Software Delegated Exception Interface
(SDEI), which is defined by DEN0054C (v1.1). It allows the hypervisor to
deliver NMI-like SDEI events to the guest, and it is needed by Async PF to
deliver page-not-present notifications from the hypervisor to the guest.
The specification, the code and the required QEMU changes can be found at:
https://developer.arm.com/documentation/den0054/c
https://github.com/gwshan/linux ("kvm/arm64_sdei")
https://github.com/gwshan/qemu ("kvm/arm64_sdei")
The design is quite straightforward and follows the specification. SDEI
events are classified as shared or private according to their scope. A
shared event is system or VM scoped, while a private event is vCPU
scoped. This implementation doesn't support shared events because all
the needed events are private. Critical events aren't supported either,
so all events have normal priority.
Several objects (data structures) are introduced to help with event
registration, enablement, disablement, unregistration, reset,
delivery and handling.
* kvm_sdei_event_handler
  The SDEI event handler, which is provided through the EVENT_REGISTER
  hypercall and is called when the SDEI event is delivered from
  host to guest.
* kvm_sdei_event_context
  The saved (preempted) context when an SDEI event is delivered
  for handling.
* kvm_sdei_vcpu
  SDEI events and their states.
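As a rough illustration, the three objects above could be laid out along the following lines. This is a user-space sketch inferred from the field names used in the event delivery patch of this series (handlers[].ep_addr, handlers[].ep_arg, ctxt.pc, ctxt.pstate, ctxt.regs and the registered/enabled/pending/running bitmaps). The values of KVM_NR_SDEI_EVENTS and SDEI_NR_SAVED_REGS, the exact types and the sdei_register() helper are assumptions made for the example, not the definitions in kvm_sdei.h.

```c
#include <stdbool.h>
#include <stdint.h>

#define KVM_NR_SDEI_EVENTS 8	/* assumed value, illustration only */
#define SDEI_NR_SAVED_REGS 18	/* assumed: x0-x17 of the interrupted context */

/* Handler supplied by the guest through EVENT_REGISTER */
struct kvm_sdei_event_handler {
	uint64_t ep_addr;	/* entry point of the handler */
	uint64_t ep_arg;	/* argument handed back to the handler in x1 */
};

/* Context preempted by the event, restored on completion */
struct kvm_sdei_event_context {
	uint64_t pc;
	uint64_t pstate;
	uint64_t regs[SDEI_NR_SAVED_REGS];
};

/* Per-vCPU events and their states, one bit per event number */
struct kvm_sdei_vcpu {
	unsigned long registered;
	unsigned long enabled;
	unsigned long pending;
	unsigned long running;
	struct kvm_sdei_event_context ctxt;
	struct kvm_sdei_event_handler handlers[KVM_NR_SDEI_EVENTS];
};

/* Hypothetical helper: record a handler and mark the event registered */
static bool sdei_register(struct kvm_sdei_vcpu *v, unsigned int num,
			  uint64_t ep_addr, uint64_t ep_arg)
{
	if (num >= KVM_NR_SDEI_EVENTS || (v->registered & (1UL << num)))
		return false;

	v->handlers[num].ep_addr = ep_addr;
	v->handlers[num].ep_arg = ep_arg;
	v->registered |= 1UL << num;
	return true;
}
```

Keeping the event states as per-vCPU bitmaps matches the fact that only private (vCPU scoped) events are supported, so no VM-wide state is needed in this sketch.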
The patches are organized as below:
PATCH[01-02] Preparatory work to extend smccc_get_argx() and refactor
             the hypercall routing mechanism
PATCH[03]    Add SDEI virtualization infrastructure
PATCH[04-16] Support various SDEI hypercalls and event handling
PATCH[17]    Expose SDEI capability
PATCH[18-19] Support SDEI migration
PATCH[20]    Add SDEI documentation
PATCH[21-22] Add SDEI related selftest cases
The previous revisions can be found:
v6: https://lore.kernel.org/lkml/[email protected]/T/
v5: https://lore.kernel.org/kvmarm/[email protected]/
v4: https://lore.kernel.org/kvmarm/[email protected]/
v3: https://lore.kernel.org/kvmarm/[email protected]/
v2: https://lore.kernel.org/kvmarm/[email protected]/
v1: https://lore.kernel.org/kvmarm/[email protected]/
Testing
=======
[1] The selftest case included in this series works fine. The default SDEI
event, whose number is zero, can be registered, enabled and raised, and
the SDEI event handler is invoked.
[host]# pwd
/home/gavin/sandbox/linux.main/tools/testing/selftests/kvm
[root@virtlab-arm01 kvm]# ./aarch64/sdei
NR_VCPUS: 2 SDEI Event: 0x00000000
--- VERSION
Version: 1.1 (vendor: 0x4b564d)
--- FEATURES
Shared event slots: 0
Private event slots: 0
Relative mode: No
--- PRIVATE_RESET
--- SHARED_RESET
--- PE_UNMASK
--- EVENT_GET_INFO
Type: Private
Priority: Normal
Signaled: Yes
--- EVENT_REGISTER
--- EVENT_ENABLE
--- EVENT_SIGNAL
Handled: Yes
IRQ: No
Status: Registered-Enabled-Running
PC/PSTATE: 000000000040232c 00000000600003c5
Regs: 0000000000000000 0000000000000000
0000000000000000 0000000000000000
--- PE_MASK
--- EVENT_DISABLE
--- EVENT_UNREGISTER
Result: OK
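The VERSION line in the output above can be reproduced from the raw SDEI_VERSION return value. The sketch below assumes the field layout used by the Linux SDEI uapi header (major in bits [62:48], minor in bits [47:32], vendor in bits [31:0]). The struct and helper names are made up for illustration and are not part of the selftest.

```c
#include <stdint.h>
#include <stdio.h>

/* Decoded fields of the SDEI_VERSION return value (hypothetical type) */
struct sdei_version {
	unsigned int major;
	unsigned int minor;
	uint32_t vendor;
};

/* Split the 64-bit return value into its three fields */
static struct sdei_version sdei_decode_version(uint64_t raw)
{
	struct sdei_version v = {
		.major  = (unsigned int)((raw >> 48) & 0x7fff),
		.minor  = (unsigned int)((raw >> 32) & 0xffff),
		.vendor = (uint32_t)(raw & 0xffffffff),
	};

	return v;
}

/* Print the version in the same shape as the selftest output */
static void sdei_print_version(uint64_t raw)
{
	struct sdei_version v = sdei_decode_version(raw);

	printf("Version: %u.%u (vendor: 0x%x)\n",
	       v.major, v.minor, (unsigned int)v.vendor);
}
```

Under these assumptions, the value (1ULL << 48) | (1ULL << 32) | 0x4b564d decodes to version 1.1 with vendor 0x4b564d, whose bytes are the ASCII characters "KVM".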
[2] There are additional patches in the following repositories that create
procfs entries, allowing SDEI events to be injected from the host side.
The SDEI client on the guest side registers the default SDEI event, whose
number is zero. QEMU also exports the SDEI ACPI table and supports
migration for SDEI.
https://github.com/gwshan/linux ("kvm/arm64_sdei")
https://github.com/gwshan/qemu ("kvm/arm64_sdei")
[2.1] Start the source VM and the destination VM. The destination VM
waits for the incoming migration on TCP port 4444.
[host]# /home/gavin/sandbox/qemu.main/build/qemu-system-aarch64 \
-accel kvm -machine virt,gic-version=host \
-cpu host -smp 6,sockets=2,cores=3,threads=1 \
-m 1024M,slots=16,maxmem=64G \
: \
-kernel /home/gavin/sandbox/linux.guest/arch/arm64/boot/Image \
-initrd /home/gavin/sandbox/images/rootfs.cpio.xz \
-append earlycon=pl011,mmio,0x9000000 \
:
[host]# /home/gavin/sandbox/qemu.main/build/qemu-system-aarch64 \
-accel kvm -machine virt,gic-version=host \
-cpu host -smp 6,sockets=2,cores=3,threads=1 \
-m 1024M,slots=16,maxmem=64G \
: \
-kernel /home/gavin/sandbox/linux.guest/arch/arm64/boot/Image \
-initrd /home/gavin/sandbox/images/rootfs.cpio.xz \
-append earlycon=pl011,mmio,0x9000000 \
-incoming tcp:0:4444 \
:
[2.2] Check the kernel log on the source VM. The SDEI service is detected
and the default SDEI event (0x0) is registered and enabled.
[guest-src]# dmesg | grep -i sdei
ACPI: SDEI 0x000000005BC80000 000024 \
(v00 BOCHS BXPC 00000001 BXPC 00000001)
sdei: SDEIv1.1 (0x4b564d) detected in firmware.
SDEI TEST: Version 1.1, Vendor 0x4b564d
sdei_init: SDEI event (0x0) registered
sdei_init: SDEI event (0x0) enabled
[2.3] Migrate the source VM to the destination VM, then inject the SDEI
event into the destination VM. The event is raised and handled.
(qemu) migrate -d tcp:localhost:4444
[host]# echo 0 > /proc/kvm/kvm-5360/vcpu-1
[guest-dst]#
=========== SDEI Event (CPU#1) ===========
Event: 0000000000000000 Parameter: 00000000dabfdabf
PC: ffff800008cbb554 PSTATE: 00000000604000c5 SP: ffff800009c7bde0
Regs: 00000000000016ee ffff00001ffd2e28 00000000000016ed 0000000000000001
ffff800016c28000 0000000000000000 0000000000000000 0000000000000000
0000000000000000 0000000000000000 0000000000000000 0000000000000000
0000000000000000 0000000000000000 0000000000000000 0000000000000000
0000000000000000 0000000000000000 0000000000000000 ffff800009399008
ffff8000097d9af0 ffff8000097d99f8 ffff8000093a8db8 ffff8000097d9b18
0000000000000000 0000000000000000 ffff000000339d00 0000000000000000
0000000000000000 ffff800009c7bde0 ffff800008cbb5c4
Context: 00000000000016ee ffff00001ffd2e28 00000000000016ed 0000000000000001
ffff800016c28000 03ffffffffffffff 000000024325db59 ffff8000097de190
ffff00000033a790 ffff800008cbb814 0000000000000a30 0000000000000000
Changelog
=========
v7:
* Rebased to v5.19-rc1 (Gavin)
* Add hypercall ranges for routing (Oliver)
* Remove support for critical events and redesign the
  data structures. Function names are also changed
  as Oliver suggested (Oliver)
* Deliver event when it's enabled or the specific PE
  is unmasked (Oliver)
* Improve EVENT_COMPLETE_AND_RESUME hypercall to resume
  from the specified address (Oliver)
* Add patches for SDEI migration and documentation (Gavin)
* Misc comments from Oliver Upton (Oliver)
v6:
* Rebased to v5.18-rc1 (Gavin)
* Pass additional argument to smccc_get_arg() (Oliver)
* Add preparatory patch to route hypercalls based on their
owners (Oliver)
* Remove the support for shared event. (Oliver/Gavin)
* Remove the support for migration and add-on patches to
support it in future (Oliver)
* The events are exposed by KVM instead of VMM (Oliver)
* kvm_sdei_state.h is dropped and all the structures are
folded into the corresponding ones in kvm_sdei.h (Oliver)
* Rename 'struct kvm_sdei_registered_event' to
'struct kvm_sdei_event' (Oliver)
* Misc comments from Oliver Upton (Oliver)
v5/v4/v3/v2/v1:
* Skipped here; please see the history at
  https://lore.kernel.org/lkml/[email protected]/T/
Gavin Shan (22):
KVM: arm64: Extend smccc_get_argx()
KVM: arm64: Route hypercalls based on their owner
KVM: arm64: Add SDEI virtualization infrastructure
KVM: arm64: Support EVENT_REGISTER hypercall
KVM: arm64: Support EVENT_{ENABLE, DISABLE} hypercall
KVM: arm64: Support EVENT_CONTEXT hypercall
KVM: arm64: Support EVENT_UNREGISTER hypercall
KVM: arm64: Support EVENT_STATUS hypercall
KVM: arm64: Support EVENT_GET_INFO hypercall
KVM: arm64: Support PE_{MASK, UNMASK} hypercall
KVM: arm64: Support {PRIVATE, SHARED}_RESET hypercall
KVM: arm64: Support event injection and delivery
KVM: arm64: Support EVENT_{COMPLETE, COMPLETE_AND_RESUME} hypercall
KVM: arm64: Support EVENT_SIGNAL hypercall
KVM: arm64: Support SDEI_FEATURES hypercall
KVM: arm64: Support SDEI_VERSION hypercall
KVM: arm64: Expose SDEI capbility and service
KVM: arm64: Allow large sized pseudo firmware registers
KVM: arm64: Support SDEI event migration
KVM: arm64: Add SDEI document
selftests: KVM: aarch64: Add SDEI case in hypercall tests
selftests: KVM: aarch64: Add SDEI test case
Documentation/virt/kvm/api.rst | 11 +
Documentation/virt/kvm/arm/hypercalls.rst | 4 +
Documentation/virt/kvm/arm/sdei.rst | 64 ++
arch/arm64/include/asm/kvm_host.h | 3 +
arch/arm64/include/asm/kvm_sdei.h | 81 +++
arch/arm64/include/uapi/asm/kvm.h | 18 +
arch/arm64/kvm/Makefile | 2 +-
arch/arm64/kvm/arm.c | 8 +
arch/arm64/kvm/hypercalls.c | 182 +++--
arch/arm64/kvm/psci.c | 14 +-
arch/arm64/kvm/pvtime.c | 2 +-
arch/arm64/kvm/sdei.c | 676 ++++++++++++++++++
arch/arm64/kvm/trng.c | 4 +-
include/kvm/arm_hypercalls.h | 19 +-
include/linux/arm-smccc.h | 7 +
include/uapi/linux/arm_sdei.h | 8 +
include/uapi/linux/kvm.h | 1 +
tools/testing/selftests/kvm/Makefile | 1 +
.../selftests/kvm/aarch64/hypercalls.c | 11 +-
tools/testing/selftests/kvm/aarch64/sdei.c | 450 ++++++++++++
20 files changed, 1499 insertions(+), 67 deletions(-)
create mode 100644 Documentation/virt/kvm/arm/sdei.rst
create mode 100644 arch/arm64/include/asm/kvm_sdei.h
create mode 100644 arch/arm64/kvm/sdei.c
create mode 100644 tools/testing/selftests/kvm/aarch64/sdei.c
--
2.23.0
This supports event injection, delivery and cancellation. Events are
injected and cancelled by kvm_sdei_{inject, cancel}_event(). For
event delivery, kvm_sdei_deliver_event() is added to handle the
KVM_REQ_SDEI request.
The KVM_REQ_SDEI request can be raised in several situations:
* The PE is unmasked
* The event is enabled
* The currently running event or handler completes on receiving the
  EVENT_COMPLETE or EVENT_COMPLETE_AND_RESUME hypercall, which
  will be supported in a subsequent patch.
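To make these conditions concrete, here is a simplified user-space model of the pending/enabled/masked checks. It follows the logic of kvm_sdei_inject_event() and the event selection in kvm_sdei_deliver_event() from this patch, but the sdei_model structure, the request flag, NR_EVENTS and the helper names are invented for illustration. Nothing here is kernel code, and context saving is left out.

```c
#include <errno.h>
#include <stdbool.h>

#define NR_EVENTS 8	/* stand-in for KVM_NR_SDEI_EVENTS (assumed value) */

struct sdei_model {
	unsigned long registered, enabled, pending, running;
	bool masked;	/* PE is masked from all events */
	bool request;	/* stand-in for a raised KVM_REQ_SDEI */
};

static bool bit(unsigned long map, unsigned int num)
{
	return map & (1UL << num);
}

/* Queue an event; raise the request only if it can be delivered now */
static int model_inject(struct sdei_model *m, unsigned int num, bool immediate)
{
	if (num >= NR_EVENTS || !bit(m->registered, num))
		return -ENOENT;

	/* Immediate delivery is refused if anything stands in the way */
	if (immediate &&
	    (m->masked || !bit(m->enabled, num) || m->pending || m->running))
		return -EBUSY;

	m->pending |= 1UL << num;
	if (!m->masked && bit(m->enabled, num))
		m->request = true;

	return 0;
}

/* Pick the lowest pending and enabled event, or -1 if none is deliverable */
static int model_deliver(struct sdei_model *m)
{
	unsigned int num;

	m->request = false;

	/* Normal priority only: a running event is never preempted */
	if (m->masked || m->running)
		return -1;

	for (num = 0; num < NR_EVENTS; num++) {
		if (bit(m->pending, num) && bit(m->enabled, num)) {
			m->pending &= ~(1UL << num);
			m->running |= 1UL << num;
			return (int)num;
		}
	}

	return -1;
}
```

In this model an event that is pending but disabled stays queued and is skipped during selection, so a later enable or unmask is what eventually raises the request for it.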
Signed-off-by: Gavin Shan <[email protected]>
---
arch/arm64/include/asm/kvm_sdei.h | 4 +
arch/arm64/kvm/arm.c | 3 +
arch/arm64/kvm/sdei.c | 123 ++++++++++++++++++++++++++++++
3 files changed, 130 insertions(+)
diff --git a/arch/arm64/include/asm/kvm_sdei.h b/arch/arm64/include/asm/kvm_sdei.h
index 609338b17478..735d9ac1a5a2 100644
--- a/arch/arm64/include/asm/kvm_sdei.h
+++ b/arch/arm64/include/asm/kvm_sdei.h
@@ -64,6 +64,10 @@ struct kvm_sdei_vcpu {
/* APIs */
int kvm_sdei_call(struct kvm_vcpu *vcpu);
+int kvm_sdei_inject_event(struct kvm_vcpu *vcpu,
+ unsigned int num, bool immediate);
+int kvm_sdei_cancel_event(struct kvm_vcpu *vcpu, unsigned int num);
+void kvm_sdei_deliver_event(struct kvm_vcpu *vcpu);
void kvm_sdei_create_vcpu(struct kvm_vcpu *vcpu);
void kvm_sdei_destroy_vcpu(struct kvm_vcpu *vcpu);
diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
index e9516f951e7b..06cb5e38634e 100644
--- a/arch/arm64/kvm/arm.c
+++ b/arch/arm64/kvm/arm.c
@@ -720,6 +720,9 @@ static int check_vcpu_requests(struct kvm_vcpu *vcpu)
if (kvm_check_request(KVM_REQ_VCPU_RESET, vcpu))
kvm_reset_vcpu(vcpu);
+ if (kvm_check_request(KVM_REQ_SDEI, vcpu))
+ kvm_sdei_deliver_event(vcpu);
+
/*
* Clear IRQ_PENDING requests that were made to guarantee
* that a VCPU sees new virtual interrupts.
diff --git a/arch/arm64/kvm/sdei.c b/arch/arm64/kvm/sdei.c
index 42ba6f97b168..36a72c1750fc 100644
--- a/arch/arm64/kvm/sdei.c
+++ b/arch/arm64/kvm/sdei.c
@@ -266,6 +266,129 @@ int kvm_sdei_call(struct kvm_vcpu *vcpu)
return 1;
}
+int kvm_sdei_inject_event(struct kvm_vcpu *vcpu,
+ unsigned int num,
+ bool immediate)
+{
+ struct kvm_sdei_vcpu *vsdei = vcpu->arch.sdei;
+
+ if (!vsdei)
+ return -EPERM;
+
+ if (num >= KVM_NR_SDEI_EVENTS || !test_bit(num, &vsdei->registered))
+ return -ENOENT;
+
+ /*
+ * The event may be expected to be delivered immediately. There
+ * are several cases we can't do this:
+ *
+ * (1) The PE has been masked from any events.
+ * (2) The event isn't enabled yet.
+ * (3) There are any pending or running events.
+ */
+ if (immediate &&
+ ((vcpu->arch.flags & KVM_ARM64_SDEI_MASKED) ||
+ !test_bit(num, &vsdei->enabled) ||
+ vsdei->pending || vsdei->running))
+ return -EBUSY;
+
+ set_bit(num, &vsdei->pending);
+ if (!(vcpu->arch.flags & KVM_ARM64_SDEI_MASKED) &&
+ test_bit(num, &vsdei->enabled))
+ kvm_make_request(KVM_REQ_SDEI, vcpu);
+
+ return 0;
+}
+
+int kvm_sdei_cancel_event(struct kvm_vcpu *vcpu, unsigned int num)
+{
+ struct kvm_sdei_vcpu *vsdei = vcpu->arch.sdei;
+
+ if (!vsdei)
+ return -EPERM;
+
+ if (num >= KVM_NR_SDEI_EVENTS || !test_bit(num, &vsdei->registered))
+ return -ENOENT;
+
+ if (test_bit(num, &vsdei->running))
+ return -EBUSY;
+
+ clear_bit(num, &vsdei->pending);
+
+ return 0;
+}
+
+void kvm_sdei_deliver_event(struct kvm_vcpu *vcpu)
+{
+ struct kvm_sdei_vcpu *vsdei = vcpu->arch.sdei;
+ struct kvm_sdei_event_context *ctxt = &vsdei->ctxt;
+ unsigned int num, i;
+ unsigned long pstate;
+
+ if (!vsdei || (vcpu->arch.flags & KVM_ARM64_SDEI_MASKED))
+ return;
+
+ /*
+ * All supported events have normal priority. So the currently
+ * running event can't be preempted by any one else.
+ */
+ if (vsdei->running)
+ return;
+
+ /* Select next pending event to be delivered */
+ num = find_first_bit(&vsdei->pending, KVM_NR_SDEI_EVENTS);
+ while (num < KVM_NR_SDEI_EVENTS) {
+ if (test_bit(num, &vsdei->enabled))
+ break;
+ num = find_next_bit(&vsdei->pending, KVM_NR_SDEI_EVENTS, num + 1);
+ }
+
+ if (num >= KVM_NR_SDEI_EVENTS)
+ return;
+
+ /*
+ * Save the interrupted context. We might have pending request
+ * to adjust PC. Lets adjust it now so that the resume address
+ * is correct when COMPLETE or COMPLETE_AND_RESUME hypercall
+ * is handled.
+ */
+ __kvm_adjust_pc(vcpu);
+ ctxt->pc = *vcpu_pc(vcpu);
+ ctxt->pstate = *vcpu_cpsr(vcpu);
+ for (i = 0; i < ARRAY_SIZE(ctxt->regs); i++)
+ ctxt->regs[i] = vcpu_get_reg(vcpu, i);
+
+ /*
+ * Inject event. The following registers are modified according
+ * to the specification.
+ *
+ * x0: event number
+ * x1: argument specified when the event is registered
+ * x2: PC of the interrupted context
+ * x3: PSTATE of the interrupted context
+ * PC: event handler
+ * PSTATE: Cleared nRW bit, but D/A/I/F bits are set
+ */
+ for (i = 0; i < ARRAY_SIZE(ctxt->regs); i++)
+ vcpu_set_reg(vcpu, i, 0);
+
+ vcpu_set_reg(vcpu, 0, num);
+ vcpu_set_reg(vcpu, 1, vsdei->handlers[num].ep_arg);
+ vcpu_set_reg(vcpu, 2, ctxt->pc);
+ vcpu_set_reg(vcpu, 3, ctxt->pstate);
+
+ pstate = ctxt->pstate;
+ pstate &= ~(PSR_MODE32_BIT | PSR_MODE_MASK);
+ pstate |= (PSR_D_BIT | PSR_A_BIT | PSR_I_BIT | PSR_F_BIT | PSR_MODE_EL1h);
+
+ *vcpu_cpsr(vcpu) = pstate;
+ *vcpu_pc(vcpu) = vsdei->handlers[num].ep_addr;
+
+ /* Update event states */
+ clear_bit(num, &vsdei->pending);
+ set_bit(num, &vsdei->running);
+}
+
void kvm_sdei_create_vcpu(struct kvm_vcpu *vcpu)
{
struct kvm_sdei_vcpu *vsdei;
--
2.23.0
Hi Shijie,
On 5/30/22 2:47 PM, Shijie Huang wrote:
> On 2022/5/27 16:02, Gavin Shan wrote:
>> [...]
>
> I tested this patch set. It's okay.
>
> Tested-by: Huang Shijie <[email protected]>
>
[...]
I appreciate your efforts in testing it. I will add your
Tested-by if a respin is needed. Thank you for your time on this.
Thanks,
Gavin
Hi Gavin,
On 2022/5/27 16:02, Gavin Shan wrote:
> [...]
I tested this patch set. It's okay.
Tested-by: Huang Shijie <[email protected]>
Thanks
Huang Shijie
Hi Oliver,
On 5/27/22 6:02 PM, Gavin Shan wrote:
> This series intends to virtualize Software Delegated Exception Interface
> (SDEI), which is defined by DEN0054C (v1.1). It allows the hypervisor to
> deliver NMI-alike SDEI event to guest and it's needed by Async PF to
> deliver page-not-present notification from hypervisor to guest. The code
> and the required qemu changes can be found from:
>
> https://developer.arm.com/documentation/den0054/c
> https://github.com/gwshan/linux ("kvm/arm64_sdei")
> https://github.com/gwshan/qemu ("kvm/arm64_sdei")
>
> The design is quite strightforward by following the specification. The
> (SDEI) events are classified into the shared and private ones according
> to their scope. The shared event is system or VM scoped, but the private
> event is vcpu scoped. This implementation doesn't support the shared
> event because all the needed events are private. Besides, the critial
> events aren't supported by the implementation either. It means all events
> are normal in terms of priority.
>
> Several objects (data structures) are introduced to help with
> event registration, enablement, disablement, unregistration, reset,
> delivery and handling.
>
> * kvm_sdei_event_handler
> SDEI event handler, which is provided through EVENT_REGISTER
> hypercall, is called when the SDEI event is delivered from
> host to guest.
>
> * kvm_sdei_event_context
> The saved (preempted) context when SDEI event is delivered
> for handling.
>
> * kvm_sdei_vcpu
> SDEI events and their states.
>
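As a rough illustration of the lifecycle listed above (registration, enablement, delivery, completion, unregistration), here is a minimal sketch in C. It is not code from the series: the struct and every `model_*` name are hypothetical, and the rules are simplified (for example, unregistering a running event is simply refused here, and only one private event on one vcpu is modelled).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified model of one private SDEI event on one vcpu. */
struct sdei_event_model {
	bool registered;	/* EVENT_REGISTER succeeded */
	bool enabled;		/* EVENT_ENABLE succeeded */
	bool pending;		/* signaled but not yet delivered */
	bool running;		/* handler is executing in the guest */
	bool pe_masked;		/* PE_MASK is in effect on this vcpu */
};

/* Deliver a pending event if nothing blocks it; true if the handler starts. */
bool model_deliver(struct sdei_event_model *e)
{
	assert(e != NULL);
	if (!e->registered || !e->enabled || !e->pending ||
	    e->running || e->pe_masked)
		return false;
	e->pending = false;
	e->running = true;
	return true;
}

int model_register(struct sdei_event_model *e)
{
	if (e->registered)
		return -1;	/* already registered: denied */
	e->registered = true;
	e->enabled = false;	/* a freshly registered event starts disabled */
	return 0;
}

int model_set_enabled(struct sdei_event_model *e, bool on)
{
	if (!e->registered)
		return -1;
	e->enabled = on;
	return 0;
}

int model_signal(struct sdei_event_model *e)
{
	if (!e->registered)
		return -1;
	e->pending = true;
	return 0;
}

int model_complete(struct sdei_event_model *e)
{
	if (!e->running)
		return -1;	/* nothing to complete */
	e->running = false;
	return 0;
}

int model_unregister(struct sdei_event_model *e)
{
	if (!e->registered || e->running)
		return -1;	/* simplification: refuse while the handler runs */
	e->registered = e->enabled = e->pending = false;
	return 0;
}
```

In this model a signaled event stays pending until the event is enabled and the PE is unmasked, at which point `model_deliver()` marks the handler as running.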
> The patches are organized as below:
>
> PATCH[01-02] Preparatory work to extend smccc_get_argx() and refactor
> hypercall routing mechanism
> PATCH[03] Adds SDEI virtualization infrastructure
> PATCH[04-16] Supports various SDEI hypercalls and event handling
> PATCH[17] Exposes SDEI capability
> PATCH[18-19] Support SDEI migration
> PATCH[20] Adds document about SDEI
> PATCH[21-22] SDEI related selftest cases
>
> The previous revisions can be found:
>
> v6: https://lore.kernel.org/lkml/[email protected]/T/
> v5: https://lore.kernel.org/kvmarm/[email protected]/
> v4: https://lore.kernel.org/kvmarm/[email protected]/
> v3: https://lore.kernel.org/kvmarm/[email protected]/
> v2: https://lore.kernel.org/kvmarm/[email protected]/
> v1: https://lore.kernel.org/kvmarm/[email protected]/
>
Copying Oliver's new email address ([email protected]).
Please let me know if I need to rebase and repost the series.
Thanks,
Gavin
> Testing
> =======
> [1] The selftest case included in this series works fine. The default SDEI
> event, whose number is zero, can be registered, enabled and raised, and
> the SDEI event handler is invoked.
>
> [host]# pwd
> /home/gavin/sandbox/linux.main/tools/testing/selftests/kvm
> [root@virtlab-arm01 kvm]# ./aarch64/sdei
>
> NR_VCPUS: 2 SDEI Event: 0x00000000
>
> --- VERSION
> Version: 1.1 (vendor: 0x4b564d)
> --- FEATURES
> Shared event slots: 0
> Private event slots: 0
> Relative mode: No
> --- PRIVATE_RESET
> --- SHARED_RESET
> --- PE_UNMASK
> --- EVENT_GET_INFO
> Type: Private
> Priority: Normal
> Signaled: Yes
> --- EVENT_REGISTER
> --- EVENT_ENABLE
> --- EVENT_SIGNAL
> Handled: Yes
> IRQ: No
> Status: Registered-Enabled-Running
> PC/PSTATE: 000000000040232c 00000000600003c5
> Regs: 0000000000000000 0000000000000000
> 0000000000000000 0000000000000000
> --- PE_MASK
> --- EVENT_DISABLE
> --- EVENT_UNREGISTER
>
> Result: OK
>
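For readers wondering where "Version: 1.1 (vendor: 0x4b564d)" comes from, here is a small sketch of decoding an SDEI_VERSION return value. It assumes the field layout as I understand it from the specification and the Linux uapi header (major in bits [62:48], minor in bits [47:32], vendor in bits [31:0]); the struct and function names are hypothetical and not taken from the series.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical container for the decoded fields. */
struct sdei_version {
	unsigned int major;
	unsigned int minor;
	uint32_t vendor;
};

/* Split the 64-bit SDEI_VERSION value into its three fields. */
struct sdei_version sdei_decode_version(uint64_t raw)
{
	struct sdei_version v = {
		.major  = (unsigned int)((raw >> 48) & 0x7fff),
		.minor  = (unsigned int)((raw >> 32) & 0xffff),
		.vendor = (uint32_t)(raw & 0xffffffff),
	};
	return v;
}

/* Render the low three bytes of the vendor field as an ASCII string. */
void sdei_vendor_to_ascii(uint32_t vendor, char out[4])
{
	assert(out != NULL);
	out[0] = (char)((vendor >> 16) & 0xff);
	out[1] = (char)((vendor >> 8) & 0xff);
	out[2] = (char)(vendor & 0xff);
	out[3] = '\0';
}
```

With these helpers, a raw value with major 1, minor 1 and vendor 0x4b564d decodes to version 1.1, and the vendor bytes spell "KVM" in ASCII.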
> [2] There are additional patches in the following repositories that create
> procfs entries, allowing SDEI events to be injected from the host side.
> The SDEI client on the guest side registers the default SDEI event, whose
> number is zero. Also, QEMU exports the SDEI ACPI table and supports
> migration for SDEI.
>
> https://github.com/gwshan/linux ("kvm/arm64_sdei")
> https://github.com/gwshan/qemu ("kvm/arm64_sdei")
>
> [2.1] Start the guests and migrate the source VM to the destination
> VM.
>
> [host]# /home/gavin/sandbox/qemu.main/build/qemu-system-aarch64 \
> -accel kvm -machine virt,gic-version=host \
> -cpu host -smp 6,sockets=2,cores=3,threads=1 \
> -m 1024M,slots=16,maxmem=64G \
> : \
> -kernel /home/gavin/sandbox/linux.guest/arch/arm64/boot/Image \
> -initrd /home/gavin/sandbox/images/rootfs.cpio.xz \
> -append earlycon=pl011,mmio,0x9000000 \
> :
>
> [host]# /home/gavin/sandbox/qemu.main/build/qemu-system-aarch64 \
> -accel kvm -machine virt,gic-version=host \
> -cpu host -smp 6,sockets=2,cores=3,threads=1 \
> -m 1024M,slots=16,maxmem=64G \
> : \
> -kernel /home/gavin/sandbox/linux.guest/arch/arm64/boot/Image \
> -initrd /home/gavin/sandbox/images/rootfs.cpio.xz \
> -append earlycon=pl011,mmio,0x9000000 \
> -incoming tcp:0:4444 \
> :
>
> [2.2] Check kernel log on the source VM. The SDEI service is enabled
> and the default SDEI event (0x0) is enabled.
>
> [guest-src]# dmesg | grep -i sdei
> ACPI: SDEI 0x000000005BC80000 000024 \
> (v00 BOCHS BXPC 00000001 BXPC 00000001)
> sdei: SDEIv1.1 (0x4b564d) detected in firmware.
> SDEI TEST: Version 1.1, Vendor 0x4b564d
> sdei_init: SDEI event (0x0) registered
> sdei_init: SDEI event (0x0) enabled
>
>
> (qemu) migrate -d tcp:localhost:4444
>
> [2.3] Migrate the source VM to the destination VM. Inject SDEI event
> to the destination VM. The event is raised and handled.
>
> (qemu) migrate -d tcp:localhost:4444
>
> [host]# echo 0 > /proc/kvm/kvm-5360/vcpu-1
>
> [guest-dst]#
> =========== SDEI Event (CPU#1) ===========
> Event: 0000000000000000 Parameter: 00000000dabfdabf
> PC: ffff800008cbb554 PSTATE: 00000000604000c5 SP: ffff800009c7bde0
> Regs: 00000000000016ee ffff00001ffd2e28 00000000000016ed 0000000000000001
> ffff800016c28000 0000000000000000 0000000000000000 0000000000000000
> 0000000000000000 0000000000000000 0000000000000000 0000000000000000
> 0000000000000000 0000000000000000 0000000000000000 0000000000000000
> 0000000000000000 0000000000000000 0000000000000000 ffff800009399008
> ffff8000097d9af0 ffff8000097d99f8 ffff8000093a8db8 ffff8000097d9b18
> 0000000000000000 0000000000000000 ffff000000339d00 0000000000000000
> 0000000000000000 ffff800009c7bde0 ffff800008cbb5c4
> Context: 00000000000016ee ffff00001ffd2e28 00000000000016ed 0000000000000001
> ffff800016c28000 03ffffffffffffff 000000024325db59 ffff8000097de190
> ffff00000033a790 ffff800008cbb814 0000000000000a30 0000000000000000
>
> Changelog
> =========
> v7:
> * Rebased to v5.19-rc1 (Gavin)
> * Add hypercall ranges for routing (Oliver)
> * Remove support for critical events and redesign the
> data structures. Function names are also modified
> as Oliver suggested (Oliver)
> * Deliver event when it's enabled or the specific PE
> is unmasked (Oliver)
> * Improve EVENT_COMPLETE_AND_RESUME hypercall to resume
> from the specified address (Oliver)
> * Add patches for SDEI migration and documentation (Gavin)
> * Misc comments from Oliver Upton (Oliver)
> v6:
> * Rebased to v5.18-rc1 (Gavin)
> * Pass additional argument to smccc_get_arg() (Oliver)
> * Add preparatory patch to route hypercalls based on their
> owners (Oliver)
> * Remove the support for shared event. (Oliver/Gavin)
> * Remove the support for migration and add-on patches to
> support it in future (Oliver)
> * The events are exposed by KVM instead of VMM (Oliver)
> * kvm_sdei_state.h is dropped and all the structures are
> folded into the corresponding ones in kvm_sdei.h (Oliver)
> * Rename 'struct kvm_sdei_registered_event' to
> 'struct kvm_sdei_event' (Oliver)
> * Misc comments from Oliver Upton (Oliver)
> v5/v4/v3/v2/v1:
> * Skipped here and please visit the history by
> https://lore.kernel.org/lkml/[email protected]/T/
>
> Gavin Shan (22):
> KVM: arm64: Extend smccc_get_argx()
> KVM: arm64: Route hypercalls based on their owner
> KVM: arm64: Add SDEI virtualization infrastructure
> KVM: arm64: Support EVENT_REGISTER hypercall
> KVM: arm64: Support EVENT_{ENABLE, DISABLE} hypercall
> KVM: arm64: Support EVENT_CONTEXT hypercall
> KVM: arm64: Support EVENT_UNREGISTER hypercall
> KVM: arm64: Support EVENT_STATUS hypercall
> KVM: arm64: Support EVENT_GET_INFO hypercall
> KVM: arm64: Support PE_{MASK, UNMASK} hypercall
> KVM: arm64: Support {PRIVATE, SHARED}_RESET hypercall
> KVM: arm64: Support event injection and delivery
> KVM: arm64: Support EVENT_{COMPLETE, COMPLETE_AND_RESUME} hypercall
> KVM: arm64: Support EVENT_SIGNAL hypercall
> KVM: arm64: Support SDEI_FEATURES hypercall
> KVM: arm64: Support SDEI_VERSION hypercall
> KVM: arm64: Expose SDEI capbility and service
> KVM: arm64: Allow large sized pseudo firmware registers
> KVM: arm64: Support SDEI event migration
> KVM: arm64: Add SDEI document
> selftests: KVM: aarch64: Add SDEI case in hypercall tests
> selftests: KVM: aarch64: Add SDEI test case
>
> Documentation/virt/kvm/api.rst | 11 +
> Documentation/virt/kvm/arm/hypercalls.rst | 4 +
> Documentation/virt/kvm/arm/sdei.rst | 64 ++
> arch/arm64/include/asm/kvm_host.h | 3 +
> arch/arm64/include/asm/kvm_sdei.h | 81 +++
> arch/arm64/include/uapi/asm/kvm.h | 18 +
> arch/arm64/kvm/Makefile | 2 +-
> arch/arm64/kvm/arm.c | 8 +
> arch/arm64/kvm/hypercalls.c | 182 +++--
> arch/arm64/kvm/psci.c | 14 +-
> arch/arm64/kvm/pvtime.c | 2 +-
> arch/arm64/kvm/sdei.c | 676 ++++++++++++++++++
> arch/arm64/kvm/trng.c | 4 +-
> include/kvm/arm_hypercalls.h | 19 +-
> include/linux/arm-smccc.h | 7 +
> include/uapi/linux/arm_sdei.h | 8 +
> include/uapi/linux/kvm.h | 1 +
> tools/testing/selftests/kvm/Makefile | 1 +
> .../selftests/kvm/aarch64/hypercalls.c | 11 +-
> tools/testing/selftests/kvm/aarch64/sdei.c | 450 ++++++++++++
> 20 files changed, 1499 insertions(+), 67 deletions(-)
> create mode 100644 Documentation/virt/kvm/arm/sdei.rst
> create mode 100644 arch/arm64/include/asm/kvm_sdei.h
> create mode 100644 arch/arm64/kvm/sdei.c
> create mode 100644 tools/testing/selftests/kvm/aarch64/sdei.c
>
Hi Gavin,
On Thu, 23 Jun 2022 07:11:08 +0100,
Gavin Shan <[email protected]> wrote:
>
> Hi Oliver,
>
> On 5/27/22 6:02 PM, Gavin Shan wrote:
> > [...]
>
> Copying Oliver's new email address ([email protected]).
>
> Please let me know if I need to rebase and repost the series.
My main issue with this series is that it is a solution in search of a
problem. It is only an enabler for Asynchronous Page Fault support,
and:
- as far as I know, the core Linux/arm64 maintainers have no plan to
support APF. Without it, this is a pointless exercise. And even with
it, this introduces a Linux specific behaviour in an otherwise
architectural hypervisor (something I'm quite keen on avoiding)
- It gives an incentive to other hypervisor vendors to add random crap
to the Linux mm subsystem, which is even worse. At this stage, we
might as well go back to the Xen PV days altogether.
- I haven't seen any of the KVM/arm64 users actually asking for the
APF horror, and the cloud vendors I directly asked had no plan to
use it, and not using it on their x86 systems either
- no performance data nor workloads that could help making an informed
decision have been disclosed, and the only argument in its favour
seems to be "but x86 has it" (hardly a compelling one)
Given the above, I don't see how to justify this series, as it has no
purpose on its own, no matter how well written it is.
M.
--
Without deviation from the norm, progress is not possible.
Hi Marc,
On 6/24/22 11:12 PM, Marc Zyngier wrote:
> On Thu, 23 Jun 2022 07:11:08 +0100,
> Gavin Shan <[email protected]> wrote:
>> [...]
>
> My main issue with this series is that it is a solution in search of a
> problem. It is only an enabler for Asynchronous Page Fault support,
> and:
>
> - as far as I know, the core Linux/arm64 maintainers have no plan to
> support APF. Without it, this is a pointless exercise. And even with
> it, this introduces a Linux specific behaviour in an otherwise
> architectural hypervisor (something I'm quite keen on avoiding)
>
> - It gives an incentive to other hypervisor vendors to add random crap
> to the Linux mm subsystem, which is even worse. At this stage, we
> might as well go back to the Xen PV days altogether.
>
> - I haven't seen any of the KVM/arm64 users actually asking for the
> APF horror, and the cloud vendors I directly asked had no plan to
> use it, and not using it on their x86 systems either
>
> - no performance data nor workloads that could help making an informed
> decision have been disclosed, and the only argument in its favour
> seems to be "but x86 has it" (hardly a compelling one)
>
> Given the above, I don't see how to justify this series, as it has no
> purpose on its own, no matter how well written it is.
>
Thank you for taking the time to review the series and provide comments. A
long time ago, I compared the features supported on x86 and arm64 to sort
out the gaps, and Async page fault was one of the missing features. From
then on, I investigated x86's implementation and worked on an arm64
implementation. That is the history behind my continued work on Async page
fault for arm64. It also means that, on my side, there has been no customer
request to support Async page fault on arm64.
Supporting Async PF on arm64 requires two sets of changes, one in
kvm/arm64 and one in the guest kernel. The Async page fault service
won't be enabled if either kvm/arm64 or the guest kernel doesn't
support it, because the service is negotiated between host and guest.
So I don't think it would be a problem. It's true that Async page fault
only benefits a Linux host running a Linux guest, until it gets
supported by other guest kernels.
The SDEI implementation follows the specification. It's true that
Async PF isn't specified by the arm64 architecture. However, it isn't
an architectural feature on x86 either. I guess the benefits are what
count here: the reason we need Async PF (and SDEI virtualization) is the
benefit. If I'm correct, Async PF has been used broadly on x86 because of
'post-copy live migration', which relies on userfaultfd. 'Async page fault'
is explicitly mentioned in its documentation
(linux/Documentation/admin-guide/mm/userfaultfd.rst), as extracted below.
It's the most important motivation for supporting Async PF.
Yes, performance data would definitely help to measure the benefit,
especially for Async page fault on arm64. I used to revise both
series (SDEI virtualization and Async page fault) together, meaning the
'Async page fault' series was revised whenever the code changed in
the 'SDEI virtualization' series, until I found it would be more practical
to finalize 'SDEI virtualization' before working on 'Async page fault'.
That's why I haven't posted a revised 'Async page fault' series recently.
However, I think the performance data presented at last year's KVM
Forum is still relevant. I will certainly need to collect the performance
data again when I resume work on the 'Async page fault' series after
'SDEI virtualization' is finalized.
https://static.sched.com/hosted_files/kvmforum2021/cb/sdei_apf_for_arm64_gavin.pdf
(On pages 14 and 15: 41% to 68% improvement in post-copy live migration)
Extracted from linux/Documentation/admin-guide/mm/userfaultfd.rst
------------------------------------------------------------------
QEMU/KVM
========
QEMU/KVM is using the ``userfaultfd`` syscall to implement postcopy live
migration. Postcopy live migration is one form of memory
externalization consisting of a virtual machine running with part or
all of its memory residing on a different node in the cloud. The
``userfaultfd`` abstraction is generic enough that not a single line of
KVM kernel code had to be modified in order to add postcopy live
migration to QEMU.
Guest async page faults, ``FOLL_NOWAIT`` and all other ``GUP*`` features work
just fine in combination with userfaults. Userfaults trigger async
page faults in the guest scheduler so those guest processes that
aren't waiting for userfaults (i.e. network bound) can keep running in
the guest vcpus.
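To make the last paragraph of the extract concrete, here is a toy model in C of what "other guest processes keep running" means. It is only a sketch under my own assumptions: the token that pairs a "page not present" notification with the later "page ready" notification, the fixed-size task table and all function names are hypothetical and are not taken from the SDEI or Async PF series.

```c
#include <assert.h>
#include <stdint.h>

#define NTASKS   4
#define NO_TOKEN 0

enum task_state { TASK_RUNNABLE, TASK_WAIT_PAGE };

struct task {
	enum task_state state;
	uint32_t token;		/* which missing page the task waits for */
};

struct guest_sched {
	struct task tasks[NTASKS];
	int current;		/* index of the running task, -1 if none */
};

/* Round-robin: next runnable task after 'from', or -1 if all are waiting. */
int pick_next(const struct guest_sched *s, int from)
{
	for (int i = 1; i <= NTASKS; i++) {
		int idx = (from + i) % NTASKS;

		if (s->tasks[idx].state == TASK_RUNNABLE)
			return idx;
	}
	return -1;
}

/* "Page not present": park the current task and switch to another one. */
int on_page_not_present(struct guest_sched *s, uint32_t token)
{
	assert(token != NO_TOKEN);
	assert(s->current >= 0);
	s->tasks[s->current].state = TASK_WAIT_PAGE;
	s->tasks[s->current].token = token;
	s->current = pick_next(s, s->current);
	return s->current;
}

/* "Page ready": wake every task parked on 'token'; returns how many woke. */
int on_page_ready(struct guest_sched *s, uint32_t token)
{
	int woken = 0;

	for (int i = 0; i < NTASKS; i++) {
		struct task *t = &s->tasks[i];

		if (t->state == TASK_WAIT_PAGE && t->token == token) {
			t->state = TASK_RUNNABLE;
			t->token = NO_TOKEN;
			woken++;
		}
	}
	/* If the vcpu was idle because every task was waiting, resume one. */
	if (s->current < 0 && woken > 0)
		s->current = pick_next(s, -1);
	return woken;
}
```

The point of the model is that the vcpu is handed to another runnable task while the faulting task waits, instead of the whole vcpu stalling until the page arrives.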
Thanks,
Gavin
On 6/24/22 15:12, Marc Zyngier wrote:
> - as far as I know, the core Linux/arm64 maintainers have no plan to
> support APF. Without it, this is a pointless exercise. And even with
> it, this introduces a Linux specific behaviour in an otherwise
> architectural hypervisor (something I'm quite keen on avoiding)
Regarding non-architectural behavior, isn't that the same already for
PTP? I understand that the PTP hypercall is a much smaller
implementation than SDEI+APF, but it goes to show that KVM is already
not "architectural".
There are other cases where paravirtualized solutions can be useful.
PTP is one but there are more where KVM/ARM does not have a solution
yet, for example lock holder preemption. Unless ARM (the company) has a
way to receive input from developers and standardize the interface,
similar to the RISC-V SIGs, vendor-specific hypercalls are a sad fact of
life. It just happened that until now KVM/ARM hasn't seen much use in
some cases (such as desktop virtualization) where overcommitted hosts
are more common.
Async page faults per se are not KVM specific, in fact Linux supported
them for the IBM s390 hypervisor long before KVM added support. They
didn't exist on x86 and ARM, so the developers came up with a new
hypercall API and for x86 honestly it wasn't great. For ARM we learnt
from the mistakes and it seems to me that SDEI is a good match for the
feature. If ARM wants to produce a standard interface for APF, whether
based on SDEI or something else, we're all ears.
Regarding plans of core arm64 maintainers to support async page fault,
can you provide a pointer to the discussion? I agree that if there's a
hard NACK for APF for whatever reason, the whole host-side code is
pointless (including SDEI virtualization); but I would like to read more
about it.
> - It gives an incentive to other hypervisor vendors to add random crap
> to the Linux mm subsystem, which is even worse. At this stage, we
> might as well go back to the Xen PV days altogether.
return -EGREGIOUS;
Since you mention hypervisor vendors and there's only one hypervisor in
Linux, I guess you're not talking about the host mm/ subsystem
(otherwise yeah, FOLL_NOWAIT is only used by KVM async page faults).
So I suppose you're talking about the guest, and then yeah, it sucks to
have multiple hypervisors providing the same functionality in different
ways (or multiple hypervisors providing different subsets of PV
functionality). It happens on x86 with Hyper-V and KVM, and to a lesser
extent Xen and VMware.
But again, KVM/ARM has already crossed that bridge with PTP support, and
the guest needs exactly zero code in the Linux mm subsystem (both
generic and arch-specific) to support asynchronous page faults. There
are 20 lines of code in do_notify_resume(), and the rest is just SDEI
gunk. Again, I would be happy to get a pointer to concrete objections
from the Linux ARM64 maintainers. Maybe a different implementation is
possible, I don't know.
In any case it's absolutely not comparable to Xen PV, and you know it.
> - I haven't seen any of the KVM/arm64 users actually asking for the
> APF horror, and the cloud vendors I directly asked had no plan to
> use it, and not using it on their x86 systems either
Please define "horror" in more technical terms. And since this is the
second time I'm calling you out on this, I'm also asking you to avoid
hyperboles and similar rhetorical gimmicks in the future.
That said: Peter, Sean, Google uses or used postcopy extensively on GCE
(https://dl.acm.org/doi/pdf/10.1145/3296975.3186415). If it doesn't use
it on x86, do you have any insights on why?
> - no performance data nor workloads that could help making an informed
> decision have been disclosed, and the only argument in its favour
> seems to be "but x86 has it" (hardly a compelling one)
Again this is just false, numbers have been posted
(https://lwn.net/ml/linux-kernel/[email protected]/
was the first result that came up from a quick mailing list search). If
they are not enough, please be more specific.
Thanks,
Paolo
On Tue, Jul 05, 2022, Paolo Bonzini wrote:
> That said: Peter, Sean, Google uses or used postcopy extensively on GCE
> (https://dl.acm.org/doi/pdf/10.1145/3296975.3186415). If it doesn't use it
> on x86, do you have any insights on why?
We still use postcopy, but we don't use async #PF. Async #PF is disabled (mostly?)
because the x86 implementation was such a mess prior to switching to IRQ-based
delivery and AFAIK we haven't re-evaluated it since that update.
Hi Vishnu,
On 6/2/23 5:05 PM, Vishnu Pajjuri wrote:
> On 30-05-2022 12:27, Gavin Shan wrote:
>> On 5/30/22 2:47 PM, Shijie Huang wrote:
>>> On 2022/5/27 16:02, Gavin Shan wrote:
>>>> [...]
>>>
>>> I tested this patch set. It's okay.
>>>
>>> Tested-by: Huang Shijie <[email protected]>
>>>
>>
>> [...]
>>
>> I appreciate your efforts to test it through. I will add your
>> Tested-by if a respin is needed. Thank you for your time on this.
>>
> I would like to know the current status of this patch series, since I
> couldn't find a more recent spin of it.
>
> Also, I didn't find any active development branch at
> https://github.com/gwshan/linux.
>
> I observed that the kernel development branch
> "https://github.com/gwshan/linuxkvm/arm64_sdei" moved to
> "https://github.com/gwshan/linuxbackup/kvm/arm64_sdei".
>
> I'm curious what is required to respin this patch series on the latest
> kernel versions.
>
> I'd appreciate any other insights on this.
>
Thanks for raising this. The SDEI events were used to deliver the
signals required by asynchronous page fault (Async PF). I had
several discussions with Marc and Paolo, and we reached the
tentative conclusion that Async PF isn't used in production environments
such as Google Cloud. I have suspended the effort since then. SDEI
virtualization support won't be needed if we don't need Async PF, unless
there are other signals that need to be delivered by SDEI events.
Thanks,
Gavin