On Mon, 2023-09-18 at 14:41 +0000, Paul Durrant wrote:
> From: Paul Durrant <[email protected]>
>
> The VMM should only need to set the KVM_XEN_VCPU_ATTR_TYPE_VCPU_INFO
> attribute in response to a VCPUOP_register_vcpu_info hypercall. We can
> handle the default case internally since we already know where the
> shared_info page is. Modify get_vcpu_info_cache() to pass back a pointer
> to the shared info pfn cache (and appropriate offset) for any of the
> first 32 vCPUs if the attribute has not been set.
>
> A VMM will be able to determine whether it needs to set up default
> vcpu_info using the previously defined KVM_XEN_HVM_CONFIG_SHARED_INFO_HVA
> Xen capability flag, which will be advertized in a subsequent patch.
>
> Also update the KVM API documentation to describe the new behaviour.
>
> Signed-off-by: Paul Durrant <[email protected]>
> ---
> Cc: Sean Christopherson <[email protected]>
> Cc: Paolo Bonzini <[email protected]>
> Cc: Thomas Gleixner <[email protected]>
> Cc: Ingo Molnar <[email protected]>
> Cc: Borislav Petkov <[email protected]>
> Cc: Dave Hansen <[email protected]>
> Cc: "H. Peter Anvin" <[email protected]>
> Cc: David Woodhouse <[email protected]>
> Cc: [email protected]
>
> v3:
> - Add a note to the API documentation discussing vcpu_info copying.
>
> v2:
> - Dispense with the KVM_XEN_HVM_CONFIG_DEFAULT_VCPU_INFO capability.
> - Add API documentation.
> ---
> Documentation/virt/kvm/api.rst | 22 +++++++++++++++-------
> arch/x86/kvm/xen.c | 15 +++++++++++++++
> 2 files changed, 30 insertions(+), 7 deletions(-)
>
> diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
> index e9df4df6fe48..47bf3db74674 100644
> --- a/Documentation/virt/kvm/api.rst
> +++ b/Documentation/virt/kvm/api.rst
> @@ -5442,13 +5442,7 @@ KVM_XEN_ATTR_TYPE_LONG_MODE
>
> KVM_XEN_ATTR_TYPE_SHARED_INFO
> Sets the guest physical frame number at which the Xen shared_info
> - page resides. Note that although Xen places vcpu_info for the first
> - 32 vCPUs in the shared_info page, KVM does not automatically do so
> - and instead requires that KVM_XEN_VCPU_ATTR_TYPE_VCPU_INFO be used
> - explicitly even when the vcpu_info for a given vCPU resides at the
> - "default" location in the shared_info page. This is because KVM may
> - not be aware of the Xen CPU id which is used as the index into the
> - vcpu_info[] array, so may not know the correct default location.
> + page resides.
>
> Note that the shared_info page may be constantly written to by KVM;
> it contains the event channel bitmap used to deliver interrupts to
> @@ -5564,12 +5558,26 @@ type values:
>
> KVM_XEN_VCPU_ATTR_TYPE_VCPU_INFO
> Sets the guest physical address of the vcpu_info for a given vCPU.
> + The vcpu_info for the first 32 vCPUs defaults to the structures
> + embedded in the shared_info page.

The above is true only if KVM has KVM_XEN_HVM_CONFIG_SHARED_INFO_HVA.
You kind of touch on that next, but perhaps the 'if the KVM_...'
condition should be moved up?

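To make the resulting rule concrete, here is a small sketch of the decision a VMM would make. This is a hypothetical helper, not a KVM API; the names and the function itself are illustrative only:

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_VIRT_CPUS 32	/* Xen: only the first 32 vCPUs have slots in shared_info */

/*
 * Hypothetical helper (not a KVM interface) capturing the rule as
 * documented: the VMM must set KVM_XEN_VCPU_ATTR_TYPE_VCPU_INFO itself
 * unless KVM advertises KVM_XEN_HVM_CONFIG_SHARED_INFO_HVA *and* the
 * vCPU has a default slot in the shared_info page.
 */
static bool vmm_must_set_vcpu_info(bool has_shared_info_hva_cap,
				   unsigned int xen_vcpu_id)
{
	return !has_shared_info_hva_cap || xen_vcpu_id >= MAX_VIRT_CPUS;
}
```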
> + If the KVM_XEN_HVM_CONFIG_SHARED_INFO_HVA flag is also set in the
> + Xen capabilities then the VMM is not required to set this default
> + location; KVM will handle that internally. Otherwise this attribute
> + must be set for all vCPUs.
> +
> As with the shared_info page for the VM, the corresponding page may be
> dirtied at any time if event channel interrupt delivery is enabled, so
> userspace should always assume that the page is dirty without relying
> on dirty logging. Setting the gpa to KVM_XEN_INVALID_GPA will disable
> the vcpu_info.
>
> + Note that, if the guest sets an explicit vcpu_info location in guest
> + memory then the VMM is expected to copy the content of the structure
> + embedded in the shared_info page to the new location. It is therefore
> + important that no event delivery is in progress at this time, otherwise
> + events may be missed.
>

That's difficult. It means tearing down all interrupts from passthrough
devices which are mapped via PIRQs, and also all IPIs.
The IPI code *should* be able to fall back to just letting the VMM
handle the hypercall in userspace. But PIRQs are harder. I'd be happier
if our plan — handwavy though it may be — led to being able to use the
existing slow path for delivering interrupts by just *invalidating* the
cache. Maybe we *should* move the memcpy into the kernel, and let it
lock *both* the shinfo and new vcpu_info caches while it's doing the
copy? Given that that's the only valid transition, that shouldn't be so
hard, should it?
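Something like this, perhaps. A userspace model of the idea, with toy stand-ins for the shinfo and vcpu_info pfn caches (all names, sizes and the lock type here are illustrative, not the real KVM structures):

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_VCPU_INFO_SIZE 64	/* illustrative slot size, not the real ABI size */

/* Toy stand-ins for the shinfo and vcpu_info gfn_to_pfn caches. */
struct toy_lock  { int held; };
struct toy_cache { struct toy_lock lock; unsigned char buf[4096]; };

static void toy_lock_acquire(struct toy_lock *l) { assert(!l->held); l->held = 1; }
static void toy_lock_release(struct toy_lock *l) { assert(l->held);  l->held = 0; }

/*
 * Model of the suggestion above (not actual KVM code): copy the default
 * vcpu_info slot out of the shared_info page into the newly registered
 * vcpu_info area while holding both caches' locks, so that event
 * delivery through either mapping cannot race with the switch.
 */
static void toy_switch_vcpu_info(struct toy_cache *shinfo, size_t slot_offset,
				 struct toy_cache *vcpu_info)
{
	/*
	 * Fixed order (shinfo first, then vcpu_info): this is the only
	 * transition that takes both locks, so no ordering cycle arises.
	 */
	toy_lock_acquire(&shinfo->lock);
	toy_lock_acquire(&vcpu_info->lock);

	memcpy(vcpu_info->buf, shinfo->buf + slot_offset, TOY_VCPU_INFO_SIZE);

	toy_lock_release(&vcpu_info->lock);
	toy_lock_release(&shinfo->lock);
}
```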
> KVM_XEN_VCPU_ATTR_TYPE_VCPU_TIME_INFO
> Sets the guest physical address of an additional pvclock structure
> for a given vCPU. This is typically used for guest vsyscall support.
> diff --git a/arch/x86/kvm/xen.c b/arch/x86/kvm/xen.c
> index 459f3ca4710e..660a808c0b50 100644
> --- a/arch/x86/kvm/xen.c
> +++ b/arch/x86/kvm/xen.c
> @@ -491,6 +491,21 @@ static void kvm_xen_inject_vcpu_vector(struct kvm_vcpu *v)
>
> static struct gfn_to_pfn_cache *get_vcpu_info_cache(struct kvm_vcpu *v, unsigned long *offset)
> {
> + if (!v->arch.xen.vcpu_info_cache.active && v->arch.xen.vcpu_id < MAX_VIRT_CPUS) {
> + struct kvm *kvm = v->kvm;
> +
> + if (offset) {
> + if (IS_ENABLED(CONFIG_64BIT) && kvm->arch.xen.long_mode)
> + *offset = offsetof(struct shared_info,
> + vcpu_info[v->arch.xen.vcpu_id]);
> + else
> + *offset = offsetof(struct compat_shared_info,
> + vcpu_info[v->arch.xen.vcpu_id]);
> + }
> +
> + return &kvm->arch.xen.shinfo_cache;
> + }
> +
> if (offset)
> *offset = 0;
>
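For reference, the selection logic in the hunk above can be modelled in plain C. The struct layouts and sizes here are illustrative stand-ins, not the real Xen ABI definitions; only the array-in-shared_info shape matters:

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative stand-ins for the Xen ABI structs (sizes are made up). */
struct toy_vcpu_info        { unsigned char pad[64]; };
struct toy_compat_vcpu_info { unsigned char pad[48]; };

struct toy_shared_info        { struct toy_vcpu_info        vcpu_info[32]; };
struct toy_compat_shared_info { struct toy_compat_vcpu_info vcpu_info[32]; };

/*
 * Mirror of the patch's selection: the default vcpu_info for the first
 * 32 vCPUs is just an offset into the shared_info page, picked by the
 * guest's addressing mode. (The kernel writes this as
 * offsetof(..., vcpu_info[id]) with a runtime index, a GCC extension;
 * the plain arithmetic below is equivalent.)
 */
static size_t default_vcpu_info_offset(int long_mode, unsigned int vcpu_id)
{
	if (long_mode)
		return offsetof(struct toy_shared_info, vcpu_info) +
		       vcpu_id * sizeof(struct toy_vcpu_info);
	return offsetof(struct toy_compat_shared_info, vcpu_info) +
	       vcpu_id * sizeof(struct toy_compat_vcpu_info);
}
```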
On 18/09/2023 17:07, David Woodhouse wrote:
> On Mon, 2023-09-18 at 14:41 +0000, Paul Durrant wrote:
>> From: Paul Durrant <[email protected]>
>>
>> The VMM should only need to set the KVM_XEN_VCPU_ATTR_TYPE_VCPU_INFO
>> attribute in response to a VCPUOP_register_vcpu_info hypercall. We can
>> handle the default case internally since we already know where the
>> shared_info page is. Modify get_vcpu_info_cache() to pass back a pointer
>> to the shared info pfn cache (and appropriate offset) for any of the
>> first 32 vCPUs if the attribute has not been set.
>>
>> A VMM will be able to determine whether it needs to set up default
>> vcpu_info using the previously defined KVM_XEN_HVM_CONFIG_SHARED_INFO_HVA
>> Xen capability flag, which will be advertized in a subsequent patch.
>>
>> Also update the KVM API documentation to describe the new behaviour.
>>
>> Signed-off-by: Paul Durrant <[email protected]>
>> ---
>> Cc: Sean Christopherson <[email protected]>
>> Cc: Paolo Bonzini <[email protected]>
>> Cc: Thomas Gleixner <[email protected]>
>> Cc: Ingo Molnar <[email protected]>
>> Cc: Borislav Petkov <[email protected]>
>> Cc: Dave Hansen <[email protected]>
>> Cc: "H. Peter Anvin" <[email protected]>
>> Cc: David Woodhouse <[email protected]>
>> Cc: [email protected]
>>
>> v3:
>> - Add a note to the API documentation discussing vcpu_info copying.
>>
>> v2:
>> - Dispense with the KVM_XEN_HVM_CONFIG_DEFAULT_VCPU_INFO capability.
>> - Add API documentation.
>> ---
>> Documentation/virt/kvm/api.rst | 22 +++++++++++++++-------
>> arch/x86/kvm/xen.c | 15 +++++++++++++++
>> 2 files changed, 30 insertions(+), 7 deletions(-)
>>
>> diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
>> index e9df4df6fe48..47bf3db74674 100644
>> --- a/Documentation/virt/kvm/api.rst
>> +++ b/Documentation/virt/kvm/api.rst
>> @@ -5442,13 +5442,7 @@ KVM_XEN_ATTR_TYPE_LONG_MODE
>>
>> KVM_XEN_ATTR_TYPE_SHARED_INFO
>> Sets the guest physical frame number at which the Xen shared_info
>> - page resides. Note that although Xen places vcpu_info for the first
>> - 32 vCPUs in the shared_info page, KVM does not automatically do so
>> - and instead requires that KVM_XEN_VCPU_ATTR_TYPE_VCPU_INFO be used
>> - explicitly even when the vcpu_info for a given vCPU resides at the
>> - "default" location in the shared_info page. This is because KVM may
>> - not be aware of the Xen CPU id which is used as the index into the
>> - vcpu_info[] array, so may not know the correct default location.
>> + page resides.
>>
>> Note that the shared_info page may be constantly written to by KVM;
>> it contains the event channel bitmap used to deliver interrupts to
>> @@ -5564,12 +5558,26 @@ type values:
>>
>> KVM_XEN_VCPU_ATTR_TYPE_VCPU_INFO
>> Sets the guest physical address of the vcpu_info for a given vCPU.
>> + The vcpu_info for the first 32 vCPUs defaults to the structures
>> + embedded in the shared_info page.
>
> The above is true only if KVM has KVM_XEN_HVM_CONFIG_SHARED_INFO_HVA.
> You kind of touch on that next, but perhaps the 'if the KVM_...'
> condition should be moved up?
>
>> + If the KVM_XEN_HVM_CONFIG_SHARED_INFO_HVA flag is also set in the
>> + Xen capabilities then the VMM is not required to set this default
>> + location; KVM will handle that internally. Otherwise this attribute
>> + must be set for all vCPUs.
>> +
>> As with the shared_info page for the VM, the corresponding page may be
>> dirtied at any time if event channel interrupt delivery is enabled, so
>> userspace should always assume that the page is dirty without relying
>> on dirty logging. Setting the gpa to KVM_XEN_INVALID_GPA will disable
>> the vcpu_info.
>>
>> + Note that, if the guest sets an explicit vcpu_info location in guest
>> + memory then the VMM is expected to copy the content of the structure
>> + embedded in the shared_info page to the new location. It is therefore
>> + important that no event delivery is in progress at this time, otherwise
>> + events may be missed.
>>
>
> That's difficult. It means tearing down all interrupts from passthrough
> devices which are mapped via PIRQs, and also all IPIs.

So those don't honour event channel masking? That seems like a problem.
>
> The IPI code *should* be able to fall back to just letting the VMM
> handle the hypercall in userspace. But PIRQs are harder. I'd be happier
> if our plan — handwavy though it may be — led to being able to use the
> existing slow path for delivering interrupts by just *invalidating* the
> cache. Maybe we *should* move the memcpy into the kernel, and let it
> lock *both* the shinfo and new vcpu_info caches while it's doing the
> copy? Given that that's the only valid transition, that shouldn't be so
> hard, should it?
>
No, it just kind of oversteps the remit of the attribute... but I'll try
adding it and see how messy it gets.

  Paul

>> KVM_XEN_VCPU_ATTR_TYPE_VCPU_TIME_INFO
>> Sets the guest physical address of an additional pvclock structure
>> for a given vCPU. This is typically used for guest vsyscall support.
>> diff --git a/arch/x86/kvm/xen.c b/arch/x86/kvm/xen.c
>> index 459f3ca4710e..660a808c0b50 100644
>> --- a/arch/x86/kvm/xen.c
>> +++ b/arch/x86/kvm/xen.c
>> @@ -491,6 +491,21 @@ static void kvm_xen_inject_vcpu_vector(struct kvm_vcpu *v)
>>
>> static struct gfn_to_pfn_cache *get_vcpu_info_cache(struct kvm_vcpu *v, unsigned long *offset)
>> {
>> + if (!v->arch.xen.vcpu_info_cache.active && v->arch.xen.vcpu_id < MAX_VIRT_CPUS) {
>> + struct kvm *kvm = v->kvm;
>> +
>> + if (offset) {
>> + if (IS_ENABLED(CONFIG_64BIT) && kvm->arch.xen.long_mode)
>> + *offset = offsetof(struct shared_info,
>> + vcpu_info[v->arch.xen.vcpu_id]);
>> + else
>> + *offset = offsetof(struct compat_shared_info,
>> + vcpu_info[v->arch.xen.vcpu_id]);
>> + }
>> +
>> + return &kvm->arch.xen.shinfo_cache;
>> + }
>> +
>> if (offset)
>> *offset = 0;
>>
>