This is the fixed version of the patches from last week. To sum up, the
issue is the following.
Now that APICv can be disabled per-vCPU (depending on whether the vCPU
has a setup that is incompatible with it), we need to deal with guests
that have a mix of vCPUs with posted interrupts enabled and disabled.
For assigned devices, the posted interrupt configuration must be the
same across the whole VM, so posted interrupts have to be handled by
hand on vCPUs where they are disabled.
Patches 1-3 handle the regular posted interrupt vector, while patch 4
handles the wakeup vector.
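To make the intended behavior easier to follow, here is a minimal
standalone sketch (plain C; the struct and function names are made up
for illustration and are not the actual KVM code) of the per-vCPU
decision the series implements: the PIR is drained into the IRR whenever
the in-kernel LAPIC is enabled, even if APICv is inhibited on that vCPU.
#include <stdbool.h>

/* Hypothetical per-vCPU flags, mirroring the checks used in the patches. */
struct vcpu_flags {
	bool lapic_enabled;	/* in-kernel LAPIC present and enabled */
	bool apicv_active;	/* APICv is not inhibited on this vCPU */
};

/*
 * Should vcpu_enter_guest() drain the PIR into the IRR?  Assigned devices
 * may post interrupts even when APICv is inhibited, so the only gate is
 * whether the in-kernel LAPIC is enabled at all.
 */
static bool should_sync_pir_to_irr(const struct vcpu_flags *v)
{
	return v->lapic_enabled;
}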
Paolo Bonzini (4):
KVM: x86: ignore APICv if LAPIC is not enabled
KVM: VMX: prepare sync_pir_to_irr for running with APICv disabled
KVM: x86: check PIR even for vCPUs with disabled APICv
KVM: x86: Use a stable condition around all VT-d PI paths
arch/x86/kvm/lapic.c | 2 +-
arch/x86/kvm/svm/svm.c | 1 -
arch/x86/kvm/vmx/posted_intr.c | 20 ++++++++++---------
arch/x86/kvm/vmx/vmx.c | 35 ++++++++++++++++++++++------------
arch/x86/kvm/x86.c | 18 ++++++++---------
5 files changed, 44 insertions(+), 32 deletions(-)
--
2.27.0
Currently, checks for whether VT-d PI can be used refer to the current
status of the feature in the current vCPU; or they more or less pick
vCPU 0 in case a specific vCPU is not available.
However, these checks do not attempt to synchronize with changes to
the IRTE. In particular, there is no path that updates the IRTE when
APICv is re-activated on vCPU 0; and there is no path to wake up a vCPU
that has APICv disabled, if the wakeup occurs because of an IRTE
that points to a posted interrupt.
To fix this, always go through the VT-d PI path as long as there are
assigned devices and APICv is available on both the host and the VM side.
Since the relevant condition was copied over three times, take the hint
and factor it into a separate function.
Suggested-by: Sean Christopherson <[email protected]>
Cc: [email protected]
Reviewed-by: Sean Christopherson <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
---
arch/x86/kvm/vmx/posted_intr.c | 20 +++++++++++---------
1 file changed, 11 insertions(+), 9 deletions(-)
diff --git a/arch/x86/kvm/vmx/posted_intr.c b/arch/x86/kvm/vmx/posted_intr.c
index 5f81ef092bd4..1c94783b5a54 100644
--- a/arch/x86/kvm/vmx/posted_intr.c
+++ b/arch/x86/kvm/vmx/posted_intr.c
@@ -5,6 +5,7 @@
#include <asm/cpu.h>
#include "lapic.h"
+#include "irq.h"
#include "posted_intr.h"
#include "trace.h"
#include "vmx.h"
@@ -77,13 +78,18 @@ void vmx_vcpu_pi_load(struct kvm_vcpu *vcpu, int cpu)
pi_set_on(pi_desc);
}
+static bool vmx_can_use_vtd_pi(struct kvm *kvm)
+{
+ return irqchip_in_kernel(kvm) && enable_apicv &&
+ kvm_arch_has_assigned_device(kvm) &&
+ irq_remapping_cap(IRQ_POSTING_CAP);
+}
+
void vmx_vcpu_pi_put(struct kvm_vcpu *vcpu)
{
struct pi_desc *pi_desc = vcpu_to_pi_desc(vcpu);
- if (!kvm_arch_has_assigned_device(vcpu->kvm) ||
- !irq_remapping_cap(IRQ_POSTING_CAP) ||
- !kvm_vcpu_apicv_active(vcpu))
+ if (!vmx_can_use_vtd_pi(vcpu->kvm))
return;
/* Set SN when the vCPU is preempted */
@@ -141,9 +147,7 @@ int pi_pre_block(struct kvm_vcpu *vcpu)
struct pi_desc old, new;
struct pi_desc *pi_desc = vcpu_to_pi_desc(vcpu);
- if (!kvm_arch_has_assigned_device(vcpu->kvm) ||
- !irq_remapping_cap(IRQ_POSTING_CAP) ||
- !kvm_vcpu_apicv_active(vcpu))
+ if (!vmx_can_use_vtd_pi(vcpu->kvm))
return 0;
WARN_ON(irqs_disabled());
@@ -270,9 +274,7 @@ int pi_update_irte(struct kvm *kvm, unsigned int host_irq, uint32_t guest_irq,
struct vcpu_data vcpu_info;
int idx, ret = 0;
- if (!kvm_arch_has_assigned_device(kvm) ||
- !irq_remapping_cap(IRQ_POSTING_CAP) ||
- !kvm_vcpu_apicv_active(kvm->vcpus[0]))
+ if (!vmx_can_use_vtd_pi(kvm))
return 0;
idx = srcu_read_lock(&kvm->irq_srcu);
--
2.27.0
Synchronize the condition for the two calls to kvm_x86_sync_pir_to_irr.
The one in the reenter-guest fast path invokes the callback
unconditionally, even if the LAPIC is disabled.
Cc: [email protected]
Signed-off-by: Paolo Bonzini <[email protected]>
---
arch/x86/kvm/x86.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 5a403d92833f..441f4769173e 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -9849,7 +9849,7 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
if (likely(exit_fastpath != EXIT_FASTPATH_REENTER_GUEST))
break;
- if (vcpu->arch.apicv_active)
+ if (kvm_lapic_enabled(vcpu) && vcpu->arch.apicv_active)
static_call(kvm_x86_sync_pir_to_irr)(vcpu);
if (unlikely(kvm_vcpu_exit_request(vcpu))) {
--
2.27.0
If APICv is disabled for this vCPU, assigned devices may still attempt
to post interrupts. In that case, we need to cancel the vmentry and
deliver the interrupt with KVM_REQ_EVENT. Extend the existing code that
handles injection of L1 interrupts into L2 to cover this case as well.
vmx_hwapic_irr_update is only called when APICv is active so it would be
confusing to add a check for vcpu->arch.apicv_active in there. Instead,
just use vmx_set_rvi directly in vmx_sync_pir_to_irr.
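The resulting delivery decision at the end of vmx_sync_pir_to_irr can be
summarized with the following standalone sketch (plain C with made-up
names, not the actual KVM code):
#include <stdbool.h>

struct sync_state {
	bool in_guest_mode;	/* vCPU is running L2 */
	bool apicv_active;	/* APICv is not inhibited on this vCPU */
	bool new_irr_bits;	/* PIR->IRR sync found a new pending vector */
};

enum delivery { VIA_RVI, VIA_KVM_REQ_EVENT, NOTHING_TO_DO };

/*
 * RVI is usable only when running L1 with APICv active; otherwise a newly
 * found vector has to go through KVM_REQ_EVENT, which can also cancel the
 * vmentry.
 */
static enum delivery choose_delivery(const struct sync_state *s)
{
	if (!s->in_guest_mode && s->apicv_active)
		return VIA_RVI;
	if (s->new_irr_bits)
		return VIA_KVM_REQ_EVENT;
	return NOTHING_TO_DO;
}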
Cc: [email protected]
Signed-off-by: Paolo Bonzini <[email protected]>
---
arch/x86/kvm/vmx/vmx.c | 35 +++++++++++++++++++++++------------
1 file changed, 23 insertions(+), 12 deletions(-)
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index ba66c171d951..cccf1eab58ac 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -6264,7 +6264,7 @@ static int vmx_sync_pir_to_irr(struct kvm_vcpu *vcpu)
int max_irr;
bool max_irr_updated;
- if (KVM_BUG_ON(!vcpu->arch.apicv_active, vcpu->kvm))
+ if (KVM_BUG_ON(!enable_apicv, vcpu->kvm))
return -EIO;
if (pi_test_on(&vmx->pi_desc)) {
@@ -6276,20 +6276,31 @@ static int vmx_sync_pir_to_irr(struct kvm_vcpu *vcpu)
smp_mb__after_atomic();
max_irr_updated =
kvm_apic_update_irr(vcpu, vmx->pi_desc.pir, &max_irr);
-
- /*
- * If we are running L2 and L1 has a new pending interrupt
- * which can be injected, this may cause a vmexit or it may
- * be injected into L2. Either way, this interrupt will be
- * processed via KVM_REQ_EVENT, not RVI, because we do not use
- * virtual interrupt delivery to inject L1 interrupts into L2.
- */
- if (is_guest_mode(vcpu) && max_irr_updated)
- kvm_make_request(KVM_REQ_EVENT, vcpu);
} else {
max_irr = kvm_lapic_find_highest_irr(vcpu);
+ max_irr_updated = false;
}
- vmx_hwapic_irr_update(vcpu, max_irr);
+
+ /*
+ * If virtual interrupt delivery is not in use, the interrupt
+ * will be processed via KVM_REQ_EVENT, not RVI. This can happen
+ * in two cases:
+ *
+ * 1) If we are running L2 and L1 has a new pending interrupt
+ * which can be injected, this may cause a vmexit or it may
+ * be injected into L2. We do not use virtual interrupt
+ * delivery to inject L1 interrupts into L2.
+ *
+ * 2) If APICv is disabled for this vCPU, assigned devices may
+ * still attempt to post interrupts. The posted interrupt
+ * vector will cause a vmexit and the subsequent entry will
+ * call sync_pir_to_irr.
+ */
+ if (!is_guest_mode(vcpu) && vcpu->arch.apicv_active)
+ vmx_set_rvi(max_irr);
+ else if (max_irr_updated)
+ kvm_make_request(KVM_REQ_EVENT, vcpu);
+
return max_irr;
}
--
2.27.0
The IRTE for an assigned device can trigger a POSTED_INTR_VECTOR even
if APICv is disabled on the vCPU that receives it. In that case, the
interrupt will just cause a vmexit and leave the ON bit set together
with the PIR bit corresponding to the interrupt.
Right now, the interrupt would not be delivered until APICv is re-enabled.
However, fixing this is just a matter of always doing the PIR->IRR
synchronization, even if the vCPU has temporarily disabled APICv.
This is not a problem for performance; if anything, it is an
improvement. First, in the common case where vcpu->arch.apicv_active is
true, one fewer check has to be performed. Second, static_call_cond will
elide the function call if APICv is not present or disabled. Finally,
on AMD hardware we can remove the sync_pir_to_irr callback: it is only
needed for apic_has_interrupt_for_ppr, and that function already has a
fallback for !APICv.
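To illustrate the mechanism this relies on, below is a minimal
standalone sketch (plain C; this is not KVM's struct pi_desc and the
names are invented for illustration) of how a posted interrupt is
recorded by the IOMMU and later drained by the PIR->IRR synchronization,
regardless of whether APICv is active on the target vCPU.
#include <stdbool.h>
#include <stdint.h>

struct pi_sketch {
	uint64_t pir[4];	/* one bit per vector, 256 vectors total */
	bool on;		/* outstanding-notification bit */
};

/* What the IOMMU conceptually does when it posts vector 'vec'. */
static void post_interrupt(struct pi_sketch *p, uint8_t vec)
{
	p->pir[vec / 64] |= 1ULL << (vec % 64);
	p->on = true;
}

/*
 * What the PIR->IRR synchronization conceptually does: clear ON and move
 * the pending bits into the IRR, whether or not APICv is active.
 */
static void drain_pir_into_irr(struct pi_sketch *p, uint64_t irr[4])
{
	int i;

	if (!p->on)
		return;
	p->on = false;
	for (i = 0; i < 4; i++) {
		irr[i] |= p->pir[i];
		p->pir[i] = 0;
	}
}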
Cc: [email protected]
Co-developed-by: Sean Christopherson <[email protected]>
Signed-off-by: Sean Christopherson <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
---
arch/x86/kvm/lapic.c | 2 +-
arch/x86/kvm/svm/svm.c | 1 -
arch/x86/kvm/x86.c | 18 +++++++++---------
3 files changed, 10 insertions(+), 11 deletions(-)
diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index 759952dd1222..f206fc35deff 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -707,7 +707,7 @@ static void pv_eoi_clr_pending(struct kvm_vcpu *vcpu)
static int apic_has_interrupt_for_ppr(struct kvm_lapic *apic, u32 ppr)
{
int highest_irr;
- if (apic->vcpu->arch.apicv_active)
+ if (kvm_x86_ops.sync_pir_to_irr)
highest_irr = static_call(kvm_x86_sync_pir_to_irr)(apic->vcpu);
else
highest_irr = apic_find_highest_irr(apic);
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 5630c241d5f6..d0f68d11ec70 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -4651,7 +4651,6 @@ static struct kvm_x86_ops svm_x86_ops __initdata = {
.load_eoi_exitmap = svm_load_eoi_exitmap,
.hwapic_irr_update = svm_hwapic_irr_update,
.hwapic_isr_update = svm_hwapic_isr_update,
- .sync_pir_to_irr = kvm_lapic_find_highest_irr,
.apicv_post_state_restore = avic_post_state_restore,
.set_tss_addr = svm_set_tss_addr,
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 441f4769173e..a8f12c83db4b 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -4448,8 +4448,7 @@ void kvm_arch_vcpu_put(struct kvm_vcpu *vcpu)
static int kvm_vcpu_ioctl_get_lapic(struct kvm_vcpu *vcpu,
struct kvm_lapic_state *s)
{
- if (vcpu->arch.apicv_active)
- static_call(kvm_x86_sync_pir_to_irr)(vcpu);
+ static_call_cond(kvm_x86_sync_pir_to_irr)(vcpu);
return kvm_apic_get_state(vcpu, s);
}
@@ -9528,8 +9527,7 @@ static void vcpu_scan_ioapic(struct kvm_vcpu *vcpu)
if (irqchip_split(vcpu->kvm))
kvm_scan_ioapic_routes(vcpu, vcpu->arch.ioapic_handled_vectors);
else {
- if (vcpu->arch.apicv_active)
- static_call(kvm_x86_sync_pir_to_irr)(vcpu);
+ static_call_cond(kvm_x86_sync_pir_to_irr)(vcpu);
if (ioapic_in_kernel(vcpu->kvm))
kvm_ioapic_scan_entry(vcpu, vcpu->arch.ioapic_handled_vectors);
}
@@ -9802,10 +9800,12 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
/*
* This handles the case where a posted interrupt was
- * notified with kvm_vcpu_kick.
+ * notified with kvm_vcpu_kick. Assigned devices can
+ * use the POSTED_INTR_VECTOR even if APICv is disabled,
+ * so do it even if APICv is disabled on this vCPU.
*/
- if (kvm_lapic_enabled(vcpu) && vcpu->arch.apicv_active)
- static_call(kvm_x86_sync_pir_to_irr)(vcpu);
+ if (kvm_lapic_enabled(vcpu))
+ static_call_cond(kvm_x86_sync_pir_to_irr)(vcpu);
if (kvm_vcpu_exit_request(vcpu)) {
vcpu->mode = OUTSIDE_GUEST_MODE;
@@ -9849,8 +9849,8 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
if (likely(exit_fastpath != EXIT_FASTPATH_REENTER_GUEST))
break;
- if (kvm_lapic_enabled(vcpu) && vcpu->arch.apicv_active)
- static_call(kvm_x86_sync_pir_to_irr)(vcpu);
+ if (kvm_lapic_enabled(vcpu))
+ static_call_cond(kvm_x86_sync_pir_to_irr)(vcpu);
if (unlikely(kvm_vcpu_exit_request(vcpu))) {
exit_fastpath = EXIT_FASTPATH_EXIT_HANDLED;
--
2.27.0
On Mon, Nov 22, 2021, Paolo Bonzini wrote:
> Synchronize the condition for the two calls to kvm_x86_sync_pir_to_irr.
> The one in the reenter-guest fast path invoked the callback
> unconditionally even if LAPIC is disabled.
>
> Cc: [email protected]
> Signed-off-by: Paolo Bonzini <[email protected]>
> ---
> arch/x86/kvm/x86.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 5a403d92833f..441f4769173e 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -9849,7 +9849,7 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
> if (likely(exit_fastpath != EXIT_FASTPATH_REENTER_GUEST))
> break;
>
> - if (vcpu->arch.apicv_active)
> + if (kvm_lapic_enabled(vcpu) && vcpu->arch.apicv_active)
Ooooh, lapic _enabled_, not just present. That took me far too long to read...
Reviewed-by: Sean Christopherson <[email protected]>
> static_call(kvm_x86_sync_pir_to_irr)(vcpu);
>
> if (unlikely(kvm_vcpu_exit_request(vcpu))) {
> --
> 2.27.0
>
>
On Mon, Nov 22, 2021, Paolo Bonzini wrote:
> If APICv is disabled for this vCPU, assigned devices may
> still attempt to post interrupts. In that case, we need
> to cancel the vmentry and deliver the interrupt with
> KVM_REQ_EVENT. Extend the existing code that handles
> injection of L1 interrupts into L2 to cover this case
> as well.
>
> vmx_hwapic_irr_update is only called when APICv is active
> so it would be confusing to add a check for
> vcpu->arch.apicv_active in there. Instead, just use
> vmx_set_rvi directly in vmx_sync_pir_to_irr.
Overzealous newlines.
If APICv is disabled for this vCPU, assigned devices may still attempt to
post interrupts. In that case, we need to cancel the vmentry and deliver
the interrupt with KVM_REQ_EVENT. Extend the existing code that handles
injection of L1 interrupts into L2 to cover this case as well.
vmx_hwapic_irr_update is only called when APICv is active so it would be
confusing to add a check for vcpu->arch.apicv_active in there. Instead,
just use vmx_set_rvi directly in vmx_sync_pir_to_irr.
> Cc: [email protected]
> Signed-off-by: Paolo Bonzini <[email protected]>
> ---
> arch/x86/kvm/vmx/vmx.c | 35 +++++++++++++++++++++++------------
> 1 file changed, 23 insertions(+), 12 deletions(-)
>
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index ba66c171d951..cccf1eab58ac 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -6264,7 +6264,7 @@ static int vmx_sync_pir_to_irr(struct kvm_vcpu *vcpu)
> int max_irr;
> bool max_irr_updated;
>
> - if (KVM_BUG_ON(!vcpu->arch.apicv_active, vcpu->kvm))
> + if (KVM_BUG_ON(!enable_apicv, vcpu->kvm))
> return -EIO;
>
> if (pi_test_on(&vmx->pi_desc)) {
> @@ -6276,20 +6276,31 @@ static int vmx_sync_pir_to_irr(struct kvm_vcpu *vcpu)
> smp_mb__after_atomic();
> max_irr_updated =
> kvm_apic_update_irr(vcpu, vmx->pi_desc.pir, &max_irr);
> -
> - /*
> - * If we are running L2 and L1 has a new pending interrupt
> - * which can be injected, this may cause a vmexit or it may
> - * be injected into L2. Either way, this interrupt will be
> - * processed via KVM_REQ_EVENT, not RVI, because we do not use
> - * virtual interrupt delivery to inject L1 interrupts into L2.
> - */
> - if (is_guest_mode(vcpu) && max_irr_updated)
> - kvm_make_request(KVM_REQ_EVENT, vcpu);
> } else {
> max_irr = kvm_lapic_find_highest_irr(vcpu);
> + max_irr_updated = false;
Heh, maybe s/max_irr_updated/new_pir_found or so? This is a bit weird:
1. Update max_irr
2. max_irr_updated = false
> }
> - vmx_hwapic_irr_update(vcpu, max_irr);
> +
> + /*
> + * If virtual interrupt delivery is not in use, the interrupt
> + * will be processed via KVM_REQ_EVENT, not RVI. This can happen
I'd strongly prefer to phrase this as a command, e.g. "process the interrupt via
KVM_REQ_EVENT". "will be processed" makes it sound like some other flow is
handling the event, which confused me.
> + * in two cases:
> + *
> + * 1) If we are running L2 and L1 has a new pending interrupt
Hmm, calling it L1's interrupt isn't technically wrong, but it's a bit confusing
because it's handled in L2. Maybe just say "vCPU"?
> + * which can be injected, this may cause a vmexit or it may
Overzealous newlines again.
> + * be injected into L2. We do not use virtual interrupt
> + * delivery to inject L1 interrupts into L2.
I found the part of "may cause a vmexit or may be injected into L2" hard to
follow, I think because this doesn't explicitly state that KVM_REQ_EVENT
is needed to synthesize a VM-Exit.
How about this for the comment?
/*
* If virtual interrupt delivery is not in use, process the interrupt
* via KVM_REQ_EVENT, not RVI. This is necessary in two cases:
*
* 1) If L2 is running and the vCPU has a new pending interrupt. If L1
* wants to exit on interrupts, KVM_REQ_EVENT is needed to synthesize a
* VM-Exit to L1. If L1 doesn't want to exit, the interrupt is injected
* into L2, but KVM doesn't use virtual interrupt delivery to inject
* interrupts into L2, and so KVM_REQ_EVENT is again needed.
*
* 2) If APICv is disabled for this vCPU, assigned devices may still
* attempt to post interrupts. The posted interrupt vector will cause
* a VM-Exit and the subsequent entry will call sync_pir_to_irr.
*/
> + *
> + * 2) If APICv is disabled for this vCPU, assigned devices may
> + * still attempt to post interrupts. The posted interrupt
> + * vector will cause a vmexit and the subsequent entry will
> + * call sync_pir_to_irr.
> + */
> + if (!is_guest_mode(vcpu) && vcpu->arch.apicv_active)
If the second check uses kvm_vcpu_apicv_active(), this will avoid a silent conflict
when kvm/master and kvm/queue are merged.
Nits aside, the logic is good:
Reviewed-by: Sean Christopherson <[email protected]>
> + vmx_set_rvi(max_irr);
> + else if (max_irr_updated)
> + kvm_make_request(KVM_REQ_EVENT, vcpu);
> +
> return max_irr;
> }
>
> --
> 2.27.0
>
>
On Mon, Nov 22, 2021 at 07:43:08PM -0500, Paolo Bonzini wrote:
> Synchronize the condition for the two calls to kvm_x86_sync_pir_to_irr.
> The one in the reenter-guest fast path invoked the callback
> unconditionally even if LAPIC is disabled.
>
> Cc: [email protected]
> Signed-off-by: Paolo Bonzini <[email protected]>
Reviewed-by: David Matlack <[email protected]>
> ---
> arch/x86/kvm/x86.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 5a403d92833f..441f4769173e 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -9849,7 +9849,7 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
> if (likely(exit_fastpath != EXIT_FASTPATH_REENTER_GUEST))
> break;
>
> - if (vcpu->arch.apicv_active)
> + if (kvm_lapic_enabled(vcpu) && vcpu->arch.apicv_active)
> static_call(kvm_x86_sync_pir_to_irr)(vcpu);
>
> if (unlikely(kvm_vcpu_exit_request(vcpu))) {
> --
> 2.27.0
>
>
On Mon, Nov 22, 2021 at 07:43:09PM -0500, Paolo Bonzini wrote:
> If APICv is disabled for this vCPU, assigned devices may
> still attempt to post interrupts. In that case, we need
> to cancel the vmentry and deliver the interrupt with
> KVM_REQ_EVENT. Extend the existing code that handles
> injection of L1 interrupts into L2 to cover this case
> as well.
>
> vmx_hwapic_irr_update is only called when APICv is active
> so it would be confusing to add a check for
> vcpu->arch.apicv_active in there. Instead, just use
> vmx_set_rvi directly in vmx_sync_pir_to_irr.
>
> Cc: [email protected]
> Signed-off-by: Paolo Bonzini <[email protected]>
Reviewed-by: David Matlack <[email protected]>
(Although I agree with Sean's suggestions and 1 nit below.)
> ---
> arch/x86/kvm/vmx/vmx.c | 35 +++++++++++++++++++++++------------
> 1 file changed, 23 insertions(+), 12 deletions(-)
>
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index ba66c171d951..cccf1eab58ac 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -6264,7 +6264,7 @@ static int vmx_sync_pir_to_irr(struct kvm_vcpu *vcpu)
> int max_irr;
> bool max_irr_updated;
>
> - if (KVM_BUG_ON(!vcpu->arch.apicv_active, vcpu->kvm))
> + if (KVM_BUG_ON(!enable_apicv, vcpu->kvm))
> return -EIO;
>
> if (pi_test_on(&vmx->pi_desc)) {
> @@ -6276,20 +6276,31 @@ static int vmx_sync_pir_to_irr(struct kvm_vcpu *vcpu)
> smp_mb__after_atomic();
> max_irr_updated =
> kvm_apic_update_irr(vcpu, vmx->pi_desc.pir, &max_irr);
> -
> - /*
> - * If we are running L2 and L1 has a new pending interrupt
> - * which can be injected, this may cause a vmexit or it may
> - * be injected into L2. Either way, this interrupt will be
> - * processed via KVM_REQ_EVENT, not RVI, because we do not use
> - * virtual interrupt delivery to inject L1 interrupts into L2.
> - */
> - if (is_guest_mode(vcpu) && max_irr_updated)
> - kvm_make_request(KVM_REQ_EVENT, vcpu);
> } else {
> max_irr = kvm_lapic_find_highest_irr(vcpu);
> + max_irr_updated = false;
> }
> - vmx_hwapic_irr_update(vcpu, max_irr);
> +
> + /*
> + * If virtual interrupt delivery is not in use, the interrupt
> + * will be processed via KVM_REQ_EVENT, not RVI. This can happen
> + * in two cases:
> + *
> + * 1) If we are running L2 and L1 has a new pending interrupt
> + * which can be injected, this may cause a vmexit or it may
> + * be injected into L2. We do not use virtual interrupt
> + * delivery to inject L1 interrupts into L2.
> + *
> + * 2) If APICv is disabled for this vCPU, assigned devices may
> + * still attempt to post interrupts. The posted interrupt
> + * vector will cause a vmexit and the subsequent entry will
> + * call sync_pir_to_irr.
> + */
> + if (!is_guest_mode(vcpu) && vcpu->arch.apicv_active)
> + vmx_set_rvi(max_irr);
> + else if (max_irr_updated)
> + kvm_make_request(KVM_REQ_EVENT, vcpu);
nit: In the version of this code that is currently in kvm/queue the
indentation of the previous 3 lines uses spaces instead of tabs. I see
tabs in this mail though.
> +
> return max_irr;
> }
>
> --
> 2.27.0
>
>
On Mon, Nov 22, 2021 at 07:43:10PM -0500, Paolo Bonzini wrote:
> The IRTE for an assigned device can trigger a POSTED_INTR_VECTOR even
> if APICv is disabled on the vCPU that receives it. In that case, the
> interrupt will just cause a vmexit and leave the ON bit set together
> with the PIR bit corresponding to the interrupt.
>
> Right now, the interrupt would not be delivered until APICv is re-enabled.
> However, fixing this is just a matter of always doing the PIR->IRR
> synchronization, even if the vCPU has temporarily disabled APICv.
>
> This is not a problem for performance, or if anything it is an
> improvement. First, in the common case where vcpu->arch.apicv_active is
> true, one fewer check has to be performed. Second, static_call_cond will
> elide the function call if APICv is not present or disabled. Finally,
> in the case for AMD hardware we can remove the sync_pir_to_irr callback:
> it is only needed for apic_has_interrupt_for_ppr, and that function
> already has a fallback for !APICv.
>
> Cc: [email protected]
> Co-developed-by: Sean Christopherson <[email protected]>
> Signed-off-by: Sean Christopherson <[email protected]>
> Signed-off-by: Paolo Bonzini <[email protected]>
Reviewed-by: David Matlack <[email protected]>
> ---
> arch/x86/kvm/lapic.c | 2 +-
> arch/x86/kvm/svm/svm.c | 1 -
> arch/x86/kvm/x86.c | 18 +++++++++---------
> 3 files changed, 10 insertions(+), 11 deletions(-)
>
> diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
> index 759952dd1222..f206fc35deff 100644
> --- a/arch/x86/kvm/lapic.c
> +++ b/arch/x86/kvm/lapic.c
> @@ -707,7 +707,7 @@ static void pv_eoi_clr_pending(struct kvm_vcpu *vcpu)
> static int apic_has_interrupt_for_ppr(struct kvm_lapic *apic, u32 ppr)
> {
> int highest_irr;
> - if (apic->vcpu->arch.apicv_active)
> + if (kvm_x86_ops.sync_pir_to_irr)
> highest_irr = static_call(kvm_x86_sync_pir_to_irr)(apic->vcpu);
> else
> highest_irr = apic_find_highest_irr(apic);
> diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
> index 5630c241d5f6..d0f68d11ec70 100644
> --- a/arch/x86/kvm/svm/svm.c
> +++ b/arch/x86/kvm/svm/svm.c
> @@ -4651,7 +4651,6 @@ static struct kvm_x86_ops svm_x86_ops __initdata = {
> .load_eoi_exitmap = svm_load_eoi_exitmap,
> .hwapic_irr_update = svm_hwapic_irr_update,
> .hwapic_isr_update = svm_hwapic_isr_update,
> - .sync_pir_to_irr = kvm_lapic_find_highest_irr,
> .apicv_post_state_restore = avic_post_state_restore,
>
> .set_tss_addr = svm_set_tss_addr,
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 441f4769173e..a8f12c83db4b 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -4448,8 +4448,7 @@ void kvm_arch_vcpu_put(struct kvm_vcpu *vcpu)
> static int kvm_vcpu_ioctl_get_lapic(struct kvm_vcpu *vcpu,
> struct kvm_lapic_state *s)
> {
> - if (vcpu->arch.apicv_active)
> - static_call(kvm_x86_sync_pir_to_irr)(vcpu);
> + static_call_cond(kvm_x86_sync_pir_to_irr)(vcpu);
>
> return kvm_apic_get_state(vcpu, s);
> }
> @@ -9528,8 +9527,7 @@ static void vcpu_scan_ioapic(struct kvm_vcpu *vcpu)
> if (irqchip_split(vcpu->kvm))
> kvm_scan_ioapic_routes(vcpu, vcpu->arch.ioapic_handled_vectors);
> else {
> - if (vcpu->arch.apicv_active)
> - static_call(kvm_x86_sync_pir_to_irr)(vcpu);
> + static_call_cond(kvm_x86_sync_pir_to_irr)(vcpu);
> if (ioapic_in_kernel(vcpu->kvm))
> kvm_ioapic_scan_entry(vcpu, vcpu->arch.ioapic_handled_vectors);
> }
> @@ -9802,10 +9800,12 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
>
> /*
> * This handles the case where a posted interrupt was
> - * notified with kvm_vcpu_kick.
> + * notified with kvm_vcpu_kick. Assigned devices can
> + * use the POSTED_INTR_VECTOR even if APICv is disabled,
> + * so do it even if APICv is disabled on this vCPU.
> */
> - if (kvm_lapic_enabled(vcpu) && vcpu->arch.apicv_active)
> - static_call(kvm_x86_sync_pir_to_irr)(vcpu);
> + if (kvm_lapic_enabled(vcpu))
> + static_call_cond(kvm_x86_sync_pir_to_irr)(vcpu);
>
> if (kvm_vcpu_exit_request(vcpu)) {
> vcpu->mode = OUTSIDE_GUEST_MODE;
> @@ -9849,8 +9849,8 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
> if (likely(exit_fastpath != EXIT_FASTPATH_REENTER_GUEST))
> break;
>
> - if (kvm_lapic_enabled(vcpu) && vcpu->arch.apicv_active)
> - static_call(kvm_x86_sync_pir_to_irr)(vcpu);
> + if (kvm_lapic_enabled(vcpu))
> + static_call_cond(kvm_x86_sync_pir_to_irr)(vcpu);
>
> if (unlikely(kvm_vcpu_exit_request(vcpu))) {
> exit_fastpath = EXIT_FASTPATH_EXIT_HANDLED;
> --
> 2.27.0
>
>
On Mon, Nov 22, 2021 at 07:43:11PM -0500, Paolo Bonzini wrote:
> Currently, checks for whether VT-d PI can be used refer to the current
> status of the feature in the current vCPU; or they more or less pick
> vCPU 0 in case a specific vCPU is not available.
>
> However, these checks do not attempt to synchronize with changes to
> the IRTE. In particular, there is no path that updates the IRTE when
> APICv is re-activated on vCPU 0; and there is no path to wakeup a CPU
> that has APICv disabled, if the wakeup occurs because of an IRTE
> that points to a posted interrupt.
>
> To fix this, always go through the VT-d PI path as long as there are
> assigned devices and APICv is available on both the host and the VM side.
> Since the relevant condition was copied over three times, take the hint
> and factor it into a separate function.
>
> Suggested-by: Sean Christopherson <[email protected]>
> Cc: [email protected]
> Reviewed-by: Sean Christopherson <[email protected]>
> Signed-off-by: Paolo Bonzini <[email protected]>
Reviewed-by: David Matlack <[email protected]>
> ---
> arch/x86/kvm/vmx/posted_intr.c | 20 +++++++++++---------
> 1 file changed, 11 insertions(+), 9 deletions(-)
>
> diff --git a/arch/x86/kvm/vmx/posted_intr.c b/arch/x86/kvm/vmx/posted_intr.c
> index 5f81ef092bd4..1c94783b5a54 100644
> --- a/arch/x86/kvm/vmx/posted_intr.c
> +++ b/arch/x86/kvm/vmx/posted_intr.c
> @@ -5,6 +5,7 @@
> #include <asm/cpu.h>
>
> #include "lapic.h"
> +#include "irq.h"
> #include "posted_intr.h"
> #include "trace.h"
> #include "vmx.h"
> @@ -77,13 +78,18 @@ void vmx_vcpu_pi_load(struct kvm_vcpu *vcpu, int cpu)
> pi_set_on(pi_desc);
> }
>
> +static bool vmx_can_use_vtd_pi(struct kvm *kvm)
> +{
> + return irqchip_in_kernel(kvm) && enable_apicv &&
> + kvm_arch_has_assigned_device(kvm) &&
> + irq_remapping_cap(IRQ_POSTING_CAP);
> +}
> +
> void vmx_vcpu_pi_put(struct kvm_vcpu *vcpu)
> {
> struct pi_desc *pi_desc = vcpu_to_pi_desc(vcpu);
>
> - if (!kvm_arch_has_assigned_device(vcpu->kvm) ||
> - !irq_remapping_cap(IRQ_POSTING_CAP) ||
> - !kvm_vcpu_apicv_active(vcpu))
> + if (!vmx_can_use_vtd_pi(vcpu->kvm))
> return;
>
> /* Set SN when the vCPU is preempted */
> @@ -141,9 +147,7 @@ int pi_pre_block(struct kvm_vcpu *vcpu)
> struct pi_desc old, new;
> struct pi_desc *pi_desc = vcpu_to_pi_desc(vcpu);
>
> - if (!kvm_arch_has_assigned_device(vcpu->kvm) ||
> - !irq_remapping_cap(IRQ_POSTING_CAP) ||
> - !kvm_vcpu_apicv_active(vcpu))
> + if (!vmx_can_use_vtd_pi(vcpu->kvm))
> return 0;
>
> WARN_ON(irqs_disabled());
> @@ -270,9 +274,7 @@ int pi_update_irte(struct kvm *kvm, unsigned int host_irq, uint32_t guest_irq,
> struct vcpu_data vcpu_info;
> int idx, ret = 0;
>
> - if (!kvm_arch_has_assigned_device(kvm) ||
> - !irq_remapping_cap(IRQ_POSTING_CAP) ||
> - !kvm_vcpu_apicv_active(kvm->vcpus[0]))
> + if (!vmx_can_use_vtd_pi(kvm))
> return 0;
>
> idx = srcu_read_lock(&kvm->irq_srcu);
> --
> 2.27.0
>
On Mon, 2021-11-22 at 19:43 -0500, Paolo Bonzini wrote:
> Synchronize the condition for the two calls to kvm_x86_sync_pir_to_irr.
> The one in the reenter-guest fast path invoked the callback
> unconditionally even if LAPIC is disabled.
>
> Cc: [email protected]
> Signed-off-by: Paolo Bonzini <[email protected]>
> ---
> arch/x86/kvm/x86.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 5a403d92833f..441f4769173e 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -9849,7 +9849,7 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
> if (likely(exit_fastpath != EXIT_FASTPATH_REENTER_GUEST))
> break;
>
> - if (vcpu->arch.apicv_active)
> + if (kvm_lapic_enabled(vcpu) && vcpu->arch.apicv_active)
> static_call(kvm_x86_sync_pir_to_irr)(vcpu);
>
> if (unlikely(kvm_vcpu_exit_request(vcpu))) {
Reviewed-by: Maxim Levitsky <[email protected]>
Best regards,
Maxim Levitsky
On Mon, 2021-11-22 at 19:43 -0500, Paolo Bonzini wrote:
> If APICv is disabled for this vCPU, assigned devices may
> still attempt to post interrupts. In that case, we need
> to cancel the vmentry and deliver the interrupt with
> KVM_REQ_EVENT. Extend the existing code that handles
> injection of L1 interrupts into L2 to cover this case
> as well.
>
> vmx_hwapic_irr_update is only called when APICv is active
> so it would be confusing to add a check for
> vcpu->arch.apicv_active in there. Instead, just use
> vmx_set_rvi directly in vmx_sync_pir_to_irr.
>
> Cc: [email protected]
> Signed-off-by: Paolo Bonzini <[email protected]>
> ---
> arch/x86/kvm/vmx/vmx.c | 35 +++++++++++++++++++++++------------
> 1 file changed, 23 insertions(+), 12 deletions(-)
>
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index ba66c171d951..cccf1eab58ac 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -6264,7 +6264,7 @@ static int vmx_sync_pir_to_irr(struct kvm_vcpu *vcpu)
> int max_irr;
> bool max_irr_updated;
>
> - if (KVM_BUG_ON(!vcpu->arch.apicv_active, vcpu->kvm))
> + if (KVM_BUG_ON(!enable_apicv, vcpu->kvm))
> return -EIO;
>
> if (pi_test_on(&vmx->pi_desc)) {
> @@ -6276,20 +6276,31 @@ static int vmx_sync_pir_to_irr(struct kvm_vcpu *vcpu)
> smp_mb__after_atomic();
> max_irr_updated =
> kvm_apic_update_irr(vcpu, vmx->pi_desc.pir, &max_irr);
> -
> - /*
> - * If we are running L2 and L1 has a new pending interrupt
> - * which can be injected, this may cause a vmexit or it may
> - * be injected into L2. Either way, this interrupt will be
> - * processed via KVM_REQ_EVENT, not RVI, because we do not use
> - * virtual interrupt delivery to inject L1 interrupts into L2.
> - */
> - if (is_guest_mode(vcpu) && max_irr_updated)
> - kvm_make_request(KVM_REQ_EVENT, vcpu);
> } else {
> max_irr = kvm_lapic_find_highest_irr(vcpu);
> + max_irr_updated = false;
> }
> - vmx_hwapic_irr_update(vcpu, max_irr);
> +
> + /*
> + * If virtual interrupt delivery is not in use, the interrupt
> + * will be processed via KVM_REQ_EVENT, not RVI. This can happen
> + * in two cases:
> + *
> + * 1) If we are running L2 and L1 has a new pending interrupt
> + * which can be injected, this may cause a vmexit or it may
> + * be injected into L2. We do not use virtual interrupt
> + * delivery to inject L1 interrupts into L2.
> + *
> + * 2) If APICv is disabled for this vCPU, assigned devices may
> + * still attempt to post interrupts. The posted interrupt
> + * vector will cause a vmexit and the subsequent entry will
> + * call sync_pir_to_irr.
> + */
> + if (!is_guest_mode(vcpu) && vcpu->arch.apicv_active)
> + vmx_set_rvi(max_irr);
> + else if (max_irr_updated)
> + kvm_make_request(KVM_REQ_EVENT, vcpu);
> +
> return max_irr;
> }
>
Reviewed-by: Maxim Levitsky <[email protected]>
Best regards,
Maxim Levitsky
On Mon, 2021-11-22 at 19:43 -0500, Paolo Bonzini wrote:
> The IRTE for an assigned device can trigger a POSTED_INTR_VECTOR even
> if APICv is disabled on the vCPU that receives it. In that case, the
> interrupt will just cause a vmexit and leave the ON bit set together
> with the PIR bit corresponding to the interrupt.
>
> Right now, the interrupt would not be delivered until APICv is re-enabled.
> However, fixing this is just a matter of always doing the PIR->IRR
> synchronization, even if the vCPU has temporarily disabled APICv.
>
> This is not a problem for performance, or if anything it is an
> improvement. First, in the common case where vcpu->arch.apicv_active is
> true, one fewer check has to be performed. Second, static_call_cond will
> elide the function call if APICv is not present or disabled. Finally,
> in the case for AMD hardware we can remove the sync_pir_to_irr callback:
> it is only needed for apic_has_interrupt_for_ppr, and that function
> already has a fallback for !APICv.
>
> Cc: [email protected]
> Co-developed-by: Sean Christopherson <[email protected]>
> Signed-off-by: Sean Christopherson <[email protected]>
> Signed-off-by: Paolo Bonzini <[email protected]>
> ---
> arch/x86/kvm/lapic.c | 2 +-
> arch/x86/kvm/svm/svm.c | 1 -
> arch/x86/kvm/x86.c | 18 +++++++++---------
> 3 files changed, 10 insertions(+), 11 deletions(-)
>
> diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
> index 759952dd1222..f206fc35deff 100644
> --- a/arch/x86/kvm/lapic.c
> +++ b/arch/x86/kvm/lapic.c
> @@ -707,7 +707,7 @@ static void pv_eoi_clr_pending(struct kvm_vcpu *vcpu)
> static int apic_has_interrupt_for_ppr(struct kvm_lapic *apic, u32 ppr)
> {
> int highest_irr;
> - if (apic->vcpu->arch.apicv_active)
> + if (kvm_x86_ops.sync_pir_to_irr)
> highest_irr = static_call(kvm_x86_sync_pir_to_irr)(apic->vcpu);
> else
> highest_irr = apic_find_highest_irr(apic);
> diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
> index 5630c241d5f6..d0f68d11ec70 100644
> --- a/arch/x86/kvm/svm/svm.c
> +++ b/arch/x86/kvm/svm/svm.c
> @@ -4651,7 +4651,6 @@ static struct kvm_x86_ops svm_x86_ops __initdata = {
> .load_eoi_exitmap = svm_load_eoi_exitmap,
> .hwapic_irr_update = svm_hwapic_irr_update,
> .hwapic_isr_update = svm_hwapic_isr_update,
> - .sync_pir_to_irr = kvm_lapic_find_highest_irr,
> .apicv_post_state_restore = avic_post_state_restore,
>
> .set_tss_addr = svm_set_tss_addr,
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 441f4769173e..a8f12c83db4b 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -4448,8 +4448,7 @@ void kvm_arch_vcpu_put(struct kvm_vcpu *vcpu)
> static int kvm_vcpu_ioctl_get_lapic(struct kvm_vcpu *vcpu,
> struct kvm_lapic_state *s)
> {
> - if (vcpu->arch.apicv_active)
> - static_call(kvm_x86_sync_pir_to_irr)(vcpu);
> + static_call_cond(kvm_x86_sync_pir_to_irr)(vcpu);
>
> return kvm_apic_get_state(vcpu, s);
> }
> @@ -9528,8 +9527,7 @@ static void vcpu_scan_ioapic(struct kvm_vcpu *vcpu)
> if (irqchip_split(vcpu->kvm))
> kvm_scan_ioapic_routes(vcpu, vcpu->arch.ioapic_handled_vectors);
> else {
> - if (vcpu->arch.apicv_active)
> - static_call(kvm_x86_sync_pir_to_irr)(vcpu);
> + static_call_cond(kvm_x86_sync_pir_to_irr)(vcpu);
> if (ioapic_in_kernel(vcpu->kvm))
> kvm_ioapic_scan_entry(vcpu, vcpu->arch.ioapic_handled_vectors);
> }
> @@ -9802,10 +9800,12 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
>
> /*
> * This handles the case where a posted interrupt was
> - * notified with kvm_vcpu_kick.
> + * notified with kvm_vcpu_kick. Assigned devices can
> + * use the POSTED_INTR_VECTOR even if APICv is disabled,
> + * so do it even if APICv is disabled on this vCPU.
> */
> - if (kvm_lapic_enabled(vcpu) && vcpu->arch.apicv_active)
> - static_call(kvm_x86_sync_pir_to_irr)(vcpu);
> + if (kvm_lapic_enabled(vcpu))
> + static_call_cond(kvm_x86_sync_pir_to_irr)(vcpu);
>
> if (kvm_vcpu_exit_request(vcpu)) {
> vcpu->mode = OUTSIDE_GUEST_MODE;
> @@ -9849,8 +9849,8 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
> if (likely(exit_fastpath != EXIT_FASTPATH_REENTER_GUEST))
> break;
>
> - if (kvm_lapic_enabled(vcpu) && vcpu->arch.apicv_active)
> - static_call(kvm_x86_sync_pir_to_irr)(vcpu);
> + if (kvm_lapic_enabled(vcpu))
> + static_call_cond(kvm_x86_sync_pir_to_irr)(vcpu);
>
> if (unlikely(kvm_vcpu_exit_request(vcpu))) {
> exit_fastpath = EXIT_FASTPATH_EXIT_HANDLED;
Reviewed-by: Maxim Levitsky <[email protected]>
Best regards,
Maxim Levitsky
On Mon, 2021-11-22 at 19:43 -0500, Paolo Bonzini wrote:
> Currently, checks for whether VT-d PI can be used refer to the current
> status of the feature in the current vCPU; or they more or less pick
> vCPU 0 in case a specific vCPU is not available.
>
> However, these checks do not attempt to synchronize with changes to
> the IRTE. In particular, there is no path that updates the IRTE when
> APICv is re-activated on vCPU 0; and there is no path to wakeup a CPU
> that has APICv disabled, if the wakeup occurs because of an IRTE
> that points to a posted interrupt.
>
> To fix this, always go through the VT-d PI path as long as there are
> assigned devices and APICv is available on both the host and the VM side.
> Since the relevant condition was copied over three times, take the hint
> and factor it into a separate function.
>
> Suggested-by: Sean Christopherson <[email protected]>
> Cc: [email protected]
> Reviewed-by: Sean Christopherson <[email protected]>
> Signed-off-by: Paolo Bonzini <[email protected]>
> ---
> arch/x86/kvm/vmx/posted_intr.c | 20 +++++++++++---------
> 1 file changed, 11 insertions(+), 9 deletions(-)
>
> diff --git a/arch/x86/kvm/vmx/posted_intr.c b/arch/x86/kvm/vmx/posted_intr.c
> index 5f81ef092bd4..1c94783b5a54 100644
> --- a/arch/x86/kvm/vmx/posted_intr.c
> +++ b/arch/x86/kvm/vmx/posted_intr.c
> @@ -5,6 +5,7 @@
> #include <asm/cpu.h>
>
> #include "lapic.h"
> +#include "irq.h"
> #include "posted_intr.h"
> #include "trace.h"
> #include "vmx.h"
> @@ -77,13 +78,18 @@ void vmx_vcpu_pi_load(struct kvm_vcpu *vcpu, int cpu)
> pi_set_on(pi_desc);
> }
>
> +static bool vmx_can_use_vtd_pi(struct kvm *kvm)
> +{
> + return irqchip_in_kernel(kvm) && enable_apicv &&
> + kvm_arch_has_assigned_device(kvm) &&
> + irq_remapping_cap(IRQ_POSTING_CAP);
> +}
> +
> void vmx_vcpu_pi_put(struct kvm_vcpu *vcpu)
> {
> struct pi_desc *pi_desc = vcpu_to_pi_desc(vcpu);
>
> - if (!kvm_arch_has_assigned_device(vcpu->kvm) ||
> - !irq_remapping_cap(IRQ_POSTING_CAP) ||
> - !kvm_vcpu_apicv_active(vcpu))
> + if (!vmx_can_use_vtd_pi(vcpu->kvm))
> return;
>
> /* Set SN when the vCPU is preempted */
> @@ -141,9 +147,7 @@ int pi_pre_block(struct kvm_vcpu *vcpu)
> struct pi_desc old, new;
> struct pi_desc *pi_desc = vcpu_to_pi_desc(vcpu);
>
> - if (!kvm_arch_has_assigned_device(vcpu->kvm) ||
> - !irq_remapping_cap(IRQ_POSTING_CAP) ||
> - !kvm_vcpu_apicv_active(vcpu))
> + if (!vmx_can_use_vtd_pi(vcpu->kvm))
> return 0;
>
> WARN_ON(irqs_disabled());
> @@ -270,9 +274,7 @@ int pi_update_irte(struct kvm *kvm, unsigned int host_irq, uint32_t guest_irq,
> struct vcpu_data vcpu_info;
> int idx, ret = 0;
>
> - if (!kvm_arch_has_assigned_device(kvm) ||
> - !irq_remapping_cap(IRQ_POSTING_CAP) ||
> - !kvm_vcpu_apicv_active(kvm->vcpus[0]))
> + if (!vmx_can_use_vtd_pi(kvm))
> return 0;
>
> idx = srcu_read_lock(&kvm->irq_srcu);
Reviewed-by: Maxim Levitsky <[email protected]>
Best regards,
Maxim Levitsky
On 11/29/21 23:14, Sean Christopherson wrote:
> Heh, maybe s/max_irr_updated/new_pir_found or so? This is a bit weird:
>
> 1. Update max_irr
> 2. max_irr_updated = false
Sounds good (I went for got_posted_interrupt).
>> }
>> - vmx_hwapic_irr_update(vcpu, max_irr);
>> +
>> + /*
>> + * If virtual interrupt delivery is not in use, the interrupt
>> + * will be processed via KVM_REQ_EVENT, not RVI. This can happen
>
> I'd strongly prefer to phrase this as a command, e.g. "process the interrupt via
> KVM_REQ_EVENT". "will be processed" makes it sound like some other flow is
> handling the event, which confused me.
What I wanted to convey is that the interrupt is not processed yet, and
the vmentry might have to be canceled. I changed it to
 * Newly recognized interrupts are injected via either virtual interrupt
 * delivery (RVI) or KVM_REQ_EVENT. Virtual interrupt delivery is
 * disabled in two cases:
> * 1) If L2 is running and the vCPU has a new pending interrupt. If L1
> * wants to exit on interrupts, KVM_REQ_EVENT is needed to synthesize a
> * VM-Exit to L1. If L1 doesn't want to exit, the interrupt is injected
> * into L2, but KVM doesn't use virtual interrupt delivery to inject
> * interrupts into L2, and so KVM_REQ_EVENT is again needed.
> *
> * 2) If APICv is disabled for this vCPU, assigned devices may still
> * attempt to post interrupts. The posted interrupt vector will cause
> * a VM-Exit and the subsequent entry will call sync_pir_to_irr.
> */
Applied these.
Paolo