2019-11-06 18:00:33

by Joao Martins

[permalink] [raw]
Subject: [PATCH v1 0/3] KVM: VMX: Posted Interrupts fixes

Hey,

This mini-series addresses two bugs plus a small cleanup:

1) Considering PID.PIR as opposed to just PID.ON for pending interrupt checks
in the direct yield code path;

2) Not changing NDST in vmx_vcpu_pi_load(), which addresses a regression.
Prior to this, we would observe Win2K12 guests hanging momentarily.

3) Small cleanup to streamline PIR checks with a common helper.

The cleanup is done last to simplify backports.

Joao

Joao Martins (3):
KVM: VMX: Consider PID.PIR to determine if vCPU has pending interrupts
KVM: VMX: Do not change PID.NDST when loading a blocked vCPU
KVM: VMX: Introduce pi_is_pir_empty() helper

arch/x86/kvm/vmx/vmx.c | 20 ++++++++++++++++++--
arch/x86/kvm/vmx/vmx.h | 11 +++++++++++
2 files changed, 29 insertions(+), 2 deletions(-)

--
2.11.0


2019-11-06 18:01:16

by Joao Martins

[permalink] [raw]
Subject: [PATCH v1 3/3] KVM: VMX: Introduce pi_is_pir_empty() helper

Streamline the PID.PIR check and change its call sites to use
the newly added helper.

Suggested-by: Liran Alon <[email protected]>
Signed-off-by: Joao Martins <[email protected]>
---
arch/x86/kvm/vmx/vmx.c | 5 ++---
arch/x86/kvm/vmx/vmx.h | 5 +++++
2 files changed, 7 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 75d903455e1c..8e8dbb174d14 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -1311,7 +1311,7 @@ static void vmx_vcpu_pi_load(struct kvm_vcpu *vcpu, int cpu)
*/
smp_mb__after_atomic();

- if (!bitmap_empty((unsigned long *)pi_desc->pir, NR_VECTORS))
+ if (!pi_is_pir_empty(pi_desc))
pi_set_on(pi_desc);
}

@@ -6157,8 +6157,7 @@ static bool vmx_dy_apicv_has_pending_interrupt(struct kvm_vcpu *vcpu)
{
struct pi_desc *pi_desc = vcpu_to_pi_desc(vcpu);

- return pi_test_on(pi_desc) ||
- !bitmap_empty((unsigned long *)pi_desc->pir, NR_VECTORS);
+ return pi_test_on(pi_desc) || !pi_is_pir_empty(pi_desc);
}

static void vmx_load_eoi_exitmap(struct kvm_vcpu *vcpu, u64 *eoi_exit_bitmap)
diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h
index 1e32ab54fc2d..5a0f34b1e226 100644
--- a/arch/x86/kvm/vmx/vmx.h
+++ b/arch/x86/kvm/vmx/vmx.h
@@ -355,6 +355,11 @@ static inline int pi_test_and_set_pir(int vector, struct pi_desc *pi_desc)
return test_and_set_bit(vector, (unsigned long *)pi_desc->pir);
}

+static inline bool pi_is_pir_empty(struct pi_desc *pi_desc)
+{
+ return bitmap_empty((unsigned long *)pi_desc->pir, NR_VECTORS);
+}
+
static inline void pi_set_sn(struct pi_desc *pi_desc)
{
set_bit(POSTED_INTR_SN,
--
2.11.0

2019-11-06 18:01:26

by Joao Martins

[permalink] [raw]
Subject: [PATCH v1 1/3] KVM: VMX: Consider PID.PIR to determine if vCPU has pending interrupts

Commit 17e433b54393 ("KVM: Fix leak vCPU's VMCS value into other pCPU")
introduced vmx_dy_apicv_has_pending_interrupt() in order to determine
if a vCPU have a pending posted interrupt. This routine is used by
kvm_vcpu_on_spin() when searching for a a new runnable vCPU to schedule
on pCPU instead of a vCPU doing busy loop.

vmx_dy_apicv_has_pending_interrupt() determines if a
vCPU has a pending posted interrupt solely based on PID.ON. However,
when a vCPU is preempted, vmx_vcpu_pi_put() sets PID.SN which cause
raised posted interrupts to only set bit in PID.PIR without setting
PID.ON (and without sending notification vector), as depicted in VT-d
manual section 5.2.3 "Interrupt-Posting Hardware Operation".

Therefore, checking PID.ON is insufficient to determine if a vCPU has
pending posted interrupts and instead we should also check if there is
some bit set on PID.PIR.

Fixes: 17e433b54393 ("KVM: Fix leak vCPU's VMCS value into other pCPU")
Signed-off-by: Joao Martins <[email protected]>
Signed-off-by: Liran Alon <[email protected]>
---
arch/x86/kvm/vmx/vmx.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 31ce6bc2c371..18b0bee662a5 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -6141,7 +6141,10 @@ static int vmx_sync_pir_to_irr(struct kvm_vcpu *vcpu)

static bool vmx_dy_apicv_has_pending_interrupt(struct kvm_vcpu *vcpu)
{
- return pi_test_on(vcpu_to_pi_desc(vcpu));
+ struct pi_desc *pi_desc = vcpu_to_pi_desc(vcpu);
+
+ return pi_test_on(pi_desc) ||
+ !bitmap_empty((unsigned long *)pi_desc->pir, NR_VECTORS);
}

static void vmx_load_eoi_exitmap(struct kvm_vcpu *vcpu, u64 *eoi_exit_bitmap)
--
2.11.0

2019-11-06 18:03:07

by Joao Martins

[permalink] [raw]
Subject: [PATCH v1 2/3] KVM: VMX: Do not change PID.NDST when loading a blocked vCPU

When vCPU enters block phase, pi_pre_block() inserts vCPU to a per pCPU
linked list of all vCPUs that are blocked on this pCPU. Afterwards, it
changes PID.NV to POSTED_INTR_WAKEUP_VECTOR which its handler
(wakeup_handler()) is responsible to kick (unblock) any vCPU on that
linked list that now has pending posted interrupts.

While vCPU is blocked (in kvm_vcpu_block()), it may be preempted which
will cause vmx_vcpu_pi_put() to set PID.SN. If later the vCPU will be
scheduled to run on a different pCPU, vmx_vcpu_pi_load() will clear
PID.SN but will also *overwrite PID.NDST to this different pCPU*.
Instead of keeping it with original pCPU which vCPU had entered block
phase on.

This results in an issue because when a posted interrupt is delivered,
the wakeup_handler() will be executed and fail to find blocked vCPU on
its per pCPU linked list of all vCPUs that are blocked on this pCPU.
Which is due to the vCPU being placed on a *different* per pCPU
linked list than the original pCPU that it had entered block phase.

The regression is introduced by commit c112b5f50232 ("KVM: x86:
Recompute PID.ON when clearing PID.SN"). Therefore, partially revert
it and reintroduce the condition in vmx_vcpu_pi_load() responsible for
avoiding changing PID.NDST when loading a blocked vCPU.

Fixes: c112b5f50232 ("KVM: x86: Recompute PID.ON when clearing PID.SN")
Signed-off-by: Joao Martins <[email protected]>
Signed-off-by: Liran Alon <[email protected]>
---
arch/x86/kvm/vmx/vmx.c | 14 ++++++++++++++
arch/x86/kvm/vmx/vmx.h | 6 ++++++
2 files changed, 20 insertions(+)

diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 18b0bee662a5..75d903455e1c 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -1274,6 +1274,18 @@ static void vmx_vcpu_pi_load(struct kvm_vcpu *vcpu, int cpu)
if (!pi_test_sn(pi_desc) && vcpu->cpu == cpu)
return;

+ /*
+ * If the 'nv' field is POSTED_INTR_WAKEUP_VECTOR, do not change
+ * PI.NDST: pi_post_block is the one expected to change PID.NDST and the
+ * wakeup handler expects the vCPU to be on the blocked_vcpu_list that
+ * matches PI.NDST. Otherwise, a vcpu may not be able to be woken up
+ * correctly.
+ */
+ if (pi_desc->nv == POSTED_INTR_WAKEUP_VECTOR || vcpu->cpu == cpu) {
+ pi_clear_sn(pi_desc);
+ goto after_clear_sn;
+ }
+
/* The full case. */
do {
old.control = new.control = pi_desc->control;
@@ -1289,6 +1301,8 @@ static void vmx_vcpu_pi_load(struct kvm_vcpu *vcpu, int cpu)
} while (cmpxchg64(&pi_desc->control, old.control,
new.control) != old.control);

+after_clear_sn:
+
/*
* Clear SN before reading the bitmap. The VT-d firmware
* writes the bitmap and reads SN atomically (5.2.3 in the
diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h
index bee16687dc0b..1e32ab54fc2d 100644
--- a/arch/x86/kvm/vmx/vmx.h
+++ b/arch/x86/kvm/vmx/vmx.h
@@ -373,6 +373,12 @@ static inline void pi_clear_on(struct pi_desc *pi_desc)
(unsigned long *)&pi_desc->control);
}

+static inline void pi_clear_sn(struct pi_desc *pi_desc)
+{
+ clear_bit(POSTED_INTR_SN,
+ (unsigned long *)&pi_desc->control);
+}
+
static inline int pi_test_on(struct pi_desc *pi_desc)
{
return test_bit(POSTED_INTR_ON,
--
2.11.0

2019-11-06 21:57:36

by Joao Martins

[permalink] [raw]
Subject: Re: [PATCH v1 2/3] KVM: VMX: Do not change PID.NDST when loading a blocked vCPU

On 11/6/19 5:56 PM, Joao Martins wrote:
> When vCPU enters block phase, pi_pre_block() inserts vCPU to a per pCPU
> linked list of all vCPUs that are blocked on this pCPU. Afterwards, it
> changes PID.NV to POSTED_INTR_WAKEUP_VECTOR which its handler
> (wakeup_handler()) is responsible to kick (unblock) any vCPU on that
> linked list that now has pending posted interrupts.
>
> While vCPU is blocked (in kvm_vcpu_block()), it may be preempted which
> will cause vmx_vcpu_pi_put() to set PID.SN. If later the vCPU will be
> scheduled to run on a different pCPU, vmx_vcpu_pi_load() will clear
> PID.SN but will also *overwrite PID.NDST to this different pCPU*.
> Instead of keeping it with original pCPU which vCPU had entered block
> phase on.
>
> This results in an issue because when a posted interrupt is delivered,
> the wakeup_handler() will be executed and fail to find blocked vCPU on
> its per pCPU linked list of all vCPUs that are blocked on this pCPU.
> Which is due to the vCPU being placed on a *different* per pCPU
> linked list than the original pCPU that it had entered block phase.
>
> The regression is introduced by commit c112b5f50232 ("KVM: x86:
> Recompute PID.ON when clearing PID.SN"). Therefore, partially revert
> it and reintroduce the condition in vmx_vcpu_pi_load() responsible for
> avoiding changing PID.NDST when loading a blocked vCPU.
>
> Fixes: c112b5f50232 ("KVM: x86: Recompute PID.ON when clearing PID.SN")
> Signed-off-by: Joao Martins <[email protected]>
> Signed-off-by: Liran Alon <[email protected]>

Sorry, I missed a Tb tag:

Tested-by: Nathan Ni <[email protected]>

I can resend it, if preferred.

> ---
> arch/x86/kvm/vmx/vmx.c | 14 ++++++++++++++
> arch/x86/kvm/vmx/vmx.h | 6 ++++++
> 2 files changed, 20 insertions(+)
>
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index 18b0bee662a5..75d903455e1c 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -1274,6 +1274,18 @@ static void vmx_vcpu_pi_load(struct kvm_vcpu *vcpu, int cpu)
> if (!pi_test_sn(pi_desc) && vcpu->cpu == cpu)
> return;
>
> + /*
> + * If the 'nv' field is POSTED_INTR_WAKEUP_VECTOR, do not change
> + * PI.NDST: pi_post_block is the one expected to change PID.NDST and the
> + * wakeup handler expects the vCPU to be on the blocked_vcpu_list that
> + * matches PI.NDST. Otherwise, a vcpu may not be able to be woken up
> + * correctly.
> + */
> + if (pi_desc->nv == POSTED_INTR_WAKEUP_VECTOR || vcpu->cpu == cpu) {
> + pi_clear_sn(pi_desc);
> + goto after_clear_sn;
> + }
> +
> /* The full case. */
> do {
> old.control = new.control = pi_desc->control;
> @@ -1289,6 +1301,8 @@ static void vmx_vcpu_pi_load(struct kvm_vcpu *vcpu, int cpu)
> } while (cmpxchg64(&pi_desc->control, old.control,
> new.control) != old.control);
>
> +after_clear_sn:
> +
> /*
> * Clear SN before reading the bitmap. The VT-d firmware
> * writes the bitmap and reads SN atomically (5.2.3 in the
> diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h
> index bee16687dc0b..1e32ab54fc2d 100644
> --- a/arch/x86/kvm/vmx/vmx.h
> +++ b/arch/x86/kvm/vmx/vmx.h
> @@ -373,6 +373,12 @@ static inline void pi_clear_on(struct pi_desc *pi_desc)
> (unsigned long *)&pi_desc->control);
> }
>
> +static inline void pi_clear_sn(struct pi_desc *pi_desc)
> +{
> + clear_bit(POSTED_INTR_SN,
> + (unsigned long *)&pi_desc->control);
> +}
> +
> static inline int pi_test_on(struct pi_desc *pi_desc)
> {
> return test_bit(POSTED_INTR_ON,
>

2019-11-07 16:05:25

by Jag Raman

[permalink] [raw]
Subject: Re: [PATCH v1 1/3] KVM: VMX: Consider PID.PIR to determine if vCPU has pending interrupts



On 11/6/2019 12:56 PM, Joao Martins wrote:
> Commit 17e433b54393 ("KVM: Fix leak vCPU's VMCS value into other pCPU")
> introduced vmx_dy_apicv_has_pending_interrupt() in order to determine
> if a vCPU have a pending posted interrupt. This routine is used by
> kvm_vcpu_on_spin() when searching for a a new runnable vCPU to schedule
> on pCPU instead of a vCPU doing busy loop.
>
> vmx_dy_apicv_has_pending_interrupt() determines if a
> vCPU has a pending posted interrupt solely based on PID.ON. However,
> when a vCPU is preempted, vmx_vcpu_pi_put() sets PID.SN which cause
> raised posted interrupts to only set bit in PID.PIR without setting
> PID.ON (and without sending notification vector), as depicted in VT-d
> manual section 5.2.3 "Interrupt-Posting Hardware Operation".
>
> Therefore, checking PID.ON is insufficient to determine if a vCPU has
> pending posted interrupts and instead we should also check if there is
> some bit set on PID.PIR.
>
> Fixes: 17e433b54393 ("KVM: Fix leak vCPU's VMCS value into other pCPU")
> Signed-off-by: Joao Martins <[email protected]>
> Signed-off-by: Liran Alon <[email protected]>

Reviewed-by: Jagannathan Raman <[email protected]>

> ---
> arch/x86/kvm/vmx/vmx.c | 5 ++++-
> 1 file changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index 31ce6bc2c371..18b0bee662a5 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -6141,7 +6141,10 @@ static int vmx_sync_pir_to_irr(struct kvm_vcpu *vcpu)
>
> static bool vmx_dy_apicv_has_pending_interrupt(struct kvm_vcpu *vcpu)
> {
> - return pi_test_on(vcpu_to_pi_desc(vcpu));
> + struct pi_desc *pi_desc = vcpu_to_pi_desc(vcpu);
> +
> + return pi_test_on(pi_desc) ||
> + !bitmap_empty((unsigned long *)pi_desc->pir, NR_VECTORS);
> }
>
> static void vmx_load_eoi_exitmap(struct kvm_vcpu *vcpu, u64 *eoi_exit_bitmap)
>

2019-11-11 14:42:56

by Paolo Bonzini

[permalink] [raw]
Subject: Re: [PATCH v1 2/3] KVM: VMX: Do not change PID.NDST when loading a blocked vCPU

On 06/11/19 18:56, Joao Martins wrote:
> When vCPU enters block phase, pi_pre_block() inserts vCPU to a per pCPU
> linked list of all vCPUs that are blocked on this pCPU. Afterwards, it
> changes PID.NV to POSTED_INTR_WAKEUP_VECTOR which its handler
> (wakeup_handler()) is responsible to kick (unblock) any vCPU on that
> linked list that now has pending posted interrupts.
>
> While vCPU is blocked (in kvm_vcpu_block()), it may be preempted which
> will cause vmx_vcpu_pi_put() to set PID.SN. If later the vCPU will be
> scheduled to run on a different pCPU, vmx_vcpu_pi_load() will clear
> PID.SN but will also *overwrite PID.NDST to this different pCPU*.
> Instead of keeping it with original pCPU which vCPU had entered block
> phase on.
>
> This results in an issue because when a posted interrupt is delivered,
> the wakeup_handler() will be executed and fail to find blocked vCPU on
> its per pCPU linked list of all vCPUs that are blocked on this pCPU.
> Which is due to the vCPU being placed on a *different* per pCPU
> linked list than the original pCPU that it had entered block phase.
>
> The regression is introduced by commit c112b5f50232 ("KVM: x86:
> Recompute PID.ON when clearing PID.SN"). Therefore, partially revert
> it and reintroduce the condition in vmx_vcpu_pi_load() responsible for
> avoiding changing PID.NDST when loading a blocked vCPU.
>
> Fixes: c112b5f50232 ("KVM: x86: Recompute PID.ON when clearing PID.SN")
> Signed-off-by: Joao Martins <[email protected]>
> Signed-off-by: Liran Alon <[email protected]>

Something wrong in the SoB line?

Otherwise looks good.

Paolo

> ---
> arch/x86/kvm/vmx/vmx.c | 14 ++++++++++++++
> arch/x86/kvm/vmx/vmx.h | 6 ++++++
> 2 files changed, 20 insertions(+)
>
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index 18b0bee662a5..75d903455e1c 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -1274,6 +1274,18 @@ static void vmx_vcpu_pi_load(struct kvm_vcpu *vcpu, int cpu)
> if (!pi_test_sn(pi_desc) && vcpu->cpu == cpu)
> return;
>
> + /*
> + * If the 'nv' field is POSTED_INTR_WAKEUP_VECTOR, do not change
> + * PI.NDST: pi_post_block is the one expected to change PID.NDST and the
> + * wakeup handler expects the vCPU to be on the blocked_vcpu_list that
> + * matches PI.NDST. Otherwise, a vcpu may not be able to be woken up
> + * correctly.
> + */
> + if (pi_desc->nv == POSTED_INTR_WAKEUP_VECTOR || vcpu->cpu == cpu) {
> + pi_clear_sn(pi_desc);
> + goto after_clear_sn;
> + }
> +
> /* The full case. */
> do {
> old.control = new.control = pi_desc->control;
> @@ -1289,6 +1301,8 @@ static void vmx_vcpu_pi_load(struct kvm_vcpu *vcpu, int cpu)
> } while (cmpxchg64(&pi_desc->control, old.control,
> new.control) != old.control);
>
> +after_clear_sn:
> +
> /*
> * Clear SN before reading the bitmap. The VT-d firmware
> * writes the bitmap and reads SN atomically (5.2.3 in the
> diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h
> index bee16687dc0b..1e32ab54fc2d 100644
> --- a/arch/x86/kvm/vmx/vmx.h
> +++ b/arch/x86/kvm/vmx/vmx.h
> @@ -373,6 +373,12 @@ static inline void pi_clear_on(struct pi_desc *pi_desc)
> (unsigned long *)&pi_desc->control);
> }
>
> +static inline void pi_clear_sn(struct pi_desc *pi_desc)
> +{
> + clear_bit(POSTED_INTR_SN,
> + (unsigned long *)&pi_desc->control);
> +}
> +
> static inline int pi_test_on(struct pi_desc *pi_desc)
> {
> return test_bit(POSTED_INTR_ON,
>

2019-11-11 14:49:10

by Paolo Bonzini

[permalink] [raw]
Subject: Re: [PATCH v1 1/3] KVM: VMX: Consider PID.PIR to determine if vCPU has pending interrupts

On 06/11/19 18:56, Joao Martins wrote:
> Commit 17e433b54393 ("KVM: Fix leak vCPU's VMCS value into other pCPU")
> introduced vmx_dy_apicv_has_pending_interrupt() in order to determine
> if a vCPU have a pending posted interrupt. This routine is used by
> kvm_vcpu_on_spin() when searching for a a new runnable vCPU to schedule
> on pCPU instead of a vCPU doing busy loop.
>
> vmx_dy_apicv_has_pending_interrupt() determines if a
> vCPU has a pending posted interrupt solely based on PID.ON. However,
> when a vCPU is preempted, vmx_vcpu_pi_put() sets PID.SN which cause
> raised posted interrupts to only set bit in PID.PIR without setting
> PID.ON (and without sending notification vector), as depicted in VT-d
> manual section 5.2.3 "Interrupt-Posting Hardware Operation".
>
> Therefore, checking PID.ON is insufficient to determine if a vCPU has
> pending posted interrupts and instead we should also check if there is
> some bit set on PID.PIR.
>
> Fixes: 17e433b54393 ("KVM: Fix leak vCPU's VMCS value into other pCPU")
> Signed-off-by: Joao Martins <[email protected]>
> Signed-off-by: Liran Alon <[email protected]>
> ---
> arch/x86/kvm/vmx/vmx.c | 5 ++++-
> 1 file changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index 31ce6bc2c371..18b0bee662a5 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -6141,7 +6141,10 @@ static int vmx_sync_pir_to_irr(struct kvm_vcpu *vcpu)
>
> static bool vmx_dy_apicv_has_pending_interrupt(struct kvm_vcpu *vcpu)
> {
> - return pi_test_on(vcpu_to_pi_desc(vcpu));
> + struct pi_desc *pi_desc = vcpu_to_pi_desc(vcpu);
> +
> + return pi_test_on(pi_desc) ||
> + !bitmap_empty((unsigned long *)pi_desc->pir, NR_VECTORS);
> }
>
> static void vmx_load_eoi_exitmap(struct kvm_vcpu *vcpu, u64 *eoi_exit_bitmap)

Should we check the bitmap only if SN is false? We have a precondition
that if SN is clear then non-empty PIR implies ON=1 (modulo the small
window in vmx_vcpu_pi_load of course), so that'd be a bit faster.

Paolo

2019-11-11 14:51:55

by Paolo Bonzini

[permalink] [raw]
Subject: Re: [PATCH v1 2/3] KVM: VMX: Do not change PID.NDST when loading a blocked vCPU

On 11/11/19 15:48, Joao Martins wrote:
>>>
>>> Fixes: c112b5f50232 ("KVM: x86: Recompute PID.ON when clearing PID.SN")
>>> Signed-off-by: Joao Martins <[email protected]>
>>> Signed-off-by: Liran Alon <[email protected]>
>> Something wrong in the SoB line?
>>
> I can't spot any mistake; at least it looks chained correctly for me. What's the
> issue you see with the Sob line?

Liran's line after yours is confusing. Did he help with the analysis or
anything like that?

Paolo

> The only thing I forgot was a:
>
> Tested-by: Nathan Ni <[email protected]>
>
>> Otherwise looks good.
> If you want I can resubmit the series with the Tb and Jag Rb, unless you think
> it's OK doing on commit? Otherwise I can resubmit.

2019-11-11 14:52:55

by Joao Martins

[permalink] [raw]
Subject: Re: [PATCH v1 2/3] KVM: VMX: Do not change PID.NDST when loading a blocked vCPU

On 11/11/19 2:39 PM, Paolo Bonzini wrote:
> On 06/11/19 18:56, Joao Martins wrote:
>> When vCPU enters block phase, pi_pre_block() inserts vCPU to a per pCPU
>> linked list of all vCPUs that are blocked on this pCPU. Afterwards, it
>> changes PID.NV to POSTED_INTR_WAKEUP_VECTOR which its handler
>> (wakeup_handler()) is responsible to kick (unblock) any vCPU on that
>> linked list that now has pending posted interrupts.
>>
>> While vCPU is blocked (in kvm_vcpu_block()), it may be preempted which
>> will cause vmx_vcpu_pi_put() to set PID.SN. If later the vCPU will be
>> scheduled to run on a different pCPU, vmx_vcpu_pi_load() will clear
>> PID.SN but will also *overwrite PID.NDST to this different pCPU*.
>> Instead of keeping it with original pCPU which vCPU had entered block
>> phase on.
>>
>> This results in an issue because when a posted interrupt is delivered,
>> the wakeup_handler() will be executed and fail to find blocked vCPU on
>> its per pCPU linked list of all vCPUs that are blocked on this pCPU.
>> Which is due to the vCPU being placed on a *different* per pCPU
>> linked list than the original pCPU that it had entered block phase.
>>
>> The regression is introduced by commit c112b5f50232 ("KVM: x86:
>> Recompute PID.ON when clearing PID.SN"). Therefore, partially revert
>> it and reintroduce the condition in vmx_vcpu_pi_load() responsible for
>> avoiding changing PID.NDST when loading a blocked vCPU.
>>
>> Fixes: c112b5f50232 ("KVM: x86: Recompute PID.ON when clearing PID.SN")
>> Signed-off-by: Joao Martins <[email protected]>
>> Signed-off-by: Liran Alon <[email protected]>
>
> Something wrong in the SoB line?
>
I can't spot any mistake; at least it looks chained correctly for me. What's the
issue you see with the Sob line?

The only thing I forgot was a:

Tested-by: Nathan Ni <[email protected]>

> Otherwise looks good.

If you want I can resubmit the series with the Tb and Jag Rb, unless you think
it's OK doing on commit? Otherwise I can resubmit.

Joao

2019-11-11 15:01:14

by Joao Martins

[permalink] [raw]
Subject: Re: [PATCH v1 2/3] KVM: VMX: Do not change PID.NDST when loading a blocked vCPU

On 11/11/19 2:50 PM, Paolo Bonzini wrote:
> On 11/11/19 15:48, Joao Martins wrote:
>>>>
>>>> Fixes: c112b5f50232 ("KVM: x86: Recompute PID.ON when clearing PID.SN")
>>>> Signed-off-by: Joao Martins <[email protected]>
>>>> Signed-off-by: Liran Alon <[email protected]>
>>> Something wrong in the SoB line?
>>>
>> I can't spot any mistake; at least it looks chained correctly for me. What's the
>> issue you see with the Sob line?
>
> Liran's line after yours is confusing. Did he help with the analysis or
> anything like that?
>
He was initially reviewing my patches, but then helped improving the problem
description in the commit messages so felt correct to give credit.

Joao

2019-11-11 15:01:26

by Joao Martins

[permalink] [raw]
Subject: Re: [PATCH v1 1/3] KVM: VMX: Consider PID.PIR to determine if vCPU has pending interrupts

On 11/11/19 2:46 PM, Paolo Bonzini wrote:
> On 06/11/19 18:56, Joao Martins wrote:
>> Commit 17e433b54393 ("KVM: Fix leak vCPU's VMCS value into other pCPU")
>> introduced vmx_dy_apicv_has_pending_interrupt() in order to determine
>> if a vCPU have a pending posted interrupt. This routine is used by
>> kvm_vcpu_on_spin() when searching for a a new runnable vCPU to schedule
>> on pCPU instead of a vCPU doing busy loop.
>>
>> vmx_dy_apicv_has_pending_interrupt() determines if a
>> vCPU has a pending posted interrupt solely based on PID.ON. However,
>> when a vCPU is preempted, vmx_vcpu_pi_put() sets PID.SN which cause
>> raised posted interrupts to only set bit in PID.PIR without setting
>> PID.ON (and without sending notification vector), as depicted in VT-d
>> manual section 5.2.3 "Interrupt-Posting Hardware Operation".
>>
>> Therefore, checking PID.ON is insufficient to determine if a vCPU has
>> pending posted interrupts and instead we should also check if there is
>> some bit set on PID.PIR.
>>
>> Fixes: 17e433b54393 ("KVM: Fix leak vCPU's VMCS value into other pCPU")
>> Signed-off-by: Joao Martins <[email protected]>
>> Signed-off-by: Liran Alon <[email protected]>
>> ---
>> arch/x86/kvm/vmx/vmx.c | 5 ++++-
>> 1 file changed, 4 insertions(+), 1 deletion(-)
>>
>> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
>> index 31ce6bc2c371..18b0bee662a5 100644
>> --- a/arch/x86/kvm/vmx/vmx.c
>> +++ b/arch/x86/kvm/vmx/vmx.c
>> @@ -6141,7 +6141,10 @@ static int vmx_sync_pir_to_irr(struct kvm_vcpu *vcpu)
>>
>> static bool vmx_dy_apicv_has_pending_interrupt(struct kvm_vcpu *vcpu)
>> {
>> - return pi_test_on(vcpu_to_pi_desc(vcpu));
>> + struct pi_desc *pi_desc = vcpu_to_pi_desc(vcpu);
>> +
>> + return pi_test_on(pi_desc) ||
>> + !bitmap_empty((unsigned long *)pi_desc->pir, NR_VECTORS);
>> }
>>
>> static void vmx_load_eoi_exitmap(struct kvm_vcpu *vcpu, u64 *eoi_exit_bitmap)
>
> Should we check the bitmap only if SN is false? We have a precondition
> that if SN is clear then non-empty PIR implies ON=1 (modulo the small
> window in vmx_vcpu_pi_load of course), so that'd be a bit faster.

Makes sense;

The bitmap check was really meant for SN=1.

Should SN=0 we would be saving ~22-27 cycles as far as I micro-benchmarked a few
weeks ago. Now that you suggest it, it would be also good for older platforms too.

Cheers,
Joao

2019-11-11 15:03:18

by Liran Alon

[permalink] [raw]
Subject: Re: [PATCH v1 2/3] KVM: VMX: Do not change PID.NDST when loading a blocked vCPU



> On 11 Nov 2019, at 16:56, Joao Martins <[email protected]> wrote:
>
> On 11/11/19 2:50 PM, Paolo Bonzini wrote:
>> On 11/11/19 15:48, Joao Martins wrote:
>>>>>
>>>>> Fixes: c112b5f50232 ("KVM: x86: Recompute PID.ON when clearing PID.SN")
>>>>> Signed-off-by: Joao Martins <[email protected]>
>>>>> Signed-off-by: Liran Alon <[email protected]>
>>>> Something wrong in the SoB line?
>>>>
>>> I can't spot any mistake; at least it looks chained correctly for me. What's the
>>> issue you see with the Sob line?
>>
>> Liran's line after yours is confusing. Did he help with the analysis or
>> anything like that?
>>
> He was initially reviewing my patches, but then helped improving the problem
> description in the commit messages so felt correct to give credit.
>
> Joao

I think proper action is to just remove me from the SoB line.

-Liran

2019-11-11 15:56:55

by Sean Christopherson

[permalink] [raw]
Subject: Re: [PATCH v1 2/3] KVM: VMX: Do not change PID.NDST when loading a blocked vCPU

On Mon, Nov 11, 2019 at 04:59:20PM +0200, Liran Alon wrote:
>
>
> > On 11 Nov 2019, at 16:56, Joao Martins <[email protected]> wrote:
> >
> > On 11/11/19 2:50 PM, Paolo Bonzini wrote:
> >> On 11/11/19 15:48, Joao Martins wrote:
> >>>>>
> >>>>> Fixes: c112b5f50232 ("KVM: x86: Recompute PID.ON when clearing PID.SN")
> >>>>> Signed-off-by: Joao Martins <[email protected]>
> >>>>> Signed-off-by: Liran Alon <[email protected]>
> >>>> Something wrong in the SoB line?
> >>>>
> >>> I can't spot any mistake; at least it looks chained correctly for me. What's the
> >>> issue you see with the Sob line?
> >>
> >> Liran's line after yours is confusing. Did he help with the analysis or
> >> anything like that?
> >>
> > He was initially reviewing my patches, but then helped improving the problem
> > description in the commit messages so felt correct to give credit.
> >
> > Joao
>
> I think proper action is to just remove me from the SoB line.

Use Co-developed-by to attribute multiple authors. Note, the SoB ordering
should show the chronological history of the patch when possible, e.g. the
person sending the patch should always have their SoB last.

Documentation/process/submitting-patches.rst and
Documentation/process/5.Posting.rst have more details.

2019-11-11 15:59:47

by Paolo Bonzini

[permalink] [raw]
Subject: Re: [PATCH v1 1/3] KVM: VMX: Consider PID.PIR to determine if vCPU has pending interrupts

On 11/11/19 15:59, Joao Martins wrote:
>> Should we check the bitmap only if SN is false?
^^^^^

Of course it should be skipped if SN is false, as you correctly say below.

>> We have a precondition
>> that if SN is clear then non-empty PIR implies ON=1 (modulo the small
>> window in vmx_vcpu_pi_load of course), so that'd be a bit faster.
> Makes sense;
>
> The bitmap check was really meant for SN=1.
>
> Should SN=0 we would be saving ~22-27 cycles as far as I micro-benchmarked a few
> weeks ago. Now that you suggest it, it would be also good for older platforms too.

Or even newer platforms if they don't use VT-d.

Paolo

2019-11-11 17:30:08

by Joao Martins

[permalink] [raw]
Subject: Re: [PATCH v1 2/3] KVM: VMX: Do not change PID.NDST when loading a blocked vCPU

On 11/11/19 3:53 PM, Sean Christopherson wrote:
> On Mon, Nov 11, 2019 at 04:59:20PM +0200, Liran Alon wrote:
>>
>>
>>> On 11 Nov 2019, at 16:56, Joao Martins <[email protected]> wrote:
>>>
>>> On 11/11/19 2:50 PM, Paolo Bonzini wrote:
>>>> On 11/11/19 15:48, Joao Martins wrote:
>>>>>>>
>>>>>>> Fixes: c112b5f50232 ("KVM: x86: Recompute PID.ON when clearing PID.SN")
>>>>>>> Signed-off-by: Joao Martins <[email protected]>
>>>>>>> Signed-off-by: Liran Alon <[email protected]>
>>>>>> Something wrong in the SoB line?
>>>>>>
>>>>> I can't spot any mistake; at least it looks chained correctly for me. What's the
>>>>> issue you see with the Sob line?
>>>>
>>>> Liran's line after yours is confusing. Did he help with the analysis or
>>>> anything like that?
>>>>
>>> He was initially reviewing my patches, but then helped improving the problem
>>> description in the commit messages so felt correct to give credit.
>>>
>>> Joao
>>
>> I think proper action is to just remove me from the SoB line.
>
> Use Co-developed-by to attribute multiple authors. Note, the SoB ordering
> should show the chronological history of the patch when possible, e.g. the
> person sending the patch should always have their SoB last.
>
> Documentation/process/submitting-patches.rst and
> Documentation/process/5.Posting.rst have more details.
>
The Sob chain on the first two patches were broken (regardless of any use of
Co-developed-by). Fixed it up on v2, alongside the rest of the comments.

Cheers,
Joao