2020-06-22 16:10:52

by Igor Mammedov

Subject: [PATCH] kvm: lapic: fix broken vcpu hotplug

The guest fails to online a hotplugged CPU with the error:
smpboot: do_boot_cpu failed(-1) to wakeup CPU#4

This happens because kvm_apic_set_state(), which used to call
recalculate_apic_map() unconditionally and thereby pulled the hotplugged CPU
into the apic map, now updates the map only on a state change [1]. No state
change happens in this case, so the apic map update is skipped.

Note:
During kvm_arch_vcpu_create() the new CPU is not yet visible to
kvm_recalculate_apic_map(), so all related update calls end up
as NOPs; only the follow-up kvm_apic_set_state() used to trigger a map
update that took the hotplugged CPU into account.
Fix the issue by forcing an unconditional update from kvm_apic_set_state(),
as it used to be.

1)
Fixes: 4abaffce4d25a ("KVM: LAPIC: Recalculate apic map in batch")
Signed-off-by: Igor Mammedov <[email protected]>
---
PS:
This is an alternative to the full revert of [1] that I posted earlier:
https://www.mail-archive.com/[email protected]/msg2205600.html
so feel free to pick up whichever is better for now.
---
arch/x86/kvm/lapic.c | 1 +
1 file changed, 1 insertion(+)

diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index 34a7e0533dad..5696831d4005 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -2556,6 +2556,7 @@ int kvm_apic_set_state(struct kvm_vcpu *vcpu, struct kvm_lapic_state *s)
struct kvm_lapic *apic = vcpu->arch.apic;
int r;

+ apic->vcpu->kvm->arch.apic_map_dirty = true;
kvm_lapic_set_base(vcpu, vcpu->arch.apic_base);
/* set SPIV separately to get count of SW disabled APICs right */
apic_set_spiv(apic, *((u32 *)(s->regs + APIC_SPIV)));
--
2.26.2
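
The failure mode described in the commit message can be illustrated with a small user-space model. This is only a sketch under stated assumptions: the names toy_vm, toy_recalculate_apic_map(), toy_vcpu_create() and toy_apic_set_state() are hypothetical stand-ins, not the kernel's structures or functions, and the model keeps only the two facts the message relies on (the rebuild is skipped when the dirty flag is clear, and a new vCPU becomes visible only after creation has finished).

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_MAX_VCPUS 8

/* Hypothetical stand-in for per-VM state; not the real struct kvm. */
struct toy_vm {
    bool apic_map_dirty;         /* set when APIC state has changed */
    int  online_vcpus;           /* vCPUs visible to the recalculation */
    bool in_map[TOY_MAX_VCPUS];  /* vCPUs covered by the current map */
};

/* Batched behaviour: rebuild the map only if it was marked dirty. */
void toy_recalculate_apic_map(struct toy_vm *vm)
{
    if (!vm->apic_map_dirty)
        return;                  /* nothing marked dirty: skip the rebuild */
    for (int i = 0; i < TOY_MAX_VCPUS; i++)
        vm->in_map[i] = i < vm->online_vcpus;
    vm->apic_map_dirty = false;
}

/* Creation: the recalculation runs before the new vCPU is visible. */
int toy_vcpu_create(struct toy_vm *vm)
{
    assert(vm->online_vcpus < TOY_MAX_VCPUS);
    vm->apic_map_dirty = true;     /* initial APIC setup marks the map dirty */
    toy_recalculate_apic_map(vm);  /* rebuild does not count the new vCPU */
    return vm->online_vcpus++;     /* visible now, but the flag is clear */
}

/*
 * State restore that changes nothing, so it does not mark the map dirty
 * on its own. force_dirty models the one-line change in the patch.
 */
void toy_apic_set_state(struct toy_vm *vm, bool force_dirty)
{
    if (force_dirty)
        vm->apic_map_dirty = true;
    toy_recalculate_apic_map(vm);
}
```

In this model, calling toy_vcpu_create() followed by toy_apic_set_state(&vm, false) leaves the new vCPU out of the map, while toy_apic_set_state(&vm, true) pulls it in.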


2020-06-22 16:50:43

by Paolo Bonzini

Subject: Re: [PATCH] kvm: lapic: fix broken vcpu hotplug

On 22/06/20 18:08, Igor Mammedov wrote:
> @@ -2556,6 +2556,7 @@ int kvm_apic_set_state(struct kvm_vcpu *vcpu, struct kvm_lapic_state *s)
> struct kvm_lapic *apic = vcpu->arch.apic;
> int r;
>
> + apic->vcpu->kvm->arch.apic_map_dirty = true;
> kvm_lapic_set_base(vcpu, vcpu->arch.apic_base);
> /* set SPIV separately to get count of SW disabled APICs right */
> apic_set_spiv(apic, *((u32 *)(s->regs + APIC_SPIV)));
>

Queued, but it's better to set apic_map_dirty just before the call to
kvm_recalculate_apic_map, or you can have a variant of the race that you
pointed out.

Paolo
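
The thread does not spell out the race, so the following is only one plausible reading of why the placement of the flag matters, written as a single-threaded model of an unlucky interleaving. All names (race_vm, race_recalculate_apic_map(), race_set_state_early(), race_set_state_late()) are hypothetical, and the "concurrent" recalculation is simulated by a plain call at the point where another vCPU thread might run it.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-VM state; not the real struct kvm. */
struct race_vm {
    bool apic_map_dirty;
    int  apic_state;  /* state the map is supposed to reflect */
    int  map_state;   /* state the map was last built from */
};

/* Rebuild only when dirty, then clear the flag. */
void race_recalculate_apic_map(struct race_vm *vm)
{
    if (!vm->apic_map_dirty)
        return;
    vm->map_state = vm->apic_state;
    vm->apic_map_dirty = false;
}

/* Flag set at the top of the function, as in the posted patch. */
void race_set_state_early(struct race_vm *vm, int new_state)
{
    vm->apic_map_dirty = true;
    race_recalculate_apic_map(vm);  /* simulated caller on another vCPU:
                                       consumes the flag, sees old state */
    vm->apic_state = new_state;     /* assumed not to set the flag itself */
    race_recalculate_apic_map(vm);  /* flag already clear: skipped */
}

/* Flag set immediately before the final recalculation. */
void race_set_state_late(struct race_vm *vm, int new_state)
{
    race_recalculate_apic_map(vm);  /* simulated concurrent caller: no-op */
    vm->apic_state = new_state;
    vm->apic_map_dirty = true;      /* nothing can consume it in between */
    race_recalculate_apic_map(vm);  /* rebuild sees the new state */
}
```

Under these assumptions the early placement can leave map_state stale, while the late placement always rebuilds from the freshly loaded state.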

2020-06-23 11:16:11

by Igor Mammedov

Subject: Re: [PATCH] kvm: lapic: fix broken vcpu hotplug

On Mon, 22 Jun 2020 18:47:57 +0200
Paolo Bonzini <[email protected]> wrote:

> On 22/06/20 18:08, Igor Mammedov wrote:
> > @@ -2556,6 +2556,7 @@ int kvm_apic_set_state(struct kvm_vcpu *vcpu, struct kvm_lapic_state *s)
> > struct kvm_lapic *apic = vcpu->arch.apic;
> > int r;
> >
> > + apic->vcpu->kvm->arch.apic_map_dirty = true;
> > kvm_lapic_set_base(vcpu, vcpu->arch.apic_base);
> > /* set SPIV separately to get count of SW disabled APICs right */
> > apic_set_spiv(apic, *((u32 *)(s->regs + APIC_SPIV)));
> >
>
> Queued, but it's better to set apic_map_dirty just before the call to
> kvm_recalculate_apic_map, or you can have a variant of the race that you
> pointed out.

Here I was also worried about the failure path, which sits just before the
normal kvm_recalculate_apic_map() call and has its own
kvm_recalculate_apic_map() call.

But I'm not sure whether we should force a map update in that case.

>
> Paolo
>

2020-06-23 11:38:51

by Paolo Bonzini

Subject: Re: [PATCH] kvm: lapic: fix broken vcpu hotplug

On 23/06/20 13:13, Igor Mammedov wrote:
>>> + apic->vcpu->kvm->arch.apic_map_dirty = true;
>>> kvm_lapic_set_base(vcpu, vcpu->arch.apic_base);
>>> /* set SPIV separately to get count of SW disabled APICs right */
>>> apic_set_spiv(apic, *((u32 *)(s->regs + APIC_SPIV)));
>>>
>> Queued, but it's better to set apic_map_dirty just before the call to
>> kvm_recalculate_apic_map, or you can have a variant of the race that you
>> pointed out.
> Here I was worried about failure path as well that is just before normal
> kvm_recalculate_apic_map(), and has its own kvm_recalculate_apic_map().
>
> but I'm not sure if we should force map update in that case.
>

In that case kvm_lapic_set_base and apic_set_spiv will take care of it
(and if kvm_apic_state_fixup writes LDR, it succeeds and you go down
the other path).

Paolo