2019-02-21 13:53:24

by Joerg Roedel

Subject: [PATCH stable-4.4.y] KVM: VMX: Fix x2apic check in vmx_msr_bitmap_mode()

From: Joerg Roedel <[email protected]>

The stable backport of upstream commit

904e14fb7cb96 KVM: VMX: make MSR bitmaps per-VCPU

has a bug in vmx_msr_bitmap_mode(). It enables the x2apic
MSR-bitmap when the kernel emulates x2apic for the guest in
software. The upstream version of the commit checks whether
the hardware has virtualization enabled for x2apic
emulation.

Since KVM emulates x2apic for guests even when the host does
not support x2apic in hardware, this causes the intercept of
at least the X2APIC_TASKPRI MSR to be disabled on machines
not supporting that MSR. The result is undefined behavior;
on some machines (Intel Westmere based) it causes a crash of
the guest kernel when it tries to access that MSR.

Change the check in vmx_msr_bitmap_mode() to match the upstream
code. This fixes the guest crashes observed with stable
kernels starting with v4.4.168 through v4.4.175.

Signed-off-by: Joerg Roedel <[email protected]>
---
arch/x86/kvm/vmx.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index aee2886a387c..14553f6c03a6 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -4628,7 +4628,9 @@ static u8 vmx_msr_bitmap_mode(struct kvm_vcpu *vcpu)
 {
 	u8 mode = 0;
 
-	if (irqchip_in_kernel(vcpu->kvm) && apic_x2apic_mode(vcpu->arch.apic)) {
+	if (cpu_has_secondary_exec_ctrls() &&
+	    (vmcs_read32(SECONDARY_VM_EXEC_CONTROL) &
+	     SECONDARY_EXEC_VIRTUALIZE_X2APIC_MODE)) {
 		mode |= MSR_BITMAP_MODE_X2APIC;
 		if (enable_apicv)
 			mode |= MSR_BITMAP_MODE_X2APIC_APICV;
--
2.16.3
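
For readers following along, the difference between the two checks can be
sketched as a small standalone C model. This is not kernel code: struct
vcpu_model and both function names are hypothetical, and the constants are
given illustrative values so the snippet compiles on its own. The first
function mirrors the condition removed by the patch, the second mirrors the
condition added.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative values; the real definitions live in the kernel sources. */
#define SECONDARY_EXEC_VIRTUALIZE_X2APIC_MODE	0x00000010u
#define MSR_BITMAP_MODE_X2APIC			1u
#define MSR_BITMAP_MODE_X2APIC_APICV		2u

/* Hypothetical stand-in for the state the two checks look at. */
struct vcpu_model {
	bool irqchip_in_kernel;		/* KVM emulates the APIC */
	bool guest_x2apic_mode;		/* guest switched its APIC to x2apic */
	bool has_secondary_exec_ctrls;	/* CPU offers secondary exec controls */
	uint32_t secondary_exec_control; /* modelled VMCS field contents */
	bool enable_apicv;
};

/* Mirrors the removed condition: keyed on what the guest sees. */
uint8_t bitmap_mode_old(const struct vcpu_model *v)
{
	uint8_t mode = 0;

	if (v->irqchip_in_kernel && v->guest_x2apic_mode) {
		mode |= MSR_BITMAP_MODE_X2APIC;
		if (v->enable_apicv)
			mode |= MSR_BITMAP_MODE_X2APIC_APICV;
	}
	return mode;
}

/* Mirrors the added condition: keyed on what the hardware virtualizes. */
uint8_t bitmap_mode_new(const struct vcpu_model *v)
{
	uint8_t mode = 0;

	if (v->has_secondary_exec_ctrls &&
	    (v->secondary_exec_control &
	     SECONDARY_EXEC_VIRTUALIZE_X2APIC_MODE)) {
		mode |= MSR_BITMAP_MODE_X2APIC;
		if (v->enable_apicv)
			mode |= MSR_BITMAP_MODE_X2APIC_APICV;
	}
	return mode;
}
```

With a guest in x2apic mode on a host where the virtualize-x2apic control
bit is clear, bitmap_mode_old() still selects the x2apic bitmap while
bitmap_mode_new() returns 0, which is the situation described in the
commit message above.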



2019-02-21 14:18:17

by Greg Kroah-Hartman

Subject: Re: [PATCH stable-4.4.y] KVM: VMX: Fix x2apic check in vmx_msr_bitmap_mode()

On Thu, Feb 21, 2019 at 02:52:13PM +0100, Joerg Roedel wrote:
> From: Joerg Roedel <[email protected]>
>
> The stable backport of upstream commit
>
> 904e14fb7cb96 KVM: VMX: make MSR bitmaps per-VCPU
>
> has a bug in vmx_msr_bitmap_mode(). It enables the x2apic
> MSR-bitmap when the kernel emulates x2apic for the guest in
> software. The upstream version of the commit checks whether
> the hardware has virtualization enabled for x2apic
> emulation.
>
> Since KVM emulates x2apic for guests even when the host does
> not support x2apic in hardware, this causes the intercept of
> at least the X2APIC_TASKPRI MSR to be disabled on machines
> not supporting that MSR. The result is undefined behavior;
> on some machines (Intel Westmere based) it causes a crash of
> the guest kernel when it tries to access that MSR.
>
> Change the check in vmx_msr_bitmap_mode() to match the upstream
> code. This fixes the guest crashes observed with stable
> kernels starting with v4.4.168 through v4.4.175.
>
> Signed-off-by: Joerg Roedel <[email protected]>
> ---
> arch/x86/kvm/vmx.c | 4 +++-
> 1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
> index aee2886a387c..14553f6c03a6 100644
> --- a/arch/x86/kvm/vmx.c
> +++ b/arch/x86/kvm/vmx.c
> @@ -4628,7 +4628,9 @@ static u8 vmx_msr_bitmap_mode(struct kvm_vcpu *vcpu)
>  {
>  	u8 mode = 0;
>  
> -	if (irqchip_in_kernel(vcpu->kvm) && apic_x2apic_mode(vcpu->arch.apic)) {
> +	if (cpu_has_secondary_exec_ctrls() &&
> +	    (vmcs_read32(SECONDARY_VM_EXEC_CONTROL) &
> +	     SECONDARY_EXEC_VIRTUALIZE_X2APIC_MODE)) {
>  		mode |= MSR_BITMAP_MODE_X2APIC;
>  		if (enable_apicv)
>  			mode |= MSR_BITMAP_MODE_X2APIC_APICV;
> --
> 2.16.3
>

Ugh, good catch!

Any hint as to what type of testing that you did that caught this? I
keep asking people to run some kvm tests, but so far no one is :(

thanks,

greg k-h

2019-02-21 14:49:46

by Jörg Rödel

Subject: Re: [PATCH stable-4.4.y] KVM: VMX: Fix x2apic check in vmx_msr_bitmap_mode()

On Thu, Feb 21, 2019 at 03:15:30PM +0100, Greg Kroah-Hartman wrote:
> Ugh, good catch!
>
> Any hint as to what type of testing that you did that caught this? I
> keep asking people to run some kvm tests, but so far no one is :(

We caught this at SUSE while testing candidate kernel updates for one of
our service packs using a 4.4-based kernel, and debugging showed
that this issue came in via stable-updates. We also build a
vanilla-flavour of the kernel which is nearly identical to the upstream
stable tree, but what usually ends up in testing is the full tree with
other backports.

This particular issue was found by updating some openstack machines with
the candidate kernel, which then triggered the problem in some guests.
It is also a very special one, since I was only able to trigger the
problem on Westmere-based machines with a specific guest-config.

Regards,

Joerg

2019-02-21 16:21:45

by Greg Kroah-Hartman

Subject: Re: [PATCH stable-4.4.y] KVM: VMX: Fix x2apic check in vmx_msr_bitmap_mode()

On Thu, Feb 21, 2019 at 03:47:01PM +0100, Joerg Roedel wrote:
> On Thu, Feb 21, 2019 at 03:15:30PM +0100, Greg Kroah-Hartman wrote:
> > Ugh, good catch!
> >
> > Any hint as to what type of testing that you did that caught this? I
> > keep asking people to run some kvm tests, but so far no one is :(
>
> We caught this at SUSE while testing candidate kernel updates for one of
> our service packs using a 4.4-based kernel, and debugging showed
> that this issue came in via stable-updates. We also build a
> vanilla-flavour of the kernel which is nearly identical to the upstream
> stable tree, but what usually ends up in testing is the full tree with
> other backports.
>
> This particular issue was found by updating some openstack machines with
> the candidate kernel, which then triggered the problem in some guests.
> It is also a very special one, since I was only able to trigger the
> problem on Westmere-based machines with a specific guest-config.

Nice work. Any chance that "test" could be added to the kvm testing
scripts that I think are being worked on somewhere? Ideally we would
have caught this before it ever hit the stable tree. Due to the lack of
good KVM testing, that's one of the areas I am always most worried about
:(

thanks,

greg k-h

2019-02-21 17:01:51

by Ben Hutchings

Subject: Re: [PATCH stable-4.4.y] KVM: VMX: Fix x2apic check in vmx_msr_bitmap_mode()

On Thu, 2019-02-21 at 17:20 +0100, Greg Kroah-Hartman wrote:
> On Thu, Feb 21, 2019 at 03:47:01PM +0100, Joerg Roedel wrote:
> > On Thu, Feb 21, 2019 at 03:15:30PM +0100, Greg Kroah-Hartman wrote:
> > > Ugh, good catch!
> > >
> > > Any hint as to what type of testing that you did that caught this?  I
> > > keep asking people to run some kvm tests, but so far no one is :(
> >
> > We caught this at SUSE while testing candidate kernel updates for one of
> > our service packs using a 4.4-based kernel, and debugging showed
> > that this issue came in via stable-updates. We also build a
> > vanilla-flavour of the kernel which is nearly identical to the upstream
> > stable tree, but what usually ends up in testing is the full tree with
> > other backports.
> >
> > This particular issue was found by updating some openstack machines with
> > the candidate kernel, which then triggered the problem in some guests.
> > It is also a very special one, since I was only able to trigger the
> > problem on Westmere-based machines with a specific guest-config.
>
> Nice work.  Any chance that "test" could be added to the kvm testing
> scripts that I think are being worked on somewhere?  Ideally we would
> have caught this before it ever hit the stable tree.

If I understood correctly, the bug is specific to my backport.

> Due to the lack of
> good KVM testing, that's one of the areas I am always most worried about
> :(

Since the behaviour in this area depends on the host CPU model this
might not help much.

Ben.

--
Ben Hutchings, Software Developer   Codethink Ltd
https://www.codethink.co.uk/ Dale House, 35 Dale Street
Manchester, M1 2HF, United Kingdom

2019-02-21 17:16:32

by Sean Christopherson

Subject: Re: [PATCH stable-4.4.y] KVM: VMX: Fix x2apic check in vmx_msr_bitmap_mode()

On Thu, Feb 21, 2019 at 05:20:32PM +0100, Greg Kroah-Hartman wrote:
> On Thu, Feb 21, 2019 at 03:47:01PM +0100, Joerg Roedel wrote:
> > On Thu, Feb 21, 2019 at 03:15:30PM +0100, Greg Kroah-Hartman wrote:
> > > Ugh, good catch!
> > >
> > > Any hint as to what type of testing that you did that caught this? I
> > > keep asking people to run some kvm tests, but so far no one is :(
> >
> > We caught this at SUSE while testing candidate kernel updates for one of
> > our service packs using a 4.4-based kernel, and debugging showed
> > that this issue came in via stable-updates. We also build a
> > vanilla-flavour of the kernel which is nearly identical to the upstream
> > stable tree, but what usually ends up in testing is the full tree with
> > other backports.
> >
> > This particular issue was found by updating some openstack machines with
> > the candidate kernel, which then triggered the problem in some guests.
> > It is also a very special one, since I was only able to trigger the
> > problem on Westmere-based machines with a specific guest-config.
>
> Nice work. Any chance that "test" could be added to the kvm testing
> scripts that I think are being worked on somewhere? Ideally we would
> have caught this before it ever hit the stable tree. Due to the lack of
> good KVM testing, that's one of the areas I am always most worried about

This bug exists only in the 4.4.y backport; upstream, 4.9.y and 4.14.y
all had the correct code from the get-go. And there is already a KVM
unit test that *should* hit this, albeit somewhat indirectly. I'll
verify the tests that touch the TPR actually run with x2APIC enabled.

Assuming the KVM unit test actually works, it's not a stretch for the
bug to escape, e.g. if the tests weren't run on 4.4.y at all, or were
only run on hardware with x2APIC.

2019-02-22 09:23:07

by Paolo Bonzini

Subject: Re: [PATCH stable-4.4.y] KVM: VMX: Fix x2apic check in vmx_msr_bitmap_mode()

On 21/02/19 18:15, Sean Christopherson wrote:
> This bug exists only in the 4.4.y backport; upstream, 4.9.y and 4.14.y
> all had the correct code from the get-go. And there is already a KVM
> unit test that *should* hit this, albeit somewhat indirectly. I'll
> verify the tests that touch the TPR actually run with x2APIC enabled.

eventinj from kvm-unit-tests should trigger it. There are other tests
that touch the TPR, but they use cr8 so they don't show the bug.
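
To make the MSR path concrete, here is a rough, self-contained C sketch of
how the x2APIC TPR maps to an MSR index and how a VMX-style MSR bitmap
decides whether an access to it exits to the hypervisor. The helper names
are made up for illustration and are not the kvm-unit-tests or kernel API.
Only the low MSR range (0x0 to 0x1fff) is modelled; anything else is simply
reported as intercepted. A TPR access through cr8 never consults this
bitmap, which is why cr8-based tests would not notice a wrongly cleared
bit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define APIC_BASE_MSR	0x800u	/* first x2APIC MSR */
#define APIC_TASKPRI	0x80u	/* TPR offset in the xAPIC MMIO page */

/* Layout of the 4 KiB bitmap page, low MSR range only in this sketch. */
#define MSR_BITMAP_SIZE		4096u
#define MSR_BITMAP_READ_LOW	0u	/* byte offset of read bitmap */
#define MSR_BITMAP_WRITE_LOW	2048u	/* byte offset of write bitmap */
#define MSR_LOW_RANGE_MAX	0x1fffu

/* x2APIC exposes each 16-byte xAPIC register as one MSR. */
uint32_t x2apic_msr(uint32_t mmio_offset)
{
	return APIC_BASE_MSR + (mmio_offset >> 4);
}

/* A set bit means the access is intercepted (causes a VM exit). */
static bool bit_is_set(const uint8_t *bitmap, uint32_t base, uint32_t msr)
{
	return (bitmap[base + msr / 8] & (1u << (msr % 8))) != 0;
}

bool msr_read_intercepted(const uint8_t *bitmap, uint32_t msr)
{
	if (msr > MSR_LOW_RANGE_MAX)
		return true;	/* not modelled here, treat as intercepted */
	return bit_is_set(bitmap, MSR_BITMAP_READ_LOW, msr);
}

bool msr_write_intercepted(const uint8_t *bitmap, uint32_t msr)
{
	if (msr > MSR_LOW_RANGE_MAX)
		return true;	/* not modelled here, treat as intercepted */
	return bit_is_set(bitmap, MSR_BITMAP_WRITE_LOW, msr);
}

/* Clear both bits so guest reads and writes go straight to hardware. */
void msr_disable_intercept(uint8_t *bitmap, uint32_t msr)
{
	if (msr > MSR_LOW_RANGE_MAX)
		return;
	bitmap[MSR_BITMAP_READ_LOW + msr / 8] &= (uint8_t)~(1u << (msr % 8));
	bitmap[MSR_BITMAP_WRITE_LOW + msr / 8] &= (uint8_t)~(1u << (msr % 8));
}
```

Starting from an all-ones bitmap (everything intercepted), calling
msr_disable_intercept() for x2apic_msr(APIC_TASKPRI) models what the wrong
bitmap mode did: the guest's TPR access is passed to the CPU instead of
being emulated.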

> Assuming the KVM unit test actually works, it's not a stretch for the
> bug to escape, e.g. if the tests weren't run on 4.4.y at all, or were
> only run on hardware with x2APIC.

Yeah, you should be able to see this with kvm_intel.enable_apicv=0 on
newer processors. But I've never run the tests for 4.4.y.

Paolo