2019-07-25 13:31:34

by Vitaly Kuznetsov

[permalink] [raw]
Subject: [PATCH stable-4.19 0/2] KVM: nVMX: guest reset fixes

Few patches were recently marked for stable@ but commits are not
backportable as-is and require a few tweaks. Here is 4.19 stable backport.

Jan Kiszka (1):
KVM: nVMX: Clear pending KVM_REQ_GET_VMCS12_PAGES when leaving nested

Paolo Bonzini (1):
KVM: nVMX: do not use dangling shadow VMCS after guest reset

arch/x86/kvm/vmx.c | 10 +++++++++-
1 file changed, 9 insertions(+), 1 deletion(-)

--
2.20.1



2019-07-25 13:44:34

by Paolo Bonzini

[permalink] [raw]
Subject: Re: [PATCH stable-4.19 0/2] KVM: nVMX: guest reset fixes

On 25/07/19 12:46, Vitaly Kuznetsov wrote:
> Few patches were recently marked for stable@ but commits are not
> backportable as-is and require a few tweaks. Here is 4.19 stable backport.
>
> Jan Kiszka (1):
> KVM: nVMX: Clear pending KVM_REQ_GET_VMCS12_PAGES when leaving nested
>
> Paolo Bonzini (1):
> KVM: nVMX: do not use dangling shadow VMCS after guest reset
>
> arch/x86/kvm/vmx.c | 10 +++++++++-
> 1 file changed, 9 insertions(+), 1 deletion(-)
>

Acked-by: Paolo Bonzini <[email protected]>

Thanks Vitaly for helping out.

Paolo

2019-07-25 16:47:00

by Vitaly Kuznetsov

[permalink] [raw]
Subject: [PATCH stable-4.19 1/2] KVM: nVMX: do not use dangling shadow VMCS after guest reset

From: Paolo Bonzini <[email protected]>

[ Upstream commit 88dddc11a8d6b09201b4db9d255b3394d9bc9e57 ]

If a KVM guest is reset while running a nested guest, free_nested will
disable the shadow VMCS execution control in the vmcs01. However,
on the next KVM_RUN vmx_vcpu_run would nevertheless try to sync
the VMCS12 to the shadow VMCS which has since been freed.

This causes a vmptrld of a NULL pointer on my machime, but Jan reports
the host to hang altogether. Let's see how much this trivial patch fixes.

Reported-by: Jan Kiszka <[email protected]>
Cc: Liran Alon <[email protected]>
Cc: [email protected]
Signed-off-by: Paolo Bonzini <[email protected]>
---
arch/x86/kvm/vmx.c | 8 +++++++-
1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 73d6d585dd66..880bc36a0d5d 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -8457,6 +8457,7 @@ static void vmx_disable_shadow_vmcs(struct vcpu_vmx *vmx)
{
vmcs_clear_bits(SECONDARY_VM_EXEC_CONTROL, SECONDARY_EXEC_SHADOW_VMCS);
vmcs_write64(VMCS_LINK_POINTER, -1ull);
+ vmx->nested.sync_shadow_vmcs = false;
}

static inline void nested_release_vmcs12(struct vcpu_vmx *vmx)
@@ -8468,7 +8469,6 @@ static inline void nested_release_vmcs12(struct vcpu_vmx *vmx)
/* copy to memory all shadowed fields in case
they were modified */
copy_shadow_to_vmcs12(vmx);
- vmx->nested.sync_shadow_vmcs = false;
vmx_disable_shadow_vmcs(vmx);
}
vmx->nested.posted_intr_nv = -1;
@@ -8668,6 +8668,9 @@ static void copy_shadow_to_vmcs12(struct vcpu_vmx *vmx)
u64 field_value;
struct vmcs *shadow_vmcs = vmx->vmcs01.shadow_vmcs;

+ if (WARN_ON(!shadow_vmcs))
+ return;
+
preempt_disable();

vmcs_load(shadow_vmcs);
@@ -8706,6 +8709,9 @@ static void copy_vmcs12_to_shadow(struct vcpu_vmx *vmx)
u64 field_value = 0;
struct vmcs *shadow_vmcs = vmx->vmcs01.shadow_vmcs;

+ if (WARN_ON(!shadow_vmcs))
+ return;
+
vmcs_load(shadow_vmcs);

for (q = 0; q < ARRAY_SIZE(fields); q++) {
--
2.20.1


2019-07-26 13:50:02

by Greg KH

[permalink] [raw]
Subject: Re: [PATCH stable-4.19 0/2] KVM: nVMX: guest reset fixes

On Thu, Jul 25, 2019 at 12:46:43PM +0200, Vitaly Kuznetsov wrote:
> Few patches were recently marked for stable@ but commits are not
> backportable as-is and require a few tweaks. Here is 4.19 stable backport.
>
> Jan Kiszka (1):
> KVM: nVMX: Clear pending KVM_REQ_GET_VMCS12_PAGES when leaving nested
>
> Paolo Bonzini (1):
> KVM: nVMX: do not use dangling shadow VMCS after guest reset
>
> arch/x86/kvm/vmx.c | 10 +++++++++-
> 1 file changed, 9 insertions(+), 1 deletion(-)

All now applied, thanks!

greg k-h

2019-07-29 09:08:50

by Jack Wang

[permalink] [raw]
Subject: Re: [PATCH stable-4.19 1/2] KVM: nVMX: do not use dangling shadow VMCS after guest reset

Vitaly Kuznetsov <[email protected]> 于2019年7月25日周四 下午3:29写道:
>
> From: Paolo Bonzini <[email protected]>
>
> [ Upstream commit 88dddc11a8d6b09201b4db9d255b3394d9bc9e57 ]
>
> If a KVM guest is reset while running a nested guest, free_nested will
> disable the shadow VMCS execution control in the vmcs01. However,
> on the next KVM_RUN vmx_vcpu_run would nevertheless try to sync
> the VMCS12 to the shadow VMCS which has since been freed.
>
> This causes a vmptrld of a NULL pointer on my machime, but Jan reports
> the host to hang altogether. Let's see how much this trivial patch fixes.
>
> Reported-by: Jan Kiszka <[email protected]>
> Cc: Liran Alon <[email protected]>
> Cc: [email protected]
> Signed-off-by: Paolo Bonzini <[email protected]>

Hi all,

Do we need to backport the fix also to stable 4.14? It applies
cleanly and compiles fine.

Regards,
Jack Wang

2019-07-29 09:30:24

by Jack Wang

[permalink] [raw]
Subject: Re: [PATCH stable-4.19 1/2] KVM: nVMX: do not use dangling shadow VMCS after guest reset

Paolo Bonzini <[email protected]> 于2019年7月29日周一 上午11:10写道:
>
> On 29/07/19 10:58, Jack Wang wrote:
> > Vitaly Kuznetsov <[email protected]> 于2019年7月25日周四 下午3:29写道:
> >>
> >> From: Paolo Bonzini <[email protected]>
> >>
> >> [ Upstream commit 88dddc11a8d6b09201b4db9d255b3394d9bc9e57 ]
> >>
> >> If a KVM guest is reset while running a nested guest, free_nested will
> >> disable the shadow VMCS execution control in the vmcs01. However,
> >> on the next KVM_RUN vmx_vcpu_run would nevertheless try to sync
> >> the VMCS12 to the shadow VMCS which has since been freed.
> >>
> >> This causes a vmptrld of a NULL pointer on my machime, but Jan reports
> >> the host to hang altogether. Let's see how much this trivial patch fixes.
> >>
> >> Reported-by: Jan Kiszka <[email protected]>
> >> Cc: Liran Alon <[email protected]>
> >> Cc: [email protected]
> >> Signed-off-by: Paolo Bonzini <[email protected]>
> >
> > Hi all,
> >
> > Do we need to backport the fix also to stable 4.14? It applies
> > cleanly and compiles fine.
>
> The reproducer required newer kernels that support KVM_GET_NESTED_STATE
> and KVM_SET_NESTED_STATE, so it would be hard to test it. However, the
> patch itself should be safe.
>
> Paolo

Thanks Paolo for confirmation. I'm asking because we had one incident
in our production with 4.14.129 kernel,
System is Skylake Gold cpu, first kvm errors, host hung afterwards

kernel: [1186161.091160] kvm: vmptrld (null)/6bfc00000000 failed
kernel: [1186161.091537] kvm: vmclear fail: (null)/6bfc00000000
kernel: [1186186.490300] watchdog: BUG: soft lockup - CPU#54 stuck for
23s! [qemu:16639]

Hi Sasha, hi Greg,

Would be great if you can pick this patch also to 4.14 kernel.

Best regards,
Jack Wang

2019-07-29 12:49:21

by Paolo Bonzini

[permalink] [raw]
Subject: Re: [PATCH stable-4.19 1/2] KVM: nVMX: do not use dangling shadow VMCS after guest reset

On 29/07/19 10:58, Jack Wang wrote:
> Vitaly Kuznetsov <[email protected]> 于2019年7月25日周四 下午3:29写道:
>>
>> From: Paolo Bonzini <[email protected]>
>>
>> [ Upstream commit 88dddc11a8d6b09201b4db9d255b3394d9bc9e57 ]
>>
>> If a KVM guest is reset while running a nested guest, free_nested will
>> disable the shadow VMCS execution control in the vmcs01. However,
>> on the next KVM_RUN vmx_vcpu_run would nevertheless try to sync
>> the VMCS12 to the shadow VMCS which has since been freed.
>>
>> This causes a vmptrld of a NULL pointer on my machime, but Jan reports
>> the host to hang altogether. Let's see how much this trivial patch fixes.
>>
>> Reported-by: Jan Kiszka <[email protected]>
>> Cc: Liran Alon <[email protected]>
>> Cc: [email protected]
>> Signed-off-by: Paolo Bonzini <[email protected]>
>
> Hi all,
>
> Do we need to backport the fix also to stable 4.14? It applies
> cleanly and compiles fine.

The reproducer required newer kernels that support KVM_GET_NESTED_STATE
and KVM_SET_NESTED_STATE, so it would be hard to test it. However, the
patch itself should be safe.

Paolo

2019-07-29 12:54:31

by Paolo Bonzini

[permalink] [raw]
Subject: Re: [PATCH stable-4.19 1/2] KVM: nVMX: do not use dangling shadow VMCS after guest reset

On 29/07/19 11:29, Jack Wang wrote:
> Thanks Paolo for confirmation. I'm asking because we had one incident
> in our production with 4.14.129 kernel,
> System is Skylake Gold cpu, first kvm errors, host hung afterwards
>
> kernel: [1186161.091160] kvm: vmptrld (null)/6bfc00000000 failed
> kernel: [1186161.091537] kvm: vmclear fail: (null)/6bfc00000000
> kernel: [1186186.490300] watchdog: BUG: soft lockup - CPU#54 stuck for
> 23s! [qemu:16639]
>
> Hi Sasha, hi Greg,
>
> Would be great if you can pick this patch also to 4.14 kernel.

Acked-by: Paolo Bonzini <[email protected]>

2019-07-29 15:32:51

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: [PATCH stable-4.19 1/2] KVM: nVMX: do not use dangling shadow VMCS after guest reset

On Mon, Jul 29, 2019 at 11:30:19AM +0200, Paolo Bonzini wrote:
> On 29/07/19 11:29, Jack Wang wrote:
> > Thanks Paolo for confirmation. I'm asking because we had one incident
> > in our production with 4.14.129 kernel,
> > System is Skylake Gold cpu, first kvm errors, host hung afterwards
> >
> > kernel: [1186161.091160] kvm: vmptrld (null)/6bfc00000000 failed
> > kernel: [1186161.091537] kvm: vmclear fail: (null)/6bfc00000000
> > kernel: [1186186.490300] watchdog: BUG: soft lockup - CPU#54 stuck for
> > 23s! [qemu:16639]
> >
> > Hi Sasha, hi Greg,
> >
> > Would be great if you can pick this patch also to 4.14 kernel.
>
> Acked-by: Paolo Bonzini <[email protected]>

Ok, now queued up, thanks.

greg k-h