Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752940AbdHBKsb (ORCPT ); Wed, 2 Aug 2017 06:48:31 -0400 Received: from mail-pg0-f66.google.com ([74.125.83.66]:38866 "EHLO mail-pg0-f66.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751192AbdHBKs3 (ORCPT ); Wed, 2 Aug 2017 06:48:29 -0400 From: Wanpeng Li X-Google-Original-From: Wanpeng Li To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org Cc: Paolo Bonzini , =?UTF-8?q?Radim=20Kr=C4=8Dm=C3=A1=C5=99?= , Wanpeng Li Subject: [PATCH v3] KVM: nVMX: Fix attempting to emulate "Acknowledge interrupt on exit" when there is no interrupt which L1 requires to inject to L2 Date: Wed, 2 Aug 2017 03:48:23 -0700 Message-Id: <1501670903-3368-1-git-send-email-wanpeng.li@hotmail.com> X-Mailer: git-send-email 2.7.4 MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 2661 Lines: 69 From: Wanpeng Li ------------[ cut here ]------------ WARNING: CPU: 5 PID: 2288 at arch/x86/kvm/vmx.c:11124 nested_vmx_vmexit+0xd64/0xd70 [kvm_intel] CPU: 5 PID: 2288 Comm: qemu-system-x86 Not tainted 4.13.0-rc2+ #7 RIP: 0010:nested_vmx_vmexit+0xd64/0xd70 [kvm_intel] Call Trace: vmx_check_nested_events+0x131/0x1f0 [kvm_intel] ? vmx_check_nested_events+0x131/0x1f0 [kvm_intel] kvm_arch_vcpu_ioctl_run+0x5dd/0x1be0 [kvm] ? vmx_vcpu_load+0x1be/0x220 [kvm_intel] ? kvm_arch_vcpu_load+0x62/0x230 [kvm] kvm_vcpu_ioctl+0x340/0x700 [kvm] ? kvm_vcpu_ioctl+0x340/0x700 [kvm] ? __fget+0xfc/0x210 do_vfs_ioctl+0xa4/0x6a0 ? __fget+0x11d/0x210 SyS_ioctl+0x79/0x90 do_syscall_64+0x8f/0x750 ? trace_hardirqs_on_thunk+0x1a/0x1c entry_SYSCALL64_slow_path+0x25/0x25 This can be reproduced by booting L1 guest w/ 'noapic' grub parameter, which means that tells the kernel to not make use of any IOAPICs that may be present in the system. Actually external_intr variable in nested_vmx_vmexit() is the req_int_win variable passed from vcpu_enter_guest() which means that the L0's userspace requests an irq window. I observed the scenario (!kvm_cpu_has_interrupt(vcpu) && L0's userspace reqeusts an irq window) is true, so there is no interrupt which L1 requires to inject to L2, we should not attempt to emualte "Acknowledge interrupt on exit" for the irq window requirement in this scenario. SDM says that with acknowledge interrupt on exit, bit 31 of the VM-exit interrupt information (valid interrupt) is always set to 1 on EXIT_REASON_EXTERNAL_INTERRUPT. We don't want to break hypervisors expecting an interrupt in that case, so we should do a userspace VM exit when the window is open and then inject the userspace interrupt with a VM exit. Cc: Paolo Bonzini Cc: Radim Krčmář Signed-off-by: Wanpeng Li --- v2 -> v3: * request an irq window and don't nested vmexit v1 -> v2: * update patch description * check nested_exit_intr_ack_set() first arch/x86/kvm/vmx.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c index 80b20e8..9ef2ec3 100644 --- a/arch/x86/kvm/vmx.c +++ b/arch/x86/kvm/vmx.c @@ -10761,7 +10761,8 @@ static int vmx_check_nested_events(struct kvm_vcpu *vcpu, bool external_intr) return 0; } - if ((kvm_cpu_has_interrupt(vcpu) || external_intr) && + if ((kvm_cpu_has_interrupt(vcpu) || + (external_intr && !nested_exit_intr_ack_set(vcpu))) && nested_exit_on_intr(vcpu)) { if (vmx->nested.nested_run_pending) return -EBUSY; -- 2.7.4