Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1753380AbdHWHN1 (ORCPT ); Wed, 23 Aug 2017 03:13:27 -0400 Received: from mail-wr0-f196.google.com ([209.85.128.196]:35599 "EHLO mail-wr0-f196.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1753318AbdHWHNZ (ORCPT ); Wed, 23 Aug 2017 03:13:25 -0400 Subject: Re: [PATCH v2] KVM: nVMX: Fix trying to cancel vmlauch/vmresume To: =?UTF-8?B?UmFkaW0gS3LEjW3DocWZ?= , Wanpeng Li Cc: linux-kernel@vger.kernel.org, kvm@vger.kernel.org, Wanpeng Li References: <1503065488-108870-1-git-send-email-wanpeng.li@hotmail.com> <20170821162052.GC17079@flask> From: Paolo Bonzini Message-ID: <26239412-1a0e-51f0-4382-0d48f86f8162@redhat.com> Date: Wed, 23 Aug 2017 09:13:21 +0200 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.2.1 MIME-Version: 1.0 In-Reply-To: <20170821162052.GC17079@flask> Content-Type: text/plain; charset=utf-8 Content-Language: en-US Content-Transfer-Encoding: 8bit Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 3149 Lines: 80 On 21/08/2017 18:20, Radim Krčmář wrote: > 2017-08-18 07:11-0700, Wanpeng Li: >> From: Wanpeng Li >> >> ------------[ cut here ]------------ >> WARNING: CPU: 7 PID: 3861 at /home/kernel/ssd/kvm/arch/x86/kvm//vmx.c:11299 nested_vmx_vmexit+0x176e/0x1980 [kvm_intel] >> CPU: 7 PID: 3861 Comm: qemu-system-x86 Tainted: G W OE 4.13.0-rc4+ #11 >> RIP: 0010:nested_vmx_vmexit+0x176e/0x1980 [kvm_intel] >> Call Trace: >> ? kvm_multiple_exception+0x149/0x170 [kvm] >> ? handle_emulation_failure+0x79/0x230 [kvm] >> ? load_vmcs12_host_state+0xa80/0xa80 [kvm_intel] >> ? check_chain_key+0x137/0x1e0 >> ? reexecute_instruction.part.168+0x130/0x130 [kvm] >> nested_vmx_inject_exception_vmexit+0xb7/0x100 [kvm_intel] >> ? nested_vmx_inject_exception_vmexit+0xb7/0x100 [kvm_intel] >> vmx_queue_exception+0x197/0x300 [kvm_intel] >> kvm_arch_vcpu_ioctl_run+0x1b0c/0x2c90 [kvm] >> ? kvm_arch_vcpu_runnable+0x220/0x220 [kvm] >> ? preempt_count_sub+0x18/0xc0 >> ? restart_apic_timer+0x17d/0x300 [kvm] >> ? kvm_lapic_restart_hv_timer+0x37/0x50 [kvm] >> ? kvm_arch_vcpu_load+0x1d8/0x350 [kvm] >> kvm_vcpu_ioctl+0x4e4/0x910 [kvm] >> ? kvm_vcpu_ioctl+0x4e4/0x910 [kvm] >> ? kvm_dev_ioctl+0xbe0/0xbe0 [kvm] >> >> The flag "nested_run_pending", which can override the decision of which should run >> next, L1 or L2. nested_run_pending=1 means that we *must* run L2 next, not L1. This >> is necessary in particular when L1 did a VMLAUNCH of L2 and therefore expects L2 to >> be run (and perhaps be injected with an event it specified, etc.). Nested_run_pending >> is especially intended to avoid switching to L1 in the injection decision-point. >> >> I catch this in the queue exception path, this patch fixes it by requesting >> an immediate VM exit from L2 and keeping the exception for L1 pending for a >> subsequent nested VM exit. >> >> Cc: Paolo Bonzini >> Cc: Radim Krčmář >> Signed-off-by: Wanpeng Li >> --- >> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c >> @@ -6356,8 +6356,8 @@ static int inject_pending_event(struct kvm_vcpu *vcpu, bool req_int_win) >> kvm_update_dr7(vcpu); >> } > > Hm, we shouldn't execute the code above if exception won't be injected. True, the code should have been when the exception is injected first, not here. But it's fine since in both cases (set RF and clear GD) it's idempotent. >> >> - kvm_x86_ops->queue_exception(vcpu); >> - return 0; > > vmx_complete_interrupts() assumes that the exception is always injected, > so it would be dropped by kvm_clear_exception_queue(). That's a separate bug. We need to separate exception.pending from exception.injected, similar to NMIs (what is now exception.pending would become exception.injected). For now, this patch would be better than nothing, but getting the right fix for 4.14 would be nice too. Paolo > I'm starting to wonder whether getting rid of nested_run_pending > wouldn't be nicer. > >> + r = kvm_x86_ops->queue_exception(vcpu); >> + return r; >> } >> >> if (vcpu->arch.nmi_injected) { >> -- >> 2.7.4 >> >