Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752252AbdCBMa7 (ORCPT ); Thu, 2 Mar 2017 07:30:59 -0500 Received: from mail-wr0-f172.google.com ([209.85.128.172]:35699 "EHLO mail-wr0-f172.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751895AbdCBMay (ORCPT ); Thu, 2 Mar 2017 07:30:54 -0500 MIME-Version: 1.0 In-Reply-To: <20170301183150.GB17506@potion> References: <20170301183150.GB17506@potion> From: Wanpeng Li Date: Thu, 2 Mar 2017 20:28:18 +0800 Message-ID: Subject: Re: kvm: WARNING in nested_vmx_vmexit To: =?UTF-8?B?UmFkaW0gS3LEjW3DocWZ?= Cc: Dmitry Vyukov , Paolo Bonzini , KVM list , LKML , Jim Mattson , Steve Rutherford , Haozhong Zhang , Xiao Guangrong , syzkaller Content-Type: text/plain; charset=UTF-8 Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Transfer-Encoding: 8bit X-MIME-Autoconverted: from quoted-printable to 8bit by mail.home.local id v22CV3WO019577 Content-Length: 4426 Lines: 105 2017-03-02 2:31 GMT+08:00 Radim Krčmář : > 2017-03-01 10:44+0100, Dmitry Vyukov: >> On Wed, Mar 1, 2017 at 7:13 AM, Wanpeng Li wrote: >>> 2017-02-28 20:15 GMT+08:00 Dmitry Vyukov : >>>> Hello, >>>> >>>> The following program triggers WARNING in nested_vmx_vmexit: >>>> https://gist.githubusercontent.com/dvyukov/16b946d7dc703bb07b9b933f12fb8a6e/raw/dac60506feb8dd9dd22828c486e46ee8a5e30f13/gistfile1.txt >>>> >>>> >>>> ------------[ cut here ]------------ >>>> WARNING: CPU: 1 PID: 27742 at arch/x86/kvm/vmx.c:11029 >>>> nested_vmx_vmexit+0x5c35/0x74d0 arch/x86/kvm/vmx.c:11029 >>>> CPU: 1 PID: 27742 Comm: a.out Not tainted 4.10.0+ #229 >>>> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011 >>>> Call Trace: >>>> __dump_stack lib/dump_stack.c:15 [inline] >>>> dump_stack+0x2ee/0x3ef lib/dump_stack.c:51 >>>> panic+0x1fb/0x412 kernel/panic.c:179 >>>> __warn+0x1c4/0x1e0 kernel/panic.c:540 >>>> warn_slowpath_null+0x2c/0x40 kernel/panic.c:583 >>>> nested_vmx_vmexit+0x5c35/0x74d0 arch/x86/kvm/vmx.c:11029 >>>> vmx_leave_nested arch/x86/kvm/vmx.c:11136 [inline] >>>> vmx_set_msr+0x1565/0x1910 arch/x86/kvm/vmx.c:3324 >>>> kvm_set_msr+0xd4/0x170 arch/x86/kvm/x86.c:1099 >>>> do_set_msr+0x11e/0x190 arch/x86/kvm/x86.c:1128 >>>> __msr_io arch/x86/kvm/x86.c:2577 [inline] >>>> msr_io+0x24b/0x450 arch/x86/kvm/x86.c:2614 >>>> kvm_arch_vcpu_ioctl+0x35b/0x46a0 arch/x86/kvm/x86.c:3497 >>>> kvm_vcpu_ioctl+0x232/0x1120 arch/x86/kvm/../../../virt/kvm/kvm_main.c:2721 >>>> vfs_ioctl fs/ioctl.c:43 [inline] >>>> do_vfs_ioctl+0x1bf/0x1790 fs/ioctl.c:683 >>>> SYSC_ioctl fs/ioctl.c:698 [inline] >>>> SyS_ioctl+0x8f/0xc0 fs/ioctl.c:689 >>>> entry_SYSCALL_64_fastpath+0x1f/0xc2 >>>> RIP: 0033:0x451229 >>>> RSP: 002b:00007fc1e7ebec98 EFLAGS: 00000246 ORIG_RAX: 0000000000000010 >>>> RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 0000000000451229 >>>> RDX: 0000000020aecfe8 RSI: 000000004008ae89 RDI: 0000000000000008 >>>> RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000 >>>> R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000 >>>> R13: 0000000000000000 R14: 00007fc1e7ebf9c0 R15: 00007fc1e7ebf700 >>>> >>>> >>>> On commit e5d56efc97f8240d0b5d66c03949382b6d7e5570 >>> >>> I git checkout to this commit for linus tree, however the line number >>> 11029 doesn't match any warning, and there are several warnings in the >>> function nested_vmx_vmexit(), could you point out which one? >> >> Ah sorry, I have some local diff. >> So now I am on 86292b33d4b79ee03e2f43ea0381ef85f077c760 and here is my diff: >> https://gist.githubusercontent.com/dvyukov/6bb88c6e3584c3fd989df01386efcb74/raw/1ae08a3a682ad9467de5eb7fb96d457db514f9d2/gistfile1.txt >> >> The warning is: >> >> /* trying to cancel vmlaunch/vmresume is a bug */ >> WARN_ON_ONCE(vmx->nested.nested_run_pending); > > I can't reproduce this one, but it is there: > > The warning gets thrown while doing > > (*(uint32_t*)0x20aecfe8 = (uint32_t)0x1); > (*(uint32_t*)0x20aecfec = (uint32_t)0x0); > (*(uint32_t*)0x20aecff0 = (uint32_t)0x3a); > (*(uint32_t*)0x20aecff4 = (uint32_t)0x0); > (*(uint64_t*)0x20aecff8 = (uint64_t)0x0); > r[29] = syscall(__NR_ioctl, r[4], 0x4008ae89ul, > 0x20aecfe8ul, 0, 0, 0, 0, 0, 0); > > i.e. KVM_SET_MSR ioctl with > > struct kvm_msrs { > .nmsrs = 1, > .pad = 0, > .entries = { > {.index = MSR_IA32_FEATURE_CONTROL, > .reserved = 0, > .data = 0} > } > } > > and the relevant setter is > > case MSR_IA32_FEATURE_CONTROL: > if (!vmx_feature_control_msr_valid(vcpu, data) || > (to_vmx(vcpu)->msr_ia32_feature_control & > FEATURE_CONTROL_LOCKED && !msr_info->host_initiated)) > return 1; > vmx->msr_ia32_feature_control = data; > if (msr_info->host_initiated && data == 0) > vmx_leave_nested(vcpu); > break; > > The condition is wrong as it is no longer only VMX bits in data, so VMX > can get disabled while the data is nonzero. KVM might not need to force > nested VM exit when userspace goes crazy, so the condition would just go > away ... need to check. If the bits just influence the new VMXON execution instead of currently running hypervisor? Regards, Wanpeng Li