Date: Thu, 14 Sep 2017 18:52:36 +0200
From: Radim Krčmář
To: Wanpeng Li
Cc: linux-kernel@vger.kernel.org, kvm@vger.kernel.org, Paolo Bonzini, Wanpeng Li
Subject: Re: [PATCH v2] KVM: async_pf: Fix #DF due to inject "Page not Present" and "Page Ready" exceptions simultaneously
Message-ID: <20170914165236.GB23415@flask>
References: <1505386456-126144-1-git-send-email-wanpeng.li@hotmail.com>
In-Reply-To: <1505386456-126144-1-git-send-email-wanpeng.li@hotmail.com>

2017-09-14 03:54-0700, Wanpeng Li:
> From: Wanpeng Li
>
> qemu-system-x86-8600 [004] d..1 7205.687530: kvm_entry: vcpu 2
> qemu-system-x86-8600 [004] .... 7205.687532: kvm_exit: reason EXCEPTION_NMI rip 0xffffffffa921297d info ffffeb2c0e44e018 80000b0e
> qemu-system-x86-8600 [004] .... 7205.687532: kvm_page_fault: address ffffeb2c0e44e018 error_code 0
> qemu-system-x86-8600 [004] .... 7205.687620: kvm_try_async_get_page: gva = 0xffffeb2c0e44e018, gfn = 0x427e4e
> qemu-system-x86-8600 [004] .N.. 7205.687628: kvm_async_pf_not_present: token 0x8b002 gva 0xffffeb2c0e44e018
> kworker/4:2-7814 [004] .... 7205.687655: kvm_async_pf_completed: gva 0xffffeb2c0e44e018 address 0x7fcc30c4e000
> qemu-system-x86-8600 [004] .... 7205.687703: kvm_async_pf_ready: token 0x8b002 gva 0xffffeb2c0e44e018
> qemu-system-x86-8600 [004] d..1 7205.687711: kvm_entry: vcpu 2
>
> After running a memory-intensive workload in the guest, I caught the kworker
> completing the GUP so quickly that it queues a "Page Ready" #PF exception
> after the "Page not Present" exception and before the next vmentry, as in the
> trace above, which results in a #DF being injected into the guest.

The #DF feature can bite us in other cases as well, e.g. when emulating
an instruction that throws #GP/#UD. Can't we replace all non-#PF
exceptions with the PV #PF? Doing so should be wrong only for trap
exceptions and we currently just override them anyway, so we wouldn't
regress. :)

> This patch fixes it by clearing the queue for "Page not Present" if "Page Ready"
> occurs before the next vmentry, since the GUP has already got the required page
> and the shadow page table has already been fixed by the "Page Ready" handler.
>
> Cc: Paolo Bonzini
> Cc: Radim Krčmář
> Signed-off-by: Wanpeng Li
> ---
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> @@ -8653,15 +8661,26 @@ void kvm_arch_async_page_present(struct kvm_vcpu *vcpu,
>  	kvm_del_async_pf_gfn(vcpu, work->arch.gfn);
>  	trace_kvm_async_pf_ready(work->arch.token, work->gva);
>
> -	if ((vcpu->arch.apf.msr_val & KVM_ASYNC_PF_ENABLED) &&
> -	    !apf_put_user(vcpu, KVM_PV_REASON_PAGE_READY)) {
> -		fault.vector = PF_VECTOR;
> -		fault.error_code_valid = true;
> -		fault.error_code = 0;
> -		fault.nested_page_fault = false;
> -		fault.address = work->arch.token;
> -		fault.async_page_fault = true;
> -		kvm_inject_page_fault(vcpu, &fault);
> +	if (vcpu->arch.apf.msr_val & KVM_ASYNC_PF_ENABLED) {
> +		if (!apf_get_user(vcpu, &val)) {

I removed one indentation level when applying by merging these two
conditions.

> +			if (val == KVM_PV_REASON_PAGE_NOT_PRESENT &&
> +			    vcpu->arch.exception.pending &&
> +			    vcpu->arch.exception.nr == PF_VECTOR &&
> +			    !apf_put_user(vcpu, 0)) {
> +				vcpu->arch.exception.pending = false;

We know that vcpu->arch.exception.injected is false here, but I cleared
it too for safety, thanks.

> +				vcpu->arch.exception.nr = 0;
> +				vcpu->arch.exception.has_error_code = false;
> +				vcpu->arch.exception.error_code = 0;
> +			} else if (!apf_put_user(vcpu, KVM_PV_REASON_PAGE_READY)) {
> +				fault.vector = PF_VECTOR;
> +				fault.error_code_valid = true;
> +				fault.error_code = 0;
> +				fault.nested_page_fault = false;
> +				fault.address = work->arch.token;
> +				fault.async_page_fault = true;
> +				kvm_inject_page_fault(vcpu, &fault);
> +			}
> +		}
>  	}
>  	vcpu->arch.apf.halted = false;
>  	vcpu->arch.mp_state = KVM_MP_STATE_RUNNABLE;
> --
> 2.7.4
>
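
For anyone following the thread without the tree at hand, the race and the
fix can be approximated with a small user-space model. This is only a
sketch under stated assumptions: struct model_vcpu, the model_* helpers,
the MODEL_* constants and the double_faults counter are hypothetical
stand-ins, not KVM code, and the model reduces "#PF queued while another
#PF is still pending" to a simple counter instead of real exception
merging.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for illustration; not the kernel's definitions. */
#define MODEL_PF_VECTOR 14

enum model_apf_reason {
	MODEL_REASON_NONE = 0,
	MODEL_REASON_PAGE_NOT_PRESENT = 1,
	MODEL_REASON_PAGE_READY = 2,
};

struct model_vcpu {
	bool apf_enabled;           /* models the "async PF enabled" MSR bit */
	uint32_t shared_reason;     /* models the guest-visible reason word */
	bool exc_pending;           /* an exception is queued for next entry */
	uint8_t exc_nr;             /* vector of the queued exception */
	uint64_t exc_address;       /* token delivered with the queued #PF */
	unsigned int double_faults; /* counts #PF queued on top of a pending #PF */
};

/* Queue a #PF; a second #PF before vmentry is counted as a #DF. */
static void model_queue_pf(struct model_vcpu *v, uint64_t token)
{
	assert(v);
	if (v->exc_pending && v->exc_nr == MODEL_PF_VECTOR) {
		v->double_faults++;
		return;
	}
	v->exc_pending = true;
	v->exc_nr = MODEL_PF_VECTOR;
	v->exc_address = token;
}

/* Host notifies the guest that the faulting page is being fetched. */
static void model_page_not_present(struct model_vcpu *v, uint64_t token)
{
	if (!v->apf_enabled)
		return;
	v->shared_reason = MODEL_REASON_PAGE_NOT_PRESENT;
	model_queue_pf(v, token);
}

/*
 * Host notifies the guest that the page arrived. If the "not present"
 * #PF has not been delivered yet, cancel it instead of queueing a
 * second #PF, mirroring the idea of the patch under review.
 */
static void model_page_ready(struct model_vcpu *v, uint64_t token)
{
	if (!v->apf_enabled)
		return;
	if (v->shared_reason == MODEL_REASON_PAGE_NOT_PRESENT &&
	    v->exc_pending && v->exc_nr == MODEL_PF_VECTOR) {
		v->shared_reason = MODEL_REASON_NONE;
		v->exc_pending = false;
		v->exc_nr = 0;
		v->exc_address = 0;
		return;
	}
	v->shared_reason = MODEL_REASON_PAGE_READY;
	model_queue_pf(v, token);
}
```

Calling model_page_not_present() and then model_page_ready() with the same
token, with no vmentry in between, leaves no exception pending and the
double_faults counter at zero; without the cancellation branch the second
call would bump the counter, which corresponds to the #DF seen in the
trace.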