Date: Fri, 16 Jun 2017 17:38:32 +0200
From: Radim Krčmář
To: Wanpeng Li
Cc: "linux-kernel@vger.kernel.org", kvm, Paolo Bonzini, Wanpeng Li
Subject: Re: [PATCH v2 3/4] KVM: async_pf: Force a nested vmexit if the injected #PF is async_pf
Message-ID: <20170616153832.GA5980@potion>
References: <1497493615-18512-1-git-send-email-wanpeng.li@hotmail.com>
 <1497493615-18512-4-git-send-email-wanpeng.li@hotmail.com>
 <20170616133702.GA6360@potion>

2017-06-16 22:24+0800, Wanpeng Li:
> 2017-06-16 21:37 GMT+08:00 Radim Krčmář:
> > 2017-06-14 19:26-0700, Wanpeng Li:
> >> From: Wanpeng Li
> >>
> >> Add an async_page_fault field to vcpu->arch.exception to identify an async
> >> page fault, and constructs the expected vm-exit information fields. Force
> >> a nested VM exit from nested_vmx_check_exception() if the injected #PF
> >> is async page fault.
> >>
> >> Cc: Paolo Bonzini
> >> Cc: Radim Krčmář
> >> Signed-off-by: Wanpeng Li
> >> ---
> >> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> >> @@ -452,7 +452,11 @@ EXPORT_SYMBOL_GPL(kvm_complete_insn_gp);
> >>  void kvm_inject_page_fault(struct kvm_vcpu *vcpu, struct x86_exception *fault)
> >>  {
> >>      ++vcpu->stat.pf_guest;
> >> -    vcpu->arch.cr2 = fault->address;
> >> +    vcpu->arch.exception.async_page_fault = fault->async_page_fault;
> >
> > I think we need to act as if arch.exception.async_page_fault was not
> > pending in kvm_vcpu_ioctl_x86_get_vcpu_events(). Otherwise, if we
> > migrate with pending async_page_fault exception, we'd inject it as a
> > normal #PF, which could confuse/kill the nested guest.
> >
> > And kvm_vcpu_ioctl_x86_set_vcpu_events() should clean the flag for
> > sanity as well.
>
> Do you mean we should add a field like async_page_fault to
> kvm_vcpu_events::exception, then saves arch.exception.async_page_fault
> to events->exception.async_page_fault through KVM_GET_VCPU_EVENTS and
> restores events->exception.async_page_fault to
> arch.exception.async_page_fault through KVM_SET_VCPU_EVENTS?

No, I thought we could get away with a disgusting hack of hiding the
exception from userspace, which would work for migration, but not if
local userspace did KVM_GET_VCPU_EVENTS and KVM_SET_VCPU_EVENTS ...

Extending the userspace interface would work, but I'd do it as a last
resort, after all conservative solutions have failed.
async_pf migration is very crude, so exposing the exception is just an
ugly workaround for the local case.  Adding the flag would also require
userspace configuration of async_pf features for the guest to keep
compatibility.

I see two options that might be simpler than adding the userspace flag:

 1) do the nested VM exit sooner, at the place where we now queue #PF,
 2) queue the #PF later, save the async_pf in some intermediate
    structure and consume it at the place where you proposed the nested
    VM exit.
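
To make the "hiding" hack mentioned above concrete, here is a rough
userspace model in plain C.  It is not kernel code and not a proposed
patch: the model_* structs and functions are hypothetical, simplified
stand-ins for vcpu->arch.exception, the exception part of
struct kvm_vcpu_events, and the two ioctl handlers.  Only the
async_page_fault flag itself comes from the patch under discussion.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define PF_VECTOR 14

/* Hypothetical, simplified stand-in for vcpu->arch.exception. */
struct model_exception {
    bool pending;
    uint8_t nr;
    bool async_page_fault;  /* flag added by the patch under review */
};

/* Hypothetical stand-in for the exception part of struct kvm_vcpu_events. */
struct model_events_exception {
    uint8_t injected;
    uint8_t nr;
};

struct model_vcpu {
    struct model_exception exception;
};

/* GET side: report a pending async #PF as if nothing were pending. */
static void model_get_vcpu_events(const struct model_vcpu *vcpu,
                                  struct model_events_exception *ev)
{
    assert(vcpu && ev);
    memset(ev, 0, sizeof(*ev));
    if (vcpu->exception.pending && !vcpu->exception.async_page_fault) {
        ev->injected = 1;
        ev->nr = vcpu->exception.nr;
    }
}

/*
 * SET side: userspace has no way to describe an async #PF, so whatever
 * it restores is treated as an ordinary exception and the flag is cleared.
 */
static void model_set_vcpu_events(struct model_vcpu *vcpu,
                                  const struct model_events_exception *ev)
{
    assert(vcpu && ev);
    vcpu->exception.pending = ev->injected != 0;
    vcpu->exception.nr = ev->nr;
    vcpu->exception.async_page_fault = false;
}
```

In this model a vCPU with a pending async #PF reads back as having no
exception at all, so the destination of a migration never injects it as
a normal #PF.  It also shows the local-case problem: a GET followed by a
SET on the same vCPU overwrites the pending state and silently drops the
exception.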