Date: Wed, 14 Jun 2017 15:20:40 +0200
From: Radim Krčmář
To: Wanpeng Li
Cc: "linux-kernel@vger.kernel.org", kvm, Paolo Bonzini, Wanpeng Li
Subject: Re: [PATCH 3/4] KVM: async_pf: Force a nested vmexit if the injected #PF is async_pf
Message-ID: <20170614132039.GC1276@potion>
References: <1497334094-6982-1-git-send-email-wanpeng.li@hotmail.com> <1497334094-6982-4-git-send-email-wanpeng.li@hotmail.com> <20170613185522.GA29537@potion> <20170614125221.GA2343@potion>

2017-06-14 21:02+0800, Wanpeng Li:
> 2017-06-14 20:52 GMT+08:00 Radim Krčmář:
> > 2017-06-14 09:07+0800, Wanpeng Li:
> >> 2017-06-14 2:55 GMT+08:00 Radim Krčmář:
> >> > Using vcpu->arch.cr2 is suspicious as VMX doesn't update CR2 on VM
> >> > exits; isn't this going to change the CR2 visible in L2 guest after a
> >> > nested VM entry?
> >>
> >> Sorry, I don't fully understand the question. As you know this
> >> vcpu->arch.cr2 which includes token is set before async pf injection,
> >
> > Yes, I'm thinking that setting vcpu->arch.cr2 is a mistake in this case.
> >
> >> and L1 will intercept it from EXIT_QUALIFICATION during nested vmexit,
> >
> > Right, so we do not need to have the token in CR2, because L1 is not
> > going to look at it.
> >
> >> why it can change the CR2 visible in L2 guest after a nested VM entry?
> >
> > Sorry, the situation is too convoluted to be expressed in one sentence:
> >
> >  1) L2 is running with CR2 = L2CR2
> >  2) APF for L1 has completed
> >  3) VMX exits (say, unrelated EXTERNAL_INTERRUPT) and L0 stores L2CR2 in
> >     vcpu->arch.cr2
> >  4) L0 KVM wants to inject APF and sets vcpu->arch.cr2 = APFT
> >  5) L0 KVM does a nested VM exit to L1, EXIT_QUALIFICATION = APFT
> >  6) L0 KVM enters L1 with CR2 = vcpu->arch.cr2 = APFT
> >  7) L1 stores APFT as L2's CR2
> >  8) L1 handles APF, maybe reschedules, but eventually comes back to this
> >     L2's thread
> >  9) after some time, L1 enters L2 with CR2 = APFT
> > 10) L2 is running with CR2 = APFT
> >
> > The original L2CR2 is lost and we'd introduce a bug if L2 wanted to look
> > at it, e.g. if it was in the process of handling its #PF.
>
> Good point. What's your proposal? :)

Get rid of async_pf. :)

Optimal solutions aside, I think it would be best to add a new injection
function for APF: one that injects a normal #PF for non-nested guests
and directly triggers a #PF VM exit otherwise, called from
kvm_arch_async_page_*present().

Do you think that just moving the nested VM exit from
nested_vmx_check_exception() would work?

Thanks.