Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67;
Subject: Re: [PATCH] KVM: x86: clean up conditions for asynchronous page fault
 handling
To:     =?UTF-8?B?UmFkaW0gS3LEjW3DocWZ?= <rkrcmar@redhat.com>
Cc:     linux-kernel@vger.kernel.org, kvm@vger.kernel.org,
        Wanpeng Li <wanpengli@tencent.com>
References: <1560423812-51166-1-git-send-email-pbonzini@redhat.com>
 <20190613171220.GA24873@flask>
From:   Paolo Bonzini <pbonzini@redhat.com>
Message-ID: <309cf0b7-f831-51da-7c2c-3333cd4c4036@redhat.com>
Date:   Thu, 13 Jun 2019 19:27:05 +0200
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101
 Thunderbird/60.6.1
MIME-Version: 1.0
In-Reply-To: <20190613171220.GA24873@flask>
Content-Type: text/plain; charset=utf-8
Content-Language: en-US
Content-Transfer-Encoding: 8bit
Sender: linux-kernel-owner@vger.kernel.org
Precedence: bulk

On 13/06/19 19:12, Radim Krčmář wrote:
> 2019-06-13 13:03+0200, Paolo Bonzini:
>> Even when asynchronous page fault is disabled, KVM does not want to pause
>> the host if a guest triggers a page fault; instead it will put it into
>> an artificial HLT state that allows running other host processes while
>> allowing interrupt delivery into the guest.
>>
>> However, the way this feature is triggered is a bit confusing.
>> First, it is not used for page faults while a nested guest is
>> running: but this is not an issue since the artificial halt
>> is completely invisible to the guest, either L1 or L2.  Second,
>> it is used even if kvm_halt_in_guest() returns true; in this case,
>> the guest probably should not pay the additional latency cost of the
>> artificial halt, and thus we should handle the page fault in a
>> completely synchronous way.
> 
> The same reasoning would apply to kvm_mwait_in_guest(), so I would
> disable APF with it as well.

True, on the other hand it's not a very sensible condition to have
kvm_mwait_in_guest but not kvm_halt_in_guest.

>> By introducing a new function kvm_can_deliver_async_pf, this patch
>> commonizes the code that chooses whether to deliver an async page fault
>> (kvm_arch_async_page_not_present) and the code that chooses whether a
>> page fault should be handled synchronously (kvm_can_do_async_pf).
>>
>> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
>> ---
>> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
>> @@ -9775,6 +9775,36 @@ static int apf_get_user(struct kvm_vcpu *vcpu, u32 *val)
>> +bool kvm_can_do_async_pf(struct kvm_vcpu *vcpu)
>> +{
>> +	if (unlikely(!lapic_in_kernel(vcpu) ||
>> +		     kvm_event_needs_reinjection(vcpu) ||
>> +		     vcpu->arch.exception.pending))
>> +		return false;
>> +
>> +	if (kvm_hlt_in_guest(vcpu->kvm) && !kvm_can_deliver_async_pf(vcpu))
>> +		return false;
>> +
>> +	/*
>> +	 * If interrupts are off we cannot even use an artificial
>> +	 * halt state.
> 
> Can't we?  The artificial halt state would be canceled by the host page
> fault handler.

hlt allows interrupts to be injected, which of course is not possible in
an interrupt-off region.  But, the only difference is that halt_poll is
not obeyed for synchronous page fault handling; either way, the vCPU
thread stays in the kernel but it is scheduled out while the page fault
is being handled.  This is only for lapic_in_kernel so hlt does not
leave KVM_RUN.

>> +	 */
>> +	return kvm_x86_ops->interrupt_allowed(vcpu);
>> @@ -9783,19 +9813,26 @@ void kvm_arch_async_page_not_present(struct kvm_vcpu *vcpu,
>>  	trace_kvm_async_pf_not_present(work->arch.token, work->gva);
>>  	kvm_add_async_pf_gfn(vcpu, work->arch.gfn);
>>  
>> -	if (!(vcpu->arch.apf.msr_val & KVM_ASYNC_PF_ENABLED) ||
>> -	    (vcpu->arch.apf.send_user_only &&
>> -	     kvm_x86_ops->get_cpl(vcpu) == 0))
>> +	if (!kvm_can_deliver_async_pf(vcpu) ||
>> +	    apf_put_user(vcpu, KVM_PV_REASON_PAGE_NOT_PRESENT)) {
>> +		/*
>> +		 * It is not possible to deliver a paravirtualized asynchronous
>> +		 * page fault, but putting the guest in an artificial halt state
>> +		 * can be beneficial nevertheless: if an interrupt arrives, we
>> +		 * can deliver it timely and perhaps the guest will schedule
>> +		 * another process.  When the instruction that triggered a page
>> +		 * fault is retried, hopefully the page will be ready in the host.
>> +		 */
>>  		kvm_make_request(KVM_REQ_APF_HALT, vcpu);
> 
> A return is missing here, to prevent the delivery of PV APF.
> (I'd probably keep the if/else.)

Fixed in v2 (keeping the if/else but swapping the two arms).

Paolo

> Thanks.
> 
>> -	else if (!apf_put_user(vcpu, KVM_PV_REASON_PAGE_NOT_PRESENT)) {
>> -		fault.vector = PF_VECTOR;
>> -		fault.error_code_valid = true;
>> -		fault.error_code = 0;
>> -		fault.nested_page_fault = false;
>> -		fault.address = work->arch.token;
>> -		fault.async_page_fault = true;
>> -		kvm_inject_page_fault(vcpu, &fault);
> 
>>  	}
>> +
>> +	fault.vector = PF_VECTOR;
>> +	fault.error_code_valid = true;
>> +	fault.error_code = 0;
>> +	fault.nested_page_fault = false;
>> +	fault.address = work->arch.token;
>> +	fault.async_page_fault = true;
>> +	kvm_inject_page_fault(vcpu, &fault);
>>  }
>>  
>>  void kvm_arch_async_page_present(struct kvm_vcpu *vcpu,
>> -- 
>> 1.8.3.1
>>