Subject: Re: [PATCH v2 1/3] kvm: svm: Add support for additional SVM NPF error codes
To: Brijesh Singh
Cc: kvm@vger.kernel.org, thomas lendacky, rkrcmar@redhat.com, joro@8bytes.org, x86@kernel.org, linux-kernel@vger.kernel.org, mingo@redhat.com, hpa@zytor.com, tglx@linutronix.de, bp@suse.de
From: Paolo Bonzini
Message-ID: <85670631-0e42-7fd3-6d2c-29be2f91b38a@redhat.com>
Date: Fri, 4 Aug 2017 16:05:45 +0200
X-Mailing-List: linux-kernel@vger.kernel.org

On 04/08/2017 02:30, Brijesh Singh wrote:
>
>
> On 8/2/17 5:42 AM, Paolo Bonzini wrote:
>> On 01/08/2017 15:36,
Brijesh Singh wrote:
>>>> The flow is:
>>>>
>>>>    hardware walks page table; L2 page table points to read only memory
>>>>    -> pf_interception (code =
>>>>    -> kvm_handle_page_fault (need_unprotect = false)
>>>>    -> kvm_mmu_page_fault
>>>>    -> paging64_page_fault (for example)
>>>>       -> try_async_pf
>>>>          map_writable set to false
>>>>       -> paging64_fetch(write_fault = true, map_writable = false,
>>>> prefault = false)
>>>>          -> mmu_set_spte(speculative = false, host_writable = false,
>>>> write_fault = true)
>>>>             -> set_spte
>>>>                mmu_need_write_protect returns true
>>>>                return true
>>>>             write_fault == true -> set emulate = true
>>>>             return true
>>>>          return true
>>>>       return true
>>>>    emulate
>>>>
>>>> Without this patch, emulation would have called
>>>>
>>>>    ..._gva_to_gpa_nested
>>>>    -> translate_nested_gpa
>>>>    -> paging64_gva_to_gpa
>>>>    -> paging64_walk_addr
>>>>    -> paging64_walk_addr_generic
>>>>       set fault (nested_page_fault=true)
>>>>
>>>> and then:
>>>>
>>>>    kvm_propagate_fault
>>>>    -> nested_svm_inject_npf_exit
>>>>
>>> maybe then safer thing would be to qualify the new error_code check with
>>> !mmu_is_nested(vcpu) or something like that. So that way it would run on
>>> L1 guest, and not the L2 guest. I believe that would restrict it avoid
>>> hitting this case. Are you okay with this change ?
>> Or check "vcpu->arch.mmu.direct_map"? That would be true when not using
>> shadow pages.
>
> Yes that can be used.

Are you going to send a patch for this?

Paolo

>>> IIRC, the main place where this check was valuable was when L1 guest had
>>> a fault (when coming out of the L2 guest) and emulation was not needed.
>> How do I measure the effect? I tried counting the number of emulations,
>> and any difference from the patch was lost in noise.
>
> I think this patch is necessary for functional reasons (not just
> perf), because we added the other patch to look at the GPA and stop
> walking the guest page tables on a NPF.
>
> The issue I think was that hardware has taken an NPF because the page
> table is marked RO, and it saves the GPA in the VMCB. KVM was then going
> and emulating the instruction and it saw that a GPA was available. But
> that GPA was not the GPA of the instruction it is emulating, since it
> was the GPA of the tablewalk page that had the fault. It was debugged
> that at the time and realized that emulating the instruction was
> unnecessary so we added this new code in there which fixed the
> functional issue and helps perf.
>
> I don't have any data on how much perf, as I recall it was most
> effective when the L1 guest page tables and L2 nested page tables were
> exactly the same. In that case, it avoided emulations for code that L1
> executes which I think could be as much as one emulation per 4kb code page.
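
For reference, the check under discussion, qualified with direct_map as suggested above, can be modeled in a few lines of user-space C. This is only a sketch under stated assumptions, not the patch itself: struct toy_vcpu, enum toy_action and toy_npf_action() are hypothetical names made up for illustration, and the error-code bit positions are written as I recall them from the AMD manual and kvm_host.h, so please check them against the real headers.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Page-fault error code bits. The positions below are assumptions taken
 * from memory of the AMD APM and arch/x86/include/asm/kvm_host.h.
 */
#define PFERR_PRESENT_MASK     (1ULL << 0)
#define PFERR_WRITE_MASK       (1ULL << 1)
#define PFERR_GUEST_FINAL_MASK (1ULL << 32) /* fault on the final GPA translation */
#define PFERR_GUEST_PAGE_MASK  (1ULL << 33) /* fault while walking guest page tables */

/* Write to a present, read-only page hit during the guest page table walk. */
#define PFERR_NESTED_GUEST_PAGE \
	(PFERR_GUEST_PAGE_MASK | PFERR_WRITE_MASK | PFERR_PRESENT_MASK)

/* Hypothetical stand-in for the one piece of vcpu state the check needs. */
struct toy_vcpu {
	bool direct_map; /* true when not using shadow pages */
};

enum toy_action {
	TOY_UNPROTECT_AND_RESUME, /* skip emulation, unprotect the page, re-enter guest */
	TOY_EMULATE               /* fall through to the instruction emulator */
};

/*
 * Decide what to do once the fault handler has asked for emulation.
 * The saved GPA belongs to the page table page, not to the instruction
 * being emulated, so emulation is skipped for this error code, but only
 * when direct_map is set.
 */
static enum toy_action toy_npf_action(const struct toy_vcpu *vcpu,
				      uint64_t error_code)
{
	if (vcpu->direct_map &&
	    (error_code & PFERR_NESTED_GUEST_PAGE) == PFERR_NESTED_GUEST_PAGE)
		return TOY_UNPROTECT_AND_RESUME;

	return TOY_EMULATE;
}
```

With direct_map set, a table-walk write fault takes the unprotect path; with direct_map clear, or with a fault on the final translation, the model falls through to emulation.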