Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S934016AbdGKTex convert rfc822-to-8bit (ORCPT ); Tue, 11 Jul 2017 15:34:53 -0400 Received: from mx1.redhat.com ([209.132.183.28]:46216 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S932390AbdGKTew (ORCPT ); Tue, 11 Jul 2017 15:34:52 -0400 DMARC-Filter: OpenDMARC Filter v1.3.2 mx1.redhat.com 90D874E4CB Authentication-Results: ext-mx09.extmail.prod.ext.phx2.redhat.com; dmarc=none (p=none dis=none) header.from=redhat.com Authentication-Results: ext-mx09.extmail.prod.ext.phx2.redhat.com; spf=pass smtp.mailfrom=bsd@redhat.com DKIM-Filter: OpenDKIM Filter v2.11.0 mx1.redhat.com 90D874E4CB From: Bandan Das To: Radim =?utf-8?B?S3LEjW3DocWZ?= Cc: David Hildenbrand , kvm@vger.kernel.org, pbonzini@redhat.com, linux-kernel@vger.kernel.org Subject: Re: [PATCH v4 3/3] KVM: nVMX: Emulate EPTP switching for the L1 hypervisor References: <20170710204936.4001-1-bsd@redhat.com> <20170710204936.4001-4-bsd@redhat.com> <2d50ebc4-9328-ce08-b55b-6a331ee13cc3@redhat.com> <20170711135251.GA3326@potion> <20170711191207.GD3326@potion> Date: Tue, 11 Jul 2017 15:34:48 -0400 In-Reply-To: <20170711191207.GD3326@potion> ("Radim \=\?utf-8\?B\?S3LEjW3DocWZ\?\= \=\?utf-8\?B\?Iidz\?\= message of "Tue, 11 Jul 2017 21:12:07 +0200") Message-ID: User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/25.2 (gnu/linux) MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 8BIT X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.38]); Tue, 11 Jul 2017 19:34:51 +0000 (UTC) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 4266 Lines: 105 Radim Krčmář writes: > 2017-07-11 14:05-0400, Bandan Das: >> Radim Krčmář writes: >> >> > [David did a great review, so I'll just point out things I noticed.] >> > >> > 2017-07-11 09:51+0200, David Hildenbrand: >> >> On 10.07.2017 22:49, Bandan Das wrote: >> >> > When L2 uses vmfunc, L0 utilizes the associated vmexit to >> >> > emulate a switching of the ept pointer by reloading the >> >> > guest MMU. >> >> > >> >> > Signed-off-by: Paolo Bonzini >> >> > Signed-off-by: Bandan Das >> >> > --- >> >> > diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c >> >> > @@ -7784,11 +7801,46 @@ static int handle_vmfunc(struct kvm_vcpu *vcpu) >> >> > } >> >> > >> >> > vmcs12 = get_vmcs12(vcpu); >> >> > - if ((vmcs12->vm_function_control & (1 << function)) == 0) >> >> > + if (((vmcs12->vm_function_control & (1 << function)) == 0) || >> >> > + WARN_ON_ONCE(function)) >> >> >> >> "... instruction causes a VM exit if the bit at position EAX is 0 in the >> >> VM-function controls (the selected VM function is >> >> not enabled)." >> >> >> >> So g2 can trigger this WARN_ON_ONCE, no? I think we should drop it then >> >> completely. >> > >> > It assumes that vm_function_control is not > 1, which is (should be) >> > guaranteed by VM entry check, because the nested_vmx_vmfunc_controls MSR >> > is 1. >> > >> >> > + goto fail; >> > >> > The rest of the code assumes that the function is >> > VMX_VMFUNC_EPTP_SWITCHING, so some WARN_ON_ONCE is reasonable. >> > >> > Writing it as >> > >> > WARN_ON_ONCE(function != VMX_VMFUNC_EPTP_SWITCHING) >> > >> > would be cleared and I'd prefer to move the part that handles >> > VMX_VMFUNC_EPTP_SWITCHING into a new function. (Imagine that Intel is >> > going to add more than one VM FUNC. :]) >> >> IMO, for now, this should be fine because we are not even passing through the >> hardware's eptp switching. Even if there are other vm functions, they >> won't be available for the nested case and cause any conflict. > > Yeah, it is fine function-wise, I was just pointing out that it looks > ugly to me. Ok, lemme switch this to a switch statement style handler function. That way, future additions would be easier. > Btw. have you looked what we'd need to do for the hardware pass-through? > I'd expect big changes to MMU. :) Yes, the first version was actually using vmfunc 0 directly, well not exatly, the first time would go through this path and then the next time the processor would handle it directly. Paolo pointed out an issue that I was ready to fix but he wasn't comfortable with the idea. I actually agree with him, it's too much untested code for something that would be rarely used. >> >> > + if (!nested_cpu_has_ept(vmcs12) || >> >> > + !nested_cpu_has_eptp_switching(vmcs12)) >> >> > + goto fail; >> > >> > This brings me to a missing vm-entry check: >> > >> > If “EPTP switching” VM-function control is 1, the “enable EPT” >> > VM-execution control must also be 1. In addition, the EPTP-list address >> > must satisfy the following checks: >> > • Bits 11:0 of the address must be 0. >> > • The address must not set any bits beyond the processor’s >> > physical-address width. >> > >> > so this one could be >> > >> > if (!nested_cpu_has_eptp_switching(vmcs12) || >> > WARN_ON_ONCE(!nested_cpu_has_ept(vmcs12))) >> >> I will reverse the order here but the vm entry check is unnecessary because >> the check on the list address is already being done in this function. > > Here is too late, the nested VM-entry should have failed, never letting > this situation happen. We want an equivalent of > > if (nested_cpu_has_eptp_switching(vmcs12) && !nested_cpu_has_ept(vmcs12)) > return VMXERR_ENTRY_INVALID_CONTROL_FIELD; > > in nested controls checks, right next to the reserved fields check. > And then also the check EPTP-list check. All of them only checked when > nested_cpu_has_vmfunc(vmcs12). Actually, I misread 25.5.5.3. There are two checks. Here, the list entry needs to be checked so that eptp won't cause a vmentry failure. The vmentry needs to check the eptp list address itself. I will add that check for the list address in the next version. Bandan