DMARC-Filter: OpenDMARC Filter v1.3.2 mx1.redhat.com 90D874E4CB
From: Bandan Das <bsd@redhat.com>
To: Radim =?utf-8?B?S3LEjW3DocWZ?= <rkrcmar@redhat.com>
Cc: David Hildenbrand <david@redhat.com>, kvm@vger.kernel.org,
        pbonzini@redhat.com, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v4 3/3] KVM: nVMX: Emulate EPTP switching for the L1 hypervisor
References: <20170710204936.4001-1-bsd@redhat.com>
        <20170710204936.4001-4-bsd@redhat.com>
        <2d50ebc4-9328-ce08-b55b-6a331ee13cc3@redhat.com>
        <20170711135251.GA3326@potion> <jpgtw2ialyd.fsf@linux.bootlegged.copy>
        <20170711191207.GD3326@potion>
Date: Tue, 11 Jul 2017 15:34:48 -0400
In-Reply-To: <20170711191207.GD3326@potion> ("Radim \=\?utf-8\?B\?S3LEjW3DocWZ\?\=
 \=\?utf-8\?B\?Iidz\?\= message of "Tue,
        11 Jul 2017 21:12:07 +0200")
Message-ID: <jpgshi2lqcn.fsf@linux.bootlegged.copy>
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/25.2 (gnu/linux)
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8BIT
Sender: linux-kernel-owner@vger.kernel.org
Content-Length: 4266
Lines: 105

Radim Krčmář <rkrcmar@redhat.com> writes:

> 2017-07-11 14:05-0400, Bandan Das:
>> Radim Krčmář <rkrcmar@redhat.com> writes:
>> 
>> > [David did a great review, so I'll just point out things I noticed.]
>> >
>> > 2017-07-11 09:51+0200, David Hildenbrand:
>> >> On 10.07.2017 22:49, Bandan Das wrote:
>> >> > When L2 uses vmfunc, L0 utilizes the associated vmexit to
>> >> > emulate a switching of the ept pointer by reloading the
>> >> > guest MMU.
>> >> > 
>> >> > Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
>> >> > Signed-off-by: Bandan Das <bsd@redhat.com>
>> >> > ---
>> >> > diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
>> >> > @@ -7784,11 +7801,46 @@ static int handle_vmfunc(struct kvm_vcpu *vcpu)
>> >> >  	}
>> >> >  
>> >> >  	vmcs12 = get_vmcs12(vcpu);
>> >> > -	if ((vmcs12->vm_function_control & (1 << function)) == 0)
>> >> > +	if (((vmcs12->vm_function_control & (1 << function)) == 0) ||
>> >> > +	    WARN_ON_ONCE(function))
>> >> 
>> >> "... instruction causes a VM exit if the bit at position EAX is 0 in the
>> >> VM-function controls (the selected VM function is
>> >> not enabled)."
>> >> 
>> >> So g2 can trigger this WARN_ON_ONCE, no? I think we should drop it then
>> >> completely.
>> >
>> > It assumes that vm_function_control is not > 1, which is (should be)
>> > guaranteed by VM entry check, because the nested_vmx_vmfunc_controls MSR
>> > is 1.
>> >
>> >> > +		goto fail;
>> >
>> > The rest of the code assumes that the function is
>> > VMX_VMFUNC_EPTP_SWITCHING, so some WARN_ON_ONCE is reasonable.
>> >
>> > Writing it as
>> >
>> >   WARN_ON_ONCE(function != VMX_VMFUNC_EPTP_SWITCHING)
>> >
>> > would be cleared and I'd prefer to move the part that handles
>> > VMX_VMFUNC_EPTP_SWITCHING into a new function. (Imagine that Intel is
>> > going to add more than one VM FUNC. :])
>> 
>> IMO, for now, this should be fine because we are not even passing through the
>> hardware's eptp switching. Even if there are other vm functions, they
>> won't be available for the nested case and cause any conflict.
>
> Yeah, it is fine function-wise, I was just pointing out that it looks
> ugly to me.

Ok, lemme switch this to a switch statement style handler function. That way,
future additions would be easier.

> Btw. have you looked what we'd need to do for the hardware pass-through?
> I'd expect big changes to MMU. :)

Yes, the first version was actually using vmfunc 0 directly, well not exatly, the
first time would go through this path and then the next time the processor would
handle it directly. Paolo pointed out an issue that I was ready to fix but he wasn't
comfortable with the idea. I actually agree with him, it's too much untested code
for something that would be rarely used.

>> >> > +	if (!nested_cpu_has_ept(vmcs12) ||
>> >> > +	    !nested_cpu_has_eptp_switching(vmcs12))
>> >> > +		goto fail;
>> >
>> > This brings me to a missing vm-entry check:
>> >
>> >  If “EPTP switching” VM-function control is 1, the “enable EPT”
>> >  VM-execution control must also be 1. In addition, the EPTP-list address
>> >  must satisfy the following checks:
>> >  • Bits 11:0 of the address must be 0.
>> >  • The address must not set any bits beyond the processor’s
>> >    physical-address width.
>> >
>> > so this one could be
>> >
>> >   if (!nested_cpu_has_eptp_switching(vmcs12) ||
>> >       WARN_ON_ONCE(!nested_cpu_has_ept(vmcs12)))
>> 
>> I will reverse the order here but the vm entry check is unnecessary because
>> the check on the list address is already being done in this function.
>
> Here is too late, the nested VM-entry should have failed, never letting
> this situation happen.  We want an equivalent of
>
>   if (nested_cpu_has_eptp_switching(vmcs12) && !nested_cpu_has_ept(vmcs12))
>   	return VMXERR_ENTRY_INVALID_CONTROL_FIELD;
>
> in nested controls checks, right next to the reserved fields check.
> And then also the check EPTP-list check.  All of them only checked when
> nested_cpu_has_vmfunc(vmcs12).

Actually, I misread 25.5.5.3. There are two checks. Here, the list entry
needs to be checked so that eptp won't cause a vmentry failure. The vmentry
needs to check the eptp list address itself. I will add that check for the
list address in the next version.

Bandan