From: Bandan Das <bsd@redhat.com>
To: Radim Krčmář
Cc: David Hildenbrand, kvm@vger.kernel.org, pbonzini@redhat.com, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v4 3/3] KVM: nVMX: Emulate EPTP switching for the L1 hypervisor
Date: Tue, 11 Jul 2017 17:08:48 -0400
In-Reply-To: <20170711204521.GF3326@potion>

Radim Krčmář writes:

> 2017-07-11 16:34-0400, Bandan Das:
>> Radim Krčmář writes:
>>
>> > 2017-07-11 15:50-0400, Bandan Das:
>> >> Radim Krčmář writes:
>> >> > 2017-07-11 14:24-0400, Bandan Das:
>> >> >> Bandan Das writes:
>> >> >> > If there's a triple fault, I think it's a good idea to inject it
>> >> >> > back. Basically, there's no need to take care of damage control
>> >> >> > that L1 is intentionally doing.
>> >> >> >
>> >> >> >>> +		goto fail;
>> >> >> >>> +	kvm_mmu_unload(vcpu);
>> >> >> >>> +	vmcs12->ept_pointer = address;
>> >> >> >>> +	kvm_mmu_reload(vcpu);
>> >> >> >>
>> >> >> >> I was thinking about something like this:
>> >> >> >>
>> >> >> >>	kvm_mmu_unload(vcpu);
>> >> >> >>	old = vmcs12->ept_pointer;
>> >> >> >>	vmcs12->ept_pointer = address;
>> >> >> >>	if (kvm_mmu_reload(vcpu)) {
>> >> >> >>		/* pointer invalid, restore previous state */
>> >> >> >>		kvm_clear_request(KVM_REQ_TRIPLE_FAULT, vcpu);
>> >> >> >>		vmcs12->ept_pointer = old;
>> >> >> >>		kvm_mmu_reload(vcpu);
>> >> >> >>		goto fail;
>> >> >> >>	}
>> >> >> >>
>> >> >> >> Then you can inherit the checks from mmu_check_root().
>> >> >>
>> >> >> Actually, thinking about this a bit more, I agree with you. Any fault
>> >> >> with a vmfunc operation should end with a vmfunc vmexit, so this
>> >> >> is a good thing to have. Thank you for this idea! :)
>> >> >
>> >> > SDM says
>> >> >
>> >> >   IF tent_EPTP is not a valid EPTP value (would cause VM entry to fail
>> >> >   if in EPTP) THEN VMexit;
>> >>
>> >> This section here:
>> >>
>> >>     As noted in Section 25.5.5.2, an execution of the
>> >>     EPTP-switching VM function that causes a VM exit (as specified
>> >>     above), uses the basic exit reason 59, indicating “VMFUNC”.
>> >>     The length of the VMFUNC instruction is saved into the
>> >>     VM-exit instruction-length field. No additional VM-exit
>> >>     information is provided.
>> >>
>> >> Although it adds "(as specified above)", from testing, any vmexit that
>> >> happens as a result of the execution of the vmfunc instruction always
>> >> has exit reason 59.
>> >>
>> >> IMO, the case David pointed out comes under "as a result of the
>> >> execution of the vmfunc instruction", so I would prefer exiting
>> >> with reason 59.
>> >
>> > Right, the exit reason is 59 for reasons that trigger a VM exit
>> > (i.e. invalid EPTP value, the four below), but kvm_mmu_reload() checks
>> > unrelated stuff.
>> >
>> > If the EPTP value is correct, then the switch should succeed.
>> > If the EPTP is correct, but bogus, then the guest should get an
>> > EPT_MISCONFIG VM exit on its first access (when reading the
>> > instruction).  Source: I added
>>
>> My point is that we are using kvm_mmu_reload() to emulate eptp
>> switching. If that emulation of vmfunc fails, it should exit with
>> reason 59.
>
> Yeah, we just disagree on what is a vmfunc failure.
>
>> >   vmcs_write64(EPT_POINTER, vmcs_read64(EPT_POINTER) | (1ULL << 40));
>> >
>> > shortly before a VMLAUNCH on L0. :)
>>
>> What happens if this ept pointer is actually in the eptp list and the
>> guest switches to it using vmfunc? I think it will exit with reason 59.
>
> I think otherwise, because it doesn't cause a VM entry failure on
> bare-metal (and SDM says that we get a VM exit only if there would be a
> VM entry failure).
> I expect the vmfunc to succeed and to get an EPT_MISCONFIG right after.
> (Experiment pending :])
>
>> > I think that we might be emulating this case incorrectly and throwing
>> > triple faults when it should be VM exits in vcpu_run().
>>
>> No, I agree with not throwing a triple fault. We should clear it out.
>> But we should emulate a vmfunc vmexit back to L1 when kvm_mmu_load fails.
>
> Here we disagree.  I think that it's a bug to do the VM exit, so we can

Why do you think it's a bug? As far as our emulation goes, the eptp
switching function really didn't succeed when kvm_mmu_reload() fails,
and as such the generic failure event should be a vmfunc vmexit. We
cannot strictly follow the spec here; the spec doesn't even describe a
way to emulate eptp switching. If setting up the switch had succeeded
and the new root pointer turned out to be invalid or whatever, I really
wouldn't care what happens next, but that is not the case here. We fail
to get a new root pointer, and without one we can't even make the
switch!

> just keep the original bug -- we want to eventually fix it and it's no
> worse till then.

Anyway, can you please confirm again what behavior you are expecting
when kvm_mmu_reload() fails? This would be a rarely used branch, and I
am actually fine diverging from what I think is right if I can get the
reviewers to agree on a common approach.

(Thanks for giving this a closer look, Radim. I really appreciate it.)

Bandan
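
P.S. To make sure we are talking about the same control flow, here is a
minimal sketch of the rollback variant folded into one helper. The
helper name nested_vmx_eptp_switching() and the return convention
(nonzero asks the caller to reflect a VMFUNC vmexit, basic exit reason
59, to L1) are my assumptions for illustration only, not part of the
posted patch:

	/*
	 * Sketch only: switch L2's EPTP, rolling back to the old root if
	 * the new one cannot be loaded, so the caller can turn a failure
	 * into a VMFUNC vmexit for L1 instead of a triple fault.
	 */
	static int nested_vmx_eptp_switching(struct kvm_vcpu *vcpu,
					     struct vmcs12 *vmcs12,
					     u64 address)
	{
		u64 old = vmcs12->ept_pointer;

		kvm_mmu_unload(vcpu);
		vmcs12->ept_pointer = address;
		if (kvm_mmu_reload(vcpu)) {
			/*
			 * The failed reload queued a triple fault for the
			 * guest; drop it, restore the old EPTP and reload
			 * the previous root.
			 */
			kvm_clear_request(KVM_REQ_TRIPLE_FAULT, vcpu);
			vmcs12->ept_pointer = old;
			kvm_mmu_reload(vcpu);
			return 1;	/* caller emulates the VMFUNC vmexit */
		}
		return 0;
	}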