Reply-To: Sean Christopherson <seanjc@google.com>
Date: Fri, 23 Apr 2021 17:46:26 -0700
In-Reply-To: <20210424004645.3950558-1-seanjc@google.com>
Message-Id: <20210424004645.3950558-25-seanjc@google.com>
Mime-Version: 1.0
References: <20210424004645.3950558-1-seanjc@google.com>
X-Mailer: git-send-email 2.31.1.498.g6c1eba8ee3d-goog
Subject: [PATCH 24/43] KVM: nVMX: Do not clear CR3 load/store exiting bits if L1 wants 'em
From: Sean Christopherson <seanjc@google.com>
To: Paolo Bonzini
Cc: Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
    Joerg Roedel, kvm@vger.kernel.org, linux-kernel@vger.kernel.org
X-Mailing-List: linux-kernel@vger.kernel.org

Keep CR3 load/store exiting enabled as needed when running L2 in order
to honor L1's desires.  This fixes a largely theoretical bug where L1
could intercept CR3 but not CR0.PG and end up not getting the desired
CR3 exits when L2 enables paging.  In other words, the existing
!is_paging() check inadvertently handles the normal case for L2 where
vmx_set_cr0() is called during VM-Enter, which is guaranteed to run
with paging enabled, and thus will never clear the bits.

Removing the !is_paging() check will also allow future consolidation
and cleanup of the related code.  From a performance perspective, this
is all a nop, as the VMCS controls shadow will optimize away the
VMWRITE when the controls are in the desired state.

Add a comment explaining why CR3 is intercepted, with a big disclaimer
about not querying the old CR3.  Because vmx_set_cr0() is used for
flows that are not directly tied to MOV CR3, e.g. vCPU RESET/INIT and
nested VM-Enter, it's possible that is_paging() is not synchronized
with CR3 load/store exiting.  This is actually guaranteed in the
current code, as KVM starts with CR3 interception disabled.  Obviously
that can be fixed, but there's no good reason to play whack-a-mole,
and it tends to end poorly, e.g. descriptor table exiting for UMIP
emulation attempted to be precise in the past and ended up botching
the interception toggling.
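Concretely, the new logic reduces to three cases: force the exits on
when paging is off, clear them when running L1 with paging on, and
copy L1's vmcs12 bits when running L2 with paging on.  A minimal
standalone C sketch of that merge follows (illustrative only: the
helper and its calling convention are simplified stand-ins for the
kernel code, though the bit values match the VMX primary exec-controls
encoding):

#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Bit 15 / bit 16 of the primary processor-based exec controls. */
#define CPU_BASED_CR3_LOAD_EXITING   (1u << 15)
#define CPU_BASED_CR3_STORE_EXITING  (1u << 16)
#define CR3_EXITING_BITS (CPU_BASED_CR3_LOAD_EXITING | \
                          CPU_BASED_CR3_STORE_EXITING)

static uint32_t merge_cr3_exiting(uint32_t exec_controls, bool paging_on,
                                  bool in_guest_mode, uint32_t vmcs12_controls)
{
        /*
         * Paging disabled: KVM needs the exits itself to protect the
         * identity-mapped page tables it runs the guest on.
         */
        if (!paging_on)
                return exec_controls | CR3_EXITING_BITS;

        /* Paging enabled, running L1: nobody wants CR3 exits. */
        if (!in_guest_mode)
                return exec_controls & ~CR3_EXITING_BITS;

        /*
         * Paging enabled, running L2: clear KVM's bits, then copy in
         * whatever L1 requested so its CR3 exits are still forwarded.
         */
        return (exec_controls & ~CR3_EXITING_BITS) |
               (vmcs12_controls & CR3_EXITING_BITS);
}

int main(void)
{
        /* L2 with paging on, L1 intercepting CR3 loads only. */
        printf("%#x\n", merge_cr3_exiting(CR3_EXITING_BITS, true, true,
                                          CPU_BASED_CR3_LOAD_EXITING));
        return 0;
}

This prints 0x8000: the store-exiting bit KVM no longer needs is
dropped, while the load-exiting bit L1 asked for is preserved.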
Fixes: fe3ef05c7572 ("KVM: nVMX: Prepare vmcs02 from vmcs01 and vmcs12")
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/vmx/vmx.c | 46 +++++++++++++++++++++++++++++++++---------
 1 file changed, 37 insertions(+), 9 deletions(-)

diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index c9322cd55390..e42ae77e4b82 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -3102,10 +3102,14 @@ void ept_save_pdptrs(struct kvm_vcpu *vcpu)
 	kvm_register_mark_dirty(vcpu, VCPU_EXREG_PDPTR);
 }
 
+#define CR3_EXITING_BITS (CPU_BASED_CR3_LOAD_EXITING | \
+			  CPU_BASED_CR3_STORE_EXITING)
+
 void vmx_set_cr0(struct kvm_vcpu *vcpu, unsigned long cr0)
 {
 	struct vcpu_vmx *vmx = to_vmx(vcpu);
 	unsigned long hw_cr0;
+	u32 tmp;
 
 	hw_cr0 = (cr0 & ~KVM_VM_CR0_ALWAYS_OFF);
 	if (is_unrestricted_guest(vcpu))
@@ -3132,18 +3136,42 @@ void vmx_set_cr0(struct kvm_vcpu *vcpu, unsigned long cr0)
 #endif
 
 	if (enable_ept && !is_unrestricted_guest(vcpu)) {
+		/*
+		 * Ensure KVM has an up-to-date snapshot of the guest's CR3.  If
+		 * the below code _enables_ CR3 exiting, vmx_cache_reg() will
+		 * (correctly) stop reading vmcs.GUEST_CR3 because it thinks
+		 * KVM's CR3 is installed.
+		 */
 		if (!kvm_register_is_available(vcpu, VCPU_EXREG_CR3))
 			vmx_cache_reg(vcpu, VCPU_EXREG_CR3);
+
+		/*
+		 * When running with EPT but not unrestricted guest, KVM must
+		 * intercept CR3 accesses when paging is _disabled_.  This is
+		 * necessary because restricted guests can't actually run with
+		 * paging disabled, and so KVM stuffs its own CR3 in order to
+		 * run the guest with identity mapped page tables.
+		 *
+		 * Do _NOT_ check the old CR0.PG, e.g. to optimize away the
+		 * update, it may be stale with respect to CR3 interception,
+		 * e.g. after nested VM-Enter.
+		 *
+		 * Lastly, honor L1's desires, i.e. intercept CR3 loads and/or
+		 * stores to forward them to L1, even if KVM does not need to
+		 * intercept them to preserve its identity mapped page tables.
+		 */
 		if (!(cr0 & X86_CR0_PG)) {
-			/* From paging/starting to nonpaging */
-			exec_controls_setbit(vmx, CPU_BASED_CR3_LOAD_EXITING |
-						  CPU_BASED_CR3_STORE_EXITING);
-			vcpu->arch.cr0 = cr0;
-			vmx_set_cr4(vcpu, kvm_read_cr4(vcpu));
-		} else if (!is_paging(vcpu)) {
-			/* From nonpaging to paging */
-			exec_controls_clearbit(vmx, CPU_BASED_CR3_LOAD_EXITING |
-						    CPU_BASED_CR3_STORE_EXITING);
+			exec_controls_setbit(vmx, CR3_EXITING_BITS);
+		} else if (!is_guest_mode(vcpu)) {
+			exec_controls_clearbit(vmx, CR3_EXITING_BITS);
+		} else {
+			tmp = exec_controls_get(vmx);
+			tmp &= ~CR3_EXITING_BITS;
+			tmp |= get_vmcs12(vcpu)->cpu_based_vm_exec_control & CR3_EXITING_BITS;
+			exec_controls_set(vmx, tmp);
+		}
+
+		if (!is_paging(vcpu) != !(cr0 & X86_CR0_PG)) {
 			vcpu->arch.cr0 = cr0;
 			vmx_set_cr4(vcpu, kvm_read_cr4(vcpu));
 		}
-- 
2.31.1.498.g6c1eba8ee3d-goog
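A side note on the last hunk: the new condition
!is_paging(vcpu) != !(cr0 & X86_CR0_PG) is the standard C idiom for a
logical XOR of two truth values, here "is the guest's paging state
actually changing?", and only on a real transition are vcpu->arch.cr0
and CR4 refreshed.  The double negation matters because the two
operands are not normalized the same way, as this standalone demo
shows (the helper name is invented for illustration):

#include <stdbool.h>
#include <stdio.h>

#define X86_CR0_PG (1ul << 31)

/*
 * "!a != !b" collapses each side to 0 or 1 before comparing, so it is
 * true exactly when one side is nonzero and the other is zero.  A raw
 * "a != b" would misfire here: is_paging() yields 0/1, while
 * (cr0 & X86_CR0_PG) yields 0 or 0x80000000.
 */
static bool paging_bit_flipping(bool was_paging, unsigned long new_cr0)
{
        return !was_paging != !(new_cr0 & X86_CR0_PG);
}

int main(void)
{
        printf("%d\n", paging_bit_flipping(false, X86_CR0_PG)); /* 1: enabling  */
        printf("%d\n", paging_bit_flipping(true, X86_CR0_PG));  /* 0: unchanged */
        printf("%d\n", paging_bit_flipping(true, 0));           /* 1: disabling */
        return 0;
}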