From: Sean Christopherson <seanjc@google.com>
To: Sean Christopherson <seanjc@google.com>, Paolo Bonzini
Cc: kvm@vger.kernel.org, linux-kernel@vger.kernel.org, Yosry Ahmed,
	Mingwei Zhang, Ben Gardon
Subject: [PATCH v2 2/6] KVM: x86/mmu: Properly account NX huge page workaround for nonpaging MMUs
Date: Sat, 23 Jul 2022 01:23:21 +0000
Message-Id: <20220723012325.1715714-3-seanjc@google.com>
In-Reply-To: <20220723012325.1715714-1-seanjc@google.com>
References: <20220723012325.1715714-1-seanjc@google.com>
Reply-To: Sean Christopherson <seanjc@google.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="UTF-8"
X-Mailer: git-send-email 2.37.1.359.gd136c6c3e2-goog
Account and track NX huge pages for nonpaging MMUs so that a future
enhancement to precisely check if a shadow page can't be replaced by an NX
huge page doesn't get false positives.  Without correct tracking, KVM can
get stuck in a loop if an instruction is fetching and writing data on the
same huge page, e.g. KVM installs a small executable page on the fetch
fault, replaces it with an NX huge page on the write fault, and faults
again on the fetch.

Alternatively, and perhaps ideally, KVM would simply not enforce the
workaround for nonpaging MMUs.  The guest has no page tables to abuse and
KVM is guaranteed to switch to a different MMU on CR0.PG being toggled, so
there are no security or performance concerns.  However, getting
make_spte() to play nice now and in the future is unnecessarily complex.
In the current code base, make_spte() can enforce the mitigation if TDP is
enabled or the MMU is indirect, but make_spte() may not always have a
vCPU/MMU to work with, e.g. if KVM were to support in-line huge page
promotion when disabling dirty logging.  Without a vCPU/MMU, KVM could
either pass in the correct information and/or derive it from the shadow
page, but the former is ugly and the latter subtly non-trivial due to the
possibility of direct shadow pages in indirect MMUs.

Given that using shadow paging with an unpaged guest is far from top
priority _and_ has been subjected to the workaround since its inception,
keep it simple and just fix the accounting glitch.

Signed-off-by: Sean Christopherson <seanjc@google.com>
---
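To make the refault loop described in the changelog concrete, here is a
deliberately simplified toy model of the ping-pong between the two faults.
This is not KVM code; every name in it is invented for illustration, and
the "precise check" it models is the future enhancement the changelog
refers to, not anything that exists today:

#include <stdbool.h>
#include <stdio.h>

/* How the faulting guest-physical address is currently mapped. */
enum mapping { HUGE_NX, SMALL_EXEC };

static enum mapping cur = HUGE_NX;
static bool nx_split_accounted;	/* the tracking this patch fixes */

/*
 * Fetch fault: an NX huge page can't be executed, so split it into a
 * small executable page.  With the bug, nonpaging MMUs skip the
 * accounting step.
 */
static void handle_fetch_fault(bool buggy_nonpaging_mmu)
{
	if (cur == HUGE_NX) {
		cur = SMALL_EXEC;
		nx_split_accounted = !buggy_nonpaging_mmu;
	}
}

/*
 * Write fault: a (future) precise check asks whether the small page can
 * be folded back into an NX huge page.  Without the accounting, it gets
 * a false positive and reinstalls the huge page.
 */
static void handle_write_fault(void)
{
	if (cur == SMALL_EXEC && !nx_split_accounted)
		cur = HUGE_NX;
}

int main(void)
{
	/*
	 * An instruction that executes from and writes to the same huge
	 * page: fetch fault, write fault, fetch fault, ... forever.
	 */
	for (int i = 0; i < 3; i++) {
		handle_fetch_fault(true);
		printf("after fetch fault: %s\n",
		       cur == SMALL_EXEC ? "small exec page" : "NX huge page");
		handle_write_fault();
		printf("after write fault: %s\n",
		       cur == SMALL_EXEC ? "small exec page" : "NX huge page");
	}
	return 0;
}

With the split never accounted, the write fault keeps undoing the fetch
fault's work, which is exactly the livelock the changelog describes.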
 arch/x86/kvm/mmu/mmu.c          |  2 +-
 arch/x86/kvm/mmu/mmu_internal.h |  8 ++++++++
 arch/x86/kvm/mmu/spte.c         | 11 +++++++++++
 3 files changed, 20 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 1112e3a4cf3e..493cdf1c29ff 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -3135,7 +3135,7 @@ static int __direct_map(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault)
 			continue;
 
 		link_shadow_page(vcpu, it.sptep, sp);
-		if (fault->is_tdp && fault->huge_page_disallowed)
+		if (fault->huge_page_disallowed)
 			account_nx_huge_page(vcpu->kvm, sp,
 					     fault->req_level >= it.level);
 	}
diff --git a/arch/x86/kvm/mmu/mmu_internal.h b/arch/x86/kvm/mmu/mmu_internal.h
index ff4ca54b9dda..83644a0167ab 100644
--- a/arch/x86/kvm/mmu/mmu_internal.h
+++ b/arch/x86/kvm/mmu/mmu_internal.h
@@ -201,6 +201,14 @@ struct kvm_page_fault {
 
 	/* Derived from mmu and global state.  */
 	const bool is_tdp;
+
+	/*
+	 * Note, enforcing the NX huge page mitigation for nonpaging MMUs
+	 * (shadow paging, CR0.PG=0 in the guest) is completely unnecessary.
+	 * The guest doesn't have any page tables to abuse and is guaranteed
+	 * to switch to a different MMU when CR0.PG is toggled on (may not
+	 * always be guaranteed when KVM is using TDP).  See also make_spte().
+	 */
 	const bool nx_huge_page_workaround_enabled;
 
 	/*
diff --git a/arch/x86/kvm/mmu/spte.c b/arch/x86/kvm/mmu/spte.c
index 7314d27d57a4..9f3e5af088a5 100644
--- a/arch/x86/kvm/mmu/spte.c
+++ b/arch/x86/kvm/mmu/spte.c
@@ -147,6 +147,17 @@ bool make_spte(struct kvm_vcpu *vcpu, struct kvm_mmu_page *sp,
 	if (!prefetch)
 		spte |= spte_shadow_accessed_mask(spte);
 
+	/*
+	 * For simplicity, enforce the NX huge page mitigation even if not
+	 * strictly necessary.  KVM could ignore the mitigation if paging is
+	 * disabled in the guest, but KVM would then have to ensure a new MMU
+	 * is loaded (or all shadow pages zapped) when CR0.PG is toggled on,
+	 * and that's a net negative for performance when TDP is enabled.  KVM
+	 * could ignore the mitigation if TDP is disabled and CR0.PG=0, as KVM
+	 * will always switch to a new MMU if paging is enabled in the guest,
+	 * but that adds complexity just to optimize a mode that is anything
+	 * but performance critical.
+	 */
 	if (level > PG_LEVEL_4K && (pte_access & ACC_EXEC_MASK) &&
 	    is_nx_huge_page_enabled(vcpu->kvm)) {
 		pte_access &= ~ACC_EXEC_MASK;
-- 
2.37.1.359.gd136c6c3e2-goog