Date: Mon, 11 Apr 2022 14:10:12 -0700
In-Reply-To: <20220411211015.3091615-1-bgardon@google.com>
Message-Id: <20220411211015.3091615-8-bgardon@google.com>
Mime-Version: 1.0
References: <20220411211015.3091615-1-bgardon@google.com>
X-Mailer: git-send-email 2.35.1.1178.g4f1659d476-goog
Subject: [PATCH v4 07/10] KVM: x86/MMU: Allow NX huge pages to be disabled on a per-vm basis
From: Ben Gardon
To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org
Cc: Paolo Bonzini, Peter Xu, Sean Christopherson, Peter Shier, David Dunn, Junaid Shahid, Jim Mattson, David Matlack, Mingwei Zhang, Jing Zhang, Ben Gardon
Content-Type: text/plain; charset="UTF-8"
X-Spam-Status: No, score=-9.5 required=5.0
	tests=BAYES_00,DKIMWL_WL_MED,
	DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,
	MAILING_LIST_MULTI,RDNS_NONE,SPF_HELO_NONE,T_SCC_BODY_TEXT_LINE,
	USER_IN_DEF_DKIM_WL autolearn=no autolearn_force=no version=3.4.6
X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

In some cases, the NX hugepage mitigation for iTLB multihit is not
needed for all guests on a host. Allow disabling the mitigation on a
per-VM basis to avoid the performance hit of NX hugepages on trusted
workloads.

Reviewed-by: David Matlack
Signed-off-by: Ben Gardon
---
 Documentation/virt/kvm/api.rst  | 11 +++++++++++
 arch/x86/include/asm/kvm_host.h |  2 ++
 arch/x86/kvm/mmu.h              | 10 ++++++----
 arch/x86/kvm/mmu/mmu.c          |  2 +-
 arch/x86/kvm/mmu/spte.c         |  7 ++++---
 arch/x86/kvm/mmu/spte.h         |  3 ++-
 arch/x86/kvm/mmu/tdp_mmu.c      |  3 ++-
 arch/x86/kvm/x86.c              |  6 ++++++
 include/uapi/linux/kvm.h        |  1 +
 9 files changed, 35 insertions(+), 10 deletions(-)

diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
index 72183ae628f7..31fb002632bb 100644
--- a/Documentation/virt/kvm/api.rst
+++ b/Documentation/virt/kvm/api.rst
@@ -7855,6 +7855,17 @@ At this time, KVM_PMU_CAP_DISABLE is the only capability. Setting
 this capability will disable PMU virtualization for that VM. Usermode
 should adjust CPUID leaf 0xA to reflect that the PMU is disabled.

+8.36 KVM_CAP_VM_DISABLE_NX_HUGE_PAGES
+-------------------------------------
+
+:Capability: KVM_CAP_VM_DISABLE_NX_HUGE_PAGES
+:Architectures: x86
+:Type: vm
+
+This capability disables the NX huge pages mitigation for iTLB MULTIHIT.
+
+The capability has no effect if the nx_huge_pages module parameter is not set.
+
 9. Known KVM API problems
 =========================

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 2c20f715f009..b8ab4fa7d4b2 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1240,6 +1240,8 @@ struct kvm_arch {
 	hpa_t hv_root_tdp;
 	spinlock_t hv_root_tdp_lock;
 #endif
+
+	bool disable_nx_huge_pages;
 };

 struct kvm_vm_stat {
diff --git a/arch/x86/kvm/mmu.h b/arch/x86/kvm/mmu.h
index 671cfeccf04e..20d12e5b0040 100644
--- a/arch/x86/kvm/mmu.h
+++ b/arch/x86/kvm/mmu.h
@@ -173,10 +173,12 @@ struct kvm_page_fault {
 int kvm_tdp_page_fault(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault);

 extern int nx_huge_pages;
-static inline bool is_nx_huge_page_enabled(void)
+static inline bool is_nx_huge_page_enabled(struct kvm *kvm)
 {
-	return READ_ONCE(nx_huge_pages);
+	return READ_ONCE(nx_huge_pages) &&
+	       !kvm->arch.disable_nx_huge_pages;
 }
+void kvm_update_nx_huge_pages(struct kvm *kvm);

 static inline int kvm_mmu_do_page_fault(struct kvm_vcpu *vcpu, gpa_t cr2_or_gpa,
 					u32 err, bool prefetch)
@@ -191,8 +193,8 @@ static inline int kvm_mmu_do_page_fault(struct kvm_vcpu *vcpu, gpa_t cr2_or_gpa,
 		.user = err & PFERR_USER_MASK,
 		.prefetch = prefetch,
 		.is_tdp = likely(vcpu->arch.mmu->page_fault == kvm_tdp_page_fault),
-		.nx_huge_page_workaround_enabled = is_nx_huge_page_enabled(),
-
+		.nx_huge_page_workaround_enabled =
+			is_nx_huge_page_enabled(vcpu->kvm),
 		.max_level = KVM_MAX_HUGEPAGE_LEVEL,
 		.req_level = PG_LEVEL_4K,
 		.goal_level = PG_LEVEL_4K,
diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index caaa610b7878..149f353105f4 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -6144,7 +6144,7 @@ static void __set_nx_huge_pages(bool val)
 	nx_huge_pages = itlb_multihit_kvm_mitigation = val;
 }

-static void kvm_update_nx_huge_pages(struct kvm *kvm)
+void kvm_update_nx_huge_pages(struct kvm *kvm)
 {
 	mutex_lock(&kvm->slots_lock);
 	kvm_mmu_zap_all_fast(kvm);
diff --git a/arch/x86/kvm/mmu/spte.c b/arch/x86/kvm/mmu/spte.c
index 4739b53c9734..877ad30bc7ad 100644
--- a/arch/x86/kvm/mmu/spte.c
+++ b/arch/x86/kvm/mmu/spte.c
@@ -116,7 +116,7 @@ bool make_spte(struct kvm_vcpu *vcpu, struct kvm_mmu_page *sp,
 		spte |= spte_shadow_accessed_mask(spte);

 	if (level > PG_LEVEL_4K && (pte_access & ACC_EXEC_MASK) &&
-	    is_nx_huge_page_enabled()) {
+	    is_nx_huge_page_enabled(vcpu->kvm)) {
 		pte_access &= ~ACC_EXEC_MASK;
 	}

@@ -215,7 +215,8 @@ static u64 make_spte_executable(u64 spte)
  * This is used during huge page splitting to build the SPTEs that make up the
  * new page table.
  */
-u64 make_huge_page_split_spte(u64 huge_spte, int huge_level, int index)
+u64 make_huge_page_split_spte(struct kvm *kvm, u64 huge_spte, int huge_level,
+			      int index)
 {
 	u64 child_spte;
 	int child_level;
@@ -243,7 +244,7 @@ u64 make_huge_page_split_spte(u64 huge_spte, int huge_level, int index)
 		 * When splitting to a 4K page, mark the page executable as the
 		 * NX hugepage mitigation no longer applies.
 		 */
-		if (is_nx_huge_page_enabled())
+		if (is_nx_huge_page_enabled(kvm))
 			child_spte = make_spte_executable(child_spte);
 	}

diff --git a/arch/x86/kvm/mmu/spte.h b/arch/x86/kvm/mmu/spte.h
index 73f12615416f..e4142caff4b1 100644
--- a/arch/x86/kvm/mmu/spte.h
+++ b/arch/x86/kvm/mmu/spte.h
@@ -415,7 +415,8 @@ bool make_spte(struct kvm_vcpu *vcpu, struct kvm_mmu_page *sp,
 	       unsigned int pte_access, gfn_t gfn, kvm_pfn_t pfn,
 	       u64 old_spte, bool prefetch, bool can_unsync,
 	       bool host_writable, u64 *new_spte);
-u64 make_huge_page_split_spte(u64 huge_spte, int huge_level, int index);
+u64 make_huge_page_split_spte(struct kvm *kvm, u64 huge_spte, int huge_level,
+			      int index);
 u64 make_nonleaf_spte(u64 *child_pt, bool ad_disabled);
 u64 make_mmio_spte(struct kvm_vcpu *vcpu, u64 gfn, unsigned int access);
 u64 mark_spte_for_access_track(u64 spte);
diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
index 566548a3efa7..03aa1e0f60e2 100644
--- a/arch/x86/kvm/mmu/tdp_mmu.c
+++ b/arch/x86/kvm/mmu/tdp_mmu.c
@@ -1469,7 +1469,8 @@ static int tdp_mmu_split_huge_page(struct kvm *kvm, struct tdp_iter *iter,
 	 * not been linked in yet and thus is not reachable from any other CPU.
 	 */
 	for (i = 0; i < PT64_ENT_PER_PAGE; i++)
-		sp->spt[i] = make_huge_page_split_spte(huge_spte, level, i);
+		sp->spt[i] = make_huge_page_split_spte(kvm, huge_spte,
+						       level, i);

 	/*
 	 * Replace the huge spte with a pointer to the populated lower level
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index ab336f7c82e4..b810ea45f965 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -4286,6 +4286,7 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
 	case KVM_CAP_SYS_ATTRIBUTES:
 	case KVM_CAP_VAPIC:
 	case KVM_CAP_ENABLE_CAP:
+	case KVM_CAP_VM_DISABLE_NX_HUGE_PAGES:
 		r = 1;
 		break;
 	case KVM_CAP_EXIT_HYPERCALL:
@@ -6079,6 +6080,11 @@ int kvm_vm_ioctl_enable_cap(struct kvm *kvm,
 		}
 		mutex_unlock(&kvm->lock);
 		break;
+	case KVM_CAP_VM_DISABLE_NX_HUGE_PAGES:
+		kvm->arch.disable_nx_huge_pages = true;
+		kvm_update_nx_huge_pages(kvm);
+		r = 0;
+		break;
 	default:
 		r = -EINVAL;
 		break;
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index dd1d8167e71f..7155488164bd 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -1148,6 +1148,7 @@ struct kvm_ppc_resize_hpt {
 #define KVM_CAP_PMU_CAPABILITY 212
 #define KVM_CAP_DISABLE_QUIRKS2 213
 #define KVM_CAP_VM_TSC_CONTROL 214
+#define KVM_CAP_VM_DISABLE_NX_HUGE_PAGES 215

 #ifdef KVM_CAP_IRQ_ROUTING

-- 
2.35.1.1178.g4f1659d476-goog
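
For readers who want the resulting logic without walking the hunks, here is a minimal userspace sketch of the predicate this patch introduces. It is not kernel code. Every *_model name is a hypothetical stand-in invented for illustration, and the page-table zap that the real handler performs through kvm_update_nx_huge_pages() is deliberately left out. The sketch only shows that the mitigation applies to a VM when the global nx_huge_pages setting is on and that VM has not opted out.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for struct kvm / struct kvm_arch; only the
 * field added by the patch is modeled. */
struct kvm_arch_model {
	bool disable_nx_huge_pages;	/* per-VM opt-out */
};

struct kvm_model {
	struct kvm_arch_model arch;
};

/* Stand-in for the host-wide nx_huge_pages module parameter. */
int nx_huge_pages_model = 1;

/* Same shape as the patched is_nx_huge_page_enabled(): the mitigation
 * applies only if it is on globally AND this VM has not opted out. */
bool nx_huge_page_enabled_model(const struct kvm_model *kvm)
{
	return nx_huge_pages_model && !kvm->arch.disable_nx_huge_pages;
}

/* Models the KVM_CAP_VM_DISABLE_NX_HUGE_PAGES case in the enable-cap
 * handler. Assumption: only the flag update is modeled here; the real
 * code also calls kvm_update_nx_huge_pages(). */
int enable_disable_nx_cap_model(struct kvm_model *kvm)
{
	kvm->arch.disable_nx_huge_pages = true;
	return 0;
}

/* Walk-through: two VMs on one host, only one opts out. */
void nx_model_demo(void)
{
	struct kvm_model trusted = { .arch = { .disable_nx_huge_pages = false } };
	struct kvm_model untrusted = { .arch = { .disable_nx_huge_pages = false } };

	nx_huge_pages_model = 1;
	assert(nx_huge_page_enabled_model(&trusted));
	enable_disable_nx_cap_model(&trusted);
	assert(!nx_huge_page_enabled_model(&trusted));	/* opted out */
	assert(nx_huge_page_enabled_model(&untrusted));	/* unaffected */
}
```

The sketch also reflects the note in the api.rst hunk: when the global setting is off, the per-VM flag changes nothing, because the predicate is already false for every VM.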