Date: Wed, 13 Apr 2022 23:03:57 +0000
From: Sean Christopherson
To: Ben Gardon
Cc: linux-kernel@vger.kernel.org, kvm@vger.kernel.org, Paolo Bonzini, Peter Xu, David Matlack, Jim Mattson, David Dunn, Jing Zhang, Junaid Shahid
Subject: Re: [PATCH v5 08/10] KVM: x86/MMU: Allow NX huge pages to be disabled on a per-vm basis
References: <20220413175944.71705-1-bgardon@google.com> <20220413175944.71705-9-bgardon@google.com>
In-Reply-To: <20220413175944.71705-9-bgardon@google.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, Apr 13, 2022, Ben Gardon wrote:
> diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
> index 72183ae628f7..021452a9fa91 100644
> --- a/Documentation/virt/kvm/api.rst
> +++ b/Documentation/virt/kvm/api.rst
> @@ -7855,6 +7855,19 @@ At this time, KVM_PMU_CAP_DISABLE is the only capability. Setting
>  this capability will disable PMU virtualization for that VM. Usermode
>  should adjust CPUID leaf 0xA to reflect that the PMU is disabled.
>
> +8.36 KVM_CAP_VM_DISABLE_NX_HUGE_PAGES
> +---------------------------
> +
> +:Capability KVM_CAP_PMU_CAPABILITY
> +:Architectures: x86
> +:Type: vm
> +:Returns 0 on success, -EPERM if the userspace process does not
> +	 have CAP_SYS_BOOT

Needs to document the -EINVAL cases, especially the requirement that this be
called before VMs are created. The

> +This capability disables the NX huge pages mitigation for iTLB MULTIHIT.
> +
> +The capability has no effect if the nx_huge_pages module parameter is not set.
> +
>  9. Known KVM API problems
>  =========================
>
> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> index 2c20f715f009..b8ab4fa7d4b2 100644
> --- a/arch/x86/include/asm/kvm_host.h
> +++ b/arch/x86/include/asm/kvm_host.h
> @@ -1240,6 +1240,8 @@ struct kvm_arch {
>  	hpa_t hv_root_tdp;
>  	spinlock_t hv_root_tdp_lock;
>  #endif
> +
> +	bool disable_nx_huge_pages;
>  };
>
>  struct kvm_vm_stat {
> diff --git a/arch/x86/kvm/mmu.h b/arch/x86/kvm/mmu.h
> index 671cfeccf04e..148f630af78a 100644
> --- a/arch/x86/kvm/mmu.h
> +++ b/arch/x86/kvm/mmu.h
> @@ -173,9 +173,10 @@ struct kvm_page_fault {
>  int kvm_tdp_page_fault(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault);
>
>  extern int nx_huge_pages;
> -static inline bool is_nx_huge_page_enabled(void)
> +static inline bool is_nx_huge_page_enabled(struct kvm *kvm)
>  {
> -	return READ_ONCE(nx_huge_pages);
> +	return READ_ONCE(nx_huge_pages) &&
> +	       !kvm->arch.disable_nx_huge_pages;

No need for a newline, that fits on a single line.

> diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
> index 566548a3efa7..03aa1e0f60e2 100644
> --- a/arch/x86/kvm/mmu/tdp_mmu.c
> +++ b/arch/x86/kvm/mmu/tdp_mmu.c
> @@ -1469,7 +1469,8 @@ static int tdp_mmu_split_huge_page(struct kvm *kvm, struct tdp_iter *iter,
>  	 * not been linked in yet and thus is not reachable from any other CPU.
>  	 */
>  	for (i = 0; i < PT64_ENT_PER_PAGE; i++)
> -		sp->spt[i] = make_huge_page_split_spte(huge_spte, level, i);
> +		sp->spt[i] = make_huge_page_split_spte(kvm, huge_spte,
> +						       level, i);

Just let this poke past 80 chars.
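
For reference, a minimal user-space sketch of the single-line form suggested
for the mmu.h helper. Only the return statement mirrors the patch; the
READ_ONCE() macro, the nx_huge_pages definition, the trimmed struct kvm, and
nx_helper_selfcheck() are stand-ins assumed here so the snippet compiles
outside the kernel.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumption: simplified, int-only stand-in for the kernel's READ_ONCE(). */
#define READ_ONCE(x) (*(const volatile int *)&(x))

/* Assumption: mock of the nx_huge_pages module parameter. */
int nx_huge_pages = 1;

/* Assumption: mocks that carry only the field added by the patch. */
struct kvm_arch { bool disable_nx_huge_pages; };
struct kvm { struct kvm_arch arch; };

static inline bool is_nx_huge_page_enabled(struct kvm *kvm)
{
	return READ_ONCE(nx_huge_pages) && !kvm->arch.disable_nx_huge_pages;
}

/* Hypothetical self-check: the per-VM flag can only turn the mitigation off. */
static void nx_helper_selfcheck(void)
{
	struct kvm vm = { .arch = { .disable_nx_huge_pages = false } };

	assert(is_nx_huge_page_enabled(&vm));
	vm.arch.disable_nx_huge_pages = true;
	assert(!is_nx_huge_page_enabled(&vm));
}
```

With one tab of indentation the return statement stays under 80 columns, so
no continuation line is needed.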
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 665c1fa8bb57..27631c3b53c2 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -4286,6 +4286,7 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
>  	case KVM_CAP_SYS_ATTRIBUTES:
>  	case KVM_CAP_VAPIC:
>  	case KVM_CAP_ENABLE_CAP:
> +	case KVM_CAP_VM_DISABLE_NX_HUGE_PAGES:
>  		r = 1;
>  		break;
>  	case KVM_CAP_EXIT_HYPERCALL:
> @@ -6079,6 +6080,28 @@ int kvm_vm_ioctl_enable_cap(struct kvm *kvm,
>  		}
>  		mutex_unlock(&kvm->lock);
>  		break;
> +	case KVM_CAP_VM_DISABLE_NX_HUGE_PAGES:
> +		r = -EINVAL;
> +		if (cap->args[0])
> +			break;
> +
> +		/*
> +		 * Since the risk of disabling NX hugepages is a guest crashing
> +		 * the system, ensure the userspace process has permission to
> +		 * reboot the system.

Since I'm nitpicking already and there's also a comment...

Can you call out that, unlike the actual reboot() syscall, the process needs the
capability in the init? namespace (I don't actually know the terminology) because
exposing /dev/kvm into a container doesn't magically limit the iTLB multihit bug
to that container. I.e. that this _must_ use capable(), not ns_capable().

Amusingly, someone could subvert the selftest's SYS_reboot heuristic by running
the test in a container :-)
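
To illustrate the capable() vs. ns_capable() distinction, here is a toy
user-space model, not kernel code. Every name in it (struct user_ns as
defined here, struct task, the model_*() functions) is hypothetical, and the
namespace rule is deliberately simplified: a task that holds CAP_SYS_BOOT in
a namespace is treated as capable over that namespace and its descendants,
and model_capable() is the same check pinned to the initial namespace.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy user namespace: parent == NULL marks the initial namespace. */
struct user_ns { const struct user_ns *parent; };

/* Toy task: its namespace and whether it holds CAP_SYS_BOOT there. */
struct task {
	const struct user_ns *ns;
	bool has_sys_boot;
};

static const struct user_ns model_init_ns = { NULL };

/* Models ns_capable(): is the task capable relative to @target? */
static bool model_ns_capable(const struct task *t, const struct user_ns *target)
{
	const struct user_ns *ns;

	assert(t && target);
	if (!t->has_sys_boot)
		return false;
	/* Walk up from @target; a hit means @target is t->ns or below it. */
	for (ns = target; ns; ns = ns->parent)
		if (ns == t->ns)
			return true;
	return false;
}

/* Models capable(): the same check against the initial namespace. */
static bool model_capable(const struct task *t)
{
	return model_ns_capable(t, &model_init_ns);
}

/* Gate modelled on the quoted KVM_CAP_VM_DISABLE_NX_HUGE_PAGES case. */
static int model_disable_nx_huge_pages(const struct task *t, unsigned long arg0)
{
	if (arg0)
		return -EINVAL;
	if (!model_capable(t))
		return -EPERM;
	return 0;
}
```

In this model a "root" task inside a container passes the namespace-relative
check for its own namespace but is refused by model_capable(), which is the
behavior being asked for: access to /dev/kvm from a container must not be
enough to turn off a host-wide mitigation.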