Received: by 2002:a05:6a10:2726:0:0:0:0 with SMTP id ib38csp5601402pxb; Mon, 28 Mar 2022 14:49:02 -0700 (PDT) X-Google-Smtp-Source: ABdhPJwVBol0q353oL3xddRPy2yMg4f2G0LihXq7pKQj4i5e5u/3k0k3iiNBohw2o/3H27KoHn6f X-Received: by 2002:a05:6214:e66:b0:441:7695:8eb7 with SMTP id jz6-20020a0562140e6600b0044176958eb7mr19453881qvb.127.1648504142163; Mon, 28 Mar 2022 14:49:02 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1648504142; cv=none; d=google.com; s=arc-20160816; b=seV4l6N5iK2osT3zJ+E1D7TATtXWTTh5D0xeuB74ncJEcuoJxTNL0p7sJEqSHXOrQH /F50YxnsNUQiZYi+KmDu4auhy6fGEG8zit9XmOYhk+p//TMhE5+Yj7M06Nu0tbtIZXJu d/hYl8+jpomSFPBGfZqnsPLwWxBTdkHUTyAyS1q+PkbI1oLeXWZi/45pLR72j8IJtvTk etAqHhik1pRI7NziMZ/BGfDgAKxfzzYSczpwp0yjmG+tagOGShAh7Ghy9Bmppbm+zuSD 8dzl2oDWlbvsHALA5j4Kj4f8VxlKmTxqwAib2McI5Wz2HvYJV8ihKNAi7dQFvyZTQkNA sEVg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:in-reply-to:content-disposition:mime-version :references:message-id:subject:cc:to:from:date:dkim-signature; bh=Z8M2eaDrvySsRMoJDq3gdwHu0GI4WHy75G6qEWnfrak=; b=RYDrNWU/vxSHxZiXxqBr0z/2zx7A8RHzR56ZlXuk6fNgyBMVwF017lbw7DH2JzHMwz yZm7NHlmjLDJRM+7LrtwEWhSL/ZV523OJxqiIuBO7mNzh3Kh8ClwbKaTFgZ7Ztar88qX mZvWuRu/kjhpTDbWU7MN9QOn1O1oAnoiItvjxVEcPqwycyr7M/1ZEPfjX1ayfsA9hHlT qFWfxS3YVsfwqgDcZhuYynPPqvQ9BeIcyAB4fu+hc3cZj1eGbYsFBSnhjKx45jrt4Tm6 w8t+hmYeldMZBNVLQGkWo1QvmaxImF7gB7MT5YVhbqLEUKTWKg7ZdO/lIhQ+9Z4owyX+ K7ug== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@google.com header.s=20210112 header.b=UDUMifkj; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Return-Path: Received: from lindbergh.monkeyblade.net (lindbergh.monkeyblade.net. [2620:137:e000::1:18]) by mx.google.com with ESMTPS id v5-20020a67ac05000000b00325b5c04d36si2099072vse.558.2022.03.28.14.49.01 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 28 Mar 2022 14:49:02 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:18 as permitted sender) client-ip=2620:137:e000::1:18; Authentication-Results: mx.google.com; dkim=pass header.i=@google.com header.s=20210112 header.b=UDUMifkj; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by lindbergh.monkeyblade.net (Postfix) with ESMTP id 983DAF4638; Mon, 28 Mar 2022 14:21:02 -0700 (PDT) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1345100AbiC1Ugd (ORCPT + 99 others); Mon, 28 Mar 2022 16:36:33 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:52242 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1344566AbiC1UgO (ORCPT ); Mon, 28 Mar 2022 16:36:14 -0400 Received: from mail-pg1-x531.google.com (mail-pg1-x531.google.com [IPv6:2607:f8b0:4864:20::531]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 80B9B51E6D for ; Mon, 28 Mar 2022 13:34:26 -0700 (PDT) Received: by mail-pg1-x531.google.com with SMTP id c11so13060886pgu.11 for ; Mon, 28 Mar 2022 13:34:26 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=date:from:to:cc:subject:message-id:references:mime-version :content-disposition:in-reply-to; bh=Z8M2eaDrvySsRMoJDq3gdwHu0GI4WHy75G6qEWnfrak=; b=UDUMifkjplq8rJ6a+odmLYcEQAXp6T6HvU89gOcSXenZ++rWsksgkLUZHakIakEqoM 092VAJn5hHH1DhG0qyrYvjjdzUBUTzHhobRgl4mFji/FjDh8x4ji3NraIQTC3GliwDWV Xz3nb7KzKPHat9T11HkkwoXt7zfpxcv+nKihAI4qEYPEXpAKFTNHpb+z4qaARrxt8mNy CjGeqFlOnjfbLFi5dMUza92eKf5khfq56qwIkGTtR8EEgTZedFRT/D/1Z5cL+i4JFiag 4pISPz3r8REj3A3chZuksVXQ6ivqXOxmhTgSlwb46eNpPu8JcJLx+83en6zYNBX5jV78 ZCig== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:date:from:to:cc:subject:message-id:references :mime-version:content-disposition:in-reply-to; bh=Z8M2eaDrvySsRMoJDq3gdwHu0GI4WHy75G6qEWnfrak=; b=AvhuhLjE4zC79j5Tm1rGgTpWA5sUqnw6BGNvlFkibiMFi1JNK0/oB4fbaUVaJ3kUIo 8aCFeilT2njNiXGMdQXHhk2uHch0E+BeWdCgnCifGfBiNcTejpLTJ8DNoYdbfTazc6cv WNCcI6AB/o0AzhEYVojp4uRQpskEW1JYBFFXuQkpViwhVbA312j5jdov4bEoCSsPoczu ZbgkGSVFX4ACT6n/2lCD3wSN+FivoWWPBj1GSeVkLPgW0Ofi0D7dZmXg9p2/ayNSePwV QlSSTRwyatba7d5pmC0tcgDkG/qU3WBG33EaznOoQM5SJuQSfd0XZLDH3VEnQe8Jnzru o5Uw== X-Gm-Message-State: AOAM530HLK68Pdg/4/V/hP8MIjTwxpLa43HLK+sVM2DatvvPLtCanw+h VlbPDQQC9DJQTu4T4+Y9HjXrnw== X-Received: by 2002:aa7:8556:0:b0:4fa:6d38:95e3 with SMTP id y22-20020aa78556000000b004fa6d3895e3mr24967832pfn.54.1648499665428; Mon, 28 Mar 2022 13:34:25 -0700 (PDT) Received: from google.com (254.80.82.34.bc.googleusercontent.com. [34.82.80.254]) by smtp.gmail.com with ESMTPSA id j9-20020a056a00130900b004f73df40914sm16770458pfu.82.2022.03.28.13.34.24 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 28 Mar 2022 13:34:24 -0700 (PDT) Date: Mon, 28 Mar 2022 20:34:21 +0000 From: David Matlack To: Ben Gardon Cc: linux-kernel@vger.kernel.org, kvm@vger.kernel.org, Paolo Bonzini , Peter Xu , Sean Christopherson , Jim Mattson , David Dunn , Jing Zhang , Junaid Shahid Subject: Re: [PATCH v2 11/11] KVM: x86/MMU: Require reboot permission to disable NX hugepages Message-ID: References: <20220321234844.1543161-1-bgardon@google.com> <20220321234844.1543161-12-bgardon@google.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20220321234844.1543161-12-bgardon@google.com> X-Spam-Status: No, score=-9.5 required=5.0 tests=BAYES_00,DKIMWL_WL_MED, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS, MAILING_LIST_MULTI,RDNS_NONE,SPF_HELO_NONE,T_SCC_BODY_TEXT_LINE, USER_IN_DEF_DKIM_WL autolearn=no autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Mon, Mar 21, 2022 at 04:48:44PM -0700, Ben Gardon wrote: > Ensure that the userspace actor attempting to disable NX hugepages has > permission to reboot the system. Since disabling NX hugepages would > allow a guest to crash the system, it is similar to reboot permissions. > > This approach is the simplest permission gating, but passing a file > descriptor opened for write for the module parameter would also work > well and be more precise. > The latter approach was suggested by Sean Christopherson. > > Suggested-by: Jim Mattson > Signed-off-by: Ben Gardon > --- > arch/x86/kvm/x86.c | 18 ++++++- > .../selftests/kvm/include/kvm_util_base.h | 2 + > tools/testing/selftests/kvm/lib/kvm_util.c | 7 +++ > .../selftests/kvm/x86_64/nx_huge_pages_test.c | 49 ++++++++++++++----- > .../kvm/x86_64/nx_huge_pages_test.sh | 2 +- > 5 files changed, 65 insertions(+), 13 deletions(-) > > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c > index 74351cbb9b5b..995f30667619 100644 > --- a/arch/x86/kvm/x86.c > +++ b/arch/x86/kvm/x86.c > @@ -4256,7 +4256,6 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext) > case KVM_CAP_SYS_ATTRIBUTES: > case KVM_CAP_VAPIC: > case KVM_CAP_ENABLE_CAP: > - case KVM_CAP_VM_DISABLE_NX_HUGE_PAGES: > r = 1; > break; > case KVM_CAP_EXIT_HYPERCALL: > @@ -4359,6 +4358,14 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext) > case KVM_CAP_DISABLE_QUIRKS2: > r = KVM_X86_VALID_QUIRKS; > break; > + case KVM_CAP_VM_DISABLE_NX_HUGE_PAGES: > + /* > + * Since the risk of disabling NX hugepages is a guest crashing > + * the system, ensure the userspace process has permission to > + * reboot the system. > + */ > + r = capable(CAP_SYS_BOOT); Duplicating this check and comment isn't ideal. I think it would be fine to unconditionally return true here (KVM, after all, does support the capability) and only check for CAP_SYS_BOOT when userspace attempts to enable the capability. > + break; > default: > break; > } > @@ -6050,6 +6057,15 @@ int kvm_vm_ioctl_enable_cap(struct kvm *kvm, > mutex_unlock(&kvm->lock); > break; > case KVM_CAP_VM_DISABLE_NX_HUGE_PAGES: > + /* > + * Since the risk of disabling NX hugepages is a guest crashing > + * the system, ensure the userspace process has permission to > + * reboot the system. > + */ > + if (!capable(CAP_SYS_BOOT)) { > + r = -EPERM; > + break; > + } > kvm->arch.disable_nx_huge_pages = true; > kvm_update_nx_huge_pages(kvm); > r = 0; > diff --git a/tools/testing/selftests/kvm/include/kvm_util_base.h b/tools/testing/selftests/kvm/include/kvm_util_base.h > index 72163ba2f878..4db8251c3ce5 100644 > --- a/tools/testing/selftests/kvm/include/kvm_util_base.h > +++ b/tools/testing/selftests/kvm/include/kvm_util_base.h Can you split out the selftests changes to a separate commit? I have a feeling you meant to :). > @@ -411,4 +411,6 @@ uint64_t vm_get_single_stat(struct kvm_vm *vm, const char *stat_name); > > uint32_t guest_get_vcpuid(void); > > +void vm_disable_nx_huge_pages(struct kvm_vm *vm); > + > #endif /* SELFTEST_KVM_UTIL_BASE_H */ > diff --git a/tools/testing/selftests/kvm/lib/kvm_util.c b/tools/testing/selftests/kvm/lib/kvm_util.c > index 9d72d1bb34fa..46a7fa08d3e0 100644 > --- a/tools/testing/selftests/kvm/lib/kvm_util.c > +++ b/tools/testing/selftests/kvm/lib/kvm_util.c > @@ -2765,3 +2765,10 @@ uint64_t vm_get_single_stat(struct kvm_vm *vm, const char *stat_name) > return value; > } > > +void vm_disable_nx_huge_pages(struct kvm_vm *vm) > +{ > + struct kvm_enable_cap cap = { 0 }; > + > + cap.cap = KVM_CAP_VM_DISABLE_NX_HUGE_PAGES; > + vm_enable_cap(vm, &cap); > +} > diff --git a/tools/testing/selftests/kvm/x86_64/nx_huge_pages_test.c b/tools/testing/selftests/kvm/x86_64/nx_huge_pages_test.c > index 2bcbe4efdc6a..5ce98f759bc8 100644 > --- a/tools/testing/selftests/kvm/x86_64/nx_huge_pages_test.c > +++ b/tools/testing/selftests/kvm/x86_64/nx_huge_pages_test.c Will you add a test to exercise the CAP_SYS_BOOT check? At minimum the selftest should check if it has CAP_SYS_BOOT and act accordingly (e.g. exiting with KSFT_SKIP). > @@ -57,13 +57,40 @@ static void check_split_count(struct kvm_vm *vm, int expected_splits) > expected_splits, actual_splits); > } > > +static void help(void) > +{ > + puts(""); > + printf("usage: nx_huge_pages_test.sh [-x]\n"); > + puts(""); > + printf(" -x: Allow executable huge pages on the VM.\n"); > + puts(""); > + exit(0); > +} > + > int main(int argc, char **argv) > { > struct kvm_vm *vm; > struct timespec ts; > + bool disable_nx = false; > + int opt; > + > + while ((opt = getopt(argc, argv, "x")) != -1) { > + switch (opt) { > + case 'x': > + disable_nx = true; > + break; > + case 'h': > + default: > + help(); > + break; > + } > + } > > vm = vm_create(VM_MODE_DEFAULT, DEFAULT_GUEST_PHY_PAGES, O_RDWR); > > + if (disable_nx) > + vm_disable_nx_huge_pages(vm); > + > vm_userspace_mem_region_add(vm, VM_MEM_SRC_ANONYMOUS_HUGETLB, > HPAGE_PADDR_START, HPAGE_SLOT, > HPAGE_SLOT_NPAGES, 0); > @@ -83,21 +110,21 @@ int main(int argc, char **argv) > * at 2M. > */ > run_guest_code(vm, guest_code0); > - check_2m_page_count(vm, 2); > - check_split_count(vm, 2); > + check_2m_page_count(vm, disable_nx ? 4 : 2); > + check_split_count(vm, disable_nx ? 0 : 2); > > /* > * guest_code1 is in the same huge page as data1, so it will cause > * that huge page to be remapped at 4k. > */ > run_guest_code(vm, guest_code1); > - check_2m_page_count(vm, 1); > - check_split_count(vm, 3); > + check_2m_page_count(vm, disable_nx ? 4 : 1); > + check_split_count(vm, disable_nx ? 0 : 3); > > /* Run guest_code0 again to check that is has no effect. */ > run_guest_code(vm, guest_code0); > - check_2m_page_count(vm, 1); > - check_split_count(vm, 3); > + check_2m_page_count(vm, disable_nx ? 4 : 1); > + check_split_count(vm, disable_nx ? 0 : 3); > > /* > * Give recovery thread time to run. The wrapper script sets > @@ -110,7 +137,7 @@ int main(int argc, char **argv) > /* > * Now that the reclaimer has run, all the split pages should be gone. > */ > - check_2m_page_count(vm, 1); > + check_2m_page_count(vm, disable_nx ? 4 : 1); > check_split_count(vm, 0); > > /* > @@ -118,13 +145,13 @@ int main(int argc, char **argv) > * again to check that pages are mapped at 2M again. > */ > run_guest_code(vm, guest_code0); > - check_2m_page_count(vm, 2); > - check_split_count(vm, 2); > + check_2m_page_count(vm, disable_nx ? 4 : 2); > + check_split_count(vm, disable_nx ? 0 : 2); > > /* Pages are once again split from running guest_code1. */ > run_guest_code(vm, guest_code1); > - check_2m_page_count(vm, 1); > - check_split_count(vm, 3); > + check_2m_page_count(vm, disable_nx ? 4 : 1); > + check_split_count(vm, disable_nx ? 0 : 3); > > kvm_vm_free(vm); > > diff --git a/tools/testing/selftests/kvm/x86_64/nx_huge_pages_test.sh b/tools/testing/selftests/kvm/x86_64/nx_huge_pages_test.sh > index 19fc95723fcb..29f999f48848 100755 > --- a/tools/testing/selftests/kvm/x86_64/nx_huge_pages_test.sh > +++ b/tools/testing/selftests/kvm/x86_64/nx_huge_pages_test.sh > @@ -14,7 +14,7 @@ echo 1 > /sys/module/kvm/parameters/nx_huge_pages_recovery_ratio > echo 100 > /sys/module/kvm/parameters/nx_huge_pages_recovery_period_ms > echo 200 > /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages > > -./nx_huge_pages_test > +./nx_huge_pages_test "${@}" > RET=$? > > echo $NX_HUGE_PAGES > /sys/module/kvm/parameters/nx_huge_pages > -- > 2.35.1.894.gb6a874cedc-goog >