Reply-To: Sean Christopherson <seanjc@google.com>
Date: Thu, 1 Jun 2023 17:58:59 -0700
Mime-Version: 1.0
X-Mailer: git-send-email 2.41.0.rc2.161.g9c6817b8e7-goog
Message-ID: <20230602005859.784190-1-seanjc@google.com>
Subject: [PATCH] KVM: x86/mmu: Add "never" option to allow sticky disabling of nx_huge_pages
From: Sean Christopherson <seanjc@google.com>
To: Sean Christopherson <seanjc@google.com>, Paolo Bonzini
Cc: kvm@vger.kernel.org, linux-kernel@vger.kernel.org, Li RongQing, Yong He, Robert Hoo, Kai Huang
Content-Type: text/plain; charset="UTF-8"

Add a "never" option to the nx_huge_pages module param to allow userspace
to do a one-way hard disabling of the mitigation, and don't create the
per-VM recovery threads when the mitigation is hard disabled.  Letting
userspace pinky swear that it doesn't want to enable the NX mitigation
(without reloading KVM) allows certain use cases to avoid the latency
problems associated with spawning a kthread for each VM.

E.g. in FaaS use cases, the guest kernel is trusted and the host may
create 100+ VMs per logical CPU, which can result in 100ms+ latencies
when a burst of VMs is created.

Reported-by: Li RongQing <lirongqing@baidu.com>
Closes: https://lore.kernel.org/all/1679555884-32544-1-git-send-email-lirongqing@baidu.com
Cc: Yong He
Cc: Robert Hoo
Cc: Kai Huang
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
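Usage sketch (a note, not part of the patch itself): assuming KVM is
built as the "kvm" module so that the param is exposed under
/sys/module/kvm/parameters, hard disabling the mitigation would look
like the following.  Per the code below, the "never" write must happen
before any VMs are created, and it is sticky until KVM is reloaded.

  # Fails with -EBUSY if any VMs already exist.
  echo never > /sys/module/kvm/parameters/nx_huge_pages

  # Reads back "never" once the mitigation is hard disabled.
  cat /sys/module/kvm/parameters/nx_huge_pages

  # All subsequent writes are rejected with -EPERM.
  echo auto > /sys/module/kvm/parameters/nx_huge_pages
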
 arch/x86/kvm/mmu/mmu.c | 41 ++++++++++++++++++++++++++++++++++++-----
 1 file changed, 36 insertions(+), 5 deletions(-)

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index c8961f45e3b1..2ed38916b904 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -58,6 +58,8 @@
 
 extern bool itlb_multihit_kvm_mitigation;
 
+static bool nx_hugepage_mitigation_hard_disabled;
+
 int __read_mostly nx_huge_pages = -1;
 static uint __read_mostly nx_huge_pages_recovery_period_ms;
 #ifdef CONFIG_PREEMPT_RT
@@ -67,12 +69,13 @@ static uint __read_mostly nx_huge_pages_recovery_ratio = 0;
 static uint __read_mostly nx_huge_pages_recovery_ratio = 60;
 #endif
 
+static int get_nx_huge_pages(char *buffer, const struct kernel_param *kp);
 static int set_nx_huge_pages(const char *val, const struct kernel_param *kp);
 static int set_nx_huge_pages_recovery_param(const char *val, const struct kernel_param *kp);
 
 static const struct kernel_param_ops nx_huge_pages_ops = {
         .set = set_nx_huge_pages,
-        .get = param_get_bool,
+        .get = get_nx_huge_pages,
 };
 
 static const struct kernel_param_ops nx_huge_pages_recovery_param_ops = {
@@ -6844,6 +6847,14 @@ static void mmu_destroy_caches(void)
         kmem_cache_destroy(mmu_page_header_cache);
 }
 
+static int get_nx_huge_pages(char *buffer, const struct kernel_param *kp)
+{
+        if (nx_hugepage_mitigation_hard_disabled)
+                return sprintf(buffer, "never\n");
+
+        return param_get_bool(buffer, kp);
+}
+
 static bool get_nx_auto_mode(void)
 {
         /* Return true when CPU has the bug, and mitigations are ON */
@@ -6860,15 +6871,29 @@ static int set_nx_huge_pages(const char *val, const struct kernel_param *kp)
         bool old_val = nx_huge_pages;
         bool new_val;
 
+        if (nx_hugepage_mitigation_hard_disabled)
+                return -EPERM;
+
         /* In "auto" mode deploy workaround only if CPU has the bug. */
-        if (sysfs_streq(val, "off"))
+        if (sysfs_streq(val, "off")) {
                 new_val = 0;
-        else if (sysfs_streq(val, "force"))
+        } else if (sysfs_streq(val, "force")) {
                 new_val = 1;
-        else if (sysfs_streq(val, "auto"))
+        } else if (sysfs_streq(val, "auto")) {
                 new_val = get_nx_auto_mode();
-        else if (kstrtobool(val, &new_val) < 0)
+        } else if (sysfs_streq(val, "never")) {
+                new_val = 0;
+
+                mutex_lock(&kvm_lock);
+                if (!list_empty(&vm_list)) {
+                        mutex_unlock(&kvm_lock);
+                        return -EBUSY;
+                }
+                nx_hugepage_mitigation_hard_disabled = true;
+                mutex_unlock(&kvm_lock);
+        } else if (kstrtobool(val, &new_val) < 0) {
                 return -EINVAL;
+        }
 
         __set_nx_huge_pages(new_val);
 
@@ -7006,6 +7031,9 @@ static int set_nx_huge_pages_recovery_param(const char *val, const struct kernel
         uint old_period, new_period;
         int err;
 
+        if (nx_hugepage_mitigation_hard_disabled)
+                return -EPERM;
+
         was_recovery_enabled = calc_nx_huge_pages_recovery_period(&old_period);
 
         err = param_set_uint(val, kp);
@@ -7161,6 +7189,9 @@ int kvm_mmu_post_init_vm(struct kvm *kvm)
 {
         int err;
 
+        if (nx_hugepage_mitigation_hard_disabled)
+                return 0;
+
         err = kvm_vm_create_worker_thread(kvm, kvm_nx_huge_page_recovery_worker, 0,
                                           "kvm-nx-lpage-recovery",
                                           &kvm->arch.nx_huge_page_recovery_thread);

base-commit: 39428f6ea9eace95011681628717062ff7f5eb5f
-- 
2.41.0.rc2.161.g9c6817b8e7-goog