Date: Wed, 12 Jul 2023 16:47:24 -0700
Subject: Re: [PATCH] KVM: x86/mmu: Add "never" option to allow sticky disabling of nx_huge_pages
From: Sean Christopherson
To: Like Xu
Cc: Luiz Capitulino, Paolo Bonzini, kvm@vger.kernel.org, linux-kernel@vger.kernel.org, Li RongQing, Yong He, Robert Hoo, Kai Huang

On Wed, Jul 12, 2023, Like Xu wrote:
> On 2023/6/15 03:07, Sean Christopherson wrote:
> > On Wed, Jun 14, 2023, Luiz Capitulino wrote:
> > > > Applied to kvm-x86 mmu. I kept the default as "auto" for now, as that can go on
> > > > top and I don't want to introduce that change this late in the cycle. If no one
> > > > beats me to the punch (hint, hint ;-) ), I'll post a patch to make "never" the
> > > > default for unaffected hosts so that we can discuss/consider that change for 6.6.
> > > >
> > > Thanks Sean, I agree with the plan. I could give a try on the patch if you'd like.
> >
> > Yes please, thanks!
>
> As a KVM/x86 *feature*, playing with splitting and reconstructing large
> pages have other potential user scenarios, e.g. for performance test
> comparisons in a easier approach, not just for itlb_multihit mitigation.

Enabling and disabling dirty logging is a far better tool for that, as it gives
userspace much more explicit control over what pages are split/reconstituted,
and when.

> On unaffected machines (ICX and later), nx_huge_pages is already "N",
> and turning it into "never" doesn't help materially in the mitigation
> implementation, but loses flexibility.

I'm becoming more and more convinced that losing the flexibility is perfectly
acceptable. There's a very good argument to be made that mitigating DoS attacks
from the guest kernel should be done several levels up, e.g. by refusing to create
VMs for a customer that is bringing down hosts. As Jim has pointed out, plugging
the hole only works if you are 100% confident there are no other holes, and will
never be other holes.

> IMO, the real issue here is that the kernel thread
> "kvm-nx-lpage-recovery" is created unconditionally. We also need to be
> aware of the existence of this commit 084cc29f8bbb ("KVM: x86/MMU: Allow
> NX huge pages to be disabled on a per-vm basis").
>
> One of the technical proposals is to defer kvm_vm_create_worker_thread()
> to kvm_mmu_create() or kvm_init_mmu(), based on
> kvm->arch.disable_nx_huge_pages, even until guest paging mode is enabled
> on the first vcpu.
>
> Is this step worth taking ?

IMO, no. In hindsight, adding KVM_CAP_VM_DISABLE_NX_HUGE_PAGES was likely a
mistake; requiring CAP_SYS_BOOT makes it annoyingly difficult to safely use the
capability. My preference at this point is to make changes to the NX hugepage
mitigation only when there is a substantial benefit to an already-deployed usecase.