From: Dan Williams
Date: Thu, 5 May 2022 18:15:38 -0700
Subject: Re: [PATCH v3 00/21] TDX host kernel support
To: Kai Huang
Cc: Dave Hansen, Linux Kernel Mailing List, KVM list,
    Sean Christopherson, Paolo Bonzini, "Brown, Len", "Luck, Tony",
    Rafael J Wysocki, Reinette Chatre, Peter Zijlstra, Andi Kleen,
    "Kirill A. Shutemov", Kuppuswamy Sathyanarayanan, Isaku Yamahata,
    Mike Rapoport
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, May 5, 2022 at 5:46 PM Kai Huang wrote:
>
> On Thu, 2022-05-05 at 17:22 -0700, Dan Williams wrote:
> > On Thu, May 5, 2022 at 3:14 PM Kai Huang wrote:
> > >
> > > Thanks for feedback!
> > >
> > > On Thu, 2022-05-05 at 06:51 -0700, Dan Williams wrote:
> > > > [ add Mike ]
> > > >
> > > >
> > > > On Thu, May 5, 2022 at 2:54 AM Kai Huang wrote:
> > > > [..]
> > > > >
> > > > > Hi Dave,
> > > > >
> > > > > Sorry to ping (trying to close this).
> > > > >
> > > > > Given we don't need to consider kmem-hot-add legacy PMEM after TDX module
> > > > > initialization, I think for now it's totally fine to exclude legacy PMEMs from
> > > > > TDMRs. The worst case is when someone tries to use them as TD guest backend
> > > > > directly, the TD will fail to create. IMO it's acceptable, as it is supposedly
> > > > > that no one should just use some random backend to run TD.
> > > >
> > > > The platform will already do this, right?
> > > >
> > >
> > > In the current v3 implementation, we don't have any code to handle memory
> > > hotplug, therefore nothing prevents people from adding legacy PMEMs as system
> > > RAM using kmem driver. In order to guarantee all pages managed by page
> >
> > That's the fundamental question I am asking why is "guarantee all
> > pages managed by page allocator are TDX memory". That seems overkill
> > compared to indicating the incompatibility after the fact.
>
> As I explained, the reason is I don't want to modify page allocator to
> distinguish TDX and non-TDX allocation, for instance, having to have a ZONE_TDX
> and GFP_TDX.

Right, TDX details do not belong at that level, but it will work
almost all the time if you do nothing to "guarantee" all TDX capable
pages all the time.

> KVM depends on host's page fault handler to allocate the page. In fact KVM only
> consumes PFN from host's page tables. For now only RAM is TDX memory. By
> guaranteeing all pages in page allocator is TDX memory, we can easily use
> anonymous pages as TD guest memory.

Again, TDX capable pages will be the overwhelming default, so why are
you worried about cluttering the memory hotplug path for niche corner
cases?

Consider the fact that end users can break the kernel by specifying
invalid memmap= command line options. The memory hotplug code does not
take any steps to add safety in those cases because there are already
too many ways it can go wrong. TDX is just one more corner case where
the memmap= user needs to be careful. Otherwise, it is up to the
platform firmware to make sure everything in the base memory map is
TDX capable, and then all you need is documentation about the failure
mode when extending "System RAM" beyond that baseline.

> shmem to support a new fd-based backend which doesn't require having to mmap()
> TD guest memory to host userspace:
>
> https://lore.kernel.org/kvm/20220310140911.50924-1-chao.p.peng@linux.intel.com/
>
> Also, besides TD guest memory, there are some per-TD control data structures
> (which must be TDX memory too) need to be allocated for each TD. Normal memory
> allocation APIs can be used for such allocation if we guarantee all pages in
> page allocator is TDX memory.

You don't need that guarantee, just check it after the fact and fail
if that assertion fails. It should almost always be the case that it
succeeds, and if it doesn't then something special is happening with
that system and the end user has effectively opted out of TDX
operation.

> > > allocator are all TDX memory, the v3 implementation needs to always include
> > > legacy PMEMs as TDX memory so that even people truly add legacy PMEMs as system
> > > RAM, we can still guarantee all pages in page allocator are TDX memory.
> >
> > Why?
>
> If we don't include legacy PMEMs as TDX memory, then after they are hot-added as
> system RAM using kmem driver, the assumption of "all pages in page allocator are
> TDX memory" is broken. A TD can be killed during runtime.

Yes, that is what the end user asked for. If they don't want that to
happen then the policy decision about using kmem needs to be updated
in userspace, rather than hard-coding that policy decision towards TDX
inside the kernel.
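
For illustration only, the "check it after the fact" approach argued for
above could be modeled roughly as below. This is a userspace Python
sketch, not kernel code: PfnRange, is_tdx_capable(), assign_page_to_td()
and the example PFN layout are all hypothetical names and values chosen
for the example, not existing kernel or TDX module interfaces. The
allocator stays unaware of TDX, and the check happens only when a page is
about to be handed to a TD.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class PfnRange:
    """Half-open range of page frame numbers [start, end). Hypothetical type."""
    start: int
    end: int

    def contains(self, pfn: int) -> bool:
        return self.start <= pfn < self.end


class TdxIncompatibleMemory(Exception):
    """Raised when a page offered to a TD lies outside TDX-capable memory."""


def is_tdx_capable(pfn: int, tdx_ranges: Iterable[PfnRange]) -> bool:
    """Return True if pfn falls inside any TDX-capable range."""
    return any(r.contains(pfn) for r in tdx_ranges)


def assign_page_to_td(pfn: int, tdx_ranges: Iterable[PfnRange]) -> int:
    """Check at the point of use instead of constraining the allocator.

    The page may come from any ordinary allocation path; the TDX-specific
    check is made only here, and the operation fails if it does not hold.
    """
    if not is_tdx_capable(pfn, tdx_ranges):
        raise TdxIncompatibleMemory(
            f"pfn {pfn:#x} is outside the TDX-capable ranges"
        )
    return pfn


# Assumed layout for the example: boot-time RAM is TDX capable, while a
# range hot-added later (for example via kmem) is not.
BOOT_RAM = [PfnRange(0x0, 0x80000)]
HOT_ADDED = PfnRange(0x100000, 0x180000)
```

In this model the common case (a page from boot-time RAM) succeeds with
no extra policy in the allocation path, and only a page from the
hot-added range causes TD creation to fail, which matches the failure
mode described above.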