Message-ID: <6bb89ca6e7346f4334f06ea293f29fd12df70fe4.camel@intel.com>
Subject: Re: [PATCH v3 00/21] TDX host kernel support
From: Kai Huang
To: Dave Hansen, linux-kernel@vger.kernel.org, kvm@vger.kernel.org
Cc: seanjc@google.com, pbonzini@redhat.com, len.brown@intel.com,
	tony.luck@intel.com, rafael.j.wysocki@intel.com,
	reinette.chatre@intel.com, dan.j.williams@intel.com,
	peterz@infradead.org, ak@linux.intel.com,
	kirill.shutemov@linux.intel.com,
	sathyanarayanan.kuppuswamy@linux.intel.com, isaku.yamahata@intel.com
Date: Thu, 05 May 2022 21:54:41 +1200
In-Reply-To: <8bf596b45f68363134f431bcc550e16a9a231b80.camel@intel.com>
References: <522e37eb-68fc-35db-44d5-479d0088e43f@intel.com>
	<9b388f54f13b34fe684ef77603fc878952e48f87.camel@intel.com>
	<664f8adeb56ba61774f3c845041f016c54e0f96e.camel@intel.com>
	<1b681365-ef98-ec78-96dc-04e28316cf0e@intel.com>
	<8bf596b45f68363134f431bcc550e16a9a231b80.camel@intel.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, 2022-05-04 at 13:15 +1200, Kai Huang wrote:
> On Tue, 2022-05-03 at 17:25 -0700, Dave Hansen wrote:
> > On 5/3/22 16:59, Kai Huang wrote:
> > > Should be:
> > >
> > > 	// prevent racing with TDX module initialization
> > > 	tdx_init_disable();
> > >
> > > 	if (tdx_module_initialized()) {
> > > 		if (new_memory_resource in TDMRs)
> > > 			// allow memory hot-add
> > > 		else
> > > 			// reject memory hot-add
> > > 	} else if (new_memory_resource in CMR) {
> > > 		// add new memory to TDX memory so it can be
> > > 		// included into TDMRs
> > >
> > > 		// allow memory hot-add
> > > 	}
> > > 	else
> > > 		// reject memory hot-add
> > >
> > > 	tdx_module_enable();
> > >
> > > And when the platform doesn't support TDX, always allow memory hot-add.
> >
> > I don't think it even needs to be *that* complicated.
> >
> > It could just be winner take all: if TDX is initialized first, don't
> > allow memory hotplug. If memory hotplug happens first, don't allow TDX
> > to be initialized.
> >
> > That's fine at least for a minimal patch set.
>
> OK. This should also work.
>
> We will need tdx_init_disable(), which grabs the mutex to prevent TDX module
> initialization from running concurrently, and to disable TDX module
> initialization once and for all.
>
> >
> > What you have up above is probably where you want to go eventually, but
> > it means doing things like augmenting the e820 since it's the single
> > source of truth for creating the TDMRs right now.
> >
>
> Yes. But in this case, I am thinking we should probably switch from
> consulting e820 to consulting memblock. The advantage of using e820 is that
> it's easy to include legacy PMEM as TDX memory, but the disadvantage is (as
> you can see in the e820_for_each_mem() loop) that I am actually merging
> contiguous RAM entries of different types in order to be consistent with the
> behavior of e820_memblock_setup(). This is not nice.
>
> If memory hot-add and TDX can only be one winner, legacy PMEM actually won't be
> used as TDX memory anyway now.
> The reason is that a TDX guest will very likely need
> to use the new fd-based backend (see my reply in other emails), not just
> some random backend. To me it's totally fine to not support using legacy PMEM
> directly as TD guest backend (and if we create a TD with real NVDIMM as backend
> using dax, the TD cannot be created anyway). Given we cannot kmem-hot-add
> legacy PMEM back as system RAM, to me it's pointless to include legacy PMEM in
> TDMRs.
>
> In this case, we can just create TDMRs based on memblock directly. One problem
> is that memblock will be gone after the kernel boots, but this can be solved
> either by keeping the memblock, or by building the TDX memory early while
> memblock is still alive.
>
> Btw, as it's likely we will need to support different sources of TDX
> memory (CXL memory, etc), I think eventually we will need some data structures
> to represent a TDX memory block, and APIs to add those blocks to the whole TDX
> memory, so that TDX memory ranges from different sources can be added before
> initializing the TDX module.
>
> 	struct tdx_memblock {
> 		struct list_head list;
> 		phys_addr_t start;
> 		phys_addr_t end;
> 		int nid;
> 		...
> 	};
>
> 	struct tdx_memory {
> 		struct list_head tmb_list;
> 		...
> 	};
>
> 	int tdx_memory_add_memblock(start, end, nid, ...);
>
> And the TDMRs can be created based on 'struct tdx_memory'.
>
> For now, we only need to add memblock to TDX memory.
>
> Any comments?
>

Hi Dave,

Sorry to ping (trying to close this).

Given we don't need to consider kmem-hot-adding legacy PMEM after TDX module
initialization, I think for now it's totally fine to exclude legacy PMEM from
TDMRs. The worst case is that when someone tries to use it as TD guest backend
directly, the TD will fail to be created. IMO that's acceptable, as supposedly
no one should just use some random backend to run a TD.

I think that without needing to include legacy PMEM, it's better to get all TDX
memory blocks based on memblock rather than e820.
The pages managed by the page allocator are from memblock anyway (excluding
those from memory hotplug).

And I also think it makes more sense to introduce 'tdx_memblock' and
'tdx_memory' data structures to gather all TDX memory blocks during boot, while
memblock is still alive. When the TDX module is initialized at runtime, TDMRs
can be created based on the 'struct tdx_memory', which contains all TDX memory
blocks we gathered from memblock during boot. This is also more flexible for
supporting TDX memory from other sources, such as CXL memory, in the future.

Please let me know if you have any objections. Thanks!

-- 
Thanks,
-Kai
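
For reference, the two ideas discussed above (the "winner take all" rule
between TDX module initialization and memory hot-add, and gathering TDX memory
blocks into a 'struct tdx_memory') could be sketched roughly as below. This is
a minimal userspace mock, not kernel code and not the patch set's
implementation. The names tdx_try_init() and memory_try_hot_add() are
hypothetical, a plain singly linked list stands in for struct list_head, and
the mutex that tdx_init_disable() would take is only noted in comments.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

typedef uint64_t phys_addr_t;	/* stand-in for the kernel type */

/* One contiguous range of TDX-usable memory (mirrors the proposed struct). */
struct tdx_memblock {
	struct tdx_memblock *next;	/* kernel code would use struct list_head */
	phys_addr_t start;		/* inclusive */
	phys_addr_t end;		/* exclusive */
	int nid;
};

/* All TDX memory blocks gathered so far, kept sorted by start address. */
struct tdx_memory {
	struct tdx_memblock *tmb_list;
};

/* Add [start, end) to @tmem; reject empty ranges and overlaps. */
int tdx_memory_add_memblock(struct tdx_memory *tmem, phys_addr_t start,
			    phys_addr_t end, int nid)
{
	struct tdx_memblock **pos = &tmem->tmb_list;
	struct tdx_memblock *tmb;

	if (start >= end)
		return -EINVAL;

	/* Skip blocks that lie entirely below the new range. */
	while (*pos && (*pos)->end <= start)
		pos = &(*pos)->next;

	/* The first block not below the new range must begin at or after @end. */
	if (*pos && (*pos)->start < end)
		return -EEXIST;

	tmb = malloc(sizeof(*tmb));
	if (!tmb)
		return -ENOMEM;

	tmb->start = start;
	tmb->end = end;
	tmb->nid = nid;
	tmb->next = *pos;
	*pos = tmb;
	return 0;
}

/* Winner take all: whichever of TDX init / memory hot-add runs first wins. */
enum winner { WINNER_NONE, WINNER_TDX, WINNER_HOTPLUG };

static enum winner winner = WINNER_NONE;	/* a mutex would guard this */

/* Hypothetical: allowed unless memory hot-add has already happened. */
int tdx_try_init(void)
{
	if (winner == WINNER_HOTPLUG)
		return -EBUSY;
	winner = WINNER_TDX;
	return 0;
}

/* Hypothetical: allowed unless the TDX module has already been initialized. */
int memory_try_hot_add(void)
{
	if (winner == WINNER_TDX)
		return -EPERM;
	winner = WINNER_HOTPLUG;
	return 0;
}
```

In this sketch, boot code would call tdx_memory_add_memblock() once per
memblock region, and TDMRs would later be built by walking tmb_list. Keeping
the list sorted and free of overlaps at insertion time is one possible design
choice; it would let the TDMR construction step assume ordered, disjoint
ranges.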