Message-ID: <4d0c7316-3564-ef27-1113-042019d583dc@intel.com>
Date: Fri, 29 Apr 2022 11:34:50 -0700
Subject: Re: [PATCH v3 00/21] TDX host kernel support
From: Dave Hansen
To: Dan Williams
Cc: Kai Huang, Linux Kernel Mailing List, KVM list, Sean Christopherson,
    Paolo Bonzini, "Brown, Len", "Luck, Tony", Rafael J Wysocki,
    Reinette Chatre, Peter Zijlstra, Andi Kleen, "Kirill A. Shutemov",
    Kuppuswamy Sathyanarayanan, Isaku Yamahata
References: <522e37eb-68fc-35db-44d5-479d0088e43f@intel.com>
    <92af7b22-fa8a-5d42-ae15-8526abfd2622@intel.com>
    <4a5143cc-3102-5e30-08b4-c07e44f1a2fc@intel.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On 4/29/22 10:48, Dan Williams wrote:
>> But, neither of those really help with, say, a device-DAX mapping of
>> TDX-*IN*capable memory handed to KVM. The "new syscall" would just
>> throw up its hands and leave users with the same result: TDX can't be
>> used. The new sysfs ABI for NUMA nodes wouldn't clearly apply to
>> device-DAX because they don't respect the NUMA policy ABI.
> They do have "target_node" attributes to associate node specific
> metadata, and could certainly express target_node capabilities in its
> own ABI. Then it's just a matter of making pfn_to_nid() do the right
> thing so KVM kernel side can validate the capabilities of all inbound
> pfns.

Let's walk through how this would work with today's kernel on tomorrow's
hardware, without KVM validating PFNs:

1. daxaddr = mmap("/dev/dax1234")
2. kvmfd = open("/dev/kvm")
3. ioctl(KVM_SET_USER_MEMORY_REGION, { daxaddr });
4. guest starts running
5. guest touches 'daxaddr'
6. Page fault handler maps 'daxaddr'
7. KVM finds new 'daxaddr' PTE
8. TDX code tries to add physical address to Secure-EPT
9. TDX "SEAMCALL" fails because page is not convertible
10. Guest dies

All we can do to improve on that is call something that pledges to only
map convertible memory at 'daxaddr'.
We can't *actually* validate the physical addresses at mmap() time or
even KVM_SET_USER_MEMORY_REGION-time because the memory might not have
been allocated.

Those pledges are hard for anonymous memory though. To fulfill the
pledge, we not only have to validate that the NUMA policy is compatible
at KVM_SET_USER_MEMORY_REGION, we also need to decline changes to the
policy that might undermine the pledge.
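
To make the two halves of that pledge concrete, here is a minimal
userspace sketch in C. It is a toy model under stated assumptions, not
kernel or KVM code: every name in it (toy_platform, toy_region,
toy_register_region, toy_set_policy) is hypothetical, "convertible" is
reduced to a per-node flag, and the NUMA policy is reduced to a bitmask
of allowed nodes. It only illustrates the shape of the logic: check the
policy once at registration time, then refuse later policy changes that
would let non-convertible memory in.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_NODES 8

/* Hypothetical: per-node flag, true if all memory on the node is convertible. */
struct toy_platform {
	bool node_convertible[TOY_MAX_NODES];
};

/* Hypothetical: a memory region with a simplified NUMA policy. */
struct toy_region {
	unsigned int allowed_nodes;	/* bitmask of nodes the policy may use */
	bool pledged;			/* set once the region is registered */
};

/* A policy is acceptable only if every node it allows is convertible. */
bool toy_policy_is_convertible(const struct toy_platform *p, unsigned int mask)
{
	assert(p != NULL);

	/* Reject empty masks and masks naming nodes this toy does not model. */
	if (mask == 0 || (mask >> TOY_MAX_NODES) != 0)
		return false;

	for (unsigned int n = 0; n < TOY_MAX_NODES; n++) {
		if ((mask & (1u << n)) && !p->node_convertible[n])
			return false;
	}
	return true;
}

/* Stand-in for the registration-time check; returns 0 on success, -1 on refusal. */
int toy_register_region(const struct toy_platform *p, struct toy_region *r)
{
	assert(r != NULL);

	if (!toy_policy_is_convertible(p, r->allowed_nodes))
		return -1;

	r->pledged = true;
	return 0;
}

/* Stand-in for a later policy change; declined if it would break the pledge. */
int toy_set_policy(const struct toy_platform *p, struct toy_region *r,
		   unsigned int new_mask)
{
	assert(r != NULL);

	if (r->pledged && !toy_policy_is_convertible(p, new_mask))
		return -1;

	r->allowed_nodes = new_mask;
	return 0;
}
```

With node 0 convertible and node 1 not, registering a region whose
policy allows only node 0 succeeds, and a later attempt to widen the
policy to nodes 0 and 1 is declined while the original policy stays in
place. Nothing in this model looks at physical addresses, which matches
the point above that the memory may not have been allocated yet when
the pledge is made.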