Message-ID: <24c80d16-733b-4036-8057-075a0dab3b4d@intel.com>
Date: Tue, 9 Apr 2024 10:57:55 +0800
X-Mailing-List: linux-kernel@vger.kernel.org
Subject: Re: [ANNOUNCE] PUCK Notes - 2024.04.03 - TDX Upstreaming Strategy
To: Sean Christopherson
Cc: kvm@vger.kernel.org, linux-kernel@vger.kernel.org, Rick P Edgecombe,
 Isaku Yamahata, Wei W Wang, David Skidmore, Steve Rutherford, Pankaj Gupta
References: <20240405165844.1018872-1-seanjc@google.com>
 <73b40363-1063-4cb3-b744-9c90bae900b5@intel.com>
From: Xiaoyao Li

On 4/9/2024 12:20 AM, Sean Christopherson wrote:
> On Sun, Apr 07, 2024, Xiaoyao Li wrote:
>> On 4/6/2024 12:58 AM, Sean Christopherson wrote:
>>>  - For guest MAXPHYADDR vs. GPAW, rely on KVM_GET_SUPPORTED_CPUID to enumerate
>>>    the usable MAXPHYADDR[2], and simply refuse to enable TDX if the TDX Module
>>>    isn't compatible.  Specifically, if MAXPHYADDR=52, 5-level paging is enabled,
>>>    but the TDX-Module only allows GPAW=0, i.e. only supports 4-level paging.
>>
>> So userspace can get supported GPAW from usable MAXPHYADDR, i.e.,
>> CPUID(0x8000_0008).EAX[23:16] of KVM_GET_SUPPORTED_CPUID:
>>  - if usable MAXPHYADDR == 52, supported GPAW is 0 and 1.
>>  - if usable MAXPHYADDR <= 48, supported GPAW is only 0.
>>
>> There is another thing that needs to be discussed. How does userspace
>> configure GPAW for a TD guest?
>>
>> Currently, KVM uses CPUID(0x8000_0008).EAX[7:0] in struct
>> kvm_tdx_init_vm::cpuid.entries[] of IOCTL(KVM_TDX_INIT_VM) to deduce the
>> GPAW:
>>
>>     int max_pa = 36;
>>     entry = kvm_find_cpuid_entry2(cpuid->entries, cpuid->nent, 0x80000008, 0);
>>     if (entry)
>>         max_pa = entry->eax & 0xff;
>>
>>     ...
>>     if (!cpu_has_vmx_ept_5levels() && max_pa > 48)
>>         return -EINVAL;
>>     if (cpu_has_vmx_ept_5levels() && max_pa > 48) {
>>         td_params->eptp_controls |= VMX_EPTP_PWL_5;
>>         td_params->exec_controls |= TDX_EXEC_CONTROL_MAX_GPAW;
>>     } else {
>>         td_params->eptp_controls |= VMX_EPTP_PWL_4;
>>     }
>>
>> The code implies that KVM allows the provided CPUID(0x8000_0008).EAX[7:0] to
>> be any value (when 5-level EPT is supported). When it is > 48, KVM configures
>> the GPAW of the TD to 1, otherwise to 0.
>>
>> However, the virtual value of CPUID(0x8000_0008).EAX[7:0] inside a TD is
>> always the native value of the hardware (for current TDX).
>>
>> So if we want to keep this behavior, we need to document somewhere that
>> CPUID(0x8000_0008).EAX[7:0] in struct kvm_tdx_init_vm::cpuid.entries[] of
>> IOCTL(KVM_TDX_INIT_VM) is only for configuring GPAW, not for userspace to
>> configure the virtual CPUID value for TD VMs.
>>
>> Another option is that KVM doesn't allow userspace to configure
>> CPUID(0x8000_0008).EAX[7:0]. Instead, it provides a gpaw field in struct
>> kvm_tdx_init_vm for userspace to configure directly.
>>
>> Which do you prefer?
>
> Hmm, neither.  I think the best approach is to build on Gerd's series to have KVM
> select 4-level vs. 5-level based on the enumerated guest.MAXPHYADDR, not on
> host.MAXPHYADDR.

I see no difference between using guest.MAXPHYADDR (EAX[23:16]) and using
host.MAXPHYADDR (EAX[7:0]) to determine the GPAW (and EPT level) for a TD
guest. TDX differs from non-TDX VMs here: whichever of the two values
userspace passes can only be used to configure the GPAW and EPT level of the
TD; it won't be reflected in CPUID inside the TD.

So I take it that you prefer the former option over a dedicated GPAW field.

> With a moderate amount of refactoring, cache/compute guest_maxphyaddr as:
>
>   static void kvm_vcpu_refresh_maxphyaddr(struct kvm_vcpu *vcpu)
>   {
>           struct kvm_cpuid_entry2 *best;
>
>           best = kvm_find_cpuid_entry(vcpu, 0x80000000);
>           if (!best || best->eax < 0x80000008)
>                   goto not_found;
>
>           best = kvm_find_cpuid_entry(vcpu, 0x80000008);
>           if (!best)
>                   goto not_found;
>
>           vcpu->arch.maxphyaddr = best->eax & GENMASK(7, 0);
>
>           /* guest.MAXPHYADDR lives in EAX[23:16]; 0 means "not enumerated". */
>           if (best->eax & GENMASK(23, 16))
>                   vcpu->arch.guest_maxphyaddr = (best->eax & GENMASK(23, 16)) >> 16;
>           else
>                   vcpu->arch.guest_maxphyaddr = vcpu->arch.maxphyaddr;
>
>           return;
>
>   not_found:
>           vcpu->arch.maxphyaddr = KVM_X86_DEFAULT_MAXPHYADDR;
>           vcpu->arch.guest_maxphyaddr = KVM_X86_DEFAULT_MAXPHYADDR;
>   }
>
> and then use vcpu->arch.guest_maxphyaddr instead of vcpu->arch.maxphyaddr when
> selecting the TDP level.
>
>   static inline int kvm_mmu_get_tdp_level(struct kvm_vcpu *vcpu)
>   {
>           /* tdp_root_level is architecture forced level, use it if nonzero */
>           if (tdp_root_level)
>                   return tdp_root_level;
>
>           /*
>            * Use 5-level TDP if and only if it's useful/necessary.  Definitely a
>            * more verbose comment here.
>            */
>           if (max_tdp_level == 5 && vcpu->arch.guest_maxphyaddr <= 48)
>                   return 4;
>
>           return max_tdp_level;
>   }
>
> The only question is whether or not the behavior needs to be opt-in via a new
> capability, e.g. in case there is some weird usage where userspace enumerates
> guest.MAXPHYADDR < host.MAXPHYADDR but still wants/needs 5-level paging.  I highly
> doubt such a use case exists though.
>
> I'll get Gerd's series applied, and will post a small series to implement the
> above later this week.
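
For anyone who wants to poke at the proposed behavior outside the kernel, here is a minimal userspace sketch of the decode-and-select logic quoted above. It is only an illustration under assumptions: decode_maxphyaddr(), pick_tdp_level() and gpaw_for_level() are hypothetical names, not KVM functions; the 36-bit default is an assumed placeholder for the "leaf not found" case; guest.MAXPHYADDR is read from EAX[23:16] as discussed in this thread; and GPAW=1 is paired with 5-level EPT as in the KVM_TDX_INIT_VM snippet.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed placeholder used when CPUID leaf 0x80000008 is not enumerated. */
#define DEFAULT_MAXPHYADDR 36u

struct maxphyaddr_info {
	unsigned int maxphyaddr;       /* CPUID(0x8000_0008).EAX[7:0] */
	unsigned int guest_maxphyaddr; /* EAX[23:16], or maxphyaddr if that is 0 */
};

/* Hypothetical helper: decode both address widths from the raw EAX value. */
static struct maxphyaddr_info decode_maxphyaddr(int leaf_present, uint32_t eax)
{
	struct maxphyaddr_info info = { DEFAULT_MAXPHYADDR, DEFAULT_MAXPHYADDR };
	unsigned int guest;

	if (!leaf_present)
		return info;

	info.maxphyaddr = eax & 0xffu;
	guest = (eax >> 16) & 0xffu;

	/* A zero guest field means "not enumerated": fall back to EAX[7:0]. */
	info.guest_maxphyaddr = guest ? guest : info.maxphyaddr;
	return info;
}

/* Hypothetical helper: use 5-level TDP only if the guest needs > 48 bits. */
static int pick_tdp_level(int max_tdp_level, unsigned int guest_maxphyaddr)
{
	assert(max_tdp_level == 4 || max_tdp_level == 5);

	if (max_tdp_level == 5 && guest_maxphyaddr <= 48)
		return 4;

	return max_tdp_level;
}

/* Hypothetical helper: GPAW=1 goes with 5-level EPT, GPAW=0 with 4-level. */
static int gpaw_for_level(int tdp_level)
{
	return tdp_level == 5;
}
```

For example, EAX = 0x00303034 decodes to maxphyaddr = 52 and guest_maxphyaddr = 48, so pick_tdp_level(5, 48) returns 4 and the resulting GPAW is 0, even though the host supports 5-level paging.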