Message-ID: <9ab71fee-be9f-4afc-8098-ad9d6b667d46@linux.microsoft.com>
Date: Mon, 4 Dec 2023 20:07:38 +0100
Subject: Re: [PATCH v1 1/3] x86/tdx: Check for TDX partitioning during early TDX init
From: Jeremi Piotrowski
To: "Reshetova, Elena", "linux-kernel@vger.kernel.org", Borislav Petkov, Dave Hansen, "H. Peter Anvin", Ingo Molnar, "Kirill A. Shutemov", Michael Kelley, Nikolay Borisov, Peter Zijlstra, Thomas Gleixner, Tom Lendacky, "x86@kernel.org", "Cui, Dexuan"
Cc: "linux-hyperv@vger.kernel.org", "stefan.bader@canonical.com", "tim.gardner@canonical.com", "roxana.nicolescu@canonical.com", "cascardo@canonical.com", "kys@microsoft.com", "haiyangz@microsoft.com", "wei.liu@kernel.org", "sashal@kernel.org", "stable@vger.kernel.org"
References: <20231122170106.270266-1-jpiotrowski@linux.microsoft.com>

On 04/12/2023 10:17, Reshetova, Elena wrote:
>> Check for additional CPUID bits to identify TDX guests running with Trust
>> Domain (TD) partitioning enabled.
>> TD partitioning is like nested virtualization inside the Trust Domain so
>> there is a L1 TD VM(M) and there can be L2 TD VM(s).
>>
>> In this arrangement we are not guaranteed that the TDX_CPUID_LEAF_ID is
>> visible to Linux running as an L2 TD VM. This is because a majority of TDX
>> facilities are controlled by the L1 VMM and the L2 TDX guest needs to use
>> TD partitioning aware mechanisms for what's left. So currently such guests
>> do not have X86_FEATURE_TDX_GUEST set.
>
> Back to this concrete patch. Why cannot L1 VMM emulate the correct value of
> the TDX_CPUID_LEAF_ID to L2 VM? It can do this per TDX partitioning arch.
> How do you handle this and other CPUID calls call currently in L1? Per spec,
> all CPUIDs calls from L2 will cause L2 --> L1 exit, so what do you do in L1?

The disclaimer here is that I don't have access to the paravisor (L1) code.
But to the best of my knowledge, L1 handles CPUID calls either by calling
into the TDX module or by synthesizing a response itself.
TDX_CPUID_LEAF_ID is deliberately not provided to the L2 guest, in order to
distinguish a guest that is solely responsible for every TDX mechanism
(running at L1) from one running at L2 that has to cooperate with L1. More
below.

>
> Given that you do that simple emulation, you already end up with TDX guest
> code being activated. Next you can check what features you wont be able to
> provide in L1 and create simple emulation calls for the TDG calls that must be
> supported and cannot return error. The biggest TDG call (TDVMCALL) is already
> direct call into L0 VMM, so this part doesn’t require L1 VMM support.

I don't see anything in the TD-partitioning spec that gives the TDX guest a
way to detect whether it's running at L2 or L1, or to check whether
TDVMCALLs go to L0 or L1. So in any case this requires an extra CPUID call
to establish the environment. Given that, exposing TDX_CPUID_LEAF_ID to the
guest doesn't help.
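
To make the detection argument concrete, here is a minimal user-space
sketch, not the patch itself. It assumes the kernel's definitions of
TDX_CPUID_LEAF_ID (0x21) and the 12-byte "IntelTDX    " signature returned
in EBX, EDX, ECX. The `td_env` enum, the `cpuid_regs` struct and the
`partition_hint` flag are hypothetical names for illustration;
`partition_hint` stands in for whatever additional CPUID bits the L1
exposes to advertise TD partitioning.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define TDX_CPUID_LEAF_ID 0x21u          /* leaf discussed in this thread */
#define TDX_IDENT         "IntelTDX    " /* 12-byte signature, space padded */

static_assert(sizeof(TDX_IDENT) - 1 == 3 * sizeof(uint32_t),
	      "signature must fill exactly three registers");

/* Hypothetical classification, not a kernel type. */
enum td_env {
	TD_ENV_NONE,            /* not a TDX guest as far as we can tell */
	TD_ENV_L1,              /* guest owns every TDX mechanism itself */
	TD_ENV_L2_PARTITIONED,  /* guest must cooperate with an L1 paravisor */
};

/* Register values as returned by one CPUID query (hypothetical struct). */
struct cpuid_regs {
	uint32_t eax, ebx, ecx, edx;
};

/* True if the registers carry the TDX signature in EBX, EDX, ECX order. */
static bool tdx_leaf_matches(const struct cpuid_regs *r)
{
	uint32_t sig[3] = { r->ebx, r->edx, r->ecx };

	return memcmp(sig, TDX_IDENT, sizeof(sig)) == 0;
}

/*
 * tdx_leaf: result of querying TDX_CPUID_LEAF_ID.
 * partition_hint: assumed result of the extra CPUID check for partitioning.
 */
static enum td_env classify_td_env(const struct cpuid_regs *tdx_leaf,
				   bool partition_hint)
{
	if (tdx_leaf_matches(tdx_leaf))
		return TD_ENV_L1;
	if (partition_hint)
		return TD_ENV_L2_PARTITIONED;
	return TD_ENV_NONE;
}
```

The point of the sketch is that the TDX leaf alone cannot separate the two
cases: some second, partitioning-specific query is needed either way.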

I'll give some examples of where the idea of emulating a TDX environment
without attempting L1-L2 cooperation breaks down.

hlt: if the guest issues a hlt TDVMCALL it goes to L0, but if it issues a
classic hlt it traps to L1. The hlt should definitely go to L1 so that L1
has a chance to do housekeeping.

map gpa: say the guest uses the MAP_GPA TDVMCALL. This goes to L0, not to
L1, which is the entity that actually needs to have a say in performing the
conversion. Even if L0 forwarded the request, L1 could not act on it because
of the CoCo threat model. So L1 and L2 get out of sync. The only safe
approach is for L2 to use a different mechanism to trap to L1 explicitly.

Having a paravisor is required to support a TPM, and having TDVMCALLs go to
L0 is required to make performance viable for real workloads.

>
> Until we really see what breaks with this approach, I don’t think it is worth to
> take in the complexity to support different L1 hypervisors view on partitioning.
>

I'm not asking to support different L1 hypervisors view on partitioning. I
want to clean up the code (by fixing assumptions that no longer hold) for
the model that I'm describing, which the kernel already supports, which has
a working implementation, and which has actual users. This is also a model
that Intel intentionally created the TD-partitioning spec to support.

So let's work together to make X86_FEATURE_TDX_GUEST match reality.

Best regards,
Jeremi

> Best Regards,
> Elena.
>
>