Received: by 2002:a05:7412:b10a:b0:f3:1519:9f41 with SMTP id az10csp1313963rdb; Fri, 1 Dec 2023 12:36:19 -0800 (PST) X-Google-Smtp-Source: AGHT+IHYpQaJb1gDBQr+uW2O7WBBwjjPQNr1QkvddcunCYnSmnbmIGpA2HeeeM9iYQxZkVSTv0N6 X-Received: by 2002:a17:902:e5c7:b0:1cf:d5a3:6b0c with SMTP id u7-20020a170902e5c700b001cfd5a36b0cmr98493plf.21.1701462978865; Fri, 01 Dec 2023 12:36:18 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1701462978; cv=none; d=google.com; s=arc-20160816; b=uaoF/yHQ+jT86WCNQ0HNbrJFVAytCXOM0DlwUqn4SeTqjv5sRuEppIUkXPGZERHId5 v9on5pw1EH3E3E76dxG4pCHxyxAGMP2ryo5QW2odR3+7uH1azGsq1t+c3LwPvJrjaBdy wVFPDLHJEA4Yp9Js03O2I9AG/61MoIcvCwUmGVJMz6ctQpMQk5j198lMWmL0FdH3L4Z8 wCXQ94Eo5wFZUF1xk7UyXutBUcYqGEMAYBQwIRjRt58xwpue3e5/LZ40y+x1Eb02OC6o YsgtGRdEjkrfsG8e27oAvfJyUn04NsaH7pZToyEuZ3JckhAhO6m/hbl92uetwzuiAbBw Z7ZA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:in-reply-to:autocrypt :from:references:cc:to:content-language:subject:user-agent :mime-version:date:message-id:dkim-signature; bh=RsRY0PK2jTyuCRSQRKaZamJt4QuePD8qFSX/QNsEoEo=; fh=UnEj2l4svsT63hiFmTsCOCMSTqLPH8tKdlCK3Xyhj3g=; b=zR85VHo/PKfqBI5fbvgAEB9/gh/4X+fw6vqCBIJvXKa7qaC9nbTccwvHV290Bnml5T gim3bjmGijvStV3nh8/miXLM8RYg/udMePjgfZ84xEldM6o41qvGJbwABl0rnHEUjdDh rpJsj7c3sFxj/VbaTybQmQPiW8bI6ibZejDrZDIbl2+HjDGquzxMXoCiSxDElDKb7fAo 4C4KhiFeKsgng3tdfx0Lvf5kHpFKsK1Yuxt3Ir2mZijN63AFiHM7IG6eJ5yDdAWgCYWy wLNgmoHa0WwUmNZMqArgK3jN2TG3/tRaekA87EkZi8i7T0FzhX+ugAZ4alQEHAJn6QF4 s2Cg== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=HLmEZvFk; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.33 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Return-Path: Received: from lipwig.vger.email (lipwig.vger.email. [23.128.96.33]) by mx.google.com with ESMTPS id bc7-20020a170902930700b001d04d730685si2149691plb.543.2023.12.01.12.36.10 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 01 Dec 2023 12:36:18 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.33 as permitted sender) client-ip=23.128.96.33; Authentication-Results: mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=HLmEZvFk; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.33 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: from out1.vger.email (depot.vger.email [IPv6:2620:137:e000::3:0]) by lipwig.vger.email (Postfix) with ESMTP id 6FF1481DC61B; Fri, 1 Dec 2023 12:36:08 -0800 (PST) X-Virus-Status: Clean X-Virus-Scanned: clamav-milter 0.103.11 at lipwig.vger.email Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230526AbjLAUfx (ORCPT + 99 others); Fri, 1 Dec 2023 15:35:53 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:59192 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230364AbjLAUfu (ORCPT ); Fri, 1 Dec 2023 15:35:50 -0500 Received: from mgamail.intel.com (mgamail.intel.com [192.55.52.93]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id F318310E5; Fri, 1 Dec 2023 12:35:56 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1701462956; x=1732998956; h=message-id:date:mime-version:subject:to:cc:references: from:in-reply-to:content-transfer-encoding; bh=LMf1fItzoZA3T76vtDhmr3DZRsMRth2RfoIKey8Owo0=; b=HLmEZvFkXHlwS9hobsZN10QXNY9JajhFBVtEg22WrDjVpukCdzcVkJqB WJ2xsk0oMqPm9XLfCGJ8N6PZZQqIJTVp5FJveNI1jBTE+T7FeYhJ4+XJL bvHESli7uhAexuFFVtrRVrwbN/dBLQkKFROx8gmqPQlDRu2YD5ANVn/gy v10+nmjcgadG3HoFiJlY5bBYuH7/Wc7Tlm6RFb317So+F6tFmQNQRKQ9z SFnzjyUE6efx7lYq8jdjEkq7tmG+gmJuB3J0Jl3WGoiuLdNoXP6iitH6W 0Y/e+LAYPaG8vh38sI87bEUvRO6g+GBRlQ/5pckBL0UgcACG/hGroe3Sr g==; X-IronPort-AV: E=McAfee;i="6600,9927,10911"; a="390714800" X-IronPort-AV: E=Sophos;i="6.04,242,1695711600"; d="scan'208";a="390714800" Received: from fmsmga007.fm.intel.com ([10.253.24.52]) by fmsmga102.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 01 Dec 2023 12:35:55 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10911"; a="773551436" X-IronPort-AV: E=Sophos;i="6.04,242,1695711600"; d="scan'208";a="773551436" Received: from jturmaud-mobl1.amr.corp.intel.com (HELO [10.209.91.66]) ([10.209.91.66]) by fmsmga007-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 01 Dec 2023 12:35:53 -0800 Message-ID: Date: Fri, 1 Dec 2023 12:35:53 -0800 MIME-Version: 1.0 User-Agent: Mozilla Thunderbird Subject: Re: [PATCH v15 22/23] x86/mce: Improve error log of kernel space TDX #MC due to erratum Content-Language: en-US To: Kai Huang , linux-kernel@vger.kernel.org, kvm@vger.kernel.org Cc: x86@kernel.org, kirill.shutemov@linux.intel.com, peterz@infradead.org, tony.luck@intel.com, tglx@linutronix.de, bp@alien8.de, mingo@redhat.com, hpa@zytor.com, seanjc@google.com, pbonzini@redhat.com, rafael@kernel.org, david@redhat.com, dan.j.williams@intel.com, len.brown@intel.com, ak@linux.intel.com, isaku.yamahata@intel.com, ying.huang@intel.com, chao.gao@intel.com, sathyanarayanan.kuppuswamy@linux.intel.com, nik.borisov@suse.com, bagasdotme@gmail.com, sagis@google.com, imammedo@redhat.com References: <9e80873fac878aa5d697cbcd4d456d01e1009d1f.1699527082.git.kai.huang@intel.com> From: Dave Hansen Autocrypt: addr=dave.hansen@intel.com; keydata= xsFNBE6HMP0BEADIMA3XYkQfF3dwHlj58Yjsc4E5y5G67cfbt8dvaUq2fx1lR0K9h1bOI6fC oAiUXvGAOxPDsB/P6UEOISPpLl5IuYsSwAeZGkdQ5g6m1xq7AlDJQZddhr/1DC/nMVa/2BoY 2UnKuZuSBu7lgOE193+7Uks3416N2hTkyKUSNkduyoZ9F5twiBhxPJwPtn/wnch6n5RsoXsb ygOEDxLEsSk/7eyFycjE+btUtAWZtx+HseyaGfqkZK0Z9bT1lsaHecmB203xShwCPT49Blxz VOab8668QpaEOdLGhtvrVYVK7x4skyT3nGWcgDCl5/Vp3TWA4K+IofwvXzX2ON/Mj7aQwf5W iC+3nWC7q0uxKwwsddJ0Nu+dpA/UORQWa1NiAftEoSpk5+nUUi0WE+5DRm0H+TXKBWMGNCFn c6+EKg5zQaa8KqymHcOrSXNPmzJuXvDQ8uj2J8XuzCZfK4uy1+YdIr0yyEMI7mdh4KX50LO1 pmowEqDh7dLShTOif/7UtQYrzYq9cPnjU2ZW4qd5Qz2joSGTG9eCXLz5PRe5SqHxv6ljk8mb ApNuY7bOXO/A7T2j5RwXIlcmssqIjBcxsRRoIbpCwWWGjkYjzYCjgsNFL6rt4OL11OUF37wL QcTl7fbCGv53KfKPdYD5hcbguLKi/aCccJK18ZwNjFhqr4MliQARAQABzUVEYXZpZCBDaHJp c3RvcGhlciBIYW5zZW4gKEludGVsIFdvcmsgQWRkcmVzcykgPGRhdmUuaGFuc2VuQGludGVs LmNvbT7CwXgEEwECACIFAlQ+9J0CGwMGCwkIBwMCBhUIAgkKCwQWAgMBAh4BAheAAAoJEGg1 lTBwyZKwLZUP/0dnbhDc229u2u6WtK1s1cSd9WsflGXGagkR6liJ4um3XCfYWDHvIdkHYC1t MNcVHFBwmQkawxsYvgO8kXT3SaFZe4ISfB4K4CL2qp4JO+nJdlFUbZI7cz/Td9z8nHjMcWYF IQuTsWOLs/LBMTs+ANumibtw6UkiGVD3dfHJAOPNApjVr+M0P/lVmTeP8w0uVcd2syiaU5jB aht9CYATn+ytFGWZnBEEQFnqcibIaOrmoBLu2b3fKJEd8Jp7NHDSIdrvrMjYynmc6sZKUqH2 I1qOevaa8jUg7wlLJAWGfIqnu85kkqrVOkbNbk4TPub7VOqA6qG5GCNEIv6ZY7HLYd/vAkVY E8Plzq/NwLAuOWxvGrOl7OPuwVeR4hBDfcrNb990MFPpjGgACzAZyjdmYoMu8j3/MAEW4P0z F5+EYJAOZ+z212y1pchNNauehORXgjrNKsZwxwKpPY9qb84E3O9KYpwfATsqOoQ6tTgr+1BR CCwP712H+E9U5HJ0iibN/CDZFVPL1bRerHziuwuQuvE0qWg0+0SChFe9oq0KAwEkVs6ZDMB2 P16MieEEQ6StQRlvy2YBv80L1TMl3T90Bo1UUn6ARXEpcbFE0/aORH/jEXcRteb+vuik5UGY 5TsyLYdPur3TXm7XDBdmmyQVJjnJKYK9AQxj95KlXLVO38lczsFNBFRjzmoBEACyAxbvUEhd GDGNg0JhDdezyTdN8C9BFsdxyTLnSH31NRiyp1QtuxvcqGZjb2trDVuCbIzRrgMZLVgo3upr MIOx1CXEgmn23Zhh0EpdVHM8IKx9Z7V0r+rrpRWFE8/wQZngKYVi49PGoZj50ZEifEJ5qn/H Nsp2+Y+bTUjDdgWMATg9DiFMyv8fvoqgNsNyrrZTnSgoLzdxr89FGHZCoSoAK8gfgFHuO54B lI8QOfPDG9WDPJ66HCodjTlBEr/Cwq6GruxS5i2Y33YVqxvFvDa1tUtl+iJ2SWKS9kCai2DR 3BwVONJEYSDQaven/EHMlY1q8Vln3lGPsS11vSUK3QcNJjmrgYxH5KsVsf6PNRj9mp8Z1kIG qjRx08+nnyStWC0gZH6NrYyS9rpqH3j+hA2WcI7De51L4Rv9pFwzp161mvtc6eC/GxaiUGuH BNAVP0PY0fqvIC68p3rLIAW3f97uv4ce2RSQ7LbsPsimOeCo/5vgS6YQsj83E+AipPr09Caj 0hloj+hFoqiticNpmsxdWKoOsV0PftcQvBCCYuhKbZV9s5hjt9qn8CE86A5g5KqDf83Fxqm/ vXKgHNFHE5zgXGZnrmaf6resQzbvJHO0Fb0CcIohzrpPaL3YepcLDoCCgElGMGQjdCcSQ+Ci FCRl0Bvyj1YZUql+ZkptgGjikQARAQABwsFfBBgBAgAJBQJUY85qAhsMAAoJEGg1lTBwyZKw l4IQAIKHs/9po4spZDFyfDjunimEhVHqlUt7ggR1Hsl/tkvTSze8pI1P6dGp2XW6AnH1iayn yRcoyT0ZJ+Zmm4xAH1zqKjWplzqdb/dO28qk0bPso8+1oPO8oDhLm1+tY+cOvufXkBTm+whm +AyNTjaCRt6aSMnA/QHVGSJ8grrTJCoACVNhnXg/R0g90g8iV8Q+IBZyDkG0tBThaDdw1B2l asInUTeb9EiVfL/Zjdg5VWiF9LL7iS+9hTeVdR09vThQ/DhVbCNxVk+DtyBHsjOKifrVsYep WpRGBIAu3bK8eXtyvrw1igWTNs2wazJ71+0z2jMzbclKAyRHKU9JdN6Hkkgr2nPb561yjcB8 sIq1pFXKyO+nKy6SZYxOvHxCcjk2fkw6UmPU6/j/nQlj2lfOAgNVKuDLothIxzi8pndB8Jju KktE5HJqUUMXePkAYIxEQ0mMc8Po7tuXdejgPMwgP7x65xtfEqI0RuzbUioFltsp1jUaRwQZ MTsCeQDdjpgHsj+P2ZDeEKCbma4m6Ez/YWs4+zDm1X8uZDkZcfQlD9NldbKDJEXLIjYWo1PH hYepSffIWPyvBMBTW2W5FRjJ4vLRrJSUoEfJuPQ3vW9Y73foyo/qFoURHO48AinGPZ7PC7TF vUaNOTjKedrqHkaOcqB185ahG2had0xnFsDPlx5y In-Reply-To: <9e80873fac878aa5d697cbcd4d456d01e1009d1f.1699527082.git.kai.huang@intel.com> Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 7bit X-Spam-Status: No, score=-0.9 required=5.0 tests=DKIMWL_WL_HIGH,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,T_SCC_BODY_TEXT_LINE autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lipwig.vger.email Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-Greylist: Sender passed SPF test, not delayed by milter-greylist-4.6.4 (lipwig.vger.email [0.0.0.0]); Fri, 01 Dec 2023 12:36:08 -0800 (PST) On 11/9/23 03:55, Kai Huang wrote: > +static bool is_pamt_page(unsigned long phys) > +{ > + struct tdmr_info_list *tdmr_list = &tdx_tdmr_list; > + int i; > + > + /* > + * This function is called from #MC handler, and theoretically > + * it could run in parallel with the TDX module initialization > + * on other logical cpus. But it's not OK to hold mutex here > + * so just blindly check module status to make sure PAMTs/TDMRs > + * are stable to access. > + * > + * This may return inaccurate result in rare cases, e.g., when > + * #MC happens on a PAMT page during module initialization, but > + * this is fine as #MC handler doesn't need a 100% accurate > + * result. > + */ It doesn't need perfect accuracy. But how do we know it's not going to go, for instance, chase a bad pointer? > + if (tdx_module_status != TDX_MODULE_INITIALIZED) > + return false; As an example, what prevents this CPU from observing tdx_module_status==TDX_MODULE_INITIALIZED while the PAMT structure is being assembled? > + for (i = 0; i < tdmr_list->nr_consumed_tdmrs; i++) { > + unsigned long base, size; > + > + tdmr_get_pamt(tdmr_entry(tdmr_list, i), &base, &size); > + > + if (phys >= base && phys < (base + size)) > + return true; > + } > + > + return false; > +} > + > +/* > + * Return whether the memory page at the given physical address is TDX > + * private memory or not. Called from #MC handler do_machine_check(). > + * > + * Note this function may not return an accurate result in rare cases. > + * This is fine as the #MC handler doesn't need a 100% accurate result, > + * because it cannot distinguish #MC between software bug and real > + * hardware error anyway. > + */ > +bool tdx_is_private_mem(unsigned long phys) > +{ > + struct tdx_module_args args = { > + .rcx = phys & PAGE_MASK, > + }; > + u64 sret; > + > + if (!platform_tdx_enabled()) > + return false; > + > + /* Get page type from the TDX module */ > + sret = __seamcall_ret(TDH_PHYMEM_PAGE_RDMD, &args); > + /* > + * Handle the case that CPU isn't in VMX operation. > + * > + * KVM guarantees no VM is running (thus no TDX guest) > + * when there's any online CPU isn't in VMX operation. > + * This means there will be no TDX guest private memory > + * and Secure-EPT pages. However the TDX module may have > + * been initialized and the memory page could be PAMT. > + */ > + if (sret == TDX_SEAMCALL_UD) > + return is_pamt_page(phys); Either this is comment is wonky or the module initialization is buggy. config_global_keyid() goes and does SEAMCALLs on all CPUs. There are zero checks or special handling in there for whether the CPU has done VMXON. So, by the time we've started initializing the TDX module (including the PAMT), all online CPUs must be able to do SEAMCALLs. Right? So how can we have a working PAMT here when this CPU can't do SEAMCALLs? I don't think we should even bother with this complexity. I think we can just fix the whole thing by saying that unless you can make a non-init SEAMCALL, we just assume the memory can't be private. The transition to being able to make non-init SEAMCALLs is also #MC safe *and* it's at a point when the tdmr_list is stable. Can anyone shoot any holes in that? :)