Received: by 2002:a05:6a10:206:0:0:0:0 with SMTP id 6csp1406031pxj; Fri, 21 May 2021 13:29:08 -0700 (PDT) X-Google-Smtp-Source: ABdhPJysPQPsT9WENneyGyoJb/G6l80ewshY6D1pHtZ04dOj5jDYLwHvVPjDlZC6jUVpAaP3e7z0 X-Received: by 2002:a05:6402:35c4:: with SMTP id z4mr12697936edc.362.1621628948308; Fri, 21 May 2021 13:29:08 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1621628948; cv=none; d=google.com; s=arc-20160816; b=cGFdL+tmDjcDOimmKJIGc5CWlfsb5b+S38yufko+f7cMNBNt2lMWQyikY6V2QjHbu7 n6zUMUasR13p4bOx0hS3klcLPEB6utB3OGm2HOL/9+SDEozY4yAzfhfrNoGf/oE7pjTZ Zd3920t1+RylD1htOGD5xcHBMvnAEki3XgHQ0xNx0tX++czhyPkQVycAmyDIdCpJdSUP tkjwvud2kn5by9QAMyX+91wIXfSyHZKr9A4N7Lj5Pz+oItUFNm6CrBRUZnQPlMEWFmIa KgemT4ErU/tC48VWBS/RRzPEoXShDZn7PtAcrkjsEMYhsfsxXHTOhHKmGbwKCoJlww6k MYCA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:content-language :in-reply-to:mime-version:user-agent:date:message-id:from:references :cc:to:subject:ironport-sdr:ironport-sdr; bh=hCfKqDvivjqj3hqk3iwR0csJ5geivPMyE8E64GmVV9c=; b=EME0yI6Nf54IwHumG652G4rj+yahUft/Qvzr9PDZj1AroIxJUrrKYXS6bcLLU/o8lD oMQpn3THqB2nRYUvTXuattLmHPTxMa1pxLjPUIPVfvL7zZk194ydeaT7TrNghk4HhfEy VX8n0b9d9tk0AJaXe4kAsm4UlUdfxJ7OFGeGDr7PG70Fduvm4Uw8E93soxvS6BYAm8a9 sQZvFq5vYObu0YRsdG4skkVOsaL0gU8BJGrbawVM9boO62Uz0+SAmMRN80vY3s2jh87l QPCgtINP1HvYc6vs4RA+cHzWZFVEeoNP5RD3ZQ+4BKwc9TznqfaHbGLX24GCKAKxUdxs ho4w== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=intel.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id c9si5420979edv.447.2021.05.21.13.28.43; Fri, 21 May 2021 13:29:08 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S234895AbhEUSc5 (ORCPT + 99 others); Fri, 21 May 2021 14:32:57 -0400 Received: from mga07.intel.com ([134.134.136.100]:18032 "EHLO mga07.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232583AbhEUSc5 (ORCPT ); Fri, 21 May 2021 14:32:57 -0400 IronPort-SDR: Jg2CGNSkzbgLpsmSvm8N6RuwKadZmOvyPr7hkg7dzTrguqps+bsrwltcPz6YwLMUIM5xuzoP29 8TDLijtxm/RQ== X-IronPort-AV: E=McAfee;i="6200,9189,9991"; a="265454097" X-IronPort-AV: E=Sophos;i="5.82,319,1613462400"; d="scan'208";a="265454097" Received: from fmsmga008.fm.intel.com ([10.253.24.58]) by orsmga105.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 21 May 2021 11:31:32 -0700 IronPort-SDR: eyYK8ErqfcJQIa0PIa3TYUy9WIkLtNYJjtNwF9kW98MxelC0La13YapBBYsH02BPqVzMRk3ctc KCKmFV1gZNiQ== X-IronPort-AV: E=Sophos;i="5.82,319,1613462400"; d="scan'208";a="441193658" Received: from orxpovpvmu02.amr.corp.intel.com (HELO skuppusw-mobl5.amr.corp.intel.com) ([10.213.181.51]) by fmsmga008-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 21 May 2021 11:31:32 -0700 Subject: Re: [RFC v2-fix-v2 1/1] x86/boot: Avoid #VE during boot for TDX platforms To: Dave Hansen , Peter Zijlstra , Andy Lutomirski Cc: Tony Luck , Andi Kleen , Kirill Shutemov , Kuppuswamy Sathyanarayanan , Dan Williams , Raj Ashok , Sean Christopherson , linux-kernel@vger.kernel.org, Sean Christopherson References: <20210521143524.2527690-1-sathyanarayanan.kuppuswamy@linux.intel.com> From: "Kuppuswamy, Sathyanarayanan" Message-ID: Date: Fri, 21 May 2021 11:31:29 -0700 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101 Thunderbird/78.8.1 MIME-Version: 1.0 In-Reply-To: Content-Type: text/plain; charset=utf-8; format=flowed Content-Language: en-US Content-Transfer-Encoding: 7bit Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 5/21/21 9:11 AM, Dave Hansen wrote: >> Avoid operations which will inject #VE during boot process. >> They're easy to avoid and it is less complex than handling >> the exceptions. > > This puts the solution before the problem. I'd also make sure to > clearly connect this solution to the problem. For instance, if you > refer to register "modification", ensure that you reflect that language > here. Don't call them "modifications" in one part of the changelog and > "operations" here. I'd also qualify them as "superfluous". > > Please reorder this in the following form: > > 1. Background > 2. Problem > 3. Solution > > Please do this for all of your patches. > >> There are a few MSRs and control register bits which the >> kernel normally needs to modify during boot. But, TDX >> disallows modification of these registers to help provide >> consistent security guarantees ( and avoid generating #VE >> when updating them). > > No, the TDX architecture does not avoid generating #VE. The *kernel* > does that. This sentence conflates those two things. > >> Fortunately, TDX ensures that these are >> all in the correct state before the kernel loads, which means >> the kernel has no need to modify them. >> >> The conditions we need to avoid are: >> >> * Any writes to the EFER MSR >> * Clearing CR0.NE >> * Clearing CR3.MCE > > Sathya, there have been repeated issues in your changelogs with "we's". > Remember, speak in imperative voice. Please fix this in your tooling > to find these so that reviewers don't have to. How about the following commit log? In TDX guests, Virtualization Exceptions (#VE) are delivered to TDX guests due to specific guest actions like MSR writes, CPUID leaf accesses or I/O access. But in early boot code, #VE cannot be allowed because the required exception handler setup support code is missing. If #VE is triggered without proper handler support, it would lead to triple fault or kernel hang. So, avoid operations which will inject #VE during boot process. They're easy to avoid and it is less complex than handling the exceptions. There are a few MSRs and control register bits which the kernel normally needs to modify during boot. But, TDX disallows modification of these registers to help provide consistent security guarantees. Fortunately, TDX ensures that these are all in the correct state before the kernel loads, which means the kernel has no need to modify them. The conditions to avoid are: * Any writes to the EFER MSR * Clearing CR0.NE * Clearing CR3.MCE If above conditions are not avoided, it would lead to triple fault or kernel hang. > >> + /* >> + * Preserve current value of EFER for comparison and to skip >> + * EFER writes if no change was made (for TDX guest) >> + */ >> + movl %eax, %edx >> btsl $_EFER_SCE, %eax /* Enable System Call */ >> btl $20,%edi /* No Execute supported? */ >> jnc 1f >> btsl $_EFER_NX, %eax >> btsq $_PAGE_BIT_NX,early_pmd_flags(%rip) >> -1: wrmsr /* Make changes effective */ >> >> + /* Avoid writing EFER if no change was made (for TDX guest) */ >> +1: cmpl %edx, %eax >> + je 1f >> + xor %edx, %edx >> + wrmsr /* Make changes effective */ >> +1: > > Just curious, but what if this goes wrong? Say the TDX firmware didn't > set up EFER correctly and this code does the WRMSR. What ends up > happening? It would lead to triple fault. Do we get anything out on the console, or is it essentially > undebuggable? > We can still get logs with debug TDX module. So it is still debugable. >> >> + /* >> + * Skip writing to EFER if the register already has desiered >> + * value (to avoid #VE for TDX guest). >> + */ > > > spelling ^ > > There are lots of editors that can do spell checking, even in C > comments. You might want to look into that for your editor. > -- Sathyanarayanan Kuppuswamy Linux Kernel Developer