From: Kuppuswamy Sathyanarayanan
To: Peter Zijlstra, Andy Lutomirski, Dave Hansen
Cc: Tony Luck, Andi Kleen, Kirill Shutemov, Kuppuswamy Sathyanarayanan, Dan Williams, Raj Ashok, Sean Christopherson, linux-kernel@vger.kernel.org
Subject: [RFC v2-fix-v2 1/1] x86/boot: Avoid #VE during boot for TDX platforms
Date: Fri, 21 May 2021 07:35:24 -0700
Message-Id: <20210521143524.2527690-1-sathyanarayanan.kuppuswamy@linux.intel.com>

From: Sean Christopherson

Avoid operations which will inject a #VE during the boot process. They are
easy to avoid, and avoiding them is less complex than handling the
exceptions.

There are a few MSRs and control register bits which the kernel normally
needs to modify during boot. But TDX disallows modification of these
registers to help provide consistent security guarantees (and to avoid
generating a #VE when they are updated). Fortunately, TDX ensures that
these are all in the correct state before the kernel loads, which means
the kernel has no need to modify them.

The conditions to avoid are:

 * Any writes to the EFER MSR
 * Clearing CR0.NE
 * Clearing CR4.MCE

Signed-off-by: Sean Christopherson
Reviewed-by: Andi Kleen
Signed-off-by: Kuppuswamy Sathyanarayanan
---
Changes since RFC v2-fix:
 * Fixed the commit log and comments as per Dave's and Dan's suggestions.
 * Merged the CR0.NE-related change in pa_trampoline_compat() from the
   patch titled "x86/boot: Add a trampoline for APs booting in 64-bit
   mode" into this patch. It belongs in this patch.
 * Merged the TRAMPOLINE_32BIT_CODE_SIZE change from the patch titled
   "x86/boot: Add a trampoline for APs booting in 64-bit mode" into this
   patch (it was wrongly merged into that patch during the patch split).
 arch/x86/boot/compressed/head_64.S   | 16 ++++++++++++----
 arch/x86/boot/compressed/pgtable.h   |  2 +-
 arch/x86/kernel/head_64.S            | 20 ++++++++++++++++++--
 arch/x86/realmode/rm/trampoline_64.S | 23 +++++++++++++++++++----
 4 files changed, 50 insertions(+), 11 deletions(-)

diff --git a/arch/x86/boot/compressed/head_64.S b/arch/x86/boot/compressed/head_64.S
index e94874f4bbc1..f848569e3fb0 100644
--- a/arch/x86/boot/compressed/head_64.S
+++ b/arch/x86/boot/compressed/head_64.S
@@ -616,12 +616,20 @@ SYM_CODE_START(trampoline_32bit_src)
 	movl	$MSR_EFER, %ecx
 	rdmsr
 	btsl	$_EFER_LME, %eax
+	/* Avoid writing EFER if no change was made (for TDX guest) */
+	jc	1f
 	wrmsr
-	popl	%edx
+1:	popl	%edx
 	popl	%ecx
 
 	/* Enable PAE and LA57 (if required) paging modes */
-	movl	$X86_CR4_PAE, %eax
+	movl	%cr4, %eax
+	/*
+	 * Clear all bits except CR4.MCE, which is preserved.
+	 * Clearing CR4.MCE will #VE in TDX guests.
+	 */
+	andl	$X86_CR4_MCE, %eax
+	orl	$X86_CR4_PAE, %eax
 	testl	%edx, %edx
 	jz	1f
 	orl	$X86_CR4_LA57, %eax
@@ -635,8 +643,8 @@ SYM_CODE_START(trampoline_32bit_src)
 	pushl	$__KERNEL_CS
 	pushl	%eax
 
-	/* Enable paging again */
-	movl	$(X86_CR0_PG | X86_CR0_PE), %eax
+	/* Enable paging again. Avoid clearing X86_CR0_NE for TDX */
+	movl	$(X86_CR0_PG | X86_CR0_NE | X86_CR0_PE), %eax
 	movl	%eax, %cr0
 
 	lret
diff --git a/arch/x86/boot/compressed/pgtable.h b/arch/x86/boot/compressed/pgtable.h
index 6ff7e81b5628..cc9b2529a086 100644
--- a/arch/x86/boot/compressed/pgtable.h
+++ b/arch/x86/boot/compressed/pgtable.h
@@ -6,7 +6,7 @@
 #define TRAMPOLINE_32BIT_PGTABLE_OFFSET	0
 
 #define TRAMPOLINE_32BIT_CODE_OFFSET	PAGE_SIZE
-#define TRAMPOLINE_32BIT_CODE_SIZE	0x70
+#define TRAMPOLINE_32BIT_CODE_SIZE	0x80
 
 #define TRAMPOLINE_32BIT_STACK_END	TRAMPOLINE_32BIT_SIZE
diff --git a/arch/x86/kernel/head_64.S b/arch/x86/kernel/head_64.S
index 04bddaaba8e2..6cf8d126b80a 100644
--- a/arch/x86/kernel/head_64.S
+++ b/arch/x86/kernel/head_64.S
@@ -141,7 +141,13 @@ SYM_INNER_LABEL(secondary_startup_64_no_verify, SYM_L_GLOBAL)
 1:
 
 	/* Enable PAE mode, PGE and LA57 */
-	movl	$(X86_CR4_PAE | X86_CR4_PGE), %ecx
+	movq	%cr4, %rcx
+	/*
+	 * Clear all bits except CR4.MCE, which is preserved.
+	 * Clearing CR4.MCE will #VE in TDX guests.
+	 */
+	andl	$X86_CR4_MCE, %ecx
+	orl	$(X86_CR4_PAE | X86_CR4_PGE), %ecx
 #ifdef CONFIG_X86_5LEVEL
 	testl	$1, __pgtable_l5_enabled(%rip)
 	jz	1f
@@ -229,13 +235,23 @@ SYM_INNER_LABEL(secondary_startup_64_no_verify, SYM_L_GLOBAL)
 	/* Setup EFER (Extended Feature Enable Register) */
 	movl	$MSR_EFER, %ecx
 	rdmsr
+	/*
+	 * Preserve current value of EFER for comparison and to skip
+	 * EFER writes if no change was made (for TDX guest)
+	 */
+	movl	%eax, %edx
 	btsl	$_EFER_SCE, %eax	/* Enable System Call */
 	btl	$20,%edi		/* No Execute supported? */
 	jnc	1f
 	btsl	$_EFER_NX, %eax
 	btsq	$_PAGE_BIT_NX,early_pmd_flags(%rip)
-1:	wrmsr				/* Make changes effective */
+	/* Avoid writing EFER if no change was made (for TDX guest) */
+1:	cmpl	%edx, %eax
+	je	1f
+	xor	%edx, %edx
+	wrmsr				/* Make changes effective */
+1:
 
 	/* Setup cr0 */
 	movl	$CR0_STATE, %eax	/* Make changes effective */
diff --git a/arch/x86/realmode/rm/trampoline_64.S b/arch/x86/realmode/rm/trampoline_64.S
index 957bb21ce105..cf14d0326a48 100644
--- a/arch/x86/realmode/rm/trampoline_64.S
+++ b/arch/x86/realmode/rm/trampoline_64.S
@@ -143,13 +143,27 @@ SYM_CODE_START(startup_32)
 	movl	%eax, %cr3
 
 	# Set up EFER
+	movl	$MSR_EFER, %ecx
+	rdmsr
+	/*
+	 * Skip writing to EFER if the register already has the desired
+	 * value (to avoid #VE for TDX guest).
+	 */
+	cmp	pa_tr_efer, %eax
+	jne	.Lwrite_efer
+	cmp	pa_tr_efer + 4, %edx
+	je	.Ldone_efer
+.Lwrite_efer:
 	movl	pa_tr_efer, %eax
 	movl	pa_tr_efer + 4, %edx
-	movl	$MSR_EFER, %ecx
 	wrmsr
 
-	# Enable paging and in turn activate Long Mode
-	movl	$(X86_CR0_PG | X86_CR0_WP | X86_CR0_PE), %eax
+.Ldone_efer:
+	/*
+	 * Enable paging and in turn activate Long Mode. Avoid clearing
+	 * X86_CR0_NE for TDX.
+	 */
+	movl	$(X86_CR0_PG | X86_CR0_WP | X86_CR0_NE | X86_CR0_PE), %eax
 	movl	%eax, %cr0
 
 	/*
@@ -169,7 +183,8 @@ SYM_CODE_START(pa_trampoline_compat)
 	movl	$rm_stack_end, %esp
 	movw	$__KERNEL_DS, %dx
 
-	movl	$X86_CR0_PE, %eax
+	/* Avoid clearing X86_CR0_NE for TDX */
+	movl	$(X86_CR0_NE | X86_CR0_PE), %eax
 	movl	%eax, %cr0
 	ljmpl	$__KERNEL32_CS, $pa_startup_32
 SYM_CODE_END(pa_trampoline_compat)
-- 
2.25.1