2024-01-29 18:15:21

by Ard Biesheuvel

Subject: [PATCH v3 00/19] x86: Confine early 1:1 mapped startup code

From: Ard Biesheuvel <[email protected]>

This is a follow-up to my RFC [0], which proposed building the entire core
kernel with -fPIC to reduce the likelihood that code that runs
extremely early from the 1:1 mapping of memory will misbehave.

This is needed to address reports that SEV boot of Clang-built kernels
is broken, because this early code attempts to access virtual kernel
addresses that are not mapped yet. Kevin has suggested some workarounds
for this [1], but this really calls for a more rigorous approach that
fixes the underlying defect rather than addressing a couple of its
symptoms.

As it turns out, the use of -fPIE for the entire kernel is neither
necessary nor sufficient, and it has its own set of problems, including
the fact that the PIE small C code model uses FS rather than GS for the
per-CPU register, and only recent GCC and Clang versions permit this to
be overridden on the command line.

But the real problem is that even position independent code is not
guaranteed to execute correctly at any offset unless all statically
initialized pointer variables use the same translation as the code.
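To illustrate (this fragment is not part of the series, and the names
are made up): even when built with -fPIC, the object below misbehaves
when executed from the 1:1 mapping, because the statically initialized
pointer holds a link-time kernel virtual address:

	static int val;
	static int *val_ptr = &val;	/* absolute relocation: a kernel VA */

	int get_val(void)
	{
		/*
		 * The load of val_ptr is RIP-relative and works at any
		 * offset, but the value it produces does not.
		 */
		return *val_ptr;
	}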

So instead, v2 and later of this series propose another solution, taking
the following approach:
- clean up and refactor the startup code so that the primary startup
code executes from the 1:1 mapping but nothing else;
- define a new text section type .pi.text and enforce that it can only
call into other .pi.text sections;
- (tbd) require that objects containing .pi.text sections are built with
-fPIC, and disallow any absolute references from such objects.

The latter point is not implemented yet in this v3, but it could be done
rather straightforwardly. (The EFI stub already does something similar
across all architectures.)
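For illustration, the confinement boils down to an annotation along
these lines (a sketch only; the exact definitions are in the patches
below, and early_foo() is a made-up name):

	/* include/linux/init.h */
	#define __pitext	__section(".pi.text")

	/* code reachable from the 1:1 mapping gets the annotation */
	static void __pitext early_foo(void)
	{
		/* may only call other __pitext code; modpost warns otherwise */
	}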

Changes since v2: [2]
- move command line parsing out of early startup code entirely
- fix LTO- and instrumentation-related build warnings reported by Nathan
- omit PTI-related PGD/P4D setters when creating the early page tables,
  instead of pulling that code into the 'early' set

[0] https://lkml.kernel.org/r/20240122090851.851120-7-ardb%2Bgit%40google.com
[1] https://lore.kernel.org/all/[email protected]/T/#u
[2] https://lkml.kernel.org/r/20240125112818.2016733-19-ardb%2Bgit%40google.com

Cc: Kevin Loughlin <[email protected]>
Cc: Tom Lendacky <[email protected]>
Cc: Dionna Glaze <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Arnd Bergmann <[email protected]>
Cc: Nathan Chancellor <[email protected]>
Cc: Nick Desaulniers <[email protected]>
Cc: Justin Stitt <[email protected]>
Cc: Kees Cook <[email protected]>
Cc: Brian Gerst <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]

Ard Biesheuvel (19):
efi/libstub: Add generic support for parsing mem_encrypt=
x86/boot: Move mem_encrypt= parsing to the decompressor
x86/startup_64: Drop long return to initial_code pointer
x86/startup_64: Simplify calculation of initial page table address
x86/startup_64: Simplify CR4 handling in startup code
x86/startup_64: Drop global variables keeping track of LA57 state
x86/startup_64: Simplify virtual switch on primary boot
x86/head64: Replace pointer fixups with PIE codegen
x86/head64: Simplify GDT/IDT initialization code
asm-generic: Add special .pi.text section for position independent
code
x86: Move return_thunk to __pitext section
x86/head64: Move early startup code into __pitext
modpost: Warn about calls from __pitext into other text sections
x86/coco: Make cc_set_mask() static inline
x86/sev: Make all code reachable from 1:1 mapping __pitext
x86/sev: Avoid WARN() in early code
x86/sev: Use PIC codegen for early SEV startup code
x86/sev: Drop inline asm LEA instructions for RIP-relative references
x86/startup_64: Don't bother setting up GS before the kernel is mapped

arch/x86/Makefile | 8 +
arch/x86/boot/compressed/Makefile | 2 +-
arch/x86/boot/compressed/misc.c | 22 +++
arch/x86/boot/compressed/pgtable_64.c | 2 -
arch/x86/boot/compressed/sev.c | 6 +
arch/x86/coco/core.c | 7 +-
arch/x86/include/asm/coco.h | 8 +-
arch/x86/include/asm/desc.h | 3 +-
arch/x86/include/asm/init.h | 2 -
arch/x86/include/asm/mem_encrypt.h | 8 +-
arch/x86/include/asm/pgtable_64.h | 12 +-
arch/x86/include/asm/pgtable_64_types.h | 15 +-
arch/x86/include/asm/setup.h | 4 +-
arch/x86/include/asm/sev.h | 6 +-
arch/x86/include/uapi/asm/bootparam.h | 2 +
arch/x86/kernel/Makefile | 7 +
arch/x86/kernel/cpu/common.c | 2 -
arch/x86/kernel/head64.c | 206 +++++++-------------
arch/x86/kernel/head_64.S | 156 +++++----------
arch/x86/kernel/sev-shared.c | 54 +++--
arch/x86/kernel/sev.c | 27 ++-
arch/x86/kernel/vmlinux.lds.S | 3 +-
arch/x86/lib/Makefile | 13 --
arch/x86/lib/memcpy_64.S | 3 +-
arch/x86/lib/memset_64.S | 3 +-
arch/x86/lib/retpoline.S | 2 +-
arch/x86/mm/Makefile | 2 +-
arch/x86/mm/kasan_init_64.c | 3 -
arch/x86/mm/mem_encrypt_boot.S | 3 +-
arch/x86/mm/mem_encrypt_identity.c | 98 +++-------
drivers/firmware/efi/libstub/efi-stub-helper.c | 8 +
drivers/firmware/efi/libstub/efistub.h | 2 +-
drivers/firmware/efi/libstub/x86-stub.c | 6 +
include/asm-generic/vmlinux.lds.h | 3 +
include/linux/init.h | 12 ++
scripts/mod/modpost.c | 11 +-
tools/objtool/check.c | 26 +--
37 files changed, 319 insertions(+), 438 deletions(-)


base-commit: aa8eff72842021f52600392b245fb82d113afa8a
--
2.43.0.429.g432eaa2c6b-goog



2024-01-29 18:17:03

by Ard Biesheuvel

Subject: [PATCH v3 06/19] x86/startup_64: Drop global variables keeping track of LA57 state

From: Ard Biesheuvel <[email protected]>

On x86_64, the core kernel is entered in long mode, which implies that
paging is enabled. This means that the CR4.LA57 control bit is
guaranteed to be in sync with the number of paging levels used by the
kernel, and there is no need to store this in a variable.
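Concretely, once the kernel is entered with paging on, CR4 is
authoritative, so the check can simply read it, which is what the
pgtable_64_types.h hunk below switches to:

	#define pgtable_l5_enabled() !!(native_read_cr4() & X86_CR4_LA57)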

There is also no need to use variables for storing the calculations of
pgdir_shift and ptrs_per_p4d, as they are easily determined on the fly.
Other assignments of global variables related to the number of paging
levels can be deferred to the primary C entrypoint that actually runs
from the kernel virtual mapping.

This removes the need for writing to __ro_after_init from the code that
executes extremely early via the 1:1 mapping.

Signed-off-by: Ard Biesheuvel <[email protected]>
---
arch/x86/boot/compressed/pgtable_64.c | 2 -
arch/x86/include/asm/pgtable_64_types.h | 15 +++---
arch/x86/kernel/cpu/common.c | 2 -
arch/x86/kernel/head64.c | 52 ++++----------------
arch/x86/mm/kasan_init_64.c | 3 --
arch/x86/mm/mem_encrypt_identity.c | 9 ----
6 files changed, 15 insertions(+), 68 deletions(-)

diff --git a/arch/x86/boot/compressed/pgtable_64.c b/arch/x86/boot/compressed/pgtable_64.c
index 51f957b24ba7..0586cc216aa6 100644
--- a/arch/x86/boot/compressed/pgtable_64.c
+++ b/arch/x86/boot/compressed/pgtable_64.c
@@ -128,8 +128,6 @@ asmlinkage void configure_5level_paging(struct boot_params *bp, void *pgtable)

/* Initialize variables for 5-level paging */
__pgtable_l5_enabled = 1;
- pgdir_shift = 48;
- ptrs_per_p4d = 512;
}

/*
diff --git a/arch/x86/include/asm/pgtable_64_types.h b/arch/x86/include/asm/pgtable_64_types.h
index 38b54b992f32..ecc010fbb377 100644
--- a/arch/x86/include/asm/pgtable_64_types.h
+++ b/arch/x86/include/asm/pgtable_64_types.h
@@ -22,28 +22,25 @@ typedef struct { pteval_t pte; } pte_t;
typedef struct { pmdval_t pmd; } pmd_t;

#ifdef CONFIG_X86_5LEVEL
+#ifdef USE_EARLY_PGTABLE_L5
extern unsigned int __pgtable_l5_enabled;

-#ifdef USE_EARLY_PGTABLE_L5
/*
- * cpu_feature_enabled() is not available in early boot code.
- * Use variable instead.
+ * CR4.LA57 may not be set to its final value yet in the early boot code.
+ * Use a variable instead.
*/
static inline bool pgtable_l5_enabled(void)
{
return __pgtable_l5_enabled;
}
#else
-#define pgtable_l5_enabled() cpu_feature_enabled(X86_FEATURE_LA57)
+#define pgtable_l5_enabled() !!(native_read_cr4() & X86_CR4_LA57)
#endif /* USE_EARLY_PGTABLE_L5 */

#else
#define pgtable_l5_enabled() 0
#endif /* CONFIG_X86_5LEVEL */

-extern unsigned int pgdir_shift;
-extern unsigned int ptrs_per_p4d;
-
#endif /* !__ASSEMBLY__ */

#define SHARED_KERNEL_PMD 0
@@ -53,7 +50,7 @@ extern unsigned int ptrs_per_p4d;
/*
* PGDIR_SHIFT determines what a top-level page table entry can map
*/
-#define PGDIR_SHIFT pgdir_shift
+#define PGDIR_SHIFT (pgtable_l5_enabled() ? 48 : 39)
#define PTRS_PER_PGD 512

/*
@@ -61,7 +58,7 @@ extern unsigned int ptrs_per_p4d;
*/
#define P4D_SHIFT 39
#define MAX_PTRS_PER_P4D 512
-#define PTRS_PER_P4D ptrs_per_p4d
+#define PTRS_PER_P4D (pgtable_l5_enabled() ? 512 : 1)
#define P4D_SIZE (_AC(1, UL) << P4D_SHIFT)
#define P4D_MASK (~(P4D_SIZE - 1))

diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index 0b97bcde70c6..20ac11a2c06b 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -1,6 +1,4 @@
// SPDX-License-Identifier: GPL-2.0-only
-/* cpu_feature_enabled() cannot be used this early */
-#define USE_EARLY_PGTABLE_L5

#include <linux/memblock.h>
#include <linux/linkage.h>
diff --git a/arch/x86/kernel/head64.c b/arch/x86/kernel/head64.c
index dc0956067944..d636bb02213f 100644
--- a/arch/x86/kernel/head64.c
+++ b/arch/x86/kernel/head64.c
@@ -7,9 +7,6 @@

#define DISABLE_BRANCH_PROFILING

-/* cpu_feature_enabled() cannot be used this early */
-#define USE_EARLY_PGTABLE_L5
-
#include <linux/init.h>
#include <linux/linkage.h>
#include <linux/types.h>
@@ -50,14 +47,6 @@ extern pmd_t early_dynamic_pgts[EARLY_DYNAMIC_PAGE_TABLES][PTRS_PER_PMD];
static unsigned int __initdata next_early_pgt;
pmdval_t early_pmd_flags = __PAGE_KERNEL_LARGE & ~(_PAGE_GLOBAL | _PAGE_NX);

-#ifdef CONFIG_X86_5LEVEL
-unsigned int __pgtable_l5_enabled __ro_after_init;
-unsigned int pgdir_shift __ro_after_init = 39;
-EXPORT_SYMBOL(pgdir_shift);
-unsigned int ptrs_per_p4d __ro_after_init = 1;
-EXPORT_SYMBOL(ptrs_per_p4d);
-#endif
-
#ifdef CONFIG_DYNAMIC_MEMORY_LAYOUT
unsigned long page_offset_base __ro_after_init = __PAGE_OFFSET_BASE_L4;
EXPORT_SYMBOL(page_offset_base);
@@ -95,37 +84,6 @@ static unsigned long __head *fixup_long(void *ptr, unsigned long physaddr)
return fixup_pointer(ptr, physaddr);
}

-#ifdef CONFIG_X86_5LEVEL
-static unsigned int __head *fixup_int(void *ptr, unsigned long physaddr)
-{
- return fixup_pointer(ptr, physaddr);
-}
-
-static bool __head check_la57_support(unsigned long physaddr)
-{
- /*
- * 5-level paging is detected and enabled at kernel decompression
- * stage. Only check if it has been enabled there.
- */
- if (!(native_read_cr4() & X86_CR4_LA57))
- return false;
-
- *fixup_int(&__pgtable_l5_enabled, physaddr) = 1;
- *fixup_int(&pgdir_shift, physaddr) = 48;
- *fixup_int(&ptrs_per_p4d, physaddr) = 512;
- *fixup_long(&page_offset_base, physaddr) = __PAGE_OFFSET_BASE_L5;
- *fixup_long(&vmalloc_base, physaddr) = __VMALLOC_BASE_L5;
- *fixup_long(&vmemmap_base, physaddr) = __VMEMMAP_BASE_L5;
-
- return true;
-}
-#else
-static bool __head check_la57_support(unsigned long physaddr)
-{
- return false;
-}
-#endif
-
static unsigned long __head sme_postprocess_startup(struct boot_params *bp, pmdval_t *pmd)
{
unsigned long vaddr, vaddr_end;
@@ -189,7 +147,7 @@ unsigned long __head __startup_64(unsigned long physaddr,
int i;
unsigned int *next_pgt_ptr;

- la57 = check_la57_support(physaddr);
+ la57 = pgtable_l5_enabled();

/* Is the address too large? */
if (physaddr >> MAX_PHYSMEM_BITS)
@@ -486,6 +444,14 @@ asmlinkage __visible void __init __noreturn x86_64_start_kernel(char * real_mode
(__START_KERNEL & PGDIR_MASK)));
BUILD_BUG_ON(__fix_to_virt(__end_of_fixed_addresses) <= MODULES_END);

+#ifdef CONFIG_DYNAMIC_MEMORY_LAYOUT
+ if (pgtable_l5_enabled()) {
+ page_offset_base = __PAGE_OFFSET_BASE_L5;
+ vmalloc_base = __VMALLOC_BASE_L5;
+ vmemmap_base = __VMEMMAP_BASE_L5;
+ }
+#endif
+
cr4_init_shadow();

/* Kill off the identity-map trampoline */
diff --git a/arch/x86/mm/kasan_init_64.c b/arch/x86/mm/kasan_init_64.c
index 0302491d799d..85ae1ef840cc 100644
--- a/arch/x86/mm/kasan_init_64.c
+++ b/arch/x86/mm/kasan_init_64.c
@@ -2,9 +2,6 @@
#define DISABLE_BRANCH_PROFILING
#define pr_fmt(fmt) "kasan: " fmt

-/* cpu_feature_enabled() cannot be used this early */
-#define USE_EARLY_PGTABLE_L5
-
#include <linux/memblock.h>
#include <linux/kasan.h>
#include <linux/kdebug.h>
diff --git a/arch/x86/mm/mem_encrypt_identity.c b/arch/x86/mm/mem_encrypt_identity.c
index 06466f6d5966..2e195866a7fe 100644
--- a/arch/x86/mm/mem_encrypt_identity.c
+++ b/arch/x86/mm/mem_encrypt_identity.c
@@ -27,15 +27,6 @@
#undef CONFIG_PARAVIRT_XXL
#undef CONFIG_PARAVIRT_SPINLOCKS

-/*
- * This code runs before CPU feature bits are set. By default, the
- * pgtable_l5_enabled() function uses bit X86_FEATURE_LA57 to determine if
- * 5-level paging is active, so that won't work here. USE_EARLY_PGTABLE_L5
- * is provided to handle this situation and, instead, use a variable that
- * has been set by the early boot code.
- */
-#define USE_EARLY_PGTABLE_L5
-
#include <linux/kernel.h>
#include <linux/mm.h>
#include <linux/mem_encrypt.h>
--
2.43.0.429.g432eaa2c6b-goog


2024-01-29 18:17:48

by Ard Biesheuvel

Subject: [PATCH v3 09/19] x86/head64: Simplify GDT/IDT initialization code

From: Ard Biesheuvel <[email protected]>

There used to be two separate code paths for programming the IDT early:
one that was called via the 1:1 mapping, and one via the kernel virtual
mapping; the former used explicit pointer fixups to obtain 1:1 mapped
addresses.

That distinction is now gone so the GDT/IDT init code can be unified and
simplified accordingly.
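The resulting unified helper simply takes the #VC handler to install as
a parameter (see the diff below):

	static void early_load_idt(void (*handler)(void));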

Signed-off-by: Ard Biesheuvel <[email protected]>
---
arch/x86/kernel/head64.c | 57 +++++++-------------
1 file changed, 18 insertions(+), 39 deletions(-)

diff --git a/arch/x86/kernel/head64.c b/arch/x86/kernel/head64.c
index a4a380494703..58c58c66dec9 100644
--- a/arch/x86/kernel/head64.c
+++ b/arch/x86/kernel/head64.c
@@ -59,21 +59,12 @@ EXPORT_SYMBOL(vmemmap_base);
/*
* GDT used on the boot CPU before switching to virtual addresses.
*/
-static struct desc_struct startup_gdt[GDT_ENTRIES] __initdata = {
+static struct desc_struct startup_gdt[GDT_ENTRIES] __initconst = {
[GDT_ENTRY_KERNEL32_CS] = GDT_ENTRY_INIT(DESC_CODE32, 0, 0xfffff),
[GDT_ENTRY_KERNEL_CS] = GDT_ENTRY_INIT(DESC_CODE64, 0, 0xfffff),
[GDT_ENTRY_KERNEL_DS] = GDT_ENTRY_INIT(DESC_DATA64, 0, 0xfffff),
};

-/*
- * Address needs to be set at runtime because it references the startup_gdt
- * while the kernel still uses a direct mapping.
- */
-static struct desc_ptr startup_gdt_descr __initdata = {
- .size = sizeof(startup_gdt)-1,
- .address = 0,
-};
-
#define __va_symbol(sym) ({ \
unsigned long __v; \
asm("movq $" __stringify(sym) ", %0":"=r"(__v)); \
@@ -517,47 +508,32 @@ void __init __noreturn x86_64_start_reservations(char *real_mode_data)
*/
static gate_desc bringup_idt_table[NUM_EXCEPTION_VECTORS] __page_aligned_data;

-static struct desc_ptr bringup_idt_descr = {
- .size = (NUM_EXCEPTION_VECTORS * sizeof(gate_desc)) - 1,
- .address = 0, /* Set at runtime */
-};
-
-static void set_bringup_idt_handler(gate_desc *idt, int n, void *handler)
-{
-#ifdef CONFIG_AMD_MEM_ENCRYPT
- struct idt_data data;
- gate_desc desc;
-
- init_idt_data(&data, n, handler);
- idt_init_desc(&desc, &data);
- native_write_idt_entry(idt, n, &desc);
-#endif
-}
-
-/* This runs while still in the direct mapping */
-static void __head startup_64_load_idt(void)
+static void early_load_idt(void (*handler)(void))
{
gate_desc *idt = bringup_idt_table;
+ struct desc_ptr bringup_idt_descr;
+
+ if (IS_ENABLED(CONFIG_AMD_MEM_ENCRYPT)) {
+ struct idt_data data;
+ gate_desc desc;

- if (IS_ENABLED(CONFIG_AMD_MEM_ENCRYPT))
/* VMM Communication Exception */
- set_bringup_idt_handler(idt, X86_TRAP_VC, vc_no_ghcb);
+ init_idt_data(&data, X86_TRAP_VC, handler);
+ idt_init_desc(&desc, &data);
+ native_write_idt_entry(idt, X86_TRAP_VC, &desc);
+ }

bringup_idt_descr.address = (unsigned long)idt;
+ bringup_idt_descr.size = sizeof(bringup_idt_table);
native_load_idt(&bringup_idt_descr);
}

-/* This is used when running on kernel addresses */
void early_setup_idt(void)
{
- /* VMM Communication Exception */
- if (IS_ENABLED(CONFIG_AMD_MEM_ENCRYPT)) {
+ if (IS_ENABLED(CONFIG_AMD_MEM_ENCRYPT))
setup_ghcb();
- set_bringup_idt_handler(bringup_idt_table, X86_TRAP_VC, vc_boot_ghcb);
- }

- bringup_idt_descr.address = (unsigned long)bringup_idt_table;
- native_load_idt(&bringup_idt_descr);
+ early_load_idt(vc_boot_ghcb);
}

/*
@@ -565,8 +541,11 @@ void early_setup_idt(void)
*/
void __head startup_64_setup_env(void)
{
+ struct desc_ptr startup_gdt_descr;
+
/* Load GDT */
startup_gdt_descr.address = (unsigned long)startup_gdt;
+ startup_gdt_descr.size = sizeof(startup_gdt) - 1;
native_load_gdt(&startup_gdt_descr);

/* New GDT is live - reload data segment registers */
@@ -574,5 +553,5 @@ void __head startup_64_setup_env(void)
"movl %%eax, %%ss\n"
"movl %%eax, %%es\n" : : "a"(__KERNEL_DS) : "memory");

- startup_64_load_idt();
+ early_load_idt(vc_no_ghcb);
}
--
2.43.0.429.g432eaa2c6b-goog


2024-02-07 13:31:30

by Borislav Petkov

Subject: Re: [PATCH v3 06/19] x86/startup_64: Drop global variables keeping track of LA57 state

On Mon, Jan 29, 2024 at 07:05:09PM +0100, Ard Biesheuvel wrote:
> static inline bool pgtable_l5_enabled(void)
> {
> return __pgtable_l5_enabled;
> }
> #else
> -#define pgtable_l5_enabled() cpu_feature_enabled(X86_FEATURE_LA57)
> +#define pgtable_l5_enabled() !!(native_read_cr4() & X86_CR4_LA57)
> #endif /* USE_EARLY_PGTABLE_L5 */

Can we drop this ifdeffery and simply have __pgtable_l5_enabled always
present and contain the correct value?

So that we don't have an expensive CR4 read hidden in
pgtable_l5_enabled()?

For the sake of simplicity, pgtable_l5_enabled() can be defined outside
of CONFIG_X86_5LEVEL and since both vendors support 5level now, might as
well start dropping the CONFIG ifdeffery slowly...
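IOW, something like this (just a sketch of the idea):

	extern unsigned int __pgtable_l5_enabled;

	static inline bool pgtable_l5_enabled(void)
	{
		return __pgtable_l5_enabled;
	}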

Other than that - a nice cleanup!

Thx.

--
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette

2024-02-09 13:55:35

by Ard Biesheuvel

Subject: Re: [PATCH v3 06/19] x86/startup_64: Drop global variables keeping track of LA57 state

On Wed, 7 Feb 2024 at 13:29, Borislav Petkov <[email protected]> wrote:
>
> On Mon, Jan 29, 2024 at 07:05:09PM +0100, Ard Biesheuvel wrote:
> > static inline bool pgtable_l5_enabled(void)
> > {
> > return __pgtable_l5_enabled;
> > }
> > #else
> > -#define pgtable_l5_enabled() cpu_feature_enabled(X86_FEATURE_LA57)
> > +#define pgtable_l5_enabled() !!(native_read_cr4() & X86_CR4_LA57)
> > #endif /* USE_EARLY_PGTABLE_L5 */
>
> Can we drop this ifdeffery and simply have __pgtable_l5_enabled always
> present and contain the correct value?
>

I was trying to get rid of global variable assignments and accesses
from the 1:1 mapping, but since we cannot get rid of those entirely,
we might just keep __pgtable_l5_enabled but use RIP_REL_REF() in the
accessors, and move the assignment to the asm startup code.
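I.e., something like this (a sketch, using the existing RIP_REL_REF()
macro):

	static inline bool pgtable_l5_enabled(void)
	{
		return RIP_REL_REF(__pgtable_l5_enabled);
	}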

> So that we don't have an expensive CR4 read hidden in
> pgtable_l5_enabled()?
>

Yeah, I didn't realize it was expensive. Alternatively, we might do
something like

static __always_inline bool pgtable_l5_enabled(void)
{
	unsigned long r;
	bool ret;

	asm(ALTERNATIVE_TERNARY(
		"movq %%cr4, %[reg] \n\t btl %[la57], %k[reg]" CC_SET(c),
		%P[feat], "stc", "clc")
		: [reg] "=r" (r), CC_OUT(c) (ret)
		: [feat] "i" (X86_FEATURE_LA57),
		  [la57] "i" (X86_CR4_LA57_BIT)
		: "cc");
	return ret;
}

but we'd still have two versions in that case.

> For the sake of simplicity, pgtable_l5_enabled() can be defined outside
> of CONFIG_X86_5LEVEL and since both vendors support 5level now, might as
> well start dropping the CONFIG ifdeffery slowly...
>
> Other than that - a nice cleanup!
>

Thanks.

2024-02-10 10:51:01

by Borislav Petkov

Subject: Re: [PATCH v3 06/19] x86/startup_64: Drop global variables keeping track of LA57 state

On Fri, Feb 09, 2024 at 01:55:02PM +0000, Ard Biesheuvel wrote:
> I was trying to get rid of global variable assignments and accesses
> from the 1:1 mapping, but since we cannot get rid of those entirely,
> we might just keep __pgtable_l5_enabled but use RIP_REL_REF() in the
> accessors, and move the assignment to the asm startup code.

Yeah.

> asm(ALTERNATIVE_TERNARY(
> "movq %%cr4, %[reg] \n\t btl %[la57], %k[reg]" CC_SET(c),
> %P[feat], "stc", "clc")
> : [reg] "=r" (r), CC_OUT(c) (ret)
> : [feat] "i" (X86_FEATURE_LA57),
> [la57] "i" (X86_CR4_LA57_BIT)
> : "cc");

Creative :)

> but we'd still have two versions in that case.

Yap. RIP_REL_REF() ain't too bad ...

Thx.

--
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette

2024-02-11 22:37:14

by Ard Biesheuvel

Subject: Re: [PATCH v3 06/19] x86/startup_64: Drop global variables keeping track of LA57 state

On Sat, 10 Feb 2024 at 11:41, Borislav Petkov <[email protected]> wrote:
>
> On Fri, Feb 09, 2024 at 01:55:02PM +0000, Ard Biesheuvel wrote:
> > I was trying to get rid of global variable assignments and accesses
> > from the 1:1 mapping, but since we cannot get rid of those entirely,
> > we might just keep __pgtable_l5_enabled but use RIP_REL_REF() in the
> > accessors, and move the assignment to the asm startup code.
>
> Yeah.
>
> > asm(ALTERNATIVE_TERNARY(
> > "movq %%cr4, %[reg] \n\t btl %[la57], %k[reg]" CC_SET(c),
> > %P[feat], "stc", "clc")
> > : [reg] "=r" (r), CC_OUT(c) (ret)
> > : [feat] "i" (X86_FEATURE_LA57),
> > [la57] "i" (X86_CR4_LA57_BIT)
> > : "cc");
>
> Creative :)
>
> > but we'd still have two versions in that case.
>
> Yap. RIP_REL_REF() ain't too bad ...
>

We can actually rip all of that stuff out, and have only a single
implementation of pgtable_l5_enabled() that is not based on a variable
at all. It results in a nice cleanup, but I'll keep it as a separate
patch in the next revision so we can easily drop it if preferred.

2024-02-12 14:37:55

by Borislav Petkov

Subject: Re: [PATCH v3 09/19] x86/head64: Simplify GDT/IDT initialization code

On Mon, Jan 29, 2024 at 07:05:12PM +0100, Ard Biesheuvel wrote:
> From: Ard Biesheuvel <[email protected]>
>
> There used to be two separate code paths for programming the IDT early:
> one that was called via the 1:1 mapping, and one via the kernel virtual
> mapping, where the former used explicit pointer fixups to obtain 1:1
> mapped addresses.
>
> That distinction is now gone so the GDT/IDT init code can be unified and
> simplified accordingly.
>
> Signed-off-by: Ard Biesheuvel <[email protected]>
> ---
> arch/x86/kernel/head64.c | 57 +++++++-------------
> 1 file changed, 18 insertions(+), 39 deletions(-)

Ok, I don't see anything wrong here and since this one is the last of
the cleanup, lemme stop here so that you can send a new revision. We can
deal with whether we want .pi.text later.

Thx.

--
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette

2024-02-12 15:24:15

by Ard Biesheuvel

Subject: Re: [PATCH v3 09/19] x86/head64: Simplify GDT/IDT initialization code

On Mon, 12 Feb 2024 at 15:37, Borislav Petkov <[email protected]> wrote:
>
> On Mon, Jan 29, 2024 at 07:05:12PM +0100, Ard Biesheuvel wrote:
> > From: Ard Biesheuvel <[email protected]>
> >
> > There used to be two separate code paths for programming the IDT early:
> > one that was called via the 1:1 mapping, and one via the kernel virtual
> > mapping, where the former used explicit pointer fixups to obtain 1:1
> > mapped addresses.
> >
> > That distinction is now gone so the GDT/IDT init code can be unified and
> > simplified accordingly.
> >
> > Signed-off-by: Ard Biesheuvel <[email protected]>
> > ---
> > arch/x86/kernel/head64.c | 57 +++++++-------------
> > 1 file changed, 18 insertions(+), 39 deletions(-)
>
> Ok, I don't see anything wrong here and since this one is the last of
> the cleanup, lemme stop here so that you can send a new revision. We can
> deal with whether we want .pi.text later.
>

OK.

I'll have the next rev out shortly, thanks.