2023-08-02 16:51:57

by Ard Biesheuvel

[permalink] [raw]
Subject: [PATCH v8 00/23] efi/x86: Avoid bare metal decompressor during EFI boot

Update the x86 boot path to avoid the bare metal decompressor when
booting via the EFI stub. The bare metal decompressor inherits the
loader's 1:1 mapping of DRAM when entering in 64-bit mode, and assumes
that all of it is mapped read/write/execute, which will no longer be the
case on systems built to comply with recently tightened logo
requirements (*).

Changes since v7 [10]:
- Rebase onto suggested x86/kexec fix [11] that will most likely land
first
- Drop only vaguely relevant i386 head_32.S patch
- Restructure the 32-bit trampoline changes slightly into steps that are
clearer and easier to follow.
- Fix wording and placement of comments according to suggestions by
Borislav

Changes since v6 [9]:
- add new patch to fix our current reliance on 64-bit GPRs retaining
their full width contents across the switch into 32-bit protected
mode (with fixes: tag, may need to go to -stable);
- preserve the top half of RSP explicitly, and preserve all callee save
registers on the stack across the mode switch; this fixes a reported
issue on Ice Lake with kexec (which loads the kernel above 4G)

Changes since v5 [8]:
- reintroduce patch removing redundant RSI pushes and pops from
arch/x86/kernel/head_64.S
- avoid bare constant 0x200 for the offset of startup_64() in the
decompressor
- rejig SEV/SNP logic in patch #20 once again, to ensure that CPUID
calls and VM exits only occur when the active configuration permits
it
- improve/clarify some code comments and commit logs
- rebase onto v6.5-rc1

Changes since v4 [7]:
- avoid CPUID calls after protocol negotiation but before configuring
exception handling;
- drop patch removing redundant RSI pushes and pops from
arch/x86/kernel/head_64.S
- rebase onto -tip x86/cc - the conflicts are mostly trivial and
restricted to the last 4 patches in the series, so applying this onto
a separate topic branch should be straight-forward as well.

Changes since v3 [6]:
- trivial rebase onto Kirill's unaccepted memory series v13
- test SNP feature mask while running in the EFI boot services, and fail
gracefully on a mismatch
- perform only the SEV init after ExitBootServices()

Changes since v2 [4]:
- update prose style to comply with -tip guidelines
- rebased onto Kirill's unaccepted memory series [3]
- add Kirill's ack to 4/5-level paging changes
- perform SEV init and SNP feature check after ExitBootServices(), to
avoid corrupting the firmware's own SEV state
- split out preparatory refactor of handover entry code and BSS clearing
(patches #1 to #4)

Changes since v1 [2]:
- streamline existing 4/5 level switching code and call it directly from
the EFI stub - this is covered by the first 9 patches, which can be
applied in isolation, if desired;
- deal with SEV/SNP init explicitly;
- clear BSS when booting via the 'handover protocol'
- switch to kernel CS before calling SEV init code in kernel proper.

---- v1 cover letter follows ----

This series is conceptually a combination of Evgeny's series [0] and
mine [1], both of which attempt to make the early decompressor code more
amenable to executing in the EFI environment with stricter handling of
memory permissions.

My series [1] implemented zboot for x86, by getting rid of the entire
x86 decompressor, and replacing it with existing EFI code that does the
same but in a generic way. The downside of this is that only EFI boot is
supported, making it unviable for distros, which need to support BIOS
boot and hybrid EFI boot modes that omit the EFI stub.

Evgeny's series [0] adapted the entire decompressor code flow to allow
it to execute in the EFI context as well as the bare metal context, and
this involves changes to the 1:1 mapping code and the page fault
handlers etc, none of which are really needed when doing EFI boot in the
first place.

So this series attempts to occupy the middle ground here: it makes
minimal changes to the existing decompressor so some of it can be called
from the EFI stub. Then, it reimplements the EFI boot flow to decompress
the kernel and boot it directly, without relying on the trampoline
allocation code, page table code or page fault handling code. This
allows us to get rid of quite a bit of unsavory EFI stub code, and
replace it with two clear invocations of the EFI firmware APIs to clear
NX restrictions from allocations that have been populated with
executable code.

The only code that is being reused is the decompression library itself,
along with the minimal ELF parsing that is required to copy the ELF
segments in place, and the relocation processing that fixes up absolute
symbol references to refer to the correct virtual addresses.

Note that some of Evgeny's changes to clean up the PE/COFF header
generation will still be needed, but I've omitted those here for
brevity.

(*) IMHO the following developments are likely to occur:
- the Windows boot chain (along with 3rd party drivers) is cleaned up so
that it never relies on memory being writable and executable at the
same time when running under the EFI boot services;
- the EFI reference implementation gets updated to map all memory NX by
default, and to require read-only permissions for executable mappings;
- BIOS vendors incorporate these changes into their codebases, and
deploy it more widely than just the 'secure' SKUs;
- OEMs only care about the Windows sticker [5], so they only boot test
Windows, which works fine in this more restricted context;
- Linux boot no longer works reliably on new hardware built for Windows
unless we clean up our boot chain as well.

Cc: Evgeniy Baskov <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Alexey Khoroshilov <[email protected]>
Cc: Peter Jones <[email protected]>
Cc: Gerd Hoffmann <[email protected]>
Cc: Dave Young <[email protected]>
Cc: Mario Limonciello <[email protected]>
Cc: Kees Cook <[email protected]>
Cc: Tom Lendacky <[email protected]>
Cc: Kirill A. Shutemov <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Joerg Roedel <[email protected]>

[0] https://lore.kernel.org/all/[email protected]/
[1] https://lore.kernel.org/all/[email protected]/
[2] https://lore.kernel.org/all/[email protected]/
[3] https://lore.kernel.org/all/[email protected]/
[4] https://lore.kernel.org/all/[email protected]/
[5] https://techcommunity.microsoft.com/t5/hardware-dev-center/new-uefi-ca-memory-mitigation-requirements-for-signing/ba-p/3608714
[6] https://lore.kernel.org/all/[email protected]/
[7] https://lore.kernel.org/all/[email protected]/
[8] https://lore.kernel.org/all/[email protected]/
[9] https://lore.kernel.org/all/[email protected]/
[10] https://lore.kernel.org/all/[email protected]/
[11] https://lore.kernel.org/all/20230802093927.GAZMokT57anC5jBISK@fat_crate.local/

Ard Biesheuvel (23):
x86/decompressor: Don't rely on upper 32 bits of GPRs being preserved
x86/head_64: Store boot_params pointer in callee save register
x86/efistub: Branch straight to kernel entry point from C code
x86/efistub: Simplify and clean up handover entry code
x86/decompressor: Avoid magic offsets for EFI handover entrypoint
x86/efistub: Clear BSS in EFI handover protocol entrypoint
x86/decompressor: Store boot_params pointer in callee save register
x86/decompressor: Assign paging related global variables earlier
x86/decompressor: Call trampoline as a normal function
x86/decompressor: Use standard calling convention for trampoline
x86/decompressor: Avoid the need for a stack in the 32-bit trampoline
x86/decompressor: Call trampoline directly from C code
x86/decompressor: Only call the trampoline when changing paging levels
x86/decompressor: Pass pgtable address to trampoline directly
x86/decompressor: Merge trampoline cleanup with switching code
x86/efistub: Perform 4/5 level paging switch from the stub
x86/efistub: Prefer EFI memory attributes protocol over DXE services
decompress: Use 8 byte alignment
x86/decompressor: Move global symbol references to C code
x86/decompressor: Factor out kernel decompression and relocation
efi/libstub: Add limit argument to efi_random_alloc()
x86/efistub: Perform SNP feature test while running in the firmware
x86/efistub: Avoid legacy decompressor when doing EFI boot

Documentation/arch/x86/boot.rst | 2 +-
arch/x86/boot/compressed/Makefile | 5 +
arch/x86/boot/compressed/efi_mixed.S | 107 +++-----
arch/x86/boot/compressed/head_32.S | 32 ---
arch/x86/boot/compressed/head_64.S | 280 ++++++-------------
arch/x86/boot/compressed/misc.c | 44 ++-
arch/x86/boot/compressed/misc.h | 2 -
arch/x86/boot/compressed/pgtable.h | 10 +-
arch/x86/boot/compressed/pgtable_64.c | 87 +++---
arch/x86/boot/compressed/sev.c | 112 ++++----
arch/x86/include/asm/boot.h | 8 +
arch/x86/include/asm/efi.h | 7 +-
arch/x86/include/asm/sev.h | 6 +
arch/x86/kernel/head_64.S | 31 +--
drivers/firmware/efi/libstub/Makefile | 1 +
drivers/firmware/efi/libstub/arm64-stub.c | 2 +-
drivers/firmware/efi/libstub/efi-stub-helper.c | 2 +
drivers/firmware/efi/libstub/efistub.h | 3 +-
drivers/firmware/efi/libstub/randomalloc.c | 10 +-
drivers/firmware/efi/libstub/x86-5lvl.c | 95 +++++++
drivers/firmware/efi/libstub/x86-stub.c | 285 +++++++++++---------
drivers/firmware/efi/libstub/x86-stub.h | 17 ++
drivers/firmware/efi/libstub/zboot.c | 2 +-
include/linux/decompress/mm.h | 2 +-
24 files changed, 588 insertions(+), 564 deletions(-)
create mode 100644 drivers/firmware/efi/libstub/x86-5lvl.c
create mode 100644 drivers/firmware/efi/libstub/x86-stub.h

--
2.39.2



2023-08-02 16:53:40

by Ard Biesheuvel

[permalink] [raw]
Subject: [PATCH v8 13/23] x86/decompressor: Only call the trampoline when changing paging levels

Since the current and desired number of paging levels are known when the
trampoline is being prepared, avoid calling the trampoline at all if it
is clear that calling it is not going to result in a change to the
number of paging levels.

Given that the CPU is already running in long mode, the PAE and LA57
settings are necessarily consistent with the currently active page
tables, and other fields in CR4 will be initialized by the startup code
in the kernel proper. So limit the manipulation of CR4 to toggling the
LA57 bit, which is the only thing that really needs doing at this point
in the boot. This also means that there is no need to pass the value of
l5_required to toggle_la57(), as it will not be called unless CR4.LA57
needs to toggle.

Acked-by: Kirill A. Shutemov <[email protected]>
Signed-off-by: Ard Biesheuvel <[email protected]>
---
arch/x86/boot/compressed/head_64.S | 45 ++------------------
arch/x86/boot/compressed/pgtable_64.c | 22 ++++------
2 files changed, 13 insertions(+), 54 deletions(-)

diff --git a/arch/x86/boot/compressed/head_64.S b/arch/x86/boot/compressed/head_64.S
index cd6e3e175389aa6b..8730b1d58e2b0825 100644
--- a/arch/x86/boot/compressed/head_64.S
+++ b/arch/x86/boot/compressed/head_64.S
@@ -387,10 +387,6 @@ SYM_CODE_START(startup_64)
* For the trampoline, we need the top page table to reside in lower
* memory as we don't have a way to load 64-bit values into CR3 in
* 32-bit mode.
- *
- * We go though the trampoline even if we don't have to: if we're
- * already in a desired paging mode. This way the trampoline code gets
- * tested on every boot.
*/

/* Make sure we have GDT with 32-bit code segment */
@@ -526,8 +522,7 @@ SYM_FUNC_END(.Lrelocated)
*
* Return address is at the top of the stack (might be above 4G).
* The first argument (EDI) contains the 32-bit addressable base of the
- * trampoline memory. A non-zero second argument (ESI) means that the
- * trampoline needs to enable 5-level paging.
+ * trampoline memory.
*/
.section ".rodata", "a", @progbits
SYM_CODE_START(trampoline_32bit_src)
@@ -579,25 +574,10 @@ SYM_CODE_START(trampoline_32bit_src)
btrl $X86_CR0_PG_BIT, %eax
movl %eax, %cr0

- /* Check what paging mode we want to be in after the trampoline */
- testl %esi, %esi
- jz 1f
-
- /* We want 5-level paging: don't touch CR3 if it already points to 5-level page tables */
- movl %cr4, %eax
- testl $X86_CR4_LA57, %eax
- jnz 3f
- jmp 2f
-1:
- /* We want 4-level paging: don't touch CR3 if it already points to 4-level page tables */
- movl %cr4, %eax
- testl $X86_CR4_LA57, %eax
- jz 3f
-2:
/* Point CR3 to the trampoline's new top level page table */
leal TRAMPOLINE_32BIT_PGTABLE_OFFSET(%edi), %eax
movl %eax, %cr3
-3:
+
/* Set EFER.LME=1 as a precaution in case hypervsior pulls the rug */
movl $MSR_EFER, %ecx
rdmsr
@@ -606,26 +586,9 @@ SYM_CODE_START(trampoline_32bit_src)
jc 1f
wrmsr
1:
-#ifdef CONFIG_X86_MCE
- /*
- * Preserve CR4.MCE if the kernel will enable #MC support.
- * Clearing MCE may fault in some environments (that also force #MC
- * support). Any machine check that occurs before #MC support is fully
- * configured will crash the system regardless of the CR4.MCE value set
- * here.
- */
+ /* Toggle CR4.LA57 */
movl %cr4, %eax
- andl $X86_CR4_MCE, %eax
-#else
- movl $0, %eax
-#endif
-
- /* Enable PAE and LA57 (if required) paging modes */
- orl $X86_CR4_PAE, %eax
- testl %esi, %esi
- jz 1f
- orl $X86_CR4_LA57, %eax
-1:
+ btcl $X86_CR4_LA57_BIT, %eax
movl %eax, %cr4

/* Enable paging again. */
diff --git a/arch/x86/boot/compressed/pgtable_64.c b/arch/x86/boot/compressed/pgtable_64.c
index f9cc86b2ee55ca80..4213473ae54883c8 100644
--- a/arch/x86/boot/compressed/pgtable_64.c
+++ b/arch/x86/boot/compressed/pgtable_64.c
@@ -103,7 +103,7 @@ static unsigned long find_trampoline_placement(void)

asmlinkage void configure_5level_paging(struct boot_params *bp)
{
- void (*toggle_la57)(void *trampoline, bool enable_5lvl);
+ void (*toggle_la57)(void *trampoline);
bool l5_required = false;

/* Initialize boot_params. Required for cmdline_find_option_bool(). */
@@ -133,6 +133,13 @@ asmlinkage void configure_5level_paging(struct boot_params *bp)
ptrs_per_p4d = 512;
}

+ /*
+ * The trampoline will not be used if the paging mode is already set to
+ * the desired one.
+ */
+ if (l5_required == !!(native_read_cr4() & X86_CR4_LA57))
+ return;
+
trampoline_32bit = (unsigned long *)find_trampoline_placement();

/* Preserve trampoline memory */
@@ -160,18 +167,8 @@ asmlinkage void configure_5level_paging(struct boot_params *bp)
*
* The new page table will be used by trampoline code for switching
* from 4- to 5-level paging or vice versa.
- *
- * If switching is not required, the page table is unused: trampoline
- * code wouldn't touch CR3.
*/

- /*
- * We are not going to use the page table in trampoline memory if we
- * are already in the desired paging mode.
- */
- if (l5_required == !!(native_read_cr4() & X86_CR4_LA57))
- goto out;
-
if (l5_required) {
/*
* For 4- to 5-level paging transition, set up current CR3 as
@@ -194,8 +191,7 @@ asmlinkage void configure_5level_paging(struct boot_params *bp)
(void *)src, PAGE_SIZE);
}

-out:
- toggle_la57(trampoline_32bit, l5_required);
+ toggle_la57(trampoline_32bit);
}

void cleanup_trampoline(void *pgtable)
--
2.39.2


2023-08-02 16:54:44

by Ard Biesheuvel

[permalink] [raw]
Subject: [PATCH v8 04/23] x86/efistub: Simplify and clean up handover entry code

Now that the EFI entry code in assembler is only used by the optional
and deprecated EFI handover protocol, and given that the EFI stub C code
no longer returns to it, most of it can simply be dropped.

While at it, clarify the symbol naming, by merging efi_main() and
efi_stub_entry(), making the latter the shared entry point for all
different boot modes that enter via the EFI stub.

The efi32_stub_entry() and efi64_stub_entry() names are referenced
explicitly by the tooling that populates the setup header, so these must
be retained, but can be emitted as aliases of efi_stub_entry() where
appropriate.

Signed-off-by: Ard Biesheuvel <[email protected]>
---
Documentation/arch/x86/boot.rst | 2 +-
arch/x86/boot/compressed/efi_mixed.S | 22 +++++++++++---------
arch/x86/boot/compressed/head_32.S | 11 ----------
arch/x86/boot/compressed/head_64.S | 12 ++---------
drivers/firmware/efi/libstub/x86-stub.c | 20 ++++++++++++++----
5 files changed, 31 insertions(+), 36 deletions(-)

diff --git a/Documentation/arch/x86/boot.rst b/Documentation/arch/x86/boot.rst
index 33520ecdb37abfda..cdbca15a4fc23833 100644
--- a/Documentation/arch/x86/boot.rst
+++ b/Documentation/arch/x86/boot.rst
@@ -1417,7 +1417,7 @@ execution context provided by the EFI firmware.

The function prototype for the handover entry point looks like this::

- efi_main(void *handle, efi_system_table_t *table, struct boot_params *bp)
+ efi_stub_entry(void *handle, efi_system_table_t *table, struct boot_params *bp)

'handle' is the EFI image handle passed to the boot loader by the EFI
firmware, 'table' is the EFI system table - these are the first two
diff --git a/arch/x86/boot/compressed/efi_mixed.S b/arch/x86/boot/compressed/efi_mixed.S
index 4ca70bf93dc0bdcd..dcc562c8f7f35162 100644
--- a/arch/x86/boot/compressed/efi_mixed.S
+++ b/arch/x86/boot/compressed/efi_mixed.S
@@ -26,8 +26,8 @@
* When booting in 64-bit mode on 32-bit EFI firmware, startup_64_mixed_mode()
* is the first thing that runs after switching to long mode. Depending on
* whether the EFI handover protocol or the compat entry point was used to
- * enter the kernel, it will either branch to the 64-bit EFI handover
- * entrypoint at offset 0x390 in the image, or to the 64-bit EFI PE/COFF
+ * enter the kernel, it will either branch to the common 64-bit EFI stub
+ * entrypoint efi_stub_entry() directly, or via the 64-bit EFI PE/COFF
* entrypoint efi_pe_entry(). In the former case, the bootloader must provide a
* struct bootparams pointer as the third argument, so the presence of such a
* pointer is used to disambiguate.
@@ -37,21 +37,23 @@
* | efi32_pe_entry |---->| | | +-----------+--+
* +------------------+ | | +------+----------------+ |
* | startup_32 |---->| startup_64_mixed_mode | |
- * +------------------+ | | +------+----------------+ V
- * | efi32_stub_entry |---->| | | +------------------+
- * +------------------+ +------------+ +---->| efi64_stub_entry |
- * +-------------+----+
- * +------------+ +----------+ |
- * | startup_64 |<----| efi_main |<--------------+
- * +------------+ +----------+
+ * +------------------+ | | +------+----------------+ |
+ * | efi32_stub_entry |---->| | | |
+ * +------------------+ +------------+ | |
+ * V |
+ * +------------+ +----------------+ |
+ * | startup_64 |<----| efi_stub_entry |<--------+
+ * +------------+ +----------------+
*/
SYM_FUNC_START(startup_64_mixed_mode)
lea efi32_boot_args(%rip), %rdx
mov 0(%rdx), %edi
mov 4(%rdx), %esi
+#ifdef CONFIG_EFI_HANDOVER_PROTOCOL
mov 8(%rdx), %edx // saved bootparams pointer
test %edx, %edx
- jnz efi64_stub_entry
+ jnz efi_stub_entry
+#endif
/*
* efi_pe_entry uses MS calling convention, which requires 32 bytes of
* shadow space on the stack even if all arguments are passed in
diff --git a/arch/x86/boot/compressed/head_32.S b/arch/x86/boot/compressed/head_32.S
index 987ae727cf9f0d04..8876ffe30e9a4819 100644
--- a/arch/x86/boot/compressed/head_32.S
+++ b/arch/x86/boot/compressed/head_32.S
@@ -150,17 +150,6 @@ SYM_FUNC_START(startup_32)
jmp *%eax
SYM_FUNC_END(startup_32)

-#ifdef CONFIG_EFI_STUB
-SYM_FUNC_START(efi32_stub_entry)
- add $0x4, %esp
- movl 8(%esp), %esi /* save boot_params pointer */
- call efi_main
- /* efi_main returns the possibly relocated address of startup_32 */
- jmp *%eax
-SYM_FUNC_END(efi32_stub_entry)
-SYM_FUNC_ALIAS(efi_stub_entry, efi32_stub_entry)
-#endif
-
.text
SYM_FUNC_START_LOCAL_NOALIGN(.Lrelocated)

diff --git a/arch/x86/boot/compressed/head_64.S b/arch/x86/boot/compressed/head_64.S
index f732426d3b483139..e6880208b9a1b90a 100644
--- a/arch/x86/boot/compressed/head_64.S
+++ b/arch/x86/boot/compressed/head_64.S
@@ -542,19 +542,11 @@ trampoline_return:
jmp *%rax
SYM_CODE_END(startup_64)

-#ifdef CONFIG_EFI_STUB
-#ifdef CONFIG_EFI_HANDOVER_PROTOCOL
+#if IS_ENABLED(CONFIG_EFI_MIXED) && IS_ENABLED(CONFIG_EFI_HANDOVER_PROTOCOL)
.org 0x390
-#endif
SYM_FUNC_START(efi64_stub_entry)
- and $~0xf, %rsp /* realign the stack */
- movq %rdx, %rbx /* save boot_params pointer */
- call efi_main
- movq %rbx,%rsi
- leaq rva(startup_64)(%rax), %rax
- jmp *%rax
+ jmp efi_stub_entry
SYM_FUNC_END(efi64_stub_entry)
-SYM_FUNC_ALIAS(efi_stub_entry, efi64_stub_entry)
#endif

.text
diff --git a/drivers/firmware/efi/libstub/x86-stub.c b/drivers/firmware/efi/libstub/x86-stub.c
index 40a10db2d85e7942..3f3b3edf7a387d7c 100644
--- a/drivers/firmware/efi/libstub/x86-stub.c
+++ b/drivers/firmware/efi/libstub/x86-stub.c
@@ -817,9 +817,9 @@ static void __noreturn enter_kernel(unsigned long kernel_addr,
* return. On failure, it will exit to the firmware via efi_exit() instead of
* returning.
*/
-asmlinkage unsigned long efi_main(efi_handle_t handle,
- efi_system_table_t *sys_table_arg,
- struct boot_params *boot_params)
+void __noreturn efi_stub_entry(efi_handle_t handle,
+ efi_system_table_t *sys_table_arg,
+ struct boot_params *boot_params)
{
unsigned long bzimage_addr = (unsigned long)startup_32;
unsigned long buffer_start, buffer_end;
@@ -964,7 +964,19 @@ asmlinkage unsigned long efi_main(efi_handle_t handle,

enter_kernel(bzimage_addr, boot_params);
fail:
- efi_err("efi_main() failed!\n");
+ efi_err("efi_stub_entry() failed!\n");

efi_exit(handle, status);
}
+
+#ifdef CONFIG_EFI_HANDOVER_PROTOCOL
+#ifndef CONFIG_EFI_MIXED
+extern __alias(efi_stub_entry)
+void efi32_stub_entry(efi_handle_t handle, efi_system_table_t *sys_table_arg,
+ struct boot_params *boot_params);
+
+extern __alias(efi_stub_entry)
+void efi64_stub_entry(efi_handle_t handle, efi_system_table_t *sys_table_arg,
+ struct boot_params *boot_params);
+#endif
+#endif
--
2.39.2


2023-08-05 15:01:53

by Borislav Petkov

[permalink] [raw]
Subject: Re: [PATCH v8 00/23] efi/x86: Avoid bare metal decompressor during EFI boot

On Wed, Aug 02, 2023 at 05:48:08PM +0200, Ard Biesheuvel wrote:
> Update the x86 boot path to avoid the bare metal decompressor when
> booting via the EFI stub. The bare metal decompressor inherits the
> loader's 1:1 mapping of DRAM when entering in 64-bit mode, and assumes
> that all of it is mapped read/write/execute, which will no longer be the
> case on systems built to comply with recently tightened logo
> requirements (*).
>
> Changes since v7 [10]:

My Zen1 box fails booting with those. It is related to memory encryption
because if I supply "mem_encrypt=off", it boots.

The failure is (typing it off from the video from the BMC):

/dev/root: Can't open blockdev
VFS: Cannot open root device "UUID=..."
Please append a correct "root=" boot option;
...

I'll bisect now but it is pretty clear which one is the culprit.

--
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette

2023-08-05 17:45:27

by Ard Biesheuvel

[permalink] [raw]
Subject: Re: [PATCH v8 00/23] efi/x86: Avoid bare metal decompressor during EFI boot

On Sat, 5 Aug 2023 at 16:41, Borislav Petkov <[email protected]> wrote:
>
> On Wed, Aug 02, 2023 at 05:48:08PM +0200, Ard Biesheuvel wrote:
> > Update the x86 boot path to avoid the bare metal decompressor when
> > booting via the EFI stub. The bare metal decompressor inherits the
> > loader's 1:1 mapping of DRAM when entering in 64-bit mode, and assumes
> > that all of it is mapped read/write/execute, which will no longer be the
> > case on systems built to comply with recently tightened logo
> > requirements (*).
> >
> > Changes since v7 [10]:
>
> My Zen1 box fails booting with those. It is related to memory encryption
> because if I supply "mem_encrypt=off", it boots.
>
> The failure is (typing it off from the video from the BMC):
>
> /dev/root: Can't open blockdev
> VFS: Cannot open root device "UUID=..."
> Please append a correct "root=" boot option;
> ...
>
> I'll bisect now but it is pretty clear which one is the culprit.
>

So this is the host booting, right? So is the position of the C bit
perhaps getting detect incorrectly?

2023-08-05 22:25:14

by Borislav Petkov

[permalink] [raw]
Subject: Re: [PATCH v8 00/23] efi/x86: Avoid bare metal decompressor during EFI boot

On Sat, Aug 05, 2023 at 07:37:17PM +0200, Ard Biesheuvel wrote:
> So this is the host booting, right?

Yes.

> So is the position of the C bit perhaps getting detect incorrectly?

I don't know yet. But the evildoer is someone else:

02d47ce2ae02 ("x86/head_64: Store boot_params pointer in callee save register")

after bisection. More staring later as to why...

--
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette

2023-08-05 23:38:50

by Ard Biesheuvel

[permalink] [raw]
Subject: Re: [PATCH v8 00/23] efi/x86: Avoid bare metal decompressor during EFI boot

On Sat, 5 Aug 2023 at 23:07, Borislav Petkov <[email protected]> wrote:
>
> On Sat, Aug 05, 2023 at 07:37:17PM +0200, Ard Biesheuvel wrote:
> > So this is the host booting, right?
>
> Yes.
>
> > So is the position of the C bit perhaps getting detect incorrectly?
>
> I don't know yet. But the evildoer is someone else:
>
> 02d47ce2ae02 ("x86/head_64: Store boot_params pointer in callee save register")
>
> after bisection. More staring later as to why...
>

How bizarre.

But that was a bonus patch anyway, so we could just drop it for now.

2023-08-06 10:21:43

by Ard Biesheuvel

[permalink] [raw]
Subject: Re: [PATCH v8 00/23] efi/x86: Avoid bare metal decompressor during EFI boot

On Sun, 6 Aug 2023 at 00:19, Ard Biesheuvel <[email protected]> wrote:
>
> On Sat, 5 Aug 2023 at 23:07, Borislav Petkov <[email protected]> wrote:
> >
> > On Sat, Aug 05, 2023 at 07:37:17PM +0200, Ard Biesheuvel wrote:
> > > So this is the host booting, right?
> >
> > Yes.
> >
> > > So is the position of the C bit perhaps getting detect incorrectly?
> >
> > I don't know yet. But the evildoer is someone else:
> >
> > 02d47ce2ae02 ("x86/head_64: Store boot_params pointer in callee save register")
> >
> > after bisection. More staring later as to why...
> >
>
> How bizarre.
>
> But that was a bonus patch anyway, so we could just drop it for now.

I suspect this should fix the issue:

--- a/arch/x86/kernel/head_64.S
+++ b/arch/x86/kernel/head_64.S
@@ -110,6 +110,7 @@ SYM_CODE_START_NOALIGN(startup_64)
* programmed into CR3.
*/
leaq _text(%rip), %rdi
+ movq %r15, %rsi
call __startup_64

/* Form the CR3 value being sure to include the CR3 modifier */

2023-08-06 11:07:28

by Borislav Petkov

[permalink] [raw]
Subject: Re: [PATCH v8 00/23] efi/x86: Avoid bare metal decompressor during EFI boot

On Sun, Aug 06, 2023 at 12:05:25PM +0200, Ard Biesheuvel wrote:
> I suspect this should fix the issue:
>
> --- a/arch/x86/kernel/head_64.S
> +++ b/arch/x86/kernel/head_64.S
> @@ -110,6 +110,7 @@ SYM_CODE_START_NOALIGN(startup_64)
> * programmed into CR3.
> */
> leaq _text(%rip), %rdi
> + movq %r15, %rsi
> call __startup_64

Yah, I was suspecting we're not passing boot_params properly somewhere.
Aaand there it is.

So yes, that fixes it.

Lemme know when you've refreshed your branch so that I can continue
testing.

Thx.

--
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette

2023-08-06 11:07:39

by Ard Biesheuvel

[permalink] [raw]
Subject: Re: [PATCH v8 00/23] efi/x86: Avoid bare metal decompressor during EFI boot

On Sun, 6 Aug 2023 at 12:17, Borislav Petkov <[email protected]> wrote:
>
> On Sun, Aug 06, 2023 at 12:05:25PM +0200, Ard Biesheuvel wrote:
> > I suspect this should fix the issue:
> >
> > --- a/arch/x86/kernel/head_64.S
> > +++ b/arch/x86/kernel/head_64.S
> > @@ -110,6 +110,7 @@ SYM_CODE_START_NOALIGN(startup_64)
> > * programmed into CR3.
> > */
> > leaq _text(%rip), %rdi
> > + movq %r15, %rsi
> > call __startup_64
>
> Yah, I was suspecting we're not passing boot_params properly somewhere.
> Aaand there it is.
>
> So yes, that fixes it.
>
> Lemme know when you've refreshed your branch so that I can continue
> testing.
>

https://git.kernel.org/pub/scm/linux/kernel/git/ardb/linux.git/log/?h=x86-efistub-cleanup-v9

2023-08-07 17:10:49

by Borislav Petkov

[permalink] [raw]
Subject: Re: [PATCH v8 00/23] efi/x86: Avoid bare metal decompressor during EFI boot

On Sun, Aug 06, 2023 at 12:21:36PM +0200, Ard Biesheuvel wrote:
> https://git.kernel.org/pub/scm/linux/kernel/git/ardb/linux.git/log/?h=x86-efistub-cleanup-v9

Ok, boots on all my machines.

Please send it once again to the ML so that I can pick it up and have
Link: tags refer to an official submission.

Thx.

--
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette