2023-10-24 19:27:12

by Björn Töpel

[permalink] [raw]
Subject: [PATCH] riscv: CONFIG_EFI should not depend on CONFIG_RISCV_ISA_C

From: Björn Töpel <[email protected]>

UEFI/PE mandates that the kernel Image starts with "MZ" ASCII
(0x5A4D). Convenient enough, "MZ" is a valid compressed RISC-V
instruction. This means that a non-UEFI loader can simply jump to
"code0" in the Image header [1] and start executing.

The Image specification [1] says the following about "code0":
| This header is also reused to support EFI stub for RISC-V. EFI
| specification needs PE/COFF image header in the beginning of the
| kernel image in order to load it as an EFI application. In order
| to support EFI stub, code0 is replaced with "MZ" magic string
| and res3(at offset 0x3c) points to the rest of the PE/COFF
| header.

"MZ" is not a valid instruction for implementations without the C
extension.

A non-UEFI loader, loading a non-C UEFI image have the following
options:
1. Trap and emulate "code0"
2. Avoid "code0" if it is "MZ", and have the kernel entry at
"code1".

Replace the compressed instruction with a hex code variant, that works
for CONFIG_RISCV_ISA_C=n builds. Further, this change also make sure
that the "code0" instruction is 32b aligned.

[1] Documentation/riscv/boot-image-header.rst

Signed-off-by: Björn Töpel <[email protected]>
---
arch/riscv/Kconfig | 1 -
arch/riscv/kernel/head.S | 8 ++++++--
2 files changed, 6 insertions(+), 3 deletions(-)

diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
index d607ab0f7c6d..9c5bbbc93951 100644
--- a/arch/riscv/Kconfig
+++ b/arch/riscv/Kconfig
@@ -799,7 +799,6 @@ config EFI
select EFI_RUNTIME_WRAPPERS
select EFI_STUB
select LIBFDT
- select RISCV_ISA_C
select UCS2_STRING
help
This option provides support for runtime services provided
diff --git a/arch/riscv/kernel/head.S b/arch/riscv/kernel/head.S
index 3710ea5d160f..33d69b569843 100644
--- a/arch/riscv/kernel/head.S
+++ b/arch/riscv/kernel/head.S
@@ -27,9 +27,13 @@ ENTRY(_start)
*/
#ifdef CONFIG_EFI
/*
- * This instruction decodes to "MZ" ASCII required by UEFI.
+ * The compressed (C extension) "c.li s4,-13" instruction
+ * decodes to 0x5a4d/"MZ" (ASCII), which is required by UEFI.
+ *
+ * In order to support non-compressed EFI kernels, the
+ * instruction is written in hex.
*/
- c.li s4,-13
+ .word 0x5a4d5a4d
j _start_kernel
#else
/* jump to start kernel */

base-commit: d88520ad73b79e71e3ddf08de335b8520ae41c5c
--
2.40.1


2023-11-06 17:48:01

by Palmer Dabbelt

[permalink] [raw]
Subject: Re: [PATCH] riscv: CONFIG_EFI should not depend on CONFIG_RISCV_ISA_C

On Tue, 24 Oct 2023 12:26:48 PDT (-0700), [email protected] wrote:
> From: Björn Töpel <[email protected]>
>
> UEFI/PE mandates that the kernel Image starts with "MZ" ASCII
> (0x5A4D). Convenient enough, "MZ" is a valid compressed RISC-V
> instruction. This means that a non-UEFI loader can simply jump to
> "code0" in the Image header [1] and start executing.
>
> The Image specification [1] says the following about "code0":
> | This header is also reused to support EFI stub for RISC-V. EFI
> | specification needs PE/COFF image header in the beginning of the
> | kernel image in order to load it as an EFI application. In order
> | to support EFI stub, code0 is replaced with "MZ" magic string
> | and res3(at offset 0x3c) points to the rest of the PE/COFF
> | header.
>
> "MZ" is not a valid instruction for implementations without the C
> extension.
>
> A non-UEFI loader, loading a non-C UEFI image have the following
> options:
> 1. Trap and emulate "code0"
> 2. Avoid "code0" if it is "MZ", and have the kernel entry at
> "code1".
>
> Replace the compressed instruction with a hex code variant, that works
> for CONFIG_RISCV_ISA_C=n builds. Further, this change also make sure
> that the "code0" instruction is 32b aligned.

IMO this breaks the Image format on non-C systems: they'll jump straight
to the start (as there's no symbols left or anything) and then just fail
to execute the C instructions.

We could amend the Image format to require bootloaders to handle this
(ie, look at the first instructions and emulate them if there's no C),
that would require some bootloader updates but given this doesn't work
maybe that's fine.

We could also just stop producing Image+PE binaries on non-C systems
(ie, just produce PE).

> [1] Documentation/riscv/boot-image-header.rst
>
> Signed-off-by: Björn Töpel <[email protected]>
> ---
> arch/riscv/Kconfig | 1 -
> arch/riscv/kernel/head.S | 8 ++++++--
> 2 files changed, 6 insertions(+), 3 deletions(-)
>
> diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
> index d607ab0f7c6d..9c5bbbc93951 100644
> --- a/arch/riscv/Kconfig
> +++ b/arch/riscv/Kconfig
> @@ -799,7 +799,6 @@ config EFI
> select EFI_RUNTIME_WRAPPERS
> select EFI_STUB
> select LIBFDT
> - select RISCV_ISA_C
> select UCS2_STRING
> help
> This option provides support for runtime services provided
> diff --git a/arch/riscv/kernel/head.S b/arch/riscv/kernel/head.S
> index 3710ea5d160f..33d69b569843 100644
> --- a/arch/riscv/kernel/head.S
> +++ b/arch/riscv/kernel/head.S
> @@ -27,9 +27,13 @@ ENTRY(_start)
> */
> #ifdef CONFIG_EFI
> /*
> - * This instruction decodes to "MZ" ASCII required by UEFI.
> + * The compressed (C extension) "c.li s4,-13" instruction
> + * decodes to 0x5a4d/"MZ" (ASCII), which is required by UEFI.
> + *
> + * In order to support non-compressed EFI kernels, the
> + * instruction is written in hex.
> */
> - c.li s4,-13
> + .word 0x5a4d5a4d
> j _start_kernel
> #else
> /* jump to start kernel */
>
> base-commit: d88520ad73b79e71e3ddf08de335b8520ae41c5c

2023-11-06 19:21:20

by Björn Töpel

[permalink] [raw]
Subject: Re: [PATCH] riscv: CONFIG_EFI should not depend on CONFIG_RISCV_ISA_C

Palmer Dabbelt <[email protected]> writes:

> On Tue, 24 Oct 2023 12:26:48 PDT (-0700), [email protected] wrote:
>> From: Björn Töpel <[email protected]>
>>
>> UEFI/PE mandates that the kernel Image starts with "MZ" ASCII
>> (0x5A4D). Convenient enough, "MZ" is a valid compressed RISC-V
>> instruction. This means that a non-UEFI loader can simply jump to
>> "code0" in the Image header [1] and start executing.
>>
>> The Image specification [1] says the following about "code0":
>> | This header is also reused to support EFI stub for RISC-V. EFI
>> | specification needs PE/COFF image header in the beginning of the
>> | kernel image in order to load it as an EFI application. In order
>> | to support EFI stub, code0 is replaced with "MZ" magic string
>> | and res3(at offset 0x3c) points to the rest of the PE/COFF
>> | header.
>>
>> "MZ" is not a valid instruction for implementations without the C
>> extension.
>>
>> A non-UEFI loader, loading a non-C UEFI image have the following
>> options:
>> 1. Trap and emulate "code0"
>> 2. Avoid "code0" if it is "MZ", and have the kernel entry at
>> "code1".
>>
>> Replace the compressed instruction with a hex code variant, that works
>> for CONFIG_RISCV_ISA_C=n builds. Further, this change also make sure
>> that the "code0" instruction is 32b aligned.
>
> IMO this breaks the Image format on non-C systems: they'll jump straight
> to the start (as there's no symbols left or anything) and then just fail
> to execute the C instructions.

That's right! My idea, which was probably really poor articulated, was
to add trap/emulate in OpenSBI/non-C for the MZ instructions.

> We could amend the Image format to require bootloaders to handle this
> (ie, look at the first instructions and emulate them if there's no C),
> that would require some bootloader updates but given this doesn't work
> maybe that's fine.
>
> We could also just stop producing Image+PE binaries on non-C systems
> (ie, just produce PE).

Yes, or make the Image loader in Qemu et al more robust, by e.g.
checking code0/code1. The Image spec does say that code0 is 'MZ' for
PE+Image builds, and for non-C capable systems one could argue that it
should be checked/skipped by the loader.

Does the arguments above make sense for you? I.e.:
1. Merge this patch
2a. Add the trap/emu to OpenSBI
(2b. Improve the image loader in Qemu?)


Björn