2021-11-10 11:02:09

by Evgeniy Baskov

[permalink] [raw]
Subject: [PATCH RFC 0/5] Handle UEFI NX-restricted page tables

Note, that this patch series is RFC, since it is yet untested
and possibly incompatible with AMD SEV and related extensions.

The UEFI specification states that certain memory regions may
not have every permission, i.e. may not be writable or executable.

Furthermore there exist some implementations (at least on i386/x86_64)
that restrict execution of memory regions expected by the kernel to
be executable. E.g. first megabyte of address space, where trampoline
for switching between 4/5 level paging is placed and memory regions,
allocated as loader data.

This patch series allows Linux kernel to boot on such UEFI
implementations on i386 and x86_64.

The simplest way to achieve that on i386 is to disable paging
before jumping to potentially relocated code.

x86_64, on the other hand, does not allow disabling paging so it
is required to build temporary page tables containing memory regions
required for Linux kernel to boot with appropriate access permissions.

Baskov Evgeniy (5):
Docs: document notemppt option
efi: Add option for handling efi memory protection
libstub: build temporary page table without NX-bit
efi/x86_64: set page table if provided by libstub
efi/x86: Disable paging when booting via efistub

Documentation/admin-guide/kernel-parameters.txt | 7
arch/x86/boot/compressed/head_32.S | 12 +
arch/x86/boot/compressed/head_64.S | 12 +
drivers/firmware/efi/Kconfig | 17 ++
drivers/firmware/efi/libstub/Makefile | 2
drivers/firmware/efi/libstub/efi-stub-helper.c | 3
drivers/firmware/efi/libstub/efistub.h | 10 +
drivers/firmware/efi/libstub/temp-pgtable.c | 190 ++++++++++++++++++++++++
drivers/firmware/efi/libstub/x86-stub.c | 8 -
9 files changed, 258 insertions(+), 3 deletions(-)


2021-11-10 11:16:17

by Ard Biesheuvel

[permalink] [raw]
Subject: Re: [PATCH RFC 0/5] Handle UEFI NX-restricted page tables

On Wed, 10 Nov 2021 at 11:56, Baskov Evgeniy <[email protected]> wrote:
>
> Note, that this patch series is RFC, since it is yet untested
> and possibly incompatible with AMD SEV and related extensions.
>
> The UEFI specification states that certain memory regions may
> not have every permission, i.e. may not be writable or executable.
>
> Furthermore there exist some implementations (at least on i386/x86_64)
> that restrict execution of memory regions expected by the kernel to
> be executable. E.g. first megabyte of address space, where trampoline
> for switching between 4/5 level paging is placed and memory regions,
> allocated as loader data.
>
> This patch series allows Linux kernel to boot on such UEFI
> implementations on i386 and x86_64.
>
> The simplest way to achieve that on i386 is to disable paging
> before jumping to potentially relocated code.
>
> x86_64, on the other hand, does not allow disabling paging so it
> is required to build temporary page tables containing memory regions
> required for Linux kernel to boot with appropriate access permissions.
>

Hello Baskov,

To be honest, I am truly not a fan of this approach.

Which systems is this issue occurring on? Did you try something like
the below to allocate executable memory explicitly?


diff --git a/drivers/firmware/efi/libstub/relocate.c
b/drivers/firmware/efi/libstub/relocate.c
index 8ee9eb2b9039..b73012a7bcdc 100644
--- a/drivers/firmware/efi/libstub/relocate.c
+++ b/drivers/firmware/efi/libstub/relocate.c
@@ -80,7 +80,7 @@ efi_status_t efi_low_alloc_above(unsigned long size,
unsigned long align,
continue;

status = efi_bs_call(allocate_pages, EFI_ALLOCATE_ADDRESS,
- EFI_LOADER_DATA, nr_pages, &start);
+ EFI_LOADER_CODE, nr_pages, &start);
if (status == EFI_SUCCESS) {
*addr = start;
break;
@@ -146,7 +146,7 @@ efi_status_t efi_relocate_kernel(unsigned long *image_addr,
*/
nr_pages = round_up(alloc_size, EFI_ALLOC_ALIGN) / EFI_PAGE_SIZE;
status = efi_bs_call(allocate_pages, EFI_ALLOCATE_ADDRESS,
- EFI_LOADER_DATA, nr_pages, &efi_addr);
+ EFI_LOADER_CODE, nr_pages, &efi_addr);
new_addr = efi_addr;
/*
* If preferred address allocation failed allocate as low as



--
Ard.

2021-11-25 07:38:44

by Evgeniy Baskov

[permalink] [raw]
Subject: Re: [PATCH RFC 0/5] Handle UEFI NX-restricted page tables


Hello,

I apologize for delayed reply.

The system in question runs in a firmware that tries to achieve
complete W^X protection. Both loader code and loader data
are not executable, so the suggested approach does not work.
If you would like to test this, you can set
the PcdDxeNxMemoryProtectionPolicy in any firmware available to you.

As a justification for the approach itself, I can use the fact that
UEFI specification says nothing about the ability to execute
self-allocated EfiLoaderCode or any other types besides the areas
allocated by the firmware for UEFI Images. In fact, Table 7-5
explicitly states that EfiLoaderCode is used for:

> The code portions of a loaded UEFI application.

While we do not think it should be interpreted as one cannot allocate
such areas at all, it is clear that there are no guarantees about the
other use cases and permissions of the allocations of this type besides
those stated by 2.3.4:

> Paging mode is enabled and any memory space defined by the UEFI memory
> map is identity mapped (virtual address equals physical address),
> although the attributes of certain regions may not have all read,
> write, and execute attributes or be unmarked for purposes of platform
> protection.

Long story short, the kernel is not allowed to allocate such areas and
assume they are executable, it should do paging itself, and the changes
here address that. For the reference, Windows adheres to this convention
and works fine on the target system.

Thanks,
Baskov Evgeniy

On 2021-11-10 14:11, Ard Biesheuvel wrote:
> On Wed, 10 Nov 2021 at 11:56, Baskov Evgeniy <[email protected]> wrote:
>>
>> Note, that this patch series is RFC, since it is yet untested
>> and possibly incompatible with AMD SEV and related extensions.
>>
>> The UEFI specification states that certain memory regions may
>> not have every permission, i.e. may not be writable or executable.
>>
>> Furthermore there exist some implementations (at least on i386/x86_64)
>> that restrict execution of memory regions expected by the kernel to
>> be executable. E.g. first megabyte of address space, where trampoline
>> for switching between 4/5 level paging is placed and memory regions,
>> allocated as loader data.
>>
>> This patch series allows Linux kernel to boot on such UEFI
>> implementations on i386 and x86_64.
>>
>> The simplest way to achieve that on i386 is to disable paging
>> before jumping to potentially relocated code.
>>
>> x86_64, on the other hand, does not allow disabling paging so it
>> is required to build temporary page tables containing memory regions
>> required for Linux kernel to boot with appropriate access permissions.
>>
>
> Hello Baskov,
>
> To be honest, I am truly not a fan of this approach.
>
> Which systems is this issue occurring on? Did you try something like
> the below to allocate executable memory explicitly?
>
>
> diff --git a/drivers/firmware/efi/libstub/relocate.c
> b/drivers/firmware/efi/libstub/relocate.c
> index 8ee9eb2b9039..b73012a7bcdc 100644
> --- a/drivers/firmware/efi/libstub/relocate.c
> +++ b/drivers/firmware/efi/libstub/relocate.c
> @@ -80,7 +80,7 @@ efi_status_t efi_low_alloc_above(unsigned long size,
> unsigned long align,
> continue;
>
> status = efi_bs_call(allocate_pages,
> EFI_ALLOCATE_ADDRESS,
> - EFI_LOADER_DATA, nr_pages,
> &start);
> + EFI_LOADER_CODE, nr_pages,
> &start);
> if (status == EFI_SUCCESS) {
> *addr = start;
> break;
> @@ -146,7 +146,7 @@ efi_status_t efi_relocate_kernel(unsigned long
> *image_addr,
> */
> nr_pages = round_up(alloc_size, EFI_ALLOC_ALIGN) /
> EFI_PAGE_SIZE;
> status = efi_bs_call(allocate_pages, EFI_ALLOCATE_ADDRESS,
> - EFI_LOADER_DATA, nr_pages, &efi_addr);
> + EFI_LOADER_CODE, nr_pages, &efi_addr);
> new_addr = efi_addr;
> /*
> * If preferred address allocation failed allocate as low as



2021-12-16 17:30:37

by Ard Biesheuvel

[permalink] [raw]
Subject: Re: [PATCH RFC 0/5] Handle UEFI NX-restricted page tables

On Thu, 25 Nov 2021 at 08:36, <[email protected]> wrote:
>
>
> Hello,
>
> I apologize for delayed reply.
>

No worries.


> The system in question runs in a firmware that tries to achieve
> complete W^X protection. Both loader code and loader data
> are not executable, so the suggested approach does not work.
> If you would like to test this, you can set
> the PcdDxeNxMemoryProtectionPolicy in any firmware available to you.
>

The PCD in question has the following note:

# NOTE: User must NOT set NX protection for EfiLoaderCode /
EfiBootServicesCode / EfiRuntimeServicesCode. <BR>

Any idea whether this is easily reproducible with OVMF? Restricting
the loader from creating executable regions seems rather daft, so we
should at least report this, and preferably fix it in EDK2.

> As a justification for the approach itself, I can use the fact that
> UEFI specification says nothing about the ability to execute
> self-allocated EfiLoaderCode or any other types besides the areas
> allocated by the firmware for UEFI Images. In fact, Table 7-5
> explicitly states that EfiLoaderCode is used for:
>
> > The code portions of a loaded UEFI application.
>

Fair enough. So EfiLoaderCode is not the right type.

> While we do not think it should be interpreted as one cannot allocate
> such areas at all, it is clear that there are no guarantees about the
> other use cases and permissions of the allocations of this type besides
> those stated by 2.3.4:
>
> > Paging mode is enabled and any memory space defined by the UEFI memory
> > map is identity mapped (virtual address equals physical address),
> > although the attributes of certain regions may not have all read,
> > write, and execute attributes or be unmarked for purposes of platform
> > protection.
>
> Long story short, the kernel is not allowed to allocate such areas and
> assume they are executable,

OK

> it should do paging itself,

Now you're going too fast. One does not necessarily imply the other.

> and the changes
> here address that. For the reference, Windows adheres to this convention
> and works fine on the target system.
>

Given that this issue is specific to EDK2 based firmwares, would it be
possible to fix this using DXE services instead? In particular, could
we use

gDS->SetMemorySpaceAttributes()

to ensure that the regions have executable permissions?





>
> On 2021-11-10 14:11, Ard Biesheuvel wrote:
> > On Wed, 10 Nov 2021 at 11:56, Baskov Evgeniy <[email protected]> wrote:
> >>
> >> Note, that this patch series is RFC, since it is yet untested
> >> and possibly incompatible with AMD SEV and related extensions.
> >>
> >> The UEFI specification states that certain memory regions may
> >> not have every permission, i.e. may not be writable or executable.
> >>
> >> Furthermore there exist some implementations (at least on i386/x86_64)
> >> that restrict execution of memory regions expected by the kernel to
> >> be executable. E.g. first megabyte of address space, where trampoline
> >> for switching between 4/5 level paging is placed and memory regions,
> >> allocated as loader data.
> >>
> >> This patch series allows Linux kernel to boot on such UEFI
> >> implementations on i386 and x86_64.
> >>
> >> The simplest way to achieve that on i386 is to disable paging
> >> before jumping to potentially relocated code.
> >>
> >> x86_64, on the other hand, does not allow disabling paging so it
> >> is required to build temporary page tables containing memory regions
> >> required for Linux kernel to boot with appropriate access permissions.
> >>
> >
> > Hello Baskov,
> >
> > To be honest, I am truly not a fan of this approach.
> >
> > Which systems is this issue occurring on? Did you try something like
> > the below to allocate executable memory explicitly?
> >
> >
> > diff --git a/drivers/firmware/efi/libstub/relocate.c
> > b/drivers/firmware/efi/libstub/relocate.c
> > index 8ee9eb2b9039..b73012a7bcdc 100644
> > --- a/drivers/firmware/efi/libstub/relocate.c
> > +++ b/drivers/firmware/efi/libstub/relocate.c
> > @@ -80,7 +80,7 @@ efi_status_t efi_low_alloc_above(unsigned long size,
> > unsigned long align,
> > continue;
> >
> > status = efi_bs_call(allocate_pages,
> > EFI_ALLOCATE_ADDRESS,
> > - EFI_LOADER_DATA, nr_pages,
> > &start);
> > + EFI_LOADER_CODE, nr_pages,
> > &start);
> > if (status == EFI_SUCCESS) {
> > *addr = start;
> > break;
> > @@ -146,7 +146,7 @@ efi_status_t efi_relocate_kernel(unsigned long
> > *image_addr,
> > */
> > nr_pages = round_up(alloc_size, EFI_ALLOC_ALIGN) /
> > EFI_PAGE_SIZE;
> > status = efi_bs_call(allocate_pages, EFI_ALLOCATE_ADDRESS,
> > - EFI_LOADER_DATA, nr_pages, &efi_addr);
> > + EFI_LOADER_CODE, nr_pages, &efi_addr);
> > new_addr = efi_addr;
> > /*
> > * If preferred address allocation failed allocate as low as
>
>