2020-01-03 11:42:56

by Ard Biesheuvel

[permalink] [raw]
Subject: [GIT PULL 00/20] More EFI updates for v5.6

Ingo, Thomas,

This is the second batch of EFI updates for v5.6. Two things are still
under discussion, so I'll probably have a few more changes for this
cycle in a week or so.

The following changes since commit 0679715e714345d273c0e1eb78078535ffc4b2a1:

efi/libstub/x86: Avoid globals to store context during mixed mode calls (2019-12-25 10:49:26 +0100)

are available in the Git repository at:

git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi.git tags/efi-next

for you to fetch changes up to d95e4feae5368a91775c4597a8f298ba84f31535:

efi/x86: avoid RWX mappings for all of DRAM (2020-01-03 11:46:15 +0100)

----------------------------------------------------------------
Second batch of EFI updates for v5.6:
- Some followup fixes for the EFI stub changes that have been queued up
already.
- Overhaul of the x86 EFI boot/runtime code, to peel off layers of pointer
casting and type mangling via variadic macros and asm wrappers that made
the code fragile and ugly.
- Increase robustness for mixed mode code, by using argmaps to annotate and
translate function prototypes that are not mixed mode safe. (Arvind)
- Add the ability to disable DMA at the root port level in the EFI stub, to
avoid booting into the kernel proper with IOMMUs in pass through and DMA
enabled (suggested by Matthew)
- Get rid of RWX mappings in the EFI memory map, where possible.

----------------------------------------------------------------
Ard Biesheuvel (17):
efi/libstub: fix boot argument handling in mixed mode entry code
efi/libstub/x86: force 'hidden' visibility for extern declarations
efi/x86: re-disable RT services for 32-bit kernels running on 64-bit EFI
efi/x86: map the entire EFI vendor string before copying it
efi/x86: avoid redundant cast of EFI firmware service pointer
efi/x86: split off some old memmap handling into separate routines
efi/x86: split SetVirtualAddresMap() wrappers into 32 and 64 bit versions
efi/x86: simplify i386 efi_call_phys() firmware call wrapper
efi/x86: simplify 64-bit EFI firmware call wrapper
efi/x86: simplify mixed mode call wrapper
efi/x86: drop two near identical versions of efi_runtime_init()
efi/x86: clean up efi_systab_init() routine for legibility
efi/x86: don't panic or BUG() on non-critical error conditions
efi/x86: remove unreachable code in kexec_enter_virtual_mode()
x86/mm: fix NX bit clearing issue in kernel_map_pages_in_pgd
efi/x86: don't map the entire kernel text RW for mixed mode
efi/x86: avoid RWX mappings for all of DRAM

Arvind Sankar (2):
efi/x86: Check number of arguments to variadic functions
efi/x86: Allow translating 64-bit arguments for mixed mode calls

Matthew Garrett (1):
efi: Allow disabling PCI busmastering on bridges during boot

Documentation/admin-guide/kernel-parameters.txt | 7 +-
arch/x86/boot/compressed/eboot.c | 18 +-
arch/x86/boot/compressed/efi_thunk_64.S | 4 +-
arch/x86/boot/compressed/head_64.S | 17 +-
arch/x86/include/asm/efi.h | 169 ++++++++---
arch/x86/mm/pageattr.c | 8 +-
arch/x86/platform/efi/Makefile | 1 -
arch/x86/platform/efi/efi.c | 354 ++++++++----------------
arch/x86/platform/efi/efi_32.c | 22 +-
arch/x86/platform/efi/efi_64.c | 157 +++++++----
arch/x86/platform/efi/efi_stub_32.S | 109 ++------
arch/x86/platform/efi/efi_stub_64.S | 43 +--
arch/x86/platform/efi/efi_thunk_64.S | 121 ++------
arch/x86/platform/uv/bios_uv.c | 7 +-
drivers/firmware/efi/Kconfig | 22 ++
drivers/firmware/efi/libstub/Makefile | 2 +-
drivers/firmware/efi/libstub/efi-stub-helper.c | 20 +-
drivers/firmware/efi/libstub/pci.c | 114 ++++++++
include/linux/efi.h | 29 +-
19 files changed, 597 insertions(+), 627 deletions(-)
create mode 100644 drivers/firmware/efi/libstub/pci.c


2020-01-03 11:43:01

by Ard Biesheuvel

[permalink] [raw]
Subject: [PATCH 17/20] efi: Allow disabling PCI busmastering on bridges during boot

From: Matthew Garrett <[email protected]>

Add an option to disable the busmaster bit in the control register on
all PCI bridges before calling ExitBootServices() and passing control
to the runtime kernel. System firmware may configure the IOMMU to prevent
malicious PCI devices from being able to attack the OS via DMA. However,
since firmware can't guarantee that the OS is IOMMU-aware, it will tear
down IOMMU configuration when ExitBootServices() is called. This leaves
a window between where a hostile device could still cause damage before
Linux configures the IOMMU again.

If CONFIG_EFI_DISABLE_PCI_DMA is enabled or "efi=disable_early_pci_dma"
is passed on the command line, the EFI stub will clear the busmaster bit
on all PCI bridges before ExitBootServices() is called. This will
prevent any malicious PCI devices from being able to perform DMA until
the kernel reenables busmastering after configuring the IOMMU.

This option may cause failures with some poorly behaved hardware and
should not be enabled without testing. The kernel commandline options
"efi=disable_early_pci_dma" or "efi=no_disable_early_pci_dma" may be
used to override the default. Note that PCI devices downstream from PCI
bridges are disconnected from their drivers first, using the UEFI
driver model API, so that DMA can be disabled safely at the bridge
level.

Cc: Andy Lutomirski <[email protected]>
Cc: Arvind Sankar <[email protected]>
Co-developed-by: Matthew Garrett <[email protected]>
Signed-off-by: Matthew Garrett <[email protected]>
[ardb: disconnect PCI I/O handles first, as suggested by Arvind]
Signed-off-by: Ard Biesheuvel <[email protected]>
---
.../admin-guide/kernel-parameters.txt | 7 +-
arch/x86/include/asm/efi.h | 5 +
drivers/firmware/efi/Kconfig | 22 ++++
drivers/firmware/efi/libstub/Makefile | 2 +-
.../firmware/efi/libstub/efi-stub-helper.c | 15 +++
drivers/firmware/efi/libstub/pci.c | 114 ++++++++++++++++++
include/linux/efi.h | 6 +-
7 files changed, 168 insertions(+), 3 deletions(-)
create mode 100644 drivers/firmware/efi/libstub/pci.c

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index ade4e6ec23e0..994632ae48de 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -1165,7 +1165,8 @@

efi= [EFI]
Format: { "old_map", "nochunk", "noruntime", "debug",
- "nosoftreserve" }
+ "nosoftreserve", "disable_early_pci_dma",
+ "no_disable_early_pci_dma" }
old_map [X86-64]: switch to the old ioremap-based EFI
runtime services mapping. 32-bit still uses this one by
default.
@@ -1180,6 +1181,10 @@
claim. Specify efi=nosoftreserve to disable this
reservation and treat the memory by its base type
(i.e. EFI_CONVENTIONAL_MEMORY / "System RAM").
+ disable_early_pci_dma: Disable the busmaster bit on all
+ PCI bridges while in the EFI boot stub
+ no_disable_early_pci_dma: Leave the busmaster bit set
+ on all PCI bridges while in the EFI boot stub

efi_no_storage_paranoia [EFI; X86]
Using this parameter you can use more than 50% of
diff --git a/arch/x86/include/asm/efi.h b/arch/x86/include/asm/efi.h
index 166f0386719e..383f7a0fc170 100644
--- a/arch/x86/include/asm/efi.h
+++ b/arch/x86/include/asm/efi.h
@@ -283,6 +283,11 @@ static inline void *efi64_zero_upper(void *p)
#define __efi64_argmap_locate_protocol(protocol, reg, interface) \
((protocol), (reg), efi64_zero_upper(interface))

+/* PCI I/O */
+#define __efi64_argmap_get_location(protocol, seg, bus, dev, func) \
+ ((protocol), efi64_zero_upper(seg), efi64_zero_upper(bus), \
+ efi64_zero_upper(dev), efi64_zero_upper(func))
+
/*
* The macros below handle the plumbing for the argument mapping. To add a
* mapping for a specific EFI method, simply define a macro
diff --git a/drivers/firmware/efi/Kconfig b/drivers/firmware/efi/Kconfig
index bcc378c19ebe..ecc83e2f032c 100644
--- a/drivers/firmware/efi/Kconfig
+++ b/drivers/firmware/efi/Kconfig
@@ -215,6 +215,28 @@ config EFI_RCI2_TABLE

Say Y here for Dell EMC PowerEdge systems.

+config EFI_DISABLE_PCI_DMA
+ bool "Clear Busmaster bit on PCI bridges during ExitBootServices()"
+ help
+ Disable the busmaster bit in the control register on all PCI bridges
+ while calling ExitBootServices() and passing control to the runtime
+ kernel. System firmware may configure the IOMMU to prevent malicious
+ PCI devices from being able to attack the OS via DMA. However, since
+ firmware can't guarantee that the OS is IOMMU-aware, it will tear
+ down IOMMU configuration when ExitBootServices() is called. This
+ leaves a window between where a hostile device could still cause
+ damage before Linux configures the IOMMU again.
+
+ If you say Y here, the EFI stub will clear the busmaster bit on all
+ PCI bridges before ExitBootServices() is called. This will prevent
+ any malicious PCI devices from being able to perform DMA until the
+ kernel reenables busmastering after configuring the IOMMU.
+
+ This option will cause failures with some poorly behaved hardware
+ and should not be enabled without testing. The kernel commandline
+ options "efi=disable_early_pci_dma" or "efi=no_disable_early_pci_dma"
+ may be used to override this option.
+
endmenu

config UEFI_CPER
diff --git a/drivers/firmware/efi/libstub/Makefile b/drivers/firmware/efi/libstub/Makefile
index c35f893897e1..98a81576213d 100644
--- a/drivers/firmware/efi/libstub/Makefile
+++ b/drivers/firmware/efi/libstub/Makefile
@@ -39,7 +39,7 @@ OBJECT_FILES_NON_STANDARD := y
KCOV_INSTRUMENT := n

lib-y := efi-stub-helper.o gop.o secureboot.o tpm.o \
- random.o
+ random.o pci.o

# include the stub's generic dependencies from lib/ when building for ARM/arm64
arm-deps-y := fdt_rw.c fdt_ro.c fdt_wip.c fdt.c fdt_empty_tree.c fdt_sw.c
diff --git a/drivers/firmware/efi/libstub/efi-stub-helper.c b/drivers/firmware/efi/libstub/efi-stub-helper.c
index fcc45ee94e02..74ddfb496140 100644
--- a/drivers/firmware/efi/libstub/efi-stub-helper.c
+++ b/drivers/firmware/efi/libstub/efi-stub-helper.c
@@ -33,6 +33,8 @@ static bool __efistub_global efi_nokaslr;
static bool __efistub_global efi_quiet;
static bool __efistub_global efi_novamap;
static bool __efistub_global efi_nosoftreserve;
+static bool __efistub_global efi_disable_pci_dma =
+ IS_ENABLED(CONFIG_EFI_DISABLE_PCI_DMA);

bool __pure nokaslr(void)
{
@@ -490,6 +492,16 @@ efi_status_t efi_parse_options(char const *cmdline)
efi_nosoftreserve = true;
}

+ if (!strncmp(str, "disable_early_pci_dma", 21)) {
+ str += strlen("disable_early_pci_dma");
+ efi_disable_pci_dma = true;
+ }
+
+ if (!strncmp(str, "no_disable_early_pci_dma", 24)) {
+ str += strlen("no_disable_early_pci_dma");
+ efi_disable_pci_dma = false;
+ }
+
/* Group words together, delimited by "," */
while (*str && *str != ' ' && *str != ',')
str++;
@@ -876,6 +888,9 @@ efi_status_t efi_exit_boot_services(void *handle,
if (status != EFI_SUCCESS)
goto free_map;

+ if (efi_disable_pci_dma)
+ efi_pci_disable_bridge_busmaster();
+
status = efi_bs_call(exit_boot_services, handle, *map->key_ptr);

if (status == EFI_INVALID_PARAMETER) {
diff --git a/drivers/firmware/efi/libstub/pci.c b/drivers/firmware/efi/libstub/pci.c
new file mode 100644
index 000000000000..b025e59b94df
--- /dev/null
+++ b/drivers/firmware/efi/libstub/pci.c
@@ -0,0 +1,114 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * PCI-related functions used by the EFI stub on multiple
+ * architectures.
+ *
+ * Copyright 2019 Google, LLC
+ */
+
+#include <linux/efi.h>
+#include <linux/pci.h>
+
+#include <asm/efi.h>
+
+#include "efistub.h"
+
+void efi_pci_disable_bridge_busmaster(void)
+{
+ efi_guid_t pci_proto = EFI_PCI_IO_PROTOCOL_GUID;
+ unsigned long pci_handle_size = 0;
+ efi_handle_t *pci_handle = NULL;
+ efi_handle_t handle;
+ efi_status_t status;
+ u16 class, command;
+ int i;
+
+ status = efi_bs_call(locate_handle, EFI_LOCATE_BY_PROTOCOL, &pci_proto,
+ NULL, &pci_handle_size, NULL);
+
+ if (status != EFI_BUFFER_TOO_SMALL) {
+ if (status != EFI_SUCCESS && status != EFI_NOT_FOUND)
+ pr_efi_err("Failed to locate PCI I/O handles'\n");
+ return;
+ }
+
+ status = efi_bs_call(allocate_pool, EFI_LOADER_DATA, pci_handle_size,
+ (void **)&pci_handle);
+ if (status != EFI_SUCCESS) {
+ pr_efi_err("Failed to allocate memory for 'pci_handle'\n");
+ return;
+ }
+
+ status = efi_bs_call(locate_handle, EFI_LOCATE_BY_PROTOCOL, &pci_proto,
+ NULL, &pci_handle_size, pci_handle);
+ if (status != EFI_SUCCESS) {
+ pr_efi_err("Failed to locate PCI I/O handles'\n");
+ goto free_handle;
+ }
+
+ for_each_efi_handle(handle, pci_handle, pci_handle_size, i) {
+ efi_pci_io_protocol_t *pci;
+ unsigned long segment_nr, bus_nr, device_nr, func_nr;
+
+ status = efi_bs_call(handle_protocol, handle, &pci_proto,
+ (void **)&pci);
+ if (status != EFI_SUCCESS)
+ continue;
+
+ /*
+ * Disregard devices living on bus 0 - these are not behind a
+ * bridge so no point in disconnecting them from their drivers.
+ */
+ status = efi_call_proto(pci, get_location, &segment_nr, &bus_nr,
+ &device_nr, &func_nr);
+ if (status != EFI_SUCCESS || bus_nr == 0)
+ continue;
+
+ /*
+ * Don't disconnect VGA controllers so we don't risk losing
+ * access to the framebuffer. Drivers for true PCIe graphics
+ * controllers that are behind a PCIe root port do not use
+ * DMA to implement the GOP framebuffer anyway [although they
+ * may use it in their implentation of Gop->Blt()], and so
+ * disabling DMA in the PCI bridge should not interfere with
+ * normal operation of the device.
+ */
+ status = efi_call_proto(pci, pci.read, EfiPciIoWidthUint16,
+ PCI_CLASS_DEVICE, 1, &class);
+ if (status != EFI_SUCCESS || class == PCI_CLASS_DISPLAY_VGA)
+ continue;
+
+ /* Disconnect this handle from all its drivers */
+ efi_bs_call(disconnect_controller, handle, NULL, NULL);
+ }
+
+ for_each_efi_handle(handle, pci_handle, pci_handle_size, i) {
+ efi_pci_io_protocol_t *pci;
+
+ status = efi_bs_call(handle_protocol, handle, &pci_proto,
+ (void **)&pci);
+ if (status != EFI_SUCCESS || !pci)
+ continue;
+
+ status = efi_call_proto(pci, pci.read, EfiPciIoWidthUint16,
+ PCI_CLASS_DEVICE, 1, &class);
+
+ if (status != EFI_SUCCESS || class != PCI_CLASS_BRIDGE_PCI)
+ continue;
+
+ /* Disable busmastering */
+ status = efi_call_proto(pci, pci.read, EfiPciIoWidthUint16,
+ PCI_COMMAND, 1, &command);
+ if (status != EFI_SUCCESS || !(command & PCI_COMMAND_MASTER))
+ continue;
+
+ command &= ~PCI_COMMAND_MASTER;
+ status = efi_call_proto(pci, pci.write, EfiPciIoWidthUint16,
+ PCI_COMMAND, 1, &command);
+ if (status != EFI_SUCCESS)
+ pr_efi_err("Failed to disable PCI busmastering\n");
+ }
+
+free_handle:
+ efi_bs_call(free_pool, pci_handle);
+}
diff --git a/include/linux/efi.h b/include/linux/efi.h
index ee68ea6f85ff..7e8e25b1d11c 100644
--- a/include/linux/efi.h
+++ b/include/linux/efi.h
@@ -319,7 +319,9 @@ typedef union {
void *stall;
void *set_watchdog_timer;
void *connect_controller;
- void *disconnect_controller;
+ efi_status_t (__efiapi *disconnect_controller)(efi_handle_t,
+ efi_handle_t,
+ efi_handle_t);
void *open_protocol;
void *close_protocol;
void *open_protocol_information;
@@ -1692,4 +1694,6 @@ struct linux_efi_memreserve {
#define EFI_MEMRESERVE_COUNT(size) (((size) - sizeof(struct linux_efi_memreserve)) \
/ sizeof(((struct linux_efi_memreserve *)0)->entry[0]))

+void efi_pci_disable_bridge_busmaster(void);
+
#endif /* _LINUX_EFI_H */
--
2.20.1

2020-01-03 11:43:10

by Ard Biesheuvel

[permalink] [raw]
Subject: [PATCH 19/20] efi/x86: don't map the entire kernel text RW for mixed mode

The mixed mode thunking routine requires a part of it to be
mapped 1:1, and for this reason, we currently map the entire
kernel .text read/write in the EFI page tables, which is bad.

In fact, the kernel_map_pages_in_pgd() invocation that installs
this mapping is entirely redundant, since all of DRAM is already
1:1 mapped read/write in the EFI page tables when we reach this
point, which means that .rodata is mapped read-write as well.

So let's remap both .text and .rodata read-only in the EFI
page tables.

Signed-off-by: Ard Biesheuvel <[email protected]>
---
arch/x86/platform/efi/efi_64.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/platform/efi/efi_64.c b/arch/x86/platform/efi/efi_64.c
index c13fa2150976..6ec58ff60b56 100644
--- a/arch/x86/platform/efi/efi_64.c
+++ b/arch/x86/platform/efi/efi_64.c
@@ -391,11 +391,11 @@ int __init efi_setup_page_tables(unsigned long pa_memmap, unsigned num_pages)

efi_scratch.phys_stack = page_to_phys(page + 1); /* stack grows down */

- npages = (_etext - _text) >> PAGE_SHIFT;
+ npages = (__end_rodata_aligned - _text) >> PAGE_SHIFT;
text = __pa(_text);
pfn = text >> PAGE_SHIFT;

- pf = _PAGE_RW | _PAGE_ENC;
+ pf = _PAGE_ENC;
if (kernel_map_pages_in_pgd(pgd, pfn, text, npages, pf)) {
pr_err("Failed to map kernel text 1:1\n");
return 1;
--
2.20.1

2020-01-03 11:43:18

by Ard Biesheuvel

[permalink] [raw]
Subject: [PATCH 18/20] x86/mm: fix NX bit clearing issue in kernel_map_pages_in_pgd

Commit 15f003d20782 ("x86/mm/pat: Don't implicitly allow _PAGE_RW in
kernel_map_pages_in_pgd()") modified kernel_map_pages_in_pgd() to
manage writable permissions of memory mappings in the EFI page
table in a different way, but in the process, it removed the
ability to clear NX attributes from read-only mappings, by
clobbering the clear mask if _PAGE_RW is not being requested.

Failure to remove the NX attribute from read-only mappings is
unlikely to be a security issue, but it does prevent us from
tightening the permissions in the EFI page tables going forward,
so let's fix it now.

Fixes: 15f003d20782 ("x86/mm/pat: Don't implicitly allow _PAGE_RW in kernel_map_pages_in_pgd()
Signed-off-by: Ard Biesheuvel <[email protected]>
---
arch/x86/mm/pageattr.c | 8 +-------
1 file changed, 1 insertion(+), 7 deletions(-)

diff --git a/arch/x86/mm/pageattr.c b/arch/x86/mm/pageattr.c
index 1b99ad05b117..f42780ba0893 100644
--- a/arch/x86/mm/pageattr.c
+++ b/arch/x86/mm/pageattr.c
@@ -2215,7 +2215,7 @@ int __init kernel_map_pages_in_pgd(pgd_t *pgd, u64 pfn, unsigned long address,
.pgd = pgd,
.numpages = numpages,
.mask_set = __pgprot(0),
- .mask_clr = __pgprot(0),
+ .mask_clr = __pgprot(~page_flags & (_PAGE_NX|_PAGE_RW)),
.flags = 0,
};

@@ -2224,12 +2224,6 @@ int __init kernel_map_pages_in_pgd(pgd_t *pgd, u64 pfn, unsigned long address,
if (!(__supported_pte_mask & _PAGE_NX))
goto out;

- if (!(page_flags & _PAGE_NX))
- cpa.mask_clr = __pgprot(_PAGE_NX);
-
- if (!(page_flags & _PAGE_RW))
- cpa.mask_clr = __pgprot(_PAGE_RW);
-
if (!(page_flags & _PAGE_ENC))
cpa.mask_clr = pgprot_encrypted(cpa.mask_clr);

--
2.20.1

2020-01-03 11:43:33

by Ard Biesheuvel

[permalink] [raw]
Subject: [PATCH 10/20] efi/x86: simplify mixed mode call wrapper

Calling 32-bit EFI runtime services from a 64-bit OS involves
switching back to the flat mapping with a stack carved out of
memory that is 32-bit addressable.

There is no need to actually execute the 64-bit part of this
routine from the flat mapping as well, as long as the entry
and return address fit in 32 bits. There is also no need to
preserve part of the calling context in global variables: we
can simply push the old stack pointer value to the new stack,
and keep the return address from the code32 section in EBX.

While at it, move the conditional check whether to invoke
the mixed mode version of SetVirtualAddressMap() into the
64-bit implementation of the wrapper routine.

Signed-off-by: Ard Biesheuvel <[email protected]>
---
arch/x86/include/asm/efi.h | 6 --
arch/x86/platform/efi/efi.c | 19 +----
arch/x86/platform/efi/efi_64.c | 73 ++++++++++------
arch/x86/platform/efi/efi_thunk_64.S | 121 +++++----------------------
4 files changed, 71 insertions(+), 148 deletions(-)

diff --git a/arch/x86/include/asm/efi.h b/arch/x86/include/asm/efi.h
index cb08035b89a0..e7e9c6e057f9 100644
--- a/arch/x86/include/asm/efi.h
+++ b/arch/x86/include/asm/efi.h
@@ -164,12 +164,6 @@ extern void parse_efi_setup(u64 phys_addr, u32 data_len);
extern void efifb_setup_from_dmi(struct screen_info *si, const char *opt);

extern void efi_thunk_runtime_setup(void);
-extern efi_status_t efi_thunk_set_virtual_address_map(
- void *phys_set_virtual_address_map,
- unsigned long memory_map_size,
- unsigned long descriptor_size,
- u32 descriptor_version,
- efi_memory_desc_t *virtual_map);
efi_status_t efi_set_virtual_address_map(unsigned long memory_map_size,
unsigned long descriptor_size,
u32 descriptor_version,
diff --git a/arch/x86/platform/efi/efi.c b/arch/x86/platform/efi/efi.c
index 50f8123e658a..e4d3afac7be3 100644
--- a/arch/x86/platform/efi/efi.c
+++ b/arch/x86/platform/efi/efi.c
@@ -1015,21 +1015,10 @@ static void __init __efi_enter_virtual_mode(void)

efi_sync_low_kernel_mappings();

- if (!efi_is_mixed()) {
- status = efi_set_virtual_address_map(
- efi.memmap.desc_size * count,
- efi.memmap.desc_size,
- efi.memmap.desc_version,
- (efi_memory_desc_t *)pa);
- } else {
- status = efi_thunk_set_virtual_address_map(
- efi_phys.set_virtual_address_map,
- efi.memmap.desc_size * count,
- efi.memmap.desc_size,
- efi.memmap.desc_version,
- (efi_memory_desc_t *)pa);
- }
-
+ status = efi_set_virtual_address_map(efi.memmap.desc_size * count,
+ efi.memmap.desc_size,
+ efi.memmap.desc_version,
+ (efi_memory_desc_t *)pa);
if (status != EFI_SUCCESS) {
pr_alert("Unable to switch EFI into virtual mode (status=%lx)!\n",
status);
diff --git a/arch/x86/platform/efi/efi_64.c b/arch/x86/platform/efi/efi_64.c
index 03565dad0c4b..910e9ec03b09 100644
--- a/arch/x86/platform/efi/efi_64.c
+++ b/arch/x86/platform/efi/efi_64.c
@@ -626,61 +626,74 @@ void efi_switch_mm(struct mm_struct *mm)
switch_mm(efi_scratch.prev_mm, mm, NULL);
}

-#ifdef CONFIG_EFI_MIXED
static DEFINE_SPINLOCK(efi_runtime_lock);

-#define runtime_service32(func) \
-({ \
- u32 table = (u32)(unsigned long)efi.systab; \
- u32 *rt, *___f; \
- \
- rt = (u32 *)(table + offsetof(efi_system_table_32_t, runtime)); \
- ___f = (u32 *)(*rt + offsetof(efi_runtime_services_32_t, func)); \
- *___f; \
+/*
+ * DS and ES contain user values. We need to save them.
+ * The 32-bit EFI code needs a valid DS, ES, and SS. There's no
+ * need to save the old SS: __KERNEL_DS is always acceptable.
+ */
+#define __efi_thunk(func, ...) \
+({ \
+ efi_runtime_services_32_t *__rt; \
+ unsigned short __ds, __es; \
+ efi_status_t ____s; \
+ \
+ __rt = (void *)(unsigned long)efi.systab->mixed_mode.runtime; \
+ \
+ savesegment(ds, __ds); \
+ savesegment(es, __es); \
+ \
+ loadsegment(ss, __KERNEL_DS); \
+ loadsegment(ds, __KERNEL_DS); \
+ loadsegment(es, __KERNEL_DS); \
+ \
+ ____s = efi64_thunk(__rt->func, __VA_ARGS__); \
+ \
+ loadsegment(ds, __ds); \
+ loadsegment(es, __es); \
+ \
+ ____s ^= (____s & BIT(31)) | (____s & BIT_ULL(31)) << 32; \
+ ____s; \
})

/*
* Switch to the EFI page tables early so that we can access the 1:1
* runtime services mappings which are not mapped in any other page
- * tables. This function must be called before runtime_service32().
+ * tables.
*
* Also, disable interrupts because the IDT points to 64-bit handlers,
* which aren't going to function correctly when we switch to 32-bit.
*/
-#define efi_thunk(f, ...) \
+#define efi_thunk(func...) \
({ \
efi_status_t __s; \
- u32 __func; \
\
arch_efi_call_virt_setup(); \
\
- __func = runtime_service32(f); \
- __s = efi64_thunk(__func, __VA_ARGS__); \
+ __s = __efi_thunk(func); \
\
arch_efi_call_virt_teardown(); \
\
__s; \
})

-efi_status_t efi_thunk_set_virtual_address_map(
- void *phys_set_virtual_address_map,
- unsigned long memory_map_size,
- unsigned long descriptor_size,
- u32 descriptor_version,
- efi_memory_desc_t *virtual_map)
+static efi_status_t __init
+efi_thunk_set_virtual_address_map(unsigned long memory_map_size,
+ unsigned long descriptor_size,
+ u32 descriptor_version,
+ efi_memory_desc_t *virtual_map)
{
efi_status_t status;
unsigned long flags;
- u32 func;

efi_sync_low_kernel_mappings();
local_irq_save(flags);

efi_switch_mm(&efi_mm);

- func = (u32)(unsigned long)phys_set_virtual_address_map;
- status = efi64_thunk(func, memory_map_size, descriptor_size,
- descriptor_version, virtual_map);
+ status = __efi_thunk(set_virtual_address_map, memory_map_size,
+ descriptor_size, descriptor_version, virtual_map);

efi_switch_mm(efi_scratch.prev_mm);
local_irq_restore(flags);
@@ -983,8 +996,11 @@ efi_thunk_query_capsule_caps(efi_capsule_header_t **capsules,
return EFI_UNSUPPORTED;
}

-void efi_thunk_runtime_setup(void)
+void __init efi_thunk_runtime_setup(void)
{
+ if (!IS_ENABLED(CONFIG_EFI_MIXED))
+ return;
+
efi.get_time = efi_thunk_get_time;
efi.set_time = efi_thunk_set_time;
efi.get_wakeup_time = efi_thunk_get_wakeup_time;
@@ -1000,7 +1016,6 @@ void efi_thunk_runtime_setup(void)
efi.update_capsule = efi_thunk_update_capsule;
efi.query_capsule_caps = efi_thunk_query_capsule_caps;
}
-#endif /* CONFIG_EFI_MIXED */

efi_status_t __init efi_set_virtual_address_map(unsigned long memory_map_size,
unsigned long descriptor_size,
@@ -1011,6 +1026,12 @@ efi_status_t __init efi_set_virtual_address_map(unsigned long memory_map_size,
unsigned long flags;
pgd_t *save_pgd = NULL;

+ if (efi_is_mixed())
+ return efi_thunk_set_virtual_address_map(memory_map_size,
+ descriptor_size,
+ descriptor_version,
+ virtual_map);
+
if (efi_enabled(EFI_OLD_MEMMAP)) {
save_pgd = efi_old_memmap_phys_prolog();
if (!save_pgd)
diff --git a/arch/x86/platform/efi/efi_thunk_64.S b/arch/x86/platform/efi/efi_thunk_64.S
index 3189f1394701..162b35729633 100644
--- a/arch/x86/platform/efi/efi_thunk_64.S
+++ b/arch/x86/platform/efi/efi_thunk_64.S
@@ -25,15 +25,16 @@

.text
.code64
-SYM_FUNC_START(efi64_thunk)
+SYM_CODE_START(efi64_thunk)
push %rbp
push %rbx

/*
* Switch to 1:1 mapped 32-bit stack pointer.
*/
- movq %rsp, efi_saved_sp(%rip)
+ movq %rsp, %rax
movq efi_scratch(%rip), %rsp
+ push %rax

/*
* Calculate the physical address of the kernel text.
@@ -41,113 +42,31 @@ SYM_FUNC_START(efi64_thunk)
movq $__START_KERNEL_map, %rax
subq phys_base(%rip), %rax

- /*
- * Push some physical addresses onto the stack. This is easier
- * to do now in a code64 section while the assembler can address
- * 64-bit values. Note that all the addresses on the stack are
- * 32-bit.
- */
- subq $16, %rsp
- leaq efi_exit32(%rip), %rbx
- subq %rax, %rbx
- movl %ebx, 8(%rsp)
-
- leaq __efi64_thunk(%rip), %rbx
+ leaq 1f(%rip), %rbp
+ leaq 2f(%rip), %rbx
+ subq %rax, %rbp
subq %rax, %rbx
- call *%rbx
-
- movq efi_saved_sp(%rip), %rsp
- pop %rbx
- pop %rbp
- retq
-SYM_FUNC_END(efi64_thunk)

-/*
- * We run this function from the 1:1 mapping.
- *
- * This function must be invoked with a 1:1 mapped stack.
- */
-SYM_FUNC_START_LOCAL(__efi64_thunk)
- movl %ds, %eax
- push %rax
- movl %es, %eax
- push %rax
- movl %ss, %eax
- push %rax
-
- subq $32, %rsp
- movl %esi, 0x0(%rsp)
- movl %edx, 0x4(%rsp)
- movl %ecx, 0x8(%rsp)
- movq %r8, %rsi
- movl %esi, 0xc(%rsp)
- movq %r9, %rsi
- movl %esi, 0x10(%rsp)
-
- leaq 1f(%rip), %rbx
- movq %rbx, func_rt_ptr(%rip)
+ subq $28, %rsp
+ movl %ebx, 0x0(%rsp) /* return address */
+ movl %esi, 0x4(%rsp)
+ movl %edx, 0x8(%rsp)
+ movl %ecx, 0xc(%rsp)
+ movl %r8d, 0x10(%rsp)
+ movl %r9d, 0x14(%rsp)

/* Switch to 32-bit descriptor */
pushq $__KERNEL32_CS
- leaq efi_enter32(%rip), %rax
- pushq %rax
+ pushq %rdi /* EFI runtime service address */
lretq

-1: addq $32, %rsp
-
+1: movq 24(%rsp), %rsp
pop %rbx
- movl %ebx, %ss
- pop %rbx
- movl %ebx, %es
- pop %rbx
- movl %ebx, %ds
-
- /*
- * Convert 32-bit status code into 64-bit.
- */
- test %rax, %rax
- jz 1f
- movl %eax, %ecx
- andl $0x0fffffff, %ecx
- andl $0xf0000000, %eax
- shl $32, %rax
- or %rcx, %rax
-1:
- ret
-SYM_FUNC_END(__efi64_thunk)
-
-SYM_FUNC_START_LOCAL(efi_exit32)
- movq func_rt_ptr(%rip), %rax
- push %rax
- mov %rdi, %rax
- ret
-SYM_FUNC_END(efi_exit32)
+ pop %rbp
+ retq

.code32
-/*
- * EFI service pointer must be in %edi.
- *
- * The stack should represent the 32-bit calling convention.
- */
-SYM_FUNC_START_LOCAL(efi_enter32)
- movl $__KERNEL_DS, %eax
- movl %eax, %ds
- movl %eax, %es
- movl %eax, %ss
-
- call *%edi
-
- /* We must preserve return value */
- movl %eax, %edi
-
- movl 72(%esp), %eax
- pushl $__KERNEL_CS
- pushl %eax
-
+2: pushl $__KERNEL_CS
+ pushl %ebp
lret
-SYM_FUNC_END(efi_enter32)
-
- .data
- .balign 8
-func_rt_ptr: .quad 0
-efi_saved_sp: .quad 0
+SYM_CODE_END(efi64_thunk)
--
2.20.1

2020-01-03 11:43:48

by Ard Biesheuvel

[permalink] [raw]
Subject: [PATCH 02/20] efi/libstub/x86: force 'hidden' visibility for extern declarations

Commit c3710de5065d ("efi/libstub/x86: Drop __efi_early() export and
efi_config struct") introduced a reference from C code in eboot.c to
the startup_32 symbol defined in the .S startup code. This results in
a GOT based reference to startup_32, and since GOT entries carry
absolute addresses, they need to be fixed up before they can be used.

On modern toolchains (binutils 2.26 or later), this reference is
relaxed into a R_386_GOTOFF relocation (or the analogous X86_64 one)
which never uses the absolute address in the entry, and so we get
away with not fixing up the GOT table before calling the EFI entry
point. However, GCC 4.6 combined with a binutils of the era (2.24)
will produce a true GOT indirected reference, resulting in a wrong
value to be returned for the address of startup_32() if the boot
code is not running at the address it was linked at.

Fortunately, we can easily override this behavior, and force GCC to
emit the GOTOFF relocations explicitly, by setting the visibility
pragma 'hidden'.

Signed-off-by: Ard Biesheuvel <[email protected]>
---
arch/x86/boot/compressed/eboot.c | 2 ++
1 file changed, 2 insertions(+)

diff --git a/arch/x86/boot/compressed/eboot.c b/arch/x86/boot/compressed/eboot.c
index da04948d75ed..565ee4733579 100644
--- a/arch/x86/boot/compressed/eboot.c
+++ b/arch/x86/boot/compressed/eboot.c
@@ -6,6 +6,8 @@
*
* ----------------------------------------------------------------------- */

+#pragma GCC visibility push(hidden)
+
#include <linux/efi.h>
#include <linux/pci.h>

--
2.20.1

2020-01-03 11:50:52

by Ard Biesheuvel

[permalink] [raw]
Subject: [PATCH 01/20] efi/libstub: fix boot argument handling in mixed mode entry code

The mixed mode refactor actually broke mixed mode by failing to
pass the bootparam structure to startup_32(). This went unnoticed
because it apparently has a high tolerance for being passed random
junk, and still boots fine in some cases. So let's fix this by
populating %esi as required when entering via efi32_stub_entry,
and while at it, preserve the arguments themselves instead of their
address in memory (via the stack pointer) since that memory could
be clobbered before we get to it.

Signed-off-by: Ard Biesheuvel <[email protected]>
---
arch/x86/boot/compressed/head_64.S | 17 ++++++++++-------
1 file changed, 10 insertions(+), 7 deletions(-)

diff --git a/arch/x86/boot/compressed/head_64.S b/arch/x86/boot/compressed/head_64.S
index a6f3ee9ca61d..44a6bb6964b5 100644
--- a/arch/x86/boot/compressed/head_64.S
+++ b/arch/x86/boot/compressed/head_64.S
@@ -208,13 +208,12 @@ SYM_FUNC_START(startup_32)
pushl $__KERNEL_CS
leal startup_64(%ebp), %eax
#ifdef CONFIG_EFI_MIXED
- movl efi32_boot_args(%ebp), %ebx
- cmp $0, %ebx
+ movl efi32_boot_args(%ebp), %edi
+ cmp $0, %edi
jz 1f
leal handover_entry(%ebp), %eax
- movl 0(%ebx), %edi
- movl 4(%ebx), %esi
- movl 8(%ebx), %edx
+ movl %esi, %edx
+ movl efi32_boot_args+4(%ebp), %esi
movl $0x0, %ecx
1:
#endif
@@ -232,12 +231,16 @@ SYM_FUNC_END(startup_32)
.org 0x190
SYM_FUNC_START(efi32_stub_entry)
add $0x4, %esp /* Discard return address */
+ popl %ecx
+ popl %edx
+ popl %esi

call 1f
1: pop %ebp
subl $1b, %ebp

- movl %esp, efi32_boot_args(%ebp)
+ movl %ecx, efi32_boot_args(%ebp)
+ movl %edx, efi32_boot_args+4(%ebp)
sgdtl efi32_boot_gdt(%ebp)

/* Disable paging */
@@ -628,7 +631,7 @@ SYM_DATA_START_LOCAL(gdt)
SYM_DATA_END_LABEL(gdt, SYM_L_LOCAL, gdt_end)

#ifdef CONFIG_EFI_MIXED
-SYM_DATA_LOCAL(efi32_boot_args, .long 0)
+SYM_DATA_LOCAL(efi32_boot_args, .long 0, 0)
#endif

/*
--
2.20.1

2020-01-09 11:03:19

by Ard Biesheuvel

[permalink] [raw]
Subject: Re: [GIT PULL 00/20] More EFI updates for v5.6

On Fri, 3 Jan 2020 at 12:40, Ard Biesheuvel <[email protected]> wrote:
>
> Ingo, Thomas,
>
> This is the second batch of EFI updates for v5.6. Two things are still
> under discussion, so I'll probably have a few more changes for this
> cycle in a week or so.
>
> The following changes since commit 0679715e714345d273c0e1eb78078535ffc4b2a1:
>
> efi/libstub/x86: Avoid globals to store context during mixed mode calls (2019-12-25 10:49:26 +0100)
>
> are available in the Git repository at:
>
> git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi.git tags/efi-next
>
> for you to fetch changes up to d95e4feae5368a91775c4597a8f298ba84f31535:
>
> efi/x86: avoid RWX mappings for all of DRAM (2020-01-03 11:46:15 +0100)
>

Ingo, Thomas,

I'll be submitting another PR later today or tomorrow that goes on top
of these changes. Please let me know if you would like a v2 of this PR
with the new content included, or rather keep them separate.

Thanks,
Ard.



> ----------------------------------------------------------------
> Second batch of EFI updates for v5.6:
> - Some followup fixes for the EFI stub changes that have been queued up
> already.
> - Overhaul of the x86 EFI boot/runtime code, to peel off layers of pointer
> casting and type mangling via variadic macros and asm wrappers that made
> the code fragile and ugly.
> - Increase robustness for mixed mode code, by using argmaps to annotate and
> translate function prototypes that are not mixed mode safe. (Arvind)
> - Add the ability to disable DMA at the root port level in the EFI stub, to
> avoid booting into the kernel proper with IOMMUs in pass through and DMA
> enabled (suggested by Matthew)
> - Get rid of RWX mappings in the EFI memory map, where possible.
>
> ----------------------------------------------------------------
> Ard Biesheuvel (17):
> efi/libstub: fix boot argument handling in mixed mode entry code
> efi/libstub/x86: force 'hidden' visibility for extern declarations
> efi/x86: re-disable RT services for 32-bit kernels running on 64-bit EFI
> efi/x86: map the entire EFI vendor string before copying it
> efi/x86: avoid redundant cast of EFI firmware service pointer
> efi/x86: split off some old memmap handling into separate routines
> efi/x86: split SetVirtualAddresMap() wrappers into 32 and 64 bit versions
> efi/x86: simplify i386 efi_call_phys() firmware call wrapper
> efi/x86: simplify 64-bit EFI firmware call wrapper
> efi/x86: simplify mixed mode call wrapper
> efi/x86: drop two near identical versions of efi_runtime_init()
> efi/x86: clean up efi_systab_init() routine for legibility
> efi/x86: don't panic or BUG() on non-critical error conditions
> efi/x86: remove unreachable code in kexec_enter_virtual_mode()
> x86/mm: fix NX bit clearing issue in kernel_map_pages_in_pgd
> efi/x86: don't map the entire kernel text RW for mixed mode
> efi/x86: avoid RWX mappings for all of DRAM
>
> Arvind Sankar (2):
> efi/x86: Check number of arguments to variadic functions
> efi/x86: Allow translating 64-bit arguments for mixed mode calls
>
> Matthew Garrett (1):
> efi: Allow disabling PCI busmastering on bridges during boot
>
> Documentation/admin-guide/kernel-parameters.txt | 7 +-
> arch/x86/boot/compressed/eboot.c | 18 +-
> arch/x86/boot/compressed/efi_thunk_64.S | 4 +-
> arch/x86/boot/compressed/head_64.S | 17 +-
> arch/x86/include/asm/efi.h | 169 ++++++++---
> arch/x86/mm/pageattr.c | 8 +-
> arch/x86/platform/efi/Makefile | 1 -
> arch/x86/platform/efi/efi.c | 354 ++++++++----------------
> arch/x86/platform/efi/efi_32.c | 22 +-
> arch/x86/platform/efi/efi_64.c | 157 +++++++----
> arch/x86/platform/efi/efi_stub_32.S | 109 ++------
> arch/x86/platform/efi/efi_stub_64.S | 43 +--
> arch/x86/platform/efi/efi_thunk_64.S | 121 ++------
> arch/x86/platform/uv/bios_uv.c | 7 +-
> drivers/firmware/efi/Kconfig | 22 ++
> drivers/firmware/efi/libstub/Makefile | 2 +-
> drivers/firmware/efi/libstub/efi-stub-helper.c | 20 +-
> drivers/firmware/efi/libstub/pci.c | 114 ++++++++
> include/linux/efi.h | 29 +-
> 19 files changed, 597 insertions(+), 627 deletions(-)
> create mode 100644 drivers/firmware/efi/libstub/pci.c

2020-01-10 18:14:55

by Ingo Molnar

[permalink] [raw]
Subject: Re: [GIT PULL 00/20] More EFI updates for v5.6


* Ard Biesheuvel <[email protected]> wrote:

> On Fri, 3 Jan 2020 at 12:40, Ard Biesheuvel <[email protected]> wrote:
> >
> > Ingo, Thomas,
> >
> > This is the second batch of EFI updates for v5.6. Two things are still
> > under discussion, so I'll probably have a few more changes for this
> > cycle in a week or so.
> >
> > The following changes since commit 0679715e714345d273c0e1eb78078535ffc4b2a1:
> >
> > efi/libstub/x86: Avoid globals to store context during mixed mode calls (2019-12-25 10:49:26 +0100)
> >
> > are available in the Git repository at:
> >
> > git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi.git tags/efi-next
> >
> > for you to fetch changes up to d95e4feae5368a91775c4597a8f298ba84f31535:
> >
> > efi/x86: avoid RWX mappings for all of DRAM (2020-01-03 11:46:15 +0100)
> >
>
> Ingo, Thomas,
>
> I'll be submitting another PR later today or tomorrow that goes on top
> of these changes. Please let me know if you would like a v2 of this PR
> with the new content included, or rather keep them separate.

So there's one complication I noticed, there's conflicts with ongoing
x86/mm work. I've merged x86/mm into efi/core (and will send the branches
in that order in the merge window), and the final three patches conflict.

Mind sending those three patches and your other patches on top of the
latest efi/core (4444f8541dad)?

Thanks,

Ingo

2020-01-13 07:30:30

by Ard Biesheuvel

[permalink] [raw]
Subject: Re: [GIT PULL 00/20] More EFI updates for v5.6

On Fri, 10 Jan 2020 at 19:13, Ingo Molnar <[email protected]> wrote:
>
>
> * Ard Biesheuvel <[email protected]> wrote:
>
> > On Fri, 3 Jan 2020 at 12:40, Ard Biesheuvel <[email protected]> wrote:
> > >
> > > Ingo, Thomas,
> > >
> > > This is the second batch of EFI updates for v5.6. Two things are still
> > > under discussion, so I'll probably have a few more changes for this
> > > cycle in a week or so.
> > >
> > > The following changes since commit 0679715e714345d273c0e1eb78078535ffc4b2a1:
> > >
> > > efi/libstub/x86: Avoid globals to store context during mixed mode calls (2019-12-25 10:49:26 +0100)
> > >
> > > are available in the Git repository at:
> > >
> > > git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi.git tags/efi-next
> > >
> > > for you to fetch changes up to d95e4feae5368a91775c4597a8f298ba84f31535:
> > >
> > > efi/x86: avoid RWX mappings for all of DRAM (2020-01-03 11:46:15 +0100)
> > >
> >
> > Ingo, Thomas,
> >
> > I'll be submitting another PR later today or tomorrow that goes on top
> > of these changes. Please let me know if you would like a v2 of this PR
> > with the new content included, or rather keep them separate.
>
> So there's one complication I noticed, there's conflicts with ongoing
> x86/mm work. I've merged x86/mm into efi/core (and will send the branches
> in that order in the merge window), and the final three patches conflict.
>
> Mind sending those three patches and your other patches on top of the
> latest efi/core (4444f8541dad)?
>

No problem. I'll rebase and retest, and send out the result end of
today or tomorrow.

Thanks,
Ard.