This series adds UEFI support for RISC-V.
Linux kernel: master(00e4db51259a)
U-Boot: v2020.07
OpenSBI: master
Patches 1-3 are generic RISC-V feature additions required for UEFI support.
Patches 4-7 add the EFI stub support for RISC-V, which was reviewed a few months back.
https://www.spinics.net/lists/linux-efi/msg19144.html
Patch 8 just renames arm-init code so that it can be used across different
architectures.
Patch 9 adds the runtime services for RISC-V.
The working set of patches can also be found in the following git repo:
https://github.com/atishp04/linux/tree/uefi_riscv_5.10_v5
The patches have been verified on the following platforms:
1. QEMU (both RV32 & RV64) for the following boot flows:
OpenSBI->U-Boot->Linux
EDK2->Linux
2. HiFive Unleashed (RV64) for the following boot flows:
OpenSBI->U-Boot->Linux
EDK2->Linux
Thanks to Abner & Daniel for all the work done on EDK2.
The EDK2 instructions are available here:
https://github.com/JohnAZoidberg/riscv-edk2-docker/
Note:
1. Currently, the EDK2 RISC-V port doesn't support the OVMF package. That's why
EFI_GENERIC_STUB_INITRD_CMDLINE_LOADER should be enabled to load the initrd via
the command line until OVMF patches are available.
2. For RV32, the maximum allocated memory should be 1G, as the RISC-V kernel
cannot map beyond 1G of physical memory for RV32.
3. Runtime services have been verified with fwts on EDK2.
***********************************************************************
[root@fedora-riscv ~]# fwts uefirtvariable
Running 1 tests, results appended to results.log
Test: UEFI Runtime service variable interface tests.
Test UEFI RT service get variable interface. 1 passed
Test UEFI RT service get next variable name interface. 4 passed
Test UEFI RT service set variable interface. 7 passed, 1 warning
Test UEFI RT service query variable info interface. 1 passed
Test UEFI RT service variable interface stress test. 2 passed
Test UEFI RT service set variable interface stress t.. 4 passed
Test UEFI RT service query variable info interface s.. 1 passed
Test UEFI RT service get variable interface, invalid.. 5 passed
Test UEFI RT variable services supported status. 1 skipped
Test |Pass |Fail |Abort|Warn |Skip |Info |
uefirtvariable | 25| | | 1| 1| |
Total: | 25| 0| 0| 1| 1| 0|
***********************************************************************
Changes from v4->v5:
1. Late mapping allocations are now done through function pointers.
2. EFI runtime services are verified with a full Linux boot and fwts on EDK2.
Changes from v3->v4:
1. Used pgd mapping to avoid copying DT to bss.
Changes from v2->v3:
1. Fixed a few bugs in the runtime services page table mapping.
2. Dropped patch 1 as it is already taken into efi-tree.
3. Sent a few generic mmu fixes as a separate series to ease the merge conflicts.
Changes from v1->v2:
1. Removed patch 1 as it is already taken into efi-tree.
2. Fixed compilation issues with patch 9.
3. Moved a few function prototype declarations to a header file to keep kbuild happy.
Changes from previous version:
1. Added full ioremap support.
2. Added efi runtime services support.
3. Fixed mm issues.
Anup Patel (1):
RISC-V: Move DT mapping outof fixmap
Atish Patra (8):
RISC-V: Add early ioremap support
RISC-V: Implement late mapping page table allocation functions
include: pe.h: Add RISC-V related PE definition
RISC-V: Add PE/COFF header for EFI stub
RISC-V: Add EFI stub support.
efi: Rename arm-init to efi-init common for all arch
RISC-V: Add EFI runtime services
RISC-V: Add page table dump support for uefi
arch/riscv/Kconfig | 25 +++
arch/riscv/Makefile | 1 +
arch/riscv/configs/defconfig | 1 +
arch/riscv/include/asm/Kbuild | 1 +
arch/riscv/include/asm/efi.h | 56 +++++
arch/riscv/include/asm/fixmap.h | 16 +-
arch/riscv/include/asm/io.h | 1 +
arch/riscv/include/asm/mmu.h | 2 +
arch/riscv/include/asm/pgtable.h | 5 +
arch/riscv/include/asm/sections.h | 13 ++
arch/riscv/kernel/Makefile | 5 +
arch/riscv/kernel/efi-header.S | 104 ++++++++++
arch/riscv/kernel/efi.c | 105 ++++++++++
arch/riscv/kernel/head.S | 17 +-
arch/riscv/kernel/head.h | 2 -
arch/riscv/kernel/image-vars.h | 51 +++++
arch/riscv/kernel/setup.c | 17 +-
arch/riscv/kernel/vmlinux.lds.S | 22 +-
arch/riscv/mm/init.c | 191 +++++++++++++-----
arch/riscv/mm/ptdump.c | 48 ++++-
drivers/firmware/efi/Kconfig | 3 +-
drivers/firmware/efi/Makefile | 4 +-
.../firmware/efi/{arm-init.c => efi-init.c} | 0
drivers/firmware/efi/libstub/Makefile | 10 +
drivers/firmware/efi/libstub/efi-stub.c | 11 +-
drivers/firmware/efi/libstub/riscv-stub.c | 110 ++++++++++
drivers/firmware/efi/riscv-runtime.c | 143 +++++++++++++
include/linux/pe.h | 3 +
28 files changed, 900 insertions(+), 67 deletions(-)
create mode 100644 arch/riscv/include/asm/efi.h
create mode 100644 arch/riscv/include/asm/sections.h
create mode 100644 arch/riscv/kernel/efi-header.S
create mode 100644 arch/riscv/kernel/efi.c
create mode 100644 arch/riscv/kernel/image-vars.h
rename drivers/firmware/efi/{arm-init.c => efi-init.c} (100%)
create mode 100644 drivers/firmware/efi/libstub/riscv-stub.c
create mode 100644 drivers/firmware/efi/riscv-runtime.c
--
2.24.0
Add RISC-V architecture-specific stub code that copies the actual kernel
image to a valid address and jumps to it after boot services are
terminated. Enable the UEFI-related kernel configs for RISC-V as well.
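
For illustration only, the placement rule used by handle_kernel_image() in
the diff below can be worked through in plain C. This is a standalone sketch,
not the stub code itself: preferred_kernel_addr() and round_up_pow2() are
hypothetical helpers, and the DRAM base used in the usage note is just an
assumed example value. It mirrors the expression
round_up(dram_base, MIN_KIMG_ALIGN) + MIN_KIMG_ALIGN, with MIN_KIMG_ALIGN
being 2MB on 64-bit and 4MB on 32-bit.

```c
#include <assert.h>
#include <stdint.h>

#define SZ_2M (UINT64_C(2) << 20)
#define SZ_4M (UINT64_C(4) << 20)

/* Round x up to the next multiple of align; align must be a power of two. */
static uint64_t round_up_pow2(uint64_t x, uint64_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return (x + align - 1) & ~(align - 1);
}

/*
 * Hypothetical helper: the first aligned slot at dram_base is assumed to be
 * occupied by firmware, so the preferred kernel address is the next aligned
 * slot above it.
 */
static uint64_t preferred_kernel_addr(uint64_t dram_base, int is_64bit)
{
	uint64_t align = is_64bit ? SZ_2M : SZ_4M;

	return round_up_pow2(dram_base, align) + align;
}
```

With an assumed DRAM base of 0x80000000, this yields 0x80200000 for 64-bit
and 0x80400000 for 32-bit. If that address is unavailable, the real stub
falls back to the lowest suitable region, as the comment in
handle_kernel_image() describes.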
Signed-off-by: Atish Patra <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
[ardb: - move hartid fetch into check_platform_features()
- use image_size not reserve_size
- select ISA_C ]
Signed-off-by: Ard Biesheuvel <[email protected]>
---
arch/riscv/Kconfig | 22 +++++
arch/riscv/Makefile | 1 +
arch/riscv/configs/defconfig | 1 +
arch/riscv/include/asm/efi.h | 36 +++++++
drivers/firmware/efi/Kconfig | 3 +-
drivers/firmware/efi/libstub/Makefile | 10 ++
drivers/firmware/efi/libstub/riscv-stub.c | 110 ++++++++++++++++++++++
7 files changed, 182 insertions(+), 1 deletion(-)
create mode 100644 arch/riscv/include/asm/efi.h
create mode 100644 drivers/firmware/efi/libstub/riscv-stub.c
diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
index 15597f5f504f..e11907cc7a43 100644
--- a/arch/riscv/Kconfig
+++ b/arch/riscv/Kconfig
@@ -401,6 +401,26 @@ config CMDLINE_FORCE
endchoice
+config EFI_STUB
+ bool
+
+config EFI
+ bool "UEFI runtime support"
+ depends on OF
+ select LIBFDT
+ select UCS2_STRING
+ select EFI_PARAMS_FROM_FDT
+ select EFI_STUB
+ select EFI_GENERIC_STUB
+ select RISCV_ISA_C
+ default y
+ help
+ This option provides support for runtime services provided
+ by UEFI firmware (such as non-volatile variables, realtime
+ clock, and platform reset). A UEFI stub is also provided to
+ allow the kernel to be booted as an EFI application. This
+ is only useful on systems that have UEFI firmware.
+
endmenu
config BUILTIN_DTB
@@ -413,3 +433,5 @@ menu "Power management options"
source "kernel/power/Kconfig"
endmenu
+
+source "drivers/firmware/Kconfig"
diff --git a/arch/riscv/Makefile b/arch/riscv/Makefile
index fb6e37db836d..10df59f28add 100644
--- a/arch/riscv/Makefile
+++ b/arch/riscv/Makefile
@@ -80,6 +80,7 @@ head-y := arch/riscv/kernel/head.o
core-y += arch/riscv/
libs-y += arch/riscv/lib/
+libs-$(CONFIG_EFI_STUB) += $(objtree)/drivers/firmware/efi/libstub/lib.a
PHONY += vdso_install
vdso_install:
diff --git a/arch/riscv/configs/defconfig b/arch/riscv/configs/defconfig
index d58c93efb603..d222d353d86d 100644
--- a/arch/riscv/configs/defconfig
+++ b/arch/riscv/configs/defconfig
@@ -130,3 +130,4 @@ CONFIG_DEBUG_BLOCK_EXT_DEVT=y
# CONFIG_RUNTIME_TESTING_MENU is not set
CONFIG_MEMTEST=y
# CONFIG_SYSFS_SYSCALL is not set
+CONFIG_EFI=y
diff --git a/arch/riscv/include/asm/efi.h b/arch/riscv/include/asm/efi.h
new file mode 100644
index 000000000000..86da231909bb
--- /dev/null
+++ b/arch/riscv/include/asm/efi.h
@@ -0,0 +1,36 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (C) 2020 Western Digital Corporation or its affiliates.
+ */
+#ifndef _ASM_EFI_H
+#define _ASM_EFI_H
+
+#include <asm/io.h>
+#include <asm/mmu_context.h>
+#include <asm/ptrace.h>
+#include <asm/tlbflush.h>
+
+/* on RISC-V, the FDT may be located anywhere in system RAM */
+static inline unsigned long efi_get_max_fdt_addr(unsigned long dram_base)
+{
+ return ULONG_MAX;
+}
+
+/* Load initrd at enough distance from DRAM start */
+static inline unsigned long efi_get_max_initrd_addr(unsigned long dram_base,
+ unsigned long image_addr)
+{
+ return dram_base + SZ_256M;
+}
+
+#define alloc_screen_info(x...) (&screen_info)
+
+static inline void free_screen_info(struct screen_info *si)
+{
+}
+
+static inline void efifb_setup_from_dmi(struct screen_info *si, const char *opt)
+{
+}
+
+#endif /* _ASM_EFI_H */
diff --git a/drivers/firmware/efi/Kconfig b/drivers/firmware/efi/Kconfig
index 3939699e62fe..a29fbd6e657e 100644
--- a/drivers/firmware/efi/Kconfig
+++ b/drivers/firmware/efi/Kconfig
@@ -111,7 +111,7 @@ config EFI_GENERIC_STUB
config EFI_ARMSTUB_DTB_LOADER
bool "Enable the DTB loader"
- depends on EFI_GENERIC_STUB
+ depends on EFI_GENERIC_STUB && !RISCV
default y
help
Select this config option to add support for the dtb= command
@@ -128,6 +128,7 @@ config EFI_GENERIC_STUB_INITRD_CMDLINE_LOADER
bool "Enable the command line initrd loader" if !X86
depends on EFI_STUB && (EFI_GENERIC_STUB || X86)
default y
+ depends on !RISCV
help
Select this config option to add support for the initrd= command
line parameter, allowing an initrd that resides on the same volume
diff --git a/drivers/firmware/efi/libstub/Makefile b/drivers/firmware/efi/libstub/Makefile
index 296b18fbd7a2..e9fc2ddabd5f 100644
--- a/drivers/firmware/efi/libstub/Makefile
+++ b/drivers/firmware/efi/libstub/Makefile
@@ -22,6 +22,8 @@ cflags-$(CONFIG_ARM64) := $(subst $(CC_FLAGS_FTRACE),,$(KBUILD_CFLAGS)) \
cflags-$(CONFIG_ARM) := $(subst $(CC_FLAGS_FTRACE),,$(KBUILD_CFLAGS)) \
-fno-builtin -fpic \
$(call cc-option,-mno-single-pic-base)
+cflags-$(CONFIG_RISCV) := $(subst $(CC_FLAGS_FTRACE),,$(KBUILD_CFLAGS)) \
+ -fpic
cflags-$(CONFIG_EFI_GENERIC_STUB) += -I$(srctree)/scripts/dtc/libfdt
@@ -63,6 +65,7 @@ lib-$(CONFIG_EFI_GENERIC_STUB) += efi-stub.o fdt.o string.o \
lib-$(CONFIG_ARM) += arm32-stub.o
lib-$(CONFIG_ARM64) += arm64-stub.o
lib-$(CONFIG_X86) += x86-stub.o
+lib-$(CONFIG_RISCV) += riscv-stub.o
CFLAGS_arm32-stub.o := -DTEXT_OFFSET=$(TEXT_OFFSET)
CFLAGS_arm64-stub.o := -DTEXT_OFFSET=$(TEXT_OFFSET)
@@ -106,6 +109,13 @@ STUBCOPY_FLAGS-$(CONFIG_ARM64) += --prefix-alloc-sections=.init \
--prefix-symbols=__efistub_
STUBCOPY_RELOC-$(CONFIG_ARM64) := R_AARCH64_ABS
+# For RISC-V, we don't need anything beyond what arm64 does. Keep all the
+# symbols in the .init section and make sure that no absolute symbol
+# references exist.
+STUBCOPY_FLAGS-$(CONFIG_RISCV) += --prefix-alloc-sections=.init \
+ --prefix-symbols=__efistub_
+STUBCOPY_RELOC-$(CONFIG_RISCV) := R_RISCV_HI20
+
$(obj)/%.stub.o: $(obj)/%.o FORCE
$(call if_changed,stubcopy)
diff --git a/drivers/firmware/efi/libstub/riscv-stub.c b/drivers/firmware/efi/libstub/riscv-stub.c
new file mode 100644
index 000000000000..77c3fd6f820e
--- /dev/null
+++ b/drivers/firmware/efi/libstub/riscv-stub.c
@@ -0,0 +1,110 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (C) 2020 Western Digital Corporation or its affiliates.
+ */
+
+#include <linux/efi.h>
+#include <linux/libfdt.h>
+
+#include <asm/efi.h>
+#include <asm/sections.h>
+
+#include "efistub.h"
+
+/*
+ * RISC-V requires the kernel image to be placed at a 2MB-aligned base for
+ * 64-bit and a 4MB-aligned base for 32-bit.
+ */
+#ifdef CONFIG_64BIT
+#define MIN_KIMG_ALIGN SZ_2M
+#else
+#define MIN_KIMG_ALIGN SZ_4M
+#endif
+
+typedef void __noreturn (*jump_kernel_func)(unsigned int, unsigned long);
+
+static u32 hartid;
+
+static u32 get_boot_hartid_from_fdt(void)
+{
+ const void *fdt;
+ int chosen_node, len;
+ const fdt32_t *prop;
+
+ fdt = get_efi_config_table(DEVICE_TREE_GUID);
+ if (!fdt)
+ return U32_MAX;
+
+ chosen_node = fdt_path_offset(fdt, "/chosen");
+ if (chosen_node < 0)
+ return U32_MAX;
+
+ prop = fdt_getprop((void *)fdt, chosen_node, "boot-hartid", &len);
+ if (!prop || len != sizeof(u32))
+ return U32_MAX;
+
+ return fdt32_to_cpu(*prop);
+}
+
+efi_status_t check_platform_features(void)
+{
+ hartid = get_boot_hartid_from_fdt();
+ if (hartid == U32_MAX) {
+ efi_err("/chosen/boot-hartid missing or invalid!\n");
+ return EFI_UNSUPPORTED;
+ }
+ return EFI_SUCCESS;
+}
+
+void __noreturn efi_enter_kernel(unsigned long entrypoint, unsigned long fdt,
+ unsigned long fdt_size)
+{
+ unsigned long stext_offset = _start_kernel - _start;
+ unsigned long kernel_entry = entrypoint + stext_offset;
+ jump_kernel_func jump_kernel = (jump_kernel_func)kernel_entry;
+
+ /*
+ * Jump to the real kernel here with the following constraints:
+ * 1. MMU should be disabled.
+ * 2. a0 should contain the hartid.
+ * 3. a1 should contain the DT address.
+ */
+ csr_write(CSR_SATP, 0);
+ jump_kernel(hartid, fdt);
+}
+
+efi_status_t handle_kernel_image(unsigned long *image_addr,
+ unsigned long *image_size,
+ unsigned long *reserve_addr,
+ unsigned long *reserve_size,
+ unsigned long dram_base,
+ efi_loaded_image_t *image)
+{
+ unsigned long kernel_size = 0;
+ unsigned long preferred_addr;
+ efi_status_t status;
+
+ kernel_size = _edata - _start;
+ *image_addr = (unsigned long)_start;
+ *image_size = kernel_size + (_end - _edata);
+
+ /*
+ * RISC-V kernel maps PAGE_OFFSET virtual address to the same physical
+ * address where kernel is booted. That's why kernel should boot from
+ * as low as possible to avoid wastage of memory. Currently, dram_base
+ * is occupied by the firmware. So the preferred address for kernel to
+ * boot is next aligned address. If preferred address is not available,
+ * relocate_kernel will fall back to efi_low_alloc_above to allocate
+ * lowest possible memory region as long as the address and size meets
+ * the alignment constraints.
+ */
+ preferred_addr = round_up(dram_base, MIN_KIMG_ALIGN) + MIN_KIMG_ALIGN;
+ status = efi_relocate_kernel(image_addr, kernel_size, *image_size,
+ preferred_addr, MIN_KIMG_ALIGN, dram_base);
+
+ if (status != EFI_SUCCESS) {
+ efi_err("Failed to relocate kernel\n");
+ *image_size = 0;
+ }
+ return status;
+}
--
2.24.0
This patch adds EFI runtime service support for RISC-V.
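
As a rough illustration of the permission policy implemented by
efimem_to_pgprot_map() in the diff below, the decision order can be sketched
as a standalone function. The PERM_* flags and efi_region_perms() are
hypothetical names introduced here; the attribute and type constants are
assumed to match the UEFI specification values. The unaligned-region
fallback in the real function is left out.

```c
#include <stdint.h>

/* Assumed UEFI memory attribute bits and memory types. */
#define EFI_MEMORY_WP UINT64_C(0x1000)
#define EFI_MEMORY_RP UINT64_C(0x2000)
#define EFI_MEMORY_XP UINT64_C(0x4000)
#define EFI_MEMORY_RO UINT64_C(0x20000)

#define EFI_RUNTIME_SERVICES_CODE 5
#define EFI_MEMORY_MAPPED_IO      11

/* Hypothetical permission flags used only in this sketch. */
#define PERM_R 0x1
#define PERM_W 0x2
#define PERM_X 0x4

/* Same decision order as efimem_to_pgprot_map(), minus the alignment check. */
static int efi_region_perms(uint32_t type, uint64_t attr)
{
	if (type == EFI_MEMORY_MAPPED_IO)
		return PERM_R | PERM_W;

	/* R--: both read-only and execute-protect are requested. */
	if ((attr & (EFI_MEMORY_XP | EFI_MEMORY_RO)) ==
	    (EFI_MEMORY_XP | EFI_MEMORY_RO))
		return PERM_R;

	/* R-X: read-only without execute-protect. */
	if (attr & EFI_MEMORY_RO)
		return PERM_R | PERM_X;

	/* RW-: execute-protect only, or anything that is not runtime code. */
	if ((attr & (EFI_MEMORY_RP | EFI_MEMORY_WP | EFI_MEMORY_XP)) ==
	    EFI_MEMORY_XP || type != EFI_RUNTIME_SERVICES_CODE)
		return PERM_R | PERM_W;

	/* RWX: runtime code with no restricting attributes. */
	return PERM_R | PERM_W | PERM_X;
}
```

Only runtime services code regions can end up executable in this scheme;
everything else is mapped without the execute permission.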
Signed-off-by: Atish Patra <[email protected]>
---
arch/riscv/Kconfig | 2 +
arch/riscv/include/asm/efi.h | 20 ++++
arch/riscv/include/asm/mmu.h | 2 +
arch/riscv/include/asm/pgtable.h | 4 +
arch/riscv/kernel/Makefile | 1 +
arch/riscv/kernel/efi.c | 105 +++++++++++++++++
arch/riscv/kernel/setup.c | 7 +-
arch/riscv/mm/init.c | 2 +-
drivers/firmware/efi/Makefile | 2 +
drivers/firmware/efi/libstub/efi-stub.c | 11 +-
drivers/firmware/efi/riscv-runtime.c | 143 ++++++++++++++++++++++++
11 files changed, 295 insertions(+), 4 deletions(-)
create mode 100644 arch/riscv/kernel/efi.c
create mode 100644 drivers/firmware/efi/riscv-runtime.c
diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
index e11907cc7a43..b2164109483d 100644
--- a/arch/riscv/Kconfig
+++ b/arch/riscv/Kconfig
@@ -412,7 +412,9 @@ config EFI
select EFI_PARAMS_FROM_FDT
select EFI_STUB
select EFI_GENERIC_STUB
+ select EFI_RUNTIME_WRAPPERS
select RISCV_ISA_C
+ depends on MMU
default y
help
This option provides support for runtime services provided
diff --git a/arch/riscv/include/asm/efi.h b/arch/riscv/include/asm/efi.h
index 86da231909bb..93c305a638f4 100644
--- a/arch/riscv/include/asm/efi.h
+++ b/arch/riscv/include/asm/efi.h
@@ -5,11 +5,28 @@
#ifndef _ASM_EFI_H
#define _ASM_EFI_H
+#include <asm/csr.h>
#include <asm/io.h>
#include <asm/mmu_context.h>
#include <asm/ptrace.h>
#include <asm/tlbflush.h>
+#ifdef CONFIG_EFI
+extern void efi_init(void);
+#else
+#define efi_init()
+#endif
+
+int efi_create_mapping(struct mm_struct *mm, efi_memory_desc_t *md);
+int efi_set_mapping_permissions(struct mm_struct *mm, efi_memory_desc_t *md);
+
+#define arch_efi_call_virt_setup() efi_virtmap_load()
+#define arch_efi_call_virt_teardown() efi_virtmap_unload()
+
+#define arch_efi_call_virt(p, f, args...) p->f(args)
+
+#define ARCH_EFI_IRQ_FLAGS_MASK (SR_IE | SR_SPIE)
+
/* on RISC-V, the FDT may be located anywhere in system RAM */
static inline unsigned long efi_get_max_fdt_addr(unsigned long dram_base)
{
@@ -33,4 +50,7 @@ static inline void efifb_setup_from_dmi(struct screen_info *si, const char *opt)
{
}
+void efi_virtmap_load(void);
+void efi_virtmap_unload(void);
+
#endif /* _ASM_EFI_H */
diff --git a/arch/riscv/include/asm/mmu.h b/arch/riscv/include/asm/mmu.h
index 967eacb01ab5..dabcf2cfb3dc 100644
--- a/arch/riscv/include/asm/mmu.h
+++ b/arch/riscv/include/asm/mmu.h
@@ -20,6 +20,8 @@ typedef struct {
#endif
} mm_context_t;
+void __init create_pgd_mapping(pgd_t *pgdp, uintptr_t va, phys_addr_t pa,
+ phys_addr_t sz, pgprot_t prot);
#endif /* __ASSEMBLY__ */
#endif /* _ASM_RISCV_MMU_H */
diff --git a/arch/riscv/include/asm/pgtable.h b/arch/riscv/include/asm/pgtable.h
index 815f8c959dd4..183f1f4b2ae6 100644
--- a/arch/riscv/include/asm/pgtable.h
+++ b/arch/riscv/include/asm/pgtable.h
@@ -100,6 +100,10 @@
#define PAGE_KERNEL __pgprot(_PAGE_KERNEL)
#define PAGE_KERNEL_EXEC __pgprot(_PAGE_KERNEL | _PAGE_EXEC)
+#define PAGE_KERNEL_READ __pgprot(_PAGE_KERNEL & ~_PAGE_WRITE)
+#define PAGE_KERNEL_EXEC __pgprot(_PAGE_KERNEL | _PAGE_EXEC)
+#define PAGE_KERNEL_READ_EXEC __pgprot((_PAGE_KERNEL & ~_PAGE_WRITE) \
+ | _PAGE_EXEC)
#define PAGE_TABLE __pgprot(_PAGE_TABLE)
diff --git a/arch/riscv/kernel/Makefile b/arch/riscv/kernel/Makefile
index eabec4dce50b..0b48059cc9da 100644
--- a/arch/riscv/kernel/Makefile
+++ b/arch/riscv/kernel/Makefile
@@ -36,6 +36,7 @@ OBJCOPYFLAGS := --prefix-symbols=__efistub_
$(obj)/%.stub.o: $(obj)/%.o FORCE
$(call if_changed,objcopy)
+obj-$(CONFIG_EFI) += efi.o
obj-$(CONFIG_FPU) += fpu.o
obj-$(CONFIG_SMP) += smpboot.o
obj-$(CONFIG_SMP) += smp.o
diff --git a/arch/riscv/kernel/efi.c b/arch/riscv/kernel/efi.c
new file mode 100644
index 000000000000..d7a723b446c3
--- /dev/null
+++ b/arch/riscv/kernel/efi.c
@@ -0,0 +1,105 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Copyright (C) 2020 Western Digital Corporation or its affiliates.
+ * Adapted from arch/arm64/kernel/efi.c
+ */
+
+#include <linux/efi.h>
+#include <linux/init.h>
+
+#include <asm/efi.h>
+#include <asm/pgtable.h>
+#include <asm/pgtable-bits.h>
+
+/*
+ * Only regions of type EFI_RUNTIME_SERVICES_CODE need to be
+ * executable, everything else can be mapped with the XN bits
+ * set. Also take the new (optional) RO/XP bits into account.
+ */
+static __init pgprot_t efimem_to_pgprot_map(efi_memory_desc_t *md)
+{
+ u64 attr = md->attribute;
+ u32 type = md->type;
+
+ if (type == EFI_MEMORY_MAPPED_IO)
+ return PAGE_KERNEL;
+
+ if (WARN_ONCE(!PAGE_ALIGNED(md->phys_addr),
+ "UEFI Runtime regions are not aligned to page size -- buggy firmware?"))
+ /*
+ * If the region is not aligned to the page size of the OS, we
+ * can not use strict permissions, since that would also affect
+ * the mapping attributes of the adjacent regions.
+ */
+ return PAGE_EXEC;
+
+ /* R-- */
+ if ((attr & (EFI_MEMORY_XP | EFI_MEMORY_RO)) ==
+ (EFI_MEMORY_XP | EFI_MEMORY_RO))
+ return PAGE_KERNEL_READ;
+
+ /* R-X */
+ if (attr & EFI_MEMORY_RO)
+ return PAGE_KERNEL_READ_EXEC;
+
+ /* RW- */
+ if (((attr & (EFI_MEMORY_RP | EFI_MEMORY_WP | EFI_MEMORY_XP)) ==
+ EFI_MEMORY_XP) ||
+ type != EFI_RUNTIME_SERVICES_CODE)
+ return PAGE_KERNEL;
+
+ /* RWX */
+ return PAGE_KERNEL_EXEC;
+}
+
+int __init efi_create_mapping(struct mm_struct *mm, efi_memory_desc_t *md)
+{
+ pgprot_t prot = __pgprot(pgprot_val(efimem_to_pgprot_map(md)) &
+ ~(_PAGE_GLOBAL));
+ int i;
+
+ /* RISC-V maps one page at a time */
+ for (i = 0; i < md->num_pages; i++)
+ create_pgd_mapping(mm->pgd, md->virt_addr + i * PAGE_SIZE,
+ md->phys_addr + i * PAGE_SIZE,
+ PAGE_SIZE, prot);
+ return 0;
+}
+
+static int __init set_permissions(pte_t *ptep, unsigned long addr, void *data)
+{
+ efi_memory_desc_t *md = data;
+ pte_t pte = READ_ONCE(*ptep);
+ unsigned long val;
+
+ if (md->attribute & EFI_MEMORY_RO) {
+ val = pte_val(pte) & ~_PAGE_WRITE;
+ val |= _PAGE_READ;
+ pte = __pte(val);
+ }
+ if (md->attribute & EFI_MEMORY_XP) {
+ val = pte_val(pte) & ~_PAGE_EXEC;
+ pte = __pte(val);
+ }
+ set_pte(ptep, pte);
+
+ return 0;
+}
+
+int __init efi_set_mapping_permissions(struct mm_struct *mm,
+ efi_memory_desc_t *md)
+{
+ BUG_ON(md->type != EFI_RUNTIME_SERVICES_CODE &&
+ md->type != EFI_RUNTIME_SERVICES_DATA);
+
+ /*
+ * Calling apply_to_page_range() is only safe on regions that are
+ * guaranteed to be mapped down to pages. Since we are only called
+ * for regions that have been mapped using efi_create_mapping() above
+ * (and this is checked by the generic Memory Attributes table parsing
+ * routines), there is no need to check that again here.
+ */
+ return apply_to_page_range(mm, md->virt_addr,
+ md->num_pages << EFI_PAGE_SHIFT,
+ set_permissions, md);
+}
diff --git a/arch/riscv/kernel/setup.c b/arch/riscv/kernel/setup.c
index c71788e6aff4..7f2a0d6dca7d 100644
--- a/arch/riscv/kernel/setup.c
+++ b/arch/riscv/kernel/setup.c
@@ -17,6 +17,7 @@
#include <linux/sched/task.h>
#include <linux/swiotlb.h>
#include <linux/smp.h>
+#include <linux/efi.h>
#include <asm/clint.h>
#include <asm/cpu_ops.h>
@@ -26,11 +27,12 @@
#include <asm/tlbflush.h>
#include <asm/thread_info.h>
#include <asm/kasan.h>
+#include <asm/efi.h>
#include "head.h"
-#ifdef CONFIG_DUMMY_CONSOLE
-struct screen_info screen_info = {
+#if defined(CONFIG_DUMMY_CONSOLE) || defined(CONFIG_EFI)
+struct screen_info screen_info __section(.data) = {
.orig_video_lines = 30,
.orig_video_cols = 80,
.orig_video_mode = 0,
@@ -75,6 +77,7 @@ void __init setup_arch(char **cmdline_p)
early_ioremap_setup();
parse_early_param();
+ efi_init();
setup_bootmem();
paging_init();
#if IS_ENABLED(CONFIG_BUILTIN_DTB)
diff --git a/arch/riscv/mm/init.c b/arch/riscv/mm/init.c
index d238cdc501ee..9fb2fe2f4a3e 100644
--- a/arch/riscv/mm/init.c
+++ b/arch/riscv/mm/init.c
@@ -390,7 +390,7 @@ static void __init create_pmd_mapping(pmd_t *pmdp,
#define fixmap_pgd_next fixmap_pte
#endif
-static void __init create_pgd_mapping(pgd_t *pgdp,
+void __init create_pgd_mapping(pgd_t *pgdp,
uintptr_t va, phys_addr_t pa,
phys_addr_t sz, pgprot_t prot)
{
diff --git a/drivers/firmware/efi/Makefile b/drivers/firmware/efi/Makefile
index 61fd1e8b26fb..4d628081bb2f 100644
--- a/drivers/firmware/efi/Makefile
+++ b/drivers/firmware/efi/Makefile
@@ -35,6 +35,8 @@ fake_map-$(CONFIG_X86) += x86_fake_mem.o
arm-obj-$(CONFIG_EFI) := efi-init.o arm-runtime.o
obj-$(CONFIG_ARM) += $(arm-obj-y)
obj-$(CONFIG_ARM64) += $(arm-obj-y)
+riscv-obj-$(CONFIG_EFI) := efi-init.o riscv-runtime.o
+obj-$(CONFIG_RISCV) += $(riscv-obj-y)
obj-$(CONFIG_EFI_CAPSULE_LOADER) += capsule-loader.o
obj-$(CONFIG_EFI_EARLYCON) += earlycon.o
obj-$(CONFIG_UEFI_CPER_ARM) += cper-arm.o
diff --git a/drivers/firmware/efi/libstub/efi-stub.c b/drivers/firmware/efi/libstub/efi-stub.c
index a5a405d8ab44..5c26725d8fd0 100644
--- a/drivers/firmware/efi/libstub/efi-stub.c
+++ b/drivers/firmware/efi/libstub/efi-stub.c
@@ -17,7 +17,10 @@
/*
* This is the base address at which to start allocating virtual memory ranges
- * for UEFI Runtime Services. This is in the low TTBR0 range so that we can use
+ * for UEFI Runtime Services.
+ *
+ * For ARM/ARM64:
+ * This is in the low TTBR0 range so that we can use
* any allocation we choose, and eliminate the risk of a conflict after kexec.
* The value chosen is the largest non-zero power of 2 suitable for this purpose
* both on 32-bit and 64-bit ARM CPUs, to maximize the likelihood that it can
@@ -25,6 +28,12 @@
* Since 32-bit ARM could potentially execute with a 1G/3G user/kernel split,
* map everything below 1 GB. (512 MB is a reasonable upper bound for the
* entire footprint of the UEFI runtime services memory regions)
+ *
+ * For RISC-V:
+ * There is no specific reason why this address (512MB) can't be used as the
+ * EFI runtime virtual address for RISC-V. It also helps to use EFI runtime
+ * services on both RV32/RV64. Keep the same runtime virtual address for RISC-V
+ * as well to minimize the code churn.
*/
#define EFI_RT_VIRTUAL_BASE SZ_512M
#define EFI_RT_VIRTUAL_SIZE SZ_512M
diff --git a/drivers/firmware/efi/riscv-runtime.c b/drivers/firmware/efi/riscv-runtime.c
new file mode 100644
index 000000000000..d28e715d2bcc
--- /dev/null
+++ b/drivers/firmware/efi/riscv-runtime.c
@@ -0,0 +1,143 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Extensible Firmware Interface
+ *
+ * Copyright (C) 2020 Western Digital Corporation or its affiliates.
+ *
+ * Based on Extensible Firmware Interface Specification version 2.4
+ * Adapted from drivers/firmware/efi/arm-runtime.c
+ *
+ */
+
+#include <linux/dmi.h>
+#include <linux/efi.h>
+#include <linux/io.h>
+#include <linux/memblock.h>
+#include <linux/mm_types.h>
+#include <linux/preempt.h>
+#include <linux/rbtree.h>
+#include <linux/rwsem.h>
+#include <linux/sched.h>
+#include <linux/slab.h>
+#include <linux/spinlock.h>
+#include <linux/pgtable.h>
+
+#include <asm/cacheflush.h>
+#include <asm/efi.h>
+#include <asm/mmu.h>
+#include <asm/pgalloc.h>
+
+static bool __init efi_virtmap_init(void)
+{
+ efi_memory_desc_t *md;
+
+ efi_mm.pgd = pgd_alloc(&efi_mm);
+ mm_init_cpumask(&efi_mm);
+ init_new_context(NULL, &efi_mm);
+
+ for_each_efi_memory_desc(md) {
+ phys_addr_t phys = md->phys_addr;
+ int ret;
+
+ if (!(md->attribute & EFI_MEMORY_RUNTIME))
+ continue;
+ if (md->virt_addr == 0)
+ return false;
+
+ ret = efi_create_mapping(&efi_mm, md);
+ if (ret) {
+ pr_warn(" EFI remap %pa: failed to create mapping (%d)\n",
+ &phys, ret);
+ return false;
+ }
+ }
+
+ if (efi_memattr_apply_permissions(&efi_mm, efi_set_mapping_permissions))
+ return false;
+
+ return true;
+}
+
+/*
+ * Enable the UEFI Runtime Services if all prerequisites are in place, i.e.,
+ * non-early mapping of the UEFI system table and virtual mappings for all
+ * EFI_MEMORY_RUNTIME regions.
+ */
+static int __init riscv_enable_runtime_services(void)
+{
+ u64 mapsize;
+
+ if (!efi_enabled(EFI_BOOT)) {
+ pr_info("EFI services will not be available.\n");
+ return 0;
+ }
+
+ efi_memmap_unmap();
+
+ mapsize = efi.memmap.desc_size * efi.memmap.nr_map;
+
+ if (efi_memmap_init_late(efi.memmap.phys_map, mapsize)) {
+ pr_err("Failed to remap EFI memory map\n");
+ return 0;
+ }
+
+ if (efi_soft_reserve_enabled()) {
+ efi_memory_desc_t *md;
+
+ for_each_efi_memory_desc(md) {
+ int md_size = md->num_pages << EFI_PAGE_SHIFT;
+ struct resource *res;
+
+ if (!(md->attribute & EFI_MEMORY_SP))
+ continue;
+
+ res = kzalloc(sizeof(*res), GFP_KERNEL);
+ if (WARN_ON(!res))
+ break;
+
+ res->start = md->phys_addr;
+ res->end = md->phys_addr + md_size - 1;
+ res->name = "Soft Reserved";
+ res->flags = IORESOURCE_MEM;
+ res->desc = IORES_DESC_SOFT_RESERVED;
+
+ insert_resource(&iomem_resource, res);
+ }
+ }
+
+ if (efi_runtime_disabled()) {
+ pr_info("EFI runtime services will be disabled.\n");
+ return 0;
+ }
+
+ if (efi_enabled(EFI_RUNTIME_SERVICES)) {
+ pr_info("EFI runtime services access via paravirt.\n");
+ return 0;
+ }
+
+ pr_info("Remapping and enabling EFI services.\n");
+
+ if (!efi_virtmap_init()) {
+ pr_err("UEFI virtual mapping missing or invalid -- runtime services will not be available\n");
+ return -ENOMEM;
+ }
+
+ /* Set up runtime services function pointers */
+ efi_native_runtime_setup();
+ set_bit(EFI_RUNTIME_SERVICES, &efi.flags);
+
+ return 0;
+}
+early_initcall(riscv_enable_runtime_services);
+
+void efi_virtmap_load(void)
+{
+ preempt_disable();
+ switch_mm(current->active_mm, &efi_mm, NULL);
+}
+
+void efi_virtmap_unload(void)
+{
+ switch_mm(&efi_mm, current->active_mm, NULL);
+ preempt_enable();
+}
--
2.24.0
From: Anup Patel <[email protected]>
Currently, RISC-V reserves 1MB of fixmap memory for the device tree. However,
it maps only a single PMD (2MB) for fixmap, which leaves less than 1MB for
other kernel features such as early ioremap, which require fixmap as well.
The fixmap size could be increased by another 2MB, but that brings additional
complexity and changes the virtual memory layout as well. If some additional
feature needed fixmap later, it would have to be moved again.
Technically, the DT doesn't need a fixmap, as the memory occupied by the DT is
only used during boot. That's why we map the device tree in the early page
table using two consecutive PGD mappings at lower addresses (< PAGE_OFFSET).
This frees a lot of space in fixmap and also raises the maximum supported
device tree size to PGDIR_SIZE. Thus, the init memory section can be used
for the same purpose as well. This simplifies the fixmap implementation.
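
The address arithmetic behind the two PGD mappings can be sketched as a
standalone C example. This is only an illustration: the struct and
dtb_early_layout() are hypothetical, and PGDIR_SIZE is assumed to be 1GB
(as for RV64 Sv39); the real value depends on the paging mode. The physical
base is rounded down to a PGDIR_SIZE boundary, and the DT keeps its offset
within that region, so a second consecutive mapping is needed to cover a DT
that starts near the end of the first one.

```c
#include <stdint.h>

/* Assumed value for illustration (1GB, as on RV64 Sv39). */
#define PGDIR_SIZE        (UINT64_C(1) << 30)
#define DTB_EARLY_BASE_VA PGDIR_SIZE

/* Hypothetical result type describing the early DT mapping. */
struct dtb_early_map {
	uint64_t map_pa;  /* PGDIR_SIZE-aligned physical base of the mapping */
	uint64_t map_len; /* two consecutive PGD entries */
	uint64_t dtb_va;  /* virtual address at which the DT becomes visible */
};

static struct dtb_early_map dtb_early_layout(uint64_t dtb_pa)
{
	struct dtb_early_map m;

	m.map_pa = dtb_pa & ~(PGDIR_SIZE - 1);
	m.map_len = 2 * PGDIR_SIZE;
	m.dtb_va = DTB_EARLY_BASE_VA + (dtb_pa & (PGDIR_SIZE - 1));
	return m;
}
```

For an assumed DT at physical address 0x82200000, the mapping starts at
0x80000000 and the DT is visible at 0x42200000. Even a DT placed one page
before a 1GB boundary still has PGDIR_SIZE bytes of mapped space after it.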
Signed-off-by: Anup Patel <[email protected]>
Signed-off-by: Atish Patra <[email protected]>
---
arch/riscv/include/asm/fixmap.h | 3 ---
arch/riscv/include/asm/pgtable.h | 1 +
arch/riscv/kernel/head.S | 1 -
arch/riscv/kernel/head.h | 2 --
arch/riscv/kernel/setup.c | 9 +++++++--
arch/riscv/mm/init.c | 26 ++++++++++++--------------
6 files changed, 20 insertions(+), 22 deletions(-)
diff --git a/arch/riscv/include/asm/fixmap.h b/arch/riscv/include/asm/fixmap.h
index 1ff075a8dfc7..11613f38228a 100644
--- a/arch/riscv/include/asm/fixmap.h
+++ b/arch/riscv/include/asm/fixmap.h
@@ -22,9 +22,6 @@
*/
enum fixed_addresses {
FIX_HOLE,
-#define FIX_FDT_SIZE SZ_1M
- FIX_FDT_END,
- FIX_FDT = FIX_FDT_END + FIX_FDT_SIZE / PAGE_SIZE - 1,
FIX_PTE,
FIX_PMD,
FIX_TEXT_POKE1,
diff --git a/arch/riscv/include/asm/pgtable.h b/arch/riscv/include/asm/pgtable.h
index eaea1f717010..815f8c959dd4 100644
--- a/arch/riscv/include/asm/pgtable.h
+++ b/arch/riscv/include/asm/pgtable.h
@@ -464,6 +464,7 @@ static inline void __kernel_map_pages(struct page *page, int numpages, int enabl
#define kern_addr_valid(addr) (1) /* FIXME */
extern void *dtb_early_va;
+extern uintptr_t dtb_early_pa;
void setup_bootmem(void);
void paging_init(void);
diff --git a/arch/riscv/kernel/head.S b/arch/riscv/kernel/head.S
index 7822054dbd88..a2f0cb3ca0a6 100644
--- a/arch/riscv/kernel/head.S
+++ b/arch/riscv/kernel/head.S
@@ -255,7 +255,6 @@ clear_bss_done:
#endif
/* Start the kernel */
call soc_early_init
- call parse_dtb
tail start_kernel
.Lsecondary_start:
diff --git a/arch/riscv/kernel/head.h b/arch/riscv/kernel/head.h
index 105fb0496b24..b48dda3d04f6 100644
--- a/arch/riscv/kernel/head.h
+++ b/arch/riscv/kernel/head.h
@@ -16,6 +16,4 @@ asmlinkage void __init setup_vm(uintptr_t dtb_pa);
extern void *__cpu_up_stack_pointer[];
extern void *__cpu_up_task_pointer[];
-void __init parse_dtb(void);
-
#endif /* __ASM_HEAD_H */
diff --git a/arch/riscv/kernel/setup.c b/arch/riscv/kernel/setup.c
index f04373be54a6..6a0ee2405813 100644
--- a/arch/riscv/kernel/setup.c
+++ b/arch/riscv/kernel/setup.c
@@ -49,8 +49,9 @@ atomic_t hart_lottery __section(.sdata);
unsigned long boot_cpu_hartid;
static DEFINE_PER_CPU(struct cpu, cpu_devices);
-void __init parse_dtb(void)
+static void __init parse_dtb(void)
{
+ /* Early scan of device tree from init memory */
if (early_init_dt_scan(dtb_early_va))
return;
@@ -63,6 +64,7 @@ void __init parse_dtb(void)
void __init setup_arch(char **cmdline_p)
{
+ parse_dtb();
init_mm.start_code = (unsigned long) _stext;
init_mm.end_code = (unsigned long) _etext;
init_mm.end_data = (unsigned long) _edata;
@@ -77,7 +79,10 @@ void __init setup_arch(char **cmdline_p)
#if IS_ENABLED(CONFIG_BUILTIN_DTB)
unflatten_and_copy_device_tree();
#else
- unflatten_device_tree();
+ if (early_init_dt_verify(__va(dtb_early_pa)))
+ unflatten_device_tree();
+ else
+ pr_err("No DTB found in kernel mappings\n");
#endif
clint_init_boot_cpu();
diff --git a/arch/riscv/mm/init.c b/arch/riscv/mm/init.c
index 787c75f751a5..2b651f63f5c4 100644
--- a/arch/riscv/mm/init.c
+++ b/arch/riscv/mm/init.c
@@ -28,7 +28,9 @@ unsigned long empty_zero_page[PAGE_SIZE / sizeof(unsigned long)]
EXPORT_SYMBOL(empty_zero_page);
extern char _start[];
-void *dtb_early_va;
+#define DTB_EARLY_BASE_VA PGDIR_SIZE
+void *dtb_early_va __initdata;
+uintptr_t dtb_early_pa __initdata;
static void __init zone_sizes_init(void)
{
@@ -141,8 +143,6 @@ static void __init setup_initrd(void)
}
#endif /* CONFIG_BLK_DEV_INITRD */
-static phys_addr_t dtb_early_pa __initdata;
-
void __init setup_bootmem(void)
{
struct memblock_region *reg;
@@ -399,7 +399,7 @@ static uintptr_t __init best_map_size(phys_addr_t base, phys_addr_t size)
asmlinkage void __init setup_vm(uintptr_t dtb_pa)
{
- uintptr_t va, end_va;
+ uintptr_t va, pa, end_va;
uintptr_t load_pa = (uintptr_t)(&_start);
uintptr_t load_sz = (uintptr_t)(&_end) - load_pa;
uintptr_t map_size = best_map_size(load_pa, MAX_EARLY_MAPPING_SIZE);
@@ -448,16 +448,13 @@ asmlinkage void __init setup_vm(uintptr_t dtb_pa)
load_pa + (va - PAGE_OFFSET),
map_size, PAGE_KERNEL_EXEC);
- /* Create fixed mapping for early FDT parsing */
- end_va = __fix_to_virt(FIX_FDT) + FIX_FDT_SIZE;
- for (va = __fix_to_virt(FIX_FDT); va < end_va; va += PAGE_SIZE)
- create_pte_mapping(fixmap_pte, va,
- dtb_pa + (va - __fix_to_virt(FIX_FDT)),
- PAGE_SIZE, PAGE_KERNEL);
-
- /* Save pointer to DTB for early FDT parsing */
- dtb_early_va = (void *)fix_to_virt(FIX_FDT) + (dtb_pa & ~PAGE_MASK);
- /* Save physical address for memblock reservation */
+ /* Create two consecutive PGD mappings for FDT early scan */
+ pa = dtb_pa & ~(PGDIR_SIZE - 1);
+ create_pgd_mapping(early_pg_dir, DTB_EARLY_BASE_VA,
+ pa, PGDIR_SIZE, PAGE_KERNEL);
+ create_pgd_mapping(early_pg_dir, DTB_EARLY_BASE_VA + PGDIR_SIZE,
+ pa + PGDIR_SIZE, PGDIR_SIZE, PAGE_KERNEL);
+ dtb_early_va = (void *)DTB_EARLY_BASE_VA + (dtb_pa & (PGDIR_SIZE - 1));
dtb_early_pa = dtb_pa;
}
@@ -516,6 +513,7 @@ asmlinkage void __init setup_vm(uintptr_t dtb_pa)
#else
dtb_early_va = (void *)dtb_pa;
#endif
+ dtb_early_pa = dtb_pa;
}
static inline void setup_vm_final(void)
--
2.24.0
Currently, page table setup is done during setup_vm_final, where fixmap can
be used to create the temporary mappings. The physical frame is allocated
by the memblock_alloc_* functions. However, this won't work if a page table
mapping needs to be created for a different mm context (i.e. efi mm) at
a later point in time.
Use the generic kernel page allocation functions and macros for any mapping
created after setup_vm_final.
Signed-off-by: Atish Patra <[email protected]>
---
arch/riscv/mm/init.c | 130 ++++++++++++++++++++++++++++++++-----------
1 file changed, 99 insertions(+), 31 deletions(-)
diff --git a/arch/riscv/mm/init.c b/arch/riscv/mm/init.c
index b75ebe8e7a92..d238cdc501ee 100644
--- a/arch/riscv/mm/init.c
+++ b/arch/riscv/mm/init.c
@@ -32,6 +32,17 @@ extern char _start[];
void *dtb_early_va __initdata;
uintptr_t dtb_early_pa __initdata;
+struct pt_alloc_ops {
+ pte_t *(*get_pte_virt)(phys_addr_t pa);
+ phys_addr_t (*alloc_pte)(uintptr_t va);
+#ifndef __PAGETABLE_PMD_FOLDED
+ pmd_t *(*get_pmd_virt)(phys_addr_t pa);
+ phys_addr_t (*alloc_pmd)(uintptr_t va);
+#endif
+};
+
+struct pt_alloc_ops pt_ops;
+
static void __init zone_sizes_init(void)
{
unsigned long max_zone_pfns[MAX_NR_ZONES] = { 0, };
@@ -211,7 +222,6 @@ EXPORT_SYMBOL(pfn_base);
pgd_t swapper_pg_dir[PTRS_PER_PGD] __page_aligned_bss;
pgd_t trampoline_pg_dir[PTRS_PER_PGD] __page_aligned_bss;
pte_t fixmap_pte[PTRS_PER_PTE] __page_aligned_bss;
-static bool mmu_enabled;
#define MAX_EARLY_MAPPING_SIZE SZ_128M
@@ -234,27 +244,46 @@ void __set_fixmap(enum fixed_addresses idx, phys_addr_t phys, pgprot_t prot)
}
}
-static pte_t *__init get_pte_virt(phys_addr_t pa)
+static inline pte_t *__init get_pte_virt_early(phys_addr_t pa)
{
- if (mmu_enabled) {
- clear_fixmap(FIX_PTE);
- return (pte_t *)set_fixmap_offset(FIX_PTE, pa);
- } else {
- return (pte_t *)((uintptr_t)pa);
- }
+ return (pte_t *)((uintptr_t)pa);
}
-static phys_addr_t __init alloc_pte(uintptr_t va)
+static inline pte_t *__init get_pte_virt_fixmap(phys_addr_t pa)
+{
+ clear_fixmap(FIX_PTE);
+ return (pte_t *)set_fixmap_offset(FIX_PTE, pa);
+}
+
+static inline pte_t *get_pte_virt_late(phys_addr_t pa)
+{
+ return (pte_t *) __va(pa);
+}
+
+static inline phys_addr_t __init alloc_pte_early(uintptr_t va)
{
/*
* We only create PMD or PGD early mappings so we
* should never reach here with MMU disabled.
*/
- BUG_ON(!mmu_enabled);
+ BUG();
+}
+static inline phys_addr_t __init alloc_pte_fixmap(uintptr_t va)
+{
return memblock_phys_alloc(PAGE_SIZE, PAGE_SIZE);
}
+static phys_addr_t alloc_pte_late(uintptr_t va)
+{
+ unsigned long vaddr;
+
+ vaddr = __get_free_page(GFP_KERNEL);
+ if (!vaddr || !pgtable_pte_page_ctor(virt_to_page(vaddr)))
+ BUG();
+ return __pa(vaddr);
+}
+
static void __init create_pte_mapping(pte_t *ptep,
uintptr_t va, phys_addr_t pa,
phys_addr_t sz, pgprot_t prot)
@@ -279,28 +308,46 @@ pmd_t fixmap_pmd[PTRS_PER_PMD] __page_aligned_bss;
#endif
pmd_t early_pmd[PTRS_PER_PMD * NUM_EARLY_PMDS] __initdata __aligned(PAGE_SIZE);
-static pmd_t *__init get_pmd_virt(phys_addr_t pa)
+static pmd_t *__init get_pmd_virt_early(phys_addr_t pa)
{
- if (mmu_enabled) {
- clear_fixmap(FIX_PMD);
- return (pmd_t *)set_fixmap_offset(FIX_PMD, pa);
- } else {
- return (pmd_t *)((uintptr_t)pa);
- }
+ /* Before MMU is enabled */
+ return (pmd_t *)((uintptr_t)pa);
}
-static phys_addr_t __init alloc_pmd(uintptr_t va)
+static pmd_t *__init get_pmd_virt_fixmap(phys_addr_t pa)
{
- uintptr_t pmd_num;
+ clear_fixmap(FIX_PMD);
+ return (pmd_t *)set_fixmap_offset(FIX_PMD, pa);
+}
+
+static pmd_t *get_pmd_virt_late(phys_addr_t pa)
+{
+ return (pmd_t *) __va(pa);
+}
- if (mmu_enabled)
- return memblock_phys_alloc(PAGE_SIZE, PAGE_SIZE);
+static phys_addr_t __init alloc_pmd_early(uintptr_t va)
+{
+ uintptr_t pmd_num;
pmd_num = (va - PAGE_OFFSET) >> PGDIR_SHIFT;
BUG_ON(pmd_num >= NUM_EARLY_PMDS);
return (uintptr_t)&early_pmd[pmd_num * PTRS_PER_PMD];
}
+static phys_addr_t __init alloc_pmd_fixmap(uintptr_t va)
+{
+ return memblock_phys_alloc(PAGE_SIZE, PAGE_SIZE);
+}
+
+static phys_addr_t alloc_pmd_late(uintptr_t va)
+{
+ unsigned long vaddr;
+
+ vaddr = __get_free_page(GFP_KERNEL);
+ BUG_ON(!vaddr);
+ return __pa(vaddr);
+}
+
static void __init create_pmd_mapping(pmd_t *pmdp,
uintptr_t va, phys_addr_t pa,
phys_addr_t sz, pgprot_t prot)
@@ -316,28 +363,28 @@ static void __init create_pmd_mapping(pmd_t *pmdp,
}
if (pmd_none(pmdp[pmd_idx])) {
- pte_phys = alloc_pte(va);
+ pte_phys = pt_ops.alloc_pte(va);
pmdp[pmd_idx] = pfn_pmd(PFN_DOWN(pte_phys), PAGE_TABLE);
- ptep = get_pte_virt(pte_phys);
+ ptep = pt_ops.get_pte_virt(pte_phys);
memset(ptep, 0, PAGE_SIZE);
} else {
pte_phys = PFN_PHYS(_pmd_pfn(pmdp[pmd_idx]));
- ptep = get_pte_virt(pte_phys);
+ ptep = pt_ops.get_pte_virt(pte_phys);
}
create_pte_mapping(ptep, va, pa, sz, prot);
}
#define pgd_next_t pmd_t
-#define alloc_pgd_next(__va) alloc_pmd(__va)
-#define get_pgd_next_virt(__pa) get_pmd_virt(__pa)
+#define alloc_pgd_next(__va) pt_ops.alloc_pmd(__va)
+#define get_pgd_next_virt(__pa) pt_ops.get_pmd_virt(__pa)
#define create_pgd_next_mapping(__nextp, __va, __pa, __sz, __prot) \
create_pmd_mapping(__nextp, __va, __pa, __sz, __prot)
#define fixmap_pgd_next fixmap_pmd
#else
#define pgd_next_t pte_t
-#define alloc_pgd_next(__va) alloc_pte(__va)
-#define get_pgd_next_virt(__pa) get_pte_virt(__pa)
+#define alloc_pgd_next(__va) pt_ops.alloc_pte(__va)
+#define get_pgd_next_virt(__pa) pt_ops.get_pte_virt(__pa)
#define create_pgd_next_mapping(__nextp, __va, __pa, __sz, __prot) \
create_pte_mapping(__nextp, __va, __pa, __sz, __prot)
#define fixmap_pgd_next fixmap_pte
@@ -421,6 +468,12 @@ asmlinkage void __init setup_vm(uintptr_t dtb_pa)
BUG_ON((load_pa % map_size) != 0);
BUG_ON(load_sz > MAX_EARLY_MAPPING_SIZE);
+ pt_ops.alloc_pte = alloc_pte_early;
+ pt_ops.get_pte_virt = get_pte_virt_early;
+#ifndef __PAGETABLE_PMD_FOLDED
+ pt_ops.alloc_pmd = alloc_pmd_early;
+ pt_ops.get_pmd_virt = get_pmd_virt_early;
+#endif
/* Setup early PGD for fixmap */
create_pgd_mapping(early_pg_dir, FIXADDR_START,
(uintptr_t)fixmap_pgd_next, PGDIR_SIZE, PAGE_TABLE);
@@ -497,9 +550,16 @@ static void __init setup_vm_final(void)
phys_addr_t pa, start, end;
struct memblock_region *reg;
- /* Set mmu_enabled flag */
- mmu_enabled = true;
-
+ /*
+ * The MMU is enabled at this point, but page table setup is not yet
+ * complete. The fixmap page table alloc functions must be used here.
+ */
+ pt_ops.alloc_pte = alloc_pte_fixmap;
+ pt_ops.get_pte_virt = get_pte_virt_fixmap;
+#ifndef __PAGETABLE_PMD_FOLDED
+ pt_ops.alloc_pmd = alloc_pmd_fixmap;
+ pt_ops.get_pmd_virt = get_pmd_virt_fixmap;
+#endif
/* Setup swapper PGD for fixmap */
create_pgd_mapping(swapper_pg_dir, FIXADDR_START,
__pa_symbol(fixmap_pgd_next),
@@ -533,6 +593,14 @@ static void __init setup_vm_final(void)
/* Move to swapper page table */
csr_write(CSR_SATP, PFN_DOWN(__pa_symbol(swapper_pg_dir)) | SATP_MODE);
local_flush_tlb_all();
+
+ /* Generic page allocation functions must be used to set up page tables */
+ pt_ops.alloc_pte = alloc_pte_late;
+ pt_ops.get_pte_virt = get_pte_virt_late;
+#ifndef __PAGETABLE_PMD_FOLDED
+ pt_ops.alloc_pmd = alloc_pmd_late;
+ pt_ops.get_pmd_virt = get_pmd_virt_late;
+#endif
}
#else
asmlinkage void __init setup_vm(uintptr_t dtb_pa)
--
2.24.0
A Linux kernel Image can appear as an EFI application with the appropriate
PE/COFF header fields at the beginning of the Image header. An EFI
application loader can then load a Linux kernel Image directly, and an EFI
stub residing in the kernel can boot Linux.
Add the necessary PE/COFF header.
Signed-off-by: Atish Patra <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
[ardb: - use C prefix for c.li to ensure the expected opcode is emitted
- align all image sections according to PE/COFF section alignment ]
Signed-off-by: Ard Biesheuvel <[email protected]>
---
arch/riscv/include/asm/sections.h | 13 ++++
arch/riscv/kernel/Makefile | 4 ++
arch/riscv/kernel/efi-header.S | 104 ++++++++++++++++++++++++++++++
arch/riscv/kernel/head.S | 16 +++++
arch/riscv/kernel/image-vars.h | 51 +++++++++++++++
arch/riscv/kernel/vmlinux.lds.S | 22 ++++++-
6 files changed, 208 insertions(+), 2 deletions(-)
create mode 100644 arch/riscv/include/asm/sections.h
create mode 100644 arch/riscv/kernel/efi-header.S
create mode 100644 arch/riscv/kernel/image-vars.h
diff --git a/arch/riscv/include/asm/sections.h b/arch/riscv/include/asm/sections.h
new file mode 100644
index 000000000000..3a9971b1210f
--- /dev/null
+++ b/arch/riscv/include/asm/sections.h
@@ -0,0 +1,13 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Copyright (C) 2020 Western Digital Corporation or its affiliates.
+ */
+#ifndef __ASM_SECTIONS_H
+#define __ASM_SECTIONS_H
+
+#include <asm-generic/sections.h>
+
+extern char _start[];
+extern char _start_kernel[];
+
+#endif /* __ASM_SECTIONS_H */
diff --git a/arch/riscv/kernel/Makefile b/arch/riscv/kernel/Makefile
index a5287ab9f7f2..eabec4dce50b 100644
--- a/arch/riscv/kernel/Makefile
+++ b/arch/riscv/kernel/Makefile
@@ -32,6 +32,10 @@ obj-y += patch.o
obj-$(CONFIG_MMU) += vdso.o vdso/
obj-$(CONFIG_RISCV_M_MODE) += clint.o traps_misaligned.o
+OBJCOPYFLAGS := --prefix-symbols=__efistub_
+$(obj)/%.stub.o: $(obj)/%.o FORCE
+ $(call if_changed,objcopy)
+
obj-$(CONFIG_FPU) += fpu.o
obj-$(CONFIG_SMP) += smpboot.o
obj-$(CONFIG_SMP) += smp.o
diff --git a/arch/riscv/kernel/efi-header.S b/arch/riscv/kernel/efi-header.S
new file mode 100644
index 000000000000..822b4c9ff2bb
--- /dev/null
+++ b/arch/riscv/kernel/efi-header.S
@@ -0,0 +1,104 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Copyright (C) 2020 Western Digital Corporation or its affiliates.
+ * Adapted from arch/arm64/kernel/efi-header.S
+ */
+
+#include <linux/pe.h>
+#include <linux/sizes.h>
+
+ .macro __EFI_PE_HEADER
+ .long PE_MAGIC
+coff_header:
+#ifdef CONFIG_64BIT
+ .short IMAGE_FILE_MACHINE_RISCV64 // Machine
+#else
+ .short IMAGE_FILE_MACHINE_RISCV32 // Machine
+#endif
+ .short section_count // NumberOfSections
+ .long 0 // TimeDateStamp
+ .long 0 // PointerToSymbolTable
+ .long 0 // NumberOfSymbols
+ .short section_table - optional_header // SizeOfOptionalHeader
+ .short IMAGE_FILE_DEBUG_STRIPPED | \
+ IMAGE_FILE_EXECUTABLE_IMAGE | \
+ IMAGE_FILE_LINE_NUMS_STRIPPED // Characteristics
+
+optional_header:
+ .short PE_OPT_MAGIC_PE32PLUS // PE32+ format
+ .byte 0x02 // MajorLinkerVersion
+ .byte 0x14 // MinorLinkerVersion
+ .long __pecoff_text_end - efi_header_end // SizeOfCode
+ .long __pecoff_data_virt_size // SizeOfInitializedData
+ .long 0 // SizeOfUninitializedData
+ .long __efistub_efi_pe_entry - _start // AddressOfEntryPoint
+ .long efi_header_end - _start // BaseOfCode
+
+extra_header_fields:
+ .quad 0 // ImageBase
+ .long PECOFF_SECTION_ALIGNMENT // SectionAlignment
+ .long PECOFF_FILE_ALIGNMENT // FileAlignment
+ .short 0 // MajorOperatingSystemVersion
+ .short 0 // MinorOperatingSystemVersion
+ .short LINUX_EFISTUB_MAJOR_VERSION // MajorImageVersion
+ .short LINUX_EFISTUB_MINOR_VERSION // MinorImageVersion
+ .short 0 // MajorSubsystemVersion
+ .short 0 // MinorSubsystemVersion
+ .long 0 // Win32VersionValue
+
+ .long _end - _start // SizeOfImage
+
+ // Everything before the kernel image is considered part of the header
+ .long efi_header_end - _start // SizeOfHeaders
+ .long 0 // CheckSum
+ .short IMAGE_SUBSYSTEM_EFI_APPLICATION // Subsystem
+ .short 0 // DllCharacteristics
+ .quad 0 // SizeOfStackReserve
+ .quad 0 // SizeOfStackCommit
+ .quad 0 // SizeOfHeapReserve
+ .quad 0 // SizeOfHeapCommit
+ .long 0 // LoaderFlags
+ .long (section_table - .) / 8 // NumberOfRvaAndSizes
+
+ .quad 0 // ExportTable
+ .quad 0 // ImportTable
+ .quad 0 // ResourceTable
+ .quad 0 // ExceptionTable
+ .quad 0 // CertificationTable
+ .quad 0 // BaseRelocationTable
+
+ // Section table
+section_table:
+ .ascii ".text\0\0\0"
+ .long __pecoff_text_end - efi_header_end // VirtualSize
+ .long efi_header_end - _start // VirtualAddress
+ .long __pecoff_text_end - efi_header_end // SizeOfRawData
+ .long efi_header_end - _start // PointerToRawData
+
+ .long 0 // PointerToRelocations
+ .long 0 // PointerToLineNumbers
+ .short 0 // NumberOfRelocations
+ .short 0 // NumberOfLineNumbers
+ .long IMAGE_SCN_CNT_CODE | \
+ IMAGE_SCN_MEM_READ | \
+ IMAGE_SCN_MEM_EXECUTE // Characteristics
+
+ .ascii ".data\0\0\0"
+ .long __pecoff_data_virt_size // VirtualSize
+ .long __pecoff_text_end - _start // VirtualAddress
+ .long __pecoff_data_raw_size // SizeOfRawData
+ .long __pecoff_text_end - _start // PointerToRawData
+
+ .long 0 // PointerToRelocations
+ .long 0 // PointerToLineNumbers
+ .short 0 // NumberOfRelocations
+ .short 0 // NumberOfLineNumbers
+ .long IMAGE_SCN_CNT_INITIALIZED_DATA | \
+ IMAGE_SCN_MEM_READ | \
+ IMAGE_SCN_MEM_WRITE // Characteristics
+
+ .set section_count, (. - section_table) / 40
+
+ .balign 0x1000
+efi_header_end:
+ .endm
diff --git a/arch/riscv/kernel/head.S b/arch/riscv/kernel/head.S
index a2f0cb3ca0a6..f5583473f41c 100644
--- a/arch/riscv/kernel/head.S
+++ b/arch/riscv/kernel/head.S
@@ -13,6 +13,7 @@
#include <asm/csr.h>
#include <asm/hwcap.h>
#include <asm/image.h>
+#include "efi-header.S"
__HEAD
ENTRY(_start)
@@ -22,10 +23,18 @@ ENTRY(_start)
* Do not modify it without modifying the structure and all bootloaders
* that expects this header format!!
*/
+#ifdef CONFIG_EFI
+ /*
+ * This instruction decodes to "MZ" ASCII required by UEFI.
+ */
+ c.li s4,-13
+ j _start_kernel
+#else
/* jump to start kernel */
j _start_kernel
/* reserved */
.word 0
+#endif
.balign 8
#if __riscv_xlen == 64
/* Image load offset(2MB) from start of RAM */
@@ -43,7 +52,14 @@ ENTRY(_start)
.ascii RISCV_IMAGE_MAGIC
.balign 4
.ascii RISCV_IMAGE_MAGIC2
+#ifdef CONFIG_EFI
+ .word pe_head_start - _start
+pe_head_start:
+
+ __EFI_PE_HEADER
+#else
.word 0
+#endif
.align 2
#ifdef CONFIG_MMU
diff --git a/arch/riscv/kernel/image-vars.h b/arch/riscv/kernel/image-vars.h
new file mode 100644
index 000000000000..8c212efb37a6
--- /dev/null
+++ b/arch/riscv/kernel/image-vars.h
@@ -0,0 +1,51 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Copyright (C) 2020 Western Digital Corporation or its affiliates.
+ * Linker script variables to be set after section resolution, as
+ * ld.lld does not like variables assigned before SECTIONS is processed.
+ * Based on arch/arm64/kernel/image-vars.h
+ */
+#ifndef __RISCV_KERNEL_IMAGE_VARS_H
+#define __RISCV_KERNEL_IMAGE_VARS_H
+
+#ifndef LINKER_SCRIPT
+#error This file should only be included in vmlinux.lds.S
+#endif
+
+#ifdef CONFIG_EFI
+
+/*
+ * The EFI stub has its own symbol namespace prefixed by __efistub_, to
+ * isolate it from the kernel proper. The following symbols are legally
+ * accessed by the stub, so provide some aliases to make them accessible.
+ * Only include data symbols here, or text symbols of functions that are
+ * guaranteed to be safe when executed at another offset than they were
+ * linked at. The routines below are all implemented in assembler in a
+ * position independent manner
+ */
+__efistub_memcmp = memcmp;
+__efistub_memchr = memchr;
+__efistub_memcpy = memcpy;
+__efistub_memmove = memmove;
+__efistub_memset = memset;
+__efistub_strlen = strlen;
+__efistub_strnlen = strnlen;
+__efistub_strcmp = strcmp;
+__efistub_strncmp = strncmp;
+__efistub_strrchr = strrchr;
+
+#ifdef CONFIG_KASAN
+__efistub___memcpy = memcpy;
+__efistub___memmove = memmove;
+__efistub___memset = memset;
+#endif
+
+__efistub__start = _start;
+__efistub__start_kernel = _start_kernel;
+__efistub__end = _end;
+__efistub__edata = _edata;
+__efistub_screen_info = screen_info;
+
+#endif
+
+#endif /* __RISCV_KERNEL_IMAGE_VARS_H */
diff --git a/arch/riscv/kernel/vmlinux.lds.S b/arch/riscv/kernel/vmlinux.lds.S
index f3586e31ed1e..6dcf790282dd 100644
--- a/arch/riscv/kernel/vmlinux.lds.S
+++ b/arch/riscv/kernel/vmlinux.lds.S
@@ -10,6 +10,7 @@
#include <asm/cache.h>
#include <asm/thread_info.h>
#include <asm/set_memory.h>
+#include "image-vars.h"
#include <linux/sizes.h>
OUTPUT_ARCH(riscv)
@@ -17,6 +18,9 @@ ENTRY(_start)
jiffies = jiffies_64;
+PECOFF_SECTION_ALIGNMENT = 0x1000;
+PECOFF_FILE_ALIGNMENT = 0x200;
+
SECTIONS
{
/* Beginning of code and text segment */
@@ -76,6 +80,10 @@ SECTIONS
EXCEPTION_TABLE(0x10)
+#ifdef CONFIG_EFI
+ . = ALIGN(PECOFF_SECTION_ALIGNMENT);
+ __pecoff_text_end = .;
+#endif
. = ALIGN(SECTION_ALIGN);
_data = .;
@@ -83,16 +91,26 @@ SECTIONS
.sdata : {
__global_pointer$ = . + 0x800;
*(.sdata*)
- /* End of data section */
- _edata = .;
}
+#ifdef CONFIG_EFI
+ .pecoff_edata_padding : { BYTE(0); . = ALIGN(PECOFF_FILE_ALIGNMENT); }
+ __pecoff_data_raw_size = ABSOLUTE(. - __pecoff_text_end);
+#endif
+
+ /* End of data section */
+ _edata = .;
+
BSS_SECTION(PAGE_SIZE, PAGE_SIZE, 0)
.rel.dyn : {
*(.rel.dyn*)
}
+#ifdef CONFIG_EFI
+ . = ALIGN(PECOFF_SECTION_ALIGNMENT);
+ __pecoff_data_virt_size = ABSOLUTE(. - __pecoff_text_end);
+#endif
_end = .;
STABS_DEBUG
--
2.24.0
UEFI uses early IO or memory mappings for runtime services before
normal ioremap() is usable. Add the necessary fixmap bindings and
pmd mappings for generic early ioremap support to work.
Signed-off-by: Atish Patra <[email protected]>
---
arch/riscv/Kconfig | 1 +
arch/riscv/include/asm/Kbuild | 1 +
arch/riscv/include/asm/fixmap.h | 13 +++++++++++++
arch/riscv/include/asm/io.h | 1 +
arch/riscv/kernel/setup.c | 1 +
arch/riscv/mm/init.c | 33 +++++++++++++++++++++++++++++++++
6 files changed, 50 insertions(+)
diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
index 7b5905529146..15597f5f504f 100644
--- a/arch/riscv/Kconfig
+++ b/arch/riscv/Kconfig
@@ -37,6 +37,7 @@ config RISCV
select GENERIC_ARCH_TOPOLOGY if SMP
select GENERIC_ATOMIC64 if !64BIT
select GENERIC_CLOCKEVENTS
+ select GENERIC_EARLY_IOREMAP
select GENERIC_GETTIMEOFDAY if HAVE_GENERIC_VDSO
select GENERIC_IOREMAP
select GENERIC_IRQ_MULTI_HANDLER
diff --git a/arch/riscv/include/asm/Kbuild b/arch/riscv/include/asm/Kbuild
index 3d9410bb4de0..59dd7be55005 100644
--- a/arch/riscv/include/asm/Kbuild
+++ b/arch/riscv/include/asm/Kbuild
@@ -1,4 +1,5 @@
# SPDX-License-Identifier: GPL-2.0
+generic-y += early_ioremap.h
generic-y += extable.h
generic-y += flat.h
generic-y += kvm_para.h
diff --git a/arch/riscv/include/asm/fixmap.h b/arch/riscv/include/asm/fixmap.h
index 11613f38228a..54cbf07fb4e9 100644
--- a/arch/riscv/include/asm/fixmap.h
+++ b/arch/riscv/include/asm/fixmap.h
@@ -27,6 +27,19 @@ enum fixed_addresses {
FIX_TEXT_POKE1,
FIX_TEXT_POKE0,
FIX_EARLYCON_MEM_BASE,
+
+ __end_of_permanent_fixed_addresses,
+ /*
+ * Temporary boot-time mappings, used by early_ioremap(),
+ * before ioremap() is functional.
+ */
+#define NR_FIX_BTMAPS (SZ_256K / PAGE_SIZE)
+#define FIX_BTMAPS_SLOTS 7
+#define TOTAL_FIX_BTMAPS (NR_FIX_BTMAPS * FIX_BTMAPS_SLOTS)
+
+ FIX_BTMAP_END = __end_of_permanent_fixed_addresses,
+ FIX_BTMAP_BEGIN = FIX_BTMAP_END + TOTAL_FIX_BTMAPS - 1,
+
__end_of_fixed_addresses
};
diff --git a/arch/riscv/include/asm/io.h b/arch/riscv/include/asm/io.h
index 3835c3295dc5..c025a746a148 100644
--- a/arch/riscv/include/asm/io.h
+++ b/arch/riscv/include/asm/io.h
@@ -14,6 +14,7 @@
#include <linux/types.h>
#include <linux/pgtable.h>
#include <asm/mmiowb.h>
+#include <asm/early_ioremap.h>
/*
* MMIO access functions are separated out to break dependency cycles
diff --git a/arch/riscv/kernel/setup.c b/arch/riscv/kernel/setup.c
index 6a0ee2405813..c71788e6aff4 100644
--- a/arch/riscv/kernel/setup.c
+++ b/arch/riscv/kernel/setup.c
@@ -72,6 +72,7 @@ void __init setup_arch(char **cmdline_p)
*cmdline_p = boot_command_line;
+ early_ioremap_setup();
parse_early_param();
setup_bootmem();
diff --git a/arch/riscv/mm/init.c b/arch/riscv/mm/init.c
index 2b651f63f5c4..b75ebe8e7a92 100644
--- a/arch/riscv/mm/init.c
+++ b/arch/riscv/mm/init.c
@@ -403,6 +403,9 @@ asmlinkage void __init setup_vm(uintptr_t dtb_pa)
uintptr_t load_pa = (uintptr_t)(&_start);
uintptr_t load_sz = (uintptr_t)(&_end) - load_pa;
uintptr_t map_size = best_map_size(load_pa, MAX_EARLY_MAPPING_SIZE);
+#ifndef __PAGETABLE_PMD_FOLDED
+ pmd_t fix_bmap_spmd, fix_bmap_epmd;
+#endif
va_pa_offset = PAGE_OFFSET - load_pa;
pfn_base = PFN_DOWN(load_pa);
@@ -456,6 +459,36 @@ asmlinkage void __init setup_vm(uintptr_t dtb_pa)
pa + PGDIR_SIZE, PGDIR_SIZE, PAGE_KERNEL);
dtb_early_va = (void *)DTB_EARLY_BASE_VA + (dtb_pa & (PGDIR_SIZE - 1));
dtb_early_pa = dtb_pa;
+
+ /*
+ * Boot-time fixmap can only handle PMD_SIZE mappings. Thus, the
+ * boot-ioremap range cannot span multiple pmds.
+ */
+ BUILD_BUG_ON((__fix_to_virt(FIX_BTMAP_BEGIN) >> PMD_SHIFT)
+ != (__fix_to_virt(FIX_BTMAP_END) >> PMD_SHIFT));
+
+#ifndef __PAGETABLE_PMD_FOLDED
+ /*
+ * The early ioremap fixmap is already created, as it lies within the
+ * first 2MB of the fixmap region. We always map PMD_SIZE, so both
+ * FIX_BTMAP_END and FIX_BTMAP_BEGIN should lie in the same pmd. Verify
+ * that and warn the user if not.
+ */
+ fix_bmap_spmd = fixmap_pmd[pmd_index(__fix_to_virt(FIX_BTMAP_BEGIN))];
+ fix_bmap_epmd = fixmap_pmd[pmd_index(__fix_to_virt(FIX_BTMAP_END))];
+ if (pmd_val(fix_bmap_spmd) != pmd_val(fix_bmap_epmd)) {
+ WARN_ON(1);
+ pr_warn("fixmap btmap start [%08lx] != end [%08lx]\n",
+ pmd_val(fix_bmap_spmd), pmd_val(fix_bmap_epmd));
+ pr_warn("fix_to_virt(FIX_BTMAP_BEGIN): %08lx\n",
+ fix_to_virt(FIX_BTMAP_BEGIN));
+ pr_warn("fix_to_virt(FIX_BTMAP_END): %08lx\n",
+ fix_to_virt(FIX_BTMAP_END));
+
+ pr_warn("FIX_BTMAP_END: %d\n", FIX_BTMAP_END);
+ pr_warn("FIX_BTMAP_BEGIN: %d\n", FIX_BTMAP_BEGIN);
+ }
+#endif
}
static void __init setup_vm_final(void)
--
2.24.0
On Thu, Aug 13, 2020 at 5:18 AM Atish Patra <[email protected]> wrote:
>
> UEFI uses early IO or memory mappings for runtime services before
> normal ioremap() is usable. Add the necessary fixmap bindings and
> pmd mappings for generic ioremap support to work.
>
> Signed-off-by: Atish Patra <[email protected]>
> ---
> arch/riscv/Kconfig | 1 +
> arch/riscv/include/asm/Kbuild | 1 +
> arch/riscv/include/asm/fixmap.h | 13 +++++++++++++
> arch/riscv/include/asm/io.h | 1 +
> arch/riscv/kernel/setup.c | 1 +
> arch/riscv/mm/init.c | 33 +++++++++++++++++++++++++++++++++
> 6 files changed, 50 insertions(+)
>
> diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
> index 7b5905529146..15597f5f504f 100644
> --- a/arch/riscv/Kconfig
> +++ b/arch/riscv/Kconfig
> @@ -37,6 +37,7 @@ config RISCV
> select GENERIC_ARCH_TOPOLOGY if SMP
> select GENERIC_ATOMIC64 if !64BIT
> select GENERIC_CLOCKEVENTS
> + select GENERIC_EARLY_IOREMAP
> select GENERIC_GETTIMEOFDAY if HAVE_GENERIC_VDSO
> select GENERIC_IOREMAP
> select GENERIC_IRQ_MULTI_HANDLER
> diff --git a/arch/riscv/include/asm/Kbuild b/arch/riscv/include/asm/Kbuild
> index 3d9410bb4de0..59dd7be55005 100644
> --- a/arch/riscv/include/asm/Kbuild
> +++ b/arch/riscv/include/asm/Kbuild
> @@ -1,4 +1,5 @@
> # SPDX-License-Identifier: GPL-2.0
> +generic-y += early_ioremap.h
> generic-y += extable.h
> generic-y += flat.h
> generic-y += kvm_para.h
> diff --git a/arch/riscv/include/asm/fixmap.h b/arch/riscv/include/asm/fixmap.h
> index 11613f38228a..54cbf07fb4e9 100644
> --- a/arch/riscv/include/asm/fixmap.h
> +++ b/arch/riscv/include/asm/fixmap.h
> @@ -27,6 +27,19 @@ enum fixed_addresses {
> FIX_TEXT_POKE1,
> FIX_TEXT_POKE0,
> FIX_EARLYCON_MEM_BASE,
> +
> + __end_of_permanent_fixed_addresses,
> + /*
> + * Temporary boot-time mappings, used by early_ioremap(),
> + * before ioremap() is functional.
> + */
> +#define NR_FIX_BTMAPS (SZ_256K / PAGE_SIZE)
> +#define FIX_BTMAPS_SLOTS 7
> +#define TOTAL_FIX_BTMAPS (NR_FIX_BTMAPS * FIX_BTMAPS_SLOTS)
> +
> + FIX_BTMAP_END = __end_of_permanent_fixed_addresses,
> + FIX_BTMAP_BEGIN = FIX_BTMAP_END + TOTAL_FIX_BTMAPS - 1,
> +
> __end_of_fixed_addresses
> };
>
> diff --git a/arch/riscv/include/asm/io.h b/arch/riscv/include/asm/io.h
> index 3835c3295dc5..c025a746a148 100644
> --- a/arch/riscv/include/asm/io.h
> +++ b/arch/riscv/include/asm/io.h
> @@ -14,6 +14,7 @@
> #include <linux/types.h>
> #include <linux/pgtable.h>
> #include <asm/mmiowb.h>
> +#include <asm/early_ioremap.h>
>
> /*
> * MMIO access functions are separated out to break dependency cycles
> diff --git a/arch/riscv/kernel/setup.c b/arch/riscv/kernel/setup.c
> index 6a0ee2405813..c71788e6aff4 100644
> --- a/arch/riscv/kernel/setup.c
> +++ b/arch/riscv/kernel/setup.c
> @@ -72,6 +72,7 @@ void __init setup_arch(char **cmdline_p)
>
> *cmdline_p = boot_command_line;
>
> + early_ioremap_setup();
> parse_early_param();
>
> setup_bootmem();
> diff --git a/arch/riscv/mm/init.c b/arch/riscv/mm/init.c
> index 2b651f63f5c4..b75ebe8e7a92 100644
> --- a/arch/riscv/mm/init.c
> +++ b/arch/riscv/mm/init.c
> @@ -403,6 +403,9 @@ asmlinkage void __init setup_vm(uintptr_t dtb_pa)
> uintptr_t load_pa = (uintptr_t)(&_start);
> uintptr_t load_sz = (uintptr_t)(&_end) - load_pa;
> uintptr_t map_size = best_map_size(load_pa, MAX_EARLY_MAPPING_SIZE);
> +#ifndef __PAGETABLE_PMD_FOLDED
> + pmd_t fix_bmap_spmd, fix_bmap_epmd;
> +#endif
>
> va_pa_offset = PAGE_OFFSET - load_pa;
> pfn_base = PFN_DOWN(load_pa);
> @@ -456,6 +459,36 @@ asmlinkage void __init setup_vm(uintptr_t dtb_pa)
> pa + PGDIR_SIZE, PGDIR_SIZE, PAGE_KERNEL);
> dtb_early_va = (void *)DTB_EARLY_BASE_VA + (dtb_pa & (PGDIR_SIZE - 1));
> dtb_early_pa = dtb_pa;
> +
> + /*
> + * Boot-time fixmap can only handle PMD_SIZE mappings. Thus, the
> + * boot-ioremap range cannot span multiple pmds.
> + */
> + BUILD_BUG_ON((__fix_to_virt(FIX_BTMAP_BEGIN) >> PMD_SHIFT)
> + != (__fix_to_virt(FIX_BTMAP_END) >> PMD_SHIFT));
> +
> +#ifndef __PAGETABLE_PMD_FOLDED
> + /*
> + * The early ioremap fixmap is already created, as it lies within the
> + * first 2MB of the fixmap region. We always map PMD_SIZE, so both
> + * FIX_BTMAP_END and FIX_BTMAP_BEGIN should lie in the same pmd. Verify
> + * that and warn the user if not.
> + */
> + fix_bmap_spmd = fixmap_pmd[pmd_index(__fix_to_virt(FIX_BTMAP_BEGIN))];
> + fix_bmap_epmd = fixmap_pmd[pmd_index(__fix_to_virt(FIX_BTMAP_END))];
> + if (pmd_val(fix_bmap_spmd) != pmd_val(fix_bmap_epmd)) {
> + WARN_ON(1);
> + pr_warn("fixmap btmap start [%08lx] != end [%08lx]\n",
> + pmd_val(fix_bmap_spmd), pmd_val(fix_bmap_epmd));
> + pr_warn("fix_to_virt(FIX_BTMAP_BEGIN): %08lx\n",
> + fix_to_virt(FIX_BTMAP_BEGIN));
> + pr_warn("fix_to_virt(FIX_BTMAP_END): %08lx\n",
> + fix_to_virt(FIX_BTMAP_END));
> +
> + pr_warn("FIX_BTMAP_END: %d\n", FIX_BTMAP_END);
> + pr_warn("FIX_BTMAP_BEGIN: %d\n", FIX_BTMAP_BEGIN);
> + }
> +#endif
> }
>
> static void __init setup_vm_final(void)
> --
> 2.24.0
>
Looks good to me.
Reviewed-by: Anup Patel <[email protected]>
Regards,
Anup
On Thu, Aug 13, 2020 at 5:19 AM Atish Patra <[email protected]> wrote:
>
> Currently, page table setup is done during setup_vm_final, where fixmap can
> be used to create the temporary mappings. The physical frame is allocated
> by the memblock_alloc_* functions. However, this won't work if a page table
> mapping needs to be created for a different mm context (i.e. efi mm) at
> a later point in time.
>
> Use the generic kernel page allocation functions and macros for any mapping
> created after setup_vm_final.
>
> Signed-off-by: Atish Patra <[email protected]>
> ---
> arch/riscv/mm/init.c | 130 ++++++++++++++++++++++++++++++++-----------
> 1 file changed, 99 insertions(+), 31 deletions(-)
>
> diff --git a/arch/riscv/mm/init.c b/arch/riscv/mm/init.c
> index b75ebe8e7a92..d238cdc501ee 100644
> --- a/arch/riscv/mm/init.c
> +++ b/arch/riscv/mm/init.c
> @@ -32,6 +32,17 @@ extern char _start[];
> void *dtb_early_va __initdata;
> uintptr_t dtb_early_pa __initdata;
>
> +struct pt_alloc_ops {
> + pte_t *(*get_pte_virt)(phys_addr_t pa);
> + phys_addr_t (*alloc_pte)(uintptr_t va);
> +#ifndef __PAGETABLE_PMD_FOLDED
> + pmd_t *(*get_pmd_virt)(phys_addr_t pa);
> + phys_addr_t (*alloc_pmd)(uintptr_t va);
> +#endif
> +};
> +
> +struct pt_alloc_ops pt_ops;
> +
> static void __init zone_sizes_init(void)
> {
> unsigned long max_zone_pfns[MAX_NR_ZONES] = { 0, };
> @@ -211,7 +222,6 @@ EXPORT_SYMBOL(pfn_base);
> pgd_t swapper_pg_dir[PTRS_PER_PGD] __page_aligned_bss;
> pgd_t trampoline_pg_dir[PTRS_PER_PGD] __page_aligned_bss;
> pte_t fixmap_pte[PTRS_PER_PTE] __page_aligned_bss;
> -static bool mmu_enabled;
>
> #define MAX_EARLY_MAPPING_SIZE SZ_128M
>
> @@ -234,27 +244,46 @@ void __set_fixmap(enum fixed_addresses idx, phys_addr_t phys, pgprot_t prot)
> }
> }
>
> -static pte_t *__init get_pte_virt(phys_addr_t pa)
> +static inline pte_t *__init get_pte_virt_early(phys_addr_t pa)
> {
> - if (mmu_enabled) {
> - clear_fixmap(FIX_PTE);
> - return (pte_t *)set_fixmap_offset(FIX_PTE, pa);
> - } else {
> - return (pte_t *)((uintptr_t)pa);
> - }
> + return (pte_t *)((uintptr_t)pa);
> }
>
> -static phys_addr_t __init alloc_pte(uintptr_t va)
> +static inline pte_t *__init get_pte_virt_fixmap(phys_addr_t pa)
> +{
> + clear_fixmap(FIX_PTE);
> + return (pte_t *)set_fixmap_offset(FIX_PTE, pa);
> +}
> +
> +static inline pte_t *get_pte_virt_late(phys_addr_t pa)
> +{
> + return (pte_t *) __va(pa);
> +}
> +
> +static inline phys_addr_t __init alloc_pte_early(uintptr_t va)
> {
> /*
> * We only create PMD or PGD early mappings so we
> * should never reach here with MMU disabled.
> */
> - BUG_ON(!mmu_enabled);
> + BUG();
> +}
>
> +static inline phys_addr_t __init alloc_pte_fixmap(uintptr_t va)
> +{
> return memblock_phys_alloc(PAGE_SIZE, PAGE_SIZE);
> }
>
> +static phys_addr_t alloc_pte_late(uintptr_t va)
> +{
> + unsigned long vaddr;
> +
> + vaddr = __get_free_page(GFP_KERNEL);
> + if (!vaddr || !pgtable_pte_page_ctor(virt_to_page(vaddr)))
> + BUG();
> + return __pa(vaddr);
> +}
> +
> static void __init create_pte_mapping(pte_t *ptep,
> uintptr_t va, phys_addr_t pa,
> phys_addr_t sz, pgprot_t prot)
> @@ -279,28 +308,46 @@ pmd_t fixmap_pmd[PTRS_PER_PMD] __page_aligned_bss;
> #endif
> pmd_t early_pmd[PTRS_PER_PMD * NUM_EARLY_PMDS] __initdata __aligned(PAGE_SIZE);
>
> -static pmd_t *__init get_pmd_virt(phys_addr_t pa)
> +static pmd_t *__init get_pmd_virt_early(phys_addr_t pa)
> {
> - if (mmu_enabled) {
> - clear_fixmap(FIX_PMD);
> - return (pmd_t *)set_fixmap_offset(FIX_PMD, pa);
> - } else {
> - return (pmd_t *)((uintptr_t)pa);
> - }
> + /* Before MMU is enabled */
> + return (pmd_t *)((uintptr_t)pa);
> }
>
> -static phys_addr_t __init alloc_pmd(uintptr_t va)
> +static pmd_t *__init get_pmd_virt_fixmap(phys_addr_t pa)
> {
> - uintptr_t pmd_num;
> + clear_fixmap(FIX_PMD);
> + return (pmd_t *)set_fixmap_offset(FIX_PMD, pa);
> +}
> +
> +static pmd_t *get_pmd_virt_late(phys_addr_t pa)
> +{
> + return (pmd_t *) __va(pa);
> +}
>
> - if (mmu_enabled)
> - return memblock_phys_alloc(PAGE_SIZE, PAGE_SIZE);
> +static phys_addr_t __init alloc_pmd_early(uintptr_t va)
> +{
> + uintptr_t pmd_num;
>
> pmd_num = (va - PAGE_OFFSET) >> PGDIR_SHIFT;
> BUG_ON(pmd_num >= NUM_EARLY_PMDS);
> return (uintptr_t)&early_pmd[pmd_num * PTRS_PER_PMD];
> }
>
> +static phys_addr_t __init alloc_pmd_fixmap(uintptr_t va)
> +{
> + return memblock_phys_alloc(PAGE_SIZE, PAGE_SIZE);
> +}
> +
> +static phys_addr_t alloc_pmd_late(uintptr_t va)
> +{
> + unsigned long vaddr;
> +
> + vaddr = __get_free_page(GFP_KERNEL);
> + BUG_ON(!vaddr);
> + return __pa(vaddr);
> +}
> +
> static void __init create_pmd_mapping(pmd_t *pmdp,
> uintptr_t va, phys_addr_t pa,
> phys_addr_t sz, pgprot_t prot)
> @@ -316,28 +363,28 @@ static void __init create_pmd_mapping(pmd_t *pmdp,
> }
>
> if (pmd_none(pmdp[pmd_idx])) {
> - pte_phys = alloc_pte(va);
> + pte_phys = pt_ops.alloc_pte(va);
> pmdp[pmd_idx] = pfn_pmd(PFN_DOWN(pte_phys), PAGE_TABLE);
> - ptep = get_pte_virt(pte_phys);
> + ptep = pt_ops.get_pte_virt(pte_phys);
> memset(ptep, 0, PAGE_SIZE);
> } else {
> pte_phys = PFN_PHYS(_pmd_pfn(pmdp[pmd_idx]));
> - ptep = get_pte_virt(pte_phys);
> + ptep = pt_ops.get_pte_virt(pte_phys);
> }
>
> create_pte_mapping(ptep, va, pa, sz, prot);
> }
>
> #define pgd_next_t pmd_t
> -#define alloc_pgd_next(__va) alloc_pmd(__va)
> -#define get_pgd_next_virt(__pa) get_pmd_virt(__pa)
> +#define alloc_pgd_next(__va) pt_ops.alloc_pmd(__va)
> +#define get_pgd_next_virt(__pa) pt_ops.get_pmd_virt(__pa)
> #define create_pgd_next_mapping(__nextp, __va, __pa, __sz, __prot) \
> create_pmd_mapping(__nextp, __va, __pa, __sz, __prot)
> #define fixmap_pgd_next fixmap_pmd
> #else
> #define pgd_next_t pte_t
> -#define alloc_pgd_next(__va) alloc_pte(__va)
> -#define get_pgd_next_virt(__pa) get_pte_virt(__pa)
> +#define alloc_pgd_next(__va) pt_ops.alloc_pte(__va)
> +#define get_pgd_next_virt(__pa) pt_ops.get_pte_virt(__pa)
> #define create_pgd_next_mapping(__nextp, __va, __pa, __sz, __prot) \
> create_pte_mapping(__nextp, __va, __pa, __sz, __prot)
> #define fixmap_pgd_next fixmap_pte
> @@ -421,6 +468,12 @@ asmlinkage void __init setup_vm(uintptr_t dtb_pa)
> BUG_ON((load_pa % map_size) != 0);
> BUG_ON(load_sz > MAX_EARLY_MAPPING_SIZE);
>
> + pt_ops.alloc_pte = alloc_pte_early;
> + pt_ops.get_pte_virt = get_pte_virt_early;
> +#ifndef __PAGETABLE_PMD_FOLDED
> + pt_ops.alloc_pmd = alloc_pmd_early;
> + pt_ops.get_pmd_virt = get_pmd_virt_early;
> +#endif
> /* Setup early PGD for fixmap */
> create_pgd_mapping(early_pg_dir, FIXADDR_START,
> (uintptr_t)fixmap_pgd_next, PGDIR_SIZE, PAGE_TABLE);
> @@ -497,9 +550,16 @@ static void __init setup_vm_final(void)
> phys_addr_t pa, start, end;
> struct memblock_region *reg;
>
> - /* Set mmu_enabled flag */
> - mmu_enabled = true;
> -
> + /**
> + * MMU is enabled at this point. But page table setup is not complete yet.
> + * fixmap page table alloc functions should be used at this point
> + */
> + pt_ops.alloc_pte = alloc_pte_fixmap;
> + pt_ops.get_pte_virt = get_pte_virt_fixmap;
> +#ifndef __PAGETABLE_PMD_FOLDED
> + pt_ops.alloc_pmd = alloc_pmd_fixmap;
> + pt_ops.get_pmd_virt = get_pmd_virt_fixmap;
> +#endif
> /* Setup swapper PGD for fixmap */
> create_pgd_mapping(swapper_pg_dir, FIXADDR_START,
> __pa_symbol(fixmap_pgd_next),
> @@ -533,6 +593,14 @@ static void __init setup_vm_final(void)
> /* Move to swapper page table */
> csr_write(CSR_SATP, PFN_DOWN(__pa_symbol(swapper_pg_dir)) | SATP_MODE);
> local_flush_tlb_all();
> +
> + /* generic page allocation functions must be used to setup page table */
> + pt_ops.alloc_pte = alloc_pte_late;
> + pt_ops.get_pte_virt = get_pte_virt_late;
> +#ifndef __PAGETABLE_PMD_FOLDED
> + pt_ops.alloc_pmd = alloc_pmd_late;
> + pt_ops.get_pmd_virt = get_pmd_virt_late;
> +#endif
> }
> #else
> asmlinkage void __init setup_vm(uintptr_t dtb_pa)
> --
> 2.24.0
>
Looks good to me.
Reviewed-by: Anup Patel <[email protected]>
Regards,
Anup
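
For readers following the thread: the patch quoted above replaces the
mmu_enabled flag with a pt_ops table whose function pointers are reassigned
as boot progresses (early, fixmap, late). Below is a minimal user-space
sketch of that staging. Every name ending in _sketch and the marker return
values are assumptions made for illustration and are not part of the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t phys_addr_sketch_t;

/* Marker values standing in for "where the page came from" (hypothetical). */
enum {
    SKETCH_FROM_MEMBLOCK = 1,   /* fixmap phase: memblock_phys_alloc() */
    SKETCH_FROM_PAGE_ALLOC = 2  /* late phase: __get_free_page() */
};

/* Hypothetical mirror of struct pt_alloc_ops, reduced to one callback. */
struct pt_alloc_ops_sketch {
    phys_addr_sketch_t (*alloc_pte)(uintptr_t va);
};

static struct pt_alloc_ops_sketch pt_ops_sketch;

static phys_addr_sketch_t alloc_pte_fixmap_sketch(uintptr_t va)
{
    (void)va;
    return SKETCH_FROM_MEMBLOCK;
}

static phys_addr_sketch_t alloc_pte_late_sketch(uintptr_t va)
{
    (void)va;
    return SKETCH_FROM_PAGE_ALLOC;
}

/* Corresponds to the assignments at the top of setup_vm_final(). */
static void enter_fixmap_phase_sketch(void)
{
    pt_ops_sketch.alloc_pte = alloc_pte_fixmap_sketch;
}

/* Corresponds to the assignments after the switch to swapper_pg_dir. */
static void enter_late_phase_sketch(void)
{
    pt_ops_sketch.alloc_pte = alloc_pte_late_sketch;
}

/* Callers never test a flag; they go through whatever is installed. */
static phys_addr_sketch_t create_mapping_sketch(uintptr_t va)
{
    assert(pt_ops_sketch.alloc_pte != NULL);
    return pt_ops_sketch.alloc_pte(va);
}
```

The mapping code stays the same in every phase. Only the installed
callbacks change, which is what lets a later caller such as the EFI runtime
code create mappings for another mm with the regular page allocator.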
On Thu, Aug 13, 2020 at 5:18 AM Atish Patra <[email protected]> wrote:
>
> The Linux kernel Image can appear as an EFI application with appropriate
> PE/COFF header fields at the beginning of the Image header. An EFI
> application loader can then load a Linux kernel Image directly, and the
> EFI stub residing in the kernel can boot Linux.
>
> Add the necessary PE/COFF header.
>
> Signed-off-by: Atish Patra <[email protected]>
> Link: https://lore.kernel.org/r/[email protected]
> [ardb: - use C prefix for c.li to ensure the expected opcode is emitted
> - align all image sections according to PE/COFF section alignment ]
> Signed-off-by: Ard Biesheuvel <[email protected]>
> ---
> arch/riscv/include/asm/sections.h | 13 ++++
> arch/riscv/kernel/Makefile | 4 ++
> arch/riscv/kernel/efi-header.S | 104 ++++++++++++++++++++++++++++++
> arch/riscv/kernel/head.S | 16 +++++
> arch/riscv/kernel/image-vars.h | 51 +++++++++++++++
> arch/riscv/kernel/vmlinux.lds.S | 22 ++++++-
> 6 files changed, 208 insertions(+), 2 deletions(-)
> create mode 100644 arch/riscv/include/asm/sections.h
> create mode 100644 arch/riscv/kernel/efi-header.S
> create mode 100644 arch/riscv/kernel/image-vars.h
>
> diff --git a/arch/riscv/include/asm/sections.h b/arch/riscv/include/asm/sections.h
> new file mode 100644
> index 000000000000..3a9971b1210f
> --- /dev/null
> +++ b/arch/riscv/include/asm/sections.h
> @@ -0,0 +1,13 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +/*
> + * Copyright (C) 2020 Western Digital Corporation or its affiliates.
> + */
> +#ifndef __ASM_SECTIONS_H
> +#define __ASM_SECTIONS_H
> +
> +#include <asm-generic/sections.h>
> +
> +extern char _start[];
> +extern char _start_kernel[];
> +
> +#endif /* __ASM_SECTIONS_H */
> diff --git a/arch/riscv/kernel/Makefile b/arch/riscv/kernel/Makefile
> index a5287ab9f7f2..eabec4dce50b 100644
> --- a/arch/riscv/kernel/Makefile
> +++ b/arch/riscv/kernel/Makefile
> @@ -32,6 +32,10 @@ obj-y += patch.o
> obj-$(CONFIG_MMU) += vdso.o vdso/
>
> obj-$(CONFIG_RISCV_M_MODE) += clint.o traps_misaligned.o
> +OBJCOPYFLAGS := --prefix-symbols=__efistub_
> +$(obj)/%.stub.o: $(obj)/%.o FORCE
> + $(call if_changed,objcopy)
> +
> obj-$(CONFIG_FPU) += fpu.o
> obj-$(CONFIG_SMP) += smpboot.o
> obj-$(CONFIG_SMP) += smp.o
> diff --git a/arch/riscv/kernel/efi-header.S b/arch/riscv/kernel/efi-header.S
> new file mode 100644
> index 000000000000..822b4c9ff2bb
> --- /dev/null
> +++ b/arch/riscv/kernel/efi-header.S
> @@ -0,0 +1,104 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +/*
> + * Copyright (C) 2020 Western Digital Corporation or its affiliates.
> + * Adapted from arch/arm64/kernel/efi-header.S
> + */
> +
> +#include <linux/pe.h>
> +#include <linux/sizes.h>
> +
> + .macro __EFI_PE_HEADER
> + .long PE_MAGIC
> +coff_header:
> +#ifdef CONFIG_64BIT
> + .short IMAGE_FILE_MACHINE_RISCV64 // Machine
> +#else
> + .short IMAGE_FILE_MACHINE_RISCV32 // Machine
> +#endif
> + .short section_count // NumberOfSections
> + .long 0 // TimeDateStamp
> + .long 0 // PointerToSymbolTable
> + .long 0 // NumberOfSymbols
> + .short section_table - optional_header // SizeOfOptionalHeader
> + .short IMAGE_FILE_DEBUG_STRIPPED | \
> + IMAGE_FILE_EXECUTABLE_IMAGE | \
> + IMAGE_FILE_LINE_NUMS_STRIPPED // Characteristics
> +
> +optional_header:
> + .short PE_OPT_MAGIC_PE32PLUS // PE32+ format
> + .byte 0x02 // MajorLinkerVersion
> + .byte 0x14 // MinorLinkerVersion
> + .long __pecoff_text_end - efi_header_end // SizeOfCode
> + .long __pecoff_data_virt_size // SizeOfInitializedData
> + .long 0 // SizeOfUninitializedData
> + .long __efistub_efi_pe_entry - _start // AddressOfEntryPoint
> + .long efi_header_end - _start // BaseOfCode
> +
> +extra_header_fields:
> + .quad 0 // ImageBase
> + .long PECOFF_SECTION_ALIGNMENT // SectionAlignment
> + .long PECOFF_FILE_ALIGNMENT // FileAlignment
> + .short 0 // MajorOperatingSystemVersion
> + .short 0 // MinorOperatingSystemVersion
> + .short LINUX_EFISTUB_MAJOR_VERSION // MajorImageVersion
> + .short LINUX_EFISTUB_MINOR_VERSION // MinorImageVersion
> + .short 0 // MajorSubsystemVersion
> + .short 0 // MinorSubsystemVersion
> + .long 0 // Win32VersionValue
> +
> + .long _end - _start // SizeOfImage
> +
> + // Everything before the kernel image is considered part of the header
> + .long efi_header_end - _start // SizeOfHeaders
> + .long 0 // CheckSum
> + .short IMAGE_SUBSYSTEM_EFI_APPLICATION // Subsystem
> + .short 0 // DllCharacteristics
> + .quad 0 // SizeOfStackReserve
> + .quad 0 // SizeOfStackCommit
> + .quad 0 // SizeOfHeapReserve
> + .quad 0 // SizeOfHeapCommit
> + .long 0 // LoaderFlags
> + .long (section_table - .) / 8 // NumberOfRvaAndSizes
> +
> + .quad 0 // ExportTable
> + .quad 0 // ImportTable
> + .quad 0 // ResourceTable
> + .quad 0 // ExceptionTable
> + .quad 0 // CertificationTable
> + .quad 0 // BaseRelocationTable
> +
> + // Section table
> +section_table:
> + .ascii ".text\0\0\0"
> + .long __pecoff_text_end - efi_header_end // VirtualSize
> + .long efi_header_end - _start // VirtualAddress
> + .long __pecoff_text_end - efi_header_end // SizeOfRawData
> + .long efi_header_end - _start // PointerToRawData
> +
> + .long 0 // PointerToRelocations
> + .long 0 // PointerToLineNumbers
> + .short 0 // NumberOfRelocations
> + .short 0 // NumberOfLineNumbers
> + .long IMAGE_SCN_CNT_CODE | \
> + IMAGE_SCN_MEM_READ | \
> + IMAGE_SCN_MEM_EXECUTE // Characteristics
> +
> + .ascii ".data\0\0\0"
> + .long __pecoff_data_virt_size // VirtualSize
> + .long __pecoff_text_end - _start // VirtualAddress
> + .long __pecoff_data_raw_size // SizeOfRawData
> + .long __pecoff_text_end - _start // PointerToRawData
> +
> + .long 0 // PointerToRelocations
> + .long 0 // PointerToLineNumbers
> + .short 0 // NumberOfRelocations
> + .short 0 // NumberOfLineNumbers
> + .long IMAGE_SCN_CNT_INITIALIZED_DATA | \
> + IMAGE_SCN_MEM_READ | \
> + IMAGE_SCN_MEM_WRITE // Characteristics
> +
> + .set section_count, (. - section_table) / 40
> +
> + .balign 0x1000
> +efi_header_end:
> + .endm
> diff --git a/arch/riscv/kernel/head.S b/arch/riscv/kernel/head.S
> index a2f0cb3ca0a6..f5583473f41c 100644
> --- a/arch/riscv/kernel/head.S
> +++ b/arch/riscv/kernel/head.S
> @@ -13,6 +13,7 @@
> #include <asm/csr.h>
> #include <asm/hwcap.h>
> #include <asm/image.h>
> +#include "efi-header.S"
>
> __HEAD
> ENTRY(_start)
> @@ -22,10 +23,18 @@ ENTRY(_start)
> * Do not modify it without modifying the structure and all bootloaders
> * that expects this header format!!
> */
> +#ifdef CONFIG_EFI
> + /*
> + * This instruction decodes to "MZ" ASCII required by UEFI.
> + */
> + c.li s4,-13
> + j _start_kernel
> +#else
> /* jump to start kernel */
> j _start_kernel
> /* reserved */
> .word 0
> +#endif
> .balign 8
> #if __riscv_xlen == 64
> /* Image load offset(2MB) from start of RAM */
> @@ -43,7 +52,14 @@ ENTRY(_start)
> .ascii RISCV_IMAGE_MAGIC
> .balign 4
> .ascii RISCV_IMAGE_MAGIC2
> +#ifdef CONFIG_EFI
> + .word pe_head_start - _start
> +pe_head_start:
> +
> + __EFI_PE_HEADER
> +#else
> .word 0
> +#endif
>
> .align 2
> #ifdef CONFIG_MMU
> diff --git a/arch/riscv/kernel/image-vars.h b/arch/riscv/kernel/image-vars.h
> new file mode 100644
> index 000000000000..8c212efb37a6
> --- /dev/null
> +++ b/arch/riscv/kernel/image-vars.h
> @@ -0,0 +1,51 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +/*
> + * Copyright (C) 2020 Western Digital Corporation or its affiliates.
> + * Linker script variables to be set after section resolution, as
> + * ld.lld does not like variables assigned before SECTIONS is processed.
> + * Based on arch/arm64/kernel/image-vars.h
> + */
> +#ifndef __RISCV_KERNEL_IMAGE_VARS_H
> +#define __RISCV_KERNEL_IMAGE_VARS_H
> +
> +#ifndef LINKER_SCRIPT
> +#error This file should only be included in vmlinux.lds.S
> +#endif
> +
> +#ifdef CONFIG_EFI
> +
> +/*
> + * The EFI stub has its own symbol namespace prefixed by __efistub_, to
> + * isolate it from the kernel proper. The following symbols are legally
> + * accessed by the stub, so provide some aliases to make them accessible.
> + * Only include data symbols here, or text symbols of functions that are
> + * guaranteed to be safe when executed at another offset than they were
> + * linked at. The routines below are all implemented in assembler in a
> + * position independent manner
> + */
> +__efistub_memcmp = memcmp;
> +__efistub_memchr = memchr;
> +__efistub_memcpy = memcpy;
> +__efistub_memmove = memmove;
> +__efistub_memset = memset;
> +__efistub_strlen = strlen;
> +__efistub_strnlen = strnlen;
> +__efistub_strcmp = strcmp;
> +__efistub_strncmp = strncmp;
> +__efistub_strrchr = strrchr;
> +
> +#ifdef CONFIG_KASAN
> +__efistub___memcpy = memcpy;
> +__efistub___memmove = memmove;
> +__efistub___memset = memset;
> +#endif
> +
> +__efistub__start = _start;
> +__efistub__start_kernel = _start_kernel;
> +__efistub__end = _end;
> +__efistub__edata = _edata;
> +__efistub_screen_info = screen_info;
> +
> +#endif
> +
> +#endif /* __RISCV_KERNEL_IMAGE_VARS_H */
> diff --git a/arch/riscv/kernel/vmlinux.lds.S b/arch/riscv/kernel/vmlinux.lds.S
> index f3586e31ed1e..6dcf790282dd 100644
> --- a/arch/riscv/kernel/vmlinux.lds.S
> +++ b/arch/riscv/kernel/vmlinux.lds.S
> @@ -10,6 +10,7 @@
> #include <asm/cache.h>
> #include <asm/thread_info.h>
> #include <asm/set_memory.h>
> +#include "image-vars.h"
>
> #include <linux/sizes.h>
> OUTPUT_ARCH(riscv)
> @@ -17,6 +18,9 @@ ENTRY(_start)
>
> jiffies = jiffies_64;
>
> +PECOFF_SECTION_ALIGNMENT = 0x1000;
> +PECOFF_FILE_ALIGNMENT = 0x200;
> +
> SECTIONS
> {
> /* Beginning of code and text segment */
> @@ -76,6 +80,10 @@ SECTIONS
>
> EXCEPTION_TABLE(0x10)
>
> +#ifdef CONFIG_EFI
> + . = ALIGN(PECOFF_SECTION_ALIGNMENT);
> + __pecoff_text_end = .;
> +#endif
> . = ALIGN(SECTION_ALIGN);
> _data = .;
>
> @@ -83,16 +91,26 @@ SECTIONS
> .sdata : {
> __global_pointer$ = . + 0x800;
> *(.sdata*)
> - /* End of data section */
> - _edata = .;
> }
>
> +#ifdef CONFIG_EFI
> + .pecoff_edata_padding : { BYTE(0); . = ALIGN(PECOFF_FILE_ALIGNMENT); }
> + __pecoff_data_raw_size = ABSOLUTE(. - __pecoff_text_end);
> +#endif
> +
> + /* End of data section */
> + _edata = .;
> +
> BSS_SECTION(PAGE_SIZE, PAGE_SIZE, 0)
>
> .rel.dyn : {
> *(.rel.dyn*)
> }
>
> +#ifdef CONFIG_EFI
> + . = ALIGN(PECOFF_SECTION_ALIGNMENT);
> + __pecoff_data_virt_size = ABSOLUTE(. - __pecoff_text_end);
> +#endif
> _end = .;
>
> STABS_DEBUG
> --
> 2.24.0
>
Looks good to me.
Reviewed-by: Anup Patel <[email protected]>
Regards,
Anup
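
The head.S comment in the patch quoted above says that "c.li s4,-13" decodes
to the "MZ" ASCII signature. The sketch below works through that arithmetic.
It assumes the standard RVC CI-format encoding of c.li (funct3 = 010,
imm[5], rd, imm[4:0], op = 01). The helper name is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Encode "c.li rd, imm" as a 16-bit RVC instruction (hypothetical helper).
 * Layout: [15:13] funct3=010 | [12] imm[5] | [11:7] rd | [6:2] imm[4:0] | [1:0] 01
 */
static uint16_t encode_c_li_sketch(unsigned int rd, int imm)
{
    uint16_t imm6;

    assert(rd < 32);
    assert(imm >= -32 && imm <= 31);    /* 6-bit signed immediate */

    imm6 = (uint16_t)imm & 0x3f;        /* two's complement, low 6 bits */

    return (uint16_t)((0x2u << 13) |
                      ((unsigned int)(imm6 >> 5) << 12) |
                      (rd << 7) |
                      ((unsigned int)(imm6 & 0x1f) << 2) |
                      0x1u);
}
```

With rd = 20 (s4 is x20) and imm = -13 this yields 0x5a4d. Stored
little-endian, the bytes are 0x4d 0x5a, i.e. 'M' followed by 'Z', which is
the signature a PE/COFF loader expects at offset 0.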
On Thu, 13 Aug 2020 at 01:48, Atish Patra <[email protected]> wrote:
>
> This patch adds EFI runtime service support for RISC-V.
>
> Signed-off-by: Atish Patra <[email protected]>
Acked-by: Ard Biesheuvel <[email protected]>
> ---
> arch/riscv/Kconfig | 2 +
> arch/riscv/include/asm/efi.h | 20 ++++
> arch/riscv/include/asm/mmu.h | 2 +
> arch/riscv/include/asm/pgtable.h | 4 +
> arch/riscv/kernel/Makefile | 1 +
> arch/riscv/kernel/efi.c | 105 +++++++++++++++++
> arch/riscv/kernel/setup.c | 7 +-
> arch/riscv/mm/init.c | 2 +-
> drivers/firmware/efi/Makefile | 2 +
> drivers/firmware/efi/libstub/efi-stub.c | 11 +-
> drivers/firmware/efi/riscv-runtime.c | 143 ++++++++++++++++++++++++
> 11 files changed, 295 insertions(+), 4 deletions(-)
> create mode 100644 arch/riscv/kernel/efi.c
> create mode 100644 drivers/firmware/efi/riscv-runtime.c
>
> diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
> index e11907cc7a43..b2164109483d 100644
> --- a/arch/riscv/Kconfig
> +++ b/arch/riscv/Kconfig
> @@ -412,7 +412,9 @@ config EFI
> select EFI_PARAMS_FROM_FDT
> select EFI_STUB
> select EFI_GENERIC_STUB
> + select EFI_RUNTIME_WRAPPERS
> select RISCV_ISA_C
> + depends on MMU
> default y
> help
> This option provides support for runtime services provided
> diff --git a/arch/riscv/include/asm/efi.h b/arch/riscv/include/asm/efi.h
> index 86da231909bb..93c305a638f4 100644
> --- a/arch/riscv/include/asm/efi.h
> +++ b/arch/riscv/include/asm/efi.h
> @@ -5,11 +5,28 @@
> #ifndef _ASM_EFI_H
> #define _ASM_EFI_H
>
> +#include <asm/csr.h>
> #include <asm/io.h>
> #include <asm/mmu_context.h>
> #include <asm/ptrace.h>
> #include <asm/tlbflush.h>
>
> +#ifdef CONFIG_EFI
> +extern void efi_init(void);
> +#else
> +#define efi_init()
> +#endif
> +
> +int efi_create_mapping(struct mm_struct *mm, efi_memory_desc_t *md);
> +int efi_set_mapping_permissions(struct mm_struct *mm, efi_memory_desc_t *md);
> +
> +#define arch_efi_call_virt_setup() efi_virtmap_load()
> +#define arch_efi_call_virt_teardown() efi_virtmap_unload()
> +
> +#define arch_efi_call_virt(p, f, args...) p->f(args)
> +
> +#define ARCH_EFI_IRQ_FLAGS_MASK (SR_IE | SR_SPIE)
> +
> /* on RISC-V, the FDT may be located anywhere in system RAM */
> static inline unsigned long efi_get_max_fdt_addr(unsigned long dram_base)
> {
> @@ -33,4 +50,7 @@ static inline void efifb_setup_from_dmi(struct screen_info *si, const char *opt)
> {
> }
>
> +void efi_virtmap_load(void);
> +void efi_virtmap_unload(void);
> +
> #endif /* _ASM_EFI_H */
> diff --git a/arch/riscv/include/asm/mmu.h b/arch/riscv/include/asm/mmu.h
> index 967eacb01ab5..dabcf2cfb3dc 100644
> --- a/arch/riscv/include/asm/mmu.h
> +++ b/arch/riscv/include/asm/mmu.h
> @@ -20,6 +20,8 @@ typedef struct {
> #endif
> } mm_context_t;
>
> +void __init create_pgd_mapping(pgd_t *pgdp, uintptr_t va, phys_addr_t pa,
> + phys_addr_t sz, pgprot_t prot);
> #endif /* __ASSEMBLY__ */
>
> #endif /* _ASM_RISCV_MMU_H */
> diff --git a/arch/riscv/include/asm/pgtable.h b/arch/riscv/include/asm/pgtable.h
> index 815f8c959dd4..183f1f4b2ae6 100644
> --- a/arch/riscv/include/asm/pgtable.h
> +++ b/arch/riscv/include/asm/pgtable.h
> @@ -100,6 +100,10 @@
>
> #define PAGE_KERNEL __pgprot(_PAGE_KERNEL)
> #define PAGE_KERNEL_EXEC __pgprot(_PAGE_KERNEL | _PAGE_EXEC)
> +#define PAGE_KERNEL_READ __pgprot(_PAGE_KERNEL & ~_PAGE_WRITE)
> +#define PAGE_KERNEL_EXEC __pgprot(_PAGE_KERNEL | _PAGE_EXEC)
> +#define PAGE_KERNEL_READ_EXEC __pgprot((_PAGE_KERNEL & ~_PAGE_WRITE) \
> + | _PAGE_EXEC)
>
> #define PAGE_TABLE __pgprot(_PAGE_TABLE)
>
> diff --git a/arch/riscv/kernel/Makefile b/arch/riscv/kernel/Makefile
> index eabec4dce50b..0b48059cc9da 100644
> --- a/arch/riscv/kernel/Makefile
> +++ b/arch/riscv/kernel/Makefile
> @@ -36,6 +36,7 @@ OBJCOPYFLAGS := --prefix-symbols=__efistub_
> $(obj)/%.stub.o: $(obj)/%.o FORCE
> $(call if_changed,objcopy)
>
> +obj-$(CONFIG_EFI) += efi.o
> obj-$(CONFIG_FPU) += fpu.o
> obj-$(CONFIG_SMP) += smpboot.o
> obj-$(CONFIG_SMP) += smp.o
> diff --git a/arch/riscv/kernel/efi.c b/arch/riscv/kernel/efi.c
> new file mode 100644
> index 000000000000..d7a723b446c3
> --- /dev/null
> +++ b/arch/riscv/kernel/efi.c
> @@ -0,0 +1,105 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +/*
> + * Copyright (C) 2020 Western Digital Corporation or its affiliates.
> + * Adapted from arch/arm64/kernel/efi.c
> + */
> +
> +#include <linux/efi.h>
> +#include <linux/init.h>
> +
> +#include <asm/efi.h>
> +#include <asm/pgtable.h>
> +#include <asm/pgtable-bits.h>
> +
> +/*
> + * Only regions of type EFI_RUNTIME_SERVICES_CODE need to be
> + * executable, everything else can be mapped with the XN bits
> + * set. Also take the new (optional) RO/XP bits into account.
> + */
> +static __init pgprot_t efimem_to_pgprot_map(efi_memory_desc_t *md)
> +{
> + u64 attr = md->attribute;
> + u32 type = md->type;
> +
> + if (type == EFI_MEMORY_MAPPED_IO)
> + return PAGE_KERNEL;
> +
> + if (WARN_ONCE(!PAGE_ALIGNED(md->phys_addr),
> + "UEFI Runtime regions are not aligned to page size -- buggy firmware?"))
> + /*
> + * If the region is not aligned to the page size of the OS, we
> + * can not use strict permissions, since that would also affect
> + * the mapping attributes of the adjacent regions.
> + */
> + return PAGE_EXEC;
> +
> + /* R-- */
> + if ((attr & (EFI_MEMORY_XP | EFI_MEMORY_RO)) ==
> + (EFI_MEMORY_XP | EFI_MEMORY_RO))
> + return PAGE_KERNEL_READ;
> +
> + /* R-X */
> + if (attr & EFI_MEMORY_RO)
> + return PAGE_KERNEL_READ_EXEC;
> +
> + /* RW- */
> + if (((attr & (EFI_MEMORY_RP | EFI_MEMORY_WP | EFI_MEMORY_XP)) ==
> + EFI_MEMORY_XP) ||
> + type != EFI_RUNTIME_SERVICES_CODE)
> + return PAGE_KERNEL;
> +
> + /* RWX */
> + return PAGE_KERNEL_EXEC;
> +}
> +
> +int __init efi_create_mapping(struct mm_struct *mm, efi_memory_desc_t *md)
> +{
> + pgprot_t prot = __pgprot(pgprot_val(efimem_to_pgprot_map(md)) &
> + ~(_PAGE_GLOBAL));
> + int i;
> +
> + /* RISC-V maps one page at a time */
> + for (i = 0; i < md->num_pages; i++)
> + create_pgd_mapping(mm->pgd, md->virt_addr + i * PAGE_SIZE,
> + md->phys_addr + i * PAGE_SIZE,
> + PAGE_SIZE, prot);
> + return 0;
> +}
> +
> +static int __init set_permissions(pte_t *ptep, unsigned long addr, void *data)
> +{
> + efi_memory_desc_t *md = data;
> + pte_t pte = READ_ONCE(*ptep);
> + unsigned long val;
> +
> + if (md->attribute & EFI_MEMORY_RO) {
> + val = pte_val(pte) & ~_PAGE_WRITE;
> + val |= _PAGE_READ;
> + pte = __pte(val);
> + }
> + if (md->attribute & EFI_MEMORY_XP) {
> + val = pte_val(pte) & ~_PAGE_EXEC;
> + pte = __pte(val);
> + }
> + set_pte(ptep, pte);
> +
> + return 0;
> +}
> +
> +int __init efi_set_mapping_permissions(struct mm_struct *mm,
> + efi_memory_desc_t *md)
> +{
> + BUG_ON(md->type != EFI_RUNTIME_SERVICES_CODE &&
> + md->type != EFI_RUNTIME_SERVICES_DATA);
> +
> + /*
> + * Calling apply_to_page_range() is only safe on regions that are
> + * guaranteed to be mapped down to pages. Since we are only called
> + * for regions that have been mapped using efi_create_mapping() above
> + * (and this is checked by the generic Memory Attributes table parsing
> + * routines), there is no need to check that again here.
> + */
> + return apply_to_page_range(mm, md->virt_addr,
> + md->num_pages << EFI_PAGE_SHIFT,
> + set_permissions, md);
> +}
> diff --git a/arch/riscv/kernel/setup.c b/arch/riscv/kernel/setup.c
> index c71788e6aff4..7f2a0d6dca7d 100644
> --- a/arch/riscv/kernel/setup.c
> +++ b/arch/riscv/kernel/setup.c
> @@ -17,6 +17,7 @@
> #include <linux/sched/task.h>
> #include <linux/swiotlb.h>
> #include <linux/smp.h>
> +#include <linux/efi.h>
>
> #include <asm/clint.h>
> #include <asm/cpu_ops.h>
> @@ -26,11 +27,12 @@
> #include <asm/tlbflush.h>
> #include <asm/thread_info.h>
> #include <asm/kasan.h>
> +#include <asm/efi.h>
>
> #include "head.h"
>
> -#ifdef CONFIG_DUMMY_CONSOLE
> -struct screen_info screen_info = {
> +#if defined(CONFIG_DUMMY_CONSOLE) || defined(CONFIG_EFI)
> +struct screen_info screen_info __section(.data) = {
> .orig_video_lines = 30,
> .orig_video_cols = 80,
> .orig_video_mode = 0,
> @@ -75,6 +77,7 @@ void __init setup_arch(char **cmdline_p)
> early_ioremap_setup();
> parse_early_param();
>
> + efi_init();
> setup_bootmem();
> paging_init();
> #if IS_ENABLED(CONFIG_BUILTIN_DTB)
> diff --git a/arch/riscv/mm/init.c b/arch/riscv/mm/init.c
> index d238cdc501ee..9fb2fe2f4a3e 100644
> --- a/arch/riscv/mm/init.c
> +++ b/arch/riscv/mm/init.c
> @@ -390,7 +390,7 @@ static void __init create_pmd_mapping(pmd_t *pmdp,
> #define fixmap_pgd_next fixmap_pte
> #endif
>
> -static void __init create_pgd_mapping(pgd_t *pgdp,
> +void __init create_pgd_mapping(pgd_t *pgdp,
> uintptr_t va, phys_addr_t pa,
> phys_addr_t sz, pgprot_t prot)
> {
> diff --git a/drivers/firmware/efi/Makefile b/drivers/firmware/efi/Makefile
> index 61fd1e8b26fb..4d628081bb2f 100644
> --- a/drivers/firmware/efi/Makefile
> +++ b/drivers/firmware/efi/Makefile
> @@ -35,6 +35,8 @@ fake_map-$(CONFIG_X86) += x86_fake_mem.o
> arm-obj-$(CONFIG_EFI) := efi-init.o arm-runtime.o
> obj-$(CONFIG_ARM) += $(arm-obj-y)
> obj-$(CONFIG_ARM64) += $(arm-obj-y)
> +riscv-obj-$(CONFIG_EFI) := efi-init.o riscv-runtime.o
> +obj-$(CONFIG_RISCV) += $(riscv-obj-y)
> obj-$(CONFIG_EFI_CAPSULE_LOADER) += capsule-loader.o
> obj-$(CONFIG_EFI_EARLYCON) += earlycon.o
> obj-$(CONFIG_UEFI_CPER_ARM) += cper-arm.o
> diff --git a/drivers/firmware/efi/libstub/efi-stub.c b/drivers/firmware/efi/libstub/efi-stub.c
> index a5a405d8ab44..5c26725d8fd0 100644
> --- a/drivers/firmware/efi/libstub/efi-stub.c
> +++ b/drivers/firmware/efi/libstub/efi-stub.c
> @@ -17,7 +17,10 @@
>
> /*
> * This is the base address at which to start allocating virtual memory ranges
> - * for UEFI Runtime Services. This is in the low TTBR0 range so that we can use
> + * for UEFI Runtime Services.
> + *
> + * For ARM/ARM64:
> + * This is in the low TTBR0 range so that we can use
> * any allocation we choose, and eliminate the risk of a conflict after kexec.
> * The value chosen is the largest non-zero power of 2 suitable for this purpose
> * both on 32-bit and 64-bit ARM CPUs, to maximize the likelihood that it can
> @@ -25,6 +28,12 @@
> * Since 32-bit ARM could potentially execute with a 1G/3G user/kernel split,
> * map everything below 1 GB. (512 MB is a reasonable upper bound for the
> * entire footprint of the UEFI runtime services memory regions)
> + *
> + * For RISC-V:
> + * There is no specific reason why this address (512MB) can't be used as the
> + * EFI runtime virtual address for RISC-V. It also allows EFI runtime
> + * services to be used on both RV32 and RV64. Keep the same runtime virtual
> + * address for RISC-V as well to minimize code churn.
> */
> #define EFI_RT_VIRTUAL_BASE SZ_512M
> #define EFI_RT_VIRTUAL_SIZE SZ_512M
> diff --git a/drivers/firmware/efi/riscv-runtime.c b/drivers/firmware/efi/riscv-runtime.c
> new file mode 100644
> index 000000000000..d28e715d2bcc
> --- /dev/null
> +++ b/drivers/firmware/efi/riscv-runtime.c
> @@ -0,0 +1,143 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * Extensible Firmware Interface
> + *
> + * Copyright (C) 2020 Western Digital Corporation or its affiliates.
> + *
> + * Based on Extensible Firmware Interface Specification version 2.4
> + * Adapted from drivers/firmware/efi/arm-runtime.c
> + *
> + */
> +
> +#include <linux/dmi.h>
> +#include <linux/efi.h>
> +#include <linux/io.h>
> +#include <linux/memblock.h>
> +#include <linux/mm_types.h>
> +#include <linux/preempt.h>
> +#include <linux/rbtree.h>
> +#include <linux/rwsem.h>
> +#include <linux/sched.h>
> +#include <linux/slab.h>
> +#include <linux/spinlock.h>
> +#include <linux/pgtable.h>
> +
> +#include <asm/cacheflush.h>
> +#include <asm/efi.h>
> +#include <asm/mmu.h>
> +#include <asm/pgalloc.h>
> +
> +static bool __init efi_virtmap_init(void)
> +{
> + efi_memory_desc_t *md;
> +
> + efi_mm.pgd = pgd_alloc(&efi_mm);
> + mm_init_cpumask(&efi_mm);
> + init_new_context(NULL, &efi_mm);
> +
> + for_each_efi_memory_desc(md) {
> + phys_addr_t phys = md->phys_addr;
> + int ret;
> +
> + if (!(md->attribute & EFI_MEMORY_RUNTIME))
> + continue;
> + if (md->virt_addr == 0)
> + return false;
> +
> + ret = efi_create_mapping(&efi_mm, md);
> + if (ret) {
> + pr_warn(" EFI remap %pa: failed to create mapping (%d)\n",
> + &phys, ret);
> + return false;
> + }
> + }
> +
> + if (efi_memattr_apply_permissions(&efi_mm, efi_set_mapping_permissions))
> + return false;
> +
> + return true;
> +}
> +
> +/*
> + * Enable the UEFI Runtime Services if all prerequisites are in place, i.e.,
> + * non-early mapping of the UEFI system table and virtual mappings for all
> + * EFI_MEMORY_RUNTIME regions.
> + */
> +static int __init riscv_enable_runtime_services(void)
> +{
> + u64 mapsize;
> +
> + if (!efi_enabled(EFI_BOOT)) {
> + pr_info("EFI services will not be available.\n");
> + return 0;
> + }
> +
> + efi_memmap_unmap();
> +
> + mapsize = efi.memmap.desc_size * efi.memmap.nr_map;
> +
> + if (efi_memmap_init_late(efi.memmap.phys_map, mapsize)) {
> + pr_err("Failed to remap EFI memory map\n");
> + return 0;
> + }
> +
> + if (efi_soft_reserve_enabled()) {
> + efi_memory_desc_t *md;
> +
> + for_each_efi_memory_desc(md) {
> + int md_size = md->num_pages << EFI_PAGE_SHIFT;
> + struct resource *res;
> +
> + if (!(md->attribute & EFI_MEMORY_SP))
> + continue;
> +
> + res = kzalloc(sizeof(*res), GFP_KERNEL);
> + if (WARN_ON(!res))
> + break;
> +
> + res->start = md->phys_addr;
> + res->end = md->phys_addr + md_size - 1;
> + res->name = "Soft Reserved";
> + res->flags = IORESOURCE_MEM;
> + res->desc = IORES_DESC_SOFT_RESERVED;
> +
> + insert_resource(&iomem_resource, res);
> + }
> + }
> +
> + if (efi_runtime_disabled()) {
> + pr_info("EFI runtime services will be disabled.\n");
> + return 0;
> + }
> +
> + if (efi_enabled(EFI_RUNTIME_SERVICES)) {
> + pr_info("EFI runtime services access via paravirt.\n");
> + return 0;
> + }
> +
> + pr_info("Remapping and enabling EFI services.\n");
> +
> + if (!efi_virtmap_init()) {
> + pr_err("UEFI virtual mapping missing or invalid -- runtime services will not be available\n");
> + return -ENOMEM;
> + }
> +
> + /* Set up runtime services function pointers */
> + efi_native_runtime_setup();
> + set_bit(EFI_RUNTIME_SERVICES, &efi.flags);
> +
> + return 0;
> +}
> +early_initcall(riscv_enable_runtime_services);
> +
> +void efi_virtmap_load(void)
> +{
> + preempt_disable();
> + switch_mm(current->active_mm, &efi_mm, NULL);
> +}
> +
> +void efi_virtmap_unload(void)
> +{
> + switch_mm(&efi_mm, current->active_mm, NULL);
> + preempt_enable();
> +}
> --
> 2.24.0
>
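
The attribute checks in efimem_to_pgprot_map() in the patch quoted above can
be summarised by a small user-space sketch. It assumes the EFI_MEMORY_*
attribute values and memory type numbers from the UEFI specification, uses
hypothetical SKETCH_* names, returns a permission string instead of a
pgprot_t, and leaves out the fallback for regions that are not page aligned.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Attribute bits as defined by the UEFI specification. */
#define SKETCH_EFI_MEMORY_WP 0x0000000000001000ULL
#define SKETCH_EFI_MEMORY_RP 0x0000000000002000ULL
#define SKETCH_EFI_MEMORY_XP 0x0000000000004000ULL
#define SKETCH_EFI_MEMORY_RO 0x0000000000020000ULL

/* Memory types as defined by the UEFI specification. */
enum {
    SKETCH_RUNTIME_SERVICES_CODE = 5,
    SKETCH_RUNTIME_SERVICES_DATA = 6,
    SKETCH_MEMORY_MAPPED_IO = 11
};

/* Same decision order as the patch, with strings in place of pgprot_t. */
static const char *efi_perm_sketch(uint32_t type, uint64_t attr)
{
    const uint64_t xp_ro = SKETCH_EFI_MEMORY_XP | SKETCH_EFI_MEMORY_RO;
    const uint64_t rp_wp_xp = SKETCH_EFI_MEMORY_RP | SKETCH_EFI_MEMORY_WP |
                              SKETCH_EFI_MEMORY_XP;

    if (type == SKETCH_MEMORY_MAPPED_IO)
        return "rw-";                       /* PAGE_KERNEL */

    if ((attr & xp_ro) == xp_ro)
        return "r--";                       /* PAGE_KERNEL_READ */

    if (attr & SKETCH_EFI_MEMORY_RO)
        return "r-x";                       /* PAGE_KERNEL_READ_EXEC */

    if ((attr & rp_wp_xp) == SKETCH_EFI_MEMORY_XP ||
        type != SKETCH_RUNTIME_SERVICES_CODE)
        return "rw-";                       /* PAGE_KERNEL */

    return "rwx";                           /* PAGE_KERNEL_EXEC */
}
```

Only a runtime services code region with neither RO nor XP set ends up
writable and executable. Everything else is mapped without execute
permission, or read-only when the firmware asks for it.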
On Wed, Aug 12, 2020 at 04:47:52PM -0700, Atish Patra wrote:
> Currently, page table setup is done during setup_vm_final, where fixmap can
> be used to create the temporary mappings. The physical frame is allocated
> from memblock_alloc_* functions. However, this won't work if a page table
> mapping needs to be created for a different mm context (i.e. efi mm) at
> a later point in time.
>
> Use generic kernel page allocation function & macros for any mapping
> after setup_vm_final.
>
> Signed-off-by: Atish Patra <[email protected]>
A nit below, otherwise
Acked-by: Mike Rapoport <[email protected]>
> ---
> arch/riscv/mm/init.c | 130 ++++++++++++++++++++++++++++++++-----------
> 1 file changed, 99 insertions(+), 31 deletions(-)
>
> diff --git a/arch/riscv/mm/init.c b/arch/riscv/mm/init.c
> index b75ebe8e7a92..d238cdc501ee 100644
> --- a/arch/riscv/mm/init.c
> +++ b/arch/riscv/mm/init.c
> @@ -32,6 +32,17 @@ extern char _start[];
> void *dtb_early_va __initdata;
> uintptr_t dtb_early_pa __initdata;
>
> +struct pt_alloc_ops {
> + pte_t *(*get_pte_virt)(phys_addr_t pa);
> + phys_addr_t (*alloc_pte)(uintptr_t va);
> +#ifndef __PAGETABLE_PMD_FOLDED
> + pmd_t *(*get_pmd_virt)(phys_addr_t pa);
> + phys_addr_t (*alloc_pmd)(uintptr_t va);
> +#endif
> +};
> +
> +struct pt_alloc_ops pt_ops;
static?
> +
> static void __init zone_sizes_init(void)
> {
> unsigned long max_zone_pfns[MAX_NR_ZONES] = { 0, };
> @@ -211,7 +222,6 @@ EXPORT_SYMBOL(pfn_base);
> pgd_t swapper_pg_dir[PTRS_PER_PGD] __page_aligned_bss;
> pgd_t trampoline_pg_dir[PTRS_PER_PGD] __page_aligned_bss;
> pte_t fixmap_pte[PTRS_PER_PTE] __page_aligned_bss;
> -static bool mmu_enabled;
>
> #define MAX_EARLY_MAPPING_SIZE SZ_128M
>
> @@ -234,27 +244,46 @@ void __set_fixmap(enum fixed_addresses idx, phys_addr_t phys, pgprot_t prot)
> }
> }
>
> -static pte_t *__init get_pte_virt(phys_addr_t pa)
> +static inline pte_t *__init get_pte_virt_early(phys_addr_t pa)
> {
> - if (mmu_enabled) {
> - clear_fixmap(FIX_PTE);
> - return (pte_t *)set_fixmap_offset(FIX_PTE, pa);
> - } else {
> - return (pte_t *)((uintptr_t)pa);
> - }
> + return (pte_t *)((uintptr_t)pa);
> }
>
> -static phys_addr_t __init alloc_pte(uintptr_t va)
> +static inline pte_t *__init get_pte_virt_fixmap(phys_addr_t pa)
> +{
> + clear_fixmap(FIX_PTE);
> + return (pte_t *)set_fixmap_offset(FIX_PTE, pa);
> +}
> +
> +static inline pte_t *get_pte_virt_late(phys_addr_t pa)
> +{
> + return (pte_t *) __va(pa);
> +}
> +
> +static inline phys_addr_t __init alloc_pte_early(uintptr_t va)
> {
> /*
> * We only create PMD or PGD early mappings so we
> * should never reach here with MMU disabled.
> */
> - BUG_ON(!mmu_enabled);
> + BUG();
> +}
>
> +static inline phys_addr_t __init alloc_pte_fixmap(uintptr_t va)
> +{
> return memblock_phys_alloc(PAGE_SIZE, PAGE_SIZE);
> }
>
> +static phys_addr_t alloc_pte_late(uintptr_t va)
> +{
> + unsigned long vaddr;
> +
> + vaddr = __get_free_page(GFP_KERNEL);
> + if (!vaddr || !pgtable_pte_page_ctor(virt_to_page(vaddr)))
> + BUG();
> + return __pa(vaddr);
> +}
> +
> static void __init create_pte_mapping(pte_t *ptep,
> uintptr_t va, phys_addr_t pa,
> phys_addr_t sz, pgprot_t prot)
> @@ -279,28 +308,46 @@ pmd_t fixmap_pmd[PTRS_PER_PMD] __page_aligned_bss;
> #endif
> pmd_t early_pmd[PTRS_PER_PMD * NUM_EARLY_PMDS] __initdata __aligned(PAGE_SIZE);
>
> -static pmd_t *__init get_pmd_virt(phys_addr_t pa)
> +static pmd_t *__init get_pmd_virt_early(phys_addr_t pa)
> {
> - if (mmu_enabled) {
> - clear_fixmap(FIX_PMD);
> - return (pmd_t *)set_fixmap_offset(FIX_PMD, pa);
> - } else {
> - return (pmd_t *)((uintptr_t)pa);
> - }
> + /* Before MMU is enabled */
> + return (pmd_t *)((uintptr_t)pa);
> }
>
> -static phys_addr_t __init alloc_pmd(uintptr_t va)
> +static pmd_t *__init get_pmd_virt_fixmap(phys_addr_t pa)
> {
> - uintptr_t pmd_num;
> + clear_fixmap(FIX_PMD);
> + return (pmd_t *)set_fixmap_offset(FIX_PMD, pa);
> +}
> +
> +static pmd_t *get_pmd_virt_late(phys_addr_t pa)
> +{
> + return (pmd_t *) __va(pa);
> +}
>
> - if (mmu_enabled)
> - return memblock_phys_alloc(PAGE_SIZE, PAGE_SIZE);
> +static phys_addr_t __init alloc_pmd_early(uintptr_t va)
> +{
> + uintptr_t pmd_num;
>
> pmd_num = (va - PAGE_OFFSET) >> PGDIR_SHIFT;
> BUG_ON(pmd_num >= NUM_EARLY_PMDS);
> return (uintptr_t)&early_pmd[pmd_num * PTRS_PER_PMD];
> }
>
> +static phys_addr_t __init alloc_pmd_fixmap(uintptr_t va)
> +{
> + return memblock_phys_alloc(PAGE_SIZE, PAGE_SIZE);
> +}
> +
> +static phys_addr_t alloc_pmd_late(uintptr_t va)
> +{
> + unsigned long vaddr;
> +
> + vaddr = __get_free_page(GFP_KERNEL);
> + BUG_ON(!vaddr);
> + return __pa(vaddr);
> +}
> +
> static void __init create_pmd_mapping(pmd_t *pmdp,
> uintptr_t va, phys_addr_t pa,
> phys_addr_t sz, pgprot_t prot)
> @@ -316,28 +363,28 @@ static void __init create_pmd_mapping(pmd_t *pmdp,
> }
>
> if (pmd_none(pmdp[pmd_idx])) {
> - pte_phys = alloc_pte(va);
> + pte_phys = pt_ops.alloc_pte(va);
> pmdp[pmd_idx] = pfn_pmd(PFN_DOWN(pte_phys), PAGE_TABLE);
> - ptep = get_pte_virt(pte_phys);
> + ptep = pt_ops.get_pte_virt(pte_phys);
> memset(ptep, 0, PAGE_SIZE);
> } else {
> pte_phys = PFN_PHYS(_pmd_pfn(pmdp[pmd_idx]));
> - ptep = get_pte_virt(pte_phys);
> + ptep = pt_ops.get_pte_virt(pte_phys);
> }
>
> create_pte_mapping(ptep, va, pa, sz, prot);
> }
>
> #define pgd_next_t pmd_t
> -#define alloc_pgd_next(__va) alloc_pmd(__va)
> -#define get_pgd_next_virt(__pa) get_pmd_virt(__pa)
> +#define alloc_pgd_next(__va) pt_ops.alloc_pmd(__va)
> +#define get_pgd_next_virt(__pa) pt_ops.get_pmd_virt(__pa)
> #define create_pgd_next_mapping(__nextp, __va, __pa, __sz, __prot) \
> create_pmd_mapping(__nextp, __va, __pa, __sz, __prot)
> #define fixmap_pgd_next fixmap_pmd
> #else
> #define pgd_next_t pte_t
> -#define alloc_pgd_next(__va) alloc_pte(__va)
> -#define get_pgd_next_virt(__pa) get_pte_virt(__pa)
> +#define alloc_pgd_next(__va) pt_ops.alloc_pte(__va)
> +#define get_pgd_next_virt(__pa) pt_ops.get_pte_virt(__pa)
> #define create_pgd_next_mapping(__nextp, __va, __pa, __sz, __prot) \
> create_pte_mapping(__nextp, __va, __pa, __sz, __prot)
> #define fixmap_pgd_next fixmap_pte
> @@ -421,6 +468,12 @@ asmlinkage void __init setup_vm(uintptr_t dtb_pa)
> BUG_ON((load_pa % map_size) != 0);
> BUG_ON(load_sz > MAX_EARLY_MAPPING_SIZE);
>
> + pt_ops.alloc_pte = alloc_pte_early;
> + pt_ops.get_pte_virt = get_pte_virt_early;
> +#ifndef __PAGETABLE_PMD_FOLDED
> + pt_ops.alloc_pmd = alloc_pmd_early;
> + pt_ops.get_pmd_virt = get_pmd_virt_early;
> +#endif
> /* Setup early PGD for fixmap */
> create_pgd_mapping(early_pg_dir, FIXADDR_START,
> (uintptr_t)fixmap_pgd_next, PGDIR_SIZE, PAGE_TABLE);
> @@ -497,9 +550,16 @@ static void __init setup_vm_final(void)
> phys_addr_t pa, start, end;
> struct memblock_region *reg;
>
> - /* Set mmu_enabled flag */
> - mmu_enabled = true;
> -
> + /**
> + * MMU is enabled at this point. But page table setup is not complete yet.
> + * fixmap page table alloc functions should be used at this point
> + */
> + pt_ops.alloc_pte = alloc_pte_fixmap;
> + pt_ops.get_pte_virt = get_pte_virt_fixmap;
> +#ifndef __PAGETABLE_PMD_FOLDED
> + pt_ops.alloc_pmd = alloc_pmd_fixmap;
> + pt_ops.get_pmd_virt = get_pmd_virt_fixmap;
> +#endif
> /* Setup swapper PGD for fixmap */
> create_pgd_mapping(swapper_pg_dir, FIXADDR_START,
> __pa_symbol(fixmap_pgd_next),
> @@ -533,6 +593,14 @@ static void __init setup_vm_final(void)
> /* Move to swapper page table */
> csr_write(CSR_SATP, PFN_DOWN(__pa_symbol(swapper_pg_dir)) | SATP_MODE);
> local_flush_tlb_all();
> +
> + /* generic page allocation functions must be used to setup page table */
> + pt_ops.alloc_pte = alloc_pte_late;
> + pt_ops.get_pte_virt = get_pte_virt_late;
> +#ifndef __PAGETABLE_PMD_FOLDED
> + pt_ops.alloc_pmd = alloc_pmd_late;
> + pt_ops.get_pmd_virt = get_pmd_virt_late;
> +#endif
> }
> #else
> asmlinkage void __init setup_vm(uintptr_t dtb_pa)
> --
> 2.24.0
>
--
Sincerely yours,
Mike.
On Fri, Aug 14, 2020 at 2:39 AM Mike Rapoport <[email protected]> wrote:
>
> On Wed, Aug 12, 2020 at 04:47:52PM -0700, Atish Patra wrote:
> > Currently, page table setup is done during setup_vm_final where fixmap can
> > be used to create the temporary mappings. The physical frame is allocated
> > from memblock_alloc_* functions. However, this won't work if page table
> > mapping needs to be created for a different mm context (i.e. efi mm) at
> > a later point of time.
> >
> > Use generic kernel page allocation function & macros for any mapping
> > after setup_vm_final.
> >
> > Signed-off-by: Atish Patra <[email protected]>
>
> A nit below, otherwise
>
>
> Acked-by: Mike Rapoport <[email protected]>
>
> > ---
> > arch/riscv/mm/init.c | 130 ++++++++++++++++++++++++++++++++-----------
> > 1 file changed, 99 insertions(+), 31 deletions(-)
> >
> > diff --git a/arch/riscv/mm/init.c b/arch/riscv/mm/init.c
> > index b75ebe8e7a92..d238cdc501ee 100644
> > --- a/arch/riscv/mm/init.c
> > +++ b/arch/riscv/mm/init.c
> > @@ -32,6 +32,17 @@ extern char _start[];
> > void *dtb_early_va __initdata;
> > uintptr_t dtb_early_pa __initdata;
> >
> > +struct pt_alloc_ops {
> > + pte_t *(*get_pte_virt)(phys_addr_t pa);
> > + phys_addr_t (*alloc_pte)(uintptr_t va);
> > +#ifndef __PAGETABLE_PMD_FOLDED
> > + pmd_t *(*get_pmd_virt)(phys_addr_t pa);
> > + phys_addr_t (*alloc_pmd)(uintptr_t va);
> > +#endif
> > +};
> > +
> > +struct pt_alloc_ops pt_ops;
>
> static?
>
Ahh yes. Thanks for catching that. I will fix it in the next version.
> > +
> > static void __init zone_sizes_init(void)
> > {
> > unsigned long max_zone_pfns[MAX_NR_ZONES] = { 0, };
> > @@ -211,7 +222,6 @@ EXPORT_SYMBOL(pfn_base);
> > pgd_t swapper_pg_dir[PTRS_PER_PGD] __page_aligned_bss;
> > pgd_t trampoline_pg_dir[PTRS_PER_PGD] __page_aligned_bss;
> > pte_t fixmap_pte[PTRS_PER_PTE] __page_aligned_bss;
> > -static bool mmu_enabled;
> >
> > #define MAX_EARLY_MAPPING_SIZE SZ_128M
> >
> > @@ -234,27 +244,46 @@ void __set_fixmap(enum fixed_addresses idx, phys_addr_t phys, pgprot_t prot)
> > }
> > }
> >
> > -static pte_t *__init get_pte_virt(phys_addr_t pa)
> > +static inline pte_t *__init get_pte_virt_early(phys_addr_t pa)
> > {
> > - if (mmu_enabled) {
> > - clear_fixmap(FIX_PTE);
> > - return (pte_t *)set_fixmap_offset(FIX_PTE, pa);
> > - } else {
> > - return (pte_t *)((uintptr_t)pa);
> > - }
> > + return (pte_t *)((uintptr_t)pa);
> > }
> >
> > -static phys_addr_t __init alloc_pte(uintptr_t va)
> > +static inline pte_t *__init get_pte_virt_fixmap(phys_addr_t pa)
> > +{
> > + clear_fixmap(FIX_PTE);
> > + return (pte_t *)set_fixmap_offset(FIX_PTE, pa);
> > +}
> > +
> > +static inline pte_t *get_pte_virt_late(phys_addr_t pa)
> > +{
> > + return (pte_t *) __va(pa);
> > +}
> > +
> > +static inline phys_addr_t __init alloc_pte_early(uintptr_t va)
> > {
> > /*
> > * We only create PMD or PGD early mappings so we
> > * should never reach here with MMU disabled.
> > */
> > - BUG_ON(!mmu_enabled);
> > + BUG();
> > +}
> >
> > +static inline phys_addr_t __init alloc_pte_fixmap(uintptr_t va)
> > +{
> > return memblock_phys_alloc(PAGE_SIZE, PAGE_SIZE);
> > }
> >
> > +static phys_addr_t alloc_pte_late(uintptr_t va)
> > +{
> > + unsigned long vaddr;
> > +
> > + vaddr = __get_free_page(GFP_KERNEL);
> > + if (!vaddr || !pgtable_pte_page_ctor(virt_to_page(vaddr)))
> > + BUG();
> > + return __pa(vaddr);
> > +}
> > +
> > static void __init create_pte_mapping(pte_t *ptep,
> > uintptr_t va, phys_addr_t pa,
> > phys_addr_t sz, pgprot_t prot)
> > @@ -279,28 +308,46 @@ pmd_t fixmap_pmd[PTRS_PER_PMD] __page_aligned_bss;
> > #endif
> > pmd_t early_pmd[PTRS_PER_PMD * NUM_EARLY_PMDS] __initdata __aligned(PAGE_SIZE);
> >
> > -static pmd_t *__init get_pmd_virt(phys_addr_t pa)
> > +static pmd_t *__init get_pmd_virt_early(phys_addr_t pa)
> > {
> > - if (mmu_enabled) {
> > - clear_fixmap(FIX_PMD);
> > - return (pmd_t *)set_fixmap_offset(FIX_PMD, pa);
> > - } else {
> > - return (pmd_t *)((uintptr_t)pa);
> > - }
> > + /* Before MMU is enabled */
> > + return (pmd_t *)((uintptr_t)pa);
> > }
> >
> > -static phys_addr_t __init alloc_pmd(uintptr_t va)
> > +static pmd_t *__init get_pmd_virt_fixmap(phys_addr_t pa)
> > {
> > - uintptr_t pmd_num;
> > + clear_fixmap(FIX_PMD);
> > + return (pmd_t *)set_fixmap_offset(FIX_PMD, pa);
> > +}
> > +
> > +static pmd_t *get_pmd_virt_late(phys_addr_t pa)
> > +{
> > + return (pmd_t *) __va(pa);
> > +}
> >
> > - if (mmu_enabled)
> > - return memblock_phys_alloc(PAGE_SIZE, PAGE_SIZE);
> > +static phys_addr_t __init alloc_pmd_early(uintptr_t va)
> > +{
> > + uintptr_t pmd_num;
> >
> > pmd_num = (va - PAGE_OFFSET) >> PGDIR_SHIFT;
> > BUG_ON(pmd_num >= NUM_EARLY_PMDS);
> > return (uintptr_t)&early_pmd[pmd_num * PTRS_PER_PMD];
> > }
> >
> > +static phys_addr_t __init alloc_pmd_fixmap(uintptr_t va)
> > +{
> > + return memblock_phys_alloc(PAGE_SIZE, PAGE_SIZE);
> > +}
> > +
> > +static phys_addr_t alloc_pmd_late(uintptr_t va)
> > +{
> > + unsigned long vaddr;
> > +
> > + vaddr = __get_free_page(GFP_KERNEL);
> > + BUG_ON(!vaddr);
> > + return __pa(vaddr);
> > +}
> > +
> > static void __init create_pmd_mapping(pmd_t *pmdp,
> > uintptr_t va, phys_addr_t pa,
> > phys_addr_t sz, pgprot_t prot)
> > @@ -316,28 +363,28 @@ static void __init create_pmd_mapping(pmd_t *pmdp,
> > }
> >
> > if (pmd_none(pmdp[pmd_idx])) {
> > - pte_phys = alloc_pte(va);
> > + pte_phys = pt_ops.alloc_pte(va);
> > pmdp[pmd_idx] = pfn_pmd(PFN_DOWN(pte_phys), PAGE_TABLE);
> > - ptep = get_pte_virt(pte_phys);
> > + ptep = pt_ops.get_pte_virt(pte_phys);
> > memset(ptep, 0, PAGE_SIZE);
> > } else {
> > pte_phys = PFN_PHYS(_pmd_pfn(pmdp[pmd_idx]));
> > - ptep = get_pte_virt(pte_phys);
> > + ptep = pt_ops.get_pte_virt(pte_phys);
> > }
> >
> > create_pte_mapping(ptep, va, pa, sz, prot);
> > }
> >
> > #define pgd_next_t pmd_t
> > -#define alloc_pgd_next(__va) alloc_pmd(__va)
> > -#define get_pgd_next_virt(__pa) get_pmd_virt(__pa)
> > +#define alloc_pgd_next(__va) pt_ops.alloc_pmd(__va)
> > +#define get_pgd_next_virt(__pa) pt_ops.get_pmd_virt(__pa)
> > #define create_pgd_next_mapping(__nextp, __va, __pa, __sz, __prot) \
> > create_pmd_mapping(__nextp, __va, __pa, __sz, __prot)
> > #define fixmap_pgd_next fixmap_pmd
> > #else
> > #define pgd_next_t pte_t
> > -#define alloc_pgd_next(__va) alloc_pte(__va)
> > -#define get_pgd_next_virt(__pa) get_pte_virt(__pa)
> > +#define alloc_pgd_next(__va) pt_ops.alloc_pte(__va)
> > +#define get_pgd_next_virt(__pa) pt_ops.get_pte_virt(__pa)
> > #define create_pgd_next_mapping(__nextp, __va, __pa, __sz, __prot) \
> > create_pte_mapping(__nextp, __va, __pa, __sz, __prot)
> > #define fixmap_pgd_next fixmap_pte
> > @@ -421,6 +468,12 @@ asmlinkage void __init setup_vm(uintptr_t dtb_pa)
> > BUG_ON((load_pa % map_size) != 0);
> > BUG_ON(load_sz > MAX_EARLY_MAPPING_SIZE);
> >
> > + pt_ops.alloc_pte = alloc_pte_early;
> > + pt_ops.get_pte_virt = get_pte_virt_early;
> > +#ifndef __PAGETABLE_PMD_FOLDED
> > + pt_ops.alloc_pmd = alloc_pmd_early;
> > + pt_ops.get_pmd_virt = get_pmd_virt_early;
> > +#endif
> > /* Setup early PGD for fixmap */
> > create_pgd_mapping(early_pg_dir, FIXADDR_START,
> > (uintptr_t)fixmap_pgd_next, PGDIR_SIZE, PAGE_TABLE);
> > @@ -497,9 +550,16 @@ static void __init setup_vm_final(void)
> > phys_addr_t pa, start, end;
> > struct memblock_region *reg;
> >
> > - /* Set mmu_enabled flag */
> > - mmu_enabled = true;
> > -
> > + /**
> > + * MMU is enabled at this point. But page table setup is not complete yet.
> > + * fixmap page table alloc functions should be used at this point
> > + */
> > + pt_ops.alloc_pte = alloc_pte_fixmap;
> > + pt_ops.get_pte_virt = get_pte_virt_fixmap;
> > +#ifndef __PAGETABLE_PMD_FOLDED
> > + pt_ops.alloc_pmd = alloc_pmd_fixmap;
> > + pt_ops.get_pmd_virt = get_pmd_virt_fixmap;
> > +#endif
> > /* Setup swapper PGD for fixmap */
> > create_pgd_mapping(swapper_pg_dir, FIXADDR_START,
> > __pa_symbol(fixmap_pgd_next),
> > @@ -533,6 +593,14 @@ static void __init setup_vm_final(void)
> > /* Move to swapper page table */
> > csr_write(CSR_SATP, PFN_DOWN(__pa_symbol(swapper_pg_dir)) | SATP_MODE);
> > local_flush_tlb_all();
> > +
> > + /* generic page allocation functions must be used to setup page table */
> > + pt_ops.alloc_pte = alloc_pte_late;
> > + pt_ops.get_pte_virt = get_pte_virt_late;
> > +#ifndef __PAGETABLE_PMD_FOLDED
> > + pt_ops.alloc_pmd = alloc_pmd_late;
> > + pt_ops.get_pmd_virt = get_pmd_virt_late;
> > +#endif
> > }
> > #else
> > asmlinkage void __init setup_vm(uintptr_t dtb_pa)
> > --
> > 2.24.0
> >
>
> --
> Sincerely yours,
> Mike.
--
Regards,
Atish
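For readers following the thread, the staged pt_ops idea in the patch above can be modelled in user space. This is only a rough sketch under stated assumptions, not the kernel code: the simulated RAM array, bump_alloc(), fixmap_slot, use_fixmap_ops(), use_late_ops() and new_table() are hypothetical names invented for illustration, and a "physical address" here is just an offset into the array. The point it shows is that the caller never tests an mmu_enabled flag; it only goes through whichever function pointers are currently installed.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define PAGE_SIZE 4096u
#define NUM_PAGES 8u

typedef uintptr_t phys_addr_t;

/* Simulated physical memory; a "physical address" is an offset into it. */
static unsigned char ram[NUM_PAGES * PAGE_SIZE];
static phys_addr_t next_free;                     /* bump allocator cursor */
static phys_addr_t fixmap_slot = (phys_addr_t)-1; /* page held by the one slot */

/* Hypothetical, single-level stand-in for the patch's pt_alloc_ops. */
struct pt_alloc_ops {
	void *(*get_virt)(phys_addr_t pa);
	phys_addr_t (*alloc)(void);
};

static struct pt_alloc_ops pt_ops;

static phys_addr_t bump_alloc(void)
{
	phys_addr_t pa = next_free;

	assert(pa + PAGE_SIZE <= sizeof(ram)); /* stands in for BUG_ON() */
	next_free += PAGE_SIZE;
	return pa;
}

/* Fixmap-like stage: one table at a time is visible through a fixed slot. */
static void *get_virt_fixmap(phys_addr_t pa)
{
	fixmap_slot = pa;
	return &ram[pa];
}

/* Late stage: every page is reachable through a linear mapping. */
static void *get_virt_late(phys_addr_t pa)
{
	return &ram[pa];
}

static void use_fixmap_ops(void)
{
	pt_ops.alloc = bump_alloc;
	pt_ops.get_virt = get_virt_fixmap;
}

static void use_late_ops(void)
{
	pt_ops.alloc = bump_alloc;
	pt_ops.get_virt = get_virt_late;
}

/* Stage-agnostic caller: allocate a table and zero it via pt_ops only. */
static void *new_table(void)
{
	phys_addr_t pa = pt_ops.alloc();
	void *table = pt_ops.get_virt(pa);

	memset(table, 0, PAGE_SIZE);
	return table;
}
```

Calling use_fixmap_ops() and then new_table() routes the access through the single slot, while the same new_table() call after use_late_ops() leaves the slot untouched. That mirrors how setup_vm_final() swaps the callbacks before and after the switch to swapper_pg_dir.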
On Wed, 12 Aug 2020 16:47:50 PDT (-0700), Atish Patra wrote:
> From: Anup Patel <[email protected]>
>
> Currently, RISC-V reserves 1MB of fixmap memory for the device tree. However,
> it maps only a single PMD (2MB) for fixmap, which leaves less than 1MB
> for other kernel features such as early ioremap that require fixmap
> as well. The fixmap size can be increased by another 2MB, but that brings
> additional complexity and changes the virtual memory layout as well.
> If another feature requires more fixmap space later, the layout has to be
> changed again.
>
> Technically, the DT doesn't need a fixmap as the memory occupied by the DT is
> only used during boot. That's why we map the device tree in the early page table
> using two consecutive PGD mappings at lower addresses (< PAGE_OFFSET).
> This frees a lot of space in fixmap and also makes the maximum supported
> device tree size PGDIR_SIZE. Thus, the init memory section can be used
> for the same purpose as well. This simplifies the fixmap implementation.
>
> Signed-off-by: Anup Patel <[email protected]>
> Signed-off-by: Atish Patra <[email protected]>
> ---
> arch/riscv/include/asm/fixmap.h | 3 ---
> arch/riscv/include/asm/pgtable.h | 1 +
> arch/riscv/kernel/head.S | 1 -
> arch/riscv/kernel/head.h | 2 --
> arch/riscv/kernel/setup.c | 9 +++++++--
> arch/riscv/mm/init.c | 26 ++++++++++++--------------
> 6 files changed, 20 insertions(+), 22 deletions(-)
>
> diff --git a/arch/riscv/include/asm/fixmap.h b/arch/riscv/include/asm/fixmap.h
> index 1ff075a8dfc7..11613f38228a 100644
> --- a/arch/riscv/include/asm/fixmap.h
> +++ b/arch/riscv/include/asm/fixmap.h
> @@ -22,9 +22,6 @@
> */
> enum fixed_addresses {
> FIX_HOLE,
> -#define FIX_FDT_SIZE SZ_1M
> - FIX_FDT_END,
> - FIX_FDT = FIX_FDT_END + FIX_FDT_SIZE / PAGE_SIZE - 1,
> FIX_PTE,
> FIX_PMD,
> FIX_TEXT_POKE1,
> diff --git a/arch/riscv/include/asm/pgtable.h b/arch/riscv/include/asm/pgtable.h
> index eaea1f717010..815f8c959dd4 100644
> --- a/arch/riscv/include/asm/pgtable.h
> +++ b/arch/riscv/include/asm/pgtable.h
> @@ -464,6 +464,7 @@ static inline void __kernel_map_pages(struct page *page, int numpages, int enabl
> #define kern_addr_valid(addr) (1) /* FIXME */
>
> extern void *dtb_early_va;
> +extern uintptr_t dtb_early_pa;
> void setup_bootmem(void);
> void paging_init(void);
>
> diff --git a/arch/riscv/kernel/head.S b/arch/riscv/kernel/head.S
> index 7822054dbd88..a2f0cb3ca0a6 100644
> --- a/arch/riscv/kernel/head.S
> +++ b/arch/riscv/kernel/head.S
> @@ -255,7 +255,6 @@ clear_bss_done:
> #endif
> /* Start the kernel */
> call soc_early_init
> - call parse_dtb
> tail start_kernel
>
> .Lsecondary_start:
> diff --git a/arch/riscv/kernel/head.h b/arch/riscv/kernel/head.h
> index 105fb0496b24..b48dda3d04f6 100644
> --- a/arch/riscv/kernel/head.h
> +++ b/arch/riscv/kernel/head.h
> @@ -16,6 +16,4 @@ asmlinkage void __init setup_vm(uintptr_t dtb_pa);
> extern void *__cpu_up_stack_pointer[];
> extern void *__cpu_up_task_pointer[];
>
> -void __init parse_dtb(void);
> -
> #endif /* __ASM_HEAD_H */
> diff --git a/arch/riscv/kernel/setup.c b/arch/riscv/kernel/setup.c
> index f04373be54a6..6a0ee2405813 100644
> --- a/arch/riscv/kernel/setup.c
> +++ b/arch/riscv/kernel/setup.c
> @@ -49,8 +49,9 @@ atomic_t hart_lottery __section(.sdata);
> unsigned long boot_cpu_hartid;
> static DEFINE_PER_CPU(struct cpu, cpu_devices);
>
> -void __init parse_dtb(void)
> +static void __init parse_dtb(void)
> {
> + /* Early scan of device tree from init memory */
> if (early_init_dt_scan(dtb_early_va))
> return;
>
> @@ -63,6 +64,7 @@ void __init parse_dtb(void)
>
> void __init setup_arch(char **cmdline_p)
> {
> + parse_dtb();
> init_mm.start_code = (unsigned long) _stext;
> init_mm.end_code = (unsigned long) _etext;
> init_mm.end_data = (unsigned long) _edata;
> @@ -77,7 +79,10 @@ void __init setup_arch(char **cmdline_p)
> #if IS_ENABLED(CONFIG_BUILTIN_DTB)
> unflatten_and_copy_device_tree();
> #else
> - unflatten_device_tree();
> + if (early_init_dt_verify(__va(dtb_early_pa)))
> + unflatten_device_tree();
> + else
> + pr_err("No DTB found in kernel mappings\n");
> #endif
> clint_init_boot_cpu();
>
> diff --git a/arch/riscv/mm/init.c b/arch/riscv/mm/init.c
> index 787c75f751a5..2b651f63f5c4 100644
> --- a/arch/riscv/mm/init.c
> +++ b/arch/riscv/mm/init.c
> @@ -28,7 +28,9 @@ unsigned long empty_zero_page[PAGE_SIZE / sizeof(unsigned long)]
> EXPORT_SYMBOL(empty_zero_page);
>
> extern char _start[];
> -void *dtb_early_va;
> +#define DTB_EARLY_BASE_VA PGDIR_SIZE
> +void *dtb_early_va __initdata;
> +uintptr_t dtb_early_pa __initdata;
>
> static void __init zone_sizes_init(void)
> {
> @@ -141,8 +143,6 @@ static void __init setup_initrd(void)
> }
> #endif /* CONFIG_BLK_DEV_INITRD */
>
> -static phys_addr_t dtb_early_pa __initdata;
> -
> void __init setup_bootmem(void)
> {
> struct memblock_region *reg;
> @@ -399,7 +399,7 @@ static uintptr_t __init best_map_size(phys_addr_t base, phys_addr_t size)
>
> asmlinkage void __init setup_vm(uintptr_t dtb_pa)
> {
> - uintptr_t va, end_va;
> + uintptr_t va, pa, end_va;
> uintptr_t load_pa = (uintptr_t)(&_start);
> uintptr_t load_sz = (uintptr_t)(&_end) - load_pa;
> uintptr_t map_size = best_map_size(load_pa, MAX_EARLY_MAPPING_SIZE);
> @@ -448,16 +448,13 @@ asmlinkage void __init setup_vm(uintptr_t dtb_pa)
> load_pa + (va - PAGE_OFFSET),
> map_size, PAGE_KERNEL_EXEC);
>
> - /* Create fixed mapping for early FDT parsing */
> - end_va = __fix_to_virt(FIX_FDT) + FIX_FDT_SIZE;
> - for (va = __fix_to_virt(FIX_FDT); va < end_va; va += PAGE_SIZE)
> - create_pte_mapping(fixmap_pte, va,
> - dtb_pa + (va - __fix_to_virt(FIX_FDT)),
> - PAGE_SIZE, PAGE_KERNEL);
> -
> - /* Save pointer to DTB for early FDT parsing */
> - dtb_early_va = (void *)fix_to_virt(FIX_FDT) + (dtb_pa & ~PAGE_MASK);
> - /* Save physical address for memblock reservation */
> + /* Create two consecutive PGD mappings for FDT early scan */
> + pa = dtb_pa & ~(PGDIR_SIZE - 1);
> + create_pgd_mapping(early_pg_dir, DTB_EARLY_BASE_VA,
> + pa, PGDIR_SIZE, PAGE_KERNEL);
> + create_pgd_mapping(early_pg_dir, DTB_EARLY_BASE_VA + PGDIR_SIZE,
> + pa + PGDIR_SIZE, PGDIR_SIZE, PAGE_KERNEL);
> + dtb_early_va = (void *)DTB_EARLY_BASE_VA + (dtb_pa & (PGDIR_SIZE - 1));
> dtb_early_pa = dtb_pa;
> }
>
> @@ -516,6 +513,7 @@ asmlinkage void __init setup_vm(uintptr_t dtb_pa)
> #else
> dtb_early_va = (void *)dtb_pa;
> #endif
> + dtb_early_pa = dtb_pa;
> }
>
> static inline void setup_vm_final(void)
Reviewed-by: Palmer Dabbelt <[email protected]>
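The address arithmetic in the setup_vm() hunk above can be checked with a small user-space sketch. It assumes Sv39, where PGDIR_SHIFT is 30 and PGDIR_SIZE is therefore 1GB; the struct and the helpers dtb_early_map_for() and dtb_fits() are hypothetical names used only for this illustration. It shows why two consecutive PGD mappings are enough for any DTB of up to PGDIR_SIZE bytes: the offset of the DTB inside its first PGD-sized region is always smaller than PGDIR_SIZE.

```c
#include <assert.h>
#include <stdint.h>

#define PGDIR_SHIFT 30 /* assumption: Sv39 */
#define PGDIR_SIZE (UINT64_C(1) << PGDIR_SHIFT)
#define DTB_EARLY_BASE_VA PGDIR_SIZE

/* Hypothetical container for the values computed in setup_vm(). */
struct dtb_early_map {
	uint64_t map_pa;   /* PGDIR_SIZE-aligned start of the mapped window */
	uint64_t map_size; /* two consecutive PGD entries */
	uint64_t dtb_va;   /* value that dtb_early_va would receive */
};

static struct dtb_early_map dtb_early_map_for(uint64_t dtb_pa)
{
	struct dtb_early_map m;

	m.map_pa = dtb_pa & ~(PGDIR_SIZE - 1);
	m.map_size = 2 * PGDIR_SIZE;
	m.dtb_va = DTB_EARLY_BASE_VA + (dtb_pa & (PGDIR_SIZE - 1));
	assert((m.map_pa & (PGDIR_SIZE - 1)) == 0); /* window is PGD aligned */
	return m;
}

/* Returns 1 if a DTB of dtb_size bytes lies entirely inside the window. */
static int dtb_fits(uint64_t dtb_pa, uint64_t dtb_size)
{
	struct dtb_early_map m = dtb_early_map_for(dtb_pa);

	return dtb_pa + dtb_size <= m.map_pa + m.map_size;
}
```

For a DTB placed 1MB below the 3GB boundary (0xBFF00000), the window starts at 0x80000000 and the early virtual address comes out as 0x7FF00000, so a 2MB blob that straddles the boundary is still covered by the second mapping.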
On Wed, 12 Aug 2020 16:47:51 PDT (-0700), Atish Patra wrote:
> UEFI uses early IO or memory mappings for runtime services before
> normal ioremap() is usable. Add the necessary fixmap bindings and
> pmd mappings for generic ioremap support to work.
>
> Signed-off-by: Atish Patra <[email protected]>
> ---
> arch/riscv/Kconfig | 1 +
> arch/riscv/include/asm/Kbuild | 1 +
> arch/riscv/include/asm/fixmap.h | 13 +++++++++++++
> arch/riscv/include/asm/io.h | 1 +
> arch/riscv/kernel/setup.c | 1 +
> arch/riscv/mm/init.c | 33 +++++++++++++++++++++++++++++++++
> 6 files changed, 50 insertions(+)
>
> diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
> index 7b5905529146..15597f5f504f 100644
> --- a/arch/riscv/Kconfig
> +++ b/arch/riscv/Kconfig
> @@ -37,6 +37,7 @@ config RISCV
> select GENERIC_ARCH_TOPOLOGY if SMP
> select GENERIC_ATOMIC64 if !64BIT
> select GENERIC_CLOCKEVENTS
> + select GENERIC_EARLY_IOREMAP
> select GENERIC_GETTIMEOFDAY if HAVE_GENERIC_VDSO
> select GENERIC_IOREMAP
> select GENERIC_IRQ_MULTI_HANDLER
> diff --git a/arch/riscv/include/asm/Kbuild b/arch/riscv/include/asm/Kbuild
> index 3d9410bb4de0..59dd7be55005 100644
> --- a/arch/riscv/include/asm/Kbuild
> +++ b/arch/riscv/include/asm/Kbuild
> @@ -1,4 +1,5 @@
> # SPDX-License-Identifier: GPL-2.0
> +generic-y += early_ioremap.h
> generic-y += extable.h
> generic-y += flat.h
> generic-y += kvm_para.h
> diff --git a/arch/riscv/include/asm/fixmap.h b/arch/riscv/include/asm/fixmap.h
> index 11613f38228a..54cbf07fb4e9 100644
> --- a/arch/riscv/include/asm/fixmap.h
> +++ b/arch/riscv/include/asm/fixmap.h
> @@ -27,6 +27,19 @@ enum fixed_addresses {
> FIX_TEXT_POKE1,
> FIX_TEXT_POKE0,
> FIX_EARLYCON_MEM_BASE,
> +
> + __end_of_permanent_fixed_addresses,
> + /*
> + * Temporary boot-time mappings, used by early_ioremap(),
> + * before ioremap() is functional.
> + */
> +#define NR_FIX_BTMAPS (SZ_256K / PAGE_SIZE)
> +#define FIX_BTMAPS_SLOTS 7
> +#define TOTAL_FIX_BTMAPS (NR_FIX_BTMAPS * FIX_BTMAPS_SLOTS)
> +
> + FIX_BTMAP_END = __end_of_permanent_fixed_addresses,
> + FIX_BTMAP_BEGIN = FIX_BTMAP_END + TOTAL_FIX_BTMAPS - 1,
> +
> __end_of_fixed_addresses
> };
>
> diff --git a/arch/riscv/include/asm/io.h b/arch/riscv/include/asm/io.h
> index 3835c3295dc5..c025a746a148 100644
> --- a/arch/riscv/include/asm/io.h
> +++ b/arch/riscv/include/asm/io.h
> @@ -14,6 +14,7 @@
> #include <linux/types.h>
> #include <linux/pgtable.h>
> #include <asm/mmiowb.h>
> +#include <asm/early_ioremap.h>
>
> /*
> * MMIO access functions are separated out to break dependency cycles
> diff --git a/arch/riscv/kernel/setup.c b/arch/riscv/kernel/setup.c
> index 6a0ee2405813..c71788e6aff4 100644
> --- a/arch/riscv/kernel/setup.c
> +++ b/arch/riscv/kernel/setup.c
> @@ -72,6 +72,7 @@ void __init setup_arch(char **cmdline_p)
>
> *cmdline_p = boot_command_line;
>
> + early_ioremap_setup();
> parse_early_param();
>
> setup_bootmem();
> diff --git a/arch/riscv/mm/init.c b/arch/riscv/mm/init.c
> index 2b651f63f5c4..b75ebe8e7a92 100644
> --- a/arch/riscv/mm/init.c
> +++ b/arch/riscv/mm/init.c
> @@ -403,6 +403,9 @@ asmlinkage void __init setup_vm(uintptr_t dtb_pa)
> uintptr_t load_pa = (uintptr_t)(&_start);
> uintptr_t load_sz = (uintptr_t)(&_end) - load_pa;
> uintptr_t map_size = best_map_size(load_pa, MAX_EARLY_MAPPING_SIZE);
> +#ifndef __PAGETABLE_PMD_FOLDED
> + pmd_t fix_bmap_spmd, fix_bmap_epmd;
> +#endif
>
> va_pa_offset = PAGE_OFFSET - load_pa;
> pfn_base = PFN_DOWN(load_pa);
> @@ -456,6 +459,36 @@ asmlinkage void __init setup_vm(uintptr_t dtb_pa)
> pa + PGDIR_SIZE, PGDIR_SIZE, PAGE_KERNEL);
> dtb_early_va = (void *)DTB_EARLY_BASE_VA + (dtb_pa & (PGDIR_SIZE - 1));
> dtb_early_pa = dtb_pa;
> +
> + /*
> + * Boot-time fixmap can only handle a PMD_SIZE mapping. Thus, the
> + * boot-ioremap range cannot span multiple pmds.
> + */
> + BUILD_BUG_ON((__fix_to_virt(FIX_BTMAP_BEGIN) >> PMD_SHIFT)
> + != (__fix_to_virt(FIX_BTMAP_END) >> PMD_SHIFT));
> +
> +#ifndef __PAGETABLE_PMD_FOLDED
> + /*
> + * Early ioremap fixmap is already created as it lies within first 2MB
> + * of fixmap region. We always map PMD_SIZE. Thus, both FIX_BTMAP_END
> + * and FIX_BTMAP_BEGIN should lie in the same pmd. Verify that and warn
> + * the user if not.
> + */
> + fix_bmap_spmd = fixmap_pmd[pmd_index(__fix_to_virt(FIX_BTMAP_BEGIN))];
> + fix_bmap_epmd = fixmap_pmd[pmd_index(__fix_to_virt(FIX_BTMAP_END))];
> + if (pmd_val(fix_bmap_spmd) != pmd_val(fix_bmap_epmd)) {
> + WARN_ON(1);
> + pr_warn("fixmap btmap start [%08lx] != end [%08lx]\n",
> + pmd_val(fix_bmap_spmd), pmd_val(fix_bmap_epmd));
> + pr_warn("fix_to_virt(FIX_BTMAP_BEGIN): %08lx\n",
> + fix_to_virt(FIX_BTMAP_BEGIN));
> + pr_warn("fix_to_virt(FIX_BTMAP_END): %08lx\n",
> + fix_to_virt(FIX_BTMAP_END));
> +
> + pr_warn("FIX_BTMAP_END: %d\n", FIX_BTMAP_END);
> + pr_warn("FIX_BTMAP_BEGIN: %d\n", FIX_BTMAP_BEGIN);
> + }
> +#endif
> }
>
> static void __init setup_vm_final(void)
Reviewed-by: Palmer Dabbelt <[email protected]>
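The same-PMD condition enforced by the BUILD_BUG_ON() above can be reproduced with plain integer arithmetic. The sketch below assumes 4KB pages, a 2MB PMD, the generic fixmap rule that index N maps to FIXADDR_TOP - (N << PAGE_SHIFT), and a PMD-aligned FIXADDR_TOP whose value here is hypothetical. The index 6 used in the usage note is the count of permanent entries visible in the quoted enum (FIX_HOLE through FIX_EARLYCON_MEM_BASE); btmap_in_one_pmd() is a helper name made up for this example.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT 12
#define PAGE_SIZE (UINT64_C(1) << PAGE_SHIFT)
#define PMD_SHIFT 21
#define SZ_256K (UINT64_C(256) * 1024)
#define NR_FIX_BTMAPS (SZ_256K / PAGE_SIZE) /* pages per early_ioremap slot */

/* Hypothetical, PMD-aligned top of the fixmap region. */
#define FIXADDR_TOP UINT64_C(0x80000000)

/* Generic fixmap rule: higher indices map to lower virtual addresses. */
static uint64_t fix_to_virt(uint64_t idx)
{
	return FIXADDR_TOP - (idx << PAGE_SHIFT);
}

/*
 * first_free_idx plays the role of __end_of_permanent_fixed_addresses.
 * Returns 1 if FIX_BTMAP_BEGIN and FIX_BTMAP_END share one PMD.
 */
static int btmap_in_one_pmd(uint64_t first_free_idx, uint64_t slots)
{
	uint64_t end = first_free_idx;
	uint64_t begin = end + NR_FIX_BTMAPS * slots - 1;

	assert(slots > 0);
	return (fix_to_virt(begin) >> PMD_SHIFT) ==
	       (fix_to_virt(end) >> PMD_SHIFT);
}
```

Under these assumptions, btmap_in_one_pmd(6, 7) holds because 7 slots of 64 pages end at index 453, inside the 512 pages of one PMD, whereas 8 slots would reach index 517 and spill into a second PMD.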