2018-07-24 06:58:14

by AKASHI Takahiro

Subject: [PATCH v12 00/16] arm64: kexec: add kexec_file_load() support

This is the twelfth round of implementing kexec_file_load() support
on arm64.[1] (See "Changes" below)
Most of the code is based on kexec-tools.


This patch series enables us to
* load the kernel by specifying its file descriptor, instead of a user-
filled buffer, via the kexec_file_load() system call, and
* optionally verify its signature at load time for trusted boot.
Kernel virtual address randomization is also supported since v9.
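
As a rough sketch of what the first point means for userspace, here is a
minimal example of invoking the system call directly (the paths and command
line are illustrative only; if your libc does not define SYS_kexec_file_load
yet, the syscall number 293 added in patch #1 can be used instead). Normally
kexec-tools does all of this for you:

#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/syscall.h>

int main(void)
{
	const char *cmdline = "console=ttyAMA0 root=/dev/vda2";
	int kernel_fd = open("/boot/Image", O_RDONLY);
	int initrd_fd = open("/boot/initrd.img", O_RDONLY);

	if (kernel_fd < 0 || initrd_fd < 0) {
		perror("open");
		return 1;
	}

	/* cmdline_len must include the terminating NUL */
	if (syscall(SYS_kexec_file_load, kernel_fd, initrd_fd,
		    strlen(cmdline) + 1, cmdline, 0UL) < 0) {
		perror("kexec_file_load");
		return 1;
	}

	return 0;	/* then reboot into the loaded kernel with "kexec -e" */
}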

In contrast to the kexec_load() system call, and as we discussed a long
time ago, users are not allowed to provide a device tree for the 2nd kernel
explicitly; the dt blob of the first kernel is enforced to be re-used
internally.

To make the kexec command use the kexec_file_load() system call instead of
kexec_load(), the '-s' option must be specified. See [2] for the necessary
patch for kexec-tools.
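
For example (the image path, initrd path and command line below are only
illustrative):
$ kexec -s -l /boot/Image --initrd=/boot/initrd.img \
      --command-line="console=ttyAMA0 root=/dev/vda2"
$ kexec -e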

To analyze a generated crash dump file, use the latest master branch of
the crash utility[3]. I always try to submit patches to fix any
inconsistencies introduced in the latest kernel.

Regarding kernel image verification, a signature must be presented
along with the binary itself. A signature is basically a hash value
calculated over the whole binary data and encrypted with a key that
is authenticated by one of the system's trusted certificates.
Any attempt to read and load a to-be-kexec-ed kernel image through
the system call will be checked and blocked if the binary's hash value
doesn't match its associated signature.

There are two methods available now:
1. implementing an arch-specific verification hook for kexec_file_load()
2. utilizing the IMA (Integrity Measurement Architecture)[4] appraisal framework

Before my v7, I believed that my patches only supported (1), but I am now
confident that (2) comes for free if IMA is enabled and properly configured.


(1) Arch-specific verification hook
If CONFIG_KEXEC_VERIFY_SIG is enabled, kexec_file_load() invokes an arch-
defined (and hence file-format-specific) hook function to check the
validity of the kernel binary.

On x86, a signature is embedded into the PE file (Microsoft's format)
header of the binary. Since arm64's "Image" can also be seen as a PE file
as long as CONFIG_EFI is enabled, we adopt this format for kernel signing.

As in the case of UEFI applications, we can create a signed kernel image:
$ sbsign --key ${KEY} --cert ${CERT} Image

You may want to use certs/signing_key.pem, which is intended to be used
for module signing (CONFIG_MODULE_SIG), as ${KEY} and ${CERT} for test
purposes.
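
If the companion sbverify tool is available, you can sanity-check the
resulting signature before trying to kexec the image:
$ sbverify --cert ${CERT} Image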


(2) IMA appraisal-based
IMA was first introduced in Linux in order to meet the TCG (Trusted
Computing Group) requirement that all sensitive files be *measured*
before being read/executed, so that any untrusted changes/modifications
can be detected. The appraisal feature, which allows us to ensure the
integrity of files and even prevent them from being read/executed, was
added later.

Meanwhile, kexec_file_load() was merged in v3.17 and evolved to enable
IMA-appraisal-type verification with commit b804defe4297 ("kexec:
replace call to copy_file_from_fd() with kernel version").

In this scheme, a signature is stored in an extended file attribute,
"security.ima", while the verification key is held in a dedicated keyring,
".ima" or "_ima". The whole verification process is confined to a secure
API, kernel_read_file_from_fd(), called by kexec_file_load().
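
For example, once an image has been signed and the key loaded (see steps
3) and 6) in the procedure below), both can be inspected with (the path is
illustrative):
$ getfattr -m security.ima -d /your/Image
$ keyctl show %:.ima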

Please note that powerpc is one of the two architectures currently
supporting KEXEC_FILE, and that it wishes to extend IMA so that a
signature may be appended to the "vmlinux" file[5], like module
signing, instead of using an extended file attribute.

While IMA is meant to be used with a TPM (Trusted Platform Module) on a
secure platform, it is still usable without a TPM. Here is an example
procedure for trying out the feature with a self-signed root CA for
demo/test purposes:

1) Generate the needed keys and certificates, following the "Generate
trusted keys" section in the README of ima-evm-utils[6].

2) Build the kernel with the following kernel configurations, specifying
"ima-local-ca.pem" for CONFIG_SYSTEM_TRUSTED_KEYS:
CONFIG_EXT4_FS_SECURITY
CONFIG_INTEGRITY_SIGNATURE
CONFIG_INTEGRITY_ASYMMETRIC_KEYS
CONFIG_INTEGRITY_TRUSTED_KEYRING
CONFIG_IMA
CONFIG_IMA_WRITE_POLICY
CONFIG_IMA_READ_POLICY
CONFIG_IMA_APPRAISE
CONFIG_IMA_APPRAISE_BOOTPARAM
CONFIG_SYSTEM_TRUSTED_KEYS
Please note that CONFIG_KEXEC_VERIFY_SIG is not (and in fact should
not be) enabled.

3) Sign (label) the kernel image binary to be kexec-ed, on the target filesystem:
$ evmctl ima_sign --key /path/to/private_key.pem /your/Image

4) Add a command line parameter and boot the kernel:
ima_appraise=enforce

On the live system,
5) Set a security policy:
$ mount -t securityfs none /sys/kernel/security
$ echo "appraise func=KEXEC_KERNEL_CHECK appraise_type=imasig" \
> /sys/kernel/security/ima/policy

6) Add a key for ima:
$ keyctl padd asymmetric my_ima_key %:.ima < /path/to/x509_ima.der
(or evmctl import /path/to/x509_ima.der <ima_keyring_id>)

7) Then try kexec as normal.


Concerns (or future work):
* Support for physical address randomization
* Signature verification of a big-endian kernel with CONFIG_KEXEC_VERIFY_SIG
While a big-endian kernel can support kernel signing, I'm not sure that
its Image can be recognized as a PE file because the PE standard only
defines a little-endian format.
* Support for vmlinux loading

[1] http://git.linaro.org/people/takahiro.akashi/linux-aarch64.git
branch:arm64/kexec_file
[2] http://git.linaro.org/people/takahiro.akashi/kexec-tools.git
branch:arm64/kexec_file
[3] http://github.com/crash-utility/crash.git
[4] https://sourceforge.net/p/linux-ima/wiki/Home/
[5] http://lkml.iu.edu//hypermail/linux/kernel/1707.0/03669.html
[6] https://sourceforge.net/p/linux-ima/ima-evm-utils/ci/master/tree/


Changes in v12 (July 24, 2018)
(mostly addressing James' comments)
* unify all the variants of arch_kexec_walk_mem(), including s390's, into
common code, leaving arch_kexec_walk_mem() static (i.e. no longer
replaceable)
* always initialize kbuf.mem to zero to align with a change above
* set kbuf.buf_min/buf_max consistently between kexec and kdump
* try to consistently use "unsigned long" for physical (kexec-time)
address, and "void *" for virtual (runtime) address in
load_other_segments() with a couple of variables renamed for readability
* fix a 'sparse' warning against arch_kimage_file_post_load_cleanup()
* fix the calculation of the string length of "ARM64_MAGIC"
* set kernel image alignment to MIN_KIMG_ALIGN rather than SZ_2M
* set elf header alignment to SZ_64K rather than SZ_4K


Changes in v11 (July 11, 2018)
* split v10's patch #3, a refactoring patch, into two parts: "just move"
and "modify"
* remove selecting BUILD_BIN2C from KEXEC_FILE config
* modify setup_dtb()
* to correct a return value on failure of fdt_xyz() call,
* to always remove existing bootargs and initrd-start/end properties,
if any, when copying current system's dtb into new dtb
* to use fdt_setprop_string() for bootargs (I'm now sure that
kimage->cmdline_buf is a null-terminated string.)
* revise a warning comment in case of KEXEC_VERIFY_SIG but
!(EFI && SIGNED_PE_FILE_VERIFICATION)

Changes in v10 (June 23, 2018)
* rebased to v4.18-rc
* change syscall number of kexec_file_load from 292 to 293
* factor out memblock-based arch_kexec_walk_mem() from powerpc and
merge it into generic one
* move generic fdt helper functions from arm64 dir to drivers/of
(dt_root_[addr|size]_cells are no longer __initdata.)
* modify fill_property() to use 'while' loop
* modify fdt_setprop_reg() to allocate a buffer on stack
* modify setup_dtb() to use fdt_setprop_u64()
* pass kernel_load_addr/size directly as arguments, instead of via
kimage_arch.kern_segment, at load_other_segments()
* refuse, in the image loader, to load an image which cannot be supported,
adding cpu-feature (MMFR0) helper functions
* modify prepare_elf_headers() to use kmalloc() instead of vmalloc()
* always pass arch.dtb_mem as the fourth argument to cpu_soft_restart()
in machine_kexec() while dtb_mem will be zero in kexec case

Changes in v9 (April 25, 2018)
* rebased to v4.17-rc
* remove preparatory patches on generic/x86/ppc code
They have now been merged in v4.17-rc1.
* allocate memory based on memblock list instead of system resources
This will prevent reserved regions, particularly UEFI/ACPI data,
from being corrupted.
* correct the dt property names, linux,initrd-*, in the newly-created dtb
("linux," was missing)
* remove alignment requirement for initrd loading
* add kaslr (kernel virtual address randomization) support
* misc code clean-up
* revise commit messages

Changes in v8 (Feb 22, 2018)
* introduce ARCH_HAS_KEXEC_PURGATORY so that arm64 will be able to skip
purgatory
* remove "ifdef CONFIG_X86_64" stuffs from a re-factored function,
prepare_elf64_headers(), making its interface more generic
(The original patch was split into two for easier reviews.)
* modify cpu_soft_restart() so as to let the 2nd kernel jump into its entry
code directly without requiring purgatory in case of kexec_file_load
* remove CONFIG_KEXEC_FILE_IMAGE_FMT and introduce
CONFIG_KEXEC_IMAGE_VERIFY_SIG, much like x86's, though quite redundant
for now.
* In addition, update/modify dependencies of KEXEC_IMAGE_VERIFY_SIG

Changes in v7 (Dec 4, 2017)
* rebased to v4.15-rc2
* re-organize the patch set to separate KEXEC_FILE_VERIFY_SIG-related
code from the others
* revamp factored-out code in kernel/kexec_file.c due to the changes
in original x86 code
* redefine walk_sys_ram_res_rev() prototype due to change of callback
type in the counterpart, walk_sys_ram_res()
* make KEXEC_FILE_IMAGE_FMT default on if KEXEC_FILE selected

Changes in v6 (Oct 24, 2017)
* fix a for-loop bug in _kexec_kernel_image_probe() per Julien

Changes in v5 (Oct 10, 2017)
* fix kbuild errors around patch #3
per Julien's comments,
* fix a bug in walk_system_ram_res_rev() with some cleanup
* modify fdt_setprop_range() to use vmalloc()
* modify fill_property() to use memset()

Changes in v4 (Oct 2, 2017)
* reinstate x86's arch_kexec_kernel_image_load()
* rename weak arch_kexec_kernel_xxx() to _kexec_kernel_xxx() for
better re-use
* constify kexec_file_loaders[]

Changes in v3 (Sep 15, 2017)
* fix kbuild test error
* factor out arch_kexec_kernel_*() & arch_kimage_file_post_load_cleanup()
* remove CONFIG_CRASH_CORE guard from kexec_file.c
* add vmapped kernel region to vmcore for gdb backtracing
(see prepare_elf64_headers())
* merge asm/kexec_file.h into asm/kexec.h
* and some cleanups

Changes in v2 (Sep 8, 2017)
* move core-header-related functions from crash_core.c to kexec_file.c
* drop hash-check code from purgatory
* modify purgatory asm to remove arch_kexec_apply_relocations_add()
* drop older kernel support
* drop vmlinux support (at least, for this series)


Patches #1 to #10 are the essential part of KEXEC_FILE support
(additionally allowing for IMA-based verification):
Patches #1 to #6 are all preparatory patches on the generic side.
Patches #7 to #11 enable kexec_file_load on arm64.

Patches #12 to #13 are for KEXEC_VERIFY_SIG (arch-specific verification)
support.

AKASHI Takahiro (16):
asm-generic: add kexec_file_load system call to unistd.h
kexec_file: make kexec_image_post_load_cleanup_default() global
s390, kexec_file: drop arch_kexec_mem_walk()
powerpc, kexec_file: factor out memblock-based arch_kexec_walk_mem()
kexec_file: kexec_walk_memblock() only walks a dedicated region at
kdump
of/fdt: add helper functions for handling properties
arm64: add image head flag definitions
arm64: cpufeature: add MMFR0 helper functions
arm64: enable KEXEC_FILE config
arm64: kexec_file: load initrd and device-tree
arm64: kexec_file: allow for loading Image-format kernel
arm64: kexec_file: add crash dump support
arm64: kexec_file: invoke the kernel without purgatory
include: pe.h: remove message[] from mz header definition
arm64: kexec_file: add kernel signature verification support
arm64: kexec_file: add kaslr support

arch/arm64/Kconfig | 33 ++
arch/arm64/include/asm/boot.h | 15 +
arch/arm64/include/asm/cpufeature.h | 48 +++
arch/arm64/include/asm/kexec.h | 49 +++
arch/arm64/kernel/Makefile | 3 +-
arch/arm64/kernel/cpu-reset.S | 8 +-
arch/arm64/kernel/head.S | 2 +-
arch/arm64/kernel/kexec_image.c | 123 ++++++++
arch/arm64/kernel/machine_kexec.c | 12 +-
arch/arm64/kernel/machine_kexec_file.c | 314 ++++++++++++++++++++
arch/arm64/kernel/relocate_kernel.S | 3 +-
arch/powerpc/kernel/machine_kexec_file_64.c | 54 ----
arch/s390/kernel/machine_kexec_file.c | 10 -
drivers/of/fdt.c | 62 +++-
include/linux/kexec.h | 3 +-
include/linux/of_fdt.h | 10 +-
include/linux/pe.h | 2 +-
include/uapi/asm-generic/unistd.h | 4 +-
kernel/kexec_file.c | 67 ++++-
19 files changed, 738 insertions(+), 84 deletions(-)
create mode 100644 arch/arm64/kernel/kexec_image.c
create mode 100644 arch/arm64/kernel/machine_kexec_file.c

--
2.18.0



2018-07-24 06:58:32

by AKASHI Takahiro

Subject: [PATCH v12 01/16] asm-generic: add kexec_file_load system call to unistd.h

The initial user of this system call number is arm64.

Signed-off-by: AKASHI Takahiro <[email protected]>
Acked-by: Arnd Bergmann <[email protected]>
---
include/uapi/asm-generic/unistd.h | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/include/uapi/asm-generic/unistd.h b/include/uapi/asm-generic/unistd.h
index 42990676a55e..c81f4a0df51f 100644
--- a/include/uapi/asm-generic/unistd.h
+++ b/include/uapi/asm-generic/unistd.h
@@ -734,9 +734,11 @@ __SYSCALL(__NR_pkey_free, sys_pkey_free)
__SYSCALL(__NR_statx, sys_statx)
#define __NR_io_pgetevents 292
__SC_COMP(__NR_io_pgetevents, sys_io_pgetevents, compat_sys_io_pgetevents)
+#define __NR_kexec_file_load 293
+__SYSCALL(__NR_kexec_file_load, sys_kexec_file_load)

#undef __NR_syscalls
-#define __NR_syscalls 293
+#define __NR_syscalls 294

/*
* 32 bit systems traditionally used different
--
2.18.0


2018-07-24 06:59:02

by AKASHI Takahiro

Subject: [PATCH v12 03/16] s390, kexec_file: drop arch_kexec_mem_walk()

Since s390 already knows where to locate buffers, calling
arch_kexec_mem_walk() makes no sense. So we can just drop it, as kbuf->mem
indicates this, while all other architectures set it to 0 initially.

This change is preparatory work for the next patch, where all the
variant memory walks, whether based on system resources or on memblock,
will be put in one common place so that it satisfies all architectures'
needs.

Signed-off-by: AKASHI Takahiro <[email protected]>
Cc: Martin Schwidefsky <[email protected]>
Cc: Heiko Carstens <[email protected]>
Cc: Dave Young <[email protected]>
Cc: Vivek Goyal <[email protected]>
Cc: Baoquan He <[email protected]>
---
arch/s390/kernel/machine_kexec_file.c | 10 ----------
kernel/kexec_file.c | 4 ++++
2 files changed, 4 insertions(+), 10 deletions(-)

diff --git a/arch/s390/kernel/machine_kexec_file.c b/arch/s390/kernel/machine_kexec_file.c
index f413f57f8d20..32023b4f9dc0 100644
--- a/arch/s390/kernel/machine_kexec_file.c
+++ b/arch/s390/kernel/machine_kexec_file.c
@@ -134,16 +134,6 @@ int kexec_file_add_initrd(struct kimage *image, struct s390_load_data *data,
return ret;
}

-/*
- * The kernel is loaded to a fixed location. Turn off kexec_locate_mem_hole
- * and provide kbuf->mem by hand.
- */
-int arch_kexec_walk_mem(struct kexec_buf *kbuf,
- int (*func)(struct resource *, void *))
-{
- return 1;
-}
-
int arch_kexec_apply_relocations_add(struct purgatory_info *pi,
Elf_Shdr *section,
const Elf_Shdr *relsec,
diff --git a/kernel/kexec_file.c b/kernel/kexec_file.c
index 63c7ce1c0c3e..bf39df5e5bb9 100644
--- a/kernel/kexec_file.c
+++ b/kernel/kexec_file.c
@@ -534,6 +534,10 @@ int kexec_locate_mem_hole(struct kexec_buf *kbuf)
{
int ret;

+ /* Arch knows where to place */
+ if (kbuf->mem)
+ return 0;
+
ret = arch_kexec_walk_mem(kbuf, locate_mem_hole_callback);

return ret == 1 ? 0 : -EADDRNOTAVAIL;
--
2.18.0


2018-07-24 06:59:20

by AKASHI Takahiro

Subject: [PATCH v12 05/16] kexec_file: kexec_walk_memblock() only walks a dedicated region at kdump

In the kdump case, there exists only one dedicated memblock region as
usable memory (crashk_res). With this patch, kexec_walk_memblock() runs
a given callback function on this region only.

Signed-off-by: AKASHI Takahiro <[email protected]>
Cc: Dave Young <[email protected]>
Cc: Vivek Goyal <[email protected]>
Cc: Baoquan He <[email protected]>
---
kernel/kexec_file.c | 3 +++
1 file changed, 3 insertions(+)

diff --git a/kernel/kexec_file.c b/kernel/kexec_file.c
index 2f0691b0f8ad..bb23f9280a7a 100644
--- a/kernel/kexec_file.c
+++ b/kernel/kexec_file.c
@@ -511,6 +511,9 @@ static int kexec_walk_memblock(struct kexec_buf *kbuf,
phys_addr_t mstart, mend;
struct resource res = { };

+ if (kbuf->image->type == KEXEC_TYPE_CRASH)
+ return func(&crashk_res, kbuf);
+
if (kbuf->top_down) {
for_each_free_mem_range_reverse(i, NUMA_NO_NODE, 0,
&mstart, &mend, NULL) {
--
2.18.0


2018-07-24 06:59:30

by AKASHI Takahiro

Subject: [PATCH v12 06/16] of/fdt: add helper functions for handling properties

These functions will be used later to handle kexec-specific properties
in arm64's kexec_file implementation.
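
For reference, a minimal usage sketch, assuming a duplicated blob with
enough free space and a valid node offset (add_range_prop() is a
hypothetical wrapper; the real callers appear later in this series):

#include <linux/errno.h>
#include <linux/of_fdt.h>
#include <linux/types.h>

static int add_range_prop(void *fdt, int nodeoffset, u64 addr, u64 size)
{
	/* refuse values that don't fit the root node's #address/#size-cells */
	if (!of_fdt_cells_size_fitted(addr, size))
		return -EINVAL;

	/*
	 * The blob must have been grown by at least
	 * fdt_prop_len("linux,elfcorehdr", of_fdt_reg_cells_size())
	 * bytes beforehand.
	 */
	return fdt_setprop_reg(fdt, nodeoffset, "linux,elfcorehdr",
			       addr, size);
}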

Signed-off-by: AKASHI Takahiro <[email protected]>
Cc: Rob Herring <[email protected]>
Cc: Frank Rowand <[email protected]>
---
drivers/of/fdt.c | 62 ++++++++++++++++++++++++++++++++++++++++--
include/linux/of_fdt.h | 10 +++++--
2 files changed, 68 insertions(+), 4 deletions(-)

diff --git a/drivers/of/fdt.c b/drivers/of/fdt.c
index 6da20b9688f7..f7c9d69ce86c 100644
--- a/drivers/of/fdt.c
+++ b/drivers/of/fdt.c
@@ -25,6 +25,7 @@
#include <linux/debugfs.h>
#include <linux/serial_core.h>
#include <linux/sysfs.h>
+#include <linux/types.h>

#include <asm/setup.h> /* for COMMAND_LINE_SIZE */
#include <asm/page.h>
@@ -537,8 +538,8 @@ void *of_fdt_unflatten_tree(const unsigned long *blob,
EXPORT_SYMBOL_GPL(of_fdt_unflatten_tree);

/* Everything below here references initial_boot_params directly. */
-int __initdata dt_root_addr_cells;
-int __initdata dt_root_size_cells;
+int dt_root_addr_cells;
+int dt_root_size_cells;

void *initial_boot_params;

@@ -1330,3 +1331,60 @@ late_initcall(of_fdt_raw_init);
#endif

#endif /* CONFIG_OF_EARLY_FLATTREE */
+
+bool of_fdt_cells_size_fitted(u64 base, u64 size)
+{
+ /* if *_cells >= 2, cells can hold 64-bit values anyway */
+ if ((dt_root_addr_cells == 1) && (base > U32_MAX))
+ return false;
+
+ if ((dt_root_size_cells == 1) && (size > U32_MAX))
+ return false;
+
+ return true;
+}
+
+size_t of_fdt_reg_cells_size(void)
+{
+ return (dt_root_addr_cells + dt_root_size_cells) * sizeof(u32);
+}
+
+#define FDT_ALIGN(x, a) (((x) + (a) - 1) & ~((a) - 1))
+#define FDT_TAGALIGN(x) (FDT_ALIGN((x), FDT_TAGSIZE))
+
+int fdt_prop_len(const char *prop_name, int len)
+{
+ return (strlen(prop_name) + 1) +
+ sizeof(struct fdt_property) +
+ FDT_TAGALIGN(len);
+}
+
+static void fill_property(void *buf, u64 val64, int cells)
+{
+ __be32 val32;
+
+ while (cells) {
+ val32 = cpu_to_fdt32((val64 >> (32 * (--cells))) & U32_MAX);
+ memcpy(buf, &val32, sizeof(val32));
+ buf += sizeof(val32);
+ }
+}
+
+int fdt_setprop_reg(void *fdt, int nodeoffset, const char *name,
+ u64 addr, u64 size)
+{
+ char buf[sizeof(__be32) * 2 * 2];
+ /* assume dt_root_[addr|size]_cells <= 2 */
+ void *prop;
+ size_t buf_size;
+
+ buf_size = of_fdt_reg_cells_size();
+ prop = buf;
+
+ fill_property(prop, addr, dt_root_addr_cells);
+ prop += dt_root_addr_cells * sizeof(u32);
+
+ fill_property(prop, size, dt_root_size_cells);
+
+ return fdt_setprop(fdt, nodeoffset, name, buf, buf_size);
+}
diff --git a/include/linux/of_fdt.h b/include/linux/of_fdt.h
index b9cd9ebdf9b9..9615d6142578 100644
--- a/include/linux/of_fdt.h
+++ b/include/linux/of_fdt.h
@@ -37,8 +37,8 @@ extern void *of_fdt_unflatten_tree(const unsigned long *blob,
struct device_node **mynodes);

/* TBD: Temporary export of fdt globals - remove when code fully merged */
-extern int __initdata dt_root_addr_cells;
-extern int __initdata dt_root_size_cells;
+extern int dt_root_addr_cells;
+extern int dt_root_size_cells;
extern void *initial_boot_params;

extern char __dtb_start[];
@@ -108,5 +108,11 @@ static inline void unflatten_device_tree(void) {}
static inline void unflatten_and_copy_device_tree(void) {}
#endif /* CONFIG_OF_EARLY_FLATTREE */

+bool of_fdt_cells_size_fitted(u64 base, u64 size);
+size_t of_fdt_reg_cells_size(void);
+int fdt_prop_len(const char *prop_name, int len);
+int fdt_setprop_reg(void *fdt, int nodeoffset, const char *name,
+ u64 addr, u64 size);
+
#endif /* __ASSEMBLY__ */
#endif /* _LINUX_OF_FDT_H */
--
2.18.0


2018-07-24 06:59:38

by AKASHI Takahiro

Subject: [PATCH v12 07/16] arm64: add image head flag definitions

These image head flags will be used later by the kexec_file loader.
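
As an illustration only, a loader could decode them along these lines (the
helper names are hypothetical; the real consumer is the Image loader added
later in this series):

#include <linux/types.h>
#include <asm/boot.h>

/* true if the image was built for big-endian CPUs */
static bool image_is_big_endian(u64 flags)
{
	return head_flag_field(flags, HEAD_FLAG_BE) == HEAD_FLAG_BE;
}

/* true if the image was built with 4KB page support */
static bool image_uses_4k_pages(u64 flags)
{
	return head_flag_field(flags, HEAD_FLAG_PAGE_SIZE) ==
	       HEAD_FLAG_PAGE_SIZE_4K;
}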

Signed-off-by: AKASHI Takahiro <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Will Deacon <[email protected]>
Acked-by: James Morse <[email protected]>
---
arch/arm64/include/asm/boot.h | 15 +++++++++++++++
arch/arm64/kernel/head.S | 2 +-
2 files changed, 16 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/include/asm/boot.h b/arch/arm64/include/asm/boot.h
index 355e552a9175..0bab7eed3012 100644
--- a/arch/arm64/include/asm/boot.h
+++ b/arch/arm64/include/asm/boot.h
@@ -5,6 +5,21 @@

#include <asm/sizes.h>

+#define ARM64_MAGIC "ARM\x64"
+
+#define HEAD_FLAG_BE_SHIFT 0
+#define HEAD_FLAG_PAGE_SIZE_SHIFT 1
+#define HEAD_FLAG_BE_MASK 0x1
+#define HEAD_FLAG_PAGE_SIZE_MASK 0x3
+
+#define HEAD_FLAG_BE 1
+#define HEAD_FLAG_PAGE_SIZE_4K 1
+#define HEAD_FLAG_PAGE_SIZE_16K 2
+#define HEAD_FLAG_PAGE_SIZE_64K 3
+
+#define head_flag_field(flags, field) \
+ (((flags) >> field##_SHIFT) & field##_MASK)
+
/*
* arm64 requires the DTB to be 8 byte aligned and
* not exceed 2MB in size.
diff --git a/arch/arm64/kernel/head.S b/arch/arm64/kernel/head.S
index b0853069702f..8cbac6232ed1 100644
--- a/arch/arm64/kernel/head.S
+++ b/arch/arm64/kernel/head.S
@@ -91,7 +91,7 @@ _head:
.quad 0 // reserved
.quad 0 // reserved
.quad 0 // reserved
- .ascii "ARM\x64" // Magic number
+ .ascii ARM64_MAGIC // Magic number
#ifdef CONFIG_EFI
.long pe_header - _head // Offset to the PE header.

--
2.18.0


2018-07-24 06:59:45

by AKASHI Takahiro

Subject: [PATCH v12 04/16] powerpc, kexec_file: factor out memblock-based arch_kexec_walk_mem()

Memblock list is another source for usable system memory layout.
So move powerpc's arch_kexec_walk_mem() to common code so that other
memblock-based architectures, particularly arm64, can also utilise it.
A moved function is now renamed to kexec_walk_memblock() and integrated
into kexec_locate_mem_hole(), which will now be usable for all
architectures with no need for overriding arch_kexec_walk_mem().

kexec_walk_memblock() will not work for kdump in this form, this will be
fixed in the next patch.

Signed-off-by: AKASHI Takahiro <[email protected]>
Cc: "Eric W. Biederman" <[email protected]>
Cc: Dave Young <[email protected]>
Cc: Vivek Goyal <[email protected]>
Cc: Baoquan He <[email protected]>
Acked-by: James Morse <[email protected]>
---
arch/powerpc/kernel/machine_kexec_file_64.c | 54 -------------------
include/linux/kexec.h | 2 -
kernel/kexec_file.c | 58 ++++++++++++++++++++-
3 files changed, 56 insertions(+), 58 deletions(-)

diff --git a/arch/powerpc/kernel/machine_kexec_file_64.c b/arch/powerpc/kernel/machine_kexec_file_64.c
index 0bd23dc789a4..5357b09902c5 100644
--- a/arch/powerpc/kernel/machine_kexec_file_64.c
+++ b/arch/powerpc/kernel/machine_kexec_file_64.c
@@ -24,7 +24,6 @@

#include <linux/slab.h>
#include <linux/kexec.h>
-#include <linux/memblock.h>
#include <linux/of_fdt.h>
#include <linux/libfdt.h>
#include <asm/ima.h>
@@ -46,59 +45,6 @@ int arch_kexec_kernel_image_probe(struct kimage *image, void *buf,
return kexec_image_probe_default(image, buf, buf_len);
}

-/**
- * arch_kexec_walk_mem - call func(data) for each unreserved memory block
- * @kbuf: Context info for the search. Also passed to @func.
- * @func: Function to call for each memory block.
- *
- * This function is used by kexec_add_buffer and kexec_locate_mem_hole
- * to find unreserved memory to load kexec segments into.
- *
- * Return: The memory walk will stop when func returns a non-zero value
- * and that value will be returned. If all free regions are visited without
- * func returning non-zero, then zero will be returned.
- */
-int arch_kexec_walk_mem(struct kexec_buf *kbuf,
- int (*func)(struct resource *, void *))
-{
- int ret = 0;
- u64 i;
- phys_addr_t mstart, mend;
- struct resource res = { };
-
- if (kbuf->top_down) {
- for_each_free_mem_range_reverse(i, NUMA_NO_NODE, 0,
- &mstart, &mend, NULL) {
- /*
- * In memblock, end points to the first byte after the
- * range while in kexec, end points to the last byte
- * in the range.
- */
- res.start = mstart;
- res.end = mend - 1;
- ret = func(&res, kbuf);
- if (ret)
- break;
- }
- } else {
- for_each_free_mem_range(i, NUMA_NO_NODE, 0, &mstart, &mend,
- NULL) {
- /*
- * In memblock, end points to the first byte after the
- * range while in kexec, end points to the last byte
- * in the range.
- */
- res.start = mstart;
- res.end = mend - 1;
- ret = func(&res, kbuf);
- if (ret)
- break;
- }
- }
-
- return ret;
-}
-
/**
* setup_purgatory - initialize the purgatory's global variables
* @image: kexec image.
diff --git a/include/linux/kexec.h b/include/linux/kexec.h
index 49ab758f4d91..c196bfd11bee 100644
--- a/include/linux/kexec.h
+++ b/include/linux/kexec.h
@@ -184,8 +184,6 @@ int __weak arch_kexec_apply_relocations(struct purgatory_info *pi,
const Elf_Shdr *relsec,
const Elf_Shdr *symtab);

-int __weak arch_kexec_walk_mem(struct kexec_buf *kbuf,
- int (*func)(struct resource *, void *));
extern int kexec_add_buffer(struct kexec_buf *kbuf);
int kexec_locate_mem_hole(struct kexec_buf *kbuf);

diff --git a/kernel/kexec_file.c b/kernel/kexec_file.c
index bf39df5e5bb9..2f0691b0f8ad 100644
--- a/kernel/kexec_file.c
+++ b/kernel/kexec_file.c
@@ -16,6 +16,7 @@
#include <linux/file.h>
#include <linux/slab.h>
#include <linux/kexec.h>
+#include <linux/memblock.h>
#include <linux/mutex.h>
#include <linux/list.h>
#include <linux/fs.h>
@@ -501,6 +502,55 @@ static int locate_mem_hole_callback(struct resource *res, void *arg)
return locate_mem_hole_bottom_up(start, end, kbuf);
}

+#if defined(CONFIG_HAVE_MEMBLOCK) && !defined(CONFIG_ARCH_DISCARD_MEMBLOCK)
+static int kexec_walk_memblock(struct kexec_buf *kbuf,
+ int (*func)(struct resource *, void *))
+{
+ int ret = 0;
+ u64 i;
+ phys_addr_t mstart, mend;
+ struct resource res = { };
+
+ if (kbuf->top_down) {
+ for_each_free_mem_range_reverse(i, NUMA_NO_NODE, 0,
+ &mstart, &mend, NULL) {
+ /*
+ * In memblock, end points to the first byte after the
+ * range while in kexec, end points to the last byte
+ * in the range.
+ */
+ res.start = mstart;
+ res.end = mend - 1;
+ ret = func(&res, kbuf);
+ if (ret)
+ break;
+ }
+ } else {
+ for_each_free_mem_range(i, NUMA_NO_NODE, 0, &mstart, &mend,
+ NULL) {
+ /*
+ * In memblock, end points to the first byte after the
+ * range while in kexec, end points to the last byte
+ * in the range.
+ */
+ res.start = mstart;
+ res.end = mend - 1;
+ ret = func(&res, kbuf);
+ if (ret)
+ break;
+ }
+ }
+
+ return ret;
+}
+#else
+static int kexec_walk_memblock(struct kexec_buf *kbuf,
+ int (*func)(struct resource *, void *))
+{
+ return 0;
+}
+#endif
+
/**
* arch_kexec_walk_mem - call func(data) on free memory regions
* @kbuf: Context info for the search. Also passed to @func.
@@ -510,7 +560,7 @@ static int locate_mem_hole_callback(struct resource *res, void *arg)
* and that value will be returned. If all free regions are visited without
* func returning non-zero, then zero will be returned.
*/
-int __weak arch_kexec_walk_mem(struct kexec_buf *kbuf,
+static int arch_kexec_walk_mem(struct kexec_buf *kbuf,
int (*func)(struct resource *, void *))
{
if (kbuf->image->type == KEXEC_TYPE_CRASH)
@@ -538,7 +588,11 @@ int kexec_locate_mem_hole(struct kexec_buf *kbuf)
if (kbuf->mem)
return 0;

- ret = arch_kexec_walk_mem(kbuf, locate_mem_hole_callback);
+ if (IS_ENABLED(CONFIG_HAVE_MEMBLOCK) &&
+ !IS_ENABLED(CONFIG_ARCH_DISCARD_MEMBLOCK))
+ ret = kexec_walk_memblock(kbuf, locate_mem_hole_callback);
+ else
+ ret = arch_kexec_walk_mem(kbuf, locate_mem_hole_callback);

return ret == 1 ? 0 : -EADDRNOTAVAIL;
}
--
2.18.0


2018-07-24 06:59:51

by AKASHI Takahiro

Subject: [PATCH v12 08/16] arm64: cpufeature: add MMFR0 helper functions

These helper functions for the MMFR0 register will be used later by the
kexec_file loader.

Signed-off-by: AKASHI Takahiro <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Will Deacon <[email protected]>
Reviewed-by: James Morse <[email protected]>
---
arch/arm64/include/asm/cpufeature.h | 48 +++++++++++++++++++++++++++++
1 file changed, 48 insertions(+)

diff --git a/arch/arm64/include/asm/cpufeature.h b/arch/arm64/include/asm/cpufeature.h
index 1717ba1db35d..cd90b5252d6d 100644
--- a/arch/arm64/include/asm/cpufeature.h
+++ b/arch/arm64/include/asm/cpufeature.h
@@ -486,11 +486,59 @@ static inline bool system_supports_32bit_el0(void)
return cpus_have_const_cap(ARM64_HAS_32BIT_EL0);
}

+static inline bool system_supports_4kb_granule(void)
+{
+ u64 mmfr0;
+ u32 val;
+
+ mmfr0 = read_sanitised_ftr_reg(SYS_ID_AA64MMFR0_EL1);
+ val = cpuid_feature_extract_unsigned_field(mmfr0,
+ ID_AA64MMFR0_TGRAN4_SHIFT);
+
+ return val == ID_AA64MMFR0_TGRAN4_SUPPORTED;
+}
+
+static inline bool system_supports_64kb_granule(void)
+{
+ u64 mmfr0;
+ u32 val;
+
+ mmfr0 = read_sanitised_ftr_reg(SYS_ID_AA64MMFR0_EL1);
+ val = cpuid_feature_extract_unsigned_field(mmfr0,
+ ID_AA64MMFR0_TGRAN64_SHIFT);
+
+ return val == ID_AA64MMFR0_TGRAN64_SUPPORTED;
+}
+
+static inline bool system_supports_16kb_granule(void)
+{
+ u64 mmfr0;
+ u32 val;
+
+ mmfr0 = read_sanitised_ftr_reg(SYS_ID_AA64MMFR0_EL1);
+ val = cpuid_feature_extract_unsigned_field(mmfr0,
+ ID_AA64MMFR0_TGRAN16_SHIFT);
+
+ return val == ID_AA64MMFR0_TGRAN16_SUPPORTED;
+}
+
static inline bool system_supports_mixed_endian_el0(void)
{
return id_aa64mmfr0_mixed_endian_el0(read_sanitised_ftr_reg(SYS_ID_AA64MMFR0_EL1));
}

+static inline bool system_supports_mixed_endian(void)
+{
+ u64 mmfr0;
+ u32 val;
+
+ mmfr0 = read_sanitised_ftr_reg(SYS_ID_AA64MMFR0_EL1);
+ val = cpuid_feature_extract_unsigned_field(mmfr0,
+ ID_AA64MMFR0_BIGENDEL_SHIFT);
+
+ return val == 0x1;
+}
+
static inline bool system_supports_fpsimd(void)
{
return !cpus_have_const_cap(ARM64_HAS_NO_FPSIMD);
--
2.18.0


2018-07-24 06:59:55

by AKASHI Takahiro

Subject: [PATCH v12 02/16] kexec_file: make kexec_image_post_load_cleanup_default() global

Change this function from static to global so that arm64 can implement
its own arch_kimage_file_post_load_cleanup() later using
kexec_image_post_load_cleanup_default().

Signed-off-by: AKASHI Takahiro <[email protected]>
Acked-by: Dave Young <[email protected]>
Cc: Vivek Goyal <[email protected]>
Cc: Baoquan He <[email protected]>
---
include/linux/kexec.h | 1 +
kernel/kexec_file.c | 2 +-
2 files changed, 2 insertions(+), 1 deletion(-)

diff --git a/include/linux/kexec.h b/include/linux/kexec.h
index 9e4e638fb505..49ab758f4d91 100644
--- a/include/linux/kexec.h
+++ b/include/linux/kexec.h
@@ -143,6 +143,7 @@ extern const struct kexec_file_ops * const kexec_file_loaders[];

int kexec_image_probe_default(struct kimage *image, void *buf,
unsigned long buf_len);
+int kexec_image_post_load_cleanup_default(struct kimage *image);

/**
* struct kexec_buf - parameters for finding a place for a buffer in memory
diff --git a/kernel/kexec_file.c b/kernel/kexec_file.c
index c6a3b6851372..63c7ce1c0c3e 100644
--- a/kernel/kexec_file.c
+++ b/kernel/kexec_file.c
@@ -78,7 +78,7 @@ void * __weak arch_kexec_kernel_image_load(struct kimage *image)
return kexec_image_load_default(image);
}

-static int kexec_image_post_load_cleanup_default(struct kimage *image)
+int kexec_image_post_load_cleanup_default(struct kimage *image)
{
if (!image->fops || !image->fops->cleanup)
return 0;
--
2.18.0


2018-07-24 07:00:03

by AKASHI Takahiro

Subject: [PATCH v12 09/16] arm64: enable KEXEC_FILE config

Modify arm64/Kconfig to enable kexec_file_load support.

Signed-off-by: AKASHI Takahiro <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Will Deacon <[email protected]>
Acked-by: James Morse <[email protected]>
---
arch/arm64/Kconfig | 9 +++++++++
arch/arm64/kernel/Makefile | 3 ++-
arch/arm64/kernel/machine_kexec_file.c | 16 ++++++++++++++++
3 files changed, 27 insertions(+), 1 deletion(-)
create mode 100644 arch/arm64/kernel/machine_kexec_file.c

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 42c090cf0292..a9a3a5583c8b 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -835,6 +835,15 @@ config KEXEC
but it is independent of the system firmware. And like a reboot
you can start any kernel with it, not just Linux.

+config KEXEC_FILE
+ bool "kexec file based system call"
+ select KEXEC_CORE
+ help
+ This is new version of kexec system call. This system call is
+ file based and takes file descriptors as system call argument
+ for kernel and initramfs as opposed to list of segments as
+ accepted by previous system call.
+
config CRASH_DUMP
bool "Build kdump crash kernel"
help
diff --git a/arch/arm64/kernel/Makefile b/arch/arm64/kernel/Makefile
index 0025f8691046..06281e1ad7ed 100644
--- a/arch/arm64/kernel/Makefile
+++ b/arch/arm64/kernel/Makefile
@@ -48,8 +48,9 @@ arm64-obj-$(CONFIG_ARM64_ACPI_PARKING_PROTOCOL) += acpi_parking_protocol.o
arm64-obj-$(CONFIG_PARAVIRT) += paravirt.o
arm64-obj-$(CONFIG_RANDOMIZE_BASE) += kaslr.o
arm64-obj-$(CONFIG_HIBERNATION) += hibernate.o hibernate-asm.o
-arm64-obj-$(CONFIG_KEXEC) += machine_kexec.o relocate_kernel.o \
+arm64-obj-$(CONFIG_KEXEC_CORE) += machine_kexec.o relocate_kernel.o \
cpu-reset.o
+arm64-obj-$(CONFIG_KEXEC_FILE) += machine_kexec_file.o
arm64-obj-$(CONFIG_ARM64_RELOC_TEST) += arm64-reloc-test.o
arm64-reloc-test-y := reloc_test_core.o reloc_test_syms.o
arm64-obj-$(CONFIG_CRASH_DUMP) += crash_dump.o
diff --git a/arch/arm64/kernel/machine_kexec_file.c b/arch/arm64/kernel/machine_kexec_file.c
new file mode 100644
index 000000000000..c38a8048ed00
--- /dev/null
+++ b/arch/arm64/kernel/machine_kexec_file.c
@@ -0,0 +1,16 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * kexec_file for arm64
+ *
+ * Copyright (C) 2018 Linaro Limited
+ * Author: AKASHI Takahiro <[email protected]>
+ *
+ */
+
+#define pr_fmt(fmt) "kexec_file: " fmt
+
+#include <linux/kexec.h>
+
+const struct kexec_file_ops * const kexec_file_loaders[] = {
+ NULL
+};
--
2.18.0


2018-07-24 07:00:23

by AKASHI Takahiro

Subject: [PATCH v12 11/16] arm64: kexec_file: allow for loading Image-format kernel

This patch provides kexec_file_ops for the "Image"-format kernel. In this
implementation, a binary is always loaded at a fixed offset identified
by the text_offset field of its header.

Regarding signature verification for trusted boot, this patch doesn't
contain CONFIG_KEXEC_VERIFY_SIG support, which is to be added later
in this series, but file-attribute-based verification is still a viable
option by enabling the IMA security subsystem.

You can sign (label) a to-be-kexec'ed kernel image on the target file
system with:
$ evmctl ima_sign --key /path/to/private_key.pem Image

On the live system, you must have IMA enforced with, at least, the
following security policy:
"appraise func=KEXEC_KERNEL_CHECK appraise_type=imasig"

See more details about IMA here:
https://sourceforge.net/p/linux-ima/wiki/Home/

Signed-off-by: AKASHI Takahiro <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Will Deacon <[email protected]>
Reviewed-by: James Morse <[email protected]>
---
arch/arm64/include/asm/kexec.h | 28 +++++++
arch/arm64/kernel/Makefile | 2 +-
arch/arm64/kernel/kexec_image.c | 108 +++++++++++++++++++++++++
arch/arm64/kernel/machine_kexec_file.c | 1 +
4 files changed, 138 insertions(+), 1 deletion(-)
create mode 100644 arch/arm64/kernel/kexec_image.c

diff --git a/arch/arm64/include/asm/kexec.h b/arch/arm64/include/asm/kexec.h
index 026f7e408f0c..5d102a1054b3 100644
--- a/arch/arm64/include/asm/kexec.h
+++ b/arch/arm64/include/asm/kexec.h
@@ -101,6 +101,34 @@ struct kimage_arch {
unsigned long dtb_mem;
};

+/**
+ * struct arm64_image_header - arm64 kernel image header
+ * See Documentation/arm64/booting.txt for details
+ *
+ * @mz_magic: DOS header magic number ('MZ', optional)
+ * @code1: Instruction (branch to stext)
+ * @text_offset: Image load offset
+ * @image_size: Effective image size
+ * @flags: Bit-field flags
+ * @reserved: Reserved
+ * @magic: Magic number
+ * @pe_header: Offset to PE COFF header (optional)
+ **/
+
+struct arm64_image_header {
+ __le16 mz_magic; /* also code0 */
+ __le16 pad;
+ __le32 code1;
+ __le64 text_offset;
+ __le64 image_size;
+ __le64 flags;
+ __le64 reserved[3];
+ __le32 magic;
+ __le32 pe_header;
+};
+
+extern const struct kexec_file_ops kexec_image_ops;
+
struct kimage;

extern int arch_kimage_file_post_load_cleanup(struct kimage *image);
diff --git a/arch/arm64/kernel/Makefile b/arch/arm64/kernel/Makefile
index 06281e1ad7ed..a9cc7752f276 100644
--- a/arch/arm64/kernel/Makefile
+++ b/arch/arm64/kernel/Makefile
@@ -50,7 +50,7 @@ arm64-obj-$(CONFIG_RANDOMIZE_BASE) += kaslr.o
arm64-obj-$(CONFIG_HIBERNATION) += hibernate.o hibernate-asm.o
arm64-obj-$(CONFIG_KEXEC_CORE) += machine_kexec.o relocate_kernel.o \
cpu-reset.o
-arm64-obj-$(CONFIG_KEXEC_FILE) += machine_kexec_file.o
+arm64-obj-$(CONFIG_KEXEC_FILE) += machine_kexec_file.o kexec_image.o
arm64-obj-$(CONFIG_ARM64_RELOC_TEST) += arm64-reloc-test.o
arm64-reloc-test-y := reloc_test_core.o reloc_test_syms.o
arm64-obj-$(CONFIG_CRASH_DUMP) += crash_dump.o
diff --git a/arch/arm64/kernel/kexec_image.c b/arch/arm64/kernel/kexec_image.c
new file mode 100644
index 000000000000..d64f5e9f9d22
--- /dev/null
+++ b/arch/arm64/kernel/kexec_image.c
@@ -0,0 +1,108 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Kexec image loader
+
+ * Copyright (C) 2018 Linaro Limited
+ * Author: AKASHI Takahiro <[email protected]>
+ */
+
+#define pr_fmt(fmt) "kexec_file(Image): " fmt
+
+#include <linux/err.h>
+#include <linux/errno.h>
+#include <linux/kernel.h>
+#include <linux/kexec.h>
+#include <linux/string.h>
+#include <asm/boot.h>
+#include <asm/byteorder.h>
+#include <asm/cpufeature.h>
+#include <asm/memory.h>
+
+static int image_probe(const char *kernel_buf, unsigned long kernel_len)
+{
+ const struct arm64_image_header *h;
+
+ h = (const struct arm64_image_header *)(kernel_buf);
+
+ if (!h || (kernel_len < sizeof(*h)) ||
+ memcmp(&h->magic, ARM64_MAGIC, sizeof(h->magic)))
+ return -EINVAL;
+
+ return 0;
+}
+
+static void *image_load(struct kimage *image,
+ char *kernel, unsigned long kernel_len,
+ char *initrd, unsigned long initrd_len,
+ char *cmdline, unsigned long cmdline_len)
+{
+ struct arm64_image_header *h;
+ u64 flags, value;
+ struct kexec_buf kbuf;
+ unsigned long text_offset;
+ struct kexec_segment *kernel_segment;
+ int ret;
+
+ /* Don't support old kernel */
+ h = (struct arm64_image_header *)kernel;
+ if (!h->text_offset)
+ return ERR_PTR(-EINVAL);
+
+ /* Check cpu features */
+ flags = le64_to_cpu(h->flags);
+ value = head_flag_field(flags, HEAD_FLAG_BE);
+ if (((value == HEAD_FLAG_BE) && !IS_ENABLED(CONFIG_CPU_BIG_ENDIAN)) ||
+ ((value != HEAD_FLAG_BE) && IS_ENABLED(CONFIG_CPU_BIG_ENDIAN)))
+ if (!system_supports_mixed_endian())
+ return ERR_PTR(-EINVAL);
+
+ value = head_flag_field(flags, HEAD_FLAG_PAGE_SIZE);
+ if (((value == HEAD_FLAG_PAGE_SIZE_4K) &&
+ !system_supports_4kb_granule()) ||
+ ((value == HEAD_FLAG_PAGE_SIZE_64K) &&
+ !system_supports_64kb_granule()) ||
+ ((value == HEAD_FLAG_PAGE_SIZE_16K) &&
+ !system_supports_16kb_granule()))
+ return ERR_PTR(-EINVAL);
+
+ /* Load the kernel */
+ kbuf.image = image;
+ kbuf.buf_min = 0;
+ kbuf.buf_max = ULONG_MAX;
+ kbuf.top_down = false;
+
+ kbuf.buffer = kernel;
+ kbuf.bufsz = kernel_len;
+ kbuf.mem = 0;
+ kbuf.memsz = le64_to_cpu(h->image_size);
+ text_offset = le64_to_cpu(h->text_offset);
+ kbuf.buf_align = MIN_KIMG_ALIGN;
+
+ /* Adjust kernel segment with TEXT_OFFSET */
+ kbuf.memsz += text_offset;
+
+ ret = kexec_add_buffer(&kbuf);
+ if (ret)
+ return ERR_PTR(ret);
+
+ kernel_segment = &image->segment[image->nr_segments - 1];
+ kernel_segment->mem += text_offset;
+ kernel_segment->memsz -= text_offset;
+ image->start = kernel_segment->mem;
+
+ pr_debug("Loaded kernel at 0x%lx bufsz=0x%lx memsz=0x%lx\n",
+ kernel_segment->mem, kbuf.bufsz,
+ kernel_segment->memsz);
+
+ /* Load additional data */
+ ret = load_other_segments(image,
+ kernel_segment->mem, kernel_segment->memsz,
+ initrd, initrd_len, cmdline, cmdline_len);
+
+ return ERR_PTR(ret);
+}
+
+const struct kexec_file_ops kexec_image_ops = {
+ .probe = image_probe,
+ .load = image_load,
+};
diff --git a/arch/arm64/kernel/machine_kexec_file.c b/arch/arm64/kernel/machine_kexec_file.c
index b28fbb0659c9..b8297f10e2ef 100644
--- a/arch/arm64/kernel/machine_kexec_file.c
+++ b/arch/arm64/kernel/machine_kexec_file.c
@@ -20,6 +20,7 @@
#include <asm/byteorder.h>

const struct kexec_file_ops * const kexec_file_loaders[] = {
+ &kexec_image_ops,
NULL
};

--
2.18.0


2018-07-24 07:00:33

by AKASHI Takahiro

Subject: [PATCH v12 12/16] arm64: kexec_file: add crash dump support

Enabling crash dump (kdump) includes
* preparing the contents of the ELF header of a core dump file,
/proc/vmcore, using crash_prepare_elf64_headers(), and
* adding two device tree properties, "linux,usable-memory-range" and
"linux,elfcorehdr", which respectively represent the memory range
to be used by the crash dump kernel and the header's location

Signed-off-by: AKASHI Takahiro <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Will Deacon <[email protected]>
---
arch/arm64/include/asm/kexec.h | 6 +-
arch/arm64/kernel/machine_kexec_file.c | 113 ++++++++++++++++++++++++-
2 files changed, 115 insertions(+), 4 deletions(-)

diff --git a/arch/arm64/include/asm/kexec.h b/arch/arm64/include/asm/kexec.h
index 5d102a1054b3..1b2c27026ae0 100644
--- a/arch/arm64/include/asm/kexec.h
+++ b/arch/arm64/include/asm/kexec.h
@@ -97,8 +97,12 @@ static inline void crash_post_resume(void) {}
#define ARCH_HAS_KIMAGE_ARCH

struct kimage_arch {
- void *dtb_buf;
+ void *dtb;
unsigned long dtb_mem;
+ /* Core ELF header buffer */
+ void *elf_headers;
+ unsigned long elf_headers_mem;
+ unsigned long elf_headers_sz;
};

/**
diff --git a/arch/arm64/kernel/machine_kexec_file.c b/arch/arm64/kernel/machine_kexec_file.c
index b8297f10e2ef..7356da5a53d5 100644
--- a/arch/arm64/kernel/machine_kexec_file.c
+++ b/arch/arm64/kernel/machine_kexec_file.c
@@ -16,7 +16,9 @@
#include <linux/libfdt.h>
#include <linux/memblock.h>
#include <linux/of_fdt.h>
+#include <linux/slab.h>
#include <linux/types.h>
+#include <linux/vmalloc.h>
#include <asm/byteorder.h>

const struct kexec_file_ops * const kexec_file_loaders[] = {
@@ -29,6 +31,10 @@ int arch_kimage_file_post_load_cleanup(struct kimage *image)
vfree(image->arch.dtb);
image->arch.dtb = NULL;

+ vfree(image->arch.elf_headers);
+ image->arch.elf_headers = NULL;
+ image->arch.elf_headers_sz = 0;
+
return kexec_image_post_load_cleanup_default(image);
}

@@ -38,12 +44,30 @@ static int setup_dtb(struct kimage *image,
void **dtb_buf, unsigned long *dtb_buf_len)
{
void *buf = NULL;
- size_t buf_size;
+ size_t buf_size, range_size;
int nodeoffset;
int ret;

+ /* check ranges against root's #address-cells and #size-cells */
+ if (image->type == KEXEC_TYPE_CRASH &&
+ (!of_fdt_cells_size_fitted(image->arch.elf_headers_mem,
+ image->arch.elf_headers_sz) ||
+ !of_fdt_cells_size_fitted(crashk_res.start,
+ crashk_res.end - crashk_res.start + 1))) {
+ pr_err("Crash memory region doesn't fit into DT's root cell sizes.\n");
+ ret = -EINVAL;
+ goto out_err;
+ }
+
/* duplicate dt blob */
buf_size = fdt_totalsize(initial_boot_params);
+ range_size = of_fdt_reg_cells_size();
+
+ if (image->type == KEXEC_TYPE_CRASH) {
+ buf_size += fdt_prop_len("linux,elfcorehdr", range_size);
+ buf_size += fdt_prop_len("linux,usable-memory-range",
+ range_size);
+ }

if (initrd_load_addr) {
/* can be redundant, but trimmed at the end */
@@ -73,6 +97,23 @@ static int setup_dtb(struct kimage *image,
goto out_err;
}

+ if (image->type == KEXEC_TYPE_CRASH) {
+ /* add linux,elfcorehdr */
+ ret = fdt_setprop_reg(buf, nodeoffset, "linux,elfcorehdr",
+ image->arch.elf_headers_mem,
+ image->arch.elf_headers_sz);
+ if (ret)
+ goto out_err;
+
+ /* add linux,usable-memory-range */
+ ret = fdt_setprop_reg(buf, nodeoffset,
+ "linux,usable-memory-range",
+ crashk_res.start,
+ crashk_res.end - crashk_res.start + 1);
+ if (ret)
+ goto out_err;
+ }
+
/* add bootargs */
if (cmdline) {
ret = fdt_setprop_string(buf, nodeoffset, "bootargs", cmdline);
@@ -129,6 +170,43 @@ static int setup_dtb(struct kimage *image,
return ret;
}

+static int prepare_elf_headers(void **addr, unsigned long *sz)
+{
+ struct crash_mem *cmem;
+ unsigned int nr_ranges;
+ int ret;
+ u64 i;
+ phys_addr_t start, end;
+
+ nr_ranges = 1; /* for exclusion of crashkernel region */
+ for_each_mem_range(i, &memblock.memory, NULL, NUMA_NO_NODE, 0,
+ &start, &end, NULL)
+ nr_ranges++;
+
+ cmem = kmalloc(sizeof(struct crash_mem) +
+ sizeof(struct crash_mem_range) * nr_ranges, GFP_KERNEL);
+ if (!cmem)
+ return -ENOMEM;
+
+ cmem->max_nr_ranges = nr_ranges;
+ cmem->nr_ranges = 0;
+ for_each_mem_range(i, &memblock.memory, NULL, NUMA_NO_NODE,
+ MEMBLOCK_NONE, &start, &end, NULL) {
+ cmem->ranges[cmem->nr_ranges].start = start;
+ cmem->ranges[cmem->nr_ranges].end = end - 1;
+ cmem->nr_ranges++;
+ }
+
+ /* Exclude crashkernel region */
+ ret = crash_exclude_mem_range(cmem, crashk_res.start, crashk_res.end);
+
+ if (!ret)
+ ret = crash_prepare_elf64_headers(cmem, true, addr, sz);
+
+ kfree(cmem);
+ return ret;
+}
+
int load_other_segments(struct kimage *image,
unsigned long kernel_load_addr,
unsigned long kernel_size,
@@ -136,14 +214,43 @@ int load_other_segments(struct kimage *image,
char *cmdline, unsigned long cmdline_len)
{
struct kexec_buf kbuf;
- void *dtb = NULL;
- unsigned long initrd_load_addr = 0, dtb_len;
+ void *headers, *dtb = NULL;
+ unsigned long headers_sz, initrd_load_addr = 0, dtb_len;
int ret = 0;

kbuf.image = image;
/* not allocate anything below the kernel */
kbuf.buf_min = kernel_load_addr + kernel_size;

+ /* load elf core header */
+ if (image->type == KEXEC_TYPE_CRASH) {
+ ret = prepare_elf_headers(&headers, &headers_sz);
+ if (ret) {
+ pr_err("Preparing elf core header failed\n");
+ goto out_err;
+ }
+
+ kbuf.buffer = headers;
+ kbuf.bufsz = headers_sz;
+ kbuf.mem = 0;
+ kbuf.memsz = headers_sz;
+ kbuf.buf_align = SZ_64K; /* largest supported page size */
+ kbuf.buf_max = ULONG_MAX;
+ kbuf.top_down = true;
+
+ ret = kexec_add_buffer(&kbuf);
+ if (ret) {
+ vfree(headers);
+ goto out_err;
+ }
+ image->arch.elf_headers = headers;
+ image->arch.elf_headers_mem = kbuf.mem;
+ image->arch.elf_headers_sz = headers_sz;
+
+ pr_debug("Loaded elf core header at 0x%lx bufsz=0x%lx memsz=0x%lx\n",
+ image->arch.elf_headers_mem, headers_sz, headers_sz);
+ }
+
/* load initrd */
if (initrd) {
kbuf.buffer = initrd;
--
2.18.0


2018-07-24 07:00:50

by AKASHI Takahiro

Subject: [PATCH v12 10/16] arm64: kexec_file: load initrd and device-tree

load_other_segments() is expected to allocate and place all the necessary
memory segments other than the kernel, including the initrd and the
device-tree blob (and the elf core header for crash dump).
While most of the code was borrowed from the kexec-tools counterpart,
users are not allowed to specify a dtb explicitly; instead, the dtb
presented by the original boot loader is reused.

arch_kimage_file_post_load_cleanup() is responsible for freeing the arm64-
specific data allocated in load_other_segments().

Signed-off-by: AKASHI Takahiro <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Will Deacon <[email protected]>
Reviewed-by: James Morse <[email protected]>
---
arch/arm64/include/asm/kexec.h | 17 +++
arch/arm64/kernel/machine_kexec_file.c | 182 +++++++++++++++++++++++++
2 files changed, 199 insertions(+)

diff --git a/arch/arm64/include/asm/kexec.h b/arch/arm64/include/asm/kexec.h
index e17f0529a882..026f7e408f0c 100644
--- a/arch/arm64/include/asm/kexec.h
+++ b/arch/arm64/include/asm/kexec.h
@@ -93,6 +93,23 @@ static inline void crash_prepare_suspend(void) {}
static inline void crash_post_resume(void) {}
#endif

+#ifdef CONFIG_KEXEC_FILE
+#define ARCH_HAS_KIMAGE_ARCH
+
+struct kimage_arch {
+ void *dtb_buf;
+ unsigned long dtb_mem;
+};
+
+struct kimage;
+
+extern int arch_kimage_file_post_load_cleanup(struct kimage *image);
+extern int load_other_segments(struct kimage *image,
+ unsigned long kernel_load_addr, unsigned long kernel_size,
+ char *initrd, unsigned long initrd_len,
+ char *cmdline, unsigned long cmdline_len);
+#endif
+
#endif /* __ASSEMBLY__ */

#endif
diff --git a/arch/arm64/kernel/machine_kexec_file.c b/arch/arm64/kernel/machine_kexec_file.c
index c38a8048ed00..b28fbb0659c9 100644
--- a/arch/arm64/kernel/machine_kexec_file.c
+++ b/arch/arm64/kernel/machine_kexec_file.c
@@ -5,12 +5,194 @@
* Copyright (C) 2018 Linaro Limited
* Author: AKASHI Takahiro <[email protected]>
*
+ * Most code is derived from arm64 port of kexec-tools
*/

#define pr_fmt(fmt) "kexec_file: " fmt

+#include <linux/ioport.h>
+#include <linux/kernel.h>
#include <linux/kexec.h>
+#include <linux/libfdt.h>
+#include <linux/memblock.h>
+#include <linux/of_fdt.h>
+#include <linux/types.h>
+#include <asm/byteorder.h>

const struct kexec_file_ops * const kexec_file_loaders[] = {
NULL
};
+
+int arch_kimage_file_post_load_cleanup(struct kimage *image)
+{
+ vfree(image->arch.dtb);
+ image->arch.dtb = NULL;
+
+ return kexec_image_post_load_cleanup_default(image);
+}
+
+static int setup_dtb(struct kimage *image,
+ unsigned long initrd_load_addr, unsigned long initrd_len,
+ char *cmdline, unsigned long cmdline_len,
+ void **dtb_buf, unsigned long *dtb_buf_len)
+{
+ void *buf = NULL;
+ size_t buf_size;
+ int nodeoffset;
+ int ret;
+
+ /* duplicate dt blob */
+ buf_size = fdt_totalsize(initial_boot_params);
+
+ if (initrd_load_addr) {
+ /* can be redundant, but trimmed at the end */
+ buf_size += fdt_prop_len("linux,initrd-start", sizeof(u64));
+ buf_size += fdt_prop_len("linux,initrd-end", sizeof(u64));
+ }
+
+ if (cmdline)
+ /* can be redundant, but trimmed at the end */
+ buf_size += fdt_prop_len("bootargs", cmdline_len);
+
+ buf = vmalloc(buf_size);
+ if (!buf) {
+ ret = -ENOMEM;
+ goto out_err;
+ }
+
+ ret = fdt_open_into(initial_boot_params, buf, buf_size);
+ if (ret) {
+ ret = -EINVAL;
+ goto out_err;
+ }
+
+ nodeoffset = fdt_path_offset(buf, "/chosen");
+ if (nodeoffset < 0) {
+ ret = -EINVAL;
+ goto out_err;
+ }
+
+ /* add bootargs */
+ if (cmdline) {
+ ret = fdt_setprop_string(buf, nodeoffset, "bootargs", cmdline);
+ if (ret) {
+ ret = -EINVAL;
+ goto out_err;
+ }
+ } else {
+ ret = fdt_delprop(buf, nodeoffset, "bootargs");
+ if (ret && (ret != -FDT_ERR_NOTFOUND)) {
+ ret = -EINVAL;
+ goto out_err;
+ }
+ }
+
+ /* add initrd-* */
+ if (initrd_load_addr) {
+ ret = fdt_setprop_u64(buf, nodeoffset, "linux,initrd-start",
+ initrd_load_addr);
+ if (ret) {
+ ret = -EINVAL;
+ goto out_err;
+ }
+
+ ret = fdt_setprop_u64(buf, nodeoffset, "linux,initrd-end",
+ initrd_load_addr + initrd_len);
+ if (ret) {
+ ret = -EINVAL;
+ goto out_err;
+ }
+ } else {
+ ret = fdt_delprop(buf, nodeoffset, "linux,initrd-start");
+ if (ret && (ret != -FDT_ERR_NOTFOUND)) {
+ ret = -EINVAL;
+ goto out_err;
+ }
+
+ ret = fdt_delprop(buf, nodeoffset, "linux,initrd-end");
+ if (ret && (ret != -FDT_ERR_NOTFOUND)) {
+ ret = -EINVAL;
+ goto out_err;
+ }
+ }
+
+ /* trim a buffer */
+ fdt_pack(buf);
+ *dtb_buf = buf;
+ *dtb_buf_len = fdt_totalsize(buf);
+
+ return 0;
+
+out_err:
+ vfree(buf);
+ return ret;
+}
+
+int load_other_segments(struct kimage *image,
+ unsigned long kernel_load_addr,
+ unsigned long kernel_size,
+ char *initrd, unsigned long initrd_len,
+ char *cmdline, unsigned long cmdline_len)
+{
+ struct kexec_buf kbuf;
+ void *dtb = NULL;
+ unsigned long initrd_load_addr = 0, dtb_len;
+ int ret = 0;
+
+ kbuf.image = image;
+ /* not allocate anything below the kernel */
+ kbuf.buf_min = kernel_load_addr + kernel_size;
+
+ /* load initrd */
+ if (initrd) {
+ kbuf.buffer = initrd;
+ kbuf.bufsz = initrd_len;
+ kbuf.mem = 0;
+ kbuf.memsz = initrd_len;
+ kbuf.buf_align = 0;
+ /* within 1GB-aligned window of up to 32GB in size */
+ kbuf.buf_max = round_down(kernel_load_addr, SZ_1G)
+ + (unsigned long)SZ_1G * 32;
+ kbuf.top_down = false;
+
+ ret = kexec_add_buffer(&kbuf);
+ if (ret)
+ goto out_err;
+ initrd_load_addr = kbuf.mem;
+
+ pr_debug("Loaded initrd at 0x%lx bufsz=0x%lx memsz=0x%lx\n",
+ initrd_load_addr, initrd_len, initrd_len);
+ }
+
+ /* load dtb blob */
+ ret = setup_dtb(image, initrd_load_addr, initrd_len,
+ cmdline, cmdline_len, &dtb, &dtb_len);
+ if (ret) {
+ pr_err("Preparing for new dtb failed\n");
+ goto out_err;
+ }
+
+ kbuf.buffer = dtb;
+ kbuf.bufsz = dtb_len;
+ kbuf.mem = 0;
+ kbuf.memsz = dtb_len;
+ /* not across 2MB boundary */
+ kbuf.buf_align = SZ_2M;
+ kbuf.buf_max = ULONG_MAX;
+ kbuf.top_down = true;
+
+ ret = kexec_add_buffer(&kbuf);
+ if (ret)
+ goto out_err;
+ image->arch.dtb = dtb;
+ image->arch.dtb_mem = kbuf.mem;
+
+ pr_debug("Loaded dtb at 0x%lx bufsz=0x%lx memsz=0x%lx\n",
+ kbuf.mem, dtb_len, dtb_len);
+
+ return 0;
+
+out_err:
+ vfree(dtb);
+ return ret;
+}
--
2.18.0


2018-07-24 07:00:52

by AKASHI Takahiro

Subject: [PATCH v12 13/16] arm64: kexec_file: invoke the kernel without purgatory

On arm64, purgatory would do almost nothing. So just invoke the secondary
kernel directly by jumping into its entry code.

While, in this case, cpu_soft_restart() must be called with the dtb address
in the fifth argument, the behavior still stays compatible with the
kexec_load case as long as the argument is null.

Signed-off-by: AKASHI Takahiro <[email protected]>
Reviewed-by: James Morse <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Will Deacon <[email protected]>
---
arch/arm64/kernel/cpu-reset.S | 8 ++++----
arch/arm64/kernel/machine_kexec.c | 12 ++++++++++--
arch/arm64/kernel/relocate_kernel.S | 3 ++-
3 files changed, 16 insertions(+), 7 deletions(-)

diff --git a/arch/arm64/kernel/cpu-reset.S b/arch/arm64/kernel/cpu-reset.S
index 8021b46c9743..a2be30275a73 100644
--- a/arch/arm64/kernel/cpu-reset.S
+++ b/arch/arm64/kernel/cpu-reset.S
@@ -22,11 +22,11 @@
* __cpu_soft_restart(el2_switch, entry, arg0, arg1, arg2) - Helper for
* cpu_soft_restart.
*
- * @el2_switch: Flag to indicate a swich to EL2 is needed.
+ * @el2_switch: Flag to indicate a switch to EL2 is needed.
* @entry: Location to jump to for soft reset.
- * arg0: First argument passed to @entry.
- * arg1: Second argument passed to @entry.
- * arg2: Third argument passed to @entry.
+ * arg0: First argument passed to @entry. (relocation list)
+ * arg1: Second argument passed to @entry.(physical kernel entry)
+ * arg2: Third argument passed to @entry. (physical dtb address)
*
* Put the CPU into the same state as it would be if it had been reset, and
* branch to what would be the reset vector. It must be executed with the
diff --git a/arch/arm64/kernel/machine_kexec.c b/arch/arm64/kernel/machine_kexec.c
index f76ea92dff91..830a5063e09d 100644
--- a/arch/arm64/kernel/machine_kexec.c
+++ b/arch/arm64/kernel/machine_kexec.c
@@ -205,10 +205,18 @@ void machine_kexec(struct kimage *kimage)
* uses physical addressing to relocate the new image to its final
* position and transfers control to the image entry point when the
* relocation is complete.
+ * In kexec case, kimage->start points to purgatory assuming that
+ * kernel entry and dtb address are embedded in purgatory by
+ * userspace (kexec-tools).
+ * In kexec_file case, the kernel starts directly without purgatory.
*/
-
cpu_soft_restart(kimage != kexec_crash_image,
- reboot_code_buffer_phys, kimage->head, kimage->start, 0);
+ reboot_code_buffer_phys, kimage->head, kimage->start,
+#ifdef CONFIG_KEXEC_FILE
+ kimage->arch.dtb_mem);
+#else
+ 0);
+#endif

BUG(); /* Should never get here. */
}
diff --git a/arch/arm64/kernel/relocate_kernel.S b/arch/arm64/kernel/relocate_kernel.S
index f407e422a720..95fd94209aae 100644
--- a/arch/arm64/kernel/relocate_kernel.S
+++ b/arch/arm64/kernel/relocate_kernel.S
@@ -32,6 +32,7 @@
ENTRY(arm64_relocate_new_kernel)

/* Setup the list loop variables. */
+ mov x18, x2 /* x18 = dtb address */
mov x17, x1 /* x17 = kimage_start */
mov x16, x0 /* x16 = kimage_head */
raw_dcache_line_size x15, x0 /* x15 = dcache line size */
@@ -107,7 +108,7 @@ ENTRY(arm64_relocate_new_kernel)
isb

/* Start new image. */
- mov x0, xzr
+ mov x0, x18
mov x1, xzr
mov x2, xzr
mov x3, xzr
--
2.18.0


2018-07-24 07:01:02

by AKASHI Takahiro

[permalink] [raw]
Subject: [PATCH v12 15/16] arm64: kexec_file: add kernel signature verification support

With this patch, kernel verification can be done without IMA security
subsystem enabled. Turn on CONFIG_KEXEC_VERIFY_SIG instead.

On x86, a signature is embedded into a PE file (Microsoft's format) header
of binary. Since arm64's "Image" can also be seen as a PE file as far as
CONFIG_EFI is enabled, we adopt this format for kernel signing.

You can create a signed kernel image with:
$ sbsign --key ${KEY} --cert ${CERT} Image

Signed-off-by: AKASHI Takahiro <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Will Deacon <[email protected]>
---
arch/arm64/Kconfig | 24 ++++++++++++++++++++++++
arch/arm64/kernel/kexec_image.c | 15 +++++++++++++++
2 files changed, 39 insertions(+)

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index a9a3a5583c8b..1445eb2fc833 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -844,6 +844,30 @@ config KEXEC_FILE
for kernel and initramfs as opposed to list of segments as
accepted by previous system call.

+config KEXEC_VERIFY_SIG
+ bool "Verify kernel signature during kexec_file_load() syscall"
+ depends on KEXEC_FILE
+ help
+ Select this option to verify a signature with loaded kernel
+ image. If configured, any attempt of loading a image without
+ valid signature will fail.
+
+ In addition to that option, you need to enable signature
+ verification for the corresponding kernel image type being
+ loaded in order for this to work.
+
+config KEXEC_IMAGE_VERIFY_SIG
+ bool "Enable Image signature verification support"
+ default y
+ depends on KEXEC_VERIFY_SIG
+ depends on EFI && SIGNED_PE_FILE_VERIFICATION
+ help
+ Enable Image signature verification support.
+
+comment "Support for PE file signature verification disabled"
+ depends on KEXEC_VERIFY_SIG
+ depends on !EFI || !SIGNED_PE_FILE_VERIFICATION
+
config CRASH_DUMP
bool "Build kdump crash kernel"
help
diff --git a/arch/arm64/kernel/kexec_image.c b/arch/arm64/kernel/kexec_image.c
index d64f5e9f9d22..578d358632d0 100644
--- a/arch/arm64/kernel/kexec_image.c
+++ b/arch/arm64/kernel/kexec_image.c
@@ -13,6 +13,7 @@
#include <linux/kernel.h>
#include <linux/kexec.h>
#include <linux/string.h>
+#include <linux/verification.h>
#include <asm/boot.h>
#include <asm/byteorder.h>
#include <asm/cpufeature.h>
@@ -28,6 +29,9 @@ static int image_probe(const char *kernel_buf, unsigned long kernel_len)
memcmp(&h->magic, ARM64_MAGIC, sizeof(h->magic)))
return -EINVAL;

+ pr_debug("PE format: %s\n",
+ memcmp(&h->mz_magic, "MZ", 2) ? "no" : "yes");
+
return 0;
}

@@ -102,7 +106,18 @@ static void *image_load(struct kimage *image,
return ERR_PTR(ret);
}

+#ifdef CONFIG_KEXEC_IMAGE_VERIFY_SIG
+static int image_verify_sig(const char *kernel, unsigned long kernel_len)
+{
+ return verify_pefile_signature(kernel, kernel_len, NULL,
+ VERIFYING_KEXEC_PE_SIGNATURE);
+}
+#endif
+
const struct kexec_file_ops kexec_image_ops = {
.probe = image_probe,
.load = image_load,
+#ifdef CONFIG_KEXEC_IMAGE_VERIFY_SIG
+ .verify_sig = image_verify_sig,
+#endif
};
--
2.18.0


2018-07-24 07:01:38

by AKASHI Takahiro

[permalink] [raw]
Subject: [PATCH v12 14/16] include: pe.h: remove message[] from mz header definition

The message[] field need not be part of the definition of the mz header.

This change is crucial for enabling kexec_file_load on arm64 because
arm64's "Image" binary, seen as a PE file, doesn't carry any data for this
field, and so the following check in pefile_parse_binary() would fail:

chkaddr(cursor, mz->peaddr, sizeof(*pe));

Signed-off-by: AKASHI Takahiro <[email protected]>
Reviewed-by: Ard Biesheuvel <[email protected]>
Cc: David Howells <[email protected]>
Cc: Vivek Goyal <[email protected]>
Cc: Herbert Xu <[email protected]>
Cc: David S. Miller <[email protected]>
---
include/linux/pe.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/linux/pe.h b/include/linux/pe.h
index 143ce75be5f0..3482b18a48b5 100644
--- a/include/linux/pe.h
+++ b/include/linux/pe.h
@@ -166,7 +166,7 @@ struct mz_hdr {
uint16_t oem_info; /* oem specific */
uint16_t reserved1[10]; /* reserved */
uint32_t peaddr; /* address of pe header */
- char message[64]; /* message to print */
+ char message[]; /* message to print */
};

struct mz_reloc {
--
2.18.0


2018-07-24 07:01:48

by AKASHI Takahiro

[permalink] [raw]
Subject: [PATCH v12 16/16] arm64: kexec_file: add kaslr support

Adding "kaslr-seed" to dtb enables triggering kaslr, or kernel virtual
address randomization, at secondary kernel boot. We always do this as
it will have no harm on kaslr-incapable kernel.

We don't have any "switch" to turn off this feature directly, but still
can suppress it by passing "nokaslr" as a kernel boot argument.

Signed-off-by: AKASHI Takahiro <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Will Deacon <[email protected]>
---
arch/arm64/kernel/machine_kexec_file.c | 8 ++++++++
1 file changed, 8 insertions(+)

diff --git a/arch/arm64/kernel/machine_kexec_file.c b/arch/arm64/kernel/machine_kexec_file.c
index 7356da5a53d5..47a4fbd0dc34 100644
--- a/arch/arm64/kernel/machine_kexec_file.c
+++ b/arch/arm64/kernel/machine_kexec_file.c
@@ -16,6 +16,7 @@
#include <linux/libfdt.h>
#include <linux/memblock.h>
#include <linux/of_fdt.h>
+#include <linux/random.h>
#include <linux/slab.h>
#include <linux/types.h>
#include <linux/vmalloc.h>
@@ -46,6 +47,7 @@ static int setup_dtb(struct kimage *image,
void *buf = NULL;
size_t buf_size, range_size;
int nodeoffset;
+ u64 value;
int ret;

/* check ranges against root's #address-cells and #size-cells */
@@ -158,6 +160,12 @@ static int setup_dtb(struct kimage *image,
}
}

+ /* add kaslr-seed */
+ get_random_bytes(&value, sizeof(value));
+ ret = fdt_setprop(buf, nodeoffset, "kaslr-seed", &value, sizeof(value));
+ if (ret)
+ goto out_err;
+
/* trim a buffer */
fdt_pack(buf);
*dtb_buf = buf;
--
2.18.0


2018-07-24 09:25:05

by Philipp Rudo

[permalink] [raw]
Subject: Re: [PATCH v12 03/16] s390, kexec_file: drop arch_kexec_mem_walk()

Hi AKASHI,

the patch looks good to me.

Reviewed-by: Philipp Rudo <[email protected]>

Thanks
Philipp


On Tue, 24 Jul 2018 15:57:46 +0900
AKASHI Takahiro <[email protected]> wrote:

> Since s390 already knows where to locate buffers, calling
> arch_kexec_mem_walk() makes no sense. So we can just drop it, as kbuf->mem
> indicates this while all other architectures set it to 0 initially.
>
> This change is a preparatory work for the next patch, where all the
> variant memory walks, either on system resource or memblock, will be
> put in one common place so that it will satisfy all the architectures'
> need.
>
> Signed-off-by: AKASHI Takahiro <[email protected]>
> Cc: Martin Schwidefsky <[email protected]>
> Cc: Heiko Carstens <[email protected]>
> Cc: Dave Young <[email protected]>
> Cc: Vivek Goyal <[email protected]>
> Cc: Baoquan He <[email protected]>
> ---
> arch/s390/kernel/machine_kexec_file.c | 10 ----------
> kernel/kexec_file.c | 4 ++++
> 2 files changed, 4 insertions(+), 10 deletions(-)
>
> diff --git a/arch/s390/kernel/machine_kexec_file.c b/arch/s390/kernel/machine_kexec_file.c
> index f413f57f8d20..32023b4f9dc0 100644
> --- a/arch/s390/kernel/machine_kexec_file.c
> +++ b/arch/s390/kernel/machine_kexec_file.c
> @@ -134,16 +134,6 @@ int kexec_file_add_initrd(struct kimage *image, struct s390_load_data *data,
> return ret;
> }
>
> -/*
> - * The kernel is loaded to a fixed location. Turn off kexec_locate_mem_hole
> - * and provide kbuf->mem by hand.
> - */
> -int arch_kexec_walk_mem(struct kexec_buf *kbuf,
> - int (*func)(struct resource *, void *))
> -{
> - return 1;
> -}
> -
> int arch_kexec_apply_relocations_add(struct purgatory_info *pi,
> Elf_Shdr *section,
> const Elf_Shdr *relsec,
> diff --git a/kernel/kexec_file.c b/kernel/kexec_file.c
> index 63c7ce1c0c3e..bf39df5e5bb9 100644
> --- a/kernel/kexec_file.c
> +++ b/kernel/kexec_file.c
> @@ -534,6 +534,10 @@ int kexec_locate_mem_hole(struct kexec_buf *kbuf)
> {
> int ret;
>
> + /* Arch knows where to place */
> + if (kbuf->mem)
> + return 0;
> +
> ret = arch_kexec_walk_mem(kbuf, locate_mem_hole_callback);
>
> return ret == 1 ? 0 : -EADDRNOTAVAIL;


2018-07-25 12:32:49

by Dave Young

[permalink] [raw]
Subject: Re: [PATCH v12 04/16] powerpc, kexec_file: factor out memblock-based arch_kexec_walk_mem()

On 07/24/18 at 03:57pm, AKASHI Takahiro wrote:
> Memblock list is another source for usable system memory layout.
> So move powerpc's arch_kexec_walk_mem() to common code so that other
> memblock-based architectures, particularly arm64, can also utilise it.
> A moved function is now renamed to kexec_walk_memblock() and integrated
> into kexec_locate_mem_hole(), which will now be usable for all
> architectures with no need for overriding arch_kexec_walk_mem().
>
> kexec_walk_memblock() will not work for kdump in this form, this will be
> fixed in the next patch.
>
> Signed-off-by: AKASHI Takahiro <[email protected]>
> Cc: "Eric W. Biederman" <[email protected]>
> Cc: Dave Young <[email protected]>
> Cc: Vivek Goyal <[email protected]>
> Cc: Baoquan He <[email protected]>
> Acked-by: James Morse <[email protected]>
> ---
> arch/powerpc/kernel/machine_kexec_file_64.c | 54 -------------------
> include/linux/kexec.h | 2 -
> kernel/kexec_file.c | 58 ++++++++++++++++++++-
> 3 files changed, 56 insertions(+), 58 deletions(-)
>
> diff --git a/arch/powerpc/kernel/machine_kexec_file_64.c b/arch/powerpc/kernel/machine_kexec_file_64.c
> index 0bd23dc789a4..5357b09902c5 100644
> --- a/arch/powerpc/kernel/machine_kexec_file_64.c
> +++ b/arch/powerpc/kernel/machine_kexec_file_64.c
> @@ -24,7 +24,6 @@
>
> #include <linux/slab.h>
> #include <linux/kexec.h>
> -#include <linux/memblock.h>
> #include <linux/of_fdt.h>
> #include <linux/libfdt.h>
> #include <asm/ima.h>
> @@ -46,59 +45,6 @@ int arch_kexec_kernel_image_probe(struct kimage *image, void *buf,
> return kexec_image_probe_default(image, buf, buf_len);
> }
>
> -/**
> - * arch_kexec_walk_mem - call func(data) for each unreserved memory block
> - * @kbuf: Context info for the search. Also passed to @func.
> - * @func: Function to call for each memory block.
> - *
> - * This function is used by kexec_add_buffer and kexec_locate_mem_hole
> - * to find unreserved memory to load kexec segments into.
> - *
> - * Return: The memory walk will stop when func returns a non-zero value
> - * and that value will be returned. If all free regions are visited without
> - * func returning non-zero, then zero will be returned.
> - */
> -int arch_kexec_walk_mem(struct kexec_buf *kbuf,
> - int (*func)(struct resource *, void *))
> -{
> - int ret = 0;
> - u64 i;
> - phys_addr_t mstart, mend;
> - struct resource res = { };
> -
> - if (kbuf->top_down) {
> - for_each_free_mem_range_reverse(i, NUMA_NO_NODE, 0,
> - &mstart, &mend, NULL) {
> - /*
> - * In memblock, end points to the first byte after the
> - * range while in kexec, end points to the last byte
> - * in the range.
> - */
> - res.start = mstart;
> - res.end = mend - 1;
> - ret = func(&res, kbuf);
> - if (ret)
> - break;
> - }
> - } else {
> - for_each_free_mem_range(i, NUMA_NO_NODE, 0, &mstart, &mend,
> - NULL) {
> - /*
> - * In memblock, end points to the first byte after the
> - * range while in kexec, end points to the last byte
> - * in the range.
> - */
> - res.start = mstart;
> - res.end = mend - 1;
> - ret = func(&res, kbuf);
> - if (ret)
> - break;
> - }
> - }
> -
> - return ret;
> -}
> -
> /**
> * setup_purgatory - initialize the purgatory's global variables
> * @image: kexec image.
> diff --git a/include/linux/kexec.h b/include/linux/kexec.h
> index 49ab758f4d91..c196bfd11bee 100644
> --- a/include/linux/kexec.h
> +++ b/include/linux/kexec.h
> @@ -184,8 +184,6 @@ int __weak arch_kexec_apply_relocations(struct purgatory_info *pi,
> const Elf_Shdr *relsec,
> const Elf_Shdr *symtab);
>
> -int __weak arch_kexec_walk_mem(struct kexec_buf *kbuf,
> - int (*func)(struct resource *, void *));
> extern int kexec_add_buffer(struct kexec_buf *kbuf);
> int kexec_locate_mem_hole(struct kexec_buf *kbuf);
>
> diff --git a/kernel/kexec_file.c b/kernel/kexec_file.c
> index bf39df5e5bb9..2f0691b0f8ad 100644
> --- a/kernel/kexec_file.c
> +++ b/kernel/kexec_file.c
> @@ -16,6 +16,7 @@
> #include <linux/file.h>
> #include <linux/slab.h>
> #include <linux/kexec.h>
> +#include <linux/memblock.h>
> #include <linux/mutex.h>
> #include <linux/list.h>
> #include <linux/fs.h>
> @@ -501,6 +502,55 @@ static int locate_mem_hole_callback(struct resource *res, void *arg)
> return locate_mem_hole_bottom_up(start, end, kbuf);
> }
>
> +#if defined(CONFIG_HAVE_MEMBLOCK) && !defined(CONFIG_ARCH_DISCARD_MEMBLOCK)
> +static int kexec_walk_memblock(struct kexec_buf *kbuf,
> + int (*func)(struct resource *, void *))
> +{
> + int ret = 0;
> + u64 i;
> + phys_addr_t mstart, mend;
> + struct resource res = { };
> +
> + if (kbuf->top_down) {
> + for_each_free_mem_range_reverse(i, NUMA_NO_NODE, 0,
> + &mstart, &mend, NULL) {
> + /*
> + * In memblock, end points to the first byte after the
> + * range while in kexec, end points to the last byte
> + * in the range.
> + */
> + res.start = mstart;
> + res.end = mend - 1;
> + ret = func(&res, kbuf);
> + if (ret)
> + break;
> + }
> + } else {
> + for_each_free_mem_range(i, NUMA_NO_NODE, 0, &mstart, &mend,
> + NULL) {
> + /*
> + * In memblock, end points to the first byte after the
> + * range while in kexec, end points to the last byte
> + * in the range.
> + */
> + res.start = mstart;
> + res.end = mend - 1;
> + ret = func(&res, kbuf);
> + if (ret)
> + break;
> + }
> + }
> +
> + return ret;
> +}
> +#else
> +static int kexec_walk_memblock(struct kexec_buf *kbuf,
> + int (*func)(struct resource *, void *))
> +{
> + return 0;
> +}
> +#endif
> +
> /**
> * arch_kexec_walk_mem - call func(data) on free memory regions
> * @kbuf: Context info for the search. Also passed to @func.
> @@ -510,7 +560,7 @@ static int locate_mem_hole_callback(struct resource *res, void *arg)
> * and that value will be returned. If all free regions are visited without
> * func returning non-zero, then zero will be returned.
> */
> -int __weak arch_kexec_walk_mem(struct kexec_buf *kbuf,
> +static int arch_kexec_walk_mem(struct kexec_buf *kbuf,
> int (*func)(struct resource *, void *))
> {
> if (kbuf->image->type == KEXEC_TYPE_CRASH)
> @@ -538,7 +588,11 @@ int kexec_locate_mem_hole(struct kexec_buf *kbuf)
> if (kbuf->mem)
> return 0;
>
> - ret = arch_kexec_walk_mem(kbuf, locate_mem_hole_callback);
> + if (IS_ENABLED(CONFIG_HAVE_MEMBLOCK) &&
> + !IS_ENABLED(CONFIG_ARCH_DISCARD_MEMBLOCK))
> + ret = kexec_walk_memblock(kbuf, locate_mem_hole_callback);
> + else
> + ret = arch_kexec_walk_mem(kbuf, locate_mem_hole_callback);

AKASHI, since it is not a weak function now, it would be better to rename
the function, for example to kexec_walk_resource().

Other than this,

Acked-by: Dave Young <[email protected]>

>
> return ret == 1 ? 0 : -EADDRNOTAVAIL;
> }
> --
> 2.18.0
>

Thanks
Dave

2018-07-26 13:36:41

by James Morse

[permalink] [raw]
Subject: Re: [PATCH v12 10/16] arm64: kexec_file: load initrd and device-tree

Hi Akashi,

On 24/07/18 07:57, AKASHI Takahiro wrote:
> load_other_segments() is expected to allocate and place all the necessary
> memory segments other than kernel, including initrd and device-tree
> blob (and elf core header for crash).
> While most of the code was borrowed from kexec-tools' counterpart,
> users may not be allowed to specify dtb explicitly, instead, the dtb
> presented by the original boot loader is reused.
>
> arch_kimage_kernel_post_load_cleanup() is responsible for freeing arm64-
> specific data allocated in load_other_segments().

Since v11 you've renamed struct kimage_arch's dtb_buf as dtb, but not changed
the struct. This series doesn't build until patch 12 where you fix it. This will
cause anyone trying to bisect through here a problem.


Thanks,

James

> diff --git a/arch/arm64/include/asm/kexec.h b/arch/arm64/include/asm/kexec.h
> index e17f0529a882..026f7e408f0c 100644
> --- a/arch/arm64/include/asm/kexec.h
> +++ b/arch/arm64/include/asm/kexec.h

> +struct kimage_arch {
> + void *dtb_buf;
> + unsigned long dtb_mem;
> +};

> diff --git a/arch/arm64/kernel/machine_kexec_file.c b/arch/arm64/kernel/machine_kexec_file.c
> index c38a8048ed00..b28fbb0659c9 100644
> --- a/arch/arm64/kernel/machine_kexec_file.c
> +++ b/arch/arm64/kernel/machine_kexec_file.c

> +int arch_kimage_file_post_load_cleanup(struct kimage *image)
> +{
> + vfree(image->arch.dtb);
> + image->arch.dtb = NULL;
> +
> + return kexec_image_post_load_cleanup_default(image);
> +}


2018-07-26 13:37:33

by James Morse

[permalink] [raw]
Subject: Re: [PATCH v12 01/16] asm-generic: add kexec_file_load system call to unistd.h

Hi Akashi,

On 24/07/18 07:57, AKASHI Takahiro wrote:
> The initial user of this system call number is arm64.

This patch conflicts with commit db7a2d1809a5 ("asm-generic: unistd.h: Wire up
sys_rseq") in the arm64 tree.

Thanks,

James

> diff --git a/include/uapi/asm-generic/unistd.h b/include/uapi/asm-generic/unistd.h
> index 42990676a55e..c81f4a0df51f 100644
> --- a/include/uapi/asm-generic/unistd.h
> +++ b/include/uapi/asm-generic/unistd.h
> @@ -734,9 +734,11 @@ __SYSCALL(__NR_pkey_free, sys_pkey_free)
> __SYSCALL(__NR_statx, sys_statx)
> #define __NR_io_pgetevents 292
> __SC_COMP(__NR_io_pgetevents, sys_io_pgetevents, compat_sys_io_pgetevents)
> +#define __NR_kexec_file_load 293
> +__SYSCALL(__NR_kexec_file_load, sys_kexec_file_load)
>
> #undef __NR_syscalls
> -#define __NR_syscalls 293
> +#define __NR_syscalls 294
>
> /*
> * 32 bit systems traditionally used different
>

2018-07-26 13:37:33

by James Morse

[permalink] [raw]
Subject: Re: [PATCH v12 13/16] arm64: kexec_file: invoke the kernel without purgatory

Hi Akashi,

On 24/07/18 07:57, AKASHI Takahiro wrote:
> On arm64, purgatory would do almost nothing. So just invoke secondary
> kernel directly by jumping into its entry code.
>
> While, in this case, cpu_soft_restart() must be called with dtb address
> in the fifth argument, the behavior still stays compatible with kexec_load
> case as long as the argument is null.

This patch conflicts with commit 76f4e2da45b4 ("arm64: kexec: always reset to
EL2 if present") in the arm64 tree.

Thanks,

James

> diff --git a/arch/arm64/kernel/machine_kexec.c b/arch/arm64/kernel/machine_kexec.c
> index f76ea92dff91..830a5063e09d 100644
> --- a/arch/arm64/kernel/machine_kexec.c
> +++ b/arch/arm64/kernel/machine_kexec.c
> @@ -205,10 +205,18 @@ void machine_kexec(struct kimage *kimage)
> * uses physical addressing to relocate the new image to its final
> * position and transfers control to the image entry point when the
> * relocation is complete.
> + * In kexec case, kimage->start points to purgatory assuming that
> + * kernel entry and dtb address are embedded in purgatory by
> + * userspace (kexec-tools).
> + * In kexec_file case, the kernel starts directly without purgatory.
> */
> -
> cpu_soft_restart(kimage != kexec_crash_image,
> - reboot_code_buffer_phys, kimage->head, kimage->start, 0);
> + reboot_code_buffer_phys, kimage->head, kimage->start,
> +#ifdef CONFIG_KEXEC_FILE
> + kimage->arch.dtb_mem);
> +#else
> + 0);
> +#endif
>
> BUG(); /* Should never get here. */
> }

2018-07-26 13:38:40

by James Morse

[permalink] [raw]
Subject: Re: [PATCH v12 12/16] arm64: kexec_file: add crash dump support

Hi Akashi,

On 24/07/18 07:57, AKASHI Takahiro wrote:
> Enabling crash dump (kdump) includes
> * prepare contents of ELF header of a core dump file, /proc/vmcore,
> using crash_prepare_elf64_headers(), and
> * add two device tree properties, "linux,usable-memory-range" and
> "linux,elfcorehdr", which represent respectively a memory range
> to be used by crash dump kernel and the header's location

> diff --git a/arch/arm64/include/asm/kexec.h b/arch/arm64/include/asm/kexec.h
> index 5d102a1054b3..1b2c27026ae0 100644
> --- a/arch/arm64/include/asm/kexec.h
> +++ b/arch/arm64/include/asm/kexec.h
> @@ -97,8 +97,12 @@ static inline void crash_post_resume(void) {}
> #define ARCH_HAS_KIMAGE_ARCH
>
> struct kimage_arch {

> - void *dtb_buf;
> + void *dtb;

This change should be in an earlier patch, otherwise this series doesn't build
during bisect.

With the build-issues fixed:
Reviewed-by: James Morse <[email protected]>

Some boring Nits:

> unsigned long dtb_mem;
> + /* Core ELF header buffer */
> + void *elf_headers;
> + unsigned long elf_headers_mem;
> + unsigned long elf_headers_sz;
> };
> diff --git a/arch/arm64/kernel/machine_kexec_file.c b/arch/arm64/kernel/machine_kexec_file.c
> index b8297f10e2ef..7356da5a53d5 100644
> --- a/arch/arm64/kernel/machine_kexec_file.c
> +++ b/arch/arm64/kernel/machine_kexec_file.c
> @@ -38,12 +44,30 @@ static int setup_dtb(struct kimage *image,
> void **dtb_buf, unsigned long *dtb_buf_len)
> {
> void *buf = NULL;
> - size_t buf_size;
> + size_t buf_size, range_size;
> int nodeoffset;
> int ret;
>
> /* duplicate dt blob */
> buf_size = fdt_totalsize(initial_boot_params);
> + range_size = of_fdt_reg_cells_size();
> +
> + if (image->type == KEXEC_TYPE_CRASH) {
> + buf_size += fdt_prop_len("linux,elfcorehdr", range_size);
> + buf_size += fdt_prop_len("linux,usable-memory-range",
> + range_size);
> + }

Nit: it would be better if these strings were defined in a header file somewhere
so we don't risk a typo if this gets refactored.



> @@ -129,6 +170,43 @@ static int setup_dtb(struct kimage *image,
> return ret;
> }
>
> +static int prepare_elf_headers(void **addr, unsigned long *sz)
> +{
> + struct crash_mem *cmem;
> + unsigned int nr_ranges;
> + int ret;
> + u64 i;
> + phys_addr_t start, end;
> +
> + nr_ranges = 1; /* for exclusion of crashkernel region */

> + for_each_mem_range(i, &memblock.memory, NULL, NUMA_NO_NODE, 0,
> + &start, &end, NULL)

Nit: MEMBLOCK_NONE


Thanks,

James

2018-07-26 13:41:15

by James Morse

[permalink] [raw]
Subject: Re: [PATCH v12 15/16] arm64: kexec_file: add kernel signature verification support

Hi Akashi,

On 24/07/18 07:57, AKASHI Takahiro wrote:
> With this patch, kernel verification can be done without IMA security
> subsystem enabled. Turn on CONFIG_KEXEC_VERIFY_SIG instead.
>
> On x86, a signature is embedded into a PE file (Microsoft's format) header
> of binary. Since arm64's "Image" can also be seen as a PE file as far as
> CONFIG_EFI is enabled, we adopt this format for kernel signing.
>
> You can create a signed kernel image with:
> $ sbsign --key ${KEY} --cert ${CERT} Image

Reviewed-by: James Morse <[email protected]>


> diff --git a/arch/arm64/kernel/kexec_image.c b/arch/arm64/kernel/kexec_image.c
> index d64f5e9f9d22..578d358632d0 100644
> --- a/arch/arm64/kernel/kexec_image.c
> +++ b/arch/arm64/kernel/kexec_image.c
> @@ -102,7 +106,18 @@ static void *image_load(struct kimage *image,
> return ERR_PTR(ret);
> }
>
> +#ifdef CONFIG_KEXEC_IMAGE_VERIFY_SIG
> +static int image_verify_sig(const char *kernel, unsigned long kernel_len)
> +{
> + return verify_pefile_signature(kernel, kernel_len, NULL,
> + VERIFYING_KEXEC_PE_SIGNATURE);
> +}
> +#endif

This is identical to x86's PE image verification helper. We can clean this up
later by providing some kexec_image_verify_pe() in the core kexec_file code. It's
not worth doing now.
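
(Purely as a sketch of that possible clean-up, not something in this series,
such a shared helper in kernel/kexec_file.c might look roughly like:

#ifdef CONFIG_SIGNED_PE_FILE_VERIFICATION
/* hypothetical shared helper; both arm64 and x86 could point .verify_sig here */
int kexec_image_verify_pe(const char *kernel, unsigned long kernel_len)
{
	return verify_pefile_signature(kernel, kernel_len, NULL,
				       VERIFYING_KEXEC_PE_SIGNATURE);
}
#endif

with the arch code then just referencing it from its kexec_file_ops.)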


> const struct kexec_file_ops kexec_image_ops = {
> .probe = image_probe,
> .load = image_load,
> +#ifdef CONFIG_KEXEC_IMAGE_VERIFY_SIG
> + .verify_sig = image_verify_sig,
> +#endif
> };


Thanks,

James

2018-07-26 13:42:01

by James Morse

[permalink] [raw]
Subject: Re: [PATCH v12 16/16] arm64: kexec_file: add kaslr support

Hi Akashi,

On 24/07/18 07:57, AKASHI Takahiro wrote:
> Adding "kaslr-seed" to dtb enables triggering kaslr, or kernel virtual
> address randomization, at secondary kernel boot.

Hmm, there are three things that get moved by CONFIG_RANDOMIZE_BASE: the kernel's
physical placement when booted via the EFIstub, the kernel-text VAs and the
location of memory in the linear-map region. Adding the kaslr-seed only does the
last two.

This means the physical placement of the new kernel is predictable from
/proc/iomem ... but this also tells you the physical placement of the current
kernel, so I don't think this is a problem.


> We always do this as it will have no harm on kaslr-incapable kernel.

> We don't have any "switch" to turn off this feature directly, but still
> can suppress it by passing "nokaslr" as a kernel boot argument.


> diff --git a/arch/arm64/kernel/machine_kexec_file.c b/arch/arm64/kernel/machine_kexec_file.c
> index 7356da5a53d5..47a4fbd0dc34 100644
> --- a/arch/arm64/kernel/machine_kexec_file.c
> +++ b/arch/arm64/kernel/machine_kexec_file.c
> @@ -158,6 +160,12 @@ static int setup_dtb(struct kimage *image,

Don't you need to reserve some space in the area you vmalloc()d for the DT?


> + /* add kaslr-seed */
> + get_random_bytes(&value, sizeof(value));

What happens if the crng isn't ready?

It looks like this will print a warning that these random-bytes aren't really up
to standard, but the new kernel doesn't know this happened.

crng_ready() isn't exposed, all we could do now is
wait_for_random_bytes(), but that may wait forever because we do this
unconditionally.

I'd prefer to leave this feature until we can check crng_ready(), and skip
adding a dodgy seed if it's not ready. This avoids polluting the next kernel's
entropy pool.


> + ret = fdt_setprop(buf, nodeoffset, "kaslr-seed", &value, sizeof(value));

Nit: It would be nice if this string were in a header file somewhere, to avoid
future refactoring typos.


Thanks,

James

2018-07-27 05:22:46

by AKASHI Takahiro

[permalink] [raw]
Subject: Re: [PATCH v12 01/16] asm-generic: add kexec_file_load system call to unistd.h

On Thu, Jul 26, 2018 at 02:35:50PM +0100, James Morse wrote:
> Hi Akashi,
>
> On 24/07/18 07:57, AKASHI Takahiro wrote:
> > The initial user of this system call number is arm64.
>
> This patch conflicts with commit db7a2d1809a5 ("asm-generic: unistd.h: Wire up
> sys_rseq") in the arm64 tree.

OK. I will try to rebase my code to arm64/for-next/core at v13.

Thanks,
-Takahiro AKASHI

> Thanks,
>
> James
>
> > diff --git a/include/uapi/asm-generic/unistd.h b/include/uapi/asm-generic/unistd.h
> > index 42990676a55e..c81f4a0df51f 100644
> > --- a/include/uapi/asm-generic/unistd.h
> > +++ b/include/uapi/asm-generic/unistd.h
> > @@ -734,9 +734,11 @@ __SYSCALL(__NR_pkey_free, sys_pkey_free)
> > __SYSCALL(__NR_statx, sys_statx)
> > #define __NR_io_pgetevents 292
> > __SC_COMP(__NR_io_pgetevents, sys_io_pgetevents, compat_sys_io_pgetevents)
> > +#define __NR_kexec_file_load 293
> > +__SYSCALL(__NR_kexec_file_load, sys_kexec_file_load)
> >
> > #undef __NR_syscalls
> > -#define __NR_syscalls 293
> > +#define __NR_syscalls 294
> >
> > /*
> > * 32 bit systems traditionally used different
> >

2018-07-27 05:25:02

by AKASHI Takahiro

[permalink] [raw]
Subject: Re: [PATCH v12 04/16] powerpc, kexec_file: factor out memblock-based arch_kexec_walk_mem()

On Wed, Jul 25, 2018 at 08:31:29PM +0800, Dave Young wrote:
> On 07/24/18 at 03:57pm, AKASHI Takahiro wrote:
> > Memblock list is another source for usable system memory layout.
> > So move powerpc's arch_kexec_walk_mem() to common code so that other
> > memblock-based architectures, particularly arm64, can also utilise it.
> > A moved function is now renamed to kexec_walk_memblock() and integrated
> > into kexec_locate_mem_hole(), which will now be usable for all
> > architectures with no need for overriding arch_kexec_walk_mem().
> >
> > kexec_walk_memblock() will not work for kdump in this form, this will be
> > fixed in the next patch.
> >
> > Signed-off-by: AKASHI Takahiro <[email protected]>
> > Cc: "Eric W. Biederman" <[email protected]>
> > Cc: Dave Young <[email protected]>
> > Cc: Vivek Goyal <[email protected]>
> > Cc: Baoquan He <[email protected]>
> > Acked-by: James Morse <[email protected]>
> > ---
> > arch/powerpc/kernel/machine_kexec_file_64.c | 54 -------------------
> > include/linux/kexec.h | 2 -
> > kernel/kexec_file.c | 58 ++++++++++++++++++++-
> > 3 files changed, 56 insertions(+), 58 deletions(-)
> >
> > diff --git a/arch/powerpc/kernel/machine_kexec_file_64.c b/arch/powerpc/kernel/machine_kexec_file_64.c
> > index 0bd23dc789a4..5357b09902c5 100644
> > --- a/arch/powerpc/kernel/machine_kexec_file_64.c
> > +++ b/arch/powerpc/kernel/machine_kexec_file_64.c
> > @@ -24,7 +24,6 @@
> >
> > #include <linux/slab.h>
> > #include <linux/kexec.h>
> > -#include <linux/memblock.h>
> > #include <linux/of_fdt.h>
> > #include <linux/libfdt.h>
> > #include <asm/ima.h>
> > @@ -46,59 +45,6 @@ int arch_kexec_kernel_image_probe(struct kimage *image, void *buf,
> > return kexec_image_probe_default(image, buf, buf_len);
> > }
> >
> > -/**
> > - * arch_kexec_walk_mem - call func(data) for each unreserved memory block
> > - * @kbuf: Context info for the search. Also passed to @func.
> > - * @func: Function to call for each memory block.
> > - *
> > - * This function is used by kexec_add_buffer and kexec_locate_mem_hole
> > - * to find unreserved memory to load kexec segments into.
> > - *
> > - * Return: The memory walk will stop when func returns a non-zero value
> > - * and that value will be returned. If all free regions are visited without
> > - * func returning non-zero, then zero will be returned.
> > - */
> > -int arch_kexec_walk_mem(struct kexec_buf *kbuf,
> > - int (*func)(struct resource *, void *))
> > -{
> > - int ret = 0;
> > - u64 i;
> > - phys_addr_t mstart, mend;
> > - struct resource res = { };
> > -
> > - if (kbuf->top_down) {
> > - for_each_free_mem_range_reverse(i, NUMA_NO_NODE, 0,
> > - &mstart, &mend, NULL) {
> > - /*
> > - * In memblock, end points to the first byte after the
> > - * range while in kexec, end points to the last byte
> > - * in the range.
> > - */
> > - res.start = mstart;
> > - res.end = mend - 1;
> > - ret = func(&res, kbuf);
> > - if (ret)
> > - break;
> > - }
> > - } else {
> > - for_each_free_mem_range(i, NUMA_NO_NODE, 0, &mstart, &mend,
> > - NULL) {
> > - /*
> > - * In memblock, end points to the first byte after the
> > - * range while in kexec, end points to the last byte
> > - * in the range.
> > - */
> > - res.start = mstart;
> > - res.end = mend - 1;
> > - ret = func(&res, kbuf);
> > - if (ret)
> > - break;
> > - }
> > - }
> > -
> > - return ret;
> > -}
> > -
> > /**
> > * setup_purgatory - initialize the purgatory's global variables
> > * @image: kexec image.
> > diff --git a/include/linux/kexec.h b/include/linux/kexec.h
> > index 49ab758f4d91..c196bfd11bee 100644
> > --- a/include/linux/kexec.h
> > +++ b/include/linux/kexec.h
> > @@ -184,8 +184,6 @@ int __weak arch_kexec_apply_relocations(struct purgatory_info *pi,
> > const Elf_Shdr *relsec,
> > const Elf_Shdr *symtab);
> >
> > -int __weak arch_kexec_walk_mem(struct kexec_buf *kbuf,
> > - int (*func)(struct resource *, void *));
> > extern int kexec_add_buffer(struct kexec_buf *kbuf);
> > int kexec_locate_mem_hole(struct kexec_buf *kbuf);
> >
> > diff --git a/kernel/kexec_file.c b/kernel/kexec_file.c
> > index bf39df5e5bb9..2f0691b0f8ad 100644
> > --- a/kernel/kexec_file.c
> > +++ b/kernel/kexec_file.c
> > @@ -16,6 +16,7 @@
> > #include <linux/file.h>
> > #include <linux/slab.h>
> > #include <linux/kexec.h>
> > +#include <linux/memblock.h>
> > #include <linux/mutex.h>
> > #include <linux/list.h>
> > #include <linux/fs.h>
> > @@ -501,6 +502,55 @@ static int locate_mem_hole_callback(struct resource *res, void *arg)
> > return locate_mem_hole_bottom_up(start, end, kbuf);
> > }
> >
> > +#if defined(CONFIG_HAVE_MEMBLOCK) && !defined(CONFIG_ARCH_DISCARD_MEMBLOCK)
> > +static int kexec_walk_memblock(struct kexec_buf *kbuf,
> > + int (*func)(struct resource *, void *))
> > +{
> > + int ret = 0;
> > + u64 i;
> > + phys_addr_t mstart, mend;
> > + struct resource res = { };
> > +
> > + if (kbuf->top_down) {
> > + for_each_free_mem_range_reverse(i, NUMA_NO_NODE, 0,
> > + &mstart, &mend, NULL) {
> > + /*
> > + * In memblock, end points to the first byte after the
> > + * range while in kexec, end points to the last byte
> > + * in the range.
> > + */
> > + res.start = mstart;
> > + res.end = mend - 1;
> > + ret = func(&res, kbuf);
> > + if (ret)
> > + break;
> > + }
> > + } else {
> > + for_each_free_mem_range(i, NUMA_NO_NODE, 0, &mstart, &mend,
> > + NULL) {
> > + /*
> > + * In memblock, end points to the first byte after the
> > + * range while in kexec, end points to the last byte
> > + * in the range.
> > + */
> > + res.start = mstart;
> > + res.end = mend - 1;
> > + ret = func(&res, kbuf);
> > + if (ret)
> > + break;
> > + }
> > + }
> > +
> > + return ret;
> > +}
> > +#else
> > +static int kexec_walk_memblock(struct kexec_buf *kbuf,
> > + int (*func)(struct resource *, void *))
> > +{
> > + return 0;
> > +}
> > +#endif
> > +
> > /**
> > * arch_kexec_walk_mem - call func(data) on free memory regions
> > * @kbuf: Context info for the search. Also passed to @func.
> > @@ -510,7 +560,7 @@ static int locate_mem_hole_callback(struct resource *res, void *arg)
> > * and that value will be returned. If all free regions are visited without
> > * func returning non-zero, then zero will be returned.
> > */
> > -int __weak arch_kexec_walk_mem(struct kexec_buf *kbuf,
> > +static int arch_kexec_walk_mem(struct kexec_buf *kbuf,
> > int (*func)(struct resource *, void *))
> > {
> > if (kbuf->image->type == KEXEC_TYPE_CRASH)
> > @@ -538,7 +588,11 @@ int kexec_locate_mem_hole(struct kexec_buf *kbuf)
> > if (kbuf->mem)
> > return 0;
> >
> > - ret = arch_kexec_walk_mem(kbuf, locate_mem_hole_callback);
> > + if (IS_ENABLED(CONFIG_HAVE_MEMBLOCK) &&
> > + !IS_ENABLED(CONFIG_ARCH_DISCARD_MEMBLOCK))
> > + ret = kexec_walk_memblock(kbuf, locate_mem_hole_callback);
> > + else
> > + ret = arch_kexec_walk_mem(kbuf, locate_mem_hole_callback);
>
> AKASHI, since it is not weak function now, it would be better to rename
> the function for example name it as kexec_walk_resource()

OK.

-Takahiro AKASHI


> Other than this,
>
> Acked-by: Dave Young <[email protected]>
>
> >
> > return ret == 1 ? 0 : -EADDRNOTAVAIL;
> > }
> > --
> > 2.18.0
> >
>
> Thanks
> Dave

2018-07-27 05:39:00

by AKASHI Takahiro

[permalink] [raw]
Subject: Re: [PATCH v12 10/16] arm64: kexec_file: load initrd and device-tree

On Thu, Jul 26, 2018 at 02:34:55PM +0100, James Morse wrote:
> Hi Akashi,
>
> On 24/07/18 07:57, AKASHI Takahiro wrote:
> > load_other_segments() is expected to allocate and place all the necessary
> > memory segments other than kernel, including initrd and device-tree
> > blob (and elf core header for crash).
> > While most of the code was borrowed from kexec-tools' counterpart,
> > users may not be allowed to specify dtb explicitly, instead, the dtb
> > presented by the original boot loader is reused.
> >
> > arch_kimage_kernel_post_load_cleanup() is responsible for freeing arm64-
> > specific data allocated in load_other_segments().
>
> Since v11 you've renamed struct kimage_arch's dtb_buf as dtb, but not changed
> the struct. This series doesn't build until patch 12 where you fix it. This will
> cause anyone trying to bisect through here a problem.

Right. My last-minute change introduced this screw-up.
I will double-check at v13.

Thanks,
-Takahiro AKASHI

>
> Thanks,
>
> James
>
> > diff --git a/arch/arm64/include/asm/kexec.h b/arch/arm64/include/asm/kexec.h
> > index e17f0529a882..026f7e408f0c 100644
> > --- a/arch/arm64/include/asm/kexec.h
> > +++ b/arch/arm64/include/asm/kexec.h
>
> > +struct kimage_arch {
> > + void *dtb_buf;
> > + unsigned long dtb_mem;
> > +};
>
> > diff --git a/arch/arm64/kernel/machine_kexec_file.c b/arch/arm64/kernel/machine_kexec_file.c
> > index c38a8048ed00..b28fbb0659c9 100644
> > --- a/arch/arm64/kernel/machine_kexec_file.c
> > +++ b/arch/arm64/kernel/machine_kexec_file.c
>
> > +int arch_kimage_file_post_load_cleanup(struct kimage *image)
> > +{
> > + vfree(image->arch.dtb);
> > + image->arch.dtb = NULL;
> > +
> > + return kexec_image_post_load_cleanup_default(image);
> > +}
>

2018-07-27 07:00:12

by AKASHI Takahiro

[permalink] [raw]
Subject: Re: [PATCH v12 12/16] arm64: kexec_file: add crash dump support

On Thu, Jul 26, 2018 at 02:36:58PM +0100, James Morse wrote:
> Hi Akashi,
>
> On 24/07/18 07:57, AKASHI Takahiro wrote:
> > Enabling crash dump (kdump) includes
> > * prepare contents of ELF header of a core dump file, /proc/vmcore,
> > using crash_prepare_elf64_headers(), and
> > * add two device tree properties, "linux,usable-memory-range" and
> > "linux,elfcorehdr", which represent respectively a memory range
> > to be used by crash dump kernel and the header's location
>
> > diff --git a/arch/arm64/include/asm/kexec.h b/arch/arm64/include/asm/kexec.h
> > index 5d102a1054b3..1b2c27026ae0 100644
> > --- a/arch/arm64/include/asm/kexec.h
> > +++ b/arch/arm64/include/asm/kexec.h
> > @@ -97,8 +97,12 @@ static inline void crash_post_resume(void) {}
> > #define ARCH_HAS_KIMAGE_ARCH
> >
> > struct kimage_arch {
>
> > - void *dtb_buf;
> > + void *dtb;
>
> This change should be in an earlier patch, otherwise this series doesn't build
> during bisect.

Will fix.

> With the build-issues fixed:
> Reviewed-by: James Morse <[email protected]>
>
> Some boring Nits:
>
> > unsigned long dtb_mem;
> > + /* Core ELF header buffer */
> > + void *elf_headers;
> > + unsigned long elf_headers_mem;
> > + unsigned long elf_headers_sz;
> > };
> > diff --git a/arch/arm64/kernel/machine_kexec_file.c b/arch/arm64/kernel/machine_kexec_file.c
> > index b8297f10e2ef..7356da5a53d5 100644
> > --- a/arch/arm64/kernel/machine_kexec_file.c
> > +++ b/arch/arm64/kernel/machine_kexec_file.c
> > @@ -38,12 +44,30 @@ static int setup_dtb(struct kimage *image,
> > void **dtb_buf, unsigned long *dtb_buf_len)
> > {
> > void *buf = NULL;
> > - size_t buf_size;
> > + size_t buf_size, range_size;
> > int nodeoffset;
> > int ret;
> >
> > /* duplicate dt blob */
> > buf_size = fdt_totalsize(initial_boot_params);
> > + range_size = of_fdt_reg_cells_size();
> > +
> > + if (image->type == KEXEC_TYPE_CRASH) {
> > + buf_size += fdt_prop_len("linux,elfcorehdr", range_size);
> > + buf_size += fdt_prop_len("linux,usable-memory-range",
> > + range_size);
> > + }
>
> Nit: it would be better if these strings were defined in a header file somewhere
> so we don't risk a typo if this gets refactored.

Nit??
Well, I do understand your concern, but it's a bit of a headache.
If we handle these strings that way, we may want to handle others, such as
linux,initrd-start/end, chosen and bootargs, in this file as well.
They are not kexec specific and should go into a common header, which would
end up propagating similar changes to other non-kexec-related occurrences
under drivers/of.

So I want to keep the following definitions local to this file for easy
maintenance.
#define FDT_PSTR_KEXEC_ELFHDR "linux,elfcorehdr"
#define FDT_PSTR_MEM_RANGE "linux,usable-memory-range"
#define FDT_PSTR_INITRD_ST "linux,initrd-start"
#define FDT_PSTR_INITRD_END "linux,initrd-end"
#define FDT_PSTR_BOOTARGS "bootargs"
#define FDT_PSTR_KASLR_SEED "kaslr-seed"
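
With these in place, a call site in setup_dtb() would just read, e.g.
(illustration only, not the actual v13 diff):

	ret = fdt_setprop(buf, nodeoffset, FDT_PSTR_KASLR_SEED,
			  &value, sizeof(value));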

>
>
> > @@ -129,6 +170,43 @@ static int setup_dtb(struct kimage *image,
> > return ret;
> > }
> >
> > +static int prepare_elf_headers(void **addr, unsigned long *sz)
> > +{
> > + struct crash_mem *cmem;
> > + unsigned int nr_ranges;
> > + int ret;
> > + u64 i;
> > + phys_addr_t start, end;
> > +
> > + nr_ranges = 1; /* for exclusion of crashkernel region */
>
> > + for_each_mem_range(i, &memblock.memory, NULL, NUMA_NO_NODE, 0,
> > + &start, &end, NULL)
>
> Nit: MEMBLOCK_NONE

OK

-Takahiro AKASHI


>
> Thanks,
>
> James

2018-07-27 07:23:21

by AKASHI Takahiro

[permalink] [raw]
Subject: Re: [PATCH v12 13/16] arm64: kexec_file: invoke the kernel without purgatory

On Thu, Jul 26, 2018 at 02:36:07PM +0100, James Morse wrote:
> Hi Akashi,
>
> On 24/07/18 07:57, AKASHI Takahiro wrote:
> > On arm64, purgatory would do almost nothing. So just invoke secondary
> > kernel directly by jumping into its entry code.
> >
> > While, in this case, cpu_soft_restart() must be called with dtb address
> > in the fifth argument, the behavior still stays compatible with kexec_load
> > case as long as the argument is null.
>
> This patch conflicts with commit 76f4e2da45b4 ("arm64: kexec: always reset to
> EL2 if present") in the arm64 tree.

I hadn't noticed Mark's patch.

I'm going to have to refresh my memory regarding why I introduced
el2_switch when I implemented kdump.
As far as I recall, I added kvm_arch_hardware_enable/disable(), and
associated functions, to gracefully shut down EL2 in the kexec case. Since
we have no chance to call the reset function (via a notifier) at kdump, I
believed that el2_switch was necessary for a better chance of a successful
kdump.

Thanks,
-Takahiro AKASHI

> Thanks,
>
> James
>
> > diff --git a/arch/arm64/kernel/machine_kexec.c b/arch/arm64/kernel/machine_kexec.c
> > index f76ea92dff91..830a5063e09d 100644
> > --- a/arch/arm64/kernel/machine_kexec.c
> > +++ b/arch/arm64/kernel/machine_kexec.c
> > @@ -205,10 +205,18 @@ void machine_kexec(struct kimage *kimage)
> > * uses physical addressing to relocate the new image to its final
> > * position and transfers control to the image entry point when the
> > * relocation is complete.
> > + * In kexec case, kimage->start points to purgatory assuming that
> > + * kernel entry and dtb address are embedded in purgatory by
> > + * userspace (kexec-tools).
> > + * In kexec_file case, the kernel starts directly without purgatory.
> > */
> > -
> > cpu_soft_restart(kimage != kexec_crash_image,
> > - reboot_code_buffer_phys, kimage->head, kimage->start, 0);
> > + reboot_code_buffer_phys, kimage->head, kimage->start,
> > +#ifdef CONFIG_KEXEC_FILE
> > + kimage->arch.dtb_mem);
> > +#else
> > + 0);
> > +#endif
> >
> > BUG(); /* Should never get here. */
> > }

2018-07-27 08:30:41

by AKASHI Takahiro

[permalink] [raw]
Subject: Re: [PATCH v12 16/16] arm64: kexec_file: add kaslr support

On Thu, Jul 26, 2018 at 02:40:49PM +0100, James Morse wrote:
> Hi Akashi,
>
> On 24/07/18 07:57, AKASHI Takahiro wrote:
> > Adding "kaslr-seed" to dtb enables triggering kaslr, or kernel virtual
> > address randomization, at secondary kernel boot.
>
> Hmm, there are three things that get moved by CONFIG_RANDOMIZE_BASE. The kernel
> physical placement when booted via the EFIstub, the kernel-text VAs and the
> location of memory in the linear-map region. Adding the kaslr-seed only does the
> last two.

Yes, but I think Mark and I agreed that "kaslr" meant "virtual"
randomisation, not including "physical" randomisation.

> This means the physical placement of the new kernel is predictable from
> /proc/iomem ... but this also tells you the physical placement of the current
> kernel, so I don't think this is a problem.
>
>
> > We always do this as it will have no harm on kaslr-incapable kernel.
>
> > We don't have any "switch" to turn off this feature directly, but still
> > can suppress it by passing "nokaslr" as a kernel boot argument.
>
>
> > diff --git a/arch/arm64/kernel/machine_kexec_file.c b/arch/arm64/kernel/machine_kexec_file.c
> > index 7356da5a53d5..47a4fbd0dc34 100644
> > --- a/arch/arm64/kernel/machine_kexec_file.c
> > +++ b/arch/arm64/kernel/machine_kexec_file.c
> > @@ -158,6 +160,12 @@ static int setup_dtb(struct kimage *image,
>
> Don't you need to reserve some space in the area you vmalloc()d for the DT?

No, I don't think so.
All the data to be loaded are temporarily saved in kexec buffers,
which will eventually be copied to target locations in machine_kexec
(arm64_relocate_new_kernel, which, despite its name, handles not only
the kernel but other data as well).

>
> > + /* add kaslr-seed */
> > + get_random_bytes(&value, sizeof(value));
>
> What happens if the crng isn't ready?
>
> It looks like this will print a warning that these random-bytes aren't really up
> to standard, but the new kernel doesn't know this happened.
>
> crng_ready() isn't exposed, all we could do now is
> wait_for_random_bytes(), but that may wait forever because we do this
> unconditionally.
>
> I'd prefer to leave this feature until we can check crng_ready(), and skip
> adding a dodgy-seed if its not-ready. This avoids polluting the next-kernel's
> entropy pool.

OK. I will try to follow the same approach as Bhupesh's userspace patch
takes for kaslr-seed:
http://lists.infradead.org/pipermail/kexec/2018-April/020564.html

if (not found kaslr-seed in 1st kernel's dtb)
	don't care; go ahead
else
	if (current kaslr-seed != 0)
		error
	if (crng_ready())	; FIXME, it's a local macro
		get_random_bytes(non-blocking)
		set new kaslr-seed
	else
		error
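
In C, that flow would be roughly the sketch below (illustration only, not
the actual v13 change; the function name is made up, it pretends a
crng_ready()-style check were callable from here, which it currently isn't,
and it needs linux/libfdt.h and linux/random.h):

static int setup_kaslr_seed(void *dtb, int off)
{
	const fdt64_t *prop;
	u64 seed;
	int len;

	prop = fdt_getprop(dtb, off, "kaslr-seed", &len);
	if (!prop)
		return 0;	/* not found: don't care, go ahead */

	if (len != sizeof(*prop) || fdt64_to_cpu(*prop) != 0)
		return -EINVAL;	/* unexpected non-zero seed: error */

	if (!crng_ready())	/* FIXME: local to drivers/char/random.c */
		return -EAGAIN;	/* error: no decent entropy yet */

	get_random_bytes(&seed, sizeof(seed));
	return fdt_setprop(dtb, off, "kaslr-seed", &seed, sizeof(seed));
}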

>
> > + ret = fdt_setprop(buf, nodeoffset, "kaslr-seed", &value, sizeof(value));
>
> Nit: It would be nice if this string were in a header file somewhere, to void
> future refactoring typos.

OK. (but in this file for now as I mentioned in my previous reply)

Thanks,
-Takahiro AKASHI

>
> Thanks,
>
> James

2018-07-27 09:24:12

by James Morse

[permalink] [raw]
Subject: Re: [PATCH v12 16/16] arm64: kexec_file: add kaslr support

Hi Akashi,


On 07/27/2018 09:31 AM, AKASHI Takahiro wrote:
> On Thu, Jul 26, 2018 at 02:40:49PM +0100, James Morse wrote:
>> On 24/07/18 07:57, AKASHI Takahiro wrote:
>>> Adding "kaslr-seed" to dtb enables triggering kaslr, or kernel virtual
>>> address randomization, at secondary kernel boot.
>> Hmm, there are three things that get moved by CONFIG_RANDOMIZE_BASE. The kernel
>> physical placement when booted via the EFIstub, the kernel-text VAs and the
>> location of memory in the linear-map region. Adding the kaslr-seed only does the
>> last two.
> Yes, but I think that I and Mark has agreed that "kaslr" meant
> "virtual" randomisation, not including "physical" randomisation.
Okay, I'll update my terminology!


>> This means the physical placement of the new kernel is predictable from
>> /proc/iomem ... but this also tells you the physical placement of the current
>> kernel, so I don't think this is a problem.
>>
>>
>>> We always do this as it will have no harm on kaslr-incapable kernel.
>>> We don't have any "switch" to turn off this feature directly, but still
>>> can suppress it by passing "nokaslr" as a kernel boot argument.
>>
>>> diff --git a/arch/arm64/kernel/machine_kexec_file.c b/arch/arm64/kernel/machine_kexec_file.c
>>> index 7356da5a53d5..47a4fbd0dc34 100644
>>> --- a/arch/arm64/kernel/machine_kexec_file.c
>>> +++ b/arch/arm64/kernel/machine_kexec_file.c
>>> @@ -158,6 +160,12 @@ static int setup_dtb(struct kimage *image,
>> Don't you need to reserve some space in the area you vmalloc()d for the DT?
> No, I don't think so.
> All the data to be loaded are temporarily saved in kexec buffers,
> which will eventually be copied to target locations in machine_kexec
> (arm64_relocate_new_kernel, which, unlike its name, will handle
> not only kernel but also other data as well).

I think we're speaking at cross purposes. Don't you need:

| buf_size += fdt_prop_len("kaslr-seed", sizeof(u64));


You can't assume the existing DTB had a kaslr-seed property, and the
difference may take us over a PAGE_SIZE boundary.


>
>>
>>> + /* add kaslr-seed */
>>> + get_random_bytes(&value, sizeof(value));
>> What happens if the crng isn't ready?
>>
>> It looks like this will print a warning that these random-bytes aren't really up
>> to standard, but the new kernel doesn't know this happened.
>>
>> crng_ready() isn't exposed, all we could do now is
>> wait_for_random_bytes(), but that may wait forever because we do this
>> unconditionally.
>>
>> I'd prefer to leave this feature until we can check crng_ready(), and skip
>> adding a dodgy-seed if its not-ready. This avoids polluting the next-kernel's
>> entropy pool.
> OK. I would try to follow the same way as Bhupesh's userspace patch
> does for kaslr-seed:
> http://lists.infradead.org/pipermail/kexec/2018-April/020564.html

(I really don't understand this 'copying code from user-space' that
happens with kexec_file_load)


> if (not found kaslr-seed in 1st kernel's dtb)
> don't care; go ahead

Don't bother. As you say in the commit message, it's harmless if the new
kernel doesn't support it.
Always having this would let you use kexec_file_load as a bootloader that
can get the crng to provide decent entropy even if the platform bootloader
can't.


> else
> if (current kaslr-seed != 0)
> error

Don't bother. If this happens it's a bug in another part of the kernel
that doesn't affect this one. We aren't second-guessing the file-system
when we read the kernel-fd; let's keep this simple.

> if (crng_ready()) ; FIXME, it's a local macro
> get_random_bytes(non-blocking)
> set new kaslr-seed
> else
> error
error? Something like pr_warn_once().

I thought the kaslr-seed was added to the entropy pool, but now I look
again I see it's a separate EFI table. So the new kernel will add the
same entropy ... that doesn't sound clever. (I can't see where it's
zeroed or re-initialised.)



Thanks,

James

2018-07-27 09:30:06

by Ard Biesheuvel

[permalink] [raw]
Subject: Re: [PATCH v12 16/16] arm64: kexec_file: add kaslr support

On 27 July 2018 at 11:22, James Morse <[email protected]> wrote:
> Hi Akashi,
>
>
> On 07/27/2018 09:31 AM, AKASHI Takahiro wrote:
>
> On Thu, Jul 26, 2018 at 02:40:49PM +0100, James Morse wrote:
>
> On 24/07/18 07:57, AKASHI Takahiro wrote:
>
> Adding "kaslr-seed" to dtb enables triggering kaslr, or kernel virtual
> address randomization, at secondary kernel boot.
>
> Hmm, there are three things that get moved by CONFIG_RANDOMIZE_BASE. The kernel
> physical placement when booted via the EFIstub, the kernel-text VAs and the
> location of memory in the linear-map region. Adding the kaslr-seed only does the
> last two.
>
> Yes, but I think that I and Mark has agreed that "kaslr" meant
> "virtual" randomisation, not including "physical" randomisation.
>
> Okay, I'll update my terminology!
>
>
> This means the physical placement of the new kernel is predictable from
> /proc/iomem ... but this also tells you the physical placement of the current
> kernel, so I don't think this is a problem.
>
>
> We always do this as it will have no harm on kaslr-incapable kernel.
>
> We don't have any "switch" to turn off this feature directly, but still
> can suppress it by passing "nokaslr" as a kernel boot argument.
>
> diff --git a/arch/arm64/kernel/machine_kexec_file.c
> b/arch/arm64/kernel/machine_kexec_file.c
> index 7356da5a53d5..47a4fbd0dc34 100644
> --- a/arch/arm64/kernel/machine_kexec_file.c
> +++ b/arch/arm64/kernel/machine_kexec_file.c
> @@ -158,6 +160,12 @@ static int setup_dtb(struct kimage *image,
>
> Don't you need to reserve some space in the area you vmalloc()d for the DT?
>
> No, I don't think so.
> All the data to be loaded are temporarily saved in kexec buffers,
> which will eventually be copied to target locations in machine_kexec
> (arm64_relocate_new_kernel, which, unlike its name, will handle
> not only kernel but also other data as well).
>
>
> I think we're speaking at cross purposes. Don't you need:
>
> | buf_size += fdt_prop_len("kaslr-seed", sizeof(u64));
>
>
> You can't assume the existing DTB had a kaslr-seed property, and the
> difference may take us over a PAGE_SIZE boundary.
>
>
>
>
> + /* add kaslr-seed */
> + get_random_bytes(&value, sizeof(value));
>
> What happens if the crng isn't ready?
>
> It looks like this will print a warning that these random-bytes aren't really up
> to standard, but the new kernel doesn't know this happened.
>
> crng_ready() isn't exposed, all we could do now is
> wait_for_random_bytes(), but that may wait forever because we do this
> unconditionally.
>
> I'd prefer to leave this feature until we can check crng_ready(), and skip
> adding a dodgy-seed if its not-ready. This avoids polluting the
> next-kernel's entropy pool.
>
> OK. I would try to follow the same way as Bhupesh's userspace patch
> does for kaslr-seed:
> http://lists.infradead.org/pipermail/kexec/2018-April/020564.html
>
>
> (I really don't understand this 'copying code from user-space' that happens
> with kexec_file_load)
>
>
> if (not found kaslr-seed in 1st kernel's dtb)
> don't care; go ahead
>
>
> Don' t bother. As you say in the commit-message its harmless if the new
> kernel doesn't support it.
> Always having this would let you use kexec_file_load as a bootloader that
> can get the crng to
> provide decent entropy even if the platform bootloader can't.
>
>
> else
> if (current kaslr-seed != 0)
> error
>
>
> Don't bother. If this happens its a bug in another part of the kernel that
> doesn't affect this one. We aren't second-guessing the file-system when we
> read the kernel-fd, lets keep this simple.
>
> if (crng_ready()) ; FIXME, it's a local macro
> get_random_bytes(non-blocking)
> set new kaslr-seed
> else
> error
>
> error? Something like pr_warn_once().
>
> I thought the kaslr-seed was added to the entropy pool, but now I look again
> I see its a separate EFI table. So the new kernel will add the same entropy
> ... that doesn't sound clever. (I can't see where its zero'd or
> re-initialised)
>

We do have a hook for that: grep for update_efi_random_seed()

2018-08-01 07:59:15

by AKASHI Takahiro

[permalink] [raw]
Subject: Re: [PATCH v12 16/16] arm64: kexec_file: add kaslr support

James,

All the changes mentioned below have been applied to my upcoming v13.

On Fri, Jul 27, 2018 at 10:22:31AM +0100, James Morse wrote:
> Hi Akashi,
>
>
> On 07/27/2018 09:31 AM, AKASHI Takahiro wrote:
> >On Thu, Jul 26, 2018 at 02:40:49PM +0100, James Morse wrote:
> >>On 24/07/18 07:57, AKASHI Takahiro wrote:
> >>>Adding "kaslr-seed" to dtb enables triggering kaslr, or kernel virtual
> >>>address randomization, at secondary kernel boot.
> >>Hmm, there are three things that get moved by CONFIG_RANDOMIZE_BASE. The kernel
> >>physical placement when booted via the EFIstub, the kernel-text VAs and the
> >>location of memory in the linear-map region. Adding the kaslr-seed only does the
> >>last two.
> >Yes, but I think that I and Mark has agreed that "kaslr" meant
> >"virtual" randomisation, not including "physical" randomisation.
> Okay, I'll update my terminology!
>
>
> >>This means the physical placement of the new kernel is predictable from
> >>/proc/iomem ... but this also tells you the physical placement of the current
> >>kernel, so I don't think this is a problem.
> >>
> >>
> >>>We always do this as it will have no harm on kaslr-incapable kernel.
> >>>We don't have any "switch" to turn off this feature directly, but still
> >>>can suppress it by passing "nokaslr" as a kernel boot argument.
> >>
> >>>diff --git a/arch/arm64/kernel/machine_kexec_file.c b/arch/arm64/kernel/machine_kexec_file.c
> >>>index 7356da5a53d5..47a4fbd0dc34 100644
> >>>--- a/arch/arm64/kernel/machine_kexec_file.c
> >>>+++ b/arch/arm64/kernel/machine_kexec_file.c
> >>>@@ -158,6 +160,12 @@ static int setup_dtb(struct kimage *image,
> >>Don't you need to reserve some space in the area you vmalloc()d for the DT?
> >No, I don't think so.
> >All the data to be loaded are temporarily saved in kexec buffers,
> >which will eventually be copied to target locations in machine_kexec
> >(arm64_relocate_new_kernel, which, unlike its name, will handle
> >not only kernel but also other data as well).
>
> I think we're speaking at cross purposes. Don't you need:
>
> | buf_size += fdt_prop_len("kaslr-seed", sizeof(u64));
>
>
> You can't assume the existing DTB had a kaslr-seed property, and the
> difference may take us over a PAGE_SIZE boundary.

I see, I will add that.

>
> >
> >>
> >>>+ /* add kaslr-seed */
> >>>+ get_random_bytes(&value, sizeof(value));
> >>What happens if the crng isn't ready?
> >>
> >>It looks like this will print a warning that these random-bytes aren't really up
> >>to standard, but the new kernel doesn't know this happened.
> >>
> >>crng_ready() isn't exposed, all we could do now is
> >>wait_for_random_bytes(), but that may wait forever because we do this
> >>unconditionally.
> >>
> >>I'd prefer to leave this feature until we can check crng_ready(), and skip
> >>adding a dodgy-seed if its not-ready. This avoids polluting the next-kernel's
> >>entropy pool.
> >OK. I would try to follow the same way as Bhupesh's userspace patch
> >does for kaslr-seed:
> >http://lists.infradead.org/pipermail/kexec/2018-April/020564.html
>
> (I really don't understand this 'copying code from user-space' that happens
> with kexec_file_load)
>
>
> > if (not found kaslr-seed in 1st kernel's dtb)
> > don't care; go ahead
>
> Don' t bother. As you say in the commit-message its harmless if the new
> kernel doesn't support it.
> Always having this would let you use kexec_file_load as a bootloader that
> can get the crng to
> provide decent entropy even if the platform bootloader can't.

OK, but in any case the previous "kaslr-seed" will be dropped first.

>
> > else
> > if (current kaslr-seed != 0)
> > error
>
> Don't bother. If this happens its a bug in another part of the kernel that
> doesn't affect this one. We aren't second-guessing the file-system when we
> read the kernel-fd, lets keep this simple.

OK

> > if (crng_ready()) ; FIXME, it's a local macro
> > get_random_bytes(non-blocking)
> > set new kaslr-seed
> > else
> > error
> error? Something like pr_warn_once().

I changed it to pr_notice(), since nothing is actually wrong in that case.

Thanks,
-Takahiro AKASHI

> I thought the kaslr-seed was added to the entropy pool, but now I look again
> I see its a separate EFI table. So the new kernel will add the same entropy
> ... that doesn't sound clever. (I can't see where its zero'd or
> re-initialised)
>
>
>
> Thanks,
>
> James