***Background:
People reported that kaslr may randomly chooses some positions
which are located in movable memory regions. This will break memory
hotplug feature and make the movable memory chosen by KASLR can't be
removed.
***Solutions:
There should be a method to limit kaslr to choosing immovable memory
regions, so there are 2 solutions:
1) Add a kernel parameter to specify the memory regions.
2) Get the information of memory hot-remove, then kaslr will know the
right regions.
In method 2, information about memory hot-remove is in ACPI
tables, which will be parsed after start_kernel(), kaslr can't get
the information.
In method 1, users should know the regions address and specify in
kernel parameter.
In the earliest time, I tried to dig ACPI tabls to solve this problem.
But I didn't splite the code in 'compressed/' and ACPI code, so the patch
is hard to follow so refused by community.
Somebody suggest to add a kernel parameter to specify the
immovable memory so that limit kaslr in these regions. Then I make
a new patchset. After several versions, Ingo gave a suggestion:
https://www.mail-archive.com/[email protected]/msg1634024.html
Follow Ingo's suggestion, imitate the ACPI code to parse the acpi
tables, so that the kaslr can get necessary memory information in
ACPI tables.
I think ACPI code is an independent part, so copy the codes
and functions to 'compressed/' directory, so that kaslr won't
influence the initialization of ACPI.
PATCH 1/3 Add acpitb.c to provide functions to parse ACPI code.
PATCH 2/3 If CONFIG_MEMORY_HOTREMOVE enabled, walk all nodes and
store the information of immovable memory regions.
PATCH 3/3 According to the immovable memory regions, filter the
immovable regions which KASLR can choose.
***Test results:
- I did a very simple test, and it can get the memory information in
bios and efi KVM guest machine, and put it by early printk. But no
more tests, so it's with RFC tag.
v1->v2:
- Simplify some code.
Follow Baoquan He's suggestion:
- Reuse the head file of acpi code.
v2->v3:
- Test in more conditions, so remove the 'RFC' tag.
- Change some comments.
v3->v4:
Follow Thomas Gleixner's suggetsion:
- Put the whole efi related function into #define CONFIG_EFI and return
false in the other stub.
- Simplify two functions in head file.
v4->v5:
Follow Dou Liyang's suggestion:
- Add more comments about some functions based on kernel code.
- Change some typo in comments.
- Clean useless variable.
- Add check for the boundary of array.
- Add check for 'movable_node' parameter
v5->v6:
Follow Baoquan He's suggestion:
- Change some log.
- Add the check for acpi_rsdp
- Change some code logical to make code clear
v6->v7:
Follow Rafael's suggestion:
- Add more comments and patch log.
Follow test robot's suggestion:
- Add "static" tag for function
v7-v8:
Follow Kees Cook's suggestion:
- Use mem_overlaps() to check memory region.
- Use #ifdef in the definition of function.
Any comments will be welcome.
Chao Fan (3):
x86/boot: Add acpitb.c to parse acpi tables
x86/boot/KASLR: Walk srat tables to filter immovable memory
x86/boot/KASLR: Limit kaslr to choosing the immovable memory
arch/x86/boot/compressed/Makefile | 2 +
arch/x86/boot/compressed/acpitb.c | 405 ++++++++++++++++++++++++++++++
arch/x86/boot/compressed/kaslr.c | 75 +++++-
arch/x86/boot/compressed/misc.h | 8 +
4 files changed, 479 insertions(+), 11 deletions(-)
create mode 100644 arch/x86/boot/compressed/acpitb.c
--
2.17.1
If CONFIG_MEMORY_HOTREMOVE enabled, walk through the acpi srat memory
tables and store those immovable memory regions so that kaslr can get
where to choose for randomization.
Signed-off-by: Chao Fan <[email protected]>
---
arch/x86/boot/compressed/kaslr.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/arch/x86/boot/compressed/kaslr.c b/arch/x86/boot/compressed/kaslr.c
index 9ed9709d9947..573e582e8709 100644
--- a/arch/x86/boot/compressed/kaslr.c
+++ b/arch/x86/boot/compressed/kaslr.c
@@ -417,6 +417,9 @@ static void mem_avoid_init(unsigned long input, unsigned long input_size,
/* Mark the memmap regions we need to avoid */
handle_mem_options();
+ /* Mark the immovable regions we need to choose */
+ get_immovable_mem();
+
#ifdef CONFIG_X86_VERBOSE_BOOTUP
/* Make sure video RAM can be used. */
add_identity_map(0, PMD_SIZE);
--
2.17.1
If CONFIG_MEMORY_HOTREMOVE enabled and the amount of immovable
memory regions is not zero. Calculate the intersection between memory
regions from e820/efi memory table and immovable memory regions.
Signed-off-by: Chao Fan <[email protected]>
---
arch/x86/boot/compressed/kaslr.c | 72 +++++++++++++++++++++++++++-----
1 file changed, 61 insertions(+), 11 deletions(-)
diff --git a/arch/x86/boot/compressed/kaslr.c b/arch/x86/boot/compressed/kaslr.c
index 573e582e8709..61486aad39b6 100644
--- a/arch/x86/boot/compressed/kaslr.c
+++ b/arch/x86/boot/compressed/kaslr.c
@@ -101,6 +101,11 @@ static bool memmap_too_large;
/* Store memory limit specified by "mem=nn[KMG]" or "memmap=nn[KMG]" */
static unsigned long long mem_limit = ULLONG_MAX;
+#ifdef CONFIG_MEMORY_HOTREMOVE
+/* Store the immovable memory regions */
+extern struct mem_vector immovable_mem[MAX_NUMNODES*2];
+#endif
+
enum mem_avoid_index {
MEM_AVOID_ZO_RANGE = 0,
@@ -575,9 +580,9 @@ static unsigned long slots_fetch_random(void)
return 0;
}
-static void process_mem_region(struct mem_vector *entry,
- unsigned long minimum,
- unsigned long image_size)
+static void slots_count(struct mem_vector *entry,
+ unsigned long minimum,
+ unsigned long image_size)
{
struct mem_vector region, overlap;
unsigned long start_orig, end;
@@ -653,6 +658,57 @@ static void process_mem_region(struct mem_vector *entry,
}
}
+static bool process_mem_region(struct mem_vector *region,
+ unsigned long long minimum,
+ unsigned long long image_size)
+{
+ int i;
+ /*
+ * If no immovable memory found, or MEMORY_HOTREMOVE disabled,
+ * walk all the regions, so use region directely.
+ */
+ if (num_immovable_mem == 0) {
+ slots_count(region, minimum, image_size);
+
+ if (slot_area_index == MAX_SLOT_AREA) {
+ debug_putstr("Aborted e820/efi memmap scan (slot_areas full)!\n");
+ return 1;
+ }
+ return 0;
+ }
+
+#ifdef CONFIG_MEMORY_HOTREMOVE
+ /*
+ * If immovable memory found, filter the intersection between
+ * immovable memory and region to slots_count.
+ * Otherwise, go on old code.
+ */
+ for (i = 0; i < num_immovable_mem; i++) {
+ struct mem_vector entry;
+ unsigned long long start, end, entry_end, region_end;
+
+ if (!mem_overlaps(region, &immovable_mem[i]))
+ continue;
+
+ start = immovable_mem[i].start;
+ end = start + immovable_mem[i].size;
+ region_end = region->start + region->size;
+
+ entry.start = clamp(region->start, start, end);
+ entry_end = clamp(region_end, start, end);
+ entry.size = entry_end - entry.start;
+
+ slots_count(&entry, minimum, image_size);
+
+ if (slot_area_index == MAX_SLOT_AREA) {
+ debug_putstr("Aborted e820/efi memmap scan (slot_areas full)!\n");
+ return 1;
+ }
+ }
+ return 0;
+#endif
+}
+
#ifdef CONFIG_EFI
/*
* Returns true if mirror region found (and must have been processed
@@ -718,11 +774,8 @@ process_efi_entries(unsigned long minimum, unsigned long image_size)
region.start = md->phys_addr;
region.size = md->num_pages << EFI_PAGE_SHIFT;
- process_mem_region(®ion, minimum, image_size);
- if (slot_area_index == MAX_SLOT_AREA) {
- debug_putstr("Aborted EFI scan (slot_areas full)!\n");
+ if (process_mem_region(®ion, minimum, image_size))
break;
- }
}
return true;
}
@@ -749,11 +802,8 @@ static void process_e820_entries(unsigned long minimum,
continue;
region.start = entry->addr;
region.size = entry->size;
- process_mem_region(®ion, minimum, image_size);
- if (slot_area_index == MAX_SLOT_AREA) {
- debug_putstr("Aborted e820 scan (slot_areas full)!\n");
+ if (process_mem_region(®ion, minimum, image_size))
break;
- }
}
}
--
2.17.1
There is a bug that kaslr may randomly chooses some positions
which are located in movable memory regions. This will break memory
hotplug feature and make the movable memory chosen by KASLR can't be
removed. So dig SRAT table from ACPI tables to get memory information.
Imitate the ACPI code of parsing ACPI tables to dig and read ACPI
tables. Since some operations are not needed here, functions are
simplified. Functions will be used to dig only SRAT tables to get
information of memory, so that KASLR can the memory in immovable node.
And also, these functions won't influence the initialization of
ACPI after start_kernel().
Since use physical address directely, so acpi_os_map_memory()
and acpi_os_unmap_memory() are not needed.
Signed-off-by: Chao Fan <[email protected]>
---
arch/x86/boot/compressed/Makefile | 2 +
arch/x86/boot/compressed/acpitb.c | 405 ++++++++++++++++++++++++++++++
arch/x86/boot/compressed/misc.h | 8 +
3 files changed, 415 insertions(+)
create mode 100644 arch/x86/boot/compressed/acpitb.c
diff --git a/arch/x86/boot/compressed/Makefile b/arch/x86/boot/compressed/Makefile
index 28764dacf018..1609e4efcaed 100644
--- a/arch/x86/boot/compressed/Makefile
+++ b/arch/x86/boot/compressed/Makefile
@@ -83,6 +83,8 @@ ifdef CONFIG_X86_64
vmlinux-objs-y += $(obj)/pgtable_64.o
endif
+vmlinux-objs-$(CONFIG_RANDOMIZE_BASE) += $(obj)/acpitb.o
+
$(obj)/eboot.o: KBUILD_CFLAGS += -fshort-wchar -mno-red-zone
vmlinux-objs-$(CONFIG_EFI_STUB) += $(obj)/eboot.o $(obj)/efi_stub_$(BITS).o \
diff --git a/arch/x86/boot/compressed/acpitb.c b/arch/x86/boot/compressed/acpitb.c
new file mode 100644
index 000000000000..6b869e3f9780
--- /dev/null
+++ b/arch/x86/boot/compressed/acpitb.c
@@ -0,0 +1,405 @@
+// SPDX-License-Identifier: GPL-2.0
+#define BOOT_CTYPE_H
+#include "misc.h"
+#include "error.h"
+
+#include <linux/efi.h>
+#include <asm/efi.h>
+#include <linux/numa.h>
+#include <linux/acpi.h>
+
+extern unsigned long get_cmd_line_ptr(void);
+
+#define STATIC
+#include <linux/decompress/mm.h>
+
+#ifdef CONFIG_MEMORY_HOTREMOVE
+struct mem_vector {
+ unsigned long long start;
+ unsigned long long size;
+};
+/* Store the immovable memory regions */
+struct mem_vector immovable_mem[MAX_NUMNODES*2];
+#endif
+
+#ifdef CONFIG_EFI
+/* Search EFI table for rsdp table. */
+static bool efi_get_rsdp_addr(acpi_physical_address *rsdp_addr)
+{
+ efi_system_table_t *systab;
+ bool find_rsdp = false;
+ bool efi_64 = false;
+ void *config_tables;
+ struct efi_info *e;
+ char *sig;
+ int size;
+ int i;
+
+ e = &boot_params->efi_info;
+ sig = (char *)&e->efi_loader_signature;
+
+ if (!strncmp(sig, EFI64_LOADER_SIGNATURE, 4))
+ efi_64 = true;
+ else if (!strncmp(sig, EFI32_LOADER_SIGNATURE, 4))
+ efi_64 = false;
+ else {
+ debug_putstr("Wrong EFI loader signature.\n");
+ return false;
+ }
+
+ /* Get systab from boot params. Based on efi_init(). */
+#ifdef CONFIG_X86_32
+ if (e->efi_systab_hi || e->efi_memmap_hi) {
+ debug_putstr("Table located above 4GB, disabling EFI.\n");
+ return false;
+ }
+ systab = (efi_system_table_t *)e->efi_systab;
+#else
+ systab = (efi_system_table_t *)(
+ e->efi_systab | ((__u64)e->efi_systab_hi<<32));
+#endif
+
+ if (systab == NULL)
+ return false;
+
+ /*
+ * Get EFI tables from systab. Based on efi_config_init() and
+ * efi_config_parse_tables(). Only dig the config_table.
+ */
+ size = efi_64 ? sizeof(efi_config_table_64_t) :
+ sizeof(efi_config_table_32_t);
+
+ for (i = 0; i < systab->nr_tables; i++) {
+ efi_guid_t guid;
+ unsigned long table;
+
+ config_tables = (void *)(systab->tables + size * i);
+ if (efi_64) {
+ efi_config_table_64_t *tmp_table;
+
+ tmp_table = (efi_config_table_64_t *)config_tables;
+ guid = tmp_table->guid;
+ table = tmp_table->table;
+#ifndef CONFIG_64BIT
+ if (table >> 32) {
+ debug_putstr("Table located above 4G, disabling EFI.\n");
+ return false;
+ }
+#endif
+ } else {
+ efi_config_table_32_t *tmp_table;
+
+ tmp_table = (efi_config_table_32_t *)config_tables;
+ guid = tmp_table->guid;
+ table = tmp_table->table;
+ }
+
+ /*
+ * Get rsdp from EFI tables.
+ * If ACPI20 table found, use it and return true.
+ * If ACPI20 table not found, but ACPI table found,
+ * use the ACPI table and return true.
+ * If neither ACPI table nor ACPI20 table found,
+ * return false.
+ */
+ if (!(efi_guidcmp(guid, ACPI_TABLE_GUID))) {
+ *rsdp_addr = (acpi_physical_address)table;
+ find_rsdp = true;
+ } else if (!(efi_guidcmp(guid, ACPI_20_TABLE_GUID))) {
+ *rsdp_addr = (acpi_physical_address)table;
+ return true;
+ }
+ }
+ return find_rsdp;
+}
+#else
+static bool efi_get_rsdp_addr(acpi_physical_address *rsdp_addr)
+{
+ return false;
+}
+#endif
+
+static u8 checksum(u8 *buffer, u32 length)
+{
+ u8 sum = 0;
+ u8 *end = buffer + length;
+
+ while (buffer < end)
+ sum = (u8)(sum + *(buffer++));
+
+ return sum;
+}
+
+/*
+ * Used to search a block of memory for the RSDP signature.
+ * Return Pointer to the RSDP if found, otherwise NULL.
+ * Based on acpi_tb_scan_memory_for_rsdp().
+ */
+static u8 *scan_mem_for_rsdp(u8 *start_address, u32 length)
+{
+ struct acpi_table_rsdp *rsdp;
+ u8 *end_address;
+ u8 *mem_rover;
+
+ end_address = start_address + length;
+
+ /* Search from given start address for the requested length */
+ for (mem_rover = start_address; mem_rover < end_address;
+ mem_rover += ACPI_RSDP_SCAN_STEP) {
+ /*
+ * The RSDP signature and checksum must both be correct
+ * Note: Sometimes there exists more than one RSDP in memory;
+ * the valid RSDP has a valid checksum, all others have an
+ * invalid checksum.
+ */
+ rsdp = (struct acpi_table_rsdp *)mem_rover;
+
+ /* Nope, BAD Signature */
+ if (!ACPI_VALIDATE_RSDP_SIG(rsdp->signature))
+ continue;
+
+ /* Check the standard checksum */
+ if (checksum((u8 *) rsdp, ACPI_RSDP_CHECKSUM_LENGTH) != 0)
+ continue;
+
+ /* Check extended checksum if table version >= 2 */
+ if ((rsdp->revision >= 2) &&
+ (checksum((u8 *) rsdp, ACPI_RSDP_XCHECKSUM_LENGTH) != 0))
+ continue;
+
+ /* Sig and checksum valid, we have found a real RSDP */
+ return mem_rover;
+ }
+ return NULL;
+}
+
+/*
+ * Used to search RSDP physical address.
+ * Based on acpi_find_root_pointer(). Since only use physical address
+ * in this period, so there is no need to do the memory map jobs.
+ */
+static void bios_get_rsdp_addr(acpi_physical_address *rsdp_addr)
+{
+ struct acpi_table_rsdp *rsdp;
+ u8 *table_ptr;
+ u8 *mem_rover;
+ u32 address;
+
+ /*
+ * Get the location of the Extended BIOS Data Area (EBDA)
+ * Since we use physical address directely, so
+ * acpi_os_map_memory() and acpi_os_unmap_memory() are
+ * not needed here.
+ */
+ table_ptr = (u8 *)ACPI_EBDA_PTR_LOCATION;
+ *(u32 *)(void *)&address = *(u16 *)(void *)table_ptr;
+ address <<= 4;
+ table_ptr = (u8 *)address;
+
+ /*
+ * Search EBDA paragraphs (EBDA is required to be a minimum of
+ * 1K length)
+ */
+ if (address > 0x400) {
+ mem_rover = scan_mem_for_rsdp(table_ptr, ACPI_EBDA_WINDOW_SIZE);
+
+ if (mem_rover) {
+ address += (u32)ACPI_PTR_DIFF(mem_rover, table_ptr);
+ *rsdp_addr = (acpi_physical_address)address;
+ return;
+ }
+ }
+
+ table_ptr = (u8 *)ACPI_HI_RSDP_WINDOW_BASE;
+ mem_rover = scan_mem_for_rsdp(table_ptr, ACPI_HI_RSDP_WINDOW_SIZE);
+
+ /*
+ * Search upper memory: 16-byte boundaries in E0000h-FFFFFh
+ * Since we use physical address directely, so
+ * acpi_os_map_memory() and acpi_os_unmap_memory() are
+ * not needed here.
+ */
+ if (mem_rover) {
+ address = (u32)(ACPI_HI_RSDP_WINDOW_BASE +
+ ACPI_PTR_DIFF(mem_rover, table_ptr));
+ *rsdp_addr = (acpi_physical_address)address;
+ return;
+ }
+}
+
+#ifdef CONFIG_KEXEC
+static bool get_acpi_rsdp(acpi_physical_address *rsdp_addr)
+{
+ char *args = (char *)get_cmd_line_ptr();
+ size_t len = strlen((char *)args);
+ char *tmp_cmdline, *param, *val;
+ unsigned long long addr = 0;
+ char *endptr;
+
+ if (!strstr(args, "acpi_rsdp="))
+ return false;
+
+ tmp_cmdline = malloc(len+1);
+ if (!tmp_cmdline)
+ error("Failed to allocate space for tmp_cmdline");
+
+ memcpy(tmp_cmdline, args, len);
+ tmp_cmdline[len] = 0;
+ args = tmp_cmdline;
+
+ args = skip_spaces(args);
+
+ while (*args) {
+ args = next_arg(args, ¶m, &val);
+
+ if (!val && strcmp(param, "--") == 0) {
+ warn("Only '--' specified in cmdline");
+ free(tmp_cmdline);
+ return false;
+ }
+
+ if (!strcmp(param, "acpi_rsdp")) {
+ addr = simple_strtoull(val, &endptr, 0);
+
+ if (addr == 0)
+ return false;
+
+ *rsdp_addr = (acpi_physical_address)addr;
+ return true;
+ }
+ }
+ return false;
+}
+#else
+static bool get_acpi_rsdp(acpi_physical_address *rsdp_addr)
+{
+ return false;
+}
+#endif
+
+/*
+ * Used to dig rsdp table from EFI table or BIOS.
+ * If rsdp table found in EFI table, use it. Or search BIOS.
+ * Based on acpi_os_get_root_pointer().
+ */
+static acpi_physical_address get_rsdp_addr(void)
+{
+ acpi_physical_address pa = 0;
+ bool status = false;
+
+ status = get_acpi_rsdp(&pa);
+
+ if (!status || pa == 0)
+ status = efi_get_rsdp_addr(&pa);
+
+ if (!status || pa == 0)
+ bios_get_rsdp_addr(&pa);
+
+ return pa;
+}
+
+static struct acpi_table_header *get_acpi_srat_table(void)
+{
+ char *args = (char *)get_cmd_line_ptr();
+ acpi_physical_address acpi_table;
+ acpi_physical_address root_table;
+ struct acpi_table_header *header;
+ struct acpi_table_rsdp *rsdp;
+ char *signature;
+ u8 *entry;
+ u32 count;
+ u32 size;
+ int i, j;
+ u32 len;
+
+ rsdp = (struct acpi_table_rsdp *)get_rsdp_addr();
+ if (!rsdp)
+ return NULL;
+
+ /* Get rsdt or xsdt from rsdp. */
+ if (!strstr(args, "acpi=rsdt") &&
+ rsdp->xsdt_physical_address && rsdp->revision > 1) {
+ root_table = rsdp->xsdt_physical_address;
+ size = ACPI_XSDT_ENTRY_SIZE;
+ } else {
+ root_table = rsdp->rsdt_physical_address;
+ size = ACPI_RSDT_ENTRY_SIZE;
+ }
+
+ /* Get ACPI root table from rsdt or xsdt.*/
+ header = (struct acpi_table_header *)root_table;
+ len = header->length;
+ count = (u32)((len - sizeof(struct acpi_table_header)) / size);
+ entry = ACPI_ADD_PTR(u8, header, sizeof(struct acpi_table_header));
+
+ for (i = 0; i < count; i++) {
+ u64 address64;
+
+ if (size == ACPI_RSDT_ENTRY_SIZE)
+ acpi_table = ((acpi_physical_address)
+ (*ACPI_CAST_PTR(u32, entry)));
+ else {
+ *(u64 *)(void *)&address64 = *(u64 *)(void *)entry;
+ acpi_table = (acpi_physical_address) address64;
+ }
+
+ if (acpi_table) {
+ header = (struct acpi_table_header *)acpi_table;
+ signature = header->signature;
+
+ if (!strncmp(signature, "SRAT", 4))
+ return header;
+ }
+ entry += size;
+ }
+ return NULL;
+}
+
+#ifdef CONFIG_MEMORY_HOTREMOVE
+/*
+ * According to ACPI table, filter the immvoable memory regions
+ * and store them in immovable_mem[].
+ */
+void get_immovable_mem(void)
+{
+ char *args = (char *)get_cmd_line_ptr();
+ struct acpi_table_header *table_header;
+ struct acpi_subtable_header *table;
+ struct acpi_srat_mem_affinity *ma;
+ unsigned long table_end;
+ int i = 0;
+
+ if (!strstr(args, "movable_node") || strstr(args, "acpi=off"))
+ return;
+
+ table_header = get_acpi_srat_table();
+ if (!table_header)
+ return;
+
+ table_end = (unsigned long)table_header + table_header->length;
+
+ table = (struct acpi_subtable_header *)
+ ((unsigned long)table_header + sizeof(struct acpi_table_srat));
+
+ while (((unsigned long)table) + table->length < table_end) {
+ if (table->type == 1) {
+ ma = (struct acpi_srat_mem_affinity *)table;
+ if (!(ma->flags & ACPI_SRAT_MEM_HOT_PLUGGABLE)) {
+ immovable_mem[i].start = ma->base_address;
+ immovable_mem[i].size = ma->length;
+ i++;
+ }
+
+ if (i >= MAX_NUMNODES*2)
+ break;
+ }
+ table = (struct acpi_subtable_header *)
+ ((unsigned long)table + table->length);
+ }
+ num_immovable_mem = i;
+}
+#else
+void get_immovable_mem(void)
+{
+}
+#endif
diff --git a/arch/x86/boot/compressed/misc.h b/arch/x86/boot/compressed/misc.h
index a1d5918765f3..60337355d113 100644
--- a/arch/x86/boot/compressed/misc.h
+++ b/arch/x86/boot/compressed/misc.h
@@ -116,3 +116,11 @@ static inline void console_init(void)
void set_sev_encryption_mask(void);
#endif
+
+/* acpitb.c */
+/* Store the amount of immovable memory regions */
+int num_immovable_mem;
+void get_immovable_mem(void);
+#ifdef CONFIG_MEMORY_HOTREMOVE
+#define ACPI_MAX_TABLES 128
+#endif
--
2.17.1
On Wed, Oct 10, 2018 at 04:41:16PM +0800, Chao Fan wrote:
> ***Background:
> People reported that kaslr may randomly chooses some positions
> which are located in movable memory regions. This will break memory
> hotplug feature and make the movable memory chosen by KASLR can't be
> removed.
>
> ***Solutions:
> There should be a method to limit kaslr to choosing immovable memory
> regions, so there are 2 solutions:
> 1) Add a kernel parameter to specify the memory regions.
> 2) Get the information of memory hot-remove, then kaslr will know the
> right regions.
> In method 2, information about memory hot-remove is in ACPI
> tables, which will be parsed after start_kernel(), kaslr can't get
> the information.
> In method 1, users should know the regions address and specify in
> kernel parameter.
>
> In the earliest time, I tried to dig ACPI tabls to solve this problem.
> But I didn't splite the code in 'compressed/' and ACPI code, so the patch
> is hard to follow so refused by community.
> Somebody suggest to add a kernel parameter to specify the
> immovable memory so that limit kaslr in these regions. Then I make
> a new patchset. After several versions, Ingo gave a suggestion:
> https://www.mail-archive.com/[email protected]/msg1634024.html
> Follow Ingo's suggestion, imitate the ACPI code to parse the acpi
> tables, so that the kaslr can get necessary memory information in
> ACPI tables.
> I think ACPI code is an independent part, so copy the codes
> and functions to 'compressed/' directory, so that kaslr won't
> influence the initialization of ACPI.
... and we just picked up
https://lkml.kernel.org/r/[email protected]
and without having looked at the rest of your stuff, if people accept
your solution, we don't need the silly parameter anymore, right?
Which means, we should not rush the whole thing yet until the whole
KASLR vs movable memory gets solved properly.
Ingo, Thomas?
--
Regards/Gruss,
Boris.
Good mailing practices for 400: avoid top-posting and trim the reply.
Hi Boris,
On 10/10/18 at 10:59am, Borislav Petkov wrote:
> ... and we just picked up
>
> https://lkml.kernel.org/r/[email protected]
>
> and without having looked at the rest of your stuff, if people accept
> your solution, we don't need the silly parameter anymore, right?
>
> Which means, we should not rush the whole thing yet until the whole
> KASLR vs movable memory gets solved properly.
Masa's patches solves the problem in memory region KASLR which later hot
added memory may be big than the default padding 10 TB.
Chao's patches is trying to fix a conflict between 'movable_node' and
kernel text KASLR. If 'movable_node' specified, we rely on SRAT to get
which memory region is movable or immovable, and movable region can be
hot removed. But if kernel is randomized into movable memory, it can't
be hot removed any more, this is a regression after KASLR introduced.
So this is a different issue than Masa's.
Thanks
Baoquan
On Wed, Oct 10, 2018 at 05:06:20PM +0800, Baoquan He wrote:
>Hi Boris,
>
>On 10/10/18 at 10:59am, Borislav Petkov wrote:
>> ... and we just picked up
>>
>> https://lkml.kernel.org/r/[email protected]
>>
>> and without having looked at the rest of your stuff, if people accept
>> your solution, we don't need the silly parameter anymore, right?
>>
>> Which means, we should not rush the whole thing yet until the whole
>> KASLR vs movable memory gets solved properly.
>
>Masa's patches solves the problem in memory region KASLR which later hot
>added memory may be big than the default padding 10 TB.
>
>Chao's patches is trying to fix a conflict between 'movable_node' and
>kernel text KASLR. If 'movable_node' specified, we rely on SRAT to get
>which memory region is movable or immovable, and movable region can be
>hot removed. But if kernel is randomized into movable memory, it can't
>be hot removed any more, this is a regression after KASLR introduced.
>So this is a different issue than Masa's.
Yes, they are two issues.
But if we can get more memory information by the function in
the new file acpi.c, semms it's helfpul to Masa's issue.
Thanks,
Chao Fan
>
>Thanks
>Baoquan
>
>
On Wed, 10 Oct 2018, Baoquan He wrote:
> Hi Boris,
>
> On 10/10/18 at 10:59am, Borislav Petkov wrote:
> > ... and we just picked up
> >
> > https://lkml.kernel.org/r/[email protected]
> >
> > and without having looked at the rest of your stuff, if people accept
> > your solution, we don't need the silly parameter anymore, right?
> >
> > Which means, we should not rush the whole thing yet until the whole
> > KASLR vs movable memory gets solved properly.
>
> Masa's patches solves the problem in memory region KASLR which later hot
> added memory may be big than the default padding 10 TB.
>
> Chao's patches is trying to fix a conflict between 'movable_node' and
> kernel text KASLR. If 'movable_node' specified, we rely on SRAT to get
> which memory region is movable or immovable, and movable region can be
> hot removed. But if kernel is randomized into movable memory, it can't
> be hot removed any more, this is a regression after KASLR introduced.
> So this is a different issue than Masa's.
Yes, it's different, but if the SRAT information is available early, then
the command line parameter can go away because then the required
information for Masa's problem is available as well.
Thanks,
tglx
On Wed, Oct 10, 2018 at 11:14:26AM +0200, Thomas Gleixner wrote:
> Yes, it's different, but if the SRAT information is available early, then
> the command line parameter can go away because then the required
> information for Masa's problem is available as well.
Exactly. And I'd prefer we delayed the command line parameter until we
figure out we really need it and not expose it to upstream and then
remove it shortly after.
So I'd suggest we move Masa's patches to a separate branch and not send
it up this round.
--
Regards/Gruss,
Boris.
Good mailing practices for 400: avoid top-posting and trim the reply.
On 10/10/18 at 05:12pm, Chao Fan wrote:
> On Wed, Oct 10, 2018 at 05:06:20PM +0800, Baoquan He wrote:
> >Hi Boris,
> >
> >On 10/10/18 at 10:59am, Borislav Petkov wrote:
> >> ... and we just picked up
> >>
> >> https://lkml.kernel.org/r/[email protected]
> >>
> >> and without having looked at the rest of your stuff, if people accept
> >> your solution, we don't need the silly parameter anymore, right?
> >>
> >> Which means, we should not rush the whole thing yet until the whole
> >> KASLR vs movable memory gets solved properly.
> >
> >Masa's patches solves the problem in memory region KASLR which later hot
> >added memory may be big than the default padding 10 TB.
> >
> >Chao's patches is trying to fix a conflict between 'movable_node' and
> >kernel text KASLR. If 'movable_node' specified, we rely on SRAT to get
> >which memory region is movable or immovable, and movable region can be
> >hot removed. But if kernel is randomized into movable memory, it can't
> >be hot removed any more, this is a regression after KASLR introduced.
> >So this is a different issue than Masa's.
>
> Yes, they are two issues.
> But if we can get more memory information by the function in
> the new file acpi.c, semms it's helfpul to Masa's issue.
Hmm, reading SRAT three times during x86 kernel boot? Maybe we try this
after the function has run a time and proved very stable?
Thanks
Baoquan
On 10/10/18 at 11:19am, Borislav Petkov wrote:
> On Wed, Oct 10, 2018 at 11:14:26AM +0200, Thomas Gleixner wrote:
> > Yes, it's different, but if the SRAT information is available early, then
> > the command line parameter can go away because then the required
> > information for Masa's problem is available as well.
>
> Exactly. And I'd prefer we delayed the command line parameter until we
> figure out we really need it and not expose it to upstream and then
> remove it shortly after.
>
> So I'd suggest we move Masa's patches to a separate branch and not send
> it up this round.
Yes, sounds more reasonable if we can reuse functions in Chao's patch 1/3
to solve the padding issue.
Thanks
Baoquan
On Wed, Oct 10, 2018 at 04:41:16PM +0800, Chao Fan wrote:
> In the earliest time, I tried to dig ACPI tabls to solve this problem.
> But I didn't splite the code in 'compressed/' and ACPI code, so the patch
> is hard to follow so refused by community.
> Somebody suggest to add a kernel parameter to specify the
> immovable memory so that limit kaslr in these regions. Then I make
> a new patchset. After several versions, Ingo gave a suggestion:
> https://www.mail-archive.com/[email protected]/msg1634024.html
> Follow Ingo's suggestion, imitate the ACPI code to parse the acpi
> tables, so that the kaslr can get necessary memory information in
> ACPI tables.
> I think ACPI code is an independent part, so copy the codes
> and functions to 'compressed/' directory, so that kaslr won't
> influence the initialization of ACPI.
You say "copy". I'm still about to look at the code but can those
functions be carved out in a separate compilation unit which ACPI *and*
KASLR can both link with so that there's no duplication?
--
Regards/Gruss,
Boris.
Good mailing practices for 400: avoid top-posting and trim the reply.
On Wed, Oct 10, 2018 at 05:30:57PM +0800, Baoquan He wrote:
> On 10/10/18 at 11:19am, Borislav Petkov wrote:
> > On Wed, Oct 10, 2018 at 11:14:26AM +0200, Thomas Gleixner wrote:
> > > Yes, it's different, but if the SRAT information is available early, then
> > > the command line parameter can go away because then the required
> > > information for Masa's problem is available as well.
> >
> > Exactly. And I'd prefer we delayed the command line parameter until we
> > figure out we really need it and not expose it to upstream and then
> > remove it shortly after.
> >
> > So I'd suggest we move Masa's patches to a separate branch and not send
> > it up this round.
>
> Yes, sounds more reasonable if we can reuse functions in Chao's patch 1/3
> to solve the padding issue.
Thanks for your comments! Yes, immovable_mem[num_immovable_mem] in Chao's
patch may be useful for calculating the padding size. If so, we don't
need the new kernel parameter. It's nice!
Do you happen to have ideas how we access immovable_mem[num_immovable_mem]
from arch/x86/mm/kaslr.c ? It is located to arch/x86/boot/compressed/*, so
I suppose it is not easy to access it...
I would appreciate if you could give some advice.
Thanks!
Masa
On 10/10/18 at 03:44pm, Masayoshi Mizuma wrote:
> On Wed, Oct 10, 2018 at 05:30:57PM +0800, Baoquan He wrote:
> > On 10/10/18 at 11:19am, Borislav Petkov wrote:
> > > On Wed, Oct 10, 2018 at 11:14:26AM +0200, Thomas Gleixner wrote:
> > > > Yes, it's different, but if the SRAT information is available early, then
> > > > the command line parameter can go away because then the required
> > > > information for Masa's problem is available as well.
> > >
> > > Exactly. And I'd prefer we delayed the command line parameter until we
> > > figure out we really need it and not expose it to upstream and then
> > > remove it shortly after.
> > >
> > > So I'd suggest we move Masa's patches to a separate branch and not send
> > > it up this round.
> >
> > Yes, sounds more reasonable if we can reuse functions in Chao's patch 1/3
> > to solve the padding issue.
>
> Thanks for your comments! Yes, immovable_mem[num_immovable_mem] in Chao's
> patch may be useful for calculating the padding size. If so, we don't
> need the new kernel parameter. It's nice!
>
> Do you happen to have ideas how we access immovable_mem[num_immovable_mem]
> from arch/x86/mm/kaslr.c ? It is located to arch/x86/boot/compressed/*, so
> I suppose it is not easy to access it...
> I would appreciate if you could give some advice.
Hmm, they are living in different life cycle and space. So we can only
reuse the code from Chao's patch, but not the variable. Means we need
go through the SRAT accessing again in arch/x86/mm/kaslr.c and fill
immovable_mem[num_immovable_mem] for mm/kaslr.c usage, if we decide to
do like that.
Thanks
Baoquan
On Wed, Oct 10, 2018 at 07:16:16PM +0200, Borislav Petkov wrote:
>On Wed, Oct 10, 2018 at 04:41:16PM +0800, Chao Fan wrote:
>> In the earliest time, I tried to dig ACPI tabls to solve this problem.
>> But I didn't splite the code in 'compressed/' and ACPI code, so the patch
>> is hard to follow so refused by community.
>> Somebody suggest to add a kernel parameter to specify the
>> immovable memory so that limit kaslr in these regions. Then I make
>> a new patchset. After several versions, Ingo gave a suggestion:
>> https://www.mail-archive.com/[email protected]/msg1634024.html
>> Follow Ingo's suggestion, imitate the ACPI code to parse the acpi
>> tables, so that the kaslr can get necessary memory information in
>> ACPI tables.
>> I think ACPI code is an independent part, so copy the codes
>> and functions to 'compressed/' directory, so that kaslr won't
>> influence the initialization of ACPI.
>
>You say "copy". I'm still about to look at the code but can those
>functions be carved out in a separate compilation unit which ACPI *and*
>KASLR can both link with so that there's no duplication?
Sorry for my poor English, I used 'copy' but they are not same.
Maybe 'imitate' is better.
Just like I said in my log, The ACPI part need to handle the map
between physical address and virtual address.
But in KASLR part, I remove these operations.
So my code is simplified version.
Thanks,
Chao Fan
>
>--
>Regards/Gruss,
> Boris.
>
>Good mailing practices for 400: avoid top-posting and trim the reply.
>
>
On Thu, Oct 11, 2018 at 08:29:55AM +0800, Baoquan He wrote:
>On 10/10/18 at 03:44pm, Masayoshi Mizuma wrote:
>> On Wed, Oct 10, 2018 at 05:30:57PM +0800, Baoquan He wrote:
>> > On 10/10/18 at 11:19am, Borislav Petkov wrote:
>> > > On Wed, Oct 10, 2018 at 11:14:26AM +0200, Thomas Gleixner wrote:
>> > > > Yes, it's different, but if the SRAT information is available early, then
>> > > > the command line parameter can go away because then the required
>> > > > information for Masa's problem is available as well.
>> > >
>> > > Exactly. And I'd prefer we delayed the command line parameter until we
>> > > figure out we really need it and not expose it to upstream and then
>> > > remove it shortly after.
>> > >
>> > > So I'd suggest we move Masa's patches to a separate branch and not send
>> > > it up this round.
>> >
>> > Yes, sounds more reasonable if we can reuse functions in Chao's patch 1/3
>> > to solve the padding issue.
>>
>> Thanks for your comments! Yes, immovable_mem[num_immovable_mem] in Chao's
>> patch may be useful for calculating the padding size. If so, we don't
>> need the new kernel parameter. It's nice!
>>
>> Do you happen to have ideas how we access immovable_mem[num_immovable_mem]
>> from arch/x86/mm/kaslr.c ? It is located to arch/x86/boot/compressed/*, so
>> I suppose it is not easy to access it...
>> I would appreciate if you could give some advice.
>
>Hmm, they are living in different life cycle and space. So we can only
>reuse the code from Chao's patch, but not the variable. Means we need
>go through the SRAT accessing again in arch/x86/mm/kaslr.c and fill
>immovable_mem[num_immovable_mem] for mm/kaslr.c usage, if we decide to
>do like that.
Reading three times is redundant, but reading two times is needed.
Becasue the ACPI code run very stable, but we need more information
before that.
As for Masa's issue, I am wondering whether we can tranfer the
information or only the address of SRAT table from compressed period
to the period after start_kernel().
Thanks,
Chao Fan
>
>Thanks
>Baoquan
>
>
On Wed, Oct 10, 2018 at 04:41:17PM +0800, Chao Fan wrote:
> There is a bug that kaslr may randomly chooses some positions
> which are located in movable memory regions. This will break memory
> hotplug feature and make the movable memory chosen by KASLR can't be
> removed. So dig SRAT table from ACPI tables to get memory information.
>
> Imitate the ACPI code of parsing ACPI tables to dig and read ACPI
> tables. Since some operations are not needed here, functions are
> simplified. Functions will be used to dig only SRAT tables to get
> information of memory, so that KASLR can the memory in immovable node.
>
> And also, these functions won't influence the initialization of
> ACPI after start_kernel().
>
> Since use physical address directely, so acpi_os_map_memory()
> and acpi_os_unmap_memory() are not needed.
>
> Signed-off-by: Chao Fan <[email protected]>
> ---
> arch/x86/boot/compressed/Makefile | 2 +
> arch/x86/boot/compressed/acpitb.c | 405 ++++++++++++++++++++++++++++++
> arch/x86/boot/compressed/misc.h | 8 +
> 3 files changed, 415 insertions(+)
> create mode 100644 arch/x86/boot/compressed/acpitb.c
>
> diff --git a/arch/x86/boot/compressed/Makefile b/arch/x86/boot/compressed/Makefile
> index 28764dacf018..1609e4efcaed 100644
> --- a/arch/x86/boot/compressed/Makefile
> +++ b/arch/x86/boot/compressed/Makefile
> @@ -83,6 +83,8 @@ ifdef CONFIG_X86_64
> vmlinux-objs-y += $(obj)/pgtable_64.o
> endif
>
> +vmlinux-objs-$(CONFIG_RANDOMIZE_BASE) += $(obj)/acpitb.o
This should be CONFIG_MEMORY_HOTREMOVE *and* CONFIG_RANDOMIZE_BASE.
Otherwise we don't need all that code.
> $(obj)/eboot.o: KBUILD_CFLAGS += -fshort-wchar -mno-red-zone
>
> vmlinux-objs-$(CONFIG_EFI_STUB) += $(obj)/eboot.o $(obj)/efi_stub_$(BITS).o \
> diff --git a/arch/x86/boot/compressed/acpitb.c b/arch/x86/boot/compressed/acpitb.c
> new file mode 100644
> index 000000000000..6b869e3f9780
> --- /dev/null
> +++ b/arch/x86/boot/compressed/acpitb.c
> @@ -0,0 +1,405 @@
> +// SPDX-License-Identifier: GPL-2.0
> +#define BOOT_CTYPE_H
> +#include "misc.h"
> +#include "error.h"
> +
> +#include <linux/efi.h>
> +#include <asm/efi.h>
> +#include <linux/numa.h>
> +#include <linux/acpi.h>
> +
> +extern unsigned long get_cmd_line_ptr(void);
> +
> +#define STATIC
> +#include <linux/decompress/mm.h>
> +
> +#ifdef CONFIG_MEMORY_HOTREMOVE
> +struct mem_vector {
> + unsigned long long start;
> + unsigned long long size;
> +};
> +/* Store the immovable memory regions */
> +struct mem_vector immovable_mem[MAX_NUMNODES*2];
> +#endif
> +
> +#ifdef CONFIG_EFI
> +/* Search EFI table for rsdp table. */
> +static bool efi_get_rsdp_addr(acpi_physical_address *rsdp_addr)
> +{
> + efi_system_table_t *systab;
> + bool find_rsdp = false;
> + bool efi_64 = false;
> + void *config_tables;
> + struct efi_info *e;
> + char *sig;
> + int size;
> + int i;
> +
> + e = &boot_params->efi_info;
> + sig = (char *)&e->efi_loader_signature;
> +
> + if (!strncmp(sig, EFI64_LOADER_SIGNATURE, 4))
> + efi_64 = true;
> + else if (!strncmp(sig, EFI32_LOADER_SIGNATURE, 4))
> + efi_64 = false;
> + else {
> + debug_putstr("Wrong EFI loader signature.\n");
> + return false;
> + }
> +
> + /* Get systab from boot params. Based on efi_init(). */
> +#ifdef CONFIG_X86_32
Why the efi_64 detection above but the ifdeffery here? Why not test
efi_64 instead?
> + if (e->efi_systab_hi || e->efi_memmap_hi) {
> + debug_putstr("Table located above 4GB, disabling EFI.\n");
Are you disabling EFI? Where?
Ah, I see, this code is copied from arch/x86/platform/efi/efi.c.
So when copying, fix the user-visible strings too.
> + return false;
> + }
> + systab = (efi_system_table_t *)e->efi_systab;
> +#else
> + systab = (efi_system_table_t *)(
> + e->efi_systab | ((__u64)e->efi_systab_hi<<32));
> +#endif
> +
> + if (systab == NULL)
if (!systab)
Fix all other occurrences.
> + return false;
> +
> + /*
> + * Get EFI tables from systab. Based on efi_config_init() and
> + * efi_config_parse_tables(). Only dig the config_table.
dig out
> + */
> + size = efi_64 ? sizeof(efi_config_table_64_t) :
> + sizeof(efi_config_table_32_t);
> +
> + for (i = 0; i < systab->nr_tables; i++) {
> + efi_guid_t guid;
> + unsigned long table;
> +
> + config_tables = (void *)(systab->tables + size * i);
> + if (efi_64) {
> + efi_config_table_64_t *tmp_table;
> +
> + tmp_table = (efi_config_table_64_t *)config_tables;
> + guid = tmp_table->guid;
> + table = tmp_table->table;
> +#ifndef CONFIG_64BIT
> + if (table >> 32) {
> + debug_putstr("Table located above 4G, disabling EFI.\n");
Fix that.
> + return false;
> + }
> +#endif
> + } else {
> + efi_config_table_32_t *tmp_table;
> +
> + tmp_table = (efi_config_table_32_t *)config_tables;
> + guid = tmp_table->guid;
> + table = tmp_table->table;
> + }
> +
> + /*
> + * Get rsdp from EFI tables.
> + * If ACPI20 table found, use it and return true.
You don't have to say "return true" in the comment - that is in the code
already.
Also, this function - just like get_acpi_rsdp() doesn't need to return
bool but the pa directly.
> + * If ACPI20 table not found, but ACPI table found,
> + * use the ACPI table and return true.
> + * If neither ACPI table nor ACPI20 table found,
> + * return false.
> + */
> + if (!(efi_guidcmp(guid, ACPI_TABLE_GUID))) {
> + *rsdp_addr = (acpi_physical_address)table;
> + find_rsdp = true;
> + } else if (!(efi_guidcmp(guid, ACPI_20_TABLE_GUID))) {
> + *rsdp_addr = (acpi_physical_address)table;
> + return true;
> + }
> + }
> + return find_rsdp;
> +}
> +#else
> +static bool efi_get_rsdp_addr(acpi_physical_address *rsdp_addr)
> +{
> + return false;
> +}
> +#endif
Instead of doing this, move the ifdef inside the function:
static bool efi_get_rsdp_addr(acpi_physical_address *rsdp_addr)
{
#ifdef CONFIG_EFI
/* function body */
#endif
}
> +
> +static u8 checksum(u8 *buffer, u32 length)
compute_checksum(...)
> +{
> + u8 sum = 0;
> + u8 *end = buffer + length;
> +
> + while (buffer < end)
> + sum = (u8)(sum + *(buffer++));
> +
> + return sum;
> +}
> +
> +/*
> + * Used to search a block of memory for the RSDP signature.
> + * Return Pointer to the RSDP if found, otherwise NULL.
> + * Based on acpi_tb_scan_memory_for_rsdp().
> + */
> +static u8 *scan_mem_for_rsdp(u8 *start_address, u32 length)
> +{
> + struct acpi_table_rsdp *rsdp;
> + u8 *end_address;
> + u8 *mem_rover;
> +
> + end_address = start_address + length;
> +
> + /* Search from given start address for the requested length */
> + for (mem_rover = start_address; mem_rover < end_address;
> + mem_rover += ACPI_RSDP_SCAN_STEP) {
Shorten those variable names so that the loop fits on one line.
> + /*
> + * The RSDP signature and checksum must both be correct
> + * Note: Sometimes there exists more than one RSDP in memory;
> + * the valid RSDP has a valid checksum, all others have an
> + * invalid checksum.
> + */
> + rsdp = (struct acpi_table_rsdp *)mem_rover;
> +
> + /* Nope, BAD Signature */
> + if (!ACPI_VALIDATE_RSDP_SIG(rsdp->signature))
> + continue;
> +
> + /* Check the standard checksum */
> + if (checksum((u8 *) rsdp, ACPI_RSDP_CHECKSUM_LENGTH) != 0)
No need for "!= 0" at the end. Fix all other tests too.
> + continue;
> +
> + /* Check extended checksum if table version >= 2 */
> + if ((rsdp->revision >= 2) &&
> + (checksum((u8 *) rsdp, ACPI_RSDP_XCHECKSUM_LENGTH) != 0))
> + continue;
> +
> + /* Sig and checksum valid, we have found a real RSDP */
> + return mem_rover;
> + }
> + return NULL;
> +}
> +
> +/*
> + * Used to search RSDP physical address.
> + * Based on acpi_find_root_pointer(). Since only use physical address
> + * in this period, so there is no need to do the memory map jobs.
> + */
> +static void bios_get_rsdp_addr(acpi_physical_address *rsdp_addr)
> +{
> + struct acpi_table_rsdp *rsdp;
> + u8 *table_ptr;
> + u8 *mem_rover;
> + u32 address;
> +
> + /*
> + * Get the location of the Extended BIOS Data Area (EBDA)
> + * Since we use physical address directely, so
> + * acpi_os_map_memory() and acpi_os_unmap_memory() are
> + * not needed here.
> + */
> + table_ptr = (u8 *)ACPI_EBDA_PTR_LOCATION;
> + *(u32 *)(void *)&address = *(u16 *)(void *)table_ptr;
> + address <<= 4;
> + table_ptr = (u8 *)address;
> +
> + /*
> + * Search EBDA paragraphs (EBDA is required to be a minimum of
> + * 1K length)
> + */
> + if (address > 0x400) {
> + mem_rover = scan_mem_for_rsdp(table_ptr, ACPI_EBDA_WINDOW_SIZE);
> +
> + if (mem_rover) {
> + address += (u32)ACPI_PTR_DIFF(mem_rover, table_ptr);
> + *rsdp_addr = (acpi_physical_address)address;
> + return;
> + }
> + }
> +
> + table_ptr = (u8 *)ACPI_HI_RSDP_WINDOW_BASE;
> + mem_rover = scan_mem_for_rsdp(table_ptr, ACPI_HI_RSDP_WINDOW_SIZE);
> +
> + /*
> + * Search upper memory: 16-byte boundaries in E0000h-FFFFFh
> + * Since we use physical address directely, so
> + * acpi_os_map_memory() and acpi_os_unmap_memory() are
> + * not needed here.
> + */
> + if (mem_rover) {
> + address = (u32)(ACPI_HI_RSDP_WINDOW_BASE +
> + ACPI_PTR_DIFF(mem_rover, table_ptr));
> + *rsdp_addr = (acpi_physical_address)address;
> + return;
We will return anyway, without that statement. :)
> + }
> +}
> +
> +#ifdef CONFIG_KEXEC
> +static bool get_acpi_rsdp(acpi_physical_address *rsdp_addr)
> +{
> + char *args = (char *)get_cmd_line_ptr();
> + size_t len = strlen((char *)args);
> + char *tmp_cmdline, *param, *val;
> + unsigned long long addr = 0;
> + char *endptr;
> +
> + if (!strstr(args, "acpi_rsdp="))
> + return false;
> +
> + tmp_cmdline = malloc(len+1);
> + if (!tmp_cmdline)
> + error("Failed to allocate space for tmp_cmdline");
Why do you even need to allocate a tmp cmdline?
Ah, I see what you've done - you've copied handle_mem_options() in
kaslr.c. Well no, not really.
That functionality needs to get extracted into a separate facility. Oh
look, there's arch/x86/boot/compressed/cmdline.c which is begging to get
extended.
:-)
> +
> + memcpy(tmp_cmdline, args, len);
> + tmp_cmdline[len] = 0;
> + args = tmp_cmdline;
> +
> + args = skip_spaces(args);
> +
> + while (*args) {
> + args = next_arg(args, ¶m, &val);
> +
> + if (!val && strcmp(param, "--") == 0) {
> + warn("Only '--' specified in cmdline");
> + free(tmp_cmdline);
> + return false;
> + }
> +
> + if (!strcmp(param, "acpi_rsdp")) {
> + addr = simple_strtoull(val, &endptr, 0);
WARNING: simple_strtoull is obsolete, use kstrtoull instead
#321: FILE: arch/x86/boot/compressed/acpitb.c:262:
+ addr = simple_strtoull(val, &endptr, 0);
Please integrate scripts/checkpatch.pl into your patch creation
workflow. Some of the warnings/errors *actually* make sense.
> +
> + if (addr == 0)
> + return false;
> +
> + *rsdp_addr = (acpi_physical_address)addr;
> + return true;
> + }
> + }
> + return false;
> +}
> +#else
> +static bool get_acpi_rsdp(acpi_physical_address *rsdp_addr)
> +{
> + return false;
> +}
> +#endif
> +
> +/*
> + * Used to dig rsdp table from EFI table or BIOS.
Write "rsdp" in all caps in all comments.
> + * If rsdp table found in EFI table, use it. Or search BIOS.
> + * Based on acpi_os_get_root_pointer().
> + */
> +static acpi_physical_address get_rsdp_addr(void)
> +{
> + acpi_physical_address pa = 0;
> + bool status = false;
> +
> + status = get_acpi_rsdp(&pa);
Why does this function return bool if pa == 0 is already an invalid
address. You don't need the initialization to 0 above either.
> +
> + if (!status || pa == 0)
if (!status || !pa)
Fix all other tests.
> + status = efi_get_rsdp_addr(&pa);
> +
> + if (!status || pa == 0)
> + bios_get_rsdp_addr(&pa);
> +
> + return pa;
> +}
> +
> +static struct acpi_table_header *get_acpi_srat_table(void)
> +{
> + char *args = (char *)get_cmd_line_ptr();
> + acpi_physical_address acpi_table;
> + acpi_physical_address root_table;
> + struct acpi_table_header *header;
> + struct acpi_table_rsdp *rsdp;
> + char *signature;
> + u8 *entry;
> + u32 count;
> + u32 size;
> + int i, j;
> + u32 len;
> +
> + rsdp = (struct acpi_table_rsdp *)get_rsdp_addr();
> + if (!rsdp)
> + return NULL;
> +
> + /* Get rsdt or xsdt from rsdp. */
> + if (!strstr(args, "acpi=rsdt") &&
> + rsdp->xsdt_physical_address && rsdp->revision > 1) {
> + root_table = rsdp->xsdt_physical_address;
> + size = ACPI_XSDT_ENTRY_SIZE;
> + } else {
> + root_table = rsdp->rsdt_physical_address;
> + size = ACPI_RSDT_ENTRY_SIZE;
> + }
Please rework the cmdline parsing so that the functions can call helpers
only.
> +
> + /* Get ACPI root table from rsdt or xsdt.*/
> + header = (struct acpi_table_header *)root_table;
> + len = header->length;
> + count = (u32)((len - sizeof(struct acpi_table_header)) / size);
> + entry = ACPI_ADD_PTR(u8, header, sizeof(struct acpi_table_header));
> +
> + for (i = 0; i < count; i++) {
> + u64 address64;
> +
> + if (size == ACPI_RSDT_ENTRY_SIZE)
> + acpi_table = ((acpi_physical_address)
> + (*ACPI_CAST_PTR(u32, entry)));
> + else {
> + *(u64 *)(void *)&address64 = *(u64 *)(void *)entry;
> + acpi_table = (acpi_physical_address) address64;
> + }
> +
> + if (acpi_table) {
> + header = (struct acpi_table_header *)acpi_table;
> + signature = header->signature;
> +
> + if (!strncmp(signature, "SRAT", 4))
> + return header;
> + }
> + entry += size;
> + }
> + return NULL;
> +}
> +
> +#ifdef CONFIG_MEMORY_HOTREMOVE
> +/*
> + * According to ACPI table, filter the immvoable memory regions
> + * and store them in immovable_mem[].
> + */
> +void get_immovable_mem(void)
> +{
> + char *args = (char *)get_cmd_line_ptr();
> + struct acpi_table_header *table_header;
> + struct acpi_subtable_header *table;
> + struct acpi_srat_mem_affinity *ma;
> + unsigned long table_end;
> + int i = 0;
> +
> + if (!strstr(args, "movable_node") || strstr(args, "acpi=off"))
> + return;
Ditto.
> +
> + table_header = get_acpi_srat_table();
> + if (!table_header)
> + return;
> +
> + table_end = (unsigned long)table_header + table_header->length;
> +
> + table = (struct acpi_subtable_header *)
> + ((unsigned long)table_header + sizeof(struct acpi_table_srat));
> +
> + while (((unsigned long)table) + table->length < table_end) {
> + if (table->type == 1) {
> + ma = (struct acpi_srat_mem_affinity *)table;
> + if (!(ma->flags & ACPI_SRAT_MEM_HOT_PLUGGABLE)) {
> + immovable_mem[i].start = ma->base_address;
> + immovable_mem[i].size = ma->length;
> + i++;
> + }
> +
> + if (i >= MAX_NUMNODES*2)
> + break;
> + }
> + table = (struct acpi_subtable_header *)
> + ((unsigned long)table + table->length);
> + }
> + num_immovable_mem = i;
> +}
> +#else
> +void get_immovable_mem(void)
> +{
> +}
> +#endif
This patch is a pain to review - pls split it into patches adding:
* get_acpi_rsdp
* efi_get_rsdp_addr
* bios_get_rsdp_addr
and the needed functionality.
As a prepatch refactor the cmdline parsing pls.
Thx.
--
Regards/Gruss,
Boris.
Good mailing practices for 400: avoid top-posting and trim the reply.
On Thu, Oct 11, 2018 at 12:57:08PM +0200, Borislav Petkov wrote:
>On Wed, Oct 10, 2018 at 04:41:17PM +0800, Chao Fan wrote:
>> There is a bug that kaslr may randomly chooses some positions
>> which are located in movable memory regions. This will break memory
>> hotplug feature and make the movable memory chosen by KASLR can't be
>> removed. So dig SRAT table from ACPI tables to get memory information.
>>
>> Imitate the ACPI code of parsing ACPI tables to dig and read ACPI
>> tables. Since some operations are not needed here, functions are
>> simplified. Functions will be used to dig only SRAT tables to get
>> information of memory, so that KASLR can the memory in immovable node.
>>
>> And also, these functions won't influence the initialization of
>> ACPI after start_kernel().
>>
>> Since use physical address directely, so acpi_os_map_memory()
>> and acpi_os_unmap_memory() are not needed.
>>
>> Signed-off-by: Chao Fan <[email protected]>
>> ---
>> arch/x86/boot/compressed/Makefile | 2 +
>> arch/x86/boot/compressed/acpitb.c | 405 ++++++++++++++++++++++++++++++
>> arch/x86/boot/compressed/misc.h | 8 +
>> 3 files changed, 415 insertions(+)
>> create mode 100644 arch/x86/boot/compressed/acpitb.c
>>
>> diff --git a/arch/x86/boot/compressed/Makefile b/arch/x86/boot/compressed/Makefile
>> index 28764dacf018..1609e4efcaed 100644
>> --- a/arch/x86/boot/compressed/Makefile
>> +++ b/arch/x86/boot/compressed/Makefile
>> @@ -83,6 +83,8 @@ ifdef CONFIG_X86_64
>> vmlinux-objs-y += $(obj)/pgtable_64.o
>> endif
>>
>> +vmlinux-objs-$(CONFIG_RANDOMIZE_BASE) += $(obj)/acpitb.o
>
So many thanks for your review.
>This should be CONFIG_MEMORY_HOTREMOVE *and* CONFIG_RANDOMIZE_BASE.
>Otherwise we don't need all that code.
Thanks, I will add CONFIG_RANDOMIZE_BASE.
In V7, I ever added CONFIG_MEMORY_HOTREMOVE, then I need add in kaslr.c:
+#ifdef CONFIG_MEMORY_HOTREMOVE
+ /* Mark the immovable regions we need to choose */
+ get_immovable_mem();
+#endif
Then in V8, follow Kees Cook's suggestion, change the #ifdef to the
definition of get_immovable_mem() in acpitb.c
So I drop the CONFIG_MEMORY_HOTREMOVE.
I will splite it to more patch in next version.
Thanks,
Chao Fan
>
>> $(obj)/eboot.o: KBUILD_CFLAGS += -fshort-wchar -mno-red-zone
>>
>> vmlinux-objs-$(CONFIG_EFI_STUB) += $(obj)/eboot.o $(obj)/efi_stub_$(BITS).o \
>> diff --git a/arch/x86/boot/compressed/acpitb.c b/arch/x86/boot/compressed/acpitb.c
>> new file mode 100644
>> index 000000000000..6b869e3f9780
>> --- /dev/null
>> +++ b/arch/x86/boot/compressed/acpitb.c
>> @@ -0,0 +1,405 @@
>> +// SPDX-License-Identifier: GPL-2.0
>> +#define BOOT_CTYPE_H
>> +#include "misc.h"
>> +#include "error.h"
>> +
>> +#include <linux/efi.h>
>> +#include <asm/efi.h>
>> +#include <linux/numa.h>
>> +#include <linux/acpi.h>
>> +
>> +extern unsigned long get_cmd_line_ptr(void);
>> +
>> +#define STATIC
>> +#include <linux/decompress/mm.h>
>> +
>> +#ifdef CONFIG_MEMORY_HOTREMOVE
>> +struct mem_vector {
>> + unsigned long long start;
>> + unsigned long long size;
>> +};
>> +/* Store the immovable memory regions */
>> +struct mem_vector immovable_mem[MAX_NUMNODES*2];
>> +#endif
>> +
>> +#ifdef CONFIG_EFI
>> +/* Search EFI table for rsdp table. */
>> +static bool efi_get_rsdp_addr(acpi_physical_address *rsdp_addr)
>> +{
>> + efi_system_table_t *systab;
>> + bool find_rsdp = false;
>> + bool efi_64 = false;
>> + void *config_tables;
>> + struct efi_info *e;
>> + char *sig;
>> + int size;
>> + int i;
>> +
>> + e = &boot_params->efi_info;
>> + sig = (char *)&e->efi_loader_signature;
>> +
>> + if (!strncmp(sig, EFI64_LOADER_SIGNATURE, 4))
>> + efi_64 = true;
>> + else if (!strncmp(sig, EFI32_LOADER_SIGNATURE, 4))
>> + efi_64 = false;
>> + else {
>> + debug_putstr("Wrong EFI loader signature.\n");
>> + return false;
>> + }
>> +
>> + /* Get systab from boot params. Based on efi_init(). */
>> +#ifdef CONFIG_X86_32
>
>Why the efi_64 detection above but the ifdeffery here? Why not test
>efi_64 instead?
The efi_64 is used for the efi table size.
>
>> + if (e->efi_systab_hi || e->efi_memmap_hi) {
>> + debug_putstr("Table located above 4GB, disabling EFI.\n");
>
>Are you disabling EFI? Where?
>
>Ah, I see, this code is copied from arch/x86/platform/efi/efi.c.
>
>So when copying, fix the user-visible strings too.
Will change it.
>
>> + return false;
>> + }
>> + systab = (efi_system_table_t *)e->efi_systab;
>> +#else
>> + systab = (efi_system_table_t *)(
>> + e->efi_systab | ((__u64)e->efi_systab_hi<<32));
>> +#endif
>> +
>> + if (systab == NULL)
>
> if (!systab)
>
>Fix all other occurrences.
>
>> + return false;
>> +
>> + /*
>> + * Get EFI tables from systab. Based on efi_config_init() and
>> + * efi_config_parse_tables(). Only dig the config_table.
>
> dig out
>
>> + */
>> + size = efi_64 ? sizeof(efi_config_table_64_t) :
>> + sizeof(efi_config_table_32_t);
>> +
>> + for (i = 0; i < systab->nr_tables; i++) {
>> + efi_guid_t guid;
>> + unsigned long table;
>> +
>> + config_tables = (void *)(systab->tables + size * i);
>> + if (efi_64) {
>> + efi_config_table_64_t *tmp_table;
>> +
>> + tmp_table = (efi_config_table_64_t *)config_tables;
>> + guid = tmp_table->guid;
>> + table = tmp_table->table;
>> +#ifndef CONFIG_64BIT
>> + if (table >> 32) {
>> + debug_putstr("Table located above 4G, disabling EFI.\n");
>
>Fix that.
>
>> + return false;
>> + }
>> +#endif
>> + } else {
>> + efi_config_table_32_t *tmp_table;
>> +
>> + tmp_table = (efi_config_table_32_t *)config_tables;
>> + guid = tmp_table->guid;
>> + table = tmp_table->table;
>> + }
>> +
>> + /*
>> + * Get rsdp from EFI tables.
>> + * If ACPI20 table found, use it and return true.
>
>You don't have to say "return true" in the comment - that is in the code
>already.
>
>Also, this function - just like get_acpi_rsdp() doesn't need to return
>bool but the pa directly.
>
>> + * If ACPI20 table not found, but ACPI table found,
>> + * use the ACPI table and return true.
>> + * If neither ACPI table nor ACPI20 table found,
>> + * return false.
>> + */
>> + if (!(efi_guidcmp(guid, ACPI_TABLE_GUID))) {
>> + *rsdp_addr = (acpi_physical_address)table;
>> + find_rsdp = true;
>> + } else if (!(efi_guidcmp(guid, ACPI_20_TABLE_GUID))) {
>> + *rsdp_addr = (acpi_physical_address)table;
>> + return true;
>> + }
>> + }
>> + return find_rsdp;
>> +}
>> +#else
>> +static bool efi_get_rsdp_addr(acpi_physical_address *rsdp_addr)
>> +{
>> + return false;
>> +}
>> +#endif
>
>Instead of doing this, move the ifdef inside the function:
>
>static bool efi_get_rsdp_addr(acpi_physical_address *rsdp_addr)
>{
>#ifdef CONFIG_EFI
>
> /* function body */
>#endif
>}
>
>
>> +
>> +static u8 checksum(u8 *buffer, u32 length)
>
>compute_checksum(...)
>
>> +{
>> + u8 sum = 0;
>> + u8 *end = buffer + length;
>> +
>> + while (buffer < end)
>> + sum = (u8)(sum + *(buffer++));
>> +
>> + return sum;
>> +}
>> +
>> +/*
>> + * Used to search a block of memory for the RSDP signature.
>> + * Return Pointer to the RSDP if found, otherwise NULL.
>> + * Based on acpi_tb_scan_memory_for_rsdp().
>> + */
>> +static u8 *scan_mem_for_rsdp(u8 *start_address, u32 length)
>> +{
>> + struct acpi_table_rsdp *rsdp;
>> + u8 *end_address;
>> + u8 *mem_rover;
>> +
>> + end_address = start_address + length;
>> +
>> + /* Search from given start address for the requested length */
>> + for (mem_rover = start_address; mem_rover < end_address;
>> + mem_rover += ACPI_RSDP_SCAN_STEP) {
>
>Shorten those variable names so that the loop fits on one line.
>
>> + /*
>> + * The RSDP signature and checksum must both be correct
>> + * Note: Sometimes there exists more than one RSDP in memory;
>> + * the valid RSDP has a valid checksum, all others have an
>> + * invalid checksum.
>> + */
>> + rsdp = (struct acpi_table_rsdp *)mem_rover;
>> +
>> + /* Nope, BAD Signature */
>> + if (!ACPI_VALIDATE_RSDP_SIG(rsdp->signature))
>> + continue;
>> +
>> + /* Check the standard checksum */
>> + if (checksum((u8 *) rsdp, ACPI_RSDP_CHECKSUM_LENGTH) != 0)
>
>No need for "!= 0" at the end. Fix all other tests too.
>
>> + continue;
>> +
>> + /* Check extended checksum if table version >= 2 */
>> + if ((rsdp->revision >= 2) &&
>> + (checksum((u8 *) rsdp, ACPI_RSDP_XCHECKSUM_LENGTH) != 0))
>> + continue;
>> +
>> + /* Sig and checksum valid, we have found a real RSDP */
>> + return mem_rover;
>> + }
>> + return NULL;
>> +}
>> +
>> +/*
>> + * Used to search RSDP physical address.
>> + * Based on acpi_find_root_pointer(). Since only use physical address
>> + * in this period, so there is no need to do the memory map jobs.
>> + */
>> +static void bios_get_rsdp_addr(acpi_physical_address *rsdp_addr)
>> +{
>> + struct acpi_table_rsdp *rsdp;
>> + u8 *table_ptr;
>> + u8 *mem_rover;
>> + u32 address;
>> +
>> + /*
>> + * Get the location of the Extended BIOS Data Area (EBDA)
>> + * Since we use physical address directely, so
>> + * acpi_os_map_memory() and acpi_os_unmap_memory() are
>> + * not needed here.
>> + */
>> + table_ptr = (u8 *)ACPI_EBDA_PTR_LOCATION;
>> + *(u32 *)(void *)&address = *(u16 *)(void *)table_ptr;
>> + address <<= 4;
>> + table_ptr = (u8 *)address;
>> +
>> + /*
>> + * Search EBDA paragraphs (EBDA is required to be a minimum of
>> + * 1K length)
>> + */
>> + if (address > 0x400) {
>> + mem_rover = scan_mem_for_rsdp(table_ptr, ACPI_EBDA_WINDOW_SIZE);
>> +
>> + if (mem_rover) {
>> + address += (u32)ACPI_PTR_DIFF(mem_rover, table_ptr);
>> + *rsdp_addr = (acpi_physical_address)address;
>> + return;
>> + }
>> + }
>> +
>> + table_ptr = (u8 *)ACPI_HI_RSDP_WINDOW_BASE;
>> + mem_rover = scan_mem_for_rsdp(table_ptr, ACPI_HI_RSDP_WINDOW_SIZE);
>> +
>> + /*
>> + * Search upper memory: 16-byte boundaries in E0000h-FFFFFh
>> + * Since we use physical address directely, so
>> + * acpi_os_map_memory() and acpi_os_unmap_memory() are
>> + * not needed here.
>> + */
>> + if (mem_rover) {
>> + address = (u32)(ACPI_HI_RSDP_WINDOW_BASE +
>> + ACPI_PTR_DIFF(mem_rover, table_ptr));
>> + *rsdp_addr = (acpi_physical_address)address;
>> + return;
>
>We will return anyway, without that statement. :)
>
>> + }
>> +}
>> +
>> +#ifdef CONFIG_KEXEC
>> +static bool get_acpi_rsdp(acpi_physical_address *rsdp_addr)
>> +{
>> + char *args = (char *)get_cmd_line_ptr();
>> + size_t len = strlen((char *)args);
>> + char *tmp_cmdline, *param, *val;
>> + unsigned long long addr = 0;
>> + char *endptr;
>> +
>> + if (!strstr(args, "acpi_rsdp="))
>> + return false;
>> +
>> + tmp_cmdline = malloc(len+1);
>> + if (!tmp_cmdline)
>> + error("Failed to allocate space for tmp_cmdline");
>
>Why do you even need to allocate a tmp cmdline?
>
>Ah, I see what you've done - you've copied handle_mem_options() in
>kaslr.c. Well no, not really.
>
>That functionality needs to get extracted into a separate facility. Oh
>look, there's arch/x86/boot/compressed/cmdline.c which is begging to get
>extended.
>
>:-)
>
>> +
>> + memcpy(tmp_cmdline, args, len);
>> + tmp_cmdline[len] = 0;
>> + args = tmp_cmdline;
>> +
>> + args = skip_spaces(args);
>> +
>> + while (*args) {
>> + args = next_arg(args, ¶m, &val);
>> +
>> + if (!val && strcmp(param, "--") == 0) {
>> + warn("Only '--' specified in cmdline");
>> + free(tmp_cmdline);
>> + return false;
>> + }
>> +
>> + if (!strcmp(param, "acpi_rsdp")) {
>> + addr = simple_strtoull(val, &endptr, 0);
>
>WARNING: simple_strtoull is obsolete, use kstrtoull instead
>#321: FILE: arch/x86/boot/compressed/acpitb.c:262:
>+ addr = simple_strtoull(val, &endptr, 0);
>
>
>Please integrate scripts/checkpatch.pl into your patch creation
>workflow. Some of the warnings/errors *actually* make sense.
>
>> +
>> + if (addr == 0)
>> + return false;
>> +
>> + *rsdp_addr = (acpi_physical_address)addr;
>> + return true;
>> + }
>> + }
>> + return false;
>> +}
>> +#else
>> +static bool get_acpi_rsdp(acpi_physical_address *rsdp_addr)
>> +{
>> + return false;
>> +}
>> +#endif
>> +
>> +/*
>> + * Used to dig rsdp table from EFI table or BIOS.
>
>Write "rsdp" in all caps in all comments.
>
>> + * If rsdp table found in EFI table, use it. Or search BIOS.
>> + * Based on acpi_os_get_root_pointer().
>> + */
>> +static acpi_physical_address get_rsdp_addr(void)
>> +{
>> + acpi_physical_address pa = 0;
>> + bool status = false;
>> +
>> + status = get_acpi_rsdp(&pa);
>
>Why does this function return bool if pa == 0 is already an invalid
>address. You don't need the initialization to 0 above either.
>
>> +
>> + if (!status || pa == 0)
>
> if (!status || !pa)
>
>Fix all other tests.
>
>> + status = efi_get_rsdp_addr(&pa);
>> +
>> + if (!status || pa == 0)
>> + bios_get_rsdp_addr(&pa);
>> +
>> + return pa;
>> +}
>> +
>> +static struct acpi_table_header *get_acpi_srat_table(void)
>> +{
>> + char *args = (char *)get_cmd_line_ptr();
>> + acpi_physical_address acpi_table;
>> + acpi_physical_address root_table;
>> + struct acpi_table_header *header;
>> + struct acpi_table_rsdp *rsdp;
>> + char *signature;
>> + u8 *entry;
>> + u32 count;
>> + u32 size;
>> + int i, j;
>> + u32 len;
>> +
>> + rsdp = (struct acpi_table_rsdp *)get_rsdp_addr();
>> + if (!rsdp)
>> + return NULL;
>> +
>> + /* Get rsdt or xsdt from rsdp. */
>> + if (!strstr(args, "acpi=rsdt") &&
>> + rsdp->xsdt_physical_address && rsdp->revision > 1) {
>> + root_table = rsdp->xsdt_physical_address;
>> + size = ACPI_XSDT_ENTRY_SIZE;
>> + } else {
>> + root_table = rsdp->rsdt_physical_address;
>> + size = ACPI_RSDT_ENTRY_SIZE;
>> + }
>
>Please rework the cmdline parsing so that the functions can call helpers
>only.
>
>> +
>> + /* Get ACPI root table from rsdt or xsdt.*/
>> + header = (struct acpi_table_header *)root_table;
>> + len = header->length;
>> + count = (u32)((len - sizeof(struct acpi_table_header)) / size);
>> + entry = ACPI_ADD_PTR(u8, header, sizeof(struct acpi_table_header));
>> +
>> + for (i = 0; i < count; i++) {
>> + u64 address64;
>> +
>> + if (size == ACPI_RSDT_ENTRY_SIZE)
>> + acpi_table = ((acpi_physical_address)
>> + (*ACPI_CAST_PTR(u32, entry)));
>> + else {
>> + *(u64 *)(void *)&address64 = *(u64 *)(void *)entry;
>> + acpi_table = (acpi_physical_address) address64;
>> + }
>> +
>> + if (acpi_table) {
>> + header = (struct acpi_table_header *)acpi_table;
>> + signature = header->signature;
>> +
>> + if (!strncmp(signature, "SRAT", 4))
>> + return header;
>> + }
>> + entry += size;
>> + }
>> + return NULL;
>> +}
>> +
>> +#ifdef CONFIG_MEMORY_HOTREMOVE
>> +/*
>> + * According to ACPI table, filter the immvoable memory regions
>> + * and store them in immovable_mem[].
>> + */
>> +void get_immovable_mem(void)
>> +{
>> + char *args = (char *)get_cmd_line_ptr();
>> + struct acpi_table_header *table_header;
>> + struct acpi_subtable_header *table;
>> + struct acpi_srat_mem_affinity *ma;
>> + unsigned long table_end;
>> + int i = 0;
>> +
>> + if (!strstr(args, "movable_node") || strstr(args, "acpi=off"))
>> + return;
>
>Ditto.
>
>> +
>> + table_header = get_acpi_srat_table();
>> + if (!table_header)
>> + return;
>> +
>> + table_end = (unsigned long)table_header + table_header->length;
>> +
>> + table = (struct acpi_subtable_header *)
>> + ((unsigned long)table_header + sizeof(struct acpi_table_srat));
>> +
>> + while (((unsigned long)table) + table->length < table_end) {
>> + if (table->type == 1) {
>> + ma = (struct acpi_srat_mem_affinity *)table;
>> + if (!(ma->flags & ACPI_SRAT_MEM_HOT_PLUGGABLE)) {
>> + immovable_mem[i].start = ma->base_address;
>> + immovable_mem[i].size = ma->length;
>> + i++;
>> + }
>> +
>> + if (i >= MAX_NUMNODES*2)
>> + break;
>> + }
>> + table = (struct acpi_subtable_header *)
>> + ((unsigned long)table + table->length);
>> + }
>> + num_immovable_mem = i;
>> +}
>> +#else
>> +void get_immovable_mem(void)
>> +{
>> +}
>> +#endif
>
>This patch is a pain to review - pls split it into patches adding:
>
>* get_acpi_rsdp
>* efi_get_rsdp_addr
>* bios_get_rsdp_addr
>
>and the needed functionality.
>
>As a prepatch refactor the cmdline parsing pls.
>
>Thx.
>
>--
>Regards/Gruss,
> Boris.
>
>Good mailing practices for 400: avoid top-posting and trim the reply.
>
>
On Thu, Oct 11, 2018 at 12:57:08PM +0200, Borislav Petkov wrote:
>On Wed, Oct 10, 2018 at 04:41:17PM +0800, Chao Fan wrote:
[...]
>> + * If ACPI20 table not found, but ACPI table found,
>> + * use the ACPI table and return true.
>> + * If neither ACPI table nor ACPI20 table found,
>> + * return false.
>> + */
>> + if (!(efi_guidcmp(guid, ACPI_TABLE_GUID))) {
>> + *rsdp_addr = (acpi_physical_address)table;
>> + find_rsdp = true;
>> + } else if (!(efi_guidcmp(guid, ACPI_20_TABLE_GUID))) {
>> + *rsdp_addr = (acpi_physical_address)table;
>> + return true;
>> + }
>> + }
>> + return find_rsdp;
>> +}
>> +#else
>> +static bool efi_get_rsdp_addr(acpi_physical_address *rsdp_addr)
>> +{
>> + return false;
>> +}
>> +#endif
>
>Instead of doing this, move the ifdef inside the function:
>
>static bool efi_get_rsdp_addr(acpi_physical_address *rsdp_addr)
>{
>#ifdef CONFIG_EFI
>
> /* function body */
>#endif
>}
>
Hi Borislav,
Thank you for review the detail, but may I ask you why this style is better?
Since the Documentation/process/coding-style.rst said:
Instead,
use such conditionals in a header file defining functions for use in those .c
files, providing no-op stub versions in the #else case, and then call those
functions unconditionally from .c files. The compiler will avoid generating
any code for the stub calls, producing identical results, but the logic will
remain easy to follow.
Prefer to compile out entire functions, rather than portions of functions or
portions of expressions. Rather than putting an ifdef in an expression, factor
out part or all of the expression into a separate helper function and apply the
conditional to that function.
So I am puzzled. If my understanding is wrong, please let me know.
Thanks,
Chao Fan
>> +
>> +static u8 checksum(u8 *buffer, u32 length)
>
>compute_checksum(...)
>
>> +{
>> + u8 sum = 0;
>> + u8 *end = buffer + length;
>> +
>> + while (buffer < end)
>> + sum = (u8)(sum + *(buffer++));
>> +
>> + return sum;
>> +}
>> +
>> +/*
>> + * Used to search a block of memory for the RSDP signature.
>> + * Return Pointer to the RSDP if found, otherwise NULL.
>> + * Based on acpi_tb_scan_memory_for_rsdp().
>> + */
>> +static u8 *scan_mem_for_rsdp(u8 *start_address, u32 length)
>> +{
>> + struct acpi_table_rsdp *rsdp;
>> + u8 *end_address;
>> + u8 *mem_rover;
>> +
>> + end_address = start_address + length;
>> +
>> + /* Search from given start address for the requested length */
>> + for (mem_rover = start_address; mem_rover < end_address;
>> + mem_rover += ACPI_RSDP_SCAN_STEP) {
>
>Shorten those variable names so that the loop fits on one line.
>
>> + /*
>> + * The RSDP signature and checksum must both be correct
>> + * Note: Sometimes there exists more than one RSDP in memory;
>> + * the valid RSDP has a valid checksum, all others have an
>> + * invalid checksum.
>> + */
>> + rsdp = (struct acpi_table_rsdp *)mem_rover;
>> +
>> + /* Nope, BAD Signature */
>> + if (!ACPI_VALIDATE_RSDP_SIG(rsdp->signature))
>> + continue;
>> +
>> + /* Check the standard checksum */
>> + if (checksum((u8 *) rsdp, ACPI_RSDP_CHECKSUM_LENGTH) != 0)
>
>No need for "!= 0" at the end. Fix all other tests too.
>
>> + continue;
>> +
>> + /* Check extended checksum if table version >= 2 */
>> + if ((rsdp->revision >= 2) &&
>> + (checksum((u8 *) rsdp, ACPI_RSDP_XCHECKSUM_LENGTH) != 0))
>> + continue;
>> +
>> + /* Sig and checksum valid, we have found a real RSDP */
>> + return mem_rover;
>> + }
>> + return NULL;
>> +}
>> +
>> +/*
>> + * Used to search RSDP physical address.
>> + * Based on acpi_find_root_pointer(). Since only use physical address
>> + * in this period, so there is no need to do the memory map jobs.
>> + */
>> +static void bios_get_rsdp_addr(acpi_physical_address *rsdp_addr)
>> +{
>> + struct acpi_table_rsdp *rsdp;
>> + u8 *table_ptr;
>> + u8 *mem_rover;
>> + u32 address;
>> +
>> + /*
>> + * Get the location of the Extended BIOS Data Area (EBDA)
>> + * Since we use physical address directely, so
>> + * acpi_os_map_memory() and acpi_os_unmap_memory() are
>> + * not needed here.
>> + */
>> + table_ptr = (u8 *)ACPI_EBDA_PTR_LOCATION;
>> + *(u32 *)(void *)&address = *(u16 *)(void *)table_ptr;
>> + address <<= 4;
>> + table_ptr = (u8 *)address;
>> +
>> + /*
>> + * Search EBDA paragraphs (EBDA is required to be a minimum of
>> + * 1K length)
>> + */
>> + if (address > 0x400) {
>> + mem_rover = scan_mem_for_rsdp(table_ptr, ACPI_EBDA_WINDOW_SIZE);
>> +
>> + if (mem_rover) {
>> + address += (u32)ACPI_PTR_DIFF(mem_rover, table_ptr);
>> + *rsdp_addr = (acpi_physical_address)address;
>> + return;
>> + }
>> + }
>> +
>> + table_ptr = (u8 *)ACPI_HI_RSDP_WINDOW_BASE;
>> + mem_rover = scan_mem_for_rsdp(table_ptr, ACPI_HI_RSDP_WINDOW_SIZE);
>> +
>> + /*
>> + * Search upper memory: 16-byte boundaries in E0000h-FFFFFh
>> + * Since we use physical address directely, so
>> + * acpi_os_map_memory() and acpi_os_unmap_memory() are
>> + * not needed here.
>> + */
>> + if (mem_rover) {
>> + address = (u32)(ACPI_HI_RSDP_WINDOW_BASE +
>> + ACPI_PTR_DIFF(mem_rover, table_ptr));
>> + *rsdp_addr = (acpi_physical_address)address;
>> + return;
>
>We will return anyway, without that statement. :)
>
>> + }
>> +}
>> +
>> +#ifdef CONFIG_KEXEC
>> +static bool get_acpi_rsdp(acpi_physical_address *rsdp_addr)
>> +{
>> + char *args = (char *)get_cmd_line_ptr();
>> + size_t len = strlen((char *)args);
>> + char *tmp_cmdline, *param, *val;
>> + unsigned long long addr = 0;
>> + char *endptr;
>> +
>> + if (!strstr(args, "acpi_rsdp="))
>> + return false;
>> +
>> + tmp_cmdline = malloc(len+1);
>> + if (!tmp_cmdline)
>> + error("Failed to allocate space for tmp_cmdline");
>
>Why do you even need to allocate a tmp cmdline?
>
>Ah, I see what you've done - you've copied handle_mem_options() in
>kaslr.c. Well no, not really.
>
>That functionality needs to get extracted into a separate facility. Oh
>look, there's arch/x86/boot/compressed/cmdline.c which is begging to get
>extended.
>
>:-)
>
>> +
>> + memcpy(tmp_cmdline, args, len);
>> + tmp_cmdline[len] = 0;
>> + args = tmp_cmdline;
>> +
>> + args = skip_spaces(args);
>> +
>> + while (*args) {
>> + args = next_arg(args, ¶m, &val);
>> +
>> + if (!val && strcmp(param, "--") == 0) {
>> + warn("Only '--' specified in cmdline");
>> + free(tmp_cmdline);
>> + return false;
>> + }
>> +
>> + if (!strcmp(param, "acpi_rsdp")) {
>> + addr = simple_strtoull(val, &endptr, 0);
>
>WARNING: simple_strtoull is obsolete, use kstrtoull instead
>#321: FILE: arch/x86/boot/compressed/acpitb.c:262:
>+ addr = simple_strtoull(val, &endptr, 0);
>
>
>Please integrate scripts/checkpatch.pl into your patch creation
>workflow. Some of the warnings/errors *actually* make sense.
>
>> +
>> + if (addr == 0)
>> + return false;
>> +
>> + *rsdp_addr = (acpi_physical_address)addr;
>> + return true;
>> + }
>> + }
>> + return false;
>> +}
>> +#else
>> +static bool get_acpi_rsdp(acpi_physical_address *rsdp_addr)
>> +{
>> + return false;
>> +}
>> +#endif
>> +
>> +/*
>> + * Used to dig rsdp table from EFI table or BIOS.
>
>Write "rsdp" in all caps in all comments.
>
>> + * If rsdp table found in EFI table, use it. Or search BIOS.
>> + * Based on acpi_os_get_root_pointer().
>> + */
>> +static acpi_physical_address get_rsdp_addr(void)
>> +{
>> + acpi_physical_address pa = 0;
>> + bool status = false;
>> +
>> + status = get_acpi_rsdp(&pa);
>
>Why does this function return bool if pa == 0 is already an invalid
>address. You don't need the initialization to 0 above either.
>
>> +
>> + if (!status || pa == 0)
>
> if (!status || !pa)
>
>Fix all other tests.
>
>> + status = efi_get_rsdp_addr(&pa);
>> +
>> + if (!status || pa == 0)
>> + bios_get_rsdp_addr(&pa);
>> +
>> + return pa;
>> +}
>> +
>> +static struct acpi_table_header *get_acpi_srat_table(void)
>> +{
>> + char *args = (char *)get_cmd_line_ptr();
>> + acpi_physical_address acpi_table;
>> + acpi_physical_address root_table;
>> + struct acpi_table_header *header;
>> + struct acpi_table_rsdp *rsdp;
>> + char *signature;
>> + u8 *entry;
>> + u32 count;
>> + u32 size;
>> + int i, j;
>> + u32 len;
>> +
>> + rsdp = (struct acpi_table_rsdp *)get_rsdp_addr();
>> + if (!rsdp)
>> + return NULL;
>> +
>> + /* Get rsdt or xsdt from rsdp. */
>> + if (!strstr(args, "acpi=rsdt") &&
>> + rsdp->xsdt_physical_address && rsdp->revision > 1) {
>> + root_table = rsdp->xsdt_physical_address;
>> + size = ACPI_XSDT_ENTRY_SIZE;
>> + } else {
>> + root_table = rsdp->rsdt_physical_address;
>> + size = ACPI_RSDT_ENTRY_SIZE;
>> + }
>
>Please rework the cmdline parsing so that the functions can call helpers
>only.
>
>> +
>> + /* Get ACPI root table from rsdt or xsdt.*/
>> + header = (struct acpi_table_header *)root_table;
>> + len = header->length;
>> + count = (u32)((len - sizeof(struct acpi_table_header)) / size);
>> + entry = ACPI_ADD_PTR(u8, header, sizeof(struct acpi_table_header));
>> +
>> + for (i = 0; i < count; i++) {
>> + u64 address64;
>> +
>> + if (size == ACPI_RSDT_ENTRY_SIZE)
>> + acpi_table = ((acpi_physical_address)
>> + (*ACPI_CAST_PTR(u32, entry)));
>> + else {
>> + *(u64 *)(void *)&address64 = *(u64 *)(void *)entry;
>> + acpi_table = (acpi_physical_address) address64;
>> + }
>> +
>> + if (acpi_table) {
>> + header = (struct acpi_table_header *)acpi_table;
>> + signature = header->signature;
>> +
>> + if (!strncmp(signature, "SRAT", 4))
>> + return header;
>> + }
>> + entry += size;
>> + }
>> + return NULL;
>> +}
>> +
>> +#ifdef CONFIG_MEMORY_HOTREMOVE
>> +/*
>> + * According to ACPI table, filter the immvoable memory regions
>> + * and store them in immovable_mem[].
>> + */
>> +void get_immovable_mem(void)
>> +{
>> + char *args = (char *)get_cmd_line_ptr();
>> + struct acpi_table_header *table_header;
>> + struct acpi_subtable_header *table;
>> + struct acpi_srat_mem_affinity *ma;
>> + unsigned long table_end;
>> + int i = 0;
>> +
>> + if (!strstr(args, "movable_node") || strstr(args, "acpi=off"))
>> + return;
>
>Ditto.
>
>> +
>> + table_header = get_acpi_srat_table();
>> + if (!table_header)
>> + return;
>> +
>> + table_end = (unsigned long)table_header + table_header->length;
>> +
>> + table = (struct acpi_subtable_header *)
>> + ((unsigned long)table_header + sizeof(struct acpi_table_srat));
>> +
>> + while (((unsigned long)table) + table->length < table_end) {
>> + if (table->type == 1) {
>> + ma = (struct acpi_srat_mem_affinity *)table;
>> + if (!(ma->flags & ACPI_SRAT_MEM_HOT_PLUGGABLE)) {
>> + immovable_mem[i].start = ma->base_address;
>> + immovable_mem[i].size = ma->length;
>> + i++;
>> + }
>> +
>> + if (i >= MAX_NUMNODES*2)
>> + break;
>> + }
>> + table = (struct acpi_subtable_header *)
>> + ((unsigned long)table + table->length);
>> + }
>> + num_immovable_mem = i;
>> +}
>> +#else
>> +void get_immovable_mem(void)
>> +{
>> +}
>> +#endif
>
>This patch is a pain to review - pls split it into patches adding:
>
>* get_acpi_rsdp
>* efi_get_rsdp_addr
>* bios_get_rsdp_addr
>
>and the needed functionality.
>
>As a prepatch refactor the cmdline parsing pls.
>
>Thx.
>
>--
>Regards/Gruss,
> Boris.
>
>Good mailing practices for 400: avoid top-posting and trim the reply.
>
>
On Fri, Oct 12, 2018 at 05:36:38PM +0800, Chao Fan wrote:
> Prefer to compile out entire functions, rather than portions of functions or
> portions of expressions. Rather than putting an ifdef in an expression, factor
> out part or all of the expression into a separate helper function and apply the
> conditional to that function.
>
> So I am puzzled. If my understanding is wrong, please let me know.
If you do it the way I suggested, you simply have one ifdeffery branch
less. And ifdeffery is ugly. So less clutter. Also, this way you do
compile out entire functions too.
HTH.
--
Regards/Gruss,
Boris.
Good mailing practices for 400: avoid top-posting and trim the reply.
On Fri, Oct 12, 2018 at 11:46:55AM +0200, Borislav Petkov wrote:
>On Fri, Oct 12, 2018 at 05:36:38PM +0800, Chao Fan wrote:
>> Prefer to compile out entire functions, rather than portions of functions or
>> portions of expressions. Rather than putting an ifdef in an expression, factor
>> out part or all of the expression into a separate helper function and apply the
>> conditional to that function.
>>
>> So I am puzzled. If my understanding is wrong, please let me know.
>
>If you do it the way I suggested, you simply have one ifdeffery branch
>less. And ifdeffery is ugly. So less clutter. Also, this way you do
>compile out entire functions too.
>
>HTH.
Got it, thanks for your rapid reply.
Thanks,
Chao Fan
>
>--
>Regards/Gruss,
> Boris.
>
>Good mailing practices for 400: avoid top-posting and trim the reply.
>
>
Baoquan and Chao, thank you for your comments!
Boris, the patches has been merged to linux-next.
Should I create a new patch for linux-next?
I'm trying to make the padding size set automatically.
Thanks,
Masa
On Thu, Oct 11, 2018 at 01:51:54PM +0800, Chao Fan wrote:
> On Thu, Oct 11, 2018 at 08:29:55AM +0800, Baoquan He wrote:
> >On 10/10/18 at 03:44pm, Masayoshi Mizuma wrote:
> >> On Wed, Oct 10, 2018 at 05:30:57PM +0800, Baoquan He wrote:
> >> > On 10/10/18 at 11:19am, Borislav Petkov wrote:
> >> > > On Wed, Oct 10, 2018 at 11:14:26AM +0200, Thomas Gleixner wrote:
> >> > > > Yes, it's different, but if the SRAT information is available early, then
> >> > > > the command line parameter can go away because then the required
> >> > > > information for Masa's problem is available as well.
> >> > >
> >> > > Exactly. And I'd prefer we delayed the command line parameter until we
> >> > > figure out we really need it and not expose it to upstream and then
> >> > > remove it shortly after.
> >> > >
> >> > > So I'd suggest we move Masa's patches to a separate branch and not send
> >> > > it up this round.
> >> >
> >> > Yes, sounds more reasonable if we can reuse functions in Chao's patch 1/3
> >> > to solve the padding issue.
> >>
> >> Thanks for your comments! Yes, immovable_mem[num_immovable_mem] in Chao's
> >> patch may be useful for calculating the padding size. If so, we don't
> >> need the new kernel parameter. It's nice!
> >>
> >> Do you happen to have ideas how we access immovable_mem[num_immovable_mem]
> >> from arch/x86/mm/kaslr.c ? It is located to arch/x86/boot/compressed/*, so
> >> I suppose it is not easy to access it...
> >> I would appreciate if you could give some advice.
> >
> >Hmm, they are living in different life cycle and space. So we can only
> >reuse the code from Chao's patch, but not the variable. Means we need
> >go through the SRAT accessing again in arch/x86/mm/kaslr.c and fill
> >immovable_mem[num_immovable_mem] for mm/kaslr.c usage, if we decide to
> >do like that.
>
> Reading three times is redundant, but reading two times is needed.
> Becasue the ACPI code run very stable, but we need more information
> before that.
> As for Masa's issue, I am wondering whether we can tranfer the
> information or only the address of SRAT table from compressed period
> to the period after start_kernel().
>
> Thanks,
> Chao Fan
>
> >
> >Thanks
> >Baoquan
> >
> >
>
>
On Sat, Oct 13, 2018 at 04:19:59PM -0400, Masayoshi Mizuma wrote:
> Baoquan and Chao, thank you for your comments!
>
> Boris, the patches has been merged to linux-next.
First of all, please do not top-post.
Even if they're in linux-next, that doesn't mean they should go upstream
while there are still issues to be hammered out.
In this particular case, if Chao Fan's patchset shapes up in a decent
form to be upstreamed, we don't need the cmdline parameter anymore, do
we?
Because we'll have the SRAT parsing and proper KASLR limits ready and
working automatically, I'd say.
--
Regards/Gruss,
Boris.
Good mailing practices for 400: avoid top-posting and trim the reply.
On Sat, Oct 13, 2018 at 10:34:29PM +0200, Borislav Petkov wrote:
> On Sat, Oct 13, 2018 at 04:19:59PM -0400, Masayoshi Mizuma wrote:
> > Baoquan and Chao, thank you for your comments!
> >
> > Boris, the patches has been merged to linux-next.
>
> First of all, please do not top-post.
>
> Even if they're in linux-next, that doesn't mean they should go upstream
> while there are still issues to be hammered out.
Got it, thanks.
>
> In this particular case, if Chao Fan's patchset shapes up in a decent
> form to be upstreamed, we don't need the cmdline parameter anymore, do
> we?
>
> Because we'll have the SRAT parsing and proper KASLR limits ready and
> working automatically, I'd say.
Yes, but I think we need to arrange the Chao's SRAT parsing to be used
from kernel_randomize_memory() because Chao's approach needs the SRAT
parsing before extract kernel and the padding size calculation needs
the parsing in start_kernel(), so they are living in different life
cycle and space.
Thanks,
Masa
On Sat, Oct 13, 2018 at 05:45:51PM -0400, Masayoshi Mizuma wrote:
> Yes, but I think we need to arrange the Chao's SRAT parsing to be used
> from kernel_randomize_memory() because Chao's approach needs the SRAT
> parsing before extract kernel and the padding size calculation needs
> the parsing in start_kernel(), so they are living in different life
> cycle and space.
Or you would like to pass some info from the compressed kernel to kernel
proper?
For example, something like the passing in add_e820ext() and the consumtion
in parse_setup_data():
switch (data_type) {
case SETUP_E820_EXT:
e820__memory_setup_extended(pa_data, data_len);
So info you need for your padding gets collected in
arch/x86/boot/compressed/acpitb.c and you consume it in
kernel_randomize_memory()?
Would that work?
:-)
--
Regards/Gruss,
Boris.
Good mailing practices for 400: avoid top-posting and trim the reply.
On Sun, Oct 14, 2018 at 12:05:41AM +0200, Borislav Petkov wrote:
> On Sat, Oct 13, 2018 at 05:45:51PM -0400, Masayoshi Mizuma wrote:
> > Yes, but I think we need to arrange the Chao's SRAT parsing to be used
> > from kernel_randomize_memory() because Chao's approach needs the SRAT
> > parsing before extract kernel and the padding size calculation needs
> > the parsing in start_kernel(), so they are living in different life
> > cycle and space.
>
> Or you would like to pass some info from the compressed kernel to kernel
> proper?
>
> For example, something like the passing in add_e820ext() and the consumtion
> in parse_setup_data():
>
> switch (data_type) {
> case SETUP_E820_EXT:
> e820__memory_setup_extended(pa_data, data_len);
>
> So info you need for your padding gets collected in
> arch/x86/boot/compressed/acpitb.c and you consume it in
> kernel_randomize_memory()?
>
> Would that work?
It's nice idea! Thank you so much!
- Masa
Hi Chao,
Let me add some suggestions.
On Wed, Oct 10, 2018 at 04:41:17PM +0800, Chao Fan wrote:
> There is a bug that kaslr may randomly chooses some positions
> which are located in movable memory regions. This will break memory
> hotplug feature and make the movable memory chosen by KASLR can't be
> removed. So dig SRAT table from ACPI tables to get memory information.
>
> Imitate the ACPI code of parsing ACPI tables to dig and read ACPI
> tables. Since some operations are not needed here, functions are
> simplified. Functions will be used to dig only SRAT tables to get
> information of memory, so that KASLR can the memory in immovable node.
>
> And also, these functions won't influence the initialization of
> ACPI after start_kernel().
>
> Since use physical address directely, so acpi_os_map_memory()
> and acpi_os_unmap_memory() are not needed.
>
> Signed-off-by: Chao Fan <[email protected]>
> ---
> arch/x86/boot/compressed/Makefile | 2 +
> arch/x86/boot/compressed/acpitb.c | 405 ++++++++++++++++++++++++++++++
> arch/x86/boot/compressed/misc.h | 8 +
> 3 files changed, 415 insertions(+)
> create mode 100644 arch/x86/boot/compressed/acpitb.c
>
...cut...
> +static struct acpi_table_header *get_acpi_srat_table(void)
> +{
> + char *args = (char *)get_cmd_line_ptr();
> + acpi_physical_address acpi_table;
> + acpi_physical_address root_table;
> + struct acpi_table_header *header;
> + struct acpi_table_rsdp *rsdp;
> + char *signature;
> + u8 *entry;
> + u32 count;
> + u32 size;
> + int i, j;
> + u32 len;
> +
> + rsdp = (struct acpi_table_rsdp *)get_rsdp_addr();
> + if (!rsdp)
> + return NULL;
> +
> + /* Get rsdt or xsdt from rsdp. */
> + if (!strstr(args, "acpi=rsdt") &&
> + rsdp->xsdt_physical_address && rsdp->revision > 1) {
> + root_table = rsdp->xsdt_physical_address;
> + size = ACPI_XSDT_ENTRY_SIZE;
> + } else {
> + root_table = rsdp->rsdt_physical_address;
> + size = ACPI_RSDT_ENTRY_SIZE;
> + }
> +
> + /* Get ACPI root table from rsdt or xsdt.*/
> + header = (struct acpi_table_header *)root_table;
> + len = header->length;
> + count = (u32)((len - sizeof(struct acpi_table_header)) / size);
> + entry = ACPI_ADD_PTR(u8, header, sizeof(struct acpi_table_header));
> +
> + for (i = 0; i < count; i++) {
> + u64 address64;
> +
> + if (size == ACPI_RSDT_ENTRY_SIZE)
> + acpi_table = ((acpi_physical_address)
> + (*ACPI_CAST_PTR(u32, entry)));
> + else {
> + *(u64 *)(void *)&address64 = *(u64 *)(void *)entry;
> + acpi_table = (acpi_physical_address) address64;
> + }
> +
> + if (acpi_table) {
> + header = (struct acpi_table_header *)acpi_table;
> + signature = header->signature;
> +
> + if (!strncmp(signature, "SRAT", 4))
if (ACPI_COMPARE_NAME(header->signature, ACPI_SIG_SRAT))
> + return header;
> + }
> + entry += size;
> + }
> + return NULL;
> +}
> +
> +#ifdef CONFIG_MEMORY_HOTREMOVE
> +/*
> + * According to ACPI table, filter the immvoable memory regions
> + * and store them in immovable_mem[].
> + */
> +void get_immovable_mem(void)
> +{
> + char *args = (char *)get_cmd_line_ptr();
> + struct acpi_table_header *table_header;
> + struct acpi_subtable_header *table;
> + struct acpi_srat_mem_affinity *ma;
> + unsigned long table_end;
> + int i = 0;
> +
> + if (!strstr(args, "movable_node") || strstr(args, "acpi=off"))
> + return;
> +
> + table_header = get_acpi_srat_table();
> + if (!table_header)
> + return;
> +
> + table_end = (unsigned long)table_header + table_header->length;
> +
> + table = (struct acpi_subtable_header *)
> + ((unsigned long)table_header + sizeof(struct acpi_table_srat));
> +
> + while (((unsigned long)table) + table->length < table_end) {
while (((unsigned long)table) +
sizeof(struct acpi_subtable_header) < table_end) {
> + if (table->type == 1) {
if (table->type == ACPI_SRAT_TYPE_MEMORY_AFFINITY) {
> + ma = (struct acpi_srat_mem_affinity *)table;
> + if (!(ma->flags & ACPI_SRAT_MEM_HOT_PLUGGABLE)) {
> + immovable_mem[i].start = ma->base_address;
> + immovable_mem[i].size = ma->length;
> + i++;
> + }
> +
> + if (i >= MAX_NUMNODES*2)
> + break;
> + }
> + table = (struct acpi_subtable_header *)
> + ((unsigned long)table + table->length);
> + }
> + num_immovable_mem = i;
> +}
Thanks,
Masa
On Mon, Oct 15, 2018 at 04:26:09PM -0400, Masayoshi Mizuma wrote:
>Hi Chao,
>
>Let me add some suggestions.
Thanks for your review and suggestion.
I will change them in next version.
Thanks,
Chao Fan
>
>On Wed, Oct 10, 2018 at 04:41:17PM +0800, Chao Fan wrote:
>> There is a bug that kaslr may randomly chooses some positions
>> which are located in movable memory regions. This will break memory
>> hotplug feature and make the movable memory chosen by KASLR can't be
>> removed. So dig SRAT table from ACPI tables to get memory information.
>>
>> Imitate the ACPI code of parsing ACPI tables to dig and read ACPI
>> tables. Since some operations are not needed here, functions are
>> simplified. Functions will be used to dig only SRAT tables to get
>> information of memory, so that KASLR can the memory in immovable node.
>>
>> And also, these functions won't influence the initialization of
>> ACPI after start_kernel().
>>
>> Since use physical address directely, so acpi_os_map_memory()
>> and acpi_os_unmap_memory() are not needed.
>>
>> Signed-off-by: Chao Fan <[email protected]>
>> ---
>> arch/x86/boot/compressed/Makefile | 2 +
>> arch/x86/boot/compressed/acpitb.c | 405 ++++++++++++++++++++++++++++++
>> arch/x86/boot/compressed/misc.h | 8 +
>> 3 files changed, 415 insertions(+)
>> create mode 100644 arch/x86/boot/compressed/acpitb.c
>>
>...cut...
>> +static struct acpi_table_header *get_acpi_srat_table(void)
>> +{
>> + char *args = (char *)get_cmd_line_ptr();
>> + acpi_physical_address acpi_table;
>> + acpi_physical_address root_table;
>> + struct acpi_table_header *header;
>> + struct acpi_table_rsdp *rsdp;
>> + char *signature;
>> + u8 *entry;
>> + u32 count;
>> + u32 size;
>> + int i, j;
>> + u32 len;
>> +
>> + rsdp = (struct acpi_table_rsdp *)get_rsdp_addr();
>> + if (!rsdp)
>> + return NULL;
>> +
>> + /* Get rsdt or xsdt from rsdp. */
>> + if (!strstr(args, "acpi=rsdt") &&
>> + rsdp->xsdt_physical_address && rsdp->revision > 1) {
>> + root_table = rsdp->xsdt_physical_address;
>> + size = ACPI_XSDT_ENTRY_SIZE;
>> + } else {
>> + root_table = rsdp->rsdt_physical_address;
>> + size = ACPI_RSDT_ENTRY_SIZE;
>> + }
>> +
>> + /* Get ACPI root table from rsdt or xsdt.*/
>> + header = (struct acpi_table_header *)root_table;
>> + len = header->length;
>> + count = (u32)((len - sizeof(struct acpi_table_header)) / size);
>> + entry = ACPI_ADD_PTR(u8, header, sizeof(struct acpi_table_header));
>> +
>> + for (i = 0; i < count; i++) {
>> + u64 address64;
>> +
>> + if (size == ACPI_RSDT_ENTRY_SIZE)
>> + acpi_table = ((acpi_physical_address)
>> + (*ACPI_CAST_PTR(u32, entry)));
>> + else {
>> + *(u64 *)(void *)&address64 = *(u64 *)(void *)entry;
>> + acpi_table = (acpi_physical_address) address64;
>> + }
>> +
>> + if (acpi_table) {
>> + header = (struct acpi_table_header *)acpi_table;
>
>> + signature = header->signature;
>> +
>> + if (!strncmp(signature, "SRAT", 4))
>
> if (ACPI_COMPARE_NAME(header->signature, ACPI_SIG_SRAT))
>
>> + return header;
>> + }
>> + entry += size;
>> + }
>> + return NULL;
>> +}
>> +
>> +#ifdef CONFIG_MEMORY_HOTREMOVE
>> +/*
>> + * According to ACPI table, filter the immvoable memory regions
>> + * and store them in immovable_mem[].
>> + */
>> +void get_immovable_mem(void)
>> +{
>> + char *args = (char *)get_cmd_line_ptr();
>> + struct acpi_table_header *table_header;
>> + struct acpi_subtable_header *table;
>> + struct acpi_srat_mem_affinity *ma;
>> + unsigned long table_end;
>> + int i = 0;
>> +
>> + if (!strstr(args, "movable_node") || strstr(args, "acpi=off"))
>> + return;
>> +
>> + table_header = get_acpi_srat_table();
>> + if (!table_header)
>> + return;
>> +
>> + table_end = (unsigned long)table_header + table_header->length;
>> +
>> + table = (struct acpi_subtable_header *)
>> + ((unsigned long)table_header + sizeof(struct acpi_table_srat));
>> +
>
>> + while (((unsigned long)table) + table->length < table_end) {
>
> while (((unsigned long)table) +
> sizeof(struct acpi_subtable_header) < table_end) {
>
>> + if (table->type == 1) {
>
> if (table->type == ACPI_SRAT_TYPE_MEMORY_AFFINITY) {
>
>> + ma = (struct acpi_srat_mem_affinity *)table;
>> + if (!(ma->flags & ACPI_SRAT_MEM_HOT_PLUGGABLE)) {
>> + immovable_mem[i].start = ma->base_address;
>> + immovable_mem[i].size = ma->length;
>> + i++;
>> + }
>> +
>> + if (i >= MAX_NUMNODES*2)
>> + break;
>> + }
>> + table = (struct acpi_subtable_header *)
>> + ((unsigned long)table + table->length);
>> + }
>> + num_immovable_mem = i;
>> +}
>
>Thanks,
>Masa
>
>
On Thu, Oct 11, 2018 at 12:57:08PM +0200, Borislav Petkov wrote:
>On Wed, Oct 10, 2018 at 04:41:17PM +0800, Chao Fan wrote:
[...]
>> +#ifdef CONFIG_KEXEC
>> +static bool get_acpi_rsdp(acpi_physical_address *rsdp_addr)
>> +{
>> + char *args = (char *)get_cmd_line_ptr();
>> + size_t len = strlen((char *)args);
>> + char *tmp_cmdline, *param, *val;
>> + unsigned long long addr = 0;
>> + char *endptr;
>> +
>> + if (!strstr(args, "acpi_rsdp="))
>> + return false;
>> +
>> + tmp_cmdline = malloc(len+1);
>> + if (!tmp_cmdline)
>> + error("Failed to allocate space for tmp_cmdline");
>
>Why do you even need to allocate a tmp cmdline?
>
>Ah, I see what you've done - you've copied handle_mem_options() in
>kaslr.c. Well no, not really.
>
>That functionality needs to get extracted into a separate facility. Oh
>look, there's arch/x86/boot/compressed/cmdline.c which is begging to get
>extended.
>
>:-)
>
Hi Boris,
Sorry for disturbing you again, I want to make sure this detail with you.
You mean that I need splite this as a function and put it to
cmdline.c, right?
If my understand is wrong, please let me know.
Thanks,
Chao Fan
On Tue, Oct 16, 2018 at 10:48:44AM +0800, Chao Fan wrote:
> Sorry for disturbing you again, I want to make sure this detail with you.
> You mean that I need splite this as a function and put it to
> cmdline.c, right?
Extract that functionality into a generic helper so that
handle_mem_options() and your get_acpi_rsdp() can call it instead
of duplicating the code. Also, why aren't they both using
cmdline_find_option() directly?
If something's missing, extend cmdline_find_option() to serve your
purposes too instead of copying the same code.
Make more sense?
--
Regards/Gruss,
Boris.
Good mailing practices for 400: avoid top-posting and trim the reply.
Hi Boris, Baoquan and all,
I have created a RFC patch for adjust KASLR padding size.
This patch is based on Can's v8 patch [1/3], and the Can's patch
will be changed in the future, so this patch is just RFC.
Welcome to any comments and suggestions. Thanks!
---
arch/x86/boot/compressed/acpitb.c | 8 ++++++-
arch/x86/include/uapi/asm/bootparam.h | 3 ++-
arch/x86/mm/kaslr.c | 31 ++++++++++++++++++++++++++-
3 files changed, 39 insertions(+), 3 deletions(-)
diff --git a/arch/x86/boot/compressed/acpitb.c b/arch/x86/boot/compressed/acpitb.c
index 6b869e3f9780..1504b46f2a04 100644
--- a/arch/x86/boot/compressed/acpitb.c
+++ b/arch/x86/boot/compressed/acpitb.c
@@ -367,6 +367,7 @@ void get_immovable_mem(void)
struct acpi_subtable_header *table;
struct acpi_srat_mem_affinity *ma;
unsigned long table_end;
+ unsigned long long possible_addr, max_possible_addr = 0;
int i = 0;
if (!strstr(args, "movable_node") || strstr(args, "acpi=off"))
@@ -384,7 +385,11 @@ void get_immovable_mem(void)
while (((unsigned long)table) + table->length < table_end) {
if (table->type == 1) {
ma = (struct acpi_srat_mem_affinity *)table;
- if (!(ma->flags & ACPI_SRAT_MEM_HOT_PLUGGABLE)) {
+ if (ma->flags & ACPI_SRAT_MEM_HOT_PLUGGABLE) {
+ possible_addr = ma->base_address + ma->length;
+ if (possible_addr > max_possible_addr)
+ max_possible_addr = possible_addr;
+ } else {
immovable_mem[i].start = ma->base_address;
immovable_mem[i].size = ma->length;
i++;
@@ -397,6 +402,7 @@ void get_immovable_mem(void)
((unsigned long)table + table->length);
}
num_immovable_mem = i;
+ boot_params->possible_mem_addr = max_possible_addr;
}
#else
void get_immovable_mem(void)
diff --git a/arch/x86/include/uapi/asm/bootparam.h b/arch/x86/include/uapi/asm/bootparam.h
index a06cbf019744..c987c725e93a 100644
--- a/arch/x86/include/uapi/asm/bootparam.h
+++ b/arch/x86/include/uapi/asm/bootparam.h
@@ -191,7 +191,8 @@ struct boot_params {
__u8 _pad7[0x290-0x1f1-sizeof(struct setup_header)];
__u32 edd_mbr_sig_buffer[EDD_MBR_SIG_MAX]; /* 0x290 */
struct boot_e820_entry e820_table[E820_MAX_ENTRIES_ZEROPAGE]; /* 0x2d0 */
- __u8 _pad8[48]; /* 0xcd0 */
+ __u8 _pad8[40]; /* 0xcd0 */
+ __u64 possible_mem_addr; /* 0xcf8 */
struct edd_info eddbuf[EDDMAXNR]; /* 0xd00 */
__u8 _pad9[276]; /* 0xeec */
} __attribute__((packed));
diff --git a/arch/x86/mm/kaslr.c b/arch/x86/mm/kaslr.c
index 61db77b0eda9..8c5aca792b51 100644
--- a/arch/x86/mm/kaslr.c
+++ b/arch/x86/mm/kaslr.c
@@ -69,6 +69,35 @@ static inline bool kaslr_memory_enabled(void)
return kaslr_enabled() && !IS_ENABLED(CONFIG_KASAN);
}
+#ifdef CONFIG_MEMORY_HOTPLUG
+static unsigned int __init kaslr_padding(void)
+{
+ unsigned long long max_possible_phys, max_actual_phys, threshold;
+ unsigned int rand_mem_physical_padding =
+ CONFIG_RANDOMIZE_MEMORY_PHYSICAL_PADDING;
+
+ if (!boot_params.possible_mem_addr)
+ goto out;
+
+ max_actual_phys = roundup(PFN_PHYS(max_pfn), 1ULL << TB_SHIFT);
+ max_possible_phys = roundup(boot_params.possible_mem_addr,
+ 1ULL << TB_SHIFT);
+ threshold = max_actual_phys +
+ ((unsigned long long)rand_mem_physical_padding << TB_SHIFT);
+
+ if (max_possible_phys > threshold)
+ rand_mem_physical_padding =
+ (max_possible_phys - max_actual_phys) >> TB_SHIFT;
+out:
+ return rand_mem_physical_padding;
+}
+#else
+static unsigned int __init kaslr_padding(void)
+{
+ return CONFIG_RANDOMIZE_MEMORY_PHYSICAL_PADDING;
+}
+#endif
+
/* Initialize base and padding for each memory region randomized with KASLR */
void __init kernel_randomize_memory(void)
{
@@ -102,7 +131,7 @@ void __init kernel_randomize_memory(void)
*/
BUG_ON(kaslr_regions[0].base != &page_offset_base);
memory_tb = DIV_ROUND_UP(max_pfn << PAGE_SHIFT, 1UL << TB_SHIFT) +
- CONFIG_RANDOMIZE_MEMORY_PHYSICAL_PADDING;
+ kaslr_padding();
/* Adapt phyiscal memory region size based on available memory */
if (memory_tb < kaslr_regions[0].size_tb)
--
2.18.0
- Masa
On Tue, Oct 16, 2018 at 11:13:54AM -0400, Masayoshi Mizuma wrote:
> Hi Boris, Baoquan and all,
>
> I have created a RFC patch for adjust KASLR padding size.
> This patch is based on Can's v8 patch [1/3], and the Can's patch
> will be changed in the future, so this patch is just RFC.
>
> Welcome to any comments and suggestions. Thanks!
...
> diff --git a/arch/x86/include/uapi/asm/bootparam.h b/arch/x86/include/uapi/asm/bootparam.h
> index a06cbf019744..c987c725e93a 100644
> --- a/arch/x86/include/uapi/asm/bootparam.h
> +++ b/arch/x86/include/uapi/asm/bootparam.h
> @@ -191,7 +191,8 @@ struct boot_params {
> __u8 _pad7[0x290-0x1f1-sizeof(struct setup_header)];
> __u32 edd_mbr_sig_buffer[EDD_MBR_SIG_MAX]; /* 0x290 */
> struct boot_e820_entry e820_table[E820_MAX_ENTRIES_ZEROPAGE]; /* 0x2d0 */
> - __u8 _pad8[48]; /* 0xcd0 */
> + __u8 _pad8[40]; /* 0xcd0 */
> + __u64 possible_mem_addr; /* 0xcf8 */
Where in the example I gave you with add_e820ext() do you see members
being added to struct boot_params?
Take a look at it again pls.
--
Regards/Gruss,
Boris.
Good mailing practices for 400: avoid top-posting and trim the reply.
On Tue, Oct 16, 2018 at 09:11:14PM +0200, Borislav Petkov wrote:
> On Tue, Oct 16, 2018 at 11:13:54AM -0400, Masayoshi Mizuma wrote:
> > Hi Boris, Baoquan and all,
> >
> > I have created a RFC patch for adjust KASLR padding size.
> > This patch is based on Can's v8 patch [1/3], and the Can's patch
> > will be changed in the future, so this patch is just RFC.
> >
> > Welcome to any comments and suggestions. Thanks!
>
> ...
>
> > diff --git a/arch/x86/include/uapi/asm/bootparam.h b/arch/x86/include/uapi/asm/bootparam.h
> > index a06cbf019744..c987c725e93a 100644
> > --- a/arch/x86/include/uapi/asm/bootparam.h
> > +++ b/arch/x86/include/uapi/asm/bootparam.h
> > @@ -191,7 +191,8 @@ struct boot_params {
> > __u8 _pad7[0x290-0x1f1-sizeof(struct setup_header)];
> > __u32 edd_mbr_sig_buffer[EDD_MBR_SIG_MAX]; /* 0x290 */
> > struct boot_e820_entry e820_table[E820_MAX_ENTRIES_ZEROPAGE]; /* 0x2d0 */
> > - __u8 _pad8[48]; /* 0xcd0 */
> > + __u8 _pad8[40]; /* 0xcd0 */
> > + __u64 possible_mem_addr; /* 0xcf8 */
>
> Where in the example I gave you with add_e820ext() do you see members
> being added to struct boot_params?
>
> Take a look at it again pls.
Ah, sorry, I misunderstood your suggetion...
In parse_setup_data(), the data is picked up from boot_params,
so I thought it is also useful for this sitiation.
I'll try again, thanks!
- Masa
On Tue, Oct 16, 2018 at 03:54:30PM -0400, Masayoshi Mizuma wrote:
> Ah, sorry, I misunderstood your suggetion...
> In parse_setup_data(), the data is picked up from boot_params,
Yes, this is the pointer to the setup_data linked list head. See
Documentation/ABI/testing/sysfs-kernel-boot_params
Documentation/x86/boot.txt
for more information and basically grep the tree for examples.
--
Regards/Gruss,
Boris.
Good mailing practices for 400: avoid top-posting and trim the reply.
On Tue, Oct 16, 2018 at 02:40:36PM +0200, Borislav Petkov wrote:
>On Tue, Oct 16, 2018 at 10:48:44AM +0800, Chao Fan wrote:
>> Sorry for disturbing you again, I want to make sure this detail with you.
>> You mean that I need splite this as a function and put it to
>> cmdline.c, right?
>
>Extract that functionality into a generic helper so that
>handle_mem_options() and your get_acpi_rsdp() can call it instead
>of duplicating the code. Also, why aren't they both using
>cmdline_find_option() directly?
Yes, I found cmdline_find_option() is enough for me.
Thanks,
Chao Fan
>
>If something's missing, extend cmdline_find_option() to serve your
>purposes too instead of copying the same code.
>
>Make more sense?
>
>--
>Regards/Gruss,
> Boris.
>
>Good mailing practices for 400: avoid top-posting and trim the reply.
>
>
Hi Boris,
On Tue, Oct 16, 2018 at 09:59:02PM +0200, Borislav Petkov wrote:
> On Tue, Oct 16, 2018 at 03:54:30PM -0400, Masayoshi Mizuma wrote:
> > Ah, sorry, I misunderstood your suggetion...
> > In parse_setup_data(), the data is picked up from boot_params,
>
> Yes, this is the pointer to the setup_data linked list head. See
>
> Documentation/ABI/testing/sysfs-kernel-boot_params
> Documentation/x86/boot.txt
>
> for more information and basically grep the tree for examples.
I'm trying to store the SRAT info and pass it to kernel_randomize_memory(),
looks like add_e820ext()/parse_setup_data().
Is the approach useful only EFI environment? I'm not sure how I
allocate memory to store the data in legacy bios environment...
On EFI, I can use efi_call_early(allocate_pool, EFI_LOADER_DATA, ...).
Am I missing something? I would appreciate if you could help my
understanding.
Following patch is a prototype for EFI enviromnent.
---
arch/x86/boot/compressed/acpitb.c | 23 ++++++++++-
arch/x86/boot/compressed/eboot.c | 36 +++++++++++++++++
arch/x86/include/uapi/asm/bootparam.h | 1 +
arch/x86/mm/kaslr.c | 58 ++++++++++++++++++++++++++-
4 files changed, 116 insertions(+), 2 deletions(-)
diff --git a/arch/x86/boot/compressed/acpitb.c b/arch/x86/boot/compressed/acpitb.c
index d119663..b79560c 100644
--- a/arch/x86/boot/compressed/acpitb.c
+++ b/arch/x86/boot/compressed/acpitb.c
@@ -309,6 +309,20 @@ static struct acpi_table_header *get_acpi_srat_table(void)
return NULL;
}
+static void store_possible_addr(unsigned long long possible)
+{
+ struct setup_data *data;
+
+ data = (struct setup_data *)(unsigned long)boot_params->hdr.setup_data;
+ while (data) {
+ if (data->type == SETUP_KASLR) {
+ *(unsigned long long *)data->data = possible;
+ return;
+ }
+ data = (struct setup_data *)(unsigned long)data->next;
+ }
+}
+
/*
* According to ACPI table, filter the immvoable memory regions
* and store them in immovable_mem[].
@@ -319,6 +333,7 @@ void get_immovable_mem(void)
struct acpi_subtable_header *table;
struct acpi_srat_mem_affinity *ma;
unsigned long table_end;
+ unsigned long long possible_addr, max_possible_addr = 0;
int i = 0;
if (!cmdline_find_option_bool("movable_node") ||
@@ -338,7 +353,12 @@ void get_immovable_mem(void)
sizeof(struct acpi_subtable_header) < table_end) {
if (table->type == ACPI_SRAT_TYPE_MEMORY_AFFINITY) {
ma = (struct acpi_srat_mem_affinity *)table;
- if (!(ma->flags & ACPI_SRAT_MEM_HOT_PLUGGABLE)) {
+
+ if (ma->flags & ACPI_SRAT_MEM_HOT_PLUGGABLE) {
+ possible_addr = ma->base_address + ma->length;
+ if (possible_addr > max_possible_addr)
+ max_possible_addr = possible_addr;
+ } else {
immovable_mem[i].start = ma->base_address;
immovable_mem[i].size = ma->length;
i++;
@@ -351,4 +371,5 @@ void get_immovable_mem(void)
((unsigned long)table + table->length);
}
num_immovable_mem = i;
+ store_possible_addr(max_possible_addr);
}
diff --git a/arch/x86/boot/compressed/eboot.c b/arch/x86/boot/compressed/eboot.c
index 1458b17..9b95fba 100644
--- a/arch/x86/boot/compressed/eboot.c
+++ b/arch/x86/boot/compressed/eboot.c
@@ -192,6 +192,40 @@ static void setup_efi_pci(struct boot_params *params)
efi_call_early(free_pool, pci_handle);
}
+#ifdef CONFIG_RANDOMIZE_MEMORY
+static void setup_kaslr(struct boot_params *params)
+{
+ struct setup_data *kaslr_data = NULL;
+ struct setup_data *data;
+ unsigned long size;
+ efi_status_t status;
+
+ size = sizeof(struct setup_data) + sizeof(unsigned long long);
+
+ status = efi_call_early(allocate_pool, EFI_LOADER_DATA,
+ size, (void **)&kaslr_data);
+ if (status != EFI_SUCCESS) {
+ efi_printk(sys_table, "Failed to allocate memory for 'kaslr_data'\n");
+ return;
+ }
+
+ kaslr_data->type = SETUP_KASLR;
+ kaslr_data->next = 0;
+ kaslr_data->len = size;
+
+ data = (struct setup_data *)(unsigned long)params->hdr.setup_data;
+ if (data)
+ data->next = (unsigned long)kaslr_data;
+ else {
+ while (data->next)
+ data = (struct setup_data *)(unsigned long)data->next;
+ data->next = (unsigned long)kaslr_data;
+ }
+}
+#else
+static void setup_kaslr(struct boot_params *params) {}
+#endif
+
static void retrieve_apple_device_properties(struct boot_params *boot_params)
{
efi_guid_t guid = APPLE_PROPERTIES_PROTOCOL_GUID;
@@ -770,6 +804,8 @@ efi_main(struct efi_config *c, struct boot_params *boot_params)
setup_efi_pci(boot_params);
+ setup_kaslr(boot_params);
+
setup_quirks(boot_params);
status = efi_call_early(allocate_pool, EFI_LOADER_DATA,
diff --git a/arch/x86/include/uapi/asm/bootparam.h b/arch/x86/include/uapi/asm/bootparam.h
index a06cbf0..0a44d83 100644
--- a/arch/x86/include/uapi/asm/bootparam.h
+++ b/arch/x86/include/uapi/asm/bootparam.h
@@ -10,6 +10,7 @@
#define SETUP_EFI 4
#define SETUP_APPLE_PROPERTIES 5
#define SETUP_JAILHOUSE 6
+#define SETUP_KASLR 7
/* ram_size flags */
#define RAMDISK_IMAGE_START_MASK 0x07FF
diff --git a/arch/x86/mm/kaslr.c b/arch/x86/mm/kaslr.c
index 61db77b..6f91cf4 100644
--- a/arch/x86/mm/kaslr.c
+++ b/arch/x86/mm/kaslr.c
@@ -28,6 +28,7 @@
#include <asm/pgtable.h>
#include <asm/setup.h>
#include <asm/kaslr.h>
+#include <linux/io.h>
#include "mm_internal.h"
@@ -69,6 +70,61 @@ static inline bool kaslr_memory_enabled(void)
return kaslr_enabled() && !IS_ENABLED(CONFIG_KASAN);
}
+#ifdef CONFIG_MEMORY_HOTPLUG
+static unsigned long long __init get_max_possible_addr(void)
+{
+ struct setup_data *data;
+ u64 pa_data;
+ unsigned long long max = 0;
+
+ pa_data = boot_params.hdr.setup_data;
+ while (pa_data) {
+ data = early_memremap(pa_data, sizeof(*data));
+ if (!data)
+ return 0;
+
+ if (data->type == SETUP_KASLR) {
+ max = *(unsigned long long *)data->data;
+ early_memunmap(data, sizeof(*data));
+ return max;
+ }
+ pa_data = data->next;
+ early_memunmap(data, sizeof(*data));
+ }
+
+ return max;
+}
+
+static unsigned int __init kaslr_padding(void)
+{
+ unsigned long long max_possible_addr;
+ unsigned long long max_possible_phys, max_actual_phys, threshold;
+ unsigned int rand_mem_physical_padding =
+ CONFIG_RANDOMIZE_MEMORY_PHYSICAL_PADDING;
+
+ max_possible_addr = get_max_possible_addr();
+ if (!max_possible_addr)
+ goto out;
+
+ max_actual_phys = roundup(PFN_PHYS(max_pfn), 1ULL << TB_SHIFT);
+ max_possible_phys = roundup(max_possible_addr,
+ 1ULL << TB_SHIFT);
+ threshold = max_actual_phys +
+ ((unsigned long long)rand_mem_physical_padding << TB_SHIFT);
+
+ if (max_possible_phys > threshold)
+ rand_mem_physical_padding =
+ (max_possible_phys - max_actual_phys) >> TB_SHIFT;
+out:
+ return rand_mem_physical_padding;
+}
+#else
+static unsigned int __init kaslr_padding(void)
+{
+ return CONFIG_RANDOMIZE_MEMORY_PHYSICAL_PADDING;
+}
+#endif
+
/* Initialize base and padding for each memory region randomized with KASLR */
void __init kernel_randomize_memory(void)
{
@@ -102,7 +158,7 @@ void __init kernel_randomize_memory(void)
*/
BUG_ON(kaslr_regions[0].base != &page_offset_base);
memory_tb = DIV_ROUND_UP(max_pfn << PAGE_SHIFT, 1UL << TB_SHIFT) +
- CONFIG_RANDOMIZE_MEMORY_PHYSICAL_PADDING;
+ kaslr_padding();
/* Adapt phyiscal memory region size based on available memory */
if (memory_tb < kaslr_regions[0].size_tb)
--
Thanks!
Masa
On Mon, Oct 22, 2018 at 11:42:05AM -0400, Masayoshi Mizuma wrote:
>Hi Boris,
Hi Mizuma-san,
I have several questions:
>+static void store_possible_addr(unsigned long long possible)
>+{
>+ struct setup_data *data;
>+
>+ data = (struct setup_data *)(unsigned long)boot_params->hdr.setup_data;
I suggest you add check:
if (!data) {
debug_putstr("No setup_data found.\n");
return;
}
>+ while (data) {
>+ if (data->type == SETUP_KASLR) {
>+ *(unsigned long long *)data->data = possible;
>+ return;
>+ }
>+ data = (struct setup_data *)(unsigned long)data->next;
>+ }
>+}
>+
> /*
> * According to ACPI table, filter the immvoable memory regions
> * and store them in immovable_mem[].
>@@ -319,6 +333,7 @@ void get_immovable_mem(void)
> struct acpi_subtable_header *table;
> struct acpi_srat_mem_affinity *ma;
> unsigned long table_end;
>+ unsigned long long possible_addr, max_possible_addr = 0;
> int i = 0;
>
> if (!cmdline_find_option_bool("movable_node") ||
>@@ -338,7 +353,12 @@ void get_immovable_mem(void)
> sizeof(struct acpi_subtable_header) < table_end) {
> if (table->type == ACPI_SRAT_TYPE_MEMORY_AFFINITY) {
> ma = (struct acpi_srat_mem_affinity *)table;
>- if (!(ma->flags & ACPI_SRAT_MEM_HOT_PLUGGABLE)) {
>+
>+ if (ma->flags & ACPI_SRAT_MEM_HOT_PLUGGABLE) {
>+ possible_addr = ma->base_address + ma->length;
>+ if (possible_addr > max_possible_addr)
>+ max_possible_addr = possible_addr;
>+ } else {
> immovable_mem[i].start = ma->base_address;
> immovable_mem[i].size = ma->length;
> i++;
>@@ -351,4 +371,5 @@ void get_immovable_mem(void)
> ((unsigned long)table + table->length);
> }
> num_immovable_mem = i;
>+ store_possible_addr(max_possible_addr);
> }
>diff --git a/arch/x86/boot/compressed/eboot.c b/arch/x86/boot/compressed/eboot.c
>index 1458b17..9b95fba 100644
>--- a/arch/x86/boot/compressed/eboot.c
>+++ b/arch/x86/boot/compressed/eboot.c
>@@ -192,6 +192,40 @@ static void setup_efi_pci(struct boot_params *params)
> efi_call_early(free_pool, pci_handle);
> }
>
>+#ifdef CONFIG_RANDOMIZE_MEMORY
>+static void setup_kaslr(struct boot_params *params)
>+{
>+ struct setup_data *kaslr_data = NULL;
>+ struct setup_data *data;
>+ unsigned long size;
>+ efi_status_t status;
>+
>+ size = sizeof(struct setup_data) + sizeof(unsigned long long);
>+
>+ status = efi_call_early(allocate_pool, EFI_LOADER_DATA,
>+ size, (void **)&kaslr_data);
>+ if (status != EFI_SUCCESS) {
>+ efi_printk(sys_table, "Failed to allocate memory for 'kaslr_data'\n");
>+ return;
>+ }
>+
>+ kaslr_data->type = SETUP_KASLR;
>+ kaslr_data->next = 0;
>+ kaslr_data->len = size;
>+
>+ data = (struct setup_data *)(unsigned long)params->hdr.setup_data;
>+ if (data)
>+ data->next = (unsigned long)kaslr_data;
Why just put the kaslr_data in data->next. You can't make sure
data->next was NULL.
>+ else {
If data is NULL, go to this else{}, so these two lines below work?
>+ while (data->next)
>+ data = (struct setup_data *)(unsigned long)data->next;
>+ data->next = (unsigned long)kaslr_data;
>+ }
If my understanding is not wrong, it should be:
data = (struct setup_data *)(unsigned long)params->hdr.setup_data;
if (!data)
params->hdr.setup_data = (unsigned long)kaslr_data;
else {
while (data->next)
data = (struct setup_data *)(unsigned long)data->next;
data->next = (unsigned long)kaslr_data;
}
If I misunderstand something, please tell me.
Thanks,
Chao Fan
On Tue, Oct 23, 2018 at 10:48:02AM +0800, Chao Fan wrote:
> On Mon, Oct 22, 2018 at 11:42:05AM -0400, Masayoshi Mizuma wrote:
> >Hi Boris,
>
> Hi Mizuma-san,
>
> I have several questions:
Thank you for your comments! I think your suggestions are
right.
However, the prototype patch works EFI environment only.
The memory hot-plug affinity in SRAT and KASLR are also available
on legacy BIOS environment, so I need to get the patch useful
for legacy BIOS as well, but I have no idea to add such things...
If you have ideas, could you let me know?
Probably I should have another idea, for example,
add the SRAT parsing code, looks like you are adding to
arch/x86/boot/compressed/acpitb.c, to arch/x86/mm/kaslr.c.
Thanks,
Masa
>
> >+static void store_possible_addr(unsigned long long possible)
> >+{
> >+ struct setup_data *data;
> >+
> >+ data = (struct setup_data *)(unsigned long)boot_params->hdr.setup_data;
> I suggest you add check:
>
> if (!data) {
> debug_putstr("No setup_data found.\n");
> return;
> }
>
> >+ while (data) {
> >+ if (data->type == SETUP_KASLR) {
> >+ *(unsigned long long *)data->data = possible;
> >+ return;
> >+ }
> >+ data = (struct setup_data *)(unsigned long)data->next;
> >+ }
> >+}
> >+
> > /*
> > * According to ACPI table, filter the immvoable memory regions
> > * and store them in immovable_mem[].
> >@@ -319,6 +333,7 @@ void get_immovable_mem(void)
> > struct acpi_subtable_header *table;
> > struct acpi_srat_mem_affinity *ma;
> > unsigned long table_end;
> >+ unsigned long long possible_addr, max_possible_addr = 0;
> > int i = 0;
> >
> > if (!cmdline_find_option_bool("movable_node") ||
> >@@ -338,7 +353,12 @@ void get_immovable_mem(void)
> > sizeof(struct acpi_subtable_header) < table_end) {
> > if (table->type == ACPI_SRAT_TYPE_MEMORY_AFFINITY) {
> > ma = (struct acpi_srat_mem_affinity *)table;
> >- if (!(ma->flags & ACPI_SRAT_MEM_HOT_PLUGGABLE)) {
> >+
> >+ if (ma->flags & ACPI_SRAT_MEM_HOT_PLUGGABLE) {
> >+ possible_addr = ma->base_address + ma->length;
> >+ if (possible_addr > max_possible_addr)
> >+ max_possible_addr = possible_addr;
> >+ } else {
> > immovable_mem[i].start = ma->base_address;
> > immovable_mem[i].size = ma->length;
> > i++;
> >@@ -351,4 +371,5 @@ void get_immovable_mem(void)
> > ((unsigned long)table + table->length);
> > }
> > num_immovable_mem = i;
> >+ store_possible_addr(max_possible_addr);
> > }
> >diff --git a/arch/x86/boot/compressed/eboot.c b/arch/x86/boot/compressed/eboot.c
> >index 1458b17..9b95fba 100644
> >--- a/arch/x86/boot/compressed/eboot.c
> >+++ b/arch/x86/boot/compressed/eboot.c
> >@@ -192,6 +192,40 @@ static void setup_efi_pci(struct boot_params *params)
> > efi_call_early(free_pool, pci_handle);
> > }
> >
> >+#ifdef CONFIG_RANDOMIZE_MEMORY
> >+static void setup_kaslr(struct boot_params *params)
> >+{
> >+ struct setup_data *kaslr_data = NULL;
> >+ struct setup_data *data;
> >+ unsigned long size;
> >+ efi_status_t status;
> >+
> >+ size = sizeof(struct setup_data) + sizeof(unsigned long long);
> >+
> >+ status = efi_call_early(allocate_pool, EFI_LOADER_DATA,
> >+ size, (void **)&kaslr_data);
> >+ if (status != EFI_SUCCESS) {
> >+ efi_printk(sys_table, "Failed to allocate memory for 'kaslr_data'\n");
> >+ return;
> >+ }
> >+
> >+ kaslr_data->type = SETUP_KASLR;
> >+ kaslr_data->next = 0;
> >+ kaslr_data->len = size;
> >+
> >+ data = (struct setup_data *)(unsigned long)params->hdr.setup_data;
> >+ if (data)
> >+ data->next = (unsigned long)kaslr_data;
> Why just put the kaslr_data in data->next. You can't make sure
> data->next was NULL.
> >+ else {
> If data is NULL, go to this else{}, so these two lines below work?
> >+ while (data->next)
> >+ data = (struct setup_data *)(unsigned long)data->next;
> >+ data->next = (unsigned long)kaslr_data;
> >+ }
> If my understanding is not wrong, it should be:
>
> data = (struct setup_data *)(unsigned long)params->hdr.setup_data;
> if (!data)
> params->hdr.setup_data = (unsigned long)kaslr_data;
> else {
> while (data->next)
> data = (struct setup_data *)(unsigned long)data->next;
> data->next = (unsigned long)kaslr_data;
> }
>
> If I misunderstand something, please tell me.
>
> Thanks,
> Chao Fan
>
>
On Wed, Oct 24, 2018 at 03:21:36PM -0400, Masayoshi Mizuma wrote:
>On Tue, Oct 23, 2018 at 10:48:02AM +0800, Chao Fan wrote:
>> On Mon, Oct 22, 2018 at 11:42:05AM -0400, Masayoshi Mizuma wrote:
>> >Hi Boris,
>>
>> Hi Mizuma-san,
>>
>> I have several questions:
>
>Thank you for your comments! I think your suggestions are
>right.
>However, the prototype patch works EFI environment only.
Yes, I agree. But I think this method is much better than
adding code to arch/x86/mm/kaslr.c.
>The memory hot-plug affinity in SRAT and KASLR are also available
>on legacy BIOS environment, so I need to get the patch useful
>for legacy BIOS as well, but I have no idea to add such things...
>If you have ideas, could you let me know?
I have no idea. I will work on it, try to help figure out the BIOS issue.
Thanks,
Chao Fan
>
>Probably I should have another idea, for example,
>add the SRAT parsing code, looks like you are adding to
>arch/x86/boot/compressed/acpitb.c, to arch/x86/mm/kaslr.c.
>
>Thanks,
>Masa
>
>>
>> >+static void store_possible_addr(unsigned long long possible)
>> >+{
>> >+ struct setup_data *data;
>> >+
>> >+ data = (struct setup_data *)(unsigned long)boot_params->hdr.setup_data;
>> I suggest you add check:
>>
>> if (!data) {
>> debug_putstr("No setup_data found.\n");
>> return;
>> }
>>
>> >+ while (data) {
>> >+ if (data->type == SETUP_KASLR) {
>> >+ *(unsigned long long *)data->data = possible;
>> >+ return;
>> >+ }
>> >+ data = (struct setup_data *)(unsigned long)data->next;
>> >+ }
>> >+}
>> >+
>> > /*
>> > * According to ACPI table, filter the immvoable memory regions
>> > * and store them in immovable_mem[].
>> >@@ -319,6 +333,7 @@ void get_immovable_mem(void)
>> > struct acpi_subtable_header *table;
>> > struct acpi_srat_mem_affinity *ma;
>> > unsigned long table_end;
>> >+ unsigned long long possible_addr, max_possible_addr = 0;
>> > int i = 0;
>> >
>> > if (!cmdline_find_option_bool("movable_node") ||
>> >@@ -338,7 +353,12 @@ void get_immovable_mem(void)
>> > sizeof(struct acpi_subtable_header) < table_end) {
>> > if (table->type == ACPI_SRAT_TYPE_MEMORY_AFFINITY) {
>> > ma = (struct acpi_srat_mem_affinity *)table;
>> >- if (!(ma->flags & ACPI_SRAT_MEM_HOT_PLUGGABLE)) {
>> >+
>> >+ if (ma->flags & ACPI_SRAT_MEM_HOT_PLUGGABLE) {
>> >+ possible_addr = ma->base_address + ma->length;
>> >+ if (possible_addr > max_possible_addr)
>> >+ max_possible_addr = possible_addr;
>> >+ } else {
>> > immovable_mem[i].start = ma->base_address;
>> > immovable_mem[i].size = ma->length;
>> > i++;
>> >@@ -351,4 +371,5 @@ void get_immovable_mem(void)
>> > ((unsigned long)table + table->length);
>> > }
>> > num_immovable_mem = i;
>> >+ store_possible_addr(max_possible_addr);
>> > }
>> >diff --git a/arch/x86/boot/compressed/eboot.c b/arch/x86/boot/compressed/eboot.c
>> >index 1458b17..9b95fba 100644
>> >--- a/arch/x86/boot/compressed/eboot.c
>> >+++ b/arch/x86/boot/compressed/eboot.c
>> >@@ -192,6 +192,40 @@ static void setup_efi_pci(struct boot_params *params)
>> > efi_call_early(free_pool, pci_handle);
>> > }
>> >
>> >+#ifdef CONFIG_RANDOMIZE_MEMORY
>> >+static void setup_kaslr(struct boot_params *params)
>> >+{
>> >+ struct setup_data *kaslr_data = NULL;
>> >+ struct setup_data *data;
>> >+ unsigned long size;
>> >+ efi_status_t status;
>> >+
>> >+ size = sizeof(struct setup_data) + sizeof(unsigned long long);
>> >+
>> >+ status = efi_call_early(allocate_pool, EFI_LOADER_DATA,
>> >+ size, (void **)&kaslr_data);
>> >+ if (status != EFI_SUCCESS) {
>> >+ efi_printk(sys_table, "Failed to allocate memory for 'kaslr_data'\n");
>> >+ return;
>> >+ }
>> >+
>> >+ kaslr_data->type = SETUP_KASLR;
>> >+ kaslr_data->next = 0;
>> >+ kaslr_data->len = size;
>> >+
>> >+ data = (struct setup_data *)(unsigned long)params->hdr.setup_data;
>> >+ if (data)
>> >+ data->next = (unsigned long)kaslr_data;
>> Why just put the kaslr_data in data->next. You can't make sure
>> data->next was NULL.
>> >+ else {
>> If data is NULL, go to this else{}, so these two lines below work?
>> >+ while (data->next)
>> >+ data = (struct setup_data *)(unsigned long)data->next;
>> >+ data->next = (unsigned long)kaslr_data;
>> >+ }
>> If my understanding is not wrong, it should be:
>>
>> data = (struct setup_data *)(unsigned long)params->hdr.setup_data;
>> if (!data)
>> params->hdr.setup_data = (unsigned long)kaslr_data;
>> else {
>> while (data->next)
>> data = (struct setup_data *)(unsigned long)data->next;
>> data->next = (unsigned long)kaslr_data;
>> }
>>
>> If I misunderstand something, please tell me.
>>
>> Thanks,
>> Chao Fan
>>
>>
>
>
On Mon, Oct 22, 2018 at 11:42:05AM -0400, Masayoshi Mizuma wrote:
> I'm trying to store the SRAT info and pass it to kernel_randomize_memory(),
> looks like add_e820ext()/parse_setup_data().
>
> Is the approach useful only EFI environment? I'm not sure how I
Does it matter for non-EFI even?
I mean, you want this code only for your use case - when you have
movable memory and you're doing kexec, yes?
And those machines are all EFI boxes I'd assume...
--
Regards/Gruss,
Boris.
ECO tip #101: Trim your mails when you reply.
--
On Thu, Oct 25, 2018 at 12:33:45PM +0200, Borislav Petkov wrote:
> On Mon, Oct 22, 2018 at 11:42:05AM -0400, Masayoshi Mizuma wrote:
> > I'm trying to store the SRAT info and pass it to kernel_randomize_memory(),
> > looks like add_e820ext()/parse_setup_data().
> >
> > Is the approach useful only EFI environment? I'm not sure how I
>
> Does it matter for non-EFI even?
>
> I mean, you want this code only for your use case - when you have
> movable memory and you're doing kexec, yes?
>
> And those machines are all EFI boxes I'd assume...
My actual use case is for EFI boxes, however, I think it's better to useful
for legacy BIOS as well because memory hot-plug affinity in SRAT and KASLR
are available on legacy BIOS.
Actually, we can create such environment in qemu.
I have another idea to solve this issue. Adding a SRAT parsing code
to arch/x86/mm/kaslr.c. It is useful for both EFI and BIOS and
also we don't need a new kernel parameter...
Dose the idea make sense?
Thanks,
Masa
On Thu, Oct 25, 2018 at 09:40:51AM -0400, Masayoshi Mizuma wrote:
> My actual use case is for EFI boxes, however, I think it's better to useful
> for legacy BIOS as well because memory hot-plug affinity in SRAT and KASLR
> are available on legacy BIOS.
> Actually, we can create such environment in qemu.
Ah, right, qemu. :)
> I have another idea to solve this issue. Adding a SRAT parsing code
> to arch/x86/mm/kaslr.c. It is useful for both EFI and BIOS and
> also we don't need a new kernel parameter...
> Dose the idea make sense?
The more automatic stuff we do and we don't have to involve the user,
the better.
However, lemme look at Chao's current patchset first - we should not go
nuts by putting SRAT parsing everywhere :)
--
Regards/Gruss,
Boris.
Good mailing practices for 400: avoid top-posting and trim the reply.
On 11/06/18 at 01:10pm, Borislav Petkov wrote:
> > I have another idea to solve this issue. Adding a SRAT parsing code
> > to arch/x86/mm/kaslr.c. It is useful for both EFI and BIOS and
> > also we don't need a new kernel parameter...
> > Dose the idea make sense?
>
> The more automatic stuff we do and we don't have to involve the user,
> the better.
>
> However, lemme look at Chao's current patchset first - we should not go
> nuts by putting SRAT parsing everywhere :)
arch/x86/mm/ident_map.c is a good example, it's shared between
arch/x86/boot/compressed and arch/x86/mm/init_64.c
On Thu, Oct 25, 2018 at 09:40:51AM -0400, Masayoshi Mizuma wrote:
> I have another idea to solve this issue. Adding a SRAT parsing code
> to arch/x86/mm/kaslr.c. It is useful for both EFI and BIOS and
> also we don't need a new kernel parameter...
> Dose the idea make sense?
Ok, having swapped the whole thing back into my brain, forget what I
said earlier today.
Didn't we talk about passing info with setup_data to the later kernel
stage? You even had a patch:
https://lkml.kernel.org/r/20181022154204.kagmdb55jtoez4ca@gabell
So what is that "idea" again about adding SRAT parsing code to
arch/x86/mm/kaslr.c?!?!
The intent of passing info with setup_data is to *avoid* parsing SRAT
yet another time and duplicating that code one more time.
IOW:
* The first place that needs SRAT parsing, does the parsing - i.e., the
compressed stage.
* Later stages get information passed to them with setup_data. No second
parsing.
Ok?
--
Regards/Gruss,
Boris.
Good mailing practices for 400: avoid top-posting and trim the reply.
On Tue, Nov 06, 2018 at 07:45:19PM +0100, Borislav Petkov wrote:
> Ok, having swapped the whole thing back into my brain, forget what I
> said earlier today.
>
> Didn't we talk about passing info with setup_data to the later kernel
> stage? You even had a patch:
>
> https://lkml.kernel.org/r/20181022154204.kagmdb55jtoez4ca@gabell
>
> So what is that "idea" again about adding SRAT parsing code to
> arch/x86/mm/kaslr.c?!?!
>
> The intent of passing info with setup_data is to *avoid* parsing SRAT
> yet another time and duplicating that code one more time.
>
> IOW:
>
> * The first place that needs SRAT parsing, does the parsing - i.e., the
> compressed stage.
>
> * Later stages get information passed to them with setup_data. No second
> parsing.
>
> Ok?
Yes, I think I can see what you are saying. However, I'm not sure how
I use the setup_data in legacy BIOS environment. So, I said the another
idea which adding the SRAT parsing code to arch/x86/mm/kaslr.c as well.
Yes, as you said, that is not so good...
I would appreciate if you could help to use setup_data or something to
pass the information to later kernel stage in BIOS environment.
Thanks,
Masa
On Tue, Nov 06, 2018 at 02:36:38PM -0500, Masayoshi Mizuma wrote:
> Yes, I think I can see what you are saying. However, I'm not sure how
> I use the setup_data in legacy BIOS environment.
What is "legacy BIOS environment" and what does that have to do with
setup_data?
--
Regards/Gruss,
Boris.
Good mailing practices for 400: avoid top-posting and trim the reply.
On Tue, Nov 06, 2018 at 09:45:11PM +0100, Borislav Petkov wrote:
> On Tue, Nov 06, 2018 at 02:36:38PM -0500, Masayoshi Mizuma wrote:
> > Yes, I think I can see what you are saying. However, I'm not sure how
> > I use the setup_data in legacy BIOS environment.
>
> What is "legacy BIOS environment" and what does that have to do with
> setup_data?
My proposed patch [1] is useful only for EFI environment. The patch
allocates a setup_date structure by efi_call_early() and store the
KASLR data into the memory area.
+static void setup_kaslr(struct boot_params *params)
+{
+ struct setup_data *kaslr_data = NULL;
+ struct setup_data *data;
+ unsigned long size;
+ efi_status_t status;
+
+ size = sizeof(struct setup_data) + sizeof(unsigned long long);
+
+ status = efi_call_early(allocate_pool, EFI_LOADER_DATA,
+ size, (void **)&kaslr_data);
I'm not sure how I allocate such memory on no EFI environment.
Am I missing something...?
[1] https://lkml.kernel.org/r/20181022154204.kagmdb55jtoez4ca@gabell
Thanks,
Masa
On Tue, Nov 06, 2018 at 10:07:31PM +0800, Baoquan He wrote:
>On 11/06/18 at 01:10pm, Borislav Petkov wrote:
>> > I have another idea to solve this issue. Adding a SRAT parsing code
>> > to arch/x86/mm/kaslr.c. It is useful for both EFI and BIOS and
>> > also we don't need a new kernel parameter...
>> > Dose the idea make sense?
>>
>> The more automatic stuff we do and we don't have to involve the user,
>> the better.
>>
>> However, lemme look at Chao's current patchset first - we should not go
>> nuts by putting SRAT parsing everywhere :)
>
>arch/x86/mm/ident_map.c is a good example, it's shared between
>arch/x86/boot/compressed and arch/x86/mm/init_64.c
Thanks to Baoquan, I think we can try this idea.
How about you, Masa?
Thanks,
Chao Fan
>
>
On Tue, Nov 06, 2018 at 05:21:34PM -0500, Masayoshi Mizuma wrote:
> On Tue, Nov 06, 2018 at 09:45:11PM +0100, Borislav Petkov wrote:
> > On Tue, Nov 06, 2018 at 02:36:38PM -0500, Masayoshi Mizuma wrote:
> > > Yes, I think I can see what you are saying. However, I'm not sure how
> > > I use the setup_data in legacy BIOS environment.
> >
> > What is "legacy BIOS environment" and what does that have to do with
> > setup_data?
>
> My proposed patch [1] is useful only for EFI environment. The patch
> allocates a setup_date structure by efi_call_early() and store the
> KASLR data into the memory area.
>
> +static void setup_kaslr(struct boot_params *params)
> +{
> + struct setup_data *kaslr_data = NULL;
> + struct setup_data *data;
> + unsigned long size;
> + efi_status_t status;
> +
> + size = sizeof(struct setup_data) + sizeof(unsigned long long);
> +
> + status = efi_call_early(allocate_pool, EFI_LOADER_DATA,
> + size, (void **)&kaslr_data);
>
> I'm not sure how I allocate such memory on no EFI environment.
A global definition which doesn't need allocation?
Maybe hpa would have another, better idea...
--
Regards/Gruss,
Boris.
Good mailing practices for 400: avoid top-posting and trim the reply.
On Thu, Nov 08, 2018 at 11:51:29AM +0100, Borislav Petkov wrote:
> A global definition which doesn't need allocation?
>
> Maybe hpa would have another, better idea...
...and he has: just put that address in a new field in struct
boot_params by converting one of the padding arrays there.
Don't forget to document it in Documentation/x86/zero-page.txt
This way you don't need any of the allocation fun or to use setup_data
at all.
HTH.
--
Regards/Gruss,
Boris.
Good mailing practices for 400: avoid top-posting and trim the reply.
On Sat, Nov 10, 2018 at 11:54:22AM +0100, Borislav Petkov wrote:
> On Thu, Nov 08, 2018 at 11:51:29AM +0100, Borislav Petkov wrote:
> > A global definition which doesn't need allocation?
> >
> > Maybe hpa would have another, better idea...
>
> ...and he has: just put that address in a new field in struct
> boot_params by converting one of the padding arrays there.
>
> Don't forget to document it in Documentation/x86/zero-page.txt
>
> This way you don't need any of the allocation fun or to use setup_data
> at all.
Thanks!
I have the prototype patch to use boot_params [1].
I will try to brush up it.
[1] https://lore.kernel.org/lkml/20181016151353.punyk7exekut2543@gabell
Thanks,
Masa
Hi Boris and all,
On Sun, Nov 11, 2018 at 08:45:57AM -0500, Masayoshi Mizuma wrote:
> On Sat, Nov 10, 2018 at 11:54:22AM +0100, Borislav Petkov wrote:
> > On Thu, Nov 08, 2018 at 11:51:29AM +0100, Borislav Petkov wrote:
> > > A global definition which doesn't need allocation?
> > >
> > > Maybe hpa would have another, better idea...
> >
> > ...and he has: just put that address in a new field in struct
> > boot_params by converting one of the padding arrays there.
> >
> > Don't forget to document it in Documentation/x86/zero-page.txt
> >
> > This way you don't need any of the allocation fun or to use setup_data
> > at all.
>
> Thanks!
> I have the prototype patch to use boot_params [1].
> I will try to brush up it.
>
> [1] https://lore.kernel.org/lkml/20181016151353.punyk7exekut2543@gabell
Chao's patches are included in the tip tree, so I modified the patch.
Could you review the following patch?
From: Masayoshi Mizuma <[email protected]>
Date: Tue, 5 Feb 2019 10:00:59 -0500
Subject: [PATCH] x86/mm: Introduce adjustment the padding size for KASLR
If the physical memory layout has huge space for hotplug, the padding
used for the physical memory mapping section is not enough.
So, such system may crash while memory hot-adding on KASLR enabled system.
For example, SRAT has the following layout, the maximum possible memory
size is 32TB, and the memory is installed as 2TB actually, then the padding
size should set 30TB (== possible memory size - actual memory size).
SRAT: Node 3 PXM 7 [mem 0x1c0000000000-0x1fffffffffff] hotplug
This patch introduces adjustment the padding size if the default
padding size isn't enough.
Signed-off-by: Masayoshi Mizuma <[email protected]>
---
Documentation/x86/zero-page.txt | 1 +
arch/x86/boot/compressed/acpi.c | 19 +++++++++++++++----
arch/x86/include/uapi/asm/bootparam.h | 2 +-
arch/x86/mm/kaslr.c | 26 +++++++++++++++++++++++++-
4 files changed, 42 insertions(+), 6 deletions(-)
diff --git a/Documentation/x86/zero-page.txt b/Documentation/x86/zero-page.txt
index 68aed077f..343fe1a90 100644
--- a/Documentation/x86/zero-page.txt
+++ b/Documentation/x86/zero-page.txt
@@ -15,6 +15,7 @@ Offset Proto Name Meaning
058/008 ALL tboot_addr Physical address of tboot shared page
060/010 ALL ist_info Intel SpeedStep (IST) BIOS support information
(struct ist_info)
+078/010 ALL possible_mem_addr The possible maximum physical memory address.
080/010 ALL hd0_info hd0 disk parameter, OBSOLETE!!
090/010 ALL hd1_info hd1 disk parameter, OBSOLETE!!
0A0/010 ALL sys_desc_table System description table (struct sys_desc_table),
diff --git a/arch/x86/boot/compressed/acpi.c b/arch/x86/boot/compressed/acpi.c
index c5a949335..7dd61b943 100644
--- a/arch/x86/boot/compressed/acpi.c
+++ b/arch/x86/boot/compressed/acpi.c
@@ -288,6 +288,7 @@ int count_immovable_mem_regions(void)
struct acpi_subtable_header *sub_table;
struct acpi_table_header *table_header;
char arg[MAX_ACPI_ARG_LENGTH];
+ unsigned long long possible_addr, max_possible_addr = 0;
int num = 0;
if (cmdline_find_option("acpi", arg, sizeof(arg)) == 3 &&
@@ -308,10 +309,19 @@ int count_immovable_mem_regions(void)
struct acpi_srat_mem_affinity *ma;
ma = (struct acpi_srat_mem_affinity *)sub_table;
- if (!(ma->flags & ACPI_SRAT_MEM_HOT_PLUGGABLE) && ma->length) {
- immovable_mem[num].start = ma->base_address;
- immovable_mem[num].size = ma->length;
- num++;
+ if (ma->length) {
+ if (ma->flags & ACPI_SRAT_MEM_HOT_PLUGGABLE) {
+ possible_addr =
+ ma->base_address + ma->length;
+ if (possible_addr > max_possible_addr)
+ max_possible_addr =
+ possible_addr;
+ } else {
+ immovable_mem[num].start =
+ ma->base_address;
+ immovable_mem[num].size = ma->length;
+ num++;
+ }
}
if (num >= MAX_NUMNODES*2) {
@@ -320,6 +330,7 @@ int count_immovable_mem_regions(void)
}
}
table += sub_table->length;
+ boot_params->possible_mem_addr = max_possible_addr;
}
return num;
}
diff --git a/arch/x86/include/uapi/asm/bootparam.h b/arch/x86/include/uapi/asm/bootparam.h
index 60733f137..5b64b606e 100644
--- a/arch/x86/include/uapi/asm/bootparam.h
+++ b/arch/x86/include/uapi/asm/bootparam.h
@@ -156,7 +156,7 @@ struct boot_params {
__u64 tboot_addr; /* 0x058 */
struct ist_info ist_info; /* 0x060 */
__u64 acpi_rsdp_addr; /* 0x070 */
- __u8 _pad3[8]; /* 0x078 */
+ __u64 possible_mem_addr; /* 0x078 */
__u8 hd0_info[16]; /* obsolete! */ /* 0x080 */
__u8 hd1_info[16]; /* obsolete! */ /* 0x090 */
struct sys_desc_table sys_desc_table; /* obsolete! */ /* 0x0a0 */
diff --git a/arch/x86/mm/kaslr.c b/arch/x86/mm/kaslr.c
index 3f452ffed..71fc28570 100644
--- a/arch/x86/mm/kaslr.c
+++ b/arch/x86/mm/kaslr.c
@@ -70,6 +70,30 @@ static inline bool kaslr_memory_enabled(void)
return kaslr_enabled() && !IS_ENABLED(CONFIG_KASAN);
}
+static unsigned int __init kaslr_padding(void)
+{
+ unsigned int rand_mem_physical_padding =
+ CONFIG_RANDOMIZE_MEMORY_PHYSICAL_PADDING;
+#ifdef CONFIG_MEMORY_HOTPLUG
+ unsigned long long max_possible_phys, max_actual_phys, threshold;
+
+ if (!boot_params.possible_mem_addr)
+ goto out;
+
+ max_actual_phys = roundup(PFN_PHYS(max_pfn), 1ULL << TB_SHIFT);
+ max_possible_phys = roundup(boot_params.possible_mem_addr,
+ 1ULL << TB_SHIFT);
+ threshold = max_actual_phys +
+ ((unsigned long long)rand_mem_physical_padding << TB_SHIFT);
+
+ if (max_possible_phys > threshold)
+ rand_mem_physical_padding =
+ (max_possible_phys - max_actual_phys) >> TB_SHIFT;
+out:
+#endif
+ return rand_mem_physical_padding;
+}
+
/* Initialize base and padding for each memory region randomized with KASLR */
void __init kernel_randomize_memory(void)
{
@@ -103,7 +127,7 @@ void __init kernel_randomize_memory(void)
*/
BUG_ON(kaslr_regions[0].base != &page_offset_base);
memory_tb = DIV_ROUND_UP(max_pfn << PAGE_SHIFT, 1UL << TB_SHIFT) +
- CONFIG_RANDOMIZE_MEMORY_PHYSICAL_PADDING;
+ kaslr_padding();
/* Adapt phyiscal memory region size based on available memory */
if (memory_tb < kaslr_regions[0].size_tb)
--
2.20.1
Thanks,
Masa
On Tue, Feb 05, 2019 at 10:05:16AM -0500, Masayoshi Mizuma wrote:
> From: Masayoshi Mizuma <[email protected]>
> Date: Tue, 5 Feb 2019 10:00:59 -0500
> Subject: [PATCH] x86/mm: Introduce adjustment the padding size for KASLR
"Adjust the padding size for KASLR"
> If the physical memory layout has huge space for hotplug, the padding
> used for the physical memory mapping section is not enough.
> So, such system may crash while memory hot-adding on KASLR enabled system.
Crash why?
Why is the padding not enough?
> For example, SRAT has the following layout, the maximum possible memory
> size is 32TB, and the memory is installed as 2TB actually, then the padding
> size should set 30TB (== possible memory size - actual memory size).
>
> SRAT: Node 3 PXM 7 [mem 0x1c0000000000-0x1fffffffffff] hotplug
What is that supposed to exemplify: that range is 3T not 2 and that
range start is not at 2T but 28T. So I have absolutely no clue what
you're trying to explain here.
Please go back, take your time and structure your commit message like
this:
Problem is A.
It happens because of B.
Fix it by doing C.
(Potentially do D).
For more detailed info, see
Documentation/process/submitting-patches.rst, Section "2) Describe your
changes".
> This patch introduces adjustment the padding size if the default
Avoid having "This patch" or "This commit" in the commit message. It is
tautologically useless.
Also, do
$ git grep 'This patch' Documentation/process
for more details.
> padding size isn't enough.
>
> Signed-off-by: Masayoshi Mizuma <[email protected]>
> ---
> Documentation/x86/zero-page.txt | 1 +
> arch/x86/boot/compressed/acpi.c | 19 +++++++++++++++----
> arch/x86/include/uapi/asm/bootparam.h | 2 +-
> arch/x86/mm/kaslr.c | 26 +++++++++++++++++++++++++-
> 4 files changed, 42 insertions(+), 6 deletions(-)
>
> diff --git a/Documentation/x86/zero-page.txt b/Documentation/x86/zero-page.txt
> index 68aed077f..343fe1a90 100644
> --- a/Documentation/x86/zero-page.txt
> +++ b/Documentation/x86/zero-page.txt
> @@ -15,6 +15,7 @@ Offset Proto Name Meaning
> 058/008 ALL tboot_addr Physical address of tboot shared page
> 060/010 ALL ist_info Intel SpeedStep (IST) BIOS support information
> (struct ist_info)
> +078/010 ALL possible_mem_addr The possible maximum physical memory address.
Why isn't this called max_phys_addr then?
Also, please explain what it means at the end of this file.
> 080/010 ALL hd0_info hd0 disk parameter, OBSOLETE!!
> 090/010 ALL hd1_info hd1 disk parameter, OBSOLETE!!
> 0A0/010 ALL sys_desc_table System description table (struct sys_desc_table),
> diff --git a/arch/x86/boot/compressed/acpi.c b/arch/x86/boot/compressed/acpi.c
> index c5a949335..7dd61b943 100644
> --- a/arch/x86/boot/compressed/acpi.c
> +++ b/arch/x86/boot/compressed/acpi.c
> @@ -288,6 +288,7 @@ int count_immovable_mem_regions(void)
> struct acpi_subtable_header *sub_table;
> struct acpi_table_header *table_header;
> char arg[MAX_ACPI_ARG_LENGTH];
> + unsigned long long possible_addr, max_possible_addr = 0;
> int num = 0;
>
> if (cmdline_find_option("acpi", arg, sizeof(arg)) == 3 &&
> @@ -308,10 +309,19 @@ int count_immovable_mem_regions(void)
> struct acpi_srat_mem_affinity *ma;
>
> ma = (struct acpi_srat_mem_affinity *)sub_table;
> - if (!(ma->flags & ACPI_SRAT_MEM_HOT_PLUGGABLE) && ma->length) {
> - immovable_mem[num].start = ma->base_address;
> - immovable_mem[num].size = ma->length;
> - num++;
> + if (ma->length) {
> + if (ma->flags & ACPI_SRAT_MEM_HOT_PLUGGABLE) {
> + possible_addr =
> + ma->base_address + ma->length;
> + if (possible_addr > max_possible_addr)
> + max_possible_addr =
> + possible_addr;
> + } else {
> + immovable_mem[num].start =
> + ma->base_address;
> + immovable_mem[num].size = ma->length;
> + num++;
> + }
This piece has some ugly linebreaks which makes it impossible to read.
Perhaps because the variable names you're adding are too long.
You can carve it out in a separate function, for example.
> if (num >= MAX_NUMNODES*2) {
> @@ -320,6 +330,7 @@ int count_immovable_mem_regions(void)
> }
> }
> table += sub_table->length;
> + boot_params->possible_mem_addr = max_possible_addr;
This assignment is inside the loop and happens potentially a bunch of
times. Why?
> }
> return num;
> }
> diff --git a/arch/x86/include/uapi/asm/bootparam.h b/arch/x86/include/uapi/asm/bootparam.h
> index 60733f137..5b64b606e 100644
> --- a/arch/x86/include/uapi/asm/bootparam.h
> +++ b/arch/x86/include/uapi/asm/bootparam.h
> @@ -156,7 +156,7 @@ struct boot_params {
> __u64 tboot_addr; /* 0x058 */
> struct ist_info ist_info; /* 0x060 */
> __u64 acpi_rsdp_addr; /* 0x070 */
> - __u8 _pad3[8]; /* 0x078 */
> + __u64 possible_mem_addr; /* 0x078 */
> __u8 hd0_info[16]; /* obsolete! */ /* 0x080 */
> __u8 hd1_info[16]; /* obsolete! */ /* 0x090 */
> struct sys_desc_table sys_desc_table; /* obsolete! */ /* 0x0a0 */
> diff --git a/arch/x86/mm/kaslr.c b/arch/x86/mm/kaslr.c
> index 3f452ffed..71fc28570 100644
> --- a/arch/x86/mm/kaslr.c
> +++ b/arch/x86/mm/kaslr.c
> @@ -70,6 +70,30 @@ static inline bool kaslr_memory_enabled(void)
> return kaslr_enabled() && !IS_ENABLED(CONFIG_KASAN);
> }
>
> +static unsigned int __init kaslr_padding(void)
> +{
> + unsigned int rand_mem_physical_padding =
> + CONFIG_RANDOMIZE_MEMORY_PHYSICAL_PADDING;
> +#ifdef CONFIG_MEMORY_HOTPLUG
> + unsigned long long max_possible_phys, max_actual_phys, threshold;
> +
> + if (!boot_params.possible_mem_addr)
> + goto out;
> +
> + max_actual_phys = roundup(PFN_PHYS(max_pfn), 1ULL << TB_SHIFT);
> + max_possible_phys = roundup(boot_params.possible_mem_addr,
> + 1ULL << TB_SHIFT);
> + threshold = max_actual_phys +
> + ((unsigned long long)rand_mem_physical_padding << TB_SHIFT);
> +
> + if (max_possible_phys > threshold)
> + rand_mem_physical_padding =
> + (max_possible_phys - max_actual_phys) >> TB_SHIFT;
> +out:
> +#endif
> + return rand_mem_physical_padding;
Same problem: local variables with unnecessarily long names make the
code hard to read. Pls shorten.
Also, the types in that function are a total mess. An unsigned int which
you cast to unsigned long long and return an unsigned int again to add
into a sum with unsigned longs. Are you selecting the types randomly?
Why aren't you using plain unsigned longs like the rest of the file does
with memory addresses?
Also, this function could have a comment above it explaining what all
that threshold and actual and possible address arithmetic is supposed to
do.
Thx.
--
Regards/Gruss,
Boris.
Good mailing practices for 400: avoid top-posting and trim the reply.
Hi Boris,
Thank you for your review.
On Fri, Feb 08, 2019 at 07:26:00PM +0100, Borislav Petkov wrote:
> On Tue, Feb 05, 2019 at 10:05:16AM -0500, Masayoshi Mizuma wrote:
> > From: Masayoshi Mizuma <[email protected]>
> > Date: Tue, 5 Feb 2019 10:00:59 -0500
> > Subject: [PATCH] x86/mm: Introduce adjustment the padding size for KASLR
>
> "Adjust the padding size for KASLR"
>
> > If the physical memory layout has huge space for hotplug, the padding
> > used for the physical memory mapping section is not enough.
> > So, such system may crash while memory hot-adding on KASLR enabled system.
>
> Crash why?
>
> Why is the padding not enough?
>
> > For example, SRAT has the following layout, the maximum possible memory
> > size is 32TB, and the memory is installed as 2TB actually, then the padding
> > size should set 30TB (== possible memory size - actual memory size).
> >
> > SRAT: Node 3 PXM 7 [mem 0x1c0000000000-0x1fffffffffff] hotplug
>
> What is that supposed to exemplify: that range is 3T not 2 and that
> range start is not at 2T but 28T. So I have absolutely no clue what
> you're trying to explain here.
>
> Please go back, take your time and structure your commit message like
> this:
>
> Problem is A.
>
> It happens because of B.
>
> Fix it by doing C.
>
> (Potentially do D).
>
> For more detailed info, see
> Documentation/process/submitting-patches.rst, Section "2) Describe your
> changes".
Got it, thanks.
>
> > This patch introduces adjustment the padding size if the default
>
> Avoid having "This patch" or "This commit" in the commit message. It is
> tautologically useless.
>
> Also, do
>
> $ git grep 'This patch' Documentation/process
>
> for more details.
Thanks. I see.
>
> > padding size isn't enough.
> >
> > Signed-off-by: Masayoshi Mizuma <[email protected]>
> > ---
> > Documentation/x86/zero-page.txt | 1 +
> > arch/x86/boot/compressed/acpi.c | 19 +++++++++++++++----
> > arch/x86/include/uapi/asm/bootparam.h | 2 +-
> > arch/x86/mm/kaslr.c | 26 +++++++++++++++++++++++++-
> > 4 files changed, 42 insertions(+), 6 deletions(-)
> >
> > diff --git a/Documentation/x86/zero-page.txt b/Documentation/x86/zero-page.txt
> > index 68aed077f..343fe1a90 100644
> > --- a/Documentation/x86/zero-page.txt
> > +++ b/Documentation/x86/zero-page.txt
> > @@ -15,6 +15,7 @@ Offset Proto Name Meaning
> > 058/008 ALL tboot_addr Physical address of tboot shared page
> > 060/010 ALL ist_info Intel SpeedStep (IST) BIOS support information
> > (struct ist_info)
> > +078/010 ALL possible_mem_addr The possible maximum physical memory address.
>
> Why isn't this called max_phys_addr then?
>
> Also, please explain what it means at the end of this file.
>
> > 080/010 ALL hd0_info hd0 disk parameter, OBSOLETE!!
> > 090/010 ALL hd1_info hd1 disk parameter, OBSOLETE!!
> > 0A0/010 ALL sys_desc_table System description table (struct sys_desc_table),
> > diff --git a/arch/x86/boot/compressed/acpi.c b/arch/x86/boot/compressed/acpi.c
> > index c5a949335..7dd61b943 100644
> > --- a/arch/x86/boot/compressed/acpi.c
> > +++ b/arch/x86/boot/compressed/acpi.c
> > @@ -288,6 +288,7 @@ int count_immovable_mem_regions(void)
> > struct acpi_subtable_header *sub_table;
> > struct acpi_table_header *table_header;
> > char arg[MAX_ACPI_ARG_LENGTH];
> > + unsigned long long possible_addr, max_possible_addr = 0;
> > int num = 0;
> >
> > if (cmdline_find_option("acpi", arg, sizeof(arg)) == 3 &&
> > @@ -308,10 +309,19 @@ int count_immovable_mem_regions(void)
> > struct acpi_srat_mem_affinity *ma;
> >
> > ma = (struct acpi_srat_mem_affinity *)sub_table;
> > - if (!(ma->flags & ACPI_SRAT_MEM_HOT_PLUGGABLE) && ma->length) {
> > - immovable_mem[num].start = ma->base_address;
> > - immovable_mem[num].size = ma->length;
> > - num++;
> > + if (ma->length) {
> > + if (ma->flags & ACPI_SRAT_MEM_HOT_PLUGGABLE) {
> > + possible_addr =
> > + ma->base_address + ma->length;
> > + if (possible_addr > max_possible_addr)
> > + max_possible_addr =
> > + possible_addr;
> > + } else {
> > + immovable_mem[num].start =
> > + ma->base_address;
> > + immovable_mem[num].size = ma->length;
> > + num++;
> > + }
>
> This piece has some ugly linebreaks which makes it impossible to read.
> Perhaps because the variable names you're adding are too long.
>
> You can carve it out in a separate function, for example.
Thanks. I will add a separate function like as:
static void subtable_parse(struct acpi_subtable_header *sub_table, int *num,
unsigned long *max_addr)
{
struct acpi_srat_mem_affinity *ma;
unsigned long addr;
ma = (struct acpi_srat_mem_affinity *)sub_table;
if (ma->length) {
if (ma->flags & ACPI_SRAT_MEM_HOT_PLUGGABLE) {
addr = ma->base_address + ma->length;
if (addr > *max_addr)
*max_addr = addr;
} else {
immovable_mem[*num].start = ma->base_address;
immovable_mem[*num].size = ma->length;
(*num)++;
}
}
}
>
> > if (num >= MAX_NUMNODES*2) {
> > @@ -320,6 +330,7 @@ int count_immovable_mem_regions(void)
> > }
> > }
> > table += sub_table->length;
> > + boot_params->possible_mem_addr = max_possible_addr;
>
> This assignment is inside the loop and happens potentially a bunch of
> times. Why?
I will move the assigment out of the loop, thanks.
>
> > }
> > return num;
> > }
> > diff --git a/arch/x86/include/uapi/asm/bootparam.h b/arch/x86/include/uapi/asm/bootparam.h
> > index 60733f137..5b64b606e 100644
> > --- a/arch/x86/include/uapi/asm/bootparam.h
> > +++ b/arch/x86/include/uapi/asm/bootparam.h
> > @@ -156,7 +156,7 @@ struct boot_params {
> > __u64 tboot_addr; /* 0x058 */
> > struct ist_info ist_info; /* 0x060 */
> > __u64 acpi_rsdp_addr; /* 0x070 */
> > - __u8 _pad3[8]; /* 0x078 */
> > + __u64 possible_mem_addr; /* 0x078 */
> > __u8 hd0_info[16]; /* obsolete! */ /* 0x080 */
> > __u8 hd1_info[16]; /* obsolete! */ /* 0x090 */
> > struct sys_desc_table sys_desc_table; /* obsolete! */ /* 0x0a0 */
> > diff --git a/arch/x86/mm/kaslr.c b/arch/x86/mm/kaslr.c
> > index 3f452ffed..71fc28570 100644
> > --- a/arch/x86/mm/kaslr.c
> > +++ b/arch/x86/mm/kaslr.c
> > @@ -70,6 +70,30 @@ static inline bool kaslr_memory_enabled(void)
> > return kaslr_enabled() && !IS_ENABLED(CONFIG_KASAN);
> > }
> >
> > +static unsigned int __init kaslr_padding(void)
> > +{
> > + unsigned int rand_mem_physical_padding =
> > + CONFIG_RANDOMIZE_MEMORY_PHYSICAL_PADDING;
> > +#ifdef CONFIG_MEMORY_HOTPLUG
> > + unsigned long long max_possible_phys, max_actual_phys, threshold;
> > +
> > + if (!boot_params.possible_mem_addr)
> > + goto out;
> > +
> > + max_actual_phys = roundup(PFN_PHYS(max_pfn), 1ULL << TB_SHIFT);
> > + max_possible_phys = roundup(boot_params.possible_mem_addr,
> > + 1ULL << TB_SHIFT);
> > + threshold = max_actual_phys +
> > + ((unsigned long long)rand_mem_physical_padding << TB_SHIFT);
> > +
> > + if (max_possible_phys > threshold)
> > + rand_mem_physical_padding =
> > + (max_possible_phys - max_actual_phys) >> TB_SHIFT;
> > +out:
> > +#endif
> > + return rand_mem_physical_padding;
>
> Same problem: local variables with unnecessarily long names make the
> code hard to read. Pls shorten.
>
> Also, the types in that function are a total mess. An unsigned int which
> you cast to unsigned long long and return an unsigned int again to add
> into a sum with unsigned longs. Are you selecting the types randomly?
> Why aren't you using plain unsigned longs like the rest of the file does
> with memory addresses?
Sorry. I will unify the type as unsinged long.
>
> Also, this function could have a comment above it explaining what all
> that threshold and actual and possible address arithmetic is supposed to
> do.
Got it.
Thanks!
Masa
On Tue, Feb 05, 2019 at 10:05:16AM -0500, Masayoshi Mizuma wrote:
[...]
Hi Masa,
Sorry for delay, since last days were Chinese holiday.
>diff --git a/arch/x86/boot/compressed/acpi.c b/arch/x86/boot/compressed/acpi.c
>index c5a949335..7dd61b943 100644
>--- a/arch/x86/boot/compressed/acpi.c
>+++ b/arch/x86/boot/compressed/acpi.c
>@@ -288,6 +288,7 @@ int count_immovable_mem_regions(void)
> struct acpi_subtable_header *sub_table;
> struct acpi_table_header *table_header;
> char arg[MAX_ACPI_ARG_LENGTH];
>+ unsigned long long possible_addr, max_possible_addr = 0;
This line is so long that it should be added in first line.
> int num = 0;
>
> if (cmdline_find_option("acpi", arg, sizeof(arg)) == 3 &&
>@@ -308,10 +309,19 @@ int count_immovable_mem_regions(void)
> struct acpi_srat_mem_affinity *ma;
>
> ma = (struct acpi_srat_mem_affinity *)sub_table;
>- if (!(ma->flags & ACPI_SRAT_MEM_HOT_PLUGGABLE) && ma->length) {
>- immovable_mem[num].start = ma->base_address;
>- immovable_mem[num].size = ma->length;
>- num++;
>+ if (ma->length) {
>+ if (ma->flags & ACPI_SRAT_MEM_HOT_PLUGGABLE) {
>+ possible_addr =
>+ ma->base_address + ma->length;
>+ if (possible_addr > max_possible_addr)
>+ max_possible_addr =
>+ possible_addr;
>+ } else {
>+ immovable_mem[num].start =
>+ ma->base_address;
>+ immovable_mem[num].size = ma->length;
>+ num++;
>+ }
> }
It looks better in another mail where you add a new function.
Thanks,
Chao Fan
>
> if (num >= MAX_NUMNODES*2) {
>@@ -320,6 +330,7 @@ int count_immovable_mem_regions(void)
> }
> }
> table += sub_table->length;
>+ boot_params->possible_mem_addr = max_possible_addr;
> }
> return num;
> }
>diff --git a/arch/x86/include/uapi/asm/bootparam.h b/arch/x86/include/uapi/asm/bootparam.h
>index 60733f137..5b64b606e 100644
>--- a/arch/x86/include/uapi/asm/bootparam.h
>+++ b/arch/x86/include/uapi/asm/bootparam.h
>@@ -156,7 +156,7 @@ struct boot_params {
> __u64 tboot_addr; /* 0x058 */
> struct ist_info ist_info; /* 0x060 */
> __u64 acpi_rsdp_addr; /* 0x070 */
>- __u8 _pad3[8]; /* 0x078 */
>+ __u64 possible_mem_addr; /* 0x078 */
> __u8 hd0_info[16]; /* obsolete! */ /* 0x080 */
> __u8 hd1_info[16]; /* obsolete! */ /* 0x090 */
> struct sys_desc_table sys_desc_table; /* obsolete! */ /* 0x0a0 */
>diff --git a/arch/x86/mm/kaslr.c b/arch/x86/mm/kaslr.c
>index 3f452ffed..71fc28570 100644
>--- a/arch/x86/mm/kaslr.c
>+++ b/arch/x86/mm/kaslr.c
>@@ -70,6 +70,30 @@ static inline bool kaslr_memory_enabled(void)
> return kaslr_enabled() && !IS_ENABLED(CONFIG_KASAN);
> }
>
>+static unsigned int __init kaslr_padding(void)
>+{
>+ unsigned int rand_mem_physical_padding =
>+ CONFIG_RANDOMIZE_MEMORY_PHYSICAL_PADDING;
>+#ifdef CONFIG_MEMORY_HOTPLUG
>+ unsigned long long max_possible_phys, max_actual_phys, threshold;
>+
>+ if (!boot_params.possible_mem_addr)
>+ goto out;
>+
>+ max_actual_phys = roundup(PFN_PHYS(max_pfn), 1ULL << TB_SHIFT);
>+ max_possible_phys = roundup(boot_params.possible_mem_addr,
>+ 1ULL << TB_SHIFT);
>+ threshold = max_actual_phys +
>+ ((unsigned long long)rand_mem_physical_padding << TB_SHIFT);
>+
>+ if (max_possible_phys > threshold)
>+ rand_mem_physical_padding =
>+ (max_possible_phys - max_actual_phys) >> TB_SHIFT;
>+out:
>+#endif
>+ return rand_mem_physical_padding;
>+}
>+
> /* Initialize base and padding for each memory region randomized with KASLR */
> void __init kernel_randomize_memory(void)
> {
>@@ -103,7 +127,7 @@ void __init kernel_randomize_memory(void)
> */
> BUG_ON(kaslr_regions[0].base != &page_offset_base);
> memory_tb = DIV_ROUND_UP(max_pfn << PAGE_SHIFT, 1UL << TB_SHIFT) +
>- CONFIG_RANDOMIZE_MEMORY_PHYSICAL_PADDING;
>+ kaslr_padding();
>
> /* Adapt phyiscal memory region size based on available memory */
> if (memory_tb < kaslr_regions[0].size_tb)
>--
>2.20.1
>
>Thanks,
>Masa
>
>
Hi Chao,
Thank you for your review.
On Mon, Feb 11, 2019 at 09:46:05AM +0800, Chao Fan wrote:
> On Tue, Feb 05, 2019 at 10:05:16AM -0500, Masayoshi Mizuma wrote:
> [...]
>
> Hi Masa,
>
> Sorry for delay, since last days were Chinese holiday.
>
> >diff --git a/arch/x86/boot/compressed/acpi.c b/arch/x86/boot/compressed/acpi.c
> >index c5a949335..7dd61b943 100644
> >--- a/arch/x86/boot/compressed/acpi.c
> >+++ b/arch/x86/boot/compressed/acpi.c
> >@@ -288,6 +288,7 @@ int count_immovable_mem_regions(void)
> > struct acpi_subtable_header *sub_table;
> > struct acpi_table_header *table_header;
> > char arg[MAX_ACPI_ARG_LENGTH];
> >+ unsigned long long possible_addr, max_possible_addr = 0;
>
> This line is so long that it should be added in first line.
Thanks. I will simplify around the local variables.
>
> > int num = 0;
> >
> > if (cmdline_find_option("acpi", arg, sizeof(arg)) == 3 &&
> >@@ -308,10 +309,19 @@ int count_immovable_mem_regions(void)
> > struct acpi_srat_mem_affinity *ma;
> >
> > ma = (struct acpi_srat_mem_affinity *)sub_table;
> >- if (!(ma->flags & ACPI_SRAT_MEM_HOT_PLUGGABLE) && ma->length) {
> >- immovable_mem[num].start = ma->base_address;
> >- immovable_mem[num].size = ma->length;
> >- num++;
> >+ if (ma->length) {
> >+ if (ma->flags & ACPI_SRAT_MEM_HOT_PLUGGABLE) {
> >+ possible_addr =
> >+ ma->base_address + ma->length;
> >+ if (possible_addr > max_possible_addr)
> >+ max_possible_addr =
> >+ possible_addr;
> >+ } else {
> >+ immovable_mem[num].start =
> >+ ma->base_address;
> >+ immovable_mem[num].size = ma->length;
> >+ num++;
> >+ }
> > }
>
> It looks better in another mail where you add a new function.
Thanks!
- Masa