2019-11-02 01:10:26

by Masayoshi Mizuma

[permalink] [raw]
Subject: [PATCH v4 0/4] Adjust the padding size for KASLR

From: Masayoshi Mizuma <[email protected]>

The system sometimes crashes while memory hot-adding on KASLR
enabled system. The crash happens because the regions pointed by
kaslr_regions[].base are overwritten by the hot-added memory.

It happens because of the padding size for kaslr_regions[].base isn't
enough for the system whose physical memory layout has huge space for
memory hotplug. kaslr_regions[].base points "actual installed
memory size + padding" or higher address. So, if the "actual + padding"
is lower address than the maximum memory address, which means the memory
address reachable by memory hot-add, kaslr_regions[].base is destroyed by
the overwritten.

address
^
|------- maximum memory address (Hotplug)
| ^
|------- kaslr_regions[0].base | Hotadd-able region
| ^ |
| | padding |
| V V
|------- actual memory address (Installed on boot)
|

Fix it by getting the maximum memory address from SRAT and store
the value in boot_param, then set the padding size while KASLR
initializing if the default padding size isn't enough.

Masayoshi Mizuma (4):
x86/boot: Wrap up the SRAT traversing code into subtable_parse()
x86/boot: Add max_addr field in struct boot_params
x86/boot: Get the max address from SRAT
x86/mm/KASLR: Adjust the padding size for the direct mapping.

Documentation/x86/zero-page.rst | 4 ++
arch/x86/boot/compressed/acpi.c | 35 ++++++++++++---
arch/x86/include/uapi/asm/bootparam.h | 2 +-
arch/x86/mm/kaslr.c | 65 ++++++++++++++++++++-------
4 files changed, 83 insertions(+), 23 deletions(-)

--
2.20.1


2019-11-02 01:10:50

by Masayoshi Mizuma

[permalink] [raw]
Subject: [PATCH v4 4/4] x86/mm/KASLR: Adjust the padding size for the direct mapping.

From: Masayoshi Mizuma <[email protected]>

The system sometimes crashes while memory hot-adding on KASLR
enabled system. The crash happens because the regions pointed by
kaslr_regions[].base are overwritten by the hot-added memory.

It happens because of the padding size for kaslr_regions[].base isn't
enough for the system whose physical memory layout has huge space for
memory hotplug. kaslr_regions[].base points "actual installed
memory size + padding" or higher address. So, if the "actual + padding"
is lower address than the maximum memory address, which means the memory
address reachable by memory hot-add, kaslr_regions[].base is destroyed by
the overwritten.

address
^
|------- maximum memory address (Hotplug)
| ^
|------- kaslr_regions[0].base | Hotadd-able region
| ^ |
| | padding |
| V V
|------- actual memory address (Installed on boot)
|

Fix it by getting the maximum memory address from SRAT and store
the value in boot_param, then set the padding size while KASLR
initializing if the default padding size isn't enough.

Signed-off-by: Baoquan He <[email protected]>
Signed-off-by: Masayoshi Mizuma <[email protected]>
---
arch/x86/mm/kaslr.c | 65 ++++++++++++++++++++++++++++++++++-----------
1 file changed, 50 insertions(+), 15 deletions(-)

diff --git a/arch/x86/mm/kaslr.c b/arch/x86/mm/kaslr.c
index dc6182eec..a80eed563 100644
--- a/arch/x86/mm/kaslr.c
+++ b/arch/x86/mm/kaslr.c
@@ -70,15 +70,60 @@ static inline bool kaslr_memory_enabled(void)
return kaslr_enabled() && !IS_ENABLED(CONFIG_KASAN);
}

+/*
+ * Even though a huge virtual address space is reserved for the direct
+ * mapping of physical memory, e.g in 4-level paging mode, it's 64TB,
+ * rare system can own enough physical memory to use it up, most are
+ * even less than 1TB. So with KASLR enabled, we adapt the size of
+ * direct mapping area to the size of actual physical memory plus the
+ * configured padding CONFIG_RANDOMIZE_MEMORY_PHYSICAL_PADDING.
+ * The left part will be taken out to join memory randomization.
+ */
+static inline unsigned long calc_direct_mapping_size(void)
+{
+ unsigned long padding = CONFIG_RANDOMIZE_MEMORY_PHYSICAL_PADDING;
+ unsigned long size_tb, memory_tb;
+#ifdef CONFIG_MEMORY_HOTPLUG
+ unsigned long actual, maximum, base;
+
+ if (boot_params.max_addr) {
+ /*
+ * The padding size should set to get for kaslr_regions[].base
+ * bigger address than the maximum memory address the system can
+ * have. kaslr_regions[].base points "actual size + padding" or
+ * higher address. If "actual size + padding" points the lower
+ * address than the maximum memory size, fix the padding size.
+ */
+ actual = roundup(PFN_PHYS(max_pfn), 1UL << TB_SHIFT);
+ maximum = roundup(boot_params.max_addr, 1UL << TB_SHIFT);
+ base = actual + (padding << TB_SHIFT);
+
+ if (maximum > base)
+ padding = (maximum - actual) >> TB_SHIFT;
+ }
+#endif
+ memory_tb = DIV_ROUND_UP(max_pfn << PAGE_SHIFT, 1UL << TB_SHIFT) +
+ padding;
+
+ size_tb = 1 << (MAX_PHYSMEM_BITS - TB_SHIFT);
+
+ /*
+ * Adapt physical memory region size based on available memory
+ */
+ if (memory_tb < size_tb)
+ size_tb = memory_tb;
+
+ return size_tb;
+}
+
/* Initialize base and padding for each memory region randomized with KASLR */
void __init kernel_randomize_memory(void)
{
- size_t i;
- unsigned long vaddr_start, vaddr;
- unsigned long rand, memory_tb;
- struct rnd_state rand_state;
+ unsigned long vaddr_start, vaddr, rand;
unsigned long remain_entropy;
unsigned long vmemmap_size;
+ struct rnd_state rand_state;
+ size_t i;

vaddr_start = pgtable_l5_enabled() ? __PAGE_OFFSET_BASE_L5 : __PAGE_OFFSET_BASE_L4;
vaddr = vaddr_start;
@@ -95,20 +140,10 @@ void __init kernel_randomize_memory(void)
if (!kaslr_memory_enabled())
return;

- kaslr_regions[0].size_tb = 1 << (MAX_PHYSMEM_BITS - TB_SHIFT);
+ kaslr_regions[0].size_tb = calc_direct_mapping_size();
kaslr_regions[1].size_tb = VMALLOC_SIZE_TB;

- /*
- * Update Physical memory mapping to available and
- * add padding if needed (especially for memory hotplug support).
- */
BUG_ON(kaslr_regions[0].base != &page_offset_base);
- memory_tb = DIV_ROUND_UP(max_pfn << PAGE_SHIFT, 1UL << TB_SHIFT) +
- CONFIG_RANDOMIZE_MEMORY_PHYSICAL_PADDING;
-
- /* Adapt phyiscal memory region size based on available memory */
- if (memory_tb < kaslr_regions[0].size_tb)
- kaslr_regions[0].size_tb = memory_tb;

/*
* Calculate the vmemmap region size in TBs, aligned to a TB
--
2.20.1

2019-11-02 01:10:57

by Masayoshi Mizuma

[permalink] [raw]
Subject: [PATCH v4 3/4] x86/boot: Get the max address from SRAT

From: Masayoshi Mizuma <[email protected]>

Get the max address from SRAT and write it into boot_params->max_addr.

Signed-off-by: Masayoshi Mizuma <[email protected]>
---
arch/x86/boot/compressed/acpi.c | 26 ++++++++++++++++++++------
1 file changed, 20 insertions(+), 6 deletions(-)

diff --git a/arch/x86/boot/compressed/acpi.c b/arch/x86/boot/compressed/acpi.c
index a0f81438a..764206c23 100644
--- a/arch/x86/boot/compressed/acpi.c
+++ b/arch/x86/boot/compressed/acpi.c
@@ -362,17 +362,25 @@ static unsigned long get_acpi_srat_table(void)
return 0;
}

-static void subtable_parse(struct acpi_subtable_header *sub_table, int *num)
+static unsigned long subtable_parse(struct acpi_subtable_header *sub_table,
+ int *num)
{
struct acpi_srat_mem_affinity *ma;
+ unsigned long addr = 0;

ma = (struct acpi_srat_mem_affinity *)sub_table;

- if (!(ma->flags & ACPI_SRAT_MEM_HOT_PLUGGABLE) && ma->length) {
- immovable_mem[*num].start = ma->base_address;
- immovable_mem[*num].size = ma->length;
- (*num)++;
+ if (ma->length) {
+ if (ma->flags & ACPI_SRAT_MEM_HOT_PLUGGABLE)
+ addr = ma->base_address + ma->length;
+ else {
+ immovable_mem[*num].start = ma->base_address;
+ immovable_mem[*num].size = ma->length;
+ (*num)++;
+ }
}
+
+ return addr;
}

/**
@@ -391,6 +399,7 @@ int count_immovable_mem_regions(void)
struct acpi_subtable_header *sub_table;
struct acpi_table_header *table_header;
char arg[MAX_ACPI_ARG_LENGTH];
+ unsigned long max_addr = 0, addr;
int num = 0;

if (cmdline_find_option("acpi", arg, sizeof(arg)) == 3 &&
@@ -409,7 +418,9 @@ int count_immovable_mem_regions(void)
sub_table = (struct acpi_subtable_header *)table;
if (sub_table->type == ACPI_SRAT_TYPE_MEMORY_AFFINITY) {

- subtable_parse(sub_table, &num);
+ addr = subtable_parse(sub_table, &num);
+ if (addr > max_addr)
+ max_addr = addr;

if (num >= MAX_NUMNODES*2) {
debug_putstr("Too many immovable memory regions, aborting.\n");
@@ -418,6 +429,9 @@ int count_immovable_mem_regions(void)
}
table += sub_table->length;
}
+
+ boot_params->max_addr = max_addr;
+
return num;
}
#endif /* CONFIG_RANDOMIZE_BASE && CONFIG_MEMORY_HOTREMOVE */
--
2.20.1

2019-11-02 01:11:23

by Masayoshi Mizuma

[permalink] [raw]
Subject: [PATCH v4 2/4] x86/boot: Add max_addr field in struct boot_params

From: Masayoshi Mizuma <[email protected]>

Add max_addr field in struct boot_params. max_addr shows the
maximum memory address to be reachable by memory hot-add.
max_addr is set by parsing ACPI SRAT.

Reviewed-by: Baoquan He <[email protected]>
Signed-off-by: Masayoshi Mizuma <[email protected]>
---
Documentation/x86/zero-page.rst | 4 ++++
arch/x86/include/uapi/asm/bootparam.h | 2 +-
2 files changed, 5 insertions(+), 1 deletion(-)

diff --git a/Documentation/x86/zero-page.rst b/Documentation/x86/zero-page.rst
index f088f5881..cc3938d68 100644
--- a/Documentation/x86/zero-page.rst
+++ b/Documentation/x86/zero-page.rst
@@ -19,6 +19,7 @@ Offset/Size Proto Name Meaning
058/008 ALL tboot_addr Physical address of tboot shared page
060/010 ALL ist_info Intel SpeedStep (IST) BIOS support information
(struct ist_info)
+078/010 ALL max_addr The possible maximum physical memory address [1]_
080/010 ALL hd0_info hd0 disk parameter, OBSOLETE!!
090/010 ALL hd1_info hd1 disk parameter, OBSOLETE!!
0A0/010 ALL sys_desc_table System description table (struct sys_desc_table),
@@ -43,3 +44,6 @@ Offset/Size Proto Name Meaning
(array of struct e820_entry)
D00/1EC ALL eddbuf EDD data (array of struct edd_info)
=========== ===== ======================= =================================================
+
+.. [1] max_addr shows the maximum memory address to be reachable by memory
+ hot-add. max_addr is set by parsing ACPI SRAT.
diff --git a/arch/x86/include/uapi/asm/bootparam.h b/arch/x86/include/uapi/asm/bootparam.h
index db1e24e56..855d30f64 100644
--- a/arch/x86/include/uapi/asm/bootparam.h
+++ b/arch/x86/include/uapi/asm/bootparam.h
@@ -165,7 +165,7 @@ struct boot_params {
__u64 tboot_addr; /* 0x058 */
struct ist_info ist_info; /* 0x060 */
__u64 acpi_rsdp_addr; /* 0x070 */
- __u8 _pad3[8]; /* 0x078 */
+ __u64 max_addr; /* 0x078 */
__u8 hd0_info[16]; /* obsolete! */ /* 0x080 */
__u8 hd1_info[16]; /* obsolete! */ /* 0x090 */
struct sys_desc_table sys_desc_table; /* obsolete! */ /* 0x0a0 */
--
2.20.1

2019-11-02 01:11:43

by Masayoshi Mizuma

[permalink] [raw]
Subject: [PATCH v4 1/4] x86/boot: Wrap up the SRAT traversing code into subtable_parse()

From: Masayoshi Mizuma <[email protected]>

Wrap up the SRAT traversing code into subtable_parse().

Reviewed-by: Baoquan He <[email protected]>
Signed-off-by: Masayoshi Mizuma <[email protected]>
---
arch/x86/boot/compressed/acpi.c | 21 ++++++++++++++-------
1 file changed, 14 insertions(+), 7 deletions(-)

diff --git a/arch/x86/boot/compressed/acpi.c b/arch/x86/boot/compressed/acpi.c
index 25019d42a..a0f81438a 100644
--- a/arch/x86/boot/compressed/acpi.c
+++ b/arch/x86/boot/compressed/acpi.c
@@ -362,6 +362,19 @@ static unsigned long get_acpi_srat_table(void)
return 0;
}

+static void subtable_parse(struct acpi_subtable_header *sub_table, int *num)
+{
+ struct acpi_srat_mem_affinity *ma;
+
+ ma = (struct acpi_srat_mem_affinity *)sub_table;
+
+ if (!(ma->flags & ACPI_SRAT_MEM_HOT_PLUGGABLE) && ma->length) {
+ immovable_mem[*num].start = ma->base_address;
+ immovable_mem[*num].size = ma->length;
+ (*num)++;
+ }
+}
+
/**
* count_immovable_mem_regions - Parse SRAT and cache the immovable
* memory regions into the immovable_mem array.
@@ -395,14 +408,8 @@ int count_immovable_mem_regions(void)
while (table + sizeof(struct acpi_subtable_header) < table_end) {
sub_table = (struct acpi_subtable_header *)table;
if (sub_table->type == ACPI_SRAT_TYPE_MEMORY_AFFINITY) {
- struct acpi_srat_mem_affinity *ma;

- ma = (struct acpi_srat_mem_affinity *)sub_table;
- if (!(ma->flags & ACPI_SRAT_MEM_HOT_PLUGGABLE) && ma->length) {
- immovable_mem[num].start = ma->base_address;
- immovable_mem[num].size = ma->length;
- num++;
- }
+ subtable_parse(sub_table, &num);

if (num >= MAX_NUMNODES*2) {
debug_putstr("Too many immovable memory regions, aborting.\n");
--
2.20.1

2019-11-04 00:36:40

by Baoquan He

[permalink] [raw]
Subject: Re: [PATCH v4 3/4] x86/boot: Get the max address from SRAT

On 11/01/19 at 09:09pm, Masayoshi Mizuma wrote:
> From: Masayoshi Mizuma <[email protected]>
>
> Get the max address from SRAT and write it into boot_params->max_addr.
>
> Signed-off-by: Masayoshi Mizuma <[email protected]>
> ---
> arch/x86/boot/compressed/acpi.c | 26 ++++++++++++++++++++------
> 1 file changed, 20 insertions(+), 6 deletions(-)
>
> diff --git a/arch/x86/boot/compressed/acpi.c b/arch/x86/boot/compressed/acpi.c
> index a0f81438a..764206c23 100644
> --- a/arch/x86/boot/compressed/acpi.c
> +++ b/arch/x86/boot/compressed/acpi.c
> @@ -362,17 +362,25 @@ static unsigned long get_acpi_srat_table(void)
> return 0;
> }
>
> -static void subtable_parse(struct acpi_subtable_header *sub_table, int *num)
> +static unsigned long subtable_parse(struct acpi_subtable_header *sub_table,
> + int *num)
> {
> struct acpi_srat_mem_affinity *ma;
> + unsigned long addr = 0;
>
> ma = (struct acpi_srat_mem_affinity *)sub_table;
>
> - if (!(ma->flags & ACPI_SRAT_MEM_HOT_PLUGGABLE) && ma->length) {
> - immovable_mem[*num].start = ma->base_address;
> - immovable_mem[*num].size = ma->length;
> - (*num)++;
> + if (ma->length) {
> + if (ma->flags & ACPI_SRAT_MEM_HOT_PLUGGABLE)
> + addr = ma->base_address + ma->length;
> + else {
> + immovable_mem[*num].start = ma->base_address;
> + immovable_mem[*num].size = ma->length;
> + (*num)++;

Here maybe add code comment or doc above the subtable_parse() to explain
why we only get the end address of hotpluggable region. We assume hot
pluggable memory board will be added above the existed system RAM,
right?

Otherwise, this patch looks good to me. Thanks.

Acked-by: Baoquan He <[email protected]>

Thanks
Baoquan

> + }
> }
> +
> + return addr;
> }
>
> /**
> @@ -391,6 +399,7 @@ int count_immovable_mem_regions(void)
> struct acpi_subtable_header *sub_table;
> struct acpi_table_header *table_header;
> char arg[MAX_ACPI_ARG_LENGTH];
> + unsigned long max_addr = 0, addr;
> int num = 0;
>
> if (cmdline_find_option("acpi", arg, sizeof(arg)) == 3 &&
> @@ -409,7 +418,9 @@ int count_immovable_mem_regions(void)
> sub_table = (struct acpi_subtable_header *)table;
> if (sub_table->type == ACPI_SRAT_TYPE_MEMORY_AFFINITY) {
>
> - subtable_parse(sub_table, &num);
> + addr = subtable_parse(sub_table, &num);
> + if (addr > max_addr)
> + max_addr = addr;
>
> if (num >= MAX_NUMNODES*2) {
> debug_putstr("Too many immovable memory regions, aborting.\n");
> @@ -418,6 +429,9 @@ int count_immovable_mem_regions(void)
> }
> table += sub_table->length;
> }
> +
> + boot_params->max_addr = max_addr;
> +
> return num;
> }
> #endif /* CONFIG_RANDOMIZE_BASE && CONFIG_MEMORY_HOTREMOVE */
> --
> 2.20.1
>

2019-11-04 00:51:24

by Baoquan He

[permalink] [raw]
Subject: Re: [PATCH v4 4/4] x86/mm/KASLR: Adjust the padding size for the direct mapping.

On 11/01/19 at 09:09pm, Masayoshi Mizuma wrote:
> ---
> arch/x86/mm/kaslr.c | 65 ++++++++++++++++++++++++++++++++++-----------
> 1 file changed, 50 insertions(+), 15 deletions(-)
>
> diff --git a/arch/x86/mm/kaslr.c b/arch/x86/mm/kaslr.c
> index dc6182eec..a80eed563 100644
> --- a/arch/x86/mm/kaslr.c
> +++ b/arch/x86/mm/kaslr.c
> @@ -70,15 +70,60 @@ static inline bool kaslr_memory_enabled(void)
> return kaslr_enabled() && !IS_ENABLED(CONFIG_KASAN);
> }
>
> +/*
> + * Even though a huge virtual address space is reserved for the direct
> + * mapping of physical memory, e.g in 4-level paging mode, it's 64TB,
> + * rare system can own enough physical memory to use it up, most are
> + * even less than 1TB. So with KASLR enabled, we adapt the size of
> + * direct mapping area to the size of actual physical memory plus the
> + * configured padding CONFIG_RANDOMIZE_MEMORY_PHYSICAL_PADDING.
> + * The left part will be taken out to join memory randomization.
> + */
> +static inline unsigned long calc_direct_mapping_size(void)
> +{
> + unsigned long padding = CONFIG_RANDOMIZE_MEMORY_PHYSICAL_PADDING;
> + unsigned long size_tb, memory_tb;
> +#ifdef CONFIG_MEMORY_HOTPLUG
> + unsigned long actual, maximum, base;
> +
> + if (boot_params.max_addr) {
> + /*
> + * The padding size should set to get for kaslr_regions[].base
> + * bigger address than the maximum memory address the system can
> + * have. kaslr_regions[].base points "actual size + padding" or
> + * higher address. If "actual size + padding" points the lower
> + * address than the maximum memory size, fix the padding size.
> + */
> + actual = roundup(PFN_PHYS(max_pfn), 1UL << TB_SHIFT);
> + maximum = roundup(boot_params.max_addr, 1UL << TB_SHIFT);
> + base = actual + (padding << TB_SHIFT);
> +
> + if (maximum > base)
> + padding = (maximum - actual) >> TB_SHIFT;
> + }
> +#endif
> + memory_tb = DIV_ROUND_UP(max_pfn << PAGE_SHIFT, 1UL << TB_SHIFT) +
> + padding;

Yes, wrapping up the whole adjusting code block for the direct mapping
area into a function looks much better. This was also suggested by Ingo
when I posted UV system issue fix before, just later the UV system issue
is not seen in the current code.

However, I have a small concern about the memory_tb calculateion here.
We can treat the (actual RAM + CONFIG_RANDOMIZE_MEMORY_PHYSICAL_PADDING)
as the default memory_tb, then check if we need adjst it according to
boot_params.max_addr. Discarding the local padding variable can make
code much simpler? And it is a little confusing when mix with the
later padding concept when doing randomization, I mean the get_padding()
thing.


memory_tb = DIV_ROUND_UP(max_pfn << PAGE_SHIFT, 1UL << TB_SHIFT) +
CONFIG_RANDOMIZE_MEMORY_PHYSICAL_PADDING;

if (boot_params.max_addr) {
maximum = roundup(boot_params.max_addr, 1UL << TB_SHIFT);

if (maximum > memory_tb)
memory_tb = maximum;
}
#endif

Personal opinion. Anyway, this patch looks good to me. Thanks.

Thanks
Baoquan


> +
> + size_tb = 1 << (MAX_PHYSMEM_BITS - TB_SHIFT);
> +
> + /*
> + * Adapt physical memory region size based on available memory
> + */
> + if (memory_tb < size_tb)
> + size_tb = memory_tb;
> +
> + return size_tb;
> +}
> +
> /* Initialize base and padding for each memory region randomized with KASLR */
> void __init kernel_randomize_memory(void)
> {
> - size_t i;
> - unsigned long vaddr_start, vaddr;
> - unsigned long rand, memory_tb;
> - struct rnd_state rand_state;
> + unsigned long vaddr_start, vaddr, rand;
> unsigned long remain_entropy;
> unsigned long vmemmap_size;
> + struct rnd_state rand_state;
> + size_t i;
>
> vaddr_start = pgtable_l5_enabled() ? __PAGE_OFFSET_BASE_L5 : __PAGE_OFFSET_BASE_L4;
> vaddr = vaddr_start;
> @@ -95,20 +140,10 @@ void __init kernel_randomize_memory(void)
> if (!kaslr_memory_enabled())
> return;
>
> - kaslr_regions[0].size_tb = 1 << (MAX_PHYSMEM_BITS - TB_SHIFT);
> + kaslr_regions[0].size_tb = calc_direct_mapping_size();
> kaslr_regions[1].size_tb = VMALLOC_SIZE_TB;
>
> - /*
> - * Update Physical memory mapping to available and
> - * add padding if needed (especially for memory hotplug support).
> - */
> BUG_ON(kaslr_regions[0].base != &page_offset_base);
> - memory_tb = DIV_ROUND_UP(max_pfn << PAGE_SHIFT, 1UL << TB_SHIFT) +
> - CONFIG_RANDOMIZE_MEMORY_PHYSICAL_PADDING;
> -
> - /* Adapt phyiscal memory region size based on available memory */
> - if (memory_tb < kaslr_regions[0].size_tb)
> - kaslr_regions[0].size_tb = memory_tb;
>
> /*
> * Calculate the vmemmap region size in TBs, aligned to a TB
> --
> 2.20.1
>

2019-11-12 20:48:50

by Masayoshi Mizuma

[permalink] [raw]
Subject: Re: [PATCH v4 4/4] x86/mm/KASLR: Adjust the padding size for the direct mapping.

On Mon, Nov 04, 2019 at 08:48:25AM +0800, Baoquan He wrote:
> On 11/01/19 at 09:09pm, Masayoshi Mizuma wrote:
> > ---
> > arch/x86/mm/kaslr.c | 65 ++++++++++++++++++++++++++++++++++-----------
> > 1 file changed, 50 insertions(+), 15 deletions(-)
> >
> > diff --git a/arch/x86/mm/kaslr.c b/arch/x86/mm/kaslr.c
> > index dc6182eec..a80eed563 100644
> > --- a/arch/x86/mm/kaslr.c
> > +++ b/arch/x86/mm/kaslr.c
> > @@ -70,15 +70,60 @@ static inline bool kaslr_memory_enabled(void)
> > return kaslr_enabled() && !IS_ENABLED(CONFIG_KASAN);
> > }
> >
> > +/*
> > + * Even though a huge virtual address space is reserved for the direct
> > + * mapping of physical memory, e.g in 4-level paging mode, it's 64TB,
> > + * rare system can own enough physical memory to use it up, most are
> > + * even less than 1TB. So with KASLR enabled, we adapt the size of
> > + * direct mapping area to the size of actual physical memory plus the
> > + * configured padding CONFIG_RANDOMIZE_MEMORY_PHYSICAL_PADDING.
> > + * The left part will be taken out to join memory randomization.
> > + */
> > +static inline unsigned long calc_direct_mapping_size(void)
> > +{
> > + unsigned long padding = CONFIG_RANDOMIZE_MEMORY_PHYSICAL_PADDING;
> > + unsigned long size_tb, memory_tb;
> > +#ifdef CONFIG_MEMORY_HOTPLUG
> > + unsigned long actual, maximum, base;
> > +
> > + if (boot_params.max_addr) {
> > + /*
> > + * The padding size should set to get for kaslr_regions[].base
> > + * bigger address than the maximum memory address the system can
> > + * have. kaslr_regions[].base points "actual size + padding" or
> > + * higher address. If "actual size + padding" points the lower
> > + * address than the maximum memory size, fix the padding size.
> > + */
> > + actual = roundup(PFN_PHYS(max_pfn), 1UL << TB_SHIFT);
> > + maximum = roundup(boot_params.max_addr, 1UL << TB_SHIFT);
> > + base = actual + (padding << TB_SHIFT);
> > +
> > + if (maximum > base)
> > + padding = (maximum - actual) >> TB_SHIFT;
> > + }
> > +#endif
> > + memory_tb = DIV_ROUND_UP(max_pfn << PAGE_SHIFT, 1UL << TB_SHIFT) +
> > + padding;
>
> Yes, wrapping up the whole adjusting code block for the direct mapping
> area into a function looks much better. This was also suggested by Ingo
> when I posted UV system issue fix before, just later the UV system issue
> is not seen in the current code.
>
> However, I have a small concern about the memory_tb calculateion here.
> We can treat the (actual RAM + CONFIG_RANDOMIZE_MEMORY_PHYSICAL_PADDING)
> as the default memory_tb, then check if we need adjst it according to
> boot_params.max_addr. Discarding the local padding variable can make
> code much simpler? And it is a little confusing when mix with the
> later padding concept when doing randomization, I mean the get_padding()
> thing.
>
>
> memory_tb = DIV_ROUND_UP(max_pfn << PAGE_SHIFT, 1UL << TB_SHIFT) +
> CONFIG_RANDOMIZE_MEMORY_PHYSICAL_PADDING;
>
> if (boot_params.max_addr) {
> maximum = roundup(boot_params.max_addr, 1UL << TB_SHIFT);
>
> if (maximum > memory_tb)
> memory_tb = maximum;
> }
> #endif
>
> Personal opinion. Anyway, this patch looks good to me. Thanks.

Your suggesion makes it simpler, thanks!
So I'll modify calc_direct_mapping_size() as following.
Does it make sense?

static inline unsigned long calc_direct_mapping_size(void)
{
unsigned long size_tb, memory_tb;

memory_tb = DIV_ROUND_UP(max_pfn << PAGE_SHIFT, 1UL << TB_SHIFT) +
CONFIG_RANDOMIZE_MEMORY_PHYSICAL_PADDING;

#ifdef CONFIG_MEMORY_HOTPLUG
if (boot_params.max_addr) {
unsigned long maximum_tb;

maximum_tb = DIV_ROUND_UP(boot_params.max_addr,
1UL << TB_SHIFT);

if (maximum_tb > memory_tb)
memory_tb = maximum_tb;
}
#endif
size_tb = 1 << (MAX_PHYSMEM_BITS - TB_SHIFT);

/*
* Adapt physical memory region size based on available memory
*/
if (memory_tb < size_tb)
size_tb = memory_tb;

return size_tb;
}

Thanks,
Masa

>
> Thanks
> Baoquan
>
>
> > +
> > + size_tb = 1 << (MAX_PHYSMEM_BITS - TB_SHIFT);
> > +
> > + /*
> > + * Adapt physical memory region size based on available memory
> > + */
> > + if (memory_tb < size_tb)
> > + size_tb = memory_tb;
> > +
> > + return size_tb;
> > +}
> > +
> > /* Initialize base and padding for each memory region randomized with KASLR */
> > void __init kernel_randomize_memory(void)
> > {
> > - size_t i;
> > - unsigned long vaddr_start, vaddr;
> > - unsigned long rand, memory_tb;
> > - struct rnd_state rand_state;
> > + unsigned long vaddr_start, vaddr, rand;
> > unsigned long remain_entropy;
> > unsigned long vmemmap_size;
> > + struct rnd_state rand_state;
> > + size_t i;
> >
> > vaddr_start = pgtable_l5_enabled() ? __PAGE_OFFSET_BASE_L5 : __PAGE_OFFSET_BASE_L4;
> > vaddr = vaddr_start;
> > @@ -95,20 +140,10 @@ void __init kernel_randomize_memory(void)
> > if (!kaslr_memory_enabled())
> > return;
> >
> > - kaslr_regions[0].size_tb = 1 << (MAX_PHYSMEM_BITS - TB_SHIFT);
> > + kaslr_regions[0].size_tb = calc_direct_mapping_size();
> > kaslr_regions[1].size_tb = VMALLOC_SIZE_TB;
> >
> > - /*
> > - * Update Physical memory mapping to available and
> > - * add padding if needed (especially for memory hotplug support).
> > - */
> > BUG_ON(kaslr_regions[0].base != &page_offset_base);
> > - memory_tb = DIV_ROUND_UP(max_pfn << PAGE_SHIFT, 1UL << TB_SHIFT) +
> > - CONFIG_RANDOMIZE_MEMORY_PHYSICAL_PADDING;
> > -
> > - /* Adapt phyiscal memory region size based on available memory */
> > - if (memory_tb < kaslr_regions[0].size_tb)
> > - kaslr_regions[0].size_tb = memory_tb;
> >
> > /*
> > * Calculate the vmemmap region size in TBs, aligned to a TB
> > --
> > 2.20.1
> >
>

2019-11-14 08:17:43

by Baoquan He

[permalink] [raw]
Subject: Re: [PATCH v4 4/4] x86/mm/KASLR: Adjust the padding size for the direct mapping.

On 11/12/19 at 03:47pm, Masayoshi Mizuma wrote:
> Your suggesion makes it simpler, thanks!
> So I'll modify calc_direct_mapping_size() as following.
> Does it make sense?

Yeah, it looks good to me. Thanks.

>
> static inline unsigned long calc_direct_mapping_size(void)
> {
> unsigned long size_tb, memory_tb;
>
> memory_tb = DIV_ROUND_UP(max_pfn << PAGE_SHIFT, 1UL << TB_SHIFT) +
> CONFIG_RANDOMIZE_MEMORY_PHYSICAL_PADDING;
>
> #ifdef CONFIG_MEMORY_HOTPLUG
> if (boot_params.max_addr) {
> unsigned long maximum_tb;
>
> maximum_tb = DIV_ROUND_UP(boot_params.max_addr,
> 1UL << TB_SHIFT);
>
> if (maximum_tb > memory_tb)
> memory_tb = maximum_tb;
> }
> #endif
> size_tb = 1 << (MAX_PHYSMEM_BITS - TB_SHIFT);
>
> /*
> * Adapt physical memory region size based on available memory
> */
> if (memory_tb < size_tb)
> size_tb = memory_tb;
>
> return size_tb;
> }