2019-06-14 21:16:01

by Tom Lendacky

Subject: [PATCH v2 1/2] x86/mm: Identify the end of the kernel area to be reserved

The memory occupied by the kernel is reserved using memblock_reserve()
in setup_arch(). Currently, the area is from symbols _text to __bss_stop.
Everything after __bss_stop must be specifically reserved otherwise it
is discarded. This is not clearly documented.

Add a new symbol, __end_of_kernel_reserve, that more readily identifies
what is reserved, along with comments that indicate what is reserved,
what is discarded and what needs to be done to prevent a section from
being discarded.

Cc: Baoquan He <[email protected]>
Cc: Lianbo Jiang <[email protected]>
Signed-off-by: Tom Lendacky <[email protected]>
---
arch/x86/include/asm/sections.h | 2 ++
arch/x86/kernel/setup.c | 8 +++++++-
arch/x86/kernel/vmlinux.lds.S | 9 ++++++++-
3 files changed, 17 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/sections.h b/arch/x86/include/asm/sections.h
index 8ea1cfdbeabc..71b32f2570ab 100644
--- a/arch/x86/include/asm/sections.h
+++ b/arch/x86/include/asm/sections.h
@@ -13,4 +13,6 @@ extern char __end_rodata_aligned[];
extern char __end_rodata_hpage_align[];
#endif

+extern char __end_of_kernel_reserve[];
+
#endif /* _ASM_X86_SECTIONS_H */
diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
index 08a5f4a131f5..32eb70625b3b 100644
--- a/arch/x86/kernel/setup.c
+++ b/arch/x86/kernel/setup.c
@@ -827,8 +827,14 @@ dump_kernel_offset(struct notifier_block *self, unsigned long v, void *p)

void __init setup_arch(char **cmdline_p)
{
+ /*
+ * Reserve the memory occupied by the kernel between _text and
+ * __end_of_kernel_reserve symbols. Any kernel sections after the
+ * __end_of_kernel_reserve symbol must be explicitly reserved with a
+ * separate memblock_reserve() or it will be discarded.
+ */
memblock_reserve(__pa_symbol(_text),
- (unsigned long)__bss_stop - (unsigned long)_text);
+ (unsigned long)__end_of_kernel_reserve - (unsigned long)_text);

/*
* Make sure page 0 is always reserved because on systems with
diff --git a/arch/x86/kernel/vmlinux.lds.S b/arch/x86/kernel/vmlinux.lds.S
index 0850b5149345..ca2252ca6ad7 100644
--- a/arch/x86/kernel/vmlinux.lds.S
+++ b/arch/x86/kernel/vmlinux.lds.S
@@ -368,6 +368,14 @@ SECTIONS
__bss_stop = .;
}

+ /*
+ * The memory occupied from _text to here, __end_of_kernel_reserve, is
+ * automatically reserved in setup_arch(). Anything after here must be
+ * explicitly reserved using memblock_reserve() or it will be discarded
+ * and treated as available memory.
+ */
+ __end_of_kernel_reserve = .;
+
. = ALIGN(PAGE_SIZE);
.brk : AT(ADDR(.brk) - LOAD_OFFSET) {
__brk_base = .;
@@ -382,7 +390,6 @@ SECTIONS
STABS_DEBUG
DWARF_DEBUG

- /* Sections to be discarded */
DISCARDS
/DISCARD/ : {
*(.eh_frame)
--
2.17.1
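
To illustrate the rule that the new comments describe, here is a minimal
user-space sketch. It is not kernel code: the toy_* helpers, the region
table and the addresses are all hypothetical stand-ins, and the real
memblock_reserve() does considerably more. The sketch only models the
bookkeeping: one reservation covers _text up to __end_of_kernel_reserve,
and anything placed after that symbol (here a pretend .brk area) stays
unreserved until it gets its own reservation.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for the reserved-region list (NOT the real memblock API). */
struct toy_region {
	unsigned long base;
	unsigned long size;
};

#define TOY_MAX_REGIONS 8

struct toy_region toy_reserved[TOY_MAX_REGIONS];
int toy_nr_reserved;

/* Made-up addresses standing in for the linker symbols in the patch. */
static const unsigned long toy_text = 0x1000000UL;
static const unsigned long toy_end_of_kernel_reserve = 0x1800000UL;
static const unsigned long toy_brk_base = 0x1800000UL;
static const unsigned long toy_brk_limit = 0x1810000UL;

/* Hypothetical helper: record [base, base + size) as reserved. */
void toy_memblock_reserve(unsigned long base, unsigned long size)
{
	assert(toy_nr_reserved < TOY_MAX_REGIONS);
	toy_reserved[toy_nr_reserved].base = base;
	toy_reserved[toy_nr_reserved].size = size;
	toy_nr_reserved++;
}

/* Hypothetical helper: true if addr lies inside any recorded region. */
bool toy_is_reserved(unsigned long addr)
{
	for (int i = 0; i < toy_nr_reserved; i++) {
		if (addr >= toy_reserved[i].base &&
		    addr - toy_reserved[i].base < toy_reserved[i].size)
			return true;
	}
	return false;
}

/* Models the single reservation made early in setup_arch(). */
void toy_setup_arch_reserve(void)
{
	toy_memblock_reserve(toy_text,
			     toy_end_of_kernel_reserve - toy_text);
}

/* Models a separate, explicit reservation for an area past the marker. */
void toy_reserve_brk(void)
{
	toy_memblock_reserve(toy_brk_base, toy_brk_limit - toy_brk_base);
}
```

In this model, calling toy_setup_arch_reserve() alone leaves
toy_brk_base unreserved; it only becomes reserved once toy_reserve_brk()
runs. That mirrors the intent of the comment in vmlinux.lds.S: whatever
follows __end_of_kernel_reserve needs its own memblock_reserve() call.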


2019-06-14 22:08:43

by Dave Hansen

Subject: Re: [PATCH v2 1/2] x86/mm: Identify the end of the kernel area to be reserved

On 6/14/19 2:15 PM, Lendacky, Thomas wrote:
> + /*
> + * The memory occupied from _text to here, __end_of_kernel_reserve, is
> + * automatically reserved in setup_arch(). Anything after here must be
> + * explicitly reserved using memblock_reserve() or it will be discarded
> + * and treated as available memory.
> + */
> + __end_of_kernel_reserve = .;

This new stuff looks really nice to me, including the comments. Thanks
for doing those!

For both patches:

Reviewed-by: Dave Hansen <[email protected]>

2019-06-16 12:04:29

by Lianbo Jiang

Subject: Re: [PATCH v2 1/2] x86/mm: Identify the end of the kernel area to be reserved

After applying the patch series (v2), both the kexec'd kernel and the kdump
kernel boot successfully.

Thanks.

Tested-by: Lianbo Jiang <[email protected]>

On 2019-06-15 05:15, Lendacky, Thomas wrote:
> The memory occupied by the kernel is reserved using memblock_reserve()
> in setup_arch(). Currently, the area is from symbols _text to __bss_stop.
> Everything after __bss_stop must be specifically reserved otherwise it
> is discarded. This is not clearly documented.
>
> Add a new symbol, __end_of_kernel_reserve, that more readily identifies
> what is reserved, along with comments that indicate what is reserved,
> what is discarded and what needs to be done to prevent a section from
> being discarded.
>
> Cc: Baoquan He <[email protected]>
> Cc: Lianbo Jiang <[email protected]>
> Signed-off-by: Tom Lendacky <[email protected]>
> ---
> arch/x86/include/asm/sections.h | 2 ++
> arch/x86/kernel/setup.c | 8 +++++++-
> arch/x86/kernel/vmlinux.lds.S | 9 ++++++++-
> 3 files changed, 17 insertions(+), 2 deletions(-)
>
> diff --git a/arch/x86/include/asm/sections.h b/arch/x86/include/asm/sections.h
> index 8ea1cfdbeabc..71b32f2570ab 100644
> --- a/arch/x86/include/asm/sections.h
> +++ b/arch/x86/include/asm/sections.h
> @@ -13,4 +13,6 @@ extern char __end_rodata_aligned[];
> extern char __end_rodata_hpage_align[];
> #endif
>
> +extern char __end_of_kernel_reserve[];
> +
> #endif /* _ASM_X86_SECTIONS_H */
> diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
> index 08a5f4a131f5..32eb70625b3b 100644
> --- a/arch/x86/kernel/setup.c
> +++ b/arch/x86/kernel/setup.c
> @@ -827,8 +827,14 @@ dump_kernel_offset(struct notifier_block *self, unsigned long v, void *p)
>
> void __init setup_arch(char **cmdline_p)
> {
> + /*
> + * Reserve the memory occupied by the kernel between _text and
> + * __end_of_kernel_reserve symbols. Any kernel sections after the
> + * __end_of_kernel_reserve symbol must be explicitly reserved with a
> + * separate memblock_reserve() or it will be discarded.
> + */
> memblock_reserve(__pa_symbol(_text),
> - (unsigned long)__bss_stop - (unsigned long)_text);
> + (unsigned long)__end_of_kernel_reserve - (unsigned long)_text);
>
> /*
> * Make sure page 0 is always reserved because on systems with
> diff --git a/arch/x86/kernel/vmlinux.lds.S b/arch/x86/kernel/vmlinux.lds.S
> index 0850b5149345..ca2252ca6ad7 100644
> --- a/arch/x86/kernel/vmlinux.lds.S
> +++ b/arch/x86/kernel/vmlinux.lds.S
> @@ -368,6 +368,14 @@ SECTIONS
> __bss_stop = .;
> }
>
> + /*
> + * The memory occupied from _text to here, __end_of_kernel_reserve, is
> + * automatically reserved in setup_arch(). Anything after here must be
> + * explicitly reserved using memblock_reserve() or it will be discarded
> + * and treated as available memory.
> + */
> + __end_of_kernel_reserve = .;
> +
> . = ALIGN(PAGE_SIZE);
> .brk : AT(ADDR(.brk) - LOAD_OFFSET) {
> __brk_base = .;
> @@ -382,7 +390,6 @@ SECTIONS
> STABS_DEBUG
> DWARF_DEBUG
>
> - /* Sections to be discarded */
> DISCARDS
> /DISCARD/ : {
> *(.eh_frame)
>

2019-06-17 01:55:05

by Baoquan He

Subject: Re: [PATCH v2 1/2] x86/mm: Identify the end of the kernel area to be reserved

On 06/14/19 at 09:15pm, Lendacky, Thomas wrote:
> The memory occupied by the kernel is reserved using memblock_reserve()
> in setup_arch(). Currently, the area is from symbols _text to __bss_stop.
> Everything after __bss_stop must be specifically reserved otherwise it
> is discarded. This is not clearly documented.
>
> Add a new symbol, __end_of_kernel_reserve, that more readily identifies
> what is reserved, along with comments that indicate what is reserved,
> what is discarded and what needs to be done to prevent a section from
> being discarded.
>
> Cc: Baoquan He <[email protected]>
> Cc: Lianbo Jiang <[email protected]>
> Signed-off-by: Tom Lendacky <[email protected]>
> ---
> arch/x86/include/asm/sections.h | 2 ++
> arch/x86/kernel/setup.c | 8 +++++++-
> arch/x86/kernel/vmlinux.lds.S | 9 ++++++++-
> 3 files changed, 17 insertions(+), 2 deletions(-)
>
> diff --git a/arch/x86/include/asm/sections.h b/arch/x86/include/asm/sections.h
> index 8ea1cfdbeabc..71b32f2570ab 100644
> --- a/arch/x86/include/asm/sections.h
> +++ b/arch/x86/include/asm/sections.h
> @@ -13,4 +13,6 @@ extern char __end_rodata_aligned[];
> extern char __end_rodata_hpage_align[];
> #endif
>
> +extern char __end_of_kernel_reserve[];
> +
> #endif /* _ASM_X86_SECTIONS_H */
> diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
> index 08a5f4a131f5..32eb70625b3b 100644
> --- a/arch/x86/kernel/setup.c
> +++ b/arch/x86/kernel/setup.c
> @@ -827,8 +827,14 @@ dump_kernel_offset(struct notifier_block *self, unsigned long v, void *p)
>
> void __init setup_arch(char **cmdline_p)
> {
> + /*
> + * Reserve the memory occupied by the kernel between _text and
> + * __end_of_kernel_reserve symbols. Any kernel sections after the
> + * __end_of_kernel_reserve symbol must be explicitly reserved with a
> + * separate memblock_reserve() or it will be discarded.
> + */
> memblock_reserve(__pa_symbol(_text),
> - (unsigned long)__bss_stop - (unsigned long)_text);
> + (unsigned long)__end_of_kernel_reserve - (unsigned long)_text);
>
> /*
> * Make sure page 0 is always reserved because on systems with
> diff --git a/arch/x86/kernel/vmlinux.lds.S b/arch/x86/kernel/vmlinux.lds.S
> index 0850b5149345..ca2252ca6ad7 100644
> --- a/arch/x86/kernel/vmlinux.lds.S
> +++ b/arch/x86/kernel/vmlinux.lds.S
> @@ -368,6 +368,14 @@ SECTIONS
> __bss_stop = .;
> }
>
> + /*
> + * The memory occupied from _text to here, __end_of_kernel_reserve, is
> + * automatically reserved in setup_arch(). Anything after here must be
> + * explicitly reserved using memblock_reserve() or it will be discarded
> + * and treated as available memory.
> + */
> + __end_of_kernel_reserve = .;
> +
> . = ALIGN(PAGE_SIZE);
> .brk : AT(ADDR(.brk) - LOAD_OFFSET) {
> __brk_base = .;
> @@ -382,7 +390,6 @@ SECTIONS
> STABS_DEBUG
> DWARF_DEBUG
>
> - /* Sections to be discarded */
> DISCARDS
> /DISCARD/ : {
> *(.eh_frame)

Looks good to me, thanks. To the series,

Reviewed-by: Baoquan He <[email protected]>

Thanks
Baoquan

2019-06-17 10:48:16

by Borislav Petkov

Subject: Re: [PATCH v2 1/2] x86/mm: Identify the end of the kernel area to be reserved

On Fri, Jun 14, 2019 at 09:15:18PM +0000, Lendacky, Thomas wrote:
> The memory occupied by the kernel is reserved using memblock_reserve()
> in setup_arch(). Currently, the area is from symbols _text to __bss_stop.
> Everything after __bss_stop must be specifically reserved otherwise it
> is discarded. This is not clearly documented.

Hmm, so I see this in arch/x86/kernel/vmlinux.lds.S after _end:

_end = .;

STABS_DEBUG
DWARF_DEBUG

/* Sections to be discarded */
DISCARDS
/DISCARD/ : {
*(.eh_frame)
}

and over DISCARDS:

/*
* Default discarded sections.
*
* Some archs want to discard exit text/data at runtime rather than
* link time due to cross-section references such as alt instructions,
* bug table, eh_frame, etc. DISCARDS must be the last of output
* section definitions so that such archs put those in earlier section
* definitions.
*/
#define DISCARDS

That sounds like it is documented to me, or do you mean something else?

> Add a new symbol, __end_of_kernel_reserve, that more readily identifies
> what is reserved, along with comments that indicate what is reserved,
> what is discarded and what needs to be done to prevent a section from
> being discarded.
>
> Cc: Baoquan He <[email protected]>
> Cc: Lianbo Jiang <[email protected]>
> Signed-off-by: Tom Lendacky <[email protected]>
> ---
> arch/x86/include/asm/sections.h | 2 ++
> arch/x86/kernel/setup.c | 8 +++++++-
> arch/x86/kernel/vmlinux.lds.S | 9 ++++++++-
> 3 files changed, 17 insertions(+), 2 deletions(-)
>
> diff --git a/arch/x86/include/asm/sections.h b/arch/x86/include/asm/sections.h
> index 8ea1cfdbeabc..71b32f2570ab 100644
> --- a/arch/x86/include/asm/sections.h
> +++ b/arch/x86/include/asm/sections.h
> @@ -13,4 +13,6 @@ extern char __end_rodata_aligned[];
> extern char __end_rodata_hpage_align[];
> #endif
>
> +extern char __end_of_kernel_reserve[];
> +
> #endif /* _ASM_X86_SECTIONS_H */
> diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
> index 08a5f4a131f5..32eb70625b3b 100644
> --- a/arch/x86/kernel/setup.c
> +++ b/arch/x86/kernel/setup.c
> @@ -827,8 +827,14 @@ dump_kernel_offset(struct notifier_block *self, unsigned long v, void *p)
>
> void __init setup_arch(char **cmdline_p)
> {
> + /*
> + * Reserve the memory occupied by the kernel between _text and
> + * __end_of_kernel_reserve symbols. Any kernel sections after the
> + * __end_of_kernel_reserve symbol must be explicitly reserved with a
> + * separate memblock_reserve() or it will be discarded.

s/it/they/

> + */
> memblock_reserve(__pa_symbol(_text),
> - (unsigned long)__bss_stop - (unsigned long)_text);
> + (unsigned long)__end_of_kernel_reserve - (unsigned long)_text);
>
> /*
> * Make sure page 0 is always reserved because on systems with
> diff --git a/arch/x86/kernel/vmlinux.lds.S b/arch/x86/kernel/vmlinux.lds.S
> index 0850b5149345..ca2252ca6ad7 100644
> --- a/arch/x86/kernel/vmlinux.lds.S
> +++ b/arch/x86/kernel/vmlinux.lds.S
> @@ -368,6 +368,14 @@ SECTIONS
> __bss_stop = .;
> }
>
> + /*
> + * The memory occupied from _text to here, __end_of_kernel_reserve, is
> + * automatically reserved in setup_arch(). Anything after here must be
> + * explicitly reserved using memblock_reserve() or it will be discarded
> + * and treated as available memory.
> + */
> + __end_of_kernel_reserve = .;
> +
> . = ALIGN(PAGE_SIZE);
> .brk : AT(ADDR(.brk) - LOAD_OFFSET) {
> __brk_base = .;
> @@ -382,7 +390,6 @@ SECTIONS
> STABS_DEBUG
> DWARF_DEBUG
>
> - /* Sections to be discarded */

Huh?

They're called DISCARD* ...

> DISCARDS
> /DISCARD/ : {
> *(.eh_frame)

--
Regards/Gruss,
Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.

2019-06-18 01:43:43

by Tom Lendacky

Subject: Re: [PATCH v2 1/2] x86/mm: Identify the end of the kernel area to be reserved



On 6/17/19 5:47 AM, Borislav Petkov wrote:
> On Fri, Jun 14, 2019 at 09:15:18PM +0000, Lendacky, Thomas wrote:
>> The memory occupied by the kernel is reserved using memblock_reserve()
>> in setup_arch(). Currently, the area is from symbols _text to __bss_stop.
>> Everything after __bss_stop must be specifically reserved otherwise it
>> is discarded. This is not clearly documented.
>
> Hmm, so I see this in arch/x86/kernel/vmlinux.lds.S after _end:
>
> _end = .;
>
> STABS_DEBUG
> DWARF_DEBUG
>
> /* Sections to be discarded */
> DISCARDS
> /DISCARD/ : {
> *(.eh_frame)
> }
>
> and over DISCARDS:
>
> /*
> * Default discarded sections.
> *
> * Some archs want to discard exit text/data at runtime rather than
> * link time due to cross-section references such as alt instructions,
> * bug table, eh_frame, etc. DISCARDS must be the last of output
> * section definitions so that such archs put those in earlier section
> * definitions.
> */
> #define DISCARDS
>
> That sounds like it is documented to me, or do you mean something else?

Yes and no... it doesn't say how it is done, namely through the use of
memblock_reserve() calls and when and where those occur.

>
>> Add a new symbol, __end_of_kernel_reserve, that more readily identifies
>> what is reserved, along with comments that indicate what is reserved,
>> what is discarded and what needs to be done to prevent a section from
>> being discarded.
>>
>> Cc: Baoquan He <[email protected]>
>> Cc: Lianbo Jiang <[email protected]>
>> Signed-off-by: Tom Lendacky <[email protected]>
>> ---
>> arch/x86/include/asm/sections.h | 2 ++
>> arch/x86/kernel/setup.c | 8 +++++++-
>> arch/x86/kernel/vmlinux.lds.S | 9 ++++++++-
>> 3 files changed, 17 insertions(+), 2 deletions(-)
>>
>> diff --git a/arch/x86/include/asm/sections.h b/arch/x86/include/asm/sections.h
>> index 8ea1cfdbeabc..71b32f2570ab 100644
>> --- a/arch/x86/include/asm/sections.h
>> +++ b/arch/x86/include/asm/sections.h
>> @@ -13,4 +13,6 @@ extern char __end_rodata_aligned[];
>> extern char __end_rodata_hpage_align[];
>> #endif
>>
>> +extern char __end_of_kernel_reserve[];
>> +
>> #endif /* _ASM_X86_SECTIONS_H */
>> diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
>> index 08a5f4a131f5..32eb70625b3b 100644
>> --- a/arch/x86/kernel/setup.c
>> +++ b/arch/x86/kernel/setup.c
>> @@ -827,8 +827,14 @@ dump_kernel_offset(struct notifier_block *self, unsigned long v, void *p)
>>
>> void __init setup_arch(char **cmdline_p)
>> {
>> + /*
>> + * Reserve the memory occupied by the kernel between _text and
>> + * __end_of_kernel_reserve symbols. Any kernel sections after the
>> + * __end_of_kernel_reserve symbol must be explicitly reserved with a
>> + * separate memblock_reserve() or it will be discarded.
>
> s/it/they/
>
>> + */
>> memblock_reserve(__pa_symbol(_text),
>> - (unsigned long)__bss_stop - (unsigned long)_text);
>> + (unsigned long)__end_of_kernel_reserve - (unsigned long)_text);
>>
>> /*
>> * Make sure page 0 is always reserved because on systems with
>> diff --git a/arch/x86/kernel/vmlinux.lds.S b/arch/x86/kernel/vmlinux.lds.S
>> index 0850b5149345..ca2252ca6ad7 100644
>> --- a/arch/x86/kernel/vmlinux.lds.S
>> +++ b/arch/x86/kernel/vmlinux.lds.S
>> @@ -368,6 +368,14 @@ SECTIONS
>> __bss_stop = .;
>> }
>>
>> + /*
>> + * The memory occupied from _text to here, __end_of_kernel_reserve, is
>> + * automatically reserved in setup_arch(). Anything after here must be
>> + * explicitly reserved using memblock_reserve() or it will be discarded
>> + * and treated as available memory.
>> + */
>> + __end_of_kernel_reserve = .;
>> +
>> . = ALIGN(PAGE_SIZE);
>> .brk : AT(ADDR(.brk) - LOAD_OFFSET) {
>> __brk_base = .;
>> @@ -382,7 +390,6 @@ SECTIONS
>> STABS_DEBUG
>> DWARF_DEBUG
>>
>> - /* Sections to be discarded */
>
> Huh?
>
> They're called DISCARD* ...

The comment above is more explicit about what will be discarded and
how not to have it discarded, so I removed this comment.

Thanks,
Tom

>
>> DISCARDS
>> /DISCARD/ : {
>> *(.eh_frame)
>

2019-06-18 09:39:25

by Borislav Petkov

Subject: Re: [PATCH v2 1/2] x86/mm: Identify the end of the kernel area to be reserved

On Tue, Jun 18, 2019 at 01:43:00AM +0000, Lendacky, Thomas wrote:
> Yes and no... it doesn't say how it is done, namely through the use of
> memblock_reserve() calls and when and where those occur.

Ah ok, so you found that out and documented it now. Good.

:-)

--
Regards/Gruss,
Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.