2008-06-16 04:24:55

by Dave Young

[permalink] [raw]
Subject: [PATCH] kernel parameter vmalloc size fix

booting kernel with vmalloc=[any size<=16m] will oops.

It's due to the vm area hole.

In include/asm-x86/pgtable_32.h:
#define VMALLOC_OFFSET (8 * 1024 * 1024)
#define VMALLOC_START (((unsigned long)high_memory + 2 * VMALLOC_OFFSET - 1) \
& ~(VMALLOC_OFFSET - 1))

BUG_ON in arch/x86/mm/init_32.c will be triggered:
BUG_ON((unsigned long)high_memory > VMALLOC_START);

Fixed by return -EINVAL for invalid parameter

Signed-off-by: Dave Young <[email protected]>

Documentation/kernel-parameters.txt | 3 ++-
arch/x86/kernel/setup_32.c | 11 +++++++++--
2 files changed, 11 insertions(+), 3 deletions(-)

diff -upr linux/Documentation/kernel-parameters.txt linux.new/Documentation/kernel-parameters.txt
--- linux/Documentation/kernel-parameters.txt 2008-06-16 11:30:29.000000000 +0800
+++ linux.new/Documentation/kernel-parameters.txt 2008-06-16 11:43:01.000000000 +0800
@@ -2139,7 +2139,8 @@ and is between 256 and 4096 characters.
size of <nn>. This can be used to increase the
minimum size (128MB on x86). It can also be used to
decrease the size and leave more room for directly
- mapped kernel RAM.
+ mapped kernel RAM. Note that the size must be bigger
+ than 16M now on i386 due to the memory hole.

vmhalt= [KNL,S390] Perform z/VM CP command after system halt.
Format: <command>
diff -upr linux/arch/x86/kernel/setup_32.c linux.new/arch/x86/kernel/setup_32.c
--- linux/arch/x86/kernel/setup_32.c 2008-06-16 11:28:51.000000000 +0800
+++ linux.new/arch/x86/kernel/setup_32.c 2008-06-16 11:43:35.000000000 +0800
@@ -305,15 +305,22 @@ early_param("highmem", parse_highmem);

/*
* vmalloc=size forces the vmalloc area to be exactly 'size'
- * bytes. This can be used to increase (or decrease) the
+ * bytes. Now size must be bigger than 16m due to the memory hole.
+ * This can be used to increase (or decrease) the
* vmalloc area - the default is 128m.
*/
static int __init parse_vmalloc(char *arg)
{
+ unsigned int v;
+
if (!arg)
return -EINVAL;

- __VMALLOC_RESERVE = memparse(arg, &arg);
+ v = memparse(arg, &arg);
+ if (v <= 2 * VMALLOC_OFFSET)
+ return -EINVAL;
+ __VMALLOC_RESERVE = v;
+
return 0;
}
early_param("vmalloc", parse_vmalloc);


2008-06-16 08:01:58

by Ingo Molnar

[permalink] [raw]
Subject: Re: [PATCH] kernel parameter vmalloc size fix


* Dave Young <[email protected]> wrote:

> booting kernel with vmalloc=[any size<=16m] will oops.
>
> It's due to the vm area hole.
>
> In include/asm-x86/pgtable_32.h:
> #define VMALLOC_OFFSET (8 * 1024 * 1024)
> #define VMALLOC_START (((unsigned long)high_memory + 2 * VMALLOC_OFFSET - 1) \
> & ~(VMALLOC_OFFSET - 1))
>
> BUG_ON in arch/x86/mm/init_32.c will be triggered:
> BUG_ON((unsigned long)high_memory > VMALLOC_START);
>
> Fixed by return -EINVAL for invalid parameter

hm. Why dont we instead add the size of the hole to the
__VMALLOC_RESERVE value instead? There's nothing inherently bad about
using vmalloc=16m. The VM area hole is really a kernel-internal
abstraction that should not be visible in the usage of the parameter.

Ingo

2008-06-16 08:09:16

by Dave Young

[permalink] [raw]
Subject: Re: [PATCH] kernel parameter vmalloc size fix

On Mon, Jun 16, 2008 at 4:01 PM, Ingo Molnar <[email protected]> wrote:
>
> * Dave Young <[email protected]> wrote:
>
>> booting kernel with vmalloc=[any size<=16m] will oops.
>>
>> It's due to the vm area hole.
>>
>> In include/asm-x86/pgtable_32.h:
>> #define VMALLOC_OFFSET (8 * 1024 * 1024)
>> #define VMALLOC_START (((unsigned long)high_memory + 2 * VMALLOC_OFFSET - 1) \
>> & ~(VMALLOC_OFFSET - 1))
>>
>> BUG_ON in arch/x86/mm/init_32.c will be triggered:
>> BUG_ON((unsigned long)high_memory > VMALLOC_START);
>>
>> Fixed by return -EINVAL for invalid parameter
>
> hm. Why dont we instead add the size of the hole to the
> __VMALLOC_RESERVE value instead? There's nothing inherently bad about
> using vmalloc=16m. The VM area hole is really a kernel-internal
> abstraction that should not be visible in the usage of the parameter.

Good suggestion, thanks. I will rewrite the patch and send.

--
Regards
dave

2008-06-16 08:52:26

by Dave Young

[permalink] [raw]
Subject: Re: [PATCH] kernel parameter vmalloc size fix

On Mon, Jun 16, 2008 at 4:08 PM, Dave Young <[email protected]> wrote:
> On Mon, Jun 16, 2008 at 4:01 PM, Ingo Molnar <[email protected]> wrote:
>>
>> * Dave Young <[email protected]> wrote:
>>
>>> booting kernel with vmalloc=[any size<=16m] will oops.
>>>
>>> It's due to the vm area hole.
>>>
>>> In include/asm-x86/pgtable_32.h:
>>> #define VMALLOC_OFFSET (8 * 1024 * 1024)
>>> #define VMALLOC_START (((unsigned long)high_memory + 2 * VMALLOC_OFFSET - 1) \
>>> & ~(VMALLOC_OFFSET - 1))
>>>
>>> BUG_ON in arch/x86/mm/init_32.c will be triggered:
>>> BUG_ON((unsigned long)high_memory > VMALLOC_START);
>>>
>>> Fixed by return -EINVAL for invalid parameter
>>
>> hm. Why dont we instead add the size of the hole to the
>> __VMALLOC_RESERVE value instead? There's nothing inherently bad about
>> using vmalloc=16m. The VM area hole is really a kernel-internal
>> abstraction that should not be visible in the usage of the parameter.

I built with:
__VMALLOC_RESERVE = memparse(arg, &arg) + 2 * VMALLOC_OFFSET;

But it doesn't work, still oops at ie. vmalloc=4M (BUT vmalloc=8M is ok)
I can't figure out what's wrong with it.

Regards
dave

2008-06-16 15:45:26

by H. Peter Anvin

[permalink] [raw]
Subject: Re: [PATCH] kernel parameter vmalloc size fix

Ingo Molnar wrote:
>
> hm. Why dont we instead add the size of the hole to the
> __VMALLOC_RESERVE value instead? There's nothing inherently bad about
> using vmalloc=16m. The VM area hole is really a kernel-internal
> abstraction that should not be visible in the usage of the parameter.
>

Well, the question is are we taking it away from RAM or away from
vmalloc... there aren't really any other alternatives.

-hpa

2008-06-24 05:50:04

by Dave Young

[permalink] [raw]
Subject: Re: [PATCH] kernel parameter vmalloc size fix

On Mon, Jun 16, 2008 at 4:01 PM, Ingo Molnar <[email protected]> wrote:
>
> * Dave Young <[email protected]> wrote:
>
>> booting kernel with vmalloc=[any size<=16m] will oops.
>>
>> It's due to the vm area hole.
>>
>> In include/asm-x86/pgtable_32.h:
>> #define VMALLOC_OFFSET (8 * 1024 * 1024)
>> #define VMALLOC_START (((unsigned long)high_memory + 2 * VMALLOC_OFFSET - 1) \
>> & ~(VMALLOC_OFFSET - 1))
>>
>> BUG_ON in arch/x86/mm/init_32.c will be triggered:
>> BUG_ON((unsigned long)high_memory > VMALLOC_START);
>>
>> Fixed by return -EINVAL for invalid parameter
>
> hm. Why dont we instead add the size of the hole to the
> __VMALLOC_RESERVE value instead? There's nothing inherently bad about
> using vmalloc=16m. The VM area hole is really a kernel-internal
> abstraction that should not be visible in the usage of the parameter.

I do some test about this last weekend, there's some questions, could
you help to fix it?

1. MAXMEM :
(-__PAGE_OFFSET - __VMALLOC_RESERVE).
The space after VMALLOC_END is included as well, seting it to
(VMALLOC_END - PAGE_OFFSET - __VMALLOC_RESERVE), is it right?

2. VMALLOC_OFFSET is not considered in __VMALLOC_RESERVE
Should fixed by adding VMALLOC_OFFSET to it.

3. VMALLOC_START :
(((unsigned long)high_memory + 2 * VMALLOC_OFFSET - 1) & ~(VMALLOC_OFFSET - 1))
So it's not always 8M, bigger than 8M possible.
Set it to ((unsigned long)high_memory + VMALLOC_OFFSET), is it right?

Attached the proposed patch. please give some advice.

Regards
dave


Attachments:
(No filename) (1.51 kB)
diff.vmalloc (1.90 kB)
Download all attachments

2008-06-26 12:15:00

by Ingo Molnar

[permalink] [raw]
Subject: Re: [PATCH] kernel parameter vmalloc size fix


* Dave Young <[email protected]> wrote:

> I do some test about this last weekend, there's some questions, could
> you help to fix it?
>
> 1. MAXMEM :
> (-__PAGE_OFFSET - __VMALLOC_RESERVE).
> The space after VMALLOC_END is included as well, seting it to
> (VMALLOC_END - PAGE_OFFSET - __VMALLOC_RESERVE), is it right?
>
> 2. VMALLOC_OFFSET is not considered in __VMALLOC_RESERVE
> Should fixed by adding VMALLOC_OFFSET to it.
>
> 3. VMALLOC_START :
> (((unsigned long)high_memory + 2 * VMALLOC_OFFSET - 1) & ~(VMALLOC_OFFSET - 1))
> So it's not always 8M, bigger than 8M possible.
> Set it to ((unsigned long)high_memory + VMALLOC_OFFSET), is it right?
>
> Attached the proposed patch. please give some advice.

i've ported it to tip/master, see the patch below. Yinghai, what do you
think about this change?

Ingo

---
arch/x86/mm/pgtable_32.c | 3 ++-
include/asm-x86/page_32.h | 1 -
include/asm-x86/pgtable_32.h | 5 +++--
3 files changed, 5 insertions(+), 4 deletions(-)

Index: tip/arch/x86/mm/pgtable_32.c
===================================================================
--- tip.orig/arch/x86/mm/pgtable_32.c
+++ tip/arch/x86/mm/pgtable_32.c
@@ -171,7 +171,8 @@ static int __init parse_vmalloc(char *ar
if (!arg)
return -EINVAL;

- __VMALLOC_RESERVE = memparse(arg, &arg);
+ /* Add VMALLOC_OFFSET to the parsed value due to vm area guard hole*/
+ __VMALLOC_RESERVE = memparse(arg, &arg) + VMALLOC_OFFSET;
return 0;
}
early_param("vmalloc", parse_vmalloc);
Index: tip/include/asm-x86/page_32.h
===================================================================
--- tip.orig/include/asm-x86/page_32.h
+++ tip/include/asm-x86/page_32.h
@@ -95,7 +95,6 @@ extern unsigned int __VMALLOC_RESERVE;
extern int sysctl_legacy_va_layout;

#define VMALLOC_RESERVE ((unsigned long)__VMALLOC_RESERVE)
-#define MAXMEM (-__PAGE_OFFSET - __VMALLOC_RESERVE)

extern void find_low_pfn_range(void);
extern unsigned long init_memory_mapping(unsigned long start,
Index: tip/include/asm-x86/pgtable_32.h
===================================================================
--- tip.orig/include/asm-x86/pgtable_32.h
+++ tip/include/asm-x86/pgtable_32.h
@@ -56,8 +56,7 @@ void paging_init(void);
* area for the same reason. ;)
*/
#define VMALLOC_OFFSET (8 * 1024 * 1024)
-#define VMALLOC_START (((unsigned long)high_memory + 2 * VMALLOC_OFFSET - 1) \
- & ~(VMALLOC_OFFSET - 1))
+#define VMALLOC_START ((unsigned long)high_memory + VMALLOC_OFFSET)
#ifdef CONFIG_X86_PAE
#define LAST_PKMAP 512
#else
@@ -73,6 +72,8 @@ void paging_init(void);
# define VMALLOC_END (FIXADDR_START - 2 * PAGE_SIZE)
#endif

+#define MAXMEM (VMALLOC_END - PAGE_OFFSET - __VMALLOC_RESERVE)
+
/*
* Define this if things work differently on an i386 and an i486:
* it will (on an i486) warn about kernel memory accesses that are

2008-06-27 01:55:44

by Dave Young

[permalink] [raw]
Subject: Re: [PATCH] kernel parameter vmalloc size fix

On Thu, Jun 26, 2008 at 8:14 PM, Ingo Molnar <[email protected]> wrote:
>
> * Dave Young <[email protected]> wrote:
>
>> I do some test about this last weekend, there's some questions, could
>> you help to fix it?
>>
>> 1. MAXMEM :
>> (-__PAGE_OFFSET - __VMALLOC_RESERVE).
>> The space after VMALLOC_END is included as well, seting it to
>> (VMALLOC_END - PAGE_OFFSET - __VMALLOC_RESERVE), is it right?
>>
>> 2. VMALLOC_OFFSET is not considered in __VMALLOC_RESERVE
>> Should fixed by adding VMALLOC_OFFSET to it.
>>
>> 3. VMALLOC_START :
>> (((unsigned long)high_memory + 2 * VMALLOC_OFFSET - 1) & ~(VMALLOC_OFFSET - 1))
>> So it's not always 8M, bigger than 8M possible.
>> Set it to ((unsigned long)high_memory + VMALLOC_OFFSET), is it right?
>>
>> Attached the proposed patch. please give some advice.
>
> i've ported it to tip/master, see the patch below. Yinghai, what do you
> think about this change?

Thanks. If there's no objections please add my signed-off line

Signed-off-by: Dave Young <[email protected]>

>
> Ingo
>
> ---
> arch/x86/mm/pgtable_32.c | 3 ++-
> include/asm-x86/page_32.h | 1 -
> include/asm-x86/pgtable_32.h | 5 +++--
> 3 files changed, 5 insertions(+), 4 deletions(-)
>
> Index: tip/arch/x86/mm/pgtable_32.c
> ===================================================================
> --- tip.orig/arch/x86/mm/pgtable_32.c
> +++ tip/arch/x86/mm/pgtable_32.c
> @@ -171,7 +171,8 @@ static int __init parse_vmalloc(char *ar
> if (!arg)
> return -EINVAL;
>
> - __VMALLOC_RESERVE = memparse(arg, &arg);
> + /* Add VMALLOC_OFFSET to the parsed value due to vm area guard hole*/
> + __VMALLOC_RESERVE = memparse(arg, &arg) + VMALLOC_OFFSET;
> return 0;
> }
> early_param("vmalloc", parse_vmalloc);
> Index: tip/include/asm-x86/page_32.h
> ===================================================================
> --- tip.orig/include/asm-x86/page_32.h
> +++ tip/include/asm-x86/page_32.h
> @@ -95,7 +95,6 @@ extern unsigned int __VMALLOC_RESERVE;
> extern int sysctl_legacy_va_layout;
>
> #define VMALLOC_RESERVE ((unsigned long)__VMALLOC_RESERVE)
> -#define MAXMEM (-__PAGE_OFFSET - __VMALLOC_RESERVE)
>
> extern void find_low_pfn_range(void);
> extern unsigned long init_memory_mapping(unsigned long start,
> Index: tip/include/asm-x86/pgtable_32.h
> ===================================================================
> --- tip.orig/include/asm-x86/pgtable_32.h
> +++ tip/include/asm-x86/pgtable_32.h
> @@ -56,8 +56,7 @@ void paging_init(void);
> * area for the same reason. ;)
> */
> #define VMALLOC_OFFSET (8 * 1024 * 1024)
> -#define VMALLOC_START (((unsigned long)high_memory + 2 * VMALLOC_OFFSET - 1) \
> - & ~(VMALLOC_OFFSET - 1))
> +#define VMALLOC_START ((unsigned long)high_memory + VMALLOC_OFFSET)
> #ifdef CONFIG_X86_PAE
> #define LAST_PKMAP 512
> #else
> @@ -73,6 +72,8 @@ void paging_init(void);
> # define VMALLOC_END (FIXADDR_START - 2 * PAGE_SIZE)
> #endif
>
> +#define MAXMEM (VMALLOC_END - PAGE_OFFSET - __VMALLOC_RESERVE)
> +
> /*
> * Define this if things work differently on an i386 and an i486:
> * it will (on an i486) warn about kernel memory accesses that are
>
>



--
Regards
dave

2008-07-11 07:05:38

by Dave Young

[permalink] [raw]
Subject: Re: [PATCH] kernel parameter vmalloc size fix

On Fri, Jun 27, 2008 at 9:55 AM, Dave Young <[email protected]> wrote:
> On Thu, Jun 26, 2008 at 8:14 PM, Ingo Molnar <[email protected]> wrote:
>>
>> * Dave Young <[email protected]> wrote:
>>
>>> I do some test about this last weekend, there's some questions, could
>>> you help to fix it?
>>>
>>> 1. MAXMEM :
>>> (-__PAGE_OFFSET - __VMALLOC_RESERVE).
>>> The space after VMALLOC_END is included as well, seting it to
>>> (VMALLOC_END - PAGE_OFFSET - __VMALLOC_RESERVE), is it right?
>>>
>>> 2. VMALLOC_OFFSET is not considered in __VMALLOC_RESERVE
>>> Should fixed by adding VMALLOC_OFFSET to it.
>>>
>>> 3. VMALLOC_START :
>>> (((unsigned long)high_memory + 2 * VMALLOC_OFFSET - 1) & ~(VMALLOC_OFFSET - 1))
>>> So it's not always 8M, bigger than 8M possible.
>>> Set it to ((unsigned long)high_memory + VMALLOC_OFFSET), is it right?
>>>
>>> Attached the proposed patch. please give some advice.
>>
>> i've ported it to tip/master, see the patch below. Yinghai, what do you
>> think about this change?

What's the status of this? It's indeed a bug which can be easily reproduced.
Anyone care about it?

Is it necessary for me to send it again for review?

(Add andrew in cc, maybe it could be put into mm to test)
>
> Thanks. If there's no objections please add my signed-off line
>
> Signed-off-by: Dave Young <[email protected]>
>
>>
>> Ingo
>>
>> ---
>> arch/x86/mm/pgtable_32.c | 3 ++-
>> include/asm-x86/page_32.h | 1 -
>> include/asm-x86/pgtable_32.h | 5 +++--
>> 3 files changed, 5 insertions(+), 4 deletions(-)
>>
>> Index: tip/arch/x86/mm/pgtable_32.c
>> ===================================================================
>> --- tip.orig/arch/x86/mm/pgtable_32.c
>> +++ tip/arch/x86/mm/pgtable_32.c
>> @@ -171,7 +171,8 @@ static int __init parse_vmalloc(char *ar
>> if (!arg)
>> return -EINVAL;
>>
>> - __VMALLOC_RESERVE = memparse(arg, &arg);
>> + /* Add VMALLOC_OFFSET to the parsed value due to vm area guard hole*/
>> + __VMALLOC_RESERVE = memparse(arg, &arg) + VMALLOC_OFFSET;
>> return 0;
>> }
>> early_param("vmalloc", parse_vmalloc);
>> Index: tip/include/asm-x86/page_32.h
>> ===================================================================
>> --- tip.orig/include/asm-x86/page_32.h
>> +++ tip/include/asm-x86/page_32.h
>> @@ -95,7 +95,6 @@ extern unsigned int __VMALLOC_RESERVE;
>> extern int sysctl_legacy_va_layout;
>>
>> #define VMALLOC_RESERVE ((unsigned long)__VMALLOC_RESERVE)
>> -#define MAXMEM (-__PAGE_OFFSET - __VMALLOC_RESERVE)
>>
>> extern void find_low_pfn_range(void);
>> extern unsigned long init_memory_mapping(unsigned long start,
>> Index: tip/include/asm-x86/pgtable_32.h
>> ===================================================================
>> --- tip.orig/include/asm-x86/pgtable_32.h
>> +++ tip/include/asm-x86/pgtable_32.h
>> @@ -56,8 +56,7 @@ void paging_init(void);
>> * area for the same reason. ;)
>> */
>> #define VMALLOC_OFFSET (8 * 1024 * 1024)
>> -#define VMALLOC_START (((unsigned long)high_memory + 2 * VMALLOC_OFFSET - 1) \
>> - & ~(VMALLOC_OFFSET - 1))
>> +#define VMALLOC_START ((unsigned long)high_memory + VMALLOC_OFFSET)
>> #ifdef CONFIG_X86_PAE
>> #define LAST_PKMAP 512
>> #else
>> @@ -73,6 +72,8 @@ void paging_init(void);
>> # define VMALLOC_END (FIXADDR_START - 2 * PAGE_SIZE)
>> #endif
>>
>> +#define MAXMEM (VMALLOC_END - PAGE_OFFSET - __VMALLOC_RESERVE)
>> +
>> /*
>> * Define this if things work differently on an i386 and an i486:
>> * it will (on an i486) warn about kernel memory accesses that are
>>
>>
>
>
>
> --
> Regards
> dave
>