2004-09-15 12:53:43

by Ingo Molnar

Subject: [patch] tune vmalloc size


there are a few devices that use lots of ioremap space. vmalloc space is
a showstopper problem for them.

this patch adds the vmalloc=<size> boot parameter to override
__VMALLOC_RESERVE. The default is 128mb right now - e.g. vmalloc=256m
doubles the size.

Ingo
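
For readers following the thread, here is a minimal user-space sketch of how a vmalloc=<size> argument with a K/M/G suffix could be turned into a byte count. The kernel does this with memparse(); the helpers parse_size() and vmalloc_from_arg() below are hypothetical stand-ins written for illustration and are not code from the attached patch.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical user-space stand-in for the kernel's memparse():
 * parse a number, then scale it by an optional K/M/G suffix.
 */
static unsigned long long parse_size(const char *s, char **end)
{
	unsigned long long v = strtoull(s, end, 0);

	switch (**end) {
	case 'G': case 'g':
		v <<= 10;	/* fall through */
	case 'M': case 'm':
		v <<= 10;	/* fall through */
	case 'K': case 'k':
		v <<= 10;
		(*end)++;	/* consume the suffix */
		break;
	default:
		break;
	}
	return v;
}

/*
 * Return the size in bytes if arg has the form "vmalloc=<size>",
 * otherwise return the caller's fallback (e.g. the 128MB default).
 */
static unsigned long long vmalloc_from_arg(const char *arg,
					   unsigned long long fallback)
{
	char *end;

	if (strncmp(arg, "vmalloc=", 8) != 0)
		return fallback;
	return parse_size(arg + 8, &end);
}
```

With a fallback of 128MB, passing "vmalloc=256m" yields 256MB, which matches the "doubles the size" example above; any other argument leaves the fallback untouched.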


Attachments:
(No filename) (264.00 B)
tune-vmalloc.patch (2.40 kB)

2004-09-15 13:30:20

by Joe Korty

Subject: Re: [patch] tune vmalloc size

On Wed, Sep 15, 2004 at 02:53:56PM +0200, Ingo Molnar wrote:
>
> there are a few devices that use lots of ioremap space. vmalloc space is
> a showstopper problem for them.
>
> this patch adds the vmalloc=<size> boot parameter to override
> __VMALLOC_RESERVE. The default is 128mb right now - e.g. vmalloc=256m
> doubles the size.

Perhaps this should instead be a configurable.
Joe

2004-09-15 13:39:20

by Andi Kleen

Subject: Re: [patch] tune vmalloc size

Ingo Molnar <[email protected]> writes:

> there are a few devices that use lots of ioremap space. vmalloc space is
> a showstopper problem for them.
>
> this patch adds the vmalloc=<size> boot parameter to override
> __VMALLOC_RESERVE. The default is 128mb right now - e.g. vmalloc=256m
> doubles the size.

Ah, Karsten Keil did a similar patch some months ago. There is
clearly a need.

But I think this should be self tuning instead. For a machine with
less than 900MB of memory the vmalloc area can be automagically increased,
growing into otherwise unused address space.

This way many users wouldn't need to specify weird options. So far
most machines still don't have more than 512MB.

-Andi

2004-09-15 13:39:19

by Alan

Subject: Re: [patch] tune vmalloc size

On Mer, 2004-09-15 at 13:53, Ingo Molnar wrote:
> there are a few devices that use lots of ioremap space. vmalloc space is
> a showstopper problem for them.
>
> this patch adds the vmalloc=<size> boot parameter to override
> __VMALLOC_RESERVE. The default is 128mb right now - e.g. vmalloc=256m
> doubles the size.

Is there a reason for defaulting to such a small allocation even on
4G/4G platforms ?

2004-09-15 13:45:01

by Joe Korty

Subject: Re: [patch] tune vmalloc size

On Wed, Sep 15, 2004 at 03:31:44PM +0200, Arjan van de Ven wrote:
> On Wed, Sep 15, 2004 at 09:29:36AM -0400, Joe Korty wrote:
> > On Wed, Sep 15, 2004 at 02:53:56PM +0200, Ingo Molnar wrote:
> > >
> > > there are a few devices that use lots of ioremap space. vmalloc space is
> > > a showstopper problem for them.
> > >
> > > this patch adds the vmalloc=<size> boot parameter to override
> > > __VMALLOC_RESERVE. The default is 128mb right now - e.g. vmalloc=256m
> > > doubles the size.
> >
> > Perhaps this should instead be a configurable.
>
> boot time settable is 100x better than only compile time settable imo :)

IMO, everything that is changeable at boot time needs an equivalent way
of changing the default without specifying a boot time value.

boot time values work well only when the number of values that need
changing is small.

Joe

2004-09-15 13:43:35

by Dave Jones

Subject: Re: [patch] tune vmalloc size

On Wed, Sep 15, 2004 at 09:29:36AM -0400, Joe Korty wrote:
> On Wed, Sep 15, 2004 at 02:53:56PM +0200, Ingo Molnar wrote:
> >
> > there are a few devices that use lots of ioremap space. vmalloc space is
> > a showstopper problem for them.
> >
> > this patch adds the vmalloc=<size> boot parameter to override
> > __VMALLOC_RESERVE. The default is 128mb right now - e.g. vmalloc=256m
> > doubles the size.
>
> Perhaps this should instead be a configurable.

that would make it useless for distribution kernels, for example.

Dave

2004-09-15 14:06:25

by Rodrigo FGV

Subject: Re: [patch] tune vmalloc size

How do I know the best value to set for vmalloc? Is it the full size of RAM?
----- Original Message -----
From: "Andi Kleen" <[email protected]>
To: "Ingo Molnar" <[email protected]>
Cc: <[email protected]>; <[email protected]>
Sent: Wednesday, September 15, 2004 10:29 AM
Subject: Re: [patch] tune vmalloc size


> Ingo Molnar <[email protected]> writes:
>
> > there are a few devices that use lots of ioremap space. vmalloc space is
> > a showstopper problem for them.
> >
> > this patch adds the vmalloc=<size> boot parameter to override
> > __VMALLOC_RESERVE. The default is 128mb right now - e.g. vmalloc=256m
> > doubles the size.
>
> Ah, Karsten Keil did a similar patch some months ago. There is
> clearly a need.
>
> But I think this should be self tuning instead. For a machine with
> less than 900MB of memory the vmalloc area can be automagically increased,
> growing into otherwise unused address space.
>
> This way many users wouldn't need to specify weird options. So far
> most machines still don't have more than 512MB.
>
> -Andi
>

2004-09-15 14:11:21

by Arjan van de Ven

Subject: Re: [patch] tune vmalloc size

On Wed, Sep 15, 2004 at 09:29:36AM -0400, Joe Korty wrote:
> On Wed, Sep 15, 2004 at 02:53:56PM +0200, Ingo Molnar wrote:
> >
> > there are a few devices that use lots of ioremap space. vmalloc space is
> > a showstopper problem for them.
> >
> > this patch adds the vmalloc=<size> boot parameter to override
> > __VMALLOC_RESERVE. The default is 128mb right now - e.g. vmalloc=256m
> > doubles the size.
>
> Perhaps this should instead be a configurable.

boot time settable is 100x better than only compile time settable imo :)


Attachments:
(No filename) (533.00 B)
(No filename) (189.00 B)

2004-09-15 14:16:20

by Dave Jones

Subject: Re: [patch] tune vmalloc size

On Wed, Sep 15, 2004 at 09:40:47AM -0400, Joe Korty wrote:

> > boot time settable is 100x better than only compile time settable imo :)
>
> IMO, everything that is changeable at boot time needs an equivalent way
> of changing the default without specifying a boot time value.
>
> boot time values work well only when the number of values that need
> changing is small.

Most users will never need to change this /at all/, at boot time
or compile time. It's a corner case for certain hardware configurations.
That fits into 'small number of values' afaics.

Dave

2004-09-15 14:19:56

by Arjan van de Ven

Subject: Re: [patch] tune vmalloc size

On Wed, 2004-09-15 at 15:29, Andi Kleen wrote:
> But I think this should be self tuning instead. For a machine with
> less than 900MB of memory the vmalloc area can be automagically increased,
> growing into otherwise unused address space.

that is the case already

this patch is for the other case


Attachments:
signature.asc (189.00 B)
This is a digitally signed message part

2004-09-15 14:59:27

by Karsten Keil

Subject: Re: [patch] tune vmalloc size

On Wed, Sep 15, 2004 at 03:29:53PM +0200, Andi Kleen wrote:
> Ingo Molnar <[email protected]> writes:
>
> > there are a few devices that use lots of ioremap space. vmalloc space is
> > a showstopper problem for them.
> >
> > this patch adds the vmalloc=<size> boot parameter to override
> > __VMALLOC_RESERVE. The default is 128mb right now - e.g. vmalloc=256m
> > doubles the size.
>
> Ah, Karsten Keil did a similar patch some months ago. There is
> clearly a need.
>
> But I think this should be self tuning instead. For a machine with
> less than 900MB of memory the vmalloc area can be automagically increased,
> growing into otherwise unused address space.
>
> This way many users wouldn't need to specify weird options. So far
> most machines still don't have more than 512MB.
>

Yes, my patch already includes this autotuning.

Hmm, I thought I had sent it to LKML at the beginning of this year, but after
looking through the archives, it seems that didn't happen.

Here is my version.

diff -urN linux-2.6.9-rc2-bk2.org/arch/i386/boot/setup.S linux-2.6.9-rc2-bk2/arch/i386/boot/setup.S
--- linux-2.6.9-rc2-bk2.org/arch/i386/boot/setup.S 2004-06-18 23:36:57.000000000 +0200
+++ linux-2.6.9-rc2-bk2/arch/i386/boot/setup.S 2004-09-15 16:50:53.760287653 +0200
@@ -156,7 +156,7 @@
# can be located anywhere in
# low memory 0x10000 or higher.

-ramdisk_max: .long (MAXMEM-1) & 0x7fffffff
+ramdisk_max: .long (__MAXMEM-1) & 0x7fffffff
# (Header version 0x0203 or later)
# The highest safe address for
# the contents of an initrd
diff -urN linux-2.6.9-rc2-bk2.org/arch/i386/kernel/setup.c linux-2.6.9-rc2-bk2/arch/i386/kernel/setup.c
--- linux-2.6.9-rc2-bk2.org/arch/i386/kernel/setup.c 2004-09-15 16:45:31.488885339 +0200
+++ linux-2.6.9-rc2-bk2/arch/i386/kernel/setup.c 2004-09-15 16:50:53.779285142 +0200
@@ -97,6 +97,11 @@
/* For PCI or other memory-mapped resources */
unsigned long pci_mem_start = 0x10000000;

+/* reserved mapping space for vmalloc and ioremap */
+unsigned long vmalloc_reserve = __VMALLOC_RESERVE_DEFAULT;
+EXPORT_SYMBOL(vmalloc_reserve);
+static unsigned long vm_reserve __initdata = -1;
+
/* user-defined highmem size */
static unsigned int highmem_pages = -1;

@@ -814,7 +819,16 @@
*/
if (c == ' ' && !memcmp(from, "highmem=", 8))
highmem_pages = memparse(from+8, &from) >> PAGE_SHIFT;
-
+
+ /*
+ * vm_reserve=size forces to reserve 'size' bytes for vmalloc and
+ * ioremap areas minimum is 32 MB maximum is 800 MB
+ * the default without vm_reserve depends on the total amount of
+ * memory the minimum default is 128 MB
+ */
+ if (c == ' ' && !memcmp(from, "vm_reserve=", 11))
+ vm_reserve = memparse(from+11, &from);
+
c = *(from++);
if (!c)
break;
@@ -1019,7 +1033,28 @@
start_pfn = PFN_UP(init_pg_tables_end);

find_max_pfn();
+
+ /*
+ * calculate the default size of vmalloc/ioremap area
+ * overwrite with the value of the vm_reserve= option
+ * if set
+ */

+ if (max_pfn >= PFN_UP(KERNEL_MAXMEM - __VMALLOC_RESERVE_DEFAULT))
+ vmalloc_reserve = __VMALLOC_RESERVE_DEFAULT;
+ else
+ vmalloc_reserve = KERNEL_MAXMEM - PFN_PHYS(max_pfn);
+ if (vm_reserve != -1) {
+ if (vm_reserve < __VMALLOC_RESERVE_MIN)
+ vm_reserve = __VMALLOC_RESERVE_MIN;
+ if (vm_reserve > __VMALLOC_RESERVE_MAX)
+ vm_reserve = __VMALLOC_RESERVE_MAX;
+ vmalloc_reserve = vm_reserve;
+ }
+
+ printk(KERN_NOTICE "%ldMB vmalloc/ioremap area available.\n",
+ vmalloc_reserve>>20);
+
max_low_pfn = find_max_low_pfn();

#ifdef CONFIG_HIGHMEM
diff -urN linux-2.6.9-rc2-bk2.org/arch/i386/mm/discontig.c linux-2.6.9-rc2-bk2/arch/i386/mm/discontig.c
--- linux-2.6.9-rc2-bk2.org/arch/i386/mm/discontig.c 2004-09-15 16:45:31.865835529 +0200
+++ linux-2.6.9-rc2-bk2/arch/i386/mm/discontig.c 2004-09-15 16:50:53.788283952 +0200
@@ -266,6 +266,19 @@
system_start_pfn = min_low_pfn = PFN_UP(init_pg_tables_end);

find_max_pfn();
+
+ /* Added 2004-03-02, <[email protected]>, copied from i386/setup.c
+ * but leave out automatic vmalloc size increase ... */
+ if (vm_reserve != -1) {
+ if (vm_reserve < __VMALLOC_RESERVE_MIN)
+ vm_reserve = __VMALLOC_RESERVE_MIN;
+ if (vm_reserve > __VMALLOC_RESERVE_MAX)
+ vm_reserve = __VMALLOC_RESERVE_MAX;
+ vmalloc_reserve = vm_reserve;
+ }
+ printk(KERN_NOTICE "%ldMB vmalloc/ioremap area available.\n",
+ vmalloc_reserve>>20);
+
system_max_low_pfn = max_low_pfn = find_max_low_pfn() - reserve_pages;
printk("reserve_pages = %ld find_max_low_pfn() ~ %ld\n",
reserve_pages, max_low_pfn + reserve_pages);
diff -urN linux-2.6.9-rc2-bk2.org/Documentation/kernel-parameters.txt linux-2.6.9-rc2-bk2/Documentation/kernel-parameters.txt
--- linux-2.6.9-rc2-bk2.org/Documentation/kernel-parameters.txt 2004-09-15 16:45:29.162192787 +0200
+++ linux-2.6.9-rc2-bk2/Documentation/kernel-parameters.txt 2004-09-15 16:50:53.799282498 +0200
@@ -1280,6 +1280,11 @@
This is actually a boot loader parameter; the value is
passed to the kernel using a special protocol.

+ vm_reserve=nn[KM]
+ [KNL,BOOT,IA-32] force use of a specific amount of
+ virtual memory for vmalloc and ioremap allocations
+ minimum 32 MB maximum 800 MB
+
vmhalt= [KNL,S390]

vmpoff= [KNL,S390]
diff -urN linux-2.6.9-rc2-bk2.org/include/asm-i386/page.h linux-2.6.9-rc2-bk2/include/asm-i386/page.h
--- linux-2.6.9-rc2-bk2.org/include/asm-i386/page.h 2004-09-15 16:46:01.094973092 +0200
+++ linux-2.6.9-rc2-bk2/include/asm-i386/page.h 2004-09-15 16:50:53.809281176 +0200
@@ -98,10 +98,15 @@
* This much address space is reserved for vmalloc() and iomap()
* as well as fixmap mappings.
*/
-#define __VMALLOC_RESERVE (128 << 20)
+#define __VMALLOC_RESERVE_MIN (32 << 20)
+#define __VMALLOC_RESERVE_DEFAULT (128 << 20)
+#define __VMALLOC_RESERVE_MAX (800 << 20)
+#define __RESERVED_AREA (10 << 20)

#ifndef __ASSEMBLY__

+extern unsigned long vmalloc_reserve;
+
/* Pure 2^n version of get_order */
static __inline__ int get_order(unsigned long size)
{
@@ -128,11 +133,14 @@


#define PAGE_OFFSET ((unsigned long)__PAGE_OFFSET)
-#define VMALLOC_RESERVE ((unsigned long)__VMALLOC_RESERVE)
-#define MAXMEM (-__PAGE_OFFSET-__VMALLOC_RESERVE)
+#define KERNEL_MEMORY ((unsigned long)(FIXADDR_START - __PAGE_OFFSET))
+#define RESERVED_AREA ((unsigned long)__RESERVED_AREA)
+#define KERNEL_MAXMEM ((unsigned long)(KERNEL_MEMORY - RESERVED_AREA))
+#define __MAXMEM (-__PAGE_OFFSET-__VMALLOC_RESERVE_MAX)
+#define MAXMEM ((unsigned long)(-PAGE_OFFSET-vmalloc_reserve))
#define __pa(x) ((unsigned long)(x)-PAGE_OFFSET)
#define __va(x) ((void *)((unsigned long)(x)+PAGE_OFFSET))
-#define pfn_to_kaddr(pfn) __va((pfn) << PAGE_SHIFT)
+#define pfn_to_kaddr(pfn) __va((pfn) << PAGE_SHIFT)
#ifndef CONFIG_DISCONTIGMEM
#define pfn_to_page(pfn) (mem_map + (pfn))
#define page_to_pfn(page) ((unsigned long)((page) - mem_map))


--
Karsten Keil
SuSE Labs
ISDN development
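
The sizing policy in the setup.c hunk above can be sketched as a small stand-alone function: autotune the reserve from the amount of RAM, then let a clamped vm_reserve= value override it. The 32/128/800 MB limits come from the patch; the function name pick_vmalloc_reserve(), the NO_OVERRIDE marker, and passing kernel_maxmem as a parameter are assumptions made for illustration (in the patch, KERNEL_MAXMEM is derived from FIXADDR_START and __PAGE_OFFSET).

```c
#define MB(x)		((unsigned long long)(x) << 20)

/* Limits taken from the patch's page.h hunk */
#define RESERVE_MIN	MB(32)
#define RESERVE_DEFAULT	MB(128)
#define RESERVE_MAX	MB(800)

/* Hypothetical marker meaning "no vm_reserve= option was given" */
#define NO_OVERRIDE	(~0ULL)

/*
 * kernel_maxmem: kernel virtual space usable for lowmem plus vmalloc,
 *                in bytes (assumed to be supplied by the caller)
 * ram_bytes:     installed RAM in bytes
 * override:      value of vm_reserve=, or NO_OVERRIDE
 */
static unsigned long long pick_vmalloc_reserve(unsigned long long kernel_maxmem,
						unsigned long long ram_bytes,
						unsigned long long override)
{
	unsigned long long reserve;

	/* Autotune: small-memory boxes get all the unused address space */
	if (ram_bytes >= kernel_maxmem - RESERVE_DEFAULT)
		reserve = RESERVE_DEFAULT;
	else
		reserve = kernel_maxmem - ram_bytes;

	/* An explicit option wins, but is clamped to [MIN, MAX] */
	if (override != NO_OVERRIDE) {
		if (override < RESERVE_MIN)
			override = RESERVE_MIN;
		if (override > RESERVE_MAX)
			override = RESERVE_MAX;
		reserve = override;
	}
	return reserve;
}
```

Assuming roughly 1014 MB of usable kernel address space (1 GB minus the 10 MB reserved area), a 512 MB machine would end up with a 502 MB vmalloc area, while a 2 GB machine stays at the 128 MB default unless vm_reserve= is given.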

2004-09-15 15:07:20

by Martin J. Bligh

Subject: Re: [patch] tune vmalloc size

--Andi Kleen <[email protected]> wrote (on Wednesday, September 15, 2004 15:29:53 +0200):

> Ingo Molnar <[email protected]> writes:
>
>> there are a few devices that use lots of ioremap space. vmalloc space is
>> a showstopper problem for them.
>>
>> this patch adds the vmalloc=<size> boot parameter to override
>> __VMALLOC_RESERVE. The default is 128mb right now - e.g. vmalloc=256m
>> doubles the size.
>
> Ah, Karsten Keil did a similar patch some months ago. There is
> clearly a need.
>
> But I think this should be self tuning instead. For a machine with
> less than 900MB of memory the vmalloc area can be automagically increased,
> growing into otherwise unused address space.
>
> This way many users wouldn't need to specify weird options. So far
> most machines still don't have more than 512MB.

It already does that, IIRC.

M.

2004-09-15 21:48:05

by Ingo Molnar

Subject: Re: [patch] tune vmalloc size


* Andrew Morton <[email protected]> wrote:

> Ingo Molnar <[email protected]> wrote:
> >
> > + if (c == ' ' && !memcmp(from, "vmalloc=", 8))
> > + __VMALLOC_RESERVE = memparse(from+8, &from);
>
> u o akpm an update to kernel-parameters.txt, please.

here you go:

--- Documentation/kernel-parameters.txt.orig
+++ Documentation/kernel-parameters.txt
@@ -453,6 +453,11 @@ running once the system is up.
hd?= [HW] (E)IDE subsystem
hd?lun= See Documentation/ide.txt.

+ highmem=nn[KMG] [KNL,BOOT] forces the highmem zone to have an exact
+ size of <nn>. This works even on boxes that have no
+ highmem otherwise. This also works to reduce highmem
+ size on bigger boxes.
+
hisax= [HW,ISDN]
See Documentation/isdn/README.HiSax.

@@ -1280,6 +1285,12 @@ running once the system is up.
This is actually a boot loader parameter; the value is
passed to the kernel using a special protocol.

+ vmalloc=nn[KMG] [KNL,BOOT] forces the vmalloc area to have an exact
+ size of <nn>. This can be used to increase the
+ minimum size (128MB on x86). It can also be used to
+ decrease the size and leave more room for directly
+ mapped kernel RAM.
+
vmhalt= [KNL,S390]

vmpoff= [KNL,S390]
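
To make the trade-off in the vmalloc= description concrete, the following sketch reproduces the MAXMEM arithmetic, -__PAGE_OFFSET - __VMALLOC_RESERVE, in 32-bit unsigned math. It assumes the usual i386 3G/1G split with __PAGE_OFFSET at 0xC0000000; the helper name lowmem_limit() is hypothetical and the fixmap area is ignored for simplicity.

```c
#include <stdint.h>

/* Assumed: standard i386 3G/1G split */
#define PAGE_OFFSET_3G	UINT32_C(0xC0000000)

/*
 * Upper bound on directly mapped kernel RAM for a given vmalloc
 * reserve, mirroring MAXMEM = -__PAGE_OFFSET - __VMALLOC_RESERVE.
 * The negation wraps modulo 2^32, so -0xC0000000 is 0x40000000 (1 GB).
 */
static uint32_t lowmem_limit(uint32_t vmalloc_reserve)
{
	return (uint32_t)(0u - PAGE_OFFSET_3G - vmalloc_reserve);
}
```

Under these assumptions the 128MB default leaves 896MB of directly mapped RAM, vmalloc=256m lowers that to 768MB, and vmalloc=64m raises it to 960MB.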

2004-09-16 00:13:45

by Andrew Morton

Subject: Re: [patch] tune vmalloc size

Ingo Molnar <[email protected]> wrote:
>
> + if (c == ' ' && !memcmp(from, "vmalloc=", 8))
> + __VMALLOC_RESERVE = memparse(from+8, &from);

u o akpm an update to kernel-parameters.txt, please.

2004-09-17 22:04:13

by Chris Wedgwood

Subject: Re: [patch] tune vmalloc size

On Wed, Sep 15, 2004 at 04:12:56PM +0200, Arjan van de Ven wrote:

> that is the case already

why do we still use 128MB as a default then? this is way overkill
from what i can tell looking at what my machines use. i'd rather have
this be a bit smaller and enable the slab/whatever to grow a little
more

2004-09-17 22:13:51

by Arjan van de Ven

Subject: Re: [patch] tune vmalloc size

On Fri, Sep 17, 2004 at 03:03:40PM -0700, Chris Wedgwood wrote:
> On Wed, Sep 15, 2004 at 04:12:56PM +0200, Arjan van de Ven wrote:
>
> > that is the case already
>
> why do we still use 128MB as a default then? this is way overkill
> from what i can tell looking at what my machines use. i'd rather have
> this be a bit smaller and enable the slab/whatever to grow a little
> more

if you have an old glibc it will use LDTs, which in turn use vmalloc, for
threading... 128MB is no luxury there.


Attachments:
(No filename) (501.00 B)
(No filename) (189.00 B)