2006-05-22 01:36:50

by Giridhar Pemmasani

[permalink] [raw]
Subject: __vmalloc with GFP_ATOMIC causes 'sleeping from invalid context'


If __vmalloc is called in atomic context with GFP_ATOMIC flags,
__get_vm_area_node is called, which calls kmalloc_node with GFP_KERNEL
flags. This causes 'sleeping function called from invalid context at
mm/slab.c:2729' with 2.6.16-rc4 kernel. A simple solution is to use
proper flags in __get_vm_area_node, depending on the context:

diff -Naur linux.orig/mm/vmalloc.c linux/mm/vmalloc.c
--- linux.orig/mm/vmalloc.c 2006-05-19 01:22:00.000000000 -0400
+++ linux/mm/vmalloc.c 2006-05-19 01:53:05.000000000 -0400
@@ -177,7 +177,10 @@
addr = ALIGN(start, align);
size = PAGE_ALIGN(size);

- area = kmalloc_node(sizeof(*area), GFP_KERNEL, node);
+ if (in_atomic() || in_interrupt())
+ area = kmalloc_node(sizeof(*area), GFP_ATOMIC, node);
+ else
+ area = kmalloc_node(sizeof(*area), GFP_KERNEL, node);
if (unlikely(!area))
return NULL;


An alternate solution is to pass flags to __get_vm_area_node what was
passed to __vmalloc so it can call kmalloc_node with either GFP_KERNEL
or GFP_ATOMIC, but that changes signature of get_vm_area_node and
__get_vm_area_node which may affect other parts of kernel / modules.

Another point: In include/linux/kernel.h, might_resched() is defined
as

#ifdef CONFIG_PREEMPT_VOLUNTARY
extern int cond_resched(void);
# define might_resched() cond_resched()
#else
# define might_resched() do { } while (0)
#endif

Why is cond_resched() used only if CONFIG_PREEMPT_VOLUNTARY and not if
CONFIG_PREEMPT is enabled? i.e., shouldn't it be defined as

#if defined(CONFIG_PREEMPT_VOLUNTARY) || defined(CONFIG_PREEMPT)
extern int cond_resched(void);
# define might_resched() cond_resched()
#else
# define might_resched() do { } while (0)
#endif

Thanks,
Giri


2006-05-22 01:51:41

by Arjan van de Ven

[permalink] [raw]
Subject: Re: __vmalloc with GFP_ATOMIC causes 'sleeping from invalid context'

On Sun, 2006-05-21 at 21:36 -0400, Giridhar Pemmasani wrote:
> If __vmalloc is called in atomic context with GFP_ATOMIC flags,
> __get_vm_area_node is called, which calls kmalloc_node with GFP_KERNEL
> flags. This causes 'sleeping function called from invalid context at
> mm/slab.c:2729' with 2.6.16-rc4 kernel. A simple solution is to use
> proper flags in __get_vm_area_node, depending on the context:


vmalloc sleeps, or at least does things to the lower vm layers that
really do sleepy things. So calling it from an atomic context really
tends to be a bug....

where in the kernel is this done?

2006-05-22 01:53:59

by Nick Piggin

[permalink] [raw]
Subject: Re: __vmalloc with GFP_ATOMIC causes 'sleeping from invalid context'

Giridhar Pemmasani wrote:
> If __vmalloc is called in atomic context with GFP_ATOMIC flags,
> __get_vm_area_node is called, which calls kmalloc_node with GFP_KERNEL
> flags. This causes 'sleeping function called from invalid context at
> mm/slab.c:2729' with 2.6.16-rc4 kernel. A simple solution is to use

I can't see what would cause this in either 2.6.16-rc4 or 2.6.17-rc4.
What is the line?

> proper flags in __get_vm_area_node, depending on the context:

I don't think that always works, you might pass in GFP_ATOMIC due to
having hold of a spinlock, for example.

Also, vmlist_lock isn't interrupt safe, so it still kind of goes
against the spirit of GFP_ATOMIC (which is to allow allocation from
interrupt context).

--
SUSE Labs, Novell Inc.
Send instant messages to your online friends http://au.messenger.yahoo.com

2006-05-22 05:58:54

by Giridhar Pemmasani

[permalink] [raw]
Subject: Re: __vmalloc with GFP_ATOMIC causes 'sleeping from invalid context'

On Mon, 22 May 2006 11:53:55 +1000, Nick Piggin <[email protected]> said:

> Giridhar Pemmasani wrote:
>> If __vmalloc is called in atomic context with GFP_ATOMIC flags,
>> __get_vm_area_node is called, which calls kmalloc_node with
>> GFP_KERNEL flags. This causes 'sleeping function called from
>> invalid context at mm/slab.c:2729' with 2.6.16-rc4 kernel. A
>> simple solution is to use

> I can't see what would cause this in either 2.6.16-rc4 or
> 2.6.17-rc4. What is the line?

If someone calls __vmalloc in atomic context (with GFP_ATOMIC flags):

with 2.6.17-rc4, in file mm/vmalloc.c,

__vmalloc calls __vmalloc_node on line 484,
__vmalloc_node calls get_vm_area_node on line 474,
get_vm_area_node calls __get_vm_area_node on line 256,
__get_vm_area_node calls kmalloc_node with GFP_KERNEL on line 180

and in include/linux/slab.h,

kmalloc_node calls kmalloc with GFP_KERNEL on line 164,
kmalloc calls kmem_cache_alloc on line 106,

and in mm/slab.c,

kmem_cache_alloc calls __cache_alloc on line 3136,
__cache_alloc calls cache_alloc_debugcheck_before on line 2880,
cache_alloc_debugcheck_before calls might_sleep_if(GFP_KERNEL & GFP_WAIT)
on line 2783

which causes the warning.

--
Giri

2006-05-22 06:01:36

by Giridhar Pemmasani

[permalink] [raw]
Subject: Re: __vmalloc with GFP_ATOMIC causes 'sleeping from invalid context'

On Mon, 22 May 2006 03:51:37 +0200, Arjan van de Ven <[email protected]> said:

> On Sun, 2006-05-21 at 21:36 -0400, Giridhar Pemmasani wrote:
>> If __vmalloc is called in atomic context with GFP_ATOMIC flags,
>> __get_vm_area_node is called, which calls kmalloc_node with
>> GFP_KERNEL flags. This causes 'sleeping function called from
>> invalid context at mm/slab.c:2729' with 2.6.16-rc4 kernel. A
>> simple solution is to use proper flags in __get_vm_area_node,
>> depending on the context:


> vmalloc sleeps, or at least does things to the lower vm layers
> that really do sleepy things. So calling it from an atomic
> context really tends to be a bug....

> where in the kernel is this done?

It is not about vmalloc, but about __vmalloc. I gave more details in
response to Nick Piggin's reply on this thread.

Thanks,
Giri

2006-05-22 06:07:28

by Nick Piggin

[permalink] [raw]
Subject: Re: __vmalloc with GFP_ATOMIC causes 'sleeping from invalid context'

Giridhar Pemmasani wrote:
> On Mon, 22 May 2006 11:53:55 +1000, Nick Piggin <[email protected]> said:
>
> > Giridhar Pemmasani wrote:
> >> If __vmalloc is called in atomic context with GFP_ATOMIC flags,
> >> __get_vm_area_node is called, which calls kmalloc_node with
> >> GFP_KERNEL flags. This causes 'sleeping function called from
> >> invalid context at mm/slab.c:2729' with 2.6.16-rc4 kernel. A
> >> simple solution is to use
>
> > I can't see what would cause this in either 2.6.16-rc4 or
> > 2.6.17-rc4. What is the line?
>
> If someone calls __vmalloc in atomic context (with GFP_ATOMIC flags):

OK I misunderstood your comment. I was looking for the caller.
Hmm, page_alloc.c does, but I don't know that it needs to be
atomic -- what happens if we just make that allocation GFP_KERNEL?

--
SUSE Labs, Novell Inc.
Send instant messages to your online friends http://au.messenger.yahoo.com

2006-05-22 06:10:49

by Nick Piggin

[permalink] [raw]
Subject: Re: __vmalloc with GFP_ATOMIC causes 'sleeping from invalid context'

Nick Piggin wrote:
> Giridhar Pemmasani wrote:
>
>> On Mon, 22 May 2006 11:53:55 +1000, Nick Piggin
>> <[email protected]> said:
>>
>> > Giridhar Pemmasani wrote:
>> >> If __vmalloc is called in atomic context with GFP_ATOMIC flags,
>> >> __get_vm_area_node is called, which calls kmalloc_node with
>> >> GFP_KERNEL flags. This causes 'sleeping function called from
>> >> invalid context at mm/slab.c:2729' with 2.6.16-rc4 kernel. A
>> >> simple solution is to use
>>
>> > I can't see what would cause this in either 2.6.16-rc4 or
>> > 2.6.17-rc4. What is the line?
>>
>> If someone calls __vmalloc in atomic context (with GFP_ATOMIC flags):
>
>
> OK I misunderstood your comment. I was looking for the caller.
> Hmm, page_alloc.c does, but I don't know that it needs to be
> atomic -- what happens if we just make that allocation GFP_KERNEL?
>

OTOH, it doesn't seem to be particularly wrong to allow __vmalloc
GFP_ATOMIC allocations. The correct fix is to pass the gfp_mask
to kmalloc: if you're worried about breaking the API, introduce a
new __get_vm_area_node_mask() and implement __get_vm_area_node()
as a simple wrapper that passes in GFP_KERNEL.

Thanks,
Nick

--
SUSE Labs, Novell Inc.
Send instant messages to your online friends http://au.messenger.yahoo.com

2006-05-22 06:14:07

by Nick Piggin

[permalink] [raw]
Subject: Re: __vmalloc with GFP_ATOMIC causes 'sleeping from invalid context'

Nick Piggin wrote:

> OTOH, it doesn't seem to be particularly wrong to allow __vmalloc
> GFP_ATOMIC allocations. The correct fix is to pass the gfp_mask
> to kmalloc: if you're worried about breaking the API, introduce a
> new __get_vm_area_node_mask() and implement __get_vm_area_node()
> as a simple wrapper that passes in GFP_KERNEL.

Oh, and __get_vm_area_node{_mask} should BUG_ON(in_interrupt());

--
SUSE Labs, Novell Inc.
Send instant messages to your online friends http://au.messenger.yahoo.com

2006-05-22 06:56:52

by Giridhar Pemmasani

[permalink] [raw]
Subject: Re: __vmalloc with GFP_ATOMIC causes 'sleeping from invalid context'

On Mon, 22 May 2006 16:10:45 +1000, Nick Piggin <[email protected]> said:

> Nick Piggin wrote:
>> Giridhar Pemmasani wrote:
>>
>>> On Mon, 22 May 2006 11:53:55 +1000, Nick Piggin
>>> <[email protected]> said:
>>>
>>> > Giridhar Pemmasani wrote: >> If __vmalloc is called in atomic
>>> context with GFP_ATOMIC flags, >> __get_vm_area_node is called,
>>> which calls kmalloc_node with >> GFP_KERNEL flags. This causes
>>> 'sleeping function called from >> invalid context at
>>> mm/slab.c:2729' with 2.6.16-rc4 kernel. A >> simple solution is
>>> to use
>>>
>>> > I can't see what would cause this in either 2.6.16-rc4 or >
>>> 2.6.17-rc4. What is the line?
>>>
>>> If someone calls __vmalloc in atomic context (with GFP_ATOMIC
>>> flags):
>>
>>
>> OK I misunderstood your comment. I was looking for the caller.
>> Hmm, page_alloc.c does, but I don't know that it needs to be
>> atomic -- what happens if we just make that allocation
>> GFP_KERNEL?
>>

> OTOH, it doesn't seem to be particularly wrong to allow __vmalloc
> GFP_ATOMIC allocations. The correct fix is to pass the gfp_mask
> to kmalloc: if you're worried about breaking the API, introduce a
> new __get_vm_area_node_mask() and implement __get_vm_area_node()
> as a simple wrapper that passes in GFP_KERNEL.

Here is an attempt at this. I also made __get_vm_area_node static.

Signed-off-by: Giridhar Pemmasani <[email protected]>

diff -Naur linux.orig/include/linux/vmalloc.h linux/include/linux/vmalloc.h
--- linux.orig/include/linux/vmalloc.h 2006-05-22 02:45:23.000000000 -0400
+++ linux/include/linux/vmalloc.h 2006-05-22 02:45:38.000000000 -0400
@@ -3,6 +3,7 @@

#include <linux/spinlock.h>
#include <asm/page.h> /* pgprot_t */
+#include <linux/gfp.h>

/* bits in vm_struct->flags */
#define VM_IOREMAP 0x00000001 /* ioremap() and friends */
@@ -52,8 +53,15 @@
extern struct vm_struct *get_vm_area(unsigned long size, unsigned long flags);
extern struct vm_struct *__get_vm_area(unsigned long size, unsigned long flags,
unsigned long start, unsigned long end);
-extern struct vm_struct *get_vm_area_node(unsigned long size,
- unsigned long flags, int node);
+extern struct vm_struct *get_vm_area_node_mask(unsigned long size,
+ unsigned long flags, int node,
+ gfp_t gfp_mask);
+static inline struct vm_struct *get_vm_area_node(unsigned long size,
+ unsigned long flags, int node)
+{
+ return get_vm_area_node_mask(size, flags, node, GFP_KERNEL);
+}
+
extern struct vm_struct *remove_vm_area(void *addr);
extern struct vm_struct *__remove_vm_area(void *addr);
extern int map_vm_area(struct vm_struct *area, pgprot_t prot,
diff -Naur linux.orig/mm/vmalloc.c linux/mm/vmalloc.c
--- linux.orig/mm/vmalloc.c 2006-05-19 01:22:00.000000000 -0400
+++ linux/mm/vmalloc.c 2006-05-22 02:45:49.000000000 -0400
@@ -157,8 +157,9 @@
return err;
}

-struct vm_struct *__get_vm_area_node(unsigned long size, unsigned long flags,
- unsigned long start, unsigned long end, int node)
+static struct vm_struct *__get_vm_area_node(unsigned long size, unsigned long flags,
+ unsigned long start, unsigned long end,
+ int node, gfp_t gfp_mask)
{
struct vm_struct **p, *tmp, *area;
unsigned long align = 1;
@@ -177,7 +178,7 @@
addr = ALIGN(start, align);
size = PAGE_ALIGN(size);

- area = kmalloc_node(sizeof(*area), GFP_KERNEL, node);
+ area = kmalloc_node(sizeof(*area), gfp_mask, node);
if (unlikely(!area))
return NULL;

@@ -233,7 +234,7 @@
struct vm_struct *__get_vm_area(unsigned long size, unsigned long flags,
unsigned long start, unsigned long end)
{
- return __get_vm_area_node(size, flags, start, end, -1);
+ return __get_vm_area_node(size, flags, start, end, -1, GFP_KERNEL);
}

/**
@@ -251,9 +252,11 @@
return __get_vm_area(size, flags, VMALLOC_START, VMALLOC_END);
}

-struct vm_struct *get_vm_area_node(unsigned long size, unsigned long flags, int node)
+struct vm_struct *get_vm_area_node_mask(unsigned long size, unsigned long flags,
+ int node, gfp_t gfp_mask)
{
- return __get_vm_area_node(size, flags, VMALLOC_START, VMALLOC_END, node);
+ return __get_vm_area_node(size, flags, VMALLOC_START, VMALLOC_END, node,
+ gfp_mask);
}

/* Caller must hold vmlist_lock */
@@ -471,7 +474,7 @@
if (!size || (size >> PAGE_SHIFT) > num_physpages)
return NULL;

- area = get_vm_area_node(size, VM_ALLOC, node);
+ area = get_vm_area_node_mask(size, VM_ALLOC, node, gfp_mask);
if (!area)
return NULL;


Thanks,
Giri

2006-05-22 07:08:21

by Giridhar Pemmasani

[permalink] [raw]
Subject: Re: __vmalloc with GFP_ATOMIC causes 'sleeping from invalid context'

On Mon, 22 May 2006 16:14:03 +1000, Nick Piggin <[email protected]> said:

> Nick Piggin wrote:

>> OTOH, it doesn't seem to be particularly wrong to allow __vmalloc
>> GFP_ATOMIC allocations. The correct fix is to pass the gfp_mask
>> to kmalloc: if you're worried about breaking the API, introduce a
>> new __get_vm_area_node_mask() and implement __get_vm_area_node()
>> as a simple wrapper that passes in GFP_KERNEL.

> Oh, and __get_vm_area_node{_mask} should BUG_ON(in_interrupt());

With the patch I sent earlier, this may not be required: Since
__get_vm_area_node calls kmalloc, it should be taken care of in
kmalloc and friends. Currently cache_alloc_debugcheck_before doesn't
check for in_interrupt(); perhaps that is the right place to add

if (flags & GFP_WAIT)
BUG_ON(in_interrupt());

Thanks,
Giri

2006-05-22 07:34:42

by Nick Piggin

[permalink] [raw]
Subject: Re: __vmalloc with GFP_ATOMIC causes 'sleeping from invalid context'

Giridhar Pemmasani wrote:
> On Mon, 22 May 2006 16:14:03 +1000, Nick Piggin <[email protected]> said:
>
> > Nick Piggin wrote:
>
> >> OTOH, it doesn't seem to be particularly wrong to allow __vmalloc
> >> GFP_ATOMIC allocations. The correct fix is to pass the gfp_mask
> >> to kmalloc: if you're worried about breaking the API, introduce a
> >> new __get_vm_area_node_mask() and implement __get_vm_area_node()
> >> as a simple wrapper that passes in GFP_KERNEL.
>
> > Oh, and __get_vm_area_node{_mask} should BUG_ON(in_interrupt());
>
> With the patch I sent earlier, this may not be required: Since
> __get_vm_area_node calls kmalloc, it should be taken care of in
> kmalloc and friends. Currently cache_alloc_debugcheck_before doesn't
> check for in_interrupt(); perhaps that is the right place to add

vmlist_lock is not irq safe. If you call it from interrupt, you can
deadlock.

--
SUSE Labs, Novell Inc.
Send instant messages to your online friends http://au.messenger.yahoo.com

2006-05-22 11:33:28

by Alan

[permalink] [raw]
Subject: Re: __vmalloc with GFP_ATOMIC causes 'sleeping from invalid context'

On Sul, 2006-05-21 at 21:36 -0400, Giridhar Pemmasani wrote:
> If __vmalloc is called in atomic context with GFP_ATOMIC flags,

vmalloc is not safely callable in a non sleeping context. If you need a
vmalloc buffer in an IRQ then pre-allocate it.

Alan

2006-05-22 11:51:43

by Andi Kleen

[permalink] [raw]
Subject: Re: __vmalloc with GFP_ATOMIC causes 'sleeping from invalid context'

Nick Piggin <[email protected]> writes:

> Giridhar Pemmasani wrote:
> > If __vmalloc is called in atomic context with GFP_ATOMIC flags,
> > __get_vm_area_node is called, which calls kmalloc_node with GFP_KERNEL
> > flags. This causes 'sleeping function called from invalid context at
> > mm/slab.c:2729' with 2.6.16-rc4 kernel. A simple solution is to use
>
> I can't see what would cause this in either 2.6.16-rc4 or 2.6.17-rc4.
> What is the line?
>
> > proper flags in __get_vm_area_node, depending on the context:
>
> I don't think that always works, you might pass in GFP_ATOMIC due to
> having hold of a spinlock, for example.
>
> Also, vmlist_lock isn't interrupt safe, so it still kind of goes
> against the spirit of GFP_ATOMIC (which is to allow allocation from
> interrupt context).

That's not the only problem. Allocating page table entries or flushing TLBs from
an atomic context is just not supported by the low level architecture code.

-Andi

2006-05-22 14:55:18

by Giridhar Pemmasani

[permalink] [raw]
Subject: Re: __vmalloc with GFP_ATOMIC causes 'sleeping from invalid context'

On Mon, 22 May 2006 17:34:32 +1000, Nick Piggin <[email protected]> said:

>> > Oh, and __get_vm_area_node{_mask} should
>> BUG_ON(in_interrupt());
>>
>> With the patch I sent earlier, this may not be required: Since
>> __get_vm_area_node calls kmalloc, it should be taken care of in
>> kmalloc and friends. Currently cache_alloc_debugcheck_before
>> doesn't check for in_interrupt(); perhaps that is the right place
>> to add

> vmlist_lock is not irq safe. If you call it from interrupt, you
> can deadlock.

Aha. I will add BUG_ON(in_interrupt()) to __get_vm_area_node in the
next round (if there is interest in the patch).

Thanks,
Giri

2006-05-22 15:13:03

by Giridhar Pemmasani

[permalink] [raw]
Subject: Re: __vmalloc with GFP_ATOMIC causes 'sleeping from invalid context'

On 22 May 2006 13:18:18 +0200, Andi Kleen <[email protected]> said:

> Nick Piggin <[email protected]> writes:

>> Giridhar Pemmasani wrote: > If __vmalloc is called in atomic
>> context with GFP_ATOMIC flags, > __get_vm_area_node is called,
>> which calls kmalloc_node with GFP_KERNEL > flags. This causes
>> 'sleeping function called from invalid context at >
>> mm/slab.c:2729' with 2.6.16-rc4 kernel. A simple solution is to
>> use
>>
>> I can't see what would cause this in either 2.6.16-rc4 or
>> 2.6.17-rc4. What is the line?
>>
>> > proper flags in __get_vm_area_node, depending on the context:
>>
>> I don't think that always works, you might pass in GFP_ATOMIC due
>> to having hold of a spinlock, for example.
>>
>> Also, vmlist_lock isn't interrupt safe, so it still kind of goes
>> against the spirit of GFP_ATOMIC (which is to allow allocation
>> from interrupt context).

> That's not the only problem. Allocating page table entries or
> flushing TLBs from an atomic context is just not supported by the
> low level architecture code.

I looked through __vmalloc implementation and the call tree. Nowhere
do I see a problem with calling __vmalloc in atomic context (with
GFP_ATOMIC flags, of course). Except for a few architectures (arm,
m68k, sparc and xtensa), flusch_cache_all is a nop. I didn't look into
all architectures, but at least for m68k, flush_cache_all seems to
safe in atomic context. Please correct me if I am wrong.

To reiterate, __get_vm_area_node allocates space for 'struct
vm_struct' with GFP_KERNEL irrespective of how its caller was called -
if __vmalloc was called with GFP_ATOMIC, the node for that allocation
should be allocated with the same flags. So the patch I suggested
simply passes the flags from __vmalloc down to __get_vm_area_node.

If indeed it is not safe to call __vmalloc in atomic context (even
with GFP_ATOMIC flags), perhaps we can add BUG_ON(in_atomic()) to
__vmalloc?

Thanks,
Giri

2006-05-22 21:56:53

by Andrew Morton

[permalink] [raw]
Subject: Re: __vmalloc with GFP_ATOMIC causes 'sleeping from invalid context'

Giridhar Pemmasani <[email protected]> wrote:
>
> diff -Naur linux.orig/include/linux/vmalloc.h linux/include/linux/vmalloc.h
> --- linux.orig/include/linux/vmalloc.h 2006-05-22 02:45:23.000000000 -0400
> +++ linux/include/linux/vmalloc.h 2006-05-22 02:45:38.000000000 -0400
> @@ -3,6 +3,7 @@
>
> #include <linux/spinlock.h>
> #include <asm/page.h> /* pgprot_t */
> +#include <linux/gfp.h>
>
> /* bits in vm_struct->flags */
> #define VM_IOREMAP 0x00000001 /* ioremap() and friends */
> @@ -52,8 +53,15 @@
> extern struct vm_struct *get_vm_area(unsigned long size, unsigned long flags);
> extern struct vm_struct *__get_vm_area(unsigned long size, unsigned long flags,
> unsigned long start, unsigned long end);
> -extern struct vm_struct *get_vm_area_node(unsigned long size,
> - unsigned long flags, int node);
> +extern struct vm_struct *get_vm_area_node_mask(unsigned long size,
> + unsigned long flags, int node,
> + gfp_t gfp_mask);
> +static inline struct vm_struct *get_vm_area_node(unsigned long size,
> + unsigned long flags, int node)
> +{
> + return get_vm_area_node_mask(size, flags, node, GFP_KERNEL);
> +}
> +
> extern struct vm_struct *remove_vm_area(void *addr);
> extern struct vm_struct *__remove_vm_area(void *addr);
> extern int map_vm_area(struct vm_struct *area, pgprot_t prot,
> diff -Naur linux.orig/mm/vmalloc.c linux/mm/vmalloc.c
> --- linux.orig/mm/vmalloc.c 2006-05-19 01:22:00.000000000 -0400
> +++ linux/mm/vmalloc.c 2006-05-22 02:45:49.000000000 -0400
> @@ -157,8 +157,9 @@
> return err;
> }
>
> -struct vm_struct *__get_vm_area_node(unsigned long size, unsigned long flags,
> - unsigned long start, unsigned long end, int node)
> +static struct vm_struct *__get_vm_area_node(unsigned long size, unsigned long flags,
> + unsigned long start, unsigned long end,
> + int node, gfp_t gfp_mask)
> {
> struct vm_struct **p, *tmp, *area;
> unsigned long align = 1;
> @@ -177,7 +178,7 @@
> addr = ALIGN(start, align);
> size = PAGE_ALIGN(size);
>
> - area = kmalloc_node(sizeof(*area), GFP_KERNEL, node);
> + area = kmalloc_node(sizeof(*area), gfp_mask, node);
> if (unlikely(!area))
> return NULL;
>
> @@ -233,7 +234,7 @@
> struct vm_struct *__get_vm_area(unsigned long size, unsigned long flags,
> unsigned long start, unsigned long end)
> {
> - return __get_vm_area_node(size, flags, start, end, -1);
> + return __get_vm_area_node(size, flags, start, end, -1, GFP_KERNEL);
> }
>
> /**
> @@ -251,9 +252,11 @@
> return __get_vm_area(size, flags, VMALLOC_START, VMALLOC_END);
> }
>
> -struct vm_struct *get_vm_area_node(unsigned long size, unsigned long flags, int node)
> +struct vm_struct *get_vm_area_node_mask(unsigned long size, unsigned long flags,
> + int node, gfp_t gfp_mask)
> {
> - return __get_vm_area_node(size, flags, VMALLOC_START, VMALLOC_END, node);
> + return __get_vm_area_node(size, flags, VMALLOC_START, VMALLOC_END, node,
> + gfp_mask);
> }
>
> /* Caller must hold vmlist_lock */
> @@ -471,7 +474,7 @@
> if (!size || (size >> PAGE_SHIFT) > num_physpages)
> return NULL;
>
> - area = get_vm_area_node(size, VM_ALLOC, node);
> + area = get_vm_area_node_mask(size, VM_ALLOC, node, gfp_mask);
> if (!area)
> return NULL;
>

It was wrong for get_vm_area_node() to have assumed that it could use
GFP_KERNEL.

Please just change the top-level API of get_vm_area_node() to take a gfp_t
and don't worry about the get_vm_area_node_mask() thing.

The only callers of get_vm_area_node() are in vmalloc.c anyway. We could
in fact make it static, but I guess exposing it as an API call makes sense.


2006-05-22 22:59:25

by Giridhar Pemmasani

[permalink] [raw]
Subject: Re: __vmalloc with GFP_ATOMIC causes 'sleeping from invalid context'

On Mon, 22 May 2006 14:56:35 -0700, Andrew Morton <[email protected]> said:

> It was wrong for get_vm_area_node() to have assumed that it could
> use GFP_KERNEL.

> Please just change the top-level API of get_vm_area_node() to
> take a gfp_t and don't worry about the get_vm_area_node_mask()
> thing.

> The only callers of get_vm_area_node() are in vmalloc.c anyway.
> We could in fact make it static, but I guess exposing it as an
> API call makes sense.

In addition, I added BUG_ON(in_interrupt()), as suggested by Nick
Piggin earlier. The patch is against 2.6.17-rc4.

Thanks,
Giri

Signed-off-by: Giridhar Pemmasani <[email protected]>

---

diff -Naur linux-2.6.17-rc4.orig/include/linux/vmalloc.h linux-2.6.17-rc4/include/linux/vmalloc.h
--- linux-2.6.17-rc4.orig/include/linux/vmalloc.h 2006-05-22 18:48:33.000000000 -0400
+++ linux-2.6.17-rc4/include/linux/vmalloc.h 2006-05-22 18:50:40.000000000 -0400
@@ -53,7 +53,8 @@
extern struct vm_struct *__get_vm_area(unsigned long size, unsigned long flags,
unsigned long start, unsigned long end);
extern struct vm_struct *get_vm_area_node(unsigned long size,
- unsigned long flags, int node);
+ unsigned long flags, int node,
+ gfp_t gfp_mask);
extern struct vm_struct *remove_vm_area(void *addr);
extern struct vm_struct *__remove_vm_area(void *addr);
extern int map_vm_area(struct vm_struct *area, pgprot_t prot,
diff -Naur linux-2.6.17-rc4.orig/mm/vmalloc.c linux-2.6.17-rc4/mm/vmalloc.c
--- linux-2.6.17-rc4.orig/mm/vmalloc.c 2006-05-22 18:48:40.000000000 -0400
+++ linux-2.6.17-rc4/mm/vmalloc.c 2006-05-22 18:47:09.000000000 -0400
@@ -157,13 +157,15 @@
return err;
}

-struct vm_struct *__get_vm_area_node(unsigned long size, unsigned long flags,
- unsigned long start, unsigned long end, int node)
+static struct vm_struct *__get_vm_area_node(unsigned long size, unsigned long flags,
+ unsigned long start, unsigned long end,
+ int node, gfp_t gfp_mask)
{
struct vm_struct **p, *tmp, *area;
unsigned long align = 1;
unsigned long addr;

+ BUG_ON(in_interrupt());
if (flags & VM_IOREMAP) {
int bit = fls(size);

@@ -177,7 +179,7 @@
addr = ALIGN(start, align);
size = PAGE_ALIGN(size);

- area = kmalloc_node(sizeof(*area), GFP_KERNEL, node);
+ area = kmalloc_node(sizeof(*area), gfp_mask, node);
if (unlikely(!area))
return NULL;

@@ -233,7 +235,7 @@
struct vm_struct *__get_vm_area(unsigned long size, unsigned long flags,
unsigned long start, unsigned long end)
{
- return __get_vm_area_node(size, flags, start, end, -1);
+ return __get_vm_area_node(size, flags, start, end, -1, GFP_KERNEL);
}

/**
@@ -251,9 +253,11 @@
return __get_vm_area(size, flags, VMALLOC_START, VMALLOC_END);
}

-struct vm_struct *get_vm_area_node(unsigned long size, unsigned long flags, int node)
+struct vm_struct *get_vm_area_node(unsigned long size, unsigned long flags,
+ int node, gfp_t gfp_mask)
{
- return __get_vm_area_node(size, flags, VMALLOC_START, VMALLOC_END, node);
+ return __get_vm_area_node(size, flags, VMALLOC_START, VMALLOC_END, node,
+ gfp_mask);
}

/* Caller must hold vmlist_lock */
@@ -471,7 +475,7 @@
if (!size || (size >> PAGE_SHIFT) > num_physpages)
return NULL;

- area = get_vm_area_node(size, VM_ALLOC, node);
+ area = get_vm_area_node(size, VM_ALLOC, node, gfp_mask);
if (!area)
return NULL;