2009-10-05 10:16:54

by Jan Beulich

[permalink] [raw]
Subject: [PATCH] adjust gfp mask passed on nested vmalloc() invocation

- fix a latent bug resulting from blindly or-ing in __GFP_ZERO, since
the combination of this and __GFP_HIGHMEM (possibly passed into the
function) is forbidden in interrupt context
- avoid wasting more precious resources (DMA or DMA32 pools), when
being called through vmalloc_32{,_user}()
- explicitly allow using high memory here even if the outer allocation
request doesn't allow it, unless is collides with __GFP_ZERO

Signed-off-by: Jan Beulich <[email protected]>

---
mm/vmalloc.c | 12 ++++++++----
1 file changed, 8 insertions(+), 4 deletions(-)

--- linux-2.6.32-rc3/mm/vmalloc.c 2009-10-05 11:59:56.000000000 +0200
+++ 2.6.32-rc3-vmalloc-nested-gfp/mm/vmalloc.c 2009-10-05 08:40:36.000000000 +0200
@@ -1410,6 +1410,7 @@ static void *__vmalloc_area_node(struct
{
struct page **pages;
unsigned int nr_pages, array_size, i;
+ gfp_t nested_gfp = (gfp_mask & GFP_RECLAIM_MASK) | __GFP_ZERO;

nr_pages = (area->size - PAGE_SIZE) >> PAGE_SHIFT;
array_size = (nr_pages * sizeof(struct page *));
@@ -1417,13 +1418,16 @@ static void *__vmalloc_area_node(struct
area->nr_pages = nr_pages;
/* Please note that the recursion is strictly bounded. */
if (array_size > PAGE_SIZE) {
- pages = __vmalloc_node(array_size, gfp_mask | __GFP_ZERO,
+#ifdef CONFIG_HIGHMEM
+ /* See the comment in prep_zero_page(). */
+ if (!in_interrupt())
+ nested_gfp |= __GFP_HIGHMEM;
+#endif
+ pages = __vmalloc_node(array_size, nested_gfp,
PAGE_KERNEL, node, caller);
area->flags |= VM_VPAGES;
} else {
- pages = kmalloc_node(array_size,
- (gfp_mask & GFP_RECLAIM_MASK) | __GFP_ZERO,
- node);
+ pages = kmalloc_node(array_size, nested_gfp, node);
}
area->pages = pages;
area->caller = caller;



2009-10-06 21:58:55

by Hugh Dickins

[permalink] [raw]
Subject: Re: [PATCH] adjust gfp mask passed on nested vmalloc() invocation

On Mon, 5 Oct 2009, Jan Beulich wrote:

> - fix a latent bug resulting from blindly or-ing in __GFP_ZERO, since
> the combination of this and __GFP_HIGHMEM (possibly passed into the
> function) is forbidden in interrupt context
> - avoid wasting more precious resources (DMA or DMA32 pools), when
> being called through vmalloc_32{,_user}()
> - explicitly allow using high memory here even if the outer allocation
> request doesn't allow it, unless is collides with __GFP_ZERO
>
> Signed-off-by: Jan Beulich <[email protected]>

I thought vmalloc.c was a BUG_ON(in_interrupt()) zone?
The locking is all spin_lock stuff, not spin_lock_irq stuff.
That's probably why your "bug" has remained "latent".

Using HIGHMEM for internal arrays looks reasonable to me; but if
__GFP_ZERO were a problem, wouldn't it be much cleaner to skip the
"unless it collides" and #ifdef CONFIG_HIGHMEM !in_interrupt() stuff,
just memset the array returned from __vmalloc_node()?

Hugh

>
> ---
> mm/vmalloc.c | 12 ++++++++----
> 1 file changed, 8 insertions(+), 4 deletions(-)
>
> --- linux-2.6.32-rc3/mm/vmalloc.c 2009-10-05 11:59:56.000000000 +0200
> +++ 2.6.32-rc3-vmalloc-nested-gfp/mm/vmalloc.c 2009-10-05 08:40:36.000000000 +0200
> @@ -1410,6 +1410,7 @@ static void *__vmalloc_area_node(struct
> {
> struct page **pages;
> unsigned int nr_pages, array_size, i;
> + gfp_t nested_gfp = (gfp_mask & GFP_RECLAIM_MASK) | __GFP_ZERO;
>
> nr_pages = (area->size - PAGE_SIZE) >> PAGE_SHIFT;
> array_size = (nr_pages * sizeof(struct page *));
> @@ -1417,13 +1418,16 @@ static void *__vmalloc_area_node(struct
> area->nr_pages = nr_pages;
> /* Please note that the recursion is strictly bounded. */
> if (array_size > PAGE_SIZE) {
> - pages = __vmalloc_node(array_size, gfp_mask | __GFP_ZERO,
> +#ifdef CONFIG_HIGHMEM
> + /* See the comment in prep_zero_page(). */
> + if (!in_interrupt())
> + nested_gfp |= __GFP_HIGHMEM;
> +#endif
> + pages = __vmalloc_node(array_size, nested_gfp,
> PAGE_KERNEL, node, caller);
> area->flags |= VM_VPAGES;
> } else {
> - pages = kmalloc_node(array_size,
> - (gfp_mask & GFP_RECLAIM_MASK) | __GFP_ZERO,
> - node);
> + pages = kmalloc_node(array_size, nested_gfp, node);
> }
> area->pages = pages;
> area->caller = caller;

2009-10-07 07:44:39

by Jan Beulich

[permalink] [raw]
Subject: Re: [PATCH] adjust gfp mask passed on nested vmalloc() invocation

>>> Hugh Dickins <[email protected]> 06.10.09 23:58 >>>
>On Mon, 5 Oct 2009, Jan Beulich wrote:
>
>> - fix a latent bug resulting from blindly or-ing in __GFP_ZERO, since
>> the combination of this and __GFP_HIGHMEM (possibly passed into the
>> function) is forbidden in interrupt context
>> - avoid wasting more precious resources (DMA or DMA32 pools), when
>> being called through vmalloc_32{,_user}()
>> - explicitly allow using high memory here even if the outer allocation
>> request doesn't allow it, unless is collides with __GFP_ZERO
>>
>> Signed-off-by: Jan Beulich <[email protected]>
>
>I thought vmalloc.c was a BUG_ON(in_interrupt()) zone?
>The locking is all spin_lock stuff, not spin_lock_irq stuff.
>That's probably why your "bug" has remained "latent".

Then you probably mean BUG_ON(irqs_disabled()), which would seem
correct. But if the gfp mask massaging was needed for calling kmalloc(),
it would seem odd that the same shouldn't be needed for calling
vmalloc() recursively...

>Using HIGHMEM for internal arrays looks reasonable to me; but if
>__GFP_ZERO were a problem, wouldn't it be much cleaner to skip the
>"unless it collides" and #ifdef CONFIG_HIGHMEM !in_interrupt() stuff,
>just memset the array returned from __vmalloc_node()?

The main goal was to change the existing code as little as possible - I
did consider this alternative, but wasn't sure that would be accepted.
If you view this as the better alternative, I'll certainly modify the
patch to do it that way.

Jan

2009-10-07 09:00:13

by Jan Beulich

[permalink] [raw]
Subject: Re: [PATCH] adjust gfp mask passed on nested vmalloc() invocation

>>> Hugh Dickins <[email protected]> 06.10.09 23:58 >>>
>On Mon, 5 Oct 2009, Jan Beulich wrote:
>
>> - fix a latent bug resulting from blindly or-ing in __GFP_ZERO, since
>> the combination of this and __GFP_HIGHMEM (possibly passed into the
>> function) is forbidden in interrupt context
>> - avoid wasting more precious resources (DMA or DMA32 pools), when
>> being called through vmalloc_32{,_user}()
>> - explicitly allow using high memory here even if the outer allocation
>> request doesn't allow it, unless is collides with __GFP_ZERO
>>
>> Signed-off-by: Jan Beulich <[email protected]>
>
>I thought vmalloc.c was a BUG_ON(in_interrupt()) zone?
>The locking is all spin_lock stuff, not spin_lock_irq stuff.
>That's probably why your "bug" has remained "latent".

Actually, my previous reply to this was bogus, and I agree with your
statement. Hence, from a second version of the patch (depending on
your response on my question regarding the other part of your reply),
I should drop that part of the description.

Jan

2009-10-07 12:09:06

by Hugh Dickins

[permalink] [raw]
Subject: Re: [PATCH] adjust gfp mask passed on nested vmalloc() invocation

On Wed, 7 Oct 2009, Jan Beulich wrote:
> >>> Hugh Dickins <[email protected]> 06.10.09 23:58 >>>
> >On Mon, 5 Oct 2009, Jan Beulich wrote:
> >
> >> - fix a latent bug resulting from blindly or-ing in __GFP_ZERO, since
> >> the combination of this and __GFP_HIGHMEM (possibly passed into the
> >> function) is forbidden in interrupt context
> >> - avoid wasting more precious resources (DMA or DMA32 pools), when
> >> being called through vmalloc_32{,_user}()
> >> - explicitly allow using high memory here even if the outer allocation
> >> request doesn't allow it, unless is collides with __GFP_ZERO
> >>
> >> Signed-off-by: Jan Beulich <[email protected]>
> >
> >I thought vmalloc.c was a BUG_ON(in_interrupt()) zone?
> >The locking is all spin_lock stuff, not spin_lock_irq stuff.
> >That's probably why your "bug" has remained "latent".
>
> Then you probably mean BUG_ON(irqs_disabled()), which would seem
> correct.

I'm relieved you came to see that remark as bogus.

> But if the gfp mask massaging was needed for calling kmalloc(),
> it would seem odd that the same shouldn't be needed for calling
> vmalloc() recursively...
>
> >Using HIGHMEM for internal arrays looks reasonable to me; but if
> >__GFP_ZERO were a problem, wouldn't it be much cleaner to skip the
> >"unless it collides" and #ifdef CONFIG_HIGHMEM !in_interrupt() stuff,
> >just memset the array returned from __vmalloc_node()?
>
> The main goal was to change the existing code as little as possible - I
> did consider this alternative, but wasn't sure that would be accepted.
> If you view this as the better alternative, I'll certainly modify the
> patch to do it that way.

Well, now we've accepted that this code cannot be used in_interrupt(),
there's no need for your #ifdef CONFIG_HIGHMEM nor for my memset: just
use __GFP_ZERO as it was before, and your patch would amount to or'ing
__GFP_HIGHMEM into gfp_mask for the __vmalloc_node case - wouldn't it?

Hugh

2009-10-07 12:21:14

by Jan Beulich

[permalink] [raw]
Subject: Re: [PATCH] adjust gfp mask passed on nested vmalloc() invocation

>>> Hugh Dickins <[email protected]> 07.10.09 14:08 >>>
>Well, now we've accepted that this code cannot be used in_interrupt(),
>there's no need for your #ifdef CONFIG_HIGHMEM nor for my memset: just
>use __GFP_ZERO as it was before, and your patch would amount to or'ing
>__GFP_HIGHMEM into gfp_mask for the __vmalloc_node case - wouldn't it?

Plus the consolidation of masking the passed in gfp_mask by
GFP_RECLAIM_MASK also for the nested vmalloc() case, in particular to
remove the GFP_DMA* possibly coming in from vmalloc_32(). But yes,
it will become simpler.

Jan