LinuxLists.cc - Using Bootmem for large DMA buffers in the presence of the slab allocator

2010-08-04 06:07:55

Subject: Using Bootmem for large DMA buffers in the presence of the slab allocator

Hi Everyone,

I am currently developing kernel code to allocate and reserve a large
(64MB) contiguous buffer for DMA. My approach is to use the boot-time
allocator (alloc_bootmem_low_pages()), with my module statically
linked into the kernel. I initially tried to call this function from
my module's init() function; however, on boot this generated
a warning indicating that the slab allocator was already available:

from mm/bootmem.c, in the alloc_arch_preferred_bootmem() function -
lines 541-542:

	if (WARN_ON_ONCE(slab_is_available()))
		return kzalloc(size, GFP_NOWAIT);

Because the buffer was too large for kmalloc, the kmalloc call would
fail. I traced the alloc_bootmem_low_pages() call further and
discovered that since the kmalloc call was failing, it was falling
back to alloc_bootmem_core(). So does this mean that the bootmem
allocator is trying to allocate memory while the slab allocator is up
and running? And is this supposed to work?

The reason I ask is that when testing the system under high memory
usage conditions, I get a "Bad page state" BUG() for my
allocated pages (see below). I have matched the pfns and confirmed
that they correspond to the pages allocated by
alloc_bootmem_low_pages(). My theory is that the slab allocator's list
of free pages does not get updated by the bootmem allocator, so the
slab allocator sees my DMA buffer as unallocated. Does this
sound correct?
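
As an illustration of the pfn matching described above, here is a minimal
user-space sketch. It assumes 4 KiB pages (a page shift of 12); the helper
names phys_to_pfn() and pfn_in_buffer() and the buffer base address used in
the note below are hypothetical and are not taken from the kernel or from
this thread.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: 4 KiB pages. The real value comes from the kernel config. */
#define EXAMPLE_PAGE_SHIFT 12

/* Page frame number that contains physical address `phys`. */
unsigned long phys_to_pfn(unsigned long phys)
{
    return phys >> EXAMPLE_PAGE_SHIFT;
}

/* Non-zero if `pfn` lies inside the buffer [base, base + size). */
int pfn_in_buffer(unsigned long pfn, unsigned long base, unsigned long size)
{
    unsigned long first, last;

    assert(size > 0);
    first = phys_to_pfn(base);
    last = phys_to_pfn(base + size - 1);
    return pfn >= first && pfn <= last;
}
```

With a hypothetical 64MB buffer starting at physical address 0x48000000,
the buffer covers pfns 0x48000 through 0x4bfff, so
pfn_in_buffer(0x4bc01, 0x48000000, 64UL << 20) is non-zero for the pfn
reported in the BUG line below.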

The only resolution I can see to this problem is to call the bootmem
allocator before the slab allocator is up and running, but as far as I
can tell, this requires editing one of the kernel start routines, or
the start_kernel() function itself. I have done this and it now works
without the bug, but is there a cleaner solution?

I am running Linux 2.6.31 on the Microblaze architecture.

Thanks in Advance
Peter Crosthwaite
PetaLogix

BUG: Bad page state in process mst pfn:4bc01
page:c09a0020 flags:(null) count:1 mapcount:0 mapping:(null) index:0

Stack:
c0044150 c023f330 c6e5dd5c 00005f65 00004000 00004001 c6e5dd78 c0045024
c01e0c0c c09a0020 00000000 00000001 00000000 00000000 00000000 c024b5a8
c004525c 00000001 000004b8 c6e22000 00000001 000200da c010c188 c024b594
Call Trace:

[<c0044150>] bad_page+0x12c/0x160
[<c0045024>] get_page_from_freelist+0x318/0x43c
[<c004525c>] __alloc_pages_nodemask+0x114/0x594
[<c010c188>] ulite_transmit+0x78/0xf0
[<c0051dac>] handle_mm_fault+0x19c/0x48c
[<c0059fdc>] page_add_new_anon_rmap+0x68/0x94
[<c0009914>] do_page_fault+0x264/0x480
[<c01020b0>] tty_ldisc_deref+0x8/0x1c
[<c00fb210>] tty_write_unlock+0x14/0x44
[<c00081c8>] page_fault_instr_trap+0x1f8/0x200
[<c000ba00>] set_next_entity+0x28/0x70
[<c0062f78>] vfs_write+0xa4/0x150
[<c000bb3c>] __enqueue_entity+0xb0/0xd4
[<c0062ff0>] vfs_write+0x11c/0x150
[<c0016d78>] do_softirq+0x34/0x54
[<c000bd7c>] pick_next_task_fair+0x98/0xd4
[<c000bd88>] pick_next_task_fair+0xa4/0xd4
[<c000dd18>] put_prev_task_fair+0x48/0x70
[<c01d25cc>] schedule+0x1b4/0x414
[<c01d27e4>] schedule+0x3cc/0x414
[<c01d25a0>] schedule+0x188/0x414
[<c01d248c>] schedule+0x74/0x414
[<c01d2654>] schedule+0x23c/0x414
[<c0007738>] ret_from_trap+0x48/0x1d4
[<c0008550>] irq_call+0x0/0x8

2010-08-04 15:40:32

by Christoph Lameter

Subject: Re: Using Bootmem for large DMA buffers in the presence of the slab allocator

On Wed, 4 Aug 2010, Peter Crosthwaite wrote:

> Because the buffer was too large for kmalloc, the kmalloc call would
> fail. I traced the alloc_bootmem_low_pages() call further and
> discovered that since the kmalloc call was failing, it was falling
> back to alloc_bootmem_core(). So does this mean that the bootmem
> allocator is trying to allocate memory while the slab allocator is up
> and running? And is this supposed to work?

The bootmem allocator should not work when slab is fully up. However,
there is a grey period where the page allocator is not fully functional
yet but the slab allocator is mostly working.

> The reason i ask, is that when testing the system under high memory
> usage conditions, I would get a "Bad page state" BUG() for my
> allocated pages (see below). I have matched the pfns and confirmed
> that they correspond to the pages allocated by the
> alloc_bootmem_low_pages(). My theory is that the slab allocators list
> of free pages does not get updated by the bootmem allocator, so the
> slab allocator is seeing my DMA buffer as un-allocated. Does this
> sound correct?

bootmem allocations do not reserve page structs. So you do not have a page
state at all. If you do something that requires a page state then you will
have strange failures.

2010-08-04 17:36:05

by Pekka Enberg

Subject: Re: Using Bootmem for large DMA buffers in the presence of the slab allocator

On Wed, 4 Aug 2010, Peter Crosthwaite wrote:
>> Because the buffer was too large for kmalloc, the kmalloc call would
>> fail. I traced the alloc_bootmem_low_pages() call further and
>> discovered that since the kmalloc call was failing, it was falling
>> back to alloc_bootmem_core(). So does this mean that the bootmem
>> allocator is trying to allocate memory while the slab allocator is up
>> and running? And is this supposed to work?

On Wed, Aug 4, 2010 at 6:40 PM, Christoph Lameter wrote:
> The bootmem allocator should not work when slab is fully up. However,
> there is a grey period where the page allocator is not fully functional
> yet but the slab allocator is mostly working.

Yup, the WARN_ON there means that someone is calling the bootmem
allocator after slab is up and running and the call-site needs to be
fixed. The slab fallback is there for convenience so that we don't
crash the kernel during bootup and it's not supposed to work for large
allocations.

Pekka

2010-08-04 18:01:53

by Christoph Lameter

Subject: Re: Using Bootmem for large DMA buffers in the presence of the slab allocator

Note also that allocating a 64M buffer will be difficult due to the
maximum allocation size restrictions of the page allocator.

You therefore have no choice but to allocate the memory at boot
time. Or (if you have control over the arch) increase the maximum
allocation unit (MAX_ORDER I believe).
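
The size limit mentioned above can be worked through with a small
user-space sketch. It assumes 4 KiB pages and a MAX_ORDER of 11, which is
the usual default in 2.6.x kernels unless the architecture overrides it;
both values should be checked against the actual kernel configuration.
The function names are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Assumptions, not taken from the thread: 4 KiB pages and MAX_ORDER 11. */
#define EXAMPLE_PAGE_SHIFT 12
#define EXAMPLE_MAX_ORDER  11

/* Smallest order such that (1 << order) pages cover `size` bytes. */
unsigned int order_for_size(size_t size)
{
    size_t pages;
    unsigned int order = 0;

    assert(size > 0);
    /* Round the byte count up to whole pages. */
    pages = (size + ((size_t)1 << EXAMPLE_PAGE_SHIFT) - 1) >> EXAMPLE_PAGE_SHIFT;
    while (((size_t)1 << order) < pages)
        order++;
    return order;
}

/* Largest single request the page allocator can satisfy, in bytes. */
size_t max_buddy_bytes(void)
{
    /* The highest usable order is MAX_ORDER - 1. */
    return (size_t)1 << (EXAMPLE_MAX_ORDER - 1 + EXAMPLE_PAGE_SHIFT);
}

/* Non-zero if `size` bytes fit in one page allocator request. */
int fits_in_buddy(size_t size)
{
    return order_for_size(size) < EXAMPLE_MAX_ORDER;
}
```

Under these assumptions a 64MB buffer is 16384 pages, which needs order 14,
while the largest request the page allocator accepts is order 10, or 4MB.
MAX_ORDER would have to be raised to at least 15 for a 64MB request to be
possible at all, and even then it could still fail at run time because of
fragmentation.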

2010-08-05 00:21:21

by FUJITA Tomonori

Subject: Re: Using Bootmem for large DMA buffers in the presence of the slab allocator

On Wed, 4 Aug 2010 16:07:52 +1000
Peter Crosthwaite wrote:

> I am currently developing Kernel code to allocate and reserve a large
> (64MB) contiguous buffer for DMA.

Does the buffer need to be physically contiguous (i.e. your hardware
can't do scatter/gather)? If so, as already pointed out, there is no
pretty solution for it.

Some drivers need to do this, so the issue has been
under discussion:

http://marc.info/?l=linux-mm&m=128015343705933&w=2

2010-08-06 00:02:46

by Peter Crosthwaite

Subject: Re: Using Bootmem for large DMA buffers in the presence of the slab allocator

Hi Everyone,

Thanks for your replies. I am going to stick with my start_kernel()
edit until this CMA engine comes along.

Regards,
Peter

On Thu, Aug 5, 2010 at 10:16 AM, FUJITA Tomonori wrote:
> On Wed, 4 Aug 2010 16:07:52 +1000
> Peter Crosthwaite wrote:
>
>> I am currently developing Kernel code to allocate and reserve a large
>> (64MB) contiguous buffer for DMA.
>
> A buffer needs to be physically continuous (your hardware can't do
> scatter gather)? If so, as already pointed out, there is no pretty
> solution for it.
>
> There are some drivers that need to do such so the issue has been
> under discussion:
>
> http://marc.info/?l=linux-mm&m=128015343705933&w=2
>