2016-10-06 16:42:14

by Adrian Chadd

Subject: Re: Question on kzalloc and GFP_DMA32

On 28 September 2016 at 16:37, Ben Greear <[email protected]> wrote:
> On 09/28/2016 02:11 PM, David Rientjes wrote:

[snip]

>>
>> I suppose it's failing sometimes because the BUG() will trigger when
>> trying to allocate new slab or CONFIG_ZONE_DMA32 isn't configured. That
>> shouldn't panic the kernel anymore since commit 72baeef0c271 ("slab: do
>> not panic on invalid gfp_mask") in 4.8, but you shouldn't be doing
>> kzalloc(..., ... | GFP_DMA32) anyway.
>
>
> CONFIG_ZONE_DMA32 is enabled in my .config.
>
>>> pool_size is relatively large (maybe 256k or so).
>>>
>>
>> If it's 256k, why allocate through the slab allocator? Why not
>> alloc_pages(GFP_KERNEL | GFP_DMA32 | __GFP_ZERO | __GFP_NOWARN,
>> get_order(pool_size))?
>
>
> I really don't understand the (subtle?) difference between alloc_pages and
> kzalloc, but I will give your suggestion a try and see if it works. If you
> have time, maybe you could take a look at
> drivers/net/wireless/ath/ath10k/wmi.c in the ath10k_wmi_alloc_chunk method
> to see if you notice any problems with using alloc_pages there?
>
> Thanks for the suggestion.

Hi, to follow up (since I was the one who originally suggested DMA32 for
fixing this WMI-specific issue):

The underlying reason is that ath10k needs some host memory that lies
within the device's physical address range (which is 32-bit), so that the
on-chip SoC (target CPU) can use the host memory for doing, well, what's
effectively swapping code/data segments in and out.

When using kzalloc, it looks like it's allocating memory from whatever
is available, and then it has to find contiguous DMA32 pages when it's
asked to get a physical mapping to that range. Normally the failure
experienced on 64-bit machines w/ ath10k is when the DMA32 region is
either too fragmented, or just plain full.

I've pointed out to a few ath10k people that because of how the code
kzallocs a hundred k or so of memory and then wants a physically
contiguous region to be mapped, they're likely to eventually fail. My
guess (since I'm mainly a FreeBSD developer) is that the way ath10k is
getting physically contiguous bounce-buffer style memory after a kzalloc
is really designed for small, ephemeral allocations, and not the large
allocation that ath10k does every time the device is brought up.
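
For a sense of scale, here is a rough userspace sketch of the calculation
that the kernel's get_order() performs. order_for_size() is a hypothetical
name, and the 4 KiB page size used in the example figures below is an
assumption (the kernel uses its own compile-time PAGE_SIZE).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Userspace approximation of the kernel's get_order(): the smallest order
 * such that (page_size << order) >= size. The page size is a parameter
 * here; the kernel uses its compile-time PAGE_SIZE instead.
 */
static unsigned int order_for_size(size_t size, size_t page_size)
{
    unsigned int order = 0;
    size_t span = page_size;

    /* page_size must be a nonzero power of two. */
    assert(page_size != 0 && (page_size & (page_size - 1)) == 0);
    /* Keep the doubling below from overflowing size_t. */
    assert(size <= SIZE_MAX / 2);

    while (span < size) {
        span <<= 1;
        order++;
    }
    return order;
}
```

With 4 KiB pages, a 100k request comes out as order 5 (32 physically
contiguous pages) and a 256k request as order 6 (64 contiguous pages),
which is the kind of run that gets hard to find once the DMA32 zone is
fragmented.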

So, what's the "blessed" way in Linux to allocate device DMA-aware
contiguous memory to let the target device play in? Is it alloc_pages,
or is there some other API that ath10k should be using to allocate
physical memory within the bus DMA constraints of a specific device?

Thanks!


-adrian