From: Adrian Chadd
Date: Thu, 6 Oct 2016 09:41:30 -0700
Subject: Re: Question on kzalloc and GFP_DMA32
To: Ben Greear
Cc: David Rientjes, Linux Kernel Mailing List
X-Mailing-List: linux-kernel@vger.kernel.org

On 28 September 2016 at 16:37, Ben Greear wrote:
> On 09/28/2016 02:11 PM, David Rientjes wrote:

[snip]

>>
>> I suppose it's failing sometimes because the BUG() will trigger when
>> trying to allocate new slab or CONFIG_ZONE_DMA32 isn't configured. That
>> shouldn't panic the kernel anymore since commit 72baeef0c271 ("slab: do
>> not panic on invalid gfp_mask") in 4.8, but you shouldn't be doing
>> kzalloc(..., ... | GFP_DMA32) anyway.
>
>
> CONFIG_ZONE_DMA32 is enabled in my .config.
>
>>> pool_size is relatively large (maybe 256k or so).
>>>
>>
>> If it's 256k, why allocate through the slab allocator? Why not
>> alloc_pages(GFP_KERNEL | GFP_DMA32 | __GFP_ZERO | __GFP_NOWARN,
>> get_order(pool_size))?
>
>
> I really don't understand the (subtle?) difference between alloc_pages and
> kzalloc, but I will give your suggestion a try and see if it works. If you
> have time, maybe you could take a look at
> drivers/net/wireless/ath/ath10k/wmi.c in the ath10k_wmi_alloc_chunk method
> to see if you notice any problems with using alloc_pages there?
>
> Thanks for the suggestion.
Hi,

To follow up (since I was the original suggester of DMA32 for fixing this
WMI-specific issue):

The underlying reason is that ath10k is looking for some memory that's in
the device's physical memory range (which is 32 bit) so the on-chip SoC
(target CPU) can use the host memory for doing, well, what's effectively
swapping code/data segments in and out.

When using kzalloc, it looks like it's allocating memory from whatever is
available, and then it has to find contiguous DMA32 pages when it's asked
to get a physical mapping to that range. Normally the failure experienced
on 64 bit machines w/ ath10k occurs when the DMA32 region is either too
fragmented or just plain full.

I've pointed out to a few ath10k people that because the code kzallocs a
hundred k or so of memory and then wants a physically contiguous region to
be mapped, it's likely to fail eventually.

My guess (since I'm mainly a FreeBSD developer) is that the way ath10k is
getting physically contiguous, bounce-buffer style memory after a kzalloc
is really designed for small, ephemeral allocations, and not the large
allocation that ath10k does every time the device is brought up.

So, what's the "blessed" way in Linux to allocate device-DMA-aware
contiguous memory to let the target device play in? Is it alloc_pages, or
is there some other API that ath10k should be using to allocate physical
memory within the bus DMA constraints of a specific device?

Thanks!


-adrian
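
For a sense of scale, the get_order(pool_size) argument in the quoted
alloc_pages() suggestion is the smallest order such that
(PAGE_SIZE << order) covers the request. The following is a userspace
sketch of that arithmetic only, not kernel code: order_for_size() and
bytes_for_order() are hypothetical helper names, and the page size is passed
in explicitly (4 KiB is assumed in the note below; the real value depends on
the architecture).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Userspace sketch: smallest order such that (page_size << order) >= size.
 * This mirrors the result the kernel's get_order() gives for a non-zero
 * size. Hypothetical helper; sizes above SIZE_MAX / 2 are rejected so the
 * shift below cannot overflow.
 */
unsigned int order_for_size(size_t size, size_t page_size)
{
	unsigned int order = 0;
	size_t span = page_size;

	assert(page_size > 0);
	assert(size <= SIZE_MAX / 2);

	while (span < size) {
		span <<= 1;	/* each order doubles the contiguous span */
		order++;
	}
	return order;
}

/* Bytes actually reserved by a page allocation of the given order. */
size_t bytes_for_order(unsigned int order, size_t page_size)
{
	return page_size << order;
}
```

With 4 KiB pages, order_for_size(256 * 1024, 4096) is 6, i.e. 64 physically
contiguous pages, and a request of about a hundred k rounds up to order 5
(128 KiB). Either way, the whole span has to be found in one piece, which is
why a fragmented or full DMA32 region makes the allocation fail.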