Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S1755321AbbBPEnS (ORCPT ); Sun, 15 Feb 2015 23:43:18 -0500
Received: from mail-ob0-f182.google.com ([209.85.214.182]:36208 "EHLO
        mail-ob0-f182.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
        with ESMTP id S1751756AbbBPEnR (ORCPT );
        Sun, 15 Feb 2015 23:43:17 -0500
MIME-Version: 1.0
In-Reply-To:
References:
Date: Mon, 16 Feb 2015 13:43:16 +0900
Message-ID:
Subject: Re: [Regression]: mm: nommu: Memory leak introduced with commit
        "mm/nommu: use alloc_pages_exact() rather than its own implementation"
From: Joonsoo Kim
To: Maxime Coquelin
Cc: Joonsoo Kim, Linux Memory Management List,
        "linux-kernel@vger.kernel.org"
Content-Type: text/plain; charset=UTF-8
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Length: 10209
Lines: 235

2015-02-15 23:21 GMT+09:00 Maxime Coquelin:
> Hi Joonsoo,
>
> I am currently working on STM32 microcontroller family upstream.
> The STM32 family is ARM Cortex-M based, so no MMU.
> As user-space, I use a ramdisk with a statically-linked busybox installed.
>
> On v3.19, I am facing a memory leak.
> Each time I run a command one page is lost.
> Here is an example with busybox's free command:
>
> / # free
>              total       used       free     shared    buffers     cached
> Mem:          7928       1972       5956          0          0        492
> -/+ buffers/cache:       1480       6448
> / # free
>              total       used       free     shared    buffers     cached
> Mem:          7928       1976       5952          0          0        492
> -/+ buffers/cache:       1484       6444
> / # free
>              total       used       free     shared    buffers     cached
> Mem:          7928       1980       5948          0          0        492
> -/+ buffers/cache:       1488       6440
> / # free
>              total       used       free     shared    buffers     cached
> Mem:          7928       1984       5944          0          0        492
> -/+ buffers/cache:       1492       6436
> / # free
>              total       used       free     shared    buffers     cached
> Mem:          7928       1988       5940          0          0        492
> -/+ buffers/cache:       1496       6432
>
> At some point, the system fails to satisfy 256KB allocations:
>
> [   38.720000] free: page allocation failure: order:6, mode:0xd0
> [   38.730000] CPU: 0 PID: 67 Comm: free Not tainted 3.19.0-05389-gacf2cf1-dirty #64
> [   38.740000] Hardware name: STM32 (Device Tree Support)
> [   38.740000] [<08022e25>] (unwind_backtrace) from [<080221e7>] (show_stack+0xb/0xc)
> [   38.750000] [<080221e7>] (show_stack) from [<0804fd3b>] (warn_alloc_failed+0x97/0xbc)
> [   38.760000] [<0804fd3b>] (warn_alloc_failed) from [<08051171>] (__alloc_pages_nodemask+0x295/0x35c)
> [   38.770000] [<08051171>] (__alloc_pages_nodemask) from [<08051243>] (__get_free_pages+0xb/0x24)
> [   38.780000] [<08051243>] (__get_free_pages) from [<0805127f>] (alloc_pages_exact+0x19/0x24)
> [   38.790000] [<0805127f>] (alloc_pages_exact) from [<0805bdbf>] (do_mmap_pgoff+0x423/0x658)
> [   38.800000] [<0805bdbf>] (do_mmap_pgoff) from [<08056e73>] (vm_mmap_pgoff+0x3f/0x4e)
> [   38.810000] [<08056e73>] (vm_mmap_pgoff) from [<08080949>] (load_flat_file+0x20d/0x4f8)
> [   38.820000] [<08080949>] (load_flat_file) from [<08080dfb>] (load_flat_binary+0x3f/0x26c)
> [   38.830000] [<08080dfb>] (load_flat_binary) from [<08063741>] (search_binary_handler+0x51/0xe4)
> [   38.840000] [<08063741>] (search_binary_handler) from [<08063a45>] (do_execveat_common+0x271/0x35c)
> [   38.850000] [<08063a45>] (do_execveat_common)
> from [<08063b49>] (do_execve+0x19/0x1c)
> [   38.860000] [<08063b49>] (do_execve) from [<08020a01>] (ret_fast_syscall+0x1/0x4a)
> [   38.870000] Mem-info:
> [   38.870000] Normal per-cpu:
> [   38.870000] CPU 0: hi: 0, btch: 1 usd: 0
> [   38.880000] active_anon:0 inactive_anon:0 isolated_anon:0
> [   38.880000]  active_file:0 inactive_file:0 isolated_file:0
> [   38.880000]  unevictable:123 dirty:0 writeback:0 unstable:0
> [   38.880000]  free:1515 slab_reclaimable:17 slab_unreclaimable:139
> [   38.880000]  mapped:0 shmem:0 pagetables:0 bounce:0
> [   38.880000]  free_cma:0
> [   38.910000] Normal free:6060kB min:352kB low:440kB high:528kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:492kB isolated(anon):0ks
> [   38.950000] lowmem_reserve[]: 0 0
> [   38.950000] Normal: 23*4kB (U) 22*8kB (U) 24*16kB (U) 23*32kB (U) 23*64kB (U) 23*128kB (U) 1*256kB (U) 0*512kB 0*1024kB 0*2048kB 0*4096kB = 6060kB
> [   38.970000] 123 total pagecache pages
> [   38.970000] 2048 pages of RAM
> [   38.980000] 1538 free pages
> [   38.980000] 66 reserved pages
> [   38.990000] 109 slab pages
> [   38.990000] -46 pages shared
> [   38.990000] 0 pages swap cached
> [   38.990000] nommu: Allocation of length 221184 from process 67 (free) failed
> [   39.000000] Normal per-cpu:
> [   39.010000] CPU 0: hi: 0, btch: 1 usd: 0
> [   39.010000] active_anon:0 inactive_anon:0 isolated_anon:0
> [   39.010000]  active_file:0 inactive_file:0 isolated_file:0
> [   39.010000]  unevictable:123 dirty:0 writeback:0 unstable:0
> [   39.010000]  free:1515 slab_reclaimable:17 slab_unreclaimable:139
> [   39.010000]  mapped:0 shmem:0 pagetables:0 bounce:0
> [   39.010000]  free_cma:0
> [   39.050000] Normal free:6060kB min:352kB low:440kB high:528kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:492kB isolated(anon):0ks
> [   39.090000] lowmem_reserve[]: 0 0
> [   39.090000] Normal: 23*4kB (U) 22*8kB (U) 24*16kB (U) 23*32kB (U) 23*64kB (U) 23*128kB (U) 1*256kB (U) 0*512kB 0*1024kB 0*2048kB
> 0*4096kB = 6060kB
> [   39.100000] 123 total pagecache pages
> [   39.110000] Unable to allocate RAM for process text/data, errno 12
> SEGV
>
> I found that this is a regression, which has been introduced with this patch:
>
> ------------------------------------------------------------------------------
> commit dbc8358c72373daa4f37b7e233fecbc47105fe54
> Author: Joonsoo Kim
> Date: Fri Dec 12 16:55:55 2014 -0800
>
>     mm/nommu: use alloc_pages_exact() rather than its own implementation
>
>     do_mmap_private() in nommu.c try to allocate physically contiguous pages
>     with arbitrary size in some cases and we now have good abstract function
>     to do exactly same thing, alloc_pages_exact(). So, change to use it.
>
>     There is no functional change. This is the preparation step for support
>     page owner feature accurately.
>
>     Signed-off-by: Joonsoo Kim
> ------------------------------------------------------------------------------
>
> Indeed, when I revert it, the issue no longer appears; I can run the free
> command for hours without any issue.
> The problem is that I fail to understand what in your patch could
> cause the issue.
>
> I enabled the traces in the mm/nommu.c file; this is what I get with your patch:
>
> [    5.970000] ==> do_mmap_pgoff(,0,36000,7,2,0)
> [    5.970000] xxxalloc order 6 for 36000yyy
> [    5.970000] xxxtry to alloc exact 54 pagesyyy
> [    5.970000] ==> add_vma_to_mm(,d0781600)
> [    5.970000] <== do_mmap_pgoff() = d0540000
> [    5.990000] ==> do_mmap_pgoff(,0,2000,3,4000021,0)
> [    5.990000] xxxalloc order 1 for 2000yyy
> [    5.990000] ==> add_vma_to_mm(,d07818a0)
> [    5.990000] <== do_mmap_pgoff() = d0576000
> [    6.000000] ==> exit_mmap()
> [    6.000000] ==> delete_vma_from_mm(d0781600)
> [    6.000000] ==> delete_vma(d0781600)
> [    6.000000] ==> __put_nommu_region(d078f120{1})
> [    6.000000] xxxfree seriesyyy
> [    6.000000] xxx- free d0540000yyy
> [    6.000000] xxx- free d0541000yyy
> [    6.010000] xxx- free d0542000yyy
>
> [    6.020000] xxx- free d0572000yyy
> [    6.020000] xxx- free d0573000yyy
> [    6.020000] xxx- free d0574000yyy
> [    6.020000] xxx- free d0575000yyy
> [    6.020000] ==> delete_vma_from_mm(d07818a0)
> [    6.020000] ==> delete_vma(d07818a0)
> [    6.020000] ==> __put_nommu_region(d078f0f0{1})
> [    6.020000] xxxfree seriesyyy
> [    6.020000] xxx- free d0576000yyy
> [    6.020000] xxx- free d0577000yyy
> [    6.020000] xxxfree page d07faee0: refcount not one: 0yyy
> [    6.020000] <== exit_mmap()
>
> As you can see, I have one warning that shows up "free page d07faee0:
> refcount not one: 0".
> When reverting your patch, I don't have this warning:
>
> [    6.320000] ==> do_mmap_pgoff(,0,36000,7,2,0)
> [    6.320000] xxxalloc order 6 for 36000yyy
> [    6.320000] xxxshave 8/10 @64yyy
> [    6.320000] xxxshave 2/2 @56yyy
> [    6.320000] ==> add_vma_to_mm(,d0781600)
> [    6.320000] <== do_mmap_pgoff() = d0540000
> [    6.340000] ==> do_mmap_pgoff(,0,2000,3,4000021,0)
> [    6.340000] xxxalloc order 1 for 2000yyy
> [    6.340000] ==> add_vma_to_mm(,d0781720)
> [    6.340000] <== do_mmap_pgoff() = d0536000
> [    6.350000] ==> exit_mmap()
> [    6.350000] ==> delete_vma_from_mm(d0781720)
> [    6.350000] ==> delete_vma(d0781720)
> [    6.350000] ==> __put_nommu_region(d078f0f0{1})
> [    6.350000] xxxfree seriesyyy
> [    6.350000] xxx- free d0536000yyy
> [    6.350000] xxx- free d0537000yyy
> [    6.350000] ==> delete_vma_from_mm(d0781600)
> [    6.350000] ==> delete_vma(d0781600)
> [    6.350000] ==> __put_nommu_region(d078f120{1})
> [    6.350000] xxxfree seriesyyy
> [    6.350000] xxx- free d0540000yyy
> [    6.350000] xxx- free d0541000yyy
> [    6.350000] xxx- free d0542000yyy
>
> [    6.370000] xxx- free d0572000yyy
> [    6.370000] xxx- free d0573000yyy
> [    6.370000] xxx- free d0574000yyy
> [    6.370000] xxx- free d0575000yyy
> [    6.370000] <== exit_mmap()
>
> Do you have an idea on what could cause the issue?
>
> I can do any tests you could find relevant to hunt down this bug.

Hello,

Sorry for my mistake.

The problem happens because, when we allocate memory through
__get_free_pages(), the refcount of every page except the head page
is not 1. The modification below will fix your problem.
Could you test it, please?

Thanks.

------------>8-------------
diff --git a/mm/nommu.c b/mm/nommu.c
index 28bd8c4..ff6c1e2 100644
--- a/mm/nommu.c
+++ b/mm/nommu.c
@@ -1189,11 +1189,9 @@ static int do_mmap_private(struct vm_area_struct *vma,
 	if (sysctl_nr_trim_pages && total - point >= sysctl_nr_trim_pages) {
 		total = point;
 		kdebug("try to alloc exact %lu pages", total);
-		base = alloc_pages_exact(len, GFP_KERNEL);
-	} else {
-		base = (void *)__get_free_pages(GFP_KERNEL, order);
 	}
 
+	base = alloc_pages_exact(total << PAGE_SHIFT, GFP_KERNEL);
 	if (!base)
 		goto enomem;
 
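
For illustration only, here is a small user-space C model of the refcount
behaviour described in the reply above. It is a sketch under assumptions,
not kernel code: model_page, order_for(), model_get_free_pages(),
model_alloc_pages_exact() and model_free_series() are hypothetical names
made up for this example. The model assumes that freeing a page means
dropping one reference and releasing the page only when the count reaches
zero, and that an exact allocation gives every page it keeps a refcount
of 1.

```c
/* User-space toy model of the refcount issue; NOT kernel code.
 * Every name below is hypothetical and exists only for this sketch. */
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAX_PAGES 64

struct model_page {
    int refcount;   /* stands in for the per-page reference count */
    int in_use;     /* 1 while the page is held, 0 once released */
};

static struct model_page pages[MAX_PAGES];

/* Smallest order such that (1 << order) >= npages. */
unsigned int order_for(unsigned int npages)
{
    unsigned int order = 0;

    while ((1u << order) < npages)
        order++;
    return order;
}

/* Models a plain high-order allocation: only the head page gets refcount 1. */
void model_get_free_pages(unsigned int order)
{
    unsigned int i, n = 1u << order;

    assert(n <= MAX_PAGES);
    memset(pages, 0, sizeof(pages));
    for (i = 0; i < n; i++)
        pages[i].in_use = 1;
    pages[0].refcount = 1;
}

/* Models an exact allocation: split the block so that every page has
 * refcount 1, then hand back the pages beyond npages. */
void model_alloc_pages_exact(unsigned int npages)
{
    unsigned int i, n = 1u << order_for(npages);

    assert(npages >= 1);
    model_get_free_pages(order_for(npages));
    for (i = 1; i < n; i++)
        pages[i].refcount = 1;          /* the split step */
    for (i = npages; i < n; i++) {      /* return the unused tail */
        pages[i].refcount = 0;
        pages[i].in_use = 0;
    }
}

/* Models a page-by-page free: drop one reference per page and release the
 * page only if the count reaches zero. Returns the number of pages that
 * are still held afterwards, i.e. leaked. */
unsigned int model_free_series(unsigned int npages)
{
    unsigned int i, leaked = 0;

    for (i = 0; i < npages; i++) {
        if (--pages[i].refcount == 0)
            pages[i].in_use = 0;
    }
    for (i = 0; i < MAX_PAGES; i++) {
        if (pages[i].in_use)
            leaked++;
    }
    return leaked;
}

#ifdef MODEL_DEMO
int main(void)
{
    model_get_free_pages(1);
    printf("plain order-1 block, 2 pages freed: %u leaked\n",
           model_free_series(2));
    model_alloc_pages_exact(2);
    printf("exact 2-page block, 2 pages freed: %u leaked\n",
           model_free_series(2));
    return 0;
}
#endif
```

In this model, the 2-page (order 1) block taken the plain way leaves one
page held after the free series, while the exact allocation leaves none.
That is consistent with the one page lost per command and with the
"refcount not one: 0" warning in the trace above. Building with
-DMODEL_DEMO adds a main() that prints both counts.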