Subject: Re: [PATCH] mm/hugetlb resv map memory leak for placeholder entries
To: Dmitry Vyukov, syzkaller
References: <1449024761-11280-1-git-send-email-mike.kravetz@oracle.com> <04ad01d12cd0$c9bfe070$5d3fa150$@alibaba-inc.com>
Cc: Andrew Morton, Naoya Horiguchi, David Rientjes, "Kirill A. Shutemov", Dave Hansen, "linux-mm@kvack.org", LKML, Hugh Dickins, Greg Thelen, Kostya Serebryany, Alexander Potapenko, Sasha Levin, Eric Dumazet
From: Mike Kravetz
Message-ID: <565F0EFE.2000804@oracle.com>
Date: Wed, 2 Dec 2015 07:32:14 -0800

On 12/02/2015 01:26 AM, Dmitry Vyukov wrote:
> FWIW, I see this leak also with mlock, mmap, get_mempolicy and page
> faults. So it is not specific only to the new fancy mlock2.

I assume/hope the patch addresses leaks with those other calls as well?
--
Mike Kravetz

>
> On Wed, Dec 2, 2015 at 8:12 AM, Hillf Danton wrote:
>>>
>>> Dmitry Vyukov reported the following memory leak
>>>
>>> unreferenced object 0xffff88002eaafd88 (size 32):
>>>   comm "a.out", pid 5063, jiffies 4295774645 (age 15.810s)
>>>   hex dump (first 32 bytes):
>>>     28 e9 4e 63 00 88 ff ff 28 e9 4e 63 00 88 ff ff  (.Nc....(.Nc....
>>>     00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
>>>   backtrace:
>>>     [< inline >] kmalloc include/linux/slab.h:458
>>>     [] region_chg+0x2d4/0x6b0 mm/hugetlb.c:398
>>>     [] __vma_reservation_common+0x2c3/0x390 mm/hugetlb.c:1791
>>>     [< inline >] vma_needs_reservation mm/hugetlb.c:1813
>>>     [] alloc_huge_page+0x19e/0xc70 mm/hugetlb.c:1845
>>>     [< inline >] hugetlb_no_page mm/hugetlb.c:3543
>>>     [] hugetlb_fault+0x7a1/0x1250 mm/hugetlb.c:3717
>>>     [] follow_hugetlb_page+0x339/0xc70 mm/hugetlb.c:3880
>>>     [] __get_user_pages+0x542/0xf30 mm/gup.c:497
>>>     [] populate_vma_page_range+0xde/0x110 mm/gup.c:919
>>>     [] __mm_populate+0x1c7/0x310 mm/gup.c:969
>>>     [] do_mlock+0x291/0x360 mm/mlock.c:637
>>>     [< inline >] SYSC_mlock2 mm/mlock.c:658
>>>     [] SyS_mlock2+0x4b/0x70 mm/mlock.c:648
>>>
>>> Dmitry identified a potential memory leak in the routine region_chg,
>>> where a region descriptor is not freed on an error path.
>>>
>>> However, the root cause for the above memory leak resides in region_del.
>>> In this specific case, a "placeholder" entry is created in region_chg. The
>>> associated page allocation fails, and the placeholder entry is left in the
>>> reserve map. This is "by design" as the entry should be deleted when the
>>> map is released. The bug is in the region_del routine which is used to
>>> delete entries within a specific range (and when the map is released).
>>> region_del did not handle the case where a placeholder entry exactly matched
>>> the start of the range to be deleted. In this case, the entry would
>>> not be deleted and leaked. The fix is to take these special placeholder
>>> entries into account in region_del.
>>>
>>> The region_chg error path leak is also fixed.
>>>
>>> Fixes: feba16e25a57 ("add region_del() to delete a specific range of entries")
>>> Cc: stable@vger.kernel.org [4.3]
>>> Signed-off-by: Mike Kravetz
>>> Reported-by: Dmitry Vyukov
>>> ---
>>
>> Acked-by: Hillf Danton
>>
>>>  mm/hugetlb.c | 12 ++++++++++--
>>>  1 file changed, 10 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
>>> index 1101ccd94..ba07014 100644
>>> --- a/mm/hugetlb.c
>>> +++ b/mm/hugetlb.c
>>> @@ -372,8 +372,10 @@ retry_locked:
>>>  		spin_unlock(&resv->lock);
>>>
>>>  		trg = kmalloc(sizeof(*trg), GFP_KERNEL);
>>> -		if (!trg)
>>> +		if (!trg) {
>>> +			kfree(nrg);
>>>  			return -ENOMEM;
>>> +		}
>>>
>>>  		spin_lock(&resv->lock);
>>>  		list_add(&trg->link, &resv->region_cache);
>>> @@ -483,7 +485,13 @@ static long region_del(struct resv_map *resv, long f, long t)
>>>  retry:
>>>  	spin_lock(&resv->lock);
>>>  	list_for_each_entry_safe(rg, trg, head, link) {
>>> -		if (rg->to <= f)
>>> +		/*
>>> +		 * file_region ranges are normally of the form [from, to).
>>> +		 * However, there may be a "placeholder" entry in the map
>>> +		 * which is of the form (from, to) with from == to. Check
>>> +		 * for placeholder entries as well.
>>> +		 */
>>> +		if (rg->to <= f && rg->to != rg->from)
>>>  			continue;
>>>  		if (rg->from >= t)
>>>  			break;
>>> --
>>> 2.4.3
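
For anyone following the thread, the effect of the changed test in
region_del can be sketched in user space. This is only an illustration
under stated assumptions, not the kernel code: struct region, skip_old,
skip_new and count_visited are hypothetical names, the reserve map is
modelled as a sorted array instead of a list, and only the skip/break
logic from the hunk above is reproduced (no locking, splitting or
freeing).

```c
#include <stdbool.h>

/* Hypothetical stand-in for the kernel's struct file_region. */
struct region {
	long from;
	long to;
};

/* Pre-patch test: skip every entry that ends at or before f. */
static bool skip_old(const struct region *rg, long f)
{
	return rg->to <= f;
}

/* Patched test: a placeholder (from == to) is never skipped. */
static bool skip_new(const struct region *rg, long f)
{
	return rg->to <= f && rg->to != rg->from;
}

/*
 * Count the entries that a delete of [f, t) would go on to process,
 * assuming the array is sorted by 'from' like the reserve map list.
 */
static int count_visited(const struct region *map, int n, long f, long t,
			 bool (*skip)(const struct region *, long))
{
	int visited = 0;

	for (int i = 0; i < n; i++) {
		if (skip(&map[i], f))
			continue;
		if (map[i].from >= t)
			break;
		visited++;
	}
	return visited;
}
```

With a map holding a normal region [0, 2) and a placeholder at 5, a
delete of [5, 10) processes no entries under skip_old, so the
placeholder stays behind, and one entry under skip_new.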