Date: Fri, 27 Apr 2018 06:13:18 -0400 (EDT)
From: Chunyu Hu
Reply-To: Chunyu Hu
To: Catalin Marinas
Cc: Michal Hocko, Chunyu Hu, Dmitry Vyukov, LKML, Linux-MM
Message-ID: <503481697.20310393.1524823998160.JavaMail.zimbra@redhat.com>
In-Reply-To: <978702110.19841228.1524666829157.JavaMail.zimbra@redhat.com>
References: <1524243513-29118-1-git-send-email-chuhu@redhat.com>
 <20180424132057.GE17484@dhcp22.suse.cz>
 <20180424134148.qkvqqa4c37l6irvg@armageddon.cambridge.arm.com>
 <482146467.19754107.1524649841393.JavaMail.zimbra@redhat.com>
 <20180425125154.GA29722@MBP.local>
 <978702110.19841228.1524666829157.JavaMail.zimbra@redhat.com>
Subject: Re: [RFC] mm: kmemleak: replace __GFP_NOFAIL to GFP_NOWAIT in gfp_kmemleak_mask
Sender: linux-kernel-owner@vger.kernel.org
X-Mailing-List: linux-kernel@vger.kernel.org

----- Original Message -----
> From: "Chunyu Hu"
> To: "Catalin Marinas"
> Cc: "Michal Hocko", "Chunyu Hu", "Dmitry Vyukov",
> "LKML", "Linux-MM"
> Sent: Wednesday, April 25, 2018 10:33:49 PM
> Subject: Re: [RFC] mm: kmemleak: replace __GFP_NOFAIL to GFP_NOWAIT in gfp_kmemleak_mask
>
>
>
> ----- Original Message -----
> > From: "Catalin Marinas"
> > To: "Chunyu Hu"
> > Cc: "Michal Hocko", "Chunyu Hu", "Dmitry Vyukov",
> > "LKML", "Linux-MM"
> > Sent: Wednesday, April 25, 2018 8:51:55 PM
> > Subject: Re: [RFC] mm: kmemleak: replace __GFP_NOFAIL to GFP_NOWAIT in
> > gfp_kmemleak_mask
> >
> > On Wed, Apr 25, 2018 at 05:50:41AM -0400, Chunyu Hu wrote:
> > > ----- Original Message -----
> > > > From: "Catalin Marinas"
> > > > On Tue, Apr 24, 2018 at 07:20:57AM -0600, Michal Hocko wrote:
> > > > > On Mon 23-04-18 12:17:32, Chunyu Hu wrote:
> > > > > [...]
> > > > > > So if there is a new flag, it would be the 25th bits.
> > > > >
> > > > > No new flags please.
> > > > > Can you simply store a simple bool into fail_page_alloc
> > > > > and have save/restore api for that?
> > > >
> > > > For kmemleak, we probably first hit failslab. Something like below may
> > > > do the trick:
> > > >
> > > > diff --git a/mm/failslab.c b/mm/failslab.c
> > > > index 1f2f248e3601..63f13da5cb47 100644
> > > > --- a/mm/failslab.c
> > > > +++ b/mm/failslab.c
> > > > @@ -29,6 +29,9 @@ bool __should_failslab(struct kmem_cache *s, gfp_t gfpflags)
> > > >  	if (failslab.cache_filter && !(s->flags & SLAB_FAILSLAB))
> > > >  		return false;
> > > >
> > > > +	if (s->flags & SLAB_NOLEAKTRACE)
> > > > +		return false;
> > > > +
> > > >  	return should_fail(&failslab.attr, s->object_size);
> > > >  }

If the goal is only to fix this slab fault-injection issue, and page
allocation fault injection is not enabled, this should be enough to make
the warning go away. For the page allocation failure part, per-task
handling is an option that avoids introducing a new GFP flag for fault
injection.

> > >
> > > This maybe is the easy enough way for skipping fault injection for
> > > kmemleak slab object.
> >
> > This was added to avoid kmemleak tracing itself, so could be used for
> > other kmemleak-related cases.
> >
> > > > Can we get a second should_fail() via should_fail_alloc_page() if a new
> > > > slab page is allocated?
> > >
> > > looking at code path below, what do you mean by getting a second
> > > should_fail() via fail_alloc_page?
> >
> > Kmemleak calls kmem_cache_alloc() on a cache with SLAB_NOLEAKTRACE, so the
> > first point of failure injection is __should_failslab() which we can
> > handle with the slab flag. The slab allocator itself ends up calling
> > alloc_pages() to allocate a slab page (and __GFP_NOFAIL is explicitly
> > cleared). Here we have the second potential failure injection via
>
> Indeed.
>
> > fail_alloc_page().
> > That's unless the order < fail_page_alloc.min_order
> > which I think is the default case (min_order = 1 while the slab page
> > allocation for kmemleak would need an order of 0). It's not ideal but we
> > may get away with it.
>
> In my workstation, I checked the value shown is order=2
>
> [mm]# cat /sys/kernel/slab/kmemleak_object/order
> 2
> [mm]# uname -r
> 4.17.0-rc1.syzcaller+
>
>
> If order is 2, then not into the branch, no false is returned, so not
> skipped..
> static bool should_fail_alloc_page(gfp_t gfp_mask, unsigned int order)
> {
> 	if (order < fail_page_alloc.min_order)
> 		return false;
>
> > >
> > > Seems we need to insert the flag between alloc_slab_page and
> > > alloc_pages()? Without GFP flag, it's difficult to pass info to
> > > should_fail_alloc_page and keep simple at same time.
> >
> > Indeed.
> >
> > > Or as Michal suggested, completely disabling page alloc fail injection
> > > when kmemleak enabled. And enable it again when kmemleak off.
> >
> > Dmitry's point was that kmemleak is still useful to detect leaks on the
> > error path where errors are actually introduced by the fault injection.
> > Kmemleak cannot cope with allocation failures as it needs a pretty
> > precise tracking of the allocated objects.
>
> understand.
> >
> > An alternative could be to not free the early_log buffer in kmemleak and
> > use that memory in an emergency when allocation fails (though I don't
> > particularly like this).

This is still an option.

> >
> > Yet another option is to use NOFAIL and remove NORETRY in kmemleak when
> > fault injection is enabled.
>
> I'm going to have a try this way to see if any warning can be seen when
> running.
> This should be the best if it works fine.

__GFP_NOFAIL strictly requires that the allocation can do direct reclaim;
otherwise the warning is still seen, even with 'use NOFAIL and remove
NORETRY' as you suggested. So this is not an option.
mm/page_alloc.c:

	if (gfp_mask & __GFP_NOFAIL) {
		/*
		 * All existing users of the __GFP_NOFAIL are blockable, so warn
		 * of any new users that actually require GFP_NOWAIT
		 */
		if (WARN_ON_ONCE(!can_direct_reclaim))
			goto fail;

So I also tried adding DIRECT_RECLAIM together with NOFAIL and, as
expected, the allocation then sleeps in atomic context, so that does not
work either:

[  168.802049] BUG: sleeping function called from invalid context at mm/slab.h:421
[  168.802937] in_atomic(): 1, irqs_disabled(): 0, pid: 0, name: swapper/2
[  168.803701] INFO: lockdep is turned off.
[  168.804162] Preemption disabled at:
[  168.804171] [] start_secondary+0x141/0x5f0
[  168.805259] CPU: 2 PID: 0 Comm: swapper/2 Tainted: G W 4.17.0-rc2.syzcaller+ #18
[  168.806267] Hardware name: Red Hat KVM, BIOS 0.0.0 02/06/2015
[  168.806928] Call Trace:
[  168.807211]
[  168.807456]  dump_stack+0x11b/0x1be
[  168.807854]  ? show_regs_print_info+0x12/0x12
[  168.808347]  ? start_secondary+0x141/0x5f0
[  168.808845]  ? create_object+0xa6/0xaf0
[  168.809284]  ___might_sleep+0x3a6/0x5d0
[  168.809732]  kmem_cache_alloc+0x2d0/0x580
[  168.810186]  ? update_sd_lb_stats+0x3080/0x3080
[  168.810727]  ? __netif_receive_skb_core+0x15a3/0x3400
[  168.811293]  ? __build_skb+0x86/0x3b0
[  168.811698]  create_object+0xa6/0xaf0

>
> >
> > --
> > Catalin
> >
>
> --
> Regards,
> Chunyu Hu
>
>

--
Regards,
Chunyu Hu
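
For anyone following the thread, the two checks this message turns on can be modeled in user space. What follows is only a sketch under stated assumptions: the MODEL_* flag names and values, the enum, and all three function names are hypothetical and are not kernel definitions. It mirrors the shape of the order test quoted from should_fail_alloc_page() and of the __GFP_NOFAIL check quoted from mm/page_alloc.c, plus the rule that an allocation allowed to do direct reclaim may sleep.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for GFP bits; values are arbitrary, not the kernel's. */
#define MODEL_GFP_DIRECT_RECLAIM 0x1u
#define MODEL_GFP_NOFAIL         0x2u

enum model_outcome {
	MODEL_OK,              /* neither problem applies */
	MODEL_WARN_NOFAIL,     /* NOFAIL without direct reclaim: warning path */
	MODEL_SLEEP_IN_ATOMIC  /* direct reclaim allowed while caller is atomic */
};

/*
 * Same shape as the quoted order test: requests whose order is below
 * min_order are never candidates for injected page allocation failure.
 */
bool model_exempt_from_page_fault_injection(unsigned int order,
					    unsigned int min_order)
{
	return order < min_order;
}

/*
 * Models the dilemma described in the message: NOFAIL needs direct
 * reclaim, but direct reclaim may sleep, which is invalid when the
 * caller is in atomic context.
 */
enum model_outcome model_check_flags(unsigned int gfp, bool in_atomic)
{
	bool can_direct_reclaim = (gfp & MODEL_GFP_DIRECT_RECLAIM) != 0;

	if ((gfp & MODEL_GFP_NOFAIL) && !can_direct_reclaim)
		return MODEL_WARN_NOFAIL;
	if (can_direct_reclaim && in_atomic)
		return MODEL_SLEEP_IN_ATOMIC;
	return MODEL_OK;
}

/* Human-readable name for an outcome, for logging in a test harness. */
const char *model_outcome_name(enum model_outcome outcome)
{
	switch (outcome) {
	case MODEL_OK:
		return "ok";
	case MODEL_WARN_NOFAIL:
		return "warn: NOFAIL without direct reclaim";
	case MODEL_SLEEP_IN_ATOMIC:
		return "bug: may sleep in atomic context";
	}
	assert(!"unknown model_outcome");
	return "unknown";
}
```

With the values reported in the thread, model_exempt_from_page_fault_injection(2, 1) is false, so an order-2 kmemleak_object slab page is not skipped by the min_order test, while an order-0 request would be. model_check_flags() with NOFAIL alone gives the warning case, and NOFAIL plus direct reclaim from an atomic caller gives the sleeping-in-atomic case, which is why neither flag combination works for kmemleak here.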