Received: by 10.192.165.148 with SMTP id m20csp2991707imm; Sun, 22 Apr 2018 21:21:48 -0700 (PDT) X-Google-Smtp-Source: AIpwx4/r+H2UvO14hSlbmoAVHcj5f/n6iIKKgPokNlr2TFCCqi+QbCrmYJ0J5cnChwfLU9+aphpR X-Received: by 10.101.86.134 with SMTP id v6mr6197210pgs.92.1524457308312; Sun, 22 Apr 2018 21:21:48 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1524457308; cv=none; d=google.com; s=arc-20160816; b=mn4TuUB5tFDkW+TTnGZEG6A7JTK3+n85j9FMlKerShKZGvhHhd8y/MY4aedKAkcuqh bkEP6rAuJYbr9Ay9+/4Ps4/hKqtBRKUMkCEUNKkVHwMJID6LgUEB5SaPjMyXrWEZ0Z3O bTQl/XhGtQQpq73WdTo9x2R+uU+JPRaxBg4DfywyZz+q6U4iNfRUYqO9P7Mx//j0MGqb p8Pch8HJ62kLWBQ/b3wqV3j0SuLjdDvLgaoooF3tV28IBxT5vBNzficT92jy4K13GtyM IoXfZRpr1whxujsdc1uIHMomPmeRddtBE88BCOlOq+j0doKTZSMMJI/SE6T0ilIVSsW4 2TrQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:cc:to:subject:message-id:date:from :references:in-reply-to:mime-version:dkim-signature :arc-authentication-results; bh=yosjfbOmTzG5aemPopkq+76MBGBLdP6N45bn+4ytyyU=; b=JmLW/Xs9NNxoKKtr7HNatbTdtjgZLZu9aehsQYYu58i7apFVTZC4nz3s4VNhkg03mk WWhyk8xAJC+fkyK9r2R1lBciMWcL9NYk/OsBZNqVDq7u+ua3PatlvJW7oGqYwDOEhEG0 gz1sgWyQCjztj/tFNgXKTQ5P3WkcNW33TV+4M0pBZu62BZNjTP0kqzwNmk4iRgo92cBh 90+gwh3YKe8QhpefzS/50AFQRGq5qgq00bYJFJwf+nDmpx3GUm0+2Mh4zQlpxesACP1d gBIrjrL0rB60LeI1VT8lLcWD5hsSfUNbc5JVeq9VA+EqEpMUABnyJO53YsuZlp+7N5mM Xs5w== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@gmail.com header.s=20161025 header.b=ZTHU+IZm; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=QUARANTINE dis=NONE) header.from=gmail.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id e15-v6si11118389pli.163.2018.04.22.21.21.05; Sun, 22 Apr 2018 21:21:48 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; dkim=pass header.i=@gmail.com header.s=20161025 header.b=ZTHU+IZm; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=QUARANTINE dis=NONE) header.from=gmail.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1751233AbeDWERg (ORCPT + 99 others); Mon, 23 Apr 2018 00:17:36 -0400 Received: from mail-lf0-f49.google.com ([209.85.215.49]:40527 "EHLO mail-lf0-f49.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751097AbeDWERe (ORCPT ); Mon, 23 Apr 2018 00:17:34 -0400 Received: by mail-lf0-f49.google.com with SMTP id i18-v6so12828429lfc.7 for ; Sun, 22 Apr 2018 21:17:33 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=mime-version:in-reply-to:references:from:date:message-id:subject:to :cc; bh=yosjfbOmTzG5aemPopkq+76MBGBLdP6N45bn+4ytyyU=; b=ZTHU+IZmhQ7QW4fbLVoBSdUWnr61OOH/jWUejqCwjHvvakuTgYcNIkCChmw3n9XSZL D++UUfdFUFynoHIw4mqRUcSGQy17VIG9hfm+X2hyqc6SKZmxG0g1gSMZTWXCWpiWpeTk 8u20oYby7LbbAiGSkljYBeDUnHHnvDQLwyEL5M0VGtJ+pmSIHHCnRFs1sTTcIMwISYNz fi/ZOi3hrxnjLQWRPrGBf0yrJUNoffhYD1z/zjDt+NSJrySmKCKY69QovQAhkgpYgm1g uis7IuCSmOOFBm3ietOHAhK07JZscYVBfB/mJh0BaIXS1wb2WtFw6YtKrYpTGaQTjnxK 3a6Q== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:in-reply-to:references:from:date :message-id:subject:to:cc; bh=yosjfbOmTzG5aemPopkq+76MBGBLdP6N45bn+4ytyyU=; b=X/VJsyE4GhP526kuHbmjNGHSm5WbvTE6DXxZWoeleRfj4MDtSDgVdDn+SvuoWqMzU/ qKxmOaJTqTrsrwVVSCOoGRyMAOo+1Tl1QsIeWVtXL5HydVZ6Wd4OurN4ULzgmviwXf4r I5lM0LhRpNYbw+8zh3opNZVA9nnb/cTXJpEgUsty7YoZwJN7AEJkIH4e8yE/hpboC9hb CEzO/oT/jbsgvZrAI0WF/mpAEgxsvOVB4yF9E0m5+a2XBHep3sCalLdJz9TMUir7JTbg UT5tcGtO53lW7gPLKM7iV8C3nLDuYtSLeG7CEwYtptGH2JFPRwEzgctMmx9rc6DdxvrM op4Q== X-Gm-Message-State: ALQs6tBhwxrcx8IdKwvwjbCIwt5KJxBpmnn0ISC8/DJAVDjB5eZGxe/Q rsvfA9IFkeFj9HwAd0vY4riyT+bvcULXZ53PbxE= X-Received: by 10.46.73.73 with SMTP id b9mr5087855ljd.118.1524457052737; Sun, 22 Apr 2018 21:17:32 -0700 (PDT) MIME-Version: 1.0 Received: by 10.46.157.213 with HTTP; Sun, 22 Apr 2018 21:17:32 -0700 (PDT) In-Reply-To: References: <1524243513-29118-1-git-send-email-chuhu@redhat.com> <20180420175023.3c4okuayrcul2bom@armageddon.cambridge.arm.com> <20180422125141.GF17484@dhcp22.suse.cz> From: Chunyu Hu Date: Mon, 23 Apr 2018 12:17:32 +0800 Message-ID: Subject: Re: [RFC] mm: kmemleak: replace __GFP_NOFAIL to GFP_NOWAIT in gfp_kmemleak_mask To: Dmitry Vyukov Cc: Michal Hocko , Catalin Marinas , Chunyu Hu , LKML , Linux-MM Content-Type: text/plain; charset="UTF-8" Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 22 April 2018 at 23:00, Dmitry Vyukov wrote: > On Sun, Apr 22, 2018 at 2:51 PM, Michal Hocko wrote: >> On Fri 20-04-18 18:50:24, Catalin Marinas wrote: >>> On Sat, Apr 21, 2018 at 12:58:33AM +0800, Chunyu Hu wrote: >>> > __GFP_NORETRY and __GFP_NOFAIL are combined in gfp_kmemleak_mask now. >>> > But it's a wrong combination. As __GFP_NOFAIL is blockable, but >>> > __GFP_NORETY is not blockable, make it self-contradiction. >>> > >>> > __GFP_NOFAIL means 'The VM implementation _must_ retry infinitely'. But >>> > it's not the real intention, as kmemleak allow alloc failure happen in >>> > memory pressure, in that case kmemleak just disables itself. >>> >>> Good point. The __GFP_NOFAIL flag was added by commit d9570ee3bd1d >>> ("kmemleak: allow to coexist with fault injection") to keep kmemleak >>> usable under fault injection. >>> >>> > commit 9a67f6488eca ("mm: consolidate GFP_NOFAIL checks in the allocator >>> > slowpath") documented that what user wants here should use GFP_NOWAIT, and >>> > the WARN in __alloc_pages_slowpath caught this weird usage. >>> > >>> > >>> > WARNING: CPU: 3 PID: 64 at mm/page_alloc.c:4261 __alloc_pages_slowpath+0x1cc3/0x2780 >>> [...] >>> > Replace the __GFP_NOFAIL with GFP_NOWAIT in gfp_kmemleak_mask, __GFP_NORETRY >>> > and GFP_NOWAIT are in the gfp_kmemleak_mask. So kmemleak object allocaion >>> > is no blockable and no reclaim, making kmemleak less disruptive to user >>> > processes in pressure. >>> >>> It doesn't solve the fault injection problem for kmemleak (unless we >>> change __should_failslab() somehow, not sure yet). An option would be to >>> replace __GFP_NORETRY with __GFP_NOFAIL in kmemleak when fault injection >>> is enabled. >> >> Cannot we simply have a disable_fault_injection knob around the >> allocation rather than playing this dirty tricks with gfp flags which do >> not make any sense? To this way, looks like we need to change the attrs. but what we have stored in attr is also gfp flags, even we define a new member, ignore_flags, we still need a new flag, or did I miss something? But looks like a new flag is simple. For slab, it supports cache_filter, so it's possible to add a filer_out slab flag, and skip it. But for page, it's just has gfp flag and size info for filter, no other info. #ifdef CONFIG_FAIL_PAGE_ALLOC static struct { struct fault_attr attr; bool ignore_gfp_highmem; bool ignore_gfp_reclaim; u32 min_order; } fail_page_alloc = { .attr = FAULT_ATTR_INITIALIZER, .ignore_gfp_reclaim = true, .ignore_gfp_highmem = true, .min_order = 1, }; static struct { struct fault_attr attr; bool ignore_gfp_reclaim; bool cache_filter; } failslab = { .attr = FAULT_ATTR_INITIALIZER, .ignore_gfp_reclaim = true, .cache_filter = false, }; >> >>> BTW, does the combination of NOWAIT and NORETRY make kmemleak >>> allocations more likely to fail? >> >> NOWAIT + NORETRY simply doesn't make much sesne. It is equivalent to >> NOWAIT. > > Specifying a flag that says "don't do fault injection for this > allocation" looks like a reasonable solution. Fewer lines of code and > no need to switch on interrupts. __GFP_NOFAIL seems to mean more than > that, so perhaps we need a separate flag that affects only fault > injection and should be used only in debugging code (no-op without > fault injection anyway). Got the two places for skipping fault injection, both they check the gfp flags as part of the work. If we have the new no fault inject flag, and define the wrapper as below, then it will look like. #define _GFP_NOFAULTINJECT (__GFP_NOFAIL|___GFP_NOFAULTINJECT) bool __should_failslab(struct kmem_cache *s, gfp_t gfpflags) { /* No fault-injection for bootstrap cache */ if (unlikely(s == kmem_cache)) return false; if (gfpflags & _GFP_NOFAULTINJECTL) return false; if (failslab.ignore_gfp_reclaim && (gfpflags & __GFP_RECLAIM)) return false; if (failslab.cache_filter && !(s->flags & SLAB_FAILSLAB)) return false; return should_fail(&failslab.attr, s->object_size); } static bool should_fail_alloc_page(gfp_t gfp_mask, unsigned int order) { if (order < fail_page_alloc.min_order) return false; if (gfp_mask & _GFP_NOFAULTINJECT) return false; if (fail_page_alloc.ignore_gfp_highmem && (gfp_mask & __GFP_HIGHMEM)) return false; if (fail_page_alloc.ignore_gfp_reclaim && (gfp_mask & __GFP_DIRECT_RECLAIM)) return false; return should_fail(&fail_page_alloc.attr, 1 << order); } the gfp flags defined in linux/gfp.h. we have now 24 flags already, and we have an precedent ___GFP_NOLOCKDEP for skipping lockdep. /* Plain integer GFP bitmasks. Do not use this directly. */ #define ___GFP_DMA 0x01u #define ___GFP_HIGHMEM 0x02u #define ___GFP_DMA32 0x04u #define ___GFP_MOVABLE 0x08u #define ___GFP_RECLAIMABLE 0x10u #define ___GFP_HIGH 0x20u #define ___GFP_IO 0x40u #define ___GFP_FS 0x80u #define ___GFP_NOWARN 0x200u #define ___GFP_RETRY_MAYFAIL 0x400u #define ___GFP_NOFAIL 0x800u #define ___GFP_NORETRY 0x1000u #define ___GFP_MEMALLOC 0x2000u #define ___GFP_COMP 0x4000u #define ___GFP_ZERO 0x8000u #define ___GFP_NOMEMALLOC 0x10000u #define ___GFP_HARDWALL 0x20000u #define ___GFP_THISNODE 0x40000u #define ___GFP_ATOMIC 0x80000u #define ___GFP_ACCOUNT 0x100000u #define ___GFP_DIRECT_RECLAIM 0x400000u #define ___GFP_WRITE 0x800000u #define ___GFP_KSWAPD_RECLAIM 0x1000000u #ifdef CONFIG_LOCKDEP #define ___GFP_NOLOCKDEP 0x2000000u #else #define ___GFP_NOLOCKDEP 0 #endif So if there is a new flag, it would be the 25th bits. #ifdef CONFIG_KMEMLEAK #define ___GFP_NOFAULTINJECT 0x4000000u #else #define ___GFP_NOFAULT_INJECT 0 #endif