Received: by 10.213.65.68 with SMTP id h4csp1018088imn; Thu, 22 Mar 2018 13:31:13 -0700 (PDT) X-Google-Smtp-Source: AG47ELtpU79cfAMLYI4FWZucqsdldMqfx/Ogt3lJXXsOnS3I/w5ShnbmXgNH8TiworQUP5+DFvTn X-Received: by 10.98.152.207 with SMTP id d76mr21372486pfk.130.1521750673787; Thu, 22 Mar 2018 13:31:13 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1521750673; cv=none; d=google.com; s=arc-20160816; b=hvwX2Pl5FZmCrjBmUULO9Xu2iMM7ELVFiTW2CONbrd6t6YxSx2oCvkxSyzAgNXVr/K +wISQk/HlKmHyfvrIvJHO3j6e27xZOzNqCxouQAdUb2ZEYxw6u5pO9B6a/PxKLSHtuEE y2JrJMnLERW4cMbLiqKQOmgMPjan0jFDIPZEsI4iQo8V6NjFkOUVujDmIV2rDusYVCne K2dIC+VMheonYm1W4tM5pPyxvSk6s9i+ePRGw8c1iy2Pryj73iVgJD5qCaBuNzcJ4nNH 9k63JoVLn3CnS6sHz/Ufva7/8bH4zRK2SoLYl/eXF4auDVF6fI5ZK8fDKbTRb13AClHx K8xQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:mime-version:user-agent:references :message-id:in-reply-to:subject:cc:to:from:date:dkim-signature :arc-authentication-results; bh=XIdLkkOekvSkrRJgq4JS1mXbgTnh0MkNhubr5JyKIWA=; b=HHvUKRYaGQP71an1GQb88aMtqA2xRbFM++HfjopZ0tJqJMiU526lH7bEB5VP87Icwo AP2X/BG2hBJJAu0FtttwqiLjLJp8yHwxqLRbz30bknq3LcFCFgaRl86e0aeuEd5gxoL4 rwENyTrzSzfVRGYA1gqFD6HMZaVZ8dgRjPu3MIL8d1XEXfNdL5CStOGt4pObrtRfhzqz jVJ/OjNtPFfNUWGF52M1gSJbSdVM10kphn2ou/PCN52CnOYQ6+HB8KGcrL9uZudhFR58 PmhXxtRURnusVgWibDD8kKABF26AluPxn5WVQ/vGIoZFPUfUrQ8UiWpEFRKpg2onPNMQ 4L2Q== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@google.com header.s=20161025 header.b=LlT8e7js; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id v65si4728603pgv.807.2018.03.22.13.30.58; Thu, 22 Mar 2018 13:31:13 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; dkim=pass header.i=@google.com header.s=20161025 header.b=LlT8e7js; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752145AbeCVU3l (ORCPT + 99 others); Thu, 22 Mar 2018 16:29:41 -0400 Received: from mail-pg0-f44.google.com ([74.125.83.44]:46961 "EHLO mail-pg0-f44.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751600AbeCVU3j (ORCPT ); Thu, 22 Mar 2018 16:29:39 -0400 Received: by mail-pg0-f44.google.com with SMTP id t12so1951403pgp.13 for ; Thu, 22 Mar 2018 13:29:39 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20161025; h=date:from:to:cc:subject:in-reply-to:message-id:references :user-agent:mime-version; bh=XIdLkkOekvSkrRJgq4JS1mXbgTnh0MkNhubr5JyKIWA=; b=LlT8e7jsap+xVWTAZDY6WSCErtLR1QkUWciKHORnZDkykiukwJ/ignHduoykx/RWni jd1jRQixpCqOSZ8b3BPhwoF6ReVakTWj63+KuhbcxiqQTtY1oCpCeah1ULIHqdiwo29u DFS0nf5r3tmJv5q8Cd/2Xo0fvNn//BxOSFhVG2R/ZHlQbvr8wAF6cVMgMZGqv2niM7en fblLZyApI0wgr2zCftMYbUquXhnoiur8UOqCDw/imyiD+RQrq8ZfZ4hJ0d2wiEQGpD3y IQnGQbSk2E2O5KPT85u3T9Bv+YWH1YwOThBJk3jBOsOsumj30WcsncxS6U43eadIcL+D XIlg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:from:to:cc:subject:in-reply-to:message-id :references:user-agent:mime-version; bh=XIdLkkOekvSkrRJgq4JS1mXbgTnh0MkNhubr5JyKIWA=; b=Hes4Gg4rbJ24xjuOnBHJleDshfVmdu5w0uk7Z/r9KEJWa1TnkIvaoy7g0EyBRIKB7v PmuZ5HU8uW6y7ywKcr4YZCnBgKXoFBV7lv7JEc/UvzrjoAs+CjNcyERP9s6ephs4DGir F4nQCZiD4u/RxHNAll5rD+9928Zy3uGcJLDiCjBb2URMwWM5rFQoCGHWgDsvjau7SdZv QakqGuodrrPt2ei8LxKoczNdEAPWE3Y0jWMPtiHddqFI+So1JQdbVpbIrYzNiWie83z1 4GBqfGV4kEK7Oyq2Kxkp4PRB1qPxwezWUXS2+YX0rOljzsi+ACx2VcC2xgdUQbhrcx0W tD0A== X-Gm-Message-State: AElRT7HHJ83AVbhmCRFPVksMMYCPdLCw8pakZcoovryd9mhJC4VtxU0T RzdlvYo29XfN+Pu8+63WafVN/w== X-Received: by 10.99.95.86 with SMTP id t83mr18585522pgb.183.1521750578956; Thu, 22 Mar 2018 13:29:38 -0700 (PDT) Received: from [2620:15c:17:3:3a5:23a7:5e32:4598] ([2620:15c:17:3:3a5:23a7:5e32:4598]) by smtp.gmail.com with ESMTPSA id n187sm3684554pfn.8.2018.03.22.13.29.37 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Thu, 22 Mar 2018 13:29:38 -0700 (PDT) Date: Thu, 22 Mar 2018 13:29:37 -0700 (PDT) From: David Rientjes X-X-Sender: rientjes@chino.kir.corp.google.com To: Michal Hocko cc: Andrew Morton , Johannes Weiner , "Kirill A. Shutemov" , Vlastimil Babka , linux-mm@kvack.org, LKML Subject: Re: [PATCH] memcg, thp: do not invoke oom killer on thp charges In-Reply-To: <20180322085611.GY23100@dhcp22.suse.cz> Message-ID: References: <20180321205928.22240-1-mhocko@kernel.org> <20180321214104.GT23100@dhcp22.suse.cz> <20180322085611.GY23100@dhcp22.suse.cz> User-Agent: Alpine 2.20 (DEB 67 2015-01-07) MIME-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Thu, 22 Mar 2018, Michal Hocko wrote: > > So now you're making a generalized policy change to the memcg charge path > > to fix what is obviously only thp and caused by removing the __GFP_NORETRY > > from thp allocations in commit 2516035499b9? > > Yes, because relying on __GFP_NORETRY for the oom handling has proven to > be subtle and error prone. And as I've repeated few times already there > is _no_ reason why the oom policy for the memcg charge should be any > different from the allocator's one. > The PAGE_ALLOC_COSTLY_ORDER oom heuristic in the page allocator is about the unlikelihood of freeing contiguous memory. It was added around the same time I added the oom heuristic to prevent killing processes for lowmem because of the unlikelihood of freeing lowmem. If lowmem is accounted to a memcg, would you avoid oom kill in that scenario too, just for the sake of matching the page allocator? Of course not. The argument is absurd, I'm sorry. Charging 32KB of memory allows the memcg oom killer but magically 64KB of memory automatically fails and the charging path has no ability to override that because you've made it part of the charge path. There is no __GFP_RETRY. Making blanket claims that high-order charges should never oom kill and that you know better than the caller, and that you don't think the caller knows what they are doing wrt __GFP_NORETRY, is more dangerous imo than the potential benefit from what you are proposing. > They simply cannot because kmalloc performs the change under the cover. > So you would have to use kmalloc(gfp|__GFP_NORETRY) to be absolutely > sure to not trigger _any_ oom killer. This is just wrong thing to do. > Examples of where this isn't already done? It certainly wasn't a problem before __GFP_NORETRY was dropped in commit 2516035499b9 but you suspect it's a problem now. > > You're diverging from it because the memcg charge path has never had this > > heuristic. > > Which is arguably a bug which just didn't matter because we do not > have costly order oom eligible charges in general and THP was subtly > different and turned out to be error prone. > It was inadvertently dropped from commit 2516035499b9. There were no high-order charge oom kill problems before this commit. People know how to use __GFP_NORETRY or leave it off, which you don't trust them to do because you're hardcoding a heuristic in the charge path. You also acked the commit that introduced this "error prone" problem. Before you start to advertise that you know better than what previously worked just fine, let's fix the issue that was introduced by that commit and then you can propose a follow-up patch that changes the charge heuristic and it can stand on its own merits. > > I'm somewhat stunned this has to be repeated: > > PAGE_ALLOC_COSTLY_ORDER is about the ability to allocate _contiguous_ > > memory, it's not about the _amount_ of memory. Changing the memcg charge > > path to factor order into oom kill decisions is new, and should be > > proposed as a follow-up patch to my bug fix to describe what else is being > > impacted by your patch and what is fixed by it. > > > > Yours is a heuristic change, mine is a bug fix. > > Nobody is really arguing about this. I have just pointed out my > reservation that your bug fix is adding more special casing and a more > generic solution is due. Dude, it's setting a bit that the problem commit dropped. That's it. I'm setting a bit. > If you absolutely believe that your bugfix is > so important to make it to rc7 I will not object. It is however strange > that we haven't seen a _single_ bug report in last two years about this > being a problem. So I am not really sure the urgency is due. > You're not really sure about the urgency but you were "tempted to mark this for stable" for your heuristic "fix"? > > Your change is broken and I wouldn't push it to Linus for rc7 if my life > > depended on it. What is the response when someone complains that they > > start getting a ton of MEMCG_OOM notifications for every thp fallback? > > They will, because yours is a broken implementation. > > I fail to see what is broken. Could you be more specific? > I said MEMCG_OOM notifications on thp fallback. You modified mem_cgroup_oom(). What is called before mem_cgroup_oom()? mem_cgroup_event(mem_over_limit, MEMCG_OOM). That increments the MEMCG_OOM event and anybody waiting on the events file gets notified it changed. They read a MEMCG_OOM event. It's thp fallback, it's not memcg oom. Really, I can't continue to write 100 emails in this thread. I'm sorry, but there are only so many hours in a day. I can't read 20 emails about why Tetsuo shouldn't emit a stack trace when the oom reaper fails, which happens 0.04% of the time on production workloads with data I provided. I can't continue to reiterate why we added the PAGE_ALLOC_COSTLY_ORDER heuristic to the allocator. I'm done. > > Respectfully, allow the bugfix to fix what was obviously left off from > > commit 2516035499b9. > > I haven't nacked the patch AFAIR so nothing really prevents it from > being merged. > It should be merged for rc7. Please send any follow-up policy change wrt to high-order charges as a separate patch for 4.17.