Date: Tue, 28 Jul 2015 16:30:19 -0700 (PDT)
From: David Rientjes
To: Jörn Engel
Cc: Mike Kravetz, "linux-mm@kvack.org", linux-kernel
Subject: Re: hugetlb pages not accounted for in rss
In-Reply-To: <20150728222654.GA28456@Sligo.logfs.org>

On Tue, 28 Jul 2015, Jörn Engel wrote:

> What would you propose for me then?  I have 80% RAM or more in reserved
> hugepages.  OOM-killer is not a concern, as it panics the system - the
> alternatives were almost universally silly and we didn't want to deal
> with system in unpredictable states.  But knowing how much memory is
> used by which process is a concern.  And if you only tell me about the
> small (and continuously shrinking) portion, I essentially fly blind.
>
> That is not a case of "may lead to breakage", it _is_ broken.
>
> Ideally we would have fixed this in 2002 when hugetlbfs was introduced.
> By now we might have to introduce a new field, rss_including_hugepages
> or whatever.  Then we have to update tools like top etc. to use the new
> field when appropriate.  No fun, but might be necessary.
>
> If we can get away with including hugepages in rss and fixing the OOM
> killer to be less silly, I would strongly prefer that.  But I don't know
> how much of a mess we are already in.
>

It's not only the oom killer; I don't believe hugetlb pages are accounted
to the "rss" in memcg.  They use the hugetlb_cgroup for that.  Starting to
account for them in existing memcg deployments would cause them to hit
their memory limits much earlier.  The "rss_huge" field in memcg only
represents transparent hugepages.

I agree with your comment that having done this when hugetlbfs was
introduced would have been optimal.  It's always difficult to add a new
class of memory to an existing metric ("new" here because it's currently
unaccounted).

If we can add yet another process metric to track hugetlbfs memory mapped,
then the test could be converted to use that.  I'm not sure if the
justification would be strong enough, but you could try.