From: Naoya Horiguchi
To: Mike Kravetz
CC: David Rientjes, Jörn Engel, linux-mm@kvack.org, linux-kernel
Subject: Re: hugetlb pages not accounted for in rss
Date: Tue, 4 Aug 2015 02:55:30 +0000

On Wed, Jul 29, 2015 at 04:20:59PM -0700, Mike Kravetz wrote:
> On 07/29/2015 12:08 PM, David Rientjes wrote:
> >On Tue, 28 Jul 2015, Jörn Engel wrote:
> >
> >>Well, we definitely need something. Having a 100GB process show 3GB of
> >>rss is not very useful. How would we notice a memory leak if it only
> >>affects hugepages, for example?
> >
> >Since the hugetlb pool is a global resource, it would also be helpful to
> >determine if a process is mapping more than expected. You can't do that
> >just by adding a huge rss metric, however: if you have 2MB and 1GB
> >hugepages configured you wouldn't know if a process was mapping 512 2MB
> >hugepages or 1 1GB hugepage.
> >
> >That's the purpose of hugetlb_cgroup, after all, and it supports usage
> >counters for all hstates. The test could be converted to use that to
> >measure usage if configured in the kernel.
> >
> >Beyond that, I'm not sure how a per-hstate rss metric would be exported to
> >userspace in a clean way and other ways of obtaining the same data are
> >possible with hugetlb_cgroup. I'm not sure how successful you'd be in
> >arguing that we need separate rss counters for it.
>
> If I want to track hugetlb usage on a per-task basis, do I then need to
> create one cgroup per task?
>
> For example, suppose I have many tasks using hugetlb and the global pool
> is getting low on free pages. It might be useful to know which tasks are
> using hugetlb pages, and how many they are using.
>
> I don't actually have this need (I think), but it appears to be what
> Jörn is asking for.

One possible way to get a hugetlb metric on a per-task basis is to walk
the page table via /proc/pid/pagemap and count the page flags of each
mapped page (we can easily do this with tools/vm/page-types.c, like
"page-types -p <pid> -b huge"). This is obviously slower than storing
the counter as in-kernel data and just exporting it, but it might be
useful in some situations.
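A minimal sketch of such a pagemap walk might look like the following
(untested; it assumes a 64-bit build and root privileges for
/proc/kpageflags, counts in base-page units, and doesn't distinguish
hstates):

/* Rough sketch: count hugetlb-backed base pages mapped by a process by
 * walking /proc/<pid>/pagemap and testing the KPF_HUGE bit (bit 17,
 * see Documentation/vm/pagemap.txt) in /proc/kpageflags.
 */
#include <stdio.h>
#include <stdint.h>
#include <unistd.h>

#define KPF_HUGE	17
#define PM_PFN_MASK	((1ULL << 55) - 1)	/* bits 0-54: PFN */
#define PM_PRESENT	(1ULL << 63)		/* bit 63: page present */

int main(int argc, char **argv)
{
	char path[64], line[256];
	FILE *maps, *pagemap, *kpageflags;
	unsigned long start, end, va, huge = 0;
	uint64_t pme, kpf;
	long psize = sysconf(_SC_PAGESIZE);

	if (argc != 2) {
		fprintf(stderr, "usage: %s <pid>\n", argv[0]);
		return 1;
	}
	snprintf(path, sizeof(path), "/proc/%s/maps", argv[1]);
	maps = fopen(path, "r");
	snprintf(path, sizeof(path), "/proc/%s/pagemap", argv[1]);
	pagemap = fopen(path, "r");
	kpageflags = fopen("/proc/kpageflags", "r");
	if (!maps || !pagemap || !kpageflags) {
		perror("fopen");
		return 1;
	}
	while (fgets(line, sizeof(line), maps)) {
		if (sscanf(line, "%lx-%lx", &start, &end) != 2)
			continue;
		for (va = start; va < end; va += psize) {
			/* one 64-bit pagemap entry per virtual base page;
			 * hugetlb ranges are reported per base page too */
			if (fseek(pagemap, va / psize * 8, SEEK_SET) ||
			    fread(&pme, 8, 1, pagemap) != 1)
				break;
			if (!(pme & PM_PRESENT))
				continue;
			/* index kpageflags by PFN to get the page flags */
			if (fseek(kpageflags, (pme & PM_PFN_MASK) * 8,
				  SEEK_SET) ||
			    fread(&kpf, 8, 1, kpageflags) != 1)
				continue;
			if (kpf & (1ULL << KPF_HUGE))
				huge++;
		}
	}
	printf("hugetlb pages mapped: %lu (base-page units)\n", huge);
	return 0;
}

To tell the hstates apart (David's 512 2MB vs. 1 1GB point), you would
additionally need the huge page size of each mapping, e.g. from the
KernelPageSize field in /proc/pid/smaps.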
Thanks,
Naoya Horiguchi