Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S933250AbbHKAh5 (ORCPT ); Mon, 10 Aug 2015 20:37:57 -0400
Received: from mail-pa0-f45.google.com ([209.85.220.45]:35649 "EHLO
        mail-pa0-f45.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
        with ESMTP id S932670AbbHKAh4 (ORCPT );
        Mon, 10 Aug 2015 20:37:56 -0400
Date: Mon, 10 Aug 2015 17:37:54 -0700 (PDT)
From: David Rientjes
X-X-Sender: rientjes@chino.kir.corp.google.com
To: Naoya Horiguchi
cc: Andrew Morton, Jörn Engel, Mike Kravetz,
        "linux-mm@kvack.org", "linux-kernel@vger.kernel.org",
        Naoya Horiguchi
Subject: Re: [PATCH v2 1/2] smaps: fill missing fields for vma(VM_HUGETLB)
In-Reply-To: <1438932278-7973-2-git-send-email-n-horiguchi@ah.jp.nec.com>
Message-ID:
References: <20150806074443.GA7870@hori1.linux.bs1.fc.nec.co.jp>
        <1438932278-7973-1-git-send-email-n-horiguchi@ah.jp.nec.com>
        <1438932278-7973-2-git-send-email-n-horiguchi@ah.jp.nec.com>
User-Agent: Alpine 2.10 (DEB 1266 2009-07-14)
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Length: 3123
Lines: 77

On Fri, 7 Aug 2015, Naoya Horiguchi wrote:

> Currently smaps reports many zero fields for vma(VM_HUGETLB), which is
> inconvenient when we want to know per-task or per-vma base hugetlb usage.
> This patch enables these fields by introducing smaps_hugetlb_range().
>
> before patch:
>
>   Size:              20480 kB
>   Rss:                   0 kB
>   Pss:                   0 kB
>   Shared_Clean:          0 kB
>   Shared_Dirty:          0 kB
>   Private_Clean:         0 kB
>   Private_Dirty:         0 kB
>   Referenced:            0 kB
>   Anonymous:             0 kB
>   AnonHugePages:         0 kB
>   Swap:                  0 kB
>   KernelPageSize:     2048 kB
>   MMUPageSize:        2048 kB
>   Locked:                0 kB
>   VmFlags: rd wr mr mw me de ht
>
> after patch:
>
>   Size:              20480 kB
>   Rss:               18432 kB
>   Pss:               18432 kB
>   Shared_Clean:          0 kB
>   Shared_Dirty:          0 kB
>   Private_Clean:         0 kB
>   Private_Dirty:     18432 kB
>   Referenced:        18432 kB
>   Anonymous:         18432 kB
>   AnonHugePages:         0 kB
>   Swap:                  0 kB
>   KernelPageSize:     2048 kB
>   MMUPageSize:        2048 kB
>   Locked:                0 kB
>   VmFlags: rd wr mr mw me de ht
>

I think this will lead to breakage, unfortunately, specifically for users
who are concerned with resource management.

An example: we use memcg hierarchies to charge memory for individual jobs,
specific users, and system overhead.  Memcg is a cgroup, so this is done
for an aggregate of processes, and we often have to monitor their memory
usage.  Each process isn't assigned to its own memcg, and I don't believe
common users of memcg assign individual processes to their own memcgs.

When a memcg is out of memory, we need to track the memory usage of
processes attached to its memcg hierarchy to determine what is unexpected,
either as a result of a new rollout or because of a memory leak.  To do
that, we use the rss exported by smaps that is now changed with this patch.
By using smaps rather than /proc/pid/status, we can report where memory
usage is unexpected.

This would cause our process that manages all memcgs on our systems to
break.  Perhaps I haven't been as convincing in my previous messages of
this, but it's quite an obvious userspace regression.

This memory was not included in rss originally because memory in the
hugetlb persistent pool is always resident.  Unmapping the memory does not
free memory.  For this reason, hugetlb memory has always been treated as
its own type of memory.
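
To illustrate the concern, here is a minimal sketch (not the actual
monitoring tool mentioned above) of how a userspace consumer might sum Rss
from smaps output, keeping VM_HUGETLB mappings (VmFlags containing "ht")
separate from everything else.  The function names, the mapping header
lines, and the 132 kB text mapping in the sample are hypothetical; only the
hugetlb field values are taken from the "after patch" output quoted above.
A consumer that simply adds up every Rss line would see its per-process
total jump by the hugetlb amount once the patch is applied.

```python
def parse_vmas(smaps_text):
    """Parse smaps-style text into one dict per VMA (Rss in kB plus VmFlags)."""
    vmas = []
    for line in smaps_text.splitlines():
        fields = line.split()
        if not fields:
            continue
        key = fields[0]
        if not key.endswith(":"):
            # A mapping header line (address range, perms, ...) starts a new VMA.
            vmas.append({"rss_kb": 0, "flags": set()})
        elif not vmas:
            # Field line before any header: nothing to attach it to.
            continue
        elif key == "Rss:":
            vmas[-1]["rss_kb"] = int(fields[1])
        elif key == "VmFlags:":
            vmas[-1]["flags"] = set(fields[1:])
    return vmas


def split_rss_kb(smaps_text):
    """Return (non_hugetlb_rss_kb, hugetlb_rss_kb) summed over all VMAs."""
    normal = hugetlb = 0
    for vma in parse_vmas(smaps_text):
        if "ht" in vma["flags"]:
            hugetlb += vma["rss_kb"]
        else:
            normal += vma["rss_kb"]
    return normal, hugetlb


# Hypothetical sample: one ordinary mapping plus the hugetlb VMA as it would
# be reported with the patch applied.
SAMPLE_AFTER_PATCH = """
00400000-00421000 r-xp 00000000 08:01 1234 /usr/bin/demo
Size:                132 kB
Rss:                 132 kB
VmFlags: rd ex mr mw me
2aaaaac00000-2aaaac000000 rw-p 00000000 00:0e 5678 /hugepages/demo
Size:              20480 kB
Rss:               18432 kB
VmFlags: rd wr mr mw me de ht
"""

if __name__ == "__main__":
    normal, hugetlb = split_rss_kb(SAMPLE_AFTER_PATCH)
    print("non-hugetlb Rss: %d kB, hugetlb Rss: %d kB" % (normal, hugetlb))
```

Without the patch the hugetlb VMA reports Rss as 0 kB, so the second value
would be 0 and an unfiltered sum would equal the first value alone.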

It would have been arguable back when hugetlbfs was introduced whether it
should be included.  I'm afraid the ship has sailed on that since a decade
has passed and it would cause userspace to break if existing metrics are
used that already have clearly defined semantics.