Date: Tue, 2 Oct 2018 13:28:51 +0200
From: Michal Hocko
To: Andrew Morton
Cc: David Rientjes, Vlastimil Babka, Alexey Dobriyan, "Kirill A. Shutemov",
	linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	linux-api@vger.kernel.org
Subject: [RFC PATCH] mm, proc: report PR_SET_THP_DISABLE in proc
Message-ID: <20181002112851.GP18290@dhcp22.suse.cz>
In-Reply-To: <20180926060624.GA18685@dhcp22.suse.cz>

On Wed 26-09-18 08:06:24, Michal Hocko wrote:
> On Tue 25-09-18 15:04:06, Andrew Morton wrote:
> > On Tue, 25 Sep 2018 14:45:19 -0700 (PDT) David Rientjes wrote:
> >
> > > > > It is also used in
> > > > > automated testing to ensure that vmas get disabled for thp appropriately
> > > > > and we used "nh" since that is how PR_SET_THP_DISABLE previously enforced
> > > > > this, and those tests now break.
> > > >
> > > > This sounds like a bit of an abuse to me. It shows how an internal
> > > > implementation detail leaks out to the userspace which is something we
> > > > should try to avoid.
> > > >
> > >
> > > Well, it's already how this has worked for years before commit
> > > 1860033237d4 broke it. Changing the implementation in the kernel is fine
> > > as long as you don't break userspace who relies on what is exported to it
> > > and is the only way to determine if MADV_NOHUGEPAGE is preventing it from
> > > being backed by hugepages.
> >
> > 1860033237d4 was over a year ago so perhaps we don't need to be
> > too worried about restoring the old interface. In which case
> > we have an opportunity to make improvements such as that suggested
> > by Michal?
>
> Yeah, can we add a way to export PR_SET_THP_DISABLE to userspace
> somehow? E.g. /proc/<pid>/status. It is a process wide thing so
> reporting it per VMA sounds strange at best.

So how about this? (not tested yet but it should be pretty
straightforward)
---
From 048b29102de326900b54cce78b614345cd77a230 Mon Sep 17 00:00:00 2001
From: Michal Hocko
Date: Tue, 2 Oct 2018 10:53:48 +0200
Subject: [PATCH] mm, proc: report PR_SET_THP_DISABLE in proc

David Rientjes has reported that 1860033237d4 ("mm: make
PR_SET_THP_DISABLE immediately active") has changed the way we report
THPable VMAs to userspace. Their monitoring tool is triggering false
alarms on PR_SET_THP_DISABLE tasks because it considers insufficient
THP usage to be a memory fragmentation or memory pressure issue.

Before the said commit each newly created VMA inherited the
VM_NOHUGEPAGE flag and that got exposed to userspace via the
/proc/<pid>/smaps file. This implementation had its downsides as
explained in the commit message but it is true that userspace doesn't
have any means to query for the process wide THP enabled/disabled
status.

PR_SET_THP_DISABLE is a process wide flag so it makes a lot of sense to
export it in the process wide context rather than per-vma. Introduce a
new field in /proc/<pid>/status which exports this status. If
PR_SET_THP_DISABLE is used then it reports false, the same as when THP
is not compiled in. It doesn't consider the global THP status because
we already export that information via sysfs.

Fixes: 1860033237d4 ("mm: make PR_SET_THP_DISABLE immediately active")
Signed-off-by: Michal Hocko
---
 Documentation/filesystems/proc.txt |  3 +++
 fs/proc/array.c                    | 10 ++++++++++
 2 files changed, 13 insertions(+)

diff --git a/Documentation/filesystems/proc.txt b/Documentation/filesystems/proc.txt
index 22b4b00dee31..bafa5cb1685a 100644
--- a/Documentation/filesystems/proc.txt
+++ b/Documentation/filesystems/proc.txt
@@ -182,6 +182,7 @@ For example, to get the status information of a process, all you have to do is
   VmSwap:                0 kB
   HugetlbPages:          0 kB
   CoreDumping:    0
+  THP_enabled:    1
   Threads:        1
   SigQ:   0/28578
   SigPnd: 0000000000000000
@@ -256,6 +257,8 @@ Table 1-2: Contents of the status files (as of 4.8)
  HugetlbPages                size of hugetlb memory portions
  CoreDumping                 process's memory is currently being dumped
                              (killing the process may lead to a corrupted core)
+ THP_enabled                 process is allowed to use THP (returns 0 when
+                             PR_SET_THP_DISABLE is set on the process)
  Threads                     number of threads
  SigQ                        number of signals queued/max. number for queue
  SigPnd                      bitmap of pending signals for the thread
diff --git a/fs/proc/array.c b/fs/proc/array.c
index 0ceb3b6b37e7..9d428d5a0ac8 100644
--- a/fs/proc/array.c
+++ b/fs/proc/array.c
@@ -392,6 +392,15 @@ static inline void task_core_dumping(struct seq_file *m, struct mm_struct *mm)
 	seq_putc(m, '\n');
 }
 
+static inline void task_thp_status(struct seq_file *m, struct mm_struct *mm)
+{
+	bool thp_enabled = IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE);
+
+	if (thp_enabled)
+		thp_enabled = !test_bit(MMF_DISABLE_THP, &mm->flags);
+	seq_printf(m, "THP_enabled:\t%d\n", thp_enabled);
+}
+
 int proc_pid_status(struct seq_file *m, struct pid_namespace *ns,
 			struct pid *pid, struct task_struct *task)
 {
@@ -406,6 +415,7 @@ int proc_pid_status(struct seq_file *m, struct pid_namespace *ns,
 	if (mm) {
 		task_mem(m, mm);
 		task_core_dumping(m, mm);
+		task_thp_status(m, mm);
 		mmput(mm);
 	}
 	task_sig(m, task);
-- 
2.19.0

-- 
Michal Hocko
SUSE Labs
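
For illustration only, here is how a userspace monitoring tool might
consume the proposed field. This is a minimal sketch, assuming the
"THP_enabled:\t%d" line format from the patch above is what ends up in
/proc/<pid>/status. The helper names parse_thp_enabled() and
read_thp_enabled() are hypothetical and are not part of the patch, the
kernel, or libc.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Scan the text of a /proc/<pid>/status file for the "THP_enabled" field.
 * Returns 1 or 0 for the reported value, or -1 if the field is absent
 * (for example on a kernel that does not carry the patch).
 * Hypothetical helper, not a kernel or libc interface.
 */
int parse_thp_enabled(const char *status_text)
{
	static const char key[] = "THP_enabled:";
	const char *line = status_text;

	assert(status_text != NULL);

	while (*line != '\0') {
		if (strncmp(line, key, sizeof(key) - 1) == 0) {
			int value;

			/* %d skips the tab that follows the colon. */
			if (sscanf(line + sizeof(key) - 1, "%d", &value) == 1)
				return value != 0;
			return -1;
		}
		/* Move to the start of the next line, if there is one. */
		line = strchr(line, '\n');
		if (line == NULL)
			break;
		line++;
	}
	return -1;
}

/*
 * Read /proc/<pid>/status for the given pid string ("self" also works).
 * Returns the THP_enabled value, or -1 on error or if the field is missing.
 */
int read_thp_enabled(const char *pid)
{
	char path[64];
	char buf[8192];	/* assumed large enough for a status file */
	size_t len;
	FILE *f;
	int n;

	n = snprintf(path, sizeof(path), "/proc/%s/status", pid);
	if (n < 0 || (size_t)n >= sizeof(path))
		return -1;

	f = fopen(path, "r");
	if (f == NULL)
		return -1;
	len = fread(buf, 1, sizeof(buf) - 1, f);
	fclose(f);
	buf[len] = '\0';

	return parse_thp_enabled(buf);
}
```

A tool like the one described in the changelog could call
read_thp_enabled() on a task before treating low THP usage as a
fragmentation or memory pressure signal, and skip the alarm when the
result is 0. A result of -1 would have to be handled as "unknown", since
older kernels do not report the field.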