Date: Tue, 25 Sep 2018 22:29:59 +0200
From: Michal Hocko
To: David Rientjes
Cc: Vlastimil Babka, Andrew Morton, Alexey Dobriyan, "Kirill A. Shutemov",
    linux-kernel@vger.kernel.org, linux-mm@kvack.org, linux-api@vger.kernel.org
Subject: Re: [patch v2] mm, thp: always specify ineligible vmas as nh in smaps
Message-ID: <20180925202959.GY18685@dhcp22.suse.cz>
References: <20180924195603.GJ18685@dhcp22.suse.cz>
    <20180924200258.GK18685@dhcp22.suse.cz>
    <0aa3eb55-82c0-eba3-b12c-2ba22e052a8e@suse.cz>

On Tue 25-09-18 12:52:09, David Rientjes wrote:
> On Mon, 24 Sep 2018, Vlastimil Babka wrote:
>
> > On 9/24/18 10:02 PM, Michal Hocko wrote:
> > > On Mon 24-09-18 21:56:03, Michal Hocko wrote:
> > >> On Mon 24-09-18 12:30:07, David Rientjes wrote:
> > >>> Commit 1860033237d4 ("mm: make PR_SET_THP_DISABLE immediately active")
> > >>> introduced a regression in that userspace cannot always determine the set
> > >>> of vmas where thp is ineligible.
> > >>>
> > >>> Userspace relies on the "nh" flag being emitted as part of /proc/pid/smaps
> > >>> to determine if a vma is eligible to be backed by hugepages.
> > >>
> > >> I was under impression that nh resp hg flags only tell about the madvise
> > >> status. How do you exactly use these flags in an application?
> > >>
>
> This is used to identify heap mappings that should be able to fault thp
> but do not, and they normally point to a low-on-memory or fragmentation
> issue. After commit 1860033237d4, our users of PR_SET_THP_DISABLE no
> longer show "nh" for their heap mappings so they get reported as having a
> low thp ratio when in reality it is disabled.

I am still not sure I understand the issue completely. How are
PR_SET_THP_DISABLE users any different from the global THP disabled
case? Is this only about the scope? E.g the one who checks for the state
cannot check the PR_SET_THP_DISABLE state?
Besides that, what are the consequences of the low ratio? Is this an
example of somebody using the prctl and still complaining, or an
external observer trying to do something useful which ends up doing the
contrary?

> It is also used in
> automated testing to ensure that vmas get disabled for thp appropriately
> and we used "nh" since that is how PR_SET_THP_DISABLE previously enforced
> this, and those tests now break.

This sounds like a bit of an abuse to me. It shows how an internal
implementation detail leaks out to userspace, which is something we
should try to avoid.

> > >> Your eligible rules as defined here:
> > >>
> > >>> + [*] A process mapping is eligible to be backed by transparent hugepages (thp)
> > >>> +     depending on system-wide settings and the mapping itself. See
> > >>> +     Documentation/admin-guide/mm/transhuge.rst for default behavior. If a
> > >>> +     mapping has a flag of "nh", it is not eligible to be backed by hugepages
> > >>> +     in any condition, either because of prctl(PR_SET_THP_DISABLE) or
> > >>> +     madvise(MADV_NOHUGEPAGE). PR_SET_THP_DISABLE takes precedence over any
> > >>> +     MADV_HUGEPAGE.
> > >>
> > >> doesn't seem to match the reality. I do not see all the file backed
> > >> mappings to be nh marked. So is this really about eligibility rather
> > >> than the madvise status? Maybe it is just the above documentation that
> > >> needs to be updated.
> >
> > Yeah the change from madvise to eligibility in the doc seems to go too far.
> >
>
> I'll reword this to explicitly state that "hg" and "nh" mappings either
> allow or disallow thp backing.

How are you going to distinguish a regular THP-able mapping then? I am
still not sure how this is supposed to work. Could you be more specific?
Let's say I have a THP-able mapping (shmem resp. anon for the current
implementation). What is the matrix for hg/nh wrt. madvise/no-madvise,
PR_SET_THP_DISABLE, and global THP enabled/disabled?
-- 
Michal Hocko
SUSE Labs
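
For readers following the thread, the kind of userspace check David
describes (looking for "nh" in the flags of each mapping) can be sketched
roughly as below. This is only an illustration and is not taken from the
thread: the function name vmas_with_flag and the sample text are
hypothetical, and the sketch assumes the usual smaps layout, where each
mapping starts with an address-range line and later carries a "VmFlags:"
line of two-letter flags.

```python
import re

# Assumed layout of the first line of an smaps entry:
#   01a00000-01c00000 rw-p 00000000 00:00 0 [heap]
HEADER_RE = re.compile(
    r"^([0-9a-f]+)-([0-9a-f]+)\s+\S+\s+\S+\s+\S+\s+\S+\s*(.*)$"
)


def vmas_with_flag(smaps_text, flag):
    """Return (address_range, name) for each mapping whose VmFlags has `flag`."""
    matches = []
    current = None  # the mapping whose detail lines are being read
    for line in smaps_text.splitlines():
        header = HEADER_RE.match(line)
        if header:
            address_range = "{}-{}".format(header.group(1), header.group(2))
            current = (address_range, header.group(3).strip())
        elif line.startswith("VmFlags:") and current is not None:
            flags = line.split(":", 1)[1].split()
            if flag in flags:
                matches.append(current)
    return matches


# Hypothetical two-mapping sample; a real tool would read /proc/<pid>/smaps.
SAMPLE = """\
00400000-00452000 r-xp 00000000 08:02 173521 /usr/bin/example
Size:                328 kB
VmFlags: rd ex mr mw me dw
01a00000-01c00000 rw-p 00000000 00:00 0 [heap]
Size:               2048 kB
AnonHugePages:         0 kB
VmFlags: rd wr mr mw me ac nh
"""

print(vmas_with_flag(SAMPLE, "nh"))
```

A check like this reports only what the flag line says. Whether "nh"
should reflect the madvise status or overall thp eligibility is the open
question in the thread, so the sketch takes no position on it.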