Date: Thu, 4 Oct 2018 11:46:37 +0200
From: Michal Hocko
To: David Rientjes
Cc: Andrew Morton, Vlastimil Babka, Alexey Dobriyan, "Kirill A. Shutemov",
    linux-kernel@vger.kernel.org, linux-mm@kvack.org,
    linux-api@vger.kernel.org
Subject: Re: [RFC PATCH] mm, proc: report PR_SET_THP_DISABLE in proc
Message-ID: <20181004094637.GG22173@dhcp22.suse.cz>

On Thu 04-10-18 02:15:38, David Rientjes wrote:
> On Thu, 4 Oct 2018, Michal Hocko wrote:
>
> > > > > > So how about this? (not tested yet but it should be pretty
> > > > > > straightforward)
> > > > >
> > > > > Umm, prctl(PR_GET_THP_DISABLE)?
> > > >
> > > > /me confused. I thought you want to query for the flag on a
> > > > _different_ process.
> > >
> > > Why would we want to check three locations (system wide setting, prctl
> > > setting, madvise setting) to determine if a heap can be backed by thp?
> >
> > Because we simply have 3 different ways to control THP? Is this a real
> > problem?
> >
>
> And prior to the offending commit, there were three ways to control thp
> but two ways to determine if a mapping was eligible for thp based on the
> implementation detail of one of those ways.

Yes, it is really unfortunate that we ever allowed internal details
like VMA flags to leak to userspace.

> If there are three ways to
> control thp, userspace is still in the dark wrt which takes precedence
> over the other: we have PR_SET_THP_DISABLE but globally sysfs has it set
> to "always", or we have MADV_HUGEPAGE set per smaps but PR_SET_THP_DISABLE
> shown in /proc/pid/status, etc.
>
> Which one is the ultimate authority?

Isn't our documentation good enough? If not, then we should document it
properly.

> There's one way to specify it: in a
> single per-mapping location that reveals whether that mapping is eligible
> for thp or not. So I think it would be a very sane extension so that
> smaps reveals if a mapping can be backed by hugepages or not depending on
> the helper function thp uses itself to determine if it can fault
> hugepages. I don't think we should have three locations to check and then
> try to resolve which one takes precedence over the other for each
> userspace implementation (and perhaps how the kernel implementation
> evolves).

But we really have three different ways to disable thp. Which one has
caused the end result might be interesting/important because different
entities might be in control. You either have to contact your admin for
the global one, or whoever launched you for the prctl one. So the
distinction might be important. Checking 3 different places and applying
the precedence rules is not really trivial, but I do not see any reason
why this couldn't be implemented in a library so the user doesn't really
have to scratch their head.

If you really insist on having a per-vma thing then all right, but do
not conflate vma flags and the higher level logic: make it its own line
in the smaps output and make sure it reports only THP-able VMAs.
-- 
Michal Hocko
SUSE Labs
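
For illustration only, here is a rough sketch of what such a userspace
helper might look like. It is not taken from any patch in this thread.
The function names are hypothetical, and the precedence order (a
MADV_NOHUGEPAGE mapping first, then the prctl flag, then the global
sysfs mode) is an assumption about the kernel's behaviour at the time
and ignores other conditions the kernel checks. The prctl state is
passed in as a plain boolean because how to read it for another process
is exactly what is being discussed here. The "hg" and "nh" codes are the
VmFlags entries smaps prints for MADV_HUGEPAGE and MADV_NOHUGEPAGE.

```python
import re


def global_thp_mode(sysfs_text):
    """Return the bracketed choice from text such as 'always [madvise] never'.

    The text is expected to be the content of
    /sys/kernel/mm/transparent_hugepage/enabled.
    """
    match = re.search(r"\[(\w+)\]", sysfs_text)
    if match is None:
        raise ValueError("no bracketed mode in %r" % sysfs_text)
    return match.group(1)


def vma_flags(smaps_entry):
    """Return the set of two-letter codes on the VmFlags line of one smaps entry."""
    for line in smaps_entry.splitlines():
        if line.startswith("VmFlags:"):
            return set(line.split()[1:])
    return set()


def thp_verdict(global_mode, prctl_disabled, flags):
    """Return (eligible, reason) for one mapping.

    ASSUMPTION: the order of checks below is a simplified reading of the
    kernel logic, not an authoritative statement of it.
    """
    if "nh" in flags:
        return False, "madvise(MADV_NOHUGEPAGE) on this mapping"
    if prctl_disabled:
        return False, "prctl(PR_SET_THP_DISABLE) on this process"
    if global_mode == "always":
        return True, "global setting is 'always'"
    if global_mode == "madvise":
        if "hg" in flags:
            return True, "madvise(MADV_HUGEPAGE) on this mapping"
        return False, "global setting is 'madvise' without MADV_HUGEPAGE"
    return False, "global setting is '%s'" % global_mode
```

Returning the reason together with the verdict keeps the distinction
between the admin-controlled global setting and the launcher-controlled
prctl flag visible to the caller.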