Date: Wed, 19 Jun 2013 21:27:39 -0500
From: Robin Holt
To: David Rientjes
Cc: Robin Holt, Alex Thorlton, linux-kernel@vger.kernel.org, Li Zefan,
    Rob Landley, Andrew Morton, Mel Gorman, Rik van Riel,
    "Kirill A. Shutemov", Johannes Weiner, Xiao Guangrong,
    linux-doc@vger.kernel.org, linux-mm@kvack.org
Subject: Re: [PATCH v2] Make transparent hugepages cpuset aware
Message-ID: <20130620022739.GF3658@sgi.com>
References: <1370967244-5610-1-git-send-email-athorlton@sgi.com>
    <20130618164537.GJ16067@sgi.com> <20130619093212.GX3658@sgi.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, Jun 19, 2013 at 02:24:07PM -0700, David Rientjes wrote:
> On Wed, 19 Jun 2013, Robin Holt wrote:
>
> > The convenience being that many batch schedulers have added cpuset
> > support. They create the cpuset's and configure them as appropriate
> > for the job as determined by a mixture of input from the submitting
> > user but still under the control of the administrator. That seems like
> > a fairly significant convenience given that it took years to get the
> > batch schedulers to adopt cpusets in the first place. At this point,
> > expanding their use of cpusets is under the control of the system
> > administrator and would not require any additional development on
> > the batch scheduler developers part.
> >
>
> You can't say the same for memcg?

I am not aware of batch scheduler support for memory controllers.
The request came from our benchmarking group.

> > Here are the entries in the cpuset:
> >   cgroup.event_control  mem_exclusive    memory_pressure_enabled  notify_on_release         tasks
> >   cgroup.procs          mem_hardwall     memory_spread_page       release_agent
> >   cpu_exclusive         memory_migrate   memory_spread_slab       sched_load_balance
> >   cpus                  memory_pressure  mems                     sched_relax_domain_level
> >
> > There are scheduler, slab allocator, page_cache layout, etc controls.
>
> I think this is mostly for historical reasons since cpusets were
> introduced before cgroups.
>
> > Why _NOT_ add a thp control to that nicely contained central location?
> > It is a concise set of controls for the job.
> >
>
> All of the above seem to be for cpusets primary purpose, i.e. NUMA
> optimizations. It has nothing to do with transparent hugepages. (I'm not
> saying thp has anything to do with memcg either, but a "memory controller"
> seems more appropriate for controlling thp behavior.)

Cpusets were not designed for NUMA. They have no preference for "nodes"
or anything like that. They were for splitting a machine into layered,
smaller groups. Usually, we see one cpuset which contains the batch
scheduler. The batch scheduler then creates cpusets for the jobs it
starts. That has nothing to do with nodes. It is more of an
administrator issue: the administrator sets the minimum grouping of
resources for scheduled jobs.

> > Maybe I am misunderstanding. Are you saying you want to put memcg
> > information into the cpuset or something like that?
> >
>
> I'm saying there's absolutely no reason to have thp controlled by a
> cpuset, or ANY cgroup for that matter, since you chose not to respond to
> the question I asked: why do you want to control thp behavior for certain
> static binaries and not others? Where is the performance regression or
> the downside? Is it because of max_ptes_none for certain jobs blowing up
> the rss? We need information, and even if were justifiable then it
> wouldn't have anything to do with ANY cgroup but rather a per-process
> control.

It has nothing to do with cpusets whatsoever. It was a request from our
benchmarking group, which has found that some jobs benefit from thp
while others are harmed. Let me ask them for more details.

> (And I'm very curious why you didn't even cc the cpusets maintainer on
> this patch in the first place who would probably say the same thing.)

I didn't know there was a cpuset maintainer. Paul Jackson (SGI retired)
had originally worked to get cpusets introduced and then converted to
use cgroups. I had never known there was a maintainer after him. Sorry
for that.

Robin