Date: Wed, 12 Sep 2018 18:35:15 +0100
From: Patrick Bellasi
To: Peter Zijlstra
Cc: linux-kernel@vger.kernel.org, linux-pm@vger.kernel.org,
    Ingo Molnar, Tejun Heo, "Rafael J.
Wysocki", Viresh Kumar, Vincent Guittot, Paul Turner,
    Quentin Perret, Dietmar Eggemann, Morten Rasmussen, Juri Lelli,
    Todd Kjos, Joel Fernandes, Steve Muckle, Suren Baghdasaryan
Subject: Re: [PATCH v4 02/16] sched/core: uclamp: map TASK's clamp values into CPU's clamp groups
Message-ID: <20180912173515.GH1413@e110439-lin>
References: <20180828135324.21976-1-patrick.bellasi@arm.com>
 <20180828135324.21976-3-patrick.bellasi@arm.com>
 <20180912134945.GZ24106@hirez.programming.kicks-ass.net>
 <20180912155619.GG1413@e110439-lin>
 <20180912161218.GW24082@hirez.programming.kicks-ass.net>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20180912161218.GW24082@hirez.programming.kicks-ass.net>
User-Agent: Mutt/1.5.24 (2015-08-30)
X-Mailing-List: linux-kernel@vger.kernel.org

On 12-Sep 18:12, Peter Zijlstra wrote:
> On Wed, Sep 12, 2018 at 04:56:19PM +0100, Patrick Bellasi wrote:
> > On 12-Sep 15:49, Peter Zijlstra wrote:
> > > On Tue, Aug 28, 2018 at 02:53:10PM +0100, Patrick Bellasi wrote:
> > > > +/**
> > > > + * uclamp_map: reference counts a utilization "clamp value"
> > > > + * @value: the utilization "clamp value" required
> > > > + * @se_count: the number of scheduling entities requiring the "clamp value"
> > > > + * @se_lock: serialize reference count updates by protecting se_count
> > >
> > > Why do you have a spinlock to serialize a single value? Don't we have
> > > atomics for that?
> >
> > There are some code paths where it's used to protect clamp groups
> > mapping and initialization, e.g.
> >
> >     uclamp_group_get()
> >         spin_lock()
> >         // initialize clamp group (if required) and then...
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
            This is actually a couple of function calls

> >         se_count += 1
> >         spin_unlock()
> >
> > Almost all these paths are triggered from user-space and protected
> > by a global uclamp_mutex, but fork/exit paths.
> >
> > To serialize these paths I'm using the spinlock above; does it make
> > sense? Can we use the global uclamp_mutex on fork/exit too?
>
> OK, then your comment is misleading; it serializes both fields.

Yes... that definitely needs an update.

> > One additional observation is that, if in the future we want to add a
> > kernel-space API (e.g. a driver asking for a new clamp value), maybe
> > we will need a serialized, non-sleeping uclamp_group_get() API?
>
> No idea; but if you want to go all fancy you can replace the whole
> uclamp_map thing with something like:
>
> 	struct uclamp_map {
> 		union {
> 			struct {
> 				unsigned long v : 10;
> 				unsigned long c : BITS_PER_LONG - 10;
> 			};
> 			atomic_long_t s;
> 		};
> 	};

That sounds really cool and scary at the same time :)

The v:10 requires that we never set SCHED_CAPACITY_SCALE > 1024, or
that we use it to track a percentage value (i.e. [0..100]). One of the
last patches introduces percentage values to userspace, but I was
considering that in kernel space we should always track full-scale
utilization values.

The c:(BITS_PER_LONG - 10) restricts the number of SEs that can
concurrently refcount the same clamp value, which on some 32-bit
systems is only ~4 million among tasks and cgroups... maybe still
reasonable...

> And use uclamp_map::c == 0 as unused (as per normal refcount
> semantics) and atomic_long_cmpxchg() the whole thing using
> uclamp_map::s.

Yes... that could work for the uclamp_map updates, but, as I noted
above, I think I have other calls serialized by that lock... I will
look better into what you suggest, thanks!

[...]

> > > What's the purpose of that cacheline align statement?
> >
> > In uclamp_maps, we still need to scan the array when a clamp value
> > is changed from user-space, i.e. the cases reported above. Thus,
> > that alignment is just to ensure that we minimize the number of
> > cache lines used. Does that make sense?
> >
> > Maybe that alignment is implicitly generated by the compiler?
>
> It is not, but if it really is a slow path, we shouldn't care about
> alignment.

Ok, will remove it.

> > > Note that without that apparently superfluous lock, it would be
> > > 8*12 = 96 bytes, which is 1.5 lines and would indeed suggest you
> > > default to GROUP_COUNT=7 by default to fill 2 lines.
> >
> > Yes, will check better if we can count on just the uclamp_mutex
>
> Well, if we don't care about performance (slow path) then keeping the
> lock is fine, just the comment and alignment are misleading.

Ok

[...]

Cheers, Patrick

-- 
#include <best/regards.h>

Patrick Bellasi