Date: Wed, 7 Nov 2018 14:48:09 +0000
From: Patrick Bellasi
To: Peter Zijlstra
Cc: linux-kernel@vger.kernel.org, linux-pm@vger.kernel.org,
    Ingo Molnar, Tejun Heo, Rafael J. Wysocki, Vincent Guittot,
    Viresh Kumar, Paul Turner, Quentin Perret, Dietmar Eggemann,
    Morten Rasmussen, Juri Lelli, Todd Kjos, Joel Fernandes,
    Steve Muckle, Suren Baghdasaryan
Subject: Re: [PATCH v5 03/15] sched/core: uclamp: map TASK's clamp values into CPU's clamp groups
Message-ID: <20181107144809.GH14309@e110439-lin>
References: <20181029183311.29175-1-patrick.bellasi@arm.com>
 <20181029183311.29175-4-patrick.bellasi@arm.com>
 <20181107133504.GQ9781@hirez.programming.kicks-ass.net>
In-Reply-To: <20181107133504.GQ9781@hirez.programming.kicks-ass.net>
Wysocki" , Vincent Guittot , Viresh Kumar , Paul Turner , Quentin Perret , Dietmar Eggemann , Morten Rasmussen , Juri Lelli , Todd Kjos , Joel Fernandes , Steve Muckle , Suren Baghdasaryan Subject: Re: [PATCH v5 03/15] sched/core: uclamp: map TASK's clamp values into CPU's clamp groups Message-ID: <20181107144809.GH14309@e110439-lin> References: <20181029183311.29175-1-patrick.bellasi@arm.com> <20181029183311.29175-4-patrick.bellasi@arm.com> <20181107133504.GQ9781@hirez.programming.kicks-ass.net> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20181107133504.GQ9781@hirez.programming.kicks-ass.net> User-Agent: Mutt/1.5.24 (2015-08-30) Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 07-Nov 14:35, Peter Zijlstra wrote: > On Mon, Oct 29, 2018 at 06:32:57PM +0000, Patrick Bellasi wrote: > > +/** > > + * uclamp_group_get: increase the reference count for a clamp group > > + * @uc_se: the utilization clamp data for the task > > + * @clamp_id: the clamp index affected by the task > > + * @clamp_value: the new clamp value for the task > > + * > > + * Each time a task changes its utilization clamp value, for a specified clamp > > + * index, we need to find an available clamp group which can be used to track >You mean se_count overflow ? > + * this new clamp value. The corresponding clamp group index will be used to > > + * reference count the corresponding clamp value while the task is enqueued on > > + * a CPU. > > + */ > > +static void uclamp_group_get(struct uclamp_se *uc_se, unsigned int clamp_id, > > + unsigned int clamp_value) > > +{ > > + union uclamp_map *uc_maps = &uclamp_maps[clamp_id][0]; > > + unsigned int prev_group_id = uc_se->group_id; > > + union uclamp_map uc_map_old, uc_map_new; > > + unsigned int free_group_id; > > + unsigned int group_id; > > + unsigned long res; > > + > > +retry: > > + > > + free_group_id = UCLAMP_GROUPS; > > + for (group_id = 0; group_id < UCLAMP_GROUPS; ++group_id) { > > + uc_map_old.data = atomic_long_read(&uc_maps[group_id].adata); > > + if (free_group_id == UCLAMP_GROUPS && !uc_map_old.se_count) > > + free_group_id = group_id; > > + if (uc_map_old.value == clamp_value) > > + break; > > + } > > + if (group_id >= UCLAMP_GROUPS) { > > +#ifdef CONFIG_SCHED_DEBUG > > +#define UCLAMP_MAPERR "clamp value [%u] mapping to clamp group failed\n" > > + if (unlikely(free_group_id == UCLAMP_GROUPS)) { > > + pr_err_ratelimited(UCLAMP_MAPERR, clamp_value); > > + return; > > + } > > +#endif > > + group_id = free_group_id; > > + uc_map_old.data = atomic_long_read(&uc_maps[group_id].adata); > > + } > > You forgot to check for refcount overflow here ;-) You mean se_count overflow ? That se_count is (BITS_PER_LONG - SCHED_CAPACITY_SHIFT - 1) which makes it able to track up to: +2mln tasks/task_groups on 32bit systems (with SCHED_CAPACITY_SHIFT 10) +10^12 tasks/task_groups on 64bit systems (with SCHED_CAPACITY_SHIFT 20) I don't expect overflow on 64bit systems, do we ? It's more likely on 32bit systems, especially if in the future we should increase SCHED_CAPACITY_SHIFT. > And I'm not really a fan of hiding that error in a define like you keep > doing. The #define is not there to mask an overflow, it's there to catch the case in which the refcount should be corrupted and we end up violating the invariant: "there is always a clamp group available". 

> And I'm not really a fan of hiding that error in a define like you
> keep doing.

The #define is not there to mask an overflow: it's there to catch the
case in which the refcount gets corrupted and we end up violating the
invariant "there is always a clamp group available".

NOTE: that invariant is granted once we add a later patch of this
series:

   sched/core: uclamp: add clamp group bucketing support

The warning reports the issue only on CONFIG_SCHED_DEBUG, but it makes
sense to keep it always enabled. Moreover, in case of data corruption,
we should just return, thus not setting the scheduling entity as
"mapped" towards the end of the function... which makes me see that
it's actually wrong to conditionally compile the "return" above.

> What's wrong with something like:
>
> 	if (SCHED_WARN(free_group_id == UCLAMP_GROUPS))
> 		return;

Right, the flow should be:

 1. try to find a valid clamp group
 2. if you don't find one, the data structures are corrupted:
    warn once for data corruption,
    do not map this scheduling entity, and return
 3. map the scheduling entity

Is that ok ?

> and
>
> > +	uc_map_new.se_count = uc_map_old.se_count + 1;
>
> 	if (SCHED_WARN(!new.se_count))
> 		new.se_count = -1;

Mmm... not sure we can recover from a corrupted refcount or an
overflow. What should we do in these cases, disable uclamp completely ?

> > +	uc_map_new.value = clamp_value;
> > +	res = atomic_long_cmpxchg(&uc_maps[group_id].adata,
> > +				  uc_map_old.data, uc_map_new.data);
> > +	if (res != uc_map_old.data)
> > +		goto retry;
> > +
> > +	/* Update SE's clamp values and attach it to new clamp group */
> > +	uc_se->value = clamp_value;
> > +	uc_se->group_id = group_id;
> > +
> > +	/* Release the previous clamp group */
> > +	if (uc_se->mapped)
> > +		uclamp_group_put(clamp_id, prev_group_id);
> > +	uc_se->mapped = true;
> > +}
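
Putting the above together, here is a rough, untested sketch of the
error path and of the refcount update I have in mind (step 1 is the
existing lookup loop, unchanged; it assumes SCHED_WARN_ON(), or an
always-enabled variant of it as discussed above):

	if (group_id >= UCLAMP_GROUPS) {
		/*
		 * 2. No matching group and no free group: the data
		 *    structures are corrupted; warn and do not map
		 *    this scheduling entity.
		 */
		if (SCHED_WARN_ON(free_group_id == UCLAMP_GROUPS))
			return;
		group_id = free_group_id;
		uc_map_old.data = atomic_long_read(&uc_maps[group_id].adata);
	}

	/*
	 * 3. Map the scheduling entity: on se_count overflow, saturate
	 *    the refcount instead of wrapping it to zero.
	 */
	uc_map_new.se_count = uc_map_old.se_count + 1;
	if (SCHED_WARN_ON(!uc_map_new.se_count))
		uc_map_new.se_count = -1;
	uc_map_new.value = clamp_value;

-- 
#include <best/regards.h>

Patrick Bellasi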