Date: Mon, 21 Aug 2017 13:26:23 -0400
From: Josef Bacik
To: Brendan Jackman
Cc: linux-kernel@vger.kernel.org, Joel Fernandes, Andres Oportus,
    Ingo Molnar, Morten Rasmussen, Peter Zijlstra, Dietmar Eggemann,
    Vincent Guittot
Subject: Re: [PATCH 2/2] sched/fair: Fix use of NULL with find_idlest_group
Message-ID: <20170821172622.GA23807@destiny>
References: <20170821152128.14418-1-brendan.jackman@arm.com>
 <20170821152128.14418-3-brendan.jackman@arm.com>
In-Reply-To: <20170821152128.14418-3-brendan.jackman@arm.com>

On Mon, Aug 21, 2017 at 04:21:28PM +0100, Brendan Jackman wrote:
> The current use of returning NULL from find_idlest_group is broken in
> two cases:
>
> a1) The local group is not allowed.
>
>     In this case, we currently do not change this_runnable_load or
>     this_avg_load from its initial value of 0, which means we return
>     NULL regardless of the load of the other, allowed groups. This
>     results in pointlessly continuing the find_idlest_group search
>     within the local group and then returning prev_cpu from
>     select_task_rq_fair.
>
> a2) No CPUs in the sched_domain are allowed.
>
>     In this case we also return NULL and again pointlessly continue
>     the search.
>
> b) smp_processor_id() is the "idlest" and != prev_cpu.
>
>    find_idlest_group also returns NULL when the local group is
>    allowed and is the idlest. The caller then continues the
>    find_idlest_group search at a lower level of the current CPU's
>    sched_domain hierarchy. However new_cpu is not updated. This means
>    the search is pointless and we return prev_cpu from
>    select_task_rq_fair.
>
> This is fixed by:
>
> 1. Returning NULL from find_idlest_group only when _no_ groups were
>    allowed in the current sched_domain. In this case, we now break
>    from the while(sd) loop and immediately return prev_cpu. This
>    fixes case a2).
>
> 2. Initializing this_runnable_load and this_avg_load to ULONG_MAX
>    instead of 0. This means in case a1) we now return the idlest
>    non-local group.
>
> 3. Explicitly updating new_cpu when find_idlest_group returns the
>    local group, fixing case b).
>
> This patch also re-words the check for whether the group in
> consideration is local, under the assumption that the first group in
> the sched domain is always the local one.
>
> Signed-off-by: Brendan Jackman
> Cc: Ingo Molnar
> Cc: Morten Rasmussen
> Cc: Peter Zijlstra
> Cc: Dietmar Eggemann
> Cc: Vincent Guittot
> ---
>  kernel/sched/fair.c | 34 +++++++++++++++++++---------------
>  1 file changed, 19 insertions(+), 15 deletions(-)
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 64618d768546..7cb5ed719cf9 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -5382,26 +5382,29 @@ static unsigned long capacity_spare_wake(int cpu, struct task_struct *p)
>   * domain.
>   */
>  static struct sched_group *
> -find_idlest_group(struct sched_domain *sd, struct task_struct *p,
> -		  int this_cpu, int sd_flag)
> +find_idlest_group(struct sched_domain *sd, struct task_struct *p, int sd_flag)
>  {
>  	struct sched_group *idlest = NULL, *group = sd->groups;
> +	struct sched_group *local_group = sd->groups;
>  	struct sched_group *most_spare_sg = NULL;
> -	unsigned long min_runnable_load = ULONG_MAX, this_runnable_load = 0;
> -	unsigned long min_avg_load = ULONG_MAX, this_avg_load = 0;
> +	unsigned long min_runnable_load = ULONG_MAX, this_runnable_load = ULONG_MAX;
> +	unsigned long min_avg_load = ULONG_MAX, this_avg_load = ULONG_MAX;
>  	unsigned long most_spare = 0, this_spare = 0;
>  	int load_idx = sd->forkexec_idx;
>  	int imbalance_scale = 100 + (sd->imbalance_pct-100)/2;
>  	unsigned long imbalance = scale_load_down(NICE_0_LOAD) *
>  				(sd->imbalance_pct-100) / 100;
>
> +	if (!cpumask_intersects(sched_domain_span(sd), &p->cpus_allowed))
> +		return NULL;
> +
>  	if (sd_flag & SD_BALANCE_WAKE)
>  		load_idx = sd->wake_idx;
>
>  	do {
>  		unsigned long load, avg_load, runnable_load;
>  		unsigned long spare_cap, max_spare_cap;
> -		int local_group;
> +		bool group_is_local = group == local_group;
>  		int i;
>
>  		/* Skip over this group if it has no CPUs allowed */
> @@ -5409,9 +5412,6 @@ find_idlest_group(struct sched_domain *sd, struct task_struct *p,
>  					   &p->cpus_allowed))
>  			continue;
>
> -		local_group = cpumask_test_cpu(this_cpu,
> -					       sched_group_span(group));
> -

This isn't right, is it? The cpu isn't necessarily in the very first group
of a sd, is it?

Thanks,

Josef
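As a side note for readers following the changelog's case a1), the sketch
below is a standalone userspace toy model, not the kernel code:
toy_find_idlest_group, the group names and the load numbers are invented
for illustration, and the imbalance, avg_load and spare-capacity logic is
deliberately left out. It only shows why initializing the local group's
load to 0 rather than ULONG_MAX makes a disallowed local group look idlest,
which is what point 2 of the changelog fixes.

/*
 * Toy model of case a1): if the local group's load starts at 0, a local
 * group the task is not even allowed to run in still "wins" the
 * comparison, so NULL ("stay local") is returned and the genuinely
 * idlest remote group is ignored.
 */
#include <limits.h>
#include <stdbool.h>
#include <stdio.h>

struct toy_group {
	const char *name;
	unsigned long runnable_load;
	bool allowed;	/* does the group intersect p->cpus_allowed? */
	bool local;	/* is this the local (first) group? */
};

/* Returns NULL when the local group looks idlest, mirroring the real API. */
static const struct toy_group *
toy_find_idlest_group(const struct toy_group *groups, int nr,
		      unsigned long this_load_init)
{
	const struct toy_group *idlest = NULL;
	unsigned long min_load = ULONG_MAX;
	unsigned long this_load = this_load_init;

	for (int i = 0; i < nr; i++) {
		if (!groups[i].allowed)
			continue;		/* skip disallowed groups */
		if (groups[i].local) {
			this_load = groups[i].runnable_load;
		} else if (groups[i].runnable_load < min_load) {
			min_load = groups[i].runnable_load;
			idlest = &groups[i];
		}
	}

	/* NULL means "keep looking within the local group". */
	if (!idlest || this_load <= min_load)
		return NULL;
	return idlest;
}

int main(void)
{
	/* The local group is NOT allowed; the remote group is allowed and idler. */
	struct toy_group groups[] = {
		{ "local",  10, false, true  },
		{ "remote",  5, true,  false },
	};
	const struct toy_group *g;

	/* Old behaviour: local load starts at 0, so NULL wins. */
	g = toy_find_idlest_group(groups, 2, 0);
	printf("init 0:         %s\n", g ? g->name : "NULL (stay local)");

	/* Fixed behaviour: ULONG_MAX init lets the allowed remote group win. */
	g = toy_find_idlest_group(groups, 2, ULONG_MAX);
	printf("init ULONG_MAX: %s\n", g ? g->name : "NULL (stay local)");

	return 0;
}

With any C99 compiler, the first call prints "NULL (stay local)" even though
the remote group is both allowed and less loaded, while the second call
selects the remote group.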