From: Joel Fernandes
Date: Mon, 21 Aug 2017 21:34:45 -0700
Subject: Re: [PATCH 2/2] sched/fair: Fix use of NULL with find_idlest_group
To: Peter Zijlstra
Cc: Brendan Jackman, LKML, Andres Oportus, Ingo Molnar,
    Morten Rasmussen, Dietmar Eggemann, Vincent Guittot
In-Reply-To: <20170821211400.GF32112@worktop.programming.kicks-ass.net>
References: <20170821152128.14418-1-brendan.jackman@arm.com>
    <20170821152128.14418-3-brendan.jackman@arm.com>
    <20170821211400.GF32112@worktop.programming.kicks-ass.net>

Hi Peter,

On Mon, Aug 21, 2017 at 2:14 PM, Peter Zijlstra wrote:
> On Mon, Aug 21, 2017 at 04:21:28PM +0100, Brendan Jackman wrote:
>> The current use of returning NULL from find_idlest_group is broken in
>> two cases:
>>
>> a1) The local group is not allowed.
>>
>>     In this case, we currently do not change this_runnable_load or
>>     this_avg_load from its initial value of 0, which means we return
>>     NULL regardless of the load of the other, allowed groups. This
>>     results in pointlessly continuing the find_idlest_group search
>>     within the local group and then returning prev_cpu from
>>     select_task_rq_fair.
>
>> b) smp_processor_id() is the "idlest" and != prev_cpu.
>>
>>    find_idlest_group also returns NULL when the local group is
>>    allowed and is the idlest. The caller then continues the
>>    find_idlest_group search at a lower level of the current CPU's
>>    sched_domain hierarchy. However new_cpu is not updated. This means
>>    the search is pointless and we return prev_cpu from
>>    select_task_rq_fair.
>>
>
> I think it's much simpler than that... but it's late, so who knows ;-)
>
> Both cases seem predicated on the assumption that we'll return @cpu when
> we don't find any idler CPU. Consider: if the local group is the idlest,
> we should stick with @cpu and simply proceed with the child domain.
>
> The confusion, and the bugs, seem to have snuck in when we started
> considering @prev_cpu, whenever that was. The below is mostly code
> movement to put that whole while(sd) loop into its own function.
>
> The effective change is setting @new_cpu = @cpu when we start that loop:
>
> ---
>  kernel/sched/fair.c | 83 +++++++++++++++++++++++++++++++----------------------
>  1 file changed, 48 insertions(+), 35 deletions(-)
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index c77e4b1d51c0..3e77265c480a 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -5588,10 +5588,10 @@ static unsigned long capacity_spare_wake(int cpu, struct task_struct *p)
>  }
>
>  /*
> - * find_idlest_cpu - find the idlest cpu among the cpus in group.
> + * find_idlest_group_cpu - find the idlest cpu among the cpus in group.
>   */
>  static int
> -find_idlest_cpu(struct sched_group *group, struct task_struct *p, int this_cpu)
> +find_idlest_group_cpu(struct sched_group *group, struct task_struct *p, int this_cpu)
>  {
>  	unsigned long load, min_load = ULONG_MAX;
>  	unsigned int min_exit_latency = UINT_MAX;
> @@ -5640,6 +5640,50 @@ static unsigned long capacity_spare_wake(int cpu, struct task_struct *p)
>  	return shallowest_idle_cpu != -1 ? shallowest_idle_cpu : least_loaded_cpu;
>  }
>
> +static int
> +find_idlest_cpu(struct sched_domain *sd, struct task_struct *p, int cpu, int sd_flag)
> +{
> +	struct sched_domain *tmp;
> +	int new_cpu = cpu;
> +
> +	while (sd) {
> +		struct sched_group *group;
> +		int weight;
> +
> +		if (!(sd->flags & sd_flag)) {
> +			sd = sd->child;
> +			continue;
> +		}
> +
> +		group = find_idlest_group(sd, p, cpu, sd_flag);
> +		if (!group) {
> +			sd = sd->child;
> +			continue;

But this will still have the issue of pointlessly searching in the
local_group when the idlest CPU is in the non-local group? This stems
from the fact that find_idlest_group is broken if the local group is
not allowed. I believe this is fixed by Brendan's patch?

"Initializing this_runnable_load and this_avg_load to ULONG_MAX
instead of 0. This means in case a1) we now return the idlest
non-local group."
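To make sure I'm reading it right, here is a toy model of that final
comparison -- simplified and hypothetical, the real find_idlest_group()
tracks runnable_load and avg_load separately and scales the comparison
by an imbalance factor, all of which I've dropped here:

	/*
	 * Toy model of find_idlest_group()'s selection logic, to show
	 * case a1). Names like find_idlest_group_model() are made up.
	 */
	#include <stdio.h>
	#include <limits.h>

	struct group {
		unsigned long load;
		int local;	/* is this the local group? */
		int allowed;	/* does the task's cpumask intersect it? */
	};

	/* Returns the idlest group's index, or -1 to model "return NULL". */
	static int find_idlest_group_model(const struct group *g, int n,
					   unsigned long this_load_init)
	{
		unsigned long min_load = ULONG_MAX;
		unsigned long this_load = this_load_init;
		int idlest = -1;

		for (int i = 0; i < n; i++) {
			if (!g[i].allowed)
				continue;	/* case a1): this_load never updated */
			if (g[i].local)
				this_load = g[i].load;
			else if (g[i].load < min_load) {
				min_load = g[i].load;
				idlest = i;
			}
		}

		/* With this_load stuck at its initial 0, the local group
		 * always "wins" and we return NULL despite being disallowed. */
		if (idlest == -1 || this_load <= min_load)
			return -1;

		return idlest;
	}

	int main(void)
	{
		const struct group g[] = {
			{ .load = 50, .local = 1, .allowed = 0 },
			{ .load = 10, .local = 0, .allowed = 1 },
		};

		printf("init 0:         %d\n", find_idlest_group_model(g, 2, 0));
		printf("init ULONG_MAX: %d\n", find_idlest_group_model(g, 2, ULONG_MAX));
		return 0;
	}

With the 0 initialization the model returns -1 (NULL) even though
group 1 is clearly idlest; with ULONG_MAX it returns group 1, which is
what Brendan's change achieves.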
Hopefully I didn't miss something. Sorry if I did, thanks,

-Joel
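P.S. For completeness, a toy model of case b) and why starting that
while(sd) loop from @cpu rather than @prev_cpu matters -- purely
illustrative, the descend() helper and its arguments are made up:

	#include <stdio.h>

	/*
	 * Each level either proposes an idler CPU, or -1 to model
	 * find_idlest_group() returning NULL (local group is idlest).
	 */
	static int descend(const int *idlest_cpu_at, int levels, int start_cpu)
	{
		int new_cpu = start_cpu;

		for (int lvl = 0; lvl < levels; lvl++) {
			if (idlest_cpu_at[lvl] < 0)
				continue;	/* NULL group: keep new_cpu, descend */
			new_cpu = idlest_cpu_at[lvl];
		}
		return new_cpu;
	}

	int main(void)
	{
		/* Case b): the local group is idlest at every level. */
		const int idlest_cpu_at[] = { -1, -1, -1 };
		int cpu = 2, prev_cpu = 5;

		/* Old behaviour: new_cpu starts from prev_cpu and is never
		 * written, so we hand back prev_cpu even though the search
		 * concluded that @cpu's groups are idlest. */
		printf("start from prev_cpu: %d\n", descend(idlest_cpu_at, 3, prev_cpu));

		/* Peter's effective change: start from @cpu. */
		printf("start from cpu:      %d\n", descend(idlest_cpu_at, 3, cpu));
		return 0;
	}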