From: Brendan Jackman
To: Josef Bacik
Cc: mingo@redhat.com, peterz@infradead.org, linux-kernel@vger.kernel.org, umgwanakikbuti@gmail.com, tj@kernel.org, kernel-team@fb.com, Josef Bacik
Subject: Re: [PATCH 7/7] sched/fair: don't wake affine recently load balanced tasks
Date: Tue, 01 Aug 2017 11:51:05 +0100
Message-ID: <87fudbsgs1.fsf@arm.com>
In-reply-to: <1500038464-8742-8-git-send-email-josef@toxicpanda.com>
References: <1500038464-8742-1-git-send-email-josef@toxicpanda.com> <1500038464-8742-8-git-send-email-josef@toxicpanda.com>
User-agent: mu4e 0.9.17; emacs 25.1.1

Hi Josef,

I happened to be thinking about something like this while investigating a
totally different issue with ARM big.LITTLE. Comment below...

On Fri, Jul 14 2017 at 13:21, Josef Bacik wrote:
> From: Josef Bacik
>
> The wake affinity logic will move tasks between two CPUs that appear to be
> loaded equally at the current time, with a slight bias towards cache
> locality. However on a heavily loaded system the load balancer has better
> insight into what needs to be moved around, so instead keep track of the
> last time a task was migrated by the load balancer. If it was recent, opt
> to let the process stay on its current CPU (or an idle sibling).
>
> Signed-off-by: Josef Bacik
> ---
>  include/linux/sched.h |  1 +
>  kernel/sched/fair.c   | 11 +++++++++++
>  2 files changed, 12 insertions(+)
>
> diff --git a/include/linux/sched.h b/include/linux/sched.h
> index 1a0eadd..d872780 100644
> --- a/include/linux/sched.h
> +++ b/include/linux/sched.h
> @@ -528,6 +528,7 @@ struct task_struct {
>  	unsigned long wakee_flip_decay_ts;
>  	struct task_struct *last_wakee;
>
> +	unsigned long last_balance_ts;
>  	int wake_cpu;
>  #endif
>  	int on_rq;
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 034d5df..6a98a38 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -5604,6 +5604,16 @@ static int wake_wide(struct task_struct *p)
>  	unsigned int slave = p->wakee_flips;
>  	int factor = this_cpu_read(sd_llc_size);
>
> +	/*
> +	 * If we've balanced this task recently we don't want to undo all of
> +	 * that hard work by the load balancer and move it to the current cpu.
> +	 * Constantly overriding the load balancer's decisions is going to make
> +	 * it question its purpose in life and give it anxiety and self-worth
> +	 * issues, and nobody wants that.
> +	 */
> +	if (time_before(jiffies, p->last_balance_ts + HZ))
> +		return 1;
> +
>  	if (master < slave)
>  		swap(master, slave);
>  	if (slave < factor || master < slave * factor)
> @@ -7097,6 +7107,7 @@ static int detach_tasks(struct lb_env *env)
>  			goto next;
>
>  		detach_task(p, env);
> +		p->last_balance_ts = jiffies;

I guess this timestamp should be set in the active balance path too?

>  		list_add(&p->se.group_node, &env->tasks);
>
>  		detached++;

Cheers,
Brendan