From: Aubrey Li
Date: Wed, 24 Apr 2019 07:46:19 +0800
Subject: Re: [RFC PATCH v2 15/17] sched: Trivial forced-newidle balancer
To: Vineeth Remanan Pillai
Cc: Nishanth Aravamudan, Julien Desfossez, Peter Zijlstra, Tim Chen,
 Ingo Molnar, Thomas Gleixner, Paul Turner, Linus Torvalds,
 Linux List Kernel Mailing, Subhra Mazumdar, Frédéric Weisbecker,
 Kees Cook, Greg Kerr, Phil Auld, Aaron Lu, Valentin Schneider,
 Mel Gorman, Pawan Gupta, Paolo Bonzini
In-Reply-To: <002304fa58577c7926abe9f4cdc8039945985b3d.1556025155.git.vpillai@digitalocean.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, Apr 24, 2019 at 12:18 AM Vineeth Remanan Pillai wrote:
>
> From: Peter Zijlstra (Intel)
>
> When a sibling is forced-idle to match the core-cookie; search for
> matching tasks to fill the core.
>
> Signed-off-by: Peter Zijlstra (Intel)
> ---
>  include/linux/sched.h |   1 +
>  kernel/sched/core.c   | 131 +++++++++++++++++++++++++++++++++++++++++-
>  kernel/sched/idle.c   |   1 +
>  kernel/sched/sched.h  |   6 ++
>  4 files changed, 138 insertions(+), 1 deletion(-)
>
> diff --git a/include/linux/sched.h b/include/linux/sched.h
> index a4b39a28236f..1a309e8546cd 100644
> --- a/include/linux/sched.h
> +++ b/include/linux/sched.h
> @@ -641,6 +641,7 @@ struct task_struct {
>  #ifdef CONFIG_SCHED_CORE
>  	struct rb_node			core_node;
>  	unsigned long			core_cookie;
> +	unsigned int			core_occupation;
>  #endif
>
>  #ifdef CONFIG_CGROUP_SCHED
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index 9e6e90c6f9b9..e8f5ec641d0a 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -217,6 +217,21 @@ struct task_struct *sched_core_find(struct rq *rq, unsigned long cookie)
>  	return match;
>  }
>
> +struct task_struct *sched_core_next(struct task_struct *p, unsigned long cookie)
> +{
> +	struct rb_node *node = &p->core_node;
> +
> +	node = rb_next(node);
> +	if (!node)
> +		return NULL;
> +
> +	p = container_of(node, struct task_struct, core_node);
> +	if (p->core_cookie != cookie)
> +		return NULL;
> +
> +	return p;
> +}
> +
>  /*
>   * The static-key + stop-machine variable are needed such that:
>   *
> @@ -3672,7 +3687,7 @@ pick_next_task(struct rq *rq, struct task_struct *prev, struct rq_flags *rf)
>  	struct task_struct *next, *max = NULL;
>  	const struct sched_class *class;
>  	const struct cpumask *smt_mask;
> -	int i, j, cpu;
> +	int i, j, cpu, occ = 0;
>
>  	if (!sched_core_enabled(rq))
>  		return __pick_next_task(rq, prev, rf);
> @@ -3763,6 +3778,9 @@ pick_next_task(struct rq *rq, struct task_struct *prev, struct rq_flags *rf)
>  			goto done;
>  		}
>
> +		if (!is_idle_task(p))
> +			occ++;
> +
>  		rq_i->core_pick = p;
>
>  		/*
> @@ -3786,6 +3804,7 @@ pick_next_task(struct rq *rq, struct task_struct *prev, struct rq_flags *rf)
>
>  				cpu_rq(j)->core_pick = NULL;
>  			}
> +			occ = 1;
>  			goto again;
>  		}
> @@ -3808,6 +3827,8 @@ next_class:;
>
>  		WARN_ON_ONCE(!rq_i->core_pick);
>
> +		rq_i->core_pick->core_occupation = occ;
> +
>  		if (i == cpu)
>  			continue;
>
> @@ -3823,6 +3844,114 @@ next_class:;
>  	return next;
>  }
>
> +static bool try_steal_cookie(int this, int that)
> +{
> +	struct rq *dst = cpu_rq(this), *src = cpu_rq(that);
> +	struct task_struct *p;
> +	unsigned long cookie;
> +	bool success = false;
> +

try_steal_cookie() is called inside the for_each_cpu_wrap() loop. The root
domain can be large, so we should avoid stealing a cookie when the source
rq has only one runnable task or the destination is already busy. The
patch below eliminated a deadlock issue on my side against v1, if I didn't
miss anything; I'll double-check with v2. At a minimum it avoids the
unnecessary irq off/on and double rq lock. In particular, it avoids lock
contention where an idle cpu holding its rq lock in the middle of
load_balance() races with the attempt to lock that rq here. I think it
might be worth picking up.

Thanks,
-Aubrey

---
 kernel/sched/core.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 191ebf9..973a75d 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -3876,6 +3876,13 @@ static bool try_steal_cookie(int this, int that)
 	unsigned long cookie;
 	bool success = false;
 
+	/*
+	 * Don't steal if src is idle or has only one runnable task,
+	 * or dst has more than one runnable task
+	 */
+	if (src->nr_running <= 1 || unlikely(dst->nr_running >= 1))
+		return false;
+
 	local_irq_disable();
 	double_rq_lock(dst, src);
--
2.7.4

> +	local_irq_disable();
> +	double_rq_lock(dst, src);
> +
> +	cookie = dst->core->core_cookie;
> +	if (!cookie)
> +		goto unlock;
> +
> +	if (dst->curr != dst->idle)
> +		goto unlock;
> +
> +	p = sched_core_find(src, cookie);
> +	if (p == src->idle)
> +		goto unlock;
> +
> +	do {
> +		if (p == src->core_pick || p == src->curr)
> +			goto next;
> +
> +		if (!cpumask_test_cpu(this, &p->cpus_allowed))
> +			goto next;
> +
> +		if (p->core_occupation > dst->idle->core_occupation)
> +			goto next;
> +
> +		p->on_rq = TASK_ON_RQ_MIGRATING;
> +		deactivate_task(src, p, 0);
> +		set_task_cpu(p, this);
> +		activate_task(dst, p, 0);
> +		p->on_rq = TASK_ON_RQ_QUEUED;
> +
> +		resched_curr(dst);
> +
> +		success = true;
> +		break;
> +
> +next:
> +		p = sched_core_next(p, cookie);
> +	} while (p);
> +
> +unlock:
> +	double_rq_unlock(dst, src);
> +	local_irq_enable();
> +
> +	return success;
> +}
> +
> +static bool steal_cookie_task(int cpu, struct sched_domain *sd)
> +{
> +	int i;
> +
> +	for_each_cpu_wrap(i, sched_domain_span(sd), cpu) {
> +		if (i == cpu)
> +			continue;
> +
> +		if (need_resched())
> +			break;
> +
> +		if (try_steal_cookie(cpu, i))
> +			return true;
> +	}
> +
> +	return false;
> +}
> +
> +static void sched_core_balance(struct rq *rq)
> +{
> +	struct sched_domain *sd;
> +	int cpu = cpu_of(rq);
> +
> +	rcu_read_lock();
> +	raw_spin_unlock_irq(rq_lockp(rq));
> +	for_each_domain(cpu, sd) {
> +		if (!(sd->flags & SD_LOAD_BALANCE))
> +			break;
> +
> +		if (need_resched())
> +			break;
> +
> +		if (steal_cookie_task(cpu, sd))
> +			break;
> +	}
> +	raw_spin_lock_irq(rq_lockp(rq));
> +	rcu_read_unlock();
> +}
> +
> +static DEFINE_PER_CPU(struct callback_head, core_balance_head);
> +
> +void queue_core_balance(struct rq *rq)
> +{
> +	if (!sched_core_enabled(rq))
> +		return;
> +
> +	if (!rq->core->core_cookie)
> +		return;
> +
> +	if (!rq->nr_running) /* not forced idle */
> +		return;
> +
> +	queue_balance_callback(rq, &per_cpu(core_balance_head, rq->cpu), sched_core_balance);
> +}
> +
>  #else /* !CONFIG_SCHED_CORE */
>
>  static struct task_struct *
> diff --git a/kernel/sched/idle.c b/kernel/sched/idle.c
> index e7f38da60373..44decdcccba1 100644
> --- a/kernel/sched/idle.c
> +++ b/kernel/sched/idle.c
> @@ -387,6 +387,7 @@ static void set_next_task_idle(struct rq *rq, struct task_struct *next)
>  {
>  	update_idle_core(rq);
>  	schedstat_inc(rq->sched_goidle);
> +	queue_core_balance(rq);
>  }
>
>  static struct task_struct *
> diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
> index 4cfde289610d..2a5f5a6b11ae 100644
> --- a/kernel/sched/sched.h
> +++ b/kernel/sched/sched.h
> @@ -1013,6 +1013,8 @@ static inline raw_spinlock_t *rq_lockp(struct rq *rq)
>  	return &rq->__lock;
>  }
>
> +extern void queue_core_balance(struct rq *rq);
> +
>  #else /* !CONFIG_SCHED_CORE */
>
>  static inline bool sched_core_enabled(struct rq *rq)
> @@ -1025,6 +1027,10 @@ static inline raw_spinlock_t *rq_lockp(struct rq *rq)
>  	return &rq->__lock;
>  }
>
> +static inline void queue_core_balance(struct rq *rq)
> +{
> +}
> +
>  #endif /* CONFIG_SCHED_CORE */
>
>  #ifdef CONFIG_SCHED_SMT
> --
> 2.17.1
>