From: benbjiang(蒋彪)
To: Vineeth Remanan Pillai
CC: Nishanth Aravamudan, Julien Desfossez, Peter Zijlstra, Tim Chen,
    mingo@kernel.org, tglx@linutronix.de, pjt@google.com,
    torvalds@linux-foundation.org, linux-kernel@vger.kernel.org,
    subhra.mazumdar@oracle.com, fweisbec@gmail.com, keescook@chromium.org,
    kerrnel@google.com, Phil Auld, Aaron Lu, Aubrey Li, Valentin Schneider,
    Mel Gorman, Pawan Gupta, Paolo Bonzini, Joel Fernandes,
    Joel Fernandes (Google), vineethrp@gmail.com, Chen Yu, Christian Brauner
Subject: Re: [RFC PATCH 10/16] sched: Trivial forced-newidle balancer(Internet mail)
Date: Mon, 20 Jul 2020 14:34:43 +0000
Message-ID: <603813E7-7096-455F-894F-9356456C2E33@tencent.com>
In-Reply-To: <980b600006945a45ce1ec34ef206fc04bcf0b5dc.1593530334.git.vpillai@digitalocean.com>
References: <980b600006945a45ce1ec34ef206fc04bcf0b5dc.1593530334.git.vpillai@digitalocean.com>
> On Jul 1, 2020, at 5:32 AM, Vineeth Remanan Pillai wrote:
>
> From: Peter Zijlstra
>
> When a sibling is forced-idle to match the core-cookie; search for
> matching tasks to fill the core.
>
> rcu_read_unlock() can incur an infrequent deadlock in
> sched_core_balance(). Fix this by using the RCU-sched flavor instead.
>
> Signed-off-by: Peter Zijlstra (Intel)
> Signed-off-by: Joel Fernandes (Google)
> Acked-by: Paul E. McKenney
> ---
> include/linux/sched.h |   1 +
> kernel/sched/core.c   | 131 +++++++++++++++++++++++++++++++++++++++++-
> kernel/sched/idle.c   |   1 +
> kernel/sched/sched.h  |   6 ++
> 4 files changed, 138 insertions(+), 1 deletion(-)
>
> diff --git a/include/linux/sched.h b/include/linux/sched.h
> index 3c8dcc5ff039..4f9edf013df3 100644
> --- a/include/linux/sched.h
> +++ b/include/linux/sched.h
> @@ -688,6 +688,7 @@ struct task_struct {
>  #ifdef CONFIG_SCHED_CORE
>  	struct rb_node			core_node;
>  	unsigned long			core_cookie;
> +	unsigned int			core_occupation;
>  #endif
>
>  #ifdef CONFIG_CGROUP_SCHED
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index 4d6d6a678013..fb9edb09ead7 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -201,6 +201,21 @@ static struct task_struct *sched_core_find(struct rq *rq, unsigned long cookie)
>  	return match;
>  }
>
> +static struct task_struct *sched_core_next(struct task_struct *p, unsigned long cookie)
> +{
> +	struct rb_node *node = &p->core_node;
> +
> +	node = rb_next(node);
> +	if (!node)
> +		return NULL;
> +
> +	p = container_of(node, struct task_struct, core_node);
> +	if (p->core_cookie != cookie)
> +		return NULL;
> +
> +	return p;
> +}
> +
>  /*
>   * The static-key + stop-machine variable are needed such that:
>   *
> @@ -4233,7 +4248,7 @@ pick_next_task(struct rq *rq, struct task_struct *prev, struct rq_flags *rf)
>  	struct task_struct *next, *max = NULL;
>  	const struct sched_class *class;
>  	const struct cpumask *smt_mask;
> -	int i, j, cpu;
> +	int i, j, cpu, occ = 0;
>  	bool need_sync;
>
>  	if (!sched_core_enabled(rq))
> @@ -4332,6 +4347,9 @@ pick_next_task(struct rq *rq, struct task_struct *prev, struct rq_flags *rf)
>  			goto done;
>  		}
>
> +		if (!is_idle_task(p))
> +			occ++;
> +
>  		rq_i->core_pick = p;
>
>  		/*
> @@ -4357,6 +4375,7 @@ pick_next_task(struct rq *rq, struct task_struct *prev, struct rq_flags *rf)
>
>  				cpu_rq(j)->core_pick = NULL;
>  			}
> +			occ = 1;
>  			goto again;
>  		} else {
>  			/*
> @@ -4393,6 +4412,8 @@ next_class:;
>  		if (is_idle_task(rq_i->core_pick) && rq_i->nr_running)
>  			rq_i->core_forceidle = true;
>
> +		rq_i->core_pick->core_occupation = occ;
> +
>  		if (i == cpu)
>  			continue;
>
> @@ -4408,6 +4429,114 @@ next_class:;
>  	return next;
>  }
>
> +static bool try_steal_cookie(int this, int that)
> +{
> +	struct rq *dst = cpu_rq(this), *src = cpu_rq(that);
> +	struct task_struct *p;
> +	unsigned long cookie;
> +	bool success = false;
> +
> +	local_irq_disable();
> +	double_rq_lock(dst, src);
> +
> +	cookie = dst->core->core_cookie;
> +	if (!cookie)
> +		goto unlock;
> +
> +	if (dst->curr != dst->idle)
> +		goto unlock;
> +

Could it be OK to add another fast return here?

	if (src->nr_running == 1)
		goto unlock;

When the src cpu has only one running task, there is no need to pull, and
no need to do sched_core_find().
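To make the suggestion concrete, this is roughly where the check would sit
in the quoted try_steal_cookie() (an untested sketch against this patch
only, not a claim about the final code):

	cookie = dst->core->core_cookie;
	if (!cookie)
		goto unlock;

	/* Stealing only makes sense while dst is forced idle. */
	if (dst->curr != dst->idle)
		goto unlock;

	/*
	 * Suggested fast return (sketch): with a single runnable task
	 * on src, that task is normally src->curr, which the steal
	 * loop below skips anyway, so the rbtree walk in
	 * sched_core_find() cannot produce anything stealable.
	 */
	if (src->nr_running == 1)
		goto unlock;

	p = sched_core_find(src, cookie);

The check should be cheap here, since both rq locks are already held under
double_rq_lock(), so src->nr_running is stable at this point.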
Thx.

Regards,
Jiang

> +	p = sched_core_find(src, cookie);
> +	if (p == src->idle)
> +		goto unlock;
> +
> +	do {
> +		if (p == src->core_pick || p == src->curr)
> +			goto next;
> +
> +		if (!cpumask_test_cpu(this, &p->cpus_mask))
> +			goto next;
> +
> +		if (p->core_occupation > dst->idle->core_occupation)
> +			goto next;
> +
> +		p->on_rq = TASK_ON_RQ_MIGRATING;
> +		deactivate_task(src, p, 0);
> +		set_task_cpu(p, this);
> +		activate_task(dst, p, 0);
> +		p->on_rq = TASK_ON_RQ_QUEUED;
> +
> +		resched_curr(dst);
> +
> +		success = true;
> +		break;
> +
> +next:
> +		p = sched_core_next(p, cookie);
> +	} while (p);
> +
> +unlock:
> +	double_rq_unlock(dst, src);
> +	local_irq_enable();
> +
> +	return success;
> +}
> +
> +static bool steal_cookie_task(int cpu, struct sched_domain *sd)
> +{
> +	int i;
> +
> +	for_each_cpu_wrap(i, sched_domain_span(sd), cpu) {
> +		if (i == cpu)
> +			continue;
> +
> +		if (need_resched())
> +			break;
> +
> +		if (try_steal_cookie(cpu, i))
> +			return true;
> +	}
> +
> +	return false;
> +}
> +
> +static void sched_core_balance(struct rq *rq)
> +{
> +	struct sched_domain *sd;
> +	int cpu = cpu_of(rq);
> +
> +	rcu_read_lock_sched();
> +	raw_spin_unlock_irq(rq_lockp(rq));
> +	for_each_domain(cpu, sd) {
> +		if (!(sd->flags & SD_LOAD_BALANCE))
> +			break;
> +
> +		if (need_resched())
> +			break;
> +
> +		if (steal_cookie_task(cpu, sd))
> +			break;
> +	}
> +	raw_spin_lock_irq(rq_lockp(rq));
> +	rcu_read_unlock_sched();
> +}
> +
> +static DEFINE_PER_CPU(struct callback_head, core_balance_head);
> +
> +void queue_core_balance(struct rq *rq)
> +{
> +	if (!sched_core_enabled(rq))
> +		return;
> +
> +	if (!rq->core->core_cookie)
> +		return;
> +
> +	if (!rq->nr_running) /* not forced idle */
> +		return;
> +
> +	queue_balance_callback(rq, &per_cpu(core_balance_head, rq->cpu), sched_core_balance);
> +}
> +
>  #else /* !CONFIG_SCHED_CORE */
>
>  static struct task_struct *
> diff --git a/kernel/sched/idle.c b/kernel/sched/idle.c
> index a8d40ffab097..dff6ba220ed7 100644
> --- a/kernel/sched/idle.c
> +++ b/kernel/sched/idle.c
> @@ -395,6 +395,7 @@ static void set_next_task_idle(struct rq *rq, struct task_struct *next, bool fir
>  {
>  	update_idle_core(rq);
>  	schedstat_inc(rq->sched_goidle);
> +	queue_core_balance(rq);
>  }
>
>  #ifdef CONFIG_SMP
> diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
> index 293aa1ae0308..464559676fd2 100644
> --- a/kernel/sched/sched.h
> +++ b/kernel/sched/sched.h
> @@ -1089,6 +1089,8 @@ static inline raw_spinlock_t *rq_lockp(struct rq *rq)
>  bool cfs_prio_less(struct task_struct *a, struct task_struct *b);
>  void sched_core_adjust_sibling_vruntime(int cpu, bool coresched_enabled);
>
> +extern void queue_core_balance(struct rq *rq);
> +
>  #else /* !CONFIG_SCHED_CORE */
>
>  static inline bool sched_core_enabled(struct rq *rq)
> @@ -1101,6 +1103,10 @@ static inline raw_spinlock_t *rq_lockp(struct rq *rq)
>  	return &rq->__lock;
>  }
>
> +static inline void queue_core_balance(struct rq *rq)
> +{
> +}
> +
>  #endif /* CONFIG_SCHED_CORE */
>
>  #ifdef CONFIG_SCHED_SMT
> --
> 2.17.1