Received: by 2002:a25:868d:0:0:0:0:0 with SMTP id z13csp1814585ybk; Thu, 21 May 2020 16:16:48 -0700 (PDT) X-Google-Smtp-Source: ABdhPJxcEistnJ8yTzOKWABddowru2No9vVKgxltXUZEEALWtd948vBYTY9ZsN29nb4h0y2mkr6L X-Received: by 2002:a17:906:8d0:: with SMTP id o16mr5958074eje.196.1590103008453; Thu, 21 May 2020 16:16:48 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1590103008; cv=none; d=google.com; s=arc-20160816; b=PO3APkUK/ME7DM4h+zrZTS+MiPpFiqhwnwJNI9gq3KWV9wNRy4KHsBPB+viCy6ZkFx rTP5YW8VwBayqp5RkwtVZbyDLx5rGr0NQOxEFxIkUvGkNKVE5wCaElMXgvGvc9Azu4Fb 8Iw/oYwPSNSjglN/+zvmN7dhgLJdVb1EZzB7szjA7xWn6vuE7sjsuAfLE13vTQL2ewE1 xK4cOyMQ4+BiUei7Lm8APiI8G5+DTNr1FQOy4lfeRsggQZitlaQ+84ToCe3k+JlNsMVa +jIz+rnSOkllBmBzBL8KMvkymkaHpBmGcQA8vRF277B/sOckLdSH318J4F/WKX7R6ITx bKhw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:in-reply-to:content-disposition :mime-version:references:message-id:subject:cc:to:from:date :dkim-signature; bh=p3Hdvcd+xHhyt3NoKuQ6ayw1V2hWUZczEPhgChE6r2I=; b=RrLQmKCjs+AXWYfWSGsmsFjWkEKzURKi9Cu3C5rzkci93KHF1rksiovbt5ML0EkHjN iAWrJCuBxi90JJ5R9gRAUYRyrggsdMar3Wxx5wIY9WzR5smy5vq/XpH73vTmOWG8fiJA KIEriwu0Aoy6S44+XK7JmA8Ak/Osqrg0z+3HhQ7u/h3kXQpoSxchWgZAW/4tOoVBk2f0 zwjYyaSmTKHRIX2omDg6H4XPpWx+H1oQs2SpxarHBhtbA20XdTIlV3jAg4TssbWX7DVi hW3uoIg0mgtxpPXL31v8852140ya7agD2dzU0Nif05OYExDGRNQPLpdIVBFFCn8Jt1e5 JCyw== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@joelfernandes.org header.s=google header.b=wSg8zAEc; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id r10si3844427edo.137.2020.05.21.16.16.26; Thu, 21 May 2020 16:16:48 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; dkim=pass header.i=@joelfernandes.org header.s=google header.b=wSg8zAEc; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1730764AbgEUXO3 (ORCPT + 99 others); Thu, 21 May 2020 19:14:29 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:52700 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1730658AbgEUXO2 (ORCPT ); Thu, 21 May 2020 19:14:28 -0400 Received: from mail-qt1-x843.google.com (mail-qt1-x843.google.com [IPv6:2607:f8b0:4864:20::843]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 90F2DC061A0E for ; Thu, 21 May 2020 16:14:28 -0700 (PDT) Received: by mail-qt1-x843.google.com with SMTP id l1so6941900qtp.6 for ; Thu, 21 May 2020 16:14:28 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=joelfernandes.org; s=google; h=date:from:to:cc:subject:message-id:references:mime-version :content-disposition:in-reply-to; bh=p3Hdvcd+xHhyt3NoKuQ6ayw1V2hWUZczEPhgChE6r2I=; b=wSg8zAEchboHGRWoQA3FX2iSGBAIm5t/5PTrQb+91NK0EumvVgmfXPYgbbwXFkbms/ 5ZAsqxfVF5QXxsvyEJr37KsuHJed4LPiCe8TgfggMT8IZ+b+EJbG0UD2ZgXD9G5LZ16B EA5WK3B4nmESt6f911z3u6/JGYsUUg+JWSQHo= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:from:to:cc:subject:message-id:references :mime-version:content-disposition:in-reply-to; bh=p3Hdvcd+xHhyt3NoKuQ6ayw1V2hWUZczEPhgChE6r2I=; b=c4xJfWBD/czP84ALDkeD1r8p2kAwszQjEZPAm3qAhxTFXYNN5LDXDgXgwVXvyXA/pu tE9V5ZVtfRZslngD8CKEZNVJXmDQUPNDHEM3/G1uIQCqBwaCydbqFN1JMLOYLAc0Pipl DEig6NMcCpuZnLBlWYYUDL6V/Pbx7yCiSudLr9sUGOat/Blpwwz3kTdIQjCQxP8tz9vS ExnXJY3fIc1u85/OEHYQni2tKvK6aPMp0fg3OVmU0qfjpDcYd9Em76DZCB7s+7hPKrcV JxTANX1/pIGhGiIZ1+Nzv7ZMF21DzY10fxzGP18Vy96Y7ovbQ2mWVyPxgT1mLdFpWE29 kbFw== X-Gm-Message-State: AOAM531yiug8U9TkoTCOoNCQCG6zlpYlqSDLTxWItkDW+jVhlGkOxE6W 43Puc+gqQR7YEPOIbW2D49xWGg== X-Received: by 2002:ac8:2492:: with SMTP id s18mr5923786qts.81.1590102867666; Thu, 21 May 2020 16:14:27 -0700 (PDT) Received: from localhost ([2620:15c:6:12:9c46:e0da:efbf:69cc]) by smtp.gmail.com with ESMTPSA id u39sm6793217qtc.8.2020.05.21.16.14.26 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 21 May 2020 16:14:27 -0700 (PDT) Date: Thu, 21 May 2020 19:14:26 -0400 From: Joel Fernandes To: vpillai Cc: Nishanth Aravamudan , Julien Desfossez , Peter Zijlstra , Tim Chen , mingo@kernel.org, tglx@linutronix.de, pjt@google.com, torvalds@linux-foundation.org, linux-kernel@vger.kernel.org, fweisbec@gmail.com, keescook@chromium.org, kerrnel@google.com, Phil Auld , Aaron Lu , Aubrey Li , aubrey.li@linux.intel.com, Valentin Schneider , Mel Gorman , Pawan Gupta , Paolo Bonzini , Aaron Lu Subject: Re: [RFC PATCH 07/13] sched: Add core wide task selection and scheduling. Message-ID: <20200521231426.GA246288@google.com> References: MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Wed, Mar 04, 2020 at 04:59:57PM +0000, vpillai wrote: > From: Peter Zijlstra > > Instead of only selecting a local task, select a task for all SMT > siblings for every reschedule on the core (irrespective which logical > CPU does the reschedule). > > There could be races in core scheduler where a CPU is trying to pick > a task for its sibling in core scheduler, when that CPU has just been > offlined. We should not schedule any tasks on the CPU in this case. > Return an idle task in pick_next_task for this situation. > > NOTE: there is still potential for siblings rivalry. > NOTE: this is far too complicated; but thus far I've failed to > simplify it further. > > Signed-off-by: Peter Zijlstra (Intel) > Signed-off-by: Julien Desfossez > Signed-off-by: Vineeth Remanan Pillai > Signed-off-by: Aaron Lu > Signed-off-by: Tim Chen > --- > kernel/sched/core.c | 274 ++++++++++++++++++++++++++++++++++++++++++- > kernel/sched/fair.c | 40 +++++++ > kernel/sched/sched.h | 6 +- > 3 files changed, 318 insertions(+), 2 deletions(-) > > diff --git a/kernel/sched/core.c b/kernel/sched/core.c > index 445f0d519336..9a1bd236044e 100644 > --- a/kernel/sched/core.c > +++ b/kernel/sched/core.c > @@ -4253,7 +4253,7 @@ static inline void schedule_debug(struct task_struct *prev, bool preempt) > * Pick up the highest-prio task: > */ > static inline struct task_struct * > -pick_next_task(struct rq *rq, struct task_struct *prev, struct rq_flags *rf) > +__pick_next_task(struct rq *rq, struct task_struct *prev, struct rq_flags *rf) > { > const struct sched_class *class; > struct task_struct *p; > @@ -4309,6 +4309,273 @@ pick_next_task(struct rq *rq, struct task_struct *prev, struct rq_flags *rf) > BUG(); > } > > +#ifdef CONFIG_SCHED_CORE > + > +static inline bool cookie_equals(struct task_struct *a, unsigned long cookie) > +{ > + return is_idle_task(a) || (a->core_cookie == cookie); > +} > + > +static inline bool cookie_match(struct task_struct *a, struct task_struct *b) > +{ > + if (is_idle_task(a) || is_idle_task(b)) > + return true; > + > + return a->core_cookie == b->core_cookie; > +} > + > +// XXX fairness/fwd progress conditions > +/* > + * Returns > + * - NULL if there is no runnable task for this class. > + * - the highest priority task for this runqueue if it matches > + * rq->core->core_cookie or its priority is greater than max. > + * - Else returns idle_task. > + */ > +static struct task_struct * > +pick_task(struct rq *rq, const struct sched_class *class, struct task_struct *max) > +{ > + struct task_struct *class_pick, *cookie_pick; > + unsigned long cookie = rq->core->core_cookie; > + > + class_pick = class->pick_task(rq); > + if (!class_pick) > + return NULL; > + > + if (!cookie) { > + /* > + * If class_pick is tagged, return it only if it has > + * higher priority than max. > + */ > + if (max && class_pick->core_cookie && > + prio_less(class_pick, max)) > + return idle_sched_class.pick_task(rq); > + > + return class_pick; > + } > + > + /* > + * If class_pick is idle or matches cookie, return early. > + */ > + if (cookie_equals(class_pick, cookie)) > + return class_pick; > + > + cookie_pick = sched_core_find(rq, cookie); > + > + /* > + * If class > max && class > cookie, it is the highest priority task on > + * the core (so far) and it must be selected, otherwise we must go with > + * the cookie pick in order to satisfy the constraint. > + */ > + if (prio_less(cookie_pick, class_pick) && > + (!max || prio_less(max, class_pick))) > + return class_pick; > + > + return cookie_pick; > +} I've been hating on this pick_task() routine for a while now :-). If we add the task to the tag tree as Peter suggested at OSPM for that other issue Vineeth found, it seems it could be simpler. This has just been near a compiler so far but how about: ---8<----------------------- diff --git a/kernel/sched/core.c b/kernel/sched/core.c index 005d7f7323e2d..81e23252b6c99 100644 --- a/kernel/sched/core.c +++ b/kernel/sched/core.c @@ -182,9 +182,6 @@ static void sched_core_enqueue(struct rq *rq, struct task_struct *p) rq->core->core_task_seq++; - if (!p->core_cookie) - return; - node = &rq->core_tree.rb_node; parent = *node; @@ -215,7 +212,7 @@ static void sched_core_dequeue(struct rq *rq, struct task_struct *p) void sched_core_add(struct rq *rq, struct task_struct *p) { - if (p->core_cookie && task_on_rq_queued(p)) + if (task_on_rq_queued(p)) sched_core_enqueue(rq, p); } @@ -4563,36 +4560,32 @@ pick_task(struct rq *rq, const struct sched_class *class, struct task_struct *ma if (!class_pick) return NULL; - if (!cookie) { - /* - * If class_pick is tagged, return it only if it has - * higher priority than max. - */ - if (max && class_pick->core_cookie && - prio_less(class_pick, max)) - return idle_sched_class.pick_task(rq); - + if (!max) return class_pick; - } - /* - * If class_pick is idle or matches cookie, return early. - */ + /* Make sure the current max's cookie is core->core_cookie */ + WARN_ON_ONCE(max->core_cookie != cookie); + + /* If class_pick is idle or matches cookie, play nice. */ if (cookie_equals(class_pick, cookie)) return class_pick; - cookie_pick = sched_core_find(rq, cookie); + /* If class_pick is highest prio, trump max. */ + if (prio_less(max, class_pick)) { + + /* .. but not before checking if cookie trumps class. */ + cookie_pick = sched_core_find(rq, cookie); + if (prio_less(class_pick, cookie_pick)) + return cookie_pick; - /* - * If class > max && class > cookie, it is the highest priority task on - * the core (so far) and it must be selected, otherwise we must go with - * the cookie pick in order to satisfy the constraint. - */ - if (prio_less(cookie_pick, class_pick) && - (!max || prio_less(max, class_pick))) return class_pick; + } - return cookie_pick; + /* + * We get here if class_pick was incompatible with max + * and lower prio than max. So we have nothing. + */ + return idle_sched_class.pick_task(rq); } static struct task_struct *