Message-ID: <20210422123308.256677625@infradead.org>
Date: Thu, 22 Apr 2021 14:05:04 +0200
From: Peter Zijlstra
To: joel@joelfernandes.org, chris.hyser@oracle.com, joshdon@google.com,
    mingo@kernel.org, vincent.guittot@linaro.org, valentin.schneider@arm.com,
    mgorman@suse.de
Cc: linux-kernel@vger.kernel.org, peterz@infradead.org, tglx@linutronix.de
Subject: [PATCH 05/19] sched: Core-wide rq->lock
References: <20210422120459.447350175@infradead.org>

Introduce the basic infrastructure to have a core-wide rq->lock.

This relies on the rq->__lock acquisition order being in increasing CPU
number. It is also constrained to SMT8 by lockdep (raw_spin_lock_nested()
only has 8 subclasses) and to SMT256 by preempt_count.

Luckily SMT8 is the maximum SMT count supported by Linux (MIPS, SPARC and
Power are known to have it).

Signed-off-by: Peter Zijlstra (Intel)
---
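A note on the "Magic required such that ..." comment in core.c below: while
__sched_core_flip() is running, rq_lockp(rq) can switch between &rq->__lock
and &rq->core->__lock, so the raw_spin_rq_lock() helpers have to re-check the
lock pointer after acquiring it. A minimal sketch of that pattern, assuming
the kernel/sched/sched.h definitions added by this patch; the function name
is invented here and this is not the exact helper the series carries:

static void raw_spin_rq_lock_sketch(struct rq *rq)
{
        for (;;) {
                raw_spinlock_t *lock = rq_lockp(rq);

                raw_spin_lock(lock);
                if (likely(lock == rq_lockp(rq)))
                        return;                 /* still the agreed-upon lock */
                raw_spin_unlock(lock);          /* flipped under us, retry */
        }
}

The re-check suffices because __sched_core_flip() only changes core_enabled
of an online core while holding every sibling's rq->__lock, so the result of
rq_lockp() cannot change while the lock it currently resolves to is held.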
 kernel/Kconfig.preempt |    6 ++
 kernel/sched/core.c    |  139 +++++++++++++++++++++++++++++++++++++++++++++++
 kernel/sched/sched.h   |   37 +++++++++++++
 3 files changed, 182 insertions(+)

--- a/kernel/Kconfig.preempt
+++ b/kernel/Kconfig.preempt
@@ -99,3 +99,9 @@ config PREEMPT_DYNAMIC
 
          Interesting if you want the same pre-built kernel should be used for
          both Server and Desktop workloads.
+
+config SCHED_CORE
+        bool "Core Scheduling for SMT"
+        default y
+        depends on SCHED_SMT
+
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -84,6 +84,103 @@ unsigned int sysctl_sched_rt_period = 10
 
 __read_mostly int scheduler_running;
 
+#ifdef CONFIG_SCHED_CORE
+
+DEFINE_STATIC_KEY_FALSE(__sched_core_enabled);
+
+/*
+ * Magic required such that:
+ *
+ *      raw_spin_rq_lock(rq);
+ *      ...
+ *      raw_spin_rq_unlock(rq);
+ *
+ * ends up locking and unlocking the _same_ lock, and all CPUs
+ * always agree on what rq has what lock.
+ *
+ * XXX entirely possible to selectively enable cores, don't bother for now.
+ */
+
+static DEFINE_MUTEX(sched_core_mutex);
+static int sched_core_count;
+static struct cpumask sched_core_mask;
+
+static void __sched_core_flip(bool enabled)
+{
+        int cpu, t, i;
+
+        cpus_read_lock();
+
+        /*
+         * Toggle the online cores, one by one.
+         */
+        cpumask_copy(&sched_core_mask, cpu_online_mask);
+        for_each_cpu(cpu, &sched_core_mask) {
+                const struct cpumask *smt_mask = cpu_smt_mask(cpu);
+
+                i = 0;
+                local_irq_disable();
+                for_each_cpu(t, smt_mask) {
+                        /* supports up to SMT8 */
+                        raw_spin_lock_nested(&cpu_rq(t)->__lock, i++);
+                }
+
+                for_each_cpu(t, smt_mask)
+                        cpu_rq(t)->core_enabled = enabled;
+
+                for_each_cpu(t, smt_mask)
+                        raw_spin_unlock(&cpu_rq(t)->__lock);
+                local_irq_enable();
+
+                cpumask_andnot(&sched_core_mask, &sched_core_mask, smt_mask);
+        }
+
+        /*
+         * Toggle the offline CPUs.
+         */
+        cpumask_copy(&sched_core_mask, cpu_possible_mask);
+        cpumask_andnot(&sched_core_mask, &sched_core_mask, cpu_online_mask);
+
+        for_each_cpu(cpu, &sched_core_mask)
+                cpu_rq(cpu)->core_enabled = enabled;
+
+        cpus_read_unlock();
+}
+
+static void __sched_core_enable(void)
+{
+        // XXX verify there are no cookie tasks (yet)
+
+        static_branch_enable(&__sched_core_enabled);
+        __sched_core_flip(true);
+}
+
+static void __sched_core_disable(void)
+{
+        // XXX verify there are no cookie tasks (left)
+
+        __sched_core_flip(false);
+        static_branch_disable(&__sched_core_enabled);
+}
+
+void sched_core_get(void)
+{
+        mutex_lock(&sched_core_mutex);
+        if (!sched_core_count++)
+                __sched_core_enable();
+        mutex_unlock(&sched_core_mutex);
+}
+
+void sched_core_put(void)
+{
+        mutex_lock(&sched_core_mutex);
+        if (!--sched_core_count)
+                __sched_core_disable();
+        mutex_unlock(&sched_core_mutex);
+}
+
+#endif /* CONFIG_SCHED_CORE */
+
 /*
  * part of the period that we allow rt tasks to run in us.
  * default: 0.95s
@@ -5042,6 +5139,40 @@ pick_next_task(struct rq *rq, struct tas
        BUG();
 }
 
+#ifdef CONFIG_SCHED_CORE
+
+static inline void sched_core_cpu_starting(unsigned int cpu)
+{
+        const struct cpumask *smt_mask = cpu_smt_mask(cpu);
+        struct rq *rq, *core_rq = NULL;
+        int i;
+
+        core_rq = cpu_rq(cpu)->core;
+
+        if (!core_rq) {
+                for_each_cpu(i, smt_mask) {
+                        rq = cpu_rq(i);
+                        if (rq->core && rq->core == rq)
+                                core_rq = rq;
+                }
+
+                if (!core_rq)
+                        core_rq = cpu_rq(cpu);
+
+                for_each_cpu(i, smt_mask) {
+                        rq = cpu_rq(i);
+
+                        WARN_ON_ONCE(rq->core && rq->core != core_rq);
+                        rq->core = core_rq;
+                }
+        }
+}
+#else /* !CONFIG_SCHED_CORE */
+
+static inline void sched_core_cpu_starting(unsigned int cpu) {}
+
+#endif /* CONFIG_SCHED_CORE */
+
 /*
  * __schedule() is the main scheduler function.
  *
@@ -8006,6 +8137,7 @@ static void sched_rq_cpu_starting(unsign
 
 int sched_cpu_starting(unsigned int cpu)
 {
+        sched_core_cpu_starting(cpu);
         sched_rq_cpu_starting(cpu);
         sched_tick_start(cpu);
         return 0;
@@ -8290,6 +8424,11 @@ void __init sched_init(void)
 #endif /* CONFIG_SMP */
                 hrtick_rq_init(rq);
                 atomic_set(&rq->nr_iowait, 0);
+
+#ifdef CONFIG_SCHED_CORE
+                rq->core = NULL;
+                rq->core_enabled = 0;
+#endif
         }
 
         set_load_weight(&init_task, false);
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -1075,6 +1075,12 @@ struct rq {
 #endif
         unsigned int            push_busy;
         struct cpu_stop_work    push_work;
+
+#ifdef CONFIG_SCHED_CORE
+        /* per rq */
+        struct rq               *core;
+        unsigned int            core_enabled;
+#endif
 };
 
 #ifdef CONFIG_FAIR_GROUP_SCHED
@@ -1113,6 +1119,35 @@ static inline bool is_migration_disabled
 #endif
 }
 
+#ifdef CONFIG_SCHED_CORE
+
+DECLARE_STATIC_KEY_FALSE(__sched_core_enabled);
+
+static inline bool sched_core_enabled(struct rq *rq)
+{
+        return static_branch_unlikely(&__sched_core_enabled) && rq->core_enabled;
+}
+
+static inline bool sched_core_disabled(void)
+{
+        return !static_branch_unlikely(&__sched_core_enabled);
+}
+
+static inline raw_spinlock_t *rq_lockp(struct rq *rq)
+{
+        if (sched_core_enabled(rq))
+                return &rq->core->__lock;
+
+        return &rq->__lock;
+}
+
+#else /* !CONFIG_SCHED_CORE */
+
+static inline bool sched_core_enabled(struct rq *rq)
+{
+        return false;
+}
+
 static inline bool sched_core_disabled(void)
 {
         return true;
@@ -1123,6 +1158,8 @@ static inline raw_spinlock_t *rq_lockp(s
         return &rq->__lock;
 }
 
+#endif /* CONFIG_SCHED_CORE */
+
 static inline void lockdep_assert_rq_held(struct rq *rq)
 {
         lockdep_assert_held(rq_lockp(rq));
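
A closing illustration of the invariant sched_core_cpu_starting() and
__sched_core_flip() establish: once core scheduling is enabled, every SMT
sibling of a core has rq->core pointing at the same leader rq, so rq_lockp()
resolves to a single lock per core. A hypothetical debug check along those
lines, not taken from the patch (the function name is made up):

static void sched_core_check_lock_sketch(unsigned int cpu)
{
        const struct cpumask *smt_mask = cpu_smt_mask(cpu);
        raw_spinlock_t *lock;
        int t;

        if (!sched_core_enabled(cpu_rq(cpu)))
                return;         /* per-rq locks, nothing to compare */

        lock = rq_lockp(cpu_rq(cpu));
        for_each_cpu(t, smt_mask)
                WARN_ON_ONCE(rq_lockp(cpu_rq(t)) != lock);
}

If this ever fired, two siblings could take different locks and the core-wide
lock would not serialize them, which is what the WARN_ON_ONCE() in
sched_core_cpu_starting() guards against at the rq->core level.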