Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 8DC01C38142 for ; Sat, 28 Jan 2023 00:18:18 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232710AbjA1ASR (ORCPT ); Fri, 27 Jan 2023 19:18:17 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:36802 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232925AbjA1ARj (ORCPT ); Fri, 27 Jan 2023 19:17:39 -0500 Received: from mail-pj1-x1029.google.com (mail-pj1-x1029.google.com [IPv6:2607:f8b0:4864:20::1029]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 9D1468CE10; Fri, 27 Jan 2023 16:17:14 -0800 (PST) Received: by mail-pj1-x1029.google.com with SMTP id x2-20020a17090a46c200b002295ca9855aso10221895pjg.2; Fri, 27 Jan 2023 16:17:14 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:sender:from:to:cc:subject:date :message-id:reply-to; bh=x4vCYz2beF9ROB0r0Gl728SjZFvQ7n/iXYiikxbP3n0=; b=KMhhq72veYUOFqhd6X9qf2fdXBypP80XYIye0ym74+A1efyLRprDjbFCGLRTe3a7cD xVC2XedgeNnW+6aSCq2G/WYSQoBvy1R3rmZXeqlH5xhFvgopOwkz1p2sk3HuTyeidPHk Dc9Z+h02bNBmCTsIDRpx2zSDuIxZyd45dIM5eZSk2jHQqGqpMaJBS7SY4bAKW1aogMkj 3x9qXHfMO0uD48aA5JjjHjZZ8Ytzyfh+J8bAn/XD1HgUsEAYOQ+NUueJl2Mb1yUmmtgG 08ZGj1x8FBVaXEME1WCsOXDelo6qzusu0n4JVjvMmgjoQeiP5NpnNfT9TrIa3Y7QRjj8 iJ6A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:sender:x-gm-message-state:from :to:cc:subject:date:message-id:reply-to; bh=x4vCYz2beF9ROB0r0Gl728SjZFvQ7n/iXYiikxbP3n0=; b=JjYphUW9xY04v6VqdU4lZS2OIyg5FORJoBhpHSwycbvMGgvZVtMtXbfpk1dGp7EOAR LvXnOVzkaKCtUhgqF+W5gKVl3zXsCoj/uo77ugqMR1oUQgxii1/SSJkKB+wPNSbWEqCj XPiCJboRWn3M6qUAweJADkqN9D7GyPxXkuQs3+Ex4J/aizNZCTKg1XMqlgup4gJ7MvXl qkyPGCVntLLJmFvVf1lhJob+yyXIoRP1F3Kj3r9eS4ccTTDHgLosvN4OllRakMcSgOGv 3F9Lg5L7SfAd6jdmlo0K2lT2wSMHEs28BYztqdyhhkNFbjdWJSFnDmA8zP/EsnGjR/jU 4vRw== X-Gm-Message-State: AO0yUKX24ZmlfKoEk8PqUQjDLGV0GShY8mONDQmfaQzkjeqTKrtUXbwa bxoeb0+wAqgXuhm0fUeVe2c= X-Google-Smtp-Source: AK7set+1KkQ49qWAPCdlYzUfto+yptNFQcWCZcuL3237VUu+irbxCAkxlpR2j4y7/KqSGarYPg1rMA== X-Received: by 2002:a17:902:c406:b0:196:e8e:cd28 with SMTP id k6-20020a170902c40600b001960e8ecd28mr21121116plk.15.1674865033944; Fri, 27 Jan 2023 16:17:13 -0800 (PST) Received: from localhost (2603-800c-1a02-1bae-a7fa-157f-969a-4cde.res6.spectrum.com. [2603:800c:1a02:1bae:a7fa:157f:969a:4cde]) by smtp.gmail.com with ESMTPSA id r20-20020a170902be1400b00192a8b35fa3sm3413297pls.122.2023.01.27.16.17.13 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 27 Jan 2023 16:17:13 -0800 (PST) Sender: Tejun Heo From: Tejun Heo To: torvalds@linux-foundation.org, mingo@redhat.com, peterz@infradead.org, juri.lelli@redhat.com, vincent.guittot@linaro.org, dietmar.eggemann@arm.com, rostedt@goodmis.org, bsegall@google.com, mgorman@suse.de, bristot@redhat.com, vschneid@redhat.com, ast@kernel.org, daniel@iogearbox.net, andrii@kernel.org, martin.lau@kernel.org, joshdon@google.com, brho@google.com, pjt@google.com, derkling@google.com, haoluo@google.com, dvernet@meta.com, dschatzberg@meta.com, dskarlat@cs.cmu.edu, riel@surriel.com Cc: linux-kernel@vger.kernel.org, bpf@vger.kernel.org, kernel-team@meta.com, Tejun Heo Subject: [PATCH 12/30] sched_ext: Add boilerplate for extensible scheduler class Date: Fri, 27 Jan 2023 14:16:21 -1000 Message-Id: <20230128001639.3510083-13-tj@kernel.org> X-Mailer: git-send-email 2.39.1 In-Reply-To: <20230128001639.3510083-1-tj@kernel.org> References: <20230128001639.3510083-1-tj@kernel.org> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org This adds dummy implementations of sched_ext interfaces which interact with the scheduler core and hook them in the correct places. As they're all dummies, this doesn't cause any behavior changes. This is split out to help reviewing. v2: balance_scx_on_up() dropped. This will be handled in sched_ext proper. Signed-off-by: Tejun Heo Reviewed-by: David Vernet Acked-by: Josh Don Acked-by: Hao Luo Acked-by: Barret Rhoden --- include/linux/sched/ext.h | 12 ++++++++++++ kernel/fork.c | 2 ++ kernel/sched/core.c | 32 ++++++++++++++++++++++++-------- kernel/sched/ext.h | 24 ++++++++++++++++++++++++ kernel/sched/idle.c | 2 ++ kernel/sched/sched.h | 2 ++ 6 files changed, 66 insertions(+), 8 deletions(-) create mode 100644 include/linux/sched/ext.h create mode 100644 kernel/sched/ext.h diff --git a/include/linux/sched/ext.h b/include/linux/sched/ext.h new file mode 100644 index 000000000000..a05dfcf533b0 --- /dev/null +++ b/include/linux/sched/ext.h @@ -0,0 +1,12 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _LINUX_SCHED_EXT_H +#define _LINUX_SCHED_EXT_H + +#ifdef CONFIG_SCHED_CLASS_EXT +#error "NOT IMPLEMENTED YET" +#else /* !CONFIG_SCHED_CLASS_EXT */ + +static inline void sched_ext_free(struct task_struct *p) {} + +#endif /* CONFIG_SCHED_CLASS_EXT */ +#endif /* _LINUX_SCHED_EXT_H */ diff --git a/kernel/fork.c b/kernel/fork.c index 184c622b5513..f5aad7fa6350 100644 --- a/kernel/fork.c +++ b/kernel/fork.c @@ -23,6 +23,7 @@ #include #include #include +#include #include #include #include @@ -846,6 +847,7 @@ void __put_task_struct(struct task_struct *tsk) WARN_ON(refcount_read(&tsk->usage)); WARN_ON(tsk == current); + sched_ext_free(tsk); io_uring_free(tsk); cgroup_free(tsk); task_numa_free(tsk, true); diff --git a/kernel/sched/core.c b/kernel/sched/core.c index 6447f10ecd44..6058042dcc3d 100644 --- a/kernel/sched/core.c +++ b/kernel/sched/core.c @@ -4623,6 +4623,8 @@ late_initcall(sched_core_sysctl_init); */ int sched_fork(unsigned long clone_flags, struct task_struct *p) { + int ret; + __sched_fork(clone_flags, p); /* * We mark the process as NEW here. This guarantees that @@ -4659,12 +4661,16 @@ int sched_fork(unsigned long clone_flags, struct task_struct *p) p->sched_reset_on_fork = 0; } - if (dl_prio(p->prio)) - return -EAGAIN; - else if (rt_prio(p->prio)) + scx_pre_fork(p); + + if (dl_prio(p->prio)) { + ret = -EAGAIN; + goto out_cancel; + } else if (rt_prio(p->prio)) { p->sched_class = &rt_sched_class; - else + } else { p->sched_class = &fair_sched_class; + } init_entity_runnable_average(&p->se); @@ -4682,6 +4688,10 @@ int sched_fork(unsigned long clone_flags, struct task_struct *p) RB_CLEAR_NODE(&p->pushable_dl_tasks); #endif return 0; + +out_cancel: + scx_cancel_fork(p); + return ret; } int sched_cgroup_fork(struct task_struct *p, struct kernel_clone_args *kargs) @@ -4712,16 +4722,18 @@ int sched_cgroup_fork(struct task_struct *p, struct kernel_clone_args *kargs) p->sched_class->task_fork(p); raw_spin_unlock_irqrestore(&p->pi_lock, flags); - return 0; + return scx_fork(p); } void sched_cancel_fork(struct task_struct *p) { + scx_cancel_fork(p); } void sched_post_fork(struct task_struct *p) { uclamp_post_fork(p); + scx_post_fork(p); } unsigned long to_ratio(u64 period, u64 runtime) @@ -5868,7 +5880,7 @@ static void put_prev_task_balance(struct rq *rq, struct task_struct *prev, * We can terminate the balance pass as soon as we know there is * a runnable task of @class priority or higher. */ - for_class_range(class, prev->sched_class, &idle_sched_class) { + for_balance_class_range(class, prev->sched_class, &idle_sched_class) { if (class->balance(rq, prev, rf)) break; } @@ -5886,6 +5898,9 @@ __pick_next_task(struct rq *rq, struct task_struct *prev, struct rq_flags *rf) const struct sched_class *class; struct task_struct *p; + if (scx_enabled()) + goto restart; + /* * Optimization: we know that if all tasks are in the fair class we can * call that function directly, but only if the @prev task wasn't of a @@ -5911,7 +5926,7 @@ __pick_next_task(struct rq *rq, struct task_struct *prev, struct rq_flags *rf) restart: put_prev_task_balance(rq, prev, rf); - for_each_class(class) { + for_each_active_class(class) { p = class->pick_next_task(rq); if (p) return p; @@ -5944,7 +5959,7 @@ static inline struct task_struct *pick_task(struct rq *rq) const struct sched_class *class; struct task_struct *p; - for_each_class(class) { + for_each_active_class(class) { p = class->pick_task(rq); if (p) return p; @@ -9880,6 +9895,7 @@ void __init sched_init(void) balance_push_set(smp_processor_id(), false); #endif init_sched_fair_class(); + init_sched_ext_class(); psi_init(); diff --git a/kernel/sched/ext.h b/kernel/sched/ext.h new file mode 100644 index 000000000000..6a93c4825339 --- /dev/null +++ b/kernel/sched/ext.h @@ -0,0 +1,24 @@ +/* SPDX-License-Identifier: GPL-2.0 */ + +#ifdef CONFIG_SCHED_CLASS_EXT +#error "NOT IMPLEMENTED YET" +#else /* CONFIG_SCHED_CLASS_EXT */ + +#define scx_enabled() false + +static inline void scx_pre_fork(struct task_struct *p) {} +static inline int scx_fork(struct task_struct *p) { return 0; } +static inline void scx_post_fork(struct task_struct *p) {} +static inline void scx_cancel_fork(struct task_struct *p) {} +static inline void init_sched_ext_class(void) {} + +#define for_each_active_class for_each_class +#define for_balance_class_range for_class_range + +#endif /* CONFIG_SCHED_CLASS_EXT */ + +#if defined(CONFIG_SCHED_CLASS_EXT) && defined(CONFIG_SMP) +#error "NOT IMPLEMENTED YET" +#else +static inline void scx_update_idle(struct rq *rq, bool idle) {} +#endif diff --git a/kernel/sched/idle.c b/kernel/sched/idle.c index f26ab2675f7d..86bc5832bdc4 100644 --- a/kernel/sched/idle.c +++ b/kernel/sched/idle.c @@ -428,11 +428,13 @@ static void check_preempt_curr_idle(struct rq *rq, struct task_struct *p, int fl static void put_prev_task_idle(struct rq *rq, struct task_struct *prev) { + scx_update_idle(rq, false); } static void set_next_task_idle(struct rq *rq, struct task_struct *next, bool first) { update_idle_core(rq); + scx_update_idle(rq, true); schedstat_inc(rq->sched_goidle); } diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h index 91b6fed6aa93..d57d17d8ea6b 100644 --- a/kernel/sched/sched.h +++ b/kernel/sched/sched.h @@ -3366,4 +3366,6 @@ enum cpu_cftype_id { extern struct cftype cpu_cftypes[CPU_CFTYPE_CNT + 1]; #endif /* CONFIG_CGROUP_SCHED */ +#include "ext.h" + #endif /* _KERNEL_SCHED_SCHED_H */ -- 2.39.1