Date: Wed, 22 Apr 2020 06:11:38 -0700
From: "Paul E.
 McKenney"
To: Peter Zijlstra
Cc: mingo@kernel.org, linux-kernel@vger.kernel.org, tglx@linutronix.de,
	rostedt@goodmis.org, qais.yousef@arm.com, juri.lelli@redhat.com,
	vincent.guittot@linaro.org, dietmar.eggemann@arm.com,
	bsegall@google.com, mgorman@suse.de, airlied@redhat.com,
	alexander.deucher@amd.com, awalls@md.metrocast.net, axboe@kernel.dk,
	broonie@kernel.org, daniel.lezcano@linaro.org,
	gregkh@linuxfoundation.org, hannes@cmpxchg.org,
	herbert@gondor.apana.org.au, hverkuil@xs4all.nl,
	john.stultz@linaro.org, nico@fluxnic.net, rafael.j.wysocki@intel.com,
	rmk+kernel@arm.linux.org.uk, sudeep.holla@arm.com,
	ulf.hansson@linaro.org, wim@linux-watchdog.org
Subject: Re: [PATCH 01/23] sched: Provide sched_set_fifo()
Message-ID: <20200422131138.GL17661@paulmck-ThinkPad-P72>
Reply-To: paulmck@kernel.org
References: <20200422112719.826676174@infradead.org>
	<20200422112831.266499893@infradead.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20200422112831.266499893@infradead.org>
User-Agent: Mutt/1.9.4 (2018-02-28)
Sender: linux-kernel-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, Apr 22, 2020 at 01:27:20PM +0200, Peter Zijlstra wrote:
> SCHED_FIFO (or any static priority scheduler) is a broken scheduler
> model; it is fundamentally incapable of resource management, the one
> thing an OS is actually supposed to do.
>
> It is impossible to compose static priority workloads. One cannot take
> two well designed and functional static priority workloads and mash
> them together and still expect them to work.
>
> Therefore it doesn't make sense to expose the priority field; the
> kernel is fundamentally incapable of setting a sensible value, it
> needs systems knowledge that it doesn't have.
>
> Take away sched_setscheduler() / sched_setattr() from modules and
> replace them with:
>
>   - sched_set_fifo(p); create a FIFO task (at prio 50)
>   - sched_set_fifo_low(p); create a task higher than NORMAL,
>         which ends up being a FIFO task at prio 1.
>   - sched_set_normal(p, nice); (re)set the task to normal
>
> This stops the proliferation of randomly chosen, and irrelevant, FIFO
> priorities that don't really mean anything anyway.
>
> The system administrator/integrator, whoever has insight into the
> actual system design and requirements (userspace) can set-up
> appropriate priorities if and when needed.

The sched_setscheduler_nocheck() calls in rcu_spawn_gp_kthread(),
rcu_cpu_kthread_setup(), and rcu_spawn_one_boost_kthread() all stay as
is because they all use the rcutree.kthread_prio boot parameter, which is
set at boot time by the system administrator (or {who,what}ever), correct?
Or did my email reader eat a patch or two?

							Thanx, Paul

> Cc: airlied@redhat.com
> Cc: alexander.deucher@amd.com
> Cc: awalls@md.metrocast.net
> Cc: axboe@kernel.dk
> Cc: broonie@kernel.org
> Cc: daniel.lezcano@linaro.org
> Cc: gregkh@linuxfoundation.org
> Cc: hannes@cmpxchg.org
> Cc: herbert@gondor.apana.org.au
> Cc: hverkuil@xs4all.nl
> Cc: john.stultz@linaro.org
> Cc: nico@fluxnic.net
> Cc: paulmck@kernel.org
> Cc: rafael.j.wysocki@intel.com
> Cc: rmk+kernel@arm.linux.org.uk
> Cc: sudeep.holla@arm.com
> Cc: tglx@linutronix.de
> Cc: ulf.hansson@linaro.org
> Cc: wim@linux-watchdog.org
> Signed-off-by: Peter Zijlstra (Intel)
> Reviewed-by: Ingo Molnar
> ---
>  include/linux/sched.h |    3 +++
>  kernel/sched/core.c   |   47 +++++++++++++++++++++++++++++++++++++++++++++++
>  2 files changed, 50 insertions(+)
>
> --- a/include/linux/sched.h
> +++ b/include/linux/sched.h
> @@ -1631,6 +1631,9 @@ extern int idle_cpu(int cpu);
>  extern int available_idle_cpu(int cpu);
>  extern int sched_setscheduler(struct task_struct *, int, const struct sched_param *);
>  extern int sched_setscheduler_nocheck(struct task_struct *, int, const struct sched_param *);
> +extern int sched_set_fifo(struct task_struct *p);
> +extern int sched_set_fifo_low(struct task_struct *p);
> +extern int sched_set_normal(struct task_struct *p, int nice);
>  extern int sched_setattr(struct task_struct *, const struct sched_attr *);
>  extern int sched_setattr_nocheck(struct task_struct *, const struct sched_attr *);
>  extern struct task_struct *idle_task(int cpu);
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -5055,6 +5055,8 @@ static int _sched_setscheduler(struct ta
>   * @policy: new policy.
>   * @param: structure containing the new RT priority.
>   *
> + * Use sched_set_fifo(), read its comment.
> + *
>   * Return: 0 on success. An error code otherwise.
>   *
>   * NOTE that the task may be already dead.
> @@ -5097,6 +5099,51 @@ int sched_setscheduler_nocheck(struct ta
>  }
>  EXPORT_SYMBOL_GPL(sched_setscheduler_nocheck);
>
> +/*
> + * SCHED_FIFO is a broken scheduler model; that is, it is fundamentally
> + * incapable of resource management, which is the one thing an OS really should
> + * be doing.
> + *
> + * This is of course the reason it is limited to privileged users only.
> + *
> + * Worse still; it is fundamentally impossible to compose static priority
> + * workloads. You cannot take two correctly working static prio workloads
> + * and smash them together and still expect them to work.
> + *
> + * For this reason 'all' FIFO tasks the kernel creates are basically at:
> + *
> + *   MAX_RT_PRIO / 2
> + *
> + * The administrator _MUST_ configure the system, the kernel simply doesn't
> + * know enough information to make a sensible choice.
> + */
> +int sched_set_fifo(struct task_struct *p)
> +{
> +	struct sched_param sp = { .sched_priority = MAX_RT_PRIO / 2 };
> +	return sched_setscheduler_nocheck(p, SCHED_FIFO, &sp);
> +}
> +EXPORT_SYMBOL_GPL(sched_set_fifo);
> +
> +/*
> + * For when you don't much care about FIFO, but want to be above SCHED_NORMAL.
> + */
> +int sched_set_fifo_low(struct task_struct *p)
> +{
> +	struct sched_param sp = { .sched_priority = 1 };
> +	return sched_setscheduler_nocheck(p, SCHED_FIFO, &sp);
> +}
> +EXPORT_SYMBOL_GPL(sched_set_fifo_low);
> +
> +int sched_set_normal(struct task_struct *p, int nice)
> +{
> +	struct sched_attr attr = {
> +		.sched_policy = SCHED_NORMAL,
> +		.sched_nice = nice,
> +	};
> +	return sched_setattr_nocheck(p, &attr);
> +}
> +EXPORT_SYMBOL_GPL(sched_set_normal);
> +
>  static int
>  do_sched_setscheduler(pid_t pid, int policy, struct sched_param __user *param)
>  {
>
>
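
For anyone reading along without a kernel tree handy, here is a rough
userspace sketch of the three fixed choices the quoted helpers encode.
It is not the patch's code. The names sched_choice, choice_fifo(),
choice_fifo_low(), choice_normal(), apply_choice() and
SKETCH_MAX_RT_PRIO are all made up for illustration, and the value 100
assumes the kernel's MAX_RT_PRIO, which matches the "prio 50" in the
changelog. SCHED_OTHER is the userspace name for the kernel's
SCHED_NORMAL.

```c
#include <assert.h>
#include <errno.h>
#include <sched.h>
#include <sys/resource.h>
#include <sys/types.h>

/* Assumption: mirrors the kernel's MAX_RT_PRIO, so the midpoint is 50. */
#define SKETCH_MAX_RT_PRIO 100

/* Hypothetical type: the (policy, priority, nice) triple a helper fixes. */
struct sched_choice {
	int policy;
	int rt_priority;	/* 0 for non-realtime policies */
	int nice;		/* only meaningful for SCHED_OTHER */
};

/* Counterpart of sched_set_fifo(): FIFO at the midpoint priority. */
struct sched_choice choice_fifo(void)
{
	return (struct sched_choice){
		.policy = SCHED_FIFO,
		.rt_priority = SKETCH_MAX_RT_PRIO / 2,
	};
}

/* Counterpart of sched_set_fifo_low(): just above normal tasks. */
struct sched_choice choice_fifo_low(void)
{
	return (struct sched_choice){
		.policy = SCHED_FIFO,
		.rt_priority = 1,
	};
}

/* Counterpart of sched_set_normal(): normal policy with a nice value. */
struct sched_choice choice_normal(int nice)
{
	assert(nice >= -20 && nice <= 19);
	return (struct sched_choice){
		.policy = SCHED_OTHER,
		.nice = nice,
	};
}

/*
 * Apply a choice to a task from userspace (pid 0 means the caller).
 * Returns 0 on success or a negative errno value on failure.
 */
int apply_choice(pid_t pid, struct sched_choice c)
{
	struct sched_param sp = { .sched_priority = c.rt_priority };

	if (sched_setscheduler(pid, c.policy, &sp) != 0)
		return -errno;
	if (c.policy == SCHED_OTHER &&
	    setpriority(PRIO_PROCESS, (id_t)pid, c.nice) != 0)
		return -errno;
	return 0;
}
```

Calling apply_choice(0, choice_fifo()) would normally need CAP_SYS_NICE
or a suitable RLIMIT_RTPRIO, so expect -EPERM when unprivileged. The
point of the sketch is only that callers pick one of three intents and
never a raw priority number.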