2012-08-14 20:51:05

by Trevor Brandt

Subject: [PATCH] sched: Support compiling out real-time scheduling with REALTIME_SCHED.

Adds support for compiling out the real-time scheduler (SCHED_FIFO
and SCHED_RR) to save space. Changes sched_set_stop_task to use
SCHED_NORMAL rather than SCHED_FIFO, since the kernel only uses this
function as a fake scheduling priority for userspace to read to avoid
exposing the stop class to userspace. Bloat-o-meter gives a space
savings of 1877 bytes with REALTIME_SCHED turned off.

Userspace works fine with REALTIME_SCHED turned off. Processes
attempting to set a real-time scheduling policy get EINVAL, exactly
the response that the sched_setscheduler manpage documents you will
get if the "scheduling policy is not one of the recognized policies."

Signed-off-by: Trevor Brandt <[email protected]>
Reviewed-by: Josh Triplett <[email protected]>
---
 init/Kconfig             |    8 ++++++++
 kernel/sched/Makefile    |    3 ++-
 kernel/sched/core.c      |   14 ++++++--------
 kernel/sched/rt.c        |    2 --
 kernel/sched/sched.h     |   46 ++++++++++++++++++++++++++++++++++++++++++++++
 kernel/sched/stop_task.c |    4 ++++
 6 files changed, 66 insertions(+), 11 deletions(-)

diff --git a/init/Kconfig b/init/Kconfig
index 3f42cd6..768dc76 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -27,6 +27,13 @@ config IRQ_WORK
bool
depends on HAVE_IRQ_WORK

+config REALTIME_SCHED
+ bool "Realtime Scheduler" if EXPERT
+ default y
+ help
+ This option enables support for the realtime scheduler and the
+ corresponding scheduling classes SCHED_FIFO and SCHED_RR.
+
menu "General setup"

config EXPERIMENTAL
@@ -753,6 +760,7 @@ config CFS_BANDWIDTH

config RT_GROUP_SCHED
bool "Group scheduling for SCHED_RR/FIFO"
+ depends on REALTIME_SCHED
depends on EXPERIMENTAL
depends on CGROUP_SCHED
default n
diff --git a/kernel/sched/Makefile b/kernel/sched/Makefile
index 9a7dd35..a9bee25 100644
--- a/kernel/sched/Makefile
+++ b/kernel/sched/Makefile
@@ -11,7 +11,8 @@ ifneq ($(CONFIG_SCHED_OMIT_FRAME_POINTER),y)
CFLAGS_core.o := $(PROFILING) -fno-omit-frame-pointer
endif

-obj-y += core.o clock.o idle_task.o fair.o rt.o stop_task.o
+obj-y += core.o clock.o idle_task.o fair.o stop_task.o
+obj-$(CONFIG_REALTIME_SCHED) += rt.o
obj-$(CONFIG_SMP) += cpupri.o
obj-$(CONFIG_SCHED_AUTOGROUP) += auto_group.o
obj-$(CONFIG_SCHEDSTATS) += stats.o
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 478a04c..3579286 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -85,6 +85,8 @@
#define CREATE_TRACE_POINTS
#include <trace/events/sched.h>

+struct rt_bandwidth def_rt_bandwidth;
+
void start_bandwidth_timer(struct hrtimer *period_timer, ktime_t period)
{
unsigned long delta;
@@ -959,19 +961,15 @@ static int irqtime_account_si_update(void)

void sched_set_stop_task(int cpu, struct task_struct *stop)
{
- struct sched_param param = { .sched_priority = MAX_RT_PRIO - 1 };
+ struct sched_param param = { .sched_priority = 0 };
struct task_struct *old_stop = cpu_rq(cpu)->stop;

if (stop) {
/*
- * Make it appear like a SCHED_FIFO task, its something
+ * Make it appear like a SCHED_NORMAL task, its something
* userspace knows about and won't get confused about.
- *
- * Also, it will make PI more or less work without too
- * much confusion -- but then, stop work should not
- * rely on PI working anyway.
*/
- sched_setscheduler_nocheck(stop, SCHED_FIFO, &param);
+ sched_setscheduler_nocheck(stop, SCHED_NORMAL, &param);

stop->sched_class = &stop_sched_class;
}
@@ -983,7 +981,7 @@ void sched_set_stop_task(int cpu, struct task_struct *stop)
* Reset it back to a normal scheduling class so that
* it can die in pieces.
*/
- old_stop->sched_class = &rt_sched_class;
+ old_stop->sched_class = &fair_sched_class;
}
}

diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c
index f42ae7f..4100803 100644
--- a/kernel/sched/rt.c
+++ b/kernel/sched/rt.c
@@ -9,8 +9,6 @@

static int do_sched_rt_period_timer(struct rt_bandwidth *rt_b, int overrun);

-struct rt_bandwidth def_rt_bandwidth;
-
static enum hrtimer_restart sched_rt_period_timer(struct hrtimer *timer)
{
struct rt_bandwidth *rt_b =
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index 98c0c26..ed41cb7 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -47,6 +47,8 @@ extern __read_mostly int scheduler_running;
*/
#define RUNTIME_INF ((u64)~0ULL)

+#ifdef CONFIG_REALTIME_SCHED
+
static inline int rt_policy(int policy)
{
if (policy == SCHED_FIFO || policy == SCHED_RR)
@@ -59,6 +61,19 @@ static inline int task_has_rt_policy(struct task_struct *p)
return rt_policy(p->policy);
}

+#else /* CONFIG_REALTIME_SCHED */
+
+static inline int rt_policy(int policy)
+{
+ return 0;
+}
+static inline int task_has_rt_policy(struct task_struct *p)
+{
+ return 0;
+}
+
+#endif /* CONFIG_REALTIME_SCHED */
+
/*
* This is the priority-queue data structure of the RT scheduling class:
*/
@@ -190,8 +205,17 @@ extern void __refill_cfs_bandwidth_runtime(struct cfs_bandwidth *cfs_b);
extern void __start_cfs_bandwidth(struct cfs_bandwidth *cfs_b);
extern void unthrottle_cfs_rq(struct cfs_rq *cfs_rq);

+#ifdef CONFIG_REALTIME_SCHED
extern void free_rt_sched_group(struct task_group *tg);
extern int alloc_rt_sched_group(struct task_group *tg, struct task_group *parent);
+#else
+static inline void free_rt_sched_group(struct task_group *tg) { }
+static inline int alloc_rt_sched_group(struct task_group *tg,
+ struct task_group *parent)
+{
+ return 1;
+}
+#endif
extern void init_tg_rt_entry(struct task_group *tg, struct rt_rq *rt_rq,
struct sched_rt_entity *rt_se, int cpu,
struct sched_rt_entity *parent);
@@ -852,7 +876,11 @@ enum cpuacct_stat_index {
for (class = sched_class_highest; class; class = class->next)

extern const struct sched_class stop_sched_class;
+#ifdef CONFIG_REALTIME_SCHED
extern const struct sched_class rt_sched_class;
+#else
+static const struct sched_class rt_sched_class = { };
+#endif
extern const struct sched_class fair_sched_class;
extern const struct sched_class idle_sched_class;

@@ -874,15 +902,29 @@ extern void sysrq_sched_debug_show(void);
extern void sched_init_granularity(void);
extern void update_max_interval(void);
extern void update_group_power(struct sched_domain *sd, int cpu);
+#ifdef CONFIG_REALTIME_SCHED
extern int update_runtime(struct notifier_block *nfb, unsigned long action, void *hcpu);
extern void init_sched_rt_class(void);
+#else
+static inline int update_runtime(struct notifier_block *nfb,
+ unsigned long action, void *hcpu)
+{
+ return 0;
+}
+static inline void init_sched_rt_class(void) { }
+#endif
extern void init_sched_fair_class(void);

extern void resched_task(struct task_struct *p);
extern void resched_cpu(int cpu);

extern struct rt_bandwidth def_rt_bandwidth;
+#ifdef CONFIG_REALTIME_SCHED
extern void init_rt_bandwidth(struct rt_bandwidth *rt_b, u64 period, u64 runtime);
+#else
+static inline void init_rt_bandwidth(struct rt_bandwidth *rt_b,
+ u64 period, u64 runtime) { }
+#endif

extern void update_cpu_load(struct rq *this_rq);

@@ -1150,7 +1192,11 @@ extern void print_cfs_stats(struct seq_file *m, int cpu);
extern void print_rt_stats(struct seq_file *m, int cpu);

extern void init_cfs_rq(struct cfs_rq *cfs_rq);
+#ifdef CONFIG_REALTIME_SCHED
extern void init_rt_rq(struct rt_rq *rt_rq, struct rq *rq);
+#else
+static inline void init_rt_rq(struct rt_rq *rt_rq, struct rq *rq) { }
+#endif
extern void unthrottle_offline_cfs_rqs(struct rq *rq);

extern void account_cfs_bandwidth_used(int enabled, int was_enabled);
diff --git a/kernel/sched/stop_task.c b/kernel/sched/stop_task.c
index 7b386e8..5c29766 100644
--- a/kernel/sched/stop_task.c
+++ b/kernel/sched/stop_task.c
@@ -83,7 +83,11 @@ get_rr_interval_stop(struct rq *rq, struct task_struct *task)
* Simple, special scheduling class for the per-CPU stop tasks:
*/
const struct sched_class stop_sched_class = {
+#ifdef CONFIG_REALTIME_SCHED
.next = &rt_sched_class,
+#else
+ .next = &fair_sched_class,
+#endif

.enqueue_task = enqueue_task_stop,
.dequeue_task = dequeue_task_stop,
--
1.7.9.5
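
The userspace behaviour described in the changelog (EINVAL when a real-time policy is requested) could be checked with a small probe along the following lines. This is a minimal sketch, not part of the patch: the helper names try_policy(), verdict() and report_policy() are made up for illustration, and the EPERM and success cases are what one would expect on a kernel that still has the real-time classes built in.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <sched.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: ask the kernel to switch the calling process to
 * `policy` at priority `prio`. Returns 0 on success, otherwise the errno
 * value left behind by the failed sched_setscheduler() call.
 */
static int try_policy(int policy, int prio)
{
	struct sched_param param;

	memset(&param, 0, sizeof(param));
	param.sched_priority = prio;

	if (sched_setscheduler(0, policy, &param) == -1)
		return errno;
	return 0;
}

/* Hypothetical helper: turn the result of try_policy() into a short verdict. */
static const char *verdict(int err)
{
	switch (err) {
	case 0:
		return "accepted";
	case EPERM:
		return "policy recognized, caller lacks privilege";
	case EINVAL:
		return "rejected as invalid";
	default:
		return "unexpected error";
	}
}

/* Probe one real-time policy and print what the kernel said. */
static void report_policy(const char *name, int policy)
{
	/* 1 is the lowest valid real-time priority on Linux. */
	int err = try_policy(policy, 1);

	printf("%s: %s\n", name, verdict(err));

	/* If the switch succeeded, drop back to the default policy. */
	if (err == 0)
		try_policy(SCHED_OTHER, 0);
}
```

Calling report_policy("SCHED_FIFO", SCHED_FIFO) and report_policy("SCHED_RR", SCHED_RR) from main() should print "rejected as invalid" for both on a kernel built with REALTIME_SCHED turned off, if the changelog's description holds.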


2012-08-15 07:12:31

by Mike Galbraith

Subject: Re: [PATCH] sched: Support compiling out real-time scheduling with REALTIME_SCHED.

On Tue, 2012-08-14 at 13:50 -0700, Trevor Brandt wrote:

> diff --git a/init/Kconfig b/init/Kconfig
> index 3f42cd6..768dc76 100644
> --- a/init/Kconfig
> +++ b/init/Kconfig
> @@ -27,6 +27,13 @@ config IRQ_WORK
> bool
> depends on HAVE_IRQ_WORK
>
> +config REALTIME_SCHED
> + bool "Realtime Scheduler" if EXPERT
> + default y
> + help
> + This option enables support for the realtime scheduler and the
> + corresponding scheduling classes SCHED_FIFO and SCHED_RR.
> +
> menu "General setup"
>
> config EXPERIMENTAL

If you inverted that, it could be a proper default n new feature [1].

However, if weight loss is the goal, why not go whole hog, and create
sched/thin.c containing no lard... or just integrate an existing thin
scheduler as a config option? Whole body replacement is a very radical
diet, but somehow seems less so than chopping off fingers and toes.

-Mike

(that SMP could select to greatly simplify RT)

2012-08-15 15:10:31

by Josh Triplett

Subject: Re: [PATCH] sched: Support compiling out real-time scheduling with REALTIME_SCHED.

On Wed, Aug 15, 2012 at 09:12:20AM +0200, Mike Galbraith wrote:
> On Tue, 2012-08-14 at 13:50 -0700, Trevor Brandt wrote:
> > diff --git a/init/Kconfig b/init/Kconfig
> > index 3f42cd6..768dc76 100644
> > --- a/init/Kconfig
> > +++ b/init/Kconfig
> > @@ -27,6 +27,13 @@ config IRQ_WORK
> > bool
> > depends on HAVE_IRQ_WORK
> >
> > +config REALTIME_SCHED
> > + bool "Realtime Scheduler" if EXPERT
> > + default y
> > + help
> > + This option enables support for the realtime scheduler and the
> > + corresponding scheduling classes SCHED_FIFO and SCHED_RR.
> > +
> > menu "General setup"
> >
> > config EXPERIMENTAL
>
> If you inverted that, it could be a proper default n new feature [1].

Huh. You mean, DISABLE_REALTIME_SCHED? How would that help?
DISABLE_REALTIME_SCHED=n seems like an unnecessary double negative, and
I see very little precedent for that in Kconfig options.

> (that SMP could select to greatly simplify RT)

I hope this isn't a serious suggestion. :) In any case, that doesn't
seem like something that should happen in *this* patch, if it should
happen at all.

> However, if weight loss is the goal, why not go whole hog, and create
> sched/thin.c containing no lard... or just integrate an existing thin
> scheduler as a config option?

Historically, the response to configurable/modular/selectable schedulers
has been entirely negative, with most responses amounting to "we should
fix the scheduler we have to work for all workloads", which doesn't seem
like an unreasonable response to me.

The kernel also has a *large* number of dependencies on the workings of
the fair scheduler, and as this patch shows, far fewer on the real-time
scheduler.

Given both of the above, writing and integrating an entirely new
scheduler (*and* dealing with the repeats of old flamewars that would
ensue after posting it) seems a bit much to ask for a student project.
:)

- Josh Triplett

2012-08-15 15:15:06

by Peter Zijlstra

Subject: Re: [PATCH] sched: Support compiling out real-time scheduling with REALTIME_SCHED.

On Tue, 2012-08-14 at 13:50 -0700, Trevor Brandt wrote:
> Adds support for compiling out the real-time scheduler (SCHED_FIFO
> and SCHED_RR) to save space. Changes sched_set_stop_task to use
> SCHED_NORMAL rather than SCHED_FIFO, since the kernel only uses this
> function as a fake scheduling priority for userspace to read to avoid
> exposing the stop class to userspace. Bloat-o-meter gives a space
> savings of 1877 bytes with REALTIME_SCHED turned off.
>
> Userspace works fine with REALTIME_SCHED turned off. Processes
> attempting to set a real-time scheduling policy get EINVAL, exactly
> the response that the sched_setscheduler manpage documents you will
> get if the "scheduling policy is not one of the recognized policies."

Are you in the same group that wanted to make SCHED_FAIR optional as
well?

> Signed-off-by: Trevor Brandt <[email protected]>
> Reviewed-by: Josh Triplett <[email protected]>

> diff --git a/kernel/sched/Makefile b/kernel/sched/Makefile
> index 9a7dd35..a9bee25 100644
> --- a/kernel/sched/Makefile
> +++ b/kernel/sched/Makefile
> @@ -11,7 +11,8 @@ ifneq ($(CONFIG_SCHED_OMIT_FRAME_POINTER),y)
> CFLAGS_core.o := $(PROFILING) -fno-omit-frame-pointer
> endif
>
> -obj-y += core.o clock.o idle_task.o fair.o rt.o stop_task.o
> +obj-y += core.o clock.o idle_task.o fair.o stop_task.o
> +obj-$(CONFIG_REALTIME_SCHED) += rt.o
> obj-$(CONFIG_SMP) += cpupri.o

This wants extra magic: cpupri is only used for rt, so cutting that
will of course increase your savings.

> obj-$(CONFIG_SCHED_AUTOGROUP) += auto_group.o
> obj-$(CONFIG_SCHEDSTATS) += stats.o

Other than that I'm not entirely happy with the growing #ifdef maze.
Granted this new one isn't nearly as bad as some of the existing ones,
but I do wish someone would clean up some of that.
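
For readers unfamiliar with the pattern being discussed, here is a standalone sketch of the stub approach the patch takes in sched.h: the #ifdef lives in one place next to the declaration, and callers stay free of conditionals. All DEMO_* and demo_* names are hypothetical stand-ins (DEMO_REALTIME_SCHED plays the role of CONFIG_REALTIME_SCHED); this is an illustration of the technique, not kernel code.

```c
#include <stdio.h>

/* Hypothetical policy numbers, used only by this sketch. */
#define DEMO_SCHED_NORMAL 0
#define DEMO_SCHED_FIFO   1
#define DEMO_SCHED_RR     2

/*
 * Build with -DDEMO_REALTIME_SCHED to get the real test; without it the
 * stub below is used and the compiler can discard any code that depends
 * on a policy being real-time.
 */
#ifdef DEMO_REALTIME_SCHED

static inline int demo_rt_policy(int policy)
{
	return policy == DEMO_SCHED_FIFO || policy == DEMO_SCHED_RR;
}

#else /* !DEMO_REALTIME_SCHED */

static inline int demo_rt_policy(int policy)
{
	(void)policy;	/* unused in the stub */
	return 0;
}

#endif /* DEMO_REALTIME_SCHED */

/* A caller that needs no #ifdef of its own. */
static const char *demo_describe(int policy)
{
	return demo_rt_policy(policy) ? "real-time" : "normal";
}

/* Print how each demo policy is classified in this build. */
static void demo_dump(void)
{
	printf("NORMAL -> %s\n", demo_describe(DEMO_SCHED_NORMAL));
	printf("FIFO   -> %s\n", demo_describe(DEMO_SCHED_FIFO));
	printf("RR     -> %s\n", demo_describe(DEMO_SCHED_RR));
}
```

The cost is the one raised above: every function that gets a stub adds another #ifdef/#else/#endif block to the header, even though the call sites stay clean.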

2012-08-15 19:59:12

by Steven Rostedt

Subject: Re: [PATCH] sched: Support compiling out real-time scheduling with REALTIME_SCHED.

On Wed, 2012-08-15 at 17:14 +0200, Peter Zijlstra wrote:

> Other than that I'm not entirely happy with the growing #ifdef maze.
> Granted this new one isn't nearly as bad as some of the existing ones,
> but I do wish someone would clean up some of that.

To get rid of the #ifdefs, just make the RT scheduler into a module that
gets loaded for those that want it.

/me runs!

-- Steve

2012-08-16 04:30:42

by Mike Galbraith

Subject: Re: [PATCH] sched: Support compiling out real-time scheduling with REALTIME_SCHED.

On Wed, 2012-08-15 at 08:10 -0700, Josh Triplett wrote:
> On Wed, Aug 15, 2012 at 09:12:20AM +0200, Mike Galbraith wrote:
> > On Tue, 2012-08-14 at 13:50 -0700, Trevor Brandt wrote:
> > > diff --git a/init/Kconfig b/init/Kconfig
> > > index 3f42cd6..768dc76 100644
> > > --- a/init/Kconfig
> > > +++ b/init/Kconfig
> > > @@ -27,6 +27,13 @@ config IRQ_WORK
> > > bool
> > > depends on HAVE_IRQ_WORK
> > >
> > > +config REALTIME_SCHED
> > > + bool "Realtime Scheduler" if EXPERT
> > > + default y
> > > + help
> > > + This option enables support for the realtime scheduler and the
> > > + corresponding scheduling classes SCHED_FIFO and SCHED_RR.
> > > +
> > > menu "General setup"
> > >
> > > config EXPERIMENTAL
> >
> > If you inverted that, it could be a proper default n new feature [1].
>
> Huh. You mean, DISABLE_REALTIME_SCHED? How would that help?
> DISABLE_REALTIME_SCHED=n seems like an unnecessary double negative, and
> I see very little precedent for that in Kconfig options.

No, it doesn't change anything.

> > (that SMP could select to greatly simplify RT)
>
> I hope this isn't a serious suggestion. :) In any case, that doesn't
> seem like something that should happen in *this* patch, if it should
> happen at all.

Slightly deformed funny-bone.

> > However, if weight loss is the goal, why not go whole hog, and create
> > sched/thin.c containing no lard... or just integrate an existing thin
> > scheduler as a config option?
>
> Historically, the response to configurable/modular/selectable schedulers
> has been entirely negative, with most responses amounting to "we should
> fix the scheduler we have to work for all workloads", which doesn't seem
> like an unreasonable response to me.
>
> The kernel also has a *large* number of dependencies on the workings of
> the fair scheduler, and as this patch shows, far fewer on the real-time
> scheduler.
>
> Given both of the above, writing and integrating an entirely new
> scheduler (*and* dealing with the repeats of old flamewars that would
> ensue after posting it) seems a bit much to ask for a student project.
> :)

I think you could make something more generally useful to size extra
dinky boxen by doing that regardless. But yeah, the bar for inclusion
might be a _tad_ high ;-)

Maintainers certainly wouldn't find it lovely, but there is some
utility. I can imagine a single array version of the O(1) scheduler
saving lots of space. If you made it single queue like BFS (or for that
matter maybe just used BFS out of the box, dunno) you'd get rid of the
load balancing code as well, so would probably have a small footprint
scheduler without having to axe standard classes that may well be needed
in even size extra dinky boxen.

-Mike
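
To make the "single array" idea concrete, here is a rough sketch of what such a runqueue could look like: one array of FIFO queues indexed by priority, plus a bitmap so that picking the next task is a find-first-set rather than a scan. This is an illustration under stated assumptions, not code from the O(1) scheduler or from BFS. The thin_* names and THIN_NR_PRIO are invented here, there are no timeslices, no locking and no SMP handling, and __builtin_ctz is a GCC/Clang builtin.

```c
#include <stddef.h>
#include <stdint.h>

#define THIN_NR_PRIO 32	/* hypothetical number of priority levels */

struct thin_task {
	int prio;			/* 0 is the highest priority */
	struct thin_task *next;		/* link within its priority queue */
};

struct thin_rq {
	uint32_t bitmap;			/* bit p set <=> queue p non-empty */
	struct thin_task *head[THIN_NR_PRIO];
	struct thin_task *tail[THIN_NR_PRIO];
};

/* Append a task to the tail of its priority queue: constant time. */
static void thin_enqueue(struct thin_rq *rq, struct thin_task *t)
{
	t->next = NULL;
	if (rq->tail[t->prio])
		rq->tail[t->prio]->next = t;
	else
		rq->head[t->prio] = t;
	rq->tail[t->prio] = t;
	rq->bitmap |= UINT32_C(1) << t->prio;
}

/*
 * Remove and return the first task of the highest-priority non-empty
 * queue, or NULL if nothing is runnable: also constant time.
 */
static struct thin_task *thin_pick_next(struct thin_rq *rq)
{
	struct thin_task *t;
	int prio;

	if (!rq->bitmap)
		return NULL;

	prio = __builtin_ctz(rq->bitmap);	/* lowest set bit */
	t = rq->head[prio];
	rq->head[prio] = t->next;
	if (!rq->head[prio]) {
		rq->tail[prio] = NULL;
		rq->bitmap &= ~(UINT32_C(1) << prio);
	}
	return t;
}
```

A zero-initialized struct thin_rq is an empty runqueue. Tasks of equal priority come back in the order they were enqueued, and with a single queue shared by all CPUs there would be nothing to load-balance, which is the space saving suggested above.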