2020-08-27 00:44:20

by Alexander Graf

Subject: [PATCH 0/3] Add HRTICK support to Core Scheduling

CFS supports a feature called "HRTICK" which allows scheduling
decisions to be made independently of the HZ tick. That means
we can achieve much finer-grained time slices and thus distribute
CPU time more fairly across different workloads.

Unfortunately, HRTICK currently does not work with the Core
Scheduling patch set. This patch set adds support for it.
Feel free to squash bits in where it makes sense.


Alex

Alexander Graf (3):
  sched: Allow hrticks to work with core scheduling
  sched: Trigger new hrtick if timer expires too fast
  sched: Use hrticks even with >sched_nr_latency tasks

 kernel/sched/core.c  | 13 +++++++++++++
 kernel/sched/fair.c  | 18 ++++++++++++++++--
 kernel/sched/sched.h |  4 ++++
 3 files changed, 33 insertions(+), 2 deletions(-)

--
2.26.2




Amazon Development Center Germany GmbH
Krausenstr. 38
10117 Berlin
Geschaeftsfuehrung: Christian Schlaeger, Jonathan Weiss
Eingetragen am Amtsgericht Charlottenburg unter HRB 149173 B
Sitz: Berlin
Ust-ID: DE 289 237 879




2020-08-27 00:45:18

by Alexander Graf

Subject: [PATCH 2/3] sched: Trigger new hrtick if timer expires too fast

When an hrtick timer event fires too early, we just bail out and don't
attempt to set a new hrtick timeout. That means that the time slice for
that particular task grows until the next HZ tick occurs. That in turn may
create significant jitter for the task: to bring the queue's overall
vruntime back into balance, it will then not get scheduled for as long as
it executed before.

With this patch, an hrtick timer event that fires too early simply
reprograms the hrtick for the time we originally expected it to fire,
removing that jitter from the system.
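
As a rough illustration of the re-arming idea (a user-space sketch, not the
kernel code): when the timer fires before the task has used up its slice,
the next timeout is simply the part of the slice that is still left. The
function name rearm_delay_ns() and its parameters are hypothetical and only
model the "slice minus time already run" arithmetic.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical model: how long (in ns) until the next hrtick should fire
 * for a task with an ideal slice of slice_ns that has already run for
 * delta_exec_ns. Returns 0 when the slice is used up, i.e. the task
 * should be rescheduled right away instead of re-arming the timer.
 */
static uint64_t rearm_delay_ns(uint64_t slice_ns, uint64_t delta_exec_ns)
{
	assert(slice_ns > 0);

	if (delta_exec_ns >= slice_ns)
		return 0;

	return slice_ns - delta_exec_ns;
}
```

For example, a task with a 3 ms slice whose timer fired after 2.9 ms would
get a new timeout 0.1 ms later rather than running on until the next HZ tick.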

Signed-off-by: Alexander Graf <[email protected]>
---
kernel/sched/fair.c | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 0d4ff3ab2572..66e7aae8b15e 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -99,6 +99,8 @@ static int __init setup_sched_thermal_decay_shift(char *str)
 }
 __setup("sched_thermal_decay_shift=", setup_sched_thermal_decay_shift);

+static void hrtick_update(struct rq *rq);
+
 #ifdef CONFIG_SMP
 /*
  * For asym packing, by default the lower numbered CPU has higher priority.
@@ -4458,8 +4460,10 @@ check_preempt_tick(struct cfs_rq *cfs_rq, struct sched_entity *curr)
 	 * narrow margin doesn't have to wait for a full slice.
 	 * This also mitigates buddy induced latencies under load.
 	 */
-	if (delta_exec < sysctl_sched_min_granularity)
+	if (delta_exec < sysctl_sched_min_granularity) {
+		hrtick_update(rq_of(cfs_rq));
 		return;
+	}

 	se = __pick_first_entity(cfs_rq);
 	delta = curr->vruntime - se->vruntime;
--
2.26.2




2020-08-27 00:46:00

by Alexander Graf

Subject: [PATCH 3/3] sched: Use hrticks even with >sched_nr_latency tasks

When hrticks are enabled, we configure an hrtimer to fire at the exact point
in time when we would like a rescheduling event to occur.

However, the current code disables that logic once the number of currently
running tasks reaches sched_nr_latency. sched_nr_latency describes the point
at which CFS resorts to giving each task sched_min_granularity slices.

These slices may well be shorter than the HZ tick period, though, so we may
still want to use hrticks to ensure that we can actually slice the CPU time
at sched_min_granularity.

This patch changes the logic to keep hrticks enabled if sched_min_granularity
is smaller than the period the HZ tick would allow us to account with. That way
systems with HZ=1000 will usually resort to the HZ tick while systems at lower
HZ values will keep using hrticks.
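
To make the comparison concrete, here is a small user-space sketch of the
arithmetic (not kernel code). The helper names tick_period_ns() and
hrtick_still_useful() are made up for this example, and the 3 ms granularity
used in the note below is only an illustrative value, not something taken
from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ULL

/* Length of one HZ tick in nanoseconds, e.g. 4 ms for HZ=250. */
static uint64_t tick_period_ns(unsigned int hz)
{
	assert(hz > 0);
	return NSEC_PER_SEC / hz;
}

/*
 * Hypothetical helper: hrticks are still worth arming when the minimum
 * granularity is finer than what the periodic tick alone can resolve.
 */
static bool hrtick_still_useful(uint64_t min_granularity_ns, unsigned int hz)
{
	return min_granularity_ns < tick_period_ns(hz);
}
```

With an assumed granularity of 3 ms, HZ=1000 (1 ms tick) can account for the
slices by itself, while HZ=250 (4 ms tick) and HZ=100 (10 ms tick) cannot.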

Signed-off-by: Alexander Graf <[email protected]>
---
kernel/sched/fair.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 66e7aae8b15e..0092bba52edf 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -5502,7 +5502,8 @@ static void hrtick_update(struct rq *rq)
 	if (!hrtick_enabled(rq) || curr->sched_class != &fair_sched_class)
 		return;

-	if (cfs_rq_of(&curr->se)->nr_running < sched_nr_latency)
+	if ((cfs_rq_of(&curr->se)->nr_running < sched_nr_latency) ||
+	    (sysctl_sched_min_granularity < (NSEC_PER_SEC / HZ)))
 		hrtick_start_fair(rq, curr);
 }
 #else /* !CONFIG_SCHED_HRTICK */
--
2.26.2




2020-08-27 00:47:27

by Alexander Graf

Subject: [PATCH 1/3] sched: Allow hrticks to work with core scheduling

The core scheduling logic bypasses the scheduling class's
pick_next_task(), which usually starts the hrtick logic. Instead,
it explicitly calls set_next_task() or leaves the current task
running without any callback into the CFS scheduler.

To ensure that we still configure the hrtick timer properly when we
know which task we want to run, let's add an explicit callback to
the scheduler class which can then be triggered from the core's
pick_next_task().

With this patch, core scheduling with HRTICK enabled does see
improved responsiveness on scheduling decisions.
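
The optional-callback pattern can be sketched in user space as follows.
This is a simplified model, not the kernel structures: the types and names
carrying a _model suffix, and the after_pick() helper, are invented for
illustration. A class that does not care about hrticks leaves the pointer
NULL and is skipped.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct rq: counts how often we armed a timer. */
struct rq_model {
	int hrtick_armed;
};

/* Hypothetical stand-in for struct sched_class with an optional hook. */
struct sched_class_model {
	void (*hrtick_update)(struct rq_model *rq);	/* may be NULL */
};

static void hrtick_update_fair_model(struct rq_model *rq)
{
	rq->hrtick_armed++;
}

static const struct sched_class_model fair_class_model = {
	.hrtick_update = hrtick_update_fair_model,
};

/* A class without hrtick support simply leaves the hook unset. */
static const struct sched_class_model rt_class_model = {
	.hrtick_update = NULL,
};

/* Called once the next task's class is known: invoke the hook if present. */
static void after_pick(struct rq_model *rq,
		       const struct sched_class_model *class)
{
	assert(rq && class);

	if (class->hrtick_update)
		class->hrtick_update(rq);
}
```

Checking the pointer at the call site keeps the other scheduling classes
untouched, at the cost of one extra branch per pick.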

Signed-off-by: Alexander Graf <[email protected]>
---
kernel/sched/core.c | 13 +++++++++++++
kernel/sched/fair.c | 9 +++++++++
kernel/sched/sched.h | 4 ++++
3 files changed, 26 insertions(+)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 0362102fa3d2..72bf837422bf 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -4486,6 +4486,12 @@ pick_next_task(struct rq *rq, struct task_struct *prev, struct rq_flags *rf)
 			set_next_task(rq, next);
 		}

+#ifdef CONFIG_SCHED_HRTICK
+		/* Trigger next hrtick after task selection */
+		if (next->sched_class->hrtick_update)
+			next->sched_class->hrtick_update(rq);
+#endif
+
 		trace_printk("pick pre selected (%u %u %u): %s/%d %lx\n",
 			     rq->core->core_task_seq,
 			     rq->core->core_pick_seq,
@@ -4667,6 +4673,13 @@ next_class:;

 done:
 	set_next_task(rq, next);
+
+#ifdef CONFIG_SCHED_HRTICK
+	/* Trigger next hrtick after task selection */
+	if (next->sched_class->hrtick_update)
+		next->sched_class->hrtick_update(rq);
+#endif
+
 	return next;
 }

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 435b460d3c3f..0d4ff3ab2572 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -5512,6 +5512,11 @@ static inline void hrtick_update(struct rq *rq)
 }
 #endif

+static void hrtick_update_fair(struct rq *rq)
+{
+	hrtick_update(rq);
+}
+
 #ifdef CONFIG_SMP
 static inline unsigned long cpu_util(int cpu);

@@ -11391,6 +11396,10 @@ const struct sched_class fair_sched_class = {
 #ifdef CONFIG_UCLAMP_TASK
 	.uclamp_enabled = 1,
 #endif
+
+#ifdef CONFIG_SCHED_HRTICK
+	.hrtick_update = hrtick_update_fair,
+#endif
 };

 #ifdef CONFIG_SCHED_DEBUG
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index 6445943d3215..b382e0ee0c87 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -1942,6 +1942,10 @@ struct sched_class {
 #ifdef CONFIG_FAIR_GROUP_SCHED
 	void (*task_change_group)(struct task_struct *p, int type);
 #endif
+
+#ifdef CONFIG_SCHED_HRTICK
+	void (*hrtick_update)(struct rq *rq);
+#endif
 };

 static inline void put_prev_task(struct rq *rq, struct task_struct *prev)
--
2.26.2



