2020-02-14 16:41:50

by Qais Yousef

Subject: [PATCH 0/3] RT Capacity Awareness Improvements

Pavan has pointed out an oversight in the first implementation: we don't fall
back to another unfitting CPU if we are already running on an unfitting one.

This also stirred a discussion around handling downmigration from a fitting
to an unfitting CPU.

https://lore.kernel.org/lkml/[email protected]/

Patch 1 adds the missing fallback for when a fitting CPU isn't found, as
reported by Pavan.

Patch 2 allows downmigration in the pull case and marks the CPU as overloaded,
as suggested by Steve.

Patch 3 fixes the condition in select_task_rq_rt() for the case where the
new_cpu and the task priorities are the *same*.

The series is based on Linus/master v5.6-rc1.

I ran the following test cases, the results of which can be found in [1];
a minimal reproducer sketch follows the list. Each test was run 3 times to
demonstrate the repeatability of the results.

The tests were run on Juno-r2, which has 6 CPUs. CPUs [0, 3-5] are littles
and CPUs [1-2] are bigs.

By default RT tasks are boosted to max capacity, so no work was done to
modify that, i.e. rt_task->uclamp_min = 1024 for all running tests.
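
For reference, below is a minimal userspace sketch of how a task could override
that default boost via sched_setattr(). It is illustrative only and was not used
for these tests; the clamp value is arbitrary and struct sched_attr is hand-rolled
to match the v5.3+ uapi layout.

/*
 * Illustrative sketch: lower the calling task's uclamp_min so it is no
 * longer boosted to max capacity. Requires CONFIG_UCLAMP_TASK.
 */
#define _GNU_SOURCE
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/syscall.h>

struct sched_attr {
        uint32_t size;
        uint32_t sched_policy;
        uint64_t sched_flags;
        int32_t  sched_nice;
        uint32_t sched_priority;
        uint64_t sched_runtime;
        uint64_t sched_deadline;
        uint64_t sched_period;
        uint32_t sched_util_min;
        uint32_t sched_util_max;
};

#define SCHED_FLAG_KEEP_POLICY          0x08
#define SCHED_FLAG_KEEP_PARAMS          0x10
#define SCHED_FLAG_UTIL_CLAMP_MIN       0x20

int main(void)
{
        struct sched_attr attr;

        memset(&attr, 0, sizeof(attr));
        attr.size = sizeof(attr);
        /* Keep the current policy/params, only change the min clamp */
        attr.sched_flags = SCHED_FLAG_KEEP_POLICY |
                           SCHED_FLAG_KEEP_PARAMS |
                           SCHED_FLAG_UTIL_CLAMP_MIN;
        attr.sched_util_min = 300;      /* arbitrary, little-sized value */

        if (syscall(SYS_sched_setattr, 0, &attr, 0))
                perror("sched_setattr");

        return 0;
}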

1. 6 RT Tasks
* 2 Tasks always end up on the big cores
* The rest end up on the little cores with migration among them
happening (it didn't before)
2. 2 RT Tasks
* We can see they always end up on the 2 big cores
3. 4 RT Tasks
* Results similar to 1
4. 4 RT Tasks affined to little cores
* The tasks run on the little cores with some migration as
expected

[1] https://gist.github.com/qais-yousef/bb99bdd912628489408a5edae33f85e1
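
Each case boils down to spawning N busy SCHED_FIFO tasks, optionally affined to
the little CPUs, and watching their placement. A minimal C sketch along those
lines (not the actual test setup; the task count, priority and runtime below
are arbitrary, the little CPU list matches Juno-r2):

#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <unistd.h>
#include <sys/wait.h>

#define NR_TASKS        4
#define RUN_SECS        10

static void busy_rt_task(int affine_little)
{
        struct sched_param sp = { .sched_priority = 10 };
        int littles[] = { 0, 3, 4, 5 };         /* Juno-r2 little CPUs */
        time_t end;

        if (affine_little) {
                cpu_set_t set;
                int i;

                CPU_ZERO(&set);
                for (i = 0; i < 4; i++)
                        CPU_SET(littles[i], &set);
                if (sched_setaffinity(0, sizeof(set), &set))
                        perror("sched_setaffinity");
        }

        if (sched_setscheduler(0, SCHED_FIFO, &sp))
                perror("sched_setscheduler");

        /* Spin for RUN_SECS; default RT throttling keeps ~5% idle time */
        end = time(NULL) + RUN_SECS;
        while (time(NULL) < end)
                ;
        exit(0);
}

int main(int argc, char **argv)
{
        int i;

        (void)argv;
        for (i = 0; i < NR_TASKS; i++)
                if (!fork())
                        busy_rt_task(argc > 1); /* any argument: affine to littles */

        while (wait(NULL) > 0)
                ;
        return 0;
}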

Qais Yousef (3):
sched/rt: cpupri_find: implement fallback mechanism for !fit case
sched/rt: allow pulling unfitting task
sched/rt: fix pushing unfit tasks to a better CPU

kernel/sched/cpupri.c | 157 +++++++++++++++++++++++++++---------------
kernel/sched/rt.c | 50 ++++++++++----
2 files changed, 139 insertions(+), 68 deletions(-)

--
2.17.1


2020-02-14 16:41:58

by Qais Yousef

Subject: [PATCH 1/3] sched/rt: cpupri_find: implement fallback mechanism for !fit case

When searching for the best lowest_mask with a fitness_fn passed, make
sure we record the lowest_level that returns a valid lowest_mask so that
we can use that as a fallback in case we fail to find a fitting CPU at
all levels.

The intention in the original patch was not to allow a down migration to an
unfitting CPU. But this missed the case where we are already running on an
unfitting one.

With this change, RT tasks can still move between unfitting CPUs when
they're already running on such a CPU.

And as Steve suggested, to adhere to the strict priority rules of RT: if
a task is already running on a fitting CPU but can't run on it due to
priority, allow it to downmigrate to an unfitting CPU so it can run.

Reported-by: Pavan Kondeti <[email protected]>
Signed-off-by: Qais Yousef <[email protected]>
---
kernel/sched/cpupri.c | 157 +++++++++++++++++++++++++++---------------
1 file changed, 101 insertions(+), 56 deletions(-)

diff --git a/kernel/sched/cpupri.c b/kernel/sched/cpupri.c
index 1a2719e1350a..1bcfa1995550 100644
--- a/kernel/sched/cpupri.c
+++ b/kernel/sched/cpupri.c
@@ -41,6 +41,59 @@ static int convert_prio(int prio)
return cpupri;
}

+static inline int __cpupri_find(struct cpupri *cp, struct task_struct *p,
+ struct cpumask *lowest_mask, int idx)
+{
+ struct cpupri_vec *vec = &cp->pri_to_cpu[idx];
+ int skip = 0;
+
+ if (!atomic_read(&(vec)->count))
+ skip = 1;
+ /*
+ * When looking at the vector, we need to read the counter,
+ * do a memory barrier, then read the mask.
+ *
+ * Note: This is still all racey, but we can deal with it.
+ * Ideally, we only want to look at masks that are set.
+ *
+ * If a mask is not set, then the only thing wrong is that we
+ * did a little more work than necessary.
+ *
+ * If we read a zero count but the mask is set, because of the
+ * memory barriers, that can only happen when the highest prio
+ * task for a run queue has left the run queue, in which case,
+ * it will be followed by a pull. If the task we are processing
+ * fails to find a proper place to go, that pull request will
+ * pull this task if the run queue is running at a lower
+ * priority.
+ */
+ smp_rmb();
+
+ /* Need to do the rmb for every iteration */
+ if (skip)
+ return 0;
+
+ if (cpumask_any_and(p->cpus_ptr, vec->mask) >= nr_cpu_ids)
+ return 0;
+
+ if (lowest_mask) {
+ cpumask_and(lowest_mask, p->cpus_ptr, vec->mask);
+
+ /*
+ * We have to ensure that we have at least one bit
+ * still set in the array, since the map could have
+ * been concurrently emptied between the first and
+ * second reads of vec->mask. If we hit this
+ * condition, simply act as though we never hit this
+ * priority level and continue on.
+ */
+ if (cpumask_empty(lowest_mask))
+ return 0;
+ }
+
+ return 1;
+}
+
/**
* cpupri_find - find the best (lowest-pri) CPU in the system
* @cp: The cpupri context
@@ -62,80 +115,72 @@ int cpupri_find(struct cpupri *cp, struct task_struct *p,
struct cpumask *lowest_mask,
bool (*fitness_fn)(struct task_struct *p, int cpu))
{
- int idx = 0;
int task_pri = convert_prio(p->prio);
+ int best_unfit_idx = -1;
+ int idx = 0, cpu;

BUG_ON(task_pri >= CPUPRI_NR_PRIORITIES);

for (idx = 0; idx < task_pri; idx++) {
- struct cpupri_vec *vec = &cp->pri_to_cpu[idx];
- int skip = 0;

- if (!atomic_read(&(vec)->count))
- skip = 1;
- /*
- * When looking at the vector, we need to read the counter,
- * do a memory barrier, then read the mask.
- *
- * Note: This is still all racey, but we can deal with it.
- * Ideally, we only want to look at masks that are set.
- *
- * If a mask is not set, then the only thing wrong is that we
- * did a little more work than necessary.
- *
- * If we read a zero count but the mask is set, because of the
- * memory barriers, that can only happen when the highest prio
- * task for a run queue has left the run queue, in which case,
- * it will be followed by a pull. If the task we are processing
- * fails to find a proper place to go, that pull request will
- * pull this task if the run queue is running at a lower
- * priority.
- */
- smp_rmb();
-
- /* Need to do the rmb for every iteration */
- if (skip)
- continue;
-
- if (cpumask_any_and(p->cpus_ptr, vec->mask) >= nr_cpu_ids)
+ if (!__cpupri_find(cp, p, lowest_mask, idx))
continue;

- if (lowest_mask) {
- int cpu;
+ if (!lowest_mask || !fitness_fn)
+ return 1;

- cpumask_and(lowest_mask, p->cpus_ptr, vec->mask);
+ /* Ensure the capacity of the CPUs fit the task */
+ for_each_cpu(cpu, lowest_mask) {
+ if (!fitness_fn(p, cpu))
+ cpumask_clear_cpu(cpu, lowest_mask);
+ }

+ /*
+ * If no CPU at the current priority can fit the task
+ * continue looking
+ */
+ if (cpumask_empty(lowest_mask)) {
/*
- * We have to ensure that we have at least one bit
- * still set in the array, since the map could have
- * been concurrently emptied between the first and
- * second reads of vec->mask. If we hit this
- * condition, simply act as though we never hit this
- * priority level and continue on.
+ * Store our fallback priority in case we
+ * didn't find a fitting CPU
*/
- if (cpumask_empty(lowest_mask))
- continue;
+ if (best_unfit_idx == -1)
+ best_unfit_idx = idx;

- if (!fitness_fn)
- return 1;
-
- /* Ensure the capacity of the CPUs fit the task */
- for_each_cpu(cpu, lowest_mask) {
- if (!fitness_fn(p, cpu))
- cpumask_clear_cpu(cpu, lowest_mask);
- }
-
- /*
- * If no CPU at the current priority can fit the task
- * continue looking
- */
- if (cpumask_empty(lowest_mask))
- continue;
+ continue;
}

return 1;
}

+ /*
+ * If we failed to find a fitting lowest_mask, make sure we fall back
+ * to the last known unfitting lowest_mask.
+ *
+ * Note that the map of the recorded idx might have changed since then,
+ * so we must ensure to do the full dance to make sure that level still
+ * holds a valid lowest_mask.
+ *
+ * As per above, the map could have been concurrently emptied while we
+ * were busy searching for a fitting lowest_mask at the other priority
+ * levels.
+ *
+ * This rule favours honouring priority over fitting the task in the
+ * correct CPU (Capacity Awareness being the only user now).
+ * The idea is that if a higher priority task can run, then it should
+ * run even if this ends up being on unfitting CPU.
+ *
+ * The cost of this trade-off is not entirely clear and will probably
+ * be good for some workloads and bad for others.
+ *
+ * The main idea here is that if some CPUs were overcommitted, we try
+ * to spread which is what the scheduler traditionally did. Sys admins
+ * must do proper RT planning to avoid overloading the system if they
+ * really care.
+ */
+ if (best_unfit_idx != -1)
+ return __cpupri_find(cp, p, lowest_mask, best_unfit_idx);
+
return 0;
}

--
2.17.1

2020-02-17 17:10:15

by Valentin Schneider

Subject: Re: [PATCH 1/3] sched/rt: cpupri_find: implement fallback mechanism for !fit case

On 2/14/20 4:39 PM, Qais Yousef wrote:
> When searching for the best lowest_mask with a fitness_fn passed, make
> sure we record the lowest_level that returns a valid lowest_mask so that
> we can use that as a fallback in case we fail to find a fitting CPU at
> all levels.
>
> The intention in the original patch was not to allow a down migration to an
> unfitting CPU. But this missed the case where we are already running on an
> unfitting one.
>
> With this change, RT tasks can still move between unfitting CPUs when
> they're already running on such a CPU.
>
> And as Steve suggested, to adhere to the strict priority rules of RT: if
> a task is already running on a fitting CPU but can't run on it due to
> priority, allow it to downmigrate to an unfitting CPU so it can run.
>
> Reported-by: Pavan Kondeti <[email protected]>
> Signed-off-by: Qais Yousef <[email protected]>
> ---
> kernel/sched/cpupri.c | 157 +++++++++++++++++++++++++++---------------
> 1 file changed, 101 insertions(+), 56 deletions(-)
>
> diff --git a/kernel/sched/cpupri.c b/kernel/sched/cpupri.c
> index 1a2719e1350a..1bcfa1995550 100644
> --- a/kernel/sched/cpupri.c
> +++ b/kernel/sched/cpupri.c
> @@ -41,6 +41,59 @@ static int convert_prio(int prio)
> return cpupri;
> }
>
> +static inline int __cpupri_find(struct cpupri *cp, struct task_struct *p,
> + struct cpumask *lowest_mask, int idx)
> +{
> + struct cpupri_vec *vec = &cp->pri_to_cpu[idx];
> + int skip = 0;
> +
> + if (!atomic_read(&(vec)->count))
> + skip = 1;
> + /*
> + * When looking at the vector, we need to read the counter,
> + * do a memory barrier, then read the mask.
> + *
> + * Note: This is still all racey, but we can deal with it.
> + * Ideally, we only want to look at masks that are set.
> + *
> + * If a mask is not set, then the only thing wrong is that we
> + * did a little more work than necessary.
> + *
> + * If we read a zero count but the mask is set, because of the
> + * memory barriers, that can only happen when the highest prio
> + * task for a run queue has left the run queue, in which case,
> + * it will be followed by a pull. If the task we are processing
> + * fails to find a proper place to go, that pull request will
> + * pull this task if the run queue is running at a lower
> + * priority.
> + */
> + smp_rmb();
> +
> + /* Need to do the rmb for every iteration */
> + if (skip)
> + return 0;
> +
> + if (cpumask_any_and(p->cpus_ptr, vec->mask) >= nr_cpu_ids)
> + return 0;
> +
> + if (lowest_mask) {
> + cpumask_and(lowest_mask, p->cpus_ptr, vec->mask);
> +
> + /*
> + * We have to ensure that we have at least one bit
> + * still set in the array, since the map could have
> + * been concurrently emptied between the first and
> + * second reads of vec->mask. If we hit this
> + * condition, simply act as though we never hit this
> + * priority level and continue on.
> + */
> + if (cpumask_empty(lowest_mask))
> + return 0;
> + }
> +
> + return 1;
> +}
> +

Just a drive-by comment; could you split that code move into its own patch?
It'd make the history a bit easier to read IMO.

2020-02-17 19:20:58

by Dietmar Eggemann

Subject: Re: [PATCH 1/3] sched/rt: cpupri_find: implement fallback mechanism for !fit case

On 14/02/2020 17:39, Qais Yousef wrote:

[...]

> /**
> * cpupri_find - find the best (lowest-pri) CPU in the system
> * @cp: The cpupri context
> @@ -62,80 +115,72 @@ int cpupri_find(struct cpupri *cp, struct task_struct *p,
> struct cpumask *lowest_mask,
> bool (*fitness_fn)(struct task_struct *p, int cpu))
> {
> - int idx = 0;
> int task_pri = convert_prio(p->prio);
> + int best_unfit_idx = -1;
> + int idx = 0, cpu;
>
> BUG_ON(task_pri >= CPUPRI_NR_PRIORITIES);
>
> for (idx = 0; idx < task_pri; idx++) {
> - struct cpupri_vec *vec = &cp->pri_to_cpu[idx];
> - int skip = 0;
>
> - if (!atomic_read(&(vec)->count))
> - skip = 1;
> - /*
> - * When looking at the vector, we need to read the counter,
> - * do a memory barrier, then read the mask.
> - *
> - * Note: This is still all racey, but we can deal with it.
> - * Ideally, we only want to look at masks that are set.
> - *
> - * If a mask is not set, then the only thing wrong is that we
> - * did a little more work than necessary.
> - *
> - * If we read a zero count but the mask is set, because of the
> - * memory barriers, that can only happen when the highest prio
> - * task for a run queue has left the run queue, in which case,
> - * it will be followed by a pull. If the task we are processing
> - * fails to find a proper place to go, that pull request will
> - * pull this task if the run queue is running at a lower
> - * priority.
> - */
> - smp_rmb();
> -
> - /* Need to do the rmb for every iteration */
> - if (skip)
> - continue;
> -
> - if (cpumask_any_and(p->cpus_ptr, vec->mask) >= nr_cpu_ids)
> + if (!__cpupri_find(cp, p, lowest_mask, idx))
> continue;
>
> - if (lowest_mask) {
> - int cpu;

Shouldn't we add an extra condition here?

+ if (!static_branch_unlikely(&sched_asym_cpucapacity))
+ return 1;
+

Otherwise non-heterogeneous systems have to go through this
for_each_cpu(cpu, lowest_mask) further below for no good reason.

> + if (!lowest_mask || !fitness_fn)
> + return 1;
>
> - cpumask_and(lowest_mask, p->cpus_ptr, vec->mask);
> + /* Ensure the capacity of the CPUs fit the task */
> + for_each_cpu(cpu, lowest_mask) {
> + if (!fitness_fn(p, cpu))
> + cpumask_clear_cpu(cpu, lowest_mask);
> + }

[...]

2020-02-17 23:34:42

by Qais Yousef

Subject: Re: [PATCH 1/3] sched/rt: cpupri_find: implement fallback mechanism for !fit case

On 02/17/20 17:07, Valentin Schneider wrote:
> Just a drive-by comment; could you split that code move into its own patch?
> It'd make the history a bit easier to read IMO.

Hmm I don't see how it would help the history.

In large series with big churn splitting helps to facilitate review, but
I don't think reviewing this patch is hard because of creating the new
function.

And git-blame will have this patch all over the new function, so people who
care to know the reason of the split will land at the right place directly
without any extra level of indirection.

Thanks

--
Qais Yousef

2020-02-17 23:46:17

by Qais Yousef

Subject: Re: [PATCH 1/3] sched/rt: cpupri_find: implement fallback mechanism for !fit case

On 02/17/20 20:09, Dietmar Eggemann wrote:
> On 14/02/2020 17:39, Qais Yousef wrote:
>
> [...]
>
> > /**
> > * cpupri_find - find the best (lowest-pri) CPU in the system
> > * @cp: The cpupri context
> > @@ -62,80 +115,72 @@ int cpupri_find(struct cpupri *cp, struct task_struct *p,
> > struct cpumask *lowest_mask,
> > bool (*fitness_fn)(struct task_struct *p, int cpu))
> > {
> > - int idx = 0;
> > int task_pri = convert_prio(p->prio);
> > + int best_unfit_idx = -1;
> > + int idx = 0, cpu;
> >
> > BUG_ON(task_pri >= CPUPRI_NR_PRIORITIES);
> >
> > for (idx = 0; idx < task_pri; idx++) {
> > - struct cpupri_vec *vec = &cp->pri_to_cpu[idx];
> > - int skip = 0;
> >
> > - if (!atomic_read(&(vec)->count))
> > - skip = 1;
> > - /*
> > - * When looking at the vector, we need to read the counter,
> > - * do a memory barrier, then read the mask.
> > - *
> > - * Note: This is still all racey, but we can deal with it.
> > - * Ideally, we only want to look at masks that are set.
> > - *
> > - * If a mask is not set, then the only thing wrong is that we
> > - * did a little more work than necessary.
> > - *
> > - * If we read a zero count but the mask is set, because of the
> > - * memory barriers, that can only happen when the highest prio
> > - * task for a run queue has left the run queue, in which case,
> > - * it will be followed by a pull. If the task we are processing
> > - * fails to find a proper place to go, that pull request will
> > - * pull this task if the run queue is running at a lower
> > - * priority.
> > - */
> > - smp_rmb();
> > -
> > - /* Need to do the rmb for every iteration */
> > - if (skip)
> > - continue;
> > -
> > - if (cpumask_any_and(p->cpus_ptr, vec->mask) >= nr_cpu_ids)
> > + if (!__cpupri_find(cp, p, lowest_mask, idx))
> > continue;
> >
> > - if (lowest_mask) {
> > - int cpu;
>
> Shouldn't we add an extra condition here?
>
> + if (!static_branch_unlikely(&sched_asym_cpucapacity))
> + return 1;
> +
>
> Otherwise non-heterogeneous systems have to go through this
> for_each_cpu(cpu, lowest_mask) further below for no good reason.

Hmm below is the best solution I can think of at the moment. Works for you?

It's independent of what this patch tries to fix, so I'll add it as a separate
patch to the series in the next update.

Thanks

--
Qais Yousef

---

diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c
index 5ea235f2cfe8..5f2eaf3affde 100644
--- a/kernel/sched/rt.c
+++ b/kernel/sched/rt.c
@@ -14,6 +14,8 @@ static int do_sched_rt_period_timer(struct rt_bandwidth *rt_b, int overrun);

struct rt_bandwidth def_rt_bandwidth;

+typedef bool (*fitness_fn_t)(struct task_struct *p, int cpu);
+
static enum hrtimer_restart sched_rt_period_timer(struct hrtimer *timer)
{
struct rt_bandwidth *rt_b =
@@ -1708,6 +1710,7 @@ static int find_lowest_rq(struct task_struct *task)
struct cpumask *lowest_mask = this_cpu_cpumask_var_ptr(local_cpu_mask);
int this_cpu = smp_processor_id();
int cpu = task_cpu(task);
+ fitness_fn_t fitness_fn;

/* Make sure the mask is initialized first */
if (unlikely(!lowest_mask))
@@ -1716,8 +1719,17 @@ static int find_lowest_rq(struct task_struct *task)
if (task->nr_cpus_allowed == 1)
return -1; /* No other targets possible */

+ /*
+ * Help cpupri_find avoid the cost of looking for a fitting CPU when
+ * not really needed.
+ */
+ if (static_branch_unlikely(&sched_asym_cpucapacity))
+ fitness_fn = rt_task_fits_capacity;
+ else
+ fitness_fn = NULL;
+
if (!cpupri_find(&task_rq(task)->rd->cpupri, task, lowest_mask,
- rt_task_fits_capacity))
+ fitness_fn))
return -1; /* No targets found */

/*


>
> > + if (!lowest_mask || !fitness_fn)
> > + return 1;
> >
> > - cpumask_and(lowest_mask, p->cpus_ptr, vec->mask);
> > + /* Ensure the capacity of the CPUs fit the task */
> > + for_each_cpu(cpu, lowest_mask) {
> > + if (!fitness_fn(p, cpu))
> > + cpumask_clear_cpu(cpu, lowest_mask);
> > + }
>
> [...]

2020-02-18 09:54:53

by Dietmar Eggemann

Subject: Re: [PATCH 1/3] sched/rt: cpupri_find: implement fallback mechanism for !fit case

On 18/02/2020 00:45, Qais Yousef wrote:
> On 02/17/20 20:09, Dietmar Eggemann wrote:
>> On 14/02/2020 17:39, Qais Yousef wrote:
>>
>> [...]
>>
>>> /**
>>> * cpupri_find - find the best (lowest-pri) CPU in the system
>>> * @cp: The cpupri context
>>> @@ -62,80 +115,72 @@ int cpupri_find(struct cpupri *cp, struct task_struct *p,
>>> struct cpumask *lowest_mask,
>>> bool (*fitness_fn)(struct task_struct *p, int cpu))
>>> {
>>> - int idx = 0;
>>> int task_pri = convert_prio(p->prio);
>>> + int best_unfit_idx = -1;
>>> + int idx = 0, cpu;
>>>
>>> BUG_ON(task_pri >= CPUPRI_NR_PRIORITIES);
>>>
>>> for (idx = 0; idx < task_pri; idx++) {
>>> - struct cpupri_vec *vec = &cp->pri_to_cpu[idx];
>>> - int skip = 0;
>>>
>>> - if (!atomic_read(&(vec)->count))
>>> - skip = 1;
>>> - /*
>>> - * When looking at the vector, we need to read the counter,
>>> - * do a memory barrier, then read the mask.
>>> - *
>>> - * Note: This is still all racey, but we can deal with it.
>>> - * Ideally, we only want to look at masks that are set.
>>> - *
>>> - * If a mask is not set, then the only thing wrong is that we
>>> - * did a little more work than necessary.
>>> - *
>>> - * If we read a zero count but the mask is set, because of the
>>> - * memory barriers, that can only happen when the highest prio
>>> - * task for a run queue has left the run queue, in which case,
>>> - * it will be followed by a pull. If the task we are processing
>>> - * fails to find a proper place to go, that pull request will
>>> - * pull this task if the run queue is running at a lower
>>> - * priority.
>>> - */
>>> - smp_rmb();
>>> -
>>> - /* Need to do the rmb for every iteration */
>>> - if (skip)
>>> - continue;
>>> -
>>> - if (cpumask_any_and(p->cpus_ptr, vec->mask) >= nr_cpu_ids)
>>> + if (!__cpupri_find(cp, p, lowest_mask, idx))
>>> continue;
>>>
>>> - if (lowest_mask) {
>>> - int cpu;
>>
>> Shouldn't we add an extra condition here?
>>
>> + if (!static_branch_unlikely(&sched_asym_cpucapacity))
>> + return 1;
>> +
>>
>> Otherwise non-heterogeneous systems have to go through this
>> for_each_cpu(cpu, lowest_mask) further below for no good reason.
>
> Hmm below is the best solution I can think of at the moment. Works for you?
>
> It's independent of what this patch tries to fix, so I'll add it as a separate
> patch to the series in the next update.

OK.

Since we can't set it as early as init_sched_rt_class()

root@juno:~# dmesg | grep "\*\*\*"
[ 0.501697] *** set sched_asym_cpucapacity <-- CPU cap asym by uArch
[ 0.505847] *** init_sched_rt_class()
[ 1.796706] *** set sched_asym_cpucapacity <-- CPUfreq kicked in

we probably have to do it either by bailing out of cpupri_find() early
with this extra condition (above) or by initializing the func pointer
dynamically (your example).

[...]

> @@ -1708,6 +1710,7 @@ static int find_lowest_rq(struct task_struct *task)
> struct cpumask *lowest_mask = this_cpu_cpumask_var_ptr(local_cpu_mask);
> int this_cpu = smp_processor_id();
> int cpu = task_cpu(task);
> + fitness_fn_t fitness_fn;
>
> /* Make sure the mask is initialized first */
> if (unlikely(!lowest_mask))
> @@ -1716,8 +1719,17 @@ static int find_lowest_rq(struct task_struct *task)
> if (task->nr_cpus_allowed == 1)
> return -1; /* No other targets possible */
>
> + /*
> + * Help cpupri_find avoid the cost of looking for a fitting CPU when
> + * not really needed.
> + */

In case the comment is really needed, for me it would work better
logically inverted.

[...]

2020-02-18 10:02:39

by Valentin Schneider

Subject: Re: [PATCH 1/3] sched/rt: cpupri_find: implement fallback mechanism for !fit case



On 2/17/20 11:34 PM, Qais Yousef wrote:
> On 02/17/20 17:07, Valentin Schneider wrote:
>> Just a drive-by comment; could you split that code move into its own patch?
>> It'd make the history a bit easier to read IMO.
>
> Hmm I don't see how it would help the history.
>
> In large series with big churn splitting helps to facilitate review, but
> I don't think reviewing this patch is hard because of creating the new
> function.
>
> And git-blame will have this patch all over the new function, so people who
> care to know the reason of the split will land at the right place directly
> without any extra level of indirection.
>

As a git-blame addict I yearn for nice clean splits, and this is a code move
plus a new behaviour; having them in the same patch doesn't make for the
prettiest diff there is (also helps for review, yadda yadda). That said, I'm
not going to argue further over it, I'm barely a tenant in this house.

> Thanks
>
> --
> Qais Yousef
>

2020-02-18 16:47:43

by Steven Rostedt

Subject: Re: [PATCH 1/3] sched/rt: cpupri_find: implement fallback mechanism for !fit case

On Mon, 17 Feb 2020 23:45:49 +0000
Qais Yousef <[email protected]> wrote:

> --- a/kernel/sched/rt.c
> +++ b/kernel/sched/rt.c
> @@ -14,6 +14,8 @@ static int do_sched_rt_period_timer(struct rt_bandwidth *rt_b, int overrun);
>
> struct rt_bandwidth def_rt_bandwidth;
>
> +typedef bool (*fitness_fn_t)(struct task_struct *p, int cpu);
> +
> static enum hrtimer_restart sched_rt_period_timer(struct hrtimer *timer)
> {
> struct rt_bandwidth *rt_b =
> @@ -1708,6 +1710,7 @@ static int find_lowest_rq(struct task_struct *task)
> struct cpumask *lowest_mask = this_cpu_cpumask_var_ptr(local_cpu_mask);
> int this_cpu = smp_processor_id();
> int cpu = task_cpu(task);
> + fitness_fn_t fitness_fn;
>
> /* Make sure the mask is initialized first */
> if (unlikely(!lowest_mask))
> @@ -1716,8 +1719,17 @@ static int find_lowest_rq(struct task_struct *task)
> if (task->nr_cpus_allowed == 1)
> return -1; /* No other targets possible */
>
> + /*
> + * Help cpupri_find avoid the cost of looking for a fitting CPU when
> + * not really needed.
> + */
> + if (static_branch_unlikely(&sched_asym_cpucapacity))
> + fitness_fn = rt_task_fits_capacity;
> + else
> + fitness_fn = NULL;
> +
> if (!cpupri_find(&task_rq(task)->rd->cpupri, task, lowest_mask,
> - rt_task_fits_capacity))
> + fitness_fn))
> return -1; /* No targets found */
>
> /*


If we are going to use static branches, then lets just remove the
parameter totally. That is, make two functions (with helpers), where
one needs this fitness function the other does not.

if (static_branch_unlikely(&sched_asym_cpucapacity))
ret = cpupri_find_fitness(...);
else
ret = cpupri_find(...);

if (!ret)
return -1;

Something like that?

-- Steve

2020-02-18 17:28:40

by Qais Yousef

Subject: Re: [PATCH 1/3] sched/rt: cpupri_find: implement fallback mechanism for !fit case

On 02/18/20 11:46, Steven Rostedt wrote:
> On Mon, 17 Feb 2020 23:45:49 +0000
> Qais Yousef <[email protected]> wrote:
>
> > --- a/kernel/sched/rt.c
> > +++ b/kernel/sched/rt.c
> > @@ -14,6 +14,8 @@ static int do_sched_rt_period_timer(struct rt_bandwidth *rt_b, int overrun);
> >
> > struct rt_bandwidth def_rt_bandwidth;
> >
> > +typedef bool (*fitness_fn_t)(struct task_struct *p, int cpu);
> > +
> > static enum hrtimer_restart sched_rt_period_timer(struct hrtimer *timer)
> > {
> > struct rt_bandwidth *rt_b =
> > @@ -1708,6 +1710,7 @@ static int find_lowest_rq(struct task_struct *task)
> > struct cpumask *lowest_mask = this_cpu_cpumask_var_ptr(local_cpu_mask);
> > int this_cpu = smp_processor_id();
> > int cpu = task_cpu(task);
> > + fitness_fn_t fitness_fn;
> >
> > /* Make sure the mask is initialized first */
> > if (unlikely(!lowest_mask))
> > @@ -1716,8 +1719,17 @@ static int find_lowest_rq(struct task_struct *task)
> > if (task->nr_cpus_allowed == 1)
> > return -1; /* No other targets possible */
> >
> > + /*
> > + * Help cpupri_find avoid the cost of looking for a fitting CPU when
> > + * not really needed.
> > + */
> > + if (static_branch_unlikely(&sched_asym_cpucapacity))
> > + fitness_fn = rt_task_fits_capacity;
> > + else
> > + fitness_fn = NULL;
> > +
> > if (!cpupri_find(&task_rq(task)->rd->cpupri, task, lowest_mask,
> > - rt_task_fits_capacity))
> > + fitness_fn))
> > return -1; /* No targets found */
> >
> > /*
>
>
> If we are going to use static branches, then lets just remove the
> parameter totally. That is, make two functions (with helpers), where
> one needs this fitness function the other does not.
>
> if (static_branch_unlikely(&sched_asym_cpucapacity))
> ret = cpupri_find_fitness(...);
> else
> ret = cpupri_find(...);
>
> if (!ret)
> return -1;
>
> Something like that?

Is there any implication on code generation here?

I like my flavour better tbh. But I don't mind refactoring the function out if
it does make it more readable.

Thanks

--
Qais Yousef

2020-02-18 17:29:07

by Qais Yousef

Subject: Re: [PATCH 1/3] sched/rt: cpupri_find: implement fallback mechanism for !fit case

On 02/18/20 10:53, Dietmar Eggemann wrote:
> On 18/02/2020 00:45, Qais Yousef wrote:
> > + /*
> > + * Help cpupri_find avoid the cost of looking for a fitting CPU when
> > + * not really needed.
> > + */
>
> In case the comment is really needed, for me it would work better
> logically inverted.

Sure.

Thanks

--
Qais Yousef

2020-02-18 18:04:18

by Steven Rostedt

Subject: Re: [PATCH 1/3] sched/rt: cpupri_find: implement fallback mechanism for !fit case

On Tue, 18 Feb 2020 17:27:46 +0000
Qais Yousef <[email protected]> wrote:

> > If we are going to use static branches, then lets just remove the
> > parameter totally. That is, make two functions (with helpers), where
> > one needs this fitness function the other does not.
> >
> > if (static_branch_unlikely(&sched_asym_cpucapacity))
> > ret = cpupri_find_fitness(...);
> > else
> > ret = cpupri_find(...);
> >
> > if (!ret)
> > return -1;
> >
> > Something like that?
>
> Is there any implication on code generation here?
>
> I like my flavour better tbh. But I don't mind refactoring the function out if
> it does make it more readable.

I just figured we remove the passing of the parameter (which does make
an impact on the code generation).

Also, perhaps it would be better not to have to pass functions to
cpupri_find(). Is there any other function that needs to be passed, or
just this one in this series?

-- Steve

2020-02-18 18:54:34

by Qais Yousef

Subject: Re: [PATCH 1/3] sched/rt: cpupri_find: implement fallback mechanism for !fit case

On 02/18/20 13:03, Steven Rostedt wrote:
> On Tue, 18 Feb 2020 17:27:46 +0000
> Qais Yousef <[email protected]> wrote:
>
> > > If we are going to use static branches, then lets just remove the
> > > parameter totally. That is, make two functions (with helpers), where
> > > one needs this fitness function the other does not.
> > >
> > > if (static_branch_unlikely(&sched_asym_cpucapacity))
> > > ret = cpupri_find_fitness(...);
> > > else
> > > ret = cpupri_find(...);
> > >
> > > if (!ret)
> > > return -1;
> > >
> > > Something like that?
> >
> > Is there any implication on code generation here?
> >
> > I like my flavour better tbh. But I don't mind refactoring the function out if
> > it does make it more readable.
>
> I just figured we remove the passing of the parameter (which does make
> an impact on the code generation).

Ok. My mind went to the idea that protecting the whole function call with the
static key could be better.

> Also, perhaps it would be better not to have to pass functions to
> cpupri_find(). Is there any other function that needs to be passed, or
> just this one in this series?

I had that discussion in the past with Dietmar [1]

My argument was that this way the code is generic and self-contained, and
allows for easy extension by other potential users.

I'm happy to split the function into cpupri_find_fitness() (with a fn ptr) and
cpupri_find() (the original) like you suggest above, if you're still okay with
that.
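
Something like the below untested sketch, assuming the cpupri_find_fitness()
name sticks and rt_task_fits_capacity() stays the only user for now?

/* cpupri.c: keep the old interface, move the fitness handling behind
 * a new entry point. */
int cpupri_find(struct cpupri *cp, struct task_struct *p,
                struct cpumask *lowest_mask)
{
        return cpupri_find_fitness(cp, p, lowest_mask, NULL);
}

int cpupri_find_fitness(struct cpupri *cp, struct task_struct *p,
                        struct cpumask *lowest_mask,
                        bool (*fitness_fn)(struct task_struct *p, int cpu))
{
        /* ... current cpupri_find() body, including the fallback ... */
}

/* rt.c: in find_lowest_rq(), with a local 'int ret' */
        if (static_branch_unlikely(&sched_asym_cpucapacity))
                ret = cpupri_find_fitness(&task_rq(task)->rd->cpupri, task,
                                          lowest_mask, rt_task_fits_capacity);
        else
                ret = cpupri_find(&task_rq(task)->rd->cpupri, task, lowest_mask);

        if (!ret)
                return -1; /* No targets found */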

[1] https://lore.kernel.org/lkml/[email protected]/

Thanks

--
Qais Yousef