2014-02-06 23:10:44

by Daniel Lezcano

Subject: [PATCH V2 0/3] sched: idle_balance() cleanup and fix

Peter,

this patchset replaces the beginning of the previous mixed one, which was not
yet committed [1], without changes except a compilation error fix for UP kernel
configs. It is refreshed against tip/sched/core.

As the UP config compilation is broken in the previous patchset, git bisect is
no longer safe for it. You can apply this small series and drop the first 3
patches of the previous series [1], or ignore it and I will send the fix
after you have applied [1].

It cleans up the idle_balance() parameters by passing only the struct rq,
fixes a race in the idle_balance() function and finally moves the idle_stamp
handling from fair.c up to core.c. I am aware it will go back to fair.c with
Peter's pending patches, but at least idle_balance() now returns true if a
balance occurred.

[1] https://www.mail-archive.com/[email protected]/msg577271.html

Changelog:
V2:
* fixed compilation errors when CONFIG_SMP=n

V1: initial post


Daniel Lezcano (3):
sched: Remove cpu parameter for idle_balance()
sched: Fix race in idle_balance()
sched: Move idle_stamp up to the core

kernel/sched/core.c | 13 +++++++++++--
kernel/sched/fair.c | 20 +++++++++++++-------
kernel/sched/sched.h | 8 +-------
3 files changed, 25 insertions(+), 16 deletions(-)

--
1.7.9.5


2014-02-06 23:10:47

by Daniel Lezcano

Subject: [PATCH V2 1/3] sched: Remove cpu parameter for idle_balance()

The cpu parameter passed to idle_balance() is not needed, as it can
be retrieved from the struct rq.

Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Signed-off-by: Daniel Lezcano <[email protected]>
Signed-off-by: Peter Zijlstra <[email protected]>
---
kernel/sched/core.c | 2 +-
kernel/sched/fair.c | 3 ++-
kernel/sched/sched.h | 4 ++--
3 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 210a12a..16b97dd 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -2705,7 +2705,7 @@ need_resched:
pre_schedule(rq, prev);

if (unlikely(!rq->nr_running))
- idle_balance(cpu, rq);
+ idle_balance(rq);

put_prev_task(rq, prev);
next = pick_next_task(rq);
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 4caa803..428bc9d 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -6531,12 +6531,13 @@ out:
* idle_balance is called by schedule() if this_cpu is about to become
* idle. Attempts to pull tasks from other CPUs.
*/
-void idle_balance(int this_cpu, struct rq *this_rq)
+void idle_balance(struct rq *this_rq)
{
struct sched_domain *sd;
int pulled_task = 0;
unsigned long next_balance = jiffies + HZ;
u64 curr_cost = 0;
+ int this_cpu = this_rq->cpu;

this_rq->idle_stamp = rq_clock(this_rq);

diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index c2119fd..1436219 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -1176,14 +1176,14 @@ extern const struct sched_class idle_sched_class;
extern void update_group_power(struct sched_domain *sd, int cpu);

extern void trigger_load_balance(struct rq *rq);
-extern void idle_balance(int this_cpu, struct rq *this_rq);
+extern void idle_balance(struct rq *this_rq);

extern void idle_enter_fair(struct rq *this_rq);
extern void idle_exit_fair(struct rq *this_rq);

#else /* CONFIG_SMP */

-static inline void idle_balance(int cpu, struct rq *rq)
+static inline void idle_balance(struct rq *rq)
{
}

--
1.7.9.5

2014-02-06 23:10:55

by Daniel Lezcano

Subject: [PATCH V2 3/3] sched: Move idle_stamp up to the core

idle_balance() modifies the idle_stamp field of the rq, making this
information shared across core.c and fair.c. As the previous patch lets us
know whether the cpu is going to idle or not, let's encapsulate the
idle_stamp handling in core.c by moving it up to the caller. The
idle_balance() function returns true if a balance occurred and the cpu
won't go idle, false if no balance happened and the cpu is going idle.

Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Signed-off-by: Daniel Lezcano <[email protected]>
Signed-off-by: Peter Zijlstra <[email protected]>
---
kernel/sched/core.c | 13 +++++++++++--
kernel/sched/fair.c | 14 ++++++--------
kernel/sched/sched.h | 8 +-------
3 files changed, 18 insertions(+), 17 deletions(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 16b97dd..428ee4c 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -2704,8 +2704,17 @@ need_resched:

pre_schedule(rq, prev);

- if (unlikely(!rq->nr_running))
- idle_balance(rq);
+#ifdef CONFIG_SMP
+ if (unlikely(!rq->nr_running)) {
+ /*
+ * We must set idle_stamp _before_ calling idle_balance(), such
+ * that we measure the duration of idle_balance() as idle time.
+ */
+ rq->idle_stamp = rq_clock(rq);
+ if (idle_balance(rq))
+ rq->idle_stamp = 0;
+ }
+#endif

put_prev_task(rq, prev);
next = pick_next_task(rq);
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 5ebc681..04fea77 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -6531,7 +6531,7 @@ out:
* idle_balance is called by schedule() if this_cpu is about to become
* idle. Attempts to pull tasks from other CPUs.
*/
-void idle_balance(struct rq *this_rq)
+int idle_balance(struct rq *this_rq)
{
struct sched_domain *sd;
int pulled_task = 0;
@@ -6539,10 +6539,8 @@ void idle_balance(struct rq *this_rq)
u64 curr_cost = 0;
int this_cpu = this_rq->cpu;

- this_rq->idle_stamp = rq_clock(this_rq);
-
if (this_rq->avg_idle < sysctl_sched_migration_cost)
- return;
+ return 0;

/*
* Drop the rq->lock, but keep IRQ/preempt disabled.
@@ -6580,10 +6578,8 @@ void idle_balance(struct rq *this_rq)
interval = msecs_to_jiffies(sd->balance_interval);
if (time_after(next_balance, sd->last_balance + interval))
next_balance = sd->last_balance + interval;
- if (pulled_task) {
- this_rq->idle_stamp = 0;
+ if (pulled_task)
break;
- }
}
rcu_read_unlock();

@@ -6594,7 +6590,7 @@ void idle_balance(struct rq *this_rq)
* A task could have been enqueued in the meantime
*/
if (this_rq->nr_running && !pulled_task)
- return;
+ return 1;

if (pulled_task || time_after(jiffies, this_rq->next_balance)) {
/*
@@ -6606,6 +6602,8 @@ void idle_balance(struct rq *this_rq)

if (curr_cost > this_rq->max_idle_balance_cost)
this_rq->max_idle_balance_cost = curr_cost;
+
+ return pulled_task;
}

/*
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index 1436219..c08c070 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -1176,17 +1176,11 @@ extern const struct sched_class idle_sched_class;
extern void update_group_power(struct sched_domain *sd, int cpu);

extern void trigger_load_balance(struct rq *rq);
-extern void idle_balance(struct rq *this_rq);
+extern int idle_balance(struct rq *this_rq);

extern void idle_enter_fair(struct rq *this_rq);
extern void idle_exit_fair(struct rq *this_rq);

-#else /* CONFIG_SMP */
-
-static inline void idle_balance(struct rq *rq)
-{
-}
-
#endif

extern void sysrq_sched_debug_show(void);
--
1.7.9.5

2014-02-06 23:11:14

by Daniel Lezcano

Subject: [PATCH V2 2/3] sched: Fix race in idle_balance()

The scheduler's main function, schedule(), checks whether there are any tasks
left on the runqueue. If there are none, it calls idle_balance() to check
whether a task should be pulled onto the current runqueue, assuming the cpu
will go idle otherwise.

But idle_balance() releases rq->lock in order to look up the sched
domains and takes the lock again right after. That opens a window where
another cpu may put a task on our runqueue, so we won't go idle, but
we have already filled in idle_stamp, thinking we will.

This patch closes the window by checking, after taking the lock again, whether
the runqueue has been modified without a task having been pulled, so we won't
go idle right after in the __schedule() function.

Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Signed-off-by: Daniel Lezcano <[email protected]>
Signed-off-by: Peter Zijlstra <[email protected]>
---
kernel/sched/fair.c | 7 +++++++
1 file changed, 7 insertions(+)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 428bc9d..5ebc681 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -6589,6 +6589,13 @@ void idle_balance(struct rq *this_rq)

raw_spin_lock(&this_rq->lock);

+ /*
+ * While browsing the domains, we released the rq lock.
+ * A task could have been enqueued in the meantime
+ */
+ if (this_rq->nr_running && !pulled_task)
+ return;
+
if (pulled_task || time_after(jiffies, this_rq->next_balance)) {
/*
* We are going idle. next_balance may be set based on
--
1.7.9.5

2014-02-10 09:24:27

by Preeti Murthy

Subject: Re: [PATCH V2 2/3] sched: Fix race in idle_balance()

Hi Daniel,

Isn't the only scenario in which another cpu can put a task on our
runqueue nohz_idle_balance(), where only the cpus in
nohz.idle_cpus_mask are iterated through? But for the case
that this patch is addressing, the cpu in question is not yet part
of nohz.idle_cpus_mask, right?

Any other case would trigger load balancing on the same cpu, but
we have preemption and interrupts disabled at this point.

Thanks

Regards
Preeti U Murthy

On Fri, Feb 7, 2014 at 4:40 AM, Daniel Lezcano
<[email protected]> wrote:
> The scheduler main function 'schedule()' checks if there are no more tasks
> on the runqueue. Then it checks if a task should be pulled in the current
> runqueue in idle_balance() assuming it will go to idle otherwise.
>
> But the idle_balance() releases the rq->lock in order to lookup in the sched
> domains and takes the lock again right after. That opens a window where
> another cpu may put a task in our runqueue, so we won't go to idle but
> we have filled the idle_stamp, thinking we will.
>
> This patch closes the window by checking if the runqueue has been modified
> but without pulling a task after taking the lock again, so we won't go to idle
> right after in the __schedule() function.
>
> Cc: [email protected]
> Cc: [email protected]
> Cc: [email protected]
> Signed-off-by: Daniel Lezcano <[email protected]>
> Signed-off-by: Peter Zijlstra <[email protected]>
> ---
> kernel/sched/fair.c | 7 +++++++
> 1 file changed, 7 insertions(+)
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 428bc9d..5ebc681 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -6589,6 +6589,13 @@ void idle_balance(struct rq *this_rq)
>
> raw_spin_lock(&this_rq->lock);
>
> + /*
> + * While browsing the domains, we released the rq lock.
> + * A task could have be enqueued in the meantime
> + */
> + if (this_rq->nr_running && !pulled_task)
> + return;
> +
> if (pulled_task || time_after(jiffies, this_rq->next_balance)) {
> /*
> * We are going idle. next_balance may be set based on
> --
> 1.7.9.5
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/

2014-02-10 10:10:08

by Preeti Murthy

Subject: Re: [PATCH V2 3/3] sched: Move idle_stamp up to the core

Hi Daniel,

On Fri, Feb 7, 2014 at 4:40 AM, Daniel Lezcano
<[email protected]> wrote:
> The idle_balance modifies the idle_stamp field of the rq, making this
> information to be shared across core.c and fair.c. As we can know if the
> cpu is going to idle or not with the previous patch, let's encapsulate the
> idle_stamp information in core.c by moving it up to the caller. The
> idle_balance function returns true in case a balancing occured and the cpu
> won't be idle, false if no balance happened and the cpu is going idle.
>
> Cc: [email protected]
> Cc: [email protected]
> Cc: [email protected]
> Signed-off-by: Daniel Lezcano <[email protected]>
> Signed-off-by: Peter Zijlstra <[email protected]>
> ---
> kernel/sched/core.c | 13 +++++++++++--
> kernel/sched/fair.c | 14 ++++++--------
> kernel/sched/sched.h | 8 +-------
> 3 files changed, 18 insertions(+), 17 deletions(-)
>
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index 16b97dd..428ee4c 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -2704,8 +2704,17 @@ need_resched:
>
> pre_schedule(rq, prev);
>
> - if (unlikely(!rq->nr_running))
> - idle_balance(rq);
> +#ifdef CONFIG_SMP
> + if (unlikely(!rq->nr_running)) {
> + /*
> + * We must set idle_stamp _before_ calling idle_balance(), such
> + * that we measure the duration of idle_balance() as idle time.

Shouldn't this be "such that we *do not* measure the duration of idle_balance()
as idle time"?

Thanks

Regards
Preeti U Murthy

2014-02-11 11:11:55

by Daniel Lezcano

Subject: Re: [PATCH V2 2/3] sched: Fix race in idle_balance()

On 02/10/2014 10:24 AM, Preeti Murthy wrote:
> HI Daniel,
>
> Isn't the only scenario where another cpu can put an idle task on
> our runqueue,

Well, I am not sure I understand what you meant, but I assume you are
asking whether it is possible for a task to be pulled onto us while we are
idle, right?

This patch fixes the race when the current cpu is *about* to enter idle
when calling schedule().


> in nohz_idle_balance() where only the cpus in
> the nohz.idle_cpus_mask are iterated through. But for the case
> that this patch is addressing, the cpu in question is not yet a part
> of the nohz.idle_cpus_mask right?
>
> Any other case would trigger load balancing on the same cpu, but
> we are preempt_disabled and interrupt disabled at this point.
>
> Thanks
>
> Regards
> Preeti U Murthy
>
> On Fri, Feb 7, 2014 at 4:40 AM, Daniel Lezcano
> <[email protected]> wrote:
>> The scheduler main function 'schedule()' checks if there are no more tasks
>> on the runqueue. Then it checks if a task should be pulled in the current
>> runqueue in idle_balance() assuming it will go to idle otherwise.
>>
>> But the idle_balance() releases the rq->lock in order to lookup in the sched
>> domains and takes the lock again right after. That opens a window where
>> another cpu may put a task in our runqueue, so we won't go to idle but
>> we have filled the idle_stamp, thinking we will.
>>
>> This patch closes the window by checking if the runqueue has been modified
>> but without pulling a task after taking the lock again, so we won't go to idle
>> right after in the __schedule() function.
>>
>> Cc: [email protected]
>> Cc: [email protected]
>> Cc: [email protected]
>> Signed-off-by: Daniel Lezcano <[email protected]>
>> Signed-off-by: Peter Zijlstra <[email protected]>
>> ---
>> kernel/sched/fair.c | 7 +++++++
>> 1 file changed, 7 insertions(+)
>>
>> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
>> index 428bc9d..5ebc681 100644
>> --- a/kernel/sched/fair.c
>> +++ b/kernel/sched/fair.c
>> @@ -6589,6 +6589,13 @@ void idle_balance(struct rq *this_rq)
>>
>> raw_spin_lock(&this_rq->lock);
>>
>> + /*
>> + * While browsing the domains, we released the rq lock.
>> + * A task could have be enqueued in the meantime
>> + */
>> + if (this_rq->nr_running && !pulled_task)
>> + return;
>> +
>> if (pulled_task || time_after(jiffies, this_rq->next_balance)) {
>> /*
>> * We are going idle. next_balance may be set based on
>> --
>> 1.7.9.5
>>


--
<http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs

Follow Linaro: <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog

2014-02-11 12:07:17

by Daniel Lezcano

Subject: Re: [PATCH V2 3/3] sched: Move idle_stamp up to the core

On 02/10/2014 11:04 AM, Preeti Murthy wrote:
> Hi Daniel,
>
> On Fri, Feb 7, 2014 at 4:40 AM, Daniel Lezcano
> <[email protected]> wrote:
>> The idle_balance modifies the idle_stamp field of the rq, making this
>> information to be shared across core.c and fair.c. As we can know if the
>> cpu is going to idle or not with the previous patch, let's encapsulate the
>> idle_stamp information in core.c by moving it up to the caller. The
>> idle_balance function returns true in case a balancing occured and the cpu
>> won't be idle, false if no balance happened and the cpu is going idle.
>>
>> Cc: [email protected]
>> Cc: [email protected]
>> Cc: [email protected]
>> Signed-off-by: Daniel Lezcano <[email protected]>
>> Signed-off-by: Peter Zijlstra <[email protected]>
>> ---
>> kernel/sched/core.c | 13 +++++++++++--
>> kernel/sched/fair.c | 14 ++++++--------
>> kernel/sched/sched.h | 8 +-------
>> 3 files changed, 18 insertions(+), 17 deletions(-)
>>
>> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
>> index 16b97dd..428ee4c 100644
>> --- a/kernel/sched/core.c
>> +++ b/kernel/sched/core.c
>> @@ -2704,8 +2704,17 @@ need_resched:
>>
>> pre_schedule(rq, prev);
>>
>> - if (unlikely(!rq->nr_running))
>> - idle_balance(rq);
>> +#ifdef CONFIG_SMP
>> + if (unlikely(!rq->nr_running)) {
>> + /*
>> + * We must set idle_stamp _before_ calling idle_balance(), such
>> + * that we measure the duration of idle_balance() as idle time.
>
> Should not this be "such that we *do not* measure the duration of idle_balance()
> as idle time?"

Actually, the initial code included the idle balance processing time in
the idle stamp. When I moved the idle stamp to core.c, idle balance was
no longer measured (an unwanted change). That has been fixed, and to
prevent it from happening again, we added the comment.


2014-02-13 07:14:45

by Alex Shi

Subject: Re: [PATCH V2 1/3] sched: Remove cpu parameter for idle_balance()

On 02/07/2014 07:10 AM, Daniel Lezcano wrote:
> The cpu parameter passed to idle_balance is not needed as it could
> be retrieved from the struct rq.
>
> Cc: [email protected]
> Cc: [email protected]
> Cc: [email protected]
> Signed-off-by: Daniel Lezcano <[email protected]>
> Signed-off-by: Peter Zijlstra <[email protected]>

Reviewed-by: Alex Shi <[email protected]>
> ---
> kernel/sched/core.c | 2 +-
> kernel/sched/fair.c | 3 ++-
> kernel/sched/sched.h | 4 ++--
> 3 files changed, 5 insertions(+), 4 deletions(-)
>
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index 210a12a..16b97dd 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -2705,7 +2705,7 @@ need_resched:
> pre_schedule(rq, prev);
>
> if (unlikely(!rq->nr_running))
> - idle_balance(cpu, rq);
> + idle_balance(rq);
>
> put_prev_task(rq, prev);
> next = pick_next_task(rq);
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 4caa803..428bc9d 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -6531,12 +6531,13 @@ out:
> * idle_balance is called by schedule() if this_cpu is about to become
> * idle. Attempts to pull tasks from other CPUs.
> */
> -void idle_balance(int this_cpu, struct rq *this_rq)
> +void idle_balance(struct rq *this_rq)
> {
> struct sched_domain *sd;
> int pulled_task = 0;
> unsigned long next_balance = jiffies + HZ;
> u64 curr_cost = 0;
> + int this_cpu = this_rq->cpu;
>
> this_rq->idle_stamp = rq_clock(this_rq);
>
> diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
> index c2119fd..1436219 100644
> --- a/kernel/sched/sched.h
> +++ b/kernel/sched/sched.h
> @@ -1176,14 +1176,14 @@ extern const struct sched_class idle_sched_class;
> extern void update_group_power(struct sched_domain *sd, int cpu);
>
> extern void trigger_load_balance(struct rq *rq);
> -extern void idle_balance(int this_cpu, struct rq *this_rq);
> +extern void idle_balance(struct rq *this_rq);
>
> extern void idle_enter_fair(struct rq *this_rq);
> extern void idle_exit_fair(struct rq *this_rq);
>
> #else /* CONFIG_SMP */
>
> -static inline void idle_balance(int cpu, struct rq *rq)
> +static inline void idle_balance(struct rq *rq)
> {
> }
>
>


--
Thanks
Alex

2014-02-13 07:46:09

by Alex Shi

Subject: Re: [PATCH V2 2/3] sched: Fix race in idle_balance()

On 02/11/2014 07:11 PM, Daniel Lezcano wrote:
> On 02/10/2014 10:24 AM, Preeti Murthy wrote:
>> HI Daniel,
>>
>> Isn't the only scenario where another cpu can put an idle task on
>> our runqueue,
>
> Well, I am not sure to understand what you meant, but I assume you are
> asking if it is possible to have a task to be pulled when we are idle,
> right ?
>
> This patch fixes the race when the current cpu is *about* to enter idle
> when calling schedule().

Preeti said that she didn't see how it would be possible to insert a task
on the cpu.

I also did a quick check; maybe the task comes from the wakeup path?

--
Thanks
Alex

2014-02-13 07:46:40

by Alex Shi

Subject: Re: [PATCH V2 2/3] sched: Fix race in idle_balance()

On 02/07/2014 07:10 AM, Daniel Lezcano wrote:
> The scheduler main function 'schedule()' checks if there are no more tasks
> on the runqueue. Then it checks if a task should be pulled in the current
> runqueue in idle_balance() assuming it will go to idle otherwise.
>
> But the idle_balance() releases the rq->lock in order to lookup in the sched
> domains and takes the lock again right after. That opens a window where
> another cpu may put a task in our runqueue, so we won't go to idle but
> we have filled the idle_stamp, thinking we will.
>
> This patch closes the window by checking if the runqueue has been modified
> but without pulling a task after taking the lock again, so we won't go to idle
> right after in the __schedule() function.
>
> Cc: [email protected]
> Cc: [email protected]
> Cc: [email protected]
> Signed-off-by: Daniel Lezcano <[email protected]>
> Signed-off-by: Peter Zijlstra <[email protected]>
> ---
> kernel/sched/fair.c | 7 +++++++
> 1 file changed, 7 insertions(+)
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 428bc9d..5ebc681 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -6589,6 +6589,13 @@ void idle_balance(struct rq *this_rq)
>
> raw_spin_lock(&this_rq->lock);
>
> + /*
> + * While browsing the domains, we released the rq lock.
> + * A task could have be enqueued in the meantime
> + */

Would you mind moving the following lines up to here?

if (curr_cost > this_rq->max_idle_balance_cost)
this_rq->max_idle_balance_cost = curr_cost;

> + if (this_rq->nr_running && !pulled_task)
> + return;
> +
> if (pulled_task || time_after(jiffies, this_rq->next_balance)) {
> /*
> * We are going idle. next_balance may be set based on
>




--
Thanks
Alex

2014-02-13 07:53:14

by Alex Shi

Subject: Re: [PATCH V2 3/3] sched: Move idle_stamp up to the core

On 02/07/2014 07:10 AM, Daniel Lezcano wrote:
> The idle_balance modifies the idle_stamp field of the rq, making this
> information to be shared across core.c and fair.c. As we can know if the
> cpu is going to idle or not with the previous patch, let's encapsulate the
> idle_stamp information in core.c by moving it up to the caller. The
> idle_balance function returns true in case a balancing occured and the cpu
> won't be idle, false if no balance happened and the cpu is going idle.
>
> Cc: [email protected]
> Cc: [email protected]
> Cc: [email protected]
> Signed-off-by: Daniel Lezcano <[email protected]>
> Signed-off-by: Peter Zijlstra <[email protected]>

Reviewed-by: Alex Shi <[email protected]>



--
Thanks
Alex

2014-02-13 10:14:11

by Preeti U Murthy

Subject: Re: [PATCH V2 2/3] sched: Fix race in idle_balance()

Hi,

On 02/13/2014 01:15 PM, Alex Shi wrote:
> On 02/11/2014 07:11 PM, Daniel Lezcano wrote:
>> On 02/10/2014 10:24 AM, Preeti Murthy wrote:
>>> HI Daniel,
>>>
>>> Isn't the only scenario where another cpu can put an idle task on
>>> our runqueue,
>>
>> Well, I am not sure to understand what you meant, but I assume you are
>> asking if it is possible to have a task to be pulled when we are idle,
>> right ?
>>
>> This patch fixes the race when the current cpu is *about* to enter idle
>> when calling schedule().
>
> Preeti said the she didn't see a possible to insert a task on the cpu.
>
> I also did a quick check, maybe task come from wakeup path?

Yes, this is possible. Thanks for pointing this out :)

Reviewed-by: Preeti U Murthy <[email protected]>
>

2014-02-13 10:18:02

by Preeti U Murthy

Subject: Re: [PATCH V2 3/3] sched: Move idle_stamp up to the core

Hi Daniel,

On 02/11/2014 05:37 PM, Daniel Lezcano wrote:
> On 02/10/2014 11:04 AM, Preeti Murthy wrote:
>> Hi Daniel,
>>
>> On Fri, Feb 7, 2014 at 4:40 AM, Daniel Lezcano
>> <[email protected]> wrote:
>>> The idle_balance modifies the idle_stamp field of the rq, making this
>>> information to be shared across core.c and fair.c. As we can know if the
>>> cpu is going to idle or not with the previous patch, let's
>>> encapsulate the
>>> idle_stamp information in core.c by moving it up to the caller. The
>>> idle_balance function returns true in case a balancing occured and
>>> the cpu
>>> won't be idle, false if no balance happened and the cpu is going idle.
>>>
>>> Cc: [email protected]
>>> Cc: [email protected]
>>> Cc: [email protected]
>>> Signed-off-by: Daniel Lezcano <[email protected]>
>>> Signed-off-by: Peter Zijlstra <[email protected]>
>>> ---
>>> kernel/sched/core.c | 13 +++++++++++--
>>> kernel/sched/fair.c | 14 ++++++--------
>>> kernel/sched/sched.h | 8 +-------
>>> 3 files changed, 18 insertions(+), 17 deletions(-)
>>>
>>> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
>>> index 16b97dd..428ee4c 100644
>>> --- a/kernel/sched/core.c
>>> +++ b/kernel/sched/core.c
>>> @@ -2704,8 +2704,17 @@ need_resched:
>>>
>>> pre_schedule(rq, prev);
>>>
>>> - if (unlikely(!rq->nr_running))
>>> - idle_balance(rq);
>>> +#ifdef CONFIG_SMP
>>> + if (unlikely(!rq->nr_running)) {
>>> + /*
>>> + * We must set idle_stamp _before_ calling
>>> idle_balance(), such
>>> + * that we measure the duration of idle_balance() as
>>> idle time.
>>
>> Should not this be "such that we *do not* measure the duration of
>> idle_balance()
>> as idle time?"
>
> Actually, the initial code was including the idle balance time
> processing in the idle stamp. When I moved the idle stamp in core.c,
> idle balance was no longer measured (an unwanted change). That has been
> fixed and to prevent that to occur again, we added a comment.

Oh sorry! Yes you are right.

Thanks

Regards
Preeti U Murthy
>

2014-02-13 10:22:39

by Daniel Lezcano

Subject: Re: [PATCH V2 2/3] sched: Fix race in idle_balance()

On 02/13/2014 11:10 AM, Preeti U Murthy wrote:
> Hi,
>
> On 02/13/2014 01:15 PM, Alex Shi wrote:
>> On 02/11/2014 07:11 PM, Daniel Lezcano wrote:
>>> On 02/10/2014 10:24 AM, Preeti Murthy wrote:
>>>> HI Daniel,
>>>>
>>>> Isn't the only scenario where another cpu can put an idle task on
>>>> our runqueue,
>>>
>>> Well, I am not sure to understand what you meant, but I assume you are
>>> asking if it is possible to have a task to be pulled when we are idle,
>>> right ?
>>>
>>> This patch fixes the race when the current cpu is *about* to enter idle
>>> when calling schedule().
>>
>> Preeti said the she didn't see a possible to insert a task on the cpu.
>>
>> I also did a quick check, maybe task come from wakeup path?
>
> Yes this is possible. Thanks for pointing this :)
>
> Reviewed-by: Preeti U Murthy <[email protected]>

Thanks for the review !

-- Daniel

