2017-03-07 06:03:26

by Wanpeng Li

[permalink] [raw]
Subject: [PATCH] sched/deadline: Add missing update_rq_clock() in dl_task_timer()

From: Wanpeng Li <[email protected]>

The following warning can be triggered by hot-unplugging the CPU
on which an active SCHED_DEADLINE task is running on:

------------[ cut here ]------------
WARNING: CPU: 7 PID: 0 at kernel/sched/sched.h:833 replenish_dl_entity+0x71e/0xc40
rq->clock_update_flags < RQCF_ACT_SKIP
CPU: 7 PID: 0 Comm: swapper/7 Tainted: G B 4.11.0-rc1+ #24
Hardware name: LENOVO ThinkCentre M8500t-N000/SHARKBAY, BIOS FBKTC1AUS 02/16/2016
Call Trace:
<IRQ>
dump_stack+0x85/0xc4
__warn+0x172/0x1b0
warn_slowpath_fmt+0xb4/0xf0
? __warn+0x1b0/0x1b0
? debug_check_no_locks_freed+0x2c0/0x2c0
? cpudl_set+0x3d/0x2b0
replenish_dl_entity+0x71e/0xc40
enqueue_task_dl+0x2ea/0x12e0
? dl_task_timer+0x777/0x990
? __hrtimer_run_queues+0x270/0xa50
dl_task_timer+0x316/0x990
? enqueue_task_dl+0x12e0/0x12e0
? enqueue_task_dl+0x12e0/0x12e0
__hrtimer_run_queues+0x270/0xa50
? hrtimer_cancel+0x20/0x20
? hrtimer_interrupt+0x119/0x600
hrtimer_interrupt+0x19c/0x600
? trace_hardirqs_off+0xd/0x10
local_apic_timer_interrupt+0x74/0xe0
smp_apic_timer_interrupt+0x76/0xa0
apic_timer_interrupt+0x93/0xa0

The DL task will be migrated to a suitable later deadline rq once the DL
timer fires and currnet rq is offline. The rq clock of the new rq should
be updated. This patch fixes it by updating the rq clock after holding
the new rq's rq lock.

Cc: Juri Lelli <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Matt Fleming <[email protected]>
Signed-off-by: Wanpeng Li <[email protected]>
---
kernel/sched/deadline.c | 1 +
1 file changed, 1 insertion(+)

diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
index 99b2c33..c6db3fd 100644
--- a/kernel/sched/deadline.c
+++ b/kernel/sched/deadline.c
@@ -638,6 +638,7 @@ static enum hrtimer_restart dl_task_timer(struct hrtimer *timer)
lockdep_unpin_lock(&rq->lock, rf.cookie);
rq = dl_task_offline_migration(rq, p);
rf.cookie = lockdep_pin_lock(&rq->lock);
+ update_rq_clock(rq);

/*
* Now that the task has been migrated to the new RQ and we
--
2.7.4


2017-03-07 14:00:32

by Matt Fleming

[permalink] [raw]
Subject: Re: [PATCH] sched/deadline: Add missing update_rq_clock() in dl_task_timer()

On Mon, 06 Mar, at 09:51:28PM, Wanpeng Li wrote:
> From: Wanpeng Li <[email protected]>
>
> The following warning can be triggered by hot-unplugging the CPU
> on which an active SCHED_DEADLINE task is running on:
>
> ------------[ cut here ]------------
> WARNING: CPU: 7 PID: 0 at kernel/sched/sched.h:833 replenish_dl_entity+0x71e/0xc40
> rq->clock_update_flags < RQCF_ACT_SKIP
> CPU: 7 PID: 0 Comm: swapper/7 Tainted: G B 4.11.0-rc1+ #24

[...]

> diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
> index 99b2c33..c6db3fd 100644
> --- a/kernel/sched/deadline.c
> +++ b/kernel/sched/deadline.c
> @@ -638,6 +638,7 @@ static enum hrtimer_restart dl_task_timer(struct hrtimer *timer)
> lockdep_unpin_lock(&rq->lock, rf.cookie);
> rq = dl_task_offline_migration(rq, p);
> rf.cookie = lockdep_pin_lock(&rq->lock);
> + update_rq_clock(rq);
>
> /*
> * Now that the task has been migrated to the new RQ and we

Yeah, I guess the reason we can't use the rq_repin_lock() function is
because of all the DL double rq locking going on inside of
dl_task_offline_migration().

I'd definitely like someone else to verify, but this looks OK to me.

Reviewed-by: Matt Fleming <[email protected]>

2017-03-15 07:53:09

by Wanpeng Li

[permalink] [raw]
Subject: Re: [PATCH] sched/deadline: Add missing update_rq_clock() in dl_task_timer()

Ping, :)
2017-03-07 13:51 GMT+08:00 Wanpeng Li <[email protected]>:
> From: Wanpeng Li <[email protected]>
>
> The following warning can be triggered by hot-unplugging the CPU
> on which an active SCHED_DEADLINE task is running on:
>
> ------------[ cut here ]------------
> WARNING: CPU: 7 PID: 0 at kernel/sched/sched.h:833 replenish_dl_entity+0x71e/0xc40
> rq->clock_update_flags < RQCF_ACT_SKIP
> CPU: 7 PID: 0 Comm: swapper/7 Tainted: G B 4.11.0-rc1+ #24
> Hardware name: LENOVO ThinkCentre M8500t-N000/SHARKBAY, BIOS FBKTC1AUS 02/16/2016
> Call Trace:
> <IRQ>
> dump_stack+0x85/0xc4
> __warn+0x172/0x1b0
> warn_slowpath_fmt+0xb4/0xf0
> ? __warn+0x1b0/0x1b0
> ? debug_check_no_locks_freed+0x2c0/0x2c0
> ? cpudl_set+0x3d/0x2b0
> replenish_dl_entity+0x71e/0xc40
> enqueue_task_dl+0x2ea/0x12e0
> ? dl_task_timer+0x777/0x990
> ? __hrtimer_run_queues+0x270/0xa50
> dl_task_timer+0x316/0x990
> ? enqueue_task_dl+0x12e0/0x12e0
> ? enqueue_task_dl+0x12e0/0x12e0
> __hrtimer_run_queues+0x270/0xa50
> ? hrtimer_cancel+0x20/0x20
> ? hrtimer_interrupt+0x119/0x600
> hrtimer_interrupt+0x19c/0x600
> ? trace_hardirqs_off+0xd/0x10
> local_apic_timer_interrupt+0x74/0xe0
> smp_apic_timer_interrupt+0x76/0xa0
> apic_timer_interrupt+0x93/0xa0
>
> The DL task will be migrated to a suitable later deadline rq once the DL
> timer fires and currnet rq is offline. The rq clock of the new rq should
> be updated. This patch fixes it by updating the rq clock after holding
> the new rq's rq lock.
>
> Cc: Juri Lelli <[email protected]>
> Cc: Peter Zijlstra <[email protected]>
> Cc: Ingo Molnar <[email protected]>
> Cc: Thomas Gleixner <[email protected]>
> Cc: Matt Fleming <[email protected]>
> Signed-off-by: Wanpeng Li <[email protected]>
> ---
> kernel/sched/deadline.c | 1 +
> 1 file changed, 1 insertion(+)
>
> diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
> index 99b2c33..c6db3fd 100644
> --- a/kernel/sched/deadline.c
> +++ b/kernel/sched/deadline.c
> @@ -638,6 +638,7 @@ static enum hrtimer_restart dl_task_timer(struct hrtimer *timer)
> lockdep_unpin_lock(&rq->lock, rf.cookie);
> rq = dl_task_offline_migration(rq, p);
> rf.cookie = lockdep_pin_lock(&rq->lock);
> + update_rq_clock(rq);
>
> /*
> * Now that the task has been migrated to the new RQ and we
> --
> 2.7.4
>

Subject: Re: [PATCH] sched/deadline: Add missing update_rq_clock() in dl_task_timer()

On 03/15/2017 08:53 AM, Wanpeng Li wrote:
> Ping, :)
> 2017-03-07 13:51 GMT+08:00 Wanpeng Li <[email protected]>:
>> From: Wanpeng Li <[email protected]>
>>
>> The following warning can be triggered by hot-unplugging the CPU
>> on which an active SCHED_DEADLINE task is running on:
>>
>> ------------[ cut here ]------------
>> WARNING: CPU: 7 PID: 0 at kernel/sched/sched.h:833 replenish_dl_entity+0x71e/0xc40
>> rq->clock_update_flags < RQCF_ACT_SKIP
>> CPU: 7 PID: 0 Comm: swapper/7 Tainted: G B 4.11.0-rc1+ #24
>> Hardware name: LENOVO ThinkCentre M8500t-N000/SHARKBAY, BIOS FBKTC1AUS 02/16/2016
>> Call Trace:
>> <IRQ>
>> dump_stack+0x85/0xc4
>> __warn+0x172/0x1b0
>> warn_slowpath_fmt+0xb4/0xf0
>> ? __warn+0x1b0/0x1b0
>> ? debug_check_no_locks_freed+0x2c0/0x2c0
>> ? cpudl_set+0x3d/0x2b0
>> replenish_dl_entity+0x71e/0xc40
>> enqueue_task_dl+0x2ea/0x12e0
>> ? dl_task_timer+0x777/0x990
>> ? __hrtimer_run_queues+0x270/0xa50
>> dl_task_timer+0x316/0x990
>> ? enqueue_task_dl+0x12e0/0x12e0
>> ? enqueue_task_dl+0x12e0/0x12e0
>> __hrtimer_run_queues+0x270/0xa50
>> ? hrtimer_cancel+0x20/0x20
>> ? hrtimer_interrupt+0x119/0x600
>> hrtimer_interrupt+0x19c/0x600
>> ? trace_hardirqs_off+0xd/0x10
>> local_apic_timer_interrupt+0x74/0xe0
>> smp_apic_timer_interrupt+0x76/0xa0
>> apic_timer_interrupt+0x93/0xa0
>>
>> The DL task will be migrated to a suitable later deadline rq once the DL
>> timer fires and currnet rq is offline. The rq clock of the new rq should
>> be updated. This patch fixes it by updating the rq clock after holding
>> the new rq's rq lock.
>>
>> Cc: Juri Lelli <[email protected]>
>> Cc: Peter Zijlstra <[email protected]>
>> Cc: Ingo Molnar <[email protected]>
>> Cc: Thomas Gleixner <[email protected]>
>> Cc: Matt Fleming <[email protected]>
>> Signed-off-by: Wanpeng Li <[email protected]>

Reviewed-by: Daniel Bristot de Oliveira <[email protected]>

-- Daniel

2017-03-15 11:08:49

by Juri Lelli

[permalink] [raw]
Subject: Re: [PATCH] sched/deadline: Add missing update_rq_clock() in dl_task_timer()

Hi,

On 06/03/17 21:51, Wanpeng Li wrote:
> From: Wanpeng Li <[email protected]>
>
> The following warning can be triggered by hot-unplugging the CPU
> on which an active SCHED_DEADLINE task is running on:
>
> ------------[ cut here ]------------
> WARNING: CPU: 7 PID: 0 at kernel/sched/sched.h:833 replenish_dl_entity+0x71e/0xc40
> rq->clock_update_flags < RQCF_ACT_SKIP
> CPU: 7 PID: 0 Comm: swapper/7 Tainted: G B 4.11.0-rc1+ #24
> Hardware name: LENOVO ThinkCentre M8500t-N000/SHARKBAY, BIOS FBKTC1AUS 02/16/2016
> Call Trace:
> <IRQ>
> dump_stack+0x85/0xc4
> __warn+0x172/0x1b0
> warn_slowpath_fmt+0xb4/0xf0
> ? __warn+0x1b0/0x1b0
> ? debug_check_no_locks_freed+0x2c0/0x2c0
> ? cpudl_set+0x3d/0x2b0
> replenish_dl_entity+0x71e/0xc40
> enqueue_task_dl+0x2ea/0x12e0
> ? dl_task_timer+0x777/0x990
> ? __hrtimer_run_queues+0x270/0xa50
> dl_task_timer+0x316/0x990
> ? enqueue_task_dl+0x12e0/0x12e0
> ? enqueue_task_dl+0x12e0/0x12e0
> __hrtimer_run_queues+0x270/0xa50
> ? hrtimer_cancel+0x20/0x20
> ? hrtimer_interrupt+0x119/0x600
> hrtimer_interrupt+0x19c/0x600
> ? trace_hardirqs_off+0xd/0x10
> local_apic_timer_interrupt+0x74/0xe0
> smp_apic_timer_interrupt+0x76/0xa0
> apic_timer_interrupt+0x93/0xa0
>
> The DL task will be migrated to a suitable later deadline rq once the DL
> timer fires and currnet rq is offline. The rq clock of the new rq should
> be updated. This patch fixes it by updating the rq clock after holding
> the new rq's rq lock.
>
> Cc: Juri Lelli <[email protected]>
> Cc: Peter Zijlstra <[email protected]>
> Cc: Ingo Molnar <[email protected]>
> Cc: Thomas Gleixner <[email protected]>
> Cc: Matt Fleming <[email protected]>
> Signed-off-by: Wanpeng Li <[email protected]>
> ---
> kernel/sched/deadline.c | 1 +
> 1 file changed, 1 insertion(+)
>
> diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
> index 99b2c33..c6db3fd 100644
> --- a/kernel/sched/deadline.c
> +++ b/kernel/sched/deadline.c
> @@ -638,6 +638,7 @@ static enum hrtimer_restart dl_task_timer(struct hrtimer *timer)
> lockdep_unpin_lock(&rq->lock, rf.cookie);
> rq = dl_task_offline_migration(rq, p);
> rf.cookie = lockdep_pin_lock(&rq->lock);
> + update_rq_clock(rq);

Looks good to me.

Acked-by: Juri Lelli <[email protected]>

Thanks,

- Juri

Subject: [tip:sched/core] sched/deadline: Add missing update_rq_clock() in dl_task_timer()

Commit-ID: dcc3b5ffe1b32771c9a22e2c916fb94c4fcf5b79
Gitweb: http://git.kernel.org/tip/dcc3b5ffe1b32771c9a22e2c916fb94c4fcf5b79
Author: Wanpeng Li <[email protected]>
AuthorDate: Mon, 6 Mar 2017 21:51:28 -0800
Committer: Ingo Molnar <[email protected]>
CommitDate: Thu, 16 Mar 2017 09:20:59 +0100

sched/deadline: Add missing update_rq_clock() in dl_task_timer()

The following warning can be triggered by hot-unplugging the CPU
on which an active SCHED_DEADLINE task is running on:

------------[ cut here ]------------
WARNING: CPU: 7 PID: 0 at kernel/sched/sched.h:833 replenish_dl_entity+0x71e/0xc40
rq->clock_update_flags < RQCF_ACT_SKIP
CPU: 7 PID: 0 Comm: swapper/7 Tainted: G B 4.11.0-rc1+ #24
Hardware name: LENOVO ThinkCentre M8500t-N000/SHARKBAY, BIOS FBKTC1AUS 02/16/2016
Call Trace:
<IRQ>
dump_stack+0x85/0xc4
__warn+0x172/0x1b0
warn_slowpath_fmt+0xb4/0xf0
? __warn+0x1b0/0x1b0
? debug_check_no_locks_freed+0x2c0/0x2c0
? cpudl_set+0x3d/0x2b0
replenish_dl_entity+0x71e/0xc40
enqueue_task_dl+0x2ea/0x12e0
? dl_task_timer+0x777/0x990
? __hrtimer_run_queues+0x270/0xa50
dl_task_timer+0x316/0x990
? enqueue_task_dl+0x12e0/0x12e0
? enqueue_task_dl+0x12e0/0x12e0
__hrtimer_run_queues+0x270/0xa50
? hrtimer_cancel+0x20/0x20
? hrtimer_interrupt+0x119/0x600
hrtimer_interrupt+0x19c/0x600
? trace_hardirqs_off+0xd/0x10
local_apic_timer_interrupt+0x74/0xe0
smp_apic_timer_interrupt+0x76/0xa0
apic_timer_interrupt+0x93/0xa0

The DL task will be migrated to a suitable later deadline rq once the DL
timer fires and currnet rq is offline. The rq clock of the new rq should
be updated. This patch fixes it by updating the rq clock after holding
the new rq's rq lock.

Signed-off-by: Wanpeng Li <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Reviewed-by: Matt Fleming <[email protected]>
Cc: Juri Lelli <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Mike Galbraith <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
---
kernel/sched/deadline.c | 1 +
1 file changed, 1 insertion(+)

diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
index 99b2c33..c6db3fd 100644
--- a/kernel/sched/deadline.c
+++ b/kernel/sched/deadline.c
@@ -638,6 +638,7 @@ static enum hrtimer_restart dl_task_timer(struct hrtimer *timer)
lockdep_unpin_lock(&rq->lock, rf.cookie);
rq = dl_task_offline_migration(rq, p);
rf.cookie = lockdep_pin_lock(&rq->lock);
+ update_rq_clock(rq);

/*
* Now that the task has been migrated to the new RQ and we