2021-08-04 14:02:00

by Dietmar Eggemann

Subject: [PATCH] sched/deadline: Fix missing clock update in migrate_task_rq_dl()

A missing clock update is causing the following warning:

rq->clock_update_flags < RQCF_ACT_SKIP
WARNING: CPU: 112 PID: 2041 at kernel/sched/sched.h:1453
sub_running_bw.isra.0+0x190/0x1a0
...
CPU: 112 PID: 2041 Comm: sugov:112 Tainted: G W 5.14.0-rc1 #1
Hardware name: WIWYNN Mt.Jade Server System
B81.030Z1.0007/Mt.Jade Motherboard, BIOS 1.6.20210526 (SCP:
1.06.20210526) 2021/05/26
...
Call trace:
sub_running_bw.isra.0+0x190/0x1a0
migrate_task_rq_dl+0xf8/0x1e0
set_task_cpu+0xa8/0x1f0
try_to_wake_up+0x150/0x3d4
wake_up_q+0x64/0xc0
__up_write+0xd0/0x1c0
up_write+0x4c/0x2b0
cppc_set_perf+0x120/0x2d0
cppc_cpufreq_set_target+0xe0/0x1a4 [cppc_cpufreq]
__cpufreq_driver_target+0x74/0x140
sugov_work+0x64/0x80
kthread_worker_fn+0xe0/0x230
kthread+0x138/0x140
ret_from_fork+0x10/0x18

The task causing this is the `cppc_fie` DL task introduced by
commit 1eb5dde674f5 ("cpufreq: CPPC: Add support for frequency
invariance").

With CONFIG_ACPI_CPPC_CPUFREQ_FIE=y and the schedutil cpufreq governor on a
slow-switching system (such as this Ampere Altra WIWYNN Mt. Jade Arm
Server):

DL task `curr=sugov:112` lets `p=cppc_fie` migrate and since the latter
is in `non_contending` state, migrate_task_rq_dl() calls

sub_running_bw()->__sub_running_bw()->cpufreq_update_util()->
rq_clock()->assert_clock_updated()

on p.

Fix this by updating the clock for a non_contending task in
migrate_task_rq_dl() before calling sub_running_bw().

Reported-by: Bruno Goncalves <[email protected]>
Signed-off-by: Dietmar Eggemann <[email protected]>
---
kernel/sched/deadline.c | 1 +
1 file changed, 1 insertion(+)

diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
index aaacd6cfd42f..4920f498492f 100644
--- a/kernel/sched/deadline.c
+++ b/kernel/sched/deadline.c
@@ -1733,6 +1733,7 @@ static void migrate_task_rq_dl(struct task_struct *p, int new_cpu __maybe_unused
 	 */
 	raw_spin_rq_lock(rq);
 	if (p->dl.dl_non_contending) {
+		update_rq_clock(rq);
 		sub_running_bw(&p->dl, &rq->dl);
 		p->dl.dl_non_contending = 0;
 		/*
--
2.25.1
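
The mechanism behind the warning can be sketched in user space. The block
below is a toy model, not kernel code: only the RQCF_* flag values and the
`clock_update_flags < RQCF_ACT_SKIP` check are taken from
kernel/sched/sched.h, while `struct toy_rq` and every `toy_*` function are
hypothetical stand-ins. It assumes the flags are clear on entry, i.e. that
nothing has refreshed the runqueue clock before the migration path runs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Flag values as in kernel/sched/sched.h; the rest is a toy model. */
#define RQCF_REQ_SKIP 0x01
#define RQCF_ACT_SKIP 0x02
#define RQCF_UPDATED  0x04

struct toy_rq {
	unsigned int clock_update_flags;
	uint64_t clock;        /* last sampled time, arbitrary units */
	uint64_t running_bw;   /* sum of active deadline bandwidth */
	unsigned int warnings; /* number of times the check fired */
};

/* Models assert_clock_updated(): complain if the clock was not refreshed. */
static void toy_assert_clock_updated(struct toy_rq *rq)
{
	if (rq->clock_update_flags < RQCF_ACT_SKIP) {
		rq->warnings++;
		fprintf(stderr, "rq->clock_update_flags < RQCF_ACT_SKIP\n");
	}
}

/* Models rq_clock(): readers must only see an up-to-date clock. */
static uint64_t toy_rq_clock(struct toy_rq *rq)
{
	toy_assert_clock_updated(rq);
	return rq->clock;
}

/* Models update_rq_clock(): refresh the clock and record that we did. */
static void toy_update_rq_clock(struct toy_rq *rq, uint64_t now)
{
	rq->clock = now;
	rq->clock_update_flags |= RQCF_UPDATED;
}

/* Models sub_running_bw() -> cpufreq_update_util() -> rq_clock(). */
static void toy_sub_running_bw(struct toy_rq *rq, uint64_t dl_bw)
{
	/* Guard against underflow; the kernel warns here instead of aborting. */
	assert(rq->running_bw >= dl_bw);
	rq->running_bw -= dl_bw;
	(void)toy_rq_clock(rq);
}

/*
 * Models the non_contending branch of migrate_task_rq_dl().
 * Returns how many times the clock check fired.
 */
static unsigned int toy_migrate(bool with_fix)
{
	struct toy_rq rq = {
		.clock_update_flags = 0, /* assumption: no update so far */
		.clock = 100,
		.running_bw = 50,
		.warnings = 0,
	};

	if (with_fix)
		toy_update_rq_clock(&rq, 200); /* the one-line fix */
	toy_sub_running_bw(&rq, 50);
	return rq.warnings;
}
```

In this model, toy_migrate(false) trips the check once and toy_migrate(true)
does not, which mirrors the effect of calling update_rq_clock() before
sub_running_bw() in the patch above.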


Subject: Re: [PATCH] sched/deadline: Fix missing clock update in migrate_task_rq_dl()

On 8/4/21 3:59 PM, Dietmar Eggemann wrote:
> [...]
>
> Reported-by: Bruno Goncalves <[email protected]>
> Signed-off-by: Dietmar Eggemann <[email protected]>

Reviewed-by: Daniel Bristot de Oliveira <[email protected]>

-- Daniel

2021-08-05 09:40:16

by Peter Zijlstra

Subject: Re: [PATCH] sched/deadline: Fix missing clock update in migrate_task_rq_dl()

On Thu, Aug 05, 2021 at 10:16:26AM +0200, Daniel Bristot de Oliveira wrote:
> On 8/4/21 3:59 PM, Dietmar Eggemann wrote:
> > [...]
> >
> > Reported-by: Bruno Goncalves <[email protected]>
> > Signed-off-by: Dietmar Eggemann <[email protected]>
>
> Reviewed-by: Daniel Bristot de Oliveira <[email protected]>

Thanks!

2021-08-05 12:56:25

by Juri Lelli

Subject: Re: [PATCH] sched/deadline: Fix missing clock update in migrate_task_rq_dl()

Hi,

On 04/08/21 15:59, Dietmar Eggemann wrote:
> [...]
>
> Reported-by: Bruno Goncalves <[email protected]>
> Signed-off-by: Dietmar Eggemann <[email protected]>

Acked-by: Juri Lelli <[email protected]>

Thanks!
Juri

Subject: [tip: sched/core] sched/deadline: Fix missing clock update in migrate_task_rq_dl()

The following commit has been merged into the sched/core branch of tip:

Commit-ID: b4da13aa28d4fd0071247b7b41c579ee8a86c81a
Gitweb: https://git.kernel.org/tip/b4da13aa28d4fd0071247b7b41c579ee8a86c81a
Author: Dietmar Eggemann <[email protected]>
AuthorDate: Wed, 04 Aug 2021 15:59:25 +02:00
Committer: Peter Zijlstra <[email protected]>
CommitterDate: Fri, 06 Aug 2021 14:25:24 +02:00

sched/deadline: Fix missing clock update in migrate_task_rq_dl()

A missing clock update is causing the following warning:

rq->clock_update_flags < RQCF_ACT_SKIP
WARNING: CPU: 112 PID: 2041 at kernel/sched/sched.h:1453
sub_running_bw.isra.0+0x190/0x1a0
...
CPU: 112 PID: 2041 Comm: sugov:112 Tainted: G W 5.14.0-rc1 #1
Hardware name: WIWYNN Mt.Jade Server System
B81.030Z1.0007/Mt.Jade Motherboard, BIOS 1.6.20210526 (SCP:
1.06.20210526) 2021/05/26
...
Call trace:
sub_running_bw.isra.0+0x190/0x1a0
migrate_task_rq_dl+0xf8/0x1e0
set_task_cpu+0xa8/0x1f0
try_to_wake_up+0x150/0x3d4
wake_up_q+0x64/0xc0
__up_write+0xd0/0x1c0
up_write+0x4c/0x2b0
cppc_set_perf+0x120/0x2d0
cppc_cpufreq_set_target+0xe0/0x1a4 [cppc_cpufreq]
__cpufreq_driver_target+0x74/0x140
sugov_work+0x64/0x80
kthread_worker_fn+0xe0/0x230
kthread+0x138/0x140
ret_from_fork+0x10/0x18

The task causing this is the `cppc_fie` DL task introduced by
commit 1eb5dde674f5 ("cpufreq: CPPC: Add support for frequency
invariance").

With CONFIG_ACPI_CPPC_CPUFREQ_FIE=y and the schedutil cpufreq governor on a
slow-switching system (such as this Ampere Altra WIWYNN Mt. Jade Arm
Server):

DL task `curr=sugov:112` lets `p=cppc_fie` migrate and since the latter
is in `non_contending` state, migrate_task_rq_dl() calls

sub_running_bw()->__sub_running_bw()->cpufreq_update_util()->
rq_clock()->assert_clock_updated()

on p.

Fix this by updating the clock for a non_contending task in
migrate_task_rq_dl() before calling sub_running_bw().

Reported-by: Bruno Goncalves <[email protected]>
Signed-off-by: Dietmar Eggemann <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Reviewed-by: Daniel Bristot de Oliveira <[email protected]>
Acked-by: Juri Lelli <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
---
kernel/sched/deadline.c | 1 +
1 file changed, 1 insertion(+)

diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
index 5cafc64..e943146 100644
--- a/kernel/sched/deadline.c
+++ b/kernel/sched/deadline.c
@@ -1733,6 +1733,7 @@ static void migrate_task_rq_dl(struct task_struct *p, int new_cpu __maybe_unused
 	 */
 	raw_spin_rq_lock(rq);
 	if (p->dl.dl_non_contending) {
+		update_rq_clock(rq);
 		sub_running_bw(&p->dl, &rq->dl);
 		p->dl.dl_non_contending = 0;
 		/*