On Wed, Sep 21, 2016 at 02:38:07PM +0100, Matt Fleming wrote:
> detach_task_cfs_rq() may indirectly call rq_clock() to inform the
> cpufreq code that the rq utilisation has changed. In which case, we
> need to update the rq clock.

Hurm, so it would've been good to know the callchain that got you
there.

There's two functions that use detach_task_cfs_rq(), one is through
sched_change_group() and that does indeed lack a rq_clock update.

The other is through switched_from() where it's far harder (but still
possible afaict) to miss the update.

Now, neither case is really a fast path, but it would be good to try
and avoid too many update_rq_clock() calls in the same rq-lock section.
So I'm not entirely sure about the placement here.

But let me go stare at the actual debug framework thing first.. I think
this patch is fallout/fixups from that.

On Mon, 03 Oct, at 02:49:07PM, Peter Zijlstra wrote:
> On Wed, Sep 21, 2016 at 02:38:07PM +0100, Matt Fleming wrote:
> > detach_task_cfs_rq() may indirectly call rq_clock() to inform the
> > cpufreq code that the rq utilisation has changed. In which case, we
> > need to update the rq clock.
>
> Hurm, so it would've been good to know the callchain that got you
> there.
>
> There's two functions that use detach_task_cfs_rq(), one is through
> sched_change_group() and that does indeed lack a rq_clock update.
>
> The other is through switched_from() where it's far harder (but still
> possible afaict) to miss the update.

It was the former callchain.

> Now, neither case is really a fast path, but it would be good to try
> and avoid too many update_rq_clock() calls in the same rq-lock section.
> So I'm not entirely sure about the placement here.
>
> But let me go stare at the actual debug framework thing first.. I think
> this patch is fallout/fixups from that.

On Mon, Oct 03, 2016 at 03:37:45PM +0100, Matt Fleming wrote:
> On Mon, 03 Oct, at 02:49:07PM, Peter Zijlstra wrote:
> > The other is through switched_from() where it's far harder (but still
> > possible afaict) to miss the update.
>
> It was the former callchain.

Yep, just found it ;-)

I seem to hit a few you didn't as well.. let me prod at this a wee bit
more before I add more asserts..

WARNING: CPU: 0 PID: 1 at ../kernel/sched/sched.h:797 detach_task_cfs_rq+0x6fe/0x930
rq->clock_update_flags < RQCF_ACT_SKIP
Modules linked in:
CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.8.0-00637-g67223e2-dirty #553
Hardware name: Intel Corporation S2600GZ/S2600GZ, BIOS SE5C600.86B.02.02.0002.122320131210 12/23/2013
ffffc900000cbc00 ffffffff816152c5 ffffc900000cbc50 0000000000000000
ffffc900000cbc40 ffffffff810d5bab 0000031d00017b00 ffff88042f817b68
ffff88042dbb0000 ffff88042f817b00 ffff88042dbb0000 ffffffff81c18900
Call Trace:
[<ffffffff816152c5>] dump_stack+0x67/0x92
[<ffffffff810d5bab>] __warn+0xcb/0xf0
[<ffffffff810d5c1f>] warn_slowpath_fmt+0x4f/0x60
[<ffffffff8110efae>] detach_task_cfs_rq+0x6fe/0x930
[<ffffffff8110f1f1>] switched_from_fair+0x11/0x20
[<ffffffff810fde77>] __sched_setscheduler+0x2a7/0xb40
[<ffffffff810fe779>] _sched_setscheduler+0x69/0x70
[<ffffffff810ff243>] sched_set_stop_task+0x53/0x90
[<ffffffff81173703>] cpu_stop_create+0x23/0x30
[<ffffffff810f90c0>] __smpboot_create_thread.part.2+0xb0/0x100
[<ffffffff810f91ef>] smpboot_register_percpu_thread_cpumask+0xdf/0x140
[<ffffffff823c24e7>] ? pid_namespaces_init+0x40/0x40
[<ffffffff823c254b>] cpu_stop_init+0x64/0x9b
[<ffffffff8100040d>] do_one_initcall+0x3d/0x150
[<ffffffff8107763d>] ? print_cpu_info+0x7d/0xe0
[<ffffffff823a0001>] kernel_init_freeable+0xcc/0x207
[<ffffffff81a7d8d0>] ? rest_init+0x90/0x90
[<ffffffff81a7d8de>] kernel_init+0xe/0x100
[<ffffffff81a89bc7>] ret_from_fork+0x27/0x40
---[ end trace 90bea7c93d2289cb ]---
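
For readers following along, the condition printed in the warning above
(rq->clock_update_flags < RQCF_ACT_SKIP) can be modelled in user space.
The sketch below is not the kernel code: the toy_* names and the warnings
counter are hypothetical, and the flag values are assumptions based on
what kernel/sched/sched.h is believed to define. It only illustrates the
idea discussed in the thread, namely that reading the rq clock inside an
rq-lock section without a preceding update_rq_clock() trips the assert.

```c
#include <stdio.h>

/* Flag values are assumptions modelled on kernel/sched/sched.h. */
enum {
	RQCF_REQ_SKIP = 0x01,
	RQCF_ACT_SKIP = 0x02,
	RQCF_UPDATED  = 0x04,
};

/* Toy stand-in for struct rq; only the fields this sketch needs. */
struct toy_rq {
	unsigned int clock_update_flags;
	unsigned long long clock;
	unsigned int warnings;	/* hypothetical: counts would-be WARNs */
};

/* Entering a new rq-lock section invalidates any earlier clock update. */
static void toy_rq_lock(struct toy_rq *rq)
{
	rq->clock_update_flags &= ~(unsigned int)RQCF_UPDATED;
}

/* Stand-in for update_rq_clock(): advance the clock and mark it fresh. */
static void toy_update_rq_clock(struct toy_rq *rq)
{
	rq->clock++;
	rq->clock_update_flags |= RQCF_UPDATED;
}

/* Stand-in for rq_clock(): warn if no update happened in this section. */
static unsigned long long toy_rq_clock(struct toy_rq *rq)
{
	if (rq->clock_update_flags < RQCF_ACT_SKIP) {
		rq->warnings++;
		fprintf(stderr, "rq->clock_update_flags < RQCF_ACT_SKIP\n");
	}
	return rq->clock;
}

/*
 * Model of a detach-like path that ends up reading the rq clock.
 * Returns how many warnings fired.
 */
unsigned int toy_detach(int update_first)
{
	struct toy_rq rq = { 0 };

	toy_rq_lock(&rq);
	if (update_first)
		toy_update_rq_clock(&rq);
	(void)toy_rq_clock(&rq);	/* stands in for the cpufreq hook */
	return rq.warnings;
}
```

Calling toy_detach(0) models the sched_change_group() path that lacks
the update and records one warning; toy_detach(1) models the same path
with an update_rq_clock() added first and records none.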