Hi Vincent,
I have observed issues running on linus/master from a few days back [1].
I'm running on a Renesas Koelsch board (arm32) and I can trigger an issue
by X forwarding the v4l2 test application qv4l2 over ssh and moving the
cursor around in the GUI (best test case description award...). I'm
sorry about the really bad way I trigger this, but I can't do it in any
other way; I'm happy to try other methods if you have some ideas. The
symptom of the issue is a complete hang of the system for more than 30
seconds, and then this information is printed on the console:
[ 142.849390] INFO: rcu_sched detected stalls on CPUs/tasks:
[ 142.854972] 1-...!: (1 GPs behind) idle=7a4/0/0 softirq=3214/3217 fqs=0
[ 142.861976] (detected by 0, t=8232 jiffies, g=930, c=929, q=11)
[ 142.868042] Sending NMI from CPU 0 to CPUs 1:
[ 142.872443] NMI backtrace for cpu 1
[ 142.872452] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 4.16.0-05506-g28aba11c1393691a #14
[ 142.872455] Hardware name: Generic R8A7791 (Flattened Device Tree)
[ 142.872473] PC is at arch_cpu_idle+0x28/0x44
[ 142.872484] LR is at trace_hardirqs_on_caller+0x1a4/0x1d4
[ 142.872488] pc : [<c01088a4>] lr : [<c01747a8>] psr: 20070013
[ 142.872491] sp : eb0b9f90 ip : eb0b9f60 fp : eb0b9f9c
[ 142.872495] r10: 00000000 r9 : 413fc0f2 r8 : 4000406a
[ 142.872498] r7 : c0c08478 r6 : c0c0842c r5 : ffffe000 r4 : 00000002
[ 142.872502] r3 : eb0b6ec0 r2 : 00000000 r1 : 00000004 r0 : 00000001
[ 142.872507] Flags: nzCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment none
[ 142.872511] Control: 10c5387d Table: 6a61406a DAC: 00000051
[ 142.872516] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 4.16.0-05506-g28aba11c1393691a #14
[ 142.872519] Hardware name: Generic R8A7791 (Flattened Device Tree)
[ 142.872522] Backtrace:
[ 142.872534] [<c010cbf8>] (dump_backtrace) from [<c010ce78>] (show_stack+0x18/0x1c)
[ 142.872540] r7:c0c81388 r6:00000000 r5:60070193 r4:c0c81388
[ 142.872550] [<c010ce60>] (show_stack) from [<c07dec20>] (dump_stack+0xa4/0xd8)
[ 142.872557] [<c07deb7c>] (dump_stack) from [<c0108b38>] (show_regs+0x14/0x18)
[ 142.872563] r9:00000001 r8:00000000 r7:c0c4f678 r6:eb0b9f40 r5:00000001 r4:c13e1130
[ 142.872571] [<c0108b24>] (show_regs) from [<c07e4cd0>] (nmi_cpu_backtrace+0xfc/0x118)
[ 142.872578] [<c07e4bd4>] (nmi_cpu_backtrace) from [<c010fb28>] (handle_IPI+0x2a8/0x320)
[ 142.872583] r7:c0c4f678 r6:eb0b9f40 r5:00000007 r4:c0b75b68
[ 142.872594] [<c010f880>] (handle_IPI) from [<c03cb528>] (gic_handle_irq+0x8c/0x98)
[ 142.872599] r10:00000000 r9:eb0b8000 r8:f0803000 r7:c0c4f678 r6:eb0b9f40 r5:c0c08a90
[ 142.872602] r4:f0802000
[ 142.872608] [<c03cb49c>] (gic_handle_irq) from [<c01019f0>] (__irq_svc+0x70/0x98)
[ 142.872612] Exception stack(0xeb0b9f40 to 0xeb0b9f88)
[ 142.872618] 9f40: 00000001 00000004 00000000 eb0b6ec0 00000002 ffffe000 c0c0842c c0c08478
[ 142.872624] 9f60: 4000406a 413fc0f2 00000000 eb0b9f9c eb0b9f60 eb0b9f90 c01747a8 c01088a4
[ 142.872627] 9f80: 20070013 ffffffff
[ 142.872632] r9:eb0b8000 r8:4000406a r7:eb0b9f74 r6:ffffffff r5:20070013 r4:c01088a4
[ 142.872642] [<c010887c>] (arch_cpu_idle) from [<c07fb1a8>] (default_idle_call+0x34/0x38)
[ 142.872650] [<c07fb174>] (default_idle_call) from [<c0155fe0>] (do_idle+0xe0/0x134)
[ 142.872656] [<c0155f00>] (do_idle) from [<c0156398>] (cpu_startup_entry+0x20/0x24)
[ 142.872660] r7:c0c8e9d0 r6:10c0387d r5:00000051 r4:00000085
[ 142.872667] [<c0156378>] (cpu_startup_entry) from [<c010f644>] (secondary_start_kernel+0x114/0x134)
[ 142.872673] [<c010f530>] (secondary_start_kernel) from [<401026ec>] (0x401026ec)
[ 142.872676] r5:00000051 r4:6b0a406a
[ 142.873456] rcu_sched kthread starved for 8235 jiffies! g930 c929 f0x0 RCU_GP_WAIT_FQS(3) ->state=0x402 ->cpu=0
[ 143.135040] RCU grace-period kthread stack dump:
[ 143.139695] rcu_sched I 0 9 2 0x00000000
[ 143.145234] Backtrace:
[ 143.147719] [<c07f5120>] (__schedule) from [<c07f5b3c>] (schedule+0x94/0xb8)
[ 143.154823] r10:c0b714c0 r9:c0c85f8a r8:ffffffff r7:eb0abec4 r6:ffffa274 r5:00000000
[ 143.162712] r4:ffffe000
[ 143.165273] [<c07f5aa8>] (schedule) from [<c07fa940>] (schedule_timeout+0x440/0x4b0)
[ 143.173076] r5:00000000 r4:eb79c4c0
[ 143.176692] [<c07fa500>] (schedule_timeout) from [<c0196e08>] (rcu_gp_kthread+0x958/0x150c)
[ 143.185108] r10:c0c87274 r9:00000000 r8:c0c165b8 r7:00000001 r6:00000000 r5:c0c16590
[ 143.192997] r4:c0c16300
[ 143.195560] [<c01964b0>] (rcu_gp_kthread) from [<c0145eb8>] (kthread+0x148/0x160)
[ 143.203099] r7:c0c16300
[ 143.205660] [<c0145d70>] (kthread) from [<c01010b4>] (ret_from_fork+0x14/0x20)
[ 143.212938] Exception stack(0xeb0abfb0 to 0xeb0abff8)
[ 143.218030] bfa0: 00000000 00000000 00000000 00000000
[ 143.226271] bfc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[ 143.234511] bfe0: 00000000 00000000 00000000 00000000 00000013 00000000
[ 143.241177] r10:00000000 r9:00000000 r8:00000000 r7:00000000 r6:00000000 r5:c0145d70
[ 143.249065] r4:eb037b00
After the freeze the system becomes responsive again, and I can sometimes
trigger the hang multiple times. I tried to bisect the problem and
found that by reverting [2] I can no longer reproduce the issue. I also
cannot reproduce the issue on v4.16. I can't figure out if reverting
[2] just treats a symptom or addresses the root cause of my troubles and
would appreciate your input. Also, my "test case" does not trigger every
time, but I have tested this scenario quite a lot and the result seems to
be consistent.
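For reference, [2] adds a newly-idle path that briefly drops and re-takes the
runqueue lock to update the blocked load of idle CPUs just before a CPU enters
idle. This is only my reading of the merge-window sources, sketched from
memory, so the exact tree I'm testing may differ in details:

static void nohz_newidle_balance(struct rq *this_rq)
{
	int this_cpu = cpu_of(this_rq);

	/* This CPU doesn't want to be disturbed by scheduler housekeeping */
	if (!housekeeping_cpu(this_cpu, HK_FLAG_SCHED))
		return;

	/* Will wake up very soon, no time for doing anything else */
	if (this_rq->avg_idle < sysctl_sched_migration_cost)
		return;

	/* No need to update the blocked load of idle CPUs */
	if (!READ_ONCE(nohz.has_blocked) ||
	    time_before(jiffies, READ_ONCE(nohz.next_blocked)))
		return;

	/*
	 * Run the ilb locally before entering idle: this CPU is a good
	 * candidate, instead of waking up another idle CPU. Kick a
	 * normal ilb if the local update fails.
	 */
	raw_spin_unlock(&this_rq->lock);
	if (!_nohz_idle_balance(this_rq, NOHZ_STATS_KICK, CPU_NEWLY_IDLE))
		kick_ilb(NOHZ_STATS_KICK);
	raw_spin_lock(&this_rq->lock);
}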
My test setup involves an NFS root filesystem; I ssh in and launch the
GUI application over X forwarding. As far as I know the application is
not doing any ioctl calls to the v4l2 framework, it's just sitting there
idle as I move the cursor around, showing tool tips and such as I hover
over elements, and then it freezes up. I have not observed this issue by
just booting the system and leaving it idle; movement in the GUI seems
to be the key to triggering this.
I'm a bit lost on how to progress with this issue and would appreciate
any help you can provide in figuring it out.
1. c18bb396d3d261eb ("Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net")
2. 31e77c93e432dec7 ("sched/fair: Update blocked load when newly idle")
--
Regards,
Niklas Söderlund
Hi Niklas,
On 12 April 2018 at 11:18, Niklas Söderlund
<[email protected]> wrote:
> Hi Vincent,
>
> I have observed issues running on linus/master from a few days back [1].
> I'm running on a Renesas Koelsch board (arm32) and I can trigger an issue
> by X forwarding the v4l2 test application qv4l2 over ssh and moving the
> cursor around in the GUI (best test case description award...). I'm
> sorry about the really bad way I trigger this, but I can't do it in any
> other way; I'm happy to try other methods if you have some ideas. The
> symptom of the issue is a complete hang of the system for more than 30
> seconds, and then this information is printed on the console:
Heiner (added to cc) also reported a similar problem with his platform: a
dual core Celeron.
Can you confirm that your platform is a dual Cortex-A15? At least that's
what I have seen on the web.
This would confirm that a dual-core system is a key point.
The ssh connection is also common with Heiner's setup.
>
> [ 142.849390] INFO: rcu_sched detected stalls on CPUs/tasks:
> [ 142.854972] 1-...!: (1 GPs behind) idle=7a4/0/0 softirq=3214/3217 fqs=0
> [ 142.861976] (detected by 0, t=8232 jiffies, g=930, c=929, q=11)
> [ 142.868042] Sending NMI from CPU 0 to CPUs 1:
> [ 142.872443] NMI backtrace for cpu 1
> [ 142.872452] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 4.16.0-05506-g28aba11c1393691a #14
> [ 142.872455] Hardware name: Generic R8A7791 (Flattened Device Tree)
> [ 142.872473] PC is at arch_cpu_idle+0x28/0x44
> [ 142.872484] LR is at trace_hardirqs_on_caller+0x1a4/0x1d4
> [ 142.872488] pc : [<c01088a4>] lr : [<c01747a8>] psr: 20070013
> [ 142.872491] sp : eb0b9f90 ip : eb0b9f60 fp : eb0b9f9c
> [ 142.872495] r10: 00000000 r9 : 413fc0f2 r8 : 4000406a
> [ 142.872498] r7 : c0c08478 r6 : c0c0842c r5 : ffffe000 r4 : 00000002
> [ 142.872502] r3 : eb0b6ec0 r2 : 00000000 r1 : 00000004 r0 : 00000001
> [ 142.872507] Flags: nzCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment none
> [ 142.872511] Control: 10c5387d Table: 6a61406a DAC: 00000051
> [ 142.872516] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 4.16.0-05506-g28aba11c1393691a #14
> [ 142.872519] Hardware name: Generic R8A7791 (Flattened Device Tree)
> [ 142.872522] Backtrace:
> [ 142.872534] [<c010cbf8>] (dump_backtrace) from [<c010ce78>] (show_stack+0x18/0x1c)
> [ 142.872540] r7:c0c81388 r6:00000000 r5:60070193 r4:c0c81388
> [ 142.872550] [<c010ce60>] (show_stack) from [<c07dec20>] (dump_stack+0xa4/0xd8)
> [ 142.872557] [<c07deb7c>] (dump_stack) from [<c0108b38>] (show_regs+0x14/0x18)
> [ 142.872563] r9:00000001 r8:00000000 r7:c0c4f678 r6:eb0b9f40 r5:00000001 r4:c13e1130
> [ 142.872571] [<c0108b24>] (show_regs) from [<c07e4cd0>] (nmi_cpu_backtrace+0xfc/0x118)
> [ 142.872578] [<c07e4bd4>] (nmi_cpu_backtrace) from [<c010fb28>] (handle_IPI+0x2a8/0x320)
> [ 142.872583] r7:c0c4f678 r6:eb0b9f40 r5:00000007 r4:c0b75b68
> [ 142.872594] [<c010f880>] (handle_IPI) from [<c03cb528>] (gic_handle_irq+0x8c/0x98)
> [ 142.872599] r10:00000000 r9:eb0b8000 r8:f0803000 r7:c0c4f678 r6:eb0b9f40 r5:c0c08a90
> [ 142.872602] r4:f0802000
> [ 142.872608] [<c03cb49c>] (gic_handle_irq) from [<c01019f0>] (__irq_svc+0x70/0x98)
> [ 142.872612] Exception stack(0xeb0b9f40 to 0xeb0b9f88)
> [ 142.872618] 9f40: 00000001 00000004 00000000 eb0b6ec0 00000002 ffffe000 c0c0842c c0c08478
> [ 142.872624] 9f60: 4000406a 413fc0f2 00000000 eb0b9f9c eb0b9f60 eb0b9f90 c01747a8 c01088a4
> [ 142.872627] 9f80: 20070013 ffffffff
> [ 142.872632] r9:eb0b8000 r8:4000406a r7:eb0b9f74 r6:ffffffff r5:20070013 r4:c01088a4
> [ 142.872642] [<c010887c>] (arch_cpu_idle) from [<c07fb1a8>] (default_idle_call+0x34/0x38)
> [ 142.872650] [<c07fb174>] (default_idle_call) from [<c0155fe0>] (do_idle+0xe0/0x134)
> [ 142.872656] [<c0155f00>] (do_idle) from [<c0156398>] (cpu_startup_entry+0x20/0x24)
> [ 142.872660] r7:c0c8e9d0 r6:10c0387d r5:00000051 r4:00000085
> [ 142.872667] [<c0156378>] (cpu_startup_entry) from [<c010f644>] (secondary_start_kernel+0x114/0x134)
> [ 142.872673] [<c010f530>] (secondary_start_kernel) from [<401026ec>] (0x401026ec)
> [ 142.872676] r5:00000051 r4:6b0a406a
> [ 142.873456] rcu_sched kthread starved for 8235 jiffies! g930 c929 f0x0 RCU_GP_WAIT_FQS(3) ->state=0x402 ->cpu=0
> [ 143.135040] RCU grace-period kthread stack dump:
> [ 143.139695] rcu_sched I 0 9 2 0x00000000
> [ 143.145234] Backtrace:
> [ 143.147719] [<c07f5120>] (__schedule) from [<c07f5b3c>] (schedule+0x94/0xb8)
> [ 143.154823] r10:c0b714c0 r9:c0c85f8a r8:ffffffff r7:eb0abec4 r6:ffffa274 r5:00000000
> [ 143.162712] r4:ffffe000
> [ 143.165273] [<c07f5aa8>] (schedule) from [<c07fa940>] (schedule_timeout+0x440/0x4b0)
> [ 143.173076] r5:00000000 r4:eb79c4c0
> [ 143.176692] [<c07fa500>] (schedule_timeout) from [<c0196e08>] (rcu_gp_kthread+0x958/0x150c)
> [ 143.185108] r10:c0c87274 r9:00000000 r8:c0c165b8 r7:00000001 r6:00000000 r5:c0c16590
> [ 143.192997] r4:c0c16300
> [ 143.195560] [<c01964b0>] (rcu_gp_kthread) from [<c0145eb8>] (kthread+0x148/0x160)
> [ 143.203099] r7:c0c16300
> [ 143.205660] [<c0145d70>] (kthread) from [<c01010b4>] (ret_from_fork+0x14/0x20)
> [ 143.212938] Exception stack(0xeb0abfb0 to 0xeb0abff8)
> [ 143.218030] bfa0: 00000000 00000000 00000000 00000000
> [ 143.226271] bfc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
> [ 143.234511] bfe0: 00000000 00000000 00000000 00000000 00000013 00000000
> [ 143.241177] r10:00000000 r9:00000000 r8:00000000 r7:00000000 r6:00000000 r5:c0145d70
> [ 143.249065] r4:eb037b00
>
> After the freeze the system becomes responsive again, and I can sometimes
> trigger the hang multiple times. I tried to bisect the problem and
> found that by reverting [2] I can no longer reproduce the issue. I also
> cannot reproduce the issue on v4.16. I can't figure out if reverting
> [2] just treats a symptom or addresses the root cause of my troubles and
> would appreciate your input. Also, my "test case" does not trigger every
> time, but I have tested this scenario quite a lot and the result seems to
> be consistent.
>
> My test setup involves an NFS root filesystem; I ssh in and launch the
> GUI application over X forwarding. As far as I know the application is
> not doing any ioctl calls to the v4l2 framework, it's just sitting there
> idle as I move the cursor around, showing tool tips and such as I hover
> over elements, and then it freezes up. I have not observed this issue by
> just booting the system and leaving it idle; movement in the GUI seems
> to be the key to triggering this.
>
> I'm a bit lost on how to progress with this issue and would appreciate
> any help you can provide in figuring it out.
Can you send me your config?
I'm going to prepare a debug patch to spy on what's happening when entering idle.
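Just to illustrate the idea (this is NOT the actual patch, only a
hypothetical sketch of the kind of instrumentation I have in mind), it would
trace the newly-idle path so the last action before a stall shows up in the
ftrace buffer:

--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ ... @@ static void nohz_newidle_balance(struct rq *this_rq)
 	raw_spin_unlock(&this_rq->lock);
+	/* hypothetical debug aid: log every newly-idle stats update */
+	trace_printk("newidle: cpu%d avg_idle=%llu has_blocked=%d\n",
+		     cpu_of(this_rq), this_rq->avg_idle,
+		     READ_ONCE(nohz.has_blocked));
 	if (!_nohz_idle_balance(this_rq, NOHZ_STATS_KICK, CPU_NEWLY_IDLE))
 		kick_ilb(NOHZ_STATS_KICK);
 	raw_spin_lock(&this_rq->lock);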
Regards,
Vincent
>
> 1. c18bb396d3d261eb ("Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net")
> 2. 31e77c93e432dec7 ("sched/fair: Update blocked load when newly idle")
>
> --
> Regards,
> Niklas Söderlund
Hi Vincent,
Thanks for your feedback.
On 2018-04-12 12:33:27 +0200, Vincent Guittot wrote:
> Hi Niklas,
>
> On 12 April 2018 at 11:18, Niklas Söderlund
> <[email protected]> wrote:
> > Hi Vincent,
> >
> > I have observed issues running on linus/master from a few days back [1].
> > I'm running on a Renesas Koelsch board (arm32) and I can trigger an issue
> > by X forwarding the v4l2 test application qv4l2 over ssh and moving the
> > cursor around in the GUI (best test case description award...). I'm
> > sorry about the really bad way I trigger this, but I can't do it in any
> > other way; I'm happy to try other methods if you have some ideas. The
> > symptom of the issue is a complete hang of the system for more than 30
> > seconds, and then this information is printed on the console:
>
> Heiner (added to cc) also reported a similar problem with his platform: a
> dual core Celeron.
>
> Can you confirm that your platform is a dual Cortex-A15? At least that's
> what I have seen on the web.
> This would confirm that a dual-core system is a key point.
I can confirm that my platform is a dual core.
>
> The ssh connection is also common with Heiner's setup
Interesting. I found Heiner's mail and I can confirm that I too
experience ssh session lockups. I ssh into the system, and by repeatedly
hitting the return key I can lock up the board; while it is locked up,
starting another ssh session unblocks the first. If I don't start another
ssh session but keep hitting the return key sporadically in the first one,
I can get the trace I reported in my first mail to be printed on the
serial console.

When locked up, the symptoms are that both the single ssh session and the
serial console are dead. But attempting another ssh connection
immediately unblocks both the ssh session and the serial console. And if
I allow enough time before starting the second ssh connection, I can
trigger a trace to be printed on the serial console; it's similar to, but
different from, the first one I reported.
[ 207.548610] 1-...!: (0 ticks this GP) idle=79a/1/1073741824 softirq=2146/2146 fqs=0
[ 207.556442] (detected by 0, t=12645 jiffies, g=333, c=332, q=20)
[ 207.562546] rcu_sched kthread starved for 12645 jiffies! g333 c332 f0x2 RCU_GP_WAIT_FQS(3) ->state=0x0 ->cpu=0
[ 207.572548] RCU grace-period kthread stack dump:
[ 207.577166] rcu_sched R running task 0 9 2 0x00000000
[ 207.584389] Backtrace:
[ 207.586849] [<c0760450>] (__schedule) from [<c0760ba4>] (schedule+0x94/0xb8)
[ 207.593901] r10:e77813c0 r9:e77813c0 r8:ffffffff r7:e709bed4 r6:ffffaa80 r5:00000000
[ 207.601732] r4:ffffe000
[ 207.604269] [<c0760b10>] (schedule) from [<c0764560>] (schedule_timeout+0x380/0x3dc)
[ 207.612013] r5:00000000 r4:00000000
[ 207.615596] [<c07641e0>] (schedule_timeout) from [<c017b698>] (rcu_gp_kthread+0x668/0xe2c)
[ 207.623863] r10:c0b79018 r9:0000014d r8:0000014c r7:00000001 r6:00000000 r5:c0b10ad0
[ 207.631693] r4:c0b10980
[ 207.634230] [<c017b030>] (rcu_gp_kthread) from [<c013ddd8>] (kthread+0x148/0x160)
[ 207.641712] r7:c0b10980
[ 207.644249] [<c013dc90>] (kthread) from [<c01010e8>] (ret_from_fork+0x14/0x2c)
[ 207.651472] Exception stack(0xe709bfb0 to 0xe709bff8)
[ 207.656527] bfa0: 00000000 00000000 00000000 00000000
[ 207.664709] bfc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[ 207.672890] bfe0: 00000000 00000000 00000000 00000000 00000013 00000000
[ 207.679508] r10:00000000 r9:00000000 r8:00000000 r7:00000000 r6:00000000 r5:c013dc90
[ 207.687340] r4:e7026f4
Continuing the anecdotal testing: I can't seem to trigger the
lockup if I have ever had two ssh sessions open to the system. And
about half the time I can't trigger it at all, but after a reset of the
system it triggers after opening an ssh session and hitting the return
key just 2-5 times. But please take this part with a grain of salt, as
it's done by the monkey-testing method :-)

All tests above have been run based on c18bb396d3d261eb ("Merge
git://git.kernel.org/pub/scm/linux/kernel/git/davem/net").
>
> >
> > [ 142.849390] INFO: rcu_sched detected stalls on CPUs/tasks:
> > [ 142.854972] 1-...!: (1 GPs behind) idle=7a4/0/0 softirq=3214/3217 fqs=0
> > [ 142.861976] (detected by 0, t=8232 jiffies, g=930, c=929, q=11)
> > [ 142.868042] Sending NMI from CPU 0 to CPUs 1:
> > [ 142.872443] NMI backtrace for cpu 1
> > [ 142.872452] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 4.16.0-05506-g28aba11c1393691a #14
> > [ 142.872455] Hardware name: Generic R8A7791 (Flattened Device Tree)
> > [ 142.872473] PC is at arch_cpu_idle+0x28/0x44
> > [ 142.872484] LR is at trace_hardirqs_on_caller+0x1a4/0x1d4
> > [ 142.872488] pc : [<c01088a4>] lr : [<c01747a8>] psr: 20070013
> > [ 142.872491] sp : eb0b9f90 ip : eb0b9f60 fp : eb0b9f9c
> > [ 142.872495] r10: 00000000 r9 : 413fc0f2 r8 : 4000406a
> > [ 142.872498] r7 : c0c08478 r6 : c0c0842c r5 : ffffe000 r4 : 00000002
> > [ 142.872502] r3 : eb0b6ec0 r2 : 00000000 r1 : 00000004 r0 : 00000001
> > [ 142.872507] Flags: nzCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment none
> > [ 142.872511] Control: 10c5387d Table: 6a61406a DAC: 00000051
> > [ 142.872516] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 4.16.0-05506-g28aba11c1393691a #14
> > [ 142.872519] Hardware name: Generic R8A7791 (Flattened Device Tree)
> > [ 142.872522] Backtrace:
> > [ 142.872534] [<c010cbf8>] (dump_backtrace) from [<c010ce78>] (show_stack+0x18/0x1c)
> > [ 142.872540] r7:c0c81388 r6:00000000 r5:60070193 r4:c0c81388
> > [ 142.872550] [<c010ce60>] (show_stack) from [<c07dec20>] (dump_stack+0xa4/0xd8)
> > [ 142.872557] [<c07deb7c>] (dump_stack) from [<c0108b38>] (show_regs+0x14/0x18)
> > [ 142.872563] r9:00000001 r8:00000000 r7:c0c4f678 r6:eb0b9f40 r5:00000001 r4:c13e1130
> > [ 142.872571] [<c0108b24>] (show_regs) from [<c07e4cd0>] (nmi_cpu_backtrace+0xfc/0x118)
> > [ 142.872578] [<c07e4bd4>] (nmi_cpu_backtrace) from [<c010fb28>] (handle_IPI+0x2a8/0x320)
> > [ 142.872583] r7:c0c4f678 r6:eb0b9f40 r5:00000007 r4:c0b75b68
> > [ 142.872594] [<c010f880>] (handle_IPI) from [<c03cb528>] (gic_handle_irq+0x8c/0x98)
> > [ 142.872599] r10:00000000 r9:eb0b8000 r8:f0803000 r7:c0c4f678 r6:eb0b9f40 r5:c0c08a90
> > [ 142.872602] r4:f0802000
> > [ 142.872608] [<c03cb49c>] (gic_handle_irq) from [<c01019f0>] (__irq_svc+0x70/0x98)
> > [ 142.872612] Exception stack(0xeb0b9f40 to 0xeb0b9f88)
> > [ 142.872618] 9f40: 00000001 00000004 00000000 eb0b6ec0 00000002 ffffe000 c0c0842c c0c08478
> > [ 142.872624] 9f60: 4000406a 413fc0f2 00000000 eb0b9f9c eb0b9f60 eb0b9f90 c01747a8 c01088a4
> > [ 142.872627] 9f80: 20070013 ffffffff
> > [ 142.872632] r9:eb0b8000 r8:4000406a r7:eb0b9f74 r6:ffffffff r5:20070013 r4:c01088a4
> > [ 142.872642] [<c010887c>] (arch_cpu_idle) from [<c07fb1a8>] (default_idle_call+0x34/0x38)
> > [ 142.872650] [<c07fb174>] (default_idle_call) from [<c0155fe0>] (do_idle+0xe0/0x134)
> > [ 142.872656] [<c0155f00>] (do_idle) from [<c0156398>] (cpu_startup_entry+0x20/0x24)
> > [ 142.872660] r7:c0c8e9d0 r6:10c0387d r5:00000051 r4:00000085
> > [ 142.872667] [<c0156378>] (cpu_startup_entry) from [<c010f644>] (secondary_start_kernel+0x114/0x134)
> > [ 142.872673] [<c010f530>] (secondary_start_kernel) from [<401026ec>] (0x401026ec)
> > [ 142.872676] r5:00000051 r4:6b0a406a
> > [ 142.873456] rcu_sched kthread starved for 8235 jiffies! g930 c929 f0x0 RCU_GP_WAIT_FQS(3) ->state=0x402 ->cpu=0
> > [ 143.135040] RCU grace-period kthread stack dump:
> > [ 143.139695] rcu_sched I 0 9 2 0x00000000
> > [ 143.145234] Backtrace:
> > [ 143.147719] [<c07f5120>] (__schedule) from [<c07f5b3c>] (schedule+0x94/0xb8)
> > [ 143.154823] r10:c0b714c0 r9:c0c85f8a r8:ffffffff r7:eb0abec4 r6:ffffa274 r5:00000000
> > [ 143.162712] r4:ffffe000
> > [ 143.165273] [<c07f5aa8>] (schedule) from [<c07fa940>] (schedule_timeout+0x440/0x4b0)
> > [ 143.173076] r5:00000000 r4:eb79c4c0
> > [ 143.176692] [<c07fa500>] (schedule_timeout) from [<c0196e08>] (rcu_gp_kthread+0x958/0x150c)
> > [ 143.185108] r10:c0c87274 r9:00000000 r8:c0c165b8 r7:00000001 r6:00000000 r5:c0c16590
> > [ 143.192997] r4:c0c16300
> > [ 143.195560] [<c01964b0>] (rcu_gp_kthread) from [<c0145eb8>] (kthread+0x148/0x160)
> > [ 143.203099] r7:c0c16300
> > [ 143.205660] [<c0145d70>] (kthread) from [<c01010b4>] (ret_from_fork+0x14/0x20)
> > [ 143.212938] Exception stack(0xeb0abfb0 to 0xeb0abff8)
> > [ 143.218030] bfa0: 00000000 00000000 00000000 00000000
> > [ 143.226271] bfc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
> > [ 143.234511] bfe0: 00000000 00000000 00000000 00000000 00000013 00000000
> > [ 143.241177] r10:00000000 r9:00000000 r8:00000000 r7:00000000 r6:00000000 r5:c0145d70
> > [ 143.249065] r4:eb037b00
> >
> > After the freeze the system becomes responsive again, and I can sometimes
> > trigger the hang multiple times. I tried to bisect the problem and
> > found that by reverting [2] I can no longer reproduce the issue. I also
> > cannot reproduce the issue on v4.16. I can't figure out if reverting
> > [2] just treats a symptom or addresses the root cause of my troubles and
> > would appreciate your input. Also, my "test case" does not trigger every
> > time, but I have tested this scenario quite a lot and the result seems to
> > be consistent.
> >
> > My test setup involves an NFS root filesystem; I ssh in and launch the
> > GUI application over X forwarding. As far as I know the application is
> > not doing any ioctl calls to the v4l2 framework, it's just sitting there
> > idle as I move the cursor around, showing tool tips and such as I hover
> > over elements, and then it freezes up. I have not observed this issue by
> > just booting the system and leaving it idle; movement in the GUI seems
> > to be the key to triggering this.
> >
> > I'm a bit lost on how to progress with this issue and would appreciate
> > any help you can provide in figuring it out.
>
> > Can you send me your config?
>
> > I'm going to prepare a debug patch to spy on what's happening when entering idle.
>
> Regards,
> Vincent
> >
> > 1. c18bb396d3d261eb ("Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net")
> > 2. 31e77c93e432dec7 ("sched/fair: Update blocked load when newly idle")
> >
> > --
> > Regards,
> > Niklas Söderlund
--
Regards,
Niklas Söderlund
On 12.04.2018 at 15:30, Vincent Guittot wrote:
> Heiner, Niklas,
>
> On Thursday 12 Apr 2018 at 13:15:19 (+0200), Niklas Söderlund wrote:
>> Hi Vincent,
>>
>> Thanks for your feedback.
>>
>> On 2018-04-12 12:33:27 +0200, Vincent Guittot wrote:
>>> Hi Niklas,
>>>
>>> On 12 April 2018 at 11:18, Niklas Söderlund
>>> <[email protected]> wrote:
>>>> Hi Vincent,
>>>>
>>>> I have observed issues running on linus/master from a few days back [1].
>>>> I'm running on a Renesas Koelsch board (arm32) and I can trigger an issue
>>>> by X forwarding the v4l2 test application qv4l2 over ssh and moving the
>>>> cursor around in the GUI (best test case description award...). I'm
>>>> sorry about the really bad way I trigger this, but I can't do it in any
>>>> other way; I'm happy to try other methods if you have some ideas. The
>>>> symptom of the issue is a complete hang of the system for more than 30
>>>> seconds, and then this information is printed on the console:
>>>
>>> Heiner (added to cc) also reported a similar problem with his platform: a
>>> dual core Celeron.
>>>
>>> Can you confirm that your platform is a dual Cortex-A15? At least that's
>>> what I have seen on the web.
>>> This would confirm that a dual-core system is a key point.
>>
>> I can confirm that my platform is a dual core.
>>
>>>
>>> The ssh connection is also common with Heiner's setup
>>
>> Interesting. I found Heiner's mail and I can confirm that I too
>> experience ssh session lockups. I ssh into the system, and by repeatedly
>> hitting the return key I can lock up the board; while it is locked up,
>> starting another ssh session unblocks the first. If I don't start another
>> ssh session but keep hitting the return key sporadically in the first one,
>> I can get the trace I reported in my first mail to be printed on the
>> serial console.
>>
>> When locked up, the symptoms are that both the single ssh session and the
>> serial console are dead. But attempting another ssh connection
>> immediately unblocks both the ssh session and the serial console. And if
>> I allow enough time before starting the second ssh connection, I can
>> trigger a trace to be printed on the serial console; it's similar to, but
>> different from, the first one I reported.
>>
>> [ 207.548610] 1-...!: (0 ticks this GP) idle=79a/1/1073741824 softirq=2146/2146 fqs=0
>> [ 207.556442] (detected by 0, t=12645 jiffies, g=333, c=332, q=20)
>> [ 207.562546] rcu_sched kthread starved for 12645 jiffies! g333 c332 f0x2 RCU_GP_WAIT_FQS(3) ->state=0x0 ->cpu=0
>> [ 207.572548] RCU grace-period kthread stack dump:
>>
>> [ 207.577166] rcu_sched R running task 0 9 2 0x00000000
>> [ 207.584389] Backtrace:
>> [ 207.586849] [<c0760450>] (__schedule) from [<c0760ba4>] (schedule+0x94/0xb8)
>> [ 207.593901] r10:e77813c0 r9:e77813c0 r8:ffffffff r7:e709bed4 r6:ffffaa80 r5:00000000
>> [ 207.601732] r4:ffffe000
>> [ 207.604269] [<c0760b10>] (schedule) from [<c0764560>] (schedule_timeout+0x380/0x3dc)
>> [ 207.612013] r5:00000000 r4:00000000
>> [ 207.615596] [<c07641e0>] (schedule_timeout) from [<c017b698>] (rcu_gp_kthread+0x668/0xe2c)
>> [ 207.623863] r10:c0b79018 r9:0000014d r8:0000014c r7:00000001 r6:00000000 r5:c0b10ad0
>> [ 207.631693] r4:c0b10980
>> [ 207.634230] [<c017b030>] (rcu_gp_kthread) from [<c013ddd8>] (kthread+0x148/0x160)
>> [ 207.641712] r7:c0b10980
>> [ 207.644249] [<c013dc90>] (kthread) from [<c01010e8>] (ret_from_fork+0x14/0x2c)
>> [ 207.651472] Exception stack(0xe709bfb0 to 0xe709bff8)
>> [ 207.656527] bfa0: 00000000 00000000 00000000 00000000
>> [ 207.664709] bfc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
>> [ 207.672890] bfe0: 00000000 00000000 00000000 00000000 00000013 00000000
>> [ 207.679508] r10:00000000 r9:00000000 r8:00000000 r7:00000000 r6:00000000 r5:c013dc90
>> [ 207.687340] r4:e7026f4
>>
>> Continuing the anecdotal testing: I can't seem to trigger the
>> lockup if I have ever had two ssh sessions open to the system. And
>> about half the time I can't trigger it at all, but after a reset of the
>> system it triggers after opening an ssh session and hitting the return
>> key just 2-5 times. But please take this part with a grain of salt, as
>> it's done by the monkey-testing method :-)
>>
>> All tests above have been run based on c18bb396d3d261eb ("Merge
>> git://git.kernel.org/pub/scm/linux/kernel/git/davem/net").
>>
>>>
>>>>
>
> [snip]
>
>>>>
>>>> I'm a bit lost on how to progress with this issue and would appreciate
>>>> any help you can provide in figuring it out.
>>>
>>> Can you send me your config?
>>>
>>> I'm going to prepare a debug patch to spy on what's happening when entering idle.
>
> I'd like to narrow the problem a bit more with the 2 patches below. Can you try
> them separately on top of c18bb396d3d261eb ("Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net")
> and check if one of them fixes the problem?
>
> (They should apply on linux-next as well)
>
> The first patch always kicks ilb instead of doing ilb on the local cpu before entering idle:
>
> ---
> kernel/sched/fair.c | 3 +--
> 1 file changed, 1 insertion(+), 2 deletions(-)
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 0951d1c..b21925b 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -9739,8 +9739,7 @@ static void nohz_newidle_balance(struct rq *this_rq)
> * candidate for ilb instead of waking up another idle CPU.
> * Kick an normal ilb if we failed to do the update.
> */
> - if (!_nohz_idle_balance(this_rq, NOHZ_STATS_KICK, CPU_NEWLY_IDLE))
> - kick_ilb(NOHZ_STATS_KICK);
> + kick_ilb(NOHZ_STATS_KICK);
> raw_spin_lock(&this_rq->lock);
> }
>
>
I tested both patches; with both of them the issue still occurs. However,
on top of linux-next from yesterday I have the impression that it happens
less frequently with the second patch.
On top of the commit mentioned by you I don't see a change in system behavior
with either patch.
Regards, Heiner
Hi Vincent,
On 2018-04-12 12:33:27 +0200, Vincent Guittot wrote:
[snip]
>
> Can you send me your config?
>
> I'm going to prepare a debug patch to spy on what's happening when entering idle.
I'm sorry, I missed this request when first reading your reply. I'm using
arch/arm/configs/shmobile_defconfig for my tests.
--
Regards,
Niklas Söderlund
Hi Vincent,
Thanks for helping to figure this out.
On 2018-04-12 15:30:31 +0200, Vincent Guittot wrote:
[snip]
>
> I'd like to narrow the problem a bit more with the 2 patches below. Can you try
> them separately on top of c18bb396d3d261eb ("Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net")
> and check if one of them fixes the problem?
I tried your suggested changes on top of c18bb396d3d261eb.
>
> (They should apply on linux-next as well)
>
> The first patch always kicks ilb instead of doing ilb on the local cpu before entering idle:
>
> ---
> kernel/sched/fair.c | 3 +--
> 1 file changed, 1 insertion(+), 2 deletions(-)
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 0951d1c..b21925b 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -9739,8 +9739,7 @@ static void nohz_newidle_balance(struct rq *this_rq)
> * candidate for ilb instead of waking up another idle CPU.
> * Kick an normal ilb if we failed to do the update.
> */
> - if (!_nohz_idle_balance(this_rq, NOHZ_STATS_KICK, CPU_NEWLY_IDLE))
> - kick_ilb(NOHZ_STATS_KICK);
> + kick_ilb(NOHZ_STATS_KICK);
> raw_spin_lock(&this_rq->lock);
> }
This change doesn't seem to affect the issue. I can still get the single
ssh session and the system to lock up by hitting the return key, and
opening a second ssh session immediately unblocks both the first ssh
session and the serial console. I can still trigger the console warning
by just letting the system be once it locks up. Just as before, I had to
reset the system a few times to trigger the issue.
[ 245.351693] INFO: rcu_sched detected stalls on CPUs/tasks:
[ 245.357199] 0-...!: (1 GPs behind) idle=93c/0/0 softirq=2224/2225 fqs=0
[ 245.363988] (detected by 1, t=3025 jiffies, g=337, c=336, q=10)
[ 245.370003] Sending NMI from CPU 1 to CPUs 0:
[ 245.374368] NMI backtrace for cpu 0
[ 245.374377] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.16.0-10930-ged741fb4567c816f #42
[ 245.374379] Hardware name: Generic R8A7791 (Flattened Device Tree)
[ 245.374393] PC is at arch_cpu_idle+0x24/0x40
[ 245.374397] LR is at arch_cpu_idle+0x34/0x40
[ 245.374400] pc : [<c0108554>] lr : [<c0108564>] psr: 60050013
[ 245.374403] sp : c0b01f40 ip : c0b01f50 fp : c0b01f4c
[ 245.374405] r10: c0a56a38 r9 : e7fffbc0 r8 : c0b04c00
[ 245.374407] r7 : c0b04c78 r6 : c0b04c2c r5 : ffffe000 r4 : 00000001
[ 245.374410] r3 : c0119100 r2 : e77813a8 r1 : 0002d93c r0 : 00000000
[ 245.374414] Flags: nZCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment none
[ 245.374417] Control: 10c5387d Table: 6662006a DAC: 00000051
[ 245.374421] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.16.0-10930-ged741fb4567c816f #42
[ 245.374423] Hardware name: Generic R8A7791 (Flattened Device Tree)
[ 245.374425] Backtrace:
[ 245.374435] [<c010c170>] (dump_backtrace) from [<c010c3f0>] (show_stack+0x18/0x1c)
[ 245.374440] r7:c0b47278 r6:60050193 r5:00000000 r4:c0b73d80
[ 245.374450] [<c010c3d8>] (show_stack) from [<c074b824>] (dump_stack+0x84/0xa4)
[ 245.374456] [<c074b7a0>] (dump_stack) from [<c01087e8>] (show_regs+0x14/0x18)
[ 245.374460] r7:c0b47278 r6:c0b01ef0 r5:00000000 r4:c0bc62c8
[ 245.374468] [<c01087d4>] (show_regs) from [<c0751938>] (nmi_cpu_backtrace+0xfc/0x118)
[ 245.374475] [<c075183c>] (nmi_cpu_backtrace) from [<c010ef34>] (handle_IPI+0x22c/0x294)
[ 245.374479] r7:c0b47278 r6:c0b01ef0 r5:00000007 r4:c0a775fc
[ 245.374488] [<c010ed08>] (handle_IPI) from [<c037cd68>] (gic_handle_irq+0x8c/0x98)
[ 245.374492] r10:c0a56a38 r9:c0b00000 r8:f0803000 r7:c0b47278 r6:c0b01ef0 r5:c0b05244
[ 245.374495] r4:f0802000 r3:00000407
[ 245.374501] [<c037ccdc>] (gic_handle_irq) from [<c0101a0c>] (__irq_svc+0x6c/0x90)
[ 245.374504] Exception stack(0xc0b01ef0 to 0xc0b01f38)
[ 245.374507] 1ee0: 00000000 0002d93c e77813a8 c0119100
[ 245.374512] 1f00: 00000001 ffffe000 c0b04c2c c0b04c78 c0b04c00 e7fffbc0 c0a56a38 c0b01f4c
[ 245.374516] 1f20: c0b01f50 c0b01f40 c0108564 c0108554 60050013 ffffffff
[ 245.374521] r9:c0b00000 r8:c0b04c00 r7:c0b01f24 r6:ffffffff r5:60050013 r4:c0108554
[ 245.374528] [<c0108530>] (arch_cpu_idle) from [<c076512c>] (default_idle_call+0x30/0x34)
[ 245.374535] [<c07650fc>] (default_idle_call) from [<c014b834>] (do_idle+0xd8/0x128)
[ 245.374540] [<c014b75c>] (do_idle) from [<c014baf0>] (cpu_startup_entry+0x20/0x24)
[ 245.374543] r7:c0b04c08 r6:ffffffff r5:c0b80380 r4:000000c2
[ 245.374549] [<c014bad0>] (cpu_startup_entry) from [<c075fb14>] (rest_init+0x9c/0xbc)
[ 245.374555] [<c075fa78>] (rest_init) from [<c0a00dc4>] (start_kernel+0x368/0x3ec)
[ 245.374558] r5:c0b80380 r4:c0b803c0
[ 245.374563] [<c0a00a5c>] (start_kernel) from [<00000000>] ( (null))
[ 245.375369] rcu_sched kthread starved for 3028 jiffies! g337 c336 f0x0 RCU_GP_WAIT_FQS(3) ->state=0x402 ->cpu=1
[ 245.641242] RCU grace-period kthread stack dump:
[ 245.645859] rcu_sched I 0 9 2 0x00000000
[ 245.651350] Backtrace:
[ 245.653804] [<c07603f0>] (__schedule) from [<c0760b44>] (schedule+0x94/0xb8)
[ 245.660857] r10:e77903c0 r9:e77903c0 r8:ffffffff r7:e709bed4 r6:ffffded7 r5:00000000
[ 245.668689] r4:ffffe000
[ 245.671227] [<c0760ab0>] (schedule) from [<c0764500>] (schedule_timeout+0x380/0x3dc)
[ 245.678973] r5:00000000 r4:00000000
[ 245.682557] [<c0764180>] (schedule_timeout) from [<c017b620>] (rcu_gp_kthread+0x668/0xe2c)
[ 245.690826] r10:c0b79018 r9:00000151 r8:00000150 r7:00000001 r6:00000000 r5:c0b10ad0
[ 245.698658] r4:c0b10980
[ 245.701197] [<c017afb8>] (rcu_gp_kthread) from [<c013ddd8>] (kthread+0x148/0x160)
[ 245.708682] r7:c0b10980
[ 245.711218] [<c013dc90>] (kthread) from [<c01010e8>] (ret_from_fork+0x14/0x2c)
[ 245.718443] Exception stack(0xe709bfb0 to 0xe709bff8)
[ 245.723495] bfa0: 00000000 00000000 00000000 00000000
[ 245.731677] bfc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[ 245.739858] bfe0: 00000000 00000000 00000000 00000000 00000013 00000000
[ 245.746476] r10:00000000 r9:00000000 r8:00000000 r7:00000000 r6:00000000 r5:c013dc90
[ 245.754307] r4:e7026f40
[ 335.351162] INFO: rcu_sched detected stalls on CPUs/tasks:
[ 335.356682] 0-...!: (0 ticks this GP) idle=b0a/1/1073741824 softirq=2226/2226 fqs=0
[ 335.364521] (detected by 1, t=6027 jiffies, g=340, c=339, q=77)
[ 335.370545] Sending NMI from CPU 1 to CPUs 0:
[ 345.374803] NMI backtrace for cpu 1
[ 345.378295] CPU: 1 PID: 1926 Comm: bash Not tainted 4.16.0-10930-ged741fb4567c816f #42
[ 345.386215] Hardware name: Generic R8A7791 (Flattened Device Tree)
[ 345.392395] Backtrace:
[ 345.394855] [<c010c170>] (dump_backtrace) from [<c010c3f0>] (show_stack+0x18/0x1c)
[ 345.402430] r7:00000000 r6:600f0193 r5:00000000 r4:c0b73d80
[ 345.408097] [<c010c3d8>] (show_stack) from [<c074b824>] (dump_stack+0x84/0xa4)
[ 345.415329] [<c074b7a0>] (dump_stack) from [<c0751940>] (nmi_cpu_backtrace+0x104/0x118)
[ 345.423337] r7:00000000 r6:00000000 r5:00000001 r4:c0bc62c8
[ 345.429001] [<c075183c>] (nmi_cpu_backtrace) from [<c07519e0>] (nmi_trigger_cpumask_backtrace+0x8c/0x10c)
[ 345.438572] r7:00000000 r6:c010e4d4 r5:00000001 r4:c0b08158
[ 345.444237] [<c0751954>] (nmi_trigger_cpumask_backtrace) from [<c010f0ec>] (arch_trigger_cpumask_backtrace+0x14/0x1c)
[ 345.454850] r7:c0b10980 r6:c0b04cd0 r5:00000001 r4:c0b10980
[ 345.460519] [<c010f0d8>] (arch_trigger_cpumask_backtrace) from [<c017f9ac>] (rcu_dump_cpu_stacks+0xb0/0xd0)
[ 345.470268] [<c017f8fc>] (rcu_dump_cpu_stacks) from [<c017e73c>] (rcu_check_callbacks+0x584/0x71c)
[ 345.479232] r10:0000004d r9:c0b03d00 r8:ffffe000 r7:00000001 r6:c0b10940 r5:e7795080
[ 345.487065] r4:c0b10980 r3:f181863e
[ 345.490648] [<c017e1b8>] (rcu_check_callbacks) from [<c0184a60>] (update_process_times+0x38/0x64)
[ 345.499524] r10:e7fa9ea0 r9:e67e8000 r8:e7795e00 r7:0000004e r6:00000000 r5:e72d0640
[ 345.507356] r4:ffffe000
[ 345.509903] [<c0184a28>] (update_process_times) from [<c0195bf0>] (tick_nohz_handler+0xc8/0x114)
[ 345.518693] r7:0000004e r6:146afc64 r5:e67e9d70 r4:e77917a0
[ 345.524362] [<c0195b28>] (tick_nohz_handler) from [<c05c16ac>] (arch_timer_handler_virt+0x30/0x40)
[ 345.533325] r8:c0b05244 r7:00000013 r6:e7017a40 r5:00000000 r4:e7015c00
[ 345.540033] [<c05c167c>] (arch_timer_handler_virt) from [<c016ecd4>] (handle_percpu_devid_irq+0xf0/0x224)
[ 345.549607] [<c016ebe4>] (handle_percpu_devid_irq) from [<c016a208>] (generic_handle_irq+0x20/0x30)
[ 345.558657] r10:e7fa9ea0 r9:e67e8000 r8:e7008800 r7:00000013 r6:c0a775fc r5:00000000
[ 345.566489] r4:00000000 r3:c016ebe4
[ 345.570067] [<c016a1e8>] (generic_handle_irq) from [<c016a2c0>] (__handle_domain_irq+0xa8/0xbc)
[ 345.578777] [<c016a218>] (__handle_domain_irq) from [<c037cd3c>] (gic_handle_irq+0x60/0x98)
[ 345.587132] r9:e67e8000 r8:f0803000 r7:c0b47278 r6:e67e9d70 r5:c0b05244 r4:f0802000
[ 345.594883] [<c037ccdc>] (gic_handle_irq) from [<c0101a0c>] (__irq_svc+0x6c/0x90)
[ 345.602368] Exception stack(0xe67e9d70 to 0xe67e9db8)
[ 345.607421] 9d60: 00000000 e6c433e8 00182b08 00000000
[ 345.615603] 9d80: e6c429b4 00000087 e67e9e34 e6c434c4 00000000 00000087 e7fa9ea0 e67e9e14
[ 345.623784] 9da0: 00000000 e67e9dc0 c01e0f34 c01e0f88 600f0113 ffffffff
[ 345.630401] r9:e67e8000 r8:00000000 r7:e67e9da4 r6:ffffffff r5:600f0113 r4:c01e0f88
[ 345.638156] [<c01e0d84>] (filemap_map_pages) from [<c02141f0>] (handle_mm_fault+0x5c8/0x93c)
[ 345.646599] r10:e5dc5380 r9:80000007 r8:00000080 r7:b6e02000 r6:c0b04c08 r5:0000008f
[ 345.654431] r4:00000000
[ 345.656968] [<c0213c28>] (handle_mm_fault) from [<c011404c>] (do_page_fault+0x17c/0x324)
[ 345.665063] r10:e5dc5380 r9:80000007 r8:e72d0640 r7:00000054 r6:e5d030c0 r5:b6e023dc
[ 345.672895] r4:e67e9fb0
[ 345.675431] [<c0113ed0>] (do_page_fault) from [<c0114470>] (do_PrefetchAbort+0x4c/0xc8)
[ 345.683439] r10:00000010 r9:00179207 r8:c0b0a434 r7:e67e9fb0 r6:b6e023dc r5:c0b04c08
[ 345.691271] r4:00000007
[ 345.693807] [<c0114424>] (do_PrefetchAbort) from [<c010204c>] (ret_from_exception+0x0/0x14)
[ 345.702160] Exception stack(0xe67e9fb0 to 0xe67e9ff8)
[ 345.707213] 9fa0: 00000000 000e9600 00000001 bec86544
[ 345.715395] 9fc0: 00000001 bec86544 00000000 00000000 bec8654c 00179207 00000010 000e9600
[ 345.723576] 9fe0: 000c8140 bec86538 0002847c b6e023dc 200f0010 ffffffff
[ 345.730192] r8:10c5387d r7:10c5387d r6:ffffffff r5:200f0010 r4:b6e023dc
[ 345.737364] nfs: server 10.0.1.1 not responding, still trying
[ 345.743294] nfs: server 10.0.1.1 not responding, still trying
[ 345.744534] nfs: server 10.0.1.1 not responding, still trying
[ 345.749262] nfs: server 10.0.1.1 OK
[ 345.755391] nfs: server 10.0.1.1 OK
[ 345.758641] nfs: server 10.0.1.1 OK
>
> --
> 2.7.4
>
>
> The 2nd patch does the opposite and never kicks ilb. I have always made the
> assumption that the problem was coming from _nohz_idle_balance() and I just
> wanted to check that that's the case:
>
> ---
> kernel/sched/fair.c | 3 +--
> 1 file changed, 1 insertion(+), 2 deletions(-)
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 0951d1c..1766594 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -9739,8 +9739,7 @@ static void nohz_newidle_balance(struct rq *this_rq)
> * candidate for ilb instead of waking up another idle CPU.
> * Kick an normal ilb if we failed to do the update.
> */
> - if (!_nohz_idle_balance(this_rq, NOHZ_STATS_KICK, CPU_NEWLY_IDLE))
> - kick_ilb(NOHZ_STATS_KICK);
> + _nohz_idle_balance(this_rq, NOHZ_STATS_KICK, CPU_NEWLY_IDLE);
> raw_spin_lock(&this_rq->lock);
> }
With this change it seems more difficult to trigger the issue. I usually
can trigger it within 2-3 resets of the system, but with this applied I
had to reset it over 5 times to trigger it. I can still reproduce the
lockup reliably multiple times, though. I would not trust my feeling that
it's harder to reproduce with this change, as the method to trigger it is
kind of spotty and the difference is most likely coincidental :-)
I can still unblock the system by starting a second ssh session and
trigger the output on the serial console.
[ 193.741481] INFO: rcu_sched detected stalls on CPUs/tasks:
[ 193.746991] 1-...!: (0 ticks this GP) idle=83c/0/0 softirq=2445/2445 fqs=0
[ 193.754041] (detected by 0, t=4096 jiffies, g=611, c=610, q=31)
[ 193.760054] NMI backtrace for cpu 0
[ 193.763546] CPU: 0 PID: 2193 Comm: bash Not tainted 4.16.0-10929-gc18bb396d3d261eb-dirty #43
[ 193.771984] Hardware name: Generic R8A7791 (Flattened Device Tree)
[ 193.778163] Backtrace:
[ 193.780625] [<c010c170>] (dump_backtrace) from [<c010c3f0>] (show_stack+0x18/0x1c)
[ 193.788198] r7:00000000 r6:60010193 r5:00000000 r4:c0b73d80
[ 193.793865] [<c010c3d8>] (show_stack) from [<c074b864>] (dump_stack+0x84/0xa4)
[ 193.801093] [<c074b7e0>] (dump_stack) from [<c0751980>] (nmi_cpu_backtrace+0x104/0x118)
[ 193.809098] r7:00000000 r6:00000000 r5:00000000 r4:c0bc62c8
[ 193.814761] [<c075187c>] (nmi_cpu_backtrace) from [<c0751a20>] (nmi_trigger_cpumask_backtrace+0x8c/0x10c)
[ 193.824329] r7:00000000 r6:c010e4d4 r5:00000000 r4:c0b08158
[ 193.829992] [<c0751994>] (nmi_trigger_cpumask_backtrace) from [<c010f0ec>] (arch_trigger_cpumask_backtrace+0x14/0x1c)
[ 193.840603] r7:c0b10980 r6:c0b04cd0 r5:00000000 r4:c0b10980
[ 193.846268] [<c010f0d8>] (arch_trigger_cpumask_backtrace) from [<c017f9ec>] (rcu_dump_cpu_stacks+0xb0/0xd0)
[ 193.856015] [<c017f93c>] (rcu_dump_cpu_stacks) from [<c017e77c>] (rcu_check_callbacks+0x584/0x71c)
[ 193.864977] r10:0000001f r9:c0b03d00 r8:ffffe000 r7:00000001 r6:c0b10940 r5:e7786080
[ 193.872810] r4:c0b10980 r3:f181863e
[ 193.876392] [<c017e1f8>] (rcu_check_callbacks) from [<c0184aa0>] (update_process_times+0x38/0x64)
[ 193.885268] r10:00000000 r9:00180b98 r8:e7786e00 r7:0000002d r6:00000001 r5:e5d86180
[ 193.893100] r4:ffffe000
[ 193.895642] [<c0184a68>] (update_process_times) from [<c0195c30>] (tick_nohz_handler+0xc8/0x114)
[ 193.904431] r7:0000002d r6:1bcdda00 r5:e66c7fb0 r4:e77827a0
[ 193.910099] [<c0195b68>] (tick_nohz_handler) from [<c05c16ec>] (arch_timer_handler_virt+0x30/0x40)
[ 193.919062] r8:c0b05244 r7:00000013 r6:e7017a40 r5:00000000 r4:e7015c00
[ 193.925769] [<c05c16bc>] (arch_timer_handler_virt) from [<c016ed14>] (handle_percpu_devid_irq+0xf0/0x224)
[ 193.935343] [<c016ec24>] (handle_percpu_devid_irq) from [<c016a248>] (generic_handle_irq+0x20/0x30)
[ 193.944394] r10:00000000 r9:00180b98 r8:e7008800 r7:00000013 r6:c0a775fc r5:00000000
[ 193.952225] r4:00000000 r3:c016ec24
[ 193.955803] [<c016a228>] (generic_handle_irq) from [<c016a300>] (__handle_domain_irq+0xa8/0xbc)
[ 193.964513] [<c016a258>] (__handle_domain_irq) from [<c037cd7c>] (gic_handle_irq+0x60/0x98)
[ 193.972869] r9:00180b98 r8:f0803000 r7:c0b47278 r6:e66c7fb0 r5:c0b05244 r4:f0802000
[ 193.980619] [<c037cd1c>] (gic_handle_irq) from [<c0101dd4>] (__irq_usr+0x54/0x80)
[ 193.988105] Exception stack(0xe66c7fb0 to 0xe66c7ff8)
[ 193.993157] 7fa0: 00000000 00000000 b6e8b79c 00000000
[ 194.001339] 7fc0: 00178378 000134e0 b6e8b794 b6e8b000 b6e8b794 00180b98 00000000 b6e8b7c4
[ 194.009520] 7fe0: be955190 be955170 00000000 b6db5c58 20010010 ffffffff
[ 194.016137] r9:00180b98 r8:10c5387d r7:10c5387d r6:ffffffff r5:20010010 r4:b6db5c58
[ 194.023888] Sending NMI from CPU 0 to CPUs 1:
[ 204.028147] NMI backtrace for cpu 1
[ 204.028152] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 4.16.0-10929-gc18bb396d3d261eb-dirty #43
[ 204.028155] Hardware name: Generic R8A7791 (Flattened Device Tree)
[ 204.028164] PC is at _raw_spin_unlock_irqrestore+0x28/0x2c
[ 204.028169] LR is at rcu_report_qs_rnp+0x4c/0x20c
[ 204.028172] pc : [<c07653d8>] lr : [<c01796d4>] psr: 200e0113
[ 204.028175] sp : e70a3db0 ip : e70a3dc0 fp : e70a3dbc
[ 204.028177] r10: 00000264 r9 : c0b787eb r8 : 00000001
[ 204.028179] r7 : 00000002 r6 : c0b10980 r5 : c0b10980 r4 : c0b10980
[ 204.028182] r3 : 000022a3 r2 : 00000264 r1 : 600e0113 r0 : c0b10980
[ 204.028187] Flags: nzCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment none
[ 204.028189] Control: 10c5387d Table: 66b9806a DAC: 00000051
[ 204.028193] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 4.16.0-10929-gc18bb396d3d261eb-dirty #43
[ 204.028195] Hardware name: Generic R8A7791 (Flattened Device Tree)
[ 204.028197] Backtrace:
[ 204.028204] [<c010c170>] (dump_backtrace) from [<c010c3f0>] (show_stack+0x18/0x1c)
[ 204.028209] r7:c0b47278 r6:600e0193 r5:00000000 r4:c0b73d80
[ 204.028215] [<c010c3d8>] (show_stack) from [<c074b864>] (dump_stack+0x84/0xa4)
[ 204.028228] [<c074b7e0>] (dump_stack) from [<c01087e8>] (show_regs+0x14/0x18)
[ 204.028231] r7:c0b47278 r6:e70a3d60 r5:00000001 r4:c0bc62c8
[ 204.028238] [<c01087d4>] (show_regs) from [<c0751978>] (nmi_cpu_backtrace+0xfc/0x118)
[ 204.028244] [<c075187c>] (nmi_cpu_backtrace) from [<c010ef34>] (handle_IPI+0x22c/0x294)
[ 204.028247] r7:c0b47278 r6:e70a3d60 r5:00000007 r4:c0a775fc
[ 204.028254] [<c010ed08>] (handle_IPI) from [<c037cda8>] (gic_handle_irq+0x8c/0x98)
[ 204.028259] r10:00000264 r9:e70a2000 r8:f0803000 r7:c0b47278 r6:e70a3d60 r5:c0b05244
[ 204.028261] r4:f0802000 r3:00000007
[ 204.028267] [<c037cd1c>] (gic_handle_irq) from [<c0101a0c>] (__irq_svc+0x6c/0x90)
[ 204.028270] Exception stack(0xe70a3d60 to 0xe70a3da8)
[ 204.028275] 3d60: c0b10980 600e0113 00000264 000022a3 c0b10980 c0b10980 c0b10980 00000002
[ 204.028280] 3d80: 00000001 c0b787eb 00000264 e70a3dbc e70a3dc0 e70a3db0 c01796d4 c07653d8
[ 204.028283] 3da0: 200e0113 ffffffff
[ 204.028287] r9:e70a2000 r8:00000001 r7:e70a3d94 r6:ffffffff r5:200e0113 r4:c07653d8
[ 204.028295] [<c07653b0>] (_raw_spin_unlock_irqrestore) from [<c01796d4>] (rcu_report_qs_rnp+0x4c/0x20c)
[ 204.028301] [<c0179688>] (rcu_report_qs_rnp) from [<c017c188>] (rcu_process_callbacks+0x364/0x8cc)
[ 204.028305] r10:c0b03080 r9:00000000 r8:00000002 r7:600e0113 r6:c0b10980 r5:c0b10980
[ 204.028306] r4:e7795080
[ 204.028312] [<c017be24>] (rcu_process_callbacks) from [<c010236c>] (__do_softirq+0x23c/0x338)
[ 204.028316] r10:c0b03080 r9:00000100 r8:c0b80ec0 r7:00000002 r6:00000009 r5:c0b030a4
[ 204.028318] r4:ffffe000
[ 204.028325] [<c0102130>] (__do_softirq) from [<c0125b90>] (irq_exit+0x6c/0xd4)
[ 204.028329] r10:00000000 r9:e70a2000 r8:e7008800 r7:00000013 r6:c0a775fc r5:00000000
[ 204.028331] r4:00000000
[ 204.028336] [<c0125b24>] (irq_exit) from [<c016a2e0>] (__handle_domain_irq+0x88/0xbc)
[ 204.028338] r5:00000000 r4:00000000
[ 204.028344] [<c016a258>] (__handle_domain_irq) from [<c037cd7c>] (gic_handle_irq+0x60/0x98)
[ 204.028348] r9:e70a2000 r8:f0803000 r7:c0b47278 r6:e70a3f40 r5:c0b05244 r4:f0802000
[ 204.028354] [<c037cd1c>] (gic_handle_irq) from [<c0101a0c>] (__irq_svc+0x6c/0x90)
[ 204.028356] Exception stack(0xe70a3f40 to 0xe70a3f88)
[ 204.028361] 3f40: 00000000 00013844 e77903a8 c0119100 00000002 ffffe000 c0b04c2c c0b04c78
[ 204.028367] 3f60: 4000406a 413fc0f2 00000000 e70a3f9c e70a3fa0 e70a3f90 c0108564 c0108554
[ 204.028370] 3f80: 600e0013 ffffffff
[ 204.028374] r9:e70a2000 r8:4000406a r7:e70a3f74 r6:ffffffff r5:600e0013 r4:c0108554
[ 204.028381] [<c0108530>] (arch_cpu_idle) from [<c076516c>] (default_idle_call+0x30/0x34)
[ 204.028388] [<c076513c>] (default_idle_call) from [<c014b834>] (do_idle+0xd8/0x128)
[ 204.028392] [<c014b75c>] (do_idle) from [<c014baf0>] (cpu_startup_entry+0x20/0x24)
[ 204.028396] r7:c0b80750 r6:10c0387d r5:00000051 r4:00000085
[ 204.028401] [<c014bad0>] (cpu_startup_entry) from [<c010eacc>] (secondary_start_kernel+0x10c/0x12c)
[ 204.028406] [<c010e9c0>] (secondary_start_kernel) from [<4010250c>] (0x4010250c)
[ 204.028408] r5:00000051 r4:6709406a
>
> --
> 2.7.4
>
>
> Thanks,
> Vincent
>
> > >
> > >
> > > Regards,
> > > Vincent
> > > >
> > > > 1. c18bb396d3d261eb ("Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net")
> > > > 2. 31e77c93e432dec7 ("sched/fair: Update blocked load when newly idle")
> > > >
> > > > --
> > > > Regards,
> > > > Niklas Söderlund
> >
> > --
> > Regards,
> > Niklas Söderlund
--
Regards,
Niklas Söderlund
Hi Vincent,
On 2018-04-12 13:15:19 +0200, Niklas Söderlund wrote:
> Hi Vincent,
>
> Thanks for your feedback.
>
> On 2018-04-12 12:33:27 +0200, Vincent Guittot wrote:
> > Hi Niklas,
> >
> > On 12 April 2018 at 11:18, Niklas Söderlund
> > <[email protected]> wrote:
> > > Hi Vincent,
> > >
> > > I have observed issues running on linus/master from a few days back [1].
> > > I'm running on a Renesas Koelsch board (arm32) and I can trigger an issue
> > > by X forwarding the v4l2 test application qv4l2 over ssh and moving the
> > > cursor around in the GUI (best test case description award...). I'm
> > > sorry about the really bad way I trigger this, but I can't do it in any
> > > other way; I'm happy to try other methods if you have some ideas. The
> > > symptom of the issue is a complete hang of the system for more than 30
> > > seconds, and then this information is printed on the console:
> >
> > Heiner (added to cc) also reported a similar problem with his platform: a
> > dual core Celeron.
> >
> > Can you confirm that your platform is a dual Cortex-A15? At least that's
> > what I have seen on the web.
> > This would confirm that a dual-core system is a key point.
>
> I can confirm that my platform is a dual core.
I tested another dual-core system today, a Renesas M3-W ARM64 system, and
I can observe the same lockups on that system, if that helps you understand
the problem. It seems to be much harder to trigger the issue on this
system for some reason. Hitting return in an ssh session doesn't seem to
produce the lockup, while starting a GUI using X forwarding over ssh
makes it possible.
[ 392.306441] INFO: rcu_preempt detected stalls on CPUs/tasks:
[ 392.312201] (detected by 0, t=19366 jiffies, g=7177, c=7176, q=35)
[ 392.318555] All QSes seen, last rcu_preempt kthread activity 19368 (4294990375-4294971007), jiffies_till_next_fqs=1, root ->qsmask 0x0
[ 392.330758] swapper/0 R running task 0 0 0 0x00000022
[ 392.337883] Call trace:
[ 392.340365] dump_backtrace+0x0/0x1c8
[ 392.344065] show_stack+0x14/0x20
[ 392.347416] sched_show_task+0x224/0x2e8
[ 392.351377] rcu_check_callbacks+0x8ac/0x8b0
[ 392.355686] update_process_times+0x2c/0x58
[ 392.359908] tick_sched_handle.isra.5+0x30/0x50
[ 392.364479] tick_sched_timer+0x40/0x90
[ 392.368351] __hrtimer_run_queues+0xfc/0x208
[ 392.372659] hrtimer_interrupt+0xd4/0x258
[ 392.376710] arch_timer_handler_virt+0x28/0x48
[ 392.381194] handle_percpu_devid_irq+0x80/0x138
[ 392.385767] generic_handle_irq+0x28/0x40
[ 392.389813] __handle_domain_irq+0x5c/0xb8
[ 392.393946] gic_handle_irq+0x58/0xa8
[ 392.397640] el1_irq+0xb4/0x130
[ 392.400810] arch_cpu_idle+0x14/0x20
[ 392.404422] default_idle_call+0x1c/0x38
[ 392.408381] do_idle+0x17c/0x1f8
[ 392.411640] cpu_startup_entry+0x20/0x28
[ 392.415598] rest_init+0x24c/0x260
[ 392.419037] start_kernel+0x3e8/0x414
I was running the same tests earlier on another ARM64 platform, using the
same build, which has more than two cores, and there I could not observe
this issue.
--
Regards,
Niklas Söderlund
Heiner,
On 12 April 2018 at 21:43, Heiner Kallweit <[email protected]> wrote:
>>>>
>>>> I'm going to prepare a debug patch to spy on what's happening when entering idle.
>>
>> I'd like to narrow the problem a bit more with the 2 patches below. Can you try
>> them separately on top of c18bb396d3d261eb ("Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net")
>> and check if one of them fixes the problem?
>>
>> (They should apply on linux-next as well)
>>
>> The first patch always kicks ilb instead of doing ilb on the local cpu before entering idle:
>>
>> ---
>> kernel/sched/fair.c | 3 +--
>> 1 file changed, 1 insertion(+), 2 deletions(-)
>>
>> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
>> index 0951d1c..b21925b 100644
>> --- a/kernel/sched/fair.c
>> +++ b/kernel/sched/fair.c
>> @@ -9739,8 +9739,7 @@ static void nohz_newidle_balance(struct rq *this_rq)
>> * candidate for ilb instead of waking up another idle CPU.
>> * Kick an normal ilb if we failed to do the update.
>> */
>> - if (!_nohz_idle_balance(this_rq, NOHZ_STATS_KICK, CPU_NEWLY_IDLE))
>> - kick_ilb(NOHZ_STATS_KICK);
>> + kick_ilb(NOHZ_STATS_KICK);
>> raw_spin_lock(&this_rq->lock);
>> }
>>
>>
> I tested both patches; with both of them the issue still occurs. However,
> on top of linux-next from yesterday I have the impression that it happens
> less frequently with the second patch.
> On top of the commit mentioned by you I don't see a change in system behavior
> with either patch.
Thanks for the tests.
I was expecting more of a difference between the 2 patches, and especially
no problem with the 1st patch, which only sends an IPI reschedule to the
other CPU if it is idle.
It seems to not really be related to what is done, but to the fact that it
is done at that place in the code.
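For context, "that place" is the early-exit path of idle_balance(), where the
newly-idle stats update runs just before the CPU goes idle and where
nohz_newidle_balance() temporarily drops the runqueue lock. Roughly like this
(sketched from the 4.17-rc sources, not verbatim from the trees under test):

static int idle_balance(struct rq *this_rq, struct rq_flags *rf)
{
	...
	if (this_rq->avg_idle < sysctl_sched_migration_cost ||
	    !this_rq->rd->overload) {
		rcu_read_lock();
		sd = rcu_dereference_check_sched_domain(this_rq->sd);
		if (sd)
			update_next_balance(sd, &next_balance);
		rcu_read_unlock();

		/* the call both debug patches modify lives in here */
		nohz_newidle_balance(this_rq);

		goto out;
	}
	...
}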
Thanks
>
> Regards, Heiner
Hi Niklas,
On 13 April 2018 at 00:39, Niklas Söderlund
<[email protected]> wrote:
> Hi Vincent,
>
> Thanks for helping to figure this out.
>
> On 2018-04-12 15:30:31 +0200, Vincent Guittot wrote:
>
> [snip]
>
>>
>> I'd like to narrow the problem a bit more with the 2 patches below. Can you try
>> them separately on top of c18bb396d3d261eb ("Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net")
>> and check if one of them fixes the problem?
>
> I tried your suggested changes on top of c18bb396d3d261eb.
>
>>
>> (They should apply on linux-next as well)
>>
>> The first patch always kicks ilb instead of doing ilb on the local cpu before entering idle:
>>
>> ---
>> kernel/sched/fair.c | 3 +--
>> 1 file changed, 1 insertion(+), 2 deletions(-)
>>
>> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
>> index 0951d1c..b21925b 100644
>> --- a/kernel/sched/fair.c
>> +++ b/kernel/sched/fair.c
>> @@ -9739,8 +9739,7 @@ static void nohz_newidle_balance(struct rq *this_rq)
>> * candidate for ilb instead of waking up another idle CPU.
>> * Kick an normal ilb if we failed to do the update.
>> */
>> - if (!_nohz_idle_balance(this_rq, NOHZ_STATS_KICK, CPU_NEWLY_IDLE))
>> - kick_ilb(NOHZ_STATS_KICK);
>> + kick_ilb(NOHZ_STATS_KICK);
>> raw_spin_lock(&this_rq->lock);
>> }
>
> This change doesn't seem to affect the issue. I can still get the single
> ssh session and the system to lock up by hitting the return key, and
> opening a second ssh session immediately unblocks both the first ssh
> session and the serial console. I can still trigger the console warning
> by just letting the system be once it locks up. Just as before, I had to
> reset the system a few times to trigger the issue.
Your results are similar to Heiner's. The problem is still there even
if we only kick ilb, which mainly sends an IPI reschedule to the other
CPU if it is idle.
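For reference, a sketch of what kick_ilb() does at this point (again from the
4.17-rc sources as I remember them, so details may differ slightly):

static void kick_ilb(unsigned int flags)
{
	int ilb_cpu;

	nohz.next_balance++;

	/* pick an idle CPU to run the nohz idle load balance */
	ilb_cpu = find_new_ilb();
	if (ilb_cpu >= nr_cpu_ids)
		return;

	/* merge our flags; bail out if a kick is already pending there */
	flags = atomic_fetch_or(flags, nohz_flags(ilb_cpu));
	if (flags & NOHZ_KICK_MASK)
		return;

	/*
	 * Use smp_send_reschedule() instead of resched_cpu(): it raises
	 * a sched IPI on the idle target CPU, and the softirq doing the
	 * nohz idle load balance will run before returning from that IPI.
	 */
	smp_send_reschedule(ilb_cpu);
}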
>
> [ 245.351693] INFO: rcu_sched detected stalls on CPUs/tasks:
> [ 245.357199] 0-...!: (1 GPs behind) idle=93c/0/0 softirq=2224/2225 fqs=0
> [ 245.363988] (detected by 1, t=3025 jiffies, g=337, c=336, q=10)
> [ 245.370003] Sending NMI from CPU 1 to CPUs 0:
> [ 245.374368] NMI backtrace for cpu 0
> [ 245.374377] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.16.0-10930-ged741fb4567c816f #42
> [ 245.374379] Hardware name: Generic R8A7791 (Flattened Device Tree)
> [ 245.374393] PC is at arch_cpu_idle+0x24/0x40
> [ 245.374397] LR is at arch_cpu_idle+0x34/0x40
> [ 245.374400] pc : [<c0108554>] lr : [<c0108564>] psr: 60050013
> [ 245.374403] sp : c0b01f40 ip : c0b01f50 fp : c0b01f4c
> [ 245.374405] r10: c0a56a38 r9 : e7fffbc0 r8 : c0b04c00
> [ 245.374407] r7 : c0b04c78 r6 : c0b04c2c r5 : ffffe000 r4 : 00000001
> [ 245.374410] r3 : c0119100 r2 : e77813a8 r1 : 0002d93c r0 : 00000000
> [ 245.374414] Flags: nZCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment none
> [ 245.374417] Control: 10c5387d Table: 6662006a DAC: 00000051
> [ 245.374421] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.16.0-10930-ged741fb4567c816f #42
> [ 245.374423] Hardware name: Generic R8A7791 (Flattened Device Tree)
> [ 245.374425] Backtrace:
> [ 245.374435] [<c010c170>] (dump_backtrace) from [<c010c3f0>] (show_stack+0x18/0x1c)
> [ 245.374440] r7:c0b47278 r6:60050193 r5:00000000 r4:c0b73d80
> [ 245.374450] [<c010c3d8>] (show_stack) from [<c074b824>] (dump_stack+0x84/0xa4)
> [ 245.374456] [<c074b7a0>] (dump_stack) from [<c01087e8>] (show_regs+0x14/0x18)
> [ 245.374460] r7:c0b47278 r6:c0b01ef0 r5:00000000 r4:c0bc62c8
> [ 245.374468] [<c01087d4>] (show_regs) from [<c0751938>] (nmi_cpu_backtrace+0xfc/0x118)
> [ 245.374475] [<c075183c>] (nmi_cpu_backtrace) from [<c010ef34>] (handle_IPI+0x22c/0x294)
> [ 245.374479] r7:c0b47278 r6:c0b01ef0 r5:00000007 r4:c0a775fc
> [ 245.374488] [<c010ed08>] (handle_IPI) from [<c037cd68>] (gic_handle_irq+0x8c/0x98)
> [ 245.374492] r10:c0a56a38 r9:c0b00000 r8:f0803000 r7:c0b47278 r6:c0b01ef0 r5:c0b05244
> [ 245.374495] r4:f0802000 r3:00000407
> [ 245.374501] [<c037ccdc>] (gic_handle_irq) from [<c0101a0c>] (__irq_svc+0x6c/0x90)
> [ 245.374504] Exception stack(0xc0b01ef0 to 0xc0b01f38)
> [ 245.374507] 1ee0: 00000000 0002d93c e77813a8 c0119100
> [ 245.374512] 1f00: 00000001 ffffe000 c0b04c2c c0b04c78 c0b04c00 e7fffbc0 c0a56a38 c0b01f4c
> [ 245.374516] 1f20: c0b01f50 c0b01f40 c0108564 c0108554 60050013 ffffffff
> [ 245.374521] r9:c0b00000 r8:c0b04c00 r7:c0b01f24 r6:ffffffff r5:60050013 r4:c0108554
> [ 245.374528] [<c0108530>] (arch_cpu_idle) from [<c076512c>] (default_idle_call+0x30/0x34)
> [ 245.374535] [<c07650fc>] (default_idle_call) from [<c014b834>] (do_idle+0xd8/0x128)
> [ 245.374540] [<c014b75c>] (do_idle) from [<c014baf0>] (cpu_startup_entry+0x20/0x24)
> [ 245.374543] r7:c0b04c08 r6:ffffffff r5:c0b80380 r4:000000c2
> [ 245.374549] [<c014bad0>] (cpu_startup_entry) from [<c075fb14>] (rest_init+0x9c/0xbc)
> [ 245.374555] [<c075fa78>] (rest_init) from [<c0a00dc4>] (start_kernel+0x368/0x3ec)
> [ 245.374558] r5:c0b80380 r4:c0b803c0
> [ 245.374563] [<c0a00a5c>] (start_kernel) from [<00000000>] ( (null))
> [ 245.375369] rcu_sched kthread starved for 3028 jiffies! g337 c336 f0x0 RCU_GP_WAIT_FQS(3) ->state=0x402 ->cpu=1
> [ 245.641242] RCU grace-period kthread stack dump:
> [ 245.645859] rcu_sched I 0 9 2 0x00000000
> [ 245.651350] Backtrace:
> [ 245.653804] [<c07603f0>] (__schedule) from [<c0760b44>] (schedule+0x94/0xb8)
> [ 245.660857] r10:e77903c0 r9:e77903c0 r8:ffffffff r7:e709bed4 r6:ffffded7 r5:00000000
> [ 245.668689] r4:ffffe000
> [ 245.671227] [<c0760ab0>] (schedule) from [<c0764500>] (schedule_timeout+0x380/0x3dc)
> [ 245.678973] r5:00000000 r4:00000000
> [ 245.682557] [<c0764180>] (schedule_timeout) from [<c017b620>] (rcu_gp_kthread+0x668/0xe2c)
> [ 245.690826] r10:c0b79018 r9:00000151 r8:00000150 r7:00000001 r6:00000000 r5:c0b10ad0
> [ 245.698658] r4:c0b10980
> [ 245.701197] [<c017afb8>] (rcu_gp_kthread) from [<c013ddd8>] (kthread+0x148/0x160)
> [ 245.708682] r7:c0b10980
> [ 245.711218] [<c013dc90>] (kthread) from [<c01010e8>] (ret_from_fork+0x14/0x2c)
> [ 245.718443] Exception stack(0xe709bfb0 to 0xe709bff8)
> [ 245.723495] bfa0: 00000000 00000000 00000000 00000000
> [ 245.731677] bfc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
> [ 245.739858] bfe0: 00000000 00000000 00000000 00000000 00000013 00000000
> [ 245.746476] r10:00000000 r9:00000000 r8:00000000 r7:00000000 r6:00000000 r5:c013dc90
> [ 245.754307] r4:e7026f40
> [ 335.351162] INFO: rcu_sched detected stalls on CPUs/tasks:
> [ 335.356682] 0-...!: (0 ticks this GP) idle=b0a/1/1073741824 softirq=2226/2226 fqs=0
> [ 335.364521] (detected by 1, t=6027 jiffies, g=340, c=339, q=77)
> [ 335.370545] Sending NMI from CPU 1 to CPUs 0:
> [ 345.374803] NMI backtrace for cpu 1
> [ 345.378295] CPU: 1 PID: 1926 Comm: bash Not tainted 4.16.0-10930-ged741fb4567c816f #42
> [ 345.386215] Hardware name: Generic R8A7791 (Flattened Device Tree)
> [ 345.392395] Backtrace:
> [ 345.394855] [<c010c170>] (dump_backtrace) from [<c010c3f0>] (show_stack+0x18/0x1c)
> [ 345.402430] r7:00000000 r6:600f0193 r5:00000000 r4:c0b73d80
> [ 345.408097] [<c010c3d8>] (show_stack) from [<c074b824>] (dump_stack+0x84/0xa4)
> [ 345.415329] [<c074b7a0>] (dump_stack) from [<c0751940>] (nmi_cpu_backtrace+0x104/0x118)
> [ 345.423337] r7:00000000 r6:00000000 r5:00000001 r4:c0bc62c8
> [ 345.429001] [<c075183c>] (nmi_cpu_backtrace) from [<c07519e0>] (nmi_trigger_cpumask_backtrace+0x8c/0x10c)
> [ 345.438572] r7:00000000 r6:c010e4d4 r5:00000001 r4:c0b08158
> [ 345.444237] [<c0751954>] (nmi_trigger_cpumask_backtrace) from [<c010f0ec>] (arch_trigger_cpumask_backtrace+0x14/0x1c)
> [ 345.454850] r7:c0b10980 r6:c0b04cd0 r5:00000001 r4:c0b10980
> [ 345.460519] [<c010f0d8>] (arch_trigger_cpumask_backtrace) from [<c017f9ac>] (rcu_dump_cpu_stacks+0xb0/0xd0)
> [ 345.470268] [<c017f8fc>] (rcu_dump_cpu_stacks) from [<c017e73c>] (rcu_check_callbacks+0x584/0x71c)
> [ 345.479232] r10:0000004d r9:c0b03d00 r8:ffffe000 r7:00000001 r6:c0b10940 r5:e7795080
> [ 345.487065] r4:c0b10980 r3:f181863e
> [ 345.490648] [<c017e1b8>] (rcu_check_callbacks) from [<c0184a60>] (update_process_times+0x38/0x64)
> [ 345.499524] r10:e7fa9ea0 r9:e67e8000 r8:e7795e00 r7:0000004e r6:00000000 r5:e72d0640
> [ 345.507356] r4:ffffe000
> [ 345.509903] [<c0184a28>] (update_process_times) from [<c0195bf0>] (tick_nohz_handler+0xc8/0x114)
> [ 345.518693] r7:0000004e r6:146afc64 r5:e67e9d70 r4:e77917a0
> [ 345.524362] [<c0195b28>] (tick_nohz_handler) from [<c05c16ac>] (arch_timer_handler_virt+0x30/0x40)
> [ 345.533325] r8:c0b05244 r7:00000013 r6:e7017a40 r5:00000000 r4:e7015c00
> [ 345.540033] [<c05c167c>] (arch_timer_handler_virt) from [<c016ecd4>] (handle_percpu_devid_irq+0xf0/0x224)
> [ 345.549607] [<c016ebe4>] (handle_percpu_devid_irq) from [<c016a208>] (generic_handle_irq+0x20/0x30)
> [ 345.558657] r10:e7fa9ea0 r9:e67e8000 r8:e7008800 r7:00000013 r6:c0a775fc r5:00000000
> [ 345.566489] r4:00000000 r3:c016ebe4
> [ 345.570067] [<c016a1e8>] (generic_handle_irq) from [<c016a2c0>] (__handle_domain_irq+0xa8/0xbc)
> [ 345.578777] [<c016a218>] (__handle_domain_irq) from [<c037cd3c>] (gic_handle_irq+0x60/0x98)
> [ 345.587132] r9:e67e8000 r8:f0803000 r7:c0b47278 r6:e67e9d70 r5:c0b05244 r4:f0802000
> [ 345.594883] [<c037ccdc>] (gic_handle_irq) from [<c0101a0c>] (__irq_svc+0x6c/0x90)
> [ 345.602368] Exception stack(0xe67e9d70 to 0xe67e9db8)
> [ 345.607421] 9d60: 00000000 e6c433e8 00182b08 00000000
> [ 345.615603] 9d80: e6c429b4 00000087 e67e9e34 e6c434c4 00000000 00000087 e7fa9ea0 e67e9e14
> [ 345.623784] 9da0: 00000000 e67e9dc0 c01e0f34 c01e0f88 600f0113 ffffffff
> [ 345.630401] r9:e67e8000 r8:00000000 r7:e67e9da4 r6:ffffffff r5:600f0113 r4:c01e0f88
> [ 345.638156] [<c01e0d84>] (filemap_map_pages) from [<c02141f0>] (handle_mm_fault+0x5c8/0x93c)
> [ 345.646599] r10:e5dc5380 r9:80000007 r8:00000080 r7:b6e02000 r6:c0b04c08 r5:0000008f
> [ 345.654431] r4:00000000
> [ 345.656968] [<c0213c28>] (handle_mm_fault) from [<c011404c>] (do_page_fault+0x17c/0x324)
> [ 345.665063] r10:e5dc5380 r9:80000007 r8:e72d0640 r7:00000054 r6:e5d030c0 r5:b6e023dc
> [ 345.672895] r4:e67e9fb0
> [ 345.675431] [<c0113ed0>] (do_page_fault) from [<c0114470>] (do_PrefetchAbort+0x4c/0xc8)
> [ 345.683439] r10:00000010 r9:00179207 r8:c0b0a434 r7:e67e9fb0 r6:b6e023dc r5:c0b04c08
> [ 345.691271] r4:00000007
> [ 345.693807] [<c0114424>] (do_PrefetchAbort) from [<c010204c>] (ret_from_exception+0x0/0x14)
> [ 345.702160] Exception stack(0xe67e9fb0 to 0xe67e9ff8)
> [ 345.707213] 9fa0: 00000000 000e9600 00000001 bec86544
> [ 345.715395] 9fc0: 00000001 bec86544 00000000 00000000 bec8654c 00179207 00000010 000e9600
> [ 345.723576] 9fe0: 000c8140 bec86538 0002847c b6e023dc 200f0010 ffffffff
> [ 345.730192] r8:10c5387d r7:10c5387d r6:ffffffff r5:200f0010 r4:b6e023dc
> [ 345.737364] nfs: server 10.0.1.1 not responding, still trying
> [ 345.743294] nfs: server 10.0.1.1 not responding, still trying
> [ 345.744534] nfs: server 10.0.1.1 not responding, still trying
> [ 345.749262] nfs: server 10.0.1.1 OK
> [ 345.755391] nfs: server 10.0.1.1 OK
> [ 345.758641] nfs: server 10.0.1.1 OK
>
>>
>> --
>> 2.7.4
>>
>>
>> The 2nd patch does the opposite and never kicks ilb. I have always made the
>> assumption that the problem was coming from _nohz_idle_balance() and I just
>> wanted to check that that's the case.
>>
>> ---
>> kernel/sched/fair.c | 3 +--
>> 1 file changed, 1 insertion(+), 2 deletions(-)
>>
>> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
>> index 0951d1c..1766594 100644
>> --- a/kernel/sched/fair.c
>> +++ b/kernel/sched/fair.c
>> @@ -9739,8 +9739,7 @@ static void nohz_newidle_balance(struct rq *this_rq)
>> * candidate for ilb instead of waking up another idle CPU.
>> * Kick an normal ilb if we failed to do the update.
>> */
>> - if (!_nohz_idle_balance(this_rq, NOHZ_STATS_KICK, CPU_NEWLY_IDLE))
>> - kick_ilb(NOHZ_STATS_KICK);
>> + _nohz_idle_balance(this_rq, NOHZ_STATS_KICK, CPU_NEWLY_IDLE);
>> raw_spin_lock(&this_rq->lock);
>> }
>
> With this change it seems more difficult to trigger the issue: I can
> usually trigger it within 2-3 resets of the system, but with this applied I
> had to reset it over 5 times to trigger it. But I can still reproduce
> the lockup reliably multiple times. I would not trust my feeling that
> it's harder to reproduce with this change, as the method to trigger it is
> kind of spotty and most likely circumstantial :-)
>
> I can still unlock the system by starting a second ssh session and
> trigger the output on the serial console.
>
> [ 193.741481] INFO: rcu_sched detected stalls on CPUs/tasks:
> [ 193.746991] 1-...!: (0 ticks this GP) idle=83c/0/0 softirq=2445/2445 fqs=0
> [ 193.754041] (detected by 0, t=4096 jiffies, g=611, c=610, q=31)
> [ 193.760054] NMI backtrace for cpu 0
> [ 193.763546] CPU: 0 PID: 2193 Comm: bash Not tainted 4.16.0-10929-gc18bb396d3d261eb-dirty #43
> [ 193.771984] Hardware name: Generic R8A7791 (Flattened Device Tree)
> [ 193.778163] Backtrace:
> [ 193.780625] [<c010c170>] (dump_backtrace) from [<c010c3f0>] (show_stack+0x18/0x1c)
> [ 193.788198] r7:00000000 r6:60010193 r5:00000000 r4:c0b73d80
> [ 193.793865] [<c010c3d8>] (show_stack) from [<c074b864>] (dump_stack+0x84/0xa4)
> [ 193.801093] [<c074b7e0>] (dump_stack) from [<c0751980>] (nmi_cpu_backtrace+0x104/0x118)
> [ 193.809098] r7:00000000 r6:00000000 r5:00000000 r4:c0bc62c8
> [ 193.814761] [<c075187c>] (nmi_cpu_backtrace) from [<c0751a20>] (nmi_trigger_cpumask_backtrace+0x8c/0x10c)
> [ 193.824329] r7:00000000 r6:c010e4d4 r5:00000000 r4:c0b08158
> [ 193.829992] [<c0751994>] (nmi_trigger_cpumask_backtrace) from [<c010f0ec>] (arch_trigger_cpumask_backtrace+0x14/0x1c)
> [ 193.840603] r7:c0b10980 r6:c0b04cd0 r5:00000000 r4:c0b10980
> [ 193.846268] [<c010f0d8>] (arch_trigger_cpumask_backtrace) from [<c017f9ec>] (rcu_dump_cpu_stacks+0xb0/0xd0)
> [ 193.856015] [<c017f93c>] (rcu_dump_cpu_stacks) from [<c017e77c>] (rcu_check_callbacks+0x584/0x71c)
> [ 193.864977] r10:0000001f r9:c0b03d00 r8:ffffe000 r7:00000001 r6:c0b10940 r5:e7786080
> [ 193.872810] r4:c0b10980 r3:f181863e
> [ 193.876392] [<c017e1f8>] (rcu_check_callbacks) from [<c0184aa0>] (update_process_times+0x38/0x64)
> [ 193.885268] r10:00000000 r9:00180b98 r8:e7786e00 r7:0000002d r6:00000001 r5:e5d86180
> [ 193.893100] r4:ffffe000
> [ 193.895642] [<c0184a68>] (update_process_times) from [<c0195c30>] (tick_nohz_handler+0xc8/0x114)
> [ 193.904431] r7:0000002d r6:1bcdda00 r5:e66c7fb0 r4:e77827a0
> [ 193.910099] [<c0195b68>] (tick_nohz_handler) from [<c05c16ec>] (arch_timer_handler_virt+0x30/0x40)
> [ 193.919062] r8:c0b05244 r7:00000013 r6:e7017a40 r5:00000000 r4:e7015c00
> [ 193.925769] [<c05c16bc>] (arch_timer_handler_virt) from [<c016ed14>] (handle_percpu_devid_irq+0xf0/0x224)
> [ 193.935343] [<c016ec24>] (handle_percpu_devid_irq) from [<c016a248>] (generic_handle_irq+0x20/0x30)
> [ 193.944394] r10:00000000 r9:00180b98 r8:e7008800 r7:00000013 r6:c0a775fc r5:00000000
> [ 193.952225] r4:00000000 r3:c016ec24
> [ 193.955803] [<c016a228>] (generic_handle_irq) from [<c016a300>] (__handle_domain_irq+0xa8/0xbc)
> [ 193.964513] [<c016a258>] (__handle_domain_irq) from [<c037cd7c>] (gic_handle_irq+0x60/0x98)
> [ 193.972869] r9:00180b98 r8:f0803000 r7:c0b47278 r6:e66c7fb0 r5:c0b05244 r4:f0802000
> [ 193.980619] [<c037cd1c>] (gic_handle_irq) from [<c0101dd4>] (__irq_usr+0x54/0x80)
> [ 193.988105] Exception stack(0xe66c7fb0 to 0xe66c7ff8)
> [ 193.993157] 7fa0: 00000000 00000000 b6e8b79c 00000000
> [ 194.001339] 7fc0: 00178378 000134e0 b6e8b794 b6e8b000 b6e8b794 00180b98 00000000 b6e8b7c4
> [ 194.009520] 7fe0: be955190 be955170 00000000 b6db5c58 20010010 ffffffff
> [ 194.016137] r9:00180b98 r8:10c5387d r7:10c5387d r6:ffffffff r5:20010010 r4:b6db5c58
> [ 194.023888] Sending NMI from CPU 0 to CPUs 1:
> [ 204.028147] NMI backtrace for cpu 1
> [ 204.028152] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 4.16.0-10929-gc18bb396d3d261eb-dirty #43
> [ 204.028155] Hardware name: Generic R8A7791 (Flattened Device Tree)
> [ 204.028164] PC is at _raw_spin_unlock_irqrestore+0x28/0x2c
> [ 204.028169] LR is at rcu_report_qs_rnp+0x4c/0x20c
> [ 204.028172] pc : [<c07653d8>] lr : [<c01796d4>] psr: 200e0113
> [ 204.028175] sp : e70a3db0 ip : e70a3dc0 fp : e70a3dbc
> [ 204.028177] r10: 00000264 r9 : c0b787eb r8 : 00000001
> [ 204.028179] r7 : 00000002 r6 : c0b10980 r5 : c0b10980 r4 : c0b10980
> [ 204.028182] r3 : 000022a3 r2 : 00000264 r1 : 600e0113 r0 : c0b10980
> [ 204.028187] Flags: nzCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment none
> [ 204.028189] Control: 10c5387d Table: 66b9806a DAC: 00000051
> [ 204.028193] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 4.16.0-10929-gc18bb396d3d261eb-dirty #43
> [ 204.028195] Hardware name: Generic R8A7791 (Flattened Device Tree)
> [ 204.028197] Backtrace:
> [ 204.028204] [<c010c170>] (dump_backtrace) from [<c010c3f0>] (show_stack+0x18/0x1c)
> [ 204.028209] r7:c0b47278 r6:600e0193 r5:00000000 r4:c0b73d80
> [ 204.028215] [<c010c3d8>] (show_stack) from [<c074b864>] (dump_stack+0x84/0xa4)
> [ 204.028228] [<c074b7e0>] (dump_stack) from [<c01087e8>] (show_regs+0x14/0x18)
> [ 204.028231] r7:c0b47278 r6:e70a3d60 r5:00000001 r4:c0bc62c8
> [ 204.028238] [<c01087d4>] (show_regs) from [<c0751978>] (nmi_cpu_backtrace+0xfc/0x118)
> [ 204.028244] [<c075187c>] (nmi_cpu_backtrace) from [<c010ef34>] (handle_IPI+0x22c/0x294)
> [ 204.028247] r7:c0b47278 r6:e70a3d60 r5:00000007 r4:c0a775fc
> [ 204.028254] [<c010ed08>] (handle_IPI) from [<c037cda8>] (gic_handle_irq+0x8c/0x98)
> [ 204.028259] r10:00000264 r9:e70a2000 r8:f0803000 r7:c0b47278 r6:e70a3d60 r5:c0b05244
> [ 204.028261] r4:f0802000 r3:00000007
> [ 204.028267] [<c037cd1c>] (gic_handle_irq) from [<c0101a0c>] (__irq_svc+0x6c/0x90)
> [ 204.028270] Exception stack(0xe70a3d60 to 0xe70a3da8)
> [ 204.028275] 3d60: c0b10980 600e0113 00000264 000022a3 c0b10980 c0b10980 c0b10980 00000002
> [ 204.028280] 3d80: 00000001 c0b787eb 00000264 e70a3dbc e70a3dc0 e70a3db0 c01796d4 c07653d8
> [ 204.028283] 3da0: 200e0113 ffffffff
> [ 204.028287] r9:e70a2000 r8:00000001 r7:e70a3d94 r6:ffffffff r5:200e0113 r4:c07653d8
> [ 204.028295] [<c07653b0>] (_raw_spin_unlock_irqrestore) from [<c01796d4>] (rcu_report_qs_rnp+0x4c/0x20c)
> [ 204.028301] [<c0179688>] (rcu_report_qs_rnp) from [<c017c188>] (rcu_process_callbacks+0x364/0x8cc)
> [ 204.028305] r10:c0b03080 r9:00000000 r8:00000002 r7:600e0113 r6:c0b10980 r5:c0b10980
> [ 204.028306] r4:e7795080
> [ 204.028312] [<c017be24>] (rcu_process_callbacks) from [<c010236c>] (__do_softirq+0x23c/0x338)
> [ 204.028316] r10:c0b03080 r9:00000100 r8:c0b80ec0 r7:00000002 r6:00000009 r5:c0b030a4
> [ 204.028318] r4:ffffe000
> [ 204.028325] [<c0102130>] (__do_softirq) from [<c0125b90>] (irq_exit+0x6c/0xd4)
> [ 204.028329] r10:00000000 r9:e70a2000 r8:e7008800 r7:00000013 r6:c0a775fc r5:00000000
> [ 204.028331] r4:00000000
> [ 204.028336] [<c0125b24>] (irq_exit) from [<c016a2e0>] (__handle_domain_irq+0x88/0xbc)
> [ 204.028338] r5:00000000 r4:00000000
> [ 204.028344] [<c016a258>] (__handle_domain_irq) from [<c037cd7c>] (gic_handle_irq+0x60/0x98)
> [ 204.028348] r9:e70a2000 r8:f0803000 r7:c0b47278 r6:e70a3f40 r5:c0b05244 r4:f0802000
> [ 204.028354] [<c037cd1c>] (gic_handle_irq) from [<c0101a0c>] (__irq_svc+0x6c/0x90)
> [ 204.028356] Exception stack(0xe70a3f40 to 0xe70a3f88)
> [ 204.028361] 3f40: 00000000 00013844 e77903a8 c0119100 00000002 ffffe000 c0b04c2c c0b04c78
> [ 204.028367] 3f60: 4000406a 413fc0f2 00000000 e70a3f9c e70a3fa0 e70a3f90 c0108564 c0108554
> [ 204.028370] 3f80: 600e0013 ffffffff
> [ 204.028374] r9:e70a2000 r8:4000406a r7:e70a3f74 r6:ffffffff r5:600e0013 r4:c0108554
> [ 204.028381] [<c0108530>] (arch_cpu_idle) from [<c076516c>] (default_idle_call+0x30/0x34)
> [ 204.028388] [<c076513c>] (default_idle_call) from [<c014b834>] (do_idle+0xd8/0x128)
> [ 204.028392] [<c014b75c>] (do_idle) from [<c014baf0>] (cpu_startup_entry+0x20/0x24)
> [ 204.028396] r7:c0b80750 r6:10c0387d r5:00000051 r4:00000085
> [ 204.028401] [<c014bad0>] (cpu_startup_entry) from [<c010eacc>] (secondary_start_kernel+0x10c/0x12c)
> [ 204.028406] [<c010e9c0>] (secondary_start_kernel) from [<4010250c>] (0x4010250c)
> [ 204.028408] r5:00000051 r4:6709406a
>
>>
>> --
>> 2.7.4
>>
>>
>> Thanks,
>> Vincent
>>
>> > >
>> > >
>> > > Regards,
>> > > Vincent
>> > > >
>> > > > 1. c18bb396d3d261eb ("Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net")
>> > > > 2. 31e77c93e432dec7 ("sched/fair: Update blocked load when newly idle")
>> > > >
>> > > > --
>> > > > Regards,
>> > > > Niklas Söderlund
>> >
>> > --
>> > Regards,
>> > Niklas Söderlund
>
> --
> Regards,
> Niklas Söderlund
On 13 April 2018 at 22:38, Niklas Söderlund
<[email protected]> wrote:
> Hi Vincent,
>
> On 2018-04-12 13:15:19 +0200, Niklas Söderlund wrote:
>> Hi Vincent,
>>
>> Thanks for your feedback.
>>
>> On 2018-04-12 12:33:27 +0200, Vincent Guittot wrote:
>> > Hi Niklas,
>> >
>> > On 12 April 2018 at 11:18, Niklas Söderlund
>> > <[email protected]> wrote:
>> > > Hi Vincent,
>> > >
>> > > I have observed issues running on linus/master from a few days back [1].
>> > > I'm running on a Renesas Koelsch board (arm32) and I can trigger a issue
>> > > by X forwarding the v4l2 test application qv4l2 over ssh and moving the
>> > > courser around in the GUI (best test case description award...). I'm
>> > > sorry about the really bad way I trigger this but I can't do it in any
>> > > other way, I'm happy to try other methods if you got some ideas. The
>> > > symptom of the issue is a complete hang of the system for more then 30
>> > > seconds and then this information is printed in the console:
>> >
>> > Heiner (added to cc) also reported a similar problem with his platform: a
>> > dual-core Celeron.
>> >
>> > Can you confirm that your platform is a dual Cortex-A15? At least that's
>> > what I have seen on the web.
>> > This would confirm that a dual-core system is a key factor.
>>
>> I can confirm that my platform is a dual core.
>
> I tested another dual-core system today, a Renesas M3-W ARM64 system, and I
> can observe the same lockups on that system, if it helps you understand
> the problem. It seems to be much harder to trigger the issue on this
> system for some reason. Hitting return in an ssh session doesn't seem to
> produce the lockup, while starting a GUI using X forwarding over ssh
> still triggers it.
Thanks for the test. That confirms it only happens on dual-core systems.
>
> [ 392.306441] INFO: rcu_preempt detected stalls on CPUs/tasks:
> [ 392.312201] (detected by 0, t=19366 jiffies, g=7177, c=7176, q=35)
> [ 392.318555] All QSes seen, last rcu_preempt kthread activity 19368
> (4294990375-4294971007), jiffies_till_next_fqs=1, root ->qsmask 0x0
> [ 392.330758] swapper/0 R running task 0 0 0
> 0x00000022
> [ 392.337883] Call trace:
> [ 392.340365] dump_backtrace+0x0/0x1c8
> [ 392.344065] show_stack+0x14/0x20
> [ 392.347416] sched_show_task+0x224/0x2e8
> [ 392.351377] rcu_check_callbacks+0x8ac/0x8b0
> [ 392.355686] update_process_times+0x2c/0x58
> [ 392.359908] tick_sched_handle.isra.5+0x30/0x50
> [ 392.364479] tick_sched_timer+0x40/0x90
> [ 392.368351] __hrtimer_run_queues+0xfc/0x208
> [ 392.372659] hrtimer_interrupt+0xd4/0x258
> [ 392.376710] arch_timer_handler_virt+0x28/0x48
> [ 392.381194] handle_percpu_devid_irq+0x80/0x138
> [ 392.385767] generic_handle_irq+0x28/0x40
> [ 392.389813] __handle_domain_irq+0x5c/0xb8
> [ 392.393946] gic_handle_irq+0x58/0xa8
> [ 392.397640] el1_irq+0xb4/0x130
> [ 392.400810] arch_cpu_idle+0x14/0x20
> [ 392.404422] default_idle_call+0x1c/0x38
> [ 392.408381] do_idle+0x17c/0x1f8
> [ 392.411640] cpu_startup_entry+0x20/0x28
> [ 392.415598] rest_init+0x24c/0x260
> [ 392.419037] start_kernel+0x3e8/0x414
>
> I was running the same tests earlier on another ARM64 platform, which has
> more than two cores, using the same build, and there I could not observe
> this issue.
>
> --
> Regards,
> Niklas Söderlund
Hi Heiner and Niklas,
On Saturday 14 Apr 2018 at 13:24:20 (+0200), Vincent Guittot wrote:
> Hi Niklas,
>
> On 13 April 2018 at 00:39, Niklas Söderlund
> <[email protected]> wrote:
> > Hi Vincent,
> >
> > Thanks for helping trying to figure this out.
> >
> > On 2018-04-12 15:30:31 +0200, Vincent Guittot wrote:
> >
> > [snip]
> >
> >>
> >> I'd like to narrow the problem a bit more with the 2 patches above. Can you try
> >> them separately on top of c18bb396d3d261eb ("Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net")
> >> and check if one of them fixes the problem?
> >
> > I tried your suggested changes based on top of c18bb396d3d261eb.
> >
> >>
> >> (They should apply on linux-next as well)
> >>
> >> The first patch always kicks ilb instead of doing ilb on the local CPU before entering idle.
> >>
> >> ---
> >> kernel/sched/fair.c | 3 +--
> >> 1 file changed, 1 insertion(+), 2 deletions(-)
> >>
> >> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> >> index 0951d1c..b21925b 100644
> >> --- a/kernel/sched/fair.c
> >> +++ b/kernel/sched/fair.c
> >> @@ -9739,8 +9739,7 @@ static void nohz_newidle_balance(struct rq *this_rq)
> >> * candidate for ilb instead of waking up another idle CPU.
> >> * Kick an normal ilb if we failed to do the update.
> >> */
> >> - if (!_nohz_idle_balance(this_rq, NOHZ_STATS_KICK, CPU_NEWLY_IDLE))
> >> - kick_ilb(NOHZ_STATS_KICK);
> >> + kick_ilb(NOHZ_STATS_KICK);
> >> raw_spin_lock(&this_rq->lock);
> >> }
> >
> > This change doesn't seem to affect the issue. I can still get the single
> > ssh session and the system to lock up by hitting the return key. And
> > opening a second ssh session immediately unblocks both the first ssh
> > session and the serial console. And I can still trigger the console
> > warning by just letting the system be once it locks up. Just as
> > before, I had to reset the system a few times to trigger the issue.
>
> Your results are similar to Heiner's. The problem is still there
> even if we only kick ilb, which mainly sends an IPI reschedule to the
> other CPU if it is idle.
>
Would it be possible to record some traces of the problem to get a better view
of what happens?
I have a small patch that adds some traces in the functions that seem to create
the problem:
---
kernel/sched/fair.c | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 0951d1c..a951464 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -9606,6 +9606,8 @@ static bool _nohz_idle_balance(struct rq *this_rq, unsigned int flags,
*/
WRITE_ONCE(nohz.has_blocked, 0);
+ trace_printk("_nohz_idle_balance cpu %d idle %d flag %x", this_cpu, idle, flags);
+
/*
* Ensures that if we miss the CPU, we must see the has_blocked
* store from nohz_balance_enter_idle().
@@ -9680,6 +9682,8 @@ static bool _nohz_idle_balance(struct rq *this_rq, unsigned int flags,
if (likely(update_next_balance))
nohz.next_balance = next_balance;
+ trace_printk("_nohz_idle_balance return %d", ret);
+
return ret;
}
@@ -9732,6 +9736,8 @@ static void nohz_newidle_balance(struct rq *this_rq)
time_before(jiffies, READ_ONCE(nohz.next_blocked)))
return;
+ trace_printk("nohz_newidle_balance start update");
+
raw_spin_unlock(&this_rq->lock);
/*
* This CPU is going to be idle and blocked load of idle CPUs
--
2.7.4
Also, it would be good to stop tracing when the RCU stall happens.
In case you are not familiar with the trace tool, I have put below how to configure ftrace to trace scheduler, irq, timer, etc. events and stop tracing when the dump_backtrace function is called:
trace-cmd reset > /dev/null
trace-cmd start -b 40000 -p function -l dump_backtrace:traceoff -e sched -e cpu_idle -e cpu_frequency -e timer -e ipi -e irq -e printk
trace-cmd start -b 40000 -p function -l dump_backtrace
The trace-cmd start has to be called twice to make sure that we do not trace all functions.
Once the dump happens and the trace is stopped, you can extract the traces with:
trace-cmd extract -o <trace file name>
Thanks
Vincent
On Fri, Apr 20, 2018 at 9:00 AM, Vincent Guittot
<[email protected]> wrote:
[..]
>
> On Saturday 14 Apr 2018 at 13:24:20 (+0200), Vincent Guittot wrote:
>> Hi Niklas,
>>
>> On 13 April 2018 at 00:39, Niklas Söderlund
>> <[email protected]> wrote:
>> > Hi Vincent,
>> >
>> > Thanks for helping trying to figure this out.
>> >
>> > On 2018-04-12 15:30:31 +0200, Vincent Guittot wrote:
>> >
>> > [snip]
>> >
>> >>
>> >> I'd like to narrow the problem a bit more with the 2 patches above. Can you try
>> >> them separately on top of c18bb396d3d261eb ("Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net")
>> >> and check if one of them fixes the problem?
>> >
>> > I tried your suggested changes based on top of c18bb396d3d261eb.
>> >
>> >>
>> >> (They should apply on linux-next as well)
>> >>
>> >> The first patch always kicks ilb instead of doing ilb on the local CPU before entering idle.
>> >>
>> >> ---
>> >> kernel/sched/fair.c | 3 +--
>> >> 1 file changed, 1 insertion(+), 2 deletions(-)
>> >>
>> >> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
>> >> index 0951d1c..b21925b 100644
>> >> --- a/kernel/sched/fair.c
>> >> +++ b/kernel/sched/fair.c
>> >> @@ -9739,8 +9739,7 @@ static void nohz_newidle_balance(struct rq *this_rq)
>> >> * candidate for ilb instead of waking up another idle CPU.
>> >> * Kick an normal ilb if we failed to do the update.
>> >> */
>> >> - if (!_nohz_idle_balance(this_rq, NOHZ_STATS_KICK, CPU_NEWLY_IDLE))
>> >> - kick_ilb(NOHZ_STATS_KICK);
>> >> + kick_ilb(NOHZ_STATS_KICK);
>> >> raw_spin_lock(&this_rq->lock);
>> >> }
>> >
>> > This change doesn't seem to affect the issue. I can still get the single
>> > ssh session and the system to lock up by hitting the return key. And
>> > opening a second ssh session immediately unblocks both the first ssh
>> > session and the serial console. And I can still trigger the console
>> > warning by just letting the system be once it locks up. Just as
>> > before, I had to reset the system a few times to trigger the issue.
>>
>> Your results are similar to Heiner's. The problem is still there
>> even if we only kick ilb, which mainly sends an IPI reschedule to the
>> other CPU if it is idle.
>>
>
> Would it be possible to record some traces of the problem to get a better view
> of what happens?
>
> I have a small patch that adds some traces in the functions that seem to create
> the problem:
>
> ---
> kernel/sched/fair.c | 6 ++++++
> 1 file changed, 6 insertions(+)
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 0951d1c..a951464 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -9606,6 +9606,8 @@ static bool _nohz_idle_balance(struct rq *this_rq, unsigned int flags,
> */
> WRITE_ONCE(nohz.has_blocked, 0);
>
> + trace_printk("_nohz_idle_balance cpu %d idle %d flag %x", this_cpu, idle, flags);
> +
> /*
> * Ensures that if we miss the CPU, we must see the has_blocked
> * store from nohz_balance_enter_idle().
> @@ -9680,6 +9682,8 @@ static bool _nohz_idle_balance(struct rq *this_rq, unsigned int flags,
> if (likely(update_next_balance))
> nohz.next_balance = next_balance;
>
> + trace_printk("_nohz_idle_balance return %d", ret);
> +
> return ret;
> }
>
> @@ -9732,6 +9736,8 @@ static void nohz_newidle_balance(struct rq *this_rq)
> time_before(jiffies, READ_ONCE(nohz.next_blocked)))
> return;
>
> + trace_printk("nohz_newidle_balance start update");
> +
> raw_spin_unlock(&this_rq->lock);
> /*
> * This CPU is going to be idle and blocked load of idle CPUs
> --
> 2.7.4
>
>
> Also, it would be good to stop tracing when the RCU stall happens.
>
> In case you are not familiar with the trace tool, I have put below how to configure ftrace to trace scheduler, irq, timer, etc. events and stop tracing when the dump_backtrace function is called:
>
> trace-cmd reset > /dev/null
> trace-cmd start -b 40000 -p function -l dump_backtrace:traceoff -e sched -e cpu_idle -e cpu_frequency -e timer -e ipi -e irq -e printk
> trace-cmd start -b 40000 -p function -l dump_backtrace
>
> The trace-cmd start has to be called twice to make sure that we do not trace all functions.
>
> Once the dump happens and the trace is stopped, you can extract the traces with:
>
> trace-cmd extract -o <trace file name>
Just adding to that (feel free to use or ignore these tips):
To prevent the trace_printk output from getting lost in the trace flood, you
can also call tracing_stop() from wherever the problem is detected and
look at the last set of messages, which is easier. You can also dump
the ftrace buffer to the kernel log before the lockup using
ftrace_dump(); when you do that you probably want to limit the
ftrace buffer size.
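A minimal sketch of that idea (the hook name and call site are made up
for illustration; tracing_off()/ftrace_dump() are the buffer-freeze and
dump calls declared in include/linux/kernel.h, used here in place of
tracing_stop()):

#include <linux/kernel.h>	/* tracing_off(), ftrace_dump() */

/*
 * Hypothetical debug hook, called from wherever the stall is detected.
 * It freezes the ring buffer so the interesting events are not
 * overwritten, then spills it to the kernel log. A small buffer (e.g.
 * trace_buf_size= on the kernel command line) keeps the dump short.
 */
static void debug_on_stall(void)
{
	tracing_off();		/* stop recording, keep the last events */
	ftrace_dump(DUMP_ALL);	/* dump every CPU's trace buffer to the console */
}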
thanks,
- Joel
Hi Vincent,
On 2018-04-20 18:00:13 +0200, Vincent Guittot wrote:
[snip]
> >
> > Your results are similar to Heiner's. The problem is still there
> > even if we only kick ilb, which mainly sends an IPI reschedule to the
> > other CPU if it is idle.
> >
>
> Would it be possible to record some traces of the problem to get a better view
> of what happens?
>
> I have a small patch that adds some traces in the functions that seem to create
> the problem:
Sure, I applied the patch below on top of c18bb396d3d261eb ("Merge
git://git.kernel.org/pub/scm/linux/kernel/git/davem/net").
>
> ---
> kernel/sched/fair.c | 6 ++++++
> 1 file changed, 6 insertions(+)
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 0951d1c..a951464 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -9606,6 +9606,8 @@ static bool _nohz_idle_balance(struct rq *this_rq, unsigned int flags,
> */
> WRITE_ONCE(nohz.has_blocked, 0);
>
> + trace_printk("_nohz_idle_balance cpu %d idle %d flag %x", this_cpu, idle, flags);
> +
> /*
> * Ensures that if we miss the CPU, we must see the has_blocked
> * store from nohz_balance_enter_idle().
> @@ -9680,6 +9682,8 @@ static bool _nohz_idle_balance(struct rq *this_rq, unsigned int flags,
> if (likely(update_next_balance))
> nohz.next_balance = next_balance;
>
> + trace_printk("_nohz_idle_balance return %d", ret);
> +
> return ret;
> }
>
> @@ -9732,6 +9736,8 @@ static void nohz_newidle_balance(struct rq *this_rq)
> time_before(jiffies, READ_ONCE(nohz.next_blocked)))
> return;
>
> + trace_printk("nohz_newidle_balance start update");
> +
> raw_spin_unlock(&this_rq->lock);
> /*
> * This CPU is going to be idle and blocked load of idle CPUs
> --
> 2.7.4
>
>
> Also, it would be good to stop tracing when the RCU stall happens.
>
> In case you are not familiar with the trace tool, I have put below how to configure ftrace to trace scheduler, irq, timer, etc. events and stop tracing when the dump_backtrace function is called:
>
> trace-cmd reset > /dev/null
> trace-cmd start -b 40000 -p function -l dump_backtrace:traceoff -e sched -e cpu_idle -e cpu_frequency -e timer -e ipi -e irq -e printk
> trace-cmd start -b 40000 -p function -l dump_backtrace
>
> The trace-cmd start has to be called twice to make sure that we do not trace all functions.
>
> Once the dump happens and the trace is stopped, you can extract the traces with:
>
> trace-cmd extract -o <trace file name>
Thanks for the help with trace-cmd, I have attached the full
trace-cmd.dat extracted from running this. Here is a quick run of
'trace-cmd report':
trace-cmd-1948 [000] 144.504072: bputs: pick_next_task_fair: nohz_newidle_balance start update
trace-cmd-1948 [000] 144.504079: bprint: _nohz_idle_balance: _nohz_idle_balance cpu 0 idle 2 flag 2
trace-cmd-1948 [000] 144.504082: bprint: _nohz_idle_balance: _nohz_idle_balance return 1
rcu_sched-9 [000] 144.572757: bputs: pick_next_task_fair: nohz_newidle_balance start update
rcu_sched-9 [000] 144.572765: bprint: _nohz_idle_balance: _nohz_idle_balance cpu 0 idle 2 flag 2
rcu_sched-9 [000] 144.572771: bprint: _nohz_idle_balance: _nohz_idle_balance return 1
rcu_sched-9 [000] 144.612750: bputs: pick_next_task_fair: nohz_newidle_balance start update
rcu_sched-9 [000] 144.612752: bprint: _nohz_idle_balance: _nohz_idle_balance cpu 0 idle 2 flag 2
rcu_sched-9 [000] 144.612757: bprint: _nohz_idle_balance: _nohz_idle_balance return 1
kworker/1:1-31 [001] 145.032809: bputs: pick_next_task_fair: nohz_newidle_balance start update
kworker/1:1-31 [001] 145.032813: bprint: _nohz_idle_balance: _nohz_idle_balance cpu 1 idle 2 flag 2
kworker/1:1-31 [001] 145.032817: bprint: _nohz_idle_balance: _nohz_idle_balance return 1
kworker/u5:0-460 [001] 145.249412: bputs: pick_next_task_fair: nohz_newidle_balance start update
kworker/u5:0-460 [001] 145.249416: bprint: _nohz_idle_balance: _nohz_idle_balance cpu 1 idle 2 flag 2
kworker/u5:0-460 [001] 145.249418: bprint: _nohz_idle_balance: _nohz_idle_balance return 1
rcu_sched-9 [001] 145.362764: bputs: pick_next_task_fair: nohz_newidle_balance start update
rcu_sched-9 [001] 145.362773: bprint: _nohz_idle_balance: _nohz_idle_balance cpu 1 idle 2 flag 2
rcu_sched-9 [001] 145.362775: bprint: _nohz_idle_balance: _nohz_idle_balance return 1
systemd-journal-1090 [001] 145.582817: bputs: pick_next_task_fair: nohz_newidle_balance start update
systemd-journal-1090 [001] 145.582822: bprint: _nohz_idle_balance: _nohz_idle_balance cpu 1 idle 2 flag 2
systemd-journal-1090 [001] 145.582824: bprint: _nohz_idle_balance: _nohz_idle_balance return 1
kworker/1:1-31 [001] 146.392753: bputs: pick_next_task_fair: nohz_newidle_balance start update
kworker/1:1-31 [001] 146.392756: bprint: _nohz_idle_balance: _nohz_idle_balance cpu 1 idle 2 flag 2
kworker/1:1-31 [001] 146.392760: bprint: _nohz_idle_balance: _nohz_idle_balance return 1
systemd-udevd-1723 [000] 146.832885: bputs: pick_next_task_fair: nohz_newidle_balance start update
systemd-udevd-1723 [000] 146.832891: bprint: _nohz_idle_balance: _nohz_idle_balance cpu 0 idle 2 flag 2
systemd-udevd-1723 [000] 146.832893: bprint: _nohz_idle_balance: _nohz_idle_balance return 1
rcu_sched-9 [001] 146.882739: bputs: pick_next_task_fair: nohz_newidle_balance start update
rcu_sched-9 [001] 146.882742: bprint: _nohz_idle_balance: _nohz_idle_balance cpu 1 idle 2 flag 2
rcu_sched-9 [001] 146.882747: bprint: _nohz_idle_balance: _nohz_idle_balance return 1
haveged-1823 [001] 146.952985: bputs: pick_next_task_fair: nohz_newidle_balance start update
haveged-1823 [001] 146.952988: bprint: _nohz_idle_balance: _nohz_idle_balance cpu 1 idle 2 flag 2
haveged-1823 [001] 146.952993: bprint: _nohz_idle_balance: _nohz_idle_balance return 1
rcu_sched-9 [000] 147.002743: bputs: pick_next_task_fair: nohz_newidle_balance start update
rcu_sched-9 [000] 147.002752: bprint: _nohz_idle_balance: _nohz_idle_balance cpu 0 idle 2 flag 2
rcu_sched-9 [000] 147.002755: bprint: _nohz_idle_balance: _nohz_idle_balance return 1
rcu_sched-9 [000] 147.042741: bputs: pick_next_task_fair: nohz_newidle_balance start update
rcu_sched-9 [000] 147.042744: bprint: _nohz_idle_balance: _nohz_idle_balance cpu 0 idle 2 flag 2
rcu_sched-9 [000] 147.042748: bprint: _nohz_idle_balance: _nohz_idle_balance return 1
rcu_sched-9 [000] 147.082741: bputs: pick_next_task_fair: nohz_newidle_balance start update
rcu_sched-9 [000] 147.082743: bprint: _nohz_idle_balance: _nohz_idle_balance cpu 0 idle 2 flag 2
rcu_sched-9 [000] 147.082748: bprint: _nohz_idle_balance: _nohz_idle_balance return 1
sshd-1881 [000] 147.277339: bputs: pick_next_task_fair: nohz_newidle_balance start update
sshd-1881 [000] 147.277343: bprint: _nohz_idle_balance: _nohz_idle_balance cpu 0 idle 2 flag 2
sshd-1881 [000] 147.277346: bprint: _nohz_idle_balance: _nohz_idle_balance return 1
rcu_sched-9 [000] 147.342741: bputs: pick_next_task_fair: nohz_newidle_balance start update
rcu_sched-9 [000] 147.342749: bprint: _nohz_idle_balance: _nohz_idle_balance cpu 0 idle 2 flag 2
rcu_sched-9 [000] 147.342754: bprint: _nohz_idle_balance: _nohz_idle_balance return 1
rcu_sched-9 [000] 147.382739: bputs: pick_next_task_fair: nohz_newidle_balance start update
rcu_sched-9 [000] 147.382741: bprint: _nohz_idle_balance: _nohz_idle_balance cpu 0 idle 2 flag 2
rcu_sched-9 [000] 147.382745: bprint: _nohz_idle_balance: _nohz_idle_balance return 1
sshd-1881 [000] 147.621105: bputs: pick_next_task_fair: nohz_newidle_balance start update
sshd-1881 [000] 147.621108: bprint: _nohz_idle_balance: _nohz_idle_balance cpu 0 idle 2 flag 2
sshd-1881 [000] 147.621110: bprint: _nohz_idle_balance: _nohz_idle_balance return 1
<idle>-0 [001] 147.662725: bprint: _nohz_idle_balance: _nohz_idle_balance cpu 1 idle 0 flag 2
<idle>-0 [001] 147.662730: bprint: _nohz_idle_balance: _nohz_idle_balance return 1
kworker/1:1-31 [001] 151.032830: bputs: pick_next_task_fair: nohz_newidle_balance start update
kworker/1:1-31 [001] 151.032833: bprint: _nohz_idle_balance: _nohz_idle_balance cpu 1 idle 2 flag 2
kworker/1:1-31 [001] 151.032835: bprint: _nohz_idle_balance: _nohz_idle_balance return 1
kworker/1:1-31 [001] 155.032777: bputs: pick_next_task_fair: nohz_newidle_balance start update
kworker/1:1-31 [001] 155.032780: bprint: _nohz_idle_balance: _nohz_idle_balance cpu 1 idle 2 flag 2
kworker/1:1-31 [001] 155.032781: bprint: _nohz_idle_balance: _nohz_idle_balance return 1
haveged-1823 [001] 155.515789: bputs: pick_next_task_fair: nohz_newidle_balance start update
haveged-1823 [001] 155.515792: bprint: _nohz_idle_balance: _nohz_idle_balance cpu 1 idle 2 flag 2
haveged-1823 [001] 155.515794: bprint: _nohz_idle_balance: _nohz_idle_balance return 1
kworker/1:1-31 [001] 157.032718: bputs: pick_next_task_fair: nohz_newidle_balance start update
kworker/1:1-31 [001] 157.032720: bprint: _nohz_idle_balance: _nohz_idle_balance cpu 1 idle 2 flag 2
kworker/1:1-31 [001] 157.032722: bprint: _nohz_idle_balance: _nohz_idle_balance return 1
kworker/1:1-31 [001] 159.032724: bputs: pick_next_task_fair: nohz_newidle_balance start update
kworker/1:1-31 [001] 159.032726: bprint: _nohz_idle_balance: _nohz_idle_balance cpu 1 idle 2 flag 2
kworker/1:1-31 [001] 159.032728: bprint: _nohz_idle_balance: _nohz_idle_balance return 1
kworker/1:1-31 [001] 161.032698: bputs: pick_next_task_fair: nohz_newidle_balance start update
kworker/1:1-31 [001] 161.032701: bprint: _nohz_idle_balance: _nohz_idle_balance cpu 1 idle 2 flag 2
kworker/1:1-31 [001] 161.032702: bprint: _nohz_idle_balance: _nohz_idle_balance return 1
kworker/1:1-31 [001] 165.032671: bputs: pick_next_task_fair: nohz_newidle_balance start update
kworker/1:1-31 [001] 165.032673: bprint: _nohz_idle_balance: _nohz_idle_balance cpu 1 idle 2 flag 2
kworker/1:1-31 [001] 165.032675: bprint: _nohz_idle_balance: _nohz_idle_balance return 1
kworker/1:1-31 [001] 167.032667: bputs: pick_next_task_fair: nohz_newidle_balance start update
kworker/1:1-31 [001] 167.032669: bprint: _nohz_idle_balance: _nohz_idle_balance cpu 1 idle 2 flag 2
kworker/1:1-31 [001] 167.032671: bprint: _nohz_idle_balance: _nohz_idle_balance return 1
kworker/1:1-31 [001] 169.032654: bputs: pick_next_task_fair: nohz_newidle_balance start update
kworker/1:1-31 [001] 169.032657: bprint: _nohz_idle_balance: _nohz_idle_balance cpu 1 idle 2 flag 2
kworker/1:1-31 [001] 169.032658: bprint: _nohz_idle_balance: _nohz_idle_balance return 1
rpcbind-1847 [001] 169.472688: bputs: pick_next_task_fair: nohz_newidle_balance start update
rpcbind-1847 [001] 169.472691: bprint: _nohz_idle_balance: _nohz_idle_balance cpu 1 idle 2 flag 2
rpcbind-1847 [001] 169.472693: bprint: _nohz_idle_balance: _nohz_idle_balance return 1
kworker/1:1-31 [001] 171.032660: bputs: pick_next_task_fair: nohz_newidle_balance start update
kworker/1:1-31 [001] 171.032662: bprint: _nohz_idle_balance: _nohz_idle_balance cpu 1 idle 2 flag 2
kworker/1:1-31 [001] 171.032664: bprint: _nohz_idle_balance: _nohz_idle_balance return 1
rpcbind-1847 [001] 199.482558: bputs: pick_next_task_fair: nohz_newidle_balance start update
rpcbind-1847 [001] 199.482561: bprint: _nohz_idle_balance: _nohz_idle_balance cpu 1 idle 2 flag 2
rpcbind-1847 [001] 199.482563: bprint: _nohz_idle_balance: _nohz_idle_balance return 1
<idle>-0 [000] 241.615152: function: dump_backtrace
--
Regards,
Niklas Söderlund
Hi Niklas,
On Monday 23 Apr 2018 at 00:18:27 (+0200), Niklas Söderlund wrote:
> Hi Vincent,
>
> On 2018-04-20 18:00:13 +0200, Vincent Guittot wrote:
>
> [snip]
>
> > >
> > > Your results are similar to Heiner's. The problem is still there
> > > even if we only kick ilb, which mainly sends an IPI reschedule to the
> > > other CPU if it is idle.
> > >
> >
> > Would it be possible to record some traces of the problem to get a better view
> > of what happens?
> >
> > I have a small patch that adds some traces in the functions that seem to create
> > the problem:
>
> Sure, I applied the patch below on top of c18bb396d3d261eb ("Merge
> git://git.kernel.org/pub/scm/linux/kernel/git/davem/net").
>
> >
> >
> >
> > Also, it would be good to stop tracing when the RCU stall happens.
> >
> > In case you are not familiar with the trace tool, I have put below how to configure ftrace to trace scheduler, irq, timer, etc. events and stop tracing when the dump_backtrace function is called:
> >
> > trace-cmd reset > /dev/null
> > trace-cmd start -b 40000 -p function -l dump_backtrace:traceoff -e sched -e cpu_idle -e cpu_frequency -e timer -e ipi -e irq -e printk
> > trace-cmd start -b 40000 -p function -l dump_backtrace
> >
> > The trace-cmd start has to be called twice to make sure that we do not trace all functions.
> >
> > Once the dump happens and the trace is stopped, you can extract the traces with:
> >
> > trace-cmd extract -o <trace file name>
>
> Thanks for the help with trace-cmd, I have attached the full
> trace-cmd.dat extracted from running this. Here is a quick run of
> 'trace-cmd report':
Thanks for the report. Can you re-run with the following trace-cmd sequence? My previous sequence disabled the ftrace events.
trace-cmd reset > /dev/null
trace-cmd start -b 40000 -p function -l dump_backtrace:traceoff -e sched -e cpu_idle -e cpu_frequency -e timer -e ipi -e irq -e printk
trace-cmd start -b 40000 -p function -l dump_backtrace -e sched -e cpu_idle -e cpu_frequency -e timer -e ipi -e irq -e printk
I have updated the patch and added traces to check that the scheduler returns from the idle_balance() function and doesn't stay stuck.
---
kernel/sched/fair.c | 13 +++++++++++++
1 file changed, 13 insertions(+)
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 0951d1c..4285511 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -9606,6 +9606,8 @@ static bool _nohz_idle_balance(struct rq *this_rq, unsigned int flags,
*/
WRITE_ONCE(nohz.has_blocked, 0);
+ trace_printk("_nohz_idle_balance cpu %d idle %d flag %x", this_cpu, idle, flags);
+
/*
* Ensures that if we miss the CPU, we must see the has_blocked
* store from nohz_balance_enter_idle().
@@ -9628,6 +9630,8 @@ static bool _nohz_idle_balance(struct rq *this_rq, unsigned int flags,
rq = cpu_rq(balance_cpu);
+ trace_printk("_nohz_idle_balance update cpu %d ", balance_cpu);
+
has_blocked_load |= update_nohz_stats(rq, true);
/*
@@ -9680,6 +9684,8 @@ static bool _nohz_idle_balance(struct rq *this_rq, unsigned int flags,
if (likely(update_next_balance))
nohz.next_balance = next_balance;
+ trace_printk("_nohz_idle_balance return %d", ret);
+
return ret;
}
@@ -9732,6 +9738,8 @@ static void nohz_newidle_balance(struct rq *this_rq)
time_before(jiffies, READ_ONCE(nohz.next_blocked)))
return;
+ trace_printk("nohz_newidle_balance start update");
+
raw_spin_unlock(&this_rq->lock);
/*
* This CPU is going to be idle and blocked load of idle CPUs
@@ -9742,6 +9750,9 @@ static void nohz_newidle_balance(struct rq *this_rq)
if (!_nohz_idle_balance(this_rq, NOHZ_STATS_KICK, CPU_NEWLY_IDLE))
kick_ilb(NOHZ_STATS_KICK);
raw_spin_lock(&this_rq->lock);
+
+ trace_printk("nohz_newidle_balance lock back");
+
}
#else /* !CONFIG_NO_HZ_COMMON */
@@ -9869,6 +9880,8 @@ static int idle_balance(struct rq *this_rq, struct rq_flags *rf)
rq_repin_lock(this_rq, rf);
+ trace_printk("idle_balance %d", pulled_task);
+
return pulled_task;
}
--
2.7.4
>
>
[snip]
> rcu_sched-9 [000] 147.342741: bputs: pick_next_task_fair: nohz_newidle_balance start update
> rcu_sched-9 [000] 147.342749: bprint: _nohz_idle_balance: _nohz_idle_balance cpu 0 idle 2 flag 2
> rcu_sched-9 [000] 147.342754: bprint: _nohz_idle_balance: _nohz_idle_balance return 1
> rcu_sched-9 [000] 147.382739: bputs: pick_next_task_fair: nohz_newidle_balance start update
> rcu_sched-9 [000] 147.382741: bprint: _nohz_idle_balance: _nohz_idle_balance cpu 0 idle 2 flag 2
> rcu_sched-9 [000] 147.382745: bprint: _nohz_idle_balance: _nohz_idle_balance return 1
> sshd-1881 [000] 147.621105: bputs: pick_next_task_fair: nohz_newidle_balance start update
> sshd-1881 [000] 147.621108: bprint: _nohz_idle_balance: _nohz_idle_balance cpu 0 idle 2 flag 2
> sshd-1881 [000] 147.621110: bprint: _nohz_idle_balance: _nohz_idle_balance return 1
This is the last call on CPU0 to the function added by the patch, and it returns completely.
But then nothing else is scheduled on CPU0.
> <idle>-0 [001] 147.662725: bprint: _nohz_idle_balance: _nohz_idle_balance cpu 1 idle 0 flag 2
> <idle>-0 [001] 147.662730: bprint: _nohz_idle_balance: _nohz_idle_balance return 1
> kworker/1:1-31 [001] 151.032830: bputs: pick_next_task_fair: nohz_newidle_balance start update
> kworker/1:1-31 [001] 151.032833: bprint: _nohz_idle_balance: _nohz_idle_balance cpu 1 idle 2 flag 2
> kworker/1:1-31 [001] 151.032835: bprint: _nohz_idle_balance: _nohz_idle_balance return 1
> kworker/1:1-31 [001] 155.032777: bputs: pick_next_task_fair: nohz_newidle_balance start update
> kworker/1:1-31 [001] 155.032780: bprint: _nohz_idle_balance: _nohz_idle_balance cpu 1 idle 2 flag 2
> kworker/1:1-31 [001] 155.032781: bprint: _nohz_idle_balance: _nohz_idle_balance return 1
> haveged-1823 [001] 155.515789: bputs: pick_next_task_fair: nohz_newidle_balance start update
> haveged-1823 [001] 155.515792: bprint: _nohz_idle_balance: _nohz_idle_balance cpu 1 idle 2 flag 2
> haveged-1823 [001] 155.515794: bprint: _nohz_idle_balance: _nohz_idle_balance return 1
> kworker/1:1-31 [001] 157.032718: bputs: pick_next_task_fair: nohz_newidle_balance start update
> kworker/1:1-31 [001] 157.032720: bprint: _nohz_idle_balance: _nohz_idle_balance cpu 1 idle 2 flag 2
> kworker/1:1-31 [001] 157.032722: bprint: _nohz_idle_balance: _nohz_idle_balance return 1
> kworker/1:1-31 [001] 159.032724: bputs: pick_next_task_fair: nohz_newidle_balance start update
> kworker/1:1-31 [001] 159.032726: bprint: _nohz_idle_balance: _nohz_idle_balance cpu 1 idle 2 flag 2
> kworker/1:1-31 [001] 159.032728: bprint: _nohz_idle_balance: _nohz_idle_balance return 1
> kworker/1:1-31 [001] 161.032698: bputs: pick_next_task_fair: nohz_newidle_balance start update
> kworker/1:1-31 [001] 161.032701: bprint: _nohz_idle_balance: _nohz_idle_balance cpu 1 idle 2 flag 2
> kworker/1:1-31 [001] 161.032702: bprint: _nohz_idle_balance: _nohz_idle_balance return 1
> kworker/1:1-31 [001] 165.032671: bputs: pick_next_task_fair: nohz_newidle_balance start update
> kworker/1:1-31 [001] 165.032673: bprint: _nohz_idle_balance: _nohz_idle_balance cpu 1 idle 2 flag 2
> kworker/1:1-31 [001] 165.032675: bprint: _nohz_idle_balance: _nohz_idle_balance return 1
> kworker/1:1-31 [001] 167.032667: bputs: pick_next_task_fair: nohz_newidle_balance start update
> kworker/1:1-31 [001] 167.032669: bprint: _nohz_idle_balance: _nohz_idle_balance cpu 1 idle 2 flag 2
> kworker/1:1-31 [001] 167.032671: bprint: _nohz_idle_balance: _nohz_idle_balance return 1
> kworker/1:1-31 [001] 169.032654: bputs: pick_next_task_fair: nohz_newidle_balance start update
> kworker/1:1-31 [001] 169.032657: bprint: _nohz_idle_balance: _nohz_idle_balance cpu 1 idle 2 flag 2
> kworker/1:1-31 [001] 169.032658: bprint: _nohz_idle_balance: _nohz_idle_balance return 1
> rpcbind-1847 [001] 169.472688: bputs: pick_next_task_fair: nohz_newidle_balance start update
> rpcbind-1847 [001] 169.472691: bprint: _nohz_idle_balance: _nohz_idle_balance cpu 1 idle 2 flag 2
> rpcbind-1847 [001] 169.472693: bprint: _nohz_idle_balance: _nohz_idle_balance return 1
> kworker/1:1-31 [001] 171.032660: bputs: pick_next_task_fair: nohz_newidle_balance start update
> kworker/1:1-31 [001] 171.032662: bprint: _nohz_idle_balance: _nohz_idle_balance cpu 1 idle 2 flag 2
> kworker/1:1-31 [001] 171.032664: bprint: _nohz_idle_balance: _nohz_idle_balance return 1
> rpcbind-1847 [001] 199.482558: bputs: pick_next_task_fair: nohz_newidle_balance start update
> rpcbind-1847 [001] 199.482561: bprint: _nohz_idle_balance: _nohz_idle_balance cpu 1 idle 2 flag 2
> rpcbind-1847 [001] 199.482563: bprint: _nohz_idle_balance: _nohz_idle_balance return 1
> <idle>-0 [000] 241.615152: function: dump_backtrace
>
> --
> Regards,
> Niklas S?derlund
Sorry for the inconvenience.
Vincent
Hi Niklas,
On Thursday 26 Apr 2018 at 00:56:03 (+0200), Niklas Söderlund wrote:
> Hi Vincent,
>
> Here are the result, sorry for the delay.
>
> On 2018-04-23 11:54:20 +0200, Vincent Guittot wrote:
>
> [snip]
>
> >
> > Thanks for the report. Can you re-run with the following trace-cmd sequence? My previous sequence disabled the ftrace events.
> >
> > trace-cmd reset > /dev/null
> > trace-cmd start -b 40000 -p function -l dump_backtrace:traceoff -e sched -e cpu_idle -e cpu_frequency -e timer -e ipi -e irq -e printk
> > trace-cmd start -b 40000 -p function -l dump_backtrace -e sched -e cpu_idle -e cpu_frequency -e timer -e ipi -e irq -e printk
> >
> > I have updated the patch and added traces to check that the scheduler returns from the idle_balance() function and doesn't stay stuck.
>
> Once more I applied the change below on top of c18bb396d3d261eb ("Merge
> git://git.kernel.org/pub/scm/linux/kernel/git/davem/net").
>
> This time the result of 'trace-cmd report' is so large I do not include
> it here, but I attach the trace.dat file. Not sure why, but the timing of
> sending the NMI for the backtrace print is different (the content is the
> same AFAIK), so on the off chance it can help figure this out:
>
Thanks for the trace, I have been able to catch a problem with it.
Could you test the patch below to confirm that the problem is solved?
The patch applies on top of
c18bb396d3d261eb ("Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net").
From: Vincent Guittot <[email protected]>
Date: Thu, 26 Apr 2018 12:19:32 +0200
Subject: [PATCH] sched/fair: fix the update of blocked load when newly idle
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
With commit 31e77c93e432 ("sched/fair: Update blocked load when newly idle"),
we release the rq->lock when updating the blocked load of idle CPUs. This opens
a time window during which another CPU can add a task to this CPU's cfs_rq.
The check for a newly added task in idle_balance() is not in the common path.
Move the out label so that this check is included.
Fixes: 31e77c93e432 ("sched/fair: Update blocked load when newly idle")
Reported-by: Heiner Kallweit <[email protected]>
Reported-by: Niklas Söderlund <[email protected]>
Signed-off-by: Vincent Guittot <[email protected]>
---
kernel/sched/fair.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 0951d1c..15a9f5e 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -9847,6 +9847,7 @@ static int idle_balance(struct rq *this_rq, struct rq_flags *rf)
if (curr_cost > this_rq->max_idle_balance_cost)
this_rq->max_idle_balance_cost = curr_cost;
+out:
/*
* While browsing the domains, we released the rq lock, a task could
* have been enqueued in the meantime. Since we're not going idle,
@@ -9855,7 +9856,6 @@ static int idle_balance(struct rq *this_rq, struct rq_flags *rf)
if (this_rq->cfs.h_nr_running && !pulled_task)
pulled_task = 1;
-out:
/* Move the next balance forward */
if (time_after(this_rq->next_balance, next_balance))
this_rq->next_balance = next_balance;
--
2.7.4
[snip]
On Thu, Apr 26, 2018 at 12:31:33PM +0200, Vincent Guittot wrote:
> From: Vincent Guittot <[email protected]>
> Date: Thu, 26 Apr 2018 12:19:32 +0200
> Subject: [PATCH] sched/fair: fix the update of blocked load when newly idle
> MIME-Version: 1.0
> Content-Type: text/plain; charset=UTF-8
> Content-Transfer-Encoding: 8bit
>
> With commit 31e77c93e432 ("sched/fair: Update blocked load when newly idle"),
> we release the rq->lock when updating the blocked load of idle CPUs. This opens
> a time window during which another CPU can add a task to this CPU's cfs_rq.
> The check for a newly added task in idle_balance() is not in the common path.
> Move the out label so that this check is included.
Ah quite so indeed. This could result in us running idle even though
there is a runnable task around -- which is bad.
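To spell the race out, here is a schematic interleaving of the pre-patch
idle_balance() early-exit path (an illustration, not literal kernel code):

/*
 * CPU0: idle_balance(), early exit           CPU1
 * ---------------------------------          -----------------------------
 * nohz_newidle_balance()
 *   raw_spin_unlock(&this_rq->lock)
 *   ... update blocked load ...              wakes a task and enqueues it
 *   raw_spin_lock(&this_rq->lock)            on CPU0's cfs_rq
 * goto out   -- jumps past the
 *               cfs.h_nr_running re-check
 * return 0   -- CPU0 goes idle with a
 *               runnable task queued
 *
 * With the out: label moved above the re-check, the newly enqueued task
 * is seen, pulled_task is set, and CPU0 does not go idle.
 */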
> Fixes: 31e77c93e432 ("sched/fair: Update blocked load when newly idle")
> Reported-by: Heiner Kallweit <[email protected]>
> Reported-by: Niklas Söderlund <[email protected]>
> Signed-off-by: Vincent Guittot <[email protected]>
> ---
> kernel/sched/fair.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 0951d1c..15a9f5e 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -9847,6 +9847,7 @@ static int idle_balance(struct rq *this_rq, struct rq_flags *rf)
> if (curr_cost > this_rq->max_idle_balance_cost)
> this_rq->max_idle_balance_cost = curr_cost;
>
> +out:
> /*
> * While browsing the domains, we released the rq lock, a task could
> * have been enqueued in the meantime. Since we're not going idle,
> @@ -9855,7 +9856,6 @@ static int idle_balance(struct rq *this_rq, struct rq_flags *rf)
> if (this_rq->cfs.h_nr_running && !pulled_task)
> pulled_task = 1;
>
> -out:
> /* Move the next balance forward */
> if (time_after(this_rq->next_balance, next_balance))
> this_rq->next_balance = next_balance;
Hi Vincent,
Thanks for all your help.
On 2018-04-26 12:31:33 +0200, Vincent Guittot wrote:
> Hi Niklas,
>
> On Thursday 26 Apr 2018 at 00:56:03 (+0200), Niklas Söderlund wrote:
> > Hi Vincent,
> >
> > Here are the result, sorry for the delay.
> >
> > On 2018-04-23 11:54:20 +0200, Vincent Guittot wrote:
> >
> > [snip]
> >
> > >
> > > Thanks for the report. Can you re-run with the following trace-cmd sequence? My previous sequence disabled the ftrace events.
> > >
> > > trace-cmd reset > /dev/null
> > > trace-cmd start -b 40000 -p function -l dump_backtrace:traceoff -e sched -e cpu_idle -e cpu_frequency -e timer -e ipi -e irq -e printk
> > > trace-cmd start -b 40000 -p function -l dump_backtrace -e sched -e cpu_idle -e cpu_frequency -e timer -e ipi -e irq -e printk
> > >
> > > I have updated the patch and added traces to check that the scheduler returns from the idle_balance() function and doesn't stay stuck.
> >
> > Once more I applied the change below on top of c18bb396d3d261eb ("Merge
> > git://git.kernel.org/pub/scm/linux/kernel/git/davem/net").
> >
> > This time the result of 'trace-cmd report' is so large I do not include
> > it here, but I attach the trace.dat file. Not sure why, but the timing of
> > sending the NMI for the backtrace print is different (the content is the
> > same AFAIK), so on the off chance it can help figure this out:
> >
>
> Thanks for the trace, I have been able to catch a problem with it.
> Could you test the patch below to confirm that the problem is solved?
> The patch applies on top of
> c18bb396d3d261eb ("Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net").
I can confirm that with the patch below I can no longer reproduce the
problem. Thanks!
>
> From: Vincent Guittot <[email protected]>
> Date: Thu, 26 Apr 2018 12:19:32 +0200
> Subject: [PATCH] sched/fair: fix the update of blocked load when newly idle
> MIME-Version: 1.0
> Content-Type: text/plain; charset=UTF-8
> Content-Transfer-Encoding: 8bit
>
> With commit 31e77c93e432 ("sched/fair: Update blocked load when newly idle"),
> we release the rq->lock when updating the blocked load of idle CPUs. This opens
> a time window during which another CPU can add a task to this CPU's cfs_rq.
> The check for a newly added task in idle_balance() is not in the common path.
> Move the out label so that this check is included.
>
> Fixes: 31e77c93e432 ("sched/fair: Update blocked load when newly idle")
> Reported-by: Heiner Kallweit <[email protected]>
> Reported-by: Niklas Söderlund <[email protected]>
> Signed-off-by: Vincent Guittot <[email protected]>
> ---
> kernel/sched/fair.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 0951d1c..15a9f5e 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -9847,6 +9847,7 @@ static int idle_balance(struct rq *this_rq, struct rq_flags *rf)
> if (curr_cost > this_rq->max_idle_balance_cost)
> this_rq->max_idle_balance_cost = curr_cost;
>
> +out:
> /*
> * While browsing the domains, we released the rq lock, a task could
> * have been enqueued in the meantime. Since we're not going idle,
> @@ -9855,7 +9856,6 @@ static int idle_balance(struct rq *this_rq, struct rq_flags *rf)
> if (this_rq->cfs.h_nr_running && !pulled_task)
> pulled_task = 1;
>
> -out:
> /* Move the next balance forward */
> if (time_after(this_rq->next_balance, next_balance))
> this_rq->next_balance = next_balance;
> --
> 2.7.4
>
>
>
> [snip]
>
--
Regards,
Niklas Söderlund
Hi Niklas,
>> Thanks for the trace; I have been able to catch a problem with it.
>> Could you test the patch below to confirm that the problem is solved?
>> The patch applies on top of
>> c18bb396d3d261eb ("Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net")
>
> I can confirm that with the patch below I can no longer reproduce the
> problem. Thanks!
Thanks for testing
Do you mind if I add
Tested-by: Niklas Söderlund <[email protected]>
Peter, Ingo,
Do you want me to re-send the patch with all tags, or will you take
this version?
Regards,
Vincent
>
>>
>> From: Vincent Guittot <[email protected]>
>> Date: Thu, 26 Apr 2018 12:19:32 +0200
>> Subject: [PATCH] sched/fair: fix the update of blocked load when newly idle
>> MIME-Version: 1.0
>> Content-Type: text/plain; charset=UTF-8
>> Content-Transfer-Encoding: 8bit
>>
>> With commit 31e77c93e432 ("sched/fair: Update blocked load when newly idle"),
>> we release the rq->lock when updating blocked load of idle CPUs. This opens
>> a time window during which another CPU can add a task to this CPU's cfs_rq.
>> The check for a newly added task in idle_balance() is not in the common path.
>> Move the out label to include this check.
>>
>> Fixes: 31e77c93e432 ("sched/fair: Update blocked load when newly idle")
>> Reported-by: Heiner Kallweit <[email protected]>
>> Reported-by: Niklas Söderlund <[email protected]>
>> Signed-off-by: Vincent Guittot <[email protected]>
>> ---
>> kernel/sched/fair.c | 2 +-
>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
>> index 0951d1c..15a9f5e 100644
>> --- a/kernel/sched/fair.c
>> +++ b/kernel/sched/fair.c
>> @@ -9847,6 +9847,7 @@ static int idle_balance(struct rq *this_rq, struct rq_flags *rf)
>> if (curr_cost > this_rq->max_idle_balance_cost)
>> this_rq->max_idle_balance_cost = curr_cost;
>>
>> +out:
>> /*
>> * While browsing the domains, we released the rq lock, a task could
>> * have been enqueued in the meantime. Since we're not going idle,
>> @@ -9855,7 +9856,6 @@ static int idle_balance(struct rq *this_rq, struct rq_flags *rf)
>> if (this_rq->cfs.h_nr_running && !pulled_task)
>> pulled_task = 1;
>>
>> -out:
>> /* Move the next balance forward */
>> if (time_after(this_rq->next_balance, next_balance))
>> this_rq->next_balance = next_balance;
>> --
>> 2.7.4
>>
>>
>>
>> [snip]
>>
>
> --
> Regards,
> Niklas Söderlund
Hi Vincent,
On 2018-04-26 17:27:24 +0200, Vincent Guittot wrote:
> Hi Niklas,
>
> >> Thanks for the trace; I have been able to catch a problem with it.
> >> Could you test the patch below to confirm that the problem is solved?
> >> The patch applies on top of
> >> c18bb396d3d261eb ("Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net")
> >
> > I can confirm that with the patch below I can no longer reproduce the
> > problem. Thanks!
>
> Thanks for testing
> Do you mind if I add
> Tested-by: Niklas Söderlund <[email protected]>
Please do.
>
> Peter, Ingo,
> Do you want me to re-send the patch with all tags, or will you take
> this version?
>
> Regards,
> Vincent
>
> >
> >>
> >> From: Vincent Guittot <[email protected]>
> >> Date: Thu, 26 Apr 2018 12:19:32 +0200
> >> Subject: [PATCH] sched/fair: fix the update of blocked load when newly idle
> >> MIME-Version: 1.0
> >> Content-Type: text/plain; charset=UTF-8
> >> Content-Transfer-Encoding: 8bit
> >>
> >> With commit 31e77c93e432 ("sched/fair: Update blocked load when newly idle"),
> >> we release the rq->lock when updating blocked load of idle CPUs. This opens
> >> a time window during which another CPU can add a task to this CPU's cfs_rq.
> >> The check for a newly added task in idle_balance() is not in the common path.
> >> Move the out label to include this check.
> >>
> >> Fixes: 31e77c93e432 ("sched/fair: Update blocked load when newly idle")
> >> Reported-by: Heiner Kallweit <[email protected]>
> >> Reported-by: Niklas Söderlund <[email protected]>
> >> Signed-off-by: Vincent Guittot <[email protected]>
> >> ---
> >> kernel/sched/fair.c | 2 +-
> >> 1 file changed, 1 insertion(+), 1 deletion(-)
> >>
> >> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> >> index 0951d1c..15a9f5e 100644
> >> --- a/kernel/sched/fair.c
> >> +++ b/kernel/sched/fair.c
> >> @@ -9847,6 +9847,7 @@ static int idle_balance(struct rq *this_rq, struct rq_flags *rf)
> >> if (curr_cost > this_rq->max_idle_balance_cost)
> >> this_rq->max_idle_balance_cost = curr_cost;
> >>
> >> +out:
> >> /*
> >> * While browsing the domains, we released the rq lock, a task could
> >> * have been enqueued in the meantime. Since we're not going idle,
> >> @@ -9855,7 +9856,6 @@ static int idle_balance(struct rq *this_rq, struct rq_flags *rf)
> >> if (this_rq->cfs.h_nr_running && !pulled_task)
> >> pulled_task = 1;
> >>
> >> -out:
> >> /* Move the next balance forward */
> >> if (time_after(this_rq->next_balance, next_balance))
> >> this_rq->next_balance = next_balance;
> >> --
> >> 2.7.4
> >>
> >>
> >>
> >> [snip]
> >>
> >
> > --
> > Regards,
> > Niklas Söderlund
--
Regards,
Niklas Söderlund
Hi Vincent,
On Thu, Apr 26, 2018 at 12:31 PM, Vincent Guittot
<[email protected]> wrote:
> On Thursday 26 Apr 2018 at 00:56:03 (+0200), Niklas Söderlund wrote:
>> Here are the result, sorry for the delay.
>>
>> On 2018-04-23 11:54:20 +0200, Vincent Guittot wrote:
>>
>> [snip]
>>
>> >
>> > Thanks for the report. Can you re-run with the following trace-cmd sequence? My previous sequence disables ftrace events
>> >
>> > trace-cmd reset > /dev/null
>> > trace-cmd start -b 40000 -p function -l dump_backtrace:traceoff -e sched -e cpu_idle -e cpu_frequency -e timer -e ipi -e irq -e printk
>> > trace-cmd start -b 40000 -p function -l dump_backtrace -e sched -e cpu_idle -e cpu_frequency -e timer -e ipi -e irq -e printk
>> >
>> > I have updated the patch and added traces to check that the scheduler returns from the idle_balance() function and doesn't stay stuck
>>
>> Once more I applied the change below on top of c18bb396d3d261eb ("Merge
>> git://git.kernel.org/pub/scm/linux/kernel/git/davem/net").
>>
>> This time the result of 'trace-cmd report' is so large I do not include
>> it here, but I attach the trace.dat file. Not sure why, but the timing of
>> sending the NMI to the backtrace print is different (the content is the
>> same AFAIK), so on the off chance it helps figure this out:
>>
>
> Thanks for the trace; I have been able to catch a problem with it.
> Could you test the patch below to confirm that the problem is solved?
> The patch applies on top of
> c18bb396d3d261eb ("Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net")
>
> From: Vincent Guittot <[email protected]>
> Date: Thu, 26 Apr 2018 12:19:32 +0200
> Subject: [PATCH] sched/fair: fix the update of blocked load when newly idle
> MIME-Version: 1.0
> Content-Type: text/plain; charset=UTF-8
> Content-Transfer-Encoding: 8bit
>
> With commit 31e77c93e432 ("sched/fair: Update blocked load when newly idle"),
> we release the rq->lock when updating blocked load of idle CPUs. This opens
> a time window during which another CPU can add a task to this CPU's cfs_rq.
> The check for a newly added task in idle_balance() is not in the common path.
> Move the out label to include this check.
>
> Fixes: 31e77c93e432 ("sched/fair: Update blocked load when newly idle")
> Reported-by: Heiner Kallweit <[email protected]>
> Reported-by: Niklas Söderlund <[email protected]>
> Signed-off-by: Vincent Guittot <[email protected]>
Thanks!
Tested-by: Geert Uytterhoeven <[email protected]>
Gr{oetje,eeting}s,
Geert
--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- [email protected]
In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds
Commit-ID: 457be908c83637ee10bda085a23dc05afa3b14a0
Gitweb: https://git.kernel.org/tip/457be908c83637ee10bda085a23dc05afa3b14a0
Author: Vincent Guittot <[email protected]>
AuthorDate: Thu, 26 Apr 2018 12:19:32 +0200
Committer: Ingo Molnar <[email protected]>
CommitDate: Thu, 3 May 2018 07:38:03 +0200
sched/fair: Fix the update of blocked load when newly idle
With commit:
31e77c93e432 ("sched/fair: Update blocked load when newly idle")
... we release the rq->lock when updating blocked load of idle CPUs.
This opens a time window during which another CPU can add a task to this
CPU's cfs_rq.
The check for a newly added task in idle_balance() is not in the common path.
Move the out label to include this check.
Reported-by: Heiner Kallweit <[email protected]>
Tested-by: Geert Uytterhoeven <[email protected]>
Signed-off-by: Vincent Guittot <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Paul E. McKenney <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Fixes: 31e77c93e432 ("sched/fair: Update blocked load when newly idle")
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
---
kernel/sched/fair.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 54dc31e7ab9b..e3002e5ada31 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -9847,6 +9847,7 @@ static int idle_balance(struct rq *this_rq, struct rq_flags *rf)
 	if (curr_cost > this_rq->max_idle_balance_cost)
 		this_rq->max_idle_balance_cost = curr_cost;
 
+out:
 	/*
 	 * While browsing the domains, we released the rq lock, a task could
 	 * have been enqueued in the meantime. Since we're not going idle,
@@ -9855,7 +9856,6 @@ static int idle_balance(struct rq *this_rq, struct rq_flags *rf)
 	if (this_rq->cfs.h_nr_running && !pulled_task)
 		pulled_task = 1;
 
-out:
	/* Move the next balance forward */
	if (time_after(this_rq->next_balance, next_balance))
		this_rq->next_balance = next_balance;
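
[Editor's note: to make the race concrete, below is a minimal standalone C
model of the control flow the patch fixes. It is a sketch, not kernel code:
idle_balance_model(), abort_early and h_nr_running are illustrative stand-ins
for the real idle_balance(), its early-exit paths and rq->cfs.h_nr_running.
Before the fix, a path that aborted balancing after dropping the rq lock
jumped past the re-check, so a task enqueued by another CPU in that window
was missed and the CPU went idle anyway; with the label moved, every exit
path performs the re-check.]

/*
 * Minimal standalone model of the fix (editor's sketch, not kernel code).
 * Early-exit paths jump to "out:". With the label placed before the
 * re-check, as in the patch, even aborted balance attempts notice a task
 * that another CPU enqueued while the rq lock was dropped.
 */
#include <stdio.h>
#include <stdbool.h>

static int idle_balance_model(bool abort_early, int h_nr_running)
{
	int pulled_task = 0;

	if (abort_early)
		goto out;	/* models e.g. the blocked-load update path */

	/* ... balancing across the sched domains would run here ... */

out:
	/*
	 * A task could have been enqueued while the rq lock was released.
	 * Since we're not going idle, pretend we pulled a task.
	 */
	if (h_nr_running && !pulled_task)
		pulled_task = 1;

	return pulled_task;
}

int main(void)
{
	/*
	 * Before the fix, the early-exit path skipped the check above and
	 * returned 0, leaving a runnable task stranded on an idle CPU.
	 */
	printf("abort path, task queued -> pulled_task = %d\n",
	       idle_balance_model(true, 1));
	return 0;
}

Compiled with any C99 compiler, the model prints pulled_task = 1 on the abort
path, mirroring how the fixed idle_balance() reports the runnable task instead
of letting the CPU go idle with work queued.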