2015-06-29 17:38:19

by Mark Rutland

Subject: PREEMPT RCU stalls with 9edfbfed3f544a78 ("sched/core: Rework rq->clock update skips")

Hi,

Recently (since ~v4.0) I've been seeing new RCU stall warnings when
running hackbench (and cpu-affine cyclictest instances) on
CONFIG_PREEMPT=y kernels, both arm64 and x86_64:

arm64
-----
INFO: rcu_preempt detected stalls on CPUs/tasks:
Tasks blocked on level-0 rcu_node (CPUs 0-5):
(detected by 1, t=2102 jiffies, g=957, c=956, q=23)
All QSes seen, last rcu_preempt kthread activity 2 (4294981808-4294981806), jiffies_till_next_fqs=1, root ->qsmask 0x0
hackbench R running task 0 10388 9559 0x00000000
Call trace:
[<ffffffc000089914>] dump_backtrace+0x0/0x124
[<ffffffc000089a48>] show_stack+0x10/0x1c
[<ffffffc0000d65a0>] sched_show_task+0xa4/0x104
[<ffffffc0000fc2c8>] rcu_check_callbacks+0x998/0x9a0
[<ffffffc0000ff4dc>] update_process_times+0x38/0x6c
[<ffffffc00010e288>] tick_sched_handle.isra.16+0x1c/0x68
[<ffffffc00010e314>] tick_sched_timer+0x40/0x88
[<ffffffc0000fff64>] __run_hrtimer.isra.34+0x48/0x108
[<ffffffc000100254>] hrtimer_interrupt+0xc4/0x248
[<ffffffc0004bb9cc>] arch_timer_handler_phys+0x28/0x38
[<ffffffc0000f381c>] handle_percpu_devid_irq+0x74/0x9c
[<ffffffc0000ef73c>] generic_handle_irq+0x30/0x4c
[<ffffffc0000efa54>] __handle_domain_irq+0x5c/0xac
[<ffffffc00008241c>] gic_handle_irq+0x30/0x80
Exception stack(0xffffffc073bc3a80 to 0xffffffc073bc3ba0)
3a80: 00000064 00000000 00000064 00000000 73bc3bc0 ffffffc0 00331ffc ffffffc0
3aa0: 00000064 00000000 00000064 00000000 ffffffff ffffffff 160c3d20 00000000
3ac0: e9d018ac 0000007f 00000064 00000000 5a046200 ffffffc9 000000c0 00000000
3ae0: 00000000 00000000 00000000 00000000 ffffffff 00000000 00000018 00000000
3b00: 00000002 00000000 00000000 00000000 948d9a5c 0000007f 94a1e590 0000007f
3b20: 0019b7e0 ffffffc0 94a316d0 0000007f e9d01530 0000007f 00000064 00000000
3b40: 00000064 00000000 00000064 00000000 00000000 00000000 00000064 00000000
3b60: 73bc3d78 ffffffc0 757be080 ffffffc9 6241b980 ffffffc9 00000001 00000000
3b80: 55da5500 ffffffc9 73bc3bc0 ffffffc0 00331f9c ffffffc0 73bc3bc0 ffffffc0
[<ffffffc0000855a4>] el1_irq+0x64/0xd8
[<ffffffc0004f1d6c>] skb_copy_datagram_from_iter+0x5c/0x1f0
[<ffffffc000579e68>] unix_stream_sendmsg+0xf0/0x334
[<ffffffc0004e1d68>] sock_sendmsg+0x14/0x58
[<ffffffc0004e1e20>] sock_write_iter+0x74/0xd4
[<ffffffc00019a494>] __vfs_write+0xac/0x10c
[<ffffffc00019ad38>] vfs_write+0x8c/0x194
[<ffffffc00019b820>] SyS_write+0x40/0xa0

x86_64
------
[ 2616.307011] INFO: rcu_preempt detected stalls on CPUs/tasks:
[ 2616.311868] Tasks blocked on level-0 rcu_node (CPUs 0-3):
[ 2616.311868] (detected by 1, t=21014 jiffies, g=10009, c=10008, q=274)
[ 2616.322320] All QSes seen, last rcu_preempt kthread activity 3 (4297283620-4297283617), jiffies_till_next_fqs=3, root ->qsmask 0x0
[ 2616.322320] hackbench R running task 13984 30174 29054 0x00000008
[ 2616.322320] 00000000000036a0 ffff8801bec83c98 ffffffff8107b751 0000000000000003
[ 2616.322320] ffffffff81e3b8c0 ffffffff81f01280 ffffffff81e3b8c0 ffff8801bec83d28
[ 2616.352044] ffffffff810a90f7 0000000000000000 ffffffff8107faac 000000000000f6a0
[ 2616.352044] Call Trace:
[ 2616.352044] <IRQ> [<ffffffff8107b751>] sched_show_task+0xe1/0x150
[ 2616.352044] [<ffffffff810a90f7>] rcu_check_callbacks+0x8f7/0x900
[ 2616.352044] [<ffffffff8107faac>] ? account_system_time+0x9c/0x190
[ 2616.352044] [<ffffffff810ae244>] update_process_times+0x34/0x60
[ 2616.352044] [<ffffffff810bd471>] tick_sched_handle.isra.15+0x31/0x40
[ 2616.352044] [<ffffffff810bd4bf>] tick_sched_timer+0x3f/0x70
[ 2616.352044] [<ffffffff810aefaf>] __run_hrtimer+0x7f/0x230
[ 2616.352044] [<ffffffff810bd480>] ? tick_sched_handle.isra.15+0x40/0x40
[ 2616.416043] [<ffffffff810af3b3>] hrtimer_interrupt+0xf3/0x220
[ 2616.416043] [<ffffffff810439b1>] hpet_interrupt_handler+0x11/0x30
[ 2616.416043] [<ffffffff8109e437>] handle_irq_event_percpu+0x47/0x1d0
[ 2616.428774] [<ffffffff8109e5fc>] handle_irq_event+0x3c/0x60
[ 2616.428774] [<ffffffff810a13cf>] handle_edge_irq+0x8f/0x130
[ 2616.428774] [<ffffffff81004f17>] handle_irq+0xf7/0x180
[ 2616.450371] [<ffffffff8107a231>] ? get_parent_ip+0x11/0x50
[ 2616.450371] [<ffffffff810046d8>] do_IRQ+0x58/0x100
[ 2616.450371] [<ffffffff8190caeb>] common_interrupt+0x6b/0x6b
[ 2616.467277] <EOI> [<ffffffff8170aa47>] ? sock_def_readable+0x67/0x70
[ 2616.467277] [<ffffffff812bfb05>] ? avc_has_perm+0x105/0x1b0
[ 2616.467277] [<ffffffff812bfad7>] ? avc_has_perm+0xd7/0x1b0
[ 2616.482406] [<ffffffff812c2092>] inode_has_perm.isra.29+0x22/0x30
[ 2616.482406] [<ffffffff812c21a7>] file_has_perm+0x87/0xa0
[ 2616.493232] [<ffffffff811af786>] ? fsnotify+0x2d6/0x400
[ 2616.493232] [<ffffffff812c48c5>] selinux_file_permission+0xa5/0x120
[ 2616.493232] [<ffffffff812bde1e>] security_file_permission+0x1e/0xa0
[ 2616.493232] [<ffffffff81170fd1>] rw_verify_area+0x51/0xd0
[ 2616.493232] [<ffffffff811711ff>] vfs_write+0x6f/0x1b0
[ 2616.493232] [<ffffffff81171fd1>] SyS_write+0x41/0xb0
[ 2616.493232] [<ffffffff81186220>] ? SyS_poll+0x60/0xf0
[ 2616.493232] [<ffffffff8190c017>] system_call_fastpath+0x12/0x6a

I've bisected to 9edfbfed3f544a78 ("sched/core: Rework rq->clock update
skips"), and with that reverted I no longer see stalls. For v4.1 I also
had to revert 44fb085bfa17628c ("sched/deadline: Add rq->clock update
skip for dl task yield"), as it calls a function introduced in
9edfbfed3f544a78.
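For the record, the bisection described above can be sketched as a small
helper. The check script is hypothetical (for a bug like this each step
really means build + boot + run the reproducer, so in practice the good/bad
verdict is often given by hand rather than via "git bisect run"):

```shell
# Sketch of the bisection mechanics only. "$script" must exit 0 when the
# candidate kernel is good and non-zero when the stall reproduces; the
# script name and the automated "bisect run" step are assumptions, not
# part of the original report.
bisect_range() {
    repo="$1" good="$2" bad="$3" script="$4"
    git -C "$repo" bisect start "$bad" "$good" || return 1
    git -C "$repo" bisect run "$script"
    status=$?
    git -C "$repo" bisect reset >/dev/null 2>&1
    return "$status"
}
# e.g. bisect_range ~/linux v3.19 v4.0-rc1 "$PWD/stall_check.sh"
```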

Any ideas as to what could be going on?

To reproduce the issue I'm running the following in parallel on
dual-core machines:

$ while true; do ./hackbench 100 process 1000; sleep 30; done
# ./cyclictest -a0 -n -M -m
# ./cyclictest -a1 -n -M -m

Usually it takes a couple of minutes to trigger, but I've seen it take
up to half an hour. With the reverts I haven't seen stalls after testing
for an hour or so.
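The three commands above can be wrapped in a small launcher. The hackbench
and cyclictest invocations are exactly those from the report; the run_loop
helper itself is just a hypothetical convenience, parameterized so the loop
logic is explicit:

```shell
# Convenience sketch of the reproducer above. run_loop repeats a command
# with a pause in between -- forever by default, or for a fixed number of
# iterations if a third argument is given. The helper name and the
# iteration-count argument are assumptions added for illustration.
run_loop() {
    cmd="$1" pause="$2" count="${3:-0}"
    i=0
    while [ "$count" -eq 0 ] || [ "$i" -lt "$count" ]; do
        $cmd || return 1
        sleep "$pause"
        i=$((i + 1))
    done
}
# Intended use (cyclictest needs root, hence the '#' prompts above):
#   run_loop "./hackbench 100 process 1000" 30 &
#   ./cyclictest -a0 -n -M -m &
#   ./cyclictest -a1 -n -M -m &
```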

Thanks,
Mark.


2015-06-29 18:11:49

by Paul E. McKenney

Subject: Re: PREEMPT RCU stalls with 9edfbfed3f544a78 ("sched/core: Rework rq->clock update skips")

On Mon, Jun 29, 2015 at 06:37:52PM +0100, Mark Rutland wrote:
> Hi,
>
> Recently (since ~v4.0) I've been seeing new RCU stall warnings when
> running hackbench (and cpu-affine cyclictest instances) on
> CONFIG_PREEMPT=y kernels, both arm64 and x86_64:
>
> arm64
> -----
> INFO: rcu_preempt detected stalls on CPUs/tasks:
> Tasks blocked on level-0 rcu_node (CPUs 0-5):

So you appear to be blocked on a task that was preempted within its RCU
read-side critical section. There is a bug that causes the task not
to be printed, which is fixed by 82efed06d5e3 ("rcu: Fix missing task
information during rcu-preempt stall").

Not that this will fix the stall itself, but it will at least allow you
to see the task that is causing the stall.

> [...]
>
> x86_64
> ------
> [ 2616.307011] INFO: rcu_preempt detected stalls on CPUs/tasks:
> [ 2616.311868] Tasks blocked on level-0 rcu_node (CPUs 0-3):

Same here, same diagnostic fix.

> [...]
>
> I've bisected to 9edfbfed3f544a78 ("sched/core: Rework rq->clock update
> skips"), and with that reverted I no longer see stalls. For v4.1 I also
> had to revert 44fb085bfa17628c ("sched/deadline: Add rq->clock update
> skip for dl task yield"), as it calls a function introduced in
> 9edfbfed3f544a78.
>
> Any ideas as to what could be going on?

I don't see why those commits would prevent a preempted task from running,
but then I don't claim to fully understand these changes.

> To reproduce the issue I'm running the following in parallel on
> dual-core machines:
>
> $ while true; do ./hackbench 100 process 1000; sleep 30; done
> # ./cyclictest -a0 -n -M -m
> # ./cyclictest -a1 -n -M -m
>
> Usually it takes a couple of minutes to trigger, but I've seen it take
> up to half an hour. With the reverts I haven't seen stalls after testing
> for an hour or so.

I would want to see at least a 5-hour run before being all that sure,
but an hour is at least promising.

Thanx, Paul

2015-06-30 10:59:37

by Mike Galbraith

Subject: Re: PREEMPT RCU stalls with 9edfbfed3f544a78 ("sched/core: Rework rq->clock update skips")

On Mon, 2015-06-29 at 18:37 +0100, Mark Rutland wrote:
> Hi,
>
> Recently (since ~v4.0) I've been seeing new RCU stall warnings when
> running hackbench (and cpu-affine cyclictest instances) on
> CONFIG_PREEMPT=y kernels, both arm64 and x86_64:

FWIW, I'm unable to reproduce this on my Haswell box with a preempt config.

homer: # grep RCU config.preempt
# RCU Subsystem
CONFIG_PREEMPT_RCU=y
CONFIG_RCU_EXPERT=y
CONFIG_SRCU=y
# CONFIG_TASKS_RCU is not set
CONFIG_RCU_STALL_COMMON=y
CONFIG_RCU_USER_QS=y
CONFIG_RCU_FANOUT=64
CONFIG_RCU_FANOUT_LEAF=16
# CONFIG_RCU_FAST_NO_HZ is not set
# CONFIG_TREE_RCU_TRACE is not set
# CONFIG_RCU_BOOST is not set
CONFIG_RCU_KTHREAD_PRIO=0
CONFIG_RCU_NOCB_CPU=y
CONFIG_RCU_NOCB_CPU_NONE=y
# CONFIG_RCU_NOCB_CPU_ZERO is not set
# CONFIG_RCU_NOCB_CPU_ALL is not set
# CONFIG_RCU_EXPEDITE_BOOT is not set
# RCU Debugging
# CONFIG_PROVE_RCU is not set
# CONFIG_SPARSE_RCU_POINTER is not set
# CONFIG_RCU_TORTURE_TEST is not set
CONFIG_RCU_CPU_STALL_TIMEOUT=60
# CONFIG_RCU_CPU_STALL_INFO is not set
# CONFIG_RCU_TRACE is not set
# CONFIG_RCU_EQS_DEBUG is not set