Today in "rcu stall while fuzzing" news:
INFO: rcu_preempt detected stalls on CPUs/tasks:
Tasks blocked on level-0 rcu_node (CPUs 0-3): P766 P646
Tasks blocked on level-0 rcu_node (CPUs 0-3): P766 P646
(detected by 0, t=6502 jiffies, g=75434, c=75433, q=0)
trinity-c342 R running task 13384 766 32295 0x00000000
ffff880068943d58 0000000000000002 0000000000000002 ffff880193c8c680
00000000001d4100 0000000000000000 ffff880068943fd8 00000000001d4100
ffff88024302c680 ffff880193c8c680 ffff880068943fd8 0000000000000000
Call Trace:
[<ffffffff888368e2>] preempt_schedule_irq+0x52/0xb0
[<ffffffff8883df10>] retint_kernel+0x20/0x30
[<ffffffff880d9424>] ? lock_acquire+0xd4/0x2b0
[<ffffffff8808d495>] ? kill_pid_info+0x5/0x130
[<ffffffff8808d4d5>] kill_pid_info+0x45/0x130
[<ffffffff8808d495>] ? kill_pid_info+0x5/0x130
[<ffffffff8808d6d2>] SYSC_kill+0xf2/0x2f0
[<ffffffff8808d67b>] ? SYSC_kill+0x9b/0x2f0
[<ffffffff8819c2b7>] ? context_tracking_user_exit+0x57/0x280
[<ffffffff880136bd>] ? syscall_trace_enter+0x13d/0x310
[<ffffffff8808fd9e>] SyS_kill+0xe/0x10
[<ffffffff8883d3a4>] tracesys+0xdd/0xe2
trinity-c225 R running task 13448 646 32295 0x00000000
ffff880161ccfb28 0000000000000002 ffff880161ccfe10 ffff88000bf85e00
00000000001d4100 0000000000000003 ffff880161ccffd8 00000000001d4100
ffff880030124680 ffff88000bf85e00 ffff880161ccffd8 0000000000000000
Call Trace:
[<ffffffff888368e2>] preempt_schedule_irq+0x52/0xb0
[<ffffffff8883df10>] retint_kernel+0x20/0x30
[<ffffffff88233d41>] ? __d_lookup_rcu+0xd1/0x1e0
[<ffffffff88233dd6>] ? __d_lookup_rcu+0x166/0x1e0
[<ffffffff88222f9f>] lookup_fast+0x4f/0x3d0
[<ffffffff88224857>] link_path_walk+0x1a7/0x8a0
[<ffffffff88224f95>] ? path_lookupat+0x45/0x7b0
[<ffffffff88224fb7>] path_lookupat+0x67/0x7b0
[<ffffffff880d385d>] ? trace_hardirqs_off+0xd/0x10
[<ffffffff8883dda4>] ? retint_restore_args+0xe/0xe
[<ffffffff8822572b>] filename_lookup+0x2b/0xc0
[<ffffffff88229c77>] user_path_at_empty+0x67/0xc0
[<ffffffff880d3dbe>] ? put_lock_stats.isra.27+0xe/0x30
[<ffffffff880d42a6>] ? lock_release_holdtime.part.28+0xe6/0x160
[<ffffffff880b15ad>] ? get_parent_ip+0xd/0x50
[<ffffffff88229ce1>] user_path_at+0x11/0x20
[<ffffffff8824fac1>] do_utimes+0xd1/0x180
[<ffffffff8824fbef>] SyS_utime+0x7f/0xc0
[<ffffffff8883d345>] ? tracesys+0x7e/0xe2
[<ffffffff8883d3a4>] tracesys+0xdd/0xe2
trinity-c342 R running task 13384 766 32295 0x00000000
ffff880068943d58 0000000000000002 0000000000000002 ffff880193c8c680
00000000001d4100 0000000000000000 ffff880068943fd8 00000000001d4100
ffff88024302c680 ffff880193c8c680 ffff880068943fd8 0000000000000000
Call Trace:
[<ffffffff888368e2>] preempt_schedule_irq+0x52/0xb0
[<ffffffff8883df10>] retint_kernel+0x20/0x30
[<ffffffff880d9424>] ? lock_acquire+0xd4/0x2b0
[<ffffffff8808d495>] ? kill_pid_info+0x5/0x130
[<ffffffff8808d4d5>] kill_pid_info+0x45/0x130
[<ffffffff8808d495>] ? kill_pid_info+0x5/0x130
[<ffffffff8808d6d2>] SYSC_kill+0xf2/0x2f0
[<ffffffff8808d67b>] ? SYSC_kill+0x9b/0x2f0
[<ffffffff8819c2b7>] ? context_tracking_user_exit+0x57/0x280
[<ffffffff880136bd>] ? syscall_trace_enter+0x13d/0x310
[<ffffffff8808fd9e>] SyS_kill+0xe/0x10
[<ffffffff8883d3a4>] tracesys+0xdd/0xe2
trinity-c225 R running task 13448 646 32295 0x00000000
ffff880161ccfb28 0000000000000002 ffff880161ccfe10 ffff88000bf85e00
00000000001d4100 0000000000000003 ffff880161ccffd8 00000000001d4100
ffff880030124680 ffff88000bf85e00 ffff880161ccffd8 0000000000000000
Call Trace:
[<ffffffff888368e2>] preempt_schedule_irq+0x52/0xb0
[<ffffffff8883df10>] retint_kernel+0x20/0x30
[<ffffffff88233d41>] ? __d_lookup_rcu+0xd1/0x1e0
[<ffffffff88233dd6>] ? __d_lookup_rcu+0x166/0x1e0
[<ffffffff88222f9f>] lookup_fast+0x4f/0x3d0
[<ffffffff88224857>] link_path_walk+0x1a7/0x8a0
[<ffffffff88224f95>] ? path_lookupat+0x45/0x7b0
[<ffffffff88224fb7>] path_lookupat+0x67/0x7b0
[<ffffffff880d385d>] ? trace_hardirqs_off+0xd/0x10
[<ffffffff8883dda4>] ? retint_restore_args+0xe/0xe
[<ffffffff8822572b>] filename_lookup+0x2b/0xc0
[<ffffffff88229c77>] user_path_at_empty+0x67/0xc0
[<ffffffff880d3dbe>] ? put_lock_stats.isra.27+0xe/0x30
[<ffffffff880d42a6>] ? lock_release_holdtime.part.28+0xe6/0x160
[<ffffffff880b15ad>] ? get_parent_ip+0xd/0x50
[<ffffffff88229ce1>] user_path_at+0x11/0x20
[<ffffffff8824fac1>] do_utimes+0xd1/0x180
[<ffffffff8824fbef>] SyS_utime+0x7f/0xc0
[<ffffffff8883d345>] ? tracesys+0x7e/0xe2
[<ffffffff8883d3a4>] tracesys+0xdd/0xe2
INFO: rcu_preempt detected stalls on CPUs/tasks:
Tasks blocked on level-0 rcu_node (CPUs 0-3): P766 P646
Tasks blocked on level-0 rcu_node (CPUs 0-3): P766 P646
(detected by 3, t=26007 jiffies, g=75434, c=75433, q=0)
trinity-c342 R running task 13384 766 32295 0x00000000
ffff880068943d98 0000000000000002 0000000000000000 ffff880193c8c680
00000000001d4100 0000000000000000 ffff880068943fd8 00000000001d4100
ffff88000188af00 ffff880193c8c680 ffff880068943fd8 0000000000000000
Call Trace:
[<ffffffff888368e2>] preempt_schedule_irq+0x52/0xb0
[<ffffffff8883df10>] retint_kernel+0x20/0x30
[<ffffffff8809f767>] ? pid_task+0x47/0xa0
[<ffffffff8809f73d>] ? pid_task+0x1d/0xa0
[<ffffffff8808d4f1>] kill_pid_info+0x61/0x130
[<ffffffff8808d495>] ? kill_pid_info+0x5/0x130
[<ffffffff8808d6d2>] SYSC_kill+0xf2/0x2f0
[<ffffffff8808d67b>] ? SYSC_kill+0x9b/0x2f0
[<ffffffff8819c2b7>] ? context_tracking_user_exit+0x57/0x280
[<ffffffff880136bd>] ? syscall_trace_enter+0x13d/0x310
[<ffffffff8808fd9e>] SyS_kill+0xe/0x10
[<ffffffff8883d3a4>] tracesys+0xdd/0xe2
trinity-c225 R running task 13448 646 32295 0x00000000
ffff880161ccfb78 0000000000000002 ffffffff88c993ed ffff88000bf85e00
00000000001d4100 0000000000000003 ffff880161ccffd8 00000000001d4100
ffff88005ea89780 ffff88000bf85e00 ffff880161ccffd8 0000000000000000
Call Trace:
[<ffffffff888368e2>] preempt_schedule_irq+0x52/0xb0
[<ffffffff8883df10>] retint_kernel+0x20/0x30
[<ffffffff8822303a>] ? lookup_fast+0xea/0x3d0
[<ffffffff88223025>] ? lookup_fast+0xd5/0x3d0
[<ffffffff88224857>] link_path_walk+0x1a7/0x8a0
[<ffffffff88224f95>] ? path_lookupat+0x45/0x7b0
[<ffffffff88224fb7>] path_lookupat+0x67/0x7b0
[<ffffffff880d385d>] ? trace_hardirqs_off+0xd/0x10
[<ffffffff8883dda4>] ? retint_restore_args+0xe/0xe
[<ffffffff8822572b>] filename_lookup+0x2b/0xc0
[<ffffffff88229c77>] user_path_at_empty+0x67/0xc0
[<ffffffff880d3dbe>] ? put_lock_stats.isra.27+0xe/0x30
[<ffffffff880d42a6>] ? lock_release_holdtime.part.28+0xe6/0x160
[<ffffffff880b15ad>] ? get_parent_ip+0xd/0x50
[<ffffffff88229ce1>] user_path_at+0x11/0x20
[<ffffffff8824fac1>] do_utimes+0xd1/0x180
[<ffffffff8824fbef>] SyS_utime+0x7f/0xc0
[<ffffffff8883d345>] ? tracesys+0x7e/0xe2
[<ffffffff8883d3a4>] tracesys+0xdd/0xe2
trinity-c342 R running task 13384 766 32295 0x00000000
ffff880068943d98 0000000000000002 0000000000000000 ffff880193c8c680
00000000001d4100 0000000000000000 ffff880068943fd8 00000000001d4100
ffff88000188af00 ffff880193c8c680 ffff880068943fd8 0000000000000000
Call Trace:
[<ffffffff888368e2>] preempt_schedule_irq+0x52/0xb0
[<ffffffff8883df10>] retint_kernel+0x20/0x30
[<ffffffff8809f767>] ? pid_task+0x47/0xa0
[<ffffffff8809f73d>] ? pid_task+0x1d/0xa0
[<ffffffff8808d4f1>] kill_pid_info+0x61/0x130
[<ffffffff8808d495>] ? kill_pid_info+0x5/0x130
[<ffffffff8808d6d2>] SYSC_kill+0xf2/0x2f0
[<ffffffff8808d67b>] ? SYSC_kill+0x9b/0x2f0
[<ffffffff8819c2b7>] ? context_tracking_user_exit+0x57/0x280
[<ffffffff880136bd>] ? syscall_trace_enter+0x13d/0x310
[<ffffffff8808fd9e>] SyS_kill+0xe/0x10
[<ffffffff8883d3a4>] tracesys+0xdd/0xe2
trinity-c225 R running task 13448 646 32295 0x00000000
ffff880161ccfb78 0000000000000002 ffffffff88c993ed ffff88000bf85e00
00000000001d4100 0000000000000003 ffff880161ccffd8 00000000001d4100
ffff88005ea89780 ffff88000bf85e00 ffff880161ccffd8 0000000000000000
Call Trace:
[<ffffffff888368e2>] preempt_schedule_irq+0x52/0xb0
[<ffffffff8883df10>] retint_kernel+0x20/0x30
[<ffffffff8822303a>] ? lookup_fast+0xea/0x3d0
[<ffffffff88223025>] ? lookup_fast+0xd5/0x3d0
[<ffffffff88224857>] link_path_walk+0x1a7/0x8a0
[<ffffffff88224f95>] ? path_lookupat+0x45/0x7b0
[<ffffffff88224fb7>] path_lookupat+0x67/0x7b0
[<ffffffff880d385d>] ? trace_hardirqs_off+0xd/0x10
[<ffffffff8883dda4>] ? retint_restore_args+0xe/0xe
[<ffffffff8822572b>] filename_lookup+0x2b/0xc0
[<ffffffff88229c77>] user_path_at_empty+0x67/0xc0
[<ffffffff880d3dbe>] ? put_lock_stats.isra.27+0xe/0x30
[<ffffffff880d42a6>] ? lock_release_holdtime.part.28+0xe6/0x160
[<ffffffff880b15ad>] ? get_parent_ip+0xd/0x50
[<ffffffff88229ce1>] user_path_at+0x11/0x20
[<ffffffff8824fac1>] do_utimes+0xd1/0x180
[<ffffffff8824fbef>] SyS_utime+0x7f/0xc0
[<ffffffff8883d345>] ? tracesys+0x7e/0xe2
[<ffffffff8883d3a4>] tracesys+0xdd/0xe2
This is on Linus' current tree, with the new CONFIG_TASKS_RCU unset.
Dave
On 10/13/2014 01:35 PM, Dave Jones wrote:
> oday in "rcu stall while fuzzing" news:
>
> INFO: rcu_preempt detected stalls on CPUs/tasks:
> Tasks blocked on level-0 rcu_node (CPUs 0-3): P766 P646
> Tasks blocked on level-0 rcu_node (CPUs 0-3): P766 P646
> (detected by 0, t=6502 jiffies, g=75434, c=75433, q=0)
I've complained about RCU stalls couple days ago (in a different context)
on -next. I guess whatever causing them made it into Linus's tree?
https://lkml.org/lkml/2014/10/11/64
Thanks,
Sasha
On Mon, Oct 13, 2014 at 01:35:04PM -0400, Dave Jones wrote:
> Today in "rcu stall while fuzzing" news:
>
> INFO: rcu_preempt detected stalls on CPUs/tasks:
> Tasks blocked on level-0 rcu_node (CPUs 0-3): P766 P646
> Tasks blocked on level-0 rcu_node (CPUs 0-3): P766 P646
> (detected by 0, t=6502 jiffies, g=75434, c=75433, q=0)
> trinity-c342 R running task 13384 766 32295 0x00000000
> ffff880068943d58 0000000000000002 0000000000000002 ffff880193c8c680
> 00000000001d4100 0000000000000000 ffff880068943fd8 00000000001d4100
> ffff88024302c680 ffff880193c8c680 ffff880068943fd8 0000000000000000
> Call Trace:
> [<ffffffff888368e2>] preempt_schedule_irq+0x52/0xb0
> [<ffffffff8883df10>] retint_kernel+0x20/0x30
> [<ffffffff880d9424>] ? lock_acquire+0xd4/0x2b0
> [<ffffffff8808d495>] ? kill_pid_info+0x5/0x130
> [<ffffffff8808d4d5>] kill_pid_info+0x45/0x130
> [<ffffffff8808d495>] ? kill_pid_info+0x5/0x130
> [<ffffffff8808d6d2>] SYSC_kill+0xf2/0x2f0
> [<ffffffff8808d67b>] ? SYSC_kill+0x9b/0x2f0
> [<ffffffff8819c2b7>] ? context_tracking_user_exit+0x57/0x280
> [<ffffffff880136bd>] ? syscall_trace_enter+0x13d/0x310
> [<ffffffff8808fd9e>] SyS_kill+0xe/0x10
> [<ffffffff8883d3a4>] tracesys+0xdd/0xe2
Well, there is a loop in kill_pid_info(). I am surprised that it
would loop indefinitely, but if it did, you would certainly get
RCU CPU stalls. Please see patch below, adding Oleg for his thoughts.
> trinity-c225 R running task 13448 646 32295 0x00000000
> ffff880161ccfb28 0000000000000002 ffff880161ccfe10 ffff88000bf85e00
> 00000000001d4100 0000000000000003 ffff880161ccffd8 00000000001d4100
> ffff880030124680 ffff88000bf85e00 ffff880161ccffd8 0000000000000000
> Call Trace:
> [<ffffffff888368e2>] preempt_schedule_irq+0x52/0xb0
> [<ffffffff8883df10>] retint_kernel+0x20/0x30
> [<ffffffff88233d41>] ? __d_lookup_rcu+0xd1/0x1e0
> [<ffffffff88233dd6>] ? __d_lookup_rcu+0x166/0x1e0
> [<ffffffff88222f9f>] lookup_fast+0x4f/0x3d0
> [<ffffffff88224857>] link_path_walk+0x1a7/0x8a0
> [<ffffffff88224f95>] ? path_lookupat+0x45/0x7b0
> [<ffffffff88224fb7>] path_lookupat+0x67/0x7b0
> [<ffffffff880d385d>] ? trace_hardirqs_off+0xd/0x10
> [<ffffffff8883dda4>] ? retint_restore_args+0xe/0xe
> [<ffffffff8822572b>] filename_lookup+0x2b/0xc0
> [<ffffffff88229c77>] user_path_at_empty+0x67/0xc0
> [<ffffffff880d3dbe>] ? put_lock_stats.isra.27+0xe/0x30
> [<ffffffff880d42a6>] ? lock_release_holdtime.part.28+0xe6/0x160
> [<ffffffff880b15ad>] ? get_parent_ip+0xd/0x50
> [<ffffffff88229ce1>] user_path_at+0x11/0x20
> [<ffffffff8824fac1>] do_utimes+0xd1/0x180
> [<ffffffff8824fbef>] SyS_utime+0x7f/0xc0
> [<ffffffff8883d345>] ? tracesys+0x7e/0xe2
> [<ffffffff8883d3a4>] tracesys+0xdd/0xe2
This one will require more looking. But did you do something like
create a pair of mutually recursive symlinks or something? ;-)
Thanx, Paul
> trinity-c342 R running task 13384 766 32295 0x00000000
> ffff880068943d58 0000000000000002 0000000000000002 ffff880193c8c680
> 00000000001d4100 0000000000000000 ffff880068943fd8 00000000001d4100
> ffff88024302c680 ffff880193c8c680 ffff880068943fd8 0000000000000000
> Call Trace:
> [<ffffffff888368e2>] preempt_schedule_irq+0x52/0xb0
> [<ffffffff8883df10>] retint_kernel+0x20/0x30
> [<ffffffff880d9424>] ? lock_acquire+0xd4/0x2b0
> [<ffffffff8808d495>] ? kill_pid_info+0x5/0x130
> [<ffffffff8808d4d5>] kill_pid_info+0x45/0x130
> [<ffffffff8808d495>] ? kill_pid_info+0x5/0x130
> [<ffffffff8808d6d2>] SYSC_kill+0xf2/0x2f0
> [<ffffffff8808d67b>] ? SYSC_kill+0x9b/0x2f0
> [<ffffffff8819c2b7>] ? context_tracking_user_exit+0x57/0x280
> [<ffffffff880136bd>] ? syscall_trace_enter+0x13d/0x310
> [<ffffffff8808fd9e>] SyS_kill+0xe/0x10
> [<ffffffff8883d3a4>] tracesys+0xdd/0xe2
> trinity-c225 R running task 13448 646 32295 0x00000000
> ffff880161ccfb28 0000000000000002 ffff880161ccfe10 ffff88000bf85e00
> 00000000001d4100 0000000000000003 ffff880161ccffd8 00000000001d4100
> ffff880030124680 ffff88000bf85e00 ffff880161ccffd8 0000000000000000
> Call Trace:
> [<ffffffff888368e2>] preempt_schedule_irq+0x52/0xb0
> [<ffffffff8883df10>] retint_kernel+0x20/0x30
> [<ffffffff88233d41>] ? __d_lookup_rcu+0xd1/0x1e0
> [<ffffffff88233dd6>] ? __d_lookup_rcu+0x166/0x1e0
> [<ffffffff88222f9f>] lookup_fast+0x4f/0x3d0
> [<ffffffff88224857>] link_path_walk+0x1a7/0x8a0
> [<ffffffff88224f95>] ? path_lookupat+0x45/0x7b0
> [<ffffffff88224fb7>] path_lookupat+0x67/0x7b0
> [<ffffffff880d385d>] ? trace_hardirqs_off+0xd/0x10
> [<ffffffff8883dda4>] ? retint_restore_args+0xe/0xe
> [<ffffffff8822572b>] filename_lookup+0x2b/0xc0
> [<ffffffff88229c77>] user_path_at_empty+0x67/0xc0
> [<ffffffff880d3dbe>] ? put_lock_stats.isra.27+0xe/0x30
> [<ffffffff880d42a6>] ? lock_release_holdtime.part.28+0xe6/0x160
> [<ffffffff880b15ad>] ? get_parent_ip+0xd/0x50
> [<ffffffff88229ce1>] user_path_at+0x11/0x20
> [<ffffffff8824fac1>] do_utimes+0xd1/0x180
> [<ffffffff8824fbef>] SyS_utime+0x7f/0xc0
> [<ffffffff8883d345>] ? tracesys+0x7e/0xe2
> [<ffffffff8883d3a4>] tracesys+0xdd/0xe2
> INFO: rcu_preempt detected stalls on CPUs/tasks:
> Tasks blocked on level-0 rcu_node (CPUs 0-3): P766 P646
> Tasks blocked on level-0 rcu_node (CPUs 0-3): P766 P646
> (detected by 3, t=26007 jiffies, g=75434, c=75433, q=0)
> trinity-c342 R running task 13384 766 32295 0x00000000
> ffff880068943d98 0000000000000002 0000000000000000 ffff880193c8c680
> 00000000001d4100 0000000000000000 ffff880068943fd8 00000000001d4100
> ffff88000188af00 ffff880193c8c680 ffff880068943fd8 0000000000000000
> Call Trace:
> [<ffffffff888368e2>] preempt_schedule_irq+0x52/0xb0
> [<ffffffff8883df10>] retint_kernel+0x20/0x30
> [<ffffffff8809f767>] ? pid_task+0x47/0xa0
> [<ffffffff8809f73d>] ? pid_task+0x1d/0xa0
> [<ffffffff8808d4f1>] kill_pid_info+0x61/0x130
> [<ffffffff8808d495>] ? kill_pid_info+0x5/0x130
> [<ffffffff8808d6d2>] SYSC_kill+0xf2/0x2f0
> [<ffffffff8808d67b>] ? SYSC_kill+0x9b/0x2f0
> [<ffffffff8819c2b7>] ? context_tracking_user_exit+0x57/0x280
> [<ffffffff880136bd>] ? syscall_trace_enter+0x13d/0x310
> [<ffffffff8808fd9e>] SyS_kill+0xe/0x10
> [<ffffffff8883d3a4>] tracesys+0xdd/0xe2
> trinity-c225 R running task 13448 646 32295 0x00000000
> ffff880161ccfb78 0000000000000002 ffffffff88c993ed ffff88000bf85e00
> 00000000001d4100 0000000000000003 ffff880161ccffd8 00000000001d4100
> ffff88005ea89780 ffff88000bf85e00 ffff880161ccffd8 0000000000000000
> Call Trace:
> [<ffffffff888368e2>] preempt_schedule_irq+0x52/0xb0
> [<ffffffff8883df10>] retint_kernel+0x20/0x30
> [<ffffffff8822303a>] ? lookup_fast+0xea/0x3d0
> [<ffffffff88223025>] ? lookup_fast+0xd5/0x3d0
> [<ffffffff88224857>] link_path_walk+0x1a7/0x8a0
> [<ffffffff88224f95>] ? path_lookupat+0x45/0x7b0
> [<ffffffff88224fb7>] path_lookupat+0x67/0x7b0
> [<ffffffff880d385d>] ? trace_hardirqs_off+0xd/0x10
> [<ffffffff8883dda4>] ? retint_restore_args+0xe/0xe
> [<ffffffff8822572b>] filename_lookup+0x2b/0xc0
> [<ffffffff88229c77>] user_path_at_empty+0x67/0xc0
> [<ffffffff880d3dbe>] ? put_lock_stats.isra.27+0xe/0x30
> [<ffffffff880d42a6>] ? lock_release_holdtime.part.28+0xe6/0x160
> [<ffffffff880b15ad>] ? get_parent_ip+0xd/0x50
> [<ffffffff88229ce1>] user_path_at+0x11/0x20
> [<ffffffff8824fac1>] do_utimes+0xd1/0x180
> [<ffffffff8824fbef>] SyS_utime+0x7f/0xc0
> [<ffffffff8883d345>] ? tracesys+0x7e/0xe2
> [<ffffffff8883d3a4>] tracesys+0xdd/0xe2
> trinity-c342 R running task 13384 766 32295 0x00000000
> ffff880068943d98 0000000000000002 0000000000000000 ffff880193c8c680
> 00000000001d4100 0000000000000000 ffff880068943fd8 00000000001d4100
> ffff88000188af00 ffff880193c8c680 ffff880068943fd8 0000000000000000
> Call Trace:
> [<ffffffff888368e2>] preempt_schedule_irq+0x52/0xb0
> [<ffffffff8883df10>] retint_kernel+0x20/0x30
> [<ffffffff8809f767>] ? pid_task+0x47/0xa0
> [<ffffffff8809f73d>] ? pid_task+0x1d/0xa0
> [<ffffffff8808d4f1>] kill_pid_info+0x61/0x130
> [<ffffffff8808d495>] ? kill_pid_info+0x5/0x130
> [<ffffffff8808d6d2>] SYSC_kill+0xf2/0x2f0
> [<ffffffff8808d67b>] ? SYSC_kill+0x9b/0x2f0
> [<ffffffff8819c2b7>] ? context_tracking_user_exit+0x57/0x280
> [<ffffffff880136bd>] ? syscall_trace_enter+0x13d/0x310
> [<ffffffff8808fd9e>] SyS_kill+0xe/0x10
> [<ffffffff8883d3a4>] tracesys+0xdd/0xe2
> trinity-c225 R running task 13448 646 32295 0x00000000
> ffff880161ccfb78 0000000000000002 ffffffff88c993ed ffff88000bf85e00
> 00000000001d4100 0000000000000003 ffff880161ccffd8 00000000001d4100
> ffff88005ea89780 ffff88000bf85e00 ffff880161ccffd8 0000000000000000
> Call Trace:
> [<ffffffff888368e2>] preempt_schedule_irq+0x52/0xb0
> [<ffffffff8883df10>] retint_kernel+0x20/0x30
> [<ffffffff8822303a>] ? lookup_fast+0xea/0x3d0
> [<ffffffff88223025>] ? lookup_fast+0xd5/0x3d0
> [<ffffffff88224857>] link_path_walk+0x1a7/0x8a0
> [<ffffffff88224f95>] ? path_lookupat+0x45/0x7b0
> [<ffffffff88224fb7>] path_lookupat+0x67/0x7b0
> [<ffffffff880d385d>] ? trace_hardirqs_off+0xd/0x10
> [<ffffffff8883dda4>] ? retint_restore_args+0xe/0xe
> [<ffffffff8822572b>] filename_lookup+0x2b/0xc0
> [<ffffffff88229c77>] user_path_at_empty+0x67/0xc0
> [<ffffffff880d3dbe>] ? put_lock_stats.isra.27+0xe/0x30
> [<ffffffff880d42a6>] ? lock_release_holdtime.part.28+0xe6/0x160
> [<ffffffff880b15ad>] ? get_parent_ip+0xd/0x50
> [<ffffffff88229ce1>] user_path_at+0x11/0x20
> [<ffffffff8824fac1>] do_utimes+0xd1/0x180
> [<ffffffff8824fbef>] SyS_utime+0x7f/0xc0
> [<ffffffff8883d345>] ? tracesys+0x7e/0xe2
> [<ffffffff8883d3a4>] tracesys+0xdd/0xe2
>
> This is on Linus' current tree, with the new CONFIG_TASKS_RCU unset.
------------------------------------------------------------------------
diff --git a/kernel/signal.c b/kernel/signal.c
index 8f0876f9f6dd..ef6525d0ca73 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -1331,8 +1331,8 @@ int kill_pid_info(int sig, struct siginfo *info, struct pid *pid)
int error = -ESRCH;
struct task_struct *p;
- rcu_read_lock();
retry:
+ rcu_read_lock();
p = pid_task(pid, PIDTYPE_PID);
if (p) {
error = group_send_sig_info(sig, info, p);
@@ -1343,6 +1343,7 @@ retry:
* if we race with de_thread() it will find the
* new leader.
*/
+ rcu_read_unlock();
goto retry;
}
rcu_read_unlock();
On Thu, Oct 23, 2014 at 11:32:32AM -0700, Paul E. McKenney wrote:
> > trinity-c225 R running task 13448 646 32295 0x00000000
> > ffff880161ccfb28 0000000000000002 ffff880161ccfe10 ffff88000bf85e00
> > 00000000001d4100 0000000000000003 ffff880161ccffd8 00000000001d4100
> > ffff880030124680 ffff88000bf85e00 ffff880161ccffd8 0000000000000000
> > Call Trace:
> > [<ffffffff888368e2>] preempt_schedule_irq+0x52/0xb0
> > [<ffffffff8883df10>] retint_kernel+0x20/0x30
> > [<ffffffff88233d41>] ? __d_lookup_rcu+0xd1/0x1e0
> > [<ffffffff88233dd6>] ? __d_lookup_rcu+0x166/0x1e0
> > [<ffffffff88222f9f>] lookup_fast+0x4f/0x3d0
> > [<ffffffff88224857>] link_path_walk+0x1a7/0x8a0
> > [<ffffffff88224f95>] ? path_lookupat+0x45/0x7b0
> > [<ffffffff88224fb7>] path_lookupat+0x67/0x7b0
> > [<ffffffff880d385d>] ? trace_hardirqs_off+0xd/0x10
> > [<ffffffff8883dda4>] ? retint_restore_args+0xe/0xe
> > [<ffffffff8822572b>] filename_lookup+0x2b/0xc0
> > [<ffffffff88229c77>] user_path_at_empty+0x67/0xc0
> > [<ffffffff880d3dbe>] ? put_lock_stats.isra.27+0xe/0x30
> > [<ffffffff880d42a6>] ? lock_release_holdtime.part.28+0xe6/0x160
> > [<ffffffff880b15ad>] ? get_parent_ip+0xd/0x50
> > [<ffffffff88229ce1>] user_path_at+0x11/0x20
> > [<ffffffff8824fac1>] do_utimes+0xd1/0x180
> > [<ffffffff8824fbef>] SyS_utime+0x7f/0xc0
> > [<ffffffff8883d345>] ? tracesys+0x7e/0xe2
> > [<ffffffff8883d3a4>] tracesys+0xdd/0xe2
>
> This one will require more looking. But did you do something like
> create a pair of mutually recursive symlinks or something? ;-)
I'm not 100% sure, but this may have been on a box that I was running
tests on NFS. So maybe the server had disappeared with the mount
still active..
Just a guess tbh.
Dave
On Tue, Oct 14, 2014 at 10:35:10PM -0400, Sasha Levin wrote:
> On 10/13/2014 01:35 PM, Dave Jones wrote:
> > oday in "rcu stall while fuzzing" news:
> >
> > INFO: rcu_preempt detected stalls on CPUs/tasks:
> > Tasks blocked on level-0 rcu_node (CPUs 0-3): P766 P646
> > Tasks blocked on level-0 rcu_node (CPUs 0-3): P766 P646
> > (detected by 0, t=6502 jiffies, g=75434, c=75433, q=0)
>
> I've complained about RCU stalls couple days ago (in a different context)
> on -next. I guess whatever causing them made it into Linus's tree?
>
> https://lkml.org/lkml/2014/10/11/64
And on that one, I must confess that I don't see where the RCU read-side
critical section might be.
Hmmm... Maybe someone forgot to put an rcu_read_unlock() somewhere.
Can you reproduce this with CONFIG_PROVE_RCU=y?
Thanx, Paul
On 10/23/2014 02:39 PM, Paul E. McKenney wrote:
> On Tue, Oct 14, 2014 at 10:35:10PM -0400, Sasha Levin wrote:
>> On 10/13/2014 01:35 PM, Dave Jones wrote:
>>> oday in "rcu stall while fuzzing" news:
>>>
>>> INFO: rcu_preempt detected stalls on CPUs/tasks:
>>> Tasks blocked on level-0 rcu_node (CPUs 0-3): P766 P646
>>> Tasks blocked on level-0 rcu_node (CPUs 0-3): P766 P646
>>> (detected by 0, t=6502 jiffies, g=75434, c=75433, q=0)
>>
>> I've complained about RCU stalls couple days ago (in a different context)
>> on -next. I guess whatever causing them made it into Linus's tree?
>>
>> https://lkml.org/lkml/2014/10/11/64
>
> And on that one, I must confess that I don't see where the RCU read-side
> critical section might be.
>
> Hmmm... Maybe someone forgot to put an rcu_read_unlock() somewhere.
> Can you reproduce this with CONFIG_PROVE_RCU=y?
Paul, if that was directed to me - Yes, I see stalls with CONFIG_PROVE_RCU
set and nothing else is showing up before/after that.
Thanks,
Sasha
On 10/23, Paul E. McKenney wrote:
>
> On Mon, Oct 13, 2014 at 01:35:04PM -0400, Dave Jones wrote:
> > Today in "rcu stall while fuzzing" news:
> >
> > INFO: rcu_preempt detected stalls on CPUs/tasks:
> > Tasks blocked on level-0 rcu_node (CPUs 0-3): P766 P646
> > Tasks blocked on level-0 rcu_node (CPUs 0-3): P766 P646
> > (detected by 0, t=6502 jiffies, g=75434, c=75433, q=0)
> > trinity-c342 R running task 13384 766 32295 0x00000000
> > ffff880068943d58 0000000000000002 0000000000000002 ffff880193c8c680
> > 00000000001d4100 0000000000000000 ffff880068943fd8 00000000001d4100
> > ffff88024302c680 ffff880193c8c680 ffff880068943fd8 0000000000000000
> > Call Trace:
> > [<ffffffff888368e2>] preempt_schedule_irq+0x52/0xb0
> > [<ffffffff8883df10>] retint_kernel+0x20/0x30
> > [<ffffffff880d9424>] ? lock_acquire+0xd4/0x2b0
> > [<ffffffff8808d495>] ? kill_pid_info+0x5/0x130
> > [<ffffffff8808d4d5>] kill_pid_info+0x45/0x130
> > [<ffffffff8808d495>] ? kill_pid_info+0x5/0x130
> > [<ffffffff8808d6d2>] SYSC_kill+0xf2/0x2f0
> > [<ffffffff8808d67b>] ? SYSC_kill+0x9b/0x2f0
> > [<ffffffff8819c2b7>] ? context_tracking_user_exit+0x57/0x280
> > [<ffffffff880136bd>] ? syscall_trace_enter+0x13d/0x310
> > [<ffffffff8808fd9e>] SyS_kill+0xe/0x10
> > [<ffffffff8883d3a4>] tracesys+0xdd/0xe2
>
> Well, there is a loop in kill_pid_info(). I am surprised that it
> would loop indefinitely, but if it did, you would certainly get
> RCU CPU stalls. Please see patch below, adding Oleg for his thoughts.
Yes, this loops should not be a problem, we only restart if we race with
a multi-threaded exec from a non-leader thread.
But I already saw a couple of bug-reports which look as a task_struct
corruption (->signal/creds == NULL), looks like something was broken
recently. Perhaps an unbalanced put_task_struct...
_Perhaps_ this is another case. If ->sighand was nullified then it will
loop forever.
Oleg.
On Thu, Oct 23, 2014 at 02:40:18PM -0400, Dave Jones wrote:
> On Thu, Oct 23, 2014 at 11:32:32AM -0700, Paul E. McKenney wrote:
>
> > > trinity-c225 R running task 13448 646 32295 0x00000000
> > > ffff880161ccfb28 0000000000000002 ffff880161ccfe10 ffff88000bf85e00
> > > 00000000001d4100 0000000000000003 ffff880161ccffd8 00000000001d4100
> > > ffff880030124680 ffff88000bf85e00 ffff880161ccffd8 0000000000000000
> > > Call Trace:
> > > [<ffffffff888368e2>] preempt_schedule_irq+0x52/0xb0
> > > [<ffffffff8883df10>] retint_kernel+0x20/0x30
> > > [<ffffffff88233d41>] ? __d_lookup_rcu+0xd1/0x1e0
> > > [<ffffffff88233dd6>] ? __d_lookup_rcu+0x166/0x1e0
> > > [<ffffffff88222f9f>] lookup_fast+0x4f/0x3d0
> > > [<ffffffff88224857>] link_path_walk+0x1a7/0x8a0
> > > [<ffffffff88224f95>] ? path_lookupat+0x45/0x7b0
> > > [<ffffffff88224fb7>] path_lookupat+0x67/0x7b0
> > > [<ffffffff880d385d>] ? trace_hardirqs_off+0xd/0x10
> > > [<ffffffff8883dda4>] ? retint_restore_args+0xe/0xe
> > > [<ffffffff8822572b>] filename_lookup+0x2b/0xc0
> > > [<ffffffff88229c77>] user_path_at_empty+0x67/0xc0
> > > [<ffffffff880d3dbe>] ? put_lock_stats.isra.27+0xe/0x30
> > > [<ffffffff880d42a6>] ? lock_release_holdtime.part.28+0xe6/0x160
> > > [<ffffffff880b15ad>] ? get_parent_ip+0xd/0x50
> > > [<ffffffff88229ce1>] user_path_at+0x11/0x20
> > > [<ffffffff8824fac1>] do_utimes+0xd1/0x180
> > > [<ffffffff8824fbef>] SyS_utime+0x7f/0xc0
> > > [<ffffffff8883d345>] ? tracesys+0x7e/0xe2
> > > [<ffffffff8883d3a4>] tracesys+0xdd/0xe2
> >
> > This one will require more looking. But did you do something like
> > create a pair of mutually recursive symlinks or something? ;-)
>
> I'm not 100% sure, but this may have been on a box that I was running
> tests on NFS. So maybe the server had disappeared with the mount
> still active..
>
> Just a guess tbh.
Another possibility might be that the box was so overloaded that tasks
were getting preempted for 21 seconds as a matter of course, and sometimes
within RCU read-side critical sections. Or did the box have ample idle
time?
Thanx, Paul
On Thu, Oct 23, 2014 at 12:28:07PM -0700, Paul E. McKenney wrote:
> > > This one will require more looking. But did you do something like
> > > create a pair of mutually recursive symlinks or something? ;-)
> >
> > I'm not 100% sure, but this may have been on a box that I was running
> > tests on NFS. So maybe the server had disappeared with the mount
> > still active..
> >
> > Just a guess tbh.
>
> Another possibility might be that the box was so overloaded that tasks
> were getting preempted for 21 seconds as a matter of course, and sometimes
> within RCU read-side critical sections. Or did the box have ample idle
> time?
I fairly recently upped the number of child processes I typically run
with, so it being overloaded does sound highly likely.
Dave
On Thu, Oct 23, 2014 at 09:13:19PM +0200, Oleg Nesterov wrote:
> On 10/23, Paul E. McKenney wrote:
> >
> > On Mon, Oct 13, 2014 at 01:35:04PM -0400, Dave Jones wrote:
> > > Today in "rcu stall while fuzzing" news:
> > >
> > > INFO: rcu_preempt detected stalls on CPUs/tasks:
> > > Tasks blocked on level-0 rcu_node (CPUs 0-3): P766 P646
> > > Tasks blocked on level-0 rcu_node (CPUs 0-3): P766 P646
> > > (detected by 0, t=6502 jiffies, g=75434, c=75433, q=0)
> > > trinity-c342 R running task 13384 766 32295 0x00000000
> > > ffff880068943d58 0000000000000002 0000000000000002 ffff880193c8c680
> > > 00000000001d4100 0000000000000000 ffff880068943fd8 00000000001d4100
> > > ffff88024302c680 ffff880193c8c680 ffff880068943fd8 0000000000000000
> > > Call Trace:
> > > [<ffffffff888368e2>] preempt_schedule_irq+0x52/0xb0
> > > [<ffffffff8883df10>] retint_kernel+0x20/0x30
> > > [<ffffffff880d9424>] ? lock_acquire+0xd4/0x2b0
> > > [<ffffffff8808d495>] ? kill_pid_info+0x5/0x130
> > > [<ffffffff8808d4d5>] kill_pid_info+0x45/0x130
> > > [<ffffffff8808d495>] ? kill_pid_info+0x5/0x130
> > > [<ffffffff8808d6d2>] SYSC_kill+0xf2/0x2f0
> > > [<ffffffff8808d67b>] ? SYSC_kill+0x9b/0x2f0
> > > [<ffffffff8819c2b7>] ? context_tracking_user_exit+0x57/0x280
> > > [<ffffffff880136bd>] ? syscall_trace_enter+0x13d/0x310
> > > [<ffffffff8808fd9e>] SyS_kill+0xe/0x10
> > > [<ffffffff8883d3a4>] tracesys+0xdd/0xe2
> >
> > Well, there is a loop in kill_pid_info(). I am surprised that it
> > would loop indefinitely, but if it did, you would certainly get
> > RCU CPU stalls. Please see patch below, adding Oleg for his thoughts.
>
> Yes, this loops should not be a problem, we only restart if we race with
> a multi-threaded exec from a non-leader thread.
>
> But I already saw a couple of bug-reports which look as a task_struct
> corruption (->signal/creds == NULL), looks like something was broken
> recently. Perhaps an unbalanced put_task_struct...
>
> _Perhaps_ this is another case. If ->sighand was nullified then it will
> loop forever.
OK, so making each pass through the loop a separate RCU read-side critical
section might be considered to be suppressing notification of an error
condition?
Thanx, Paul
On Thu, Oct 23, 2014 at 03:37:59PM -0400, Dave Jones wrote:
> On Thu, Oct 23, 2014 at 12:28:07PM -0700, Paul E. McKenney wrote:
>
> > > > This one will require more looking. But did you do something like
> > > > create a pair of mutually recursive symlinks or something? ;-)
> > >
> > > I'm not 100% sure, but this may have been on a box that I was running
> > > tests on NFS. So maybe the server had disappeared with the mount
> > > still active..
> > >
> > > Just a guess tbh.
> >
> > Another possibility might be that the box was so overloaded that tasks
> > were getting preempted for 21 seconds as a matter of course, and sometimes
> > within RCU read-side critical sections. Or did the box have ample idle
> > time?
>
> I fairly recently upped the number of child processes I typically run
> with, so it being overloaded does sound highly likely.
Ah, that could do it! One way to test extreme loads and not trigger
RCU CPU stall warnings might be to make all of your child processes all
sleep during a given interval of a few hundred milliseconds during each
ten-second interval. Would that work for you?
Thanx, Paul
On 10/23, Paul E. McKenney wrote:
>
> OK, so making each pass through the loop a separate RCU read-side critical
> section might be considered to be suppressing notification of an error
> condition?
I agree, this change probably makes sense anyway. Personally I'd prefer
the version below (somehow I hate multiple unlock's), but I won't insist.
Oleg.
--- x/kernel/signal.c
+++ x/kernel/signal.c
@@ -1331,21 +1331,19 @@ int kill_pid_info(int sig, struct siginf
int error = -ESRCH;
struct task_struct *p;
- rcu_read_lock();
retry:
+ rcu_read_lock();
p = pid_task(pid, PIDTYPE_PID);
- if (p) {
+ if (p)
error = group_send_sig_info(sig, info, p);
- if (unlikely(error == -ESRCH))
- /*
- * The task was unhashed in between, try again.
- * If it is dead, pid_task() will return NULL,
- * if we race with de_thread() it will find the
- * new leader.
- */
- goto retry;
- }
rcu_read_unlock();
+ /*
+ * The task was unhashed in between, try again. If it is dead,
+ * pid_task() will return NULL, if we race with de_thread() it
+ * will find the new leader.
+ */
+ if (p && error == -ESRCH))
+ goto retry;
return error;
}
On Thu, Oct 23, 2014 at 02:55:43PM -0400, Sasha Levin wrote:
> On 10/23/2014 02:39 PM, Paul E. McKenney wrote:
> > On Tue, Oct 14, 2014 at 10:35:10PM -0400, Sasha Levin wrote:
> >> On 10/13/2014 01:35 PM, Dave Jones wrote:
> >>> oday in "rcu stall while fuzzing" news:
> >>>
> >>> INFO: rcu_preempt detected stalls on CPUs/tasks:
> >>> Tasks blocked on level-0 rcu_node (CPUs 0-3): P766 P646
> >>> Tasks blocked on level-0 rcu_node (CPUs 0-3): P766 P646
> >>> (detected by 0, t=6502 jiffies, g=75434, c=75433, q=0)
> >>
> >> I've complained about RCU stalls couple days ago (in a different context)
> >> on -next. I guess whatever causing them made it into Linus's tree?
> >>
> >> https://lkml.org/lkml/2014/10/11/64
> >
> > And on that one, I must confess that I don't see where the RCU read-side
> > critical section might be.
> >
> > Hmmm... Maybe someone forgot to put an rcu_read_unlock() somewhere.
> > Can you reproduce this with CONFIG_PROVE_RCU=y?
>
> Paul, if that was directed to me - Yes, I see stalls with CONFIG_PROVE_RCU
> set and nothing else is showing up before/after that.
Indeed it was directed to you. ;-)
Does the following crude diagnostic patch turn up anything?
Thanx, Paul
------------------------------------------------------------------------
softirq: Check for RCU read-side misnesting in softirq handlers
This commit adds checks for RCU read-side misnesting in softirq handlers.
Please note that this works only for CONFIG_TREE_PREEMPT_RCU=y because
the other RCU flavors have no way of knowing how deeply nested they are.
Reported-by: Sasha Levin <[email protected]>
Signed-off-by: Paul E. McKenney <[email protected]>
diff --git a/kernel/softirq.c b/kernel/softirq.c
index 501baa9ac1be..c6b63a4c576d 100644
--- a/kernel/softirq.c
+++ b/kernel/softirq.c
@@ -257,11 +257,13 @@ restart:
while ((softirq_bit = ffs(pending))) {
unsigned int vec_nr;
int prev_count;
+ int rcu_depth;
h += softirq_bit - 1;
vec_nr = h - softirq_vec;
prev_count = preempt_count();
+ rcu_depth = rcu_preempt_depth();
kstat_incr_softirqs_this_cpu(vec_nr);
@@ -274,6 +276,11 @@ restart:
prev_count, preempt_count());
preempt_count_set(prev_count);
}
+ if (IS_ENABLED(CONFIG_PROVE_RCU) &&
+ rcu_depth != rcu_preempt_depth())
+ pr_err("huh, entered softirq %u %s %p with RCU nesting %08x, exited with %08x?\n",
+ vec_nr, softirq_to_name[vec_nr], h->action,
+ rcu_depth, rcu_preempt_depth());
h++;
pending >>= softirq_bit;
}
On Thu, Oct 23, 2014 at 12:52:21PM -0700, Paul E. McKenney wrote:
> On Thu, Oct 23, 2014 at 03:37:59PM -0400, Dave Jones wrote:
> > On Thu, Oct 23, 2014 at 12:28:07PM -0700, Paul E. McKenney wrote:
> >
> > > > > This one will require more looking. But did you do something like
> > > > > create a pair of mutually recursive symlinks or something? ;-)
> > > >
> > > > I'm not 100% sure, but this may have been on a box that I was running
> > > > tests on NFS. So maybe the server had disappeared with the mount
> > > > still active..
> > > >
> > > > Just a guess tbh.
> > >
> > > Another possibility might be that the box was so overloaded that tasks
> > > were getting preempted for 21 seconds as a matter of course, and sometimes
> > > within RCU read-side critical sections. Or did the box have ample idle
> > > time?
> >
> > I fairly recently upped the number of child processes I typically run
> > with, so it being overloaded does sound highly likely.
>
> Ah, that could do it! One way to test extreme loads and not trigger
> RCU CPU stall warnings might be to make all of your child processes all
> sleep during a given interval of a few hundred milliseconds during each
> ten-second interval. Would that work for you?
This feels like hiding from the problem rather than fixing it.
I'm not sure it even makes sense to add sleeps to the fuzzer, other than
to slow things down, and if I were to do that, I may as well just run
it with fewer threads instead.
While the fuzzer is doing pretty crazy stuff, what's different about it
from any other application that overcommits the CPU with too many threads?
We impose rlimits to stop people from forkbombing and the like, but this
doesn't even need that many processes to trigger, and with some effort
could probably done with even fewer if I found ways to keep other cores
busy in the kernel for long enough.
That all said, I don't have easy reproducers for this right now, due
to other bugs manifesting long before this gets to be a problem.
Dave
On Thu, Oct 23, 2014 at 09:53:37PM +0200, Oleg Nesterov wrote:
> On 10/23, Paul E. McKenney wrote:
> >
> > OK, so making each pass through the loop a separate RCU read-side critical
> > section might be considered to be suppressing notification of an error
> > condition?
>
> I agree, this change probably makes sense anyway. Personally I'd prefer
> the version below (somehow I hate multiple unlock's), but I won't insist.
Your code, your rules. ;-)
But given this structure, why not use a for() loop replace the
"goto retry" with an inverted condition and a "return error"?
Maybe something like the following patch?
Thanx, Paul
------------------------------------------------------------------------
signal: Exit RCU read-side critical section on each pass through loop
The kill_pid_info() can potentially loop indefinitely if tasks are created
and deleted sufficiently quickly, and if this happens, this function
will remain in a single RCU read-side critical section indefinitely.
This commit therefore exits the RCU read-side critical section on each
pass through the loop. Because a race must happen to retry the loop,
this should have no performance impact in the common case.
Reported-by: Dave Jones <[email protected]>
Signed-off-by: Paul E. McKenney <[email protected]>
Cc: Oleg Nesterov <[email protected]>
diff --git a/kernel/signal.c b/kernel/signal.c
index 8f0876f9f6dd..54820984a872 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -1331,23 +1331,21 @@ int kill_pid_info(int sig, struct siginfo *info, struct pid *pid)
int error = -ESRCH;
struct task_struct *p;
- rcu_read_lock();
-retry:
- p = pid_task(pid, PIDTYPE_PID);
- if (p) {
- error = group_send_sig_info(sig, info, p);
- if (unlikely(error == -ESRCH))
- /*
- * The task was unhashed in between, try again.
- * If it is dead, pid_task() will return NULL,
- * if we race with de_thread() it will find the
- * new leader.
- */
- goto retry;
- }
- rcu_read_unlock();
+ for (;;) {
+ rcu_read_lock();
+ p = pid_task(pid, PIDTYPE_PID);
+ if (p)
+ error = group_send_sig_info(sig, info, p);
+ rcu_read_unlock();
+ if (likely(!p || error != -ESRCH))
+ return error;
- return error;
+ /*
+ * The task was unhashed in between, try again. If it
+ * is dead, pid_task() will return NULL, if we race with
+ * de_thread() it will find the new leader.
+ */
+ }
}
int kill_proc_info(int sig, struct siginfo *info, pid_t pid)
On Thu, Oct 23, 2014 at 04:28:16PM -0400, Dave Jones wrote:
> On Thu, Oct 23, 2014 at 12:52:21PM -0700, Paul E. McKenney wrote:
> > On Thu, Oct 23, 2014 at 03:37:59PM -0400, Dave Jones wrote:
> > > On Thu, Oct 23, 2014 at 12:28:07PM -0700, Paul E. McKenney wrote:
> > >
> > > > > > This one will require more looking. But did you do something like
> > > > > > create a pair of mutually recursive symlinks or something? ;-)
> > > > >
> > > > > I'm not 100% sure, but this may have been on a box that I was running
> > > > > tests on NFS. So maybe the server had disappeared with the mount
> > > > > still active..
> > > > >
> > > > > Just a guess tbh.
> > > >
> > > > Another possibility might be that the box was so overloaded that tasks
> > > > were getting preempted for 21 seconds as a matter of course, and sometimes
> > > > within RCU read-side critical sections. Or did the box have ample idle
> > > > time?
> > >
> > > I fairly recently upped the number of child processes I typically run
> > > with, so it being overloaded does sound highly likely.
> >
> > Ah, that could do it! One way to test extreme loads and not trigger
> > RCU CPU stall warnings might be to make all of your child processes all
> > sleep during a given interval of a few hundred milliseconds during each
> > ten-second interval. Would that work for you?
>
> This feels like hiding from the problem rather than fixing it.
> I'm not sure it even makes sense to add sleeps to the fuzzer, other than
> to slow things down, and if I were to do that, I may as well just run
> it with fewer threads instead.
I was thinking of the RCU CPU stall warnings that were strictly due to
overload as being false positives. If trinity caused a kthread to loop
within an RCU read-side critical section, you would still get the RCU
CPU stall warning even with the sleeps.
But just a suggestion, no strong feelings. Might change if there is an
excess of false-positive RCU CPU stall warnings, of course. ;-)
> While the fuzzer is doing pretty crazy stuff, what's different about it
> from any other application that overcommits the CPU with too many threads?
The (presumably) much higher probability of being preempted in the kernel,
and thus within an RCU read-side critical section.
> We impose rlimits to stop people from forkbombing and the like, but this
> doesn't even need that many processes to trigger, and with some effort
> could probably done with even fewer if I found ways to keep other cores
> busy in the kernel for long enough.
>
> That all said, I don't have easy reproducers for this right now, due
> to other bugs manifesting long before this gets to be a problem.
Fair enough! ;-)
Thanx, Paul
On 10/23, Paul E. McKenney wrote:
>
> Your code, your rules. ;-)
Heh, no. I do not trust my (perverted) taste, I never-never
argue with cosmetic issues ;)
Cough... and at the same time I have a small nit.
> But given this structure, why not use a for() loop replace the
> "goto retry" with an inverted condition and a "return error"?
> Maybe something like the following patch?
Thanks,
Acked-by: Oleg Nesterov <[email protected]>
On Thu, Oct 23, 2014 at 11:13:04PM +0200, Oleg Nesterov wrote:
> On 10/23, Paul E. McKenney wrote:
> >
> > Your code, your rules. ;-)
>
> Heh, no. I do not trust my (perverted) taste, I never-never
> argue with cosmetic issues ;)
>
> Cough... and at the same time I have a small nit.
;-) ;-) ;-)
> > But given this structure, why not use a for() loop replace the
> > "goto retry" with an inverted condition and a "return error"?
> > Maybe something like the following patch?
>
> Thanks,
>
> Acked-by: Oleg Nesterov <[email protected]>
Got it, thank you!
Thanx, Paul
On 10/23/2014 03:58 PM, Paul E. McKenney wrote:
> On Thu, Oct 23, 2014 at 02:55:43PM -0400, Sasha Levin wrote:
>> > On 10/23/2014 02:39 PM, Paul E. McKenney wrote:
>>> > > On Tue, Oct 14, 2014 at 10:35:10PM -0400, Sasha Levin wrote:
>>>> > >> On 10/13/2014 01:35 PM, Dave Jones wrote:
>>>>> > >>> oday in "rcu stall while fuzzing" news:
>>>>> > >>>
>>>>> > >>> INFO: rcu_preempt detected stalls on CPUs/tasks:
>>>>> > >>> Tasks blocked on level-0 rcu_node (CPUs 0-3): P766 P646
>>>>> > >>> Tasks blocked on level-0 rcu_node (CPUs 0-3): P766 P646
>>>>> > >>> (detected by 0, t=6502 jiffies, g=75434, c=75433, q=0)
>>>> > >>
>>>> > >> I've complained about RCU stalls couple days ago (in a different context)
>>>> > >> on -next. I guess whatever causing them made it into Linus's tree?
>>>> > >>
>>>> > >> https://lkml.org/lkml/2014/10/11/64
>>> > >
>>> > > And on that one, I must confess that I don't see where the RCU read-side
>>> > > critical section might be.
>>> > >
>>> > > Hmmm... Maybe someone forgot to put an rcu_read_unlock() somewhere.
>>> > > Can you reproduce this with CONFIG_PROVE_RCU=y?
>> >
>> > Paul, if that was directed to me - Yes, I see stalls with CONFIG_PROVE_RCU
>> > set and nothing else is showing up before/after that.
> Indeed it was directed to you. ;-)
>
> Does the following crude diagnostic patch turn up anything?
Nope, seeing stalls but not seeing that pr_err() you added.
[ 5107.395916] INFO: rcu_preempt detected stalls on CPUs/tasks:
[ 5107.395916] 0: (776 ticks this GP) idle=a8d/140000000000002/0 softirq=16356/16356 last_accelerate: f5b7/55e5, nonlazy_posted: 24252, ..
[ 5107.395916] (detected by 1, t=20502 jiffies, g=13949, c=13948, q=0)
[ 5107.395916] Task dump for CPU 0:
[ 5107.395916] trinity-c0 R running task 12848 20357 9041 0x0008000e
[ 5107.395916] 0000000000000000 ffff88006bfd76c0 ffff88065722b988 ffffffffa10af964
[ 5107.395916] ffff88065722b998 ffffffffa106ad23 ffff88065722b9c8 ffffffffa119dce5
[ 5107.395916] 00000000001d76c0 ffff88006bfd76c0 00000000001d76c0 ffff8806473cbd10
[ 5107.395916] Call Trace:
[ 5107.395916] [<ffffffffa10af964>] ? kvm_clock_read+0x24/0x40
[ 5107.395916] [<ffffffffa106ad23>] ? sched_clock+0x13/0x30
[ 5107.395916] [<ffffffffa119dce5>] ? sched_clock_local+0x25/0x90
[ 5107.395916] [<ffffffffa1303dfb>] ? __slab_free+0xbb/0x3a0
[ 5107.395916] [<ffffffffa1b71167>] ? debug_smp_processor_id+0x17/0x20
[ 5107.395916] [<ffffffffa451cb64>] ? _raw_spin_unlock_irqrestore+0x64/0xa0
[ 5107.395916] [<ffffffffa1303dfb>] ? __slab_free+0xbb/0x3a0
[ 5107.395916] [<ffffffffa1b71bce>] ? __debug_check_no_obj_freed+0x10e/0x210
[ 5107.395916] [<ffffffffa1305871>] ? kmem_cache_free+0xb1/0x4f0
[ 5107.395916] [<ffffffffa1305883>] ? kmem_cache_free+0xc3/0x4f0
[ 5107.395916] [<ffffffffa1305bb2>] ? kmem_cache_free+0x3f2/0x4f0
[ 5107.395916] [<ffffffffa12e0cbe>] ? unlink_anon_vmas+0x10e/0x180
[ 5107.395916] [<ffffffffa12e0cbe>] ? unlink_anon_vmas+0x10e/0x180
[ 5107.395916] [<ffffffffa12cfbdf>] ? free_pgtables+0x3f/0x130
[ 5107.395916] [<ffffffffa12dc1a4>] ? exit_mmap+0xc4/0x180
[ 5107.395916] [<ffffffffa13143fe>] ? __khugepaged_exit+0xbe/0x120
[ 5107.395916] [<ffffffffa115bbb3>] ? mmput+0x73/0x110
[ 5107.395916] [<ffffffffa1162eb7>] ? do_exit+0x2c7/0xd30
[ 5107.395916] [<ffffffffa1173fb9>] ? get_signal+0x3c9/0xaf0
[ 5107.395916] [<ffffffffa1b71167>] ? debug_smp_processor_id+0x17/0x20
[ 5107.395916] [<ffffffffa11bccbe>] ? put_lock_stats.isra.13+0xe/0x30
[ 5107.395916] [<ffffffffa451c810>] ? _raw_spin_unlock_irq+0x30/0x70
[ 5107.395916] [<ffffffffa11639c2>] ? do_group_exit+0x52/0xe0
[ 5107.395916] [<ffffffffa1173ef6>] ? get_signal+0x306/0xaf0
[ 5107.395916] [<ffffffffa119dce5>] ? sched_clock_local+0x25/0x90
[ 5107.395916] [<ffffffffa105f2f0>] ? do_signal+0x20/0x130
[ 5107.395916] [<ffffffffa1298558>] ? context_tracking_user_exit+0x78/0x2d0
[ 5107.395916] [<ffffffffa1b71183>] ? __this_cpu_preempt_check+0x13/0x20
[ 5107.395916] [<ffffffffa11c04cb>] ? trace_hardirqs_on_caller+0xfb/0x280
[ 5107.395916] [<ffffffffa11c065d>] ? trace_hardirqs_on+0xd/0x10
[ 5107.395916] [<ffffffffa105f469>] ? do_notify_resume+0x69/0xb0
[ 5107.395916] [<ffffffffa451d74f>] ? int_signal+0x12/0x17
Thanks,
Sasha
On Fri, Oct 24, 2014 at 08:28:40AM -0400, Sasha Levin wrote:
> On 10/23/2014 03:58 PM, Paul E. McKenney wrote:
> > On Thu, Oct 23, 2014 at 02:55:43PM -0400, Sasha Levin wrote:
> >> > On 10/23/2014 02:39 PM, Paul E. McKenney wrote:
> >>> > > On Tue, Oct 14, 2014 at 10:35:10PM -0400, Sasha Levin wrote:
> >>>> > >> On 10/13/2014 01:35 PM, Dave Jones wrote:
> >>>>> > >>> oday in "rcu stall while fuzzing" news:
> >>>>> > >>>
> >>>>> > >>> INFO: rcu_preempt detected stalls on CPUs/tasks:
> >>>>> > >>> Tasks blocked on level-0 rcu_node (CPUs 0-3): P766 P646
> >>>>> > >>> Tasks blocked on level-0 rcu_node (CPUs 0-3): P766 P646
> >>>>> > >>> (detected by 0, t=6502 jiffies, g=75434, c=75433, q=0)
> >>>> > >>
> >>>> > >> I've complained about RCU stalls couple days ago (in a different context)
> >>>> > >> on -next. I guess whatever causing them made it into Linus's tree?
> >>>> > >>
> >>>> > >> https://lkml.org/lkml/2014/10/11/64
> >>> > >
> >>> > > And on that one, I must confess that I don't see where the RCU read-side
> >>> > > critical section might be.
> >>> > >
> >>> > > Hmmm... Maybe someone forgot to put an rcu_read_unlock() somewhere.
> >>> > > Can you reproduce this with CONFIG_PROVE_RCU=y?
> >> >
> >> > Paul, if that was directed to me - Yes, I see stalls with CONFIG_PROVE_RCU
> >> > set and nothing else is showing up before/after that.
> > Indeed it was directed to you. ;-)
> >
> > Does the following crude diagnostic patch turn up anything?
>
> Nope, seeing stalls but not seeing that pr_err() you added.
OK, color me confused. Could you please send me the full dmesg or a
pointer to it?
Thanx, Paul
> [ 5107.395916] INFO: rcu_preempt detected stalls on CPUs/tasks:
> [ 5107.395916] 0: (776 ticks this GP) idle=a8d/140000000000002/0 softirq=16356/16356 last_accelerate: f5b7/55e5, nonlazy_posted: 24252, ..
> [ 5107.395916] (detected by 1, t=20502 jiffies, g=13949, c=13948, q=0)
> [ 5107.395916] Task dump for CPU 0:
> [ 5107.395916] trinity-c0 R running task 12848 20357 9041 0x0008000e
> [ 5107.395916] 0000000000000000 ffff88006bfd76c0 ffff88065722b988 ffffffffa10af964
> [ 5107.395916] ffff88065722b998 ffffffffa106ad23 ffff88065722b9c8 ffffffffa119dce5
> [ 5107.395916] 00000000001d76c0 ffff88006bfd76c0 00000000001d76c0 ffff8806473cbd10
> [ 5107.395916] Call Trace:
> [ 5107.395916] [<ffffffffa10af964>] ? kvm_clock_read+0x24/0x40
> [ 5107.395916] [<ffffffffa106ad23>] ? sched_clock+0x13/0x30
> [ 5107.395916] [<ffffffffa119dce5>] ? sched_clock_local+0x25/0x90
> [ 5107.395916] [<ffffffffa1303dfb>] ? __slab_free+0xbb/0x3a0
> [ 5107.395916] [<ffffffffa1b71167>] ? debug_smp_processor_id+0x17/0x20
> [ 5107.395916] [<ffffffffa451cb64>] ? _raw_spin_unlock_irqrestore+0x64/0xa0
> [ 5107.395916] [<ffffffffa1303dfb>] ? __slab_free+0xbb/0x3a0
> [ 5107.395916] [<ffffffffa1b71bce>] ? __debug_check_no_obj_freed+0x10e/0x210
> [ 5107.395916] [<ffffffffa1305871>] ? kmem_cache_free+0xb1/0x4f0
> [ 5107.395916] [<ffffffffa1305883>] ? kmem_cache_free+0xc3/0x4f0
> [ 5107.395916] [<ffffffffa1305bb2>] ? kmem_cache_free+0x3f2/0x4f0
> [ 5107.395916] [<ffffffffa12e0cbe>] ? unlink_anon_vmas+0x10e/0x180
> [ 5107.395916] [<ffffffffa12e0cbe>] ? unlink_anon_vmas+0x10e/0x180
> [ 5107.395916] [<ffffffffa12cfbdf>] ? free_pgtables+0x3f/0x130
> [ 5107.395916] [<ffffffffa12dc1a4>] ? exit_mmap+0xc4/0x180
> [ 5107.395916] [<ffffffffa13143fe>] ? __khugepaged_exit+0xbe/0x120
> [ 5107.395916] [<ffffffffa115bbb3>] ? mmput+0x73/0x110
> [ 5107.395916] [<ffffffffa1162eb7>] ? do_exit+0x2c7/0xd30
> [ 5107.395916] [<ffffffffa1173fb9>] ? get_signal+0x3c9/0xaf0
> [ 5107.395916] [<ffffffffa1b71167>] ? debug_smp_processor_id+0x17/0x20
> [ 5107.395916] [<ffffffffa11bccbe>] ? put_lock_stats.isra.13+0xe/0x30
> [ 5107.395916] [<ffffffffa451c810>] ? _raw_spin_unlock_irq+0x30/0x70
> [ 5107.395916] [<ffffffffa11639c2>] ? do_group_exit+0x52/0xe0
> [ 5107.395916] [<ffffffffa1173ef6>] ? get_signal+0x306/0xaf0
> [ 5107.395916] [<ffffffffa119dce5>] ? sched_clock_local+0x25/0x90
> [ 5107.395916] [<ffffffffa105f2f0>] ? do_signal+0x20/0x130
> [ 5107.395916] [<ffffffffa1298558>] ? context_tracking_user_exit+0x78/0x2d0
> [ 5107.395916] [<ffffffffa1b71183>] ? __this_cpu_preempt_check+0x13/0x20
> [ 5107.395916] [<ffffffffa11c04cb>] ? trace_hardirqs_on_caller+0xfb/0x280
> [ 5107.395916] [<ffffffffa11c065d>] ? trace_hardirqs_on+0xd/0x10
> [ 5107.395916] [<ffffffffa105f469>] ? do_notify_resume+0x69/0xb0
> [ 5107.395916] [<ffffffffa451d74f>] ? int_signal+0x12/0x17
>
>
> Thanks,
> Sasha
>
On 10/24/2014 12:13 PM, Paul E. McKenney wrote:
> On Fri, Oct 24, 2014 at 08:28:40AM -0400, Sasha Levin wrote:
>> > On 10/23/2014 03:58 PM, Paul E. McKenney wrote:
>>> > > On Thu, Oct 23, 2014 at 02:55:43PM -0400, Sasha Levin wrote:
>>>>> > >> > On 10/23/2014 02:39 PM, Paul E. McKenney wrote:
>>>>>>> > >>> > > On Tue, Oct 14, 2014 at 10:35:10PM -0400, Sasha Levin wrote:
>>>>>>>>> > >>>> > >> On 10/13/2014 01:35 PM, Dave Jones wrote:
>>>>>>>>>>> > >>>>> > >>> oday in "rcu stall while fuzzing" news:
>>>>>>>>>>> > >>>>> > >>>
>>>>>>>>>>> > >>>>> > >>> INFO: rcu_preempt detected stalls on CPUs/tasks:
>>>>>>>>>>> > >>>>> > >>> Tasks blocked on level-0 rcu_node (CPUs 0-3): P766 P646
>>>>>>>>>>> > >>>>> > >>> Tasks blocked on level-0 rcu_node (CPUs 0-3): P766 P646
>>>>>>>>>>> > >>>>> > >>> (detected by 0, t=6502 jiffies, g=75434, c=75433, q=0)
>>>>>>>>> > >>>> > >>
>>>>>>>>> > >>>> > >> I've complained about RCU stalls couple days ago (in a different context)
>>>>>>>>> > >>>> > >> on -next. I guess whatever causing them made it into Linus's tree?
>>>>>>>>> > >>>> > >>
>>>>>>>>> > >>>> > >> https://lkml.org/lkml/2014/10/11/64
>>>>>>> > >>> > >
>>>>>>> > >>> > > And on that one, I must confess that I don't see where the RCU read-side
>>>>>>> > >>> > > critical section might be.
>>>>>>> > >>> > >
>>>>>>> > >>> > > Hmmm... Maybe someone forgot to put an rcu_read_unlock() somewhere.
>>>>>>> > >>> > > Can you reproduce this with CONFIG_PROVE_RCU=y?
>>>>> > >> >
>>>>> > >> > Paul, if that was directed to me - Yes, I see stalls with CONFIG_PROVE_RCU
>>>>> > >> > set and nothing else is showing up before/after that.
>>> > > Indeed it was directed to you. ;-)
>>> > >
>>> > > Does the following crude diagnostic patch turn up anything?
>> >
>> > Nope, seeing stalls but not seeing that pr_err() you added.
> OK, color me confused. Could you please send me the full dmesg or a
> pointer to it?
Attached.
Thanks,
Sasha
On Mon, 13 Oct 2014 13:35:04 -0400
Dave Jones <[email protected]> wrote:
> Today in "rcu stall while fuzzing" news:
My bug report seems to be related with this topic:
Regression: kernel 3.17 halts sometimes (with Call trace)
https://bugzilla.kernel.org/show_bug.cgi?id=85941
Could somone take a look?
Thanks.
--
Linux 3.16.0-00115-g19583ca: Shuffling Zombie Juror
http://www.youtube.com/DanielFragaBR
http://exchangewar.info
Bitcoin: 12H6661yoLDUZaYPdah6urZS5WiXwTAUgL
On Fri, Oct 24, 2014 at 12:39:15PM -0400, Sasha Levin wrote:
> On 10/24/2014 12:13 PM, Paul E. McKenney wrote:
> > On Fri, Oct 24, 2014 at 08:28:40AM -0400, Sasha Levin wrote:
> >> > On 10/23/2014 03:58 PM, Paul E. McKenney wrote:
> >>> > > On Thu, Oct 23, 2014 at 02:55:43PM -0400, Sasha Levin wrote:
> >>>>> > >> > On 10/23/2014 02:39 PM, Paul E. McKenney wrote:
> >>>>>>> > >>> > > On Tue, Oct 14, 2014 at 10:35:10PM -0400, Sasha Levin wrote:
> >>>>>>>>> > >>>> > >> On 10/13/2014 01:35 PM, Dave Jones wrote:
> >>>>>>>>>>> > >>>>> > >>> oday in "rcu stall while fuzzing" news:
> >>>>>>>>>>> > >>>>> > >>>
> >>>>>>>>>>> > >>>>> > >>> INFO: rcu_preempt detected stalls on CPUs/tasks:
> >>>>>>>>>>> > >>>>> > >>> Tasks blocked on level-0 rcu_node (CPUs 0-3): P766 P646
> >>>>>>>>>>> > >>>>> > >>> Tasks blocked on level-0 rcu_node (CPUs 0-3): P766 P646
> >>>>>>>>>>> > >>>>> > >>> (detected by 0, t=6502 jiffies, g=75434, c=75433, q=0)
> >>>>>>>>> > >>>> > >>
> >>>>>>>>> > >>>> > >> I've complained about RCU stalls couple days ago (in a different context)
> >>>>>>>>> > >>>> > >> on -next. I guess whatever causing them made it into Linus's tree?
> >>>>>>>>> > >>>> > >>
> >>>>>>>>> > >>>> > >> https://lkml.org/lkml/2014/10/11/64
> >>>>>>> > >>> > >
> >>>>>>> > >>> > > And on that one, I must confess that I don't see where the RCU read-side
> >>>>>>> > >>> > > critical section might be.
> >>>>>>> > >>> > >
> >>>>>>> > >>> > > Hmmm... Maybe someone forgot to put an rcu_read_unlock() somewhere.
> >>>>>>> > >>> > > Can you reproduce this with CONFIG_PROVE_RCU=y?
> >>>>> > >> >
> >>>>> > >> > Paul, if that was directed to me - Yes, I see stalls with CONFIG_PROVE_RCU
> >>>>> > >> > set and nothing else is showing up before/after that.
> >>> > > Indeed it was directed to you. ;-)
> >>> > >
> >>> > > Does the following crude diagnostic patch turn up anything?
> >> >
> >> > Nope, seeing stalls but not seeing that pr_err() you added.
> > OK, color me confused. Could you please send me the full dmesg or a
> > pointer to it?
>
> Attached.
Thank you! I would complain about the FAULT_INJECTION messages, but
they don't appear to be happening all that frequently.
The stack dumps do look different here. I suspect that this is a real
issue in the VM code.
Thanx, Paul
On Mon, Oct 27, 2014 at 02:13:29PM -0700, Paul E. McKenney wrote:
> On Fri, Oct 24, 2014 at 12:39:15PM -0400, Sasha Levin wrote:
> > On 10/24/2014 12:13 PM, Paul E. McKenney wrote:
> > > On Fri, Oct 24, 2014 at 08:28:40AM -0400, Sasha Levin wrote:
> > >> > On 10/23/2014 03:58 PM, Paul E. McKenney wrote:
> > >>> > > On Thu, Oct 23, 2014 at 02:55:43PM -0400, Sasha Levin wrote:
> > >>>>> > >> > On 10/23/2014 02:39 PM, Paul E. McKenney wrote:
> > >>>>>>> > >>> > > On Tue, Oct 14, 2014 at 10:35:10PM -0400, Sasha Levin wrote:
> > >>>>>>>>> > >>>> > >> On 10/13/2014 01:35 PM, Dave Jones wrote:
> > >>>>>>>>>>> > >>>>> > >>> oday in "rcu stall while fuzzing" news:
> > >>>>>>>>>>> > >>>>> > >>>
> > >>>>>>>>>>> > >>>>> > >>> INFO: rcu_preempt detected stalls on CPUs/tasks:
> > >>>>>>>>>>> > >>>>> > >>> Tasks blocked on level-0 rcu_node (CPUs 0-3): P766 P646
> > >>>>>>>>>>> > >>>>> > >>> Tasks blocked on level-0 rcu_node (CPUs 0-3): P766 P646
> > >>>>>>>>>>> > >>>>> > >>> (detected by 0, t=6502 jiffies, g=75434, c=75433, q=0)
> > >>>>>>>>> > >>>> > >>
> > >>>>>>>>> > >>>> > >> I've complained about RCU stalls couple days ago (in a different context)
> > >>>>>>>>> > >>>> > >> on -next. I guess whatever causing them made it into Linus's tree?
> > >>>>>>>>> > >>>> > >>
> > >>>>>>>>> > >>>> > >> https://lkml.org/lkml/2014/10/11/64
> > >>>>>>> > >>> > >
> > >>>>>>> > >>> > > And on that one, I must confess that I don't see where the RCU read-side
> > >>>>>>> > >>> > > critical section might be.
> > >>>>>>> > >>> > >
> > >>>>>>> > >>> > > Hmmm... Maybe someone forgot to put an rcu_read_unlock() somewhere.
> > >>>>>>> > >>> > > Can you reproduce this with CONFIG_PROVE_RCU=y?
> > >>>>> > >> >
> > >>>>> > >> > Paul, if that was directed to me - Yes, I see stalls with CONFIG_PROVE_RCU
> > >>>>> > >> > set and nothing else is showing up before/after that.
> > >>> > > Indeed it was directed to you. ;-)
> > >>> > >
> > >>> > > Does the following crude diagnostic patch turn up anything?
> > >> >
> > >> > Nope, seeing stalls but not seeing that pr_err() you added.
> > > OK, color me confused. Could you please send me the full dmesg or a
> > > pointer to it?
> >
> > Attached.
>
> Thank you! I would complain about the FAULT_INJECTION messages, but
> they don't appear to be happening all that frequently.
>
> The stack dumps do look different here. I suspect that this is a real
> issue in the VM code.
And to that end... The filemap_map_pages() function does have loop over
a list of pages. I wonder if the rcu_read_lock() should be moved into
the radix_tree_for_each_slot() loop. CCing linux-mm for their thoughts,
though it looks to me like the current radix_tree_for_each_slot() wants
to be under RCU protection. But I am not seeing anything that requires
all iterations of the loop to be under the same RCU read-side critical
section. Maybe something like the following patch?
Thanx, Paul
------------------------------------------------------------------------
mm: Attempted fix for RCU CPU stall warning
It appears that filemap_map_pages() can stay in a single RCU read-side
critical section for a very long time if given a large area to map.
This could result in RCU CPU stall warnings. This commit therefore breaks
the read-side critical section into per-iteration critical sections, taking
care to make sure that the radix_tree_for_each_slot() call itself remains
in an RCU read-side critical section, as required.
Reported-by: Sasha Levin <[email protected]>
Signed-off-by: Paul E. McKenney <[email protected]>
diff --git a/mm/filemap.c b/mm/filemap.c
index 14b4642279f1..f78f144fb41f 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -2055,6 +2055,8 @@ skip:
next:
if (iter.index == vmf->max_pgoff)
break;
+ rcu_read_unlock();
+ rcu_read_lock();
}
rcu_read_unlock();
}
On 10/27/2014 05:13 PM, Paul E. McKenney wrote:
> Thank you! I would complain about the FAULT_INJECTION messages, but
> they don't appear to be happening all that frequently.
>
> The stack dumps do look different here. I suspect that this is a real
> issue in the VM code.
The stack dumps are pretty much different with every trace.
I just got a "double trace" for the first time, maybe it'll help:
[ 5887.980026] INFO: rcu_preempt detected stalls on CPUs/tasks:
[ 5887.980028] INFO: rcu_sched detected stalls on CPUs/tasks:
[ 5887.980029]
[ 5887.980030]
[ 5887.980046] 0: (210 ticks this GP) idle=f11/140000000000002/0 softirq=18456/18456 last_accelerate: 36b9/86cf, nonlazy_posted: 38271, ..
[ 5887.980054] 0: (210 ticks this GP) idle=f11/140000000000002/0 softirq=18456/18456 last_accelerate: 36b9/86cf, nonlazy_posted: 38271, ..
[ 5887.980060]
[ 5887.980073]
[ 5887.980074] (detected by 11, t=20502 jiffies, g=16314, c=16313, q=0)
[ 5887.980079] (detected by 28, t=20502 jiffies, g=-284, c=-285, q=0)
[ 5887.980081] Task dump for CPU 0:
[ 5887.980084] Task dump for CPU 0:
[ 5887.980090] trinity-c175 R
[ 5887.980092] trinity-c175 R
[ 5887.980094] running task
[ 5887.980116] running task 12864 17785 9032 0x0008000c
[ 5887.980120] 12864 17785 9032 0x0008000c
[ 5887.980125] ffffffff940ff458
[ 5887.980127] ffffffff940ff458
[ 5887.980130] 0000000000000286
[ 5887.980132] 0000000000000286
[ 5887.980134] ffffffff9d28b638
[ 5887.980137] ffffffff9d28b638
[ 5887.980139] ffff88014b88b000
[ 5887.980141] ffff88014b88b000
[ 5887.980141]
[ 5887.980142]
[ 5887.980146] ffff88014b88b000
[ 5887.980148] ffff88014b88b000
[ 5887.980151] ffffffff9d300b30
[ 5887.980153] ffffffff9d300b30
[ 5887.980155] 0000000000000000
[ 5887.980157] 0000000000000000
[ 5887.980160] ffff88013dc476a8
[ 5887.980161] ffff88013dc476a8
[ 5887.980161]
[ 5887.980163]
[ 5887.980167] ffffffff9423b078
[ 5887.980169] ffffffff9423b078
[ 5887.980171] 0000000000000002
[ 5887.980173] 0000000000000002
[ 5887.980175] 0000000000000000
[ 5887.980177] 0000000000000000
[ 5887.980180] ffff88013dc477c0
[ 5887.980181] ffff88013dc477c0
[ 5887.980181]
[ 5887.980183]
[ 5887.980185] Call Trace:
[ 5887.980187] Call Trace:
[ 5887.980208] ? __change_page_attr_set_clr (arch/x86/mm/pageattr.c:1109 arch/x86/mm/pageattr.c:1279)
[ 5887.980219] ? __change_page_attr_set_clr (arch/x86/mm/pageattr.c:1109 arch/x86/mm/pageattr.c:1279)
[ 5887.980232] ? check_flags (kernel/locking/lockdep.c:3532)
[ 5887.980241] ? check_flags (kernel/locking/lockdep.c:3532)
[ 5887.980253] ? preempt_count_sub (kernel/sched/core.c:2641)
[ 5887.980262] ? preempt_count_sub (kernel/sched/core.c:2641)
[ 5887.980269] ? debug_check_no_locks_freed (kernel/locking/lockdep.c:4076)
[ 5887.980278] ? debug_check_no_locks_freed (kernel/locking/lockdep.c:4076)
[ 5887.980294] ? _raw_spin_unlock_irqrestore (./arch/x86/include/asm/preempt.h:95 include/linux/spinlock_api_smp.h:161 kernel/locking/spinlock.c:191)
[ 5887.980303] ? _raw_spin_unlock_irqrestore (./arch/x86/include/asm/preempt.h:95 include/linux/spinlock_api_smp.h:161 kernel/locking/spinlock.c:191)
[ 5887.980312] ? kernel_map_pages (arch/x86/mm/pageattr.c:1801 arch/x86/mm/pageattr.c:1821)
[ 5887.980319] ? kernel_map_pages (arch/x86/mm/pageattr.c:1801 arch/x86/mm/pageattr.c:1821)
[ 5887.980334] ? free_pages_prepare (mm/page_alloc.c:779)
[ 5887.980344] ? free_pages_prepare (mm/page_alloc.c:779)
[ 5887.980351] ? __free_pages_ok (mm/page_alloc.c:787)
[ 5887.980358] ? __free_pages_ok (mm/page_alloc.c:787)
[ 5887.980365] ? free_compound_page (mm/page_alloc.c:364)
[ 5887.980373] ? free_compound_page (mm/page_alloc.c:364)
[ 5887.980380] ? put_compound_page (mm/swap.c:262)
[ 5887.980389] ? put_compound_page (mm/swap.c:262)
[ 5887.980395] ? preempt_count_sub (kernel/sched/core.c:2641)
[ 5887.980404] ? preempt_count_sub (kernel/sched/core.c:2641)
[ 5887.980411] ? release_pages (mm/swap.c:917)
[ 5887.980419] ? release_pages (mm/swap.c:917)
[ 5887.980430] ? free_pages_and_swap_cache (mm/swap_state.c:274)
[ 5887.980440] ? free_pages_and_swap_cache (mm/swap_state.c:274)
[ 5887.980449] ? tlb_flush_mmu_free (mm/memory.c:254 (discriminator 3))
[ 5887.980458] ? tlb_flush_mmu_free (mm/memory.c:254 (discriminator 3))
[ 5887.980465] ? unmap_page_range (mm/memory.c:1220 mm/memory.c:1263 mm/memory.c:1284 mm/memory.c:1308)
[ 5887.980473] ? unmap_page_range (mm/memory.c:1220 mm/memory.c:1263 mm/memory.c:1284 mm/memory.c:1308)
[ 5887.980481] ? unmap_single_vma (mm/memory.c:1355)
[ 5887.980489] ? unmap_single_vma (mm/memory.c:1355)
[ 5887.980496] ? unmap_vmas (mm/memory.c:1382 (discriminator 3))
[ 5887.980503] ? unmap_vmas (mm/memory.c:1382 (discriminator 3))
[ 5887.980512] ? exit_mmap (mm/mmap.c:2855)
[ 5887.980521] ? exit_mmap (mm/mmap.c:2855)
[ 5887.980530] ? kmem_cache_free (include/trace/events/kmem.h:143 mm/slub.c:2715)
[ 5887.980538] ? kmem_cache_free (include/trace/events/kmem.h:143 mm/slub.c:2715)
[ 5887.980546] ? mmput (kernel/fork.c:674)
[ 5887.980554] ? mmput (kernel/fork.c:674)
[ 5887.980561] ? do_exit (./arch/x86/include/asm/thread_info.h:168 kernel/exit.c:462 kernel/exit.c:747)
[ 5887.980569] ? do_exit (./arch/x86/include/asm/thread_info.h:168 kernel/exit.c:462 kernel/exit.c:747)
[ 5887.980578] ? rcu_start_future_gp (kernel/rcu/tree.c:1364)
[ 5887.980589] ? rcu_start_future_gp (kernel/rcu/tree.c:1364)
[ 5887.980596] ? recalc_sigpending_tsk (include/linux/thread_info.h:71 include/linux/sched.h:2733 kernel/signal.c:137)
[ 5887.980605] ? recalc_sigpending_tsk (include/linux/thread_info.h:71 include/linux/sched.h:2733 kernel/signal.c:137)
[ 5887.980613] ? debug_lockdep_rcu_enabled (kernel/rcu/update.c:136 (discriminator 1))
[ 5887.980621] ? debug_lockdep_rcu_enabled (kernel/rcu/update.c:136 (discriminator 1))
[ 5887.980630] ? do_group_exit (./arch/x86/include/asm/current.h:14 kernel/exit.c:875)
[ 5887.980637] ? do_group_exit (./arch/x86/include/asm/current.h:14 kernel/exit.c:875)
[ 5887.980645] ? preempt_count_sub (kernel/sched/core.c:2641)
[ 5887.980651] ? preempt_count_sub (kernel/sched/core.c:2641)
[ 5887.980660] ? get_signal (kernel/signal.c:2358)
[ 5887.980667] ? get_signal (kernel/signal.c:2358)
[ 5887.980676] ? trace_hardirqs_off_caller (kernel/locking/lockdep.c:2621)
[ 5887.980683] ? trace_hardirqs_off_caller (kernel/locking/lockdep.c:2621)
[ 5887.980692] ? do_signal (arch/x86/kernel/signal.c:703)
[ 5887.980701] ? do_signal (arch/x86/kernel/signal.c:703)
[ 5887.980707] ? get_parent_ip (kernel/sched/core.c:2585)
[ 5887.980714] ? get_parent_ip (kernel/sched/core.c:2585)
[ 5887.980723] ? _raw_spin_unlock (./arch/x86/include/asm/preempt.h:95 include/linux/spinlock_api_smp.h:152 kernel/locking/spinlock.c:183)
[ 5887.980731] ? _raw_spin_unlock (./arch/x86/include/asm/preempt.h:95 include/linux/spinlock_api_smp.h:152 kernel/locking/spinlock.c:183)
[ 5887.980740] ? context_tracking_user_exit (include/linux/vtime.h:89 include/linux/jump_label.h:114 include/trace/events/context_tracking.h:47 kernel/context_tracking.c:140)
[ 5887.980748] ? context_tracking_user_exit (include/linux/vtime.h:89 include/linux/jump_label.h:114 include/trace/events/context_tracking.h:47 kernel/context_tracking.c:140)
[ 5887.980757] ? trace_hardirqs_on_caller (kernel/locking/lockdep.c:2566)
[ 5887.980764] ? trace_hardirqs_on_caller (kernel/locking/lockdep.c:2566)
[ 5887.980772] ? trace_hardirqs_on (kernel/locking/lockdep.c:2609)
[ 5887.980778] ? trace_hardirqs_on (kernel/locking/lockdep.c:2609)
[ 5887.980787] ? do_notify_resume (arch/x86/kernel/signal.c:756)
[ 5887.980793] ? do_notify_resume (arch/x86/kernel/signal.c:756)
[ 5887.980802] ? int_signal (arch/x86/kernel/entry_64.S:587)
[ 5887.980809] ? int_signal (arch/x86/kernel/entry_64.S:587)
Thanks,
Sasha
On Mon, Oct 27, 2014 at 04:44:25PM -0700, Paul E. McKenney wrote:
> On Mon, Oct 27, 2014 at 02:13:29PM -0700, Paul E. McKenney wrote:
> > On Fri, Oct 24, 2014 at 12:39:15PM -0400, Sasha Levin wrote:
> > > On 10/24/2014 12:13 PM, Paul E. McKenney wrote:
> > > > On Fri, Oct 24, 2014 at 08:28:40AM -0400, Sasha Levin wrote:
> > > >> > On 10/23/2014 03:58 PM, Paul E. McKenney wrote:
> > > >>> > > On Thu, Oct 23, 2014 at 02:55:43PM -0400, Sasha Levin wrote:
> > > >>>>> > >> > On 10/23/2014 02:39 PM, Paul E. McKenney wrote:
> > > >>>>>>> > >>> > > On Tue, Oct 14, 2014 at 10:35:10PM -0400, Sasha Levin wrote:
> > > >>>>>>>>> > >>>> > >> On 10/13/2014 01:35 PM, Dave Jones wrote:
> > > >>>>>>>>>>> > >>>>> > >>> oday in "rcu stall while fuzzing" news:
> > > >>>>>>>>>>> > >>>>> > >>>
> > > >>>>>>>>>>> > >>>>> > >>> INFO: rcu_preempt detected stalls on CPUs/tasks:
> > > >>>>>>>>>>> > >>>>> > >>> Tasks blocked on level-0 rcu_node (CPUs 0-3): P766 P646
> > > >>>>>>>>>>> > >>>>> > >>> Tasks blocked on level-0 rcu_node (CPUs 0-3): P766 P646
> > > >>>>>>>>>>> > >>>>> > >>> (detected by 0, t=6502 jiffies, g=75434, c=75433, q=0)
> > > >>>>>>>>> > >>>> > >>
> > > >>>>>>>>> > >>>> > >> I've complained about RCU stalls couple days ago (in a different context)
> > > >>>>>>>>> > >>>> > >> on -next. I guess whatever causing them made it into Linus's tree?
> > > >>>>>>>>> > >>>> > >>
> > > >>>>>>>>> > >>>> > >> https://lkml.org/lkml/2014/10/11/64
> > > >>>>>>> > >>> > >
> > > >>>>>>> > >>> > > And on that one, I must confess that I don't see where the RCU read-side
> > > >>>>>>> > >>> > > critical section might be.
> > > >>>>>>> > >>> > >
> > > >>>>>>> > >>> > > Hmmm... Maybe someone forgot to put an rcu_read_unlock() somewhere.
> > > >>>>>>> > >>> > > Can you reproduce this with CONFIG_PROVE_RCU=y?
> > > >>>>> > >> >
> > > >>>>> > >> > Paul, if that was directed to me - Yes, I see stalls with CONFIG_PROVE_RCU
> > > >>>>> > >> > set and nothing else is showing up before/after that.
> > > >>> > > Indeed it was directed to you. ;-)
> > > >>> > >
> > > >>> > > Does the following crude diagnostic patch turn up anything?
> > > >> >
> > > >> > Nope, seeing stalls but not seeing that pr_err() you added.
> > > > OK, color me confused. Could you please send me the full dmesg or a
> > > > pointer to it?
> > >
> > > Attached.
> >
> > Thank you! I would complain about the FAULT_INJECTION messages, but
> > they don't appear to be happening all that frequently.
> >
> > The stack dumps do look different here. I suspect that this is a real
> > issue in the VM code.
>
> And to that end... The filemap_map_pages() function does have loop over
> a list of pages. I wonder if the rcu_read_lock() should be moved into
> the radix_tree_for_each_slot() loop. CCing linux-mm for their thoughts,
> though it looks to me like the current radix_tree_for_each_slot() wants
> to be under RCU protection. But I am not seeing anything that requires
> all iterations of the loop to be under the same RCU read-side critical
> section. Maybe something like the following patch?
Just following up, did the patch below help?
Thanx, Paul
> ------------------------------------------------------------------------
>
> mm: Attempted fix for RCU CPU stall warning
>
> It appears that filemap_map_pages() can stay in a single RCU read-side
> critical section for a very long time if given a large area to map.
> This could result in RCU CPU stall warnings. This commit therefore breaks
> the read-side critical section into per-iteration critical sections, taking
> care to make sure that the radix_tree_for_each_slot() call itself remains
> in an RCU read-side critical section, as required.
>
> Reported-by: Sasha Levin <[email protected]>
> Signed-off-by: Paul E. McKenney <[email protected]>
>
> diff --git a/mm/filemap.c b/mm/filemap.c
> index 14b4642279f1..f78f144fb41f 100644
> --- a/mm/filemap.c
> +++ b/mm/filemap.c
> @@ -2055,6 +2055,8 @@ skip:
> next:
> if (iter.index == vmf->max_pgoff)
> break;
> + rcu_read_unlock();
> + rcu_read_lock();
> }
> rcu_read_unlock();
> }
On 11/13/2014 06:07 PM, Paul E. McKenney wrote:
> On Mon, Oct 27, 2014 at 04:44:25PM -0700, Paul E. McKenney wrote:
>> > On Mon, Oct 27, 2014 at 02:13:29PM -0700, Paul E. McKenney wrote:
>>> > > On Fri, Oct 24, 2014 at 12:39:15PM -0400, Sasha Levin wrote:
>>>> > > > On 10/24/2014 12:13 PM, Paul E. McKenney wrote:
>>>>> > > > > On Fri, Oct 24, 2014 at 08:28:40AM -0400, Sasha Levin wrote:
>>>>>>> > > > >> > On 10/23/2014 03:58 PM, Paul E. McKenney wrote:
>>>>>>>>> > > > >>> > > On Thu, Oct 23, 2014 at 02:55:43PM -0400, Sasha Levin wrote:
>>>>>>>>>>>>> > > > >>>>> > >> > On 10/23/2014 02:39 PM, Paul E. McKenney wrote:
>>>>>>>>>>>>>>>>> > > > >>>>>>> > >>> > > On Tue, Oct 14, 2014 at 10:35:10PM -0400, Sasha Levin wrote:
>>>>>>>>>>>>>>>>>>>>> > > > >>>>>>>>> > >>>> > >> On 10/13/2014 01:35 PM, Dave Jones wrote:
>>>>>>>>>>>>>>>>>>>>>>>>> > > > >>>>>>>>>>> > >>>>> > >>> oday in "rcu stall while fuzzing" news:
>>>>>>>>>>>>>>>>>>>>>>>>> > > > >>>>>>>>>>> > >>>>> > >>>
>>>>>>>>>>>>>>>>>>>>>>>>> > > > >>>>>>>>>>> > >>>>> > >>> INFO: rcu_preempt detected stalls on CPUs/tasks:
>>>>>>>>>>>>>>>>>>>>>>>>> > > > >>>>>>>>>>> > >>>>> > >>> Tasks blocked on level-0 rcu_node (CPUs 0-3): P766 P646
>>>>>>>>>>>>>>>>>>>>>>>>> > > > >>>>>>>>>>> > >>>>> > >>> Tasks blocked on level-0 rcu_node (CPUs 0-3): P766 P646
>>>>>>>>>>>>>>>>>>>>>>>>> > > > >>>>>>>>>>> > >>>>> > >>> (detected by 0, t=6502 jiffies, g=75434, c=75433, q=0)
>>>>>>>>>>>>>>>>>>>>> > > > >>>>>>>>> > >>>> > >>
>>>>>>>>>>>>>>>>>>>>> > > > >>>>>>>>> > >>>> > >> I've complained about RCU stalls couple days ago (in a different context)
>>>>>>>>>>>>>>>>>>>>> > > > >>>>>>>>> > >>>> > >> on -next. I guess whatever causing them made it into Linus's tree?
>>>>>>>>>>>>>>>>>>>>> > > > >>>>>>>>> > >>>> > >>
>>>>>>>>>>>>>>>>>>>>> > > > >>>>>>>>> > >>>> > >> https://lkml.org/lkml/2014/10/11/64
>>>>>>>>>>>>>>>>> > > > >>>>>>> > >>> > >
>>>>>>>>>>>>>>>>> > > > >>>>>>> > >>> > > And on that one, I must confess that I don't see where the RCU read-side
>>>>>>>>>>>>>>>>> > > > >>>>>>> > >>> > > critical section might be.
>>>>>>>>>>>>>>>>> > > > >>>>>>> > >>> > >
>>>>>>>>>>>>>>>>> > > > >>>>>>> > >>> > > Hmmm... Maybe someone forgot to put an rcu_read_unlock() somewhere.
>>>>>>>>>>>>>>>>> > > > >>>>>>> > >>> > > Can you reproduce this with CONFIG_PROVE_RCU=y?
>>>>>>>>>>>>> > > > >>>>> > >> >
>>>>>>>>>>>>> > > > >>>>> > >> > Paul, if that was directed to me - Yes, I see stalls with CONFIG_PROVE_RCU
>>>>>>>>>>>>> > > > >>>>> > >> > set and nothing else is showing up before/after that.
>>>>>>>>> > > > >>> > > Indeed it was directed to you. ;-)
>>>>>>>>> > > > >>> > >
>>>>>>>>> > > > >>> > > Does the following crude diagnostic patch turn up anything?
>>>>>>> > > > >> >
>>>>>>> > > > >> > Nope, seeing stalls but not seeing that pr_err() you added.
>>>>> > > > > OK, color me confused. Could you please send me the full dmesg or a
>>>>> > > > > pointer to it?
>>>> > > >
>>>> > > > Attached.
>>> > >
>>> > > Thank you! I would complain about the FAULT_INJECTION messages, but
>>> > > they don't appear to be happening all that frequently.
>>> > >
>>> > > The stack dumps do look different here. I suspect that this is a real
>>> > > issue in the VM code.
>> >
>> > And to that end... The filemap_map_pages() function does have loop over
>> > a list of pages. I wonder if the rcu_read_lock() should be moved into
>> > the radix_tree_for_each_slot() loop. CCing linux-mm for their thoughts,
>> > though it looks to me like the current radix_tree_for_each_slot() wants
>> > to be under RCU protection. But I am not seeing anything that requires
>> > all iterations of the loop to be under the same RCU read-side critical
>> > section. Maybe something like the following patch?
> Just following up, did the patch below help?
I'm not seeing any more stalls with filemap in them, but I don see different
traces.
Thanks,
Sasha