2009-09-16 14:17:38

by Catalin Marinas

Subject: RCU callbacks and TREE_PREEMPT_RCU

Hi Paul,

Eric was reporting some issues with kmemleak on 2.6.31 accessing freed
memory under heavy stress (using the "stress" application). Basically,
the system gets into an oom state (because of "stress -m 1000") and
kmemleak fails to allocate its metadata (correct behaviour so far). At
that point, it disables itself and schedules the clean-up work, which
does this (among other locking; see the kmemleak_do_cleanup function in
the latest mainline):

	rcu_read_lock();
	list_for_each_entry_rcu(object, &object_list, object_list)
		delete_object_full(object->pointer);
	rcu_read_unlock();

The kmemleak objects are freed via put_object() with:

call_rcu(&object->rcu, free_object_rcu);

(free_object_rcu in turn calls kmem_cache_free).
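
[Editorial sketch, not kmemleak code.] The pattern above relies on call_rcu() deferring the free until every reader that might still hold a pointer has left its read-side critical section. The user-space toy below models that with a reader nesting counter and a callback queue. Every toy_ name is hypothetical, and the model is single-threaded, so it only illustrates the ordering guarantee that the cleanup loop depends on.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical user-space stand-ins; none of this is the kernel API. */
struct toy_rcu_head {
	struct toy_rcu_head *next;
	void (*func)(struct toy_rcu_head *head);
};

static int toy_readers;                   /* read-side nesting depth */
static struct toy_rcu_head *toy_pending;  /* callbacks awaiting a grace period */
static int toy_freed;                     /* number of objects actually freed */

static void toy_run_callbacks(void)
{
	/* A grace period can only end once no reader is active. */
	if (toy_readers > 0)
		return;
	while (toy_pending) {
		struct toy_rcu_head *head = toy_pending;

		toy_pending = head->next;
		head->func(head);
	}
}

static void toy_rcu_read_lock(void)
{
	toy_readers++;
}

static void toy_rcu_read_unlock(void)
{
	toy_readers--;
	toy_run_callbacks();
}

static void toy_call_rcu(struct toy_rcu_head *head,
			 void (*func)(struct toy_rcu_head *head))
{
	head->func = func;
	head->next = toy_pending;
	toy_pending = head;
	toy_run_callbacks();
}

struct toy_object {
	int pointer;
	struct toy_rcu_head rcu;
};

static void toy_free_object_rcu(struct toy_rcu_head *head)
{
	/* container_of() by hand: step back from the member to the object. */
	struct toy_object *obj = (struct toy_object *)
		((char *)head - offsetof(struct toy_object, rcu));

	free(obj);
	toy_freed++;
}

/* Returns 0 if the free was deferred past the read-side section. */
int toy_demo(void)
{
	struct toy_object *obj = malloc(sizeof(*obj));
	int before = toy_freed;
	int freed_inside;

	if (!obj)
		return -1;
	obj->pointer = 42;

	toy_rcu_read_lock();
	toy_call_rcu(&obj->rcu, toy_free_object_rcu);
	freed_inside = toy_freed;   /* unchanged: the reader blocks the free */
	assert(obj->pointer == 42); /* the object is still valid here */
	toy_rcu_read_unlock();      /* reader gone, so the callback runs now */

	return (freed_inside == before && toy_freed == before + 1) ? 0 : 1;
}
```

Calling toy_demo() from a main() should return 0. If the real RCU implementation honours the same ordering, the cleanup loop should never see a poisoned object while it is still inside rcu_read_lock(), which is why the oops below points at either a second free path or an RCU bug.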

When TREE_PREEMPT_RCU is enabled, the RCU list traversal above fails
with an access to 0x6b6b6b6b, but it is fine with TREE_PREEMPT_RCU=n and
TREE_RCU=y. During clean-up, kmemleak objects should no longer be freed
by other means since kmemleak was disabled and all callbacks are
ignored. The system is a 900MHz P3, 256MB RAM, CONFIG_SMP=n.

Is there something I'm doing wrong in kmemleak or a bug with RCU
preemption? The kernel oops looks like this:

[ 5346.582119] kmemleak: Cannot allocate a kmemleak_object structure
[ 5346.582208] Pid: 31302, comm: stress Not tainted 2.6.31-01335-g86d7101 #5
[ 5346.582313] Call Trace:
[ 5346.582414] [<c01c4125>] create_object+0x215/0x220
[ 5346.582529] [<c0d3e660>] ? alloc_arch_preferred_bootmem+0x30/0x50
[ 5346.582628] [<c0157532>] ? mark_held_locks+0x52/0x70
[ 5346.582734] [<c0d3e660>] ? alloc_arch_preferred_bootmem+0x30/0x50
[ 5346.582823] [<c0d3e6b8>] ? __free+0x38/0x90
[ 5346.582941] [<c08ea9cb>] kmemleak_alloc+0x2b/0x60
[ 5346.705312] [<c01c075c>] kmem_cache_alloc+0x11c/0x1a0
[ 5346.705453] [<c05b7313>] ? cfq_set_request+0xf3/0x310
[ 5346.705573] [<c0d3e660>] ? alloc_arch_preferred_bootmem+0x30/0x50
[ 5346.705660] [<c05aeed3>] ? get_io_context+0x13/0x40
[ 5346.705765] [<c05b7220>] ? cfq_set_request+0x0/0x310
[ 5346.705850] [<c05b7313>] cfq_set_request+0xf3/0x310
[ 5346.705968] [<c015767c>] ? trace_hardirqs_on_caller+0x12c/0x180
[ 5346.706133] [<c05b7220>] ? cfq_set_request+0x0/0x310
[ 5346.706230] [<c05a3fcf>] elv_set_request+0x1f/0x50
[ 5346.706342] [<c05a8bbc>] get_request+0x27c/0x2f0
[ 5346.706426] [<c05a91c2>] get_request_wait+0xe2/0x140
[ 5346.706545] [<c0146290>] ? autoremove_wake_function+0x0/0x40
[ 5346.706638] [<c05abd79>] __make_request+0x89/0x3e0
[ 5346.706744] [<c05a7fe2>] generic_make_request+0x192/0x400
[ 5346.706835] [<c05ad011>] submit_bio+0x71/0x110
[ 5346.706939] [<c015767c>] ? trace_hardirqs_on_caller+0x12c/0x180
[ 5346.797327] [<c01576db>] ? trace_hardirqs_on+0xb/0x10
[ 5346.797478] [<c08fa239>] ? _spin_unlock_irqrestore+0x39/0x70
[ 5346.797597] [<c019d55d>] ? test_set_page_writeback+0x6d/0x140
[ 5346.797699] [<c01b607a>] swap_writepage+0x9a/0xd0
[ 5346.797804] [<c01b60b0>] ? end_swap_bio_write+0x0/0x80
[ 5346.797895] [<c01a0706>] shrink_page_list+0x316/0x700
[ 5346.798003] [<c015aa9f>] ? __lock_acquire+0x40f/0xab0
[ 5346.798170] [<c0159749>] ? validate_chain+0xe9/0x1030
[ 5346.798260] [<c01a0cca>] shrink_list+0x1da/0x4e0
[ 5346.798370] [<c01a1267>] shrink_zone+0x297/0x310
[ 5346.798454] [<c01a1441>] ? shrink_slab+0x161/0x1a0
[ 5346.798563] [<c01a1661>] try_to_free_pages+0x1e1/0x2e0
[ 5346.798650] [<c019f5f0>] ? isolate_pages_global+0x0/0x1e0
[ 5346.798774] [<c019b76e>] __alloc_pages_nodemask+0x35e/0x5d0
[ 5346.798864] [<c01aa957>] do_wp_page+0xb7/0x690
[ 5346.798968] [<c01abf83>] ? handle_mm_fault+0x263/0x600
[ 5346.929240] [<c08fa4b5>] ? _spin_lock+0x65/0x70
[ 5346.929378] [<c01ac185>] handle_mm_fault+0x465/0x600
[ 5346.929496] [<c08fc7fb>] ? do_page_fault+0x14b/0x390
[ 5346.929589] [<c014a4fc>] ? down_read_trylock+0x5c/0x70
[ 5346.929696] [<c08fc860>] do_page_fault+0x1b0/0x390
[ 5346.929780] [<c08fc6b0>] ? do_page_fault+0x0/0x390
[ 5346.929884] [<c08fad18>] error_code+0x70/0x78
[ 5347.889442] BUG: unable to handle kernel paging request at 6b6b6b6b
[ 5347.889626] IP: [<c01c31e0>] kmemleak_do_cleanup+0x60/0xa0
[ 5347.889835] *pde = 00000000
[ 5347.889933] Oops: 0000 [#1] PREEMPT
[ 5347.890038] last sysfs file: /sys/class/vc/vcsa9/dev
[ 5347.890038] Modules linked in: [last unloaded: rcutorture]
[ 5347.890038]
[ 5347.890038] Pid: 5, comm: events/0 Not tainted (2.6.31-01335-g86d7101 #5) System Name
[ 5347.890038] EIP: 0060:[<c01c31e0>] EFLAGS: 00010286 CPU: 0
[ 5347.890038] EIP is at kmemleak_do_cleanup+0x60/0xa0
[ 5347.890038] EAX: 002ed661 EBX: 6b6b6b43 ECX: 00000007 EDX: 6b6b6b6b
[ 5347.890038] ESI: cf8b40b0 EDI: 00000002 EBP: cf8b8f3c ESP: cf8b8f28
[ 5347.890038] DS: 007b ES: 007b FS: 0000 GS: 00e0 SS: 0068
[ 5347.890038] Process events/0 (pid: 5, ti=cf8b8000 task=cf8c3500 task.ti=cf8b8000)
[ 5347.890038] Stack:
[ 5347.890038] 00000002 00000001 00000000 c01c3180 c0cd6640 cf8b8f98 c0142857 00000000
[ 5347.890038] <0> 00000002 00000000 c01427f6 cf8b40d4 cf8b40dc cf8c3500 c01c3180 c0cd6640
[ 5347.890038] <0> c0f938b0 c0a89514 00000000 00000000 00000000 cf8c3500 c0146290 cf8b8f84
[ 5347.890038] Call Trace:
[ 5347.890038] [<c01c3180>] ? kmemleak_do_cleanup+0x0/0xa0
[ 5347.890038] [<c0142857>] ? worker_thread+0x1d7/0x300
[ 5347.890038] [<c01427f6>] ? worker_thread+0x176/0x300
[ 5347.890038] [<c01c3180>] ? kmemleak_do_cleanup+0x0/0xa0
[ 5347.890038] [<c0146290>] ? autoremove_wake_function+0x0/0x40
[ 5347.890038] [<c0142680>] ? worker_thread+0x0/0x300
[ 5347.890038] [<c01461b7>] ? kthread+0x77/0x80
[ 5347.890038] [<c0146140>] ? kthread+0x0/0x80
[ 5347.890038] [<c010356b>] ? kernel_thread_helper+0x7/0x1c
[ 5347.890038] Code: 89 44 24 04 b8 e0 2c cd c0 c7 04 24 02 00 00 00 e8 76 7f f9 ff 8b 15 d0 66 cd c0 eb 0b 8b 43 58 e8 76 ff ff ff 8b 53 28 8d 5a d8 <8b> 43 28 0f 18 00 90 81 fa d0 66 cd c0 75 e3 b9 ef 31 1c c0 ba
[ 5347.890038] EIP: [<c01c31e0>] kmemleak_do_cleanup+0x60/0xa0 SS:ESP 0068:cf8b8f28
[ 5347.890038] CR2: 000000006b6b6b6b
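
[Editorial reading of the oops, not part of the original mail.] 0x6b is the byte the slab allocator writes over freed objects when poisoning is enabled (POISON_FREE in include/linux/poison.h), so a fault at 0x6b6b6b6b suggests that the list walk loaded its next pointer from an object that had already been freed. The Code: bytes appear to decode to a load from 0x28(%ebx) into %edx followed by lea -0x28(%edx),%ebx, which would put the object_list member 0x28 bytes into the structure; that offset is an inference from the disassembly, not something stated in the thread. The user-space sketch below, using hypothetical toy_ names, reproduces the arithmetic that links EDX, EBX and CR2.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define TOY_POISON_FREE 0x6b	/* same value as the kernel's POISON_FREE */
#define TOY_LIST_OFFSET 0x28u	/* assumed: inferred from the Code: bytes */

/* What a 32-bit load returns from memory filled with the poison byte. */
uint32_t toy_poisoned_word(void)
{
	unsigned char freed[sizeof(uint32_t)];
	uint32_t word;

	memset(freed, TOY_POISON_FREE, sizeof(freed));
	memcpy(&word, freed, sizeof(word));
	return word;
}

/* container_of() step: next list node -> enclosing object (32-bit). */
uint32_t toy_entry_from_node(uint32_t node)
{
	return node - TOY_LIST_OFFSET;
}

/* Address touched when the loop reads that entry's own list.next. */
uint32_t toy_next_load_address(uint32_t entry)
{
	return entry + TOY_LIST_OFFSET;
}
```

With these helpers, toy_poisoned_word() matches EDX (6b6b6b6b), toy_entry_from_node() applied to it matches EBX (6b6b6b43), and toy_next_load_address() applied to EBX matches CR2. That is consistent with the current object having been freed and poisoned before the loop read its next pointer.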


Thanks.

--
Catalin


2009-09-16 15:29:30

by Paul E. McKenney

Subject: Re: RCU callbacks and TREE_PREEMPT_RCU

On Wed, Sep 16, 2009 at 03:17:21PM +0100, Catalin Marinas wrote:
> Hi Paul,
>
> Eric was reporting some issues with kmemleak on 2.6.31 accessing freed
> memory under heavy stress (using the "stress" application). Basically,
> the system gets into an oom state (because of "stress -m 1000") and
> kmemleak fails to allocate its metadata (correct behaviour so far). At
> that point, it disables itself and schedules the clean-up work which
> does this (among other locking, the kmemleak_do_cleanup function the
> latest mainline):
>
> rcu_read_lock();
> list_for_each_entry_rcu(object, &object_list, object_list)
> delete_object_full(object->pointer);
> rcu_read_unlock();
>
> The kmemleak objects are freed via put_object() with:
>
> call_rcu(&object->rcu, free_object_rcu);
>
> (the free_object_rcu calls kmem_cache_free).
>
> When TREE_PREEMPT_RCU is enabled, the rcu list traversing above fails
> with access to 0x6b6b6b6b but it is fine with TREE_PREEMPT_RCU=n and
> TREE_RCU=y. During clean-up, kmemleak objects should no longer be freed
> by other means since kmemleak was disabled and all callbacks are
> ignored. The system is a 900Mhz P3, 256MB RAM, CONFIG_SMP=n.
>
> Is there something I'm doing wrong in kmemleak or a bug with RCU
> preemption? The kernel oops looks like this:

From your description and the code above, I must suspect a bug with
RCU preemption. A new one, as the only bugs I am currently chasing
involve NR_CPUS>32 (>64 on 64-bit systems).

CONFIG_SMP=n implies NR_CPUS==1 in your build, correct?

Thanx, Paul

2009-09-16 15:34:17

by Catalin Marinas

Subject: Re: RCU callbacks and TREE_PREEMPT_RCU

On Wed, 2009-09-16 at 08:29 -0700, Paul E. McKenney wrote:
> On Wed, Sep 16, 2009 at 03:17:21PM +0100, Catalin Marinas wrote:
> > When TREE_PREEMPT_RCU is enabled, the rcu list traversing above fails
> > with access to 0x6b6b6b6b but it is fine with TREE_PREEMPT_RCU=n and
> > TREE_RCU=y. During clean-up, kmemleak objects should no longer be freed
> > by other means since kmemleak was disabled and all callbacks are
> > ignored. The system is a 900Mhz P3, 256MB RAM, CONFIG_SMP=n.
> >
> > Is there something I'm doing wrong in kmemleak or a bug with RCU
> > preemption? The kernel oops looks like this:
>
> From your description and the code above, I must suspect a bug with
> RCU preemption. A new one, as the only bugs I am currently chasing
> involve NR_CPUS>32 (>64 on 64-bit systems).
>
> CONFIG_SMP=n implies NR_CPUS==1 in your build, correct?

CONFIG_NR_CPUS=1.

--
Catalin

2009-09-16 15:47:15

by Paul E. McKenney

Subject: Re: RCU callbacks and TREE_PREEMPT_RCU

On Wed, Sep 16, 2009 at 04:34:15PM +0100, Catalin Marinas wrote:
> On Wed, 2009-09-16 at 08:29 -0700, Paul E. McKenney wrote:
> > On Wed, Sep 16, 2009 at 03:17:21PM +0100, Catalin Marinas wrote:
> > > When TREE_PREEMPT_RCU is enabled, the rcu list traversing above fails
> > > with access to 0x6b6b6b6b but it is fine with TREE_PREEMPT_RCU=n and
> > > TREE_RCU=y. During clean-up, kmemleak objects should no longer be freed
> > > by other means since kmemleak was disabled and all callbacks are
> > > ignored. The system is a 900Mhz P3, 256MB RAM, CONFIG_SMP=n.
> > >
> > > Is there something I'm doing wrong in kmemleak or a bug with RCU
> > > preemption? The kernel oops looks like this:
> >
> > From your description and the code above, I must suspect a bug with
> > RCU preemption. A new one, as the only bugs I am currently chasing
> > involve NR_CPUS>32 (>64 on 64-bit systems).
> >
> > CONFIG_SMP=n implies NR_CPUS==1 in your build, correct?
>
> CONFIG_NR_CPUS=1.

I was afraid of that. ;-)

Thanx, Paul

2009-09-16 15:57:20

by Paul E. McKenney

Subject: Re: RCU callbacks and TREE_PREEMPT_RCU

On Wed, Sep 16, 2009 at 08:47:16AM -0700, Paul E. McKenney wrote:
> On Wed, Sep 16, 2009 at 04:34:15PM +0100, Catalin Marinas wrote:
> > On Wed, 2009-09-16 at 08:29 -0700, Paul E. McKenney wrote:
> > > On Wed, Sep 16, 2009 at 03:17:21PM +0100, Catalin Marinas wrote:
> > > > When TREE_PREEMPT_RCU is enabled, the rcu list traversing above fails
> > > > with access to 0x6b6b6b6b but it is fine with TREE_PREEMPT_RCU=n and
> > > > TREE_RCU=y. During clean-up, kmemleak objects should no longer be freed
> > > > by other means since kmemleak was disabled and all callbacks are
> > > > ignored. The system is a 900Mhz P3, 256MB RAM, CONFIG_SMP=n.
> > > >
> > > > Is there something I'm doing wrong in kmemleak or a bug with RCU
> > > > preemption? The kernel oops looks like this:
> > >
> > > From your description and the code above, I must suspect a bug with
> > > RCU preemption. A new one, as the only bugs I am currently chasing
> > > involve NR_CPUS>32 (>64 on 64-bit systems).
> > >
> > > CONFIG_SMP=n implies NR_CPUS==1 in your build, correct?
> >
> > CONFIG_NR_CPUS=1.
>
> I was afraid of that. ;-)

PS to previous -- there -is- a bug in mainline for TREE_PREEMPT_RCU for
single-CPU operation, but it is with synchronize_rcu() rather than
call_rcu(). The fix is in tip/core/urgent, commit #366b04ca. Or see
the following patch.

So, could you please give the following patch a try?

Thanx, Paul
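
[Editorial sketch, not kernel code.] The single-CPU case is special because a blocking grace-period wait can sometimes return immediately. With non-preemptible readers on one CPU, the fact that the updater is running at all means no reader can be inside a read-side critical section; as far as I understand it, that is the case rcu_blocking_is_gp() tests for. With preemptible readers, a reader may have been descheduled in the middle of its critical section, so the same shortcut would end the "grace period" too early. The toy model below uses hypothetical names to make that distinction concrete.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a single CPU; not the kernel's data structures. */
struct toy_cpu {
	bool preemptible_readers;  /* true models TREE_PREEMPT_RCU */
	int preempted_readers;     /* readers descheduled inside a critical section */
};

/*
 * Models the updater preempting a reader that is inside its read-side
 * critical section.  With non-preemptible readers this cannot happen,
 * so the request is refused and false is returned.
 */
bool toy_preempt_reader(struct toy_cpu *cpu)
{
	if (!cpu->preemptible_readers)
		return false;
	cpu->preempted_readers++;
	return true;
}

/*
 * May a blocking grace-period wait on this single CPU return at once?
 * Only if no reader is parked in the middle of a critical section.
 */
bool toy_shortcut_is_safe(const struct toy_cpu *cpu)
{
	return cpu->preempted_readers == 0;
}
```

For a toy_cpu with preemptible_readers set to false, toy_preempt_reader() is refused and the shortcut stays safe; with it set to true, one preempted reader is enough to make the shortcut unsafe. The model covers only the blocking wait, not the call_rcu() path that kmemleak uses.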

Commit-ID: 366b04ca60c70479e2959fe8485b87ff380fdbbf
Gitweb: http://git.kernel.org/tip/366b04ca60c70479e2959fe8485b87ff380fdbbf
Author: Paul E. McKenney <[email protected]>
AuthorDate: Sun, 13 Sep 2009 09:15:11 -0700
Committer: Ingo Molnar <[email protected]>
CommitDate: Tue, 15 Sep 2009 08:43:59 +0200

rcu: Fix synchronize_rcu() for TREE_PREEMPT_RCU

The redirection of synchronize_sched() to synchronize_rcu() was
appropriate for TREE_RCU, but not for TREE_PREEMPT_RCU.

Fix this by creating an underlying synchronize_sched(). TREE_RCU
then redirects synchronize_rcu() to synchronize_sched(), while
TREE_PREEMPT_RCU has its own version of synchronize_rcu().

Signed-off-by: Paul E. McKenney <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
LKML-Reference: <12528585111916-git-send-email->
Signed-off-by: Ingo Molnar <[email protected]>


---
 include/linux/rcupdate.h |   23 +++++------------------
 include/linux/rcutree.h  |    4 ++--
 kernel/rcupdate.c        |   44 +++++++++++++++++++++++++++++++++++++++++++-
 3 files changed, 50 insertions(+), 21 deletions(-)

diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
index 95e0615..39dce83 100644
--- a/include/linux/rcupdate.h
+++ b/include/linux/rcupdate.h
@@ -52,8 +52,13 @@ struct rcu_head {
 };

 /* Exported common interfaces */
+#ifdef CONFIG_TREE_PREEMPT_RCU
 extern void synchronize_rcu(void);
+#else /* #ifdef CONFIG_TREE_PREEMPT_RCU */
+#define synchronize_rcu synchronize_sched
+#endif /* #else #ifdef CONFIG_TREE_PREEMPT_RCU */
 extern void synchronize_rcu_bh(void);
+extern void synchronize_sched(void);
 extern void rcu_barrier(void);
 extern void rcu_barrier_bh(void);
 extern void rcu_barrier_sched(void);
@@ -262,24 +267,6 @@ struct rcu_synchronize {
 extern void wakeme_after_rcu(struct rcu_head *head);

 /**
- * synchronize_sched - block until all CPUs have exited any non-preemptive
- * kernel code sequences.
- *
- * This means that all preempt_disable code sequences, including NMI and
- * hardware-interrupt handlers, in progress on entry will have completed
- * before this primitive returns. However, this does not guarantee that
- * softirq handlers will have completed, since in some kernels, these
- * handlers can run in process context, and can block.
- *
- * This primitive provides the guarantees made by the (now removed)
- * synchronize_kernel() API. In contrast, synchronize_rcu() only
- * guarantees that rcu_read_lock() sections will have completed.
- * In "classic RCU", these two guarantees happen to be one and
- * the same, but can differ in realtime RCU implementations.
- */
-#define synchronize_sched() __synchronize_sched()
-
-/**
  * call_rcu - Queue an RCU callback for invocation after a grace period.
  * @head: structure to be used for queueing the RCU updates.
  * @func: actual update function to be invoked after the grace period
diff --git a/include/linux/rcutree.h b/include/linux/rcutree.h
index a893077..00d08c0 100644
--- a/include/linux/rcutree.h
+++ b/include/linux/rcutree.h
@@ -53,6 +53,8 @@ static inline void __rcu_read_unlock(void)
 	preempt_enable();
 }

+#define __synchronize_sched() synchronize_rcu()
+
 static inline void exit_rcu(void)
 {
 }
@@ -68,8 +70,6 @@ static inline void __rcu_read_unlock_bh(void)
 	local_bh_enable();
 }

-#define __synchronize_sched() synchronize_rcu()
-
 extern void call_rcu_sched(struct rcu_head *head,
 			   void (*func)(struct rcu_head *rcu));

diff --git a/kernel/rcupdate.c b/kernel/rcupdate.c
index bd5d5c8..28d2f24 100644
--- a/kernel/rcupdate.c
+++ b/kernel/rcupdate.c
@@ -74,6 +74,8 @@ void wakeme_after_rcu(struct rcu_head *head)
 	complete(&rcu->completion);
 }

+#ifdef CONFIG_TREE_PREEMPT_RCU
+
 /**
  * synchronize_rcu - wait until a grace period has elapsed.
  *
@@ -87,7 +89,7 @@ void synchronize_rcu(void)
 {
 	struct rcu_synchronize rcu;

-	if (rcu_blocking_is_gp())
+	if (!rcu_scheduler_active)
 		return;

 	init_completion(&rcu.completion);
@@ -98,6 +100,46 @@ void synchronize_rcu(void)
 }
 EXPORT_SYMBOL_GPL(synchronize_rcu);

+#endif /* #ifdef CONFIG_TREE_PREEMPT_RCU */
+
+/**
+ * synchronize_sched - wait until an rcu-sched grace period has elapsed.
+ *
+ * Control will return to the caller some time after a full rcu-sched
+ * grace period has elapsed, in other words after all currently executing
+ * rcu-sched read-side critical sections have completed. These read-side
+ * critical sections are delimited by rcu_read_lock_sched() and
+ * rcu_read_unlock_sched(), and may be nested. Note that preempt_disable(),
+ * local_irq_disable(), and so on may be used in place of
+ * rcu_read_lock_sched().
+ *
+ * This means that all preempt_disable code sequences, including NMI and
+ * hardware-interrupt handlers, in progress on entry will have completed
+ * before this primitive returns. However, this does not guarantee that
+ * softirq handlers will have completed, since in some kernels, these
+ * handlers can run in process context, and can block.
+ *
+ * This primitive provides the guarantees made by the (now removed)
+ * synchronize_kernel() API. In contrast, synchronize_rcu() only
+ * guarantees that rcu_read_lock() sections will have completed.
+ * In "classic RCU", these two guarantees happen to be one and
+ * the same, but can differ in realtime RCU implementations.
+ */
+void synchronize_sched(void)
+{
+	struct rcu_synchronize rcu;
+
+	if (rcu_blocking_is_gp())
+		return;
+
+	init_completion(&rcu.completion);
+	/* Will wake me after RCU finished. */
+	call_rcu_sched(&rcu.head, wakeme_after_rcu);
+	/* Wait for it. */
+	wait_for_completion(&rcu.completion);
+}
+EXPORT_SYMBOL_GPL(synchronize_sched);
+
 /**
  * synchronize_rcu_bh - wait until an rcu_bh grace period has elapsed.
  *

2009-09-16 16:25:51

by Eric Sesterhenn

Subject: Re: RCU callbacks and TREE_PREEMPT_RCU

On Wed, 2009-09-16 at 08:57 -0700, Paul E. McKenney wrote:
> On Wed, Sep 16, 2009 at 08:47:16AM -0700, Paul E. McKenney wrote:
> > On Wed, Sep 16, 2009 at 04:34:15PM +0100, Catalin Marinas wrote:
> > > On Wed, 2009-09-16 at 08:29 -0700, Paul E. McKenney wrote:
> > > > On Wed, Sep 16, 2009 at 03:17:21PM +0100, Catalin Marinas wrote:
> > > > > When TREE_PREEMPT_RCU is enabled, the rcu list traversing above fails
> > > > > with access to 0x6b6b6b6b but it is fine with TREE_PREEMPT_RCU=n and
> > > > > TREE_RCU=y. During clean-up, kmemleak objects should no longer be freed
> > > > > by other means since kmemleak was disabled and all callbacks are
> > > > > ignored. The system is a 900Mhz P3, 256MB RAM, CONFIG_SMP=n.
> > > > >
> > > > > Is there something I'm doing wrong in kmemleak or a bug with RCU
> > > > > preemption? The kernel oops looks like this:
> > > >
> > > > From your description and the code above, I must suspect a bug with
> > > > RCU preemption. A new one, as the only bugs I am currently chasing
> > > > involve NR_CPUS>32 (>64 on 64-bit systems).
> > > >
> > > > CONFIG_SMP=n implies NR_CPUS==1 in your build, correct?
> > >
> > > CONFIG_NR_CPUS=1.
> >
> > I was afraid of that. ;-)
>
> PS to previous -- there -is- a bug in mainline for TREE_PREEMPT_RCU for
> single-CPU operation, but it is with synchronize_rcu() rather than
> call_rcu(). The fix is in tip/core/urgent, commit #366b04ca. Or see
> the following patch.
>
> So, could you please give the following patch a try?

I'll give it a try.

Thanks, Eric


2009-09-16 23:19:48

by Eric Sesterhenn

Subject: Re: RCU callbacks and TREE_PREEMPT_RCU

On Wed, 2009-09-16 at 08:57 -0700, Paul E. McKenney wrote:
> On Wed, Sep 16, 2009 at 08:47:16AM -0700, Paul E. McKenney wrote:
> > On Wed, Sep 16, 2009 at 04:34:15PM +0100, Catalin Marinas wrote:
> > > On Wed, 2009-09-16 at 08:29 -0700, Paul E. McKenney wrote:
> > > > On Wed, Sep 16, 2009 at 03:17:21PM +0100, Catalin Marinas wrote:
> > > > > When TREE_PREEMPT_RCU is enabled, the rcu list traversing above fails
> > > > > with access to 0x6b6b6b6b but it is fine with TREE_PREEMPT_RCU=n and
> > > > > TREE_RCU=y. During clean-up, kmemleak objects should no longer be freed
> > > > > by other means since kmemleak was disabled and all callbacks are
> > > > > ignored. The system is a 900Mhz P3, 256MB RAM, CONFIG_SMP=n.
> > > > >
> > > > > Is there something I'm doing wrong in kmemleak or a bug with RCU
> > > > > preemption? The kernel oops looks like this:
> > > >
> > > > From your description and the code above, I must suspect a bug with
> > > > RCU preemption. A new one, as the only bugs I am currently chasing
> > > > involve NR_CPUS>32 (>64 on 64-bit systems).
> > > >
> > > > CONFIG_SMP=n implies NR_CPUS==1 in your build, correct?
> > >
> > > CONFIG_NR_CPUS=1.
> >
> > I was afraid of that. ;-)
>
> PS to previous -- there -is- a bug in mainline for TREE_PREEMPT_RCU for
> single-CPU operation, but it is with synchronize_rcu() rather than
> call_rcu(). The fix is in tip/core/urgent, commit #366b04ca. Or see
> the following patch.
>
> So, could you please give the following patch a try?

Sadly, this does not fix the issue. Is there any further information I
can provide?

Regards, Eric
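
The distinction drawn in the quoted text between synchronize_rcu() and
call_rcu() can be sketched in user space. This is a toy model only, not
kernel code: every toy_* name is hypothetical, and the model treats "no
reader is active" as the end of a grace period, which is far cruder than
what the kernel does. It shows call_rcu()-style deferred freeing as used
by kmemleak: the callback is queued at once but runs only after readers
have finished, whereas synchronize_rcu() would instead block the caller
until that point.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy stand-in for struct rcu_head: one node in a list of pending callbacks. */
struct toy_rcu_head {
	struct toy_rcu_head *next;
	void (*func)(struct toy_rcu_head *head);
};

/* Recover the enclosing object from a pointer to its embedded member. */
#define toy_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

static struct toy_rcu_head *toy_pending; /* callbacks awaiting a grace period */
static int toy_readers;                  /* readers inside a critical section */
static int toy_freed;                    /* objects released by the callback */

static void toy_read_lock(void)
{
	toy_readers++;
}

static void toy_read_unlock(void)
{
	assert(toy_readers > 0);
	toy_readers--;
}

/* Queue func to run after a later grace period; returns immediately. */
static void toy_call_rcu(struct toy_rcu_head *head,
			 void (*func)(struct toy_rcu_head *head))
{
	head->func = func;
	head->next = toy_pending;
	toy_pending = head;
}

/*
 * Assumption of this model: a grace period may end only when no reader
 * is active. Returns the number of callbacks that were invoked.
 */
static int toy_try_end_grace_period(void)
{
	int invoked = 0;

	if (toy_readers > 0)
		return 0;
	while (toy_pending) {
		struct toy_rcu_head *head = toy_pending;

		toy_pending = head->next;
		head->func(head);
		invoked++;
	}
	return invoked;
}

/* Hypothetical object with an embedded callback head, like object->rcu. */
struct toy_object {
	unsigned long pointer;
	struct toy_rcu_head rcu;
};

/* Callback in the style of free_object_rcu(): free the enclosing object. */
static void toy_free_object_rcu(struct toy_rcu_head *head)
{
	struct toy_object *obj = toy_container_of(head, struct toy_object, rcu);

	free(obj);
	toy_freed++;
}
```

In this model, an object handed to toy_call_rcu() while a reader is
active stays valid until toy_read_unlock() has run and a later
toy_try_end_grace_period() call invokes the callback. A read of freed
memory inside the critical section, as in the reported oops, would mean
that this ordering was violated.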

2009-09-16 23:26:12

by Paul E. McKenney

Subject: Re: RCU callbacks and TREE_PREEMPT_RCU

On Thu, Sep 17, 2009 at 01:19:46AM +0200, Eric Sesterhenn wrote:
> On Wed, 2009-09-16 at 08:57 -0700, Paul E. McKenney wrote:
> > On Wed, Sep 16, 2009 at 08:47:16AM -0700, Paul E. McKenney wrote:
> > > On Wed, Sep 16, 2009 at 04:34:15PM +0100, Catalin Marinas wrote:
> > > > On Wed, 2009-09-16 at 08:29 -0700, Paul E. McKenney wrote:
> > > > > On Wed, Sep 16, 2009 at 03:17:21PM +0100, Catalin Marinas wrote:
> > > > > > When TREE_PREEMPT_RCU is enabled, the rcu list traversing above fails
> > > > > > with access to 0x6b6b6b6b but it is fine with TREE_PREEMPT_RCU=n and
> > > > > > TREE_RCU=y. During clean-up, kmemleak objects should no longer be freed
> > > > > > by other means since kmemleak was disabled and all callbacks are
> > > > > > ignored. The system is a 900Mhz P3, 256MB RAM, CONFIG_SMP=n.
> > > > > >
> > > > > > Is there something I'm doing wrong in kmemleak or a bug with RCU
> > > > > > preemption? The kernel oops looks like this:
> > > > >
> > > > > From your description and the code above, I must suspect a bug with
> > > > > RCU preemption. A new one, as the only bugs I am currently chasing
> > > > > involve NR_CPUS>32 (>64 on 64-bit systems).
> > > > >
> > > > > CONFIG_SMP=n implies NR_CPUS==1 in your build, correct?
> > > >
> > > > CONFIG_NR_CPUS=1.
> > >
> > > I was afraid of that. ;-)
> >
> > PS to previous -- there -is- a bug in mainline for TREE_PREEMPT_RCU for
> > single-CPU operation, but it is with synchronize_rcu() rather than
> > call_rcu(). The fix is in tip/core/urgent, commit #366b04ca. Or see
> > the following patch.
> >
> > So, could you please give the following patch a try?
>
> Sadly this does not fix the issue, is there any further information I
> can provide to you?

:-(

Would you be willing to give the attached diagnostic patch a go?

Thanx, Paul

------------------------------------------------------------------------

diff --git a/kernel/rcutree.c b/kernel/rcutree.c
index 2454999..211442c 100644
--- a/kernel/rcutree.c
+++ b/kernel/rcutree.c
@@ -623,8 +623,8 @@ rcu_start_gp(struct rcu_state *rsp, unsigned long flags)

/* Special-case the common single-level case. */
if (NUM_RCU_NODES == 1) {
- rnp->qsmask = rnp->qsmaskinit;
rcu_preempt_check_blocked_tasks(rnp);
+ rnp->qsmask = rnp->qsmaskinit;
rnp->gpnum = rsp->gpnum;
rsp->signaled = RCU_SIGNAL_INIT; /* force_quiescent_state OK. */
spin_unlock_irqrestore(&rnp->lock, flags);
@@ -657,8 +657,8 @@ rcu_start_gp(struct rcu_state *rsp, unsigned long flags)
rnp_end = &rsp->node[NUM_RCU_NODES];
for (rnp_cur = &rsp->node[0]; rnp_cur < rnp_end; rnp_cur++) {
spin_lock(&rnp_cur->lock); /* irqs already disabled. */
- rnp_cur->qsmask = rnp_cur->qsmaskinit;
rcu_preempt_check_blocked_tasks(rnp);
+ rnp_cur->qsmask = rnp_cur->qsmaskinit;
rnp->gpnum = rsp->gpnum;
spin_unlock(&rnp_cur->lock); /* irqs already disabled. */
}
@@ -703,6 +703,7 @@ rcu_process_gp_end(struct rcu_state *rsp, struct rcu_data *rdp)
static void cpu_quiet_msk_finish(struct rcu_state *rsp, unsigned long flags)
__releases(rnp->lock)
{
+ WARN_ON_ONCE(rsp->completed == rsp->gpnum);
rsp->completed = rsp->gpnum;
rcu_process_gp_end(rsp, rsp->rda[smp_processor_id()]);
rcu_start_gp(rsp, flags); /* releases root node's rnp->lock. */
@@ -720,6 +721,8 @@ cpu_quiet_msk(unsigned long mask, struct rcu_state *rsp, struct rcu_node *rnp,
unsigned long flags)
__releases(rnp->lock)
{
+ struct rcu_node *rnp_c;
+
/* Walk up the rcu_node hierarchy. */
for (;;) {
if (!(rnp->qsmask & mask)) {
@@ -743,8 +746,10 @@ cpu_quiet_msk(unsigned long mask, struct rcu_state *rsp, struct rcu_node *rnp,
break;
}
spin_unlock_irqrestore(&rnp->lock, flags);
+ rnp_c = rnp;
rnp = rnp->parent;
spin_lock_irqsave(&rnp->lock, flags);
+ WARN_ON_ONCE(rnp_c->qsmask);
}

/*
@@ -853,7 +858,7 @@ static void __rcu_offline_cpu(int cpu, struct rcu_state *rsp)
spin_lock_irqsave(&rsp->onofflock, flags);

/* Remove the outgoing CPU from the masks in the rcu_node hierarchy. */
- rnp = rdp->mynode;
+ rnp = rdp->mynode; /* this is the outgoing CPU's rnp. */
mask = rdp->grpmask; /* rnp->grplo is constant. */
do {
spin_lock(&rnp->lock); /* irqs already disabled. */
@@ -862,7 +867,7 @@ static void __rcu_offline_cpu(int cpu, struct rcu_state *rsp)
spin_unlock(&rnp->lock); /* irqs remain disabled. */
break;
}
- rcu_preempt_offline_tasks(rsp, rnp);
+ rcu_preempt_offline_tasks(rsp, rnp, rdp);
mask = rnp->grpmask;
spin_unlock(&rnp->lock); /* irqs remain disabled. */
rnp = rnp->parent;
diff --git a/kernel/rcutree_plugin.h b/kernel/rcutree_plugin.h
index eb4bae3..2b996c3 100644
--- a/kernel/rcutree_plugin.h
+++ b/kernel/rcutree_plugin.h
@@ -206,7 +206,8 @@ static void rcu_read_unlock_special(struct task_struct *t)
*/
if (!empty && rnp->qsmask == 0 &&
list_empty(&rnp->blocked_tasks[rnp->gpnum & 0x1])) {
- t->rcu_read_unlock_special &= ~RCU_READ_UNLOCK_NEED_QS;
+ struct rcu_node *rnp_p;
+
if (rnp->parent == NULL) {
/* Only one rcu_node in the tree. */
cpu_quiet_msk_finish(&rcu_preempt_state, flags);
@@ -215,9 +216,10 @@ static void rcu_read_unlock_special(struct task_struct *t)
/* Report up the rest of the hierarchy. */
mask = rnp->grpmask;
spin_unlock_irqrestore(&rnp->lock, flags);
- rnp = rnp->parent;
- spin_lock_irqsave(&rnp->lock, flags);
- cpu_quiet_msk(mask, &rcu_preempt_state, rnp, flags);
+ rnp_p = rnp->parent;
+ spin_lock_irqsave(&rnp_p->lock, flags);
+ WARN_ON_ONCE(rnp->qsmask);
+ cpu_quiet_msk(mask, &rcu_preempt_state, rnp_p, flags);
return;
}
spin_unlock(&rnp->lock);
@@ -278,6 +280,7 @@ static void rcu_print_task_stall(struct rcu_node *rnp)
static void rcu_preempt_check_blocked_tasks(struct rcu_node *rnp)
{
WARN_ON_ONCE(!list_empty(&rnp->blocked_tasks[rnp->gpnum & 0x1]));
+ WARN_ON_ONCE(rnp->qsmask);
}

/*
@@ -302,7 +305,8 @@ static int rcu_preempted_readers(struct rcu_node *rnp)
* The caller must hold rnp->lock with irqs disabled.
*/
static void rcu_preempt_offline_tasks(struct rcu_state *rsp,
- struct rcu_node *rnp)
+ struct rcu_node *rnp,
+ struct rcu_data *rdp)
{
int i;
struct list_head *lp;
@@ -314,6 +318,9 @@ static void rcu_preempt_offline_tasks(struct rcu_state *rsp,
WARN_ONCE(1, "Last CPU thought to be offlined?");
return; /* Shouldn't happen: at least one CPU online. */
}
+ WARN_ON_ONCE(rnp != rdp->mynode &&
+ (!list_empty(&rnp->blocked_tasks[0]) ||
+ !list_empty(&rnp->blocked_tasks[1])));

/*
* Move tasks up to root rcu_node. Rely on the fact that the
@@ -489,7 +496,8 @@ static int rcu_preempted_readers(struct rcu_node *rnp)
* tasks that were blocked within RCU read-side critical sections.
*/
static void rcu_preempt_offline_tasks(struct rcu_state *rsp,
- struct rcu_node *rnp)
+ struct rcu_node *rnp,
+ struct rcu_data *rdp)
{
}
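
A user-space sketch of the kind of check this diagnostic patch adds may
help when reading the hunks above. It is not kernel code:
TOY_WARN_ON_ONCE, struct toy_node and the toy_* functions are
hypothetical stand-ins, and the macro only mimics the
warn-once-per-call-site behaviour of WARN_ON_ONCE(). It illustrates what
is presumably the reason for moving rcu_preempt_check_blocked_tasks()
ahead of the qsmask assignment: once the check also looks at qsmask, it
has to run before qsmask is reset for the new grace period.

```c
#include <assert.h>
#include <stdio.h>

static int toy_warnings; /* number of warnings actually printed */

/* Print a warning the first time cond is true at this call site only. */
#define TOY_WARN_ON_ONCE(cond)                                        \
	do {                                                          \
		static int warned; /* one flag per call site */       \
		if ((cond) && !warned) {                              \
			warned = 1;                                   \
			toy_warnings++;                               \
			fprintf(stderr, "WARNING: %s at %s:%d\n",     \
				#cond, __FILE__, __LINE__);           \
		}                                                     \
	} while (0)

/* Toy stand-in for the rcu_node fields touched by the patch. */
struct toy_node {
	unsigned long qsmask;     /* CPUs still owing a quiescent state */
	unsigned long qsmaskinit; /* value qsmask is reset to per grace period */
	unsigned long gpnum;      /* grace-period number seen by this node */
	int nblocked[2];          /* reader counts stand in for blocked_tasks[] */
};

/* Both conditions should be false when a new grace period starts. */
static void toy_check_blocked_tasks(const struct toy_node *n)
{
	TOY_WARN_ON_ONCE(n->nblocked[n->gpnum & 0x1] != 0);
	TOY_WARN_ON_ONCE(n->qsmask != 0);
}

static void toy_start_gp(struct toy_node *n, unsigned long new_gpnum)
{
	assert(n != NULL);
	toy_check_blocked_tasks(n); /* must precede the qsmask reset below */
	n->qsmask = n->qsmaskinit;
	n->gpnum = new_gpnum;
}
```

If toy_start_gp() is called again on a node whose qsmask has not been
cleared, the qsmask check prints one warning and later calls stay
silent. Had the check been left after the qsmask assignment, it would
fire on every node with a nonzero qsmaskinit.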

2009-09-17 08:29:07

by Eric Sesterhenn

Subject: Re: RCU callbacks and TREE_PREEMPT_RCU

On Wed, 2009-09-16 at 16:26 -0700, Paul E. McKenney wrote:
> On Thu, Sep 17, 2009 at 01:19:46AM +0200, Eric Sesterhenn wrote:
> > On Wed, 2009-09-16 at 08:57 -0700, Paul E. McKenney wrote:
> > > On Wed, Sep 16, 2009 at 08:47:16AM -0700, Paul E. McKenney wrote:
> > > > On Wed, Sep 16, 2009 at 04:34:15PM +0100, Catalin Marinas wrote:
> > > > > On Wed, 2009-09-16 at 08:29 -0700, Paul E. McKenney wrote:
> > > > > > On Wed, Sep 16, 2009 at 03:17:21PM +0100, Catalin Marinas wrote:
> > > > > > > When TREE_PREEMPT_RCU is enabled, the rcu list traversing above fails
> > > > > > > with access to 0x6b6b6b6b but it is fine with TREE_PREEMPT_RCU=n and
> > > > > > > TREE_RCU=y. During clean-up, kmemleak objects should no longer be freed
> > > > > > > by other means since kmemleak was disabled and all callbacks are
> > > > > > > ignored. The system is a 900Mhz P3, 256MB RAM, CONFIG_SMP=n.
> > > > > > >
> > > > > > > Is there something I'm doing wrong in kmemleak or a bug with RCU
> > > > > > > preemption? The kernel oops looks like this:
> > > > > >
> > > > > > From your description and the code above, I must suspect a bug with
> > > > > > RCU preemption. A new one, as the only bugs I am currently chasing
> > > > > > involve NR_CPUS>32 (>64 on 64-bit systems).
> > > > > >
> > > > > > CONFIG_SMP=n implies NR_CPUS==1 in your build, correct?
> > > > >
> > > > > CONFIG_NR_CPUS=1.
> > > >
> > > > I was afraid of that. ;-)
> > >
> > > PS to previous -- there -is- a bug in mainline for TREE_PREEMPT_RCU for
> > > single-CPU operation, but it is with synchronize_rcu() rather than
> > > call_rcu(). The fix is in tip/core/urgent, commit #366b04ca. Or see
> > > the following patch.
> > >
> > > So, could you please give the following patch a try?
> >
> > Sadly this does not fix the issue, is there any further information I
> > can provide to you?
>
> :-(
>
> Would you be willing to give the attached diagnostic patch a go?
>
> Thanx, Paul

It does not apply cleanly against current -git
(rcu_preempt_check_blocked_tasks is missing in my rcutree_plugin.h, for
example). I tried to apply it by hand as well as possible, and will test
it today.

root@whiterabbit:/usr/src/linux# patch -p1 < ~/RCU_callbacks_and_TREE_PREEMPT_RCU-debug
patching file kernel/rcutree.c
Hunk #1 FAILED at 623.
Hunk #2 FAILED at 657.
Hunk #3 succeeded at 722 (offset 19 lines).
Hunk #4 succeeded at 740 (offset 19 lines).
Hunk #5 succeeded at 765 (offset 19 lines).
Hunk #6 succeeded at 877 (offset 19 lines).
Hunk #7 succeeded at 886 (offset 19 lines).
2 out of 7 hunks FAILED -- saving rejects to file kernel/rcutree.c.rej
patching file kernel/rcutree_plugin.h
Hunk #1 FAILED at 206.
Hunk #2 succeeded at 206 (offset -10 lines).
Hunk #3 FAILED at 270.
Hunk #4 succeeded at 283 (offset -22 lines).
Hunk #5 succeeded at 296 (offset -22 lines).
Hunk #6 succeeded at 473 (offset -23 lines).
2 out of 6 hunks FAILED -- saving rejects to file kernel/rcutree_plugin.h.rej

Regards, Eric


2009-09-17 22:21:21

by Paul E. McKenney

Subject: Re: RCU callbacks and TREE_PREEMPT_RCU

On Thu, Sep 17, 2009 at 10:29:02AM +0200, Eric Sesterhenn wrote:
> On Wed, 2009-09-16 at 16:26 -0700, Paul E. McKenney wrote:
> > On Thu, Sep 17, 2009 at 01:19:46AM +0200, Eric Sesterhenn wrote:
> > > On Wed, 2009-09-16 at 08:57 -0700, Paul E. McKenney wrote:
> > > > On Wed, Sep 16, 2009 at 08:47:16AM -0700, Paul E. McKenney wrote:
> > > > > On Wed, Sep 16, 2009 at 04:34:15PM +0100, Catalin Marinas wrote:
> > > > > > On Wed, 2009-09-16 at 08:29 -0700, Paul E. McKenney wrote:
> > > > > > > On Wed, Sep 16, 2009 at 03:17:21PM +0100, Catalin Marinas wrote:
> > > > > > > > When TREE_PREEMPT_RCU is enabled, the rcu list traversing above fails
> > > > > > > > with access to 0x6b6b6b6b but it is fine with TREE_PREEMPT_RCU=n and
> > > > > > > > TREE_RCU=y. During clean-up, kmemleak objects should no longer be freed
> > > > > > > > by other means since kmemleak was disabled and all callbacks are
> > > > > > > > ignored. The system is a 900Mhz P3, 256MB RAM, CONFIG_SMP=n.
> > > > > > > >
> > > > > > > > Is there something I'm doing wrong in kmemleak or a bug with RCU
> > > > > > > > preemption? The kernel oops looks like this:
> > > > > > >
> > > > > > > From your description and the code above, I must suspect a bug with
> > > > > > > RCU preemption. A new one, as the only bugs I am currently chasing
> > > > > > > involve NR_CPUS>32 (>64 on 64-bit systems).
> > > > > > >
> > > > > > > CONFIG_SMP=n implies NR_CPUS==1 in your build, correct?
> > > > > >
> > > > > > CONFIG_NR_CPUS=1.
> > > > >
> > > > > I was afraid of that. ;-)
> > > >
> > > > PS to previous -- there -is- a bug in mainline for TREE_PREEMPT_RCU for
> > > > single-CPU operation, but it is with synchronize_rcu() rather than
> > > > call_rcu(). The fix is in tip/core/urgent, commit #366b04ca. Or see
> > > > the following patch.
> > > >
> > > > So, could you please give the following patch a try?
> > >
> > > Sadly this does not fix the issue, is there any further information I
> > > can provide to you?
> >
> > :-(
> >
> > Would you be willing to give the attached diagnostic patch a go?
> >
> > Thanx, Paul
>
> It does not apply cleanly against current -git
> (rcu_preempt_check_blocked_tasks is missing in my rcutree_plugin.h for
> example) I tried to apply it by hand as good as possible, and will test
> it today.
>
> root@whiterabbit:/usr/src/linux# patch -p1 < ~/RCU_callbacks_and_TREE_PREEMPT_RCU-debug
> patching file kernel/rcutree.c
> Hunk #1 FAILED at 623.
> Hunk #2 FAILED at 657.
> Hunk #3 succeeded at 722 (offset 19 lines).
> Hunk #4 succeeded at 740 (offset 19 lines).
> Hunk #5 succeeded at 765 (offset 19 lines).
> Hunk #6 succeeded at 877 (offset 19 lines).
> Hunk #7 succeeded at 886 (offset 19 lines).
> 2 out of 7 hunks FAILED -- saving rejects to file kernel/rcutree.c.rej
> patching file kernel/rcutree_plugin.h
> Hunk #1 FAILED at 206.
> Hunk #2 succeeded at 206 (offset -10 lines).
> Hunk #3 FAILED at 270.
> Hunk #4 succeeded at 283 (offset -22 lines).
> Hunk #5 succeeded at 296 (offset -22 lines).
> Hunk #6 succeeded at 473 (offset -23 lines).
> 2 out of 6 hunks FAILED -- saving rejects to file
> kernel/rcutree_plugin.h.rej

Sigh!!! I lost track of what was in mainline vs. -tip. You certainly
need the following patch from -tip as well.

Please accept apologies for my confusion!!!

Thanx, Paul

------------------------------------------------------------------------

Commit-ID: de078d875cc7fc709f7818f26d38389c04369826
Gitweb: http://git.kernel.org/tip/de078d875cc7fc709f7818f26d38389c04369826
Author: Paul E. McKenney <[email protected]>
AuthorDate: Tue, 8 Sep 2009 15:54:36 -0700
Committer: Ingo Molnar <[email protected]>
CommitDate: Fri, 18 Sep 2009 00:04:54 +0200

rcu: Need to update rnp->gpnum if preemptable RCU is to be reliable

Without this patch, tasks preempted in RCU read-side critical
sections can fail to block the grace period, given that
rnp->gpnum is used to determine which rnp->blocked_tasks[]
element the preempted task is enqueued on.

Before the patch, rnp->gpnum is always zero, so preempted tasks
are always enqueued on rnp->blocked_tasks[0], which is correct
only when the current CPU has not checked into the current
grace period and the grace-period number is even, or,
similarly, if the current CPU -has- checked into the current
grace period and the grace-period number is odd.

Signed-off-by: Paul E. McKenney <[email protected]>
Acked-by: Steven Rostedt <[email protected]>
LKML-Reference: <12524504771622-git-send-email->
Signed-off-by: Ingo Molnar <[email protected]>


---
kernel/rcutree.c | 6 +++++-
1 files changed, 5 insertions(+), 1 deletions(-)

diff --git a/kernel/rcutree.c b/kernel/rcutree.c
index 6b11b07..c634a92 100644
--- a/kernel/rcutree.c
+++ b/kernel/rcutree.c
@@ -632,6 +632,7 @@ rcu_start_gp(struct rcu_state *rsp, unsigned long flags)
 	/* Special-case the common single-level case. */
 	if (NUM_RCU_NODES == 1) {
 		rnp->qsmask = rnp->qsmaskinit;
+		rnp->gpnum = rsp->gpnum;
 		rsp->signaled = RCU_SIGNAL_INIT; /* force_quiescent_state OK. */
 		spin_unlock_irqrestore(&rnp->lock, flags);
 		return;
@@ -657,8 +658,10 @@ rcu_start_gp(struct rcu_state *rsp, unsigned long flags)
 	 */

 	rnp_end = rsp->level[NUM_RCU_LVLS - 1];
-	for (rnp_cur = &rsp->node[0]; rnp_cur < rnp_end; rnp_cur++)
+	for (rnp_cur = &rsp->node[0]; rnp_cur < rnp_end; rnp_cur++) {
 		rnp_cur->qsmask = rnp_cur->qsmaskinit;
+		rnp->gpnum = rsp->gpnum;
+	}

 	/*
 	 * Now set up the leaf nodes. Here we must be careful. First,
@@ -679,6 +682,7 @@ rcu_start_gp(struct rcu_state *rsp, unsigned long flags)
 	for (; rnp_cur < rnp_end; rnp_cur++) {
 		spin_lock(&rnp_cur->lock); /* irqs already disabled. */
 		rnp_cur->qsmask = rnp_cur->qsmaskinit;
+		rnp->gpnum = rsp->gpnum;
 		spin_unlock(&rnp_cur->lock); /* irqs already disabled. */
 	}
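
The even/odd argument in the commit message above can be written out as
a small user-space sketch. It is a model inferred from the commit
message alone, not the kernel's code: the toy_* names are hypothetical,
and toy_needed_index() encodes the assumption that a preempted reader
belongs on element (gpnum + checked_in) & 0x1 of blocked_tasks[].

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Element of blocked_tasks[] that a preempted reader must be queued on
 * so that it blocks the right grace period (assumption drawn from the
 * commit message: the parity flips once the CPU has checked in).
 */
static unsigned int toy_needed_index(unsigned long gpnum, bool cpu_checked_in)
{
	return (gpnum + (cpu_checked_in ? 1UL : 0UL)) & 0x1;
}

/* Before the patch rnp->gpnum stayed at zero, so element 0 was always used. */
static unsigned int toy_prepatch_index(void)
{
	return 0;
}

/* True when the pre-patch choice happens to match the needed element. */
static bool toy_prepatch_is_correct(unsigned long gpnum, bool cpu_checked_in)
{
	return toy_prepatch_index() == toy_needed_index(gpnum, cpu_checked_in);
}

/* The four parity cases, two of which the commit message calls correct. */
static void toy_selfcheck(void)
{
	assert(toy_prepatch_is_correct(0, false));  /* even, not checked in */
	assert(toy_prepatch_is_correct(1, true));   /* odd, checked in */
	assert(!toy_prepatch_is_correct(1, false)); /* odd, not checked in */
	assert(!toy_prepatch_is_correct(0, true));  /* even, checked in */
}
```

In the two failing cases the reader sits on the element that the grace
period in progress does not wait on, so that grace period can end while
the reader is still inside its critical section. That matches the
symptom reported at the start of the thread.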

2009-09-18 12:12:04

by Eric Sesterhenn

Subject: Re: RCU callbacks and TREE_PREEMPT_RCU

Hi,

On Thu, 2009-09-17 at 15:21 -0700, Paul E. McKenney wrote:
> On Thu, Sep 17, 2009 at 10:29:02AM +0200, Eric Sesterhenn wrote:
> > On Wed, 2009-09-16 at 16:26 -0700, Paul E. McKenney wrote:
> > > On Thu, Sep 17, 2009 at 01:19:46AM +0200, Eric Sesterhenn wrote:
> > > > On Wed, 2009-09-16 at 08:57 -0700, Paul E. McKenney wrote:
> > > > > On Wed, Sep 16, 2009 at 08:47:16AM -0700, Paul E. McKenney wrote:
> > > > > > On Wed, Sep 16, 2009 at 04:34:15PM +0100, Catalin Marinas wrote:
> > > > > > > On Wed, 2009-09-16 at 08:29 -0700, Paul E. McKenney wrote:
> > > > > > > > On Wed, Sep 16, 2009 at 03:17:21PM +0100, Catalin Marinas wrote:
> > > > > > > > > When TREE_PREEMPT_RCU is enabled, the rcu list traversing above fails
> > > > > > > > > with access to 0x6b6b6b6b but it is fine with TREE_PREEMPT_RCU=n and
> > > > > > > > > TREE_RCU=y. During clean-up, kmemleak objects should no longer be freed
> > > > > > > > > by other means since kmemleak was disabled and all callbacks are
> > > > > > > > > ignored. The system is a 900Mhz P3, 256MB RAM, CONFIG_SMP=n.
> > > > > > > > >
> > > > > > > > > Is there something I'm doing wrong in kmemleak or a bug with RCU
> > > > > > > > > preemption? The kernel oops looks like this:
> > > > > > > >
> > > > > > > > From your description and the code above, I must suspect a bug with
> > > > > > > > RCU preemption. A new one, as the only bugs I am currently chasing
> > > > > > > > involve NR_CPUS>32 (>64 on 64-bit systems).
> > > > > > > >
> > > > > > > > CONFIG_SMP=n implies NR_CPUS==1 in your build, correct?
> > > > > > >
> > > > > > > CONFIG_NR_CPUS=1.
> > > > > >
> > > > > > I was afraid of that. ;-)
> > > > >
> > > > > PS to previous -- there -is- a bug in mainline for TREE_PREEMPT_RCU for
> > > > > single-CPU operation, but it is with synchronize_rcu() rather than
> > > > > call_rcu(). The fix is in tip/core/urgent, commit #366b04ca. Or see
> > > > > the following patch.
> > > > >
> > > > > So, could you please give the following patch a try?
> > > >
> > > > Sadly this does not fix the issue, is there any further information I
> > > > can provide to you?
> > >
> > > :-(
> > >
> > > Would you be willing to give the attached diagnostic patch a go?
> > >
> > > Thanx, Paul
> >
> > It does not apply cleanly against current -git
> > (rcu_preempt_check_blocked_tasks is missing in my rcutree_plugin.h for
> > example) I tried to apply it by hand as good as possible, and will test
> > it today.
> >
> > root@whiterabbit:/usr/src/linux# patch -p1 < ~/RCU_callbacks_and_TREE_PREEMPT_RCU-debug
> > patching file kernel/rcutree.c
> > Hunk #1 FAILED at 623.
> > Hunk #2 FAILED at 657.
> > Hunk #3 succeeded at 722 (offset 19 lines).
> > Hunk #4 succeeded at 740 (offset 19 lines).
> > Hunk #5 succeeded at 765 (offset 19 lines).
> > Hunk #6 succeeded at 877 (offset 19 lines).
> > Hunk #7 succeeded at 886 (offset 19 lines).
> > 2 out of 7 hunks FAILED -- saving rejects to file kernel/rcutree.c.rej
> > patching file kernel/rcutree_plugin.h
> > Hunk #1 FAILED at 206.
> > Hunk #2 succeeded at 206 (offset -10 lines).
> > Hunk #3 FAILED at 270.
> > Hunk #4 succeeded at 283 (offset -22 lines).
> > Hunk #5 succeeded at 296 (offset -22 lines).
> > Hunk #6 succeeded at 473 (offset -23 lines).
> > 2 out of 6 hunks FAILED -- saving rejects to file
> > kernel/rcutree_plugin.h.rej
>
> Sigh!!! I lost track of what was in mainline vs. -tip. You certainly
> need the following patch from -tip as well.
>
> Please accept apologies for my confusion!!!

No problem. It still did not apply cleanly, but I was able to get a
working kernel and can't reproduce the issue with all 3 patches applied.

Thanks, Eric


2009-09-18 12:41:16

by Paul E. McKenney

Subject: Re: RCU callbacks and TREE_PREEMPT_RCU

On Fri, Sep 18, 2009 at 02:12:01PM +0200, Eric Sesterhenn wrote:
> hi,
>
> On Thu, 2009-09-17 at 15:21 -0700, Paul E. McKenney wrote:
> > On Thu, Sep 17, 2009 at 10:29:02AM +0200, Eric Sesterhenn wrote:
> > > On Wed, 2009-09-16 at 16:26 -0700, Paul E. McKenney wrote:
> > > > On Thu, Sep 17, 2009 at 01:19:46AM +0200, Eric Sesterhenn wrote:
> > > > > On Wed, 2009-09-16 at 08:57 -0700, Paul E. McKenney wrote:
> > > > > > On Wed, Sep 16, 2009 at 08:47:16AM -0700, Paul E. McKenney wrote:
> > > > > > > On Wed, Sep 16, 2009 at 04:34:15PM +0100, Catalin Marinas wrote:
> > > > > > > > On Wed, 2009-09-16 at 08:29 -0700, Paul E. McKenney wrote:
> > > > > > > > > On Wed, Sep 16, 2009 at 03:17:21PM +0100, Catalin Marinas wrote:
> > > > > > > > > > When TREE_PREEMPT_RCU is enabled, the rcu list traversing above fails
> > > > > > > > > > with access to 0x6b6b6b6b but it is fine with TREE_PREEMPT_RCU=n and
> > > > > > > > > > TREE_RCU=y. During clean-up, kmemleak objects should no longer be freed
> > > > > > > > > > by other means since kmemleak was disabled and all callbacks are
> > > > > > > > > > ignored. The system is a 900Mhz P3, 256MB RAM, CONFIG_SMP=n.
> > > > > > > > > >
> > > > > > > > > > Is there something I'm doing wrong in kmemleak or a bug with RCU
> > > > > > > > > > preemption? The kernel oops looks like this:
> > > > > > > > >
> > > > > > > > > From your description and the code above, I must suspect a bug with
> > > > > > > > > RCU preemption. A new one, as the only bugs I am currently chasing
> > > > > > > > > involve NR_CPUS>32 (>64 on 64-bit systems).
> > > > > > > > >
> > > > > > > > > CONFIG_SMP=n implies NR_CPUS==1 in your build, correct?
> > > > > > > >
> > > > > > > > CONFIG_NR_CPUS=1.
> > > > > > >
> > > > > > > I was afraid of that. ;-)
> > > > > >
> > > > > > PS to previous -- there -is- a bug in mainline for TREE_PREEMPT_RCU for
> > > > > > single-CPU operation, but it is with synchronize_rcu() rather than
> > > > > > call_rcu(). The fix is in tip/core/urgent, commit #366b04ca. Or see
> > > > > > the following patch.
> > > > > >
> > > > > > So, could you please give the following patch a try?
> > > > >
> > > > > Sadly this does not fix the issue, is there any further information I
> > > > > can provide to you?
> > > >
> > > > :-(
> > > >
> > > > Would you be willing to give the attached diagnostic patch a go?
> > > >
> > > > Thanx, Paul
> > >
> > > It does not apply cleanly against current -git
> > > (rcu_preempt_check_blocked_tasks is missing in my rcutree_plugin.h for
> > > example) I tried to apply it by hand as good as possible, and will test
> > > it today.
> > >
> > > root@whiterabbit:/usr/src/linux# patch -p1 < ~/RCU_callbacks_and_TREE_PREEMPT_RCU-debug
> > > patching file kernel/rcutree.c
> > > Hunk #1 FAILED at 623.
> > > Hunk #2 FAILED at 657.
> > > Hunk #3 succeeded at 722 (offset 19 lines).
> > > Hunk #4 succeeded at 740 (offset 19 lines).
> > > Hunk #5 succeeded at 765 (offset 19 lines).
> > > Hunk #6 succeeded at 877 (offset 19 lines).
> > > Hunk #7 succeeded at 886 (offset 19 lines).
> > > 2 out of 7 hunks FAILED -- saving rejects to file kernel/rcutree.c.rej
> > > patching file kernel/rcutree_plugin.h
> > > Hunk #1 FAILED at 206.
> > > Hunk #2 succeeded at 206 (offset -10 lines).
> > > Hunk #3 FAILED at 270.
> > > Hunk #4 succeeded at 283 (offset -22 lines).
> > > Hunk #5 succeeded at 296 (offset -22 lines).
> > > Hunk #6 succeeded at 473 (offset -23 lines).
> > > 2 out of 6 hunks FAILED -- saving rejects to file
> > > kernel/rcutree_plugin.h.rej
> >
> > Sigh!!! I lost track of what was in mainline vs. -tip. You certainly
> > need the following patch from -tip as well.
> >
> > Please accept apologies for my confusion!!!
>
> no problem, it still did not apply cleanly, but i was able to get a
> working kernel and cant reproduce the issue with all 3 patches applied.

Very good!!! ;-)

Thanx, Paul

> Thanks, Eric