LinuxLists.cc - Re: [PATCH] An RCU for SMP with a single CPU garbage collector

2011-03-07 21:16:17

Subject: Re: [PATCH] An RCU for SMP with a single CPU garbage collector

On Mon, Mar 07, 2011 at 04:01:57PM -0500, Paul E. McKenney wrote:
> Interesting!
>
> But I would really prefer leveraging the existing RCU implementations
> to the extent possible. Are the user-dedicated CPUs able to invoke
> system calls? If so, something like Frederic's approach should permit
> the existing RCU implementations to operate normally. If not, what is
> doing the RCU read-side critical sections on the dedicated CPUs?
>
> Thanx, Paul

I haven't seen Frederic's patch. Sorry I missed it!
It might have saved a bit of work...

I thought about the system call approach but rejected it.
Some (maybe many) customers needing dedicated CPUs will
have apps that never make any system calls at all.

Regards,
Joe

2011-03-07 21:34:04

by Joe Korty

Subject: Re: [PATCH] An RCU for SMP with a single CPU garbage collector

On Mon, Mar 07, 2011 at 04:16:13PM -0500, Korty, Joe wrote:
> On Mon, Mar 07, 2011 at 04:01:57PM -0500, Paul E. McKenney wrote:
>> If not, what is
>> doing the RCU read-side critical sections on the dedicated CPUs?

Oops, forgot to answer this. RCU critical regions are
delimited by preempt_disable ... preempt_enable. There is
a tie-in to preempt_enable(): it sets a special per-cpu
variable to zero whenever preempt_count() goes to zero.

The per-cpu variables are all periodically examined by
the global garbage collector. The current batch ends
when all of the per-cpu variables have gone to zero. It
then resets each to 1 or to 0, depending on the current
state of the corresponding cpu.
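
A minimal single-threaded sketch of the batch mechanism described above, not the actual JRCU code. Every name here (`NR_CPUS_SKETCH`, `struct cpu_sketch`, `sketch_preempt_disable`, `sketch_preempt_enable`, `batch_start`, `batch_done`) is hypothetical, and the model ignores the memory-ordering issues a real kernel implementation would have to handle.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS_SKETCH 4   /* hypothetical cpu count for this model */

/* Hypothetical per-cpu state: a preempt count plus the flag the
 * garbage collector examines. */
struct cpu_sketch {
    int preempt_count;   /* nesting depth of preempt-disabled regions */
    int in_batch;        /* 1 = cpu may still hold a reference from this batch */
};

static struct cpu_sketch cpus[NR_CPUS_SKETCH];

/* Entering a read-side critical region. */
void sketch_preempt_disable(int cpu)
{
    cpus[cpu].preempt_count++;
}

/* Leaving a region: the tap zeroes the flag when the count reaches zero. */
void sketch_preempt_enable(int cpu)
{
    assert(cpus[cpu].preempt_count > 0);   /* catch unbalanced calls */
    if (--cpus[cpu].preempt_count == 0)
        cpus[cpu].in_batch = 0;
}

/* Collector: start a batch, setting each flag to 1 or 0 depending on
 * the current state of the corresponding cpu. */
void batch_start(void)
{
    for (int i = 0; i < NR_CPUS_SKETCH; i++)
        cpus[i].in_batch = cpus[i].preempt_count > 0 ? 1 : 0;
}

/* Collector: the batch has ended once every flag has gone to zero. */
bool batch_done(void)
{
    for (int i = 0; i < NR_CPUS_SKETCH; i++)
        if (cpus[i].in_batch)
            return false;
    return true;
}
```

In this model the collector would call `batch_done()` periodically and, once it returns true, process the callbacks queued before `batch_start()`.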

Regards,
Joe

2011-03-07 22:51:16

by Joe Korty

Subject: Re: [PATCH] An RCU for SMP with a single CPU garbage collector

On Mon, Mar 07, 2011 at 04:16:13PM -0500, Joe Korty wrote:
> On Mon, Mar 07, 2011 at 04:01:57PM -0500, Paul E. McKenney wrote:
>> But I would really prefer leveraging the existing RCU implementations
>> to the extent possible. Are the user-dedicated CPUs able to invoke
>> system calls? If so, something like Frederic's approach should permit
>> the existing RCU implementations to operate normally. If not, what is
>> doing the RCU read-side critical sections on the dedicated CPUs?
>
> I thought about the system call approach but rejected it.
> Some (maybe many) customers needing dedicated CPUs will
> have apps that never make any system calls at all.

Hi Paul,
Thinking about it some more, the tap-into-syscall approach might
work in my implementation, in which case the tap-into-preempt-enable
code could go away.

Nice thing about RCU, the algorithms are infinitely malleable :)

Joe

2011-03-08 09:07:52

by Paul E. McKenney

Subject: Re: [PATCH] An RCU for SMP with a single CPU garbage collector

On Mon, Mar 07, 2011 at 05:51:10PM -0500, Joe Korty wrote:
> On Mon, Mar 07, 2011 at 04:16:13PM -0500, Joe Korty wrote:
> > On Mon, Mar 07, 2011 at 04:01:57PM -0500, Paul E. McKenney wrote:
> >> But I would really prefer leveraging the existing RCU implementations
> >> to the extent possible. Are the user-dedicated CPUs able to invoke
> >> system calls? If so, something like Frederic's approach should permit
> >> the existing RCU implementations to operate normally. If not, what is
> >> doing the RCU read-side critical sections on the dedicated CPUs?
> >
> > I thought about the system call approach but rejected it.
> > Some (maybe many) customers needing dedicated CPUs will
> > have apps that never make any system calls at all.
>
> Hi Paul,
> Thinking about it some more, the tap-into-syscall approach might
> work in my implementation, in which case the tap-into-preempt-enable
> code could go away.

OK, please let me know how that goes!

> Nice thing about RCU, the algorithms are infinitely malleable :)

Just trying to keep the code size finite. ;-)

Thanx, Paul

2011-03-08 15:57:17

by Joe Korty

Subject: Re: [PATCH] An RCU for SMP with a single CPU garbage collector

On Tue, Mar 08, 2011 at 04:07:42AM -0500, Paul E. McKenney wrote:
>> Thinking about it some more, the tap-into-syscall approach might
>> work in my implementation, in which case the tap-into-preempt-enable
>> code could go away.
>
> OK, please let me know how that goes!
>
>> Nice thing about RCU, the algorithms are infinitely malleable :)
>
> Just trying to keep the code size finite. ;-)

I hope to get to it this afternoon! I especially like
the lockless nature of JRCU, and that the dedicated cpus
are not loaded down with callback invocations either.
Not sure how to support the PREEMPT_RCU mode though; so
if Frederic is planning to support that, that alone would
make his approach the very best.

Joe

2011-03-08 22:54:02

by Joe Korty

Subject: Re: [PATCH] An RCU for SMP with a single CPU garbage collector

On Tue, Mar 08, 2011 at 10:57:10AM -0500, Joe Korty wrote:
> On Tue, Mar 08, 2011 at 04:07:42AM -0500, Paul E. McKenney wrote:
>>> Thinking about it some more, the tap-into-syscall approach might
>>> work in my implementation, in which case the tap-into-preempt-enable
>>> code could go away.
>>
>> OK, please let me know how that goes!
>>
>>> Nice thing about RCU, the algorithms are infinitely malleable :)
>>
>> Just trying to keep the code size finite. ;-)
>
> I hope to get to it this afternoon! I especially like
> the lockless nature of JRCU, and that the dedicated cpus
> are not loaded down with callback invocations either.
> Not sure how to support the PREEMPT_RCU mode though; so
> if Frederic is planning to support that, that alone would
> make his approach the very best.

Hi Paul,
I had a brainstorm. It _seems_ that JRCU might work fine if
all I did was remove the expensive preempt_enable() tap.
No new taps on system calls or anywhere else. That would
leave only the context switch tap plus the batch start/end
sampling that is remotely performed on each cpu by the
garbage collector. Not even rcu_read_unlock has a tap --
it is just a plain-jane preempt_enable() now.

And indeed it works! I am able to turn off the local
timer interrupt on one (of 15) cpus and the batches
keep flowing on. I have two test apps that use 100% user time
(one of them makes no system calls); when I run that
on the timer-disabled cpu the batches still advance.
Admittedly the batches do not advance as fast as before
.. they used to advance at the max rate of 50 msecs/batch.
Now I regularly see batch lengths approaching 400 msecs.

I plan to put some taps into some other low overhead places
-- at all the voluntary preemption points, at might_sleep,
at rcu_read_unlock, for safety purposes. But it is nice
to see a zero overhead approach that works fine without
any of that.
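
A rough single-threaded sketch of the tap-free variant described above, under stated assumptions rather than the real JRCU code. It assumes the collector's remote sampling amounts to reading each cpu's preempt count at batch start and again when checking for batch end. All names (`NR_CPUS_SKETCH`, `struct cpu_sketch`, `sketch_read_lock`, `sketch_read_unlock`, `sketch_context_switch`, `batch_start`, `batch_try_end`) are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS_SKETCH 4   /* hypothetical cpu count for this model */

struct cpu_sketch {
    int preempt_count;   /* nonzero while inside a read-side region */
    int in_batch;        /* 1 = not yet seen quiescent in this batch */
};

static struct cpu_sketch cpus[NR_CPUS_SKETCH];

/* Read-side entry and exit only touch the preempt count: no tap. */
void sketch_read_lock(int cpu)
{
    cpus[cpu].preempt_count++;
}

void sketch_read_unlock(int cpu)
{
    assert(cpus[cpu].preempt_count > 0);   /* catch unbalanced calls */
    cpus[cpu].preempt_count--;
}

/* The one remaining local tap: a context switch marks the cpu quiescent. */
void sketch_context_switch(int cpu)
{
    cpus[cpu].in_batch = 0;
}

/* Collector, batch start: sample every cpu (assumed sampling rule). */
void batch_start(void)
{
    for (int i = 0; i < NR_CPUS_SKETCH; i++)
        cpus[i].in_batch = cpus[i].preempt_count > 0;
}

/* Collector, batch end check: re-sample the cpus still marked busy. */
bool batch_try_end(void)
{
    bool done = true;

    for (int i = 0; i < NR_CPUS_SKETCH; i++) {
        if (cpus[i].in_batch && cpus[i].preempt_count == 0)
            cpus[i].in_batch = 0;   /* observed outside any region */
        if (cpus[i].in_batch)
            done = false;
    }
    return done;
}
```

In this model a cpu spinning in user mode is always sampled with a zero count, so it never holds a batch open. A cpu that leaves one region and enters another between two samples does delay the batch, which is safe but slower.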

Regards,
Joe

2011-03-09 22:29:23

by Frederic Weisbecker

Subject: Re: [PATCH] An RCU for SMP with a single CPU garbage collector

On Mon, Mar 07, 2011 at 04:16:13PM -0500, Joe Korty wrote:
> On Mon, Mar 07, 2011 at 04:01:57PM -0500, Paul E. McKenney wrote:
> > Interesting!
> >
> > But I would really prefer leveraging the existing RCU implementations
> > to the extent possible. Are the user-dedicated CPUs able to invoke
> > system calls? If so, something like Frederic's approach should permit
> > the existing RCU implementations to operate normally. If not, what is
> > doing the RCU read-side critical sections on the dedicated CPUs?
> >
> > Thanx, Paul
>
> I haven't seen Frederic's patch. Sorry I missed it!
> It might have saved a bit of work...

I'm sorry, it's my fault; I should have Cc'ed you on my nohz task series.

It's here: https://lkml.org/lkml/2010/12/20/209 and the rcu changes
are spread across several patches of the series. The idea is to
switch to extended quiescent state when we resume to userspace but
temporarily exit that state when we trigger an exception or an
irq. Then exit extended quiescent state when we enter the kernel
again.
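
The transitions described above can be modelled as a small state machine. This is only an illustrative sketch, not code from the nohz task series; the names (`struct eqs_sketch`, `resume_userspace`, `irq_or_exception_enter`, `irq_or_exception_exit`, `kernel_enter`, `in_extended_qs`) are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-cpu state for the model. */
struct eqs_sketch {
    bool in_user;       /* the cpu has resumed to userspace */
    int  irq_nesting;   /* exceptions/irqs currently being handled */
};

/* The cpu is in an extended quiescent state only while it runs user
 * code with no exception or irq in flight. */
bool in_extended_qs(const struct eqs_sketch *s)
{
    return s->in_user && s->irq_nesting == 0;
}

/* Resuming to userspace switches to the extended quiescent state. */
void resume_userspace(struct eqs_sketch *s)
{
    s->in_user = true;
}

/* An exception or irq temporarily exits the state... */
void irq_or_exception_enter(struct eqs_sketch *s)
{
    s->irq_nesting++;
}

/* ...and returning from it restores the state. */
void irq_or_exception_exit(struct eqs_sketch *s)
{
    assert(s->irq_nesting > 0);   /* catch unbalanced calls */
    s->irq_nesting--;
}

/* Entering the kernel again (for example via a system call) exits
 * the extended quiescent state until the next resume to userspace. */
void kernel_enter(struct eqs_sketch *s)
{
    s->in_user = false;
}
```

While `in_extended_qs()` is true, the RCU core can treat the cpu as quiescent without any help from it.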

I'll soon look at the last patchset you've posted.

2011-03-10 00:28:32

by Paul E. McKenney

Subject: Re: [PATCH] An RCU for SMP with a single CPU garbage collector

2011-03-10 00:31:11

by Paul E. McKenney

Subject: Re: [PATCH] An RCU for SMP with a single CPU garbage collector

On Tue, Mar 08, 2011 at 05:53:55PM -0500, Joe Korty wrote:
> On Tue, Mar 08, 2011 at 10:57:10AM -0500, Joe Korty wrote:
> > On Tue, Mar 08, 2011 at 04:07:42AM -0500, Paul E. McKenney wrote:
> >>> Thinking about it some more, the tap-into-syscall approach might
> >>> work in my implementation, in which case the tap-into-preempt-enable
> >>> code could go away.
> >>
> >> OK, please let me know how that goes!
> >>
> >>> Nice thing about RCU, the algorithms are infinitely malleable :)
> >>
> >> Just trying to keep the code size finite. ;-)
> >
> > I hope to get to it this afternoon! I especially like
> > the lockless nature of JRCU, and that the dedicated cpus
> > are not loaded down with callback invocations either.
> > Not sure how to support the PREEMPT_RCU mode though; so
> > if Frederic is planning to support that, that alone would
> > make his approach the very best.
>
> Hi Paul,
> I had a brainstorm. It _seems_ that JRCU might work fine if
> all I did was remove the expensive preempt_enable() tap.
> No new taps on system calls or anywhere else. That would
> leave only the context switch tap plus the batch start/end
> sampling that is remotely performed on each cpu by the
> garbage collector. Not even rcu_read_unlock has a tap --
> it is just a plain-jane preempt_enable() now.
>
> And indeed it works! I am able to turn off the local
> timer interrupt on one (of 15) cpus and the batches
> keep flowing on. I have two test apps that use 100% user time
> (one of them makes no system calls); when I run that
> on the timer-disabled cpu the batches still advance.
> Admittedly the batches do not advance as fast as before
> .. they used to advance at the max rate of 50 msecs/batch.
> Now I regularly see batch lengths approaching 400 msecs.
>
> I plan to put some taps into some other low overhead places
> -- at all the voluntary preemption points, at might_sleep,
> at rcu_read_unlock, for safety purposes. But it is nice
> to see a zero overhead approach that works fine without
> any of that.

If you had a user-level process that never did system calls and never
entered the scheduler, what do you do to force forward progress of the RCU
grace periods? (This is force_quiescent_state()'s job in TREE_RCU, FYI.)

Thanx, Paul