Moving this discussion to a public list as discussing how to reduce the
number of rcu variants does not make sense in private. We should have
an archive of such discussions.
Ingo Molnar <[email protected]> writes:
> * Paul E. McKenney <[email protected]> wrote:
>
>> > So if people really want that low-cost RCU, and some people really
>> > need the sleepable version, the only one that can _possibly_ be dumped
>> > is the preempt one.
>> >
>> > But I may - again - be confused and/or missing something.
>>
>> I am going to do something very stupid and say that I was instead thinking in
>> terms of getting rid of RCU-bh, thus reminding you of its existence. ;-)
>>
>> The reason for believing that it is possible to get rid of RCU-bh is the work
>> that has gone into improving the forward progress of RCU grace periods under
>> heavy load and in corner-case workloads.
>>
>
> [...]
>
>> The other reason for RCU-sched is it has the side effect of waiting
>> for all in-flight hardware interrupt handlers, NMI handlers, and
>> preempt-disable regions of code to complete, and last I checked, this side
>> effect is relied on. In contrast, RCU-preempt is only guaranteed to wait
>> on regions of code protected by rcu_read_lock() and rcu_read_unlock().
>
> Instead of only trying to fix the documentation (which is never a bad idea but it
> is fighting the symptom in this case), I think the first step should be to
> simplify the RCU read side APIs of RCU from 4 APIs:
>
> rcu_read_lock()
> srcu_read_lock()
> rcu_read_lock_sched()
> rcu_read_lock_bh()
>
> ... which have ~8 further sub-model variations depending on CONFIG_PREEMPT,
> CONFIG_PREEMPT_RCU - which is really a crazy design!
>
> I think we could reduce this to just two APIs with no Kconfig dependencies:
>
> rcu_read_lock()
> rcu_read_lock_preempt_disable()
>
> Which would be much, much simpler.
>
> This is how we could do it I think:
>
> 1)
>
> Getting rid of the _bh() variant should be reasonably simple and involve a
> treewide replacement of:
>
> rcu_read_lock_bh() -> local_bh_disable()
> rcu_read_unlock_bh() -> local_bh_enable()
>
> Correct?
>
> 2)
>
> Further reducing the variants is harder, due to this main asymmetry:
>
> !PREEMPT_RCU PREEMPT_RCU=y
> rcu_read_lock_sched(): atomic atomic
> rcu_read_lock(): atomic preemptible
>
> ('atomic' here is meant in the scheduler, non-preemptible sense.)
>
> But if we look at the bigger API picture:
>
> !PREEMPT_RCU PREEMPT_RCU=y
> rcu_read_lock(): atomic preemptible
> rcu_read_lock_sched(): atomic atomic
> srcu_read_lock(): preemptible preemptible
>
> Then we could maintain full read side API flexibility by making PREEMPT_RCU=y the
> only model, merging it with SRCU and using these main read side APIs:
>
> rcu_read_lock_preempt_disable(): atomic
> rcu_read_lock(): preemptible
>
> It's a _really_ simple and straightforward RCU model, with very obvious semantics
> all around:
>
> - Note how the 'atomic' (non-preempt) variant uses the well-known
> preempt_disable() name as a postfix to signal its main property. (It's also a
> bit of a mouthful, which should discourage over-use.)
>
> - The read side APIs are really as straightforward as possible: there's no SRCU
> distinction on the read side, no _bh() distinction and no _sched() distinction.
> (On -rt all of these would turn into preemptible sections,
> obviously.)
And it loses the one advantage of srcu_read_lock(): that you don't have
to wait for the entire world. If you actually allow sleeping that is an
important distinction to have. Or are you proposing that we add the
equivalent of init_srcu_struct to all of the rcu users?
That rcu_read_lock would need to take an argument about which rcu region
we are talking about.
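Roughly the shape every rcu user would wind up with, if today's SRCU API is
the model (kernel_srcu_region, gp, and p here are made up purely for the sake
of the example):

	DEFINE_STATIC_SRCU(kernel_srcu_region);

	int idx;

	idx = srcu_read_lock(&kernel_srcu_region);
	p = srcu_dereference(gp, &kernel_srcu_region);
	/* ... read-side critical section ... */
	srcu_read_unlock(&kernel_srcu_region, idx);

That per-user region state, plus the extra argument at every read-side entry
and exit, is the work I am asking about.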
> rcu_read_lock_preempt_disable() would essentially be all the current
> rcu_read_lock_sched() users (where the _sched() postfix was a confusing misnomer
> anyway).
>
> Wrt. merging SRCU and RCU: this can be done by making PREEMPT_RCU=y the one and
> only main RCU model and converting all SRCU users to main RCU. This is relatively
> straightforward to perform, as there are only ~170 SRCU critical sections, versus
> the 3000+ main RCU critical sections ...
It really sounds like you are talking about adding a requirement that
everyone update their rcu_read_lock() calls with information about which
region you are talking about. That seems like quite a bit of work.
Doing something implicit when PREEMPT_RCU=y and converting
"rcu_read_lock()" to "srcu_read_lock(&kernel_srcu_region)" only in that
case I can see.
Except in very specific circumstances I don't think I ever want to run a
kernel with PREEMPT_RCU the default. All of that real time stuff trades
off predictability with performance. Having lost enough performance to
spectre and meltdown I don't think it makes sense for us all to start
running predictable^H^H^H^H^H^H^H^H^H^H^H time kernels now.
> AFAICS this should be a possible read side design that keeps correctness, without
> considering grace period length patterns, i.e. without considering GC latency and
> scalability aspects.
>
> Before we get into ways to solve the latency and scalability aspects of such a
> simplified RCU model, do you agree with this analysis so far, or have I missed
> something important wrt. correctness?
RCU region specification. If we routinely allow preemption of rcu
critical sections for any length of time I can't imagine we will want to
wait for every possible preempted rcu critical section.
Of course I could see the merge working the other way. Adding the
debugging we need to find rcu critical sections that are held too long and
shrinking them so we don't need PREEMPT_RCU at all.
Eric
On Mon, Mar 05, 2018 at 08:33:20AM -0600, Eric W. Biederman wrote:
>
> Moving this discussion to a public list as discussing how to reduce the
> number of rcu variants does not make sense in private. We should have
> an archive of such discussions.
>
> Ingo Molnar <[email protected]> writes:
>
> > * Paul E. McKenney <[email protected]> wrote:
> >
> >> > So if people really want that low-cost RCU, and some people really
> >> > need the sleepable version, the only one that can _possibly_ be dumped
> >> > is the preempt one.
> >> >
> >> > But I may - again - be confused and/or missing something.
> >>
> >> I am going to do something very stupid and say that I was instead thinking in
> >> terms of getting rid of RCU-bh, thus reminding you of its existence. ;-)
> >>
> >> The reason for believing that it is possible to get rid of RCU-bh is the work
> >> that has gone into improving the forward progress of RCU grace periods under
> >> heavy load and in corner-case workloads.
> >>
> >
> > [...]
> >
> >> The other reason for RCU-sched is it has the side effect of waiting
> >> for all in-flight hardware interrupt handlers, NMI handlers, and
> >> preempt-disable regions of code to complete, and last I checked, this side
> >> effect is relied on. In contrast, RCU-preempt is only guaranteed to wait
> >> on regions of code protected by rcu_read_lock() and rcu_read_unlock().
> >
> > Instead of only trying to fix the documentation (which is never a bad idea but it
> > is fighting the symptom in this case), I think the first step should be to
> > simplify the RCU read side APIs of RCU from 4 APIs:
> >
> > rcu_read_lock()
> > srcu_read_lock()
> > rcu_read_lock_sched()
> > rcu_read_lock_bh()
> >
> > ... which have ~8 further sub-model variations depending on CONFIG_PREEMPT,
> > CONFIG_PREEMPT_RCU - which is really a crazy design!
If it is possible to set CONFIG_PREEMPT_RCU differently than CONFIG_PREEMPT,
then that is a bug that I need to fix.
> > I think we could reduce this to just two APIs with no Kconfig dependencies:
> >
> > rcu_read_lock()
> > rcu_read_lock_preempt_disable()
> >
> > Which would be much, much simpler.
No argument on the simpler part, at least from an API perspective.
> > This is how we could do it I think:
> >
> > 1)
> >
> > Getting rid of the _bh() variant should be reasonably simple and involve a
> > treewide replacement of:
> >
> > rcu_read_lock_bh() -> local_bh_disable()
> > rcu_read_unlock_bh() -> local_bh_enable()
> >
> > Correct?
Assuming that I have done enough forward-progress work on grace periods, yes.
> > 2)
> >
> > Further reducing the variants is harder, due to this main asymmetry:
> >
> > !PREEMPT_RCU PREEMPT_RCU=y
> > rcu_read_lock_sched(): atomic atomic
> > rcu_read_lock(): atomic preemptible
> >
> > ('atomic' here is meant in the scheduler, non-preemptible sense.)
> >
> > But if we look at the bigger API picture:
> >
> > !PREEMPT_RCU PREEMPT_RCU=y
> > rcu_read_lock(): atomic preemptible
> > rcu_read_lock_sched(): atomic atomic
> > srcu_read_lock(): preemptible preemptible
> >
> > Then we could maintain full read side API flexibility by making PREEMPT_RCU=y the
> > only model, merging it with SRCU and using these main read side APIs:
> >
> > rcu_read_lock_preempt_disable(): atomic
> > rcu_read_lock(): preemptible
One issue with merging SRCU into rcu_read_lock() is the general blocking
within SRCU readers. Once merged in, these guys block everyone. We should
focus initially on the non-SRCU variants.
On the other hand, Linus's suggestion of merging rcu_read_lock_sched()
into rcu_read_lock() just might be feasible. If that really does pan
out, we end up with the following:
!PREEMPT PREEMPT=y
rcu_read_lock(): atomic preemptible
srcu_read_lock(): preemptible preemptible
In this model, rcu_read_lock_sched() maps to preempt_disable() and (as
you say above) rcu_read_lock_bh() maps to local_bh_disable(). The way
this works is that in PREEMPT=y kernels, synchronize_rcu() waits not
only for RCU read-side critical sections, but also for regions of code
with preemption disabled. The main caveat seems to be that there be an
assumed point of preemptibility between each interrupt and each softirq
handler, which should be OK.
There will be some adjustments required for lockdep-RCU, but that should
be reasonably straightforward.
Seem reasonable?
> > It's a _really_ simple and straightforward RCU model, with very obvious semantics
> > all around:
> >
> > - Note how the 'atomic' (non-preempt) variant uses the well-known
> > preempt_disable() name as a postfix to signal its main property. (It's also a
> > bit of a mouthful, which should discourage over-use.)
My thought is to eliminate the atomic variant entirely. If you want
to disable preemption, interrupts, or whatever, you simply do so.
It might turn out that there are documentation benefits to having a
separate rcu_read_lock_preempt_disable() that maps to preempt_disable()
with lockdep semantics, and if so, that can be provided trivially.
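Something like the following sketch, reusing the rcu_sched_lock_map that
rcu_read_lock_sched() uses today (and with the unlock-side name invented on
the spot):

	static inline void rcu_read_lock_preempt_disable(void)
	{
		preempt_disable();
		rcu_lock_acquire(&rcu_sched_lock_map);	/* lockdep annotation */
	}

	static inline void rcu_read_unlock_preempt_enable(void)
	{
		rcu_lock_release(&rcu_sched_lock_map);
		preempt_enable();
	}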
> > - The read side APIs are really as straightforward as possible: there's no SRCU
> > distinction on the read side, no _bh() distinction and no _sched() distinction.
> > (On -rt all of these would turn into preemptible sections,
> > obviously.)
Agreed, and both models accomplish that.
> And it loses the one advantage of srcu_read_lock(): that you don't have
> to wait for the entire world. If you actually allow sleeping that is an
> important distinction to have. Or are you proposing that we add the
> equivalent of init_srcu_struct to all of the rcu users?
I am instead proposing folding rcu_read_lock_bh() and rcu_read_lock_sched()
into rcu_read_lock(), and leaving srcu_read_lock() separate.
> That rcu_read_lock would need to take an argument about which rcu region
> we are talking about.
From what I can see, it would be far better to leave SRCU separate. As you
say, it really does have very different semantics.
> > rcu_read_lock_preempt_disable() would essentially be all the current
> > rcu_read_lock_sched() users (where the _sched() postfix was a confusing misnomer
> > anyway).
I agree that rcu_read_lock_preempt_disable() is a better name.
We might not need it at all, though. There are only about 20 uses of
rcu_read_lock_sched() in v4.15. ;-)
> > Wrt. merging SRCU and RCU: this can be done by making PREEMPT_RCU=y the one and
> > only main RCU model and converting all SRCU users to main RCU. This is relatively
> > straightforward to perform, as there are only ~170 SRCU critical sections, versus
> > the 3000+ main RCU critical sections ...
>
> It really sounds like you are talking about adding a requirement that
> everyone update their rcu_read_lock() calls with information about which
> region you are talking about. That seems like quite a bit of work.
Agreed, merging RCU, RCU-bh, and RCU-sched seems much more straightforward
to me from the viewpoint of both usage and implementation.
> Doing something implicit when PREEMPT_RCU=y and converting
> "rcu_read_lock()" to "srcu_read_lock(&kernel_srcu_region)" only in that
> case I can see.
>
> Except in very specific circumstances I don't think I ever want to run a
> kernel with PREEMPT_RCU the default. All of that real time stuff trades
> off predictability with performance. Having lost enough performance to
> spectre and meltdown I don't think it makes sense for us all to start
> running predictable^H^H^H^H^H^H^H^H^H^H^H time kernels now.
Yes, in PREEMPT=n kernels RCU would act exactly as it does today.
> > AFAICS this should be a possible read side design that keeps correctness, without
> > considering grace period length patterns, i.e. without considering GC latency and
> > scalability aspects.
> >
> > Before we get into ways to solve the latency and scalability aspects of such a
> > simplified RCU model, do you agree with this analysis so far, or have I missed
> > something important wrt. correctness?
>
> RCU region specification. If we routinely allow preemption of rcu
> critical sections for any length of time I can't imagine we will want to
> wait for every possible preempted rcu critical section.
>
> Of course I could see the merge working the other way. Adding the
> debugging we need to find rcu critical sections that are held too long and
> shrinking them so we don't need PREEMPT_RCU at all.
Again, from what I can see, merging rcu_read_lock(), rcu_read_lock_sched(),
and rcu_read_lock_bh() together should get us to a much better place.
Make sense, or am I missing something?
Thanx, Paul
* Paul E. McKenney <[email protected]> wrote:
> > > But if we look at the bigger API picture:
> > >
> > > !PREEMPT_RCU PREEMPT_RCU=y
> > > rcu_read_lock(): atomic preemptible
> > > rcu_read_lock_sched(): atomic atomic
> > > srcu_read_lock(): preemptible preemptible
> > >
> > > Then we could maintain full read side API flexibility by making PREEMPT_RCU=y the
> > > only model, merging it with SRCU and using these main read side APIs:
> > >
> > > rcu_read_lock_preempt_disable(): atomic
> > > rcu_read_lock(): preemptible
>
> One issue with merging SRCU into rcu_read_lock() is the general blocking within
> SRCU readers. Once merged in, these guys block everyone. We should focus
> initially on the non-SRCU variants.
>
> On the other hand, Linus's suggestion of merging rcu_read_lock_sched()
> into rcu_read_lock() just might be feasible. If that really does pan
> out, we end up with the following:
>
> !PREEMPT PREEMPT=y
> rcu_read_lock(): atomic preemptible
> srcu_read_lock(): preemptible preemptible
>
> In this model, rcu_read_lock_sched() maps to preempt_disable() and (as
> you say above) rcu_read_lock_bh() maps to local_bh_disable(). The way
> this works is that in PREEMPT=y kernels, synchronize_rcu() waits not
> only for RCU read-side critical sections, but also for regions of code
> with preemption disabled. The main caveat seems to be that there be an
> assumed point of preemptibility between each interrupt and each softirq
> handler, which should be OK.
>
> There will be some adjustments required for lockdep-RCU, but that should
> be reasonably straightforward.
>
> Seem reasonable?
Yes, that approach sounds very reasonable to me: it is similar to what we do on
the locking side as well, where we have 'atomic' variants (spinlocks/rwlocks) and
'sleeping' variants (mutexes, rwsems, etc.).
( This means there will be more automatic coupling between BH and preempt critical
sections and RCU models not captured via explicit RCU-namespace APIs, but that
should be OK I think. )
A couple of small side notes:
- Could we please also clean up the namespace of the synchronization APIs and
change them all to an rcu_ prefix, like all the other RCU APIs are? Right now
we have a mixture like rcu_read_lock() but synchronize_rcu(), while I'd really love
to be able to do:
git grep '\<rcu_' ...
... to see RCU API usage within a particular kernel area. This would also clean
up some of the internal inconsistencies like having 'struct rcu_synchronize'.
- If we are cleaning up the write side APIs, could we move over to a _wait
nomenclature, i.e. rcu_wait*()?
I.e. the new RCU namespace would be something like:
rcu_read_lock => rcu_read_lock # unchanged
rcu_read_unlock => rcu_read_unlock # unchanged
call_rcu => rcu_call_rcu
call_rcu_bh => rcu_call_bh
call_rcu_sched => rcu_call_sched
synchronize_rcu => rcu_wait_
synchronize_rcu_bh => rcu_wait_bh
synchronize_rcu_bh_expedited => rcu_wait_expedited_bh
synchronize_rcu_expedited => rcu_wait_expedited
synchronize_rcu_mult => rcu_wait_mult
synchronize_rcu_sched => rcu_wait_sched
synchronize_rcu_tasks => rcu_wait_tasks
srcu_read_lock => srcu_read_lock # unchanged
srcu_read_unlock => srcu_read_unlock # unchanged
synchronize_srcu => srcu_wait
synchronize_srcu_expedited => srcu_wait_expedited
Note that due to the prefix approach we gain various new patterns:
git grep rcu_wait # matches both rcu and srcu
git grep rcu_wait # matches all RCU waiting variants
git grep wait_expedited # matches all expedited variants
... which all increase the organization of the namespace.
- While we are at it, the two RCU-state API variants, while rarely used, are
named in a pretty obscure, disconnected fashion as well. A much better naming
would be:
get_state_synchronize_rcu => rcu_get_state
cond_synchronize_rcu => rcu_wait_state
... or so. This would also move them into the new, unified rcu_ prefix
namespace.
Note how consistent and hierarchical the new RCU API namespace is:
<subsystem-prefix>_<verb>[_<qualifier[s]>]
If you agree with the overall concept of this I'd be glad to help out with
scripting & testing the RCU namespace transition safely in an unintrusive fashion
once you've done the model unification work, with compatibility defines to not
create conflicts, churn and pain, etc.
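For the transition period the compatibility layer could be as dumb as, say:

	/* Transition-only shims, to be removed once the tree is converted: */
	#define synchronize_rcu()		rcu_wait()
	#define synchronize_rcu_expedited()	rcu_wait_expedited()
	#define call_rcu(head, func)		rcu_call(head, func)

(just a sketch), so the conversion does not have to be a flag day.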
Thanks,
Ingo
* Ingo Molnar <[email protected]> wrote:
> I.e. the new RCU namespace would be something like:
>
> call_rcu => rcu_call_rcu
typo: rcu_call().
> synchronize_rcu => rcu_wait_
typo: rcu_wait().
Here's the updated table:
# RCU APIs:
rcu_read_lock => rcu_read_lock # unchanged
rcu_read_unlock => rcu_read_unlock # unchanged
call_rcu => rcu_call
call_rcu_bh => rcu_call_bh
call_rcu_sched => rcu_call_sched
synchronize_rcu => rcu_wait
synchronize_rcu_bh => rcu_wait_bh
synchronize_rcu_bh_expedited => rcu_wait_expedited_bh
synchronize_rcu_expedited => rcu_wait_expedited
synchronize_rcu_mult => rcu_wait_mult
synchronize_rcu_sched => rcu_wait_sched
synchronize_rcu_tasks => rcu_wait_tasks
get_state_synchronize_rcu => rcu_get_state
cond_synchronize_rcu => rcu_wait_state
# SRCU APIs:
srcu_read_lock => srcu_read_lock # unchanged
srcu_read_unlock => srcu_read_unlock # unchanged
synchronize_srcu => srcu_wait
synchronize_srcu_expedited => srcu_wait_expedited
Thanks,
Ingo
On Tue, Mar 06, 2018 at 09:47:38AM +0100, Ingo Molnar wrote:
>
> * Paul E. McKenney <[email protected]> wrote:
>
> > > > But if we look at the bigger API picture:
> > > >
> > > > !PREEMPT_RCU PREEMPT_RCU=y
> > > > rcu_read_lock(): atomic preemptible
> > > > rcu_read_lock_sched(): atomic atomic
> > > > srcu_read_lock(): preemptible preemptible
> > > >
> > > > Then we could maintain full read side API flexibility by making PREEMPT_RCU=y the
> > > > only model, merging it with SRCU and using these main read side APIs:
> > > >
> > > > rcu_read_lock_preempt_disable(): atomic
> > > > rcu_read_lock(): preemptible
> >
> > One issue with merging SRCU into rcu_read_lock() is the general blocking within
> > SRCU readers. Once merged in, these guys block everyone. We should focus
> > initially on the non-SRCU variants.
> >
> > On the other hand, Linus's suggestion of merging rcu_read_lock_sched()
> > into rcu_read_lock() just might be feasible. If that really does pan
> > out, we end up with the following:
> >
> > !PREEMPT PREEMPT=y
> > rcu_read_lock(): atomic preemptible
> > srcu_read_lock(): preemptible preemptible
> >
> > In this model, rcu_read_lock_sched() maps to preempt_disable() and (as
> > you say above) rcu_read_lock_bh() maps to local_bh_disable(). The way
> > this works is that in PREEMPT=y kernels, synchronize_rcu() waits not
> > only for RCU read-side critical sections, but also for regions of code
> > with preemption disabled. The main caveat seems to be that there be an
> > assumed point of preemptibility between each interrupt and each softirq
> > handler, which should be OK.
> >
> > There will be some adjustments required for lockdep-RCU, but that should
> > be reasonably straightforward.
> >
> > Seem reasonable?
>
> Yes, that approach sounds very reasonable to me: it is similar to what we do on
> the locking side as well, where we have 'atomic' variants (spinlocks/rwlocks) and
> 'sleeping' variants (mutexes, rwsems, etc.).
>
> ( This means there will be more automatic coupling between BH and preempt critical
> sections and RCU models not captured via explicit RCU-namespace APIs, but that
> should be OK I think. )
Thus far, I have been unable to prove that it cannot work, which is about
as good as it gets at this stage. So here is hoping! ;-)
I will look at your later corrected message, but will gratefully accept
your offer of help with the naming transition.
Thanx, Paul
> A couple of small side notes:
>
> - Could we please also clean up the namespace of the synchronization APIs and
> change them all to an rcu_ prefix, like all the other RCU APIs are? Right now
> we have a mixture like rcu_read_lock() but synchronize_rcu(), while I'd really love
> to be able to do:
>
> git grep '\<rcu_' ...
>
> ... to see RCU API usage within a particular kernel area. This would also clean
> up some of the internal inconsistencies like having 'struct rcu_synchronize'.
>
> - If we are cleaning up the write side APIs, could we move over to a _wait
> nomenclature, i.e. rcu_wait*()?
>
> I.e. the new RCU namespace would be something like:
>
> rcu_read_lock => rcu_read_lock # unchanged
> rcu_read_unlock => rcu_read_unlock # unchanged
>
> call_rcu => rcu_call_rcu
> call_rcu_bh => rcu_call_bh
> call_rcu_sched => rcu_call_sched
>
> synchronize_rcu => rcu_wait_
> synchronize_rcu_bh => rcu_wait_bh
> synchronize_rcu_bh_expedited => rcu_wait_expedited_bh
> synchronize_rcu_expedited => rcu_wait_expedited
> synchronize_rcu_mult => rcu_wait_mult
> synchronize_rcu_sched => rcu_wait_sched
> synchronize_rcu_tasks => rcu_wait_tasks
>
> srcu_read_lock => srcu_read_lock # unchanged
> srcu_read_unlock => srcu_read_unlock # unchanged
>
> synchronize_srcu => srcu_wait
> synchronize_srcu_expedited => srcu_wait_expedited
>
> Note that due to the prefix approach we gain various new patterns:
>
> git grep rcu_wait # matches both rcu and srcu
> git grep rcu_wait # matches all RCU waiting variants
> git grep wait_expedited # matches all expedited variants
>
> ... which all increase the organization of the namespace.
>
> - While we are at it, the two RCU-state API variants, while rarely used, are
> named in a pretty obscure, disconnected fashion as well. A much better naming
> would be:
>
> get_state_synchronize_rcu => rcu_get_state
> cond_synchronize_rcu => rcu_wait_state
>
> ... or so. This would also move them into the new, unified rcu_ prefix
> namespace.
>
> Note how consistent and hierarchical the new RCU API namespace is:
>
> <subsystem-prefix>_<verb>[_<qualifier[s]>]
>
> If you agree with the overall concept of this I'd be glad to help out with
> scripting & testing the RCU namespace transition safely in an unintrusive fashion
> once you've done the model unification work, with compatibility defines to not
> create conflicts, churn and pain, etc.
>
> Thanks,
>
> Ingo
>
On Tue, Mar 06, 2018 at 10:00:50AM +0100, Ingo Molnar wrote:
>
> * Ingo Molnar <[email protected]> wrote:
>
> > I.e. the new RCU namespace would be something like:
> >
> > call_rcu => rcu_call_rcu
>
> typo: rcu_call().
>
> > synchronize_rcu => rcu_wait_
>
> typo: rcu_wait().
>
> Here's the updated table:
>
> # RCU APIs:
>
> rcu_read_lock => rcu_read_lock # unchanged
> rcu_read_unlock => rcu_read_unlock # unchanged
>
> call_rcu => rcu_call
> call_rcu_bh => rcu_call_bh
> call_rcu_sched => rcu_call_sched
call_rcu_tasks => rcu_call_tasks ?
> synchronize_rcu => rcu_wait
> synchronize_rcu_bh => rcu_wait_bh
> synchronize_rcu_bh_expedited => rcu_wait_expedited_bh
> synchronize_rcu_expedited => rcu_wait_expedited
> synchronize_rcu_mult => rcu_wait_mult
The consolidation of RCU, RCU-bh, and RCU-sched would eliminate the
only use of synchronize_rcu_mult(). Should we simply remove it?
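(If I remember correctly, its lone remaining caller is in the scheduler's
CPU-hotplug path and is essentially:

	synchronize_rcu_mult(call_rcu, call_rcu_sched);

which after the consolidation is nothing more than a plain synchronize_rcu().)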
> synchronize_rcu_sched => rcu_wait_sched
> synchronize_rcu_tasks => rcu_wait_tasks
>
> get_state_synchronize_rcu => rcu_get_state
> cond_synchronize_rcu => rcu_wait_state
All of rcu_barrier, rcu_barrier_bh, rcu_barrier_sched, rcu_barrier_tasks,
and srcu_barrier remain unchanged, correct?
> # SRCU APIs:
>
> srcu_read_lock => srcu_read_lock # unchanged
> srcu_read_unlock => srcu_read_unlock # unchanged
>
> synchronize_srcu => srcu_wait
> synchronize_srcu_expedited => srcu_wait_expedited
call_srcu => srcu_call ?
And rcu_assign_pointer, rcu_access_pointer(), and rcu_dereference*()
remain unchanged, correct? I wouldn't expect RCU's list API to change,
but if we are going to change something, this would be the time...
On the other hand, the ones you list above are the ones that get used,
hence the ones for which the names are most important. That said...
The following list is a bit out of date, but is not too far off:
https://lwn.net/Articles/609973/
Yeah, I know, causing trouble for myself. ;-)
Thanx, Paul
On Tue, Mar 06, 2018 at 12:39:06PM -0800, Paul E. McKenney wrote:
> On Tue, Mar 06, 2018 at 09:47:38AM +0100, Ingo Molnar wrote:
> >
> > * Paul E. McKenney <[email protected]> wrote:
> >
> > > > > But if we look at the bigger API picture:
> > > > >
> > > > > !PREEMPT_RCU PREEMPT_RCU=y
> > > > > rcu_read_lock(): atomic preemptible
> > > > > rcu_read_lock_sched(): atomic atomic
> > > > > srcu_read_lock(): preemptible preemptible
> > > > >
> > > > > Then we could maintain full read side API flexibility by making PREEMPT_RCU=y the
> > > > > only model, merging it with SRCU and using these main read side APIs:
> > > > >
> > > > > rcu_read_lock_preempt_disable(): atomic
> > > > > rcu_read_lock(): preemptible
> > >
> > > One issue with merging SRCU into rcu_read_lock() is the general blocking within
> > > SRCU readers. Once merged in, these guys block everyone. We should focus
> > > initially on the non-SRCU variants.
> > >
> > > On the other hand, Linus's suggestion of merging rcu_read_lock_sched()
> > > into rcu_read_lock() just might be feasible. If that really does pan
> > > out, we end up with the following:
> > >
> > > !PREEMPT PREEMPT=y
> > > rcu_read_lock(): atomic preemptible
> > > srcu_read_lock(): preemptible preemptible
> > >
> > > In this model, rcu_read_lock_sched() maps to preempt_disable() and (as
> > > you say above) rcu_read_lock_bh() maps to local_bh_disable(). The way
> > > this works is that in PREEMPT=y kernels, synchronize_rcu() waits not
> > > only for RCU read-side critical sections, but also for regions of code
> > > with preemption disabled. The main caveat seems to be that there be an
> > > assumed point of preemptibility between each interrupt and each softirq
> > > handler, which should be OK.
> > >
> > > There will be some adjustments required for lockdep-RCU, but that should
> > > be reasonably straightforward.
> > >
> > > Seem reasonable?
> >
> > Yes, that approach sounds very reasonable to me: it is similar to what we do on
> > the locking side as well, where we have 'atomic' variants (spinlocks/rwlocks) and
> > 'sleeping' variants (mutexes, rwsems, etc.).
> >
> > ( This means there will be more automatic coupling between BH and preempt critical
> > sections and RCU models not captured via explicit RCU-namespace APIs, but that
> > should be OK I think. )
>
> Thus far, I have been unable to prove that it cannot work, which is about
> as good as it gets at this stage. So here is hoping! ;-)
>
> I will look at your later corrected message, but will gratefully accept
> your offer of help with the naming transition.
Ah, and any thoughts on how best to get feedback from the various people
who would need to reprogram their fingers? Or is everyone already on
board with changing these various names?
Thanx, Paul
On Wed, Mar 7, 2018 at 7:54 AM, Paul E. McKenney
<[email protected]> wrote:
>
> Ah, and any thoughts on how best to get feedback from the various people
> who would need to reprogram their fingers? Or is everyone already on
> board with changing these various names?
I really would prefer to not see massive re-naming unless there is a
really good reason for it.
I'm all for simplifying RCU from a million different versions down to
just a few thousand, but I'm definitely not convinced we want to do
any search-and-replace.
Linus
On Wed, Mar 07, 2018 at 10:48:50AM -0800, Linus Torvalds wrote:
> On Wed, Mar 7, 2018 at 7:54 AM, Paul E. McKenney
> <[email protected]> wrote:
> >
> > Ah, and any thoughts on how best to get feedback from the various people
> > who would need to reprogram their fingers? Or is everyone already on
> > board with changing these various names?
>
> I really would prefer to not see massive re-naming unless there is a
> really good reason for it.
>
> I'm all for simplifying RCU from a million different versions down to
> just a few thousand, but I'm definitely not convinced we want to do
> any search-and-replace.
I am currently in the design (more accurately, reredesign phase) for
the simplification. It is quite possible that there is a good reason
for at least some renaming, but in that case, I would come back later
with that as a separate proposal.
Thanx, Paul
On Wed, Mar 07, 2018 at 07:54:44AM -0800, Paul E. McKenney wrote:
> On Tue, Mar 06, 2018 at 12:39:06PM -0800, Paul E. McKenney wrote:
> > On Tue, Mar 06, 2018 at 09:47:38AM +0100, Ingo Molnar wrote:
> > >
> > > * Paul E. McKenney <[email protected]> wrote:
> > >
> > > > > > But if we look at the bigger API picture:
> > > > > >
> > > > > > !PREEMPT_RCU PREEMPT_RCU=y
> > > > > > rcu_read_lock(): atomic preemptible
> > > > > > rcu_read_lock_sched(): atomic atomic
> > > > > > srcu_read_lock(): preemptible preemptible
> > > > > >
> > > > > > Then we could maintain full read side API flexibility by making PREEMPT_RCU=y the
> > > > > > only model, merging it with SRCU and using these main read side APIs:
> > > > > >
> > > > > > rcu_read_lock_preempt_disable(): atomic
> > > > > > rcu_read_lock(): preemptible
> > > >
> > > > One issue with merging SRCU into rcu_read_lock() is the general blocking within
> > > > SRCU readers. Once merged in, these guys block everyone. We should focus
> > > > initially on the non-SRCU variants.
> > > >
> > > > On the other hand, Linus's suggestion of merging rcu_read_lock_sched()
> > > > into rcu_read_lock() just might be feasible. If that really does pan
> > > > out, we end up with the following:
> > > >
> > > > !PREEMPT PREEMPT=y
> > > > rcu_read_lock(): atomic preemptible
> > > > srcu_read_lock(): preemptible preemptible
> > > >
> > > > In this model, rcu_read_lock_sched() maps to preempt_disable() and (as
> > > > you say above) rcu_read_lock_bh() maps to local_bh_disable(). The way
> > > > this works is that in PREEMPT=y kernels, synchronize_rcu() waits not
> > > > only for RCU read-side critical sections, but also for regions of code
> > > > with preemption disabled. The main caveat seems to be that there be an
> > > > assumed point of preemptibility between each interrupt and each softirq
> > > > handler, which should be OK.
> > > >
> > > > There will be some adjustments required for lockdep-RCU, but that should
> > > > be reasonably straightforward.
> > > >
> > > > Seem reasonable?
> > >
> > > Yes, that approach sounds very reasonable to me: it is similar to what we do on
> > > the locking side as well, where we have 'atomic' variants (spinlocks/rwlocks) and
> > > 'sleeping' variants (mutexes, rwsems, etc.).
> > >
> > > ( This means there will be more automatic coupling between BH and preempt critical
> > > sections and RCU models not captured via explicit RCU-namespace APIs, but that
> > > should be OK I think. )
> >
> > Thus far, I have been unable to prove that it cannot work, which is about
> > as good as it gets at this stage. So here is hoping! ;-)
> >
> > I will look at your later corrected message, but will gratefully accept
> > your offer of help with the naming transition.
>
> Ah, and any thoughts on how best to get feedback from the various people
> who would need to reprogram their fingers? Or is everyone already on
> board with changing these various names?
Experienced developers should get there in a week (with gcc's help); newbies
would (have to) rely either on _properly updated_ documentation or on
weeks/months of code paging; scripts do the renaming. What am I missing?
Andrea
>
> Thanx, Paul
>
On Tue, Mar 6, 2018 at 12:14 AM, Paul E. McKenney
<[email protected]> wrote:
> On Mon, Mar 05, 2018 at 08:33:20AM -0600, Eric W. Biederman wrote:
>>
>> Moving this discussion to a public list as discussing how to reduce the
>> number of rcu variants does not make sense in private. We should have
>> an archive of such discussions.
>>
>> Ingo Molnar <[email protected]> writes:
>>
>> > * Paul E. McKenney <[email protected]> wrote:
>> >
>> >> > So if people really want that low-cost RCU, and some people really
>> >> > need the sleepable version, the only one that can _possibly_ be dumped
>> >> > is the preempt one.
>> >> >
>> >> > But I may - again - be confused and/or missing something.
>> >>
>> >> I am going to do something very stupid and say that I was instead thinking in
>> >> terms of getting rid of RCU-bh, thus reminding you of its existence. ;-)
>> >>
>> >> The reason for believing that it is possible to get rid of RCU-bh is the work
>> >> that has gone into improving the forward progress of RCU grace periods under
>> >> heavy load and in corner-case workloads.
>> >>
>> >
>> > [...]
>> >
>> >> The other reason for RCU-sched is it has the side effect of waiting
>> >> for all in-flight hardware interrupt handlers, NMI handlers, and
>> >> preempt-disable regions of code to complete, and last I checked, this side
>> >> effect is relied on. In contrast, RCU-preempt is only guaranteed to wait
>> >> on regions of code protected by rcu_read_lock() and rcu_read_unlock().
>> >
>> > Instead of only trying to fix the documentation (which is never a bad idea but it
>> > is fighting the symptom in this case), I think the first step should be to
>> > simplify the RCU read side APIs of RCU from 4 APIs:
>> >
>> > rcu_read_lock()
>> > srcu_read_lock()
>> > rcu_read_lock_sched()
>> > rcu_read_lock_bh()
>> >
>> > ... which have ~8 further sub-model variations depending on CONFIG_PREEMPT,
>> > CONFIG_PREEMPT_RCU - which is really a crazy design!
>
> If it is possible to set CONFIG_PREEMPT_RCU differently than CONFIG_PREEMPT,
> then that is a bug that I need to fix.
>
>> > I think we could reduce this to just two APIs with no Kconfig dependencies:
>> >
>> > rcu_read_lock()
>> > rcu_read_lock_preempt_disable()
>> >
>> > Which would be much, much simpler.
>
> No argument on the simpler part, at least from an API perspective.
>
>> > This is how we could do it I think:
>> >
>> > 1)
>> >
>> > Getting rid of the _bh() variant should be reasonably simple and involve a
>> > treewide replacement of:
>> >
>> > rcu_read_lock_bh() -> local_bh_disable()
>> > rcu_read_unlock_bh() -> local_bh_enable()
>> >
>> > Correct?
>
> Assuming that I have done enough forward-progress work on grace periods, yes.
>
>> > 2)
>> >
>> > Further reducing the variants is harder, due to this main asymmetry:
>> >
>> > !PREEMPT_RCU PREEMPT_RCU=y
>> > rcu_read_lock_sched(): atomic atomic
>> > rcu_read_lock(): atomic preemptible
>> >
>> > ('atomic' here is meant in the scheduler, non-preemptible sense.)
>> >
>> > But if we look at the bigger API picture:
>> >
>> > !PREEMPT_RCU PREEMPT_RCU=y
>> > rcu_read_lock(): atomic preemptible
>> > rcu_read_lock_sched(): atomic atomic
>> > srcu_read_lock(): preemptible preemptible
>> >
>> > Then we could maintain full read side API flexibility by making PREEMPT_RCU=y the
>> > only model, merging it with SRCU and using these main read side APIs:
>> >
>> > rcu_read_lock_preempt_disable(): atomic
>> > rcu_read_lock(): preemptible
>
> One issue with merging SRCU into rcu_read_lock() is the general blocking
> within SRCU readers. Once merged in, these guys block everyone. We should
> focus initially on the non-SRCU variants.
>
> On the other hand, Linus's suggestion of merging rcu_read_lock_sched()
> into rcu_read_lock() just might be feasible. If that really does pan
> out, we end up with the following:
>
> !PREEMPT PREEMPT=y
> rcu_read_lock(): atomic preemptible
> srcu_read_lock(): preemptible preemptible
>
> In this model, rcu_read_lock_sched() maps to preempt_disable() and (as
> you say above) rcu_read_lock_bh() maps to local_bh_disable(). The way
> this works is that in PREEMPT=y kernels, synchronize_rcu() waits not
> only for RCU read-side critical sections, but also for regions of code
> with preemption disabled. The main caveat seems to be that there be an
> assumed point of preemptibility between each interrupt and each softirq
> handler, which should be OK.
>
> There will be some adjustments required for lockdep-RCU, but that should
> be reasonably straightforward.
>
> Seem reasonable?
It's good. I hope there is only one global (non-SRCU) rcu variant.
It does have a trade-off: the grace period will be extended a little
in some cases, as will call_rcu()/synchronize_rcu(). But it simplifies
the coding a lot.
>
>> > It's a _really_ simple and straightforward RCU model, with very obvious semantics
>> > all around:
>> >
>> > - Note how the 'atomic' (non-preempt) variant uses the well-known
>> > preempt_disable() name as a postfix to signal its main property. (It's also a
>> > bit of a mouthful, which should discourage over-use.)
>
> My thought is to eliminate the atomic variant entirely. If you want
> to disable preemption, interrupts, or whatever, you simply do so.
> It might turn out that there are documentation benefits to having a
> separate rcu_read_lock_preempt_disable() that maps to preempt_disable()
> with lockdep semantics, and if so, that can be provided trivially.
>
>> > - The read side APIs are really as straightforward as possible: there's no SRCU
>> > distinction on the read side, no _bh() distinction and no _sched() distinction.
>> > (On -rt all of these would turn into preemptible sections,
>> > obviously.)
>
> Agreed, and both models accomplish that.
>
>> And it loses the one advantage of srcu_read_lock(): that you don't have
>> to wait for the entire world. If you actually allow sleeping that is an
>> important distinction to have. Or are you proposing that we add the
>> equivalent of init_srcu_struct to all of the rcu users?
>
> I am instead proposing folding rcu_read_lock_bh() and rcu_read_lock_sched()
> into rcu_read_lock(), and leaving srcu_read_lock() separate.
>
>> That rcu_read_lock would need to take an argument about which rcu region
>> we are talking about.
>
> From what I can see, it would be far better to leave SRCU separate. As you
> say, it really does have very different semantics.
>
>> > rcu_read_lock_preempt_disable() would essentially be all the current
>> > rcu_read_lock_sched() users (where the _sched() postfix was a confusing misnomer
>> > anyway).
>
> I agree that rcu_read_lock_preempt_disable() is a better name.
> We might not need it at all, though. There are only about 20 uses of
> rcu_read_lock_sched() in v4.15. ;-)
>
>> > Wrt. merging SRCU and RCU: this can be done by making PREEMPT_RCU=y the one and
>> > only main RCU model and converting all SRCU users to main RCU. This is relatively
>> > straightforward to perform, as there are only ~170 SRCU critical sections, versus
>> > the 3000+ main RCU critical sections ...
>>
>> It really sounds like you are talking about adding a requirement that
>> everyone update their rcu_read_lock() calls with information about which
>> region you are talking about. That seems like quite a bit of work.
>
> Agreed, merging RCU, RCU-bh, and RCU-sched seems much more straightforward
> to me from the viewpoint of both usage and implementation.
>
>> Doing something implicit when PREEMPT_RCU=y and converting
>> "rcu_read_lock()" to "srcu_read_lock(&kernel_srcu_region)" only in that
>> case I can see.
>>
>> Except in very specific circumstances I don't think I ever want to run a
>> kernel with PREEMPT_RCU the default. All of that real time stuff trades
>> off predictability with performance. Having lost enough performance to
>> spectre and meltdown I don't think it makes sense for us all to start
>> running predictable^H^H^H^H^H^H^H^H^H^H^H time kernels now.
>
> Yes, in PREEMPT=n kernels RCU would act exactly as it does today.
>
>> > AFAICS this should be a possible read side design that keeps correctness, without
>> > considering grace period length patterns, i.e. without considering GC latency and
>> > scalability aspects.
>> >
>> > Before we get into ways to solve the latency and scalability aspects of such a
>> > simplified RCU model, do you agree with this analysis so far, or have I missed
>> > something important wrt. correctness?
>>
>> RCU region specification. If we routinely allow preemption of rcu
>> critical sections for any length of time I can't imagine we will want to
>> wait for every possible preempted rcu critical section.
>>
>> Of course I could see the merge working the other way. Adding the
>> debugging we need to find rcu critical sections that are held too long and
>> shrinking them so we don't need PREEMPT_RCU at all.
>
> Again, from what I can see, merging rcu_read_lock(), rcu_read_lock_sched(),
> and rcu_read_lock_bh() together should get us to a much better place.
>
> Make sense, or am I missing something?
>
> Thanx, Paul
>
On Fri, Mar 09, 2018 at 05:48:55PM +0800, Lai Jiangshan wrote:
> On Tue, Mar 6, 2018 at 12:14 AM, Paul E. McKenney
> <[email protected]> wrote:
> > On Mon, Mar 05, 2018 at 08:33:20AM -0600, Eric W. Biederman wrote:
> >>
> >> Moving this discussion to a public list as discussing how to reduce the
> >> number of rcu variants does not make sense in private. We should have
> >> an archive of such discussions.
> >>
> >> Ingo Molnar <[email protected]> writes:
> >>
> >> > * Paul E. McKenney <[email protected]> wrote:
> >> >
> >> >> > So if people really want that low-cost RCU, and some people really
> >> >> > need the sleepable version, the only one that can _possibly_ be dumped
> >> >> > is the preempt one.
> >> >> >
> >> >> > But I may - again - be confused and/or missing something.
> >> >>
> >> >> I am going to do something very stupid and say that I was instead thinking in
> >> >> terms of getting rid of RCU-bh, thus reminding you of its existence. ;-)
> >> >>
> >> >> The reason for believing that it is possible to get rid of RCU-bh is the work
> >> >> that has gone into improving the forward progress of RCU grace periods under
> >> >> heavy load and in corner-case workloads.
> >> >>
> >> >
> >> > [...]
> >> >
> >> >> The other reason for RCU-sched is it has the side effect of waiting
> >> >> for all in-flight hardware interrupt handlers, NMI handlers, and
> >> >> preempt-disable regions of code to complete, and last I checked, this side
> >> >> effect is relied on. In contrast, RCU-preempt is only guaranteed to wait
> >> >> on regions of code protected by rcu_read_lock() and rcu_read_unlock().
> >> >
> >> > Instead of only trying to fix the documentation (which is never a bad idea but it
> >> > is fighting the symptom in this case), I think the first step should be to
> >> > simplify the RCU read side APIs of RCU from 4 APIs:
> >> >
> >> > rcu_read_lock()
> >> > srcu_read_lock()
> >> > rcu_read_lock_sched()
> >> > rcu_read_lock_bh()
> >> >
> >> > ... which have ~8 further sub-model variations depending on CONFIG_PREEMPT,
> >> > CONFIG_PREEMPT_RCU - which is really a crazy design!
> >
> > If it is possible to set CONFIG_PREEMPT_RCU differently than CONFIG_PREEMPT,
> > then that is a bug that I need to fix.
> >
> >> > I think we could reduce this to just two APIs with no Kconfig dependencies:
> >> >
> >> > rcu_read_lock()
> >> > rcu_read_lock_preempt_disable()
> >> >
> >> > Which would be much, much simpler.
> >
> > No argument on the simpler part, at least from an API perspective.
> >
> >> > This is how we could do it I think:
> >> >
> >> > 1)
> >> >
> >> > Getting rid of the _bh() variant should be reasonably simple and involve a
> >> > treewide replacement of:
> >> >
> >> > rcu_read_lock_bh() -> local_bh_disable()
> >> > rcu_read_unlock_bh() -> local_bh_enable()
> >> >
> >> > Correct?
> >
> > Assuming that I have done enough forward-progress work on grace periods, yes.
> >
> >> > 2)
> >> >
> >> > Further reducing the variants is harder, due to this main asymmetry:
> >> >
> >> > !PREEMPT_RCU PREEMPT_RCU=y
> >> > rcu_read_lock_sched(): atomic atomic
> >> > rcu_read_lock(): atomic preemptible
> >> >
> >> > ('atomic' here is meant in the scheduler, non-preemptible sense.)
> >> >
> >> > But if we look at the bigger API picture:
> >> >
> >> > !PREEMPT_RCU PREEMPT_RCU=y
> >> > rcu_read_lock(): atomic preemptible
> >> > rcu_read_lock_sched(): atomic atomic
> >> > srcu_read_lock(): preemptible preemptible
> >> >
> >> > Then we could maintain full read side API flexibility by making PREEMPT_RCU=y the
> >> > only model, merging it with SRCU and using these main read side APIs:
> >> >
> >> > rcu_read_lock_preempt_disable(): atomic
> >> > rcu_read_lock(): preemptible
> >
> > One issue with merging SRCU into rcu_read_lock() is the general blocking
> > within SRCU readers. Once merged in, these guys block everyone. We should
> > focus initially on the non-SRCU variants.
> >
> > On the other hand, Linus's suggestion of merging rcu_read_lock_sched()
> > into rcu_read_lock() just might be feasible. If that really does pan
> > out, we end up with the following:
> >
> > !PREEMPT PREEMPT=y
> > rcu_read_lock(): atomic preemptible
> > srcu_read_lock(): preemptible preemptible
> >
> > In this model, rcu_read_lock_sched() maps to preempt_disable() and (as
> > you say above) rcu_read_lock_bh() maps to local_bh_disable(). The way
> > this works is that in PREEMPT=y kernels, synchronize_rcu() waits not
> > only for RCU read-side critical sections, but also for regions of code
> > with preemption disabled. The main caveat seems to be that there be an
> > assumed point of preemptibility between each interrupt and each softirq
> > handler, which should be OK.
> >
> > There will be some adjustments required for lockdep-RCU, but that should
> > be reasonably straightforward.
> >
> > Seem reasonable?
>
> It's good. I hope there is only one global (non-SRCU) rcu variant.
Well, there will still be both SRCU and RCU-tasks, but reducing by
two should at least help.
> It does have a trade-off: the grace period will be extended a little
> in some cases, as will call_rcu()/synchronize_rcu(). But it simplifies
> the coding a lot.
True, and the extended grace periods did bother me at first. But then
I realized that it was no worse than RCU-sched, so it should be OK.
And thank you for looking this over!
Thanx, Paul
> >> > It's a _really_ simple and straightforward RCU model, with very obvious semantics
> >> > all around:
> >> >
> >> > - Note how the 'atomic' (non-preempt) variant uses the well-known
> >> > preempt_disable() name as a postfix to signal its main property. (It's also a
> >> > bit of a mouthful, which should discourage over-use.)
> >
> > My thought is to eliminate the atomic variant entirely. If you want
> > to disable preemption, interrupts, or whatever, you simply do so.
> > It might turn out that there are documentation benefits to having a
> > separate rcu_read_lock_preempt_disable() that maps to preempt_disable()
> > with lockdep semantics, and if so, that can be provided trivially.
> >
> >> > - The read side APIs are really as straightforward as possible: there's no SRCU
> >> > distinction on the read side, no _bh() distinction and no _sched() distinction.
> >> > (On -rt all of these would turn into preemptible sections,
> >> > obviously.)
> >
> > Agreed, and both models accomplish that.
> >
> >> And it loses the one advantage of srcu_read_lock(): that you don't have
> >> to wait for the entire world. If you actually allow sleeping that is an
> >> important distinction to have. Or are you proposing that we add the
> >> equivalent of init_srcu_struct to all of the rcu users?
> >
> > I am instead proposing folding rcu_read_lock_bh() and rcu_read_lock_sched()
> > into rcu_read_lock(), and leaving srcu_read_lock() separate.
> >
> >> That rcu_read_lock would need to take an argument about which rcu region
> >> we are talking about.
> >
> > From what I can see, it would be far better to leave SRCU separate. As you
> > say, it really does have very different semantics.
> >
> >> > rcu_read_lock_preempt_disable() would essentially be all the current
> >> > rcu_read_lock_sched() users (where the _sched() postfix was a confusing misnomer
> >> > anyway).
> >
> > I agree that rcu_read_lock_preempt_disable() is a better name.
> > We might not need it at all, though. There are only about 20 uses of
> > rcu_read_lock_sched() in v4.15. ;-)
> >
> >> > Wrt. merging SRCU and RCU: this can be done by making PREEMPT_RCU=y the one and
> >> > only main RCU model and converting all SRCU users to main RCU. This is relatively
> >> > straightforward to perform, as there are only ~170 SRCU critical sections, versus
> >> > the 3000+ main RCU critical sections ...
> >>
> >> It really sounds like you are talking about adding a requirement that
> >> everyone update their rcu_read_lock() calls with information about which
> >> region you are talking about. That seems like quite a bit of work.
> >
> > Agreed, merging RCU, RCU-bh, and RCU-sched seems much more straightforward
> > to me from the viewpoint of both usage and implementation.
> >
> >> Doing something implicit when PREEMPT_RCU=y and converting
> >> "rcu_read_lock()" to "srcu_read_lock(&kernel_srcu_region)" only in that
> >> case I can see.
> >>
> >> Except in very specific circumstances I don't think I ever want to run a
> >> kernel with PREEMPT_RCU the default. All of that real time stuff trades
> >> off predictability with performance. Having lost enough performance to
> >> spectre and meltdown I don't think it makes sense for us all to start
> >> running predictable^H^H^H^H^H^H^H^H^H^H^H time kernels now.
> >
> > Yes, in PREEMPT=n kernels RCU would act exactly as it does today.
> >
> >> > AFAICS this should be a possible read side design that keeps correctness, without
> >> > considering grace period length patterns, i.e. without considering GC latency and
> >> > scalability aspects.
> >> >
> >> > Before we get into ways to solve the latency and scalability aspects of such a
> >> > simplified RCU model, do you agree with this analysis so far, or have I missed
> >> > something important wrt. correctness?
> >>
> >> RCU region specification. If we routinely allow preemption of rcu
> >> critical sections for any length of time I can't imagine we will want to
> >> wait for every possible preempted rcu critical section.
> >>
> >> Of course I could see the merge working the other way. Adding the
> >> debugging we need to find rcu critical sections that are held too long and
> >> shrinking them so we don't need PREEMPT_RCU at all.
> >
> > Again, from what I can see, merging rcu_read_lock(), rcu_read_lock_sched(),
> > and rcu_read_lock_bh() together should get us to a much better place.
> >
> > Make sense, or am I missing something?
> >
> > Thanx, Paul
> >
>
On Thu, Mar 08, 2018 at 12:45:24PM -0800, Paul E. McKenney wrote:
> On Wed, Mar 07, 2018 at 10:48:50AM -0800, Linus Torvalds wrote:
> > On Wed, Mar 7, 2018 at 7:54 AM, Paul E. McKenney
> > <[email protected]> wrote:
> > >
> > > Ah, and any thoughts on how best to get feedback from the various people
> > > who would need to reprogram their fingers? Or is everyone already on
> > > board with changing these various names?
> >
> > I really would prefer to not see massive re-naming unless there is a
> > really good reason for it.
> >
> > I'm all for simplifying RCU from a million different versions down to
> > just a few thousand, but I'm definitely not convinced we want to do
> > any search-and-replace.
>
> I am currently in the design (more accurately, reredesign phase) for
> the simplification. It is quite possible that there is a good reason
> for at least some renaming, but in that case, I would come back later
> with that as a separate proposal.
And I really am still working on this. It is a bit tricky, but still
looks doable. More likely to be ready for 4.19 than 4.18, though.
Thanx, Paul
On Tue, Apr 10, 2018 at 04:44:25PM -0700, Paul E. McKenney wrote:
> On Thu, Mar 08, 2018 at 12:45:24PM -0800, Paul E. McKenney wrote:
> > On Wed, Mar 07, 2018 at 10:48:50AM -0800, Linus Torvalds wrote:
> > > On Wed, Mar 7, 2018 at 7:54 AM, Paul E. McKenney
> > > <[email protected]> wrote:
> > > >
> > > > Ah, and any thoughts on how best to get feedback from the various people
> > > > who would need to reprogram their fingers? Or is everyone already on
> > > > board with changing these various names?
> > >
> > > I really would prefer to not see massive re-naming unless there is a
> > > really good reason for it.
> > >
> > > I'm all for simplifying RCU from a million different versions down to
> > > just a few thousand, but I'm definitely not convinced we want to do
> > > any search-and-replace.
> >
> > I am currently in the design (more accurately, reredesign phase) for
> > the simplification. It is quite possible that there is a good reason
> > for at least some renaming, but in that case, I would come back later
> > with that as a separate proposal.
>
> And I really am still working on this. It is a bit tricky, but still
> looks doable. More likely to be ready for 4.19 than 4.18, though.
I suppose it is well past time for an update...
I believe I have the preparation work done (famous last words!), and I am
now working on making rcutorture properly test the resulting compound RCU
read-side critical sections. User-level testing might be necessary (and
has been for some of the preparatory work). The ink-on-paper prototype
is starting to look promising, and I expect to get the corresponding
prototype patch posted by the end of this month.
A bit trickier than I expected (as usual), but still looking doable.
Thanx, Paul
On Fri, Jun 08, 2018 at 09:51:34AM -0700, Paul E. McKenney wrote:
> On Tue, Apr 10, 2018 at 04:44:25PM -0700, Paul E. McKenney wrote:
> > On Thu, Mar 08, 2018 at 12:45:24PM -0800, Paul E. McKenney wrote:
> > > On Wed, Mar 07, 2018 at 10:48:50AM -0800, Linus Torvalds wrote:
> > > > On Wed, Mar 7, 2018 at 7:54 AM, Paul E. McKenney
> > > > <[email protected]> wrote:
> > > > >
> > > > > Ah, and any thoughts on how best to get feedback from the various people
> > > > > who would need to reprogram their fingers? Or is everyone already on
> > > > > board with changing these various names?
> > > >
> > > > I really would prefer to not see massive re-naming unless there is a
> > > > really good reason for it.
> > > >
> > > > I'm all for simplifying RCU from a million different versions down to
> > > > just a few thousand, but I'm definitely not convinced we want to do
> > > > any search-and-replace.
> > >
> > > I am currently in the design (more accurately, reredesign phase) for
> > > the simplification. It is quite possible that there is a good reason
> > > for at least some renaming, but in that case, I would come back later
> > > with that as a separate proposal.
> >
> > And I really am still working on this. It is a bit tricky, but still
> > looks doable. More likely to be ready for 4.19 than 4.18, though.
>
> I suppose it is well past time for an update...
>
> I believe I have the preparation work done (famous last words!), and I am
> now working on making rcutorture properly test the resulting compound RCU
> read-side critical sections. User-level testing might be necessary (and
> has been for some of the preparatory work). The ink-on-paper prototype
> is starting to look promising, and I expect to get the corresponding
> prototype patch posted by the end of this month.
>
> A bit trickier than I expected (as usual), but still looking doable.
And it now passes light rcutorture testing, so I have posted an RFC
series to LKML:
lkml.kernel.org/r/[email protected]
There will be a long series of cleanups that will remove the other
flavors and also the code that supports multiple flavors, but there are
probably quite a few bugs to fix in this particular patch between now
and then. ;-)
Thanx, Paul
On Wed, Jun 27, 2018 at 03:28:35PM -0700, Paul E. McKenney wrote:
> On Fri, Jun 08, 2018 at 09:51:34AM -0700, Paul E. McKenney wrote:
> > On Tue, Apr 10, 2018 at 04:44:25PM -0700, Paul E. McKenney wrote:
> > > On Thu, Mar 08, 2018 at 12:45:24PM -0800, Paul E. McKenney wrote:
> > > > On Wed, Mar 07, 2018 at 10:48:50AM -0800, Linus Torvalds wrote:
> > > > > On Wed, Mar 7, 2018 at 7:54 AM, Paul E. McKenney
> > > > > <[email protected]> wrote:
> > > > > >
> > > > > > Ah, and any thoughts on how best to get feedback from the various people
> > > > > > who would need to reprogram their fingers? Or is everyone already on
> > > > > > board with changing these various names?
> > > > >
> > > > > I really would prefer to not see massive re-naming unless there is a
> > > > > really good reason for it.
> > > > >
> > > > > I'm all for simplifying RCU from a million different versions down to
> > > > > just a few thousand, but I'm definitely not convinced we want to do
> > > > > any search-and-replace.
> > > >
> > > > I am currently in the design (more accurately, reredesign phase) for
> > > > the simplification. It is quite possible that there is a good reason
> > > > for at least some renaming, but in that case, I would come back later
> > > > with that as a separate proposal.
> > >
> > > And I really am still working on this. It is a bit tricky, but still
> > > looks doable. More likely to be ready for 4.19 than 4.18, though.
> >
> > I suppose it is well past time for an update...
> >
> > I believe I have the preparation work done (famous last words!), and I am
> > now working on making rcutorture properly test the resulting compound RCU
> > read-side critical sections. User-level testing might be necessary (and
> > has been for some of the preparatory work). The ink-on-paper prototype
> > is starting to look promising, and I expect to get the corresponding
> > prototype patch posted by the end of this month.
> >
> > A bit trickier than I expected (as usual), but still looking doable.
>
> And it now passes light rcutorture testing, so I have posted an RFC
> series to LKML:
>
> lkml.kernel.org/r/[email protected]
>
> There will be a long series of cleanups that will remove the other
> flavors and also the code that supports multiple flavors, but there are
> probably quite a few bugs to fix in this particular patch between now
> and then. ;-)
If anything, the new consolidated-flavor version is now more stable than
the old one ever was. So I am posting the consolidation patch and a
lot of cleanup patches. Taken together, these reduce the size of RCU
by more than 400 lines of code.
The cleanups for the next merge window are confined to RCU itself.
For the merge window after that, I will have cleanups outside of RCU,
for example, replacing synchronize_sched() and synchronize_rcu_bh()
with synchronize_rcu() and removing synchronize_rcu_mult().
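Those outside-of-RCU changes are purely mechanical, for example (sketch):

	-	synchronize_sched();
	+	synchronize_rcu();

	-	synchronize_rcu_bh();
	+	synchronize_rcu();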
Thanx, Paul