2023-01-03 19:08:01

by Alan Stern

Subject: Internal vs. external barriers (was: Re: Interesting LKMM litmus test)

[Added LKML to the CC: list so there will be a permanent record of this
part of the discussion, and changed the Subject: to something more
descriptive of the topic at hand.]

On Tue, Jan 03, 2023 at 04:57:56PM +0000, Jonas Oberhauser wrote:
> Happy new year everyone!
>
> I'd like to circle back to the brief discussion we had about ppo \subseteq po.
>
> Here's some context:
>
> > > > > the preserved program order not always being a
> > > > > program order, lack of
> > >
> > > > Where does the LKMM allow a ppo relation not to be in program order?
> > >
> > > When one thread releases a lock and another one takes the lock, you
> > > can get an mb relation between the two threads
> > >
> > > https://github.com/torvalds/linux/blob/master/tools/memory-model/linux-kernel.cat#L40
> > >
> > > this then turns into a ppo edge.
>
> > Ah. I suppose we should have been a little more careful about internal vs. external full barriers. RCU barriers are also external, but the model didn't try to include them in the definition of mb; we should have done the same with unlock-lock.
>
> To be more explicit, in the current LKMM, mb includes some cases of po;[UL];co;[LKW];po which also relates events between threads, and this trickles up to the ppo:
>
> let mb = ([M] ; fencerel(Mb) ; [M]) |
> ([M] ; fencerel(Before-atomic) ; [RMW] ; po? ; [M]) |
> ([M] ; po? ; [RMW] ; fencerel(After-atomic) ; [M]) |
> ([M] ; po? ; [LKW] ; fencerel(After-spinlock) ; [M]) |
> ([M] ; po ; [UL] ; (co | po) ; [LKW] ;
> fencerel(After-unlock-lock) ; [M])
> let gp = po ; [Sync-rcu | Sync-srcu] ; po?
> let strong-fence = mb | gp
> ...
> let ppo = to-r | to-w | (... | strong-fence | ...) | (po-unlock-lock-po & int) // expanded for readability
>
> Because of this, not every preserved program order edge is actually a program order edge that is being preserved.

Indeed, one can argue that neither the fence nor the (po-unlock-lock-po
& int) sub-relations should be included in ppo, since they don't reflect
dataflow constraints. They could instead be added separately to the
definition of hb, which is the only place that uses ppo.
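A minimal sketch of that alternative structure (relation names taken loosely from linux-kernel.cat; the "..." elides the remaining ppo cases, and the exact shape of hb is abbreviated, so treat this as illustrative only):

```cat
(* Sketch only: restrict ppo to genuine single-thread dataflow/ordering
   cases, and attach the fence-like terms to hb directly instead. *)
let ppo = to-r | to-w | ...
let hb = ppo | fence | (po-unlock-lock-po & int)
       | rfe | ((prop \ id) & int)
```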

> My suggestion for a fix would be to move this part out of mb and strong-fence, and instead introduce a new relation strong-sync that covers synchronization also between threads.
>
> let mb = ([M] ; fencerel(Mb) ; [M]) |
> ([M] ; fencerel(Before-atomic) ; [RMW] ; po? ; [M]) |
> ([M] ; po? ; [RMW] ; fencerel(After-atomic) ; [M]) |
> ([M] ; po? ; [LKW] ; fencerel(After-spinlock) ; [M]) |
> - ([M] ; po ; [UL] ; (co | po) ; [LKW] ;
> - fencerel(After-unlock-lock) ; [M])
> let gp = po ; [Sync-rcu | Sync-srcu] ; po?
> let strong-fence = mb | gp
> + let strong-sync = strong-fence | ([M] ; po ; [UL] ; (co | po) ; [LKW] ;
> + fencerel(After-unlock-lock) ; [M])
> ...
> let ppo = to-r | to-w | (... | strong-fence | ...) | (po-unlock-lock-po & int)
>
> and then use strong-sync instead of strong-fence everywhere else, e.g.
> - let pb = prop ; strong-fence ; hb* ; [Marked]
> + let pb = prop ; strong-sync ; hb* ; [Marked]
> and similarly where strong-fence is being redefined and used in various later lines.
> (In general I would prefer renaming also other *-fence relations into *-sync when they include edges between threads).
>
>
> Note that no ordering is changed by this move.
> Firstly, the case [M];po;[UL];po;[LKW]; fencerel(After-unlock-lock) ; [M] which is also eliminated from mb by this change is still present in ppo through the definition ppo = ... | (po-unlock-lock-po & int).
> Secondly, for the ordering of [M];po;[UL];co;[LKW]; fencerel(After-unlock-lock) ; [M] we can focus on the case [M];po;[UL];coe;[LKW]; fencerel(After-unlock-lock) ; [M] because the other case (coi) is covered by the previous case.
> Ordering imposed by this case is also not lost, since every [M];po;[UL];coe;[LKW]; fencerel(After-unlock-lock) ; [M] edge also imposes a
> [M];po;[UL];rfe;[LKR]; fencerel(After-unlock-lock) ; [M]
> edge which is a po-rel ; [Marked] ; rfe ; [Marked] ; acq-po edge and hence hb;hb;hb.
> Thirdly, no new ordering is imposed by this change since every place we now order by strong-sync was previously ordered by the old strong-fence which is identical to the new strong-sync, and in all other places we changed we just (potentially) removed ordering.
>
> The definition of strong-sync could also be slightly simplified to
> let strong-sync = strong-fence | ([M]; po-unlock-lock-po ; [After-unlock-lock] ; po ; [M])
> which is kind of pretty because the after-unlock-lock is now after po-unlock-lock-po.
>
> What do you think?

That all sounds good to me. However, I wonder if it might be better to
use "strong-order" (and similar) for the new relation name instead of
"strong-sync". The idea being that fences are about ordering, not (or
not directly) about synchronization.

Alan


2023-01-04 16:00:08

by Andrea Parri

Subject: Re: Internal vs. external barriers (was: Re: Interesting LKMM litmus test)

On Tue, Jan 03, 2023 at 01:56:08PM -0500, Alan Stern wrote:
> [Added LKML to the CC: list so there will be a permanent record of this
> part of the discussion, and changed the Subject: to something more
> descriptive of the topic at hand.]
>
> On Tue, Jan 03, 2023 at 04:57:56PM +0000, Jonas Oberhauser wrote:
> > Happy new year everyone!
> >
> > I'd like to circle back to the brief discussion we had about ppo \subseteq po.
> >
> > Here's some context:
> >
> > > > > > the preserved program order not always being a
> > > > > > program order, lack of
> > > >
> > > > > Where does the LKMM allow a ppo relation not to be in program order?
> > > >
> > > > When one thread releases a lock and another one takes the lock, you
> > > > can get an mb relation between the two threads
> > > >
> > > > https://github.com/torvalds/linux/blob/master/tools/memory-model/linux-kernel.cat#L40
> > > >
> > > > this then turns into a ppo edge.
> >
> > > Ah. I suppose we should have been a little more careful about internal vs. external full barriers. RCU barriers are also external, but the model didn't try to include them in the definition of mb; we should have done the same with unlock-lock.
> >
> > To be more explicit, in the current LKMM, mb includes some cases of po;[UL];co;[LKW];po which also relates events between threads, and this trickles up to the ppo:
> >
> > let mb = ([M] ; fencerel(Mb) ; [M]) |
> > ([M] ; fencerel(Before-atomic) ; [RMW] ; po? ; [M]) |
> > ([M] ; po? ; [RMW] ; fencerel(After-atomic) ; [M]) |
> > ([M] ; po? ; [LKW] ; fencerel(After-spinlock) ; [M]) |
> > ([M] ; po ; [UL] ; (co | po) ; [LKW] ;
> > fencerel(After-unlock-lock) ; [M])
> > let gp = po ; [Sync-rcu | Sync-srcu] ; po?
> > let strong-fence = mb | gp
> > ...
> > let ppo = to-r | to-w | (... | strong-fence | ...) | (po-unlock-lock-po & int) // expanded for readability
> >
> > Because of this, not every preserved program order edge is actually a program order edge that is being preserved.
>
> Indeed, one can argue that neither the fence nor the (po-unlock-lock-po
> & int) sub-relations should be included in ppo, since they don't reflect
> dataflow constraints. They could instead be added separately to the
> definition of hb, which is the only place that uses ppo.
>
> > My suggestion for a fix would be to move this part out of mb and strong-fence, and instead introduce a new relation strong-sync that covers synchronization also between threads.
> >
> > let mb = ([M] ; fencerel(Mb) ; [M]) |
> > ([M] ; fencerel(Before-atomic) ; [RMW] ; po? ; [M]) |
> > ([M] ; po? ; [RMW] ; fencerel(After-atomic) ; [M]) |
> > ([M] ; po? ; [LKW] ; fencerel(After-spinlock) ; [M]) |
> > - ([M] ; po ; [UL] ; (co | po) ; [LKW] ;
> > - fencerel(After-unlock-lock) ; [M])
> > let gp = po ; [Sync-rcu | Sync-srcu] ; po?
> > let strong-fence = mb | gp
> > + let strong-sync = strong-fence | ([M] ; po ; [UL] ; (co | po) ; [LKW] ;
> > + fencerel(After-unlock-lock) ; [M])
> > ...
> > let ppo = to-r | to-w | (... | strong-fence | ...) | (po-unlock-lock-po & int)
> >
> > and then use strong-sync instead of strong-fence everywhere else, e.g.
> > - let pb = prop ; strong-fence ; hb* ; [Marked]
> > + let pb = prop ; strong-sync ; hb* ; [Marked]
> > and similarly where strong-fence is being redefined and used in various later lines.
> > (In general I would prefer renaming also other *-fence relations into *-sync when they include edges between threads).
> >
> >
> > Note that no ordering is changed by this move.
> > Firstly, the case [M];po;[UL];po;[LKW]; fencerel(After-unlock-lock) ; [M] which is also eliminated from mb by this change is still present in ppo through the definition ppo = ... | (po-unlock-lock-po & int).
> > Secondly, for the ordering of [M];po;[UL];co;[LKW]; fencerel(After-unlock-lock) ; [M] we can focus on the case [M];po;[UL];coe;[LKW]; fencerel(After-unlock-lock) ; [M] because the other case (coi) is covered by the previous case.
> > Ordering imposed by this case is also not lost, since every [M];po;[UL];coe;[LKW]; fencerel(After-unlock-lock) ; [M] edge also imposes a
> > [M];po;[UL];rfe;[LKR]; fencerel(After-unlock-lock) ; [M]
> > edge which is a po-rel ; [Marked] ; rfe ; [Marked] ; acq-po edge and hence hb;hb;hb.
> > Thirdly, no new ordering is imposed by this change since every place we now order by strong-sync was previously ordered by the old strong-fence which is identical to the new strong-sync, and in all other places we changed we just (potentially) removed ordering.
> >
> > The definition of strong-sync could also be slightly simplified to
> > let strong-sync = strong-fence | ([M]; po-unlock-lock-po ; [After-unlock-lock] ; po ; [M])
> > which is kind of pretty because the after-unlock-lock is now after po-unlock-lock-po.
> >
> > What do you think?
>
> That all sounds good to me. However, I wonder if it might be better to
> use "strong-order" (and similar) for the new relation name instead of
> "strong-sync". The idea being that fences are about ordering, not (or
> not directly) about synchronization.

Sounds good to me too. I'm trying to remember why we went for the LKW
event to model smp_mb__after_unlock_lock() (as opposed to the LKR event,
as suggested above/in po-unlock-lock-po). Anyway, I currently see no
issue with the above (we know that LKW and LKR come paired), and I think
it's good to merge the two notions of "unlock-lock pair" if possible.

Andrea

2023-01-04 21:07:52

by Alan Stern

Subject: Re: Internal vs. external barriers (was: Re: Interesting LKMM litmus test)

On Wed, Jan 04, 2023 at 04:37:14PM +0100, Andrea Parri wrote:
> Sounds good to me too. I'm trying to remember why we went for the LKW
> event to model smp_mb__after_unlock_lock() (as opposed to the LKR event,
> as suggested above/in po-unlock-lock-po).

I don't remember either, but with the LKR event it would be awkward to
include the co part of (co | po) in the smp_mb__after_unlock_lock()
definition. You'd have to write something like ((co? ; rf) | po).

Aside from that, I don't think using LKR vs. LKW makes any difference.
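Roughly, the two modeling choices under discussion look like this (the left-hand names here are invented for illustration; only the right-hand sides matter):

```cat
(* Current LKW-based form: co links the unlock write to the lock write. *)
let after-ul-lk  = [M] ; po ; [UL] ; (co | po) ; [LKW] ;
                   fencerel(After-unlock-lock) ; [M]

(* Hypothetical LKR-based form: the lock read is reached through rf,
   forcing the awkward ((co? ; rf) | po) in place of (co | po). *)
let after-ul-lk' = [M] ; po ; [UL] ; ((co? ; rf) | po) ; [LKR] ;
                   fencerel(After-unlock-lock) ; [M]
```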

> Anyway, I currently see no
> issue with the above (we know that LKW and LKR come paired), and I think
> it's good to merge the two notions of "unlock-lock pair" if possible.

Indeed. It also would eliminate questions about why po-unlock-lock-po
doesn't include the co term.
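(For reference, po-unlock-lock-po links the unlock to the lock via rf rather than co; quoting its definition from memory, so treat this as a sketch:)

```cat
let po-unlock-lock-po = po ; [UL] ; (po | rf) ; [LKR] ; po
```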

Alan

2023-01-05 18:35:29

by Paul E. McKenney

Subject: Re: Internal vs. external barriers (was: Re: Interesting LKMM litmus test)

On Wed, Jan 04, 2023 at 11:13:05AM +0000, Jonas Oberhauser wrote:
>
>
> -----Original Message-----
> From: Alan Stern [mailto:[email protected]]
> Sent: Tuesday, January 3, 2023 7:56 PM
> > [Added LKML to the CC: list so there will be a permanent record of this part of the discussion, and changed the Subject: to something more descriptive of the topic at hand.]
>
> Aha, so it's the same discussion but now with 64% improved chance of immortalizing any mistakes I make.

Welcome to our world of open-source software development! ;-)

> > > To be more explicit, in the current LKMM, mb includes some cases of po;[UL];co;[LKW];po which also relates events between threads, and this trickles up to the ppo:
> > >
> > > let mb = ([M] ; fencerel(Mb) ; [M]) |
> > > ([M] ; fencerel(Before-atomic) ; [RMW] ; po? ; [M]) |
> > > ([M] ; po? ; [RMW] ; fencerel(After-atomic) ; [M]) |
> > > ([M] ; po? ; [LKW] ; fencerel(After-spinlock) ; [M]) |
> > > ([M] ; po ; [UL] ; (co | po) ; [LKW] ;
> > > fencerel(After-unlock-lock) ; [M])
> > > let gp = po ; [Sync-rcu | Sync-srcu] ; po?
> > > let strong-fence = mb | gp
> > > ...
> > > let ppo = to-r | to-w | (... | strong-fence | ...) |
> > > (po-unlock-lock-po & int) // expanded for readability
> > >
> > > Because of this, not every preserved program order edge is actually a program order edge that is being preserved.
>
> > Indeed, one can argue that neither the fence nor the (po-unlock-lock-po & int) sub-relations should be included in ppo, since they don't reflect dataflow constraints. They could instead be added separately to the definition of hb, which is the only place that uses ppo.
>
> One can, but one can also argue instead that fences and lock/unlock sequences preserve program order. At least for fences this is the view e.g. RISC-V takes and I prefer this view.
>
> > > My suggestion for a fix would be to move this part out of mb and strong-fence, and instead introduce a new relation strong-sync that covers synchronization also between threads.
> > >
> > > let mb = ([M] ; fencerel(Mb) ; [M]) |
> > > ([M] ; fencerel(Before-atomic) ; [RMW] ; po? ; [M]) |
> > > ([M] ; po? ; [RMW] ; fencerel(After-atomic) ; [M]) |
> > > ([M] ; po? ; [LKW] ; fencerel(After-spinlock) ; [M]) |
> > > - ([M] ; po ; [UL] ; (co | po) ; [LKW] ;
> > > - fencerel(After-unlock-lock) ; [M])
> > > let gp = po ; [Sync-rcu | Sync-srcu] ; po?
> > > let strong-fence = mb | gp
> > > + let strong-sync = strong-fence | ([M] ; po ; [UL] ; (co | po) ; [LKW] ;
> > > + fencerel(After-unlock-lock) ; [M])
> > > ...
> > > let ppo = to-r | to-w | (... | strong-fence | ...) |
> > > (po-unlock-lock-po & int)
> > >
> > > and then use strong-sync instead of strong-fence everywhere else, e.g.
> > > - let pb = prop ; strong-fence ; hb* ; [Marked]
> > > + let pb = prop ; strong-sync ; hb* ; [Marked]
> > > and similarly where strong-fence is being redefined and used in various later lines.
> > > (In general I would prefer renaming also other *-fence relations into *-sync when they include edges between threads).
> > > The definition of strong-sync could also be slightly simplified to
> > > let strong-sync = strong-fence | ([M]; po-unlock-lock-po ;
> > > [After-unlock-lock] ; po ; [M]) which is kind of pretty because the after-unlock-lock is now after po-unlock-lock-po.
> > >
> > > What do you think?
>
> > That all sounds good to me. However, I wonder if it might be better to use "strong-order" (and similar) for the new relation name instead of "strong-sync". The idea being that fences are about ordering, not (or not directly) about synchronization.
>
> I think that is indeed better, thanks. I suppose *-sync might be more appropriate if it *only* included edges between threads.

There are quite a few ways to group the relations. As long as we
don't end up oscillating back and forth with too short a frequency,
I am good. ;-)

> I'll wait a few days for other suggestions and then prepare a patch.

Looking forward to seeing what you come up with!

Thanx, Paul

2023-01-11 15:29:58

by Alan Stern

Subject: Re: Internal vs. external barriers (was: Re: Interesting LKMM litmus test)

On Wed, Jan 11, 2023 at 11:33:33AM +0000, Jonas Oberhauser wrote:
>
>
> -----Original Message-----
> From: Paul E. McKenney [mailto:[email protected]]
> Sent: Thursday, January 5, 2023 6:32 PM
> > On Wed, Jan 04, 2023 at 11:13:05AM +0000, Jonas Oberhauser wrote:
> > > -----Original Message-----
> > > From: Alan Stern [mailto:[email protected]]
> > > Sent: Tuesday, January 3, 2023 7:56 PM
> > > > That all sounds good to me. However, I wonder if it might be better to use "strong-order" (and similar) for the new relation name instead of "strong-sync". The idea being that fences are about ordering, not (or not directly) about synchronization.
> >
> > > I think that is indeed better, thanks. I suppose *-sync might be more appropriate if it *only* included edges between threads.
>
> > There are quite a few ways to group the relations. As long as we don't end up oscillating back and forth with too short a frequency, I am good. ;-)
>
> Considering how much effort it is to keep the documentation up-to-date
> even for small changes, I'm extremely oscillation-averse.
> Interestingly as I go through the documentation while preparing each
> patch I often find some remarks hinting at the content of the patch,
> e.g. "fences don't link events on different CPUs" and "rcu-fence is
> able to link events on different CPUs. (Perhaps this fact should lead
> us to say that rcu-fence isn't really a fence at all!)" in the current
> explanation.txt.
>
> Following the instructions sent to me by Andrea earlier, right now my
> plan is to first address the strong ordering in one patch, and then
> address this perhaps unlucky name of the other "fences" in a second
> patch. Let me know if this is incorrect, as there is some overlap in
> that I'll use strong-order right away, and then rename the handful of
> other fences-but-not-really-at-all to '-order' as well.

Minor snag: There already is an rcu-order relation in the memory model.
Maybe we need a different word from "order". Or maybe rcu-order should
be renamed.

Alan

2023-01-11 17:17:16

by Alan Stern

Subject: Re: Internal vs. external barriers (was: Re: Interesting LKMM litmus test)

On Wed, Jan 11, 2023 at 04:45:46PM +0000, Jonas Oberhauser wrote:
>
>
> -----Original Message-----
> From: Alan Stern [mailto:[email protected]]
> > On Wed, Jan 11, 2023 at 11:33:33AM +0000, Jonas Oberhauser wrote:
> > > Considering how much effort it is to keep the documentation up-to-date
> > > even for small changes, I'm extremely oscillation-averse.
> > > Interestingly as I go through the documentation while preparing each
> > > patch I often find some remarks hinting at the content of the patch,
> > > e.g. "fences don't link events on different CPUs" and "rcu-fence is
> > > able to link events on different CPUs. (Perhaps this fact should lead
> > > us to say that rcu-fence isn't really a fence at all!)" in the current
> > > explanation.txt.
> >
> > > [...] that I'll use strong-order right away, and then rename the handful of
> > > other fences-but-not-really-at-all to '-order' as well.
> >
> > Minor snag: There already is an rcu-order relation in the memory model.
> > Maybe we need a different word from "order". Or maybe rcu-order should be renamed.
>
> Yeah, I noticed (it's in the same section I'm quoting from above). There are
> some other minor things that might need editing in that section, e.g.,
> "Written symbolically, X ->rcu-fence Y means
> there are fence events E and F such that:
>
> X ->po E ->rcu-order F ->po Y."
> But in fact the definition is
> let rcu-fence = po ; rcu-order ; po?
> which allows for F = Y and not F ->po Y.

Yeah, that should be fixed.

> I'll need to get a better understanding of rcu-order before I can form an
> opinion of how things could be organized. The only thing I'm certain of is that
> strong-order and rcu-fence should end up with the same suffix :D
>
> Just looking at it from afar, it almost looks like there's a simpler,
> non-recursive definition of rcu-order trying to come out. I assume you've tried
> various things and they don't work xP ?

What is there to try? As far as I know, the only construct in the cat
language that can be used to get the effect of counting is a recursive
definition.
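Schematically (this is not the actual linux-kernel.cat text), the counting effect comes from a recursion of this shape:

```cat
(* Each base case pairs at most one critical section (rscs^-1) with one
   grace period (gp), and the recursive case chains such pairs, so any
   cycle built from "order" contains at least as many grace periods as
   critical sections. *)
let rec order = gp
              | gp ; link ; rscs^-1
              | rscs^-1 ; link ; gp
              | order ; link ; order
```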

> Is it because you use the recursion to "count" the grace periods and read-side
> critical sections, in the sense of maintaining the inequality invariant between
> them? I wonder if there's a "pumping lemma" that can show this can't be done
> with a non-recursive definition.

Such a lemma would have to be based on the other constructs available in
the language. The only things I can think of which even come close are
the * and + operators, and they are insufficient (because they are no
stronger than regular expressions, which are well known to be too weak
-- there isn't even a regular expression for strings in which the
parentheses are balanced).

Alan

2023-01-12 22:42:22

by Paul E. McKenney

Subject: Re: Internal vs. external barriers (was: Re: Interesting LKMM litmus test)

On Wed, Jan 11, 2023 at 05:24:07PM +0000, Jonas Oberhauser wrote:
>
>
> -----Original Message-----
> From: Alan Stern [mailto:[email protected]]
> > What is there to try? As far as I know, the only construct in the cat language that can be used to get the effect of counting is a recursive definition.
> > Such a lemma would have to be based on the other constructs available in the language. The only things I can think of which even come close are the * and + operators, and they are insufficient (because they are no stronger than regular expressions, which are well known to be too weak
> -- there isn't even a regular expression for strings in which the parentheses are balanced).
>
> Well, like I said, I don't yet understand rcu-order well enough to have any opinions :D
> You could say that the period between my asking whether you'd tried other approaches and my suspecting that some impossibility result makes that a waste of time was the one minute I spent reading the comment above rcu-order and glancing at the recursion :-P. That was barely enough to see that you're counting things, which, as you say, is probably not possible with the other operators.
>
> Anyways I'll need to take some time to read the definition carefully.

Apologies for the delay, gmail spam-foldered your email again. :-/

I will risk sharing the intuition behind the rcu-order counting rule.

In the code, an RCU read-side critical section begins with rcu_read_lock()
and ends with the matching rcu_read_unlock(). RCU read-side critical
section may be nested, in which case RCU cares only about the outermost
of the nested set.

An RCU grace period includes at least one moment in time during which
each and every process/CPU/task/whatever is not within an RCU read-side
critical section. Any period of time spanning a grace period is itself
a grace period. And synchronize_rcu() waits for a grace period.

Taking the above two paragraphs together, it is forbidden for any RCU
read-side critical section to start before the beginning of a given
grace period and end after the end of that same grace period.

There is no ordering within or between RCU read-side critical sections
other than their separate relationships to any concurrent grace periods
and due to any operations within or between them that may have ordering
effects. For example, if you have a series of three non-overlapping RCU
read-side critical sections executed by a given process in the absence of
any grace periods (for example, in the absence of any synchronize_rcu()
invocations), and where all other operations executed by that process
are READ_ONCE() and WRITE_ONCE(), without dependencies of any sort,
those memory-reference operations can be executed in any order.

So if a given RCU read-side critical section's first operation follows a
given grace period on some other process, then all of its other operations
might have been executed just after the start of that same grace period.
The start of the grace period must be epsilon before the (reordered)
end of that RCU read-side critical section.

Suppose that one of that critical section's memory references started
just before a memory reference of some other critical section on some
other process. Then other references in this second critical section
could be reordered to precede the beginning of the grace period.

If you work the other possible examples, you will find that as long as
there are at least as many grace periods as critical sections in a given
candidate cycle, there will be sufficient ordering to prohibit that cycle.

For diagrams, please see Figures 15.14-15.16 here:

https://kernel.org/pub/linux/kernel/people/paulmck/perfbook/perfbook.2022.09.25a.pdf

On the off-chance that this helps...

Thanx, Paul

2023-01-13 16:59:43

by Alan Stern

Subject: Re: Internal vs. external barriers (was: Re: Interesting LKMM litmus test)

On Thu, Jan 12, 2023 at 01:57:16PM -0800, Paul E. McKenney wrote:

> I will risk sharing the intuition behind the rcu-order counting rule.
>
> In the code, an RCU read-side critical section begins with rcu_read_lock()
> and ends with the matching rcu_read_unlock(). RCU read-side critical
> sections may be nested, in which case RCU cares only about the outermost
> of the nested set.
>
> An RCU grace period includes at least one moment in time during which
> each and every process/CPU/task/whatever is not within an RCU read-side
> critical section.

Strictly speaking, this is not right. It should say: For each
process/CPU/task/whatever, an RCU grace period includes at least one
moment in time during which that process is not within an RCU read-side
critical section. There does not have to be any single moment during
which no processes are executing a critical section.

For example, the following is acceptable:

CPU 0: start of synchronize_rcu()......end
CPU 1: rcu_lock().....................rcu_unlock()
CPU 2: rcu_lock().......................rcu_unlock()

Alan

2023-01-13 17:45:24

by Alan Stern

Subject: Re: Internal vs. external barriers (was: Re: Interesting LKMM litmus test)

Sorry for not replying earlier. Yesterday was an extremely busy day,
and while I was working on a reply late at night, a network outage led
to all I had written getting lost. :-(

Here's what I wanted to say, more or less. (And I will also ignore the
difference between gp and rcu-gp.)

On Thu, Jan 12, 2023 at 01:48:26PM +0000, Jonas Oberhauser wrote:
>
>
> -----Original Message-----
> From: Alan Stern [mailto:[email protected]]
> Sent: Wednesday, January 11, 2023 4:06 PM
> > Maybe we need a different word from "order". Or maybe rcu-order should be renamed.
>
> Alright, I've spent some time reading the rcu relations.
> If I understand correctly, there are the -rscs relations which "parse" the balanced and nested lock-unlock critical sections (e.g., in L1 L2 U2 U1, it links L1 to U1 and L2 to U2).
> In those relations the equation
> matched = matched | (unmatched-locks-to-unlocks \
> (unmatched-po ; unmatched-po))
> confused me for a second because it has multiple solutions, but I assume it should be read as producing the fixed point one gets by starting from the empty set and iteratively applying the map until one reaches a fixed point.

Yes; cat uses least-fixed-point definitions.

> Next, rcu-link is the normal hb/pb happens-before relation, enclosed by po; .. ; po?.

Here's how I think about rcu-link. rcu-fence will end up functioning
as kind of a super "cross-CPU" strong fence, so first consider the
restrictions we already have on strong fences. In particular,
consider a cycle of the sort ruled out by the propagation axiom
(I have expanded out the initial pb link):

A ->prop B ->strong-fence C ->hb* D ->pb* A

The part of the cycle running from C to B is:

C ->hb* D ->pb* A ->prop B

which is rcu-link aside from lacking the po? and po at the beginning
and end. Thus, it is easy to see that cycles of the form

(rcu-link ; strong-fence)+

are forbidden by the propagation axiom. We will want similar cycles
to be forbidden when the strong-fence is replaced by appropriate
RCU-specific relations. Hence the use of rcu-link to connect the
RCU-specific items.
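Concretely, this corresponds to the definition in linux-kernel.cat (quoted from memory, so treat it as a sketch):

```cat
let rcu-link = po? ; hb* ; pb* ; prop ; po
```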

> Then rcu-order links rcu events (grace periods and read-side critical sections) in a kind of rcu happens-before order. The basic idea is that if a grace period happens before an unlock, it must also happen before the matching lock since otherwise the grace period would be fully spanned by the critical section; analogously, if a lock happens before the grace period, the matching unlock must also happen before the grace period.

IMO it's generally better to think of grace periods as not being
instantaneous but as occurring over a prolonged period of time. Thus
we should say: If a grace period ends before an unlock occurs, it must
start before the corresponding lock. And contrapositively, if a lock
occurs before a grace period starts, the corresponding unlock must
occur before the grace period ends.

There's a little more to it, because rcu-fence also has implications
about stores propagating from one CPU to another in a particular
order, just like strong-fence, but that's the basic idea.

> This is made tricky by the fact that there's a distinction between the hb/pb happens-before and the rcu happens-before relations.
> As far as I understand, the current way to resolve this in the LKMM is to count the critical sections and grace periods and to argue that, as long as the latter are no fewer than the former, even nested and all kinds of other weird cases will be ordered, applying this logic recursively while relying only on rcu-link (i.e., hb/pb happens-before).

We included a mathematical proof of this in the supplemental material
to our ASPLOS paper.

> But if I look at it through a lense of a unified notion of happens-before (let's call it rcu-link'), I would think of a definition in the direction of
> idealized-rcu-order = rcu-rscs^-1 ; rcu-link' ; gp | gp ; rcu-link' ; rcu-rscs^-1
> where rcu-order can itself contribute to rcu-link', in the sense that it extends any rcu-link
> rcu-link' = rcu-link | rcu-link' ; idealized-rcu-order

Not quite; there needs to be another "; rcu-link'" at the end.
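With that correction, the proposed pair of mutually recursive definitions would read (a sketch, using Jonas's names):

```cat
let rec idealized-rcu-order = rcu-rscs^-1 ; rcu-link' ; gp
                            | gp ; rcu-link' ; rcu-rscs^-1
    and rcu-link' = rcu-link
                  | rcu-link' ; idealized-rcu-order ; rcu-link'
```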

> After thinking about it for a while, I came up with the following (ignoring the srcu cases since I don't understand them yet):

SRCU is exactly like RCU except for one aspect: The SRCU primitives
(synchronize_srcu(), srcu_lock(), and srcu_unlock()) each take an
argument, a pointer to an srcu structure. The ordering restrictions
apply only in cases where the arguments to the corresponding
primitives point to the _same_ srcu structure. That's why you see all
those "& loc" expressions sprinkled throughout the definitions of
srcu-rscs and rcu-order.

> let rec rcu-extend = rcu-rscs^-1 ; rcu-link' ; gp | gp ; rcu-link' ; rcu-rscs^-1 | gp
> and rcu-link' = rcu-link ; (rcu-extend ; rcu-link)*
> which I think satisfies
> rcu-extend <= rcu-order (use the helpful lemma rcu-order ; (rcu-link ; rcu-order)* <= rcu-order)
> rcu-order = rcu-extend ; (rcu-link ; rcu-extend)*
> (I've attached my proof sketch, but I'm not sure it's readable.)

Yes, this is a more succinct way of expressing the same definition, by
factoring out a common sub-relation and then performing simultaneous
recursion on two relations rather than one.

> If this is true, defining rcu-order like this cuts away roughly half of the cases (ignoring srcu, rcu-order has 6 cases and rcu-extend has 3),

The definition of rcu-link' should count as a case, so you have only
eliminated one-third of the cases. :-)

> and I believe it makes the argument for why the relation works much clearer: it's essentially just the argument from above, that a grace period can't happen between a lock and an unlock. Thus if something happens before a grace period that happens before an unlock, it must also happen before the lock; analogously, if something happens after a grace period that happens after a lock, it must also happen after the unlock.
> In a sense I suppose that by splitting the recursion over two relations, the argument becomes disentangled into two simple arguments: firstly, rcu-link' can just be thought of as a kind of happens-before that is used to define rcu-extend, and secondly, rcu-extend can be used to extend rcu-link's happens-before.

Clarity is in the eye of the beholder. But I agree your definition is
shorter and easier to read, although perhaps grasping its full
implications is a little harder.

> The only thing that bothers me about it is the gp. It looks quite foreign here and besides, gp is already a strong fence and implies happens-before in its own right.

The lone gp is present in the definition of rcu-order because I wanted
to express explicitly the condition that the number of grace periods
in a cycle must be _at least_ as large as the number of critical
sections. There can be more grace periods, meaning that some of them
need not be paired with a critical section.

However, it's true that (rcu-link ; gp ; rcu-link) is a sub-relation
of rcu-link. Hence the only need for a lone gp in these definitions
is to cover the case where there are no critical sections at all in
the cycle (which I would like to be forbidden by the rcu axiom,
even if it's already forbidden by the propagation axiom).

> So to clarify, I'm thinking something in the direction of
> rcu-extend = rcu-rscs^-1 ; rcu-link' ; gp | gp ; rcu-link' ; rcu-rscs^-1 (* no gp! *)
> (* recursion like before... (omitted) *)
> rb = prop ; rcu-extend ; ...
> and relying on the hb-ordering provided by a lone po;gp;po? to give the ordering ostensibly lost by not including the full rcu-order relation here.
> And then drop the old rcu-order and rename rcu-extend into rcu-order.
> But I haven't had time to think about this direction deeply yet.
>
> Let me know if I'm on the right track towards understanding rcu-order.

I think you are.

Alan

PS: I have no idea why the mailing list isn't accepting your emails.

2023-01-13 20:30:14

by Paul E. McKenney

[permalink] [raw]
Subject: Re: Internal vs. external barriers (was: Re: Interesting LKMM litmus test)

On Fri, Jan 13, 2023 at 11:28:10AM -0500, Alan Stern wrote:
> Sorry for not replying earlier. Yesterday was an extremely busy day,
> and while I was working on a reply late at night, a network outage led
> to all I had written getting lost. :-(
>
> Here's what I wanted to say, more or less. (And I will also ignore the
> difference between gp and rcu-gp.)
>
> On Thu, Jan 12, 2023 at 01:48:26PM +0000, Jonas Oberhauser wrote:
> >
> >
> > -----Original Message-----
> > From: Alan Stern [mailto:[email protected]]
> > Sent: Wednesday, January 11, 2023 4:06 PM
> > > Maybe we need a different word from "order". Or maybe rcu-order should be renamed.
> >
> > Alright, I've spent some time reading the rcu relations.
> > If I understand correctly, there are the -rscs relations which "parse" the balanced and nested lock-unlock critical sections (e.g., in L1 L2 U2 U1, it links L1 to U1 and L2 to U2).
> > In those relations the equation
> > matched = matched | (unmatched-locks-to-unlocks \
> > (unmatched-po ; unmatched-po))
> > confused me for a second because it has multiple solutions, but I assume it should be read as producing the fixed point one gets by starting from the empty set and iteratively applying the map until one reaches a fixed point.
>
> Yes; cat uses least-fixed-point definitions.
>
> > Next, rcu-link is the normal hb/pb happens-before relation, enclosed by po; .. ; po?.
>
> Here's how I think about rcu-link. rcu-fence will end up functioning
> as kind of a super "cross-CPU" strong fence, so first consider the
> restrictions we already have on strong fences. In particular,
> consider a cycle of the sort ruled out by the propagation axiom
> (I have expanded out the initial pb link):
>
> A ->prop B ->strong-fence C ->hb* D ->pb* A
>
> The part of the cycle running from C to B is:
>
> C ->hb* D ->pb* A ->prop B
>
> which is rcu-link aside from lacking the po? and po at the beginning
> and end. Thus, it is easy to see that cycles of the form
>
> (rcu-link ; strong-fence)+
>
> are forbidden by the propagation axiom. We will want similar cycles
> to be forbidden when the strong-fence is replaced by appropriate
> RCU-specific relations. Hence the use of rcu-link to connect the
> RCU-specific items.
>
> > Then rcu-order links rcu events (grace periods and read-side critical sections) in a kind of rcu happens-before order. The basic idea is that if a grace period happens before an unlock, it must also happen before the matching lock since otherwise the grace period would be fully spanned by the critical section; analogously, if a lock happens before the grace period, the matching unlock must also happen before the grace period.
>
> IMO it's generally better to think of grace periods as not being
> instantaneous but as occurring over a prolonged period of time. Thus
> we should say: If a grace period ends before an unlock occurs, it must
> start before the corresponding lock. And contrapositively, if a lock
> occurs before a grace period starts, the corresponding unlock must
> occur before the grace period ends.

What Alan said! You could even have distinct partially overlapping
grace periods, as the Linux kernel actually does have courtesy of normal
grace periods via synchronize_rcu() and expedited grace periods via
synchronize_rcu_expedited().

> There's a little more to it, because rcu-fence also has implications
> about stores propagating from one CPU to another in a particular
> order, just like strong-fence, but that's the basic idea.
>
> > This is made tricky by the fact that there's a distinction between the hb/pb happens-before and the rcu happens-before relations.
> > As far as I understand, the current way to resolve this in LKMM is to count the number of critical sections and grace periods and to argue that if the number of the latter is not less, then even nested and all kinds of other weird cases will be ordered by applying this logic recursively, relying only on rcu-link (i.e., hb/pb happens-before).
>
> We included a mathematical proof of this in the supplemental material
> to our ASPLOS paper.
>
> > But if I look at it through a lense of a unified notion of happens-before (let's call it rcu-link'), I would think of a definition in the direction of
> > idealized-rcu-order = rcu-rscs^-1 ; rcu-link' ; gp | gp ; rcu-link' ; rcu-rscs^-1
> > where rcu-order can itself contribute to rcu-link', in the sense that it extends any rcu-link
> > rcu-link' = rcu-link | rcu-link' ; idealized-rcu-order
>
> Not quite; there needs to be another "; rcu-link'" at the end.
>
> > After thinking about it for a while, I came up with the following (ignoring the srcu cases since I don't understand them yet):
>
> SRCU is exactly like RCU except for one aspect: The SRCU primitives
> (synchronize_srcu(), srcu_lock(), and srcu_unlock()) each take an
> argument, a pointer to an srcu structure. The ordering restrictions
> apply only in cases where the arguments to the corresponding
> primitives point to the _same_ srcu structure. That's why you see all
> those "& loc" expressions sprinkled throughout the definitions of
> srcu-rscs and rcu-order.

In addition, the actual Linux-kernel SRCU has srcu_read_lock() return a
value that must be passed to srcu_read_unlock(). This means that SRCU
can have distinct overlapping SRCU read-side critical sections within
the confines of a given process.

Worse yet, the upcoming addition of srcu_down_read() and srcu_up_read()
means that a given SRCU read-side critical section might begin on one
process and end on another. Thus srcu_down_read() is to srcu_read_lock()
as down_sema() is to mutex_lock(), more or less.

Making LKMM correctly model all of this has been on my todo list for an
embarrassingly long time.

> > let rec rcu-extend = rcu-rscs^-1 ; rcu-link' ; gp | gp ; rcu-link' ; rcu-rscs^-1 | gp
> > and rcu-link' = rcu-link ; (rcu-extend ; rcu-link)*
> > which I think satisfies
> > rcu-extend <= rcu-order (use the helpful lemma rcu-order ; (rcu-link ; rcu-order)* <= rcu-order)
> > rcu-order = rcu-extend ; (rcu-link ; rcu-extend)*
> > (I've attached my proof sketch, but I'm not sure it's readable.)
>
> Yes, this is a more succinct way of expressing the same definition, by
> factoring out a common sub-relation and then performing simultaneous
> recursion on two relations rather than one.
>
> > If this is true, defining rcu-order like this cuts away roughly half of the cases (ignoring srcu, rcu-order has 6 cases and rcu-extend has 3),
>
> The definition of rcu-link' should count as a case, so you have only
> eliminated one-third of the cases. :-)
>
> > and I believe makes the argument for why the relation works much clearer: It's essentially just the argument from above, that a grace period can't happen between a lock and unlock and thus if something happens before a grace period that happens before an unlock, it must also happen before the lock, and analogously if something happens after a grace period that happens after a lock, it must also happen after the unlock.
> > In a sense I suppose that by splitting the recursion over two relations, the argument becomes disentangled into two simple arguments: firstly, rcu-link' can just be thought of as a kind of happens-before that is used to define rcu-extend, and secondly, rcu-extend can be used to extend rcu-link's happens-before.
>
> Clarity is in the eye of the beholder. But I agree your definition is
> shorter and easier to read, although perhaps grasping its full
> implications is a little harder.
>
> > The only thing that bothers me about it is the gp. It looks quite foreign here and besides, gp is already a strong fence and implies happens-before in its own right.
>
> The lone gp is present in the definition of rcu-order because I wanted
> to express explicitly the condition that the number of grace periods
> in a cycle must be _at least_ as large as the number of critical
> sections. There can be more grace periods, meaning that some of them
> need not be paired with a critical section.
>
> However, it's true that (rcu-link ; gp ; rcu-link) is a sub-relation
> of rcu-link. Hence the only need for a lone gp in these definitions
> is to cover the case where there are no critical sections at all in
> the cycle (which I would like to be forbidden by the rcu axiom,
> even if it's already forbidden by the propagation axiom).
>
> > So to clarify, I'm thinking something in the direction of
> > rcu-extend = rcu-rscs^-1 ; rcu-link' ; gp | gp ; rcu-link' ; rcu-rscs^-1 (* no gp! *)
> > (* recursion like before... (omitted) *)
> > rb = prop ; rcu-extend ; ...
> > and relying on the hb-ordering provided by a lone po;gp;po? to give the ordering ostensibly lost by not including the full rcu-order relation here.
> > And then drop the old rcu-order and rename rcu-extend into rcu-order.
> > But I haven't had time to think about this direction deeply yet.
> >
> > Let me know if I'm on the right track towards understanding rcu-order.
>
> I think you are.
>
> Alan
>
> PS: I have no idea why the mailing list isn't accepting your emails.

The gmail service also doesn't like them. I forgot to look at the reason
this time, but last time it was that gmail couldn't prove to itself that
your email really came from huawei.com.

You are not the only one. So much so that I keep a browser window
open to my gmail spam folder at all times. :-/

Thanx, Paul

2023-01-13 20:50:02

by Paul E. McKenney

[permalink] [raw]
Subject: Re: Internal vs. external barriers (was: Re: Interesting LKMM litmus test)

On Fri, Jan 13, 2023 at 12:07:06PM -0800, Paul E. McKenney wrote:
> On Fri, Jan 13, 2023 at 11:28:10AM -0500, Alan Stern wrote:
> > On Thu, Jan 12, 2023 at 01:48:26PM +0000, Jonas Oberhauser wrote:
> > > From: Alan Stern [mailto:[email protected]]
> > > Sent: Wednesday, January 11, 2023 4:06 PM

[ . . . ]

> > SRCU is exactly like RCU except for one aspect: The SRCU primitives
> > (synchronize_srcu(), srcu_lock(), and srcu_unlock()) each take an
> > argument, a pointer to an srcu structure. The ordering restrictions
> > apply only in cases where the arguments to the corresponding
> > primitives point to the _same_ srcu structure. That's why you see all
> > those "& loc" expressions sprinkled throughout the definitions of
> > srcu-rscs and rcu-order.
>
> In addition, the actual Linux-kernel SRCU has srcu_read_lock() return a
> value that must be passed to srcu_read_unlock(). This means that SRCU
> can have distinct overlapping SRCU read-side critical sections within
> the confines of a given process.
>
> Worse yet, the upcoming addition of srcu_down_read() and srcu_up_read()
> means that a given SRCU read-side critical section might begin on one
> process and end on another. Thus srcu_down_read() is to srcu_read_lock()
> as down_sema() is to mutex_lock(), more or less.
>
> Making LKMM correctly model all of this has been on my todo list for an
> embarrassingly long time.

But there is no time like the present...

Here is what mainline has to recognize SRCU read-side critical sections:

------------------------------------------------------------------------

(* Compute matching pairs of nested Srcu-lock and Srcu-unlock *)
let srcu-rscs = let rec
	    unmatched-locks = Srcu-lock \ domain(matched)
	and unmatched-unlocks = Srcu-unlock \ range(matched)
	and unmatched = unmatched-locks | unmatched-unlocks
	and unmatched-po = ([unmatched] ; po ; [unmatched]) & loc
	and unmatched-locks-to-unlocks =
		([unmatched-locks] ; po ; [unmatched-unlocks]) & loc
	and matched = matched | (unmatched-locks-to-unlocks \
		(unmatched-po ; unmatched-po))
	in matched

(* Validate nesting *)
flag ~empty Srcu-lock \ domain(srcu-rscs) as unbalanced-srcu-locking
flag ~empty Srcu-unlock \ range(srcu-rscs) as unbalanced-srcu-locking

(* Check for use of synchronize_srcu() inside an RCU critical section *)
flag ~empty rcu-rscs & (po ; [Sync-srcu] ; po) as invalid-sleep

(* Validate SRCU dynamic match *)
flag ~empty different-values(srcu-rscs) as srcu-bad-nesting

------------------------------------------------------------------------

And here is what I just now tried:

------------------------------------------------------------------------

(* Compute matching pairs of Srcu-lock and Srcu-unlock *)
let srcu-rscs = ([Srcu-lock] ; rfi ; [Srcu-unlock]) & loc

(* Validate nesting *)
flag empty srcu-rscs as no-srcu-readers
flag ~empty Srcu-lock \ domain(srcu-rscs) as unbalanced-srcu-locking
flag ~empty Srcu-unlock \ range(srcu-rscs) as unbalanced-srcu-locking

(* Check for use of synchronize_srcu() inside an RCU critical section *)
flag ~empty rcu-rscs & (po ; [Sync-srcu] ; po) as invalid-sleep

(* Validate SRCU dynamic match *)
flag ~empty different-values(srcu-rscs) as srcu-bad-nesting

------------------------------------------------------------------------

This gets me "Flag no-srcu-readers" when running this litmus test:

------------------------------------------------------------------------

C C-srcu-nest-1

(*
* Result: Never
*)

{}

P0(int *x, int *y, struct srcu_struct *s)
{
	int r1;
	int r2;
	int r3;

	r3 = srcu_read_lock(s);
	r1 = READ_ONCE(*x);
	srcu_read_unlock(s, r3);
	r3 = srcu_read_lock(s);
	r2 = READ_ONCE(*y);
	srcu_read_unlock(s, r3);
}

P1(int *x, int *y, struct srcu_struct *s)
{
	WRITE_ONCE(*y, 1);
	synchronize_srcu(s);
	WRITE_ONCE(*x, 1);
}

locations [0:r1]
exists (0:r1=1 /\ 0:r2=0)

------------------------------------------------------------------------

So what did I mess up this time? ;-)

Thanx, Paul

2023-01-13 20:54:30

by Paul E. McKenney

[permalink] [raw]
Subject: Re: Internal vs. external barriers (was: Re: Interesting LKMM litmus test)

On Fri, Jan 13, 2023 at 11:38:17AM -0500, Alan Stern wrote:
> On Thu, Jan 12, 2023 at 01:57:16PM -0800, Paul E. McKenney wrote:
>
> > I will risk sharing the intuition behind the rcu-order counting rule.
> >
> > In the code, an RCU read-side critical section begins with rcu_read_lock()
> > and ends with the matching rcu_read_unlock(). RCU read-side critical
> > section may be nested, in which case RCU cares only about the outermost
> > of the nested set.
> >
> > An RCU grace period includes at least one moment in time during which
> > each and every process/CPU/task/whatever is not within an RCU read-side
> > critical section.
>
> Strictly speaking, this is not right. It should say: For each
> process/CPU/task/whatever, an RCU grace period includes at least one
> moment in time during which that process is not within an RCU read-side
> critical section. There does not have to be any single moment during
> which no processes are executing a critical section.
>
> For example, the following is acceptable:
>
> CPU 0: start of synchronize_rcu()......end
> CPU 1: rcu_lock().....................rcu_unlock()
> CPU 2: rcu_lock().......................rcu_unlock()

You are quite right, thank you! Yes, the time outside of an RCU
read-side critical section for a given process/CPU/task/whatever need
not be simultaneous with any other process/CPU/task/whatever.
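The distinction between the per-CPU requirement and a global quiescent moment can be sketched with a toy interval model in Python (the intervals below are made up for illustration, not taken from this thread):

```python
# Toy timeline model: a critical section (CS) or grace period (GP) is a
# closed interval of time, and we sample moments within the GP.
def active(interval, t):
    lo, hi = interval
    return lo <= t <= hi

gp = (0.0, 10.0)
critical_sections = {"CPU1": (0.0, 6.0), "CPU2": (5.0, 10.0)}
moments = [gp[0] + k * (gp[1] - gp[0]) / 20 for k in range(21)]

# Per-CPU requirement: each CPU has some moment during the GP at which
# it is outside its own critical section.
per_cpu_ok = all(
    any(not active(cs, t) for t in moments)
    for cs in critical_sections.values()
)

# But there need not be a single moment at which *no* CPU is in a CS:
# here CPU1's and CPU2's critical sections jointly cover the whole GP.
global_moment = any(
    all(not active(cs, t) for cs in critical_sections.values())
    for t in moments
)

print(per_cpu_ok, global_moment)   # True False
```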

Thanx, Paul

2023-01-14 17:18:28

by Alan Stern

[permalink] [raw]
Subject: Re: Internal vs. external barriers (was: Re: Interesting LKMM litmus test)

On Fri, Jan 13, 2023 at 12:07:06PM -0800, Paul E. McKenney wrote:
> What Alan said! You could even have distinct partially overlapping
> grace periods, as the Linux kernel actually does have courtesy of normal
> grace periods via synchronize_rcu() and expedited grace periods via
> synchronize_rcu_expedited().

Or just two different CPUs making overlapping calls to
synchronize_rcu().

Alan

2023-01-14 17:25:08

by Alan Stern

[permalink] [raw]
Subject: Re: Internal vs. external barriers (was: Re: Interesting LKMM litmus test)

On Fri, Jan 13, 2023 at 02:55:34PM +0000, Jonas Oberhauser wrote:
> I think the whole rcu-order topic can be summarized as the 'one rule': "if a grace period happens before an rcsc-unlock, it must also happen before the rcsc-lock, and analogously if an rcsc-lock happens before a grace period, the rcsc-unlock also happens before the grace period".

There is more to it than that, as I mentioned earlier. A complete
description can be found in the explanation.txt document; it says:

For any critical section C and any grace period G, at least
one of the following statements must hold:

(1) C ends before G does, and in addition, every store that
propagates to C's CPU before the end of C must propagate to
every CPU before G ends.

(2) G starts before C does, and in addition, every store that
propagates to G's CPU before the start of G must propagate
to every CPU before C starts.
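For intuition, these two conditions are what forbid the cycle in the classic RCU message-passing pattern. A hypothetical litmus test (standard herd7 linux-kernel syntax, not one from this thread) in which the guarantee forces r2=1 whenever r1=1, so the exists clause can never be satisfied:

```
C rcu-mp-sketch

(*
 * Result: Never
 *)

{}

P0(int *x, int *y)
{
	int r1;
	int r2;

	rcu_read_lock();
	r1 = READ_ONCE(*x);
	r2 = READ_ONCE(*y);
	rcu_read_unlock();
}

P1(int *x, int *y)
{
	WRITE_ONCE(*y, 1);
	synchronize_rcu();
	WRITE_ONCE(*x, 1);
}

exists (0:r1=1 /\ 0:r2=0)
```

If r1=1, the critical section saw the post-grace-period store, so by condition (2) the pre-grace-period store to y must have propagated everywhere before the critical section started, forcing r2=1.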

Alan

2023-01-14 17:48:26

by Alan Stern

[permalink] [raw]
Subject: Re: Internal vs. external barriers (was: Re: Interesting LKMM litmus test)

On Fri, Jan 13, 2023 at 08:14:08PM +0000, Jonas Oberhauser wrote:
> > [...] cat uses least-fixed-point definitions.
>
> Ah, but it's probably stricter than this. Even least fixed points are not always unique for these types of equations, e.g.,
> let rec either-x-or-y = if x \in either-x-or-y then {x} else {y}
> both {x} and {y} are least fixed points of this relation.
> But I believe that cat will use only the {y} solution, right? (Yes this isn't cat syntax, but you can probably do something similar in cat using intersections, set difference, and unions?).

As I understand it, herd handles a recursive definition by starting
from an empty relation and repeatedly calculating a new value from the
definition until there are no more changes.
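This iteration can be sketched in Python (a toy model, not herd code; events are integers and relations are sets of pairs, with names chosen to mirror the srcu-rscs recursion):

```python
def lfp(f, bottom=frozenset()):
    """Least fixed point by Kleene iteration: start from the empty
    relation and reapply f until nothing changes (f must be monotone)."""
    cur = bottom
    while True:
        nxt = f(cur)
        if nxt == cur:
            return cur
        cur = nxt

# Toy instance mirroring the srcu-rscs recursion: events 0..3 are
# L1 L2 U2 U1 in program order, and po is the strict order on them.
locks, unlocks = {0, 1}, {2, 3}
po = {(i, j) for i in range(4) for j in range(4) if i < j}

def step(matched):
    ml = {l for (l, u) in matched}            # already-matched locks
    mu = {u for (l, u) in matched}            # already-matched unlocks
    ul, uu = locks - ml, unlocks - mu         # still unmatched
    unm = ul | uu
    unm_po = {(a, b) for (a, b) in po if a in unm and b in unm}
    lock_to_unlock = {(a, b) for (a, b) in po if a in ul and b in uu}
    # lock-unlock pairs with some unmatched event strictly between them
    two_step = {(a, c) for (a, b) in unm_po
                       for (b2, c) in unm_po if b == b2}
    return matched | (lock_to_unlock - two_step)

print(sorted(lfp(step)))   # [(0, 3), (1, 2)]: L1-U1 and L2-U2
```

The first pass matches only the innermost pair L2-U2; the second pass, with those events removed from the unmatched sets, matches L1-U1.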


> > IMO it's generally better to think of grace periods as not being instantaneous but as occurring over a prolonged period of time. Thus we should say: If a grace period ends before an unlock occurs, it must start before the corresponding lock. And contrapositively, if a lock occurs before a grace period starts, the corresponding unlock must occur before the grace period ends.
>
> I started thinking about it like this and comparing start/end times. That made it more complicated, but the math came out the same in the end. I could imagine that there are some scenarios where the intuition of collapsing the grace period to a single event could cause problems, but I haven't seen any.

IIRC (and it has been a long time), this may be vaguely connected with
the reason why the definitions of gp, rcu-link, and rcu-fence have po on
one side but po? on the other. There was also something about what
should happen when you have two grace periods in a row. But I can't
remember the details.


> > SRCU is exactly like RCU except for one aspect: The SRCU primitives (synchronize_srcu(), srcu_lock(), and srcu_unlock()) each take an argument, a pointer to an srcu structure. The ordering restrictions apply only in cases where the arguments to the corresponding primitives point to the _same_ srcu structure. That's why you see all those "& loc" expressions sprinkled throughout the definitions of srcu-rscs and rcu-order.
>
> I see. So in a sense it's like fine-grained RCU? I also saw Paul's e-mail hinting at some other differences, which remind me of sequence locking.
> Is it something like speculative RCU that is fine-grained to keep the number of false-positive aborts low?

Not at all. In terms of the memory model, agreement of the srcu
structure pointers (and passing the value returned from
srcu_read_lock() to the corresponding srcu_read_unlock(), as Paul
mentioned) is the only difference.

But in terms of actual kernel programming there is another, HUGE
difference: Code executing inside an RCU read-side critical section is
not allowed to sleep or be preempted, whereas code executing inside an
SRCU read-side critical section _is_ allowed. That's where the "S" in
"SRCU" comes from; it stands for "Sleepable".


> > > If this is true, defining rcu-order like this cuts away roughly half
> > > of the cases (ignoring srcu, rcu-order has 6 cases and rcu-extend has
> > > 3),
>
> > The definition of rcu-link' should count as a case, so you have only eliminated one-third of the cases. :-)
>
> The only way I'd count rcu-link' as adding a case is if you say that the (...)* has two cases :D (or infinitely many :D)
> I don't count the existence of the definition because you could always inline it (but lose a lot of clarity imho).

If you did inline it, you'd probably find that the end result was
exactly what is currently in the LKMM.


> > The lone gp is present in the definition of rcu-order because I wanted to express explicitly the condition that the number of grace periods in a cycle must be _at least_ as large as the number of critical sections. There can be more grace periods, meaning that some of them need not be paired with a critical section.
> > However, it's true that (rcu-link ; gp ; rcu-link) is a sub-relation of rcu-link.
> > Hence the only need for a lone gp in these definitions is to cover the case where there are no critical sections at all in the cycle (which I would like to be forbidden by the rcu axiom, even if it's already forbidden by the propagation axiom).
>
> Right, but that counting condition isn't the heart of RCU, it rather seems like an observation that helped formalize RCU at the time.
> If for example in the paper version you had excluded gp-link from the rcu-path, then the counting condition would be that the number of rcsc-link and gp-link in each rcu-path must be exactly equal.
> [The rest of the proof shouldn't be impacted (as you already point out in the paper, link ; strong-fence ; link <= link). The special case gp-link <= rcu-path is also already resolved by hb*;pb+ being irreflexive, since irreflexive(gp-link) is irreflexive(link;gp) and link;gp = hb*;pb*;prop;gp <= hb*;pb*;pb = hb*;pb+.]
>
> Similarly, I don't yet understand why lone gp should be mentioned in the rcu axiom. To me the fact that it works on its own is just a consequence of the strong-fence used to implement RCU. Not really a specific RCU semantics/mechanism.

It's RCU-specific in the sense that gp is an RCU primitive.

> I would prefer the rcu axiom to focus on and have a clear correspondence to the fundamental law of RCU (using the nice term from the ASPLOS paper), with as close a correspondence as possible (within constraints, like what is easy to compute and the fact that the law is, well, um, "primarily empirical in nature").

Actually it isn't, not any more. That quote was written before we
formalized RCU in the LKMM.

In the end it doesn't matter, since cycles with grace periods but no
critical sections are already handled by the propagation axiom.

Alan

2023-01-14 17:53:58

by Alan Stern

[permalink] [raw]
Subject: Re: Internal vs. external barriers (was: Re: Interesting LKMM litmus test)

On Fri, Jan 13, 2023 at 12:32:41PM -0800, Paul E. McKenney wrote:
> > Making LKMM correctly model all of this has been on my todo list for an
> > embarrassingly long time.
>
> But there is no time like the present...
>
> Here is what mainline has to recognize SRCU read-side critical sections:
>
> ------------------------------------------------------------------------
>
> (* Compute matching pairs of nested Srcu-lock and Srcu-unlock *)
> let srcu-rscs = let rec
> 	    unmatched-locks = Srcu-lock \ domain(matched)
> 	and unmatched-unlocks = Srcu-unlock \ range(matched)
> 	and unmatched = unmatched-locks | unmatched-unlocks
> 	and unmatched-po = ([unmatched] ; po ; [unmatched]) & loc
> 	and unmatched-locks-to-unlocks =
> 		([unmatched-locks] ; po ; [unmatched-unlocks]) & loc
> 	and matched = matched | (unmatched-locks-to-unlocks \
> 		(unmatched-po ; unmatched-po))
> 	in matched
>
> (* Validate nesting *)
> flag ~empty Srcu-lock \ domain(srcu-rscs) as unbalanced-srcu-locking
> flag ~empty Srcu-unlock \ range(srcu-rscs) as unbalanced-srcu-locking
>
> (* Check for use of synchronize_srcu() inside an RCU critical section *)
> flag ~empty rcu-rscs & (po ; [Sync-srcu] ; po) as invalid-sleep
>
> (* Validate SRCU dynamic match *)
> flag ~empty different-values(srcu-rscs) as srcu-bad-nesting
>
> ------------------------------------------------------------------------
>
> And here is what I just now tried:
>
> ------------------------------------------------------------------------
>
> (* Compute matching pairs of Srcu-lock and Srcu-unlock *)
> let srcu-rscs = ([Srcu-lock] ; rfi ; [Srcu-unlock]) & loc

This doesn't make sense. Herd treats srcu_read_lock() as a load
operation (it takes a pointer as argument and returns a value) and
srcu_read_unlock() as a store operation (it takes both a pointer and a
value as arguments and returns nothing). So you can't connect them
with an rfi link; stores don't "read-from" loads.

I suppose you might be able to connect them with a data dependency,
though. But then how would you handle situations where two unlock
calls both use the value returned from a single lock call? You'd have
to check explicitly that srcu-rscs connected each lock with only one
unlock.
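For concreteness, the data-dependency idea might look something like this in cat (purely an illustrative sketch, not a proposed patch; whether plain `data` suffices, e.g. for a value carried through memory between lock and unlock, is exactly the kind of question that would need checking):

```
(* Sketch: match an srcu_read_lock() to each srcu_read_unlock()
   that uses the value it returned, on the same srcu structure *)
let srcu-rscs = ([Srcu-lock] ; data ; [Srcu-unlock]) & loc

(* Flag a single lock feeding more than one unlock *)
flag ~empty (srcu-rscs^-1 ; srcu-rscs) \ id as multiple-srcu-matches
```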

Alan

> (* Validate nesting *)
> flag empty srcu-rscs as no-srcu-readers
> flag ~empty Srcu-lock \ domain(srcu-rscs) as unbalanced-srcu-locking
> flag ~empty Srcu-unlock \ range(srcu-rscs) as unbalanced-srcu-locking
>
> (* Check for use of synchronize_srcu() inside an RCU critical section *)
> flag ~empty rcu-rscs & (po ; [Sync-srcu] ; po) as invalid-sleep
>
> (* Validate SRCU dynamic match *)
> flag ~empty different-values(srcu-rscs) as srcu-bad-nesting
>
> ------------------------------------------------------------------------
>
> This gets me "Flag no-srcu-readers" when running this litmus test:
>
> ------------------------------------------------------------------------
>
> C C-srcu-nest-1
>
> (*
> * Result: Never
> *)
>
> {}
>
> P0(int *x, int *y, struct srcu_struct *s)
> {
> 	int r1;
> 	int r2;
> 	int r3;
>
> 	r3 = srcu_read_lock(s);
> 	r1 = READ_ONCE(*x);
> 	srcu_read_unlock(s, r3);
> 	r3 = srcu_read_lock(s);
> 	r2 = READ_ONCE(*y);
> 	srcu_read_unlock(s, r3);
> }
>
> P1(int *x, int *y, struct srcu_struct *s)
> {
> 	WRITE_ONCE(*y, 1);
> 	synchronize_srcu(s);
> 	WRITE_ONCE(*x, 1);
> }
>
> locations [0:r1]
> exists (0:r1=1 /\ 0:r2=0)
>
> ------------------------------------------------------------------------
>
> So what did I mess up this time? ;-)
>
> Thanx, Paul

2023-01-14 18:04:11

by Paul E. McKenney

[permalink] [raw]
Subject: Re: Internal vs. external barriers (was: Re: Interesting LKMM litmus test)

On Fri, Jan 13, 2023 at 05:05:11PM +0000, Jonas Oberhauser wrote:
> -----Original Message-----
> From: Alan Stern [mailto:[email protected]]
> Sent: Friday, January 13, 2023 5:38 PM
>
> > Strictly speaking, this is not right. It should say: For each process/CPU/task/whatever, an RCU grace period includes at least one moment in time during which that process is not within an RCU read-side critical section. There does not have to be any single moment during which no processes are executing a critical section.
>
> I see. I guess the other thing is more like a quiescent period.

"Quiescent period" was in fact my original name for "grace period"
back in the day, but a chorus of objections eventually prompted me to
instead label it a "grace period".

Perhaps you have given an improved rationale for their objections. ;-)

> I
> think the fact that RCU/safe memory reclamation(SMR) don't require a
> quiescent period is an important distinction, and even though we have
> our own SMR I never thought too deeply about this distinction.

If you want non-abysmal performance and scalability on modern hardware,
the distinction is critically important. After all, the speed of light
really is finite, and atoms are of non-zero size. And to the complete
surprise of my forty-years-ago self, these laws of physics seriously
constrain modern computing devices.

Thanx, Paul

2023-01-14 18:11:48

by Paul E. McKenney

[permalink] [raw]
Subject: Re: Internal vs. external barriers (was: Re: Interesting LKMM litmus test)

On Sat, Jan 14, 2023 at 11:55:17AM -0500, Alan Stern wrote:
> On Fri, Jan 13, 2023 at 12:07:06PM -0800, Paul E. McKenney wrote:
> > What Alan said! You could even have distinct partially overlapping
> > grace periods, as the Linux kernel actually does have courtesy of normal
> > grace periods via synchronize_rcu() and expedited grace periods via
> > synchronize_rcu_expedited().
>
> Or just two different CPUs making overlapping calls to
> synchronize_rcu().

True, there could be two overlapping grace periods in that case.
If nothing else, one of those synchronize_rcu() calls could be preempted
for 500 milliseconds upon entry, thus overlapping many grace periods from
the viewpoint of the user. Which is the viewpoint that LKMM should of
course take.

But because I had just been digging in the internals, I was taking
an implementation-centric viewpoint. From that viewpoint, there is at
most one normal grace period in flight at a time. But even from that
viewpoint, there can also be a single independent expedited grace period
in flight at any given point in time, so partial overlap can happen even
from an implementation-centric viewpoint.

Which in no way negates or weakens your point, just letting you guys
know where I was coming from. Because if things go as they normally do,
there will be a future discussion where my head will once again be deep
into the implementation. ;-)

Thanx, Paul

2023-01-14 18:14:21

by Paul E. McKenney

[permalink] [raw]
Subject: Re: Internal vs. external barriers (was: Re: Interesting LKMM litmus test)

On Sat, Jan 14, 2023 at 12:40:39PM -0500, Alan Stern wrote:
> On Fri, Jan 13, 2023 at 12:32:41PM -0800, Paul E. McKenney wrote:
> > > Making LKMM correctly model all of this has been on my todo list for an
> > > embarrassingly long time.
> >
> > But there is no time like the present...
> >
> > Here is what mainline has to recognize SRCU read-side critical sections:
> >
> > ------------------------------------------------------------------------
> >
> > (* Compute matching pairs of nested Srcu-lock and Srcu-unlock *)
> > let srcu-rscs = let rec
> > 	    unmatched-locks = Srcu-lock \ domain(matched)
> > 	and unmatched-unlocks = Srcu-unlock \ range(matched)
> > 	and unmatched = unmatched-locks | unmatched-unlocks
> > 	and unmatched-po = ([unmatched] ; po ; [unmatched]) & loc
> > 	and unmatched-locks-to-unlocks =
> > 		([unmatched-locks] ; po ; [unmatched-unlocks]) & loc
> > 	and matched = matched | (unmatched-locks-to-unlocks \
> > 		(unmatched-po ; unmatched-po))
> > 	in matched
> >
> > (* Validate nesting *)
> > flag ~empty Srcu-lock \ domain(srcu-rscs) as unbalanced-srcu-locking
> > flag ~empty Srcu-unlock \ range(srcu-rscs) as unbalanced-srcu-locking
> >
> > (* Check for use of synchronize_srcu() inside an RCU critical section *)
> > flag ~empty rcu-rscs & (po ; [Sync-srcu] ; po) as invalid-sleep
> >
> > (* Validate SRCU dynamic match *)
> > flag ~empty different-values(srcu-rscs) as srcu-bad-nesting
> >
> > ------------------------------------------------------------------------
> >
> > And here is what I just now tried:
> >
> > ------------------------------------------------------------------------
> >
> > (* Compute matching pairs of Srcu-lock and Srcu-unlock *)
> > let srcu-rscs = ([Srcu-lock] ; rfi ; [Srcu-unlock]) & loc
>
> This doesn't make sense. Herd treats srcu_read_lock() as a load
> operation (it takes a pointer as argument and returns a value) and
> srcu_read_unlock() as a store operation (it takes both a pointer and a
> value as arguments and returns nothing). So you can't connect them
> with an rfi link; stores don't "read-from" loads.
>
> I suppose you might be able to connect them with a data dependency,
> though. But then how would you handle situations where two unlock
> calls both use the value returned from a single lock call? You'd have
> to check explicitly that srcu-rscs connected each lock with only one
> unlock.

Thank you! I will give the dependencies a try.

Thanx, Paul

> Alan
>
> > (* Validate nesting *)
> > flag empty srcu-rscs as no-srcu-readers
> > flag ~empty Srcu-lock \ domain(srcu-rscs) as unbalanced-srcu-locking
> > flag ~empty Srcu-unlock \ range(srcu-rscs) as unbalanced-srcu-locking
> >
> > (* Check for use of synchronize_srcu() inside an RCU critical section *)
> > flag ~empty rcu-rscs & (po ; [Sync-srcu] ; po) as invalid-sleep
> >
> > (* Validate SRCU dynamic match *)
> > flag ~empty different-values(srcu-rscs) as srcu-bad-nesting
> >
> > ------------------------------------------------------------------------
> >
> > This gets me "Flag no-srcu-readers" when running this litmus test:
> >
> > ------------------------------------------------------------------------
> >
> > C C-srcu-nest-1
> >
> > (*
> > * Result: Never
> > *)
> >
> > {}
> >
> > P0(int *x, int *y, struct srcu_struct *s)
> > {
> > int r1;
> > int r2;
> > int r3;
> >
> > r3 = srcu_read_lock(s);
> > r1 = READ_ONCE(*x);
> > srcu_read_unlock(s, r3);
> > r3 = srcu_read_lock(s);
> > r2 = READ_ONCE(*y);
> > srcu_read_unlock(s, r3);
> > }
> >
> > P1(int *x, int *y, struct srcu_struct *s)
> > {
> > WRITE_ONCE(*y, 1);
> > synchronize_srcu(s);
> > WRITE_ONCE(*x, 1);
> > }
> >
> > locations [0:r1]
> > exists (0:r1=1 /\ 0:r2=0)
> >
> > ------------------------------------------------------------------------
> >
> > So what did I mess up this time? ;-)
> >
> > Thanx, Paul

2023-01-14 18:18:02

by Paul E. McKenney

Subject: Re: Internal vs. external barriers (was: Re: Interesting LKMM litmus test)

On Fri, Jan 13, 2023 at 08:43:42PM +0000, Jonas Oberhauser wrote:
>
>
> -----Original Message-----
> From: Paul E. McKenney [mailto:[email protected]]
>
> > (* Compute matching pairs of Srcu-lock and Srcu-unlock *) let srcu-rscs = ([Srcu-lock] ; rfi ; [Srcu-unlock]) & loc
>
> How does the Srcu-unlock read from the Srcu-lock? Is there something in your model or in herd that lets it understand lock and unlock should be treated as writes resp. reads from that specific location?
>
> Or do you mean that value given to Srcu-unlock should be the value produced by Srcu-lock?

Yes, and in the Linux kernel one does something like this:

idx = srcu_read_lock(&mysrcu);
// critical section
srcu_read_unlock(&mysrcu, idx);

> Perhaps the closest to what you want is to express that as a data dependency if you know how to teach herd that Srcu-unlock is a read and Srcu-lock depends on its second input :D (I have no idea how to do that, hence the questions above)

Given that both you and Alan suggested it, I must try it. ;-)

> Then you could flag if there's a data dependency from more than one event, meaning that the value is not purely the Srcu-lock-produced value.

Good point, thank you!

> That doesn't guarantee you that you don't do some nasty stuff with constant values though.

That is the purpose of this statement in the srcu_read_unlock()
implementation:

WARN_ON_ONCE(idx & ~0x1);

But yes, this WARN_ON_ONCE() can of course be fooled.

> Could you maybe use an opaque datatype for the values?

We could put it in a struct or something. The problem with making it
completely opaque is that the users must store it somewhere, which means
that the compiler needs to know how big it is. Of course, languages
other than C have other ways to make this happen. And correspondingly
slower build times. ;-)

Thanx, Paul

2023-01-14 18:42:38

by Paul E. McKenney

Subject: Re: Internal vs. external barriers (was: Re: Interesting LKMM litmus test)

On Sat, Jan 14, 2023 at 09:53:43AM -0800, Paul E. McKenney wrote:
> On Fri, Jan 13, 2023 at 08:43:42PM +0000, Jonas Oberhauser wrote:
> >
> >
> > -----Original Message-----
> > From: Paul E. McKenney [mailto:[email protected]]
> >
> > > (* Compute matching pairs of Srcu-lock and Srcu-unlock *) let srcu-rscs = ([Srcu-lock] ; rfi ; [Srcu-unlock]) & loc
> >
> > How does the Srcu-unlock read from the Srcu-lock? Is there something in your model or in herd that lets it understand lock and unlock should be treated as writes resp. reads from that specific location?
> >
> > Or do you mean that value given to Srcu-unlock should be the value produced by Srcu-lock?
>
> Yes, and in the Linux kernel one does something like this:
>
> idx = srcu_read_lock(&mysrcu);
> // critical section
> srcu_read_unlock(&mysrcu, idx);
>
> > Perhaps the closest to what you want is to express that as a data dependency if you know how to teach herd that Srcu-unlock is a read and Srcu-lock depends on its second input :D (I have no idea how to do that, hence the questions above)
>
> Given that both you and Alan suggested it, I must try it. ;-)

And it works as desired on these litmus tests:

manual/kernel/C-srcu-nest-*.litmus

In this repository:

https://github.com/paulmckrcu/litmus

However, this has to be dumb luck because herd7 does not yet provide
the second argument to srcu_read_unlock(). My guess is that herd7
is noting the dependency that is being carried by the pointers to the
srcu_struct structures. This guess stems in part from the fact that
I get "Flag unbalanced-srcu-locking" when I have one SRCU read-side
critical section following another in the same process, both using the
same srcu_struct structure.

Nevertheless, here is the resulting .bell fragment:

------------------------------------------------------------------------

(* Compute matching pairs of Srcu-lock and Srcu-unlock *)
let srcu-rscs = ([Srcu-lock] ; data ; [Srcu-unlock]) & loc

(* Validate nesting *)
flag ~empty Srcu-lock \ domain(srcu-rscs) as unbalanced-srcu-locking
flag ~empty Srcu-unlock \ range(srcu-rscs) as unbalanced-srcu-locking

(* Check for use of synchronize_srcu() inside an RCU critical section *)
flag ~empty rcu-rscs & (po ; [Sync-srcu] ; po) as invalid-sleep

(* Validate SRCU dynamic match *)
flag ~empty different-values(srcu-rscs) as srcu-bad-nesting

------------------------------------------------------------------------

I also created a C-srcu-nest-*.litmus as shown below, and LKMM does
complain about one srcu_read_lock() feeding into multiple instances of
srcu_read_unlock(). The complaint comes from the different_values()
check, which presumably complains about any duplication in the domain
or range of the specified relation.

But still working by accident! ;-)

Thanx, Paul

------------------------------------------------------------------------

C C-srcu-nest-3

(*
* Result: Flag srcu-bad-nesting
*
* This demonstrates erroneous matching of a single srcu_read_lock()
* with multiple srcu_read_unlock() instances.
*)

{}

P0(int *x, int *y, struct srcu_struct *s1, struct srcu_struct *s2)
{
	int r1;
	int r2;
	int r3;
	int r4;

	r3 = srcu_read_lock(s1);
	r2 = READ_ONCE(*y);
	r4 = srcu_read_lock(s2);
	r5 = srcu_read_lock(s2);
	srcu_read_unlock(s1, r3);
	r1 = READ_ONCE(*x);
	srcu_read_unlock(s2, r4);
}

P1(int *x, int *y, struct srcu_struct *s2)
{
	WRITE_ONCE(*y, 1);
	synchronize_srcu(s2);
	WRITE_ONCE(*x, 1);
}

locations [0:r1]
exists (0:r1=1 /\ 0:r2=0)

2023-01-14 20:22:09

by Alan Stern

Subject: Re: Internal vs. external barriers (was: Re: Interesting LKMM litmus test)

On Sat, Jan 14, 2023 at 10:15:37AM -0800, Paul E. McKenney wrote:
> > > Perhaps the closest to what you want is to express that as a data dependency if you know how to teach herd that Srcu-unlock is a read and Srcu-lock depends on its second input :D (I have no idea how to do that, hence the questions above)
> >
> > Given that both you and Alan suggested it, I must try it. ;-)
>
> And it works as desired on these litmus tests:
>
> manual/kernel/C-srcu-nest-*.litmus
>
> In this repository:
>
> https://github.com/paulmckrcu/litmus
>
> However, this has to be dumb luck because herd7 does not yet provide
> the second argument to srcu_read_unlock().

Yes it does. Grep for srcu_read_unlock in linux-kernel.def and you'll
see two arguments.

> My guess is that the herd7
> is noting the dependency that is being carried by the pointers to the
> srcu_struct structures.

That is not a dependency.

> This guess stems in part from the fact that
> I get "Flag unbalanced-srcu-locking" when I have one SRCU read-side
> critical section following another in the same process, both using the
> same srcu_struct structure.
>
> Nevertheless, here is the resulting .bell fragment:
>
> ------------------------------------------------------------------------
>
> (* Compute matching pairs of Srcu-lock and Srcu-unlock *)
> let srcu-rscs = ([Srcu-lock] ; data ; [Srcu-unlock]) & loc
>
> (* Validate nesting *)
> flag ~empty Srcu-lock \ domain(srcu-rscs) as unbalanced-srcu-locking
> flag ~empty Srcu-unlock \ range(srcu-rscs) as unbalanced-srcu-locking
>
> (* Check for use of synchronize_srcu() inside an RCU critical section *)
> flag ~empty rcu-rscs & (po ; [Sync-srcu] ; po) as invalid-sleep
>
> (* Validate SRCU dynamic match *)
> flag ~empty different-values(srcu-rscs) as srcu-bad-nesting
>
> ------------------------------------------------------------------------
>
> I also created a C-srcu-nest-*.litmus as shown below, and LKMM does
> complain about one srcu_read_lock() feeding into multiple instances of
> srcu_read_unlock().

It shouldn't; that doesn't happen in the litmus test below. But the
test does contain an srcu_read_lock() that doesn't match any instances
of srcu_read_unlock(), so you should be getting an
"unbalanced-srcu-locking" complaint -- and indeed, you mentioned above
that this does happen.

Also, your bell file doesn't contain a check for a lock matched with
multiple unlocks, so there's no way for herd to complain about it.

> The complaint comes from the different_values()
> check, which presumably complains about any duplication in the domain
> or range of the specified relation.

No; different_values() holds when the values of the two events
linked by srcu-rscs are different. It has nothing to do with
duplication.

> But still working by accident! ;-)
>
> Thanx, Paul
>
> ------------------------------------------------------------------------
>
> C C-srcu-nest-3
>
> (*
> * Result: Flag srcu-bad-nesting
> *
> * This demonstrates erroneous matching of a single srcu_read_lock()
> * with multiple srcu_read_unlock() instances.
> *)
>
> {}
>
> P0(int *x, int *y, struct srcu_struct *s1, struct srcu_struct *s2)
> {
> int r1;
> int r2;
> int r3;
> int r4;
>
> r3 = srcu_read_lock(s1);
> r2 = READ_ONCE(*y);
> r4 = srcu_read_lock(s2);
> r5 = srcu_read_lock(s2);
> srcu_read_unlock(s1, r3);
> r1 = READ_ONCE(*x);
> srcu_read_unlock(s2, r4);
> }

This has 3 locks and 2 unlocks. The first lock matches the first
unlock (r3 and s1), the second lock matches the second unlock (r4 and
s2), and the third lock doesn't match any unlock (r5 and s2).

Alan

>
> P1(int *x, int *y, struct srcu_struct *s2)
> {
> WRITE_ONCE(*y, 1);
> synchronize_srcu(s2);
> WRITE_ONCE(*x, 1);
> }
>
> locations [0:r1]
> exists (0:r1=1 /\ 0:r2=0)

2023-01-14 20:49:51

by Alan Stern

Subject: Re: Internal vs. external barriers (was: Re: Interesting LKMM litmus test)

On Sat, Jan 14, 2023 at 10:15:37AM -0800, Paul E. McKenney wrote:
> Nevertheless, here is the resulting .bell fragment:
>
> ------------------------------------------------------------------------
>
> (* Compute matching pairs of Srcu-lock and Srcu-unlock *)
> let srcu-rscs = ([Srcu-lock] ; data ; [Srcu-unlock]) & loc
>
> (* Validate nesting *)
> flag ~empty Srcu-lock \ domain(srcu-rscs) as unbalanced-srcu-locking
> flag ~empty Srcu-unlock \ range(srcu-rscs) as unbalanced-srcu-locking
>
> (* Check for use of synchronize_srcu() inside an RCU critical section *)
> flag ~empty rcu-rscs & (po ; [Sync-srcu] ; po) as invalid-sleep
>
> (* Validate SRCU dynamic match *)
> flag ~empty different-values(srcu-rscs) as srcu-bad-nesting

I forgot to mention... An appropriate check for one srcu_read_lock()
matched to more than one srcu_read_unlock() would be something like
this:

flag ~empty (srcu-rscs^-1 ; srcu-rscs) \ id as multiple-unlocks
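For instance, in a hypothetical fragment like the following, a
data-based srcu-rscs would link the single lock to both unlocks, so
srcu-rscs^-1 ; srcu-rscs would contain a pair of distinct unlock events
outside id and trip the flag:

```
r1 = srcu_read_lock(s);
srcu_read_unlock(s, r1);
srcu_read_unlock(s, r1); // second unlock reuses r1
```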

Alan

PS: Do you agree that we should change the names of the first two flags
above to unbalanced-srcu-lock and unbalanced-srcu-unlock, respectively
(and similarly for the rcu checks)? It might help to be a little more
specific about how the locking is wrong when we detect an error.

2023-01-15 05:34:39

by Paul E. McKenney

Subject: Re: Internal vs. external barriers (was: Re: Interesting LKMM litmus test)

On Sat, Jan 14, 2023 at 03:19:06PM -0500, Alan Stern wrote:
> On Sat, Jan 14, 2023 at 10:15:37AM -0800, Paul E. McKenney wrote:
> > Nevertheless, here is the resulting .bell fragment:
> >
> > ------------------------------------------------------------------------
> >
> > (* Compute matching pairs of Srcu-lock and Srcu-unlock *)
> > let srcu-rscs = ([Srcu-lock] ; data ; [Srcu-unlock]) & loc
> >
> > (* Validate nesting *)
> > flag ~empty Srcu-lock \ domain(srcu-rscs) as unbalanced-srcu-locking
> > flag ~empty Srcu-unlock \ range(srcu-rscs) as unbalanced-srcu-locking
> >
> > (* Check for use of synchronize_srcu() inside an RCU critical section *)
> > flag ~empty rcu-rscs & (po ; [Sync-srcu] ; po) as invalid-sleep
> >
> > (* Validate SRCU dynamic match *)
> > flag ~empty different-values(srcu-rscs) as srcu-bad-nesting
>
> I forgot to mention... An appropriate check for one srcu_read_lock()
> matched to more than one srcu_read_unlock() would be something like
> this:
>
> flag ~empty (srcu-rscs^-1 ; srcu-rscs) \ id as multiple-unlocks

I have added this, thank you!

> Alan
>
> PS: Do you agree that we should change the names of the first two flags
> above to unbalanced-srcu-lock and unbalanced-srcu-unlock, respectively
> (and similarly for the rcu checks)? It might help to be a little more
> specific about how the locking is wrong when we detect an error.

I have made this change, again, thank you!

But I also added this:

flag empty srcu-rscs as no-srcu-readers

And it is always flagged. So far, I have not found any sort of relation
that connects Srcu-lock to Srcu-unlock other than po. I tried data,
ctrl, addr, rf, rfi, and combinations thereof.

What am I missing here?

Thanx, Paul

2023-01-15 05:44:34

by Paul E. McKenney

Subject: Re: Internal vs. external barriers (was: Re: Interesting LKMM litmus test)

On Sat, Jan 14, 2023 at 02:58:29PM -0500, Alan Stern wrote:
> On Sat, Jan 14, 2023 at 10:15:37AM -0800, Paul E. McKenney wrote:
> > > > Perhaps the closest to what you want is to express that as a data dependency if you know how to teach herd that Srcu-unlock is a read and Srcu-lock depends on its second input :D (I have no idea how to do that, hence the questions above)
> > >
> > > Given that both you and Alan suggested it, I must try it. ;-)
> >
> > And it works as desired on these litmus tests:
> >
> > manual/kernel/C-srcu-nest-*.litmus
> >
> > In this repository:
> >
> > https://github.com/paulmckrcu/litmus
> >
> > However, this has to be dumb luck because herd7 does not yet provide
> > the second argument to srcu_read_unlock().
>
> Yes it does. Grep for srcu_read_unlock in linux-kernel.def and you'll
> see two arguments.

Right you are! Too early this morning...

> > My guess is that the herd7
> > is noting the dependency that is being carried by the pointers to the
> > srcu_struct structures.
>
> That is not a dependency.

You are right, and apparently neither is the value returned by
srcu_read_lock() and passed to srcu_read_unlock().

> > This guess stems in part from the fact that
> > I get "Flag unbalanced-srcu-locking" when I have one SRCU read-side
> > critical section following another in the same process, both using the
> > same srcu_struct structure.
> >
> > Nevertheless, here is the resulting .bell fragment:
> >
> > ------------------------------------------------------------------------
> >
> > (* Compute matching pairs of Srcu-lock and Srcu-unlock *)
> > let srcu-rscs = ([Srcu-lock] ; data ; [Srcu-unlock]) & loc
> >
> > (* Validate nesting *)
> > flag ~empty Srcu-lock \ domain(srcu-rscs) as unbalanced-srcu-locking
> > flag ~empty Srcu-unlock \ range(srcu-rscs) as unbalanced-srcu-locking
> >
> > (* Check for use of synchronize_srcu() inside an RCU critical section *)
> > flag ~empty rcu-rscs & (po ; [Sync-srcu] ; po) as invalid-sleep
> >
> > (* Validate SRCU dynamic match *)
> > flag ~empty different-values(srcu-rscs) as srcu-bad-nesting
> >
> > ------------------------------------------------------------------------
> >
> > I also created a C-srcu-nest-*.litmus as shown below, and LKMM does
> > complain about one srcu_read_lock() feeding into multiple instances of
> > srcu_read_unlock().
>
> It shouldn't; that doesn't happen in the litmus test below. But the
> test does contain an srcu_read_lock() that doesn't match any instances
> of srcu_read_unlock(), so you should be getting an
> "unbalanced-srcu-locking" complaint -- and indeed, you mentioned above
> that this does happen.
>
> Also, your bell file doesn't contain a check for a lock matched with
> multiple unlocks, so there's no way for herd to complain about it.

Agreed!

> > The complaint comes from the different_values()
> > check, which presumably complains about any duplication in the domain
> > or range of the specified relation.
>
> No; different_values() holds when the values of the two events
> linked by srcu-rscs are different. It has nothing to do with
> duplication.

I removed the different_values() check and one of the complaints
went away, but yes, the other one did not.

> > But still working by accident! ;-)
> >
> > Thanx, Paul
> >
> > ------------------------------------------------------------------------
> >
> > C C-srcu-nest-3
> >
> > (*
> > * Result: Flag srcu-bad-nesting
> > *
> > * This demonstrates erroneous matching of a single srcu_read_lock()
> > * with multiple srcu_read_unlock() instances.
> > *)
> >
> > {}
> >
> > P0(int *x, int *y, struct srcu_struct *s1, struct srcu_struct *s2)
> > {
> > int r1;
> > int r2;
> > int r3;
> > int r4;
> >
> > r3 = srcu_read_lock(s1);
> > r2 = READ_ONCE(*y);
> > r4 = srcu_read_lock(s2);
> > r5 = srcu_read_lock(s2);
> > srcu_read_unlock(s1, r3);
> > r1 = READ_ONCE(*x);
> > srcu_read_unlock(s2, r4);
> > }
>
> This has 3 locks and 2 unlocks. The first lock matches the first
> unlock (r3 and s1), the second lock matches the second unlock (r4 and
> s2), and the third lock doesn't match any unlock (r5 and s2).

Thank you and fixed.

Thanx, Paul

> Alan
>
> >
> > P1(int *x, int *y, struct srcu_struct *s2)
> > {
> > WRITE_ONCE(*y, 1);
> > synchronize_srcu(s2);
> > WRITE_ONCE(*x, 1);
> > }
> >
> > locations [0:r1]
> > exists (0:r1=1 /\ 0:r2=0)

2023-01-15 17:07:12

by Alan Stern

Subject: Re: Internal vs. external barriers (was: Re: Interesting LKMM litmus test)

On Sat, Jan 14, 2023 at 09:15:10PM -0800, Paul E. McKenney wrote:
> On Sat, Jan 14, 2023 at 03:19:06PM -0500, Alan Stern wrote:
> > On Sat, Jan 14, 2023 at 10:15:37AM -0800, Paul E. McKenney wrote:
> > > Nevertheless, here is the resulting .bell fragment:
> > >
> > > ------------------------------------------------------------------------
> > >
> > > (* Compute matching pairs of Srcu-lock and Srcu-unlock *)
> > > let srcu-rscs = ([Srcu-lock] ; data ; [Srcu-unlock]) & loc
> > >
> > > (* Validate nesting *)
> > > flag ~empty Srcu-lock \ domain(srcu-rscs) as unbalanced-srcu-locking
> > > flag ~empty Srcu-unlock \ range(srcu-rscs) as unbalanced-srcu-locking
> > >
> > > (* Check for use of synchronize_srcu() inside an RCU critical section *)
> > > flag ~empty rcu-rscs & (po ; [Sync-srcu] ; po) as invalid-sleep
> > >
> > > (* Validate SRCU dynamic match *)
> > > flag ~empty different-values(srcu-rscs) as srcu-bad-nesting
> >
> > I forgot to mention... An appropriate check for one srcu_read_lock()
> > matched to more than one srcu_read_unlock() would be something like
> > this:
> >
> > flag ~empty (srcu-rscs^-1 ; srcu-rscs) \ id as multiple-unlocks
>
> I have added this, thank you!
>
> > Alan
> >
> > PS: Do you agree that we should change the names of the first two flags
> > above to unbalanced-srcu-lock and unbalanced-srcu-unlock, respectively
> > (and similarly for the rcu checks)? It might help to be a little more
> > specific about how the locking is wrong when we detect an error.
>
> I have made this change, again, thank you!
>
> But I also added this:
>
> flag empty srcu-rscs as no-srcu-readers
>
> And it is always flagged. So far, I have not found any sort of relation
> that connects Srcu-lock to Srcu-unlock other than po. I tried data,
> ctrl, addr, rf, rfi, and combinations thereof.
>
> What am I missing here?

I don't think you're missing anything. This is a matter for Boqun or
Luc; it must have something to do with the way herd treats the
srcu_read_lock() and srcu_read_unlock() primitives.

Alan

2023-01-15 18:37:09

by Paul E. McKenney

Subject: Re: Internal vs. external barriers (was: Re: Interesting LKMM litmus test)

On Sun, Jan 15, 2023 at 11:23:31AM -0500, Alan Stern wrote:
> On Sat, Jan 14, 2023 at 09:15:10PM -0800, Paul E. McKenney wrote:
> > On Sat, Jan 14, 2023 at 03:19:06PM -0500, Alan Stern wrote:
> > > On Sat, Jan 14, 2023 at 10:15:37AM -0800, Paul E. McKenney wrote:
> > > > Nevertheless, here is the resulting .bell fragment:
> > > >
> > > > ------------------------------------------------------------------------
> > > >
> > > > (* Compute matching pairs of Srcu-lock and Srcu-unlock *)
> > > > let srcu-rscs = ([Srcu-lock] ; data ; [Srcu-unlock]) & loc
> > > >
> > > > (* Validate nesting *)
> > > > flag ~empty Srcu-lock \ domain(srcu-rscs) as unbalanced-srcu-locking
> > > > flag ~empty Srcu-unlock \ range(srcu-rscs) as unbalanced-srcu-locking
> > > >
> > > > (* Check for use of synchronize_srcu() inside an RCU critical section *)
> > > > flag ~empty rcu-rscs & (po ; [Sync-srcu] ; po) as invalid-sleep
> > > >
> > > > (* Validate SRCU dynamic match *)
> > > > flag ~empty different-values(srcu-rscs) as srcu-bad-nesting
> > >
> > > I forgot to mention... An appropriate check for one srcu_read_lock()
> > > matched to more than one srcu_read_unlock() would be something like
> > > this:
> > >
> > > flag ~empty (srcu-rscs^-1 ; srcu-rscs) \ id as multiple-unlocks
> >
> > I have added this, thank you!
> >
> > > Alan
> > >
> > > PS: Do you agree that we should change the names of the first two flags
> > > above to unbalanced-srcu-lock and unbalanced-srcu-unlock, respectively
> > > (and similarly for the rcu checks)? It might help to be a little more
> > > specific about how the locking is wrong when we detect an error.
> >
> > I have made this change, again, thank you!
> >
> > But I also added this:
> >
> > flag empty srcu-rscs as no-srcu-readers
> >
> > And it is always flagged. So far, I have not found any sort of relation
> > that connects Srcu-lock to Srcu-unlock other than po. I tried data,
> > ctrl, addr, rf, rfi, and combinations thereof.
> >
> > What am I missing here?
>
> I don't think you're missing anything. This is a matter for Boqun or
> Luc; it must have something to do with the way herd treats the
> srcu_read_lock() and srcu_read_unlock() primitives.

It looks like we need something that tracks (data | rf)* between
the return value of srcu_read_lock() and the second parameter of
srcu_read_unlock(). The reason for rf rather than rfi is the upcoming
srcu_down_read() and srcu_up_read().
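In .bell terms, that might look something like this (a sketch only;
whether herd7 actually generates data and rf edges through these events
is exactly the open question above):

```
(* Hypothetical: match a lock to an unlock through any chain of data
   dependencies and reads-from edges, so that srcu_down_read() can hand
   its cookie to srcu_up_read() in another process via memory. *)
let srcu-rscs = ([Srcu-lock] ; (data | rf)* ; [Srcu-unlock]) & loc
```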

But what I will do in the meantime is to switch back to a commit that
simply flags nesting of same-srcu_struct SRCU read-side critical sections,
while blindly assuming that the return value of a given srcu_read_lock()
is passed in to the corresponding srcu_read_unlock():

------------------------------------------------------------------------

(* Compute matching pairs of Srcu-lock and Srcu-unlock, but prohibit nesting *)
let srcu-unmatched = Srcu-lock | Srcu-unlock
let srcu-unmatched-po = ([srcu-unmatched] ; po ; [srcu-unmatched]) & loc
let srcu-unmatched-locks-to-unlock = ([Srcu-lock] ; po ; [Srcu-unlock]) & loc
let srcu-rscs = srcu-unmatched-locks-to-unlock \ (srcu-unmatched-po ; srcu-unmatched-po)

(* Validate nesting *)
flag ~empty Srcu-lock \ domain(srcu-rscs) as unbalanced-srcu-locking
flag ~empty Srcu-unlock \ range(srcu-rscs) as unbalanced-srcu-locking

(* Check for use of synchronize_srcu() inside an RCU critical section *)
flag ~empty rcu-rscs & (po ; [Sync-srcu] ; po) as invalid-sleep

(* Validate SRCU dynamic match *)
flag ~empty different-values(srcu-rscs) as srcu-bad-nesting

------------------------------------------------------------------------

Or is there some better intermediate position that could be taken?

Thanx, Paul

2023-01-15 21:06:19

by Alan Stern

Subject: Re: Internal vs. external barriers (was: Re: Interesting LKMM litmus test)

On Sun, Jan 15, 2023 at 10:10:52AM -0800, Paul E. McKenney wrote:
> On Sun, Jan 15, 2023 at 11:23:31AM -0500, Alan Stern wrote:
> > On Sat, Jan 14, 2023 at 09:15:10PM -0800, Paul E. McKenney wrote:
> > > What am I missing here?
> >
> > I don't think you're missing anything. This is a matter for Boqun or
> > Luc; it must have something to do with the way herd treats the
> > srcu_read_lock() and srcu_read_unlock() primitives.
>
> It looks like we need something that tracks (data | rf)* between
> the return value of srcu_read_lock() and the second parameter of
> srcu_read_unlock(). The reason for rf rather than rfi is the upcoming
> srcu_down_read() and srcu_up_read().

Or just make herd treat srcu_read_lock(s) as an annotated equivalent of
READ_ONCE(&s) and srcu_read_unlock(s, v) as an annotated equivalent of
WRITE_ONCE(s, v). But with some special accommodation to avoid
interaction with the new carry-dep relation.
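Concretely, the .def entries might end up looking something like this
(a sketch, assuming the syntax mirrors the existing READ_ONCE() and
WRITE_ONCE() entries; any actual change would need validating against
herd7):

```
srcu_read_lock(X) __load{srcu-lock}(*X)
srcu_read_unlock(X,Y) { __store{srcu-unlock}(*X,Y); }
```

The srcu-lock/srcu-unlock annotations would keep these events in the
Srcu-lock and Srcu-unlock sets while herd7 otherwise treats them as an
ordinary load and store for dependency-tracking purposes.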

> But what I will do in the meantime is to switch back to a commit that
> simply flags nesting of same-srcu_struct SRCU read-side critical sections,
> while blindly assuming that the return value of a given srcu_read_lock()
> is passed in to the corresponding srcu_read_unlock():
>
> ------------------------------------------------------------------------
>
> (* Compute matching pairs of Srcu-lock and Srcu-unlock, but prohibit nesting *)
> let srcu-unmatched = Srcu-lock | Srcu-unlock
> let srcu-unmatched-po = ([srcu-unmatched] ; po ; [srcu-unmatched]) & loc
> let srcu-unmatched-locks-to-unlock = ([Srcu-lock] ; po ; [Srcu-unlock]) & loc
> let srcu-rscs = srcu-unmatched-locks-to-unlock \ (srcu-unmatched-po ; srcu-unmatched-po)
>
> (* Validate nesting *)
> flag ~empty Srcu-lock \ domain(srcu-rscs) as unbalanced-srcu-locking
> flag ~empty Srcu-unlock \ range(srcu-rscs) as unbalanced-srcu-locking
>
> (* Check for use of synchronize_srcu() inside an RCU critical section *)
> flag ~empty rcu-rscs & (po ; [Sync-srcu] ; po) as invalid-sleep
>
> (* Validate SRCU dynamic match *)
> flag ~empty different-values(srcu-rscs) as srcu-bad-nesting
>
> ------------------------------------------------------------------------
>
> Or is there some better intermediate position that could be taken?

Do you mean go back to the current linux-kernel.bell? The code you
wrote above is different, since it prohibits nesting.

Alan

2023-01-16 04:40:47

by Paul E. McKenney

Subject: Re: Internal vs. external barriers (was: Re: Interesting LKMM litmus test)

On Sun, Jan 15, 2023 at 03:46:10PM -0500, Alan Stern wrote:
> On Sun, Jan 15, 2023 at 10:10:52AM -0800, Paul E. McKenney wrote:
> > On Sun, Jan 15, 2023 at 11:23:31AM -0500, Alan Stern wrote:
> > > On Sat, Jan 14, 2023 at 09:15:10PM -0800, Paul E. McKenney wrote:
> > > > What am I missing here?
> > >
> > > I don't think you're missing anything. This is a matter for Boqun or
> > > Luc; it must have something to do with the way herd treats the
> > > srcu_read_lock() and srcu_read_unlock() primitives.
> >
> > It looks like we need something that tracks (data | rf)* between
> > the return value of srcu_read_lock() and the second parameter of
> > srcu_read_unlock(). The reason for rf rather than rfi is the upcoming
> > srcu_down_read() and srcu_up_read().
>
> Or just make herd treat srcu_read_lock(s) as an annotated equivalent of
> READ_ONCE(&s) and srcu_read_unlock(s, v) as an annotated equivalent of
> WRITE_ONCE(s, v). But with some special accommodation to avoid
> interaction with the new carry-dep relation.

This is a modification to herd7 you are suggesting? Otherwise, I am
suffering a failure of imagination on how to properly sort it from the
other READ_ONCE() and WRITE_ONCE() instances.

> > But what I will do in the meantime is to switch back to a commit that
> > simply flags nesting of same-srcu_struct SRCU read-side critical sections,
> > while blindly assuming that the return value of a given srcu_read_lock()
> > is passed in to the corresponding srcu_read_unlock():
> >
> > ------------------------------------------------------------------------
> >
> > (* Compute matching pairs of Srcu-lock and Srcu-unlock, but prohibit nesting *)
> > let srcu-unmatched = Srcu-lock | Srcu-unlock
> > let srcu-unmatched-po = ([srcu-unmatched] ; po ; [srcu-unmatched]) & loc
> > let srcu-unmatched-locks-to-unlock = ([Srcu-lock] ; po ; [Srcu-unlock]) & loc
> > let srcu-rscs = srcu-unmatched-locks-to-unlock \ (srcu-unmatched-po ; srcu-unmatched-po)
> >
> > (* Validate nesting *)
> > flag ~empty Srcu-lock \ domain(srcu-rscs) as unbalanced-srcu-locking
> > flag ~empty Srcu-unlock \ range(srcu-rscs) as unbalanced-srcu-locking
> >
> > (* Check for use of synchronize_srcu() inside an RCU critical section *)
> > flag ~empty rcu-rscs & (po ; [Sync-srcu] ; po) as invalid-sleep
> >
> > (* Validate SRCU dynamic match *)
> > flag ~empty different-values(srcu-rscs) as srcu-bad-nesting
> >
> > ------------------------------------------------------------------------
> >
> > Or is there some better intermediate position that could be taken?
>
> Do you mean go back to the current linux-kernel.bell? The code you
> wrote above is different, since it prohibits nesting.

Not to the current linux-kernel.bell, but, as you say, making the change
to obtain a better approximation by prohibiting nesting.

Thanx, Paul

2023-01-16 19:37:40

by Alan Stern

Subject: Re: Internal vs. external barriers (was: Re: Interesting LKMM litmus test)

On Sun, Jan 15, 2023 at 08:23:29PM -0800, Paul E. McKenney wrote:
> On Sun, Jan 15, 2023 at 03:46:10PM -0500, Alan Stern wrote:
> > On Sun, Jan 15, 2023 at 10:10:52AM -0800, Paul E. McKenney wrote:
> > > On Sun, Jan 15, 2023 at 11:23:31AM -0500, Alan Stern wrote:
> > > > On Sat, Jan 14, 2023 at 09:15:10PM -0800, Paul E. McKenney wrote:
> > > > > What am I missing here?
> > > >
> > > > I don't think you're missing anything. This is a matter for Boqun or
> > > > Luc; it must have something to do with the way herd treats the
> > > > srcu_read_lock() and srcu_read_unlock() primitives.
> > >
> > > It looks like we need something that tracks (data | rf)* between
> > > the return value of srcu_read_lock() and the second parameter of
> > > srcu_read_unlock(). The reason for rf rather than rfi is the upcoming
> > > srcu_down_read() and srcu_up_read().
> >
> > Or just make herd treat srcu_read_lock(s) as an annotated equivalent of
> > READ_ONCE(&s) and srcu_read_unlock(s, v) as an annotated equivalent of
> > WRITE_ONCE(s, v). But with some special accommodation to avoid
> > interaction with the new carry-dep relation.
>
> This is a modification to herd7 you are suggesting? Otherwise, I am
> suffering a failure of imagination on how to properly sort it from the
> other READ_ONCE() and WRITE_ONCE() instances.

srcu_read_lock and srcu_read_unlock events would be distinguished from
other marked loads and stores by belonging to the Srcu-lock and
Srcu-unlock sets. But I don't know whether this result can be
accomplished just by modifying the .def file -- it might require changes
to herd7. (In fact, as far as I know there is no documentation at all
for the double-underscore operations used in linux-kernel.def. Hint
hint!)

As mentioned earlier, we should ask Luc or Boqun.


> > > Or is there some better intermediate position that could be taken?
> >
> > Do you mean go back to the current linux-kernel.bell? The code you
> > wrote above is different, since it prohibits nesting.
>
> Not to the current linux-kernel.bell, but, as you say, making the change
> to obtain a better approximation by prohibiting nesting.

Why do you want to prohibit nesting? Why would that be a better
approximation?

Alan

2023-01-16 19:38:42

by Paul E. McKenney

[permalink] [raw]
Subject: Re: Internal vs. external barriers (was: Re: Interesting LKMM litmus test)

On Mon, Jan 16, 2023 at 01:11:41PM -0500, Alan Stern wrote:
> On Sun, Jan 15, 2023 at 08:23:29PM -0800, Paul E. McKenney wrote:
> > On Sun, Jan 15, 2023 at 03:46:10PM -0500, Alan Stern wrote:
> > > On Sun, Jan 15, 2023 at 10:10:52AM -0800, Paul E. McKenney wrote:
> > > > On Sun, Jan 15, 2023 at 11:23:31AM -0500, Alan Stern wrote:
> > > > > On Sat, Jan 14, 2023 at 09:15:10PM -0800, Paul E. McKenney wrote:
> > > > > > What am I missing here?
> > > > >
> > > > > I don't think you're missing anything. This is a matter for Boqun or
> > > > > Luc; it must have something to do with the way herd treats the
> > > > > srcu_read_lock() and srcu_read_unlock() primitives.
> > > >
> > > > It looks like we need something that tracks (data | rf)* between
> > > > the return value of srcu_read_lock() and the second parameter of
> > > > srcu_read_unlock(). The reason for rf rather than rfi is the upcoming
> > > > srcu_down_read() and srcu_up_read().
> > >
> > > Or just make herd treat srcu_read_lock(s) as an annotated equivalent of
> > > READ_ONCE(&s) and srcu_read_unlock(s, v) as an annotated equivalent of
> > > WRITE_ONCE(s, v). But with some special accommodation to avoid
> > > interaction with the new carry-dep relation.
> >
> > This is a modification to herd7 you are suggesting? Otherwise, I am
> > suffering a failure of imagination on how to properly sort it from the
> > other READ_ONCE() and WRITE_ONCE() instances.
>
> srcu_read_lock and srcu_read_unlock events would be distinguished from
> other marked loads and stores by belonging to the Srcu-lock and
> Srcu-unlock sets. But I don't know whether this result can be
> accomplished just by modifying the .def file -- it might require changes
> to herd7. (In fact, as far as I know there is no documentation at all
> for the double-underscore operations used in linux-kernel.def. Hint
> hint!)
>
> As mentioned earlier, we should ask Luc or Boqun.

Good point, will do.

> > > > Or is there some better intermediate position that could be taken?
> > >
> > > Do you mean go back to the current linux-kernel.bell? The code you
> > > wrote above is different, since it prohibits nesting.
> >
> > Not to the current linux-kernel.bell, but, as you say, making the change
> > to obtain a better approximation by prohibiting nesting.
>
> Why do you want to prohibit nesting? Why would that be a better
> approximation?

Because the current LKMM gives wrong answers for nested critical
sections. For example, for the litmus test shown below, mainline
LKMM will incorrectly report "Never". The two SRCU read-side critical
sections are independent, so the fact that P1()'s synchronize_srcu() is
guaranteed to wait for the first one to complete says nothing about the
second having completed. Therefore, in Linux-kernel SRCU, the "exists"
clause could be satisfied.

In contrast, the proposed change flags this as having nesting.

Thanx, Paul

------------------------------------------------------------------------

C C-srcu-nest-5

(*
* Result: Sometimes
*
* This demonstrates non-nesting of SRCU read-side critical sections.
* Unlike RCU, SRCU critical sections do not nest.
*)

{}

P0(int *x, int *y, struct srcu_struct *s1)
{
	int r1;
	int r2;
	int r3;
	int r4;

	r3 = srcu_read_lock(s1);
	r2 = READ_ONCE(*y);
	r4 = srcu_read_lock(s1);
	srcu_read_unlock(s1, r3);
	r1 = READ_ONCE(*x);
	srcu_read_unlock(s1, r4);
}

P1(int *x, int *y, struct srcu_struct *s1)
{
	WRITE_ONCE(*y, 1);
	synchronize_srcu(s1);
	WRITE_ONCE(*x, 1);
}

locations [0:r1]
exists (0:r1=1 /\ 0:r2=0)

2023-01-16 20:11:53

by Alan Stern

[permalink] [raw]
Subject: Re: Internal vs. external barriers (was: Re: Interesting LKMM litmus test)

On Mon, Jan 16, 2023 at 11:06:52AM -0800, Paul E. McKenney wrote:
> On Mon, Jan 16, 2023 at 01:11:41PM -0500, Alan Stern wrote:

> > Why do you want to prohibit nesting? Why would that be a better
> > approximation?
>
> Because the current LKMM gives wrong answers for nested critical
> sections.

I don't agree. Or at least, it depends on whose definition of "nested
critical sections" you adopt.

> For example, for the litmus test shown below, mainline
> LKMM will incorrectly report "Never". The two SRCU read-side critical
> sections are independent, so the fact that P1()'s synchronize_srcu() is
> guaranteed to wait for the first one to complete says nothing about the
> second having completed. Therefore, in Linux-kernel SRCU, the "exists"
> clause could be satisfied.
>
> In contrast, the proposed change flags this as having nesting.

In fact, this litmus test has overlapping critical sections, not nested
ones. But the current LKMM incorrectly _thinks_ they are nested,
because it matches each lock with the first unmatched unlock.

If you write a litmus test that has properly nested (not overlapping!)
read-side critical sections, the current LKMM will match the locks and
unlocks correctly and will give the right answer.

So what you really want to do is rule out overlapping, not nesting. But
I guess there's no way to do one without the other.

Alan

> Thanx, Paul
>
> ------------------------------------------------------------------------
>
> C C-srcu-nest-5
>
> (*
> * Result: Sometimes
> *
> * This demonstrates non-nesting of SRCU read-side critical sections.
> * Unlike RCU, SRCU critical sections do not nest.
> *)
>
> {}
>
> P0(int *x, int *y, struct srcu_struct *s1)
> {
> int r1;
> int r2;
> int r3;
> int r4;
>
> r3 = srcu_read_lock(s1);
> r2 = READ_ONCE(*y);
> r4 = srcu_read_lock(s1);
> srcu_read_unlock(s1, r3);
> r1 = READ_ONCE(*x);
> srcu_read_unlock(s1, r4);
> }
>
> P1(int *x, int *y, struct srcu_struct *s1)
> {
> WRITE_ONCE(*y, 1);
> synchronize_srcu(s1);
> WRITE_ONCE(*x, 1);
> }
>
> locations [0:r1]
> exists (0:r1=1 /\ 0:r2=0)

2023-01-16 22:46:51

by Paul E. McKenney

[permalink] [raw]
Subject: Re: Internal vs. external barriers (was: Re: Interesting LKMM litmus test)

On Mon, Jan 16, 2023 at 02:20:57PM -0500, Alan Stern wrote:
> On Mon, Jan 16, 2023 at 11:06:52AM -0800, Paul E. McKenney wrote:
> > On Mon, Jan 16, 2023 at 01:11:41PM -0500, Alan Stern wrote:
>
> > > Why do you want to prohibit nesting? Why would that be a better
> > > approximation?
> >
> > Because the current LKMM gives wrong answers for nested critical
> > sections.
>
> I don't agree. Or at least, it depends on whose definition of "nested
> critical sections" you adopt.

Fair point, and I have therefore updated the test's header comment
to read as follows:

(*
* Result: Sometimes
*
* This demonstrates non-nested overlapping of SRCU read-side critical
* sections. Unlike RCU, SRCU critical sections do not unconditionally
* nest.
*)

> > For example, for the litmus test shown below, mainline
> > LKMM will incorrectly report "Never". The two SRCU read-side critical
> > sections are independent, so the fact that P1()'s synchronize_srcu() is
> > guaranteed to wait for the first one to complete says nothing about the
> > second having completed. Therefore, in Linux-kernel SRCU, the "exists"
> > clause could be satisfied.
> >
> > In contrast, the proposed change flags this as having nesting.
>
> In fact, this litmus test has overlapping critical sections, not nested
> ones. But the current LKMM incorrectly _thinks_ they are nested,
> because it matches each lock with the first unmatched unlock.
>
> If you write a litmus test that has properly nested (not overlapping!)
> read-side critical sections, the current LKMM will match the locks and
> unlocks correctly and will give the right answer.
>
> So what you really want to do is rule out overlapping, not nesting. But
> I guess there's no way to do one without the other.

None that I could see!

Thanx, Paul

> Alan
>
> > Thanx, Paul
> >
> > ------------------------------------------------------------------------
> >
> > C C-srcu-nest-5
> >
> > (*
> > * Result: Sometimes
> > *
> > * This demonstrates non-nesting of SRCU read-side critical sections.
> > * Unlike RCU, SRCU critical sections do not nest.
> > *)
> >
> > {}
> >
> > P0(int *x, int *y, struct srcu_struct *s1)
> > {
> > int r1;
> > int r2;
> > int r3;
> > int r4;
> >
> > r3 = srcu_read_lock(s1);
> > r2 = READ_ONCE(*y);
> > r4 = srcu_read_lock(s1);
> > srcu_read_unlock(s1, r3);
> > r1 = READ_ONCE(*x);
> > srcu_read_unlock(s1, r4);
> > }
> >
> > P1(int *x, int *y, struct srcu_struct *s1)
> > {
> > WRITE_ONCE(*y, 1);
> > synchronize_srcu(s1);
> > WRITE_ONCE(*x, 1);
> > }
> >
> > locations [0:r1]
> > exists (0:r1=1 /\ 0:r2=0)

2023-01-17 12:19:00

by Andrea Parri

[permalink] [raw]
Subject: Re: Internal vs. external barriers (was: Re: Interesting LKMM litmus test)

On Mon, Jan 16, 2023 at 02:13:57PM -0800, Paul E. McKenney wrote:
> On Mon, Jan 16, 2023 at 02:20:57PM -0500, Alan Stern wrote:
> > On Mon, Jan 16, 2023 at 11:06:52AM -0800, Paul E. McKenney wrote:
> > > On Mon, Jan 16, 2023 at 01:11:41PM -0500, Alan Stern wrote:
> >
> > > > Why do you want to prohibit nesting? Why would that be a better
> > > > approximation?
> > >
> > > Because the current LKMM gives wrong answers for nested critical
> > > sections.
> >
> > I don't agree. Or at least, it depends on whose definition of "nested
> > critical sections" you adopt.
>
> Fair point, and I have therefore updated the test's header comment
> to read as follows:
>
> (*
> * Result: Sometimes
> *
> * This demonstrates non-nested overlapping of SRCU read-side critical
> * sections. Unlike RCU, SRCU critical sections do not unconditionally
> * nest.
> *)
>
> > > For example, for the litmus test shown below, mainline
> > > LKMM will incorrectly report "Never". The two SRCU read-side critical
> > > sections are independent, so the fact that P1()'s synchronize_srcu() is
> > > guaranteed to wait for the first one to complete says nothing about the
> > > second having completed. Therefore, in Linux-kernel SRCU, the "exists"
> > > clause could be satisfied.
> > >
> > > In contrast, the proposed change flags this as having nesting.
> >
> > In fact, this litmus test has overlapping critical sections, not nested
> > ones. But the current LKMM incorrectly _thinks_ they are nested,
> > because it matches each lock with the first unmatched unlock.
> >
> > If you write a litmus test that has properly nested (not overlapping!)
> > read-side critical sections, the current LKMM will match the locks and
> > unlocks correctly and will give the right answer.
> >
> > So what you really want to do is rule out overlapping, not nesting. But
> > I guess there's no way to do one without the other.
>
> None that I could see!

This was reminiscent of old discussions, in fact, we do have:

[tools/memory-model/Documentation/litmus-tests.txt]

e. Although sleepable RCU (SRCU) is now modeled, there
are some subtle differences between its semantics and
those in the Linux kernel. For example, the kernel
might interpret the following sequence as two partially
overlapping SRCU read-side critical sections:

1 r1 = srcu_read_lock(&my_srcu);
2 do_something_1();
3 r2 = srcu_read_lock(&my_srcu);
4 do_something_2();
5 srcu_read_unlock(&my_srcu, r1);
6 do_something_3();
7 srcu_read_unlock(&my_srcu, r2);

In contrast, LKMM will interpret this as a nested pair of
SRCU read-side critical sections, with the outer critical
section spanning lines 1-7 and the inner critical section
spanning lines 3-5.

This difference would be more of a concern had anyone
identified a reasonable use case for partially overlapping
SRCU read-side critical sections. For more information
on the trickiness of such overlapping, please see:
https://paulmck.livejournal.com/40593.html

More recently/related,

https://lore.kernel.org/lkml/20220421230848.GA194034@paulmck-ThinkPad-P17-Gen-1/T/#m2a8701c7c377ccb27190a6679e58b0929b0b0ad9

Thanks,
Andrea


>
> Thanx, Paul
>
> > Alan
> >
> > > Thanx, Paul
> > >
> > > ------------------------------------------------------------------------
> > >
> > > C C-srcu-nest-5
> > >
> > > (*
> > > * Result: Sometimes
> > > *
> > > * This demonstrates non-nesting of SRCU read-side critical sections.
> > > * Unlike RCU, SRCU critical sections do not nest.
> > > *)
> > >
> > > {}
> > >
> > > P0(int *x, int *y, struct srcu_struct *s1)
> > > {
> > > int r1;
> > > int r2;
> > > int r3;
> > > int r4;
> > >
> > > r3 = srcu_read_lock(s1);
> > > r2 = READ_ONCE(*y);
> > > r4 = srcu_read_lock(s1);
> > > srcu_read_unlock(s1, r3);
> > > r1 = READ_ONCE(*x);
> > > srcu_read_unlock(s1, r4);
> > > }
> > >
> > > P1(int *x, int *y, struct srcu_struct *s1)
> > > {
> > > WRITE_ONCE(*y, 1);
> > > synchronize_srcu(s1);
> > > WRITE_ONCE(*x, 1);
> > > }
> > >
> > > locations [0:r1]
> > > exists (0:r1=1 /\ 0:r2=0)

2023-01-17 16:30:17

by Paul E. McKenney

[permalink] [raw]
Subject: Re: Internal vs. external barriers (was: Re: Interesting LKMM litmus test)

On Tue, Jan 17, 2023 at 12:46:28PM +0100, Andrea Parri wrote:
> On Mon, Jan 16, 2023 at 02:13:57PM -0800, Paul E. McKenney wrote:
> > On Mon, Jan 16, 2023 at 02:20:57PM -0500, Alan Stern wrote:
> > > On Mon, Jan 16, 2023 at 11:06:52AM -0800, Paul E. McKenney wrote:
> > > > On Mon, Jan 16, 2023 at 01:11:41PM -0500, Alan Stern wrote:
> > >
> > > > > Why do you want to prohibit nesting? Why would that be a better
> > > > > approximation?
> > > >
> > > > Because the current LKMM gives wrong answers for nested critical
> > > > sections.
> > >
> > > I don't agree. Or at least, it depends on whose definition of "nested
> > > critical sections" you adopt.
> >
> > Fair point, and I have therefore updated the test's header comment
> > to read as follows:
> >
> > (*
> > * Result: Sometimes
> > *
> > * This demonstrates non-nested overlapping of SRCU read-side critical
> > * sections. Unlike RCU, SRCU critical sections do not unconditionally
> > * nest.
> > *)
> >
> > > > For example, for the litmus test shown below, mainline
> > > > LKMM will incorrectly report "Never". The two SRCU read-side critical
> > > > sections are independent, so the fact that P1()'s synchronize_srcu() is
> > > > guaranteed to wait for the first one to complete says nothing about the
> > > > second having completed. Therefore, in Linux-kernel SRCU, the "exists"
> > > > clause could be satisfied.
> > > >
> > > > In contrast, the proposed change flags this as having nesting.
> > >
> > > In fact, this litmus test has overlapping critical sections, not nested
> > > ones. But the current LKMM incorrectly _thinks_ they are nested,
> > > because it matches each lock with the first unmatched unlock.
> > >
> > > If you write a litmus test that has properly nested (not overlapping!)
> > > read-side critical sections, the current LKMM will match the locks and
> > > unlocks correctly and will give the right answer.
> > >
> > > So what you really want to do is rule out overlapping, not nesting. But
> > > I guess there's no way to do one without the other.
> >
> > None that I could see!
>
> This was reminiscent of old discussions, in fact, we do have:
>
> [tools/memory-model/Documentation/litmus-tests.txt]
>
> e. Although sleepable RCU (SRCU) is now modeled, there
> are some subtle differences between its semantics and
> those in the Linux kernel. For example, the kernel
> might interpret the following sequence as two partially
> overlapping SRCU read-side critical sections:
>
> 1 r1 = srcu_read_lock(&my_srcu);
> 2 do_something_1();
> 3 r2 = srcu_read_lock(&my_srcu);
> 4 do_something_2();
> 5 srcu_read_unlock(&my_srcu, r1);
> 6 do_something_3();
> 7 srcu_read_unlock(&my_srcu, r2);
>
> In contrast, LKMM will interpret this as a nested pair of
> SRCU read-side critical sections, with the outer critical
> section spanning lines 1-7 and the inner critical section
> spanning lines 3-5.
>
> This difference would be more of a concern had anyone
> identified a reasonable use case for partially overlapping
> SRCU read-side critical sections. For more information
> on the trickiness of such overlapping, please see:
> https://paulmck.livejournal.com/40593.html

Good point, if we do change the definition, we also need to update
this documentation.

> More recently/related,
>
> https://lore.kernel.org/lkml/20220421230848.GA194034@paulmck-ThinkPad-P17-Gen-1/T/#m2a8701c7c377ccb27190a6679e58b0929b0b0ad9

It would not be a bad thing for LKMM to be able to show people the
error of their ways when they try non-nested partially overlapping SRCU
read-side critical sections. Or, should they find some valid use case,
to help them prove their point. ;-)

Thanx, Paul

> Thanks,
> Andrea
>
>
> >
> > Thanx, Paul
> >
> > > Alan
> > >
> > > > Thanx, Paul
> > > >
> > > > ------------------------------------------------------------------------
> > > >
> > > > C C-srcu-nest-5
> > > >
> > > > (*
> > > > * Result: Sometimes
> > > > *
> > > > * This demonstrates non-nesting of SRCU read-side critical sections.
> > > > * Unlike RCU, SRCU critical sections do not nest.
> > > > *)
> > > >
> > > > {}
> > > >
> > > > P0(int *x, int *y, struct srcu_struct *s1)
> > > > {
> > > > int r1;
> > > > int r2;
> > > > int r3;
> > > > int r4;
> > > >
> > > > r3 = srcu_read_lock(s1);
> > > > r2 = READ_ONCE(*y);
> > > > r4 = srcu_read_lock(s1);
> > > > srcu_read_unlock(s1, r3);
> > > > r1 = READ_ONCE(*x);
> > > > srcu_read_unlock(s1, r4);
> > > > }
> > > >
> > > > P1(int *x, int *y, struct srcu_struct *s1)
> > > > {
> > > > WRITE_ONCE(*y, 1);
> > > > synchronize_srcu(s1);
> > > > WRITE_ONCE(*x, 1);
> > > > }
> > > >
> > > > locations [0:r1]
> > > > exists (0:r1=1 /\ 0:r2=0)

2023-01-17 17:19:26

by Alan Stern

[permalink] [raw]
Subject: Re: Internal vs. external barriers (was: Re: Interesting LKMM litmus test)

On Tue, Jan 17, 2023 at 07:14:16AM -0800, Paul E. McKenney wrote:
> On Tue, Jan 17, 2023 at 12:46:28PM +0100, Andrea Parri wrote:
> > This was reminiscent of old discussions, in fact, we do have:
> >
> > [tools/memory-model/Documentation/litmus-tests.txt]
> >
> > e. Although sleepable RCU (SRCU) is now modeled, there
> > are some subtle differences between its semantics and
> > those in the Linux kernel. For example, the kernel
> > might interpret the following sequence as two partially
> > overlapping SRCU read-side critical sections:
> >
> > 1 r1 = srcu_read_lock(&my_srcu);
> > 2 do_something_1();
> > 3 r2 = srcu_read_lock(&my_srcu);
> > 4 do_something_2();
> > 5 srcu_read_unlock(&my_srcu, r1);
> > 6 do_something_3();
> > 7 srcu_read_unlock(&my_srcu, r2);
> >
> > In contrast, LKMM will interpret this as a nested pair of
> > SRCU read-side critical sections, with the outer critical
> > section spanning lines 1-7 and the inner critical section
> > spanning lines 3-5.
> >
> > This difference would be more of a concern had anyone
> > identified a reasonable use case for partially overlapping
> > SRCU read-side critical sections. For more information
> > on the trickiness of such overlapping, please see:
> > https://paulmck.livejournal.com/40593.html
>
> Good point, if we do change the definition, we also need to update
> this documentation.
>
> > More recently/related,
> >
> > https://lore.kernel.org/lkml/20220421230848.GA194034@paulmck-ThinkPad-P17-Gen-1/T/#m2a8701c7c377ccb27190a6679e58b0929b0b0ad9
>
> It would not be a bad thing for LKMM to be able to show people the
> error of their ways when they try non-nested partially overlapping SRCU
> read-side critical sections. Or, should they find some valid use case,
> to help them prove their point. ;-)

Isn't it true that the current code will flag srcu-bad-nesting if a
litmus test has non-nested overlapping SRCU read-side critical sections?

And if it is true, is there any need to change the memory model at this
point?

(And if it's not true, that's most likely due to a bug in herd7.)

Alan

2023-01-17 18:00:31

by Paul E. McKenney

[permalink] [raw]
Subject: Re: Internal vs. external barriers (was: Re: Interesting LKMM litmus test)

On Tue, Jan 17, 2023 at 10:56:34AM -0500, Alan Stern wrote:
> On Tue, Jan 17, 2023 at 07:14:16AM -0800, Paul E. McKenney wrote:
> > On Tue, Jan 17, 2023 at 12:46:28PM +0100, Andrea Parri wrote:
> > > This was reminiscent of old discussions, in fact, we do have:
> > >
> > > [tools/memory-model/Documentation/litmus-tests.txt]
> > >
> > > e. Although sleepable RCU (SRCU) is now modeled, there
> > > are some subtle differences between its semantics and
> > > those in the Linux kernel. For example, the kernel
> > > might interpret the following sequence as two partially
> > > overlapping SRCU read-side critical sections:
> > >
> > > 1 r1 = srcu_read_lock(&my_srcu);
> > > 2 do_something_1();
> > > 3 r2 = srcu_read_lock(&my_srcu);
> > > 4 do_something_2();
> > > 5 srcu_read_unlock(&my_srcu, r1);
> > > 6 do_something_3();
> > > 7 srcu_read_unlock(&my_srcu, r2);
> > >
> > > In contrast, LKMM will interpret this as a nested pair of
> > > SRCU read-side critical sections, with the outer critical
> > > section spanning lines 1-7 and the inner critical section
> > > spanning lines 3-5.
> > >
> > > This difference would be more of a concern had anyone
> > > identified a reasonable use case for partially overlapping
> > > SRCU read-side critical sections. For more information
> > > on the trickiness of such overlapping, please see:
> > > https://paulmck.livejournal.com/40593.html
> >
> > Good point, if we do change the definition, we also need to update
> > this documentation.
> >
> > > More recently/related,
> > >
> > > https://lore.kernel.org/lkml/20220421230848.GA194034@paulmck-ThinkPad-P17-Gen-1/T/#m2a8701c7c377ccb27190a6679e58b0929b0b0ad9
> >
> > It would not be a bad thing for LKMM to be able to show people the
> > error of their ways when they try non-nested partially overlapping SRCU
> > read-side critical sections. Or, should they find some valid use case,
> > to help them prove their point. ;-)
>
> Isn't it true that the current code will flag srcu-bad-nesting if a
> litmus test has non-nested overlapping SRCU read-side critical sections?

Now that you mention it, it does indeed flag srcu-bad-nesting.

Just to see if I understand, different-values yields true if the set
contains multiple elements with the same value mapping to different
values. Or, to put it another way, if the relation does not correspond
to a function.

Or am I still missing something?

> And if it is true, is there any need to change the memory model at this
> point?
>
> (And if it's not true, that's most likely due to a bug in herd7.)

Agreed, changes must wait for SRCU support in herd7.

At which point something roughly similar to this might work?

let srcu-rscs = return_value(Srcu-lock) ; (dep | rfi)* ;
parameter(Srcu-unlock, 2)

Given an Srcu-down and an Srcu-up:

let srcu-rscs = ( return_value(Srcu-lock) ; (dep | rfi)* ;
parameter(Srcu-unlock, 2) ) |
( return_value(Srcu-down) ; (dep | rf)* ;
parameter(Srcu-up, 2) )

Seem reasonable, or am I missing yet something else?

Thanx, Paul
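A rough prototype of that (dep | rf)* chaining, outside herd: the Python sketch below (event names L/W/R/U and the whole graph are inventions of this example) pairs a lock with any unlock whose index argument is reachable from the lock's return value through data and rf edges — rf rather than rfi, so the index can travel through memory from srcu_down_read() on one CPU to srcu_up_read() on another:

```python
def closure(r, nodes):
    """Reflexive-transitive closure of relation r over nodes."""
    c = {(n, n) for n in nodes} | set(r)
    while True:
        new = c | {(a, d) for (a, b) in c for (b2, d) in c if b == b2}
        if new == c:
            return c
        c = new

def match_rscs(locks, unlocks, data, rf, nodes):
    """Pair each Srcu-lock with every Srcu-unlock reachable from it
    through (data | rf)*."""
    chains = closure(data | rf, nodes)
    return {(l, u) for (l, u) in chains if l in locks and u in unlocks}

# Toy event graph: the index returned by the lock-side primitive is
# stored to memory (data), read on another CPU (rf), and passed to the
# unlock-side primitive (data).
nodes = {'L', 'W', 'R', 'U'}
data = {('L', 'W'), ('R', 'U')}
rf = {('W', 'R')}
assert match_rscs({'L'}, {'U'}, data, rf, nodes) == {('L', 'U')}
```

With rfi instead of rf the cross-CPU link would be dropped, which is exactly why the srcu_down_read()/srcu_up_read() case needs the wider relation.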

2023-01-17 18:44:25

by Jonas Oberhauser

[permalink] [raw]
Subject: Re: Internal vs. external barriers (was: Re: Interesting LKMM litmus test)


[I set up my huaweicloud.com address to send and receive mail, allegedly
huaweicloud.com has fewer problems. Let's see. Also snipping together
some mails that were sent to me in the meantime.]

On 1/14/2023 5:42 PM, Alan Stern wrote:
>
> On Fri, Jan 13, 2023 at 02:55:34PM +0000, Jonas Oberhauser wrote:
>>
>> I think the whole rcu-order topic can be summarized as the 'one
>> rule': "if a grace period happens before a rcsc-unlock, it must also
>> happen before the rcsc -lock, and analogously if rcsc-lock happens
>> before a grace period, the rcsc-unlock also happens before the grace
>> period" .
>
> There is more to it than that, as I mentioned earlier. A complete
> description can be found in the explanation.txt document; it says: For
> any critical section C and any grace period G, at least one of the
> following statements must hold: (1) C ends before G does, and in
> addition, every store that propagates to C's CPU before the end of C
> must propagate to every CPU before G ends. (2) G starts before C does,
> and in addition, every store that propagates to G's CPU before the
> start of G must propagate to every CPU before C starts.

Yes, this difference took me a while to appreciate. If there were only (a
strict interpretation of) the rule I mentioned, then the RCU axioms
could be stated as just a regular atomicity axiom.

But because it also affects the surrounding operations, the recursion
becomes necessary.


>
>
>>>
>>> IMO it's generally better to think of grace periods as not being
>>> instantaneous but as occurring over a prolonged period of time. Thus
>>> we should say: If a grace period ends before an unlock occurs, it
>>> must start before the corresponding lock. And contrapositively, if a
>>> lock occurs before a grace period starts, the corresponding unlock
>>> must occur before the grace period ends.
>>
>> I started thinking about it like this and comparing start/end times.
>> That made it more complicated, but the math came out the same in the
>> end. I could imagine that there are some scenarios where the
>> intuition of collapsing the grace period to a single event could
>> cause problems, but I haven't seen any.
>
>
>
> IIRC (and it has been a long time), this may be vaguely connected with
> the reason why the definitions of gp, rcu-link, and rcu-fence have po on
> one side but po? on the other. But I can't remember the details.



There's at least some connection. And I think from an operational model
perspective, the distinction has some effect.

That's because part (1) of the rule you quoted forces propagation before
G ends, which allows propagation to G's CPU after the start or before
the end.

Stores propagated in that time period are not forced to propagate by
part (2).

If the two events in the operational model were merged, then all stores
that need to propagate to G's CPU through rule (1) would also need to
propagate to other CPU's through part (2).

In particular, if we had an execution with 3 CPUs like below (time from
top to bottom, also attached as a text file in case my e-mail client
messes up the formatting)

CPU1            | CPU2          | CPU3
start CS;       |               |
read stage==0   |               |
                | stage = 1;    |
                |               |
                | GP {          |
x = 1;          |               |
                |               | start CS;
                |               | read x == 0;
end CS;         |               |
                | }             |
                | stage = 2;    |
                |               | read stage == 2;
                |               | read x == 1;
                |               | end CS;

then we allow x=1 not to propagate to the third CPU before it reads x.
But if there was only a single grace period step, which would not
overlap with either CS, then this outcome would be forbidden.
Because stage=1 didn't propagate to CPU1, the grace period would need to
be after CPU1's critical section.
Because stage=2 did propagate to CPU3, the grace period would need to be
before CPU3's critical section.
But then x=1 would need to propagate to CPU3 before the grace period,
and thus before its CS starts.
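This contrast between the merged and split readings can be made concrete by writing the ordering obligations as graph edges and testing for a cycle; a cycle means the outcome (CPU3 reading x == 0) is forbidden. This is a hand-rolled sketch — event names like x1_to_cpu3 are inventions of this example, not LKMM terms:

```python
def has_cycle(edges):
    """DFS cycle detection over a set of (src, dst) edges."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, []).append(b)
        graph.setdefault(b, [])
    state = {n: 'new' for n in graph}

    def dfs(n):
        state[n] = 'active'
        for m in graph[n]:
            if state[m] == 'active':
                return True
            if state[m] == 'new' and dfs(m):
                return True
        state[n] = 'done'
        return False

    return any(state[n] == 'new' and dfs(n) for n in list(graph))

common = {
    ('cs3_start', 'cs3_read_x'),   # po on CPU3
    ('cs3_read_x', 'x1_to_cpu3'),  # CPU3 read x == 0, so x=1 lands later
}

# Single-event grace period: GP after CPU1's CS, before CPU3's CS,
# and x=1 must reach every CPU before the (whole) GP.
merged = common | {
    ('cs1_end', 'gp'),
    ('x1_to_cpu3', 'gp'),
    ('gp', 'cs3_start'),
}

# Split grace period: x=1 only has to reach CPU3 before GP *ends*,
# while only GP's *start* precedes CPU3's CS.
split = common | {
    ('cs1_end', 'gp_end'),
    ('x1_to_cpu3', 'gp_end'),
    ('gp_start', 'gp_end'),
    ('gp_start', 'cs3_start'),
}

assert has_cycle(merged)     # forbidden with an instantaneous GP
assert not has_cycle(split)  # allowed once start and end are separated
```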

I think it's because things can't happen "at the same time" in the
operational model. Otherwise, x=1 could propagate "at the same time" as
it executes the grace period, and then wouldn't be affected by rule (2)
anymore.

But in the axiomatic model, we can use the po; ... to state that things
must happen "strictly before" the start of G (instead of "kinda at the
same time"). If there is a po-earlier event that observes the
propagation, then the propagation happened before the start of G. If
there is no po-earlier event that observes the propagation, then the
store may as well have propagated at the same time as G (or "between the
start and end"). So having the distinction of start and end of grace
periods becomes at least less important.

I still haven't wrapped my head fully around why the other side has to
be po?.

I asked Hernan to run all the old litmus tests with rcu-fence = po ;
rcu-order ; po and he reported no difference in those tests either.

Now I'm thinking if it can be proven that some of them aren't necessary,
or could be simplified.

Pretending for simplicity that rscs and grace periods aren't
reads&writes (and that prop must always start with overwrite&ext, but
this can be done wlog if we define the rcu relations with prop? instead
of prop).

I'm first looking at the rcu-link relation.
Any use of rcu-link in rcu-order is preceded by an rscs or gp.
Considering the cases where po? is not taken, the first edge of
hb*;pb*;prop? can't be any of prop, rfe, or prop&int because the
rcu-order never ends in write/reads. This leaves only ppo (or nothing),
and we can use ppo <= po (with the patch that's currently lying on my
hard disk :D) to get that he complete edge a subset of

(po ; hb*;pb*;prop? | id);po

Therefore I think we have rcu-link = (po ; hb*;pb*;prop? ; po) | po

Next, I look at rcu-fence in rb = prop? ; rcu-fence ; hb* ; pb*.
An rcu-fence ; hb* ; pb* which doesn't have the po at the end of
rcu-fence can not have prop, rfe, or prop&int after the rcu-fence
either. This leaves two cases, either the rb comes from prop? ; po ;
rcu-order or from prop? ; po ; rcu-order ; ppo ; hb* ; pb*.

In the latter case we can use ppo <= po and get back prop? ; po ;
rcu-order ; po ; hb* ; pb*, so considering po? here is not necessary.

In the former case, we can ask instead if po ; rcu-order ; prop? is
irreflexive, and since prop can't follow on rcu-order, this is the same
as po ; rcu-order.

This can only have an identity edge if at least some of the rcu-links in
rcu-order are not just po. So let's look at the last such edge, when
abbreviating RCU cs and grace periods as R we get

  po; (R ; rcu-link)* ; R ; po ; hb*;pb*;prop? ; (po ; R)+

where overall the number of gps >= number of rscs, and this can be
rewritten as

  prop? ; (po ; R)+; po; (rcu-order ; rcu-link)? ; R ; po ; hb*;pb*

and I believe (po ; R)+; po; (R ; rcu-link)* ; R ; po <= po ; rcu-order
; po (using the fact that overall the number of gps is still >= the
number of rscs)

so then it simplifies again to

  prop? ; po ; rcu-order ; po ; hb*;pb*

and po? is again not necessary.

I'm again ignoring srcu here. I don't know if we can still shuffle the
gp/rscs around like above when the locations have to match.

Either way if you can confirm my suspicion that the po? in rcu-fence
could be replaced by po, and that the po? in rcu-link could be replaced
by (po ; ... ; po) | po, or have a counter example and some additional
explanation for why the po? makes sense, I'd be thankful.


> There was also something about what should happen when you have two
> grace periods in a row.

Note that two grace periods in a row are a subset of po;rcu-gp;po and
thus gp, and so there's nothing to be done.
Something more interesting happens with critical sections, where
currently po ; rcu-rcsci ; po ; rcu-rcsci ; po should be a subset of po
; rcu-rcsci ; po because of the forbidden partial overlap. But I
currently don't think it's necessary to consider such cases.
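The subset claim about two back-to-back grace periods can be illustrated on a toy example. The following Python sketch is my own set-based model with abstract event labels (not herd7 or the real LKMM relations), and it checks only the plain composition fact that, with po transitive, po;[GP];po;[GP];po lands inside po;[GP];po:

```python
# Toy relational check (illustrative): with po transitive, two grace
# periods in a row, po;[GP];po;[GP];po, stay inside po;[GP];po.
# Events and grace periods here are abstract labels, not LKMM events.

def compose(r, s):
    """Relational composition r ; s."""
    return {(a, c) for (a, b) in r for (b2, c) in s if b == b2}

# A po-ordered sequence of events, two of which are grace periods.
events = ["a", "G1", "b", "G2", "c"]
po = {(x, y) for i, x in enumerate(events) for y in events[i + 1:]}
id_gp = {(e, e) for e in ("G1", "G2")}         # the identity [GP]

one_gp = compose(compose(po, id_gp), po)       # po ; [GP] ; po
two_gps = compose(compose(one_gp, id_gp), po)  # po ; [GP] ; po ; [GP] ; po

assert two_gps <= one_gp    # the claimed inclusion holds on this example
```

On this five-event example the double-grace-period relation collapses to the single pair (a, c), which po;[GP];po already contains.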

The other thing that causes complications is when all the pb*, hb*, and
prop links in rcu-link are just id, and then rcu-link becomes po?;po =
po. Currently I don't understand why such pure po links should be
necessary at all, since they should just merge with the neighboring
rcu-gps into a gp edge.

>>
>> The only way I'd count rcu-link' as adding a case is if you say that
>> the (...)* has two cases :D (or infinitely many :D) I don't count the
>> existence of the definition because you could always inline it (but
>> lose a lot of clarity imho).
>
>
>
> If you did inline it, you'd probably find that the end result was
> exactly what is currently in the LKMM.

Not quite. There are two differences. The first is that the
rcu-order;rcu-link;rcu-order case disappears.

The second is that the ...;rcu-link;... and
...;rcu-link;rcu-order;rcu-link;... subcases get merged, and not to
...;rcu-link;(rcu-order;rcu-link)?;... but to
...;rcu-link;(rcu-order;rcu-link)*;...

Indeed the definitions of rcu-extend and rcu-order can't become exactly
the same because they are different relations, e.g., rcu-order can begin
and end with a grace period but rcu-extend can't.

That's why an additional non-recursive layer of

   rcu-order = rcu-extend ; (rcu-link; rcu-extend)*

is necessary to define rcu-order in terms of rcu-extend. But as I
mentioned I don't think rcu-order is necessary at all to define LKMM,
and one can probably just use rcu-extend instead of rcu-order (and in
fact even a version of rcu-extend without any lone rcu-gps).

>
>>
>> the law is, well, um, "primarily empirical in nature"
>
>
>
> Actually it isn't, not any more. That quote was written before we
> formalized RCU in the LKMM.

I meant that the original formulation was empirical; of course you have
formalized it, but how do you know that the formalization is valid? I
think the correspondence with "what's intended" is always an empirical
thing, even if you formally prove the correctness of the implementation
against the specification, you might have missed some parts or added some
parts that are actually just implementation details.

best wishes,

jonas


Attachments:
gp-case.txt (621.00 B)

2023-01-17 20:40:35

by Paul E. McKenney

Subject: Re: Internal vs. external barriers (was: Re: Interesting LKMM litmus test)

On Tue, Jan 17, 2023 at 07:27:29PM +0100, Jonas Oberhauser wrote:
> On 1/17/2023 6:43 PM, Paul E. McKenney wrote:
> > On Tue, Jan 17, 2023 at 10:56:34AM -0500, Alan Stern wrote:
> > > On Tue, Jan 17, 2023 at 07:14:16AM -0800, Paul E. McKenney wrote:
> > > > On Tue, Jan 17, 2023 at 12:46:28PM +0100, Andrea Parri wrote:
> > > > > This was reminiscent of old discussions, in fact, we do have:
> > > > >
> > > > > [tools/memory-model/Documentation/litmus-tests.txt]
> > > > >
> > > > > e. Although sleepable RCU (SRCU) is now modeled, there
> > > > > are some subtle differences between its semantics and
> > > > > those in the Linux kernel. For example, the kernel
> > > > > might interpret the following sequence as two partially
> > > > > overlapping SRCU read-side critical sections:
> > > > >
> > > > > 1 r1 = srcu_read_lock(&my_srcu);
> > > > > 2 do_something_1();
> > > > > 3 r2 = srcu_read_lock(&my_srcu);
> > > > > 4 do_something_2();
> > > > > 5 srcu_read_unlock(&my_srcu, r1);
> > > > > 6 do_something_3();
> > > > > 7 srcu_read_unlock(&my_srcu, r2);
> > > > >
> > > > > In contrast, LKMM will interpret this as a nested pair of
> > > > > SRCU read-side critical sections, with the outer critical
> > > > > section spanning lines 1-7 and the inner critical section
> > > > > spanning lines 3-5.
> > > > >
> > > > > This difference would be more of a concern had anyone
> > > > > identified a reasonable use case for partially overlapping
> > > > > SRCU read-side critical sections. For more information
> > > > > on the trickiness of such overlapping, please see:
> > > > > https://paulmck.livejournal.com/40593.html
> > > > Good point, if we do change the definition, we also need to update
> > > > this documentation.
> > > >
> > > > > More recently/related,
> > > > >
> > > > > https://lore.kernel.org/lkml/20220421230848.GA194034@paulmck-ThinkPad-P17-Gen-1/T/#m2a8701c7c377ccb27190a6679e58b0929b0b0ad9
> > > > It would not be a bad thing for LKMM to be able to show people the
> > > > error of their ways when they try non-nested partially overlapping SRCU
> > > > read-side critical sections. Or, should they find some valid use case,
> > > > to help them prove their point. ;-)
> > > Isn't it true that the current code will flag srcu-bad-nesting if a
> > > litmus test has non-nested overlapping SRCU read-side critical sections?
> > Now that you mention it, it does indeed, flagging srcu-bad-nesting.
> >
> > Just to see if I understand, different-values yields true if the set
> > contains multiple elements with the same value mapping to different
> > values. Or, to put it another way, if the relation does not correspond
> > to a function.
> >
> > Or am I still missing something?
>
> based on https://lkml.org/lkml/2019/1/10/155:

Ah, thank you for the pointer!

> I think different-values(r) is the same as r \ same-values, where
> same-values links all reads and writes that have the same value (e.g.,
> "write 5 to x" and "read 5 from y").
>
> With this in mind, I think the idea is to 1) forbid partial overlap, and
> using the different-values to 2) force them to provide the appropriate
> value.
> This works because apparently srcu-lock is a read and srcu-unlock is a
> write, so in case of
> int r1 = srcu-lock(&ss);   ==>  Read(&ss, x), r1 := x
> ...
> srcu-unlock(&ss, r1);  ==> Write(&ss, r1), which is Write(&ss, x)
>
> This guarantees that the read and write have the same value, hence
> different-values(...) will be the empty relation, and so no flag.

Might it instead match the entire event?

> > > And if it is true, is there any need to change the memory model at this
> > > point?
> > >
> > > (And if it's not true, that's most likely due to a bug in herd7.)
> > Agreed, changes must wait for SRCU support in herd7.
> >
> > At which point something roughly similar to this might work?
> >
> > let srcu-rscs = return_value(Srcu-lock) ; (dep | rfi)* ;
> > parameter(Srcu-unlock, 2)
>
> I would like instead to be able to give names to the arguments of events
> that become dependency relations, like
>    event srcu_unlock(struct srcu_struct *srcu_addr, struct srcu_token
> *srcu_data)
> and then
>     let srcu-rscs = [Srcu-lock] ; srcu_data ; (data; rfi)*
>
> Personally I would also like to not have Linux-specific primitives in
> herd7/cat; that means that to understand LKMM you also need to understand
> the herd7 tool, which sounds quite brittle.
>
> I would prefer if herd7 had some means to define custom events/instructions
> and uninterpreted relations between them, like
>
> relation rf : [write] x [read]
> [read] <= range(rf)
> empty rf ;rf^-1 \ id
>
> and some way to say
> [read] ; .return <= rf^-1 ; .data
> (where .return is a functional relation relating every event to the value it
> returns, and .xyz is the functional relation relating every event to the
> value of its argument xyz).

I am glad that I asked rather than kneejerk filing a bug report. ;-)

Other thoughts?

Thanx, Paul

2023-01-17 20:45:54

by Jonas Oberhauser

Subject: Re: Internal vs. external barriers (was: Re: Interesting LKMM litmus test)




On 1/17/2023 6:43 PM, Paul E. McKenney wrote:
> On Tue, Jan 17, 2023 at 10:56:34AM -0500, Alan Stern wrote:
>> On Tue, Jan 17, 2023 at 07:14:16AM -0800, Paul E. McKenney wrote:
>>> On Tue, Jan 17, 2023 at 12:46:28PM +0100, Andrea Parri wrote:
>>>> This was reminiscent of old discussions, in fact, we do have:
>>>>
>>>> [tools/memory-model/Documentation/litmus-tests.txt]
>>>>
>>>> e. Although sleepable RCU (SRCU) is now modeled, there
>>>> are some subtle differences between its semantics and
>>>> those in the Linux kernel. For example, the kernel
>>>> might interpret the following sequence as two partially
>>>> overlapping SRCU read-side critical sections:
>>>>
>>>> 1 r1 = srcu_read_lock(&my_srcu);
>>>> 2 do_something_1();
>>>> 3 r2 = srcu_read_lock(&my_srcu);
>>>> 4 do_something_2();
>>>> 5 srcu_read_unlock(&my_srcu, r1);
>>>> 6 do_something_3();
>>>> 7 srcu_read_unlock(&my_srcu, r2);
>>>>
>>>> In contrast, LKMM will interpret this as a nested pair of
>>>> SRCU read-side critical sections, with the outer critical
>>>> section spanning lines 1-7 and the inner critical section
>>>> spanning lines 3-5.
>>>>
>>>> This difference would be more of a concern had anyone
>>>> identified a reasonable use case for partially overlapping
>>>> SRCU read-side critical sections. For more information
>>>> on the trickiness of such overlapping, please see:
>>>> https://paulmck.livejournal.com/40593.html
>>> Good point, if we do change the definition, we also need to update
>>> this documentation.
>>>
>>>> More recently/related,
>>>>
>>>> https://lore.kernel.org/lkml/20220421230848.GA194034@paulmck-ThinkPad-P17-Gen-1/T/#m2a8701c7c377ccb27190a6679e58b0929b0b0ad9
>>> It would not be a bad thing for LKMM to be able to show people the
>>> error of their ways when they try non-nested partially overlapping SRCU
>>> read-side critical sections. Or, should they find some valid use case,
>>> to help them prove their point. ;-)
>> Isn't it true that the current code will flag srcu-bad-nesting if a
>> litmus test has non-nested overlapping SRCU read-side critical sections?
> Now that you mention it, it does indeed, flagging srcu-bad-nesting.
>
> Just to see if I understand, different-values yields true if the set
> contains multiple elements with the same value mapping to different
> values. Or, to put it another way, if the relation does not correspond
> to a function.
>
> Or am I still missing something?

based on https://lkml.org/lkml/2019/1/10/155:
I think different-values(r) is the same as r \ same-values, where
same-values links all reads and writes that have the same value (e.g.,
"write 5 to x" and "read 5 from y").

With this in mind, I think the idea is to 1) forbid partial overlap, and
using the different-values to 2) force them to provide the appropriate
value.
This works because apparently srcu-lock is a read and srcu-unlock is a
write, so in case of
int r1 = srcu-lock(&ss);   ==>  Read(&ss, x), r1 := x
...
srcu-unlock(&ss, r1);  ==> Write(&ss, r1), which is Write(&ss, x)

This guarantees that the read and write have the same value, hence
different-values(...) will be the empty relation, and so no flag.
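This reading of different-values can be made concrete with a small Python sketch. It is an illustrative set-based toy model of my own (not herd7's actual semantics), where events are (label, value) pairs and different-values(r) is r minus all same-value pairs:

```python
# Toy model (illustrative, not herd7's implementation): events carry a
# value, relations are sets of event pairs, and different-values(r)
# keeps only the pairs whose two events disagree on their value.

def different_values(r):
    return {(a, b) for (a, b) in r if a[1] != b[1]}

# srcu_read_lock acts like a read returning some token x, and the
# matching srcu_read_unlock like a write of that same token:
#   int r1 = srcu_read_lock(&ss);  ==>  Read(&ss, x), r1 := x
#   srcu_read_unlock(&ss, r1);     ==>  Write(&ss, x)
lock = ("Read(&ss)", 906)      # token value chosen arbitrarily
unlock = ("Write(&ss)", 906)   # the same token passed back

# Matching values make the relation empty, so nothing is flagged.
srcu_rscs = {(lock, unlock)}
assert different_values(srcu_rscs) == set()

# A mismatched token, by contrast, survives and would be flagged.
bad = {(lock, ("Write(&ss)", 907))}
assert different_values(bad) == bad
```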

>
>> And if it is true, is there any need to change the memory model at this
>> point?
>>
>> (And if it's not true, that's most likely due to a bug in herd7.)
> Agreed, changes must wait for SRCU support in herd7.
>
> At which point something roughly similar to this might work?
>
> let srcu-rscs = return_value(Srcu-lock) ; (dep | rfi)* ;
> parameter(Srcu-unlock, 2)

I would like instead to be able to give names to the arguments of events
that become dependency relations, like
   event srcu_unlock(struct srcu_struct *srcu_addr, struct srcu_token
*srcu_data)
and then
    let srcu-rscs = [Srcu-lock] ; srcu_data ; (data; rfi)*

Personally I would also like to not have Linux-specific primitives in
herd7/cat; that means that to understand LKMM you also need to
understand the herd7 tool, which sounds quite brittle.

I would prefer if herd7 had some means to define custom
events/instructions and uninterpreted relations between them, like

relation rf : [write] x [read]
[read] <= range(rf)
empty rf ;rf^-1 \ id

and some way to say
[read] ; .return <= rf^-1 ; .data
(where .return is a functional relation relating every event to the
value it returns, and .xyz is the functional relation relating every
event to the value of its argument xyz).



2023-01-17 21:59:48

by Alan Stern

Subject: Re: Internal vs. external barriers (was: Re: Interesting LKMM litmus test)

On Tue, Jan 17, 2023 at 09:43:08AM -0800, Paul E. McKenney wrote:
> On Tue, Jan 17, 2023 at 10:56:34AM -0500, Alan Stern wrote:
> > Isn't it true that the current code will flag srcu-bad-nesting if a
> > litmus test has non-nested overlapping SRCU read-side critical sections?
>
> Now that you mention it, it does indeed, flagging srcu-bad-nesting.
>
> Just to see if I understand, different-values yields true if the set
> contains multiple elements with the same value mapping to different
> values. Or, to put it another way, if the relation does not correspond
> to a function.

As I understand it, given a relation r (i.e., a set of pairs of events),
different-values(r) returns the sub-relation consisting of those pairs
in r for which the value associated with the first event of the pair is
different from the value associated with the second event of the pair.

For srcu_read_lock() and loads in general, the associated value is the
value returned by the function call. For srcu_read_unlock() and stores
in general, the associated value is the value (i.e., the second
argument) passed to the function call.

> Or am I still missing something?
>
> > And if it is true, is there any need to change the memory model at this
> > point?
> >
> > (And if it's not true, that's most likely due to a bug in herd7.)
>
> Agreed, changes must wait for SRCU support in herd7.

Apparently the only change necessary is to make the srcu_read_lock and
srcu_read_unlock events act like loads and stores. In particular, they
need to be subject to the standard rules for calculating dependencies.

Right now the behavior is kind of strange. The following simple litmus
test:

C test
{}
P0(int *x)
{
int r1;
r1 = srcu_read_lock(x);
srcu_read_unlock(x, r1);
}
exists (~0:r1=0)

produces the following output from herd7:

Test test Allowed
States 1
0:r1=906;
Ok
Witnesses
Positive: 1 Negative: 0
Condition exists (not (0:r1=0))
Observation test Always 1 0
Time test 0.01
Hash=2f42c87ae9c1d267f4e80c66f646b9bb

Don't ask me where that 906 value comes from or why it isn't 0. Also,
herd7's graphical output shows there is no data dependency from the lock
to the unlock, but we need to have one.

> At which point something roughly similar to this might work?
>
> let srcu-rscs = return_value(Srcu-lock) ; (dep | rfi)* ;
> parameter(Srcu-unlock, 2)

I can't tell what that's supposed to mean. In any case, I think what
you want would be:

let srcu-rscs = ([Srcu-lock] ; data ; [Srcu-unlock]) & loc

> Given an Srcu-down and an Srcu-up:
>
> let srcu-rscs = ( return_value(Srcu-lock) ; (dep | rfi)* ;
> parameter(Srcu-unlock, 2) ) |
> ( return_value(Srcu-down) ; (dep | rf)* ;
> parameter(Srcu-up, 2) )
>
> Seem reasonable, or am I missing yet something else?

Not at all reasonable.

For one thing, consider this question: Which statements lie inside a
read-side critical section?

With srcu_read_lock() and a matching srcu_read_unlock(), the answer is
clear: All statements po-between the two. With srcu_down_read() and
srcu_up_read(), the answer is cloudy in the extreme.

Also, bear in mind that the Fundamental Law of RCU is formulated in
terms of stores propagating to a critical section's CPU. What are we to
make of this when a single critical section can belong to more than one
CPU?

Indeed, given:

P0(int *x) {
srcu_down_read(x);
}

P1(int *x) {
srcu_up_read(x);
}

what are we to make of executions in which P1 executes before P0?

Alan

2023-01-17 22:29:26

by Jonas Oberhauser

Subject: Re: Internal vs. external barriers (was: Re: Interesting LKMM litmus test)


On 1/17/2023 7:55 PM, Paul E. McKenney wrote:

> On Tue, Jan 17, 2023 at 07:27:29PM +0100, Jonas Oberhauser wrote:
>> On 1/17/2023 6:43 PM, Paul E. McKenney wrote:
>>> Just to see if I understand, different-values yields true if the set
>>> contains multiple elements with the same value mapping to different
>>> values. Or, to put it another way, if the relation does not correspond
>>> to a function.
>> based on https://lkml.org/lkml/2019/1/10/155:
> Ah, thank you for the pointer!

What troubles me is that this is the only reference I could find as to
what the meaning of different-values is. Herd7 is a great tool for
specifying memory models, but the documentation could be heavily improved.

>> I think different-values(r) is the same as r \ same-values, where
>> same-values links all reads and writes that have the same value (e.g.,
>> "write 5 to x" and "read 5 from y").
>>
>> With this in mind, I think the idea is to 1) forbid partial overlap, and
>> using the different-values to 2) force them to provide the appropriate
>> value.
>> This works because apparently srcu-lock is a read and srcu-unlock is a
>> write, so in case of
>> int r1 = srcu-lock(&ss);   ==>  Read(&ss, x), r1 := x
>> ...
>> srcu-unlock(&ss, r1);  ==> Write(&ss, r1), which is Write(&ss, x)
>>
>> This guarantees that the read and write have the same value, hence
>> different-values(...) will be the empty relation, and so no flag.
> Might it instead match the entire event?

Which event?

Btw, if you want to state that a relation is functional (e.g., that
srcu-rscs only matches each lock event to at most one unlock event), one
way to do so is to state

flag ~empty ((srcu-rscs ; srcu-rscs^-1) \ id) as srcu-use-multiple-lock

I visualize this as two different locks pointing via srcu-rscs to the
same unlock.
Analogously,

flag ~empty ((srcu-rscs^-1 ; srcu-rscs) \ id) as srcu-reuse-lock-idx

should flag if a single lock points to two different unlocks (note: in a
single execution! this does not flag `int idx = srcu_lock(&ss); if {
...; srcu_unlock(&ss,idx); } else { ... ; srcu_unlock(&ss,idx) ;... } `).

[snipping in here a part written by Alan:]

> I think what you want would be:
>
> let srcu-rscs = ([Srcu-lock] ; data ; [Srcu-unlock]) & loc
>

I think it makes more sense to define
    let srcu-rscs = ([Srcu-lock] ; (whatever relation says "I'm using
the return value as the second input") ; [Srcu-unlock])
and then to do
    flag ~empty srcu-rscs \ loc as srcu-passing-idx-to-wrong-unlock
to flag cases where you try to pass an index from one srcu_struct to
another.

>>> Agreed, changes must wait for SRCU support in herd7.
>>>
>> I would like instead to be able to give names to the arguments of events
>> that become dependency relations, like
>>    event srcu_unlock(struct srcu_struct *srcu_addr, struct srcu_token
>> *srcu_data)
>> and then
>>     let srcu-rscs = [Srcu-lock] ; srcu_data ; (data; rfi)*
>>
>> Personally I would also like to not have Linux-specific primitives in
>> herd7/cat; that means that to understand LKMM you also need to understand
>> the herd7 tool, which sounds quite brittle.
>>
>> I would prefer if herd7 had some means to define custom events/instructions
>> and uninterpreted relations between them, like
>>
>> relation rf : [write] x [read]
>> [read] <= range(rf)
>> empty rf ;rf^-1 \ id
>>
>> and some way to say
>> [read] ; .return <= rf^-1 ; .data
>> (where .return is a functional relation relating every event to the value it
>> returns, and .xyz is the functional relation relating every event to the
>> value of its argument xyz).
> I am glad that I asked rather than kneejerk filing a bug report. ;-)

Please send me a link if you open a thread, then I'll voice my wishes as well.
Maybe Luc is in a wish-fulfilling mood?

best wishes,
jonas

PS:

> Other thoughts?

Other than that I added too many [] in my example? :) :( :) I meant

relation rf : write x read
read <= range(rf)
empty rf ;rf^-1 \ id

2023-01-17 23:53:22

by Alan Stern

Subject: Re: Internal vs. external barriers (was: Re: Interesting LKMM litmus test)

On Tue, Jan 17, 2023 at 06:48:12PM +0100, Jonas Oberhauser wrote:
>
> [I set up my huaweicloud.com address to send and receive mail, allegedly
> huaweicloud.com has fewer problems. Let's see. Also snipping together some
> mails that were sent to me in the meantime.]

That worked; your message can be found on lore.kernel.org.

> On 1/14/2023 5:42 PM, Alan Stern wrote:
> >
> > On Fri, Jan 13, 2023 at 02:55:34PM +0000, Jonas Oberhauser wrote:
> > >
> > > I think the whole rcu-order topic can be summarized as the 'one
> > > rule': "if a grace period happens before a rcsc-unlock, it must also
> > > happen before the rcsc -lock, and analogously if rcsc-lock happens
> > > before a grace period, the rcsc-unlock also happens before the grace
> > > period" .
> >
> > There is more to it than that, as I mentioned earlier. A complete
> > description can be found the explanation.txt document; it says: For any
> > critical section C and any grace period G, at least one of the following
> > statements must hold: (1) C ends before G does, and in addition, every
> > store that propagates to C's CPU before the end of C must propagate to
> > every CPU before G ends. (2) G starts before C does, and in addition,
> > every store that propagates to G's CPU before the start of G must
> > propagate to every CPU before C starts.
>
> Yes, this difference took me a while to appreciate. If there was only (a
> strict interpretation of) the rule I mentioned, then the RCU axioms could be
> stated as just a regular atomicity axiom.
>
> But because it also affects the surrounding operations, the recursion
> becomes necessary.
>
>
> >
> >
> > > >
> > > > IMO it's generally better to think of grace periods as not being
> > > > instantaneous but as occurring over a prolonged period of time.
> > > > Thus we should say: If a grace period ends before an unlock
> > > > occurs, it must start before the corresponding lock. And
> > > > contrapositively, if a lock occurs before a grace period starts,
> > > > the corresponding unlock must occur before the grace period
> > > > ends.
> > >
> > > I started thinking about it like this and comparing start/end times.
> > > That made it more complicated, but the math came out the same in the
> > > end. I could imagine that there are some scenarios where the
> > > intuition of collapsing the grace period to a single event could
> > > cause problems, but I haven't seen any.
> >
> >
> >
> > IIRC (and it has been a long time), this may be vaguely connected with
> > the reason why the definitions of gp, rcu-link, and rcu-fence have po
> > on one side but po? on the other. But I can't remember the details.
>
>
>
> There's at least some connection. And I think from an operational model
> perspective, the distinction has some effect.
>
> That's because part (1) of the rule you quoted forces propagation before G

I prefer to say "requires" rather than "forces". "Forces" sounds more
like you're talking about a hardware mechanism that prevents something
bad from happening, like the way the cache coherency rules are enforced.

> ends, which allows propagation to G's CPU after the start or before the end.

After the start or before the end of what? G? And what time could
possibly count as being neither after the start nor before the end of G?

> Stores propagated in that time period are not forced to propagate by part
> (2).
>
> If the two events in the operational model were merged, then all stores that
> need to propagate to G's CPU through rule (1) would also need to propagate
> to other CPU's through part (2).

Again, I don't know why you say this. In fact, all stores that need to
propagate to G's CPU through rule (1) are also required to propagate
to other CPU's through rule (1) -- not rule (2). And this has nothing
to do with whether the end of G occurs at the same time as the start or
some time afterward.

> In particular, if we had an execution with 3 CPUs like below (time from top
> to bottom, also attached as a text file in case my e-mail client messes up
> the formatting)
>
> CPU1             | CPU2           | CPU3
> start CS;        |                |
> read stage==0    |                |
>                  | stage = 1;     |
>                  |                |
>                  | GP {           |
> x = 1;           |                |
>                  |                | start CS;
>                  |                | read x == 0;
> end CS;          |                |
>                  | }              |
>                  | stage = 2;     |
>                  |                | read stage == 2;
>                  |                | read x == 1;
>                  |                | end CS;
>
> then we allow x=1 not to propagate to the third CPU before it reads x.

I still can't understand what you're saying. Since CPU3 reads x==1, of
course we require x=1 to propagate to CPU3 before it reads x.

> But
> if there was only a single grace period step, which would not overlap with
> either CS, then this outcome would be forbidden.
> Because stage=1 didn't propagate to CPU1, the grace period would need to be
> after CPU1's critical section.
> Because stage=2 did propagate to CPU3, the grace period would need to be
> before CPU3's critical section.
> But then x=1 would need to propagate to CPU3 before the grace period, and
> thus before its CS starts.
>
> I think it's because things can't happen "at the same time" in the
> operational model.

That's simply not true. As an example, writes propagate to their own
CPU at the same time as they execute.

> Otherwise, x=1 could propagate "at the same time" as it
> executes the grace period, and then wouldn't be affected by rule (2)
> anymore.
>
> But in the axiomatic model, we can use the po; ... to state that things must
> happen "strictly before" the start of G (instead of "kinda at the same
> time"). If there is a po-earlier event that observes the propagation, then
> the propagation happened before the start of G. If there is no po-earlier
> event that observes the propagation, then the store may as well have
> propagated at the same time as G (or "between the start and end"). So having
> the distinction of start and end of grace periods becomes at least less
> important.
>
> I still haven't wrapped my head fully around why the other side has to be
> po?.
>
> I asked Hernan to run all the old litmus tests with rcu-fence = po ;
> rcu-order ; po and he reported no difference in those tests either.
>
> Now I'm thinking if it can be proven that some of them aren't necessary, or
> could be simplified.

Maybe. But why go to the trouble?

> Pretending for simplicity that rscs and grace periods aren't reads&writes

They aren't. You don't have to pretend.

> (and that prop must always start with overwrite&ext, but this can be done
> wlog if we define the rcu relations with prop? instead of prop).
>
> I'm first looking at the rcu-link relation.
> Any use of rcu-link in rcu-order is preceded by an rscs or gp. Considering
> the cases where po? is not taken, the first edge of hb*;pb*;prop? can't be
> any of prop, rfe, or prop&int because the rcu-order never ends in
> write/reads. This leaves only ppo (or nothing), and we can use ppo <= po
> (with the patch that's currently lying on my hard disk :D) to get that the
> complete edge is a subset of
>
> (po ; hb*;pb*;prop? | id);po
>
> Therefore I think we have rcu-link = (po ; hb*;pb*;prop? ; po) | po

This does not seem like a simplification to me.

> Next, I look at rcu-fence in rb = prop? ; rcu-fence ; hb* ; pb*.
> An rcu-fence ; hb* ; pb* which doesn't have the po at the end of rcu-fence
> can not have prop, rfe, or prop&int after the rcu-fence either. This leaves
> two cases, either the rb comes from prop? ; po ; rcu-order or from prop? ;
> po ; rcu-order ; ppo ; hb* ; pb*.
>
> In the latter case we can use ppo <= po and get back prop? ; po ; rcu-order
> ; po ; hb* ; pb*, so considering po? here is not necessary.
>
> In the former case, we can ask instead if po ; rcu-order ; prop? is
> irreflexive, and since prop can't follow on rcu-order, this is the same as
> po ; rcu-order.
>
> This can only have an identity edge if at least some of the rcu-links in
> rcu-order are not just po. So let's look at the last such edge, when
> abbreviating RCU cs and grace periods as R we get
>
>   po; (R ; rcu-link)* ; R ; po ; hb*;pb*;prop? ; (po ; R)+
>
> where overall the number of gps >= number of rscs, and this can be rewritten
> as
>
>   prop? ; (po ; R)+; po; (rcu-order ; rcu-link)? ; R ; po ; hb*;pb*
>
> and I believe (po ; R)+; po; (R ; rcu-link)* ; R ; po <= po ; rcu-order ;
> po (using the fact that overall the number of gps is still >= the number of
> rscs)
>
> so then it simplifies again to
>
>   prop? ; po ; rcu-order ; po ; hb*;pb*
>
> and po? is again not necessary.
>
> I'm again ignoring srcu here. I don't know if we can still shuffle the
> gp/rscs around like above when the locations have to match.

Indeed, before support for SRCU was added to the memory model, it did
put the po and po? terms in other places. I was forced to move them in
order to add the tests for matching locations.

> Either way if you can confirm my suspicion that the po? in rcu-fence could
> be replaced by po, and that the po? in rcu-link could be replaced by (po ;
> ... ; po) | po, or have a counter example and some additional explanation
> for why the po? makes sense, I'd be thankful.
>
>
> > There was also something about what should happen when you have two
> > grace periods in a row.
>
> Note that two grace periods in a row are a subset of po;rcu-gp;po and thus
> gp, and so there's nothing to be done.

That is not stated carefully, but it probably is wrong. Consider this:

P0 P1 P2
--------------- -------------- -----------------
rcu_read_lock Wy=1 rcu_read_lock
Wx=1 synchronize_rcu Wz=1
Ry=0 synchronize_rcu Rx=0
rcu_read_unlock Rz=0 rcu_read_unlock

(W stands for Write and R for Read.) This execution is forbidden by the
counting rule: Its cycle has two grace periods and two critical
sections. But if we changed the definition of gp to be

let gp = po ; [Sync-rcu | Sync-srcu] ; po

then the memory model would allow the execution. So having the po? at
the end of gp is vital. (Or at the beginning; I think either one would
work as well and the choice was arbitrary.)

> Something more interesting happens with critical sections, where currently
> po ; rcu-rcsci ; po ; rcu-rcsci ; po should be a subset of po ; rcu-rcsci ;
> po because of the forbidden partial overlap. But I currently don't think
> it's necessary to consider such cases.
>
> The other thing that causes complications is when all the pb*, hb*, and prop
> links in rcu-link are just id, and then rcu-link becomes po?;po = po.
> Currently I don't understand why such pure po links should be necessary at
> all, since they should just merge with the neighboring rcu-gps into a gp
> edge.
>
> > >
> > > The only way I'd count rcu-link' as adding a case is if you say that
> > > the (...)* has two cases :D (or infinitely many :D) I don't count
> > > the existence of the definition because you could always inline it
> > > (but lose a lot of clarity imho).
> >
> >
> >
> > If you did inline it, you'd probably find that the end result was
> > exactly what is currently in the LKMM.
>
> Not quite. There are two differences. The first is that the
> rcu-order;rcu-link;rcu-order case disappears.
>
> The second is that the ...;rcu-link;... and
> ...;rcu-link;rcu-order;rcu-link;... subcases get merged, and not to
> ...;rcu-link;(rcu-order;rcu-link)?;... but to
> ...;rcu-link;(rcu-order;rcu-link)*;...

Okay.

> Indeed the definitions of rcu-extend and rcu-order can't become exactly the
> same because they are different relations, e.g., rcu-order can begin and end
> with a grace period but rcu-extend can't.
>
> That's why an additional non-recursive layer of
>
>    rcu-order = rcu-extend ; (rcu-link ; rcu-extend)*
>
> is necessary to define rcu-order in terms of rcu-extend. But as I mentioned
> I don't think rcu-order is necessary at all to define LKMM, and one can
> probably just use rcu-extend instead of rcu-order (and in fact even a
> version of rcu-extend without any lone rcu-gps).

Sure, you could do that, but it wouldn't make sense. Why would anyone
want to define an RCU ordering relation that includes

gp ... rscs ... gp ... rscs

but not

gp ... rscs ... rscs ... gp

?

> > > the law is, well, um, "primarily empirical in nature"
> >
> >
> >
> > Actually it isn't, not any more. That quote was written before we
> > formalized RCU in the LKMM.
>
> I meant that the original formulation was empirical; of course you have
> formalized it, but how do you know that the formalization is valid?

We proved it in the ASPLOS paper. That is, we proved that a particular
implementation faithfully obeys the restrictions of the formalization.

> I think
> the correspondence with "what's intended" is always an empirical thing, even
> if you formally prove the correctness of the implementation against the
> specification you might have missed some parts or added some parts that are
> actually just implementation details.

While I agree that it is difficult to be sure that an informal
specification agrees with a formal model, I wouldn't describe attempts
to ensure this as "empirical".

Alan

2023-01-18 03:11:51

by Alan Stern

[permalink] [raw]
Subject: Re: Internal vs. external barriers (was: Re: Interesting LKMM litmus test)

On Tue, Jan 17, 2023 at 09:43:08AM -0800, Paul E. McKenney wrote:
> On Tue, Jan 17, 2023 at 10:56:34AM -0500, Alan Stern wrote:

> > Isn't it true that the current code will flag srcu-bad-nesting if a
> > litmus test has non-nested overlapping SRCU read-side critical sections?
>
> Now that you mention it, it does indeed, flagging srcu-bad-nesting.
>
> Just to see if I understand, different-values yields true if the set
> contains multiple elements with the same value mapping to different
> values. Or, to put it another way, if the relation does not correspond
> to a function.
>
> Or am I still missing something?
>
> > And if it is true, is there any need to change the memory model at this
> > point?
> >
> > (And if it's not true, that's most likely due to a bug in herd7.)
>
> Agreed, changes must wait for SRCU support in herd7.

Maybe we don't. Please test the patch below; I think it will do what
you want -- and it doesn't rule out nesting.

Alan



Index: usb-devel/tools/memory-model/linux-kernel.bell
===================================================================
--- usb-devel.orig/tools/memory-model/linux-kernel.bell
+++ usb-devel/tools/memory-model/linux-kernel.bell
@@ -57,20 +57,12 @@ flag ~empty Rcu-lock \ domain(rcu-rscs)
flag ~empty Rcu-unlock \ range(rcu-rscs) as unbalanced-rcu-locking

(* Compute matching pairs of nested Srcu-lock and Srcu-unlock *)
-let srcu-rscs = let rec
-	unmatched-locks = Srcu-lock \ domain(matched)
-	and unmatched-unlocks = Srcu-unlock \ range(matched)
-	and unmatched = unmatched-locks | unmatched-unlocks
-	and unmatched-po = ([unmatched] ; po ; [unmatched]) & loc
-	and unmatched-locks-to-unlocks =
-		([unmatched-locks] ; po ; [unmatched-unlocks]) & loc
-	and matched = matched | (unmatched-locks-to-unlocks \
-		(unmatched-po ; unmatched-po))
-	in matched
+let srcu-rscs = ([Srcu-lock] ; data ; [Srcu-unlock]) & loc

(* Validate nesting *)
flag ~empty Srcu-lock \ domain(srcu-rscs) as unbalanced-srcu-locking
-flag ~empty Srcu-unlock \ range(srcu-rscs) as unbalanced-srcu-locking
+flag ~empty Srcu-unlock \ range(srcu-rscs) as unbalanced-srcu-unlocking
+flag ~empty (srcu-rscs^-1 ; srcu-rscs) \ id as multiple-srcu-unlocks

(* Check for use of synchronize_srcu() inside an RCU critical section *)
flag ~empty rcu-rscs & (po ; [Sync-srcu] ; po) as invalid-sleep
@@ -80,11 +72,11 @@ flag ~empty different-values(srcu-rscs)

(* Compute marked and plain memory accesses *)
let Marked = (~M) | IW | Once | Release | Acquire | domain(rmw) | range(rmw) |
- LKR | LKW | UL | LF | RL | RU
+ LKR | LKW | UL | LF | RL | RU | Srcu-lock | Srcu-unlock
let Plain = M \ Marked

(* Redefine dependencies to include those carried through plain accesses *)
-let carry-dep = (data ; rfi)*
+let carry-dep = (data ; rfi ; [~Srcu-lock])*
let addr = carry-dep ; addr
let ctrl = carry-dep ; ctrl
let data = carry-dep ; data
Index: usb-devel/tools/memory-model/linux-kernel.def
===================================================================
--- usb-devel.orig/tools/memory-model/linux-kernel.def
+++ usb-devel/tools/memory-model/linux-kernel.def
@@ -49,8 +49,8 @@ synchronize_rcu() { __fence{sync-rcu}; }
synchronize_rcu_expedited() { __fence{sync-rcu}; }

// SRCU
-srcu_read_lock(X) __srcu{srcu-lock}(X)
-srcu_read_unlock(X,Y) { __srcu{srcu-unlock}(X,Y); }
+srcu_read_lock(X) __load{srcu-lock}(*X)
+srcu_read_unlock(X,Y) { __store{srcu-unlock}(*X,Y); }
synchronize_srcu(X) { __srcu{sync-srcu}(X); }
synchronize_srcu_expedited(X) { __srcu{sync-srcu}(X); }


2023-01-18 04:45:51

by Paul E. McKenney

[permalink] [raw]
Subject: Re: Internal vs. external barriers (was: Re: Interesting LKMM litmus test)

On Tue, Jan 17, 2023 at 03:15:06PM -0500, Alan Stern wrote:
> On Tue, Jan 17, 2023 at 09:43:08AM -0800, Paul E. McKenney wrote:
> > On Tue, Jan 17, 2023 at 10:56:34AM -0500, Alan Stern wrote:
> > > Isn't it true that the current code will flag srcu-bad-nesting if a
> > > litmus test has non-nested overlapping SRCU read-side critical sections?
> >
> > Now that you mention it, it does indeed, flagging srcu-bad-nesting.
> >
> > Just to see if I understand, different-values yields true if the set
> > contains multiple elements with the same value mapping to different
> > values. Or, to put it another way, if the relation does not correspond
> > to a function.
>
> As I understand it, given a relation r (i.e., a set of pairs of events),
> different-values(r) returns the sub-relation consisting of those pairs
> in r for which the value associated with the first event of the pair is
> different from the value associated with the second event of the pair.

OK, so different-values(r) is different than (r \ id) because the
former operates on values and the latter on events?
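To illustrate the distinction in a toy setting, here is a sketch in Python (the event names and the 906 value are invented; this is not how herd7 represents events internally):

```python
# Each event carries a value: srcu_read_lock events carry the value they
# return, srcu_read_unlock events carry the value passed to them.
value = {"L1": 906, "U1": 906, "U2": 907}

# A relation is a set of (event, event) pairs.
r = {("L1", "U1"), ("L1", "U2")}

def different_values(rel):
    # Keep only the pairs whose two events carry different values.
    return {(a, b) for (a, b) in rel if value[a] != value[b]}

def minus_id(rel):
    # rel \ id: remove only pairs relating an event to itself.
    return {(a, b) for (a, b) in rel if a != b}

# ("L1", "U1") relates two *distinct* events with the *same* value:
# it survives r \ id but is dropped by different-values.
print(different_values(r))  # {('L1', 'U2')}
print(minus_id(r) == r)     # True: r contains no identity pairs
```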

> For srcu_read_lock() and loads in general, the associated value is the
> value returned by the function call. For srcu_read_unlock() and stores
> in general, the associated value is the value (i.e., the second
> argument) passed to the function call.

As you say later, it would be good to have that "data" relation include
srcu_read_lock() and srcu_read_unlock(). ;-)

> > Or am I still missing something?
> >
> > > And if it is true, is there any need to change the memory model at this
> > > point?
> > >
> > > (And if it's not true, that's most likely due to a bug in herd7.)
> >
> > Agreed, changes must wait for SRCU support in herd7.
>
> Apparently the only change necessary is to make the srcu_read_lock and
> srcu_read_unlock events act like loads and stores. In particular, they
> need to be subject to the standard rules for calculating dependencies.
>
> Right now the behavior is kind of strange. The following simple litmus
> test:
>
> C test
>
> {}
>
> P0(int *x)
> {
> 	int r1;
>
> 	r1 = srcu_read_lock(x);
> 	srcu_read_unlock(x, r1);
> }
>
> exists (~0:r1=0)
>
> produces the following output from herd7:
>
> Test test Allowed
> States 1
> 0:r1=906;
> Ok
> Witnesses
> Positive: 1 Negative: 0
> Condition exists (not (0:r1=0))
> Observation test Always 1 0
> Time test 0.01
> Hash=2f42c87ae9c1d267f4e80c66f646b9bb
>
> Don't ask me where that 906 value comes from or why it isn't 0. Also,
> herd7's graphical output shows there is no data dependency from the lock
> to the unlock, but we need to have one.

Is it still the case that any herd7 value greater than 127 is special?

> > At which point something roughly similar to this might work?
> >
> > let srcu-rscs = return_value(Srcu-lock) ; (dep | rfi)* ;
> > parameter(Srcu-unlock, 2)
>
> I can't tell what that's supposed to mean. In any case, I think what
> you want would be:
>
> let srcu-rscs = ([Srcu-lock] ; data ; [Srcu-unlock]) & loc

Agreed, given that (data) is supposed to propagate through locals
and globals. Within the Linux kernel, assigning the return value from
srcu_read_lock() to a global is asking for it, though.
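As a toy illustration of how that definition pairs each lock with an unlock (a Python sketch with invented event names; the kind/location tagging is a stand-in for herd7's real event structure):

```python
# Each event has a kind and a location (the srcu_struct it operates on).
events = {
    "lk0": ("Srcu-lock", "s1"),
    "ul0": ("Srcu-unlock", "s1"),
    "ul1": ("Srcu-unlock", "s2"),
}

# data: the lock's return value flows to both unlocks.
data = {("lk0", "ul0"), ("lk0", "ul1")}

def srcu_rscs(events, data):
    # ([Srcu-lock] ; data ; [Srcu-unlock]) & loc
    return {(a, b) for (a, b) in data
            if events[a][0] == "Srcu-lock"
            and events[b][0] == "Srcu-unlock"
            and events[a][1] == events[b][1]}  # & loc: same srcu_struct

print(srcu_rscs(events, data))  # {('lk0', 'ul0')}: ul1 is on another location
```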

> > Given an Srcu-down and an Srcu-up:
> >
> > let srcu-rscs = ( return_value(Srcu-lock) ; (dep | rfi)* ;
> > parameter(Srcu-unlock, 2) ) |
> > ( return_value(Srcu-down) ; (dep | rf)* ;
> > parameter(Srcu-up, 2) )
> >
> > Seem reasonable, or am I missing yet something else?
>
> Not at all reasonable.
>
> For one thing, consider this question: Which statements lie inside a
> read-side critical section?

Here srcu_down_read() and srcu_up_read() are to srcu_read_lock() and
srcu_read_unlock() as down_read() and up_read() are to mutex_lock()
and mutex_unlock(). Not that this should be all that much comfort
given that I have no idea how one would go about modeling down_read()
and up_read() in LKMM.

> With srcu_read_lock() and a matching srcu_read_unlock(), the answer is
> clear: All statements po-between the two. With srcu_down_read() and
> srcu_up_read(), the answer is cloudy in the extreme.

And I agree that it must be clearly specified, and that my previous try
was completely lacking. Here is a second attempt:

let srcu-rscs = (([Srcu-lock] ; data ; [Srcu-unlock]) & loc) |
		(([Srcu-down] ; (data | rf)* ; [Srcu-up]) & loc)

(And I see your proposal and will try it.)

> Also, bear in mind that the Fundamental Law of RCU is formulated in
> terms of stores propagating to a critical section's CPU. What are we to
> make of this when a single critical section can belong to more than one
> CPU?

One way of answering this question is by analogy with down() and up()
when used as a cross-task mutex. Another is by mechanically applying
some of current LKMM. Let's start with this second option.

LKMM works mostly with critical sections, but we also discussed ordering
based on the set of events po-after an srcu_read_lock() on the one hand
and the set of events po-before an srcu_read_unlock() on the other.
Starting here, the critical section is the intersection of these two sets.

In the case of srcu_down_read() and srcu_up_read(), as you say, whatever
might be a critical section must span processes. So what if instead of
po, we used (say) xbstar? Then given a set A such that ([Srcu-down] ;
xbstar ; [A]) and a set B such that ([B] ; xbstar ; [Srcu-up]), the
critical section is the intersection of A and B.
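That intersection can be sketched in Python (invented event names; xbstar here is just an arbitrary toy relation standing in for the real one):

```python
# xbstar as a set of ordered pairs over toy events.
xbstar = {("down", "e1"), ("down", "e2"), ("e1", "up"), ("e3", "up")}

# A: events xb-after the srcu_down_read();
# B: events xb-before the srcu_up_read().
A = {b for (a, b) in xbstar if a == "down"}
B = {a for (a, b) in xbstar if b == "up"}

# The candidate critical section is their intersection.
print(A & B)  # {'e1'}: e2 never reaches the up, e3 isn't after the down
```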

One objection to this approach is that a bunch of unrelated events could
end up being defined as part of the critical section. Except that this
happens already anyway in real critical sections in the Linux kernel.

So what about down() and up() when used as cross-task mutexes?
These often do have conceptual critical sections that protect some
combination of resources, but these critical sections might span tasks
and/or workqueue handlers. And any reasonable definition of these
critical sections would be just as likely to pull in unrelated accesses as
the above intersection approach for srcu_down_read() and srcu_up_read().

But I am just now making all this up, so thoughts?

> Indeed, given:
>
> P0(int *x) {
> 	srcu_down_read(x);
> }
>
> P1(int *x) {
> 	srcu_up_read(x);
> }
>
> what are we to make of executions in which P1 executes before P0?

Indeed, there had better be something else forbidding such executions, or
this is an invalid use of srcu_down_read() and srcu_up_read(). This might
become more clear if the example is expanded to include the index returned
from srcu_down_read() that is to be passed to srcu_up_read():

P0(int *x, int *i) {
	WRITE_ONCE(*i, srcu_down_read(x));
}

P1(int *x, int *i) {
	srcu_up_read(x, READ_ONCE(*i));
}

Which it looks like you do in fact have in your patch, so it is time for
me to go try that out.

Thanx, Paul

2023-01-18 05:43:58

by Paul E. McKenney

[permalink] [raw]
Subject: Re: Internal vs. external barriers (was: Re: Interesting LKMM litmus test)

On Tue, Jan 17, 2023 at 09:15:15PM -0500, Alan Stern wrote:
> On Tue, Jan 17, 2023 at 09:43:08AM -0800, Paul E. McKenney wrote:
> > On Tue, Jan 17, 2023 at 10:56:34AM -0500, Alan Stern wrote:
>
> > > Isn't it true that the current code will flag srcu-bad-nesting if a
> > > litmus test has non-nested overlapping SRCU read-side critical sections?
> >
> > Now that you mention it, it does indeed, flagging srcu-bad-nesting.
> >
> > Just to see if I understand, different-values yields true if the set
> > contains multiple elements with the same value mapping to different
> > values. Or, to put it another way, if the relation does not correspond
> > to a function.
> >
> > Or am I still missing something?
> >
> > > And if it is true, is there any need to change the memory model at this
> > > point?
> > >
> > > (And if it's not true, that's most likely due to a bug in herd7.)
> >
> > Agreed, changes must wait for SRCU support in herd7.
>
> Maybe we don't. Please test the patch below; I think it will do what
> you want -- and it doesn't rule out nesting.

It works like a champ on manual/kernel/C-srcu*.litmus in the litmus
repository on github, good show and thank you!!!

I will make more tests, and am checking this against the rest of the
litmus tests in the repo, but in the meantime would you be willing to
have me add your Signed-off-by?

Thanx, Paul

> Alan
>
>
>
> Index: usb-devel/tools/memory-model/linux-kernel.bell
> ===================================================================
> --- usb-devel.orig/tools/memory-model/linux-kernel.bell
> +++ usb-devel/tools/memory-model/linux-kernel.bell
> @@ -57,20 +57,12 @@ flag ~empty Rcu-lock \ domain(rcu-rscs)
> flag ~empty Rcu-unlock \ range(rcu-rscs) as unbalanced-rcu-locking
>
> (* Compute matching pairs of nested Srcu-lock and Srcu-unlock *)
> -let srcu-rscs = let rec
> -	unmatched-locks = Srcu-lock \ domain(matched)
> -	and unmatched-unlocks = Srcu-unlock \ range(matched)
> -	and unmatched = unmatched-locks | unmatched-unlocks
> -	and unmatched-po = ([unmatched] ; po ; [unmatched]) & loc
> -	and unmatched-locks-to-unlocks =
> -		([unmatched-locks] ; po ; [unmatched-unlocks]) & loc
> -	and matched = matched | (unmatched-locks-to-unlocks \
> -		(unmatched-po ; unmatched-po))
> -	in matched
> +let srcu-rscs = ([Srcu-lock] ; data ; [Srcu-unlock]) & loc
>
> (* Validate nesting *)
> flag ~empty Srcu-lock \ domain(srcu-rscs) as unbalanced-srcu-locking
> -flag ~empty Srcu-unlock \ range(srcu-rscs) as unbalanced-srcu-locking
> +flag ~empty Srcu-unlock \ range(srcu-rscs) as unbalanced-srcu-unlocking
> +flag ~empty (srcu-rscs^-1 ; srcu-rscs) \ id as multiple-srcu-unlocks
>
> (* Check for use of synchronize_srcu() inside an RCU critical section *)
> flag ~empty rcu-rscs & (po ; [Sync-srcu] ; po) as invalid-sleep
> @@ -80,11 +72,11 @@ flag ~empty different-values(srcu-rscs)
>
> (* Compute marked and plain memory accesses *)
> let Marked = (~M) | IW | Once | Release | Acquire | domain(rmw) | range(rmw) |
> - LKR | LKW | UL | LF | RL | RU
> + LKR | LKW | UL | LF | RL | RU | Srcu-lock | Srcu-unlock
> let Plain = M \ Marked
>
> (* Redefine dependencies to include those carried through plain accesses *)
> -let carry-dep = (data ; rfi)*
> +let carry-dep = (data ; rfi ; [~Srcu-lock])*
> let addr = carry-dep ; addr
> let ctrl = carry-dep ; ctrl
> let data = carry-dep ; data
> Index: usb-devel/tools/memory-model/linux-kernel.def
> ===================================================================
> --- usb-devel.orig/tools/memory-model/linux-kernel.def
> +++ usb-devel/tools/memory-model/linux-kernel.def
> @@ -49,8 +49,8 @@ synchronize_rcu() { __fence{sync-rcu}; }
> synchronize_rcu_expedited() { __fence{sync-rcu}; }
>
> // SRCU
> -srcu_read_lock(X) __srcu{srcu-lock}(X)
> -srcu_read_unlock(X,Y) { __srcu{srcu-unlock}(X,Y); }
> +srcu_read_lock(X) __load{srcu-lock}(*X)
> +srcu_read_unlock(X,Y) { __store{srcu-unlock}(*X,Y); }
> synchronize_srcu(X) { __srcu{sync-srcu}(X); }
> synchronize_srcu_expedited(X) { __srcu{sync-srcu}(X); }
>
>

2023-01-18 12:12:03

by Jonas Oberhauser

[permalink] [raw]
Subject: Re: Internal vs. external barriers (was: Re: Interesting LKMM litmus test)



On 1/17/2023 10:19 PM, Alan Stern wrote:
> On Tue, Jan 17, 2023 at 06:48:12PM +0100, Jonas Oberhauser wrote:
>> On 1/14/2023 5:42 PM, Alan Stern wrote:
>>>
>>> There is more to it than that, as I mentioned earlier. A complete
>>> description can be found the explanation.txt document; it says: For any
>>> critical section C and any grace period G, at least one of the following
>>> statements must hold: (1) C ends before G does, and in addition, every
>>> store that propagates to C's CPU before the end of C must propagate to
>>> every CPU before G ends. (2) G starts before C does, and in addition,
>>> every store that propagates to G's CPU before the start of G must
>>> propagate to every CPU before C starts.
>>>
>>> On Fri, Jan 13, 2023 at 02:55:34PM +0000, Jonas Oberhauser wrote:
>>>>> IMO it's generally better to think of grace periods as not being
>>>>> instantaneous but as occurring over a prolonged period of time.
>>>>> Thus we should say: If a grace period ends before an unlock
>>>>> occurs, it must start before the corresponding lock. And
>>>>> contrapositively, if a lock occurs before a grace period starts,
>>>>> the corresponding unlock must occur before the grace period
>>>>> ends.
>>>> I started thinking about it like this and comparing start/end times.
>>>> That made it more complicated, but the math came out the same in the
>>>> end. I could imagine that there are some scenarios where the
>>>> intuition of collapsing the grace period to a single event could
>>>> cause problems, but I haven't seen any.
>>>
>>>
>>> IIRC (and it has been a long time), this may be vaguely connected with
>>> the reason why the definitions of gp, rcu-link, and rcu-fence have po
>>> on one side but po? on the other. But I can't remember the details.
>>
>>
>> There's at least some connection. And I think from an operational model
>> perspective, the distinction has some effect.
>>
>> That's because part (1) of the rule you quoted forces propagation before G
> I prefer to say "requires" rather than "forces". "Forces" sounds more
> like you're talking about a hardware mechanism that prevents something
> bad from happening, like the way the cache coherency rules are enforced.
>
>> ends, which allows propagation to G's CPU after the start or before the end.
> After the start or before the end of what? G?

Sorry, I meant after the start *and* before the end of G.

>> Stores propagated in that time period are not forced to propagate by part
>> (2).
>>
>> If the two events in the operational model were merged, then all stores that
>> need to propagate to G's CPU through rule (1) would also need to propagate
>> to other CPUs through part (2).
> Again, I don't know why you say this. In fact, all stores that need to
> propagate to G's CPU through rule (1) are also required to propagate
> to other CPUs through rule (1) -- not rule (2). And this has nothing
> to do with whether the end of G occurs at the same time as the start or
> some time afterward.
Yes, but the time periods are different. Rule (1) only requires
propagation to all CPUs by the time G ends, rule (2) requires
propagation to all CPUs by the time the second critical section starts,
which may be much earlier.

>
>> In particular, if we had an execution with 3 CPUs like below (time from top
>> to bottom, also attached as a text file in case my e-mail client messes up
>> the formatting)
>>
>> CPU1            | CPU2            | CPU3
>> start CS;       |                 |
>> read stage==0   |                 |
>>                 | stage = 1;      |
>>                 |                 |
>>                 | GP {            |
>> x = 1;          |                 |
>>                 |                 | start CS;
>>                 |                 | read x == 0;
>> end CS;         |                 |
>>                 | }               |
>>                 | stage = 2;      |
>>                 |                 | read stage == 2;
>>                 |                 | read x == 1;
>>                 |                 | end CS;
>>
>> then we allow x=1 not to propagate to the third CPU before it reads x.
> I still can't understand what you're saying. Since CPU3 reads x==1, of
> course we require x=1 to propagate to CPU3 before it reads x.

Note that it reads x twice, and the first time it reads 0.
I should have written "before it reads x for the first time" (= after
the second read side critical section starts, but before G ends). The
second read is after G ends, at which point by guarantee (1) the store
has propagated to CPU3.



>
>> But
>> if there was only a single grace period step, which would not overlap with
>> either CS, then this outcome would be forbidden.
>> Because stage=1 didn't propagate to CPU1, the grace period would need to be
>> after CPU1's critical section.
>> Because stage=2 did propagate to CPU3, the grace period would need to be
>> before CPU3's critical section.
>> But then x=1 would need to propagate to CPU3 before the grace period, and
>> thus before its CS starts.
>>
>> I think it's because things can't happen "at the same time" in the
>> operational model.
> That's simply not true. As an example, writes propagate to their own
> CPU at the same time as they execute.

I consider executing a store and it propagating to its own CPU as a
single thing happening.
That's because it is considered so in the operational model.
However, a store propagating to a CPU and that CPU executing a grace
period are likely two independent steps.
(You could formalize it differently, but it is highly non-standard).
It's imho more reasonable to split the grace period into start and end
step, and then use the Grace Period Guarantee as written.

>> Otherwise, x=1 could propagate "at the same time" as it
>> executes the grace period, and then wouldn't be affected by rule (2)
>> anymore.
>>
>> But in the axiomatic model, we can use the po; ... to state that things must
>> happen "strictly before" the start of G (instead of "kinda at the same
>> time"). If there is a po-earlier event that observes the propagation, then
>> the propagation happened before the start of G. If there is no po-earlier
>> event that observes the propagation, then the store may as well have
>> propagated at the same time as G (or "between the start and end"). So having
>> the distinction of start and end of grace periods becomes at least less
>> important.
>>
>> I still haven't wrapped my head fully around why the other side has to be
>> po?.
>>
>> I asked Hernan to run all the old litmus tests with rcu-fence = po ;
>> rcu-order ; po and he reported no difference in those tests either.
>>
>> Now I'm thinking if it can be proven that some of them aren't necessary, or
>> could be simplified.
> Maybe. But why go to the trouble?

Because if one could simplify things, the model becomes more
uniform/cohesive, easier to understand, and easier to reason about. (We
can have a separate discussion about whether the result is simpler :D.)

>> Pretending for simplicity that rscs and grace periods aren't reads&writes
> They aren't. You don't have to pretend.

rscs are reads & writes in herd. That's how the data dependency works in
your patch, right?
I consider that a hack though and don't like it.

>> (and that prop must always start with overwrite&ext, but this can be done
>> wlog if we define the rcu relations with prop? instead of prop).
>>
>> I'm first looking at the rcu-link relation.
>> Any use of rcu-link in rcu-order is preceded by an rscs or gp. Considering
>> the cases where po? is not taken, the first edge of hb*;pb*;prop? can't be
>> any of prop, rfe, or prop&int because the rcu-order never ends in
>> write/reads. This leaves only ppo (or nothing), and we can use ppo <= po
>> (with the patch that's currently lying on my hard disk :D) to get that the
>> complete edge is a subset of
>>
>> (po ; hb*;pb*;prop? | id);po
>>
>> Therefore I think we have rcu-link = (po ; hb*;pb*;prop? ; po) | po
> This does not seem like a simplification to me.

It's simpler to reason about because the number of combinations is
roughly half (instead of multiplying each of hb*, pb*, prop? with the
possibility of not having a po, it just adds one possibility to (po ;
... ;po)). (I was also imprecise, it's not that `rcu-link = (po ;
hb*;pb*;prop? ; po) | po`, but rather that it could be changed to that
definition without changing the model).

>
>> Next, I look at rcu-fence in rb = prop? ; rcu-fence ; hb* ; pb*.
>> An rcu-fence ; hb* ; pb* which doesn't have the po at the end of rcu-fence
>> can not have prop, rfe, or prop&int after the rcu-fence either. This leaves
>> two cases, either the rb comes from prop? ; po ; rcu-order or from prop? ;
>> po ; rcu-order ; ppo ; hb* ; pb*.
>>
>> In the latter case we can use ppo <= po and get back prop? ; po ; rcu-order
>> ; po ; hb* ; pb*, so considering po? here is not necessary.
>>
>> In the former case, we can ask instead if po ; rcu-order ; prop? is
>> irreflexive, and since prop can't follow on rcu-order, this is the same as
>> po ; rcu-order.
>>
>> This can only have an identity edge if at least some of the rcu-links in
>> rcu-order are not just po. So let's look at the last such edge, when
>> abbreviating RCU cs and grace periods as R we get
>>
>>    po ; (R ; rcu-link)* ; R ; po ; hb* ; pb* ; prop? ; (po ; R)+
>>
>> where overall the number of gps >= number of rscs, and this can be rewritten
>> as
>>
>>    prop? ; (po ; R)+ ; po ; (rcu-order ; rcu-link)? ; R ; po ; hb* ; pb*
>>
>> and I believe (po ; R)+ ; po ; (R ; rcu-link)* ; R ; po <= po ; rcu-order ;
>> po (using the fact that overall the number of gps is still >= the number of
>> rscs)
>>
>> so then it simplifies again to
>>
>>    prop? ; po ; rcu-order ; po ; hb* ; pb*
>>
>> and po? is again not necessary.
>>
>> I'm again ignoring srcu here. I don't know if we can still shuffle the
>> gp/rscs around like above when the locations have to match.
> Indeed, before support for SRCU was added to the memory model, it did
> put the po and po? terms in other places. I was forced to move them in
> order to add the tests for matching locations.
>
>> Either way if you can confirm my suspicion that the po? in rcu-fence could
>> be replaced by po, and that the po? in rcu-link could be replaced by (po ;
>> ... ; po) | po, or have a counter example and some additional explanation
>> for why the po? makes sense, I'd be thankful.
>>
>>
>>> There was also something about what should happen when you have two
>>> grace periods in a row.
>> Note that two grace periods in a row are a subset of po;rcu-gp;po and thus
>> gp, and so there's nothing to be done.
> That is not stated carefully, but it probably is wrong. Consider this:
>
> P0                P1                P2
> ---------------   ---------------   ---------------
> rcu_read_lock     Wy=1              rcu_read_lock
> Wx=1              synchronize_rcu   Wz=1
> Ry=0              synchronize_rcu   Rx=0
> rcu_read_unlock   Rz=0              rcu_read_unlock
>
> (W stands for Write and R for Read.) This execution is forbidden by the
> counting rule: Its cycle has two grace periods and two critical
> sections. But if we changed the definition of gp to be
>
> let gp = po ; [Sync-rcu | Sync-srcu] ; po
>
> then the memory model would allow the execution. So having the po? at
> the end of gp is vital.

I hadn't thought yet about the effect of modifying the definition of gp,
but I don't think this example relies on gp at all.
The model would forbid this even if rcu-fence and gp were both changed
from po? to po.
From Rz=0 we know
    second sync() ->rcu-gp;po Rz ->prop Wz ->po P2 unlock() ->rcu-rscsi
    P2 lock()
From Ry=0 we know
    P1 unlock() ->rcu-rscsi P1 lock() ->po Ry ->prop Wy ->po;rcu-gp
    first sync()

which are both rcu-order.
Then from Rx=0 we have
    Rx ->prop Wx ->po P1 unlock() ->rcu-order first sync() ->po second
    sync() ->rcu-order P2 lock() ->po Rx
of course since po is one case of rcu-link, we get
    Rx ->prop Wx ->po P1 unlock() ->rcu-order P2 lock() ->po Rx
and hence
    Rx ->prop Wx ->rcu-fence Rx
which is supposed to be irreflexive (even with rcu-fence=po;rcu-order;po).

Note that if your ordering relies on actually using gp twice in a row,
then these must come from strong-fence, but you should be able to just
take the shortcut by merging them into a single gp.
    po;rcu-gp;po;rcu-gp;po <= gp <= strong-fence <= hb & strong-order


>
>> Something more interesting happens with critical sections, where currently
>> po ; rcu-rcsci ; po ; rcu-rcsci ; po should be a subset of po ; rcu-rcsci ;
>> po? because of the forbidden partial overlap. But I currently don't think
>> it's necessary to consider such cases.
>>
>> The other thing that causes complications is when all the pb*,hb*,and prop
>> links in rcu-link are just id, and then rcu-link becomes po?;po = po.
>> Currently I don't understand why such pure po links should be necessary at
>> all, since they should just merge with the neighboring rcu-gps into a gp
>> edge.
>>
>>>> The only way I'd count rcu-link' as adding a case is if you say that
>>>> the (...)* has two cases :D (or infinitely many :D) I don't count
>>>> the existence of the definition because you could always inline it
>>>> (but lose a lot of clarity imho).
>>>
>>>
>>> If you did inline it, you'd probably find that the end result was
>>> exactly what is currently in the LKMM.
>> Not quite. There are two differences. The first is that the
>> rcu-order;rcu-link;rcu-order case disappears.
>>
>> The second is that the ...;rcu-link;... and
>> ...;rcu-link;rcu-order;rcu-link;... subcases get merged, and not to
>> ...;rcu-link;(rcu-order;rcu-link)?;... but to
>> ...;rcu-link;(rcu-order;rcu-link)*;...
> Okay.
>
>> Indeed the definitions of rcu-extend and rcu-order can't become exactly the
>> same because they are different relations, e.g., rcu-order can begin and end
>> with a grace period but rcu-extend can't.
>>
>> That's why an additional non-recursive layer of
>>
>>    rcu-order = rcu-extend ; (rcu-link ; rcu-extend)*
>>
>> is necessary to define rcu-order in terms of rcu-extend. But as I mentioned
>> I don't think rcu-order is necessary at all to define LKMM, and one can
>> probably just use rcu-extend instead of rcu-order (and in fact even a
>> version of rcu-extend without any lone rcu-gps).
> Sure, you could do that, but it wouldn't make sense. Why would anyone
> want to define an RCU ordering relation that includes
>
> gp ... rscs ... gp ... rscs
>
> but not
>
> gp ... rscs ... rscs ... gp
>
> ?

Because the RCU Grace Period Guarantee doesn't say "if a gp happens
before a gp, with some rscs in between, ...".
So I think even that picture is not the best one to draw for RCU
ordering. I think the right picture to draw for RCU ordering is
something like this:

case (1): C ends before G does:

rscs ... ... ... ... ... gp

case (2): G ends before C does:

gp ... ... ... ... ... rscs

where the dots are some relation that means "happens before".


and then the natural formalization of the propagation requirement

every store that propagates to (the CPU executing whichever of G resp. C happens first) before (G resp. C happens) propagates to every CPU before any instruction po-after (C resp. G, i.e., the other one) executes

is something like

prop ; po ; (case (1) or case (2)) ; po creates a kind of happens-before (rb)

which is absolutely the same as the formalization of the pb propagation requirement

every store that propagates to a CPU beginning a strong ordering operation before that CPU begins the strong ordering operation propagates to every CPU before any instruction po-after the strong ordering operation executes

which is
prop ; po ; (thing providing ordering); po creates a kind of happens-before (pb)

I think this makes a lot more sense than to say "here's the grace period guarantee that talks about ordering gps with rscs, and here's the formalization that says the count of gps is >= the number of rscs in some technical counting relation, but the relation between the two is too complicated to explain here".

The one remaining snag I currently see is that the current formalization of pb and rb doesn't allow using these ordering guarantees multiple times. I would prefer something like
pb = prop ; strong-order
acyclic (pb | hb) as propagation
rb = prop ; new-rcu-order
acyclic (rb | pb | hb) as rcu

And then if you do have a case like
... gp ... rscs ... rscs ... gp ...

you would essentially get two rb edges.
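As a toy illustration of what "acyclic (rb | pb | hb)" would check: the union of those relations is just a finite directed graph, and the check is plain cycle detection. (This is a hypothetical Python sketch, not herd7's semantics; all event names are made up.)

```python
def acyclic(edges):
    """Return True if the directed graph (a set of (src, dst) pairs)
    contains no cycle, via iterative depth-first search."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, []).append(b)
        graph.setdefault(b, [])
    WHITE, GRAY, BLACK = 0, 1, 2
    color = {n: WHITE for n in graph}
    for start in graph:
        if color[start] != WHITE:
            continue
        stack = [(start, iter(graph[start]))]
        color[start] = GRAY
        while stack:
            node, it = stack[-1]
            nxt = next(it, None)
            if nxt is None:
                color[node] = BLACK
                stack.pop()
            elif color[nxt] == GRAY:
                return False          # back edge: cycle found
            elif color[nxt] == WHITE:
                color[nxt] = GRAY
                stack.append((nxt, iter(graph[nxt])))
    return True

# Hypothetical edges: one hb edge plus two rb edges (as from the
# "gp ... rscs ... rscs ... gp" shape), bent into a cycle.
hb = {("x", "y")}
rb = {("y", "z"), ("z", "x")}
forbidden = not acyclic(hb | rb)   # such an execution would be flagged
```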




>
>>>> the law is, well, um, "primarily empirical in nature"
>>>
>>>
>>> Actually it isn't, not any more. That quote was written before we
>>> formalized RCU in the LKMM.
>> I meant that the original formulation was empirical; of course you have
>> formalized it, but how do you know that the formalization is valid?
> We proved it in the ASPLOS paper. That is, we proved that a particular
> implementation faithfully obeys the restrictions of the formalization.
>
>> I think
>> the correspondence with "what's intended" is always an empirical thing, even
>> if you formally prove the correctness of the implementation against the
>> specification you might have missed some parts or added some parts that are
>> actually just implementation details.
> While I agree that it is difficult to be sure that an informal
> specification agrees with a formal model, I wouldn't describe attempts
> to ensure this as "empirical".

I think there are two parts: the formal proofs you do, which are of course
not empirical, and then the question whether the theorem you have proven is
the one you should be proving. And that's something that will become
more or less clear as 1) you try to use the theorem you have proven to
prove other theorems, and 2) you change the implementation details
and see whether the specification is stable under those changes or not.
I consider that empirical since it's based on experience that comes
after the fact, not something you can usually prove in advance.


Have fun,
jonas

2023-01-18 16:25:19

by Alan Stern

[permalink] [raw]
Subject: Re: Internal vs. external barriers (was: Re: Interesting LKMM litmus test)

On Tue, Jan 17, 2023 at 09:17:04PM -0800, Paul E. McKenney wrote:
> On Tue, Jan 17, 2023 at 09:15:15PM -0500, Alan Stern wrote:
> > Maybe we don't. Please test the patch below; I think it will do what
> > you want -- and it doesn't rule out nesting.
>
> It works like a champ on manual/kernel/C-srcu*.litmus in the litmus
> repository on github, good show and thank you!!!
>
> I will make more tests, and am checking this against the rest of the
> litmus tests in the repo, but in the meantime would you be willing to
> have me add your Signed-off-by?

I'll email a real patch submission in the not-too-distant future,
assuming you don't find any problems with the new code.

Alan

2023-01-18 17:45:23

by Alan Stern

[permalink] [raw]
Subject: Re: Internal vs. external barriers (was: Re: Interesting LKMM litmus test)

On Wed, Jan 18, 2023 at 08:59:55AM -0800, Boqun Feng wrote:
> On Wed, Jan 18, 2023 at 11:03:35AM -0500, Alan Stern wrote:
> > On Tue, Jan 17, 2023 at 09:17:04PM -0800, Paul E. McKenney wrote:
> > > On Tue, Jan 17, 2023 at 09:15:15PM -0500, Alan Stern wrote:
> > > > Maybe we don't. Please test the patch below; I think it will do what
> > > > you want -- and it doesn't rule out nesting.
> > >
> > > It works like a champ on manual/kernel/C-srcu*.litmus in the litmus
> > > repository on github, good show and thank you!!!
> > >
> > > I will make more tests, and am checking this against the rest of the
> > > litmus tests in the repo, but in the meantime would you be willing to
> > > have me add your Signed-off-by?
> >
> > I'll email a real patch submission in the not-too-distant future,
> > assuming you don't find any problems with the new code.
>
> I haven't tested the following, but I think we also need it to avoid
> (although rare) mixing srcu_struct with normal memory access?
>
> Since you are working on a patch, I think I better mention this ;-)
>
> Regards,
> Boqun
>
> diff --git a/tools/memory-model/lock.cat b/tools/memory-model/lock.cat
> index 6b52f365d73a..c134c2027224 100644
> --- a/tools/memory-model/lock.cat
> +++ b/tools/memory-model/lock.cat
> @@ -37,7 +37,7 @@ let RU = try RU with emptyset
> let LF = LF | RL
>
> (* There should be no ordinary R or W accesses to spinlocks *)
> -let ALL-LOCKS = LKR | LKW | UL | LF | RU
> +let ALL-LOCKS = LKR | LKW | UL | LF | RU | Srcu-lock | Srcu-unlock
> flag ~empty [M \ IW] ; loc ; [ALL-LOCKS] as mixed-lock-accesses
>
> (* Link Lock-Reads to their RMW-partner Lock-Writes *)

Great point! I'll add this to the patch, thanks.

Alan

2023-01-18 18:00:37

by Paul E. McKenney

[permalink] [raw]
Subject: Re: Internal vs. external barriers (was: Re: Interesting LKMM litmus test)

On Wed, Jan 18, 2023 at 11:03:35AM -0500, Alan Stern wrote:
> On Tue, Jan 17, 2023 at 09:17:04PM -0800, Paul E. McKenney wrote:
> > On Tue, Jan 17, 2023 at 09:15:15PM -0500, Alan Stern wrote:
> > > Maybe we don't. Please test the patch below; I think it will do what
> > > you want -- and it doesn't rule out nesting.
> >
> > It works like a champ on manual/kernel/C-srcu*.litmus in the litmus
> > repository on github, good show and thank you!!!
> >
> > I will make more tests, and am checking this against the rest of the
> > litmus tests in the repo, but in the meantime would you be willing to
> > have me add your Signed-off-by?
>
> I'll email a real patch submission in the not-too-distant future,
> assuming you don't find any problems with the new code.

Sounds good!

The current state is that last night's testing found a difference only
for C-srcu-nest-5.litmus, in which case your version gives the correct
answer and mainline is wrong. There were a couple of broken tests, which
I fixed, and a test involving spin_unlock_wait(), which is at this point
perma-broken due to the Linux kernel no longer having such a thing.
(Other than its re-introduction into i915, but they define it as a
spin_lock_irq() followed by a spin_unlock_irq(), so why worry?)
There were also a few timeouts.

I intend to run the longer tests overnight.

I have not yet come up with a good heuristic to auto-classify
automatically generated tests involving SRCU, so I cannot justify making
you wait on me to get my act together on that.

Thanx, Paul

2023-01-18 18:14:22

by Alan Stern

[permalink] [raw]
Subject: Re: Internal vs. external barriers (was: Re: Interesting LKMM litmus test)

On Tue, Jan 17, 2023 at 07:50:41PM -0800, Paul E. McKenney wrote:
> On Tue, Jan 17, 2023 at 03:15:06PM -0500, Alan Stern wrote:
> > On Tue, Jan 17, 2023 at 09:43:08AM -0800, Paul E. McKenney wrote:
> > > On Tue, Jan 17, 2023 at 10:56:34AM -0500, Alan Stern wrote:
> > > > Isn't it true that the current code will flag srcu-bad-nesting if a
> > > > litmus test has non-nested overlapping SRCU read-side critical sections?
> > >
> > > Now that you mention it, it does indeed, flagging srcu-bad-nesting.
> > >
> > > Just to see if I understand, different-values yields true if the set
> > > contains multiple elements with the same value mapping to different
> > > values. Or, to put it another way, if the relation does not correspond
> > > to a function.
> >
> > As I understand it, given a relation r (i.e., a set of pairs of events),
> > different-values(r) returns the sub-relation consisting of those pairs
> > in r for which the value associated with the first event of the pair is
> > different from the value associated with the second event of the pair.
>
> OK, so different-values(r) is different than (r \ id) because the
> former operates on values and the latter on events?

No. Both of these things are relations, not values or events.

Suppose you had:

A: WRITE_ONCE(x, 1);
B: WRITE_ONCE(y, 1);
C: WRITE_ONCE(z, 2);

Then the po relation would consist of the pairs (A,B), (A,C), and (B,C).

The different-values(po) relation would include only (A,C) and (B,C).
It would not include (A,B) because the two events in that pair have the
same value: 1.

And finally, (po \ id) would be the same as po, because the id relation
consists of the pairs (A,A), (B,B), and (C,C) -- and none of those are
in po to begin with, so removing them from po doesn't do anything.
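The three-event example can be checked mechanically by modeling relations as finite sets of pairs (a toy Python model, not herd7 itself):

```python
# Events from the example above: label -> value written
value = {"A": 1, "B": 1, "C": 2}

# po: every event paired with every later event in program order
order = ["A", "B", "C"]
po = {(a, b) for i, a in enumerate(order) for b in order[i + 1:]}

def different_values(r):
    """Keep only the pairs whose two events carry different values."""
    return {(a, b) for (a, b) in r if value[a] != value[b]}

id_rel = {(e, e) for e in value}
po_minus_id = po - id_rel   # unchanged: po never contains (X, X) pairs
```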

> > Right now the behavior is kind of strange. The following simple litmus
> > test:
> >
> > C test
> > {}
> > P0(int *x)
> > {
> > int r1;
> > r1 = srcu_read_lock(x);
> > srcu_read_unlock(x, r1);
> > }
> > exists (~0:r1=0)
> >
> > produces the following output from herd7:
> >
> > Test test Allowed
> > States 1
> > 0:r1=906;
> > Ok
> > Witnesses
> > Positive: 1 Negative: 0
> > Condition exists (not (0:r1=0))
> > Observation test Always 1 0
> > Time test 0.01
> > Hash=2f42c87ae9c1d267f4e80c66f646b9bb
> >
> > Don't ask me where that 906 value comes from or why it isn't 0. Also,
> > herd7's graphical output shows there is no data dependency from the lock
> > to the unlock, but we need to have one.
>
> Is it still the case that any herd7 value greater than 127 is special?

I have no idea.

> > > Given an Srcu-down and an Srcu-up:
> > >
> > > let srcu-rscs = ( return_value(Srcu-lock) ; (dep | rfi)* ;
> > > parameter(Srcu-unlock, 2) ) |
> > > ( return_value(Srcu-down) ; (dep | rf)* ;
> > > parameter(Srcu-up, 2) )
> > >
> > > Seem reasonable, or am I missing yet something else?
> >
> > Not at all reasonable.
> >
> > For one thing, consider this question: Which statements lie inside a
> > read-side critical section?
>
> Here srcu_down_read() and srcu_up_read() are to srcu_read_lock() and
> srcu_read_unlock() as down_read() and up_read() are to mutex_lock()
> and mutex_unlock(). Not that this should be all that much comfort
> given that I have no idea how one would go about modeling down_read()
> and up_read() in LKMM.

It might make sense to work on that first, before trying to do
srcu_down_read() and srcu_up_read().

> > With srcu_read_lock() and a matching srcu_read_unlock(), the answer is
> > clear: All statements po-between the two. With srcu_down_read() and
> > srcu_up_read(), the answer is cloudy in the extreme.
>
> And I agree that it must be clearly specified, and that my previous try
> was completely lacking. Here is a second attempt:
>
> let srcu-rscs = (([Srcu-lock] ; data ; [Srcu-unlock]) & loc) |
> (([Srcu-down] ; (data | rf)* ; [Srcu-up]) & loc)
>
> (And I see your proposal and will try it.)
>
> > Also, bear in mind that the Fundamental Law of RCU is formulated in
> > terms of stores propagating to a critical section's CPU. What are we to
> > make of this when a single critical section can belong to more than one
> > CPU?
>
> One way of answering this question is by analogy with down() and up()
> when used as a cross-task mutex. Another is by mechanically applying
> some of current LKMM. Let's start with this second option.
>
> LKMM works mostly with critical sections, but we also discussed ordering
> based on the set of events po-after an srcu_read_lock() on the one hand
> and the set of events po-before an srcu_read_unlock() on the other.
> Starting here, the critical section is the intersection of these two sets.
>
> In the case of srcu_down_read() and srcu_up_read(), as you say, whatever
> might be a critical section must span processes. So what if instead of
> po, we used (say) xbstar? Then given a set A such that ([Srcu-down] ;
> xbstar ; A) and B such that (B ; xbstar ; [Srcu-up]), then the critical
> section is the intersection of A and B.
>
> One objection to this approach is that a bunch of unrelated events could
> end up being defined as part of the critical section. Except that this
> happens already anyway in real critical sections in the Linux kernel.
>
> So what about down() and up() when used as cross-task mutexes?
> These often do have conceptual critical sections that protect some
> combination of resource, but these critical sections might span tasks
> and/or workqueue handlers. And any reasonable definition of these
> critical sections would be just as likely to pull in unrelated accesses as
> the above intersection approach for srcu_down_read() and srcu_up_read().
>
> But I am just now making all this up, so thoughts?

Maybe we don't really need to talk about read-side critical sections at
all. Once again, here's what explanation.txt currently says:

For any critical section C and any grace period G, at least
one of the following statements must hold:

(1) C ends before G does, and in addition, every store that
propagates to C's CPU before the end of C must propagate to
every CPU before G ends.

(2) G starts before C does, and in addition, every store that
propagates to G's CPU before the start of G must propagate
to every CPU before C starts.

Suppose we change this to:

For any RCU lock operation L and matching unlock operation U,
and any matching grace period G, at least one of the following
statements must hold:

(1) U executes before G ends, and in addition, every store that
propagates to U's CPU before U executes must propagate to
every CPU before G ends.

(2) G starts before L executes, and in addition, every store that
propagates to G's CPU before the start of G must propagate
to every CPU before L executes.

(For SRCU, G matches L and U if it operates on the same srcu structure.)

This can be applied sensibly to regular RCU, regular SRCU, and the
up/down version of SRCU. Maybe it's what we want.

> > Indeed, given:
> >
> > P0(int *x) {
> > srcu_down_read(x);
> > }
> >
> > P1(int *x) {
> > srcu_up_read(x);
> > }
> >
> > what are we to make of executions in which P1 executes before P0?
>
> Indeed, there had better be something else forbidding such executions, or
> this is an invalid use of srcu_down_read() and srcu_up_read(). This might
> become more clear if the example is expanded to include the index returned
> from srcu_down_read() that is to be passed to srcu_up_read():
>
> P0(int *x, int *i) {
> WRITE_ONCE(i, srcu_down_read(x));
> }
>
> P1(int *x, int *i) {
> srcu_up_read(x, READ_ONCE(i));
> }

Hmmm. What happens if you write:

r1 = srcu_down_read(x);
r2 = srcu_down_read(x);
srcu_up_read(x, r1);
srcu_up_read(x, r2);

? I can't even tell what that would be _intended_ to do.

In fact, it seems likely that to make this work, you have to store at
least two values in *x: the value of the up/down counter, and the value
returned by srcu_down_read or stored by srcu_up_read. That means you
can't describe what's happening without using a structure, and herd7
doesn't support structures.

> Which it looks like you in fact have in your patch, so time for me
> to go try that out.
>
> Thanx, Paul

Alan

2023-01-18 18:23:12

by Boqun Feng

[permalink] [raw]
Subject: Re: Internal vs. external barriers (was: Re: Interesting LKMM litmus test)

On Wed, Jan 18, 2023 at 11:03:35AM -0500, Alan Stern wrote:
> On Tue, Jan 17, 2023 at 09:17:04PM -0800, Paul E. McKenney wrote:
> > On Tue, Jan 17, 2023 at 09:15:15PM -0500, Alan Stern wrote:
> > > Maybe we don't. Please test the patch below; I think it will do what
> > > you want -- and it doesn't rule out nesting.
> >
> > It works like a champ on manual/kernel/C-srcu*.litmus in the litmus
> > repository on github, good show and thank you!!!
> >
> > I will make more tests, and am checking this against the rest of the
> > litmus tests in the repo, but in the meantime would you be willing to
> > have me add your Signed-off-by?
>
> I'll email a real patch submission in the not-too-distant future,
> assuming you don't find any problems with the new code.

I haven't tested the following, but I think we also need it to avoid
(although rare) mixing srcu_struct with normal memory access?

Since you are working on a patch, I think I better mention this ;-)

Regards,
Boqun

diff --git a/tools/memory-model/lock.cat b/tools/memory-model/lock.cat
index 6b52f365d73a..c134c2027224 100644
--- a/tools/memory-model/lock.cat
+++ b/tools/memory-model/lock.cat
@@ -37,7 +37,7 @@ let RU = try RU with emptyset
let LF = LF | RL

(* There should be no ordinary R or W accesses to spinlocks *)
-let ALL-LOCKS = LKR | LKW | UL | LF | RU
+let ALL-LOCKS = LKR | LKW | UL | LF | RU | Srcu-lock | Srcu-unlock
flag ~empty [M \ IW] ; loc ; [ALL-LOCKS] as mixed-lock-accesses

(* Link Lock-Reads to their RMW-partner Lock-Writes *)

>
> Alan

2023-01-18 19:58:43

by Jonas Oberhauser

[permalink] [raw]
Subject: Re: Internal vs. external barriers (was: Re: Interesting LKMM litmus test)



On 1/18/2023 5:50 PM, Alan Stern wrote:
> On Tue, Jan 17, 2023 at 07:50:41PM -0800, Paul E. McKenney wrote:
>> On Tue, Jan 17, 2023 at 03:15:06PM -0500, Alan Stern wrote:
>>> On Tue, Jan 17, 2023 at 09
>>>> Given an Srcu-down and an Srcu-up:
>>>>
>>>> let srcu-rscs = ( return_value(Srcu-lock) ; (dep | rfi)* ;
>>>> parameter(Srcu-unlock, 2) ) |
>>>> ( return_value(Srcu-down) ; (dep | rf)* ;
>>>> parameter(Srcu-up, 2) )
>>>>
>>>> Seem reasonable, or am I missing yet something else?
>>> Not at all reasonable.
>>>
>>> For one thing, consider this question: Which statements lie inside a
>>> read-side critical section?
>> Here srcu_down_read() and srcu_up_read() are to srcu_read_lock() and
>> srcu_read_unlock() as down_read() and up_read() are to mutex_lock()
>> and mutex_unlock(). Not that this should be all that much comfort
>> given that I have no idea how one would go about modeling down_read()
>> and up_read() in LKMM.
> It might make sense to work on that first, before trying to do
> srcu_down_read() and srcu_up_read().
>
>>> With srcu_read_lock() and a matching srcu_read_unlock(), the answer is
>>> clear: All statements po-between the two. With srcu_down_read() and
>>> srcu_up_read(), the answer is cloudy in the extreme.
>> And I agree that it must be clearly specified, and that my previous try
>> was completely lacking. Here is a second attempt:
>>
>> let srcu-rscs = (([Srcu-lock] ; data ; [Srcu-unlock]) & loc) |
>> (([Srcu-down] ; (data | rf)* ; [Srcu-up]) & loc)
>>
>> (And I see your proposal and will try it.)
>>
>>> Also, bear in mind that the Fundamental Law of RCU is formulated in
>>> terms of stores propagating to a critical section's CPU. What are we to
>>> make of this when a single critical section can belong to more than one
>>> CPU?
>> One way of answering this question is by analogy with down() and up()
>> when used as a cross-task mutex. Another is by mechanically applying
>> some of current LKMM. Let's start with this second option.
>>
>> LKMM works mostly with critical sections, but we also discussed ordering
>> based on the set of events po-after an srcu_read_lock() on the one hand
>> and the set of events po-before an srcu_read_unlock() on the other.
>> Starting here, the critical section is the intersection of these two sets.
>>
>> In the case of srcu_down_read() and srcu_up_read(), as you say, whatever
>> might be a critical section must span processes. So what if instead of
>> po, we used (say) xbstar? Then given a set A such that ([Srcu-down] ;
>> xbstar ; A) and B such that (B ; xbstar ; [Srcu-up]), then the critical
>> section is the intersection of A and B.
>>
>> One objection to this approach is that a bunch of unrelated events could
>> end up being defined as part of the critical section. Except that this
>> happens already anyway in real critical sections in the Linux kernel.
>>
>> So what about down() and up() when used as cross-task mutexes?
>> These often do have conceptual critical sections that protect some
>> combination of resource, but these critical sections might span tasks
>> and/or workqueue handlers. And any reasonable definition of these
>> critical sections would be just as likely to pull in unrelated accesses as
>> the above intersection approach for srcu_down_read() and srcu_up_read().
>>
>> But I am just now making all this up, so thoughts?
> Maybe we don't really need to talk about read-side critical sections at
> all. Once again, here's what explanation.txt currently says:
>
> For any critical section C and any grace period G, at least
> one of the following statements must hold:
>
> (1) C ends before G does, and in addition, every store that
> propagates to C's CPU before the end of C must propagate to
> every CPU before G ends.
>
> (2) G starts before C does, and in addition, every store that
> propagates to G's CPU before the start of G must propagate
> to every CPU before C starts.
>
> Suppose we change this to:
>
> For any RCU lock operation L and matching unlock operation U,
> and any matching grace period G, at least one of the following
> statements must hold:
>
> (1) U executes before G ends, and in addition, every store that
> propagates to U's CPU before U executes must propagate to
> every CPU before G ends.
>
> (2) G starts before L executes, and in addition, every store that
> propagates to G's CPU before the start of G must propagate
> to every CPU before L executes.
>
> (For SRCU, G matches L and U if it operates on the same srcu structure.)

I think for the formalization, the definition of "critical section" is
hidden inside the word "matching" here.
You will still need to define what matching means for up and down.
Can I understand down and up to create a large read-side critical
section that is shared between multiple threads, analogously to a
semaphore? With the restriction that for srcu, there are really multiple
(two) such critical sections that can be open in parallel, which are
indexed by the return value of down/the input of up?

If so I suspect that every down matches with every up within a "critical
section"?
Maybe you can define balancing along co analogous to the balancing
along po currently used for matching rcu_lock() and rcu_unlock(). I.e.,

down ------------- up
  \down--------up/
     \down-up/
        \_/
where diagonal links are co links and the straight links are "balanced
match" links.

Then everything that is enclosed within a pair of "balanced match" is
linked:

match-down-up = co^-1?; balanced-srcu-updown ; co^-1?

Since multiple critical sections can be in-flight, maybe you can use co
& same-value (or whatever the relation is) to define this?


let balanced-srcu-updown = let rec
        unmatched-locks = Srcu-down \ domain(matched)
    and unmatched-unlocks = Srcu-up \ range(matched)
    and unmatched = unmatched-locks | unmatched-unlocks
    and unmatched-co = [unmatched] ; co & same-value ; [unmatched]
    and unmatched-locks-to-unlocks =
        [unmatched-locks] ; co & same-value ; [unmatched-unlocks]
    and matched = matched | (unmatched-locks-to-unlocks \
        (unmatched-co ; unmatched-co))
    in matched
let match-down-up = (co & same-value)^-1? ; balanced-srcu-updown ;
    (co & same-value)^-1?
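That recursive definition can be prototyped as a least-fixpoint computation over finite sets (a Python sketch of the idea, with made-up event names; cat's let rec semantics may differ in details):

```python
def balanced_updown(downs, ups, co):
    """Least fixpoint of the matching above: a down matches an up when
    no *unmatched* event lies co-between them.  co is a set of
    (earlier, later) pairs, assumed transitive and same-value."""
    matched = set()
    while True:
        um_downs = downs - {d for (d, _) in matched}
        um_ups = ups - {u for (_, u) in matched}
        unmatched = um_downs | um_ups
        um_co = {(a, b) for (a, b) in co
                 if a in unmatched and b in unmatched}
        candidates = {(d, u) for (d, u) in co
                      if d in um_downs and u in um_ups}
        # unmatched-co ; unmatched-co: an unmatched event in between
        two_step = {(a, c) for (a, b1) in um_co
                    for (b2, c) in um_co if b1 == b2}
        new = matched | (candidates - two_step)
        if new == matched:
            return matched
        matched = new

# The nested diagram above: d1 d2 d3 u3 u2 u1 in co order.
seq = ["d1", "d2", "d3", "u3", "u2", "u1"]
co = {(a, b) for i, a in enumerate(seq) for b in seq[i + 1:]}
matched = balanced_updown({"d1", "d2", "d3"}, {"u1", "u2", "u3"}, co)
# matched pairs each down with its balanced up, innermost first
```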



>
>>> Indeed, given:
>>>
>>> P0(int *x) {
>>> srcu_down_read(x);
>>> }
>>>
>>> P1(int *x) {
>>> srcu_up_read(x);
>>> }
>>>
>>> what are we to make of executions in which P1 executes before P0?
>> Indeed, there had better be something else forbidding such executions, or
>> this is an invalid use of srcu_down_read() and srcu_up_read().

Would it be sufficient to flag executions in which an up is not matched
with any down event?

>> This might
>> become more clear if the example is expanded to include the index returned
>> from srcu_down_read() that is to be passed to srcu_up_read():
>>
>> P0(int *x, int *i) {
>> WRITE_ONCE(i, srcu_down_read(x));
>> }
>>
>> P1(int *x, int *i) {
>> srcu_up_read(x, READ_ONCE(i));
>> }
> Hmmm. What happens if you write:
>
> r1 = srcu_down_read(x);
> r2 = srcu_down_read(x);
> srcu_up_read(x, r1);
> srcu_up_read(x, r2);
>
> ? I can't even tell what that would be _intended_ to do.

Is it correct that it creates one or two read-side critical sections
depending on whether the two down() happen to return the same value,
which either spans at least all four lines (plus perhaps more if other
threads also do down()) or the first spans lines 1-3 and the second
spans 2-4?

Is the implementation of srcu-lock and srcu-unlock still anything like
the implementation in the 2009 paper?

best wishes and thanks for your patient explanations, jonas

2023-01-18 20:22:55

by Paul E. McKenney

[permalink] [raw]
Subject: Re: Internal vs. external barriers (was: Re: Interesting LKMM litmus test)

On Wed, Jan 18, 2023 at 11:50:24AM -0500, Alan Stern wrote:
> On Tue, Jan 17, 2023 at 07:50:41PM -0800, Paul E. McKenney wrote:
> > On Tue, Jan 17, 2023 at 03:15:06PM -0500, Alan Stern wrote:
> > > On Tue, Jan 17, 2023 at 09:43:08AM -0800, Paul E. McKenney wrote:
> > > > On Tue, Jan 17, 2023 at 10:56:34AM -0500, Alan Stern wrote:
> > > > > Isn't it true that the current code will flag srcu-bad-nesting if a
> > > > > litmus test has non-nested overlapping SRCU read-side critical sections?
> > > >
> > > > Now that you mention it, it does indeed, flagging srcu-bad-nesting.
> > > >
> > > > Just to see if I understand, different-values yields true if the set
> > > > contains multiple elements with the same value mapping to different
> > > > values. Or, to put it another way, if the relation does not correspond
> > > > to a function.
> > >
> > > As I understand it, given a relation r (i.e., a set of pairs of events),
> > > different-values(r) returns the sub-relation consisting of those pairs
> > > in r for which the value associated with the first event of the pair is
> > > different from the value associated with the second event of the pair.
> >
> > OK, so different-values(r) is different than (r \ id) because the
> > former operates on values and the latter on events?
>
> No. Both of these things are relations, not values or events.
>
> Suppose you had:
>
> A: WRITE_ONCE(x, 1);
> B: WRITE_ONCE(y, 1);
> C: WRITE_ONCE(z, 2);
>
> Then the po relation would consist of the pairs (A,B), (A,C), and (B,C).
>
> The different-values(po) relation would include only (A,C) and (B,C).
> It would not include (A,B) because the two events in that pair have the
> same value: 1.
>
> And finally, (po \ id) would be the same as po, because the id relation
> consists of the pairs (A,A), (B,B), and (C,C) -- and none of those are
> in po to begin with, so removing them from po doesn't do anything.

Thank you for the much-needed tutorial!

The different values are in the domain, not the range, then. Good.

> > > Right now the behavior is kind of strange. The following simple litmus
> > > test:
> > >
> > > C test
> > > {}
> > > P0(int *x)
> > > {
> > > int r1;
> > > r1 = srcu_read_lock(x);
> > > srcu_read_unlock(x, r1);
> > > }
> > > exists (~0:r1=0)
> > >
> > > produces the following output from herd7:
> > >
> > > Test test Allowed
> > > States 1
> > > 0:r1=906;
> > > Ok
> > > Witnesses
> > > Positive: 1 Negative: 0
> > > Condition exists (not (0:r1=0))
> > > Observation test Always 1 0
> > > Time test 0.01
> > > Hash=2f42c87ae9c1d267f4e80c66f646b9bb
> > >
> > > Don't ask me where that 906 value comes from or why it isn't 0. Also,
> > > herd7's graphical output shows there is no data dependency from the lock
> > > to the unlock, but we need to have one.
> >
> > Is it still the case that any herd7 value greater than 127 is special?
>
> I have no idea.

Boqun mentioned off-list this morning that this is still the case,
and that each execution of srcu_read_lock() will return a unique value.
Assuming that I understood him correctly, anyway.

> > > > Given an Srcu-down and an Srcu-up:
> > > >
> > > > let srcu-rscs = ( return_value(Srcu-lock) ; (dep | rfi)* ;
> > > > parameter(Srcu-unlock, 2) ) |
> > > > ( return_value(Srcu-down) ; (dep | rf)* ;
> > > > parameter(Srcu-up, 2) )
> > > >
> > > > Seem reasonable, or am I missing yet something else?
> > >
> > > Not at all reasonable.
> > >
> > > For one thing, consider this question: Which statements lie inside a
> > > read-side critical section?
> >
> > Here srcu_down_read() and srcu_up_read() are to srcu_read_lock() and
> > srcu_read_unlock() as down_read() and up_read() are to mutex_lock()
> > and mutex_unlock(). Not that this should be all that much comfort
> > given that I have no idea how one would go about modeling down_read()
> > and up_read() in LKMM.
>
> It might make sense to work on that first, before trying to do
> srcu_down_read() and srcu_up_read().

The thing is that it is easy to associate an srcu_down_read() with the
corresponding srcu_up_read(). With down() and up(), although in the
Linux kernel this might be represented by a data structure tracking
(say) an I/O request, LKMM is going to be hard pressed to figure that out.

If I am not too confused, the bell code would look something like this
(NOT FOR MAINLINE!):

------------------------------------------------------------------------

(* Compute matching pairs of nested Srcu-lock and Srcu-unlock *)
let srcu-rscs = ([Srcu-lock] ; (data | rf)* ; [Srcu-unlock]) & loc

(* Validate nesting *)
empty Srcu-lock \ domain(srcu-rscs) as mismatched-srcu-locking
empty Srcu-unlock \ range(srcu-rscs) as mismatched-srcu-unlocking
flag ~empty (srcu-rscs^-1 ; srcu-rscs) \ id as multiple-srcu-unlocks

(* Check for use of synchronize_srcu() inside an RCU critical section *)
flag ~empty rcu-rscs & (po ; [Sync-srcu] ; po) as invalid-sleep

(* Validate SRCU dynamic match *)
flag ~empty different-values(srcu-rscs) as srcu-bad-nesting

------------------------------------------------------------------------

A for-mainline version would use Srcu-down and Srcu-up rather than
hijacking the current Srcu-lock and Srcu-unlock. Which seems to require
herd7 changes, but not unless/until we have agreement that this is a
reasonable thing to do.
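The three flag checks in that bell fragment amount to simple conditions on a finite relation; here is a hypothetical Python mock-up (event names invented, herd7 not involved):

```python
def srcu_flags(locks, unlocks, srcu_rscs):
    """Mimic the bell-file checks: every lock matched, every unlock
    matched, and no lock paired with two different unlocks."""
    flags = []
    if locks - {l for (l, _) in srcu_rscs}:
        flags.append("mismatched-srcu-locking")
    if unlocks - {u for (_, u) in srcu_rscs}:
        flags.append("mismatched-srcu-unlocking")
    # (srcu-rscs^-1 ; srcu-rscs) \ id: two unlocks sharing one lock
    shared = {(u1, u2) for (l1, u1) in srcu_rscs
              for (l2, u2) in srcu_rscs if l1 == l2 and u1 != u2}
    if shared:
        flags.append("multiple-srcu-unlocks")
    return flags

# A well-formed execution, and one where a lock is unlocked twice.
ok = srcu_flags({"L1"}, {"U1"}, {("L1", "U1")})
bad = srcu_flags({"L1"}, {"U1", "U2"}, {("L1", "U1"), ("L1", "U2")})
```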

> > > With srcu_read_lock() and a matching srcu_read_unlock(), the answer is
> > > clear: All statements po-between the two. With srcu_down_read() and
> > > srcu_up_read(), the answer is cloudy in the extreme.
> >
> > And I agree that it must be clearly specified, and that my previous try
> > was completely lacking. Here is a second attempt:
> >
> > let srcu-rscs = (([Srcu-lock] ; data ; [Srcu-unlock]) & loc) |
> > (([Srcu-down] ; (data | rf)* ; [Srcu-up]) & loc)
> >
> > (And I see your proposal and will try it.)
> >
> > > Also, bear in mind that the Fundamental Law of RCU is formulated in
> > > terms of stores propagating to a critical section's CPU. What are we to
> > > make of this when a single critical section can belong to more than one
> > > CPU?
> >
> > One way of answering this question is by analogy with down() and up()
> > when used as a cross-task mutex. Another is by mechanically applying
> > some of current LKMM. Let's start with this second option.
> >
> > LKMM works mostly with critical sections, but we also discussed ordering
> > based on the set of events po-after an srcu_read_lock() on the one hand
> > and the set of events po-before an srcu_read_unlock() on the other.
> > Starting here, the critical section is the intersection of these two sets.
> >
> > In the case of srcu_down_read() and srcu_up_read(), as you say, whatever
> > might be a critical section must span processes. So what if instead of
> > po, we used (say) xbstar? Then given a set A such that ([Srcu-down] ;
> > xbstar ; A) and a set B such that (B ; xbstar ; [Srcu-up]), then the critical
> > section is the intersection of A and B.
> >
> > One objection to this approach is that a bunch of unrelated events could
> > end up being defined as part of the critical section. Except that this
> > happens already anyway in real critical sections in the Linux kernel.
> >
> > So what about down() and up() when used as cross-task mutexes?
> > These often do have conceptual critical sections that protect some
> > combination of resource, but these critical sections might span tasks
> > and/or workqueue handlers. And any reasonable definition of these
> > critical sections would be just as likely to pull in unrelated accesses as
> > the above intersection approach for srcu_down_read() and srcu_up_read().
> >
> > But I am just now making all this up, so thoughts?
>
> Maybe we don't really need to talk about read-side critical sections at
> all. Once again, here's what explanation.txt currently says:
>
> For any critical section C and any grace period G, at least
> one of the following statements must hold:
>
> (1) C ends before G does, and in addition, every store that
> propagates to C's CPU before the end of C must propagate to
> every CPU before G ends.
>
> (2) G starts before C does, and in addition, every store that
> propagates to G's CPU before the start of G must propagate
> to every CPU before C starts.
>
> Suppose we change this to:
>
> For any RCU lock operation L and matching unlock operation U,
> and any matching grace period G, at least one of the following
> statements must hold:
>
> (1) U executes before G ends, and in addition, every store that
> propagates to U's CPU before U executes must propagate to
> every CPU before G ends.
>
> (2) G starts before L executes, and in addition, every store that
> propagates to G's CPU before the start of G must propagate
> to every CPU before L executes.
>
> (For SRCU, G matches L and U if it operates on the same srcu structure.)
>
> This can be applied sensibly to regular RCU, regular SRCU, and the
> up/down version of SRCU. Maybe it's what we want.

I do like your proposed change!

> > > Indeed, given:
> > >
> > > P0(int *x) {
> > > srcu_down_read(x);
> > > }
> > >
> > > P1(int *x) {
> > > srcu_up_read(x);
> > > }
> > >
> > > what are we to make of executions in which P1 executes before P0?
> >
> > Indeed, there had better be something else forbidding such executions, or
> > this is an invalid use of srcu_down_read() and srcu_up_read(). This might
> > become more clear if the example is expanded to include the index returned
> > from srcu_down_read() that is to be passed to srcu_up_read():
> >
> > P0(int *x, int *i) {
> > WRITE_ONCE(*i, srcu_down_read(x));
> > }
> >
> > P1(int *x, int *i) {
> > srcu_up_read(x, READ_ONCE(*i));
> > }
>
> Hmmm. What happens if you write:
>
> r1 = srcu_down_read(x);
> r2 = srcu_down_read(x);
> srcu_up_read(x, r1);
> srcu_up_read(x, r2);
>
> ? I can't even tell what that would be _intended_ to do.

Let's take it one line at a time:

r1 = srcu_down_read(x);
// A
r2 = srcu_down_read(x);
// B
srcu_up_read(x, r1);
// C
srcu_up_read(x, r2);
// D

An SRCU grace period that starts at A is permitted to complete at
C, difficult though it might be to actually make this happen in the
Linux kernel. It need wait only for pre-existing critical sections.
But an SRCU grace period that starts at either B or C must wait for both
critical sections, that is until D.
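The rule applied here can be sketched as a toy timeline model (timestamps and names are illustrative, nothing like the actual SRCU implementation):

```python
# Toy timeline for the A/B/C/D example: a grace period need only wait for
# critical sections that already existed when it started.
# down(r1)=0, down(r2)=1, up(r1)=2, up(r2)=3.
sections = [(0, 2), (1, 3)]   # (down time, matching up time)

def earliest_gp_end(gp_start, sections):
    """A GP may end once every section begun before gp_start has ended."""
    pending = [end for (begin, end) in sections if begin < gp_start]
    return max(pending, default=gp_start)

assert earliest_gp_end(0.5, sections) == 2   # starts at A, may end at C
assert earliest_gp_end(1.5, sections) == 3   # starts at B, must wait to D
assert earliest_gp_end(2.5, sections) == 3   # starts at C, must wait to D
```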

This applies to srcu_read_lock() and srcu_read_unlock() just as much as
to srcu_down_read() and srcu_up_read(), correct? Each SRCU read-side
critical section is its own thing, and they do not flatten the way that
RCU read-side critical sections do.

I don't know of a safe and sane use of this pattern, as noted here:
https://paulmck.livejournal.com/40593.html

But someone might come up with such a use.

> In fact, it seems likely that to make this work, you have to store at
> least two values in *x: the value of the up/down counter, and the value
> returned by srcu_down_read or stored by srcu_up_read. That means you
> can't describe what's happening without using a structure, and herd7
> doesn't support structures.

Yes, if we needed to combine the two overlapping grace periods into a
single larger grace period, this would be a problem. But we do not,
because an SRCU grace period beginning just after the WRITE_ONCE(*x, 1)
is allowed to end right after the srcu_up_read(s, r1). That grace period
is not required to wait for the end of the second critical section.

Thanx, Paul

2023-01-18 20:51:45

by Jonas Oberhauser

[permalink] [raw]
Subject: Re: Internal vs. external barriers (was: Re: Interesting LKMM litmus test)



On 1/18/2023 4:50 AM, Paul E. McKenney wrote:
> On Tue, Jan 17, 2023 at 03:15:06PM -0500, Alan Stern wrote:
>> On Tue, Jan 17, 2023 at 09:43:08AM -0800, Paul E. McKenney wrote:
>>> On Tue, Jan 17, 2023 at 10:56:34AM -0500, Alan Stern wrote:
>>>> Isn't it true that the current code will flag srcu-bad-nesting if a
>>>> litmus test has non-nested overlapping SRCU read-side critical sections?
>>> Now that you mention it, it does indeed, flagging srcu-bad-nesting.
>>>
>>> Just to see if I understand, different-values yields true if the set
>>> contains multiple elements with the same value mapping to different
>>> values. Or, to put it another way, if the relation does not correspond
>>> to a function.
>> As I understand it, given a relation r (i.e., a set of pairs of events),
>> different-values(r) returns the sub-relation consisting of those pairs
>> in r for which the value associated with the first event of the pair is
>> different from the value associated with the second event of the pair.
> OK, so different-values(r) is different than (r \ id) because the
> former operates on values and the latter on events?

I think you can say that (if you allow yourself to be a little bit loose
with words, as I allow myself to be, much to the chagrin of Alan :) :( :)).

If you had a .value functional relation that relates every event to the
value of that event, then
    different-values(r) = r \ .value ; .value^-1
i.e., it relates events x and y iff: 1) r relates x and y, and 2) the
value of x is not equal to the value of y.

You could write this as
    different-values(r) = r \ .value ; value-id ; .value^-1
where value-id is like id but for values, i.e., relates every value v to
itself.

You could say that this difference operates on the values of the events,
rather than on the events themselves.
In contrast,
    r \ id
works directly on the events and relates x and y iff: 1) r relates x and
y, and 2) the event x is not equal to the event y.

In this sense I think your characterization is appropriate.
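A toy Python rendering of the distinction, with events modeled as (name, value) pairs (illustrative only, not the herd7 semantics):

```python
# r \ id removes pairs of identical events; different-values(r) removes
# pairs whose *values* coincide, even when the events differ.

def minus_id(r):
    """r \ id: keep pairs of distinct events."""
    return {(x, y) for (x, y) in r if x != y}

def different_values(r):
    """r \ (.value ; .value^-1): keep pairs carrying distinct values."""
    return {(x, y) for (x, y) in r if x[1] != y[1]}

a, b, c = ("A", 1), ("B", 1), ("C", 2)
r = {(a, a), (a, b), (a, c)}

assert minus_id(r) == {(a, b), (a, c)}        # distinct events
assert different_values(r) == {(a, c)}        # distinct values only
```

The pair (a, b) survives `r \ id` but not different-values, since A and B are different events carrying the same value.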



2023-01-18 21:29:16

by Jonas Oberhauser

[permalink] [raw]
Subject: Re: Internal vs. external barriers (was: Re: Interesting LKMM litmus test)



On 1/18/2023 9:54 PM, Alan Stern wrote:
> On Wed, Jan 18, 2023 at 12:06:01PM -0800, Paul E. McKenney wrote:
>> On Wed, Jan 18, 2023 at 11:50:24AM -0500, Alan Stern wrote:
>> Boqun mentioned off-list this morning that this is still the case,
>> and that each execution of srcu_read_lock() will return a unique value.
>> Assuming that I understood him correctly, anyway.
> That will no longer be true with the patch I posted yesterday. Every
> execution of srcu_read_lock() will return 0 (or whatever the initial
> value of the lock variable is).
>
> But with a small change to the .def file, each execution of
> srcu_read_unlock() can be made to increment the lock's value, and then
> the next srcu_read_lock() would naturally return the new value.

That's one of the reasons I'd prefer to see some way to define arbitrary
events and constrain their values axiomatically in cat/bell, rather than
having to rely on loads and stores.

>
>>>> given that I have no idea how one would go about modeling down_read()
>>>> and up_read() in LKMM.
>>> It might make sense to work on that first, before trying to do
>>> srcu_down_read() and srcu_up_read().
>> The thing is that it is easy to associate an srcu_down_read() with the
>> corresponding srcu_up_read(). With down() and up(), although in the
>> Linux kernel this might be represented by a data structure tracking
>> (say) an I/O request, LKMM is going to be hard pressed to figure that out.
> It would help (or at least, it would help _me_) if you gave a short
> explanation of how srcu_down_read() and srcu_up_read() are meant to
> work. With regular r/w semaphores, the initial lock value is 0, each
> down() operation decrements the value, each up() operation increments
> the value -- or vice versa if you don't like negative values -- and a
> write_lock() will wait until the value is >= 0. In that setting, it
> makes sense to say that a down() which changes the value from n to n-1
> matches the next up() which changes the value from n-1 to n.
>
> I presume that srcu semaphores do not work this way. Particularly since
> the down() operation returns a value which must be passed to the
> corresponding up() operation. So how _do_ they work?

Coming from the lens of how it probably works (you get an index computed
from the number of completed rcu_sync() calls, and down() on a
semaphore in an array indexed by that index, where any concurrent
rcu_sync() waits until the semaphore is 0 again), I suspect that each
value defines its own semaphore.
That's why I proposed

let balanced-srcu-updown = let rec
        unmatched-locks = Srcu-down \ domain(matched)
    and unmatched-unlocks = Srcu-up \ range(matched)
    and unmatched = unmatched-locks | unmatched-unlocks
    and unmatched-co = [unmatched] ; co & same-value ; [unmatched]
    and unmatched-locks-to-unlocks =
        [unmatched-locks] ; co & same-value ; [unmatched-unlocks]
    and matched = matched | (unmatched-locks-to-unlocks \
        (unmatched-co ; unmatched-co))
    in matched
let match-down-up = (co & same-value)^-1? ; balanced-srcu-updown ;
    (co & same-value)^-1?

Which would match every down on the same value with every up that
happens between two times that the value of the semaphore is 0, without
having to keep track of the actual value of the semaphore.

Since the semaphore starts from zero, it's really just a matter of
having balanced down() and up(), and if you are enclosed between
balanced down() and up(), you are in a critical section.

Then any GP that ends after any down() must also end after all "later"
up() on the same semaphore (same srcu_struct, same idx) until the value
reaches zero again. (Here "later" is encoded through co; it might be
that you want fr instead.)
And any GP that starts before any up() must also start before all
"earlier" down() on the same semaphore.

This is the sense of matching I'm trying to encode above.
Let me know if my understanding of down() and up() is wrong.
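The inside-out matching this recursion aims at can be sketched in Python as a toy fixpoint over an explicit co order (illustrative helper names; this mimics the intent of the cat code, not its exact semantics):

```python
# Match down/up events on one semaphore like balanced parentheses along
# co: a pair matches once no still-unmatched event lies co-between it.

def balanced_match(downs, ups, co):
    """downs/ups: sets of event ids; co: total order as a list."""
    matched = set()
    while True:
        md = {d for (d, u) in matched}
        mu = {u for (d, u) in matched}
        unmatched = (downs - md) | (ups - mu)
        new = set()
        for i, d in enumerate(co):
            if d not in downs - md:
                continue
            for j in range(i + 1, len(co)):
                u = co[j]
                if u in ups - mu:
                    # match only if no unmatched event lies co-between
                    if not any(e in unmatched for e in co[i + 1:j]):
                        new.add((d, u))
        if new <= matched:
            return matched
        matched |= new

# down d1, down d2, up u1, up u2 (the overlapping example):
co = ["d1", "d2", "u1", "u2"]
m = balanced_match({"d1", "d2"}, {"u1", "u2"}, co)
assert ("d2", "u1") in m   # innermost pair matches first
assert ("d1", "u2") in m   # then the outer pair, on a later iteration
```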

have fun, jonas


2023-01-18 21:37:36

by Paul E. McKenney

[permalink] [raw]
Subject: Re: Internal vs. external barriers (was: Re: Interesting LKMM litmus test)

On Wed, Jan 18, 2023 at 08:57:22PM +0100, Jonas Oberhauser wrote:
>
>
> On 1/18/2023 4:50 AM, Paul E. McKenney wrote:
> > On Tue, Jan 17, 2023 at 03:15:06PM -0500, Alan Stern wrote:
> > > On Tue, Jan 17, 2023 at 09:43:08AM -0800, Paul E. McKenney wrote:
> > > > On Tue, Jan 17, 2023 at 10:56:34AM -0500, Alan Stern wrote:
> > > > > Isn't it true that the current code will flag srcu-bad-nesting if a
> > > > > litmus test has non-nested overlapping SRCU read-side critical sections?
> > > > Now that you mention it, it does indeed, flagging srcu-bad-nesting.
> > > >
> > > > Just to see if I understand, different-values yields true if the set
> > > > contains multiple elements with the same value mapping to different
> > > > values. Or, to put it another way, if the relation does not correspond
> > > > to a function.
> > > As I understand it, given a relation r (i.e., a set of pairs of events),
> > > different-values(r) returns the sub-relation consisting of those pairs
> > > in r for which the value associated with the first event of the pair is
> > > different from the value associated with the second event of the pair.
> > OK, so different-values(r) is different than (r \ id) because the
> > former operates on values and the latter on events?
>
> I think you can say that (if you allow yourself to be a little bit loose
> with words, as I allow myself to be, much to the chagrin of Alan :) :( :)).

Well, Alan's insistence on rigor has kept LKMM out of trouble more times
than I can count. ;-)

> If you had a .value functional relation that relates every event to the
> value of that event, then
>     different-values(r) = r \ .value ; .value^-1
> i.e., it relates events x and y iff: 1) r relates x and y, and 2) the value
> of x is not equal to the value of y.
>
> You could write this as
>     different-values(r) = r \ .value ; value-id ; .value^-1
> where value-id is like id but for values, i.e., relates every value v to
> itself.
>
> You could say that this difference operates on the values of the events,
> rather than on the events themselves.
> In contrast,
>     r \ id
> works directly on the events and relates x and y iff: 1) r relates x and y,
> and 2) the event x is not equal to the event y.
>
> In this sense I think your characterization is appropriate.

It looks to be "different domain values", but maybe I should just run
some experiments. ;-)

Thanx, Paul

2023-01-18 21:37:41

by Jonas Oberhauser

[permalink] [raw]
Subject: Re: Internal vs. external barriers (was: Re: Interesting LKMM litmus test)



On 1/18/2023 9:19 PM, Paul E. McKenney wrote:
> On Wed, Jan 18, 2023 at 08:42:36PM +0100, Jonas Oberhauser wrote:
>> On 1/18/2023 5:50 PM, Alan Stern wrote:
>>> On Tue, Jan 17, 2023 at 07:50:41PM -0800, Paul E. McKenney wrote:
>>>> On Tue, Jan 17, 2023 at 03:15:06PM -0500, Alan Stern wrote:
>>>>> On Tue, Jan 17, 2023 at 09
>>>>>> Given an Srcu-down and an Srcu-up:
>>>>>>
>>>>>> let srcu-rscs = ( return_value(Srcu-lock) ; (dep | rfi)* ;
>>>>>> parameter(Srcu-unlock, 2) ) |
>>>>>> ( return_value(Srcu-down) ; (dep | rf)* ;
>>>>>> parameter(Srcu-up, 2) )
>>>>>>
>>>>>> Seem reasonable, or am I missing yet something else?
>>>>> Not at all reasonable.
>>>>>
>>>>> For one thing, consider this question: Which statements lie inside a
>>>>> read-side critical section?
>>>> Here srcu_down_read() and srcu_up_read() are to srcu_read_lock() and
>>>> srcu_read_unlock() as down_read() and up_read() are to mutex_lock()
>>>> and mutex_unlock(). Not that this should be all that much comfort
>>>> given that I have no idea how one would go about modeling down_read()
>>>> and up_read() in LKMM.
>>> It might make sense to work on that first, before trying to do
>>> srcu_down_read() and srcu_up_read().
>>>
>>>>> With srcu_read_lock() and a matching srcu_read_unlock(), the answer is
>>>>> clear: All statements po-between the two. With srcu_down_read() and
>>>>> srcu_up_read(), the answer is cloudy in the extreme.
>>>> And I agree that it must be clearly specified, and that my previous try
>>>> was completely lacking. Here is a second attempt:
>>>>
>>>> let srcu-rscs = (([Srcu-lock] ; data ; [Srcu-unlock]) & loc) |
>>>> (([Srcu-down] ; (data | rf)* ; [Srcu-up]) & loc)
>>>>
>>>> (And I see your proposal and will try it.)
>>>>
>>>>> Also, bear in mind that the Fundamental Law of RCU is formulated in
>>>>> terms of stores propagating to a critical section's CPU. What are we to
>>>>> make of this when a single critical section can belong to more than one
>>>>> CPU?
>>>> One way of answering this question is by analogy with down() and up()
>>>> when used as a cross-task mutex. Another is by mechanically applying
>>>> some of current LKMM. Let's start with this second option.
>>>>
>>>> LKMM works mostly with critical sections, but we also discussed ordering
>>>> based on the set of events po-after an srcu_read_lock() on the one hand
>>>> and the set of events po-before an srcu_read_unlock() on the other.
>>>> Starting here, the critical section is the intersection of these two sets.
>>>>
>>>> In the case of srcu_down_read() and srcu_up_read(), as you say, whatever
>>>> might be a critical section must span processes. So what if instead of
>>>> po, we used (say) xbstar? Then given a set A such that ([Srcu-down] ;
>>>> xbstar ; A) and a set B such that (B ; xbstar ; [Srcu-up]), then the critical
>>>> section is the intersection of A and B.
>>>>
>>>> One objection to this approach is that a bunch of unrelated events could
>>>> end up being defined as part of the critical section. Except that this
>>>> happens already anyway in real critical sections in the Linux kernel.
>>>>
>>>> So what about down() and up() when used as cross-task mutexes?
>>>> These often do have conceptual critical sections that protect some
>>>> combination of resource, but these critical sections might span tasks
>>>> and/or workqueue handlers. And any reasonable definition of these
>>>> critical sections would be just as likely to pull in unrelated accesses as
>>>> the above intersection approach for srcu_down_read() and srcu_up_read().
>>>>
>>>> But I am just now making all this up, so thoughts?
>>> Maybe we don't really need to talk about read-side critical sections at
>>> all. Once again, here's what explanation.txt currently says:
>>>
>>> For any critical section C and any grace period G, at least
>>> one of the following statements must hold:
>>>
>>> (1) C ends before G does, and in addition, every store that
>>> propagates to C's CPU before the end of C must propagate to
>>> every CPU before G ends.
>>>
>>> (2) G starts before C does, and in addition, every store that
>>> propagates to G's CPU before the start of G must propagate
>>> to every CPU before C starts.
>>>
>>> Suppose we change this to:
>>>
>>> For any RCU lock operation L and matching unlock operation U,
>>> and any matching grace period G, at least one of the following
>>> statements must hold:
>>>
>>> (1) U executes before G ends, and in addition, every store that
>>> propagates to U's CPU before U executes must propagate to
>>> every CPU before G ends.
>>>
>>> (2) G starts before L executes, and in addition, every store that
>>> propagates to G's CPU before the start of G must propagate
>>> to every CPU before L executes.
>>>
>>> (For SRCU, G matches L and U if it operates on the same srcu structure.)
>> I think for the formalization, the definition of "critical section" is
>> hidden inside the word "matching" here.
>> You will still need to define what matching means for up and down.
>> Can I understand down and up to create a large read-side critical section
>> that is shared between multiple threads, analogously to a semaphore? With
>> the restriction that for srcu, there are really multiple (two) such critical
>> sections that can be open in parallel, which are indexed by the return value
>> of down/the input of up?
>>
>> If so I suspect that every down matches with every up within a "critical
>> section"?
> maybe you can define balancing along the co analogous to the balancing along
>> po currently used for matching rcu_lock() and rcu_unlock(). I.e.,
>>
>> down ------------- up
>>    \down--------up/
>>       \down-up/
>>         \_/
>> where diagonal links are co links and the straight links are "balanced
>> match" links.
> The SRCU read-side critical sections are fundamentally different than
> those of RCU. [...]
> In contrast, SRCU read-side critical sections are defined by the
> return value of srcu_read_lock() being passed into the matching
> srcu_read_unlock().

I'm a bit confused. I previously thought that there is
srcu_lock/srcu_unlock and srcu_down/srcu_up and that these are different
things.

Your explanation matches how I understood srcu_read_lock after reading
the paper and pretending that there wasn't a single counter, while my
understanding of srcu_read_down would be closer to the original
implementation in the 2009 paper where there was a single counter, and
thus srcu_read_down and srcu_read_up could open a multi-thread critical
section.

Is there only one thing and read_down *is* read_lock?
If they are not the same, is my understanding of read_down correct?

And isn't it also true that the srcu_lock() needs to be on the same CPU
as the matching srcu_unlock()?

I think for matching srcu_lock to srcu_unlock, you can just use the data
dependency (following the "hack" of defining them as reads and writes).

What I was suggesting below is how to redefine "match" between read_down
and read_up that work more like a cross-thread semaphore.


>> Then everything that is enclosed within a pair of "balanced match" is
>> linked:
>>
>> match-down-up = co^-1?; balanced-srcu-updown ; co^-1?
>>
>> Since multiple critical sections can be in-flight, maybe you can use co &
>> same-value (or whatever the relation is) to define this?
>>
>>
>> let balanced-srcu-updown = let rec
>>         unmatched-locks = Srcu-down \ domain(matched)
>>     and unmatched-unlocks = Srcu-up \ range(matched)
>>     and unmatched = unmatched-locks | unmatched-unlocks
>>     and unmatched-co = [unmatched] ; co & same-value ; [unmatched]
>>     and unmatched-locks-to-unlocks =
>>         [unmatched-locks] ; co & same-value ; [unmatched-unlocks]
>>     and matched = matched | (unmatched-locks-to-unlocks \
>>         (unmatched-co ; unmatched-co))
>>     in matched
>> let match-down-up = (co & same-value)^-1? ; balanced-srcu-updown ;
>>     (co & same-value)^-1?



>> Is the implementation of srcu-lock and srcu-unlock still anything like the
>> implementation in the 2009 paper?
> The interaction between readers and grace period is now mediated by a
> per-CPU pair of lock counters and of unlock counters, so the 2009 paper is
> not the best guide. But yes, you would likely need three or four pairwise
> overlapping critical sections for the current SRCU implementation to end
> "early".

That makes sense.

Have fun, jonas

2023-01-18 21:42:52

by Jonas Oberhauser

[permalink] [raw]
Subject: Re: Internal vs. external barriers (was: Re: Interesting LKMM litmus test)



On 1/18/2023 10:12 PM, Paul E. McKenney wrote:

> The only difference between srcu_read_lock() and srcu_read_unlock()
> on the one hand and srcu_down_read() and srcu_up_read() on the other
> is that a matching pair of srcu_read_lock() and srcu_read_unlock()
> must be running on the same task. In contrast, srcu_down_read() and
> srcu_up_read() are not subject to this constraint.
>
>> What I was suggesting below is how to redefine "match" between read_down and
>> read_up that work more like a cross-thread semaphore.
> Understood, but what I don't understand is why not simply this:
>
> let srcu-rscs-down = ([Srcu-down] ; (data | rf)* ; [Srcu-up]) & loc

Oh, I had thought that it should be more like a semaphore than just a
cross-CPU mutex.

Here's an example of how what you are describing would be used:

P0{
    idx = srcu_down(&ss);
    store_release(done, 1);
}

P1{
    while (!load_acquire(done));
    srcu_up(&ss, idx);
}

What I was thinking of is more something like this:

P0{
    idx1 = srcu_down(&ss);
    srcu_up(&ss, idx1);
}

P1{
    idx2 = srcu_down(&ss);
    srcu_up(&ss, idx2);
}

where the big difference from srcu_lock/unlock would be that if P0 and P1
happened to get the same index -- which you could very well check or
synchronize on -- you would be guaranteed that the grace period only
ends once *all* threads that are using this index have called up().
(Note that I believe your implementation has this property, and some
users may come to rely on it if they find out!)
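A toy counter model of this per-index guarantee (illustrative names like ToySrcu; nothing here is the kernel's actual SRCU implementation):

```python
# Each index has a counter; a grace period for that index can only end
# once the counter has returned to zero, i.e. once every thread that
# down()ed on the index has up()ed.
from collections import defaultdict

class ToySrcu:
    def __init__(self):
        self.counters = defaultdict(int)

    def down(self, idx):
        self.counters[idx] += 1
        return idx

    def up(self, idx):
        self.counters[idx] -= 1

    def gp_may_end(self, idx):
        return self.counters[idx] == 0

ss = ToySrcu()
i0 = ss.down(0)   # P0 and P1 happen to get the same index
i1 = ss.down(0)
ss.up(i0)
assert not ss.gp_may_end(0)   # P1 still holds index 0
ss.up(i1)
assert ss.gp_may_end(0)       # all users of index 0 are done
```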

If you want this latter kind of guarantee, then you need to do so
something along the lines of what Alan or I wrote.

If all you need is the ability to use the first scenario, without any
guarantee that if the index happened to be the same (or providing an API
where you can do the down with a fixed index provided by P0) the grace
period will extend, then what you propose should be right.

But from Alan's comments I had mistakenly gathered that this wouldn't be the case.

Best wishes,
jonas

2023-01-18 21:57:16

by Paul E. McKenney

[permalink] [raw]
Subject: Re: Internal vs. external barriers (was: Re: Interesting LKMM litmus test)

On Wed, Jan 18, 2023 at 08:42:36PM +0100, Jonas Oberhauser wrote:
> On 1/18/2023 5:50 PM, Alan Stern wrote:
> > On Tue, Jan 17, 2023 at 07:50:41PM -0800, Paul E. McKenney wrote:
> > > On Tue, Jan 17, 2023 at 03:15:06PM -0500, Alan Stern wrote:
> > > > On Tue, Jan 17, 2023 at 09
> > > > > Given an Srcu-down and an Srcu-up:
> > > > >
> > > > > let srcu-rscs = ( return_value(Srcu-lock) ; (dep | rfi)* ;
> > > > > parameter(Srcu-unlock, 2) ) |
> > > > > ( return_value(Srcu-down) ; (dep | rf)* ;
> > > > > parameter(Srcu-up, 2) )
> > > > >
> > > > > Seem reasonable, or am I missing yet something else?
> > > > Not at all reasonable.
> > > >
> > > > For one thing, consider this question: Which statements lie inside a
> > > > read-side critical section?
> > > Here srcu_down_read() and srcu_up_read() are to srcu_read_lock() and
> > > srcu_read_unlock() as down_read() and up_read() are to mutex_lock()
> > > and mutex_unlock(). Not that this should be all that much comfort
> > > given that I have no idea how one would go about modeling down_read()
> > > and up_read() in LKMM.
> > It might make sense to work on that first, before trying to do
> > srcu_down_read() and srcu_up_read().
> >
> > > > With srcu_read_lock() and a matching srcu_read_unlock(), the answer is
> > > > clear: All statements po-between the two. With srcu_down_read() and
> > > > srcu_up_read(), the answer is cloudy in the extreme.
> > > And I agree that it must be clearly specified, and that my previous try
> > > was completely lacking. Here is a second attempt:
> > >
> > > let srcu-rscs = (([Srcu-lock] ; data ; [Srcu-unlock]) & loc) |
> > > (([Srcu-down] ; (data | rf)* ; [Srcu-up]) & loc)
> > >
> > > (And I see your proposal and will try it.)
> > >
> > > > Also, bear in mind that the Fundamental Law of RCU is formulated in
> > > > terms of stores propagating to a critical section's CPU. What are we to
> > > > make of this when a single critical section can belong to more than one
> > > > CPU?
> > > One way of answering this question is by analogy with down() and up()
> > > when used as a cross-task mutex. Another is by mechanically applying
> > > some of current LKMM. Let's start with this second option.
> > >
> > > LKMM works mostly with critical sections, but we also discussed ordering
> > > based on the set of events po-after an srcu_read_lock() on the one hand
> > > and the set of events po-before an srcu_read_unlock() on the other.
> > > Starting here, the critical section is the intersection of these two sets.
> > >
> > > In the case of srcu_down_read() and srcu_up_read(), as you say, whatever
> > > might be a critical section must span processes. So what if instead of
> > > po, we used (say) xbstar? Then given a set A such that ([Srcu-down] ;
> > > xbstar ; A) and a set B such that (B ; xbstar ; [Srcu-up]), then the critical
> > > section is the intersection of A and B.
> > >
> > > One objection to this approach is that a bunch of unrelated events could
> > > end up being defined as part of the critical section. Except that this
> > > happens already anyway in real critical sections in the Linux kernel.
> > >
> > > So what about down() and up() when used as cross-task mutexes?
> > > These often do have conceptual critical sections that protect some
> > > combination of resource, but these critical sections might span tasks
> > > and/or workqueue handlers. And any reasonable definition of these
> > > critical sections would be just as likely to pull in unrelated accesses as
> > > the above intersection approach for srcu_down_read() and srcu_up_read().
> > >
> > > But I am just now making all this up, so thoughts?
> > Maybe we don't really need to talk about read-side critical sections at
> > all. Once again, here's what explanation.txt currently says:
> >
> > For any critical section C and any grace period G, at least
> > one of the following statements must hold:
> >
> > (1) C ends before G does, and in addition, every store that
> > propagates to C's CPU before the end of C must propagate to
> > every CPU before G ends.
> >
> > (2) G starts before C does, and in addition, every store that
> > propagates to G's CPU before the start of G must propagate
> > to every CPU before C starts.
> >
> > Suppose we change this to:
> >
> > For any RCU lock operation L and matching unlock operation U,
> > and any matching grace period G, at least one of the following
> > statements must hold:
> >
> > (1) U executes before G ends, and in addition, every store that
> > propagates to U's CPU before U executes must propagate to
> > every CPU before G ends.
> >
> > (2) G starts before L executes, and in addition, every store that
> > propagates to G's CPU before the start of G must propagate
> > to every CPU before L executes.
> >
> > (For SRCU, G matches L and U if it operates on the same srcu structure.)
>
> I think for the formalization, the definition of "critical section" is
> hidden inside the word "matching" here.
> You will still need to define what matching means for up and down.
> Can I understand down and up to create a large read-side critical section
> that is shared between multiple threads, analogously to a semaphore? With
> the restriction that for srcu, there are really multiple (two) such critical
> sections that can be open in parallel, which are indexed by the return value
> of down/the input of up?
>
> If so I suspect that every down matches with every up within a "critical
> section"?
> maybe you can define balancing along the co analogous to the balancing along
> po currently used for matching rcu_lock() and rcu_unlock(). I.e.,
>
> down ------------- up
>    \down--------up/
>       \down-up/
>         \_/
> where diagonal links are co links and the straight links are "balanced
> match" links.

The SRCU read-side critical sections are fundamentally different than
those of RCU. RCU's critical sections are defined by po and they nest
and flatten, so that a nested set of RCU read-side critical sections is
equivalent to a single critical section spanning the full set. Your
example above illustrates this nicely.

In contrast, SRCU read-side critical sections are defined by the
return value of srcu_read_lock() being passed into the matching
srcu_read_unlock(). They can be nested, overlapped, or whatever,
but each SRCU read-side critical section is its own thing.
Yes, in SRCU's fully nested case, you cannot tell the difference,
but you can with partially overlapping critical sections, as in
Alan's example:

r1 = srcu_read_lock(s);
r2 = srcu_read_lock(s);
srcu_read_unlock(s, r1);
srcu_read_unlock(s, r2);

An SRCU grace period that starts between that pair of srcu_read_lock()
invocations is permitted to end between the pair of srcu_read_unlock()
invocations because the second critical section has not yet started.

Mainline LKMM's current SRCU definitions are an approximation, which
we are now trying to make exact.

> Then everything that is enclosed within a pair of "balanced match" is
> linked:
>
> match-down-up = co^-1?; balanced-srcu-updown ; co^-1?
>
> Since multiple critical sections can be in-flight, maybe you can use co &
> same-value (or whatever the relation is) to define this?
>
>
> let balanced-srcu-updown = let rec
>         unmatched-locks = Srcu-down \ domain(matched)
>     and unmatched-unlocks = Srcu-up \ range(matched)
>     and unmatched = unmatched-locks | unmatched-unlocks
>     and unmatched-co = [unmatched] ; co & same-value ; [unmatched]
>     and unmatched-locks-to-unlocks =
>         [unmatched-locks] ; co & same-value ; [unmatched-unlocks]
>     and matched = matched | (unmatched-locks-to-unlocks \
>         (unmatched-co ; unmatched-co))
>     in matched
> let match-down-up = (co & same-value)^-1? ; balanced-srcu-updown ;
>     (co & same-value)^-1?

Or just substitute "(data | rf)*" for the "data" in Alan's
current definition. (But keeping Alan's definition or similar for
srcu_read_lock() and using the "(data | rf)*" only for srcu_down_read().)

> > > > Indeed, given:
> > > >
> > > > P0(int *x) {
> > > > srcu_down_read(x);
> > > > }
> > > >
> > > > P1(int *x) {
> > > > srcu_up_read(x);
> > > > }
> > > >
> > > > what are we to make of executions in which P1 executes before P0?
> > > Indeed, there had better be something else forbidding such executions, or
> > > this is an invalid use of srcu_down_read() and srcu_up_read().
>
> Would it be sufficient to flag executions in which an up is not matched with
> any down event?
>
> > > This might
> > > become more clear if the example is expanded to include the index returned
> > > from srcu_down_read() that is to be passed to srcu_up_read():
> > >
> > > P0(int *x, int *i) {
> > > WRITE_ONCE(*i, srcu_down_read(x));
> > > }
> > >
> > > P1(int *x, int *i) {
> > > srcu_up_read(x, READ_ONCE(*i));
> > > }
> > Hmmm. What happens if you write:
> >
> > r1 = srcu_down_read(x);
> > r2 = srcu_down_read(x);
> > srcu_up_read(x, r1);
> > srcu_up_read(x, r2);
> >
> > ? I can't even tell what that would be _intended_ to do.
>
> Is it correct that it creates one or two read-side critical sections
> depending on whether the two down() happen to return the same value, which
> either spans at least all four lines (plus perhaps more if other threads
> also do down()) or the first spans lines 1-3 and the second spans 2-4?

It creates two of them.

> Is the implementation of srcu-lock and srcu-unlock still anything like the
> implementation in the 2009 paper?

The interaction between readers and grace period is now mediated by a
per-CPU pair of lock counters and of unlock counters, so the 2009 paper is
not the best guide. But yes, you would likely need three or four pairwise
overlapping critical sections for the current SRCU implementation to end
"early".

> best wishes and thanks for your patient explanations, jonas

Not a problem! Though I will be distracted most of this afternoon,
Pacific Time.

Thanx, Paul

2023-01-18 21:58:06

by Alan Stern

Subject: Re: Internal vs. external barriers (was: Re: Interesting LKMM litmus test)

On Wed, Jan 18, 2023 at 12:06:01PM -0800, Paul E. McKenney wrote:
> On Wed, Jan 18, 2023 at 11:50:24AM -0500, Alan Stern wrote:
> Boqun mentioned off-list this morning that this is still the case,
> and that each execution of srcu_read_lock() will return a unique value.
> Assuming that I understood him correctly, anyway.

That will no longer be true with the patch I posted yesterday. Every
execution of srcu_read_lock() will return 0 (or whatever the initial
value of the lock variable is).

But with a small change to the .def file, each execution of
srcu_read_unlock() can be made to increment the lock's value, and then
the next srcu_read_lock() would naturally return the new value.

> > > given that I have no idea how one would go about modeling down_read()
> > > and up_read() in LKMM.
> >
> > It might make sense to work on that first, before trying to do
> > srcu_down_read() and srcu_up_read().
>
> The thing is that it is easy to associate an srcu_down_read() with the
> corresponding srcu_up_read(). With down() and up(), although in the
> Linux kernel this might be represented by a data structure tracking
> (say) an I/O request, LKMM is going to be hard pressed to figure that out.

It would help (or at least, it would help _me_) if you gave a short
explanation of how srcu_down_read() and srcu_up_read() are meant to
work. With regular r/w semaphores, the initial lock value is 0, each
down() operation decrements the value, each up() operation increments
the value -- or vice versa if you don't like negative values -- and a
write_lock() will wait until the value is >= 0. In that setting, it
makes sense to say that a down() which changes the value from n to n-1
matches the next up() which changes the value from n-1 to n.

I presume that srcu semaphores do not work this way. Particularly since
the down() operation returns a value which must be passed to the
corresponding up() operation. So how _do_ they work?
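Alan's matching rule for ordinary semaphores can be made mechanical. The sketch below is only an illustration of that rule (the function name and event encoding are invented here): it pairs each down() with the next up() that restores the counter to the value the down() started from.

```python
# Pair each "down" with its matching "up" under Alan's rule: a down()
# taking the count from n to n-1 matches the next up() taking it from
# n-1 back to n.
def match_down_up(events):
    count = 0
    pending = {}                 # value reached by a down -> its index
    pairs = []
    for i, ev in enumerate(events):
        if ev == "down":
            count -= 1
            pending[count] = i   # latest unmatched down reaching this value
        else:                    # "up"
            pairs.append((pending.pop(count), i))
            count += 1
    return pairs

# Nested holds pair inside-out, like balanced parentheses.
assert match_down_up(["down", "down", "up", "up"]) == [(1, 2), (0, 3)]
# Sequential holds pair in order.
assert match_down_up(["down", "up", "down", "up"]) == [(0, 1), (2, 3)]
```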

> > Hmmm. What happens if you write:
> >
> > r1 = srcu_down_read(x);
> > r2 = srcu_down_read(x);
> > srcu_up_read(x, r1);
> > srcu_up_read(x, r2);
> >
> > ? I can't even tell what that would be _intended_ to do.
>
> Let's take it one line at a time:
>
> r1 = srcu_down_read(x);
> // A
> r2 = srcu_down_read(x);
> // B
> srcu_up_read(x, r1);
> // C
> srcu_up_read(x, r2);
> // D
>
> An SRCU grace period that starts at A is permitted to complete at
> C, difficult though it might be to actually make this happen in the
> Linux kernel. It need wait only for pre-existing critical sections.

So the down() returning r1 matches the up() receiving r1?

> But an SRCU grace period that starts at either B or C must wait for both
> critical sections, that is until D.

Implying that the down() returning r2 matches up() receiving r2?

And in general, an up() matches a down() iff they have the same values?
And we can imagine that every down() returns a different value?

How does this differ from srcu_read_lock() and srcu_read_unlock()? And
how do the "up" and "down" parts figure into it? -- what is going up or
down?

Alan

2023-01-18 22:01:21

by Paul E. McKenney

Subject: Re: Internal vs. external barriers (was: Re: Interesting LKMM litmus test)

On Wed, Jan 18, 2023 at 09:30:31PM +0100, Jonas Oberhauser wrote:
> On 1/18/2023 9:19 PM, Paul E. McKenney wrote:
> > On Wed, Jan 18, 2023 at 08:42:36PM +0100, Jonas Oberhauser wrote:
> > > On 1/18/2023 5:50 PM, Alan Stern wrote:
> > > > On Tue, Jan 17, 2023 at 07:50:41PM -0800, Paul E. McKenney wrote:
> > > > > On Tue, Jan 17, 2023 at 03:15:06PM -0500, Alan Stern wrote:
> > > > > > On Tue, Jan 17, 2023 at 09
> > > > > > > Given an Srcu-down and an Srcu-up:
> > > > > > >
> > > > > > > let srcu-rscs = ( return_value(Srcu-lock) ; (dep | rfi)* ;
> > > > > > > parameter(Srcu-unlock, 2) ) |
> > > > > > > ( return_value(Srcu-down) ; (dep | rf)* ;
> > > > > > > parameter(Srcu-up, 2) )
> > > > > > >
> > > > > > > Seem reasonable, or am I missing yet something else?
> > > > > > Not at all reasonable.
> > > > > >
> > > > > > For one thing, consider this question: Which statements lie inside a
> > > > > > read-side critical section?
> > > > > Here srcu_down_read() and srcu_up_read() are to srcu_read_lock() and
> > > > > srcu_read_unlock() as down_read() and up_read() are to mutex_lock()
> > > > > and mutex_unlock(). Not that this should be all that much comfort
> > > > > given that I have no idea how one would go about modeling down_read()
> > > > > and up_read() in LKMM.
> > > > It might make sense to work on that first, before trying to do
> > > > srcu_down_read() and srcu_up_read().
> > > >
> > > > > > With srcu_read_lock() and a matching srcu_read_unlock(), the answer is
> > > > > > clear: All statements po-between the two. With srcu_down_read() and
> > > > > > srcu_up_read(), the answer is cloudy in the extreme.
> > > > > And I agree that it must be clearly specified, and that my previous try
> > > > > was completely lacking. Here is a second attempt:
> > > > >
> > > > > let srcu-rscs = (([Srcu-lock] ; data ; [Srcu-unlock]) & loc) |
> > > > > (([Srcu-down] ; (data | rf)* ; [Srcu-up]) & loc)
> > > > >
> > > > > (And I see your proposal and will try it.)
> > > > >
> > > > > > Also, bear in mind that the Fundamental Law of RCU is formulated in
> > > > > > terms of stores propagating to a critical section's CPU. What are we to
> > > > > > make of this when a single critical section can belong to more than one
> > > > > > CPU?
> > > > > One way of answering this question is by analogy with down() and up()
> > > > > when used as a cross-task mutex. Another is by mechanically applying
> > > > > some of current LKMM. Let's start with this second option.
> > > > >
> > > > > LKMM works mostly with critical sections, but we also discussed ordering
> > > > > based on the set of events po-after an srcu_read_lock() on the one hand
> > > > > and the set of events po-before an srcu_read_unlock() on the other.
> > > > > Starting here, the critical section is the intersection of these two sets.
> > > > >
> > > > > In the case of srcu_down_read() and srcu_up_read(), as you say, whatever
> > > > > might be a critical section must span processes. So what if instead of
> > > > > po, we used (say) xbstar? Then given a set of A such that ([Srcu-down ;
> > > > > xbstar ; A) and B such that (B ; xbstar ; [Srcu-up]), then the critical
> > > > > section is the intersection of A and B.
> > > > >
> > > > > One objection to this approach is that a bunch of unrelated events could
> > > > > end up being defined as part of the critical section. Except that this
> > > > > happens already anyway in real critical sections in the Linux kernel.
> > > > >
> > > > > So what about down() and up() when used as cross-task mutexes?
> > > > > These often do have conceptual critical sections that protect some
> > > > > combination of resource, but these critical sections might span tasks
> > > > > and/or workqueue handlers. And any reasonable definition of these
> > > > > critical sections would be just as likely to pull in unrelated accesses as
> > > > > the above intersection approach for srcu_down_read() and srcu_up_read().
> > > > >
> > > > > But I am just now making all this up, so thoughts?
> > > > Maybe we don't really need to talk about read-side critical sections at
> > > > all. Once again, here's what explanation.txt currently says:
> > > >
> > > > For any critical section C and any grace period G, at least
> > > > one of the following statements must hold:
> > > >
> > > > (1) C ends before G does, and in addition, every store that
> > > > propagates to C's CPU before the end of C must propagate to
> > > > every CPU before G ends.
> > > >
> > > > (2) G starts before C does, and in addition, every store that
> > > > propagates to G's CPU before the start of G must propagate
> > > > to every CPU before C starts.
> > > >
> > > > Suppose we change this to:
> > > >
> > > > For any RCU lock operation L and matching unlock operation U,
> > > > and any matching grace period G, at least one of the following
> > > > statements must hold:
> > > >
> > > > (1) U executes before G ends, and in addition, every store that
> > > > propagates to U's CPU before U executes must propagate to
> > > > every CPU before G ends.
> > > >
> > > > (2) G starts before L executes, and in addition, every store that
> > > > propagates to G's CPU before the start of G must propagate
> > > > to every CPU before L executes.
> > > >
> > > > (For SRCU, G matches L and U if it operates on the same srcu structure.)
> > > I think for the formalization, the definition of "critical section" is
> > > hidden inside the word "matching" here.
> > > You will still need to define what matching means for up and down.
> > > Can I understand down and up to create a large read-side critical section
> > > that is shared between multiple threads, analogously to a semaphore? With
> > > the restriction that for srcu, there are really multiple (two) such critical
> > > sections that can be open in parallel, which are indexed by the return value
> > > of down/the input of up?
> > >
> > > If so I suspect that every down matches with every up within a "critical
> > > section"?
> > > maybe you can define balancing along the co analogous to the balancing along
> > > po currently used for matching rcu_lock() and rcu_unlock(). I.e.,
> > >
> > > down ------------- up
> > >    \down--------up/
> > >        \down-up/
> > >           \_/
> > > where diagonal links are co links and the straight links are "balanced
> > > match" links.
> > The SRCU read-side critical sections are fundamentally different than
> > those of RCU. [...]
> > In contrast, SRCU read-side critical sections are defined by the
> > return value of srcu_read_lock() being passed into the matching
> > srcu_read_unlock().
>
> I'm a bit confused. I previously thought that there is srcu_lock/srcu_unlock
> and srcu_down/srcu_up and that these are different things.
>
> Your explanation matches how I understood srcu_read_lock after reading the
> paper and pretending that there wasn't a single counter, while my
> understanding of srcu_read_down would be closer to the original
> implementation in the 2009 paper where there was a single counter, and thus
> srcu_read_down and srcu_read_up could open a multi-thread critical section.
>
> Is there only one thing and read_down *is* read_lock?
> If they are not the same, is my understand of read_down correct?
>
> And isn't it also true that the srcu_lock() needs to be on the same CPU as
> the matching srcu_unlock()?
>
> I think for matching srcu_lock to srcu_unlock, you can just use the data
> dependency (following the "hack" of defining them as reads and writes).

The only difference between srcu_read_lock() and srcu_read_unlock()
on the one hand and srcu_down_read() and srcu_up_read() on the other
is that a matching pair of srcu_read_lock() and srcu_read_unlock()
must be running on the same task. In contrast, srcu_down_read() and
srcu_up_read() are not subject to this constraint.

> What I was suggesting below is how to redefine "match" between read_down and
> read_up that work more like a cross-thread semaphore.

Understood, but what I don't understand is why not simply this:

let srcu-rscs-down = ([Srcu-down] ; (data | rf)* ; [Srcu-up]) & loc

> > > Then everything that is enclosed within a pair of "balanced match" is
> > > linked:
> > >
> > > match-down-up = co^-1?; balanced-srcu-updown ; co^-1?
> > >
> > > Since multiple critical sections can be in-flight, maybe you can use co &
> > > same-value (or whatever the relation is) to define this?
> > >
> > >
> > > let balanced-srcu-updown = let rec
> > >         unmatched-locks = Srcu-down \ domain(matched)
> > >     and unmatched-unlocks = Srcu-up \ range(matched)
> > >     and unmatched = unmatched-locks | unmatched-unlocks
> > >     and unmatched-co = [unmatched] ; co & same-value ; [unmatched]
> > >     and unmatched-locks-to-unlocks =
> > >         [unmatched-locks] ; co & same-value ; [unmatched-unlocks]
> > >     and matched = matched | (unmatched-locks-to-unlocks \
> > >         (unmatched-co ; unmatched-co))
> > >     in matched
> > > let match-down-up = (co & same-value)^-1? ; balanced-srcu-updown ; (co &
> > > same-value)^-1?
>
>
>
> > > Is the implementation of srcu-lock and srcu-unlock still anything like the
> > > implementation in the 2009 paper?
>
> > The interaction between readers and grace period is now mediated by a
> > per-CPU pair of lock counters and of unlock counters, so the 2009 paper is
> > not the best guide. But yes, you would likely need three or four pairwise
> > overlapping critical sections for the current SRCU implementation to end
> > "early".
>
> That makes sense.
>
> Have fun, jonas

And you! ;-)

Thanx, Paul

2023-01-19 00:39:14

by Paul E. McKenney

Subject: Re: Internal vs. external barriers (was: Re: Interesting LKMM litmus test)

On Wed, Jan 18, 2023 at 03:54:47PM -0500, Alan Stern wrote:
> On Wed, Jan 18, 2023 at 12:06:01PM -0800, Paul E. McKenney wrote:
> > On Wed, Jan 18, 2023 at 11:50:24AM -0500, Alan Stern wrote:
> > Boqun mentioned off-list this morning that this is still the case,
> > and that each execution of srcu_read_lock() will return a unique value.
> > Assuming that I understood him correctly, anyway.
>
> That will no longer be true with the patch I posted yesterday. Every
> execution of srcu_read_lock() will return 0 (or whatever the initial
> value of the lock variable is).
>
> But with a small change to the .def file, each execution of
> srcu_read_unlock() can be made to increment the lock's value, and then
> the next srcu_read_lock() would naturally return the new value.

Different values might be good for debugging, but I am not sure that this
is worth the weight. Happy to go with whatever you decide on this one.

> > > > given that I have no idea how one would go about modeling down_read()
> > > > and up_read() in LKMM.
> > >
> > > It might make sense to work on that first, before trying to do
> > > srcu_down_read() and srcu_up_read().
> >
> > The thing is that it is easy to associate an srcu_down_read() with the
> > corresponding srcu_up_read(). With down() and up(), although in the
> > Linux kernel this might be represented by a data structure tracking
> > (say) an I/O request, LKMM is going to be hard pressed to figure that out.
>
> It would help (or at least, it would help _me_) if you gave a short
> explanation of how srcu_down_read() and srcu_up_read() are meant to
> work. With regular r/w semaphores, the initial lock value is 0, each
> down() operation decrements the value, each up() operation increments
> the value -- or vice versa if you don't like negative values -- and a
> write_lock() will wait until the value is >= 0. In that setting, it
> makes sense to say that a down() which changes the value from n to n-1
> matches the next up() which changes the value from n-1 to n.
>
> I presume that srcu semaphores do not work this way. Particularly since
> the down() operation returns a value which must be passed to the
> corresponding up() operation. So how _do_ they work?

There are pairs of per-CPU counters. One pair (->srcu_lock_count[])
counts the number of srcu_down_read() operations that took place on
that CPU and another pair (->srcu_unlock_count[]) counts the number
of srcu_up_read() operations that took place on that CPU. There is
an ->srcu_idx that selects which of the ->srcu_lock_count[] elements
should be incremented by srcu_down_read(). Of course, srcu_down_read()
returns the value of ->srcu_idx that it used so that the matching
srcu_up_read() will use that same index when incrementing its CPU's
->srcu_unlock_count[].

Grace periods go something like this:

1. Sum up the ->srcu_unlock_count[!ssp->srcu_idx] counters.

2. smp_mb().

3. Sum up the ->srcu_lock_count[!ssp->srcu_idx] counters.

4. If the sums are not equal, retry from #1.

5. smp_mb().

6. WRITE_ONCE(ssp->srcu_idx, !ssp->srcu_idx);

7. smp_mb().

8. Same loop as #1-4.

So similar to r/w semaphores, but with two separate distributed counts.
This means that the number of readers need not go to zero at any given
point in time, consistent with the need to wait only on old readers.
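The counter scheme can be sketched as a toy, single-threaded Python model. This is an illustration only, not the kernel implementation: the class and method names are invented here, and all memory barriers are elided because the model runs in one thread.

```python
# Toy model of SRCU's per-CPU pairs of lock/unlock counters selected by
# ->srcu_idx, with the grace period comparing summed unlocks to locks.
class ToySRCU:
    def __init__(self, ncpus=2):
        self.srcu_idx = 0
        self.lock_count = [[0, 0] for _ in range(ncpus)]    # per-CPU pair
        self.unlock_count = [[0, 0] for _ in range(ncpus)]  # per-CPU pair

    def down_read(self, cpu):
        idx = self.srcu_idx
        self.lock_count[cpu][idx] += 1
        return idx                  # caller must pass this to up_read()

    def up_read(self, cpu, idx):
        # May run on a different CPU than the matching down_read().
        self.unlock_count[cpu][idx] += 1

    def readers_done(self, idx):
        # The grace period's check: summed unlocks equal summed locks,
        # so no reader still holds this index.
        locks = sum(pair[idx] for pair in self.lock_count)
        unlocks = sum(pair[idx] for pair in self.unlock_count)
        return unlocks == locks

    def flip(self):
        self.srcu_idx = 1 - self.srcu_idx

ss = ToySRCU()
i1 = ss.down_read(cpu=0)        # reader enters on CPU 0
ss.flip()                       # grace period flips ->srcu_idx
i2 = ss.down_read(cpu=1)        # new reader uses the new index
assert (i1, i2) == (0, 1)
assert not ss.readers_done(i1)  # old-index reader still active
ss.up_read(cpu=1, idx=i1)       # exits on CPU 1: no per-CPU zeroing needed
assert ss.readers_done(i1)      # old-index readers have drained
```

Note how the final pair of assertions mirrors the point above: the per-CPU counts never balance individually, only their sums do.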

> > > Hmmm. What happens if you write:
> > >
> > > r1 = srcu_down_read(x);
> > > r2 = srcu_down_read(x);
> > > srcu_up_read(x, r1);
> > > srcu_up_read(x, r2);
> > >
> > > ? I can't even tell what that would be _intended_ to do.
> >
> > Let's take it one line at a time:
> >
> > r1 = srcu_down_read(x);
> > // A
> > r2 = srcu_down_read(x);
> > // B
> > srcu_up_read(x, r1);
> > // C
> > srcu_up_read(x, r2);
> > // D
> >
> > An SRCU grace period that starts at A is permitted to complete at
> > C, difficult though it might be to actually make this happen in the
> > Linux kernel. It need wait only for pre-existing critical sections.
>
> So the down() returning r1 matches the up() receiving r1?

Yes.

> > But an SRCU grace period that starts at either B or C must wait for both
> > critical sections, that is until D.
>
> Implying that the down() returning r2 matches up() receiving r2?

Again, yes.

> And in general, an up() matches a down() iff they have the same values?

I would instead say that the match is determined by the fact that a
given srcu_up_read() receives what was returned from the corresponding
srcu_down_read(). So not a comparison of values, but rather something
like (data | rf)*.

> And we can imagine that every down() returns a different value?

For purposes of herd7 emulation before your rework, yes. In real life,
and as an implementation detail, a given srcu_down_read() only ever
returns either 0 or 1. And from what I can see playing around, the fact
that any given srcu_down_read() only ever returns zero is not a problem,
the (data | rf)* still works fine. Thus far, anyway. ;-)

> How does this differ from srcu_read_lock() and srcu_read_unlock()? And
> how do the "up" and "down" parts figure into it? -- what is going up or
> down?

Functionally and from a performance/scalability viewpoint, they
are identical to srcu_read_lock() and srcu_read_unlock(). The only
difference is that srcu_down_read() and srcu_up_read() lack the lockdep
machinery that complains when a matching pair of srcu_read_lock() and
srcu_read_unlock() are used from different tasks.

Within the implementation, nothing ever goes down, it is all
this_cpu_inc(). The "down" and "up" are by analogy to down() and up(),
where "down()" says acquire some rights to a resource and "up()" says
release those rights.

Wait, I can make "down" work.

A call to srcu_down_read() reduces the quantity computed by summing the
unlocks then subtracting the sum of the locks. A call to srcu_up_read()
increases that same quantity. ;-)

Thanx, Paul

2023-01-19 00:58:51

by Paul E. McKenney

Subject: Re: Internal vs. external barriers (was: Re: Interesting LKMM litmus test)

On Wed, Jan 18, 2023 at 10:24:50PM +0100, Jonas Oberhauser wrote:
>
>
> On 1/18/2023 10:12 PM, Paul E. McKenney wrote:
>
> > The only difference between srcu_read_lock() and srcu_read_unlock()
> > on the one hand and srcu_down_read() and srcu_up_read() on the other
> > is that a matching pair of srcu_read_lock() and srcu_read_unlock()
> > must be running on the same task. In contrast, srcu_down_read() and
> > srcu_up_read() are not subject to this constraint.
> >
> > > What I was suggesting below is how to redefine "match" between read_down and
> > > read_up that work more like a cross-thread semaphore.
> > Understood, but what I don't understand is why not simply this:
> >
> > let srcu-rscs-down = ([Srcu-down] ; (data | rf)* ; [Srcu-up]) & loc
>
> Oh, I had thought that it should be more like a semaphore rather than just a
> cross-cpu mutex.
>
> Here's an example of how what you are describing would be used:
>
> P0{
>     idx = srcu_down(&ss);
>     store_release(done,1);
> }
>
> P1{
>     while (!load_acquire(done));
>     srcu_up(&ss,idx)
> }

Exactly!!!

> What I was thinking of is more something like this:
>
> P0{
>     idx1 = srcu_down(&ss);
>     srcu_up(&ss,idx1);
> }
>
> P1{
>     idx2 = srcu_down(&ss);
>     srcu_up(&ss,idx2)
> }

And srcu_read_lock() and srcu_read_unlock() already do this.

> where the big difference to srcu_lock/unlock would be that if P0 and P1
> happened to get the same index -- which you could very well check or
> synchronize on -- that you would be guaranteed that the grace period only
> ends once *all* threads that are using this index have called up. (note that
> I believe that your implementation has this property, and some users may
> come to rely on it if they find out!)

They are permitted and encouraged to rely on the fact that
synchronize_srcu() waits until all pre-existing SRCU read-side critical
sections have completed, which I believe is quite close to what you
are saying. But if they want to look at the return values from either
srcu_read_lock() or srcu_down_read(), they would be better off using
either get_state_synchronize_srcu() or start_poll_synchronize_srcu().

Huh. I need to add a NUM_ACTIVE_SRCU_POLL_OLDSTATE, don't I? I first
need to figure out what its value would be.

> If you want this latter kind of guarantee, then you need to do so something
> along the lines of what Alan or I wrote.
>
> If all you need is the ability to use the first scenario, without any
> guarantee that if the index happened to be the same (or providing an API
> where you can do the down with a fixed index provided by P0) the grace
> period will extend, then what you propose should be right.
>
> But from Alan's comments I had misunderstood that that wouldn't be the case.

"What do you need?" "Well, what can be provided?" ;-)

Thanx, Paul

2023-01-19 03:02:06

by Alan Stern

Subject: Re: Internal vs. external barriers (was: Re: Interesting LKMM litmus test)

Jonas, each of your emails introduces too many new thoughts and ideas!
I can't keep up. So in this reply I'm going to skip over most of what
you wrote. If you think any of the items I have elided are worth
pursuing, you can bring them up in a new thread -- hopefully with just
one main thought per email! :-)

On Wed, Jan 18, 2023 at 12:25:05PM +0100, Jonas Oberhauser wrote:
>
>
> On 1/17/2023 10:19 PM, Alan Stern wrote:
> > On Tue, Jan 17, 2023 at 06:48:12PM +0100, Jonas Oberhauser wrote:
> > > On 1/14/2023 5:42 PM, Alan Stern wrote:

> > > Pretending for simplicity that rscs and grace periods aren't reads & writes
> > They aren't. You don't have to pretend.
>
> rscs are reads & writes in herd. That's how the data dependency works in your
> patch, right?

No, you're mixing up RCU and SRCU. The RCU operations rcu_read_lock()
and rcu_read_unlock() are not loads or stores; they're just fences. In
the current form of the LKMM the same is true for the SRCU operations
srcu_read_lock() and srcu_read_unlock(), but in the patch I submitted
they are indeed loads and stores.

> I consider that a hack though and don't like it.

It _is_ a bit of a hack, but not a huge one. srcu_read_lock() really
is a lot like a load, in that it returns a value obtained by reading
something from memory (along with some other operations, though, so it
isn't a simple straightforward read -- perhaps more like an
atomic_inc_return_relaxed).

srcu_read_unlock() is somewhat less like a store, but it does have one
notable similarity: It takes an input value and therefore can be the
target of a data dependency. The biggest difference is that an
srcu_read_unlock() can't really be read-from. It would be nice if herd
had an event type that behaved this way.

Also, herd doesn't have any way of saying that the value passed to a
store is an unmodified copy of the value obtained by a load. In our
case that doesn't matter much -- nobody should be writing litmus tests
in which the value returned by srcu_read_lock() is incremented and then
decremented again before being passed to srcu_read_unlock()!

> > > > There was also something about what should happen when you have two
> > > > grace periods in a row.
> > > Note that two grace periods in a row are a subset of po;rcu-gp;po and thus
> > > gp, and so there's nothing to be done.
> > That is not stated carefully, but it probably is wrong. Consider this:
> >
> > P0 P1 P2
> > --------------- -------------- -----------------
> > rcu_read_lock Wy=1 rcu_read_lock
> > Wx=1 synchronize_rcu Wz=1
> > Ry=0 synchronize_rcu Rx=0
> > rcu_read_unlock Rz=0 rcu_read_unlock
> >
> > (W stands for Write and R for Read.) This execution is forbidden by the
> > counting rule: Its cycle has two grace periods and two critical
> > sections. But if we changed the definition of gp to be
> >
> > let gp = po ; [Sync-rcu | Sync-srcu] ; po
> >
> > then the memory model would allow the execution. So having the po? at
> > the end of gp is vital.
>
> I hadn't thought yet about the effect of modifying the definition of gp, but
> I don't think this example relies on gp at all.
> The model would forbid this even if rcu-fence and gp were both changed from
> po? to po.
> From Rz=0 we know
>     second sync() ->rcu-gp;po Rz ->prop Wz ->po P2 unlock() ->rcu-rscsi P2
> lock()
> From Ry=0 we know
>     P1 unlock() ->rcu-rscsi P1 lock() ->po Ry ->prop Wy ->po;rcu-gp first
> sync()
>
> which are both rcu-order.
> Then from Rx=0 we have
>     Rx ->prop Wx ->po P1 unlock() ->rcu-order? first sync() ->po second sync()
> ->rcu-order P2 lock() ->po Rx
> of course since po is one case of rcu-link, we get
>     Rx ->prop Wx ->po P1 unlock() ->rcu-order P2 lock() ->po Rx
> and hence
>     Rx ->prop Wx ->rcu-fence Rx
> which is supposed to be irreflexive (even with rcu-fence=po;rcu-order;po).

By golly, you're right! I'm still thinking in terms of an older
version of the memory model, which used gp in place of rcu-gp. In
that version, P1's write and read would be linked by gp but not by
(gp ; rcu-link ; gp) if the po? at the end of the definition of gp
was replaced by po.

> Note that if your ordering relies on actually using gp twice in a row, then
> these must come from strong-fence, but you should be able to just take the
> shortcut by merging them into a single gp.
>     po;rcu-gp;po;rcu-gp;po <= gp <= strong-fence <= hb & strong-order

I don't know what you mean by this. The example above does rely on
having two synchronize_rcu() calls; with only one it would be allowed.


> > > I don't think rcu-order is necessary at all to define LKMM, and one can
> > > probably just use rcu-extend instead of rcu-order (and in fact even a
> > > version of rcu-extend without any lone rcu-gps).
> > Sure, you could do that, but it wouldn't make sense. Why would anyone
> > want to define an RCU ordering relation that includes
> >
> > gp ... rscs ... gp ... rscs
> >
> > but not
> >
> > gp ... rscs ... rscs ... gp
> >
> > ?
>
> Because the the RCU Grace Period Guarantee doesn't say "if a gp happens
> before a gp, with some rscs in between, ...".
> So I think even the picture is not the best picture to draw for RCU
> ordering. I think the right picture to draw for RCU ordering is something
> like this:
> case (1): C ends before G does:
>
> rscs ... ... ... ... ... gp
>
> case (2): G ends before C does:
>
> gp ... ... ... ... ... rscs
>
> where the dots are some relation that means "happens before".

Okay. So we could define rcu-order by:

let rec rcu-order = (rcu-gp ; rcu-link ; (rcu-order ; rcu-link)* ; rcu-rscsi) |
(rcu-rscsi ; rcu-link ; (rcu-order ; rcu-link)* ; rcu-gp)

(ignoring the SRCU cases). That is a little awkward; it might make
sense to factor out (rcu-link ; (rcu-order ; rcu-link)*) as a separate
relation and do a simultaneous recursion on both relations.

But either way, rcu-fence would have to be defined as (po ; rcu-order+ ; po?),
which looks a little odd.

Alan

2023-01-19 03:44:05

by Alan Stern

Subject: Re: Internal vs. external barriers (was: Re: Interesting LKMM litmus test)

On Wed, Jan 18, 2023 at 04:02:14PM -0800, Paul E. McKenney wrote:
> On Wed, Jan 18, 2023 at 03:54:47PM -0500, Alan Stern wrote:
> > How does this differ from srcu_read_lock() and srcu_read_unlock()? And
> > how do the "up" and "down" parts figure into it? -- what is going up or
> > down?
>
> Functionally and from a performance/scalability viewpoint, they
> are identical to srcu_read_lock() and srcu_read_unlock(). The only
> difference is that srcu_down_read() and srcu_up_read() lack the lockdep
> machinery that complains when a matching pair of srcu_read_lock() and
> srcu_read_unlock() are used from different tasks.

This makes me wonder if there's any need for srcu_down_read and
srcu_up_read at all. Why not just use srcu_read_lock and
srcu_read_unlock, and remove the lockdep check?

> Within the implementation, nothing ever goes down, it is all
> this_cpu_inc(). The "down" and "up" are by analogy to down() and up(),
> where "down()" says acquire some rights to a resource and "up()" says
> release those rights.

Another reason not to use those names. If you insist on making these
operations distinct from srcu_read_lock and srcu_read_unlock, why not
borrow the "_get" and "_put" nomenclature used by the device core? I
suspect more people would associate them with acquiring and releasing
rights to a resource. (Although in this case it might not be so clear
exactly what that resource is.)

> Wait, I can make "down" work.
>
> A call to srcu_down_read() reduces the quantity computed by summing the
> unlocks then subtracting the sum of the locks. A call to srcu_up_read()
> increases that same quantity. ;-)

I can't honestly call that a resoundingly convincing argument. :-)

Alan

2023-01-19 11:27:42

by Paul E. McKenney

[permalink] [raw]
Subject: Re: Internal vs. external barriers (was: Re: Interesting LKMM litmus test)

On Wed, Jan 18, 2023 at 09:19:07PM -0500, Alan Stern wrote:
> On Wed, Jan 18, 2023 at 04:02:14PM -0800, Paul E. McKenney wrote:
> > On Wed, Jan 18, 2023 at 03:54:47PM -0500, Alan Stern wrote:
> > > How does this differ from srcu_read_lock() and srcu_read_unlock()? And
> > > how do the "up" and "down" parts figure into it? -- what is going up or
> > > down?
> >
> > Functionally and from a performance/scalability viewpoint, they
> > are identical to srcu_read_lock() and srcu_read_unlock(). The only
> > difference is that srcu_down_read() and srcu_up_read() lack the lockdep
> > machinery that complains when a matching pair of srcu_read_lock() and
> > srcu_read_unlock() are used from different tasks.
>
> This makes me wonder if there's any need for srcu_down_read and
> srcu_up_read at all. Why not just use srcu_read_lock and
> srcu_read_unlock, and remove the lockdep check?

Because the lockdep check is quite helpful in finding bugs in the
common case.

> > Within the implementation, nothing ever goes down, it is all
> > this_cpu_inc(). The "down" and "up" are by analogy to down() and up(),
> > where "down()" says acquire some rights to a resource and "up()" says
> > release those rights.
>
> Another reason not to use those names. If you insist on making these
> operations distinct from srcu_read_lock and srcu_read_unlock, why not
> borrow the "_get" and "_put" nomenclature used by the device core? I
> suspect more people would associate them with acquiring and releasing
> rights to a resource. (Although in this case it might not be so clear
> exactly what that resource is.)
>
> > Wait, I can make "down" work.
> >
> > A call to srcu_down_read() reduces the quantity computed by summing the
> > unlocks then subtracting the sum of the locks. A call to srcu_up_read()
> > increases that same quantity. ;-)
>
> I can't honestly call that a resoundingly convincing argument. :-)

It is exactly the same argument for the name of down() and up(). ;-)

And the analogy between mutex_lock() and down() on the one hand and
srcu_read_lock() and srcu_down_read() on the other should be helpful as well.

Thanx, Paul

2023-01-19 11:57:16

by Jonas Oberhauser

[permalink] [raw]
Subject: Re: Internal vs. external barriers (was: Re: Interesting LKMM litmus test)



On 1/19/2023 3:28 AM, Alan Stern wrote:
> > This is a permanent error; I've given up. Sorry it didn't
> work out.

[It seems the e-mail still reached me through the mailing list]

> hopefully with just one main thought per email! :-)

Honestly, I wish I could.  I'm already trying hard but sometimes things
are interconnected, and my brain isn't good at withholding information
-- including incorrect and incomprehensible information :D

>
> On Wed, Jan 18, 2023 at 12:25:05PM +0100, Jonas Oberhauser wrote:
>>
>> On 1/17/2023 10:19 PM, Alan Stern wrote:
>>> On Tue, Jan 17, 2023 at 06:48:12PM +0100, Jonas Oberhauser wrote:
>>>> Pretending for simplicity that rscs and grace periods aren't reads&writes
>>> They aren't. You don't have to pretend.
>> rscs are reads& writes in herd. That's how the data dependency works in your
>> patch, right?
> No, you're mixing up RCU and SRCU.
Yes, I meant for the srcu : ) Any argument I'm trying to make just for
rcu right now will need to still work for srcu later.

>> I consider that a hack though and don't like it.
> It _is_ a bit of a hack, but not a huge one. srcu_read_lock() really
> is a lot like a load, in that it returns a value obtained by reading
> something from memory (along with some other operations, though, so it
> isn't a simple straightforward read -- perhaps more like an
> atomic_inc_return_relaxed).
The issue I have with this is that it might create accidental ordering.
How does it behave when you throw fences in the mix?
It really does not work like an increment at all; I think
srcu_read_lock() only reads the currently active index, but the index is
changed by srcu_sync. But even that is an implementation detail of
sorts. I think the best way to think of it would be for srcu_read_lock
to just return an arbitrary value.
The user can not rely on any kind of "accidental" rfe edges between
these events for ordering.
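
As a toy illustration of what I mean by "reads the currently active
index" (this is my own sketch, not the kernel's SRCU implementation, and
the names are made up): the cookie is nothing but the index that happened
to be active, and a synchronize flips it, so its value carries no
information a litmus test may legitimately depend on.

```python
# Toy model of the flip-based reader side; NOT the real kernel code.
class ToySrcu:
    def __init__(self):
        self.idx = 0              # currently active index
        self.nreaders = [0, 0]    # per-index reader counts

    def read_lock(self):
        i = self.idx              # read the currently active index
        self.nreaders[i] += 1
        return i                  # opaque cookie; only useful for unlock

    def read_unlock(self, cookie):
        self.nreaders[cookie] -= 1   # cookie picks the counter to drop

    def synchronize(self):
        old, self.idx = self.idx, 1 - self.idx   # flip the active index
        # a real implementation now waits for nreaders[old] to drain
        return old

ss = ToySrcu()
c1 = ss.read_lock()     # reader before the flip gets one cookie value
ss.synchronize()        # flips the active index
c2 = ss.read_lock()     # a later reader gets a different cookie
ss.read_unlock(c1)
ss.read_unlock(c2)
```

The point of the sketch: the only sound use of the cookie is to hand it
back to read_unlock, so no ordering should ever be derived from its value.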

Perhaps if you flag any use of these values in address or control
dependencies, as well as any event which depends on more than one of
these values, you could prove that it's impossible to constrain the
behavior through these rfe (and/or co) edges because you can anyway
never inspect the value returned by the operation (except to pass it
into srcu_unlock).

Or you might be able to explicitly eliminate the events everywhere, just
like you have done for carry-dep in your patch.

But it looks so brittle.



> srcu_read_unlock() is somewhat less like a store, but it does have one
> notable similarity: It takes an input value and therefore can be the
> target of a data dependency. The biggest difference is that an
> srcu_read_unlock() can't really be read-from. It would be nice if herd
> had an event type that behaved this way.
Or if you could declare your own : )
Obviously, you will have accidental rf edges going from
srcu_read_unlock() to srcu_read_lock() if you model them this way.

> Also, herd doesn't have any way of saying that the value passed to a
> store is an unmodified copy of the value obtained by a load. In our
> case that doesn't matter much -- nobody should be writing litmus tests
> in which the value returned by srcu_read_lock() is incremented and then
decremented again before being passed to srcu_read_unlock()!

It would be nice if herd allowed declaring structs that can be used for
such purposes.
(anyways, I am not sure if Luc is still following everything in this
deeply nested thread that started somewhere completely different. But
maybe if Paul or you open a feature request, let me know so that I can
give my 2ct)

> By golly, you're right! I'm still thinking in terms of an older
> version of the memory model, which used gp in place of rcu-gp.

I'm glad I'm not the only person to mix these two up xP

>> Note that if your ordering relies on actually using gp twice in a row, then
>> these must come from strong-fence, but you should be able to just take the
>> shortcut by merging them into a single gp.
>>   po;rcu-gp;po;rcu-gp;po <= gp <= strong-fence <= hb & strong-order
> I don't know what you mean by this. The example above does rely on
> having two synchronize_rcu() calls; with only one it would be allowed.

I mean that if you have a cycle that is formed by having two adjacent
actual `gp` edges, like .... ; gp;gp ; .... with gp= po ; rcu-gp ; po?,
(not like your example, where the cycle uses two *rcu*-gp but no gp
edges) and assume we define gp' = po ; rcu-gp ; po and hb' and pb' to
use gp' instead of gp,
then there are two cases for how that cycle came to be, either 1) as
 ... ; hb;hb ; ....
but then you can refactor as
 ... ; po;rcu-gp;po;rcu-gp;po ; ...
 ... ; po;rcu-gp;     po      ; ...
 ... ;          gp'           ; ...
 ... ;          hb'           ; ...
which again creates a cycle, or 2) as
  ... ; pb ; hb ; ...
coming from
  ... ; prop ; gp ; gp ; ....
which you can similarly refactor as
  ... ; prop ; po;rcu-gp;po ; ....
  ... ; prop ;     gp'      ; ....
and again get a cycle with
... ; pb' ; ....
Therefore, gp = po;rcu-gp;po should be equivalent.

>>>> I don't think rcu-order is necessary at all to define LKMM, and one can
>>>> probably just use rcu-extend instead of rcu-order (and in fact even a
>>>> version of rcu-extend without any lone rcu-gps).
>>> Sure, you could do that, but it wouldn't make sense. Why would anyone
>>> want to define an RCU ordering relation that includes
>>>
>>> gp ... rscs ... gp ... rscs
>>>
>>> but not
>>>
>>> gp ... rscs ... rscs ... gp
>>>
>>> ?
>> Because the RCU Grace Period Guarantee doesn't say "if a gp happens
>> before a gp, with some rscs in between, ...".
>> So I think even the picture is not the best picture to draw for RCU
>> ordering. I think the right picture to draw for RCU ordering is something
>> like this:
>>    case (1): C ends before G does:
>>
>> rscs ... ... ... ... ... gp
>>
>> case (2): G ends before C does:
>>
>> gp ... ... ... ... ... rscs
>>
>> where the dots are some relation that means "happens before".
> Okay. So we could define rcu-order by:
>
> let rec rcu-order = (rcu-gp ; rcu-link ; (rcu-order ; rcu-link)* ; rcu-rscsi) |
> (rcu-rscsi ; rcu-link ; (rcu-order ; rcu-link)* ; rcu-gp)
>
> (ignoring the SRCU cases). That is a little awkward; it might make
> sense to factor out (rcu-link ; (rcu-order ; rcu-link)*) as a separate
> relation and do a simultaneous recursion on both relations.

Exactly!

> But either way, rcu-fence would have to be defined as (po ; rcu-order+ ; po?),
> which looks a little odd.

Almost (assuming the rb definition is changed to be like the pb one*);
it would be

rcu-fence = (po ; (rcu-order ; po)+).

Alternatively, I believe (but haven't fully confirmed) it would also work to define

rcu-fence = po ; rcu-order ; po?


This is why I am wondering whether
order = po ; {inducing operation} ; po?
is ok in general or not.


Have fun,
jonas

(*= otherwise you'd need to include the rcu-link in here as well)

2023-01-19 14:29:20

by Jonas Oberhauser

[permalink] [raw]
Subject: Re: Internal vs. external barriers (was: Re: Interesting LKMM litmus test)



On 1/19/2023 1:11 AM, Paul E. McKenney wrote:
> On Wed, Jan 18, 2023 at 10:24:50PM +0100, Jonas Oberhauser wrote:
>> What I was thinking of is more something like this:
>>
>> P0{
>>    idx1 = srcu_down(&ss);
>>    srcu_up(&ss, idx1);
>> }
>>
>> P1{
>>    idx2 = srcu_down(&ss);
>>    srcu_up(&ss, idx2);
>> }
> And srcu_read_lock() and srcu_read_unlock() already do this.

I think I left out too much from my example.
And filling in the details led me down a bit of a rabbit hole of
confusion for a while.
But here's what I ended up with:


P0{
    idx1 = srcu_down(&ss);
    store_rel(p1, true);

    shared cs

    R x == ?

    while (! load_acq(p2));
    R idx2 == idx1 // for some reason, we got lucky!
    srcu_up(&ss, idx1);
}

P1{
    idx2 = srcu_down(&ss);
    store_rel(p2, true);

    shared cs

    R y == 0

    while (! load_acq(p1));
    srcu_up(&ss, idx2);
}

P2 {
    W y = 1
    srcu_sync(&ss);
    W x = 1
}

Assuming that like indicated above both threads happen to read the same
index, are you guaranteed that the shared cs lasts until both P0 and P1
have performed their final up?
Is it allowed for P0 to read x==1?

If you define matching up&down as you do through the data link, then we
get something like

P1's down ->po;prop;po grace period
thus
P1's up ->rcu-order grace period
P0's down ->po;hb;po P1's up ->rcu-order grace period
P0's up ->srcu-rscsi;rcu-link;rcu-order grace period
Unfortunately, this is not enough to rcu-order P0's up with the grace
period -- you'd need a second rcu-gp for that!

Looking at it from the other side, because P0 reads x==1, we have
grace period ->po;rfe;po P0's up
and thus
grace period ->rcu-order P0's down ->po;hb;po P1's up
but again this would not order the grace period with P1's up, because
you'd need a second grace period.

When sketching it out on paper, I couldn't find any forbidden cycles,
and so x==1 would be allowed. (But as I said, I managed to confuse
myself a few times with this, so if you find a forbidden cycle let me know.)

But note that the synchronization in there and the fact that both have
the same index ensure that the two critical sections overlap; in a
hypothetical order it would be
  down() -> down() -> up() -> up()
(with any permutation of P0 and P1 over these events so that they each
get 1 up() and 1 down()), and thus the grace period must actually end
after both, or start before both.

With the definition I offered, you would get
P0's up() ->srcu-rscsi P1's down()
and
P1's up() ->srcu-rscsi P0's down()
and in particular

Rx1 ->po P0's up() ->srcu-rscsi P1's down() ->po Ry0 ->prop Wy1 ->po
srcu-gp on the same loc ->po Wx1 ->rfe Rx1
which can be collapsed to
Rx1 ->po;rcu-order;po;hb Rx1, which isn't irreflexive.

Thus x==1 would be forbidden.

This is more semaphore-like, where the same cookie shared between
threads implies that it's the same semaphore, and the overlapping
guarantee (through synchronizing on p1,p2 in the beginning) means that
the critical sections overlap.

In contrast, I wouldn't suggest the same for srcu_lock and srcu_unlock,
where even though you may get the same cookie by accident, those might
still be two completely independent critical sections.
For example, you could imagine a single percpu counter _cnt (per index
of course) that is incremented and decremented for lock() and unlock(),
and the condition to pass an srcu_sync() of a given index is that the
cpu[...]._cnt[idx] are all individually 0 and the sum of all ups[idx] is
equal to the sum of all downs[idx].

If you create an operational model of up() and down() in terms of such a
per-index semaphore, I think the x==1 case would similarly need to be
forbidden. Since the grace period must end after P1's critical section,
and P0's and P1's critical sections overlap and use the same semaphore,
the count is never 0 at any point in time at which either P0 or P1 is in
its critical section, and so the grace period must also end after P0's
critical section. But then x=1 can't yet have propagated to P0 when it
reads x inside its critical section.

In contrast, if the operational model of lock() and unlock() is a
per-index and per-cpu count, then the x==1 case would be allowed, e.g.,
as follows (time from left to right, all processes happen in parallel):

P0:                       <                  Rx1 >
P1: <    Ry0                 >
P2:          y=1  < P0!          P1! >  x=1

here < and > mark the start and end of cs and gp, and Pn! is the time
the gp realizes that Pn was not in a cs.
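
To make the contrast concrete, here is a small sketch of the
semaphore-style counting I have in mind for down()/up() (purely my own
illustration, with invented names, not any real implementation): one
shared counter per index, so the count only reaches zero once *all*
overlapping holders of that index are gone.

```python
# Toy semaphore-style model for srcu_down()/srcu_up(); NOT kernel code.
# One shared counter per index: the grace period for an index may only
# end while that index's count is zero.
class SemaphoreSrcu:
    def __init__(self):
        self.count = [0, 0]          # one shared counter per index

    def down(self, idx):
        self.count[idx] += 1         # any thread may enter on this index

    def up(self, idx):
        self.count[idx] -= 1         # ...and any thread may leave it

    def gp_may_end(self, idx):
        return self.count[idx] == 0

s = SemaphoreSrcu()
s.down(0)                       # P0 enters, index 0
s.down(0)                       # P1 enters, same index -- overlapping cs
s.up(0)                         # P0 leaves
blocked = not s.gp_may_end(0)   # P1 still holds index 0
s.up(0)                         # P1 leaves; only now can the gp end
```

Under this model the grace period cannot end between the two up() calls,
which is why x==1 would be forbidden; per-cpu counting for
lock()/unlock() gives no such cross-thread guarantee.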

Best wishes,

jonas

2023-01-19 16:52:17

by Alan Stern

[permalink] [raw]
Subject: Re: Internal vs. external barriers (was: Re: Interesting LKMM litmus test)

On Thu, Jan 19, 2023 at 12:22:50PM +0100, Jonas Oberhauser wrote:
>
>
> On 1/19/2023 3:28 AM, Alan Stern wrote:
> > > This is a permanent error; I've given up. Sorry it didn't
> > work out.
>
> [It seems the e-mail still reached me through the mailing list]

[For everyone else, Jonas is referring to the fact that the last two
emails I sent to his huaweicloud.com address could not be delivered, so
I copied them off-list to his huawei.com address.]

> > > I consider that a hack though and don't like it.
> > It _is_ a bit of a hack, but not a huge one. srcu_read_lock() really
> > is a lot like a load, in that it returns a value obtained by reading
> > something from memory (along with some other operations, though, so it
> > isn't a simple straightforward read -- perhaps more like an
> > atomic_inc_return_relaxed).
> The issue I have with this is that it might create accidental ordering. How
> does it behave when you throw fences in the mix?

I think this isn't going to be a problem. Certainly any real
implementation of srcu_read_lock() is going to involve some actual load
operations, so any unintentional ordering caused by fences will also
apply to real executions. Likewise for srcu_read_unlock and store
operations.

> It really does not work like an increment at all, I think srcu_read_lock()
> only reads the currently active index, but the index is changed by
> srcu_sync. But even that is an implementation detail of sorts. I think the
> best way to think of it would be for srcu_read_lock to just return an
> arbitrary value.

I think I'll stick to it always returning the initial value. Paul said
that would be okay.

> The user can not rely on any kind of "accidental" rfe edges between these
> events for ordering.
>
> Perhaps if you flag any use of these values in address or control
> dependencies, as well as any event which depends on more than one of these
> values, you could prove that it's impossible to constrain the behavior
> through these rfe(and/or co) edges because you can anyways never inspect the
> value returned by the operation (except to pass it into srcu_unlock).
>
> Or you might be able to explicitly eliminate the events everywhere, just
> like you have done for carry-dep in your patch.

On second thought, I'll make it impossible to read from the
srcu_read_unlock events by removing them from the rf (and rfi/rfe)
relation. Then it won't be necessary to change carry-dep or anything
else.

> But it looks so brittle.

Maybe not so much after this change?

> > srcu_read_unlock() is somewhat less like a store, but it does have one
> > notable similarity: It takes an input value and therefore can be the
> > target of a data dependency. The biggest difference is that an
> > srcu_read_unlock() can't really be read-from. It would be nice if herd
> > had an event type that behaved this way.
> Or if you could declare your own : )
> Obviously, you will have accidental rf edges going from srcu_read_unlock()
> to srcu_read_lock() if you model them this way.

Not when those edges are erased.

> > Also, herd doesn't have any way of saying that the value passed to a
> > store is an unmodified copy of the value obtained by a load. In our
> > case that doesn't matter much -- nobody should be writing litmus tests
> > in which the value returned by srcu_read_lock() is incremented and then
> > decremented again before being passed to srcu_read_unlock()!
>
> It would be nice if herd allowed declaring structs that can be used for such
> purposes.
> (anyways, I am not sure if Luc is still following everything in this deeply
> nested thread that started somewhere completely different. But maybe if Paul
> or you open a feature request, let me know so that I can give my 2ct)

I thought you were against adding into herd features that were specific
to the Linux kernel?

> > > Note that if your ordering relies on actually using gp twice in a row, then
> > > these must come from strong-fence, but you should be able to just take the
> > > shortcut by merging them into a single gp.
> > >   po;rcu-gp;po;rcu-gp;po <= gp <= strong-fence <= hb & strong-order
> > I don't know what you mean by this. The example above does rely on
> > having two synchronize_rcu() calls; with only one it would be allowed.
>
> I mean that if you have a cycle that is formed by having two adjacent actual
> `gp` edges, like .... ; gp;gp ; .... with gp= po ; rcu-gp ; po?,
> (not like your example, where the cycle uses two *rcu*-gp but no gp edges)

Don't forget that I had in mind a version of the model where rcu-gp did
not exist.

> and assume we define gp' = po ; rcu-gp ; po and hb' and pb' to use gp'
> instead of gp,
> then there are two cases for how that cycle came to be, either 1) as
>  ... ; hb;hb ; ....
> but then you can refactor as
>  ... ; po;rcu-gp;po;rcu-gp;po ; ...
>  ... ; po;rcu-gp;     po      ; ...
>  ... ;          gp'           ; ...
>  ... ;          hb'           ; ...
> which again creates a cycle, or 2) as
>   ... ; pb ; hb ; ...
> coming from
>   ... ; prop ; gp ; gp ; ....
> which you can similarly refactor as
>   ... ; prop ; po;rcu-gp;po ; ....
>   ... ; prop ;     gp'      ; ....
> and again get a cycle with
> ... ; pb' ; ....
> Therefore, gp = po;rcu-gp;po should be equivalent.

The point is that in P1, we have Write ->(gp;gp) Read, but we do not
have Write ->(gp';gp') Read. Only Write ->gp' Read. So if you're using
gp' instead of gp, you'll analyze the litmus test as if it had only one
grace period but two critical sections, getting a wrong answer.


Here's a totally different way of thinking about these things, which may
prove enlightening. These thoughts originally occurred to me years ago,
and I had forgotten about them until last night.

If G is a grace period, let's write t1(G) for the time when G starts and
t2(G) for the time when G ends.

Likewise, if C is a read-side critical section, let's write t2(C) for
the time when C starts (or the lock executes if you prefer) and t1(C)
for the time when C ends (or the unlock executes). This terminology
reflects the "backward" role that critical sections play in the memory
model.

Now we can characterize rcu-order and rcu-link in operational terms.
Let A and B each be either a grace period or a read-side critical
section. Then:

A ->rcu-order B means t1(A) < t2(B), and

A ->rcu-link B means t2(A) <= t1(B).

(Of course, we always have t1(G) < t2(G) for any grace period G, and
t2(C) < t1(C) for any critical section C.)

This explains quite a lot. For example, we can justify including

C ->rcu-link G

into rcu-order as follows. From C ->rcu-link G we get that t2(C) <=
t1(G), in other words, C starts when or before G starts. Then the
Fundamental Law of RCU says that C must end before G ends, since
otherwise C would span all of G. Thus t1(C) < t2(G), which is C
->rcu-order G.

The case of G ->rcu-link C is similar.

This also explains why rcu-link can be extended by appending (rcu-order
; rcu-link)*. From X ->rcu-order Y ->rcu-link Z we get that t1(X) <
t2(Y) <= t1(Z) and thus t1(X) <= t1(Z). So if

A ->rcu-link B ->(rcu-order ; rcu-link)* C

then t2(A) <= t1(B) <= t1(C), which justifies A ->rcu-link C.

The same sort of argument shows that rcu-order should be extendable by
appending (rcu-link ; rcu-order)* -- but not (rcu-order ; rcu-link)*.

This also justifies why a lone gp belongs in rcu-order: G ->rcu-order G
holds because t1(G) < t2(G). But for critical sections we have t2(C) <
t1(C) and so C ->rcu-order C does not hold.
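
These little inequalities are easy to sanity-check mechanically. A quick
sketch (the numeric times are arbitrary choices of my own; only the t1/t2
convention comes from the discussion above):

```python
# Numeric sanity check of the t1/t2 reading; times are arbitrary.
# Grace period G: t1 = start, t2 = end.  Critical section C: t2 = start,
# t1 = end (the "backward" convention).

def rcu_order(a, b):
    return a["t1"] < b["t2"]     # A ->rcu-order B  means  t1(A) < t2(B)

def rcu_link(a, b):
    return a["t2"] <= b["t1"]    # A ->rcu-link B  means  t2(A) <= t1(B)

G = {"t1": 10, "t2": 20}         # grace period over [10, 20]
C = {"t2": 12, "t1": 30}         # critical section over [12, 30]

A = {"t1": 0, "t2": 4}           # grace period over [0, 4]
B = {"t2": 5, "t1": 15}          # critical section over [5, 15]
Y = {"t1": 8, "t2": 20}          # grace period over [8, 20]
Z = {"t2": 35, "t1": 40}         # critical section over [35, 40]
```

Here rcu_order(G, G) holds but rcu_order(C, C) does not, and the chain
A ->rcu-link B ->rcu-order Y ->rcu-link Z collapses to A ->rcu-link Z,
matching the absorption argument above.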

Assuming ordinary memory accesses occur in a single instant, you see why
it makes sense to consider (po ; rcu-order ; po) an ordering. But when
you're comparing grace periods or critical sections to each other,
things get a little ambiguous. Should G1 be considered to come before
G2 when t1(G1) < t1(G2), when t2(G1) < t2(G2), or when t2(G1) < t1(G2)?
Springing for (po ; rcu-order ; po?) amounts to choosing the second
alternative.

Alan

2023-01-19 19:07:04

by Paul E. McKenney

[permalink] [raw]
Subject: Re: Internal vs. external barriers (was: Re: Interesting LKMM litmus test)

On Thu, Jan 19, 2023 at 11:41:01AM -0500, Alan Stern wrote:
> On Thu, Jan 19, 2023 at 12:22:50PM +0100, Jonas Oberhauser wrote:
> >
> >
> > On 1/19/2023 3:28 AM, Alan Stern wrote:
> > > > This is a permanent error; I've given up. Sorry it didn't
> > > work out.
> >
> > [It seems the e-mail still reached me through the mailing list]
>
> [For everyone else, Jonas is referring to the fact that the last two
> emails I sent to his huaweicloud.com address could not be delivered, so
> I copied them off-list to his huawei.com address.]
>
> > > > I consider that a hack though and don't like it.
> > > It _is_ a bit of a hack, but not a huge one. srcu_read_lock() really
> > > is a lot like a load, in that it returns a value obtained by reading
> > > something from memory (along with some other operations, though, so it
> > > isn't a simple straightforward read -- perhaps more like an
> > > atomic_inc_return_relaxed).
> > The issue I have with this is that it might create accidental ordering. How
> > does it behave when you throw fences in the mix?
>
> I think this isn't going to be a problem. Certainly any real
> implementation of srcu_read_lock() is going to involve some actual load
> operations, so any unintentional ordering caused by fences will also
> apply to real executions. Likewise for srcu_read_unlock and store
> operations.
>
> > It really does not work like an increment at all, I think srcu_read_lock()
> > only reads the currently active index, but the index is changed by
> > srcu_sync. But even that is an implementation detail of sorts. I think the
> > best way to think of it would be for srcu_read_lock to just return an
> > arbitrary value.
>
> I think I'll stick to it always returning the initial value. Paul said
> that would be okay.

Just confirming.

> > The user can not rely on any kind of "accidental" rfe edges between these
> > events for ordering.
> >
> > Perhaps if you flag any use of these values in address or control
> > dependencies, as well as any event which depends on more than one of these
> > values, you could prove that it's impossible to contrain the behavior
> > through these rfe(and/or co) edges because you can anyways never inspect the
> > value returned by the operation (except to pass it into srcu_unlock).
> >
> > Or you might be able to explicitly eliminate the events everywhere, just
> > like you have done for carry-dep in your patch.
>
> On second thought, I'll make it impossible to read from the
> srcu_read_unlock events by removing them from the rf (and rfi/rfe)
> relation. Then it won't be necessary to change carry-dep or anything
> else.

Although that works very well for srcu_read_lock() and srcu_read_unlock(),
it would be an issue for srcu_down_read() and srcu_up_read(). But one
thing at a time! ;-)

Thanx, Paul

2023-01-19 20:41:45

by Alan Stern

[permalink] [raw]
Subject: Re: Internal vs. external barriers (was: Re: Interesting LKMM litmus test)

On Thu, Jan 19, 2023 at 10:41:07AM -0800, Paul E. McKenney wrote:
> In contrast, this actually needs srcu_down_read() and srcu_up_read():
>
> ------------------------------------------------------------------------
>
> C C-srcu-nest-6
>
> (*
> * Result: Never
> *
> * Flag unbalanced-srcu-locking
> * This would be valid for srcu_down_read() and srcu_up_read().
> *)
>
> {}
>
> P0(int *x, int *y, struct srcu_struct *s1, int *idx)
> {
> int r2;
> int r3;
>
> r3 = srcu_down_read(s1);
> WRITE_ONCE(*idx, r3);
> r2 = READ_ONCE(*y);
> }
>
> P1(int *x, int *y, struct srcu_struct *s1, int *idx)
> {
> int r1;
> int r3;
>
> r1 = READ_ONCE(*x);
> r3 = READ_ONCE(*idx);
> srcu_up_read(s1, r3);
> }
>
> P2(int *x, int *y, struct srcu_struct *s1)
> {
> WRITE_ONCE(*y, 1);
> synchronize_srcu(s1);
> WRITE_ONCE(*x, 1);
> }
>
> locations [0:r1]
> exists (1:r1=1 /\ 0:r2=0)

I modified this litmus test by adding a flag variable with an
smp_store_release in P0, an smp_load_acquire in P1, and a filter clause
to ensure that P1 reads the flag and idx from P0.
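
Reconstructed from that description, the modified test plausibly looks
like the following (a sketch only: the flag variable's name, its
placement, and the exact filter clause are my guesses, not the file that
was actually run):

```
C C-srcu-nest-6-flag

{}

P0(int *y, struct srcu_struct *s1, int *idx, int *f)
{
	int r2;
	int r3;

	r3 = srcu_down_read(s1);
	WRITE_ONCE(*idx, r3);
	r2 = READ_ONCE(*y);
	smp_store_release(f, 1);
}

P1(int *x, struct srcu_struct *s1, int *idx, int *f)
{
	int r1;
	int r3;
	int r4;

	r4 = smp_load_acquire(f);
	r1 = READ_ONCE(*x);
	r3 = READ_ONCE(*idx);
	srcu_up_read(s1, r3);
}

P2(int *x, int *y, struct srcu_struct *s1)
{
	WRITE_ONCE(*y, 1);
	synchronize_srcu(s1);
	WRITE_ONCE(*x, 1);
}

filter (1:r4=1)
exists (1:r1=1 /\ 0:r2=0)
```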

With the patch below, the results were as expected:

Test C-srcu-nest-6 Allowed
States 3
0:r1=0; 0:r2=0; 1:r1=0;
0:r1=0; 0:r2=1; 1:r1=0;
0:r1=0; 0:r2=1; 1:r1=1;
No
Witnesses
Positive: 0 Negative: 3
Condition exists (1:r1=1 /\ 0:r2=0)
Observation C-srcu-nest-6 Never 0 3
Time C-srcu-nest-6 0.04
Hash=2b010cf3446879fb84752a6016ff88c5

It turns out that the idea of removing rf edges from Srcu-unlock events
doesn't work well. The missing edges mess up herd's calculation of the
fr relation and the coherence axiom. So I've gone back to filtering
those edges out of carry-dep.

Also, Boqun's suggestion for flagging ordinary accesses to SRCU
structures no longer works, because the lock and unlock operations now
_are_ normal accesses. I removed that check too, but it shouldn't hurt
much because I don't expect to encounter litmus tests that try to read
or write srcu_structs directly.

Alan



Index: usb-devel/tools/memory-model/linux-kernel.bell
===================================================================
--- usb-devel.orig/tools/memory-model/linux-kernel.bell
+++ usb-devel/tools/memory-model/linux-kernel.bell
@@ -53,38 +53,30 @@ let rcu-rscs = let rec
in matched

(* Validate nesting *)
-flag ~empty Rcu-lock \ domain(rcu-rscs) as unbalanced-rcu-locking
-flag ~empty Rcu-unlock \ range(rcu-rscs) as unbalanced-rcu-locking
+flag ~empty Rcu-lock \ domain(rcu-rscs) as unbalanced-rcu-lock
+flag ~empty Rcu-unlock \ range(rcu-rscs) as unbalanced-rcu-unlock

(* Compute matching pairs of nested Srcu-lock and Srcu-unlock *)
-let srcu-rscs = let rec
- unmatched-locks = Srcu-lock \ domain(matched)
- and unmatched-unlocks = Srcu-unlock \ range(matched)
- and unmatched = unmatched-locks | unmatched-unlocks
- and unmatched-po = ([unmatched] ; po ; [unmatched]) & loc
- and unmatched-locks-to-unlocks =
- ([unmatched-locks] ; po ; [unmatched-unlocks]) & loc
- and matched = matched | (unmatched-locks-to-unlocks \
- (unmatched-po ; unmatched-po))
- in matched
+let srcu-rscs = ([Srcu-lock] ; (data | rf)+ ; [Srcu-unlock]) & loc

(* Validate nesting *)
-flag ~empty Srcu-lock \ domain(srcu-rscs) as unbalanced-srcu-locking
-flag ~empty Srcu-unlock \ range(srcu-rscs) as unbalanced-srcu-locking
+flag ~empty Srcu-lock \ domain(srcu-rscs) as unbalanced-srcu-lock
+flag ~empty Srcu-unlock \ range(srcu-rscs) as unbalanced-srcu-unlock
+flag ~empty (srcu-rscs^-1 ; srcu-rscs) \ id as multiple-srcu-matches

(* Check for use of synchronize_srcu() inside an RCU critical section *)
flag ~empty rcu-rscs & (po ; [Sync-srcu] ; po) as invalid-sleep

(* Validate SRCU dynamic match *)
-flag ~empty different-values(srcu-rscs) as srcu-bad-nesting
+flag ~empty different-values(srcu-rscs) as bad-srcu-value-match

(* Compute marked and plain memory accesses *)
let Marked = (~M) | IW | Once | Release | Acquire | domain(rmw) | range(rmw) |
- LKR | LKW | UL | LF | RL | RU
+ LKR | LKW | UL | LF | RL | RU | Srcu-lock | Srcu-unlock
let Plain = M \ Marked

(* Redefine dependencies to include those carried through plain accesses *)
-let carry-dep = (data ; rfi)*
+let carry-dep = (data ; [~ Srcu-unlock] ; rfi)*
let addr = carry-dep ; addr
let ctrl = carry-dep ; ctrl
let data = carry-dep ; data
Index: usb-devel/tools/memory-model/linux-kernel.def
===================================================================
--- usb-devel.orig/tools/memory-model/linux-kernel.def
+++ usb-devel/tools/memory-model/linux-kernel.def
@@ -49,8 +49,10 @@ synchronize_rcu() { __fence{sync-rcu}; }
synchronize_rcu_expedited() { __fence{sync-rcu}; }

// SRCU
-srcu_read_lock(X) __srcu{srcu-lock}(X)
-srcu_read_unlock(X,Y) { __srcu{srcu-unlock}(X,Y); }
+srcu_read_lock(X) __load{srcu-lock}(*X)
+srcu_read_unlock(X,Y) { __store{srcu-unlock}(*X,Y); }
+srcu_down_read(X) __load{srcu-lock}(*X)
+srcu_up_read(X,Y) { __store{srcu-unlock}(*X,Y); }
synchronize_srcu(X) { __srcu{sync-srcu}(X); }
synchronize_srcu_expedited(X) { __srcu{sync-srcu}(X); }

2023-01-19 20:45:53

by Paul E. McKenney

[permalink] [raw]
Subject: Re: Internal vs. external barriers (was: Re: Interesting LKMM litmus test)

On Wed, Jan 18, 2023 at 09:41:38AM -0800, Paul E. McKenney wrote:
> On Wed, Jan 18, 2023 at 11:03:35AM -0500, Alan Stern wrote:
> > On Tue, Jan 17, 2023 at 09:17:04PM -0800, Paul E. McKenney wrote:
> > > On Tue, Jan 17, 2023 at 09:15:15PM -0500, Alan Stern wrote:
> > > > Maybe we don't. Please test the patch below; I think it will do what
> > > > you want -- and it doesn't rule out nesting.
> > >
> > > It works like a champ on manual/kernel/C-srcu*.litmus in the litmus
> > > repository on github, good show and thank you!!!
> > >
> > > I will make more tests, and am checking this against the rest of the
> > > litmus tests in the repo, but in the meantime would you be willing to
> > > have me add your Signed-off-by?
> >
> > I'll email a real patch submission in the not-too-distant future,
> > assuming you don't find any problems with the new code.
>
> Sounds good!
>
> The current state is that last night's testing found a difference only
> for C-srcu-nest-5.litmus, in which case your version gives the correct
> answer and mainline is wrong. There were a couple of broken tests, which
> I fixed and a test involving spin_unlock_wait(), which is at this point
> perma-broken due to the Linux kernel no longer having such a thing.
> (Other than its re-introduction into i915, but they define it as a
> spin_lock_irq() followed by a spin_unlock_irq(), so why worry?)
> There were also a few timeouts.
>
> I intend to run the longer tests overnight.

Except that I had an episode of pilot error. :-/

But here are a couple of litmus tests illustrating how SRCU read-side
critical sections do not flatten/fuse the way that RCU read-side critical
sections do.

Why the difference? Because such a Linux-kernel SRCU implementation
would require additional storage on the order of T * S, where T is
the number of tasks and S is the number of srcu_struct structures.
That just won't be happening.

Here they are, and both behave the way that I would expect given your
unofficial patch:

https://github.com/paulmckrcu/litmus/blob/master/manual/kernel/C-srcu-nest-7.litmus
https://github.com/paulmckrcu/litmus/blob/master/manual/kernel/C-srcu-nest-8.litmus

Whew! ;-)

Thanx, Paul

2023-01-19 22:15:30

by Paul E. McKenney

[permalink] [raw]
Subject: Re: Internal vs. external barriers (was: Re: Interesting LKMM litmus test)

On Thu, Jan 19, 2023 at 02:51:53PM -0500, Alan Stern wrote:
> On Thu, Jan 19, 2023 at 10:41:07AM -0800, Paul E. McKenney wrote:
> > In contrast, this actually needs srcu_down_read() and srcu_up_read():
> >
> > ------------------------------------------------------------------------
> >
> > C C-srcu-nest-6
> >
> > (*
> > * Result: Never
> > *
> > * Flag unbalanced-srcu-locking
> > * This would be valid for srcu_down_read() and srcu_up_read().
> > *)
> >
> > {}
> >
> > P0(int *x, int *y, struct srcu_struct *s1, int *idx)
> > {
> > int r2;
> > int r3;
> >
> > r3 = srcu_down_read(s1);
> > WRITE_ONCE(*idx, r3);
> > r2 = READ_ONCE(*y);
> > }
> >
> > P1(int *x, int *y, struct srcu_struct *s1, int *idx)
> > {
> > int r1;
> > int r3;
> >
> > r1 = READ_ONCE(*x);
> > r3 = READ_ONCE(*idx);
> > srcu_up_read(s1, r3);
> > }
> >
> > P2(int *x, int *y, struct srcu_struct *s1)
> > {
> > WRITE_ONCE(*y, 1);
> > synchronize_srcu(s1);
> > WRITE_ONCE(*x, 1);
> > }
> >
> > locations [0:r1]
> > exists (1:r1=1 /\ 0:r2=0)
>
> I modified this litmus test by adding a flag variable with an
> smp_store_release in P0, an smp_load_acquire in P1, and a filter clause
> to ensure that P1 reads the flag and idx from P0.
>
> With the patch below, the results were as expected:
>
> Test C-srcu-nest-6 Allowed
> States 3
> 0:r1=0; 0:r2=0; 1:r1=0;
> 0:r1=0; 0:r2=1; 1:r1=0;
> 0:r1=0; 0:r2=1; 1:r1=1;
> No
> Witnesses
> Positive: 0 Negative: 3
> Condition exists (1:r1=1 /\ 0:r2=0)
> Observation C-srcu-nest-6 Never 0 3
> Time C-srcu-nest-6 0.04
> Hash=2b010cf3446879fb84752a6016ff88c5

Fair point, and for example we already recommend emulating call_rcu()
using similar release-acquire tricks.

> It turns out that the idea of removing rf edges from Srcu-unlock events
> doesn't work well. The missing edges mess up herd's calculation of the
> fr relation and the coherence axiom. So I've gone back to filtering
> those edges out of carry-dep.
>
> Also, Boqun's suggestion for flagging ordinary accesses to SRCU
> structures no longer works, because the lock and unlock operations now
> _are_ normal accesses. I removed that check too, but it shouldn't hurt
> much because I don't expect to encounter litmus tests that try to read
> or write srcu_structs directly.

Agreed. I for one would definitely have something to say about an
SRCU-usage patch that directly manipulated a srcu_struct structure! ;-)

> Alan
>
>
>
> Index: usb-devel/tools/memory-model/linux-kernel.bell
> ===================================================================
> --- usb-devel.orig/tools/memory-model/linux-kernel.bell
> +++ usb-devel/tools/memory-model/linux-kernel.bell
> @@ -53,38 +53,30 @@ let rcu-rscs = let rec
> in matched
>
> (* Validate nesting *)
> -flag ~empty Rcu-lock \ domain(rcu-rscs) as unbalanced-rcu-locking
> -flag ~empty Rcu-unlock \ range(rcu-rscs) as unbalanced-rcu-locking
> +flag ~empty Rcu-lock \ domain(rcu-rscs) as unbalanced-rcu-lock
> +flag ~empty Rcu-unlock \ range(rcu-rscs) as unbalanced-rcu-unlock

This renaming makes sense to me.

> (* Compute matching pairs of nested Srcu-lock and Srcu-unlock *)
> -let srcu-rscs = let rec
> - unmatched-locks = Srcu-lock \ domain(matched)
> - and unmatched-unlocks = Srcu-unlock \ range(matched)
> - and unmatched = unmatched-locks | unmatched-unlocks
> - and unmatched-po = ([unmatched] ; po ; [unmatched]) & loc
> - and unmatched-locks-to-unlocks =
> - ([unmatched-locks] ; po ; [unmatched-unlocks]) & loc
> - and matched = matched | (unmatched-locks-to-unlocks \
> - (unmatched-po ; unmatched-po))
> - in matched
> +let srcu-rscs = ([Srcu-lock] ; (data | rf)+ ; [Srcu-unlock]) & loc

The point of the "+" instead of the "*" is to avoid LKMM being confused by
an srcu_read_lock() immediately preceding an unrelated srcu_read_unlock(),
right? Or am I missing something more subtle?

> (* Validate nesting *)
> -flag ~empty Srcu-lock \ domain(srcu-rscs) as unbalanced-srcu-locking
> -flag ~empty Srcu-unlock \ range(srcu-rscs) as unbalanced-srcu-locking
> +flag ~empty Srcu-lock \ domain(srcu-rscs) as unbalanced-srcu-lock
> +flag ~empty Srcu-unlock \ range(srcu-rscs) as unbalanced-srcu-unlock
> +flag ~empty (srcu-rscs^-1 ; srcu-rscs) \ id as multiple-srcu-matches
>
> (* Check for use of synchronize_srcu() inside an RCU critical section *)
> flag ~empty rcu-rscs & (po ; [Sync-srcu] ; po) as invalid-sleep
>
> (* Validate SRCU dynamic match *)
> -flag ~empty different-values(srcu-rscs) as srcu-bad-nesting
> +flag ~empty different-values(srcu-rscs) as bad-srcu-value-match
>
> (* Compute marked and plain memory accesses *)
> let Marked = (~M) | IW | Once | Release | Acquire | domain(rmw) | range(rmw) |
> - LKR | LKW | UL | LF | RL | RU
> + LKR | LKW | UL | LF | RL | RU | Srcu-lock | Srcu-unlock
> let Plain = M \ Marked
>
> (* Redefine dependencies to include those carried through plain accesses *)
> -let carry-dep = (data ; rfi)*
> +let carry-dep = (data ; [~ Srcu-unlock] ; rfi)*

The "[~ Srcu-unlock]" matches the store that bridges the data and rfi,
correct?

> let addr = carry-dep ; addr
> let ctrl = carry-dep ; ctrl
> let data = carry-dep ; data
> Index: usb-devel/tools/memory-model/linux-kernel.def
> ===================================================================
> --- usb-devel.orig/tools/memory-model/linux-kernel.def
> +++ usb-devel/tools/memory-model/linux-kernel.def
> @@ -49,8 +49,10 @@ synchronize_rcu() { __fence{sync-rcu}; }
> synchronize_rcu_expedited() { __fence{sync-rcu}; }
>
> // SRCU
> -srcu_read_lock(X) __srcu{srcu-lock}(X)
> -srcu_read_unlock(X,Y) { __srcu{srcu-unlock}(X,Y); }
> +srcu_read_lock(X) __load{srcu-lock}(*X)
> +srcu_read_unlock(X,Y) { __store{srcu-unlock}(*X,Y); }
> +srcu_down_read(X) __load{srcu-lock}(*X)
> +srcu_up_read(X,Y) { __store{srcu-unlock}(*X,Y); }

And here srcu_down_read() and srcu_up_read() are synonyms for
srcu_read_lock() and srcu_read_unlock(), respectively, which I believe
should suffice.

> synchronize_srcu(X) { __srcu{sync-srcu}(X); }
> synchronize_srcu_expedited(X) { __srcu{sync-srcu}(X); }

So this looks quite reasonable to me.

Thanx, Paul

2023-01-19 22:26:35

by Alan Stern

[permalink] [raw]
Subject: Re: Internal vs. external barriers (was: Re: Interesting LKMM litmus test)

On Thu, Jan 19, 2023 at 01:53:04PM -0800, Paul E. McKenney wrote:
> On Thu, Jan 19, 2023 at 02:51:53PM -0500, Alan Stern wrote:
> > Index: usb-devel/tools/memory-model/linux-kernel.bell
> > ===================================================================
> > --- usb-devel.orig/tools/memory-model/linux-kernel.bell
> > +++ usb-devel/tools/memory-model/linux-kernel.bell
> > @@ -53,38 +53,30 @@ let rcu-rscs = let rec
> > in matched
> >
> > (* Validate nesting *)
> > -flag ~empty Rcu-lock \ domain(rcu-rscs) as unbalanced-rcu-locking
> > -flag ~empty Rcu-unlock \ range(rcu-rscs) as unbalanced-rcu-locking
> > +flag ~empty Rcu-lock \ domain(rcu-rscs) as unbalanced-rcu-lock
> > +flag ~empty Rcu-unlock \ range(rcu-rscs) as unbalanced-rcu-unlock
>
> This renaming makes sense to me.

But I'll put it in a separate patch, since it's not related to the main
purpose of this change.

>
> > (* Compute matching pairs of nested Srcu-lock and Srcu-unlock *)
> > -let srcu-rscs = let rec
> > - unmatched-locks = Srcu-lock \ domain(matched)
> > - and unmatched-unlocks = Srcu-unlock \ range(matched)
> > - and unmatched = unmatched-locks | unmatched-unlocks
> > - and unmatched-po = ([unmatched] ; po ; [unmatched]) & loc
> > - and unmatched-locks-to-unlocks =
> > - ([unmatched-locks] ; po ; [unmatched-unlocks]) & loc
> > - and matched = matched | (unmatched-locks-to-unlocks \
> > - (unmatched-po ; unmatched-po))
> > - in matched
> > +let srcu-rscs = ([Srcu-lock] ; (data | rf)+ ; [Srcu-unlock]) & loc
>
> The point of the "+" instead of the "*" is to avoid LKMM being confused by
> an srcu_read_lock() immediately preceding an unrelated srcu_read_unlock(),
> right? Or am I missing something more subtle?

No, and it's not to avoid confusion. It merely indicates that there has
to be at least one instance of data or rf here; we will never have a
case where the lock and the unlock are the same event.
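
As a toy illustration of that relational point (this is ordinary Python, not herd7, and the event names are invented): `R+` relates events reachable in at least one step, while `R*` additionally includes the identity, which can never pair a lock with an unlock anyway.

```python
# Toy model of the (data | rf)+ closure used in the srcu-rscs definition.
# Illustrative only; not herd7, and the events/edges are invented.

def closure_plus(edges):
    """Transitive closure R+: pairs reachable in at least one step."""
    step = set(edges)
    result = set(step)
    while True:
        new = {(a, d) for (a, b) in result for (c, d) in step if b == c}
        if new <= result:
            return result
        result |= new

def closure_star(edges, events):
    """Reflexive-transitive closure R*: R+ plus the identity on events."""
    return closure_plus(edges) | {(e, e) for e in events}

# Events: L = srcu_read_lock, W = WRITE_ONCE(*idx, r3),
#         R = READ_ONCE(*idx), U = srcu_read_unlock.
# data: L -> W (cookie stored), R -> U (cookie passed to the unlock)
# rf:   W -> R (the unlocking thread reads the cookie)
edges = {("L", "W"), ("W", "R"), ("R", "U")}

# The lock matches the unlock through data ; rf ; data:
assert ("L", "U") in closure_plus(edges)
# "+" never relates an event to itself here; using "+" rather than "*"
# simply records that a lock and its unlock are never the same event:
assert ("L", "L") not in closure_plus(edges)
assert ("L", "L") in closure_star(edges, {"L", "W", "R", "U"})
```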

>
> > (* Validate nesting *)
> > -flag ~empty Srcu-lock \ domain(srcu-rscs) as unbalanced-srcu-locking
> > -flag ~empty Srcu-unlock \ range(srcu-rscs) as unbalanced-srcu-locking
> > +flag ~empty Srcu-lock \ domain(srcu-rscs) as unbalanced-srcu-lock
> > +flag ~empty Srcu-unlock \ range(srcu-rscs) as unbalanced-srcu-unlock
> > +flag ~empty (srcu-rscs^-1 ; srcu-rscs) \ id as multiple-srcu-matches
> >
> > (* Check for use of synchronize_srcu() inside an RCU critical section *)
> > flag ~empty rcu-rscs & (po ; [Sync-srcu] ; po) as invalid-sleep
> >
> > (* Validate SRCU dynamic match *)
> > -flag ~empty different-values(srcu-rscs) as srcu-bad-nesting
> > +flag ~empty different-values(srcu-rscs) as bad-srcu-value-match
> >
> > (* Compute marked and plain memory accesses *)
> > let Marked = (~M) | IW | Once | Release | Acquire | domain(rmw) | range(rmw) |
> > - LKR | LKW | UL | LF | RL | RU
> > + LKR | LKW | UL | LF | RL | RU | Srcu-lock | Srcu-unlock
> > let Plain = M \ Marked
> >
> > (* Redefine dependencies to include those carried through plain accesses *)
> > -let carry-dep = (data ; rfi)*
> > +let carry-dep = (data ; [~ Srcu-unlock] ; rfi)*
>
> The "[~ Srcu-unlock]" matches the store that bridges the data and rfi,
> correct?

Right.

>
> > let addr = carry-dep ; addr
> > let ctrl = carry-dep ; ctrl
> > let data = carry-dep ; data
> > Index: usb-devel/tools/memory-model/linux-kernel.def
> > ===================================================================
> > --- usb-devel.orig/tools/memory-model/linux-kernel.def
> > +++ usb-devel/tools/memory-model/linux-kernel.def
> > @@ -49,8 +49,10 @@ synchronize_rcu() { __fence{sync-rcu}; }
> > synchronize_rcu_expedited() { __fence{sync-rcu}; }
> >
> > // SRCU
> > -srcu_read_lock(X) __srcu{srcu-lock}(X)
> > -srcu_read_unlock(X,Y) { __srcu{srcu-unlock}(X,Y); }
> > +srcu_read_lock(X) __load{srcu-lock}(*X)
> > +srcu_read_unlock(X,Y) { __store{srcu-unlock}(*X,Y); }
> > +srcu_down_read(X) __load{srcu-lock}(*X)
> > +srcu_up_read(X,Y) { __store{srcu-unlock}(*X,Y); }
>
> And here srcu_down_read() and srcu_up_read() are synonyms for
> srcu_read_lock() and srcu_read_unlock(), respectively, which I believe
> should suffice.
>
> > synchronize_srcu(X) { __srcu{sync-srcu}(X); }
> > synchronize_srcu_expedited(X) { __srcu{sync-srcu}(X); }
>
> So this looks quite reasonable to me.

Okay, good. In theory we could check for read_lock and read_unlock on
different CPUs, but I don't think it's worth the trouble.

Alan

2023-01-19 23:42:26

by Paul E. McKenney

[permalink] [raw]
Subject: Re: Internal vs. external barriers (was: Re: Interesting LKMM litmus test)

On Thu, Jan 19, 2023 at 05:04:49PM -0500, Alan Stern wrote:
> On Thu, Jan 19, 2023 at 01:53:04PM -0800, Paul E. McKenney wrote:
> > On Thu, Jan 19, 2023 at 02:51:53PM -0500, Alan Stern wrote:
> > > Index: usb-devel/tools/memory-model/linux-kernel.bell
> > > ===================================================================
> > > --- usb-devel.orig/tools/memory-model/linux-kernel.bell
> > > +++ usb-devel/tools/memory-model/linux-kernel.bell
> > > @@ -53,38 +53,30 @@ let rcu-rscs = let rec
> > > in matched
> > >
> > > (* Validate nesting *)
> > > -flag ~empty Rcu-lock \ domain(rcu-rscs) as unbalanced-rcu-locking
> > > -flag ~empty Rcu-unlock \ range(rcu-rscs) as unbalanced-rcu-locking
> > > +flag ~empty Rcu-lock \ domain(rcu-rscs) as unbalanced-rcu-lock
> > > +flag ~empty Rcu-unlock \ range(rcu-rscs) as unbalanced-rcu-unlock
> >
> > This renaming makes sense to me.
>
> But I'll put it in a separate patch, since it's not related to the main
> purpose of this change.

Even better!

> > > (* Compute matching pairs of nested Srcu-lock and Srcu-unlock *)
> > > -let srcu-rscs = let rec
> > > - unmatched-locks = Srcu-lock \ domain(matched)
> > > - and unmatched-unlocks = Srcu-unlock \ range(matched)
> > > - and unmatched = unmatched-locks | unmatched-unlocks
> > > - and unmatched-po = ([unmatched] ; po ; [unmatched]) & loc
> > > - and unmatched-locks-to-unlocks =
> > > - ([unmatched-locks] ; po ; [unmatched-unlocks]) & loc
> > > - and matched = matched | (unmatched-locks-to-unlocks \
> > > - (unmatched-po ; unmatched-po))
> > > - in matched
> > > +let srcu-rscs = ([Srcu-lock] ; (data | rf)+ ; [Srcu-unlock]) & loc
> >
> > The point of the "+" instead of the "*" is to avoid LKMM being confused by
> > an srcu_read_lock() immediately preceding an unrelated srcu_read_unlock(),
> > right? Or am I missing something more subtle?
>
> No, and it's not to avoid confusion. It merely indicates that there has
> to be at least one instance of data or rf here; we will never have a
> case where the lock and the unlock are the same event.

Got it, thank you!

> > > (* Validate nesting *)
> > > -flag ~empty Srcu-lock \ domain(srcu-rscs) as unbalanced-srcu-locking
> > > -flag ~empty Srcu-unlock \ range(srcu-rscs) as unbalanced-srcu-locking
> > > +flag ~empty Srcu-lock \ domain(srcu-rscs) as unbalanced-srcu-lock
> > > +flag ~empty Srcu-unlock \ range(srcu-rscs) as unbalanced-srcu-unlock
> > > +flag ~empty (srcu-rscs^-1 ; srcu-rscs) \ id as multiple-srcu-matches
> > >
> > > (* Check for use of synchronize_srcu() inside an RCU critical section *)
> > > flag ~empty rcu-rscs & (po ; [Sync-srcu] ; po) as invalid-sleep
> > >
> > > (* Validate SRCU dynamic match *)
> > > -flag ~empty different-values(srcu-rscs) as srcu-bad-nesting
> > > +flag ~empty different-values(srcu-rscs) as bad-srcu-value-match
> > >
> > > (* Compute marked and plain memory accesses *)
> > > let Marked = (~M) | IW | Once | Release | Acquire | domain(rmw) | range(rmw) |
> > > - LKR | LKW | UL | LF | RL | RU
> > > + LKR | LKW | UL | LF | RL | RU | Srcu-lock | Srcu-unlock
> > > let Plain = M \ Marked
> > >
> > > (* Redefine dependencies to include those carried through plain accesses *)
> > > -let carry-dep = (data ; rfi)*
> > > +let carry-dep = (data ; [~ Srcu-unlock] ; rfi)*
> >
> > The "[~ Srcu-unlock]" matches the store that bridges the data and rfi,
> > correct?
>
> Right.
>
> > > let addr = carry-dep ; addr
> > > let ctrl = carry-dep ; ctrl
> > > let data = carry-dep ; data
> > > Index: usb-devel/tools/memory-model/linux-kernel.def
> > > ===================================================================
> > > --- usb-devel.orig/tools/memory-model/linux-kernel.def
> > > +++ usb-devel/tools/memory-model/linux-kernel.def
> > > @@ -49,8 +49,10 @@ synchronize_rcu() { __fence{sync-rcu}; }
> > > synchronize_rcu_expedited() { __fence{sync-rcu}; }
> > >
> > > // SRCU
> > > -srcu_read_lock(X) __srcu{srcu-lock}(X)
> > > -srcu_read_unlock(X,Y) { __srcu{srcu-unlock}(X,Y); }
> > > +srcu_read_lock(X) __load{srcu-lock}(*X)
> > > +srcu_read_unlock(X,Y) { __store{srcu-unlock}(*X,Y); }
> > > +srcu_down_read(X) __load{srcu-lock}(*X)
> > > +srcu_up_read(X,Y) { __store{srcu-unlock}(*X,Y); }
> >
> > And here srcu_down_read() and srcu_up_read() are synonyms for
> > srcu_read_lock() and srcu_read_unlock(), respectively, which I believe
> > should suffice.
> >
> > > synchronize_srcu(X) { __srcu{sync-srcu}(X); }
> > > synchronize_srcu_expedited(X) { __srcu{sync-srcu}(X); }
> >
> > So this looks quite reasonable to me.
>
> Okay, good. In theory we could check for read_lock and read_unlock on
> different CPUs, but I don't think it's worth the trouble.

Given that lockdep already complains about that sort of thing in the
Linux kernel, agreed, it is not worth much trouble at all.

Thanx, Paul

2023-01-20 04:14:05

by Paul E. McKenney

[permalink] [raw]
Subject: Re: Internal vs. external barriers (was: Re: Interesting LKMM litmus test)

On Thu, Jan 19, 2023 at 02:51:53PM -0500, Alan Stern wrote:
> On Thu, Jan 19, 2023 at 10:41:07AM -0800, Paul E. McKenney wrote:
> > In contrast, this actually needs srcu_down_read() and srcu_up_read():
> >
> > ------------------------------------------------------------------------
> >
> > C C-srcu-nest-6
> >
> > (*
> > * Result: Never
> > *
> > * Flag unbalanced-srcu-locking
> > * This would be valid for srcu_down_read() and srcu_up_read().
> > *)
> >
> > {}
> >
> > P0(int *x, int *y, struct srcu_struct *s1, int *idx)
> > {
> > int r2;
> > int r3;
> >
> > r3 = srcu_down_read(s1);
> > WRITE_ONCE(*idx, r3);
> > r2 = READ_ONCE(*y);
> > }
> >
> > P1(int *x, int *y, struct srcu_struct *s1, int *idx)
> > {
> > int r1;
> > int r3;
> >
> > r1 = READ_ONCE(*x);
> > r3 = READ_ONCE(*idx);
> > srcu_up_read(s1, r3);
> > }
> >
> > P2(int *x, int *y, struct srcu_struct *s1)
> > {
> > WRITE_ONCE(*y, 1);
> > synchronize_srcu(s1);
> > WRITE_ONCE(*x, 1);
> > }
> >
> > locations [0:r1]
> > exists (1:r1=1 /\ 0:r2=0)
>
> I modified this litmus test by adding a flag variable with an
> smp_store_release in P0, an smp_load_acquire in P1, and a filter clause
> to ensure that P1 reads the flag and idx from P0.
>
> With the patch below, the results were as expected:
>
> Test C-srcu-nest-6 Allowed
> States 3
> 0:r1=0; 0:r2=0; 1:r1=0;
> 0:r1=0; 0:r2=1; 1:r1=0;
> 0:r1=0; 0:r2=1; 1:r1=1;
> No
> Witnesses
> Positive: 0 Negative: 3
> Condition exists (1:r1=1 /\ 0:r2=0)
> Observation C-srcu-nest-6 Never 0 3
> Time C-srcu-nest-6 0.04
> Hash=2b010cf3446879fb84752a6016ff88c5
>
> It turns out that the idea of removing rf edges from Srcu-unlock events
> doesn't work well. The missing edges mess up herd's calculation of the
> fr relation and the coherence axiom. So I've gone back to filtering
> those edges out of carry-dep.
>
> Also, Boqun's suggestion for flagging ordinary accesses to SRCU
> structures no longer works, because the lock and unlock operations now
> _are_ normal accesses. I removed that check too, but it shouldn't hurt
> much because I don't expect to encounter litmus tests that try to read
> or write srcu_structs directly.
>
> Alan
>
>
>
> Index: usb-devel/tools/memory-model/linux-kernel.bell
> ===================================================================
> --- usb-devel.orig/tools/memory-model/linux-kernel.bell
> +++ usb-devel/tools/memory-model/linux-kernel.bell
> @@ -53,38 +53,30 @@ let rcu-rscs = let rec
> in matched
>
> (* Validate nesting *)
> -flag ~empty Rcu-lock \ domain(rcu-rscs) as unbalanced-rcu-locking
> -flag ~empty Rcu-unlock \ range(rcu-rscs) as unbalanced-rcu-locking
> +flag ~empty Rcu-lock \ domain(rcu-rscs) as unbalanced-rcu-lock
> +flag ~empty Rcu-unlock \ range(rcu-rscs) as unbalanced-rcu-unlock
>
> (* Compute matching pairs of nested Srcu-lock and Srcu-unlock *)
> -let srcu-rscs = let rec
> - unmatched-locks = Srcu-lock \ domain(matched)
> - and unmatched-unlocks = Srcu-unlock \ range(matched)
> - and unmatched = unmatched-locks | unmatched-unlocks
> - and unmatched-po = ([unmatched] ; po ; [unmatched]) & loc
> - and unmatched-locks-to-unlocks =
> - ([unmatched-locks] ; po ; [unmatched-unlocks]) & loc
> - and matched = matched | (unmatched-locks-to-unlocks \
> - (unmatched-po ; unmatched-po))
> - in matched
> +let srcu-rscs = ([Srcu-lock] ; (data | rf)+ ; [Srcu-unlock]) & loc
>
> (* Validate nesting *)
> -flag ~empty Srcu-lock \ domain(srcu-rscs) as unbalanced-srcu-locking
> -flag ~empty Srcu-unlock \ range(srcu-rscs) as unbalanced-srcu-locking
> +flag ~empty Srcu-lock \ domain(srcu-rscs) as unbalanced-srcu-lock
> +flag ~empty Srcu-unlock \ range(srcu-rscs) as unbalanced-srcu-unlock
> +flag ~empty (srcu-rscs^-1 ; srcu-rscs) \ id as multiple-srcu-matches
>
> (* Check for use of synchronize_srcu() inside an RCU critical section *)
> flag ~empty rcu-rscs & (po ; [Sync-srcu] ; po) as invalid-sleep
>
> (* Validate SRCU dynamic match *)
> -flag ~empty different-values(srcu-rscs) as srcu-bad-nesting
> +flag ~empty different-values(srcu-rscs) as bad-srcu-value-match
>
> (* Compute marked and plain memory accesses *)
> let Marked = (~M) | IW | Once | Release | Acquire | domain(rmw) | range(rmw) |
> - LKR | LKW | UL | LF | RL | RU
> + LKR | LKW | UL | LF | RL | RU | Srcu-lock | Srcu-unlock
> let Plain = M \ Marked
>
> (* Redefine dependencies to include those carried through plain accesses *)
> -let carry-dep = (data ; rfi)*
> +let carry-dep = (data ; [~ Srcu-unlock] ; rfi)*
> let addr = carry-dep ; addr
> let ctrl = carry-dep ; ctrl
> let data = carry-dep ; data
> Index: usb-devel/tools/memory-model/linux-kernel.def
> ===================================================================
> --- usb-devel.orig/tools/memory-model/linux-kernel.def
> +++ usb-devel/tools/memory-model/linux-kernel.def
> @@ -49,8 +49,10 @@ synchronize_rcu() { __fence{sync-rcu}; }
> synchronize_rcu_expedited() { __fence{sync-rcu}; }
>
> // SRCU
> -srcu_read_lock(X) __srcu{srcu-lock}(X)
> -srcu_read_unlock(X,Y) { __srcu{srcu-unlock}(X,Y); }
> +srcu_read_lock(X) __load{srcu-lock}(*X)
> +srcu_read_unlock(X,Y) { __store{srcu-unlock}(*X,Y); }
> +srcu_down_read(X) __load{srcu-lock}(*X)
> +srcu_up_read(X,Y) { __store{srcu-unlock}(*X,Y); }
> synchronize_srcu(X) { __srcu{sync-srcu}(X); }
> synchronize_srcu_expedited(X) { __srcu{sync-srcu}(X); }

And for some initial tests:

https://github.com/paulmckrcu/litmus/blob/master/manual/kernel/C-srcu-nest-1.litmus

"Flag multiple-srcu-matches" but otherwise OK.
As a "hail Mary" exercise, I used r4 for the second SRCU
read-side critical section, but this had no effect.
(This flag is expected and seen for #4 below.)

https://github.com/paulmckrcu/litmus/blob/master/manual/kernel/C-srcu-nest-2.litmus
https://github.com/paulmckrcu/litmus/blob/master/manual/kernel/C-srcu-nest-3.litmus
https://github.com/paulmckrcu/litmus/blob/master/manual/kernel/C-srcu-nest-4.litmus
https://github.com/paulmckrcu/litmus/blob/master/manual/kernel/C-srcu-nest-5.litmus

All as expected.

https://github.com/paulmckrcu/litmus/blob/master/manual/kernel/C-srcu-nest-6.litmus

Get "Flag unbalanced-srcu-lock" and "Flag unbalanced-srcu-unlock",
but this is srcu_down_read() and srcu_up_read(), where this should
be OK. Ah, but I need to do the release/acquire/filter trick. Once
I did that, it works as expected.

https://github.com/paulmckrcu/litmus/blob/master/manual/kernel/C-srcu-nest-7.litmus
https://github.com/paulmckrcu/litmus/blob/master/manual/kernel/C-srcu-nest-8.litmus

Both as expected.

Getting there!!!

I also started a regression test, hopefully without pilot error. :-/

Thanx, Paul

2023-01-20 05:54:10

by Paul E. McKenney

[permalink] [raw]
Subject: Re: Internal vs. external barriers (was: Re: Interesting LKMM litmus test)

On Thu, Jan 19, 2023 at 02:39:01PM +0100, Jonas Oberhauser wrote:
>
>
> On 1/19/2023 1:11 AM, Paul E. McKenney wrote:
> > On Wed, Jan 18, 2023 at 10:24:50PM +0100, Jonas Oberhauser wrote:
> > > What I was thinking of is more something like this:
> > >
> > > P0{
> > >     idx1 = srcu_down(&ss);
> > >     srcu_up(&ss,idx1);
> > > }
> > >
> > > P1{
> > >     idx2 = srcu_down(&ss);
> > >     srcu_up(&ss,idx2)
> > > }
> > And srcu_read_lock() and srcu_read_unlock() already do this.
>
> I think I left out too much from my example.
> And filling in the details led me down a bit of a rabbit hole of confusion
> for a while.
> But here's what I ended up with:
>
>
> P0{
>     idx1 = srcu_down(&ss);
>     store_rel(p1, true);
>
>
>     shared cs
>
>     R x == ?
>
>     while (! load_acq(p2));
>     R idx2 == idx1 // for some reason, we got lucky!
>     srcu_up(&ss,idx1);

Although the current Linux-kernel implementation happens to be fine with
this sort of abuse, I am quite happy to tell people "Don't do that!"
For whatever it is worth, the Linux kernel would happily ignore the read
of idx2 and the comparison, and I am happy with its doing so.

There is still a data dependency running from the srcu_down_read() to
the srcu_up_read(). If you instead fed idx2 into the srcu_up_read(),
I believe that herd would yell at you (please see definitions below).

And if you tried feeding idx2 into the above srcu_up_read() without
checking for equality, the Linux kernel would give you an SRCU
grace-period hang. So please don't do that.

The only valid use of the value returned from srcu_down_read() is to be
passed to a single execution of srcu_up_read(). If you do anything else
at all with it, anything at all, you just voided your SRCU warranty.
For that matter, if you just throw that value on the floor and don't
pass it to an srcu_up_read() execution, you also just voided your SRCU
warranty.
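
That contract can be phrased as a tiny executable toy (purely illustrative; the real kernel implementation keeps per-CPU counters and flips the index, none of which is modeled here): the cookie is opaque, and its one legitimate fate is to be handed to exactly one srcu_up_read(), possibly on another task.

```python
# Toy model of the srcu_down_read()/srcu_up_read() cookie contract.
# Invented for illustration; nothing like the kernel's implementation.

class ToySrcu:
    def __init__(self):
        self.active_idx = 0      # index handed out to new readers
        self.count = [0, 0]      # outstanding readers per index

    def down_read(self):
        i = self.active_idx
        self.count[i] += 1
        return i                 # opaque cookie

    def up_read(self, cookie):
        # The only valid argument is a cookie from some down_read(),
        # used exactly once -- possibly on a different task.
        assert self.count[cookie] > 0, "cookie reused or invented"
        self.count[cookie] -= 1

    def grace_period_may_end(self, i):
        return self.count[i] == 0

s = ToySrcu()
cookie = s.down_read()           # e.g. on P0
assert not s.grace_period_may_end(cookie)
s.up_read(cookie)                # handed off, e.g. read by P1 via *idx
assert s.grace_period_may_end(cookie)
```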

> }
>
> P1{
>     idx2 = srcu_down(&ss);
>     store_rel(p2, true);
>
>     shared cs
>
>     R y == 0
>
>     while (! load_acq(p1));
>     srcu_up(&ss,idx2);
> }
>
> P2 {
>     W y = 1
>     srcu_sync(&ss);
>     W x = 1
> }

And you can do this with srcu_read_lock() and srcu_read_unlock().
In contrast, this actually needs srcu_down_read() and srcu_up_read():

------------------------------------------------------------------------

C C-srcu-nest-6

(*
* Result: Never
*
* Flag unbalanced-srcu-locking
* This would be valid for srcu_down_read() and srcu_up_read().
*)

{}

P0(int *x, int *y, struct srcu_struct *s1, int *idx)
{
int r2;
int r3;

r3 = srcu_down_read(s1);
WRITE_ONCE(*idx, r3);
r2 = READ_ONCE(*y);
}

P1(int *x, int *y, struct srcu_struct *s1, int *idx)
{
int r1;
int r3;

r1 = READ_ONCE(*x);
r3 = READ_ONCE(*idx);
srcu_up_read(s1, r3);
}

P2(int *x, int *y, struct srcu_struct *s1)
{
WRITE_ONCE(*y, 1);
synchronize_srcu(s1);
WRITE_ONCE(*x, 1);
}

locations [0:r1]
exists (1:r1=1 /\ 0:r2=0)

------------------------------------------------------------------------

Here is the updated portion of linux-kernel.bell relative to Alan's
update:

------------------------------------------------------------------------

(* Compute matching pairs of nested Srcu-lock and Srcu-unlock *)
let srcu-rscs = ([Srcu-lock] ; (data | rf)* ; [Srcu-unlock]) & loc

(* Validate nesting *)
empty Srcu-lock \ domain(srcu-rscs) as mismatched-srcu-locking
empty Srcu-unlock \ range(srcu-rscs) as mismatched-srcu-unlocking
flag ~empty (srcu-rscs^-1 ; srcu-rscs) \ id as multiple-srcu-unlocks

(* Check for use of synchronize_srcu() inside an RCU critical section *)
flag ~empty rcu-rscs & (po ; [Sync-srcu] ; po) as invalid-sleep

(* Validate SRCU dynamic match *)
flag ~empty different-values(srcu-rscs) as srcu-bad-nesting

------------------------------------------------------------------------

And here is the updated portion of linux-kernel.def:

------------------------------------------------------------------------

srcu_down_read(X) __load{srcu-lock}(*X)
srcu_up_read(X,Y) { __store{srcu-unlock}(*X,Y); }

------------------------------------------------------------------------

Could you please review the remainder to see what remains given the
usage restrictions that I called out above?

Thanx, Paul

> Assuming that, as indicated above, both threads happen to read the same
> index, are you guaranteed that the shared cs lasts until both P0 and P1 have
> performed their final up?
> Is it allowed for P0 to read x==1?
>
> If you define matching up&down as you do through the data link, then we get
> something like
>
> P1's down ->po;prop;po? grace period
> thus
> P1's up ->rcu-order? grace period
> P0's down ->po;hb;po? P1's up ->rcu-order grace period
> P0's up ->srcu-rscsi;rcu-link;rcu-order? grace-period
> Unfortunately, this is not enough to rcu-order? P0's up with the grace
> period -- you'd need a second rcu-gp for that!
>
> Looking at it from the other side, because x reads x=1, we have
> grace period ->po;rfe;po P0's up
> and thus
> grace period ->rcu-order P0's down ->po;hb;po P1's up
> but again this would order the grace period with P1's up because you'd need
> a second grace period.
>
> When sketching it out on paper, I couldn't find any forbidden cycles, and so
> x==1 would be allowed. (But as I said, I managed to confuse myself a few
> times with this, so if you find a forbidden cycle let me know).
>
> But note that the synchronization in there and the fact that both have the
> same index ensures that the two grace periods overlap, in a hypothetical
> order it would be
>     down() -> down() -> up() -> up()
> (with any permutation of P0 and P1 over these events so that they each get 1
> up() and 1 down()) and thus the grace period must actually end after both,
> or start before both.
>
> With the definition I offered, you would get
> P0's up() ->srcu-rscsi? P1's down()
> and
> P1's up() ->srcu-rscsi P0's down()
> and in particular
>
> Rx1 ->po P0's up() ->srcu-rscsi? P1's down() ->po Ry0 ->prop Wy1 ->po
> srcu-gp on the same loc ->po Wx1 ->rfe Rx1
> which can be collapsed to
> Rx1 ->po;rcu-order;po;hb Rx1 which isn't irreflexive
>
> Thus x==1 would be forbidden.
>
> This is more semaphore-like, where the same cookie shared between threads
> implies that it's the same semaphore, and the overlapping guarantee (through
> synchronizing on p1,p2 in the beginning) means that the critical sections
> overlap.
>
> In contrast, I wouldn't suggest the same for srcu_lock and srcu_unlock,
> where even though you may get the same cookie by accident, those might still
> be two completely independent critical sections.
> For example, you could imagine a single percpu counter _cnt (per index of
> course) that is incremented and decremented for lock() and unlock(), and the
> condition to pass an srcu_sync() of a given index is that the
> cpu[...]._cnt[idx] are all individually 0 and the sum of all ups[idx] is
> equal to the sum of all downs[idx].
>
> If you create an operational model of up() and down() in terms of such a
> per-index semaphore, I think the x==1 case would similarly need to be
> forbidden. Since the grace period must end after P1's grace period, and P0's
> and P1's grace period overlap and use the same semaphore, the count is never
> 0 at any point in time either P0 or P1 are in the grace period, and so the
> grace period must also end after P0's grace period. But then x=1 can't yet
> have propagated to P0 when it reads x inside its grace period.
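
That per-index semaphore reading can be made concrete with a toy operational sketch (invented names, nothing from the kernel): both readers increment the same per-index count, so the grace period cannot end until both have done their up(), which is what would forbid x==1 under this reading.

```python
# Toy operational model of the per-index semaphore reading of down()/up().
# Invented for illustration; not the kernel implementation.

class SemaphoreSrcu:
    def __init__(self):
        self.count = [0, 0]      # readers currently "down" per index

    def down(self, i):
        self.count[i] += 1

    def up(self, i):
        self.count[i] -= 1

    def gp_may_end(self, i):
        # synchronize() on index i may only complete once no reader
        # holds that index, regardless of which task took it down.
        return self.count[i] == 0

s = SemaphoreSrcu()
s.down(0)                        # P0 enters with index 0
s.down(0)                        # P1 enters with the same index
s.up(0)                          # P0 exits
assert not s.gp_may_end(0)       # P1 still holds the section open
s.up(0)                          # P1 exits
assert s.gp_may_end(0)           # only now can the grace period end
```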
>
> In contrast, if the operational model of lock() and unlock() is a per-index
> and per-cpu count, then the x==1 case would be allowed, e.g., as follows
> (time from left to right, all processes happen in parallel):
> P0:                      < Rx1         >
> P1: <    Ry0               >
> P2:           y=1  < P0!     P1! > x=1
>
> here < and > mark the start and end of cs and gp, and Pn! is the time the gp
> realizes that Pn was not in a cs.
>
> Best wishes,
>
> jonas
>

2023-01-20 09:52:05

by Jonas Oberhauser

[permalink] [raw]
Subject: Re: Internal vs. external barriers (was: Re: Interesting LKMM litmus test)



On 1/20/2023 4:55 AM, Paul E. McKenney wrote:
> On Thu, Jan 19, 2023 at 02:51:53PM -0500, Alan Stern wrote:
>> On Thu, Jan 19, 2023 at 10:41:07AM -0800, Paul E. McKenney wrote:
>>> In contrast, this actually needs srcu_down_read() and srcu_up_read():
>>>
>>> ------------------------------------------------------------------------
>>>
>>> C C-srcu-nest-6
>>>
>>> (*
>>> * Result: Never
>>> *
>>> * Flag unbalanced-srcu-locking
>>> * This would be valid for srcu_down_read() and srcu_up_read().
>>> *)
>>>
>>> {}
>>>
>>> P0(int *x, int *y, struct srcu_struct *s1, int *idx)
>>> {
>>> int r2;
>>> int r3;
>>>
>>> r3 = srcu_down_read(s1);
>>> WRITE_ONCE(*idx, r3);
>>> r2 = READ_ONCE(*y);
>>> }
>>>
>>> P1(int *x, int *y, struct srcu_struct *s1, int *idx)
>>> {
>>> int r1;
>>> int r3;
>>>
>>> r1 = READ_ONCE(*x);
>>> r3 = READ_ONCE(*idx);
>>> srcu_up_read(s1, r3);
>>> }
>>>
>>> P2(int *x, int *y, struct srcu_struct *s1)
>>> {
>>> WRITE_ONCE(*y, 1);
>>> synchronize_srcu(s1);
>>> WRITE_ONCE(*x, 1);
>>> }
>>>
>>> locations [0:r1]
>>> exists (1:r1=1 /\ 0:r2=0)
>> I modified this litmus test by adding a flag variable with an
>> smp_store_release in P0, an smp_load_acquire in P1, and a filter clause
>> to ensure that P1 reads the flag and idx from P1.
>>
>> With the patch below, the results were as expected:
>>
>> Test C-srcu-nest-6 Allowed
>> States 3
>> 0:r1=0; 0:r2=0; 1:r1=0;
>> 0:r1=0; 0:r2=1; 1:r1=0;
>> 0:r1=0; 0:r2=1; 1:r1=1;
>> No
>> Witnesses
>> Positive: 0 Negative: 3
>> Condition exists (1:r1=1 /\ 0:r2=0)
>> Observation C-srcu-nest-6 Never 0 3
>> Time C-srcu-nest-6 0.04
>> Hash=2b010cf3446879fb84752a6016ff88c5
>>
>> It turns out that the idea of removing rf edges from Srcu-unlock events
>> doesn't work well. The missing edges mess up herd's calculation of the
>> fr relation and the coherence axiom. So I've gone back to filtering
>> those edges out of carry-dep.
>>
>> Also, Boqun's suggestion for flagging ordinary accesses to SRCU
>> structures no longer works, because the lock and unlock operations now
>> _are_ normal accesses. I removed that check too, but it shouldn't hurt
>> much because I don't expect to encounter litmus tests that try to read
>> or write srcu_structs directly.
>>
>> Alan
>>
>>
>>
>> Index: usb-devel/tools/memory-model/linux-kernel.bell
>> ===================================================================
>> --- usb-devel.orig/tools/memory-model/linux-kernel.bell
>> +++ usb-devel/tools/memory-model/linux-kernel.bell
>> @@ -53,38 +53,30 @@ let rcu-rscs = let rec
>> in matched
>>
>> (* Validate nesting *)
>> -flag ~empty Rcu-lock \ domain(rcu-rscs) as unbalanced-rcu-locking
>> -flag ~empty Rcu-unlock \ range(rcu-rscs) as unbalanced-rcu-locking
>> +flag ~empty Rcu-lock \ domain(rcu-rscs) as unbalanced-rcu-lock
>> +flag ~empty Rcu-unlock \ range(rcu-rscs) as unbalanced-rcu-unlock
>>
>> (* Compute matching pairs of nested Srcu-lock and Srcu-unlock *)
>> -let srcu-rscs = let rec
>> - unmatched-locks = Srcu-lock \ domain(matched)
>> - and unmatched-unlocks = Srcu-unlock \ range(matched)
>> - and unmatched = unmatched-locks | unmatched-unlocks
>> - and unmatched-po = ([unmatched] ; po ; [unmatched]) & loc
>> - and unmatched-locks-to-unlocks =
>> - ([unmatched-locks] ; po ; [unmatched-unlocks]) & loc
>> - and matched = matched | (unmatched-locks-to-unlocks \
>> - (unmatched-po ; unmatched-po))
>> - in matched
>> +let srcu-rscs = ([Srcu-lock] ; (data | rf)+ ; [Srcu-unlock]) & loc
>>
>> (* Validate nesting *)
>> -flag ~empty Srcu-lock \ domain(srcu-rscs) as unbalanced-srcu-locking
>> -flag ~empty Srcu-unlock \ range(srcu-rscs) as unbalanced-srcu-locking
>> +flag ~empty Srcu-lock \ domain(srcu-rscs) as unbalanced-srcu-lock
>> +flag ~empty Srcu-unlock \ range(srcu-rscs) as unbalanced-srcu-unlock
>> +flag ~empty (srcu-rscs^-1 ; srcu-rscs) \ id as multiple-srcu-matches
>>
>> (* Check for use of synchronize_srcu() inside an RCU critical section *)
>> flag ~empty rcu-rscs & (po ; [Sync-srcu] ; po) as invalid-sleep
>>
>> (* Validate SRCU dynamic match *)
>> -flag ~empty different-values(srcu-rscs) as srcu-bad-nesting
>> +flag ~empty different-values(srcu-rscs) as bad-srcu-value-match
>>
>> (* Compute marked and plain memory accesses *)
>> let Marked = (~M) | IW | Once | Release | Acquire | domain(rmw) | range(rmw) |
>> - LKR | LKW | UL | LF | RL | RU
>> + LKR | LKW | UL | LF | RL | RU | Srcu-lock | Srcu-unlock
>> let Plain = M \ Marked
>>
>> (* Redefine dependencies to include those carried through plain accesses *)
>> -let carry-dep = (data ; rfi)*
>> +let carry-dep = (data ; [~ Srcu-unlock] ; rfi)*
>> let addr = carry-dep ; addr
>> let ctrl = carry-dep ; ctrl
>> let data = carry-dep ; data
>> Index: usb-devel/tools/memory-model/linux-kernel.def
>> ===================================================================
>> --- usb-devel.orig/tools/memory-model/linux-kernel.def
>> +++ usb-devel/tools/memory-model/linux-kernel.def
>> @@ -49,8 +49,10 @@ synchronize_rcu() { __fence{sync-rcu}; }
>> synchronize_rcu_expedited() { __fence{sync-rcu}; }
>>
>> // SRCU
>> -srcu_read_lock(X) __srcu{srcu-lock}(X)
>> -srcu_read_unlock(X,Y) { __srcu{srcu-unlock}(X,Y); }
>> +srcu_read_lock(X) __load{srcu-lock}(*X)
>> +srcu_read_unlock(X,Y) { __store{srcu-unlock}(*X,Y); }
>> +srcu_down_read(X) __load{srcu-lock}(*X)
>> +srcu_up_read(X,Y) { __store{srcu-unlock}(*X,Y); }
>> synchronize_srcu(X) { __srcu{sync-srcu}(X); }
>> synchronize_srcu_expedited(X) { __srcu{sync-srcu}(X); }
> And for some initial tests:
>
> https://github.com/paulmckrcu/litmus/blob/master/manual/kernel/C-srcu-nest-1.litmus
>
> "Flag multiple-srcu-matches" but otherwise OK.
> As a "hail Mary" exercise, I used r4 for the second SRCU
> read-side critical section, but this had no effect.
> (This flag is expected and seen for #4 below.)

This is because srcu_lock/srcu_unlock are reads and writes, and so you
get the accidental rf relation here that I was talking about earlier.
In particular, your first lock() is linked by data ; rf ; data to the
second unlock(), which therefore seems to have data coming in from two
sources.
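To make the accidental match concrete, here is a small Python sketch (event labels L1/U1/L2/U2 and the toy edges are invented) that computes the transitive closure of data | rf over an event graph for two back-to-back critical sections; the first lock ends up related to the second unlock, so (data | rf)+ matches it to both unlocks:

```python
# Toy relational model: events are labels; relations are sets of (src, dst)
# pairs. L1/U1 and L2/U2 are two back-to-back SRCU critical sections on the
# same srcu_struct location (hypothetical labels, not herd syntax).

def trans_closure(rel):
    """Transitive closure of a relation given as a set of pairs."""
    closure = set(rel)
    changed = True
    while changed:
        changed = False
        for (a, b) in list(closure):
            for (c, d) in list(closure):
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure

data = {("L1", "U1"), ("L2", "U2")}   # cookie flows lock -> unlock
rf   = {("U1", "L2")}                 # accidental: L2's load reads U1's store

locks, unlocks = {"L1", "L2"}, {"U1", "U2"}
closure = trans_closure(data | rf)
srcu_rscs = {(l, u) for (l, u) in closure if l in locks and u in unlocks}

print(sorted(srcu_rscs))  # L1 "matches" both U1 and U2 -> multiple-srcu-matches
```

With the `[~ Srcu-unlock]` filter in place, the U1 → L2 rf edge can no longer extend a chain out of U1, and only the intended pairs remain.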

You would be better off moving the carry-dep/data definitions higher in
the file,

-let carry-dep = (data ; rfi)*
+let carry-dep = (data ; [~ Srcu-unlock] ; rfi)*
let addr = carry-dep ; addr
let ctrl = carry-dep ; ctrl
let data = carry-dep ; data

and then defining

+let srcu-rscs = ([Srcu-lock] ; data ; [Srcu-unlock]) & loc

Note here I'm just using the freshly redefined data, instead of the (data;rf)+


best wishes, jonas

2023-01-20 10:10:03

by Jonas Oberhauser

[permalink] [raw]
Subject: Re: Internal vs. external barriers (was: Re: Interesting LKMM litmus test)



On 1/19/2023 10:53 PM, Paul E. McKenney wrote:
> On Thu, Jan 19, 2023 at 02:51:53PM -0500, Alan Stern wrote:
>> On Thu, Jan 19, 2023 at 10:41:07AM -0800, Paul E. McKenney wrote:
>>> In contrast, this actually needs srcu_down_read() and srcu_up_read():
>>>
>>> ------------------------------------------------------------------------
>>>
>>> C C-srcu-nest-6
>>>
>>> (*
>>> * Result: Never
>>> *
>>> * Flag unbalanced-srcu-locking
>>> * This would be valid for srcu_down_read() and srcu_up_read().
>>> *)
>>>
>>> {}
>>>
>>> P0(int *x, int *y, struct srcu_struct *s1, int *idx)
>>> {
>>> int r2;
>>> int r3;
>>>
>>> r3 = srcu_down_read(s1);
>>> WRITE_ONCE(*idx, r3);
>>> r2 = READ_ONCE(*y);
>>> }
>>>
>>> P1(int *x, int *y, struct srcu_struct *s1, int *idx)
>>> {
>>> int r1;
>>> int r3;
>>>
>>> r1 = READ_ONCE(*x);
>>> r3 = READ_ONCE(*idx);
>>> srcu_up_read(s1, r3);
>>> }
>>>
>>> P2(int *x, int *y, struct srcu_struct *s1)
>>> {
>>> WRITE_ONCE(*y, 1);
>>> synchronize_srcu(s1);
>>> WRITE_ONCE(*x, 1);
>>> }
>>>
>>> locations [0:r1]
>>> exists (1:r1=1 /\ 0:r2=0)
>> I modified this litmus test by adding a flag variable with an
>> smp_store_release in P0, an smp_load_acquire in P1, and a filter clause
>> to ensure that P1 reads the flag and idx from P1.

This sounds like good style.
I suppose this is already flagged as mismatched srcu_unlock(), in case
you accidentally read from the initial write?

>> It turns out that the idea of removing rf edges from Srcu-unlock events
>> doesn't work well. The missing edges mess up herd's calculation of the
>> fr relation and the coherence axiom. So I've gone back to filtering
>> those edges out of carry-dep.
>>
>> Also, Boqun's suggestion for flagging ordinary accesses to SRCU
>> structures no longer works, because the lock and unlock operations now
>> _are_ normal accesses. I removed that check too, but it shouldn't hurt
>> much because I don't expect to encounter litmus tests that try to read
>> or write srcu_structs directly.
> Agreed. I for one would definitely have something to say about an
> SRCU-usage patch that directly manipulated a srcu_struct structure! ;-)

Wouldn't the point of having it flagged be that herd (or similar
tools) would have something to say long before it has to reach your
pair of eyes?
I don't think Boqun's patch is hard to repair.
Besides the issue you mention, I think it's also missing Sync-srcu,
which seems to be linked by loc based on its first argument.

How about something like this?

let ALL-LOCKS = LKR | LKW | UL | LF | RU | Srcu-lock | Srcu-unlock | Sync-srcu
flag ~empty [~(ALL-LOCKS | IW)] ; loc ; [ALL-LOCKS] as mixed-lock-accesses

If you're using something that isn't a lock or initial write on the same location as a lock, you get the flag.
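As a sanity check of the intended semantics (not of the cat syntax), here is a toy Python model of that flag; the event names and locations are invented. Note that lock events themselves never appear on the left-hand side, so they do not self-trigger:

```python
# Hypothetical sketch of the proposed mixed-lock-accesses flag: events carry a
# location; the flag fires when a non-lock, non-initial-write event shares a
# location with any lock-related event.

events = {            # event -> location (toy test, names are made up)
    "IW_s": "s",      # initial write of the srcu_struct
    "L1":   "s",      # srcu_read_lock on s
    "U1":   "s",      # srcu_read_unlock on s
    "Wx":   "x",      # ordinary write to an unrelated variable
}
ALL_LOCKS = {"L1", "U1"}
IW = {"IW_s"}

def mixed_lock_accesses(events):
    # [~(ALL-LOCKS | IW)] ; loc ; [ALL-LOCKS], over the toy event set
    return {(a, b) for a in events for b in events
            if a not in ALL_LOCKS | IW and b in ALL_LOCKS
            and events[a] == events[b]}

assert not mixed_lock_accesses(events)   # only locks/IW touch s: no flag
events["Rs"] = "s"                       # add a plain READ_ONCE(*s)
assert mixed_lock_accesses(events)       # now the flag fires
```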

Best wishes,
jonas

2023-01-20 10:26:57

by Jonas Oberhauser

[permalink] [raw]
Subject: Re: Internal vs. external barriers (was: Re: Interesting LKMM litmus test)



On 1/19/2023 7:41 PM, Paul E. McKenney wrote:
> On Thu, Jan 19, 2023 at 02:39:01PM +0100, Jonas Oberhauser wrote:
>>
>> On 1/19/2023 1:11 AM, Paul E. McKenney wrote:
>>> On Wed, Jan 18, 2023 at 10:24:50PM +0100, Jonas Oberhauser wrote:
>>>> What I was thinking of is more something like this:
>>>>
>>>> P0{
>>>>    idx1 = srcu_down(&ss);
>>>>    srcu_up(&ss,idx1);
>>>> }
>>>>
>>>> P1{
>>>>    idx2 = srcu_down(&ss);
>>>>    srcu_up(&ss,idx2)
>>>> }
>>> And srcu_read_lock() and srcu_read_unlock() already do this.
>> I think I left out too much from my example.
>> And filling in the details led me down a bit of a rabbit hole of confusion
>> for a while.
>> But here's what I ended up with:
>>
>>
>> P0{
>>     idx1 = srcu_down(&ss);
>>     store_rel(p1, true);
>>
>>
>>     shared cs
>>
>>     R x == ?
>>
>>     while (! load_acq(p2));
>>     R idx2 == idx1 // for some reason, we got lucky!
>>     srcu_up(&ss,idx1);
> Although the current Linux-kernel implementation happens to be fine with
> this sort of abuse, I am quite happy to tell people "Don't do that!"
> And you can do this with srcu_read_lock() and srcu_read_unlock().
> In contrast, this actually needs srcu_down_read() and srcu_up_read():

My point/clarification request wasn't about whether you could write that
code with read_lock() and read_unlock(), but what it would/should mean
for the operational and axiomatic models.
As I wrote later in the mail, for the operational model it is quite
clear that x==1 should be allowed for lock() and unlock(), but would
probably be forbidden for down() and up().
My clarification request is whether that difference in the probable
operational model should be reflected in the axiomatic model (as I first
suspected based on the word "semaphore" being dropped a lot), or whether
it's just due to abuse (i.e., yes the axiomatic model and operational
model might be different here, but you're not allowed to look).
Which brings us to the next point:

> Could you please review the remainder to see what remains given the
> usage restrictions that I called out above?

Perhaps we could say that reading an index without using it later is
forbidden?

flag ~empty [Srcu-lock];data;rf;[~ domain(data;[Srcu-unlock])] as
thrown-srcu-cookie-on-floor

So if there is an srcu_down() that produces a cookie that is read by
some read R, and R doesn't then pass that value into an srcu_up(), the
srcu-warranty is voided.
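A rough Python sketch of this check on an invented event graph for the C-srcu-nest-6 pattern (labels D0/Widx/Ridx/U1 are hypothetical): the flag stays empty while the cookie read feeds an srcu_up(), and fires once that data edge is removed:

```python
# Hedged sketch of the thrown-srcu-cookie-on-floor check: find reads that
# obtain a cookie from an srcu lock (via data ; rf) but never feed it into an
# srcu unlock (no outgoing data edge to an Srcu-unlock event).

srcu_lock   = {"D0"}            # srcu_down in P0 (hypothetical labels)
srcu_unlock = {"U1"}            # srcu_up in P1
data = {("D0", "Widx"),         # cookie stored to *idx
        ("Ridx", "U1")}         # cookie passed to srcu_up
rf   = {("Widx", "Ridx")}       # P1 reads the cookie

def cookie_on_floor(data, rf):
    # [Srcu-lock] ; data ; rf ; [~ domain(data ; [Srcu-unlock])]
    feeds_unlock = {a for (a, b) in data if b in srcu_unlock}
    return {(l, r) for (l, w) in data if l in srcu_lock
            for (w2, r) in rf if w2 == w and r not in feeds_unlock}

assert not cookie_on_floor(data, rf)    # cookie is used: no flag
data_bad = {("D0", "Widx")}             # P1 reads idx but never up()s
assert cookie_on_floor(data_bad, rf)    # flag: cookie thrown on the floor
```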

Perhaps it would also be good to add special tags for Srcu-down and
Srcu-up to avoid collisions.

always have fun, jonas

2023-01-20 12:39:11

by Jonas Oberhauser

[permalink] [raw]
Subject: Re: Internal vs. external barriers (was: Re: Interesting LKMM litmus test)

I just realized I made a mistake in my earlier response to this message;
you still need the rf for passing the cookie across threads.
Perhaps it's better to just also exclude srcu_unlock type events
explicitly here.

+let srcu-rscs = ([Srcu-lock] ; (data ; [~ Srcu-unlock] ; rf)+ ;
[Srcu-unlock]) & loc


best wishes,
jonas

On 1/20/2023 4:55 AM, Paul E. McKenney wrote:
> On Thu, Jan 19, 2023 at 02:51:53PM -0500, Alan Stern wrote:
>> On Thu, Jan 19, 2023 at 10:41:07AM -0800, Paul E. McKenney wrote:
>>> In contrast, this actually needs srcu_down_read() and srcu_up_read():
>>>
>>> ------------------------------------------------------------------------
>>>
>>> C C-srcu-nest-6
>>>
>>> (*
>>> * Result: Never
>>> *
>>> * Flag unbalanced-srcu-locking
>>> * This would be valid for srcu_down_read() and srcu_up_read().
>>> *)
>>>
>>> {}
>>>
>>> P0(int *x, int *y, struct srcu_struct *s1, int *idx)
>>> {
>>> int r2;
>>> int r3;
>>>
>>> r3 = srcu_down_read(s1);
>>> WRITE_ONCE(*idx, r3);
>>> r2 = READ_ONCE(*y);
>>> }
>>>
>>> P1(int *x, int *y, struct srcu_struct *s1, int *idx)
>>> {
>>> int r1;
>>> int r3;
>>>
>>> r1 = READ_ONCE(*x);
>>> r3 = READ_ONCE(*idx);
>>> srcu_up_read(s1, r3);
>>> }
>>>
>>> P2(int *x, int *y, struct srcu_struct *s1)
>>> {
>>> WRITE_ONCE(*y, 1);
>>> synchronize_srcu(s1);
>>> WRITE_ONCE(*x, 1);
>>> }
>>>
>>> locations [0:r1]
>>> exists (1:r1=1 /\ 0:r2=0)
>> I modified this litmus test by adding a flag variable with an
>> smp_store_release in P0, an smp_load_acquire in P1, and a filter clause
>> to ensure that P1 reads the flag and idx from P1.
>>
>> With the patch below, the results were as expected:
>>
>> Test C-srcu-nest-6 Allowed
>> States 3
>> 0:r1=0; 0:r2=0; 1:r1=0;
>> 0:r1=0; 0:r2=1; 1:r1=0;
>> 0:r1=0; 0:r2=1; 1:r1=1;
>> No
>> Witnesses
>> Positive: 0 Negative: 3
>> Condition exists (1:r1=1 /\ 0:r2=0)
>> Observation C-srcu-nest-6 Never 0 3
>> Time C-srcu-nest-6 0.04
>> Hash=2b010cf3446879fb84752a6016ff88c5
>>
>> It turns out that the idea of removing rf edges from Srcu-unlock events
>> doesn't work well. The missing edges mess up herd's calculation of the
>> fr relation and the coherence axiom. So I've gone back to filtering
>> those edges out of carry-dep.
>>
>> Also, Boqun's suggestion for flagging ordinary accesses to SRCU
>> structures no longer works, because the lock and unlock operations now
>> _are_ normal accesses. I removed that check too, but it shouldn't hurt
>> much because I don't expect to encounter litmus tests that try to read
>> or write srcu_structs directly.
>>
>> Alan
>>
>>
>>
>> Index: usb-devel/tools/memory-model/linux-kernel.bell
>> ===================================================================
>> --- usb-devel.orig/tools/memory-model/linux-kernel.bell
>> +++ usb-devel/tools/memory-model/linux-kernel.bell
>> @@ -53,38 +53,30 @@ let rcu-rscs = let rec
>> in matched
>>
>> (* Validate nesting *)
>> -flag ~empty Rcu-lock \ domain(rcu-rscs) as unbalanced-rcu-locking
>> -flag ~empty Rcu-unlock \ range(rcu-rscs) as unbalanced-rcu-locking
>> +flag ~empty Rcu-lock \ domain(rcu-rscs) as unbalanced-rcu-lock
>> +flag ~empty Rcu-unlock \ range(rcu-rscs) as unbalanced-rcu-unlock
>>
>> (* Compute matching pairs of nested Srcu-lock and Srcu-unlock *)
>> -let srcu-rscs = let rec
>> - unmatched-locks = Srcu-lock \ domain(matched)
>> - and unmatched-unlocks = Srcu-unlock \ range(matched)
>> - and unmatched = unmatched-locks | unmatched-unlocks
>> - and unmatched-po = ([unmatched] ; po ; [unmatched]) & loc
>> - and unmatched-locks-to-unlocks =
>> - ([unmatched-locks] ; po ; [unmatched-unlocks]) & loc
>> - and matched = matched | (unmatched-locks-to-unlocks \
>> - (unmatched-po ; unmatched-po))
>> - in matched
>> +let srcu-rscs = ([Srcu-lock] ; (data | rf)+ ; [Srcu-unlock]) & loc
>>
>> (* Validate nesting *)
>> -flag ~empty Srcu-lock \ domain(srcu-rscs) as unbalanced-srcu-locking
>> -flag ~empty Srcu-unlock \ range(srcu-rscs) as unbalanced-srcu-locking
>> +flag ~empty Srcu-lock \ domain(srcu-rscs) as unbalanced-srcu-lock
>> +flag ~empty Srcu-unlock \ range(srcu-rscs) as unbalanced-srcu-unlock
>> +flag ~empty (srcu-rscs^-1 ; srcu-rscs) \ id as multiple-srcu-matches
>>
>> (* Check for use of synchronize_srcu() inside an RCU critical section *)
>> flag ~empty rcu-rscs & (po ; [Sync-srcu] ; po) as invalid-sleep
>>
>> (* Validate SRCU dynamic match *)
>> -flag ~empty different-values(srcu-rscs) as srcu-bad-nesting
>> +flag ~empty different-values(srcu-rscs) as bad-srcu-value-match
>>
>> (* Compute marked and plain memory accesses *)
>> let Marked = (~M) | IW | Once | Release | Acquire | domain(rmw) | range(rmw) |
>> - LKR | LKW | UL | LF | RL | RU
>> + LKR | LKW | UL | LF | RL | RU | Srcu-lock | Srcu-unlock
>> let Plain = M \ Marked
>>
>> (* Redefine dependencies to include those carried through plain accesses *)
>> -let carry-dep = (data ; rfi)*
>> +let carry-dep = (data ; [~ Srcu-unlock] ; rfi)*
>> let addr = carry-dep ; addr
>> let ctrl = carry-dep ; ctrl
>> let data = carry-dep ; data
>> Index: usb-devel/tools/memory-model/linux-kernel.def
>> ===================================================================
>> --- usb-devel.orig/tools/memory-model/linux-kernel.def
>> +++ usb-devel/tools/memory-model/linux-kernel.def
>> @@ -49,8 +49,10 @@ synchronize_rcu() { __fence{sync-rcu}; }
>> synchronize_rcu_expedited() { __fence{sync-rcu}; }
>>
>> // SRCU
>> -srcu_read_lock(X) __srcu{srcu-lock}(X)
>> -srcu_read_unlock(X,Y) { __srcu{srcu-unlock}(X,Y); }
>> +srcu_read_lock(X) __load{srcu-lock}(*X)
>> +srcu_read_unlock(X,Y) { __store{srcu-unlock}(*X,Y); }
>> +srcu_down_read(X) __load{srcu-lock}(*X)
>> +srcu_up_read(X,Y) { __store{srcu-unlock}(*X,Y); }
>> synchronize_srcu(X) { __srcu{sync-srcu}(X); }
>> synchronize_srcu_expedited(X) { __srcu{sync-srcu}(X); }
> And for some initial tests:
>
> https://github.com/paulmckrcu/litmus/blob/master/manual/kernel/C-srcu-nest-1.litmus
>
> "Flag multiple-srcu-matches" but otherwise OK.
> As a "hail Mary" exercise, I used r4 for the second SRCU
> read-side critical section, but this had no effect.
> (This flag is expected and seen for #4 below.)
>
> https://github.com/paulmckrcu/litmus/blob/master/manual/kernel/C-srcu-nest-2.litmus
> https://github.com/paulmckrcu/litmus/blob/master/manual/kernel/C-srcu-nest-3.litmus
> https://github.com/paulmckrcu/litmus/blob/master/manual/kernel/C-srcu-nest-4.litmus
> https://github.com/paulmckrcu/litmus/blob/master/manual/kernel/C-srcu-nest-5.litmus
>
> All as expected.
>
> https://github.com/paulmckrcu/litmus/blob/master/manual/kernel/C-srcu-nest-6.litmus
>
> Get "Flag unbalanced-srcu-lock" and "Flag unbalanced-srcu-unlock",
> but this is srcu_down_read() and srcu_up_read(), where this should
> be OK. Ah, but I need to do the release/acquire/filter trick. Once
> I did that, it works as expected.
>
> https://github.com/paulmckrcu/litmus/blob/master/manual/kernel/C-srcu-nest-7.litmus
> https://github.com/paulmckrcu/litmus/blob/master/manual/kernel/C-srcu-nest-8.litmus
>
> Both as expected.
>
> Getting there!!!
>
> I also started a regression test, hopefully without pilot error. :-/
>
> Thanx, Paul

2023-01-20 13:19:25

by Jonas Oberhauser

[permalink] [raw]
Subject: Re: Internal vs. external barriers (was: Re: Interesting LKMM litmus test)

I'm not going to get it right today, am I?

+let srcu-rscs = ([Srcu-lock] ; (data ; [~ Srcu-unlock] ; rfe)* ; data
; [Srcu-unlock]) & loc

I see now that I copied the format from your message but without
realizing the original had a `|` where I have a `;`.
I hope this version is finally right and perhaps more natural than the
(data | rf) version, considering rf can't actually appear in most places
and this more closely matches carry-dep;data.
But of course feel free to use
+let srcu-rscs = ([Srcu-lock] ; (data | [~ Srcu-unlock] ; rf)+ ;
[Srcu-unlock]) & loc
instead if you prefer.

have fun, jonas


On 1/20/2023 1:34 PM, Jonas Oberhauser wrote:
> I just realized I made a mistake in my earlier response to this
> message; you still need the rf for passing the cookie across threads.
> Perhaps it's better to just also exclude srcu_unlock type events
> explicitly here.
>
> +let srcu-rscs = ([Srcu-lock] ; (data ; [~ Srcu-unlock] ; rf)+ ;
> [Srcu-unlock]) & loc
>
>
> best wishes,
> jonas
>
> On 1/20/2023 4:55 AM, Paul E. McKenney wrote:
>> On Thu, Jan 19, 2023 at 02:51:53PM -0500, Alan Stern wrote:
>>> On Thu, Jan 19, 2023 at 10:41:07AM -0800, Paul E. McKenney wrote:
>>>> In contrast, this actually needs srcu_down_read() and srcu_up_read():
>>>>
>>>> ------------------------------------------------------------------------
>>>>
>>>>
>>>> C C-srcu-nest-6
>>>>
>>>> (*
>>>> * Result: Never
>>>> *
>>>> * Flag unbalanced-srcu-locking
>>>> * This would be valid for srcu_down_read() and srcu_up_read().
>>>> *)
>>>>
>>>> {}
>>>>
>>>> P0(int *x, int *y, struct srcu_struct *s1, int *idx)
>>>> {
>>>> int r2;
>>>> int r3;
>>>>
>>>> r3 = srcu_down_read(s1);
>>>> WRITE_ONCE(*idx, r3);
>>>> r2 = READ_ONCE(*y);
>>>> }
>>>>
>>>> P1(int *x, int *y, struct srcu_struct *s1, int *idx)
>>>> {
>>>> int r1;
>>>> int r3;
>>>>
>>>> r1 = READ_ONCE(*x);
>>>> r3 = READ_ONCE(*idx);
>>>> srcu_up_read(s1, r3);
>>>> }
>>>>
>>>> P2(int *x, int *y, struct srcu_struct *s1)
>>>> {
>>>> WRITE_ONCE(*y, 1);
>>>> synchronize_srcu(s1);
>>>> WRITE_ONCE(*x, 1);
>>>> }
>>>>
>>>> locations [0:r1]
>>>> exists (1:r1=1 /\ 0:r2=0)
>>> I modified this litmus test by adding a flag variable with an
>>> smp_store_release in P0, an smp_load_acquire in P1, and a filter clause
>>> to ensure that P1 reads the flag and idx from P1.
>>>
>>> With the patch below, the results were as expected:
>>>
>>> Test C-srcu-nest-6 Allowed
>>> States 3
>>> 0:r1=0; 0:r2=0; 1:r1=0;
>>> 0:r1=0; 0:r2=1; 1:r1=0;
>>> 0:r1=0; 0:r2=1; 1:r1=1;
>>> No
>>> Witnesses
>>> Positive: 0 Negative: 3
>>> Condition exists (1:r1=1 /\ 0:r2=0)
>>> Observation C-srcu-nest-6 Never 0 3
>>> Time C-srcu-nest-6 0.04
>>> Hash=2b010cf3446879fb84752a6016ff88c5
>>>
>>> It turns out that the idea of removing rf edges from Srcu-unlock events
>>> doesn't work well. The missing edges mess up herd's calculation of the
>>> fr relation and the coherence axiom. So I've gone back to filtering
>>> those edges out of carry-dep.
>>>
>>> Also, Boqun's suggestion for flagging ordinary accesses to SRCU
>>> structures no longer works, because the lock and unlock operations now
>>> _are_ normal accesses. I removed that check too, but it shouldn't hurt
>>> much because I don't expect to encounter litmus tests that try to read
>>> or write srcu_structs directly.
>>>
>>> Alan
>>>
>>>
>>>
>>> Index: usb-devel/tools/memory-model/linux-kernel.bell
>>> ===================================================================
>>> --- usb-devel.orig/tools/memory-model/linux-kernel.bell
>>> +++ usb-devel/tools/memory-model/linux-kernel.bell
>>> @@ -53,38 +53,30 @@ let rcu-rscs = let rec
>>> in matched
>>>
>>> (* Validate nesting *)
>>> -flag ~empty Rcu-lock \ domain(rcu-rscs) as unbalanced-rcu-locking
>>> -flag ~empty Rcu-unlock \ range(rcu-rscs) as unbalanced-rcu-locking
>>> +flag ~empty Rcu-lock \ domain(rcu-rscs) as unbalanced-rcu-lock
>>> +flag ~empty Rcu-unlock \ range(rcu-rscs) as unbalanced-rcu-unlock
>>>
>>> (* Compute matching pairs of nested Srcu-lock and Srcu-unlock *)
>>> -let srcu-rscs = let rec
>>> - unmatched-locks = Srcu-lock \ domain(matched)
>>> - and unmatched-unlocks = Srcu-unlock \ range(matched)
>>> - and unmatched = unmatched-locks | unmatched-unlocks
>>> - and unmatched-po = ([unmatched] ; po ; [unmatched]) & loc
>>> - and unmatched-locks-to-unlocks =
>>> - ([unmatched-locks] ; po ; [unmatched-unlocks]) & loc
>>> - and matched = matched | (unmatched-locks-to-unlocks \
>>> - (unmatched-po ; unmatched-po))
>>> - in matched
>>> +let srcu-rscs = ([Srcu-lock] ; (data | rf)+ ; [Srcu-unlock]) & loc
>>>
>>> (* Validate nesting *)
>>> -flag ~empty Srcu-lock \ domain(srcu-rscs) as unbalanced-srcu-locking
>>> -flag ~empty Srcu-unlock \ range(srcu-rscs) as unbalanced-srcu-locking
>>> +flag ~empty Srcu-lock \ domain(srcu-rscs) as unbalanced-srcu-lock
>>> +flag ~empty Srcu-unlock \ range(srcu-rscs) as unbalanced-srcu-unlock
>>> +flag ~empty (srcu-rscs^-1 ; srcu-rscs) \ id as multiple-srcu-matches
>>>
>>> (* Check for use of synchronize_srcu() inside an RCU critical section *)
>>> flag ~empty rcu-rscs & (po ; [Sync-srcu] ; po) as invalid-sleep
>>>
>>> (* Validate SRCU dynamic match *)
>>> -flag ~empty different-values(srcu-rscs) as srcu-bad-nesting
>>> +flag ~empty different-values(srcu-rscs) as bad-srcu-value-match
>>>
>>> (* Compute marked and plain memory accesses *)
>>> let Marked = (~M) | IW | Once | Release | Acquire | domain(rmw) | range(rmw) |
>>> - LKR | LKW | UL | LF | RL | RU
>>> + LKR | LKW | UL | LF | RL | RU | Srcu-lock | Srcu-unlock
>>> let Plain = M \ Marked
>>>
>>> (* Redefine dependencies to include those carried through plain accesses *)
>>> -let carry-dep = (data ; rfi)*
>>> +let carry-dep = (data ; [~ Srcu-unlock] ; rfi)*
>>> let addr = carry-dep ; addr
>>> let ctrl = carry-dep ; ctrl
>>> let data = carry-dep ; data
>>> Index: usb-devel/tools/memory-model/linux-kernel.def
>>> ===================================================================
>>> --- usb-devel.orig/tools/memory-model/linux-kernel.def
>>> +++ usb-devel/tools/memory-model/linux-kernel.def
>>> @@ -49,8 +49,10 @@ synchronize_rcu() { __fence{sync-rcu}; }
>>> synchronize_rcu_expedited() { __fence{sync-rcu}; }
>>>
>>> // SRCU
>>> -srcu_read_lock(X) __srcu{srcu-lock}(X)
>>> -srcu_read_unlock(X,Y) { __srcu{srcu-unlock}(X,Y); }
>>> +srcu_read_lock(X) __load{srcu-lock}(*X)
>>> +srcu_read_unlock(X,Y) { __store{srcu-unlock}(*X,Y); }
>>> +srcu_down_read(X) __load{srcu-lock}(*X)
>>> +srcu_up_read(X,Y) { __store{srcu-unlock}(*X,Y); }
>>> synchronize_srcu(X) { __srcu{sync-srcu}(X); }
>>> synchronize_srcu_expedited(X) { __srcu{sync-srcu}(X); }
>> And for some initial tests:
>>
>> https://github.com/paulmckrcu/litmus/blob/master/manual/kernel/C-srcu-nest-1.litmus
>>
>>
>> Ā Ā Ā Ā "Flag multiple-srcu-matches" but otherwise OK.
>> Ā Ā Ā Ā As a "hail Mary" exercise, I used r4 for the second SRCU
>> Ā Ā Ā Ā read-side critical section, but this had no effect.
>> Ā Ā Ā Ā (This flag is expected and seen for #4 below.)
>>
>> https://github.com/paulmckrcu/litmus/blob/master/manual/kernel/C-srcu-nest-2.litmus
>>
>> https://github.com/paulmckrcu/litmus/blob/master/manual/kernel/C-srcu-nest-3.litmus
>>
>> https://github.com/paulmckrcu/litmus/blob/master/manual/kernel/C-srcu-nest-4.litmus
>>
>> https://github.com/paulmckrcu/litmus/blob/master/manual/kernel/C-srcu-nest-5.litmus
>>
>>
>> All as expected.
>>
>> https://github.com/paulmckrcu/litmus/blob/master/manual/kernel/C-srcu-nest-6.litmus
>>
>>
>> Ā Ā Ā Ā Get "Flag unbalanced-srcu-lock" and "Flag unbalanced-srcu-unlock",
>> Ā Ā Ā Ā but this is srcu_down_read() and srcu_up_read(), where this should
>> Ā Ā Ā Ā be OK.Ā Ā Ā  Ah, but I need to do the release/acquire/filter trick.Ā 
>> Once
>> Ā Ā Ā Ā I did that, it works as expected.
>>
>> https://github.com/paulmckrcu/litmus/blob/master/manual/kernel/C-srcu-nest-7.litmus
>>
>> https://github.com/paulmckrcu/litmus/blob/master/manual/kernel/C-srcu-nest-8.litmus
>>
>>
>> Both as expected.
>>
>> Getting there!!!
>>
>> I also started a regression test, hopefully without pilot error. :-/
>>
>>                         Thanx, Paul
>

2023-01-20 16:10:08

by Paul E. McKenney

[permalink] [raw]
Subject: Re: Internal vs. external barriers (was: Re: Interesting LKMM litmus test)

On Fri, Jan 20, 2023 at 10:43:10AM +0100, Jonas Oberhauser wrote:
>
>
> On 1/19/2023 10:53 PM, Paul E. McKenney wrote:
> > On Thu, Jan 19, 2023 at 02:51:53PM -0500, Alan Stern wrote:
> > > On Thu, Jan 19, 2023 at 10:41:07AM -0800, Paul E. McKenney wrote:
> > > > In contrast, this actually needs srcu_down_read() and srcu_up_read():
> > > >
> > > > ------------------------------------------------------------------------
> > > >
> > > > C C-srcu-nest-6
> > > >
> > > > (*
> > > > * Result: Never
> > > > *
> > > > * Flag unbalanced-srcu-locking
> > > > * This would be valid for srcu_down_read() and srcu_up_read().
> > > > *)
> > > >
> > > > {}
> > > >
> > > > P0(int *x, int *y, struct srcu_struct *s1, int *idx)
> > > > {
> > > > int r2;
> > > > int r3;
> > > >
> > > > r3 = srcu_down_read(s1);
> > > > WRITE_ONCE(*idx, r3);
> > > > r2 = READ_ONCE(*y);
> > > > }
> > > >
> > > > P1(int *x, int *y, struct srcu_struct *s1, int *idx)
> > > > {
> > > > int r1;
> > > > int r3;
> > > >
> > > > r1 = READ_ONCE(*x);
> > > > r3 = READ_ONCE(*idx);
> > > > srcu_up_read(s1, r3);
> > > > }
> > > >
> > > > P2(int *x, int *y, struct srcu_struct *s1)
> > > > {
> > > > WRITE_ONCE(*y, 1);
> > > > synchronize_srcu(s1);
> > > > WRITE_ONCE(*x, 1);
> > > > }
> > > >
> > > > locations [0:r1]
> > > > exists (1:r1=1 /\ 0:r2=0)
> > > I modified this litmus test by adding a flag variable with an
> > > smp_store_release in P0, an smp_load_acquire in P1, and a filter clause
> > > to ensure that P1 reads the flag and idx from P1.
>
> This sounds like good style.
> I suppose this is already flagged as mismatched srcu_unlock(), in case you
> accidentally read from the initial write?

It might, except that a filter clause excludes this case. Here is the
updated test:

C C-srcu-nest-6

(*
* Result: Never
*
* This would be valid for srcu_down_read() and srcu_up_read().
*)

{}

P0(int *x, int *y, struct srcu_struct *s1, int *idx, int *f)
{
int r2;
int r3;

r3 = srcu_down_read(s1);
WRITE_ONCE(*idx, r3);
r2 = READ_ONCE(*y);
smp_store_release(f, 1);
}

P1(int *x, int *y, struct srcu_struct *s1, int *idx, int *f)
{
int r1;
int r3;
int r4;

r4 = smp_load_acquire(f);
r1 = READ_ONCE(*x);
r3 = READ_ONCE(*idx);
srcu_up_read(s1, r3);
}

P2(int *x, int *y, struct srcu_struct *s1)
{
WRITE_ONCE(*y, 1);
synchronize_srcu(s1);
WRITE_ONCE(*x, 1);
}

locations [0:r1]
filter (1:r4=1)
exists (1:r1=1 /\ 0:r2=0)

> > > It turns out that the idea of removing rf edges from Srcu-unlock events
> > > doesn't work well. The missing edges mess up herd's calculation of the
> > > fr relation and the coherence axiom. So I've gone back to filtering
> > > those edges out of carry-dep.
> > >
> > > Also, Boqun's suggestion for flagging ordinary accesses to SRCU
> > > structures no longer works, because the lock and unlock operations now
> > > _are_ normal accesses. I removed that check too, but it shouldn't hurt
> > > much because I don't expect to encounter litmus tests that try to read
> > > or write srcu_structs directly.
> > Agreed. I for one would definitely have something to say about an
> > SRCU-usage patch that directly manipulated a srcu_struct structure! ;-)
>
> Wouldn't the point of having it flagged be that herd (or similar
> tools) would have something to say long before it has to reach your
> pair of eyes?

That would of course be even better.

> I don't think Boqun's patch is hard to repair.
> Besides the issue you mention, I think it's also missing Sync-srcu, which
> seems to be linked by loc based on its first argument.
>
> How about something like this?
>
> let ALL-LOCKS = LKR | LKW | UL | LF | RU | Srcu-lock | Srcu-unlock | Sync-srcu
> flag ~empty [~(ALL-LOCKS | IW)] ; loc ; [ALL-LOCKS] as mixed-lock-accesses
>
> If you're using something that isn't a lock or initial write on the same location as a lock, you get the flag.

Wouldn't that unconditionally complain about the first srcu_read_lock()
in a given process? Or am I misreading those statements?

Thanx, Paul

2023-01-20 16:14:23

by Paul E. McKenney

[permalink] [raw]
Subject: Re: Internal vs. external barriers (was: Re: Interesting LKMM litmus test)

On Fri, Jan 20, 2023 at 11:13:00AM +0100, Jonas Oberhauser wrote:
>
>
> On 1/19/2023 7:41 PM, Paul E. McKenney wrote:
> > On Thu, Jan 19, 2023 at 02:39:01PM +0100, Jonas Oberhauser wrote:
> > >
> > > On 1/19/2023 1:11 AM, Paul E. McKenney wrote:
> > > > On Wed, Jan 18, 2023 at 10:24:50PM +0100, Jonas Oberhauser wrote:
> > > > > What I was thinking of is more something like this:
> > > > >
> > > > > P0{
> > > > > 	idx1 = srcu_down(&ss);
> > > > > 	srcu_up(&ss,idx1);
> > > > > }
> > > > >
> > > > > P1{
> > > > > 	idx2 = srcu_down(&ss);
> > > > > 	srcu_up(&ss,idx2);
> > > > > }
> > > > And srcu_read_lock() and srcu_read_unlock() already do this.
> > > I think I left out too much from my example.
> > > And filling in the details led me down a bit of a rabbit hole of confusion
> > > for a while.
> > > But here's what I ended up with:
> > >
> > >
> > > P0{
> > > 	idx1 = srcu_down(&ss);
> > > 	store_rel(p1, true);
> > >
> > >
> > > 	shared cs
> > >
> > > 	R x == ?
> > >
> > > 	while (! load_acq(p2));
> > > 	R idx2 == idx1 // for some reason, we got lucky!
> > > 	srcu_up(&ss,idx1);
> > Although the current Linux-kernel implementation happens to be fine with
> > this sort of abuse, I am quite happy to tell people "Don't do that!"
> > And you can do this with srcu_read_lock() and srcu_read_unlock().
> > In contrast, this actually needs srcu_down_read() and srcu_up_read():
>
> My point/clarification request wasn't about whether you could write that
> code with read_lock() and read_unlock(), but what it would/should mean for
> the operational and axiomatic models.
> As I wrote later in the mail, for the operational model it is quite clear
> that x==1 should be allowed for lock() and unlock(), but would probably be
> forbidden for down() and up().

Agreed, the math might say something or another about doing something
with the srcu_read_lock() or srcu_down_read() return values (other than
passing them to srcu_read_unlock() or srcu_up_read(), respectively),
but such somethings are excluded by convention.

So it would be nice for LKMM to complain about such abuse, but not
at all mandatory.

> My clarification request is whether that difference in the probable
> operational model should be reflected in the axiomatic model (as I first
> suspected based on the word "semaphore" being dropped a lot), or whether
> it's just due to abuse (i.e., yes the axiomatic model and operational model
> might be different here, but you're not allowed to look).

For the moment, I am taking the door labeled "abuse".

Maybe someday someone will come up with a valid use case, but they have
to prove it first. ;-)

> Which brings us to the next point:
>
> > Could you please review the remainder to see what remains given the
> > usage restrictions that I called out above?
>
> Perhaps we could say that reading an index without using it later is
> forbidden?
>
> flag ~empty [Srcu-lock];data;rf;[~ domain(data;[Srcu-unlock])] as
> thrown-srcu-cookie-on-floor
>
> So if there is an srcu_down() that produces a cookie that is read by some
> read R, and R doesn't then pass that value into an srcu_up(), the
> srcu-warranty is voided.

I like the general idea, but I am dazed and confused by this "flag"
statement.

> Perhaps it would also be good to add special tags for Srcu-down and Srcu-up
> to avoid collisions.

Ah, separate down/up tags could make this "flag" statement at least
somewhat less dazing and confusing.

> always have fun, jonas

Always do! ;-)

Thanx, Paul
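[Editorial aside: for readers not fluent in cat's relational algebra, the "thrown-srcu-cookie-on-floor" flag Jonas proposes above can be approximated with ordinary Python sets. The event names and edges below are invented purely for illustration; they are not part of any real litmus test or of the herd model itself.]

```python
def compose(r1, r2):
    """Relational composition r1 ; r2."""
    return {(a, c) for (a, b) in r1 for (b2, c) in r2 if b == b2}

def ident(s):
    """[S]: the identity relation restricted to the set s."""
    return {(x, x) for x in s}

def domain(r):
    return {a for (a, _) in r}

# Toy execution: lock L produces a cookie that flows into store W (data),
# read R1 picks it up in another thread (rf), but R1 never feeds it into
# an srcu_unlock -- the cookie is "thrown on the floor".
events = {"L", "W", "R1"}
srcu_lock = {"L"}
srcu_unlock = set()          # no unlock consumes the cookie
data = {("L", "W")}
rf = {("W", "R1")}

# [Srcu-lock] ; data ; rf ; [~ domain(data ; [Srcu-unlock])]
feeds_unlock = domain(compose(data, ident(srcu_unlock)))
flagged = compose(compose(compose(ident(srcu_lock), data), rf),
                  ident(events - feeds_unlock))
print(flagged)               # non-empty, so the flag would fire
```

Since no read here passes the cookie on to an unlock, the composed relation is non-empty and the flag fires, which is exactly the "voided warranty" case Jonas describes.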

2023-01-20 16:14:58

by Paul E. McKenney

[permalink] [raw]
Subject: Re: Internal vs. external barriers (was: Re: Interesting LKMM litmus test)

On Fri, Jan 20, 2023 at 01:51:01PM +0100, Jonas Oberhauser wrote:
> I'm not going to get it right today, am I?

Believe me, I know that feeling! Open-source development is therefore
an extremely good character-building exercise. At least that is what
I keep telling myself. ;-)

> +let srcu-rscs = ([Srcu-lock] ; (data ; [~ Srcu-unlock] ; rfe) * ; data ;
> [Srcu-unlock]) & loc
>
> I see now that I copied the format from your message but without realizing
> the original had a `|` where I have a `;`.
> I hope this version is finally right and perhaps more natural than the (data
> | rf) version, considering rf can't actually appear in most places and this
> more closely matches carry-dep;data.
> But of course feel free to use
> +let srcu-rscs = ([Srcu-lock] ; (data? | [~ Srcu-unlock] ; rf)+ ;
> [Srcu-unlock]) & loc
> instead if you prefer.

Ah, herd7 could see an rf link between any srcu_read_unlock() and any
"later" srcu_read_lock(), couldn't it? Good catch!!!

I took this last one, adding parentheses that might well be unnecessary.
(You see, herd7 was complaining about cut-and-paste, possibly due to
alternative character sets, so I indulged in a bit of diagnostic-driven
development.)

let srcu-rscs = ([Srcu-lock] ; (data | ([~ Srcu-unlock] ; rf))+ ;
[Srcu-unlock]) & loc
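[Editorial aside: the matching that this definition performs can be sketched with a plain transitive closure over toy events. The events and edges below are hypothetical, chosen only to show a cookie carried from one process's srcu_down_read() to another process's srcu_up_read().]

```python
def transitive_closure(r):
    """Compute r+ by repeated composition until a fixed point."""
    closure = set(r)
    while True:
        new = {(a, c) for (a, b) in closure for (b2, c) in closure if b == b2}
        if new <= closure:
            return closure
        closure |= new

# Hypothetical events: P0 does lock L and store W; P1 reads R and unlocks U.
srcu_lock = {"L"}
srcu_unlock = {"U"}
data = {("L", "W"), ("R", "U")}   # cookie carried by data dependencies
rf = {("W", "R")}                 # and read across threads

# (data | ([~ Srcu-unlock] ; rf))+ : drop rf edges starting at an unlock,
# so a "later" lock cannot spuriously match via the unlock's store.
rf_filtered = {(a, b) for (a, b) in rf if a not in srcu_unlock}
reach = transitive_closure(data | rf_filtered)

srcu_rscs = {(a, b) for (a, b) in reach
             if a in srcu_lock and b in srcu_unlock}
print(srcu_rscs)                  # L is matched with U
```

The filter on the rf edge is the point of the whole exercise: without it, the closure could chain through an unlock's own store and match unrelated critical sections.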

The reason for favoring "rf" over "rfe" is the possibility of a litmus
test where the process containing the srcu_down_read() sometimes but
not always also has the matching srcu_up_read(). Perhaps a pair of "if"
statements control which process does the matching srcu_up_read().

With this change, all of the C-srcu-nest-*.litmus tests act as expected.

And thank you!!!

Thanx, Paul

> have fun, jonas
>
>
> On 1/20/2023 1:34 PM, Jonas Oberhauser wrote:
> > I just realized I made a mistake in my earlier response to this message;
> > you still need the rf for passing the cookie across threads.
> > Perhaps it's better to just also exclude srcu_unlock type events
> > explicitly here.
> >
> > +let srcu-rscs = ([Srcu-lock] ; (data ; [~ Srcu-unlock] ; rf) + ;
> > [Srcu-unlock]) & loc
> >
> >
> > best wishes,
> > jonas
> >
> > On 1/20/2023 4:55 AM, Paul E. McKenney wrote:
> > > On Thu, Jan 19, 2023 at 02:51:53PM -0500, Alan Stern wrote:
> > > > On Thu, Jan 19, 2023 at 10:41:07AM -0800, Paul E. McKenney wrote:
> > > > > In contrast, this actually needs srcu_down_read() and srcu_up_read():
> > > > >
> > > > > ------------------------------------------------------------------------
> > > > >
> > > > >
> > > > > C C-srcu-nest-6
> > > > >
> > > > > (*
> > > > >  * Result: Never
> > > > >  *
> > > > >  * Flag unbalanced-srcu-locking
> > > > >  * This would be valid for srcu_down_read() and srcu_up_read().
> > > > >  *)
> > > > >
> > > > > {}
> > > > >
> > > > > P0(int *x, int *y, struct srcu_struct *s1, int *idx)
> > > > > {
> > > > > 	int r2;
> > > > > 	int r3;
> > > > >
> > > > > 	r3 = srcu_down_read(s1);
> > > > > 	WRITE_ONCE(*idx, r3);
> > > > > 	r2 = READ_ONCE(*y);
> > > > > }
> > > > >
> > > > > P1(int *x, int *y, struct srcu_struct *s1, int *idx)
> > > > > {
> > > > > 	int r1;
> > > > > 	int r3;
> > > > >
> > > > > 	r1 = READ_ONCE(*x);
> > > > > 	r3 = READ_ONCE(*idx);
> > > > > 	srcu_up_read(s1, r3);
> > > > > }
> > > > >
> > > > > P2(int *x, int *y, struct srcu_struct *s1)
> > > > > {
> > > > > 	WRITE_ONCE(*y, 1);
> > > > > 	synchronize_srcu(s1);
> > > > > 	WRITE_ONCE(*x, 1);
> > > > > }
> > > > >
> > > > > locations [0:r1]
> > > > > exists (1:r1=1 /\ 0:r2=0)
> > > > I modified this litmus test by adding a flag variable with an
> > > > smp_store_release in P0, an smp_load_acquire in P1, and a filter clause
> > > > to ensure that P1 reads the flag and idx from P0.
> > > >
> > > > With the patch below, the results were as expected:
> > > >
> > > > Test C-srcu-nest-6 Allowed
> > > > States 3
> > > > 0:r1=0; 0:r2=0; 1:r1=0;
> > > > 0:r1=0; 0:r2=1; 1:r1=0;
> > > > 0:r1=0; 0:r2=1; 1:r1=1;
> > > > No
> > > > Witnesses
> > > > Positive: 0 Negative: 3
> > > > Condition exists (1:r1=1 /\ 0:r2=0)
> > > > Observation C-srcu-nest-6 Never 0 3
> > > > Time C-srcu-nest-6 0.04
> > > > Hash=2b010cf3446879fb84752a6016ff88c5
> > > >
> > > > It turns out that the idea of removing rf edges from Srcu-unlock events
> > > > doesn't work well.  The missing edges mess up herd's calculation of the
> > > > fr relation and the coherence axiom.  So I've gone back to filtering
> > > > those edges out of carry-dep.
> > > >
> > > > Also, Boqun's suggestion for flagging ordinary accesses to SRCU
> > > > structures no longer works, because the lock and unlock operations now
> > > > _are_ normal accesses.  I removed that check too, but it shouldn't hurt
> > > > much because I don't expect to encounter litmus tests that try to read
> > > > or write srcu_structs directly.
> > > >
> > > > Alan
> > > >
> > > >
> > > >
> > > > Index: usb-devel/tools/memory-model/linux-kernel.bell
> > > > ===================================================================
> > > > --- usb-devel.orig/tools/memory-model/linux-kernel.bell
> > > > +++ usb-devel/tools/memory-model/linux-kernel.bell
> > > > @@ -53,38 +53,30 @@ let rcu-rscs = let rec
> > > >  	in matched
> > > >
> > > >  (* Validate nesting *)
> > > > -flag ~empty Rcu-lock \ domain(rcu-rscs) as unbalanced-rcu-locking
> > > > -flag ~empty Rcu-unlock \ range(rcu-rscs) as unbalanced-rcu-locking
> > > > +flag ~empty Rcu-lock \ domain(rcu-rscs) as unbalanced-rcu-lock
> > > > +flag ~empty Rcu-unlock \ range(rcu-rscs) as unbalanced-rcu-unlock
> > > >
> > > >  (* Compute matching pairs of nested Srcu-lock and Srcu-unlock *)
> > > > -let srcu-rscs = let rec
> > > > -	    unmatched-locks = Srcu-lock \ domain(matched)
> > > > -	and unmatched-unlocks = Srcu-unlock \ range(matched)
> > > > -	and unmatched = unmatched-locks | unmatched-unlocks
> > > > -	and unmatched-po = ([unmatched] ; po ; [unmatched]) & loc
> > > > -	and unmatched-locks-to-unlocks =
> > > > -		([unmatched-locks] ; po ; [unmatched-unlocks]) & loc
> > > > -	and matched = matched | (unmatched-locks-to-unlocks \
> > > > -		(unmatched-po ; unmatched-po))
> > > > -	in matched
> > > > +let srcu-rscs = ([Srcu-lock] ; (data | rf)+ ; [Srcu-unlock]) & loc
> > > >
> > > >  (* Validate nesting *)
> > > > -flag ~empty Srcu-lock \ domain(srcu-rscs) as unbalanced-srcu-locking
> > > > -flag ~empty Srcu-unlock \ range(srcu-rscs) as unbalanced-srcu-locking
> > > > +flag ~empty Srcu-lock \ domain(srcu-rscs) as unbalanced-srcu-lock
> > > > +flag ~empty Srcu-unlock \ range(srcu-rscs) as unbalanced-srcu-unlock
> > > > +flag ~empty (srcu-rscs^-1 ; srcu-rscs) \ id as multiple-srcu-matches
> > > >
> > > >  (* Check for use of synchronize_srcu() inside an RCU critical section *)
> > > >  flag ~empty rcu-rscs & (po ; [Sync-srcu] ; po) as invalid-sleep
> > > >
> > > >  (* Validate SRCU dynamic match *)
> > > > -flag ~empty different-values(srcu-rscs) as srcu-bad-nesting
> > > > +flag ~empty different-values(srcu-rscs) as bad-srcu-value-match
> > > >
> > > >  (* Compute marked and plain memory accesses *)
> > > >  let Marked = (~M) | IW | Once | Release | Acquire | domain(rmw) | range(rmw) |
> > > > -		LKR | LKW | UL | LF | RL | RU
> > > > +		LKR | LKW | UL | LF | RL | RU | Srcu-lock | Srcu-unlock
> > > >  let Plain = M \ Marked
> > > >
> > > >  (* Redefine dependencies to include those carried through plain accesses *)
> > > > -let carry-dep = (data ; rfi)*
> > > > +let carry-dep = (data ; [~ Srcu-unlock] ; rfi)*
> > > >  let addr = carry-dep ; addr
> > > >  let ctrl = carry-dep ; ctrl
> > > >  let data = carry-dep ; data
> > > > Index: usb-devel/tools/memory-model/linux-kernel.def
> > > > ===================================================================
> > > > --- usb-devel.orig/tools/memory-model/linux-kernel.def
> > > > +++ usb-devel/tools/memory-model/linux-kernel.def
> > > > @@ -49,8 +49,10 @@ synchronize_rcu() { __fence{sync-rcu}; }
> > > >  synchronize_rcu_expedited() { __fence{sync-rcu}; }
> > > >
> > > >  // SRCU
> > > > -srcu_read_lock(X)  __srcu{srcu-lock}(X)
> > > > -srcu_read_unlock(X,Y) { __srcu{srcu-unlock}(X,Y); }
> > > > +srcu_read_lock(X) __load{srcu-lock}(*X)
> > > > +srcu_read_unlock(X,Y) { __store{srcu-unlock}(*X,Y); }
> > > > +srcu_down_read(X) __load{srcu-lock}(*X)
> > > > +srcu_up_read(X,Y) { __store{srcu-unlock}(*X,Y); }
> > > >  synchronize_srcu(X)  { __srcu{sync-srcu}(X); }
> > > >  synchronize_srcu_expedited(X)  { __srcu{sync-srcu}(X); }
> > > And for some initial tests:
> > >
> > > https://github.com/paulmckrcu/litmus/blob/master/manual/kernel/C-srcu-nest-1.litmus
> > >
> > >
> > > 	"Flag multiple-srcu-matches" but otherwise OK.
> > > 	As a "hail Mary" exercise, I used r4 for the second SRCU
> > > 	read-side critical section, but this had no effect.
> > > 	(This flag is expected and seen for #4 below.)
> > >
> > > https://github.com/paulmckrcu/litmus/blob/master/manual/kernel/C-srcu-nest-2.litmus
> > >
> > > https://github.com/paulmckrcu/litmus/blob/master/manual/kernel/C-srcu-nest-3.litmus
> > >
> > > https://github.com/paulmckrcu/litmus/blob/master/manual/kernel/C-srcu-nest-4.litmus
> > >
> > > https://github.com/paulmckrcu/litmus/blob/master/manual/kernel/C-srcu-nest-5.litmus
> > >
> > >
> > > 	All as expected.
> > >
> > > https://github.com/paulmckrcu/litmus/blob/master/manual/kernel/C-srcu-nest-6.litmus
> > >
> > >
> > > 	Get "Flag unbalanced-srcu-lock" and "Flag unbalanced-srcu-unlock",
> > > 	but this is srcu_down_read() and srcu_up_read(), where this should
> > > 	be OK.  Ah, but I need to do the release/acquire/filter trick.  Once
> > > 	I did that, it works as expected.
> > >
> > > https://github.com/paulmckrcu/litmus/blob/master/manual/kernel/C-srcu-nest-7.litmus
> > >
> > > https://github.com/paulmckrcu/litmus/blob/master/manual/kernel/C-srcu-nest-8.litmus
> > >
> > >
> > > 	Both as expected.
> > >
> > > Getting there!!!
> > >
> > > I also started a regression test, hopefully without pilot error.  :-/
> > >
> > > 							Thanx, Paul
> >
>

2023-01-20 16:30:14

by Alan Stern

[permalink] [raw]
Subject: Re: Internal vs. external barriers (was: Re: Interesting LKMM litmus test)

On Fri, Jan 20, 2023 at 11:13:00AM +0100, Jonas Oberhauser wrote:
> Perhaps we could say that reading an index without using it later is
> forbidden?
>
> flag ~empty [Srcu-lock];data;rf;[~ domain(data;[Srcu-unlock])] as
> thrown-srcu-cookie-on-floor

We already flag locks that don't have a matching unlock. I don't see
any point in worrying about whatever else happens to the index.

> So if there is an srcu_down() that produces a cookie that is read by some
> read R, and R doesn't then pass that value into an srcu_up(), the
> srcu-warranty is voided.

No, it isn't. As long as the value is passed to exactly one
srcu_up_read(), it doesn't much matter what else you do with it.

Alan

2023-01-20 16:50:23

by Alan Stern

[permalink] [raw]
Subject: Re: Internal vs. external barriers (was: Re: Interesting LKMM litmus test)

On Wed, Jan 18, 2023 at 04:02:14PM -0800, Paul E. McKenney wrote:
> There are pairs of per-CPU counters. One pair (->srcu_lock_count[])
> counts the number of srcu_down_read() operations that took place on
> that CPU and another pair (->srcu_unlock_count[]) counts the number
> of srcu_up_read() operations that took place on that CPU. There is
> an ->srcu_idx that selects which of the ->srcu_lock_count[] elements
> should be incremented by srcu_down_read(). Of course, srcu_down_read()
> returns the value of ->srcu_idx that it used so that the matching
> srcu_up_read() will use that same index when incrementing its CPU's
> ->srcu_unlock_count[].
>
> Grace periods go something like this:
>
> 1. Sum up the ->srcu_unlock_count[!ssp->srcu_idx] counters.
>
> 2. smp_mb().
>
> 3. Sum up the ->srcu_unlock_count[!ssp->srcu_idx] counters.

Presumably you meant to write "lock" here, not "unlock".

>
> 4. If the sums are not equal, retry from #1.
>
> 5. smp_mb().
>
> 6. WRITE_ONCE(ssp->srcu_idx, !ssp->srcu_idx);
>
> 7. smp_mb().
>
> 8. Same loop as #1-4.
>
> So similar to r/w semaphores, but with two separate distributed counts.
> This means that the number of readers need not go to zero at any given
> point in time, consistent with the need to wait only on old readers.
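[Editorial aside: the counter scheme quoted above can be sketched sequentially in a few lines of Python. This deliberately ignores every memory-barrier subtlety the rest of this thread is about; it only illustrates the "sum unlocks, sum locks, compare, flip" control flow, and the CPU numbers are arbitrary.]

```python
NR_CPUS = 4

class SrcuStruct:
    def __init__(self):
        self.srcu_idx = 0
        self.lock_count = [[0] * NR_CPUS, [0] * NR_CPUS]
        self.unlock_count = [[0] * NR_CPUS, [0] * NR_CPUS]

def srcu_down_read(ssp, cpu):
    idx = ssp.srcu_idx                 # pick the current index...
    ssp.lock_count[idx][cpu] += 1      # ...followed by smp_mb() in the kernel
    return idx                         # the cookie handed back to the caller

def srcu_up_read(ssp, cpu, idx):
    ssp.unlock_count[idx][cpu] += 1    # a release store in the kernel

def readers_done(ssp, idx):
    # Steps 1-3: sum the unlocks, (smp_mb()), then sum the locks.
    unlocks = sum(ssp.unlock_count[idx])
    locks = sum(ssp.lock_count[idx])
    return unlocks == locks            # step 4: equal sums => no old readers

def synchronize_srcu(ssp):
    # In the kernel, steps 1-4 and 8 are retry loops; running sequentially
    # here, we simply assume all readers are already balanced.
    assert readers_done(ssp, 1 - ssp.srcu_idx)   # steps 1-4
    ssp.srcu_idx = 1 - ssp.srcu_idx              # step 6: flip the index
    assert readers_done(ssp, 1 - ssp.srcu_idx)   # step 8

ssp = SrcuStruct()
cookie = srcu_down_read(ssp, 0)   # reader enters on the current index
srcu_up_read(ssp, 2, cookie)      # matching exit, possibly on another CPU
synchronize_srcu(ssp)             # completes: both counter pairs agree
```

Note how the per-CPU sums need never be zero simultaneously: the exit increment can land on a different CPU's counter, and the grace period only requires the two sums for one index to agree.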

Reasoning from first principles, I deduce the following:

You didn't describe exactly how srcu_down_read() and srcu_up_read()
work. Evidently the unlock increment in srcu_up_read() should have
release semantics, to prevent accesses from escaping out the end of the
critical section. But the lock increment in srcu_down_read() has to be
stronger than an acquire; to prevent accesses in the critical section
escaping out the start, the increment has to be followed by smp_mb().

The smp_mb() fences in steps 5 and 7 appear to be completely
unnecessary.

Provided an smp_mb() is added at the very start and end of the grace
period, the memory barrier in step 2 and its copy in step 8 can be
demoted to smp_rmb().

These changes would be small optimizations at best, and you may consider
them unimportant in view of the fact that grace periods often last quite
a long time.

Alan

2023-01-20 17:17:43

by Alan Stern

[permalink] [raw]
Subject: Re: Internal vs. external barriers (was: Re: Interesting LKMM litmus test)

On Thu, Jan 19, 2023 at 07:55:21PM -0800, Paul E. McKenney wrote:
> And for some initial tests:
>
> https://github.com/paulmckrcu/litmus/blob/master/manual/kernel/C-srcu-nest-1.litmus
>
> "Flag multiple-srcu-matches" but otherwise OK.
> As a "hail Mary" exercise, I used r4 for the second SRCU
> read-side critical section, but this had no effect.
> (This flag is expected and seen for #4 below.)

Jonas is right about the reason for this. Also, his suggestion for
fixing the check in lock.cat makes sense.

My revised patch is below.

> https://github.com/paulmckrcu/litmus/blob/master/manual/kernel/C-srcu-nest-2.litmus
> https://github.com/paulmckrcu/litmus/blob/master/manual/kernel/C-srcu-nest-3.litmus
> https://github.com/paulmckrcu/litmus/blob/master/manual/kernel/C-srcu-nest-4.litmus
> https://github.com/paulmckrcu/litmus/blob/master/manual/kernel/C-srcu-nest-5.litmus
>
> All as expected.
>
> https://github.com/paulmckrcu/litmus/blob/master/manual/kernel/C-srcu-nest-6.litmus
>
> Get "Flag unbalanced-srcu-lock" and "Flag unbalanced-srcu-unlock",
> but this is srcu_down_read() and srcu_up_read(), where this should
> be OK. Ah, but I need to do the release/acquire/filter trick. Once
> I did that, it works as expected.
>
> https://github.com/paulmckrcu/litmus/blob/master/manual/kernel/C-srcu-nest-7.litmus
> https://github.com/paulmckrcu/litmus/blob/master/manual/kernel/C-srcu-nest-8.litmus
>
> Both as expected.
>
> Getting there!!!

Good news.

Alan



Index: usb-devel/tools/memory-model/linux-kernel.bell
===================================================================
--- usb-devel.orig/tools/memory-model/linux-kernel.bell
+++ usb-devel/tools/memory-model/linux-kernel.bell
@@ -53,38 +53,31 @@ let rcu-rscs = let rec
in matched

(* Validate nesting *)
-flag ~empty Rcu-lock \ domain(rcu-rscs) as unbalanced-rcu-locking
-flag ~empty Rcu-unlock \ range(rcu-rscs) as unbalanced-rcu-locking
+flag ~empty Rcu-lock \ domain(rcu-rscs) as unbalanced-rcu-lock
+flag ~empty Rcu-unlock \ range(rcu-rscs) as unbalanced-rcu-unlock

(* Compute matching pairs of nested Srcu-lock and Srcu-unlock *)
-let srcu-rscs = let rec
- unmatched-locks = Srcu-lock \ domain(matched)
- and unmatched-unlocks = Srcu-unlock \ range(matched)
- and unmatched = unmatched-locks | unmatched-unlocks
- and unmatched-po = ([unmatched] ; po ; [unmatched]) & loc
- and unmatched-locks-to-unlocks =
- ([unmatched-locks] ; po ; [unmatched-unlocks]) & loc
- and matched = matched | (unmatched-locks-to-unlocks \
- (unmatched-po ; unmatched-po))
- in matched
+let carry-srcu-data = (data ; [~ Srcu-unlock] ; rf)*
+let srcu-rscs = ([Srcu-lock] ; carry-srcu-data ; data ; [Srcu-unlock]) & loc

(* Validate nesting *)
-flag ~empty Srcu-lock \ domain(srcu-rscs) as unbalanced-srcu-locking
-flag ~empty Srcu-unlock \ range(srcu-rscs) as unbalanced-srcu-locking
+flag ~empty Srcu-lock \ domain(srcu-rscs) as unbalanced-srcu-lock
+flag ~empty Srcu-unlock \ range(srcu-rscs) as unbalanced-srcu-unlock
+flag ~empty (srcu-rscs^-1 ; srcu-rscs) \ id as multiple-srcu-matches

(* Check for use of synchronize_srcu() inside an RCU critical section *)
flag ~empty rcu-rscs & (po ; [Sync-srcu] ; po) as invalid-sleep

(* Validate SRCU dynamic match *)
-flag ~empty different-values(srcu-rscs) as srcu-bad-nesting
+flag ~empty different-values(srcu-rscs) as bad-srcu-value-match

(* Compute marked and plain memory accesses *)
let Marked = (~M) | IW | Once | Release | Acquire | domain(rmw) | range(rmw) |
- LKR | LKW | UL | LF | RL | RU
+ LKR | LKW | UL | LF | RL | RU | Srcu-lock | Srcu-unlock
let Plain = M \ Marked

(* Redefine dependencies to include those carried through plain accesses *)
-let carry-dep = (data ; rfi)*
+let carry-dep = (data ; [~ Srcu-unlock] ; rfi)*
let addr = carry-dep ; addr
let ctrl = carry-dep ; ctrl
let data = carry-dep ; data
Index: usb-devel/tools/memory-model/linux-kernel.def
===================================================================
--- usb-devel.orig/tools/memory-model/linux-kernel.def
+++ usb-devel/tools/memory-model/linux-kernel.def
@@ -49,8 +49,10 @@ synchronize_rcu() { __fence{sync-rcu}; }
synchronize_rcu_expedited() { __fence{sync-rcu}; }

// SRCU
-srcu_read_lock(X) __srcu{srcu-lock}(X)
-srcu_read_unlock(X,Y) { __srcu{srcu-unlock}(X,Y); }
+srcu_read_lock(X) __load{srcu-lock}(*X)
+srcu_read_unlock(X,Y) { __store{srcu-unlock}(*X,Y); }
+srcu_down_read(X) __load{srcu-lock}(*X)
+srcu_up_read(X,Y) { __store{srcu-unlock}(*X,Y); }
synchronize_srcu(X) { __srcu{sync-srcu}(X); }
synchronize_srcu_expedited(X) { __srcu{sync-srcu}(X); }

Index: usb-devel/tools/memory-model/lock.cat
===================================================================
--- usb-devel.orig/tools/memory-model/lock.cat
+++ usb-devel/tools/memory-model/lock.cat
@@ -36,9 +36,9 @@ let RU = try RU with emptyset
(* Treat RL as a kind of LF: a read with no ordering properties *)
let LF = LF | RL

-(* There should be no ordinary R or W accesses to spinlocks *)
-let ALL-LOCKS = LKR | LKW | UL | LF | RU
-flag ~empty [M \ IW] ; loc ; [ALL-LOCKS] as mixed-lock-accesses
+(* There should be no ordinary R or W accesses to spinlocks or SRCU structs *)
+let ALL-LOCKS = LKR | LKW | UL | LF | RU | Srcu-lock | Srcu-unlock | Sync-srcu
+flag ~empty [M \ IW \ ALL-LOCKS] ; loc ; [ALL-LOCKS] as mixed-lock-accesses

(* Link Lock-Reads to their RMW-partner Lock-Writes *)
let lk-rmw = ([LKR] ; po-loc ; [LKW]) \ (po ; po)
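[Editorial aside: the repaired mixed-lock-accesses check in the lock.cat hunk above has a simple set-theoretic reading: flag any access that is neither a lock-class event nor an initial write but shares a location with a lock-class event. The event names, kinds, and locations below are invented for illustration only.]

```python
# Event kinds standing in for the ALL-LOCKS classes in lock.cat.
ALL_LOCKS = {"lkr", "lkw", "ul", "lf", "ru",
             "srcu_lock", "srcu_unlock", "sync_srcu"}

# Hypothetical execution: each event maps to (kind, location).
events = {
    "e1": ("lkr", "s"),        # lock-read of spinlock s
    "e2": ("plain_w", "s"),    # ordinary write to the same location: bad!
    "e3": ("plain_r", "x"),    # ordinary read of an unrelated location
    "iw": ("initial_w", "s"),  # initial write: always permitted
}

def mixed_lock_accesses(events):
    """Model of: flag ~empty [M \\ IW \\ ALL-LOCKS] ; loc ; [ALL-LOCKS]."""
    flagged = set()
    for a, (kind_a, loc_a) in events.items():
        if kind_a in ALL_LOCKS or kind_a == "initial_w":
            continue                       # restrict to M \ IW \ ALL-LOCKS
        for b, (kind_b, loc_b) in events.items():
            if kind_b in ALL_LOCKS and loc_a == loc_b:
                flagged.add((a, b))        # ; loc ; [ALL-LOCKS]
    return flagged

print(mixed_lock_accesses(events))
```

Subtracting ALL-LOCKS on the left-hand side is exactly the repair discussed earlier in the thread: the lock-class events themselves (now modeled as loads and stores) must not trip the flag, only genuinely ordinary accesses to the same location.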

2023-01-20 18:07:15

by Paul E. McKenney

[permalink] [raw]
Subject: Re: Internal vs. external barriers (was: Re: Interesting LKMM litmus test)

On Fri, Jan 20, 2023 at 11:14:06AM -0500, Alan Stern wrote:
> On Thu, Jan 19, 2023 at 07:55:21PM -0800, Paul E. McKenney wrote:
> > And for some initial tests:
> >
> > https://github.com/paulmckrcu/litmus/blob/master/manual/kernel/C-srcu-nest-1.litmus
> >
> > "Flag multiple-srcu-matches" but otherwise OK.
> > As a "hail Mary" exercise, I used r4 for the second SRCU
> > read-side critical section, but this had no effect.
> > (This flag is expected and seen for #4 below.)
>
> Jonas is right about the reason for this. Also, his suggestion for
> fixing the check in lock.cat makes sense.

Very good!

> My revised patch is below.

Thank you! Are you OK with my putting this on a not-for-mainline branch
for experimental purposes?

> > https://github.com/paulmckrcu/litmus/blob/master/manual/kernel/C-srcu-nest-2.litmus
> > https://github.com/paulmckrcu/litmus/blob/master/manual/kernel/C-srcu-nest-3.litmus
> > https://github.com/paulmckrcu/litmus/blob/master/manual/kernel/C-srcu-nest-4.litmus
> > https://github.com/paulmckrcu/litmus/blob/master/manual/kernel/C-srcu-nest-5.litmus
> >
> > All as expected.
> >
> > https://github.com/paulmckrcu/litmus/blob/master/manual/kernel/C-srcu-nest-6.litmus
> >
> > Get "Flag unbalanced-srcu-lock" and "Flag unbalanced-srcu-unlock",
> > but this is srcu_down_read() and srcu_up_read(), where this should
> > be OK. Ah, but I need to do the release/acquire/filter trick. Once
> > I did that, it works as expected.
> >
> > https://github.com/paulmckrcu/litmus/blob/master/manual/kernel/C-srcu-nest-7.litmus
> > https://github.com/paulmckrcu/litmus/blob/master/manual/kernel/C-srcu-nest-8.litmus
> >
> > Both as expected.
> >
> > Getting there!!!
>
> Good news.

And all of the litmus-repo tests up to ten processes passed. Woo-hoo!!!

Thanx, Paul

> Alan
>
>
>
> Index: usb-devel/tools/memory-model/linux-kernel.bell
> ===================================================================
> --- usb-devel.orig/tools/memory-model/linux-kernel.bell
> +++ usb-devel/tools/memory-model/linux-kernel.bell
> @@ -53,38 +53,31 @@ let rcu-rscs = let rec
> in matched
>
> (* Validate nesting *)
> -flag ~empty Rcu-lock \ domain(rcu-rscs) as unbalanced-rcu-locking
> -flag ~empty Rcu-unlock \ range(rcu-rscs) as unbalanced-rcu-locking
> +flag ~empty Rcu-lock \ domain(rcu-rscs) as unbalanced-rcu-lock
> +flag ~empty Rcu-unlock \ range(rcu-rscs) as unbalanced-rcu-unlock
>
> (* Compute matching pairs of nested Srcu-lock and Srcu-unlock *)
> -let srcu-rscs = let rec
> - unmatched-locks = Srcu-lock \ domain(matched)
> - and unmatched-unlocks = Srcu-unlock \ range(matched)
> - and unmatched = unmatched-locks | unmatched-unlocks
> - and unmatched-po = ([unmatched] ; po ; [unmatched]) & loc
> - and unmatched-locks-to-unlocks =
> - ([unmatched-locks] ; po ; [unmatched-unlocks]) & loc
> - and matched = matched | (unmatched-locks-to-unlocks \
> - (unmatched-po ; unmatched-po))
> - in matched
> +let carry-srcu-data = (data ; [~ Srcu-unlock] ; rf)*
> +let srcu-rscs = ([Srcu-lock] ; carry-srcu-data ; data ; [Srcu-unlock]) & loc
>
> (* Validate nesting *)
> -flag ~empty Srcu-lock \ domain(srcu-rscs) as unbalanced-srcu-locking
> -flag ~empty Srcu-unlock \ range(srcu-rscs) as unbalanced-srcu-locking
> +flag ~empty Srcu-lock \ domain(srcu-rscs) as unbalanced-srcu-lock
> +flag ~empty Srcu-unlock \ range(srcu-rscs) as unbalanced-srcu-unlock
> +flag ~empty (srcu-rscs^-1 ; srcu-rscs) \ id as multiple-srcu-matches
>
> (* Check for use of synchronize_srcu() inside an RCU critical section *)
> flag ~empty rcu-rscs & (po ; [Sync-srcu] ; po) as invalid-sleep
>
> (* Validate SRCU dynamic match *)
> -flag ~empty different-values(srcu-rscs) as srcu-bad-nesting
> +flag ~empty different-values(srcu-rscs) as bad-srcu-value-match
>
> (* Compute marked and plain memory accesses *)
> let Marked = (~M) | IW | Once | Release | Acquire | domain(rmw) | range(rmw) |
> - LKR | LKW | UL | LF | RL | RU
> + LKR | LKW | UL | LF | RL | RU | Srcu-lock | Srcu-unlock
> let Plain = M \ Marked
>
> (* Redefine dependencies to include those carried through plain accesses *)
> -let carry-dep = (data ; rfi)*
> +let carry-dep = (data ; [~ Srcu-unlock] ; rfi)*
> let addr = carry-dep ; addr
> let ctrl = carry-dep ; ctrl
> let data = carry-dep ; data
> Index: usb-devel/tools/memory-model/linux-kernel.def
> ===================================================================
> --- usb-devel.orig/tools/memory-model/linux-kernel.def
> +++ usb-devel/tools/memory-model/linux-kernel.def
> @@ -49,8 +49,10 @@ synchronize_rcu() { __fence{sync-rcu}; }
> synchronize_rcu_expedited() { __fence{sync-rcu}; }
>
> // SRCU
> -srcu_read_lock(X) __srcu{srcu-lock}(X)
> -srcu_read_unlock(X,Y) { __srcu{srcu-unlock}(X,Y); }
> +srcu_read_lock(X) __load{srcu-lock}(*X)
> +srcu_read_unlock(X,Y) { __store{srcu-unlock}(*X,Y); }
> +srcu_down_read(X) __load{srcu-lock}(*X)
> +srcu_up_read(X,Y) { __store{srcu-unlock}(*X,Y); }
> synchronize_srcu(X) { __srcu{sync-srcu}(X); }
> synchronize_srcu_expedited(X) { __srcu{sync-srcu}(X); }
>
> Index: usb-devel/tools/memory-model/lock.cat
> ===================================================================
> --- usb-devel.orig/tools/memory-model/lock.cat
> +++ usb-devel/tools/memory-model/lock.cat
> @@ -36,9 +36,9 @@ let RU = try RU with emptyset
> (* Treat RL as a kind of LF: a read with no ordering properties *)
> let LF = LF | RL
>
> -(* There should be no ordinary R or W accesses to spinlocks *)
> -let ALL-LOCKS = LKR | LKW | UL | LF | RU
> -flag ~empty [M \ IW] ; loc ; [ALL-LOCKS] as mixed-lock-accesses
> +(* There should be no ordinary R or W accesses to spinlocks or SRCU structs *)
> +let ALL-LOCKS = LKR | LKW | UL | LF | RU | Srcu-lock | Srcu-unlock | Sync-srcu
> +flag ~empty [M \ IW \ ALL-LOCKS] ; loc ; [ALL-LOCKS] as mixed-lock-accesses
>
> (* Link Lock-Reads to their RMW-partner Lock-Writes *)
> let lk-rmw = ([LKR] ; po-loc ; [LKW]) \ (po ; po)

2023-01-20 18:13:35

by Paul E. McKenney

[permalink] [raw]
Subject: Re: Internal vs. external barriers (was: Re: Interesting LKMM litmus test)

On Fri, Jan 20, 2023 at 11:01:03AM -0500, Alan Stern wrote:
> On Wed, Jan 18, 2023 at 04:02:14PM -0800, Paul E. McKenney wrote:
> > There are pairs of per-CPU counters. One pair (->srcu_lock_count[])
> > counts the number of srcu_down_read() operations that took place on
> > that CPU and another pair (->srcu_unlock_count[]) counts the number
> > of srcu_up_read() operations that took place on that CPU. There is
> > an ->srcu_idx that selects which of the ->srcu_lock_count[] elements
> > should be incremented by srcu_down_read(). Of course, srcu_down_read()
> > returns the value of ->srcu_idx that it used so that the matching
> > srcu_up_read() will use that same index when incrementing its CPU's
> > ->srcu_unlock_count[].
> >
> > Grace periods go something like this:
> >
> > 1. Sum up the ->srcu_unlock_count[!ssp->srcu_idx] counters.
> >
> > 2. smp_mb().
> >
> > 3. Sum up the ->srcu_unlock_count[!ssp->srcu_idx] counters.
>
> Presumably you meant to write "lock" here, not "unlock".

You are quite right, and apologies for my confusion.

> > 4. If the sums are not equal, retry from #1.
> >
> > 5. smp_mb().
> >
> > 6. WRITE_ONCE(ssp->srcu_idx, !ssp->srcu_idx);
> >
> > 7. smp_mb().
> >
> > 8. Same loop as #1-4.
> >
> > So similar to r/w semaphores, but with two separate distributed counts.
> > This means that the number of readers need not go to zero at any given
> > point in time, consistent with the need to wait only on old readers.
>
> Reasoning from first principles, I deduce the following:
>
> You didn't describe exactly how srcu_down_read() and srcu_up_read()
> work. Evidently the unlock increment in srcu_up_read() should have
> release semantics, to prevent accesses from escaping out the end of the
> critical section. But the lock increment in srcu_down_read() has to be
> stronger than an acquire; to prevent accesses in the critical section
> escaping out the start, the increment has to be followed by smp_mb().

You got it! There is some work going on to see if srcu_read_lock()'s
smp_mb() can be weakened to pure release, but we will see.

> The smp_mb() fences in steps 5 and 7 appear to be completely
> unnecessary.

For correctness, agreed. Their purpose is instead forward progress.
One can argue that step 5 is redundant due to control dependency, but
control dependencies are fragile, and as you say below, this code is
nowhere near a fastpath.

> Provided an smp_mb() is added at the very start and end of the grace
> period, the memory barrier in step 2 and its copy in step 8 can be
> demoted to smp_rmb().

This might need to be smp_mb() to allow srcu_read_unlock() to be
demoted to release ordering. Work in progress.

> These changes would be small optimizations at best, and you may consider
> them unimportant in view of the fact that grace periods often last quite
> a long time.

Agreed, keeping it simple and obvious is important on this code, which
is nowhere near a fastpath. The case of srcu_read_unlock() is another
thing altogether.

Thanx, Paul

2023-01-20 20:07:17

by Alan Stern

[permalink] [raw]
Subject: Re: Internal vs. external barriers (was: Re: Interesting LKMM litmus test)

On Fri, Jan 20, 2023 at 09:58:04AM -0800, Paul E. McKenney wrote:
> On Fri, Jan 20, 2023 at 11:01:03AM -0500, Alan Stern wrote:
> > On Wed, Jan 18, 2023 at 04:02:14PM -0800, Paul E. McKenney wrote:
> > > There are pairs of per-CPU counters. One pair (->srcu_lock_count[])
> > > counts the number of srcu_down_read() operations that took place on
> > > that CPU and another pair (->srcu_unlock_count[]) counts the number
> > > of srcu_up_read() operations that took place on that CPU. There is
> > > an ->srcu_idx that selects which of the ->srcu_lock_count[] elements
> > > should be incremented by srcu_down_read(). Of course, srcu_down_read()
> > > returns the value of ->srcu_idx that it used so that the matching
> > > srcu_up_read() will use that same index when incrementing its CPU's
> > > ->srcu_unlock_count[].
> > >
> > > Grace periods go something like this:
> > >
> > > 1. Sum up the ->srcu_unlock_count[!ssp->srcu_idx] counters.
> > >
> > > 2. smp_mb().
> > >
> > > 3. Sum up the ->srcu_unlock_count[!ssp->srcu_idx] counters.
> >
> > Presumably you meant to write "lock" here, not "unlock".
>
> You are quite right, and apologies for my confusion.
>
> > > 4. If the sums are not equal, retry from #1.
> > >
> > > 5. smp_mb().
> > >
> > > 6. WRITE_ONCE(ssp->srcu_idx, !ssp->srcu_idx);
> > >
> > > 7. smp_mb().
> > >
> > > 8. Same loop as #1-4.
> > >
> > > So similar to r/w semaphores, but with two separate distributed counts.
> > > This means that the number of readers need not go to zero at any given
> > > point in time, consistent with the need to wait only on old readers.
> >
> > Reasoning from first principles, I deduce the following:
> >
> > You didn't describe exactly how srcu_down_read() and srcu_up_read()
> > work. Evidently the unlock increment in srcu_up_read() should have
> > release semantics, to prevent accesses from escaping out the end of the
> > critical section. But the lock increment in srcu_down_read() has to be
> > stronger than an acquire; to prevent accesses in the critical section
> > escaping out the start, the increment has to be followed by smp_mb().
>
> You got it! There is some work going on to see if srcu_read_lock()'s
> smp_mb() can be weakened to pure release, but we will see.

That doesn't make sense. Release ordering in srcu_read_lock() would
only prevent accesses from leaking _in_ to the critical section. It
would do nothing to prevent accesses from leaking _out_.

> > The smp_mb() fences in steps 5 and 7 appear to be completely
> > unnecessary.
>
> For correctness, agreed. Their purpose is instead forward progress.

It's hard to say whether they would be effective at that. smp_mb()
forces the processor to wait until some time when previous writes have
become visible to all CPUs. But if you don't wait for that potentially
excessively long delay, you may be able to continue and be lucky enough
to find that all previous writes have already become visible to all the
CPUs that matter.

As far as I know, smp_mb() doesn't expedite the process of making
previous writes visible. However, I am very far from being an expert
on system architecture design.

> One can argue that step 5 is redundant due to control dependency, but
> control dependencies are fragile, and as you say below, this code is
> nowhere near a fastpath.

Also, control dependencies do not contribute to forward progress.

> > Provided an smp_mb() is added at the very start and end of the grace
> > period, the memory barrier in step 2 and its copy in step 8 can be
> > demoted to smp_rmb().
>
> This might need to be smp_mb() to allow srcu_read_unlock() to be
> demoted to release ordering. Work in progress.

I thought srcu_read_unlock() already _is_ a release operation. The
smp_mb() fence mentioned earlier needs to be in srcu_read_lock(), not
unlock(). And there's no way that one can be demoted.

srcu_read_unlock() does not need a full smp_mb().

> > These changes would be small optimizations at best, and you may consider
> > them unimportant in view of the fact that grace periods often last quite
> > a long time.
>
> Agreed, keeping it simple and obvious is important on this code, which
> is nowhere near a fastpath. The case of srcu_read_unlock() is another
> thing altogether.

Unfortunately, the full fence in srcu_read_lock() is unavoidable without
very major changes to the algorithm -- probably a complete redesign.
Without it, a read inside the critical section could be executed before
the store part of the increment, which could lead synchronize_srcu() to
believe that the critical section had not yet started when in fact it
had.
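[Aside: the sums-and-flip bookkeeping in Paul's steps 1-8 above is easy to model in a few lines of Python. This is only a toy single-threaded sketch of the counting; it cannot express any of the memory-ordering issues under discussion, and every name in it is invented:]

```python
# Toy single-threaded model of the SRCU counter bookkeeping (steps 1-8
# above).  Captures only the sums-and-flip logic, not memory ordering;
# all names are invented for illustration.

NR_CPUS = 4

class ToySrcu:
    def __init__(self):
        self.idx = 0
        # One pair of counters per CPU, indexed [cpu][idx].
        self.lock_count = [[0, 0] for _ in range(NR_CPUS)]
        self.unlock_count = [[0, 0] for _ in range(NR_CPUS)]

    def down_read(self, cpu):
        i = self.idx                    # snapshot the current index
        self.lock_count[cpu][i] += 1
        return i                        # cookie for the matching up_read()

    def up_read(self, cpu, i):
        self.unlock_count[cpu][i] += 1  # may run on a different CPU

    def readers_done(self, i):
        # Steps 1-4 / step 8: readers on index i have drained when the
        # summed unlock count catches up with the summed lock count.
        return (sum(c[i] for c in self.unlock_count) ==
                sum(c[i] for c in self.lock_count))

    def grace_period(self):
        # The real code retries; in this toy the readers must already
        # have drained, so we simply assert.
        assert self.readers_done(1 - self.idx)  # steps 1-4
        self.idx = 1 - self.idx                 # step 6: flip the index
        assert self.readers_done(1 - self.idx)  # step 8

ss = ToySrcu()
cookie = ss.down_read(0)
assert not ss.readers_done(cookie)  # reader in flight blocks the scan
ss.up_read(3, cookie)               # unlock on a different CPU, down/up style
assert ss.readers_done(cookie)      # counts need not hit zero, only match
ss.grace_period()
```

[Note in particular that readers_done() never requires the counts to reach zero, matching Paul's remark that the number of readers need not go to zero at any given time.]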

Alan

2023-01-20 20:07:27

by Alan Stern

[permalink] [raw]
Subject: Re: Internal vs. external barriers (was: Re: Interesting LKMM litmus test)

On Fri, Jan 20, 2023 at 09:30:54AM -0800, Paul E. McKenney wrote:
> On Fri, Jan 20, 2023 at 11:14:06AM -0500, Alan Stern wrote:
> > My revised patch is below.
>
> Thank you! Are you OK with my putting this on a not-for-mainline branch
> for experimental purposes?

Sure, go ahead.

Alan

2023-01-20 20:07:37

by Paul E. McKenney

[permalink] [raw]
Subject: Re: Internal vs. external barriers (was: Re: Interesting LKMM litmus test)

On Fri, Jan 20, 2023 at 01:37:51PM -0500, Alan Stern wrote:
> On Fri, Jan 20, 2023 at 09:58:04AM -0800, Paul E. McKenney wrote:
> > On Fri, Jan 20, 2023 at 11:01:03AM -0500, Alan Stern wrote:
> > > On Wed, Jan 18, 2023 at 04:02:14PM -0800, Paul E. McKenney wrote:
> > > > There are pairs of per-CPU counters. One pair (->srcu_lock_count[])
> > > > counts the number of srcu_down_read() operations that took place on
> > > > that CPU and another pair (->srcu_unlock_count[]) counts the number
> > > > of srcu_up_read() operations that took place on that CPU. There is
> > > > an ->srcu_idx that selects which of the ->srcu_lock_count[] elements
> > > > should be incremented by srcu_down_read(). Of course, srcu_down_read()
> > > > returns the value of ->srcu_idx that it used so that the matching
> > > > srcu_up_read() will use that same index when incrementing its CPU's
> > > > ->srcu_unlock_count[].
> > > >
> > > > Grace periods go something like this:
> > > >
> > > > 1. Sum up the ->srcu_unlock_count[!ssp->srcu_idx] counters.
> > > >
> > > > 2. smp_mb().
> > > >
> > > > 3. Sum up the ->srcu_unlock_count[!ssp->srcu_idx] counters.
> > >
> > > Presumably you meant to write "lock" here, not "unlock".
> >
> > You are quite right, and apologies for my confusion.
> >
> > > > 4. If the sums are not equal, retry from #1.
> > > >
> > > > 5. smp_mb().
> > > >
> > > > 6. WRITE_ONCE(ssp->srcu_idx, !ssp->srcu_idx);
> > > >
> > > > 7. smp_mb().
> > > >
> > > > 8. Same loop as #1-4.
> > > >
> > > > So similar to r/w semaphores, but with two separate distributed counts.
> > > > This means that the number of readers need not go to zero at any given
> > > > point in time, consistent with the need to wait only on old readers.
> > >
> > > Reasoning from first principles, I deduce the following:
> > >
> > > You didn't describe exactly how srcu_down_read() and srcu_up_read()
> > > work. Evidently the unlock increment in srcu_up_read() should have
> > > release semantics, to prevent accesses from escaping out the end of the
> > > critical section. But the lock increment in srcu_down_read() has to be
> > > stronger than an acquire; to prevent accesses in the critical section
> > > escaping out the start, the increment has to be followed by smp_mb().
> >
> > You got it! There is some work going on to see if srcu_read_lock()'s
> > smp_mb() can be weakened to pure release, but we will see.
>
> That doesn't make sense. Release ordering in srcu_read_lock() would
> only prevent accesses from leaking _in_ to the critical section. It
> would do nothing to prevent accesses from leaking _out_.

Yes, I should have said srcu_read_unlock(). I do seem to be having
lock/unlock difficulties. :-/

We could remove the smp_mb() from srcu_read_lock(), but at the expense
of a round of IPIs from the grace-period code, along with interactions
with things like the CPU-hotplug code paths. I am not proposing
doing that; for one thing, one of the attractions of SRCU is its fast
and disturbance-free grace period when there are no readers in flight.
It is possible, though: Tasks Trace RCU does just this, IPIs, CPU hotplug,
and all.

There are other ways to do this, but the ones I know of would restrict
the contexts in which srcu_read_lock() and srcu_read_unlock() can be
executed, for example, in the context of offline CPUs.

> > > The smp_mb() fences in steps 5 and 7 appear to be completely
> > > unnecessary.
> >
> > For correctness, agreed. Their purpose is instead forward progress.
>
> It's hard to say whether they would be effective at that. smp_mb()
> forces the processor to wait until some time when previous writes have
> become visible to all CPUs. But if you don't wait for that potentially
> excessively long delay, you may be able to continue and be lucky enough
> to find that all previous writes have already become visible to all the
> CPUs that matter.
>
> As far as I know, smp_mb() doesn't expedite the process of making
> previous writes visible. However, I am very far from being an expert
> on system architecture design.

As you noticed, without the step-7 smp_mb(), a potentially large number
of invocations of srcu_read_lock() could use the old index value, that
is, the index value that is to be counted in step 8. Then the step-8
phase of the grace period could unnecessarily wait on those readers.

Similarly, without the step-5 smp_mb() and without the control
dependencies extending from the loads feeding into step 4's sum,
srcu_read_lock() and srcu_read_unlock() on other CPUs might prematurely
use the new index, which could force the step 1-4 phase of the grace
period to unnecessarily wait on those readers.

> > One can argue that step 5 is redundant due to control dependency, but
> > control dependencies are fragile, and as you say below, this code is
> > nowhere near a fastpath.
>
> Also, control dependencies do not contribute to forward progress.

I might be mistaken, and you can argue that the risk is small, but without
that ordering, step 4 could see unintended increments that could force
unnecessary retries of steps 1-3.

> > > Provided an smp_mb() is added at the very start and end of the grace
> > > period, the memory barrier in step 2 and its copy in step 8 can be
> > > demoted to smp_rmb().
> >
> > This might need to be smp_mb() to allow srcu_read_unlock() to be
> > demoted to release ordering. Work in progress.
>
> I thought srcu_read_unlock() already _is_ a release operation. The
> smp_mb() fence mentioned earlier needs to be in srcu_read_lock(), not
> unlock(). And there's no way that one can be demoted.

Agreed, my mistake earlier. The smp_mb() in srcu_read_lock() must
remain smp_mb().

> srcu_read_unlock() does not need a full smp_mb().

That is quite possible, and that is what we are looking into. And testing
thus far agrees with you. But the grace-period ordering constraints
are quite severe, so this requires careful checking and severe testing.

> > > These changes would be small optimizations at best, and you may consider
> > > them unimportant in view of the fact that grace periods often last quite
> > > a long time.
> >
> > Agreed, keeping it simple and obvious is important on this code, which
> > is nowhere near a fastpath. The case of srcu_read_unlock() is another
> > thing altogether.
>
> Unfortunately, the full fence in srcu_read_lock() is unavoidable without
> very major changes to the algorithm -- probably a complete redesign.
> Without it, a read inside the critical section could be executed before
> the store part of the increment, which could lead synchronize_srcu() to
> believe that the critical section had not yet started when in fact it
> had.

I actually did type "srcu_read_unlock()" correctly in this case. ;-)

But yes, removing the smp_mb() from srcu_read_lock() is not in the cards.
On the other hand, doing so for srcu_read_unlock() just might be both
doable and worthwhile.

Thanx, Paul

2023-01-20 20:07:55

by Paul E. McKenney

[permalink] [raw]
Subject: Re: Internal vs. external barriers (was: Re: Interesting LKMM litmus test)

On Fri, Jan 20, 2023 at 01:15:58PM -0500, Alan Stern wrote:
> On Fri, Jan 20, 2023 at 09:30:54AM -0800, Paul E. McKenney wrote:
> > On Fri, Jan 20, 2023 at 11:14:06AM -0500, Alan Stern wrote:
> > > My revised patch is below.
> >
> > Thank you! Are you OK with my putting this on a not-for-mainline branch
> > for experimental purposes?
>
> Sure, go ahead.

And done! This is on -rcu branch lkmm-srcu.2023.01.20a, as shown below.

Thanx, Paul

------------------------------------------------------------------------

commit f0d4b328e12cdc7f34c11f7c82b28a16e097f769
Author: Alan Stern <[email protected]>
Date: Fri Jan 20 10:34:59 2023 -0800

tools/memory-model: Provide exact SRCU semantics

LKMM has long provided only approximate handling of SRCU read-side
critical sections. This has not been a pressing problem because LKMM's
traditional handling is correct for the common cases of non-overlapping
and of properly nested critical sections. However, LKMM's traditional
handling of partially overlapping critical sections incorrectly fuses
them into one large critical section.

For example, consider the following litmus test:

------------------------------------------------------------------------

C C-srcu-nest-5

(*
* Result: Sometimes
*
* This demonstrates non-nested overlapping of SRCU read-side critical
* sections. Unlike RCU, SRCU critical sections do not unconditionally
* nest.
*)

{}

P0(int *x, int *y, struct srcu_struct *s1)
{
int r1;
int r2;
int r3;
int r4;

r3 = srcu_read_lock(s1);
r2 = READ_ONCE(*y);
r4 = srcu_read_lock(s1);
srcu_read_unlock(s1, r3);
r1 = READ_ONCE(*x);
srcu_read_unlock(s1, r4);
}

P1(int *x, int *y, struct srcu_struct *s1)
{
WRITE_ONCE(*y, 1);
synchronize_srcu(s1);
WRITE_ONCE(*x, 1);
}

locations [0:r1]
exists (0:r1=1 /\ 0:r2=0)

------------------------------------------------------------------------

But current mainline incorrectly flattens the two critical sections
into one larger critical section, giving "Never" instead of the correct
"Sometimes":

------------------------------------------------------------------------

$ herd7 -conf linux-kernel.cfg C-srcu-nest-5.litmus
Test C-srcu-nest-5 Allowed
States 3
0:r1=0; 0:r2=0;
0:r1=0; 0:r2=1;
0:r1=1; 0:r2=1;
No
Witnesses
Positive: 0 Negative: 3
Flag srcu-bad-nesting
Condition exists (0:r1=1 /\ 0:r2=0)
Observation C-srcu-nest-5 Never 0 3
Time C-srcu-nest-5 0.01
Hash=e692c106cf3e84e20f12991dc438ff1b

------------------------------------------------------------------------

To its credit, it does complain. But with this commit, we get the
following result, which has the virtue of being correct:

------------------------------------------------------------------------

$ herd7 -conf linux-kernel.cfg C-srcu-nest-5.litmus
Test C-srcu-nest-5 Allowed
States 4
0:r1=0; 0:r2=0;
0:r1=0; 0:r2=1;
0:r1=1; 0:r2=0;
0:r1=1; 0:r2=1;
Ok
Witnesses
Positive: 1 Negative: 3
Condition exists (0:r1=1 /\ 0:r2=0)
Observation C-srcu-nest-5 Sometimes 1 3
Time C-srcu-nest-5 0.05
Hash=e692c106cf3e84e20f12991dc438ff1b

------------------------------------------------------------------------

In addition, there are srcu_down_read() and srcu_up_read() functions on
their way to mainline. Roughly speaking, these are to srcu_read_lock()
and srcu_read_unlock() as mutex_lock() and mutex_unlock() are to down()
and up(). The key point is that srcu_down_read() can execute in one
process and the matching srcu_up_read() in another, as shown in this
litmus test:

------------------------------------------------------------------------

C C-srcu-nest-6

(*
* Result: Never
*
* This would be valid for srcu_down_read() and srcu_up_read().
*)

{}

P0(int *x, int *y, struct srcu_struct *s1, int *idx, int *f)
{
int r2;
int r3;

r3 = srcu_down_read(s1);
WRITE_ONCE(*idx, r3);
r2 = READ_ONCE(*y);
smp_store_release(f, 1);
}

P1(int *x, int *y, struct srcu_struct *s1, int *idx, int *f)
{
int r1;
int r3;
int r4;

r4 = smp_load_acquire(f);
r1 = READ_ONCE(*x);
r3 = READ_ONCE(*idx);
srcu_up_read(s1, r3);
}

P2(int *x, int *y, struct srcu_struct *s1)
{
WRITE_ONCE(*y, 1);
synchronize_srcu(s1);
WRITE_ONCE(*x, 1);
}

locations [0:r1]
filter (1:r4=1)
exists (1:r1=1 /\ 0:r2=0)

------------------------------------------------------------------------

When run on current mainline, this litmus test gets a complaint about
an unknown macro srcu_down_read(). With this commit:

------------------------------------------------------------------------

$ herd7 -conf linux-kernel.cfg C-srcu-nest-6.litmus
Test C-srcu-nest-6 Allowed
States 3
0:r1=0; 0:r2=0; 1:r1=0;
0:r1=0; 0:r2=1; 1:r1=0;
0:r1=0; 0:r2=1; 1:r1=1;
No
Witnesses
Positive: 0 Negative: 3
Condition exists (1:r1=1 /\ 0:r2=0)
Observation C-srcu-nest-6 Never 0 3
Time C-srcu-nest-6 0.02
Hash=c1f20257d052ca5e899be508bedcb2a1

------------------------------------------------------------------------

Note that the user must supply the flag "f" and the "filter" clause,
similar to what must be done to emulate call_rcu().

[ paulmck: Fix space-before-tab whitespace nit. ]

TBD-contributions-from: Jonas Oberhauser <[email protected]>
Not-yet-signed-off-by: Alan Stern <[email protected]>
Signed-off-by: Paul E. McKenney <[email protected]>

diff --git a/tools/memory-model/linux-kernel.bell b/tools/memory-model/linux-kernel.bell
index 70a9073dec3e5..6e702cda15e18 100644
--- a/tools/memory-model/linux-kernel.bell
+++ b/tools/memory-model/linux-kernel.bell
@@ -53,38 +53,31 @@ let rcu-rscs = let rec
in matched

(* Validate nesting *)
-flag ~empty Rcu-lock \ domain(rcu-rscs) as unbalanced-rcu-locking
-flag ~empty Rcu-unlock \ range(rcu-rscs) as unbalanced-rcu-locking
+flag ~empty Rcu-lock \ domain(rcu-rscs) as unbalanced-rcu-lock
+flag ~empty Rcu-unlock \ range(rcu-rscs) as unbalanced-rcu-unlock

(* Compute matching pairs of nested Srcu-lock and Srcu-unlock *)
-let srcu-rscs = let rec
- unmatched-locks = Srcu-lock \ domain(matched)
- and unmatched-unlocks = Srcu-unlock \ range(matched)
- and unmatched = unmatched-locks | unmatched-unlocks
- and unmatched-po = ([unmatched] ; po ; [unmatched]) & loc
- and unmatched-locks-to-unlocks =
- ([unmatched-locks] ; po ; [unmatched-unlocks]) & loc
- and matched = matched | (unmatched-locks-to-unlocks \
- (unmatched-po ; unmatched-po))
- in matched
+let carry-srcu-data = (data ; [~ Srcu-unlock] ; rf)*
+let srcu-rscs = ([Srcu-lock] ; carry-srcu-data ; data ; [Srcu-unlock]) & loc

(* Validate nesting *)
-flag ~empty Srcu-lock \ domain(srcu-rscs) as unbalanced-srcu-locking
-flag ~empty Srcu-unlock \ range(srcu-rscs) as unbalanced-srcu-locking
+flag ~empty Srcu-lock \ domain(srcu-rscs) as unbalanced-srcu-lock
+flag ~empty Srcu-unlock \ range(srcu-rscs) as unbalanced-srcu-unlock
+flag ~empty (srcu-rscs^-1 ; srcu-rscs) \ id as multiple-srcu-matches

(* Check for use of synchronize_srcu() inside an RCU critical section *)
flag ~empty rcu-rscs & (po ; [Sync-srcu] ; po) as invalid-sleep

(* Validate SRCU dynamic match *)
-flag ~empty different-values(srcu-rscs) as srcu-bad-nesting
+flag ~empty different-values(srcu-rscs) as bad-srcu-value-match

(* Compute marked and plain memory accesses *)
let Marked = (~M) | IW | Once | Release | Acquire | domain(rmw) | range(rmw) |
- LKR | LKW | UL | LF | RL | RU
+ LKR | LKW | UL | LF | RL | RU | Srcu-lock | Srcu-unlock
let Plain = M \ Marked

(* Redefine dependencies to include those carried through plain accesses *)
-let carry-dep = (data ; rfi)*
+let carry-dep = (data ; [~ Srcu-unlock] ; rfi)*
let addr = carry-dep ; addr
let ctrl = carry-dep ; ctrl
let data = carry-dep ; data
diff --git a/tools/memory-model/linux-kernel.def b/tools/memory-model/linux-kernel.def
index ef0f3c1850dee..e1f65e6de06f1 100644
--- a/tools/memory-model/linux-kernel.def
+++ b/tools/memory-model/linux-kernel.def
@@ -49,8 +49,10 @@ synchronize_rcu() { __fence{sync-rcu}; }
synchronize_rcu_expedited() { __fence{sync-rcu}; }

// SRCU
-srcu_read_lock(X) __srcu{srcu-lock}(X)
-srcu_read_unlock(X,Y) { __srcu{srcu-unlock}(X,Y); }
+srcu_read_lock(X) __load{srcu-lock}(*X)
+srcu_read_unlock(X,Y) { __store{srcu-unlock}(*X,Y); }
+srcu_down_read(X) __load{srcu-lock}(*X)
+srcu_up_read(X,Y) { __store{srcu-unlock}(*X,Y); }
synchronize_srcu(X) { __srcu{sync-srcu}(X); }
synchronize_srcu_expedited(X) { __srcu{sync-srcu}(X); }

diff --git a/tools/memory-model/lock.cat b/tools/memory-model/lock.cat
index 6b52f365d73ac..53b5a492739d0 100644
--- a/tools/memory-model/lock.cat
+++ b/tools/memory-model/lock.cat
@@ -36,9 +36,9 @@ let RU = try RU with emptyset
(* Treat RL as a kind of LF: a read with no ordering properties *)
let LF = LF | RL

-(* There should be no ordinary R or W accesses to spinlocks *)
-let ALL-LOCKS = LKR | LKW | UL | LF | RU
-flag ~empty [M \ IW] ; loc ; [ALL-LOCKS] as mixed-lock-accesses
+(* There should be no ordinary R or W accesses to spinlocks or SRCU structs *)
+let ALL-LOCKS = LKR | LKW | UL | LF | RU | Srcu-lock | Srcu-unlock | Sync-srcu
+flag ~empty [M \ IW \ ALL-LOCKS] ; loc ; [ALL-LOCKS] as mixed-lock-accesses

(* Link Lock-Reads to their RMW-partner Lock-Writes *)
let lk-rmw = ([LKR] ; po-loc ; [LKW]) \ (po ; po)

2023-01-20 21:10:14

by Alan Stern

[permalink] [raw]
Subject: Re: Internal vs. external barriers (was: Re: Interesting LKMM litmus test)

On Fri, Jan 20, 2023 at 11:20:32AM -0800, Paul E. McKenney wrote:
> On Fri, Jan 20, 2023 at 01:37:51PM -0500, Alan Stern wrote:
> > srcu_read_unlock() does not need a full smp_mb().
>
> That is quite possible, and that is what we are looking into. And testing
> thus far agrees with you. But the grace-period ordering constraints
> are quite severe, so this requires careful checking and severe testing.

If you're interested, I can provide a simple argument to show that the
Fundamental Law of RCU would continue to hold with only a release fence.
There is an added requirement: merely that synchronize_srcu() must have
an smp_mb() somewhere after its final read of the unlock counters --
which your version of the algorithm already has.

Alan

2023-01-20 21:13:06

by Jonas Oberhauser

[permalink] [raw]
Subject: Re: Internal vs. external barriers (was: Re: Interesting LKMM litmus test)



On 1/20/2023 4:32 PM, Paul E. McKenney wrote:
> On Fri, Jan 20, 2023 at 01:51:01PM +0100, Jonas Oberhauser wrote:
>> I'm not going to get it right today, am I?
> Believe me, I know that feeling! Open-source development is therefore
> an extremely good character-building exercise. At least that is what
> I keep telling myself. ;-)

"Calvin, go do something you hate! Being miserable builds character!"

>
>> +let srcu-rscs = ([Srcu-lock] ; (data ; [~ Srcu-unlock] ; rfe) * ; data ;
>> [Srcu-unlock]) & loc
>>
>> I see now that I copied the format from your message but without realizing
>> the original had a `|` where I have a `;`.
>> I hope this version is finally right and perhaps more natural than the (data
>> | rf) version, considering rf can't actually appear in most places and this
>> more closely matches carry-dep;data.
>> But of course feel free to use
>> +let srcu-rscs = ([Srcu-lock] ; (data | [~ Srcu-unlock] ; rf)+ ;
>> [Srcu-unlock]) & loc
>> instead if you prefer.
>
> The reason for favoring "rf" over "rfe" is the possibility of a litmus
> test where the process containing the srcu_down_read() sometimes but
> not always also has the matching srcu_up_read(). Perhaps a pair of "if"
> statements control which process does the matching srcu_up_read().

If you put the redefinition of data early enough to affect this
definition, the rfi option should be covered by the carry-dep in the
redefinition of data, so I left it out.

> And thank you!!!

always ;-)

jonas

2023-01-20 21:52:50

by Paul E. McKenney

[permalink] [raw]
Subject: Re: Internal vs. external barriers (was: Re: Interesting LKMM litmus test)

On Fri, Jan 20, 2023 at 03:36:24PM -0500, Alan Stern wrote:
> On Fri, Jan 20, 2023 at 11:20:32AM -0800, Paul E. McKenney wrote:
> > On Fri, Jan 20, 2023 at 01:37:51PM -0500, Alan Stern wrote:
> > > srcu_read_unlock() does not need a full smp_mb().
> >
> > That is quite possible, and that is what we are looking into. And testing
> > thus far agrees with you. But the grace-period ordering constraints
> > are quite severe, so this requires careful checking and severe testing.
>
> If you're interested, I can provide a simple argument to show that the
> Fundamental Law of RCU would continue to hold with only a release fence.
> There is an added requirement: merely that synchronize_srcu() must have
> an smp_mb() somewhere after its final read of the unlock counters --
> which your version of the algorithm already has.

Please!

For your amusement, here is a very informal argument that this is
the case:

https://docs.google.com/document/d/1xvwQzavmH474MBPAIBqVyvCrCcS5j2BpqhErPhRj7Is/edit?usp=sharing

See the "Read-Side Optimizations" section at the end.

Thanx, Paul

2023-01-20 22:20:51

by Jonas Oberhauser

[permalink] [raw]
Subject: Re: Internal vs. external barriers (was: Re: Interesting LKMM litmus test)



On 1/20/2023 4:39 PM, Paul E. McKenney wrote:
> On Fri, Jan 20, 2023 at 10:43:10AM +0100, Jonas Oberhauser wrote:
>
>> I don't think Boqun's patch is hard to repair.
>> Besides the issue you mention, I think it's also missing Sync-srcu, which
>> seems to be linked by loc based on its first argument.
>>
>> How about something like this?
>>
>> let ALL-LOCKS = LKR | LKW | UL | LF | RU | Srcu-lock | Srcu-unlock |
>> Sync-srcu flag ~empty ~[ALL_LOCKS | IW] ; loc ; [ALL-LOCKS] as
>> mixed-lock-accesses
>>
>> If you're using something that isn't a lock or initial write on the same location as a lock, you get the flag.
> Wouldn't that unconditionally complain about the first srcu_read_lock()
> in a given process? Or am I misreading those statements?
>

I unfolded the definition step by step and it seems I was careless when
distributing the ~ over the [] operator.
I should have written:

flag ~empty [~(ALL_LOCKS | IW)] ; loc ; [ALL-LOCKS] as mixed-lock-accesses

but somehow I thought I can save the parentheses by putting the ~ on the
outside.
Now on the off-chance that this is kind of how you already read the
relation, let me unfold it step-by-step.

Let's assume that the sequence s of operations on this location is
  s = initial write , (perhaps some gps) , first read lock , read
lock&unlock&gp ...
then the flag would appear if the specified relation isn't empty. That
would be the case if there are a and b that are linked by

a ->[~(ALL_LOCKS | IW)] ; loc ; [ALL-LOCKS] b

This means a is neither in ALL_LOCKS nor in IW, while b is in ALL-LOCKS; furthermore, they are equal to events a' and b', respectively, that are related by loc, i.e., appear in this sequence s. Thus both a and b actually appear in the sequence s.
However, every event in the sequence s is either in ALL_LOCKS or in IW, which contradicts the assumption that a is in the sequence but in neither of the sets. Because of this contradiction, the flag doesn't appear if the sequence looks like this.

More generally, if every event in the sequence is either the initial write or one of (srcu-) lock,unlock,up,down,sync, there won't be a flag.

In contrast, if the sequence has the form
s' = initial write, (normal srcu events), some other access x, (normal srcu events)
and y is one of the srcu events in this sequence, then
x ->[~(ALL_LOCKS | IW)] ; loc ; [ALL_LOCKS] y
and you get a flag.
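[Aside: the corrected relation can be sanity-checked mechanically over the two sequences s and s' above. Here is a small Python sketch; event names are invented, and the real check is of course done by herd7:]

```python
# Toy check of the corrected flag relation
#   [~(ALL_LOCKS | IW)] ; loc ; [ALL-LOCKS]
# over the two sequences discussed above.  Event names are invented.

def compose(r1, r2):
    return {(a, c) for (a, b) in r1 for (b2, c) in r2 if b == b2}

def ident(s):
    return {(e, e) for e in s}

def mixed_lock_accesses(events, all_locks, iw, loc):
    outside = ident(events - (all_locks | iw))   # [~(ALL_LOCKS | IW)]
    return compose(compose(outside, loc), ident(all_locks))

# Sequence s: initial write plus srcu events only -> relation empty, no flag.
events = {"IW", "lock1", "unlock1", "gp1"}
all_locks = {"lock1", "unlock1", "gp1"}
loc = {(a, b) for a in events for b in events}
assert not mixed_lock_accesses(events, all_locks, {"IW"}, loc)

# Sequence s': an ordinary access x on the same location -> flagged.
events2 = events | {"x"}
loc2 = {(a, b) for a in events2 for b in events2}
assert mixed_lock_accesses(events2, all_locks, {"IW"}, loc2)
```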

Best wishes,
jonas

2023-01-20 22:21:02

by Jonas Oberhauser

[permalink] [raw]
Subject: Re: Internal vs. external barriers (was: Re: Interesting LKMM litmus test)



On 1/20/2023 5:18 PM, Alan Stern wrote:
> On Fri, Jan 20, 2023 at 11:13:00AM +0100, Jonas Oberhauser wrote:
>> Perhaps we could say that reading an index without using it later is
>> forbidden?
>>
>> flag ~empty [Srcu-lock];data;rf;[~ domain(data;[Srcu-unlock])] as
>> thrown-srcu-cookie-on-floor
> We already flag locks that don't have a matching unlock.

Of course, but as you know this is completely orthogonal.

> I don't see any point in worrying about whatever else happens to the index.

Can you briefly explain how the operational model you have in mind for
srcu's up and down allows x==1 (and y==0 and idx1==idx2) in the example
I sent before (copied with minor edit below for convenience)?

P0{
    idx1 = srcu_down(&ss);
    store_rel(p1, true);

    shared cs

    R x == 1

    while (! load_acq(p2));
    R idx2 == idx1 // for some reason, we got lucky!
    srcu_up(&ss,idx1);
}

P1{
    idx2 = srcu_down(&ss);
    store_rel(p2, true);

    shared cs

    R y == 0

    while (! load_acq(p1));
    srcu_up(&ss,idx2);
}

P2 {
    W y = 1
    srcu_sync(&ss);
    W x = 1
}


I can imagine models that allow this but they aren't pretty. Maybe you
have a better operational model?

>
>> So if there is an srcu_down() that produces a cookie that is read by some
>> read R, and R doesn't then pass that value into an srcu_up(), the
>> srcu-warranty is voided.
> No, it isn't.
I quote Paul:
"If you do anything else at all with it, anything at all, you just
voided your SRCU warranty. For that matter, if you just throw that value
on the floor and don't pass it to an srcu_up_read() execution, you also
just voided your SRCU warranty."

Best wishes,
jonas

2023-01-20 22:22:45

by Paul E. McKenney

[permalink] [raw]
Subject: Re: Internal vs. external barriers (was: Re: Interesting LKMM litmus test)

On Fri, Jan 20, 2023 at 09:56:36PM +0100, Jonas Oberhauser wrote:
>
>
> On 1/20/2023 4:32 PM, Paul E. McKenney wrote:
> > On Fri, Jan 20, 2023 at 01:51:01PM +0100, Jonas Oberhauser wrote:
> > > I'm not going to get it right today, am I?
> > Believe me, I know that feeling! Open-source development is therefore
> > an extremely good character-building exercise. At least that is what
> > I keep telling myself. ;-)
>
> "Calvin, go do something you hate! Being miserable builds character!"

Heh! There is the school of thought that says that if children
automatically did everything that they needed to do, they would not
need parents. Now about adults doing what they need to do, myself
included... ;-)

> > > +let srcu-rscs = ([Srcu-lock] ; (data ; [~ Srcu-unlock] ; rfe) * ; data ;
> > > [Srcu-unlock]) & loc
> > >
> > > I see now that I copied the format from your message but without realizing
> > > the original had a `|` where I have a `;`.
> > > I hope this version is finally right and perhaps more natural than the (data
> > > | rf) version, considering rf can't actually appear in most places and this
> > > more closely matches carry-dep;data.
> > > But of course feel free to use
> > > +let srcu-rscs = ([Srcu-lock] ; (data | [~ Srcu-unlock] ; rf)+ ;
> > > [Srcu-unlock]) & loc
> > > instead if you prefer.
> >
> > The reason for favoring "rf" over "rfe" is the possibility of a litmus
> > test where the process containing the srcu_down_read() sometimes but
> > not always also has the matching srcu_up_read(). Perhaps a pair of "if"
> > statements control which process does the matching srcu_up_read().
>
> If you put the redefinition of data early enough to affect this definition,
> the rfi option should be covered by the carry-dep in the redefinition of
> data, so I left it out.

For right now, I will favor obviousness over minimalism, but for the
real patch, I will let Alan decide what makes the most sense. I am
sure that you will not be shy about letting him know of your thoughts
on the matter. ;-)

Thanx, Paul

> > And thank you!!!
>
> always ;-)
>
> jonas
>

2023-01-20 22:26:21

by Paul E. McKenney

[permalink] [raw]
Subject: Re: Internal vs. external barriers (was: Re: Interesting LKMM litmus test)

On Fri, Jan 20, 2023 at 09:46:55PM +0100, Jonas Oberhauser wrote:
>
>
> On 1/20/2023 4:39 PM, Paul E. McKenney wrote:
> > On Fri, Jan 20, 2023 at 10:43:10AM +0100, Jonas Oberhauser wrote:
> >
> > > I don't think Boqun's patch is hard to repair.
> > > Besides the issue you mention, I think it's also missing Sync-srcu, which
> > > seems to be linked by loc based on its first argument.
> > >
> > > How about something like this?
> > >
> > > let ALL-LOCKS = LKR | LKW | UL | LF | RU | Srcu-lock | Srcu-unlock |
> > > Sync-srcu flag ~empty ~[ALL_LOCKS | IW] ; loc ; [ALL-LOCKS] as
> > > mixed-lock-accesses
> > >
> > > If you're using something that isn't a lock or initial write on the same location as a lock, you get the flag.
> > Wouldn't that unconditionally complain about the first srcu_read_lock()
> > in a given process? Or am I misreading those statements?
> >
>
> I unfolded the definition step by step and it seems I was careless when
> distributing the ~ over the [] operator.
> I should have written:
>
> flag ~empty [~(ALL_LOCKS | IW)] ; loc ; [ALL-LOCKS] as mixed-lock-accesses
>
> but somehow I thought I can save the parentheses by putting the ~ on the
> outside.
> Now on the off-chance that this is kind of how you already read the
> relation, let me unfold it step-by-step.
>
> Let's assume that the sequence s of operations on this location is
>     s = initial write , (perhaps some gps) , first read lock , read
> lock&unlock&gp ...
> then the flag would appear if the specified relation isn't empty. That would
> be the case if there are a and b that are linked by
>
> a ->[~(ALL_LOCKS | IW)] ; loc ; [ALL-LOCKS] b
>
> This means a is neither in ALL_LOCKS nor in IW, while b is in ALL-LOCKS; and furthermore, they are equal to events a' and b' respectively that are related by loc, i.e., appear in this sequence s. Thus both a and b actually appear in the sequence s.
> However, every event in the sequence s is either in ALL_LOCKS or in IW, which contradicts the assumption that a is in the sequence and in neither of the sets. Because of this contradiction, the flag doesn't appear if the sequence looks like this.
>
> More generally, if every event in the sequence is either the initial write or one of (srcu-) lock,unlock,up,down,sync, there won't be a flag.
>
> In contrast, if the sequence has the form
> s' = initial write, (normal srcu events), some other access x, (normal srcu events)
> and y is one of the srcu events in this sequence, then
> x ->[~(ALL_LOCKS | IW)] ; loc ; [ALL-LOCKS] y
> and you get a flag.

Thank you! When I get done messing with NMIs, I will give this a go.

Just out of curiosity, are you set up to run LKMM locally at your end?

Thanx, Paul

2023-01-20 22:35:39

by Jonas Oberhauser

[permalink] [raw]
Subject: Re: Internal vs. external barriers (was: Re: Interesting LKMM litmus test)



On 1/20/2023 4:47 PM, Paul E. McKenney wrote:
> On Fri, Jan 20, 2023 at 11:13:00AM +0100, Jonas Oberhauser wrote:
>>
>> On 1/19/2023 7:41 PM, Paul E. McKenney wrote:
>>> On Thu, Jan 19, 2023 at 02:39:01PM +0100, Jonas Oberhauser wrote:
>>>> On 1/19/2023 1:11 AM, Paul E. McKenney wrote:
>>>>> On Wed, Jan 18, 2023 at 10:24:50PM +0100, Jonas Oberhauser wrote:
>>>>>> What I was thinking of is more something like this:
>>>>>>
>>>>>> P0{
>>>>>>     idx1 = srcu_down(&ss);
>>>>>>     srcu_up(&ss,idx1);
>>>>>> }
>>>>>>
>>>>>> P1{
>>>>>>     idx2 = srcu_down(&ss);
>>>>>>     srcu_up(&ss,idx2);
>>>>>> }
>>>>> And srcu_read_lock() and srcu_read_unlock() already do this.
>>>> I think I left out too much from my example.
>>>> And filling in the details led me down a bit of a rabbit hole of confusion
>>>> for a while.
>>>> But here's what I ended up with:
>>>>
>>>>
>>>> P0{
>>>>     idx1 = srcu_down(&ss);
>>>>     store_rel(p1, true);
>>>>
>>>>
>>>>     shared cs
>>>>
>>>>     R x == ?
>>>>
>>>>     while (! load_acq(p2));
>>>>     R idx2 == idx1 // for some reason, we got lucky!
>>>>     srcu_up(&ss,idx1);
>>> Although the current Linux-kernel implementation happens to be fine with
>>> this sort of abuse, I am quite happy to tell people "Don't do that!"
>>> And you can do this with srcu_read_lock() and srcu_read_unlock().
>>> In contrast, this actually needs srcu_down_read() and srcu_up_read():
>> My point/clarification request wasn't about whether you could write that
>> code with read_lock() and read_unlock(), but what it would/should mean for
>> the operational and axiomatic models.
>> As I wrote later in the mail, for the operational model it is quite clear
>> that x==1 should be allowed for lock() and unlock(), but would probably be
>> forbidden for down() and up().
> Agreed, the math might say something or another about doing something
> with the srcu_read_lock() or srcu_down_read() return values (other than
> passing them to srcu_read_unlock() or srcu_up_read(), respectively),
> but such somethings are excluded by convention.
>
> So it would be nice for LKMM to complain about such abuse, but not
> at all mandatory.

I think at the very least it would be nice if the convention was written
down somewhere.

>> My clarification request is whether that difference in the probable
>> operational model should be reflected in the axiomatic model (as I first
>> suspected based on the word "semaphore" being dropped a lot), or whether
>> it's just due to abuse (i.e., yes the axiomatic model and operational model
>> might be different here, but you're not allowed to look).
> For the moment, I am taking the door labeled "abuse".
>
> Maybe someday someone will come up with a valid use case, but they have
> to prove it first. ;-)

Luckily, I currently don't have a stake in this :D
I currently don't think it's necessary to take a peek at cookies before
deciding whether it should be used or not, since the decision can't
depend on the value of the cookie anyways.

>
>> Which brings us to the next point:
>>
>>> Could you please review the remainder to see what remains given the
>>> usage restrictions that I called out above?
>> Perhaps we could say that reading an index without using it later is
>> forbidden?
>>
>> flag ~empty [Srcu-lock];data;rf;[~ domain(data;[Srcu-unlock])] as
>> thrown-srcu-cookie-on-floor
>>
>> So if there is an srcu_down() that produces a cookie that is read by some
>> read R, and R doesn't then pass that value into an srcu_up(), the
>> srcu-warranty is voided.
> I like the general idea, but I am dazed and confused by this "flag"
> statement.

Too bad, I was aiming for dazed and amazed. Ah well, I'll take what I
can get.

> Ah, separate down/up tags could make this "flag" statement at least
> somewhat less dazing and confusing.

Let me use up/down names and also fix the statement a little bit in
analogy to the other issue we had with the rf from the other subthread.

let use-cookie = (data|[~(Srcu-up|Srcu-unlock)] ; rfe)* ; data

flag ~empty [Srcu-down] ; use-cookie; [~Srcu-up] ; rf ; [~ domain(use-cookie;[Srcu-up])] as thrown-srcu-cookie-on-floor

Here use-cookie is essentially just a name for the relation we used
before to see where the cookie is being used in the end when defining
how to match srcu events: it links (among other things) an srcu-down to
every store that stores the cookie produced by that srcu-down,
and every read that reads such a cookie to the srcu_up() that uses the
cookie returned by that read. (Because of how srcu's up() and down() are
currently formalized, those two happen to be the same thing, but maybe
it helps thinking of them as separate for now).

Then the relation

[Srcu-down] ; use-cookie ; [~Srcu-up] ; rf ; [~ domain(use-cookie;[Srcu-up])]

links an event X to an event R exactly in the following case:

X ->use-cookie W ->rf R
and X \in Srcu-down, W \not\in Srcu-up, and R \not\in domain(use-cookie;[Srcu-up])

meaning X is an srcu_down(), and its cookie is stored by the write W,
and R is a read that looks at the cookie (it reads from W), but(!) the
cookie read by R is never used by any srcu_up().

More precisely, imagine that in contrast to what I just claimed, the
cookie read by R would actually be used in some srcu_up() event U.
Then R would be linked by use-cookie to U; we would have
  R ->use-cookie U
  and U \in Srcu-up
which we could rewrite as
  R ->use-cookie;[Srcu-up] U

Now because R appears on the left-hand-side of the relation with some
event (here U), R is in the domain(*) of this relation :
  R \in domain(use-cookie;[Srcu-up])
which is a contradiction.

In other words, the relation would be non-empty (= the flag is raised)
exactly when there is a read R that reads a cookie produced by some
srcu_down() event X, but the return value of that read is never used as
input to srcu_up().
This seems to be exactly the "drop on the floor" metaphor you mentioned
(and from my own experience I know it's bad to drop cookies on the floor).
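
For concreteness, here is a minimal sketch in the informal style of the
earlier examples (hypothetical and untested; ss and the shared variable c
are assumptions of mine) of a program that this flag would catch:

```
P0{
    idx = srcu_down(&ss);     // X: the down produces a cookie
    store_rel(c, idx);        // W: the cookie is stored to c
    srcu_up(&ss, idx);        // the cookie is also used properly here
}

P1{
    r = load_acq(c);          // R: reads the cookie from W ...
    // ... but r is never passed to any srcu_up(): cookie on the floor
}
```

Here X ->use-cookie W ->rf R holds, W is not an Srcu-up, and R is not in
domain(use-cookie;[Srcu-up]), so the relation is non-empty and the flag
is raised for P1's dropped cookie.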

Does that make it more clear why it might be a reasonable formalization
of that principle?
jonas

(*anyways I hope so, I always mix up domain and range, but I think
domain is the left side and range the right side. I can also barely keep
apart reft and light though, so...)


2023-01-20 22:41:00

by Jonas Oberhauser

[permalink] [raw]
Subject: Re: Internal vs. external barriers (was: Re: Interesting LKMM litmus test)



On 1/20/2023 10:37 PM, Paul E. McKenney wrote:
>
> Just out of curiosity, are you [set] up to run LKMM locally at your end?

I don't know what exactly that means. I generally run it on wetware.
But I sometimes ask Hernan to run Dat3M (on his machine) over all the
litmus tests in your repo to spot any obvious problems with variations I
consider.
I don't think Dat3M is feature-complete with herd at the moment, just
unbelievably faster. For example I think it ignores all flags in the cat
files.
Oh, I just remembered that I also installed herd7 recently to make sure
that any patches I might send in satisfy herd7 syntax requirements (I
think you called this diagnostic driven development?), but I haven't
used it to really run anything.

Is it too obvious that my words usually aren't backed by cold machine logic?

Best wishes,
jonas

2023-01-20 23:30:06

by Paul E. McKenney

[permalink] [raw]
Subject: Re: Internal vs. external barriers (was: Re: Interesting LKMM litmus test)

On Fri, Jan 20, 2023 at 11:36:15PM +0100, Jonas Oberhauser wrote:
>
>
> On 1/20/2023 10:37 PM, Paul E. McKenney wrote:
> >
> > Just out of curiosity, are you [set] up to run LKMM locally at your end?
>
> I don't know what exactly that means. I generally run it on wetware.
> But I sometimes ask Hernan to run Dat3M (on his machine) over all the litmus
> tests in your repo to spot any obvious problems with variations I consider.
> I don't think Dat3M is feature-complete with herd at the moment, just
> unbelievably faster. For example I think it ignores all flags in the cat
> files.
> Oh, I just remembered that I also installed herd7 recently to make sure that
> any patches I might send in satisfy herd7 syntax requirements (I think you
> called this diagnostic driven development?), but I haven't used it to really
> run anything.
>
> Is it too obvious that my words usually aren't backed by cold machine logic?

Well, there was this in one of your messages from earlier today: "I'm not
going to get it right today, am I?" And I freely confess that this led
me to suspect that you might not have been availing yourself of herd7's
opinion before posting. ;-)

If you clone the Linux kernel source on a Linux system, the information
here should help you get started. And we are of course here to help.

Your choice, just pointing out the option!

Thanx, Paul

2023-01-21 00:11:54

by Jonas Oberhauser

[permalink] [raw]
Subject: Re: Internal vs. external barriers (was: Re: Interesting LKMM litmus test)



On 1/21/2023 12:19 AM, Paul E. McKenney wrote:
> On Fri, Jan 20, 2023 at 11:36:15PM +0100, Jonas Oberhauser wrote:
>>
>> On 1/20/2023 10:37 PM, Paul E. McKenney wrote:
>>> Just out of curiosity, are you [set] up to run LKMM locally at your end?
>> I don't know what exactly that means. I generally run it on wetware.
>> But I sometimes ask Hernan to run Dat3M (on his machine) over all the litmus
>> tests in your repo to spot any obvious problems with variations I consider.
>> I don't think Dat3M is feature-complete with herd at the moment, just
>> unbelievably faster. For example I think it ignores all flags in the cat
>> files.
>> Oh, I just remembered that I also installed herd7 recently to make sure that
>> any patches I might send in satisfy herd7 syntax requirements (I think you
>> called this diagnostic driven development?), but I haven't used it to really
>> run anything.
>>
>> Is it too obvious that my words usually aren't backed by cold machine logic?
> Well, there was this in one of your messages from earlier today: "I'm not
> going to get it right today, am I?" And I freely confess that this led
> me to suspect that you might not have been availing yourself of herd7's
> opinion before posting. ;-)
The main reason I might usually not consult herd7's opinion is that it
often takes a while to write a test case in a way herd7 accepts and
treats as intended, but then even so the fact that some tests pass may
just give some false confidence when some tricky case is being missed.
So I find the investment/increased confidence ratio to not yet be at the
right point to do this when communicating somewhat informally on the
mailing list, which is already taking quite a bit of my time (but at
least I'm learning a lot during that time about stuff like RCU/SRCU,
history of LKMM, etc.).
If I need to be more confident I'll use herd7 to make sure the syntax is
correct and as a sanity check, and some paper or Coq proofs to be
confident in the logic.

If you feel that I'm wasting the list's time too much by making these
kinds of mistakes, let me know and I'll reconsider.

Best wishes, jonas

2023-01-21 00:57:24

by Paul E. McKenney

[permalink] [raw]
Subject: Re: Internal vs. external barriers (was: Re: Interesting LKMM litmus test)

On Sat, Jan 21, 2023 at 01:03:50AM +0100, Jonas Oberhauser wrote:
>
>
> On 1/21/2023 12:19 AM, Paul E. McKenney wrote:
> > On Fri, Jan 20, 2023 at 11:36:15PM +0100, Jonas Oberhauser wrote:
> > >
> > > On 1/20/2023 10:37 PM, Paul E. McKenney wrote:
> > > > Just out of curiosity, are you [set] up to run LKMM locally at your end?
> > > I don't know what exactly that means. I generally run it on wetware.
> > > But I sometimes ask Hernan to run Dat3M (on his machine) over all the litmus
> > > tests in your repo to spot any obvious problems with variations I consider.
> > > I don't think Dat3M is feature-complete with herd at the moment, just
> > > unbelievably faster. For example I think it ignores all flags in the cat
> > > files.
> > > Oh, I just remembered that I also installed herd7 recently to make sure that
> > > any patches I might send in satisfy herd7 syntax requirements (I think you
> > > called this diagnostic driven development?), but I haven't used it to really
> > > run anything.
> > >
> > > Is it too obvious that my words usually aren't backed by cold machine logic?
> > Well, there was this in one of your messages from earlier today: "I'm not
> > going to get it right today, am I?" And I freely confess that this led
> > me to suspect that you might not have been availing yourself of herd7's
> > opinion before posting. ;-)
> The main reason I might usually not consult herd7's opinion is that it often
> takes a while to write a test case in a way herd7 accepts and treats as
> intended, but then even so the fact that some tests pass may just give some
> false confidence when some tricky case is being missed.
> So I find the investment/increased confidence ratio to not yet be at the
> right point to do this when communicating somewhat informally on the mailing
> list, which is already taking quite a bit of my time (but at least I'm
> learning a lot during that time about stuff like RCU/SRCU, history of LKMM,
> etc.).
> If I need to be more confident I'll use herd7 to make sure the syntax is
> correct and as a sanity check, and some paper or Coq proofs to be confident
> in the logic.
>
> If you feel that I'm wasting the lists' time too much by making these kind
> of mistakes, let me know and I'll reconsider.

Not a goal of mine, actually.

The only thing that I will add is that I cheat horribly by creating new
litmus tests from existing ones. ;-)

Thanx, Paul

2023-01-21 05:00:48

by Paul E. McKenney

[permalink] [raw]
Subject: Re: Internal vs. external barriers (was: Re: Interesting LKMM litmus test)

On Fri, Jan 20, 2023 at 10:41:14PM +0100, Jonas Oberhauser wrote:
>
>
> On 1/20/2023 5:18 PM, Alan Stern wrote:
> > On Fri, Jan 20, 2023 at 11:13:00AM +0100, Jonas Oberhauser wrote:
> > > Perhaps we could say that reading an index without using it later is
> > > forbidden?
> > >
> > > flag ~empty [Srcu-lock];data;rf;[~ domain(data;[Srcu-unlock])] as
> > > thrown-srcu-cookie-on-floor
> > We already flag locks that don't have a matching unlock.
>
> Of course, but as you know this is completely orthogonal.
>
> > I don't see any point in worrying about whatever else happens to the index.
>
> Can you briefly explain how the operational model you have in mind for
> srcu's up and down allows x==1 (and y==0 and idx1==idx2) in the example I
> sent before (copied with minor edit below for convenience)?
>
> P0{
>     idx1 = srcu_down(&ss);
>     store_rel(p1, true);
>
>
>     shared cs
>
>     R x == 1
>
>     while (! load_acq(p2));
>     R idx2 == idx1 // for some reason, we got lucky!
>     srcu_up(&ss,idx1);
> }
>
> P1{
>     idx2 = srcu_down(&ss);
>     store_rel(p2, true);
>
>     shared cs
>
>     R y == 0
>
>     while (! load_acq(p1));
>     srcu_up(&ss,idx2);
> }
>
> P2 {
>     W y = 1
>     srcu_sync(&ss);
>     W x = 1
> }
>
>
> I can imagine models that allow this but they aren't pretty. Maybe you have
> a better operational model?
>
> >
> > > So if there is an srcu_down() that produces a cookie that is read by some
> > > read R, and R doesn't then pass that value into an srcu_up(), the
> > > srcu-warranty is voided.
> > No, it isn't.
> I quote Paul:
> "If you do anything else at all with it, anything at all, you just voided
> your SRCU warranty. For that matter, if you just throw that value on the
> floor and don't pass it to an srcu_up_read() execution, you also just voided
> your SRCU warranty."

I suspect that you guys are talking past one another. My guess is that
one of you is saying "we could check" and the other "we are not required
to check", which are not necessarily in disagreement.

But that is just a guess. You guys tell me! ;-)

Thanx, Paul

2023-01-21 17:58:32

by Alan Stern

[permalink] [raw]
Subject: Re: Internal vs. external barriers (was: Re: Interesting LKMM litmus test)

On Fri, Jan 20, 2023 at 10:41:14PM +0100, Jonas Oberhauser wrote:
>
>
> On 1/20/2023 5:18 PM, Alan Stern wrote:
> > On Fri, Jan 20, 2023 at 11:13:00AM +0100, Jonas Oberhauser wrote:
> > > Perhaps we could say that reading an index without using it later is
> > > forbidden?
> > >
> > > flag ~empty [Srcu-lock];data;rf;[~ domain(data;[Srcu-unlock])] as
> > > thrown-srcu-cookie-on-floor
> > We already flag locks that don't have a matching unlock.
>
> Of course, but as you know this is completely orthogonal.

Yeah, okay. It doesn't hurt to add this check, but the check isn't
complete. For example, it won't catch the invalid usage here:

P0(srcu_struct *ss)
{
int r1, r2;

r1 = srcu_read_lock(ss);
srcu_read_unlock(ss, r1);
r2 = srcu_read_lock(ss);
srcu_read_unlock(ss, r2);
}

exists (~0:r1=0:r2)

On the other hand, how often will people make this sort of mistake in
their litmus tests? My guess is not very.

> Can you briefly explain how the operational model you have in mind for
> srcu's up and down allows x==1 (and y==0 and idx1==idx2) in the example I
> sent before (copied with minor edit below for convenience)?
>
> P0{
>     idx1 = srcu_down(&ss);
>     store_rel(p1, true);
>
>
>     shared cs
>
>     R x == 1
>
>     while (! load_acq(p2));
>     R idx2 == idx1 // for some reason, we got lucky!
>     srcu_up(&ss,idx1);
> }
>
> P1{
>     idx2 = srcu_down(&ss);
>     store_rel(p2, true);
>
>     shared cs
>
>     R y == 0
>
>     while (! load_acq(p1));
>     srcu_up(&ss,idx2);
> }
>
> P2 {
>     W y = 1
>     srcu_sync(&ss);
>     W x = 1
> }
>
>
> I can imagine models that allow this but they aren't pretty. Maybe you have
> a better operational model?

The operational model is not very detailed as far as SRCU is concerned.
It merely says that synchronize_srcu() executing on CPU C waits until:

All writes received by C prior to the start of the function have
propagated to all CPUs (call this time t1). This could be
arranged by having synchronize_srcu() start with an smp_mb().

For every srcu_down_read() that executed prior to t1, the
matching srcu_up_read() has finished and all writes received
by the unlocking CPU prior to the unlock have propagated to all
CPUs. This could be arranged by having the srcu_up_read()
call include a release write which has been received by C and
having synchronize_srcu() end with an smp_mb().

The operational model doesn't specify exactly how synchronize_srcu()
manages to do these things, though.
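
In rough pseudocode (my own paraphrase of the two conditions above, not
the kernel implementation), the requirement is:

```
/* Sketch: what the operational model requires of synchronize_srcu()
 * running on CPU C.  Not an implementation. */
synchronize_srcu(ss) on CPU C:
        smp_mb();       /* all writes received by C propagate to all
                         * CPUs; call this point in time t1 */
        wait until:
                for every srcu_down_read(ss) that executed before t1:
                        the matching srcu_up_read(ss, idx) has finished,
                        and its release write has been received by C;
        smp_mb();       /* all writes received from the unlocking CPUs
                         * propagate to all CPUs */
```

How an actual implementation achieves this (grace-period counters, index
flipping, and so on) is deliberately left unspecified by the model.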

Oh yes, it also says that the value returned by srcu_down_read() is an
unpredictable int. This differs from the code in the patched herd
model, which says that the value will always be 0.

Anyway, the operational model says the litmus test can succeed as
follows:

P0 P1 P2
--------------------- ---------------------- -------------------------
Widx2=srcu_down_read()
Wrel p2=1
Ry=0
Wy=1
synchronize_srcu() starts
... idx2, p2, and y propagate to all CPUs ...
Time t1
Widx1=srcu_down_read()
Wrel p1=1
... idx1 and p1 propagate to all CPUs ...
Racq p1=1
srcu_up_read(idx2)
synchronize_srcu() ends
Wx=1
Rx=1
Racq p2=1
Ridx2=idx1
srcu_up_read(idx1)

(The final equality in P0 is allowed because idx1 and idx2 are both
random numbers, so they might be equal.)

Incidentally, it's worth pointing out that the algorithm Paul described
will forbid this litmus test even if you remove the while loop and the
read of idx2 from P0.

Does this answer your question satisfactorily?

> > > So if there is an srcu_down() that produces a cookie that is read by some
> > > read R, and R doesn't then pass that value into an srcu_up(), the
> > > srcu-warranty is voided.
> > No, it isn't.
> I quote Paul:
> "If you do anything else at all with it, anything at all, you just voided
> your SRCU warranty. For that matter, if you just throw that value on the
> floor and don't pass it to an srcu_up_read() execution, you also just voided
> your SRCU warranty."

I suspect Paul did not express himself very precisely, and what he
really meant was more like this:

If you don't pass the value to exactly one srcu_up_read() call,
you void the SRCU warranty. In addition, if you do anything
else with the value that might affect the outcome of the litmus
test, you incur the risk that herd7 might compute an incorrect
result [as in the litmus test I gave near the start of this
email].

Merely storing the value in a shared variable which then doesn't get
used or is used only for something inconsequential would not cause any
problems.

Alan

2023-01-21 18:56:46

by Paul E. McKenney

[permalink] [raw]
Subject: Re: Internal vs. external barriers (was: Re: Interesting LKMM litmus test)

On Sat, Jan 21, 2023 at 12:36:26PM -0500, Alan Stern wrote:
> On Fri, Jan 20, 2023 at 10:41:14PM +0100, Jonas Oberhauser wrote:
> >
> >
> > On 1/20/2023 5:18 PM, Alan Stern wrote:
> > > On Fri, Jan 20, 2023 at 11:13:00AM +0100, Jonas Oberhauser wrote:
> > > > Perhaps we could say that reading an index without using it later is
> > > > forbidden?
> > > >
> > > > flag ~empty [Srcu-lock];data;rf;[~ domain(data;[Srcu-unlock])] as
> > > > thrown-srcu-cookie-on-floor
> > > We already flag locks that don't have a matching unlock.
> >
> > Of course, but as you know this is completely orthogonal.
>
> Yeah, okay. It doesn't hurt to add this check, but the check isn't
> complete. For example, it won't catch the invalid usage here:
>
> P0(srcu_struct *ss)
> {
> int r1, r2;
>
> r1 = srcu_read_lock(ss);
> srcu_read_unlock(&ss, r1);
> r2 = srcu_read_lock(ss);
> srcu_read_unlock(&ss, r2);
> }
>
> exists (~0:r1=0:r2)
>
> On the other hand, how often will people make this sort of mistake in
> their litmus tests? My guess is not very.

I must be blind this morning. I see a well-formed pair of back-to-back
SRCU read-side critical sections. A rather useless pair, given that
both are empty, but valid nonetheless.

Or is the bug the use of 0:r1 and 0:r2 in the "exists" clause? If so,
then I agree that this is not at all a high-priority bug to flag.

> > Can you briefly explain how the operational model you have in mind for
> > srcu's up and down allows x==1 (and y==0 and idx1==idx2) in the example I
> > sent before (copied with minor edit below for convenience)?
> >
> > P0{
> >     idx1 = srcu_down(&ss);
> >     store_rel(p1, true);
> >
> >
> >     shared cs
> >
> >     R x == 1
> >
> >     while (! load_acq(p2));
> >     R idx2 == idx1 // for some reason, we got lucky!
> >     srcu_up(&ss,idx1);
> > }
> >
> > P1{
> >     idx2 = srcu_down(&ss);
> >     store_rel(p2, true);
> >
> >     shared cs
> >
> >     R y == 0
> >
> >     while (! load_acq(p1));
> >     srcu_up(&ss,idx2);
> > }
> >
> > P2 {
> >     W y = 1
> >     srcu_sync(&ss);
> >     W x = 1
> > }
> >
> >
> > I can imagine models that allow this but they aren't pretty. Maybe you have
> > a better operational model?
>
> The operational model is not very detailed as far as SRCU is concerned.
> It merely says that synchronize_srcu() executing on CPU C waits until:
>
> All writes received by C prior to the start of the function have
> propagated to all CPUs (call this time t1). This could be
> arranged by having synchronize_srcu() start with an smp_mb().
>
> For every srcu_down_read() that executed prior to t1, the
> matching srcu_up_read() has finished and all writes received
> by the unlocking CPU prior to the unlock have propagated to all
> CPUs. This could be arranged by having the srcu_up_read()
> call include a release write which has been received by C and
> having synchronize_srcu() end with an smp_mb().

Agreed. It took me a few reads to see that this prohibited later writes
by other CPUs from affecting reads in the prior critical section, but the "all
writes received by the unlocking CPU" does seem to me to prohibit this.

> The operational model doesn't specify exactly how synchronize_srcu()
> manages to do these things, though.

Which is a good thing, given the wide variety of possible implementations.

> Oh yes, it also says that the value returned by srcu_down_read() is an
> unpredictable int. This differs from the code in the patched herd
> model, which says that the value will always be 0.

As noted earlier, I believe that this is fine. If significant problems
arise, then we might need to do something. However, there is some
cost to complexity, so we should avoid getting too speculative about
possible problems.

> Anyway, the operational model says the litmus test can succeed as
> follows:
>
> P0 P1 P2
> --------------------- ---------------------- -------------------------
> Widx2=srcu_down_read()
> Wrel p2=1
> Ry=0
> Wy=1
> synchronize_srcu() starts
> ... idx2, p2, and y propagate to all CPUs ...
> Time t1
> Widx1=srcu_down_read()
> Wrel p1=1
> ... idx1 and p1 propagate to all CPUs ...
> Racq p1=1
> srcu_up_read(idx2)
> synchronize_srcu() ends
> Wx=1
> Rx=1
> Racq p2=1
> Ridx2=idx1
> srcu_up_read(idx1)
>
> (The final equality in P0 is allowed because idx1 and idx2 are both
> random numbers, so they might be equal.)

This all makes sense to me.

> Incidentally, it's worth pointing out that the algorithm Paul described
> will forbid this litmus test even if you remove the while loop and the
> read of idx2 from P0.

Given that the values returned by those two srcu_down_read() calls must
be the same, then, yes, the current Linux-kernel Tree RCU implementation
would forbid this.

On the other hand, if the two indexes differ, then P2's synchronize_srcu()
can see that there are no really old readers on !Widx2, then flip
the index. This would mean that P0's Widx1 would be equal to !Widx2,
which has already been waited on. Then P2's synchronize_srcu() can
return as soon as it sees P1's srcu_up_read().

> Does this answer your question satisfactorily?
>
> > > > So if there is an srcu_down() that produces a cookie that is read by some
> > > > read R, and R doesn't then pass that value into an srcu_up(), the
> > > > srcu-warranty is voided.
> > > No, it isn't.
> > I quote Paul:
> > "If you do anything else at all with it, anything at all, you just voided
> > your SRCU warranty. For that matter, if you just throw that value on the
> > floor and don't pass it to an srcu_up_read() execution, you also just voided
> > your SRCU warranty."
>
> I suspect Paul did not express himself very precisely,

You know me too well! ;-)

> and what he
> really meant was more like this:
>
> If you don't pass the value to exactly one srcu_up_read() call,
> you void the SRCU warranty. In addition, if you do anything
> else with the value that might affect the outcome of the litmus
> test, you incur the risk that herd7 might compute an incorrect
> result [as in the litmus test I gave near the start of this
> email].
>
> Merely storing the value in a shared variable which then doesn't get
> used or is used only for something inconsequential would not cause any
> problems.

That is consistent with my understanding, but please let me try again
in list form:

1. If a value returned from a given srcu_read_lock() is never passed
to an srcu_read_unlock(), later calls to synchronize_srcu()
are within their rights to simply never return.

2. If a value returned from a given srcu_read_lock() is modified in
any way before being passed to an srcu_read_unlock(), any calls
to synchronize_srcu() that have not yet returned are within
their rights to simply never return and they are also within
their rights to return prematurely.

3. If a value returned from a given srcu_read_lock() is passed to
more than one srcu_read_unlock(), any calls to synchronize_srcu()
that have not yet returned are within their rights to simply
never return and they are also within their rights to return
prematurely.

4. If a value returned from a given srcu_read_lock() is passed to
exactly one srcu_read_unlock(), and then that value is later
manipulated, that is bad practice (exactly what are you trying
to accomplish by so doing?), but SRCU won't know the difference.

In particular, the Linux-kernel SRCU implementation doesn't know
about the herd7 "exists" clause, but kudos to Jonas for casting
his conceptual net widely indeed!

5. All of the above apply with equal force to srcu_down_read()
and srcu_up_read().

6. If the value returned from a given srcu_read_lock() is transmitted
to an srcu_read_unlock() on another thread, the SRCU algorithm
will do the right thing, but lockdep will complain bitterly.
(This is the use case that srcu_down_read() and srcu_up_read()
are intended to address.)
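
Item 6, for example, covers a hand-off pattern along these lines (a
hypothetical sketch in the style of the earlier examples, not verified
code; ss and the shared variable handoff are assumptions of mine):

```
/* Thread A: opens the read-side critical section */
idx = srcu_down_read(&ss);
smp_store_release(&handoff, idx + 1);   /* publish cookie, made nonzero */

/* Thread B: closes the critical section opened by A */
while (!(v = smp_load_acquire(&handoff)))
        ;                               /* wait for the cookie */
srcu_up_read(&ss, v - 1);               /* hand-off that lockdep would
                                         * reject for read_lock/unlock */
```

The release/acquire pair guarantees the cookie reaches thread B intact,
and srcu_down_read()/srcu_up_read() exist precisely so that this
cross-thread hand-off does not trigger lockdep complaints.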

This is not exactly concise, but does it help?

Thanx, Paul

2023-01-21 20:19:46

by Alan Stern

[permalink] [raw]
Subject: Re: Internal vs. external barriers (was: Re: Interesting LKMM litmus test)

On Sat, Jan 21, 2023 at 10:40:32AM -0800, Paul E. McKenney wrote:
> On Sat, Jan 21, 2023 at 12:36:26PM -0500, Alan Stern wrote:
> > On Fri, Jan 20, 2023 at 10:41:14PM +0100, Jonas Oberhauser wrote:
> > >
> > >
> > > On 1/20/2023 5:18 PM, Alan Stern wrote:
> > > > On Fri, Jan 20, 2023 at 11:13:00AM +0100, Jonas Oberhauser wrote:
> > > > > Perhaps we could say that reading an index without using it later is
> > > > > forbidden?
> > > > >
> > > > > flag ~empty [Srcu-lock];data;rf;[~ domain(data;[Srcu-unlock])] as
> > > > > thrown-srcu-cookie-on-floor
> > > > We already flag locks that don't have a matching unlock.
> > >
> > > Of course, but as you know this is completely orthogonal.
> >
> > Yeah, okay. It doesn't hurt to add this check, but the check isn't
> > complete. For example, it won't catch the invalid usage here:
> >
> > P0(srcu_struct *ss)
> > {
> > int r1, r2;
> >
> > r1 = srcu_read_lock(ss);
> > srcu_read_unlock(&ss, r1);
> > r2 = srcu_read_lock(ss);
> > srcu_read_unlock(&ss, r2);
> > }
> >
> > exists (~0:r1=0:r2)
> >
> > On the other hand, how often will people make this sort of mistake in
> > their litmus tests? My guess is not very.
>
> I must be blind this morning. I see a well-formed pair of back-to-back
> SRCU read-side critical sections. A rather useless pair, given that
> both are empty,

And there are no synchronize_srcu() calls.

> but valid nonetheless.
>
> Or is the bug the use of 0:r1 and 0:r2 in the "exists" clause? If so,
> then I agree that this is not at all a high-priority bug to flag.

Yes, that is the bug. The patched version of LKMM and the
implementation you described say the exists clause will never be
satisfied, the current version of LKMM says it will always be
satisfied, and the theoretical model for SRCU says it will sometimes
be satisfied -- which is the answer we want.

> > > Can you briefly explain how the operational model you have in mind for
> > > srcu's up and down allows x==1 (and y==0 and idx1==idx2) in the example I
> > > sent before (copied with minor edit below for convenience)?
> > >
> > > P0{
> > > 	idx1 = srcu_down(&ss);
> > > 	store_rel(p1, true);
> > >
> > >
> > > 	shared cs
> > >
> > > 	R x == 1
> > >
> > > 	while (! load_acq(p2));
> > > 	R idx2 == idx1 // for some reason, we got lucky!
> > > 	srcu_up(&ss,idx1);
> > > }
> > >
> > > P1{
> > > 	idx2 = srcu_down(&ss);
> > > 	store_rel(p2, true);
> > >
> > > 	shared cs
> > >
> > > 	R y == 0
> > >
> > > 	while (! load_acq(p1));
> > > 	srcu_up(&ss,idx2);
> > > }
> > >
> > > P2 {
> > > 	W y = 1
> > > 	srcu_sync(&ss);
> > > 	W x = 1
> > > }
> > >
> > >
> > > I can imagine models that allow this but they aren't pretty. Maybe you have
> > > a better operational model?
> >
> > The operational model is not very detailed as far as SRCU is concerned.
> > It merely says that synchronize_srcu() executing on CPU C waits until:
> >
> > All writes received by C prior to the start of the function have
> > propagated to all CPUs (call this time t1). This could be
> > arranged by having synchronize_srcu() start with an smp_mb().
> >
> > For every srcu_down_read() that executed prior to t1, the
> > matching srcu_up_read() has finished and all writes received
> > by the unlocking CPU prior to the unlock have propagated to all
> > CPUs. This could be arranged by having the srcu_up_read()
> > call include a release write which has been received by C and
> > having synchronize_srcu() end with an smp_mb().
>
> Agreed. It took me a few reads to see that this prohibited later writes
> by other CPUs affecting reads in the prior critical section, but the "all
> writes received by the unlocking CPU" does seem to me to prohibit this.
>
> > The operational model doesn't specify exactly how synchronize_srcu()
> > manages to do these things, though.
>
> Which is a good thing, given the wide variety of possible implementations.
>
> > Oh yes, it also says that the value returned by srcu_down_read() is an
> > unpredictable int. This differs from the code in the patched herd
> > model, which says that the value will always be 0.
>
> As noted earlier, I believe that this is fine. If significant problems
> arise, then we might need to do something. However, there is some
> cost to complexity, so we should avoid getting too speculative about
possible problems.
>
> > Anyway, the operational model says the litmus test can succeed as
> > follows:
> >
> > P0                      P1                      P2
> > ---------------------   ----------------------  -------------------------
> >                         Widx2=srcu_down_read()
> >                         Wrel p2=1
> >                         Ry=0
> >                                                 Wy=1
> >                                                 synchronize_srcu() starts
> >         ... idx2, p2, and y propagate to all CPUs ...
> >                                                 Time t1
> > Widx1=srcu_down_read()
> > Wrel p1=1
> >         ... idx1 and p1 propagate to all CPUs ...
> >                         Racq p1=1
> >                         srcu_up_read(idx2)
> >                                                 synchronize_srcu() ends
> >                                                 Wx=1
> > Rx=1
> > Racq p2=1
> > Ridx2=idx1
> > srcu_up_read(idx1)
> >
> > (The final equality in P0 is allowed because idx1 and idx2 are both
> > random numbers, so they might be equal.)
>
> This all makes sense to me.
>
> > Incidentally, it's worth pointing out that the algorithm Paul described
> > will forbid this litmus test even if you remove the while loop and the
> > read of idx2 from P0.
>
> Given that the values returned by those two srcu_down_read() calls must
> be the same, then, yes, the current Linux-kernel Tree RCU implementation
> would forbid this.
>
> On the other hand, if the two indexes differ, then P2's synchronize_srcu()
> can see that there are no really old readers on !Widx2, then flip
> the index. This would mean that P0's Widx1 would be equal to !Widx2,
> which has already been waited on. Then P2's synchronize_srcu() can
> return as soon as it sees P1's srcu_up_read().

Sorry, what I said may not have been clear. I meant that even if you
remove the while loop and read of idx2 from P0, your algorithm will
still not allow idx1 = idx2 provided everything else is as written.

> > If you don't pass the value to exactly one srcu_up_read() call,
> > you void the SRCU warranty. In addition, if you do anything
> > else with the value that might affect the outcome of the litmus
> > test, you incur the risk that herd7 might compute an incorrect
> > result [as in the litmus test I gave near the start of this
> > email].
> >
> > Merely storing the value in a shared variable which then doesn't get
> > used or is used only for something inconsequential would not cause any
> > problems.
>
> That is consistent with my understanding, but please let me try again
> in list form:

...

> 4. If a value returned from a given srcu_read_lock() is passed to
> exactly one srcu_read_unlock(), and then that value is later
> manipulated, that is bad practice (exactly what are you trying
> to accomplish by so doing?), but SRCU won't know the difference.
>
> In particular, the Linux-kernel SRCU implementation doesn't know
> about the herd7 "exists" clause, but kudos to Jonas for casting
> his conceptual net widely indeed!

In addition, herd7 might give an answer different from what would
actually happen in the kernel, depending on what the manipulation does.

Yes, that is more or less what I was trying to express.

Alan
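[Editorial note: the two conditions in the operational model above reduce to simple timestamp inequalities. The following is a toy, purely illustrative sketch (all names and numbers are hypothetical; no real memory ordering is modeled) that checks those two conditions against an execution trace:]

```python
# Toy timestamp encoding (hypothetical names) of the two conditions the
# operational model places on a synchronize_srcu() running on CPU C.
# Events are just numbers on a single global time line.

def synchronize_srcu_ok(t_start, t_end, t1, downs):
    """downs: list of (t_down, t_up_done, t_up_writes_propagated) tuples.

    Condition 1: writes received by C before t_start have propagated to
    all CPUs by time t1 (here we simply require t_start <= t1 <= t_end).
    Condition 2: every srcu_down_read() executed before t1 has its
    matching srcu_up_read() finished, and the unlocking CPU's prior
    writes propagated everywhere, before t_end.
    """
    if not (t_start <= t1 <= t_end):
        return False
    return all(t_up_done < t_end and t_prop < t_end
               for (t_down, t_up_done, t_prop) in downs
               if t_down < t1)

# Mirroring the execution table in this message: P1's srcu_down_read()
# (Widx2) happens before t1, so its srcu_up_read() must complete before
# the grace period ends; P0's srcu_down_read() (Widx1) happens after t1
# and is not waited on at all.
assert synchronize_srcu_ok(t_start=5, t_end=20, t1=10,
                           downs=[(2, 15, 16),    # P1: waited on
                                  (12, 30, 31)])  # P0: after t1, ignored
```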

2023-01-21 20:24:49

by Paul E. McKenney

Subject: Re: Internal vs. external barriers (was: Re: Interesting LKMM litmus test)

On Sat, Jan 21, 2023 at 02:56:57PM -0500, Alan Stern wrote:
> On Sat, Jan 21, 2023 at 10:40:32AM -0800, Paul E. McKenney wrote:
> > On Sat, Jan 21, 2023 at 12:36:26PM -0500, Alan Stern wrote:
> > > On Fri, Jan 20, 2023 at 10:41:14PM +0100, Jonas Oberhauser wrote:
> > > >
> > > >
> > > > On 1/20/2023 5:18 PM, Alan Stern wrote:
> > > > > On Fri, Jan 20, 2023 at 11:13:00AM +0100, Jonas Oberhauser wrote:
> > > > > > Perhaps we could say that reading an index without using it later is
> > > > > > forbidden?
> > > > > >
> > > > > > flag ~empty [Srcu-lock];data;rf;[~ domain(data;[Srcu-unlock])] as
> > > > > > thrown-srcu-cookie-on-floor
> > > > > We already flag locks that don't have a matching unlock.
> > > >
> > > > Of course, but as you know this is completely orthogonal.
> > >
> > > Yeah, okay. It doesn't hurt to add this check, but the check isn't
> > > complete. For example, it won't catch the invalid usage here:
> > >
> > > P0(srcu_struct *ss)
> > > {
> > > 	int r1, r2;
> > >
> > > 	r1 = srcu_read_lock(ss);
> > > 	srcu_read_unlock(ss, r1);
> > > 	r2 = srcu_read_lock(ss);
> > > 	srcu_read_unlock(ss, r2);
> > > }
> > >
> > > exists (~0:r1=0:r2)
> > >
> > > On the other hand, how often will people make this sort of mistake in
> > > their litmus tests? My guess is not very.
> >
> > I must be blind this morning. I see a well-formed pair of back-to-back
> > SRCU read-side critical sections. A rather useless pair, given that
> > both are empty,
>
> And there are no synchronize_srcu() calls.

Agreed, an additional level of uselessness, though not invalidity. After
all, the more advantageous SRCU use cases execute lots of srcu_read_lock()
and srcu_read_unlock() calls and very few synchronize_srcu() calls.

> > but valid nonetheless.
> >
> > Or is the bug the use of 0:r1 and 0:r2 in the "exists" clause? If so,
> > then I agree that this is not at all a high-priority bug to flag.
>
> Yes, that is the bug. The patched version of LKMM and the
> implementation you described say the exists clause will never be
> satisfied, the current version of LKMM says it will always be
> satisfied, and the theoretical model for SRCU says it will sometimes
> be satisfied -- which is the answer we want.

Got it, thank you.

> > > > Can you briefly explain how the operational model you have in mind for
> > > > srcu's up and down allows x==1 (and y==0 and idx1==idx2) in the example I
> > > > sent before (copied with minor edit below for convenience)?
> > > >
> > > > P0{
> > > > 	idx1 = srcu_down(&ss);
> > > > 	store_rel(p1, true);
> > > >
> > > >
> > > > 	shared cs
> > > >
> > > > 	R x == 1
> > > >
> > > > 	while (! load_acq(p2));
> > > > 	R idx2 == idx1 // for some reason, we got lucky!
> > > > 	srcu_up(&ss,idx1);
> > > > }
> > > >
> > > > P1{
> > > > 	idx2 = srcu_down(&ss);
> > > > 	store_rel(p2, true);
> > > >
> > > > 	shared cs
> > > >
> > > > 	R y == 0
> > > >
> > > > 	while (! load_acq(p1));
> > > > 	srcu_up(&ss,idx2);
> > > > }
> > > >
> > > > P2 {
> > > > 	W y = 1
> > > > 	srcu_sync(&ss);
> > > > 	W x = 1
> > > > }
> > > >
> > > >
> > > > I can imagine models that allow this but they aren't pretty. Maybe you have
> > > > a better operational model?
> > >
> > > The operational model is not very detailed as far as SRCU is concerned.
> > > It merely says that synchronize_srcu() executing on CPU C waits until:
> > >
> > > All writes received by C prior to the start of the function have
> > > propagated to all CPUs (call this time t1). This could be
> > > arranged by having synchronize_srcu() start with an smp_mb().
> > >
> > > For every srcu_down_read() that executed prior to t1, the
> > > matching srcu_up_read() has finished and all writes received
> > > by the unlocking CPU prior to the unlock have propagated to all
> > > CPUs. This could be arranged by having the srcu_up_read()
> > > call include a release write which has been received by C and
> > > having synchronize_srcu() end with an smp_mb().
> >
> > Agreed. It took me a few reads to see that this prohibited later writes
> > by other CPUs affecting reads in the prior critical section, but the "all
> > writes received by the unlocking CPU" does seem to me to prohibit this.
> >
> > > The operational model doesn't specify exactly how synchronize_srcu()
> > > manages to do these things, though.
> >
> > Which is a good thing, given the wide variety of possible implementations.
> >
> > > Oh yes, it also says that the value returned by srcu_down_read() is an
> > > unpredictable int. This differs from the code in the patched herd
> > > model, which says that the value will always be 0.
> >
> > As noted earlier, I believe that this is fine. If significant problems
> > arise, then we might need to do something. However, there is some
> > cost to complexity, so we should avoid getting too speculative about
> > possible problems.
> >
> > > Anyway, the operational model says the litmus test can succeed as
> > > follows:
> > >
> > > P0                      P1                      P2
> > > ---------------------   ----------------------  -------------------------
> > >                         Widx2=srcu_down_read()
> > >                         Wrel p2=1
> > >                         Ry=0
> > >                                                 Wy=1
> > >                                                 synchronize_srcu() starts
> > >         ... idx2, p2, and y propagate to all CPUs ...
> > >                                                 Time t1
> > > Widx1=srcu_down_read()
> > > Wrel p1=1
> > >         ... idx1 and p1 propagate to all CPUs ...
> > >                         Racq p1=1
> > >                         srcu_up_read(idx2)
> > >                                                 synchronize_srcu() ends
> > >                                                 Wx=1
> > > Rx=1
> > > Racq p2=1
> > > Ridx2=idx1
> > > srcu_up_read(idx1)
> > >
> > > (The final equality in P0 is allowed because idx1 and idx2 are both
> > > random numbers, so they might be equal.)
> >
> > This all makes sense to me.
> >
> > > Incidentally, it's worth pointing out that the algorithm Paul described
> > > will forbid this litmus test even if you remove the while loop and the
> > > read of idx2 from P0.
> >
> > Given that the values returned by those two srcu_down_read() calls must
> > be the same, then, yes, the current Linux-kernel Tree RCU implementation
> > would forbid this.
> >
> > On the other hand, if the two indexes differ, then P2's synchronize_srcu()
> > can see that there are no really old readers on !Widx2, then flip
> > the index. This would mean that P0's Widx1 would be equal to !Widx2,
> > which has already been waited on. Then P2's synchronize_srcu() can
> > return as soon as it sees P1's srcu_up_read().
>
> Sorry, what I said may not have been clear. I meant that even if you
> remove the while loop and read of idx2 from P0, your algorithm will
> still not allow idx1 = idx2 provided everything else is as written.

If synchronize_srcu() has flipped ->srcu_idx by the time that P0's
srcu_down_read() executes, agreed. Otherwise, Widx1 and Widx2 might
well be equal.

> > > If you don't pass the value to exactly one srcu_up_read() call,
> > > you void the SRCU warranty. In addition, if you do anything
> > > else with the value that might affect the outcome of the litmus
> > > test, you incur the risk that herd7 might compute an incorrect
> > > result [as in the litmus test I gave near the start of this
> > > email].
> > >
> > > Merely storing the value in a shared variable which then doesn't get
> > > used or is used only for something inconsequential would not cause any
> > > problems.
> >
> > That is consistent with my understanding, but please let me try again
> > in list form:
>
> ...
>
> > 4. If a value returned from a given srcu_read_lock() is passed to
> > exactly one srcu_read_unlock(), and then that value is later
> > manipulated, that is bad practice (exactly what are you trying
> > to accomplish by so doing?), but SRCU won't know the difference.
> >
> > In particular, the Linux-kernel SRCU implementation doesn't know
> > about the herd7 "exists" clause, but kudos to Jonas for casting
> > his conceptual net widely indeed!
>
> In addition, herd7 might give an answer different from what would
> actually happen in the kernel, depending on what the manipulation does.

True, given that the kernel's srcu_read_lock() can return a
non-zero value.

> Yes, that is more or less what I was trying to express.

Sounds good!

Thanx, Paul

2023-01-21 21:35:34

by Alan Stern

Subject: Re: Internal vs. external barriers (was: Re: Interesting LKMM litmus test)

On Sat, Jan 21, 2023 at 12:10:26PM -0800, Paul E. McKenney wrote:
> On Sat, Jan 21, 2023 at 02:56:57PM -0500, Alan Stern wrote:
> > > > Anyway, the operational model says the litmus test can succeed as
> > > > follows:
> > > >
> > > > P0                      P1                      P2
> > > > ---------------------   ----------------------  -------------------------
> > > >                         Widx2=srcu_down_read()
> > > >                         Wrel p2=1
> > > >                         Ry=0
> > > >                                                 Wy=1
> > > >                                                 synchronize_srcu() starts
> > > >         ... idx2, p2, and y propagate to all CPUs ...
> > > >                                                 Time t1
> > > > Widx1=srcu_down_read()
> > > > Wrel p1=1
> > > >         ... idx1 and p1 propagate to all CPUs ...
> > > >                         Racq p1=1
> > > >                         srcu_up_read(idx2)
> > > >                                                 synchronize_srcu() ends
> > > >                                                 Wx=1
> > > > Rx=1
> > > > Racq p2=1
> > > > Ridx2=idx1
> > > > srcu_up_read(idx1)
> > > >
> > > > (The final equality in P0 is allowed because idx1 and idx2 are both
> > > > random numbers, so they might be equal.)
> > >
> > > This all makes sense to me.
> > >
> > > > Incidentally, it's worth pointing out that the algorithm Paul described
> > > > will forbid this litmus test even if you remove the while loop and the
> > > > read of idx2 from P0.

> > Sorry, what I said may not have been clear. I meant that even if you
> > remove the while loop and read of idx2 from P0, your algorithm will
> > still not allow idx1 = idx2 provided everything else is as written.
>
> If synchronize_srcu() has flipped ->srcu_idx by the time that P0's
> srcu_down_read() executes, agreed. Otherwise, Widx1 and Widx2 might
> well be equal.

But if idx1 and idx2 are equal, we can't have both P0 read x=1 and P1
read y=0 -- not even if P0 doesn't wait until it reads p2=1. If you
don't see why, I'll send an explanation.

Alan

2023-01-22 00:02:34

by Paul E. McKenney

Subject: Re: Internal vs. external barriers (was: Re: Interesting LKMM litmus test)

On Sat, Jan 21, 2023 at 04:03:38PM -0500, Alan Stern wrote:
> On Sat, Jan 21, 2023 at 12:10:26PM -0800, Paul E. McKenney wrote:
> > On Sat, Jan 21, 2023 at 02:56:57PM -0500, Alan Stern wrote:
> > > > > Anyway, the operational model says the litmus test can succeed as
> > > > > follows:
> > > > >
> > > > > P0                      P1                      P2
> > > > > ---------------------   ----------------------  -------------------------
> > > > >                         Widx2=srcu_down_read()
> > > > >                         Wrel p2=1
> > > > >                         Ry=0
> > > > >                                                 Wy=1
> > > > >                                                 synchronize_srcu() starts
> > > > >         ... idx2, p2, and y propagate to all CPUs ...
> > > > >                                                 Time t1
> > > > > Widx1=srcu_down_read()
> > > > > Wrel p1=1
> > > > >         ... idx1 and p1 propagate to all CPUs ...
> > > > >                         Racq p1=1
> > > > >                         srcu_up_read(idx2)
> > > > >                                                 synchronize_srcu() ends
> > > > >                                                 Wx=1
> > > > > Rx=1
> > > > > Racq p2=1
> > > > > Ridx2=idx1
> > > > > srcu_up_read(idx1)
> > > > >
> > > > > (The final equality in P0 is allowed because idx1 and idx2 are both
> > > > > random numbers, so they might be equal.)
> > > >
> > > > This all makes sense to me.
> > > >
> > > > > Incidentally, it's worth pointing out that the algorithm Paul described
> > > > > will forbid this litmus test even if you remove the while loop and the
> > > > > read of idx2 from P0.
>
> > > Sorry, what I said may not have been clear. I meant that even if you
> > > remove the while loop and read of idx2 from P0, your algorithm will
> > > still not allow idx1 = idx2 provided everything else is as written.
> >
> > If synchronize_srcu() has flipped ->srcu_idx by the time that P0's
> > srcu_down_read() executes, agreed. Otherwise, Widx1 and Widx2 might
> > well be equal.
>
> But if idx1 and idx2 are equal, we can't have both P0 read x=1 and P1
> read y=0 -- not even if P0 doesn't wait until it reads p2=1. If you
> don't see why, I'll send an explanation.

Ah, synchronize_srcu() does unlocks *then* locks. I was getting it
backwards, apologies for my confusion!

Thanx, Paul

2023-01-22 20:32:29

by Alan Stern

Subject: Re: Internal vs. external barriers (was: Re: Interesting LKMM litmus test)

On Fri, Jan 20, 2023 at 01:20:37PM -0800, Paul E. McKenney wrote:
> On Fri, Jan 20, 2023 at 03:36:24PM -0500, Alan Stern wrote:
> > On Fri, Jan 20, 2023 at 11:20:32AM -0800, Paul E. McKenney wrote:
> > > On Fri, Jan 20, 2023 at 01:37:51PM -0500, Alan Stern wrote:
> > > > srcu_read_unlock() does not need a full smp_mb().
> > >
> > > That is quite possible, and that is what we are looking into. And testing
> > > thus far agrees with you. But the grace-period ordering constraints
> > > are quite severe, so this requires careful checking and severe testing.
> >
> > If you're interested, I can provide a simple argument to show that the
> > Fundamental Law of RCU would continue to hold with only a release fence.
> > There is an added requirement: merely that synchronize_srcu() must have
> > an smp_mb() somewhere after its final read of the unlock counters --
> > which your version of the algorithm already has.
>
> Please!
>
> For your amusement, here is a very informal argument that this is
> the case:
>
> https://docs.google.com/document/d/1xvwQzavmH474MBPAIBqVyvCrCcS5j2BpqhErPhRj7Is/edit?usp=sharing
>
> See the "Read-Side Optimizations" section at the end.

It looks like you've got the basic idea. Most of the complications seem
to arise from the different ways a grace period can happen.

Here's what I was thinking. Let C be a read-side critical section, with
L being its invocation of srcu_down_read() and U being the matching
invocation of srcu_up_read(). Let idx be the index value read by L (and
used by U). I will assume that L has the form:

	idx = READ_ONCE(ss->index);
	temp = this_cpu(ss->lock)[idx];
	WRITE_ONCE(this_cpu(ss->lock)[idx], temp + 1);
	smp_mb();

(or whatever is the right syntax for incrementing a per-cpu array
element). Likewise, assume U has the form:

	temp = this_cpu(ss->unlock)[idx];
	smp_store_release(&this_cpu(ss->unlock)[idx], temp + 1);

Let G be any SRCU grace period -- an invocation of synchronize_srcu(ss).
Assume G has the overall form:

	accumulate_and_compare_loop(!ss->index);
	smp_mb();
	WRITE_ONCE(ss->index, !ss->index);
	smp_mb();
	accumulate_and_compare_loop(!ss->index);

where accumulate_and_compare_loop(i) has the form:

	do {
		s = t = 0;
		for each CPU c:
			s += READ_ONCE(cpu(c, ss->unlock)[i]);
		smp_mb();
		for each CPU c:
			t += READ_ONCE(cpu(c, ss->lock)[i]);
	} while (s != t);

It's not too hard to show, and I trust you already believe, that in the
final iteration of the accumulate_and_compare_loop(i) call for which
i = idx, the lock-counter increment in L is observed if and only if the
unlock-counter increment in U is observed. Thus we have two cases:

Case 1: Both of the increments are observed. Since the increment in U
is a store-release, every write that propagated to U's CPU before the
increment is therefore visible to G's CPU before its last read of an
unlock counter. Since the full fence in accumulate_and_compare_loop()
is executed after the last such read, these writes must propagate to
every CPU before G ends.

Case 2: Neither of the increments is observed. Let W be any write which
propagated to G's CPU before G started. Does W propagate to C before L
ends? We have the following SB or RWC pattern:

	G                          C
	------------------------   -----------------------
	W propagates to G's CPU    L writes lock counter
	G does smp_mb()            L does smp_mb()
	G reads L's lock counter   W propagates to C's CPU

(The smp_mb() in the left column is the one in
accumulate_and_compare_loop(idx), which precedes the reads of the lock
counters.)

If L's smp_mb() ended before G's did then L's write to the lock counter
would have propagated to G's CPU before G's smp_mb() ended, and hence G
would have observed the lock-counter increment. Since this didn't
happen, we know that G's smp_mb() ends before L's does. This means that
W must propagate to every CPU before L terminates, and hence before C's
critical section starts.

Together, these two cases cover the requirements of the Fundamental Law
of RCU. The memory barrier in U was needed only in Case 1, and there it
only needed to be a release fence.

Alan
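[Editorial note: for concreteness, the counter scan at the heart of accumulate_and_compare_loop() can be sketched in a toy, single-threaded form. All names here are hypothetical, and no concurrency or memory ordering is modeled -- that is exactly the part the barrier argument above supplies:]

```python
# Toy, single-threaded sketch of the accumulate-and-compare step: per-CPU
# lock/unlock counters for each of the two index slots, summed in the
# unlock-then-lock order used above.

NR_CPUS = 4

class ToySrcu:
    def __init__(self):
        # lock[c][i] / unlock[c][i]: CPU c's counters for index slot i.
        self.lock = [[0, 0] for _ in range(NR_CPUS)]
        self.unlock = [[0, 0] for _ in range(NR_CPUS)]

    def counters_balanced(self, i):
        # One pass of the do/while body: sum the unlock counters first,
        # then the lock counters, and compare. The real algorithm loops
        # (with an smp_mb() between the two sums) until they match.
        s = sum(self.unlock[c][i] for c in range(NR_CPUS))
        t = sum(self.lock[c][i] for c in range(NR_CPUS))
        return s == t

ss = ToySrcu()
idx = 0
ss.lock[1][idx] += 1                  # a reader enters on CPU 1
assert not ss.counters_balanced(idx)  # the scan must keep waiting
ss.unlock[3][idx] += 1                # matching srcu_up_read(), on CPU 3
assert ss.counters_balanced(idx)      # now the scan can terminate
```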

2023-01-23 11:49:30

by Jonas Oberhauser

Subject: Re: Internal vs. external barriers (was: Re: Interesting LKMM litmus test)



On 1/21/2023 6:36 PM, Alan Stern wrote:
> On Fri, Jan 20, 2023 at 10:41:14PM +0100, Jonas Oberhauser wrote:
>>
>> On 1/20/2023 5:18 PM, Alan Stern wrote:
>>> On Fri, Jan 20, 2023 at 11:13:00AM +0100, Jonas Oberhauser wrote:
>>>> Perhaps we could say that reading an index without using it later is
>>>> forbidden?
>>>>
>>>> flag ~empty [Srcu-lock];data;rf;[~ domain(data;[Srcu-unlock])] as
>>>> thrown-srcu-cookie-on-floor
>>> We already flag locks that don't have a matching unlock.
>> Of course, but as you know this is completely orthogonal.
> Yeah, okay. It doesn't hurt to add this check, but the check isn't
> complete. For example, it won't catch the invalid usage here:
>
> P0(srcu_struct *ss)
> {
> 	int r1, r2;
>
> 	r1 = srcu_read_lock(ss);
> 	srcu_read_unlock(ss, r1);
> 	r2 = srcu_read_lock(ss);
> 	srcu_read_unlock(ss, r2);
> }
>
> exists (~0:r1=0:r2)
>
> On the other hand, how often will people make this sort of mistake in
> their litmus tests? My guess is not very.
I currently don't care too much about the incorrect usage of herd (by
inspecting some final state incorrectly), only incorrect usage in the code.

>
>> I can imagine models that allow this but they aren't pretty. Maybe you have
>> a better operational model?
> The operational model is not very detailed as far as SRCU is concerned.
> It merely says that synchronize_srcu() executing on CPU C waits until:
>
> [...]
>
> For every srcu_down_read() that executed prior to t1, the
> matching srcu_up_read() [...].
> [...]
>
> Does this answer your question satisfactorily?

The reason I originally didn't consider this type of model (which
requires defining 'matching') pretty is that the most natural way to
define matching is probably using the whole dependency stuff at the
operational level. This isn't necessary for rcu or srcu lock/unlock, so
I thought this would add a new amount of tediousness to the model.
But I now realized that mechanisms for tracking dependencies are pretty
much already there (to define when stores can be executed), so I'm not
that unhappy about it anymore.


>>>> So if there is an srcu_down() that produces a cookie that is read by some
>>>> read R, and R doesn't then pass that value into an srcu_up(), the
>>>> srcu-warranty is voided.
>>> No, it isn't.
>> I quote Paul:
>> "If you do anything else at all with it, anything at all, you just voided
>> your SRCU warranty. For that matter, if you just throw that value on the
>> floor and don't pass it to an srcu_up_read() execution, you also just voided
>> your SRCU warranty."
> I suspect Paul did not express himself very precisely, and what he
> really meant was more like this:
>
> If you don't pass the value to exactly one srcu_up_read() call,
> you void the SRCU warranty. In addition, if you do anything
> else with the value that might affect the outcome of the litmus
> test, you incur the risk that herd7 might compute an incorrect
> result [as in the litmus test I gave near the start of this
> email].
>
> Merely storing the value in a shared variable which then doesn't get
> used or is used only for something inconsequential would not cause any
> problems.
>
> Alan
Ah, I understand now.
Thanks, jonas


2023-01-23 15:55:29

by Alan Stern

Subject: Re: Internal vs. external barriers (was: Re: Interesting LKMM litmus test)

On Mon, Jan 23, 2023 at 12:48:42PM +0100, Jonas Oberhauser wrote:
>
>
> On 1/21/2023 6:36 PM, Alan Stern wrote:
> > On Fri, Jan 20, 2023 at 10:41:14PM +0100, Jonas Oberhauser wrote:
> > >
> > > On 1/20/2023 5:18 PM, Alan Stern wrote:
> > > > On Fri, Jan 20, 2023 at 11:13:00AM +0100, Jonas Oberhauser wrote:
> > > > > Perhaps we could say that reading an index without using it later is
> > > > > forbidden?
> > > > >
> > > > > flag ~empty [Srcu-lock];data;rf;[~ domain(data;[Srcu-unlock])] as
> > > > > thrown-srcu-cookie-on-floor
> > > > We already flag locks that don't have a matching unlock.
> > > Of course, but as you know this is completely orthogonal.
> > Yeah, okay. It doesn't hurt to add this check, but the check isn't
> > complete. For example, it won't catch the invalid usage here:
> >
> > P0(srcu_struct *ss)
> > {
> > 	int r1, r2;
> >
> > 	r1 = srcu_read_lock(ss);
> > 	srcu_read_unlock(ss, r1);
> > 	r2 = srcu_read_lock(ss);
> > 	srcu_read_unlock(ss, r2);
> > }
> >
> > exists (~0:r1=0:r2)
> >
> > On the other hand, how often will people make this sort of mistake in
> > their litmus tests? My guess is not very.
> I currently don't care too much about the incorrect usage of herd (by
> inspecting some final state incorrectly), only incorrect usage in the code.

I'm inclined to add this check to the memory model. Would you prefer to
submit it yourself as a separate patch? Or are you happy to have it
merged with my patch, and if so, do you have a final, preferred form for
the check?

Alan

2023-01-23 16:17:15

by Jonas Oberhauser

Subject: Re: Internal vs. external barriers (was: Re: Interesting LKMM litmus test)



On 1/19/2023 5:41 PM, Alan Stern wrote:
> On Thu, Jan 19, 2023 at 12:22:50PM +0100, Jonas Oberhauser wrote:
>> I mean that if you have a cycle that is formed by having two adjacent actual
>> `gp` edges, like .... ; gp;gp ; .... with gp = po ; rcu-gp ; po?,
>> (not like your example, where the cycle uses two *rcu*-gp but no gp edges)
> Don't forget that I had in mind a version of the model where rcu-gp did
> not exist.
>
>> and assume we define gp' = po ; rcu-gp ; po and hb' and pb' to use gp'
>> instead of gp,
>> then there are two cases for how that cycle came to be, either 1) as
>>   ... ; hb;hb ; ....
>> but then you can refactor as
>>   ... ; po;rcu-gp;po;rcu-gp;po ; ...
>>   ... ; po;rcu-gp;      po     ; ...
>>   ... ;           gp'          ; ...
>>   ... ;           hb'          ; ...
>> which again creates a cycle, or 2) as
>>   ... ; pb ; hb ; ...
>> coming from
>>   ... ; prop ; gp ; gp ; ....
>> which you can similarly refactor as
>>   ... ; prop ; po;rcu-gp;po ; ....
>>   ... ; prop ;      gp'     ; ....
>> and again get a cycle with
>> ... ; pb' ; ....
>> Therefore, gp = po;rcu-gp;po should be equivalent.
> The point is that in P1, we have Write ->(gp;gp) Read, but we do not
> have Write ->(gp';gp') Read. Only Write ->gp' Read. So if you're using
> gp' instead of gp, you'll analyze the litmus test as if it had only one
> grace period but two critical sections, getting a wrong answer.

Are you writing about the old model? Otherwise I don't see how this can
give a wrong answer.
gp' isn't used to count the grace periods (anymore?). The po <= rcu-link
allows using both grace periods to create rcu-order between the two read
side critical sections.
For the old model I believe it.

>
>
> Here's a totally different way of thinking about these things, which may
> prove enlightening. These thoughts originally occurred to me years ago,
> and I had forgotten about them until last night.
>
> If G is a grace period, let's write t1(G) for the time when G starts and
> t2(G) for the time when G ends.
>
> Likewise, if C is a read-side critical section, let's write t2(C) for
> the time when C starts (or the lock executes if you prefer) and t1(C)
> for the time when C ends (or the unlock executes). This terminology
> reflects the "backward" role that critical sections play in the memory
> model.
>
> Now we can can characterize rcu-order and rcu-link in operational terms.
> Let A and B each be either a grace period or a read-side critical
> section. Then:
>
> A ->rcu-order B means t1(A) < t2(B), and
>
> A ->rcu-link B means t2(A) <= t1(B).


That's a really elegant notation! I have thought about rcu-link and
rcu-order as ordering ends or starts depending on which events are being
ordered, but it quickly got out of hand because of all the different
cases. With this notation it becomes quite trivial.


> (Of course, we always have t1(X) < t2(X) for any grace period or
> critical section X.)
>
> This explains quite a lot. For example, we can justify including
>
> C ->rcu-link G
>
> into rcu-order as follows. From C ->rcu-link G we get that t2(C) <=
> t1(G), in other words, C starts when or before G starts. Then the
> Fundamental Law of RCU says that C must end before G ends, since
> otherwise C would span all of G. Thus t1(C) < t2(G), which is C
> ->rcu-order G.
>
> The case of G ->rcu-link C is similar.
>
> This also explains why rcu-link can be extended by appending (rcu-order
> ; rcu-link)*.

Indeed, by similar (but more clumsy) reasoning I observed that rcu-order
can be thought of as "extending" rcu-link.

> From X ->rcu-order Y ->rcu-link Z we get that t1(X) <
> t2(Y) <= t1(Z) and thus t1(X) <= t1(Z). So if
>
> A ->rcu-link B ->(rcu-order ; rcu-link)* C
>
> then t2(A) <= t1(B) <= t1(C), which justifies A ->rcu-link C.
>
> The same sort of argument shows that rcu-order should be extendable by
> appending (rcu-link ; rcu-order)* -- but not (rcu-order ; rcu-link)*.
>
> This also justifies why a lone gp belongs in rcu-order: G ->rcu-order G
> holds because t1(G) < t2(G). But for critical sections we have t2(C) <
> t1(C) and so C ->rcu-order C does not hold.
I don't think that it justifies why it belongs there. It justifies that
it could be included.
Neither rcu-order nor rcu-link exactly capture the temporal ordering,
they just imply it.
For example, if you have L1 U1 and L2 U2 forming two read side critical
sections C1 and C2, and
    U1 ->(hb|pb)+ L2
then I would say you would have
    t1(C1) < t2(C2)
but no rcu-order relation between any of the four events.

And for rcu-link this is even more obvious, because
(rcu-order;rcu-link)* does not currently actually extend rcu-link (but
it could based on the above reasoning).

In fact it seems we shouldn't even define a relation that is precisely
ordering t1(A) < t2(B) because that should be a total order on all grace
periods. As far as "observable" t1(A) < t2(B) is concerned, gp belongs
in that definition but I think it already is there through hb and/or pb.

> Assuming ordinary memory accesses occur in a single instant, you see why
> it makes sense to consider (po ; rcu-order ; po) an ordering.

Do you mean "execute" in a single instant?

> But when you're comparing grace periods or critical sections to each other,
> things get a little ambiguous. Should G1 be considered to come before
> G2 when t1(G1) < t1(G2), when t2(G1) < t2(G2), or when t2(G1) < t1(G2)?
> Springing for (po ; rcu-order ; po?) amounts to choosing the second
> alternative.

Aha, I see! Powerful notation indeed.
Keeping that in mind, wouldn't it make sense for pb also to be changed
to `...;po?` ?
Mathematically it ends up making no difference (so far), because any
cycle of
    ... ; (pb';po?) ; (rb | (pb';po?) | hb) ; ...
(where pb' is pb but where things have been redefined so that the final
po is dropped)
can be trivially turned into a (pb | hb | rb) cycle except if it is
    ... ; pb' ; rcu-order ; po ; ...
But in this case we can use pb' <= prop ; po
    ... ; prop ; po ; rcu-order ; po ; ...
which is
    ... ; rb ; ...
and thus we get again a (pb | hb | rb) cycle.

But it would be more uniform and would let us define
    xyz-order = po ; ... ; po?
    pb = prop ; ...-order
    rb = prop ; ...-order

Thanks for the insights,
jonas


2023-01-23 19:41:06

by Jonas Oberhauser

[permalink] [raw]
Subject: Re: Internal vs. external barriers (was: Re: Interesting LKMM litmus test)



On 1/23/2023 4:55 PM, Alan Stern wrote:
> On Mon, Jan 23, 2023 at 12:48:42PM +0100, Jonas Oberhauser wrote:
>>
>> On 1/21/2023 6:36 PM, Alan Stern wrote:
>>> On Fri, Jan 20, 2023 at 10:41:14PM +0100, Jonas Oberhauser wrote:
>>>> On 1/20/2023 5:18 PM, Alan Stern wrote:
>>>>> On Fri, Jan 20, 2023 at 11:13:00AM +0100, Jonas Oberhauser wrote:
>>>>>> Perhaps we could say that reading an index without using it later is
>>>>>> forbidden?
>>>>>>
>>>>>> flag ~empty [Srcu-lock];data;rf;[~ domain(data;[Srcu-unlock])] as
>>>>>> thrown-srcu-cookie-on-floor
>>>>> We already flag locks that don't have a matching unlock.
>>>> Of course, but as you know this is completely orthogonal.
>>> Yeah, okay. It doesn't hurt to add this check, but the check isn't
>>> complete. For example, it won't catch the invalid usage here:
>>>
>>> P0(srcu_struct *ss)
>>> {
>>> int r1, r2;
>>>
>>> r1 = srcu_read_lock(ss);
>>> srcu_read_unlock(ss, r1);
>>> r2 = srcu_read_lock(ss);
>>> srcu_read_unlock(ss, r2);
>>> }
>>>
>>> exists (~0:r1=0:r2)
>>>
>>> On the other hand, how often will people make this sort of mistake in
>>> their litmus tests? My guess is not very.
>> I currently don't care too much about the incorrect usage of herd (by
>> inspecting some final state incorrectly), only incorrect usage in the code.
> I'm inclined to add this check to the memory model. Would you prefer to
> submit it yourself as a separate patch? Or are you happy to have it
> merged with my patch, and if so, do you have a final, preferred form for
> the check?

After clearing my confusion, I'm no longer sure if it should be added.
If you're still inclined to have it, I would prefer to submit the patch,
but I'd like to define the use-cookie relation (=
(data|[~Srcu-unlock];rfe)+) and use it also to clarify the srcu match
definition (I almost would like to do that anyways :D).
Is that ok?

jonas


2023-01-23 19:58:34

by Alan Stern

[permalink] [raw]
Subject: Re: Internal vs. external barriers (was: Re: Interesting LKMM litmus test)

On Mon, Jan 23, 2023 at 05:16:27PM +0100, Jonas Oberhauser wrote:
>
>
> On 1/19/2023 5:41 PM, Alan Stern wrote:
> > The point is that in P1, we have Write ->(gp;gp) Read, but we do not
> > have Write ->(gp';gp') Read. Only Write ->gp' Read. So if you're using
> > gp' instead of gp, you'll analyze the litmus test as if it had only one
> > grace period but two critical sections, getting a wrong answer.
>
> Are you writing about the old model? Otherwise I don't see how this can give
> a wrong answer.
> gp' isn't used to count the grace periods (anymore?). the po<=rcu-link
> allows using both grace periods to create rcu-order between the two read
> side critical sections.
> For the old model I believe it.

Yes, I was talking about the old version of the memory model.

> > If G is a grace period, let's write t1(G) for the time when G starts and
> > t2(G) for the time when G ends.
> >
> > Likewise, if C is a read-side critical section, let's write t2(C) for
> > the time when C starts (or the lock executes if you prefer) and t1(C)
> > for the time when C ends (or the unlock executes). This terminology
> > reflects the "backward" role that critical sections play in the memory
> > model.
> >
> > Now we can can characterize rcu-order and rcu-link in operational terms.
> > Let A and B each be either a grace period or a read-side critical
> > section. Then:
> >
> > A ->rcu-order B means t1(A) < t2(B), and
> >
> > A ->rcu-link B means t2(A) <= t1(B).
>
>
> That's a really elegant notation! I have thought about rcu-link and
> rcu-order as ordering ends or starts depending on which events are being
> ordered, but it quickly got out of hand because of all the different cases.
> With this notation it becomes quite trivial.
>
>
> > (Of course, we always have t1(X) < t2(X) for any grace period or
> > critical section X.)

Actually, it might make more sense to allow t1(C) = t2(C) for a critical
section C, because critical sections can be empty. Grace periods, by
contrast, always have to contain at least a full memory barrier.
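
As a quick sanity check, the t1/t2 reasoning can be tested mechanically
with a toy model (a sketch only: events here are bare integer pairs, the
names are made up, and nothing of the real memory model is involved). It
exhaustively verifies, on small timestamps, the absorption law that
A ->rcu-link B ->(rcu-order ; rcu-link) D implies A ->rcu-link D:

```python
from itertools import product

# An "event" is a pair (t1, t2) of integer timestamps. For a grace
# period G we'd have t1(G) < t2(G); for a critical section C the
# convention is reversed (t2 = start, t1 = end), but the check below
# quantifies over all pairs, so it covers both.
def rcu_link(a, b):   # A ->rcu-link B   means  t2(A) <= t1(B)
    return a[1] <= b[0]

def rcu_order(a, b):  # A ->rcu-order B  means  t1(A) < t2(B)
    return a[0] < b[1]

events = list(product(range(4), repeat=2))

# A ->rcu-link B ->rcu-order C ->rcu-link D  implies  A ->rcu-link D,
# since t2(A) <= t1(B) < t2(C) <= t1(D).
ok = all(rcu_link(a, d)
         for a, b, c, d in product(events, repeat=4)
         if rcu_link(a, b) and rcu_order(b, c) and rcu_link(c, d))
print(ok)  # -> True
```

The same loop with the roles of rcu-link and rcu-order swapped checks
the dual extension of rcu-order by (rcu-link ; rcu-order)*.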

> > This explains quite a lot. For example, we can justify including
> >
> > C ->rcu-link G
> >
> > into rcu-order as follows. From C ->rcu-link G we get that t2(C) <=
> > t1(G), in other words, C starts when or before G starts. Then the
> > Fundamental Law of RCU says that C must end before G ends, since
> > otherwise C would span all of G. Thus t1(C) < t2(G), which is C
> > ->rcu-order G.
> >
> > The case of G ->rcu-link C is similar.
> >
> > This also explains why rcu-link can be extended by appending (rcu-order
> > ; rcu-link)*.
>
> Indeed, by similar (but more clumsy) reasoning I observed that rcu-order can
> be thought of as "extending" rcu-link.
>
> > From X ->rcu-order Y ->rcu-link Z we get that t1(X) <
> > t2(Y) <= t1(Z) and thus t1(X) <= t1(Z). So if
> >
> > A ->rcu-link B ->(rcu-order ; rcu-link)* C
> >
> > then t2(A) <= t1(B) <= t1(C), which justifies A ->rcu-link C.
> >
> > The same sort of argument shows that rcu-order should be extendable by
> > appending (rcu-link ; rcu-order)* -- but not (rcu-order ; rcu-link)*.
> >
> > This also justifies why a lone gp belongs in rcu-order: G ->rcu-order G
> > holds because t1(G) < t2(G). But for critical sections we have t2(C) <
> > t1(C) and so C ->rcu-order C does not hold.
> I don't think that it justifies why it belongs there. It justifies that it
> could be included.
> Neither rcu-order nor rcu-link exactly capture the temporal ordering, they
> just imply it.
> For example, if you have L1 U1 and L2 U2 forming two read side critical
> sections C1 and C2, and
>     U1 ->(hb|pb)+ L2
> then I would say you would have
>     t1(C1) < t2(C2)
> but no rcu-order relation between any of the four events.

True, I should have said it suggests a reason for allowing rcu-order to
contain a lone gp.

> > Assuming ordinary memory accesses occur in a single instant, you see why
> > it makes sense to consider (po ; rcu-order ; po) an ordering.
>
> Do you mean "execute" in a single instant?

Yes, or to put it another way, t1(X) = t2(X) if X is a load or store.

> > But when you're comparing grace periods or critical sections to each other,
> > things get a little ambiguous. Should G1 be considered to come before
> > G2 when t1(G1) < t1(G2), when t2(G1) < t2(G2), or when t2(G1) < t1(G2)?
> > Springing for (po ; rcu-order ; po?) amounts to choosing the second
> > alternative.
>
> Aha, I see! Powerful notation indeed.
> Keeping that in mind, wouldn't it make sense for pb also be changed to
> `...;po?` ?

You mean changing the definition of pb to either:

prop ; strong-fence ; hb* ; po? ; [Marked]

or

prop ; strong-fence ; hb* ; [Marked] ; po? ; [Marked]

? Neither would be right. I'm sure you can easily come up with
examples of cycles in these relations, invalidating the propagation
axiom acyclic(pb).

rcu-fence is different because rcu-order has to begin and end with
either a grace period or a critical section, and both of these restrict
the execution order of surrounding events:

If X is a synchronize_rcu() or rcu_read_unlock() then events
po-before X must execute before X;

If X is a synchronize_rcu() or rcu_read_lock() then events
po-after X must execute after X.

The same cannot be said of hb or pb.

Alan

2023-01-23 20:07:38

by Jonas Oberhauser

[permalink] [raw]
Subject: Re: Internal vs. external barriers (was: Re: Interesting LKMM litmus test)



On 1/23/2023 8:58 PM, Alan Stern wrote:
> On Mon, Jan 23, 2023 at 05:16:27PM +0100, Jonas Oberhauser wrote:
>> On 1/19/2023 5:41 PM, Alan Stern wrote:
>>
>>> But when you're comparing grace periods or critical sections to each other,
>>> things get a little ambiguous. Should G1 be considered to come before
>>> G2 when t1(G1) < t1(G2), when t2(G1) < t2(G2), or when t2(G1) < t1(G2)?
>>> Springing for (po ; rcu-order ; po?) amounts to choosing the second
>>> alternative.
>> Aha, I see! Powerful notation indeed.
>> Keeping that in mind, wouldn't it make sense for pb also be changed to
>> `...;po?` ?
> You mean changing the definition of pb to either:
>
> prop ; strong-fence ; hb* ; po? ; [Marked]
>
> or
>
> prop ; strong-fence ; hb* ; [Marked] ; po? ; [Marked]

Oh no, not at all!

I mean that
    pb = prop ; po ; {strong ordering-operation} ; po ; hb* ; [Marked]
could instead be
    pb = prop ; po ; {strong ordering-operation} ; po? ; hb* ; [Marked]

(note that the po ; ... ; po part is actually folded inside the actual
definition of strong fence).

> rcu-fence is different because rcu-order has to begin and end with
> either a grace period or a critical section, and both of these restrict
> the execution order of surrounding events:
>
> If X is a synchronize_rcu() or rcu_read_unlock() then events
> po-before X must execute before X;
>
> If X is a synchronize_rcu() or rcu_read_lock() then events
> po-after X must execute after X.
>
I believe so do the strong ordering-operations in pb.
best wishes, jonas


2023-01-23 20:17:08

by Paul E. McKenney

[permalink] [raw]
Subject: Re: Internal vs. external barriers (was: Re: Interesting LKMM litmus test)

On Sun, Jan 22, 2023 at 03:32:24PM -0500, Alan Stern wrote:
> On Fri, Jan 20, 2023 at 01:20:37PM -0800, Paul E. McKenney wrote:
> > On Fri, Jan 20, 2023 at 03:36:24PM -0500, Alan Stern wrote:
> > > On Fri, Jan 20, 2023 at 11:20:32AM -0800, Paul E. McKenney wrote:
> > > > On Fri, Jan 20, 2023 at 01:37:51PM -0500, Alan Stern wrote:
> > > > > srcu_read_unlock() does not need a full smp_mb().
> > > >
> > > > That is quite possible, and that is what we are looking into. And testing
> > > > thus far agrees with you. But the grace-period ordering constraints
> > > > are quite severe, so this requires careful checking and severe testing.
> > >
> > > If you're interested, I can provide a simple argument to show that the
> > > Fundamental Law of RCU would continue to hold with only a release fence.
> > > There is an added requirement: merely that synchronize_srcu() must have
> > > an smp_mb() somewhere after its final read of the unlock counters --
> > > which your version of the algorithm already has.
> >
> > Please!
> >
> > For your amusement, here is a very informal argument that this is
> > the case:
> >
> > https://docs.google.com/document/d/1xvwQzavmH474MBPAIBqVyvCrCcS5j2BpqhErPhRj7Is/edit?usp=sharing
> >
> > See the "Read-Side Optimizations" section at the end.
>
> It looks like you've got the basic idea. Most of the complications seem
> to arise from the different ways a grace period can happen.
>
> Here's what I was thinking. Let C be a read-side critical section, with
> L being its invocation of srcu_down_read() and U being the matching
> invocation of srcu_up_read(). Let idx be the index value read by L (and
> used by U). I will assume that L has the form:
>
> idx = READ_ONCE(ss->index);
> temp = this_cpu(ss->lock)[idx];
> WRITE_ONCE(this_cpu(ss->lock)[idx], temp + 1)
> smp_mb();
>
> (or whatever is the right syntax for incrementing a per-cpu array
> element).

The actual code uses this_cpu_inc() in order to permit srcu_read_lock()
and srcu_read_unlock() to be used in softirq and interrupt handlers,
but yes, ignoring interrupts, this is the form.

> Likewise, assume U has the form:
>
> temp = this_cpu(ss->unlock)[idx];
> smp_store_release(&this_cpu(ss->unlock)[idx], temp + 1);

And same here.

> Let G be any SRCU grace period -- an invocation of synchronize_srcu(ss).
> Assume G has the overall form:
>
> accumulate_and_compare_loop(!ss->index);
> smp_mb();
> WRITE_ONCE(ss->index, !ss->index);
> smp_mb();
> accumulate_and_compare_loop(!ss->index);
>
> where accumulate_and_compare_loop(i) has the form:
>
> do {
> s = t = 0;
> for each CPU c:
> s += READ_ONCE(cpu(c, ss->unlock)[i]);
> smp_mb();
> for each CPU c:
> t += READ_ONCE(cpu(c, ss->lock)[i]);
> } while (s != t);
>
> It's not too hard to show, and I trust you already believe, that in the
> final iteration of the accumulate_and_compare_loop(i) call for which
> i = idx, the lock-counter increment in L is observed if and only if the
> unlock-counter increment in U is observed. Thus we have two cases:
>
> Case 1: Both of the increments are observed. Since the increment in U
> is a store-release, every write that propagated to U's CPU before the
> increment is therefore visible to G's CPU before its last read of an
> unlock counter. Since the full fence in accumulate_and_compare_loop()
> is executed after the last such read, these writes must propagate to
> every CPU before G ends.
>
> Case 2: Neither of the increments is observed. Let W be any write which
> propagated to G's CPU before G started. Does W propagate to C before L
> ends? We have the following SB or RWC pattern:
>
> G C
> ------------------------ -----------------------
> W propagates to G's CPU L writes lock counter
> G does smp_mb() L does smp_mb()
> G reads L's lock counter W propagates to C's CPU
>
> (The smp_mb() in the left column is the one in
> accumulate_and_compare_loop(idx), which precedes the reads of the lock
> counters.)
>
> If L's smp_mb() ended before G's did then L's write to the lock counter
> would have propagated to G's CPU before G's smp_mb() ended, and hence G
> would have observed the lock-counter increment. Since this didn't
> happen, we know that G's smp_mb() ends before L's does. This means that
> W must propagate to every CPU before L terminates, and hence before C's
> critical section starts.
>
> Together, these two cases cover the requirements of the Fundamental Law
> of RCU. The memory barrier in U was needed only in Case 1, and there it
> only needed to be a release fence.

Looks good to me!
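
As an aside, the counter bookkeeping in the quoted algorithm can be
mimicked in a sequential toy sketch. All names below are made up, and
concurrency, interrupts, and memory ordering are ignored entirely, so
this shows only the counting logic, not the ordering argument:

```python
# Sequential toy of the SRCU counter scheme: two counter banks selected
# by ss.index; readers bump lock[idx] on entry and unlock[idx] on exit;
# a grace period waits for the inactive bank to drain, flips the index,
# then waits for the newly inactive bank to drain.
class ToySrcu:
    def __init__(self):
        self.index = 0
        self.lock = [0, 0]
        self.unlock = [0, 0]

    def read_lock(self):
        idx = self.index
        self.lock[idx] += 1
        return idx

    def read_unlock(self, idx):
        self.unlock[idx] += 1

    def bank_drained(self, i):
        # accumulate_and_compare_loop(i) exits when s == t.
        return self.unlock[i] == self.lock[i]

    def try_synchronize(self):
        # Mirrors: wait on !index, flip index, wait on the old index.
        if not self.bank_drained(1 - self.index):
            return False
        self.index = 1 - self.index
        return self.bank_drained(1 - self.index)

ss = ToySrcu()
idx = ss.read_lock()
print(ss.try_synchronize())   # -> False (a reader is still in flight)
ss.read_unlock(idx)
print(ss.try_synchronize())   # -> True
```

In the real algorithm the interesting part is of course which increments
a concurrent grace period observes; that is what the two cases in the
proof above are about.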

One twist is that the design of both SRCU and RCU are stronger than LKMM
requires, as illustrated by the litmus test at the end of this email.

I believe that your proof outline above also covers this case, but I
figure that I should ask.

Thanx, Paul

------------------------------------------------------------------------

C C-srcu-observed-2

(*
* Result: Sometimes
*
* But please note that the Linux-kernel SRCU implementation is designed
* to provide Never.
*)

{}

P0(int *x, int *y, int *z, struct srcu_struct *s)
{
int r1;
int r2;

r1 = srcu_read_lock(s);
WRITE_ONCE(*y, 1);
WRITE_ONCE(*x, 1);
srcu_read_unlock(s, r3);
}

P1(int *x, int *y, int *z, struct srcu_struct *s)
{
int r1;
int r2;

r1 = READ_ONCE(*y);
synchronize_srcu(s);
WRITE_ONCE(*z, 1);
}

P2(int *x, int *y, int *z, struct srcu_struct *s)
{
int r1;

WRITE_ONCE(*z, 2);
smp_mb();
r2 = READ_ONCE(*x);
}

exists (1:r1=1 /\ 1:r2=0 /\ z=1)

2023-01-23 20:34:06

by Alan Stern

[permalink] [raw]
Subject: Re: Internal vs. external barriers (was: Re: Interesting LKMM litmus test)

On Mon, Jan 23, 2023 at 08:40:24PM +0100, Jonas Oberhauser wrote:
>
>
> On 1/23/2023 4:55 PM, Alan Stern wrote:
> > I'm inclined to add this check to the memory model. Would you prefer to
> > submit it yourself as a separate patch? Or are you happy to have it
> > merged with my patch, and if so, do you have a final, preferred form for
> > the check?
>
> After clearing my confusion, I'm no longer sure if it should be added. If
> you're still inclined to have it, I would prefer to submit the patch, but
> I'd like to define the use-cookie relation (= (data|[~Srcu-unlock];rfe)+)
> and use it also to clarify the srcu match definition (I almost would like to
> do that anyways :D).
> Is that ok?

Write up a patch and we can all judge it.

Alan

2023-01-23 20:42:18

by Alan Stern

[permalink] [raw]
Subject: Re: Internal vs. external barriers (was: Re: Interesting LKMM litmus test)

On Mon, Jan 23, 2023 at 09:06:54PM +0100, Jonas Oberhauser wrote:
>
>
> On 1/23/2023 8:58 PM, Alan Stern wrote:
> > On Mon, Jan 23, 2023 at 05:16:27PM +0100, Jonas Oberhauser wrote:
> > > On 1/19/2023 5:41 PM, Alan Stern wrote:
> > >
> > > > But when you're comparing grace periods or critical sections to each other,
> > > > things get a little ambiguous. Should G1 be considered to come before
> > > > G2 when t1(G1) < t1(G2), when t2(G1) < t2(G2), or when t2(G1) < t1(G2)?
> > > > Springing for (po ; rcu-order ; po?) amounts to choosing the second
> > > > alternative.
> > > Aha, I see! Powerful notation indeed.
> > > Keeping that in mind, wouldn't it make sense for pb also be changed to
> > > `...;po?` ?
> > You mean changing the definition of pb to either:
> >
> > prop ; strong-fence ; hb* ; po? ; [Marked]
> >
> > or
> >
> > prop ; strong-fence ; hb* ; [Marked] ; po? ; [Marked]
>
> Oh no, not at all!
>
> I mean that
>     pb = prop ; po ; {strong ordering-operation} ; po ; hb* ; [Marked]
> could instead be
>     pb = prop ; po ; {strong ordering-operation} ; po? ; hb* ; [Marked]
>
> (note that the po ; ... ; po part is actually folded inside the actual
> definition of strong fence).

This goes back to the original herd models, before the LKMM came about:
The fencerel() macro uses po on both sides. I believe the motivating
idea back then was that ordering should apply only to memory accesses
(which can in practice be observed), not to other types of events such
as memory barriers.

> > rcu-fence is different because rcu-order has to begin and end with
> > either a grace period or a critical section, and both of these restrict
> > the execution order of surrounding events:
> >
> > If X is a synchronize_rcu() or rcu_read_unlock() then events
> > po-before X must execute before X;
> >
> > If X is a synchronize_rcu() or rcu_read_lock() then events
> > po-after X must execute after X.
> >
> I believe so do the strong ordering-operations in pb.

But the beginning and end of a pb link (for example, overwrite and hb)
don't need to be strong-ordering operations.

Alan

2023-01-24 02:18:20

by Alan Stern

[permalink] [raw]
Subject: Re: Internal vs. external barriers (was: Re: Interesting LKMM litmus test)

On Mon, Jan 23, 2023 at 12:16:59PM -0800, Paul E. McKenney wrote:
> One twist is that the design of both SRCU and RCU are stronger than LKMM
> requires, as illustrated by the litmus test at the end of this email.
>
> I believe that your proof outline above also covers this case, but I
> figure that I should ask.

This test is full of typos, and I guess that one of them seriously
affects the meaning, because as far as I can tell the corrected test is
allowed.

> C C-srcu-observed-2
>
> (*
> * Result: Sometimes
> *
> * But please note that the Linux-kernel SRCU implementation is designed
> * to provide Never.
> *)
>
> {}
>
> P0(int *x, int *y, int *z, struct srcu_struct *s)
> {
> int r1;
> int r2;

r2 is never used.

>
> r1 = srcu_read_lock(s);
> WRITE_ONCE(*y, 1);
> WRITE_ONCE(*x, 1);
> srcu_read_unlock(s, r3);

There is no r3; this should be r1.

> }
>
> P1(int *x, int *y, int *z, struct srcu_struct *s)
> {
> int r1;
> int r2;

r2 is never used.

>
> r1 = READ_ONCE(*y);
> synchronize_srcu(s);
> WRITE_ONCE(*z, 1);
> }
>
> P2(int *x, int *y, int *z, struct srcu_struct *s)
> {
> int r1;

r1 is never used; it should be r2.

>
> WRITE_ONCE(*z, 2);
> smp_mb();
> r2 = READ_ONCE(*x);
> }
>
> exists (1:r1=1 /\ 1:r2=0 /\ z=1)

1:r2 is never used. Apparently this should be 2:r2.

Given those changes, the test can run as follows: P2 runs to completion,
writing z=2 and reading x=0. Then P0 runs to completion, writing y=1
and x=1. Then P1 runs to completion, reading y=1 and overwriting z=1.

Alan

2023-01-24 04:07:36

by Paul E. McKenney

[permalink] [raw]
Subject: Re: Internal vs. external barriers (was: Re: Interesting LKMM litmus test)

On Mon, Jan 23, 2023 at 09:18:14PM -0500, Alan Stern wrote:
> On Mon, Jan 23, 2023 at 12:16:59PM -0800, Paul E. McKenney wrote:
> > One twist is that the design of both SRCU and RCU are stronger than LKMM
> > requires, as illustrated by the litmus test at the end of this email.
> >
> > I believe that your proof outline above also covers this case, but I
> > figure that I should ask.
>
> This test is full of typos, and I guess that one of them seriously
> affects the meaning, because as far as I can tell the corrected test is
> allowed.
>
> > C C-srcu-observed-2
> >
> > (*
> > * Result: Sometimes
> > *
> > * But please note that the Linux-kernel SRCU implementation is designed
> > * to provide Never.
> > *)
> >
> > {}
> >
> > P0(int *x, int *y, int *z, struct srcu_struct *s)
> > {
> > int r1;
> > int r2;
>
> r2 is never used.
>
> >
> > r1 = srcu_read_lock(s);
> > WRITE_ONCE(*y, 1);
> > WRITE_ONCE(*x, 1);
> > srcu_read_unlock(s, r3);
>
> There is no r3; this should be r1.
>
> > }
> >
> > P1(int *x, int *y, int *z, struct srcu_struct *s)
> > {
> > int r1;
> > int r2;
>
> r2 is never used.
>
> >
> > r1 = READ_ONCE(*y);
> > synchronize_srcu(s);
> > WRITE_ONCE(*z, 1);
> > }
> >
> > P2(int *x, int *y, int *z, struct srcu_struct *s)
> > {
> > int r1;
>
> r1 is never used; it should be r2.
>
> >
> > WRITE_ONCE(*z, 2);
> > smp_mb();
> > r2 = READ_ONCE(*x);
> > }
> >
> > exists (1:r1=1 /\ 1:r2=0 /\ z=1)
>
> 1:r2 is never used. Apparently this should be 2:r2.
>
> Given those changes, the test can run as follows: P2 runs to completion,
> writing z=2 and reading x=0. Then P0 runs to completion, writing y=1
> and x=1. Then P1 runs to completion, reading y=1 and overwriting z=1.

All that and I also messed up by not having "z=2". :-/

Thank you for looking it over!

But the following one is forbidden, the Result comment below
notwithstanding. I could have sworn that there was some post-grace-period
write-to-write litmus test that LKMM allowed, but if so, this one is
not it.

------------------------------------------------------------------------

C C-srcu-observed-2

(*
* Result: Sometimes
*
* But please note that the Linux-kernel SRCU implementation is designed
* to provide Never.
*)

{}

P0(int *x, int *y, int *z, struct srcu_struct *s)
{
int r1;

r1 = srcu_read_lock(s);
WRITE_ONCE(*y, 1);
WRITE_ONCE(*x, 1);
srcu_read_unlock(s, r1);
}

P1(int *x, int *y, int *z, struct srcu_struct *s)
{
int r1;

r1 = READ_ONCE(*y);
synchronize_srcu(s);
WRITE_ONCE(*z, 1);
}

P2(int *x, int *y, int *z, struct srcu_struct *s)
{
int r1;

WRITE_ONCE(*z, 2);
smp_mb();
r1 = READ_ONCE(*x);
}

exists (1:r1=1 /\ 2:r1=0 /\ z=2)

------------------------------------------------------------------------

There is the one below, but I am (1) not sure that I have it right,
(2) not immediately certain that the Linux-kernel implementation would
forbid it, (3) not immediately sure that it should be forbidden.

In the meantime, thoughts?

Thanx, Paul

------------------------------------------------------------------------

C C-srcu-observed-3

(*
* Result: Sometimes
*)

{}

P0(int *x, int *y, int *z, struct srcu_struct *s)
{
int r1;

r1 = srcu_read_lock(s);
WRITE_ONCE(*y, 1);
WRITE_ONCE(*x, 1);
srcu_read_unlock(s, r1);
}

P1(int *x, int *y, int *z, struct srcu_struct *s)
{
int r1;

r1 = READ_ONCE(*y);
synchronize_srcu(s);
WRITE_ONCE(*z, 1);
}

P2(int *x, int *y, int *z, struct srcu_struct *s)
{
WRITE_ONCE(*z, 2);
smp_mb();
WRITE_ONCE(*x, 2);
}

exists (1:r1=1 /\ x=2 /\ z=2)

2023-01-24 11:10:06

by Andrea Parri

[permalink] [raw]
Subject: Re: Internal vs. external barriers (was: Re: Interesting LKMM litmus test)

> There is the one below, but I am (1) not sure that I have it right,
> (2) not immediately certain that the Linux-kernel implementation would
> forbid it, (3) not immediately sure that it should be forbidden.
>
> In the meantime, thoughts?

As it stands, P0 to completion, then P1 to completion, then P2 to
completion should meet the "exists" clause; I guess we want "x=1"
in the clause (or the values of the stores to "x" exchanged).

Andrea


> ------------------------------------------------------------------------
>
> C C-srcu-observed-3
>
> (*
> * Result: Sometimes
> *)
>
> {}
>
> P0(int *x, int *y, int *z, struct srcu_struct *s)
> {
> int r1;
>
> r1 = srcu_read_lock(s);
> WRITE_ONCE(*y, 1);
> WRITE_ONCE(*x, 1);
> srcu_read_unlock(s, r1);
> }
>
> P1(int *x, int *y, int *z, struct srcu_struct *s)
> {
> int r1;
>
> r1 = READ_ONCE(*y);
> synchronize_srcu(s);
> WRITE_ONCE(*z, 1);
> }
>
> P2(int *x, int *y, int *z, struct srcu_struct *s)
> {
> WRITE_ONCE(*z, 2);
> smp_mb();
> WRITE_ONCE(*x, 2);
> }
>
> exists (1:r1=1 /\ x=2 /\ z=2)

2023-01-24 13:22:25

by Jonas Oberhauser

[permalink] [raw]
Subject: Re: Internal vs. external barriers (was: Re: Interesting LKMM litmus test)



On 1/23/2023 9:41 PM, Alan Stern wrote:
> On Mon, Jan 23, 2023 at 09:06:54PM +0100, Jonas Oberhauser wrote:
>>
>> On 1/23/2023 8:58 PM, Alan Stern wrote:
>>> On Mon, Jan 23, 2023 at 05:16:27PM +0100, Jonas Oberhauser wrote:
>>>> On 1/19/2023 5:41 PM, Alan Stern wrote:
>>>>
>>>>> But when you're comparing grace periods or critical sections to each other,
>>>>> things get a little ambiguous. Should G1 be considered to come before
>>>>> G2 when t1(G1) < t1(G2), when t2(G1) < t2(G2), or when t2(G1) < t1(G2)?
>>>>> Springing for (po ; rcu-order ; po?) amounts to choosing the second
>>>>> alternative.
>>>> Aha, I see! Powerful notation indeed.
>>>> Keeping that in mind, wouldn't it make sense for pb also be changed to
>>>> `...;po?` ?
>>> You mean changing the definition of pb to either:
>>>
>>> prop ; strong-fence ; hb* ; po? ; [Marked]
>>>
>>> or
>>>
>>> prop ; strong-fence ; hb* ; [Marked] ; po? ; [Marked]
>> Oh no, not at all!
>>
>> I mean that
>>     pb = prop ; po ; {strong ordering-operation} ; po ; hb* ; [Marked]
>> could instead be
>>     pb = prop ; po ; {strong ordering-operation} ; po? ; hb* ; [Marked]
>>
>> (note that the po ; ... ; po part is actually folded inside the actual
>> definition of strong fence).
> This goes back to the original herd models, before the LKMM came about:
> The fencerel() macro uses po on both sides. I believe the motivating
> idea back then was that ordering should apply only to memory accesses
> (which can in practice be observed), not to other types of events such
> as memory barriers.
I see. I believe this argument no longer strictly holds, now that rcu-gp
needs to be ordered in some cases.

>>> rcu-fence is different because rcu-order has to begin and end with
>>> either a grace period or a critical section, and both of these restrict
>>> the execution order of surrounding events:
>>>
>>> If X is a synchronize_rcu() or rcu_read_unlock() then events
>>> po-before X must execute before X;
>>>
>>> If X is a synchronize_rcu() or rcu_read_lock() then events
>>> po-after X must execute after X.
>>>
>> I believe so do the strong ordering-operations in pb.
> But the beginning and end of a pb link (for example, overwrite and hb)
> don't need to be strong-ordering operations.
Of course, but I'm not suggesting to put a po? at those locations.

have fun, jonas


2023-01-24 14:54:32

by Paul E. McKenney

[permalink] [raw]
Subject: Re: Internal vs. external barriers (was: Re: Interesting LKMM litmus test)

On Tue, Jan 24, 2023 at 12:09:48PM +0100, Andrea Parri wrote:
> > There is the one below, but I am (1) not sure that I have it right,
> > (2) not immediately certain that the Linux-kernel implementation would
> > forbid it, (3) not immediately sure that it should be forbidden.
> >
> > In the meantime, thoughts?
>
> As it stands, P0 to completion, then P1 to completion, then P2 to
> completion should meet the "exists" clause; I guess we want "x=1"
> in the clause (or the values of the stores to "x" exchanged).

OK, so I still don't have it right. ;-)

Make that x=1. I think.

Thanx, Paul

> Andrea
>
>
> > ------------------------------------------------------------------------
> >
> > C C-srcu-observed-3
> >
> > (*
> > * Result: Sometimes
> > *)
> >
> > {}
> >
> > P0(int *x, int *y, int *z, struct srcu_struct *s)
> > {
> > int r1;
> >
> > r1 = srcu_read_lock(s);
> > WRITE_ONCE(*y, 1);
> > WRITE_ONCE(*x, 1);
> > srcu_read_unlock(s, r1);
> > }
> >
> > P1(int *x, int *y, int *z, struct srcu_struct *s)
> > {
> > int r1;
> >
> > r1 = READ_ONCE(*y);
> > synchronize_srcu(s);
> > WRITE_ONCE(*z, 1);
> > }
> >
> > P2(int *x, int *y, int *z, struct srcu_struct *s)
> > {
> > WRITE_ONCE(*z, 2);
> > smp_mb();
> > WRITE_ONCE(*x, 2);
> > }
> >
> > exists (1:r1=1 /\ x=2 /\ z=2)

2023-01-24 15:12:04

by Jonas Oberhauser

[permalink] [raw]
Subject: Re: Internal vs. external barriers (was: Re: Interesting LKMM litmus test)



On 1/24/2023 3:54 PM, Paul E. McKenney wrote:
> On Tue, Jan 24, 2023 at 12:09:48PM +0100, Andrea Parri wrote:
>>> There is the one below, but I am (1) not sure that I have it right,
>>> (2) not immediately certain that the Linux-kernel implementation would
>>> forbid it, (3) not immediately sure that it should be forbidden.
>>>
>>> In the meantime, thoughts?
>> As it stands, P0 to completion, then P1 to completion, then P2 to
>> completion should meet the "exists" clause; I guess we want "x=1"
>> in the clause (or the values of the stores to "x" exchanged).
> OK, so I still don't have it right. ;-)
>
> Make that x=1. I think.
>

If it is x=1, why doesn't LKMM forbid it?
Because P0's write y=1 is read by P1 before the GP, the whole CS is
before the GP, i.e.,

srcu_read_unlock(s, r1); ->rcu-order synchronize_srcu(s);

The GP is furthermore po;prop;strong-fence;prop;po ordered before the
unlock, which you can shuffle around to get
    Wx=2  ->prop ; po ; rcu-order ; po ; prop ; strong-fence  Wx=2
or
    Wx=2  ->rb  Wx=2
which is forbidden because rb is irreflexive.

Right?

jonas


2023-01-24 15:55:34

by Jonas Oberhauser

[permalink] [raw]
Subject: Re: Internal vs. external barriers (was: Re: Interesting LKMM litmus test)



On 1/19/2023 5:41 PM, Alan Stern wrote:
> On Thu, Jan 19, 2023 at 12:22:50PM +0100, Jonas Oberhauser wrote:
>>
>> On 1/19/2023 3:28 AM, Alan Stern wrote:
>>>> This is a permanent error; I've given up. Sorry it didn't
>>> work out.
>> [It seems the e-mail still reached me through the mailing list]
> [For everyone else, Jonas is referring to the fact that the last two
> emails I sent to his huaweicloud.com address could not be delivered, so
> I copied them off-list to his huawei.com address.]
>
>>>> I consider that a hack though and don't like it.
>>> It _is_ a bit of a hack, but not a huge one. srcu_read_lock() really
>>> is a lot like a load, in that it returns a value obtained by reading
>>> something from memory (along with some other operations, though, so it
>>> isn't a simple straightforward read -- perhaps more like an
>>> atomic_inc_return_relaxed).
>> The issue I have with this is that it might create accidental ordering. How
>> does it behave when you throw fences in the mix?
> I think this isn't going to be a problem. Certainly any real
> implementation of srcu_read_lock() is going to involve some actual load
> operations, so any unintentional ordering caused by fences will also
> apply to real executions. Likewise for srcu_read_unlock and store
> operations.

Note that there may indeed be reads in the implementation, but most
likely not from the srcu_read_unlock()s of other threads. Most probably
from the synchronize_srcu() calls. So the rfe edges being added are
probably not corresponding to any rfe edges in the implementation.

That said, I believe there may indeed not be any restrictions in
behavior caused by this, because any code that relies on the order being
a certain thing would need to use some other ordering mechanism, and
that would probably restrict the behavior anyways.

It does have the negative side-effect of creating an explosion of
permutations though, by ordering all unlocks() in a total way and also
sometimes allowing multiple options for each lock() (e.g.,
lock();unlock() || lock();unlock() has 4 executions instead of 1).
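
Under the load/store modeling of srcu_read_lock()/srcu_read_unlock()
discussed above, even a pair of empty critical sections shows this: the two
unlock "stores" are related by co in either order, and each lock "load" can
read from more than one place. A minimal sketch (the test name is made up,
and the exists clause is just a placeholder so herd7 has something to check):

```
C srcu-two-empty-readers

{}

P0(struct srcu_struct *s)
{
	int r0;

	r0 = srcu_read_lock(s);
	srcu_read_unlock(s, r0);
}

P1(struct srcu_struct *s)
{
	int r1;

	r1 = srcu_read_lock(s);
	srcu_read_unlock(s, r1);
}

exists (0:r0=0)
```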

Anyways, not much to be done about it right now.

best wishes, jonas


2023-01-24 16:23:10

by Paul E. McKenney

[permalink] [raw]
Subject: Re: Internal vs. external barriers (was: Re: Interesting LKMM litmus test)

On Tue, Jan 24, 2023 at 04:11:14PM +0100, Jonas Oberhauser wrote:
>
>
> On 1/24/2023 3:54 PM, Paul E. McKenney wrote:
> > On Tue, Jan 24, 2023 at 12:09:48PM +0100, Andrea Parri wrote:
> > > > There is the one below, but I am (1) not sure that I have it right,
> > > > (2) not immediately certain that the Linux-kernel implementation would
> > > > forbid it, (3) not immediately sure that it should be forbidden.
> > > >
> > > > In the meantime, thoughts?
> > > As it stands, P0 to completion, then P1 to completion, then P2 to
> > > completion should meet the "exists" clause; I guess we want "x=1"
> > > in the clause (or the values of the stores to "x" exchanged).
> > OK, so I still don't have it right. ;-)
> >
> > Make that x=1. I think.
> >
>
> If it is x=1, why doesn't LKMM forbid it?
> Because T1:y=1 is read by T1 before the GP, the whole CS is before the GP,
> i.e.,
>
> srcu_read_unlock(s, r1); ->rcu-order synchronize_srcu(s);
>
> The GP is furthermore po;prop;strong-fence;prop;po ordered before the
> unlock, which you can shuffle around to get
>     Wx=2  ->prop;po;rcu-order;po ;  prop;strong-fence  Wx=2
> or
>     Wx=2  ->rb  Wx=2
> which is forbidden because rb is irreflexive.
>
> Right?

Yes according to herd7, hence the "I think". I clearly recall some
store-based lack of ordering after a grace period from some years back,
and am thus far failing to reproduce it.

And here is another attempt that herd7 actually does allow.

So what did I mess up this time? ;-)

Thanx, Paul

------------------------------------------------------------------------

C C-srcu-observed-4

(*
* Result: Sometimes
*
* The Linux-kernel implementation is suspected to forbid this.
*)

{}

P0(int *x, int *y, int *z, struct srcu_struct *s)
{
	int r1;

	r1 = srcu_read_lock(s);
	WRITE_ONCE(*y, 2);
	WRITE_ONCE(*x, 1);
	srcu_read_unlock(s, r1);
}

P1(int *x, int *y, int *z, struct srcu_struct *s)
{
	int r1;

	WRITE_ONCE(*y, 1);
	synchronize_srcu(s);
	WRITE_ONCE(*z, 2);
}

P2(int *x, int *y, int *z, struct srcu_struct *s)
{
	WRITE_ONCE(*z, 1);
	smp_store_release(x, 2);
}

exists (x=1 /\ y=1 /\ z=1)

2023-01-24 16:41:02

by Jonas Oberhauser

[permalink] [raw]
Subject: Re: Internal vs. external barriers (was: Re: Interesting LKMM litmus test)



On 1/24/2023 5:22 PM, Paul E. McKenney wrote:
> I clearly recall some
> store-based lack of ordering after a grace period from some years back,
> and am thus far failing to reproduce it.
>
> And here is another attempt that herd7 actually does allow.
>
> So what did I mess up this time? ;-)
>
> Thanx, Paul
>
> ------------------------------------------------------------------------
>
> C C-srcu-observed-4
>
> (*
> * Result: Sometimes
> *
> * The Linux-kernel implementation is suspected to forbid this.
> *)
>
> {}
>
> P0(int *x, int *y, int *z, struct srcu_struct *s)
> {
> int r1;
>
> r1 = srcu_read_lock(s);
> WRITE_ONCE(*y, 2);
> WRITE_ONCE(*x, 1);
> srcu_read_unlock(s, r1);
> }
>
> P1(int *x, int *y, int *z, struct srcu_struct *s)
> {
> int r1;
>
> WRITE_ONCE(*y, 1);
> synchronize_srcu(s);
> WRITE_ONCE(*z, 2);
> }
>
> P2(int *x, int *y, int *z, struct srcu_struct *s)
> {
> WRITE_ONCE(*z, 1);
> smp_store_release(x, 2);
> }
>
> exists (x=1 /\ y=1 /\ z=1)

I think even if you implement the unlock as mb() followed by some store
that is read by the gp between mb()s, this would still be allowed.

I have already forgotten the specifics, but I think the power model
allows certain stores never propagating somewhere?
If z=2,z=1,x=2 never propagate to P0, you might start by executing P0,
then P1, and then P2 at which point the memory system decides that x=1
overwrites x=2, and the latter simply doesn't propagate anywhere.

(I'll let anyone who has the model at hand correct me on this, because I
have to take a walk now).

Have fun, jonas


2023-01-24 17:22:22

by Alan Stern

[permalink] [raw]
Subject: Re: Internal vs. external barriers (was: Re: Interesting LKMM litmus test)

On Tue, Jan 24, 2023 at 04:54:42PM +0100, Jonas Oberhauser wrote:
>
>
> On 1/19/2023 5:41 PM, Alan Stern wrote:
> > On Thu, Jan 19, 2023 at 12:22:50PM +0100, Jonas Oberhauser wrote:
> > >
> > > On 1/19/2023 3:28 AM, Alan Stern wrote:
> > > > > This is a permanent error; I've given up. Sorry it didn't
> > > > work out.
> > > [It seems the e-mail still reached me through the mailing list]
> > [For everyone else, Jonas is referring to the fact that the last two
> > emails I sent to his huaweicloud.com address could not be delivered, so
> > I copied them off-list to his huawei.com address.]
> >
> > > > > I consider that a hack though and don't like it.
> > > > It _is_ a bit of a hack, but not a huge one. srcu_read_lock() really
> > > > is a lot like a load, in that it returns a value obtained by reading
> > > > something from memory (along with some other operations, though, so it
> > > > isn't a simple straightforward read -- perhaps more like an
> > > > atomic_inc_return_relaxed).
> > > The issue I have with this is that it might create accidental ordering. How
> > > does it behave when you throw fences in the mix?
> > I think this isn't going to be a problem. Certainly any real
> > implementation of srcu_read_lock() is going to involve some actual load
> > operations, so any unintentional ordering caused by fences will also
> > apply to real executions. Likewise for srcu_read_unlock and store
> > operations.
>
> Note that there may indeed be reads in the implementation, but most likely
> not from the srcu_read_unlock()s of other threads. Most probably from the
> synchronize_srcu() calls. So the rfe edges being added are probably not
> corresponding to any rfe edges in the implementation.
>
> That said, I believe there may indeed not be any restrictions in behavior
> caused by this, because any code that relies on the order being a certain
> thing would need to use some other ordering mechanism, and that would
> probably restrict the behavior anyways.
>
> It does have the negative side-effect of creating an explosion of
> permutations though, by ordering all unlocks() in a total way and also
> sometimes allowing multiple options for each lock() (e.g., lock();unlock()
> || lock();unlock() has 4 executions instead of 1).

That's true. It would be nice if there was a class of write-like events
which couldn't be read from and didn't contribute to the coherence
ordering.

Alan

> Anyways, not much to be done about it right now.
>
> best wishes, jonas
>

2023-01-24 17:26:55

by Paul E. McKenney

[permalink] [raw]
Subject: Re: Internal vs. external barriers (was: Re: Interesting LKMM litmus test)

On Tue, Jan 24, 2023 at 05:39:53PM +0100, Jonas Oberhauser wrote:
>
>
> On 1/24/2023 5:22 PM, Paul E. McKenney wrote:
> > I clearly recall some
> > store-based lack of ordering after a grace period from some years back,
> > and am thus far failing to reproduce it.
> >
> > And here is another attempt that herd7 actually does allow.
> >
> > So what did I mess up this time? ;-)
> >
> > Thanx, Paul
> >
> > ------------------------------------------------------------------------
> >
> > C C-srcu-observed-4
> >
> > (*
> > * Result: Sometimes
> > *
> > * The Linux-kernel implementation is suspected to forbid this.
> > *)
> >
> > {}
> >
> > P0(int *x, int *y, int *z, struct srcu_struct *s)
> > {
> > int r1;
> >
> > r1 = srcu_read_lock(s);
> > WRITE_ONCE(*y, 2);
> > WRITE_ONCE(*x, 1);
> > srcu_read_unlock(s, r1);
> > }
> >
> > P1(int *x, int *y, int *z, struct srcu_struct *s)
> > {
> > int r1;
> >
> > WRITE_ONCE(*y, 1);
> > synchronize_srcu(s);
> > WRITE_ONCE(*z, 2);
> > }
> >
> > P2(int *x, int *y, int *z, struct srcu_struct *s)
> > {
> > WRITE_ONCE(*z, 1);
> > smp_store_release(x, 2);
> > }
> >
> > exists (x=1 /\ y=1 /\ z=1)
>
> I think even if you implement the unlock as mb() followed by some store that
> is read by the gp between mb()s, this would still be allowed.

The implementation of synchronize_srcu() has quite a few smp_mb()
invocations.

But exactly how are you modeling this? As in what additional accesses
and memory barriers are you placing in which locations?

> I have already forgotten the specifics, but I think the power model allows
> certain stores never propagating somewhere?

PowerPC would forbid the 3.2W case, where each process used an
smp_store_release() as its sole ordering (no smp_mb() calls at all).

> If z=2,z=1,x=2 never propagate to P0, you might start by executing P0, then
> P1, and then P2 at which point the memory system decides that x=1 overwrites
> x=2, and the latter simply doesn't propagate anywhere.

This propagation is modulated by the memory barriers, though.

> (I'll let anyone who has the model at hand correct me on this, because I
> have to take a walk now).

Have a good walk!

Thanx, Paul

2023-01-24 19:31:57

by Jonas Oberhauser

[permalink] [raw]
Subject: Re: Internal vs. external barriers (was: Re: Interesting LKMM litmus test)



On 1/24/2023 6:26 PM, Paul E. McKenney wrote:
> On Tue, Jan 24, 2023 at 05:39:53PM +0100, Jonas Oberhauser wrote:
>>
>> On 1/24/2023 5:22 PM, Paul E. McKenney wrote:
>>> I clearly recall some
>>> store-based lack of ordering after a grace period from some years back,
>>> and am thus far failing to reproduce it.
>>>
>>> And here is another attempt that herd7 actually does allow.
>>>
>>> So what did I mess up this time? ;-)
>>>
>>> Thanx, Paul
>>>
>>> ------------------------------------------------------------------------
>>>
>>> C C-srcu-observed-4
>>>
>>> (*
>>> * Result: Sometimes
>>> *
>>> * The Linux-kernel implementation is suspected to forbid this.
>>> *)
>>>
>>> {}
>>>
>>> P0(int *x, int *y, int *z, struct srcu_struct *s)
>>> {
>>> int r1;
>>>
>>> r1 = srcu_read_lock(s);
>>> WRITE_ONCE(*y, 2);
>>> WRITE_ONCE(*x, 1);
>>> srcu_read_unlock(s, r1);
>>> }
>>>
>>> P1(int *x, int *y, int *z, struct srcu_struct *s)
>>> {
>>> int r1;
>>>
>>> WRITE_ONCE(*y, 1);
>>> synchronize_srcu(s);
>>> WRITE_ONCE(*z, 2);
>>> }
>>>
>>> P2(int *x, int *y, int *z, struct srcu_struct *s)
>>> {
>>> WRITE_ONCE(*z, 1);
>>> smp_store_release(x, 2);
>>> }
>>>
>>> exists (x=1 /\ y=1 /\ z=1)
>> I think even if you implement the unlock as mb() followed by some store that
>> is read by the gp between mb()s, this would still be allowed.
> The implementation of synchronize_srcu() has quite a few smp_mb()
> invocations.
>
> But exactly how are you modeling this? As in what additional accesses
> and memory barriers are you placing in which locations?

Along these lines:

P0(int *x, int *y, int *z, int *magic_location)
{
	WRITE_ONCE(*y, 2);
	WRITE_ONCE(*x, 1);
	smp_mb();
	WRITE_ONCE(*magic_location, 1);
}

P1(int *x, int *y, int *z, int *magic_location)
{
	WRITE_ONCE(*y, 1);
	smp_mb();
	while (!READ_ONCE(*magic_location))
		;
	smp_mb();
	WRITE_ONCE(*z, 2);
}

P2(int *x, int *y, int *z, struct srcu_struct *s)
{
	WRITE_ONCE(*z, 1);
	smp_store_release(x, 2);
}



Note that you can add as many additional smp_mb() and other accesses as
you want around the original srcu call sites. I don't see how they could
influence the absence of a cycle.

(Also, to make it work with herd it seems you need to replace the loop
with a single read and state in the exists clause that it happens to
read a 1.)

>> I have already forgotten the specifics, but I think the power model allows
>> certain stores never propagating somewhere?
> PowerPC would forbid the 3.2W case, where each process used an
> smp_store_release() as its sole ordering (no smp_mb() calls at all).
>
> [...]
>
> This propagation is modulated by the memory barriers, though.

Ah, looking at the model now. Indeed it's forbidden, because in order to
say that something is in co, there must not be a (resulting) cycle of co
and barriers. But you'd get that here. In the axiomatic model, this
corresponds to saying Power's "prop | co" is acyclic. The same isn't
true in LKMM. So that's probably why.
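
To spell out the contrast, the two propagation axioms being compared are,
roughly:

```
(* Power, per the "Herding Cats" model: co and cumulative
   propagation may not form a cycle *)
acyclic co | prop as propagation

(* LKMM, tools/memory-model/linux-kernel.cat (marked-access details
   elided): co enters only through prop, and a strong fence is
   needed to close the cycle *)
let pb = prop ; strong-fence ; hb*
acyclic pb as propagation
```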

Have fun, jonas


2023-01-24 22:15:33

by Paul E. McKenney

[permalink] [raw]
Subject: Re: Internal vs. external barriers (was: Re: Interesting LKMM litmus test)

On Tue, Jan 24, 2023 at 08:30:08PM +0100, Jonas Oberhauser wrote:
> On 1/24/2023 6:26 PM, Paul E. McKenney wrote:
> > On Tue, Jan 24, 2023 at 05:39:53PM +0100, Jonas Oberhauser wrote:
> > >
> > > On 1/24/2023 5:22 PM, Paul E. McKenney wrote:
> > > > I clearly recall some
> > > > store-based lack of ordering after a grace period from some years back,
> > > > and am thus far failing to reproduce it.
> > > >
> > > > And here is another attempt that herd7 actually does allow.
> > > >
> > > > So what did I mess up this time? ;-)
> > > >
> > > > Thanx, Paul
> > > >
> > > > ------------------------------------------------------------------------
> > > >
> > > > C C-srcu-observed-4
> > > >
> > > > (*
> > > > * Result: Sometimes
> > > > *
> > > > * The Linux-kernel implementation is suspected to forbid this.
> > > > *)
> > > >
> > > > {}
> > > >
> > > > P0(int *x, int *y, int *z, struct srcu_struct *s)
> > > > {
> > > > int r1;
> > > >
> > > > r1 = srcu_read_lock(s);
> > > > WRITE_ONCE(*y, 2);
> > > > WRITE_ONCE(*x, 1);
> > > > srcu_read_unlock(s, r1);
> > > > }
> > > >
> > > > P1(int *x, int *y, int *z, struct srcu_struct *s)
> > > > {
> > > > int r1;
> > > >
> > > > WRITE_ONCE(*y, 1);
> > > > synchronize_srcu(s);
> > > > WRITE_ONCE(*z, 2);
> > > > }
> > > >
> > > > P2(int *x, int *y, int *z, struct srcu_struct *s)
> > > > {
> > > > WRITE_ONCE(*z, 1);
> > > > smp_store_release(x, 2);
> > > > }
> > > >
> > > > exists (x=1 /\ y=1 /\ z=1)
> > > I think even if you implement the unlock as mb() followed by some store that
> > > is read by the gp between mb()s, this would still be allowed.
> > The implementation of synchronize_srcu() has quite a few smp_mb()
> > invocations.
> >
> > But exactly how are you modeling this? As in what additional accesses
> > and memory barriers are you placing in which locations?
>
> Along these lines:
>
> P0(int *x, int *y, int *z, int *magic_location)
> {
> int r1;
>
>
> WRITE_ONCE(*y, 2);
> WRITE_ONCE(*x, 1);
>
> smp_mb();
> WRITE_ONCE(*magic_location, 1);
>
> }
>
> P1(int *x, int *y, int *z, int *magic_location)
> {
> int r1;
>
> WRITE_ONCE(*y, 1);
>
> smp_mb();
> while (! READ_ONCE(*magic_location))
> ;
> smp_mb();
> WRITE_ONCE(*z, 2);
> }
>
>
> P2(int *x, int *y, int *z, struct srcu_struct *s)
> {
> WRITE_ONCE(*z, 1);
> smp_store_release(x, 2);
> }
>
>
>
> Note that you can add as many additional smp_mb() and other accesses as you
> want around the original srcu call sites. I don't see how they could
> influence the absence of a cycle.
>
> (Also, to make it work with herd it seems you need to replace the loop with
> a single read and state in the exists clause that it happens to read a 1.)

I agree that LKMM would allow such a litmus test.

> > > I have already forgotten the specifics, but I think the power model allows
> > > certain stores never propagating somewhere?
> > PowerPC would forbid the 3.2W case, where each process used an
> > smp_store_release() as its sole ordering (no smp_mb() calls at all).
> >
> > [...]
> >
> > This propagation is modulated by the memory barriers, though.
>
> Ah, looking at the model now. Indeed it's forbidden, because in order to say
> that something is in co, there must not be a (resulting) cycle of co and
> barriers. But you'd get that here. In the axiomatic model, this corresponds
> to saying Power's "prop | co" is acyclic. The same isn't true in LKMM. So
> that's probably why.

Which means that the RCU and SRCU implementations need to make (admittedly
small) guarantees that cannot be expressed in LKMM. Which is in fact
what I was remembering, so I feel better now.

Not sure about the rest of you, though. ;-)

Thanx, Paul

2023-01-24 22:37:08

by Alan Stern

[permalink] [raw]
Subject: Re: Internal vs. external barriers (was: Re: Interesting LKMM litmus test)

On Tue, Jan 24, 2023 at 02:15:24PM -0800, Paul E. McKenney wrote:
> > Ah, looking at the model now. Indeed it's forbidden, because in order to say
> > that something is in co, there must not be a (resulting) cycle of co and
> > barriers. But you'd get that here. In the axiomatic model, this corresponds
> > to saying Power's "prop | co" is acyclic. The same isn't true in LKMM. So
> > that's probably why.
>
> Which means that the RCU and SRCU implementations need to make (admittedly
> small) guarantees that cannot be expressed in LKMM. Which is in fact
> what I was remembering, so I feel better now.
>
> Not sure about the rest of you, though. ;-)

Can you be more explicit? Exactly what guarantees does the kernel
implementation make that can't be expressed in LKMM?

And are these anything the memory model needs to worry about?

Alan

2023-01-24 22:54:55

by Paul E. McKenney

[permalink] [raw]
Subject: Re: Internal vs. external barriers (was: Re: Interesting LKMM litmus test)

On Tue, Jan 24, 2023 at 05:35:33PM -0500, Alan Stern wrote:
> On Tue, Jan 24, 2023 at 02:15:24PM -0800, Paul E. McKenney wrote:
> > > Ah, looking at the model now. Indeed it's forbidden, because in order to say
> > > that something is in co, there must not be a (resulting) cycle of co and
> > > barriers. But you'd get that here. In the axiomatic model, this corresponds
> > > to saying Power's "prop | co" is acyclic. The same isn't true in LKMM. So
> > > that's probably why.
> >
> > Which means that the RCU and SRCU implementations need to make (admittedly
> > small) guarantees that cannot be expressed in LKMM. Which is in fact
> > what I was remembering, so I feel better now.
> >
> > Not sure about the rest of you, though. ;-)
>
> Can you be more explicit? Exactly what guarantees does the kernel
> implementation make that can't be expressed in LKMM?

I doubt that I will be able to articulate it very well, but here goes.

Within the Linux kernel, the rule for a given RCU "domain" is that if
an event follows a grace period in pretty much any sense of the word,
then that event sees the effects of all events in all read-side critical
sections that began prior to the start of that grace period.

Here the senses of the word "follow" include combinations of rf, fr,
and co, combined with the various acyclic and irreflexive relations
defined in LKMM.
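
A canonical litmus test for this rule might look as follows (a sketch; the
test name is made up):

```
C RCU-gp-guarantee

{}

P0(int *x, int *y)
{
	int r1;

	rcu_read_lock();
	WRITE_ONCE(*x, 1);
	r1 = READ_ONCE(*y);
	rcu_read_unlock();
}

P1(int *x, int *y)
{
	int r2;

	WRITE_ONCE(*y, 1);
	synchronize_rcu();
	r2 = READ_ONCE(*x);
}

exists (0:r1=0 /\ 1:r2=0)
```

LKMM should report Never here: r1=0 means the critical section began before
the grace period, so P1's read after synchronize_rcu() must see the x=1
store from inside it.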

> And are these anything the memory model needs to worry about?

Given that several people, yourself included, are starting to use LKMM
to analyze the Linux-kernel RCU implementations, maybe it does.

Me, I am happy either way.

Thanx, Paul

2023-01-25 01:55:08

by Alan Stern

[permalink] [raw]
Subject: Re: Internal vs. external barriers (was: Re: Interesting LKMM litmus test)

On Tue, Jan 24, 2023 at 02:54:49PM -0800, Paul E. McKenney wrote:
> On Tue, Jan 24, 2023 at 05:35:33PM -0500, Alan Stern wrote:
> > Can you be more explicit? Exactly what guarantees does the kernel
> > implementation make that can't be expressed in LKMM?
>
> I doubt that I will be able to articulate it very well, but here goes.
>
> Within the Linux kernel, the rule for a given RCU "domain" is that if
> an event follows a grace period in pretty much any sense of the word,
> then that event sees the effects of all events in all read-side critical
> sections that began prior to the start of that grace period.
>
> Here the senses of the word "follow" include combinations of rf, fr,
> and co, combined with the various acyclic and irreflexive relations
> defined in LKMM.

The LKMM says pretty much the same thing. In fact, it says the event
sees the effects of all events po-before the unlock of (not just inside)
any read-side critical section that began prior to the start of the
grace period.

> > And are these anything the memory model needs to worry about?
>
> Given that several people, yourself included, are starting to use LKMM
> to analyze the Linux-kernel RCU implementations, maybe it does.
>
> Me, I am happy either way.

Judging from your description, I don't think we have anything to worry
about.

Alan

2023-01-25 02:20:25

by Paul E. McKenney

[permalink] [raw]
Subject: Re: Internal vs. external barriers (was: Re: Interesting LKMM litmus test)

On Tue, Jan 24, 2023 at 08:54:56PM -0500, Alan Stern wrote:
> On Tue, Jan 24, 2023 at 02:54:49PM -0800, Paul E. McKenney wrote:
> > On Tue, Jan 24, 2023 at 05:35:33PM -0500, Alan Stern wrote:
> > > Can you be more explicit? Exactly what guarantees does the kernel
> > > implementation make that can't be expressed in LKMM?
> >
> > I doubt that I will be able to articulate it very well, but here goes.
> >
> > Within the Linux kernel, the rule for a given RCU "domain" is that if
> > an event follows a grace period in pretty much any sense of the word,
> > then that event sees the effects of all events in all read-side critical
> > sections that began prior to the start of that grace period.
> >
> > Here the senses of the word "follow" include combinations of rf, fr,
> > and co, combined with the various acyclic and irreflexive relations
> > defined in LKMM.
>
> The LKMM says pretty much the same thing. In fact, it says the event
> sees the effects of all events po-before the unlock of (not just inside)
> any read-side critical section that began prior to the start of the
> grace period.
>
> > > And are these anything the memory model needs to worry about?
> >
> > Given that several people, yourself included, are starting to use LKMM
> > to analyze the Linux-kernel RCU implementations, maybe it does.
> >
> > Me, I am happy either way.
>
> Judging from your description, I don't think we have anything to worry
> about.

Sounds good, and let's proceed on that assumption then. We can always
revisit later if need be.

Thanx, Paul

2023-01-25 13:10:59

by Jonas Oberhauser

[permalink] [raw]
Subject: Re: Internal vs. external barriers (was: Re: Interesting LKMM litmus test)



On 1/25/2023 3:20 AM, Paul E. McKenney wrote:
> On Tue, Jan 24, 2023 at 08:54:56PM -0500, Alan Stern wrote:
>> On Tue, Jan 24, 2023 at 02:54:49PM -0800, Paul E. McKenney wrote:
>>> On Tue, Jan 24, 2023 at 05:35:33PM -0500, Alan Stern wrote:
>>>> Can you be more explicit? Exactly what guarantees does the kernel
>>>> implementation make that can't be expressed in LKMM?
>>> I doubt that I will be able to articulate it very well, but here goes.
>>>
>>> Within the Linux kernel, the rule for a given RCU "domain" is that if
>>> an event follows a grace period in pretty much any sense of the word,
>>> then that event sees the effects of all events in all read-side critical
>>> sections that began prior to the start of that grace period.
>>>
>>> Here the senses of the word "follow" include combinations of rf, fr,
>>> and co, combined with the various acyclic and irreflexive relations
>>> defined in LKMM.
>> The LKMM says pretty much the same thing. In fact, it says the event
>> sees the effects of all events po-before the unlock of (not just inside)
>> any read-side critical section that began prior to the start of the
>> grace period.
>>
>>>> And are these anything the memory model needs to worry about?
>>> Given that several people, yourself included, are starting to use LKMM
>>> to analyze the Linux-kernel RCU implementations, maybe it does.
>>>
>>> Me, I am happy either way.
>> Judging from your description, I don't think we have anything to worry
>> about.
> Sounds good, and let's proceed on that assumption then. We can always
> revisit later if need be.
>
> Thanx, Paul

FWIW, I currently don't see a need for either RCU or "base" LKMM to
have this kind of guarantee.
But I'm curious why it doesn't exist in LKMM -- is it because of
Alpha or some other issues that make it hard to guarantee (like a
compiler merging two threads and optimizing or something?), or is it
simply that it seemed like a complicated guarantee with no discernible
upside, or something else?

Best wishes, jonas


2023-01-25 15:05:27

by Paul E. McKenney

[permalink] [raw]
Subject: Re: Internal vs. external barriers (was: Re: Interesting LKMM litmus test)

On Wed, Jan 25, 2023 at 02:10:08PM +0100, Jonas Oberhauser wrote:
>
>
> On 1/25/2023 3:20 AM, Paul E. McKenney wrote:
> > On Tue, Jan 24, 2023 at 08:54:56PM -0500, Alan Stern wrote:
> > > On Tue, Jan 24, 2023 at 02:54:49PM -0800, Paul E. McKenney wrote:
> > > > On Tue, Jan 24, 2023 at 05:35:33PM -0500, Alan Stern wrote:
> > > > > Can you be more explicit? Exactly what guarantees does the kernel
> > > > > implementation make that can't be expressed in LKMM?
> > > > I doubt that I will be able to articulate it very well, but here goes.
> > > >
> > > > Within the Linux kernel, the rule for a given RCU "domain" is that if
> > > > an event follows a grace period in pretty much any sense of the word,
> > > > then that event sees the effects of all events in all read-side critical
> > > > sections that began prior to the start of that grace period.
> > > >
> > > > Here the senses of the word "follow" include combinations of rf, fr,
> > > > and co, combined with the various acyclic and irreflexive relations
> > > > defined in LKMM.
> > > The LKMM says pretty much the same thing. In fact, it says the event
> > > sees the effects of all events po-before the unlock of (not just inside)
> > > any read-side critical section that began prior to the start of the
> > > grace period.
> > >
> > > > > And are these anything the memory model needs to worry about?
> > > > Given that several people, yourself included, are starting to use LKMM
> > > > to analyze the Linux-kernel RCU implementations, maybe it does.
> > > >
> > > > Me, I am happy either way.
> > > Judging from your description, I don't think we have anything to worry
> > > about.
> > Sounds good, and let's proceed on that assumption then. We can always
> > revisit later if need be.
> >
> > Thanx, Paul
>
> FWIW, I currently don't see a need for either RCU nor "base" LKMM to have
> this kind of guarantee.

In the RCU case, it is because it is far easier to provide this guarantee,
even though it is based on hardware and compilers rather than LKMM,
than it would be to explain to some random person why the access that
is intuitively clearly after the grace period can somehow come before it.

> But I'm curious for why it doesn't exist in LKMM -- is it because of Alpha
> or some other issues that make it hard to guarantee (like a compiler merging
> two threads and optimizing or something?), or is it simply that it seemed
> like a complicated guarantee with no discernible upside, or something else?

Because to the best of my knowledge, no one has ever come up with a
use for 2+2W and friends that isn't better handled by some much more
straightforward pattern of accesses. So we did not guarantee it in LKMM.

Yes, you could argue that my "ease of explanation" paragraph above is
a valid use case, but I am not sure that this is all that compelling of
an argument. ;-)

Thanx, Paul

2023-01-25 15:34:49

by Alan Stern

[permalink] [raw]
Subject: Re: Internal vs. external barriers (was: Re: Interesting LKMM litmus test)

On Wed, Jan 25, 2023 at 07:05:20AM -0800, Paul E. McKenney wrote:
> On Wed, Jan 25, 2023 at 02:10:08PM +0100, Jonas Oberhauser wrote:
> >
> >
> > On 1/25/2023 3:20 AM, Paul E. McKenney wrote:
> > > On Tue, Jan 24, 2023 at 08:54:56PM -0500, Alan Stern wrote:
> > > > On Tue, Jan 24, 2023 at 02:54:49PM -0800, Paul E. McKenney wrote:
> > > > >
> > > > > Within the Linux kernel, the rule for a given RCU "domain" is that if
> > > > > an event follows a grace period in pretty much any sense of the word,
> > > > > then that event sees the effects of all events in all read-side critical
> > > > > sections that began prior to the start of that grace period.
> > > > >
> > > > > Here the senses of the word "follow" include combinations of rf, fr,
> > > > > and co, combined with the various acyclic and irreflexive relations
> > > > > defined in LKMM.
> > > > The LKMM says pretty much the same thing. In fact, it says the event
> > > > sees the effects of all events po-before the unlock of (not just inside)
> > > > any read-side critical section that began prior to the start of the
> > > > grace period.
> > > >
> > > > > > And are these anything the memory model needs to worry about?
> > > > > Given that several people, yourself included, are starting to use LKMM
> > > > > to analyze the Linux-kernel RCU implementations, maybe it does.
> > > > >
> > > > > Me, I am happy either way.
> > > > Judging from your description, I don't think we have anything to worry
> > > > about.
> > > Sounds good, and let's proceed on that assumption then. We can always
> > > revisit later if need be.
> > >
> > > Thanx, Paul
> >
> > FWIW, I currently don't see a need for either RCU nor "base" LKMM to have
> > this kind of guarantee.
>
> In the RCU case, it is because it is far easier to provide this guarantee,
> even though it is based on hardware and compilers rather than LKMM,
> than it would be to explain to some random person why the access that
> is intuitively clearly after the grace period can somehow come before it.
>
> > But I'm curious for why it doesn't exist in LKMM -- is it because of Alpha
> > or some other issues that make it hard to guarantee (like a compiler merging
> > two threads and optimizing or something?), or is it simply that it seemed
> > like a complicated guarantee with no discernible upside, or something else?
>
> Because to the best of my knowledge, no one has ever come up with a
> use for 2+2W and friends that isn't better handled by some much more
> straightforward pattern of accesses. So we did not guarantee it in LKMM.
>
> Yes, you could argue that my "ease of explanation" paragraph above is
> a valid use case, but I am not sure that this is all that compelling of
> an argument. ;-)

Are we all talking about the same thing? There were two different
guarantees mentioned above:

The RCU guarantee about writes in a read-side critical section
becoming visible to all CPUs before a later grace period ends;

The guarantee about the 2+2W pattern and friends being
forbidden.

The LKMM includes the first of these but not the second (for the reason
Paul stated).

Alan

2023-01-25 17:18:40

by Paul E. McKenney

[permalink] [raw]
Subject: Re: Internal vs. external barriers (was: Re: Interesting LKMM litmus test)

On Wed, Jan 25, 2023 at 10:34:40AM -0500, Alan Stern wrote:
> On Wed, Jan 25, 2023 at 07:05:20AM -0800, Paul E. McKenney wrote:
> > On Wed, Jan 25, 2023 at 02:10:08PM +0100, Jonas Oberhauser wrote:
> > >
> > >
> > > On 1/25/2023 3:20 AM, Paul E. McKenney wrote:
> > > > On Tue, Jan 24, 2023 at 08:54:56PM -0500, Alan Stern wrote:
> > > > > On Tue, Jan 24, 2023 at 02:54:49PM -0800, Paul E. McKenney wrote:
> > > > > >
> > > > > > Within the Linux kernel, the rule for a given RCU "domain" is that if
> > > > > > an event follows a grace period in pretty much any sense of the word,
> > > > > > then that event sees the effects of all events in all read-side critical
> > > > > > sections that began prior to the start of that grace period.
> > > > > >
> > > > > > Here the senses of the word "follow" include combinations of rf, fr,
> > > > > > and co, combined with the various acyclic and irreflexive relations
> > > > > > defined in LKMM.
> > > > > The LKMM says pretty much the same thing. In fact, it says the event
> > > > > sees the effects of all events po-before the unlock of (not just inside)
> > > > > any read-side critical section that began prior to the start of the
> > > > > grace period.
> > > > >
> > > > > > > And are these anything the memory model needs to worry about?
> > > > > > Given that several people, yourself included, are starting to use LKMM
> > > > > > to analyze the Linux-kernel RCU implementations, maybe it does.
> > > > > >
> > > > > > Me, I am happy either way.
> > > > > Judging from your description, I don't think we have anything to worry
> > > > > about.
> > > > Sounds good, and let's proceed on that assumption then. We can always
> > > > revisit later if need be.
> > > >
> > > > Thanx, Paul
> > >
> > > FWIW, I currently don't see a need for either RCU or "base" LKMM to have
> > > this kind of guarantee.
> >
> > In the RCU case, it is because it is far easier to provide this guarantee,
> > even though it is based on hardware and compilers rather than LKMM,
> > than it would be to explain to some random person why the access that
> > is intuitively clearly after the grace period can somehow come before it.
> >
> > > But I'm curious for why it doesn't exist in LKMM -- is it because of Alpha
> > > or some other issues that make it hard to guarantee (like a compiler merging
> > > two threads and optimizing or something?), or is it simply that it seemed
> > > like a complicated guarantee with no discernible upside, or something else?
> >
> > Because to the best of my knowledge, no one has ever come up with a
> > use for 2+2W and friends that isn't better handled by some much more
> > straightforward pattern of accesses. So we did not guarantee it in LKMM.
> >
> > Yes, you could argue that my "ease of explanation" paragraph above is
> > a valid use case, but I am not sure that this is all that compelling of
> > an argument. ;-)
>
> Are we all talking about the same thing? There were two different
> guarantees mentioned above:
>
> The RCU guarantee about writes in a read-side critical section
> becoming visible to all CPUs before a later grace period ends;
>
> The guarantee about the 2+2W pattern and friends being
> forbidden.
>
> The LKMM includes the first of these but not the second (for the reason
> Paul stated).

I am not sure whether or not we are talking about the same thing,
but given this litmus test:

------------------------------------------------------------------------

C C-srcu-observed-4

(*
* Result: Sometimes
*
* The Linux-kernel implementation is suspected to forbid this.
*)

{}

P0(int *x, int *y, int *z, struct srcu_struct *s)
{
        int r1;

        r1 = srcu_read_lock(s);
        WRITE_ONCE(*y, 2);
        WRITE_ONCE(*x, 1);
        srcu_read_unlock(s, r1);
}

P1(int *x, int *y, int *z, struct srcu_struct *s)
{
        int r1;

        WRITE_ONCE(*y, 1);
        synchronize_srcu(s);
        WRITE_ONCE(*z, 2);
}

P2(int *x, int *y, int *z, struct srcu_struct *s)
{
        WRITE_ONCE(*z, 1);
        smp_store_release(x, 2);
}

exists (x=1 /\ y=1 /\ z=1)

------------------------------------------------------------------------

We get the following from herd7:

------------------------------------------------------------------------

$ herd7 -conf linux-kernel.cfg C-srcu-observed-4.litmus
Test C-srcu-observed-4 Allowed
States 8
x=1; y=1; z=1;
x=1; y=1; z=2;
x=1; y=2; z=1;
x=1; y=2; z=2;
x=2; y=1; z=1;
x=2; y=1; z=2;
x=2; y=2; z=1;
x=2; y=2; z=2;
Ok
Witnesses
Positive: 1 Negative: 7
Condition exists (x=1 /\ y=1 /\ z=1)
Observation C-srcu-observed-4 Sometimes 1 7
Time C-srcu-observed-4 0.02
Hash=8b6020369b73ac19070864a9db00bbf8

------------------------------------------------------------------------

This does not seem to me to be consistent with your "The RCU guarantee
about writes in a read-side critical section becoming visible to all
CPUs before a later grace period ends".

So what am I missing here?

Again, I am OK with LKMM allowing C-srcu-observed-4.litmus, as long as
the actual Linux-kernel implementation forbids it.

Thanx, Paul

2023-01-25 17:43:47

by Jonas Oberhauser

[permalink] [raw]
Subject: Re: Internal vs. external barriers (was: Re: Interesting LKMM litmus test)



On 1/25/2023 6:18 PM, Paul E. McKenney wrote:
> On Wed, Jan 25, 2023 at 10:34:40AM -0500, Alan Stern wrote:
>> On Wed, Jan 25, 2023 at 07:05:20AM -0800, Paul E. McKenney wrote:
>>> On Wed, Jan 25, 2023 at 02:10:08PM +0100, Jonas Oberhauser wrote:
>>>>
>>>> On 1/25/2023 3:20 AM, Paul E. McKenney wrote:
>>>>> On Tue, Jan 24, 2023 at 08:54:56PM -0500, Alan Stern wrote:
>>>>>> On Tue, Jan 24, 2023 at 02:54:49PM -0800, Paul E. McKenney wrote:
>>>>>>> Within the Linux kernel, the rule for a given RCU "domain" is that if
>>>>>>> an event follows a grace period in pretty much any sense of the word,
>>>>>>> then that event sees the effects of all events in all read-side critical
>>>>>>> sections that began prior to the start of that grace period.
>>>>>>>
>>>>>>> Here the senses of the word "follow" include combinations of rf, fr,
>>>>>>> and co, combined with the various acyclic and irreflexive relations
>>>>>>> defined in LKMM.
>>>>>> The LKMM says pretty much the same thing. In fact, it says the event
>>>>>> sees the effects of all events po-before the unlock of (not just inside)
>>>>>> any read-side critical section that began prior to the start of the
>>>>>> grace period.
>>>>>>
>>>>>>>> And are these anything the memory model needs to worry about?
>>>>>>> Given that several people, yourself included, are starting to use LKMM
>>>>>>> to analyze the Linux-kernel RCU implementations, maybe it does.
>>>>>>>
>>>>>>> Me, I am happy either way.
>>>>>> Judging from your description, I don't think we have anything to worry
>>>>>> about.
>>>>> Sounds good, and let's proceed on that assumption then. We can always
>>>>> revisit later if need be.
>>>>>
>>>>> Thanx, Paul
>>>> FWIW, I currently don't see a need for either RCU or "base" LKMM to have
>>>> this kind of guarantee.
>>> In the RCU case, it is because it is far easier to provide this guarantee,
>>> even though it is based on hardware and compilers rather than LKMM,
>>> than it would be to explain to some random person why the access that
>>> is intuitively clearly after the grace period can somehow come before it.
>>>
>>>> But I'm curious for why it doesn't exist in LKMM -- is it because of Alpha
>>>> or some other issues that make it hard to guarantee (like a compiler merging
>>>> two threads and optimizing or something?), or is it simply that it seemed
>>>> like a complicated guarantee with no discernible upside, or something else?
>>> Because to the best of my knowledge, no one has ever come up with a
>>> use for 2+2W and friends that isn't better handled by some much more
>>> straightforward pattern of accesses. So we did not guarantee it in LKMM.
>>>
>>> Yes, you could argue that my "ease of explanation" paragraph above is
>>> a valid use case, but I am not sure that this is all that compelling of
>>> an argument. ;-)
>> Are we all talking about the same thing? There were two different
>> guarantees mentioned above:
>>
>> The RCU guarantee about writes in a read-side critical section
>> becoming visible to all CPUs before a later grace period ends;
>>
>> The guarantee about the 2+2W pattern and friends being
>> forbidden.
>>
>> The LKMM includes the first of these but not the second (for the reason
>> Paul stated).
> I am not sure whether or not we are talking about the same thing,
> but given this litmus test:
>
> ------------------------------------------------------------------------
>
> C C-srcu-observed-4
>
> (*
> * Result: Sometimes
> *
> * The Linux-kernel implementation is suspected to forbid this.
> *)
>
> {}
>
> P0(int *x, int *y, int *z, struct srcu_struct *s)
> {
>         int r1;
>
>         r1 = srcu_read_lock(s);
>         WRITE_ONCE(*y, 2);
>         WRITE_ONCE(*x, 1);
>         srcu_read_unlock(s, r1);
> }
>
> P1(int *x, int *y, int *z, struct srcu_struct *s)
> {
>         int r1;
>
>         WRITE_ONCE(*y, 1);
>         synchronize_srcu(s);
>         WRITE_ONCE(*z, 2);
> }
>
> P2(int *x, int *y, int *z, struct srcu_struct *s)
> {
>         WRITE_ONCE(*z, 1);
>         smp_store_release(x, 2);
> }
>
> exists (x=1 /\ y=1 /\ z=1)
>
> ------------------------------------------------------------------------
>
> We get the following from herd7:
>
> ------------------------------------------------------------------------
>
> $ herd7 -conf linux-kernel.cfg C-srcu-observed-4.litmus
> Test C-srcu-observed-4 Allowed
> States 8
> x=1; y=1; z=1;
> x=1; y=1; z=2;
> x=1; y=2; z=1;
> x=1; y=2; z=2;
> x=2; y=1; z=1;
> x=2; y=1; z=2;
> x=2; y=2; z=1;
> x=2; y=2; z=2;
> Ok
> Witnesses
> Positive: 1 Negative: 7
> Condition exists (x=1 /\ y=1 /\ z=1)
> Observation C-srcu-observed-4 Sometimes 1 7
> Time C-srcu-observed-4 0.02
> Hash=8b6020369b73ac19070864a9db00bbf8
>
> ------------------------------------------------------------------------
>
> This does not seem to me to be consistent with your "The RCU guarantee
> about writes in a read-side critical section becoming visible to all
> CPUs before a later grace period ends".

I believe the issue is a different one: it's about the prop;prop at the
end, not the grace-period guarantee. The stores in the CS do become
visible, but the store-release never propagates anywhere, since the
co-later store from the CS has already propagated everywhere.
I believe this is because A ->prop B ->prop C only says that there are
writes WB and WC such that WB propagates to B's CPU before B executes,
WC is co-after B, and WC propagates to C's CPU before C executes. (I
think B is the release store here).

But it does not say anything about the propagation/execution order of B
and WC, and I believe WC can propagate to every CPU (other than B's)
before B, and B never propagates anywhere.
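
For reference, the definition of prop in
tools/memory-model/linux-kernel.cat is roughly the following (quoted from
memory, so check the current file for the exact form):

let prop = [Marked] ; (overwrite & ext)? ; cumul-fence* ; [Marked] ;
        rfe? ; [Marked]

The (overwrite & ext)? step is what introduces the co-later write, and
the cumul-fence*/rfe? steps are what carry it to the next CPU.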

> Again, I am OK with LKMM allowing C-srcu-observed-4.litmus, as long as
> the actual Linux-kernel implementation forbids it.

Is it really that important that the implementation forbids it? Do you
have a use case?

Best wishes, jonas


2023-01-25 19:09:45

by Alan Stern

[permalink] [raw]
Subject: Re: Internal vs. external barriers (was: Re: Interesting LKMM litmus test)

On Wed, Jan 25, 2023 at 09:18:32AM -0800, Paul E. McKenney wrote:
> ------------------------------------------------------------------------
>
> C C-srcu-observed-4
>
> (*
> * Result: Sometimes
> *
> * The Linux-kernel implementation is suspected to forbid this.
> *)
>
> {}
>
> P0(int *x, int *y, int *z, struct srcu_struct *s)
> {
>         int r1;
>
>         r1 = srcu_read_lock(s);
>         WRITE_ONCE(*y, 2);
>         WRITE_ONCE(*x, 1);
>         srcu_read_unlock(s, r1);
> }
>
> P1(int *x, int *y, int *z, struct srcu_struct *s)
> {
>         int r1;
>
>         WRITE_ONCE(*y, 1);
>         synchronize_srcu(s);
>         WRITE_ONCE(*z, 2);
> }
>
> P2(int *x, int *y, int *z, struct srcu_struct *s)
> {
>         WRITE_ONCE(*z, 1);
>         smp_store_release(x, 2);
> }
>
> exists (x=1 /\ y=1 /\ z=1)
>
> ------------------------------------------------------------------------
>
> We get the following from herd7:
>
> ------------------------------------------------------------------------
>
> $ herd7 -conf linux-kernel.cfg C-srcu-observed-4.litmus
> Test C-srcu-observed-4 Allowed
> States 8
> x=1; y=1; z=1;
> x=1; y=1; z=2;
> x=1; y=2; z=1;
> x=1; y=2; z=2;
> x=2; y=1; z=1;
> x=2; y=1; z=2;
> x=2; y=2; z=1;
> x=2; y=2; z=2;
> Ok
> Witnesses
> Positive: 1 Negative: 7
> Condition exists (x=1 /\ y=1 /\ z=1)
> Observation C-srcu-observed-4 Sometimes 1 7
> Time C-srcu-observed-4 0.02
> Hash=8b6020369b73ac19070864a9db00bbf8
>
> ------------------------------------------------------------------------
>
> This does not seem to me to be consistent with your "The RCU guarantee
> about writes in a read-side critical section becoming visible to all
> CPUs before a later grace period ends".

Let's see. That guarantee requires only that x=1 and y=2 become visible
to P1 and P2 before the grace period ends. And since synchronize_srcu
is a strong fence, y=1 must become visible to P0 and P2 before the grace
period ends. Presumably after y=2 does, because it overwrites y=2.
Okay so far.

Now at some point P2 executes x=2. If this were to happen after the
grace period ended, it would overwrite x=1. Therefore it must happen
before the grace period ends, and therefore P2 must also write z=1
before the grace period ends.

So we have P2 writing z=1 before P1 writes z=2. But this doesn't mean
z=2 has to overwrite z=1! (You had a diagram illustrating this point in
one of your own slides for a talk about the LKMM.) Overwriting is
required only when the earlier write becomes visible to the later
write's CPU before the later write occurs, and nothing in this test
forces z=2 to propagate to P1 before the z=1 write executes.

So the litmus test's outcome can happen without violating my guarantee.

> So what am I missing here?

Can't tell. I'm not sure why you think the litmus test isn't consistent
with the guarantee.

> Again, I am OK with LKMM allowing C-srcu-observed-4.litmus, as long as
> the actual Linux-kernel implementation forbids it.

Why do you want the implementation to forbid it? The pattern of the
litmus test resembles 3+3W, and you don't care whether the kernel allows
that pattern. Do you?

Alan

2023-01-25 19:47:01

by Paul E. McKenney

[permalink] [raw]
Subject: Re: Internal vs. external barriers (was: Re: Interesting LKMM litmus test)

On Wed, Jan 25, 2023 at 02:08:59PM -0500, Alan Stern wrote:
> On Wed, Jan 25, 2023 at 09:18:32AM -0800, Paul E. McKenney wrote:
> > ------------------------------------------------------------------------
> >
> > C C-srcu-observed-4
> >
> > (*
> > * Result: Sometimes
> > *
> > * The Linux-kernel implementation is suspected to forbid this.
> > *)
> >
> > {}
> >
> > P0(int *x, int *y, int *z, struct srcu_struct *s)
> > {
> >         int r1;
> >
> >         r1 = srcu_read_lock(s);
> >         WRITE_ONCE(*y, 2);
> >         WRITE_ONCE(*x, 1);
> >         srcu_read_unlock(s, r1);
> > }
> >
> > P1(int *x, int *y, int *z, struct srcu_struct *s)
> > {
> >         int r1;
> >
> >         WRITE_ONCE(*y, 1);
> >         synchronize_srcu(s);
> >         WRITE_ONCE(*z, 2);
> > }
> >
> > P2(int *x, int *y, int *z, struct srcu_struct *s)
> > {
> >         WRITE_ONCE(*z, 1);
> >         smp_store_release(x, 2);
> > }
> >
> > exists (x=1 /\ y=1 /\ z=1)
> >
> > ------------------------------------------------------------------------
> >
> > We get the following from herd7:
> >
> > ------------------------------------------------------------------------
> >
> > $ herd7 -conf linux-kernel.cfg C-srcu-observed-4.litmus
> > Test C-srcu-observed-4 Allowed
> > States 8
> > x=1; y=1; z=1;
> > x=1; y=1; z=2;
> > x=1; y=2; z=1;
> > x=1; y=2; z=2;
> > x=2; y=1; z=1;
> > x=2; y=1; z=2;
> > x=2; y=2; z=1;
> > x=2; y=2; z=2;
> > Ok
> > Witnesses
> > Positive: 1 Negative: 7
> > Condition exists (x=1 /\ y=1 /\ z=1)
> > Observation C-srcu-observed-4 Sometimes 1 7
> > Time C-srcu-observed-4 0.02
> > Hash=8b6020369b73ac19070864a9db00bbf8
> >
> > ------------------------------------------------------------------------
> >
> > This does not seem to me to be consistent with your "The RCU guarantee
> > about writes in a read-side critical section becoming visible to all
> > CPUs before a later grace period ends".
>
> Let's see. That guarantee requires only that x=1 and y=2 become visible
> to P1 and P2 before the grace period ends. And since synchronize_srcu
> is a strong fence, y=1 must become visible to P0 and P2 before the grace
> period ends. Presumably after y=2 does, because it overwrites y=2.
> Okay so far.
>
> Now at some point P2 executes x=2. If this were to happen after the
> grace period ended, it would overwrite x=1. Therefore it must happen
> before the grace period ends, and therefore P2 must also write z=1
> before the grace period ends.
>
> So we have P2 writing z=1 before P1 writes z=2. But this doesn't mean
> z=2 has to overwrite z=1! (You had a diagram illustrating this point in
> one of your own slides for a talk about the LKMM.) Overwriting is
> required only when the earlier write becomes visible to the later
> write's CPU before the later write occurs, and nothing in this test
> forces z=2 to propagate to P1 before the z=1 write executes.
>
> So the litmus test's outcome can happen without violating my guarantee.

Makes sense, thank you!

> > So what am I missing here?
>
> Can't tell. I'm not sure why you think the litmus test isn't consistent
> with the guarantee.

I was missing that additional non-temporal co link.

> > Again, I am OK with LKMM allowing C-srcu-observed-4.litmus, as long as
> > the actual Linux-kernel implementation forbids it.
>
> Why do you want the implementation to forbid it? The pattern of the
> litmus test resembles 3+3W, and you don't care whether the kernel allows
> that pattern. Do you?

Jonas asked a similar question, so I am answering you both here.

With (say) a release-WRITE_ONCE() chain implementing N+2W for some
N, it is reasonably well known that you don't get ordering, hardware
support notwithstanding. After all, none of the Linux kernel, C, and C++
memory models make that guarantee. In addition, the non-RCU barriers
and accesses that you can use to create N+2W have been in very wide use
for a very long time.

Although RCU has been in use for almost as long as those non-RCU barriers,
it has not been in wide use for anywhere near that long. So I cannot
be so confident in ruling out some N+2W use case for RCU.

Such a use case could play out as follows:

1. They try LKMM on it, see that LKMM allows it, and therefore find
something else that works just as well. This is fine.

2. They try LKMM on it, see that LKMM allows it, but cannot find
something else that works just as well. They complain to us,
and we either show them how to get the same results some other
way or adjust LKMM (and perhaps the implementations) accordingly.
These are also fine.

3. They don't try LKMM on it, see that it works when they test it,
and they send it upstream. The use case is entangled deeply
enough in other code that no one spots it on review. The Linux
kernel unconditionally prohibits the cycle. This too is fine.

4. They don't try LKMM on it, see that it works when they test it,
and they send it upstream. The use case is entangled deeply
enough in other code that no one spots it on review. Because RCU
grace periods incur tens of microseconds of latency at a minimum,
all tests (almost) always pass, just due to delays and unrelated
accesses and memory barriers. Even in kernels built with some
future SRCU equivalent of CONFIG_RCU_STRICT_GRACE_PERIOD=y.
But the Linux kernel allows the cycle when there is a new moon
on Tuesday during a triple solar eclipse of Jupiter, a condition
that is eventually met, and at the worst possible time and place.

This is absolutely the opposite of fine.

I don't want to deal with #4. So this is an RCU-maintainer use case
that I would like to avoid. ;-)

Thanx, Paul

2023-01-25 20:36:41

by Andrea Parri

[permalink] [raw]
Subject: Re: Internal vs. external barriers (was: Re: Interesting LKMM litmus test)

> > Why do you want the implementation to forbid it? The pattern of the
> > litmus test resembles 3+3W, and you don't care whether the kernel allows
> > that pattern. Do you?
>
> Jonas asked a similar question, so I am answering you both here.
>
> With (say) a release-WRITE_ONCE() chain implementing N+2W for some
> N, it is reasonably well known that you don't get ordering, hardware
> support notwithstanding. After all, none of the Linux kernel, C, and C++
> memory models make that guarantee. In addition, the non-RCU barriers
> and accesses that you can use to create N+2W have been in very wide use
> for a very long time.
>
> Although RCU has been in use for almost as long as those non-RCU barriers,
> it has not been in wide use for anywhere near that long. So I cannot
> be so confident in ruling out some N+2W use case for RCU.

Did some archeology... the pattern, with either RCU sync plus a release
or with two full fences plus a release, was forbidden by "ancient LKMM":
the relevant changes were described in

https://mirrors.edge.kernel.org/pub/linux/kernel/people/paulmck/LWNLinuxMM/WeakModel.html#Coherence%20Point%20and%20RCU

Andrea

2023-01-25 20:47:28

by Alan Stern

[permalink] [raw]
Subject: Re: Internal vs. external barriers (was: Re: Interesting LKMM litmus test)

On Wed, Jan 25, 2023 at 11:46:51AM -0800, Paul E. McKenney wrote:
> On Wed, Jan 25, 2023 at 02:08:59PM -0500, Alan Stern wrote:
> > Why do you want the implementation to forbid it? The pattern of the
> > litmus test resembles 3+3W, and you don't care whether the kernel allows
> > that pattern. Do you?
>
> Jonas asked a similar question, so I am answering you both here.
>
> With (say) a release-WRITE_ONCE() chain implementing N+2W for some
> N, it is reasonably well known that you don't get ordering, hardware
> support notwithstanding. After all, none of the Linux kernel, C, and C++
> memory models make that guarantee. In addition, the non-RCU barriers
> and accesses that you can use to create N+2W have been in very wide use
> for a very long time.
>
> Although RCU has been in use for almost as long as those non-RCU barriers,
> it has not been in wide use for anywhere near that long. So I cannot
> be so confident in ruling out some N+2W use case for RCU.
>
> Such a use case could play out as follows:
>
> 1. They try LKMM on it, see that LKMM allows it, and therefore find
> something else that works just as well. This is fine.
>
> 2. They try LKMM on it, see that LKMM allows it, but cannot find
> something else that works just as well. They complain to us,
> and we either show them how to get the same results some other
> way or adjust LKMM (and perhaps the implementations) accordingly.
> These are also fine.
>
> 3. They don't try LKMM on it, see that it works when they test it,
> and they send it upstream. The use case is entangled deeply
> enough in other code that no one spots it on review. The Linux
> kernel unconditionally prohibits the cycle. This too is fine.
>
> 4. They don't try LKMM on it, see that it works when they test it,
> and they send it upstream. The use case is entangled deeply
> enough in other code that no one spots it on review. Because RCU
> grace periods incur tens of microseconds of latency at a minimum,
> all tests (almost) always pass, just due to delays and unrelated
> accesses and memory barriers. Even in kernels built with some
> future SRCU equivalent of CONFIG_RCU_STRICT_GRACE_PERIOD=y.
> But the Linux kernel allows the cycle when there is a new moon
> on Tuesday during a triple solar eclipse of Jupiter, a condition
> that is eventually met, and at the worst possible time and place.
>
> This is absolutely the opposite of fine.
>
> I don't want to deal with #4. So this is an RCU-maintainer use case
> that I would like to avoid. ;-)

Since it is well known that the non-RCU barriers in the Linux kernel, C,
and C++ do not enforce ordering in n+nW, and seeing as how your litmus
test relies on an smp_store_release() at one point, I think it's
reasonable to assume people won't expect it to provide ordering.

Ah, but what about a litmus test that relies solely on RCU?

rcu_read_lock      Wy=2                rcu_read_lock      Wv=2
Wx=2               synchronize_rcu     Wu=2               synchronize_rcu
Wy=1               Wu=1                Wv=1               Wx=1
rcu_read_unlock                        rcu_read_unlock

exists (x=2 /\ y=2 /\ u=2 /\ v=2)
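
Spelled out in full (my expansion of the shorthand above, with
illustrative process and variable names):

C C-4proc-rcu-cycle

{}

P0(int *x, int *y)
{
        rcu_read_lock();
        WRITE_ONCE(*x, 2);
        WRITE_ONCE(*y, 1);
        rcu_read_unlock();
}

P1(int *y, int *u)
{
        WRITE_ONCE(*y, 2);
        synchronize_rcu();
        WRITE_ONCE(*u, 1);
}

P2(int *u, int *v)
{
        rcu_read_lock();
        WRITE_ONCE(*u, 2);
        WRITE_ONCE(*v, 1);
        rcu_read_unlock();
}

P3(int *v, int *x)
{
        WRITE_ONCE(*v, 2);
        synchronize_rcu();
        WRITE_ONCE(*x, 1);
}

exists (x=2 /\ y=2 /\ u=2 /\ v=2)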

Luckily, this _is_ forbidden by the LKMM. So I think you're okay.

Alan

2023-01-25 21:12:22

by Jonas Oberhauser

[permalink] [raw]
Subject: Re: Internal vs. external barriers (was: Re: Interesting LKMM litmus test)



On 1/25/2023 9:36 PM, Andrea Parri wrote:
>>> Why do you want the implementation to forbid it? The pattern of the
>>> litmus test resembles 3+3W, and you don't care whether the kernel allows
>>> that pattern. Do you?
>> Jonas asked a similar question, so I am answering you both here.
>>
>> With (say) a release-WRITE_ONCE() chain implementing N+2W for some
>> N, it is reasonably well known that you don't get ordering, hardware
>> support notwithstanding. After all, none of the Linux kernel, C, and C++
>> memory models make that guarantee. In addition, the non-RCU barriers
>> and accesses that you can use to create N+2W have been in very wide use
>> for a very long time.
>>
>> Although RCU has been in use for almost as long as those non-RCU barriers,
>> it has not been in wide use for anywhere near that long. So I cannot
>> be so confident in ruling out some N+2W use case for RCU.
> Did some archeology... the pattern, with either RCU sync plus a release
> or with two full fences plus a release, was forbidden by "ancient LKMM":
> the relevant changes were described in
>
> https://mirrors.edge.kernel.org/pub/linux/kernel/people/paulmck/LWNLinuxMM/WeakModel.html#Coherence%20Point%20and%20RCU
>
> Andrea

Fascinating! It says there "But the weak model allows it, as required"
-- what does "as required" mean? Just "as required by dropping the
constraint"?

Is there still a notion of "strong model" and "weak model", or was the
strong model dropped?

jonas


2023-01-25 21:23:35

by Paul E. McKenney

[permalink] [raw]
Subject: Re: Internal vs. external barriers (was: Re: Interesting LKMM litmus test)

On Wed, Jan 25, 2023 at 10:10:32PM +0100, Jonas Oberhauser wrote:
>
>
> On 1/25/2023 9:36 PM, Andrea Parri wrote:
> > > > Why do you want the implementation to forbid it? The pattern of the
> > > > litmus test resembles 3+3W, and you don't care whether the kernel allows
> > > > that pattern. Do you?
> > > Jonas asked a similar question, so I am answering you both here.
> > >
> > > With (say) a release-WRITE_ONCE() chain implementing N+2W for some
> > > N, it is reasonably well known that you don't get ordering, hardware
> > > support notwithstanding. After all, none of the Linux kernel, C, and C++
> > > memory models make that guarantee. In addition, the non-RCU barriers
> > > and accesses that you can use to create N+2W have been in very wide use
> > > for a very long time.
> > >
> > > Although RCU has been in use for almost as long as those non-RCU barriers,
> > > it has not been in wide use for anywhere near that long. So I cannot
> > > be so confident in ruling out some N+2W use case for RCU.
> > Did some archeology... the pattern, with either RCU sync plus a release
> > or with two full fences plus a release, was forbidden by "ancient LKMM":
> > the relevant changes were described in
> >
> > https://mirrors.edge.kernel.org/pub/linux/kernel/people/paulmck/LWNLinuxMM/WeakModel.html#Coherence%20Point%20and%20RCU
> >
> > Andrea
>
> Fascinating! It says there "But the weak model allows it, as required" --
> what does "as required" mean? Just "as required by dropping the constraint"?

"As required by our reluctance to support it, given that all use cases
we have seen are traps for the unwary."

> Is there still a notion of "strong model" and "weak model", or was the
> strong model dropped?

The strong model was dropped. The differences between them were
eventually small enough that it did not make sense to maintain two models.

Thanx, Paul

2023-01-25 21:38:52

by Paul E. McKenney

[permalink] [raw]
Subject: Re: Internal vs. external barriers (was: Re: Interesting LKMM litmus test)

On Wed, Jan 25, 2023 at 03:46:14PM -0500, Alan Stern wrote:
> On Wed, Jan 25, 2023 at 11:46:51AM -0800, Paul E. McKenney wrote:
> > On Wed, Jan 25, 2023 at 02:08:59PM -0500, Alan Stern wrote:
> > > Why do you want the implementation to forbid it? The pattern of the
> > > litmus test resembles 3+3W, and you don't care whether the kernel allows
> > > that pattern. Do you?
> >
> > Jonas asked a similar question, so I am answering you both here.
> >
> > With (say) a release-WRITE_ONCE() chain implementing N+2W for some
> > N, it is reasonably well known that you don't get ordering, hardware
> > support notwithstanding. After all, none of the Linux kernel, C, and C++
> > memory models make that guarantee. In addition, the non-RCU barriers
> > and accesses that you can use to create N+2W have been in very wide use
> > for a very long time.
> >
> > Although RCU has been in use for almost as long as those non-RCU barriers,
> > it has not been in wide use for anywhere near that long. So I cannot
> > be so confident in ruling out some N+2W use case for RCU.
> >
> > Such a use case could play out as follows:
> >
> > 1. They try LKMM on it, see that LKMM allows it, and therefore find
> > something else that works just as well. This is fine.
> >
> > 2. They try LKMM on it, see that LKMM allows it, but cannot find
> > something else that works just as well. They complain to us,
> > and we either show them how to get the same results some other
> > way or adjust LKMM (and perhaps the implementations) accordingly.
> > These are also fine.
> >
> > 3. They don't try LKMM on it, see that it works when they test it,
> > and they send it upstream. The use case is entangled deeply
> > enough in other code that no one spots it on review. The Linux
> > kernel unconditionally prohibits the cycle. This too is fine.
> >
> > 4. They don't try LKMM on it, see that it works when they test it,
> > and they send it upstream. The use case is entangled deeply
> > enough in other code that no one spots it on review. Because RCU
> > grace periods incur tens of microseconds of latency at a minimum,
> > all tests (almost) always pass, just due to delays and unrelated
> > accesses and memory barriers. Even in kernels built with some
> > future SRCU equivalent of CONFIG_RCU_STRICT_GRACE_PERIOD=y.
> > But the Linux kernel allows the cycle when there is a new moon
> > on Tuesday during a triple solar eclipse of Jupiter, a condition
> > that is eventually met, and at the worst possible time and place.
> >
> > This is absolutely the opposite of fine.
> >
> > I don't want to deal with #4. So this is an RCU-maintainer use case
> > that I would like to avoid. ;-)
>
> Since it is well known that the non-RCU barriers in the Linux kernel, C,
> and C++ do not enforce ordering in n+nW, and seeing as how your litmus
> test relies on an smp_store_release() at one point, I think it's
> reasonable to assume people won't expect it to provide ordering.

The presence of that grace period, which is well known to have super-heavy
ordering properties, will likely reduce the number of people whose
expectations are aligned with LKMM. :-/

Plus it is not easy to create something that meets the LKMM grace-period
requirements without also making it provide this additional ordering on
real systems.

> Ah, but what about a litmus test that relies solely on RCU?
>
> rcu_read_lock      Wy=2                rcu_read_lock      Wv=2
> Wx=2               synchronize_rcu     Wu=2               synchronize_rcu
> Wy=1               Wu=1                Wv=1               Wx=1
> rcu_read_unlock                        rcu_read_unlock
>
> exists (x=2 /\ y=2 /\ u=2 /\ v=2)
>
> Luckily, this _is_ forbidden by the LKMM. So I think you're okay.

Sometimes I get lucky! ;-)

The reader-free counterpart of your test is also forbidden, which is no
surprise given that smp_mb() also suffices.
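
For concreteness, that reader-free counterpart might look like this
(my sketch, replacing each read-side critical section above with a
grace period):

C C-4proc-sync

{}

P0(int *x, int *y)
{
        WRITE_ONCE(*x, 2);
        synchronize_rcu();
        WRITE_ONCE(*y, 1);
}

P1(int *y, int *u)
{
        WRITE_ONCE(*y, 2);
        synchronize_rcu();
        WRITE_ONCE(*u, 1);
}

P2(int *u, int *v)
{
        WRITE_ONCE(*u, 2);
        synchronize_rcu();
        WRITE_ONCE(*v, 1);
}

P3(int *v, int *x)
{
        WRITE_ONCE(*v, 2);
        synchronize_rcu();
        WRITE_ONCE(*x, 1);
}

exists (x=2 /\ y=2 /\ u=2 /\ v=2)

The cycle here alternates strong fences and co edges, which the LKMM's
propagation axiom rules out.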

Thanx, Paul

2023-01-25 23:33:14

by Paul E. McKenney

[permalink] [raw]
Subject: Re: Internal vs. external barriers (was: Re: Interesting LKMM litmus test)

On Wed, Jan 25, 2023 at 01:38:32PM -0800, Paul E. McKenney wrote:
> On Wed, Jan 25, 2023 at 03:46:14PM -0500, Alan Stern wrote:
> > On Wed, Jan 25, 2023 at 11:46:51AM -0800, Paul E. McKenney wrote:
> > > On Wed, Jan 25, 2023 at 02:08:59PM -0500, Alan Stern wrote:
> > > > Why do you want the implementation to forbid it? The pattern of the
> > > > litmus test resembles 3+3W, and you don't care whether the kernel allows
> > > > that pattern. Do you?
> > >
> > > Jonas asked a similar question, so I am answering you both here.
> > >
> > > With (say) a release-WRITE_ONCE() chain implementing N+2W for some
> > > N, it is reasonably well known that you don't get ordering, hardware
> > > support notwithstanding. After all, none of the Linux kernel, C, and C++
> > > memory models make that guarantee. In addition, the non-RCU barriers
> > > and accesses that you can use to create N+2W have been in very wide use
> > > for a very long time.
> > >
> > > Although RCU has been in use for almost as long as those non-RCU barriers,
> > > it has not been in wide use for anywhere near that long. So I cannot
> > > be so confident in ruling out some N+2W use case for RCU.
> > >
> > > Such a use case could play out as follows:
> > >
> > > 1. They try LKMM on it, see that LKMM allows it, and therefore find
> > > something else that works just as well. This is fine.
> > >
> > > 2. They try LKMM on it, see that LKMM allows it, but cannot find
> > > something else that works just as well. They complain to us,
> > > and we either show them how to get the same results some other
> > > way or adjust LKMM (and perhaps the implementations) accordingly.
> > > These are also fine.
> > >
> > > 3. They don't try LKMM on it, see that it works when they test it,
> > > and they send it upstream. The use case is entangled deeply
> > > enough in other code that no one spots it on review. The Linux
> > > kernel unconditionally prohibits the cycle. This too is fine.
> > >
> > > 4. They don't try LKMM on it, see that it works when they test it,
> > > and they send it upstream. The use case is entangled deeply
> > > enough in other code that no one spots it on review. Because RCU
> > > grace periods incur tens of microseconds of latency at a minimum,
> > > all tests (almost) always pass, just due to delays and unrelated
> > > accesses and memory barriers. Even in kernels built with some
> > > future SRCU equivalent of CONFIG_RCU_STRICT_GRACE_PERIOD=y.
> > > But the Linux kernel allows the cycle when there is a new moon
> > > on Tuesday during a triple solar eclipse of Jupiter, a condition
> > > that is eventually met, and at the worst possible time and place.
> > >
> > > This is absolutely the opposite of fine.
> > >
> > > I don't want to deal with #4. So this is an RCU-maintainer use case
> > > that I would like to avoid. ;-)
> >
> > Since it is well known that the non-RCU barriers in the Linux kernel, C,
> > and C++ do not enforce ordering in n+nW, and seeing as how your litmus
> > test relies on an smp_store_release() at one point, I think it's
> > reasonable to assume people won't expect it to provide ordering.
>
> The presence of that grace period, which is well known to have super-heavy
> ordering properties, will likely reduce the number of people whose
> expectations are aligned with LKMM. :-/
>
> Plus it is not easy to create something that meets the LKMM grace-period
> requirements without also making it provide this additional ordering on
> real systems.
>
> > Ah, but what about a litmus test that relies solely on RCU?
> >
> > P0: rcu_read_lock ; Wx=2 ; Wy=1 ; rcu_read_unlock
> > P1: Wy=2 ; synchronize_rcu ; Wu=1
> > P2: rcu_read_lock ; Wu=2 ; Wv=1 ; rcu_read_unlock
> > P3: Wv=2 ; synchronize_rcu ; Wx=1
> >
> > exists (x=2 /\ y=2 /\ u=2 /\ v=2)
> >
> > Luckily, this _is_ forbidden by the LKMM. So I think you're okay.
>
> Sometimes I get lucky! ;-)
>
> The reader-free counterpart of your test is also forbidden, which is no
> surprise given that smp_mb() also suffices.

Ah, and returning to the earlier question as to whether srcu_read_unlock()
can use release semantics instead of smp_mb(), at the very least, this
portion of the synchronize_srcu() function's header comment must change:

On systems with more than one CPU, when synchronize_srcu()
returns, each CPU is guaranteed to have executed a full
memory barrier since the end of its last corresponding SRCU
read-side critical section whose beginning preceded the call
to synchronize_srcu().

I don't know of any SRCU code that relies on this, but it would be good to
check. There used to (and might still) be RCU code relying on this, which
is why this sentence was added to the header comment in the first place.

Thanx, Paul

2023-01-26 01:45:51

by Alan Stern

Subject: Re: Internal vs. external barriers (was: Re: Interesting LKMM litmus test)

On Wed, Jan 25, 2023 at 03:33:08PM -0800, Paul E. McKenney wrote:
> Ah, and returning to the earlier question as to whether srcu_read_unlock()
> can use release semantics instead of smp_mb(), at the very least, this
> portion of the synchronize_srcu() function's header comment must change:
>
> On systems with more than one CPU, when synchronize_srcu()
> returns, each CPU is guaranteed to have executed a full
> memory barrier since the end of its last corresponding SRCU
> read-side critical section whose beginning preceded the call
> to synchronize_srcu().

Yes, that would not be true. But on the other hand, it would be true
that each CPU is guaranteed to have executed a release memory barrier
since the end of its last corresponding SRCU read-side critical section
whose beginning preceded the call to synchronize_srcu(), _and_ the CPU
executing synchronize_srcu() is guaranteed to have executed a full
memory barrier after seeing the values from all those release stores.
This is not quite the same thing but it ought to be just as good.

> I don't know of any SRCU code that relies on this, but it would be good to
> check. There used to (and might still) be RCU code relying on this, which
> is why this sentence was added to the header comment in the first place.

If there is code relying on that guarantee, it ought to work just as
well by relying on the modified guarantee.

Of course, there might be code relying on a guarantee that
srcu_read_unlock() executes a full memory barrier. This guarantee would
certainly no longer hold. But as I understand it, this guarantee was
never promised by the SRCU subsystem.

Alan
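
[Alan's modified guarantee can be pictured with a litmus sketch along the
following lines. This is an illustration, not a test from the thread: the
smp_store_release() stands in for a weakened srcu_read_unlock(), and c
stands in for the unlock count that the grace-period machinery reads, so
all names and the exact shape are assumptions:]

C C-release-unlock-gp-mb

{}

P0(int *x)
{
	WRITE_ONCE(*x, 1);
}

P1(int *x, int *c)
{
	int r1;

	r1 = READ_ONCE(*x);
	smp_store_release(c, 1); /* models srcu_read_unlock() as a release */
}

P2(int *c, int *y)
{
	int r2;

	r2 = READ_ONCE(*c); /* grace-period machinery samples the count */
	smp_mb();           /* full barrier on the synchronize_srcu() CPU */
	WRITE_ONCE(*y, 1);  /* store after the grace period ends */
}

P3(int *x, int *y)
{
	int r3;
	int r4;

	r3 = READ_ONCE(*y);
	smp_mb();
	r4 = READ_ONCE(*x);
}

exists (1:r1=1 /\ 2:r2=1 /\ 3:r3=1 /\ 3:r4=0)

[On this reading, the release store is A-cumulative, so x=1 propagates
together with c=1, and P2's smp_mb() then forces both out before y=1;
LKMM should therefore forbid the exists clause, matching the "just as
good" claim.]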

2023-01-26 01:53:35

by Paul E. McKenney

Subject: Re: Internal vs. external barriers (was: Re: Interesting LKMM litmus test)

On Wed, Jan 25, 2023 at 08:45:44PM -0500, Alan Stern wrote:
> On Wed, Jan 25, 2023 at 03:33:08PM -0800, Paul E. McKenney wrote:
> > Ah, and returning to the earlier question as to whether srcu_read_unlock()
> > can use release semantics instead of smp_mb(), at the very least, this
> > portion of the synchronize_srcu() function's header comment must change:
> >
> > On systems with more than one CPU, when synchronize_srcu()
> > returns, each CPU is guaranteed to have executed a full
> > memory barrier since the end of its last corresponding SRCU
> > read-side critical section whose beginning preceded the call
> > to synchronize_srcu().
>
> Yes, that would not be true. But on the other hand, it would be true
> that each CPU is guaranteed to have executed a release memory barrier
> since the end of its last corresponding SRCU read-side critical section
> whose beginning preceded the call to synchronize_srcu(), _and_ the CPU
> executing synchronize_srcu() is guaranteed to have executed a full
> memory barrier after seeing the values from all those release stores.
> This is not quite the same thing but it ought to be just as good.

Here is hoping!

> > I don't know of any SRCU code that relies on this, but it would be good to
> > check. There used to (and might still) be RCU code relying on this, which
> > is why this sentence was added to the header comment in the first place.
>
> If there is code relying on that guarantee, it ought to work just as
> well by relying on the modified guarantee.

Again, here is hoping!

> Of course, there might be code relying on a guarantee that
> srcu_read_unlock() executes a full memory barrier. This guarantee would
> certainly no longer hold. But as I understand it, this guarantee was
> never promised by the SRCU subsystem.

That indented sentence was copied from the synchronize_srcu() function's
header comment, which might be interpreted by some as a promise by the
SRCU subsystem.

Thanx, Paul

2023-01-26 12:19:07

by Jonas Oberhauser

Subject: Re: Internal vs. external barriers (was: Re: Interesting LKMM litmus test)



On 1/26/2023 2:53 AM, Paul E. McKenney wrote:
> On Wed, Jan 25, 2023 at 08:45:44PM -0500, Alan Stern wrote:
>> On Wed, Jan 25, 2023 at 03:33:08PM -0800, Paul E. McKenney wrote:
>>> Ah, and returning to the earlier question as to whether srcu_read_unlock()
>>> can use release semantics instead of smp_mb(), at the very least, this
>>> portion of the synchronize_srcu() function's header comment must change:
>>>
>>> On systems with more than one CPU, when synchronize_srcu()
>>> returns, each CPU is guaranteed to have executed a full
>>> memory barrier since the end of its last corresponding SRCU
>>> read-side critical section whose beginning preceded the call
>>> to synchronize_srcu().
>>
>> Of course, there might be code relying on a guarantee that
>> srcu_read_unlock() executes a full memory barrier. This guarantee would
>> certainly no longer hold. But as I understand it, this guarantee was
>> never promised by the SRCU subsystem.
> That indented sentence was copied from the synchronize_srcu() function's
> header comment, which might be interpreted by some as a promise by the
> SRCU subsystem.

I think we understand that it is a promise of the SRCU subsystem; the
question is just what the promise is.
As Alan said, if the promise is interpreted as something like

"every store that propagated to the read side critical section must have
propagated to all CPUs before the synchronize_srcu() ends" (where the
RSCS and synchronize_srcu() calls are those from the promise)

then that guarantee holds even if you only use a release fence to
communicate the end of the RSCS to the GP. Note that this interpretation
is analogous to the promise of smp_mb__after_unlock_lock(), which says
that an UNLOCK+LOCK pair act as a full fence: here the read-side
unlock+gp act as a full memory barrier.

On the other hand, if the promise is more literally interpreted as

"there is a (possibly virtual) instruction in the reader-side execution
stream that acts as a full memory barrier, and that instruction is
executed before the synchronize_srcu() ends"

then that guarantee is violated, and I suppose you might be able to
write some absurd client that inspects every store of the reader thread
and sees that there is no line in the reader side code that acts like a
full fence. But it would take a lot of effort to discern this.

Perhaps someone interpreting the promise like this might, however,
reason that the only part of the code actually under SRCU's control,
and hence the only place where that full barrier could be hidden, is
inside srcu_read_unlock(), and so come to expect to always find this
full barrier there and treat srcu_read_unlock() in general as a full
barrier. Considering that the wording explicitly isn't "an
srcu_read_unlock() is a full barrier", I hope few people would have this
unhealthy idea. But you never know.

Best wishes,
jonas
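
[The smp_mb__after_unlock_lock() analogy can be illustrated by a litmus
sketch like the following (again a reconstruction for illustration, not a
test from the thread). The r0=1 condition pins P1's critical section after
P0's, so the unlock+lock pair sits between P0's store and P1's read of y:]

C C-unlock-lock-mb

{}

P0(int *x, spinlock_t *l)
{
	spin_lock(l);
	WRITE_ONCE(*x, 1);
	spin_unlock(l);
}

P1(int *x, int *y, spinlock_t *l)
{
	int r0;
	int r1;

	spin_lock(l);
	smp_mb__after_unlock_lock();
	r0 = READ_ONCE(*x);
	r1 = READ_ONCE(*y);
	spin_unlock(l);
}

P2(int *x, int *y)
{
	int r2;

	WRITE_ONCE(*y, 1);
	smp_mb();
	r2 = READ_ONCE(*x);
}

exists (1:r0=1 /\ 1:r1=0 /\ 2:r2=0)

[With smp_mb__after_unlock_lock(), the unlock+lock pair should act as a
full fence and LKMM should forbid this store-buffering-style cycle; my
understanding is that without the barrier the cycle would be allowed,
which is exactly the gap the barrier exists to close.]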


2023-01-26 18:48:07

by Paul E. McKenney

Subject: Re: Internal vs. external barriers (was: Re: Interesting LKMM litmus test)

On Thu, Jan 26, 2023 at 01:17:49PM +0100, Jonas Oberhauser wrote:
>
>
> On 1/26/2023 2:53 AM, Paul E. McKenney wrote:
> > On Wed, Jan 25, 2023 at 08:45:44PM -0500, Alan Stern wrote:
> > > On Wed, Jan 25, 2023 at 03:33:08PM -0800, Paul E. McKenney wrote:
> > > > Ah, and returning to the earlier question as to whether srcu_read_unlock()
> > > > can use release semantics instead of smp_mb(), at the very least, this
> > > > portion of the synchronize_srcu() function's header comment must change:
> > > >
> > > > On systems with more than one CPU, when synchronize_srcu()
> > > > returns, each CPU is guaranteed to have executed a full
> > > > memory barrier since the end of its last corresponding SRCU
> > > > read-side critical section whose beginning preceded the call
> > > > to synchronize_srcu().
> > >
> > > Of course, there might be code relying on a guarantee that
> > > srcu_read_unlock() executes a full memory barrier. This guarantee would
> > > certainly no longer hold. But as I understand it, this guarantee was
> > > never promised by the SRCU subsystem.
> > That indented sentence was copied from the synchronize_srcu() function's
> > header comment, which might be interpreted by some as a promise by the
> > SRCU subsystem.
>
> I think we understand that it is a promise of the SRCU subsystem, the
> question is just what the promise is.
> As Alan said, if the promise is interpreted as something like
>
> "every store that propagated to the read side critical section must have
> propagated to all CPUs before the synchronize_srcu() ends" (where the RSCS
> and synchronize_srcu() calls are those from the promise)
>
> then that guarantee holds even if you only use a release fence to
> communicate the end of the RSCS to the GP. Note that this interpretation is
> analogous to the promise of smp_mb__after_unlock_lock(), which says that an
> UNLOCK+LOCK pair act as a full fence: here the read-side unlock+gp act as a
> full memory barrier.

Good point that the existing smp_mb__after_unlock_lock() can be used for
any use cases relying on the more literal interpretation of this promise.
We already have the work-around! ;-)

> On the other hand, if the promise is more literally interpreted as
>
> "there is a (possibly virtual) instruction in the reader-side execution
> stream that acts as a full memory barrier, and that instruction is executed
> before the synchronize_srcu() ends"
>
> then that guarantee is violated, and I suppose you might be able to write
> some absurd client that inspects every store of the reader thread and sees
> that there is no line in the reader side code that acts like a full fence.
> But it would take a lot of effort to discern this.

The usual litmus test is shown at the end of this email. If you remove
the "//" from any of those smp_mb() calls, the test is forbidden, but
with all of them commented out, it is allowed. Which illustrates the
utility of smp_mb__after_unlock_lock(). It also shows that LKMM does
not model this guarantee from synchronize_srcu()'s comment header.
Which might be fine, actually.

Of course, I just now wrote this litmus test, so it should be viewed
with extreme suspicion.

> Perhaps someone interpreting the promise like this might however come to the
> conclusion that because the only part of the code that is actually under
> control of srcu, and hence the only code where that full barrier could be
> hidden, would be inside the srcu_unlock(), they might expect to always find
> this full barrier there and treat srcu_unlock() in general as a full
> barrier. Considering that the wording explicitly isn't "an srcu_unlock() is
> a full barrier", I hope few people would have this unhealthy idea. But you
> never know.

Given that the more literal interpretation is not unreasonable, we should
assume that someone somewhere might have interpreted it that way.

But I agree that the odds of someone actually relying on this are low,
and any such use case can be fixed with smp_mb__before_srcu_read_unlock(),
similar to smp_mb__after_srcu_read_unlock() that you note is already in use.

It would still be good to scan SRCU use for this sort of pattern, maybe
manually, maybe via something like coccinelle. Alternatively, I could
post on my blog (with right of first refusal to LWN and you guys as
co-authors) telling the community of our intent to change this and see
what people say. Probably both rather than either/or.

Thoughts?

Thanx, Paul

------------------------------------------------------------------------

C C-srcu-observed-6

(*
* Result: Sometimes
*
* The result is Never if any of the smp_mb() calls is uncommented.
*)

{}

P0(int *a, int *b, int *c, int *d, struct srcu_struct *s)
{
	int r1;
	int r2;
	int r3;
	int r4;

	r1 = srcu_read_lock(s);
	WRITE_ONCE(*b, 2);
	r2 = READ_ONCE(*a);
	// smp_mb();
	srcu_read_unlock(s, r1);
	// smp_mb();
	r3 = READ_ONCE(*c);
	// smp_mb();
	r4 = READ_ONCE(*d);
}

P1(int *a, int *b, int *c, int *d, struct srcu_struct *s)
{
	WRITE_ONCE(*b, 1);
	synchronize_srcu(s);
	WRITE_ONCE(*c, 1);
}

P2(int *a, int *b, int *c, int *d, struct srcu_struct *s)
{
	WRITE_ONCE(*d, 1);
	smp_mb();
	WRITE_ONCE(*a, 1);
}

exists (0:r2=1 /\ 0:r3=1 /\ 0:r4=0 /\ b=1)

2023-01-27 15:03:59

by Jonas Oberhauser

Subject: Re: Internal vs. external barriers (was: Re: Interesting LKMM litmus test)



On 1/26/2023 7:48 PM, Paul E. McKenney wrote:
> On Thu, Jan 26, 2023 at 01:17:49PM +0100, Jonas Oberhauser wrote:
>> [...]
>> Note that this interpretation is analogous to the promise of smp_mb__after_unlock_lock(), which says that an
>> UNLOCK+LOCK pair act as a full fence: here the read-side unlock+gp act as a
>> full memory barrier.
> Good point that the existing smp_mb__after_unlock_lock() can be used for
> any use cases relying on the more literal interpretation of this promise.
> We already have the work-around! ;-)

Can it? I meant that the less-literal form is similar to the one given
by smp_mb__after_unlock_lock().

>> [...] I suppose you might be able to write
>> some absurd client that inspects every store of the reader thread and sees
>> that there is no line in the reader side code that acts like a full fence.
>> But it would take a lot of effort to discern this.
> The usual litmus test is shown at the end of this email [...]
>> [...] I hope few people would have this unhealthy idea. But you
>> never know.
> Given that the more literal interpretation is not unreasonable, we should
> assume that someone somewhere might have interpreted it that way.
>
> But I agree that the odds of someone actually relying on this are low,
> and any such use case can be fixed with smp_mb__before_srcu_read_unlock(),
> similar to smp_mb__after_srcu_read_unlock() that you note is already in use.
>
> It would still be good to scan SRCU use for this sort of pattern, maybe
> manually, maybe via something like coccinelle. Alternatively, I could
> post on my blog (with right of first refusal to LWN and you guys as
> co-authors) telling the community of our intent to change this and see
> what people say. Probably both rather than either/or.
>
> Thoughts?

My first thought is "there is a 'usual' litmus test for this?" :D
But yes, the test you have given has at least the same structure as what
I would expect.

Communicating this with the community sounds very reasonable.

For some automated combing, I'm really not sure what pattern to look for.
I'm afraid someone with a lot of time might have to look (semi-)manually.

Best wishes, jonas


>
> Thanx, Paul
>
> ------------------------------------------------------------------------
>
> C C-srcu-observed-6
>
> (*
> * Result: Sometimes
> *
> * The result is Never if any of the smp_mb() calls is uncommented.
> *)
>
> {}
>
> P0(int *a, int *b, int *c, int *d, struct srcu_struct *s)
> {
> int r1;
> int r2;
> int r3;
> int r4;
>
> r1 = srcu_read_lock(s);
> WRITE_ONCE(*b, 2);
> r2 = READ_ONCE(*a);
> // smp_mb();
> srcu_read_unlock(s, r1);
> // smp_mb();
> r3 = READ_ONCE(*c);
> // smp_mb();
> r4 = READ_ONCE(*d);
> }
>
> P1(int *a, int *b, int *c, int *d, struct srcu_struct *s)
> {
> WRITE_ONCE(*b, 1);
> synchronize_srcu(s);
> WRITE_ONCE(*c, 1);
> }
>
> P2(int *a, int *b, int *c, int *d, struct srcu_struct *s)
> {
> WRITE_ONCE(*d, 1);
> smp_mb();
> WRITE_ONCE(*a, 1);
> }
>
> exists (0:r2=1 /\ 0:r3=1 /\ 0:r4=0 /\ b=1)


2023-01-27 16:51:05

by Paul E. McKenney

Subject: Re: Internal vs. external barriers (was: Re: Interesting LKMM litmus test)

On Fri, Jan 27, 2023 at 04:03:16PM +0100, Jonas Oberhauser wrote:
>
>
> On 1/26/2023 7:48 PM, Paul E. McKenney wrote:
> > On Thu, Jan 26, 2023 at 01:17:49PM +0100, Jonas Oberhauser wrote:
> > > [...]
> > > Note that this interpretation is analogous to the promise of smp_mb__after_unlock_lock(), which says that an
> > > UNLOCK+LOCK pair act as a full fence: here the read-side unlock+gp act as a
> > > full memory barrier.
> > Good point that the existing smp_mb__after_unlock_lock() can be used for
> > any use cases relying on the more literal interpretation of this promise.
> > We already have the work-around! ;-)
>
> Can it? I meant that the less-literal form is similar to the one given by
> smp_mb__after_unlock_lock().
>
> > > [...] I suppose you might be able to write
> > > some absurd client that inspects every store of the reader thread and sees
> > > that there is no line in the reader side code that acts like a full fence.
> > > But it would take a lot of effort to discern this.
> > The usual litmus test is shown at the end of this email [...]
> > > [...] I hope few people would have this unhealthy idea. But you
> > > never know.
> > Given that the more literal interpretation is not unreasonable, we should
> > assume that someone somewhere might have interpreted it that way.
> >
> > But I agree that the odds of someone actually relying on this are low,
> > and any such use case can be fixed with smp_mb__before_srcu_read_unlock(),
> > similar to smp_mb__after_srcu_read_unlock() that you note is already in use.
> >
> > It would still be good to scan SRCU use for this sort of pattern, maybe
> > manually, maybe via something like coccinelle. Alternatively, I could
> > post on my blog (with right of first refusal to LWN and you guys as
> > co-authors) telling the community of our intent to change this and see
> > what people say. Probably both rather than either/or.
> >
> > Thoughts?
>
> My first thought is "there is a 'usual' litmus test for this?" :D
> But yes, the test you have given has at least the same structure as what I
> would expect.

Exactly! ;-)

> Communicating this with the community sounds very reasonable.
>
> For some automated combing, I'm really not sure what pattern to look for.
> I'm afraid someone with a lot of time might have to look (semi-)manually.

Please continue giving it some thought. The number of srcu_read_unlock()
calls in v6.1 is about 250, which is within the realm of manual
inspection, but it is all too easy to miss something.

Thanx, Paul

> Best wishes, jonas
>
>
> >
> > Thanx, Paul
> >
> > ------------------------------------------------------------------------
> >
> > C C-srcu-observed-6
> >
> > (*
> > * Result: Sometimes
> > *
> > * The result is Never if any of the smp_mb() calls is uncommented.
> > *)
> >
> > {}
> >
> > P0(int *a, int *b, int *c, int *d, struct srcu_struct *s)
> > {
> > int r1;
> > int r2;
> > int r3;
> > int r4;
> >
> > r1 = srcu_read_lock(s);
> > WRITE_ONCE(*b, 2);
> > r2 = READ_ONCE(*a);
> > // smp_mb();
> > srcu_read_unlock(s, r1);
> > // smp_mb();
> > r3 = READ_ONCE(*c);
> > // smp_mb();
> > r4 = READ_ONCE(*d);
> > }
> >
> > P1(int *a, int *b, int *c, int *d, struct srcu_struct *s)
> > {
> > WRITE_ONCE(*b, 1);
> > synchronize_srcu(s);
> > WRITE_ONCE(*c, 1);
> > }
> >
> > P2(int *a, int *b, int *c, int *d, struct srcu_struct *s)
> > {
> > WRITE_ONCE(*d, 1);
> > smp_mb();
> > WRITE_ONCE(*a, 1);
> > }
> >
> > exists (0:r2=1 /\ 0:r3=1 /\ 0:r4=0 /\ b=1)
>

2023-01-27 16:54:45

by Paul E. McKenney

Subject: Re: Internal vs. external barriers (was: Re: Interesting LKMM litmus test)

On Fri, Jan 27, 2023 at 08:50:59AM -0800, Paul E. McKenney wrote:
> On Fri, Jan 27, 2023 at 04:03:16PM +0100, Jonas Oberhauser wrote:
> >
> >
> > On 1/26/2023 7:48 PM, Paul E. McKenney wrote:
> > > On Thu, Jan 26, 2023 at 01:17:49PM +0100, Jonas Oberhauser wrote:
> > > > [...]
> > > > Note that this interpretation is analogous to the promise of smp_mb__after_unlock_lock(), which says that an
> > > > UNLOCK+LOCK pair act as a full fence: here the read-side unlock+gp act as a
> > > > full memory barrier.
> > > Good point that the existing smp_mb__after_unlock_lock() can be used for
> > > any use cases relying on the more literal interpretation of this promise.
> > > We already have the work-around! ;-)
> >
> > Can it? I meant that the less-literal form is similar to the one given by
> > smp_mb__after_unlock_lock().

Apologies, missed this on the first go...

I suppose that you could have a situation where the grace period ended
between the srcu_read_unlock() and the smp_mb__after_unlock_lock(),
but how would software detect that?

Thanx, Paul

> > > > [...] I suppose you might be able to write
> > > > some absurd client that inspects every store of the reader thread and sees
> > > > that there is no line in the reader side code that acts like a full fence.
> > > > But it would take a lot of effort to discern this.
> > > The usual litmus test is shown at the end of this email [...]
> > > > [...] I hope few people would have this unhealthy idea. But you
> > > > never know.
> > > Given that the more literal interpretation is not unreasonable, we should
> > > assume that someone somewhere might have interpreted it that way.
> > >
> > > But I agree that the odds of someone actually relying on this are low,
> > > and any such use case can be fixed with smp_mb__before_srcu_read_unlock(),
> > > similar to smp_mb__after_srcu_read_unlock() that you note is already in use.
> > >
> > > It would still be good to scan SRCU use for this sort of pattern, maybe
> > > manually, maybe via something like coccinelle. Alternatively, I could
> > > post on my blog (with right of first refusal to LWN and you guys as
> > > co-authors) telling the community of our intent to change this and see
> > > what people say. Probably both rather than either/or.
> > >
> > > Thoughts?
> >
> > My first thought is "there is a 'usual' litmus test for this?" :D
> > But yes, the test you have given has at least the same structure as what I
> > would expect.
>
> Exactly! ;-)
>
> > Communicating this with the community sounds very reasonable.
> >
> > For some automated combing, I'm really not sure what pattern to look for.
> > I'm afraid someone with a lot of time might have to look (semi-)manually.
>
> Please continue giving it some thought. The number of srcu_read_unlock()
> calls in v6.1 is about 250, which is within the realm of manual
> inspection, but it is all too easy to miss something.
>
> Thanx, Paul
>
> > Best wishes, jonas
> >
> >
> > >
> > > Thanx, Paul
> > >
> > > ------------------------------------------------------------------------
> > >
> > > C C-srcu-observed-6
> > >
> > > (*
> > > * Result: Sometimes
> > > *
> > > * The result is Never if any of the smp_mb() calls is uncommented.
> > > *)
> > >
> > > {}
> > >
> > > P0(int *a, int *b, int *c, int *d, struct srcu_struct *s)
> > > {
> > > int r1;
> > > int r2;
> > > int r3;
> > > int r4;
> > >
> > > r1 = srcu_read_lock(s);
> > > WRITE_ONCE(*b, 2);
> > > r2 = READ_ONCE(*a);
> > > // smp_mb();
> > > srcu_read_unlock(s, r1);
> > > // smp_mb();
> > > r3 = READ_ONCE(*c);
> > > // smp_mb();
> > > r4 = READ_ONCE(*d);
> > > }
> > >
> > > P1(int *a, int *b, int *c, int *d, struct srcu_struct *s)
> > > {
> > > WRITE_ONCE(*b, 1);
> > > synchronize_srcu(s);
> > > WRITE_ONCE(*c, 1);
> > > }
> > >
> > > P2(int *a, int *b, int *c, int *d, struct srcu_struct *s)
> > > {
> > > WRITE_ONCE(*d, 1);
> > > smp_mb();
> > > WRITE_ONCE(*a, 1);
> > > }
> > >
> > > exists (0:r2=1 /\ 0:r3=1 /\ 0:r4=0 /\ b=1)
> >