2023-02-04 00:49:00

by Paul E. McKenney

Subject: Current LKMM patch disposition

Hello!

Here is what I currently have for LKMM patches:

289e1c89217d4 ("locking/memory-barriers.txt: Improve documentation for writel() example")
ebd50e2947de9 ("tools: memory-model: Add rmw-sequences to the LKMM")
aae0c8a50d6d3 ("Documentation: Fixed a typo in atomic_t.txt")
9ba7d3b3b826e ("tools: memory-model: Make plain accesses carry dependencies")

Queued for the upcoming (v6.3) merge window.

c7637e2a8a27 ("tools/memory-model: Update some warning labels")
7862199d4df2 ("tools/memory-model: Unify UNLOCK+LOCK pairings to po-unlock-lock-")

Are ready for the next (v6.4) merge window. If there is some
reason that they should instead go into v6.3, please let us
all know.

a6cd5214b5ba ("tools/memory-model: Document LKMM test procedure")

This goes onto the lkmm-dev pile because it is documenting how
to use those scripts.

https://lore.kernel.org/lkml/[email protected]/
https://lore.kernel.org/lkml/[email protected]
https://lore.kernel.org/lkml/[email protected]/
5d871b280e7f ("tools/memory-model: Add smp_mb__after_srcu_read_unlock()")

These need review and perhaps further adjustment.

So, am I missing any? Are there any that need to be redirected?

Thanx, Paul


2023-02-04 01:28:41

by Alan Stern

Subject: Re: Current LKMM patch disposition

On Fri, Feb 03, 2023 at 04:48:43PM -0800, Paul E. McKenney wrote:
> Hello!
>
> Here is what I currently have for LKMM patches:
>
> 289e1c89217d4 ("locking/memory-barriers.txt: Improve documentation for writel() example")
> ebd50e2947de9 ("tools: memory-model: Add rmw-sequences to the LKMM")
> aae0c8a50d6d3 ("Documentation: Fixed a typo in atomic_t.txt")
> 9ba7d3b3b826e ("tools: memory-model: Make plain accesses carry dependencies")
>
> Queued for the upcoming (v6.3) merge window.
>
> c7637e2a8a27 ("tools/memory-model: Update some warning labels")
> 7862199d4df2 ("tools/memory-model: Unify UNLOCK+LOCK pairings to po-unlock-lock-")
>
> Are ready for the next (v6.4) merge window. If there is some
> reason that they should instead go into v6.3, please let us
> all know.
>
> a6cd5214b5ba ("tools/memory-model: Document LKMM test procedure")
>
> This goes onto the lkmm-dev pile because it is documenting how
> to use those scripts.
>
> https://lore.kernel.org/lkml/[email protected]/
> https://lore.kernel.org/lkml/[email protected]
> https://lore.kernel.org/lkml/[email protected]/
> 5d871b280e7f ("tools/memory-model: Add smp_mb__after_srcu_read_unlock()")
>
> These need review and perhaps further adjustment.
>
> So, am I missing any? Are there any that need to be redirected?

The "Provide exact semantics for SRCU" patch should have:

Portions suggested by Boqun Feng and Jonas Oberhauser.

added at the end, together with your Reported-by: tag. With that, I
think it can be queued for 6.4.

Alan

2023-02-04 01:49:51

by Paul E. McKenney

Subject: Re: Current LKMM patch disposition

On Fri, Feb 03, 2023 at 08:28:35PM -0500, Alan Stern wrote:
> On Fri, Feb 03, 2023 at 04:48:43PM -0800, Paul E. McKenney wrote:
> > Hello!
> >
> > Here is what I currently have for LKMM patches:
> >
> > 289e1c89217d4 ("locking/memory-barriers.txt: Improve documentation for writel() example")
> > ebd50e2947de9 ("tools: memory-model: Add rmw-sequences to the LKMM")
> > aae0c8a50d6d3 ("Documentation: Fixed a typo in atomic_t.txt")
> > 9ba7d3b3b826e ("tools: memory-model: Make plain accesses carry dependencies")
> >
> > Queued for the upcoming (v6.3) merge window.
> >
> > c7637e2a8a27 ("tools/memory-model: Update some warning labels")
> > 7862199d4df2 ("tools/memory-model: Unify UNLOCK+LOCK pairings to po-unlock-lock-")
> >
> > Are ready for the next (v6.4) merge window. If there is some
> > reason that they should instead go into v6.3, please let us
> > all know.
> >
> > a6cd5214b5ba ("tools/memory-model: Document LKMM test procedure")
> >
> > This goes onto the lkmm-dev pile because it is documenting how
> > to use those scripts.
> >
> > https://lore.kernel.org/lkml/[email protected]/
> > https://lore.kernel.org/lkml/[email protected]
> > https://lore.kernel.org/lkml/[email protected]/
> > 5d871b280e7f ("tools/memory-model: Add smp_mb__after_srcu_read_unlock()")
> >
> > These need review and perhaps further adjustment.
> >
> > So, am I missing any? Are there any that need to be redirected?
>
> The "Provide exact semantics for SRCU" patch should have:
>
> Portions suggested by Boqun Feng and Jonas Oberhauser.
>
> added at the end, together with your Reported-by: tag. With that, I
> think it can be queued for 6.4.

Thank you! Does the patch shown below work for you?

(I have tentatively queued this, but can easily adjust or replace it.)

Thanx, Paul

------------------------------------------------------------------------

tools/memory-model: Provide exact SRCU semantics

LKMM has long provided only approximate handling of SRCU read-side
critical sections. This has not been a pressing problem because LKMM's
traditional handling is correct for the common cases of non-overlapping
and properly nested critical sections. However, LKMM's traditional
handling of partially overlapping critical sections incorrectly fuses
them into one large critical section.

For example, consider the following litmus test:

------------------------------------------------------------------------

C C-srcu-nest-5

(*
* Result: Sometimes
*
* This demonstrates non-nested overlapping of SRCU read-side critical
* sections. Unlike RCU, SRCU critical sections do not unconditionally
* nest.
*)

{}

P0(int *x, int *y, struct srcu_struct *s1)
{
	int r1;
	int r2;
	int r3;
	int r4;

	r3 = srcu_read_lock(s1);
	r2 = READ_ONCE(*y);
	r4 = srcu_read_lock(s1);
	srcu_read_unlock(s1, r3);
	r1 = READ_ONCE(*x);
	srcu_read_unlock(s1, r4);
}

P1(int *x, int *y, struct srcu_struct *s1)
{
	WRITE_ONCE(*y, 1);
	synchronize_srcu(s1);
	WRITE_ONCE(*x, 1);
}

locations [0:r1]
exists (0:r1=1 /\ 0:r2=0)

------------------------------------------------------------------------

Current mainline incorrectly flattens the two critical sections into
one larger critical section, giving "Never" instead of the correct
"Sometimes":

------------------------------------------------------------------------

$ herd7 -conf linux-kernel.cfg C-srcu-nest-5.litmus
Test C-srcu-nest-5 Allowed
States 3
0:r1=0; 0:r2=0;
0:r1=0; 0:r2=1;
0:r1=1; 0:r2=1;
No
Witnesses
Positive: 0 Negative: 3
Flag srcu-bad-nesting
Condition exists (0:r1=1 /\ 0:r2=0)
Observation C-srcu-nest-5 Never 0 3
Time C-srcu-nest-5 0.01
Hash=e692c106cf3e84e20f12991dc438ff1b

------------------------------------------------------------------------

To its credit, it does complain about bad nesting. But with this
commit we get the following result, which has the virtue of being
correct:

------------------------------------------------------------------------

$ herd7 -conf linux-kernel.cfg C-srcu-nest-5.litmus
Test C-srcu-nest-5 Allowed
States 4
0:r1=0; 0:r2=0;
0:r1=0; 0:r2=1;
0:r1=1; 0:r2=0;
0:r1=1; 0:r2=1;
Ok
Witnesses
Positive: 1 Negative: 3
Condition exists (0:r1=1 /\ 0:r2=0)
Observation C-srcu-nest-5 Sometimes 1 3
Time C-srcu-nest-5 0.05
Hash=e692c106cf3e84e20f12991dc438ff1b

------------------------------------------------------------------------

In addition, there are new srcu_down_read() and srcu_up_read()
functions on their way to mainline. Roughly speaking, these are to
srcu_read_lock() and srcu_read_unlock() as down() and up() are to
mutex_lock() and mutex_unlock(). The key point is that
srcu_down_read() can execute in one process and the matching
srcu_up_read() in another, as shown in this litmus test:

------------------------------------------------------------------------

C C-srcu-nest-6

(*
* Result: Never
*
* This would be valid for srcu_down_read() and srcu_up_read().
*)

{}

P0(int *x, int *y, struct srcu_struct *s1, int *idx, int *f)
{
	int r2;
	int r3;

	r3 = srcu_down_read(s1);
	WRITE_ONCE(*idx, r3);
	r2 = READ_ONCE(*y);
	smp_store_release(f, 1);
}

P1(int *x, int *y, struct srcu_struct *s1, int *idx, int *f)
{
	int r1;
	int r3;
	int r4;

	r4 = smp_load_acquire(f);
	r1 = READ_ONCE(*x);
	r3 = READ_ONCE(*idx);
	srcu_up_read(s1, r3);
}

P2(int *x, int *y, struct srcu_struct *s1)
{
	WRITE_ONCE(*y, 1);
	synchronize_srcu(s1);
	WRITE_ONCE(*x, 1);
}

locations [0:r1]
filter (1:r4=1)
exists (1:r1=1 /\ 0:r2=0)

------------------------------------------------------------------------

When run on current mainline, this litmus test gets a complaint about
an unknown macro srcu_down_read(). With this commit:

------------------------------------------------------------------------

herd7 -conf linux-kernel.cfg C-srcu-nest-6.litmus
Test C-srcu-nest-6 Allowed
States 3
0:r1=0; 0:r2=0; 1:r1=0;
0:r1=0; 0:r2=1; 1:r1=0;
0:r1=0; 0:r2=1; 1:r1=1;
No
Witnesses
Positive: 0 Negative: 3
Condition exists (1:r1=1 /\ 0:r2=0)
Observation C-srcu-nest-6 Never 0 3
Time C-srcu-nest-6 0.02
Hash=c1f20257d052ca5e899be508bedcb2a1

------------------------------------------------------------------------

Note that the user must supply the flag "f" and the "filter" clause,
similar to what must be done to emulate call_rcu().
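
In kernel code, the cross-task handoff that C-srcu-nest-6 models might look something like the following sketch. This is hypothetical and not part of the patch; aside from srcu_down_read() and srcu_up_read() themselves, the struct and function names here are made up for illustration:

```c
/*
 * Hypothetical sketch of the handoff modeled by C-srcu-nest-6:
 * srcu_down_read() runs in one task and the matching srcu_up_read()
 * in another, with the returned index carrying the pairing between
 * them, as the "idx" variable does in the litmus test.
 */
struct srcu_handoff {
	struct srcu_struct *ssp;
	int idx;
};

static void begin_reader(struct srcu_handoff *h)
{
	h->idx = srcu_down_read(h->ssp);	/* like P0's srcu_down_read() */
	/* ... SRCU-protected reads, then publish h with release semantics ... */
}

/* Possibly runs in a different task, like P1 in the litmus test. */
static void end_reader(struct srcu_handoff *h)
{
	/* ... acquire h from the first task ... */
	srcu_up_read(h->ssp, h->idx);		/* like P1's srcu_up_read() */
}
```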

The commit works by treating srcu_read_lock()/srcu_down_read() as
loads and srcu_read_unlock()/srcu_up_read() as stores. This allows us
to determine which unlock matches which lock by looking for a data
dependency between them. In order for this to work properly, the data
dependencies have to be tracked through stores to intermediate
variables such as "idx" in the litmus test above; this is handled by
the new carry-srcu-data relation. But it's important here (and in the
existing carry-dep relation) to avoid tracking the dependencies
through SRCU unlock stores. Otherwise, in situations resembling:

	A: r1 = srcu_read_lock(s);
	B: srcu_read_unlock(s, r1);
	C: r2 = srcu_read_lock(s);
	D: srcu_read_unlock(s, r2);

it would look as if D was dependent on both A and C, because "s" would
appear to be an intermediate variable written by B and read by C.
This explains the complications in the definitions of carry-srcu-dep
and carry-dep.

As a debugging aid, the commit adds a check for errors in which the
value returned by one call to srcu_read_lock()/srcu_down_read() is
passed to more than one instance of srcu_read_unlock()/srcu_up_read().
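
For instance, a litmus test along the following lines (hypothetical, not included in this patch) should now trigger the new multiple-srcu-matches flag, because both unlock stores carry a data dependency from the same lock load:

```
C C-srcu-double-unlock

(*
 * Hypothetical example: the index from a single srcu_read_lock()
 * is passed to two srcu_read_unlock() calls, so the one lock event
 * matches more than one unlock event.
 *)

{}

P0(int *x, struct srcu_struct *s1)
{
	int r1;

	r1 = srcu_read_lock(s1);
	WRITE_ONCE(*x, 1);
	srcu_read_unlock(s1, r1);
	srcu_read_unlock(s1, r1);
}

exists (x=1)
```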

Finally, since these SRCU-related primitives are now treated as
ordinary reads and writes, we have to add them into the lists of
marked accesses (i.e., not subject to data races) and lock-related
accesses (i.e., one shouldn't try to access an srcu_struct with a
non-lock-related primitive such as READ_ONCE() or a plain write).
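
As a hypothetical illustration (again, not a test from this patch, and untested), an ordinary access to the srcu_struct is just the sort of thing the extended mixed-lock-accesses check is meant to catch:

```
C C-srcu-plain-access

(*
 * Hypothetical example: s1 is accessed both by SRCU primitives and
 * by a non-lock-related READ_ONCE(), which the extended
 * mixed-lock-accesses flag in lock.cat is intended to reject.
 *)

{}

P0(struct srcu_struct *s1)
{
	int r1;
	int r2;

	r1 = srcu_read_lock(s1);
	r2 = READ_ONCE(*s1);
	srcu_read_unlock(s1, r1);
}

exists (0:r2=0)
```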

Portions of this approach were suggested by Boqun Feng and Jonas
Oberhauser.

[ paulmck: Fix space-before-tab whitespace nit. ]

Reported-by: Paul E. McKenney <[email protected]>
Signed-off-by: Alan Stern <[email protected]>

---

tools/memory-model/linux-kernel.bell | 17 +++++------------
tools/memory-model/linux-kernel.def | 6 ++++--
tools/memory-model/lock.cat | 6 +++---
3 files changed, 12 insertions(+), 17 deletions(-)

Index: usb-devel/tools/memory-model/linux-kernel.bell
===================================================================
--- usb-devel.orig/tools/memory-model/linux-kernel.bell
+++ usb-devel/tools/memory-model/linux-kernel.bell
@@ -57,20 +57,13 @@ flag ~empty Rcu-lock \ domain(rcu-rscs)
flag ~empty Rcu-unlock \ range(rcu-rscs) as unmatched-rcu-unlock

(* Compute matching pairs of nested Srcu-lock and Srcu-unlock *)
-let srcu-rscs = let rec
-	    unmatched-locks = Srcu-lock \ domain(matched)
-	and unmatched-unlocks = Srcu-unlock \ range(matched)
-	and unmatched = unmatched-locks | unmatched-unlocks
-	and unmatched-po = ([unmatched] ; po ; [unmatched]) & loc
-	and unmatched-locks-to-unlocks =
-		([unmatched-locks] ; po ; [unmatched-unlocks]) & loc
-	and matched = matched | (unmatched-locks-to-unlocks \
-		(unmatched-po ; unmatched-po))
-	in matched
+let carry-srcu-data = (data ; [~ Srcu-unlock] ; rf)*
+let srcu-rscs = ([Srcu-lock] ; carry-srcu-data ; data ; [Srcu-unlock]) & loc

(* Validate nesting *)
flag ~empty Srcu-lock \ domain(srcu-rscs) as unmatched-srcu-lock
flag ~empty Srcu-unlock \ range(srcu-rscs) as unmatched-srcu-unlock
+flag ~empty (srcu-rscs^-1 ; srcu-rscs) \ id as multiple-srcu-matches

(* Check for use of synchronize_srcu() inside an RCU critical section *)
flag ~empty rcu-rscs & (po ; [Sync-srcu] ; po) as invalid-sleep
@@ -80,11 +73,11 @@ flag ~empty different-values(srcu-rscs)

(* Compute marked and plain memory accesses *)
let Marked = (~M) | IW | Once | Release | Acquire | domain(rmw) | range(rmw) |
-	LKR | LKW | UL | LF | RL | RU
+	LKR | LKW | UL | LF | RL | RU | Srcu-lock | Srcu-unlock
let Plain = M \ Marked

(* Redefine dependencies to include those carried through plain accesses *)
-let carry-dep = (data ; rfi)*
+let carry-dep = (data ; [~ Srcu-unlock] ; rfi)*
let addr = carry-dep ; addr
let ctrl = carry-dep ; ctrl
let data = carry-dep ; data
Index: usb-devel/tools/memory-model/linux-kernel.def
===================================================================
--- usb-devel.orig/tools/memory-model/linux-kernel.def
+++ usb-devel/tools/memory-model/linux-kernel.def
@@ -49,8 +49,10 @@ synchronize_rcu() { __fence{sync-rcu}; }
synchronize_rcu_expedited() { __fence{sync-rcu}; }

// SRCU
-srcu_read_lock(X) __srcu{srcu-lock}(X)
-srcu_read_unlock(X,Y) { __srcu{srcu-unlock}(X,Y); }
+srcu_read_lock(X) __load{srcu-lock}(*X)
+srcu_read_unlock(X,Y) { __store{srcu-unlock}(*X,Y); }
+srcu_down_read(X) __load{srcu-lock}(*X)
+srcu_up_read(X,Y) { __store{srcu-unlock}(*X,Y); }
synchronize_srcu(X) { __srcu{sync-srcu}(X); }
synchronize_srcu_expedited(X) { __srcu{sync-srcu}(X); }

Index: usb-devel/tools/memory-model/lock.cat
===================================================================
--- usb-devel.orig/tools/memory-model/lock.cat
+++ usb-devel/tools/memory-model/lock.cat
@@ -36,9 +36,9 @@ let RU = try RU with emptyset
(* Treat RL as a kind of LF: a read with no ordering properties *)
let LF = LF | RL

-(* There should be no ordinary R or W accesses to spinlocks *)
-let ALL-LOCKS = LKR | LKW | UL | LF | RU
-flag ~empty [M \ IW] ; loc ; [ALL-LOCKS] as mixed-lock-accesses
+(* There should be no ordinary R or W accesses to spinlocks or SRCU structs *)
+let ALL-LOCKS = LKR | LKW | UL | LF | RU | Srcu-lock | Srcu-unlock | Sync-srcu
+flag ~empty [M \ IW \ ALL-LOCKS] ; loc ; [ALL-LOCKS] as mixed-lock-accesses

(* Link Lock-Reads to their RMW-partner Lock-Writes *)
let lk-rmw = ([LKR] ; po-loc ; [LKW]) \ (po ; po)


2023-02-04 14:58:21

by Alan Stern

Subject: Re: Current LKMM patch disposition

On Fri, Feb 03, 2023 at 05:49:41PM -0800, Paul E. McKenney wrote:
> On Fri, Feb 03, 2023 at 08:28:35PM -0500, Alan Stern wrote:
> > On Fri, Feb 03, 2023 at 04:48:43PM -0800, Paul E. McKenney wrote:
> > > Hello!
> > >
> > > Here is what I currently have for LKMM patches:
> > >
> > > 289e1c89217d4 ("locking/memory-barriers.txt: Improve documentation for writel() example")
> > > ebd50e2947de9 ("tools: memory-model: Add rmw-sequences to the LKMM")
> > > aae0c8a50d6d3 ("Documentation: Fixed a typo in atomic_t.txt")
> > > 9ba7d3b3b826e ("tools: memory-model: Make plain accesses carry dependencies")
> > >
> > > Queued for the upcoming (v6.3) merge window.
> > >
> > > c7637e2a8a27 ("tools/memory-model: Update some warning labels")
> > > 7862199d4df2 ("tools/memory-model: Unify UNLOCK+LOCK pairings to po-unlock-lock-")
> > >
> > > Are ready for the next (v6.4) merge window. If there is some
> > > reason that they should instead go into v6.3, please let us
> > > all know.
> > >
> > > a6cd5214b5ba ("tools/memory-model: Document LKMM test procedure")
> > >
> > > This goes onto the lkmm-dev pile because it is documenting how
> > > to use those scripts.
> > >
> > > https://lore.kernel.org/lkml/[email protected]/
> > > https://lore.kernel.org/lkml/[email protected]
> > > https://lore.kernel.org/lkml/[email protected]/
> > > 5d871b280e7f ("tools/memory-model: Add smp_mb__after_srcu_read_unlock()")
> > >
> > > These need review and perhaps further adjustment.
> > >
> > > So, am I missing any? Are there any that need to be redirected?
> >
> > The "Provide exact semantics for SRCU" patch should have:
> >
> > Portions suggested by Boqun Feng and Jonas Oberhauser.
> >
> > added at the end, together with your Reported-by: tag. With that, I
> > think it can be queued for 6.4.
>
> Thank you! Does the patch shown below work for you?
>
> (I have tentatively queued this, but can easily adjust or replace it.)

It looks fine.

Alan

2023-02-04 22:24:19

by Paul E. McKenney

Subject: Re: Current LKMM patch disposition

On Sat, Feb 04, 2023 at 09:58:12AM -0500, Alan Stern wrote:
> On Fri, Feb 03, 2023 at 05:49:41PM -0800, Paul E. McKenney wrote:
> > On Fri, Feb 03, 2023 at 08:28:35PM -0500, Alan Stern wrote:
> > > On Fri, Feb 03, 2023 at 04:48:43PM -0800, Paul E. McKenney wrote:
> > > > Hello!
> > > >
> > > > Here is what I currently have for LKMM patches:
> > > >
> > > > 289e1c89217d4 ("locking/memory-barriers.txt: Improve documentation for writel() example")
> > > > ebd50e2947de9 ("tools: memory-model: Add rmw-sequences to the LKMM")
> > > > aae0c8a50d6d3 ("Documentation: Fixed a typo in atomic_t.txt")
> > > > 9ba7d3b3b826e ("tools: memory-model: Make plain accesses carry dependencies")
> > > >
> > > > Queued for the upcoming (v6.3) merge window.
> > > >
> > > > c7637e2a8a27 ("tools/memory-model: Update some warning labels")
> > > > 7862199d4df2 ("tools/memory-model: Unify UNLOCK+LOCK pairings to po-unlock-lock-")
> > > >
> > > > Are ready for the next (v6.4) merge window. If there is some
> > > > reason that they should instead go into v6.3, please let us
> > > > all know.
> > > >
> > > > a6cd5214b5ba ("tools/memory-model: Document LKMM test procedure")
> > > >
> > > > This goes onto the lkmm-dev pile because it is documenting how
> > > > to use those scripts.
> > > >
> > > > https://lore.kernel.org/lkml/[email protected]/
> > > > https://lore.kernel.org/lkml/[email protected]
> > > > https://lore.kernel.org/lkml/[email protected]/
> > > > 5d871b280e7f ("tools/memory-model: Add smp_mb__after_srcu_read_unlock()")
> > > >
> > > > These need review and perhaps further adjustment.
> > > >
> > > > So, am I missing any? Are there any that need to be redirected?
> > >
> > > The "Provide exact semantics for SRCU" patch should have:
> > >
> > > Portions suggested by Boqun Feng and Jonas Oberhauser.
> > >
> > > added at the end, together with your Reported-by: tag. With that, I
> > > think it can be queued for 6.4.
> >
> > Thank you! Does the patch shown below work for you?
> >
> > (I have tentatively queued this, but can easily adjust or replace it.)
>
> It looks fine.

Very good, thank you for looking it over! I pushed it out on branch
stern.2023.02.04a.

Would anyone like to ack/review/whatever this one?

Thanx, Paul

2023-02-05 14:10:39

by Joel Fernandes

Subject: Re: Current LKMM patch disposition

On Sat, Feb 04, 2023 at 02:24:11PM -0800, Paul E. McKenney wrote:
> On Sat, Feb 04, 2023 at 09:58:12AM -0500, Alan Stern wrote:
> > On Fri, Feb 03, 2023 at 05:49:41PM -0800, Paul E. McKenney wrote:
> > > On Fri, Feb 03, 2023 at 08:28:35PM -0500, Alan Stern wrote:
> > > > On Fri, Feb 03, 2023 at 04:48:43PM -0800, Paul E. McKenney wrote:
> > > > > Hello!
> > > > >
> > > > > Here is what I currently have for LKMM patches:
> > > > >
> > > > > 289e1c89217d4 ("locking/memory-barriers.txt: Improve documentation for writel() example")
> > > > > ebd50e2947de9 ("tools: memory-model: Add rmw-sequences to the LKMM")
> > > > > aae0c8a50d6d3 ("Documentation: Fixed a typo in atomic_t.txt")
> > > > > 9ba7d3b3b826e ("tools: memory-model: Make plain accesses carry dependencies")
> > > > >
> > > > > Queued for the upcoming (v6.3) merge window.
> > > > >
> > > > > c7637e2a8a27 ("tools/memory-model: Update some warning labels")
> > > > > 7862199d4df2 ("tools/memory-model: Unify UNLOCK+LOCK pairings to po-unlock-lock-")
> > > > >
> > > > > Are ready for the next (v6.4) merge window. If there is some
> > > > > reason that they should instead go into v6.3, please let us
> > > > > all know.
> > > > >
> > > > > a6cd5214b5ba ("tools/memory-model: Document LKMM test procedure")
> > > > >
> > > > > This goes onto the lkmm-dev pile because it is documenting how
> > > > > to use those scripts.
> > > > >
> > > > > https://lore.kernel.org/lkml/[email protected]/
> > > > > https://lore.kernel.org/lkml/[email protected]
> > > > > https://lore.kernel.org/lkml/[email protected]/
> > > > > 5d871b280e7f ("tools/memory-model: Add smp_mb__after_srcu_read_unlock()")
> > > > >
> > > > > These need review and perhaps further adjustment.
> > > > >
> > > > > So, am I missing any? Are there any that need to be redirected?
> > > >
> > > > The "Provide exact semantics for SRCU" patch should have:
> > > >
> > > > Portions suggested by Boqun Feng and Jonas Oberhauser.
> > > >
> > > > added at the end, together with your Reported-by: tag. With that, I
> > > > think it can be queued for 6.4.
> > >
> > > Thank you! Does the patch shown below work for you?
> > >
> > > (I have tentatively queued this, but can easily adjust or replace it.)
> >
> > It looks fine.
>
> Very good, thank you for looking it over! I pushed it out on branch
> stern.2023.02.04a.
>
> Would anyone like to ack/review/whatever this one?

Would it be possible to add comments, something like the following? Apologies
if it is missing some ideas. I will try to improve it later.

thanks!

- Joel

---8<-----------------------

diff --git a/tools/memory-model/linux-kernel.bell b/tools/memory-model/linux-kernel.bell
index ce068700939c..0a16177339bc 100644
--- a/tools/memory-model/linux-kernel.bell
+++ b/tools/memory-model/linux-kernel.bell
@@ -57,7 +57,23 @@ let rcu-rscs = let rec
flag ~empty Rcu-lock \ domain(rcu-rscs) as unmatched-rcu-lock
flag ~empty Rcu-unlock \ range(rcu-rscs) as unmatched-rcu-unlock

+(***************************************************************)
(* Compute matching pairs of nested Srcu-lock and Srcu-unlock *)
+(***************************************************************)
+(*
+ * carry-srcu-data: Handles an SRCU critical section split across CPUs
+ * (say CPU0 and CPU1), where "idx" communicates the SRCU index between
+ * them: a data dependency runs from the R[srcu-lock] to the
+ * W[once][idx] on CPU0, which is sequenced with the ->rf edge from that
+ * W[once][idx] to the R[once][idx] on CPU1.  carry-srcu-data excludes
+ * Srcu-unlock events so as not to capture accesses across back-to-back
+ * SRCU read-side critical sections.
+ *
+ * srcu-rscs: Putting everything together, carry-srcu-data is sequenced
+ * with a data relation (the data dependency from the R[once][idx] on
+ * CPU1 to the srcu-unlock store), and the intersection with loc
+ * restricts each pairing to a single srcu_struct.
+ *)
let carry-srcu-data = (data ; [~ Srcu-unlock] ; rf)*
let srcu-rscs = ([Srcu-lock] ; carry-srcu-data ; data ; [Srcu-unlock]) & loc


2023-02-06 18:39:12

by Alan Stern

[permalink] [raw]
Subject: Re: Current LKMM patch disposition

On Sun, Feb 05, 2023 at 02:10:29PM +0000, Joel Fernandes wrote:
> On Sat, Feb 04, 2023 at 02:24:11PM -0800, Paul E. McKenney wrote:
> > On Sat, Feb 04, 2023 at 09:58:12AM -0500, Alan Stern wrote:
> > > On Fri, Feb 03, 2023 at 05:49:41PM -0800, Paul E. McKenney wrote:
> > > > On Fri, Feb 03, 2023 at 08:28:35PM -0500, Alan Stern wrote:
> > > > > The "Provide exact semantics for SRCU" patch should have:
> > > > >
> > > > > Portions suggested by Boqun Feng and Jonas Oberhauser.
> > > > >
> > > > > added at the end, together with your Reported-by: tag. With that, I
> > > > > think it can be queued for 6.4.
> > > >
> > > > Thank you! Does the patch shown below work for you?
> > > >
> > > > (I have tentatively queued this, but can easily adjust or replace it.)
> > >
> > > It looks fine.
> >
> > Very good, thank you for looking it over! I pushed it out on branch
> > stern.2023.02.04a.
> >
> > Would anyone like to ack/review/whatever this one?
>
> Would it be possible to add comments, something like the following? Apologies
> if it is missing some ideas. I will try to improve it later.
>
> thanks!
>
> - Joel
>
> ---8<-----------------------
>
> diff --git a/tools/memory-model/linux-kernel.bell b/tools/memory-model/linux-kernel.bell
> index ce068700939c..0a16177339bc 100644
> --- a/tools/memory-model/linux-kernel.bell
> +++ b/tools/memory-model/linux-kernel.bell
> @@ -57,7 +57,23 @@ let rcu-rscs = let rec
> flag ~empty Rcu-lock \ domain(rcu-rscs) as unmatched-rcu-lock
> flag ~empty Rcu-unlock \ range(rcu-rscs) as unmatched-rcu-unlock
>
> +(***************************************************************)
> (* Compute matching pairs of nested Srcu-lock and Srcu-unlock *)
> +(***************************************************************)
> +(*
> + * carry-srcu-data: To handle the case of the SRCU critical section split
> + * across CPUs, where the idx is used to communicate the SRCU index across CPUs
> + * (say CPU0 and CPU1), data is between the R[srcu-lock] to W[once][idx] on
> + * CPU0, which is sequenced with the ->rf is between the W[once][idx] and the
> + * R[once][idx] on CPU1. The carry-srcu-data is made to exclude Srcu-unlock
> + * events to prevent capturing accesses across back-to-back SRCU read-side
> + * critical sections.
> + *
> + * srcu-rscs: Putting everything together, the carry-srcu-data is sequenced with
> + * a data relation, which is the data dependency between R[once][idx] on CPU1
> + * and the srcu-unlock store, and loc ensures the relation is unique for a
> + * specific lock.
> + *)
> let carry-srcu-data = (data ; [~ Srcu-unlock] ; rf)*
> let srcu-rscs = ([Srcu-lock] ; carry-srcu-data ; data ; [Srcu-unlock]) & loc

My tendency has been to keep comments in the herd7 files to a minimum
and to put more extended descriptions in the explanation.txt file.
Right now that file contains almost nothing (a single paragraph!) about
SRCU, so it needs to be updated to talk about the new definition of
srcu-rscs. In my opinion, that's where this sort of comment belongs.

Joel, would you like to write an extra paragraph or two for that file,
explaining in more detail how SRCU lock-to-unlock matching is different
from regular RCU and how the definition of the srcu-rscs relation works?
I'd be happy to edit anything you come up with.

Alan

PS: We also need to update the PLAIN ACCESSES AND DATA RACES section of
explanation.txt, to mention the carry-dep relation and why it is
important.

2023-02-06 20:20:07

by Jonas Oberhauser

Subject: Re: Current LKMM patch disposition



On 2/4/2023 2:49 AM, Paul E. McKenney wrote:
> On Fri, Feb 03, 2023 at 08:28:35PM -0500, Alan Stern wrote:
>> On Fri, Feb 03, 2023 at 04:48:43PM -0800, Paul E. McKenney wrote:
>>> Hello!
>>>
>>> Here is what I currently have for LKMM patches:
>>>
>>> 289e1c89217d4 ("locking/memory-barriers.txt: Improve documentation for writel() example")
>>> ebd50e2947de9 ("tools: memory-model: Add rmw-sequences to the LKMM")
>>> aae0c8a50d6d3 ("Documentation: Fixed a typo in atomic_t.txt")
>>> 9ba7d3b3b826e ("tools: memory-model: Make plain accesses carry dependencies")
>>>
>>> Queued for the upcoming (v6.3) merge window.
>>>
>>> c7637e2a8a27 ("tools/memory-model: Update some warning labels")
>>> 7862199d4df2 ("tools/memory-model: Unify UNLOCK+LOCK pairings to po-unlock-lock-")
>>>
>>> Are ready for the next (v6.4) merge window. If there is some
>>> reason that they should instead go into v6.3, please let us
>>> all know.
>>>
>>> a6cd5214b5ba ("tools/memory-model: Document LKMM test procedure")
>>>
>>> This goes onto the lkmm-dev pile because it is documenting how
>>> to use those scripts.
>>>
>>> https://lore.kernel.org/lkml/[email protected]/
>>> https://lore.kernel.org/lkml/[email protected]
>>> https://lore.kernel.org/lkml/[email protected]/
>>> 5d871b280e7f ("tools/memory-model: Add smp_mb__after_srcu_read_unlock()")
>>>
>>> These need review and perhaps further adjustment.
>>>
>>> So, am I missing any? Are there any that need to be redirected?
>> The "Provide exact semantics for SRCU" patch should have:
>>
>> Portions suggested by Boqun Feng and Jonas Oberhauser.
>>
>> added at the end, together with your Reported-by: tag. With that, I
>> think it can be queued for 6.4.
> Thank you! Does the patch shown below work for you?
>
> (I have tentatively queued this, but can easily adjust or replace it.)
>
> Thanx, Paul
>
> ------------------------------------------------------------------------
>
> tools/memory-model: Provide exact SRCU semantics
>
> LKMM has long provided only approximate handling of SRCU read-side
> critical sections. This has not been a pressing problem because LKMM's
> traditional handling is correct for the common cases of non-overlapping
> and properly nested critical sections. However, LKMM's traditional
> handling of partially overlapping critical sections incorrectly fuses
> them into one large critical section.
>
> For example, consider the following litmus test:
>
> ------------------------------------------------------------------------
>
> C C-srcu-nest-5
>
> (*
> * Result: Sometimes
> *
> * This demonstrates non-nested overlapping of SRCU read-side critical
> * sections. Unlike RCU, SRCU critical sections do not unconditionally
> * nest.
> *)
>
> {}
>
> P0(int *x, int *y, struct srcu_struct *s1)
> {
> 	int r1;
> 	int r2;
> 	int r3;
> 	int r4;
>
> 	r3 = srcu_read_lock(s1);
> 	r2 = READ_ONCE(*y);
> 	r4 = srcu_read_lock(s1);
> 	srcu_read_unlock(s1, r3);
> 	r1 = READ_ONCE(*x);
> 	srcu_read_unlock(s1, r4);
> }
>
> P1(int *x, int *y, struct srcu_struct *s1)
> {
> 	WRITE_ONCE(*y, 1);
> 	synchronize_srcu(s1);
> 	WRITE_ONCE(*x, 1);
> }
>
> locations [0:r1]
> exists (0:r1=1 /\ 0:r2=0)
>
> ------------------------------------------------------------------------
>
> Current mainline incorrectly flattens the two critical sections into
> one larger critical section, giving "Never" instead of the correct
> "Sometimes":
>
> ------------------------------------------------------------------------
>
> $ herd7 -conf linux-kernel.cfg C-srcu-nest-5.litmus
> Test C-srcu-nest-5 Allowed
> States 3
> 0:r1=0; 0:r2=0;
> 0:r1=0; 0:r2=1;
> 0:r1=1; 0:r2=1;
> No
> Witnesses
> Positive: 0 Negative: 3
> Flag srcu-bad-nesting
> Condition exists (0:r1=1 /\ 0:r2=0)
> Observation C-srcu-nest-5 Never 0 3
> Time C-srcu-nest-5 0.01
> Hash=e692c106cf3e84e20f12991dc438ff1b
>
> ------------------------------------------------------------------------
>
> To its credit, it does complain about bad nesting. But with this
> commit we get the following result, which has the virtue of being
> correct:
>
> ------------------------------------------------------------------------
>
> $ herd7 -conf linux-kernel.cfg C-srcu-nest-5.litmus
> Test C-srcu-nest-5 Allowed
> States 4
> 0:r1=0; 0:r2=0;
> 0:r1=0; 0:r2=1;
> 0:r1=1; 0:r2=0;
> 0:r1=1; 0:r2=1;
> Ok
> Witnesses
> Positive: 1 Negative: 3
> Condition exists (0:r1=1 /\ 0:r2=0)
> Observation C-srcu-nest-5 Sometimes 1 3
> Time C-srcu-nest-5 0.05
> Hash=e692c106cf3e84e20f12991dc438ff1b
>
> ------------------------------------------------------------------------
>
> In addition, there are new srcu_down_read() and srcu_up_read()
> functions on their way to mainline. Roughly speaking, these are to
> srcu_read_lock() and srcu_read_unlock() as down() and up() are to
> mutex_lock() and mutex_unlock(). The key point is that
> srcu_down_read() can execute in one process and the matching
> srcu_up_read() in another, as shown in this litmus test:
>
> ------------------------------------------------------------------------
>
> C C-srcu-nest-6
>
> (*
> * Result: Never
> *
> * This would be valid for srcu_down_read() and srcu_up_read().
> *)
>
> {}
>
> P0(int *x, int *y, struct srcu_struct *s1, int *idx, int *f)
> {
> 	int r2;
> 	int r3;
>
> 	r3 = srcu_down_read(s1);
> 	WRITE_ONCE(*idx, r3);
> 	r2 = READ_ONCE(*y);
> 	smp_store_release(f, 1);
> }
>
> P1(int *x, int *y, struct srcu_struct *s1, int *idx, int *f)
> {
> int r1;
> int r3;
> int r4;
>
> r4 = smp_load_acquire(f);
> r1 = READ_ONCE(*x);
> r3 = READ_ONCE(*idx);
> srcu_up_read(s1, r3);
> }
>
> P2(int *x, int *y, struct srcu_struct *s1)
> {
> WRITE_ONCE(*y, 1);
> synchronize_srcu(s1);
> WRITE_ONCE(*x, 1);
> }
>
> locations [0:r1]
> filter (1:r4=1)
> exists (1:r1=1 /\ 0:r2=0)
>
> ------------------------------------------------------------------------
>
> When run on current mainline, this litmus test gets a complaint about
> an unknown macro srcu_down_read(). With this commit:
>
> ------------------------------------------------------------------------
>
> $ herd7 -conf linux-kernel.cfg C-srcu-nest-6.litmus
> Test C-srcu-nest-6 Allowed
> States 3
> 0:r1=0; 0:r2=0; 1:r1=0;
> 0:r1=0; 0:r2=1; 1:r1=0;
> 0:r1=0; 0:r2=1; 1:r1=1;
> No
> Witnesses
> Positive: 0 Negative: 3
> Condition exists (1:r1=1 /\ 0:r2=0)
> Observation C-srcu-nest-6 Never 0 3
> Time C-srcu-nest-6 0.02
> Hash=c1f20257d052ca5e899be508bedcb2a1
>
> ------------------------------------------------------------------------
>
> Note that the user must supply the flag "f" and the "filter" clause,
> similar to what must be done to emulate call_rcu().
>
> The commit works by treating srcu_read_lock()/srcu_down_read() as
> loads and srcu_read_unlock()/srcu_up_read() as stores. This allows us
> to determine which unlock matches which lock by looking for a data
> dependency between them. In order for this to work properly, the data
> dependencies have to be tracked through stores to intermediate
> variables such as "idx" in the litmus test above; this is handled by
> the new carry-srcu-data relation. But it's important here (and in the
> existing carry-dep relation) to avoid tracking the dependencies
> through SRCU unlock stores. Otherwise, in situations resembling:
>
> A: r1 = srcu_read_lock(s);
> B: srcu_read_unlock(s, r1);
> C: r2 = srcu_read_lock(s);
> D: srcu_read_unlock(s, r2);
>
> it would look as if D was dependent on both A and C, because "s" would
> appear to be an intermediate variable written by B and read by C.
> This explains the complications in the definitions of carry-srcu-data
> and carry-dep.
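The matching computation described above can be sketched as a toy relational model in Python (a hypothetical illustration with invented helper names, not herd7 code; events A..D are the four SRCU operations from the example):

```python
# Toy relational model of the srcu-rscs matching: treat locks as loads,
# unlocks as stores, and match them via carried data dependencies.

def compose(r1, r2):
    """Relational composition r1 ; r2."""
    return {(a, c) for (a, b) in r1 for (b2, c) in r2 if b == b2}

def star(r, events):
    """Reflexive-transitive closure r* (zero or more steps)."""
    closure = {(e, e) for e in events}
    while True:
        nxt = compose(closure, r) - closure
        if not nxt:
            return closure
        closure |= nxt

events = {"A", "B", "C", "D"}
locks, unlocks = {"A", "C"}, {"B", "D"}
data = {("A", "B"), ("C", "D")}   # each lock's value flows to its unlock
rf   = {("B", "C")}               # unlock store to "s" read by next lock

def srcu_rscs(restrict):
    # (data ; [~ Srcu-unlock] ; rf): when the restriction is applied,
    # the event joining data to rf must not be an unlock store.
    src = {(a, b) for (a, b) in data
           if not restrict or b not in unlocks}
    carry = star(compose(src, rf), events)
    return {(l, u) for (l, m) in carry for (m2, u) in data
            if m == m2 and l in locks and u in unlocks}

# Without the restriction, D spuriously matches A as well as C:
assert srcu_rscs(False) == {("A", "B"), ("C", "D"), ("A", "D")}
# With it, each unlock matches only its own lock:
assert srcu_rscs(True) == {("A", "B"), ("C", "D")}
```

The toy model reproduces the failure mode described in the commit message: the unlock store B feeds the next lock load C through "s", so without the `[~ Srcu-unlock]` filter the dependency chain falsely extends from A to D.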
>
> As a debugging aid, the commit adds a check for errors in which the
> value returned by one call to srcu_read_lock()/srcu_down_read() is
> passed to more than one instance of srcu_read_unlock()/srcu_up_read().
>
> Finally, since these SRCU-related primitives are now treated as
> ordinary reads and writes, we have to add them into the lists of
> marked accesses (i.e., not subject to data races) and lock-related
> accesses (i.e., one shouldn't try to access an srcu_struct with a
> non-lock-related primitive such as READ_ONCE() or a plain write).
>
> Portions of this approach were suggested by Boqun Feng and Jonas
> Oberhauser.
>
> [ paulmck: Fix space-before-tab whitespace nit. ]
>
> Reported-by: Paul E. McKenney <[email protected]>
> Signed-off-by: Alan Stern <[email protected]>

Reviewed-by: Jonas Oberhauser <[email protected]>

>
> ---
>
> tools/memory-model/linux-kernel.bell | 17 +++++------------
> tools/memory-model/linux-kernel.def | 6 ++++--
> tools/memory-model/lock.cat | 6 +++---
> 3 files changed, 12 insertions(+), 17 deletions(-)
>
> Index: usb-devel/tools/memory-model/linux-kernel.bell
> ===================================================================
> --- usb-devel.orig/tools/memory-model/linux-kernel.bell
> +++ usb-devel/tools/memory-model/linux-kernel.bell
> @@ -57,20 +57,13 @@ flag ~empty Rcu-lock \ domain(rcu-rscs)
> flag ~empty Rcu-unlock \ range(rcu-rscs) as unmatched-rcu-unlock
>
> (* Compute matching pairs of nested Srcu-lock and Srcu-unlock *)
> -let srcu-rscs = let rec
> - unmatched-locks = Srcu-lock \ domain(matched)
> - and unmatched-unlocks = Srcu-unlock \ range(matched)
> - and unmatched = unmatched-locks | unmatched-unlocks
> - and unmatched-po = ([unmatched] ; po ; [unmatched]) & loc
> - and unmatched-locks-to-unlocks =
> - ([unmatched-locks] ; po ; [unmatched-unlocks]) & loc
> - and matched = matched | (unmatched-locks-to-unlocks \
> - (unmatched-po ; unmatched-po))
> - in matched
> +let carry-srcu-data = (data ; [~ Srcu-unlock] ; rf)*
> +let srcu-rscs = ([Srcu-lock] ; carry-srcu-data ; data ; [Srcu-unlock]) & loc
>
> (* Validate nesting *)
> flag ~empty Srcu-lock \ domain(srcu-rscs) as unmatched-srcu-lock
> flag ~empty Srcu-unlock \ range(srcu-rscs) as unmatched-srcu-unlock
> +flag ~empty (srcu-rscs^-1 ; srcu-rscs) \ id as multiple-srcu-matches
>
> (* Check for use of synchronize_srcu() inside an RCU critical section *)
> flag ~empty rcu-rscs & (po ; [Sync-srcu] ; po) as invalid-sleep
> @@ -80,11 +73,11 @@ flag ~empty different-values(srcu-rscs)
>
> (* Compute marked and plain memory accesses *)
> let Marked = (~M) | IW | Once | Release | Acquire | domain(rmw) | range(rmw) |
> - LKR | LKW | UL | LF | RL | RU
> + LKR | LKW | UL | LF | RL | RU | Srcu-lock | Srcu-unlock
> let Plain = M \ Marked
>
> (* Redefine dependencies to include those carried through plain accesses *)
> -let carry-dep = (data ; rfi)*
> +let carry-dep = (data ; [~ Srcu-unlock] ; rfi)*
> let addr = carry-dep ; addr
> let ctrl = carry-dep ; ctrl
> let data = carry-dep ; data
> Index: usb-devel/tools/memory-model/linux-kernel.def
> ===================================================================
> --- usb-devel.orig/tools/memory-model/linux-kernel.def
> +++ usb-devel/tools/memory-model/linux-kernel.def
> @@ -49,8 +49,10 @@ synchronize_rcu() { __fence{sync-rcu}; }
> synchronize_rcu_expedited() { __fence{sync-rcu}; }
>
> // SRCU
> -srcu_read_lock(X) __srcu{srcu-lock}(X)
> -srcu_read_unlock(X,Y) { __srcu{srcu-unlock}(X,Y); }
> +srcu_read_lock(X) __load{srcu-lock}(*X)
> +srcu_read_unlock(X,Y) { __store{srcu-unlock}(*X,Y); }
> +srcu_down_read(X) __load{srcu-lock}(*X)
> +srcu_up_read(X,Y) { __store{srcu-unlock}(*X,Y); }
> synchronize_srcu(X) { __srcu{sync-srcu}(X); }
> synchronize_srcu_expedited(X) { __srcu{sync-srcu}(X); }
>
> Index: usb-devel/tools/memory-model/lock.cat
> ===================================================================
> --- usb-devel.orig/tools/memory-model/lock.cat
> +++ usb-devel/tools/memory-model/lock.cat
> @@ -36,9 +36,9 @@ let RU = try RU with emptyset
> (* Treat RL as a kind of LF: a read with no ordering properties *)
> let LF = LF | RL
>
> -(* There should be no ordinary R or W accesses to spinlocks *)
> -let ALL-LOCKS = LKR | LKW | UL | LF | RU
> -flag ~empty [M \ IW] ; loc ; [ALL-LOCKS] as mixed-lock-accesses
> +(* There should be no ordinary R or W accesses to spinlocks or SRCU structs *)
> +let ALL-LOCKS = LKR | LKW | UL | LF | RU | Srcu-lock | Srcu-unlock | Sync-srcu
> +flag ~empty [M \ IW \ ALL-LOCKS] ; loc ; [ALL-LOCKS] as mixed-lock-accesses
>
> (* Link Lock-Reads to their RMW-partner Lock-Writes *)
> let lk-rmw = ([LKR] ; po-loc ; [LKW]) \ (po ; po)


2023-02-06 20:21:40

by Jonas Oberhauser

[permalink] [raw]
Subject: Re: Current LKMM patch disposition



On 2/4/2023 1:48 AM, Paul E. McKenney wrote:
> [...]
> https://lore.kernel.org/lkml/[email protected]/
> 5d871b280e7f ("tools/memory-model: Add smp_mb__after_srcu_read_unlock()")
>
> These need review and perhaps further adjustment.
>
> So, am I missing any? Are there any that need to be redirected?

Didn't you schedule Joel's patch for 6.4?

Best wishes,
jonasx


2023-02-06 21:23:16

by Joel Fernandes

[permalink] [raw]
Subject: Re: Current LKMM patch disposition

On Mon, Feb 6, 2023 at 1:39 PM Alan Stern <[email protected]> wrote:
>
> On Sun, Feb 05, 2023 at 02:10:29PM +0000, Joel Fernandes wrote:
> > On Sat, Feb 04, 2023 at 02:24:11PM -0800, Paul E. McKenney wrote:
> > > On Sat, Feb 04, 2023 at 09:58:12AM -0500, Alan Stern wrote:
> > > > On Fri, Feb 03, 2023 at 05:49:41PM -0800, Paul E. McKenney wrote:
> > > > > On Fri, Feb 03, 2023 at 08:28:35PM -0500, Alan Stern wrote:
> > > > > > The "Provide exact semantics for SRCU" patch should have:
> > > > > >
> > > > > > Portions suggested by Boqun Feng and Jonas Oberhauser.
> > > > > >
> > > > > > added at the end, together with your Reported-by: tag. With that, I
> > > > > > think it can be queued for 6.4.
> > > > >
> > > > > Thank you! Does the patch shown below work for you?
> > > > >
> > > > > (I have tentatively queued this, but can easily adjust or replace it.)
> > > >
> > > > It looks fine.
> > >
> > > Very good, thank you for looking it over! I pushed it out on branch
> > > stern.2023.02.04a.
> > >
> > > Would anyone like to ack/review/whatever this one?
> >
> > Would it be possible to add comments, something like the following? Apologies
> > if it is missing some ideas. I will try to improve it later.
> >
> > thanks!
> >
> > - Joel
> >
> > ---8<-----------------------
> >
> > diff --git a/tools/memory-model/linux-kernel.bell b/tools/memory-model/linux-kernel.bell
> > index ce068700939c..0a16177339bc 100644
> > --- a/tools/memory-model/linux-kernel.bell
> > +++ b/tools/memory-model/linux-kernel.bell
> > @@ -57,7 +57,23 @@ let rcu-rscs = let rec
> > flag ~empty Rcu-lock \ domain(rcu-rscs) as unmatched-rcu-lock
> > flag ~empty Rcu-unlock \ range(rcu-rscs) as unmatched-rcu-unlock
> >
> > +(***************************************************************)
> > (* Compute matching pairs of nested Srcu-lock and Srcu-unlock *)
> > +(***************************************************************)
> > +(*
> > + * carry-srcu-data: To handle the case of the SRCU critical section split
> > + * across CPUs, where the idx is used to communicate the SRCU index across CPUs
> > + * (say CPU0 and CPU1), data is between the R[srcu-lock] to W[once][idx] on
> > + * CPU0, which is sequenced with the ->rf is between the W[once][idx] and the
> > + * R[once][idx] on CPU1. The carry-srcu-data is made to exclude Srcu-unlock
> > + * events to prevent capturing accesses across back-to-back SRCU read-side
> > + * critical sections.
> > + *
> > + * srcu-rscs: Putting everything together, the carry-srcu-data is sequenced with
> > + * a data relation, which is the data dependency between R[once][idx] on CPU1
> > + * and the srcu-unlock store, and loc ensures the relation is unique for a
> > + * specific lock.
> > + *)
> > let carry-srcu-data = (data ; [~ Srcu-unlock] ; rf)*
> > let srcu-rscs = ([Srcu-lock] ; carry-srcu-data ; data ; [Srcu-unlock]) & loc
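The cross-CPU case this comment describes can be checked with a small Python toy model (invented event names, not herd7 code), where the SRCU index travels through the plain variable "idx":

```python
# Toy check that carry-srcu-data lets a lock on one CPU match an
# unlock on another CPU via an intermediate store to "idx".

def compose(r1, r2):
    """Relational composition r1 ; r2."""
    return {(a, c) for (a, b) in r1 for (b2, c) in r2 if b == b2}

def star(r, events):
    """Reflexive-transitive closure r*."""
    closure = {(e, e) for e in events}
    while True:
        nxt = compose(closure, r) - closure
        if not nxt:
            return closure
        closure |= nxt

# CPU0: L0 = srcu_down_read(s); WRITE_ONCE(*idx, r3)   -> Widx
# CPU1: r3 = READ_ONCE(*idx)   -> Ridx; srcu_up_read(s, r3) -> U1
events = {"L0", "Widx", "Ridx", "U1"}
locks, unlocks = {"L0"}, {"U1"}
data = {("L0", "Widx"), ("Ridx", "U1")}   # value dependencies
rf   = {("Widx", "Ridx")}                 # idx store read on CPU1

# data ; [~ Srcu-unlock] ; rf
step = {(a, c) for (a, b) in data if b not in unlocks
               for (b2, c) in rf if b == b2}
carry = star(step, events)
rscs = {(l, u) for (l, m) in carry for (m2, u) in data
        if m == m2 and l in locks and u in unlocks}

assert ("L0", "Ridx") in carry    # dependency carried across CPUs
assert rscs == {("L0", "U1")}     # CPU0's lock matches CPU1's unlock
```

Since "Widx" is a plain store rather than an unlock, the `[~ Srcu-unlock]` filter permits the chain, and the final composition with `data` reaches the unlock on CPU1.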
>
> My tendency has been to keep comments in the herd7 files to a minimum
> and to put more extended descriptions in the explanation.txt file.
> Right now that file contains almost nothing (a single paragraph!) about
> SRCU, so it needs to be updated to talk about the new definition of
> srcu-rscs. In my opinion, that's where this sort of comment belongs.

That makes sense, I agree.

> Joel, would you like to write an extra paragraph or two for that file,
> explaining in more detail how SRCU lock-to-unlock matching is different
> from regular RCU and how the definition of the srcu-rscs relation works?
> I'd be happy to edit anything you come up with.

Yes I would love to, I'll spend some more time studying this up a bit
more so I don't write nonsense. But yes, I am quite interested in
writing something up and I will do so!

Thanks,

- Joel


> Alan
>
> PS: We also need to update the PLAIN ACCESSES AND DATA RACES section of
> explanation.txt, to mention the carry-dep relation and why it is
> important.

2023-02-06 21:23:33

by Paul E. McKenney

[permalink] [raw]
Subject: Re: Current LKMM patch disposition

On Mon, Feb 06, 2023 at 09:18:27PM +0100, Jonas Oberhauser wrote:
> On 2/4/2023 2:49 AM, Paul E. McKenney wrote:
> > On Fri, Feb 03, 2023 at 08:28:35PM -0500, Alan Stern wrote:
> > > On Fri, Feb 03, 2023 at 04:48:43PM -0800, Paul E. McKenney wrote:
> > > > Hello!
> > > >
> > > > Here is what I currently have for LKMM patches:
> > > >
> > > > 289e1c89217d4 ("locking/memory-barriers.txt: Improve documentation for writel() example")
> > > > ebd50e2947de9 ("tools: memory-model: Add rmw-sequences to the LKMM")
> > > > aae0c8a50d6d3 ("Documentation: Fixed a typo in atomic_t.txt")
> > > > 9ba7d3b3b826e ("tools: memory-model: Make plain accesses carry dependencies")
> > > >
> > > > Queued for the upcoming (v6.3) merge window.
> > > >
> > > > c7637e2a8a27 ("tools/memory-model: Update some warning labels")
> > > > 7862199d4df2 ("tools/memory-model: Unify UNLOCK+LOCK pairings to po-unlock-lock-")
> > > >
> > > > Are ready for the next (v6.4) merge window. If there is some
> > > > reason that they should instead go into v6.3, please let us
> > > > all know.
> > > >
> > > > a6cd5214b5ba ("tools/memory-model: Document LKMM test procedure")
> > > >
> > > > This goes onto the lkmm-dev pile because it is documenting how
> > > > to use those scripts.
> > > >
> > > > https://lore.kernel.org/lkml/[email protected]/
> > > > https://lore.kernel.org/lkml/[email protected]
> > > > https://lore.kernel.org/lkml/[email protected]/
> > > > 5d871b280e7f ("tools/memory-model: Add smp_mb__after_srcu_read_unlock()")
> > > >
> > > > These need review and perhaps further adjustment.
> > > >
> > > > So, am I missing any? Are there any that need to be redirected?
> > > The "Provide exact semantics for SRCU" patch should have:
> > >
> > > Portions suggested by Boqun Feng and Jonas Oberhauser.
> > >
> > > added at the end, together with your Reported-by: tag. With that, I
> > > think it can be queued for 6.4.
> > Thank you! Does the patch shown below work for you?
> >
> > (I have tentatively queued this, but can easily adjust or replace it.)
> >
> > Thanx, Paul
> >
> > ------------------------------------------------------------------------
> >
> > tools/memory-model: Provide exact SRCU semantics
> >
> > LKMM has long provided only approximate handling of SRCU read-side
> > critical sections. This has not been a pressing problem because LKMM's
> > traditional handling is correct for the common cases of non-overlapping
> > and properly nested critical sections. However, LKMM's traditional
> > handling of partially overlapping critical sections incorrectly fuses
> > them into one large critical section.
> >
> > For example, consider the following litmus test:
> >
> > ------------------------------------------------------------------------
> >
> > C C-srcu-nest-5
> >
> > (*
> > * Result: Sometimes
> > *
> > * This demonstrates non-nested overlapping of SRCU read-side critical
> > * sections. Unlike RCU, SRCU critical sections do not unconditionally
> > * nest.
> > *)
> >
> > {}
> >
> > P0(int *x, int *y, struct srcu_struct *s1)
> > {
> > int r1;
> > int r2;
> > int r3;
> > int r4;
> >
> > r3 = srcu_read_lock(s1);
> > r2 = READ_ONCE(*y);
> > r4 = srcu_read_lock(s1);
> > srcu_read_unlock(s1, r3);
> > r1 = READ_ONCE(*x);
> > srcu_read_unlock(s1, r4);
> > }
> >
> > P1(int *x, int *y, struct srcu_struct *s1)
> > {
> > WRITE_ONCE(*y, 1);
> > synchronize_srcu(s1);
> > WRITE_ONCE(*x, 1);
> > }
> >
> > locations [0:r1]
> > exists (0:r1=1 /\ 0:r2=0)
> >
> > ------------------------------------------------------------------------
> >
> > Current mainline incorrectly flattens the two critical sections into
> > one larger critical section, giving "Never" instead of the correct
> > "Sometimes":
> >
> > ------------------------------------------------------------------------
> >
> > $ herd7 -conf linux-kernel.cfg C-srcu-nest-5.litmus
> > Test C-srcu-nest-5 Allowed
> > States 3
> > 0:r1=0; 0:r2=0;
> > 0:r1=0; 0:r2=1;
> > 0:r1=1; 0:r2=1;
> > No
> > Witnesses
> > Positive: 0 Negative: 3
> > Flag srcu-bad-nesting
> > Condition exists (0:r1=1 /\ 0:r2=0)
> > Observation C-srcu-nest-5 Never 0 3
> > Time C-srcu-nest-5 0.01
> > Hash=e692c106cf3e84e20f12991dc438ff1b
> >
> > ------------------------------------------------------------------------
> >
> > To its credit, it does complain about bad nesting. But with this
> > commit we get the following result, which has the virtue of being
> > correct:
> >
> > ------------------------------------------------------------------------
> >
> > $ herd7 -conf linux-kernel.cfg C-srcu-nest-5.litmus
> > Test C-srcu-nest-5 Allowed
> > States 4
> > 0:r1=0; 0:r2=0;
> > 0:r1=0; 0:r2=1;
> > 0:r1=1; 0:r2=0;
> > 0:r1=1; 0:r2=1;
> > Ok
> > Witnesses
> > Positive: 1 Negative: 3
> > Condition exists (0:r1=1 /\ 0:r2=0)
> > Observation C-srcu-nest-5 Sometimes 1 3
> > Time C-srcu-nest-5 0.05
> > Hash=e692c106cf3e84e20f12991dc438ff1b
> >
> > ------------------------------------------------------------------------
> >
> > In addition, there are new srcu_down_read() and srcu_up_read()
> > functions on their way to mainline. Roughly speaking, these are to
> > srcu_read_lock() and srcu_read_unlock() as down() and up() are to
> > mutex_lock() and mutex_unlock(). The key point is that
> > srcu_down_read() can execute in one process and the matching
> > srcu_up_read() in another, as shown in this litmus test:
> >
> > ------------------------------------------------------------------------
> >
> > C C-srcu-nest-6
> >
> > (*
> > * Result: Never
> > *
> > * This would be valid for srcu_down_read() and srcu_up_read().
> > *)
> >
> > {}
> >
> > P0(int *x, int *y, struct srcu_struct *s1, int *idx, int *f)
> > {
> > int r2;
> > int r3;
> >
> > r3 = srcu_down_read(s1);
> > WRITE_ONCE(*idx, r3);
> > r2 = READ_ONCE(*y);
> > smp_store_release(f, 1);
> > }
> >
> > P1(int *x, int *y, struct srcu_struct *s1, int *idx, int *f)
> > {
> > int r1;
> > int r3;
> > int r4;
> >
> > r4 = smp_load_acquire(f);
> > r1 = READ_ONCE(*x);
> > r3 = READ_ONCE(*idx);
> > srcu_up_read(s1, r3);
> > }
> >
> > P2(int *x, int *y, struct srcu_struct *s1)
> > {
> > WRITE_ONCE(*y, 1);
> > synchronize_srcu(s1);
> > WRITE_ONCE(*x, 1);
> > }
> >
> > locations [0:r1]
> > filter (1:r4=1)
> > exists (1:r1=1 /\ 0:r2=0)
> >
> > ------------------------------------------------------------------------
> >
> > When run on current mainline, this litmus test gets a complaint about
> > an unknown macro srcu_down_read(). With this commit:
> >
> > ------------------------------------------------------------------------
> >
> > $ herd7 -conf linux-kernel.cfg C-srcu-nest-6.litmus
> > Test C-srcu-nest-6 Allowed
> > States 3
> > 0:r1=0; 0:r2=0; 1:r1=0;
> > 0:r1=0; 0:r2=1; 1:r1=0;
> > 0:r1=0; 0:r2=1; 1:r1=1;
> > No
> > Witnesses
> > Positive: 0 Negative: 3
> > Condition exists (1:r1=1 /\ 0:r2=0)
> > Observation C-srcu-nest-6 Never 0 3
> > Time C-srcu-nest-6 0.02
> > Hash=c1f20257d052ca5e899be508bedcb2a1
> >
> > ------------------------------------------------------------------------
> >
> > Note that the user must supply the flag "f" and the "filter" clause,
> > similar to what must be done to emulate call_rcu().
> >
> > The commit works by treating srcu_read_lock()/srcu_down_read() as
> > loads and srcu_read_unlock()/srcu_up_read() as stores. This allows us
> > to determine which unlock matches which lock by looking for a data
> > dependency between them. In order for this to work properly, the data
> > dependencies have to be tracked through stores to intermediate
> > variables such as "idx" in the litmus test above; this is handled by
> > the new carry-srcu-data relation. But it's important here (and in the
> > existing carry-dep relation) to avoid tracking the dependencies
> > through SRCU unlock stores. Otherwise, in situations resembling:
> >
> > A: r1 = srcu_read_lock(s);
> > B: srcu_read_unlock(s, r1);
> > C: r2 = srcu_read_lock(s);
> > D: srcu_read_unlock(s, r2);
> >
> > it would look as if D was dependent on both A and C, because "s" would
> > appear to be an intermediate variable written by B and read by C.
> > This explains the complications in the definitions of carry-srcu-data
> > and carry-dep.
> >
> > As a debugging aid, the commit adds a check for errors in which the
> > value returned by one call to srcu_read_lock()/srcu_down_read() is
> > passed to more than one instance of srcu_read_unlock()/srcu_up_read().
> >
> > Finally, since these SRCU-related primitives are now treated as
> > ordinary reads and writes, we have to add them into the lists of
> > marked accesses (i.e., not subject to data races) and lock-related
> > accesses (i.e., one shouldn't try to access an srcu_struct with a
> > non-lock-related primitive such as READ_ONCE() or a plain write).
> >
> > Portions of this approach were suggested by Boqun Feng and Jonas
> > Oberhauser.
> >
> > [ paulmck: Fix space-before-tab whitespace nit. ]
> >
> > Reported-by: Paul E. McKenney <[email protected]>
> > Signed-off-by: Alan Stern <[email protected]>
>
> Reviewed-by: Jonas Oberhauser <[email protected]>

Applied, thank you!

Thanx, Paul

> > ---
> >
> > tools/memory-model/linux-kernel.bell | 17 +++++------------
> > tools/memory-model/linux-kernel.def | 6 ++++--
> > tools/memory-model/lock.cat | 6 +++---
> > 3 files changed, 12 insertions(+), 17 deletions(-)
> >
> > Index: usb-devel/tools/memory-model/linux-kernel.bell
> > ===================================================================
> > --- usb-devel.orig/tools/memory-model/linux-kernel.bell
> > +++ usb-devel/tools/memory-model/linux-kernel.bell
> > @@ -57,20 +57,13 @@ flag ~empty Rcu-lock \ domain(rcu-rscs)
> > flag ~empty Rcu-unlock \ range(rcu-rscs) as unmatched-rcu-unlock
> > (* Compute matching pairs of nested Srcu-lock and Srcu-unlock *)
> > -let srcu-rscs = let rec
> > - unmatched-locks = Srcu-lock \ domain(matched)
> > - and unmatched-unlocks = Srcu-unlock \ range(matched)
> > - and unmatched = unmatched-locks | unmatched-unlocks
> > - and unmatched-po = ([unmatched] ; po ; [unmatched]) & loc
> > - and unmatched-locks-to-unlocks =
> > - ([unmatched-locks] ; po ; [unmatched-unlocks]) & loc
> > - and matched = matched | (unmatched-locks-to-unlocks \
> > - (unmatched-po ; unmatched-po))
> > - in matched
> > +let carry-srcu-data = (data ; [~ Srcu-unlock] ; rf)*
> > +let srcu-rscs = ([Srcu-lock] ; carry-srcu-data ; data ; [Srcu-unlock]) & loc
> > (* Validate nesting *)
> > flag ~empty Srcu-lock \ domain(srcu-rscs) as unmatched-srcu-lock
> > flag ~empty Srcu-unlock \ range(srcu-rscs) as unmatched-srcu-unlock
> > +flag ~empty (srcu-rscs^-1 ; srcu-rscs) \ id as multiple-srcu-matches
> > (* Check for use of synchronize_srcu() inside an RCU critical section *)
> > flag ~empty rcu-rscs & (po ; [Sync-srcu] ; po) as invalid-sleep
> > @@ -80,11 +73,11 @@ flag ~empty different-values(srcu-rscs)
> > (* Compute marked and plain memory accesses *)
> > let Marked = (~M) | IW | Once | Release | Acquire | domain(rmw) | range(rmw) |
> > - LKR | LKW | UL | LF | RL | RU
> > + LKR | LKW | UL | LF | RL | RU | Srcu-lock | Srcu-unlock
> > let Plain = M \ Marked
> > (* Redefine dependencies to include those carried through plain accesses *)
> > -let carry-dep = (data ; rfi)*
> > +let carry-dep = (data ; [~ Srcu-unlock] ; rfi)*
> > let addr = carry-dep ; addr
> > let ctrl = carry-dep ; ctrl
> > let data = carry-dep ; data
> > Index: usb-devel/tools/memory-model/linux-kernel.def
> > ===================================================================
> > --- usb-devel.orig/tools/memory-model/linux-kernel.def
> > +++ usb-devel/tools/memory-model/linux-kernel.def
> > @@ -49,8 +49,10 @@ synchronize_rcu() { __fence{sync-rcu}; }
> > synchronize_rcu_expedited() { __fence{sync-rcu}; }
> > // SRCU
> > -srcu_read_lock(X) __srcu{srcu-lock}(X)
> > -srcu_read_unlock(X,Y) { __srcu{srcu-unlock}(X,Y); }
> > +srcu_read_lock(X) __load{srcu-lock}(*X)
> > +srcu_read_unlock(X,Y) { __store{srcu-unlock}(*X,Y); }
> > +srcu_down_read(X) __load{srcu-lock}(*X)
> > +srcu_up_read(X,Y) { __store{srcu-unlock}(*X,Y); }
> > synchronize_srcu(X) { __srcu{sync-srcu}(X); }
> > synchronize_srcu_expedited(X) { __srcu{sync-srcu}(X); }
> > Index: usb-devel/tools/memory-model/lock.cat
> > ===================================================================
> > --- usb-devel.orig/tools/memory-model/lock.cat
> > +++ usb-devel/tools/memory-model/lock.cat
> > @@ -36,9 +36,9 @@ let RU = try RU with emptyset
> > (* Treat RL as a kind of LF: a read with no ordering properties *)
> > let LF = LF | RL
> > -(* There should be no ordinary R or W accesses to spinlocks *)
> > -let ALL-LOCKS = LKR | LKW | UL | LF | RU
> > -flag ~empty [M \ IW] ; loc ; [ALL-LOCKS] as mixed-lock-accesses
> > +(* There should be no ordinary R or W accesses to spinlocks or SRCU structs *)
> > +let ALL-LOCKS = LKR | LKW | UL | LF | RU | Srcu-lock | Srcu-unlock | Sync-srcu
> > +flag ~empty [M \ IW \ ALL-LOCKS] ; loc ; [ALL-LOCKS] as mixed-lock-accesses
> > (* Link Lock-Reads to their RMW-partner Lock-Writes *)
> > let lk-rmw = ([LKR] ; po-loc ; [LKW]) \ (po ; po)
>

2023-02-06 21:30:24

by Paul E. McKenney

[permalink] [raw]
Subject: Re: Current LKMM patch disposition

On Mon, Feb 06, 2023 at 09:20:59PM +0100, Jonas Oberhauser wrote:
> On 2/4/2023 1:48 AM, Paul E. McKenney wrote:
> > [...]
> > https://lore.kernel.org/lkml/[email protected]/
> > 5d871b280e7f ("tools/memory-model: Add smp_mb__after_srcu_read_unlock()")
> >
> > These need review and perhaps further adjustment.
> >
> > So, am I missing any? Are there any that need to be redirected?
>
> Didn't you schedule joel's patch for 6.4?

Agreed, and Alan's patch as well. The scoreboard now stands as shown
below.

Thanx, Paul

------------------------------------------------------------------------

289e1c89217d4 ("locking/memory-barriers.txt: Improve documentation for writel() example")
ebd50e2947de9 ("tools: memory-model: Add rmw-sequences to the LKMM")
aae0c8a50d6d3 ("Documentation: Fixed a typo in atomic_t.txt")
9ba7d3b3b826e ("tools: memory-model: Make plain accesses carry dependencies")

Queued for the upcoming (v6.3) merge window.

c7637e2a8a27 ("tools/memory-model: Update some warning labels")
7862199d4df2 ("tools/memory-model: Unify UNLOCK+LOCK pairings to po-unlock-lock-")
2e9fc6060678 ("tools/memory-model: Restrict to-r to read-read address dependency")
7e46006a97c9 ("tools/memory-model: Provide exact SRCU semantics")

Are ready for the next (v6.4) merge window. If there is some
reason that they should instead go into v6.3, please let us
all know.

a6cd5214b5ba ("tools/memory-model: Document LKMM test procedure")

This goes onto the lkmm-dev pile because it is documenting how
to use those scripts.

https://lore.kernel.org/lkml/[email protected]
5d871b280e7f ("tools/memory-model: Add smp_mb__after_srcu_read_unlock()")

These need review and perhaps further adjustment.

2023-02-11 15:49:49

by Joel Fernandes

[permalink] [raw]
Subject: Re: Current LKMM patch disposition

On Mon, Feb 06, 2023 at 04:22:57PM -0500, Joel Fernandes wrote:
> On Mon, Feb 6, 2023 at 1:39 PM Alan Stern <[email protected]> wrote:
> >
> > On Sun, Feb 05, 2023 at 02:10:29PM +0000, Joel Fernandes wrote:
> > > On Sat, Feb 04, 2023 at 02:24:11PM -0800, Paul E. McKenney wrote:
> > > > On Sat, Feb 04, 2023 at 09:58:12AM -0500, Alan Stern wrote:
> > > > > On Fri, Feb 03, 2023 at 05:49:41PM -0800, Paul E. McKenney wrote:
> > > > > > On Fri, Feb 03, 2023 at 08:28:35PM -0500, Alan Stern wrote:
> > > > > > > The "Provide exact semantics for SRCU" patch should have:
> > > > > > >
> > > > > > > Portions suggested by Boqun Feng and Jonas Oberhauser.
> > > > > > >
> > > > > > > added at the end, together with your Reported-by: tag. With that, I
> > > > > > > think it can be queued for 6.4.
> > > > > >
> > > > > > Thank you! Does the patch shown below work for you?
> > > > > >
> > > > > > (I have tentatively queued this, but can easily adjust or replace it.)
> > > > >
> > > > > It looks fine.
> > > >
> > > > Very good, thank you for looking it over! I pushed it out on branch
> > > > stern.2023.02.04a.
> > > >
> > > > Would anyone like to ack/review/whatever this one?
> > >
> > > Would it be possible to add comments, something like the following? Apologies
> > > if it is missing some ideas. I will try to improve it later.
> > >
> > > thanks!
> > >
> > > - Joel
> > >
> > > ---8<-----------------------
> > >
> > > diff --git a/tools/memory-model/linux-kernel.bell b/tools/memory-model/linux-kernel.bell
> > > index ce068700939c..0a16177339bc 100644
> > > --- a/tools/memory-model/linux-kernel.bell
> > > +++ b/tools/memory-model/linux-kernel.bell
> > > @@ -57,7 +57,23 @@ let rcu-rscs = let rec
> > > flag ~empty Rcu-lock \ domain(rcu-rscs) as unmatched-rcu-lock
> > > flag ~empty Rcu-unlock \ range(rcu-rscs) as unmatched-rcu-unlock
> > >
> > > +(***************************************************************)
> > > (* Compute matching pairs of nested Srcu-lock and Srcu-unlock *)
> > > +(***************************************************************)
> > > +(*
> > > + * carry-srcu-data: To handle the case of the SRCU critical section split
> > > + * across CPUs, where the idx is used to communicate the SRCU index across CPUs
> > > + * (say CPU0 and CPU1), a data dependency runs from the R[srcu-lock] to the
> > > + * W[once][idx] on CPU0, and is sequenced with the ->rf link between that
> > > + * W[once][idx] and the R[once][idx] on CPU1. The carry-srcu-data relation excludes Srcu-unlock
> > > + * events to prevent capturing accesses across back-to-back SRCU read-side
> > > + * critical sections.
> > > + *
> > > + * srcu-rscs: Putting everything together, the carry-srcu-data is sequenced with
> > > + * a data relation, which is the data dependency between R[once][idx] on CPU1
> > > + * and the srcu-unlock store, and loc ensures the relation is unique for a
> > > + * specific lock.
> > > + *)
> > > let carry-srcu-data = (data ; [~ Srcu-unlock] ; rf)*
> > > let srcu-rscs = ([Srcu-lock] ; carry-srcu-data ; data ; [Srcu-unlock]) & loc
> >
> > My tendency has been to keep comments in the herd7 files to a minimum
> > and to put more extended descriptions in the explanation.txt file.
> > Right now that file contains almost nothing (a single paragraph!) about
> > SRCU, so it needs to be updated to talk about the new definition of
> > srcu-rscs. In my opinion, that's where this sort of comment belongs.
>
> That makes sense, I agree.
>
> > Joel, would you like to write an extra paragraph or two for that file,
> > explaining in more detail how SRCU lock-to-unlock matching is different
> > from regular RCU and how the definition of the srcu-rscs relation works?
> > I'd be happy to edit anything you come up with.
>
> Yes I would love to, I'll spend some more time studying this up a bit
> more so I don't write nonsense. But yes, I am quite interested in
> writing something up and I will do so!

Hi Alan, all,

One thing I noticed: Shouldn't the model have some notion of fences with the
srcu lock primitive? SRCU implementation in the kernel does an unconditional
memory barrier on srcu_read_lock() (which it has to do for a number of
reasons including correctness), but currently both with/without this patch,
the following returns "Sometimes", instead of "Never". Sorry if this was
discussed before:

C MP+srcu

(*
* Result: Sometimes
*
* If an srcu_read_unlock() is called between 2 stores, they should propagate
* in order.
*)

{}

P0(struct srcu_struct *s, int *x, int *y)
{
int r1;

r1 = srcu_read_lock(s);
WRITE_ONCE(*x, 1);
srcu_read_unlock(s, r1); // replace with smp_mb() makes Never.
WRITE_ONCE(*y, 1);
}

P1(struct srcu_struct *s, int *x, int *y)
{
int r1;
int r2;

r1 = READ_ONCE(*y);
smp_rmb();
r2 = READ_ONCE(*x);
}

exists (1:r1=1 /\ 1:r2=0)

Also, one more general (and likely silly) question about reflexive-transitive closures.

Say you have 2 relations, R1 and R2. Except that R2 is completely empty.

What does (R1; R2)* return?

I expect (R1; R2) to be empty, since there does not exist a tail in R1 that
is a head in R2.

However, that does not appear to be true like in the carry-srcu-data relation
in Alan's patch. For instance, if I have a simple litmus test with a single
reader on a single CPU, and an updater on a second CPU, I see that
carry-srcu-data is a bunch of self-loops on all individual loads and stores
on all CPUs, including the loads and stores surrounding the updater's
synchronize_srcu() call, far from being an empty relation!

Thanks!

- Joel


2023-02-11 16:34:09

by Alan Stern

[permalink] [raw]
Subject: Re: Current LKMM patch disposition

On Sat, Feb 11, 2023 at 03:49:39PM +0000, Joel Fernandes wrote:
> Hi Alan, all,
>
> One thing I noticed: shouldn't the model have some notion of fences with the
> srcu lock primitive? The SRCU implementation in the kernel does an unconditional
> memory barrier on srcu_read_lock() (which it has to do for a number of
> reasons, including correctness), but currently, both with and without this
> patch, the following returns "Sometimes" instead of "Never". Sorry if this
> was discussed before:
>
> C MP+srcu
>
> (*
> * Result: Sometimes
> *
> * If an srcu_read_unlock() is called between 2 stores, they should propagate
> * in order.
> *)
>
> {}
>
> P0(struct srcu_struct *s, int *x, int *y)
> {
> int r1;
>
> r1 = srcu_read_lock(s);
> WRITE_ONCE(*x, 1);
> srcu_read_unlock(s, r1); // replace with smp_mb() makes Never.
> WRITE_ONCE(*y, 1);
> }
>
> P1(struct srcu_struct *s, int *x, int *y)
> {
> int r1;
> int r2;
>
> r1 = READ_ONCE(*y);
> smp_rmb();
> r2 = READ_ONCE(*x);
> }
>
> exists (1:r1=1 /\ 1:r2=0)

As far as I know, the SRCU API does not guarantee this behavior. The
current implementation behaves this way, but future implementations
might not. Therefore we don't want to put it in the memory model.

> Also, one more general (and likely silly) question about reflexive-transitive closures.
>
> Say you have 2 relations, R1 and R2. Except that R2 is completely empty.
>
> What does (R1; R2)* return?

It returns the identity relation, that is, a relation which links each
event with itself. Remember, R* is defined as linking A to B if there
is a series of R links, of _any_ length (including 0!), going from A to
B. Since there is always a series of length 0 linking A to itself, R*
always contains the identity relation.

> I expect (R1; R2) to be empty, since there does not exist a tail in R1 that
> is also a head in R2.

Correct. But for any relation R, R* always contains the identity
relation -- even when R is empty. R+, on the other hand, does not.
That's the difference between R* and R+: In R* the series of links can
be of any length, whereas in R+ there must be at least one link.

In your example, both R2+ and (R1 ; R2)+ would be empty.
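This can be checked concretely by modeling relations as sets of ordered
pairs. The following is a Python sketch (not herd7/cat code; the event
names A, B, C are invented for illustration):

```python
# Model relations over a small event set as Python sets of ordered pairs,
# to check that (R1 ; R2)* is the identity relation when R2 is empty,
# while (R1 ; R2)+ stays empty.

def compose(r1, r2):
    """Relational composition (R1 ; R2): a -> c iff a -> b in R1 and b -> c in R2."""
    return {(a, c) for (a, b) in r1 for (b2, c) in r2 if b == b2}

def star(r, events):
    """Reflexive-transitive closure R*: paths of any length, including 0."""
    closure = {(e, e) for e in events}  # the length-0 paths: identity
    while True:
        bigger = closure | compose(closure, r)
        if bigger == closure:
            return closure
        closure = bigger

def plus(r, events):
    """Transitive closure R+: at least one link, so no identity for free."""
    return compose(r, star(r, events))

events = {"A", "B", "C"}
R1 = {("A", "B"), ("B", "C")}
R2 = set()  # completely empty

assert compose(R1, R2) == set()                                   # R1 ; R2 is empty
assert star(compose(R1, R2), events) == {(e, e) for e in events}  # (R1 ; R2)* = identity
assert plus(compose(R1, R2), events) == set()                     # (R1 ; R2)+ = empty
```

Running it confirms the point: the star of an empty relation still
links every event to itself, while the plus of an empty relation links
nothing.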

> However, that does not appear to hold for the carry-srcu-data relation
> in Alan's patch. For instance, if I have a simple litmus test with a single
> reader on a single CPU, and an updater on a second CPU, I see that
> carry-srcu-data is a bunch of self-loops on all individual loads and stores
> on all CPUs, including the loads and stores surrounding the updater's
> synchronize_srcu() call, far from being an empty relation!

Yep, that's the identity relation.

Alan

2023-02-11 17:18:26

by Joel Fernandes

[permalink] [raw]
Subject: Re: Current LKMM patch disposition

On Mon, Feb 06, 2023 at 01:39:07PM -0500, Alan Stern wrote:
> On Sun, Feb 05, 2023 at 02:10:29PM +0000, Joel Fernandes wrote:
> > On Sat, Feb 04, 2023 at 02:24:11PM -0800, Paul E. McKenney wrote:
> > > On Sat, Feb 04, 2023 at 09:58:12AM -0500, Alan Stern wrote:
> > > > On Fri, Feb 03, 2023 at 05:49:41PM -0800, Paul E. McKenney wrote:
> > > > > On Fri, Feb 03, 2023 at 08:28:35PM -0500, Alan Stern wrote:
> > > > > > The "Provide exact semantics for SRCU" patch should have:
> > > > > >
> > > > > > Portions suggested by Boqun Feng and Jonas Oberhauser.
> > > > > >
> > > > > > added at the end, together with your Reported-by: tag. With that, I
> > > > > > think it can be queued for 6.4.
> > > > >
> > > > > Thank you! Does the patch shown below work for you?
> > > > >
> > > > > (I have tentatively queued this, but can easily adjust or replace it.)
> > > >
> > > > It looks fine.
> > >
> > > Very good, thank you for looking it over! I pushed it out on branch
> > > stern.2023.02.04a.
> > >
> > > Would anyone like to ack/review/whatever this one?
> >
> > Would it be possible to add comments, something like the following? Apologies
> > if it is missing some ideas. I will try to improve it later.
> >
> > thanks!
> >
> > - Joel
> >
> > ---8<-----------------------
> >
> > diff --git a/tools/memory-model/linux-kernel.bell b/tools/memory-model/linux-kernel.bell
> > index ce068700939c..0a16177339bc 100644
> > --- a/tools/memory-model/linux-kernel.bell
> > +++ b/tools/memory-model/linux-kernel.bell
> > @@ -57,7 +57,23 @@ let rcu-rscs = let rec
> > flag ~empty Rcu-lock \ domain(rcu-rscs) as unmatched-rcu-lock
> > flag ~empty Rcu-unlock \ range(rcu-rscs) as unmatched-rcu-unlock
> >
> > +(***************************************************************)
> > (* Compute matching pairs of nested Srcu-lock and Srcu-unlock *)
> > +(***************************************************************)
> > +(*
> > + * carry-srcu-data: To handle the case of the SRCU critical section split
> > + * across CPUs, where the idx is used to communicate the SRCU index across CPUs
> > + * (say CPU0 and CPU1), data is between the R[srcu-lock] to W[once][idx] on
> > + * CPU0, which is sequenced with the ->rf is between the W[once][idx] and the
> > + * R[once][idx] on CPU1. The carry-srcu-data is made to exclude Srcu-unlock
> > + * events to prevent capturing accesses across back-to-back SRCU read-side
> > + * critical sections.
> > + *
> > + * srcu-rscs: Putting everything together, the carry-srcu-data is sequenced with
> > + * a data relation, which is the data dependency between R[once][idx] on CPU1
> > + * and the srcu-unlock store, and loc ensures the relation is unique for a
> > + * specific lock.
> > + *)
> > let carry-srcu-data = (data ; [~ Srcu-unlock] ; rf)*
> > let srcu-rscs = ([Srcu-lock] ; carry-srcu-data ; data ; [Srcu-unlock]) & loc
>
> My tendency has been to keep comments in the herd7 files to a minimum
> and to put more extended descriptions in the explanation.txt file.
> Right now that file contains almost nothing (a single paragraph!) about
> SRCU, so it needs to be updated to talk about the new definition of
> srcu-rscs. In my opinion, that's where this sort of comment belongs.
>
> Joel, would you like to write an extra paragraph of two for that file,
> explaining in more detail how SRCU lock-to-unlock matching is different
> from regular RCU and how the definition of the srcu-rscs relation works?
> I'd be happy to edit anything you come up with.
>

I am happy to make changes to explanation.txt (I am assuming that's the file
you mentioned), but I was wondering what you thought of the following change.
If the formulas are split up, that itself could be some documentation as
well. I did add a small paragraph at the top of the formulas as well, though.

Some light testing shows it works with the cross-CPU litmus test (could still
have bugs though and needs more testing).

Let me know how you feel about it, and if I should submit something along
these lines along with your suggestion to edit the explanation.txt. Thanks!

diff --git a/tools/memory-model/linux-kernel.bell b/tools/memory-model/linux-kernel.bell
index ce068700939c..1390d1b8ceee 100644
--- a/tools/memory-model/linux-kernel.bell
+++ b/tools/memory-model/linux-kernel.bell
@@ -57,9 +57,28 @@ let rcu-rscs = let rec
flag ~empty Rcu-lock \ domain(rcu-rscs) as unmatched-rcu-lock
flag ~empty Rcu-unlock \ range(rcu-rscs) as unmatched-rcu-unlock

-(* Compute matching pairs of nested Srcu-lock and Srcu-unlock *)
-let carry-srcu-data = (data ; [~ Srcu-unlock] ; rf)*
-let srcu-rscs = ([Srcu-lock] ; carry-srcu-data ; data ; [Srcu-unlock]) & loc
+(* SRCU read-side section modeling
+ * Compute matching pairs of nested Srcu-lock and Srcu-unlock:
+ * Each SRCU read-side critical section is treated as independent of other
+ * overlapping SRCU read-side critical sections, even when on the same domain.
+ * For this, each Srcu-lock and Srcu-unlock pair is treated as loads and
+ * stores, with the data-dependency flow also treated as independent to prevent
+ * fusing. *)
+
+(* Data dependency between lock and idx store *)
+let srcu-lock-to-store-idx = ([Srcu-lock]; data)
+
+(* Data dependency between idx load and unlock *)
+let srcu-load-idx-to-unlock = (data; [Srcu-unlock])
+
+(* Read-from dependency between idx store on one CPU and load on same/another.
+ * This is required to model the splitting of critical section across CPUs. *)
+let srcu-store-to-load-idx = (rf ; srcu-load-idx-to-unlock)
+
+(* SRCU data dependency flow. Exclude the Srcu-unlock to not transcend back to back rscs *)
+let carry-srcu-data = (srcu-lock-to-store-idx ; [~ Srcu-unlock] ; srcu-store-to-load-idx)*
+
+let srcu-rscs = ([Srcu-lock] ; carry-srcu-data ; [Srcu-unlock]) & loc

(* Validate nesting *)
flag ~empty Srcu-lock \ domain(srcu-rscs) as unmatched-srcu-lock
--
2.39.1.581.gbfd45094c4-goog


2023-02-11 20:19:31

by Alan Stern

[permalink] [raw]
Subject: Re: Current LKMM patch disposition

On Sat, Feb 11, 2023 at 05:18:17PM +0000, Joel Fernandes wrote:
> I am happy to make changes to explanation.txt (I am assuming that's the file
> you mentioned), but I was wondering what you thought of the following change.
> If the formulas are split up, that itself could be some documentation as
> well. I did add a small paragraph at the top of the formulas as well, though.
>
> Some light testing shows it works with the cross-CPU litmus test (could still
> have bugs though and needs more testing).
>
> Let me know how you feel about it, and if I should submit something along
> these lines along with your suggestion to edit the explanation.txt. Thanks!
>
> diff --git a/tools/memory-model/linux-kernel.bell b/tools/memory-model/linux-kernel.bell
> index ce068700939c..1390d1b8ceee 100644
> --- a/tools/memory-model/linux-kernel.bell
> +++ b/tools/memory-model/linux-kernel.bell
> @@ -57,9 +57,28 @@ let rcu-rscs = let rec
> flag ~empty Rcu-lock \ domain(rcu-rscs) as unmatched-rcu-lock
> flag ~empty Rcu-unlock \ range(rcu-rscs) as unmatched-rcu-unlock
>
> -(* Compute matching pairs of nested Srcu-lock and Srcu-unlock *)
> -let carry-srcu-data = (data ; [~ Srcu-unlock] ; rf)*
> -let srcu-rscs = ([Srcu-lock] ; carry-srcu-data ; data ; [Srcu-unlock]) & loc
> +(* SRCU read-side section modeling
> + * Compute matching pairs of nested Srcu-lock and Srcu-unlock:
> + * Each SRCU read-side critical section is treated as independent of other
> + * overlapping SRCU read-side critical sections, even when on the same domain.
> + * For this, each Srcu-lock and Srcu-unlock pair is treated as loads and
> + * stores, with the data-dependency flow also treated as independent to prevent
> + * fusing. *)

Even that is more than I would put in the source file. Basically, the
memory model is so complex that trying to document it in comments is
hopeless. The comments would have to be many times longer than the
actual code -- as is the case here with just this little part of the
model. That's the main reason why I made explanation.txt a completely
separate file.

> +
> +(* Data dependency between lock and idx store *)
> +let srcu-lock-to-store-idx = ([Srcu-lock]; data)
> +
> +(* Data dependency between idx load and unlock *)
> +let srcu-load-idx-to-unlock = (data; [Srcu-unlock])
> +
> +(* Read-from dependency between idx store on one CPU and load on same/another.
> + * This is required to model the splitting of critical section across CPUs. *)
> +let srcu-store-to-load-idx = (rf ; srcu-load-idx-to-unlock)
> +
> +(* SRCU data dependency flow. Exclude the Srcu-unlock to not transcend back to back rscs *)
> +let carry-srcu-data = (srcu-lock-to-store-idx ; [~ Srcu-unlock] ; srcu-store-to-load-idx)*
> +
> +let srcu-rscs = ([Srcu-lock] ; carry-srcu-data ; [Srcu-unlock]) & loc

That doesn't look right at all. Does it work with the following?

P0(struct srcu_struct *lock)
{
int r1;

r1 = srcu_read_lock(lock);
srcu_read_unlock(lock, r1);
}

or

P0(struct srcu_struct *lock, int *x, int *y)
{
*x = srcu_read_lock(lock);
*y = *x;
srcu_read_unlock(lock, *y);
}

The idea is that the value returned by srcu_read_lock() can be stored to
and loaded from a sequence (possibly of length 0) of variables, and the
final load gets fed to srcu_read_unlock(). That's what the original
version of the code expresses.
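The way the composition links a lock to its unlock can be sketched with
relations as sets of event pairs. This is a Python illustration, not
herd7 output; the events (L, Widx, Ridx, U) and the edges are invented,
and the final "& loc" intersection is omitted since there is only one
lock:

```python
# Python sketch of
#   carry-srcu-data = (data ; [~Srcu-unlock] ; rf)*
#   srcu-rscs = ([Srcu-lock] ; carry-srcu-data ; data ; [Srcu-unlock]) & loc
# for one cross-CPU reader and one same-CPU reader ("& loc" omitted,
# since only one srcu_struct is involved).

def compose(r1, r2):
    return {(a, c) for (a, b) in r1 for (b2, c) in r2 if b == b2}

def star(r, events):
    closure = {(e, e) for e in events}  # includes the length-0 paths
    while True:
        bigger = closure | compose(closure, r)
        if bigger == closure:
            return closure
        closure = bigger

def srcu_rscs(events, locks, unlocks, data, rf):
    # one carry step: data into a non-unlock event, then an rf edge
    step = {(a, c) for (a, b) in data if b not in unlocks
                   for (b2, c) in rf if b2 == b}
    carry = star(step, events)
    # [Srcu-lock] ; carry-srcu-data ; data ; [Srcu-unlock]
    return {(a, d) for (a, b) in carry if a in locks
                   for (c, d) in data if c == b and d in unlocks}

# Cross-CPU reader:
#   P0: r1 = srcu_read_lock(s); WRITE_ONCE(*idx, r1);
#   P1: r2 = READ_ONCE(*idx);   srcu_read_unlock(s, r2);
events = {"L", "Widx", "Ridx", "U"}
data = {("L", "Widx"), ("Ridx", "U")}   # register-carried dependencies
rf = {("Widx", "Ridx")}                 # idx store read on the other CPU
assert srcu_rscs(events, {"L"}, {"U"}, data, rf) == {("L", "U")}

# Same-CPU reader: r1 = srcu_read_lock(s); srcu_read_unlock(s, r1);
# Here carry-srcu-data contributes only its length-0 (identity) part,
# and the final data edge alone matches lock to unlock.
assert srcu_rscs({"L2", "U2"}, {"L2"}, {"U2"},
                 {("L2", "U2")}, set()) == {("L2", "U2")}
```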

I'm not sure that breaking this relation up into pieces will make it any
easier to understand.

Alan

2023-02-12 00:30:40

by Joel Fernandes

[permalink] [raw]
Subject: Re: Current LKMM patch disposition

On Sat, Feb 11, 2023 at 3:19 PM Alan Stern <[email protected]> wrote:
>
> On Sat, Feb 11, 2023 at 05:18:17PM +0000, Joel Fernandes wrote:
>
[..]
> > +
> > +(* Data dependency between lock and idx store *)
> > +let srcu-lock-to-store-idx = ([Srcu-lock]; data)
> > +
> > +(* Data dependency between idx load and unlock *)
> > +let srcu-load-idx-to-unlock = (data; [Srcu-unlock])
> > +
> > +(* Read-from dependency between idx store on one CPU and load on same/another.
> > + * This is required to model the splitting of critical section across CPUs. *)
> > +let srcu-store-to-load-idx = (rf ; srcu-load-idx-to-unlock)
> > +
> > +(* SRCU data dependency flow. Exclude the Srcu-unlock to not transcend back to back rscs *)
> > +let carry-srcu-data = (srcu-lock-to-store-idx ; [~ Srcu-unlock] ; srcu-store-to-load-idx)*
> > +
> > +let srcu-rscs = ([Srcu-lock] ; carry-srcu-data ; [Srcu-unlock]) & loc
>
> That doesn't look right at all. Does it work with the following?
>
> P0(struct srcu_struct *lock)
> {
> int r1;
>
> r1 = srcu_read_lock(lock);
> srcu_read_unlock(lock, r1);
> }
>
> or
>
> P0(struct srcu_struct *lock, int *x, int *y)
> {
> *x = srcu_read_lock(lock);
> *y = *x;
> srcu_read_unlock(lock, *y);
> }
>
> The idea is that the value returned by srcu_read_lock() can be stored to
> and loaded from a sequence (possibly of length 0) of variables, and the
> final load gets fed to srcu_read_unlock(). That's what the original
> version of the code expresses.

Now I understand it somewhat, and I see where I went wrong. Basically,
you were trying to sequence zero or more "data + rf" steps starting
from the lock, with a final "data" link into the unlock. That data link
can be directly between the srcu-lock and srcu-unlock themselves, in
the case where the lock/unlock happened on the same CPU.

Damn, that's really complicated.. and I still don't fully understand it.

In trying to understand your CAT code, I made some assumptions about
your formulas by reverse-engineering the CAT code with the tests,
which is kind of my point: it is extremely hard to read CAT. That
is also why I want to understand the CAT, because for me
explanation.txt is at too high a level to give a proper
understanding of the memory model. I have tried re-reading
explanation.txt many times, and then realized I am just rewriting my
own condensed set of notes every few months.

> I'm not sure that breaking this relation up into pieces will make it any
> easier to understand.

Yes, but I tried. I will keep trying to understand your last patch
more. Especially I am still not sure, why in the case of an SRCU
reader on a single CPU, the following does not work:
let srcu-rscs = ([Srcu-lock]; data; [Srcu-unlock]).

Thanks!

- Joel

2023-02-12 02:59:19

by Alan Stern

[permalink] [raw]
Subject: Re: Current LKMM patch disposition

On Sat, Feb 11, 2023 at 07:30:32PM -0500, Joel Fernandes wrote:
> On Sat, Feb 11, 2023 at 3:19 PM Alan Stern <[email protected]> wrote:
> > The idea is that the value returned by srcu_read_lock() can be stored to
> > and loaded from a sequence (possibly of length 0) of variables, and the
> > final load gets fed to srcu_read_unlock(). That's what the original
> > version of the code expresses.
>
> Now I understand it somewhat, and I see where I went wrong. Basically,
> you were trying to sequence zero or more "data + rf" steps starting
> from the lock, with a final "data" link into the unlock. That data link
> can be directly between the srcu-lock and srcu-unlock themselves, in
> the case where the lock/unlock happened on the same CPU.

In which case the sequence has length 0. Exactly right.

> Damn, that's really complicated.. and I still don't fully understand it.

It sounds like you've made an excellent start. :-)

> In trying to understand your CAT code, I made some assumptions about
> your formulas by reverse engineering the CAT code with the tests,
> which is kind of my point that it is extremely hard to read CAT. That

I can't argue against that; it _is_ hard. It helps to have had the
right kind of training beforehand (my degree was in mathematical logic).

> is kind of why I want to understand the CAT, because for me
> explanation.txt is too much at a higher level to get a proper
> understanding of the memory model.. I tried re-reading explanation.txt
> many times.. then I realized I am just rewriting my own condensed set
> of notes every few months.

Would you like to post a few examples showing some of the most difficult
points you encountered? Maybe explanation.txt can be improved.

> > I'm not sure that breaking this relation up into pieces will make it any
> > easier to understand.
>
> Yes, but I tried. I will keep trying to understand your last patch
> more. Especially I am still not sure, why in the case of an SRCU
> reader on a single CPU, the following does not work:
> let srcu-rscs = ([Srcu-lock]; data; [Srcu-unlock]).

You have to understand that herd7 does not track dependencies through
stores and subsequent loads. That is, if you have something like:

r1 = READ_ONCE(*x);
WRITE_ONCE(*y, r1);
r2 = READ_ONCE(*y);
WRITE_ONCE(*z, r2);

then herd7 will realize that the write to y depends on the value read
from x, and it will realize that the write to z depends on the value
read from y. But it will not realize that the write to z depends on the
value read from x; it loses track of that dependency because of the
intervening store/load from y.

More to the point, if you have:

r1 = srcu_read_lock(lock);
WRITE_ONCE(*y, r1);
r2 = READ_ONCE(*y);
srcu_read_unlock(lock, r2);

then herd7 will not realize that the value of r2 depends on the value of
r1. So there will be no data dependency from the srcu_read_lock() to
the srcu_read_unlock().
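The cut can be sketched with the dependency edges as sets of event
pairs. A Python illustration (event names Rx, Wy, Ry, Wz are invented,
not herd7 output):

```python
# Python sketch of herd7's dependency edges for:
#   r1 = READ_ONCE(*x); WRITE_ONCE(*y, r1);
#   r2 = READ_ONCE(*y); WRITE_ONCE(*z, r2);

data = {("Rx", "Wy"),   # value read from x flows (via r1) into the store to y
        ("Ry", "Wz")}   # value read from y flows (via r2) into the store to z

# There is no data edge from Rx to Wz: the value went through memory
# (location y), and herd7 records that hop as an rf edge, not as a
# dependency.
assert ("Rx", "Wz") not in data

rf = {("Wy", "Ry")}     # the load of y reads from the store to y

# Chaining data ; rf ; data recovers the end-to-end link by hand, which
# is exactly the job carry-srcu-data does for the lock/unlock case.
def compose(r1, r2):
    return {(a, c) for (a, b) in r1 for (b2, c) in r2 if b == b2}

assert compose(compose(data, rf), data) == {("Rx", "Wz")}
```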

Alan

2023-02-12 03:36:02

by Joel Fernandes

[permalink] [raw]
Subject: Re: Current LKMM patch disposition



> On Feb 11, 2023, at 9:59 PM, Alan Stern <[email protected]> wrote:
>
> On Sat, Feb 11, 2023 at 07:30:32PM -0500, Joel Fernandes wrote:
>>> On Sat, Feb 11, 2023 at 3:19 PM Alan Stern <[email protected]> wrote:
>>> The idea is that the value returned by srcu_read_lock() can be stored to
>>> and loaded from a sequence (possibly of length 0) of variables, and the
>>> final load gets fed to srcu_read_unlock(). That's what the original
>>> version of the code expresses.
>>
>> Now I understand it somewhat, and I see where I went wrong. Basically,
>> you were trying to sequence zero or more "data + rf" steps starting
>> from the lock, with a final "data" link into the unlock. That data link
>> can be directly between the srcu-lock and srcu-unlock themselves, in
>> the case where the lock/unlock happened on the same CPU.
>
> In which case the sequence has length 0. Exactly right.

Got it.

>
>> Damn, that's really complicated.. and I still don't fully understand it.
>
> It sounds like you've made an excellent start. :-)

Thanks. :-)

>
>> In trying to understand your CAT code, I made some assumptions about
>> your formulas by reverse engineering the CAT code with the tests,
>> which is kind of my point that it is extremely hard to read CAT. That
>
> I can't argue against that; it _is_ hard. It helps to have had the
> right kind of training beforehand (my degree was in mathematical logic).

Got it, I am reviewing the academic material on these as well.

>> is kind of why I want to understand the CAT, because for me
>> explanation.txt is too much at a higher level to get a proper
>> understanding of the memory model.. I tried re-reading explanation.txt
>> many times.. then I realized I am just rewriting my own condensed set
>> of notes every few months.
>
> Would you like to post a few examples showing some of the most difficult
> points you encountered? Maybe explanation.txt can be improved.

Sure, I will share some things I found difficult, tomorrow or so. Off the top of my head, cumulativity was certainly pretty hard to parse.

>
>>> I'm not sure that breaking this relation up into pieces will make it any
>>> easier to understand.
>>
>> Yes, but I tried. I will keep trying to understand your last patch
>> more. Especially I am still not sure, why in the case of an SRCU
>> reader on a single CPU, the following does not work:
>> let srcu-rscs = ([Srcu-lock]; data; [Srcu-unlock]).
>
> You have to understand that herd7 does not track dependencies through
> stores and subsequent loads. That is, if you have something like:
>
> r1 = READ_ONCE(*x);
> WRITE_ONCE(*y, r1);
> r2 = READ_ONCE(*y);
> WRITE_ONCE(*z, r2);
>
> then herd7 will realize that the write to y depends on the value read
> from x, and it will realize that the write to z depends on the value
> read from y. But it will not realize that the write to z depends on the
> value read from x; it loses track of that dependency because of the
> intervening store/load from y.
>
> More to the point, if you have:
>
> r1 = srcu_read_lock(lock);
> WRITE_ONCE(*y, r1);
> r2 = READ_ONCE(*y);
> srcu_read_unlock(lock, r2);
>
> then herd7 will not realize that the value of r2 depends on the value of
> r1. So there will be no data dependency from the srcu_read_lock() to
> the srcu_read_unlock().

Got it! Now I understand why the intermediate loads and stores need to be modeled with your carry-srcu-data formula, even on the same CPU! Thank you so much Alan!!

Thanks,

- Joel

>
> Alan

2023-02-13 00:54:34

by Joel Fernandes

[permalink] [raw]
Subject: Re: Current LKMM patch disposition

On Sat, Feb 11, 2023 at 9:59 PM Alan Stern <[email protected]> wrote:
[...]
> > is kind of why I want to understand the CAT, because for me
> > explanation.txt is too much at a higher level to get a proper
> > understanding of the memory model.. I tried re-reading explanation.txt
> > many times.. then I realized I am just rewriting my own condensed set
> > of notes every few months.
>
> Would you like to post a few examples showing some of the most difficult
> points you encountered? Maybe explanation.txt can be improved.

Just to list 2 of the pain points:

1. I think it is hard to reason about this section
"PROPAGATION ORDER RELATION: cumul-fence"

All store-related fences should affect propagation order; even
smp_wmb(), which is not A-cumulative, should do so (po-earlier stores
propagating before po-later ones). I think expanding this section with
some examples would help readers understand what makes "cumul-fence"
different from any other store-related fence.

2. This part is confusing and has always confused me: "The
happens-before relation (hb) links memory accesses that have to
execute in a certain order"

It is not memory accesses that execute, it is instructions that
execute. Can we separate out "memory access" from "instruction
execution" in this description?

I think ->hb tries to say that A ->hb B means, memory access A
happened before memory access B exactly in its associated
instruction's execution order (time order), but to be specific --
should that be instruction issue order, or instruction retiring order?

AFAICS ->hb maps instruction execution order to memory access order.
Not all of ->po falls into that category, because of out-of-order
hardware execution. Nor does ->co, because the memory subsystem may
resolve writes to the same variable out of order. It would be nice to
call out that ->po is instruction issue order, which is different from
execution/retiring order, and that is why it cannot be ->hb.

->rf does, because of data-flow causality, and ->ppo does, because of
program structure, so it makes sense for them to be in ->hb.

IMHO, ->rfi should as well, because it embodies a flow of data, so its
exclusion is a bit confusing. It would be great to clarify further,
perhaps with an example of why ->rfi cannot be in ->hb, in the
"happens-before" section.

That's really how far I typically get (line 1368) before life takes
over, and I have to go do other survival-related things. Then I
restart the activity. Now that I started reading the CAT file as well,
I feel I can make it past that line :D. But I never wanted to get past
it, till I built a solid understanding of the contents before it.

As I read the file more, I can give more feedback, but the above are
the two that persist.

Thanks!

- Joel

2023-02-13 11:16:03

by Andrea Parri

[permalink] [raw]
Subject: Re: Current LKMM patch disposition

> > Would you like to post a few examples showing some of the most difficult
> > points you encountered? Maybe explanation.txt can be improved.
>
> Just to list 2 of the pain points:
>
> 1. I think it is hard to reason about this section
> "PROPAGATION ORDER RELATION: cumul-fence"
>
> All store-related fences should affect propagation order, even the
> smp_wmb() which is not A-cumulative should do so (po-earlier stores
> appearing before po-later). I think expanding this section with some
> examples would make sense to understand what makes "cumul-fence"
> different from any other store-related fence.

FWIW, litmus-tests/WRC+pooncerelease+fencermbonceonce+Once.litmus illustrates
the concept of A-cumulativity. (The terminology is not LKMM-specific, it was
borrowed from other MCM literature, e.g. "Understanding POWER Multiprocessors"
in Documentation/references.txt.)


> 2. This part is confusing and has always confused me " The
> happens-before relation (hb) links memory accesses that have to
> execute in a certain order"
>
> It is not memory accesses that execute, it is instructions that
> execute. Can we separate out "memory access" from "instruction
> execution" in this description?
>
> I think ->hb tries to say that A ->hb B means, memory access A
> happened before memory access B exactly in its associated
> instruction's execution order (time order), but to be specific --
> should that be instruction issue order, or instruction retiring order?
>
> AFAICS ->hb maps instruction execution order to memory access order.
> Not all ->po does fall into that category because of out-of-order
> hardware execution. As does not ->co because the memory subsystem may
> have writes to the same variable to be resolved out of order. It would
> be nice to call out that ->po is instruction issue order, which is
> different from execution/retiring and that's why it cannot be ->hb.
>
> ->rf does because of data flow causality, ->ppo does because of
> program structure, so that makes sense to be ->hb.
>
> IMHO, ->rfi should as well, because it is embodying a flow of data, so
> that is a bit confusing. It would be great to clarify more perhaps
> with an example about why ->rfi cannot be ->hb, in the
> "happens-before" section.
>
> That's really how far I typically get (line 1368) before life takes
> over, and I have to go do other survival-related things. Then I
> restart the activity. Now that I started reading the CAT file as well,
> I feel I can make it past that line :D. But I never wanted to get past
> it, till I built a solid understanding of the contents before it.
>
> As I read the file more, I can give more feedback, but the above are
> different 2 that persist.

AFAICT, sections "The happens-before relation: hb" and "An operational model"
in Documentation/explanation.txt elaborate on (and should help clarify) such
issues. About the ->rfi example, cf. e.g. test PPOCA in the above-mentioned
paper; the test remains allowed on arm64 and riscv.

Andrea

2023-02-13 16:48:18

by Alan Stern

[permalink] [raw]
Subject: Re: Current LKMM patch disposition

On Sun, Feb 12, 2023 at 07:54:15PM -0500, Joel Fernandes wrote:
> On Sat, Feb 11, 2023 at 9:59 PM Alan Stern <[email protected]> wrote:
> [...]
> > > is kind of why I want to understand the CAT, because for me
> > > explanation.txt is too much at a higher level to get a proper
> > > understanding of the memory model.. I tried re-reading explanation.txt
> > > many times.. then I realized I am just rewriting my own condensed set
> > > of notes every few months.
> >
> > Would you like to post a few examples showing some of the most difficult
> > points you encountered? Maybe explanation.txt can be improved.
>
> Just to list 2 of the pain points:
>
> 1. I think it is hard to reason about this section
> "PROPAGATION ORDER RELATION: cumul-fence"
>
> All store-related fences should affect propagation order, even the
> smp_wmb() which is not A-cumulative should do so (po-earlier stores
> appearing before po-later). I think expanding this section with some
> examples would make sense to understand what makes "cumul-fence"
> different from any other store-related fence.

Adding some examples is a good idea. That section is pretty terse.

> 2. This part is confusing and has always confused me " The
> happens-before relation (hb) links memory accesses that have to
> execute in a certain order"
>
> It is not memory accesses that execute, it is instructions that
> execute. Can we separate out "memory access" from "instruction
> execution" in this description?

The memory model doesn't really talk about instruction execution; it
thinks about everything in terms of events rather than instructions.
However, I agree that the document isn't very precise about the
distinction between instructions and events.

> I think ->hb tries to say that A ->hb B means, memory access A
> happened before memory access B exactly in its associated
> instruction's execution order (time order), but to be specific --
> should that be instruction issue order, or instruction retiring order?

The model isn't that specific. It doesn't recognize that instructions
take a nonzero amount of time to execute; it thinks of events as
happening instantaneously. (With exceptions perhaps for
synchronize_rcu() and synchronize_srcu().)

If you want to relate the memory model to actual hardware, it's probably
best to think in terms of instruction retiring. But even that isn't
exactly right.

For example, real CPUs can satisfy loads speculatively, possibly
multiple times, before retiring them -- you should think of a load as
executing at the _last_ time it is satisfied. This generally is after
the instruction has been issued and before it is retired. You can think
of a store as executing at the time the CPU commits to it.

> AFAICS ->hb maps instruction execution order to memory access order.

That's not the right way to think about it. In the model, a memory
access occurs when the corresponding event executes. So saying that two
events (or instructions) execute in a certain order is the same as
saying that their two memory accesses execute in that order. There's no
mapping involved.

> Not all ->po does fall into that category because of out-of-order
> hardware execution. As does not ->co because the memory subsystem may
> have writes to the same variable to be resolved out of order. It would
> be nice to call out that ->po is instruction issue order, which is
> different from execution/retiring and that's why it cannot be ->hb.

Okay, that would be a worthwhile addition.

> ->rf does because of data flow causality, ->ppo does because of
> program structure, so that makes sense to be ->hb.
>
> IMHO, ->rfi should as well, because it is embodying a flow of data, so
> that is a bit confusing. It would be great to clarify more perhaps
> with an example about why ->rfi cannot be ->hb, in the
> "happens-before" section.

Maybe. We do talk about store forwarding, and in fact the ppo section
already says:

------------------------------------------------------------------------
R ->dep W ->rfi R',

where the dep link can be either an address or a data dependency. In
this situation we know it is possible for the CPU to execute R' before
W, because it can forward the value that W will store to R'.
------------------------------------------------------------------------

I suppose this could be reiterated in the hb section.

Alan

2023-02-14 00:37:01

by Joel Fernandes

[permalink] [raw]
Subject: Re: Current LKMM patch disposition

Thanks, I agree with most of your last email, just replying to one thing:

> > ->rf does because of data flow causality, ->ppo does because of
> > program structure, so that makes sense to be ->hb.
> >
> > IMHO, ->rfi should as well, because it is embodying a flow of data, so
> > that is a bit confusing. It would be great to clarify more perhaps
> > with an example about why ->rfi cannot be ->hb, in the
> > "happens-before" section.
>
> Maybe. We do talk about store forwarding, and in fact the ppo section
> already says:
>
> ------------------------------------------------------------------------
> R ->dep W ->rfi R',
>
> where the dep link can be either an address or a data dependency. In
> this situation we know it is possible for the CPU to execute R' before
> W, because it can forward the value that W will store to R'.
> ------------------------------------------------------------------------

Thank you for pointing this out! In the text that follows this, in
this paragraph:

<quote>
where the dep link can be either an address or a data dependency. In
this situation we know it is possible for the CPU to execute R' before
W, because it can forward the value that W will store to R'. But it
cannot execute R' before R, because it cannot forward the value before
it knows what that value is, or that W and R' do access the same
location.
</quote>

The "in this situation" should be clarified to say that the "situation" is a
data dependency. Only in the case of a data dependency can the ->rfi
not cause misordering, if I understand it correctly. However, that
sentence does not mention data dependency explicitly. Or let me know
if I missed something?

Thanks,

- Joel

2023-02-14 00:52:48

by Joel Fernandes

Subject: Re: Current LKMM patch disposition

On Mon, Feb 13, 2023 at 6:15 AM Andrea Parri <[email protected]> wrote:
>
> > > Would you like to post a few examples showing some of the most difficult
> > > points you encountered? Maybe explanation.txt can be improved.
> >
> > Just to list 2 of the pain points:
> >
> > 1. I think it is hard to reason this section
> > "PROPAGATION ORDER RELATION: cumul-fence"
> >
> > All store-related fences should affect propagation order, even the
> > smp_wmb() which is not A-cumulative should do so (po-earlier stores
> > appearing before po-later). I think expanding this section with some
> > examples would make sense to understand what makes "cumul-fence"
> > different from any other store-related fence.
>
> FWIW, litmus-tests/WRC+pooncerelease+fencermbonceonce+Once.litmus illustrates
> the concept of A-cumulativity.

Right, this I knew. The smp_store_release() in that test is A-cumulative.
However, the "cumul-fence" naming in the document we are discussing
sounds redundant (but I could be missing something).
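For anyone following along, the shape of that test is roughly the following
(paraphrased from memory; the file in tools/memory-model/litmus-tests/ is the
authoritative version):

```
C WRC+pooncerelease+fencermbonceonce+Once

{}

P0(int *x)
{
	WRITE_ONCE(*x, 1);
}

P1(int *x, int *y)
{
	int r0;

	r0 = READ_ONCE(*x);
	smp_store_release(y, 1);
}

P2(int *x, int *y)
{
	int r0;
	int r1;

	r0 = READ_ONCE(*y);
	smp_rmb();
	r1 = READ_ONCE(*x);
}

exists (1:r0=1 /\ 2:r0=1 /\ 2:r1=0)
```

The point being that the smp_store_release() in P1 is A-cumulative: it
propagates P0's store to x together with its own store to y, which is what
forbids the exists clause.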

> (The terminology is not LKMM-specific, it was borrowed from other MCM literature, e.g. "Understanding POWER Multiprocessors"
> in Documentation/references.txt.)

Thank you!

> > 2. This part is confusing and has always confused me " The
> > happens-before relation (hb) links memory accesses that have to
> > execute in a certain order"
> >
> > It is not memory accesses that execute, it is instructions that
> > execute. Can we separate out "memory access" from "instruction
> > execution" in this description?
> >
> > I think ->hb tries to say that A ->hb B means, memory access A
> > happened before memory access B exactly in its associated
> > instruction's execution order (time order), but to be specific --
> > should that be instruction issue order, or instruction retiring order?
> >
> > AFAICS ->hb maps instruction execution order to memory access order.
> > Not all ->po does fall into that category because of out-of-order
> > hardware execution. As does not ->co because the memory subsystem may
> > have writes to the same variable to be resolved out of order. It would
> > be nice to call out that ->po is instruction issue order, which is
> > different from execution/retiring and that's why it cannot be ->hb.
> >
> > ->rf does because of data flow causality, ->ppo does because of
> > program structure, so that makes sense to be ->hb.
> >
> > IMHO, ->rfi should as well, because it is embodying a flow of data, so
> > that is a bit confusing. It would be great to clarify more perhaps
> > with an example about why ->rfi cannot be ->hb, in the
> > "happens-before" section.
> >
> > That's really how far I typically get (line 1368) before life takes
> > over, and I have to go do other survival-related things. Then I
> > restart the activity. Now that I started reading the CAT file as well,
> > I feel I can make it past that line :D. But I never wanted to get past
> > it, till I built a solid understanding of the contents before it.
> >
> > As I read the file more, I can give more feedback, but the above are
> > different 2 that persist.
>
> AFAICT, sections "The happens-before relation: hb" and "An operational model"
> in Documentation/explanation.txt elaborate (should help) clarify such issues.
> About the ->rfi example cf. e.g. Test PPOCA in the above mentioned paper; the
> test remains allowed in arm64 and riscv.

Thank you, this clarifies a lot, and it appears there is already a similar
relation mentioned in explanation.txt, as Alan pointed out. This paper
is great for clarifying the concept -- much appreciated! I wonder what
other ordering havoc processor branch speculation causes. This should
imply that a load-load control dependency is also subject to reordering
on these architectures.

thanks,

- Joel

2023-02-14 01:57:36

by Alan Stern

Subject: Re: Current LKMM patch disposition

On Mon, Feb 13, 2023 at 07:36:42PM -0500, Joel Fernandes wrote:
> Thanks, I agree with most of your last email, just replying to one thing:
>
> > > ->rf does because of data flow causality, ->ppo does because of
> > > program structure, so that makes sense to be ->hb.
> > >
> > > IMHO, ->rfi should as well, because it is embodying a flow of data, so
> > > that is a bit confusing. It would be great to clarify more perhaps
> > > with an example about why ->rfi cannot be ->hb, in the
> > > "happens-before" section.
> >
> > Maybe. We do talk about store forwarding, and in fact the ppo section
> > already says:
> >
> > ------------------------------------------------------------------------
> > R ->dep W ->rfi R',
> >
> > where the dep link can be either an address or a data dependency. In
> > this situation we know it is possible for the CPU to execute R' before
> > W, because it can forward the value that W will store to R'.
> > ------------------------------------------------------------------------
>
> Thank you for pointing this out! In the text that follows this, in
> this paragraph:
>
> <quote>
> where the dep link can be either an address or a data dependency. In
> this situation we know it is possible for the CPU to execute R' before
> W, because it can forward the value that W will store to R'. But it
> cannot execute R' before R, because it cannot forward the value before
> it knows what that value is, or that W and R' do access the same
> location.
> </quote>
>
> The "in this situation" should be clarified that the "situation" is a
> data-dependency. Only in the case of data-dependency, the ->rfi
> cannot cause misordering if I understand it correctly. However, that
> sentence does not mention data-dependency explicitly. Or let me know
> if I missed something?

The text explicitly says that the dep link can be either an address or a
data dependency. In either case, R' cannot be reordered before R.

In theory this doesn't have to be true for address dependencies, because
the CPU might realize that W and R' access the same address without
knowing what that address is. However, I've been reliably informed that
no existing architectures do this sort of optimization.

The case of a control dependency is different, because the CPU can
speculate that W will be executed and can speculatively forward the
value from W to R' before it knows what value R will read.

Alan

2023-02-14 02:12:53

by Joel Fernandes

Subject: Re: Current LKMM patch disposition

On Mon, Feb 13, 2023 at 8:57 PM Alan Stern <[email protected]> wrote:
>
> On Mon, Feb 13, 2023 at 07:36:42PM -0500, Joel Fernandes wrote:
> > Thanks, I agree with most of your last email, just replying to one thing:
> >
> > > > ->rf does because of data flow causality, ->ppo does because of
> > > > program structure, so that makes sense to be ->hb.
> > > >
> > > > IMHO, ->rfi should as well, because it is embodying a flow of data, so
> > > > that is a bit confusing. It would be great to clarify more perhaps
> > > > with an example about why ->rfi cannot be ->hb, in the
> > > > "happens-before" section.
> > >
> > > Maybe. We do talk about store forwarding, and in fact the ppo section
> > > already says:
> > >
> > > ------------------------------------------------------------------------
> > > R ->dep W ->rfi R',
> > >
> > > where the dep link can be either an address or a data dependency. In
> > > this situation we know it is possible for the CPU to execute R' before
> > > W, because it can forward the value that W will store to R'.
> > > ------------------------------------------------------------------------
> >
> > Thank you for pointing this out! In the text that follows this, in
> > this paragraph:
> >
> > <quote>
> > where the dep link can be either an address or a data dependency. In
> > this situation we know it is possible for the CPU to execute R' before
> > W, because it can forward the value that W will store to R'. But it
> > cannot execute R' before R, because it cannot forward the value before
> > it knows what that value is, or that W and R' do access the same
> > location.
> > </quote>
> >
> > The "in this situation" should be clarified that the "situation" is a
> > data-dependency. Only in the case of data-dependency, the ->rfi
> > cannot cause misordering if I understand it correctly. However, that
> > sentence does not mention data-dependency explicitly. Or let me know
> > if I missed something?
>
> The text explicitly says that the dep link can be either an address or a
> data dependency. In either case, R' cannot be reordered before R.
>
> In theory this doesn't have to be true for address dependencies, because
> the CPU might realize that W and R' access the same address without
> knowing what that address is. However, I've been reliably informed that
> no existing architectures do this sort of optimization.
>
> The case of a control dependency is different, because the CPU can
> speculate that W will be executed and can speculatively forward the
> value from W to R' before it knows what value R will read.
>

Sorry, I misread it. You are right. Got it now, Thanks.

- Joel

2023-02-18 06:14:17

by Joel Fernandes

Subject: Re: Current LKMM patch disposition

Hi Alan,

On Sat, Feb 11, 2023 at 9:59 PM Alan Stern <[email protected]> wrote:
>
[...]
>
> Would you like to post a few examples showing some of the most difficult
> points you encountered? Maybe explanation.txt can be improved.

One additional piece of feedback I wanted to mention, regarding this
paragraph under "WARNING":
===========
The protections provided by READ_ONCE(), WRITE_ONCE(), and others are
not perfect; and under some circumstances it is possible for the
compiler to undermine the memory model. Here is an example. Suppose
both branches of an "if" statement store the same value to the same
location:
r1 = READ_ONCE(x);
if (r1) {
	WRITE_ONCE(y, 2);
	... /* do something */
} else {
	WRITE_ONCE(y, 2);
	... /* do something else */
}
===========

I tried lots of different compilers with varying degrees of
optimization; in all cases I find that the conditional instruction
always appears in program order before the stores inside the body of
the conditional. So I am not sure whether this is really a valid concern
with current compilers. If not, could you provide an example of a
compiler and options that cause it?

In any case, if it is a theoretical concern, the text could be clarified
to say that this is a theoretical possibility. And if it is a
real/practical concern, then the specific compiler/arch it was seen
on could be mentioned.

Thanks!

- Joel



>
> > > I'm not sure that breaking this relation up into pieces will make it any
> > > easier to understand.
> >
> > Yes, but I tried. I will keep trying to understand your last patch
> > more. Especially I am still not sure, why in the case of an SRCU
> > reader on a single CPU, the following does not work:
> > let srcu-rscs = ([Srcu-lock]; data; [Srcu-unlock]).
>
> You have to understand that herd7 does not track dependencies through
> stores and subsequent loads. That is, if you have something like:
>
> 	r1 = READ_ONCE(*x);
> 	WRITE_ONCE(*y, r1);
> 	r2 = READ_ONCE(*y);
> 	WRITE_ONCE(*z, r2);
>
> then herd7 will realize that the write to y depends on the value read
> from x, and it will realize that the write to z depends on the value
> read from y. But it will not realize that the write to z depends on the
> value read from x; it loses track of that dependency because of the
> intervening store/load from y.
>
> More to the point, if you have:
>
> 	r1 = srcu_read_lock(lock);
> 	WRITE_ONCE(*y, r1);
> 	r2 = READ_ONCE(*y);
> 	srcu_read_unlock(lock, r2);
>
> then herd7 will not realize that the value of r2 depends on the value of
> r1. So there will be no data dependency from the srcu_read_lock() to
> the srcu_read_unlock().
>
> Alan

2023-02-18 19:21:34

by Paul E. McKenney

Subject: Re: Current LKMM patch disposition

On Sat, Feb 18, 2023 at 01:13:59AM -0500, Joel Fernandes wrote:
> Hi Alan,
>
> On Sat, Feb 11, 2023 at 9:59 PM Alan Stern <[email protected]> wrote:
> >
> [...]
> >
> > Would you like to post a few examples showing some of the most difficult
> > points you encountered? Maybe explanation.txt can be improved.
>
> One additional feedback I wanted to mention, regarding this paragraph
> under "WARNING":
> ===========
> The protections provided by READ_ONCE(), WRITE_ONCE(), and others are
> not perfect; and under some circumstances it is possible for the
> compiler to undermine the memory model. Here is an example. Suppose
> both branches of an "if" statement store the same value to the same
> location:
> r1 = READ_ONCE(x);
> if (r1) {
> 	WRITE_ONCE(y, 2);
> 	... /* do something */
> } else {
> 	WRITE_ONCE(y, 2);
> 	... /* do something else */
> }
> ===========
>
> I tried lots of different compilers with varying degrees of
> optimization, in all cases I find that the conditional instruction
> always appears in program order before the stores inside the body of
> the conditional. So I am not sure if this is really a valid concern on
> current compilers, if not - could you provide an example of a compiler
> and options that cause it?
>
> In any case, if it is a theoretical concern, it could be clarified
> that this is a theoretical possibility in the text. And if it is a
> real/practical concern, then it could be mentioned the specific
> compiler/arch this was seen in.

I could be misremembering, but I believe that this reordering has been
seen in the past.

Thanx, Paul

> Thanks!
>
> - Joel
>
>
>
> >
> > > > I'm not sure that breaking this relation up into pieces will make it any
> > > > easier to understand.
> > >
> > > Yes, but I tried. I will keep trying to understand your last patch
> > > more. Especially I am still not sure, why in the case of an SRCU
> > > reader on a single CPU, the following does not work:
> > > let srcu-rscs = ([Srcu-lock]; data; [Srcu-unlock]).
> >
> > You have to understand that herd7 does not track dependencies through
> > stores and subsequent loads. That is, if you have something like:
> >
> > 	r1 = READ_ONCE(*x);
> > 	WRITE_ONCE(*y, r1);
> > 	r2 = READ_ONCE(*y);
> > 	WRITE_ONCE(*z, r2);
> >
> > then herd7 will realize that the write to y depends on the value read
> > from x, and it will realize that the write to z depends on the value
> > read from y. But it will not realize that the write to z depends on the
> > value read from x; it loses track of that dependency because of the
> > intervening store/load from y.
> >
> > More to the point, if you have:
> >
> > 	r1 = srcu_read_lock(lock);
> > 	WRITE_ONCE(*y, r1);
> > 	r2 = READ_ONCE(*y);
> > 	srcu_read_unlock(lock, r2);
> >
> > then herd7 will not realize that the value of r2 depends on the value of
> > r1. So there will be no data dependency from the srcu_read_lock() to
> > the srcu_read_unlock().
> >
> > Alan

2023-02-19 02:05:16

by Andrea Parri

Subject: Re: Current LKMM patch disposition

> One additional feedback I wanted to mention, regarding this paragraph
> under "WARNING":
> ===========
> The protections provided by READ_ONCE(), WRITE_ONCE(), and others are
> not perfect; and under some circumstances it is possible for the
> compiler to undermine the memory model. Here is an example. Suppose
> both branches of an "if" statement store the same value to the same
> location:
> r1 = READ_ONCE(x);
> if (r1) {
> 	WRITE_ONCE(y, 2);
> 	... /* do something */
> } else {
> 	WRITE_ONCE(y, 2);
> 	... /* do something else */
> }
> ===========
>
> I tried lots of different compilers with varying degrees of
> optimization, in all cases I find that the conditional instruction
> always appears in program order before the stores inside the body of
> the conditional. So I am not sure if this is really a valid concern on
> current compilers, if not - could you provide an example of a compiler
> and options that cause it?

The compiler cannot change the order in which the load and the store
appear in the program (these are "volatile accesses"); the concern is
that (quoting from the .txt) it "could lift the stores out of the
conditional", thus effectively destroying the control dependency between
the load and the store (the load-store "reordering" could then be
performed by the uarch, under certain archs). For example, compare:

(for the C snippet)

void func(int *x, int *y)
{
	int r1 = *(const volatile int *)x;

	if (r1)
		*(volatile int *)y = 2;
	else
		*(volatile int *)y = 2;
}

- arm64 gcc 11.3 -O1 gives:

func:
	ldr w0, [x0]
	cbz w0, .L2
	mov w0, 2
	str w0, [x1]
.L1:
	ret
.L2:
	mov w0, 2
	str w0, [x1]
	b .L1

- OTOH, arm64 gcc 11.3 -O2 gives:

func:
	ldr w0, [x0]
	mov w0, 2
	str w0, [x1]
	ret

- similarly, using arm64 clang 14.0.0 -O2,

func: // @func
	mov w8, #2
	ldr wzr, [x0]
	str w8, [x1]
	ret

I saw similar results using riscv, powerpc, x86 gcc & clang.

Andrea

2023-02-19 02:59:00

by Joel Fernandes

Subject: Re: Current LKMM patch disposition

On Sat, Feb 18, 2023 at 9:05 PM Andrea Parri <[email protected]> wrote:
>
> > One additional feedback I wanted to mention, regarding this paragraph
> > under "WARNING":
> > ===========
> > The protections provided by READ_ONCE(), WRITE_ONCE(), and others are
> > not perfect; and under some circumstances it is possible for the
> > compiler to undermine the memory model. Here is an example. Suppose
> > both branches of an "if" statement store the same value to the same
> > location:
> > r1 = READ_ONCE(x);
> > if (r1) {
> > 	WRITE_ONCE(y, 2);
> > 	... /* do something */
> > } else {
> > 	WRITE_ONCE(y, 2);
> > 	... /* do something else */
> > }
> > ===========
> >
> > I tried lots of different compilers with varying degrees of
> > optimization, in all cases I find that the conditional instruction
> > always appears in program order before the stores inside the body of
> > the conditional. So I am not sure if this is really a valid concern on
> > current compilers, if not - could you provide an example of a compiler
> > and options that cause it?
>
> The compiler cannot change the order in which the load and the store
> appear in the program (these are "volatile accesses"); the concern is
> that (quoting from the .txt) it "could list the stores out of the
> conditional", thus effectively destroying the control dependency between
> the load and the store (the load-store "reordering" could then be
> performed by the uarch, under certain archs). For example, compare:
>
> (for the C snippet)
>
> void func(int *x, int *y)
> {
> 	int r1 = *(const volatile int *)x;
>
> 	if (r1)
> 		*(volatile int *)y = 2;
> 	else
> 		*(volatile int *)y = 2;
> }
>
> - arm64 gcc 11.3 -O1 gives:
>
> func:
> 	ldr w0, [x0]
> 	cbz w0, .L2
> 	mov w0, 2
> 	str w0, [x1]
> .L1:
> 	ret
> .L2:
> 	mov w0, 2
> 	str w0, [x1]
> 	b .L1
>
> - OTOH, arm64 gcc 11.3 -O2 gives:
>
> func:
> 	ldr w0, [x0]
> 	mov w0, 2
> 	str w0, [x1]
> 	ret
>
> - similarly, using arm64 clang 14.0.0 -O2,
>
> func: // @func
> 	mov w8, #2
> 	ldr wzr, [x0]
> 	str w8, [x1]
> 	ret
>
> I saw similar results using riscv, powerpc, x86 gcc & clang.

*Sigh*, yeah, and even if the conditional logic is not fully stripped
out as in your example, I still see arm64 perform the stores in
program order before ever checking the branch condition:

int prog(void)
{
	int r1 = *(const volatile int *)x;
	if (r1) {
		*(volatile int *)y = 1;
	} else {
		*(volatile int *)y = 1;
		*(volatile int *)z = 2;
	}
}

becomes with armv8 clang and -Os:

prog:
	adrp x8, x
	ldrsw x8, [x8, :lo12:x]
	adrp x9, y
	mov w10, #1
	ldr w8, [x8]
	ldrsw x9, [x9, :lo12:y]
	str w10, [x9]
	cbz w8, .LBB0_2 // LOL
	ret
.LBB0_2:
	adrp x8, z
	ldrsw x8, [x8, :lo12:z]
	mov w9, #2
	str w9, [x8]
	ret

Anyway, sorry for the noise... I believe I did not have the right set
of compiler options yesterday.

Thanks!

- Joel

2023-02-19 03:21:00

by Joel Fernandes

Subject: Re: Current LKMM patch disposition

On Sat, Feb 18, 2023 at 2:21 PM Paul E. McKenney <[email protected]> wrote:
>
> On Sat, Feb 18, 2023 at 01:13:59AM -0500, Joel Fernandes wrote:
> > Hi Alan,
> >
> > On Sat, Feb 11, 2023 at 9:59 PM Alan Stern <[email protected]> wrote:
> > >
> > [...]
> > >
> > > Would you like to post a few examples showing some of the most difficult
> > > points you encountered? Maybe explanation.txt can be improved.
> >
> > One additional feedback I wanted to mention, regarding this paragraph
> > under "WARNING":
> > ===========
> > The protections provided by READ_ONCE(), WRITE_ONCE(), and others are
> > not perfect; and under some circumstances it is possible for the
> > compiler to undermine the memory model. Here is an example. Suppose
> > both branches of an "if" statement store the same value to the same
> > location:
> > r1 = READ_ONCE(x);
> > if (r1) {
> > 	WRITE_ONCE(y, 2);
> > 	... /* do something */
> > } else {
> > 	WRITE_ONCE(y, 2);
> > 	... /* do something else */
> > }
> > ===========
> >
> > I tried lots of different compilers with varying degrees of
> > optimization, in all cases I find that the conditional instruction
> > always appears in program order before the stores inside the body of
> > the conditional. So I am not sure if this is really a valid concern on
> > current compilers, if not - could you provide an example of a compiler
> > and options that cause it?
> >
> > In any case, if it is a theoretical concern, it could be clarified
> > that this is a theoretical possibility in the text. And if it is a
> > real/practical concern, then it could be mentioned the specific
> > compiler/arch this was seen in.
>
> I could be misremembering, but I believe that this reordering has been
> seen in the past.
>

Thank you! And I also confirmed that putting a barrier() in the branch
body "cures" the optimization... I did not know compilers optimized
so aggressively..

- Joel

2023-02-19 08:09:31

by Paul E. McKenney

Subject: Re: Current LKMM patch disposition

On Sat, Feb 18, 2023 at 10:20:39PM -0500, Joel Fernandes wrote:
> On Sat, Feb 18, 2023 at 2:21 PM Paul E. McKenney <[email protected]> wrote:
> >
> > On Sat, Feb 18, 2023 at 01:13:59AM -0500, Joel Fernandes wrote:
> > > Hi Alan,
> > >
> > > On Sat, Feb 11, 2023 at 9:59 PM Alan Stern <[email protected]> wrote:
> > > >
> > > [...]
> > > >
> > > > Would you like to post a few examples showing some of the most difficult
> > > > points you encountered? Maybe explanation.txt can be improved.
> > >
> > > One additional feedback I wanted to mention, regarding this paragraph
> > > under "WARNING":
> > > ===========
> > > The protections provided by READ_ONCE(), WRITE_ONCE(), and others are
> > > not perfect; and under some circumstances it is possible for the
> > > compiler to undermine the memory model. Here is an example. Suppose
> > > both branches of an "if" statement store the same value to the same
> > > location:
> > > r1 = READ_ONCE(x);
> > > if (r1) {
> > > 	WRITE_ONCE(y, 2);
> > > 	... /* do something */
> > > } else {
> > > 	WRITE_ONCE(y, 2);
> > > 	... /* do something else */
> > > }
> > > ===========
> > >
> > > I tried lots of different compilers with varying degrees of
> > > optimization, in all cases I find that the conditional instruction
> > > always appears in program order before the stores inside the body of
> > > the conditional. So I am not sure if this is really a valid concern on
> > > current compilers, if not - could you provide an example of a compiler
> > > and options that cause it?
> > >
> > > In any case, if it is a theoretical concern, it could be clarified
> > > that this is a theoretical possibility in the text. And if it is a
> > > real/practical concern, then it could be mentioned the specific
> > > compiler/arch this was seen in.
> >
> > I could be misremembering, but I believe that this reordering has been
> > seen in the past.
> >
>
> Thank you! And I also confirmed putting a barrier() in the branch
> body, also "cures" the optimization... I did not know compilers
> optimize so aggressively..

And the compilers are just getting started... :-/

Thanx, Paul