tools/memory-model/ and herdtool7 are closely linked: the latter is
responsible for (pre)processing each C-like macro of a litmus test,
and for providing the LKMM with a set of events, or "representation",
corresponding to the given macro. Provide herd-representation.txt
to document the representation of synchronization macros, following
their "classification" in Documentation/atomic_t.txt.
Suggested-by: Hernan Ponce de Leon <[email protected]>
Signed-off-by: Andrea Parri <[email protected]>
---
- Leaving srcu_{up,down}_read() and smp_mb__after_srcu_read_unlock() for
the next version.
- Limiting to "add" and "and" ops (skipping similar/same representations
for "sub", "inc", "dec", "or", "xor", "andnot").
- While preparing this submission, I recalled that atomic_add_unless()
is not listed in the .def file. I can't remember the reason for this
omission though.
- While checking the information below using herd7, I've observed some
"strange" behavior with spin_is_locked() (perhaps, unsurprisingly...);
IAC, that's also excluded from this table/submission.
.../Documentation/herd-representation.txt | 81 +++++++++++++++++++
1 file changed, 81 insertions(+)
create mode 100644 tools/memory-model/Documentation/herd-representation.txt
diff --git a/tools/memory-model/Documentation/herd-representation.txt b/tools/memory-model/Documentation/herd-representation.txt
new file mode 100644
index 0000000000000..94d0d0a9eee50
--- /dev/null
+++ b/tools/memory-model/Documentation/herd-representation.txt
@@ -0,0 +1,81 @@
+ ---------------------------------------------------------------------------
+ | C macro | Events |
+ ---------------------------------------------------------------------------
+ | Non-RMW ops | |
+ ---------------------------------------------------------------------------
+ | READ_ONCE | R[once] |
+ | atomic_read | (as in the previous row) |
+ | WRITE_ONCE | W[once] |
+ | atomic_set | |
+ | smp_load_acquire | R[acquire] |
+ | atomic_read_acquire | |
+ | smp_store_release | W[release] |
+ | atomic_set_release | |
+ | smp_store_mb | W[once] ->po F[mb] |
+ | smp_mb | F[mb] |
+ | smp_rmb | F[rmb] |
+ | smp_wmb | F[wmb] |
+ | smp_mb__before_atomic | F[before-atomic] |
+ | smp_mb__after_atomic | F[after-atomic] |
+ | spin_unlock | UL |
+ | smp_mb__after_spinlock | F[after-spinlock] |
+ | smp_mb__after_unlock_lock | F[after-unlock-lock] |
+ | rcu_read_lock | F[rcu-lock] |
+ | rcu_read_unlock | F[rcu-unlock] |
+ | synchronize_rcu | F[sync-rcu] |
+ | rcu_dereference | R[once] |
+ | rcu_assign_pointer | W[release] |
+ | srcu_read_lock | R[srcu-lock] |
+ | srcu_read_unlock | W[srcu-unlock] |
+ | synchronize_srcu | SRCU[sync-srcu] |
+ ---------------------------------------------------------------------------
+ | RMW ops w/o return value | |
+ ---------------------------------------------------------------------------
+ | atomic_add | R*[noreturn] ->rmw W*[once] |
+ | atomic_and | |
+ | spin_lock | LKR ->lk-rmw LKW |
+ ---------------------------------------------------------------------------
+ | RMW ops w/ return value | |
+ ---------------------------------------------------------------------------
+ | atomic_add_return | F[mb] ->po R*[once] |
+ | | ->rmw W*[once] ->po F[mb] |
+ | atomic_fetch_add | |
+ | atomic_fetch_and | |
+ | atomic_xchg | |
+ | xchg | |
+ | atomic_add_negative | |
+ | atomic_add_return_relaxed | R*[once] ->rmw W*[once] |
+ | atomic_fetch_add_relaxed | |
+ | atomic_fetch_and_relaxed | |
+ | atomic_xchg_relaxed | |
+ | xchg_relaxed | |
+ | atomic_add_negative_relaxed | |
+ | atomic_add_return_acquire | R*[acquire] ->rmw W*[once] |
+ | atomic_fetch_add_acquire | |
+ | atomic_fetch_and_acquire | |
+ | atomic_xchg_acquire | |
+ | xchg_acquire | |
+ | atomic_add_negative_acquire | |
+ | atomic_add_return_release | R*[once] ->rmw W*[release] |
+ | atomic_fetch_add_release | |
+ | atomic_fetch_and_release | |
+ | atomic_xchg_release | |
+ | xchg_release | |
+ | atomic_add_negative_release | |
+ ---------------------------------------------------------------------------
+ | Conditional RMW ops | |
+ ---------------------------------------------------------------------------
+ | atomic_cmpxchg | On success: F[mb] ->po R*[once] |
+ | | ->rmw W*[once] ->po F[mb] |
+ | | On failure: R*[once] |
+ | cmpxchg | |
+ | atomic_add_unless | |
+ | atomic_cmpxchg_relaxed | On success: R*[once] ->rmw W*[once] |
+ | | On failure: R*[once] |
+ | atomic_cmpxchg_acquire | On success: R*[acquire] ->rmw W*[once] |
+ | | On failure: R*[once] |
+ | atomic_cmpxchg_release | On success: R*[once] ->rmw W*[release] |
+ | | On failure: R*[once] |
+ | spin_trylock | On success: LKR ->lk-rmw LKW |
+ | | On failure: LF |
+ ---------------------------------------------------------------------------
--
2.34.1
> - While checking the information below using herd7, I've observed some
> "strange" behavior with spin_is_locked() (perhaps, unsurprisingly...);
> IAC, that's also excluded from this table/submission.
For completeness, the behavior in question:
$ cat T.litmus
C T
{}
P0(spinlock_t *x)
{
int r0;
spin_lock(x);
spin_unlock(x);
r0 = spin_is_locked(x);
}
$ herd7 -conf linux-kernel.cfg T.litmus
Test T Required
States 0
Ok
Witnesses
Positive: 0 Negative: 0
Condition forall (true)
Observation T Never 0 0
Time T 0.00
Hash=6fa204e139ddddf2cb6fa963bad117c0
Haven't been using spin_is_locked for a while... perhaps I'm doing
something wrong? (IAC, will have a closer look next week...)
Andrea
On Fri, May 24, 2024 at 05:37:06PM +0200, Andrea Parri wrote:
> > - While checking the information below using herd7, I've observed some
> > "strange" behavior with spin_is_locked() (perhaps, unsurprisingly...);
> > IAC, that's also excluded from this table/submission.
>
> For completeness, the behavior in question:
>
> $ cat T.litmus
> C T
>
> {}
>
> P0(spinlock_t *x)
> {
> int r0;
>
> spin_lock(x);
> spin_unlock(x);
> r0 = spin_is_locked(x);
> }
No "exists" clause? Maybe that's your problem.
Alan
>
> $ herd7 -conf linux-kernel.cfg T.litmus
> Test T Required
> States 0
> Ok
> Witnesses
> Positive: 0 Negative: 0
> Condition forall (true)
> Observation T Never 0 0
> Time T 0.00
> Hash=6fa204e139ddddf2cb6fa963bad117c0
>
> Haven't been using spin_is_locked for a while... perhaps I'm doing
> something wrong? (IAC, will have a closer look next week...)
>
> Andrea
On Fri, May 24, 2024 at 05:13:56PM +0200, Andrea Parri wrote:
> tools/memory-model/ and herdtool7 are closely linked: the latter is
> responsible for (pre)processing each C-like macro of a litmus test,
> and for providing the LKMM with a set of events, or "representation",
> corresponding to the given macro. Provide herd-representation.txt
> to document the representation of synchronization macros, following
> their "classification" in Documentation/atomic_t.txt.
>
> Suggested-by: Hernan Ponce de Leon <[email protected]>
> Signed-off-by: Andrea Parri <[email protected]>
> ---
> + | rcu_dereference | R[once] |
> + | rcu_assign_pointer | W[release] |
> + | srcu_read_lock | R[srcu-lock] |
> + | srcu_read_unlock | W[srcu-unlock] |
> + | synchronize_srcu | SRCU[sync-srcu] |
> + ---------------------------------------------------------------------------
> + | RMW ops w/o return value | |
> + ---------------------------------------------------------------------------
> + | atomic_add | R*[noreturn] ->rmw W*[once] |
What's the difference between R and R*, or between W and W*?
Alan
> > $ cat T.litmus
> > C T
> >
> > {}
> >
> > P0(spinlock_t *x)
> > {
> > int r0;
> >
> > spin_lock(x);
> > spin_unlock(x);
> > r0 = spin_is_locked(x);
> > }
>
> No "exists" clause? Maybe that's your problem.
Nope, that doesn't seem to be it. (Same result after adding one.)
Andrea
> What's the difference between R and R*, or between W and W*?
AFAIU, herd7 uses such notation, "*", to denote a load or a store which
is also in RMW.
Andrea
On Fri, May 24, 2024 at 05:37:06PM +0200, Andrea Parri wrote:
> > - While checking the information below using herd7, I've observed some
> > "strange" behavior with spin_is_locked() (perhaps, unsurprisingly...);
> > IAC, that's also excluded from this table/submission.
>
> For completeness, the behavior in question:
>
> $ cat T.litmus
> C T
>
> {}
>
> P0(spinlock_t *x)
> {
> int r0;
>
> spin_lock(x);
> spin_unlock(x);
> r0 = spin_is_locked(x);
> }
>
> $ herd7 -conf linux-kernel.cfg T.litmus
> Test T Required
> States 0
> Ok
> Witnesses
> Positive: 0 Negative: 0
> Condition forall (true)
> Observation T Never 0 0
> Time T 0.00
> Hash=6fa204e139ddddf2cb6fa963bad117c0
>
> Haven't been using spin_is_locked for a while... perhaps I'm doing
> something wrong? (IAC, will have a closer look next week...)
I get the same empty result. There's definitely something going wrong
in the "with ... from cross(...)" lines in lock.cat. I need to do some
checking and testing.
Also, lock.cat doesn't check for R events that don't actually read from
anything (which will happen when the spin_is_locked() call above
generates an RL event). This is a separate bug, easily fixed.
Alan
On Fri, May 24, 2024 at 05:37:06PM +0200, Andrea Parri wrote:
> > - While checking the information below using herd7, I've observed some
> > "strange" behavior with spin_is_locked() (perhaps, unsurprisingly...);
> > IAC, that's also excluded from this table/submission.
>
> For completeness, the behavior in question:
>
> $ cat T.litmus
> C T
>
> {}
>
> P0(spinlock_t *x)
> {
> int r0;
>
> spin_lock(x);
> spin_unlock(x);
> r0 = spin_is_locked(x);
> }
>
> $ herd7 -conf linux-kernel.cfg T.litmus
> Test T Required
> States 0
> Ok
> Witnesses
> Positive: 0 Negative: 0
> Condition forall (true)
> Observation T Never 0 0
> Time T 0.00
> Hash=6fa204e139ddddf2cb6fa963bad117c0
>
> Haven't been using spin_is_locked for a while... perhaps I'm doing
> something wrong? (IAC, will have a closer look next week...)
It turns out the problem lies in the way lock.cat tries to calculate the
rf relation for RU events (a spin_is_locked() that returns False). The
method it uses amounts to requiring that such events must read from the
lock's initial value or an LU event (a spin_unlock()) in a different
thread. This clearly is wrong, and glaringly so in this litmus test
since there are no other threads!
A patch to fix the problem and reorganize the code a bit for greater
readability is below. I'd appreciate it if people could try it out on
various locking litmus tests in our archives.
Alan
---
tools/memory-model/lock.cat | 61 +++++++++++++++++++++++++-------------------
1 file changed, 36 insertions(+), 25 deletions(-)
Index: usb-devel/tools/memory-model/lock.cat
===================================================================
--- usb-devel.orig/tools/memory-model/lock.cat
+++ usb-devel/tools/memory-model/lock.cat
@@ -54,6 +54,12 @@ flag ~empty LKR \ domain(lk-rmw) as unpa
*)
empty ([LKW] ; po-loc ; [LKR]) \ (po-loc ; [UL] ; po-loc) as lock-nest
+(*
+ * In the same way, spin_is_locked() inside a critical section must always
+ * return True (no RU events can be in a critical section for the same lock).
+ *)
+empty ([LKW] ; po-loc ; [RU]) \ (po-loc ; [UL] ; po-loc) as nested-is-locked
+
(* The final value of a spinlock should not be tested *)
flag ~empty [FW] ; loc ; [ALL-LOCKS] as lock-final
@@ -79,42 +85,47 @@ empty ([UNMATCHED-LKW] ; loc ; [UNMATCHE
(* rfi for LF events: link each LKW to the LF events in its critical section *)
let rfi-lf = ([LKW] ; po-loc ; [LF]) \ ([LKW] ; po-loc ; [UL] ; po-loc)
-(* rfe for LF events *)
+(* Utility macro to convert a single pair to a single-edge relation *)
+let pair-to-relation p = p ++ 0
+
+(*
+ * Given an LF event r outside a critical section, r cannot read
+ * internally but it may read from an LKW event in another thread.
+ * Compute the relation containing these possible edges.
+ *)
+let possible-rfe-noncrit-lf r = (LKW * {r}) & loc & ext
+
+(* Compute set of sets of possible rfe edges for LF events *)
let all-possible-rfe-lf =
- (*
- * Given an LF event r, compute the possible rfe edges for that event
- * (all those starting from LKW events in other threads),
- * and then convert that relation to a set of single-edge relations.
- *)
- let possible-rfe-lf r =
- let pair-to-relation p = p ++ 0
- in map pair-to-relation ((LKW * {r}) & loc & ext)
+ (* Convert the possible-rfe relation for r to a set of single edges *)
+ let set-of-singleton-rfe-lf r =
+ map pair-to-relation (possible-rfe-noncrit-lf r)
(* Do this for each LF event r that isn't in rfi-lf *)
- in map possible-rfe-lf (LF \ range(rfi-lf))
+ in map set-of-singleton-rfe-lf (LF \ range(rfi-lf))
(* Generate all rf relations for LF events *)
with rfe-lf from cross(all-possible-rfe-lf)
let rf-lf = rfe-lf | rfi-lf
(*
- * RU, i.e., spin_is_locked() returning False, is slightly different.
- * We rely on the memory model to rule out cases where spin_is_locked()
- * within one of the lock's critical sections returns False.
+ * Given an RU event r, r may read internally from the last po-previous UL,
+ * or it may read from a UL event in another thread or the initial write.
+ * Compute the relation containing these possible edges.
*)
-
-(* rfi for RU events: an RU may read from the last po-previous UL *)
-let rfi-ru = ([UL] ; po-loc ; [RU]) \ ([UL] ; po-loc ; [LKW] ; po-loc)
-
-(* rfe for RU events: an RU may read from an external UL or the initial write *)
-let all-possible-rfe-ru =
- let possible-rfe-ru r =
- let pair-to-relation p = p ++ 0
- in map pair-to-relation (((UL | IW) * {r}) & loc & ext)
- in map possible-rfe-ru RU
+let possible-rf-ru r = (((UL * {r}) & po-loc) \
+ ([UL] ; po-loc ; [UL] ; po-loc)) |
+ (((UL | IW) * {r}) & loc & ext)
+
+(* Compute set of sets of possible rf edges for RU events *)
+let all-possible-rf-ru =
+ (* Convert the possible-rf relation for r to a set of single edges *)
+ let set-of-singleton-rf-ru r =
+ map pair-to-relation (possible-rf-ru r)
+ (* Do this for each RU event r *)
+ in map set-of-singleton-rf-ru RU
(* Generate all rf relations for RU events *)
-with rfe-ru from cross(all-possible-rfe-ru)
-let rf-ru = rfe-ru | rfi-ru
+with rf-ru from cross(all-possible-rf-ru)
(* Final rf relation *)
let rf = rf | rf-lf | rf-ru
> It turns out the problem lies in the way lock.cat tries to calculate the
> rf relation for RU events (a spin_is_locked() that returns False). The
> method it uses amounts to requiring that such events must read from the
> lock's initial value or an LU event (a spin_unlock()) in a different
> thread. This clearly is wrong, and glaringly so in this litmus test
> since there are no other threads!
>
> A patch to fix the problem and reorganize the code a bit for greater
> readability is below. I'd appreciate it if people could try it out on
> various locking litmus tests in our archives.
Thanks for the quick solution, Alan. The results from our archives look
good.
Andrea
On 5/24/2024 6:00 PM, Andrea Parri wrote:
>> What's the difference between R and R*, or between W and W*?
>
> AFAIU, herd7 uses such notation, "*", to denote a load or a store which
> is also in RMW.
I also got confused with this. What about the following notation?
R[once,RMW] ->rmw W[once,RMW]
>
> Andrea
On 5/24/2024 5:13 PM, Andrea Parri wrote:
> tools/memory-model/ and herdtool7 are closely linked: the latter is
> responsible for (pre)processing each C-like macro of a litmus test,
> and for providing the LKMM with a set of events, or "representation",
> corresponding to the given macro. Provide herd-representation.txt
> to document the representation of synchronization macros, following
> their "classification" in Documentation/atomic_t.txt.
>
> Suggested-by: Hernan Ponce de Leon <[email protected]>
> Signed-off-by: Andrea Parri <[email protected]>
> ---
> - Leaving srcu_{up,down}_read() and smp_mb__after_srcu_read_unlock() for
> the next version.
>
> - Limiting to "add" and "and" ops (skipping similar/same representations
> for "sub", "inc", "dec", "or", "xor", "andnot").
>
> - While preparing this submission, I recalled that atomic_add_unless()
> is not listed in the .def file. I can't remember the reason for this
> omission though.
>
> - While checking the information below using herd7, I've observed some
> "strange" behavior with spin_is_locked() (perhaps, unsurprisingly...);
> IAC, that's also excluded from this table/submission.
>
>
> .../Documentation/herd-representation.txt | 81 +++++++++++++++++++
> 1 file changed, 81 insertions(+)
> create mode 100644 tools/memory-model/Documentation/herd-representation.txt
>
> diff --git a/tools/memory-model/Documentation/herd-representation.txt b/tools/memory-model/Documentation/herd-representation.txt
> new file mode 100644
> index 0000000000000..94d0d0a9eee50
> --- /dev/null
> +++ b/tools/memory-model/Documentation/herd-representation.txt
> @@ -0,0 +1,81 @@
> + ---------------------------------------------------------------------------
> + | C macro | Events |
> + ---------------------------------------------------------------------------
> + | Non-RMW ops | |
> + ---------------------------------------------------------------------------
> + | READ_ONCE | R[once] |
> + | atomic_read | (as in the previous row) |
> + | WRITE_ONCE | W[once] |
> + | atomic_set | |
> + | smp_load_acquire | R[acquire] |
> + | atomic_read_acquire | |
> + | smp_store_release | W[release] |
> + | atomic_set_release | |
> + | smp_store_mb | W[once] ->po F[mb] |
I expect this one to be hard-coded in herd7 source code, but I cannot
find it. Can you give me a pointer?
In fact, dartagnan uses W[Mb] ... another clear example of the need for
documentation as this one.
> + | smp_mb | F[mb] |
> + | smp_rmb | F[rmb] |
> + | smp_wmb | F[wmb] |
> + | smp_mb__before_atomic | F[before-atomic] |
> + | smp_mb__after_atomic | F[after-atomic] |
> + | spin_unlock | UL |
> + | smp_mb__after_spinlock | F[after-spinlock] |
> + | smp_mb__after_unlock_lock | F[after-unlock-lock] |
> + | rcu_read_lock | F[rcu-lock] |
> + | rcu_read_unlock | F[rcu-unlock] |
> + | synchronize_rcu | F[sync-rcu] |
> + | rcu_dereference | R[once] |
> + | rcu_assign_pointer | W[release] |
> + | srcu_read_lock | R[srcu-lock] |
> + | srcu_read_unlock | W[srcu-unlock] |
> + | synchronize_srcu | SRCU[sync-srcu] |
> + ---------------------------------------------------------------------------
> + | RMW ops w/o return value | |
> + ---------------------------------------------------------------------------
> + | atomic_add | R*[noreturn] ->rmw W*[once] |
> + | atomic_and | |
> + | spin_lock | LKR ->lk-rmw LKW |
What about spin_unlock?
> + ---------------------------------------------------------------------------
> + | RMW ops w/ return value | |
> + ---------------------------------------------------------------------------
> + | atomic_add_return | F[mb] ->po R*[once] |
> + | | ->rmw W*[once] ->po F[mb] |
> + | atomic_fetch_add | |
> + | atomic_fetch_and | |
> + | atomic_xchg | |
> + | xchg | |
> + | atomic_add_negative | |
> + | atomic_add_return_relaxed | R*[once] ->rmw W*[once] |
> + | atomic_fetch_add_relaxed | |
> + | atomic_fetch_and_relaxed | |
> + | atomic_xchg_relaxed | |
> + | xchg_relaxed | |
> + | atomic_add_negative_relaxed | |
> + | atomic_add_return_acquire | R*[acquire] ->rmw W*[once] |
> + | atomic_fetch_add_acquire | |
> + | atomic_fetch_and_acquire | |
> + | atomic_xchg_acquire | |
> + | xchg_acquire | |
> + | atomic_add_negative_acquire | |
> + | atomic_add_return_release | R*[once] ->rmw W*[release] |
> + | atomic_fetch_add_release | |
> + | atomic_fetch_and_release | |
> + | atomic_xchg_release | |
> + | xchg_release | |
> + | atomic_add_negative_release | |
> + ---------------------------------------------------------------------------
> + | Conditional RMW ops | |
> + ---------------------------------------------------------------------------
> + | atomic_cmpxchg | On success: F[mb] ->po R*[once] |
> + | | ->rmw W*[once] ->po F[mb] |
> + | | On failure: R*[once] |
> + | cmpxchg | |
> + | atomic_add_unless | |
> + | atomic_cmpxchg_relaxed | On success: R*[once] ->rmw W*[once] |
> + | | On failure: R*[once] |
> + | atomic_cmpxchg_acquire | On success: R*[acquire] ->rmw W*[once] |
> + | | On failure: R*[once] |
> + | atomic_cmpxchg_release | On success: R*[once] ->rmw W*[release] |
> + | | On failure: R*[once] |
> + | spin_trylock | On success: LKR ->lk-rmw LKW |
> + | | On failure: LF |
> + ---------------------------------------------------------------------------
I found the extra spaces in the failure case very hard to read. Any
particular reason why you went with this format?
Hernan
Am 5/27/2024 um 2:25 PM schrieb Hernan Ponce de Leon:
> On 5/24/2024 6:00 PM, Andrea Parri wrote:
>>> What's the difference between R and R*, or between W and W*?
>>
>> AFAIU, herd7 uses such notation, "*", to denote a load or a store which
>> is also in RMW.
>
> I also got confused with this. What about the following notation?
>
> R[once,RMW] ->rmw W[once,RMW]
>
>>
>> Andrea
>
Note that the * is also what herd will show in its graphs.
jonas
Am 5/24/2024 um 5:37 PM schrieb Andrea Parri:
>> - While checking the information below using herd7, I've observed some
>> "strange" behavior with spin_is_locked() (perhaps, unsurprisingly...);
>> IAC, that's also excluded from this table/submission.
>
> For completeness, the behavior in question:
>
> $ cat T.litmus
> C T
>
> {}
>
> P0(spinlock_t *x)
> {
> int r0;
>
> spin_lock(x);
> spin_unlock(x);
> r0 = spin_is_locked(x);
> }
>
Since 0 executions are generated, possibly herd things there's a deadlock.
Could be either a problem with the deadlock definition, or do you need
to initialize the lock somehow?
On Mon, May 27, 2024 at 02:25:01PM +0200, Hernan Ponce de Leon wrote:
> On 5/24/2024 6:00 PM, Andrea Parri wrote:
> > > What's the difference between R and R*, or between W and W*?
> >
> > AFAIU, herd7 uses such notation, "*", to denote a load or a store which
> > is also in RMW.
>
> I also got confused with this. What about the following notation?
>
> R[once,RMW] ->rmw W[once,RMW]
I'll add a (minimal) legenda describing this and some other notations in the
next version. As Jonas remarked in his reply, * is what you would currently
find in herd7-generated graphs, so I'd stick to it for the moment.
Andrea
On Mon, May 27, 2024 at 02:25:01PM +0200, Hernan Ponce de Leon wrote:
> On 5/24/2024 6:00 PM, Andrea Parri wrote:
> > > What's the difference between R and R*, or between W and W*?
> >
> > AFAIU, herd7 uses such notation, "*", to denote a load or a store which
> > is also in RMW.
>
> I also got confused with this. What about the following notation?
>
> R[once,RMW] ->rmw W[once,RMW]
Either way, it would be a good idea to add an explanation at the start
of the file.
Likewise, add an explanation that blank entries mean the same as the
preceding row.
Overall the table looks very good.
Alan
On Mon, May 27, 2024 at 03:28:00PM +0200, Andrea Parri wrote:
> > > + | smp_store_mb | W[once] ->po F[mb] |
> >
> > I expect this one to be hard-coded in herd7 source code, but I cannot find
> > it. Can you give me a pointer?
>
> smp_store_mb() is currently mapped to { __store{once}(X,V); __fence{mb}; } in
> the .def file, so it's semantically equivalent to "WRITE_ONCE(); smp_mb();".
Why don't we use this approach for all the value-returning full-barrier
RMW operations? That would immediately solve the issue of the
special-purpose code in herd7, leaving only the matter of how to
annotate failed RMW operations.
Alan
> > + | smp_store_mb | W[once] ->po F[mb] |
>
> I expect this one to be hard-coded in herd7 source code, but I cannot find
> it. Can you give me a pointer?
smp_store_mb() is currently mapped to { __store{once}(X,V); __fence{mb}; } in
the .def file, so it's semantically equivalent to "WRITE_ONCE(); smp_mb();".
> What about spin_unlock?
spin_unlock() is listed among the non-RMW ops/macros in the current table: it
is represented by a single UL or "Unlock" event (a special type of Store event
with (some special) Release semantics).
> I found the extra spaces in the failure case very hard to read. Any
> particular reason why you went with this format?
The extra spaces were simply to convey something like "belong to the previous
row/entry", but I'm open to remove them or other suggestions if preferred.
Andrea
Am 5/27/2024 um 3:28 PM schrieb Andrea Parri:
>>> + | smp_store_mb | W[once] ->po F[mb] |
>>
>> I expect this one to be hard-coded in herd7 source code, but I cannot find
>> it. Can you give me a pointer?
>
> smp_store_mb() is currently mapped to { __store{once}(X,V); __fence{mb}; } in
> the .def file, so it's semantically equivalent to "WRITE_ONCE(); smp_mb();".
By the way, I experimented a little with these kind of mappings to see
if we can just explicitly encode the mapping there. E.g., I had an idea
to use
{ __fence{mb-successful-rmw}; __cmpxchg{once}...;
__fence{mb-successful-rmw}; }
for defining (almost) the current mapping of cmpxchg explicitly.
But none of the changes I made were accepted by herd7.
Do you know how the syntax works?
jonas
Am 5/27/2024 um 3:37 PM schrieb Alan Stern:
> On Mon, May 27, 2024 at 03:28:00PM +0200, Andrea Parri wrote:
>>>> + | smp_store_mb | W[once] ->po F[mb] |
>>>
>>> I expect this one to be hard-coded in herd7 source code, but I cannot find
>>> it. Can you give me a pointer?
>>
>> smp_store_mb() is currently mapped to { __store{once}(X,V); __fence{mb}; } in
>> the .def file, so it's semantically equivalent to "WRITE_ONCE(); smp_mb();".
>
> Why don't we use this approach for all the value-returning full-barrier
> RMW operations? That would immediately solve the issue of the
> special-purpose code in herd7, leaving only the matter of how to
> annotate failed RMW operations.
I experimented with that the other day. My idea was to use a new
__fence{mb-successful-rmw} which would have
Mb = Mb | Mb-successful-rmw & (domain((po\(po;po));rmw) |
range(rmw;(po\(po;po)))
to turn only the ordering effect of fences around cmpxchg off (and the
existance of these fences around unsuccessful cmpxchg would be the only
difference to the current representation).
Unfortunately I didn't manage to get my changes to the .def file to
compile (FWIW I'm on herd 7.56+03).
Maybe someone wiser with herd can figure out how to work the .def file.
Best wishes,
jonas
On 5/27/2024 3:28 PM, Andrea Parri wrote:
>>> + | smp_store_mb | W[once] ->po F[mb] |
>>
>> I expect this one to be hard-coded in herd7 source code, but I cannot find
>> it. Can you give me a pointer?
>
> smp_store_mb() is currently mapped to { __store{once}(X,V); __fence{mb}; } in
> the .def file, so it's semantically equivalent to "WRITE_ONCE(); smp_mb();".
Ok, so some fences where added in the .def file (this) while other
programmatically (e.g., xchg). We should at least be consistent about
how this is done. Based on previous comments, it seems most of us agree
this should go the the .def file.
>
>
>> What about spin_unlock?
>
> spin_unlock() is listed among the non-RMW ops/macros in the current table: it
> is represented by a single UL or "Unlock" event (a special type of Store event
> with (some special) Release semantics).
I was blind. Sorry for the noise.
>
>
>> I found the extra spaces in the failure case very hard to read. Any
>> particular reason why you went with this format?
>
> The extra spaces were simply to convey something like "belong to the previous
> row/entry", but I'm open to remove them or other suggestions if preferred.
I find it easier to read without the extra spaces. The empty column on
the left already tells me this is a continuation of the previous row.
But I don't see this as a blocker.
>
> Andrea
On Mon, May 27, 2024 at 03:40:13PM +0200, Jonas Oberhauser wrote:
>
>
> Am 5/27/2024 um 3:28 PM schrieb Andrea Parri:
> > > > + | smp_store_mb | W[once] ->po F[mb] |
> > >
> > > I expect this one to be hard-coded in herd7 source code, but I cannot find
> > > it. Can you give me a pointer?
> >
> > smp_store_mb() is currently mapped to { __store{once}(X,V); __fence{mb}; } in
> > the .def file, so it's semantically equivalent to "WRITE_ONCE(); smp_mb();".
>
> By the way, I experimented a little with these kind of mappings to see if we
> can just explicitly encode the mapping there. E.g., I had an idea to use
> { __fence{mb-successful-rmw}; __cmpxchg{once}...;
> __fence{mb-successful-rmw}; }
>
> for defining (almost) the current mapping of cmpxchg explicitly.
>
> But none of the changes I made were accepted by herd7.
>
> Do you know how the syntax works?
>
This may not be trivial. Note that cmpxchg() is an expression (it has a
value), so in .def, we want to define it as an expression. However, the
C-like multiple-statement expression is not supported by herd parser, in
other words we want:
{
__fence{mb-successful-rmw};
int tmp = __cmpxchg{once}(...);
__fence{mb-successful-rmw};
tmp;
}
but herd parser doesn't support this as a valid expression.
Regards,
Boqun
> jonas
>
Am 5/28/2024 um 7:58 PM schrieb Boqun Feng:
> On Mon, May 27, 2024 at 03:40:13PM +0200, Jonas Oberhauser wrote:
>>
>>
>> Am 5/27/2024 um 3:28 PM schrieb Andrea Parri:
>>>>> + | smp_store_mb | W[once] ->po F[mb] |
>>>>
>>>> I expect this one to be hard-coded in herd7 source code, but I cannot find
>>>> it. Can you give me a pointer?
>>>
>>> smp_store_mb() is currently mapped to { __store{once}(X,V); __fence{mb}; } in
>>> the .def file, so it's semantically equivalent to "WRITE_ONCE(); smp_mb();".
>>
>> By the way, I experimented a little with these kind of mappings to see if we
>> can just explicitly encode the mapping there. E.g., I had an idea to use
>> { __fence{mb-successful-rmw}; __cmpxchg{once}...;
>> __fence{mb-successful-rmw}; }
>>
>> for defining (almost) the current mapping of cmpxchg explicitly.
>>
>> But none of the changes I made were accepted by herd7.
>>
>> Do you know how the syntax works?
>>
>
> This may not be trivial. Note that cmpxchg() is an expression (it has a
> value), so in .def, we want to define it as an expression. However, the
> C-like multiple-statement expression is not supported by herd parser, in
> other words we want:
>
> {
> __fence{mb-successful-rmw};
> int tmp = __cmpxchg{once}(...);
> __fence{mb-successful-rmw};
> tmp;
> }
Oh, you're right. Then probably the rule I was violating is that
value-returning macros can not be defined with {} at all.
Given herd's other syntactic limitations, perhaps the best way would be
to introduce these macros as
x = cmpxchg(...) {
__fence{mb-successful-rmw};
x = __cmpxchg{once}(...);
__fence{mb-successful-rmw};
}
since I think x = M(...) is the only way we are allowed to use these
macros anyways.
On Wed, May 29, 2024 at 02:37:30PM +0200, Jonas Oberhauser wrote:
>
>
> Am 5/28/2024 um 7:58 PM schrieb Boqun Feng:
> > This may not be trivial. Note that cmpxchg() is an expression (it has a
> > value), so in .def, we want to define it as an expression. However, the
> > C-like multiple-statement expression is not supported by herd parser, in
> > other words we want:
> >
> > {
> > __fence{mb-successful-rmw};
> > int tmp = __cmpxchg{once}(...);
> > __fence{mb-successful-rmw};
> > tmp;
> > }
>
> Oh, you're right. Then probably the rule I was violating is that
> value-returning macros can not be defined with {} at all.
>
> Given herd's other syntactic limitations, perhaps the best way would be to
> introduce these macros as
>
> x = cmpxchg(...) {
> __fence{mb-successful-rmw};
> x = __cmpxchg{once}(...);
> __fence{mb-successful-rmw};
> }
>
> since I think x = M(...) is the only way we are allowed to use these macros
> anyways.
If we did this, how would the .cat file know to ignore the fence events
when the cmpxchg() fails? It doesn't look like there's anything to
connect the two of them.
Adding the MB tag to the cmpxchg itself seems like the only way forward.
Alan
Am 5/29/2024 um 4:24 PM schrieb Alan Stern:
> On Wed, May 29, 2024 at 04:17:36PM +0200, Jonas Oberhauser wrote:
>>
>>
>> Am 5/29/2024 um 4:07 PM schrieb Alan Stern:
>>> On Wed, May 29, 2024 at 02:37:30PM +0200, Jonas Oberhauser wrote:
>>>> Given herd's other syntactic limitations, perhaps the best way would be to
>>>> introduce these macros as
>>>>
>>>> x = cmpxchg(...) {
>>>> __fence{mb-successful-rmw};
>>>> x = __cmpxchg{once}(...);
>>>> __fence{mb-successful-rmw};
>>>> }
>>>>
>>>> since I think x = M(...) is the only way we are allowed to use these macros
>>>> anyways.
>>>
>>> If we did this, how would the .cat file know to ignore the fence events
>>> when the cmpxchg() fails? It doesn't look like there's anything to
>>> connect the two of them.
>>>
>>> Adding the MB tag to the cmpxchg itself seems like the only way forward.
>>>
>>> Alan
>>
>> Something along these lines:
>>
>> Mb = Mb | Mb-successful-rmw & (domain((po\(po;po));rmw) |
>> range(rmw;(po\(po;po)))
>>
>> i.e., using the fact that these mb-successful-rmw fences must appear
>> directly next to a possibly failing rmw, and looking for successful rmw
>> directly around them.
>>
>> I suppose we have to distinguish between the before- and after- fences
>> though to make it work for cases like
>>
>> xchg_release();
>> cmpxchg(); // fails
>>
>>
>> __xchg_release(...); // is an rmw
>> __fence{mb-successful-rmw}; // wrong takes mb semantics
>> x = __cmpxchg{once}(...); // fails
>> __fence{mb-successful-rmw};
>>
>>
>> So that would leave us with
>>
>> x = cmpxchg(...) {
>> __fence{mb-before-successful-rmw};
>> x = __cmpxchg{once}(...);
>> __fence{mb-after-successful-rmw};
>> }
>>
>> and in .cat/.bell:
>>
>> Mb = Mb | Mb-before-successful-rmw & domain((po\(po;po));rmw) |
>> Mb-after-successful-rmw & range(rmw;(po\(po;po)))
>
> It's messy. Associating the fences directly with the RMW event(s) by
> adding the MB tags is much cleaner, IMO.
I agree.
> Also, does the syntax you are proposing require changes to herd7? I'm
> not aware that it is currently able to parse that kind of definition.
Indeed, herd7 can't deal with this syntax right now.
jonas
Am 5/29/2024 um 4:07 PM schrieb Alan Stern:
> On Wed, May 29, 2024 at 02:37:30PM +0200, Jonas Oberhauser wrote:
>>
>>
>> Am 5/28/2024 um 7:58 PM schrieb Boqun Feng:
>>> This may not be trivial. Note that cmpxchg() is an expression (it has a
>>> value), so in .def, we want to define it as an expression. However, the
>>> C-like multiple-statement expression is not supported by herd parser, in
>>> other words we want:
>>>
>>> {
>>> __fence{mb-successful-rmw};
>>> int tmp = __cmpxchg{once}(...);
>>> __fence{mb-successful-rmw};
>>> tmp;
>>> }
>>
>> Oh, you're right. Then probably the rule I was violating is that
>> value-returning macros can not be defined with {} at all.
>>
>> Given herd's other syntactic limitations, perhaps the best way would be to
>> introduce these macros as
>>
>> x = cmpxchg(...) {
>> __fence{mb-successful-rmw};
>> x = __cmpxchg{once}(...);
>> __fence{mb-successful-rmw};
>> }
>>
>> since I think x = M(...) is the only way we are allowed to use these macros
>> anyways.
>
> If we did this, how would the .cat file know to ignore the fence events
> when the cmpxchg() fails? It doesn't look like there's anything to
> connect the two of them.
>
> Adding the MB tag to the cmpxchg itself seems like the only way forward.
>
> Alan
Something along these lines:
Mb = Mb | Mb-successful-rmw & (domain((po\(po;po));rmw) |
range(rmw;(po\(po;po)))
i.e., using the fact that these mb-successful-rmw fences must appear
directly next to a possibly failing rmw, and looking for successful rmw
directly around them.
I suppose we have to distinguish between the before- and after- fences
though to make it work for cases like
xchg_release();
cmpxchg(); // fails
__xchg_release(...); // is an rmw
__fence{mb-successful-rmw}; // wrong takes mb semantics
x = __cmpxchg{once}(...); // fails
__fence{mb-successful-rmw};
So that would leave us with
x = cmpxchg(...) {
__fence{mb-before-successful-rmw};
x = __cmpxchg{once}(...);
__fence{mb-after-successful-rmw};
}
and in .cat/.bell:
Mb = Mb | Mb-before-successful-rmw & domain((po\(po;po));rmw) |
Mb-after-successful-rmw & range(rmw;(po\(po;po)))
Best wishes,
jonas
On Wed, May 29, 2024 at 04:17:36PM +0200, Jonas Oberhauser wrote:
>
>
> Am 5/29/2024 um 4:07 PM schrieb Alan Stern:
> > On Wed, May 29, 2024 at 02:37:30PM +0200, Jonas Oberhauser wrote:
> > > Given herd's other syntactic limitations, perhaps the best way would be to
> > > introduce these macros as
> > >
> > > x = cmpxchg(...) {
> > > __fence{mb-successful-rmw};
> > > x = __cmpxchg{once}(...);
> > > __fence{mb-successful-rmw};
> > > }
> > >
> > > since I think x = M(...) is the only way we are allowed to use these macros
> > > anyways.
> >
> > If we did this, how would the .cat file know to ignore the fence events
> > when the cmpxchg() fails? It doesn't look like there's anything to
> > connect the two of them.
> >
> > Adding the MB tag to the cmpxchg itself seems like the only way forward.
> >
> > Alan
>
> Something along these lines:
>
> Mb = Mb | Mb-successful-rmw & (domain((po\(po;po));rmw) |
> range(rmw;(po\(po;po)))
>
> i.e., using the fact that these mb-successful-rmw fences must appear
> directly next to a possibly failing rmw, and looking for successful rmw
> directly around them.
>
> I suppose we have to distinguish between the before- and after- fences
> though to make it work for cases like
>
> xchg_release();
> cmpxchg(); // fails
>
>
> __xchg_release(...); // is an rmw
> __fence{mb-successful-rmw}; // wrong takes mb semantics
> x = __cmpxchg{once}(...); // fails
> __fence{mb-successful-rmw};
>
>
> So that would leave us with
>
> x = cmpxchg(...) {
> __fence{mb-before-successful-rmw};
> x = __cmpxchg{once}(...);
> __fence{mb-after-successful-rmw};
> }
>
> and in .cat/.bell:
>
> Mb = Mb | Mb-before-successful-rmw & domain((po\(po;po));rmw) |
> Mb-after-successful-rmw & range(rmw;(po\(po;po)))
It's messy. Associating the fences directly with the RMW event(s) by
adding the MB tags is much cleaner, IMO.
Also, does the syntax you are proposing require changes to herd7? I'm
not aware that it is currently able to parse that kind of definition.
Alan
On Mon, May 27, 2024 at 10:07:06AM +0200, Andrea Parri wrote:
> > It turns out the problem lies in the way lock.cat tries to calculate the
> > rf relation for RU events (a spin_is_locked() that returns False). The
> > method it uses amounts to requiring that such events must read from the
> > lock's initial value or an LU event (a spin_unlock()) in a different
> > thread. This clearly is wrong, and glaringly so in this litmus test
> > since there are no other threads!
> >
> > A patch to fix the problem and reorganize the code a bit for greater
> > readability is below. I'd appreciate it if people could try it out on
> > various locking litmus tests in our archives.
>
> Thanks for the quick solution, Alan. The results from our archives look
> good.
Here's a much smaller patch, suitable for the -stable kernels. It fixes
the bug without doing the larger code reorganization (which will go into
a separate patch). Can you test this one?
Alan
Index: usb-devel/tools/memory-model/lock.cat
===================================================================
--- usb-devel.orig/tools/memory-model/lock.cat
+++ usb-devel/tools/memory-model/lock.cat
@@ -102,19 +102,19 @@ let rf-lf = rfe-lf | rfi-lf
* within one of the lock's critical sections returns False.
*)
-(* rfi for RU events: an RU may read from the last po-previous UL *)
-let rfi-ru = ([UL] ; po-loc ; [RU]) \ ([UL] ; po-loc ; [LKW] ; po-loc)
-
-(* rfe for RU events: an RU may read from an external UL or the initial write *)
-let all-possible-rfe-ru =
- let possible-rfe-ru r =
+(*
+ * rf for RU events: an RU may read from an external UL or the initial write,
+ * or from the last po-previous UL
+ *)
+let all-possible-rf-ru =
+ let possible-rf-ru r =
let pair-to-relation p = p ++ 0
- in map pair-to-relation (((UL | IW) * {r}) & loc & ext)
- in map possible-rfe-ru RU
+ in map pair-to-relation ((((UL | IW) * {r}) & loc & ext) |
+ (((UL * {r}) & po-loc) \ ([UL] ; po-loc ; [LKW] ; po-loc)))
+ in map possible-rf-ru RU
(* Generate all rf relations for RU events *)
-with rfe-ru from cross(all-possible-rfe-ru)
-let rf-ru = rfe-ru | rfi-ru
+with rf-ru from cross(all-possible-rf-ru)
(* Final rf relation *)
let rf = rf | rf-lf | rf-ru
> > > A patch to fix the problem and reorganize the code a bit for greater
> > > readability is below. I'd appreciate it if people could try it out on
> > > various locking litmus tests in our archives.
> >
> > Thanks for the quick solution, Alan. The results from our archives look
> > good.
>
> Here's a much smaller patch, suitable for the -stable kernels. It fixes
> the bug without doing the larger code reorganization (which will go into
> a separate patch). Can you test this one?
Testing in progress..., first results are good.
(+1 on splitting the patches)
Andrea
> > Here's a much smaller patch, suitable for the -stable kernels. It fixes
> > the bug without doing the larger code reorganization (which will go into
> > a separate patch). Can you test this one?
>
> Testing in progress..., first results are good.
Completed and good on the various locking litmus tests in the github
archive and in-tree.
Andrea
On Wed, Jun 05, 2024 at 09:26:27PM +0200, Andrea Parri wrote:
> > > Here's a much smaller patch, suitable for the -stable kernels. It fixes
> > > the bug without doing the larger code reorganization (which will go into
> > > a separate patch). Can you test this one?
> >
> > Testing in progress..., first results are good.
>
> Completed and good on the various locking litmus tests in the github
> archive and in-tree.
Okay, thanks. I'll submit the patches soon.
Alan