Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20;
Message-ID: <c22ec058-b058-0b6e-718b-348ff5cb5004@huaweicloud.com>
Date:   Wed, 18 Jan 2023 22:38:11 +0100
MIME-Version: 1.0
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:102.0) Gecko/20100101
 Thunderbird/102.6.1
Subject: Re: [PATCH] tools/memory-model: Make ppo a subrelation of po
To:     Alan Stern <stern@rowland.harvard.edu>
Cc:     paulmck@kernel.org, parri.andrea@gmail.com, will@kernel.org,
        peterz@infradead.org, boqun.feng@gmail.com, npiggin@gmail.com,
        dhowells@redhat.com, j.alglave@ucl.ac.uk, luc.maranget@inria.fr,
        akiyks@gmail.com, dlustig@nvidia.com, joel@joelfernandes.org,
        urezki@gmail.com, quic_neeraju@quicinc.com, frederic@kernel.org,
        linux-kernel@vger.kernel.org, viktor@mpi-sws.org
References: <20230117193159.22816-1-jonas.oberhauser@huaweicloud.com>
 <Y8hN7vs/w8LhMasT@rowland.harvard.edu>
From:   Jonas Oberhauser <jonas.oberhauser@huaweicloud.com>
In-Reply-To: <Y8hN7vs/w8LhMasT@rowland.harvard.edu>
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
Precedence: bulk


On 1/18/2023 8:52 PM, Alan Stern wrote:
> On Tue, Jan 17, 2023 at 08:31:59PM +0100, Jonas Oberhauser wrote:
>> As stated in the documentation and implied by its name, the ppo
>> (preserved program order) relation is intended to link po-earlier
>> to po-later instructions under certain conditions.  However, a
>> corner case currently allows instructions to be linked by ppo that
>> are not executed by the same thread, i.e., instructions are being
>> linked that have no po relation.
>>
>> This happens due to the mb/strong-fence relations, which (as one
>> case) provide order when locks are passed between threads followed
>> by an smp_mb__after_unlock_lock() fence.  This is illustrated in
>> the following litmus test (as can be seen when using herd7 with
>> `doshow ppo`):
>>
>> P0(int *x, int *y)
>> {
>>      spin_lock(x);
>>      spin_unlock(x);
>> }
>>
>> P1(int *x, int *y)
>> {
>>      spin_lock(x);
>>      smp_mb__after_unlock_lock();
>>      *y = 1;
>> }
>>
>> The ppo relation will link P0's spin_lock(x) and P1's *y=1,
>> because P0 passes a lock to P1 which then uses this fence.
>>
>> The patch makes ppo a subrelation of po by eliminating this
>> possibility from mb and strong-fence, and instead introduces the
>> notion of strong ordering operations, which are allowed to link
>> events of distinct threads.
>>
>> Signed-off-by: Jonas Oberhauser <jonas.oberhauser@huaweicloud.com>
>> ---
> I'm just going to comment on the changes to the cat file.  I'll review
> the documentation changes later.
>
>> diff --git a/tools/memory-model/linux-kernel.cat b/tools/memory-model/linux-kernel.cat
>> index 07f884f9b2bf..1d91edbc10fe 100644
>> --- a/tools/memory-model/linux-kernel.cat
>> +++ b/tools/memory-model/linux-kernel.cat
>> @@ -36,9 +36,7 @@ let wmb = [W] ; fencerel(Wmb) ; [W]
>>   let mb = ([M] ; fencerel(Mb) ; [M]) |
>>   	([M] ; fencerel(Before-atomic) ; [RMW] ; po? ; [M]) |
>>   	([M] ; po? ; [RMW] ; fencerel(After-atomic) ; [M]) |
>> -	([M] ; po? ; [LKW] ; fencerel(After-spinlock) ; [M]) |
>> -	([M] ; po ; [UL] ; (co | po) ; [LKW] ;
>> -		fencerel(After-unlock-lock) ; [M])
>> +	([M] ; po? ; [LKW] ; fencerel(After-spinlock) ; [M])
> Shouldn't the po case of (co | po) remain intact here?

You can leave it here, but it is already covered by two other parts: the 
ordering given through ppo/hb is covered by the po-unlock-lock-po & int 
in ppo, and the ordering given through pb is covered by its inclusion in 
strong-order.

Now whether it should be included in strong-order or not is a matter of 
grouping: is it better to leave all cases ordered by the 
[After-unlock-lock] fence together, or is it better to keep the <=po 
parts of the fences together and the external parts together?
Right now I prefer to group things around the fences because that is 
more of an isolated idea, rather than around whether they order 
internally or externally.

As a side bonus I like that

    po-unlock-lock-po ; [After-unlock-lock]

"rhymes" nicely. If we split the strong-order definition and move the po 
part back into mb, that would disappear.


>>   let gp = po ; [Sync-rcu | Sync-srcu] ; po?
>>   let strong-fence = mb | gp
>>   
>> @@ -91,8 +89,12 @@ acyclic hb as happens-before
>>   (* Write and fence propagation ordering *)
>>   (****************************************)
>>   
>> -(* Propagation: Each non-rf link needs a strong fence. *)
>> -let pb = prop ; strong-fence ; hb* ; [Marked]
>> +(* Strong ordering operations *)
>> +let strong-order = strong-fence | ([M] ; po-unlock-lock-po ;
>> +		[After-unlock-lock] ; po ; [M])
> This is not the same as the definition removed above.  In particular,
> po-unlock-lock-po only allows one step along the locking chain -- it has
> rf where the definition above has co.

Indeed it is not, but the subsequent lock-unlock sequences are in hb*. 
For this reason it can be simplified to just consider the directly 
following unlock(). As a consequence, LKW becomes de-emphasized. Note 
that if you remove the explicit references to LKW from the model, you 
can argue about programs that don't use trylock as though LKW didn't 
exist, which is what some people in academia do, e.g., 
https://people.mpi-sws.org/~viktor/papers/oopsla2019-lapor.pdf (I added 
Viktor in CC, in case he wants to add something).
Since it doesn't cost anything I decided to include it like this.

I don't feel extremely strong about either point, but I've written them 
according to my preference.

>
>> +
>> +(* Propagation: Each non-rf link needs a strong ordering operation. *)
>> +let pb = prop ; strong-order ; hb* ; [Marked]
>>   acyclic pb as propagation
>>   
>>   (*******)
>> @@ -141,7 +143,7 @@ let rec rcu-order = rcu-gp | srcu-gp |
>>   	(rcu-order ; rcu-link ; rcu-order)
>>   let rcu-fence = po ; rcu-order ; po?
>>   let fence = fence | rcu-fence
> It would be nice here to have a separate term for a potentially
> cross-CPU fence.
>
> In fact, why don't we make a concerted effort to straighten out the
> terminology more fully?  Now seems like a good time to do it.

I agree; wrapping my head around this terminology-space is the whole 
reason why I started looking into the formalization of rcu, and I'm 
beginning to understand it a little bit.

> To begin with, let's be more careful about the difference between an
> order-inducing object (an event or pair of events) and the relation of
> being ordered by such an object.  For instance, given:
>
> 	A: WRITE_ONCE(x, 1);
> 	B: smp_mb();
> 	C: r1 = READ_ONCE(y);
>
> then B is an order-inducing object (a memory barrier), and (A,C) is a
> pair of events ordered by that object.  In general, an order is related
> to an order-inducing object by:
>
> 	order = po ; [order-inducing object] ; po

Yes! That's what I was trying to say in the rcu-order/rcu-fence 
discussion. (I would call it an operation rather than an object, since 
it may be something involving steps of multiple threads, like rcu, but 
let's stick with object for now).
However, while trying to rewrite the definition, I noticed that there 
*is* something around rb which requires playing with po?, just not in 
the current definition of rcu-fence or gp. This is currently covered in 
having po <= rcu-link.
A good example would be

    ... ->prop  X ->po;rcu-rcsc;rcu-link;rcu-gp Y ->po Z 
->rcu-gp;rcu-link;rcu-rscs;po ...

(I think this happens to be very similar if not the same as the code you 
sent earlier!).
in my view this includes two ordering objects (the two RCU law instances 
rcu-rcsc;rcu-link;rcu-gp and rcu-gp;rcu-link;rcu-rscs), but there's only 
one po! It can't be made to quite belong to either ordering object, so 
you don't have order;order here.

So I started wondering why the same isn't the case for pb, and whether 
perhaps it should be
     order = po ; [order-inducing object] ; po?
This way you'd get order;order here quite neatly.

I figured out that if we replace the first ordering object here with a 
pb related one like
    ... ->prop  X ->po;[mb fence] Y ->po Z ->rcu-gp;rcu-link;rcu-rscs;po ...
then we can just forget about the first ordering object and turn the 
whole thing into
    ... ->prop  X ->po Z ->rcu-gp;rcu-link;rcu-rscs;po ...

So in this case, the po can always be "stolen" by the second ordering 
object and we can completely forget about the first one, because the 
prop;po already has an ordering effect together with the second ordering 
object by itself.
This is also the case if the second ordering object is a pb-related one.
For this reason, the issue of po? never comes up when sticking to hb and 
pb, only when looking at rb.
That at least explains why using po is ok for defining strong-order.

I'm still not sure if just defining
     order = po ; [order-inducing object] ; po?
would also be ok for defining strong-order. This would have the benefit 
that it would just work for for rb-related order-inducing objects as well.
Do you remember why the current definition does not have a po? at the end?

>
> with suitable modifications for things like smp_store_release where
> one of the events being ordered _is_ the order-inducing event.
>
> So for example, we could consistently refer to all order-inducing events
> as either barriers or fences, and all order-reflecting relations as orders.  This would require widespread changes to the .cat file, but I
> think it would be worthwhile.

I agree that having a uniform language for this is worthwhile.
However you just dropped the cases of order-inducing objects that are 
not just a single event.
I am completely fine calling the individual *events* barriers.
(I don't like calling every barrier a fence though; Arm very explicitly 
doesn't do this and internally we don't either. However for LKMM I don't 
mind sticking to existing terminology here and calling them all fences; 
but to me a fence is something that orders certain things po-before the 
fence with certain things po-after).
But I would like to have a name for order-inducing objects that link two 
events.

I would call them an "ordering operation" where the first event is the 
event that "starts the operation", and the second event is the event 
that "completes that operation".
Then we can say things like "when CPU C starts an ordering operation of 
type blah that is completed by CPU C', then any store that propagates to 
C before C starts the operation must propagate to any CPU C'' before any 
store of C' that is executed after C' completes the execution propagates 
to C''  ".
This would be a weak ordering operation, and would probably imply that 
blah-order is a kind of happens-before order and would appear in 
cumul-fence.

Or  "when CPU C starts an ordering operation of type blubb that is 
completed by CPU C', then any store that propagates to C before C starts 
the operation must propagate to any other CPU before any instruction of 
C' po-after C' completes the execution is executed".
This would be a strong ordering operation which would imply that prop ; 
blubb-order is a kind of happens-before order.

> (Treating "barrier" and "fence" as synonyms seems to be too deeply
> entrenched to try and fight against.)
>
> Once that is straightened out, we can distinguish between fences or
> orders that are weak vs. strong.  And then we can divide up strong
> fences/orders into single-CPU vs. cross-CPU, if we want to.
With the distinction above, barriers and fences are always single-CPU, 
while ordering operations can be either cross-CPU or single-CPU. I'm not 
sure if there's still a need to distinguish more carefully than that.

Hope that makes sense,
jonas