2018-03-11 07:57:23

by 焦晓冬

Subject: smp_mb__after_spinlock requirement too strong?

Peter pointed out in this patch https://patchwork.kernel.org/patch/9771921/
that the spin lock used in __schedule() should be RCsc to ensure
visibility of writes prior to __schedule() when the task is migrated to
another CPU.

And this is emphasized in the comment of the newly introduced
smp_mb__after_spinlock():

* This barrier must provide two things:
*
* - it must guarantee a STORE before the spin_lock() is ordered against a
* LOAD after it, see the comments at its two usage sites.
*
* - it must ensure the critical section is RCsc.
*
* The latter is important for cases where we observe values written by other
* CPUs in spin-loops, without barriers, while being subject to scheduling.
*
* CPU0                  CPU1                    CPU2
*
*                       for (;;) {
*                         if (READ_ONCE(X))
*                           break;
*                       }
* X=1
*                       <sched-out>
*                                               <sched-in>
*                                               r = X;
*
* without transitivity it could be that CPU1 observes X!=0 breaks the loop,
* we get migrated and CPU2 sees X==0.

which is used at,

__schedule(bool preempt) {
        ...
        rq_lock(rq, &rf);
        smp_mb__after_spinlock();
        ...
}
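
For illustration only, the comment's example can be sketched as an LKMM-style
litmus test, assuming the spin_lock()/spin_unlock()/smp_mb__after_spinlock()
support in tools/memory-model. The variable names are invented, the two
context switches are simplified to critical sections on a single rq lock, and
what stays forbidden if the barrier is weakened is exactly the question under
discussion here:

C sched-spin-loop-sketch

{}

P0(int *x)
{
        WRITE_ONCE(*x, 1);              /* X=1 */
}

P1(int *x, int *switched, spinlock_t *rq)
{
        int r0;

        r0 = READ_ONCE(*x);             /* the spin-loop breaks on seeing X==1 */
        spin_lock(rq);                  /* <sched-out> on CPU1 */
        smp_mb__after_spinlock();
        WRITE_ONCE(*switched, 1);
        spin_unlock(rq);
}

P2(int *x, int *switched, spinlock_t *rq)
{
        int r0;
        int r1;

        spin_lock(rq);                  /* <sched-in> on CPU2 */
        smp_mb__after_spinlock();
        r0 = READ_ONCE(*switched);      /* CPU2 switches in after CPU1 switched out */
        spin_unlock(rq);
        r1 = READ_ONCE(*x);             /* r = X */
}

exists (1:r0=1 /\ 2:r0=1 /\ 2:r1=0)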

If I didn't miss something, this kind of visibility does __not__
necessarily depend on the spin lock in __schedule() being RCsc.

In fact, for a runnable task A, the migration would be:

CPU0                    CPU1                    CPU2

<ACCESS before schedule out A>

lock(rq0)
schedule out A
unlock(rq0)
                        lock(rq0)
                        remove A from rq0
                        unlock(rq0)

                        lock(rq2)
                        add A into rq2
                        unlock(rq2)
                                                lock(rq2)
                                                schedule in A
                                                unlock(rq2)

                                                <ACCESS after schedule in A>

<ACCESS before schedule out A> happens-before
unlock(rq0) happens-before
lock(rq0) happens-before
unlock(rq2) happens-before
lock(rq2) happens-before
<ACCESS after schedule in A>
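
As a sketch only (invented variable names, not actual scheduler code), the
chain above can be written as an LKMM litmus test, where `data` stands for
the accesses around the context switches, `switched_out` for "A has been
scheduled out" and `on_rq2` for "A is on rq2":

C runnable-migration-sketch

{}

P0(int *data, int *switched_out, spinlock_t *rq0)
{
        WRITE_ONCE(*data, 1);           /* <ACCESS before schedule out A> */
        spin_lock(rq0);
        WRITE_ONCE(*switched_out, 1);   /* schedule out A */
        spin_unlock(rq0);
}

P1(int *switched_out, int *on_rq2, spinlock_t *rq0, spinlock_t *rq2)
{
        int r0;

        spin_lock(rq0);
        r0 = READ_ONCE(*switched_out);  /* remove A from rq0 */
        spin_unlock(rq0);
        spin_lock(rq2);
        WRITE_ONCE(*on_rq2, 1);         /* add A into rq2 */
        spin_unlock(rq2);
}

P2(int *data, int *on_rq2, spinlock_t *rq2)
{
        int r0;
        int r1;

        spin_lock(rq2);
        r0 = READ_ONCE(*on_rq2);        /* schedule in A */
        spin_unlock(rq2);
        r1 = READ_ONCE(*data);          /* <ACCESS after schedule in A> */
}

exists (1:r0=1 /\ 2:r0=1 /\ 2:r1=0)

The claim in this mail is that the two unlock->lock handoffs alone already
forbid this final state; herd7 with the LKMM can be used to check that.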

And for stopped tasks:

CPU0                    CPU1                    CPU2

<ACCESS before schedule out A>

lock(rq0)
schedule out A
remove A from rq0
store-release(A->on_cpu)
unlock(rq0)
                        load_acquire(A->on_cpu)
                        set_task_cpu(A, 2)

                        lock(rq2)
                        add A into rq2
                        unlock(rq2)
                                                lock(rq2)
                                                schedule in A
                                                unlock(rq2)

                                                <ACCESS after schedule in A>

<ACCESS before schedule out A> happens-before
store-release(A->on_cpu) happens-before
load_acquire(A->on_cpu) happens-before
unlock(rq2) happens-before
lock(rq2) happens-before
<ACCESS after schedule in A>
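
Similarly, a sketch of the stopped-task case, with A->on_cpu modeled as an
`on_cpu` variable that is initially 1 and cleared by a release store (again,
invented names only, not the actual scheduler code):

C stopped-migration-sketch

{ on_cpu=1; }

P0(int *data, int *on_cpu, spinlock_t *rq0)
{
        WRITE_ONCE(*data, 1);           /* <ACCESS before schedule out A> */
        spin_lock(rq0);
        /* schedule out A, remove A from rq0 */
        smp_store_release(on_cpu, 0);   /* store-release(A->on_cpu) */
        spin_unlock(rq0);
}

P1(int *on_cpu, int *on_rq2, spinlock_t *rq2)
{
        int r0;

        r0 = smp_load_acquire(on_cpu);  /* load_acquire(A->on_cpu) */
        /* set_task_cpu(A, 2) */
        spin_lock(rq2);
        WRITE_ONCE(*on_rq2, 1);         /* add A into rq2 */
        spin_unlock(rq2);
}

P2(int *data, int *on_rq2, spinlock_t *rq2)
{
        int r0;
        int r1;

        spin_lock(rq2);
        r0 = READ_ONCE(*on_rq2);        /* schedule in A */
        spin_unlock(rq2);
        r1 = READ_ONCE(*data);          /* <ACCESS after schedule in A> */
}

exists (1:r0=0 /\ 2:r0=1 /\ 2:r1=0)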

So, I think the only requirement on smp_mb__after_spinlock() is
to guarantee that a STORE before the spin_lock() is ordered
against a LOAD after it. So we could remove the RCsc requirement
to allow a more efficient implementation.
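
For completeness, that first requirement corresponds to a store-buffering
shape; a hedged sketch with made-up variable names standing in for the two
usage sites (not the actual scheduler/try_to_wake_up code):

C after-spinlock-sb-sketch

{}

P0(int *cond, int *flag, spinlock_t *l)
{
        int r0;

        WRITE_ONCE(*cond, 1);           /* STORE before the spin_lock() */
        spin_lock(l);
        smp_mb__after_spinlock();
        r0 = READ_ONCE(*flag);          /* LOAD after it */
        spin_unlock(l);
}

P1(int *cond, int *flag)
{
        int r0;

        WRITE_ONCE(*flag, 1);
        smp_mb();
        r0 = READ_ONCE(*cond);
}

exists (0:r0=0 /\ 1:r0=0)

Both loads observing 0 is the store-buffering outcome that this first
requirement is meant to rule out.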

Did I miss something, or does this RCsc requirement not really matter?


2018-03-12 05:41:40

by Boqun Feng

Subject: Re: smp_mb__after_spinlock requirement too strong?

On Sun, Mar 11, 2018 at 03:55:41PM +0800, 焦晓冬 wrote:
> Peter pointed out in this patch https://patchwork.kernel.org/patch/9771921/
> that the spinning-lock used at __schedule() should be RCsc to ensure
> visibility of writes prior to __schedule when the task is to be migrated to
> another CPU.
>
> And this is emphasized at the comment of the newly introduced
> smp_mb__after_spinlock(),
>
> * This barrier must provide two things:
> *
> * - it must guarantee a STORE before the spin_lock() is ordered against a
> * LOAD after it, see the comments at its two usage sites.
> *
> * - it must ensure the critical section is RCsc.
> *
> * The latter is important for cases where we observe values written by other
> * CPUs in spin-loops, without barriers, while being subject to scheduling.
> *
> * CPU0 CPU1 CPU2
> *
> * for (;;) {
> * if (READ_ONCE(X))
> * break;
> * }
> * X=1
> * <sched-out>
> * <sched-in>
> * r = X;
> *
> * without transitivity it could be that CPU1 observes X!=0 breaks the loop,
> * we get migrated and CPU2 sees X==0.
>
> which is used at,
>
> __schedule(bool preempt) {
> ...
> rq_lock(rq, &rf);
> smp_mb__after_spinlock();
> ...
> }
> .
>
> If I didn't miss something, I found this kind of visibility is __not__
> necessarily
> depends on the spinning-lock at __schedule being RCsc.
>
> In fact, as for runnable task A, the migration would be,
>
> CPU0 CPU1 CPU2
>
> <ACCESS before schedule out A>
>
> lock(rq0)
> schedule out A
> unock(rq0)
>
> lock(rq0)
> remove A from rq0
> unlock(rq0)
>
> lock(rq2)
> add A into rq2
> unlock(rq2)
> lock(rq2)
> schedule in A
> unlock(rq2)
>
> <ACCESS after schedule in A>
>
> <ACCESS before schedule out A> happens-before
> unlock(rq0) happends-before
> lock(rq0) happends-before
> unlock(rq2) happens-before
> lock(rq2) happens-before
> <ACCESS after schedule in A>
>

But without an RCsc lock, you cannot guarantee that a write propagates to
CPU 0 and CPU 2 at the same time, so the same write may propagate to
CPU 0 before <ACCESS before schedule out A> but propagate to CPU 2 after
<ACCESS after schedule in A>. So..

Regards,
Boqun

> And for stopped tasks,
>
> CPU0 CPU1 CPU2
>
> <ACCESS before schedule out A>
>
> lock(rq0)
> schedule out A
> remove A from rq0
> store-release(A->on_cpu)
> unock(rq0)
>
> load_acquire(A->on_cpu)
> set_task_cpu(A, 2)
>
> lock(rq2)
> add A into rq2
> unlock(rq2)
>
> lock(rq2)
> schedule in A
> unlock(rq2)
>
> <ACCESS after schedule in A>
>
> <ACCESS before schedule out A> happens-before
> store-release(A->on_cpu) happens-before
> load_acquire(A->on_cpu) happens-before
> unlock(rq2) happens-before
> lock(rq2) happens-before
> <ACCESS after schedule in A>
>
> So, I think the only requirement to smp_mb__after_spinlock is
> to guarantee a STORE before the spin_lock() is ordered
> against a LOAD after it. So we could remove the RCsc requirement
> to allow more efficient implementation.
>
> Did I miss something or this RCsc requirement does not really matter?



2018-03-12 08:19:40

by 焦晓冬

Subject: Re: smp_mb__after_spinlock requirement too strong?

>> Peter pointed out in this patch https://patchwork.kernel.org/patch/9771921/
>> that the spinning-lock used at __schedule() should be RCsc to ensure
>> visibility of writes prior to __schedule when the task is to be migrated to
>> another CPU.
>>
>> And this is emphasized at the comment of the newly introduced
>> smp_mb__after_spinlock(),
>>
>> * This barrier must provide two things:
>> *
>> * - it must guarantee a STORE before the spin_lock() is ordered against a
>> * LOAD after it, see the comments at its two usage sites.
>> *
>> * - it must ensure the critical section is RCsc.
>> *
>> * The latter is important for cases where we observe values written by other
>> * CPUs in spin-loops, without barriers, while being subject to scheduling.
>> *
>> * CPU0 CPU1 CPU2
>> *
>> * for (;;) {
>> * if (READ_ONCE(X))
>> * break;
>> * }
>> * X=1
>> * <sched-out>
>> * <sched-in>
>> * r = X;
>> *
>> * without transitivity it could be that CPU1 observes X!=0 breaks the loop,
>> * we get migrated and CPU2 sees X==0.
>>
>> which is used at,
>>
>> __schedule(bool preempt) {
>> ...
>> rq_lock(rq, &rf);
>> smp_mb__after_spinlock();
>> ...
>> }
>> .
>>
>> If I didn't miss something, I found this kind of visibility is __not__
>> necessarily
>> depends on the spinning-lock at __schedule being RCsc.
>>
>> In fact, as for runnable task A, the migration would be,
>>
>> CPU0 CPU1 CPU2
>>
>> <ACCESS before schedule out A>
>>
>> lock(rq0)
>> schedule out A
>> unock(rq0)
>>
>> lock(rq0)
>> remove A from rq0
>> unlock(rq0)
>>
>> lock(rq2)
>> add A into rq2
>> unlock(rq2)
>> lock(rq2)
>> schedule in A
>> unlock(rq2)
>>
>> <ACCESS after schedule in A>
>>
>> <ACCESS before schedule out A> happens-before
>> unlock(rq0) happends-before
>> lock(rq0) happends-before
>> unlock(rq2) happens-before
>> lock(rq2) happens-before
>> <ACCESS after schedule in A>
>>
>
> But without RCsc lock, you cannot guarantee that a write propagates to
> CPU 0 and CPU 2 at the same time, so the same write may propagate to
> CPU0 before <ACCESS before schedule out A> but propagate to CPU 2 after
> <ACCESS after scheduler in A>. So..
>
> Regards,
> Boqun

Thank you for pointing out this case, Boqun.
But this is just a special case of what acquire-release chains promise us.

A=B=0 as initial

CPU0            CPU1            CPU2            CPU3

write A=1
                read A=1
                write B=1
                release X
                                acquire X
                                read A=?
                                release Y
                                                acquire Y
                                                read B=?

assurance 1: CPU3 will surely see B=1 written by CPU1, and
assurance 2: CPU2 will also see A=1 written by CPU0 as a special case

The second assurance holds both in theory and on real hardware.

As for theory, the C++11 memory model, which
http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2017/p0124r4.html
describes as a potential formal model for the kernel memory model, states
that:

If a value computation A of an atomic object M happens before a value
computation B of M, and A takes its value from a side effect X on M, then
the value computed by B shall either be the value stored by X or the value
stored by a side effect Y on M, where Y follows X in the modification
order of M.

at
§1.10, rule 18, on page 14
http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2014/n4296.pdf

As for real hardware, Luc provided detailed tests and explanations for
ARM and POWER in section 5.1, "Cumulative Barriers for WRC", on page 19
of this paper:

A Tutorial Introduction to the ARM and POWER Relaxed Memory Models
https://www.cl.cam.ac.uk/~pes20/ppc-supplemental/test7.pdf

So, I think we may remove the RCsc requirement from smp_mb__after_spinlock(),
which is really confusing.

Best Regards,
Trol

>
>> And for stopped tasks,
>>
>> CPU0 CPU1 CPU2
>>
>> <ACCESS before schedule out A>
>>
>> lock(rq0)
>> schedule out A
>> remove A from rq0
>> store-release(A->on_cpu)
>> unock(rq0)
>>
>> load_acquire(A->on_cpu)
>> set_task_cpu(A, 2)
>>
>> lock(rq2)
>> add A into rq2
>> unlock(rq2)
>>
>> lock(rq2)
>> schedule in A
>> unlock(rq2)
>>
>> <ACCESS after schedule in A>
>>
>> <ACCESS before schedule out A> happens-before
>> store-release(A->on_cpu) happens-before
>> load_acquire(A->on_cpu) happens-before
>> unlock(rq2) happens-before
>> lock(rq2) happens-before
>> <ACCESS after schedule in A>
>>
>> So, I think the only requirement to smp_mb__after_spinlock is
>> to guarantee a STORE before the spin_lock() is ordered
>> against a LOAD after it. So we could remove the RCsc requirement
>> to allow more efficient implementation.
>>
>> Did I miss something or this RCsc requirement does not really matter?

2018-03-12 08:53:55

by Boqun Feng

Subject: Re: smp_mb__after_spinlock requirement too strong?

On Mon, Mar 12, 2018 at 04:18:00PM +0800, 焦晓冬 wrote:
> >> Peter pointed out in this patch https://patchwork.kernel.org/patch/9771921/
> >> that the spinning-lock used at __schedule() should be RCsc to ensure
> >> visibility of writes prior to __schedule when the task is to be migrated to
> >> another CPU.
> >>
> >> And this is emphasized at the comment of the newly introduced
> >> smp_mb__after_spinlock(),
> >>
> >> * This barrier must provide two things:
> >> *
> >> * - it must guarantee a STORE before the spin_lock() is ordered against a
> >> * LOAD after it, see the comments at its two usage sites.
> >> *
> >> * - it must ensure the critical section is RCsc.
> >> *
> >> * The latter is important for cases where we observe values written by other
> >> * CPUs in spin-loops, without barriers, while being subject to scheduling.
> >> *
> >> * CPU0 CPU1 CPU2
> >> *
> >> * for (;;) {
> >> * if (READ_ONCE(X))
> >> * break;
> >> * }
> >> * X=1
> >> * <sched-out>
> >> * <sched-in>
> >> * r = X;
> >> *
> >> * without transitivity it could be that CPU1 observes X!=0 breaks the loop,
> >> * we get migrated and CPU2 sees X==0.
> >>
> >> which is used at,
> >>
> >> __schedule(bool preempt) {
> >> ...
> >> rq_lock(rq, &rf);
> >> smp_mb__after_spinlock();
> >> ...
> >> }
> >> .
> >>
> >> If I didn't miss something, I found this kind of visibility is __not__
> >> necessarily
> >> depends on the spinning-lock at __schedule being RCsc.
> >>
> >> In fact, as for runnable task A, the migration would be,
> >>
> >> CPU0 CPU1 CPU2
> >>
> >> <ACCESS before schedule out A>
> >>
> >> lock(rq0)
> >> schedule out A
> >> unock(rq0)
> >>
> >> lock(rq0)
> >> remove A from rq0
> >> unlock(rq0)
> >>
> >> lock(rq2)
> >> add A into rq2
> >> unlock(rq2)
> >> lock(rq2)
> >> schedule in A
> >> unlock(rq2)
> >>
> >> <ACCESS after schedule in A>
> >>
> >> <ACCESS before schedule out A> happens-before
> >> unlock(rq0) happends-before
> >> lock(rq0) happends-before
> >> unlock(rq2) happens-before
> >> lock(rq2) happens-before
> >> <ACCESS after schedule in A>
> >>
> >
> > But without RCsc lock, you cannot guarantee that a write propagates to
> > CPU 0 and CPU 2 at the same time, so the same write may propagate to
> > CPU0 before <ACCESS before schedule out A> but propagate to CPU 2 after
> > <ACCESS after scheduler in A>. So..
> >
> > Regards,
> > Boqun
>
> Thank you for pointing out this case, Boqun.
> But this is just one special case that acquire-release chains promise us.
>

Ah.. right, because of A-Cumulative.

> A=B=0 as initial
>
> CPU0 CPU1 CPU2 CPU3
> write A=1
> read A=1
> write B=1
> release X
> acquire X
> read A=?
> release Y
>
> acquire Y
>
> read B=?
>
> assurance 1: CPU3 will surely see B=1 writing by CPU1, and
> assurance 2: CPU2 will also see A=1 writing by CPU0 as a special case
>
> The second assurance is both in theory and implemented by real hardware.
>
> As for theory, the C++11 memory model, which is a potential formal model
> for kernel memory model as
> http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2017/p0124r4.html
> descripes, states that:
>
> If a value computation A of an atomic object M happens before a value
> computation B of M, and A takes its value from a side effect X on M, then
> the value computed by B shall either be the value stored by X or the value
> stored by a side effect Y on M, where Y follows X in the modification
> order of M.
>
> at
> $1.10 rule 18, on page 14
> http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2014/n4296.pdf
>
> As for real hardware, Luc provided detailed test and explanation on
> ARM and POWER in 5.1 Cumulative Barriers for WRC on page 19
> in this paper:
>
> A Tutorial Introduction to the ARM and POWER Relaxed Memory Models
> https://www.cl.cam.ac.uk/~pes20/ppc-supplemental/test7.pdf
>
> So, I think we may remove RCsc from smp_mb__after_spinlock which is
> really confusing.
>

So I think the purpose of smp_mb__after_spinlock() is to provide RCsc
locks; it's just that the comment before it may be misleading. We want
RCsc locks in the scheduler code because we want writes in different
critical sections to be ordered even outside the critical sections, for
a case like:

{A = 0, B = 0}

CPU 0                   CPU 1                   CPU 2

lock(rq0);
write A=1;
unlock(rq0);
                        lock(rq0);
                        read A=1;
                        write B=2;
                        unlock(rq0);
                                                read B=2;
                                                smp_rmb();
                                                read A=1;

I think we need to fix the comment rather than loosen the requirement.
Peter?

Regards,
Boqun

> Best Regards,
> Trol
>
> >
> >> And for stopped tasks,
> >>
> >> CPU0 CPU1 CPU2
> >>
> >> <ACCESS before schedule out A>
> >>
> >> lock(rq0)
> >> schedule out A
> >> remove A from rq0
> >> store-release(A->on_cpu)
> >> unock(rq0)
> >>
> >> load_acquire(A->on_cpu)
> >> set_task_cpu(A, 2)
> >>
> >> lock(rq2)
> >> add A into rq2
> >> unlock(rq2)
> >>
> >> lock(rq2)
> >> schedule in A
> >> unlock(rq2)
> >>
> >> <ACCESS after schedule in A>
> >>
> >> <ACCESS before schedule out A> happens-before
> >> store-release(A->on_cpu) happens-before
> >> load_acquire(A->on_cpu) happens-before
> >> unlock(rq2) happens-before
> >> lock(rq2) happens-before
> >> <ACCESS after schedule in A>
> >>
> >> So, I think the only requirement to smp_mb__after_spinlock is
> >> to guarantee a STORE before the spin_lock() is ordered
> >> against a LOAD after it. So we could remove the RCsc requirement
> >> to allow more efficient implementation.
> >>
> >> Did I miss something or this RCsc requirement does not really matter?



2018-03-12 08:58:23

by Peter Zijlstra

Subject: Re: smp_mb__after_spinlock requirement too strong?

On Mon, Mar 12, 2018 at 04:56:00PM +0800, Boqun Feng wrote:
> So I think the purpose of smp_mb__after_spinlock() is to provide RCsc
> locks, it's just the comments before that may be misleading. We want
> RCsc locks in schedule code because we want writes in different critical
> section are ordered even outside the critical sections, for case like:
>
> CPU 0 CPU 1 CPU 2
>
> {A =0 , B = 0}
> lock(rq0);
> write A=1;
> unlock(rq0);
>
> lock(rq0);
> read A=1;
> write B=2;
> unlock(rq0);
>
> read B=2;
> smp_rmb();
> read A=1;
>
> I think we need to fix the comments rather than loose the requirement.
> Peter?

Yes, ISTR people relying on schedule() being RCsc, and I just picked a
bad example.

2018-03-12 09:14:36

by 焦晓冬

Subject: Re: smp_mb__after_spinlock requirement too strong?

On Mon, Mar 12, 2018 at 4:56 PM, Peter Zijlstra <[email protected]> wrote:
> On Mon, Mar 12, 2018 at 04:56:00PM +0800, Boqun Feng wrote:
>> So I think the purpose of smp_mb__after_spinlock() is to provide RCsc
>> locks, it's just the comments before that may be misleading. We want
>> RCsc locks in schedule code because we want writes in different critical
>> section are ordered even outside the critical sections, for case like:
>>
>> CPU 0 CPU 1 CPU 2
>>
>> {A =0 , B = 0}
>> lock(rq0);
>> write A=1;
>> unlock(rq0);
>>
>> lock(rq0);
>> read A=1;
>> write B=2;
>> unlock(rq0);
>>
>> read B=2;
>> smp_rmb();
>> read A=1;
>>
>> I think we need to fix the comments rather than loose the requirement.
>> Peter?
>
> Yes, ISTR people relying on schedule() being RCsc, and I just picked a
> bad exmaple.

Hi, Peter,
If the fixed comment could point out where this RCsc property is relied on, that would be great.

2018-03-12 13:25:47

by Andrea Parri

Subject: Re: smp_mb__after_spinlock requirement too strong?

Hi Trol,

[...]


> But this is just one special case that acquire-release chains promise us.
>
> A=B=0 as initial
>
> CPU0 CPU1 CPU2 CPU3
> write A=1
> read A=1
> write B=1
> release X
> acquire X
> read A=?
> release Y
>
> acquire Y
>
> read B=?
>
> assurance 1: CPU3 will surely see B=1 writing by CPU1, and
> assurance 2: CPU2 will also see A=1 writing by CPU0 as a special case
>
> The second assurance is both in theory and implemented by real hardware.
>
> As for theory, the C++11 memory model, which is a potential formal model
> for kernel memory model as
> http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2017/p0124r4.html
> descripes, states that:
>
> If a value computation A of an atomic object M happens before a value
> computation B of M, and A takes its value from a side effect X on M, then
> the value computed by B shall either be the value stored by X or the value
> stored by a side effect Y on M, where Y follows X in the modification
> order of M.

A formal memory consistency model for the Linux kernel is now available at:

git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git lkmm

Commit

1c27b644c0fdbc61e113b8faee14baeb8df32486
("Automate memory-barriers.txt; provide Linux-kernel memory model")

provides some information (and references) on the development of this work.

---

You can check the above observation against this model: unless I mis-typed
your snippet,

andrea@andrea:~/linux-rcu/tools/memory-model$ cat trol0.litmus
C trol0

{}

P0(int *a)
{
        WRITE_ONCE(*a, 1);
}

P1(int *a, int *b, int *x)
{
        int r0;

        r0 = READ_ONCE(*a);
        WRITE_ONCE(*b, 1);
        smp_store_release(x, 1);
}

P2(int *a, int *x, int *y)
{
        int r0;
        int r1;

        r0 = smp_load_acquire(x);
        r1 = READ_ONCE(*a);
        smp_store_release(y, 1);
}

P3(int *b, int *y)
{
        int r0;
        int r1;

        r0 = smp_load_acquire(y);
        r1 = READ_ONCE(*b);
}

exists (1:r0=1 /\ 2:r0=1 /\ 3:r0=1 /\ (2:r1=0 \/ 3:r1=0))

andrea@andrea:~/linux-rcu/tools/memory-model$ herd7 -conf linux-kernel.cfg trol0.litmus
Test trol0 Allowed
States 25
1:r0=0; 2:r0=0; 2:r1=0; 3:r0=0; 3:r1=0;
1:r0=0; 2:r0=0; 2:r1=0; 3:r0=0; 3:r1=1;
1:r0=0; 2:r0=0; 2:r1=0; 3:r0=1; 3:r1=0;
1:r0=0; 2:r0=0; 2:r1=0; 3:r0=1; 3:r1=1;
1:r0=0; 2:r0=0; 2:r1=1; 3:r0=0; 3:r1=0;
1:r0=0; 2:r0=0; 2:r1=1; 3:r0=0; 3:r1=1;
1:r0=0; 2:r0=0; 2:r1=1; 3:r0=1; 3:r1=0;
1:r0=0; 2:r0=0; 2:r1=1; 3:r0=1; 3:r1=1;
1:r0=0; 2:r0=1; 2:r1=0; 3:r0=0; 3:r1=0;
1:r0=0; 2:r0=1; 2:r1=0; 3:r0=0; 3:r1=1;
1:r0=0; 2:r0=1; 2:r1=0; 3:r0=1; 3:r1=1;
1:r0=0; 2:r0=1; 2:r1=1; 3:r0=0; 3:r1=0;
1:r0=0; 2:r0=1; 2:r1=1; 3:r0=0; 3:r1=1;
1:r0=0; 2:r0=1; 2:r1=1; 3:r0=1; 3:r1=1;
1:r0=1; 2:r0=0; 2:r1=0; 3:r0=0; 3:r1=0;
1:r0=1; 2:r0=0; 2:r1=0; 3:r0=0; 3:r1=1;
1:r0=1; 2:r0=0; 2:r1=0; 3:r0=1; 3:r1=0;
1:r0=1; 2:r0=0; 2:r1=0; 3:r0=1; 3:r1=1;
1:r0=1; 2:r0=0; 2:r1=1; 3:r0=0; 3:r1=0;
1:r0=1; 2:r0=0; 2:r1=1; 3:r0=0; 3:r1=1;
1:r0=1; 2:r0=0; 2:r1=1; 3:r0=1; 3:r1=0;
1:r0=1; 2:r0=0; 2:r1=1; 3:r0=1; 3:r1=1;
1:r0=1; 2:r0=1; 2:r1=1; 3:r0=0; 3:r1=0;
1:r0=1; 2:r0=1; 2:r1=1; 3:r0=0; 3:r1=1;
1:r0=1; 2:r0=1; 2:r1=1; 3:r0=1; 3:r1=1;
No
Witnesses
Positive: 0 Negative: 25
Condition exists (1:r0=1 /\ 2:r0=1 /\ 3:r0=1 /\ (2:r1=0 \/ 3:r1=0))
Observation trol0 Never 0 25
Time trol0 0.03
Hash=21369772c98e442dd382bd84b43067ee

Please see "tools/memory-model/README" or "tools/memory-model/Documentation/"
for further information about these tools/model.

Best,
Andrea


>
> at
> $1.10 rule 18, on page 14
> http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2014/n4296.pdf
>
> As for real hardware, Luc provided detailed test and explanation on
> ARM and POWER in 5.1 Cumulative Barriers for WRC on page 19
> in this paper:
>
> A Tutorial Introduction to the ARM and POWER Relaxed Memory Models
> https://www.cl.cam.ac.uk/~pes20/ppc-supplemental/test7.pdf
>
> So, I think we may remove RCsc from smp_mb__after_spinlock which is
> really confusing.
>
> Best Regards,
> Trol
>
> >
> >> And for stopped tasks,
> >>
> >> CPU0 CPU1 CPU2
> >>
> >> <ACCESS before schedule out A>
> >>
> >> lock(rq0)
> >> schedule out A
> >> remove A from rq0
> >> store-release(A->on_cpu)
> >> unock(rq0)
> >>
> >> load_acquire(A->on_cpu)
> >> set_task_cpu(A, 2)
> >>
> >> lock(rq2)
> >> add A into rq2
> >> unlock(rq2)
> >>
> >> lock(rq2)
> >> schedule in A
> >> unlock(rq2)
> >>
> >> <ACCESS after schedule in A>
> >>
> >> <ACCESS before schedule out A> happens-before
> >> store-release(A->on_cpu) happens-before
> >> load_acquire(A->on_cpu) happens-before
> >> unlock(rq2) happens-before
> >> lock(rq2) happens-before
> >> <ACCESS after schedule in A>
> >>
> >> So, I think the only requirement to smp_mb__after_spinlock is
> >> to guarantee a STORE before the spin_lock() is ordered
> >> against a LOAD after it. So we could remove the RCsc requirement
> >> to allow more efficient implementation.
> >>
> >> Did I miss something or this RCsc requirement does not really matter?

2018-03-12 13:33:20

by Peter Zijlstra

Subject: Re: smp_mb__after_spinlock requirement too strong?

On Mon, Mar 12, 2018 at 05:13:03PM +0800, 焦晓冬 wrote:

> If the fixed comment could point out where this RCsc is used, it will be great.

The one that comes to mind first is RCU-sched, which relies on a context
switch implying a full smp_mb(). But I think there's more..

2018-03-12 14:11:35

by 焦晓冬

Subject: Re: smp_mb__after_spinlock requirement too strong?

On Mon, Mar 12, 2018 at 9:24 PM, Andrea Parri <[email protected]> wrote:
> Hi Trol,
>
> [...]
>
>
>> But this is just one special case that acquire-release chains promise us.
>>
>> A=B=0 as initial
>>
>> CPU0 CPU1 CPU2 CPU3
>> write A=1
>> read A=1
>> write B=1
>> release X
>> acquire X
>> read A=?
>> release Y
>>
>> acquire Y
>>
>> read B=?
>>
>> assurance 1: CPU3 will surely see B=1 writing by CPU1, and
>> assurance 2: CPU2 will also see A=1 writing by CPU0 as a special case
>>
>> The second assurance is both in theory and implemented by real hardware.
>>
>> As for theory, the C++11 memory model, which is a potential formal model
>> for kernel memory model as
>> http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2017/p0124r4.html
>> descripes, states that:
>>
>> If a value computation A of an atomic object M happens before a value
>> computation B of M, and A takes its value from a side effect X on M, then
>> the value computed by B shall either be the value stored by X or the value
>> stored by a side effect Y on M, where Y follows X in the modification
>> order of M.
>
> A formal memory consistency model for the Linux kernel is now available at:
>
> git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git lkmm
>
> Commit
>
> 1c27b644c0fdbc61e113b8faee14baeb8df32486
> ("Automate memory-barriers.txt; provide Linux-kernel memory model")
>
> provides some information (and references) on the development of this work.
>
> ---
>
> You can check the above observation against this model: unless I mis-typed
> your snippet,
>
> andrea@andrea:~/linux-rcu/tools/memory-model$ cat trol0.litmus
> C trol0
>
> {}
>
> P0(int *a)
> {
> WRITE_ONCE(*a, 1);
> }
>
> P1(int *a, int *b, int *x)
> {
> int r0;
>
> r0 = READ_ONCE(*a);
> WRITE_ONCE(*b, 1);
> smp_store_release(x, 1);
> }
>
> P2(int *a, int *x, int *y)
> {
> int r0;
> int r1;
>
> r0 = smp_load_acquire(x);
> r1 = READ_ONCE(*a);
> smp_store_release(y, 1);
> }
>
> P3(int *b, int *y)
> {
> int r0;
> int r1;
>
> r0 = smp_load_acquire(y);
> r1 = READ_ONCE(*b);
> }
>
> exists (1:r0=1 /\ 2:r0=1 /\ 3:r0=1 /\ (2:r1=0 \/ 3:r1=0))
>
> andrea@andrea:~/linux-rcu/tools/memory-model$ herd7 -conf linux-kernel.cfg trol0.litmus
> Test trol0 Allowed
> States 25
> 1:r0=0; 2:r0=0; 2:r1=0; 3:r0=0; 3:r1=0;
> 1:r0=0; 2:r0=0; 2:r1=0; 3:r0=0; 3:r1=1;
> 1:r0=0; 2:r0=0; 2:r1=0; 3:r0=1; 3:r1=0;
> 1:r0=0; 2:r0=0; 2:r1=0; 3:r0=1; 3:r1=1;
> 1:r0=0; 2:r0=0; 2:r1=1; 3:r0=0; 3:r1=0;
> 1:r0=0; 2:r0=0; 2:r1=1; 3:r0=0; 3:r1=1;
> 1:r0=0; 2:r0=0; 2:r1=1; 3:r0=1; 3:r1=0;
> 1:r0=0; 2:r0=0; 2:r1=1; 3:r0=1; 3:r1=1;
> 1:r0=0; 2:r0=1; 2:r1=0; 3:r0=0; 3:r1=0;
> 1:r0=0; 2:r0=1; 2:r1=0; 3:r0=0; 3:r1=1;
> 1:r0=0; 2:r0=1; 2:r1=0; 3:r0=1; 3:r1=1;
> 1:r0=0; 2:r0=1; 2:r1=1; 3:r0=0; 3:r1=0;
> 1:r0=0; 2:r0=1; 2:r1=1; 3:r0=0; 3:r1=1;
> 1:r0=0; 2:r0=1; 2:r1=1; 3:r0=1; 3:r1=1;
> 1:r0=1; 2:r0=0; 2:r1=0; 3:r0=0; 3:r1=0;
> 1:r0=1; 2:r0=0; 2:r1=0; 3:r0=0; 3:r1=1;
> 1:r0=1; 2:r0=0; 2:r1=0; 3:r0=1; 3:r1=0;
> 1:r0=1; 2:r0=0; 2:r1=0; 3:r0=1; 3:r1=1;
> 1:r0=1; 2:r0=0; 2:r1=1; 3:r0=0; 3:r1=0;
> 1:r0=1; 2:r0=0; 2:r1=1; 3:r0=0; 3:r1=1;
> 1:r0=1; 2:r0=0; 2:r1=1; 3:r0=1; 3:r1=0;
> 1:r0=1; 2:r0=0; 2:r1=1; 3:r0=1; 3:r1=1;
> 1:r0=1; 2:r0=1; 2:r1=1; 3:r0=0; 3:r1=0;
> 1:r0=1; 2:r0=1; 2:r1=1; 3:r0=0; 3:r1=1;
> 1:r0=1; 2:r0=1; 2:r1=1; 3:r0=1; 3:r1=1;
> No
> Witnesses
> Positive: 0 Negative: 25
> Condition exists (1:r0=1 /\ 2:r0=1 /\ 3:r0=1 /\ (2:r1=0 \/ 3:r1=0))
> Observation trol0 Never 0 25
> Time trol0 0.03
> Hash=21369772c98e442dd382bd84b43067ee
>
> Please see "tools/memory-model/README" or "tools/memory-model/Documentation/"
> for further information about these tools/model.
>
> Best,
> Andrea
>

This work is amazingly great, Andrea.
I'd like to study it.

>
>>
>> at
>> $1.10 rule 18, on page 14
>> http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2014/n4296.pdf
>>
>> As for real hardware, Luc provided detailed test and explanation on
>> ARM and POWER in 5.1 Cumulative Barriers for WRC on page 19
>> in this paper:
>>
>> A Tutorial Introduction to the ARM and POWER Relaxed Memory Models
>> https://www.cl.cam.ac.uk/~pes20/ppc-supplemental/test7.pdf
>>
>> So, I think we may remove RCsc from smp_mb__after_spinlock which is
>> really confusing.
>>
>> Best Regards,
>> Trol
>>
>> >
>> >> And for stopped tasks,
>> >>
>> >> CPU0 CPU1 CPU2
>> >>
>> >> <ACCESS before schedule out A>
>> >>
>> >> lock(rq0)
>> >> schedule out A
>> >> remove A from rq0
>> >> store-release(A->on_cpu)
>> >> unock(rq0)
>> >>
>> >> load_acquire(A->on_cpu)
>> >> set_task_cpu(A, 2)
>> >>
>> >> lock(rq2)
>> >> add A into rq2
>> >> unlock(rq2)
>> >>
>> >> lock(rq2)
>> >> schedule in A
>> >> unlock(rq2)
>> >>
>> >> <ACCESS after schedule in A>
>> >>
>> >> <ACCESS before schedule out A> happens-before
>> >> store-release(A->on_cpu) happens-before
>> >> load_acquire(A->on_cpu) happens-before
>> >> unlock(rq2) happens-before
>> >> lock(rq2) happens-before
>> >> <ACCESS after schedule in A>
>> >>
>> >> So, I think the only requirement to smp_mb__after_spinlock is
>> >> to guarantee a STORE before the spin_lock() is ordered
>> >> against a LOAD after it. So we could remove the RCsc requirement
>> >> to allow more efficient implementation.
>> >>
>> >> Did I miss something or this RCsc requirement does not really matter?