LinuxLists.cc - Re: [PATCH 0/2] jump label: 2.6.38 updates

2011-02-17 01:55:47

Subject: Re: [PATCH 0/2] jump label: 2.6.38 updates

(2011/02/16 22:24), Mathieu Desnoyers wrote:
> * Will Newton ([email protected]) wrote:
>> On Wed, Feb 16, 2011 at 12:18 PM, Steven Rostedt <[email protected]> wrote:
>>> On Wed, 2011-02-16 at 10:15 +0000, Will Newton wrote:
>>>
>>>>> That's some really crippled hardware... it does seem like *any* loads
>>>>> from *any* address updated by an sc would have to be done with ll as
>>>>> well, else they may load stale values. One could work this into
>>>>> atomic_read(), but surely there are other places that are problems.
>>>>
>>>> I think it's actually ok; atomics have arch-implemented accessors, as
>>>> do spinlocks and atomic bitops. Those are the only places we do sc, so
>>>> we can make sure we always ll or invalidate manually.
>>>
>>> I'm curious, how is cmpxchg() implemented on this architecture? There
>>> are several places in the kernel that use it on regular variables
>>> without any "accessor" functions.
>>
>> We can invalidate the cache manually. The current cpu will see the new
>> value (post-cache invalidate) and the other cpus will see either the
>> old value or the new value depending on whether they read before or
>> after the invalidate, which is racy but I don't think it is
>> problematic. Unless I'm missing something...
>
> Assuming the invalidate is specific to a cache-line, I'm concerned about
> the failure of a scenario like the following:
>
> initially:
> foo = 0
> bar = 0
>
> CPU A                                CPU B
>
> xchg(&foo, 1);
>   ll foo
>   sc foo
>
>   -> interrupt
>
>   if (foo == 1)
>     xchg(&bar, 1);
>       ll bar
>       sc bar
>       invalidate bar
>
>                                      lbar = bar;
>                                      smp_mb()
>                                      lfoo = foo;
>                                      BUG_ON(lbar == 1 && lfoo == 0);
>   invalidate foo
>
> It should be valid to expect that whenever CPU B reads "bar" as 1,
> "foo" also reads as 1. However, in this case, the missing invalidate
> on foo keeps the updated cacheline from reaching CPU B. There seems to
> be a problem with interrupts/NMIs coming right between sc and
> invalidate, as Ingo pointed out.

Hmm, I think that is a miscoding of ll/sc.
If I understand correctly, cache invalidation should usually be done
right before storing the value, as the MSI protocol does.
(or, sc should atomically invalidate the cache line)
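
The interleaving quoted above can be replayed in a small user-space toy
model. This is only a sketch under loud assumptions that are not stated
in the thread: each CPU has its own non-coherent cache, ll/sc go straight
to memory, "invalidate" drops the line from every CPU's cache, and CPU B
already holds a cached copy of foo. All names (plain_load, llsc_xchg_raw,
replay_scenario, ...) are hypothetical and do not come from any kernel
API.

```c
#include <assert.h>
#include <stdbool.h>

enum { FOO, BAR, NVARS };
enum { CPU_A, CPU_B, NCPUS };

static int  memory[NVARS];              /* backing store shared by both CPUs */
static int  cache_val[NCPUS][NVARS];    /* per-CPU cached copies */
static bool cache_valid[NCPUS][NVARS];

static void model_reset(void)
{
    for (int v = 0; v < NVARS; v++) {
        memory[v] = 0;
        for (int c = 0; c < NCPUS; c++)
            cache_valid[c][v] = false;
    }
}

/* Plain load: served from the local cache if valid, else filled from memory. */
static int plain_load(int cpu, int var)
{
    assert(cpu >= 0 && cpu < NCPUS && var >= 0 && var < NVARS);
    if (!cache_valid[cpu][var]) {
        cache_val[cpu][var] = memory[var];
        cache_valid[cpu][var] = true;
    }
    return cache_val[cpu][var];
}

/* ll + sc pair. ASSUMPTION: it bypasses the caches and touches memory only. */
static int llsc_xchg_raw(int var, int newval)
{
    int old = memory[var];
    memory[var] = newval;
    return old;
}

/* ASSUMPTION: invalidate drops the line from every CPU's cache. */
static void invalidate(int var)
{
    for (int c = 0; c < NCPUS; c++)
        cache_valid[c][var] = false;
}

/*
 * Replays the quoted interleaving. Returns true when CPU B would hit
 * BUG_ON(lbar == 1 && lfoo == 0). If invalidate_foo_early is set, foo is
 * invalidated together with the sc, before the interrupt window opens.
 */
static bool replay_scenario(bool invalidate_foo_early)
{
    model_reset();
    (void)plain_load(CPU_B, FOO);       /* ASSUMPTION: CPU B cached foo == 0 */

    llsc_xchg_raw(FOO, 1);              /* CPU A: ll foo / sc foo */
    if (invalidate_foo_early)
        invalidate(FOO);

    /* CPU A: interrupt arrives here */
    if (plain_load(CPU_A, FOO) == 1) {
        llsc_xchg_raw(BAR, 1);
        invalidate(BAR);
    }

    int lbar = plain_load(CPU_B, BAR);  /* CPU B: lbar = bar */
    int lfoo = plain_load(CPU_B, FOO);  /* CPU B: lfoo = foo (after smp_mb) */

    if (!invalidate_foo_early)
        invalidate(FOO);                /* CPU A: the late invalidate */

    return lbar == 1 && lfoo == 0;
}
```

In this model, replay_scenario(false) reports the BUG_ON condition and
replay_scenario(true) does not, which is the difference between a late
invalidate and one that is effectively atomic with the sc.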

Thank you,

--
Masami HIRAMATSU
2nd Dept. Linux Technology Center
Hitachi, Ltd., Systems Development Laboratory
E-mail: [email protected]

2011-02-17 03:20:39

by H. Peter Anvin

Subject: Re: [PATCH 0/2] jump label: 2.6.38 updates

On 02/16/2011 05:55 PM, Masami Hiramatsu wrote:
>
> Hmm, I think that is a miscoding of ll/sc.
> If I understand correctly, cache invalidation should usually be done
> right before storing the value, as the MSI protocol does.
> (or, sc should atomically invalidate the cache line)
>

I suspect in this case one should flush the cache line before ll (a
cache flush will typically invalidate the ll/sc link.)

-hpa

--
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel. I don't speak on their behalf.

2011-02-17 16:03:20

by Mathieu Desnoyers

Subject: Re: [PATCH 0/2] jump label: 2.6.38 updates

* H. Peter Anvin ([email protected]) wrote:
> On 02/16/2011 05:55 PM, Masami Hiramatsu wrote:
> >
> > Hmm, I think that is a miscoding of ll/sc.
> > If I understand correctly, cache invalidation should usually be done
> > right before storing the value, as the MSI protocol does.
> > (or, sc should atomically invalidate the cache line)
> >
>
> I suspect in this case one should flush the cache line before ll (a
> cache flush will typically invalidate the ll/sc link.)

Hrm, but if you have:

invalidate
  -> interrupt
     read (fetch the invalidated cacheline)
ll
sc

you basically end up in a situation similar to not having any
invalidate, no? AFAIU, disabling interrupts around the whole
ll-sc-invalidate (or invalidate-ll-sc) sequence seems to be required on
this specific architecture, so that the invalidation is made "atomic"
with the ll-sc pair from the point of view of one hardware thread.
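
To make the invalidate-ll-sc case concrete, here is a toy single-CPU
model of it. It is a sketch only, under assumptions of mine: ll/sc
bypass the cache, and the interrupt handler does nothing except read
foo. The names (irq_point, xchg_invalidate_first, stale_after_xchg) are
hypothetical, and the irqs_enabled flag merely stands in for something
like local_irq_save()/local_irq_restore().

```c
#include <assert.h>
#include <stdbool.h>

static int  mem_foo;              /* value in memory */
static int  cached_foo;           /* this CPU's cached copy */
static bool cached_valid;
static bool irqs_enabled = true;  /* stand-in for the CPU's interrupt flag */

/* Plain load: hits the cache if valid, else refills from memory. */
static int plain_load(void)
{
    if (!cached_valid) {
        cached_foo = mem_foo;
        cached_valid = true;
    }
    return cached_foo;
}

static void invalidate(void)
{
    cached_valid = false;
}

/* A point where an interrupt may land; the modeled handler just reads foo. */
static void irq_point(void)
{
    if (irqs_enabled)
        (void)plain_load();
}

/* invalidate-ll-sc, optionally with interrupts disabled around the sequence. */
static int xchg_invalidate_first(int newval, bool disable_irqs)
{
    bool saved = irqs_enabled;

    if (disable_irqs)
        irqs_enabled = false;

    invalidate();
    irq_point();                  /* interrupt between invalidate and ll */
    int old = mem_foo;            /* ll (ASSUMPTION: bypasses the cache) */
    mem_foo = newval;             /* sc (ASSUMPTION: bypasses the cache) */
    assert(mem_foo == newval);    /* the sc cannot fail in this model */

    irqs_enabled = saved;
    return old;
}

/* Returns true if a later plain load on this CPU sees a stale foo. */
static bool stale_after_xchg(bool disable_irqs)
{
    mem_foo = 0;
    cached_valid = false;
    irqs_enabled = true;

    xchg_invalidate_first(1, disable_irqs);
    return plain_load() != mem_foo;
}
```

With interrupts left enabled the handler refetches the line right after
the invalidate, so the later load is stale; with them disabled around
the sequence it is not.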

Mathieu

--
Mathieu Desnoyers
Operating System Efficiency R&D Consultant
EfficiOS Inc.
http://www.efficios.com