The model to access variable stopmachine_state is that a main thread
writes it and other threads read it. Its declaration has no sign
volatile. In the while loop in function stopmachine, this variable is
read, and compiler might optimize it by reading it once before the loop
and not reading it again in the loop, so the thread might enter dead
loop.
Here is the patch to fix it.
Signed-off-by: Zhang Yanmin <[email protected]>
Hi!
> The model to access variable stopmachine_state is that a main thread
> writes it and other threads read it. Its declaration has no sign
> volatile. In the while loop in function stopmachine, this variable is
> read, and compiler might optimize it by reading it once before the loop
> and not reading it again in the loop, so the thread might enter dead
> loop.
No. volatile may look like a solution, but it usually is not. You may
need some barriers, atomic_t or locking.
Pavel
--
Thanks, Sharp!
>>-----Original Message-----
>>From: Pavel Machek [mailto:[email protected]]
>>Sent: 2005??12??1?? 6:20
>>To: Zhang, Yanmin
>>Cc: [email protected]; Pallipadi, Venkatesh; Shah, Rajesh
>>Subject: Re: [BUG] Variable stopmachine_state should be volatile
>>
>>Hi!
>>
>>> The model to access variable stopmachine_state is that a main thread
>>> writes it and other threads read it. Its declaration has no sign
>>> volatile. In the while loop in function stopmachine, this variable is
>>> read, and compiler might optimize it by reading it once before the loop
>>> and not reading it again in the loop, so the thread might enter dead
>>> loop.
>>
>>No. volatile may look like a solution, but it usually is not. You may
>>need some barriers, atomic_t or locking.
>> Pavel
The original functions already use smp_mb/smp_wmb. My patch just tells compiler not to optimize by bringing the reading of stopmachine_state out of the while loop.
Hi!
> >>Hi!
> >>
> >>> The model to access variable stopmachine_state is that a main thread
> >>> writes it and other threads read it. Its declaration has no sign
> >>> volatile. In the while loop in function stopmachine, this variable is
> >>> read, and compiler might optimize it by reading it once before the loop
> >>> and not reading it again in the loop, so the thread might enter dead
> >>> loop.
> >>
> >>No. volatile may look like a solution, but it usually is not. You may
> >>need some barriers, atomic_t or locking.
> >> Pavel
> The original functions already use smp_mb/smp_wmb. My patch just
>tells compiler not to optimize by bringing the reading of
>stopmachine_state out of the while loop.
Those barriers should already prevent compiler optimalization, no? If
they do not, just use some barriers that do.
Pavel
--
Thanks, Sharp!
>>-----Original Message-----
>>From: Pavel Machek [mailto:[email protected]]
>>Sent: 2005??12??1?? 11:26
>>To: Zhang, Yanmin
>>Cc: [email protected]; Pallipadi, Venkatesh; Shah, Rajesh
>>Subject: Re: [BUG] Variable stopmachine_state should be volatile
>>
>>Hi!
>>
>>> >>Hi!
>>> >>
>>> >>> The model to access variable stopmachine_state is that a main thread
>>> >>> writes it and other threads read it. Its declaration has no sign
>>> >>> volatile. In the while loop in function stopmachine, this variable is
>>> >>> read, and compiler might optimize it by reading it once before the loop
>>> >>> and not reading it again in the loop, so the thread might enter dead
>>> >>> loop.
>>> >>
>>> >>No. volatile may look like a solution, but it usually is not. You may
>>> >>need some barriers, atomic_t or locking.
>>> >> Pavel
>>> The original functions already use smp_mb/smp_wmb. My patch just
>>>tells compiler not to optimize by bringing the reading of
>>>stopmachine_state out of the while loop.
>>
>>Those barriers should already prevent compiler optimalization, no? If
>>they do not, just use some barriers that do.
I hit the problem when compiling 2.6 kernel by intel compiler.
How about to change the type of stopmachine_state to atomic_t?
Hi!
> >>Those barriers should already prevent compiler optimalization, no? If
> >>they do not, just use some barriers that do.
>
>
> I hit the problem when compiling 2.6 kernel by intel compiler.
> How about to change the type of stopmachine_state to atomic_t?
That is certainly safest / very conservative solution.
Pavel
--
Thanks, Sharp!
>>-----Original Message-----
>>From: Pavel Machek [mailto:[email protected]]
>>Sent: 2005??12??1?? 18:26
>>To: Zhang, Yanmin
>>Cc: [email protected]; Pallipadi, Venkatesh; Shah, Rajesh
>>Subject: Re: [BUG] Variable stopmachine_state should be volatile
>>
>>Hi!
>>
>>> >>Those barriers should already prevent compiler optimalization, no? If
>>> >>they do not, just use some barriers that do.
>>>
>>>
>>> I hit the problem when compiling 2.6 kernel by intel compiler.
>>> How about to change the type of stopmachine_state to atomic_t?
>>
>>That is certainly safest / very conservative solution.
>> Pavel
Thanks. The functions are not performance sensitive, so atomic_t is ok. Here is the new patch.
Hi!
> >Thanks. The functions are not performance sensitive, so
> >atomic_t is ok. Here is the new patch.
> >
>
> The only reason atomic_t will help in this case is because it uses volatile for internal counter, as it does nothing additional on atomic_read(). I don't get what is the issue with using volatile directly. May be I am missing something. Pavel can you elaborate please.
>
Look at atomic_t again. It is definitely not "just volatile",
definitely not on non-i386 architectures.
> The code here is doing something like this
>
> While (variable != specific_value)
> cpu_relax();
So either do
while (variable != value)
mb();
or better use atomic_t.
> So, making variable atomic or declaring variable as volatile should have the same impact here. Note there is only one CPU setting this variable and rest of the CPUs just read it.
>
> If volatile is not good, probably we need to find something other than atomic as well.
>
atomic_t works, is used across the kernel, and works on all
architectures. volatile int does not. Do not use it.
Pavel
--
Thanks, Sharp!
On Wed, 2005-11-30 at 10:04 +0800, Zhang, Yanmin wrote:
> The model to access variable stopmachine_state is that a main thread
> writes it and other threads read it. Its declaration has no sign
> volatile. In the while loop in function stopmachine, this variable is
> read, and compiler might optimize it by reading it once before the loop
> and not reading it again in the loop, so the thread might enter dead
> loop.
cpu_relax() includes a compiler barier..... so... what's wrong with the
compiler that it ignores such barriers?
>>-----Original Message-----
>>From: Arjan van de Ven [mailto:[email protected]]
>>Sent: 2005??12??2?? 16:04
>>To: Zhang, Yanmin
>>Cc: [email protected]; Pallipadi, Venkatesh; Shah, Rajesh
>>Subject: Re: [BUG] Variable stopmachine_state should be volatile
>>
>>On Wed, 2005-11-30 at 10:04 +0800, Zhang, Yanmin wrote:
>>> The model to access variable stopmachine_state is that a main thread
>>> writes it and other threads read it. Its declaration has no sign
>>> volatile. In the while loop in function stopmachine, this variable is
>>> read, and compiler might optimize it by reading it once before the loop
>>> and not reading it again in the loop, so the thread might enter dead
>>> loop.
>>
>>cpu_relax() includes a compiler barier..... so... what's wrong with the
>>compiler that it ignores such barriers?
You are right. I hit the problem when I compiled kernel 2.6.9 on IA64 by intel compiler.
cpu_relax has the compiler barrier if we use gcc, but cpu_relax becomes just ia64_hint which is null when I use intel compiler to compile kernel on ia64. file include/asm-ia64/intel_intrin.h defines ia64_hint as null.
Function stopmachine in kernel/stop_machine.c uses cpu_relax to prevent compiler from moving the reading of stopmachine_state out of the while loop. But when we use intel compiler, cpu_relax doesn't work because it is just null.
The right approach is to define ia64_hint to ia64_barrier in file include/asm-ia64/intel_intrin.h. I tested the new approach and it does work.
Thank Arjan, Pavel, and Venki.
> The right approach is to define ia64_hint to ia64_barrier in file
> include/asm-ia64/intel_intrin.h. I tested the new approach and it
> does work.
Does that get you a "hint@pause" instruction inside the loop? If not, then
it isn't all the way to the "right" approach.
-Tony
>>-----Original Message-----
>>From: [email protected]
>>[mailto:[email protected]] On Behalf Of Luck, Tony
>>Sent: 2005??12??9?? 2:53
>>To: Zhang, Yanmin; Arjan van de Ven; Pavel Machek
>>Cc: [email protected]; Pallipadi, Venkatesh; Shah, Rajesh;
>>[email protected]
>>Subject: RE: [BUG] Variable stopmachine_state should be volatile
>>
>>> The right approach is to define ia64_hint to ia64_barrier in file
>>> include/asm-ia64/intel_intrin.h. I tested the new approach and it
>>> does work.
>>
>>Does that get you a "hint@pause" instruction inside the loop? If not, then
>>it isn't all the way to the "right" approach.
Tony,
The approach just fixes compiler scheduling barrier problem of cpu_relax, and there is no hint@pause inside the loop.
Today I tried the latest icc, 9.0.027. It provides __hint, a new intrinsic, to support hint@pause. __hint also has the meaning of compiler scheduling barrier. So the best solution is to use __hint(0). The disassembled result showed the new intrinsic did work.
> You are right. I hit the problem when I compiled kernel 2.6.9 on IA64 by intel compiler.
> cpu_relax has the compiler barrier if we use gcc, but cpu_relax becomes just ia64_hint which is null when I use intel compiler to compile kernel on ia64. file include/asm-ia64/intel_intrin.h defines ia64_hint as null.
Please report all bugs using ICC to intel. We can't and don't want to debug
this mess.