2008-08-21 03:23:56

by Jeffrey V. Merkey

[permalink] [raw]
Subject: [ANNOUNCE] mdb: Merkey's Linux Kernel Debugger 2.6.27-rc4 released

Changelog:

added MAGIC_SYSRQ support. removed notify_die handler and changes
to drivers/char/keyboard.c. save or restore pt_regs for cases where
panic is called from non-exception context. handled cases
where the per_cpu registers and context are not valid/saved
when panic is called from non-exception contexts.

removed more checkpatch.pl messages. lots more messages still to
work on. volatiles left in the code due to the previously stated
(and still present) severe breakage of the GNU compiler with SMP
shared data. most of the barrier() functions are just plain broken
and do not result in proper compiler behavior in this tree. it
helps having a working debugger to actually see these compiler problems.


2.6.27-rc4 1/27 mdb: export genapic and machine_restart functions

export genapic and machine_restart functions


2.6.27-rc4 2/27 mdb: add makefile for Merkey's Linux Kernel Debugger module

add makefile for Merkey's Linux Kernel Debugger module


2.6.27-rc4 3/27 mdb: add local Makefile for Merkey's Linux Kernel Debugger module

add local Makefile for Merkey's Linux Kernel Debugger module


2.6.27-rc4 4/27 mdb: add mdb-base.c architecture independent functions

add mdb-base.c architecture independent functions


2.6.27-rc4 5/27 mdb: add mdb-base.h architecture independent include

add mdb-base.h architecture independent include


2.6.27-rc4 6/27 mdb: add mdb.h architecture main include

add mdb.h architecture main include


2.6.27-rc4 7/27 mdb: add mdb-ia32.c Intel x86 architecture functions

add mdb-ia32.c Intel x86 architecture functions


2.6.27-rc4 8/27 mdb: add mdb-ia32.h Intel x86 architecture includes

add mdb-ia32.h Intel x86 architecture includes


2.6.27-rc4 9/27 mdb: add mdb-ia32.c Intel x86 architecture function includes

add mdb-ia32.c Intel x86 architecture function includes



2.6.27-rc4 10/27 mdb: add mdb-ia32-support.c disassembler support

add mdb-ia32-support.c disassembler support


2.6.27-rc4 11/27 mdb: add mdb-keyboard.h keyboard function includes

add mdb-keyboard.h keyboard function includes


2.6.27-rc4 12/27 mdb: add mdb-list.c debugger parser functions

add mdb-list.c debugger parser functions

2.6.27-rc4 13/27 mdb: add mdb-list.h debugger parser function includes

add mdb-list.h debugger parser function includes

2.6.27-rc4 14/27 mdb: add mdb-logic.c math parser for Merkey's Linux Kernel Debugger

add mdb-logic.c math parser for Merkey's Linux Kernel Debugger


2.6.27-rc4 15/27 mdb: add mdb-main.c linux debugger entry functions

add mdb-main.c linux debugger entry functions

add MAGIC_SYSRQ keyboard support. add support for panic notify_die
callback.


2.6.27-rc4 16/27 mdb: add mdb-os.c operating system specific functions

add mdb-os.c operating system specific functions


2.6.27-rc4 17/27 mdb: add mdb-os.h operating system specific functions

add mdb-os.h operating system specific functions


2.6.27-rc4 18/27 mdb: add mdb-proc.h function prototype includes

add mdb-proc.h function prototype includes


2.6.27-rc4 19/27 mdb: add MAGIC_SYSRQ support for mdb to documentation

add MAGIC_SYSRQ support for mdb to documentation


2.6.27-rc4 20/27 mdb: export kmsg_redirect printk for console redirection

export kmsg_redirect printk for console redirection


2.6.27-rc4 21/27 mdb: export __kernel_text_address

export __kernel_text_address


2.6.27-rc4 22/27 mdb: add mdb_kallsyms function and export kallsyms lookup functions

add mdb_kallsyms function and export kallsyms lookup functions


2.6.27-rc4 23/27 mdb: add mdb_modules function

add mdb_modules function


2.6.27-rc4 24/27 mdb: add notify_die handler to system panic

add notify_die handler to system panic


2.6.27-rc4 25/27 mdb: export clocksource watchdog function

export clocksource watchdog function


2.6.27-rc4 26/27 mdb: add Kconfig.debug sections and documentation

add Kconfig.debug sections and documentation

add help sections for MAGIC_SYSRQ keyboard support.


2.6.27-rc4 27/27 mdb: add /debug directory to main Makefile

add /debug directory to main Makefile

TO-DO

Add nested task gates to ia32 IDT for Linux to enable switching to a
good stack during exceptions.
X86_64 disassembler support.
ia64 disassembler support.

Release Site : http://www.wolfmountaingroup.org
FTP Address : ftp://www.wolfmountaingroup.org/pub/mdb/mdb-2.6.27-rc4-ia32-08-20-08.patch


Signed-off-by: Jeffrey Vernon Merkey ([email protected])


2008-08-21 10:07:27

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [ANNOUNCE] mdb: Merkey's Linux Kernel Debugger 2.6.27-rc4 released

On Wed, 2008-08-20 at 20:50 -0600, [email protected] wrote:

> volatiles left in the code due to the previously stated
> (and still present) severe breakage of the GNU compiler with SMP
> shared data. most of the barrier() functions are just plain broken
> and do not result in proper compiler behavior in this tree.

Can you provide explicit detail?

By using barrier() the compiler should clobber all its memory and
registers therefore forcing a write/reload of the variable.
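
(For reference, barrier() in this tree is essentially

#define barrier() __asm__ __volatile__("" : : : "memory")

i.e. an empty asm with a full memory clobber, so gcc must assume all
memory has changed across it and reload shared variables afterwards.)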


2008-08-21 10:58:45

by Stefan Richter

[permalink] [raw]
Subject: Re: [ANNOUNCE] mdb: Merkey's Linux Kernel Debugger 2.6.27-rc4 released

Peter Zijlstra wrote:
> On Wed, 2008-08-20 at 20:50 -0600, [email protected] wrote:
>
>> volatiles left in the code due to the previously stated
>> (and still present) severe breakage of the GNU compiler with SMP
>> shared data. most of the barrier() functions are just plain broken
>> and do not result in proper compiler behavior in this tree.
>
> Can you provide explicit detail?
>
> By using barrier() the compiler should clobber all its memory and
> registers therefore forcing a write/reload of the variable.

I hope Jeff didn't try mere barrier()s only. smp_wmb() and smp_rmb()
are the more relevant barrier variants for mdb, from what I remember
when I last looked at it.
--
Stefan Richter
-=====-==--- =--- =-=-=
http://arcgraph.de/sr/

2008-08-21 11:03:04

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [ANNOUNCE] mdb: Merkey's Linux Kernel Debugger 2.6.27-rc4 released

On Thu, 2008-08-21 at 12:57 +0200, Stefan Richter wrote:
> Peter Zijlstra wrote:
> > On Wed, 2008-08-20 at 20:50 -0600, [email protected] wrote:
> >
> >> volatiles left in the code due to the previously stated
> >> (and still present) severe breakage of the GNU compiler with SMP
> >> shared data. most of the barrier() functions are just plain broken
> >> and do not result in proper compiler behavior in this tree.
> >
> > Can you provide explicit detail?
> >
> > By using barrier() the compiler should clobber all its memory and
> > registers therefore forcing a write/reload of the variable.
>
> I hope Jeff didn't try mere barrier()s only. smp_wmb() and smp_rmb()
> are the more relevant barrier variants for mdb, from what I remember
> when I last looked at it.

Sure, but volatile isn't a replacement for memory barriers.

2008-08-21 11:47:56

by Paul E. McKenney

[permalink] [raw]
Subject: Re: [ANNOUNCE] mdb: Merkey's Linux Kernel Debugger 2.6.27-rc4 released

On Thu, Aug 21, 2008 at 01:02:48PM +0200, Peter Zijlstra wrote:
> On Thu, 2008-08-21 at 12:57 +0200, Stefan Richter wrote:
> > Peter Zijlstra wrote:
> > > On Wed, 2008-08-20 at 20:50 -0600, [email protected] wrote:
> > >
> > >> volatiles left in the code due to the previously stated
> > >> (and still present) severe breakage of the GNU compiler with SMP
> > >> shared data. most of the barrier() functions are just plain broken
> > >> and do not result in proper compiler behavior in this tree.
> > >
> > > Can you provide explicit detail?
> > >
> > > By using barrier() the compiler should clobber all its memory and
> > > registers therefore forcing a write/reload of the variable.
> >
> > I hope Jeff didn't try mere barrier()s only. smp_wmb() and smp_rmb()
> > are the more relevant barrier variants for mdb, from what I remember
> > when I last looked at it.
>
> Sure, but volatile isn't a replacement for memory barriers.

Let's face it, the C standard does not support concurrency, so we are
all in a state of sin in any case, forced to rely on combinations of
gcc-specific non-standard language extensions and assembly language.

Could be worse!!!

Thanx, Paul

2008-08-21 12:04:10

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [ANNOUNCE] mdb: Merkey's Linux Kernel Debugger 2.6.27-rc4 released

On Thu, 2008-08-21 at 04:47 -0700, Paul E. McKenney wrote:
> On Thu, Aug 21, 2008 at 01:02:48PM +0200, Peter Zijlstra wrote:
> > On Thu, 2008-08-21 at 12:57 +0200, Stefan Richter wrote:
> > > Peter Zijlstra wrote:
> > > > On Wed, 2008-08-20 at 20:50 -0600, [email protected] wrote:
> > > >
> > > >> volatiles left in the code due to the previously stated
> > > >> (and still present) severe breakage of the GNU compiler with SMP
> > > >> shared data. most of the barrier() functions are just plain broken
> > > >> and do not result in proper compiler behavior in this tree.
> > > >
> > > > Can you provide explicit detail?
> > > >
> > > > By using barrier() the compiler should clobber all its memory and
> > > > registers therefore forcing a write/reload of the variable.
> > >
> > > I hope Jeff didn't try mere barrier()s only. smp_wmb() and smp_rmb()
> > > are the more relevant barrier variants for mdb, from what I remember
> > > when I last looked at it.
> >
> > Sure, but volatile isn't a replacement for memory barriers.
>
> Let's face it, the C standard does not support concurrency, so we are
> all in a state of sin in any case, forced to rely on combinations of
> gcc-specific non-standard language extensions and assembly language.

Hehe, still, a little birdie told me they are working on it and perhaps
someone with clue could enlighten us on their direction.

Still, I'd like Jeff to show his C, the resulting asm and the intent for
the volatile and barrier versions of his code (well, little snippets of
his code obviously).

Either he doesn't understand barriers (nothing to be ashamed about), or
we might have more trouble lurking in the rest of the kernel.

2008-08-21 12:12:52

by Stefan Richter

[permalink] [raw]
Subject: Re: [ANNOUNCE] mdb: Merkey's Linux Kernel Debugger 2.6.27-rc4 released

Paul E. McKenney wrote:
> On Thu, Aug 21, 2008 at 01:02:48PM +0200, Peter Zijlstra wrote:
>> On Thu, 2008-08-21 at 12:57 +0200, Stefan Richter wrote:
>>> Peter Zijlstra wrote:
>>>> On Wed, 2008-08-20 at 20:50 -0600, [email protected] wrote:
>>>>
>>>>> volatiles left in the code due to the previously stated
>>>>> (and still present) severe breakage of the GNU compiler with SMP
>>>>> shared data. most of the barrier() functions are just plain broken
>>>>> and do not result in proper compiler behavior in this tree.
>>>> Can you provide explicit detail?
>>>>
>>>> By using barrier() the compiler should clobber all its memory and
>>>> registers therefore forcing a write/reload of the variable.
>>> I hope Jeff didn't try mere barrier()s only. smp_wmb() and smp_rmb()
>>> are the more relevant barrier variants for mdb, from what I remember
>>> when I last looked at it.
>> Sure, but volatile isn't a replacement for memory barriers.
>
> Let's face it, the C standard does not support concurrency, so we are
> all in a state of sin in any case, forced to rely on combinations of
> gcc-specific non-standard language extensions and assembly language.
>
> Could be worse!!!

Nevertheless, an analysis of which particular parts of code generation
are insufficient if one particular volatile qualification is removed is
IMO likely to turn up places in mdb where a clearer or/and more
efficient implementation is possible. (Based on what I saw a few
revisions ago; I haven't looked at the current one yet.)
--
Stefan Richter
-=====-==--- =--- =-=-=
http://arcgraph.de/sr/

2008-08-21 12:52:21

by Jeffrey V. Merkey

[permalink] [raw]
Subject: Re: [ANNOUNCE] mdb: Merkey's Linux Kernel Debugger 2.6.27-rc4 released

> Paul E. McKenney wrote:
>> On Thu, Aug 21, 2008 at 01:02:48PM +0200, Peter Zijlstra wrote:
>>> On Thu, 2008-08-21 at 12:57 +0200, Stefan Richter wrote:
>>>> Peter Zijlstra wrote:
>>>>> On Wed, 2008-08-20 at 20:50 -0600, [email protected]
>>>>> wrote:
>>>>>
>>>>>> volatiles left in the code due to the previously stated
>>>>>> (and still present) severe breakage of the GNU compiler with SMP
>>>>>> shared data. most of the barrier() functions are just plain broken
>>>>>> and do not result in proper compiler behavior in this tree.
>>>>> Can you provide explicit detail?
>>>>>
>>>>> By using barrier() the compiler should clobber all its memory and
>>>>> registers therefore forcing a write/reload of the variable.
>>>> I hope Jeff didn't try mere barrier()s only. smp_wmb() and smp_rmb()
>>>> are the more relevant barrier variants for mdb, from what I remember
>>>> when I last looked at it.
>>> Sure, but volatile isn't a replacement for memory barriers.
>>
>> Let's face it, the C standard does not support concurrency, so we are
>> all in a state of sin in any case, forced to rely on combinations of
>> gcc-specific non-standard language extensions and assembly language.
>>
>> Could be worse!!!
>
> Nevertheless, an analysis of which particular parts of code generation
> are insufficient if one particular volatile qualification is removed is
> IMO likely to turn up places in mdb where a clearer or/and more
> efficient implementation is possible. (Based on what I saw a few
> revisions ago; I haven't looked at the current one yet.)
> --
> Stefan Richter
> -=====-==--- =--- =-=-=
> http://arcgraph.de/sr/
>

I used the smp_wmb() functions. I noted a few things. a) some of
these macros just emit __asm__ __volatile__ into the code so why not just
say "volatile" to begin with b) smp_wmb() in some cases worked and in
other cases just optimized away the global reference. c) I can go back and
break the code again by inserting them and building broken assembler d) I
have been doing hardware and software design since the early 1980s, I
invented SMP affinity scheduling, and yes, I understand barriers and this
concept of instruction score-boarding and optimization very well -- it's
not an excuse for a busted C compiler.

It did not break all the places in the code, but broke enough for SMP to
lock up and fail. It turned global variables into local variables. If
you want me to reproduce this I can, but it will have to wait until this
evening because I have some product releases to get out the door at Omega
8 today.

It's simple to reproduce. Take away the volatile declaration for the
rlock_t structure in mdb-ia32.c (rlock_t debug_lock) in all code
references and watch the thing lock up in SMP with multiple processors in
the debugger each stuck with their own local copy of debug_lock.
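
Reduced to a sketch (the type and field names here are illustrative,
not the actual rlock_t from the patch):

struct rlock {
        int processor;          /* CPU currently in the debugger */
};

static struct rlock debug_lock; /* shared between all CPUs */

static void wait_turn(int cpu)
{
        /*
         * If gcc caches debug_lock.processor in a register, each
         * CPU spins on its own stale copy forever; volatile (or a
         * barrier() in the loop body) forces a reload on every
         * iteration.
         */
        while (debug_lock.processor != cpu)
                ;
}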

The barrier functions do not appear to work in all cases.

Jeff

2008-08-21 13:01:18

by Jeffrey V. Merkey

[permalink] [raw]
Subject: Re: [ANNOUNCE] mdb: Merkey's Linux Kernel Debugger 2.6.27-rc4 released

>> Paul E. McKenney wrote:
>>> On Thu, Aug 21, 2008 at 01:02:48PM +0200, Peter Zijlstra wrote:
>>>> On Thu, 2008-08-21 at 12:57 +0200, Stefan Richter wrote:
>>>>> Peter Zijlstra wrote:
>>>>>> On Wed, 2008-08-20 at 20:50 -0600, [email protected]
>>>>>> wrote:
>>>>>>
>>>>>>> volatiles left in the code due to the previously stated
>>>>>>> (and still present) severe breakage of the GNU compiler with SMP
>>>>>>> shared data. most of the barrier() functions are just plain
>>>>>>> broken
>>>>>>> and do not result in proper compiler behavior in this tree.
>>>>>> Can you provide explicit detail?
>>>>>>
>>>>>> By using barrier() the compiler should clobber all its memory and
>>>>>> registers therefore forcing a write/reload of the variable.
>>>>> I hope Jeff didn't try mere barrier()s only. smp_wmb() and smp_rmb()
>>>>> are the more relevant barrier variants for mdb, from what I remember
>>>>> when I last looked at it.
>>>> Sure, but volatile isn't a replacement for memory barriers.
>>>
>>> Let's face it, the C standard does not support concurrency, so we are
>>> all in a state of sin in any case, forced to rely on combinations of
>>> gcc-specific non-standard language extensions and assembly language.
>>>
>>> Could be worse!!!
>>
>> Nevertheless, an analysis of which particular parts of code generation
>> are insufficient if one particular volatile qualification is removed is
>> IMO likely to turn up places in mdb where a clearer or/and more
>> efficient implementation is possible. (Based on what I saw a few
>> revisions ago; I haven't looked at the current one yet.)
>> --
>> Stefan Richter
>> -=====-==--- =--- =-=-=
>> http://arcgraph.de/sr/
>>
>
> I used the smp_wmb() functions. I noted a couple of things. a) some of
> these macros just emit __asm__ __volatile__ into the code so why not just
> say "volatile" to begin with b) smp_wmb() in some cases worked and in
> other cases jut optimized away the global reference. c) I can go back and
> break the code again by inserting them and building broken assembler d) I
> ave been doing hardware and software design since the early 1980;s, I
> invented SMP affinity scheduling, and yes, I understand barriers and this
> concept of instruction score-boarding and optimization very well -- its
> not an excuse for a busted C compiler.
>
> It did not break all the places in the code, but broke enough for SMP to
> lock up and fail, It turned global variables into local variables. If
> you want me to reproduce this I can but it will have to wait til this
> evening
> because I have some product releases to get out the door at Omega 8 today.
>
> It's simple to reproduce. Take away the volatile declaration for the
> rlock_t structure in mdb-ia32.c (rlock_t debug_lock) in all code
> references and watch the thing lock up in SMP with multiple processors in
> the debugger each stuck with their own local copy of debug_lock.


Even if you use the smp_wmb() macros around the debug_lock, the compiler
still optimizes the debug_lock into a local variable. After watching the
broken behavior, and given that some of these macros emit __asm__
__volatile__ anyway, I just left the vars declared volatile. It's a
debugger, so it's probably OK for the kernel debugger to use some volatile
data.

Jeff

>
> The barrier functions do not appear to work in all cases.
>
> Jeff
>
>

2008-08-21 13:37:54

by Nick Piggin

[permalink] [raw]
Subject: Re: [ANNOUNCE] mdb: Merkey's Linux Kernel Debugger 2.6.27-rc4 released

On Thursday 21 August 2008 22:26, [email protected] wrote:

> I used the smp_wmb() functions. I noted a couple of things. a) some of
> these macros just emit __asm__ __volatile__ into the code so why not just
> say "volatile" to begin with

It is not the same as volatile type. What it does is tell the compiler
to clobber all registers or temporaries. This is something pretty well
defined and hard to get wrong compared to volatile type.


> b) smp_wmb() in some cases worked and in
> other cases jut optimized away the global reference.

Linux barriers aren't going to force a load to be emitted, if it can be
optimized away. If it optimized away a store, then I'd like to see a
test case.


> c) I can go back and
> break the code again by inserting them and building broken assembler d) I
> ave been doing hardware and software design since the early 1980;s, I
> invented SMP affinity scheduling, and yes, I understand barriers and this
> concept of instruction score-boarding and optimization very well -- its
> not an excuse for a busted C compiler.

The point is not whether it is possible to work with volatile types, but
that we tend not to use them in Linux to deal with concurrency.

Also, barriers seem to work fine for everybody else, so I think it is
likely you either aren't using them correctly, or have other bugs in the
code.


> It did not break all the places in the code, but broke enough for SMP to
> lock up and fail, It turned global variables into local variables. If
> you want me to reproduce this I can but it will have to wait til this
> evening
> because I have some product releases to get out the door at Omega 8 today.
>
> It's simple to reproduce. Take away the volatile declaration for the
> rlock_t structure in mdb-ia32.c (rlock_t debug_lock) in all code
> references and watch the thing lock up in SMP with multiple processors in
> the debugger each stuck with their own local copy of debug_lock.

You should disable preempt before getting the processor id. Can't see any
other possible bugs, but you should be able to see from the disassembly
pretty easily.
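
I.e. something like (a sketch):

int cpu;

preempt_disable();              /* pin this context to one CPU */
cpu = smp_processor_id();       /* stable until preempt_enable() */
/* ... use cpu ... */
preempt_enable();

(or the equivalent get_cpu()/put_cpu() shorthand).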

2008-08-21 14:04:14

by Stefan Richter

[permalink] [raw]
Subject: Re: [ANNOUNCE] mdb: Merkey's Linux Kernel Debugger 2.6.27-rc4 released

[email protected] wrote:
> It's simple to reproduce. Take away the volatile declaration for the
> rlock_t structure in mdb-ia32.c (rlock_t debug_lock) in all code
> references and watch the thing lock up in SMP with multiple processors in
> the debugger each stuck with their own local copy of debug_lock.

I'm having a quick look at mdb-2.6.27-rc4-ia32-08-20-08.patch at the
moment. Speaking of debug_lock()...:

You use spin_trylock_irqsave((spinlock_t *)&rlock->lock, rlock->flags)
in there. Minor nit: The pointer type cast is unnecessary.

Major problem: rlock->flags is wrong in this call. Use an on-stack
flags variable for the initial spin_trylock_irqsave. Ditto in the
following call of spin_trylock_irqsave.
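
E.g. (a sketch of the fix; whether rlock->flags needs to persist at
all is a separate question):

unsigned long flags;            /* on-stack, private to this context */

if (spin_trylock_irqsave(&rlock->lock, flags)) {
        rlock->flags = flags;   /* if it must persist, store it only
                                   once the lock is actually held */
}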

Next major problem with debug_lock() and debug_unlock(): The reference
counting doesn't work. You need an atomic_t counter. Have a look at
the struct kref accessors for example, or even make use of the kref API.
Or if it isn't feasible to fix with atomic_t, add a second spinlock to
rlock_t to ensure integrity of .count (and of the .processor if necessary).

Furthermore, I have doubts about the loop which is entered by CPU B
while CPU A holds the rlock. You are fully aware that atomic_read(a) &&
!atomic_read(b) in its entirety is not atomic, I hope.

All this aside, I don't see *anything* in debug_lock and _unlock which
would necessitate volatile. Well, volatile might have papered over some
of these bugs.

PS:
Try to cut down on #if/#endif clutter. It should be possible to reduce
them at least in .c files; .h are a different matter. For example,
#if MDB_DEBUG_DEBUGGER
DBGPrint("something");
#endif
can be trivially reduced to
dbg_mdb_printk("something");
where dbg_mdb_printk() is defined as an inline function which does
nothing when MDB_DEBUG_DEBUGGER is false.
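
For instance (a sketch):

#if MDB_DEBUG_DEBUGGER
#define dbg_mdb_printk(fmt, args...)    DBGPrint(fmt, ##args)
#else
static inline void dbg_mdb_printk(const char *fmt, ...) {}
#endif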

PS2:
Why are there this many debug printks anyway?
--
Stefan Richter
-=====-==--- =--- =-=-=
http://arcgraph.de/sr/

2008-08-21 14:10:39

by Stefan Richter

[permalink] [raw]
Subject: Re: [ANNOUNCE] mdb: Merkey's Linux Kernel Debugger 2.6.27-rc4 released

Nick Piggin wrote:
> On Thursday 21 August 2008 22:26, [email protected] wrote:
>> It's simple to reproduce. Take away the volatile declaration for the
>> rlock_t structure in mdb-ia32.c (rlock_t debug_lock) in all code
>> references and watch the thing lock up in SMP with multiple processors in
>> the debugger each stuck with their own local copy of debug_lock.
>
> You should disable preempt before getting the processor id. Can't see any
> other possible bugs, but you should be able to see from the disassembly
> pretty easily.

debug_lock() is AFAICS only called from contexts which have preemption
disabled. Last time around I recommended to Jeff to document this
requirement on the calling context.

But even though preemption is disabled, debug_lock() is still incorrect
as I mentioned in my other post a minute ago. It corrupts its .flags
and .count members. (Or maybe it coincidentally doesn't as long as
volatile is around.)
--
Stefan Richter
-=====-==--- =--- =-=-=
http://arcgraph.de/sr/

2008-08-21 14:13:22

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [ANNOUNCE] mdb: Merkey's Linux Kernel Debugger 2.6.27-rc4 released

On Thu, 2008-08-21 at 23:37 +1000, Nick Piggin wrote:
> On Thursday 21 August 2008 22:26, [email protected] wrote:
>
> > I used the smp_wmb() functions. I noted a couple of things. a) some of
> > these macros just emit __asm__ __volatile__ into the code so why not just
> > say "volatile" to begin with
>
> It is not the same as volatile type. What it does is tell the compiler
> to clobber all registers or temporaries. This something pretty well
> defined and hard to get wrong compared to volatile type.

Right, asm volatile () means that the asm may not be discarded. Very
different from the volatile type qualifier.

> > b) smp_wmb() in some cases worked and in
> > other cases jut optimized away the global reference.
>
> Linux barriers aren't going to force a load to be emitted, if it can be
> optimized away. If it optimized away a store, then I'd like to see a
> test case.

Not sure - I think all barriers clobber the full register and memory set.
So if you access a variable after a barrier it will have to issue a
load.

Are we talking about different things?

> > c) I can go back and
> > break the code again by inserting them and building broken assembler d) I
> > ave been doing hardware and software design since the early 1980;s, I
> > invented SMP affinity scheduling, and yes, I understand barriers and this
> > concept of instruction score-boarding and optimization very well -- its
> > not an excuse for a busted C compiler.
>
> The point is not whether it is possible to work with volatile types, but
> that we tend not to use them in Linux to deal with concurrency.
>
> Also, barriers seem to work fine for everybody else, so I think it is
> likely you either aren't using them correctly, or have other bugs in the
> code.

Well, there is of course the third option, which is what Jeff claims,
that gcc is broken. But in that case we should have more problems
elsewhere too.

2008-08-21 14:30:54

by Paul E. McKenney

[permalink] [raw]
Subject: Re: [ANNOUNCE] mdb: Merkey's Linux Kernel Debugger 2.6.27-rc4 released

On Thu, Aug 21, 2008 at 04:09:59PM +0200, Peter Zijlstra wrote:
> On Thu, 2008-08-21 at 23:37 +1000, Nick Piggin wrote:
> > On Thursday 21 August 2008 22:26, [email protected] wrote:
> >
> > > I used the smp_wmb() functions. I noted a couple of things. a) some of
> > > these macros just emit __asm__ __volatile__ into the code so why not just
> > > say "volatile" to begin with
> >
> > It is not the same as volatile type. What it does is tell the compiler
> > to clobber all registers or temporaries. This something pretty well
> > defined and hard to get wrong compared to volatile type.
>
> Right, asm volatile () means that the asm may not be discarted. Very
> different from the volatile type qualifier.
>
> > > b) smp_wmb() in some cases worked and in
> > > other cases jut optimized away the global reference.
> >
> > Linux barriers aren't going to force a load to be emitted, if it can be
> > optimized away. If it optimized away a store, then I'd like to see a
> > test case.
>
> Not sure - I think all barrier clobber the full register and memory set.
> So if you access a variable after a barrier it will have to issue a
> load.

Here is one example (which might or might not be what Nick had in mind):

extern int v;

void foo(void)
{
        do_something_with(v);
        barrier();
        do_something_else_with(v - v);
}

The second set of loads from v can be optimized away unless v is
declared volatile. In contrast:

void bar(void)
{
        do_something_with(v);
        barrier();
        do_something_else_with(v);
}

Here the compiler must refetch v after the barrier.

> Are we talking about different things?
>
> > > c) I can go back and
> > > break the code again by inserting them and building broken assembler d) I
> > > ave been doing hardware and software design since the early 1980;s, I
> > > invented SMP affinity scheduling, and yes, I understand barriers and this
> > > concept of instruction score-boarding and optimization very well -- its
> > > not an excuse for a busted C compiler.
> >
> > The point is not whether it is possible to work with volatile types, but
> > that we tend not to use them in Linux to deal with concurrency.
> >
> > Also, barriers seem to work fine for everybody else, so I think it is
> > likely you either aren't using them correctly, or have other bugs in the
> > code.
>
> Well, there is of course the third option, which is what Jeff claims,
> that gcc is broken. But in that case we should have more problems
> elsewhere too.

Given the amount of code in gcc, we can reasonably assume that some
aspects of it are broken. Whether that presumed breakage is affecting
Jeff is another question altogether. ;-)

Thanx, Paul

2008-08-21 14:34:45

by Jeffrey V. Merkey

[permalink] [raw]
Subject: Re: [ANNOUNCE] mdb: Merkey's Linux Kernel Debugger 2.6.27-rc4 released

> [email protected] wrote:
>> It's simple to reproduce. Take away the volatile declaration for the
>> rlock_t structure in mdb-ia32.c (rlock_t debug_lock) in all code
>> references and watch the thing lock up in SMP with multiple processors
>> in
>> the debugger each stuck with their own local copy of debug_lock.
>
> I'm having a quick look at mdb-2.6.27-rc4-ia32-08-20-08.patch at the
> moment. Speaking of debug_lock()...:
>
> You use spin_trylock_irqsave((spinlock_t *)&rlock->lock, rlock->flags)
> in there. Minor nit: The pointer type cast is unnecessary.
>
> Major problem: rlock->flags is wrong in this call. Use an on-stack
> flags variable for the initial spin_trylock_irqsave. Ditto in the
> following call of spin_trylock_irqsave.
>
> Next major problem with debug_lock() and debug_unlock(): The reference
> counting doesn't work. You need an atomic_t counter. Have a look at
> the struct kref accessors for example, or even make use of the kref API.
> Or if it isn't feasible to fix with atomic_t, add a second spinlock to
> rlock_t to ensure integrity of .count (and of the .processor if
> necessary).
>
> Furthermore, I have doubts about the loop which is entered by CPU B
> while CPU A holds the rlock. You are fully aware that atomic_read(a) &&
> !atomic_read(b) in its entirety is not atomic, I hope.
>
> All this aside, I don't see *anything* in debug_lock and _unlock which
> would necessitate volatile. Well, volatile might have papered over some
> of these bugs.
>
> PS:
> Try to cut down on #if/#endif clutter. It should be possible to reduce
> them at least in .c files; .h are a different matter. For example,
> #if MDB_DEBUG_DEBUGGER
> DBGPrint("something");
> #endif
> can be trivially reduced to
> dbg_mdb_printk("something");
> where dbg_mdb_printk() is defined as an inline function which does
> nothing when MDB_DEBUG_DEBUGGER is false.
>
> PS2:
> Why are there this many debug printks anyway?
> --
> Stefan Richter
> -=====-==--- =--- =-=-=
> http://arcgraph.de/sr/
>

The code works in debug_lock() provided this memory location is actually
SHARED between the processors. The various race conditions you describe
are valid cases, but the only modifier of .count and .lock is the processor
that obtains the spinlock -- the rest are readers. This code works well,
of course, when this memory location is actually SHARED between the
processors and the read/write operations serialized.

Even in SMP, at various times it is necessary for the processors to
perform serializing operations. You cannot stay in checker-scoreboard
land for everything.

Jeff

2008-08-21 14:40:12

by Jeffrey V. Merkey

[permalink] [raw]
Subject: Re: [ANNOUNCE] mdb: Merkey's Linux Kernel Debugger 2.6.27-rc4 released

> On Thu, Aug 21, 2008 at 04:09:59PM +0200, Peter Zijlstra wrote:
>> On Thu, 2008-08-21 at 23:37 +1000, Nick Piggin wrote:
>> > On Thursday 21 August 2008 22:26, [email protected] wrote:
>> >
>> > > I used the smp_wmb() functions. I noted a couple of things. a)
>> some of
>> > > these macros just emit __asm__ __volatile__ into the code so why not
>> just
>> > > say "volatile" to begin with
>> >
>> > It is not the same as volatile type. What it does is tell the compiler
>> > to clobber all registers or temporaries. This something pretty well
>> > defined and hard to get wrong compared to volatile type.
>>
>> Right, asm volatile () means that the asm may not be discarted. Very
>> different from the volatile type qualifier.
>>
>> > > b) smp_wmb() in some cases worked and in
>> > > other cases jut optimized away the global reference.
>> >
>> > Linux barriers aren't going to force a load to be emitted, if it can
>> be
>> > optimized away. If it optimized away a store, then I'd like to see a
>> > test case.
>>
>> Not sure - I think all barrier clobber the full register and memory set.
>> So if you access a variable after a barrier it will have to issue a
>> load.
>
> Here is one example (which might or might not be what Nick had in mind):
>
> extern int v;
>
> void foo(void)
> {
> do_something_with(v);
> barrier();
> do_something_else_with(v - v);
> }
>
> The second set of loads from v can be optimized away unless v is
> declared volatile. In contrast:
>
> void bar(void)
> {
> do_something_with(v);
> barrier();
> do_something_else_with(v);
> }
>
> Here the compiler must refetch v after the barrier.
>
>> Are we talking about different things?
>>
>> > > c) I can go back and
>> > > break the code again by inserting them and building broken assembler
>> d) I
>> > > ave been doing hardware and software design since the early 1980;s,
>> I
>> > > invented SMP affinity scheduling, and yes, I understand barriers and
>> this
>> > > concept of instruction score-boarding and optimization very well --
>> its
>> > > not an excuse for a busted C compiler.
>> >
>> > The point is not whether it is possible to work with volatile types,
>> but
>> > that we tend not to use them in Linux to deal with concurrency.
>> >
>> > Also, barriers seem to work fine for everybody else, so I think it is
>> > likely you either aren't using them correctly, or have other bugs in
>> the
>> > code.
>>
>> Well, there is of course the third option, which is what Jeff claims,
>> that gcc is broken. But in that case we should have more problems
>> elsewhere too.
>
> Given the amount of code in gcc, we can reasonably assume that some
> aspects of it are broken. Whether that presumed breakage is affecting
> Jeff is another question altogether. ;-)
>
> Thanx, Paul
>

I'll create some cases this evening showing how GCC a) optimizes away
global references, even when you tell it not to (volatile is the only
thing that seems to work) b) takes global defines and turns them into
stack variables (WHAT A GREAT THING TO DO WITH ONLY A 4K STACK) when it
has bloated out on register aliases from all the global data it is turning
into local variables in registers or on the stack.

Jeff

2008-08-21 14:51:42

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [ANNOUNCE] mdb: Merkey's Linux Kernel Debugger 2.6.27-rc4 released

On Thu, 2008-08-21 at 07:30 -0700, Paul E. McKenney wrote:
> On Thu, Aug 21, 2008 at 04:09:59PM +0200, Peter Zijlstra wrote:
> > On Thu, 2008-08-21 at 23:37 +1000, Nick Piggin wrote:
> > > On Thursday 21 August 2008 22:26, [email protected] wrote:
> > >
> > > > I used the smp_wmb() functions. I noted a couple of things. a) some of
> > > > these macros just emit __asm__ __volatile__ into the code so why not just
> > > > say "volatile" to begin with
> > >
> > > It is not the same as volatile type. What it does is tell the compiler
> > > to clobber all registers or temporaries. This something pretty well
> > > defined and hard to get wrong compared to volatile type.
> >
> > Right, asm volatile () means that the asm may not be discarted. Very
> > different from the volatile type qualifier.
> >
> > > > b) smp_wmb() in some cases worked and in
> > > > other cases jut optimized away the global reference.
> > >
> > > Linux barriers aren't going to force a load to be emitted, if it can be
> > > optimized away. If it optimized away a store, then I'd like to see a
> > > test case.
> >
> > Not sure - I think all barrier clobber the full register and memory set.
> > So if you access a variable after a barrier it will have to issue a
> > load.
>
> Here is one example (which might or might not be what Nick had in mind):
>
> extern int v;
>
> void foo(void)
> {
> do_something_with(v);
> barrier();
> do_something_else_with(v - v);
> }
>
> The second set of loads from v can be optimized away unless v is
> declared volatile. In contrast:
>
> void bar(void)
> {
> do_something_with(v);
> barrier();
> do_something_else_with(v);
> }
>
> Here the compiler must refetch v after the barrier.

Ah, right. But in that case:

v-v := tmp1 = v; tmp2 = v; sub tmp1,tmp2;

Which you can of course write out more explicitly in C as well and
insert a barrier between the two reads of v, giving the same effect as
volatile.
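
E.g.:

extern int v;

void foo(void)
{
        int tmp1, tmp2;

        tmp1 = v;
        barrier();      /* v must be reloaded for the second read */
        tmp2 = v;
        do_something_else_with(tmp1 - tmp2);
}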

Still, point taken.

2008-08-21 14:53:47

by Paul E. McKenney

[permalink] [raw]
Subject: Re: [ANNOUNCE] mdb: Merkey's Linux Kernel Debugger 2.6.27-rc4 released

On Thu, Aug 21, 2008 at 02:03:26PM +0200, Peter Zijlstra wrote:
> On Thu, 2008-08-21 at 04:47 -0700, Paul E. McKenney wrote:
> > On Thu, Aug 21, 2008 at 01:02:48PM +0200, Peter Zijlstra wrote:
> > > On Thu, 2008-08-21 at 12:57 +0200, Stefan Richter wrote:
> > > > Peter Zijlstra wrote:
> > > > > On Wed, 2008-08-20 at 20:50 -0600, [email protected] wrote:
> > > > >
> > > > >> volatiles left in the code due to the previously stated
> > > > >> (and still present) severe breakage of the GNU compiler with SMP
> > > > >> shared data. most of the barrier() functions are just plain broken
> > > > >> and do not result in proper compiler behavior in this tree.
> > > > >
> > > > > Can you provide explicit detail?
> > > > >
> > > > > By using barrier() the compiler should clobber all its memory and
> > > > > registers therefore forcing a write/reload of the variable.
> > > >
> > > > I hope Jeff didn't try mere barrier()s only. smp_wmb() and smp_rmb()
> > > > are the more relevant barrier variants for mdb, from what I remember
> > > > when I last looked at it.
> > >
> > > Sure, but volatile isn't a replacement for memory barriers.
> >
> > Let's face it, the C standard does not support concurrency, so we are
> > all in a state of sin in any case, forced to rely on combinations of
> > gcc-specific non-standard language extensions and assembly language.
>
> Hehe, still, a little birdie told me they are working on it and perhaps
> someone with clue could enlighten us on their direction.

Well, I guess you guys will be the judge of that. Or one of the judges,
at least. ;-)

One advantage of the current c++0x approach is that it allows extremely
weak memory barriers to be used in many cases that would require smp_rmb()
in current Linux kernel. If you are crazy enough to want to see a
sneak preview in standardese, try all 10MB of:

http://open-std.org/jtc1/sc22/wg21/docs/papers/2008/n2691.pdf

Section 1.10 (physical page 25, logical page 11) describes the memory model.
Sections 29 and 30 describe the operations (physical page 1155, logical
page 1141). The C and C++ guys got together ahead of time and agreed to
work together towards a compatible solution.

And rcu_dereference() would be implemented in terms of memory_order_consume,
for whatever that is worth.

> Still, I'd like Jeff to show his C, the resulting asm and the intent for
> the volatile and barrier versions of his code (well, little snippets of
> his code obviuosly).
>
> Either he doesn't understand barriers (nothing to be ashamed about), or
> we might have more trouble lurking in the rest of the kernel.

Sounds fair to me!

Thanx, Paul

2008-08-21 15:23:48

by Jeffrey V. Merkey

[permalink] [raw]
Subject: Re: [ANNOUNCE] mdb: Merkey's Linux Kernel Debugger 2.6.27-rc4 released

>
>> Still, I'd like Jeff to show his C, the resulting asm and the intent for
>> the volatile and barrier versions of his code (well, little snippets of
>> his code obviuosly).
>>
>> Either he doesn't understand barriers (nothing to be ashamed about), or
>> we might have more trouble lurking in the rest of the kernel.
>
> Sounds fair to me!
>
> Thanx, Paul
>

I have thoroughly reviewed Linux memory barriers, and the efficacy of the
barriers as defined in Linux is not the issue here. The code segment
discussed sits and spins on a variable waiting for a specific state, and
it's a spinlock which creates a hard barrier, so no amount of barrier
usage should matter here, nor does it. Even if a processor was late in
flushing its writes, sooner or later the spinning processor would see the
change in the shared memory address -- IF IT WERE ACTUALLY A SHARED
REFERENCE. What I am seeing is not an issue of races between processors
on load/store operations, but cases where gcc has chosen to optimize away
global references entirely.
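
In other words (illustrative only -- not the actual mdb code):

int state;                      /* global, shared between CPUs */

void spin_until_set(void)
{
        /*
         * If gcc treats 'state' as private, it may compile this to
         * a single load followed by a register-only loop, i.e. the
         * global becomes a local copy that never observes the other
         * CPU's store.
         */
        while (!state)
                ;
}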

Jeff

2008-08-21 15:24:08

by Stefan Richter

[permalink] [raw]
Subject: Re: [ANNOUNCE] mdb: Merkey's Linux Kernel Debugger 2.6.27-rc4 released

[email protected] wrote:
[Stefan Richter wrote]
>> I'm having a quick look at mdb-2.6.27-rc4-ia32-08-20-08.patch at the
>> moment. Speaking of debug_lock()...:
...
>> Major problem: rlock->flags is wrong in this call. Use an on-stack
>> flags variable for the initial spin_trylock_irqsave. Ditto in the
>> following call of spin_trylock_irqsave.
>>
>> Next major problem with debug_lock() and debug_unlock(): The reference
>> counting doesn't work. You need an atomic_t counter. Have a look at
>> the struct kref accessors for example, or even make use of the kref API.
>> Or if it isn't feasible to fix with atomic_t, add a second spinlock to
>> rlock_t to ensure integrity of .count (and of the .processor if
>> necessary).
...
> The code works in debug lock provided this memory location is actually
> SHARED between the processors. The various race conditions you describe
> are valid cases, but he only modifier of .count and .lock is the processor
> that obtains the spinlock -- the rest are readers. This code works well,
> of course, when this memory location is actually SHARED between the
> processors and the read/write operations serialized.
>
> Even in SMP, at various times it is necessary for the processors to
> perform serializing operations. You cannot in checker-scoreboard land for
> everything.

Regarding rlock->count, I stand corrected: It works correctly if the
debug_lock()...debug_unlock() region can be entered by up to two
contexts and if the second one to enter cannot be preempted by the first
one.

However, since these regions are enclosed in preempt_disable/enable and
since the first one to grab the rlock->lock keeps local interrupts
disabled until debug_unlock() or even longer, there is always only a
single context in the debug_lock()...debug_unlock() region --- which
defeats the whole purpose of the rlock_t. Or what am I missing /now/?

Independently of this, you cannot use rlock->flags like you currently
do. spin_trylock_irqsave would overwrite the flags of CPU A by the
flags of CPU B, making the results of spin_unlock_irqrestore in
debug_unlock unpredictable.
--
Stefan Richter
-=====-==--- =--- =-=-=
http://arcgraph.de/sr/

2008-08-21 15:28:44

by Jeffrey V. Merkey

[permalink] [raw]
Subject: Re: [ANNOUNCE] mdb: Merkey's Linux Kernel Debugger 2.6.27-rc4 released

> [email protected] wrote:
> [Stefan Richter wrote]
>>> I'm having a quick look at mdb-2.6.27-rc4-ia32-08-20-08.patch at the
>>> moment. Speaking of debug_lock()...:
> ...
>>> Major problem: rlock->flags is wrong in this call. Use an on-stack
>>> flags variable for the initial spin_trylock_irqsave. Ditto in the
>>> following call of spin_trylock_irqsave.
>>>
>>> Next major problem with debug_lock() and debug_unlock(): The reference
>>> counting doesn't work. You need an atomic_t counter. Have a look at
>>> the struct kref accessors for example, or even make use of the kref
>>> API.
>>> Or if it isn't feasible to fix with atomic_t, add a second spinlock to
>>> rlock_t to ensure integrity of .count (and of the .processor if
>>> necessary).
> ...
>> The code works in debug lock provided this memory location is actually
>> SHARED between the processors. The various race conditions you describe
>> are valid cases, but he only modifier of .count and .lock is the
>> processor
>> that obtains the spinlock -- the rest are readers. This code works
>> well,
>> of course, when this memory location is actually SHARED between the
>> processors and the read/write operations serialized.
>>
>> Even in SMP, at various times it is necessary for the processors to
>> perform serializing operations. You cannot in checker-scoreboard land
>> for
>> everything.
>
> Regarding rlock->count, I stand corrected: It works correctly if the
> debug_lock()...debug_unlock() region can be entered by up to two
> contexts and if the second one to enter cannot be preempted by the first
> one.
>
> However, since these regions are enclosed in preempt_disable/enable and
> since the first one to grab the rlock->lock keeps local interrupts
> disabled until debug_unlock() or even longer, there is always only a
> single context in the debug_lock()...debug_unlock() region --- which
> defeats the whole purpose of the rlock_t. Or what am I missing /now/?
>
> Independently of this, you cannot use rlock->flags like you currently
> do. spin_trylock_irqsave would overwrite the flags of CPU A by the
> flags of CPU B, making the results of spin_unlock_irqrestore in
> debug_unlock unpredictable.


Hmmm....

Yep, that may be correct. I'll fix that one since it is outside of the
lock. The flags in the processors are all going to be identical at this
point in the code, so it would not cause any problems (local debugger
state would not care); the debugger would work with this, and does. But
I will change that and repost the patch this evening.

You have good eyes, my friend.

Jeff
> --
> Stefan Richter
> -=====-==--- =--- =-=-=
> http://arcgraph.de/sr/
>

2008-08-21 15:59:04

by Linus Torvalds

[permalink] [raw]
Subject: Re: [ANNOUNCE] mdb: Merkey's Linux Kernel Debugger 2.6.27-rc4 released



On Thu, 21 Aug 2008, Paul E. McKenney wrote:
>
> Let's face it, the C standard does not support concurrency, so we are
> all in a state of sin in any case, forced to rely on combinations of
> gcc-specific non-standard language extensions and assembly language.
>
> Could be worse!!!

It _will_ be worse.

The C standard will eventually support concurrency (they are working on
it), and it will almost inevitably be a horrible pile of stinking sh*t,
and we'll continue to use the gcc inline asms instead, but then the gcc
people will ignore our complaints when they break the compiler, and say
that we should use the stinking-pile-of-sh*t ones that are built in.

No, I haven't seen the drafts, and maybe I'm overly pessimistic, but I'm
pretty sure that this is an area where (a) the kernel needs more support
than most normal pthread-like models and (b) any design-by-committee thing
simply won't be very good, because they'll have to try to make everybody
happy.

Oh, well.

Linus

2008-08-21 16:20:25

by Linus Torvalds

[permalink] [raw]
Subject: Re: [ANNOUNCE] mdb: Merkey's Linux Kernel Debugger 2.6.27-rc4 released



On Thu, 21 Aug 2008, Linus Torvalds wrote:
>
> No, I haven't seen the drafts

Ok, I have looked at the draft now, and I don't think I was overly
pessimistic.

If I read it right, all the memory ordering operations are defined for
_single_ objects. So if you want to do the kernel kind of memory ordering
where you specify ordering requirements independently of the actual
accesses (perhaps because the accesses are in some helper function that
doesn't care, but then you want to "finalize" the thing by stating a
sequence point), it seems to be impossible with current drafts.

Oh, well. Nothing lost. I didn't expect the thing to work.

Linus

2008-08-21 16:22:00

by Avi Kivity

[permalink] [raw]
Subject: Re: [ANNOUNCE] mdb: Merkey's Linux Kernel Debugger 2.6.27-rc4 released

Peter Zijlstra wrote:
> Not sure - I think all barrier clobber the full register and memory set.
> So if you access a variable after a barrier it will have to issue a
> load.
>

IIRC a barrier only clobbers memory. gcc must reload a variable from
memory unless it can prove the variable's address has not escaped anywhere.

So:

void f()
{
        int v;

        v = g();
        barrier();
        do_something_with(v);
}

Need not reload v from memory (indeed, v can be in a register for its
entire lifetime), but

void f()
{
        int v;

        v = g();
        h(&v);
        barrier();
        do_something_with(v);
}

Will force v into memory, and reload it after the barrier.

--
error compiling committee.c: too many arguments to function

2008-08-21 16:43:53

by Paul E. McKenney

[permalink] [raw]
Subject: Re: [ANNOUNCE] mdb: Merkey's Linux Kernel Debugger 2.6.27-rc4 released

On Thu, Aug 21, 2008 at 08:57:31AM -0700, Linus Torvalds wrote:
>
>
> On Thu, 21 Aug 2008, Paul E. McKenney wrote:
> >
> > Let's face it, the C standard does not support concurrency, so we are
> > all in a state of sin in any case, forced to rely on combinations of
> > gcc-specific non-standard language extensions and assembly language.
> >
> > Could be worse!!!
>
> It _will_ be worse.

And not only that, it will be worse before the new standard is ratified.
Or to give the full version: "Could be worse, and probably soon will be."

> The C standard will eventually support concurrency (they are working on
> it), and it will almost inevitably be a horrible pile of stinking sh*t,
> and we'll continue to use the gcc inline asms instead, but then the gcc
> people will ignore our complaints when they break the compiler, and say
> that we should use the stinking-pile-of-sh*t ones that are built in.
>
> No, I haven't seen the drafts, and maybe I'm overly pessimistic, but I'm
> pretty sure that this is an area where (a) the kernel needs more support
> than most normal pthread-like models and (b) any design-by-committee thing
> simply won't be very good, because they'll have to try to make everybody
> happy.

Let's see...

1. Yes, the standard is being designed by a committee.

2. No, it does not simply ratify the Linux kernel's memory-barrier
API. Yes, I did propose that. No, the proposal did not go
very far, but yes, it did get people's attention. Yes, there
were a number of people who asserted that Linux's approach was
fundamentally broken/buggy/whatever. Yes, they have changed
their mind, most of them, anyway. No, there is still zero
chance of the standard simply ratifying smp_mb() and friends.

3. Yes, we are working to get things closer to the Linux kernel's
memory-barrier API into the spec. No, they will likely not
be exact. Yes, getting some prominent committee members to
countenance the idea of any sort of raw memory barrier (not
connected directly to a load, store, or atomic operation)
was a long-term project that finally made headway last May.
Again, design by committee.

4. No, the current Linux kernel memory-barrier API is not perfect.
Yes, the proposed standard's preferred memory-barrier approach
will make code easier to read and understand in many cases, and
could potentially allow the compiler to do better optimizations
(which, yes, might or might not be a good thing). No, the
proposed standard's preferred approach does not apply to all the
cases in the Linux kernel. No, I don't know whether or not it
will be worthwhile to introduce the standard's preferred approach
to the Linux kernel (though I suspect that it would be, but have
to wait and see). Either way, yes, the Linux kernel will likely
continue to need to resort to non-standard gcc extensions and
assembly language in at least some cases. As you say, the kernel
is not a garden-variety pthreads application.

But even if the Linux kernel never uses this stuff, user applications
will need it. And there are a number of reasons why gcc extensions and
assembly language are even less welcome in many such applications than
they are in the Linux kernel.

> Oh, well.

Pretty much. After all, if you wake up some morning and find that you
have absolutely no problems, your first action should be to check to
see if you are under six feet of earth.

Thanx, Paul

2008-08-21 16:48:27

by Paul E. McKenney

[permalink] [raw]
Subject: Re: [ANNOUNCE] mdb: Merkey's Linux Kernel Debugger 2.6.27-rc4 released

On Thu, Aug 21, 2008 at 09:18:29AM -0700, Linus Torvalds wrote:
>
>
> On Thu, 21 Aug 2008, Linus Torvalds wrote:
> >
> > No, I haven't seen the drafts
>
> Ok, I have looked at the draft now, and I don't think I was overly
> pessimistic.
>
> If I read it right, all the memory ordering operations are defined for
> _single_ objects. So if you want to do the kernel kind of memory ordering
> where you specify ordering requirements independently of the actual
> accesses (perhaps because the accesses are in some helper function that
> doesn't care, but then you want to "finalize" the thing by stating a
> sequence point), it seems to be impossible with current drafts.

You are looking for atomic_fence() on page 1168 (1154 virtual) of the
most recent draft. The current semantics are not correct, but this is
being worked. And yes, it does currently have a variable associated with
it, but it acts as a bare fence nevertheless. There is a proposal to
drop the variable. As you said in a previous email, design by committee.

> Oh, well. Nothing lost. I didn't expect the thing to work.

;-)

Thanx, Paul

2008-08-21 21:06:52

by Jeremy Fitzhardinge

[permalink] [raw]
Subject: Re: [ANNOUNCE] mdb: Merkey's Linux Kernel Debugger 2.6.27-rc4 released

Nick Piggin wrote:
> On Thursday 21 August 2008 22:26, [email protected] wrote:
>
>
>> I used the smp_wmb() functions. I noted a couple of things. a) some of
>> these macros just emit __asm__ __volatile__ into the code so why not just
>> say "volatile" to begin with
>>
>
> It is not the same as volatile type. What it does is tell the compiler
> to clobber all registers or temporaries. This something pretty well
> defined and hard to get wrong compared to volatile type.
>

No, that's not what "asm volatile" means. Its *only* meaning is "emit
this, even if it doesn't look like it has side-effects and its results
are not used". An asm() with no outputs is "volatile" by default, which
makes most of the uses of "asm volatile" in the kernel redundant. "asm
volatile" also has no effect on the ordering of the asm with respect to
other code; you must use constraints to do that.

An asm with a "memory" clobber is sufficient to make sure that gcc
doesn't cache memory values in registers; perhaps that's what you mean.
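
E.g. (a sketch):

static inline void example(void)
{
        /* volatile: must be emitted even though the result looks
           unused; says nothing about ordering or cached values */
        asm volatile("nop");

        /* "memory" clobber: gcc must additionally assume memory has
           changed, so cached values are spilled and reloaded */
        asm volatile("" : : : "memory");
}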

J

2008-08-21 21:21:46

by Jeremy Fitzhardinge

[permalink] [raw]
Subject: Re: [ANNOUNCE] mdb: Merkey's Linux Kernel Debugger 2.6.27-rc4 released

Linus Torvalds wrote:
> On Thu, 21 Aug 2008, Jeremy Fitzhardinge wrote:
>
>> "asm volatile" also has no effect on the ordering of the asm with
>> respect to other code; you must use constraints to do that.
>>
>
> Well, the gcc docs at _used_ to say that volatile asms are not moved
> around "significantly". They've removed that.

Yep, they seem to do that kind of thing specifically to annoy you. But
it does mean we can't rely on it at all any more.

J

2008-08-21 21:25:04

by Linus Torvalds

[permalink] [raw]
Subject: Re: [ANNOUNCE] mdb: Merkey's Linux Kernel Debugger 2.6.27-rc4 released



On Thu, 21 Aug 2008, Jeremy Fitzhardinge wrote:
>
> "asm volatile" also has no effect on the ordering of the asm with
> respect to other code; you must use constraints to do that.

Well, the gcc docs _used_ to say that volatile asms are not moved
around "significantly". They've removed that.

Linus

2008-08-22 01:38:16

by Nick Piggin

[permalink] [raw]
Subject: Re: [ANNOUNCE] mdb: Merkey's Linux Kernel Debugger 2.6.27-rc4 released

On Friday 22 August 2008 07:06, Jeremy Fitzhardinge wrote:
> Nick Piggin wrote:
> > On Thursday 21 August 2008 22:26, [email protected] wrote:
> >> I used the smp_wmb() functions. I noted a couple of things. a) some of
> >> these macros just emit __asm__ __volatile__ into the code so why not
> >> just say "volatile" to begin with
> >
> > It is not the same as volatile type. What it does is tell the compiler
> > to clobber all registers or temporaries. This something pretty well
> > defined and hard to get wrong compared to volatile type.
>
> No, that's not what "asm volatile" means. Its *only* meaning is "emit
> this, even if it doesn't look like it has side-effects and its results
> are not used". An asm() with no outputs is "volatile" by default, which
> makes most of the uses of "asm volatile" in the kernel redundant. "asm
> volatile" also has no effect on the ordering of the asm with respect to
> other code; you must use constraints to do that.
>
> An asm with a "memory" clobber is sufficient to make sure that gcc
> doesn't cache memory values in registers; perhaps that's what you mean.

That is what I meant, yes.

2008-08-22 01:40:40

by Nick Piggin

[permalink] [raw]
Subject: Re: [ANNOUNCE] mdb: Merkey's Linux Kernel Debugger 2.6.27-rc4 released

On Friday 22 August 2008 00:09, Stefan Richter wrote:
> Nick Piggin wrote:
> > On Thursday 21 August 2008 22:26, [email protected] wrote:
> >> It's simple to reproduce. Take away the volatile declaration for the
> >> rlock_t structure in mdb-ia32.c (rlock_t debug_lock) in all code
> >> references and watch the thing lock up in SMP with multiple processors
> >> in the debugger each stuck with their own local copy of debug_lock.
> >
> > You should disable preempt before getting the processor id. Can't see any
> > other possible bugs, but you should be able to see from the disassembly
> > pretty easily.
>
> debug_lock() is AFAICS only called from contexts which have preemption
> disabled. Last time around I recommended that Jeff document this
> requirement on the calling context.

I'm not talking about where debug_lock gets called, I'm talking about
where the processor id is derived that eventually filters down to
debug_lock.
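I.e. something like this (a sketch only; the debug_lock()/debug_unlock()
signatures here are illustrative, not the mdb ones):

    preempt_disable();            /* no migration from here on */
    cpu = smp_processor_id();     /* only stable while preemption is off */
    debug_lock(cpu);
    /* ... debugger work ... */
    debug_unlock(cpu);
    preempt_enable();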

> But even though preemption is disabled, debug_lock() is still incorrect
> as I mentioned in my other post a minute ago. It corrupts its .flags
> and .count members. (Or maybe it coincidentally doesn't as long as
> volatile is around.)

I don't think so. And flags should only be restored by the processor
that saved it because the spinlock should disable preemption, right?

2008-08-22 06:36:38

by Stefan Richter

[permalink] [raw]
Subject: Re: [ANNOUNCE] mdb: Merkey's Linux Kernel Debugger 2.6.27-rc4 released

On 22 Aug, Nick Piggin wrote:
> On Friday 22 August 2008 00:09, Stefan Richter wrote:
>> Nick Piggin wrote:
>> > On Thursday 21 August 2008 22:26, [email protected] wrote:
>> >> It's simple to reproduce. Take away the volatile declaration for the
>> >> rlock_t structure in mdb-ia32.c (rlock_t debug_lock) in all code
>> >> references and watch the thing lock up in SMP with multiple processors
>> >> in the debugger each stuck with their own local copy of debug_lock.
>> >
>> > You should disable preempt before getting the processor id. Can't see any
>> > other possible bugs, but you should be able to see from the disassembly
>> > pretty easily.
>>
>> debug_lock() is AFAICS only called from contexts which have preemption
>> disabled. Last time around I recommended that Jeff document this
>> requirement on the calling context.
>
> I'm not talking about where debug_lock gets called, I'm talking about
> where the processor id is derived that eventually filters down to
> debug_lock.

You are right, I replied too fast. debug_unlock() retrieves the
processor itself, but debug_lock() does not.

>> But even though preemption is disabled, debug_lock() is still incorrect
>> as I mentioned in my other post a minute ago. It corrupts its .flags
>> and .count members. (Or maybe it coincidentally doesn't as long as
>> volatile is around.)
>
> I don't think so. And flags should only be restored by the processor
> that saved it because the spinlock should disable preemption, right?

OK; the .count seems alright due to restrictions of the calling
contexts. About .flags: Jeff, can the following happen?

- Context A on CPU A has interrupts enabled. Enters debug_lock(),
thus disables its interrupts. (Saves its flags in rlock->flags with
the plan to enable interrupts later when leaving debug_unlock()
provided it does so as last holder.)

- Context B on CPU B happens to have interrupts disabled. Enters
debug_lock(), overwrites rlock->flags with its different value.
(Spins on the rlock which is held by CPU A.)

- Context A on CPU A leaves debug_unlock. Doesn't re-enable its
interrupts as planned, since rlock->flags is the one from CPU B.
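
In code terms, the problem is roughly this (names illustrative, not the
actual mdb source):

    void debug_lock(rlock_t *rlock)
    {
        /* one shared slot for everyone's flags: CPU B overwrites
           CPU A's saved value while it spins for the lock */
        local_irq_save(rlock->flags);
        /* ... spin until the lock is acquired ... */
    }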
--
Stefan Richter
-=====-==--- =--- =-==-
http://arcgraph.de/sr/

2008-08-22 12:20:05

by Jeffrey V. Merkey

[permalink] [raw]
Subject: Re: [ANNOUNCE] mdb: Merkey's Linux Kernel Debugger 2.6.27-rc4 released

> On 22 Aug, Nick Piggin wrote:
>> On Friday 22 August 2008 00:09, Stefan Richter wrote:
>>> Nick Piggin wrote:
>>> > On Thursday 21 August 2008 22:26, [email protected] wrote:
>>> >> It's simple to reproduce. Take away the volatile declaration for the
>>> >> rlock_t structure in mdb-ia32.c (rlock_t debug_lock) in all code
>>> >> references and watch the thing lock up in SMP with multiple processors
>>> >> in the debugger each stuck with their own local copy of debug_lock.
>>> >
>>> > You should disable preempt before getting the processor id. Can't see any
>>> > other possible bugs, but you should be able to see from the disassembly
>>> > pretty easily.
>>>
>>> debug_lock() is AFAICS only called from contexts which have preemption
>>> disabled. Last time around I recommended that Jeff document this
>>> requirement on the calling context.
>>
>> I'm not talking about where debug_lock gets called, I'm talking about
>> where the processor id is derived that eventually filters down to
>> debug_lock.
>
> You are right, I replied too fast. debug_unlock() retrieves the
> processor itself, but debug_lock() does not.
>
>>> But even though preemption is disabled, debug_lock() is still incorrect
>>> as I mentioned in my other post a minute ago. It corrupts its .flags
>>> and .count members. (Or maybe it coincidentally doesn't as long as
>>> volatile is around.)
>>
>> I don't think so. And flags should only be restored by the processor
>> that saved it because the spinlock should disable preemption, right?
>
> OK; the .count seems alright due to restrictions of the calling
> contexts. About .flags: Jeff, can the following happen?
>
> - Context A on CPU A has interrupts enabled. Enters debug_lock(),
> thus disables its interrupts. (Saves its flags in rlock->flags with
> the plan to enable interrupts later when leaving debug_unlock()
> provided it does so as last holder.)

ints should already be off here.
>
> - Context B on CPU B happens to have interrupts disabled. Enters
> debug_lock(), overwrites rlock->flags with its different value.
> (Spins on the rlock which is held by CPU A.)

ints are disabled on both here (should be).

>
> - Context A on CPU A leaves debug_unlock. Doesn't re-enable its
> interrupts as planned, since rlock->flags is the one from CPU B.
> --

It should be benign since both procs have ints off here.

Stefan,

The flags handling needs fixing, you are right. However, since this case
only occurs when two processors both have an int1 exception (or some
exception other than an NMI) and ints are disabled here anyway, it's
benign. That being said, it needs to be perfect, so I moved the flags to
a stack variable.
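
Roughly like this (a sketch only; rlock_try_acquire()/rlock_release()
are stand-ins for the real rlock code):

    unsigned long debug_lock(unsigned long processor)
    {
        unsigned long flags;          /* per-caller, on the stack */

        local_irq_save(flags);        /* instead of rlock->flags */
        while (!rlock_try_acquire(&debug_rlock, processor))
            cpu_relax();
        return flags;                 /* caller hands this back below */
    }

    void debug_unlock(unsigned long processor, unsigned long flags)
    {
        rlock_release(&debug_rlock, processor);
        local_irq_restore(flags);     /* this CPU's own saved flags */
    }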

I will post a new patch series correcting the flags issue along with code
fragments showing the gcc breakage with debug lock later tonight. I have
a lot to get done at omega8 on appliance releases this week, and being
48 years old this year, I am slower than I used to be.

:-)

Jeff


2008-08-22 12:38:19

by Stefan Richter

[permalink] [raw]
Subject: Re: [ANNOUNCE] mdb: Merkey's Linux Kernel Debugger 2.6.27-rc4 released

[email protected] wrote:
[rlock->flags in debug_lock(), debug_unlock()]
> since this case only
> occurs when two processors both have an int1 exception (or some exception
> other than an NMI) and ints are disabled here anyway, it's benign.

Ah, thanks. I missed the bigger picture.
--
Stefan Richter
-=====-==--- =--- =-==-
http://arcgraph.de/sr/

2008-08-24 04:52:17

by Jeffrey V. Merkey

[permalink] [raw]
Subject: Re: [ANNOUNCE] mdb: Merkey's Linux Kernel Debugger 2.6.27-rc4 released


Results from Analysis of GCC volatile/memory barriers

Use of volatile will produce the intended results in those files which
have shared data elements, but it can also cause global data which is
not referenced outside of a file, and which has not itself been declared
volatile, to be treated as static and optimized into local variables in
some cases.

If volatile is avoided entirely, the compiler appears to make correct
assumptions about which references are in fact global memory references.
My conclusion is that the code generation of gcc appears correct and
in fact does a better job than Microsoft's implementation of shared
data management on SMP systems.

If you use volatile and optimization at the same time, the compiler will
take you at your word and may optimize non-volatile global references
into local variables. This is in fact better than MS cl, which will
ALWAYS optimize global data into locals if volatile is not used with SMP
data within a single file.

If you choose to use volatile, then you had better use it on every variable
you need shared between processors -- or just leave it out entirely -- and
gcc does appear to figure out the code references properly (though some of
them are quite odd).

While this may be counter-intuitive, it makes sense. When you are using
volatile, you are telling the compiler that anything not declared as
volatile within a given file is fair game for local optimization if you
turn on optimization at the same time.
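
A minimal standalone illustration of that effect (not code from mdb):

    volatile unsigned long hold;   /* always re-read from memory */
    unsigned long state;           /* no volatile, and nothing else
                                      tells gcc it can change */

    void wait_loop(void)
    {
        while (hold)               /* a fresh load on every pass */
            ;
        while (state != 1)         /* with -O the load may be hoisted,
                                      spinning on a register copy */
            ;
    }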


Analysis

Code generation for two atomic_t variables, one an array element and the
other standalone -- macros in the kernel includes and their interactions
with the compiler may be the basis of some of these cases.

atomic_inc(&debuggerActive);
atomic_inc(&debuggerProcessors[processor]);

55a0: f0 ff 05 00 00 00 00 lock incl 0x0
55a7: 8d 2c 8d 00 00 00 00 lea 0x0(,%ecx,4),%ebp
55ae: 8d 85 00 00 00 00 lea 0x0(%ebp),%eax
55b4: 89 04 24 mov %eax,(%esp)
55b7: f0 ff 85 00 00 00 00 lock incl 0x0(%ebp)

Although the emitted assembly is essentially correct, it's odd: two
identical data types, one emitted as a global fixup and the other as
a relative fixup indirected from the stack frame. This works since the
fixup (substituted for the 0x0) patched in by the loader is a negative
offset relative to the entire 32-bit address space, i.e. lock incl
[ebp-f800XXXX], but it's still an odd way to treat an atomic variable.
I would think these would result in an absolute address fixup record
treated some other way than as data referenced from the stack frame.

Code section with mixed volatile declarations

volatile unsigned long ProcessorHold[MAX_PROCESSORS];
unsigned long ProcessorState[MAX_PROCESSORS];

case 2: /* nmi */
if (ProcessorHold[processor]) /* hold processor */
{
ProcessorHold[processor] = 0;
ProcessorState[processor] = PROCESSOR_SUSPEND;

/* processor suspend loop */
atomic_inc(&nmiProcessors[processor]);
while ((ProcessorState[processor] != PROCESSOR_RESUME) &&
(ProcessorState[processor] != PROCESSOR_SWITCH))
{
if ((ProcessorState[processor] == PROCESSOR_RESUME) ||
(ProcessorState[processor] == PROCESSOR_SWITCH))
break;

touch_nmi_watchdog();
cpu_relax();
}
atomic_dec(&nmiProcessors[processor]);
56ec: 83 3c b5 00 00 00 00 cmpl $0x0,0x0(,%esi,4)
56f3: 00
56f4: 74 1b je 5711 <debugger_entry+0x17e>
56f6: c7 04 b5 00 00 00 00 movl $0x0,0x0(,%esi,4)
56fd: 00 00 00 00
5701: f0 ff 85 00 00 00 00 lock incl 0x0(%ebp)
5708: e8 fc ff ff ff call 5709 <debugger_entry+0x176>
570d: f3 90 pause
570f: eb f7 jmp 5708 <debugger_entry+0x175>

// THIS APPEARS BROKEN - THE COMPILER IS TREATING A GLOBAL ARRAY
// AS LOCAL DATA
5711: 89 f1 mov %esi,%ecx
5713: 89 da mov %ebx,%edx
5715: b8 02 00 00 00 mov $0x2,%eax
571a: eb 06 jmp 5722 <debugger_entry+0x18f>
571c: 89 f1 mov %esi,%ecx
571e: 89 da mov %ebx,%edx
5720: 89 f8 mov %edi,%eax
5722: e8 fc ff ff ff call 5723 <debugger_entry+0x190>
5727: 83 3c b5 00 00 00 00 cmpl $0x0,0x0(,%esi,4)
572e: 00
572f: 75 c5 jne 56f6 <debugger_entry+0x163>
5731: e8 fc ff ff ff call 5732 <debugger_entry+0x19f>
5736: e8 fc ff ff ff call 5737 <debugger_entry+0x1a4>
573b: 85 c0 test %eax,%eax
573d: 74 0f je 574e <debugger_entry+0x1bb>
573f: 89 f0 mov %esi,%eax
5741: c1 e0 07 shl $0x7,%eax
5744: 05 00 00 00 00 add $0x0,%eax
5749: e8 fc ff ff ff call 574a <debugger_entry+0x1b7>
574e: c7 04 b5 00 00 00 00 movl $0x0,0x0(,%esi,4)
5755: 00 00 00 00
5759: 8b 04 24 mov (%esp),%eax
575c: c7 04 b5 00 00 00 00 movl $0x1,0x0(,%esi,4)
5763: 01 00 00 00
5767: f0 ff 08 lock decl (%eax)



Code section without ANY volatile declarations (CODE GENERATION CORRECT)

unsigned long ProcessorHold[MAX_PROCESSORS];
unsigned long ProcessorState[MAX_PROCESSORS];

case 2: /* nmi */
if (ProcessorHold[processor]) /* hold processor */
{
ProcessorHold[processor] = 0;
ProcessorState[processor] = PROCESSOR_SUSPEND;

/* processor suspend loop */
atomic_inc(&nmiProcessors[processor]);
while ((ProcessorState[processor] != PROCESSOR_RESUME) &&
(ProcessorState[processor] != PROCESSOR_SWITCH))
{
if ((ProcessorState[processor] == PROCESSOR_RESUME) ||
(ProcessorState[processor] == PROCESSOR_SWITCH))
break;

touch_nmi_watchdog();
cpu_relax();
}
atomic_dec(&nmiProcessors[processor]);

Code output from section without ANY volatile declarations

56f2: 83 3c bd 00 00 00 00 cmpl $0x0,0x0(,%edi,4)
56f9: 00
56fa: 74 5f je 575b <debugger_entry+0x1c8>
56fc: c7 04 bd 00 00 00 00 movl $0x0,0x0(,%edi,4)
5703: 00 00 00 00
5707: 8d b5 00 00 00 00 lea 0x0(%ebp),%esi
570d: c7 04 bd 00 00 00 00 movl $0x2,0x0(,%edi,4)
5714: 02 00 00 00
5718: f0 ff 85 00 00 00 00 lock incl 0x0(%ebp)
571f: eb 11 jmp 5732 <debugger_entry+0x19f>
5721: 83 f8 03 cmp $0x3,%eax
5724: 74 1d je 5743 <debugger_entry+0x1b0>
5726: 83 f8 07 cmp $0x7,%eax
5729: 74 18 je 5743 <debugger_entry+0x1b0>
572b: e8 fc ff ff ff call 572c <debugger_entry+0x199>
5730: f3 90 pause
5732: 8b 04 bd 00 00 00 00 mov 0x0(,%edi,4),%eax
5739: 83 f8 03 cmp $0x3,%eax
573c: 74 05 je 5743 <debugger_entry+0x1b0>
573e: 83 f8 07 cmp $0x7,%eax
5741: 75 de jne 5721 <debugger_entry+0x18e>
5743: f0 ff 0e lock decl (%esi)
5746: 83 3c bd 00 00 00 00 cmpl $0x7,0x0(,%edi,4)



Code from section with volatile declarations (CODE GENERATION CORRECT)

volatile unsigned long ProcessorHold[MAX_PROCESSORS];
volatile unsigned long ProcessorState[MAX_PROCESSORS];

case 2: /* nmi */
if (ProcessorHold[processor]) /* hold processor */
{
ProcessorHold[processor] = 0;
ProcessorState[processor] = PROCESSOR_SUSPEND;

/* processor suspend loop */
atomic_inc(&nmiProcessors[processor]);
while ((ProcessorState[processor] != PROCESSOR_RESUME) &&
(ProcessorState[processor] != PROCESSOR_SWITCH))
{
if ((ProcessorState[processor] == PROCESSOR_RESUME) ||
(ProcessorState[processor] == PROCESSOR_SWITCH))
break;

touch_nmi_watchdog();
cpu_relax();
}
atomic_dec(&nmiProcessors[processor]);


Code Output from section with volatile declarations

5896: 8b 04 9d 00 00 00 00 mov 0x0(,%ebx,4),%eax
589d: 85 c0 test %eax,%eax
589f: 74 73 je 5914 <debugger_entry+0x1f7>
58a1: c7 04 9d 00 00 00 00 movl $0x0,0x0(,%ebx,4)
58a8: 00 00 00 00
58ac: 8d bd 00 00 00 00 lea 0x0(%ebp),%edi
58b2: c7 04 9d 00 00 00 00 movl $0x2,0x0(,%ebx,4)
58b9: 02 00 00 00
58bd: f0 ff 85 00 00 00 00 lock incl 0x0(%ebp)
58c4: eb 1f jmp 58e5 <debugger_entry+0x1c8>
58c6: 8b 04 9d 00 00 00 00 mov 0x0(,%ebx,4),%eax
58cd: 83 f8 03 cmp $0x3,%eax
58d0: 74 2b je 58fd <debugger_entry+0x1e0>
58d2: 8b 04 9d 00 00 00 00 mov 0x0(,%ebx,4),%eax
58d9: 83 f8 07 cmp $0x7,%eax
58dc: 74 1f je 58fd <debugger_entry+0x1e0>
58de: e8 fc ff ff ff call 58df <debugger_entry+0x1c2>
58e3: f3 90 pause
58e5: 8b 04 9d 00 00 00 00 mov 0x0(,%ebx,4),%eax
58ec: 83 f8 03 cmp $0x3,%eax
58ef: 74 0c je 58fd <debugger_entry+0x1e0>
58f1: 8b 04 9d 00 00 00 00 mov 0x0(,%ebx,4),%eax
58f8: 83 f8 07 cmp $0x7,%eax
58fb: 75 c9 jne 58c6 <debugger_entry+0x1a9>
58fd: f0 ff 0f lock decl (%edi)
5900: 8b 04 9d 00 00 00 00 mov 0x0(,%ebx,4),%eax



Code from sections without volatile declaration using wmb()/rmb()
(CODE GENERATION CORRECT)

for (i=0; i < MAX_PROCESSORS; i++)
{
if (ProcessorState[i] != PROCESSOR_HOLD)
{
wmb();
ProcessorState[i] = PROCESSOR_RESUME;
}
}

unsigned long ProcessorHold[MAX_PROCESSORS];
unsigned long ProcessorState[MAX_PROCESSORS];

case 2: /* nmi */
if (ProcessorHold[processor]) /* hold processor */
{
ProcessorHold[processor] = 0;
ProcessorState[processor] = PROCESSOR_SUSPEND;

/* processor suspend loop */
atomic_inc(&nmiProcessors[processor]);
while ((ProcessorState[processor] != PROCESSOR_RESUME) &&
(ProcessorState[processor] != PROCESSOR_SWITCH))
{
rmb();
if ((ProcessorState[processor] == PROCESSOR_RESUME) ||
(ProcessorState[processor] == PROCESSOR_SWITCH))
break;

touch_nmi_watchdog();
cpu_relax();
}
atomic_dec(&nmiProcessors[processor]);

Code output from sections without volatile declaration using wmb()/rmb()

56fa: 83 3c b5 00 00 00 00 cmpl $0x0,0x0(,%esi,4)
5701: 00
5702: 74 6b je 576f <debugger_entry+0x1d7>
5704: c7 04 b5 00 00 00 00 movl $0x0,0x0(,%esi,4)
570b: 00 00 00 00
570f: 8d bd 00 00 00 00 lea 0x0(%ebp),%edi
5715: c7 04 b5 00 00 00 00 movl $0x2,0x0(,%esi,4)
571c: 02 00 00 00
5720: f0 ff 85 00 00 00 00 lock incl 0x0(%ebp)
5727: eb 1d jmp 5746 <debugger_entry+0x1ae>
5729: f0 83 04 24 00 lock addl $0x0,(%esp)
572e: 8b 04 b5 00 00 00 00 mov 0x0(,%esi,4),%eax
5735: 83 f8 03 cmp $0x3,%eax
5738: 74 1d je 5757 <debugger_entry+0x1bf>
573a: 83 f8 07 cmp $0x7,%eax
573d: 74 18 je 5757 <debugger_entry+0x1bf>
573f: e8 fc ff ff ff call 5740 <debugger_entry+0x1a8>
5744: f3 90 pause
5746: 8b 04 b5 00 00 00 00 mov 0x0(,%esi,4),%eax
574d: 83 f8 03 cmp $0x3,%eax
5750: 74 05 je 5757 <debugger_entry+0x1bf>
5752: 83 f8 07 cmp $0x7,%eax
5755: 75 d2 jne 5729 <debugger_entry+0x191>
5757: f0 ff 0f lock decl (%edi)
575a: 83 3c b5 00 00 00 00 cmpl $0x7,0x0(,%esi,4)
5761: 07
5762: 75 21 jne 5785 <debugger_entry+0x1ed>
5764: 89 f1 mov %esi,%ecx

000001e1 <FreeProcessorsExclSelf>:
1e1: 31 c0 xor %eax,%eax
1e3: 83 3c 85 00 00 00 00 cmpl $0x8,0x0(,%eax,4)
1ea: 08
1eb: 74 10 je 1fd <FreeProcessorsExclSelf+0x1c>
1ed: f0 83 04 24 00 lock addl $0x0,(%esp)
1f2: c7 04 85 00 00 00 00 movl $0x3,0x0(,%eax,4)
1f9: 03 00 00 00
1fd: 40 inc %eax
1fe: 83 f8 08 cmp $0x8,%eax
201: 75 e0 jne 1e3 <FreeProcessorsExclSelf+0x2>
203: c3 ret



Code from sections without volatile declaration using barrier()
(CODE GENERATION CORRECT)

for (i=0; i < MAX_PROCESSORS; i++)
{
if (ProcessorState[i] != PROCESSOR_HOLD)
{
barrier();
ProcessorState[i] = PROCESSOR_RESUME;
}
}

unsigned long ProcessorHold[MAX_PROCESSORS];
unsigned long ProcessorState[MAX_PROCESSORS];

case 2: /* nmi */
if (ProcessorHold[processor]) /* hold processor */
{
ProcessorHold[processor] = 0;
ProcessorState[processor] = PROCESSOR_SUSPEND;

/* processor suspend loop */
atomic_inc(&nmiProcessors[processor]);
while ((ProcessorState[processor] != PROCESSOR_RESUME) &&
(ProcessorState[processor] != PROCESSOR_SWITCH))
{
barrier();
if ((ProcessorState[processor] == PROCESSOR_RESUME) ||
(ProcessorState[processor] == PROCESSOR_SWITCH))
break;

touch_nmi_watchdog();
cpu_relax();
}
atomic_dec(&nmiProcessors[processor]);

Code output from sections without volatile declaration using barrier()

56f5: 83 3c b5 00 00 00 00 cmpl $0x0,0x0(,%esi,4)
56fc: 00
56fd: 74 66 je 5765 <debugger_entry+0x1d2>
56ff: c7 04 b5 00 00 00 00 movl $0x0,0x0(,%esi,4)
5706: 00 00 00 00
570a: 8d bd 00 00 00 00 lea 0x0(%ebp),%edi
5710: c7 04 b5 00 00 00 00 movl $0x2,0x0(,%esi,4)
5717: 02 00 00 00
571b: f0 ff 85 00 00 00 00 lock incl 0x0(%ebp)
5722: eb 18 jmp 573c <debugger_entry+0x1a9>
5724: 8b 04 b5 00 00 00 00 mov 0x0(,%esi,4),%eax
572b: 83 f8 03 cmp $0x3,%eax
572e: 74 1d je 574d <debugger_entry+0x1ba>
5730: 83 f8 07 cmp $0x7,%eax
5733: 74 18 je 574d <debugger_entry+0x1ba>
5735: e8 fc ff ff ff call 5736 <debugger_entry+0x1a3>
573a: f3 90 pause
573c: 8b 04 b5 00 00 00 00 mov 0x0(,%esi,4),%eax
5743: 83 f8 03 cmp $0x3,%eax
5746: 74 05 je 574d <debugger_entry+0x1ba>
5748: 83 f8 07 cmp $0x7,%eax
574b: 75 d7 jne 5724 <debugger_entry+0x191>
574d: f0 ff 0f lock decl (%edi)
5750: 83 3c b5 00 00 00 00 cmpl $0x7,0x0(,%esi,4)
5757: 07

000001e1 <FreeProcessorsExclSelf>:
1e1: 31 c0 xor %eax,%eax
1e3: 83 3c 85 00 00 00 00 cmpl $0x8,0x0(,%eax,4)
1ea: 08
1eb: 74 0b je 1f8 <FreeProcessorsExclSelf+0x17>
1ed: c7 04 85 00 00 00 00 movl $0x3,0x0(,%eax,4)
1f4: 03 00 00 00
1f8: 40 inc %eax
1f9: 83 f8 08 cmp $0x8,%eax
1fc: 75 e5 jne 1e3 <FreeProcessorsExclSelf+0x2>
1fe: c3 ret


Jeff

2008-08-26 08:27:17

by Andi Kleen

[permalink] [raw]
Subject: Re: [ANNOUNCE] mdb: Merkey's Linux Kernel Debugger 2.6.27-rc4 released

[email protected] writes:

> Results from Analysis of GCC volatile/memory barriers

So the result was that barrier()s actually worked? Great.

BTW, since you did this work, it might be worthwhile to go one step
further: gcc has a test suite where they run small programs and have a
simple facility to scan the resulting assembler code for specific
patterns. If you could bring your examples into that format (i.e.
standalone compiled, with a simple yes/no pattern to check) and have
them included there, that might make sure this always keeps working in
the future.
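
Something along these lines, perhaps (a rough sketch of the dejagnu
format; the final scan pattern is only a placeholder):

    /* { dg-do compile } */
    /* { dg-options "-O2" } */

    unsigned long state;

    void wait_for_resume(void)
    {
        while (state != 1)
            __asm__ __volatile__("" : : : "memory");  /* barrier() */
    }

    /* the load of 'state' must stay inside the loop */
    /* { dg-final { scan-assembler "state" } } */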

-Andi

2008-08-27 02:16:50

by Jeffrey V. Merkey

[permalink] [raw]
Subject: Re: [ANNOUNCE] mdb: Merkey's Linux Kernel Debugger 2.6.27-rc4 released

> [email protected] writes:
>
>> Results from Analysis of GCC volatile/memory barriers
>
> So the result was that barrier()s actually worked? Great.
>

yes.

> BTW if you did this work it might be worthwhile to go one further
> step: gcc has a test suite where they run small programs and have a
> simple facility to scan the resulting assembler code for specific
> patterns. If you could bring your examples into that format (as in
> being stand alone compiled and have a simple "yes/no" pattern to
> check) and let it be included there that might make sure this always
> keeps working in the future.

Yep.

>
> -Andi
>

ok.

Jeff