2009-06-02 12:50:21

by Harald Welte

Subject: Re: LOCK prefix on uni processor has its use

On Wed, May 27, 2009 at 08:08:27PM +0200, Andi Kleen wrote:
> Harald Welte <[email protected]> writes:
> > * All X86 instructions except rep-strings are atomic wrt interrupts.
> > * The lock prefix has uses on a UP processor: It keeps DMA devices from
> > interfering with a read-modify-write sequence
>
> In theory yes, but not in Linux -- normal drivers simply don't use LOCK in
> any way on a UP kernel.

Well, they might have inadvertently used LOCK as part of regular spinlocks,
until LOCK_PREFIX was removed, right?

> > Now the question is: Is this a valid operation of a driver? Should the driver
> > do such things, or is such a driver broken?
>
> The driver is broken because if it relies on this it will not work on a UP kernel.
> Also it's not portable and in general a bad idea.

I agree. I was not referring to any real/known driver. I was just trying to
figure out what kind of problem the VIA/Centaur CPU guys tried to describe when
indicating that the LOCK prefix should be used on UP to avoid DMA interfering
with read-modify-write CPU instructions.

--
- Harald Welte <[email protected]> http://linux.via.com.tw/
============================================================================
VIA Free and Open Source Software Liaison


2009-06-02 12:56:45

by Andi Kleen

Subject: Re: LOCK prefix on uni processor has its use

On Tue, Jun 02, 2009 at 02:48:54PM +0200, Harald Welte wrote:
> On Wed, May 27, 2009 at 08:08:27PM +0200, Andi Kleen wrote:
> > Harald Welte <[email protected]> writes:
> > > * All X86 instructions except rep-strings are atomic wrt interrupts.
> > > * The lock prefix has uses on a UP processor: It keeps DMA devices from
> > > interfering with a read-modify-write sequence
> >
> > In theory yes, but not in Linux -- normal drivers simply don't use LOCK in
> > any way on a UP kernel.
>
> well, they might have inadvertently used LOCK as part of regular spinlocks,
> until LOCK_PREFIX was removed, right?

LOCK_PREFIX was always defined away on UP kernels. That dates back
to the initial Linux 2.0 SMP implementation.
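
The compile-time case can be illustrated with a minimal sketch. It assumes a hypothetical DEMO_SMP switch standing in for the kernel's CONFIG_SMP, and a much simplified LOCK_PREFIX (the real SMP definition may carry extra bookkeeping). The helper template_is_locked() is made up for this illustration; it only inspects the template string and executes no assembly.

```c
#include <string.h>

/* DEMO_SMP is a hypothetical stand-in for the kernel's CONFIG_SMP. */
#ifdef DEMO_SMP
#define LOCK_PREFIX "lock; "
#else
#define LOCK_PREFIX ""          /* UP build: the prefix vanishes at compile time */
#endif

/* Instruction template as it would be handed to the assembler. */
#define ATOMIC_INC_TEMPLATE LOCK_PREFIX "incl %0"

/* Hypothetical helper: returns 1 if the template starts with a lock prefix. */
static int template_is_locked(const char *tmpl)
{
    return strncmp(tmpl, "lock", 4) == 0;
}
```

Built without DEMO_SMP, ATOMIC_INC_TEMPLATE is just "incl %0", so code that goes through LOCK_PREFIX never emits a LOCK on such a build.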

Newer SMP kernels also patch away the lock prefix when they
are running on UP, so if you only have a single core you'll
never get a LOCK.

So I think it's pretty unlikely any driver relied on this.

There are some special bit functions that always have LOCK, but these
are only used by the Xen drivers afaik (that is needed when a UP
kernel talks to an SMP hypervisor over shared memory).
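
A rough sketch of what "always locked" bit functions look like, written with portable C11 atomics rather than the kernel's own inline assembly. The names sync_set_bit_sketch() and sync_test_and_clear_bit_sketch() are hypothetical; the assumption is that on x86 the compiler typically turns these operations into LOCK-prefixed (or otherwise bus-atomic) instructions whether or not the build is UP.

```c
#include <stdatomic.h>

/* Hypothetical helper: atomically set bit nr in *word.
 * nr must be smaller than the number of bits in unsigned long. */
static void sync_set_bit_sketch(atomic_ulong *word, unsigned int nr)
{
    atomic_fetch_or(word, 1UL << nr);
}

/* Hypothetical helper: atomically clear bit nr and report whether
 * it was set before the clear. */
static int sync_test_and_clear_bit_sketch(atomic_ulong *word, unsigned int nr)
{
    unsigned long mask = 1UL << nr;
    unsigned long old = atomic_fetch_and(word, ~mask);

    return (old & mask) != 0;
}
```

The point of the design is that the atomicity does not depend on any UP/SMP build switch, which is what matters when the other side of the shared memory is running on a different CPU.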

> I agree. I was not referring to any real/known driver. I was just trying to
> figure out what kind of problem the VIA/Centaur CPU guys tried to describe when
> indicating that the LOCK prefix should be used on UP to avoid DMA interfering
> with read-modify-write CPU instructions.

It locks the cache line. That's a valid case in the x86 architecture,
it's just that the Linux driver model doesn't use it.

-Andi

--
[email protected] -- Speaking for myself only.

2009-06-02 13:26:18

by Michael S. Zick

Subject: Re: LOCK prefix on uni processor has its use

On Tue June 2 2009, Andi Kleen wrote:
> On Tue, Jun 02, 2009 at 02:48:54PM +0200, Harald Welte wrote:
> > On Wed, May 27, 2009 at 08:08:27PM +0200, Andi Kleen wrote:
> > > Harald Welte <[email protected]> writes:
> > > > * All X86 instructions except rep-strings are atomic wrt interrupts.
> > > > * The lock prefix has uses on a UP processor: It keeps DMA devices from
> > > > interfering with a read-modify-write sequence
> > >
> > > In theory yes, but not in Linux -- normal drivers simply don't use LOCK in
> > > any way on a UP kernel.
> >
> > well, they might have inadvertently used LOCK as part of regular spinlocks,
> > until LOCK_PREFIX was removed, right?
>
> LOCK_PREFIX was always defined away on UP kernels. That dates back
> to the initial Linux 2.0 SMP implementation.
>
> On newer SMP kernels they also patch away the lock prefix even
> if they are running UP, so if you only have a single core you'll
> never get lock.
>

After another week of chasing this - -
My favorite theory is still: "human coding error" - somewhere.

The LOCK_PREFIX is used or not used or mis-used by something.

My second favorite theory (related to the "some sort of timing
problem" suggestion):

Another difference is FSB speed on the two machines -
The "trouble free" case is twice as fast as the "problem" case.

Such a thing should be totally transparent to the kernel, but...
we do have humans writing the code. ;)

> So I think it's pretty unlikely any driver relied on this.
>

The kernel assumes I/O coherency, but perhaps something is
breaking that assumption. Not by intent, but by oversight.

I posed a couple of questions to H.W. off list to pass on to
the silicon grower's department. Will see what they recommend.

At the moment, I am stuck with brute-force code reading.
Nothing very elegant going on here.

Mike

> There are some special bit functions that always have LOCK, but these
> are only used by the Xen drivers afaik (that is needed when a UP
> kernel talks to a SMP hypervisor over shared memory)
>
> > I agree. I was not referring to any real/known driver. I was just trying to
> > figure out what kind of problem the VIA/Centaur CPU guys tried to describe when
> > indicating that the LOCK prefix should be used on UP to avoid DMA interfering
> > with read-modify-write CPU instructions.
>
> It locks the cache line. That's a valid case in the x86 architecture,
> it's just that the Linux driver model doesn't use it.
>
> -Andi
>

2009-06-02 13:35:28

by Andi Kleen

Subject: Re: LOCK prefix on uni processor has its use

> After another week of chasing this - -

Did you use the "compile part of the kernel with LOCK and others without"
technique I described earlier?

-Andi
--
[email protected] -- Speaking for myself only.

2009-06-03 11:46:28

by Michael S. Zick

Subject: Re: LOCK prefix on uni processor has its use

On Tue June 2 2009, Andi Kleen wrote:
> > After another week of chasing this - -
>
> Did you use the "compile part of the kernel with LOCK and others without"
> technique I described earlier?
>

That would only help if it were a single point of failure.

Although there are some assembly language things that can
be done to help in finding what to examine, like:

#define LOCK_PREFIX "\n### Lock pre-fix removed:\n\t"

Or whatever might help your favorite text search program.

Which yields asm expansion in your *.s file (gcc -S) as:

#APP
# 33 "test_bytelock.c" 1

1:      xchgb %ah, %al
        test %al,%al
        jz 3f

### Lock pre-fix removed:
        incb splock+1
2:      xchgw %ax, %ax
        cmpb $1, splock
        je 2b

### Lock pre-fix removed:
        decb splock+1
        jmp 1b
3:
# 0 "" 2
#NO_APP

Note for readers not familiar with (g)as:
#APP -> Assembler Pre-Process (gcc generated)
<ragged whitespace and comments allowed>
#NO_APP -> No Assembler Pre-Process (gcc generated)

If ambitious, you can add a comment to each asm-macro
to note the line and source filename of where it is
defined. (The line number and name gcc puts there are
where it was expanded, not where it was defined.)

Not really too ambitious - there are only 140 files
of interest (with asm-macros) in an x86, uni-processor build.

Mike

> -Andi