2006-03-07 23:40:29

by Chuck Ebbert

[permalink] [raw]
Subject: [patch] i386 spinlocks: disable interrupts only if we enabled them

_raw_spin_lock_flags() is entered with interrupts disabled. If it
cannot obtain a spinlock, it checks the flags that were passed and
re-enables interrupts before spinning if that's how the flags are set.
When the spinlock might be available, it disables interrupts (even if
they are already disabled) before trying to get the lock. Change that
so interrupts are only disabled if they have been enabled. This costs
nine bytes of duplicated spinloop code.

Fastpath before patch:
jle <keep looping> not-taken conditional jump
cli disable interrupts
jmp <try for lock> unconditional jump

Fastpath after patch, if interrupts were not enabled:
jg <try for lock> taken conditional branch

Signed-off-by: Chuck Ebbert <[email protected]>

--- 2.6.16-rc5-d2.orig/include/asm-i386/spinlock.h
+++ 2.6.16-rc5-d2/include/asm-i386/spinlock.h
@@ -35,18 +35,23 @@
#define __raw_spin_lock_string_flags \
"\n1:\t" \
"lock ; decb %0\n\t" \
- "jns 4f\n\t" \
+ "jns 5f\n" \
"2:\t" \
"testl $0x200, %1\n\t" \
- "jz 3f\n\t" \
- "sti\n\t" \
+ "jz 4f\n\t" \
+ "sti\n" \
"3:\t" \
"rep;nop\n\t" \
"cmpb $0, %0\n\t" \
"jle 3b\n\t" \
"cli\n\t" \
"jmp 1b\n" \
- "4:\n\t"
+ "4:\t" \
+ "rep;nop\n\t" \
+ "cmpb $0, %0\n\t" \
+ "jg 1b\n\t" \
+ "jmp 4b\n" \
+ "5:\n\t"

static inline void __raw_spin_lock(raw_spinlock_t *lock)
{
--
Chuck
"Penguins don't come from next door, they come from the Antarctic!"


2006-03-08 00:13:43

by Andrew Morton

[permalink] [raw]
Subject: Re: [patch] i386 spinlocks: disable interrupts only if we enabled them

Chuck Ebbert <[email protected]> wrote:
>
> _raw_spin_lock_flags() is entered with interrupts disabled. If it
> cannot obtain a spinlock, it checks the flags that were passed and
> re-enables interrupts before spinning if that's how the flags are set.
> When the spinlock might be available, it disables interrupts (even if
> they are already disabled) before trying to get the lock. Change that
> so interrupts are only disabled if they have been enabled. This costs
> nine bytes of duplicated spinloop code.
>
> Fastpath before patch:
> jle <keep looping> not-taken conditional jump
> cli disable interrupts
> jmp <try for lock> unconditional jump
>
> Fastpath after patch, if interrupts were not enabled:
> jg <try for lock> taken conditional branch
>

Well no. The fastpath is:

jns 4f we got the lock.

>
> --- 2.6.16-rc5-d2.orig/include/asm-i386/spinlock.h
> +++ 2.6.16-rc5-d2/include/asm-i386/spinlock.h
> @@ -35,18 +35,23 @@
> #define __raw_spin_lock_string_flags \
> "\n1:\t" \
> "lock ; decb %0\n\t" \
> - "jns 4f\n\t" \
> + "jns 5f\n" \
> "2:\t" \
> "testl $0x200, %1\n\t" \
> - "jz 3f\n\t" \
> - "sti\n\t" \
> + "jz 4f\n\t" \
> + "sti\n" \
> "3:\t" \
> "rep;nop\n\t" \
> "cmpb $0, %0\n\t" \
> "jle 3b\n\t" \
> "cli\n\t" \
> "jmp 1b\n" \
> - "4:\n\t"
> + "4:\t" \
> + "rep;nop\n\t" \
> + "cmpb $0, %0\n\t" \
> + "jg 1b\n\t" \
> + "jmp 4b\n" \
> + "5:\n\t"
>

So this is speeding up the slowpath, which really shouldn't matter unless
we have bigger problems.

And it's increasing text size. Which wouldn't be a big problem if the
spinning code was still in an out-of-line section, but it isn't any more.

(I forget why we undid that optimisation. What was wrong with it?)

2006-03-08 00:44:15

by Ingo Molnar

[permalink] [raw]
Subject: Re: [patch] i386 spinlocks: disable interrupts only if we enabled them


* Andrew Morton <[email protected]> wrote:

> And it's increasing text size. Which wouldn't be a big problem if the
> spinning code was still in an out-of-line section, but it isn't any
> more.
>
> (I forget why we undid that optimisation. What was wrong with it?)

we dont inline that code anymore. So i think the optimization is fine.

Acked-by: Ingo Molnar <[email protected]>

Ingo

2006-03-08 01:49:39

by Chuck Ebbert

[permalink] [raw]
Subject: Re: [patch] i386 spinlocks: disable interrupts only if we enabled them

In-Reply-To: <[email protected]>

On Tue, 7 Mar 2006 16:15:50, Andrew Morton wrote:

> Chuck Ebbert <[email protected]> wrote:
> > Fastpath before patch:
> > jle <keep looping> not-taken conditional jump
> > cli disable interrupts
> > jmp <try for lock> unconditional jump
> >
> > Fastpath after patch, if interrupts were not enabled:
> > jg <try for lock> taken conditional branch
> >
>
> Well no. The fastpath is:
>
> jns 4f we got the lock.

That's debatable. Once the spinlock is available, jumping back to
try and get it becomes fastpath code, even though the spinloop itself
is not. Any delay seen here means lost cycles that could be doing work.

My only real question is how long 'cli' takes if interrupts are already
disabled.

> And it's increasing text size. Which wouldn't be a big problem if the
> spinning code was still in an out-of-line section, but it isn't any more.

__raw_spin_lock_flags is inlined, but only one place, in
_spin_lock_irqsave(), which is no longer an inline. So there's no real
need to out-of-line the spin code anymore. This adds nine bytes, which
should be acceptable in an out-of-line function. (Only the unlock functions
are inlined, and even then only if !CONFIG_PREEMPT).


--
Chuck
"Penguins don't come from next door, they come from the Antarctic!"

2006-03-08 02:57:52

by Benjamin LaHaise

[permalink] [raw]
Subject: Re: [patch] i386 spinlocks: disable interrupts only if we enabled them

On Wed, Mar 08, 2006 at 01:43:08AM +0100, Ingo Molnar wrote:
> we dont inline that code anymore. So i think the optimization is fine.

Why is that? It adds memory traffic that has to be synchronized
before the lock occurs and clobbered registers now in the caller.

-ben
--
"Time is of no importance, Mr. President, only life is important."
Don't Email: <[email protected]>.

2006-03-08 06:57:57

by Andrew Morton

[permalink] [raw]
Subject: Re: [patch] i386 spinlocks: disable interrupts only if we enabled them

Benjamin LaHaise <[email protected]> wrote:
>
> On Wed, Mar 08, 2006 at 01:43:08AM +0100, Ingo Molnar wrote:
> > we dont inline that code anymore. So i think the optimization is fine.
>
> Why is that? It adds memory traffic that has to be synchronized
> before the lock occurs and clobbered registers now in the caller.

Is the inlined lock;decb+jns likely to worsen the text size? I doubt it.
Overall text will get bigger due to the out-of-line stuff, but that's OK.

I'm sure we went over all this, but I don't recall the thinking.

2006-03-08 08:52:22

by Nick Piggin

[permalink] [raw]
Subject: Re: [patch] i386 spinlocks: disable interrupts only if we enabled them

Andrew Morton wrote:
> Benjamin LaHaise <[email protected]> wrote:
>
>>On Wed, Mar 08, 2006 at 01:43:08AM +0100, Ingo Molnar wrote:
>> > we dont inline that code anymore. So i think the optimization is fine.
>>
>> Why is that? It adds memory traffic that has to be synchronized
>> before the lock occurs and clobbered registers now in the caller.
>
>
> Is the inlined lock;decb+jns likely to worsen the text size? I doubt it.
> Overall text will get bigger due to the out-of-line stuff, but that's OK.
>
> I'm sure we went over all this, but I don't recall the thinking.

Seems like a very good idea not to clobber any registers in
lock fastpaths. I don't see how that could have been a win
(especially for i386) but still, Ingo must have had a reason
behind it.

--
SUSE Labs, Novell Inc.
Send instant messages to your online friends http://au.messenger.yahoo.com