2006-01-10 17:44:02

by Hugh Dickins

[permalink] [raw]
Subject: i386 FRAME_POINTER needs DEBUG_MUTEXES

I find the 2.6.15-git6 i386 CONFIG_FRAME_POINTER=y doesn't work unless
CONFIG_DEBUG_MUTEXES=y, because of the __mutex_fastpath jumps to fail_fn
(giving two pushes of %ebp to one pop). Whereas x86_64 __mutex_fastpaths
use calls and work with FRAME_POINTER=y. Whether i386 deserves asm mods
rather than Kconfigery to force one from other, I've no strong instinct.

Hugh


2006-01-10 21:07:41

by Ingo Molnar

[permalink] [raw]
Subject: [patch] fix i386 mutex fastpath on FRAME_POINTER && !DEBUG_MUTEXES


* Hugh Dickins <[email protected]> wrote:

> I find the 2.6.15-git6 i386 CONFIG_FRAME_POINTER=y doesn't work unless
> CONFIG_DEBUG_MUTEXES=y, because of the __mutex_fastpath jumps to
> fail_fn (giving two pushes of %ebp to one pop). Whereas x86_64
> __mutex_fastpaths use calls and work with FRAME_POINTER=y. Whether
> i386 deserves asm mods rather than Kconfigery to force one from other,
> I've no strong instinct.

ah, good eyes! This explains some of the crashes i saw today. The patch
below solves it. Build and boot tested on x86. Linus, please apply.

Ingo

--
call the mutex slowpath more conservatively - e.g. FRAME_POINTERS can
change the calling convention, in which case a direct branch to the
slowpath becomes illegal. Bug found by Hugh Dickins.

Signed-off-by: Ingo Molnar <[email protected]>

----

include/asm-i386/mutex.h | 16 ++++++++++++++--
kernel/mutex.c | 9 ---------
2 files changed, 14 insertions(+), 11 deletions(-)

Index: linux/include/asm-i386/mutex.h
===================================================================
--- linux.orig/include/asm-i386/mutex.h
+++ linux/include/asm-i386/mutex.h
@@ -28,7 +28,13 @@ do { \
\
__asm__ __volatile__( \
LOCK " decl (%%eax) \n" \
- " js "#fail_fn" \n" \
+ " js 2f \n" \
+ "1: \n" \
+ \
+ LOCK_SECTION_START("") \
+ "2: call "#fail_fn" \n" \
+ " jmp 1b \n" \
+ LOCK_SECTION_END \
\
:"=a" (dummy) \
: "a" (count) \
@@ -78,7 +84,13 @@ do { \
\
__asm__ __volatile__( \
LOCK " incl (%%eax) \n" \
- " jle "#fail_fn" \n" \
+ " jle 2f \n" \
+ "1: \n" \
+ \
+ LOCK_SECTION_START("") \
+ "2: call "#fail_fn" \n" \
+ " jmp 1b \n" \
+ LOCK_SECTION_END \
\
:"=a" (dummy) \
: "a" (count) \
Index: linux/kernel/mutex.c
===================================================================
--- linux.orig/kernel/mutex.c
+++ linux/kernel/mutex.c
@@ -84,12 +84,6 @@ void fastcall __sched mutex_lock(struct
/*
* The locking fastpath is the 1->0 transition from
* 'unlocked' into 'locked' state.
- *
- * NOTE: if asm/mutex.h is included, then some architectures
- * rely on mutex_lock() having _no other code_ here but this
- * fastpath. That allows the assembly fastpath to do
- * tail-merging optimizations. (If you want to put testcode
- * here, do it under #ifndef CONFIG_MUTEX_DEBUG.)
*/
__mutex_fastpath_lock(&lock->count, __mutex_lock_slowpath);
}
@@ -115,8 +109,6 @@ void fastcall __sched mutex_unlock(struc
/*
* The unlocking fastpath is the 0->1 transition from 'locked'
* into 'unlocked' state:
- *
- * NOTE: no other code must be here - see mutex_lock() .
*/
__mutex_fastpath_unlock(&lock->count, __mutex_unlock_slowpath);
}
@@ -261,7 +253,6 @@ __mutex_lock_interruptible_slowpath(atom
*/
int fastcall __sched mutex_lock_interruptible(struct mutex *lock)
{
- /* NOTE: no other code must be here - see mutex_lock() */
return __mutex_fastpath_lock_retval
(&lock->count, __mutex_lock_interruptible_slowpath);
}

2006-01-11 08:27:46

by Chuck Ebbert

[permalink] [raw]
Subject: Re: [patch] fix i386 mutex fastpath on FRAME_POINTER && !DEBUG_MUTEXES

In-Reply-To: <[email protected]>

On Tue, 10 Jan 2006 at 22:07:44 +0100, Ingo Molnar wrote:

> --- linux.orig/include/asm-i386/mutex.h
> +++ linux/include/asm-i386/mutex.h
> @@ -28,7 +28,13 @@ do { \
> \
> __asm__ __volatile__( \
> LOCK " decl (%%eax) \n" \
> - " js "#fail_fn" \n" \
> + " js 2f \n" \
> + "1: \n" \
> + \
> + LOCK_SECTION_START("") \
> + "2: call "#fail_fn" \n" \
> + " jmp 1b \n" \
> + LOCK_SECTION_END \
> \
> :"=a" (dummy) \
> : "a" (count) \


But now it's inefficient again.

Why not this:

LOCK " decl (%%eax) \n" \
" jns 1f \n" \
" call "#fail_fn" \n" \
"1: \n" \
\
:"=a" (dummy) \
: "a" (count) \


Will the extra taken forward conditional jump in the fastpath cause much of
a slowdown?

--
Chuck
Currently reading: _Olympos_ by Dan Simmons

2006-01-11 10:14:04

by Ingo Molnar

[permalink] [raw]
Subject: Re: [patch] fix i386 mutex fastpath on FRAME_POINTER && !DEBUG_MUTEXES


On Wed, 11 Jan 2006, Chuck Ebbert wrote:

> In-Reply-To: <[email protected]>
>
> On Tue, 10 Jan 2006 at 22:07:44 +0100, Ingo Molnar wrote:
>
> > --- linux.orig/include/asm-i386/mutex.h
> > +++ linux/include/asm-i386/mutex.h
> > @@ -28,7 +28,13 @@ do { \
> > \
> > __asm__ __volatile__( \
> > LOCK " decl (%%eax) \n" \
> > - " js "#fail_fn" \n" \
> > + " js 2f \n" \
> > + "1: \n" \
> > + \
> > + LOCK_SECTION_START("") \
> > + "2: call "#fail_fn" \n" \
> > + " jmp 1b \n" \
> > + LOCK_SECTION_END \
> > \
> > :"=a" (dummy) \
> > : "a" (count) \
>
>
> But now it's inefficient again.
>
> Why not this:
>
> LOCK " decl (%%eax) \n" \
> " jns 1f \n" \
> " call "#fail_fn" \n" \
> "1: \n" \
> \
> :"=a" (dummy) \
> : "a" (count) \
>
>
> Will the extra taken forward conditional jump in the fastpath cause much
> of a slowdown?

yeah - the fastpath is much more common than the slowpath.

Ingo