2010-11-26 15:31:44

by Tejun Heo

Subject: Possible FPU context corruption w/ CONFIG_PREEMPT

Hello, guys.

Heinz-Bernd Eggenstein reports a possible FPU context corruption w/
CONFIG_PREEMPT. Please take a look at the following forum post.

http://einstein.phys.uwm.edu/forum_thread.php?id=8516

openSUSE 11.3 desktop kernel which has CONFIG_PREEMPT set is
triggering SIGFPE while the default kernel w/o preemption works fine.
He also notes that a similar bug was fixed in 2008 by commit 06c38d5e
(x86-64: fix FPU corruption with signals and preemption) from Suresh.
Does it ring anyone's bell?

Heinz, is there a simple procedure to reproduce the problem, or would
it be possible to lure you into bisection?

Thanks.

--
tejun


2010-11-27 05:34:43

by Brian Gerst

Subject: Re: Possible FPU context corruption w/ CONFIG_PREEMPT

On Fri, Nov 26, 2010 at 10:31 AM, Tejun Heo <[email protected]> wrote:
> Hello, guys.
>
> Heinz-Bernd Eggenstein reports a possible FPU context corruption w/
> CONFIG_PREEMPT.  Please take a look at the following forum post.
>
>  http://einstein.phys.uwm.edu/forum_thread.php?id=8516
>
> openSUSE 11.3 desktop kernel which has CONFIG_PREEMPT set is
> triggering SIGFPE while the default kernel w/o preemption works fine.
> He also notes that a similar bug was fixed in 2008 by commit 06c38d5e
> (x86-64: fix FPU corruption with signals and preemption) from Suresh.
> Does it ring anyone's bell?
>
> Heinz, is there a simple procedure to reproduce the problem, or would
> it be possible to lure you into bisection?
>
> Thanks.
>

This might be fixed by commit a4d4fbc7735bba6654b20f859135f9d3f8fe7f76
(Disable preemption when using TS_USEDFPU).

--
Brian Gerst

2010-11-27 10:19:32

by Tejun Heo

Subject: Re: Possible FPU context corruption w/ CONFIG_PREEMPT

Hey, Brian.

On 11/27/2010 06:34 AM, Brian Gerst wrote:
> On Fri, Nov 26, 2010 at 10:31 AM, Tejun Heo <[email protected]> wrote:
>> Hello, guys.
>>
>> Heinz-Bernd Eggenstein reports a possible FPU context corruption w/
>> CONFIG_PREEMPT. Please take a look at the following forum post.
>>
>> http://einstein.phys.uwm.edu/forum_thread.php?id=8516
>>
>> openSUSE 11.3 desktop kernel which has CONFIG_PREEMPT set is
>> triggering SIGFPE while the default kernel w/o preemption works fine.
>> He also notes that a similar bug was fixed in 2008 by commit 06c38d5e
>> (x86-64: fix FPU corruption with signals and preemption) from Suresh.
>> Does it ring anyone's bell?
>>
>> Heinz, is there a simple procedure to reproduce the problem, or would
>> it be possible to lure you into bisection?
>
> This might be fixed by commit a4d4fbc7735bba6654b20f859135f9d3f8fe7f76
> (Disable preemption when using TS_USEDFPU).

Thanks for the pointer. Can someone please verify whether the
following patch fixes the issue? If it does, it should definitely
go to -stable.

From a4d4fbc7735bba6654b20f859135f9d3f8fe7f76 Mon Sep 17 00:00:00 2001
From: Brian Gerst <[email protected]>
Date: Fri, 3 Sep 2010 21:17:12 -0400
Subject: [PATCH] x86-64, fpu: Disable preemption when using TS_USEDFPU

Consolidates code and fixes the below race for 64-bit.

commit 9fa2f37bfeb798728241cc4a19578ce6e4258f25
Author: torvalds <torvalds>
Date: Tue Sep 2 07:37:25 2003 +0000

Be a lot more careful about TS_USEDFPU and preemption

We had some races where we tested (or set) TS_USEDFPU together
with sequences that depended on the setting (like clearing or
setting the TS flag in %cr0) and we could be preempted in between,
which screws up the FPU state, since preemption will itself change
USEDFPU and the TS flag.

This makes it a lot more explicit: the "internal" low-level FPU
functions ("__xxxx_fpu()") all require preemption to be disabled,
and the exported "real" functions will make sure that is the case.

One case - in __switch_to() - was switched to the non-preempt-safe
internal version, since the scheduler itself has already disabled
preemption.

BKrev: 3f5448b5WRiQuyzAlbajs3qoQjSobw

Signed-off-by: Brian Gerst <[email protected]>
Acked-by: Pekka Enberg <[email protected]>
Cc: Suresh Siddha <[email protected]>
LKML-Reference: <[email protected]>
Signed-off-by: H. Peter Anvin <[email protected]>
---
arch/x86/include/asm/i387.h | 15 ---------------
arch/x86/kernel/process_64.c | 2 +-
2 files changed, 1 insertions(+), 16 deletions(-)

diff --git a/arch/x86/include/asm/i387.h b/arch/x86/include/asm/i387.h
index 88065e3..8b40a83 100644
--- a/arch/x86/include/asm/i387.h
+++ b/arch/x86/include/asm/i387.h
@@ -387,19 +387,6 @@ static inline void irq_ts_restore(int TS_state)
 		stts();
 }

-#ifdef CONFIG_X86_64
-
-static inline void save_init_fpu(struct task_struct *tsk)
-{
-	__save_init_fpu(tsk);
-	stts();
-}
-
-#define unlazy_fpu __unlazy_fpu
-#define clear_fpu __clear_fpu
-
-#else /* CONFIG_X86_32 */
-
 /*
  * These disable preemption on their own and are safe
  */
@@ -425,8 +412,6 @@ static inline void clear_fpu(struct task_struct *tsk)
 	preempt_enable();
 }

-#endif /* CONFIG_X86_64 */
-
 /*
  * i387 state interaction
  */
diff --git a/arch/x86/kernel/process_64.c b/arch/x86/kernel/process_64.c
index 3d9ea53..b3d7a3a 100644
--- a/arch/x86/kernel/process_64.c
+++ b/arch/x86/kernel/process_64.c
@@ -424,7 +424,7 @@ __switch_to(struct task_struct *prev_p, struct task_struct *next_p)
 	load_TLS(next, cpu);

 	/* Must be after DS reload */
-	unlazy_fpu(prev_p);
+	__unlazy_fpu(prev_p);

 	/* Make sure cpu is ready for new context */
 	if (preload_fpu)
--
1.7.1
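
The commit message above describes a convention: the internal
"__xxxx_fpu()" helpers require preemption to be disabled, and the
exported functions disable it themselves. Below is a minimal userspace
sketch of that convention, not kernel code. Every name in it
(model_preempt_disable, model__unlazy_fpu, model_switch_to, the
ts_usedfpu and cr0_ts variables, and so on) is a hypothetical stand-in
invented for illustration. The preemption counter is a plain integer,
so the sketch only shows the calling discipline and does not reproduce
the race itself.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace stand-ins; none of this is kernel code. */
int preempt_count;      /* > 0 models "preemption disabled" */
bool ts_usedfpu;        /* models the TS_USEDFPU status flag */
bool cr0_ts = true;     /* models the TS flag in %cr0 */
int fpu_saves;          /* counts modelled FPU state saves */

void model_preempt_disable(void)
{
    preempt_count++;
}

void model_preempt_enable(void)
{
    assert(preempt_count > 0);
    preempt_count--;
}

/*
 * Internal helper: tests TS_USEDFPU and then acts on the result.
 * The test and the action must not be separated by a preemption,
 * so the caller is required to have preemption disabled already.
 */
void model__unlazy_fpu(void)
{
    assert(preempt_count > 0);
    if (ts_usedfpu) {
        fpu_saves++;        /* stands in for saving FPU registers */
        ts_usedfpu = false;
        cr0_ts = true;      /* stands in for stts() */
    }
}

/* Exported wrapper: safe to call from preemptible context. */
void model_unlazy_fpu(void)
{
    model_preempt_disable();
    model__unlazy_fpu();
    model_preempt_enable();
}

/*
 * Models the __switch_to() case: the scheduler has already disabled
 * preemption, so the internal helper is called directly.
 */
void model_switch_to(void)
{
    model_preempt_disable();    /* done by the scheduler in reality */
    model__unlazy_fpu();
    model_preempt_enable();
}
```

Calling model__unlazy_fpu() without first disabling preemption trips
the assertion. In this model that corresponds to the situation the
patch removes on 64-bit, where unlazy_fpu and clear_fpu were plain
aliases for the internal helpers.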