2008-03-05 15:30:35

by Aurelien Jarno

Subject: Linux doesn't follow x86/x86-64 ABI wrt direction flag

Hi all,

Since version 4.3, gcc changed its behaviour concerning the x86/x86-64
ABI and the direction flag [1]: it now assumes that the direction flag
is cleared at the entry of a function, and it no longer clears it again
when needed.

This causes some problems with the Linux kernel which does not clear
the direction flag when entering a signal handler. The small code below
(for x86-64) demonstrates that.

If the signal handler uses code that needs the direction flag cleared
(for example bzero() or memset()), that code is executed incorrectly.

I guess this has to be fixed on the kernel side, but also gcc-4.3 could
revert back to the old behaviour, that is clearing the direction flag
when entering a routine that touches it until most people are running a
fixed kernel.

Kind regards,
Aurelien

[1] http://gcc.gnu.org/gcc-4.3/changes.html


#include <stdint.h>
#include <stdlib.h>
#include <stdio.h>
#include <signal.h>

void handler(int signal) {
    uint64_t rflags;

    asm volatile("pushfq ; popq %0" : "=g" (rflags));

    if (rflags & (1 << 10))
        printf("DF = 1\n");
    else
        printf("DF = 0\n");
}

int main() {
    signal(SIGUSR1, handler);

    while(1)
    {
        asm volatile("std\r\n");
    }

    return 0;
}
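
To illustrate why a set direction flag matters to memset()-style code, here
is a minimal sketch. It is not taken from glibc or gcc: stosb_fill() is a
hypothetical helper that runs "rep stosb" with DF either cleared or set, and
it assumes x86-64 and GCC-style inline assembly. With DF set, the store
walks downwards from the start address, so most of the bytes land below the
intended region instead of inside it.

```c
#include <assert.h>
#include <stddef.h>

/* Fill n bytes with c using "rep stosb", starting at dst.
 * backward == 0: DF is cleared first, the store walks upwards.
 * backward != 0: DF is set first, the store walks downwards.
 * DF is always cleared again before the asm statement ends, so the
 * surrounding compiled code still sees the state the ABI promises. */
void stosb_fill(unsigned char *dst, unsigned char c, size_t n, int backward)
{
    assert(dst != NULL);

    if (backward) {
        __asm__ volatile("std\n\t"
                         "rep stosb\n\t"
                         "cld"
                         : "+D"(dst), "+c"(n)
                         : "a"(c)
                         : "memory", "cc");
    } else {
        __asm__ volatile("cld\n\t"
                         "rep stosb"
                         : "+D"(dst), "+c"(n)
                         : "a"(c)
                         : "memory", "cc");
    }
}
```

Called as stosb_fill(buf + 8, 0xBB, 4, 1), the helper writes buf[8], buf[7],
buf[6] and buf[5]; the same call with backward == 0 writes buf[8] through
buf[11].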

--
.''`. Aurelien Jarno | GPG: 1024D/F1BCDB73
: :' : Debian developer | Electrical Engineer
`. `' [email protected] | [email protected]
`- people.debian.org/~aurel32 | http://www.aurel32.net


2008-03-05 16:05:48

by H. Peter Anvin

Subject: Re: Linux doesn't follow x86/x86-64 ABI wrt direction flag

Aurelien Jarno wrote:
> Hi all,
>
> Since version 4.3, gcc changed its behaviour concerning the x86/x86-64
> ABI and the direction flag, that is it now assumes that the direction
> flag is cleared at the entry of a function and it doesn't clear once
> more if needed.
>
> This causes some problems with the Linux kernel which does not clear
> the direction flag when entering a signal handler. The small code below
> (for x86-64) demonstrates that.
>
> If the signal handler is using code that need the direction flag cleared
> (for example bzero() or memset()), the code is incorrectly executed.
>
> I guess this has to be fixed on the kernel side, but also gcc-4.3 could
> revert back to the old behaviour, that is clearing the direction flag
> when entering a routine that touches it until most people are running a
> fixed kernel.
>

Linux should definitely follow the ABI. This is a bug, and a pretty
serious one.

-hpa

2008-03-05 16:56:34

by H.J. Lu

Subject: Re: Linux doesn't follow x86/x86-64 ABI wrt direction flag

Hi,

According to i386 psABI,

---
The direction flag must be set to the "forward" direction before entry and
upon exit from a function.
---

So an asm statement should make sure that the direction flag is cleared
before the function returns, and the kernel should make sure that the
direction flag is cleared when calling a signal handler.
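
As a sketch of the first half of that rule: an asm statement that sets DF,
for example to copy an overlapping region backwards, should clear it again
before control returns to compiled code. copy_backward() below is a
hypothetical helper written for illustration only; it assumes x86-64 and
GCC-style inline assembly.

```c
#include <assert.h>
#include <stddef.h>

/* Copy n bytes from src to dst, last byte first, so that an overlapping
 * copy with dst > src does not overwrite source bytes before reading them.
 * The asm sets DF for "rep movsb" and clears it again with "cld", so DF is
 * back in the "forward" state when the function returns. */
void copy_backward(unsigned char *dst, const unsigned char *src, size_t n)
{
    if (n == 0)
        return;
    assert(dst != NULL && src != NULL);

    unsigned char *d = dst + n - 1;       /* last destination byte */
    const unsigned char *s = src + n - 1; /* last source byte */

    __asm__ volatile("std\n\t"
                     "rep movsb\n\t"
                     "cld"
                     : "+D"(d), "+S"(s), "+c"(n)
                     :
                     : "memory", "cc");
}
```

Shifting the six bytes "abcdef" up by two positions inside the same buffer
this way yields "ababcdef".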

H.J.
On Wed, Mar 5, 2008 at 7:30 AM, Aurelien Jarno <[email protected]> wrote:
> Hi all,
>
> Since version 4.3, gcc changed its behaviour concerning the x86/x86-64
> ABI and the direction flag, that is it now assumes that the direction
> flag is cleared at the entry of a function and it doesn't clear once
> more if needed.
>
> This causes some problems with the Linux kernel which does not clear
> the direction flag when entering a signal handler. The small code below
> (for x86-64) demonstrates that.
>
> If the signal handler is using code that need the direction flag cleared
> (for example bzero() or memset()), the code is incorrectly executed.
>
> I guess this has to be fixed on the kernel side, but also gcc-4.3 could
> revert back to the old behaviour, that is clearing the direction flag
> when entering a routine that touches it until most people are running a
> fixed kernel.
>
> Kind regards,
> Aurelien
>
> [1] http://gcc.gnu.org/gcc-4.3/changes.html
>
>
> #include <stdint.h>
> #include <stdlib.h>
> #include <stdio.h>
> #include <signal.h>
>
> void handler(int signal) {
> uint64_t rflags;
>
> asm volatile("pushfq ; popq %0" : "=g" (rflags));
>
> if (rflags & (1 << 10))
> printf("DF = 1\n");
> else
> printf("DF = 0\n");
> }
>
> int main() {
> signal(SIGUSR1, handler);
>
> while(1)
> {
> asm volatile("std\r\n");
> }
>
> return 0;
> }
>
> --
> .''`. Aurelien Jarno | GPG: 1024D/F1BCDB73
> : :' : Debian developer | Electrical Engineer
> `. `' [email protected] | [email protected]
> `- people.debian.org/~aurel32 | http://www.aurel32.net
>

2008-03-05 18:15:27

by Aurelien Jarno

Subject: [PATCH] x86: Clear DF before calling signal handler

The Linux kernel currently does not clear the direction flag before
calling a signal handler, whereas the x86/x86-64 ABI requires that.
This has become a real problem with gcc version 4.3, which assumes that
the direction flag is correctly cleared at the entry of a function.

This patch changes the setup_frame() functions to clear the
direction flag before entering the signal handler.

Signed-off-by: Aurelien Jarno <[email protected]>
---
arch/x86/ia32/ia32_signal.c | 4 ++--
arch/x86/kernel/signal_32.c | 4 ++--
arch/x86/kernel/signal_64.c | 2 +-
3 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/arch/x86/ia32/ia32_signal.c b/arch/x86/ia32/ia32_signal.c
index 1c0503b..5e7771a 100644
--- a/arch/x86/ia32/ia32_signal.c
+++ b/arch/x86/ia32/ia32_signal.c
@@ -500,7 +500,7 @@ int ia32_setup_frame(int sig, struct k_sigaction *ka,
regs->ss = __USER32_DS;

set_fs(USER_DS);
- regs->flags &= ~X86_EFLAGS_TF;
+ regs->flags &= ~(X86_EFLAGS_TF | X86_EFLAGS_DF);
if (test_thread_flag(TIF_SINGLESTEP))
ptrace_notify(SIGTRAP);

@@ -600,7 +600,7 @@ int ia32_setup_rt_frame(int sig, struct k_sigaction *ka, siginfo_t *info,
regs->ss = __USER32_DS;

set_fs(USER_DS);
- regs->flags &= ~X86_EFLAGS_TF;
+ regs->flags &= ~(X86_EFLAGS_TF | X86_EFLAGS_DF);
if (test_thread_flag(TIF_SINGLESTEP))
ptrace_notify(SIGTRAP);

diff --git a/arch/x86/kernel/signal_32.c b/arch/x86/kernel/signal_32.c
index caee1f0..0157a6f 100644
--- a/arch/x86/kernel/signal_32.c
+++ b/arch/x86/kernel/signal_32.c
@@ -407,7 +407,7 @@ static int setup_frame(int sig, struct k_sigaction *ka,
* The tracer may want to single-step inside the
* handler too.
*/
- regs->flags &= ~TF_MASK;
+ regs->flags &= ~(TF_MASK | X86_EFLAGS_DF);
if (test_thread_flag(TIF_SINGLESTEP))
ptrace_notify(SIGTRAP);

@@ -500,7 +500,7 @@ static int setup_rt_frame(int sig, struct k_sigaction *ka, siginfo_t *info,
* The tracer may want to single-step inside the
* handler too.
*/
- regs->flags &= ~TF_MASK;
+ regs->flags &= ~(TF_MASK | X86_EFLAGS_DF);
if (test_thread_flag(TIF_SINGLESTEP))
ptrace_notify(SIGTRAP);

diff --git a/arch/x86/kernel/signal_64.c b/arch/x86/kernel/signal_64.c
index 7347bb1..56b72fb 100644
--- a/arch/x86/kernel/signal_64.c
+++ b/arch/x86/kernel/signal_64.c
@@ -295,7 +295,7 @@ static int setup_rt_frame(int sig, struct k_sigaction *ka, siginfo_t *info,
see include/asm-x86_64/uaccess.h for details. */
set_fs(USER_DS);

- regs->flags &= ~X86_EFLAGS_TF;
+ regs->flags &= ~(X86_EFLAGS_TF | X86_EFLAGS_DF);
if (test_thread_flag(TIF_SINGLESTEP))
ptrace_notify(SIGTRAP);
#ifdef DEBUG_SIG
--
1.5.4.3
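
The effect of the one-line change can be sketched in user space. The
constants below use the architectural bit positions of TF (bit 8) and DF
(bit 10) in EFLAGS; flags_for_handler() is a hypothetical name used only
for illustration and is not a kernel function.

```c
#include <assert.h>

#define X86_EFLAGS_TF 0x00000100UL /* trap flag, bit 8 */
#define X86_EFLAGS_DF 0x00000400UL /* direction flag, bit 10 */

/* Return the flags value a signal handler would start with: the
 * interrupted code's flags with TF and DF masked out, mirroring
 * "regs->flags &= ~(X86_EFLAGS_TF | X86_EFLAGS_DF)" from the patch. */
unsigned long flags_for_handler(unsigned long flags)
{
    unsigned long cleared = flags & ~(X86_EFLAGS_TF | X86_EFLAGS_DF);

    assert((cleared & X86_EFLAGS_DF) == 0);
    return cleared;
}
```

All other bits pass through unchanged, so only the handler's entry state
is affected.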


--
.''`. Aurelien Jarno | GPG: 1024D/F1BCDB73
: :' : Debian developer | Electrical Engineer
`. `' [email protected] | [email protected]
`- people.debian.org/~aurel32 | http://www.aurel32.net

2008-03-05 18:22:17

by H. Peter Anvin

Subject: Re: [PATCH] x86: Clear DF before calling signal handler

Aurelien Jarno wrote:
> The Linux kernel currently does not clear the direction flag before
> calling a signal handler, whereas the x86/x86-64 ABI requires that.
> This become a real problem with gcc version 4.3, which assumes that
> the direction flag is correctly cleared at the entry of a function.
>
> This patches changes the setup_frame() functions to clear the
> direction before entering the signal handler.
>
> Signed-off-by: Aurelien Jarno <[email protected]>

Acked-by: H. Peter Anvin <[email protected]>

2008-03-05 20:14:28

by Joe Buck

Subject: Re: Linux doesn't follow x86/x86-64 ABI wrt direction flag


Aurelien Jarno wrote:
> >Since version 4.3, gcc changed its behaviour concerning the x86/x86-64
> >ABI and the direction flag, that is it now assumes that the direction
> >flag is cleared at the entry of a function and it doesn't clear once
> >more if needed.
> >...
> >I guess this has to be fixed on the kernel side, but also gcc-4.3 could
> >revert back to the old behaviour, that is clearing the direction flag
> >when entering a routine that touches it until most people are running a
> >fixed kernel.

On Wed, Mar 05, 2008 at 08:00:42AM -0800, H. Peter Anvin wrote:
> Linux should definitely follow the ABI. This is a bug, and a pretty
> serious such.

Unfortunately, there are a lot of kernels out there already with this
problem, and the symptoms are likely to be subtle. So even if it is true
that it is the kernel that is "in the wrong", I think we still are going
to need to give users a workaround from the gcc side as well.

So I think gcc at least needs an *option* to revert to the old behavior,
and there's a good argument to make it the default for now, at least for
x86/x86-64 on Linux.



2008-03-05 20:23:39

by Aurelien Jarno

Subject: Re: Linux doesn't follow x86/x86-64 ABI wrt direction flag

On Wed, Mar 05, 2008 at 11:58:34AM -0800, Joe Buck wrote:
>
> Aurelien Jarno wrote:
> > >Since version 4.3, gcc changed its behaviour concerning the x86/x86-64
> > >ABI and the direction flag, that is it now assumes that the direction
> > >flag is cleared at the entry of a function and it doesn't clear once
> > >more if needed.
> > >...
> > >I guess this has to be fixed on the kernel side, but also gcc-4.3 could
> > >revert back to the old behaviour, that is clearing the direction flag
> > >when entering a routine that touches it until most people are running a
> > >fixed kernel.
>
> On Wed, Mar 05, 2008 at 08:00:42AM -0800, H. Peter Anvin wrote:
> > Linux should definitely follow the ABI. This is a bug, and a pretty
> > serious such.
>
> Unfortunately, there are a lot of kernels out there already with this
> problem, and the symptoms are likely to be subtle. So even if it is true
> that it is the kernel that is "in the wrong", I think we still are going
> to need to give users a workaround from the gcc side as well.
>
> So I think gcc at least needs an *option* to revert to the old behavior,
> and there's a good argument to make it the default for now, at least for
> x86/x86-64 on Linux.

And for other kernels. I tested OpenBSD 4.1, FreeBSD 6.3 and NetBSD 4.0;
they have the same behaviour as Linux, that is, they don't clear DF
before calling the signal handler.

I also tested Hurd, and it causes a kernel crash.

--
.''`. Aurelien Jarno | GPG: 1024D/F1BCDB73
: :' : Debian developer | Electrical Engineer
`. `' [email protected] | [email protected]
`- people.debian.org/~aurel32 | http://www.aurel32.net

2008-03-05 20:38:24

by Michael Matz

Subject: Re: Linux doesn't follow x86/x86-64 ABI wrt direction flag

Hi,

On Wed, 5 Mar 2008, Aurelien Jarno wrote:

> > So I think gcc at least needs an *option* to revert to the old behavior,
> > and there's a good argument to make it the default for now, at least for
> > x86/x86-64 on Linux.
>
> And for other kernels. I tested OpenBSD 4.1, FreeBSD 6.3, NetBSD 4.0,
> they have the same behaviour as Linux, that is they don't clear DF
> before calling the signal handler.

Sigh. We could perhaps insert a cld for all functions which can be
recognized as possible signal handlers and call other unknown or string
functions. But it's probably even faster to emit cld in front of the
inline copies of mem functions again :-(
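
For code that has to run on unfixed kernels, the manual equivalent of
"insert a cld for possible signal handlers" might look like the sketch
below. It is an illustration under assumptions, not something gcc emits:
defensive_handler(), install_defensive_handler() and handled_signo are
hypothetical names, and the code assumes x86/x86-64 with GCC-style inline
assembly. The compiler is free to emit instructions ahead of the asm
statement, so this narrows the window rather than closing it by
construction.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L /* for struct sigaction under strict -std= */
#endif

#include <assert.h>
#include <signal.h>
#include <stddef.h>

/* Last signal number seen by the handler; 0 until a signal arrives. */
volatile sig_atomic_t handled_signo;

/* Clear DF first, so that any string instructions in the handler body or
 * in functions it calls run in the "forward" direction even if the kernel
 * leaked DF=1 from the interrupted code. */
void defensive_handler(int signo)
{
    __asm__ volatile("cld" ::: "cc");
    handled_signo = signo;
}

/* Install defensive_handler for signo; returns 0 on success, -1 on error. */
int install_defensive_handler(int signo)
{
    struct sigaction sa;

    assert(signo > 0);
    sa.sa_handler = defensive_handler;
    sa.sa_flags = 0;
    sigemptyset(&sa.sa_mask);
    return sigaction(signo, &sa, NULL);
}
```

After install_defensive_handler(SIGUSR1), a raise(SIGUSR1) from the same
thread runs the handler and leaves SIGUSR1 in handled_signo.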


Ciao,
Michael.

2008-03-05 20:42:57

by Joe Buck

Subject: Re: Linux doesn't follow x86/x86-64 ABI wrt direction flag

On Wed, Mar 05, 2008 at 09:38:13PM +0100, Michael Matz wrote:
> Hi,
>
> On Wed, 5 Mar 2008, Aurelien Jarno wrote:
>
> > > So I think gcc at least needs an *option* to revert to the old behavior,
> > > and there's a good argument to make it the default for now, at least for
> > > x86/x86-64 on Linux.
> >
> > And for other kernels. I tested OpenBSD 4.1, FreeBSD 6.3, NetBSD 4.0,
> > they have the same behaviour as Linux, that is they don't clear DF
> > before calling the signal handler.
>
> Sigh. We could perhaps insert a cld for all functions which can be
> recognized as possible signal handlers and call other unknown or string
> functions. But it's probably even faster to emit cld in front of the
> inline copies of mem functions again :-(

Yes, if there are four kernels that get it "wrong", that effectively means
that the ABI document doesn't describe reality and gcc has to adjust.

2008-03-05 20:48:33

by H. Peter Anvin

Subject: Re: Linux doesn't follow x86/x86-64 ABI wrt direction flag

Michael Matz wrote:
> Hi,
>
> On Wed, 5 Mar 2008, Aurelien Jarno wrote:
>
>>> So I think gcc at least needs an *option* to revert to the old behavior,
>>> and there's a good argument to make it the default for now, at least for
>>> x86/x86-64 on Linux.
>> And for other kernels. I tested OpenBSD 4.1, FreeBSD 6.3, NetBSD 4.0,
>> they have the same behaviour as Linux, that is they don't clear DF
>> before calling the signal handler.
>
> Sigh. We could perhaps insert a cld for all functions which can be
> recognized as possible signal handlers and call other unknown or string
> functions. But it's probably even faster to emit cld in front of the
> inline copies of mem functions again :-(
>

Well, there is a (slight) difference: you know that a called function
will not clobber your DF state; it's only the entry condition which is
imprecise.

The best would be if this could be controlled by a flag, which we can
flip once kernel fixes have been around for long enough.

-hpa

2008-03-05 20:50:01

by Jan Hubicka

Subject: Re: Linux doesn't follow x86/x86-64 ABI wrt direction flag

> On Wed, Mar 05, 2008 at 09:38:13PM +0100, Michael Matz wrote:
> > Hi,
> >
> > On Wed, 5 Mar 2008, Aurelien Jarno wrote:
> >
> > > > So I think gcc at least needs an *option* to revert to the old behavior,
> > > > and there's a good argument to make it the default for now, at least for
> > > > x86/x86-64 on Linux.
> > >
> > > And for other kernels. I tested OpenBSD 4.1, FreeBSD 6.3, NetBSD 4.0,
> > > they have the same behaviour as Linux, that is they don't clear DF
> > > before calling the signal handler.
> >
> > Sigh. We could perhaps insert a cld for all functions which can be
> > recognized as possible signal handlers and call other unknown or string
> > functions. But it's probably even faster to emit cld in front of the
> > inline copies of mem functions again :-(
>
> Yes, if there are four kernels that get it "wrong", that effectively means
> that the ABI document doesn't describe reality and gcc has to adjust.

Kernels almost never follow the ABI used by applications to the last detail.
The Linux kernel disables the red zone and uses the kernel code model, yet
the ABI is not going to be adjusted for that.

This is reasonably easy to fix on the kernel side in signal handling, or by
removing std usage completely (I believe it is not a performance win, but
some benchmarking would be needed to double-check).

Honza

2008-03-05 20:52:40

by Aurelien Jarno

Subject: Re: Linux doesn't follow x86/x86-64 ABI wrt direction flag

H. Peter Anvin wrote:
> Michael Matz wrote:
>> Hi,
>>
>> On Wed, 5 Mar 2008, Aurelien Jarno wrote:
>>
>>>> So I think gcc at least needs an *option* to revert to the old
>>>> behavior,
>>>> and there's a good argument to make it the default for now, at least
>>>> for
>>>> x86/x86-64 on Linux.
>>> And for other kernels. I tested OpenBSD 4.1, FreeBSD 6.3, NetBSD 4.0,
>>> they have the same behaviour as Linux, that is they don't clear DF
>>> before calling the signal handler.
>>
>> Sigh. We could perhaps insert a cld for all functions which can be
>> recognized as possible signal handlers and call other unknown or
>> string functions. But it's probably even faster to emit cld in front
>> of the inline copies of mem functions again :-(
>>
>
> Well, there is a (slight) difference: you know that a called function
> will not clobber your DF state; it's only the entry condition which is
> imprecise.
>
> The best would be if this could be controlled by a flag, which we can
> flip once kernel fixes has been around for long enough.

I have to agree there. Whatever decision gcc takes, distributions will
re-enable the old behaviour for some time to allow upgrades from a
previous version.

Providing a flag to switch the behaviour (whatever the default
behaviour) will help a lot.


--
.''`. Aurelien Jarno | GPG: 1024D/F1BCDB73
: :' : Debian developer | Electrical Engineer
`. `' [email protected] | [email protected]
`- people.debian.org/~aurel32 | http://www.aurel32.net

2008-03-05 21:02:20

by Michael Matz

Subject: Re: Linux doesn't follow x86/x86-64 ABI wrt direction flag

Hi,

On Wed, 5 Mar 2008, Jan Hubicka wrote:

> Kernels almost never follow ABI used by applications to last detail.

But we aren't talking about the ABI the kernel uses internally, but about
what is exposed to user-space via signal handlers. _That_ part needs to
be followed, and if it isn't it's a serious problem we somehow have to
hack around.

> Linux kernel is disabling red zone and use kernel code model, yet the
> ABI is not going to be adjusted for that.
>
> This is resonably easy to fix on kernel side in signal handling, or by
> removing std usage completely

That is true. But it requires updating the kernel to a fixed one if you
want to run your programs compiled by 4.3 :-/ Not something we'd like to
demand.


Ciao,
Michael.

2008-03-05 21:12:16

by H. Peter Anvin

Subject: Re: Linux doesn't follow x86/x86-64 ABI wrt direction flag

Jan Hubicka wrote:
>> Yes, if there are four kernels that get it "wrong", that effectively means
>> that the ABI document doesn't describe reality and gcc has to adjust.
>
> Kernels almost never follow ABI used by applications to last detail.
> Linux kernel is disabling red zone and use kernel code model, yet the
> ABI is not going to be adjusted for that.
>
> This is resonably easy to fix on kernel side in signal handling, or by
> removing std usage completely (I believe it is not performance win, but
> some benchmarking would be needed to double check)

That's not the issue. The issue is that the kernel leaks the DF from
the code that took a signal to the signal handler.

-hpa

2008-03-05 21:20:28

by Joe Buck

Subject: RELEASE BLOCKER: Linux doesn't follow x86/x86-64 ABI wrt direction flag



On Wed, 5 Mar 2008, Jan Hubicka wrote:
> > Linux kernel is disabling red zone and use kernel code model, yet the
> > ABI is not going to be adjusted for that.
> >
> > This is resonably easy to fix on kernel side in signal handling, or by
> > removing std usage completely

On Wed, Mar 05, 2008 at 10:02:07PM +0100, Michael Matz wrote:
> That is true. But it requires updating the kernel to a fixed one if you
> want to run your programs compiled by 4.3 :-/ Not something we'd like to
> demand.

I changed the title just for emphasis.

I think that we can't ship 4.3.0 if signal handlers on x86/x86_64
platforms for both Linux and BSD systems will mysteriously (to the users)
fail, and it doesn't matter whose fault it is.

2008-03-05 21:23:32

by David Miller

Subject: Re: Linux doesn't follow x86/x86-64 ABI wrt direction flag

From: Aurelien Jarno <[email protected]>
Date: Wed, 05 Mar 2008 21:52:14 +0100

> H. Peter Anvin wrote:
> > The best would be if this could be controlled by a flag, which we can
> > flip once kernel fixes has been around for long enough.
>
> I have to agree there. Whatever the decision that gcc will take,
> distributions will reenable the old behaviour for some time for to allow
> upgrades from a previous version.

I don't think this approach is tenable.

If a distribution should ship with a "fixed" kernel and
compiler enabling the new direction flag behavior, any
binary you create on that system will be broken on any
other existing system.

I think we really are stuck with this forever; overwhelming
practice over the past 15 years has dictated to us what the
real ABI is.

2008-03-05 21:32:46

by Richard Biener

Subject: Re: RELEASE BLOCKER: Linux doesn't follow x86/x86-64 ABI wrt direction flag

On Wed, Mar 5, 2008 at 10:20 PM, Joe Buck <[email protected]> wrote:
>
>
> On Wed, 5 Mar 2008, Jan Hubicka wrote:
> > > Linux kernel is disabling red zone and use kernel code model, yet the
> > > ABI is not going to be adjusted for that.
> > >
> > > This is resonably easy to fix on kernel side in signal handling, or by
> > > removing std usage completely
>
> On Wed, Mar 05, 2008 at 10:02:07PM +0100, Michael Matz wrote:
> > That is true. But it requires updating the kernel to a fixed one if you
> > want to run your programs compiled by 4.3 :-/ Not something we'd like to
> > demand.
>
> I changed the title just for emphasis.
>
> I think that we can't ship 4.3.0 if signal handlers on x86/x86_64
> platforms for both Linux and BSD systems will mysteriously (to the users)
> fail, and it doesn't matter whose fault it is.

We haven't run into this issue yet, and we have been building openSUSE
with 4.3 for more than three months.

Richard.

2008-03-05 21:41:24

by Richard Biener

Subject: Re: RELEASE BLOCKER: Linux doesn't follow x86/x86-64 ABI wrt direction flag

On Wed, Mar 5, 2008 at 10:34 PM, H. Peter Anvin <[email protected]> wrote:
> Richard Guenther wrote:
> >
> > We didn't yet run into this issue and build openSUSE with 4.3 since more than
> > three month.
> >
>
> Well, how often do you take a trap inside an overlapping memmove()?

Right. So this problem is exaggerated. It's not like
"any binary you create on that system will be broken on any
other existing system."

Richard.

2008-03-05 21:43:26

by H. Peter Anvin

Subject: Re: RELEASE BLOCKER: Linux doesn't follow x86/x86-64 ABI wrt direction flag

Richard Guenther wrote:
>
> We didn't yet run into this issue and build openSUSE with 4.3 since more than
> three month.
>

Well, how often do you take a trap inside an overlapping memmove()?

-hpa

2008-03-05 21:43:47

by Joe Buck

Subject: Re: RELEASE BLOCKER: Linux doesn't follow x86/x86-64 ABI wrt direction flag

On Wed, Mar 05, 2008 at 01:34:14PM -0800, H. Peter Anvin wrote:
> Richard Guenther wrote:
> >
> >We didn't yet run into this issue and build openSUSE with 4.3 since more
> >than
> >three month.
> >
>
> Well, how often do you take a trap inside an overlapping memmove()?

Also, would it be possible to produce an exploit? If you can get string
instructions to work "the wrong way", you might be able to overwrite data.

"We haven't seen a problem" isn't the right answer. Can someone
deliberately *create* a problem?

And if we aren't sure, we should err on the side of safety.

2008-03-05 21:44:12

by Andrew Pinski

Subject: Re: RELEASE BLOCKER: Linux doesn't follow x86/x86-64 ABI wrt direction flag

There are already gcc 4.3.0 packages on the FTP site.

Sent from my iPhone

On Mar 5, 2008, at 13:20, Joe Buck <[email protected]> wrote:

>
>
> On Wed, 5 Mar 2008, Jan Hubicka wrote:
>>> Linux kernel is disabling red zone and use kernel code model, yet
>>> the
>>> ABI is not going to be adjusted for that.
>>>
>>> This is resonably easy to fix on kernel side in signal handling,
>>> or by
>>> removing std usage completely
>
> On Wed, Mar 05, 2008 at 10:02:07PM +0100, Michael Matz wrote:
>> That is true. But it requires updating the kernel to a fixed one
>> if you
>> want to run your programs compiled by 4.3 :-/ Not something we'd
>> like to
>> demand.
>
> I changed the title just for emphasis.
>
> I think that we can't ship 4.3.0 if signal handlers on x86/x86_64
> platforms for both Linux and BSD systems will mysteriously (to the
> users)
> fail, and it doesn't matter whose fault it is.
>
>

2008-03-05 21:44:41

by Michael Matz

Subject: Re: RELEASE BLOCKER: Linux doesn't follow x86/x86-64 ABI wrt direction flag

Hi,

On Wed, 5 Mar 2008, Joe Buck wrote:

>
>
> On Wed, 5 Mar 2008, Jan Hubicka wrote:
> > > Linux kernel is disabling red zone and use kernel code model, yet the
> > > ABI is not going to be adjusted for that.
> > >
> > > This is resonably easy to fix on kernel side in signal handling, or by
> > > removing std usage completely
>
> On Wed, Mar 05, 2008 at 10:02:07PM +0100, Michael Matz wrote:
> > That is true. But it requires updating the kernel to a fixed one if you
> > want to run your programs compiled by 4.3 :-/ Not something we'd like to
> > demand.
>
> I changed the title just for emphasis.
>
> I think that we can't ship 4.3.0 if signal handlers on x86/x86_64
> platforms for both Linux and BSD systems will mysteriously (to the users)
> fail, and it doesn't matter whose fault it is.

FWIW I don't think it's a release blocker for 4.3.0. The error is arcane
and happens seldom, if at all, and only on unfixed kernels. A program
needs to do std explicitly, which most don't, _and_ get hit by a signal
while being in a std region. This happens so seldom that it didn't occur
in building the next openSUSE 11.0, which has been continually building
packages with 4.3 for months.

It should be worked around in 4.3.1 if at all.


Ciao,
Michael.

2008-03-05 21:45:27

by Richard Biener

Subject: Re: RELEASE BLOCKER: Linux doesn't follow x86/x86-64 ABI wrt direction flag

On Wed, Mar 5, 2008 at 10:43 PM, Joe Buck <[email protected]> wrote:
>
> On Wed, Mar 05, 2008 at 01:34:14PM -0800, H. Peter Anvin wrote:
> > Richard Guenther wrote:
> > >
> > >We didn't yet run into this issue and build openSUSE with 4.3 since more
> > >than
> > >three month.
> > >
> >
> > Well, how often do you take a trap inside an overlapping memmove()?
>
> Also, would it be possible to produce an exploit? If you can get string
> instructions to work "the wrong way", you might be able to overwrite data.
>
> "We haven't seen a problem" isn't the right answer. Can someone
> deliberately *create* a problem?
>
> And if we aren't sure, we should err on the side of safety.

Oh, you mean releasing a kernel security update? ;) What does ICC or
other compilers do?

Richard.

2008-03-05 21:46:17

by Aurelien Jarno

Subject: Re: RELEASE BLOCKER: Linux doesn't follow x86/x86-64 ABI wrt direction flag

Richard Guenther wrote:
> On Wed, Mar 5, 2008 at 10:20 PM, Joe Buck <[email protected]> wrote:
>>
>> On Wed, 5 Mar 2008, Jan Hubicka wrote:
>> > > Linux kernel is disabling red zone and use kernel code model, yet the
>> > > ABI is not going to be adjusted for that.
>> > >
>> > > This is resonably easy to fix on kernel side in signal handling, or by
>> > > removing std usage completely
>>
>> On Wed, Mar 05, 2008 at 10:02:07PM +0100, Michael Matz wrote:
>> > That is true. But it requires updating the kernel to a fixed one if you
>> > want to run your programs compiled by 4.3 :-/ Not something we'd like to
>> > demand.
>>
>> I changed the title just for emphasis.
>>
>> I think that we can't ship 4.3.0 if signal handlers on x86/x86_64
>> platforms for both Linux and BSD systems will mysteriously (to the users)
>> fail, and it doesn't matter whose fault it is.
>
> We didn't yet run into this issue and build openSUSE with 4.3 since more than
> three month.
>

The problem can be easily reproduced by running SBCL against a glibc
built with gcc 4.3 (the gcc version used to build SBCL doesn't matter).
The signal handler in SBCL calls sigemptyset(), which uses memset().


--
.''`. Aurelien Jarno | GPG: 1024D/F1BCDB73
: :' : Debian developer | Electrical Engineer
`. `' [email protected] | [email protected]
`- people.debian.org/~aurel32 | http://www.aurel32.net

2008-03-05 21:59:34

by Michael Matz

Subject: Re: RELEASE BLOCKER: Linux doesn't follow x86/x86-64 ABI wrt direction flag

Hi,

On Wed, 5 Mar 2008, Chris Lattner wrote:

>
> On Mar 5, 2008, at 1:34 PM, H. Peter Anvin wrote:
>
> >Richard Guenther wrote:
> > >We didn't yet run into this issue and build openSUSE with 4.3 since more
> > >than
> > >three month.
> >
> >Well, how often do you take a trap inside an overlapping memmove()?
>
> How hard is it to change the kernel signal entry path from "pushf" to
> "pushf;cld"? Problem solved, no?

The problem is with old kernels, which by definition stay unfixed.


Ciao,
Michael.

2008-03-05 22:09:44

by H. Peter Anvin

Subject: Re: RELEASE BLOCKER: Linux doesn't follow x86/x86-64 ABI wrt direction flag

Chris Lattner wrote:
>
> On Mar 5, 2008, at 1:34 PM, H. Peter Anvin wrote:
>
>> Richard Guenther wrote:
>>> We didn't yet run into this issue and build openSUSE with 4.3 since
>>> more than
>>> three month.
>>
>> Well, how often do you take a trap inside an overlapping memmove()?
>
> How hard is it to change the kernel signal entry path from "pushf" to
> "pushf;cld"? Problem solved, no?

Not quite, but fixing it in the kernel is easy.

Still breaks for running on all old kernels.

-hpa

2008-03-05 22:13:08

by Joe Buck

Subject: Re: RELEASE BLOCKER: Linux doesn't follow x86/x86-64 ABI wrt direction flag

On Wed, Mar 05, 2008 at 10:43:33PM +0100, Michael Matz wrote:
> Hi,
>
> On Wed, 5 Mar 2008, Joe Buck wrote:
>
> >
> >
> > On Wed, 5 Mar 2008, Jan Hubicka wrote:
> > > > Linux kernel is disabling red zone and use kernel code model, yet the
> > > > ABI is not going to be adjusted for that.
> > > >
> > > > This is resonably easy to fix on kernel side in signal handling, or by
> > > > removing std usage completely
> >
> > On Wed, Mar 05, 2008 at 10:02:07PM +0100, Michael Matz wrote:
> > > That is true. But it requires updating the kernel to a fixed one if you
> > > want to run your programs compiled by 4.3 :-/ Not something we'd like to
> > > demand.
> >
> > I changed the title just for emphasis.
> >
> > I think that we can't ship 4.3.0 if signal handlers on x86/x86_64
> > platforms for both Linux and BSD systems will mysteriously (to the users)
> > fail, and it doesn't matter whose fault it is.
>
> FWIW I don't think it's a release blocker for 4.3.0. The error is arcane
> and happens seldomly if at all. And only on unfixed kernels. A program
> needs to do std explicitely, which most don't do _and_ get hit by a signal
> while begin in a std region. This happens so seldom that it didn't occur
> in building the next openSuSE 11.0, and it continually builds packages
> with 4.3 since months.
>
> It should be worked around in 4.3.1 if at all.

OK, I suppose that I over-reacted, and it seems that the ship has sailed
in any case.

I agree that it's obscure, and I think that the only reason to worry is
if it introduces a means of attack, which seems unlikely.

2008-03-05 22:13:29

by Adrian Bunk

Subject: Re: RELEASE BLOCKER: Linux doesn't follow x86/x86-64 ABI wrt direction flag

On Wed, Mar 05, 2008 at 10:59:21PM +0100, Michael Matz wrote:
> Hi,
>
> On Wed, 5 Mar 2008, Chris Lattner wrote:
>
> >
> > On Mar 5, 2008, at 1:34 PM, H. Peter Anvin wrote:
> >
> > >Richard Guenther wrote:
> > > >We didn't yet run into this issue and build openSUSE with 4.3 since more
> > > >than
> > > >three month.
> > >
> > >Well, how often do you take a trap inside an overlapping memmove()?
> >
> > How hard is it to change the kernel signal entry path from "pushf" to
> > "pushf;cld"? Problem solved, no?
>
> The problem is with old kernels, which by definition stay unfixed.

Compiling older kernels with new gcc versions has never been supported.

You are e.g. aware that for many 32bit architectures (including i386)
and kernels up to and including 2.6.25-rc4 even the build fails with
gcc 4.3?

> Ciao,
> Michael.

cu
Adrian

--

"Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
"Only a promise," Lao Er said.
Pearl S. Buck - Dragon Seed

2008-03-05 22:16:44

by David Miller

Subject: Re: RELEASE BLOCKER: Linux doesn't follow x86/x86-64 ABI wrt direction flag

From: "Richard Guenther" <[email protected]>
Date: Wed, 5 Mar 2008 22:40:59 +0100

> Right. So this problem is over-exaggerated. It's not like
> "any binary you create on that system will be broken on any
> other existing system."

I will be sure to hunt you down to help debug when someone reports
that once every few weeks their multi-day simulation gives incorrect
results :-)

This is one of those cases where the bug is going to be a huge
issue to people who actually hit it, and since we know about the
problem, knowingly shipping something in that state is unforgivable.

2008-03-05 22:20:01

by David Miller

Subject: Re: RELEASE BLOCKER: Linux doesn't follow x86/x86-64 ABI wrt direction flag

From: Michael Matz <[email protected]>
Date: Wed, 5 Mar 2008 22:43:33 +0100 (CET)

> The error is arcane and happens seldom if at all. And only on
> unfixed kernels.

Which translates right now into "all kernels."

2008-03-05 22:21:49

by David Miller

Subject: Re: RELEASE BLOCKER: Linux doesn't follow x86/x86-64 ABI wrt direction flag

From: Adrian Bunk <[email protected]>
Date: Thu, 6 Mar 2008 00:13:04 +0200

> On Wed, Mar 05, 2008 at 10:59:21PM +0100, Michael Matz wrote:
> > The problem is with old kernels, which by definition stay unfixed.
>
> Compiling older kernels with new gcc versions has never been supported.

Adrian, we're talking about userland binaries compiled by
gcc-4.3, not the kernel.

Please follow the discussion if you'd like to contribute.

Thanks.

2008-03-05 22:46:26

by Joe Buck

Subject: Re: RELEASE BLOCKER: Linux doesn't follow x86/x86-64 ABI wrt direction flag

On Wed, Mar 05, 2008 at 02:16:22PM -0800, David Miller wrote:
> From: "Richard Guenther" <[email protected]>
> Date: Wed, 5 Mar 2008 22:40:59 +0100
>
> > Right. So this problem is over-exaggerated. It's not like
> > "any binary you create on that system will be broken on any
> > other existing system."
>
> I will be sure to hunt you down to help debug when someone reports
> that once every few weeks their multi-day simulation gives incorrect
> results :-)
>
> This is one of those cases where the bug is going to be a huge
> issue to people who actually hit it, and since we know about the
> problem, knowingly shipping something in that state is unforgivable.

In this case, it appears that the 4.3.0 tarballs already hit the
servers before the issue was discovered.

It's not the end of the world; quite often .0 releases have some issue,
and we can patch the 4_3 branch and recommend that the patch be used.

2008-03-05 22:51:30

by Michael Matz

Subject: Re: RELEASE BLOCKER: Linux doesn't follow x86/x86-64 ABI wrt direction flag

Hi,

On Wed, 5 Mar 2008, David Miller wrote:

> I will be sure to hunt you down to help debug when someone reports that
> once every few weeks their multi-day simulation gives incorrect results
> :-)

Give them a LD_PRELOADable DSO that intercepts signal() and sigaction(),
installing a trampoline that first does "cld" and then jumps to the real
handler. That ought to be a quick test if it's this or a different issue.
A hack, yes, but well... :)

> This is one of those cases where the bug is going to be a huge issue to
> people who actually hit it, and since we know about the problem,
> knowingly shipping something in that state is unforgivable.

Many bugs are a big issue to people who actually hit them, and we had (and
probably still have) far nastier corner case miscompilations here and
there and nevertheless released. It never was the end of the world :)

Let it be a data point that nobody noticed the problem, which has existed in
trunk for more than a year (since 2006-12-06 to be precise).


Ciao,
Michael.
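
For anyone who wants to try the trampoline idea above, here is a minimal
sketch of its core only: a wrapper handler that executes cld and then
chains to the real handler. The names install_cld_handler, cld_trampoline,
real_handlers, record_signal and MAX_SIGNALS are made up for illustration.
A real LD_PRELOAD DSO would additionally have to interpose signal() and
sigaction() themselves and cope with SA_SIGINFO handlers, which this
sketch ignores.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stddef.h>

#define MAX_SIGNALS 65   /* assumption: covers signal numbers 1..64 */

typedef void (*handler_fn)(int);

/* Hypothetical table of the handlers the application really asked for. */
static handler_fn real_handlers[MAX_SIGNALS];

/* Runs first on signal delivery: clear DF, then chain to the real handler. */
static void cld_trampoline(int sig)
{
#if defined(__i386__) || defined(__x86_64__)
    __asm__ __volatile__("cld" ::: "cc");
#endif
    if (sig > 0 && sig < MAX_SIGNALS && real_handlers[sig] != NULL)
        real_handlers[sig](sig);
}

/* Hypothetical stand-in for signal(): returns 0 on success, -1 on error. */
int install_cld_handler(int sig, handler_fn handler)
{
    struct sigaction sa;

    if (sig <= 0 || sig >= MAX_SIGNALS || handler == NULL)
        return -1;
    real_handlers[sig] = handler;
    sa.sa_handler = cld_trampoline;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    return sigaction(sig, &sa, NULL);
}

/* Example handler, only here so the sketch can be exercised. */
volatile sig_atomic_t last_signal_seen;

void record_signal(int sig)
{
    last_signal_seen = sig;
}
```

Calling install_cld_handler(SIGUSR1, record_signal) and then
raise(SIGUSR1) runs record_signal with DF already cleared, whatever the
interrupted code left in the flag.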

2008-03-05 23:08:55

by H. Peter Anvin

Subject: Re: RELEASE BLOCKER: Linux doesn't follow x86/x86-64 ABI wrt direction flag

Michael Matz wrote:
>
> Many bugs are a big issue to people who actually hit them, and we had (and
> probably still have) far nastier corner case miscompilations here and
> there and nevertheless released. It never was the end of the world :)
>

This is the sort of stuff that security holes are made from.

-hpa

2008-03-05 23:10:41

by Michael Matz

Subject: Re: RELEASE BLOCKER: Linux doesn't follow x86/x86-64 ABI wrt direction flag

Hi,

On Wed, 5 Mar 2008, H. Peter Anvin wrote:

> Michael Matz wrote:
> >
> > Many bugs are a big issue to people who actually hit them, and we had (and
> > probably still have) far nastier corner case miscompilations here and there
> > and nevertheless released. It never was the end of the world :)
> >
>
> This is the sort of stuff that security holes are made from.

For security problems I prefer fixes over work-arounds. The fix lies in
the kernel, the work-around in gcc.


Ciao,
Michael.

2008-03-05 23:13:45

by David Miller

Subject: Re: RELEASE BLOCKER: Linux doesn't follow x86/x86-64 ABI wrt direction flag

From: Michael Matz <[email protected]>
Date: Thu, 6 Mar 2008 00:07:39 +0100 (CET)

> The fix lies in the kernel, the work-around in gcc.

This depends upon how you interpret this ABI situation.

There is at least some agreement that how things have
actually been implemented by these kernels for more
than 15 years trumps whatever a paper standard states.

2008-03-05 23:14:19

by Olivier Galibert

Subject: Re: RELEASE BLOCKER: Linux doesn't follow x86/x86-64 ABI wrt direction flag

On Thu, Mar 06, 2008 at 12:07:39AM +0100, Michael Matz wrote:
> For security problems I prefer fixes over work-arounds. The fix lies in
> the kernel, the work-around in gcc.

Incorrect. The bugs are in the ABI documentation and in gcc, and the
fixes should be done there. Doing it in the kernel is the workaround.

OG.

2008-03-05 23:15:39

by Olivier Galibert

Subject: Re: RELEASE BLOCKER: Linux doesn't follow x86/x86-64 ABI wrt direction flag

On Thu, Mar 06, 2008 at 12:13:04AM +0200, Adrian Bunk wrote:
> Compiling older kernels with new gcc versions has never been supported.

You read the thread too fast. It's not at all about compiling the
kernel.

OG.

2008-03-05 23:17:12

by Joe Buck

Subject: Re: RELEASE BLOCKER: Linux doesn't follow x86/x86-64 ABI wrt direction flag

On Wed, Mar 05, 2008 at 03:10:12PM -0800, David Miller wrote:
> From: Michael Matz <[email protected]>
> Date: Thu, 6 Mar 2008 00:07:39 +0100 (CET)
>
> > The fix lies in the kernel, the work-around in gcc.
>
> This depends upon how you interpret this ABI situation.
>
> There is at least some agreement that how things have
> actually been implemented by these kernels for more
> than 15 years trumps whatever a paper standard states.

We had a similar argument about the undefinedness of signed int
overflow. That's what the standard says, yet code that assumes
otherwise is pervasive, including in gcc itself.

If a standard is widely violated in a very consistent way, the violation
in effect becomes standard.

2008-03-05 23:17:40

by Olivier Galibert

Subject: Re: RELEASE BLOCKER: Linux doesn't follow x86/x86-64 ABI wrt direction flag

On Wed, Mar 05, 2008 at 10:43:33PM +0100, Michael Matz wrote:
> FWIW I don't think it's a release blocker for 4.3.0. The error is arcane
> and happens seldom if at all. And only on unfixed kernels. A program
> needs to do std explicitly, which most don't do, _and_ get hit by a signal
> while being in a std region. This happens so seldom that it didn't occur
> in building the next openSuSE 11.0, which has been continually building
> packages with 4.3 for months.

How would you know whether it has happened?

OG.

2008-03-05 23:50:04

by David Daney

Subject: Re: RELEASE BLOCKER: Linux doesn't follow x86/x86-64 ABI wrt direction flag

Olivier Galibert wrote:
> On Wed, Mar 05, 2008 at 10:43:33PM +0100, Michael Matz wrote:
>> FWIW I don't think it's a release blocker for 4.3.0. The error is arcane
>> and happens seldom if at all. And only on unfixed kernels. A program
>> needs to do std explicitly, which most don't do, _and_ get hit by a signal
>> while being in a std region. This happens so seldom that it didn't occur
>> in building the next openSuSE 11.0, which has been continually building
>> packages with 4.3 for months.
>
> How would you know whether it has happened?
>

The same way you do with other bugs: You would observe unexpected behavior.

In this case probably either corrupted memory or a SIGSEGV.

David Daney

2008-03-06 00:42:50

by Chris Lattner

Subject: Re: RELEASE BLOCKER: Linux doesn't follow x86/x86-64 ABI wrt direction flag

>>> Richard Guenther wrote:
>>>> We haven't run into this issue yet, and we have been building openSUSE
>>>> with 4.3 for more than three months.
>>>
>>> Well, how often do you take a trap inside an overlapping memmove()?
>>
>> How hard is it to change the kernel signal entry path from "pushf" to
>> "pushf;cld"? Problem solved, no?
>
> The problem is with old kernels, which by definition stay unfixed.

My impression was that the problem occurs in GCC compiled code in the
kernel itself, not in user space:

1. User space has direction flag set.
2. signal occurs
3. kernel code is entered
4. kernel code does string operation <boom>

Fixing this instance of the problem by changing GCC requires (at
least) recompiling the kernel.

Changing the ABI for this seems like a pretty crazy solution to a very
minor and easily fixable kernel bug. Distros have control over what
kernels they ship, they have absolute power to ensure this doesn't
affect their users when running default kernels - without changing the
compiler.

-Chris

2008-03-06 00:50:20

by Aurelien Jarno

Subject: Re: RELEASE BLOCKER: Linux doesn't follow x86/x86-64 ABI wrt direction flag

Chris Lattner wrote:
>>>> Richard Guenther wrote:
>>>>> We haven't run into this issue yet, and we have been building openSUSE
>>>>> with 4.3 for more than three months.
>>>>
>>>> Well, how often do you take a trap inside an overlapping memmove()?
>>>
>>> How hard is it to change the kernel signal entry path from "pushf" to
>>> "pushf;cld"? Problem solved, no?
>>
>> The problem is with old kernels, which by definition stay unfixed.
>
> My impression was that the problem occurs in GCC compiled code in the
> kernel itself, not in user space:
>
> 1. User space has direction flag set.
> 2. signal occurs
> 3. kernel code is entered
> 4. kernel code does string operation <boom>

Wrong. Except maybe for the Hurd kernel. For other kernels:

4. signal handler is called
5. signal handler does string operation <boom>

The GCC used to compile the kernel doesn't matter. Using gcc 4.3 to
compile the user code triggers the bug.

--
.''`. Aurelien Jarno | GPG: 1024D/F1BCDB73
: :' : Debian developer | Electrical Engineer
`. `' [email protected] | [email protected]
`- people.debian.org/~aurel32 | http://www.aurel32.net

2008-03-06 00:51:27

by H. Peter Anvin

Subject: Re: RELEASE BLOCKER: Linux doesn't follow x86/x86-64 ABI wrt direction flag

Chris Lattner wrote:
>>>> Richard Guenther wrote:
>>>>> We haven't run into this issue yet, and we have been building openSUSE
>>>>> with 4.3 for more than three months.
>>>>
>>>> Well, how often do you take a trap inside an overlapping memmove()?
>>>
>>> How hard is it to change the kernel signal entry path from "pushf" to
>>> "pushf;cld"? Problem solved, no?
>>
>> The problem is with old kernels, which by definition stay unfixed.
>
> My impression was that the problem occurs in GCC compiled code in the
> kernel itself, not in user space:

That's wrong.

The issue is that the kernel is entered (due to a trap, interrupt or
whatever) and the state is saved. The kernel decides to revector
userspace to a signal handler. The kernel modifies the userspace state
to do so, but doesn't set DF=0.

Upon return to userspace, the modified state kicks in. Thus the signal
handler is entered with DF from userspace at trap time, not DF=0.

So it's an asynchronous state leak from one piece of userspace to another.

-hpa
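
The sequence described above can be modelled in a few lines. This is a
toy sketch, not kernel code: struct user_regs, DF_BIT and the two
enter_handler_* functions are names invented here, and the real signal
setup path does far more (building the signal frame, switching stacks,
and so on). It only shows that redirecting the saved instruction pointer
without touching the saved flags lets DF leak into the handler.

```c
#include <stdint.h>

#define DF_BIT (UINT64_C(1) << 10)   /* direction flag is bit 10 of EFLAGS/RFLAGS */

/* Toy stand-in for the userspace register state saved at trap time. */
struct user_regs {
    uint64_t ip;
    uint64_t flags;
};

/* Unfixed behaviour: only the instruction pointer is revectored,
   so the handler inherits whatever DF the interrupted code had. */
void enter_handler_unfixed(struct user_regs *regs, uint64_t handler_ip)
{
    regs->ip = handler_ip;
}

/* Fixed behaviour: DF is also cleared in the state used to enter the handler. */
void enter_handler_fixed(struct user_regs *regs, uint64_t handler_ip)
{
    regs->ip = handler_ip;
    regs->flags &= ~DF_BIT;
}
```

The interrupted code's own flags are assumed to be preserved separately
and restored when the handler returns, so clearing DF here does not
disturb a backward copy that was in progress.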

2008-03-06 01:22:38

by H. Peter Anvin

Subject: Re: RELEASE BLOCKER: Linux doesn't follow x86/x86-64 ABI wrt direction flag

Chris Lattner wrote:
>>
>> Upon return to userspace, the modified state kicks in. Thus the
>> signal handler is entered with DF from userspace at trap time, not DF=0.
>>
>> So it's an asynchronous state leak from one piece of userspace to
>> another.
>
> Fine, it can happen either way. In either case, the distro vendor
> should fix the signal handler in the kernels they distribute. If
> you don't do that, you are still leaking information from one piece of
> user space code to another, you're just papering over it in a horrible
> way :)
>
> GCC defines the direction flag to be clear before inline asm. Enforcing
> the semantics you propose would require issuing a cld before every
> inline asm, not just before every string operation.
>

It's a kernel bug, and it needs to be fixed. The discussion is about
what to do in the meantime.

(And yes, you're absolutely right: between global subroutine entry and
the first asm or string operation, you'd have to emit cld.)

-hpa

2008-03-06 02:11:31

by Krzysztof Halasa

Subject: Re: RELEASE BLOCKER: Linux doesn't follow x86/x86-64 ABI wrt direction flag

"H. Peter Anvin" <[email protected]> writes:

> Not quite, but fixing it in the kernel is easy.
>
> Still breaks for running on all old kernels.

Many more things break on old kernels. I guess it's not worse than
a (local) root exploit, is it?

*-stable and distributions should take care of it, as they do in
other cases.
--
Krzysztof Halasa

2008-03-06 08:44:20

by Andi Kleen

Subject: Re: RELEASE BLOCKER: Linux doesn't follow x86/x86-64 ABI wrt direction flag

"H. Peter Anvin" <[email protected]> writes:

> Richard Guenther wrote:
> > We haven't run into this issue yet, and we have been building openSUSE
> > with 4.3 for more than three months.
> >
>
> Well, how often do you take a trap inside an overlapping memmove()?

That was the state with older gcc, but with newer gcc it does not necessarily
reset the flag before the next function call.

so e.g. if you have

    memmove(...);
    for (... very long loop ...) {
        /* no function calls */
        /* signals happen */
    }

the signal could see the direction flag

-Andi

2008-03-06 09:02:52

by Jakub Jelinek

Subject: Re: RELEASE BLOCKER: Linux doesn't follow x86/x86-64 ABI wrt direction flag

On Thu, Mar 06, 2008 at 09:44:05AM +0100, Andi Kleen wrote:
> "H. Peter Anvin" <[email protected]> writes:
>
> > Richard Guenther wrote:
> > > We haven't run into this issue yet, and we have been building openSUSE
> > > with 4.3 for more than three months.
> > >
> >
> > Well, how often do you take a trap inside an overlapping memmove()?
>
> That was the state with older gcc, but with newer gcc it does not necessarily
> reset the flag before the next function call.
>
> so e.g. if you have
>
> memmove(...)
> for (... very long loop .... ) {
> /* no function calls */
> /* signals happen */
> }
>
> the signal could see the direction flag

memmove is supposed to (and does) do a cld insn after it finishes the
backward copying.

Jakub
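
For readers wondering why memmove touches DF at all: when the regions
overlap and the destination lies above the source, the copy has to run
from the highest address down, otherwise it overwrites source bytes it
has not read yet. On x86 that ordering is what a "std; string op; cld"
sequence gives you. A portable C sketch of the same ordering follows;
copy_backward is a name invented for this illustration and is not the
glibc implementation.

```c
#include <stddef.h>

/* Copy n bytes from src to dst, highest address first.
   Safe when the regions overlap and dst lies above src. */
void copy_backward(unsigned char *dst, const unsigned char *src, size_t n)
{
    while (n > 0) {
        n--;
        dst[n] = src[n];
    }
}
```

In the assembly version, the window between std and the final cld is
the only place where a signal can arrive with DF=1.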

2008-03-06 09:18:14

by Jakub Jelinek

Subject: Re: RELEASE BLOCKER: Linux doesn't follow x86/x86-64 ABI wrt direction flag

On Wed, Mar 05, 2008 at 05:12:07PM -0800, H. Peter Anvin wrote:
> It's a kernel bug, and it needs to be fixed. The discussion is about
> what to do in the meantime.

While it is known that 32-bit glibc memmove and also the <string.h> inlines
for memmove and memrchr use std; some string op; cld;, 64-bit glibc never
uses the std instruction. gcc itself never generates the std instruction.

So, I've disassembled the whole Fedora/x86_64 distro (64-bit binaries/shared
libraries/archives/object files, unpacked over 12000 rpms) to see how common
the std insn is in 64-bit code. The only positive hits were the kernel
(/boot/xen-syms-2.6.21.7-2897.fc9 in particular, whatever that is) and
libpolyml.so.1.0.0 (polyml-libs - this one has handwritten assembly in
NASM), though I had to skim through some false positives (0xfd byte
appearing in data within code sections, but it is easy to see if 0xfd
is surrounded by invalid or nonsensical instructions that it is actually
data). The conclusion is that DF=1 in x86_64 64-bit code is extremely rare.

Therefore, if we decide to apply a workaround for the kernel bug in gcc
(I'm not convinced we should), it should be IMNSHO limited to 32-bit code.

Jakub
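
To illustrate where false positives come from in a search like the one
above: std is the single byte 0xfd, so any cruder approach than a real
disassembly will also hit that byte wherever it happens to occur. The
sketch below is such a crude byte scan, not the procedure actually used
for the Fedora survey; count_std_candidates and OPCODE_STD are names
invented here.

```c
#include <stddef.h>

#define OPCODE_STD 0xFD   /* one-byte encoding of the std instruction */

/* Count bytes equal to the std opcode. Each hit is only a candidate:
   the byte may be data or part of another instruction's encoding. */
size_t count_std_candidates(const unsigned char *code, size_t len)
{
    size_t hits = 0;

    for (size_t i = 0; i < len; i++)
        if (code[i] == OPCODE_STD)
            hits++;
    return hits;
}
```

For example, the byte sequence fd f3 a4 fc 48 8d 45 fd yields two
candidates, but only the first is a std; the last 0xfd is the
displacement byte (-3) of a lea instruction.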

2008-03-06 09:21:50

by Ingo Molnar

Subject: Re: [PATCH] x86: Clear DF before calling signal handler


* Aurelien Jarno <[email protected]> wrote:

> The Linux kernel currently does not clear the direction flag before
> calling a signal handler, whereas the x86/x86-64 ABI requires that.
> This becomes a real problem with gcc version 4.3, which assumes that
> the direction flag is correctly cleared at the entry of a function.
>
> This patch changes the setup_frame() functions to clear the
> direction flag before entering the signal handler.

thanks, applied.

Ingo

2008-03-06 09:46:33

by Mikael Pettersson

Subject: Re: Linux doesn't follow x86/x86-64 ABI wrt direction flag

Aurelien Jarno writes:
> On Wed, Mar 05, 2008 at 11:58:34AM -0800, Joe Buck wrote:
> >
> > Aurelien Jarno wrote:
> > > >Since version 4.3, gcc changed its behaviour concerning the x86/x86-64
> > > >ABI and the direction flag, that is it now assumes that the direction
> > > >flag is cleared at the entry of a function and it doesn't clear once
> > > >more if needed.
> > > >...
> > > >I guess this has to be fixed on the kernel side, but also gcc-4.3 could
> > > >revert back to the old behaviour, that is clearing the direction flag
> > > >when entering a routine that touches it until most people are running a
> > > >fixed kernel.
> >
> > On Wed, Mar 05, 2008 at 08:00:42AM -0800, H. Peter Anvin wrote:
> > > Linux should definitely follow the ABI. This is a bug, and a pretty
> > > serious such.
> >
> > Unfortunately, there are a lot of kernels out there already with this
> > problem, and the symptoms are likely to be subtle. So even if it is true
> > that it is the kernel that is "in the wrong", I think we still are going
> > to need to give users a workaround from the gcc side as well.
> >
> > So I think gcc at least needs an *option* to revert to the old behavior,
> > and there's a good argument to make it the default for now, at least for
> > x86/x86-64 on Linux.
>
> And for other kernels. I tested OpenBSD 4.1, FreeBSD 6.3, NetBSD 4.0,
> they have the same behaviour as Linux, that is they don't clear DF
> before calling the signal handler.

FWIW, Solaris 10 (both 32- and 64-bit) gets it right.

2008-03-06 09:54:35

by Andrew Haley

Subject: Re: Linux doesn't follow x86/x86-64 ABI wrt direction flag

Aurelien Jarno wrote:
> H. Peter Anvin wrote:
>> Michael Matz wrote:
>>>
>>> On Wed, 5 Mar 2008, Aurelien Jarno wrote:
>>>
>>>>> So I think gcc at least needs an *option* to revert to the old
>>>>> behavior,
>>>>> and there's a good argument to make it the default for now, at least
>>>>> for
>>>>> x86/x86-64 on Linux.
>>>> And for other kernels. I tested OpenBSD 4.1, FreeBSD 6.3, NetBSD 4.0,
>>>> they have the same behaviour as Linux, that is they don't clear DF
>>>> before calling the signal handler.
>>> Sigh. We could perhaps insert a cld for all functions which can be
>>> recognized as possible signal handlers and call other unknown or
>>> string functions. But it's probably even faster to emit cld in front
>>> of the inline copies of mem functions again :-(
>>>
>> Well, there is a (slight) difference: you know that a called function
>> will not clobber your DF state; it's only the entry condition which is
>> imprecise.
>>
>> The best would be if this could be controlled by a flag, which we can
>> flip once kernel fixes have been around for long enough.
>
> I have to agree there. Whatever the decision that gcc will take,
> distributions will reenable the old behaviour for some time to allow
> upgrades from a previous version.
>
> Providing a flag to switch the behaviour (whatever the default
> behaviour) will help a lot.

I think you've got the timescales wrong. Anything that we do now in gcc will
take a while to percolate to the Linux distributions. It is far quicker for
those distributions to fix their kernels as fast as possible. By the time any
gcc fix is in the world all of this will be over.

I suppose one could apply the precautionary principle, but those systems that
don't update kernels won't update gcc either, so the solution won't work.

Andrew.

2008-03-06 11:46:15

by Andi Kleen

Subject: Re: Linux doesn't follow x86/x86-64 ABI wrt direction flag

Andrew Haley <[email protected]> writes:
>
> I suppose one could apply the precautionary principle, but those systems that
> don't update kernels won't update gcc either, so the solution won't work.

You seem to assume that running a gcc 4.3 compiled binary requires a
gcc update. That is not necessarily true.

-Andi

2008-03-06 12:09:35

by Richard Biener

Subject: Re: Linux doesn't follow x86/x86-64 ABI wrt direction flag

On 06 Mar 2008 12:45:57 +0100, Andi Kleen <[email protected]> wrote:
> Andrew Haley <[email protected]> writes:
> >
> > I suppose one could apply the precautionary principle, but those systems that
> > don't update kernels won't update gcc either, so the solution won't work.
>
> You seem to assume that running a gcc 4.3 compiled binary requires a
> gcc update. That is not necessarily true.

It (sometimes) requires a libgcc and libstdc++ update.

Richard.

2008-03-06 13:51:53

by Olivier Galibert

Subject: Re: RELEASE BLOCKER: Linux doesn't follow x86/x86-64 ABI wrt direction flag

On Wed, Mar 05, 2008 at 05:12:07PM -0800, H. Peter Anvin wrote:
> It's a kernel bug, and it needs to be fixed.

I'm not convinced. It's been that way for 15 years, it's that way in
the BSD kernels, at that point it's a feature. The bug is in the
documentation, nowhere else. And in gcc for blindly trusting the
documentation.

OG.

2008-03-06 14:03:42

by Paolo Bonzini

Subject: Re: RELEASE BLOCKER: Linux doesn't follow x86/x86-64 ABI wrt direction flag

Olivier Galibert wrote:
> On Wed, Mar 05, 2008 at 05:12:07PM -0800, H. Peter Anvin wrote:
>> It's a kernel bug, and it needs to be fixed.
>
> I'm not convinced. It's been that way for 15 years, it's that way in
> the BSD kernels, at that point it's a feature. The bug is in the
> documentation, nowhere else. And in gcc for blindly trusting the
> documentation.

No, the bug *in the kernel* was already present (if you had a signal
raised during a call to memmove). It's just more visible with GCC 4.3.

Paolo

2008-03-06 14:06:32

by Olivier Galibert

Subject: Re: RELEASE BLOCKER: Linux doesn't follow x86/x86-64 ABI wrt direction flag

On Wed, Mar 05, 2008 at 03:21:43PM -0800, David Daney wrote:
> Olivier Galibert wrote:
> >On Wed, Mar 05, 2008 at 10:43:33PM +0100, Michael Matz wrote:
> >>FWIW I don't think it's a release blocker for 4.3.0. The error is arcane
> >>and happens seldom if at all. And only on unfixed kernels. A program
> >>needs to do std explicitly, which most don't do, _and_ get hit by a
> >>signal while being in a std region. This happens so seldom that it
> >>didn't occur in building the next openSuSE 11.0, which has been
> >>continually building packages with 4.3 for months.
> >
> >How would you know whether it has happened?
> >
>
> The same way you do with other bugs: You would observe unexpected behavior.
>
> In this case probably either corrupted memory or a SIGSEGV.

So that means the programs you use for compiling packages probably
aren't hit. It doesn't mean the packages you've compiled with it aren't
hit. Compiling packages doesn't test what's in them at all.

It's extremely rare, no doubt about it. It's just that it *yells*
security issue in the making. It's not a source bug, i.e. not easily
reviewable. It's related to signal handlers which are the mark of a
server and/or more failure-conscious program than usual. It's obscure
(breaking a stringop, probably memset, or a not-paranoid-enough inline
asm in a signal handler through a running memmove in the main program,
oh my) but reasonably predictable for someone looking for an
exploitable flaw.

It's gcc's job to adapt to the realities of its running environment,
not the other way around.

OG.

2008-03-06 14:12:36

by Olivier Galibert

Subject: Re: RELEASE BLOCKER: Linux doesn't follow x86/x86-64 ABI wrt direction flag

On Thu, Mar 06, 2008 at 03:03:15PM +0100, Paolo Bonzini wrote:
> Olivier Galibert wrote:
> >On Wed, Mar 05, 2008 at 05:12:07PM -0800, H. Peter Anvin wrote:
> >>It's a kernel bug, and it needs to be fixed.
> >
> >I'm not convinced. It's been that way for 15 years, it's that way in
> >the BSD kernels, at that point it's a feature. The bug is in the
> >documentation, nowhere else. And in gcc for blindly trusting the
> >documentation.
>
> No, the bug *in the kernel* was already present (if you had a signal
> raised during a call to memmove). It's just more visible with GCC 4.3.

I'm curious, since when did paper documentation become the Truth and
reality become a bug?

OG.

2008-03-06 14:17:19

by Andrew Haley

Subject: Re: RELEASE BLOCKER: Linux doesn't follow x86/x86-64 ABI wrt direction flag

Olivier Galibert wrote:
> On Thu, Mar 06, 2008 at 03:03:15PM +0100, Paolo Bonzini wrote:
>> Olivier Galibert wrote:
>>> On Wed, Mar 05, 2008 at 05:12:07PM -0800, H. Peter Anvin wrote:
>>>> It's a kernel bug, and it needs to be fixed.
>>> I'm not convinced. It's been that way for 15 years, it's that way in
>>> the BSD kernels, at that point it's a feature. The bug is in the
>>> documentation, nowhere else. And in gcc for blindly trusting the
>>> documentation.
>> No, the bug *in the kernel* was already present (if you had a signal
>> raised during a call to memmove). It's just more visible with GCC 4.3.
>
> I'm curious, since when did paper documentation become the Truth and
> reality become a bug?

Isn't that the definition of a bug? That a program does not meet
its specification?

Andrew.

2008-03-06 15:25:37

by H. Peter Anvin

Subject: Re: RELEASE BLOCKER: Linux doesn't follow x86/x86-64 ABI wrt direction flag

Jakub Jelinek wrote:
> On Thu, Mar 06, 2008 at 09:44:05AM +0100, Andi Kleen wrote:
>> "H. Peter Anvin" <[email protected]> writes:
>>
>>> Richard Guenther wrote:
>>>> We haven't run into this issue yet, and we have been building openSUSE
>>>> with 4.3 for more than three months.
>>>>
>>> Well, how often do you take a trap inside an overlapping memmove()?
>> That was the state with older gcc, but with newer gcc it does not necessarily
>> reset the flag before the next function call.

If so, that's a much worse bug.

>> so e.g. if you have
>>
>> memmove(...)
>> for (... very long loop .... ) {
>> /* no function calls */
>> /* signals happen */
>> }
>>
>> the signal could see the direction flag
>
> memmove is supposed to (and does) do a cld insn after it finishes the
> backward copying.

You can still take a signal inside memmove() itself, of course.

-hpa

2008-03-06 15:32:56

by Robert Dewar

Subject: Re: RELEASE BLOCKER: Linux doesn't follow x86/x86-64 ABI wrt direction flag

Olivier Galibert wrote:
> On Wed, Mar 05, 2008 at 05:12:07PM -0800, H. Peter Anvin wrote:
>> It's a kernel bug, and it needs to be fixed.
>
> I'm not convinced. It's been that way for 15 years, it's that way in
> the BSD kernels, at that point it's a feature. The bug is in the
> documentation, nowhere else. And in gcc for blindly trusting the
> documentation.

I agree, it reminds me of Burroughs on the 5500 believing the
Fortran standard, which carefully allowed for a stack-based
implementation of Fortran, Algol-style. Unfortunately no real
Fortran programs worked with these semantics, and it was one of
the factors contributing to the demise of the 5500.
>
> OG.
>

2008-03-06 15:37:36

by NightStrike

Subject: Re: RELEASE BLOCKER: Linux doesn't follow x86/x86-64 ABI wrt direction flag

On 3/6/08, Olivier Galibert <[email protected]> wrote:
> On Wed, Mar 05, 2008 at 05:12:07PM -0800, H. Peter Anvin wrote:
> > It's a kernel bug, and it needs to be fixed.
>
> I'm not convinced. It's been that way for 15 years, it's that way in
> the BSD kernels, at that point it's a feature. The bug is in the
> documentation, nowhere else. And in gcc for blindly trusting the
> documentation.

The issue should not be evaluated as: "It's always been that way,
therefore, it's right." Instead, it should be: "What's the right way
to do it?"

You don't just change documentation because no existing code meets the
requirement -- UNLESS -- the non-conforming code is actually the right
way to do things.

2008-03-06 15:43:32

by H.J. Lu

Subject: Re: RELEASE BLOCKER: Linux doesn't follow x86/x86-64 ABI wrt direction flag

I agree with it. There is no right or wrong here. Let's start from
scratch and figure out what is the best way to handle this, assuming we
are defining a new psABI.

H.J.
On Thu, Mar 6, 2008 at 7:37 AM, NightStrike <[email protected]> wrote:
>
> On 3/6/08, Olivier Galibert <[email protected]> wrote:
> > On Wed, Mar 05, 2008 at 05:12:07PM -0800, H. Peter Anvin wrote:
> > > It's a kernel bug, and it needs to be fixed.
> >
> > I'm not convinced. It's been that way for 15 years, it's that way in
> > the BSD kernels, at that point it's a feature. The bug is in the
> > documentation, nowhere else. And in gcc for blindly trusting the
> > documentation.
>
> The issue should not be evaluated as: "It's always been that way,
> therefore, it's right." Instead, it should be: "What's the right way
> to do it?"
>
> You don't just change documentation because no existing code meets the
> requirement -- UNLESS -- the non-conforming code is actually the right
> way to do things.
>

2008-03-06 15:55:18

by H. Peter Anvin

Subject: Re: RELEASE BLOCKER: Linux doesn't follow x86/x86-64 ABI wrt direction flag

H.J. Lu wrote:
> I agree with it. There is no right or wrong here. Let's start from
> scratch and figure out what is the best way to handle this, assuming we
> are defining a new psABI.

No, I believe the right way to approach this is by applying the good
old-fashioned principle from Ask Mr. Protocol:

Be liberal in what you receive, conservative in what you send

In other words:

a. Fix the kernel. Already in progress.
b. Do *not* make gcc assume DF is clean for now. Adding a
switch would be a useful thing, since if nothing else it
would benefit embedded environments. We might assume
DF is clean on 64 bits, since it appears it is rarely used
anyway, and 64 bits is more important in the long run.
c. Once fixed kernels have been out long enough, we can
flip the default of the switch, one platform at a time if
need be (e.g. there may never be another SCO OpenServer.)

-hpa

2008-03-06 15:57:44

by Robert Dewar

Subject: Re: RELEASE BLOCKER: Linux doesn't follow x86/x86-64 ABI wrt direction flag

NightStrike wrote:
> On 3/6/08, Olivier Galibert <[email protected]> wrote:
>> On Wed, Mar 05, 2008 at 05:12:07PM -0800, H. Peter Anvin wrote:
>>> It's a kernel bug, and it needs to be fixed.
>> I'm not convinced. It's been that way for 15 years, it's that way in
>> the BSD kernels, at that point it's a feature. The bug is in the
>> documentation, nowhere else. And in gcc for blindly trusting the
>> documentation.
>
> The issue should not be evaluated as: "It's always been that way,
> therefore, it's right." Instead, it should be: "What's the right way
> to do it?"
>
> You don't just change documentation because no existing code meets the
> requirement -- UNLESS -- the non-conforming code is actually the right
> way to do things.

Sounds good, but has almost nothing to do with the real world. I
remember back in Realia COBOL days, we had to carefully copy IBM
bugs in the IBM mainframe COBOL compiler. Doing things right and
fixing the bug would have been the right thing to do, but no one
would have used Realia COBOL :-)

Another story, the sad story of the intel chip (I think it was
the 80188) where Intel made use of Int 5, which was documented
as reserved. Unfortunately, Microsoft/IBM had used this for
print screen or some such. Intel was absolutely right that
their documentation was clear and it was wrong to have used
these interrupts .. but the result was a warehouse of unused
chips.

2008-03-06 16:28:24

by İsmail Dönmez

Subject: Re: RELEASE BLOCKER: Linux doesn't follow x86/x86-64 ABI wrt direction flag

Hi,

On Thu, Mar 6, 2008 at 6:23 PM, Jakub Jelinek wrote:
> On Thu, Mar 06, 2008 at 07:50:12AM -0800, H. Peter Anvin wrote:
> > H.J. Lu wrote:
> > >I agree with it. There is no right or wrong here Let's start from
> > >scratch and figure out
> > >what is the best way to handle this, assuming we are defining a new psABI.
>
> BTW, just tested icc and icc doesn't generate cld either (so it matches the
> new gcc behavior).
> char buf1[32], buf2[32];
> void bar (void);
> void foo (void)
> {
> __builtin_memset (buf1, 0, 32);
> bar ();
> __builtin_memset (buf2, 0, 32);
> }

Also LKML discussion pointed out that Solaris gets this right too.

Regards,
ismail

--
Never learn by your mistakes, if you do you may never dare to try again.

2008-03-06 16:29:30

by Artur Skawina

Subject: Re: RELEASE BLOCKER: Linux doesn't follow x86/x86-64 ABI wrt direction flag

Olivier Galibert wrote:
> On Wed, Mar 05, 2008 at 05:12:07PM -0800, H. Peter Anvin wrote:
>> It's a kernel bug, and it needs to be fixed.
>
> I'm not convinced. It's been that way for 15 years, it's that way in
> the BSD kernels, at that point it's a feature. The bug is in the
> documentation, nowhere else. And in gcc for blindly trusting the
> documentation.

well, you could see this either way -- either the kernel is buggy and
needs to be fixed or the current behavior is correct and the abi needs
an errata. If there were no performance implications i'd go for the
latter, mostly because of the security aspect.
But this thread made me dig up an old benchmark and apparently omitting
the cld before the string ops makes a significant difference; on P2 it
was ~8%, on P4 it's ~6% for 1480 byte copies; for 32 byte ones the gain
is more like 90% on a P4 [1].

So the impact on small structure memcpy/memset etc is significant, hence
fixing the kernel looks like a better long term plan.

artur

[1]
P4 # ./bcsp m
IACCK 0.9.29 Artur Skawina <...>
[ exec time; lower is better  ] [speed ] [ time ] [ok?]
TIME-N+S TIME32 TIME33 TIME1480 MBYTES/S TIMEXXXX CSUM FUNCTION ( rdtsc_overhead=0 null=0 )
       0      0      0        0      inf        0 ffff csum_partial_copy_null
    1885    375    389      156  7589.74    39350    0 generic_memcpy
   10894    532    666     1696   698.11   108557    0 kernel_memcpylib
    1804    325    346      151  7841.06    19614    0 kernel_memcpy686
    1804    325    346      151  7841.06    19693    0 kernel_memcpy686ncld
    1744    323    381      148  8000.00    19687    0 kernel_memcpy686as1
    1332    157    232      139  8517.99    19235    0 kernel_memcpy686as1ncld
    1782    318    339      148  8000.00    19607    0 kernel_memcpy686as2
    1371    168    189      139  8517.99    19221    0 kernel_memcpy686as2ncld

P2 # ./bcsp m
IACKK 0.9.28 Artur Skawina <...>
TIME-N+S TIME32 TIME33 TIME1480 MBYTES/S TIMEXXXX  CKSUM FUNCTION ( rdtsc_overhead=1 null=0 )
       0      0      0        0      inf        0 : ffff csum_partial_copy_null
    7121    746   1215      730  1621.92   127418 :    0 generic_memcpy
   43604   2032   1709     6574   180.10   416409 :    0 kernel_memcpylib
    7480    771    726      684  1730.99    96084 :    0 kernel_memcpy686
    7036    735    543      685  1728.47    95508 :    0 kernel_memcpy686ncld
    7498   1015    711      716  1653.63    92200 :    0 kernel_memcpy686as1
    5826    438    489      662  1788.52    91598 :    0 kernel_memcpy686as1ncld
    6667    657    488      708  1672.32    89366 :    0 kernel_memcpy686as2
    6614    456    270      658  1799.39    91203 :    0 kernel_memcpy686as2ncld
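
For readers who want to see what is being compared, the two flavours of string copy can be sketched roughly as follows. This is not the code from the bcsp benchmark above; the names copy_with_cld and copy_without_cld are made up for illustration, and the memcpy() fallback exists only so the sketch builds on non-x86 targets. The second variant is correct only if DF is already clear on entry, which is what the psABI promises and what an unfixed kernel does not guarantee inside a signal handler.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy n bytes with "rep movsb", clearing DF first.
 * This mirrors what gcc before 4.3 did in front of string instructions. */
void copy_with_cld(void *dst, const void *src, size_t n)
{
#if defined(__x86_64__) || defined(__i386__)
    __asm__ volatile("cld\n\t"
                     "rep movsb"
                     : "+D"(dst), "+S"(src), "+c"(n)
                     :
                     : "memory", "cc");
#else
    memcpy(dst, src, n);   /* portable fallback, no direction flag involved */
#endif
}

/* Hypothetical helper: same copy, but trusting the ABI rule that DF is
 * clear at function entry. If DF happens to be set, this walks memory
 * backwards and copies the wrong bytes. */
void copy_without_cld(void *dst, const void *src, size_t n)
{
#if defined(__x86_64__) || defined(__i386__)
    __asm__ volatile("rep movsb"
                     : "+D"(dst), "+S"(src), "+c"(n)
                     :
                     : "memory");
#else
    memcpy(dst, src, n);
#endif
}
```

In a normal call chain both functions give the same result; the only difference is the extra cld, which is the cost the "ncld" rows in the tables above leave out.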

2008-03-06 16:30:21

by Paolo Bonzini

Subject: Re: RELEASE BLOCKER: Linux doesn't follow x86/x86-64 ABI wrt direction flag


> Another story, the sad story of the intel chip (I think it was
> the 80188) where Intel made use of Int 5, which was documented
> as reserved. Unfortunately, Microsoft/IBM had used this for
> print screen or some such. Intel was absolutely right that
> their documentation was clear and it was wrong to have used
> these interrupts .. but the result was a warehouse of unused
> chips.

Not really. Just, no one used the BOUND instruction. All computers
running DOS (Intel, AMD, even the old NEC V20/V30 chips) still connect
INT 5 to Print Screen.

Paolo

2008-03-06 16:40:35

by Jakub Jelinek

Subject: Re: RELEASE BLOCKER: Linux doesn't follow x86/x86-64 ABI wrt direction flag

On Thu, Mar 06, 2008 at 07:50:12AM -0800, H. Peter Anvin wrote:
> H.J. Lu wrote:
> >I agree with it. There is no right or wrong here Let's start from
> >scratch and figure out
> >what is the best way to handle this, assuming we are defining a new psABI.

BTW, just tested icc and icc doesn't generate cld either (so it matches the
new gcc behavior).
char buf1[32], buf2[32];
void bar (void);
void foo (void)
{
  __builtin_memset (buf1, 0, 32);
  bar ();
  __builtin_memset (buf2, 0, 32);
}

Jakub

2008-03-06 16:59:21

by H.J. Lu

Subject: Re: RELEASE BLOCKER: Linux doesn't follow x86/x86-64 ABI wrt direction flag

On Thu, Mar 6, 2008 at 8:23 AM, Jakub Jelinek wrote:
> On Thu, Mar 06, 2008 at 07:50:12AM -0800, H. Peter Anvin wrote:
> > H.J. Lu wrote:
> > >I agree with it. There is no right or wrong here Let's start from
> > >scratch and figure out
> > >what is the best way to handle this, assuming we are defining a new psABI.
>
> BTW, just tested icc and icc doesn't generate cld either (so it matches the
> new gcc behavior).
> char buf1[32], buf2[32];
> void bar (void);
> void foo (void)
> {
> __builtin_memset (buf1, 0, 32);
> bar ();
> __builtin_memset (buf2, 0, 32);
> }
>

Icc follows the psABI. If we are saying icc/gcc 4.3 need a fix, we'd
better define
a new psABI first.

H.J.

2008-03-06 17:14:22

by H.J. Lu

Subject: Re: RELEASE BLOCKER: Linux doesn't follow x86/x86-64 ABI wrt direction flag

On Thu, Mar 6, 2008 at 9:06 AM, H. Peter Anvin wrote:
>
> H.J. Lu wrote:
> > On Thu, Mar 6, 2008 at 8:23 AM, Jakub Jelinek wrote:
> >> On Thu, Mar 06, 2008 at 07:50:12AM -0800, H. Peter Anvin wrote:
> >> > H.J. Lu wrote:
> >> > >I agree with it. There is no right or wrong here Let's start from
> >> > >scratch and figure out
> >> > >what is the best way to handle this, assuming we are defining a new psABI.
> >>
> >> BTW, just tested icc and icc doesn't generate cld either (so it matches the
> >> new gcc behavior).
> >> char buf1[32], buf2[32];
> >> void bar (void);
> >> void foo (void)
> >> {
> >> __builtin_memset (buf1, 0, 32);
> >> bar ();
> >> __builtin_memset (buf2, 0, 32);
> >> }
> >>
> >
> > Icc follows the psABI. If we are saying icc/gcc 4.3 need a fix, we'd
> > better define
> > a new psABI first.
> >
>
> Not a fix, an (optional) workaround for a system bug.
>

So that is the bug in the Linux kernel. Since fixing kernel is much easier
than providing a workaround in compilers, I think kernel should be fixed
and no need for icc/gcc fix.


H.J.

2008-03-06 17:14:44

by H. Peter Anvin

Subject: Re: RELEASE BLOCKER: Linux doesn't follow x86/x86-64 ABI wrt direction flag

H.J. Lu wrote:
> On Thu, Mar 6, 2008 at 8:23 AM, Jakub Jelinek wrote:
>> On Thu, Mar 06, 2008 at 07:50:12AM -0800, H. Peter Anvin wrote:
>> > H.J. Lu wrote:
>> > >I agree with it. There is no right or wrong here Let's start from
>> > >scratch and figure out
>> > >what is the best way to handle this, assuming we are defining a new psABI.
>>
>> BTW, just tested icc and icc doesn't generate cld either (so it matches the
>> new gcc behavior).
>> char buf1[32], buf2[32];
>> void bar (void);
>> void foo (void)
>> {
>> __builtin_memset (buf1, 0, 32);
>> bar ();
>> __builtin_memset (buf2, 0, 32);
>> }
>>
>
> Icc follows the psABI. If we are saying icc/gcc 4.3 need a fix, we'd
> better define
> a new psABI first.
>

Not a fix, an (optional) workaround for a system bug.

-hpa

2008-03-06 17:18:32

by Robert Dewar

Subject: Re: RELEASE BLOCKER: Linux doesn't follow x86/x86-64 ABI wrt direction flag

H.J. Lu wrote:

> So that is the bug in the Linux kernel. Since fixing kernel is much easier
> than providing a workaround in compilers, I think kernel should be fixed
> and no need for icc/gcc fix.

Fixing a bug in the Linux kernel is not "much easier". You are taking
a purely engineering viewpoint, but life is not like that. There are
lots of copies of Linux kernels around and in use. The issue is not
fixing the kernel per se, it is propagating that change to all
Linux kernels in use -- THAT'S another matter entirely, and is
far far more difficult than making sure that a kernel fix is
qualified and widely propagated.

>
>
> H.J.

2008-03-06 17:28:56

by H. Peter Anvin

Subject: Re: RELEASE BLOCKER: Linux doesn't follow x86/x86-64 ABI wrt direction flag

Robert Dewar wrote:
> H.J. Lu wrote:
>
>> So that is the bug in the Linux kernel. Since fixing kernel is much
>> easier
>> than providing a workaround in compilers, I think kernel should be fixed
>> and no need for icc/gcc fix.
>
> Fixing a bug in the Linux kernel is not "much easier". You are taking
> a purely engineering viewpoint, but life is not like that. There are
> lots of copies of Linux kernels around and in use. The issue is not
> fixing the kernel per se, it is propagating that change to all
> Linux kernels in use -- THAT'S another matter entirely, and is
> far far more difficult than making sure that a kernel fix is
> qualified and widely propagated.
>

Not really, it's just a matter of time. Typical distro cycles are on
the order of 3 years.

-hpa

2008-03-06 17:29:20

by H. Peter Anvin

Subject: Re: RELEASE BLOCKER: Linux doesn't follow x86/x86-64 ABI wrt direction flag

H.J. Lu wrote:
>>
>> Not a fix, an (optional) workaround for a system bug.
>
> So that is the bug in the Linux kernel. Since fixing kernel is much easier
> than providing a workaround in compilers, I think kernel should be fixed
> and no need for icc/gcc fix.
>

The problem is, you're going to have to be able to produce binaries
compatible with old kernels for a *long* time to come. Are you
honestly saying you'll tell those people "use gcc 4.2 or earlier"? If
so, I think most distros will have to freeze gcc for the next several years.

-hpa

2008-03-06 17:34:22

by H. Peter Anvin

Subject: Re: RELEASE BLOCKER: Linux doesn't follow x86/x86-64 ABI wrt direction flag

Robert Dewar wrote:
>
> Sounds good, but has almost nothing to do with the real world. I
> remember back in Realia COBOL days, we had to carefully copy IBM
> bugs in the IBM mainframe COBOL compiler. Doing things right and
> fixing the bug would have been the right thing to do, but no one
> would have used Realia COBOL :-)
>
> Another story, the sad story of the intel chip (I think it was
> the 80188) where Intel made use of Int 5, which was documented
> as reserved. Unfortunately, Microsoft/IBM had used this for
> print screen or some such. Intel was absolutely right that
> their documentation was clear and it was wrong to have used
> these interrupts .. but the result was a warehouse of unused
> chips.

IBM used it for print screen (and other calls), because Microsoft
cassette BASIC used all the non-reserved INT instructions as byte codes
(they cut it down to *only* half the interrupt vectors in the disk version.)

We're still stuck with the consequences of that hack.

-hpa

2008-03-06 17:35:28

by H.J. Lu

Subject: Re: RELEASE BLOCKER: Linux doesn't follow x86/x86-64 ABI wrt direction flag

On Thu, Mar 6, 2008 at 9:17 AM, H. Peter Anvin wrote:
> H.J. Lu wrote:
> >>
> >> Not a fix, an (optional) workaround for a system bug.
> >
> > So that is the bug in the Linux kernel. Since fixing kernel is much easier
> > than providing a workaround in compilers, I think kernel should be fixed
> > and no need for icc/gcc fix.
> >
>
> The problem is, you're going to have to be able to produce binaries
> compatible with old kernels for a *long* time to come. Are you
> honestly saying you'll tell those people "use gcc 4.2 or earlier"? If
> so, I think most distros will have to freeze gcc for the next several years.
>

Icc has been following psABI for years on Linux and it doesn't stop people
using icc on Linux. On the other hand, it may be a good idea to provide a
workaround in gcc and enable it by default. OSVs can fix the kernel and
disable it by default.

We can even issue a message whenever the workaround is used.

H.J.

2008-03-06 17:35:55

by Joe Buck

Subject: Re: Linux doesn't follow x86/x86-64 ABI wrt direction flag

On Thu, Mar 06, 2008 at 01:06:17PM +0100, Richard Guenther wrote:
> On 06 Mar 2008 12:45:57 +0100, Andi Kleen wrote:
> > Andrew Haley writes:
> > >
> > > I suppose one could apply the precautionary principle, but those systems that
> > > don't update kernels won't update gcc either, so the solution won't work.
> >
> > You seem to assume that running a gcc 4.3 compiled binary requires a
> > gcc update. That is not necessarily true.
>
> It (sometimes) requires a libgcc and libstdc++ update.

"Sometimes" is correct; many users commonly run newer compilers on older
distros, with LD_LIBRARY_PATH set to pick up the correct C++ support
library. This is particularly common on servers, where you don't want to
mess with a working system but you might need to run newer code.

So, we've been arguing for a while, so the question is what to do.

Using a principle based on the old IETF concept of being liberal in what
you accept, and conservative in what you send, I think that both the Linux
kernel and gcc should fix the problem. The kernel should fix the
information leak, and gcc should remove the assumption that the direction
flag is set in a given direction on function entry.

The gcc patch will be too late for 4.3.0, but it would be on the 4.3
branch, and we would recommend that distros pick it up for any compilers
they ship.
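
Until both fixes are widely deployed, code that has to run on unfixed kernels can also protect itself. What follows is a minimal sketch under stated assumptions, not a patch from this thread: a handler that clears DF by hand before doing anything that might compile to a string instruction. The names careful_handler, scratch and got_signal are hypothetical, and the inline asm is only emitted on x86 targets.

```c
#include <signal.h>
#include <string.h>

/* Hypothetical state touched by the handler. */
volatile sig_atomic_t got_signal;
char scratch[64];

/* Signal handler that does not trust DF to be clear on entry.
 * The explicit cld restores the state the psABI promises, so the
 * memset() below is safe even if the interrupted code was running
 * with DF set (for example in the middle of a backwards memmove). */
void careful_handler(int signo)
{
    (void)signo;
#if defined(__x86_64__) || defined(__i386__)
    /* "memory" keeps the compiler from moving the stores below
     * ahead of the cld. */
    __asm__ volatile("cld" ::: "cc", "memory");
#endif
    memset(scratch, 0, sizeof scratch);
    got_signal = 1;
}
```

Installing it with signal(SIGUSR1, careful_handler) is enough to try it out. On a fixed kernel the cld is redundant but harmless, which is the "liberal in what you accept" half of the argument.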

2008-03-06 17:59:01

by Joe Buck

Subject: Re: RELEASE BLOCKER: Linux doesn't follow x86/x86-64 ABI wrt direction flag

On Thu, Mar 06, 2008 at 03:12:21PM +0100, Olivier Galibert wrote:
> On Thu, Mar 06, 2008 at 03:03:15PM +0100, Paolo Bonzini wrote:
> > Olivier Galibert wrote:
> > >On Wed, Mar 05, 2008 at 05:12:07PM -0800, H. Peter Anvin wrote:
> > >>It's a kernel bug, and it needs to be fixed.
> > >
> > >I'm not convinced. It's been that way for 15 years, it's that way in
> > >the BSD kernels, at that point it's a feature. The bug is in the
> > >documentation, nowhere else. And in gcc for blindly trusting the
> > >documentation.
> >
> > No, the bug *in the kernel* was already present (if you had a signal
> > raised during a call to memmove). It's just more visible with GCC 4.3.
>
> I'm curious, since when paper documentation became the Truth and
> reality became a bug?

If the kernel allows state to leak from one process to another,
for example from a process running as root to a process running as an
ordinary user, it's a bug, with possible security implications.

In this particular case not much can be communicated through a one-bit
flag, so it would only be relevant in those situations where you want
to forbid any communication channels from a given process. So the
kernel developers might consider it a trivial bug. Or, they could just
fix it, which I understand is the plan.

2008-03-06 18:10:56

by Olivier Galibert

Subject: Re: RELEASE BLOCKER: Linux doesn't follow x86/x86-64 ABI wrt direction flag

On Thu, Mar 06, 2008 at 09:58:41AM -0800, Joe Buck wrote:
> If the kernel allows state to leak from one process to another,
> for example from a process running as root to a process running as an
> ordinary user, it's a bug, with possible security implications.

I don't think that it is relevant in your case. If you have the
signal handler in something that does not share the VM with the
interrupted thread, you will have a context switch which is supposed
to store the direction flag and restore the one from the handling
thread. If you share the VM there is no context switch but you have
access to the exact same memory with the exact same rights, making the
leak irrelevant.

OG.

2008-03-06 18:15:35

by Paolo Bonzini

Subject: Re: RELEASE BLOCKER: Linux doesn't follow x86/x86-64 ABI wrt direction flag

Olivier Galibert wrote:
> On Thu, Mar 06, 2008 at 09:58:41AM -0800, Joe Buck wrote:
>> If the kernel allows state to leak from one process to another,
>> for example from a process running as root to a process running as an
>> ordinary user, it's a bug, with possible security implications.
>
> I don't think that it is relevant in your case. If you have the
> signal handler in something that does not share the VM with the
> interrupted thread, you will have a context switch which is supposed
> to store the direction flag and restore the one from the handling
> thread. If you share the VM there is no context switch but you have
> access to the exact same memory with the exact same rights, making the
> leak irrelevant.

A process can send a signal via kill. IOW, a malicious process can
*control when the process would be interrupted* in order to get it into
the signal handler with DF=1.

Paolo

2008-03-06 18:35:32

by Andrew Pinski

Subject: Re: RELEASE BLOCKER: Linux doesn't follow x86/x86-64 ABI wrt direction flag

On 3/6/08, Jack Lloyd wrote:
> On Thu, Mar 06, 2008 at 07:13:20PM +0100, Paolo Bonzini wrote:
> > A process can send a signal via kill. IOW, a malicious process can
> > *control when the process would be interrupted* in order to get it into
> > the signal handler with DF=1.
>
> If the malicious process can send a signal to another process, it
> could also ptrace() it. Which is more useful, if you wanted to be
> malicious?

And more to the point, it can happen before GCC 4.3.0. So why does
GCC have to do something that just happens more often now? I still don't
see why we have to work around a bug in the kernel which could show up
before GCC 4.3.0.

-- Pinski

2008-03-06 18:59:45

by Jack Lloyd

Subject: Re: RELEASE BLOCKER: Linux doesn't follow x86/x86-64 ABI wrt direction flag

On Thu, Mar 06, 2008 at 07:13:20PM +0100, Paolo Bonzini wrote:
> A process can send a signal via kill. IOW, a malicious process can
> *control when the process would be interrupted* in order to get it into
> the signal handler with DF=1.

If the malicious process can send a signal to another process, it
could also ptrace() it. Which is more useful, if you wanted to be
malicious?

Jack

2008-03-06 19:26:30

by Robert Dewar

Subject: Re: RELEASE BLOCKER: Linux doesn't follow x86/x86-64 ABI wrt direction flag

H. Peter Anvin wrote:
> Robert Dewar wrote:
>> H.J. Lu wrote:
>>
>>> So that is the bug in the Linux kernel. Since fixing kernel is much
>>> easier
>>> than providing a workaround in compilers, I think kernel should be fixed
>>> and no need for icc/gcc fix.
>> Fixing a bug in the Linux kernel is not "much easier". You are taking
>> a purely engineering viewpoint, but life is not like that. There are
>> lots of copies of Linux kernels around and in use. The issue is not
>> fixing the kernel per se, it is propagating that change to all
>> Linux kernels in use -- THAT'S another matter entirely, and is
>> far far more difficult than making sure that a kernel fix is
>> qualified and widely propagated.
>>
>
> Not really, it's just a matter of time. Typical distro cycles are on
> the order of 3 years.
>
> -hpa

again, in the real world, there are MANY projects that are nothing
like this interactive when it comes to moving to new versions of
operating systems.

2008-03-06 19:35:46

by Robert Dewar

Subject: Re: RELEASE BLOCKER: Linux doesn't follow x86/x86-64 ABI wrt direction flag

H.J. Lu wrote:

> Icc has been following psABI for years on Linux and it doesn't stop people
> using icc on Linux. On the other hand, it may be a good idea to provide a
> workaround in gcc and enable it by default. OSVs can fix the kernel and
> disable it by default.

How widely is icc used? I ask because we have not encountered one
customer using icc. We have huge numbers of customers using gcc, and
many using proprietary compilers from Sun, DEC etc, but never an icc
user?
>
> We can even issue a message whenever the workaround is used.
>
> H.J.

2008-03-06 19:43:42

by Paolo Bonzini

Subject: Re: RELEASE BLOCKER: Linux doesn't follow x86/x86-64 ABI wrt direction flag

Jack Lloyd wrote:
> On Thu, Mar 06, 2008 at 07:13:20PM +0100, Paolo Bonzini wrote:
>> A process can send a signal via kill. IOW, a malicious process can
>> *control when the process would be interrupted* in order to get it into
>> the signal handler with DF=1.
>
> If the malicious process can send a signal to another process, it
> could also ptrace() it. Which is more useful, if you wanted to be
> malicious?

1) capabilities(7)

2) sometimes setuid programs send signals (e.g. SIGHUP or SIGUSR1)...

Paolo

2008-03-06 19:45:59

by Paolo Bonzini

Subject: Re: RELEASE BLOCKER: Linux doesn't follow x86/x86-64 ABI wrt direction flag

>> If the malicious process can send a signal to another process, it
>> could also ptrace() it. Which is more useful, if you wanted to be
>> malicious?
>
> And more to the point, it can happen before GCC 4.3.0.

Yes, and that's why the kernel should just fix it, and the fix should be
backported and treated like any other security fix.

Paolo

2008-03-06 20:16:41

by Jack Lloyd

Subject: Re: RELEASE BLOCKER: Linux doesn't follow x86/x86-64 ABI wrt direction flag

On Thu, Mar 06, 2008 at 08:43:27PM +0100, Paolo Bonzini wrote:
> Jack Lloyd wrote:
> >On Thu, Mar 06, 2008 at 07:13:20PM +0100, Paolo Bonzini wrote:
> >>A process can send a signal via kill. IOW, a malicious process can
> >>*control when the process would be interrupted* in order to get it into
> >>the signal handler with DF=1.
> >
> >If the malicious process can send a signal to another process, it
> >could also ptrace() it. Which is more useful, if you wanted to be
> >malicious?
>
> 1) capabilities(7)

Ah you are right, I misinterpreted something from the man page
("non-root processes cannot trace processes that they cannot send
signals to") to mean something it did not (basically, that CAP_KILL
implied CAP_SYS_PTRACE, which from reading the kernel source is
clearly not the case...)

But still: so the threat here is of a malicious process with the
ability to send arbitrary signals to any process using CAP_KILL (since
in any other case when a process can send a signal, it can do much
more damage in other ways), which could leverage that into
(potentially) uid==0 using misexecuted code in a signal handler.

As a correctness issue, obviously this should be fixed/patched around,
if feasible. But as a security flaw? I'm not seeing much that is
compelling.

> 2) sometimes setuid programs send signals (e.g. SIGHUP or SIGUSR1)

I don't understand how this is a problem - unless these setuid
programs, while not malicious, can be tricked into signalling a
process they did not intend to. (In which case they already have a
major bug, df bit being cleared or not).

-Jack

2008-03-06 20:43:37

by H. Peter Anvin

Subject: Re: RELEASE BLOCKER: Linux doesn't follow x86/x86-64 ABI wrt direction flag

Robert Dewar wrote:
>>
>> Not really, it's just a matter of time. Typical distro cycles are on
>> the order of 3 years.
>>
>> -hpa
>
> again, in the real world, there are MANY projects that are nothing
> like this interactive when it comes to moving to new versions of
> operating systems.

This is true, but beyond a certain point projects generally accept that
they have to monitor their toolchain dependencies.

-hpa

2008-03-06 20:56:27

by Richard Biener

Subject: Re: Linux doesn't follow x86/x86-64 ABI wrt direction flag

On Thu, Mar 6, 2008 at 6:34 PM, Joe Buck wrote:
>
> On Thu, Mar 06, 2008 at 01:06:17PM +0100, Richard Guenther wrote:
> > On 06 Mar 2008 12:45:57 +0100, Andi Kleen wrote:
> > > Andrew Haley writes:
> > > >
> > > > I suppose one could apply the precautionary principle, but those systems that
> > > > don't update kernels won't update gcc either, so the solution won't work.
> > >
> > > You seem to assume that running a gcc 4.3 compiled binary requires a
> > > gcc update. That is not necessarily true.
> >
> > It (sometimes) requires a libgcc and libstdc++ update.
>
> "Sometimes" is correct; many users commonly run newer compilers on older
> distros, with LD_LIBRARY_PATH set to pick up the correct C++ support
> library. This is particularly common on servers, where you don't want to
> mess with a working system but you might need to run newer code.
>
> So, we've been arguing for a while, so the question is what to do.
>
> Using a principle based on the old IETF concept of being liberal in what
> you accept, and conservative in what you send, I think that both the Linux
> kernel and gcc should fix the problem. The kernel should fix the
> information leak, and gcc should remove the assumption that the direction
> flag is set in a given direction on function entry.
>
> The gcc patch will be too late for 4.3.0, but it would be on the 4.3
> branch, and we would recommend that distros pick it up for any compilers
> they ship.

A patched GCC IMHO only makes sense if it is always-on; yet another option
won't help in corner cases. And corner cases are exactly what people seem
to care about. For this reason, the fact that we have this single release,
4.3.0, that behaves "bad" is already a problem.

Richard.

2008-03-06 21:01:04

by H. Peter Anvin

Subject: Re: Linux doesn't follow x86/x86-64 ABI wrt direction flag

Richard Guenther wrote:
>
> A patched GCC IMHO makes only sense if it is always-on, yet another option
> won't help in corner cases. And corner cases is exactly what people seem
> to care about. For this reason that we have this single release, 4.3.0, that
> behaves "bad" is already a problem.
>

The option will help embedded vendors who can guarantee that it's not a
problem.

-hpa

2008-03-06 21:38:31

by Artur Skawina

Subject: Re: RELEASE BLOCKER: Linux doesn't follow x86/x86-64 ABI wrt direction flag

Jack Lloyd wrote:
> But still: so the threat here is of a malicious process with the
> ability to send arbitrary signals to any process using CAP_KILL (since
> in any other case when a process can send a signal, it can do much
> more damage in other ways), which could leverage that into
> (potentially) uid==0 using misexecuted code in a signal handler.
>
> As a correctness issue, obviously this should be fixed/patched around,
> if feasible. But as a security flaw? I'm not seeing much that is
> compelling.
>
>> 2) sometimes setuid programs send signals (e.g. SIGHUP or SIGUSR1)
>
> I don't understand how this is a problem - unless these setuid
> programs, while not malicious, can be tricked into signalling a
> process they did not intend to. (In which case they already have a
> major bug, df bit being cleared or not).

think apps keeping crypto keys etc in ram and wiping them from signal
handlers. eg gnupg does this; fortunately it seems to have moved from
memset() to an open-coded solution, so probably isn't affected. OTOH
it wouldn't surprise me these days if the compiler would emit string
ops even w/o an explicit mem* call.
Copying a private memory region to some public buffer could also lead
to interesting results...
IOW being able to avoid a memset (or copying the wrong data) certainly
could have security consequences.

artur
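
To make the "open coded" alternative concrete, here is a minimal sketch of a wipe routine that does not rely on string instructions. It is an illustration only and is not taken from gnupg; the name wipe_bytes is hypothetical. The assumption is that stores through a volatile pointer are emitted as individual byte stores rather than being folded into a rep stos, so the result should not depend on the state of DF.

```c
#include <stddef.h>

/* Hypothetical helper: zero len bytes at buf one byte at a time.
 * The volatile qualifier forces every store to be performed as
 * written, so the compiler should neither drop the loop nor replace
 * it with a memset() call or a string instruction. */
void wipe_bytes(void *buf, size_t len)
{
    volatile unsigned char *p = buf;

    while (len--)
        *p++ = 0;
}
```

A handler would call wipe_bytes(key, sizeof key) instead of memset(key, 0, sizeof key). The price is a slower, byte-wise loop, which rarely matters for key-sized buffers.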

2008-03-06 22:04:50

by Andi Kleen

Subject: Re: Linux doesn't follow x86/x86-64 ABI wrt direction flag

On Thu, Mar 06, 2008 at 12:56:16PM -0800, H. Peter Anvin wrote:
> Richard Guenther wrote:
> >
> >A patched GCC IMHO makes only sense if it is always-on, yet another option
> >won't help in corner cases. And corner cases is exactly what people seem
> >to care about. For this reason that we have this single release, 4.3.0,
> >that
> >behaves "bad" is already a problem.
> >
>
> The option will help embedded vendors who can guarantee that it's not a
> problem.

For very very low values of "help".

To be realistic it is very unlikely anybody will measure a difference
from a few more or a few less clds in a program. It's not that they're
expensive instructions and they normally don't happen in inner loops either.

"If you enable this option you will get an optimization that you cannot
measure" @)

-Andi

2008-03-07 04:56:24

by Chris Lattner

Subject: Re: Linux doesn't follow x86/x86-64 ABI wrt direction flag


On Mar 6, 2008, at 2:06 PM, Andi Kleen wrote:

> On Thu, Mar 06, 2008 at 12:56:16PM -0800, H. Peter Anvin wrote:
>> Richard Guenther wrote:
>>>
>>> A patched GCC IMHO makes only sense if it is always-on, yet
>>> another option
>>> won't help in corner cases. And corner cases is exactly what
>>> people seem
>>> to care about. For this reason that we have this single release,
>>> 4.3.0,
>>> that
>>> behaves "bad" is already a problem.
>>>
>>
>> The option will help embedded vendors who can guarantee that it's
>> not a
>> problem.
>
> For very very low values of "help".
>
> To be realistic it is very unlikely anybody will measure a difference
> from a few more or a few less clds in a program. It's not that they're
> expensive instructions

They aren't? According to
http://www.agner.org/optimize/instruction_tables.pdf, they have a
latency of 52 cycles on at least one popular x86 chip.

-Chris

2008-03-07 08:00:33

by Andreas Jaeger

Subject: Re: RELEASE BLOCKER: Linux doesn't follow x86/x86-64 ABI wrt direction flag

"H. Peter Anvin" <[email protected]> writes:

> Robert Dewar wrote:
>> H.J. Lu wrote:
>>
>>> So that is the bug in the Linux kernel. Since fixing kernel is much
>>> easier
>>> than providing a workaround in compilers, I think kernel should be fixed
>>> and no need for icc/gcc fix.
>>
>> Fixing a bug in the Linux kernel is not "much easier". You are taking
>> a purely engineering viewpoint, but life is not like that. There are
>> lots of copies of Linux kernels around and in use. The issue is not
>> fixing the kernel per se, it is propagating that change to all
>> Linux kernels in use -- THAT'S another matter entirely, and is
>> far far more difficult than making sure that a kernel fix is
>> qualified and widely propagated.
>>
>
> Not really, it's just a matter of time. Typical distro cycles are on
> the order of 3 years.

But distros release fixes regularly for their kernels - and adding a fix
for this issue with their next security update is something that is
possible for distros (at least for openSUSE ;-),

Andreas
--
Andreas Jaeger, Director Platform / openSUSE
SUSE LINUX Products GmbH, GF: Markus Rex, HRB 16746 (AG Nürnberg)
Maxfeldstr. 5, 90409 Nürnberg, Germany
GPG fingerprint = 93A3 365E CE47 B889 DF7F FED1 389A 563C C272 A126



2008-03-07 08:30:11

by Florian Weimer

Subject: Re: RELEASE BLOCKER: Linux doesn't follow x86/x86-64 ABI wrt direction flag

* Robert Dewar:

> again, in the real world, there are MANY projects that are nothing
> like this interactive when it comes to moving to new versions of
> operating systems.

Sure, but how many of those get to see software compiled with GCC 4.3?

If this has any real impact, it's more likely to show up in current
systems, where kernel, libc and GCC are all regularly updated, and GCC
happens to receive an update before the kernel.

2008-03-07 14:09:56

by Michael Matz

Subject: Re: Linux doesn't follow x86/x86-64 ABI wrt direction flag

Hi,

On Thu, 6 Mar 2008, Andi Kleen wrote:

> To be realistic it is very unlikely anybody will measure a difference
> from a few more or a few less clds in a program.

Only an assumption, and in fact wrong. See upthread for a benchmark.
IIRC Uros also made measurements to justify the removal of cld (on P4 I
think), where it helps tremendously on small memcpy loops.
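
For anyone who wants to try this locally, a rough sketch of such a
measurement follows. It is not the benchmark from upthread nor Uros's
test; small_copy() and time_small_copies() are names made up for
illustration, the 16-byte size is an arbitrary assumption, and the branch
on with_cld inside the loop adds some noise of its own.

```c
#define _POSIX_C_SOURCE 200809L
#include <stddef.h>
#include <string.h>
#include <time.h>

/* Copy n bytes with "rep movsb"; with_cld selects whether cld is issued
 * first. On non-x86 targets this falls back to memcpy(). */
static void small_copy(void *dst, const void *src, size_t n, int with_cld)
{
#if defined(__i386__) || defined(__x86_64__)
    if (with_cld)
        __asm__ __volatile__("cld\n\trep movsb"
                             : "+D"(dst), "+S"(src), "+c"(n)
                             :
                             : "memory", "cc");
    else
        __asm__ __volatile__("rep movsb"
                             : "+D"(dst), "+S"(src), "+c"(n)
                             :
                             : "memory", "cc");
#else
    (void)with_cld;
    memcpy(dst, src, n);
#endif
}

/* Time `iterations` copies of 16 bytes; returns elapsed nanoseconds. */
static long long time_small_copies(long iterations, int with_cld)
{
    char src[16] = "0123456789abcde";
    char dst[16];
    struct timespec start, end;
    long i;

    clock_gettime(CLOCK_MONOTONIC, &start);
    for (i = 0; i < iterations; i++)
        small_copy(dst, src, sizeof dst, with_cld);
    clock_gettime(CLOCK_MONOTONIC, &end);

    return (end.tv_sec - start.tv_sec) * 1000000000LL
           + (end.tv_nsec - start.tv_nsec);
}
```

Comparing time_small_copies(n, 1) against time_small_copies(n, 0) for a
large n gives a first impression; the result depends heavily on the CPU.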


Ciao,
Michael.

2008-03-08 19:12:01

by Alexandre Oliva

Subject: Re: RELEASE BLOCKER: Linux doesn't follow x86/x86-64 ABI wrt direction flag

On Mar 6, 2008, Olivier Galibert <[email protected]> wrote:

> It's extremely rare, no doubt about it. It's just that it *yells*
> security issue in the making. It's not a source bug, i.e. not easily
> reviewable. It's related to signal handlers which are the mark of a
> server and/or more failure-conscious program than usual. It's obscure
> (breaking a stringop, probably memset, or a not-paranoid-enough inline
> asm in a signal handler through a running memmove in the main program,
> oh my) but reasonably predictable for someone looking for an
> exploitable flaw.

> It's gcc's job to adapt to the realities of its running environment,
> not the other way around.

I smell a false dilemma here.

The problem doesn't have to be fixed/worked-around in either the
kernel or GCC. Per your argument, one might claim it's the userland
library's, or even the application's job to adapt to the realities of
its running environment.

GCC doesn't know what functions are signal handlers to insert cld in
them. How could it fix the problem, then? How could it possibly fix
custom assembly? How could it possibly fix object code containing
signal handlers, compiled by other compilers?

A userland system library, in theory, knows what functions are signal
handlers. It could wrap function pointers passed as arguments to
signal() such that they get cld. But then, applications that couldn't
care less about this would take a hit.
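
A minimal sketch of what such a wrapper might look like, assuming a
hypothetical install_with_cld() entry point rather than any real libc
interface. The table size, the trampoline and the demo handler are all
invented for illustration, and the sketch uses sigaction() internally
instead of signal():

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>

#define MAX_SIGNALS 65 /* assumption: enough for signals 1..64 on Linux */

typedef void (*handler_fn)(int);

/* Real handlers, indexed by signal number (hypothetical bookkeeping). */
static handler_fn volatile real_handlers[MAX_SIGNALS];

/* Installed in place of the user's handler: clear DF, then forward. */
static void cld_trampoline(int signo)
{
    handler_fn real;

#if defined(__i386__) || defined(__x86_64__)
    __asm__ __volatile__("cld" ::: "cc");
#endif
    real = real_handlers[signo];
    if (real)
        real(signo);
}

/* Hypothetical wrapper: returns 0 on success, -1 on error. */
static int install_with_cld(int signo, handler_fn handler)
{
    struct sigaction sa;

    if (signo <= 0 || signo >= MAX_SIGNALS || handler == NULL)
        return -1;

    real_handlers[signo] = handler; /* publish before the trampoline can run */

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = cld_trampoline;
    sigemptyset(&sa.sa_mask);
    return sigaction(signo, &sa, NULL);
}

/* Demo handler that only counts deliveries. */
static volatile sig_atomic_t demo_hits;

static void demo_handler(int signo)
{
    (void)signo;
    demo_hits++;
}
```

After install_with_cld(SIGUSR1, demo_handler), every delivery of SIGUSR1
goes through the extra cld and an indirect call, which is the hit
mentioned above.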

Applications, on the other hand, know when they might need cld. So,
per your argument, they should adapt to the realities of their running
environment, and add asm("cld"); to signal handlers that might need
it. At times, it may be hard for them to know whether they need it,
because too many factors may affect this need. E.g.:

- if the kernel does cld for them, then they don't need it. But
that's a run-time property, so it can't be tested at build time: the
code may run on a different kernel that doesn't do it.

- if none of the libraries they use mess with this flag, or none of
the libraries they use from signal handlers depend on this flag, then
they don't need it. But then, again, libraries may vary over time,
and you can't assume the (dynamic) library that's available at build
time will behave the same way at run time.

So an application would have to do it conservatively, adding cld to
its signal handlers just in case.
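
As an illustration only, the conservative variant could look roughly like
this. The handler, the scratch buffer and install_defensive_handler() are
hypothetical names, not taken from any real application:

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>

static volatile sig_atomic_t got_signal;
static char scratch[64];

/* Hypothetical handler: clears DF itself before any string operation,
 * without relying on the kernel having done so. */
static void defensive_handler(int signo)
{
    (void)signo;
#if defined(__i386__) || defined(__x86_64__)
    __asm__ __volatile__("cld" ::: "cc");
#endif
    memset(scratch, 0, sizeof scratch); /* may compile to rep stos */
    got_signal = 1;
}

/* Install the handler with sigaction(); returns 0 on success, -1 on error. */
static int install_defensive_handler(int signo)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = defensive_handler;
    sigemptyset(&sa.sa_mask);
    return sigaction(signo, &sa, NULL);
}
```

The cld has to come first in the handler; anything the compiler emits
before it could still run with DF set.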

But then, it would be more convenient if the library did it.

And then, by the same argument, it would be more convenient if the
kernel did it.

(The compiler can't do it, since it doesn't know what's a signal handler
in the general case.)

And that's an argument to support the ABI specs as they are.

It would be just silly to try to work around this deviation from the
specs, at a performance penalty, in every affected compiler, library
*and* application. And anything less than fixing all of them would be
an incomplete workaround.

Which is not an argument against providing workarounds where
possible, just an argument in favor of fixing the problem where it can
be fixed for good.

--
Alexandre Oliva http://www.lsd.ic.unicamp.br/~oliva/
FSF Latin America Board Member http://www.fsfla.org/
Red Hat Compiler Engineer aoliva@{redhat.com, gcc.gnu.org}
Free Software Evangelist oliva@{lsd.ic.unicamp.br, gnu.org}