2004-06-06 04:59:58

by Mike McCormack

Subject: WINE + NX (No eXecute) support for x86, 2.6.7-rc2-bk2


Linus Torvalds wrote:

> Just out of interest - how many legacy apps are broken by this? I assume
> it's a non-zero number, but wouldn't mind to be happily surprised.

Wine depends upon being able to execute code on the heap, and there are
probably Windows EXEs that depend upon being able to execute code on the
stack.

Fedora Core 1's exec-shield patch broke Wine badly, as there was no way
for an application to turn it off from user space, and Wine depended
upon certain areas of virtual memory being free.

We developed a hack to work around this problem by creating a statically
linked binary that reserves memory, then loads ld-linux.so.2 and a
dynamically linked executable into memory manually and starts them.

So, just to confirm, an executable will be able to be built so that it
can request an executable stack and heap using PT_GNU_STACK or something
like that, right?

thanks,

Mike


2004-06-06 05:25:13

by Ingo Molnar

Subject: Re: WINE + NX (No eXecute) support for x86, 2.6.7-rc2-bk2


* Mike McCormack <[email protected]> wrote:

> Fedora Core 1's exec-shield patch broke Wine badly, as there was no
> way for an application to turn it off from user space, and Wine
> depended upon certain areas of virtual memory being free.

there are multiple methods in FC1 to turn this off:

- FC1 has PT_GNU_STACK support and all binaries that have no
PT_GNU_STACK program header will have the stock Linux VM layout.
(including executable stack/heap) So by stripping the PT_GNU_STACK
header from the wine binary you get this effect.

- you get the same effect by setting the personality to PER_LINUX32 via:

personality(PER_LINUX32);

this is a NOP on stock x86 Linux, and turns off exec-shield on FC1.

all these methods were present in FC1 from day 1 on. In fact we
specifically targeted Wine (and similar applications) with these
methods to make it easy for them to be built under FC1. (of course
existing binaries of Wine worked and work fine because they don't have
PT_GNU_STACK.)
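
The personality() route mentioned above could be wrapped roughly as
follows. This is a minimal sketch, not Wine's or Fedora's actual code:
the helper names use_linux32_personality and has_linux32_personality
are made up for illustration, and whether the new persona affects the
running image or only a subsequently exec'd one is left open here.

```c
#include <sys/personality.h>

/* Special argument documented in personality(2): query the current
 * persona without changing it. */
#define QUERY_PERSONA 0xffffffffUL

/* Hypothetical helper: switch the calling process to PER_LINUX32.
 * Returns 0 on success, -1 if the kernel refused the request. */
int use_linux32_personality(void)
{
    if (personality(PER_LINUX32) == -1)
        return -1;
    return 0;
}

/* Hypothetical helper: returns 1 if the execution-domain byte of the
 * current persona is PER_LINUX32, 0 otherwise. */
int has_linux32_personality(void)
{
    int cur = personality(QUERY_PERSONA);
    return cur != -1 && (cur & PER_MASK) == PER_LINUX32;
}
```

A launcher would typically call use_linux32_personality() early and
then exec the real binary, so that the new image is laid out under the
changed persona.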

> We developed a hack to work around this problem by creating a statically
> linked binary that reserves memory, then loads ld-linux.so.2 and a
> dynamically linked executable into memory manually and starts them.

while this should work too - why not one of the methods above?

Ingo

2004-06-06 07:20:09

by Mike McCormack

Subject: Re: WINE + NX (No eXecute) support for x86, 2.6.7-rc2-bk2


Hi Ingo,

Ingo Molnar wrote:

> there are multiple methods in FC1 to turn this off:
>
> - FC1 has PT_GNU_STACK support and all binaries that have no
> PT_GNU_STACK program header will have the stock Linux VM layout.
> (including executable stack/heap) So by stripping the PT_GNU_STACK
> header from the wine binary you get this effect.

As far as we can tell, this alone does not stop the kernel from loading
stuff at the addresses we need. Even without PT_GNU_STACK ld-linux.so.2
and libc are loaded below 0x01000000, which is the region that Wine
assumes is free. I think this may be due to prelinking...

We (Codeweavers) build Wine on a Red Hat 6.2 based machine, so
PT_GNU_STACK is not added to the binaries. They still don't work on
Fedora Core 1.

> - you get the same effect by setting the personality to PER_LINUX32 via:
>
> personality(PER_LINUX32);
>
> this is a NOP on stock x86 Linux, and turns off exec-shield on FC1.

From the Wine project's POV, there are two problems with that solution:

1) it's not backwards compatible with older binaries

2) it's distribution specific, so other distributions could come up
with a new method of doing the same thing.

> all these methods were present in FC1 from day 1 on. In fact we
> specifically targeted Wine (and similar applications) with these
> methods to make it easy for them to be built under FC1. (of course
> existing binaries of Wine worked and work fine because they don't have
> PT_GNU_STACK.)

The first thing we knew about exec-shield was when stuff started
breaking. Perhaps we could work a little more closely when there's a
possibility that Wine could break due to a new kernel feature?

Ideally the solution to the problem should be backward compatible, and
not require any change to older binaries for them to work.

>>We developed a hack to work around this problem by creating a statically
>>linked binary that reserves memory, then loads ld-linux.so.2 and a
>>dynamically linked executable into memory manually and starts them.
>
>
> while this should work too - why not one of the methods above?

It would be better to argue that with Alexandre Julliard, because he's
the guy that chooses the solutions. My guess is for the reasons I
explained above.

Mike

2004-06-06 07:32:28

by Arjan van de Ven

Subject: Re: WINE + NX (No eXecute) support for x86, 2.6.7-rc2-bk2

On Sun, 2004-06-06 at 10:29, Mike McCormack wrote:
> Hi Ingo,
>
> Ingo Molnar wrote:
>
> > there are multiple methods in FC1 to turn this off:
> >
> > - FC1 has PT_GNU_STACK support and all binaries that have no
> > PT_GNU_STACK program header will have the stock Linux VM layout.
> > (including executable stack/heap) So by stripping the PT_GNU_STACK
> > header from the wine binary you get this effect.
>
> As far as we can tell, this alone does not stop the kernel from loading
> stuff at the addresses we need. Even without PT_GNU_STACK ld-linux.so.2
> and libc are loaded below 0x01000000, which is the region that Wine
> assumes is free. I think this may be due to prelinking...

that is prelink yes, not the kernel execshield.



2004-06-06 07:32:45

by Christoph Hellwig

Subject: Re: WINE + NX (No eXecute) support for x86, 2.6.7-rc2-bk2

On Sun, Jun 06, 2004 at 03:09:32PM +0900, Mike McCormack wrote:
> Fedora Core 1's exec-shield patch broke Wine badly, as there was no way
> for an application to turn it off from user space, and Wine depended
> upon certain areas of virtual memory being free.

if you have a need for a special virtual memory layout please use your
own binary loader as I already suggested earlier in the thread, i.e.
binfmt_pecoff.

2004-06-06 08:04:10

by Mike McCormack

Subject: Re: WINE + NX (No eXecute) support for x86, 2.6.7-rc2-bk2


Christoph Hellwig wrote:

> if you have a need for a special virtual memory layout please use your
> own binary loader as I already suggested earlier in the thread, i.e.
> binfmt_pecoff.

We are using our own user space loader now, but a kernel space loader is
neither portable nor practical.

The Wine project is used by many people and companies for both commercial
and non-commercial purposes. In the spirit of cooperation, it would be
nice if somebody let us know when they're going to make a change that is
going to break Wine, and provide a way for us to work around that change,
or even better maintain real binary compatibility...

It seems Linus's kernel does that quite well, but some vendors seem not
to care too much about breaking Wine.

Mike

2004-06-06 08:10:25

by Christoph Hellwig

Subject: Re: WINE + NX (No eXecute) support for x86, 2.6.7-rc2-bk2

On Sun, Jun 06, 2004 at 06:13:41PM +0900, Mike McCormack wrote:
> We are using our own user space loader now, but a kernel space loader is
> neither portable nor practical.

Huh? binfmts do work on all linux architectures unchanged. What you do
on other operating systems is up to you. And btw, netbsd already has
binfmt_pecoff, you could certainly make use of that, too.

> it would be
> nice if somebody let us know when they're going to make a change that is
> going to break Wine, and provide a way for us to work around that change,
> or even better maintain real binary compatibility...

_You_ are relying on undocumented assumptions here. Windows has different
address space layouts than ELF ABI systems and I think you're much better
off having your own pecoff loader for that.

> It seems Linus's kernel does that quite well, but some vendors seem not
> to care too much about breaking Wine.

Why should they? You need to fix up the broken assumptions in wine.

2004-06-06 08:27:56

by Mike McCormack

Subject: Re: WINE + NX (No eXecute) support for x86, 2.6.7-rc2-bk2


Christoph Hellwig wrote:

> Huh? binfmts do work on all linux architectures unchanged. What you do
> on other operating systems is up to you. And btw, netbsd already has
> binfmt_pecoff, you could certainly make use of that, too.

Working on only two platforms is not really what I'd call portable.

> _You_ are relying on undocumented assumptions here. Windows has different
> address space layouts than ELF ABI systems and I think you're much better
> off having your own pecoff loader for that.

True, we are relying on undocumented assumptions. On the other hand,
there's plenty of programs that rely on undocumented assumptions.
Binary compatibility to me means that the same binary will work even
when the underlying system changes... is there a caveat that I missed?

>>It seems Linus's kernel does that quite well, but some vendors seem not
>>to care too much about breaking Wine.
>
>
> Why should they? You need to fix up the broken assumptions in wine.

If you don't care about binary compatibility, you can change whatever
you like. At least some people out there seem to care about it.

Mike

2004-06-06 08:40:19

by Christoph Hellwig

Subject: Re: WINE + NX (No eXecute) support for x86, 2.6.7-rc2-bk2

On Sun, Jun 06, 2004 at 06:37:32PM +0900, Mike McCormack wrote:
>
> Christoph Hellwig wrote:
>
> >Huh? binfmts do work on all linux architectures unchanged. What you do
> >on other operating systems is up to you. And btw, netbsd already has
> >binfmt_pecoff, you could certainly make use of that, too.
>
> Working on only two platforms is not really what I'd call portable.

Linux itself is portable so a linux driver also is portable. If you care
for multiple OSes you need to do additional work of course. Which isn't
the end of the world either.

> True, we are relying on undocumented assumptions. On the other hand,
> there's plenty of programs that rely on undocumented assumptions.
> Binary compatibility to me means that the same binary will work even
> when the underlying system changes... is there a caveat that I missed?

And there's plenty of programs that break because of that. Wine is now
one of those. You can either kludge around your brokenness even more or
try to get it fixed. Your choice.

2004-06-06 08:43:45

by Christoph Hellwig

Subject: Re: WINE + NX (No eXecute) support for x86, 2.6.7-rc2-bk2

On Sun, Jun 06, 2004 at 09:39:24AM +0100, Christoph Hellwig wrote:
> > True, we are relying on undocumented assumptions. On the other hand,
> > there's plenty of programs that rely on undocumented assumptions.
> > Binary compatibility to me means that the same binary will work even
> > when the underlying system changes... is there a caveat that I missed?
>
> And there's plenty of programs that break because of that. Wine is now
> one of those. You can either kludge around your brokenness even more or
> try to get it fixed. Your choice.

And btw, if you'd have read the whole thread you'd have seen that I argued
against merging the randomization and address space layout changes into
2.6, and such changes during stable series are bad. But you're still much
better off getting your code fixed properly, and that pretty much means
having your own binary format handler in the kernel that sets up the address
space in a windows compatible way.


2004-06-06 09:11:14

by Mike McCormack

Subject: Re: WINE + NX (No eXecute) support for x86, 2.6.7-rc2-bk2


Christoph Hellwig wrote:

> And btw, if you'd have read the whole thread you'd have seen that I argued
> against merging the randomization and address space layout changes into
> 2.6, and such changes during stable series are bad. But you're still much
> better off getting your code fixed properly, and that pretty much means
> having your own binary format handler in the kernel that sets up the address
> space in a windows compatible way.

The statically linked userspace binary loader seems like the best solution
to me. For binary distributions of Wine there's no need to compile
kernels or modules at install time, no need to be root to install and no
need for us to write and maintain kernel code for N different operating
systems. Please let me know if you think of a way to break this solution ;)

Anyway, thanks for at least trying to keep these kinds of changes for new
major versions.

Mike

2004-06-06 11:18:04

by Felipe Alfaro Solana

Subject: Re: WINE + NX (No eXecute) support for x86, 2.6.7-rc2-bk2

On Sun, 2004-06-06 at 10:43, Christoph Hellwig wrote:

> And btw, if you'd have read the whole thread you'd have seen that I argued
> against merging the randomization and address space layout changes into
> 2.6, and such changes during stable series are bad. But you're still much
> better off getting your code fixed properly, and that pretty much means
> having your own binary format handler in the kernel that sets up the address
> space in a windows compatible way.

I find randomization interesting and worthy... I would like to see it
integrated into 2.6, although adding a toggle to control its use could be
desirable.

Another thing I miss is Process ID randomization ala OpenBSD.


2004-06-06 11:39:53

by David Woodhouse

Subject: Re: WINE + NX (No eXecute) support for x86, 2.6.7-rc2-bk2

On Sun, 2004-06-06 at 18:13 +0900, Mike McCormack wrote:
> Christoph Hellwig wrote:
>
> > if you have a need for a special virtual memory layout please use your
> > own binary loader as I already suggested earlier in the thread, i.e.
> > binfmt_pecoff.
>
> We are using our own user space loader now, but a kernel space loader is
> neither portable nor practical.

Actually doesn't a kernel space loader let you discard text pages and
fix them up again on demand as Windows does, rather than doing the
relocations at load time and then having the pages considered dirty so
they have to be swapped instead of just discarded?
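
The cost being referred to can be illustrated with a toy fix-up pass:
every page that receives at least one relocation is written to, and so
stops being a clean copy of the file. This is a simplified sketch, not
the PE base relocation format; apply_relocs is a hypothetical helper,
offsets are assumed sorted and in range, and 4 KiB pages are assumed.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Add delta to the 32-bit word at each listed offset and return how
 * many distinct 4 KiB pages were written. Offsets must be sorted in
 * ascending order and offset + 4 must lie inside the image. A word
 * that straddles a page boundary is counted only for its first page. */
size_t apply_relocs(unsigned char *image, const uint32_t *offsets,
                    size_t n, uint32_t delta)
{
    size_t i, dirty = 0;
    uint32_t last_page = UINT32_MAX;

    for (i = 0; i < n; i++) {
        uint32_t word;
        uint32_t page = offsets[i] / 4096;

        memcpy(&word, image + offsets[i], sizeof word);
        word += delta;                       /* rebase the stored address */
        memcpy(image + offsets[i], &word, sizeof word);

        if (page != last_page) {             /* first write to this page */
            dirty++;
            last_page = page;
        }
    }
    return dirty;
}
```

With load-time relocation all of these writes happen up front in user
space; an on-demand scheme would redo them only when a discarded page
is faulted back in.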

--
dwmw2

2004-06-06 14:49:12

by Mike McCormack

Subject: Re: WINE + NX (No eXecute) support for x86, 2.6.7-rc2-bk2


David Woodhouse wrote:

> Actually doesn't a kernel space loader let you discard text pages and
> fix them up again on demand as Windows does, rather than doing the
> relocations at load time and then having the pages considered dirty so
> they have to be swapped instead of just discarded?

Yes, that would be one advantage of having a PE loader in the kernel.
David Howells of Red Hat was working on a kernel module that implemented
all of the wineserver functionality, including a PE loader a while back.
Unfortunately that effort did not get anywhere. The code is still at:

http://cvs.winehq.com/cvsweb/kernel-win32

If there were a PE/COFF binary format handler in the kernel, it would
still be able to load ELF executables, as Wine requires glibc, X11, etc.

Mike

2004-06-07 08:49:23

by David Howells

Subject: Re: WINE + NX (No eXecute) support for x86, 2.6.7-rc2-bk2

> > We are using our own user space loader now, but a kernel space loader is
> > neither portable nor practical.
>
> Actually doesn't a kernel space loader let you discard text pages and
> fix them up again on demand as Windows does, rather than doing the
> relocations at load time and then having the pages considered dirty so
> they have to be swapped instead of just discarded?

Yes. I've written one which worked, but it hasn't been ported to the 2.6
kernel.

David

2004-06-07 17:08:30

by Ingo Molnar

Subject: Re: WINE + NX (No eXecute) support for x86, 2.6.7-rc2-bk2


* Christoph Hellwig <[email protected]> wrote:

> > It seems Linus's kernel does that quite well, but some vendors seem not
> > to care too much about breaking Wine.
>
> Why should they? You need to fix up the broken assumptions in wine.

for the record, i personally do care about Wine a lot, and i'd like to
repeat that exec-shield did not break any _existing_ binaries. It broke
_newly_ compiled binaries that got the PT_GNU_STACK flag.
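
To see which case a given binary falls into, one can look for the
PT_GNU_STACK program header directly. A rough sketch, assuming a 32-bit
ELF image already read into memory and a host with the same byte order
as the file; find_gnu_stack is a hypothetical helper, not part of any
existing tool.

```c
#include <elf.h>
#include <stddef.h>
#include <string.h>

/* Scan a 32-bit ELF image held in memory for a PT_GNU_STACK header.
 * Returns 1 and stores p_flags in *flags if found, 0 if the header is
 * absent, -1 if the buffer is not a usable ELF32 image. */
int find_gnu_stack(const unsigned char *image, size_t len, Elf32_Word *flags)
{
    Elf32_Ehdr eh;
    Elf32_Phdr ph;
    size_t i;

    if (len < sizeof eh)
        return -1;
    memcpy(&eh, image, sizeof eh);
    if (memcmp(eh.e_ident, ELFMAG, SELFMAG) != 0 ||
        eh.e_ident[EI_CLASS] != ELFCLASS32 ||
        eh.e_phentsize != sizeof ph)
        return -1;

    for (i = 0; i < eh.e_phnum; i++) {
        size_t off = (size_t)eh.e_phoff + i * sizeof ph;

        if (off > len || len - off < sizeof ph)
            return -1;                  /* header table runs past buffer */
        memcpy(&ph, image + off, sizeof ph);
        if (ph.p_type == PT_GNU_STACK) {
            *flags = ph.p_flags;        /* PF_X set means executable stack */
            return 1;
        }
    }
    return 0;
}
```

The same information is shown as the GNU_STACK line in the output of
readelf -l.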

i can very well understand the frustration of the Wine people - dealing
with such issues doesn't give a feeling of advance, because you are
working on solving an issue that didn't exist before.

prelink might have broken other assumptions of Wine - one way around
that is to compile Wine as a PIE binary (or to link it statically).
prelink is a very important feature as well, from which Wine does
benefit as well.

Wine is in a really difficult position (due to the complex task it
achieves) and is more sensitive to VM layout changes than other
applications. So let's try to find the solution that preserves the
kernel's ability to further optimize the VM layout, while meeting Wine's
desire to get a simple VM layout that is not mapped in the first 1 GB or
so.

Ingo

2004-06-07 17:40:44

by Andi Kleen

Subject: Re: WINE + NX (No eXecute) support for x86, 2.6.7-rc2-bk2

Ingo Molnar <[email protected]> writes:
>
> Wine is in a really difficult position (due to the complex task it
> achieves) and is more sensitive to VM layout changes than other
> applications. So let's try to find the solution that preserves the

More ELF headers bits are not really hard to add.

-Andi

2004-06-08 09:21:47

by Jakub Jelinek

Subject: Re: WINE + NX (No eXecute) support for x86, 2.6.7-rc2-bk2

On Sun, Jun 06, 2004 at 09:32:21AM +0200, Arjan van de Ven wrote:
> On Sun, 2004-06-06 at 10:29, Mike McCormack wrote:
> > Hi Ingo,
> >
> > Ingo Molnar wrote:
> >
> > > there are multiple methods in FC1 to turn this off:
> > >
> > > - FC1 has PT_GNU_STACK support and all binaries that have no
> > > PT_GNU_STACK program header will have the stock Linux VM layout.
> > > (including executable stack/heap) So by stripping the PT_GNU_STACK
> > > header from the wine binary you get this effect.
> >
> > As far as we can tell, this alone does not stop the kernel from loading
> > stuff at the addresses we need. Even without PT_GNU_STACK ld-linux.so.2
> > and libc are loaded below 0x01000000, which is the region that Wine
> > assumes is free. I think this may be due to prelinking...
>
> that is prelink yes, not the kernel execshield.

But prelink only allocates in the area below executable if
/proc/sys/kernel/exec-shield exists (and only for i386; there is also
--exec-shield/--no-exec-shield to override).

Really the most safe way for Wine is to create a PT_LOAD segment with
p_flags 0 covering the whole area below the executable. The kernel first
maps the executable, then the dynamic linker, so no matter what addresses
ld.so and the shared libraries are prelinked to, they will not be mapped to the
area Wine reserves.
Unfortunately, there is no easy way in ld to create the segment ATM,
see http://sources.redhat.com/ml/binutils/2003-12/msg00211.html
In current binutils, perhaps creating a special linker script from the
default on the fly and assigning segments there could work, but maybe
far easier would be to just create one allocated PT_LOAD segment somewhere
and using libelf change its location and p_flags.
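
The program header being described would look roughly like this. A
sketch only: make_reserve_segment is a hypothetical helper, the
4096-byte alignment assumes i386 pages, and whether the kernel's ELF
loader treats such a segment as a pure reservation is exactly what is
in question in this thread.

```c
#include <elf.h>
#include <string.h>

/* Describe a reserve-only segment: loadable, no file contents, no
 * access permissions, covering [start, end). start is assumed to be
 * page aligned. */
Elf32_Phdr make_reserve_segment(Elf32_Addr start, Elf32_Addr end)
{
    Elf32_Phdr ph;

    memset(&ph, 0, sizeof ph);
    ph.p_type   = PT_LOAD;
    ph.p_offset = 0;            /* nothing is read from the file */
    ph.p_vaddr  = start;
    ph.p_paddr  = start;
    ph.p_filesz = 0;
    ph.p_memsz  = end - start;  /* the whole range is memory-only */
    ph.p_flags  = 0;            /* neither PF_R, PF_W nor PF_X */
    ph.p_align  = 4096;         /* assumed page size */
    return ph;
}
```

For example, make_reserve_segment(0x00001000, 0x08048000) would cover
everything below an executable linked at 0x08048000, which is an
assumed load address used here for illustration.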

Making Wine a PIE is also a possible solution (at least in FC2 for
non-prelinked PIEs kernel doesn't honor ld.so's prelinked address), but
then you cannot be sure the kernel doesn't choose the addresses Wine wishes
to reserve while randomizing.

Jakub

2004-06-08 09:45:01

by Eric W. Biederman

Subject: Re: WINE + NX (No eXecute) support for x86, 2.6.7-rc2-bk2

Andi Kleen <[email protected]> writes:

> Ingo Molnar <[email protected]> writes:
> >
> > Wine is in a really difficult position (due to the complex task it
> > achieves) and is more sensitive to VM layout changes than other
> > applications. So let's try to find the solution that preserves the
>
> More ELF headers bits are not really hard to add.

Actually I think the cleanest thing at this point, and it was discussed
earlier is for the wine binary to have an ELF segment that is all bss
in the first 1GB. If the kernel loader can't cope we should fix that
before we start adding new ELF bits.

Wine can then mmap over it as it sees fit.

Eric

2004-06-08 10:05:57

by Mike McCormack

Subject: Re: WINE + NX (No eXecute) support for x86, 2.6.7-rc2-bk2


> Really the most safe way for Wine is to create a PT_LOAD segment with
> p_flags 0 covering the whole area below the executable. The kernel first
> maps the executable, then the dynamic linker, so no matter what addresses
> ld.so and the shared libraries are prelinked to, they will not be mapped to the
> area Wine reserves.

I did not investigate this, but others who did think that it is not
possible to create a segment that is reserve-only, so that it does not
unnecessarily consume virtual memory. Apparently ELF allows it, but
Linux doesn't.

Secondly the amount of memory we want to reserve depends upon the PE
executable that we want to load, so varies. If we reserve only what
memory we need, when possible shared libraries can be loaded at their
preferred load address, and benefit from prelinking.
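
The per-executable sizing can be sketched as follows, assuming the PE
image base and image size have already been read from the executable's
headers. The names reservation_for_image and struct reservation are
hypothetical, 4 KiB pages are assumed, and overflow near the top of the
address space is not handled.

```c
#include <stdint.h>

#define PAGE_SIZE_ASSUMED 4096u

/* Hypothetical result type: a page-aligned range to keep free. */
struct reservation {
    uint32_t start;
    uint32_t length;
};

/* Round [image_base, image_base + image_size) outward to page
 * boundaries so that only the range the image needs is reserved. */
struct reservation reservation_for_image(uint32_t image_base,
                                         uint32_t image_size)
{
    struct reservation r;
    uint32_t mask = PAGE_SIZE_ASSUMED - 1;
    uint32_t end = (image_base + image_size + mask) & ~mask;

    r.start = image_base & ~mask;
    r.length = end - r.start;
    return r;
}
```

Anything outside the returned range stays available, so shared
libraries can still land at their prelinked addresses when those do not
collide with the image.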

> Making Wine a PIE is also a possible solution (at least in FC2 for
> non-prelinked PIEs kernel doesn't honor ld.so's prelinked address), but
> then you cannot be sure the kernel doesn't choose the addresses Wine wishes
> to reserve while randomizing.

We are using a statically linked binary (preloader) with a fixed load
address at the moment, which reserves memory first, then loads
ld-linux.so.2 and wine as the kernel would.
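
In outline, reserving a range first and mapping over it later looks
like the sketch below. It is not the preloader's code; reserve_range
and commit_range are illustrative names, and it relies only on mmap()
with PROT_NONE and MAP_FIXED.

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <sys/mman.h>

/* Reserve address space with no access rights and no swap accounting.
 * hint may be NULL to let the kernel choose the address.
 * Returns MAP_FAILED on error. */
void *reserve_range(void *hint, size_t len)
{
    return mmap(hint, len, PROT_NONE,
                MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0);
}

/* Replace part of an existing reservation with usable read/write
 * memory at exactly addr. Returns MAP_FAILED on error. */
void *commit_range(void *addr, size_t len)
{
    return mmap(addr, len, PROT_READ | PROT_WRITE,
                MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED, -1, 0);
}
```

Because the reservation is made before the dynamic linker and libraries
are loaded, nothing else can be placed in that range in the meantime.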

Mike

2004-06-08 10:31:15

by Ingo Molnar

Subject: Re: WINE + NX (No eXecute) support for x86, 2.6.7-rc2-bk2


* Mike McCormack <[email protected]> wrote:

> I did not investigate this, but others who did think that it is not
> possible to create a segment that is reserve only so that does not
> unnecessarily consume virtual memory. Apparently ELF allows it, but
> Linux doesn't.

what do you mean by "Linux doesn't"?

Ingo

2004-06-08 10:51:00

by Mike McCormack

Subject: Re: WINE + NX (No eXecute) support for x86, 2.6.7-rc2-bk2



Ingo Molnar wrote:

>>I did not investigate this, but others who did think that it is not
>>possible to create a segment that is reserve only so that does not
>>unnecessarily consume virtual memory. Apparently ELF allows it, but
>>Linux doesn't.
>
>
> what do you mean by "Linux doesn't"?

Apparently Linux will back all segments by swap space, even if they're
marked as inaccessible. Maybe I was told the wrong thing? Or maybe
it's just difficult to create such a segment, as Jakub was saying?

In any case, the solution we have now reserves exactly the amount of
memory that is needed. If we were to use a fixed size segment, we would
be reserving too much memory most of the time, and preventing shared
libraries being loaded at their preferred addresses.

Mike

2004-06-08 12:29:16

by Horst H. von Brand

Subject: Re: WINE + NX (No eXecute) support for x86, 2.6.7-rc2-bk2

Mike McCormack <[email protected]> said:
> Christoph Hellwig wrote:
> > Huh? binfmts do work on all linux architectures unchanged. What you do
> > on other operating systems is up to you. And btw, netbsd already has
> > binfmt_pecoff, you could certainly make use of that, too.
>
> Working on only two platforms is not really what I'd call portable.

It is a start.

> > _You_ are relying on undocumented assumptions here. Windows has different
> > address space layouts than ELF ABI systems and I think you're much better
> > off having your own pecoff loader for that.

> True, we are relying on undocumented assumptions. On the other hand,
> there's plenty of programs that rely on undocumented assumptions.

So? "If it breaks, you get to keep all pieces" ring a bell?

> Binary compatibility to me means that the same binary will work even
> when the underlying system changes... is there a caveat that I missed?

"Compatibility" means "compatible to an agreed standard", "portability" is
a subset of that... if there is no standard, you just have accidental "it
works".

> >>It seems Linus's kernel does that quite well, but some vendors seem not
> >>to care too much about breaking Wine.

> > Why should they? You need to fix up the broken assumptions in wine.
>
> If you don't care about binary compatibility, you can change whatever
> you like. At least some people out there seem to care about it.

They try hard to stay within POSIX and other standards. If you need
guarantees, you'd have to convince the kernel hackers of their need. Too
bad if it is for backward compatibility of non-open source stuff, tho.
--
Dr. Horst H. von Brand User #22616 counter.li.org
Departamento de Informatica Fono: +56 32 654431
Universidad Tecnica Federico Santa Maria +56 32 654239
Casilla 110-V, Valparaiso, Chile Fax: +56 32 797513

2004-06-08 21:51:14

by Robert White

Subject: RE: WINE + NX (No eXecute) support for x86, 2.6.7-rc2-bk2

I would think that having an easy call to disable the NX modification would be both
safe and effective. That is, adding a syscall (or whatever) that would let you mark
your heap and/or stack executable while leaving the new default as NX, is "just as
safe" as flagging the executable in the first place.

-- You would have to turn it on manually (so Wine etc could do this freely.)
-- Nobody who didn't _need_ it on, would leave it off because people are lazy like
that.
-- The execute-arbitrary-code hacks could only execute that arbitrary code to turn
off NX if the hack was already able to execute its arbitrary code... (recurse as
necessary 8-)

Architecturally the easy-application-accessible switch should be something more than
a syscall to prevent a return-address-twiddle invoking the call directly. I'd make
it a /proc/self something, or put it in a separate include-only-if-used shared
library or something. If the minimal distance is opening and writing a
normally-untouched file then you get a nice support matrix. (e.g. no file means no
feature, file plus action means executable stack, no action means system default (old
can, new cannot), hacks would require a variable (fd) and executing arbitrary code to
open and write that file, programs/programmers that want/need the old behavior can
achieve it without having to know how to manipulate their ELF headers or tool-chains,
etc.)

I know that flagging the binary gives new programs control, but to a great extent we
want the old programs to be controlled as-is instead of only-if-recompiled.
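
For comparison, a per-mapping request for execute permission already
exists in the form of mprotect(2); the proposal above is about a
process-wide switch instead. A minimal sketch of the per-mapping
variant, with alloc_exec_region as a hypothetical helper name:

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <sys/mman.h>

/* Map len bytes read/write, then ask for execute permission as well.
 * Returns NULL if either step is refused, which a hardened system is
 * free to do. */
void *alloc_exec_region(size_t len)
{
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

    if (p == MAP_FAILED)
        return NULL;
    if (mprotect(p, len, PROT_READ | PROT_WRITE | PROT_EXEC) != 0) {
        munmap(p, len);
        return NULL;
    }
    return p;
}
```

This only helps programs that can be changed to ask explicitly; it does
nothing for old binaries that assume the heap or stack is executable.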

But that's just a thought... 8-)

Rob.


-----Original Message-----
From: [email protected] [mailto:[email protected]]
On Behalf Of Ingo Molnar
Sent: Monday, June 07, 2004 7:19 AM

Wine is in a really difficult position (due to the complex task it
achieves) and is more sensitive to VM layout changes than other
applications. So let's try to find the solution that preserves the
kernel's ability to further optimize the VM layout, while meeting Wine's
desire to get a simple VM layout that is not mapped in the first 1 GB or
so.



2004-06-08 21:57:55

by Robert White

Subject: RE: WINE + NX (No eXecute) support for x86, 2.6.7-rc2-bk2

Sorry, item two should start: "Everybody who didn't need it on would leave it
off..." 8-)

Rob.


2004-06-09 01:40:36

by John Reiser

Subject: Re: WINE + NX (No eXecute) support for x86, 2.6.7-rc2-bk2

Ingo Molnar wrote:
> * Mike McCormack <[email protected]> wrote:
>
>
>>I did not investigate this, but others who did think that it is not
>>possible to create a segment that is reserve only so that does not
>>unnecessarily consume virtual memory. Apparently ELF allows it, but
>>Linux doesn't.
>
>
> what do you mean by "Linux doesn't"?

Current fs/binfmt_elf.c creates at most one ".bss" area, regardless of
how many PT_LOAD have .p_filesz < .p_memsz. The .bss area always
has PROT_WRITE|PROT_READ page protection, regardless of .p_flags.
Thus "Linux doesn't" do as faithful a job as it could with ELF.
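
The case being described, a PT_LOAD entry whose .p_filesz is smaller
than its .p_memsz, is easy to spot in a program header table. A small
sketch (count_zero_fill_segments is a hypothetical helper, 32-bit
headers assumed):

```c
#include <elf.h>
#include <stddef.h>

/* Count PT_LOAD headers whose in-memory size exceeds their file size,
 * i.e. segments that end in a zero-filled (bss-like) tail. */
size_t count_zero_fill_segments(const Elf32_Phdr *ph, size_t n)
{
    size_t i, count = 0;

    for (i = 0; i < n; i++) {
        if (ph[i].p_type == PT_LOAD && ph[i].p_filesz < ph[i].p_memsz)
            count++;
    }
    return count;
}
```

A result greater than one marks a binary for which the single-.bss
behaviour described above would matter.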

I submitted "elfdiet" and "bssprot" patches a couple months ago
to address these issues. The bssprot patch appeared briefly in -mm
for 2.6.[56], but was dropped because of ARCH pain, particularly
with the sn2 variant of ia64. The hardware is scarce, and the topic
was not sufficiently interesting for those with access to such a box.

--

2004-06-09 02:29:33

by Paul Jackson

Subject: Re: WINE + NX (No eXecute) support for x86, 2.6.7-rc2-bk2

> was not sufficiently interesting for those with access to such a box.

That was me ;). Sorry, John. When you see a 'cpuset' patch submitted
to lkml from Simon Derr and/or myself, then I should have time to get
back to the sn2 portion of this bssprot patch. Meanwhile, the guys that
sign my paycheck got first dibs on my time.

If anyone else with access to an sn2 can step up, that'd be fine.

--
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson <[email protected]> 1.650.933.1373

2004-06-09 16:53:22

by Jesse Pollard

Subject: Re: WINE + NX (No eXecute) support for x86, 2.6.7-rc2-bk2

On Tuesday 08 June 2004 16:50, Robert White wrote:
> I would think that having an easy call to disable the NX modification would
> be both safe and effective. That is, adding a syscall (or whatever) that
> would let you mark your heap and/or stack executable while leaving the new
> default as NX, is "just as safe" as flagging the executable in the first
> place.

ahhhh no.

The first attack against a vulnerable server would be to load a string
on the stack that would:
1. have the address of the syscall to turn off NX, then return to the stack.
2. have normal worm/virus code following.

2004-06-09 17:15:25

by Jesper Juhl

Subject: RE: WINE + NX (No eXecute) support for x86, 2.6.7-rc2-bk2

On Tue, 8 Jun 2004, Robert White wrote:

> I would think that having an easy call to disable the NX modification would be both
> safe and effective. That is, adding a syscall (or whatever) that would let you mark
> your heap and/or stack executable while leaving the new default as NX, is "just as
> safe" as flagging the executable in the first place.
>

Just having the ability to turn protection off opens the door. If it is
possible to turn it off then a way will be found to do it - either via
buggy kernel code or otherwise. The only safe approach is to have it
enabled by default and not be able to turn it off IMHO.


--
Jesper Juhl <[email protected]>

2004-06-09 18:02:35

by Evaldo Gardenali

Subject: RE: WINE + NX (No eXecute) support for x86, 2.6.7-rc2-bk2

Hi there :)

Jesper Juhl wrote:
> On Tue, 8 Jun 2004, Robert White wrote:
>
> > I would think that having an easy call to disable the NX modification would be both
> > safe and effective. That is, adding a syscall (or whatever) that would let you mark
> > your heap and/or stack executable while leaving the new default as NX, is "just as
> > safe" as flagging the executable in the first place.
> >
>
> Just having the ability to turn protection off opens the door. If it is

indeed!

> possible to turn it off then a way will be found to do it - either via
> buggy kernel code or otherwise. The only safe approach is to have it
> enabled by default and not be able to turn it off IMHO.

if there's a way to turn it off, there's certainly a hole waiting for
trouble.

This reminds me of the "Safe Level" of NetBSD. Want to run X? Downgrade
your Safe Level (0 by default, can run anything). See
http://netbsd.gw.com/cgi-bin/man-cgi/man?options+4+NetBSD-current and
look for "options INSECURE".
I know there may be some flaws in that concept, but it looks interesting
:)

Evaldo



2004-06-09 19:58:12

by Felipe Alfaro Solana

Subject: RE: WINE + NX (No eXecute) support for x86, 2.6.7-rc2-bk2

On Wed, 2004-06-09 at 19:14 +0200, Jesper Juhl wrote:

> Just having the ability to turn protection off opens the door. If it is
> possible to turn it off then a way will be found to do it - either via
> buggy kernel code or otherwise. The only safe approach is to have it
> enabled by default and not be able to turn it off IMHO.

Much like LIDS works... You can configure the kernel at build time so
that LIDS protection can't be toggled at all. Moreover, if you do want
to allow switching LIDS on and off, you can restrict that change to a
program that is running at the console or over a serial line.

IMHO, allowing NX/ExecShield to be turned on and off programmatically
is, by definition, an exploitable way of cracking a system. I'd prefer
the LIDS approach.

2004-06-09 20:57:00

by Robert White

Subject: RE: WINE + NX (No eXecute) support for x86, 2.6.7-rc2-bk2

Which is why I, later in the same message, wrote:

Architecturally the easy-application-accessible switch should be something more than
a syscall to prevent a return-address-twiddle invoking the call directly. I'd make
it a /proc/self something, or put it in a separate include-only-if-used shared
library or something. If the minimal distance is opening and writing a
normally-untouched file then you get a nice support matrix. (e.g. no file means no
feature, file plus action means executable stack, no action means system default (old
can, new cannot), hacks would require a variable (fd) and executing arbitrary code to
open and write that file, programs/programmers that want/need the old behavior can
achieve it without having to know how to manipulate their ELF headers or tool-chains,
etc.)

Which is not susceptible to the 1-2 attack you mention below because the open and
write cannot be done on a protected stack or heap, since it would then have to be
(er... ) executed to perform the hack.

Ahhhh, yes...

-----Original Message-----
From: Jesse Pollard [mailto:[email protected]]
Sent: Wednesday, June 09, 2004 9:53 AM
To: Robert White; 'Ingo Molnar'; 'Christoph Hellwig'; 'Mike McCormack';
[email protected]
Subject: Re: WINE + NX (No eXecute) support for x86, 2.6.7-rc2-bk2

On Tuesday 08 June 2004 16:50, Robert White wrote:
> I would think that having an easy call to disable the NX modification would
> be both safe and effective. That is, adding a syscall (or whatever) that
> would let you mark your heap and/or stack executable while leaving the new
> default as NX, is "just as safe" as flagging the executable in the first
> place.

ahhhh no.

The first attack against a vulnerable server would be to load a string
on the stack that would:
1. have the address of the syscall to turn off NX, then return to the stack.
2. have normal worm/virus code following.



2004-06-10 13:35:40

by Jesse Pollard

Subject: Re: WINE + NX (No eXecute) support for x86, 2.6.7-rc2-bk2

On Wednesday 09 June 2004 15:53, Robert White wrote:
> Which is why I, later in the same message, wrote:
>
> Architecturally the easy-application-accessible switch should be something
> more than a syscall to prevent a return-address-twiddle invoking the call
> directly. I'd make it a /proc/self something, or put it in a separate
> include-only-if-used shared library or something. If the minimal distance
> is opening and writing a normally-untouched file then you get a nice
> support matrix. (e.g. no file means no feature, file plus action means
> executable stack, no action means system default (old can, new cannot),
> hacks would require a variable (fd) and executing arbitrary code to open
> and write that file, programs/programmers that want/need the old behavior
> can achieve it without having to know how to manipulate their ELF headers
> or tool-chains, etc.)
>
> Which is not susceptible to the 1-2 attack you mention below because the
> open and write cannot be done on a protected stack or heap, since it would
> then have to be (er... ) executed to perform the hack.
>
> Ahhhh, yes...

no. This only means the 1-2 attack must be done in two steps (maybe three).

1. create the file (first buffer overflow)
2. write? (second buffer overflow - depends on whether file must have value)
3. disable NX (third)

2004-06-10 18:09:34

by V13

Subject: Re: WINE + NX (No eXecute) support for x86, 2.6.7-rc2-bk2

On Wednesday 09 June 2004 20:14, Jesper Juhl wrote:
> On Tue, 8 Jun 2004, Robert White wrote:
> > I would think that having an easy call to disable the NX modification
> > would be both safe and effective. That is, adding a syscall (or
> > whatever) that would let you mark your heap and/or stack executable while
> > leaving the new default as NX, is "just as safe" as flagging the
> > executable in the first place.
>
> Just having the ability to turn protection off opens the door. If it is
> possible to turn it off then a way will be found to do it - either via
> buggy kernel code or otherwise. The only safe approach is to have it
> enabled by default and not be able to turn it off IMHO.

What about being able to turn it on, but not being able to turn it off again?

> Jesper Juhl <[email protected]>
<<V13>>



2004-06-10 19:00:10

by Bill Davidsen

Subject: Re: WINE + NX (No eXecute) support for x86, 2.6.7-rc2-bk2

Robert White wrote:
> I would think that having an easy call to disable the NX modification would be both
> safe and effective. That is, adding a syscall (or whatever) that would let you mark
> your heap and/or stack executable while leaving the new default as NX, is "just as
> safe" as flagging the executable in the first place.

It clearly wouldn't be safe, and that keeps it from being effective.
Like having a great lock and burglar alarm, then putting the key and
entry code under the mat. NX is to prevent abuse by BAD PEOPLE and
therefore should not have any way to defeat it from within a program.

--
-bill davidsen ([email protected])
"The secret to procrastination is to put things off until the
last possible moment - but no longer" -me

2004-06-10 21:14:28

by Robert White

Subject: RE: WINE + NX (No eXecute) support for x86, 2.6.7-rc2-bk2

You are missing the model:

To enable executable stack/heap you would:

    int fd;

    /* /proc/self/NX is the file proposed here; it does not exist today */
    if ((fd = open("/proc/self/NX", O_RDWR)) >= 0) {
        write(fd, "1", 1);
        close(fd);
    }

(disabling would be symmetric with "0")

Because this is a sequence of specific instructions (that shouldn't exist in the
default library to prevent stack return hack invocation) these instructions would
exist only in programs that want to be EX anyway.

Because it is /proc/self other tasks cannot "do it for you".

You could put together a stack image of these instructions and overflow them into
place, but if the stack/heap isn't already executable, you couldn't run them. IF it
was already executable you wouldn't need to.

Note also that this is about old code and not new code. In the existing model, one
"actively secures" his ELF image at compile time. All the existing code is secure or
not with a kernel switch. This proposal relaxes that system wide restriction.

-- The system is NX by default.
-- ELF marked PT_GNU_STACK apps are NX and are /proc/self/NX resistant. (This is a
refinement I guess I didn't fully think about until last night.)
-- Any app that needs to be EX can leave its ELF unmarked and turn EX on and off by
tweaking this file.
-- Legacy apps can be EX enabled on a case-by-case basis with an LD_PRELOAD of a
shared library that contains the above in its __init().
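
The LD_PRELOAD idea in the last item could be sketched roughly as follows. This only illustrates the proposal: /proc/self/NX does not exist in any mainline kernel, and request_exec_data and libex_init are made-up names. The sketch uses GCC's constructor attribute instead of a literal __init() symbol, which is an assumption about how such a shim would be written.

```c
#include <fcntl.h>
#include <unistd.h>

/* Path proposed in this thread; hypothetical, not a real kernel interface. */
#define PROPOSED_NX_PATH "/proc/self/NX"

/* Ask for executable stack/heap by writing "1" to the given file.
 * Returns 0 if the byte was written, -1 if the file is absent or the
 * write failed (in which case the system default simply stays in force). */
int request_exec_data(const char *path)
{
    int fd = open(path, O_WRONLY);
    ssize_t n;

    if (fd < 0)
        return -1;          /* no file means no feature */
    n = write(fd, "1", 1);
    close(fd);
    return n == 1 ? 0 : -1;
}

/* Runs when the shared library is loaded, e.g. through LD_PRELOAD,
 * before the legacy application's own main() starts. */
__attribute__((constructor))
static void libex_init(void)
{
    (void)request_exec_data(PROPOSED_NX_PATH);
}
```

Built as a shared object (for example with "gcc -shared -fPIC -o libEX.so libex.c"), this would be used as "LD_PRELOAD=./libEX.so app". On a kernel without the file, open() fails and the program continues under whatever the system default is.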

So now, WINE (for instance) can put the above code in its startup system
unconditionally (since it is conditional on the presence of /proc/self/NX in the
first place) and WINE can be run on a system that is "otherwise NX".

The implementation doubles the "flag density" of the implementation because you have
to keep the ELF-set flag and the /proc/self/NX flag separately. You would also need
to be able to alter and reload the segment descriptors in the running application.
Neither should be terribly onerous.
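
The "two flags" bookkeeping described in this paragraph might look like the following user-space model in C. Nothing like this exists in the kernel; struct nx_state, nx_write and data_executable are invented names, and the model ignores the segment-descriptor reload entirely.

```c
#include <stdbool.h>

/* Per-task state under the proposal (hypothetical). */
struct nx_state {
    bool elf_locked;      /* set at exec time from the ELF marking; fixed */
    bool exec_requested;  /* toggled through the proposed /proc/self/NX */
};

/* Model a write of '0' or '1' to the proposed file.
 * Returns 0 on success, -1 if the value is invalid or the task is
 * locked by its ELF marking and therefore resistant to the switch. */
int nx_write(struct nx_state *s, char value)
{
    if (value != '0' && value != '1')
        return -1;
    if (s->elf_locked)
        return -1;
    s->exec_requested = (value == '1');
    return 0;
}

/* Stack/heap are executable only for unlocked tasks that asked for it. */
bool data_executable(const struct nx_state *s)
{
    return !s->elf_locked && s->exec_requested;
}
```

Keeping the ELF-derived flag separate from the requested one is what makes the lock "hard": a marked binary stays NX no matter what is written to the file.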

But now the NX kernel increases security by default even for apps that are already
existent or that cannot be recompiled due to being non-Open-Source. This increases
both portability and security without requiring a complete rebuild of a distro.

So I would guess that this would be a "changeable default plus a hard lock"
belt-and-suspenders approach to backwards compatibility.

Rob.




-----Original Message-----
From: Jesse Pollard [mailto:[email protected]]
Sent: Thursday, June 10, 2004 6:35 AM
To: Robert White; 'Ingo Molnar'; 'Christoph Hellwig'; 'Mike McCormack';
[email protected]
Subject: Re: WINE + NX (No eXecute) support for x86, 2.6.7-rc2-bk2

On Wednesday 09 June 2004 15:53, Robert White wrote:
> Which is why I, later in the same message, wrote:
>
> Architecturally the easy-application-accessible switch should be something
> more than a syscall to prevent a return-address-twiddle invoking the call
> directly. I'd make it a /proc/self something, or put it in a separate
> include-only-if-used shared library or something. If the minimal distance
> is opening and writing a normally-untouched file then you get a nice
> support matrix. (e.g. no file means no feature, file plus action means
> executable stack, no action means system default (old can, new cannot),
> hacks would require a variable (fd) and executing arbitrary code to open
> and write that file, programs/programmers that want/need the old behavior
> can achieve it without having to know how to manipulate their ELF headers
> or tool-chains, etc.)
>
> Which is not susceptible to the 1-2 attack you mention below because the
> open and write cannot be done on a protected stack or heap, since it would
> then have to be (er... ) executed to perform the hack.
>
> Ahhhh, yes...

no. This only means the 1-2 attack must be done in two steps (maybe three).

1. create the file (first buffer overflow)
2. write? (second buffer overflow - depends on whether file must have value)
3. disable NX (third)


2004-06-10 21:34:29

by Robert White

Subject: RE: WINE + NX (No eXecute) support for x86, 2.6.7-rc2-bk2

Note my refining email elsewhere in this thread...

I am talking about the default handling of programs that are not marked PT_GNU_STACK.
The proposed /proc/self/NX file tweak does nothing to programs with the protected
stack flag (they are "locked" against executable data, instead of just resistant to
it).
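
As a rough illustration of the distinction drawn here, a loader can tell the two classes of program apart by scanning the ELF program headers for a PT_GNU_STACK entry. The sketch below is not kernel code; enum stack_policy and classify_stack are hypothetical names, while PT_GNU_STACK and PF_X come from <elf.h>. Note that a PT_GNU_STACK header carrying PF_X explicitly asks for an executable stack, so "marked" in the sense used above means the header is present without PF_X.

```c
#include <elf.h>
#include <stddef.h>

enum stack_policy {
    STACK_LEGACY,  /* no PT_GNU_STACK header: unmarked, old-style binary */
    STACK_EXEC,    /* header present and PF_X set: executable requested */
    STACK_NX       /* header present without PF_X: non-executable stack */
};

/* Classify a binary from its program header table. */
enum stack_policy classify_stack(const Elf32_Phdr *phdr, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (phdr[i].p_type == PT_GNU_STACK)
            return (phdr[i].p_flags & PF_X) ? STACK_EXEC : STACK_NX;
    }
    return STACK_LEGACY;
}
```

Under the proposal, only STACK_LEGACY binaries would be affected by the /proc/self/NX switch; STACK_NX binaries would stay locked.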

It is better to protect "most of" these uncontrollable, old, legacy, or closed source
apps by default, and provide a means for those that must have it otherwise (e.g.
WINE) to exercise some control as to when or if the exposure is granted.

Consider a "sort of savvy normal user." I go and get this kernel, and build it, and
put it on my existing box. My security level has changed not-at-all because none of
my apps are marked PT_GNU_STACK. I don't actually see any improvement until I
recompile my distro.

With the proposed change the default can be everything-is-NX but unmarked apps can
"demote themselves" to the old behavior. I discover that I have some app that breaks
hideously. I can use a shim "LD_PRELOAD=libEX.so app" that opens the NX restriction
for that app.

Yes, this "raises the exposure" for all the "protected by default" apps if the
program is broken enough and the attacker is savvy enough, but these old apps have no
protection under the new system anyway, so better to protect most of them, and let
some of them slip-by if they need to, than protect none of them at all.

Rob.

-----Original Message-----
From: Bill Davidsen [mailto:[email protected]]
Sent: Thursday, June 10, 2004 11:58 AM
To: Robert White
Cc: [email protected]
Subject: Re: WINE + NX (No eXecute) support for x86, 2.6.7-rc2-bk2

Robert White wrote:
> I would think that having an easy call to disable the NX modification would be both
> safe and effective. That is, adding a syscall (or whatever) that would let you mark
> your heap and/or stack executable while leaving the new default as NX, is "just as
> safe" as flagging the executable in the first place.

It clearly wouldn't be safe, and that keeps it from being effective.
Like having a great lock and burglar alarm, then putting the key and
entry code under the mat. NX is to prevent abuse by BAD PEOPLE and
therefore should not have any way to defeat it from within a program.

--
-bill davidsen ([email protected])
"The secret to procrastination is to put things off until the
last possible moment - but no longer" -me



2004-06-11 09:53:46

by Marc Bevand

Subject: Re: WINE + NX (No eXecute) support for x86, 2.6.7-rc2-bk2

Robert White wrote:
> You are missing the model:
>
> To enable executable stack/heap you would:
>
> if ((fd = open("/proc/self/NX",O_RDWR)) >= 0) {
> write(fd,"1",1);
> close(fd);
> }
>
> (disabling would be symmetric with "0")
>
> Because this is a sequence of specific instructions (that shouldn't exist in the
> default library to prevent stack return hack invocation) these instructions would
> exist only in programs that want to be EX anyway.

Even such a protection model (a sequence of 3 syscalls to enable or
disable NX) can be easily bypassed by an attacker. The classic method
of return-into-libc (with a small variation that I would call
chained-returns-into-libc) still works.

As other people already said on this list: the ability to disable NX
is a *bad* thing for security.

--
Marc Bevand http://www.epita.fr/~bevand_m
Computer Science School EPITA - System, Network and Security Dept.