2004-03-05 17:33:20

by Mike Hearn

[permalink] [raw]
Subject: Potential bug in fs/binfmt_elf.c?

Hi,

I believe there is a problem in fs/binfmt_elf.c, around line 700 (kernel
2.6.1)

When mapping a nobits PT_LOAD segment with a memsize > filesize, the
kernel calls set_brk (which in turns calls do_brk) to map and clear the
area, but this discards access permissons on the mapping leading to rwx
protection. This causes a load failure on systems where the VM cannot
reserve swap space for the segment, unless overcommit is active (on many
systems it's not on by default).

I don't know this code well, but it seems that this discarding of access
permissions on the unlikely codepath is incorrect. I filed bug #2255 [1]
on it.

Could somebody who understands the ELF loading code please check to see
if this is a bug, and if so produce a patch?

The ability to define a new (large) ELF section which isn't backed by
swap space nor disk space and that will be mapped to a specific VMA
range is needed by Wine to reserve the PE load area.

Currently the fact that the section is always mapped rwx despite being
marked read-only in the binary prevents us from using this as a solution
to the problems caused by exec-shield/prelink, meaning the only solution
is to bootstrap the ELF interpreter ourselves from a statically linked
binary. Clearly we'd rather not do that.

Thanks to [email protected] for bringing the matter to my attention.

Your assistance is appreciated,
thanks -mike

[1] http://bugzilla.kernel.org/show_bug.cgi?id=2255


2004-03-05 18:29:25

by John Reiser

[permalink] [raw]
Subject: Re: Potential bug in fs/binfmt_elf.c?

> When mapping a nobits PT_LOAD segment with a memsize > filesize, the
> kernel calls set_brk (which in turns calls do_brk) to map and clear the
> area, but this discards access permissons on the mapping leading to rwx
> protection. This causes a load failure on systems where the VM cannot
> reserve swap space for the segment, unless overcommit is active (on many
> systems it's not on by default).
[snip]

I believe that's not the only problem with binfmt_elf. If the total address
space described by the PT_LOADs is not exactly one contiguous interval, then
2.6.3 binfmt_elf fills in the gaps with 'prw.' of zero-filled pages, instead
of the intended "holes" with no mapping at all between isolated PT_LOADs.
One example is https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=115913

--

2004-03-06 18:46:30

by Ulrich Drepper

[permalink] [raw]
Subject: Re: Potential bug in fs/binfmt_elf.c?

Mike Hearn wrote:

> When mapping a nobits PT_LOAD segment with a memsize > filesize,

Show an example of what the file looks like. Just the ELF program
header (readelf -l output).

--
➧ Ulrich Drepper ➧ Red Hat, Inc. ➧ 444 Castro St ➧ Mountain View, CA ❖

2004-03-06 21:05:38

by Mike Hearn

[permalink] [raw]
Subject: Re: Potential bug in fs/binfmt_elf.c?

On Sat, 2004-03-06 at 18:46, Ulrich Drepper wrote:
> Show an example of what the file looks like. Just the ELF program
> header (readelf -l output).

I can send the linker script and source file on request. They are
probably a bit buggy, this isn't an area I know much about. The binutils
guys seemed to think it should work however.

thanks -mike


Elf file type is EXEC (Executable file)
Entry point 0x818
There are 6 program headers, starting at offset 52

Program Headers:
Type Offset VirtAddr PhysAddr FileSiz MemSiz Flg Align
PHDR 0x000034 0x00000034 0x00000034 0x000c0 0x000c0 R 0x4
INTERP 0x000400 0x00000400 0x00000400 0x00034 0x00034 R 0x4
[Requesting program interpreter: /lib/ld-linux.so.2]
LOAD 0x000000 0x00000000 0x00000000 0x00bc4 0x00bc4 R E 0x1000
LOAD 0x000bc4 0x00000bc4 0x00000bc4 0x00150 0x00154 RW 0x1000
DYNAMIC 0x000bd0 0x00000bd0 0x00000bd0 0x00108 0x00108 RW 0x4
LOAD 0x001000 0x00400000 0x00400000 0x00000 0x10000000 R 0x1000

Section to Segment mapping:
Segment Sections...
00
01 .interp .note.ABI-tag
02 .interp .note.ABI-tag .hash .dynsym .dynstr .gnu.version .gnu.version_r .rel.dyn .rel.plt .init .plt .text .fini .rodata .eh_frame
03 .data .dynamic .ctors .dtors .jcr .got .bss
04 .dynamic
05 .wine.loadarea


2004-03-07 06:11:57

by Ulrich Drepper

[permalink] [raw]
Subject: Re: Potential bug in fs/binfmt_elf.c?

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Mike Hearn wrote:

> LOAD 0x000000 0x00000000 0x00000000 0x00bc4 0x00bc4 R E 0x1000
> LOAD 0x000bc4 0x00000bc4 0x00000bc4 0x00150 0x00154 RW 0x1000
> DYNAMIC 0x000bd0 0x00000bd0 0x00000bd0 0x00108 0x00108 RW 0x4
> LOAD 0x001000 0x00400000 0x00400000 0x00000 0x10000000 R 0x1000

Not everything which can be expressed in ELF is supported. You don't
want to load something, you want to reserve address space. And you want
it allocated in a certain way. The ELF loader is no generic ELF
interpreter.

Now, if the only problem is the overcommit and making the do_brk() call
allocate the memory as read-only a change to the do_brk() interface
might be acceptable (well, ask somebody doing mm hacking). I wouldn't
be entirely sure whether read-only pages alone are enough. This does
not open any new holes as far as I can see.

I'd say experiment with it and add a flags parameter which is the right
combination of VM_READ | VM_WRITE | VM_EXEC. All calls but the one in
binfmt_elf.c should pass all read bits, the one in binfmt_elf can
respect the binaries flags. You must be sure, though, that the last
page of the data area (i.e., writable area) in a regular binary is not
mapped read-only.

- --
➧ Ulrich Drepper ➧ Red Hat, Inc. ➧ 444 Castro St ➧ Mountain View, CA ❖
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.3 (GNU/Linux)

iD8DBQFASr0L2ijCOnn/RHQRAjtZAJ931c+0Czw8jJc0kOv1+lIMyVLEOgCgtj3f
aHnlBUWz8qFQitDqVBWyPpc=
=f2UN
-----END PGP SIGNATURE-----

2004-03-07 10:00:36

by Mike Hearn

[permalink] [raw]
Subject: Re: Potential bug in fs/binfmt_elf.c?

On Sat, 06 Mar 2004 22:11:18 -0800, Ulrich Drepper wrote:
> Not everything which can be expressed in ELF is supported. You don't
> want to load something, you want to reserve address space. And you want
> it allocated in a certain way. The ELF loader is no generic ELF
> interpreter.

Ah, OK. I was hoping this would not the answer.

> Now, if the only problem is the overcommit and making the do_brk() call
> allocate the memory as read-only a change to the do_brk() interface
> might be acceptable (well, ask somebody doing mm hacking). I wouldn't
> be entirely sure whether read-only pages alone are enough. This does
> not open any new holes as far as I can see.

This is certainly one long term solution, but we'd like to avoid kernel
hacking if at all possible. We have a prototype of a program which is
statically linked then turns itself into a dynamically linked app by
bootstrapping the ELF interpreter in the same way the kernel does after
mapping the range wanted with MAP_NORESERVE. Obviously we'd like the real
fix, but something which works nicely on Fedora Core 1 machines today is
also necessary.

Thanks for your advice. One quick question - you said binfmt_elf is not a
generic ELF interpreter, but the one in glibc presumably is yes? Would it
be possible to achieve the effect wanted by having a dummy stub binary
linked with -nostdlib etc, so it's a dynamically linked ELF program with
only one DT_NEEDED entry which is against the real binary.

This would short-circuit the kernel loader and pass control as soon as
possible to glibc, which would follow the first DT_NEEDED entry and map in
the real binary, which in turn contains the PE load area reservation
section. IIRC glibc always uses mmap to map ELF sections so this could
work better.

Does this sound plausible? If so, do you have any tips on where to look
for docs on it? Last time I tried compiling something with -nostdlib, I
ran into problems with the default linker script not liking it (entry
points I think).

thanks -mike

2004-03-07 10:46:27

by Ulrich Drepper

[permalink] [raw]
Subject: Re: Potential bug in fs/binfmt_elf.c?

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Mike Hearn wrote:
> One quick question - you said binfmt_elf is not a
> generic ELF interpreter, but the one in glibc presumably is yes?

No. It only handles what is necessary.

- --
➧ Ulrich Drepper ➧ Red Hat, Inc. ➧ 444 Castro St ➧ Mountain View, CA ❖
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.3 (GNU/Linux)

iD8DBQFASv1y2ijCOnn/RHQRAihlAKCa6gGblyruVtYqk5OQFf3IvL4ELQCgyY5/
lkq7yoQelRbOFVuUwAAvcmk=
=MFBI
-----END PGP SIGNATURE-----

2004-03-07 11:49:02

by Mike Hearn

[permalink] [raw]
Subject: Re: Potential bug in fs/binfmt_elf.c?

On Sun, 07 Mar 2004 02:46:10 -0800, Ulrich Drepper wrote:
> No. It only handles what is necessary.

But can it handle this case, or will it also map the load area ELF section
wrongly?

2004-03-07 21:33:21

by Ulrich Drepper

[permalink] [raw]
Subject: Re: Potential bug in fs/binfmt_elf.c?

Mike Hearn wrote:

> But can it handle this case, or will it also map the load area ELF section
> wrongly?

It will most probably do something you don't want.

But ld.so is no particularly special program. Just write your own very
small and specialized dynamic loader which does exactly what you need.

--
➧ Ulrich Drepper ➧ Red Hat, Inc. ➧ 444 Castro St ➧ Mountain View, CA ❖

2004-03-08 00:04:20

by Eric W. Biederman

[permalink] [raw]
Subject: Re: Potential bug in fs/binfmt_elf.c?

Mike Hearn <[email protected]> writes:

> On Sat, 2004-03-06 at 18:46, Ulrich Drepper wrote:
> > Show an example of what the file looks like. Just the ELF program
> > header (readelf -l output).
>
> I can send the linker script and source file on request. They are
> probably a bit buggy, this isn't an area I know much about. The binutils
> guys seemed to think it should work however.
>
> thanks -mike
>
>
> Elf file type is EXEC (Executable file)
> Entry point 0x818
> There are 6 program headers, starting at offset 52
>
> Program Headers:
> Type Offset VirtAddr PhysAddr FileSiz MemSiz Flg Align
> PHDR 0x000034 0x00000034 0x00000034 0x000c0 0x000c0 R 0x4
> INTERP 0x000400 0x00000400 0x00000400 0x00034 0x00034 R 0x4
> [Requesting program interpreter: /lib/ld-linux.so.2]
> LOAD 0x000000 0x00000000 0x00000000 0x00bc4 0x00bc4 R E 0x1000
> LOAD 0x000bc4 0x00000bc4 0x00000bc4 0x00150 0x00154 RW 0x1000
> DYNAMIC 0x000bd0 0x00000bd0 0x00000bd0 0x00108 0x00108 RW 0x4
> LOAD 0x001000 0x00400000 0x00400000 0x00000 0x10000000 R 0x1000

That last PT_LOAD segment looks like pure nonsense. What is the purpose
of allocating 256MB of read-only zeros?

Eric

2004-03-08 05:58:58

by John Reiser

[permalink] [raw]
Subject: Re: Potential bug in fs/binfmt_elf.c?

>> LOAD 0x001000 0x00400000 0x00400000 0x00000 0x10000000 R 0x1000
>
>
> What is the purpose of allocating 256MB of read-only zeros?

To prevent the kernel from placing any shared libraries there [via mmap()
from ld-linux.so.2], especially under the influence of exec-shield.
This is 'wine', which wants to reserve that address space for mapping
executables that were built for some other operating system. For this
purpose, the .p_flags of PF_R instead could be 0 [==> PROT_NONE]; but
do_brk() still turns either one into 'prw.' which has potential memory
[over-]commit charges. The expected 'pr--' [or 'p---'] should have
a memory commit cost of zero.

--

2004-03-08 08:06:27

by Jakub Jelinek

[permalink] [raw]
Subject: Re: Potential bug in fs/binfmt_elf.c?

On Sun, Mar 07, 2004 at 09:57:43PM -0800, John Reiser wrote:
> >> LOAD 0x001000 0x00400000 0x00400000 0x00000 0x10000000 R
> >> 0x1000
> >
> >
> >What is the purpose of allocating 256MB of read-only zeros?
>
> To prevent the kernel from placing any shared libraries there [via mmap()
> from ld-linux.so.2], especially under the influence of exec-shield.
> This is 'wine', which wants to reserve that address space for mapping
> executables that were built for some other operating system. For this
> purpose, the .p_flags of PF_R instead could be 0 [==> PROT_NONE]; but
> do_brk() still turns either one into 'prw.' which has potential memory
> [over-]commit charges. The expected 'pr--' [or 'p---'] should have
> a memory commit cost of zero.

It should really be p_flags 0 and binfmt_elf.c should be fixed if it doesn't
handle that properly.
glibc ld.so indeed does the right thing with p_flags 0.

Jakub

2004-03-11 06:19:53

by John Reiser

[permalink] [raw]
Subject: [PATCH] binfmt_elf.c allow .bss with no access (p---)

>>>>LOAD 0x001000 0x00400000 0x00400000 0x00000 0x10000000 R 0x1000

> It should really be p_flags 0 and binfmt_elf.c should be fixed if it doesn't
> handle that properly.

This ALPHA quality patch against 2.6.3 adds another argument to do_brk()
which enables having a user ELF .bss with no-access (or read-only).
Such cases must be page-aligned on the low-address end of .bss, else the
padzero() runs into trouble. The new code applies .bss immediately to each
PT_LOAD whenever .p_filesz < .p_memsz . In 99.9% of cases at most one
PT_LOAD has any .bss, so this costs no more time than 2.6.3,
which tries to wait as long as it can before applying .bss.
This patch runs on x86.


--- ./fs/binfmt_elf.c.orig 2004-03-10 19:24:51.243682696 -0800
+++ ./fs/binfmt_elf.c 2004-03-10 19:28:31.435208496 -0800
@@ -83,12 +83,12 @@

#define BAD_ADDR(x) ((unsigned long)(x) > TASK_SIZE)

-static int set_brk(unsigned long start, unsigned long end)
+static int set_brk(unsigned long start, unsigned long end, unsigned int vm_inhibit)
{
start = ELF_PAGEALIGN(start);
end = ELF_PAGEALIGN(end);
if (end > start) {
- unsigned long addr = do_brk(start, end - start);
+ unsigned long addr = do_brk(start, end - start, vm_inhibit);
if (BAD_ADDR(addr))
return addr;
}
@@ -114,6 +114,18 @@
}
}

+static /*inline*/ unsigned int
+calc_bss_inhibit(struct elf_phdr const *phdr, unsigned elf_prot)
+{
+ unsigned vm_inhib = (VM_READ|VM_WRITE|VM_EXEC) ^ calc_vm_prot_bits(elf_prot);
+ /* Can readonly .data/.text have associated read+write .bss ?
+ if (phdr->p_filesz && !(phdr->p_flags & PF_W)) {
+ vm_inhib &= ~VM_WRITE;
+ }
+ */
+ return vm_inhib;
+}
+
/* Let's use some macros to make this stack manipulation a litle clearer */
#ifdef CONFIG_STACK_GROWSUP
#define STACK_ADD(sp, items) ((elf_addr_t *)(sp) + (items))
@@ -298,7 +310,6 @@
struct elf_phdr *eppnt;
unsigned long load_addr = 0;
int load_addr_set = 0;
- unsigned long last_bss = 0, elf_bss = 0;
unsigned long error = ~0UL;
int retval, i, size;

@@ -340,7 +351,7 @@
int elf_type = MAP_PRIVATE | MAP_DENYWRITE;
int elf_prot = 0;
unsigned long vaddr = 0;
- unsigned long k, map_addr;
+ unsigned long map_addr;

if (eppnt->p_flags & PF_R) elf_prot = PROT_READ;
if (eppnt->p_flags & PF_W) elf_prot |= PROT_WRITE;
@@ -359,38 +370,20 @@
load_addr_set = 1;
}

- /*
- * Find the end of the file mapping for this phdr, and keep
- * track of the largest address we see for this.
- */
- k = load_addr + eppnt->p_vaddr + eppnt->p_filesz;
- if (k > elf_bss)
- elf_bss = k;
-
- /*
- * Do the same thing for the memory mapping - between
- * elf_bss and last_bss is the bss section.
- */
- k = load_addr + eppnt->p_memsz + eppnt->p_vaddr;
- if (k > last_bss)
- last_bss = k;
- }
- }
-
- /*
- * Now fill out the bss section. First pad the last page up
- * to the page boundary, and then perform a mmap to make sure
- * that there are zero-mapped pages up to and including the
- * last bss page.
- */
- padzero(elf_bss);
- elf_bss = ELF_PAGESTART(elf_bss + ELF_MIN_ALIGN - 1); /* What we have mapped so far */
-
- /* Map the last of the bss segment */
- if (last_bss > elf_bss) {
- error = do_brk(elf_bss, last_bss - elf_bss);
- if (BAD_ADDR(error))
+ if (eppnt->p_filesz < eppnt->p_memsz) { /* has local .bss */
+ unsigned long elf_bss = load_addr + vaddr + eppnt->p_filesz;
+ padzero(elf_bss);
+ elf_bss = ELF_PAGESTART(elf_bss + ELF_MIN_ALIGN - 1);
+
+ vaddr += load_addr + eppnt->p_memsz;
+ if (elf_bss < vaddr) {
+ error = do_brk(elf_bss, vaddr - elf_bss,
+ calc_bss_inhibit(eppnt, elf_prot) );
+ if (BAD_ADDR(error))
goto out_close;
+ }
+ }
+ }
}

*interp_load_addr = load_addr;
@@ -428,7 +421,7 @@
goto out;
}

- do_brk(0, text_data);
+ do_brk(0, text_data, 0);
if (!interpreter->f_op || !interpreter->f_op->read)
goto out;
if (interpreter->f_op->read(interpreter, addr, text_data, &offset) < 0)
@@ -437,7 +430,7 @@
(unsigned long)addr + text_data);

do_brk(ELF_PAGESTART(text_data + ELF_MIN_ALIGN - 1),
- interp_ex->a_bss);
+ interp_ex->a_bss, 0);
elf_entry = interp_ex->a_entry;

out:
@@ -464,7 +457,7 @@
unsigned char ibcs2_interpreter = 0;
unsigned long error;
struct elf_phdr * elf_ppnt, *elf_phdata;
- unsigned long elf_bss, elf_brk;
+ unsigned long elf_brk;
int elf_exec_fileno;
int retval, i;
unsigned int size;
@@ -526,7 +519,6 @@
fd_install(elf_exec_fileno = retval, bprm->file);

elf_ppnt = elf_phdata;
- elf_bss = 0;
elf_brk = 0;

start_code = ~0UL;
@@ -687,44 +679,23 @@
the image should be loaded at fixed address, not at a variable
address. */

- for(i = 0, elf_ppnt = elf_phdata; i < elf_ex.e_phnum; i++, elf_ppnt++) {
- int elf_prot = 0, elf_flags;
- unsigned long k, vaddr;
-
- if (elf_ppnt->p_type != PT_LOAD)
- continue;
-
- if (unlikely (elf_brk > elf_bss)) {
- unsigned long nbyte;
-
- /* There was a PT_LOAD segment with p_memsz > p_filesz
- before this one. Map anonymous pages, if needed,
- and clear the area. */
- retval = set_brk (elf_bss + load_bias,
- elf_brk + load_bias);
- if (retval) {
- send_sig(SIGKILL, current, 0);
- goto out_free_dentry;
- }
- nbyte = ELF_PAGEOFFSET(elf_bss);
- if (nbyte) {
- nbyte = ELF_MIN_ALIGN - nbyte;
- if (nbyte > elf_brk - elf_bss)
- nbyte = elf_brk - elf_bss;
- clear_user((void *) elf_bss + load_bias, nbyte);
- }
- }
-
- if (elf_ppnt->p_flags & PF_R) elf_prot |= PROT_READ;
- if (elf_ppnt->p_flags & PF_W) elf_prot |= PROT_WRITE;
- if (elf_ppnt->p_flags & PF_X) elf_prot |= PROT_EXEC;
-
- elf_flags = MAP_PRIVATE|MAP_DENYWRITE|MAP_EXECUTABLE;
-
- vaddr = elf_ppnt->p_vaddr;
- if (elf_ex.e_type == ET_EXEC || load_addr_set) {
- elf_flags |= MAP_FIXED;
- } else if (elf_ex.e_type == ET_DYN) {
+ for(i = 0, elf_ppnt = elf_phdata; i < elf_ex.e_phnum; i++, elf_ppnt++) {
+ int elf_prot = 0, elf_flags;
+ unsigned long const vaddr = elf_ppnt->p_vaddr;
+ unsigned long k, elf_bss;
+
+ if (elf_ppnt->p_type != PT_LOAD)
+ continue;
+
+ if (elf_ppnt->p_flags & PF_R) elf_prot |= PROT_READ;
+ if (elf_ppnt->p_flags & PF_W) elf_prot |= PROT_WRITE;
+ if (elf_ppnt->p_flags & PF_X) elf_prot |= PROT_EXEC;
+
+ elf_flags = MAP_PRIVATE|MAP_DENYWRITE|MAP_EXECUTABLE;
+
+ if (elf_ex.e_type == ET_EXEC || load_addr_set)
+ elf_flags |= MAP_FIXED;
+ else if (elf_ex.e_type == ET_DYN)
/* Try and get dynamic programs out of the way of the default mmap
base, as well as whatever program they might try to exec. This
is because the brk will follow the loader, and is not movable. */
@@ -737,7 +708,7 @@

if (!load_addr_set) {
load_addr_set = 1;
- load_addr = (elf_ppnt->p_vaddr - elf_ppnt->p_offset);
+ load_addr = (vaddr - elf_ppnt->p_offset);
if (elf_ex.e_type == ET_DYN) {
load_bias += error -
ELF_PAGESTART(load_bias + vaddr);
@@ -745,43 +716,45 @@
reloc_func_desc = load_bias;
}
}
- k = elf_ppnt->p_vaddr;
+ k = vaddr;
if (k < start_code) start_code = k;
if (start_data < k) start_data = k;

- k = elf_ppnt->p_vaddr + elf_ppnt->p_filesz;
-
- if (k > elf_bss)
- elf_bss = k;
+ k = vaddr + elf_ppnt->p_filesz; /* start local .bss */
+ elf_bss = k;
if ((elf_ppnt->p_flags & PF_X) && end_code < k)
end_code = k;
if (end_data < k)
end_data = k;
- k = elf_ppnt->p_vaddr + elf_ppnt->p_memsz;
+
+ k = vaddr + elf_ppnt->p_memsz; /* end local .bss */
if (k > elf_brk)
elf_brk = k;
+
+ if (elf_bss < k) {
+ elf_bss += load_bias;
+ padzero(elf_bss);
+ elf_bss = ELF_PAGESTART(elf_bss + ELF_MIN_ALIGN - 1);
+
+ k += load_bias;
+ if (elf_bss < k) {
+ retval = set_brk(elf_bss, k - elf_bss,
+ calc_bss_inhibit(elf_ppnt, elf_prot) );
+ if (retval) {
+ send_sig(SIGKILL, current, 0);
+ goto out_free_dentry;
+ }
+ }
+ }
}

elf_ex.e_entry += load_bias;
- elf_bss += load_bias;
elf_brk += load_bias;
start_code += load_bias;
end_code += load_bias;
start_data += load_bias;
end_data += load_bias;

- /* Calling set_brk effectively mmaps the pages that we need
- * for the bss and break sections. We must do this before
- * mapping in the interpreter, to make sure it doesn't wind
- * up getting placed where the bss needs to go.
- */
- retval = set_brk(elf_bss, elf_brk);
- if (retval) {
- send_sig(SIGKILL, current, 0);
- goto out_free_dentry;
- }
- padzero(elf_bss);
-
if (elf_interpreter) {
if (interpreter_type == INTERPRETER_AOUT)
elf_entry = load_aout_interp(&interp_ex,
@@ -947,7 +920,7 @@
len = ELF_PAGESTART(elf_phdata->p_filesz + elf_phdata->p_vaddr + ELF_MIN_ALIGN - 1);
bss = elf_phdata->p_memsz + elf_phdata->p_vaddr;
if (bss > len)
- do_brk(len, bss - len);
+ do_brk(len, bss - len, 0);
error = 0;

out_free_ph:
--- ./include/linux/mm.h.orig 2004-03-07 19:01:46.000000000 -0800
+++ ./include/linux/mm.h 2004-03-08 08:43:02.000000000 -0800
@@ -554,7 +554,7 @@

extern int do_munmap(struct mm_struct *, unsigned long, size_t);

-extern unsigned long do_brk(unsigned long, unsigned long);
+extern unsigned long do_brk(unsigned long, unsigned long, unsigned int);

extern void __vma_unlink(struct mm_struct *mm, struct vm_area_struct *vma,
struct vm_area_struct *prev);
--- ./fs/binfmt_aout.c.orig 2004-03-07 19:01:46.000000000 -0800
+++ ./fs/binfmt_aout.c 2004-03-08 08:36:01.000000000 -0800
@@ -44,13 +44,13 @@
.min_coredump = PAGE_SIZE
};

-static void set_brk(unsigned long start, unsigned long end)
+static void set_brk(unsigned long start, unsigned long end, unsigned vm_inhibit)
{
start = PAGE_ALIGN(start);
end = PAGE_ALIGN(end);
if (end <= start)
return;
- do_brk(start, end - start);
+ do_brk(start, end - start, vm_inhibit);
}

/*
@@ -319,10 +319,10 @@
loff_t pos = fd_offset;
/* Fuck me plenty... */
/* <AOL></AOL> */
- error = do_brk(N_TXTADDR(ex), ex.a_text);
+ error = do_brk(N_TXTADDR(ex), ex.a_text, 0);
bprm->file->f_op->read(bprm->file, (char *) N_TXTADDR(ex),
ex.a_text, &pos);
- error = do_brk(N_DATADDR(ex), ex.a_data);
+ error = do_brk(N_DATADDR(ex), ex.a_data, 0);
bprm->file->f_op->read(bprm->file, (char *) N_DATADDR(ex),
ex.a_data, &pos);
goto beyond_if;
@@ -343,7 +343,7 @@
map_size = ex.a_text+ex.a_data;
#endif

- error = do_brk(text_addr & PAGE_MASK, map_size);
+ error = do_brk(text_addr & PAGE_MASK, map_size, 0);
if (error != (text_addr & PAGE_MASK)) {
send_sig(SIGKILL, current, 0);
return error;
@@ -377,7 +377,7 @@

if (!bprm->file->f_op->mmap||((fd_offset & ~PAGE_MASK) != 0)) {
loff_t pos = fd_offset;
- do_brk(N_TXTADDR(ex), ex.a_text+ex.a_data);
+ do_brk(N_TXTADDR(ex), ex.a_text+ex.a_data, 0);
bprm->file->f_op->read(bprm->file,(char *)N_TXTADDR(ex),
ex.a_text+ex.a_data, &pos);
flush_icache_range((unsigned long) N_TXTADDR(ex),
@@ -412,7 +412,7 @@
beyond_if:
set_binfmt(&aout_format);

- set_brk(current->mm->start_brk, current->mm->brk);
+ set_brk(current->mm->start_brk, current->mm->brk, 0);

retval = setup_arg_pages(bprm, 1);
if (retval < 0) {
@@ -478,7 +478,7 @@
error_time = jiffies;
}

- do_brk(start_addr, ex.a_text + ex.a_data + ex.a_bss);
+ do_brk(start_addr, ex.a_text + ex.a_data + ex.a_bss, 0);

file->f_op->read(file, (char *)start_addr,
ex.a_text + ex.a_data, &pos);
@@ -502,7 +502,7 @@
len = PAGE_ALIGN(ex.a_text + ex.a_data);
bss = ex.a_text + ex.a_data + ex.a_bss;
if (bss > len) {
- error = do_brk(start_addr + len, bss - len);
+ error = do_brk(start_addr + len, bss - len, 0);
retval = error;
if (error != start_addr + len)
goto out;
--- ./mm/mmap.c.orig 2004-03-07 19:01:46.000000000 -0800
+++ ./mm/mmap.c 2004-03-08 08:29:49.000000000 -0800
@@ -127,7 +127,7 @@
goto out;

/* Ok, looks good - let it rip. */
- if (do_brk(oldbrk, newbrk-oldbrk) != oldbrk)
+ if (do_brk(oldbrk, newbrk-oldbrk, 0) != oldbrk)
goto out;
set_brk:
mm->brk = brk;
@@ -1354,7 +1354,7 @@
* anonymous maps. eventually we may be able to do some
* brk-specific accounting here.
*/
-unsigned long do_brk(unsigned long addr, unsigned long len)
+unsigned long do_brk(unsigned long addr, unsigned long len, unsigned vm_inhibit)
{
struct mm_struct * mm = current->mm;
struct vm_area_struct * vma, * prev;
@@ -1400,7 +1400,7 @@
if (security_vm_enough_memory(len >> PAGE_SHIFT))
return -ENOMEM;

- flags = VM_DATA_DEFAULT_FLAGS | VM_ACCOUNT | mm->def_flags;
+ flags = (VM_DATA_DEFAULT_FLAGS &~ vm_inhibit) | VM_ACCOUNT | mm->def_flags;

/* Can we just expand an old anonymous mapping? */
if (rb_parent && vma_merge(mm, prev, rb_parent, addr, addr + len,
--- ./mm/nommu.c.orig 2004-02-17 19:59:20.000000000 -0800
+++ ./mm/nommu.c 2004-03-08 08:30:07.000000000 -0800
@@ -536,7 +536,7 @@
return ret;
}

-unsigned long do_brk(unsigned long addr, unsigned long len)
+unsigned long do_brk(unsigned long addr, unsigned long len, unsigned vm_inhibit)
{
return -ENOMEM;
}


--
John Reiser, [email protected]


2004-03-11 14:18:05

by Mike Hearn

[permalink] [raw]
Subject: Re: [PATCH] binfmt_elf.c allow .bss with no access (p---)

On Wed, 10 Mar 2004 22:17:35 -0800, John Reiser wrote:
> This ALPHA quality patch against 2.6.3 adds another argument to do_brk()
> which enables having a user ELF .bss with no-access (or read-only).

Hi,

Does this fix the Wine case where we have a new RO section that isn't the
BSS?

thanks -mike

2004-03-11 19:20:26

by John Reiser

[permalink] [raw]
Subject: Re: [PATCH] binfmt_elf.c allow .bss with no access (p---)

>>This ALPHA quality patch against 2.6.3 adds another argument to do_brk()
>>which enables having a user ELF .bss with no-access (or read-only).

> Does this fix the Wine case where we have a new RO section that isn't the
> BSS?

Yes, because the patch considers each PT_LOAD with p_filesz < p_memsz
to have a "local" .bss. This is more general than plain 2.6.3 which
creates only one "global" BSS after accumulating information from all
of the PT_LOAD.

--


2004-03-12 16:37:08

by Mike Hearn

[permalink] [raw]
Subject: Re: [PATCH] binfmt_elf.c allow .bss with no access (p---)

On Thu, 11 Mar 2004 11:18:57 -0800, John Reiser wrote:
> Yes, because the patch considers each PT_LOAD with p_filesz < p_memsz
> to have a "local" .bss. This is more general than plain 2.6.3 which
> creates only one "global" BSS after accumulating information from all
> of the PT_LOAD.

Fantastic. I'm afraid I'm unfamiliar with the kernel development process,
how do I track the patch to find out when it's checked in and which
releases it's available in (we'd really like 2.4 as well, though it may be
too late to make it worthwhile for now....)

thanks -mike

2004-04-13 17:32:44

by John Reiser

[permalink] [raw]
Subject: Re: [PATCH] binfmt_elf.c allow .bss with no access (p---)

>>>>>>LOAD 0x001000 0x00400000 0x00400000 0x00000 0x10000000 R 0x1000
>>
>>>It should really be p_flags 0 and binfmt_elf.c should be fixed if it doesn't
>>>handle that properly.
>>
>>This ALPHA quality patch against 2.6.3 adds another argument to do_brk()
>>which enables having a user ELF .bss with no-access (or read-only).

Here are refreshed patches (now BETA quality) against recent kernels:
http://www.bitwagon.com/elfdiet/elfdiet-2.6.5-mm5-1.patch.gz
http://www.bitwagon.com/elfdiet/elfdiet-2.6.5.patch.gz
(Patch mechanics: take 2.6.5, apply -mm5-1 [if desired], then apply
the corresponding elfdiet patch.)

A short introduction with links to past and future patches is:
http://www.bitwagon.com/elfdiet/elfdiet.html

--
John Reiser, [email protected]