2005-02-07 22:31:41

by Pavel Machek

[permalink] [raw]
Subject: 2.6.11-rc3: Kylix application no longer works?

Hi!

I have some obscure Kylix application here... It started gets
misteriously killed in 2.6.11-rc3 and -rc3-mm1...

pavel@amd:~/slovnik/bin$ strace ./Slovnik
execve("./Slovnik", ["./Slovnik"], [/* 32 vars */]) = 0
+++ killed by SIGKILL +++
pavel@amd:~/slovnik/bin$ ldd ./Slovnik
/usr/bin/ldd: line 1: 8759 Killed
LD_TRACE_LOADED_OBJECTS=1 LD_WARN= LD_BIND_NOW=
LD_LIBRARY_VERSION=$verify_out LD_VERBOSE= "$file"
pavel@amd:~/slovnik/bin$

I get this in 2.6.10-rc3:

pavel@amd:~/slovnik/bin$ ./Slovnik
./Slovnik: relocation error: ./Slovnik: undefined symbol:
initPAnsiStrings
pavel@amd:~/slovnik/bin$ ldd ./Slovnik
libz.so.1 => /usr/lib/libz.so.1 (0xb7fc2000)
libX11.so.6 => /usr/X11/lib/libX11.so.6 (0xb7efa000)
libpthread.so.0 => /lib/libpthread.so.0 (0xb7ea9000)
libdl.so.2 => /lib/libdl.so.2 (0xb7ea6000)
libc.so.6 => /lib/libc.so.6 (0xb7d73000)
/lib/ld-linux.so.2 => /lib/ld-linux.so.2 (0xb7fea000)
pavel@amd:~/slovnik/bin$

When I set LD_LIBRARY_PATH right, it will actually work. Any ideas?

Pavel
--
People were complaining that M$ turns users into beta-testers...
...jr ghea gurz vagb qrirybcref, naq gurl frrz gb yvxr vg gung jnl!


2005-02-07 22:48:39

by Andrew Morton

[permalink] [raw]
Subject: Re: 2.6.11-rc3: Kylix application no longer works?

Pavel Machek <[email protected]> wrote:
>
> I have some obscure Kylix application here... It started gets
> misteriously killed in 2.6.11-rc3 and -rc3-mm1...
>
> pavel@amd:~/slovnik/bin$ strace ./Slovnik
> execve("./Slovnik", ["./Slovnik"], [/* 32 vars */]) = 0
> +++ killed by SIGKILL +++
> pavel@amd:~/slovnik/bin$ ldd ./Slovnik
> /usr/bin/ldd: line 1: 8759 Killed
> LD_TRACE_LOADED_OBJECTS=1 LD_WARN= LD_BIND_NOW=
> LD_LIBRARY_VERSION=$verify_out LD_VERBOSE= "$file"
> pavel@amd:~/slovnik/bin$
>
> I get this in 2.6.10-rc3:
>
> pavel@amd:~/slovnik/bin$ ./Slovnik
> ./Slovnik: relocation error: ./Slovnik: undefined symbol:
> initPAnsiStrings
> pavel@amd:~/slovnik/bin$ ldd ./Slovnik
> libz.so.1 => /usr/lib/libz.so.1 (0xb7fc2000)
> libX11.so.6 => /usr/X11/lib/libX11.so.6 (0xb7efa000)
> libpthread.so.0 => /lib/libpthread.so.0 (0xb7ea9000)
> libdl.so.2 => /lib/libdl.so.2 (0xb7ea6000)
> libc.so.6 => /lib/libc.so.6 (0xb7d73000)
> /lib/ld-linux.so.2 => /lib/ld-linux.so.2 (0xb7fea000)
> pavel@amd:~/slovnik/bin$

Does it work correctly under earlier kernels? If so, when did it break?

> When I set LD_LIBRARY_PATH right, it will actually work. Any ideas?

Presumably you're picking up a different library without LD_LIBRARY_PATH.
Perhaps that library is mucked up and the new uaccess checking code in
binfmt_elf.c is now doing the right thing, and we were previously
forgetting to report some error.

I wonder if reverting the patch will restore the old behaviour?

# This is a BitKeeper generated diff -Nru style patch.
#
# ChangeSet
# 2005/01/21 13:42:18-08:00 [email protected]
# Merge nuts.davemloft.net:/disk1/BK/sparcwork-2.6
# into nuts.davemloft.net:/disk1/BK/sparc-2.6
#
# fs/binfmt_elf.c
# 2005/01/21 13:42:06-08:00 [email protected] +0 -0
# Auto merged
#
# ChangeSet
# 2005/01/17 13:38:38-08:00 [email protected]
# [SPARC64]: Missing user access return value checks in fs/binfmt_elf.c and fs/compat.c
#
# Signed-off-by: David S. Miller <[email protected]>
#
# fs/compat_ioctl.c
# 2005/01/17 13:37:56-08:00 [email protected] +12 -5
# [SPARC64]: Missing user access return value checks in fs/binfmt_elf.c and fs/compat.c
#
# fs/binfmt_elf.c
# 2005/01/17 13:37:56-08:00 [email protected] +43 -19
# [SPARC64]: Missing user access return value checks in fs/binfmt_elf.c and fs/compat.c
#
diff -Nru a/fs/binfmt_elf.c b/fs/binfmt_elf.c
--- a/fs/binfmt_elf.c 2005-02-07 14:50:07 -08:00
+++ b/fs/binfmt_elf.c 2005-02-07 14:50:07 -08:00
@@ -110,15 +110,17 @@
be in memory */


-static void padzero(unsigned long elf_bss)
+static int padzero(unsigned long elf_bss)
{
unsigned long nbyte;

nbyte = ELF_PAGEOFFSET(elf_bss);
if (nbyte) {
nbyte = ELF_MIN_ALIGN - nbyte;
- clear_user((void __user *) elf_bss, nbyte);
+ if (clear_user((void __user *) elf_bss, nbyte))
+ return -EFAULT;
}
+ return 0;
}

/* Let's use some macros to make this stack manipulation a litle clearer */
@@ -134,7 +136,7 @@
#define STACK_ALLOC(sp, len) ({ sp -= len ; sp; })
#endif

-static void
+static int
create_elf_tables(struct linux_binprm *bprm, struct elfhdr * exec,
int interp_aout, unsigned long load_addr,
unsigned long interp_load_addr)
@@ -179,7 +181,8 @@
STACK_ALLOC(p, ((current->pid % 64) << 7));
#endif
u_platform = (elf_addr_t __user *)STACK_ALLOC(p, len);
- __copy_to_user(u_platform, k_platform, len);
+ if (__copy_to_user(u_platform, k_platform, len))
+ return -EFAULT;
}

/* Create the ELF interpreter info */
@@ -241,7 +244,8 @@
#endif

/* Now, let's put argc (and argv, envp if appropriate) on the stack */
- __put_user(argc, sp++);
+ if (__put_user(argc, sp++))
+ return -EFAULT;
if (interp_aout) {
argv = sp + 2;
envp = argv + argc + 1;
@@ -259,25 +263,29 @@
__put_user((elf_addr_t)p, argv++);
len = strnlen_user((void __user *)p, PAGE_SIZE*MAX_ARG_PAGES);
if (!len || len > PAGE_SIZE*MAX_ARG_PAGES)
- return;
+ return 0;
p += len;
}
- __put_user(0, argv);
+ if (__put_user(0, argv))
+ return -EFAULT;
current->mm->arg_end = current->mm->env_start = p;
while (envc-- > 0) {
size_t len;
__put_user((elf_addr_t)p, envp++);
len = strnlen_user((void __user *)p, PAGE_SIZE*MAX_ARG_PAGES);
if (!len || len > PAGE_SIZE*MAX_ARG_PAGES)
- return;
+ return 0;
p += len;
}
- __put_user(0, envp);
+ if (__put_user(0, envp))
+ return -EFAULT;
current->mm->env_end = p;

/* Put the elf_info on the stack in the right place. */
sp = (elf_addr_t __user *)envp + 1;
- copy_to_user(sp, elf_info, ei_index * sizeof(elf_addr_t));
+ if (copy_to_user(sp, elf_info, ei_index * sizeof(elf_addr_t)))
+ return -EFAULT;
+ return 0;
}

#ifndef elf_map
@@ -411,7 +419,11 @@
* that there are zero-mapped pages up to and including the
* last bss page.
*/
- padzero(elf_bss);
+ if (padzero(elf_bss)) {
+ error = -EFAULT;
+ goto out_close;
+ }
+
elf_bss = ELF_PAGESTART(elf_bss + ELF_MIN_ALIGN - 1); /* What we have mapped so far */

/* Map the last of the bss segment */
@@ -791,7 +803,11 @@
nbyte = ELF_MIN_ALIGN - nbyte;
if (nbyte > elf_brk - elf_bss)
nbyte = elf_brk - elf_bss;
- clear_user((void __user *) elf_bss + load_bias, nbyte);
+ if (clear_user((void __user *) elf_bss + load_bias, nbyte)) {
+ retval = -EFAULT;
+ send_sig(SIGKILL, current, 0);
+ goto out_free_dentry;
+ }
}
}

@@ -875,7 +891,11 @@
send_sig(SIGKILL, current, 0);
goto out_free_dentry;
}
- padzero(elf_bss);
+ if (padzero(elf_bss)) {
+ send_sig(SIGSEGV, current, 0);
+ retval = -EFAULT; /* Nobody gets to see this, but.. */
+ goto out_free_dentry;
+ }

if (elf_interpreter) {
if (interpreter_type == INTERPRETER_AOUT)
@@ -1039,7 +1059,10 @@
goto out_free_ph;

elf_bss = elf_phdata->p_vaddr + elf_phdata->p_filesz;
- padzero(elf_bss);
+ if (padzero(elf_bss)) {
+ error = -EFAULT;
+ goto out_free_ph;
+ }

len = ELF_PAGESTART(elf_phdata->p_filesz + elf_phdata->p_vaddr + ELF_MIN_ALIGN - 1);
bss = elf_phdata->p_memsz + elf_phdata->p_vaddr;
@@ -1246,8 +1269,8 @@
cputime_to_timeval(p->signal->cstime, &prstatus->pr_cstime);
}

-static void fill_psinfo(struct elf_prpsinfo *psinfo, struct task_struct *p,
- struct mm_struct *mm)
+static int fill_psinfo(struct elf_prpsinfo *psinfo, struct task_struct *p,
+ struct mm_struct *mm)
{
int i, len;

@@ -1257,8 +1280,9 @@
len = mm->arg_end - mm->arg_start;
if (len >= ELF_PRARGSZ)
len = ELF_PRARGSZ-1;
- copy_from_user(&psinfo->pr_psargs,
- (const char __user *)mm->arg_start, len);
+ if (copy_from_user(&psinfo->pr_psargs,
+ (const char __user *)mm->arg_start, len))
+ return -EFAULT;
for(i = 0; i < len; i++)
if (psinfo->pr_psargs[i] == 0)
psinfo->pr_psargs[i] = ' ';
@@ -1279,7 +1303,7 @@
SET_GID(psinfo->pr_gid, p->gid);
strncpy(psinfo->pr_fname, p->comm, sizeof(psinfo->pr_fname));

- return;
+ return 0;
}

/* Here is the structure in which status of each thread is captured. */

2005-02-07 23:56:14

by Daniel Drake

[permalink] [raw]
Subject: Re: 2.6.11-rc3: Kylix application no longer works?

Andrew Morton wrote:
> I wonder if reverting the patch will restore the old behaviour?
>
> # This is a BitKeeper generated diff -Nru style patch.
> #
> # ChangeSet
> # 2005/01/21 13:42:18-08:00 [email protected]
> # Merge nuts.davemloft.net:/disk1/BK/sparcwork-2.6
> # into nuts.davemloft.net:/disk1/BK/sparc-2.6
> #
> # fs/binfmt_elf.c
> # 2005/01/21 13:42:06-08:00 [email protected] +0 -0
> # Auto merged
> #
> # ChangeSet
> # 2005/01/17 13:38:38-08:00 [email protected]
> # [SPARC64]: Missing user access return value checks in fs/binfmt_elf.c and fs/compat.c
> #
> # Signed-off-by: David S. Miller <[email protected]>
> #
> # fs/compat_ioctl.c
> # 2005/01/17 13:37:56-08:00 [email protected] +12 -5
> # [SPARC64]: Missing user access return value checks in fs/binfmt_elf.c and fs/compat.c
> #
> # fs/binfmt_elf.c
> # 2005/01/17 13:37:56-08:00 [email protected] +43 -19
> # [SPARC64]: Missing user access return value checks in fs/binfmt_elf.c and fs/compat.c
> #

I think so. For a short period we applied this patch to the Gentoo 2.6.10
kernel...

http://dev.gentoo.org/~dsd/gentoo-dev-sources/release-10.01/dist/1900_umem_catch.patch

...but removed it once users complained it stopped kylix binaries from running.

Daniel

2005-02-08 00:13:09

by Andrew Morton

[permalink] [raw]
Subject: Re: 2.6.11-rc3: Kylix application no longer works?

Daniel Drake <[email protected]> wrote:
>
> > # fs/binfmt_elf.c
> > # 2005/01/17 13:37:56-08:00 [email protected] +43 -19
> > # [SPARC64]: Missing user access return value checks in fs/binfmt_elf.c and fs/compat.c
> > #
>
> I think so. For a short period we applied this patch to the Gentoo 2.6.10
> kernel...
>
> http://dev.gentoo.org/~dsd/gentoo-dev-sources/release-10.01/dist/1900_umem_catch.patch
>
> ...but removed it once users complained it stopped kylix binaries from running.

Bah. That's what happens when you fix stuff.

What's kylix? The Borland C++ builder thing?

How should one set about reproducing this problem?

2005-02-08 00:54:46

by Grzegorz Kulewski

[permalink] [raw]
Subject: Re: 2.6.11-rc3: Kylix application no longer works?

On Mon, 7 Feb 2005, Andrew Morton wrote:

> Daniel Drake <[email protected]> wrote:
>>
>>> # fs/binfmt_elf.c
>>> # 2005/01/17 13:37:56-08:00 [email protected] +43 -19
>>> # [SPARC64]: Missing user access return value checks in fs/binfmt_elf.c and fs/compat.c
>>> #
>>
>> I think so. For a short period we applied this patch to the Gentoo 2.6.10
>> kernel...
>>
>> http://dev.gentoo.org/~dsd/gentoo-dev-sources/release-10.01/dist/1900_umem_catch.patch
>>
>> ...but removed it once users complained it stopped kylix binaries from running.
>
> Bah. That's what happens when you fix stuff.
>
> What's kylix? The Borland C++ builder thing?

Rather Delphi (== Object Pascal) thing.


> How should one set about reproducing this problem?

IIRC, Some minimal "personal" version can be downloaded from borland.com.



Grzegorz Kulewski

2005-02-08 04:07:01

by Andrew Morton

[permalink] [raw]
Subject: Re: 2.6.11-rc3: Kylix application no longer works?

Grzegorz Kulewski <[email protected]> wrote:
>
> On Mon, 7 Feb 2005, Andrew Morton wrote:
>
> > Daniel Drake <[email protected]> wrote:
> >>
> >>> # fs/binfmt_elf.c
> >>> # 2005/01/17 13:37:56-08:00 [email protected] +43 -19
> >>> # [SPARC64]: Missing user access return value checks in fs/binfmt_elf.c and fs/compat.c
> >>> #
> >>
> >> I think so. For a short period we applied this patch to the Gentoo 2.6.10
> >> kernel...
> >>
> >> http://dev.gentoo.org/~dsd/gentoo-dev-sources/release-10.01/dist/1900_umem_catch.patch
> >>
> >> ...but removed it once users complained it stopped kylix binaries from running.
> >
> > Bah. That's what happens when you fix stuff.
> >
> > What's kylix? The Borland C++ builder thing?
>
> Rather Delphi (== Object Pascal) thing.
>
>
> > How should one set about reproducing this problem?
>
> IIRC, Some minimal "personal" version can be downloaded from borland.com.

Well I'd prefer that we not back out the whole patch. Could someone please
test with something like the below, let us know exactly where it's falling
over?


--- 25/fs/binfmt_elf.c~a 2005-02-07 20:01:16.000000000 -0800
+++ 25-akpm/fs/binfmt_elf.c 2005-02-07 20:03:51.000000000 -0800
@@ -44,6 +44,8 @@

#include <linux/elf.h>

+#define D() do { printk("%s:%d\n", __FILE__, __LINE__); dump_stack(); } while (0)
+
static int load_elf_binary(struct linux_binprm * bprm, struct pt_regs * regs);
static int load_elf_library(struct file*);
static unsigned long elf_map (struct file *, unsigned long, struct elf_phdr *, int, int);
@@ -181,8 +183,10 @@ create_elf_tables(struct linux_binprm *b
STACK_ALLOC(p, ((current->pid % 64) << 7));
#endif
u_platform = (elf_addr_t __user *)STACK_ALLOC(p, len);
- if (__copy_to_user(u_platform, k_platform, len))
+ if (__copy_to_user(u_platform, k_platform, len)) {
+ D();
return -EFAULT;
+ }
}

/* Create the ELF interpreter info */
@@ -244,8 +248,10 @@ create_elf_tables(struct linux_binprm *b
#endif

/* Now, let's put argc (and argv, envp if appropriate) on the stack */
- if (__put_user(argc, sp++))
+ if (__put_user(argc, sp++)) {
+ D();
return -EFAULT;
+ }
if (interp_aout) {
argv = sp + 2;
envp = argv + argc + 1;
@@ -266,8 +272,10 @@ create_elf_tables(struct linux_binprm *b
return 0;
p += len;
}
- if (__put_user(0, argv))
+ if (__put_user(0, argv)) {
+ D();
return -EFAULT;
+ }
current->mm->arg_end = current->mm->env_start = p;
while (envc-- > 0) {
size_t len;
@@ -277,14 +285,18 @@ create_elf_tables(struct linux_binprm *b
return 0;
p += len;
}
- if (__put_user(0, envp))
+ if (__put_user(0, envp)) {
+ D();
return -EFAULT;
+ }
current->mm->env_end = p;

/* Put the elf_info on the stack in the right place. */
sp = (elf_addr_t __user *)envp + 1;
- if (copy_to_user(sp, elf_info, ei_index * sizeof(elf_addr_t)))
+ if (copy_to_user(sp, elf_info, ei_index * sizeof(elf_addr_t))) {
+ D();
return -EFAULT;
+ }
return 0;
}

_

2005-02-08 08:15:38

by Daniel Drake

[permalink] [raw]
Subject: Re: 2.6.11-rc3: Kylix application no longer works?

Hi Andrew,

Andrew Morton wrote:
> Bah. That's what happens when you fix stuff.
>
> What's kylix? The Borland C++ builder thing?
>
> How should one set about reproducing this problem?

You don't need the kylix environment or libraries, you just need to run any
binary that was compiled with kylix. Teamspeak was the one that brought the
issue to our attention. Here's a direct link:

ftp://ftp.freenet.de/pub/4players/teamspeak.org/releases/ts2_client_rc2_2032.tar.bz2

I will test your diagnosis patch later today, if nobody else has.

Daniel

2005-02-08 17:17:28

by Pavel Machek

[permalink] [raw]
Subject: Re: 2.6.11-rc3: Kylix application no longer works?

Hi!

> > > How should one set about reproducing this problem?
> >
> > IIRC, Some minimal "personal" version can be downloaded from borland.com.
>
> Well I'd prefer that we not back out the whole patch. Could someone please
> test with something like the below, let us know exactly where it's falling
> over?

Sorry, it seems your debugging traps do not trigger. I'll try other
patches...
Pavel

> --- 25/fs/binfmt_elf.c~a 2005-02-07 20:01:16.000000000 -0800
> +++ 25-akpm/fs/binfmt_elf.c 2005-02-07 20:03:51.000000000 -0800
> @@ -44,6 +44,8 @@
>
> #include <linux/elf.h>
>
> +#define D() do { printk("%s:%d\n", __FILE__, __LINE__); dump_stack(); } while (0)
> +
> static int load_elf_binary(struct linux_binprm * bprm, struct pt_regs * regs);
> static int load_elf_library(struct file*);
> static unsigned long elf_map (struct file *, unsigned long, struct elf_phdr *, int, int);
> @@ -181,8 +183,10 @@ create_elf_tables(struct linux_binprm *b
> STACK_ALLOC(p, ((current->pid % 64) << 7));
> #endif
> u_platform = (elf_addr_t __user *)STACK_ALLOC(p, len);
> - if (__copy_to_user(u_platform, k_platform, len))
> + if (__copy_to_user(u_platform, k_platform, len)) {
> + D();
> return -EFAULT;
> + }

--
People were complaining that M$ turns users into beta-testers...
...jr ghea gurz vagb qrirybcref, naq gurl frrz gb yvxr vg gung jnl!

2005-02-08 17:28:27

by Pavel Machek

[permalink] [raw]
Subject: Re: 2.6.11-rc3: Kylix application no longer works?

Hi!

> I wonder if reverting the patch will restore the old behaviour?

Yes, this fixes it: kylix application now works for me.

Pavel
--
People were complaining that M$ turns users into beta-testers...
...jr ghea gurz vagb qrirybcref, naq gurl frrz gb yvxr vg gung jnl!

2005-02-08 18:57:32

by Pavel Machek

[permalink] [raw]
Subject: Re: 2.6.11-rc3: Kylix application no longer works?

Hi!

> I wonder if reverting the patch will restore the old behaviour?

This seems to be minimal fix to get Kylix application back to the
working state... Maybe it is good idea for 2.6.11?

Signed-off-by: Pavel Machek <[email protected]>
Pavel

--- clean/fs/binfmt_elf.c 2005-02-03 22:27:19.000000000 +0100
+++ linux/fs/binfmt_elf.c 2005-02-08 18:46:38.000000000 +0100
@@ -803,11 +803,8 @@
nbyte = ELF_MIN_ALIGN - nbyte;
if (nbyte > elf_brk - elf_bss)
nbyte = elf_brk - elf_bss;
- if (clear_user((void __user *) elf_bss + load_bias, nbyte)) {
- retval = -EFAULT;
- send_sig(SIGKILL, current, 0);
- goto out_free_dentry;
- }
+ if (clear_user((void __user *) elf_bss + load_bias, nbyte))
+ printk(KERN_ERR "Error clearing BSS, wrong ELF executable? (Kylix?!)\n");
}
}


--
People were complaining that M$ turns users into beta-testers...
...jr ghea gurz vagb qrirybcref, naq gurl frrz gb yvxr vg gung jnl!

2005-02-08 19:16:21

by Stephen Hemminger

[permalink] [raw]
Subject: Re: 2.6.11-rc3: Kylix application no longer works?

On Tue, 8 Feb 2005 18:51:06 +0100
Pavel Machek <[email protected]> wrote:

> Hi!
>
> > I wonder if reverting the patch will restore the old behaviour?
>
> This seems to be minimal fix to get Kylix application back to the
> working state... Maybe it is good idea for 2.6.11?
>
> Signed-off-by: Pavel Machek <[email protected]>
> Pavel
>
> --- clean/fs/binfmt_elf.c 2005-02-03 22:27:19.000000000 +0100
> +++ linux/fs/binfmt_elf.c 2005-02-08 18:46:38.000000000 +0100
> @@ -803,11 +803,8 @@
> nbyte = ELF_MIN_ALIGN - nbyte;
> if (nbyte > elf_brk - elf_bss)
> nbyte = elf_brk - elf_bss;
> - if (clear_user((void __user *) elf_bss + load_bias, nbyte)) {
> - retval = -EFAULT;
> - send_sig(SIGKILL, current, 0);
> - goto out_free_dentry;
> - }
> + if (clear_user((void __user *) elf_bss + load_bias, nbyte))
> + printk(KERN_ERR "Error clearing BSS, wrong ELF executable? (Kylix?!)\n");

do once or rate limit rather than log spamming.

--
Stephen Hemminger <[email protected]>

2005-02-08 20:43:08

by Daniel Jacobowitz

[permalink] [raw]
Subject: Re: 2.6.11-rc3: Kylix application no longer works?

On Tue, Feb 08, 2005 at 06:51:06PM +0100, Pavel Machek wrote:
> Hi!
>
> > I wonder if reverting the patch will restore the old behaviour?
>
> This seems to be minimal fix to get Kylix application back to the
> working state... Maybe it is good idea for 2.6.11?

Why does clearing the BSS fail? Are the program headers bogus?
(readelf -l).

--
Daniel Jacobowitz
CodeSourcery, LLC

2005-02-08 22:30:48

by Pavel Machek

[permalink] [raw]
Subject: Re: 2.6.11-rc3: Kylix application no longer works?

Hi!

> > > I wonder if reverting the patch will restore the old behaviour?
> >
> > This seems to be minimal fix to get Kylix application back to the
> > working state... Maybe it is good idea for 2.6.11?
>
> Why does clearing the BSS fail? Are the program headers bogus?
> (readelf -l).

No idea, probably yes. Here's readelf -l result:

Pavel

Elf file type is EXEC (Executable file)
Entry point 0x80614b4
There are 5 program headers, starting at offset 52

Program Headers:
Type Offset VirtAddr PhysAddr FileSiz MemSiz Flg Align
PHDR 0x000034 0x08048034 0x08048034 0x000a0 0x000a0 R E 0x4
INTERP 0x0000d4 0x080480d4 0x080480d4 0x00013 0x00013 R 0x1
[Requesting program interpreter: /lib/ld-linux.so.2]
LOAD 0x000000 0x08048000 0x08048000 0xb7354 0x1b7354 R E 0x1000
LOAD 0x0b7354 0x08200354 0x08200354 0x1e3e4 0x1f648 RW 0x1000
DYNAMIC 0x0d56a0 0x0821e6a0 0x0821e6a0 0x00098 0x00098 RW 0x4

Section to Segment mapping:
Segment Sections...
00
01 .interp
02 .interp .dynsym .dynstr .hash .rel.plt .plt .text borland.ressym borland.resstr borland.reshash borland.resdata borland.resspare
03 .data .rodata .got .dynamic .bss
04 .dynamic


--
People were complaining that M$ turns users into beta-testers...
...jr ghea gurz vagb qrirybcref, naq gurl frrz gb yvxr vg gung jnl!

2005-02-09 02:11:48

by Andrew Morton

[permalink] [raw]
Subject: Re: 2.6.11-rc3: Kylix application no longer works?

Stephen Hemminger <[email protected]> wrote:
>
> On Tue, 8 Feb 2005 18:51:06 +0100
> Pavel Machek <[email protected]> wrote:
>
> > Hi!
> >
> > > I wonder if reverting the patch will restore the old behaviour?
> >
> > This seems to be minimal fix to get Kylix application back to the
> > working state... Maybe it is good idea for 2.6.11?
> >
> > Signed-off-by: Pavel Machek <[email protected]>
> > Pavel
> >
> > --- clean/fs/binfmt_elf.c 2005-02-03 22:27:19.000000000 +0100
> > +++ linux/fs/binfmt_elf.c 2005-02-08 18:46:38.000000000 +0100
> > @@ -803,11 +803,8 @@
> > nbyte = ELF_MIN_ALIGN - nbyte;
> > if (nbyte > elf_brk - elf_bss)
> > nbyte = elf_brk - elf_bss;
> > - if (clear_user((void __user *) elf_bss + load_bias, nbyte)) {
> > - retval = -EFAULT;
> > - send_sig(SIGKILL, current, 0);
> > - goto out_free_dentry;
> > - }
> > + if (clear_user((void __user *) elf_bss + load_bias, nbyte))
> > + printk(KERN_ERR "Error clearing BSS, wrong ELF executable? (Kylix?!)\n");
>
> do once or rate limit rather than log spamming.

We could just remove the printk and stick a comment over it. If the
application later tries to access the not-there pages then it'll just
fault.

However I worry if there is some way in which we can leave unzeroed memory
accessible to the application, although it's hard to see how that could
happen.

Daniel, Pavel cruelly chopped you off the Cc when replying. What's your
diagnosis on the below?



Begin forwarded message:

Date: Tue, 8 Feb 2005 23:27:59 +0100
From: Pavel Machek <[email protected]>
To: Andrew Morton <[email protected]>, [email protected]
Subject: Re: 2.6.11-rc3: Kylix application no longer works?


Hi!

> > > I wonder if reverting the patch will restore the old behaviour?
> >
> > This seems to be minimal fix to get Kylix application back to the
> > working state... Maybe it is good idea for 2.6.11?
>
> Why does clearing the BSS fail? Are the program headers bogus?
> (readelf -l).

No idea, probably yes. Here's readelf -l result:

Pavel

Elf file type is EXEC (Executable file)
Entry point 0x80614b4
There are 5 program headers, starting at offset 52

Program Headers:
Type Offset VirtAddr PhysAddr FileSiz MemSiz Flg Align
PHDR 0x000034 0x08048034 0x08048034 0x000a0 0x000a0 R E 0x4
INTERP 0x0000d4 0x080480d4 0x080480d4 0x00013 0x00013 R 0x1
[Requesting program interpreter: /lib/ld-linux.so.2]
LOAD 0x000000 0x08048000 0x08048000 0xb7354 0x1b7354 R E 0x1000
LOAD 0x0b7354 0x08200354 0x08200354 0x1e3e4 0x1f648 RW 0x1000
DYNAMIC 0x0d56a0 0x0821e6a0 0x0821e6a0 0x00098 0x00098 RW 0x4

Section to Segment mapping:
Segment Sections...
00
01 .interp
02 .interp .dynsym .dynstr .hash .rel.plt .plt .text borland.ressym borland.resstr borland.reshash borland.resdata borland.resspare
03 .data .rodata .got .dynamic .bss
04 .dynamic


--
People were complaining that M$ turns users into beta-testers...
...jr ghea gurz vagb qrirybcref, naq gurl frrz gb yvxr vg gung jnl!

2005-02-09 11:11:36

by Pavel Machek

[permalink] [raw]
Subject: Re: 2.6.11-rc3: Kylix application no longer works?

Hi!

> > > > I wonder if reverting the patch will restore the old behaviour?
> > >
> > > This seems to be minimal fix to get Kylix application back to the
> > > working state... Maybe it is good idea for 2.6.11?
> > >
> > > Signed-off-by: Pavel Machek <[email protected]>
> > > Pavel
> > >
> > > --- clean/fs/binfmt_elf.c 2005-02-03 22:27:19.000000000 +0100
> > > +++ linux/fs/binfmt_elf.c 2005-02-08 18:46:38.000000000 +0100
> > > @@ -803,11 +803,8 @@
> > > nbyte = ELF_MIN_ALIGN - nbyte;
> > > if (nbyte > elf_brk - elf_bss)
> > > nbyte = elf_brk - elf_bss;
> > > - if (clear_user((void __user *) elf_bss + load_bias, nbyte)) {
> > > - retval = -EFAULT;
> > > - send_sig(SIGKILL, current, 0);
> > > - goto out_free_dentry;
> > > - }
> > > + if (clear_user((void __user *) elf_bss + load_bias, nbyte))
> > > + printk(KERN_ERR "Error clearing BSS, wrong ELF executable? (Kylix?!)\n");
> >
> > do once or rate limit rather than log spamming.
>
> We could just remove the printk and stick a comment over it. If the
> application later tries to access the not-there pages then it'll just
> fault.
>
> However I worry if there is some way in which we can leave unzeroed memory
> accessible to the application, although it's hard to see how that could
> happen.
>
> Daniel, Pavel cruelly chopped you off the Cc when replying. What's your
> diagnosis on the below?

Not me, my mailer did it :-). No, don't know what went wrong, and I
forwarded the message to Daniel; his reaction was:

>
> Elf file type is EXEC (Executable file)
> Entry point 0x80614b4
> There are 5 program headers, starting at offset 52
>
> Program Headers:
> Type Offset VirtAddr PhysAddr FileSiz MemSiz Flg
>Align
> PHDR 0x000034 0x08048034 0x08048034 0x000a0 0x000a0 R E
>0x4
> INTERP 0x0000d4 0x080480d4 0x080480d4 0x00013 0x00013 R
>0x1
> [Requesting program interpreter: /lib/ld-linux.so.2]
> LOAD 0x000000 0x08048000 0x08048000 0xb7354 0x1b7354 R E
>0x1000
> LOAD 0x0b7354 0x08200354 0x08200354 0x1e3e4 0x1f648 RW
>0x1000
> DYNAMIC 0x0d56a0 0x0821e6a0 0x0821e6a0 0x00098 0x00098 RW
>0x4

Looks fine but it's probably the first load segment: that's a megabyte
of blank space at the end...

that's all I can see.

Pavel

--
People were complaining that M$ turns users into beta-testers...
...jr ghea gurz vagb qrirybcref, naq gurl frrz gb yvxr vg gung jnl!

2005-02-09 15:34:48

by Daniel Jacobowitz

[permalink] [raw]
Subject: Re: 2.6.11-rc3: Kylix application no longer works?

On Tue, Feb 08, 2005 at 06:10:18PM -0800, Andrew Morton wrote:
> We could just remove the printk and stick a comment over it. If the
> application later tries to access the not-there pages then it'll just
> fault.
>
> However I worry if there is some way in which we can leave unzeroed memory
> accessible to the application, although it's hard to see how that could
> happen.
>
> Daniel, Pavel cruelly chopped you off the Cc when replying. What's your
> diagnosis on the below?

It's asking for a lot of unwritable zeroed space. See this:

> LOAD 0x000000 0x08048000 0x08048000 0xb7354 0x1b7354 R E 0x1000
> LOAD 0x0b7354 0x08200354 0x08200354 0x1e3e4 0x1f648 RW 0x1000

The 0xb7354 is size to map from the file, the 0x1b7354 is size to map
in memory. We're supposed to zero-fill the rest. Now that I think
about it I can see why this is a problem - the kernel probably assumes
that any segment with MemSiz > FileSiz will be writable. Certainly
it's a bit weird for the app to request unwritable zeroed pages.

clear_user's probably not the right way to provide the extra zeroing.

--
Daniel Jacobowitz
CodeSourcery, LLC

2005-02-09 20:14:25

by Rik van Riel

[permalink] [raw]
Subject: Re: 2.6.11-rc3: Kylix application no longer works?

On Wed, 9 Feb 2005, Daniel Jacobowitz wrote:
> On Tue, Feb 08, 2005 at 06:10:18PM -0800, Andrew Morton wrote:

> It's asking for a lot of unwritable zeroed space. See this:
>
>> LOAD 0x000000 0x08048000 0x08048000 0xb7354 0x1b7354 R E 0x1000
>> LOAD 0x0b7354 0x08200354 0x08200354 0x1e3e4 0x1f648 RW 0x1000

> clear_user's probably not the right way to provide the extra zeroing.

Indeed, clear_user() refuses to zero data when it's not writable
to the user process ...

unsigned long
clear_user(void __user *to, unsigned long n)
{
might_sleep();
if (access_ok(VERIFY_WRITE, to, n))
__do_clear_user(to, n);
return n;
}

--
"Debugging is twice as hard as writing the code in the first place.
Therefore, if you write the code as cleverly as possible, you are,
by definition, not smart enough to debug it." - Brian W. Kernighan

2005-02-12 00:39:13

by Ingo Oeser

[permalink] [raw]
Subject: Re: 2.6.11-rc3: Kylix application no longer works?

Hi,

Rik van Riel wrote:
> On Wed, 9 Feb 2005, Daniel Jacobowitz wrote:
> > On Tue, Feb 08, 2005 at 06:10:18PM -0800, Andrew Morton wrote:
> > It's asking for a lot of unwritable zeroed space. See this:
> >> LOAD 0x000000 0x08048000 0x08048000 0xb7354 0x1b7354 R E
> >> 0x1000 LOAD 0x0b7354 0x08200354 0x08200354 0x1e3e4 0x1f648 RW
> >> 0x1000
> >
> > clear_user's probably not the right way to provide the extra zeroing.
>
> Indeed, clear_user() refuses to zero data when it's not writable
> to the user process ...

So if the application wants an read only range of zeroed pages, why
not just map the ZERO_PAGE() multiple times there?

I can imagine _valid_ uses for that (templates for zero intitialized data),
although there are _better_ ways to do that.


Regards

Ingo Oeser