2010-02-01 14:02:16

by Michal Simek

[permalink] [raw]
Subject: Split 'flush_old_exec' into two functions - 221af7f87b97431e3ee21ce4b0e77d5411cf1549

Hi Peter and Linus,

commit 221af7f87b97431e3ee21ce4b0e77d5411cf1549 breaks anything on
Microblaze. Below is my log.

I reverted this patch in my repo.

Any idea what can be wrong?
None reported any problem that's why I think that is Microblaze related.

Thanks,
Michal



early_printk_console is enabled at 0x84000000
Ramdisk addr 0x00000003, FDT at 0x00000076
<5>Linux version 2.6.33-rc6-00859-gabe94c7-dirty ([email protected]) (gcc
version 4.1.2) #33 Mon Feb 1 14:48:17 CET 2010
setup_cpuinfo: initialising
setup_cpuinfo: Using full CPU PVR support
<6>[ 0.000000] setup_memory: max_mapnr: 0x10000
[ 0.000000] setup_memory: min_low_pfn: 0x50000
[ 0.000000] setup_memory: max_low_pfn: 0x60000
On node 0 totalpages: 65536
free_area_init_node: node 0, pgdat c026a77c, node_mem_map c0945000
Normal zone: 512 pages used for memmap
Normal zone: 0 pages reserved
Normal zone: 65024 pages, LIFO batch:15
Built 1 zonelists in Zone order, mobility grouping on. Total pages: 65024
Kernel command line: console=ttyUL0,115200
PID hash table entries: 1024 (order: 0, 4096 bytes)
Dentry cache hash table entries: 32768 (order: 5, 131072 bytes)
Inode-cache hash table entries: 16384 (order: 4, 65536 bytes)
Memory: 249984k/262144k available
Hierarchical RCU implementation.
NR_IRQS:32
xlnx,xps-intc-1.00.a #0 at 0xd0000000, num_irq=6, edge=0x2e
xlnx,xps-timer-1.00.a #0 at 0xd0004000, irq=2
microblaze_timer_set_mode: shutdown
microblaze_timer_set_mode: periodic
Calibrating delay loop... 59.52 BogoMIPS (lpj=119040)
Mount-cache hash table entries: 512
NET: Registered protocol family 16
XGpio: /plb@0/gpio@81400000: registered
Switching to clocksource microblaze_clocksource
microblaze_timer_set_mode: oneshot
NET: Registered protocol family 2
IP route cache hash table entries: 2048 (order: 1, 8192 bytes)
TCP established hash table entries: 8192 (order: 4, 65536 bytes)
TCP bind hash table entries: 8192 (order: 3, 32768 bytes)
TCP: Hash tables configured (established 8192 bind 8192)
TCP reno registered
UDP hash table entries: 256 (order: 0, 4096 bytes)
UDP-Lite hash table entries: 256 (order: 0, 4096 bytes)
NET: Registered protocol family 1
RPC: Registered udp transport module.
RPC: Registered tcp transport module.
RPC: Registered tcp NFSv4.1 backchannel transport module.
Skipping unavailable RESET gpio -2 (reset)
GPIO pin is already allocated
VFS: Disk quotas dquot_6.5.2
Dquot-cache hash table entries: 1024 (order 0, 4096 bytes)
JFFS2 version 2.2. (NAND) (SUMMARY) ? 2001-2006 Red Hat, Inc.
msgmni has been set to 488
Serial: 8250/16550 driver, 4 ports, IRQ sharing disabled
84000000.serial: ttyUL0 at MMIO 0x84000003 (irq = 5) is a uartlite
console [ttyUL0] enabled
86000000.flash: Found 1 x16 devices at 0x0 in 16-bit bank
Intel/Sharp Extended Query Table at 0x010A
Intel/Sharp Extended Query Table at 0x010A
Intel/Sharp Extended Query Table at 0x010A
Intel/Sharp Extended Query Table at 0x010A
Intel/Sharp Extended Query Table at 0x010A
Using buffer write method
Using auto-unlock on power-up/resume
cfi_cmdset_0001: Erase suspend on write enabled
erase region 0: offset=0x0,size=0x20000,blocks=255
erase region 1: offset=0x1fe0000,size=0x8000,blocks=4
RedBoot partition parsing not available
Creating 6 MTD partitions on "86000000.flash":
0x000000000000-0x000000100000 : "fpga"
0x000000100000-0x000000140000 : "boot"
0x000000140000-0x000000160000 : "bootenv"
0x000000160000-0x000000180000 : "config"
0x000000180000-0x000000980000 : "image"
0x000000980000-0x000002000000 : "spare"
xilinx_emaclite 81000000.ethernet: Device Tree Probing
xilinx_emaclite 81000000.ethernet: MAC address is now 0: a:35: 0:22: 1
xilinx_emaclite 81000000.ethernet: Xilinx EmacLite at 0x81000000 mapped
to 0xD20A0000, irq=3
TCP cubic registered
NET: Registered protocol family 17
Freeing unused kernel memory: 6928k freed
Mounting proc:
Mounting var:
Populating /var:
Running local start scripts.
Mounting devpts:
/etc/rc.d/S02devpts: line 4: mount: not found
Mounting /etc/config:
Populating /etc/config:
flatfsd: Created 4 configuration files (167 bytes)
Mounting sysfs:
/etc/rc.d/S02sysfs: line 4: /bin/mount: Bad address
Setting hostname:
Bringing up network interfaces:
/etc/rc.d/S40network: line 4: /bin/ifup: Bad address
Starting portmap:
/etc/rc.d/S40network: line 8: /bin/portmap: Bad address
Starting uWeb server:
/etc/rc.d/S90uWeb:
Welcome to
_____ _ _ _
| ___ \ | | | | (_)
| |_/ / ___ | |_ __ _ | | _ _ __ _ _ __ __
| __/ / _ \| __| / _` || | | || '_ \ | | | |\ \/ /
| | | __/| |_ | (_| || |____| || | | || |_| | > <
\_| \___| \__| \__,_|\_____/|_||_| |_| \__,_|/_/\_\

on ml505-wt-emaclite

line 4: /home/httpd/cgi-bin/petalinux: Bad address

ml505-wt-emaclite login:



--
Michal Simek, Ing. (M.Eng)
PetaLogix - Linux Solutions for a Reconfigurable World
w: http://www.petalogix.com p: +61-7-30090663,+42-0-721842854 f: +61-7-30090663


2010-02-01 15:58:09

by Linus Torvalds

[permalink] [raw]
Subject: Re: Split 'flush_old_exec' into two functions - 221af7f87b97431e3ee21ce4b0e77d5411cf1549



On Mon, 1 Feb 2010, Michal Simek wrote:
>
> Hi Peter and Linus,
>
> commit 221af7f87b97431e3ee21ce4b0e77d5411cf1549 breaks anything on Microblaze.

Gaah. My original version of that patch very much tried to make it a no-op
semantically, but then Peter made some preparatory changes for the next
patch, so it actually changes semantics a bit. I was expecting that to be
benign, but clearly there are issues.

> None reported any problem that's why I think that is Microblaze related.

Well, our previous handling of the critical stage of 'execve()' when we
actually switch from the old process to the new was _so_ grotty that many
architectures ended up playing some really subtle games there. The whole
point of the patch is to get rid of the games, but it's entirely possible
that Microblaze (and others) had crazy things going on that broke when we
made the ordering more straightforward.

That said, Microblaze is not one of the architectures I would have
expected to have problems. It has one of the most straightforward
"flush_thread()" implementations in the whole kernel (it's a no-op ;), and
that's where most of the hacky things were for the architectures that
needed the change. And it has no "arch_pick_mmap_layout()" issues or
anything else that tends to depend on personality bits or whatever.

Microblaze is a no-MMU platform, isn't it? Which binary format does it
use? It looks like _some_ binaries work (it seems to happily be running a
shell to actually do those startup scripts) while others have problems. Is
there a difference between "/bin/sh" and the binaries that seem to be
problematic (like /bin/mount and /bin/ifup).

Are the failing binaries all setuid ones, for example? Or shared vs
non-shared? Or ELF vs FLAT or whatever?

Linus

2010-02-01 18:08:14

by Jason Wessel

[permalink] [raw]
Subject: Re: Split 'flush_old_exec' into two functions - 221af7f87b97431e3ee21ce4b0e77d5411cf1549

Linus Torvalds wrote:
> On Mon, 1 Feb 2010, Michal Simek wrote:
>
>> Hi Peter and Linus,
>>
>> commit 221af7f87b97431e3ee21ce4b0e77d5411cf1549 breaks anything on Microblaze.
>>
>
> Are the failing binaries all setuid ones, for example? Or shared vs
> non-shared? Or ELF vs FLAT or whatever?
>

Add to the afflicted architecture list a ppc64 kernel running 32 bit
user space apps such as a dynamically linked busybox:

file bin/busybox

ELF 32-bit MSB executable, PowerPC or cisco 4500, version 1 (SYSV),
dynamically linked (uses shared libs), stripped


All that is reported on boot is:

[clip]
VFS: Mounted root (nfs filesystem) readonly on device 0:13.
Freeing unused kernel memory: 264k freed
/sbin/init: error while loading shared libraries: out of memory: Cannot
allocate memory
Kernel panic - not syncing: Attempted to kill init!
Rebooting in 180 seconds..QEMU: Terminated


Reverting the patch solves the problem.

Jason.

2010-02-01 18:41:39

by H. Peter Anvin

[permalink] [raw]
Subject: Re: Split 'flush_old_exec' into two functions - 221af7f87b97431e3ee21ce4b0e77d5411cf1549

On 02/01/2010 10:07 AM, Jason Wessel wrote:
>
> Add to the afflicted architecture list a ppc64 kernel running 32 bit
> user space apps such as a dynamically linked busybox:
>
> file bin/busybox
>
> ELF 32-bit MSB executable, PowerPC or cisco 4500, version 1 (SYSV),
> dynamically linked (uses shared libs), stripped
>
>
> All that is reported on boot is:
>
> [clip]
> VFS: Mounted root (nfs filesystem) readonly on device 0:13.
> Freeing unused kernel memory: 264k freed
> /sbin/init: error while loading shared libraries: out of memory: Cannot
> allocate memory
> Kernel panic - not syncing: Attempted to kill init!
> Rebooting in 180 seconds..QEMU: Terminated
>

That's is fully to be expected, since PowerPC also does the
TIF_ABI_PENDING garbage, which needs to be removed.

-hpa

2010-02-01 18:41:48

by Linus Torvalds

[permalink] [raw]
Subject: Re: Split 'flush_old_exec' into two functions - 221af7f87b97431e3ee21ce4b0e77d5411cf1549



On Mon, 1 Feb 2010, Jason Wessel wrote:
>
> Add to the afflicted architecture list a ppc64 kernel running 32 bit
> user space apps such as a dynamically linked busybox:

The ppc issue was known - I informed the ppc people before committing, as
they were actually using the same insane trick (along with sparc) that x86
had to work around the bad calling conventions.

So x86, sparc and ppc were actually the main targets of the cleanup, the
powerpc patch just wasn't ready in time. But it's merged now (got a pull
request today), so powerpc should be ok now too (commit 94f28da84:
"powerpc: TIF_ABI_PENDING bit removal").

The microblaze thing is different, exactly because microblaze does _not_
have any of that insane 32-vs-64-bit executable fiddling that this cleanup
was all about. So microblaze I'd have expected to be totally unaffected,
and must be about something else.

Linus

2010-02-01 18:56:12

by Jason Wessel

[permalink] [raw]
Subject: Re: Split 'flush_old_exec' into two functions - 221af7f87b97431e3ee21ce4b0e77d5411cf1549

Linus Torvalds wrote:
> So x86, sparc and ppc were actually the main targets of the cleanup, the
> powerpc patch just wasn't ready in time. But it's merged now (got a pull
> request today), so powerpc should be ok now too (commit 94f28da84:
> "powerpc: TIF_ABI_PENDING bit removal").
>

Confirmed. It works again, and regression tests pass. :-)

Cheers,
Jason.

2010-02-01 19:33:16

by H. Peter Anvin

[permalink] [raw]
Subject: Re: Split 'flush_old_exec' into two functions - 221af7f87b97431e3ee21ce4b0e77d5411cf1549

On 02/01/2010 07:57 AM, Linus Torvalds wrote:
>
> That said, Microblaze is not one of the architectures I would have
> expected to have problems. It has one of the most straightforward
> "flush_thread()" implementations in the whole kernel (it's a no-op ;), and
> that's where most of the hacky things were for the architectures that
> needed the change. And it has no "arch_pick_mmap_layout()" issues or
> anything else that tends to depend on personality bits or whatever.
>
> Microblaze is a no-MMU platform, isn't it? Which binary format does it
> use? It looks like _some_ binaries work (it seems to happily be running a
> shell to actually do those startup scripts) while others have problems. Is
> there a difference between "/bin/sh" and the binaries that seem to be
> problematic (like /bin/mount and /bin/ifup).
>
> Are the failing binaries all setuid ones, for example? Or shared vs
> non-shared? Or ELF vs FLAT or whatever?
>

Another thing... it looks like you [Michal] is running a test image
under Qemu... could you perhaps point us to that image or another test
image which reproduces the problem?

Nothing like having a hands-on testcase...

-hpa

2010-02-02 10:16:52

by Michal Simek

[permalink] [raw]
Subject: Re: Split 'flush_old_exec' into two functions - 221af7f87b97431e3ee21ce4b0e77d5411cf1549

H. Peter Anvin wrote:
> On 02/01/2010 07:57 AM, Linus Torvalds wrote:
>> That said, Microblaze is not one of the architectures I would have
>> expected to have problems. It has one of the most straightforward
>> "flush_thread()" implementations in the whole kernel (it's a no-op ;), and
>> that's where most of the hacky things were for the architectures that
>> needed the change. And it has no "arch_pick_mmap_layout()" issues or
>> anything else that tends to depend on personality bits or whatever.
>>
>> Microblaze is a no-MMU platform, isn't it? Which binary format does it
>> use? It looks like _some_ binaries work (it seems to happily be running a
>> shell to actually do those startup scripts) while others have problems. Is
>> there a difference between "/bin/sh" and the binaries that seem to be
>> problematic (like /bin/mount and /bin/ifup).
>>
>> Are the failing binaries all setuid ones, for example? Or shared vs
>> non-shared? Or ELF vs FLAT or whatever?
>>
>
> Another thing... it looks like you [Michal] is running a test image
> under Qemu... could you perhaps point us to that image or another test
> image which reproduces the problem?

Yes, you are right. I am doing every day testing on qemu.
I have prepared package which contain all things which you need.
Please download it from http://www.monstr.eu/20100202-flush.tar.gz
md5sum cfe59b19e52dff2319a2b1919839d5cd 20100202-flush.tar.gz

I added there README where you will find more information. Also there is
rootfs.cpio which you can use. Compile kernel with this patch and try.
Then you can revert it and retest - you should be able to see that behavior.

If you have any problem with it, please let me know.

Michal

P.S.: you don't need to care about FPU error in bootlog - fpu is not used.

>
> Nothing like having a hands-on testcase...
>
> -hpa
>


--
Michal Simek, Ing. (M.Eng)
PetaLogix - Linux Solutions for a Reconfigurable World
w: http://www.petalogix.com p: +61-7-30090663,+42-0-721842854 f: +61-7-30090663

2010-02-02 10:18:31

by Michal Simek

[permalink] [raw]
Subject: Re: Split 'flush_old_exec' into two functions - 221af7f87b97431e3ee21ce4b0e77d5411cf1549

Linus Torvalds wrote:
>
> On Mon, 1 Feb 2010, Michal Simek wrote:
>> Hi Peter and Linus,
>>
>> commit 221af7f87b97431e3ee21ce4b0e77d5411cf1549 breaks anything on Microblaze.
>
> Gaah. My original version of that patch very much tried to make it a no-op
> semantically, but then Peter made some preparatory changes for the next
> patch, so it actually changes semantics a bit. I was expecting that to be
> benign, but clearly there are issues.

Would it be possible to cc me or send that patches to linux-next? I am
doing every day tests and report results on my site. I would be able to
catch up bugs earlier.

>
>> None reported any problem that's why I think that is Microblaze related.
>
> Well, our previous handling of the critical stage of 'execve()' when we
> actually switch from the old process to the new was _so_ grotty that many
> architectures ended up playing some really subtle games there. The whole
> point of the patch is to get rid of the games, but it's entirely possible
> that Microblaze (and others) had crazy things going on that broke when we
> made the ordering more straightforward.
>
> That said, Microblaze is not one of the architectures I would have
> expected to have problems. It has one of the most straightforward
> "flush_thread()" implementations in the whole kernel (it's a no-op ;), and
> that's where most of the hacky things were for the architectures that
> needed the change. And it has no "arch_pick_mmap_layout()" issues or
> anything else that tends to depend on personality bits or whatever.
>
> Microblaze is a no-MMU platform, isn't it?

Microblaze has support for both platforms MMU and noMMU. Only MMU
version is affected. noMMU version is without any problem.

Which binary format does it
> use? It looks like _some_ binaries work (it seems to happily be running a
> shell to actually do those startup scripts) while others have problems. Is
> there a difference between "/bin/sh" and the binaries that seem to be
> problematic (like /bin/mount and /bin/ifup).

Most of them is busybox ELF with shared libraries. I tried non-shared
ELF and the problem is the same.

>
> Are the failing binaries all setuid ones, for example? Or shared vs
> non-shared? Or ELF vs FLAT or whatever?

no setuid.

Thanks,
Michal


>
> Linus


--
Michal Simek, Ing. (M.Eng)
PetaLogix - Linux Solutions for a Reconfigurable World
w: http://www.petalogix.com p: +61-7-30090663,+42-0-721842854 f: +61-7-30090663

2010-02-02 15:51:12

by Linus Torvalds

[permalink] [raw]
Subject: Re: Split 'flush_old_exec' into two functions - 221af7f87b97431e3ee21ce4b0e77d5411cf1549



On Tue, 2 Feb 2010, Michal Simek wrote:
>
> Would it be possible to cc me or send that patches to linux-next? I am doing
> every day tests and report results on my site. I would be able to catch up
> bugs earlier.

Normally, that would happen, but this patch got applied early _literally_
because I wanted it to hit -rc6 rather than wait any longer. So it had
only a day or two of discussion, and probably just a few hours from the
final version.

That said, I think I may have found the cause.

Peter: look at setup_new_exec(), and realize that it got moved _down_ to
after all the personality setting. So far, so good, that was the
intention, but look at what it does:

current->flags &= ~PF_RANDOMIZE;

and look at how fs/binfmt_elf.c does it not just after the personality
setting, but also after

if (!(current->personality & ADDR_NO_RANDOMIZE) && randomize_va_space)
current->flags |= PF_RANDOMIZE;

so it looks like you may have moved it down too much.

I think you did that because you wanted to do that

arch_pick_mmap_layout(current->mm);

in setup_new_exec(). Which makes total sense, but it all means that the
whole preparatory patch did way more than my initial one (which put
setup_new_exec() right after flush_old_exec())

In fact, it looks like PF_RANDOMIZE never gets set with the new code, but
I didn't check if it might not happen somewhere else.

But while I doubt that clearing PF_RANDOMIZE will break anything, the
movement also affects other thigns. Lookie here:

if (elf_read_implies_exec(loc->elf_ex, executable_stack))
current->personality |= READ_IMPLIES_EXEC;

also happens before setup_new_exec(), and then setup_new_exec() does

current->personality &= ~bprm->per_clear;

where that per_clear mask may be PER_CLEAR_ON_SETID. Which contains
READ_IMPLIES_EXEC.

So we now always clear READ_IMPLIES_EXEC for setuid applications.

Anyway, I'm not sure this is it, but that's two examples of something that
did change unintentionally.

Michael, mind trying this (UNTESTED!) patch? It makes conceptual sense,
and moves some more of the flushing of the old process state up to
"flush_old_exec()" rather than doing it late in "setup_new_exec()".

(I suspect we should also move the signal/fd flushing there, but I doubt
it matters)

Linus

---
fs/exec.c | 10 +++++-----
1 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/fs/exec.c b/fs/exec.c
index 675c3f4..0790a10 100644
--- a/fs/exec.c
+++ b/fs/exec.c
@@ -961,6 +961,11 @@ int flush_old_exec(struct linux_binprm * bprm)
goto out;

bprm->mm = NULL; /* We're using it now */
+
+ current->flags &= ~PF_RANDOMIZE;
+ flush_thread();
+ current->personality &= ~bprm->per_clear;
+
return 0;

out:
@@ -997,9 +1002,6 @@ void setup_new_exec(struct linux_binprm * bprm)
tcomm[i] = '\0';
set_task_comm(current, tcomm);

- current->flags &= ~PF_RANDOMIZE;
- flush_thread();
-
/* Set the new mm task size. We have to do that late because it may
* depend on TIF_32BIT which is only updated in flush_thread() on
* some architectures like powerpc
@@ -1015,8 +1017,6 @@ void setup_new_exec(struct linux_binprm * bprm)
set_dumpable(current->mm, suid_dumpable);
}

- current->personality &= ~bprm->per_clear;
-
/*
* Flush performance counters when crossing a
* security domain:

2010-02-02 19:54:19

by Michal Simek

[permalink] [raw]
Subject: Re: Split 'flush_old_exec' into two functions - 221af7f87b97431e3ee21ce4b0e77d5411cf1549

Linus Torvalds wrote:
>
> On Tue, 2 Feb 2010, Michal Simek wrote:
>> Would it be possible to cc me or send that patches to linux-next? I am doing
>> every day tests and report results on my site. I would be able to catch up
>> bugs earlier.
>
> Normally, that would happen, but this patch got applied early _literally_
> because I wanted it to hit -rc6 rather than wait any longer. So it had
> only a day or two of discussion, and probably just a few hours from the
> final version.

ok. I just wanted to be sure.

>
> That said, I think I may have found the cause.
>
> Peter: look at setup_new_exec(), and realize that it got moved _down_ to
> after all the personality setting. So far, so good, that was the
> intention, but look at what it does:
>
> current->flags &= ~PF_RANDOMIZE;
>
> and look at how fs/binfmt_elf.c does it not just after the personality
> setting, but also after
>
> if (!(current->personality & ADDR_NO_RANDOMIZE) && randomize_va_space)
> current->flags |= PF_RANDOMIZE;
>
> so it looks like you may have moved it down too much.
>
> I think you did that because you wanted to do that
>
> arch_pick_mmap_layout(current->mm);
>
> in setup_new_exec(). Which makes total sense, but it all means that the
> whole preparatory patch did way more than my initial one (which put
> setup_new_exec() right after flush_old_exec())
>
> In fact, it looks like PF_RANDOMIZE never gets set with the new code, but
> I didn't check if it might not happen somewhere else.
>
> But while I doubt that clearing PF_RANDOMIZE will break anything, the
> movement also affects other thigns. Lookie here:
>
> if (elf_read_implies_exec(loc->elf_ex, executable_stack))
> current->personality |= READ_IMPLIES_EXEC;
>
> also happens before setup_new_exec(), and then setup_new_exec() does
>
> current->personality &= ~bprm->per_clear;
>
> where that per_clear mask may be PER_CLEAR_ON_SETID. Which contains
> READ_IMPLIES_EXEC.
>
> So we now always clear READ_IMPLIES_EXEC for setuid applications.
>
> Anyway, I'm not sure this is it, but that's two examples of something that
> did change unintentionally.
>
> Michael, mind trying this (UNTESTED!) patch?

Just Michal. :-) No worries about.

It makes conceptual sense,
> and moves some more of the flushing of the old process state up to
> "flush_old_exec()" rather than doing it late in "setup_new_exec()".

yes, your patch works. I tested it on QEMU and on real hw and I can't
see any visible problem. I will do more test tomorrow.

Thanks,
Michal

>
> (I suspect we should also move the signal/fd flushing there, but I doubt
> it matters)
>
> Linus
>
> ---
> fs/exec.c | 10 +++++-----
> 1 files changed, 5 insertions(+), 5 deletions(-)
>
> diff --git a/fs/exec.c b/fs/exec.c
> index 675c3f4..0790a10 100644
> --- a/fs/exec.c
> +++ b/fs/exec.c
> @@ -961,6 +961,11 @@ int flush_old_exec(struct linux_binprm * bprm)
> goto out;
>
> bprm->mm = NULL; /* We're using it now */
> +
> + current->flags &= ~PF_RANDOMIZE;
> + flush_thread();
> + current->personality &= ~bprm->per_clear;
> +
> return 0;
>
> out:
> @@ -997,9 +1002,6 @@ void setup_new_exec(struct linux_binprm * bprm)
> tcomm[i] = '\0';
> set_task_comm(current, tcomm);
>
> - current->flags &= ~PF_RANDOMIZE;
> - flush_thread();
> -
> /* Set the new mm task size. We have to do that late because it may
> * depend on TIF_32BIT which is only updated in flush_thread() on
> * some architectures like powerpc
> @@ -1015,8 +1017,6 @@ void setup_new_exec(struct linux_binprm * bprm)
> set_dumpable(current->mm, suid_dumpable);
> }
>
> - current->personality &= ~bprm->per_clear;
> -
> /*
> * Flush performance counters when crossing a
> * security domain:


--
Michal Simek, Ing. (M.Eng)
PetaLogix - Linux Solutions for a Reconfigurable World
w: http://www.petalogix.com p: +61-7-30090663,+42-0-721842854 f: +61-7-30090663

2010-02-02 21:45:48

by H. Peter Anvin

[permalink] [raw]
Subject: Re: Split 'flush_old_exec' into two functions - 221af7f87b97431e3ee21ce4b0e77d5411cf1549

On 02/02/2010 07:50 AM, Linus Torvalds wrote:
>
> Normally, that would happen, but this patch got applied early _literally_
> because I wanted it to hit -rc6 rather than wait any longer. So it had
> only a day or two of discussion, and probably just a few hours from the
> final version.
>
> That said, I think I may have found the cause.
>
> Peter: look at setup_new_exec(), and realize that it got moved _down_ to
> after all the personality setting. So far, so good, that was the
> intention, but look at what it does:
>
> current->flags &= ~PF_RANDOMIZE;
>
> and look at how fs/binfmt_elf.c does it not just after the personality
> setting, but also after
>
> if (!(current->personality & ADDR_NO_RANDOMIZE) && randomize_va_space)
> current->flags |= PF_RANDOMIZE;
>
> so it looks like you may have moved it down too much.
>

Yes, not entirely surprising.

> I think you did that because you wanted to do that
>
> arch_pick_mmap_layout(current->mm);
>
> in setup_new_exec(). Which makes total sense, but it all means that the
> whole preparatory patch did way more than my initial one (which put
> setup_new_exec() right after flush_old_exec())

Yes, that was the intention, and I did specify that I had folded in my
(previously posted as a separate patch) changes; the intent was to avoid
a bisect hole. I didn't describe it well because of the rush, though.

> Michael, mind trying this (UNTESTED!) patch? It makes conceptual sense,
> and moves some more of the flushing of the old process state up to
> "flush_old_exec()" rather than doing it late in "setup_new_exec()".
>
> (I suspect we should also move the signal/fd flushing there, but I doubt
> it matters)

Quite.

-hpa