2009-11-28 15:24:24

by Mike Frysinger

[permalink] [raw]
Subject: [PATCH] FDPIC: respect PT_GNU_STACK exec markings when creating NOMMU stack

The current code will load the stack size and markings, but then only use
the markings in the MMU code path. The NOMMU code path always passes EXEC
to the mmap() call. While this doesn't matter to most people during the
run of the code, it causes a pointless icache flush when starting every
FDPIC application and by default, that tends to be 128kB of waste.

Signed-off-by: Mike Frysinger <[email protected]>
---
note: this will apply cleanly with the uninitialized flag patch applied,
but otherwise doesn't directly depend on it

fs/binfmt_elf_fdpic.c | 3 ++-
1 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/fs/binfmt_elf_fdpic.c b/fs/binfmt_elf_fdpic.c
index 8563a57..3e2507b 100644
--- a/fs/binfmt_elf_fdpic.c
+++ b/fs/binfmt_elf_fdpic.c
@@ -380,7 +380,8 @@ static int load_elf_fdpic_binary(struct linux_binprm *bprm,

down_write(&current->mm->mmap_sem);
current->mm->start_brk = do_mmap(NULL, 0, stack_size,
- PROT_READ | PROT_WRITE | PROT_EXEC,
+ PROT_READ | PROT_WRITE |
+ (executable_stack & EXSTACK_ENABLE_X ? PROT_EXEC : 0),
MAP_PRIVATE | MAP_ANONYMOUS |
MAP_UNINITIALIZED | MAP_GROWSDOWN,
0);
--
1.6.5.3


2009-11-28 18:47:41

by Mike Frysinger

[permalink] [raw]
Subject: Re: [uClinux-dev] [PATCH] FDPIC: respect PT_GNU_STACK exec markings when creating NOMMU stack

On Sat, Nov 28, 2009 at 10:24, Mike Frysinger wrote:
> The current code will load the stack size and markings, but then only use
> the markings in the MMU code path.  The NOMMU code path always passes EXEC
> to the mmap() call.  While this doesn't matter to most people during the
> run of the code, it causes a pointless icache flush when starting every
> FDPIC application and by default, that tends to be 128kB of waste.

for some raw numbers:
with my default FDPIC boot (inetd/syslog/watchdog), we icache flush
18,562,124 bytes. with this stack fix, we cut off 3,538,944 bytes
(19% shrinkage).
-mike

2009-12-02 13:45:15

by David Howells

[permalink] [raw]
Subject: Re: [PATCH] FDPIC: respect PT_GNU_STACK exec markings when creating NOMMU stack

Mike Frysinger <[email protected]> wrote:

> that tends to be 128kB of waste.

While your patch is reasonable, at least in concept, I don't see where you get
this bit of the description from.

> + (executable_stack & EXSTACK_ENABLE_X ? PROT_EXEC : 0),

executable_stack might be EXSTACK_DEFAULT, but the default might be to enable
stack exec - in which case, this is wrong.

I also don't think that the EXSTACK_xxx constants are a bitmask; I think
they're an enumeration - in which case you shouldn't be using '&' to test
them. setup_arg_pages() uses '=='.

As far as I can see only powerpc currently defines the behaviour for
EXSTACK_DEFAULT. Mostly it seems to depend on VM_STACK_DEFAULT_FLAGS.

Further, I think it only matters for MMU and MPU modes, not for NOMMU mode.

There is one further consequence of your patch that you don't mention: the brk
area _also_ becomes non-executable. In NOMMU, I suspect this is irrelevant,
as I'm not sure brk is used at all (should we make sys_brk() return ENOSYS?).

How about the attached instead? Probably VM_DATA_DEFAULT_FLAGS should not
include VM_EXEC for either Blackfin or FRV, but it may be required for SH.
The if-statement that calls elf_read_implies_exec() will be optimised away
unless the arch specifically sets it (which none of FRV, Blackfin or SH do).

David
---
From: Mike Frysinger <[email protected]>
Subject: [PATCH] FDPIC: Respect PT_GNU_STACK exec protection markings when creating NOMMU stack

The current code will load the stack size and protection markings, but then
only use the markings in the MMU code path. The NOMMU code path always passes
PROT_EXEC to the mmap() call. While this doesn't matter to most people whilst
the code is running, it will cause a pointless icache flush when starting every
FDPIC application. Typically this icache flush will be of a region on the
order of 128KB in size, or may be the entire icache, depending on the
facilities available on the CPU.

In the case where the arch default behaviour seems to be desired
(EXSTACK_DEFAULT), we probe VM_STACK_FLAGS for VM_EXEC to determine whether we
should be setting PROT_EXEC or not.

For arches that support an MPU (Memory Protection Unit - an MMU without the
virtual mapping capability), setting PROT_EXEC or not will make an important
difference.

It should be noted that this change also affects the executability of the brk
region, since ELF-FDPIC has that share with the stack. However, this is
probably irrelevant as NOMMU programs aren't likely to use the brk region,
preferring instead allocation via mmap().

Signed-off-by: Mike Frysinger <[email protected]>
Signed-off-by: David Howells <[email protected]>
---

arch/blackfin/include/asm/page.h | 5 +++++
arch/frv/include/asm/page.h | 2 --
fs/binfmt_elf_fdpic.c | 13 +++++++++++--
3 files changed, 16 insertions(+), 4 deletions(-)


diff --git a/arch/blackfin/include/asm/page.h b/arch/blackfin/include/asm/page.h
index 944a07c..1d04e40 100644
--- a/arch/blackfin/include/asm/page.h
+++ b/arch/blackfin/include/asm/page.h
@@ -10,4 +10,9 @@
#include <asm-generic/page.h>
#define MAP_NR(addr) (((unsigned long)(addr)-PAGE_OFFSET) >> PAGE_SHIFT)

+#define VM_DATA_DEFAULT_FLAGS \
+ (VM_READ | VM_WRITE | \
+ ((current->personality & READ_IMPLIES_EXEC) ? VM_EXEC : 0 ) | \
+ VM_MAYREAD | VM_MAYWRITE | VM_MAYEXEC)
+
#endif
diff --git a/arch/frv/include/asm/page.h b/arch/frv/include/asm/page.h
index 25c6a50..8c97068 100644
--- a/arch/frv/include/asm/page.h
+++ b/arch/frv/include/asm/page.h
@@ -63,12 +63,10 @@ extern unsigned long max_pfn;
#define virt_addr_valid(kaddr) pfn_valid(__pa(kaddr) >> PAGE_SHIFT)


-#ifdef CONFIG_MMU
#define VM_DATA_DEFAULT_FLAGS \
(VM_READ | VM_WRITE | \
((current->personality & READ_IMPLIES_EXEC) ? VM_EXEC : 0 ) | \
VM_MAYREAD | VM_MAYWRITE | VM_MAYEXEC)
-#endif

#endif /* __ASSEMBLY__ */

diff --git a/fs/binfmt_elf_fdpic.c b/fs/binfmt_elf_fdpic.c
index 56f159b..4bd0a27 100644
--- a/fs/binfmt_elf_fdpic.c
+++ b/fs/binfmt_elf_fdpic.c
@@ -171,6 +171,9 @@ static int load_elf_fdpic_binary(struct linux_binprm *bprm,
#ifdef ELF_FDPIC_PLAT_INIT
unsigned long dynaddr;
#endif
+#ifndef CONFIG_MMU
+ unsigned long stack_prot;
+#endif
struct file *interpreter = NULL; /* to shut gcc up */
char *interpreter_name = NULL;
int executable_stack;
@@ -316,6 +319,8 @@ static int load_elf_fdpic_binary(struct linux_binprm *bprm,
* defunct, deceased, etc. after this point we have to exit via
* error_kill */
set_personality(PER_LINUX_FDPIC);
+ if (elf_read_implies_exec(&exec_params.hdr, executable_stack))
+ current->personality |= READ_IMPLIES_EXEC;
set_binfmt(&elf_fdpic_format);

current->mm->start_code = 0;
@@ -377,9 +382,13 @@ static int load_elf_fdpic_binary(struct linux_binprm *bprm,
if (stack_size < PAGE_SIZE * 2)
stack_size = PAGE_SIZE * 2;

+ stack_prot = PROT_READ | PROT_WRITE;
+ if (executable_stack == EXSTACK_ENABLE_X ||
+ (executable_stack == EXSTACK_DEFAULT && VM_STACK_FLAGS & VM_EXEC))
+ stack_prot |= PROT_EXEC;
+
down_write(&current->mm->mmap_sem);
- current->mm->start_brk = do_mmap(NULL, 0, stack_size,
- PROT_READ | PROT_WRITE | PROT_EXEC,
+ current->mm->start_brk = do_mmap(NULL, 0, stack_size, stack_prot,
MAP_PRIVATE | MAP_ANONYMOUS |
MAP_UNINITIALIZED | MAP_GROWSDOWN,
0);

2009-12-02 21:30:13

by Mike Frysinger

[permalink] [raw]
Subject: Re: [uClinux-dev] Re: [PATCH] FDPIC: respect PT_GNU_STACK exec markings when creating NOMMU stack

On Wed, Dec 2, 2009 at 08:44, David Howells wrote:
> Mike Frysinger wrote:
>> that tends to be 128kB of waste.
>
> While your patch is reasonable, at least in concept, I don't see where you get
> this bit of the description from.

well, you need the rest of my sentence:
... it causes a pointless icache flush when starting every ...

so i mean iflushing the stack is a waste, and default FDPIC stack
tends to be 128kB

>> +                                      (executable_stack & EXSTACK_ENABLE_X ? PROT_EXEC : 0),
>
> executable_stack might be EXSTACK_DEFAULT, but the default might be to enable
> stack exec - in which case, this is wrong.
>
> I also don't think that the EXSTACK_xxx constants are a bitmask; I think
> they're an enumeration - in which case you shouldn't be using '&' to test
> them.  setup_arg_pages() uses '=='.

err, right, this probably should have read something like:
(executable_stack == EXSTACK_DISABLE_X ? 0 : PROT_EXEC)

> Further, I think it only matters for MMU and MPU modes, not for NOMMU mode.

it matters for all modes wrt the iflush that execution perms imply.
in terms of actually being able to execute anything (and having the
permissions respected), you are correct. my concern was more for the
former issue ... the latter is just a nice side effect.

> There is one further consequence of your patch that you don't mention: the brk
> area _also_ becomes non-executable.  In NOMMU, I suspect this is irrelevant,
> as I'm not sure brk is used at all (should we make sys_brk() return ENOSYS?).

i didnt realize this ... that's why we have you here ;). i have seen
a few apps use brk()/sbrk() to query the size of things (like
e2fsprogs), but not to try and increase mappings, let alone to try and
load dynamic code and jump there (i havent even seen this on mmu
systems). certainly the uClibc malloc implementations dont use these
functions for nommu systems, nor would they work reliably since we
dont have the luxury of assuming the brk() space is wide open (since
it can easily be sitting against another mapping).

> How about the attached instead?

ive tested it and it works fine for me. my fdpic apps get an exec
stack if the PT_GNU_STACK has E, and they dont if it doesnt.

> Probably VM_DATA_DEFAULT_FLAGS should not
> include VM_EXEC for either Blackfin or FRV, but it may be required for SH.
> The if-statement that calls elf_read_implies_exec() will be optimised away
> unless the arch specifically sets it (which none of FRV, Blackfin or SH do).

while true, wont the later personality test (in VM_DATA_DEFAULT_FLAGS)
be left there ? guess it's not that big of a deal.

perhaps this define should be added to asm-generic/page.h though
rather than each arch. the default value you use seems reasonable for
everyone.
-mike

2009-12-03 17:59:09

by David Howells

[permalink] [raw]
Subject: Re: [uClinux-dev] Re: [PATCH] FDPIC: respect PT_GNU_STACK exec markings when creating NOMMU stack

Mike Frysinger <[email protected]> wrote:

> i have seen a few apps use brk()/sbrk() to query the size of things (like
> e2fsprogs)

We do actually record the size of the brk segment, so maybe we could icache
flush brk as it is increased (if it is increased):

diff --git a/mm/nommu.c b/mm/nommu.c
index 3754b16..2ea823d 100644
--- a/mm/nommu.c
+++ b/mm/nommu.c
@@ -432,6 +432,7 @@ SYSCALL_DEFINE1(brk, unsigned long, brk)
/*
* Ok, looks good - let it rip.
*/
+ flush_icache_range(mm->brk, brk);
return mm->brk = brk;
}

It might also be worth making the availability of brk() a config option under
NOMMU.

> >?Probably VM_DATA_DEFAULT_FLAGS should not include VM_EXEC for either
> > Blackfin or FRV, but it may be required for SH. The if-statement that
> > calls elf_read_implies_exec() will be optimised away unless the arch
> > specifically sets it (which none of FRV, Blackfin or SH do).
>
> while true, wont the later personality test (in VM_DATA_DEFAULT_FLAGS)
> be left there ? guess it's not that big of a deal.

I think we could justify altering FRV and Blackfin to get rid of that test,
since we don't make use of read-implies-exec in those arches, but I think that
should be a separate patch.

David

2009-12-04 07:01:56

by Mike Frysinger

[permalink] [raw]
Subject: Re: [uClinux-dev] Re: [PATCH] FDPIC: respect PT_GNU_STACK exec markings when creating NOMMU stack

On Thursday 03 December 2009 12:58:04 David Howells wrote:
> Mike Frysinger <[email protected]> wrote:
> > i have seen a few apps use brk()/sbrk() to query the size of things (like
> > e2fsprogs)
>
> We do actually record the size of the brk segment, so maybe we could icache
> flush brk as it is increased (if it is increased):
>
> diff --git a/mm/nommu.c b/mm/nommu.c
> index 3754b16..2ea823d 100644
> --- a/mm/nommu.c
> +++ b/mm/nommu.c
> @@ -432,6 +432,7 @@ SYSCALL_DEFINE1(brk, unsigned long, brk)
> /*
> * Ok, looks good - let it rip.
> */
> + flush_icache_range(mm->brk, brk);
> return mm->brk = brk;
> }

probably want mm->brk + brk for the second argument
-mike


Attachments:
signature.asc (836.00 B)
This is a digitally signed message part.

2009-12-04 09:42:53

by David Howells

[permalink] [raw]
Subject: Re: [uClinux-dev] Re: [PATCH] FDPIC: respect PT_GNU_STACK exec markings when creating NOMMU stack

Mike Frysinger <[email protected]> wrote:

> > + flush_icache_range(mm->brk, brk);
> > return mm->brk = brk;
> > }
>
> probably want mm->brk + brk for the second argument

Ummm... Why? mm->brk and brk are both addresses - note the return statement
where brk just gets assigned to mm->brk.

David

2009-12-04 10:21:32

by Mike Frysinger

[permalink] [raw]
Subject: Re: [uClinux-dev] Re: [PATCH] FDPIC: respect PT_GNU_STACK exec markings when creating NOMMU stack

On Friday 04 December 2009 04:40:32 David Howells wrote:
> Mike Frysinger <[email protected]> wrote:
> > > + flush_icache_range(mm->brk, brk);
> > > return mm->brk = brk;
> > > }
> >
> > probably want mm->brk + brk for the second argument
>
> Ummm... Why? mm->brk and brk are both addresses - note the return
> statement where brk just gets assigned to mm->brk.

yeah, sorry. i was thinking sbrk where the argument is an increment, not a
new address.
-mike


Attachments:
signature.asc (836.00 B)
This is a digitally signed message part.