2012-10-24 23:20:47

by Kees Cook

[permalink] [raw]
Subject: [PATCH] exec: do not leave bprm->interp on stack

If a series of scripts are executed, each triggering module loading via
unprintable bytes in the script header, kernel stack contents can leak
into the command line.

Normally execution of binfmt_script and binfmt_misc happens
recursively. However, when modules are enabled, and unprintable bytes
exist in the bprm->buf, execution will restart after attempting to load
matching binfmt modules. Unfortunately, the logic in binfmt_script and
binfmt_misc does not expect to get restarted. They leave bprm->interp
pointing to their local stack. This means on restart bprm->interp is
left pointing into unused stack memory which can then be copied into
the userspace argv areas.

This changes the logic to require allocation for any changes to the
bprm->interp. To avoid adding a new kmalloc to every exec, the default
value is left as-is. Only when passing through binfmt_script or
binfmt_misc does an allocation take place.

For a proof of concept, see DoTest.sh from:
http://www.halfdog.net/Security/2012/LinuxKernelBinfmtScriptStackDataDisclosure/

Cc: Alexander Viro <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: halfdog <[email protected]>
Cc: [email protected]
Signed-off-by: Kees Cook <[email protected]>
---
fs/binfmt_misc.c | 5 ++++-
fs/binfmt_script.c | 4 +++-
fs/exec.c | 14 ++++++++++++++
include/linux/binfmts.h | 1 +
4 files changed, 22 insertions(+), 2 deletions(-)

diff --git a/fs/binfmt_misc.c b/fs/binfmt_misc.c
index 790b3cd..772428d 100644
--- a/fs/binfmt_misc.c
+++ b/fs/binfmt_misc.c
@@ -176,7 +176,10 @@ static int load_misc_binary(struct linux_binprm *bprm, struct pt_regs *regs)
goto _error;
bprm->argc ++;

- bprm->interp = iname; /* for binfmt_script */
+ /* Update interp in case binfmt_script needs it. */
+ retval = bprm_change_interp(iname, bprm);
+ if (retval < 0)
+ goto _error;

interp_file = open_exec (iname);
retval = PTR_ERR (interp_file);
diff --git a/fs/binfmt_script.c b/fs/binfmt_script.c
index d3b8c1f..df49d48 100644
--- a/fs/binfmt_script.c
+++ b/fs/binfmt_script.c
@@ -82,7 +82,9 @@ static int load_script(struct linux_binprm *bprm,struct pt_regs *regs)
retval = copy_strings_kernel(1, &i_name, bprm);
if (retval) return retval;
bprm->argc++;
- bprm->interp = interp;
+ retval = bprm_change_interp(interp, bprm);
+ if (retval < 0)
+ return retval;

/*
* OK, now restart the process with the interpreter's dentry.
diff --git a/fs/exec.c b/fs/exec.c
index 8b9011b..91099be 100644
--- a/fs/exec.c
+++ b/fs/exec.c
@@ -1174,9 +1174,23 @@ void free_bprm(struct linux_binprm *bprm)
mutex_unlock(&current->signal->cred_guard_mutex);
abort_creds(bprm->cred);
}
+ /* If a binfmt changed the interp, free it. */
+ if (bprm->interp != bprm->filename)
+ kfree(bprm->interp);
kfree(bprm);
}

+int bprm_change_interp(char *interp, struct linux_binprm *bprm)
+{
+ /* If a binfmt changed the interp, free it first. */
+ if (bprm->interp != bprm->filename)
+ kfree(bprm->interp);
+ bprm->interp = kstrdup(interp, GFP_KERNEL);
+ if (!bprm->interp)
+ return -ENOMEM;
+ return 0;
+}
+
/*
* install the new credentials for this executable
*/
diff --git a/include/linux/binfmts.h b/include/linux/binfmts.h
index cfcc6bf..de0628e 100644
--- a/include/linux/binfmts.h
+++ b/include/linux/binfmts.h
@@ -114,6 +114,7 @@ extern int setup_arg_pages(struct linux_binprm * bprm,
unsigned long stack_top,
int executable_stack);
extern int bprm_mm_init(struct linux_binprm *bprm);
+extern int bprm_change_interp(char *interp, struct linux_binprm *bprm);
extern int copy_strings_kernel(int argc, const char *const *argv,
struct linux_binprm *bprm);
extern int prepare_bprm_creds(struct linux_binprm *bprm);
--
1.7.9.5


--
Kees Cook
Chrome OS Security


2012-10-25 04:16:34

by Al Viro

[permalink] [raw]
Subject: Re: [PATCH] exec: do not leave bprm->interp on stack

On Wed, Oct 24, 2012 at 04:20:32PM -0700, Kees Cook wrote:
> If a series of scripts are executed, each triggering module loading via
> unprintable bytes in the script header, kernel stack contents can leak
> into the command line.
>
> Normally execution of binfmt_script and binfmt_misc happens
> recursively. However, when modules are enabled, and unprintable bytes
> exist in the bprm->buf, execution will restart after attempting to load
> matching binfmt modules. Unfortunately, the logic in binfmt_script and
> binfmt_misc does not expect to get restarted. They leave bprm->interp
> pointing to their local stack. This means on restart bprm->interp is
> left pointing into unused stack memory which can then be copied into
> the userspace argv areas.
>
> This changes the logic to require allocation for any changes to the
> bprm->interp. To avoid adding a new kmalloc to every exec, the default
> value is left as-is. Only when passing through binfmt_script or
> binfmt_misc does an allocation take place.

I really don't like that. It papers over the problem, but doesn't really
solve the underlying stupidity. We have no good reason to retry a binfmt
we'd already attempted on this level of recursion. And your patch doesn't
deal with that at all.

2012-10-25 06:21:25

by Kees Cook

[permalink] [raw]
Subject: Re: [PATCH] exec: do not leave bprm->interp on stack

On Wed, Oct 24, 2012 at 9:16 PM, Al Viro <[email protected]> wrote:
> On Wed, Oct 24, 2012 at 04:20:32PM -0700, Kees Cook wrote:
>> If a series of scripts are executed, each triggering module loading via
>> unprintable bytes in the script header, kernel stack contents can leak
>> into the command line.
>>
>> Normally execution of binfmt_script and binfmt_misc happens
>> recursively. However, when modules are enabled, and unprintable bytes
>> exist in the bprm->buf, execution will restart after attempting to load
>> matching binfmt modules. Unfortunately, the logic in binfmt_script and
>> binfmt_misc does not expect to get restarted. They leave bprm->interp
>> pointing to their local stack. This means on restart bprm->interp is
>> left pointing into unused stack memory which can then be copied into
>> the userspace argv areas.
>>
>> This changes the logic to require allocation for any changes to the
>> bprm->interp. To avoid adding a new kmalloc to every exec, the default
>> value is left as-is. Only when passing through binfmt_script or
>> binfmt_misc does an allocation take place.
>
> I really don't like that. It papers over the problem, but doesn't really
> solve the underlying stupidity. We have no good reason to retry a binfmt
> we'd already attempted on this level of recursion. And your patch doesn't
> deal with that at all.

What should the code here _actually_ be doing? The _script and _misc
handlers expect to rewrite the bprm contents and recurse, but the
module loader want to try again. It's not clear to me what the binfmt
module handler is even there for; I don't see any binfmt-XXXX aliases
in the tree. If nothing uses it, should we just rip it out? That would
solve it too.

-Kees

--
Kees Cook
Chrome OS Security

2012-10-25 11:46:42

by Prasad Pandit

[permalink] [raw]
Subject: Re: [PATCH] exec: do not leave bprm->interp on stack


Hello Kees,

+-- On Wed, 24 Oct 2012, Kees Cook wrote --+
| What should the code here _actually_ be doing? The _script and _misc
| handlers expect to rewrite the bprm contents and recurse, but the module
| loader want to try again. It's not clear to me what the binfmt module
| handler is even there for; I don't see any binfmt-XXXX aliases in the tree.
| If nothing uses it, should we just rip it out? That would solve it too.

I've been following this issue and updated versions of HDs patch. Below is a
small patch to search_binary_handler() routine, which attempts to make the
request_module call before calling load_script routine.

Besides fixing the stack disclosure issue it also helps to *simplify* the
search_binary_handler routine by removing the -for (try=0;try<2;try++)- loop.

I'd really appreciate any comments/suggestions you may have.

===
diff --git a/fs/exec.c b/fs/exec.c
index 8b9011b..e368f41 100644
--- a/fs/exec.c
+++ b/fs/exec.c
@@ -1369,65 +1369,55 @@ int search_binary_handler(struct linux_binprm *bprm,struct pt_regs *regs)
old_vpid = task_pid_nr_ns(current, task_active_pid_ns(current->parent));
rcu_read_unlock();

- retval = -ENOENT;
- for (try=0; try<2; try++) {
- read_lock(&binfmt_lock);
- list_for_each_entry(fmt, &formats, lh) {
- int (*fn)(struct linux_binprm *, struct pt_regs *) = fmt->load_binary;
- if (!fn)
- continue;
- if (!try_module_get(fmt->module))
- continue;
- read_unlock(&binfmt_lock);
- retval = fn(bprm, regs);
- /*
- * Restore the depth counter to its starting value
- * in this call, so we don't have to rely on every
- * load_binary function to restore it on return.
- */
- bprm->recursion_depth = depth;
- if (retval >= 0) {
- if (depth == 0) {
- trace_sched_process_exec(current, old_pid, bprm);
- ptrace_event(PTRACE_EVENT_EXEC, old_vpid);
- }
- put_binfmt(fmt);
- allow_write_access(bprm->file);
- if (bprm->file)
- fput(bprm->file);
- bprm->file = NULL;
- current->did_exec = 1;
- proc_exec_connector(current);
- return retval;
- }
- read_lock(&binfmt_lock);
- put_binfmt(fmt);
- if (retval != -ENOEXEC || bprm->mm == NULL)
- break;
- if (!bprm->file) {
- read_unlock(&binfmt_lock);
- return retval;
- }
- }
- read_unlock(&binfmt_lock);
#ifdef CONFIG_MODULES
- if (retval != -ENOEXEC || bprm->mm == NULL) {
- break;
- } else {
-#define printable(c) (((c)=='\t') || ((c)=='\n') || (0x20<=(c) && (c)<=0x7e))
- if (printable(bprm->buf[0]) &&
- printable(bprm->buf[1]) &&
- printable(bprm->buf[2]) &&
- printable(bprm->buf[3]))
- break; /* -ENOEXEC */
- if (try)
- break; /* -ENOEXEC */
- request_module("binfmt-%04x", *(unsigned short *)(&bprm->buf[2]));
- }
-#else
- break;
+#define printable(c) (0x20<=(c) && (c)<=0x7e)
+
+ if (printable(bprm->buf[0]) && printable(bprm->buf[1])
+ && printable(bprm->buf[2]) && printable(bprm->buf[3]))
+ request_module("binfmt-%04x", *(unsigned short *)(&bprm->buf[2]));
#endif
- }
+
+ retval = -ENOENT;
+ read_lock(&binfmt_lock);
+ list_for_each_entry(fmt, &formats, lh) {
+ int (*fn)(struct linux_binprm *, struct pt_regs *) = fmt->load_binary;
+ if (!fn)
+ continue;
+ if (!try_module_get(fmt->module))
+ continue;
+ read_unlock(&binfmt_lock);
+ retval = fn(bprm, regs);
+ /*
+ * Restore the depth counter to its starting value
+ * in this call, so we don't have to rely on every
+ * load_binary function to restore it on return.
+ */
+ bprm->recursion_depth = depth;
+ if (retval >= 0) {
+ if (depth == 0) {
+ trace_sched_process_exec(current, old_pid, bprm);
+ ptrace_event(PTRACE_EVENT_EXEC, old_vpid);
+ }
+ put_binfmt(fmt);
+ allow_write_access(bprm->file);
+ if (bprm->file)
+ fput(bprm->file);
+ bprm->file = NULL;
+ current->did_exec = 1;
+ proc_exec_connector(current);
+ return retval;
+ }
+ read_lock(&binfmt_lock);
+ put_binfmt(fmt);
+ if (retval != -ENOEXEC || bprm->mm == NULL)
+ break;
+ if (!bprm->file) {
+ read_unlock(&binfmt_lock);
+ return retval;
+ }
+ }
+ read_unlock(&binfmt_lock);
+
return retval;
}

===

Thank you.
--
Prasad J Pandit / Red Hat Security Response Team
DB7A 84C5 D3F9 7CD1 B5EB C939 D048 7860 3655 602B

2012-10-25 12:03:54

by Tetsuo Handa

[permalink] [raw]
Subject: Re: [PATCH] exec: do not leave bprm->interp on stack

P J P wrote:
>
> Hello Kees,
>
> +-- On Wed, 24 Oct 2012, Kees Cook wrote --+
> | What should the code here _actually_ be doing? The _script and _misc
> | handlers expect to rewrite the bprm contents and recurse, but the module
> | loader want to try again. It's not clear to me what the binfmt module
> | handler is even there for; I don't see any binfmt-XXXX aliases in the tree.
> | If nothing uses it, should we just rip it out? That would solve it too.
>
> I've been following this issue and updated versions of HDs patch. Below is a
> small patch to search_binary_handler() routine, which attempts to make the
> request_module call before calling load_script routine.
>
> Besides fixing the stack disclosure issue it also helps to *simplify* the
> search_binary_handler routine by removing the -for (try=0;try<2;try++)- loop.
>
> I'd really appreciate any comments/suggestions you may have.

Excuse me, but why do you change definition of printable(c) ?
Looks like a regression.

Wouldn't your patch trigger call request_module() whenever a script
starting with "#!/bin/sh" is executed?

And if you meant

if (!(printable(bprm->buf[0]) && printable(bprm->buf[1])
&& printable(bprm->buf[2]) && printable(bprm->buf[3])))

then, wouldn't that trigger request_module() recursion?

2012-10-25 12:09:57

by Al Viro

[permalink] [raw]
Subject: Re: [PATCH] exec: do not leave bprm->interp on stack

On Thu, Oct 25, 2012 at 05:16:22PM +0530, P J P wrote:
>
> Hello Kees,
>
> +-- On Wed, 24 Oct 2012, Kees Cook wrote --+
> | What should the code here _actually_ be doing? The _script and _misc
> | handlers expect to rewrite the bprm contents and recurse, but the module
> | loader want to try again. It's not clear to me what the binfmt module
> | handler is even there for; I don't see any binfmt-XXXX aliases in the tree.
> | If nothing uses it, should we just rip it out? That would solve it too.

; grep binfmt- /etc/*/* 2>/dev/null
/etc/modprobe.d/aliases.conf:install binfmt-0000 /bin/true
/etc/modprobe.d/aliases.conf:alias binfmt-204 binfmt_aout
/etc/modprobe.d/aliases.conf:alias binfmt-263 binfmt_aout
/etc/modprobe.d/aliases.conf:alias binfmt-264 binfmt_aout
/etc/modprobe.d/aliases.conf:alias binfmt-267 binfmt_aout
/etc/modprobe.d/aliases.conf:alias binfmt-387 binfmt_aout
; dpkg -S /etc/modprobe.d/aliases.conf
module-init-tools: /etc/modprobe.d/aliases.conf

> I've been following this issue and updated versions of HDs patch. Below is a
> small patch to search_binary_handler() routine, which attempts to make the
> request_module call before calling load_script routine.
>
> Besides fixing the stack disclosure issue it also helps to *simplify* the
> search_binary_handler routine by removing the -for (try=0;try<2;try++)- loop.
>
> I'd really appreciate any comments/suggestions you may have.

Suggestion: try testing your patches once in a while. Stopping to think
for a minute would also help - you've turned every execve() into "do
request_module() first". How do you suppose request_module() works? And
how would modprobe be able to run? IOW, this request_module() will be
stopped by protection against infinite loops, at which point execve will
proceed with already present binfmt, without having loaded anything.
But that's even worse than slowdown on each execve (with a lot of whining
in process), because *every* request_module() will fail now due to the same
loop prevention.

2012-10-25 12:38:48

by Al Viro

[permalink] [raw]
Subject: Re: [PATCH] exec: do not leave bprm->interp on stack

On Thu, Oct 25, 2012 at 01:09:53PM +0100, Al Viro wrote:
> On Thu, Oct 25, 2012 at 05:16:22PM +0530, P J P wrote:
> >
> > Hello Kees,
> >
> > +-- On Wed, 24 Oct 2012, Kees Cook wrote --+
> > | What should the code here _actually_ be doing? The _script and _misc
> > | handlers expect to rewrite the bprm contents and recurse, but the module
> > | loader want to try again. It's not clear to me what the binfmt module
> > | handler is even there for; I don't see any binfmt-XXXX aliases in the tree.
> > | If nothing uses it, should we just rip it out? That would solve it too.
>
> ; grep binfmt- /etc/*/* 2>/dev/null
> /etc/modprobe.d/aliases.conf:install binfmt-0000 /bin/true
> /etc/modprobe.d/aliases.conf:alias binfmt-204 binfmt_aout
> /etc/modprobe.d/aliases.conf:alias binfmt-263 binfmt_aout
> /etc/modprobe.d/aliases.conf:alias binfmt-264 binfmt_aout
> /etc/modprobe.d/aliases.conf:alias binfmt-267 binfmt_aout
> /etc/modprobe.d/aliases.conf:alias binfmt-387 binfmt_aout
> ; dpkg -S /etc/modprobe.d/aliases.conf
> module-init-tools: /etc/modprobe.d/aliases.conf
>
> > I've been following this issue and updated versions of HDs patch. Below is a
> > small patch to search_binary_handler() routine, which attempts to make the
> > request_module call before calling load_script routine.
> >
> > Besides fixing the stack disclosure issue it also helps to *simplify* the
> > search_binary_handler routine by removing the -for (try=0;try<2;try++)- loop.
> >
> > I'd really appreciate any comments/suggestions you may have.
>
> Suggestion: try testing your patches once in a while. Stopping to think
> for a minute would also help - you've turned every execve() into "do
> request_module() first". How do you suppose request_module() works? And
> how would modprobe be able to run? IOW, this request_module() will be
> stopped by protection against infinite loops, at which point execve will
> proceed with already present binfmt, without having loaded anything.
> But that's even worse than slowdown on each execve (with a lot of whining
> in process), because *every* request_module() will fail now due to the same
> loop prevention.

... and after the second look at your patch, looks like another breakage
in there will have a different effect - it doesn't just eliminate the
first pass through the loop, it inverts the test for "should I try
request_module()". Overall result is a bit less painful - request_module()
isn't broken on loop prevention, but
* every bleeding script will have bogus execution of modprobe done
at execve time (and you'd better pray that /sbin/modprobe isn't a shell
script wrapper around the actual binary, or you *will* get loop prevention
kick in)
* none of the existing binfmt-<...> aliases is going to be hit
now; IOW, all usecases got broken. Granted, realistically it just means
broken modular aout support, but then it's the only reason to have that
request_module() there in the first place.

2012-10-25 12:57:52

by Prasad Pandit

[permalink] [raw]
Subject: Re: [PATCH] exec: do not leave bprm->interp on stack


Hello Tetsuo,

+-- On Thu, 25 Oct 2012, Tetsuo Handa wrote --+
| Excuse me, but why do you change definition of printable(c) ?
| Looks like a regression.

#define printable(c) (((c)=='\t') || ((c)=='\n') || (0x20<=(c) && (c)<=0x7e))

Earlier definition of printable() as above was used to - break; - out of the
loop when (c) was either tab or new line or any printable character. Whereas,
in the patch it is used to call the request_module routine if the (c) is
printable character, and hence the change to - printable().


| Wouldn't your patch trigger call request_module() whenever a script
| starting with "#!/bin/sh" is executed?

Yes, Petr(a colleague here) already pointed out about excessive call to
request_module() routine, in case if the requested module is already loaded or
is not required/available. I'm trying to find a possible fix for the same.

Is there a way to to see if the requested module 'binfmt-xxxx' is accessible
or not? The call to - request_module - could be conditioned accordingly.

Thank you.
--
Prasad J Pandit / Red Hat Security Response Team
DB7A 84C5 D3F9 7CD1 B5EB C939 D048 7860 3655 602B

2012-10-26 17:38:51

by Prasad Pandit

[permalink] [raw]
Subject: Re: [PATCH] exec: do not leave bprm->interp on stack

+-- On Thu, 25 Oct 2012, Al Viro wrote --+
| * every bleeding script will have bogus execution of modprobe done
| at execve time (and you'd better pray that /sbin/modprobe isn't a shell
| script wrapper around the actual binary, or you *will* get loop prevention
| kick in)
| * none of the existing binfmt-<...> aliases is going to be hit
| now; IOW, all usecases got broken. Granted, realistically it just means
| broken modular aout support, but then it's the only reason to have that
| request_module() there in the first place.

Please have a look at the updated patch below.

It fixes the issue of excessive calls to request_module. find_module() routine
is used before request_module(), to see if the module is already loaded or
not. Module alias could dodge this though, I guess.

In this, request_module is called only when at least 1 of the first 4 bytes is
NOT printable, as is the present upstream case. It avoids calling
request_module for regular ELFs because printable() macro now includes the
last - DEL(0x7f) - character as well.

#define printable(c) (((c)=='\t') || ((c)=='\n') || (0x20<=(c) && (c)<=0x7f))

I am currently running this patch on my machine and logs only show messages
for the scripts generated by ./DoTest.sh tool.

$ grep -ri 'request_module' /var/log/messages
...
Oct 26 21:01:30 javelin kernel: [ 154.155980] request_module: failed to load: binfmt-660d
Oct 26 21:01:30 javelin kernel: [ 154.158161] request_module: failed to load: binfmt-660d
Oct 26 21:37:14 javelin kernel: [ 2293.030330] request_module: failed to load: binfmt-660d
Oct 26 21:40:07 javelin kernel: [ 2465.829154] request_module: failed to load: binfmt-660d

I'd appreciate any comments/suggestions that you may have.

===
diff --git a/fs/exec.c b/fs/exec.c
index 8b9011b..7615812 100644
--- a/fs/exec.c
+++ b/fs/exec.c
@@ -1369,65 +1369,69 @@ int search_binary_handler(struct linux_binprm *bprm,struct pt_regs *regs)
old_vpid = task_pid_nr_ns(current, task_active_pid_ns(current->parent));
rcu_read_unlock();

- retval = -ENOENT;
- for (try=0; try<2; try++) {
- read_lock(&binfmt_lock);
- list_for_each_entry(fmt, &formats, lh) {
- int (*fn)(struct linux_binprm *, struct pt_regs *) = fmt->load_binary;
- if (!fn)
- continue;
- if (!try_module_get(fmt->module))
- continue;
- read_unlock(&binfmt_lock);
- retval = fn(bprm, regs);
- /*
- * Restore the depth counter to its starting value
- * in this call, so we don't have to rely on every
- * load_binary function to restore it on return.
- */
- bprm->recursion_depth = depth;
- if (retval >= 0) {
- if (depth == 0) {
- trace_sched_process_exec(current, old_pid, bprm);
- ptrace_event(PTRACE_EVENT_EXEC, old_vpid);
- }
- put_binfmt(fmt);
- allow_write_access(bprm->file);
- if (bprm->file)
- fput(bprm->file);
- bprm->file = NULL;
- current->did_exec = 1;
- proc_exec_connector(current);
- return retval;
- }
- read_lock(&binfmt_lock);
- put_binfmt(fmt);
- if (retval != -ENOEXEC || bprm->mm == NULL)
- break;
- if (!bprm->file) {
- read_unlock(&binfmt_lock);
- return retval;
- }
- }
- read_unlock(&binfmt_lock);
#ifdef CONFIG_MODULES
- if (retval != -ENOEXEC || bprm->mm == NULL) {
- break;
- } else {
-#define printable(c) (((c)=='\t') || ((c)=='\n') || (0x20<=(c) && (c)<=0x7e))
- if (printable(bprm->buf[0]) &&
- printable(bprm->buf[1]) &&
- printable(bprm->buf[2]) &&
- printable(bprm->buf[3]))
- break; /* -ENOEXEC */
- if (try)
- break; /* -ENOEXEC */
- request_module("binfmt-%04x", *(unsigned short *)(&bprm->buf[2]));
- }
-#else
- break;
+#define printable(c) (((c)=='\t') || ((c)=='\n') || (0x20<=(c) && (c)<=0x7f))
+
+ if (!printable(bprm->buf[0]) || !printable(bprm->buf[1])
+ || !printable(bprm->buf[2]) || !printable(bprm->buf[3]))
+ {
+ char name[12] = "\0";
+ struct module *mod = NULL;
+ unsigned short *usp = (unsigned short *)(&bprm->buf[2]);
+
+ snprintf(name, sizeof(name), "binfmt-%04x", *usp);
+ mod = find_module(name);
+
+ /* request_module is called if and only if - `name' module is NOT
+ * loaded and at least 1 of the first 4 bytes is NOT printable.
+ */
+ if (!mod && request_module(name))
+ printk(KERN_WARNING "request_module: failed to load: %s\n", name);
+ }
+
#endif
- }
+
+ retval = -ENOENT;
+ read_lock(&binfmt_lock);
+ list_for_each_entry(fmt, &formats, lh) {
+ int (*fn)(struct linux_binprm *, struct pt_regs *) = fmt->load_binary;
+ if (!fn)
+ continue;
+ if (!try_module_get(fmt->module))
+ continue;
+ read_unlock(&binfmt_lock);
+ retval = fn(bprm, regs);
+ /*
+ * Restore the depth counter to its starting value
+ * in this call, so we don't have to rely on every
+ * load_binary function to restore it on return.
+ */
+ bprm->recursion_depth = depth;
+ if (retval >= 0) {
+ if (depth == 0) {
+ trace_sched_process_exec(current, old_pid, bprm);
+ ptrace_event(PTRACE_EVENT_EXEC, old_vpid);
+ }
+ put_binfmt(fmt);
+ allow_write_access(bprm->file);
+ if (bprm->file)
+ fput(bprm->file);
+ bprm->file = NULL;
+ current->did_exec = 1;
+ proc_exec_connector(current);
+ return retval;
+ }
+ read_lock(&binfmt_lock);
+ put_binfmt(fmt);
+ if (retval != -ENOEXEC || bprm->mm == NULL)
+ break;
+ if (!bprm->file) {
+ read_unlock(&binfmt_lock);
+ return retval;
+ }
+ }
+ read_unlock(&binfmt_lock);
+
return retval;
}

===


Thank you.
--
Prasad J Pandit / Red Hat Security Response Team
DB7A 84C5 D3F9 7CD1 B5EB C939 D048 7860 3655 602B

2012-10-26 18:36:06

by Al Viro

[permalink] [raw]
Subject: Re: [PATCH] exec: do not leave bprm->interp on stack

On Fri, Oct 26, 2012 at 11:08:20PM +0530, P J P wrote:
> +-- On Thu, 25 Oct 2012, Al Viro wrote --+
> | * every bleeding script will have bogus execution of modprobe done
> | at execve time (and you'd better pray that /sbin/modprobe isn't a shell
> | script wrapper around the actual binary, or you *will* get loop prevention
> | kick in)
> | * none of the existing binfmt-<...> aliases is going to be hit
> | now; IOW, all usecases got broken. Granted, realistically it just means
> | broken modular aout support, but then it's the only reason to have that
> | request_module() there in the first place.
>
> Please have a look at the updated patch below.
>
> It fixes the issue of excessive calls to request_module. find_module() routine
> is used before request_module(), to see if the module is already loaded or
> not. Module alias could dodge this though, I guess.

"Could"? Can you show a single module that would have name matching
binfmt-[0-9a-f]*? In other words, are they ever loaded _not_ via an
alias?

2012-10-27 10:47:45

by Prasad Pandit

[permalink] [raw]
Subject: Re: [PATCH] exec: do not leave bprm->interp on stack

+-- On Fri, 26 Oct 2012, Al Viro wrote --+
| > not. Module alias could dodge this though, I guess.
| "Could"? Can you show a single module that would have name matching
| binfmt-[0-9a-f]*? In other words, are they ever loaded _not_ via an
| alias?

I understand. I was wondering if alias information is accessible in the
kernel via any routine, alike find_module().

Just to get perspective about how many times request_module() would be called
with the latest patch, in general installations(or distributions), how
prevalent(in use) are binfmt-xxxx loadable modules?

Thank you.
--
Prasad J Pandit / Red Hat Security Response Team
DB7A 84C5 D3F9 7CD1 B5EB C939 D048 7860 3655 602B

2012-10-27 17:05:11

by Kees Cook

[permalink] [raw]
Subject: Re: [PATCH] exec: do not leave bprm->interp on stack

On Sat, Oct 27, 2012 at 3:47 AM, P J P <[email protected]> wrote:
> +-- On Fri, 26 Oct 2012, Al Viro wrote --+
> | > not. Module alias could dodge this though, I guess.
> | "Could"? Can you show a single module that would have name matching
> | binfmt-[0-9a-f]*? In other words, are they ever loaded _not_ via an
> | alias?
>
> I understand. I was wondering if alias information is accessible in the
> kernel via any routine, alike find_module().
>
> Just to get perspective about how many times request_module() would be called
> with the latest patch, in general installations(or distributions), how
> prevalent(in use) are binfmt-xxxx loadable modules?

Al showed a list of them earlier in the thread. I don't have any on
the various distros I checked.

The problem I see here is that we only want to do module loading in
the "no match" case. But that means that either we need to restart
with the original bprm, or we need to keep bprm changes off the stack.
Leading with a module load is going to wreck performance.

-Kees

--
Kees Cook
Chrome OS Security

2012-10-27 20:17:27

by Prasad Pandit

[permalink] [raw]
Subject: Re: [PATCH] exec: do not leave bprm->interp on stack

+-- On Sat, 27 Oct 2012, Kees Cook wrote --+
| Al showed a list of them earlier in the thread.

Yeah, the list Al showed and I came across mostly has - binfmt_aout - entry.
Do people still use - a.out - format? (considering ELF has been the default
standard for so many years)

| I don't have any on the various distros I checked.

Same here, my F17 machine has no entries for binfmt-xxxx modules, in fact I
don't even have the /etc/modprobe.d/aliases.conf file.

Documentation/binfmt_misc.txt talks about executing Java, Python, DOSEMU and
Windows programs which could be supported by loadable modules.

| The problem I see here is that we only want to do module loading in the "no
| match" case. But that means that either we need to restart with the original
| bprm, or we need to keep bprm changes off the stack. Leading with a module
| load is going to wreck performance.

I beg to *slightly* differ here. I agree we currently have a small overhead
of find_module() -> request_module() only when binfmt_xxxx module is already
loaded, partly because find_module can not resolve aliases.

I guess this small overhead is worth it if it helps to make things less
confusing and easy to follow. Besides this overhead does not exist for regular
executables ELFs and scripts alike.

If the required module is missing, a call to request_module() will anyway
happen and its cost remains the same whether it happens before or after the
"match".

Thank you.
--
Prasad J Pandit / Red Hat Security Response Team
DB7A 84C5 D3F9 7CD1 B5EB C939 D048 7860 3655 602B

2012-10-28 03:32:17

by Kees Cook

[permalink] [raw]
Subject: Re: [PATCH] exec: do not leave bprm->interp on stack

On Sat, Oct 27, 2012 at 1:16 PM, P J P <[email protected]> wrote:
> +-- On Sat, 27 Oct 2012, Kees Cook wrote --+
> | Al showed a list of them earlier in the thread.
>
> Yeah, the list Al showed and I came across mostly has - binfmt_aout - entry.
> Do people still use - a.out - format? (considering ELF has been the default
> standard for so many years)

No one sane has CONFIG_BINFMT_AOUT any more. :)

> | I don't have any on the various distros I checked.
>
> Same here, my F17 machine has no entries for binfmt-xxxx modules, in fact I
> don't even have the /etc/modprobe.d/aliases.conf file.
>
> Documentation/binfmt_misc.txt talks about executing Java, Python, DOSEMU and
> Windows programs which could be supported by loadable modules.

Right, but those are all registered from userspace and binfmt_misc
will catch them.

> | The problem I see here is that we only want to do module loading in the "no
> | match" case. But that means that either we need to restart with the original
> | bprm, or we need to keep bprm changes off the stack. Leading with a module
> | load is going to wreck performance.
>
> I beg to *slightly* differ here. I agree we currently have a small overhead
> of find_module() -> request_module() only when binfmt_xxxx module is already
> loaded, partly because find_module can not resolve aliases.

Al's point here is that non of the binfmts are named "binfmt-NNNN".
Those are only aliases, and those are only visible from userspace,
even after they're loaded, so the find_module() call in your example
will never match anything. Which means if there is a non-printable in
a binary, the kernel will exec modprobe even if there is already a
binfmt that will load it.

> I guess this small overhead is worth it if it helps to make things less
> confusing and easy to follow. Besides this overhead does not exist for regular
> executables ELFs and scripts alike.
>
> If the required module is missing, a call to request_module() will anyway
> happen and its cost remains the same whether it happens before or after the
> "match".

Well, we'll always do the modprobe call-out on a missing binfmt (so
for that reason I like the addition of the printk, though it should
probably be rate-limited), but we still need to only load modules at
fall-back time. Which means means we need to return from the recursive
loading attempt, which means we need to restart the bprm (slow) or we
need to do a heap alloc for the changed interp (less slow).

If we change binfmt_script to not make a recursive call, then we still
need to keep the interp change somewhere off the stack. I still think
my patchset is the least bad.

Al, do you have something else in mind?

-Kees

--
Kees Cook
Chrome OS Security

2012-11-06 08:10:57

by Prasad Pandit

[permalink] [raw]
Subject: Re: [PATCH] exec: do not leave bprm->interp on stack


Hello Kees, Al,

+-- On Sat, 27 Oct 2012, Kees Cook wrote --+
| If we change binfmt_script to not make a recursive call, then we still
| need to keep the interp change somewhere off the stack. I still think
| my patchset is the least bad.
|
| Al, do you have something else in mind?

Guys, are there any updates further?

Al, what's your take on the *rare* extra call to request_module?

Thank you.
--
Prasad J Pandit / Red Hat Security Response Team
DB7A 84C5 D3F9 7CD1 B5EB C939 D048 7860 3655 602B

2012-11-12 22:10:53

by Kees Cook

[permalink] [raw]
Subject: Re: [PATCH] exec: do not leave bprm->interp on stack

On Tue, Nov 6, 2012 at 12:10 AM, P J P <[email protected]> wrote:
>
> Hello Kees, Al,
>
> +-- On Sat, 27 Oct 2012, Kees Cook wrote --+
> | If we change binfmt_script to not make a recursive call, then we still
> | need to keep the interp change somewhere off the stack. I still think
> | my patchset is the least bad.
> |
> | Al, do you have something else in mind?
>
> Guys, are there any updates further?
>
> Al, what's your take on the *rare* extra call to request_module?

Without any other feedback, I'd like to use my minimal allocation
patch, since it fixes the problem and doesn't change any of the
semantics of how/when loading happens.

-Kees

--
Kees Cook
Chrome OS Security

2012-11-13 06:54:26

by halfdog

[permalink] [raw]
Subject: Re: [PATCH] exec: do not leave bprm->interp on stack

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Kees Cook wrote:
> On Tue, Nov 6, 2012 at 12:10 AM, P J P <[email protected]> wrote:
>>
>> Hello Kees, Al,
>>
>> +-- On Sat, 27 Oct 2012, Kees Cook wrote --+ | If we change
>> binfmt_script to not make a recursive call, then we still | need
>> to keep the interp change somewhere off the stack. I still think
>> | my patchset is the least bad. | | Al, do you have something
>> else in mind?
>>
>> Guys, are there any updates further?
>>
>> Al, what's your take on the *rare* extra call to request_module?
>
> Without any other feedback, I'd like to use my minimal allocation
> patch, since it fixes the problem and doesn't change any of the
> semantics of how/when loading happens.

As a first step, I think that we can go with the Keess'
(nice/small/simple) patch. On the long run, exec should be reworked. Not
only interp is modified, also credentials are set, e.g. when using
"ping" as interpreter. With intransparent error handling and
retry-logic, this might be a future local-root-exploit in the beginning
(I tried to, but did not manage yet).


Also a remark from Prasad Pandit did not make it to the list (or at
least I missed the replies).

> Yesterday, while testing Keess' patch I was reading through
> execve(2) manual which says: path name must be a valid executable
> which is NOT a script.
>
> $ man execve ... Interpreter scripts An interpreter script is a
> text file that has execute permission enabled and whose first line
> is of the form:
>
> #! interpreter [optional-arg]
>
> The interpreter must be a valid path name for an executable which
> is not itself a script.

Does someone know what POSIX says about that? I guess that interp
recursion might have some usecases: Script uses interp, but interp was
wrapped by admin or distribution folks into another script to fix
something, e.g. to pass an additional arg.

hd

- --
http://www.halfdog.net/
PGP: 156A AE98 B91F 0114 FE88 2BD8 C459 9386 feed a bee
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)

iEYEARECAAYFAlCh7ZEACgkQxFmThv7tq+4X/QCeLN+0qUtP6Hhag1d4iwZ4PZbL
evEAn2iPQH9mJ0zTHMs3qOsaWLRs9UWW
=Ow3u
-----END PGP SIGNATURE-----

2012-11-16 12:51:03

by Prasad Pandit

[permalink] [raw]
Subject: Re: [PATCH] exec: do not leave bprm->interp on stack


Hello folks,

+-- On Mon, 12 Nov 2012, Kees Cook wrote --+
| > Al, what's your take on the *rare* extra call to request_module?
|
| Without any other feedback, I'd like to use my minimal allocation
| patch, since it fixes the problem and doesn't change any of the
| semantics of how/when loading happens.

I did apply and test this patch with kernel-3.5.3 on my machine. Now it
seems to disclose dynamically allocated(kstrdup) bytes, instead of the call
stack bytes. Recursions still dodge and exceed the limit of
BINPRM_MAX_RECURSION(4).

Please pardon my asking, but - how is this a fix?
--
Prasad J Pandit / Red Hat Security Response Team
DB7A 84C5 D3F9 7CD1 B5EB C939 D048 7860 3655 602B

2012-11-16 18:00:38

by Kees Cook

[permalink] [raw]
Subject: Re: [PATCH] exec: do not leave bprm->interp on stack

On Fri, Nov 16, 2012 at 4:50 AM, P J P <[email protected]> wrote:
>
> Hello folks,
>
> +-- On Mon, 12 Nov 2012, Kees Cook wrote --+
> | > Al, what's your take on the *rare* extra call to request_module?
> |
> | Without any other feedback, I'd like to use my minimal allocation
> | patch, since it fixes the problem and doesn't change any of the
> | semantics of how/when loading happens.
>
> I did apply and test this patch with kernel-3.5.3 on my machine. Now it
> seems to disclose dynamically allocated(kstrdup) bytes, instead of the call
> stack bytes. Recursions still dodge and exceed the limit of
> BINPRM_MAX_RECURSION(4).
>
> Please pardon my asking, but - how is this a fix?

Hrm? It should be showing only the live heap-allocated interp -- are
you seeing uninitialized contents?

-Kees

--
Kees Cook
Chrome OS Security

2012-11-18 19:05:15

by Prasad Pandit

[permalink] [raw]
Subject: Re: [PATCH] exec: do not leave bprm->interp on stack

+-- On Fri, 16 Nov 2012, Kees Cook wrote --+
| Hrm? It should be showing only the live heap-allocated interp -- are
| you seeing uninitialized contents?

I don't see uninitialised content; I see interpreter names from previous
iterations. Which was the case earlier as well. The - interp - array is
initialised with the interpreter name, before being assigned to bprm->interp.

These - interp - bytes are *leaked* because after 4 recursions, when
load_script returns -ENOEXEC, - bprm->interp - becomes invalid for it starts
pointing to an invalid stack memory location.

Crux of the problem is in the fact that the recursion limit -
BINPRM_MAX_RECURSION(4) - exceeds after ones been rightly adhered to.

(bprm->recursion_depth > BINPRM_MAX_RECURSION))
return -ENOEXEC;

This check fails due to specific condition, which still exists.

Dynamically allocating memory fixes the leak by making the memory area live
and valid.

It does not fix the problem which caused the leak in the first place by
exceeding the BINPRM_MAX_RECURSION, not by 1 or 2 but possible 2^6
recursions. Isn't that performance hit?

--
Prasad J Pandit / Red Hat Security Response Team
DB7A 84C5 D3F9 7CD1 B5EB C939 D048 7860 3655 602B

2012-11-18 19:34:51

by Kees Cook

[permalink] [raw]
Subject: Re: [PATCH] exec: do not leave bprm->interp on stack

On Sun, Nov 18, 2012 at 11:04 AM, P J P <[email protected]> wrote:
> +-- On Fri, 16 Nov 2012, Kees Cook wrote --+
> | Hrm? It should be showing only the live heap-allocated interp -- are
> | you seeing uninitialized contents?
>
> I don't see uninitialised content; I see interpreter names from previous
> iterations. Which was the case earlier as well. The - interp - array is
> initialised with the interpreter name, before being assigned to bprm->interp.
>
> These - interp - bytes are *leaked* because after 4 recursions, when
> load_script returns -ENOEXEC, - bprm->interp - becomes invalid for it starts
> pointing to an invalid stack memory location.
>
> Crux of the problem is in the fact that the recursion limit -
> BINPRM_MAX_RECURSION(4) - exceeds after ones been rightly adhered to.
>
> (bprm->recursion_depth > BINPRM_MAX_RECURSION))
> return -ENOEXEC;
>
> This check fails due to specific condition, which still exists.
>
> Dynamically allocating memory fixes the leak by making the memory area live
> and valid.

Right. There are two problems. This fixes the first, which is the
memory content leak.

> It does not fix the problem which caused the leak in the first place by
> exceeding the BINPRM_MAX_RECURSION, not by 1 or 2 but possible 2^6
> recursions. Isn't that performance hit?

This is the second problem. I view this as less critical because it's
only 64 instead of 4, but it certainly should be solved as well.

-Kees

--
Kees Cook
Chrome OS Security

2012-11-19 06:58:12

by Prasad Pandit

[permalink] [raw]
Subject: Re: [PATCH] exec: do not leave bprm->interp on stack

+-- On Sun, 18 Nov 2012, Kees Cook wrote --+
| This is the second problem. I view this as less critical because it's only
| 64 instead of 4, but it certainly should be solved as well.

I don't mean to be rude, but the patch I had sent solves both of these
problems with much less performance hit.

Please see -> https://lkml.org/lkml/2012/10/26/442

Worst case: instead of 2^6(64) recursions, it would make only 4 calls to
request_module() function. find_module() does not resolve module aliases, it
could be removed from the above patch.

Please pardon me if I came across rude or offensive, I'm only trying to make a
case.
--
Prasad J Pandit / Red Hat Security Response Team
DB7A 84C5 D3F9 7CD1 B5EB C939 D048 7860 3655 602B

2012-11-19 20:41:25

by Kees Cook

[permalink] [raw]
Subject: Re: [PATCH] exec: do not leave bprm->interp on stack

On Sun, Nov 18, 2012 at 10:57 PM, P J P <[email protected]> wrote:
> +-- On Sun, 18 Nov 2012, Kees Cook wrote --+
> | This is the second problem. I view this as less critical because it's only
> | 64 instead of 4, but it certainly should be solved as well.
>
> I don't mean to be rude, but the patch I had sent solves both of these
> problems with much less performance hit.
>
> Please see -> https://lkml.org/lkml/2012/10/26/442
>
> Worst case: instead of 2^6(64) recursions, it would make only 4 calls to
> request_module() function. find_module() does not resolve module aliases, it
> could be removed from the above patch.
>
> Please pardon me if I came across rude or offensive, I'm only trying to make a
> case.

I don't think you're being rude at all. You're defending your
solution. :) I wasn't very verbose in my objection to it, so I
apologize for that and now attempt to make up for it in this email. :)

I felt that your patch made certain things worse, but perhaps I was
seeing it incorrectly. Just to have a common point of reference, here
are some of the exec cases I've been considering. This is without
either of our patches applied:

An exec of ELF directly:

- bprm created
- search_binary_handler called
- each fmt->load_binary called
- ELF handler hits
- done: 0 request_module calls, 0 leaks

An exec of a a file handled by binfmt_misc:

- bprm created
- search_binary_handler called (depth == 0)
- each fmt->load_binary called
- misc handler hits
- rewrites various bprm things, including bprm->interp
- search_binary_handler called (depth == 1)
- each fmt->load_binary called
- ELF handler hits (usually, depends on specific binfmt_misc
interpreter config)
- done: 0 request_module calls, 0 leaks

An exec of a script that calls ELF:

- bprm created
- search_binary_handler called (depth == 0)
- each fmt->load_binary called
- script handler hits
- rewrites various bprm things, including bprm->interp
- search_binary_handler called (depth == 1)
- each fmt->load_binary called
- ELF handler hits
- done: 0 request_module calls, 0 leaks

An exec of an binary with unprintables where a module exists:

- bprm created
- search_binary_handler called (depth == 0)
- each fmt->load_binary called
- nothing hits, retval == ENOEXEC
- request_module("binfmt-%04x") called
- search_binary_handler loop starts again
- each fmt->load_binary called
- new handler hits
- done: 1 request_module call, 1 leak

An exec of a chain of abusive scripts that contains unprintables:

- bprm created
- search_binary_handler called (depth == 0)
- each fmt->load_binary called
- script handler hits
- rewrites various bprm things, including bprm->interp
- search_binary_handler called (depth == 1)
- each fmt->load_binary called
- script handler hits
- rewrites various bprm things, including bprm->interp
- search_binary_handler called (depth == 2 ...)
...
- each fmt->load_binary called
- at depth > BINPRM_MAX_RECURSION, script handler returns -ENOEXEC
- retval == ENOEXEC
- request_module("binfmt-%04x") called
- search_binary_handler loop starts again
- each fmt->load_binary called
- at depth > BINPRM_MAX_RECURSION, script handler returns -ENOEXEC
... unwind and retry continues all the way back up
- retval == ENOEXEC
- request_module("binfmt-%04x") called
- search_binary_handler loop starts again
- each fmt->load_binary called
- done: 63 calls to request_module (one in each search_binary_handler
call, to depth 5 (32 wide)), 63 leaks

So, my interp-on-heap patch doesn't change the request_module call
count, but it changes the stack reference leaks to kalloc/kfree pairs.

Your patch changes a number of things. The obvious part is that is
solves the abusive chain of script example by stopping at depth == 5,
after 5 calls to request_module, and eliminates the retries so there
are 0 stack reference leaks.

However, it also changes the conditions for when a module is loaded
(i.e. 0x7f no longer triggers a module_load, so anything needing that
would break -- I'm not sure if this really qualifies for ABI breakage,
I don't use any obscure binfmt modules so I can't say).

And, most importantly, it triggers request_module for any binary with
unprintables that binfmt_misc may already handle (for example, the
very common case of handling DOS MZ files, which only define 2 bytes
as magic (MZ) and exampes I find show things like "@\x00" trailing it,
or JAR files which are PK\x03\x04). Which means each exec of these
kinds of files would trigger a needless request_module() call on every
exec.

I feel that this too radically changes the behavioral characteristics
of exec for common cases. The original logic is:
- try all existing binfmt handlers
- if none found, try to load a module for it in it had unprintables
(including 0x7f)
- try them all again

Your new logic would be:
- try to load a module if there are unprintables (excluding 0x7f)
- try all existing binfmt handlers

I really don't think this change is sensible. We need to only do the
module request when none of the handlers hit, and things haven't gone
wrong.

I think to avoid the explosion of request_module calls in the abusive
case, we could simply return ELOOP instead of ENOEXEC on max
recursion. This would mean an immediate return on max depth failures.
Also, after looking at the recursion tracking here, it really seems
like search_binary_handler should track it, not the binfmt handlers.
I'll send the patch.

Both the interp-on-heap patch and this proposed ELOOP patch are needed
to handle the case of binfmt_script and/or binfmt_misc being modules
(first binfmt walk fails with -ENOEXEC, loads binfmt_script, retries
loop, hits binfmt_script rewriting interp to a PE file, recurses,
fails with -ENOEXEC, loads binfmt_misc via a modalias for PE files,
retries loop, hits binfmt_misc rewriting interp to an ELF, recurses,
loads ELF, happiness). Without the heap patch, we could be pointing
into old stack (rewritten e.g. during module load or taking an
interrupt, etc) on the loop retries. Without the ELOOP patch, the
recursion could explode with an abusive script chain.

-Kees

--
Kees Cook
Chrome OS Security

2012-11-20 07:04:54

by Prasad Pandit

[permalink] [raw]
Subject: Re: [PATCH] exec: do not leave bprm->interp on stack

+-- On Mon, 19 Nov 2012, Kees Cook wrote --+
| I don't think you're being rude at all. You're defending your solution. :)

Thank you Kees, really appreciate it.

| However, it also changes the conditions for when a module is loaded
| (i.e. 0x7f no longer triggers a module_load, so anything needing that
| would break -- I'm not sure if this really qualifies for ABI breakage,
| I don't use any obscure binfmt modules so I can't say).

Ah right.

| And, most importantly, it triggers request_module for any binary with
| unprintables that binfmt_misc may already handle (for example, the
| very common case of handling DOS MZ files, which only define 2 bytes
| as magic (MZ) and exampes I find show things like "@\x00" trailing it,
| or JAR files which are PK\x03\x04). Which means each exec of these
| kinds of files would trigger a needless request_module() call on every
| exec.

Hmmn...true.

| Both the interp-on-heap patch and this proposed ELOOP patch are needed
| to handle the case of binfmt_script and/or binfmt_misc being modules
| (first binfmt walk fails with -ENOEXEC, loads binfmt_script, retries
| loop, hits binfmt_script rewriting interp to a PE file, recurses,
| fails with -ENOEXEC, loads binfmt_misc via a modalias for PE files,
| retries loop, hits binfmt_misc rewriting interp to an ELF, recurses,
| loads ELF, happiness). Without the heap patch, we could be pointing
| into old stack (rewritten e.g. during module load or taking an
| interrupt, etc) on the loop retries. Without the ELOOP patch, the
| recursion could explode with an abusive script chain.

I see! Thanks so much for explaining Kees, I appreciate it.

Thank you.
--
Prasad J Pandit / Red Hat Security Response Team
DB7A 84C5 D3F9 7CD1 B5EB C939 D048 7860 3655 602B

2012-11-20 07:08:30

by Prasad Pandit

[permalink] [raw]
Subject: Re: [PATCH] exec: do not leave bprm->interp on stack

+-- On Mon, 19 Nov 2012, Linus Torvalds wrote --+
| isn't your patch the one that does a request_module even without trying
| without it? There was a discussion about why that isn't valid. Don't do it.

Yep, okay.
--
Prasad J Pandit / Red Hat Security Response Team
DB7A 84C5 D3F9 7CD1 B5EB C939 D048 7860 3655 602B

2012-11-22 20:06:40

by Prasad Pandit

[permalink] [raw]
Subject: Re: [PATCH] exec: do not leave bprm->interp on stack

+-- On Mon, 19 Nov 2012, Kees Cook wrote --+
| I think to avoid the explosion of request_module calls in the abusive
| case, we could simply return ELOOP instead of ENOEXEC on max
| recursion.

-> http://www.spinics.net/lists/mm-commits/msg92433.html

1. returning -ELOOP has a side effect of not reaching to request_module()
ever, for:
==
#ifdef CONFIG_MODULES
1415 if (retval != -ENOEXEC || bprm->mm == NULL) {
1416 break;
1417 } else {
...
==

2. above patch does not seem to fix the 2^6(64) recursions issue, for:
==
+ bprm->recursion_depth = depth + 1;
retval = fn(bprm);
bprm->recursion_depth = depth;
==

setting - recursion_dept = depth - again and the outer for(try=0;try<2...)
loop seems to be causing the 2^6 recursions.

Thank you.
--
Prasad J Pandit / Red Hat Security Response Team
DB7A 84C5 D3F9 7CD1 B5EB C939 D048 7860 3655 602B

2012-11-23 18:44:20

by Prasad Pandit

[permalink] [raw]
Subject: Re: [PATCH] exec: do not leave bprm->interp on stack


Hello Kees, all,

Please have a look at a *NEW* patch at the end of this mail. It seems to fix
both the issues, stack disclosure + undue recursions.

It uses modprobe "--first-time" option which returns an error code when trying
to load a module which is already present or unload one which is not present.

Checking this return value at - request_module() - and breaking the loop in
case of an error seems to be doing the trick.

+-- On Mon, 19 Nov 2012, Kees Cook wrote --+
| I think to avoid the explosion of request_module calls in the abusive case,
| we could simply return ELOOP instead of ENOEXEC on max recursion.

-> http://www.spinics.net/lists/mm-commits/msg92433.html

1. returning -ELOOP has a side effect of not reaching to request_module()
ever, for:
==
#ifdef CONFIG_MODULES
1415 if (retval != -ENOEXEC || bprm->mm == NULL) {
1416 break;
1417 } else {
...
==

2. above patch does not seem to fix the 2^6(64) recursions issue, for:
==
+ bprm->recursion_depth = depth + 1;
retval = fn(bprm);
bprm->recursion_depth = depth;
==

setting - recursion_dept = depth - again and the outer for(try=0;try<2...)
loop seems to be causing the 2^6 recursions.


Please have a look at this *NEW* patch to fix both the issues, stack
disclosure + undue recursions.

===
diff --git a/fs/exec.c b/fs/exec.c
index 0039055..dec467f 100644
--- a/fs/exec.c
+++ b/fs/exec.c
@@ -1423,7 +1423,14 @@ int search_binary_handler(struct linux_binprm *bprm,struct pt_regs *regs)
break; /* -ENOEXEC */
if (try)
break; /* -ENOEXEC */
- request_module("binfmt-%04x", *(unsigned short *)(&bprm->buf[2]));
+ if (request_module("binfmt-%04x",
+ *(unsigned short *)(&bprm->buf[2])))
+ {
+ printk(KERN_WARNING
+ "request_module: failed to load: binfmt-%04x",
+ *(unsigned short *)(&bprm->buf[2]));
+ break;
+ }
}
#else
break;
diff --git a/kernel/kmod.c b/kernel/kmod.c
index 1c317e3..7ec0e3e 100644
--- a/kernel/kmod.c
+++ b/kernel/kmod.c
@@ -83,7 +83,7 @@ static int call_modprobe(char *module_name, int wait)
NULL
};

- char **argv = kmalloc(sizeof(char *[5]), GFP_KERNEL);
+ char **argv = kmalloc(sizeof(char *[6]), GFP_KERNEL);
if (!argv)
goto out;

@@ -93,9 +93,10 @@ static int call_modprobe(char *module_name, int wait)

argv[0] = modprobe_path;
argv[1] = "-q";
- argv[2] = "--";
- argv[3] = module_name; /* check free_modprobe_argv() */
- argv[4] = NULL;
+ argv[2] = "--first-time";
+ argv[3] = "--";
+ argv[4] = module_name; /* check free_modprobe_argv() */
+ argv[5] = NULL;

return call_usermodehelper_fns(modprobe_path, argv, envp,
wait | UMH_KILLABLE, NULL, free_modprobe_argv, NULL);
===



Thank you.
--
Prasad J Pandit / Red Hat Security Response Team
DB7A 84C5 D3F9 7CD1 B5EB C939 D048 7860 3655 602B

2012-11-23 23:13:15

by Tetsuo Handa

[permalink] [raw]
Subject: Re: [PATCH] exec: do not leave bprm->interp on stack

P J P wrote:
>
> Hello Kees, all,
>
> Please have a look at a *NEW* patch at the end of this mail. It seems to fix
> both the issues, stack disclosure + undue recursions.
>
> It uses modprobe "--first-time" option which returns an error code when trying
> to load a module which is already present or unload one which is not present.

It might fix both "stack disclosure" + "undue recursions" issues, but it
introduces a regression "only one of concurrent requesters succeeds" which
ruins the value of automatic module loading feature.

> @@ -1423,7 +1423,14 @@ int search_binary_handler(struct linux_binprm *bprm,struct pt_regs *regs)
> break; /* -ENOEXEC */
> if (try)
> break; /* -ENOEXEC */
> - request_module("binfmt-%04x", *(unsigned short *)(&bprm->buf[2]));

What happens if more than one processes are requesting execve() of a program
which needs to call request_module("binfmt-%04x") to succeed, and more than two
of them concurrently reached at this line? Only one process will succeed. This
means that execve() of a program which needs to call request_module() will fail
if concurrently reached here.

> + if (request_module("binfmt-%04x",
> + *(unsigned short *)(&bprm->buf[2])))
> + {
> + printk(KERN_WARNING
> + "request_module: failed to load: binfmt-%04x",
> + *(unsigned short *)(&bprm->buf[2]));
> + break;
> + }
> }
> #else
> break;

2012-11-26 07:09:39

by Prasad Pandit

[permalink] [raw]
Subject: Re: [PATCH] exec: do not leave bprm->interp on stack

+-- On Sat, 24 Nov 2012, Kees Cook wrote --+
| Well, "ever" meaning "if depth is hit, always fail out", yes. This is
| intentional. We do not want to attempt module loading if we hit a recursion
| limit.

Ah yes, that's right!

Thanks so much! :)
--
Prasad J Pandit / Red Hat Security Response Team
DB7A 84C5 D3F9 7CD1 B5EB C939 D048 7860 3655 602B