2009-09-03 16:10:49

by Oleg Nesterov

[permalink] [raw]
Subject: [PATCH 1/1] exec: do not sleep in TASK_TRACED under ->cred_guard_mutex

Tom Horsley reports that his debugger hangs when it tries to read
/proc/pid_of_tracee/maps, this happens since

"mm_for_maps: take ->cred_guard_mutex to fix the race with exec"
04b836cbf19e885f8366bccb2e4b0474346c02d

commit in 2.6.31.

But I strongly believe we should blame another patch

"CRED: Make execve() take advantage of copy-on-write credentials"
a6f76f23d297f70e2a6b3ec607f7aeeea9e37e8d

The tracee must not sleep in TASK_TRACED holding this mutex (it was named
cred_exec_mutex). Even if we remove ->cred_guard_mutex from mm_for_maps()
and proc_pid_attr_write(), another task doing PTRACE_ATTACH should not
hang until it is killed or the tracee resumes.

With this patch do_execve() does not use ->cred_guard_mutex directly and
we do not hold it throughout, instead:

- introduce prepare_bprm_creds() helper, it locks the mutex
and calls prepare_exec_creds() to initialize bprm->cred.

- install_exec_creds() drops the mutex after commit_creds(),
and thus before tracehook_report_exec()->ptrace_stop().

or, if exec fails,

free_bprm() drops this mutex when bprm->cred != NULL which
indicates install_exec_creds() was not called.

Reported-by: Tom Horsley <[email protected]>
Signed-off-by: Oleg Nesterov <[email protected]>
---

fs/exec.c | 49 +++++++++++++++++++++++++++---------------------
fs/compat.c | 17 +++-------------
include/linux/binfmts.h | 3 +-
3 files changed, 34 insertions(+), 35 deletions(-)

--- WAIT/fs/exec.c~CRED_MUTEX_EEEVENT_EXEC 2009-09-03 15:31:08.000000000 +0200
+++ WAIT/fs/exec.c 2009-09-03 17:37:54.000000000 +0200
@@ -1017,6 +1017,29 @@ out:

EXPORT_SYMBOL(flush_old_exec);

+int prepare_bprm_creds(struct linux_binprm *bprm)
+{
+ if (mutex_lock_interruptible(&current->cred_guard_mutex))
+ return -ERESTARTNOINTR;
+
+ bprm->cred = prepare_exec_creds();
+ if (likely(bprm->cred))
+ return 0;
+
+ mutex_unlock(&current->cred_guard_mutex);
+ return -ENOMEM;
+}
+
+void free_bprm(struct linux_binprm *bprm)
+{
+ free_arg_pages(bprm);
+ if (bprm->cred) {
+ mutex_unlock(&current->cred_guard_mutex);
+ abort_creds(bprm->cred);
+ }
+ kfree(bprm);
+}
+
/*
* install the new credentials for this executable
*/
@@ -1032,6 +1055,7 @@ void install_exec_creds(struct linux_bin
* credentials; any time after this it may be unlocked */

security_bprm_committed_creds(bprm);
+ mutex_unlock(&current->cred_guard_mutex);
}
EXPORT_SYMBOL(install_exec_creds);

@@ -1248,14 +1272,6 @@ int search_binary_handler(struct linux_b

EXPORT_SYMBOL(search_binary_handler);

-void free_bprm(struct linux_binprm *bprm)
-{
- free_arg_pages(bprm);
- if (bprm->cred)
- abort_creds(bprm->cred);
- kfree(bprm);
-}
-
/*
* sys_execve() executes a new program.
*/
@@ -1279,20 +1295,15 @@ int do_execve(char * filename,
if (!bprm)
goto out_files;

- retval = -ERESTARTNOINTR;
- if (mutex_lock_interruptible(&current->cred_guard_mutex))
+ retval = prepare_bprm_creds(bprm);
+ if (retval)
goto out_free;
- current->in_execve = 1;
-
- retval = -ENOMEM;
- bprm->cred = prepare_exec_creds();
- if (!bprm->cred)
- goto out_unlock;

retval = check_unsafe_exec(bprm);
if (retval < 0)
- goto out_unlock;
+ goto out_free;
clear_in_exec = retval;
+ current->in_execve = 1;

file = open_exec(filename);
retval = PTR_ERR(file);
@@ -1342,7 +1353,6 @@ int do_execve(char * filename,
/* execve succeeded */
current->fs->in_exec = 0;
current->in_execve = 0;
- mutex_unlock(&current->cred_guard_mutex);
acct_update_integrals(current);
free_bprm(bprm);
if (displaced)
@@ -1362,10 +1372,7 @@ out_file:
out_unmark:
if (clear_in_exec)
current->fs->in_exec = 0;
-
-out_unlock:
current->in_execve = 0;
- mutex_unlock(&current->cred_guard_mutex);

out_free:
free_bprm(bprm);
--- WAIT/fs/compat.c~CRED_MUTEX_EEEVENT_EXEC 2009-09-03 15:31:08.000000000 +0200
+++ WAIT/fs/compat.c 2009-09-03 16:42:17.000000000 +0200
@@ -1486,20 +1486,15 @@ int compat_do_execve(char * filename,
if (!bprm)
goto out_files;

- retval = -ERESTARTNOINTR;
- if (mutex_lock_interruptible(&current->cred_guard_mutex))
+ retval = prepare_bprm_creds(bprm);
+ if (retval)
goto out_free;
- current->in_execve = 1;
-
- retval = -ENOMEM;
- bprm->cred = prepare_exec_creds();
- if (!bprm->cred)
- goto out_unlock;

retval = check_unsafe_exec(bprm);
if (retval < 0)
- goto out_unlock;
+ goto out_free;
clear_in_exec = retval;
+ current->in_execve = 1;

file = open_exec(filename);
retval = PTR_ERR(file);
@@ -1548,7 +1543,6 @@ int compat_do_execve(char * filename,
/* execve succeeded */
current->fs->in_exec = 0;
current->in_execve = 0;
- mutex_unlock(&current->cred_guard_mutex);
acct_update_integrals(current);
free_bprm(bprm);
if (displaced)
@@ -1568,10 +1562,7 @@ out_file:
out_unmark:
if (clear_in_exec)
current->fs->in_exec = 0;
-
-out_unlock:
current->in_execve = 0;
- mutex_unlock(&current->cred_guard_mutex);

out_free:
free_bprm(bprm);
--- WAIT/include/linux/binfmts.h~CRED_MUTEX_EEEVENT_EXEC 2009-07-24 19:02:19.000000000 +0200
+++ WAIT/include/linux/binfmts.h 2009-09-03 17:37:04.000000000 +0200
@@ -117,9 +117,10 @@ extern int setup_arg_pages(struct linux_
int executable_stack);
extern int bprm_mm_init(struct linux_binprm *bprm);
extern int copy_strings_kernel(int argc,char ** argv,struct linux_binprm *bprm);
-extern void install_exec_creds(struct linux_binprm *bprm);
extern void do_coredump(long signr, int exit_code, struct pt_regs *regs);
extern void set_binfmt(struct linux_binfmt *new);
+extern int prepare_bprm_creds(struct linux_binprm *bprm);
+extern void install_exec_creds(struct linux_binprm *bprm);
extern void free_bprm(struct linux_binprm *);

#endif /* __KERNEL__ */


2009-09-03 20:12:08

by Roland McGrath

[permalink] [raw]
Subject: Re: [PATCH 1/1] exec: do not sleep in TASK_TRACED under ->cred_guard_mutex

I certainly think it's right to hold the mutex only as long as necessary.
Clearly holding it when we stop is wrong.

I'm a bit concerned about holding it for arbitrary periods while we block
in the filesystem code. e.g., consider the scenario with a hangs-forever
NFS server or suchlike. But I'm not sure there is a reasonable way around
that one.

The paired calls that leave the mutex locked in between should have some
clear comments calling attention to their pairing. Aside from that making
sure that subtlety is clear, I don't see any problems in the patch off hand.
But I haven't scoured the code path lately to have full confidence.
I'd like to hear David's reactions.


Thanks,
Roland

2009-09-04 08:41:28

by David Howells

[permalink] [raw]
Subject: Re: [PATCH 1/1] exec: do not sleep in TASK_TRACED under ->cred_guard_mutex

Oleg Nesterov <[email protected]> wrote:

> But I strongly believe we should blame another patch
>
> "CRED: Make execve() take advantage of copy-on-write credentials"
> a6f76f23d297f70e2a6b3ec607f7aeeea9e37e8d
>
> The tracee must not sleep in TASK_TRACED holding this mutex (it was named
> cred_exec_mutex). Even if we remove ->cred_guard_mutex from mm_for_maps()
> and proc_pid_attr_write(), another task doing PTRACE_ATTACH should not
> hang until it is killed or the tracee resumes.

Ummm... I don't see how this is relevant.

Yes, a task must not sleep in TASK_TRACED if it is holding this mutex, but how
does that apply to do_exec(), mm_for_maps() or proc_pid_attr_write()? A
process can't be in any of those three if it is in the TASK_TRACED state.

A process can only be in the TASK_TRACED state in one of two ways: its parent
moved it there from the TASK_STOPPED state, or it put itself in that state -
neither of which should apply here.

Apart from that, I've no objection to dropping the guard semaphore earlier.

I do think though, the problem is elsewhere. Are we failing to unlock the
semaphore somewhere? Or double locking it, I wonder? Has Tom tried lockdep?

Btw, should mm_for_maps() use mutex_lock_interruptible()? There doesn't seem
any point making it non-interruptible (except for kill signals) - unless that
would muck up seqfile handling.

> +int prepare_bprm_creds(struct linux_binprm *bprm)
> +{

__acquires()

> +void free_bprm(struct linux_binprm *bprm)
> +{

__releases()

2009-09-04 08:45:30

by David Howells

[permalink] [raw]
Subject: Re: [PATCH 1/1] exec: do not sleep in TASK_TRACED under ->cred_guard_mutex

Roland McGrath <[email protected]> wrote:

> I certainly think it's right to hold the mutex only as long as necessary.
> Clearly holding it when we stop is wrong.

As far as I can tell, we don't hold it when we stop for debugging, or stop on
signal.

> I'm a bit concerned about holding it for arbitrary periods while we block
> in the filesystem code. e.g., consider the scenario with a hangs-forever
> NFS server or suchlike. But I'm not sure there is a reasonable way around
> that one.

If you drop the sem before committing the creds, you have to recalculate the
new credentials.

> The paired calls that leave the mutex locked in between should have some
> clear comments calling attention to their pairing. Aside from that making
> sure that subtlety is clear, I don't see any problems in the patch off hand.
> But I haven't scoured the code path lately to have full confidence.
> I'd like to hear David's reactions.

Looking at the patch description, I don't see how the patch it relevant to the
problem. There must be something else, either a call that's now being
skipped, or it's a matter of timing.

David

2009-09-04 09:26:32

by Roland McGrath

[permalink] [raw]
Subject: Re: [PATCH 1/1] exec: do not sleep in TASK_TRACED under ->cred_guard_mutex

> > +int prepare_bprm_creds(struct linux_binprm *bprm)
> > +{
>
> __acquires()
>
> > +void free_bprm(struct linux_binprm *bprm)
> > +{
>
> __releases()

No, they're conditional.


Thanks,
Roland

2009-09-04 12:52:10

by Oleg Nesterov

[permalink] [raw]
Subject: Re: [PATCH 1/1] exec: do not sleep in TASK_TRACED under ->cred_guard_mutex

On 09/04, David Howells wrote:
>
> Oleg Nesterov <[email protected]> wrote:
>
> > But I strongly believe we should blame another patch
> >
> > "CRED: Make execve() take advantage of copy-on-write credentials"
> > a6f76f23d297f70e2a6b3ec607f7aeeea9e37e8d
> >
> > The tracee must not sleep in TASK_TRACED holding this mutex (it was named
> > cred_exec_mutex). Even if we remove ->cred_guard_mutex from mm_for_maps()
> > and proc_pid_attr_write(), another task doing PTRACE_ATTACH should not
> > hang until it is killed or the tracee resumes.

(Argh. Sorry David, the changelog should have mentioned tracehook_report_exec()
more explicitely).

So, David, do you agree with this patch? Do you think it can go to 2.6.31 ?


> Btw, should mm_for_maps() use mutex_lock_interruptible()? There doesn't seem
> any point making it non-interruptible (except for kill signals) - unless that
> would muck up seqfile handling.

Perhaps, but we should change m_start() first, it should check IS_ERR() instead
of mm != NULL. Afaics, vfs_read()->seq_read() path will return ERESTART...
correctly.

I am not sure would be right though, short reads can confuse user space. And
this can't solve the problem, this only helps to react to signals.

Oleg.

2009-09-04 13:41:33

by David Howells

[permalink] [raw]
Subject: Re: [PATCH 1/1] exec: do not sleep in TASK_TRACED under ->cred_guard_mutex

Oleg Nesterov <[email protected]> wrote:

> (Argh. Sorry David, the changelog should have mentioned
> tracehook_report_exec() more explicitely).
>
> So, David, do you agree with this patch? Do you think it can go to 2.6.31 ?

Okay, add my Acked-by. Can I recommend you mention tracehook_report_exec() in
the changelog and then push it to Linus again?

David

2009-09-04 13:46:03

by Oleg Nesterov

[permalink] [raw]
Subject: Re: [PATCH 1/1] exec: do not sleep in TASK_TRACED under ->cred_guard_mutex

On 09/03, Roland McGrath wrote:
>
> The paired calls that leave the mutex locked in between should have some
> clear comments calling attention to their pairing.

Agreed. Please see the same patch + some comments below.

------------------------------------------------------------------------------
[PATCH] exec: do not sleep in TASK_TRACED under ->cred_guard_mutex

Tom Horsley reports that his debugger hangs when it tries to read
/proc/pid_of_tracee/maps, this happens since

"mm_for_maps: take ->cred_guard_mutex to fix the race with exec"
04b836cbf19e885f8366bccb2e4b0474346c02d

commit in 2.6.31.

But I strongly believe we should blame another patch

"CRED: Make execve() take advantage of copy-on-write credentials"
a6f76f23d297f70e2a6b3ec607f7aeeea9e37e8d

The tracee must not sleep in TASK_TRACED holding this mutex (it was named
cred_exec_mutex). Even if we remove ->cred_guard_mutex from mm_for_maps()
and proc_pid_attr_write(), another task doing PTRACE_ATTACH should not
hang until it is killed or the tracee resumes.

With this patch do_execve() does not use ->cred_guard_mutex directly and
we do not hold it throughout, instead:

- introduce prepare_bprm_creds() helper, it locks the mutex
and calls prepare_exec_creds() to initialize bprm->cred.

- install_exec_creds() drops the mutex after commit_creds(),
and thus before tracehook_report_exec()->ptrace_stop().

or, if exec fails,

free_bprm() drops this mutex when bprm->cred != NULL which
indicates install_exec_creds() was not called.

Reported-by: Tom Horsley <[email protected]>
Signed-off-by: Oleg Nesterov <[email protected]>
---


--- WAIT/include/linux/binfmts.h~CRED_MUTEX_EEEVENT_EXEC 2009-07-24 19:02:19.000000000 +0200
+++ WAIT/include/linux/binfmts.h 2009-09-03 17:37:04.000000000 +0200
@@ -117,9 +117,10 @@ extern int setup_arg_pages(struct linux_
int executable_stack);
extern int bprm_mm_init(struct linux_binprm *bprm);
extern int copy_strings_kernel(int argc,char ** argv,struct linux_binprm *bprm);
-extern void install_exec_creds(struct linux_binprm *bprm);
extern void do_coredump(long signr, int exit_code, struct pt_regs *regs);
extern void set_binfmt(struct linux_binfmt *new);
+extern int prepare_bprm_creds(struct linux_binprm *bprm);
+extern void install_exec_creds(struct linux_binprm *bprm);
extern void free_bprm(struct linux_binprm *);

#endif /* __KERNEL__ */
--- WAIT/fs/exec.c~CRED_MUTEX_EEEVENT_EXEC 2009-09-03 15:31:08.000000000 +0200
+++ WAIT/fs/exec.c 2009-09-04 15:35:52.000000000 +0200
@@ -1018,6 +1018,35 @@ out:
EXPORT_SYMBOL(flush_old_exec);

/*
+ * Prepare credentials and lock ->cred_guard_mutex.
+ * install_exec_creds() commits the new creds and drops the lock.
+ * Or, if exec fails before, free_bprm() should release ->cred and
+ * and unlock.
+ */
+int prepare_bprm_creds(struct linux_binprm *bprm)
+{
+ if (mutex_lock_interruptible(&current->cred_guard_mutex))
+ return -ERESTARTNOINTR;
+
+ bprm->cred = prepare_exec_creds();
+ if (likely(bprm->cred))
+ return 0;
+
+ mutex_unlock(&current->cred_guard_mutex);
+ return -ENOMEM;
+}
+
+void free_bprm(struct linux_binprm *bprm)
+{
+ free_arg_pages(bprm);
+ if (bprm->cred) {
+ mutex_unlock(&current->cred_guard_mutex);
+ abort_creds(bprm->cred);
+ }
+ kfree(bprm);
+}
+
+/*
* install the new credentials for this executable
*/
void install_exec_creds(struct linux_binprm *bprm)
@@ -1026,12 +1055,9 @@ void install_exec_creds(struct linux_bin

commit_creds(bprm->cred);
bprm->cred = NULL;
-
- /* cred_guard_mutex must be held at least to this point to prevent
- * ptrace_attach() from altering our determination of the task's
- * credentials; any time after this it may be unlocked */
-
security_bprm_committed_creds(bprm);
+
+ mutex_unlock(&current->cred_guard_mutex);
}
EXPORT_SYMBOL(install_exec_creds);

@@ -1248,14 +1274,6 @@ int search_binary_handler(struct linux_b

EXPORT_SYMBOL(search_binary_handler);

-void free_bprm(struct linux_binprm *bprm)
-{
- free_arg_pages(bprm);
- if (bprm->cred)
- abort_creds(bprm->cred);
- kfree(bprm);
-}
-
/*
* sys_execve() executes a new program.
*/
@@ -1279,20 +1297,15 @@ int do_execve(char * filename,
if (!bprm)
goto out_files;

- retval = -ERESTARTNOINTR;
- if (mutex_lock_interruptible(&current->cred_guard_mutex))
+ retval = prepare_bprm_creds(bprm);
+ if (retval)
goto out_free;
- current->in_execve = 1;
-
- retval = -ENOMEM;
- bprm->cred = prepare_exec_creds();
- if (!bprm->cred)
- goto out_unlock;

retval = check_unsafe_exec(bprm);
if (retval < 0)
- goto out_unlock;
+ goto out_free;
clear_in_exec = retval;
+ current->in_execve = 1;

file = open_exec(filename);
retval = PTR_ERR(file);
@@ -1342,7 +1355,6 @@ int do_execve(char * filename,
/* execve succeeded */
current->fs->in_exec = 0;
current->in_execve = 0;
- mutex_unlock(&current->cred_guard_mutex);
acct_update_integrals(current);
free_bprm(bprm);
if (displaced)
@@ -1362,10 +1374,7 @@ out_file:
out_unmark:
if (clear_in_exec)
current->fs->in_exec = 0;
-
-out_unlock:
current->in_execve = 0;
- mutex_unlock(&current->cred_guard_mutex);

out_free:
free_bprm(bprm);
--- WAIT/fs/compat.c~CRED_MUTEX_EEEVENT_EXEC 2009-09-03 15:31:08.000000000 +0200
+++ WAIT/fs/compat.c 2009-09-03 16:42:17.000000000 +0200
@@ -1486,20 +1486,15 @@ int compat_do_execve(char * filename,
if (!bprm)
goto out_files;

- retval = -ERESTARTNOINTR;
- if (mutex_lock_interruptible(&current->cred_guard_mutex))
+ retval = prepare_bprm_creds(bprm);
+ if (retval)
goto out_free;
- current->in_execve = 1;
-
- retval = -ENOMEM;
- bprm->cred = prepare_exec_creds();
- if (!bprm->cred)
- goto out_unlock;

retval = check_unsafe_exec(bprm);
if (retval < 0)
- goto out_unlock;
+ goto out_free;
clear_in_exec = retval;
+ current->in_execve = 1;

file = open_exec(filename);
retval = PTR_ERR(file);
@@ -1548,7 +1543,6 @@ int compat_do_execve(char * filename,
/* execve succeeded */
current->fs->in_exec = 0;
current->in_execve = 0;
- mutex_unlock(&current->cred_guard_mutex);
acct_update_integrals(current);
free_bprm(bprm);
if (displaced)
@@ -1568,10 +1562,7 @@ out_file:
out_unmark:
if (clear_in_exec)
current->fs->in_exec = 0;
-
-out_unlock:
current->in_execve = 0;
- mutex_unlock(&current->cred_guard_mutex);

out_free:
free_bprm(bprm);

2009-09-04 14:01:20

by Oleg Nesterov

[permalink] [raw]
Subject: Re: [PATCH 1/1] exec: do not sleep in TASK_TRACED under ->cred_guard_mutex

On 09/04, David Howells wrote:
>
> Oleg Nesterov <[email protected]> wrote:
>
> > (Argh. Sorry David, the changelog should have mentioned
> > tracehook_report_exec() more explicitely).
> >
> > So, David, do you agree with this patch? Do you think it can go to 2.6.31 ?
>
> Okay, add my Acked-by.

Great.

> Can I recommend you mention tracehook_report_exec() in
> the changelog and then push it to Linus again?

OK, will do.

I just sent v2 (the same patch + a bit of comments). Please tell me
if you disagree with the comments.

Oleg.

2009-09-04 14:49:39

by David Howells

[permalink] [raw]
Subject: Re: [PATCH 1/1] exec: do not sleep in TASK_TRACED under ->cred_guard_mutex

Oleg Nesterov <[email protected]> wrote:

> - install_exec_creds() drops the mutex after commit_creds(),
> and thus before tracehook_report_exec()->ptrace_stop().

Thanks!

> -
> - /* cred_guard_mutex must be held at least to this point to prevent
> - * ptrace_attach() from altering our determination of the task's
> - * credentials; any time after this it may be unlocked */
> -

Why are you removing this comment, btw? In what way does your patch
invalidate it?

David

2009-09-04 15:55:48

by Oleg Nesterov

[permalink] [raw]
Subject: Re: [PATCH 1/1] exec: do not sleep in TASK_TRACED under ->cred_guard_mutex

On 09/04, David Howells wrote:
>
> Oleg Nesterov <[email protected]> wrote:
>
> > -
> > - /* cred_guard_mutex must be held at least to this point to prevent
> > - * ptrace_attach() from altering our determination of the task's
> > - * credentials; any time after this it may be unlocked */
> > -
>
> Why are you removing this comment, btw? In what way does your patch
> invalidate it?

No, the comment is fine. Except, now we have more users of this mutex,
but it only mentions ptrace_attach(). Tried to improve it, but failed
to make the concise and understandable comment,

I tried to collect all comments aboout cred_guard_mutex in one place,
above prepare_bprm_creds()... OK, agreed. Will revert this bit and
resend.

Oleg.

2009-09-04 17:32:45

by Oleg Nesterov

[permalink] [raw]
Subject: [PATCH v3] exec: do not sleep in TASK_TRACED under ->cred_guard_mutex

Tom Horsley reports that his debugger hangs when it tries to read
/proc/pid_of_tracee/maps, this happens since

"mm_for_maps: take ->cred_guard_mutex to fix the race with exec"
04b836cbf19e885f8366bccb2e4b0474346c02d

commit in 2.6.31.

But the root of the problem lies in the fact that do_execve() path calls
tracehook_report_exec() which can stop if the tracer sets PT_TRACE_EXEC.

The tracee must not sleep in TASK_TRACED holding this mutex. Even if we
remove ->cred_guard_mutex from mm_for_maps() and proc_pid_attr_write(),
another task doing PTRACE_ATTACH should not hang until it is killed or
the tracee resumes.

With this patch do_execve() does not use ->cred_guard_mutex directly and
we do not hold it throughout, instead:

- introduce prepare_bprm_creds() helper, it locks the mutex
and calls prepare_exec_creds() to initialize bprm->cred.

- install_exec_creds() drops the mutex after commit_creds(),
and thus before tracehook_report_exec()->ptrace_stop().

or, if exec fails,

free_bprm() drops this mutex when bprm->cred != NULL which
indicates install_exec_creds() was not called.

Reported-by: Tom Horsley <[email protected]>
Signed-off-by: Oleg Nesterov <[email protected]>
Acked-by: David Howells <[email protected]>
---

include/linux/binfmts.h | 3 +-
fs/exec.c | 63 ++++++++++++++++++++++++++++--------------------
fs/compat.c | 17 +++---------
3 files changed, 44 insertions(+), 39 deletions(-)

--- WAIT/include/linux/binfmts.h~CRED_MUTEX_EEEVENT_EXEC 2009-07-24 19:02:19.000000000 +0200
+++ WAIT/include/linux/binfmts.h 2009-09-03 17:37:04.000000000 +0200
@@ -117,9 +117,10 @@ extern int setup_arg_pages(struct linux_
int executable_stack);
extern int bprm_mm_init(struct linux_binprm *bprm);
extern int copy_strings_kernel(int argc,char ** argv,struct linux_binprm *bprm);
-extern void install_exec_creds(struct linux_binprm *bprm);
extern void do_coredump(long signr, int exit_code, struct pt_regs *regs);
extern void set_binfmt(struct linux_binfmt *new);
+extern int prepare_bprm_creds(struct linux_binprm *bprm);
+extern void install_exec_creds(struct linux_binprm *bprm);
extern void free_bprm(struct linux_binprm *);

#endif /* __KERNEL__ */
--- WAIT/fs/exec.c~CRED_MUTEX_EEEVENT_EXEC 2009-09-03 15:31:08.000000000 +0200
+++ WAIT/fs/exec.c 2009-09-04 19:09:14.000000000 +0200
@@ -1018,6 +1018,35 @@ out:
EXPORT_SYMBOL(flush_old_exec);

/*
+ * Prepare credentials and lock ->cred_guard_mutex.
+ * install_exec_creds() commits the new creds and drops the lock.
+ * Or, if exec fails before, free_bprm() should release ->cred and
+ * and unlock.
+ */
+int prepare_bprm_creds(struct linux_binprm *bprm)
+{
+ if (mutex_lock_interruptible(&current->cred_guard_mutex))
+ return -ERESTARTNOINTR;
+
+ bprm->cred = prepare_exec_creds();
+ if (likely(bprm->cred))
+ return 0;
+
+ mutex_unlock(&current->cred_guard_mutex);
+ return -ENOMEM;
+}
+
+void free_bprm(struct linux_binprm *bprm)
+{
+ free_arg_pages(bprm);
+ if (bprm->cred) {
+ mutex_unlock(&current->cred_guard_mutex);
+ abort_creds(bprm->cred);
+ }
+ kfree(bprm);
+}
+
+/*
* install the new credentials for this executable
*/
void install_exec_creds(struct linux_binprm *bprm)
@@ -1026,12 +1055,13 @@ void install_exec_creds(struct linux_bin

commit_creds(bprm->cred);
bprm->cred = NULL;
-
- /* cred_guard_mutex must be held at least to this point to prevent
+ /*
+ * cred_guard_mutex must be held at least to this point to prevent
* ptrace_attach() from altering our determination of the task's
- * credentials; any time after this it may be unlocked */
-
+ * credentials; any time after this it may be unlocked.
+ */
security_bprm_committed_creds(bprm);
+ mutex_unlock(&current->cred_guard_mutex);
}
EXPORT_SYMBOL(install_exec_creds);

@@ -1248,14 +1278,6 @@ int search_binary_handler(struct linux_b

EXPORT_SYMBOL(search_binary_handler);

-void free_bprm(struct linux_binprm *bprm)
-{
- free_arg_pages(bprm);
- if (bprm->cred)
- abort_creds(bprm->cred);
- kfree(bprm);
-}
-
/*
* sys_execve() executes a new program.
*/
@@ -1279,20 +1301,15 @@ int do_execve(char * filename,
if (!bprm)
goto out_files;

- retval = -ERESTARTNOINTR;
- if (mutex_lock_interruptible(&current->cred_guard_mutex))
+ retval = prepare_bprm_creds(bprm);
+ if (retval)
goto out_free;
- current->in_execve = 1;
-
- retval = -ENOMEM;
- bprm->cred = prepare_exec_creds();
- if (!bprm->cred)
- goto out_unlock;

retval = check_unsafe_exec(bprm);
if (retval < 0)
- goto out_unlock;
+ goto out_free;
clear_in_exec = retval;
+ current->in_execve = 1;

file = open_exec(filename);
retval = PTR_ERR(file);
@@ -1342,7 +1359,6 @@ int do_execve(char * filename,
/* execve succeeded */
current->fs->in_exec = 0;
current->in_execve = 0;
- mutex_unlock(&current->cred_guard_mutex);
acct_update_integrals(current);
free_bprm(bprm);
if (displaced)
@@ -1362,10 +1378,7 @@ out_file:
out_unmark:
if (clear_in_exec)
current->fs->in_exec = 0;
-
-out_unlock:
current->in_execve = 0;
- mutex_unlock(&current->cred_guard_mutex);

out_free:
free_bprm(bprm);
--- WAIT/fs/compat.c~CRED_MUTEX_EEEVENT_EXEC 2009-09-03 15:31:08.000000000 +0200
+++ WAIT/fs/compat.c 2009-09-03 16:42:17.000000000 +0200
@@ -1486,20 +1486,15 @@ int compat_do_execve(char * filename,
if (!bprm)
goto out_files;

- retval = -ERESTARTNOINTR;
- if (mutex_lock_interruptible(&current->cred_guard_mutex))
+ retval = prepare_bprm_creds(bprm);
+ if (retval)
goto out_free;
- current->in_execve = 1;
-
- retval = -ENOMEM;
- bprm->cred = prepare_exec_creds();
- if (!bprm->cred)
- goto out_unlock;

retval = check_unsafe_exec(bprm);
if (retval < 0)
- goto out_unlock;
+ goto out_free;
clear_in_exec = retval;
+ current->in_execve = 1;

file = open_exec(filename);
retval = PTR_ERR(file);
@@ -1548,7 +1543,6 @@ int compat_do_execve(char * filename,
/* execve succeeded */
current->fs->in_exec = 0;
current->in_execve = 0;
- mutex_unlock(&current->cred_guard_mutex);
acct_update_integrals(current);
free_bprm(bprm);
if (displaced)
@@ -1568,10 +1562,7 @@ out_file:
out_unmark:
if (clear_in_exec)
current->fs->in_exec = 0;
-
-out_unlock:
current->in_execve = 0;
- mutex_unlock(&current->cred_guard_mutex);

out_free:
free_bprm(bprm);

2009-09-04 19:45:07

by Andrew Morton

[permalink] [raw]
Subject: Re: [PATCH v3] exec: do not sleep in TASK_TRACED under ->cred_guard_mutex

On Fri, 4 Sep 2009 19:26:48 +0200
Oleg Nesterov <[email protected]> wrote:

> Tom Horsley reports that his debugger hangs when it tries to read
> /proc/pid_of_tracee/maps, this happens since
>
> "mm_for_maps: take ->cred_guard_mutex to fix the race with exec"
> 04b836cbf19e885f8366bccb2e4b0474346c02d
>
> commit in 2.6.31.
>
> But the root of the problem lies in the fact that do_execve() path calls
> tracehook_report_exec() which can stop if the tracer sets PT_TRACE_EXEC.
>
> The tracee must not sleep in TASK_TRACED holding this mutex. Even if we
> remove ->cred_guard_mutex from mm_for_maps() and proc_pid_attr_write(),
> another task doing PTRACE_ATTACH should not hang until it is killed or
> the tracee resumes.
>
> With this patch do_execve() does not use ->cred_guard_mutex directly and
> we do not hold it throughout, instead:
>
> - introduce prepare_bprm_creds() helper, it locks the mutex
> and calls prepare_exec_creds() to initialize bprm->cred.
>
> - install_exec_creds() drops the mutex after commit_creds(),
> and thus before tracehook_report_exec()->ptrace_stop().
>
> or, if exec fails,
>
> free_bprm() drops this mutex when bprm->cred != NULL which
> indicates install_exec_creds() was not called.
>
> Reported-by: Tom Horsley <[email protected]>
> Signed-off-by: Oleg Nesterov <[email protected]>
> Acked-by: David Howells <[email protected]>

I get a reject in binfmts.h because your kernel has `extern void
set_binfmt' and mine has `extern int set_binfmt'.

Hopefully this patch works OK in mainline as well as in whatever kernel
you tested against!

I see a Cc:stable in the mail headers, but not in the changelog. I
don't think the patch is applicable to -stable unless we miss 2.6.31.

2009-09-04 21:39:14

by Oleg Nesterov

[permalink] [raw]
Subject: Re: [PATCH v3] exec: do not sleep in TASK_TRACED under ->cred_guard_mutex

On 09/04, Andrew Morton wrote:
>
> On Fri, 4 Sep 2009 19:26:48 +0200
> Oleg Nesterov <[email protected]> wrote:
>
> > Tom Horsley reports that his debugger hangs when it tries to read
> > /proc/pid_of_tracee/maps, this happens since
> >
> > "mm_for_maps: take ->cred_guard_mutex to fix the race with exec"
> > 04b836cbf19e885f8366bccb2e4b0474346c02d
> >
> > commit in 2.6.31.
>
> I get a reject in binfmts.h because your kernel has `extern void
> set_binfmt' and mine has `extern int set_binfmt'.

Ah, I have exec-fix-set_binfmt-vs-sys_delete_module-race.patch from -mm
applied...

> Hopefully this patch works OK in mainline as well as in whatever kernel
> you tested against!

Yes I tried to test it ;)

> I see a Cc:stable in the mail headers, but not in the changelog. I
> don't think the patch is applicable to -stable unless we miss 2.6.31.

-stable has the same problems but I agree, it can live without this fix.

Oleg.

2009-09-09 22:02:43

by Chuck Ebbert

[permalink] [raw]
Subject: Re: [PATCH v3] exec: do not sleep in TASK_TRACED under ->cred_guard_mutex

On Fri, 4 Sep 2009 23:33:37 +0200
Oleg Nesterov <[email protected]> wrote:

> On 09/04, Andrew Morton wrote:
> >
> > On Fri, 4 Sep 2009 19:26:48 +0200
> > Oleg Nesterov <[email protected]> wrote:
> >
> > > Tom Horsley reports that his debugger hangs when it tries to read
> > > /proc/pid_of_tracee/maps, this happens since
> > >
> > > "mm_for_maps: take ->cred_guard_mutex to fix the race with exec"
> > > 04b836cbf19e885f8366bccb2e4b0474346c02d
> > >
> > > commit in 2.6.31.

>
> > I see a Cc:stable in the mail headers, but not in the changelog. I
> > don't think the patch is applicable to -stable unless we miss 2.6.31.
>
> -stable has the same problems but I agree, it can live without this fix.
>

That patch was backported in 2.6.30.5 -- are you sure -stable doesn't need
this fix?

2009-09-09 23:04:14

by Oleg Nesterov

[permalink] [raw]
Subject: Re: [PATCH v3] exec: do not sleep in TASK_TRACED under ->cred_guard_mutex

On 09/09, Chuck Ebbert wrote:
>
> On Fri, 4 Sep 2009 23:33:37 +0200
> Oleg Nesterov <[email protected]> wrote:
>
> > On 09/04, Andrew Morton wrote:
> > >
> > > On Fri, 4 Sep 2009 19:26:48 +0200
> > > Oleg Nesterov <[email protected]> wrote:
> > >
> > > > Tom Horsley reports that his debugger hangs when it tries to read
> > > > /proc/pid_of_tracee/maps, this happens since
> > > >
> > > > "mm_for_maps: take ->cred_guard_mutex to fix the race with exec"
> > > > 04b836cbf19e885f8366bccb2e4b0474346c02d
> > > >
> > > > commit in 2.6.31.
>
> >
> > > I see a Cc:stable in the mail headers, but not in the changelog. I
> > > don't think the patch is applicable to -stable unless we miss 2.6.31.
> >
> > -stable has the same problems but I agree, it can live without this fix.
> >
>
> That patch was backported in 2.6.30.5 -- are you sure -stable doesn't need
> this fix?

No, I am not sure, that is why I cc'ed -stable.

But otoh the problem is minor, both tracer and tracee can be killed.

I dunno. I don't know if gdb or any other "important" application
read /proc/pid/maps after PTRACE_EVENT_EXEC...

Oleg.