2022-01-26 22:27:31

by Kees Cook

Subject: [PATCH] fs/binfmt_elf: Add padding NULL when argc == 0

Quoting Ariadne Conill:

"In several other operating systems, it is a hard requirement that the
first argument to execve(2) be the name of a program, thus prohibiting
a scenario where argc < 1. POSIX 2017 also recommends this behaviour,
but it is not an explicit requirement[1]:

The argument arg0 should point to a filename string that is
associated with the process being started by one of the exec
functions.
...
Interestingly, Michael Kerrisk opened an issue about this in 2008[2],
but there was no consensus to support fixing this issue then.
Hopefully now that CVE-2021-4034 shows practical exploitative use[3]
of this bug in a shellcode, we can reconsider."

An examination of existing[4] users of execve(..., NULL, NULL) shows
mostly test code, or example rootkit code. While rejecting a NULL argv
would be preferred, it looks like the main cause of userspace confusion
is an assumption that argc >= 1, and buggy programs may skip argv[0]
when iterating. To protect against userspace bugs of this nature, insert
an extra NULL pointer in argv when argc == 0, so that argv[1] != envp[0].

Note that this is only done in the argc == 0 case because some userspace
programs expect to find envp at exactly argv[argc]. The overlap of these
two misguided assumptions is believed to be zero.

[1] https://pubs.opengroup.org/onlinepubs/9699919799/functions/exec.html
[2] https://bugzilla.kernel.org/show_bug.cgi?id=8408
[3] https://www.qualys.com/2022/01/25/cve-2021-4034/pwnkit.txt
[4] https://codesearch.debian.net/search?q=execve%5C+*%5C%28%5B%5E%2C%5D%2B%2C+*NULL&literal=0

Reported-by: Ariadne Conill <[email protected]>
Reported-by: Michael Kerrisk <[email protected]>
Cc: Matthew Wilcox <[email protected]>
Cc: Christian Brauner <[email protected]>
Cc: Rich Felker <[email protected]>
Cc: Eric Biederman <[email protected]>
Cc: Alexander Viro <[email protected]>
Cc: [email protected]
Cc: [email protected]
Signed-off-by: Kees Cook <[email protected]>
---
fs/binfmt_elf.c | 10 +++++++++-
fs/exec.c | 7 ++++++-
2 files changed, 15 insertions(+), 2 deletions(-)

diff --git a/fs/binfmt_elf.c b/fs/binfmt_elf.c
index 605017eb9349..e456c48658ad 100644
--- a/fs/binfmt_elf.c
+++ b/fs/binfmt_elf.c
@@ -297,7 +297,8 @@ create_elf_tables(struct linux_binprm *bprm, const struct elfhdr *exec,
ei_index = elf_info - (elf_addr_t *)mm->saved_auxv;
sp = STACK_ADD(p, ei_index);

- items = (argc + 1) + (envc + 1) + 1;
+ /* Make room for extra pointer when argc == 0. See below. */
+ items = (min(argc, 1) + 1) + (envc + 1) + 1;
bprm->p = STACK_ROUND(sp, items);

/* Point sp at the lowest address on the stack */
@@ -326,6 +327,13 @@ create_elf_tables(struct linux_binprm *bprm, const struct elfhdr *exec,

/* Populate list of argv pointers back to argv strings. */
p = mm->arg_end = mm->arg_start;
+ /*
+ * Include an extra NULL pointer in argv when argc == 0 so
+ * that argv[1] != envp[0] to help userspace programs from
+ * mishandling argc == 0. See fs/exec.c bprm_stack_limits().
+ */
+ if (argc == 0 && put_user(0, sp++))
+ return -EFAULT;
while (argc-- > 0) {
size_t len;
if (put_user((elf_addr_t)p, sp++))
diff --git a/fs/exec.c b/fs/exec.c
index 79f2c9483302..0b36384e55b1 100644
--- a/fs/exec.c
+++ b/fs/exec.c
@@ -495,8 +495,13 @@ static int bprm_stack_limits(struct linux_binprm *bprm)
* the stack. They aren't stored until much later when we can't
* signal to the parent that the child has run out of stack space.
* Instead, calculate it here so it's possible to fail gracefully.
+ *
+ * In the case of argc < 1, make sure there is a NULL pointer gap
+ * between argv and envp to ensure confused userspace programs don't
+ * start processing from argv[1], thinking argc can never be 0,
+ * to block them from walking envp by accident. See fs/binfmt_elf.c.
*/
- ptr_size = (bprm->argc + bprm->envc) * sizeof(void *);
+ ptr_size = (min(bprm->argc, 1) + bprm->envc) * sizeof(void *);
if (limit <= ptr_size)
return -E2BIG;
limit -= ptr_size;
--
2.30.2
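
The aliasing that the commit message describes can be modelled in plain userspace C. The sketch below is only an illustration: build_vector() and read_argv1() are hypothetical helpers rather than kernel code, and the array stands in for the pointer area of the initial stack (argv entries, NULL, envp entries, NULL).

```c
#include <assert.h>
#include <stddef.h>

/*
 * Lay out a simulated pointer area: argv[0..argc-1], NULL,
 * envp[0..envc-1], NULL. When pad_when_empty is set and argc == 0,
 * one extra NULL is written first, as the patch proposes.
 * Returns the number of slots written. Hypothetical helper.
 */
size_t build_vector(const char **out, size_t cap,
                    const char *const *args, size_t argc,
                    const char *const *envs, size_t envc,
                    int pad_when_empty)
{
    size_t n = 0;
    size_t extra = (pad_when_empty && argc == 0) ? 1 : 0;

    assert(cap >= argc + 1 + envc + 1 + extra);
    if (extra)
        out[n++] = NULL;          /* padding NULL for argc == 0 */
    for (size_t i = 0; i < argc; i++)
        out[n++] = args[i];
    out[n++] = NULL;              /* argv terminator */
    for (size_t i = 0; i < envc; i++)
        out[n++] = envs[i];
    out[n++] = NULL;              /* envp terminator */
    return n;
}

/* What a program that assumes argc >= 1 and starts at argv[1] reads. */
const char *read_argv1(const char *const *argv)
{
    return argv[1];
}
```

Without padding and with argc == 0, read_argv1() returns the first environment string; with padding it returns NULL.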


2022-01-26 22:28:49

by Jann Horn

Subject: Re: [PATCH] fs/binfmt_elf: Add padding NULL when argc == 0

On Wed, Jan 26, 2022 at 6:58 PM Kees Cook <[email protected]> wrote:
> Quoting Ariadne Conill:
>
> "In several other operating systems, it is a hard requirement that the
> first argument to execve(2) be the name of a program, thus prohibiting
> a scenario where argc < 1. POSIX 2017 also recommends this behaviour,
> but it is not an explicit requirement[1]:
>
> The argument arg0 should point to a filename string that is
> associated with the process being started by one of the exec
> functions.
> ...
> Interestingly, Michael Kerrisk opened an issue about this in 2008[2],
> but there was no consensus to support fixing this issue then.
> Hopefully now that CVE-2021-4034 shows practical exploitative use[3]
> of this bug in a shellcode, we can reconsider."
>
> An examination of existing[4] users of execve(..., NULL, NULL) shows
> mostly test code, or example rootkit code. While rejecting a NULL argv
> would be preferred, it looks like the main cause of userspace confusion
> is an assumption that argc >= 1, and buggy programs may skip argv[0]
> when iterating. To protect against userspace bugs of this nature, insert
> an extra NULL pointer in argv when argc == 0, so that argv[1] != envp[0].
>
> Note that this is only done in the argc == 0 case because some userspace
> programs expect to find envp at exactly argv[argc]. The overlap of these
> two misguided assumptions is believed to be zero.

Will this result in the executed program being told that argc==0 but
having an extra NULL pointer on the stack?
If so, I believe this breaks the x86-64 ABI documented at
https://refspecs.linuxbase.org/elf/x86_64-abi-0.99.pdf - page 29,
figure 3.9 describes the layout of the initial process stack.

Actually, does this even work? Can a program still properly access its
environment variables when invoked with argc==0 with this patch
applied? AFAIU the way userspace locates envv on x86-64 is by
calculating 8*(argc+1)?
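
The offset mentioned above can be written out as a short sketch. It assumes the usual layout in which envp starts one slot past the NULL that terminates argv; the function names are made up for illustration and are not taken from any libc.

```c
#include <assert.h>
#include <stddef.h>

/* Byte offset from argv[0] to envp[0]: (argc + 1) pointer slots. */
size_t envp_offset_bytes(int argc)
{
    assert(argc >= 0);
    return ((size_t)argc + 1) * sizeof(char *);
}

/* The same computation expressed as pointer arithmetic. */
char **envp_from_argv(int argc, char **argv)
{
    assert(argc >= 0);
    return argv + argc + 1;
}
```

On a target with 8-byte pointers this gives 8*(argc+1), so for argc == 0 envp is expected in the slot directly after argv[0].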

2022-01-26 22:31:52

by Ariadne Conill

Subject: Re: [PATCH] fs/binfmt_elf: Add padding NULL when argc == 0

Hi,

On Wed, 26 Jan 2022, Jann Horn wrote:

> On Wed, Jan 26, 2022 at 6:58 PM Kees Cook <[email protected]> wrote:
>> Quoting Ariadne Conill:
>>
>> "In several other operating systems, it is a hard requirement that the
>> first argument to execve(2) be the name of a program, thus prohibiting
>> a scenario where argc < 1. POSIX 2017 also recommends this behaviour,
>> but it is not an explicit requirement[1]:
>>
>> The argument arg0 should point to a filename string that is
>> associated with the process being started by one of the exec
>> functions.
>> ...
>> Interestingly, Michael Kerrisk opened an issue about this in 2008[2],
>> but there was no consensus to support fixing this issue then.
>> Hopefully now that CVE-2021-4034 shows practical exploitative use[3]
>> of this bug in a shellcode, we can reconsider."
>>
>> An examination of existing[4] users of execve(..., NULL, NULL) shows
>> mostly test code, or example rootkit code. While rejecting a NULL argv
>> would be preferred, it looks like the main cause of userspace confusion
>> is an assumption that argc >= 1, and buggy programs may skip argv[0]
>> when iterating. To protect against userspace bugs of this nature, insert
>> an extra NULL pointer in argv when argc == 0, so that argv[1] != envp[0].
>>
>> Note that this is only done in the argc == 0 case because some userspace
>> programs expect to find envp at exactly argv[argc]. The overlap of these
>> two misguided assumptions is believed to be zero.
>
> Will this result in the executed program being told that argc==0 but
> having an extra NULL pointer on the stack?
> If so, I believe this breaks the x86-64 ABI documented at
> https://refspecs.linuxbase.org/elf/x86_64-abi-0.99.pdf - page 29,
> figure 3.9 describes the layout of the initial process stack.

I'm presently compiling a kernel with the patch to see if it works or not.

> Actually, does this even work? Can a program still properly access its
> environment variables when invoked with argc==0 with this patch
> applied? AFAIU the way userspace locates envv on x86-64 is by
> calculating 8*(argc+1)?

In the other thread, it was suggested that perhaps we should set up an
argv of {"", NULL}. In that case, it seems like it would be safe to claim
argc == 1.

What do you think?

Ariadne

2022-01-26 22:34:17

by Jann Horn

Subject: Re: [PATCH] fs/binfmt_elf: Add padding NULL when argc == 0

On Wed, Jan 26, 2022 at 7:42 PM Ariadne Conill <[email protected]> wrote:
> On Wed, 26 Jan 2022, Jann Horn wrote:
> > On Wed, Jan 26, 2022 at 6:58 PM Kees Cook <[email protected]> wrote:
> >> Quoting Ariadne Conill:
> >>
> >> "In several other operating systems, it is a hard requirement that the
> >> first argument to execve(2) be the name of a program, thus prohibiting
> >> a scenario where argc < 1. POSIX 2017 also recommends this behaviour,
> >> but it is not an explicit requirement[1]:
> >>
> >> The argument arg0 should point to a filename string that is
> >> associated with the process being started by one of the exec
> >> functions.
> >> ...
> >> Interestingly, Michael Kerrisk opened an issue about this in 2008[2],
> >> but there was no consensus to support fixing this issue then.
> >> Hopefully now that CVE-2021-4034 shows practical exploitative use[3]
> >> of this bug in a shellcode, we can reconsider."
> >>
> >> An examination of existing[4] users of execve(..., NULL, NULL) shows
> >> mostly test code, or example rootkit code. While rejecting a NULL argv
> >> would be preferred, it looks like the main cause of userspace confusion
> >> is an assumption that argc >= 1, and buggy programs may skip argv[0]
> >> when iterating. To protect against userspace bugs of this nature, insert
> >> an extra NULL pointer in argv when argc == 0, so that argv[1] != envp[0].
> >>
> >> Note that this is only done in the argc == 0 case because some userspace
> >> programs expect to find envp at exactly argv[argc]. The overlap of these
> >> two misguided assumptions is believed to be zero.
> >
> > Will this result in the executed program being told that argc==0 but
> > having an extra NULL pointer on the stack?
> > If so, I believe this breaks the x86-64 ABI documented at
> > https://refspecs.linuxbase.org/elf/x86_64-abi-0.99.pdf - page 29,
> > figure 3.9 describes the layout of the initial process stack.
>
> I'm presently compiling a kernel with the patch to see if it works or not.
>
> > Actually, does this even work? Can a program still properly access its
> > environment variables when invoked with argc==0 with this patch
> > applied? AFAIU the way userspace locates envv on x86-64 is by
> > calculating 8*(argc+1)?
>
> In the other thread, it was suggested that perhaps we should set up an
> argv of {"", NULL}. In that case, it seems like it would be safe to claim
> argc == 1.
>
> What do you think?

Sounds good to me, since that's something that could also happen
normally if userspace calls execve(..., {"", NULL}, ...).

(I'd like it even better if we could just bail out with an error code,
but I guess the risk of breakage might be too high with that
approach?)

2022-01-26 22:35:03

by Kees Cook

Subject: Re: [PATCH] fs/binfmt_elf: Add padding NULL when argc == 0

On Wed, Jan 26, 2022 at 07:07:20PM +0100, Jann Horn wrote:
> On Wed, Jan 26, 2022 at 6:58 PM Kees Cook <[email protected]> wrote:
> > Quoting Ariadne Conill:
> >
> > "In several other operating systems, it is a hard requirement that the
> > first argument to execve(2) be the name of a program, thus prohibiting
> > a scenario where argc < 1. POSIX 2017 also recommends this behaviour,
> > but it is not an explicit requirement[1]:
> >
> > The argument arg0 should point to a filename string that is
> > associated with the process being started by one of the exec
> > functions.
> > ...
> > Interestingly, Michael Kerrisk opened an issue about this in 2008[2],
> > but there was no consensus to support fixing this issue then.
> > Hopefully now that CVE-2021-4034 shows practical exploitative use[3]
> > of this bug in a shellcode, we can reconsider."
> >
> > An examination of existing[4] users of execve(..., NULL, NULL) shows
> > mostly test code, or example rootkit code. While rejecting a NULL argv
> > would be preferred, it looks like the main cause of userspace confusion
> > is an assumption that argc >= 1, and buggy programs may skip argv[0]
> > when iterating. To protect against userspace bugs of this nature, insert
> > an extra NULL pointer in argv when argc == 0, so that argv[1] != envp[0].
> >
> > Note that this is only done in the argc == 0 case because some userspace
> > programs expect to find envp at exactly argv[argc]. The overlap of these
> > two misguided assumptions is believed to be zero.
>
> Will this result in the executed program being told that argc==0 but
> having an extra NULL pointer on the stack?
> If so, I believe this breaks the x86-64 ABI documented at
> https://refspecs.linuxbase.org/elf/x86_64-abi-0.99.pdf - page 29,
> figure 3.9 describes the layout of the initial process stack.
>
> Actually, does this even work? Can a program still properly access its
> environment variables when invoked with argc==0 with this patch
> applied? AFAIU the way userspace locates envv on x86-64 is by
> calculating 8*(argc+1)?

Hrm, yeah, I guess it's libc providing the envp pointer; it's not passed
separately. Hrm.

--
Kees Cook

2022-01-26 22:35:36

by Kees Cook

Subject: Re: [PATCH] fs/binfmt_elf: Add padding NULL when argc == 0

On Wed, Jan 26, 2022 at 08:50:39PM +0100, Jann Horn wrote:
> On Wed, Jan 26, 2022 at 7:42 PM Ariadne Conill <[email protected]> wrote:
> > On Wed, 26 Jan 2022, Jann Horn wrote:
> > > On Wed, Jan 26, 2022 at 6:58 PM Kees Cook <[email protected]> wrote:
> > >> Quoting Ariadne Conill:
> > >>
> > >> "In several other operating systems, it is a hard requirement that the
> > >> first argument to execve(2) be the name of a program, thus prohibiting
> > >> a scenario where argc < 1. POSIX 2017 also recommends this behaviour,
> > >> but it is not an explicit requirement[1]:
> > >>
> > >> The argument arg0 should point to a filename string that is
> > >> associated with the process being started by one of the exec
> > >> functions.
> > >> ...
> > >> Interestingly, Michael Kerrisk opened an issue about this in 2008[2],
> > >> but there was no consensus to support fixing this issue then.
> > >> Hopefully now that CVE-2021-4034 shows practical exploitative use[3]
> > >> of this bug in a shellcode, we can reconsider."
> > >>
> > >> An examination of existing[4] users of execve(..., NULL, NULL) shows
> > >> mostly test code, or example rootkit code. While rejecting a NULL argv
> > >> would be preferred, it looks like the main cause of userspace confusion
> > >> is an assumption that argc >= 1, and buggy programs may skip argv[0]
> > >> when iterating. To protect against userspace bugs of this nature, insert
> > >> an extra NULL pointer in argv when argc == 0, so that argv[1] != envp[0].
> > >>
> > >> Note that this is only done in the argc == 0 case because some userspace
> > >> programs expect to find envp at exactly argv[argc]. The overlap of these
> > >> two misguided assumptions is believed to be zero.
> > >
> > > Will this result in the executed program being told that argc==0 but
> > > having an extra NULL pointer on the stack?
> > > If so, I believe this breaks the x86-64 ABI documented at
> > > https://refspecs.linuxbase.org/elf/x86_64-abi-0.99.pdf - page 29,
> > > figure 3.9 describes the layout of the initial process stack.
> >
> > I'm presently compiling a kernel with the patch to see if it works or not.
> >
> > > Actually, does this even work? Can a program still properly access its
> > > environment variables when invoked with argc==0 with this patch
> > > applied? AFAIU the way userspace locates envv on x86-64 is by
> > > calculating 8*(argc+1)?
> >
> > In the other thread, it was suggested that perhaps we should set up an
> > argv of {"", NULL}. In that case, it seems like it would be safe to claim
> > argc == 1.
> >
> > What do you think?
>
> Sounds good to me, since that's something that could also happen
> normally if userspace calls execve(..., {"", NULL}, ...).
>
> (I'd like it even better if we could just bail out with an error code,
> but I guess the risk of breakage might be too high with that
> approach?)

We can't mutate argc; it'll turn at least some userspace into an
infinite loop:
https://sources.debian.org/src/valgrind/1:3.18.1-1/none/tests/execve.c/?hl=22#L22

--
Kees Cook

2022-01-26 22:36:50

by Matthew Wilcox

Subject: Re: [PATCH] fs/binfmt_elf: Add padding NULL when argc == 0

On Wed, Jan 26, 2022 at 11:58:39AM -0800, Kees Cook wrote:
> On Wed, Jan 26, 2022 at 08:50:39PM +0100, Jann Horn wrote:
> > On Wed, Jan 26, 2022 at 7:42 PM Ariadne Conill <[email protected]> wrote:
> > > On Wed, 26 Jan 2022, Jann Horn wrote:
> > > > On Wed, Jan 26, 2022 at 6:58 PM Kees Cook <[email protected]> wrote:
> > > >> Quoting Ariadne Conill:
> > > >>
> > > >> "In several other operating systems, it is a hard requirement that the
> > > >> first argument to execve(2) be the name of a program, thus prohibiting
> > > >> a scenario where argc < 1. POSIX 2017 also recommends this behaviour,
> > > >> but it is not an explicit requirement[1]:
> > > >>
> > > >> The argument arg0 should point to a filename string that is
> > > >> associated with the process being started by one of the exec
> > > >> functions.
> > > >> ...
> > > >> Interestingly, Michael Kerrisk opened an issue about this in 2008[2],
> > > >> but there was no consensus to support fixing this issue then.
> > > >> Hopefully now that CVE-2021-4034 shows practical exploitative use[3]
> > > >> of this bug in a shellcode, we can reconsider."
> > > >>
> > > >> An examination of existing[4] users of execve(..., NULL, NULL) shows
> > > >> mostly test code, or example rootkit code. While rejecting a NULL argv
> > > >> would be preferred, it looks like the main cause of userspace confusion
> > > >> is an assumption that argc >= 1, and buggy programs may skip argv[0]
> > > >> when iterating. To protect against userspace bugs of this nature, insert
> > > >> an extra NULL pointer in argv when argc == 0, so that argv[1] != envp[0].
> > > >>
> > > >> Note that this is only done in the argc == 0 case because some userspace
> > > >> programs expect to find envp at exactly argv[argc]. The overlap of these
> > > >> two misguided assumptions is believed to be zero.
> > > >
> > > > Will this result in the executed program being told that argc==0 but
> > > > having an extra NULL pointer on the stack?
> > > > If so, I believe this breaks the x86-64 ABI documented at
> > > > https://refspecs.linuxbase.org/elf/x86_64-abi-0.99.pdf - page 29,
> > > > figure 3.9 describes the layout of the initial process stack.
> > >
> > > I'm presently compiling a kernel with the patch to see if it works or not.
> > >
> > > > Actually, does this even work? Can a program still properly access its
> > > > environment variables when invoked with argc==0 with this patch
> > > > applied? AFAIU the way userspace locates envv on x86-64 is by
> > > > calculating 8*(argc+1)?
> > >
> > > In the other thread, it was suggested that perhaps we should set up an
> > > argv of {"", NULL}. In that case, it seems like it would be safe to claim
> > > argc == 1.
> > >
> > > What do you think?
> >
> > Sounds good to me, since that's something that could also happen
> > normally if userspace calls execve(..., {"", NULL}, ...).
> >
> > (I'd like it even better if we could just bail out with an error code,
> > but I guess the risk of breakage might be too high with that
> > approach?)
>
> We can't mutate argc; it'll turn at least some userspace into an
> infinite loop:
> https://sources.debian.org/src/valgrind/1:3.18.1-1/none/tests/execve.c/?hl=22#L22

How does that become an infinite loop? We obviously wouldn't mutate
argc in the caller, just the callee.

Also, there's a version of this where we only mutate argc if we're
executing a setuid program, which would remove the privilege
escalation part of things.

2022-01-26 22:37:08

by Ariadne Conill

Subject: Re: [PATCH] fs/binfmt_elf: Add padding NULL when argc == 0

Hi,

On Wed, 26 Jan 2022, Kees Cook wrote:

> Quoting Ariadne Conill:
>
> "In several other operating systems, it is a hard requirement that the
> first argument to execve(2) be the name of a program, thus prohibiting
> a scenario where argc < 1. POSIX 2017 also recommends this behaviour,
> but it is not an explicit requirement[1]:
>
> The argument arg0 should point to a filename string that is
> associated with the process being started by one of the exec
> functions.
> ...
> Interestingly, Michael Kerrisk opened an issue about this in 2008[2],
> but there was no consensus to support fixing this issue then.
> Hopefully now that CVE-2021-4034 shows practical exploitative use[3]
> of this bug in a shellcode, we can reconsider."
>
> An examination of existing[4] users of execve(..., NULL, NULL) shows
> mostly test code, or example rootkit code. While rejecting a NULL argv
> would be preferred, it looks like the main cause of userspace confusion
> is an assumption that argc >= 1, and buggy programs may skip argv[0]
> when iterating. To protect against userspace bugs of this nature, insert
> an extra NULL pointer in argv when argc == 0, so that argv[1] != envp[0].
>
> Note that this is only done in the argc == 0 case because some userspace
> programs expect to find envp at exactly argv[argc]. The overlap of these
> two misguided assumptions is believed to be zero.
>
> [1] https://pubs.opengroup.org/onlinepubs/9699919799/functions/exec.html
> [2] https://bugzilla.kernel.org/show_bug.cgi?id=8408
> [3] https://www.qualys.com/2022/01/25/cve-2021-4034/pwnkit.txt
> [4] https://codesearch.debian.net/search?q=execve%5C+*%5C%28%5B%5E%2C%5D%2B%2C+*NULL&literal=0
>
> Reported-by: Ariadne Conill <[email protected]>
> Reported-by: Michael Kerrisk <[email protected]>
> Cc: Matthew Wilcox <[email protected]>
> Cc: Christian Brauner <[email protected]>
> Cc: Rich Felker <[email protected]>
> Cc: Eric Biederman <[email protected]>
> Cc: Alexander Viro <[email protected]>
> Cc: [email protected]
> Cc: [email protected]
> Signed-off-by: Kees Cook <[email protected]>

Tested-by: Ariadne Conill <[email protected]>

It seems to work, but I still think bailing early with -EINVAL is a more
reasonable position to take. For example, the following code, when used
with BusyBox applets results in a segfault, as the multicall stub does not
support scenarios where argc < 1:

#include <stdio.h>
#include <unistd.h>
#include <sys/syscall.h>

int main(int argc, const char **argv) {
    if (syscall(SYS_execve, "/bin/date", NULL, NULL) < 0)
        perror("execve");
    return 0;
}

Ariadne
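
For comparison, a defensive multicall dispatcher might check argc before touching argv[0]. The sketch below is not BusyBox code; applet_name() is a hypothetical helper showing the kind of guard a multicall stub needs when argc < 1.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Derive the applet name from argv[0] (its basename).
 * Returns NULL rather than dereferencing argv[0] when argc < 1.
 * Hypothetical helper, not the BusyBox implementation.
 */
const char *applet_name(int argc, char *const argv[])
{
    const char *slash;

    assert(argc >= 0);            /* argc is never negative */
    if (argc < 1 || argv == NULL || argv[0] == NULL)
        return NULL;              /* nothing to dispatch on */
    slash = strrchr(argv[0], '/');
    return slash ? slash + 1 : argv[0];
}
```

A caller would treat a NULL return as an error and exit instead of looking up an applet.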

2022-01-26 22:37:16

by Kees Cook

Subject: Re: [PATCH] fs/binfmt_elf: Add padding NULL when argc == 0

On Wed, Jan 26, 2022 at 08:08:14PM +0000, Matthew Wilcox wrote:
> On Wed, Jan 26, 2022 at 11:58:39AM -0800, Kees Cook wrote:
> > We can't mutate argc; it'll turn at least some userspace into an
> > infinite loop:
> > https://sources.debian.org/src/valgrind/1:3.18.1-1/none/tests/execve.c/?hl=22#L22
>
> How does that become an infinite loop? We obviously wouldn't mutate
> argc in the caller, just the callee.

Oh, sorry, I misread. It's using /bin/true, not argv[0] (another bit of
code I found was using argv[0]). Yeah, {"", NULL} could work.

> Also, there's a version of this where we only mutate argc if we're
> executing a setuid program, which would remove the privilege
> escalation part of things.

True; though I'd like to keep the logic as non-specialized as possible.
I don't like making stuff conditional on privilege boundaries if we can
make it always happen.

--
Kees Cook

2022-01-26 22:42:26

by Eric W. Biederman

Subject: Re: [PATCH] fs/binfmt_elf: Add padding NULL when argc == 0

Kees Cook <[email protected]> writes:

> On Wed, Jan 26, 2022 at 08:08:14PM +0000, Matthew Wilcox wrote:
>> On Wed, Jan 26, 2022 at 11:58:39AM -0800, Kees Cook wrote:
>> > We can't mutate argc; it'll turn at least some userspace into an
>> > infinite loop:
>> > https://sources.debian.org/src/valgrind/1:3.18.1-1/none/tests/execve.c/?hl=22#L22
>>
>> How does that become an infinite loop? We obviously wouldn't mutate
>> argc in the caller, just the callee.
>
> Oh, sorry, I misread. It's using /bin/true, not argv[0] (another bit of
> code I found was using argv[0]). Yeah, {"", NULL} could work.
>
>> Also, there's a version of this where we only mutate argc if we're
>> executing a setuid program, which would remove the privilege
>> escalation part of things.
>
> True; though I'd like to keep the logic as non-specialized as possible.
> I don't like making stuff conditional on privilege boundaries if we can
> make it always happen.

Which I think means turning the argc == 0 case into { "", NULL }.
I think we can always do that, and it is already valid in userspace.

The only case I can imagine breaking would be a program explicitly testing
for argc == 0 and behaving completely differently if that is passed
to it.

Eric
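
The { "", NULL } idea can be modelled in userspace as follows. This is a sketch of the semantics only, not of how the kernel would implement it; normalize_argv() and the static fallback vector are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Shared fallback vector: argv = { "", NULL }, so argc == 1. */
static char empty_arg[] = "";
static char *fallback_argv[] = { empty_arg, NULL };

/*
 * If the caller supplied no arguments, substitute { "", NULL } and
 * report argc == 1; otherwise pass the vector through unchanged.
 * Returns the argc the new program would observe.
 */
int normalize_argv(int argc, char **argv, char ***out_argv)
{
    assert(out_argv != NULL);
    if (argc < 1 || argv == NULL) {
        *out_argv = fallback_argv;
        return 1;
    }
    *out_argv = argv;
    return argc;
}
```

The result is indistinguishable from a caller that passed { "", NULL } itself, which is why the layout contract between argc, argv and envp stays intact.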


2022-01-27 00:38:48

by Ariadne Conill

Subject: Re: [PATCH] fs/binfmt_elf: Add padding NULL when argc == 0

Hi,

On Wed, 26 Jan 2022, Ariadne Conill wrote:

> Hi,
>
> On Wed, 26 Jan 2022, Kees Cook wrote:
>
>> Quoting Ariadne Conill:
>>
>> "In several other operating systems, it is a hard requirement that the
>> first argument to execve(2) be the name of a program, thus prohibiting
>> a scenario where argc < 1. POSIX 2017 also recommends this behaviour,
>> but it is not an explicit requirement[1]:
>>
>> The argument arg0 should point to a filename string that is
>> associated with the process being started by one of the exec
>> functions.
>> ...
>> Interestingly, Michael Kerrisk opened an issue about this in 2008[2],
>> but there was no consensus to support fixing this issue then.
>> Hopefully now that CVE-2021-4034 shows practical exploitative use[3]
>> of this bug in a shellcode, we can reconsider."
>>
>> An examination of existing[4] users of execve(..., NULL, NULL) shows
>> mostly test code, or example rootkit code. While rejecting a NULL argv
>> would be preferred, it looks like the main cause of userspace confusion
>> is an assumption that argc >= 1, and buggy programs may skip argv[0]
>> when iterating. To protect against userspace bugs of this nature, insert
>> an extra NULL pointer in argv when argc == 0, so that argv[1] != envp[0].
>>
>> Note that this is only done in the argc == 0 case because some userspace
>> programs expect to find envp at exactly argv[argc]. The overlap of these
>> two misguided assumptions is believed to be zero.
>>
>> [1] https://pubs.opengroup.org/onlinepubs/9699919799/functions/exec.html
>> [2] https://bugzilla.kernel.org/show_bug.cgi?id=8408
>> [3] https://www.qualys.com/2022/01/25/cve-2021-4034/pwnkit.txt
>> [4]
>> https://codesearch.debian.net/search?q=execve%5C+*%5C%28%5B%5E%2C%5D%2B%2C+*NULL&literal=0
>>
>> Reported-by: Ariadne Conill <[email protected]>
>> Reported-by: Michael Kerrisk <[email protected]>
>> Cc: Matthew Wilcox <[email protected]>
>> Cc: Christian Brauner <[email protected]>
>> Cc: Rich Felker <[email protected]>
>> Cc: Eric Biederman <[email protected]>
>> Cc: Alexander Viro <[email protected]>
>> Cc: [email protected]
>> Cc: [email protected]
>> Signed-off-by: Kees Cook <[email protected]>
>
> Tested-by: Ariadne Conill <[email protected]>
>
> It seems to work, but I still think bailing early with -EINVAL is a more
> reasonable position to take. For example, the following code, when used with
> BusyBox applets results in a segfault, as the multicall stub does not support
> scenarios where argc < 1:
>
> #include <stdio.h>
> #include <unistd.h>
> #include <sys/syscall.h>
>
> int main(int argc, const char **argv) {
>     if (syscall(SYS_execve, "/bin/date", NULL, NULL) < 0)
>         perror("execve");
>     return 0;
> }
>

Further testing indicates that while things *mostly* work, it results in
memory corruption in various tasks, for example, trying to build a new
kernel hung, and the gcc process's name was a bunch of uninitialized
memory. So, I don't think { NULL, NULL } is a good way to go.

Ariadne
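
The thread does not establish what caused the corruption. One thing that can be checked independently is the slot arithmetic in the posted patch, which uses min(argc, 1) + 1 for the argv portion. The sketch below works through that expression next to a max() variant; it assumes the intent was to reserve argc + 1 slots, plus one extra when argc == 0. It is a reading aid with hypothetical function names, not a diagnosis.

```c
#include <assert.h>

/* argv slots reserved by the posted expression: min(argc, 1) + 1. */
int argv_slots_min(int argc)
{
    assert(argc >= 0);
    return (argc < 1 ? argc : 1) + 1;
}

/* argv slots reserved by a max() variant: max(argc, 1) + 1. */
int argv_slots_max(int argc)
{
    assert(argc >= 0);
    return (argc > 1 ? argc : 1) + 1;
}

/* Slots assumed to be needed: argc pointers plus the terminator,
 * plus one padding NULL when argc == 0. */
int argv_slots_needed(int argc)
{
    assert(argc >= 0);
    return argc + 1 + (argc == 0 ? 1 : 0);
}
```

Under these assumptions the min() form yields 1 slot for argc == 0 and 2 slots for argc == 5, while the max() form yields 2 and 6, which matches argv_slots_needed() in both cases.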

2022-01-27 00:38:48

by Rich Felker

Subject: Re: [PATCH] fs/binfmt_elf: Add padding NULL when argc == 0

On Wed, Jan 26, 2022 at 09:57:47AM -0800, Kees Cook wrote:
> Quoting Ariadne Conill:
>
> "In several other operating systems, it is a hard requirement that the
> first argument to execve(2) be the name of a program, thus prohibiting
> a scenario where argc < 1. POSIX 2017 also recommends this behaviour,
> but it is not an explicit requirement[1]:
>
> The argument arg0 should point to a filename string that is
> associated with the process being started by one of the exec
> functions.
> ...
> Interestingly, Michael Kerrisk opened an issue about this in 2008[2],
> but there was no consensus to support fixing this issue then.
> Hopefully now that CVE-2021-4034 shows practical exploitative use[3]
> of this bug in a shellcode, we can reconsider."
>
> An examination of existing[4] users of execve(..., NULL, NULL) shows
> mostly test code, or example rootkit code. While rejecting a NULL argv
> would be preferred, it looks like the main cause of userspace confusion
> is an assumption that argc >= 1, and buggy programs may skip argv[0]
> when iterating. To protect against userspace bugs of this nature, insert
> an extra NULL pointer in argv when argc == 0, so that argv[1] != envp[0].
>
> Note that this is only done in the argc == 0 case because some userspace
> programs expect to find envp at exactly argv[argc]. The overlap of these
> two misguided assumptions is believed to be zero.
>
> [1] https://pubs.opengroup.org/onlinepubs/9699919799/functions/exec.html
> [2] https://bugzilla.kernel.org/show_bug.cgi?id=8408
> [3] https://www.qualys.com/2022/01/25/cve-2021-4034/pwnkit.txt
> [4] https://codesearch.debian.net/search?q=execve%5C+*%5C%28%5B%5E%2C%5D%2B%2C+*NULL&literal=0
>
> Reported-by: Ariadne Conill <[email protected]>
> Reported-by: Michael Kerrisk <[email protected]>
> Cc: Matthew Wilcox <[email protected]>
> Cc: Christian Brauner <[email protected]>
> Cc: Rich Felker <[email protected]>
> Cc: Eric Biederman <[email protected]>
> Cc: Alexander Viro <[email protected]>
> Cc: [email protected]
> Cc: [email protected]
> Signed-off-by: Kees Cook <[email protected]>
> ---
> fs/binfmt_elf.c | 10 +++++++++-
> fs/exec.c | 7 ++++++-
> 2 files changed, 15 insertions(+), 2 deletions(-)
>
> diff --git a/fs/binfmt_elf.c b/fs/binfmt_elf.c
> index 605017eb9349..e456c48658ad 100644
> --- a/fs/binfmt_elf.c
> +++ b/fs/binfmt_elf.c
> @@ -297,7 +297,8 @@ create_elf_tables(struct linux_binprm *bprm, const struct elfhdr *exec,
> ei_index = elf_info - (elf_addr_t *)mm->saved_auxv;
> sp = STACK_ADD(p, ei_index);
>
> - items = (argc + 1) + (envc + 1) + 1;
> + /* Make room for extra pointer when argc == 0. See below. */
> + items = (min(argc, 1) + 1) + (envc + 1) + 1;
> bprm->p = STACK_ROUND(sp, items);
>
> /* Point sp at the lowest address on the stack */
> @@ -326,6 +327,13 @@ create_elf_tables(struct linux_binprm *bprm, const struct elfhdr *exec,
>
> /* Populate list of argv pointers back to argv strings. */
> p = mm->arg_end = mm->arg_start;
> + /*
> + * Include an extra NULL pointer in argv when argc == 0 so
> + * that argv[1] != envp[0] to help userspace programs from
> + * mishandling argc == 0. See fs/exec.c bprm_stack_limits().
> + */
> + if (argc == 0 && put_user(0, sp++))
> + return -EFAULT;
> while (argc-- > 0) {
> size_t len;
> if (put_user((elf_addr_t)p, sp++))
> diff --git a/fs/exec.c b/fs/exec.c
> index 79f2c9483302..0b36384e55b1 100644
> --- a/fs/exec.c
> +++ b/fs/exec.c
> @@ -495,8 +495,13 @@ static int bprm_stack_limits(struct linux_binprm *bprm)
> * the stack. They aren't stored until much later when we can't
> * signal to the parent that the child has run out of stack space.
> * Instead, calculate it here so it's possible to fail gracefully.
> + *
> + * In the case of argc < 1, make sure there is a NULL pointer gap
> + * between argv and envp to ensure confused userspace programs don't
> + * start processing from argv[1], thinking argc can never be 0,
> + * to block them from walking envp by accident. See fs/binfmt_elf.c.
> */
> - ptr_size = (bprm->argc + bprm->envc) * sizeof(void *);
> + ptr_size = (min(bprm->argc, 1) + bprm->envc) * sizeof(void *);
> if (limit <= ptr_size)
> return -E2BIG;
> limit -= ptr_size;
> --
> 2.30.2
>
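
The slot arithmetic in the fs/binfmt_elf.c hunk above can be checked in isolation. The sketch below is a standalone illustration, not kernel code: the helper names are hypothetical, and it assumes the trailing "+ 1" in items accounts for the argc word stored just below argv on the stack. Under that assumption, min(argc, 1) reserves too few slots for every argc other than 1, whereas max(argc, 1) would match the intent stated in the hunk's comment.

```c
#include <stddef.h>

/*
 * Slot count as written in the quoted hunk (hypothetical helper):
 * (min(argc, 1) + 1) + (envc + 1) + 1
 */
static size_t items_as_posted(size_t argc, size_t envc)
{
    size_t n = argc < 1 ? argc : 1;   /* min(argc, 1) */

    return (n + 1) + (envc + 1) + 1;
}

/*
 * Slots actually written with the padding NULL in place (hypothetical
 * helper): the argc word, the argv pointers (or one padding NULL when
 * argc == 0), the argv terminator, the envp pointers, the envp terminator.
 */
static size_t items_needed(size_t argc, size_t envc)
{
    size_t n = argc > 1 ? argc : 1;   /* max(argc, 1) */

    return (n + 1) + (envc + 1) + 1;
}
```

With argc == 3 and envc == 2, the posted formula gives 6 slots where 8 are written; with argc == 0 and envc == 2, it gives 5 where 6 are written. Only argc == 1 agrees.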

This patch is not just wrong, but extremely dangerously wrong, to the
point that it may make all suid-root binaries exploitable (at least
dynamically linked ones).

The ELF entry point contract is that argv+argc+1==envp, and in fact
this is the "preferred" way of computing envp so as to avoid linear
search over argv. In musl's dynamic linker we do exactly that; I'm not
sure about glibc's. See:

https://git.musl-libc.org/cgit/musl/tree/ldso/dynlink.c?id=v1.2.2#n1740

If argv[argc+1] wrongly contains a null pointer, semantically, that
means the environment is empty and auxv starts at the next stack slot.
It's an exercise for the reader to populate the environment in a way
that this memory wrongly gets interpreted as a meaningful auxv. I'm
not sure this is possible, but I wouldn't automatically rule it out.
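
To make the contract concrete, here is a minimal sketch of computing envp from argc and argv without scanning argv. It is a standalone illustration with hypothetical helper names, not the code from musl's dynlink.c; the arrays passed to it stand in for the pointer slots of the initial process stack.

```c
#include <stddef.h>

/*
 * Derive envp directly from the entry-point contract
 * argv + argc + 1 == envp, with no linear search over argv.
 * (Hypothetical helper for illustration.)
 */
static char **envp_from_argv(int argc, char **argv)
{
    return argv + argc + 1;
}

/* Count environment entries up to the terminating NULL pointer. */
static size_t env_count(char **envp)
{
    size_t n = 0;

    while (envp[n])
        n++;
    return n;
}
```

For an unpadded argc == 0 layout such as { NULL, "A=1", "B=2", NULL }, env_count(envp_from_argv(0, slots)) sees two entries. For the padded layout { NULL, NULL, "A=1", "B=2", NULL }, the same call lands on the second NULL and sees zero entries, so the real environment strings sit where a loader following the contract would start reading auxv.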

In short: YOU CANNOT CHANGE/BREAK CONTRACTS TO MITIGATE A VULN. Doing
so just makes new vulns in the programs that were correct before.

Silently replacing argc==0 with argc==1 and argv[0]=="" would be a
safe variant of this, but I'm really in favor of just erroring out,
but *only doing it when the exec is a privilege boundary* (suid/etc.)
to minimize the chance of breaking software dependent on allowing
argc==0.

Rich

2022-01-31 23:13:12

by Oliver Sang

Subject: [fs/binfmt_elf] 4736b95ed2: kernel-selftests.x86.make_fail



Greetings,

FYI, we noticed the following commit (built with gcc-9):

commit: 4736b95ed241d76c59d34859cb77703cf587dcee ("[PATCH] fs/binfmt_elf: Add padding NULL when argc == 0")
url: https://github.com/0day-ci/linux/commits/Kees-Cook/fs-binfmt_elf-Add-padding-NULL-when-argc-0/20220127-015851
base: https://git.kernel.org/cgit/linux/kernel/git/kees/linux.git for-next/pstore
patch link: https://lore.kernel.org/linux-fsdevel/[email protected]

in testcase: kernel-selftests
version: kernel-selftests-x86_64-f050cde9-1_20220127
with following parameters:

group: x86

test-description: The kernel contains a set of "self tests" under the tools/testing/selftests/ directory. These are intended to be small unit tests to exercise individual code paths in the kernel.
test-url: https://www.kernel.org/doc/Documentation/kselftest.txt


on test machine: qemu-system-x86_64 -enable-kvm -cpu SandyBridge -smp 2 -m 16G

caused the changes below (please refer to the attached dmesg/kmsg for the entire log/backtrace):




If you fix the issue, kindly add the following tag:
Reported-by: kernel test robot <[email protected]>


gcc -m32 -o /usr/src/perf_selftests-x86_64-rhel-8.3-kselftests-4736b95ed241d76c59d34859cb77703cf587dcee/tools/testing/selftests/x86/test_FCOMI_32 -O2 -g -std=gnu99 -pthread -Wall -no-pie -DCAN_BUILD_32 test_FCOMI.c helpers.h -lrt -ldl -lm
/usr/bin/ld: bad -plugin-opt option
: error: ld returned 1 exit status
make: *** [Makefile:75: /usr/src/perf_selftests-x86_64-rhel-8.3-kselftests-4736b95ed241d76c59d34859cb77703cf587dcee/tools/testing/selftests/x86/test_FCOMI_32] Error 1


Please note that the details of the above failure are in the attachments.

We also saw other types of make failures, none of which were observed on the
parent commit.

(1)
gcc -m32 -o /usr/src/perf_selftests-x86_64-rhel-8.3-kselftests-4736b95ed241d76c59d34859cb77703cf587dcee/tools/testing/selftests/x86/test_mremap_vdso_32 -O2 -g -std=gnu99 -pthread -Wall -no-pie -DCAN_BUILD_32 -DCAN_BUILD_64 test_mremap_vdso.c helpers.h -lrt -ldl -lm
: error: : No such file or directory
: error: ^_: No such file or directory
: error: : No such file or directory
make: *** [Makefile:75: /usr/src/perf_selftests-x86_64-rhel-8.3-kselftests-4736b95ed241d76c59d34859cb77703cf587dcee/tools/testing/selftests/x86/test_mremap_vdso_32] Error 1

(2)
gcc -m32 -o /usr/src/perf_selftests-x86_64-rhel-8.3-kselftests-4736b95ed241d76c59d34859cb77703cf587dcee/tools/testing/selftests/x86/check_initial_reg_state_32 -O2 -g -std=gnu99 -pthread -Wall -no-pie -Wl,-ereal_start -static -DCAN_BUILD_32 -DCAN_BUILD_64 check_initial_reg_state.c helpers.h -lrt -ldl -lm
: error: too many filenames given. Type --help for usage
: fatal error: ~W▒~▒^?: No such file or directory
compilation terminated.
make: *** [Makefile:75: /usr/src/perf_selftests-x86_64-rhel-8.3-kselftests-4736b95ed241d76c59d34859cb77703cf587dcee/tools/testing/selftests/x86/check_initial_reg_state_32] Error 1




To reproduce:

# build kernel
cd linux
cp config-5.16.0-rc1-00002-g4736b95ed241 .config
make HOSTCC=gcc-9 CC=gcc-9 ARCH=x86_64 olddefconfig prepare modules_prepare bzImage modules
make HOSTCC=gcc-9 CC=gcc-9 ARCH=x86_64 INSTALL_MOD_PATH=<mod-install-dir> modules_install
cd <mod-install-dir>
find lib/ | cpio -o -H newc --quiet | gzip > modules.cgz


git clone https://github.com/intel/lkp-tests.git
cd lkp-tests
bin/lkp qemu -k <bzImage> -m modules.cgz job-script # job-script is attached in this email

# If you come across any failure that blocks the test,
# please remove the ~/.lkp and /lkp directories to run from a clean state.



---
0DAY/LKP+ Test Infrastructure Open Source Technology Center
https://lists.01.org/hyperkitty/list/[email protected] Intel Corporation

Thanks,
Oliver Sang


Attachments:
(No filename) (3.81 kB)
config-5.16.0-rc1-00002-g4736b95ed241 (179.97 kB)
job-script (5.19 kB)
dmesg.xz (18.70 kB)
kernel-selftests (9.64 kB)