2023-10-11 03:42:16

by Rik van Riel

[permalink] [raw]
Subject: [PATCH] execve.2: execve also returns E2BIG if a string is too long

Document that if a command line or environment string is too long (> MAX_ARG_STRLEN), execve will also return E2BIG.

Signed-off-by: Rik van Riel <[email protected]>
---
man2/execve.2 | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/man2/execve.2 b/man2/execve.2
index 0d9582492ad1..c1a359d01872 100644
--- a/man2/execve.2
+++ b/man2/execve.2
@@ -449,7 +449,7 @@ The total number of bytes in the environment
.RI ( envp )
and argument list
.RI ( argv )
-is too large.
+is too large, or an argument or environment string is too long.
.TP
.B EACCES
Search permission is denied on a component of the path prefix of
--
2.41.0



2023-10-11 10:41:46

by Alejandro Colomar

[permalink] [raw]
Subject: Re: [PATCH] execve.2: execve also returns E2BIG if a string is too long

Hi Rik,

On Tue, Oct 10, 2023 at 11:41:53PM -0400, Rik van Riel wrote:
> Document that if a command line or environment string is too long (> MAX_ARG_STRLEN), execve will also return E2BIG.

That's already implied by the current text:

E2BIG The total number of bytes in the environment (envp) and argument
list (argv) is too large.

That means that

size_t bytes;

bytes = 0;
for (char *e = envp; e != NULL; e++)
bytes += strlen(e) + 1; // I have doubts about the +1
for (char *a = argv; a != NULL; a++)
bytes += strlen(a) + 1; // Same doubts

if (bytes > MAX_ARG_STRLEN) // Maybe >= ?
return -E2BIG;

>
> Signed-off-by: Rik van Riel <[email protected]>
> ---
> man2/execve.2 | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/man2/execve.2 b/man2/execve.2
> index 0d9582492ad1..c1a359d01872 100644
> --- a/man2/execve.2
> +++ b/man2/execve.2
> @@ -449,7 +449,7 @@ The total number of bytes in the environment
> .RI ( envp )
> and argument list
> .RI ( argv )
> -is too large.
> +is too large, or an argument or environment string is too long.

Please use semantic newlines:

$ MANWIDTH=72 man man-pages | sed -n '/Use semantic newlines/,/^$/p'
Use semantic newlines
In the source of a manual page, new sentences should be started
on new lines, long sentences should be split into lines at
clause breaks (commas, semicolons, colons, and so on), and long
clauses should be split at phrase boundaries. This convention,
sometimes known as "semantic newlines", makes it easier to see
the effect of patches, which often operate at the level of in‐
dividual sentences, clauses, or phrases.


Thanks,
Alex

> .TP
> .B EACCES
> Search permission is denied on a component of the path prefix of
> --
> 2.41.0
>
>

--
<https://www.alejandro-colomar.es/>


Attachments:
(No filename) (1.89 kB)
signature.asc (849.00 B)
Download all attachments

2023-10-11 13:21:52

by Rik van Riel

[permalink] [raw]
Subject: Re: [PATCH] execve.2: execve also returns E2BIG if a string is too long

On Wed, 2023-10-11 at 12:41 +0200, Alejandro Colomar wrote:
> Hi Rik,
>
> On Tue, Oct 10, 2023 at 11:41:53PM -0400, Rik van Riel wrote:
> > Document that if a command line or environment string is too long
> > (> MAX_ARG_STRLEN), execve will also return E2BIG.
>
> That's already implied by the current text:
>
>        E2BIG  The total number of bytes in the environment (envp) and
> argument
>               list (argv) is too large.
>
> That means that
>
> size_t  bytes;
>
> bytes = 0;
> for (char *e = envp; e != NULL; e++)
>         bytes += strlen(e) + 1;  // I have doubts about the +1
> for (char *a = argv; a != NULL; a++)
>         bytes += strlen(a) + 1;  // Same doubts
>
> if (bytes > MAX_ARG_STRLEN)  // Maybe >= ?
>         return -E2BIG;

The code in fs/exec.c enforces MAX_ARG_STRLEN against
each individual string, not against the total.

If any string, either argument or environment, is larger
than 32 * PAGE_SIZE, the kernel will return -E2BIG.

do_execveat_common() has this code, which uses copy_strings
to copy both the strings from the environment, and from
the command line arguments:

retval = copy_strings(bprm->envc, envp, bprm);
if (retval < 0)
goto out_free;

retval = copy_strings(bprm->argc, argv, bprm);
if (retval < 0)
goto out_free;

Inside copy_strings() we have this code:


while (argc-- > 0) {
...
len = strnlen_user(str, MAX_ARG_STRLEN);
if (!len)
goto out;

ret = -E2BIG;
if (!valid_arg_len(bprm, len))
goto out;

The valid_arg_len() function does not need explanation:

static bool valid_arg_len(struct linux_binprm *bprm, long len)
{
return len <= MAX_ARG_STRLEN;
}


The current man page wording is very clear about the total
length being enforced, but IMHO not as clear about the limit
that gets enforced on each individual string.

The total length limit of environment & commandline arguments
is enforced by bprm_stack_limits(), and is checked against
either 1/4 of the maximum stack size, or 3/4 of _STK_LIM, whichever
is smaller. The MAX_ARG_STRLEN value does not come into play when
enforcing the total.

--
All Rights Reversed.

2023-10-11 13:44:59

by Matthew House

[permalink] [raw]
Subject: Re: [PATCH] execve.2: execve also returns E2BIG if a string is too long

On Wed, Oct 11, 2023 at 9:21 AM Rik van Riel <[email protected]> wrote:
> On Wed, 2023-10-11 at 12:41 +0200, Alejandro Colomar wrote:
> > Hi Rik,
> >
> > On Tue, Oct 10, 2023 at 11:41:53PM -0400, Rik van Riel wrote:
> > > Document that if a command line or environment string is too long
> > > (> MAX_ARG_STRLEN), execve will also return E2BIG.
> >
> > That's already implied by the current text:
> >
> > E2BIG The total number of bytes in the environment (envp) and
> > argument
> > list (argv) is too large.
> >
> > That means that
> >
> > size_t bytes;
> >
> > bytes = 0;
> > for (char *e = envp; e != NULL; e++)
> > bytes += strlen(e) + 1; // I have doubts about the +1
> > for (char *a = argv; a != NULL; a++)
> > bytes += strlen(a) + 1; // Same doubts
> >
> > if (bytes > MAX_ARG_STRLEN) // Maybe >= ?
> > return -E2BIG;
>
> The code in fs/exec.c enforces MAX_ARG_STRLEN against
> each individual string, not against the total.
>
> If any string, either argument or environment, is larger
> than 32 * PAGE_SIZE, the kernel will return -E2BIG.
>
> do_execveat_common() has this code, which uses copy_strings
> to copy both the strings from the environment, and from
> the command line arguments:
>
> retval = copy_strings(bprm->envc, envp, bprm);
> if (retval < 0)
> goto out_free;
>
> retval = copy_strings(bprm->argc, argv, bprm);
> if (retval < 0)
> goto out_free;
>
> Inside copy_strings() we have this code:
>
>
> while (argc-- > 0) {
> ...
> len = strnlen_user(str, MAX_ARG_STRLEN);
> if (!len)
> goto out;
>
> ret = -E2BIG;
> if (!valid_arg_len(bprm, len))
> goto out;
>
> The valid_arg_len() function does not need explanation:
>
> static bool valid_arg_len(struct linux_binprm *bprm, long len)
> {
> return len <= MAX_ARG_STRLEN;
> }
>
>
> The current man page wording is very clear about the total
> length being enforced, but IMHO not as clear about the limit
> that gets enforced on each individual string.
>
> The total length limit of environment & commandline arguments
> is enforced by bprm_stack_limits(), and is checked against
> either 1/4 of the maximum stack size, or 3/4 of _STK_LIM, whichever
> is smaller. The MAX_ARG_STRLEN value does not come into play when
> enforcing the total.

To expand on this, there are basically two separate byte limits in
fs/exec.c, one for each individual argv/envp string, and another for all
strings and all pointers to them as a whole. To put the whole thing in
pseudocode, the checks work effectively like this, assuming I haven't made
any errors:

int argc, envc;
unsigned long bytes, limit;

/* assume that argv has already been adjusted to add an empty argv[0] */
argc = 0, envc = 0, bytes = 0;
for (char **a = argv; *a != NULL; a++, argc++) {
if (strlen(*a) >= MAX_ARG_STRLEN)
return -E2BIG;
bytes += strlen(*a) + 1;
}
for (char **e = envp; *e != NULL; e++, envc++) {
if (strlen(*e) >= MAX_ARG_STRLEN)
return -E2BIG;
bytes += strlen(*e) + 1;
}

if (argc > MAX_ARG_STRINGS || envc > MAX_ARG_STRINGS)
return -E2BIG;
bytes += (argc + envc) * sizeof(void *);

limit = max(min(_STK_LIM / 4 * 3, rlim_stack.rlim_cur / 4), ARG_MAX);
if (bytes > limit)
return -E2BIG;

Thank you,
Matthew House

2023-10-11 13:53:02

by Matthew House

[permalink] [raw]
Subject: Re: [PATCH] execve.2: execve also returns E2BIG if a string is too long

On Tue, Oct 10, 2023 at 11:51 PM Rik van Riel <[email protected]> wrote:
> Document that if a command line or environment string is too long (> MAX_ARG_STRLEN), execve will also return E2BIG.
>
> Signed-off-by: Rik van Riel <[email protected]>

It might be worth mentioning that strlen(pathname) must also be strictly
less than MAX_ARG_STRLEN, it being subject to the same restrictions as
each of the argv/envp strings.

Thank you,
Matthew House

2023-10-11 14:43:01

by Alejandro Colomar

[permalink] [raw]
Subject: Re: [PATCH] execve.2: execve also returns E2BIG if a string is too long

Hi Rik,

On Wed, Oct 11, 2023 at 09:21:28AM -0400, Rik van Riel wrote:
> On Wed, 2023-10-11 at 12:41 +0200, Alejandro Colomar wrote:
> > Hi Rik,
> >
> > On Tue, Oct 10, 2023 at 11:41:53PM -0400, Rik van Riel wrote:
> > > Document that if a command line or environment string is too long
> > > (> MAX_ARG_STRLEN), execve will also return E2BIG.
> >
> > That's already implied by the current text:
> >
> >        E2BIG  The total number of bytes in the environment (envp) and
> > argument
> >               list (argv) is too large.
> >
> > That means that
> >
> > size_t  bytes;
> >
> > bytes = 0;
> > for (char *e = envp; e != NULL; e++)
> >         bytes += strlen(e) + 1;  // I have doubts about the +1
> > for (char *a = argv; a != NULL; a++)
> >         bytes += strlen(a) + 1;  // Same doubts
> >
> > if (bytes > MAX_ARG_STRLEN)  // Maybe >= ?
> >         return -E2BIG;
>
> The code in fs/exec.c enforces MAX_ARG_STRLEN against
> each individual string, not against the total.
>
> If any string, either argument or environment, is larger
> than 32 * PAGE_SIZE, the kernel will return -E2BIG.
>
> do_execveat_common() has this code, which uses copy_strings
> to copy both the strings from the environment, and from
> the command line arguments:
>
> retval = copy_strings(bprm->envc, envp, bprm);
> if (retval < 0)
> goto out_free;
>
> retval = copy_strings(bprm->argc, argv, bprm);
> if (retval < 0)
> goto out_free;
>
> Inside copy_strings() we have this code:
>
>
> while (argc-- > 0) {
> ...
> len = strnlen_user(str, MAX_ARG_STRLEN);
> if (!len)
> goto out;
>
> ret = -E2BIG;
> if (!valid_arg_len(bprm, len))
> goto out;
>
> The valid_arg_len() function does not need explanation:
>
> static bool valid_arg_len(struct linux_binprm *bprm, long len)
> {
> return len <= MAX_ARG_STRLEN;
> }
>
>
> The current man page wording is very clear about the total
> length being enforced, but IMHO not as clear about the limit
> that gets enforced on each individual string.
>
> The total length limit of environment & commandline arguments
> is enforced by bprm_stack_limits(), and is checked against
> either 1/4 of the maximum stack size, or 3/4 of _STK_LIM, whichever
> is smaller. The MAX_ARG_STRLEN value does not come into play when
> enforcing the total.

Ahh, so the limit for each string is different than the limit for the
total. That makes sense. Sorry for the noise.

Cheers,
Alex

>
> --
> All Rights Reversed.

--
<https://www.alejandro-colomar.es/>


Attachments:
(No filename) (2.75 kB)
signature.asc (849.00 B)
Download all attachments

2023-10-11 14:44:37

by Alejandro Colomar

[permalink] [raw]
Subject: Re: [PATCH] execve.2: execve also returns E2BIG if a string is too long

Hi Matthew,

On Wed, Oct 11, 2023 at 09:44:29AM -0400, Matthew House wrote:
>
> To expand on this, there are basically two separate byte limits in
> fs/exec.c, one for each individual argv/envp string, and another for all
> strings and all pointers to them as a whole. To put the whole thing in
> pseudocode, the checks work effectively like this, assuming I haven't made
> any errors:
>
> int argc, envc;
> unsigned long bytes, limit;
>
> /* assume that argv has already been adjusted to add an empty argv[0] */
> argc = 0, envc = 0, bytes = 0;
> for (char **a = argv; *a != NULL; a++, argc++) {
> if (strlen(*a) >= MAX_ARG_STRLEN)
> return -E2BIG;
> bytes += strlen(*a) + 1;
> }
> for (char **e = envp; *e != NULL; e++, envc++) {
> if (strlen(*e) >= MAX_ARG_STRLEN)
> return -E2BIG;
> bytes += strlen(*e) + 1;
> }
>
> if (argc > MAX_ARG_STRINGS || envc > MAX_ARG_STRINGS)
> return -E2BIG;
> bytes += (argc + envc) * sizeof(void *);
>
> limit = max(min(_STK_LIM / 4 * 3, rlim_stack.rlim_cur / 4), ARG_MAX);
> if (bytes > limit)
> return -E2BIG;
>
> Thank you,
> Matthew House

Thanks!

This thing would be useful in the commit message. An example program
demonstrating it would be even better.

Cheers,
Alex

--
<https://www.alejandro-colomar.es/>


Attachments:
(No filename) (1.31 kB)
signature.asc (849.00 B)
Download all attachments

2023-10-11 14:47:40

by Alejandro Colomar

[permalink] [raw]
Subject: Re: [PATCH] execve.2: execve also returns E2BIG if a string is too long

On Wed, Oct 11, 2023 at 09:44:29AM -0400, Matthew House wrote:
> On Wed, Oct 11, 2023 at 9:21 AM Rik van Riel <[email protected]> wrote:
> > On Wed, 2023-10-11 at 12:41 +0200, Alejandro Colomar wrote:
> > > Hi Rik,
> > >
> > > On Tue, Oct 10, 2023 at 11:41:53PM -0400, Rik van Riel wrote:
> > > > Document that if a command line or environment string is too long
> > > > (> MAX_ARG_STRLEN), execve will also return E2BIG.
> > >
> > > That's already implied by the current text:
> > >
> > > E2BIG The total number of bytes in the environment (envp) and
> > > argument
> > > list (argv) is too large.
> > >
> > > That means that
> > >
> > > size_t bytes;
> > >
> > > bytes = 0;
> > > for (char *e = envp; e != NULL; e++)
> > > bytes += strlen(e) + 1; // I have doubts about the +1
> > > for (char *a = argv; a != NULL; a++)
> > > bytes += strlen(a) + 1; // Same doubts
> > >
> > > if (bytes > MAX_ARG_STRLEN) // Maybe >= ?
> > > return -E2BIG;
> >
> > The code in fs/exec.c enforces MAX_ARG_STRLEN against
> > each individual string, not against the total.
> >
> > If any string, either argument or environment, is larger
> > than 32 * PAGE_SIZE, the kernel will return -E2BIG.
> >
> > do_execveat_common() has this code, which uses copy_strings
> > to copy both the strings from the environment, and from
> > the command line arguments:
> >
> > retval = copy_strings(bprm->envc, envp, bprm);
> > if (retval < 0)
> > goto out_free;
> >
> > retval = copy_strings(bprm->argc, argv, bprm);
> > if (retval < 0)
> > goto out_free;
> >
> > Inside copy_strings() we have this code:
> >
> >
> > while (argc-- > 0) {
> > ...
> > len = strnlen_user(str, MAX_ARG_STRLEN);
> > if (!len)
> > goto out;
> >
> > ret = -E2BIG;
> > if (!valid_arg_len(bprm, len))
> > goto out;
> >
> > The valid_arg_len() function does not need explanation:
> >
> > static bool valid_arg_len(struct linux_binprm *bprm, long len)
> > {
> > return len <= MAX_ARG_STRLEN;
> > }
> >
> >
> > The current man page wording is very clear about the total
> > length being enforced, but IMHO not as clear about the limit
> > that gets enforced on each individual string.
> >
> > The total length limit of environment & commandline arguments
> > is enforced by bprm_stack_limits(), and is checked against
> > either 1/4 of the maximum stack size, or 3/4 of _STK_LIM, whichever
> > is smaller. The MAX_ARG_STRLEN value does not come into play when
> > enforcing the total.
>
> To expand on this, there are basically two separate byte limits in
> fs/exec.c, one for each individual argv/envp string, and another for all
> strings and all pointers to them as a whole. To put the whole thing in
> pseudocode, the checks work effectively like this, assuming I haven't made
> any errors:
>
> int argc, envc;
> unsigned long bytes, limit;
>
> /* assume that argv has already been adjusted to add an empty argv[0] */
> argc = 0, envc = 0, bytes = 0;
> for (char **a = argv; *a != NULL; a++, argc++) {
> if (strlen(*a) >= MAX_ARG_STRLEN)

Are you sure this is >= and not > ?

> return -E2BIG;
> bytes += strlen(*a) + 1;
> }
> for (char **e = envp; *e != NULL; e++, envc++) {
> if (strlen(*e) >= MAX_ARG_STRLEN)
> return -E2BIG;
> bytes += strlen(*e) + 1;
> }
>
> if (argc > MAX_ARG_STRINGS || envc > MAX_ARG_STRINGS)
> return -E2BIG;
> bytes += (argc + envc) * sizeof(void *);
>
> limit = max(min(_STK_LIM / 4 * 3, rlim_stack.rlim_cur / 4), ARG_MAX);
> if (bytes > limit)
> return -E2BIG;
>
> Thank you,
> Matthew House

--
<https://www.alejandro-colomar.es/>


Attachments:
(No filename) (3.82 kB)
signature.asc (849.00 B)
Download all attachments

2023-10-11 15:11:49

by Matthew House

[permalink] [raw]
Subject: Re: [PATCH] execve.2: execve also returns E2BIG if a string is too long

On Wed, Oct 11, 2023 at 10:47 AM Alejandro Colomar <[email protected]> wrote:
> On Wed, Oct 11, 2023 at 09:44:29AM -0400, Matthew House wrote:
> > To expand on this, there are basically two separate byte limits in
> > fs/exec.c, one for each individual argv/envp string, and another for all
> > strings and all pointers to them as a whole. To put the whole thing in
> > pseudocode, the checks work effectively like this, assuming I haven't made
> > any errors:
> >
> > int argc, envc;
> > unsigned long bytes, limit;
> >
> > /* assume that argv has already been adjusted to add an empty argv[0] */
> > argc = 0, envc = 0, bytes = 0;
> > for (char **a = argv; *a != NULL; a++, argc++) {
> > if (strlen(*a) >= MAX_ARG_STRLEN)
>
> Are you sure this is >= and not > ?

Yes. In general, the kernel's string limits tend to include the trailing
null byte. There are two places where this limit is enforced, one for the
pathname (or full pathname for execveat) and the other for the argv/envp
strings. The pathname is handled by copy_string_kernel():

int len = strnlen(arg, MAX_ARG_STRLEN) + 1 /* terminating NUL */;

if (len == 0)
return -EFAULT;
if (!valid_arg_len(bprm, len))
return -E2BIG;

where valid_arg_len(bprm, len) is just (len <= MAX_ARG_STRLEN). Here,
strnlen() has the same behavior as the ordinary libc strnlen(3),
effectively returning min(strlen(arg), MAX_ARG_STRLEN). Thus, the check
succeeds iff strlen(arg) + 1 <= MAX_ARG_STRLEN, or equivalently, iff
strlen(arg) < MAX_ARG_STRLEN.

Next, each of the environment and argument strings is handled by
copy_strings():

len = strnlen_user(str, MAX_ARG_STRLEN);
if (!len)
goto out;

ret = -E2BIG;
if (!valid_arg_len(bprm, len))
goto out;

The strnlen_user() function, per its documentation, is explicitly inclusive
of the trailing null byte:

* Returns the size of the string INCLUDING the terminating NUL.
* If the string is too long, returns a number larger than @count. User
* has to check the return value against "> count".
* On exception (or invalid count), returns 0.

Thus, the check succeeds iff the size including the null byte is
<= MAX_ARG_STRLEN, i.e., iff strlen(arg) + 1 <= MAX_ARG_STRLEN, or
strlen(arg) < MAX_ARG_STRLEN.

Matthew House

2023-10-11 15:51:09

by Alejandro Colomar

[permalink] [raw]
Subject: Re: [PATCH] execve.2: execve also returns E2BIG if a string is too long

Hi Matthew,

On Wed, Oct 11, 2023 at 11:11:24AM -0400, Matthew House wrote:
> On Wed, Oct 11, 2023 at 10:47 AM Alejandro Colomar <[email protected]> wrote:
> > On Wed, Oct 11, 2023 at 09:44:29AM -0400, Matthew House wrote:
> > > To expand on this, there are basically two separate byte limits in
> > > fs/exec.c, one for each individual argv/envp string, and another for all
> > > strings and all pointers to them as a whole. To put the whole thing in
> > > pseudocode, the checks work effectively like this, assuming I haven't made
> > > any errors:
> > >
> > > int argc, envc;
> > > unsigned long bytes, limit;
> > >
> > > /* assume that argv has already been adjusted to add an empty argv[0] */
> > > argc = 0, envc = 0, bytes = 0;
> > > for (char **a = argv; *a != NULL; a++, argc++) {
> > > if (strlen(*a) >= MAX_ARG_STRLEN)
> >
> > Are you sure this is >= and not > ?
>
> Yes. In general, the kernel's string limits tend to include the trailing
> null byte. There are two places where this limit is enforced, one for the
> pathname (or full pathname for execveat) and the other for the argv/envp
> strings. The pathname is handled by copy_string_kernel():
>
> int len = strnlen(arg, MAX_ARG_STRLEN) + 1 /* terminating NUL */;
>
> if (len == 0)
> return -EFAULT;
> if (!valid_arg_len(bprm, len))
> return -E2BIG;
>
> where valid_arg_len(bprm, len) is just (len <= MAX_ARG_STRLEN). Here,
> strnlen() has the same behavior as the ordinary libc strnlen(3),
> effectively returning min(strlen(arg), MAX_ARG_STRLEN). Thus, the check
> succeeds iff strlen(arg) + 1 <= MAX_ARG_STRLEN, or equivalently, iff
> strlen(arg) < MAX_ARG_STRLEN.
>
> Next, each of the environment and argument strings is handled by
> copy_strings():
>
> len = strnlen_user(str, MAX_ARG_STRLEN);
> if (!len)
> goto out;
>
> ret = -E2BIG;
> if (!valid_arg_len(bprm, len))
> goto out;
>
> The strnlen_user() function, per its documentation, is explicitly inclusive
> of the trailing null byte:
>
> * Returns the size of the string INCLUDING the terminating NUL.
> * If the string is too long, returns a number larger than @count. User
> * has to check the return value against "> count".
> * On exception (or invalid count), returns 0.
>
> Thus, the check succeeds iff the size including the null byte is
> <= MAX_ARG_STRLEN, i.e., iff strlen(arg) + 1 <= MAX_ARG_STRLEN, or
> strlen(arg) < MAX_ARG_STRLEN.

Thanks! It's a bit confusing to see the terms 'len' and '_STRLEN'
meaning length+1, but it makes sense now.

Cheers,
Alex

>
> Matthew House

--
<https://www.alejandro-colomar.es/>


Attachments:
(No filename) (2.61 kB)
signature.asc (849.00 B)
Download all attachments