2021-01-21 22:30:25

by Cyrill Gorcunov

Subject: [PATCH] prctl: allow to setup brk for et_dyn executables

Keno Fischer reported that when a binary is loaded via
ld-linux-x, prctl(PR_SET_MM_MAP) doesn't allow setting up
the brk value because it lies before mm:end_data.

For example a test program shows

| # ~/t
|
| start_code 401000
| end_code 401a15
| start_stack 7ffce4577dd0
| start_data 403e10
| end_data 40408c
| start_brk b5b000
| sbrk(0) b5b000

and when executed via ld-linux

| # /lib64/ld-linux-x86-64.so.2 ~/t
|
| start_code 7fc25b0a4000
| end_code 7fc25b0c4524
| start_stack 7fffcc6b2400
| start_data 7fc25b0ce4c0
| end_data 7fc25b0cff98
| start_brk 55555710c000
| sbrk(0) 55555710c000

This of course prevents criu from restoring such programs.
Looking into how the kernel operates with brk/start_brk inside
the brk() syscall, I don't see any problem if we allow setting up
brk/start_brk without checking for end_data. Even if someone
passes some weird address here on purpose, the worst
possible result will be an unexpected unmapping of an existing
vma (own vma, since prctl works with the caller's memory), but
the test for RLIMIT_DATA is still valid and a user won't be able
to gain more memory in case of expanding VMAs via new values
shipped with the prctl call.

Reported-by: Keno Fischer <[email protected]>
Signed-off-by: Cyrill Gorcunov <[email protected]>
CC: Andrew Morton <[email protected]>
CC: Dmitry Safonov <[email protected]>
CC: Andrey Vagin <[email protected]>
CC: Kirill Tkhai <[email protected]>
CC: Eric W. Biederman <[email protected]>
---
Guys, take a look please once time permits. Hopefully I didn't
miss something 'cause I made this patch via code reading only.

Andrey, do we still have a criu container which tests new kernels?
It would be great to run criu tests with this patch applied
to make sure everything is intact.

kernel/sys.c | 7 -------
1 file changed, 7 deletions(-)

Index: linux-tip.git/kernel/sys.c
===================================================================
--- linux-tip.git.orig/kernel/sys.c
+++ linux-tip.git/kernel/sys.c
@@ -1943,13 +1943,6 @@ static int validate_prctl_map_addr(struc
 	error = -EINVAL;

 	/*
-	 * @brk should be after @end_data in traditional maps.
-	 */
-	if (prctl_map->start_brk <= prctl_map->end_data ||
-	    prctl_map->brk <= prctl_map->end_data)
-		goto out;
-
-	/*
 	 * Neither we should allow to override limits if they set.
 	 */
 	if (check_data_rlimit(rlimit(RLIMIT_DATA), prctl_map->brk,
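
The effect of the removed condition on the two layouts from the patch description can be sketched in userspace roughly as follows. This is only an illustration under stated assumptions, not kernel code: struct sketch_map, sketch_old_brk_check_rejects() and the two sample variables are hypothetical names introduced here, and the addresses are copied from the example output in the description (64-bit unsigned long assumed).

```c
#include <stdbool.h>

/* Hypothetical, cut-down stand-in for the fields of struct prctl_mm_map
 * that the removed test looked at. */
struct sketch_map {
	unsigned long start_data;
	unsigned long end_data;
	unsigned long start_brk;
	unsigned long brk;
};

/* Mirrors the condition deleted by the patch.
 * Returning true corresponds to the "goto out" path (-EINVAL). */
bool sketch_old_brk_check_rejects(const struct sketch_map *m)
{
	return m->start_brk <= m->end_data || m->brk <= m->end_data;
}

/* Layout printed by "~/t" (traditional exec): brk sits above end_data. */
const struct sketch_map sketch_direct_exec = {
	.start_data = 0x403e10UL,
	.end_data   = 0x40408cUL,
	.start_brk  = 0xb5b000UL,
	.brk        = 0xb5b000UL,
};

/* Layout printed by "/lib64/ld-linux-x86-64.so.2 ~/t": brk sits far
 * below end_data, so the old condition fires. */
const struct sketch_map sketch_loader_exec = {
	.start_data = 0x7fc25b0ce4c0UL,
	.end_data   = 0x7fc25b0cff98UL,
	.start_brk  = 0x55555710c000UL,
	.brk        = 0x55555710c000UL,
};
```

Calling sketch_old_brk_check_rejects(&sketch_direct_exec) yields false, while sketch_old_brk_check_rejects(&sketch_loader_exec) yields true, which matches the reported failure when restoring a program started through the loader.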


2021-07-20 07:41:00

by Andrei Vagin

Subject: Re: [PATCH] prctl: allow to setup brk for et_dyn executables

On Thu, Jan 21, 2021 at 2:12 PM Cyrill Gorcunov <[email protected]> wrote:
>
> Keno Fischer reported that when a binary is loaded via
> ld-linux-x, prctl(PR_SET_MM_MAP) doesn't allow setting up
> the brk value because it lies before mm:end_data.
>
> For example a test program shows
>
> | # ~/t
> |
> | start_code 401000
> | end_code 401a15
> | start_stack 7ffce4577dd0
> | start_data 403e10
> | end_data 40408c
> | start_brk b5b000
> | sbrk(0) b5b000
>
> and when executed via ld-linux
>
> | # /lib64/ld-linux-x86-64.so.2 ~/t
> |
> | start_code 7fc25b0a4000
> | end_code 7fc25b0c4524
> | start_stack 7fffcc6b2400
> | start_data 7fc25b0ce4c0
> | end_data 7fc25b0cff98
> | start_brk 55555710c000
> | sbrk(0) 55555710c000
>
> This of course prevents criu from restoring such programs.
> Looking into how the kernel operates with brk/start_brk inside
> the brk() syscall, I don't see any problem if we allow setting up
> brk/start_brk without checking for end_data. Even if someone
> passes some weird address here on purpose, the worst
> possible result will be an unexpected unmapping of an existing
> vma (own vma, since prctl works with the caller's memory), but
> the test for RLIMIT_DATA is still valid and a user won't be able
> to gain more memory in case of expanding VMAs via new values
> shipped with the prctl call.
>
> Reported-by: Keno Fischer <[email protected]>
> Signed-off-by: Cyrill Gorcunov <[email protected]>
> CC: Andrew Morton <[email protected]>
> CC: Dmitry Safonov <[email protected]>
> CC: Andrey Vagin <[email protected]>

Acked-by: Andrey Vagin <[email protected]>
Fixes: bbdc6076d2e5 ("binfmt_elf: move brk out of mmap when doing
direct loader exec")

> CC: Kirill Tkhai <[email protected]>
> CC: Eric W. Biederman <[email protected]>
> ---
> Guys, take a look please once time permits. Hopefully I didn't
> miss something 'cause I made this patch via code reading only.
>
> Andrey, do we still have a criu container which tests new kernels?
> It would be great to run criu tests with this patch applied
> to make sure everything is intact.

Sorry for the delay. I ran the tests and everything works as expected.

Thanks,
Andrei

2021-07-20 14:16:31

by Cyrill Gorcunov

Subject: Re: [PATCH] prctl: allow to setup brk for et_dyn executables

On Tue, Jul 20, 2021 at 12:33:11AM -0700, Andrei Vagin wrote:
> >
> > Reported-by: Keno Fischer <[email protected]>
> > Signed-off-by: Cyrill Gorcunov <[email protected]>
> > CC: Andrew Morton <[email protected]>
> > CC: Dmitry Safonov <[email protected]>
> > CC: Andrey Vagin <[email protected]>
>
> Acked-by: Andrey Vagin <[email protected]>
> Fixes: bbdc6076d2e5 ("binfmt_elf: move brk out of mmap when doing
> direct loader exec")

Thanks for the review, Andrei! I revisited this patch again recently and
indeed we still need it.

2021-07-20 21:54:17

by Andrew Morton

Subject: Re: [PATCH] prctl: allow to setup brk for et_dyn executables

On Fri, 22 Jan 2021 01:12:07 +0300 Cyrill Gorcunov <[email protected]> wrote:

> Keno Fischer reported that when a binary is loaded via
> ld-linux-x, prctl(PR_SET_MM_MAP) doesn't allow setting up
> the brk value because it lies before mm:end_data.
>
> For example a test program shows
>
> | # ~/t
> |
> | start_code 401000
> | end_code 401a15
> | start_stack 7ffce4577dd0
> | start_data 403e10
> | end_data 40408c
> | start_brk b5b000
> | sbrk(0) b5b000
>
> and when executed via ld-linux
>
> | # /lib64/ld-linux-x86-64.so.2 ~/t
> |
> | start_code 7fc25b0a4000
> | end_code 7fc25b0c4524
> | start_stack 7fffcc6b2400
> | start_data 7fc25b0ce4c0
> | end_data 7fc25b0cff98
> | start_brk 55555710c000
> | sbrk(0) 55555710c000
>
> This of course prevents criu from restoring such programs.
> Looking into how the kernel operates with brk/start_brk inside
> the brk() syscall, I don't see any problem if we allow setting up
> brk/start_brk without checking for end_data. Even if someone
> passes some weird address here on purpose, the worst
> possible result will be an unexpected unmapping of an existing
> vma (own vma, since prctl works with the caller's memory), but
> the test for RLIMIT_DATA is still valid and a user won't be able
> to gain more memory in case of expanding VMAs via new values
> shipped with the prctl call.

So... do you recall why you added that test originally?

This is under prctl(CAP_SET_MM), yes? What capabilities does this
require?

2021-07-20 22:28:09

by Cyrill Gorcunov

Subject: Re: [PATCH] prctl: allow to setup brk for et_dyn executables

On Tue, Jul 20, 2021 at 02:51:48PM -0700, Andrew Morton wrote:
> >
> > This of course prevents criu from restoring such programs.
> > Looking into how the kernel operates with brk/start_brk inside
> > the brk() syscall, I don't see any problem if we allow setting up
> > brk/start_brk without checking for end_data. Even if someone
> > passes some weird address here on purpose, the worst
> > possible result will be an unexpected unmapping of an existing
> > vma (own vma, since prctl works with the caller's memory), but
> > the test for RLIMIT_DATA is still valid and a user won't be able
> > to gain more memory in case of expanding VMAs via new values
> > shipped with the prctl call.
>
> So... do you recall why you added that test originally?

To be honest, when I added this test in the first place I simply forgot
about et_dyn executables, because we usually run executables via the
traditional exec call (where the brk map sits after the end_data VMA),
not via the loader. That's the reason why I didn't hit this problem
before and why it was revealed only after a couple of years.
This is simply rarely used.

>
> This is under prctl(CAP_SET_MM), yes? What capabilities does this
> require?

Yes, it is for prctl(PR_SET_MM_MAP) and requires no additional
caps. The most important thing here is the check_data_rlimit() function,
which is called at the end of memory map verification -- we make sure
the user won't get more memory than has been granted by the RLIMIT_DATA
limit even if he passes some bad brk value here on purpose.

	/*
	 * Neither we should allow to override limits if they set.
	 */
	if (check_data_rlimit(rlimit(RLIMIT_DATA), prctl_map->brk,
			      prctl_map->start_brk, prctl_map->end_data,
			      prctl_map->start_data))
		goto out;

which expands to (I wrapped code to make it a bit more readable)

	static inline int check_data_rlimit(unsigned long rlim,
					    unsigned long new,
					    unsigned long start,
					    unsigned long end_data,
					    unsigned long start_data)
	{
		/* arguments from the call above substituted for the parameters */
		if (rlimit(RLIMIT_DATA) < RLIM_INFINITY) {
			if (((prctl_map->brk - prctl_map->start_brk) +
			     (prctl_map->end_data - prctl_map->start_data)) > rlimit(RLIMIT_DATA))
				return -ENOSPC;
		}

		return 0;
	}
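
To make the arithmetic concrete, here is a minimal userspace sketch of the same check written against its parameters. It is an illustration only, under stated assumptions: sketch_check_data_rlimit() and SKETCH_RLIM_INFINITY are hypothetical names standing in for the kernel's check_data_rlimit() and RLIM_INFINITY, and a 64-bit unsigned long is assumed.

```c
#include <errno.h>
#include <limits.h>

/* Assumption: stand-in for the kernel's RLIM_INFINITY in this sketch. */
#define SKETCH_RLIM_INFINITY ULONG_MAX

/* Heap size (new_brk - start_brk) plus data segment size
 * (end_data - start_data) must not exceed the limit, unless the
 * limit is "infinity". */
int sketch_check_data_rlimit(unsigned long rlim,
			     unsigned long new_brk,
			     unsigned long start_brk,
			     unsigned long end_data,
			     unsigned long start_data)
{
	if (rlim < SKETCH_RLIM_INFINITY) {
		if (((new_brk - start_brk) + (end_data - start_data)) > rlim)
			return -ENOSPC;
	}

	return 0;
}
```

With the loader-exec values from the patch description (start_brk == brk == 0x55555710c000, start_data 0x7fc25b0ce4c0, end_data 0x7fc25b0cff98) the heap contributes 0 bytes and the data segment 0x1ad8 bytes, so a limit of 8192 passes and a limit of 4096 gives -ENOSPC. The result depends only on the two differences, not on whether brk lies above or below end_data.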