Subject: Inconsistent capability requirements for prctl_set_mm_exe_file()

Hello Nicolas, Cyrill, and others,

@Nicolas, your commit ebd6de6812387a changed the capability
requirements for the prctl_set_mm_exe_file() operation from

ns_capable(CAP_SYS_ADMIN)

to

ns_capable(CAP_SYS_ADMIN) || ns_capable(CAP_CHECKPOINT_RESTORE).

That's fine I guess, but while looking at that change, I found
an anomaly.

The same prctl_set_mm_exe_file() functionality is also available
via the prctl() PR_SET_MM_EXE_FILE operation, which was added
by Cyrill's commit b32dfe377102ce668. However, there the
prctl_set_mm_exe_file() operation is guarded by a check

capable(CAP_SYS_RESOURCE).

There are two things I note:

* The capability requirements are different in the two cases.
* In one case the checks are with ns_capable(), while in the
other case the check is with capable().

In both cases, the inconsistencies predate Nicolas's patch,
and appear to have been introduced in Kirill Tkhai's commit
4d28df6152aa3ff.

I'm not sure what is right, but those inconsistencies seem
seem odd, and presumably unintended. Similarly, I'm not
sure what fix, if any, should be applied. However, I thought
it worth mentioning these details, since the situation is odd
and surprising.

Thanks,

Michael

--
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/


2020-10-28 07:39:41

by Jann Horn

[permalink] [raw]
Subject: Re: Inconsistent capability requirements for prctl_set_mm_exe_file()

On Tue, Oct 27, 2020 at 1:11 PM Michael Kerrisk (man-pages)
<[email protected]> wrote:
> @Nicolas, your commit ebd6de6812387a changed the capability
> requirements for the prctl_set_mm_exe_file() operation from
>
> ns_capable(CAP_SYS_ADMIN)
>
> to
>
> ns_capable(CAP_SYS_ADMIN) || ns_capable(CAP_CHECKPOINT_RESTORE).
>
> That's fine I guess, but while looking at that change, I found
> an anomaly.
>
> The same prctl_set_mm_exe_file() functionality is also available
> via the prctl() PR_SET_MM_EXE_FILE operation, which was added
> by Cyrill's commit b32dfe377102ce668. However, there the
> prctl_set_mm_exe_file() operation is guarded by a check
>
> capable(CAP_SYS_RESOURCE).
>
> There are two things I note:
>
> * The capability requirements are different in the two cases.
> * In one case the checks are with ns_capable(), while in the
> other case the check is with capable().
>
> In both cases, the inconsistencies predate Nicolas's patch,
> and appear to have been introduced in Kirill Tkhai's commit
> 4d28df6152aa3ff.
>
> I'm not sure what is right, but those inconsistencies seem
> seem odd, and presumably unintended. Similarly, I'm not
> sure what fix, if any, should be applied. However, I thought
> it worth mentioning these details, since the situation is odd
> and surprising.

FWIW, as a bit of context here: I believe that these checks are more
driven by "what capabilitiies do we think a typical caller will have"
than by a proper security design of "what capabilities do we have to
require to establish certain security guarantees". As people have
noted elsewhere, on a system without LSMs, a process can point
/proc/self/exe to almost any executable file of its choice anyway (by
executing that file and then replacing the executable code of the
resulting process).

The properly engineered solution would probably be to let LSMs hook
these APIs (if they care) and then remove the capable()/ns_capable()
checks.

2020-10-28 07:40:15

by Cyrill Gorcunov

[permalink] [raw]
Subject: Re: Inconsistent capability requirements for prctl_set_mm_exe_file()

On Tue, Oct 27, 2020 at 01:11:40PM +0100, Michael Kerrisk (man-pages) wrote:
> Hello Nicolas, Cyrill, and others,
>
> @Nicolas, your commit ebd6de6812387a changed the capability
> requirements for the prctl_set_mm_exe_file() operation from
>
> ns_capable(CAP_SYS_ADMIN)
>
> to
>
> ns_capable(CAP_SYS_ADMIN) || ns_capable(CAP_CHECKPOINT_RESTORE).
>
> That's fine I guess, but while looking at that change, I found
> an anomaly.
>
> The same prctl_set_mm_exe_file() functionality is also available
> via the prctl() PR_SET_MM_EXE_FILE operation, which was added
> by Cyrill's commit b32dfe377102ce668. However, there the
> prctl_set_mm_exe_file() operation is guarded by a check
>
> capable(CAP_SYS_RESOURCE).
>
> There are two things I note:
>
> * The capability requirements are different in the two cases.
> * In one case the checks are with ns_capable(), while in the
> other case the check is with capable().
>
> In both cases, the inconsistencies predate Nicolas's patch,
> and appear to have been introduced in Kirill Tkhai's commit
> 4d28df6152aa3ff.
>
> I'm not sure what is right, but those inconsistencies seem
> seem odd, and presumably unintended. Similarly, I'm not
> sure what fix, if any, should be applied. However, I thought
> it worth mentioning these details, since the situation is odd
> and surprising.

Hi Michael! This is more likely due to historical reasons:
the initial version of prctl(PR_SET_MM, ...) been operating
with individual fields and this was very unsafe. Because of
this we left it under CAP_SYS_RESOURCE (because you must have
enough rights to change such deep fields). Later we switched
to PR_SET_MM_MAP which is a safe version and allows to modify
memory map as a "whole" so we can do a precise check. And this
allowed us to relax requirements.

As to me the old PR_SET_MM should be deprecated and finally
removed from the kernel, but since it is a part of API we
can't do such thing easily.

Same time current PR_SET_MM internally is rather an alias
for PR_SET_MM_MAP because we create a temporary map and
pass it to the verification procedure so it looks like
we can relax requirements here to match the PR_SET_MM_MAP
call. But need to think maybe I miss something obvious here.

2020-10-28 20:57:26

by Cyrill Gorcunov

[permalink] [raw]
Subject: Re: Inconsistent capability requirements for prctl_set_mm_exe_file()

On Tue, Oct 27, 2020 at 08:22:11PM +0300, Kirill Tkhai wrote:
> 1)Before my commit there also were different checks
>
> !capable(CAP_SYS_RESOURCE))
> and
> uid_eq(cred->uid, make_kuid(ns, 0)) && gid_eq(cred->gid, make_kgid(ns, 0))
>
> so it is not the initial reason. The commit even decreased the checks difference
> and it made both the checks are about capability().
>
> 2)As I understand new PR_SET_MM_MAP interface differs in the way, that it allows to batch
> a setup of prctl_mm_map parameters. So, instead of 14 prlctl calls with different arguments:
> PR_SET_MM_START_CODE, PR_SET_MM_END_CODE, PR_SET_MM_START_DATA, .., PR_SET_MM_ENV_END,
> we set then all at once and the performance is better.
>
> The second advantage is that the new interface is more comfortable in case of we set all
> of 14 parameters. Old interface requires special order of calls: sometimes you have to
> set PR_SET_MM_START_CODE first and then PR_SET_MM_END_CODE second, some times it is vice
> versa. Otherwise __prctl_check_order() in validate_prctl_map() will fail.

Since I've been the person who introduced the former PR_SET_MM_X interface I already
explained that the PR_SET_MM_X is simply shitty and better be vanished and forgotten.
Which we simply can't do :/ In turn PR_SET_MM_MAP is the only right way to verify
all fields in a one pass not only because of speed but otherwise the validation
of parameters is not even associative and in result (as you mentioned) _order_ of
calls does matter for start_code/end_code.

> 3)For me it looks like any combinations of parameters acceptable to be set by both interfaces
> are the same (in case of we don't see on permissions checks). In case of we can assign a set of
> parameters {A} using old interface, we can assign it from new interface and vice versa.
> Isn't this so?! If so, we should use the same permissions check.

Yup, if only I'm not missing something obvious we could drop cap(sys-resource) here
because internally after this check we jump into traditional verification procedure
where the map is filled from runtime data.

I'm quite sceptic that LSM hook Jann mentioned gonna be better. It might be an
addition but not the replacement. Moreover if we add it here someone from CRIU
camp have to verify that it won't break current CRIU tests.