LinuxLists.cc - [PATCH v2 0/5] Add support for O

2019-09-07 04:11:12

Subject: [PATCH v2 0/5] Add support for O_MAYEXEC

Hi,

The goal of this patch series is to control script interpretation. A
new O_MAYEXEC flag used by sys_open() is added to enable userspace
script interpreter to delegate to the kernel (and thus the system
security policy) the permission to interpret/execute scripts or other
files containing what can be seen as commands.

This second series mainly differ from the previous one [1] by moving the
basic security policy from Yama to the filesystem subsystem. This
policy can be enforced by the system administrator through a sysctl
configuration consistent with the mount points.

Furthermore, the security policy can also be delegated to an LSM, either
a MAC system or an integrity system. For instance, the new kernel
MAY_OPENEXEC flag closes a major IMA measurement/appraisal interpreter
integrity gap by bringing the ability to check the use of scripts [2].
Other uses are expected, such as for openat2(2) [3], SGX integration
[4], and bpffs [5].

Userspace need to adapt to take advantage of this new feature. For
example, the PEP 578 [6] (Runtime Audit Hooks) enables Python 3.8 to be
extended with policy enforcement points related to code interpretation,
which can be used to align with the PowerShell audit features.
Additional Python security improvements (e.g. a limited interpreter
withou -c, stdin piping of code) are on their way.

The initial idea come from CLIP OS and the original implementation has
been used for more than 10 years:
https://github.com/clipos-archive/clipos4_doc

An introduction to O_MAYEXEC was given at the Linux Security Summit
Europe 2018 - Linux Kernel Security Contributions by ANSSI:
https://www.youtube.com/watch?v=chNjCRtPKQY&t=17m15s
The "write xor execute" principle was explained at Kernel Recipes 2018 -
CLIP OS: a defense-in-depth OS:
https://www.youtube.com/watch?v=PjRE0uBtkHU&t=11m14s

This patch series can be applied on top of v5.3-rc7. This can be tested
with CONFIG_SYSCTL. I would really appreciate constructive comments on
this patch series.

# Changes since v1

* move code from Yama to the FS subsystem
* set __FMODE_EXEC when using O_MAYEXEC to make this information
available through the new fanotify/FAN_OPEN_EXEC event
* only match regular files (not directories nor other types), which
follows the same semantic as commit 73601ea5b7b1 ("fs/open.c: allow
opening only regular files during execve()")
* improve tests

[1] https://lore.kernel.org/lkml/[email protected]/
[2] https://lore.kernel.org/lkml/[email protected]/
[3] https://lore.kernel.org/lkml/[email protected]/
[4] https://lore.kernel.org/lkml/CALCETrVovr8XNZSroey7pHF46O=kj_c5D9K8h=z2T_cNrpvMig@mail.gmail.com/
[5] https://lore.kernel.org/lkml/CALCETrVeZ0eufFXwfhtaG_j+AdvbzEWE0M3wjXMWVEO7pj+xkw@mail.gmail.com/
[6] https://www.python.org/dev/peps/pep-0578/

Regards,

Mickaël Salaün (5):
fs: Add support for an O_MAYEXEC flag on sys_open()
fs: Add a MAY_EXECMOUNT flag to infer the noexec mount propertie
fs: Enable to enforce noexec mounts or file exec through O_MAYEXEC
selftest/exec: Add tests for O_MAYEXEC enforcing
doc: Add documentation for the fs.open_mayexec_enforce sysctl

Documentation/admin-guide/sysctl/fs.rst | 43 +++
fs/fcntl.c | 2 +-
fs/namei.c | 70 +++++
fs/open.c | 6 +
include/linux/fcntl.h | 2 +-
include/linux/fs.h | 7 +
include/uapi/asm-generic/fcntl.h | 3 +
kernel/sysctl.c | 7 +
tools/testing/selftests/exec/.gitignore | 1 +
tools/testing/selftests/exec/Makefile | 4 +-
tools/testing/selftests/exec/omayexec.c | 317 ++++++++++++++++++++
tools/testing/selftests/kselftest_harness.h | 3 +
12 files changed, 462 insertions(+), 3 deletions(-)
create mode 100644 tools/testing/selftests/exec/omayexec.c

--
2.23.0

2019-09-07 04:46:33

by Mickaël Salaün

[permalink] [raw]

Subject: [PATCH v2 3/5] fs: Enable to enforce noexec mounts or file exec through O_MAYEXEC

Enable to either propagate the mount options from the underlying VFS
mount to prevent execution, or to propagate the file execute permission.
This may allow a script interpreter to check execution permissions
before reading commands from a file.

The main goal is to be able to protect the kernel by restricting
arbitrary syscalls that an attacker could perform with a crafted binary
or certain script languages. It also improves multilevel isolation
by reducing the ability of an attacker to use side channels with
specific code. These restrictions can natively be enforced for ELF
binaries (with the noexec mount option) but require this kernel
extension to properly handle scripts (e.g., Python, Perl).

Add a new sysctl fs.open_mayexec_enforce to control this behavior. A
following patch adds documentation.

Changes since v1:
* move code from Yama to the FS subsystem (suggested by Kees Cook)
* make omayexec_inode_permission() static (suggested by Jann Horn)
* use mode 0600 for the sysctl
* only match regular files (not directories nor other types), which
follows the same semantic as commit 73601ea5b7b1 ("fs/open.c: allow
opening only regular files during execve()")

Signed-off-by: Mickaël Salaün <[email protected]>
Reviewed-by: Philippe Trébuchet <[email protected]>
Reviewed-by: Thibaut Sautereau <[email protected]>
Cc: Al Viro <[email protected]>
Cc: Kees Cook <[email protected]>
Cc: Mickaël Salaün <[email protected]>
---
fs/namei.c | 68 ++++++++++++++++++++++++++++++++++++++++++++++
include/linux/fs.h | 3 ++
kernel/sysctl.c | 7 +++++
3 files changed, 78 insertions(+)

diff --git a/fs/namei.c b/fs/namei.c
index 0a6b9483d0cb..abd29a76ecef 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -39,6 +39,7 @@
#include <linux/bitops.h>
#include <linux/init_task.h>
#include <linux/uaccess.h>
+#include <linux/sysctl.h>

#include "internal.h"
#include "mount.h"
@@ -411,6 +412,34 @@ static int sb_permission(struct super_block *sb, struct inode *inode, int mask)
return 0;
}

+#define OMAYEXEC_ENFORCE_NONE 0
+#define OMAYEXEC_ENFORCE_MOUNT (1 << 0)
+#define OMAYEXEC_ENFORCE_FILE (1 << 1)
+#define _OMAYEXEC_LAST OMAYEXEC_ENFORCE_FILE
+#define _OMAYEXEC_MASK ((_OMAYEXEC_LAST << 1) - 1)
+
+/**
+ * omayexec_inode_permission - check O_MAYEXEC before accessing an inode
+ * @inode: inode structure to check
+ * @mask: permission mask
+ *
+ * Return 0 if access is permitted, -EACCES otherwise.
+ */
+static int omayexec_inode_permission(struct inode *inode, int mask)
+{
+ if (!(mask & MAY_OPENEXEC))
+ return 0;
+
+ if ((sysctl_omayexec_enforce & OMAYEXEC_ENFORCE_MOUNT) &&
+ !(mask & MAY_EXECMOUNT))
+ return -EACCES;
+
+ if (sysctl_omayexec_enforce & OMAYEXEC_ENFORCE_FILE)
+ return generic_permission(inode, MAY_EXEC);
+
+ return 0;
+}
+
/**
* inode_permission - Check for access rights to a given inode
* @inode: Inode to check permission on
@@ -454,10 +483,48 @@ int inode_permission(struct inode *inode, int mask)
if (retval)
return retval;

+ retval = omayexec_inode_permission(inode, mask);
+ if (retval)
+ return retval;
+
return security_inode_permission(inode, mask);
}
EXPORT_SYMBOL(inode_permission);

+/*
+ * Handle open_mayexec_enforce sysctl
+ */
+#ifdef CONFIG_SYSCTL
+int proc_omayexec(struct ctl_table *table, int write, void __user *buffer,
+ size_t *lenp, loff_t *ppos)
+{
+ int error;
+
+ if (write) {
+ struct ctl_table table_copy;
+ int tmp_mayexec_enforce;
+
+ if (!capable(CAP_MAC_ADMIN))
+ return -EPERM;
+ tmp_mayexec_enforce = *((int *)table->data);
+ table_copy = *table;
+ /* do not erase sysctl_omayexec_enforce */
+ table_copy.data = &tmp_mayexec_enforce;
+ error = proc_dointvec(&table_copy, write, buffer, lenp, ppos);
+ if (error)
+ return error;
+ if ((tmp_mayexec_enforce | _OMAYEXEC_MASK) != _OMAYEXEC_MASK)
+ return -EINVAL;
+ *((int *)table->data) = tmp_mayexec_enforce;
+ } else {
+ error = proc_dointvec(table, write, buffer, lenp, ppos);
+ if (error)
+ return error;
+ }
+ return 0;
+}
+#endif
+
/**
* path_get - get a reference to a path
* @path: path to get the reference to
@@ -887,6 +954,7 @@ int sysctl_protected_symlinks __read_mostly = 0;
int sysctl_protected_hardlinks __read_mostly = 0;
int sysctl_protected_fifos __read_mostly;
int sysctl_protected_regular __read_mostly;
+int sysctl_omayexec_enforce __read_mostly = OMAYEXEC_ENFORCE_NONE;

/**
* may_follow_link - Check symlink following for unsafe situations
diff --git a/include/linux/fs.h b/include/linux/fs.h
index e57609dac8dd..735f5950cfed 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -81,6 +81,7 @@ extern int sysctl_protected_symlinks;
extern int sysctl_protected_hardlinks;
extern int sysctl_protected_fifos;
extern int sysctl_protected_regular;
+extern int sysctl_omayexec_enforce;

typedef __kernel_rwf_t rwf_t;

@@ -3452,6 +3453,8 @@ int proc_nr_dentry(struct ctl_table *table, int write,
void __user *buffer, size_t *lenp, loff_t *ppos);
int proc_nr_inodes(struct ctl_table *table, int write,
void __user *buffer, size_t *lenp, loff_t *ppos);
+int proc_omayexec(struct ctl_table *table, int write, void __user *buffer,
+ size_t *lenp, loff_t *ppos);
int __init get_filesystem_list(char *buf);

#define __FMODE_EXEC ((__force int) FMODE_EXEC)
diff --git a/kernel/sysctl.c b/kernel/sysctl.c
index 078950d9605b..eaaeb229a828 100644
--- a/kernel/sysctl.c
+++ b/kernel/sysctl.c
@@ -1911,6 +1911,13 @@ static struct ctl_table fs_table[] = {
.extra1 = SYSCTL_ZERO,
.extra2 = &two,
},
+ {
+ .procname = "open_mayexec_enforce",
+ .data = &sysctl_omayexec_enforce,
+ .maxlen = sizeof(int),
+ .mode = 0600,
+ .proc_handler = proc_omayexec,
+ },
#if defined(CONFIG_BINFMT_MISC) || defined(CONFIG_BINFMT_MISC_MODULE)
{
.procname = "binfmt_misc",
--
2.23.0

2019-09-07 17:06:11

by Steve Grubb

[permalink] [raw]

Subject: Re: [PATCH v2 0/5] Add support for O_MAYEXEC

On Friday, September 6, 2019 11:24:50 AM EDT Micka?l Sala?n wrote:
> The goal of this patch series is to control script interpretation. A
> new O_MAYEXEC flag used by sys_open() is added to enable userspace
> script interpreter to delegate to the kernel (and thus the system
> security policy) the permission to interpret/execute scripts or other
> files containing what can be seen as commands.

The problem is that this is only a gentleman's handshake. If I don't tell the
kernel that what I'm opening is tantamount to executing it, then the security
feature is never invoked. It is simple to strip the flags off of any system
call without needing privileges. For example:

#define _GNU_SOURCE
#include <link.h>
#include <fcntl.h>
#include <string.h>

unsigned int
la_version(unsigned int version)
{
return version;
}

unsigned int
la_objopen(struct link_map *map, Lmid_t lmid, uintptr_t *cookie)
{
return LA_FLG_BINDTO | LA_FLG_BINDFROM;
}

typedef int (*openat_t) (int dirfd, const char *pathname, int flags, mode_t mode);
static openat_t real_openat = 0L;
int my_openat(int dirfd, const char *pathname, int flags, mode_t mode)
{
flags &= ~O_CLOEXEC;
return real_openat(dirfd, pathname, flags, mode);
}

uintptr_t
la_symbind64(Elf64_Sym *sym, unsigned int ndx, uintptr_t *refcook,
uintptr_t *defcook, unsigned int *flags, const char *symname)
{
if (real_openat == 0L && strcmp(symname, "openat") == 0) {
real_openat = (openat_t) sym->st_value;
return (uintptr_t) my_openat;
}
return sym->st_value;
}

gcc -c -g -Wno-unused-parameter -W -Wall -Wundef -O2 -Wp,-D_GLIBCXX_ASSERTIONS -fexceptions -fPIC test.c
gcc -o strip-flags.so.0 -shared -Wl,-soname,strip-flags.so.0 -ldl test.o

Now, let's make a test program:

#include <stdio.h>
#include <dirent.h>
#include <fcntl.h>
#include <unistd.h>

int main(void)
{
int dir_fd, fd;
DIR *d = opendir("/etc");
dir_fd = dirfd(d);
fd = openat(dir_fd, "passwd", O_RDONLY|O_CLOEXEC);
close (fd);
closedir(d);
return 0;
}

gcc -g -W -Wall -Wundef test.c -o test

OK, let's see what happens.
$ strace ./test 2>&1 | grep passwd
openat(3, "passwd", O_RDONLY|O_CLOEXEC) = 4

Now with LD_AUDIT
$ LD_AUDIT=/home/sgrubb/test/openflags/strip-flags.so.0 strace ./test 2>&1 | grep passwd
openat(3, "passwd", O_RDONLY) = 4

No O_CLOEXEC flag.

-Steve

2019-09-07 17:18:08

by Steve Grubb

[permalink] [raw]

Subject: Re: [PATCH v2 0/5] Add support for O_MAYEXEC

On Friday, September 6, 2019 2:57:00 PM EDT Florian Weimer wrote:
> * Steve Grubb:
> > Now with LD_AUDIT
> > $ LD_AUDIT=/home/sgrubb/test/openflags/strip-flags.so.0 strace ./test
> > 2>&1 | grep passwd openat(3, "passwd", O_RDONLY) = 4
> >
> > No O_CLOEXEC flag.
>
> I think you need to explain in detail why you consider this a problem.

Because you can strip the O_MAYEXEC flag from being passed into the kernel.
Once you do that, you defeat the security mechanism because it never gets
invoked. The issue is that the only thing that knows _why_ something is being
opened is user space. With this mechanism, you can attempt to pass this
reason to the kernel so that it may see if policy permits this. But you can
just remove the flag.

-Steve

> With LD_PRELOAD and LD_AUDIT, you can already do anything, including
> scanning other loaded objects for a system call instruction and jumping
> to that (in case a security module in the kernel performs a PC check to
> confer additional privileges).
>
> Thanks,
> Florian

2019-09-07 20:02:09

by Florian Weimer

[permalink] [raw]

Subject: Re: [PATCH v2 0/5] Add support for O_MAYEXEC

* Steve Grubb:

> Now with LD_AUDIT
> $ LD_AUDIT=/home/sgrubb/test/openflags/strip-flags.so.0 strace ./test 2>&1 | grep passwd
> openat(3, "passwd", O_RDONLY) = 4
>
> No O_CLOEXEC flag.

I think you need to explain in detail why you consider this a problem.

With LD_PRELOAD and LD_AUDIT, you can already do anything, including
scanning other loaded objects for a system call instruction and jumping
to that (in case a security module in the kernel performs a PC check to
confer additional privileges).

Thanks,
Florian

2019-09-07 21:44:25

by Andy Lutomirski

[permalink] [raw]

Subject: Re: [PATCH v2 0/5] Add support for O_MAYEXEC

> On Sep 6, 2019, at 12:07 PM, Steve Grubb <[email protected]> wrote:
>
>> On Friday, September 6, 2019 2:57:00 PM EDT Florian Weimer wrote:
>> * Steve Grubb:
>>> Now with LD_AUDIT
>>> $ LD_AUDIT=/home/sgrubb/test/openflags/strip-flags.so.0 strace ./test
>>> 2>&1 | grep passwd openat(3, "passwd", O_RDONLY) = 4
>>>
>>> No O_CLOEXEC flag.
>>
>> I think you need to explain in detail why you consider this a problem.
>
> Because you can strip the O_MAYEXEC flag from being passed into the kernel.
> Once you do that, you defeat the security mechanism because it never gets
> invoked. The issue is that the only thing that knows _why_ something is being
> opened is user space. With this mechanism, you can attempt to pass this
> reason to the kernel so that it may see if policy permits this. But you can
> just remove the flag.

I’m with Florian here. Once you are executing code in a process, you could just emulate some other unapproved code. This series is not intended to provide the kind of absolute protection you’re imagining.

What the kernel *could* do is prevent mmapping a non-FMODE_EXEC file with PROT_EXEC, which would indeed have a real effect (in an iOS-like world, for example) but would break many, many things.

2019-09-08 07:32:14

by Aleksa Sarai

[permalink] [raw]

Subject: Re: [PATCH v2 0/5] Add support for O_MAYEXEC

On 2019-09-06, Andy Lutomirski <[email protected]> wrote:
> > On Sep 6, 2019, at 12:07 PM, Steve Grubb <[email protected]> wrote:
> >
> >> On Friday, September 6, 2019 2:57:00 PM EDT Florian Weimer wrote:
> >> * Steve Grubb:
> >>> Now with LD_AUDIT
> >>> $ LD_AUDIT=/home/sgrubb/test/openflags/strip-flags.so.0 strace ./test
> >>> 2>&1 | grep passwd openat(3, "passwd", O_RDONLY) = 4
> >>>
> >>> No O_CLOEXEC flag.
> >>
> >> I think you need to explain in detail why you consider this a problem.
> >
> > Because you can strip the O_MAYEXEC flag from being passed into the kernel.
> > Once you do that, you defeat the security mechanism because it never gets
> > invoked. The issue is that the only thing that knows _why_ something is being
> > opened is user space. With this mechanism, you can attempt to pass this
> > reason to the kernel so that it may see if policy permits this. But you can
> > just remove the flag.
>
> I’m with Florian here. Once you are executing code in a process, you
> could just emulate some other unapproved code. This series is not
> intended to provide the kind of absolute protection you’re imagining.

I also agree, though I think that there is a separate argument to be
made that there are two possible problems with O_MAYEXEC (which might
not be really big concerns):

* It's very footgun-prone if you didn't call O_MAYEXEC yourself and
you pass the descriptor elsewhere. You need to check f_flags to see
if it contains O_MAYEXEC. Maybe there is an argument to be made that
passing O_MAYEXECs around isn't a valid use-case, but in that case
there should be some warnings about that.

* There's effectively a TOCTOU flaw (even if you are sure O_MAYEXEC is
in f_flags) -- if the filesystem becomes re-mounted noexec (or the
file has a-x permissions) after you've done the check you won't get
hit with an error when you go to use the file descriptor later.

To fix both you'd need to do what you mention later:

> What the kernel *could* do is prevent mmapping a non-FMODE_EXEC file
> with PROT_EXEC, which would indeed have a real effect (in an iOS-like
> world, for example) but would break many, many things.

And I think this would be useful (with the two possible ways of
executing .text split into FMODE_EXEC and FMODE_MAP_EXEC, as mentioned
in a sister subthread), but would have to be opt-in for the obvious
reason you outlined. However, we could make it the default for
openat2(2) -- assuming we can agree on what the semantics of a
theoretical FMODE_EXEC should be.

And of course we'd need to do FMODE_UPGRADE_EXEC (which would need to
also permit fexecve(2) though probably not PROT_EXEC -- I don't think
you can mmap() an O_PATH descriptor).

--
Aleksa Sarai
Senior Software Engineer (Containers)
SUSE Linux GmbH
<https://www.cyphar.com/>

Attachments:

(No filename) (2.84 kB)
signature.asc (235.00 B)
Download all attachments

2019-09-09 18:17:36

by James Morris

[permalink] [raw]

Subject: Re: [PATCH v2 0/5] Add support for O_MAYEXEC

On Fri, 6 Sep 2019, Micka?l Sala?n wrote:

> Furthermore, the security policy can also be delegated to an LSM, either
> a MAC system or an integrity system. For instance, the new kernel
> MAY_OPENEXEC flag closes a major IMA measurement/appraisal interpreter
> integrity gap by bringing the ability to check the use of scripts [2].

To clarify, scripts are already covered by IMA if they're executed
directly, and the gap is when invoking a script as a parameter to the
interpreter (and for any sourced files). In that case only the interpreter
is measured/appraised, unless there's a rule also covering the script
file(s).

See:
https://en.opensuse.org/SDB:Ima_evm#script-behaviour

In theory you could probably also close the gap by modifying the
interpreters to check for the execute bit on any file opened for
interpretation (as earlier suggested by Steve Grubb), and then you could
have IMA measure/appraise all files with +x. I suspect this could get
messy in terms of unwanted files being included, and the MAY_OPENEXEC flag
has cleaner semantics.

--
James Morris
<[email protected]>

2019-09-10 02:45:57

by Mickaël Salaün

[permalink] [raw]

Subject: Re: [PATCH v2 0/5] Add support for O_MAYEXEC

On 07/09/2019 00:44, Aleksa Sarai wrote:
> On 2019-09-06, Andy Lutomirski <[email protected]> wrote:
>>> On Sep 6, 2019, at 12:07 PM, Steve Grubb <[email protected]> wrote:
>>>
>>>> On Friday, September 6, 2019 2:57:00 PM EDT Florian Weimer wrote:
>>>> * Steve Grubb:
>>>>> Now with LD_AUDIT
>>>>> $ LD_AUDIT=/home/sgrubb/test/openflags/strip-flags.so.0 strace ./test
>>>>> 2>&1 | grep passwd openat(3, "passwd", O_RDONLY) = 4
>>>>>
>>>>> No O_CLOEXEC flag.
>>>>
>>>> I think you need to explain in detail why you consider this a problem.

Right, LD_PRELOAD and such things are definitely not part of the threat
model for O_MAYEXEC, on purpose, because this must be addressed with
other security mechanism (e.g. correct file system access-control, IMA
policy, SELinux or other LSM security policies). This is a requirement
for O_MAYEXEC to be useful.

An interpreter is just a flexible program which is generic and doesn't
have other purpose other than behaving accordingly to external rules
(i.e. scripts). If you don't trust your interpreter, it should not be
executable in the first place. O_MAYEXEC enables to restrict the use of
(some) interpreters accordingly to a *global* system security policy.

>>>
>>> Because you can strip the O_MAYEXEC flag from being passed into the kernel.
>>> Once you do that, you defeat the security mechanism because it never gets
>>> invoked. The issue is that the only thing that knows _why_ something is being
>>> opened is user space. With this mechanism, you can attempt to pass this
>>> reason to the kernel so that it may see if policy permits this. But you can
>>> just remove the flag.
>>
>> I’m with Florian here. Once you are executing code in a process, you
>> could just emulate some other unapproved code. This series is not
>> intended to provide the kind of absolute protection you’re imagining.
>
> I also agree, though I think that there is a separate argument to be
> made that there are two possible problems with O_MAYEXEC (which might
> not be really big concerns):
>
> * It's very footgun-prone if you didn't call O_MAYEXEC yourself and
> you pass the descriptor elsewhere. You need to check f_flags to see
> if it contains O_MAYEXEC. Maybe there is an argument to be made that
> passing O_MAYEXECs around isn't a valid use-case, but in that case
> there should be some warnings about that.

That could be an issue if you don't trust your system, especially if the
mount points (and the "noexec" option) can be changed by untrusted
users. As I said above, there is a requirement for basic security
properties as a meaningful file system access control, and obviously not
letting any user change mount points (which can lead to much sever
security issues anyway).

If a process A pass a FD to an interpreter B, then the interpreter B
must trust the process A. Moreover, being able to tell if the FD was
open with O_MAYEXEC and relying on it may create a wrong feeling of
security. As I said in a previous email, being able to probe for
O_MAYEXEC does not make sense because it would not be enough to
know the system policy (either this flag is enforced or not, for mount
points, based on xattr, time…). The main goal of O_MAYEXEC is to ask the
kernel, on a trusted link (hence without LD_PRELOAD-like interfering),
for a file which is allowed to be interpreted/executed by this interpreter.

To be able to correctly handle the case you pointed out (FD passing),
either an existing or a new LSM should handle this behavior according to
the origin of the FD and the chain of processes getting it.

Some advanced LSM rules could tie interpreters with scripts dedicated to
them, and have different behavior for the same scripts but with
different interpreters.

>
> * There's effectively a TOCTOU flaw (even if you are sure O_MAYEXEC is
> in f_flags) -- if the filesystem becomes re-mounted noexec (or the
> file has a-x permissions) after you've done the check you won't get
> hit with an error when you go to use the file descriptor later.

Again, the threat model needs to be appropriate to make O_MAYEXEC
useful. The security policies of the system need to be seen as a whole,
and updated as such.

As for most file system access control on Linux, it may be possible to
have TOCTOU, but the whole system should be designed to protect against
that. For example, changing file access control (e.g. mount point
options) without a reboot may lead to inconsistent security properties,
which is why such thing are discouraged by some access control systems
(e.g. SELinux).

>
> To fix both you'd need to do what you mention later:
>
>> What the kernel *could* do is prevent mmapping a non-FMODE_EXEC file
>> with PROT_EXEC, which would indeed have a real effect (in an iOS-like
>> world, for example) but would break many, many things.
>
> And I think this would be useful (with the two possible ways of
> executing .text split into FMODE_EXEC and FMODE_MAP_EXEC, as mentioned
> in a sister subthread), but would have to be opt-in for the obvious
> reason you outlined. However, we could make it the default for
> openat2(2) -- assuming we can agree on what the semantics of a
> theoretical FMODE_EXEC should be.
>
> And of course we'd need to do FMODE_UPGRADE_EXEC (which would need to
> also permit fexecve(2) though probably not PROT_EXEC -- I don't think
> you can mmap() an O_PATH descriptor).

The mmapping restriction may be interesting but it is a different use
case. This series address the interpreter/script problem. Either the
script may be mapped executable is the choice of the interpreter. In
most cases, no script are mapped as such, exactly because they are
interpreted by a process but not by the CPU.

--
Mickaël Salaün

Les données à caractère personnel recueillies et traitées dans le cadre de cet échange, le sont à seule fin d’exécution d’une relation professionnelle et s’opèrent dans cette seule finalité et pour la durée nécessaire à cette relation. Si vous souhaitez faire usage de vos droits de consultation, de rectification et de suppression de vos données, veuillez contacter [email protected]. Si vous avez reçu ce message par erreur, nous vous remercions d’en informer l’expéditeur et de détruire le message. The personal data collected and processed during this exchange aims solely at completing a business relationship and is limited to the necessary duration of that relationship. If you wish to use your rights of consultation, rectification and deletion of your data, please contact: [email protected]. If you have received this message in error, we thank you for informing the sender and destroying the message.