Hi,
While looking at the code paths for the proposed O_MAYEXEC flag, I saw
some things that looked like they should be fixed up.
exec: Change uselib(2) IS_SREG() failure to EACCES
This just regularizes the return code on uselib(2).
exec: Relocate S_ISREG() check
This moves the S_ISREG() check even earlier than it was already.
exec: Relocate path_noexec() check
This adds the path_noexec() check to the same place as the
S_ISREG() check.
fs: Include FMODE_EXEC when converting flags to f_mode
This seemed like an oversight, but I suspect there is some
reason I couldn't find for why FMODE_EXEC doesn't get set in
f_mode and just stays in f_flags.
Thanks!
-Kees
Kees Cook (4):
exec: Change uselib(2) IS_SREG() failure to EACCES
exec: Relocate S_ISREG() check
exec: Relocate path_noexec() check
fs: Include FMODE_EXEC when converting flags to f_mode
fs/exec.c | 13 +++++++++----
fs/namei.c | 5 +++++
fs/open.c | 6 ------
include/linux/fs.h | 3 ++-
include/linux/fsnotify.h | 4 ++--
5 files changed, 18 insertions(+), 13 deletions(-)
--
2.20.1
The execve(2)/uselib(2) syscalls have always rejected non-regular
files. Recently, it was noticed that a deadlock was introduced when trying
to execute pipes, as the S_ISREG() test was happening too late. This was
fixed in commit 73601ea5b7b1 ("fs/open.c: allow opening only regular files
during execve()"), but it was added after inode_permission() had already
run, which meant LSMs could see bogus attempts to execute non-regular
files. Move the test earlier.
Also include a comment with the redundant S_ISREG() checks at the end of
execve(2)/uselib(2) to note that they are present to avoid any mistakes.
Finally, instead of dereferencing the inode, use dcache for S_ISREG()
test.
My notes on the call path, and related arguments, checks, etc:
do_open_execat()
struct open_flags open_exec_flags = {
.open_flag = O_LARGEFILE | O_RDONLY | __FMODE_EXEC, ...
do_filp_open(dfd, filename, open_flags)
path_openat(nameidata, open_flags, flags)
/* f_mode populated from open_flags in alloc_empty_file() */
file = alloc_empty_file(open_flags, current_cred());
do_open(nameidata, file, open_flags)
/* new location of FMODE_EXEC vs S_ISREG() test */
may_open(path, acc_mode, open_flag)
inode_permission(inode, MAY_OPEN | acc_mode)
security_inode_permission(inode, acc_mode)
vfs_open(path, file)
do_dentry_open(file, path->dentry->d_inode, open)
/* old location of FMODE_EXEC vs S_ISREG() test */
security_file_open(f)
open()
Signed-off-by: Kees Cook <[email protected]>
---
fs/exec.c | 8 ++++++++
fs/namei.c | 4 ++++
fs/open.c | 6 ------
3 files changed, 12 insertions(+), 6 deletions(-)
diff --git a/fs/exec.c b/fs/exec.c
index 30735ce1dc0e..f0c80a8b9ccd 100644
--- a/fs/exec.c
+++ b/fs/exec.c
@@ -139,6 +139,10 @@ SYSCALL_DEFINE1(uselib, const char __user *, library)
if (IS_ERR(file))
goto out;
+ /*
+ * do_open() has already checked for this, but we can be extra
+ * cautious and check again at the very end too.
+ */
error = -EACCES;
if (!S_ISREG(file_inode(file)->i_mode))
goto exit;
@@ -860,6 +864,10 @@ static struct file *do_open_execat(int fd, struct filename *name, int flags)
if (IS_ERR(file))
goto out;
+ /*
+ * do_open() has already checked for this, but we can be extra
+ * cautious and check again at the very end too.
+ */
err = -EACCES;
if (!S_ISREG(file_inode(file)->i_mode))
goto exit;
diff --git a/fs/namei.c b/fs/namei.c
index a320371899cf..b9408aacaaa4 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -3212,6 +3212,10 @@ static int do_open(struct nameidata *nd,
if ((nd->flags & LOOKUP_DIRECTORY) && !d_can_lookup(nd->path.dentry))
return -ENOTDIR;
+ /* Any file opened for execution has to be a regular file. */
+ if ((file->f_flags & FMODE_EXEC) && !d_is_reg(nd->path.dentry))
+ return -EACCES;
+
do_truncate = false;
acc_mode = op->acc_mode;
if (file->f_mode & FMODE_CREATED) {
diff --git a/fs/open.c b/fs/open.c
index 719b320ede52..bb16e4e3cd57 100644
--- a/fs/open.c
+++ b/fs/open.c
@@ -753,12 +753,6 @@ static int do_dentry_open(struct file *f,
return 0;
}
- /* Any file opened for execve()/uselib() has to be a regular file. */
- if (unlikely(f->f_flags & FMODE_EXEC && !S_ISREG(inode->i_mode))) {
- error = -EACCES;
- goto cleanup_file;
- }
-
if (f->f_mode & FMODE_WRITE && !special_file(inode->i_mode)) {
error = get_write_access(inode);
if (unlikely(error))
--
2.20.1
Change uselib(2)' S_ISREG() error return to EACCES instead of EINVAL so
the behavior matches execve(2), and the seemingly documented value.
The "not a regular file" failure mode of execve(2) is explicitly
documented[1], but it is not mentioned in uselib(2)[2] which does,
however, say that open(2) and mmap(2) errors may apply. The documentation
for open(2) does not include a "not a regular file" error[3], but mmap(2)
does[4], and it is EACCES.
[1] http://man7.org/linux/man-pages/man2/execve.2.html#ERRORS
[2] http://man7.org/linux/man-pages/man2/uselib.2.html#ERRORS
[3] http://man7.org/linux/man-pages/man2/open.2.html#ERRORS
[4] http://man7.org/linux/man-pages/man2/mmap.2.html#ERRORS
Signed-off-by: Kees Cook <[email protected]>
---
fs/exec.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/fs/exec.c b/fs/exec.c
index 06b4c550af5d..30735ce1dc0e 100644
--- a/fs/exec.c
+++ b/fs/exec.c
@@ -139,11 +139,10 @@ SYSCALL_DEFINE1(uselib, const char __user *, library)
if (IS_ERR(file))
goto out;
- error = -EINVAL;
+ error = -EACCES;
if (!S_ISREG(file_inode(file)->i_mode))
goto exit;
- error = -EACCES;
if (path_noexec(&file->f_path))
goto exit;
--
2.20.1
Include FMODE_EXEC when building the f_mode field, so that code can
actually test the correct field and values. Only three places actually
examine f_flags for FMODE_EXEC:
fs/open.c: if (unlikely((f->f_mode & FMODE_EXEC) && !S_ISREG(inode->i_mode))) {
include/linux/fsnotify.h: if (file->f_mode & FMODE_EXEC) {
include/linux/fsnotify.h: if (file->f_mode & FMODE_EXEC)
Signed-off-by: Kees Cook <[email protected]>
---
I assume there must be some reason for FMODE_EXEC not being pulled into
f_mode, but I couldn't find it. I guess this is my attempt to either fix
an oversight or to learn by flames. :)
---
fs/namei.c | 4 ++--
include/linux/fs.h | 3 ++-
include/linux/fsnotify.h | 4 ++--
3 files changed, 6 insertions(+), 5 deletions(-)
diff --git a/fs/namei.c b/fs/namei.c
index 6bb1b6624bad..362b1cc75f5c 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -3213,8 +3213,8 @@ static int do_open(struct nameidata *nd,
return -ENOTDIR;
/* Opening for execution requires a regular file on an exec mnt. */
- if ((file->f_flags & FMODE_EXEC) && (!d_is_reg(nd->path.dentry) ||
- path_noexec(&nd->path)))
+ if ((file->f_mode & FMODE_EXEC) && (!d_is_reg(nd->path.dentry) ||
+ path_noexec(&nd->path)))
return -EACCES;
do_truncate = false;
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 4f6f59b4f22a..8a2cabdcf531 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -3545,10 +3545,11 @@ int __init get_filesystem_list(char *buf);
#define __FMODE_EXEC ((__force int) FMODE_EXEC)
#define __FMODE_NONOTIFY ((__force int) FMODE_NONOTIFY)
+#define __SMUGGLED_FMODE_FLAGS (__FMODE_EXEC | __FMODE_NONOTIFY)
#define ACC_MODE(x) ("\004\002\006\006"[(x)&O_ACCMODE])
#define OPEN_FMODE(flag) ((__force fmode_t)(((flag + 1) & O_ACCMODE) | \
- (flag & __FMODE_NONOTIFY)))
+ (flag & __SMUGGLED_FMODE_FLAGS)))
static inline bool is_sxid(umode_t mode)
{
diff --git a/include/linux/fsnotify.h b/include/linux/fsnotify.h
index 5ab28f6c7d26..86761ed4b434 100644
--- a/include/linux/fsnotify.h
+++ b/include/linux/fsnotify.h
@@ -90,7 +90,7 @@ static inline int fsnotify_perm(struct file *file, int mask)
if (mask & MAY_OPEN) {
fsnotify_mask = FS_OPEN_PERM;
- if (file->f_flags & __FMODE_EXEC) {
+ if (file->f_mode & FMODE_EXEC) {
ret = fsnotify_file(file, FS_OPEN_EXEC_PERM);
if (ret)
@@ -264,7 +264,7 @@ static inline void fsnotify_open(struct file *file)
{
__u32 mask = FS_OPEN;
- if (file->f_flags & __FMODE_EXEC)
+ if (file->f_mode & FMODE_EXEC)
mask |= FS_OPEN_EXEC;
fsnotify_file(file, mask);
--
2.20.1
On Sun, May 17, 2020 at 10:54:54PM -0700, Kees Cook wrote:
> Change uselib(2)' S_ISREG() error return to EACCES instead of EINVAL so
> the behavior matches execve(2), and the seemingly documented value.
> The "not a regular file" failure mode of execve(2) is explicitly
> documented[1], but it is not mentioned in uselib(2)[2] which does,
> however, say that open(2) and mmap(2) errors may apply. The documentation
> for open(2) does not include a "not a regular file" error[3], but mmap(2)
> does[4], and it is EACCES.
>
> [1] http://man7.org/linux/man-pages/man2/execve.2.html#ERRORS
> [2] http://man7.org/linux/man-pages/man2/uselib.2.html#ERRORS
> [3] http://man7.org/linux/man-pages/man2/open.2.html#ERRORS
> [4] http://man7.org/linux/man-pages/man2/mmap.2.html#ERRORS
>
> Signed-off-by: Kees Cook <[email protected]>
This is all extremely weird.
uselib has been deprected since forever basically which makes me doubt
this matters much but:
Acked-by: Christian Brauner <[email protected]>
Also - gulp (puts on flame proof suit) - may I suggest we check if there
are any distros out there that still set CONFIG_USELIB=y and if not do
what we did with the sysctl syscall and remove it? If someone yells we
can always backpaddle...
Christian
On Mon, May 18, 2020 at 3:03 PM Christian Brauner
<[email protected]> wrote:
> Also - gulp (puts on flame proof suit) - may I suggest we check if there
> are any distros out there that still set CONFIG_USELIB=y
Debian seems to have it enabled on x86...
https://salsa.debian.org/kernel-team/linux/-/blob/master/debian/config/kernelarch-x86/config#L1896
A random Ubuntu 19.10 VM I have here has it enabled, too.
On Mon, May 18, 2020 at 04:43:20PM +0200, Jann Horn wrote:
> On Mon, May 18, 2020 at 3:03 PM Christian Brauner
> <[email protected]> wrote:
> > Also - gulp (puts on flame proof suit) - may I suggest we check if there
> > are any distros out there that still set CONFIG_USELIB=y
>
> Debian seems to have it enabled on x86...
>
> https://salsa.debian.org/kernel-team/linux/-/blob/master/debian/config/kernelarch-x86/config#L1896
>
> A random Ubuntu 19.10 VM I have here has it enabled, too.
I wonder if there's any program - apart from _ancient_ glibc out there
that actually use it...
I looked at uselib in codsearch but the results were quite unspecific
but I didn't look too close.
Christian
Christian Brauner <[email protected]> writes:
> On Mon, May 18, 2020 at 04:43:20PM +0200, Jann Horn wrote:
>> On Mon, May 18, 2020 at 3:03 PM Christian Brauner
>> <[email protected]> wrote:
>> > Also - gulp (puts on flame proof suit) - may I suggest we check if there
>> > are any distros out there that still set CONFIG_USELIB=y
>>
>> Debian seems to have it enabled on x86...
>>
>> https://salsa.debian.org/kernel-team/linux/-/blob/master/debian/config/kernelarch-x86/config#L1896
>>
>> A random Ubuntu 19.10 VM I have here has it enabled, too.
>
> I wonder if there's any program - apart from _ancient_ glibc out there
> that actually use it...
> I looked at uselib in codsearch but the results were quite unspecific
> but I didn't look too close.
So the thing to do is to have a polite word with people who build Ubuntu
and Debian kernels and get them to disable the kernel .config.
A quick look suggets it is already disabled in RHEL8. It cannot be
disabled in RHEL7.
Then in a few years we can come back and discuss removing the uselib
system call, base on no distributions having it enabled.
If it was only libc4 and libc5 that used the uselib system call then it
can probably be removed after enough time.
We can probably reorganize the code before the point it is clearly safe
to drop support for USELIB to keep it off to the side so USELIB does not
have any ongoing mainteance costs.
For this patchset I think we need to assume uselib will need to be
maintained for a bit longer.
Eric
On Mon, May 18, 2020 at 06:57:15PM -0500, Eric W. Biederman wrote:
> Christian Brauner <[email protected]> writes:
>
> > On Mon, May 18, 2020 at 04:43:20PM +0200, Jann Horn wrote:
> >> On Mon, May 18, 2020 at 3:03 PM Christian Brauner
> >> <[email protected]> wrote:
> >> > Also - gulp (puts on flame proof suit) - may I suggest we check if there
> >> > are any distros out there that still set CONFIG_USELIB=y
> >>
> >> Debian seems to have it enabled on x86...
> >>
> >> https://salsa.debian.org/kernel-team/linux/-/blob/master/debian/config/kernelarch-x86/config#L1896
> >>
> >> A random Ubuntu 19.10 VM I have here has it enabled, too.
> >
> > I wonder if there's any program - apart from _ancient_ glibc out there
> > that actually use it...
> > I looked at uselib in codsearch but the results were quite unspecific
> > but I didn't look too close.
>
> So the thing to do is to have a polite word with people who build Ubuntu
> and Debian kernels and get them to disable the kernel .config.
Yeah, I think that's a sane thing to do.
I filed a bug for Ubuntu to start a discussion. I can't see an obvious
reason why not.
>
> A quick look suggets it is already disabled in RHEL8. It cannot be
> disabled in RHEL7.
>
> Then in a few years we can come back and discuss removing the uselib
> system call, base on no distributions having it enabled.
>
> If it was only libc4 and libc5 that used the uselib system call then it
> can probably be removed after enough time.
>
> We can probably reorganize the code before the point it is clearly safe
> to drop support for USELIB to keep it off to the side so USELIB does not
> have any ongoing mainteance costs.
>
> For this patchset I think we need to assume uselib will need to be
> maintained for a bit longer.
Yeah, agreed. It doesn't matter as long as we have a plan for the future
to remove it. I don't think keeping this cruft around forever should be
the only outlook.
Christian
On Mai 18 2020, Eric W. Biederman wrote:
> If it was only libc4 and libc5 that used the uselib system call then it
> can probably be removed after enough time.
Only libc4 used it, libc5 was already ELF.
Andreas.
--
Andreas Schwab, [email protected]
GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510 2552 DF73 E780 A9DA AEC1
"And now for something completely different."
Andreas Schwab <[email protected]> writes:
> On Mai 18 2020, Eric W. Biederman wrote:
>
>> If it was only libc4 and libc5 that used the uselib system call then it
>> can probably be removed after enough time.
>
> Only libc4 used it, libc5 was already ELF.
binfmt_elf.c supports uselib. In a very a.out ish way. Do you know if
that support was ever used?
If we are truly talking a.out only we should be able to make uselib
conditional on a.out support in the kernel which is strongly mostly
disabled at this point.
I am wondering if there are source trees for libc4 or libc5 around
anywhere that we can look at to see how usage of uselib evolved.
Eric
On Mai 19 2020, Eric W. Biederman wrote:
> I am wondering if there are source trees for libc4 or libc5 around
> anywhere that we can look at to see how usage of uselib evolved.
libc5 is available from archive.debian.org.
http://archive.debian.org/debian-archive/debian/pool/main/libc/libc/libc_5.4.46.orig.tar.gz
Andreas.
--
Andreas Schwab, [email protected]
GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510 2552 DF73 E780 A9DA AEC1
"And now for something completely different."
Andreas Schwab <[email protected]> writes:
> On Mai 19 2020, Eric W. Biederman wrote:
>
>> I am wondering if there are source trees for libc4 or libc5 around
>> anywhere that we can look at to see how usage of uselib evolved.
>
> libc5 is available from archive.debian.org.
>
> http://archive.debian.org/debian-archive/debian/pool/main/libc/libc/libc_5.4.46.orig.tar.gz
Interesting.
It appears that the old a.out code to make use of uselib remained in
the libc5 sources but it was all conditional on the being compiled not
to use ELF.
libc5 did provide a wrapper for the uselib system call.
It appears glibc also provides a wrapper for the uselib system call
named: uselib@GLIBC_2.2.5.
I don't see a glibc header file that provides a declaration for uselib
though.
So the question becomes did anyone use those glibc wrappers.
Eric
On Tue, May 19, 2020 at 06:56:36AM -0500, Eric W. Biederman wrote:
> Andreas Schwab <[email protected]> writes:
>
> > On Mai 18 2020, Eric W. Biederman wrote:
> >
> >> If it was only libc4 and libc5 that used the uselib system call then it
> >> can probably be removed after enough time.
> >
> > Only libc4 used it, libc5 was already ELF.
>
> binfmt_elf.c supports uselib. In a very a.out ish way. Do you know if
> that support was ever used?
>
> If we are truly talking a.out only we should be able to make uselib
> conditional on a.out support in the kernel which is strongly mostly
> disabled at this point.
The only ones that even allow setting AOUT:
arch/alpha/Kconfig: select HAVE_AOUT
arch/m68k/Kconfig: select HAVE_AOUT if MMU
and x86 deprecated it March 2019:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=eac616557050737a8d6ef6fe0322d0980ff0ffde
Christian
On Tue, May 19, 2020 at 07:28:46AM -0500, Eric W. Biederman wrote:
> Andreas Schwab <[email protected]> writes:
>
> > On Mai 19 2020, Eric W. Biederman wrote:
> >
> >> I am wondering if there are source trees for libc4 or libc5 around
> >> anywhere that we can look at to see how usage of uselib evolved.
> >
> > libc5 is available from archive.debian.org.
> >
> > http://archive.debian.org/debian-archive/debian/pool/main/libc/libc/libc_5.4.46.orig.tar.gz
>
> Interesting.
>
> It appears that the old a.out code to make use of uselib remained in
> the libc5 sources but it was all conditional on the being compiled not
> to use ELF.
>
> libc5 did provide a wrapper for the uselib system call.
>
> It appears glibc also provides a wrapper for the uselib system call
> named: uselib@GLIBC_2.2.5.
>
> I don't see a glibc header file that provides a declaration for uselib
> though.
>
> So the question becomes did anyone use those glibc wrappers.
The only software I could find was ski, the ia64 instruction set
emulator, which apparently used to make use of this and when glibc
removed they did:
#define uselib(libname) syscall(__NR_uselib, libname)
but they only define it for the sake of the internal syscall list they
maintain so not actively using it. I just checked, ski is available on
Fedora 31 and Fedora has USELIB disabled.
Codesearch on Debian yields no users that actively use the syscall for
anything.
Christian
Hi Christian,
On Tue, May 19, 2020 at 3:15 PM Christian Brauner
<[email protected]> wrote:
> On Tue, May 19, 2020 at 06:56:36AM -0500, Eric W. Biederman wrote:
> > Andreas Schwab <[email protected]> writes:
> > > On Mai 18 2020, Eric W. Biederman wrote:
> > >> If it was only libc4 and libc5 that used the uselib system call then it
> > >> can probably be removed after enough time.
> > >
> > > Only libc4 used it, libc5 was already ELF.
> >
> > binfmt_elf.c supports uselib. In a very a.out ish way. Do you know if
> > that support was ever used?
> >
> > If we are truly talking a.out only we should be able to make uselib
> > conditional on a.out support in the kernel which is strongly mostly
> > disabled at this point.
>
> The only ones that even allow setting AOUT:
>
> arch/alpha/Kconfig: select HAVE_AOUT
> arch/m68k/Kconfig: select HAVE_AOUT if MMU
>
> and x86 deprecated it March 2019:
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=eac616557050737a8d6ef6fe0322d0980ff0ffde
Quoting myself (for the second time this month):
"I think it's safe to assume no one still runs a.out binaries on m68k."
http://lore.kernel.org/r/CAMuHMdW+m0Q+j3rsQdMXnrEPm+XB5Y2AQrxW5sD1mZAKgmEqoA@mail.gmail.com
Gr{oetje,eeting}s,
Geert
--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- [email protected]
In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds
On Tue, May 19, 2020 at 04:32:38PM +0200, Geert Uytterhoeven wrote:
> Hi Christian,
>
> On Tue, May 19, 2020 at 3:15 PM Christian Brauner
> <[email protected]> wrote:
> > On Tue, May 19, 2020 at 06:56:36AM -0500, Eric W. Biederman wrote:
> > > Andreas Schwab <[email protected]> writes:
> > > > On Mai 18 2020, Eric W. Biederman wrote:
> > > >> If it was only libc4 and libc5 that used the uselib system call then it
> > > >> can probably be removed after enough time.
> > > >
> > > > Only libc4 used it, libc5 was already ELF.
> > >
> > > binfmt_elf.c supports uselib. In a very a.out ish way. Do you know if
> > > that support was ever used?
> > >
> > > If we are truly talking a.out only we should be able to make uselib
> > > conditional on a.out support in the kernel which is strongly mostly
> > > disabled at this point.
> >
> > The only ones that even allow setting AOUT:
> >
> > arch/alpha/Kconfig: select HAVE_AOUT
> > arch/m68k/Kconfig: select HAVE_AOUT if MMU
> >
> > and x86 deprecated it March 2019:
> > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=eac616557050737a8d6ef6fe0322d0980ff0ffde
>
> Quoting myself (for the second time this month):
> "I think it's safe to assume no one still runs a.out binaries on m68k."
> http://lore.kernel.org/r/CAMuHMdW+m0Q+j3rsQdMXnrEPm+XB5Y2AQrxW5sD1mZAKgmEqoA@mail.gmail.com
Well, it's just a great thing to hear. :)
Fyi, I once tried to give a demo with a Slackware version in a container
which pre-dated ELF on a x86 kernel with AOUT. It didn't go well...
Christian
Christian Brauner <[email protected]> writes:
> On Tue, May 19, 2020 at 07:28:46AM -0500, Eric W. Biederman wrote:
>> Andreas Schwab <[email protected]> writes:
>>
>> > On Mai 19 2020, Eric W. Biederman wrote:
>> >
>> >> I am wondering if there are source trees for libc4 or libc5 around
>> >> anywhere that we can look at to see how usage of uselib evolved.
>> >
>> > libc5 is available from archive.debian.org.
>> >
>> > http://archive.debian.org/debian-archive/debian/pool/main/libc/libc/libc_5.4.46.orig.tar.gz
>>
>> Interesting.
>>
>> It appears that the old a.out code to make use of uselib remained in
>> the libc5 sources but it was all conditional on the being compiled not
>> to use ELF.
>>
>> libc5 did provide a wrapper for the uselib system call.
>>
>> It appears glibc also provides a wrapper for the uselib system call
>> named: uselib@GLIBC_2.2.5.
>>
>> I don't see a glibc header file that provides a declaration for uselib
>> though.
>>
>> So the question becomes did anyone use those glibc wrappers.
>
> The only software I could find was ski, the ia64 instruction set
> emulator, which apparently used to make use of this and when glibc
> removed they did:
>
> #define uselib(libname) syscall(__NR_uselib, libname)
>
> but they only define it for the sake of the internal syscall list they
> maintain so not actively using it. I just checked, ski is available on
> Fedora 31 and Fedora has USELIB disabled.
> Codesearch on Debian yields no users that actively use the syscall for
> anything.
I think there is a very good argument that no one builds libraries
usable with uselib anymore. The ELF version requires a ET_EXEC binary
with one PT_LOAD segment that is loaded at a fixed virtual address.
This is a format that does not allow for relocation processing, the
loading executable has to ``know'' where the symbols are in the loaded
binary, and they have to be build to run at distinct virtual addresses.
I think I could conjure up some linker scripts to do that with no more
linker support than we use to build the kernel, but it is not easy to
maintain binaries and libraries like that as code changes. Which is
why we switched to ELF in the first place.
I think the tooling challenges plus not being able to find anything
using uselib anymore make a solid argument for going to a distribution
and asking them to stop enabling CONFIG_USELIB in their kernels.
Eric
Kees Cook <[email protected]> writes:
> Hi,
>
> While looking at the code paths for the proposed O_MAYEXEC flag, I saw
> some things that looked like they should be fixed up.
>
> exec: Change uselib(2) IS_SREG() failure to EACCES
> This just regularizes the return code on uselib(2).
>
> exec: Relocate S_ISREG() check
> This moves the S_ISREG() check even earlier than it was already.
>
> exec: Relocate path_noexec() check
> This adds the path_noexec() check to the same place as the
> S_ISREG() check.
>
> fs: Include FMODE_EXEC when converting flags to f_mode
> This seemed like an oversight, but I suspect there is some
> reason I couldn't find for why FMODE_EXEC doesn't get set in
> f_mode and just stays in f_flags.
So I took a look at this series.
I think the belt and suspenders approach of adding code in open and then
keeping it in exec and uselib is probably wrong. My sense of the
situation is a belt and suspenders approach is more likely to be
confusing and result in people making mistakes when maintaining the code
than to actually be helpful.
Eric
On Tue, May 19, 2020 at 10:06:32AM -0500, Eric W. Biederman wrote:
> Kees Cook <[email protected]> writes:
>
> > Hi,
> >
> > While looking at the code paths for the proposed O_MAYEXEC flag, I saw
> > some things that looked like they should be fixed up.
> >
> > exec: Change uselib(2) IS_SREG() failure to EACCES
> > This just regularizes the return code on uselib(2).
> >
> > exec: Relocate S_ISREG() check
> > This moves the S_ISREG() check even earlier than it was already.
> >
> > exec: Relocate path_noexec() check
> > This adds the path_noexec() check to the same place as the
> > S_ISREG() check.
> >
> > fs: Include FMODE_EXEC when converting flags to f_mode
> > This seemed like an oversight, but I suspect there is some
> > reason I couldn't find for why FMODE_EXEC doesn't get set in
> > f_mode and just stays in f_flags.
>
> So I took a look at this series.
>
> I think the belt and suspenders approach of adding code in open and then
> keeping it in exec and uselib is probably wrong. My sense of the
> situation is a belt and suspenders approach is more likely to be
> confusing and result in people making mistakes when maintaining the code
> than to actually be helpful.
This is why I added the comments in fs/exec.c's redundant checks. When I
was originally testing this series, I had entirely removed the checks in
fs/exec.c, but then had nightmares about some kind of future VFS paths
that would somehow bypass do_open() and result in execve() working on
noexec mounts, there by allowing for the introduction of a really nasty
security bug.
The S_ISREG test is demonstrably too late (as referenced in the series),
and given the LSM hooks, I think the noexec check is too late as well.
(This is especially true for the coming O_MAYEXEC series, which will
absolutely need those tests earlier as well[1] -- the permission checking
is then in the correct place: during open, not exec.) I think the only
question is about leaving the redundant checks in fs/exec.c, which I
think are a cheap way to retain a sense of robustness.
-Kees
[1] https://lore.kernel.org/lkml/202005142343.D580850@keescook/
--
Kees Cook
Kees Cook <[email protected]> writes:
> On Tue, May 19, 2020 at 10:06:32AM -0500, Eric W. Biederman wrote:
>> Kees Cook <[email protected]> writes:
>>
>> > Hi,
>> >
>> > While looking at the code paths for the proposed O_MAYEXEC flag, I saw
>> > some things that looked like they should be fixed up.
>> >
>> > exec: Change uselib(2) IS_SREG() failure to EACCES
>> > This just regularizes the return code on uselib(2).
>> >
>> > exec: Relocate S_ISREG() check
>> > This moves the S_ISREG() check even earlier than it was already.
>> >
>> > exec: Relocate path_noexec() check
>> > This adds the path_noexec() check to the same place as the
>> > S_ISREG() check.
>> >
>> > fs: Include FMODE_EXEC when converting flags to f_mode
>> > This seemed like an oversight, but I suspect there is some
>> > reason I couldn't find for why FMODE_EXEC doesn't get set in
>> > f_mode and just stays in f_flags.
>>
>> So I took a look at this series.
>>
>> I think the belt and suspenders approach of adding code in open and then
>> keeping it in exec and uselib is probably wrong. My sense of the
>> situation is a belt and suspenders approach is more likely to be
>> confusing and result in people making mistakes when maintaining the code
>> than to actually be helpful.
>
> This is why I added the comments in fs/exec.c's redundant checks. When I
> was originally testing this series, I had entirely removed the checks in
> fs/exec.c, but then had nightmares about some kind of future VFS paths
> that would somehow bypass do_open() and result in execve() working on
> noexec mounts, there by allowing for the introduction of a really nasty
> security bug.
>
> The S_ISREG test is demonstrably too late (as referenced in the series),
Yes. The open of a pipe very much happens when it should not.
The deadlock looks like part of the cred_guard_mutex mess. I think I
introduced an alternate solution for the specific code paths in the
backtrace when I introduced exec_update_mutex.
The fact that cred_guard_mutex is held over open, while at the same time
cred_guard_mutex is grabbed on open files is very questionable. Until
my most recent patchset feeding exec /proc/self/maps would also deadlock
this way.
> and given the LSM hooks, I think the noexec check is too late as well.
> (This is especially true for the coming O_MAYEXEC series, which will
> absolutely need those tests earlier as well[1] -- the permission checking
> is then in the correct place: during open, not exec.) I think the only
> question is about leaving the redundant checks in fs/exec.c, which I
> think are a cheap way to retain a sense of robustness.
The trouble is when someone passes through changes one of the permission
checks for whatever reason (misses that they are duplicated in another
location) and things then fail in some very unexpected way.
Eric
On Tue, May 19, 2020 at 12:41:27PM -0500, Eric W. Biederman wrote:
> Kees Cook <[email protected]> writes:
> > and given the LSM hooks, I think the noexec check is too late as well.
> > (This is especially true for the coming O_MAYEXEC series, which will
> > absolutely need those tests earlier as well[1] -- the permission checking
> > is then in the correct place: during open, not exec.) I think the only
> > question is about leaving the redundant checks in fs/exec.c, which I
> > think are a cheap way to retain a sense of robustness.
>
> The trouble is when someone passes through changes one of the permission
> checks for whatever reason (misses that they are duplicated in another
> location) and things then fail in some very unexpected way.
Do you think this series should drop the "late" checks in fs/exec.c?
Honestly, the largest motivation for me to move the checks earlier as
I've done is so that other things besides execve() can use FMODE_EXEC
during open() and receive the same sanity-checking as execve() (i.e the
O_MAYEXEC series -- the details are still under discussion but this
cleanup will be needed regardless).
--
Kees Cook
Kees Cook <[email protected]> writes:
> On Tue, May 19, 2020 at 12:41:27PM -0500, Eric W. Biederman wrote:
>> Kees Cook <[email protected]> writes:
>> > and given the LSM hooks, I think the noexec check is too late as well.
>> > (This is especially true for the coming O_MAYEXEC series, which will
>> > absolutely need those tests earlier as well[1] -- the permission checking
>> > is then in the correct place: during open, not exec.) I think the only
>> > question is about leaving the redundant checks in fs/exec.c, which I
>> > think are a cheap way to retain a sense of robustness.
>>
>> The trouble is when someone passes through changes one of the permission
>> checks for whatever reason (misses that they are duplicated in another
>> location) and things then fail in some very unexpected way.
>
> Do you think this series should drop the "late" checks in fs/exec.c?
> Honestly, the largest motivation for me to move the checks earlier as
> I've done is so that other things besides execve() can use FMODE_EXEC
> during open() and receive the same sanity-checking as execve() (i.e the
> O_MAYEXEC series -- the details are still under discussion but this
> cleanup will be needed regardless).
I think this series should drop the "late" checks in fs/exec.c It feels
less error prone, and it feels like that would transform this into
something Linus would be eager to merge because series becomes a cleanup
that reduces line count.
I haven't been inside of open recently enough to remember if the
location you are putting the check fundamentally makes sense. But the
O_MAYEXEC bits make a pretty strong case that something of the sort
needs to happen.
I took a quick look but I can not see clearly where path_noexec
and the regular file tests should go.
I do see that you have code duplication with faccessat which suggests
that you haven't put the checks in the right place.
I am wondering if we need something distinct to request the type of the
file being opened versus execute permissions.
All I know is being careful and putting the tests in a good logical
place makes the code more maintainable, whereas not being careful
results in all kinds of sharp corners that might be exploitable.
So I think it is worth digging in and figuring out where those checks
should live. Especially so that code like faccessat does not need
to duplicate them.
Eric
On Tue, May 19, 2020 at 01:42:28PM -0500, Eric W. Biederman wrote:
> Kees Cook <[email protected]> writes:
>
> > On Tue, May 19, 2020 at 12:41:27PM -0500, Eric W. Biederman wrote:
> >> Kees Cook <[email protected]> writes:
> >> > and given the LSM hooks, I think the noexec check is too late as well.
> >> > (This is especially true for the coming O_MAYEXEC series, which will
> >> > absolutely need those tests earlier as well[1] -- the permission checking
> >> > is then in the correct place: during open, not exec.) I think the only
> >> > question is about leaving the redundant checks in fs/exec.c, which I
> >> > think are a cheap way to retain a sense of robustness.
> >>
> >> The trouble is when someone passes through changes one of the permission
> >> checks for whatever reason (misses that they are duplicated in another
> >> location) and things then fail in some very unexpected way.
> >
> > Do you think this series should drop the "late" checks in fs/exec.c?
> > Honestly, the largest motivation for me to move the checks earlier as
> > I've done is so that other things besides execve() can use FMODE_EXEC
> > during open() and receive the same sanity-checking as execve() (i.e the
> > O_MAYEXEC series -- the details are still under discussion but this
> > cleanup will be needed regardless).
>
> I think this series should drop the "late" checks in fs/exec.c It feels
> less error prone, and it feels like that would transform this into
> something Linus would be eager to merge because series becomes a cleanup
> that reduces line count.
Yeah, that was my initial sense too. I just started to get nervous about
removing the long-standing exec sanity checks. ;)
> I haven't been inside of open recently enough to remember if the
> location you are putting the check fundamentally makes sense. But the
> O_MAYEXEC bits make a pretty strong case that something of the sort
> needs to happen.
Right. I *think* it's correct place for now, based on my understanding
of the call graph (which is why I included it in the commit logs).
> I took a quick look but I can not see clearly where path_noexec
> and the regular file tests should go.
>
> I do see that you have code duplication with faccessat which suggests
> that you haven't put the checks in the right place.
Yeah, I have notes on the similar call sites (which I concluded, perhaps
wrongly) to ignore:
do_faccessat()
user_path_at(dfd, filename, lookup_flags, &path);
if (acc_mode & MAY_EXEC .... path_noexec()
inode_permission(inode, mode | MAY_ACCESS);
This appears to be strictly advisory, and the path_noexec() test is
there to, perhaps, avoid surprises when doing access() then fexecve()?
I would note, however, that that path-based LSMs appear to have no hook
in this call graph at all. I was expecting a call like:
security_file_permission(..., mode | MAY_ACCESS)
but I couldn't find one (or anything like it), so only
inode_permission() is being tested (which means also the existing
execve() late tests are missed, and the newly added S_ISREG() test from
do_dentry_open() is missed).
prctl_set_mm_exe_file()
err = -EACCESS;
if (!S_ISREG(inode->i_mode) || path_noexec(&exe.file->f_path))
goto exit;
err = inode_permission(inode, MAY_EXEC);
This is similar (no path-based LSM hooks present, only inode_permission()
used for permission checking), but it is at least gated by CAP_SYS_ADMIN.
And this bring me to a related question from my review: does
dentry_open() intentionally bypass security_inode_permission()? I.e. it
calls vfs_open() not do_open():
openat2(dfd, char * filename, open_how)
build_open_flags(open_how, open_flags)
do_filp_open(dfd, filename, open_flags)
path_openat(nameidata, open_flags, flags)
file = alloc_empty_file(open_flags, current_cred());
do_open(nameidata, file, open_flags)
may_open(path, acc_mode, open_flag)
inode_permission(inode, MAY_OPEN | acc_mode)
security_inode_permission(inode, acc_mode)
vfs_open(path, file)
do_dentry_open(file, path->dentry->d_inode, open)
if (unlikely(f->f_flags & FMODE_EXEC && !S_ISREG(inode->i_mode))) ...
security_file_open(f)
/* path-based LSMs check for open here
* and use FMODE_* flags to determine how a file
* is being opened. */
open()
vs
dentry_open(path, flags, cred)
f = alloc_empty_file(flags, cred);
vfs_open(path, f);
I would expect dentry_open() to mostly duplicate a bunch of
path_openat(), but it lacks the may_open() call, etc.
I really got the feeling that there was some new conceptual split needed
inside do_open() where the nameidata details have been finished, after
we've gained the "file" information, but before we've lost the "path"
information. For example, may_open(path, ...) has no sense of "file",
though it does do the inode_permission() call.
Note also that may_open() is used in do_tmpfile() too, and has a comment
implying it needs to be checking only a subset of the path details. So
I'm not sure how to split things up.
So, that's why I put the new checks just before the may_open() call in
do_open(): it's the most central, positions itself correctly for dealing
with O_MAYEXEC, and doesn't appear to make any existing paths worse.
> I am wondering if we need something distinct to request the type of the
> file being opened versus execute permissions.
Well, this is why I wanted to centralize it -- the knowledge of how a
file is going to be used needs to be tested both by the core VFS
(S_ISREG, path_noexec) and the LSMs. Things were inconsistent before.
> All I know is being careful and putting the tests in a good logical
> place makes the code more maintainable, whereas not being careful
> results in all kinds of sharp corners that might be exploitable.
> So I think it is worth digging in and figuring out where those checks
> should live. Especially so that code like faccessat does not need
> to duplicate them.
I think this is the right place with respect to execve(), though I think
there are other cases that could use to be improved (or at least made
more consistent).
-Kees
--
Kees Cook
On 5/19/20 2:17 PM, Kees Cook wrote:
> On Tue, May 19, 2020 at 01:42:28PM -0500, Eric W. Biederman wrote:
>> Kees Cook <[email protected]> writes:
>>
>>> On Tue, May 19, 2020 at 12:41:27PM -0500, Eric W. Biederman wrote:
>>>> Kees Cook <[email protected]> writes:
>>>>> and given the LSM hooks, I think the noexec check is too late as well.
>>>>> (This is especially true for the coming O_MAYEXEC series, which will
>>>>> absolutely need those tests earlier as well[1] -- the permission checking
>>>>> is then in the correct place: during open, not exec.) I think the only
>>>>> question is about leaving the redundant checks in fs/exec.c, which I
>>>>> think are a cheap way to retain a sense of robustness.
>>>>
>>>> The trouble is when someone passes through changes one of the permission
>>>> checks for whatever reason (misses that they are duplicated in another
>>>> location) and things then fail in some very unexpected way.
>>>
>>> Do you think this series should drop the "late" checks in fs/exec.c?
>>> Honestly, the largest motivation for me to move the checks earlier as
>>> I've done is so that other things besides execve() can use FMODE_EXEC
>>> during open() and receive the same sanity-checking as execve() (i.e the
>>> O_MAYEXEC series -- the details are still under discussion but this
>>> cleanup will be needed regardless).
>>
>> I think this series should drop the "late" checks in fs/exec.c It feels
>> less error prone, and it feels like that would transform this into
>> something Linus would be eager to merge because series becomes a cleanup
>> that reduces line count.
>
> Yeah, that was my initial sense too. I just started to get nervous about
> removing the long-standing exec sanity checks. ;)
>
>> I haven't been inside of open recently enough to remember if the
>> location you are putting the check fundamentally makes sense. But the
>> O_MAYEXEC bits make a pretty strong case that something of the sort
>> needs to happen.
>
> Right. I *think* it's correct place for now, based on my understanding
> of the call graph (which is why I included it in the commit logs).
>
>> I took a quick look but I can not see clearly where path_noexec
>> and the regular file tests should go.
>>
>> I do see that you have code duplication with faccessat which suggests
>> that you haven't put the checks in the right place.
>
> Yeah, I have notes on the similar call sites (which I concluded, perhaps
> wrongly) to ignore:
>
> do_faccessat()
> user_path_at(dfd, filename, lookup_flags, &path);
> if (acc_mode & MAY_EXEC .... path_noexec()
> inode_permission(inode, mode | MAY_ACCESS);
>
> This appears to be strictly advisory, and the path_noexec() test is
> there to, perhaps, avoid surprises when doing access() then fexecve()?
> I would note, however, that that path-based LSMs appear to have no hook
> in this call graph at all. I was expecting a call like:
>
> security_file_permission(..., mode | MAY_ACCESS)
>
> but I couldn't find one (or anything like it), so only
> inode_permission() is being tested (which means also the existing
> execve() late tests are missed, and the newly added S_ISREG() test from
> do_dentry_open() is missed).
>
>
sadly correct, its something that we intend to fix but haven't gotten
to it
> prctl_set_mm_exe_file()
> err = -EACCESS;
> if (!S_ISREG(inode->i_mode) || path_noexec(&exe.file->f_path))
> goto exit;
> err = inode_permission(inode, MAY_EXEC);
>
> This is similar (no path-based LSM hooks present, only inode_permission()
> used for permission checking), but it is at least gated by CAP_SYS_ADMIN.
>
dito here
>
> And this bring me to a related question from my review: does
> dentry_open() intentionally bypass security_inode_permission()? I.e. it
> calls vfs_open() not do_open():
>
> openat2(dfd, char * filename, open_how)
> build_open_flags(open_how, open_flags)
> do_filp_open(dfd, filename, open_flags)
> path_openat(nameidata, open_flags, flags)
> file = alloc_empty_file(open_flags, current_cred());
> do_open(nameidata, file, open_flags)
> may_open(path, acc_mode, open_flag)
> inode_permission(inode, MAY_OPEN | acc_mode)
> security_inode_permission(inode, acc_mode)
> vfs_open(path, file)
> do_dentry_open(file, path->dentry->d_inode, open)
> if (unlikely(f->f_flags & FMODE_EXEC && !S_ISREG(inode->i_mode))) ...
> security_file_open(f)
> /* path-based LSMs check for open here
> * and use FMODE_* flags to determine how a file
> * is being opened. */
> open()
>
> vs
>
> dentry_open(path, flags, cred)
> f = alloc_empty_file(flags, cred);
> vfs_open(path, f);
>
> I would expect dentry_open() to mostly duplicate a bunch of
> path_openat(), but it lacks the may_open() call, etc.
>
> I really got the feeling that there was some new conceptual split needed
> inside do_open() where the nameidata details have been finished, after
> we've gained the "file" information, but before we've lost the "path"
> information. For example, may_open(path, ...) has no sense of "file",
> though it does do the inode_permission() call.
>
yes that would be nice, sadly the path hooks are really a bolted on after thought
> Note also that may_open() is used in do_tmpfile() too, and has a comment
> implying it needs to be checking only a subset of the path details. So
> I'm not sure how to split things up.
>
/me neither anymore
> So, that's why I put the new checks just before the may_open() call in
> do_open(): it's the most central, positions itself correctly for dealing
> with O_MAYEXEC, and doesn't appear to make any existing paths worse.
>
>> I am wondering if we need something distinct to request the type of the
>> file being opened versus execute permissions.
>
> Well, this is why I wanted to centralize it -- the knowledge of how a
> file is going to be used needs to be tested both by the core VFS
> (S_ISREG, path_noexec) and the LSMs. Things were inconsistent before.
>
yep
>> All I know is being careful and putting the tests in a good logical
>> place makes the code more maintainable, whereas not being careful
>> results in all kinds of sharp corners that might be exploitable.
>> So I think it is worth digging in and figuring out where those checks
>> should live. Especially so that code like faccessat does not need
>> to duplicate them.
>
> I think this is the right place with respect to execve(), though I think
> there are other cases that could use to be improved (or at least made
> more consistent).
>
> -Kees
>