2015-08-13 09:32:58

by David Drysdale

[permalink] [raw]
Subject: [PATCHv4 0/3] fs: add O_BENEATH flag to openat(2)

A couple of questions with this iteration:
- Should we create a new errno (say ENOTBENEATH) for this policing, to
make it easier to distinguish this case from other EPERM failures?
(The FreeBSD implementation is considering this approach.)
- Al, does the code look OK for (in particular) integrating with the
shiny new re-worked fs/namei.c code?
Thanks.


This change adds a new O_BENEATH flag for openat(2) which restricts the
provided path, rejecting (with -EPERM) paths that are not beneath
the provided dfd.

This functionality was originally implemented as part of the internals
of the Capsicum security framework, which is available in FreeBSD 10.x
and which has previously had a Linux kernel port proposed [1].

However, as this behaviour is potentially useful as an independent feature,
this change exposes it via an openat(2) flag. (This variant was not
originally exposed in FreeBSD, but is currently being proposed there
too [2].)

Various folks from Chrome[OS] have indicated an interest in having this
functionality -- when combined with a seccomp filter it allows a directory
to be more safely accessed by a sandboxed process. Other folk have also
expressed an interest [3].


[1] https://lkml.org/lkml/2014/7/25/426
[2] https://reviews.freebsd.org/D2808
[3] https://groups.google.com/d/msg/capnproto/sKpzanYNZmQ/T9IbJIB-rqQJ

Changes since v3:
- Merge up to v4.2-rc6
- Reinstate local selftests (I'll send xfstest changes separately
if and when this is merged)
- Pull in common selftests makefile

Changes since v2:
- Move tests into xfstests [Dave Chinner, with thanks for feedback
on initial version]
- Merge up to v4.0-rc3 & latest man-pages

Changes since v1:
- Don't needlessly duplicate flags [Al Viro]
- Use EPERM rather than EACCES as error code [Paolo Bonzini]
- Disallow nd_jump_link for O_BENEATH [Al Viro/Andy Lutomirski]
- Add test of a jumped symlink (/proc/self/root)

Changes since the version included in the Capsicum v2 patchset:
- Add tests of normal symlinks
- Fix man-page typo
- Update patch to 3.17

Changes from v1 to v2 of Capsicum patchset:
- renamed O_BENEATH_ONLY to O_BENEATH [Christoph Hellwig]


David Drysdale (2):
fs: add O_BENEATH flag to openat(2)
selftests: Add test of O_BENEATH & openat(2)

arch/alpha/include/uapi/asm/fcntl.h | 1 +
arch/parisc/include/uapi/asm/fcntl.h | 1 +
arch/sparc/include/uapi/asm/fcntl.h | 1 +
fs/fcntl.c | 4 +-
fs/namei.c | 12 +-
fs/open.c | 4 +-
fs/proc/base.c | 4 +-
fs/proc/namespaces.c | 8 +-
include/linux/namei.h | 3 +-
include/uapi/asm-generic/fcntl.h | 4 +
tools/testing/selftests/Makefile | 1 +
tools/testing/selftests/openat/.gitignore | 4 +
tools/testing/selftests/openat/Makefile | 29 ++++
tools/testing/selftests/openat/openat.c | 258 ++++++++++++++++++++++++++++++
14 files changed, 326 insertions(+), 8 deletions(-)
create mode 100644 tools/testing/selftests/openat/.gitignore
create mode 100644 tools/testing/selftests/openat/Makefile
create mode 100644 tools/testing/selftests/openat/openat.c

--
1.9.1


2015-08-13 09:33:58

by David Drysdale

[permalink] [raw]
Subject: [PATCHv4 1/3] fs: add O_BENEATH flag to openat(2)

Add a new O_BENEATH flag for openat(2) which restricts the
provided path, rejecting (with -EPERM) paths that are not beneath
the provided dfd. In particular, reject:
- paths that contain .. components
- paths that begin with /
- symlinks that have paths as above.

Also disallow use of nd_jump_link() for following symlinks without
path expansion, when O_BENEATH is set.

Signed-off-by: David Drysdale <[email protected]>
---
arch/alpha/include/uapi/asm/fcntl.h | 1 +
arch/parisc/include/uapi/asm/fcntl.h | 1 +
arch/sparc/include/uapi/asm/fcntl.h | 1 +
fs/fcntl.c | 4 ++--
fs/namei.c | 12 +++++++++++-
fs/open.c | 4 +++-
fs/proc/base.c | 4 +++-
fs/proc/namespaces.c | 8 ++++++--
include/linux/namei.h | 3 ++-
include/uapi/asm-generic/fcntl.h | 4 ++++
10 files changed, 34 insertions(+), 8 deletions(-)

diff --git a/arch/alpha/include/uapi/asm/fcntl.h b/arch/alpha/include/uapi/asm/fcntl.h
index 09f49a6b87d1..76a87038d2c1 100644
--- a/arch/alpha/include/uapi/asm/fcntl.h
+++ b/arch/alpha/include/uapi/asm/fcntl.h
@@ -33,6 +33,7 @@

#define O_PATH 040000000
#define __O_TMPFILE 0100000000
+#define O_BENEATH 0200000000 /* no / or .. in openat path */

#define F_GETLK 7
#define F_SETLK 8
diff --git a/arch/parisc/include/uapi/asm/fcntl.h b/arch/parisc/include/uapi/asm/fcntl.h
index 34a46cbc76ed..3adadf72f929 100644
--- a/arch/parisc/include/uapi/asm/fcntl.h
+++ b/arch/parisc/include/uapi/asm/fcntl.h
@@ -21,6 +21,7 @@

#define O_PATH 020000000
#define __O_TMPFILE 040000000
+#define O_BENEATH 080000000 /* no / or .. in openat path */

#define F_GETLK64 8
#define F_SETLK64 9
diff --git a/arch/sparc/include/uapi/asm/fcntl.h b/arch/sparc/include/uapi/asm/fcntl.h
index 7e8ace5bf760..ea38f0bd6cec 100644
--- a/arch/sparc/include/uapi/asm/fcntl.h
+++ b/arch/sparc/include/uapi/asm/fcntl.h
@@ -36,6 +36,7 @@

#define O_PATH 0x1000000
#define __O_TMPFILE 0x2000000
+#define O_BENEATH 0x4000000 /* no / or .. in openat path */

#define F_GETOWN 5 /* for sockets. */
#define F_SETOWN 6 /* for sockets. */
diff --git a/fs/fcntl.c b/fs/fcntl.c
index ee85cd4e136a..3169693e9390 100644
--- a/fs/fcntl.c
+++ b/fs/fcntl.c
@@ -740,7 +740,7 @@ static int __init fcntl_init(void)
* Exceptions: O_NONBLOCK is a two bit define on parisc; O_NDELAY
* is defined as O_NONBLOCK on some platforms and not on others.
*/
- BUILD_BUG_ON(21 - 1 /* for O_RDONLY being 0 */ != HWEIGHT32(
+ BUILD_BUG_ON(22 - 1 /* for O_RDONLY being 0 */ != HWEIGHT32(
O_RDONLY | O_WRONLY | O_RDWR |
O_CREAT | O_EXCL | O_NOCTTY |
O_TRUNC | O_APPEND | /* O_NONBLOCK | */
@@ -748,7 +748,7 @@ static int __init fcntl_init(void)
O_DIRECT | O_LARGEFILE | O_DIRECTORY |
O_NOFOLLOW | O_NOATIME | O_CLOEXEC |
__FMODE_EXEC | O_PATH | __O_TMPFILE |
- __FMODE_NONOTIFY
+ __FMODE_NONOTIFY| O_BENEATH
));

fasync_cache = kmem_cache_create("fasync_cache",
diff --git a/fs/namei.c b/fs/namei.c
index fbbcf0993312..978f07d91a11 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -827,14 +827,18 @@ static inline void path_to_nameidata(const struct path *path,
* Helper to directly jump to a known parsed path from ->follow_link,
* caller must have taken a reference to path beforehand.
*/
-void nd_jump_link(struct path *path)
+int nd_jump_link(struct path *path)
{
struct nameidata *nd = current->nameidata;
+
+ if (nd->flags & LOOKUP_BENEATH)
+ return -EPERM;
path_put(&nd->path);

nd->path = *path;
nd->inode = nd->path.dentry->d_inode;
nd->flags |= LOOKUP_JUMPED;
+ return 0;
}

static inline void put_link(struct nameidata *nd)
@@ -1000,6 +1004,8 @@ const char *get_link(struct nameidata *nd)
}
}
if (*res == '/') {
+ if (nd->flags & LOOKUP_BENEATH)
+ return ERR_PTR(-EPERM);
if (nd->flags & LOOKUP_RCU) {
struct dentry *d;
if (!nd->root.mnt)
@@ -1888,6 +1894,8 @@ static int link_path_walk(const char *name, struct nameidata *nd)
if (name[0] == '.') switch (hashlen_len(hash_len)) {
case 2:
if (name[1] == '.') {
+ if (nd->flags & LOOKUP_BENEATH)
+ return -EPERM;
type = LAST_DOTDOT;
nd->flags |= LOOKUP_JUMPED;
}
@@ -2000,6 +2008,8 @@ static const char *path_init(struct nameidata *nd, unsigned flags)

nd->m_seq = read_seqbegin(&mount_lock);
if (*s == '/') {
+ if (flags & LOOKUP_BENEATH)
+ return ERR_PTR(-EPERM);
if (flags & LOOKUP_RCU) {
rcu_read_lock();
set_root_rcu(nd);
diff --git a/fs/open.c b/fs/open.c
index e33dab287fa0..29208cd307f7 100644
--- a/fs/open.c
+++ b/fs/open.c
@@ -917,7 +917,7 @@ static inline int build_open_flags(int flags, umode_t mode, struct open_flags *o
* If we have O_PATH in the open flag. Then we
* cannot have anything other than the below set of flags
*/
- flags &= O_DIRECTORY | O_NOFOLLOW | O_PATH;
+ flags &= O_DIRECTORY | O_NOFOLLOW | O_PATH | O_BENEATH;
acc_mode = 0;
} else {
acc_mode = MAY_OPEN | ACC_MODE(flags);
@@ -948,6 +948,8 @@ static inline int build_open_flags(int flags, umode_t mode, struct open_flags *o
lookup_flags |= LOOKUP_DIRECTORY;
if (!(flags & O_NOFOLLOW))
lookup_flags |= LOOKUP_FOLLOW;
+ if (flags & O_BENEATH)
+ lookup_flags |= LOOKUP_BENEATH;
op->lookup_flags = lookup_flags;
return 0;
}
diff --git a/fs/proc/base.c b/fs/proc/base.c
index aa50d1ac28fc..281f7a8d8060 100644
--- a/fs/proc/base.c
+++ b/fs/proc/base.c
@@ -1589,7 +1589,9 @@ static const char *proc_pid_follow_link(struct dentry *dentry, void **cookie)
if (error)
goto out;

- nd_jump_link(&path);
+ error = nd_jump_link(&path);
+ if (error)
+ goto out;
return NULL;
out:
return ERR_PTR(error);
diff --git a/fs/proc/namespaces.c b/fs/proc/namespaces.c
index f6e8354b8cea..beb5aa3ad38c 100644
--- a/fs/proc/namespaces.c
+++ b/fs/proc/namespaces.c
@@ -36,6 +36,7 @@ static const char *proc_ns_follow_link(struct dentry *dentry, void **cookie)
const struct proc_ns_operations *ns_ops = PROC_I(inode)->ns_ops;
struct task_struct *task;
struct path ns_path;
+ int err;
void *error = ERR_PTR(-EACCES);

task = get_proc_task(inode);
@@ -44,8 +45,11 @@ static const char *proc_ns_follow_link(struct dentry *dentry, void **cookie)

if (ptrace_may_access(task, PTRACE_MODE_READ)) {
error = ns_get_path(&ns_path, task, ns_ops);
- if (!error)
- nd_jump_link(&ns_path);
+ if (!error) {
+ err = nd_jump_link(&ns_path);
+ if (err)
+ error = ERR_PTR(err);
+ }
}
put_task_struct(task);
return error;
diff --git a/include/linux/namei.h b/include/linux/namei.h
index d8c6334cd150..a5a262c85e49 100644
--- a/include/linux/namei.h
+++ b/include/linux/namei.h
@@ -27,6 +27,7 @@ enum {LAST_NORM, LAST_ROOT, LAST_DOT, LAST_DOTDOT, LAST_BIND};
#define LOOKUP_FOLLOW 0x0001
#define LOOKUP_DIRECTORY 0x0002
#define LOOKUP_AUTOMOUNT 0x0004
+#define LOOKUP_BENEATH 0x0008

#define LOOKUP_PARENT 0x0010
#define LOOKUP_REVAL 0x0020
@@ -85,7 +86,7 @@ extern int follow_up(struct path *);
extern struct dentry *lock_rename(struct dentry *, struct dentry *);
extern void unlock_rename(struct dentry *, struct dentry *);

-extern void nd_jump_link(struct path *path);
+extern int nd_jump_link(struct path *path);

static inline void nd_terminate_link(void *name, size_t len, size_t maxlen)
{
diff --git a/include/uapi/asm-generic/fcntl.h b/include/uapi/asm-generic/fcntl.h
index e063effe0cc1..4542bc6a2950 100644
--- a/include/uapi/asm-generic/fcntl.h
+++ b/include/uapi/asm-generic/fcntl.h
@@ -92,6 +92,10 @@
#define O_TMPFILE (__O_TMPFILE | O_DIRECTORY)
#define O_TMPFILE_MASK (__O_TMPFILE | O_DIRECTORY | O_CREAT)

+#ifndef O_BENEATH
+#define O_BENEATH 040000000 /* no / or .. in openat path */
+#endif
+
#ifndef O_NDELAY
#define O_NDELAY O_NONBLOCK
#endif
--
1.9.1

2015-08-13 09:33:32

by David Drysdale

[permalink] [raw]
Subject: [PATCHv4 2/3] selftests: Add test of O_BENEATH & openat(2)

Add simple tests of openat(2) variations, including examples that
check the new O_BENEATH flag.

Signed-off-by: David Drysdale <[email protected]>
---
tools/testing/selftests/Makefile | 1 +
tools/testing/selftests/openat/.gitignore | 4 +
tools/testing/selftests/openat/Makefile | 29 ++++
tools/testing/selftests/openat/openat.c | 258 ++++++++++++++++++++++++++++++
4 files changed, 292 insertions(+)
create mode 100644 tools/testing/selftests/openat/.gitignore
create mode 100644 tools/testing/selftests/openat/Makefile
create mode 100644 tools/testing/selftests/openat/openat.c

diff --git a/tools/testing/selftests/Makefile b/tools/testing/selftests/Makefile
index 24ae9e829e9a..606f8df5c8aa 100644
--- a/tools/testing/selftests/Makefile
+++ b/tools/testing/selftests/Makefile
@@ -11,6 +11,7 @@ TARGETS += memory-hotplug
TARGETS += mount
TARGETS += mqueue
TARGETS += net
+TARGETS += openat
TARGETS += powerpc
TARGETS += ptrace
TARGETS += seccomp
diff --git a/tools/testing/selftests/openat/.gitignore b/tools/testing/selftests/openat/.gitignore
new file mode 100644
index 000000000000..835b2dcd8678
--- /dev/null
+++ b/tools/testing/selftests/openat/.gitignore
@@ -0,0 +1,4 @@
+openat
+subdir
+topfile
+symlinkdown
\ No newline at end of file
diff --git a/tools/testing/selftests/openat/Makefile b/tools/testing/selftests/openat/Makefile
new file mode 100644
index 000000000000..73f80428b6a5
--- /dev/null
+++ b/tools/testing/selftests/openat/Makefile
@@ -0,0 +1,29 @@
+CFLAGS = -Wall
+BINARIES = openat
+DEPS = subdir topfile symlinkdown subdir/bottomfile subdir/symlinkup subdir/symlinkout subdir/symlinkin
+all: $(BINARIES) $(DEPS)
+
+subdir:
+ mkdir -p subdir
+topfile:
+ echo 0123456789 > $@
+subdir/bottomfile: | subdir
+ echo 0123456789 > $@
+subdir/symlinkup: | subdir
+ ln -s ../topfile $@
+subdir/symlinkout: | subdir
+ ln -s /etc/passwd $@
+subdir/symlinkin: | subdir
+ ln -s bottomfile $@
+symlinkdown:
+ ln -s subdir/bottomfile $@
+%: %.c
+ $(CC) $(CFLAGS) -o $@ $^
+
+TEST_PROGS := openat
+TEST_FILES := $(DEPS)
+
+include ../lib.mk
+
+clean:
+ rm -rf $(BINARIES) $(DEPS)
diff --git a/tools/testing/selftests/openat/openat.c b/tools/testing/selftests/openat/openat.c
new file mode 100644
index 000000000000..bc2c4b02a091
--- /dev/null
+++ b/tools/testing/selftests/openat/openat.c
@@ -0,0 +1,258 @@
+#include <sys/types.h>
+#include <sys/stat.h>
+#include <sys/syscall.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <unistd.h>
+#include <string.h>
+#include <errno.h>
+
+#include <linux/fcntl.h>
+
+/* Bypass glibc */
+static int openat_(int dirfd, const char *pathname, int flags)
+{
+ return syscall(__NR_openat, dirfd, pathname, flags);
+}
+static int open_(const char *pathname, int flags)
+{
+ return syscall(__NR_open, pathname, flags);
+}
+
+static int openat_or_die(int dfd, const char *path, int flags)
+{
+ int fd = openat_(dfd, path, flags);
+
+ if (fd < 0) {
+ printf("Failed to openat(%d, '%s'); "
+ "check prerequisites are available\n", dfd, path);
+ exit(1);
+ }
+ return fd;
+}
+
+static int check_fd(int fd)
+{
+ int rc;
+ struct stat info;
+ char buffer[4];
+
+ if (fd < 0) {
+ printf("[FAIL]: openat() failed, rc=%d errno=%d (%s)\n",
+ fd, errno, strerror(errno));
+ return 1;
+ }
+ if (fstat(fd, &info) != 0) {
+ printf("[FAIL]: fstat() failed, rc=%d errno=%d (%s)\n",
+ fd, errno, strerror(errno));
+ return 1;
+ }
+ if (!S_ISDIR(info.st_mode)) {
+ errno = 0;
+ rc = read(fd, buffer, sizeof(buffer));
+ if (rc < 0) {
+ printf("[FAIL]: read() failed, rc=%d errno=%d (%s)\n",
+ rc, errno, strerror(errno));
+ return 1;
+ }
+ }
+ close(fd);
+ printf("[OK]\n");
+ return 0;
+}
+
+static int check_openat(int dfd, const char *path, int flags)
+{
+ int fd;
+
+ errno = 0;
+ printf("Check success of openat(%d, '%s', %x)... ",
+ dfd, path?:"(null)", flags);
+ fd = openat_(dfd, path, flags);
+ return check_fd(fd);
+}
+
+static int check_open(const char *path, int flags)
+{
+ int fd;
+
+ errno = 0;
+ printf("Check success of open('%s', %x)... ", path?:"(null)", flags);
+ fd = open_(path, flags);
+ return check_fd(fd);
+}
+
+static int check_fail(int rc, int expected_errno, const char *errno_str)
+{
+ if (rc > 0) {
+ printf("[FAIL] (unexpected success from open operation)\n");
+ close(rc);
+ return 1;
+ }
+ if (errno != expected_errno) {
+ printf("[FAIL] (expected errno %d (%s) not %d (%s)\n",
+ expected_errno, strerror(expected_errno),
+ errno, strerror(errno));
+ return 1;
+ }
+ printf("[OK]\n");
+ return 0;
+}
+
+#define check_openat_fail(dfd, path, flags, errno) \
+ _check_openat_fail(dfd, path, flags, errno, #errno)
+static int _check_openat_fail(int dfd, const char *path, int flags,
+ int expected_errno, const char *errno_str)
+{
+ int rc;
+
+ printf("Check failure of openat(%d, '%s', %x) with %s... ",
+ dfd, path?:"(null)", flags, errno_str);
+ errno = 0;
+ rc = openat_(dfd, path, flags);
+ return check_fail(rc, expected_errno, errno_str);
+}
+
+#define check_open_fail(path, flags, errno) \
+ _check_open_fail(path, flags, errno, #errno)
+static int _check_open_fail(const char *path, int flags,
+ int expected_errno, const char *errno_str)
+{
+ int rc;
+
+ printf("Check failure of open('%s', %x) with %s... ",
+ path?:"(null)", flags, errno_str);
+ errno = 0;
+ rc = open_(path, flags);
+ return check_fail(rc, expected_errno, errno_str);
+}
+
+int check_proc(void)
+{
+ int root_dfd = openat_(AT_FDCWD, "/", O_RDONLY);
+ int proc_dfd = openat_(AT_FDCWD, "/proc/self", O_RDONLY);
+ int fail = 0;
+
+ if (proc_dfd < 0) {
+ printf("'/proc/self' unavailable (errno=%d '%s'), skipping\n",
+ errno, strerror(errno));
+ return 0;
+ }
+ fail += check_openat(proc_dfd, "root/etc/passwd", O_RDONLY);
+ fail += check_openat(root_dfd, "proc/self/root/etc/passwd", O_RDONLY);
+#ifdef O_BENEATH
+ fail += check_openat_fail(proc_dfd, "root/etc/passwd",
+ O_RDONLY|O_BENEATH, EPERM);
+ fail += check_openat_fail(root_dfd, "proc/self/root/etc/passwd",
+ O_RDONLY|O_BENEATH, EPERM);
+#endif
+ return fail;
+}
+
+int main(int argc, char *argv[])
+{
+ int fail = 0;
+ int dot_dfd = openat_or_die(AT_FDCWD, ".", O_RDONLY);
+ int subdir_dfd = openat_or_die(AT_FDCWD, "subdir", O_RDONLY);
+ int file_fd = openat_or_die(AT_FDCWD, "topfile", O_RDONLY);
+
+ /* Sanity check normal behavior */
+ fail += check_open("topfile", O_RDONLY);
+ fail += check_open("subdir/bottomfile", O_RDONLY);
+ fail += check_openat(AT_FDCWD, "topfile", O_RDONLY);
+ fail += check_openat(AT_FDCWD, "subdir/bottomfile", O_RDONLY);
+
+ fail += check_openat(dot_dfd, "topfile", O_RDONLY);
+ fail += check_openat(dot_dfd, "subdir/bottomfile", O_RDONLY);
+ fail += check_openat(dot_dfd, "subdir/../topfile", O_RDONLY);
+ fail += check_open("subdir/../topfile", O_RDONLY);
+
+ fail += check_openat(subdir_dfd, "../topfile", O_RDONLY);
+ fail += check_openat(subdir_dfd, "bottomfile", O_RDONLY);
+ fail += check_openat(subdir_dfd, "../subdir/bottomfile", O_RDONLY);
+ fail += check_openat(subdir_dfd, "symlinkup", O_RDONLY);
+ fail += check_openat(subdir_dfd, "symlinkout", O_RDONLY);
+
+ fail += check_open("/etc/passwd", O_RDONLY);
+ fail += check_openat(AT_FDCWD, "/etc/passwd", O_RDONLY);
+ fail += check_openat(dot_dfd, "/etc/passwd", O_RDONLY);
+ fail += check_openat(subdir_dfd, "/etc/passwd", O_RDONLY);
+
+ fail += check_openat_fail(AT_FDCWD, "bogus", O_RDONLY, ENOENT);
+ fail += check_openat_fail(dot_dfd, "bogus", O_RDONLY, ENOENT);
+ fail += check_openat_fail(999, "bogus", O_RDONLY, EBADF);
+ fail += check_openat_fail(file_fd, "bogus", O_RDONLY, ENOTDIR);
+
+#ifdef O_BENEATH
+ /* Test out O_BENEATH */
+ fail += check_open("topfile", O_RDONLY|O_BENEATH);
+ fail += check_open("subdir/bottomfile", O_RDONLY|O_BENEATH);
+ fail += check_openat(AT_FDCWD, "topfile", O_RDONLY|O_BENEATH);
+ fail += check_openat(AT_FDCWD, "subdir/bottomfile",
+ O_RDONLY|O_BENEATH);
+
+ fail += check_openat(dot_dfd, "topfile", O_RDONLY|O_BENEATH);
+ fail += check_openat(dot_dfd, "subdir/bottomfile",
+ O_RDONLY|O_BENEATH);
+ fail += check_openat(dot_dfd, "subdir///bottomfile",
+ O_RDONLY|O_BENEATH);
+ fail += check_openat(subdir_dfd, "bottomfile", O_RDONLY|O_BENEATH);
+ fail += check_openat(subdir_dfd, "./bottomfile", O_RDONLY|O_BENEATH);
+ fail += check_openat(subdir_dfd, ".", O_RDONLY|O_BENEATH);
+
+ /* Symlinks without .. or leading / are OK */
+ fail += check_open("symlinkdown", O_RDONLY|O_BENEATH);
+ fail += check_open("subdir/symlinkin", O_RDONLY|O_BENEATH);
+ fail += check_openat(dot_dfd, "symlinkdown", O_RDONLY|O_BENEATH);
+ fail += check_openat(dot_dfd, "subdir/symlinkin", O_RDONLY|O_BENEATH);
+ fail += check_openat(subdir_dfd, "symlinkin", O_RDONLY|O_BENEATH);
+ /* ... unless of course we specify O_NOFOLLOW */
+ fail += check_open_fail("symlinkdown",
+ O_RDONLY|O_BENEATH|O_NOFOLLOW, ELOOP);
+ fail += check_open_fail("subdir/symlinkin",
+ O_RDONLY|O_BENEATH|O_NOFOLLOW, ELOOP);
+ fail += check_openat_fail(dot_dfd, "symlinkdown",
+ O_RDONLY|O_BENEATH|O_NOFOLLOW, ELOOP);
+ fail += check_openat_fail(dot_dfd, "subdir/symlinkin",
+ O_RDONLY|O_BENEATH|O_NOFOLLOW, ELOOP);
+ fail += check_openat_fail(subdir_dfd, "symlinkin",
+ O_RDONLY|O_BENEATH|O_NOFOLLOW, ELOOP);
+
+ /* Can't open paths with ".." in them */
+ fail += check_open_fail("subdir/../topfile",
+ O_RDONLY|O_BENEATH, EPERM);
+ fail += check_openat_fail(dot_dfd, "subdir/../topfile",
+ O_RDONLY|O_BENEATH, EPERM);
+ fail += check_openat_fail(subdir_dfd, "../topfile",
+ O_RDONLY|O_BENEATH, EPERM);
+ fail += check_openat_fail(subdir_dfd, "../subdir/bottomfile",
+ O_RDONLY|O_BENEATH, EPERM);
+ fail += check_openat_fail(subdir_dfd, "..", O_RDONLY|O_BENEATH, EPERM);
+
+ /* Can't open paths starting with "/" */
+ fail += check_open_fail("/etc/passwd", O_RDONLY|O_BENEATH, EPERM);
+ fail += check_openat_fail(AT_FDCWD, "/etc/passwd",
+ O_RDONLY|O_BENEATH, EPERM);
+ fail += check_openat_fail(dot_dfd, "/etc/passwd",
+ O_RDONLY|O_BENEATH, EPERM);
+ fail += check_openat_fail(subdir_dfd, "/etc/passwd",
+ O_RDONLY|O_BENEATH, EPERM);
+ /* Can't sneak around constraints with symlinks */
+ fail += check_openat_fail(subdir_dfd, "symlinkup",
+ O_RDONLY|O_BENEATH, EPERM);
+ fail += check_openat_fail(subdir_dfd, "symlinkout",
+ O_RDONLY|O_BENEATH, EPERM);
+ fail += check_openat_fail(subdir_dfd, "../symlinkdown",
+ O_RDONLY|O_BENEATH, EPERM);
+ fail += check_openat_fail(dot_dfd, "subdir/symlinkup",
+ O_RDONLY|O_BENEATH, EPERM);
+ fail += check_open_fail("subdir/symlinkup", O_RDONLY|O_BENEATH, EPERM);
+#else
+ printf("Skipping O_BENEATH tests due to missing #define\n");
+#endif
+ fail += check_proc();
+
+ if (fail > 0)
+ printf("%d tests failed\n", fail);
+ return fail;
+}
--
1.9.1

2015-08-13 09:33:09

by David Drysdale

[permalink] [raw]
Subject: [PATCHv4 man-pages 3/3] open.2: describe O_BENEATH flag

Signed-off-by: David Drysdale <[email protected]>
---
man2/open.2 | 44 ++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 44 insertions(+)

diff --git a/man2/open.2 b/man2/open.2
index f49ab3042161..d09511f9ffb0 100644
--- a/man2/open.2
+++ b/man2/open.2
@@ -201,6 +201,43 @@ See
for further details.
See also BUGS, below.
.TP
+.B O_BENEATH " (since Linux 4.??)"
+Ensure that the
+.I pathname
+is beneath the current working directory (for
+.BR open (2))
+or the
+.I dirfd
+(for
+.BR openat (2)).
+If the
+.I pathname
+is absolute or contains a path component of "..", the
+.BR open ()
+fails with the error
+.BR EPERM.
+This occurs even if ".." path component would not actually
+escape the original directory; for example, a
+.I pathname
+of "subdir/../filename" would be rejected.
+Path components that are symbolic links to absolute paths, or that are
+relative paths containing a ".." component, will also cause the
+.BR open ()
+operation to fail with the error
+.BR EPERM.
+
+This feature allows applications to be sure that the opened file is
+within the specified directory, regardless of the original source of the
+.I pathname
+argument.
+Some security-conscious programs may further ensure
+this by imposing a system call filter (with
+.BR seccomp (2))
+that requires this flag for all
+.BR open ()
+operations, so that the program cannot open files outside of
+specified directories even if subverted.
+.TP
.BR O_CLOEXEC " (since Linux 2.6.23)"
.\" NOTE! several other man pages refer to this text
Enable the close-on-exec flag for the new file descriptor.
@@ -1015,6 +1052,13 @@ did not match the owner of the file and the caller was not privileged
The operation was prevented by a file seal; see
.BR fcntl (2).
.TP
+.B EPERM
+The
+.B O_BENEATH
+flag was specified and the
+.I pathname
+was not beneath the relevant directory.
+.TP
.B EROFS
.I pathname
refers to a file on a read-only filesystem and write access was
--
2.5.0.rc2.392.g76e840b

2015-08-13 17:38:44

by Andy Lutomirski

[permalink] [raw]
Subject: Re: [PATCHv4 man-pages 3/3] open.2: describe O_BENEATH flag

On Thu, Aug 13, 2015 at 2:32 AM, David Drysdale <[email protected]> wrote:
> Signed-off-by: David Drysdale <[email protected]>

What's the behavior wrt fcntl(F_GETFL, etc)?

--Andy

Subject: Re: [PATCHv4 man-pages 3/3] open.2: describe O_BENEATH flag

On 13 August 2015 at 19:38, Andy Lutomirski <[email protected]> wrote:
> On Thu, Aug 13, 2015 at 2:32 AM, David Drysdale <[email protected]> wrote:
>> Signed-off-by: David Drysdale <[email protected]>
>
> What's the behavior wrt fcntl(F_GETFL, etc)?

I would presume that O_BENEATH is one of the so-called "file creation
flags". See this paragraph of the DESCRIPTION:

In addition, zero or more file creation flags and file status
flags can be bitwise-or'd in flags. The file creation flags are
O_CLOEXEC, O_CREAT, O_DIRECTORY, O_EXCL, O_NOCTTY, O_NOFOLLOW,
O_TMPFILE, O_TRUNC, and O_TTY_INIT. The file status flags are
all of the remaining flags listed below. The distinction between
these two groups of flags is that the file status flags can be
retrieved and (in some cases) modified; see fcntl(2) for details.

David, presuming this is correct (I can't see how O_BENEATH could be a
"file *status* flag"), your patch should also add O_BENEATH to the
list in that paragraph.

Cheers,

Michael



--
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/

2015-08-14 09:29:31

by David Drysdale

[permalink] [raw]
Subject: Re: [PATCHv4 man-pages 3/3] open.2: describe O_BENEATH flag

On Fri, Aug 14, 2015 at 6:33 AM, Michael Kerrisk (man-pages)
<[email protected]> wrote:
> On 13 August 2015 at 19:38, Andy Lutomirski <[email protected]> wrote:
>> On Thu, Aug 13, 2015 at 2:32 AM, David Drysdale <[email protected]> wrote:
>>> Signed-off-by: David Drysdale <[email protected]>
>>
>> What's the behavior wrt fcntl(F_GETFL, etc)?
>
> I would presume that O_BENEATH is one of the so-called "file creation
> flags". See this paragraph of the DESCRIPTION:
>
> In addition, zero or more file creation flags and file status
> flags can be bitwise-or'd in flags. The file creation flags are
> O_CLOEXEC, O_CREAT, O_DIRECTORY, O_EXCL, O_NOCTTY, O_NOFOLLOW,
> O_TMPFILE, O_TRUNC, and O_TTY_INIT. The file status flags are
> all of the remaining flags listed below. The distinction between
> these two groups of flags is that the file status flags can be
> retrieved and (in some cases) modified; see fcntl(2) for details.
>
> David, presuming this is correct (I can't see how O_BENEATH could be a
> "file *status* flag"), your patch should also add O_BENEATH to the
> list in that paragraph.

Yeah, O_BENEATH makes sense as a file creation flag; I'll add it
to that list -- thanks for spotting.

> Cheers,
>
> Michael
>
>
>
> --
> Michael Kerrisk
> Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
> Linux/UNIX System Programming Training: http://man7.org/training/

2015-08-14 14:18:19

by Andy Lutomirski

[permalink] [raw]
Subject: Re: [PATCHv4 man-pages 3/3] open.2: describe O_BENEATH flag

On Fri, Aug 14, 2015 at 2:29 AM, David Drysdale <[email protected]> wrote:
> On Fri, Aug 14, 2015 at 6:33 AM, Michael Kerrisk (man-pages)
> <[email protected]> wrote:
>> On 13 August 2015 at 19:38, Andy Lutomirski <[email protected]> wrote:
>>> On Thu, Aug 13, 2015 at 2:32 AM, David Drysdale <[email protected]> wrote:
>>>> Signed-off-by: David Drysdale <[email protected]>
>>>
>>> What's the behavior wrt fcntl(F_GETFL, etc)?
>>
>> I would presume that O_BENEATH is one of the so-called "file creation
>> flags". See this paragraph of the DESCRIPTION:
>>
>> In addition, zero or more file creation flags and file status
>> flags can be bitwise-or'd in flags. The file creation flags are
>> O_CLOEXEC, O_CREAT, O_DIRECTORY, O_EXCL, O_NOCTTY, O_NOFOLLOW,
>> O_TMPFILE, O_TRUNC, and O_TTY_INIT. The file status flags are
>> all of the remaining flags listed below. The distinction between
>> these two groups of flags is that the file status flags can be
>> retrieved and (in some cases) modified; see fcntl(2) for details.
>>
>> David, presuming this is correct (I can't see how O_BENEATH could be a
>> "file *status* flag"), your patch should also add O_BENEATH to the
>> list in that paragraph.
>
> Yeah, O_BENEATH makes sense as a file creation flag; I'll add it
> to that list -- thanks for spotting.

Should there be a test that you can't clear O_BENEATH with F_SETFL?

--Andy

2015-08-14 15:31:01

by David Drysdale

[permalink] [raw]
Subject: Re: [PATCHv4 man-pages 3/3] open.2: describe O_BENEATH flag

On Fri, Aug 14, 2015 at 3:17 PM, Andy Lutomirski <[email protected]> wrote:
> On Fri, Aug 14, 2015 at 2:29 AM, David Drysdale <[email protected]> wrote:
>> On Fri, Aug 14, 2015 at 6:33 AM, Michael Kerrisk (man-pages)
>> <[email protected]> wrote:
>>> On 13 August 2015 at 19:38, Andy Lutomirski <[email protected]> wrote:
>>>> On Thu, Aug 13, 2015 at 2:32 AM, David Drysdale <[email protected]> wrote:
>>>>> Signed-off-by: David Drysdale <[email protected]>
>>>>
>>>> What's the behavior wrt fcntl(F_GETFL, etc)?
>>>
>>> I would presume that O_BENEATH is one of the so-called "file creation
>>> flags". See this paragraph of the DESCRIPTION:
>>>
>>> In addition, zero or more file creation flags and file status
>>> flags can be bitwise-or'd in flags. The file creation flags are
>>> O_CLOEXEC, O_CREAT, O_DIRECTORY, O_EXCL, O_NOCTTY, O_NOFOLLOW,
>>> O_TMPFILE, O_TRUNC, and O_TTY_INIT. The file status flags are
>>> all of the remaining flags listed below. The distinction between
>>> these two groups of flags is that the file status flags can be
>>> retrieved and (in some cases) modified; see fcntl(2) for details.
>>>
>>> David, presuming this is correct (I can't see how O_BENEATH could be a
>>> "file *status* flag"), your patch should also add O_BENEATH to the
>>> list in that paragraph.
>>
>> Yeah, O_BENEATH makes sense as a file creation flag; I'll add it
>> to that list -- thanks for spotting.
>
> Should there be a test that you can't clear O_BENEATH with F_SETFL?
>
> --Andy

I'll add a test that fcntl(F_SETFL) silently ignores the file creation flags,
including O_BENEATH.