This change adds a new O_BENEATH flag for openat(2) which restricts the
provided path, rejecting (with -EACCES) paths that are not beneath
the provided dfd.
This change was previously included as part of a larger patchset
(https://lkml.org/lkml/2014/7/25/426) for Capsicum support; however, it
is potentially useful as an independent change so I've pulled it out
separately here.
In particular, various folks from Chrome[OS] have indicated an interest
in having this functionality.
Changes since the version included in the Capsicum v2 patchset:
- Add tests of normal symlinks
- Fix man-page typo
- Update patch to 3.17
Changes from v1 to v2 of Capsicum patchset:
- renamed O_BENEATH_ONLY to O_BENEATH [Christoph Hellwig]
David Drysdale (2):
fs: add O_BENEATH flag to openat(2)
selftests: Add test of O_BENEATH & openat(2)
arch/alpha/include/uapi/asm/fcntl.h | 1 +
arch/parisc/include/uapi/asm/fcntl.h | 1 +
arch/sparc/include/uapi/asm/fcntl.h | 1 +
fs/fcntl.c | 5 +-
fs/namei.c | 43 ++++++---
fs/open.c | 4 +-
include/linux/namei.h | 1 +
include/uapi/asm-generic/fcntl.h | 4 +
tools/testing/selftests/Makefile | 1 +
tools/testing/selftests/openat/.gitignore | 3 +
tools/testing/selftests/openat/Makefile | 24 +++++
tools/testing/selftests/openat/openat.c | 149 ++++++++++++++++++++++++++++++
12 files changed, 220 insertions(+), 17 deletions(-)
create mode 100644 tools/testing/selftests/openat/.gitignore
create mode 100644 tools/testing/selftests/openat/Makefile
create mode 100644 tools/testing/selftests/openat/openat.c
--
2.1.0.rc2.206.gedb03e5
Signed-off-by: David Drysdale <[email protected]>
---
man2/open.2 | 31 ++++++++++++++++++++++++++++++-
1 file changed, 30 insertions(+), 1 deletion(-)
diff --git a/man2/open.2 b/man2/open.2
index abc3c35b8b3a..495c7f1e81a4 100644
--- a/man2/open.2
+++ b/man2/open.2
@@ -716,6 +716,31 @@ XFS support was added
.\" commit ab29743117f9f4c22ac44c13c1647fb24fb2bafe
in Linux 3.15.
.TP
+.B O_BENEATH " (since Linux 3.??)"
+Ensure that the
+.I pathname
+is beneath the current working directory (for
+.BR open (2))
+or the
+.I dirfd
+(for
+.BR openat (2)).
+If the
+.I pathname
+is absolute or contains a path component of "..", the
+.BR open ()
+fails with the error
+.BR EACCES.
+This occurs even if ".." path component would not actually
+escape the original directory; for example, a
+.I pathname
+of "subdir/../filename" would be rejected.
+Path components that are symbolic links to absolute paths, or that are
+relative paths containing a ".." component, will also cause the
+.BR open ()
+operation to fail with the error
+.BR EACCES.
+.TP
.B O_TRUNC
If the file already exists and is a regular file and the access mode allows
writing (i.e., is
@@ -792,7 +817,11 @@ The requested access to the file is not allowed, or search permission
is denied for one of the directories in the path prefix of
.IR pathname ,
or the file did not exist yet and write access to the parent directory
-is not allowed.
+is not allowed, or the
+.B O_BENEATH
+flag was specified and the
+.I pathname
+was not beneath the relevant directory.
(See also
.BR path_resolution (7).)
.TP
--
2.1.0.rc2.206.gedb03e5
Add simple tests of openat(2) variations, including examples that
check the new O_BENEATH flag.
Signed-off-by: David Drysdale <[email protected]>
---
tools/testing/selftests/Makefile | 1 +
tools/testing/selftests/openat/.gitignore | 4 +
tools/testing/selftests/openat/Makefile | 28 ++++++
tools/testing/selftests/openat/openat.c | 161 ++++++++++++++++++++++++++++++
4 files changed, 194 insertions(+)
create mode 100644 tools/testing/selftests/openat/.gitignore
create mode 100644 tools/testing/selftests/openat/Makefile
create mode 100644 tools/testing/selftests/openat/openat.c
diff --git a/tools/testing/selftests/Makefile b/tools/testing/selftests/Makefile
index 36ff2e4c7b6f..812e973233d2 100644
--- a/tools/testing/selftests/Makefile
+++ b/tools/testing/selftests/Makefile
@@ -14,6 +14,7 @@ TARGETS += powerpc
TARGETS += user
TARGETS += sysctl
TARGETS += firmware
+TARGETS += openat
TARGETS_HOTPLUG = cpu-hotplug
TARGETS_HOTPLUG += memory-hotplug
diff --git a/tools/testing/selftests/openat/.gitignore b/tools/testing/selftests/openat/.gitignore
new file mode 100644
index 000000000000..835b2dcd8678
--- /dev/null
+++ b/tools/testing/selftests/openat/.gitignore
@@ -0,0 +1,4 @@
+openat
+subdir
+topfile
+symlinkdown
\ No newline at end of file
diff --git a/tools/testing/selftests/openat/Makefile b/tools/testing/selftests/openat/Makefile
new file mode 100644
index 000000000000..84cd06e7ee82
--- /dev/null
+++ b/tools/testing/selftests/openat/Makefile
@@ -0,0 +1,28 @@
+CC = $(CROSS_COMPILE)gcc
+CFLAGS = -Wall
+BINARIES = openat
+DEPS = subdir topfile symlinkdown subdir/bottomfile subdir/symlinkup subdir/symlinkout subdir/symlinkin
+all: $(BINARIES) $(DEPS)
+
+subdir:
+ mkdir -p subdir
+topfile:
+ echo 0123456789 > $@
+subdir/bottomfile: | subdir
+ echo 0123456789 > $@
+subdir/symlinkup: | subdir
+ ln -s ../topfile $@
+subdir/symlinkout: | subdir
+ ln -s /etc/passwd $@
+subdir/symlinkin: | subdir
+ ln -s bottomfile $@
+symlinkdown:
+ ln -s subdir/bottomfile $@
+%: %.c
+ $(CC) $(CFLAGS) -o $@ $^
+
+run_tests: all
+ ./openat
+
+clean:
+ rm -rf $(BINARIES) $(DEPS)
diff --git a/tools/testing/selftests/openat/openat.c b/tools/testing/selftests/openat/openat.c
new file mode 100644
index 000000000000..61acfb53442e
--- /dev/null
+++ b/tools/testing/selftests/openat/openat.c
@@ -0,0 +1,161 @@
+#include <sys/types.h>
+#include <sys/stat.h>
+#include <sys/syscall.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <unistd.h>
+#include <string.h>
+#include <errno.h>
+
+#include <linux/fcntl.h>
+
+/* Bypass glibc */
+static int openat_(int dirfd, const char *pathname, int flags)
+{
+ return syscall(__NR_openat, dirfd, pathname, flags);
+}
+
+static int openat_or_die(int dfd, const char *path, int flags)
+{
+ int fd = openat_(dfd, path, flags);
+
+ if (fd < 0) {
+ printf("Failed to openat(%d, '%s'); "
+ "check prerequisites are available\n", dfd, path);
+ exit(1);
+ }
+ return fd;
+}
+
+static int check_openat(int dfd, const char *path, int flags)
+{
+ int rc;
+ int fd;
+ char buffer[4];
+
+ errno = 0;
+ printf("Check success of openat(%d, '%s', %x)... ",
+ dfd, path?:"(null)", flags);
+ fd = openat_(dfd, path, flags);
+ if (fd < 0) {
+ printf("[FAIL]: openat() failed, rc=%d errno=%d (%s)\n",
+ fd, errno, strerror(errno));
+ return 1;
+ }
+ errno = 0;
+ rc = read(fd, buffer, sizeof(buffer));
+ if (rc < 0) {
+ printf("[FAIL]: read() failed, rc=%d errno=%d (%s)\n",
+ rc, errno, strerror(errno));
+ return 1;
+ }
+ close(fd);
+ printf("[OK]\n");
+ return 0;
+}
+
+#define check_openat_fail(dfd, path, flags, errno) \
+ _check_openat_fail(dfd, path, flags, errno, #errno)
+static int _check_openat_fail(int dfd, const char *path, int flags,
+ int expected_errno, const char *errno_str)
+{
+ int rc;
+
+ errno = 0;
+ printf("Check failure of openat(%d, '%s', %x) with %s... ",
+ dfd, path?:"(null)", flags, errno_str);
+ rc = openat_(dfd, path, flags);
+ if (rc > 0) {
+ printf("[FAIL] (unexpected success from openat(2))\n");
+ close(rc);
+ return 1;
+ }
+ if (errno != expected_errno) {
+ printf("[FAIL] (expected errno %d (%s) not %d (%s)\n",
+ expected_errno, strerror(expected_errno),
+ errno, strerror(errno));
+ return 1;
+ }
+ printf("[OK]\n");
+ return 0;
+}
+
+int main(int argc, char *argv[])
+{
+ int fail = 0;
+ int dot_dfd = openat_or_die(AT_FDCWD, ".", O_RDONLY);
+ int subdir_dfd = openat_or_die(AT_FDCWD, "subdir", O_RDONLY);
+ int file_fd = openat_or_die(AT_FDCWD, "topfile", O_RDONLY);
+
+ /* Sanity check normal behavior */
+ fail |= check_openat(AT_FDCWD, "topfile", O_RDONLY);
+ fail |= check_openat(AT_FDCWD, "subdir/bottomfile", O_RDONLY);
+
+ fail |= check_openat(dot_dfd, "topfile", O_RDONLY);
+ fail |= check_openat(dot_dfd, "subdir/bottomfile", O_RDONLY);
+ fail |= check_openat(dot_dfd, "subdir/../topfile", O_RDONLY);
+
+ fail |= check_openat(subdir_dfd, "../topfile", O_RDONLY);
+ fail |= check_openat(subdir_dfd, "bottomfile", O_RDONLY);
+ fail |= check_openat(subdir_dfd, "../subdir/bottomfile", O_RDONLY);
+ fail |= check_openat(subdir_dfd, "symlinkup", O_RDONLY);
+ fail |= check_openat(subdir_dfd, "symlinkout", O_RDONLY);
+
+ fail |= check_openat(AT_FDCWD, "/etc/passwd", O_RDONLY);
+ fail |= check_openat(dot_dfd, "/etc/passwd", O_RDONLY);
+ fail |= check_openat(subdir_dfd, "/etc/passwd", O_RDONLY);
+
+ fail |= check_openat_fail(AT_FDCWD, "bogus", O_RDONLY, ENOENT);
+ fail |= check_openat_fail(dot_dfd, "bogus", O_RDONLY, ENOENT);
+ fail |= check_openat_fail(999, "bogus", O_RDONLY, EBADF);
+ fail |= check_openat_fail(file_fd, "bogus", O_RDONLY, ENOTDIR);
+
+#ifdef O_BENEATH
+ /* Test out O_BENEATH */
+ fail |= check_openat(AT_FDCWD, "topfile", O_RDONLY|O_BENEATH);
+ fail |= check_openat(AT_FDCWD, "subdir/bottomfile",
+ O_RDONLY|O_BENEATH);
+
+ fail |= check_openat(dot_dfd, "topfile", O_RDONLY|O_BENEATH);
+ fail |= check_openat(dot_dfd, "subdir/bottomfile",
+ O_RDONLY|O_BENEATH);
+ fail |= check_openat(subdir_dfd, "bottomfile", O_RDONLY|O_BENEATH);
+
+ /* Symlinks without .. or leading / are OK */
+ fail |= check_openat(dot_dfd, "symlinkdown", O_RDONLY|O_BENEATH);
+ fail |= check_openat(dot_dfd, "subdir/symlinkin", O_RDONLY|O_BENEATH);
+ fail |= check_openat(subdir_dfd, "symlinkin", O_RDONLY|O_BENEATH);
+ /* ... unless of course we specify O_NOFOLLOW */
+ fail |= check_openat_fail(dot_dfd, "symlinkdown",
+ O_RDONLY|O_BENEATH|O_NOFOLLOW, ELOOP);
+ fail |= check_openat_fail(dot_dfd, "subdir/symlinkin",
+ O_RDONLY|O_BENEATH|O_NOFOLLOW, ELOOP);
+ fail |= check_openat_fail(subdir_dfd, "symlinkin",
+ O_RDONLY|O_BENEATH|O_NOFOLLOW, ELOOP);
+
+ /* Can't open paths with ".." in them */
+ fail |= check_openat_fail(dot_dfd, "subdir/../topfile",
+ O_RDONLY|O_BENEATH, EACCES);
+ fail |= check_openat_fail(subdir_dfd, "../topfile",
+ O_RDONLY|O_BENEATH, EACCES);
+ fail |= check_openat_fail(subdir_dfd, "../subdir/bottomfile",
+ O_RDONLY|O_BENEATH, EACCES);
+
+ /* Can't open paths starting with "/" */
+ fail |= check_openat_fail(AT_FDCWD, "/etc/passwd",
+ O_RDONLY|O_BENEATH, EACCES);
+ fail |= check_openat_fail(dot_dfd, "/etc/passwd",
+ O_RDONLY|O_BENEATH, EACCES);
+ fail |= check_openat_fail(subdir_dfd, "/etc/passwd",
+ O_RDONLY|O_BENEATH, EACCES);
+ /* Can't sneak around constraints with symlinks */
+ fail |= check_openat_fail(subdir_dfd, "symlinkup",
+ O_RDONLY|O_BENEATH, EACCES);
+ fail |= check_openat_fail(subdir_dfd, "symlinkout",
+ O_RDONLY|O_BENEATH, EACCES);
+#else
+ printf("Skipping O_BENEATH tests due to missing #define\n");
+#endif
+
+ return fail ? -1 : 0;
+}
--
2.1.0.rc2.206.gedb03e5
Add a new O_BENEATH flag for openat(2) which restricts the
provided path, rejecting (with -EACCES) paths that are not beneath
the provided dfd. In particular, reject:
- paths that contain .. components
- paths that begin with /
- symlinks that have paths as above.
Signed-off-by: David Drysdale <[email protected]>
---
arch/alpha/include/uapi/asm/fcntl.h | 1 +
arch/parisc/include/uapi/asm/fcntl.h | 1 +
arch/sparc/include/uapi/asm/fcntl.h | 1 +
fs/fcntl.c | 5 +++--
fs/namei.c | 43 ++++++++++++++++++++++++------------
fs/open.c | 4 +++-
include/linux/namei.h | 1 +
include/uapi/asm-generic/fcntl.h | 4 ++++
8 files changed, 43 insertions(+), 17 deletions(-)
diff --git a/arch/alpha/include/uapi/asm/fcntl.h b/arch/alpha/include/uapi/asm/fcntl.h
index 09f49a6b87d1..76a87038d2c1 100644
--- a/arch/alpha/include/uapi/asm/fcntl.h
+++ b/arch/alpha/include/uapi/asm/fcntl.h
@@ -33,6 +33,7 @@
#define O_PATH 040000000
#define __O_TMPFILE 0100000000
+#define O_BENEATH 0200000000 /* no / or .. in openat path */
#define F_GETLK 7
#define F_SETLK 8
diff --git a/arch/parisc/include/uapi/asm/fcntl.h b/arch/parisc/include/uapi/asm/fcntl.h
index 34a46cbc76ed..3adadf72f929 100644
--- a/arch/parisc/include/uapi/asm/fcntl.h
+++ b/arch/parisc/include/uapi/asm/fcntl.h
@@ -21,6 +21,7 @@
#define O_PATH 020000000
#define __O_TMPFILE 040000000
+#define O_BENEATH 080000000 /* no / or .. in openat path */
#define F_GETLK64 8
#define F_SETLK64 9
diff --git a/arch/sparc/include/uapi/asm/fcntl.h b/arch/sparc/include/uapi/asm/fcntl.h
index 7e8ace5bf760..ea38f0bd6cec 100644
--- a/arch/sparc/include/uapi/asm/fcntl.h
+++ b/arch/sparc/include/uapi/asm/fcntl.h
@@ -36,6 +36,7 @@
#define O_PATH 0x1000000
#define __O_TMPFILE 0x2000000
+#define O_BENEATH 0x4000000 /* no / or .. in openat path */
#define F_GETOWN 5 /* for sockets. */
#define F_SETOWN 6 /* for sockets. */
diff --git a/fs/fcntl.c b/fs/fcntl.c
index 22d1c3df61ac..c07a32efc34b 100644
--- a/fs/fcntl.c
+++ b/fs/fcntl.c
@@ -747,14 +747,15 @@ static int __init fcntl_init(void)
* Exceptions: O_NONBLOCK is a two bit define on parisc; O_NDELAY
* is defined as O_NONBLOCK on some platforms and not on others.
*/
- BUILD_BUG_ON(20 - 1 /* for O_RDONLY being 0 */ != HWEIGHT32(
+ BUILD_BUG_ON(21 - 1 /* for O_RDONLY being 0 */ != HWEIGHT32(
O_RDONLY | O_WRONLY | O_RDWR |
O_CREAT | O_EXCL | O_NOCTTY |
O_TRUNC | O_APPEND | /* O_NONBLOCK | */
__O_SYNC | O_DSYNC | FASYNC |
O_DIRECT | O_LARGEFILE | O_DIRECTORY |
O_NOFOLLOW | O_NOATIME | O_CLOEXEC |
- __FMODE_EXEC | O_PATH | __O_TMPFILE
+ __FMODE_EXEC | O_PATH | __O_TMPFILE |
+ O_BENEATH
));
fasync_cache = kmem_cache_create("fasync_cache",
diff --git a/fs/namei.c b/fs/namei.c
index a7b05bf82d31..2fd547014b6b 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -647,7 +647,7 @@ static __always_inline void set_root(struct nameidata *nd)
get_fs_root(current->fs, &nd->root);
}
-static int link_path_walk(const char *, struct nameidata *);
+static int link_path_walk(const char *, struct nameidata *, unsigned int);
static __always_inline unsigned set_root_rcu(struct nameidata *nd)
{
@@ -819,7 +819,8 @@ static int may_linkat(struct path *link)
}
static __always_inline int
-follow_link(struct path *link, struct nameidata *nd, void **p)
+follow_link(struct path *link, struct nameidata *nd, unsigned int flags,
+ void **p)
{
struct dentry *dentry = link->dentry;
int error;
@@ -867,7 +868,7 @@ follow_link(struct path *link, struct nameidata *nd, void **p)
nd->flags |= LOOKUP_JUMPED;
}
nd->inode = nd->path.dentry->d_inode;
- error = link_path_walk(s, nd);
+ error = link_path_walk(s, nd, flags);
if (unlikely(error))
put_link(nd, link, *p);
}
@@ -1585,7 +1586,8 @@ out_err:
* Without that kind of total limit, nasty chains of consecutive
* symlinks can cause almost arbitrarily long lookups.
*/
-static inline int nested_symlink(struct path *path, struct nameidata *nd)
+static inline int nested_symlink(struct path *path, struct nameidata *nd,
+ unsigned int flags)
{
int res;
@@ -1603,7 +1605,7 @@ static inline int nested_symlink(struct path *path, struct nameidata *nd)
struct path link = *path;
void *cookie;
- res = follow_link(&link, nd, &cookie);
+ res = follow_link(&link, nd, flags, &cookie);
if (res)
break;
res = walk_component(nd, path, LOOKUP_FOLLOW);
@@ -1739,13 +1741,19 @@ static inline u64 hash_name(const char *name)
* Returns 0 and nd will have valid dentry and mnt on success.
* Returns error and drops reference to input namei data on failure.
*/
-static int link_path_walk(const char *name, struct nameidata *nd)
+static int link_path_walk(const char *name, struct nameidata *nd,
+ unsigned int flags)
{
struct path next;
int err;
- while (*name=='/')
+ while (*name == '/') {
+ if (flags & LOOKUP_BENEATH) {
+ err = -EACCES;
+ goto exit;
+ }
name++;
+ }
if (!*name)
return 0;
@@ -1764,6 +1772,10 @@ static int link_path_walk(const char *name, struct nameidata *nd)
if (name[0] == '.') switch (hashlen_len(hash_len)) {
case 2:
if (name[1] == '.') {
+ if (flags & LOOKUP_BENEATH) {
+ err = -EACCES;
+ goto exit;
+ }
type = LAST_DOTDOT;
nd->flags |= LOOKUP_JUMPED;
}
@@ -1806,7 +1818,7 @@ static int link_path_walk(const char *name, struct nameidata *nd)
return err;
if (err) {
- err = nested_symlink(&next, nd);
+ err = nested_symlink(&next, nd, flags);
if (err)
return err;
}
@@ -1815,6 +1827,7 @@ static int link_path_walk(const char *name, struct nameidata *nd)
break;
}
}
+exit:
terminate_walk(nd);
return err;
}
@@ -1853,6 +1866,8 @@ static int path_init(int dfd, const char *name, unsigned int flags,
nd->m_seq = read_seqbegin(&mount_lock);
if (*name=='/') {
+ if (flags & LOOKUP_BENEATH)
+ return -EACCES;
if (flags & LOOKUP_RCU) {
rcu_read_lock();
nd->seq = set_root_rcu(nd);
@@ -1953,7 +1968,7 @@ static int path_lookupat(int dfd, const char *name,
return err;
current->total_link_count = 0;
- err = link_path_walk(name, nd);
+ err = link_path_walk(name, nd, flags);
if (!err && !(flags & LOOKUP_PARENT)) {
err = lookup_last(nd, &path);
@@ -1964,7 +1979,7 @@ static int path_lookupat(int dfd, const char *name,
if (unlikely(err))
break;
nd->flags |= LOOKUP_PARENT;
- err = follow_link(&link, nd, &cookie);
+ err = follow_link(&link, nd, flags, &cookie);
if (err)
break;
err = lookup_last(nd, &path);
@@ -2304,7 +2319,7 @@ path_mountpoint(int dfd, const char *name, struct path *path, unsigned int flags
return err;
current->total_link_count = 0;
- err = link_path_walk(name, &nd);
+ err = link_path_walk(name, &nd, flags);
if (err)
goto out;
@@ -2316,7 +2331,7 @@ path_mountpoint(int dfd, const char *name, struct path *path, unsigned int flags
if (unlikely(err))
break;
nd.flags |= LOOKUP_PARENT;
- err = follow_link(&link, &nd, &cookie);
+ err = follow_link(&link, &nd, flags, &cookie);
if (err)
break;
err = mountpoint_last(&nd, path);
@@ -3202,7 +3217,7 @@ static struct file *path_openat(int dfd, struct filename *pathname,
goto out;
current->total_link_count = 0;
- error = link_path_walk(pathname->name, nd);
+ error = link_path_walk(pathname->name, nd, flags);
if (unlikely(error))
goto out;
@@ -3221,7 +3236,7 @@ static struct file *path_openat(int dfd, struct filename *pathname,
break;
nd->flags |= LOOKUP_PARENT;
nd->flags &= ~(LOOKUP_OPEN|LOOKUP_CREATE|LOOKUP_EXCL);
- error = follow_link(&link, nd, &cookie);
+ error = follow_link(&link, nd, flags, &cookie);
if (unlikely(error))
break;
error = do_last(nd, &path, file, op, &opened, pathname);
diff --git a/fs/open.c b/fs/open.c
index d6fd3acde134..8afca5b87a0b 100644
--- a/fs/open.c
+++ b/fs/open.c
@@ -874,7 +874,7 @@ static inline int build_open_flags(int flags, umode_t mode, struct open_flags *o
* If we have O_PATH in the open flag. Then we
* cannot have anything other than the below set of flags
*/
- flags &= O_DIRECTORY | O_NOFOLLOW | O_PATH;
+ flags &= O_DIRECTORY | O_NOFOLLOW | O_PATH | O_BENEATH;
acc_mode = 0;
} else {
acc_mode = MAY_OPEN | ACC_MODE(flags);
@@ -905,6 +905,8 @@ static inline int build_open_flags(int flags, umode_t mode, struct open_flags *o
lookup_flags |= LOOKUP_DIRECTORY;
if (!(flags & O_NOFOLLOW))
lookup_flags |= LOOKUP_FOLLOW;
+ if (flags & O_BENEATH)
+ lookup_flags |= LOOKUP_BENEATH;
op->lookup_flags = lookup_flags;
return 0;
}
diff --git a/include/linux/namei.h b/include/linux/namei.h
index 492de72560fa..bd0615d1143b 100644
--- a/include/linux/namei.h
+++ b/include/linux/namei.h
@@ -39,6 +39,7 @@ enum {LAST_NORM, LAST_ROOT, LAST_DOT, LAST_DOTDOT, LAST_BIND};
#define LOOKUP_FOLLOW 0x0001
#define LOOKUP_DIRECTORY 0x0002
#define LOOKUP_AUTOMOUNT 0x0004
+#define LOOKUP_BENEATH 0x0008
#define LOOKUP_PARENT 0x0010
#define LOOKUP_REVAL 0x0020
diff --git a/include/uapi/asm-generic/fcntl.h b/include/uapi/asm-generic/fcntl.h
index 7543b3e51331..f63aa749a4fb 100644
--- a/include/uapi/asm-generic/fcntl.h
+++ b/include/uapi/asm-generic/fcntl.h
@@ -92,6 +92,10 @@
#define O_TMPFILE (__O_TMPFILE | O_DIRECTORY)
#define O_TMPFILE_MASK (__O_TMPFILE | O_DIRECTORY | O_CREAT)
+#ifndef O_BENEATH
+#define O_BENEATH 040000000 /* no / or .. in openat path */
+#endif
+
#ifndef O_NDELAY
#define O_NDELAY O_NONBLOCK
#endif
--
2.1.0.rc2.206.gedb03e5
On 03/11/2014 12:48, David Drysdale wrote:
> +.I pathname
> +is beneath the current working directory (for
> +.BR open (2))
> +or the
> +.I dirfd
> +(for
> +.BR openat (2)).
> +If the
> +.I pathname
> +is absolute or contains a path component of "..", the
> +.BR open ()
> +fails with the error
> +.BR EACCES.
> +This occurs even if ".." path component would not actually
> +escape the original directory; for example, a
> +.I pathname
> +of "subdir/../filename" would be rejected.
> +Path components that are symbolic links to absolute paths, or that are
> +relative paths containing a ".." component, will also cause the
> +.BR open ()
> +operation to fail with the error
> +.BR EACCES.
I wonder if EPERM is more appropriate than EACCES.
Apart from this, the patches look fine.
Paolo
On Mon, Nov 03, 2014 at 11:48:23AM +0000, David Drysdale wrote:
> Add a new O_BENEATH flag for openat(2) which restricts the
> provided path, rejecting (with -EACCES) paths that are not beneath
> the provided dfd. In particular, reject:
> - paths that contain .. components
> - paths that begin with /
> - symlinks that have paths as above.
Yecch... The degree of usefulness aside (and I'm not convinced that it
is non-zero), WTF pass one bit out of nameidata->flags in a separate argument?
Through the mutual recursion, no less... And then you are not even attempting
to detect symlinks that are not followed by interpretation of _any_ pathname.
On Mon, Nov 3, 2014 at 7:20 AM, Al Viro <[email protected]> wrote:
> On Mon, Nov 03, 2014 at 11:48:23AM +0000, David Drysdale wrote:
>> Add a new O_BENEATH flag for openat(2) which restricts the
>> provided path, rejecting (with -EACCES) paths that are not beneath
>> the provided dfd. In particular, reject:
>> - paths that contain .. components
>> - paths that begin with /
>> - symlinks that have paths as above.
>
> Yecch... The degree of usefulness aside (and I'm not convinced that it
> is non-zero),
This is extremely useful in conjunction with seccomp.
> WTF pass one bit out of nameidata->flags in a separate argument?
> Through the mutual recursion, no less... And then you are not even attempting
> to detect symlinks that are not followed by interpretation of _any_ pathname.
How many symlinks like that are there? Is there anything except
nd_jump_link users? All of those are in /proc. Arguably O_BENEATH
should prevent traversal of all of those links.
--Andy
On November 3, 2014 7:42:58 AM PST, Andy Lutomirski <[email protected]> wrote:
>On Mon, Nov 3, 2014 at 7:20 AM, Al Viro <[email protected]>
>wrote:
>> On Mon, Nov 03, 2014 at 11:48:23AM +0000, David Drysdale wrote:
>>> Add a new O_BENEATH flag for openat(2) which restricts the
>>> provided path, rejecting (with -EACCES) paths that are not beneath
>>> the provided dfd. In particular, reject:
>>> - paths that contain .. components
>>> - paths that begin with /
>>> - symlinks that have paths as above.
>>
>> Yecch... The degree of usefulness aside (and I'm not convinced that
>it
>> is non-zero),
>
>This is extremely useful in conjunction with seccomp.
>
>> WTF pass one bit out of nameidata->flags in a separate argument?
>> Through the mutual recursion, no less... And then you are not even
>attempting
>> to detect symlinks that are not followed by interpretation of _any_
>pathname.
>
>How many symlinks like that are there? Is there anything except
>nd_jump_link users? All of those are in /proc. Arguably O_BENEATH
>should prevent traversal of all of those links.
Not commenting on the sanity of this one way or another, and I haven't read the patch. There is an absolutely trivial implementation of this.
After the path is resolved, walk backwards along d_parent and the mount tree, and see if you come to the file or directory dfd refers to.
That can handle magic proc symlinks, and does not need to disallow .. or / explicitly so it should be much simpler code.
My gut says that if Al says blech when looking at your code it is too complex to give you a security guarantee.
Eric
On Mon, Nov 3, 2014 at 3:42 PM, Andy Lutomirski <[email protected]> wrote:
> On Mon, Nov 3, 2014 at 7:20 AM, Al Viro <[email protected]> wrote:
>> On Mon, Nov 03, 2014 at 11:48:23AM +0000, David Drysdale wrote:
>>> Add a new O_BENEATH flag for openat(2) which restricts the
>>> provided path, rejecting (with -EACCES) paths that are not beneath
>>> the provided dfd. In particular, reject:
>>> - paths that contain .. components
>>> - paths that begin with /
>>> - symlinks that have paths as above.
>>
>> Yecch... The degree of usefulness aside (and I'm not convinced that it
>> is non-zero),
>
> This is extremely useful in conjunction with seccomp.
Yes, that was my understanding of how the Chrome[OS] folk wanted
to use it.
>> WTF pass one bit out of nameidata->flags in a separate argument?
I'll shift to using nd->flags; not sure what I was thinking of there.
(It *might* have made more sense in the full patchset this was extracted
from but it certainly doesn't look sensible in this narrower context.)
>> Through the mutual recursion, no less... And then you are not even attempting
>> to detect symlinks that are not followed by interpretation of _any_ pathname.
>
> How many symlinks like that are there? Is there anything except
> nd_jump_link users? All of those are in /proc. Arguably O_BENEATH
> should prevent traversal of all of those links.
>
> --Andy
On a quick search, the 2 users of nd_jump_link (namely proc_pid_follow_link
and proc_ns_follow_link) seem to be the only implementations of
inode_operations->follow_link that don't just call nd_set_link(). So
disallowing that for O_BENEATH might give sensible behaviour.
On Mon, Nov 3, 2014 at 9:37 AM, David Drysdale <[email protected]> wrote:
> On Mon, Nov 3, 2014 at 3:42 PM, Andy Lutomirski <[email protected]> wrote:
>> This is extremely useful in conjunction with seccomp.
>
> Yes, that was my understanding of how the Chrome[OS] folk wanted
> to use it.
Yes, exactly. Without this, if we want to give a sandboxed process A
access to a directory, we need to:
1. Create a new 'broker" process B
2. Make sure to have an IPC channel between A and B.
3. SIGSYS open() and openat() in A via seccomp-bpf
4. Have an async-signal-safe handler that can IPC open / openat.
There is a lot of hidden complexity in such a set-up. For instance, if
you need to prevent contention, the number of threads in the broker B
should scale automatically.
This is 'fine' (but undesirable) for a big beast such as Chromium
which needs such a complex set-ups anyways, but David's patch would
make it a lot easier to build a sandbox and whitelist directories for
everyone, simply by enforcing O_BENEATH in seccomp and whitelisting
open directory file descriptors in the sandboxed process.
Julien
On Mon, Nov 3, 2014 at 10:25 AM, Julien Tinnes <[email protected]> wrote:
> On Mon, Nov 3, 2014 at 9:37 AM, David Drysdale <[email protected]> wrote:
>>
>> On Mon, Nov 3, 2014 at 3:42 PM, Andy Lutomirski <[email protected]>
>> wrote:
>> > On Mon, Nov 3, 2014 at 7:20 AM, Al Viro <[email protected]> wrote:
>> >> On Mon, Nov 03, 2014 at 11:48:23AM +0000, David Drysdale wrote:
>> >>> Add a new O_BENEATH flag for openat(2) which restricts the
>> >>> provided path, rejecting (with -EACCES) paths that are not beneath
>> >>> the provided dfd. In particular, reject:
>> >>> - paths that contain .. components
>> >>> - paths that begin with /
>> >>> - symlinks that have paths as above.
>> >>
>> >> Yecch... The degree of usefulness aside (and I'm not convinced that it
>> >> is non-zero),
>> >
>> > This is extremely useful in conjunction with seccomp.
>>
>> Yes, that was my understanding of how the Chrome[OS] folk wanted
>> to use it.
>
>
> Yes, exactly. Without this, if we want to give a sandboxed process A access
> to a directory, we need to:
> 1. Create a new 'broker" process B
> 2. Make sure to have an IPC channel between A and B.
> 3. SIGSYS open() and openat() in A via seccomp-bpf
> 4. Have an async-signal-safe handler that can IPC open / openat.
You can do this with user namespaces, too. But this is way more
complicated than it should be, and it has a lot more overhead.
--Andy
>
> There is a lot of hidden complexity in such a set-up. For instance, if you
> need to prevent contention, the number of threads in the broker B should
> scale automatically.
>
> This is 'fine' (but undesirable) for a big beast such as Chromium which
> needs such a complex set-ups anyways, but David's patch would make it a lot
> easier to build a sandbox and whitelist directories for everyone, simply by
> enforcing O_BENEATH in seccomp and whitelisting open directory file
> descriptors in the sandboxed process.
>
> Julien
--
Andy Lutomirski
AMA Capital Management, LLC
On Mon, Nov 3, 2014 at 5:22 PM, Eric W.Biederman <[email protected]> wrote:
> On November 3, 2014 7:42:58 AM PST, Andy Lutomirski <[email protected]> wrote:
>>On Mon, Nov 3, 2014 at 7:20 AM, Al Viro <[email protected]>
>>wrote:
>>> On Mon, Nov 03, 2014 at 11:48:23AM +0000, David Drysdale wrote:
>>>> Add a new O_BENEATH flag for openat(2) which restricts the
>>>> provided path, rejecting (with -EACCES) paths that are not beneath
>>>> the provided dfd. In particular, reject:
>>>> - paths that contain .. components
>>>> - paths that begin with /
>>>> - symlinks that have paths as above.
>>>
>>> Yecch... The degree of usefulness aside (and I'm not convinced that
>>it
>>> is non-zero),
>>
>>This is extremely useful in conjunction with seccomp.
>>
>>> WTF pass one bit out of nameidata->flags in a separate argument?
>>> Through the mutual recursion, no less... And then you are not even
>>attempting
>>> to detect symlinks that are not followed by interpretation of _any_
>>pathname.
>>
>>How many symlinks like that are there? Is there anything except
>>nd_jump_link users? All of those are in /proc. Arguably O_BENEATH
>>should prevent traversal of all of those links.
>
> Not commenting on the sanity of this one way or another, and I haven't read the patch. There is an absolutely trivial implementation of this.
>
> After the path is resolved, walk backwards along d_parent and the mount tree, and see if you come to the file or directory dfd refers to.
>
> That can handle magic proc symlinks, and does not need to disallow .. or / explicitly so it should be much simpler code.
>
> My gut says that if Al says blech when looking at your code it is too complex to give you a security guarantee.
>
> Eric
Well, the 'yecch' was deserved for the unnecessary duplication of the
flags. Without that, the patch looks much simpler -- I'll send out a v2
with those changes for discussion, and think about your alternative
implementation suggestion (thanks!) separately.
On Tue, Nov 4, 2014 at 9:40 AM, David Drysdale <[email protected]> wrote:
> On Mon, Nov 3, 2014 at 5:22 PM, Eric W.Biederman <[email protected]> wrote:
>> On November 3, 2014 7:42:58 AM PST, Andy Lutomirski <[email protected]> wrote:
>>>On Mon, Nov 3, 2014 at 7:20 AM, Al Viro <[email protected]>
>>>wrote:
>>>> On Mon, Nov 03, 2014 at 11:48:23AM +0000, David Drysdale wrote:
>>>>> Add a new O_BENEATH flag for openat(2) which restricts the
>>>>> provided path, rejecting (with -EACCES) paths that are not beneath
>>>>> the provided dfd. In particular, reject:
>>>>> - paths that contain .. components
>>>>> - paths that begin with /
>>>>> - symlinks that have paths as above.
>>>>
>>>> Yecch... The degree of usefulness aside (and I'm not convinced that
>>>it
>>>> is non-zero),
>>>
>>>This is extremely useful in conjunction with seccomp.
>>>
>>>> WTF pass one bit out of nameidata->flags in a separate argument?
>>>> Through the mutual recursion, no less... And then you are not even
>>>attempting
>>>> to detect symlinks that are not followed by interpretation of _any_
>>>pathname.
>>>
>>>How many symlinks like that are there? Is there anything except
>>>nd_jump_link users? All of those are in /proc. Arguably O_BENEATH
>>>should prevent traversal of all of those links.
>>
>> Not commenting on the sanity of this one way or another, and I haven't read the patch. There is an absolutely trivial implementation of this.
>>
>> After the path is resolved, walk backwards along d_parent and the mount tree, and see if you come to the file or directory dfd refers to.
>>
>> That can handle magic proc symlinks, and does not need to disallow .. or / explicitly so it should be much simpler code.
>>
>> My gut says that if Al says blech when looking at your code it is too complex to give you a security guarantee.
>>
>> Eric
>
> Well, the 'yecch' was deserved for the unnecessary duplication of the
> flags. Without that, the patch looks much simpler -- I'll send out a v2
> with those changes for discussion, and think about your alternative
> implementation suggestion (thanks!) separately.
One concern with the "walk upwards and see if you get back where you
started" approach -- it will allow use of a symlink that lives outside the
original directory, but which points back inside it. That's going to be
slightly surprising behaviour for users, and I worry that there's the
potential for unexpected information leakage from it.
(BTW, size-wise my initial naive implementation of the walk-upward
approach is only marginally smaller than the v2 patch.)
On Wed, Nov 5, 2014 at 9:21 AM, David Drysdale <[email protected]> wrote:
> On Tue, Nov 4, 2014 at 9:40 AM, David Drysdale <[email protected]> wrote:
>> On Mon, Nov 3, 2014 at 5:22 PM, Eric W.Biederman <[email protected]> wrote:
>>> On November 3, 2014 7:42:58 AM PST, Andy Lutomirski <[email protected]> wrote:
>>>>On Mon, Nov 3, 2014 at 7:20 AM, Al Viro <[email protected]>
>>>>wrote:
>>>>> On Mon, Nov 03, 2014 at 11:48:23AM +0000, David Drysdale wrote:
>>>>>> Add a new O_BENEATH flag for openat(2) which restricts the
>>>>>> provided path, rejecting (with -EACCES) paths that are not beneath
>>>>>> the provided dfd. In particular, reject:
>>>>>> - paths that contain .. components
>>>>>> - paths that begin with /
>>>>>> - symlinks that have paths as above.
>>>>>
>>>>> Yecch... The degree of usefulness aside (and I'm not convinced that
>>>>it
>>>>> is non-zero),
>>>>
>>>>This is extremely useful in conjunction with seccomp.
>>>>
>>>>> WTF pass one bit out of nameidata->flags in a separate argument?
>>>>> Through the mutual recursion, no less... And then you are not even
>>>>attempting
>>>>> to detect symlinks that are not followed by interpretation of _any_
>>>>pathname.
>>>>
>>>>How many symlinks like that are there? Is there anything except
>>>>nd_jump_link users? All of those are in /proc. Arguably O_BENEATH
>>>>should prevent traversal of all of those links.
>>>
>>> Not commenting on the sanity of this one way or another, and I haven't read the patch. There is an absolutely trivial implementation of this.
>>>
>>> After the path is resolved, walk backwards along d_parent and the mount tree, and see if you come to the file or directory dfd refers to.
>>>
>>> That can handle magic proc symlinks, and does not need to disallow .. or / explicitly so it should be much simpler code.
>>>
>>> My gut says that if Al says blech when looking at your code it is too complex to give you a security guarantee.
>>>
>>> Eric
>>
>> Well, the 'yecch' was deserved for the unnecessary duplication of the
>> flags. Without that, the patch looks much simpler -- I'll send out a v2
>> with those changes for discussion, and think about your alternative
>> implementation suggestion (thanks!) separately.
>
> One concern with the "walk upwards and see if you get back where you
> started" approach -- it will allow use of a symlink that lives outside the
> original directory, but which points back inside it. That's going to be
> slightly surprising behaviour for users, and I worry that there's the
> potential for unexpected information leakage from it.
>
> (BTW, size-wise my initial naive implementation of the walk-upward
> approach is only marginally smaller than the v2 patch.)
It has another problem. Since we still haven't fixed the eternal
/proc/PID/fd-doesn't-respect-file-mode issue, you can have a read-only
fd somewhere and reopen it read-write using O_BENEATH on a different
fd.
For example:
fd 3 points to /sandbox
/sandbox/blocked has mode 0700 and isn't owned by us
/sandbox/blocked/foo has mode 0666
fd 4 points to /sandbox/blocked/foo, O_RDONLY
openat(3, "/proc/self/fd/4", O_RDWR | O_BENEATH) will get a read-write
descriptor pointing at /sandbox/blocked/foo, which should have been
impossible.
Also, I really don't like the information leak. The result of asking
a server for "/home/victim/compromising-directory/../../www/index.html"
should not reveal whether compromising-directory exists.
--Andy