2021-02-02 23:09:19

by Mickaël Salaün

Subject: [PATCH v28 00/12] Landlock LSM

Hi,

This patch series fixes a corner-case with non-overlapping access rights
coming from different layers. This is now handled in a generic way and
verified with new tests. A stricter check is enforced for
landlock_add_rule(2) to forbid useless rules. Finally, the previous
landlock_enforce_ruleset_self(2) is renamed to
landlock_restrict_self(2), which is more consistent.

The SLOC count is 1314 for security/landlock/ and 2484 for
tools/testing/selftests/landlock/ . Test coverage for security/landlock/
is 94.7% of lines. The code not covered only deals with internal kernel
errors (e.g. memory allocation) and race conditions. This series is
being fuzzed by syzkaller, and patches are on their way:
https://github.com/google/syzkaller/pull/2380

The compiled documentation is available here:
https://landlock.io/linux-doc/landlock-v28/userspace-api/landlock.html

This series can be applied on top of v5.11-rc6 . This can be tested
with CONFIG_SECURITY_LANDLOCK, CONFIG_SAMPLE_LANDLOCK and by prepending
"landlock," to CONFIG_LSM. This patch series can be found in a Git
repository here:
https://github.com/landlock-lsm/linux/commits/landlock-v28
This patch series seems ready for upstream and I would really appreciate
final reviews.
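
For reference, the kernel configuration described above could look like
the following fragment (the CONFIG_LSM value is only illustrative: the
"landlock," prefix is what matters, the rest of the list depends on the
LSMs already enabled in your configuration):

  CONFIG_SECURITY_LANDLOCK=y
  CONFIG_SAMPLE_LANDLOCK=y
  CONFIG_LSM="landlock,yama,selinux"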


# Landlock LSM

The goal of Landlock is to enable restriction of ambient rights (e.g.
global filesystem access) for a set of processes. Because Landlock is a
stackable LSM [1], it makes it possible to create safe security
sandboxes as new security layers in addition to the existing
system-wide access-controls. This kind of sandbox is expected to help
mitigate the security impact of bugs or unexpected/malicious behaviors
in user-space applications. Landlock empowers any process, including
unprivileged ones, to securely restrict itself.

Landlock is inspired by seccomp-bpf but, instead of filtering syscalls
and their raw arguments, a Landlock rule can restrict the use of kernel
objects like file hierarchies, according to the kernel semantics.
Landlock also takes inspiration from other OS sandbox mechanisms: XNU
Sandbox, FreeBSD Capsicum and OpenBSD Pledge/Unveil.

In its current form, Landlock is missing some access-control features.
This keeps the patch series small and eases review. The series still
addresses multiple use cases, especially with the combined use of
seccomp-bpf: applications with built-in sandboxing, init systems,
security sandbox tools and security-oriented APIs [2].

Previous version:
https://lore.kernel.org/lkml/[email protected]/

[1] https://lore.kernel.org/lkml/[email protected]/
[2] https://lore.kernel.org/lkml/[email protected]/


Casey Schaufler (1):
LSM: Infrastructure management of the superblock

Mickaël Salaün (11):
landlock: Add object management
landlock: Add ruleset and domain management
landlock: Set up the security framework and manage credentials
landlock: Add ptrace restrictions
fs,security: Add sb_delete hook
landlock: Support filesystem access-control
landlock: Add syscall implementations
arch: Wire up Landlock syscalls
selftests/landlock: Add user space tests
samples/landlock: Add a sandbox manager example
landlock: Add user and kernel documentation

Documentation/security/index.rst | 1 +
Documentation/security/landlock.rst | 79 +
Documentation/userspace-api/index.rst | 1 +
Documentation/userspace-api/landlock.rst | 307 ++
MAINTAINERS | 15 +
arch/Kconfig | 7 +
arch/alpha/kernel/syscalls/syscall.tbl | 3 +
arch/arm/tools/syscall.tbl | 3 +
arch/arm64/include/asm/unistd.h | 2 +-
arch/arm64/include/asm/unistd32.h | 6 +
arch/ia64/kernel/syscalls/syscall.tbl | 3 +
arch/m68k/kernel/syscalls/syscall.tbl | 3 +
arch/microblaze/kernel/syscalls/syscall.tbl | 3 +
arch/mips/kernel/syscalls/syscall_n32.tbl | 3 +
arch/mips/kernel/syscalls/syscall_n64.tbl | 3 +
arch/mips/kernel/syscalls/syscall_o32.tbl | 3 +
arch/parisc/kernel/syscalls/syscall.tbl | 3 +
arch/powerpc/kernel/syscalls/syscall.tbl | 3 +
arch/s390/kernel/syscalls/syscall.tbl | 3 +
arch/sh/kernel/syscalls/syscall.tbl | 3 +
arch/sparc/kernel/syscalls/syscall.tbl | 3 +
arch/um/Kconfig | 1 +
arch/x86/entry/syscalls/syscall_32.tbl | 3 +
arch/x86/entry/syscalls/syscall_64.tbl | 3 +
arch/xtensa/kernel/syscalls/syscall.tbl | 3 +
fs/super.c | 1 +
include/linux/lsm_hook_defs.h | 1 +
include/linux/lsm_hooks.h | 3 +
include/linux/security.h | 4 +
include/linux/syscalls.h | 7 +
include/uapi/asm-generic/unistd.h | 8 +-
include/uapi/linux/landlock.h | 128 +
kernel/sys_ni.c | 5 +
samples/Kconfig | 7 +
samples/Makefile | 1 +
samples/landlock/.gitignore | 1 +
samples/landlock/Makefile | 13 +
samples/landlock/sandboxer.c | 238 ++
security/Kconfig | 11 +-
security/Makefile | 2 +
security/landlock/Kconfig | 21 +
security/landlock/Makefile | 4 +
security/landlock/common.h | 20 +
security/landlock/cred.c | 46 +
security/landlock/cred.h | 58 +
security/landlock/fs.c | 627 ++++
security/landlock/fs.h | 56 +
security/landlock/limits.h | 21 +
security/landlock/object.c | 67 +
security/landlock/object.h | 91 +
security/landlock/ptrace.c | 120 +
security/landlock/ptrace.h | 14 +
security/landlock/ruleset.c | 473 +++
security/landlock/ruleset.h | 165 +
security/landlock/setup.c | 40 +
security/landlock/setup.h | 18 +
security/landlock/syscalls.c | 444 +++
security/security.c | 51 +-
security/selinux/hooks.c | 58 +-
security/selinux/include/objsec.h | 6 +
security/selinux/ss/services.c | 3 +-
security/smack/smack.h | 6 +
security/smack/smack_lsm.c | 35 +-
tools/testing/selftests/Makefile | 1 +
tools/testing/selftests/landlock/.gitignore | 2 +
tools/testing/selftests/landlock/Makefile | 24 +
tools/testing/selftests/landlock/base_test.c | 219 ++
tools/testing/selftests/landlock/common.h | 169 ++
tools/testing/selftests/landlock/config | 6 +
tools/testing/selftests/landlock/fs_test.c | 2664 +++++++++++++++++
.../testing/selftests/landlock/ptrace_test.c | 314 ++
tools/testing/selftests/landlock/true.c | 5 +
72 files changed, 6668 insertions(+), 77 deletions(-)
create mode 100644 Documentation/security/landlock.rst
create mode 100644 Documentation/userspace-api/landlock.rst
create mode 100644 include/uapi/linux/landlock.h
create mode 100644 samples/landlock/.gitignore
create mode 100644 samples/landlock/Makefile
create mode 100644 samples/landlock/sandboxer.c
create mode 100644 security/landlock/Kconfig
create mode 100644 security/landlock/Makefile
create mode 100644 security/landlock/common.h
create mode 100644 security/landlock/cred.c
create mode 100644 security/landlock/cred.h
create mode 100644 security/landlock/fs.c
create mode 100644 security/landlock/fs.h
create mode 100644 security/landlock/limits.h
create mode 100644 security/landlock/object.c
create mode 100644 security/landlock/object.h
create mode 100644 security/landlock/ptrace.c
create mode 100644 security/landlock/ptrace.h
create mode 100644 security/landlock/ruleset.c
create mode 100644 security/landlock/ruleset.h
create mode 100644 security/landlock/setup.c
create mode 100644 security/landlock/setup.h
create mode 100644 security/landlock/syscalls.c
create mode 100644 tools/testing/selftests/landlock/.gitignore
create mode 100644 tools/testing/selftests/landlock/Makefile
create mode 100644 tools/testing/selftests/landlock/base_test.c
create mode 100644 tools/testing/selftests/landlock/common.h
create mode 100644 tools/testing/selftests/landlock/config
create mode 100644 tools/testing/selftests/landlock/fs_test.c
create mode 100644 tools/testing/selftests/landlock/ptrace_test.c
create mode 100644 tools/testing/selftests/landlock/true.c


base-commit: 1048ba83fb1c00cd24172e23e8263972f6b5d9ac
--
2.30.0


2021-02-02 23:10:22

by Mickaël Salaün

Subject: [PATCH v28 11/12] samples/landlock: Add a sandbox manager example

From: Mickaël Salaün <[email protected]>

Add a basic sandbox tool to launch a command which can only access a
list of file hierarchies in a read-only or read-write way.
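
For example (taken from the tool's built-in help), a shell can be
launched in a restricted environment with:

  LL_FS_RO="/bin:/lib:/usr:/proc:/etc:/dev/urandom" \
  LL_FS_RW="/dev/null:/dev/full:/dev/zero:/dev/pts:/tmp" \
  ./sandboxer bash -i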

Cc: James Morris <[email protected]>
Cc: Kees Cook <[email protected]>
Cc: Serge E. Hallyn <[email protected]>
Signed-off-by: Mickaël Salaün <[email protected]>
Reviewed-by: Jann Horn <[email protected]>
---

Changes since v27:
* Add samples/landlock/ to MAINTAINERS.
* Update landlock_restrict_self(2).
* Tweak Kconfig title and description.

Changes since v25:
* Improve comments and fix help (suggested by Jann Horn).
* Add a safeguard for errno check (suggested by Jann Horn).
* Allows users to not use all possible restrictions (e.g. use LL_FS_RO
without LL_FS_RW).
* Update syscall names.
* Improve Makefile:
- Replace hostprogs/always-y with userprogs-always-y, available since
commit faabed295ccc ("kbuild: introduce hostprogs-always-y and
userprogs-always-y").
- Depends on CC_CAN_LINK.
* Add Reviewed-by Jann Horn.

Changes since v25:
* Remove useless errno set in the syscall wrappers.
* Cosmetic variable renames.

Changes since v23:
* Re-add hints to help users understand the required kernel
configuration. This was removed with the removal of
landlock_get_features(2).

Changes since v21:
* Remove LANDLOCK_ACCESS_FS_CHROOT.
* Clean up help.

Changes since v20:
* Update with new syscalls and type names.
* Update errno check for EOPNOTSUPP.
* Use the full syscall interfaces: explicitly set the "flags" field to
zero.

Changes since v19:
* Update with the new Landlock syscalls.
* Comply with commit 5f2fb52fac15 ("kbuild: rename hostprogs-y/always to
hostprogs/always-y").

Changes since v16:
* Switch syscall attribute pointer and size arguments.

Changes since v15:
* Update access right names.
* Properly assign access right to files according to the new related
syscall restriction.
* Replace "select" with "depends on" HEADERS_INSTALL (suggested by Randy
Dunlap).

Changes since v14:
* Fix Kconfig dependency.
* Remove access rights that may be required for FD-only requests:
mmap, truncate, getattr, lock, chmod, chown, chgrp, ioctl.
* Fix useless hardcoded syscall number.
* Use execvpe().
* Follow symlinks.
* Extend help with common file paths.
* Constify variables.
* Clean up comments.
* Improve error message.

Changes since v11:
* Add back the filesystem sandbox manager and update it to work with the
new Landlock syscall.

Previous changes:
https://lore.kernel.org/lkml/[email protected]/
---
MAINTAINERS | 1 +
samples/Kconfig | 7 ++
samples/Makefile | 1 +
samples/landlock/.gitignore | 1 +
samples/landlock/Makefile | 13 ++
samples/landlock/sandboxer.c | 238 +++++++++++++++++++++++++++++++++++
6 files changed, 261 insertions(+)
create mode 100644 samples/landlock/.gitignore
create mode 100644 samples/landlock/Makefile
create mode 100644 samples/landlock/sandboxer.c

diff --git a/MAINTAINERS b/MAINTAINERS
index 3df7b12dc7f1..cf49d9431439 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -9943,6 +9943,7 @@ S: Supported
W: https://landlock.io
T: git https://github.com/landlock-lsm/linux.git
F: include/uapi/linux/landlock.h
+F: samples/landlock/
F: security/landlock/
F: tools/testing/selftests/landlock/
K: landlock
diff --git a/samples/Kconfig b/samples/Kconfig
index 0ed6e4d71d87..30ad633cd82c 100644
--- a/samples/Kconfig
+++ b/samples/Kconfig
@@ -124,6 +124,13 @@ config SAMPLE_HIDRAW
bool "hidraw sample"
depends on CC_CAN_LINK && HEADERS_INSTALL

+config SAMPLE_LANDLOCK
+ bool "Build Landlock example"
+ depends on CC_CAN_LINK && HEADERS_INSTALL
+ help
+ Build a simple Landlock sandbox manager able to start a process
+ restricted by a user-defined filesystem access control policy.
+
config SAMPLE_PIDFD
bool "pidfd sample"
depends on CC_CAN_LINK && HEADERS_INSTALL
diff --git a/samples/Makefile b/samples/Makefile
index c3392a595e4b..087e0988ccc5 100644
--- a/samples/Makefile
+++ b/samples/Makefile
@@ -11,6 +11,7 @@ obj-$(CONFIG_SAMPLE_KDB) += kdb/
obj-$(CONFIG_SAMPLE_KFIFO) += kfifo/
obj-$(CONFIG_SAMPLE_KOBJECT) += kobject/
obj-$(CONFIG_SAMPLE_KPROBES) += kprobes/
+subdir-$(CONFIG_SAMPLE_LANDLOCK) += landlock
obj-$(CONFIG_SAMPLE_LIVEPATCH) += livepatch/
subdir-$(CONFIG_SAMPLE_PIDFD) += pidfd
obj-$(CONFIG_SAMPLE_QMI_CLIENT) += qmi/
diff --git a/samples/landlock/.gitignore b/samples/landlock/.gitignore
new file mode 100644
index 000000000000..f43668b2d318
--- /dev/null
+++ b/samples/landlock/.gitignore
@@ -0,0 +1 @@
+/sandboxer
diff --git a/samples/landlock/Makefile b/samples/landlock/Makefile
new file mode 100644
index 000000000000..5d601e51c2eb
--- /dev/null
+++ b/samples/landlock/Makefile
@@ -0,0 +1,13 @@
+# SPDX-License-Identifier: BSD-3-Clause
+
+userprogs-always-y := sandboxer
+
+userccflags += -I usr/include
+
+.PHONY: all clean
+
+all:
+ $(MAKE) -C ../.. samples/landlock/
+
+clean:
+ $(MAKE) -C ../.. M=samples/landlock/ clean
diff --git a/samples/landlock/sandboxer.c b/samples/landlock/sandboxer.c
new file mode 100644
index 000000000000..7a15910d2171
--- /dev/null
+++ b/samples/landlock/sandboxer.c
@@ -0,0 +1,238 @@
+// SPDX-License-Identifier: BSD-3-Clause
+/*
+ * Simple Landlock sandbox manager able to launch a process restricted by a
+ * user-defined filesystem access control policy.
+ *
+ * Copyright © 2017-2020 Mickaël Salaün <[email protected]>
+ * Copyright © 2020 ANSSI
+ */
+
+#define _GNU_SOURCE
+#include <errno.h>
+#include <fcntl.h>
+#include <linux/landlock.h>
+#include <linux/prctl.h>
+#include <stddef.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+#include <sys/prctl.h>
+#include <sys/stat.h>
+#include <sys/syscall.h>
+#include <unistd.h>
+
+#ifndef landlock_create_ruleset
+static inline int landlock_create_ruleset(
+ const struct landlock_ruleset_attr *const attr,
+ const size_t size, const __u32 flags)
+{
+ return syscall(__NR_landlock_create_ruleset, attr, size, flags);
+}
+#endif
+
+#ifndef landlock_add_rule
+static inline int landlock_add_rule(const int ruleset_fd,
+ const enum landlock_rule_type rule_type,
+ const void *const rule_attr, const __u32 flags)
+{
+ return syscall(__NR_landlock_add_rule, ruleset_fd, rule_type,
+ rule_attr, flags);
+}
+#endif
+
+#ifndef landlock_restrict_self
+static inline int landlock_restrict_self(const int ruleset_fd,
+ const __u32 flags)
+{
+ return syscall(__NR_landlock_restrict_self, ruleset_fd, flags);
+}
+#endif
+
+#define ENV_FS_RO_NAME "LL_FS_RO"
+#define ENV_FS_RW_NAME "LL_FS_RW"
+#define ENV_PATH_TOKEN ":"
+
+static int parse_path(char *env_path, const char ***const path_list)
+{
+ int i, num_paths = 0;
+
+ if (env_path) {
+ num_paths++;
+ for (i = 0; env_path[i]; i++) {
+ if (env_path[i] == ENV_PATH_TOKEN[0])
+ num_paths++;
+ }
+ }
+ *path_list = malloc(num_paths * sizeof(**path_list));
+ for (i = 0; i < num_paths; i++)
+ (*path_list)[i] = strsep(&env_path, ENV_PATH_TOKEN);
+
+ return num_paths;
+}
+
+#define ACCESS_FILE ( \
+ LANDLOCK_ACCESS_FS_EXECUTE | \
+ LANDLOCK_ACCESS_FS_WRITE_FILE | \
+ LANDLOCK_ACCESS_FS_READ_FILE)
+
+static int populate_ruleset(
+ const char *const env_var, const int ruleset_fd,
+ const __u64 allowed_access)
+{
+ int num_paths, i, ret = 1;
+ char *env_path_name;
+ const char **path_list = NULL;
+ struct landlock_path_beneath_attr path_beneath = {
+ .parent_fd = -1,
+ };
+
+ env_path_name = getenv(env_var);
+ if (!env_path_name) {
+ /* Prevents users to forget a setting. */
+ fprintf(stderr, "Missing environment variable %s\n", env_var);
+ return 1;
+ }
+ env_path_name = strdup(env_path_name);
+ unsetenv(env_var);
+ num_paths = parse_path(env_path_name, &path_list);
+ if (num_paths == 1 && path_list[0][0] == '\0') {
+ /*
+ * Allows to not use all possible restrictions (e.g. use
+ * LL_FS_RO without LL_FS_RW).
+ */
+ ret = 0;
+ goto out_free_name;
+ }
+
+ for (i = 0; i < num_paths; i++) {
+ struct stat statbuf;
+
+ path_beneath.parent_fd = open(path_list[i], O_PATH |
+ O_CLOEXEC);
+ if (path_beneath.parent_fd < 0) {
+ fprintf(stderr, "Failed to open \"%s\": %s\n",
+ path_list[i],
+ strerror(errno));
+ goto out_free_name;
+ }
+ if (fstat(path_beneath.parent_fd, &statbuf)) {
+ close(path_beneath.parent_fd);
+ goto out_free_name;
+ }
+ path_beneath.allowed_access = allowed_access;
+ if (!S_ISDIR(statbuf.st_mode))
+ path_beneath.allowed_access &= ACCESS_FILE;
+ if (landlock_add_rule(ruleset_fd, LANDLOCK_RULE_PATH_BENEATH,
+ &path_beneath, 0)) {
+ fprintf(stderr, "Failed to update the ruleset with \"%s\": %s\n",
+ path_list[i], strerror(errno));
+ close(path_beneath.parent_fd);
+ goto out_free_name;
+ }
+ close(path_beneath.parent_fd);
+ }
+ ret = 0;
+
+out_free_name:
+ free(env_path_name);
+ return ret;
+}
+
+#define ACCESS_FS_ROUGHLY_READ ( \
+ LANDLOCK_ACCESS_FS_EXECUTE | \
+ LANDLOCK_ACCESS_FS_READ_FILE | \
+ LANDLOCK_ACCESS_FS_READ_DIR)
+
+#define ACCESS_FS_ROUGHLY_WRITE ( \
+ LANDLOCK_ACCESS_FS_WRITE_FILE | \
+ LANDLOCK_ACCESS_FS_REMOVE_DIR | \
+ LANDLOCK_ACCESS_FS_REMOVE_FILE | \
+ LANDLOCK_ACCESS_FS_MAKE_CHAR | \
+ LANDLOCK_ACCESS_FS_MAKE_DIR | \
+ LANDLOCK_ACCESS_FS_MAKE_REG | \
+ LANDLOCK_ACCESS_FS_MAKE_SOCK | \
+ LANDLOCK_ACCESS_FS_MAKE_FIFO | \
+ LANDLOCK_ACCESS_FS_MAKE_BLOCK | \
+ LANDLOCK_ACCESS_FS_MAKE_SYM)
+
+int main(const int argc, char *const argv[], char *const *const envp)
+{
+ const char *cmd_path;
+ char *const *cmd_argv;
+ int ruleset_fd;
+ struct landlock_ruleset_attr ruleset_attr = {
+ .handled_access_fs = ACCESS_FS_ROUGHLY_READ |
+ ACCESS_FS_ROUGHLY_WRITE,
+ };
+
+ if (argc < 2) {
+ fprintf(stderr, "usage: %s=\"...\" %s=\"...\" %s <cmd> [args]...\n\n",
+ ENV_FS_RO_NAME, ENV_FS_RW_NAME, argv[0]);
+ fprintf(stderr, "Launch a command in a restricted environment.\n\n");
+ fprintf(stderr, "Environment variables containing paths, "
+ "each separated by a colon:\n");
+ fprintf(stderr, "* %s: list of paths allowed to be used in a read-only way.\n",
+ ENV_FS_RO_NAME);
+ fprintf(stderr, "* %s: list of paths allowed to be used in a read-write way.\n",
+ ENV_FS_RW_NAME);
+ fprintf(stderr, "\nexample:\n"
+ "%s=\"/bin:/lib:/usr:/proc:/etc:/dev/urandom\" "
+ "%s=\"/dev/null:/dev/full:/dev/zero:/dev/pts:/tmp\" "
+ "%s bash -i\n",
+ ENV_FS_RO_NAME, ENV_FS_RW_NAME, argv[0]);
+ return 1;
+ }
+
+ ruleset_fd = landlock_create_ruleset(&ruleset_attr, sizeof(ruleset_attr), 0);
+ if (ruleset_fd < 0) {
+ const int err = errno;
+
+ perror("Failed to create a ruleset");
+ switch (err) {
+ case ENOSYS:
+ fprintf(stderr, "Hint: Landlock is not supported by the current kernel. "
+ "To support it, build the kernel with "
+ "CONFIG_SECURITY_LANDLOCK=y and prepend "
+ "\"landlock,\" to the content of CONFIG_LSM.\n");
+ break;
+ case EOPNOTSUPP:
+ fprintf(stderr, "Hint: Landlock is currently disabled. "
+ "It can be enabled in the kernel configuration by "
+ "prepending \"landlock,\" to the content of CONFIG_LSM, "
+ "or at boot time by setting the same content to the "
+ "\"lsm\" kernel parameter.\n");
+ break;
+ }
+ return 1;
+ }
+ if (populate_ruleset(ENV_FS_RO_NAME, ruleset_fd,
+ ACCESS_FS_ROUGHLY_READ)) {
+ goto err_close_ruleset;
+ }
+ if (populate_ruleset(ENV_FS_RW_NAME, ruleset_fd,
+ ACCESS_FS_ROUGHLY_READ | ACCESS_FS_ROUGHLY_WRITE)) {
+ goto err_close_ruleset;
+ }
+ if (prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0)) {
+ perror("Failed to restrict privileges");
+ goto err_close_ruleset;
+ }
+ if (landlock_restrict_self(ruleset_fd, 0)) {
+ perror("Failed to enforce ruleset");
+ goto err_close_ruleset;
+ }
+ close(ruleset_fd);
+
+ cmd_path = argv[1];
+ cmd_argv = argv + 1;
+ execvpe(cmd_path, cmd_argv, envp);
+ fprintf(stderr, "Failed to execute \"%s\": %s\n", cmd_path,
+ strerror(errno));
+ fprintf(stderr, "Hint: access to the binary, the interpreter or "
+ "shared libraries may be denied.\n");
+ return 1;
+
+err_close_ruleset:
+ close(ruleset_fd);
+ return 1;
+}
--
2.30.0

2021-02-02 23:11:20

by Mickaël Salaün

Subject: [PATCH v28 06/12] fs,security: Add sb_delete hook

From: Mickaël Salaün <[email protected]>

The sb_delete security hook is called when shutting down a superblock,
which may be useful to release kernel objects tied to the superblock's
lifetime (e.g. inodes).

This new hook is needed by Landlock to release (ephemerally) tagged
struct inodes. This comes from the unprivileged nature of Landlock
described in the next commit.
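
As an illustration (not part of this patch), an LSM would typically
implement and register the hook as follows, where the mylsm_* names are
placeholders:

  static void mylsm_sb_delete(struct super_block *const sb)
  {
          /* Release any internal objects referencing inodes of @sb. */
  }

  static struct security_hook_list mylsm_hooks[] __lsm_ro_after_init = {
          LSM_HOOK_INIT(sb_delete, mylsm_sb_delete),
  };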

Cc: Al Viro <[email protected]>
Cc: James Morris <[email protected]>
Cc: Kees Cook <[email protected]>
Cc: Serge E. Hallyn <[email protected]>
Signed-off-by: Mickaël Salaün <[email protected]>
Reviewed-by: Jann Horn <[email protected]>
---

Changes since v22:
* Add Reviewed-by: Jann Horn <[email protected]>

Changes since v17:
* Initial patch to replace the direct call to landlock_release_inodes()
(requested by James Morris).
https://lore.kernel.org/lkml/[email protected]/
---
fs/super.c | 1 +
include/linux/lsm_hook_defs.h | 1 +
include/linux/lsm_hooks.h | 2 ++
include/linux/security.h | 4 ++++
security/security.c | 5 +++++
5 files changed, 13 insertions(+)

diff --git a/fs/super.c b/fs/super.c
index 2c6cdea2ab2d..c3c5178cde65 100644
--- a/fs/super.c
+++ b/fs/super.c
@@ -454,6 +454,7 @@ void generic_shutdown_super(struct super_block *sb)
evict_inodes(sb);
/* only nonzero refcount inodes can have marks */
fsnotify_sb_delete(sb);
+ security_sb_delete(sb);

if (sb->s_dio_done_wq) {
destroy_workqueue(sb->s_dio_done_wq);
diff --git a/include/linux/lsm_hook_defs.h b/include/linux/lsm_hook_defs.h
index 7aaa753b8608..32472b3849bc 100644
--- a/include/linux/lsm_hook_defs.h
+++ b/include/linux/lsm_hook_defs.h
@@ -59,6 +59,7 @@ LSM_HOOK(int, 0, fs_context_dup, struct fs_context *fc,
LSM_HOOK(int, -ENOPARAM, fs_context_parse_param, struct fs_context *fc,
struct fs_parameter *param)
LSM_HOOK(int, 0, sb_alloc_security, struct super_block *sb)
+LSM_HOOK(void, LSM_RET_VOID, sb_delete, struct super_block *sb)
LSM_HOOK(void, LSM_RET_VOID, sb_free_security, struct super_block *sb)
LSM_HOOK(void, LSM_RET_VOID, sb_free_mnt_opts, void *mnt_opts)
LSM_HOOK(int, 0, sb_eat_lsm_opts, char *orig, void **mnt_opts)
diff --git a/include/linux/lsm_hooks.h b/include/linux/lsm_hooks.h
index 970106d98306..e339b201f79b 100644
--- a/include/linux/lsm_hooks.h
+++ b/include/linux/lsm_hooks.h
@@ -108,6 +108,8 @@
* allocated.
* @sb contains the super_block structure to be modified.
* Return 0 if operation was successful.
+ * @sb_delete:
+ * Release objects tied to a superblock (e.g. inodes).
* @sb_free_security:
* Deallocate and clear the sb->s_security field.
* @sb contains the super_block structure to be modified.
diff --git a/include/linux/security.h b/include/linux/security.h
index c35ea0ffccd9..c41a94e29b62 100644
--- a/include/linux/security.h
+++ b/include/linux/security.h
@@ -288,6 +288,7 @@ void security_bprm_committed_creds(struct linux_binprm *bprm);
int security_fs_context_dup(struct fs_context *fc, struct fs_context *src_fc);
int security_fs_context_parse_param(struct fs_context *fc, struct fs_parameter *param);
int security_sb_alloc(struct super_block *sb);
+void security_sb_delete(struct super_block *sb);
void security_sb_free(struct super_block *sb);
void security_free_mnt_opts(void **mnt_opts);
int security_sb_eat_lsm_opts(char *options, void **mnt_opts);
@@ -620,6 +621,9 @@ static inline int security_sb_alloc(struct super_block *sb)
return 0;
}

+static inline void security_sb_delete(struct super_block *sb)
+{ }
+
static inline void security_sb_free(struct super_block *sb)
{ }

diff --git a/security/security.c b/security/security.c
index 9f979d4afe6c..1b4a73b2549a 100644
--- a/security/security.c
+++ b/security/security.c
@@ -900,6 +900,11 @@ int security_sb_alloc(struct super_block *sb)
return rc;
}

+void security_sb_delete(struct super_block *sb)
+{
+ call_void_hook(sb_delete, sb);
+}
+
void security_sb_free(struct super_block *sb)
{
call_void_hook(sb_free_security, sb);
--
2.30.0

2021-02-02 23:13:02

by Mickaël Salaün

Subject: [PATCH v28 08/12] landlock: Add syscall implementations

From: Mickaël Salaün <[email protected]>

These 3 system calls are designed to be used by unprivileged processes
to sandbox themselves:
* landlock_create_ruleset(2): Creates a ruleset and returns its file
descriptor.
* landlock_add_rule(2): Adds a rule (e.g. file hierarchy access) to a
ruleset, identified by the dedicated file descriptor.
* landlock_restrict_self(2): Enforces a ruleset on the calling thread
and its future children (similar to seccomp). This syscall has the
same usage restrictions as seccomp(2): the caller must have the
no_new_privs attribute set or have CAP_SYS_ADMIN in the current user
namespace.

All these syscalls have a "flags" argument (not currently used) to
enable extensibility.
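
As an illustration, and omitting error handling, the expected usage
pattern is roughly the following (using small syscall(2) wrappers as in
samples/landlock/sandboxer.c, with an arbitrary path and arbitrary
access rights):

  struct landlock_ruleset_attr ruleset_attr = {
          .handled_access_fs = LANDLOCK_ACCESS_FS_EXECUTE |
                  LANDLOCK_ACCESS_FS_READ_FILE |
                  LANDLOCK_ACCESS_FS_WRITE_FILE,
  };
  struct landlock_path_beneath_attr path_beneath = {
          .allowed_access = LANDLOCK_ACCESS_FS_EXECUTE |
                  LANDLOCK_ACCESS_FS_READ_FILE,
          .parent_fd = open("/usr", O_PATH | O_CLOEXEC),
  };
  int ruleset_fd;

  ruleset_fd = landlock_create_ruleset(&ruleset_attr,
                  sizeof(ruleset_attr), 0);
  landlock_add_rule(ruleset_fd, LANDLOCK_RULE_PATH_BENEATH,
                  &path_beneath, 0);
  close(path_beneath.parent_fd);
  prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0);
  landlock_restrict_self(ruleset_fd, 0);
  close(ruleset_fd);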

Here are the motivations for these new syscalls:
* A sandboxed process may not have access to file systems, including
/dev, /sys or /proc, but it should still be able to add more
restrictions to itself.
* Neither prctl(2) nor seccomp(2) (which was used in a previous version)
fit well with the current definition of a Landlock security policy.

All passed structs (attributes) are checked at build time to ensure that
they don't contain holes and that they are aligned the same way for each
architecture.

See the user and kernel documentation for more details (provided by a
following commit):
* Documentation/userspace-api/landlock.rst
* Documentation/security/landlock.rst

Cc: Arnd Bergmann <[email protected]>
Cc: James Morris <[email protected]>
Cc: Jann Horn <[email protected]>
Cc: Kees Cook <[email protected]>
Cc: Serge E. Hallyn <[email protected]>
Signed-off-by: Mickaël Salaün <[email protected]>
---

Changes since v27:
* Forbid creation of rules with an empty allowed_access value because
they are now ignored (since v26) in path walks.
* Rename landlock_enforce_ruleset_self(2) to landlock_restrict_self(2):
shorter and consistent with the two other syscalls (i.e. verb + direct
object).
* Update ruleset access check according to the new access stack.
* Improve landlock_add_rule(2) documentation.
* Fix comment.
* Remove Reviewed-by Jann Horn because of the above changes.

Changes since v26:
* Rename landlock_enforce_ruleset_current(2) to
landlock_enforce_ruleset_self(2). "current" makes sense for a kernel
developer, but much less so from a user space developer's standpoint.
"self" is widely used to refer to the current task (e.g. /proc/self).
"current" may refer to temporal properties, which could be added later
to this syscall's flags (cf. /proc/self/attr/{current,exec}).
* Simplify build_check_abi().
* Rename syscall.c to syscalls.c .
* Use less ambiguous comments.
* Fix spelling.

Changes since v25:
* Revert build_check_abi() as non-inline to trigger a warning if it is
not called.
* Use the new limit names.

Changes since v24:
* Add Reviewed-by: Jann Horn <[email protected]>
* Set build_check_abi() as inline.

Changes since v23:
* Rewrite get_ruleset_from_fd() to please the 0-DAY CI Kernel Test
Service that reported an uninitialized variable (false positive):
https://lore.kernel.org/linux-security-module/[email protected]/
Anyway, it is cleaner like this.
* Add a comment about E2BIG which can be returned by
landlock_enforce_ruleset_current(2) when there is no more room for
another stacked ruleset (i.e. domain).

Changes since v22:
* Replace security_capable() with ns_capable_noaudit() (suggested by
Jann Horn) and explicitly return EPERM.
* Fix landlock_enforce_ruleset_current(2)'s out_put_creds (spotted by
Jann Horn).
* Add __always_inline to copy_min_struct_from_user() to make its
BUILD_BUG_ON() checks reliable (suggested by Jann Horn).
* Simplify path assignment in get_path_from_fd() (suggested by Jann
Horn).
* Fix spelling (spotted by Jann Horn).

Changes since v21:
* Fix and improve comments.

Changes since v20:
* Remove two arguments to landlock_enforce_ruleset(2) (requested by Arnd
Bergmann) and rename it to landlock_enforce_ruleset_current(2): remove
the enum landlock_target_type and the target file descriptor (not used
for now). A ruleset can only be enforced on the current thread.
* Remove the size argument in landlock_add_rule() (requested by Arnd
Bergmann).
* Remove landlock_get_features(2) (suggested by Arnd Bergmann).
* Simplify and rename copy_struct_if_any_from_user() to
copy_min_struct_from_user().
* Rename "options" to "flags" to allign with current syscalls.
* Rename some types and variables in a more consistent way.
* Fix missing type declarations in syscalls.h .

Changes since v19:
* Replace the landlock(2) syscall with 4 syscalls (one for each
command): landlock_get_features(2), landlock_create_ruleset(2),
landlock_add_rule(2) and landlock_enforce_ruleset(2) (suggested by
Arnd Bergmann).
https://lore.kernel.org/lkml/[email protected]/
* Return EOPNOTSUPP (instead of ENOPKG) when Landlock is disabled.
* Add two new fields to landlock_attr_features to fit with the new
syscalls: last_rule_type and last_target_type. This enables easy
identification of which types are supported.
* Pack landlock_attr_path_beneath struct because of the removed
ruleset_fd.
* Update documentation and fix spelling.

Changes since v18:
* Remove useless include.
* Remove LLATTR_SIZE() which was only used to shorten lines. Cf. commit
bdc48fa11e46 ("checkpatch/coding-style: deprecate 80-column warning").

Changes since v17:
* Synchronize syscall declaration.
* Fix comment.

Changes since v16:
* Add a size_attr_features field to struct landlock_attr_features for
self-introspection, and move the access_fs field to be more
consistent.
* Replace __aligned_u64 types of attribute fields with __u16, __s32,
__u32 and __u64, and check at build time that these structures do not
contain holes and that they are aligned the same way (8-bit) on
all architectures. This shrinks the size of the userspace ABI, which
may be appreciated especially for struct landlock_attr_features which
could grow a lot in the future. For instance, struct
landlock_attr_features shrinks from 72 bytes to 32 bytes. This change
also enables to remove 64-bits to 32-bits conversion checks.
* Switch syscall attribute pointer and size arguments to follow similar
syscall argument order (e.g. bpf, clone3, openat2).
* Set LANDLOCK_OPT_* types to 32-bits.
* Allow enforcement of empty ruleset, which enables deny-all policies.
* Fix documentation inconsistency.

Changes since v15:
* Do not add file descriptors referring to internal filesystems (e.g.
nsfs) in a ruleset.
* Replace is_user_mountable() with in-place clean checks.
* Replace EBADR with EBADFD in get_ruleset_from_fd() and
get_path_from_fd().
* Remove ruleset's show_fdinfo() for now.

Changes since v14:
* Remove the security_file_open() check in get_path_from_fd(): an
opened FD should not be restricted here, and even less with this hook.
As a result, it is now allowed to add a path to a ruleset even if the
access to this path is not allowed (without O_PATH). This doesn't
change the fact that enforcing a ruleset can't grant any right, only
remove some rights. The new layer levels add more consistent
restrictions.
* Check minimal landlock_attr_* size/content. This fixes the case where
no data was provided and, e.g., FD 0 was interpreted as ruleset_fd.
This now returns -EINVAL.
* Fix credential double-free error case.
* Complete struct landlock_attr_size with size_attr_enforce.
* Fix undefined reference to syscall when Landlock is not selected.
* Remove f.file->f_path.mnt check (suggested by Al Viro).
* Add build-time checks.
* Move ABI checks from fs.c .
* Constify variables.
* Fix spelling.
* Add comments.

Changes since v13:
* New implementation, replacing the dependency on seccomp(2) and bpf(2).
---
include/linux/syscalls.h | 7 +
include/uapi/linux/landlock.h | 53 ++++
kernel/sys_ni.c | 5 +
security/landlock/Makefile | 2 +-
security/landlock/syscalls.c | 444 ++++++++++++++++++++++++++++++++++
5 files changed, 510 insertions(+), 1 deletion(-)
create mode 100644 security/landlock/syscalls.c

diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h
index 7688bc983de5..6918be404b64 100644
--- a/include/linux/syscalls.h
+++ b/include/linux/syscalls.h
@@ -68,6 +68,8 @@ union bpf_attr;
struct io_uring_params;
struct clone_args;
struct open_how;
+struct landlock_ruleset_attr;
+enum landlock_rule_type;

#include <linux/types.h>
#include <linux/aio_abi.h>
@@ -1037,6 +1039,11 @@ asmlinkage long sys_pidfd_send_signal(int pidfd, int sig,
siginfo_t __user *info,
unsigned int flags);
asmlinkage long sys_pidfd_getfd(int pidfd, int fd, unsigned int flags);
+asmlinkage long sys_landlock_create_ruleset(const struct landlock_ruleset_attr __user *attr,
+ size_t size, __u32 flags);
+asmlinkage long sys_landlock_add_rule(int ruleset_fd, enum landlock_rule_type rule_type,
+ const void __user *rule_attr, __u32 flags);
+asmlinkage long sys_landlock_restrict_self(int ruleset_fd, __u32 flags);

/*
* Architecture-specific system calls
diff --git a/include/uapi/linux/landlock.h b/include/uapi/linux/landlock.h
index f69877099c8e..d1fc6af3381e 100644
--- a/include/uapi/linux/landlock.h
+++ b/include/uapi/linux/landlock.h
@@ -9,6 +9,59 @@
#ifndef _UAPI_LINUX_LANDLOCK_H
#define _UAPI_LINUX_LANDLOCK_H

+#include <linux/types.h>
+
+/**
+ * struct landlock_ruleset_attr - Ruleset definition
+ *
+ * Argument of sys_landlock_create_ruleset(). This structure can grow in
+ * future versions.
+ */
+struct landlock_ruleset_attr {
+ /**
+ * @handled_access_fs: Bitmask of actions (cf. `Filesystem flags`_)
+ * that is handled by this ruleset and should then be forbidden if no
+ * rule explicitly allow them. This is needed for backward
+ * compatibility reasons.
+ */
+ __u64 handled_access_fs;
+};
+
+/**
+ * enum landlock_rule_type - Landlock rule type
+ *
+ * Argument of sys_landlock_add_rule().
+ */
+enum landlock_rule_type {
+ /**
+ * @LANDLOCK_RULE_PATH_BENEATH: Type of a &struct
+ * landlock_path_beneath_attr .
+ */
+ LANDLOCK_RULE_PATH_BENEATH = 1,
+};
+
+/**
+ * struct landlock_path_beneath_attr - Path hierarchy definition
+ *
+ * Argument of sys_landlock_add_rule().
+ */
+struct landlock_path_beneath_attr {
+ /**
+ * @allowed_access: Bitmask of allowed actions for this file hierarchy
+ * (cf. `Filesystem flags`_).
+ */
+ __u64 allowed_access;
+ /**
+ * @parent_fd: File descriptor, open with ``O_PATH``, which identifies
+ * the parent directory of a file hierarchy, or just a file.
+ */
+ __s32 parent_fd;
+ /*
+ * This struct is packed to avoid trailing reserved members.
+ * Cf. security/landlock/syscalls.c:build_check_abi()
+ */
+} __attribute__((packed));
+
/**
* DOC: fs_access
*
diff --git a/kernel/sys_ni.c b/kernel/sys_ni.c
index 19aa806890d5..cce430cf2ff2 100644
--- a/kernel/sys_ni.c
+++ b/kernel/sys_ni.c
@@ -266,6 +266,11 @@ COND_SYSCALL(request_key);
COND_SYSCALL(keyctl);
COND_SYSCALL_COMPAT(keyctl);

+/* security/landlock/syscalls.c */
+COND_SYSCALL(landlock_create_ruleset);
+COND_SYSCALL(landlock_add_rule);
+COND_SYSCALL(landlock_restrict_self);
+
/* arch/example/kernel/sys_example.c */

/* mm/fadvise.c */
diff --git a/security/landlock/Makefile b/security/landlock/Makefile
index 92e3d80ab8ed..7bbd2f413b3e 100644
--- a/security/landlock/Makefile
+++ b/security/landlock/Makefile
@@ -1,4 +1,4 @@
obj-$(CONFIG_SECURITY_LANDLOCK) := landlock.o

-landlock-y := setup.o object.o ruleset.o \
+landlock-y := setup.o syscalls.o object.o ruleset.o \
cred.o ptrace.o fs.o
diff --git a/security/landlock/syscalls.c b/security/landlock/syscalls.c
new file mode 100644
index 000000000000..ebb3c126a3c0
--- /dev/null
+++ b/security/landlock/syscalls.c
@@ -0,0 +1,444 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Landlock LSM - System call implementations and user space interfaces
+ *
+ * Copyright © 2016-2020 Mickaël Salaün <[email protected]>
+ * Copyright © 2018-2020 ANSSI
+ */
+
+#include <asm/current.h>
+#include <linux/anon_inodes.h>
+#include <linux/build_bug.h>
+#include <linux/capability.h>
+#include <linux/compiler_types.h>
+#include <linux/dcache.h>
+#include <linux/err.h>
+#include <linux/errno.h>
+#include <linux/fs.h>
+#include <linux/limits.h>
+#include <linux/mount.h>
+#include <linux/path.h>
+#include <linux/sched.h>
+#include <linux/security.h>
+#include <linux/stddef.h>
+#include <linux/syscalls.h>
+#include <linux/types.h>
+#include <linux/uaccess.h>
+#include <uapi/linux/landlock.h>
+
+#include "cred.h"
+#include "fs.h"
+#include "limits.h"
+#include "ruleset.h"
+#include "setup.h"
+
+/**
+ * copy_min_struct_from_user - Safe future-proof argument copying
+ *
+ * Extend copy_struct_from_user() to check for consistent user buffer.
+ *
+ * @dst: Kernel space pointer or NULL.
+ * @ksize: Actual size of the data pointed to by @dst.
+ * @ksize_min: Minimal required size to be copied.
+ * @src: User space pointer or NULL.
+ * @usize: (Alleged) size of the data pointed to by @src.
+ */
+static __always_inline int copy_min_struct_from_user(void *const dst,
+ const size_t ksize, const size_t ksize_min,
+ const void __user *const src, const size_t usize)
+{
+ /* Checks buffer inconsistencies. */
+ BUILD_BUG_ON(!dst);
+ if (!src)
+ return -EFAULT;
+
+ /* Checks size ranges. */
+ BUILD_BUG_ON(ksize <= 0);
+ BUILD_BUG_ON(ksize < ksize_min);
+ if (usize < ksize_min)
+ return -EINVAL;
+ if (usize > PAGE_SIZE)
+ return -E2BIG;
+
+ /* Copies user buffer and fills with zeros. */
+ return copy_struct_from_user(dst, ksize, src, usize);
+}
+
+/*
+ * This function only contains arithmetic operations with constants, leading to
+ * BUILD_BUG_ON(). The related code is evaluated and checked at build time,
+ * but it is then ignored thanks to compiler optimizations.
+ */
+static void build_check_abi(void)
+{
+ struct landlock_ruleset_attr ruleset_attr;
+ struct landlock_path_beneath_attr path_beneath_attr;
+ size_t ruleset_size, path_beneath_size;
+
+ /*
+ * For each user space ABI structures, first checks that there is no
+ * hole in them, then checks that all architectures have the same
+ * struct size.
+ */
+ ruleset_size = sizeof(ruleset_attr.handled_access_fs);
+ BUILD_BUG_ON(sizeof(ruleset_attr) != ruleset_size);
+ BUILD_BUG_ON(sizeof(ruleset_attr) != 8);
+
+ path_beneath_size = sizeof(path_beneath_attr.allowed_access);
+ path_beneath_size += sizeof(path_beneath_attr.parent_fd);
+ BUILD_BUG_ON(sizeof(path_beneath_attr) != path_beneath_size);
+ BUILD_BUG_ON(sizeof(path_beneath_attr) != 12);
+}
+
+/* Ruleset handling */
+
+static int fop_ruleset_release(struct inode *const inode,
+ struct file *const filp)
+{
+ struct landlock_ruleset *ruleset = filp->private_data;
+
+ landlock_put_ruleset(ruleset);
+ return 0;
+}
+
+static ssize_t fop_dummy_read(struct file *const filp, char __user *const buf,
+ const size_t size, loff_t *const ppos)
+{
+ /* Dummy handler to enable FMODE_CAN_READ. */
+ return -EINVAL;
+}
+
+static ssize_t fop_dummy_write(struct file *const filp,
+ const char __user *const buf, const size_t size,
+ loff_t *const ppos)
+{
+ /* Dummy handler to enable FMODE_CAN_WRITE. */
+ return -EINVAL;
+}
+
+/*
+ * A ruleset file descriptor enables to build a ruleset by adding (i.e.
+ * writing) rule after rule, without relying on the task's context. This
+ * reentrant design is also used in a read way to enforce the ruleset on the
+ * current task.
+ */
+static const struct file_operations ruleset_fops = {
+ .release = fop_ruleset_release,
+ .read = fop_dummy_read,
+ .write = fop_dummy_write,
+};
+
+/**
+ * sys_landlock_create_ruleset - Create a new ruleset
+ *
+ * @attr: Pointer to a &struct landlock_ruleset_attr identifying the scope of
+ * the new ruleset.
+ * @size: Size of the pointed &struct landlock_ruleset_attr (needed for
+ * backward and forward compatibility).
+ * @flags: Must be 0.
+ *
+ * This system call enables to create a new Landlock ruleset, and returns the
+ * related file descriptor on success.
+ *
+ * Possible returned errors are:
+ *
+ * - EOPNOTSUPP: Landlock is supported by the kernel but disabled at boot time;
+ * - EINVAL: @flags is not 0, or unknown access, or too small @size;
+ * - E2BIG or EFAULT: @attr or @size inconsistencies;
+ * - ENOMSG: empty &landlock_ruleset_attr.handled_access_fs.
+ */
+SYSCALL_DEFINE3(landlock_create_ruleset,
+ const struct landlock_ruleset_attr __user *const, attr,
+ const size_t, size, const __u32, flags)
+{
+ struct landlock_ruleset_attr ruleset_attr;
+ struct landlock_ruleset *ruleset;
+ int err, ruleset_fd;
+
+ /* Build-time checks. */
+ build_check_abi();
+
+ if (!landlock_initialized)
+ return -EOPNOTSUPP;
+
+ /* No flag for now. */
+ if (flags)
+ return -EINVAL;
+
+ /* Copies raw user space buffer. */
+ err = copy_min_struct_from_user(&ruleset_attr, sizeof(ruleset_attr),
+ offsetofend(typeof(ruleset_attr), handled_access_fs),
+ attr, size);
+ if (err)
+ return err;
+
+ /* Checks content (and 32-bits cast). */
+ if ((ruleset_attr.handled_access_fs | LANDLOCK_MASK_ACCESS_FS) !=
+ LANDLOCK_MASK_ACCESS_FS)
+ return -EINVAL;
+
+ /* Checks arguments and transforms to kernel struct. */
+ ruleset = landlock_create_ruleset(ruleset_attr.handled_access_fs);
+ if (IS_ERR(ruleset))
+ return PTR_ERR(ruleset);
+
+ /* Creates anonymous FD referring to the ruleset. */
+ ruleset_fd = anon_inode_getfd("landlock-ruleset", &ruleset_fops,
+ ruleset, O_RDWR | O_CLOEXEC);
+ if (ruleset_fd < 0)
+ landlock_put_ruleset(ruleset);
+ return ruleset_fd;
+}
+
+/*
+ * Returns an owned ruleset from a FD. It is thus needed to call
+ * landlock_put_ruleset() on the return value.
+ */
+static struct landlock_ruleset *get_ruleset_from_fd(const int fd,
+ const fmode_t mode)
+{
+ struct fd ruleset_f;
+ struct landlock_ruleset *ruleset;
+
+ ruleset_f = fdget(fd);
+ if (!ruleset_f.file)
+ return ERR_PTR(-EBADF);
+
+ /* Checks FD type and access right. */
+ if (ruleset_f.file->f_op != &ruleset_fops) {
+ ruleset = ERR_PTR(-EBADFD);
+ goto out_fdput;
+ }
+ if (!(ruleset_f.file->f_mode & mode)) {
+ ruleset = ERR_PTR(-EPERM);
+ goto out_fdput;
+ }
+ ruleset = ruleset_f.file->private_data;
+ if (WARN_ON_ONCE(ruleset->num_layers != 1)) {
+ ruleset = ERR_PTR(-EINVAL);
+ goto out_fdput;
+ }
+ landlock_get_ruleset(ruleset);
+
+out_fdput:
+ fdput(ruleset_f);
+ return ruleset;
+}
+
+/* Path handling */
+
+/*
+ * @path: Must call put_path(@path) after the call if it succeeded.
+ */
+static int get_path_from_fd(const s32 fd, struct path *const path)
+{
+ struct fd f;
+ int err = 0;
+
+ BUILD_BUG_ON(!__same_type(fd,
+ ((struct landlock_path_beneath_attr *)NULL)->parent_fd));
+
+ /* Handles O_PATH. */
+ f = fdget_raw(fd);
+ if (!f.file)
+ return -EBADF;
+ /*
+ * Only allows O_PATH file descriptor: enables to restrict ambient
+ * filesystem access without requiring to open and risk leaking or
+ * misusing a file descriptor. Forbid internal filesystems (e.g.
+ * nsfs), including pseudo filesystems that will never be mountable
+ * (e.g. sockfs, pipefs).
+ */
+ if (!(f.file->f_mode & FMODE_PATH) ||
+ (f.file->f_path.mnt->mnt_flags & MNT_INTERNAL) ||
+ (f.file->f_path.dentry->d_sb->s_flags & SB_NOUSER) ||
+ d_is_negative(f.file->f_path.dentry) ||
+ IS_PRIVATE(d_backing_inode(f.file->f_path.dentry))) {
+ err = -EBADFD;
+ goto out_fdput;
+ }
+ *path = f.file->f_path;
+ path_get(path);
+
+out_fdput:
+ fdput(f);
+ return err;
+}
+
+/**
+ * sys_landlock_add_rule - Add a new rule to a ruleset
+ *
+ * @ruleset_fd: File descriptor tied to the ruleset that should be extended
+ * with the new rule.
+ * @rule_type: Identify the structure type pointed to by @rule_attr (only
+ * LANDLOCK_RULE_PATH_BENEATH for now).
+ * @rule_attr: Pointer to a rule (only of type &struct
+ * landlock_path_beneath_attr for now).
+ * @flags: Must be 0.
+ *
+ * This system call enables to define a new rule and add it to an existing
+ * ruleset.
+ *
+ * Possible returned errors are:
+ *
+ * - EOPNOTSUPP: Landlock is supported by the kernel but disabled at boot time;
+ * - EINVAL: @flags is not 0, or inconsistent access in the rule (i.e.
+ * &landlock_path_beneath_attr.allowed_access is not a subset of the rule's
+ * accesses);
+ * - ENOMSG: Empty accesses (e.g. &landlock_path_beneath_attr.allowed_access);
+ * - EBADF: @ruleset_fd is not a file descriptor for the current thread, or a
+ * member of @rule_attr is not a file descriptor as expected;
+ * - EBADFD: @ruleset_fd is not a ruleset file descriptor, or a member of
+ * @rule_attr is not the expected file descriptor type (e.g. file open
+ * without O_PATH);
+ * - EPERM: @ruleset_fd has no write access to the underlying ruleset;
+ * - EFAULT: @rule_attr inconsistency.
+ */
+SYSCALL_DEFINE4(landlock_add_rule,
+ const int, ruleset_fd, const enum landlock_rule_type, rule_type,
+ const void __user *const, rule_attr, const __u32, flags)
+{
+ struct landlock_path_beneath_attr path_beneath_attr;
+ struct path path;
+ struct landlock_ruleset *ruleset;
+ int res, err;
+
+ if (!landlock_initialized)
+ return -EOPNOTSUPP;
+
+ /* No flag for now. */
+ if (flags)
+ return -EINVAL;
+
+ if (rule_type != LANDLOCK_RULE_PATH_BENEATH)
+ return -EINVAL;
+
+ /* Copies raw user space buffer, only one type for now. */
+ res = copy_from_user(&path_beneath_attr, rule_attr,
+ sizeof(path_beneath_attr));
+ if (res)
+ return -EFAULT;
+
+ /* Gets and checks the ruleset. */
+ ruleset = get_ruleset_from_fd(ruleset_fd, FMODE_CAN_WRITE);
+ if (IS_ERR(ruleset))
+ return PTR_ERR(ruleset);
+
+ /*
+ * Informs about useless rule: empty allowed_access (i.e. deny rules)
+ * are ignored in path walks.
+ */
+ if (!path_beneath_attr.allowed_access) {
+ err = -ENOMSG;
+ goto out_put_ruleset;
+ }
+ /*
+ * Checks that allowed_access matches the @ruleset constraints
+ * (ruleset->fs_access_masks[0] is automatically upgraded to 64-bits).
+ */
+ if ((path_beneath_attr.allowed_access | ruleset->fs_access_masks[0]) !=
+ ruleset->fs_access_masks[0]) {
+ err = -EINVAL;
+ goto out_put_ruleset;
+ }
+
+ /* Gets and checks the new rule. */
+ err = get_path_from_fd(path_beneath_attr.parent_fd, &path);
+ if (err)
+ goto out_put_ruleset;
+
+ /* Imports the new rule. */
+ err = landlock_append_fs_rule(ruleset, &path,
+ path_beneath_attr.allowed_access);
+ path_put(&path);
+
+out_put_ruleset:
+ landlock_put_ruleset(ruleset);
+ return err;
+}
+
+/* Enforcement */
+
+/**
+ * sys_landlock_restrict_self - Enforce a ruleset on the calling thread
+ *
+ * @ruleset_fd: File descriptor tied to the ruleset to merge with the target.
+ * @flags: Must be 0.
+ *
+ * This system call enables to enforce a Landlock ruleset on the current
+ * thread. Enforcing a ruleset requires that the task has CAP_SYS_ADMIN in its
+ * namespace or is running with no_new_privs. This avoids scenarios where
+ * unprivileged tasks can affect the behavior of privileged children.
+ *
+ * Possible returned errors are:
+ *
+ * - EOPNOTSUPP: Landlock is supported by the kernel but disabled at boot time;
+ * - EINVAL: @flags is not 0.
+ * - EBADF: @ruleset_fd is not a file descriptor for the current thread;
+ * - EBADFD: @ruleset_fd is not a ruleset file descriptor;
+ * - EPERM: @ruleset_fd has no read access to the underlying ruleset, or the
+ * current thread is not running with no_new_privs, or it doesn't have
+ * CAP_SYS_ADMIN in its namespace.
+ * - E2BIG: The maximum number of stacked rulesets is reached for the current
+ * thread.
+ */
+SYSCALL_DEFINE2(landlock_restrict_self,
+ const int, ruleset_fd, const __u32, flags)
+{
+ struct landlock_ruleset *new_dom, *ruleset;
+ struct cred *new_cred;
+ struct landlock_cred_security *new_llcred;
+ int err;
+
+ if (!landlock_initialized)
+ return -EOPNOTSUPP;
+
+ /* No flag for now. */
+ if (flags)
+ return -EINVAL;
+
+ /*
+ * Similar checks as for seccomp(2), except that an -EPERM may be
+ * returned.
+ */
+ if (!task_no_new_privs(current) &&
+ !ns_capable_noaudit(current_user_ns(), CAP_SYS_ADMIN))
+ return -EPERM;
+
+ /* Gets and checks the ruleset. */
+ ruleset = get_ruleset_from_fd(ruleset_fd, FMODE_CAN_READ);
+ if (IS_ERR(ruleset))
+ return PTR_ERR(ruleset);
+
+ /* Prepares new credentials. */
+ new_cred = prepare_creds();
+ if (!new_cred) {
+ err = -ENOMEM;
+ goto out_put_ruleset;
+ }
+ new_llcred = landlock_cred(new_cred);
+
+ /*
+ * There is no possible race condition while copying and manipulating
+ * the current credentials because they are dedicated per thread.
+ */
+ new_dom = landlock_merge_ruleset(new_llcred->domain, ruleset);
+ if (IS_ERR(new_dom)) {
+ err = PTR_ERR(new_dom);
+ goto out_put_creds;
+ }
+
+ /* Replaces the old (prepared) domain. */
+ landlock_put_ruleset(new_llcred->domain);
+ new_llcred->domain = new_dom;
+
+ landlock_put_ruleset(ruleset);
+ return commit_creds(new_cred);
+
+out_put_creds:
+ abort_creds(new_cred);
+
+out_put_ruleset:
+ landlock_put_ruleset(ruleset);
+ return err;
+}
--
2.30.0

2021-02-02 23:15:11

by Mickaël Salaün

Subject: [PATCH v28 05/12] LSM: Infrastructure management of the superblock

From: Casey Schaufler <[email protected]>

Move management of the superblock->sb_security blob out of the
individual security modules and into the security infrastructure.
Instead of allocating the blobs from within the modules, the modules
tell the infrastructure how much space is required, and the space is
allocated there.
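
As an illustration (with mylsm_* as placeholder names), a module now
only declares the size it needs and retrieves its slice of the shared
blob through the offset stored back into its lsm_blob_sizes entry:

  struct lsm_blob_sizes mylsm_blob_sizes __lsm_ro_after_init = {
          .lbs_superblock = sizeof(struct mylsm_superblock_security),
  };

  static inline struct mylsm_superblock_security *mylsm_superblock(
                  const struct super_block *sb)
  {
          return sb->s_security + mylsm_blob_sizes.lbs_superblock;
  }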

Cc: Kees Cook <[email protected]>
Cc: John Johansen <[email protected]>
Signed-off-by: Casey Schaufler <[email protected]>
Signed-off-by: Mickaël Salaün <[email protected]>
Reviewed-by: Stephen Smalley <[email protected]>
---

Changes since v26:
* Rebase on commit b159e86b5a2a ("selinux: drop super_block backpointer
from superblock_security_struct"). No change in the patch itself,
only a trivial conflict because of an updated nearby line in
selinux_set_mnt_opts() variable declarations.

Changes since v20:
* Remove all Reviewed-by except Stephen Smalley:
https://lore.kernel.org/lkml/CAEjxPJ7ARJO57MBW66=xsBzMMRb=9uLgqocK5eskHCaiVMx7Vw@mail.gmail.com/
* Cosmetic fix in the commit message.

Changes since v17:
* Rebase the original LSM stacking patch from v5.3 to v5.7: I fixed some
diff conflicts caused by code moves and function renames in
selinux/include/objsec.h and selinux/hooks.c . I checked that it
builds but I didn't test the changes for SELinux or SMACK.
https://lore.kernel.org/r/[email protected]
---
include/linux/lsm_hooks.h | 1 +
security/security.c | 46 ++++++++++++++++++++----
security/selinux/hooks.c | 58 ++++++++++++-------------------
security/selinux/include/objsec.h | 6 ++++
security/selinux/ss/services.c | 3 +-
security/smack/smack.h | 6 ++++
security/smack/smack_lsm.c | 35 +++++--------------
7 files changed, 85 insertions(+), 70 deletions(-)

diff --git a/include/linux/lsm_hooks.h b/include/linux/lsm_hooks.h
index a19adef1f088..970106d98306 100644
--- a/include/linux/lsm_hooks.h
+++ b/include/linux/lsm_hooks.h
@@ -1563,6 +1563,7 @@ struct lsm_blob_sizes {
int lbs_cred;
int lbs_file;
int lbs_inode;
+ int lbs_superblock;
int lbs_ipc;
int lbs_msg_msg;
int lbs_task;
diff --git a/security/security.c b/security/security.c
index 7b09cfbae94f..9f979d4afe6c 100644
--- a/security/security.c
+++ b/security/security.c
@@ -203,6 +203,7 @@ static void __init lsm_set_blob_sizes(struct lsm_blob_sizes *needed)
lsm_set_blob_size(&needed->lbs_inode, &blob_sizes.lbs_inode);
lsm_set_blob_size(&needed->lbs_ipc, &blob_sizes.lbs_ipc);
lsm_set_blob_size(&needed->lbs_msg_msg, &blob_sizes.lbs_msg_msg);
+ lsm_set_blob_size(&needed->lbs_superblock, &blob_sizes.lbs_superblock);
lsm_set_blob_size(&needed->lbs_task, &blob_sizes.lbs_task);
}

@@ -333,12 +334,13 @@ static void __init ordered_lsm_init(void)
for (lsm = ordered_lsms; *lsm; lsm++)
prepare_lsm(*lsm);

- init_debug("cred blob size = %d\n", blob_sizes.lbs_cred);
- init_debug("file blob size = %d\n", blob_sizes.lbs_file);
- init_debug("inode blob size = %d\n", blob_sizes.lbs_inode);
- init_debug("ipc blob size = %d\n", blob_sizes.lbs_ipc);
- init_debug("msg_msg blob size = %d\n", blob_sizes.lbs_msg_msg);
- init_debug("task blob size = %d\n", blob_sizes.lbs_task);
+ init_debug("cred blob size = %d\n", blob_sizes.lbs_cred);
+ init_debug("file blob size = %d\n", blob_sizes.lbs_file);
+ init_debug("inode blob size = %d\n", blob_sizes.lbs_inode);
+ init_debug("ipc blob size = %d\n", blob_sizes.lbs_ipc);
+ init_debug("msg_msg blob size = %d\n", blob_sizes.lbs_msg_msg);
+ init_debug("superblock blob size = %d\n", blob_sizes.lbs_superblock);
+ init_debug("task blob size = %d\n", blob_sizes.lbs_task);

/*
* Create any kmem_caches needed for blobs
@@ -670,6 +672,27 @@ static void __init lsm_early_task(struct task_struct *task)
panic("%s: Early task alloc failed.\n", __func__);
}

+/**
+ * lsm_superblock_alloc - allocate a composite superblock blob
+ * @sb: the superblock that needs a blob
+ *
+ * Allocate the superblock blob for all the modules
+ *
+ * Returns 0, or -ENOMEM if memory can't be allocated.
+ */
+static int lsm_superblock_alloc(struct super_block *sb)
+{
+ if (blob_sizes.lbs_superblock == 0) {
+ sb->s_security = NULL;
+ return 0;
+ }
+
+ sb->s_security = kzalloc(blob_sizes.lbs_superblock, GFP_KERNEL);
+ if (sb->s_security == NULL)
+ return -ENOMEM;
+ return 0;
+}
+
/*
* The default value of the LSM hook is defined in linux/lsm_hook_defs.h and
* can be accessed with:
@@ -867,12 +890,21 @@ int security_fs_context_parse_param(struct fs_context *fc, struct fs_parameter *

int security_sb_alloc(struct super_block *sb)
{
- return call_int_hook(sb_alloc_security, 0, sb);
+ int rc = lsm_superblock_alloc(sb);
+
+ if (unlikely(rc))
+ return rc;
+ rc = call_int_hook(sb_alloc_security, 0, sb);
+ if (unlikely(rc))
+ security_sb_free(sb);
+ return rc;
}

void security_sb_free(struct super_block *sb)
{
call_void_hook(sb_free_security, sb);
+ kfree(sb->s_security);
+ sb->s_security = NULL;
}

void security_free_mnt_opts(void **mnt_opts)
diff --git a/security/selinux/hooks.c b/security/selinux/hooks.c
index 644b17ec9e63..ecf0ca8c3108 100644
--- a/security/selinux/hooks.c
+++ b/security/selinux/hooks.c
@@ -322,7 +322,7 @@ static void inode_free_security(struct inode *inode)

if (!isec)
return;
- sbsec = inode->i_sb->s_security;
+ sbsec = selinux_superblock(inode->i_sb);
/*
* As not all inode security structures are in a list, we check for
* empty list outside of the lock to make sure that we won't waste
@@ -340,13 +340,6 @@ static void inode_free_security(struct inode *inode)
}
}

-static void superblock_free_security(struct super_block *sb)
-{
- struct superblock_security_struct *sbsec = sb->s_security;
- sb->s_security = NULL;
- kfree(sbsec);
-}
-
struct selinux_mnt_opts {
const char *fscontext, *context, *rootcontext, *defcontext;
};
@@ -458,7 +451,7 @@ static int selinux_is_genfs_special_handling(struct super_block *sb)

static int selinux_is_sblabel_mnt(struct super_block *sb)
{
- struct superblock_security_struct *sbsec = sb->s_security;
+ struct superblock_security_struct *sbsec = selinux_superblock(sb);

/*
* IMPORTANT: Double-check logic in this function when adding a new
@@ -486,7 +479,7 @@ static int selinux_is_sblabel_mnt(struct super_block *sb)

static int sb_finish_set_opts(struct super_block *sb)
{
- struct superblock_security_struct *sbsec = sb->s_security;
+ struct superblock_security_struct *sbsec = selinux_superblock(sb);
struct dentry *root = sb->s_root;
struct inode *root_inode = d_backing_inode(root);
int rc = 0;
@@ -599,7 +592,7 @@ static int selinux_set_mnt_opts(struct super_block *sb,
unsigned long *set_kern_flags)
{
const struct cred *cred = current_cred();
- struct superblock_security_struct *sbsec = sb->s_security;
+ struct superblock_security_struct *sbsec = selinux_superblock(sb);
struct dentry *root = sb->s_root;
struct selinux_mnt_opts *opts = mnt_opts;
struct inode_security_struct *root_isec;
@@ -836,8 +829,8 @@ static int selinux_set_mnt_opts(struct super_block *sb,
static int selinux_cmp_sb_context(const struct super_block *oldsb,
const struct super_block *newsb)
{
- struct superblock_security_struct *old = oldsb->s_security;
- struct superblock_security_struct *new = newsb->s_security;
+ struct superblock_security_struct *old = selinux_superblock(oldsb);
+ struct superblock_security_struct *new = selinux_superblock(newsb);
char oldflags = old->flags & SE_MNTMASK;
char newflags = new->flags & SE_MNTMASK;

@@ -869,8 +862,9 @@ static int selinux_sb_clone_mnt_opts(const struct super_block *oldsb,
unsigned long *set_kern_flags)
{
int rc = 0;
- const struct superblock_security_struct *oldsbsec = oldsb->s_security;
- struct superblock_security_struct *newsbsec = newsb->s_security;
+ const struct superblock_security_struct *oldsbsec =
+ selinux_superblock(oldsb);
+ struct superblock_security_struct *newsbsec = selinux_superblock(newsb);

int set_fscontext = (oldsbsec->flags & FSCONTEXT_MNT);
int set_context = (oldsbsec->flags & CONTEXT_MNT);
@@ -1049,7 +1043,7 @@ static int show_sid(struct seq_file *m, u32 sid)

static int selinux_sb_show_options(struct seq_file *m, struct super_block *sb)
{
- struct superblock_security_struct *sbsec = sb->s_security;
+ struct superblock_security_struct *sbsec = selinux_superblock(sb);
int rc;

if (!(sbsec->flags & SE_SBINITIALIZED))
@@ -1399,7 +1393,7 @@ static int inode_doinit_with_dentry(struct inode *inode, struct dentry *opt_dent
if (isec->sclass == SECCLASS_FILE)
isec->sclass = inode_mode_to_security_class(inode->i_mode);

- sbsec = inode->i_sb->s_security;
+ sbsec = selinux_superblock(inode->i_sb);
if (!(sbsec->flags & SE_SBINITIALIZED)) {
/* Defer initialization until selinux_complete_init,
after the initial policy is loaded and the security
@@ -1750,7 +1744,8 @@ selinux_determine_inode_label(const struct task_security_struct *tsec,
const struct qstr *name, u16 tclass,
u32 *_new_isid)
{
- const struct superblock_security_struct *sbsec = dir->i_sb->s_security;
+ const struct superblock_security_struct *sbsec =
+ selinux_superblock(dir->i_sb);

if ((sbsec->flags & SE_SBINITIALIZED) &&
(sbsec->behavior == SECURITY_FS_USE_MNTPOINT)) {
@@ -1781,7 +1776,7 @@ static int may_create(struct inode *dir,
int rc;

dsec = inode_security(dir);
- sbsec = dir->i_sb->s_security;
+ sbsec = selinux_superblock(dir->i_sb);

sid = tsec->sid;

@@ -1930,7 +1925,7 @@ static int superblock_has_perm(const struct cred *cred,
struct superblock_security_struct *sbsec;
u32 sid = cred_sid(cred);

- sbsec = sb->s_security;
+ sbsec = selinux_superblock(sb);
return avc_has_perm(&selinux_state,
sid, sbsec->sid, SECCLASS_FILESYSTEM, perms, ad);
}
@@ -2559,11 +2554,7 @@ static void selinux_bprm_committed_creds(struct linux_binprm *bprm)

static int selinux_sb_alloc_security(struct super_block *sb)
{
- struct superblock_security_struct *sbsec;
-
- sbsec = kzalloc(sizeof(struct superblock_security_struct), GFP_KERNEL);
- if (!sbsec)
- return -ENOMEM;
+ struct superblock_security_struct *sbsec = selinux_superblock(sb);

mutex_init(&sbsec->lock);
INIT_LIST_HEAD(&sbsec->isec_head);
@@ -2571,16 +2562,10 @@ static int selinux_sb_alloc_security(struct super_block *sb)
sbsec->sid = SECINITSID_UNLABELED;
sbsec->def_sid = SECINITSID_FILE;
sbsec->mntpoint_sid = SECINITSID_UNLABELED;
- sb->s_security = sbsec;

return 0;
}

-static void selinux_sb_free_security(struct super_block *sb)
-{
- superblock_free_security(sb);
-}
-
static inline int opt_len(const char *s)
{
bool open_quote = false;
@@ -2659,7 +2644,7 @@ static int selinux_sb_eat_lsm_opts(char *options, void **mnt_opts)
static int selinux_sb_remount(struct super_block *sb, void *mnt_opts)
{
struct selinux_mnt_opts *opts = mnt_opts;
- struct superblock_security_struct *sbsec = sb->s_security;
+ struct superblock_security_struct *sbsec = selinux_superblock(sb);
u32 sid;
int rc;

@@ -2897,7 +2882,7 @@ static int selinux_inode_init_security(struct inode *inode, struct inode *dir,
int rc;
char *context;

- sbsec = dir->i_sb->s_security;
+ sbsec = selinux_superblock(dir->i_sb);

newsid = tsec->create_sid;

@@ -3142,7 +3127,7 @@ static int selinux_inode_setxattr(struct dentry *dentry, const char *name,
if (!selinux_initialized(&selinux_state))
return (inode_owner_or_capable(inode) ? 0 : -EPERM);

- sbsec = inode->i_sb->s_security;
+ sbsec = selinux_superblock(inode->i_sb);
if (!(sbsec->flags & SBLABEL_MNT))
return -EOPNOTSUPP;

@@ -3384,13 +3369,14 @@ static int selinux_inode_setsecurity(struct inode *inode, const char *name,
const void *value, size_t size, int flags)
{
struct inode_security_struct *isec = inode_security_novalidate(inode);
- struct superblock_security_struct *sbsec = inode->i_sb->s_security;
+ struct superblock_security_struct *sbsec;
u32 newsid;
int rc;

if (strcmp(name, XATTR_SELINUX_SUFFIX))
return -EOPNOTSUPP;

+ sbsec = selinux_superblock(inode->i_sb);
if (!(sbsec->flags & SBLABEL_MNT))
return -EOPNOTSUPP;

@@ -6882,6 +6868,7 @@ struct lsm_blob_sizes selinux_blob_sizes __lsm_ro_after_init = {
.lbs_inode = sizeof(struct inode_security_struct),
.lbs_ipc = sizeof(struct ipc_security_struct),
.lbs_msg_msg = sizeof(struct msg_security_struct),
+ .lbs_superblock = sizeof(struct superblock_security_struct),
};

#ifdef CONFIG_PERF_EVENTS
@@ -6982,7 +6969,6 @@ static struct security_hook_list selinux_hooks[] __lsm_ro_after_init = {
LSM_HOOK_INIT(bprm_committing_creds, selinux_bprm_committing_creds),
LSM_HOOK_INIT(bprm_committed_creds, selinux_bprm_committed_creds),

- LSM_HOOK_INIT(sb_free_security, selinux_sb_free_security),
LSM_HOOK_INIT(sb_free_mnt_opts, selinux_free_mnt_opts),
LSM_HOOK_INIT(sb_remount, selinux_sb_remount),
LSM_HOOK_INIT(sb_kern_mount, selinux_sb_kern_mount),
diff --git a/security/selinux/include/objsec.h b/security/selinux/include/objsec.h
index ca4d7ab6a835..2953132408bf 100644
--- a/security/selinux/include/objsec.h
+++ b/security/selinux/include/objsec.h
@@ -188,4 +188,10 @@ static inline u32 current_sid(void)
return tsec->sid;
}

+static inline struct superblock_security_struct *selinux_superblock(
+ const struct super_block *superblock)
+{
+ return superblock->s_security + selinux_blob_sizes.lbs_superblock;
+}
+
#endif /* _SELINUX_OBJSEC_H_ */
diff --git a/security/selinux/ss/services.c b/security/selinux/ss/services.c
index 597b79703584..74e3905dd9c5 100644
--- a/security/selinux/ss/services.c
+++ b/security/selinux/ss/services.c
@@ -47,6 +47,7 @@
#include <linux/sched.h>
#include <linux/audit.h>
#include <linux/vmalloc.h>
+#include <linux/lsm_hooks.h>
#include <net/netlabel.h>

#include "flask.h"
@@ -2873,7 +2874,7 @@ int security_fs_use(struct selinux_state *state, struct super_block *sb)
struct sidtab *sidtab;
int rc = 0;
struct ocontext *c;
- struct superblock_security_struct *sbsec = sb->s_security;
+ struct superblock_security_struct *sbsec = selinux_superblock(sb);
const char *fstype = sb->s_type->name;

if (!selinux_initialized(state)) {
diff --git a/security/smack/smack.h b/security/smack/smack.h
index a9768b12716b..7077b18c79ec 100644
--- a/security/smack/smack.h
+++ b/security/smack/smack.h
@@ -357,6 +357,12 @@ static inline struct smack_known **smack_ipc(const struct kern_ipc_perm *ipc)
return ipc->security + smack_blob_sizes.lbs_ipc;
}

+static inline struct superblock_smack *smack_superblock(
+ const struct super_block *superblock)
+{
+ return superblock->s_security + smack_blob_sizes.lbs_superblock;
+}
+
/*
* Is the directory transmuting?
*/
diff --git a/security/smack/smack_lsm.c b/security/smack/smack_lsm.c
index f69c3dd9a0c6..767084dc2c29 100644
--- a/security/smack/smack_lsm.c
+++ b/security/smack/smack_lsm.c
@@ -535,12 +535,7 @@ static int smack_syslog(int typefrom_file)
*/
static int smack_sb_alloc_security(struct super_block *sb)
{
- struct superblock_smack *sbsp;
-
- sbsp = kzalloc(sizeof(struct superblock_smack), GFP_KERNEL);
-
- if (sbsp == NULL)
- return -ENOMEM;
+ struct superblock_smack *sbsp = smack_superblock(sb);

sbsp->smk_root = &smack_known_floor;
sbsp->smk_default = &smack_known_floor;
@@ -549,22 +544,10 @@ static int smack_sb_alloc_security(struct super_block *sb)
/*
* SMK_SB_INITIALIZED will be zero from kzalloc.
*/
- sb->s_security = sbsp;

return 0;
}

-/**
- * smack_sb_free_security - free a superblock blob
- * @sb: the superblock getting the blob
- *
- */
-static void smack_sb_free_security(struct super_block *sb)
-{
- kfree(sb->s_security);
- sb->s_security = NULL;
-}
-
struct smack_mnt_opts {
const char *fsdefault, *fsfloor, *fshat, *fsroot, *fstransmute;
};
@@ -772,7 +755,7 @@ static int smack_set_mnt_opts(struct super_block *sb,
{
struct dentry *root = sb->s_root;
struct inode *inode = d_backing_inode(root);
- struct superblock_smack *sp = sb->s_security;
+ struct superblock_smack *sp = smack_superblock(sb);
struct inode_smack *isp;
struct smack_known *skp;
struct smack_mnt_opts *opts = mnt_opts;
@@ -871,7 +854,7 @@ static int smack_set_mnt_opts(struct super_block *sb,
*/
static int smack_sb_statfs(struct dentry *dentry)
{
- struct superblock_smack *sbp = dentry->d_sb->s_security;
+ struct superblock_smack *sbp = smack_superblock(dentry->d_sb);
int rc;
struct smk_audit_info ad;

@@ -905,7 +888,7 @@ static int smack_bprm_creds_for_exec(struct linux_binprm *bprm)
if (isp->smk_task == NULL || isp->smk_task == bsp->smk_task)
return 0;

- sbsp = inode->i_sb->s_security;
+ sbsp = smack_superblock(inode->i_sb);
if ((sbsp->smk_flags & SMK_SB_UNTRUSTED) &&
isp->smk_task != sbsp->smk_root)
return 0;
@@ -1157,7 +1140,7 @@ static int smack_inode_rename(struct inode *old_inode,
*/
static int smack_inode_permission(struct inode *inode, int mask)
{
- struct superblock_smack *sbsp = inode->i_sb->s_security;
+ struct superblock_smack *sbsp = smack_superblock(inode->i_sb);
struct smk_audit_info ad;
int no_block = mask & MAY_NOT_BLOCK;
int rc;
@@ -1398,7 +1381,7 @@ static int smack_inode_removexattr(struct dentry *dentry, const char *name)
*/
if (strcmp(name, XATTR_NAME_SMACK) == 0) {
struct super_block *sbp = dentry->d_sb;
- struct superblock_smack *sbsp = sbp->s_security;
+ struct superblock_smack *sbsp = smack_superblock(sbp);

isp->smk_inode = sbsp->smk_default;
} else if (strcmp(name, XATTR_NAME_SMACKEXEC) == 0)
@@ -1668,7 +1651,7 @@ static int smack_mmap_file(struct file *file,
isp = smack_inode(file_inode(file));
if (isp->smk_mmap == NULL)
return 0;
- sbsp = file_inode(file)->i_sb->s_security;
+ sbsp = smack_superblock(file_inode(file)->i_sb);
if (sbsp->smk_flags & SMK_SB_UNTRUSTED &&
isp->smk_mmap != sbsp->smk_root)
return -EACCES;
@@ -3283,7 +3266,7 @@ static void smack_d_instantiate(struct dentry *opt_dentry, struct inode *inode)
return;

sbp = inode->i_sb;
- sbsp = sbp->s_security;
+ sbsp = smack_superblock(sbp);
/*
* We're going to use the superblock default label
* if there's no label on the file.
@@ -4696,6 +4679,7 @@ struct lsm_blob_sizes smack_blob_sizes __lsm_ro_after_init = {
.lbs_inode = sizeof(struct inode_smack),
.lbs_ipc = sizeof(struct smack_known *),
.lbs_msg_msg = sizeof(struct smack_known *),
+ .lbs_superblock = sizeof(struct superblock_smack),
};

static struct security_hook_list smack_hooks[] __lsm_ro_after_init = {
@@ -4707,7 +4691,6 @@ static struct security_hook_list smack_hooks[] __lsm_ro_after_init = {
LSM_HOOK_INIT(fs_context_parse_param, smack_fs_context_parse_param),

LSM_HOOK_INIT(sb_alloc_security, smack_sb_alloc_security),
- LSM_HOOK_INIT(sb_free_security, smack_sb_free_security),
LSM_HOOK_INIT(sb_free_mnt_opts, smack_free_mnt_opts),
LSM_HOOK_INIT(sb_eat_lsm_opts, smack_sb_eat_lsm_opts),
LSM_HOOK_INIT(sb_statfs, smack_sb_statfs),
--
2.30.0

2021-02-02 23:15:12

by Mickaël Salaün

[permalink] [raw]
Subject: [PATCH v28 10/12] selftests/landlock: Add user space tests

From: Mickaël Salaün <[email protected]>

Test all Landlock system calls, ptrace hook semantics and filesystem
access-control.

Test coverage for security/landlock/ is 94.7% of lines. The code not
covered only deals with internal kernel errors (e.g. memory allocation)
and race conditions.
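
For context, the selftests build on the following user-space sequence:
create a ruleset, add a path_beneath rule, then self-restrict. Here is a
minimal, illustrative sketch (not part of the patch); it assumes the
patched uapi headers and wired-up syscall numbers, and the "/usr" path
and reduced error handling are only placeholders:

  #define _GNU_SOURCE
  #include <fcntl.h>
  #include <linux/landlock.h>
  #include <stdio.h>
  #include <sys/prctl.h>
  #include <sys/syscall.h>
  #include <unistd.h>

  int main(void)
  {
          const struct landlock_ruleset_attr ruleset_attr = {
                  /* Access rights handled (i.e. restricted) by this ruleset. */
                  .handled_access_fs = LANDLOCK_ACCESS_FS_READ_FILE |
                          LANDLOCK_ACCESS_FS_READ_DIR,
          };
          struct landlock_path_beneath_attr path_beneath = {
                  /* Accesses allowed below the parent_fd hierarchy. */
                  .allowed_access = LANDLOCK_ACCESS_FS_READ_FILE |
                          LANDLOCK_ACCESS_FS_READ_DIR,
          };
          int ruleset_fd;

          ruleset_fd = syscall(__NR_landlock_create_ruleset, &ruleset_attr,
                          sizeof(ruleset_attr), 0);
          if (ruleset_fd < 0) {
                  perror("landlock_create_ruleset");
                  return 1;
          }

          /* "/usr" is only an example path for this sketch. */
          path_beneath.parent_fd = open("/usr", O_PATH | O_CLOEXEC);
          if (path_beneath.parent_fd < 0 ||
                          syscall(__NR_landlock_add_rule, ruleset_fd,
                                  LANDLOCK_RULE_PATH_BENEATH, &path_beneath, 0)) {
                  perror("landlock_add_rule");
                  return 1;
          }
          close(path_beneath.parent_fd);

          /* no_new_privs is required for unprivileged enforcement. */
          if (prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0) ||
                          syscall(__NR_landlock_restrict_self, ruleset_fd, 0)) {
                  perror("landlock_restrict_self");
                  return 1;
          }
          close(ruleset_fd);

          /* Handled read accesses are now only allowed below "/usr". */
          return 0;
  }

The common.h helpers below wrap these raw syscall(2) calls so that the
tests can be expressed with the kselftest ASSERT_* macros.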

Cc: James Morris <[email protected]>
Cc: Jann Horn <[email protected]>
Cc: Kees Cook <[email protected]>
Cc: Serge E. Hallyn <[email protected]>
Cc: Shuah Khan <[email protected]>
Signed-off-by: Mickaël Salaün <[email protected]>
Reviewed-by: Vincent Dagonneau <[email protected]>
---

Changes since v27:
* Add layout1.non_overlapping_accesses to check rules without
overlapping access rights (fixed in this patchset).
* Extend layout1.interleaved_masked_accesses with a non-overlapping
execute-only rule.
* Update tests for empty path_beneath.allowed_access, and replace
useless code (i.e. deny-only rules) in
layout1.interleaved_masked_accesses with equivalent meaningful rules.
* Fix the returned step when a test failed with TEST_F_FORK().
* Update MAINTAINERS.
* Cosmetic fix to please checkpatch.
* Fix typo in comment.
* Update landlock_restrict_self(2).

Changes since v26:
* Add layout1_bind tests to check inherited bind mount accesses.
* Add layout2_overlay tests to check non-inherited overlayfs accesses.
* Fix final cleanup which was reordered because of kselftest_harness
changes.
* Update layout1.inherit_subset test according to the
check_access_path_layer() change.
* Implement TEST_F_FORK() to be able to use FIXTURE_TEARDOWN() to clean
up layouts even if the test (child) lost access rights or failed.
Remove now useless layout*.cleanup .
* Update syscall names.
* Clean up FIXTURE_SETUP(layout1).
* Clean up file layout management:
- Replace specific create_dir_and_file() with generic
create_directory() and create_file().
- Replace specific delete_dir_and_file() with generic remove_path().
- Rename and move cleanup_*() to remove_*() to improve readability.
- Use EXPECT_*() for all FIXTURE_TEARDOWN() code.

Changes since v25:
* Add a new test to check that Landlock ruleset file descriptors
received through UNIX sockets are usable. Contributed by Vincent
Dagonneau.
* Improve hierarchy.trace tests to not hang when testing on a kernel
that does not support Landlock.
* Replace EXPECT_EQ(0, close(*)) with ASSERT_EQ(0, close(*)).
* Guard WEXITSTATUS() use with WIFEXITED() in ptrace tests.
* Use pipe2(2) with O_CLOEXEC.
* Remove useless errno set for syscall wrappers, and related useless checks.
* Rename test.
* Add Microsoft copyright for layout1.interleaved_masked_accesses .

Changes since v24:
* Revert the ruleset_overlap test from v24: check that access rights are
ORed together when building a ruleset. Keep the extra checks
added with v24.
* Revert inherit_subset test from v24: use the automatic ORing of
access rights for the same file.
* Update interleaved_masked_accesses test (added with v24) to stop when
all layers have allowed an inode at least once during the path walk.
* Extend interleaved_masked_accesses test with new tricky interleaved
layers which would not work as intended with (allow or deny) bitmask
layer implementations.
* Simplify and rename test_path*() to test_open*() to make diagnostics
easier in case of unexpected errors.
* Replace most calls to open(2) with calls to test_open(), which
reduces the number of lines and makes tests more readable.
* Fix erroneous check in inherit_superset.

Changes since v23:
* Add an interleaved_masked_accesses test to check corner cases for
interleaved layered ruleset combinations.
* Update ruleset_overlap and inherit_subset tests to follow the new
intersect access rights behavior.
* Extend the inherit_superset test to check that layers are handled as
expected in the superset use case, which complete the inherit_subset
checks.
* Fix comment (spotted by Vincent Dagonneau).

Changes since v22:
* Extend and add a new test to better check rules applied to the root
directory: rule_over_root_allow_then_deny, rule_over_root_deny.
* Change the signature of test_path*() to make the calls clearer.

Changes since v21:
* Remove layout1.chroot test and update layout1.unhandled_access to not
rely on LANDLOCK_ACCESS_FS_CHROOT.
* Clean up comments.

Changes since v20:
* Update with new syscalls and type names.
* Use the full syscall interfaces: explicitly set the "flags" field to
zero.
* Update the empty_path_beneath_attr test to check for EFAULT.
* Update and merge tests for the simplified copy_min_struct_from_user().
* Clean up makefile.
* Rename some types and variables in a more consistent way.

Changes since v19:
* Update with the new Landlock syscalls.
* Fix device creation.
* Check the new landlock_attr_features members: last_rule_type and
last_target_type .
* Constify variables.

Changes since v18:
* Replace ruleset_rw.inval with layout1.inval to avoid a nonexistent
test layout.
* Use the new FIXTURE_VARIANT for ptrace_test: makes the tests more
readable and usable.
* Add ARRAY_SIZE() macro to please checkpatch.

Changes since v17:
* Add new test for mknod with a zero mode.
* Use memset(3) to initialize attr_features in base_test.

Changes since v16:
* Add new unpriv_enforce_without_no_new_privs test: check that ruleset
enforcement is forbidden without no_new_privs and CAP_SYS_ADMIN.
* Drop capabilities when useful.
* Check the new size_attr_features field from struct
landlock_attr_features.
* Update the empty_or_same_ruleset test to check complementary empty
ruleset.
* Update base_test according to the new attribute structures and fix the
inconsistent_attr test accordingly.
* Switch syscall attribute pointer and size arguments.
* Rename test files with a "_test" suffix.

Changes since v14:
* Add new tests:
- superset: check new layer bitmask.
- max_layers: check maximum number of layers.
- release_inodes: check that umount work well.
- empty_or_same_ruleset.
- inconsistent_attr: checks copy_to_user limits.
- new checks in ruleset_rw.inval for the ruleset FD.
- proc_unlinked_file: check file access through /proc/self/fd .
- file_access_rights: check that a file can only get consistent access
rights.
- unpriv: check that NO_NEW_PRIVS or CAP_SYS_ADMIN is required.
- check pipe access through /proc/self/fd .
- check move_mount(2).
- check ruleset file descriptor properties.
- proc_nsfs: extend to check that internal filesystems (e.g. nsfs) are
allowed.
* Double-check read and write effective actions.
* Fix potential desynchronization between the kernel sources and
installed headers by overriding the build step in the Makefile. This
also enables building with Clang.
* Add two files in the test directories (for link test and rename test).
* Remove test for ruleset's show_fdinfo().
* Replace EBADR with EBADFD.
* Update tests accordingly to the changes of rename and link rights.
* Fix (now) illegal access rights tied to files.
* Update rename and link tests.
* Remove superfluous '\n' in TH_LOG() calls.
* Make assert calls consistent and readable.
* Fix the execute test.
* Make tests future-proof.
* Cosmetic fixes.

Changes since v14:
* Add new tests:
- Compatibility: empty_attr_{ruleset,path_beneath,enforce} to check
minimal attr size.
- Access types: link_to, rename_from, rename_to, rmdir, unlink,
make_char, make_block, make_reg, make_sock, make_fifo, make_sym,
make_dir, chroot, execute.
- Test privilege escalation prevention by enforcing a nested rule on
a parent directory with fewer restrictions than one on a child
directory.
- Test for empty allowed_access and for values wider than 32 bits.
* Merge the two test mount hierarchies.
* Complete relative path tests by combining chdir and chroot.
* Adjust tests:
- Remove the layout1/extend_ruleset_with_denied_path test.
- Extend layout1/whitelist test with checks on files.
- Add and use create_dir_and_file().
* Only use read/write checks but not stat(2) for tests.
* Rename test.h to common.h and improve it.
* Rename path names to make them more consistent and easier to
understand, and put them in a common directory.
* Make create_ruleset() more generic.
* Constify variables.
* Re-add static global variables.
* Remove useless openat(2).
* Fix and complete kernel config.
* Set umask and clean up file modes.
* Clean up open flags.
* Improve Makefile.
* Fix spelling.
* Improve comments and error messages.

Changes since v13:
* Add back the filesystem tests (from v10) and extend them.
* Add tests for the new syscall.

Previous changes:
https://lore.kernel.org/lkml/[email protected]/
---
MAINTAINERS | 1 +
tools/testing/selftests/Makefile | 1 +
tools/testing/selftests/landlock/.gitignore | 2 +
tools/testing/selftests/landlock/Makefile | 24 +
tools/testing/selftests/landlock/base_test.c | 219 ++
tools/testing/selftests/landlock/common.h | 169 ++
tools/testing/selftests/landlock/config | 6 +
tools/testing/selftests/landlock/fs_test.c | 2664 +++++++++++++++++
.../testing/selftests/landlock/ptrace_test.c | 314 ++
tools/testing/selftests/landlock/true.c | 5 +
10 files changed, 3405 insertions(+)
create mode 100644 tools/testing/selftests/landlock/.gitignore
create mode 100644 tools/testing/selftests/landlock/Makefile
create mode 100644 tools/testing/selftests/landlock/base_test.c
create mode 100644 tools/testing/selftests/landlock/common.h
create mode 100644 tools/testing/selftests/landlock/config
create mode 100644 tools/testing/selftests/landlock/fs_test.c
create mode 100644 tools/testing/selftests/landlock/ptrace_test.c
create mode 100644 tools/testing/selftests/landlock/true.c

diff --git a/MAINTAINERS b/MAINTAINERS
index ffb6f7ac526a..3df7b12dc7f1 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -9944,6 +9944,7 @@ W: https://landlock.io
T: git https://github.com/landlock-lsm/linux.git
F: include/uapi/linux/landlock.h
F: security/landlock/
+F: tools/testing/selftests/landlock/
K: landlock
K: LANDLOCK

diff --git a/tools/testing/selftests/Makefile b/tools/testing/selftests/Makefile
index 8a917cb4426a..0b6ca165774b 100644
--- a/tools/testing/selftests/Makefile
+++ b/tools/testing/selftests/Makefile
@@ -25,6 +25,7 @@ TARGETS += ir
TARGETS += kcmp
TARGETS += kexec
TARGETS += kvm
+TARGETS += landlock
TARGETS += lib
TARGETS += livepatch
TARGETS += lkdtm
diff --git a/tools/testing/selftests/landlock/.gitignore b/tools/testing/selftests/landlock/.gitignore
new file mode 100644
index 000000000000..470203a7cd73
--- /dev/null
+++ b/tools/testing/selftests/landlock/.gitignore
@@ -0,0 +1,2 @@
+/*_test
+/true
diff --git a/tools/testing/selftests/landlock/Makefile b/tools/testing/selftests/landlock/Makefile
new file mode 100644
index 000000000000..a99596ca9882
--- /dev/null
+++ b/tools/testing/selftests/landlock/Makefile
@@ -0,0 +1,24 @@
+# SPDX-License-Identifier: GPL-2.0
+
+CFLAGS += -Wall -O2
+
+src_test := $(wildcard *_test.c)
+
+TEST_GEN_PROGS := $(src_test:.c=)
+
+TEST_GEN_PROGS_EXTENDED := true
+
+KSFT_KHDR_INSTALL := 1
+OVERRIDE_TARGETS := 1
+include ../lib.mk
+
+khdr_dir = $(top_srcdir)/usr/include
+
+$(khdr_dir)/linux/landlock.h: khdr
+ @:
+
+$(OUTPUT)/true: true.c
+ $(LINK.c) $< $(LDLIBS) -o $@ -static
+
+$(OUTPUT)/%_test: %_test.c $(khdr_dir)/linux/landlock.h ../kselftest_harness.h common.h
+ $(LINK.c) $< $(LDLIBS) -o $@ -lcap -I$(khdr_dir)
diff --git a/tools/testing/selftests/landlock/base_test.c b/tools/testing/selftests/landlock/base_test.c
new file mode 100644
index 000000000000..d4bed665ed0a
--- /dev/null
+++ b/tools/testing/selftests/landlock/base_test.c
@@ -0,0 +1,219 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Landlock tests - Common user space base
+ *
+ * Copyright © 2017-2020 Mickaël Salaün <[email protected]>
+ * Copyright © 2019-2020 ANSSI
+ */
+
+#define _GNU_SOURCE
+#include <errno.h>
+#include <fcntl.h>
+#include <linux/landlock.h>
+#include <string.h>
+#include <sys/prctl.h>
+#include <sys/socket.h>
+#include <sys/types.h>
+
+#include "common.h"
+
+#ifndef O_PATH
+#define O_PATH 010000000
+#endif
+
+TEST(inconsistent_attr) {
+ const long page_size = sysconf(_SC_PAGESIZE);
+ char *const buf = malloc(page_size + 1);
+ struct landlock_ruleset_attr *const ruleset_attr = (void *)buf;
+
+ ASSERT_NE(NULL, buf);
+
+ /* Checks copy_from_user(). */
+ ASSERT_EQ(-1, landlock_create_ruleset(ruleset_attr, 0, 0));
+ /* The size is less than sizeof(struct landlock_ruleset_attr). */
+ ASSERT_EQ(EINVAL, errno);
+ ASSERT_EQ(-1, landlock_create_ruleset(ruleset_attr, 1, 0));
+ ASSERT_EQ(EINVAL, errno);
+
+ ASSERT_EQ(-1, landlock_create_ruleset(NULL, 1, 0));
+ /* A NULL attribute pointer is rejected with EFAULT. */
+ ASSERT_EQ(EFAULT, errno);
+
+ ASSERT_EQ(-1, landlock_create_ruleset(NULL,
+ sizeof(struct landlock_ruleset_attr), 0));
+ ASSERT_EQ(EFAULT, errno);
+
+ ASSERT_EQ(-1, landlock_create_ruleset(ruleset_attr, page_size + 1, 0));
+ ASSERT_EQ(E2BIG, errno);
+
+ ASSERT_EQ(-1, landlock_create_ruleset(ruleset_attr,
+ sizeof(struct landlock_ruleset_attr), 0));
+ ASSERT_EQ(ENOMSG, errno);
+ ASSERT_EQ(-1, landlock_create_ruleset(ruleset_attr, page_size, 0));
+ ASSERT_EQ(ENOMSG, errno);
+
+ /* Checks non-zero value. */
+ buf[page_size - 2] = '.';
+ ASSERT_EQ(-1, landlock_create_ruleset(ruleset_attr, page_size, 0));
+ ASSERT_EQ(E2BIG, errno);
+
+ ASSERT_EQ(-1, landlock_create_ruleset(ruleset_attr, page_size + 1, 0));
+ ASSERT_EQ(E2BIG, errno);
+
+ free(buf);
+}
+
+TEST(empty_path_beneath_attr) {
+ const struct landlock_ruleset_attr ruleset_attr = {
+ .handled_access_fs = LANDLOCK_ACCESS_FS_EXECUTE,
+ };
+ const int ruleset_fd = landlock_create_ruleset(&ruleset_attr,
+ sizeof(ruleset_attr), 0);
+
+ ASSERT_LE(0, ruleset_fd);
+
+ /* Similar to struct landlock_path_beneath_attr.parent_fd = 0 */
+ ASSERT_EQ(-1, landlock_add_rule(ruleset_fd, LANDLOCK_RULE_PATH_BENEATH,
+ NULL, 0));
+ ASSERT_EQ(EFAULT, errno);
+ ASSERT_EQ(0, close(ruleset_fd));
+}
+
+TEST(inval_fd_enforce) {
+ ASSERT_EQ(0, prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0));
+
+ ASSERT_EQ(-1, landlock_restrict_self(-1, 0));
+ ASSERT_EQ(EBADF, errno);
+}
+
+TEST(unpriv_enforce_without_no_new_privs) {
+ int err;
+
+ disable_caps(_metadata);
+ err = landlock_restrict_self(-1, 0);
+ ASSERT_EQ(EPERM, errno);
+ ASSERT_EQ(err, -1);
+}
+
+TEST(ruleset_fd_io)
+{
+ struct landlock_ruleset_attr ruleset_attr = {
+ .handled_access_fs = LANDLOCK_ACCESS_FS_READ_FILE,
+ };
+ int ruleset_fd;
+ char buf;
+
+ disable_caps(_metadata);
+ ruleset_fd = landlock_create_ruleset(&ruleset_attr,
+ sizeof(ruleset_attr), 0);
+ ASSERT_LE(0, ruleset_fd);
+
+ ASSERT_EQ(-1, write(ruleset_fd, ".", 1));
+ ASSERT_EQ(EINVAL, errno);
+ ASSERT_EQ(-1, read(ruleset_fd, &buf, 1));
+ ASSERT_EQ(EINVAL, errno);
+
+ ASSERT_EQ(0, close(ruleset_fd));
+}
+
+/* Tests enforcement of a ruleset FD transferred through a UNIX socket. */
+TEST(ruleset_fd_transfer)
+{
+ struct landlock_ruleset_attr ruleset_attr = {
+ .handled_access_fs = LANDLOCK_ACCESS_FS_READ_DIR,
+ };
+ struct landlock_path_beneath_attr path_beneath_attr = {
+ .allowed_access = LANDLOCK_ACCESS_FS_READ_DIR,
+ };
+ int ruleset_fd_tx, dir_fd;
+ union {
+ /* Aligned ancillary data buffer. */
+ char buf[CMSG_SPACE(sizeof(ruleset_fd_tx))];
+ struct cmsghdr _align;
+ } cmsg_tx = {};
+ char data_tx = '.';
+ struct iovec io = {
+ .iov_base = &data_tx,
+ .iov_len = sizeof(data_tx),
+ };
+ struct msghdr msg = {
+ .msg_iov = &io,
+ .msg_iovlen = 1,
+ .msg_control = &cmsg_tx.buf,
+ .msg_controllen = sizeof(cmsg_tx.buf),
+ };
+ struct cmsghdr *cmsg;
+ int socket_fds[2];
+ pid_t child;
+ int status;
+
+ disable_caps(_metadata);
+
+ /* Creates a test ruleset with a simple rule. */
+ ruleset_fd_tx = landlock_create_ruleset(&ruleset_attr,
+ sizeof(ruleset_attr), 0);
+ ASSERT_LE(0, ruleset_fd_tx);
+ path_beneath_attr.parent_fd = open("/tmp", O_PATH | O_NOFOLLOW |
+ O_DIRECTORY | O_CLOEXEC);
+ ASSERT_LE(0, path_beneath_attr.parent_fd);
+ ASSERT_EQ(0, landlock_add_rule(ruleset_fd_tx, LANDLOCK_RULE_PATH_BENEATH,
+ &path_beneath_attr, 0));
+ ASSERT_EQ(0, close(path_beneath_attr.parent_fd));
+
+ cmsg = CMSG_FIRSTHDR(&msg);
+ ASSERT_NE(NULL, cmsg);
+ cmsg->cmsg_len = CMSG_LEN(sizeof(ruleset_fd_tx));
+ cmsg->cmsg_level = SOL_SOCKET;
+ cmsg->cmsg_type = SCM_RIGHTS;
+ memcpy(CMSG_DATA(cmsg), &ruleset_fd_tx, sizeof(ruleset_fd_tx));
+
+ /* Sends the ruleset FD over a socketpair and then closes it. */
+ ASSERT_EQ(0, socketpair(AF_UNIX, SOCK_STREAM | SOCK_CLOEXEC, 0, socket_fds));
+ ASSERT_EQ(sizeof(data_tx), sendmsg(socket_fds[0], &msg, 0));
+ ASSERT_EQ(0, close(socket_fds[0]));
+ ASSERT_EQ(0, close(ruleset_fd_tx));
+
+ child = fork();
+ ASSERT_LE(0, child);
+ if (child == 0) {
+ int ruleset_fd_rx;
+
+ *(char *)msg.msg_iov->iov_base = '\0';
+ ASSERT_EQ(sizeof(data_tx), recvmsg(socket_fds[1], &msg, MSG_CMSG_CLOEXEC));
+ ASSERT_EQ('.', *(char *)msg.msg_iov->iov_base);
+ ASSERT_EQ(0, close(socket_fds[1]));
+ cmsg = CMSG_FIRSTHDR(&msg);
+ ASSERT_EQ(cmsg->cmsg_len, CMSG_LEN(sizeof(ruleset_fd_tx)));
+ memcpy(&ruleset_fd_rx, CMSG_DATA(cmsg), sizeof(ruleset_fd_tx));
+
+ /* Enforces the received ruleset on the child. */
+ ASSERT_EQ(0, prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0));
+ ASSERT_EQ(0, landlock_restrict_self(ruleset_fd_rx, 0));
+ ASSERT_EQ(0, close(ruleset_fd_rx));
+
+ /* Checks the ruleset enforcement. */
+ ASSERT_EQ(-1, open("/", O_RDONLY | O_DIRECTORY | O_CLOEXEC));
+ ASSERT_EQ(EACCES, errno);
+ dir_fd = open("/tmp", O_RDONLY | O_DIRECTORY | O_CLOEXEC);
+ ASSERT_LE(0, dir_fd);
+ ASSERT_EQ(0, close(dir_fd));
+ _exit(_metadata->passed ? EXIT_SUCCESS : EXIT_FAILURE);
+ return;
+ }
+
+ ASSERT_EQ(0, close(socket_fds[1]));
+
+ /* Checks that the parent is unrestricted. */
+ dir_fd = open("/", O_RDONLY | O_DIRECTORY | O_CLOEXEC);
+ ASSERT_LE(0, dir_fd);
+ ASSERT_EQ(0, close(dir_fd));
+ dir_fd = open("/tmp", O_RDONLY | O_DIRECTORY | O_CLOEXEC);
+ ASSERT_LE(0, dir_fd);
+ ASSERT_EQ(0, close(dir_fd));
+
+ ASSERT_EQ(child, waitpid(child, &status, 0));
+ ASSERT_EQ(1, WIFEXITED(status));
+ ASSERT_EQ(EXIT_SUCCESS, WEXITSTATUS(status));
+}
+
+TEST_HARNESS_MAIN
diff --git a/tools/testing/selftests/landlock/common.h b/tools/testing/selftests/landlock/common.h
new file mode 100644
index 000000000000..3c4303fd7ad1
--- /dev/null
+++ b/tools/testing/selftests/landlock/common.h
@@ -0,0 +1,169 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Landlock test helpers
+ *
+ * Copyright © 2017-2020 Mickaël Salaün <[email protected]>
+ * Copyright © 2019-2020 ANSSI
+ * Copyright © 2021 Microsoft Corporation
+ */
+
+#include <errno.h>
+#include <linux/landlock.h>
+#include <sys/capability.h>
+#include <sys/syscall.h>
+#include <sys/types.h>
+#include <sys/wait.h>
+#include <unistd.h>
+
+#include "../kselftest_harness.h"
+
+#ifndef ARRAY_SIZE
+#define ARRAY_SIZE(x) (sizeof(x) / sizeof((x)[0]))
+#endif
+
+/*
+ * TEST_F_FORK() is useful when a test drops privileges but the corresponding
+ * FIXTURE_TEARDOWN() requires them (e.g. to remove files from a directory
+ * where write actions are denied). For convenience, FIXTURE_TEARDOWN() is
+ * also called when the test failed, but not when FIXTURE_SETUP() failed. For
+ * this to be possible, we must not call abort() but instead exit smoothly
+ * (hence the step print).
+ */
+#define TEST_F_FORK(fixture_name, test_name) \
+ static void fixture_name##_##test_name##_child( \
+ struct __test_metadata *_metadata, \
+ FIXTURE_DATA(fixture_name) *self, \
+ const FIXTURE_VARIANT(fixture_name) *variant); \
+ TEST_F(fixture_name, test_name) \
+ { \
+ int status; \
+ const pid_t child = fork(); \
+ if (child < 0) \
+ abort(); \
+ if (child == 0) { \
+ _metadata->no_print = 1; \
+ fixture_name##_##test_name##_child(_metadata, self, variant); \
+ if (_metadata->skip) \
+ _exit(255); \
+ if (_metadata->passed) \
+ _exit(0); \
+ _exit(_metadata->step); \
+ } \
+ if (child != waitpid(child, &status, 0)) \
+ abort(); \
+ if (WIFSIGNALED(status) || !WIFEXITED(status)) { \
+ _metadata->passed = 0; \
+ _metadata->step = 1; \
+ return; \
+ } \
+ switch (WEXITSTATUS(status)) { \
+ case 0: \
+ _metadata->passed = 1; \
+ break; \
+ case 255: \
+ _metadata->passed = 1; \
+ _metadata->skip = 1; \
+ break; \
+ default: \
+ _metadata->passed = 0; \
+ _metadata->step = WEXITSTATUS(status); \
+ break; \
+ } \
+ } \
+ static void fixture_name##_##test_name##_child( \
+ struct __test_metadata __attribute__((unused)) *_metadata, \
+ FIXTURE_DATA(fixture_name) __attribute__((unused)) *self, \
+ const FIXTURE_VARIANT(fixture_name) \
+ __attribute__((unused)) *variant)
+
+#ifndef landlock_create_ruleset
+static inline int landlock_create_ruleset(
+ const struct landlock_ruleset_attr *const attr,
+ const size_t size, const __u32 flags)
+{
+ return syscall(__NR_landlock_create_ruleset, attr, size, flags);
+}
+#endif
+
+#ifndef landlock_add_rule
+static inline int landlock_add_rule(const int ruleset_fd,
+ const enum landlock_rule_type rule_type,
+ const void *const rule_attr, const __u32 flags)
+{
+ return syscall(__NR_landlock_add_rule, ruleset_fd, rule_type,
+ rule_attr, flags);
+}
+#endif
+
+#ifndef landlock_restrict_self
+static inline int landlock_restrict_self(const int ruleset_fd,
+ const __u32 flags)
+{
+ return syscall(__NR_landlock_restrict_self, ruleset_fd, flags);
+}
+#endif
+
+static void disable_caps(struct __test_metadata *const _metadata)
+{
+ cap_t cap_p;
+ /* Only these capabilities are useful for the tests. */
+ const cap_value_t caps[] = {
+ CAP_DAC_OVERRIDE,
+ CAP_MKNOD,
+ CAP_SYS_ADMIN,
+ CAP_SYS_CHROOT,
+ };
+
+ cap_p = cap_get_proc();
+ EXPECT_NE(NULL, cap_p) {
+ TH_LOG("Failed to cap_get_proc: %s", strerror(errno));
+ }
+ EXPECT_NE(-1, cap_clear(cap_p)) {
+ TH_LOG("Failed to cap_clear: %s", strerror(errno));
+ }
+ EXPECT_NE(-1, cap_set_flag(cap_p, CAP_PERMITTED, ARRAY_SIZE(caps),
+ caps, CAP_SET)) {
+ TH_LOG("Failed to cap_set_flag: %s", strerror(errno));
+ }
+ EXPECT_NE(-1, cap_set_proc(cap_p)) {
+ TH_LOG("Failed to cap_set_proc: %s", strerror(errno));
+ }
+ EXPECT_NE(-1, cap_free(cap_p)) {
+ TH_LOG("Failed to cap_free: %s", strerror(errno));
+ }
+}
+
+static void effective_cap(struct __test_metadata *const _metadata,
+ const cap_value_t caps, const cap_flag_value_t value)
+{
+ cap_t cap_p;
+
+ cap_p = cap_get_proc();
+ EXPECT_NE(NULL, cap_p) {
+ TH_LOG("Failed to cap_get_proc: %s", strerror(errno));
+ }
+ EXPECT_NE(-1, cap_set_flag(cap_p, CAP_EFFECTIVE, 1, &caps, value)) {
+ TH_LOG("Failed to cap_set_flag: %s", strerror(errno));
+ }
+ EXPECT_NE(-1, cap_set_proc(cap_p)) {
+ TH_LOG("Failed to cap_set_proc: %s", strerror(errno));
+ }
+ EXPECT_NE(-1, cap_free(cap_p)) {
+ TH_LOG("Failed to cap_free: %s", strerror(errno));
+ }
+}
+
+/* We cannot put such helpers in a library because of kselftest_harness.h . */
+__attribute__((__unused__))
+static void set_cap(struct __test_metadata *const _metadata,
+ const cap_value_t caps)
+{
+ effective_cap(_metadata, caps, CAP_SET);
+}
+
+__attribute__((__unused__))
+static void clear_cap(struct __test_metadata *const _metadata,
+ const cap_value_t caps)
+{
+ effective_cap(_metadata, caps, CAP_CLEAR);
+}
diff --git a/tools/testing/selftests/landlock/config b/tools/testing/selftests/landlock/config
new file mode 100644
index 000000000000..7122532c673b
--- /dev/null
+++ b/tools/testing/selftests/landlock/config
@@ -0,0 +1,6 @@
+CONFIG_OVERLAY_FS=y
+CONFIG_SECURITY_LANDLOCK=y
+CONFIG_SECURITY_PATH=y
+CONFIG_SECURITY=y
+CONFIG_SHMEM=y
+CONFIG_TMPFS=y
diff --git a/tools/testing/selftests/landlock/fs_test.c b/tools/testing/selftests/landlock/fs_test.c
new file mode 100644
index 000000000000..fa7cde4edbfa
--- /dev/null
+++ b/tools/testing/selftests/landlock/fs_test.c
@@ -0,0 +1,2664 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Landlock tests - Filesystem
+ *
+ * Copyright © 2017-2020 Mickaël Salaün <[email protected]>
+ * Copyright © 2020 ANSSI
+ * Copyright © 2020-2021 Microsoft Corporation
+ */
+
+#define _GNU_SOURCE
+#include <fcntl.h>
+#include <linux/landlock.h>
+#include <sched.h>
+#include <string.h>
+#include <sys/capability.h>
+#include <sys/mount.h>
+#include <sys/prctl.h>
+#include <sys/sendfile.h>
+#include <sys/stat.h>
+#include <sys/sysmacros.h>
+#include <unistd.h>
+
+#include "common.h"
+
+#define TMP_DIR "tmp"
+#define BINARY_PATH "./true"
+
+/* Paths (sibling number and depth) */
+static const char dir_s1d1[] = TMP_DIR "/s1d1";
+static const char file1_s1d1[] = TMP_DIR "/s1d1/f1";
+static const char file2_s1d1[] = TMP_DIR "/s1d1/f2";
+static const char dir_s1d2[] = TMP_DIR "/s1d1/s1d2";
+static const char file1_s1d2[] = TMP_DIR "/s1d1/s1d2/f1";
+static const char file2_s1d2[] = TMP_DIR "/s1d1/s1d2/f2";
+static const char dir_s1d3[] = TMP_DIR "/s1d1/s1d2/s1d3";
+static const char file1_s1d3[] = TMP_DIR "/s1d1/s1d2/s1d3/f1";
+static const char file2_s1d3[] = TMP_DIR "/s1d1/s1d2/s1d3/f2";
+
+static const char dir_s2d1[] = TMP_DIR "/s2d1";
+static const char file1_s2d1[] = TMP_DIR "/s2d1/f1";
+static const char dir_s2d2[] = TMP_DIR "/s2d1/s2d2";
+static const char file1_s2d2[] = TMP_DIR "/s2d1/s2d2/f1";
+static const char dir_s2d3[] = TMP_DIR "/s2d1/s2d2/s2d3";
+static const char file1_s2d3[] = TMP_DIR "/s2d1/s2d2/s2d3/f1";
+static const char file2_s2d3[] = TMP_DIR "/s2d1/s2d2/s2d3/f2";
+
+static const char dir_s3d1[] = TMP_DIR "/s3d1";
+/* dir_s3d2 is a mount point. */
+static const char dir_s3d2[] = TMP_DIR "/s3d1/s3d2";
+static const char dir_s3d3[] = TMP_DIR "/s3d1/s3d2/s3d3";
+
+/*
+ * layout1 hierarchy:
+ *
+ * tmp
+ * ├── s1d1
+ * │   ├── f1
+ * │   ├── f2
+ * │   └── s1d2
+ * │   ├── f1
+ * │   ├── f2
+ * │   └── s1d3
+ * │   ├── f1
+ * │   └── f2
+ * ├── s2d1
+ * │   ├── f1
+ * │   └── s2d2
+ * │   ├── f1
+ * │   └── s2d3
+ * │   ├── f1
+ * │   └── f2
+ * └── s3d1
+ * └── s3d2
+ * └── s3d3
+ */
+
+static void mkdir_parents(struct __test_metadata *const _metadata,
+ const char *const path)
+{
+ char *walker;
+ const char *parent;
+ int i, err;
+
+ ASSERT_NE(path[0], '\0');
+ walker = strdup(path);
+ ASSERT_NE(NULL, walker);
+ parent = walker;
+ for (i = 1; walker[i]; i++) {
+ if (walker[i] != '/')
+ continue;
+ walker[i] = '\0';
+ err = mkdir(parent, 0700);
+ ASSERT_FALSE(err && errno != EEXIST) {
+ TH_LOG("Failed to create directory \"%s\": %s",
+ parent, strerror(errno));
+ }
+ walker[i] = '/';
+ }
+ free(walker);
+}
+
+static void create_directory(struct __test_metadata *const _metadata,
+ const char *const path)
+{
+ mkdir_parents(_metadata, path);
+ ASSERT_EQ(0, mkdir(path, 0700)) {
+ TH_LOG("Failed to create directory \"%s\": %s", path,
+ strerror(errno));
+ }
+}
+
+static void create_file(struct __test_metadata *const _metadata,
+ const char *const path)
+{
+ mkdir_parents(_metadata, path);
+ ASSERT_EQ(0, mknod(path, S_IFREG | 0700, 0)) {
+ TH_LOG("Failed to create file \"%s\": %s", path,
+ strerror(errno));
+ }
+}
+
+static int remove_path(const char *const path)
+{
+ char *walker;
+ int i, ret, err = 0;
+
+ walker = strdup(path);
+ if (!walker) {
+ err = ENOMEM;
+ goto out;
+ }
+ if (unlink(path) && rmdir(path)) {
+ if (errno != ENOENT)
+ err = errno;
+ goto out;
+ }
+ for (i = strlen(walker); i > 0; i--) {
+ if (walker[i] != '/')
+ continue;
+ walker[i] = '\0';
+ ret = rmdir(walker);
+ if (ret) {
+ if (errno != ENOTEMPTY && errno != EBUSY)
+ err = errno;
+ goto out;
+ }
+ if (strcmp(walker, TMP_DIR) == 0)
+ goto out;
+ }
+
+out:
+ free(walker);
+ return err;
+}
+
+static void create_layout1(struct __test_metadata *const _metadata)
+{
+ /* Do not pollute the rest of the system. */
+ set_cap(_metadata, CAP_SYS_ADMIN);
+ ASSERT_EQ(0, unshare(CLONE_NEWNS));
+ clear_cap(_metadata, CAP_SYS_ADMIN);
+ umask(0077);
+
+ create_file(_metadata, file1_s1d1);
+ create_file(_metadata, file1_s1d2);
+ create_file(_metadata, file1_s1d3);
+ create_file(_metadata, file2_s1d1);
+ create_file(_metadata, file2_s1d2);
+ create_file(_metadata, file2_s1d3);
+
+ create_file(_metadata, file1_s2d1);
+ create_file(_metadata, file1_s2d2);
+ create_file(_metadata, file1_s2d3);
+ create_file(_metadata, file2_s2d3);
+
+ create_directory(_metadata, dir_s3d2);
+ set_cap(_metadata, CAP_SYS_ADMIN);
+ ASSERT_EQ(0, mount("tmp", dir_s3d2, "tmpfs", 0, "size=4m,mode=700"));
+ clear_cap(_metadata, CAP_SYS_ADMIN);
+
+ ASSERT_EQ(0, mkdir(dir_s3d3, 0700));
+}
+
+static void remove_layout1(struct __test_metadata *const _metadata)
+{
+ EXPECT_EQ(0, remove_path(file2_s1d3));
+ EXPECT_EQ(0, remove_path(file2_s1d2));
+ EXPECT_EQ(0, remove_path(file2_s1d1));
+ EXPECT_EQ(0, remove_path(file1_s1d3));
+ EXPECT_EQ(0, remove_path(file1_s1d2));
+ EXPECT_EQ(0, remove_path(file1_s1d1));
+
+ EXPECT_EQ(0, remove_path(file2_s2d3));
+ EXPECT_EQ(0, remove_path(file1_s2d3));
+ EXPECT_EQ(0, remove_path(file1_s2d2));
+ EXPECT_EQ(0, remove_path(file1_s2d1));
+
+ EXPECT_EQ(0, remove_path(dir_s3d3));
+ set_cap(_metadata, CAP_SYS_ADMIN);
+ umount(dir_s3d2);
+ clear_cap(_metadata, CAP_SYS_ADMIN);
+ EXPECT_EQ(0, remove_path(dir_s3d2));
+
+ EXPECT_EQ(0, remove_path(TMP_DIR));
+}
+
+FIXTURE(layout1) {
+};
+
+FIXTURE_SETUP(layout1)
+{
+ disable_caps(_metadata);
+ create_layout1(_metadata);
+}
+
+FIXTURE_TEARDOWN(layout1)
+{
+ remove_layout1(_metadata);
+}
+
+/*
+ * This helper makes it possible to use the ASSERT_* macros and to print the
+ * line number pointing to the test caller.
+ */
+static int test_open_rel(const int dirfd, const char *const path, const int flags)
+{
+ int fd;
+
+ /* Works with files and directories. */
+ fd = openat(dirfd, path, flags | O_CLOEXEC);
+ if (fd < 0)
+ return errno;
+ if (close(fd) == 0)
+ return 0;
+ /*
+ * Mixing error codes from close(2) and open(2) should not lead to any
+ * (access type) confusion for this test.
+ */
+ return errno;
+}
+
+static int test_open(const char *const path, const int flags)
+{
+ return test_open_rel(AT_FDCWD, path, flags);
+}
+
+TEST_F_FORK(layout1, no_restriction)
+{
+ ASSERT_EQ(0, test_open(dir_s1d1, O_RDONLY));
+ ASSERT_EQ(0, test_open(file1_s1d1, O_RDONLY));
+ ASSERT_EQ(0, test_open(file2_s1d1, O_RDONLY));
+ ASSERT_EQ(0, test_open(dir_s1d2, O_RDONLY));
+ ASSERT_EQ(0, test_open(file1_s1d2, O_RDONLY));
+ ASSERT_EQ(0, test_open(file2_s1d2, O_RDONLY));
+ ASSERT_EQ(0, test_open(dir_s1d3, O_RDONLY));
+ ASSERT_EQ(0, test_open(file1_s1d3, O_RDONLY));
+
+ ASSERT_EQ(0, test_open(dir_s2d1, O_RDONLY));
+ ASSERT_EQ(0, test_open(file1_s2d1, O_RDONLY));
+ ASSERT_EQ(0, test_open(dir_s2d2, O_RDONLY));
+ ASSERT_EQ(0, test_open(file1_s2d2, O_RDONLY));
+ ASSERT_EQ(0, test_open(dir_s2d3, O_RDONLY));
+ ASSERT_EQ(0, test_open(file1_s2d3, O_RDONLY));
+
+ ASSERT_EQ(0, test_open(dir_s3d1, O_RDONLY));
+ ASSERT_EQ(0, test_open(dir_s3d2, O_RDONLY));
+ ASSERT_EQ(0, test_open(dir_s3d3, O_RDONLY));
+}
+
+TEST_F_FORK(layout1, inval)
+{
+ struct landlock_path_beneath_attr path_beneath = {
+ .allowed_access = LANDLOCK_ACCESS_FS_READ_FILE |
+ LANDLOCK_ACCESS_FS_WRITE_FILE,
+ .parent_fd = -1,
+ };
+ struct landlock_ruleset_attr ruleset_attr = {
+ .handled_access_fs = LANDLOCK_ACCESS_FS_READ_FILE |
+ LANDLOCK_ACCESS_FS_WRITE_FILE,
+ };
+ int ruleset_fd;
+
+ path_beneath.parent_fd = open(dir_s1d2, O_PATH | O_DIRECTORY |
+ O_CLOEXEC);
+ ASSERT_LE(0, path_beneath.parent_fd);
+
+ ruleset_fd = open(dir_s1d1, O_PATH | O_DIRECTORY | O_CLOEXEC);
+ ASSERT_LE(0, ruleset_fd);
+ ASSERT_EQ(-1, landlock_add_rule(ruleset_fd, LANDLOCK_RULE_PATH_BENEATH,
+ &path_beneath, 0));
+ /* Returns EBADF because ruleset_fd was opened with O_PATH. */
+ ASSERT_EQ(EBADF, errno);
+ ASSERT_EQ(0, close(ruleset_fd));
+
+ ruleset_fd = open(dir_s1d1, O_DIRECTORY | O_CLOEXEC);
+ ASSERT_LE(0, ruleset_fd);
+ ASSERT_EQ(-1, landlock_add_rule(ruleset_fd, LANDLOCK_RULE_PATH_BENEATH,
+ &path_beneath, 0));
+ /* Returns EBADFD because ruleset_fd is not a valid ruleset. */
+ ASSERT_EQ(EBADFD, errno);
+ ASSERT_EQ(0, close(ruleset_fd));
+
+ /* Gets a real ruleset. */
+ ruleset_fd = landlock_create_ruleset(&ruleset_attr,
+ sizeof(ruleset_attr), 0);
+ ASSERT_LE(0, ruleset_fd);
+ ASSERT_EQ(0, landlock_add_rule(ruleset_fd, LANDLOCK_RULE_PATH_BENEATH,
+ &path_beneath, 0));
+ ASSERT_EQ(0, close(path_beneath.parent_fd));
+
+ /* Tests without O_PATH. */
+ path_beneath.parent_fd = open(dir_s1d2, O_DIRECTORY | O_CLOEXEC);
+ ASSERT_LE(0, path_beneath.parent_fd);
+ ASSERT_EQ(-1, landlock_add_rule(ruleset_fd, LANDLOCK_RULE_PATH_BENEATH,
+ &path_beneath, 0));
+ ASSERT_EQ(EBADFD, errno);
+ ASSERT_EQ(0, close(path_beneath.parent_fd));
+
+ /* Checks unhandled allowed_access. */
+ path_beneath.parent_fd = open(dir_s1d2, O_PATH | O_DIRECTORY |
+ O_CLOEXEC);
+ ASSERT_LE(0, path_beneath.parent_fd);
+
+ /* Tests with legitimate but unhandled access values. */
+ path_beneath.allowed_access |= LANDLOCK_ACCESS_FS_EXECUTE;
+ ASSERT_EQ(-1, landlock_add_rule(ruleset_fd, LANDLOCK_RULE_PATH_BENEATH,
+ &path_beneath, 0));
+ ASSERT_EQ(EINVAL, errno);
+ path_beneath.allowed_access &= ~LANDLOCK_ACCESS_FS_EXECUTE;
+
+ /* Test with unknown (64-bits) value. */
+ path_beneath.allowed_access |= (1ULL << 60);
+ ASSERT_EQ(-1, landlock_add_rule(ruleset_fd, LANDLOCK_RULE_PATH_BENEATH,
+ &path_beneath, 0));
+ ASSERT_EQ(EINVAL, errno);
+ path_beneath.allowed_access &= ~(1ULL << 60);
+
+ /* Test with no access. */
+ path_beneath.allowed_access = 0;
+ ASSERT_EQ(-1, landlock_add_rule(ruleset_fd, LANDLOCK_RULE_PATH_BENEATH,
+ &path_beneath, 0));
+ ASSERT_EQ(ENOMSG, errno);
+ path_beneath.allowed_access &= ~(1ULL << 60);
+
+ ASSERT_EQ(0, close(path_beneath.parent_fd));
+
+ /* Enforces the ruleset. */
+ ASSERT_EQ(0, prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0));
+ ASSERT_EQ(0, landlock_restrict_self(ruleset_fd, 0));
+
+ ASSERT_EQ(0, close(ruleset_fd));
+}
+
+#define ACCESS_FILE ( \
+ LANDLOCK_ACCESS_FS_EXECUTE | \
+ LANDLOCK_ACCESS_FS_WRITE_FILE | \
+ LANDLOCK_ACCESS_FS_READ_FILE)
+
+#define ACCESS_LAST LANDLOCK_ACCESS_FS_MAKE_SYM
+
+#define ACCESS_ALL ( \
+ ACCESS_FILE | \
+ LANDLOCK_ACCESS_FS_READ_DIR | \
+ LANDLOCK_ACCESS_FS_REMOVE_DIR | \
+ LANDLOCK_ACCESS_FS_REMOVE_FILE | \
+ LANDLOCK_ACCESS_FS_MAKE_CHAR | \
+ LANDLOCK_ACCESS_FS_MAKE_DIR | \
+ LANDLOCK_ACCESS_FS_MAKE_REG | \
+ LANDLOCK_ACCESS_FS_MAKE_SOCK | \
+ LANDLOCK_ACCESS_FS_MAKE_FIFO | \
+ LANDLOCK_ACCESS_FS_MAKE_BLOCK | \
+ ACCESS_LAST)
+
+TEST_F_FORK(layout1, file_access_rights)
+{
+ __u64 access;
+ int err;
+ struct landlock_path_beneath_attr path_beneath = {};
+ struct landlock_ruleset_attr ruleset_attr = {
+ .handled_access_fs = ACCESS_ALL,
+ };
+ const int ruleset_fd = landlock_create_ruleset(&ruleset_attr,
+ sizeof(ruleset_attr), 0);
+
+ ASSERT_LE(0, ruleset_fd);
+
+ /* Tests access rights for files. */
+ path_beneath.parent_fd = open(file1_s1d2, O_PATH | O_CLOEXEC);
+ ASSERT_LE(0, path_beneath.parent_fd);
+ for (access = 1; access <= ACCESS_LAST; access <<= 1) {
+ path_beneath.allowed_access = access;
+ err = landlock_add_rule(ruleset_fd, LANDLOCK_RULE_PATH_BENEATH,
+ &path_beneath, 0);
+ if ((access | ACCESS_FILE) == ACCESS_FILE) {
+ ASSERT_EQ(0, err);
+ } else {
+ ASSERT_EQ(-1, err);
+ ASSERT_EQ(EINVAL, errno);
+ }
+ }
+ ASSERT_EQ(0, close(path_beneath.parent_fd));
+}
+
+static void add_path_beneath(struct __test_metadata *const _metadata,
+ const int ruleset_fd, const __u64 allowed_access,
+ const char *const path)
+{
+ struct landlock_path_beneath_attr path_beneath = {
+ .allowed_access = allowed_access,
+ };
+
+ path_beneath.parent_fd = open(path, O_PATH | O_CLOEXEC);
+ ASSERT_LE(0, path_beneath.parent_fd) {
+ TH_LOG("Failed to open directory \"%s\": %s", path,
+ strerror(errno));
+ }
+ ASSERT_EQ(0, landlock_add_rule(ruleset_fd, LANDLOCK_RULE_PATH_BENEATH,
+ &path_beneath, 0)) {
+ TH_LOG("Failed to update the ruleset with \"%s\": %s", path,
+ strerror(errno));
+ }
+ ASSERT_EQ(0, close(path_beneath.parent_fd));
+}
+
+struct rule {
+ const char *path;
+ __u64 access;
+};
+
+#define ACCESS_RO ( \
+ LANDLOCK_ACCESS_FS_READ_FILE | \
+ LANDLOCK_ACCESS_FS_READ_DIR)
+
+#define ACCESS_RW ( \
+ ACCESS_RO | \
+ LANDLOCK_ACCESS_FS_WRITE_FILE)
+
+static int create_ruleset(struct __test_metadata *const _metadata,
+ const __u64 handled_access_fs, const struct rule rules[])
+{
+ int ruleset_fd, i;
+ struct landlock_ruleset_attr ruleset_attr = {
+ .handled_access_fs = handled_access_fs,
+ };
+
+ ASSERT_NE(NULL, rules) {
+ TH_LOG("No rule list");
+ }
+ ASSERT_NE(NULL, rules[0].path) {
+ TH_LOG("Empty rule list");
+ }
+
+ ruleset_fd = landlock_create_ruleset(&ruleset_attr,
+ sizeof(ruleset_attr), 0);
+ ASSERT_LE(0, ruleset_fd) {
+ TH_LOG("Failed to create a ruleset: %s", strerror(errno));
+ }
+
+ for (i = 0; rules[i].path; i++) {
+ add_path_beneath(_metadata, ruleset_fd, rules[i].access,
+ rules[i].path);
+ }
+ return ruleset_fd;
+}
+
+static void enforce_ruleset(struct __test_metadata *const _metadata,
+ const int ruleset_fd)
+{
+ ASSERT_EQ(0, prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0));
+ ASSERT_EQ(0, landlock_restrict_self(ruleset_fd, 0)) {
+ TH_LOG("Failed to enforce ruleset: %s", strerror(errno));
+ }
+}
+
+TEST_F_FORK(layout1, proc_nsfs)
+{
+ const struct rule rules[] = {
+ {
+ .path = "/dev/null",
+ .access = LANDLOCK_ACCESS_FS_READ_FILE |
+ LANDLOCK_ACCESS_FS_WRITE_FILE,
+ },
+ {}
+ };
+ struct landlock_path_beneath_attr path_beneath;
+ const int ruleset_fd = create_ruleset(_metadata, rules[0].access |
+ LANDLOCK_ACCESS_FS_READ_DIR, rules);
+
+ ASSERT_LE(0, ruleset_fd);
+ ASSERT_EQ(0, test_open("/proc/self/ns/mnt", O_RDONLY));
+
+ enforce_ruleset(_metadata, ruleset_fd);
+
+ ASSERT_EQ(EACCES, test_open("/", O_RDONLY));
+ ASSERT_EQ(EACCES, test_open("/dev", O_RDONLY));
+ ASSERT_EQ(0, test_open("/dev/null", O_RDONLY));
+ ASSERT_EQ(EACCES, test_open("/dev/full", O_RDONLY));
+
+ ASSERT_EQ(EACCES, test_open("/proc", O_RDONLY));
+ ASSERT_EQ(EACCES, test_open("/proc/self", O_RDONLY));
+ ASSERT_EQ(EACCES, test_open("/proc/self/ns", O_RDONLY));
+ /*
+ * Because nsfs is an internal filesystem, /proc/self/ns/mnt is a
+ * disconnected path. Such a path cannot be identified and must then be
+ * allowed.
+ */
+ ASSERT_EQ(0, test_open("/proc/self/ns/mnt", O_RDONLY));
+
+ /*
+ * Checks that it is not possible to add nsfs-like filesystem
+ * references to a ruleset.
+ */
+ path_beneath.allowed_access = LANDLOCK_ACCESS_FS_READ_FILE |
+ LANDLOCK_ACCESS_FS_WRITE_FILE;
+ path_beneath.parent_fd = open("/proc/self/ns/mnt", O_PATH | O_CLOEXEC);
+ ASSERT_LE(0, path_beneath.parent_fd);
+ ASSERT_EQ(-1, landlock_add_rule(ruleset_fd, LANDLOCK_RULE_PATH_BENEATH,
+ &path_beneath, 0));
+ ASSERT_EQ(EBADFD, errno);
+ ASSERT_EQ(0, close(path_beneath.parent_fd));
+}
+
+static void drop_privileges(struct __test_metadata *const _metadata)
+{
+ cap_t caps;
+ const cap_value_t cap_val = CAP_SYS_ADMIN;
+
+ caps = cap_get_proc();
+ ASSERT_NE(NULL, caps);
+ ASSERT_EQ(0, cap_set_flag(caps, CAP_EFFECTIVE, 1, &cap_val,
+ CAP_CLEAR));
+ ASSERT_EQ(0, cap_set_proc(caps));
+ ASSERT_EQ(0, cap_free(caps));
+}
+
+TEST_F_FORK(layout1, unpriv) {
+ const struct rule rules[] = {
+ {
+ .path = dir_s1d2,
+ .access = ACCESS_RO,
+ },
+ {}
+ };
+ int ruleset_fd;
+
+ drop_privileges(_metadata);
+ ruleset_fd = create_ruleset(_metadata, ACCESS_RO, rules);
+ ASSERT_LE(0, ruleset_fd);
+ ASSERT_EQ(-1, landlock_restrict_self(ruleset_fd, 0));
+ ASSERT_EQ(EPERM, errno);
+
+ /* enforce_ruleset() calls prctl(no_new_privs). */
+ enforce_ruleset(_metadata, ruleset_fd);
+ ASSERT_EQ(0, close(ruleset_fd));
+}
+
+TEST_F_FORK(layout1, effective_access)
+{
+ const struct rule rules[] = {
+ {
+ .path = dir_s1d2,
+ .access = ACCESS_RO,
+ },
+ {
+ .path = file1_s2d2,
+ .access = LANDLOCK_ACCESS_FS_READ_FILE |
+ LANDLOCK_ACCESS_FS_WRITE_FILE,
+ },
+ {}
+ };
+ const int ruleset_fd = create_ruleset(_metadata, ACCESS_RW, rules);
+ char buf;
+ int reg_fd;
+
+ ASSERT_LE(0, ruleset_fd);
+ enforce_ruleset(_metadata, ruleset_fd);
+ ASSERT_EQ(0, close(ruleset_fd));
+
+ /* Tests on a directory. */
+ ASSERT_EQ(EACCES, test_open("/", O_RDONLY));
+ ASSERT_EQ(EACCES, test_open(dir_s1d1, O_RDONLY));
+ ASSERT_EQ(EACCES, test_open(file1_s1d1, O_RDONLY));
+ ASSERT_EQ(0, test_open(dir_s1d2, O_RDONLY));
+ ASSERT_EQ(0, test_open(file1_s1d2, O_RDONLY));
+ ASSERT_EQ(0, test_open(dir_s1d3, O_RDONLY));
+ ASSERT_EQ(0, test_open(file1_s1d3, O_RDONLY));
+
+ /* Tests on a file. */
+ ASSERT_EQ(EACCES, test_open(dir_s2d2, O_RDONLY));
+ ASSERT_EQ(0, test_open(file1_s2d2, O_RDONLY));
+
+ /* Checks effective read and write actions. */
+ reg_fd = open(file1_s2d2, O_RDWR | O_CLOEXEC);
+ ASSERT_LE(0, reg_fd);
+ ASSERT_EQ(1, write(reg_fd, ".", 1));
+ ASSERT_LE(0, lseek(reg_fd, 0, SEEK_SET));
+ ASSERT_EQ(1, read(reg_fd, &buf, 1));
+ ASSERT_EQ('.', buf);
+ ASSERT_EQ(0, close(reg_fd));
+
+ /* Just in case, double-checks effective actions. */
+ reg_fd = open(file1_s2d2, O_RDONLY | O_CLOEXEC);
+ ASSERT_LE(0, reg_fd);
+ ASSERT_EQ(-1, write(reg_fd, &buf, 1));
+ ASSERT_EQ(EBADF, errno);
+ ASSERT_EQ(0, close(reg_fd));
+}
+
+TEST_F_FORK(layout1, unhandled_access)
+{
+ const struct rule rules[] = {
+ {
+ .path = dir_s1d2,
+ .access = ACCESS_RO,
+ },
+ {}
+ };
+ /* Here, we only handle read accesses, not write accesses. */
+ const int ruleset_fd = create_ruleset(_metadata, ACCESS_RO, rules);
+
+ ASSERT_LE(0, ruleset_fd);
+ enforce_ruleset(_metadata, ruleset_fd);
+ ASSERT_EQ(0, close(ruleset_fd));
+
+ /*
+ * Because the policy does not handle LANDLOCK_ACCESS_FS_WRITE_FILE,
+ * opening for write-only should be allowed, but not read-write.
+ */
+ ASSERT_EQ(0, test_open(file1_s1d1, O_WRONLY));
+ ASSERT_EQ(EACCES, test_open(file1_s1d1, O_RDWR));
+
+ ASSERT_EQ(0, test_open(file1_s1d2, O_WRONLY));
+ ASSERT_EQ(0, test_open(file1_s1d2, O_RDWR));
+}
+
+TEST_F_FORK(layout1, ruleset_overlap)
+{
+ const struct rule rules[] = {
+ /* These rules should be ORed together. */
+ {
+ .path = dir_s1d2,
+ .access = LANDLOCK_ACCESS_FS_READ_FILE |
+ LANDLOCK_ACCESS_FS_WRITE_FILE,
+ },
+ {
+ .path = dir_s1d2,
+ .access = LANDLOCK_ACCESS_FS_READ_FILE |
+ LANDLOCK_ACCESS_FS_READ_DIR,
+ },
+ {}
+ };
+ const int ruleset_fd = create_ruleset(_metadata, ACCESS_RW, rules);
+
+ ASSERT_LE(0, ruleset_fd);
+ enforce_ruleset(_metadata, ruleset_fd);
+ ASSERT_EQ(0, close(ruleset_fd));
+
+ /* Checks s1d1 hierarchy. */
+ ASSERT_EQ(EACCES, test_open(file1_s1d1, O_RDONLY));
+ ASSERT_EQ(EACCES, test_open(file1_s1d1, O_WRONLY));
+ ASSERT_EQ(EACCES, test_open(file1_s1d1, O_RDWR));
+ ASSERT_EQ(EACCES, test_open(dir_s1d1, O_RDONLY | O_DIRECTORY));
+
+ /* Checks s1d2 hierarchy. */
+ ASSERT_EQ(0, test_open(file1_s1d2, O_RDONLY));
+ ASSERT_EQ(0, test_open(file1_s1d2, O_WRONLY));
+ ASSERT_EQ(0, test_open(file1_s1d2, O_RDWR));
+ ASSERT_EQ(0, test_open(dir_s1d2, O_RDONLY | O_DIRECTORY));
+
+ /* Checks s1d3 hierarchy. */
+ ASSERT_EQ(0, test_open(file1_s1d3, O_RDONLY));
+ ASSERT_EQ(0, test_open(file1_s1d3, O_WRONLY));
+ ASSERT_EQ(0, test_open(file1_s1d3, O_RDWR));
+ ASSERT_EQ(0, test_open(dir_s1d3, O_RDONLY | O_DIRECTORY));
+}
+
+TEST_F_FORK(layout1, non_overlapping_accesses)
+{
+ const struct rule layer1[] = {
+ {
+ .path = dir_s1d2,
+ .access = LANDLOCK_ACCESS_FS_MAKE_REG,
+ },
+ {}
+ };
+ const struct rule layer2[] = {
+ {
+ .path = dir_s1d3,
+ .access = LANDLOCK_ACCESS_FS_REMOVE_FILE,
+ },
+ {}
+ };
+ int ruleset_fd;
+
+ ASSERT_EQ(0, unlink(file1_s1d1));
+ ASSERT_EQ(0, unlink(file1_s1d2));
+
+ ruleset_fd = create_ruleset(_metadata, LANDLOCK_ACCESS_FS_MAKE_REG,
+ layer1);
+ ASSERT_LE(0, ruleset_fd);
+ enforce_ruleset(_metadata, ruleset_fd);
+ ASSERT_EQ(0, close(ruleset_fd));
+
+ ASSERT_EQ(-1, mknod(file1_s1d1, S_IFREG | 0700, 0));
+ ASSERT_EQ(EACCES, errno);
+ ASSERT_EQ(0, mknod(file1_s1d2, S_IFREG | 0700, 0));
+ ASSERT_EQ(0, unlink(file1_s1d2));
+
+ ruleset_fd = create_ruleset(_metadata, LANDLOCK_ACCESS_FS_REMOVE_FILE,
+ layer2);
+ ASSERT_LE(0, ruleset_fd);
+ enforce_ruleset(_metadata, ruleset_fd);
+ ASSERT_EQ(0, close(ruleset_fd));
+
+ /* Unchanged accesses for file creation. */
+ ASSERT_EQ(-1, mknod(file1_s1d1, S_IFREG | 0700, 0));
+ ASSERT_EQ(EACCES, errno);
+ ASSERT_EQ(0, mknod(file1_s1d2, S_IFREG | 0700, 0));
+
+ /* Checks file removal. */
+ ASSERT_EQ(-1, unlink(file1_s1d2));
+ ASSERT_EQ(EACCES, errno);
+ ASSERT_EQ(0, unlink(file1_s1d3));
+}
+
+TEST_F_FORK(layout1, interleaved_masked_accesses)
+{
+ /*
+ * Checks overly restrictive rules:
+ * layer 1: allows R s1d1/s1d2/s1d3/file1
+ * layer 2: allows RW s1d1/s1d2/s1d3
+ * allows W s1d1/s1d2
+ * denies R s1d1/s1d2
+ * layer 3: allows R s1d1
+ * layer 4: allows R s1d1/s1d2
+ * denies W s1d1/s1d2
+ * layer 5: allows R s1d1/s1d2
+ * layer 6: allows X ----
+ * layer 7: allows W s1d1/s1d2
+ * denies R s1d1/s1d2
+ */
+ const struct rule layer1_read[] = {
+ /* Allows read access to file1_s1d3 with the first layer. */
+ {
+ .path = file1_s1d3,
+ .access = LANDLOCK_ACCESS_FS_READ_FILE,
+ },
+ {}
+ };
+ /* First rule with write restrictions. */
+ const struct rule layer2_read_write[] = {
+ /* Start by granting read-write access via its parent directory... */
+ {
+ .path = dir_s1d3,
+ .access = LANDLOCK_ACCESS_FS_READ_FILE |
+ LANDLOCK_ACCESS_FS_WRITE_FILE,
+ },
+ /* ...but also denies read access via its grandparent directory. */
+ {
+ .path = dir_s1d2,
+ .access = LANDLOCK_ACCESS_FS_WRITE_FILE,
+ },
+ {}
+ };
+ const struct rule layer3_read[] = {
+ /* Allows read access via its great-grandparent directory. */
+ {
+ .path = dir_s1d1,
+ .access = LANDLOCK_ACCESS_FS_READ_FILE,
+ },
+ {}
+ };
+ const struct rule layer4_read_write[] = {
+ /*
+ * Try to confuse the deny access by denying write (but not
+ * read) access via its grandparent directory.
+ */
+ {
+ .path = dir_s1d2,
+ .access = LANDLOCK_ACCESS_FS_READ_FILE,
+ },
+ {}
+ };
+ const struct rule layer5_read[] = {
+ /*
+ * Try to override layer2's deny read access by explicitly
+ * allowing read access via file1_s1d3's grandparent.
+ */
+ {
+ .path = dir_s1d2,
+ .access = LANDLOCK_ACCESS_FS_READ_FILE,
+ },
+ {}
+ };
+ const struct rule layer6_execute[] = {
+ /*
+ * Restricts an unrelated file hierarchy with a new
+ * (non-overlapping) access type.
+ */
+ {
+ .path = dir_s2d1,
+ .access = LANDLOCK_ACCESS_FS_EXECUTE,
+ },
+ {}
+ };
+ const struct rule layer7_read_write[] = {
+ /*
+ * Finally, denies read access to file1_s1d3 via its
+ * grandparent.
+ */
+ {
+ .path = dir_s1d2,
+ .access = LANDLOCK_ACCESS_FS_WRITE_FILE,
+ },
+ {}
+ };
+ int ruleset_fd;
+
+ ruleset_fd = create_ruleset(_metadata, LANDLOCK_ACCESS_FS_READ_FILE,
+ layer1_read);
+ ASSERT_LE(0, ruleset_fd);
+ enforce_ruleset(_metadata, ruleset_fd);
+ ASSERT_EQ(0, close(ruleset_fd));
+
+ /* Checks that read access is granted for file1_s1d3 with layer 1. */
+ ASSERT_EQ(0, test_open(file1_s1d3, O_RDWR));
+ ASSERT_EQ(EACCES, test_open(file2_s1d3, O_RDONLY));
+ ASSERT_EQ(0, test_open(file2_s1d3, O_WRONLY));
+
+ ruleset_fd = create_ruleset(_metadata, LANDLOCK_ACCESS_FS_READ_FILE |
+ LANDLOCK_ACCESS_FS_WRITE_FILE, layer2_read_write);
+ ASSERT_LE(0, ruleset_fd);
+ enforce_ruleset(_metadata, ruleset_fd);
+ ASSERT_EQ(0, close(ruleset_fd));
+
+ /* Checks that previous access rights are unchanged with layer 2. */
+ ASSERT_EQ(0, test_open(file1_s1d3, O_RDWR));
+ ASSERT_EQ(EACCES, test_open(file2_s1d3, O_RDONLY));
+ ASSERT_EQ(0, test_open(file2_s1d3, O_WRONLY));
+
+ ruleset_fd = create_ruleset(_metadata, LANDLOCK_ACCESS_FS_READ_FILE,
+ layer3_read);
+ ASSERT_LE(0, ruleset_fd);
+ enforce_ruleset(_metadata, ruleset_fd);
+ ASSERT_EQ(0, close(ruleset_fd));
+
+ /* Checks that previous access rights are unchanged with layer 3. */
+ ASSERT_EQ(0, test_open(file1_s1d3, O_RDWR));
+ ASSERT_EQ(EACCES, test_open(file2_s1d3, O_RDONLY));
+ ASSERT_EQ(0, test_open(file2_s1d3, O_WRONLY));
+
+ /* This time, denies write access to the file hierarchy. */
+ ruleset_fd = create_ruleset(_metadata, LANDLOCK_ACCESS_FS_READ_FILE |
+ LANDLOCK_ACCESS_FS_WRITE_FILE, layer4_read_write);
+ ASSERT_LE(0, ruleset_fd);
+ enforce_ruleset(_metadata, ruleset_fd);
+ ASSERT_EQ(0, close(ruleset_fd));
+
+ /*
+ * Checks that the only change with layer 4 is that write access is
+ * denied.
+ */
+ ASSERT_EQ(0, test_open(file1_s1d3, O_RDONLY));
+ ASSERT_EQ(EACCES, test_open(file1_s1d3, O_WRONLY));
+ ASSERT_EQ(EACCES, test_open(file2_s1d3, O_RDONLY));
+ ASSERT_EQ(EACCES, test_open(file2_s1d3, O_WRONLY));
+
+ ruleset_fd = create_ruleset(_metadata, LANDLOCK_ACCESS_FS_READ_FILE,
+ layer5_read);
+ ASSERT_LE(0, ruleset_fd);
+ enforce_ruleset(_metadata, ruleset_fd);
+ ASSERT_EQ(0, close(ruleset_fd));
+
+ /* Checks that previous access rights are unchanged with layer 5. */
+ ASSERT_EQ(0, test_open(file1_s1d3, O_RDONLY));
+ ASSERT_EQ(EACCES, test_open(file1_s1d3, O_WRONLY));
+ ASSERT_EQ(EACCES, test_open(file2_s1d3, O_WRONLY));
+ ASSERT_EQ(EACCES, test_open(file2_s1d3, O_RDONLY));
+
+ ruleset_fd = create_ruleset(_metadata, LANDLOCK_ACCESS_FS_EXECUTE,
+ layer6_execute);
+ ASSERT_LE(0, ruleset_fd);
+ enforce_ruleset(_metadata, ruleset_fd);
+ ASSERT_EQ(0, close(ruleset_fd));
+
+ /* Checks that previous access rights are unchanged with layer 6. */
+ ASSERT_EQ(0, test_open(file1_s1d3, O_RDONLY));
+ ASSERT_EQ(EACCES, test_open(file1_s1d3, O_WRONLY));
+ ASSERT_EQ(EACCES, test_open(file2_s1d3, O_WRONLY));
+ ASSERT_EQ(EACCES, test_open(file2_s1d3, O_RDONLY));
+
+ ruleset_fd = create_ruleset(_metadata, LANDLOCK_ACCESS_FS_READ_FILE |
+ LANDLOCK_ACCESS_FS_WRITE_FILE, layer7_read_write);
+ ASSERT_LE(0, ruleset_fd);
+ enforce_ruleset(_metadata, ruleset_fd);
+ ASSERT_EQ(0, close(ruleset_fd));
+
+ /* Checks that read access is now denied with layer 7. */
+ ASSERT_EQ(EACCES, test_open(file1_s1d3, O_RDONLY));
+ ASSERT_EQ(EACCES, test_open(file1_s1d3, O_WRONLY));
+ ASSERT_EQ(EACCES, test_open(file2_s1d3, O_WRONLY));
+ ASSERT_EQ(EACCES, test_open(file2_s1d3, O_RDONLY));
+}
+
+TEST_F_FORK(layout1, inherit_subset)
+{
+ const struct rule rules[] = {
+ {
+ .path = dir_s1d2,
+ .access = LANDLOCK_ACCESS_FS_READ_FILE |
+ LANDLOCK_ACCESS_FS_READ_DIR,
+ },
+ {}
+ };
+ const int ruleset_fd = create_ruleset(_metadata, ACCESS_RW, rules);
+
+ ASSERT_LE(0, ruleset_fd);
+ enforce_ruleset(_metadata, ruleset_fd);
+
+ ASSERT_EQ(EACCES, test_open(file1_s1d1, O_WRONLY));
+ ASSERT_EQ(EACCES, test_open(dir_s1d1, O_RDONLY | O_DIRECTORY));
+
+ /* Write access is forbidden. */
+ ASSERT_EQ(EACCES, test_open(file1_s1d2, O_WRONLY));
+ /* Readdir access is allowed. */
+ ASSERT_EQ(0, test_open(dir_s1d2, O_RDONLY | O_DIRECTORY));
+
+ /* Write access is forbidden. */
+ ASSERT_EQ(EACCES, test_open(file1_s1d3, O_WRONLY));
+ /* Readdir access is allowed. */
+ ASSERT_EQ(0, test_open(dir_s1d3, O_RDONLY | O_DIRECTORY));
+
+ /*
+ * Tests shared rule extension: the following rules should not grant
+ * any new access, only remove some. Once enforced, these rules are
+ * ANDed with the previous ones.
+ */
+ add_path_beneath(_metadata, ruleset_fd, LANDLOCK_ACCESS_FS_WRITE_FILE,
+ dir_s1d2);
+ /*
+ * According to ruleset_fd, dir_s1d2 should now have the
+ * LANDLOCK_ACCESS_FS_READ_FILE and LANDLOCK_ACCESS_FS_WRITE_FILE
+ * access rights (even if this directory is opened a second time).
+ * However, when enforcing this updated ruleset, the ruleset tied to
+ * the current process (i.e. its domain) will still only grant
+ * LANDLOCK_ACCESS_FS_READ_FILE and LANDLOCK_ACCESS_FS_READ_DIR for
+ * dir_s1d2: LANDLOCK_ACCESS_FS_WRITE_FILE must not be allowed,
+ * because allowing it through enforcement would amount to a
+ * privilege escalation.
+ */
+ enforce_ruleset(_metadata, ruleset_fd);
+
+ /* Same tests and results as above. */
+ ASSERT_EQ(EACCES, test_open(file1_s1d1, O_WRONLY));
+ ASSERT_EQ(EACCES, test_open(dir_s1d1, O_RDONLY | O_DIRECTORY));
+
+ /* It is still forbidden to write in file1_s1d2. */
+ ASSERT_EQ(EACCES, test_open(file1_s1d2, O_WRONLY));
+ /* Readdir access is still allowed. */
+ ASSERT_EQ(0, test_open(dir_s1d2, O_RDONLY | O_DIRECTORY));
+
+ /* It is still forbidden to write in file1_s1d3. */
+ ASSERT_EQ(EACCES, test_open(file1_s1d3, O_WRONLY));
+ /* Readdir access is still allowed. */
+ ASSERT_EQ(0, test_open(dir_s1d3, O_RDONLY | O_DIRECTORY));
+
+ /*
+ * Try to get more privileges by adding new access rights to the parent
+ * directory: dir_s1d1.
+ */
+ add_path_beneath(_metadata, ruleset_fd, ACCESS_RW, dir_s1d1);
+ enforce_ruleset(_metadata, ruleset_fd);
+
+ /* Same tests and results as above. */
+ ASSERT_EQ(EACCES, test_open(file1_s1d1, O_WRONLY));
+ ASSERT_EQ(EACCES, test_open(dir_s1d1, O_RDONLY | O_DIRECTORY));
+
+ /* It is still forbidden to write in file1_s1d2. */
+ ASSERT_EQ(EACCES, test_open(file1_s1d2, O_WRONLY));
+ /* Readdir access is still allowed. */
+ ASSERT_EQ(0, test_open(dir_s1d2, O_RDONLY | O_DIRECTORY));
+
+ /* It is still forbidden to write in file1_s1d3. */
+ ASSERT_EQ(EACCES, test_open(file1_s1d3, O_WRONLY));
+ /* Readdir access is still allowed. */
+ ASSERT_EQ(0, test_open(dir_s1d3, O_RDONLY | O_DIRECTORY));
+
+ /*
+ * Now, dir_s1d3 gets a new rule tied to it, only allowing
+ * LANDLOCK_ACCESS_FS_WRITE_FILE. The (kernel-internal) difference is
+ * that there was no rule tied to it before.
+ */
+ add_path_beneath(_metadata, ruleset_fd, LANDLOCK_ACCESS_FS_WRITE_FILE,
+ dir_s1d3);
+ enforce_ruleset(_metadata, ruleset_fd);
+ ASSERT_EQ(0, close(ruleset_fd));
+
+ /*
+ * Same tests and results as above, including open(dir_s1d3): the new
+ * rule tied to dir_s1d3 only allows LANDLOCK_ACCESS_FS_WRITE_FILE, but
+ * it is ORed with the rule inherited from dir_s1d2 (same layer).
+ */
+
+ /* Same tests and results as above. */
+ ASSERT_EQ(EACCES, test_open(file1_s1d1, O_WRONLY));
+ ASSERT_EQ(EACCES, test_open(dir_s1d1, O_RDONLY | O_DIRECTORY));
+
+ /* It is still forbidden to write in file1_s1d2. */
+ ASSERT_EQ(EACCES, test_open(file1_s1d2, O_WRONLY));
+ /* Readdir access is still allowed. */
+ ASSERT_EQ(0, test_open(dir_s1d2, O_RDONLY | O_DIRECTORY));
+
+ /* It is still forbidden to write in file1_s1d3. */
+ ASSERT_EQ(EACCES, test_open(file1_s1d3, O_WRONLY));
+ /*
+ * Readdir of dir_s1d3 is still allowed because of the OR policy inside
+ * the same layer.
+ */
+ ASSERT_EQ(0, test_open(dir_s1d3, O_RDONLY | O_DIRECTORY));
+}
+
+TEST_F_FORK(layout1, inherit_superset)
+{
+ const struct rule rules[] = {
+ {
+ .path = dir_s1d3,
+ .access = ACCESS_RO,
+ },
+ {}
+ };
+ const int ruleset_fd = create_ruleset(_metadata, ACCESS_RW, rules);
+
+ ASSERT_LE(0, ruleset_fd);
+ enforce_ruleset(_metadata, ruleset_fd);
+
+ /* Readdir access is denied for dir_s1d2. */
+ ASSERT_EQ(EACCES, test_open(dir_s1d2, O_RDONLY | O_DIRECTORY));
+ /* Readdir access is allowed for dir_s1d3. */
+ ASSERT_EQ(0, test_open(dir_s1d3, O_RDONLY | O_DIRECTORY));
+ /* File access is allowed for file1_s1d3. */
+ ASSERT_EQ(0, test_open(file1_s1d3, O_RDONLY));
+
+ /* Now dir_s1d2, parent of dir_s1d3, gets a new rule tied to it. */
+ add_path_beneath(_metadata, ruleset_fd, LANDLOCK_ACCESS_FS_READ_FILE |
+ LANDLOCK_ACCESS_FS_READ_DIR, dir_s1d2);
+ enforce_ruleset(_metadata, ruleset_fd);
+ ASSERT_EQ(0, close(ruleset_fd));
+
+ /* Readdir access is still denied for dir_s1d2. */
+ ASSERT_EQ(EACCES, test_open(dir_s1d2, O_RDONLY | O_DIRECTORY));
+ /* Readdir access is still allowed for dir_s1d3. */
+ ASSERT_EQ(0, test_open(dir_s1d3, O_RDONLY | O_DIRECTORY));
+ /* File access is still allowed for file1_s1d3. */
+ ASSERT_EQ(0, test_open(file1_s1d3, O_RDONLY));
+}
+
+TEST_F_FORK(layout1, max_layers)
+{
+ int i, err;
+ const struct rule rules[] = {
+ {
+ .path = dir_s1d2,
+ .access = ACCESS_RO,
+ },
+ {}
+ };
+ const int ruleset_fd = create_ruleset(_metadata, ACCESS_RW, rules);
+
+ ASSERT_LE(0, ruleset_fd);
+ for (i = 0; i < 64; i++)
+ enforce_ruleset(_metadata, ruleset_fd);
+
+ for (i = 0; i < 2; i++) {
+ err = landlock_restrict_self(ruleset_fd, 0);
+ ASSERT_EQ(-1, err);
+ ASSERT_EQ(E2BIG, errno);
+ }
+ ASSERT_EQ(0, close(ruleset_fd));
+}
+
+TEST_F_FORK(layout1, empty_or_same_ruleset)
+{
+ struct landlock_ruleset_attr ruleset_attr = {};
+ int ruleset_fd;
+
+ /* Tests empty handled_access_fs. */
+ ruleset_fd = landlock_create_ruleset(&ruleset_attr,
+ sizeof(ruleset_attr), 0);
+ ASSERT_EQ(-1, ruleset_fd);
+ ASSERT_EQ(ENOMSG, errno);
+
+ /* Enforces a policy which denies read access to all files. */
+ ruleset_attr.handled_access_fs = LANDLOCK_ACCESS_FS_READ_FILE;
+ ruleset_fd = landlock_create_ruleset(&ruleset_attr,
+ sizeof(ruleset_attr), 0);
+ ASSERT_LE(0, ruleset_fd);
+ enforce_ruleset(_metadata, ruleset_fd);
+ ASSERT_EQ(EACCES, test_open(file1_s1d1, O_RDONLY));
+ ASSERT_EQ(0, test_open(dir_s1d1, O_RDONLY));
+
+ /* Nests a policy which denies read access to all directories. */
+ ruleset_attr.handled_access_fs = LANDLOCK_ACCESS_FS_READ_DIR;
+ ruleset_fd = landlock_create_ruleset(&ruleset_attr,
+ sizeof(ruleset_attr), 0);
+ ASSERT_LE(0, ruleset_fd);
+ enforce_ruleset(_metadata, ruleset_fd);
+ ASSERT_EQ(EACCES, test_open(file1_s1d1, O_RDONLY));
+ ASSERT_EQ(EACCES, test_open(dir_s1d1, O_RDONLY));
+
+ /* Enforces a second time with the same ruleset. */
+ enforce_ruleset(_metadata, ruleset_fd);
+ ASSERT_EQ(0, close(ruleset_fd));
+}
+
+TEST_F_FORK(layout1, rule_on_mountpoint)
+{
+ const struct rule rules[] = {
+ {
+ .path = dir_s1d1,
+ .access = ACCESS_RO,
+ },
+ {
+ /* dir_s3d2 is a mount point. */
+ .path = dir_s3d2,
+ .access = ACCESS_RO,
+ },
+ {}
+ };
+ const int ruleset_fd = create_ruleset(_metadata, ACCESS_RW, rules);
+
+ ASSERT_LE(0, ruleset_fd);
+ enforce_ruleset(_metadata, ruleset_fd);
+ ASSERT_EQ(0, close(ruleset_fd));
+
+ ASSERT_EQ(0, test_open(dir_s1d1, O_RDONLY));
+
+ ASSERT_EQ(EACCES, test_open(dir_s2d1, O_RDONLY));
+
+ ASSERT_EQ(EACCES, test_open(dir_s3d1, O_RDONLY));
+ ASSERT_EQ(0, test_open(dir_s3d2, O_RDONLY));
+ ASSERT_EQ(0, test_open(dir_s3d3, O_RDONLY));
+}
+
+TEST_F_FORK(layout1, rule_over_mountpoint)
+{
+ const struct rule rules[] = {
+ {
+ .path = dir_s1d1,
+ .access = ACCESS_RO,
+ },
+ {
+ /* dir_s3d1 contains the dir_s3d2 mount point. */
+ .path = dir_s3d1,
+ .access = ACCESS_RO,
+ },
+ {}
+ };
+ const int ruleset_fd = create_ruleset(_metadata, ACCESS_RW, rules);
+
+ ASSERT_LE(0, ruleset_fd);
+ enforce_ruleset(_metadata, ruleset_fd);
+ ASSERT_EQ(0, close(ruleset_fd));
+
+ ASSERT_EQ(0, test_open(dir_s1d1, O_RDONLY));
+
+ ASSERT_EQ(EACCES, test_open(dir_s2d1, O_RDONLY));
+
+ ASSERT_EQ(0, test_open(dir_s3d1, O_RDONLY));
+ ASSERT_EQ(0, test_open(dir_s3d2, O_RDONLY));
+ ASSERT_EQ(0, test_open(dir_s3d3, O_RDONLY));
+}
+
+/*
+ * This test verifies that we can apply a Landlock rule on the root directory
+ * (which might require special handling).
+ */
+TEST_F_FORK(layout1, rule_over_root_allow_then_deny)
+{
+ struct rule rules[] = {
+ {
+ .path = "/",
+ .access = ACCESS_RO,
+ },
+ {}
+ };
+ int ruleset_fd = create_ruleset(_metadata, ACCESS_RW, rules);
+
+ ASSERT_LE(0, ruleset_fd);
+ enforce_ruleset(_metadata, ruleset_fd);
+ ASSERT_EQ(0, close(ruleset_fd));
+
+ /* Checks allowed access. */
+ ASSERT_EQ(0, test_open("/", O_RDONLY));
+ ASSERT_EQ(0, test_open(dir_s1d1, O_RDONLY));
+
+ rules[0].access = LANDLOCK_ACCESS_FS_READ_FILE;
+ ruleset_fd = create_ruleset(_metadata, ACCESS_RW, rules);
+ ASSERT_LE(0, ruleset_fd);
+ enforce_ruleset(_metadata, ruleset_fd);
+ ASSERT_EQ(0, close(ruleset_fd));
+
+ /* Checks denied access (on a directory). */
+ ASSERT_EQ(EACCES, test_open("/", O_RDONLY));
+ ASSERT_EQ(EACCES, test_open(dir_s1d1, O_RDONLY));
+}
+
+TEST_F_FORK(layout1, rule_over_root_deny)
+{
+ const struct rule rules[] = {
+ {
+ .path = "/",
+ .access = LANDLOCK_ACCESS_FS_READ_FILE,
+ },
+ {}
+ };
+ const int ruleset_fd = create_ruleset(_metadata, ACCESS_RW, rules);
+
+ ASSERT_LE(0, ruleset_fd);
+ enforce_ruleset(_metadata, ruleset_fd);
+ ASSERT_EQ(0, close(ruleset_fd));
+
+ /* Checks denied access (on a directory). */
+ ASSERT_EQ(EACCES, test_open("/", O_RDONLY));
+ ASSERT_EQ(EACCES, test_open(dir_s1d1, O_RDONLY));
+}
+
+TEST_F_FORK(layout1, rule_inside_mount_ns)
+{
+ const struct rule rules[] = {
+ {
+ .path = "s3d3",
+ .access = ACCESS_RO,
+ },
+ {}
+ };
+ int ruleset_fd;
+
+ set_cap(_metadata, CAP_SYS_ADMIN);
+ ASSERT_EQ(0, mount(NULL, "/", NULL, MS_PRIVATE | MS_REC, NULL));
+ ASSERT_EQ(0, syscall(SYS_pivot_root, dir_s3d2, dir_s3d3)) {
+ TH_LOG("Failed to pivot_root into \"%s\": %s", dir_s3d2,
+ strerror(errno));
+ };
+ ASSERT_EQ(0, chdir("/"));
+ clear_cap(_metadata, CAP_SYS_ADMIN);
+
+ ruleset_fd = create_ruleset(_metadata, ACCESS_RW, rules);
+ ASSERT_LE(0, ruleset_fd);
+ enforce_ruleset(_metadata, ruleset_fd);
+ ASSERT_EQ(0, close(ruleset_fd));
+
+ ASSERT_EQ(0, test_open("s3d3", O_RDONLY));
+ ASSERT_EQ(EACCES, test_open("/", O_RDONLY));
+}
+
+TEST_F_FORK(layout1, mount_and_pivot)
+{
+ const struct rule rules[] = {
+ {
+ .path = dir_s3d2,
+ .access = ACCESS_RO,
+ },
+ {}
+ };
+ const int ruleset_fd = create_ruleset(_metadata, ACCESS_RW, rules);
+
+ ASSERT_LE(0, ruleset_fd);
+ set_cap(_metadata, CAP_SYS_ADMIN);
+ ASSERT_EQ(0, mount(NULL, "/", NULL, MS_PRIVATE | MS_REC, NULL));
+
+ enforce_ruleset(_metadata, ruleset_fd);
+ ASSERT_EQ(0, close(ruleset_fd));
+
+ ASSERT_EQ(-1, mount(NULL, "/", NULL, MS_PRIVATE | MS_REC, NULL));
+ ASSERT_EQ(EPERM, errno);
+ ASSERT_EQ(-1, syscall(SYS_pivot_root, dir_s3d2, dir_s3d3));
+ ASSERT_EQ(EPERM, errno);
+}
+
+TEST_F_FORK(layout1, move_mount)
+{
+ const struct rule rules[] = {
+ {
+ .path = dir_s3d2,
+ .access = ACCESS_RO,
+ },
+ {}
+ };
+ const int ruleset_fd = create_ruleset(_metadata, ACCESS_RW, rules);
+
+ ASSERT_LE(0, ruleset_fd);
+
+ set_cap(_metadata, CAP_SYS_ADMIN);
+ ASSERT_EQ(0, mount(NULL, "/", NULL, MS_PRIVATE | MS_REC, NULL));
+ ASSERT_EQ(0, syscall(SYS_move_mount, AT_FDCWD, dir_s3d2, AT_FDCWD,
+ dir_s1d2, 0)) {
+ TH_LOG("Failed to move_mount: %s", strerror(errno));
+ }
+ ASSERT_EQ(0, syscall(SYS_move_mount, AT_FDCWD, dir_s1d2, AT_FDCWD,
+ dir_s3d2, 0));
+
+ enforce_ruleset(_metadata, ruleset_fd);
+ ASSERT_EQ(0, close(ruleset_fd));
+
+ ASSERT_EQ(-1, syscall(SYS_move_mount, AT_FDCWD, dir_s3d2, AT_FDCWD,
+ dir_s1d2, 0));
+ ASSERT_EQ(EPERM, errno);
+}
+
+TEST_F_FORK(layout1, release_inodes)
+{
+ const struct rule rules[] = {
+ {
+ .path = dir_s1d1,
+ .access = ACCESS_RO,
+ },
+ {
+ .path = dir_s3d2,
+ .access = ACCESS_RO,
+ },
+ {
+ .path = dir_s3d3,
+ .access = ACCESS_RO,
+ },
+ {}
+ };
+ const int ruleset_fd = create_ruleset(_metadata, ACCESS_RW, rules);
+
+ ASSERT_LE(0, ruleset_fd);
+ /* Unmount a file hierarchy while it is being used by a ruleset. */
+ set_cap(_metadata, CAP_SYS_ADMIN);
+ ASSERT_EQ(0, umount(dir_s3d2));
+ clear_cap(_metadata, CAP_SYS_ADMIN);
+
+ enforce_ruleset(_metadata, ruleset_fd);
+ ASSERT_EQ(0, close(ruleset_fd));
+
+ ASSERT_EQ(0, test_open(file1_s1d1, O_RDONLY));
+ ASSERT_EQ(EACCES, test_open(dir_s3d2, O_RDONLY));
+ /* Access to dir_s3d3 would not be allowed, and it no longer exists anyway. */
+ ASSERT_EQ(ENOENT, test_open(dir_s3d3, O_RDONLY));
+}
+
+enum relative_access {
+ REL_OPEN,
+ REL_CHDIR,
+ REL_CHROOT_ONLY,
+ REL_CHROOT_CHDIR,
+};
+
+static void test_relative_path(struct __test_metadata *const _metadata,
+ const enum relative_access rel)
+{
+ const struct rule rules[] = {
+ {
+ .path = dir_s1d2,
+ .access = ACCESS_RO,
+ },
+ {
+ .path = dir_s2d2,
+ .access = ACCESS_RO,
+ },
+ {}
+ };
+ int dirfd;
+ const int ruleset_fd = create_ruleset(_metadata, ACCESS_RW, rules);
+
+ ASSERT_LE(0, ruleset_fd);
+ switch (rel) {
+ case REL_OPEN:
+ case REL_CHDIR:
+ break;
+ case REL_CHROOT_ONLY:
+ ASSERT_EQ(0, chdir(dir_s2d2));
+ break;
+ case REL_CHROOT_CHDIR:
+ ASSERT_EQ(0, chdir(dir_s1d2));
+ break;
+ default:
+ ASSERT_TRUE(false);
+ return;
+ }
+
+ set_cap(_metadata, CAP_SYS_CHROOT);
+ enforce_ruleset(_metadata, ruleset_fd);
+
+ switch (rel) {
+ case REL_OPEN:
+ dirfd = open(dir_s1d2, O_DIRECTORY);
+ ASSERT_LE(0, dirfd);
+ break;
+ case REL_CHDIR:
+ ASSERT_EQ(0, chdir(dir_s1d2));
+ dirfd = AT_FDCWD;
+ break;
+ case REL_CHROOT_ONLY:
+ /* Do chroot into dir_s1d2 (relative to dir_s2d2). */
+ ASSERT_EQ(0, chroot("../../s1d1/s1d2")) {
+ TH_LOG("Failed to chroot: %s", strerror(errno));
+ }
+ dirfd = AT_FDCWD;
+ break;
+ case REL_CHROOT_CHDIR:
+ /* Do chroot into dir_s1d2. */
+ ASSERT_EQ(0, chroot(".")) {
+ TH_LOG("Failed to chroot: %s", strerror(errno));
+ }
+ dirfd = AT_FDCWD;
+ break;
+ }
+
+ ASSERT_EQ((rel == REL_CHROOT_CHDIR) ? 0 : EACCES,
+ test_open_rel(dirfd, "..", O_RDONLY));
+ ASSERT_EQ(0, test_open_rel(dirfd, ".", O_RDONLY));
+
+ if (rel == REL_CHROOT_ONLY) {
+ /* The current directory is dir_s2d2. */
+ ASSERT_EQ(0, test_open_rel(dirfd, "./s2d3", O_RDONLY));
+ } else {
+ /* The current directory is dir_s1d2. */
+ ASSERT_EQ(0, test_open_rel(dirfd, "./s1d3", O_RDONLY));
+ }
+
+ if (rel != REL_CHROOT_CHDIR) {
+ ASSERT_EQ(EACCES, test_open_rel(dirfd, "../../s1d1", O_RDONLY));
+ ASSERT_EQ(0, test_open_rel(dirfd, "../../s1d1/s1d2", O_RDONLY));
+ ASSERT_EQ(0, test_open_rel(dirfd, "../../s1d1/s1d2/s1d3", O_RDONLY));
+
+ ASSERT_EQ(EACCES, test_open_rel(dirfd, "../../s2d1", O_RDONLY));
+ ASSERT_EQ(0, test_open_rel(dirfd, "../../s2d1/s2d2", O_RDONLY));
+ ASSERT_EQ(0, test_open_rel(dirfd, "../../s2d1/s2d2/s2d3", O_RDONLY));
+ }
+
+ if (rel == REL_OPEN)
+ ASSERT_EQ(0, close(dirfd));
+ ASSERT_EQ(0, close(ruleset_fd));
+}
+
+TEST_F_FORK(layout1, relative_open)
+{
+ test_relative_path(_metadata, REL_OPEN);
+}
+
+TEST_F_FORK(layout1, relative_chdir)
+{
+ test_relative_path(_metadata, REL_CHDIR);
+}
+
+TEST_F_FORK(layout1, relative_chroot_only)
+{
+ test_relative_path(_metadata, REL_CHROOT_ONLY);
+}
+
+TEST_F_FORK(layout1, relative_chroot_chdir)
+{
+ test_relative_path(_metadata, REL_CHROOT_CHDIR);
+}
+
+static void copy_binary(struct __test_metadata *const _metadata,
+ const char *const dst_path)
+{
+ int dst_fd, src_fd;
+ struct stat statbuf;
+
+ dst_fd = open(dst_path, O_WRONLY | O_TRUNC | O_CLOEXEC);
+ ASSERT_LE(0, dst_fd) {
+ TH_LOG("Failed to open \"%s\": %s", dst_path,
+ strerror(errno));
+ }
+ src_fd = open(BINARY_PATH, O_RDONLY | O_CLOEXEC);
+ ASSERT_LE(0, src_fd) {
+ TH_LOG("Failed to open \"" BINARY_PATH "\": %s",
+ strerror(errno));
+ }
+ ASSERT_EQ(0, fstat(src_fd, &statbuf));
+ ASSERT_EQ(statbuf.st_size, sendfile(dst_fd, src_fd, 0,
+ statbuf.st_size));
+ ASSERT_EQ(0, close(src_fd));
+ ASSERT_EQ(0, close(dst_fd));
+}
+
+static void test_execute(struct __test_metadata *const _metadata,
+ const char *const path, const int ret)
+{
+ int status;
+ char *const argv[] = {(char *)path, NULL};
+ const pid_t child = fork();
+
+ ASSERT_LE(0, child);
+ if (child == 0) {
+ ASSERT_EQ(ret, execve(path, argv, NULL)) {
+ TH_LOG("Failed to execute \"%s\": %s", path,
+ strerror(errno));
+ };
+ ASSERT_EQ(EACCES, errno);
+ _exit(_metadata->passed ? 2 : 1);
+ return;
+ }
+ ASSERT_EQ(child, waitpid(child, &status, 0));
+ ASSERT_EQ(1, WIFEXITED(status));
+ ASSERT_EQ(ret ? 2 : 0, WEXITSTATUS(status)) {
+ TH_LOG("Unexpected return code for \"%s\": %s", path,
+ strerror(errno));
+ };
+}
+
+TEST_F_FORK(layout1, execute)
+{
+ const struct rule rules[] = {
+ {
+ .path = dir_s1d2,
+ .access = LANDLOCK_ACCESS_FS_EXECUTE,
+ },
+ {}
+ };
+ const int ruleset_fd = create_ruleset(_metadata, rules[0].access,
+ rules);
+
+ ASSERT_LE(0, ruleset_fd);
+ copy_binary(_metadata, file1_s1d1);
+ copy_binary(_metadata, file1_s1d2);
+ copy_binary(_metadata, file1_s1d3);
+
+ enforce_ruleset(_metadata, ruleset_fd);
+ ASSERT_EQ(0, close(ruleset_fd));
+
+ test_execute(_metadata, file1_s1d1, -1);
+ test_execute(_metadata, file1_s1d2, 0);
+ test_execute(_metadata, file1_s1d3, 0);
+}
+
+TEST_F_FORK(layout1, link)
+{
+ const struct rule rules[] = {
+ {
+ .path = dir_s1d2,
+ .access = LANDLOCK_ACCESS_FS_MAKE_REG,
+ },
+ {}
+ };
+ const int ruleset_fd = create_ruleset(_metadata, rules[0].access,
+ rules);
+
+ ASSERT_LE(0, ruleset_fd);
+
+ ASSERT_EQ(0, unlink(file1_s1d1));
+ ASSERT_EQ(0, unlink(file1_s1d2));
+ ASSERT_EQ(0, unlink(file1_s1d3));
+
+ enforce_ruleset(_metadata, ruleset_fd);
+ ASSERT_EQ(0, close(ruleset_fd));
+
+ ASSERT_EQ(-1, link(file2_s1d1, file1_s1d1));
+ ASSERT_EQ(EACCES, errno);
+ /* Denies linking because of reparenting. */
+ ASSERT_EQ(-1, link(file1_s2d1, file1_s1d2));
+ ASSERT_EQ(EACCES, errno);
+ ASSERT_EQ(-1, link(file2_s1d2, file1_s1d3));
+ ASSERT_EQ(EACCES, errno);
+
+ ASSERT_EQ(0, link(file2_s1d2, file1_s1d2)) {
+ TH_LOG("Failed to link file to \"%s\": %s", file2_s1d2,
+ strerror(errno));
+ };
+ ASSERT_EQ(0, link(file2_s1d3, file1_s1d3));
+}
+
+TEST_F_FORK(layout1, rename_file)
+{
+ const struct rule rules[] = {
+ {
+ .path = dir_s1d3,
+ .access = LANDLOCK_ACCESS_FS_REMOVE_FILE,
+ },
+ {
+ .path = dir_s2d2,
+ .access = LANDLOCK_ACCESS_FS_REMOVE_FILE,
+ },
+ {}
+ };
+ const int ruleset_fd = create_ruleset(_metadata, rules[0].access,
+ rules);
+
+ ASSERT_LE(0, ruleset_fd);
+
+ ASSERT_EQ(0, unlink(file1_s1d1));
+ ASSERT_EQ(0, unlink(file1_s1d2));
+
+ enforce_ruleset(_metadata, ruleset_fd);
+ ASSERT_EQ(0, close(ruleset_fd));
+
+ /* Replaces file. */
+ ASSERT_EQ(-1, rename(file1_s2d3, file1_s1d3));
+ ASSERT_EQ(EACCES, errno);
+ ASSERT_EQ(-1, rename(file1_s2d1, file1_s1d3));
+ ASSERT_EQ(EACCES, errno);
+ /* Same parent. */
+ ASSERT_EQ(0, rename(file2_s2d3, file1_s2d3)) {
+ TH_LOG("Failed to rename file \"%s\": %s", file2_s2d3,
+ strerror(errno));
+ };
+
+ /* Renames files. */
+ ASSERT_EQ(-1, rename(file1_s2d2, file1_s1d2));
+ ASSERT_EQ(EACCES, errno);
+ ASSERT_EQ(0, unlink(file1_s1d3));
+ ASSERT_EQ(-1, rename(file1_s2d1, file1_s1d3));
+ ASSERT_EQ(EACCES, errno);
+ /* Same parent. */
+ ASSERT_EQ(0, rename(file2_s1d3, file1_s1d3));
+}
+
+TEST_F_FORK(layout1, rename_dir)
+{
+ const struct rule rules[] = {
+ {
+ .path = dir_s1d2,
+ .access = LANDLOCK_ACCESS_FS_REMOVE_DIR,
+ },
+ {
+ .path = dir_s2d1,
+ .access = LANDLOCK_ACCESS_FS_REMOVE_DIR,
+ },
+ {}
+ };
+ const int ruleset_fd = create_ruleset(_metadata, rules[0].access,
+ rules);
+
+ ASSERT_LE(0, ruleset_fd);
+
+ /* Empties dir_s1d3. */
+ ASSERT_EQ(0, unlink(file1_s1d3));
+ ASSERT_EQ(0, unlink(file2_s1d3));
+
+ enforce_ruleset(_metadata, ruleset_fd);
+ ASSERT_EQ(0, close(ruleset_fd));
+
+ /* Renames directory. */
+ ASSERT_EQ(-1, rename(dir_s2d3, dir_s1d3));
+ ASSERT_EQ(EACCES, errno);
+ ASSERT_EQ(0, unlink(file1_s1d2));
+ ASSERT_EQ(0, rename(dir_s1d3, file1_s1d2)) {
+ TH_LOG("Failed to rename directory \"%s\": %s", dir_s1d3,
+ strerror(errno));
+ };
+ ASSERT_EQ(0, rmdir(file1_s1d2));
+}
+
+TEST_F_FORK(layout1, rmdir)
+{
+ const struct rule rules[] = {
+ {
+ .path = dir_s1d2,
+ .access = LANDLOCK_ACCESS_FS_REMOVE_DIR,
+ },
+ {}
+ };
+ const int ruleset_fd = create_ruleset(_metadata, rules[0].access,
+ rules);
+
+ ASSERT_LE(0, ruleset_fd);
+
+ ASSERT_EQ(0, unlink(file1_s1d1));
+ ASSERT_EQ(0, unlink(file1_s1d2));
+ ASSERT_EQ(0, unlink(file1_s1d3));
+ ASSERT_EQ(0, unlink(file2_s1d3));
+
+ enforce_ruleset(_metadata, ruleset_fd);
+ ASSERT_EQ(0, close(ruleset_fd));
+
+ ASSERT_EQ(0, rmdir(dir_s1d3));
+ /* dir_s1d2 itself cannot be removed. */
+ ASSERT_EQ(-1, rmdir(dir_s1d2));
+ ASSERT_EQ(EACCES, errno);
+ ASSERT_EQ(-1, rmdir(dir_s1d1));
+ ASSERT_EQ(EACCES, errno);
+}
+
+TEST_F_FORK(layout1, unlink)
+{
+ const struct rule rules[] = {
+ {
+ .path = dir_s1d2,
+ .access = LANDLOCK_ACCESS_FS_REMOVE_FILE,
+ },
+ {}
+ };
+ const int ruleset_fd = create_ruleset(_metadata, rules[0].access,
+ rules);
+
+ ASSERT_LE(0, ruleset_fd);
+ enforce_ruleset(_metadata, ruleset_fd);
+ ASSERT_EQ(0, close(ruleset_fd));
+
+ ASSERT_EQ(-1, unlink(file1_s1d1));
+ ASSERT_EQ(EACCES, errno);
+ ASSERT_EQ(0, unlink(file1_s1d2)) {
+ TH_LOG("Failed to unlink file \"%s\": %s", file1_s1d2,
+ strerror(errno));
+ };
+ ASSERT_EQ(0, unlink(file1_s1d3));
+}
+
+static void test_make_file(struct __test_metadata *const _metadata,
+ const __u64 access, const mode_t mode, const dev_t dev)
+{
+ const struct rule rules[] = {
+ {
+ .path = dir_s1d2,
+ .access = access,
+ },
+ {}
+ };
+ const int ruleset_fd = create_ruleset(_metadata, access, rules);
+
+ ASSERT_LE(0, ruleset_fd);
+
+ unlink(file1_s1d1);
+ unlink(file1_s1d2);
+ unlink(file1_s1d3);
+
+ enforce_ruleset(_metadata, ruleset_fd);
+ ASSERT_EQ(0, close(ruleset_fd));
+
+ ASSERT_EQ(-1, mknod(file1_s1d1, mode | 0400, dev));
+ ASSERT_EQ(EACCES, errno);
+ ASSERT_EQ(0, mknod(file1_s1d2, mode | 0400, dev)) {
+ TH_LOG("Failed to make file \"%s\": %s",
+ file1_s1d2, strerror(errno));
+ };
+ ASSERT_EQ(0, mknod(file1_s1d3, mode | 0400, dev));
+}
+
+TEST_F_FORK(layout1, make_char)
+{
+ /* Creates a /dev/null device. */
+ set_cap(_metadata, CAP_MKNOD);
+ test_make_file(_metadata, LANDLOCK_ACCESS_FS_MAKE_CHAR, S_IFCHR,
+ makedev(1, 3));
+}
+
+TEST_F_FORK(layout1, make_block)
+{
+ /* Creates a /dev/loop0 device. */
+ set_cap(_metadata, CAP_MKNOD);
+ test_make_file(_metadata, LANDLOCK_ACCESS_FS_MAKE_BLOCK, S_IFBLK,
+ makedev(7, 0));
+}
+
+TEST_F_FORK(layout1, make_reg)
+{
+ test_make_file(_metadata, LANDLOCK_ACCESS_FS_MAKE_REG, S_IFREG, 0);
+ test_make_file(_metadata, LANDLOCK_ACCESS_FS_MAKE_REG, 0, 0);
+}
+
+TEST_F_FORK(layout1, make_sock)
+{
+ test_make_file(_metadata, LANDLOCK_ACCESS_FS_MAKE_SOCK, S_IFSOCK, 0);
+}
+
+TEST_F_FORK(layout1, make_fifo)
+{
+ test_make_file(_metadata, LANDLOCK_ACCESS_FS_MAKE_FIFO, S_IFIFO, 0);
+}
+
+TEST_F_FORK(layout1, make_sym)
+{
+ const struct rule rules[] = {
+ {
+ .path = dir_s1d2,
+ .access = LANDLOCK_ACCESS_FS_MAKE_SYM,
+ },
+ {}
+ };
+ const int ruleset_fd = create_ruleset(_metadata, rules[0].access,
+ rules);
+
+ ASSERT_LE(0, ruleset_fd);
+
+ ASSERT_EQ(0, unlink(file1_s1d1));
+ ASSERT_EQ(0, unlink(file1_s1d2));
+ ASSERT_EQ(0, unlink(file1_s1d3));
+
+ enforce_ruleset(_metadata, ruleset_fd);
+ ASSERT_EQ(0, close(ruleset_fd));
+
+ ASSERT_EQ(-1, symlink("none", file1_s1d1));
+ ASSERT_EQ(EACCES, errno);
+ ASSERT_EQ(0, symlink("none", file1_s1d2)) {
+ TH_LOG("Failed to make symlink \"%s\": %s",
+ file1_s1d2, strerror(errno));
+ };
+ ASSERT_EQ(0, symlink("none", file1_s1d3));
+}
+
+TEST_F_FORK(layout1, make_dir)
+{
+ const struct rule rules[] = {
+ {
+ .path = dir_s1d2,
+ .access = LANDLOCK_ACCESS_FS_MAKE_DIR,
+ },
+ {}
+ };
+ const int ruleset_fd = create_ruleset(_metadata, rules[0].access,
+ rules);
+
+ ASSERT_LE(0, ruleset_fd);
+
+ ASSERT_EQ(0, unlink(file1_s1d1));
+ ASSERT_EQ(0, unlink(file1_s1d2));
+ ASSERT_EQ(0, unlink(file1_s1d3));
+
+ enforce_ruleset(_metadata, ruleset_fd);
+ ASSERT_EQ(0, close(ruleset_fd));
+
+ /* Uses file_* as directory names. */
+ ASSERT_EQ(-1, mkdir(file1_s1d1, 0700));
+ ASSERT_EQ(EACCES, errno);
+ ASSERT_EQ(0, mkdir(file1_s1d2, 0700)) {
+ TH_LOG("Failed to make directory \"%s\": %s",
+ file1_s1d2, strerror(errno));
+ };
+ ASSERT_EQ(0, mkdir(file1_s1d3, 0700));
+}
+
+static int open_proc_fd(struct __test_metadata *const _metadata, const int fd,
+ const int open_flags)
+{
+ static const char path_template[] = "/proc/self/fd/%d";
+ char procfd_path[sizeof(path_template) + 10];
+ const int procfd_path_size = snprintf(procfd_path, sizeof(procfd_path),
+ path_template, fd);
+
+ ASSERT_LT(procfd_path_size, sizeof(procfd_path));
+ return open(procfd_path, open_flags);
+}
+
+TEST_F_FORK(layout1, proc_unlinked_file)
+{
+ const struct rule rules[] = {
+ {
+ .path = file1_s1d2,
+ .access = LANDLOCK_ACCESS_FS_READ_FILE,
+ },
+ {}
+ };
+ int reg_fd, proc_fd;
+ const int ruleset_fd = create_ruleset(_metadata,
+ LANDLOCK_ACCESS_FS_READ_FILE |
+ LANDLOCK_ACCESS_FS_WRITE_FILE, rules);
+
+ ASSERT_LE(0, ruleset_fd);
+ enforce_ruleset(_metadata, ruleset_fd);
+ ASSERT_EQ(0, close(ruleset_fd));
+
+ ASSERT_EQ(EACCES, test_open(file1_s1d2, O_RDWR));
+ ASSERT_EQ(0, test_open(file1_s1d2, O_RDONLY));
+ reg_fd = open(file1_s1d2, O_RDONLY | O_CLOEXEC);
+ ASSERT_LE(0, reg_fd);
+ ASSERT_EQ(0, unlink(file1_s1d2));
+
+ proc_fd = open_proc_fd(_metadata, reg_fd, O_RDONLY | O_CLOEXEC);
+ ASSERT_LE(0, proc_fd);
+ ASSERT_EQ(0, close(proc_fd));
+
+ proc_fd = open_proc_fd(_metadata, reg_fd, O_RDWR | O_CLOEXEC);
+ ASSERT_EQ(-1, proc_fd) {
+ TH_LOG("Successfully opened /proc/self/fd/%d: %s",
+ reg_fd, strerror(errno));
+ }
+ ASSERT_EQ(EACCES, errno);
+
+ ASSERT_EQ(0, close(reg_fd));
+}
+
+TEST_F_FORK(layout1, proc_pipe)
+{
+ int proc_fd;
+ int pipe_fds[2];
+ char buf = '\0';
+ const struct rule rules[] = {
+ {
+ .path = dir_s1d2,
+ .access = LANDLOCK_ACCESS_FS_READ_FILE |
+ LANDLOCK_ACCESS_FS_WRITE_FILE,
+ },
+ {}
+ };
+ /* Limits read and write access to files tied to the filesystem. */
+ const int ruleset_fd = create_ruleset(_metadata, rules[0].access,
+ rules);
+
+ ASSERT_LE(0, ruleset_fd);
+ enforce_ruleset(_metadata, ruleset_fd);
+ ASSERT_EQ(0, close(ruleset_fd));
+
+ /* Checks enforcement for normal files. */
+ ASSERT_EQ(0, test_open(file1_s1d2, O_RDWR));
+ ASSERT_EQ(EACCES, test_open(file1_s1d1, O_RDWR));
+
+ /* Checks access to pipes through FD. */
+ ASSERT_EQ(0, pipe2(pipe_fds, O_CLOEXEC));
+ ASSERT_EQ(1, write(pipe_fds[1], ".", 1)) {
+ TH_LOG("Failed to write in pipe: %s", strerror(errno));
+ }
+ ASSERT_EQ(1, read(pipe_fds[0], &buf, 1));
+ ASSERT_EQ('.', buf);
+
+ /* Checks write access to pipe through /proc/self/fd . */
+ proc_fd = open_proc_fd(_metadata, pipe_fds[1], O_WRONLY | O_CLOEXEC);
+ ASSERT_LE(0, proc_fd);
+ ASSERT_EQ(1, write(proc_fd, ".", 1)) {
+ TH_LOG("Failed to write through /proc/self/fd/%d: %s",
+ pipe_fds[1], strerror(errno));
+ }
+ ASSERT_EQ(0, close(proc_fd));
+
+ /* Checks read access to pipe through /proc/self/fd . */
+ proc_fd = open_proc_fd(_metadata, pipe_fds[0], O_RDONLY | O_CLOEXEC);
+ ASSERT_LE(0, proc_fd);
+ buf = '\0';
+ ASSERT_EQ(1, read(proc_fd, &buf, 1)) {
+ TH_LOG("Failed to read through /proc/self/fd/%d: %s",
+ pipe_fds[0], strerror(errno));
+ }
+ ASSERT_EQ(0, close(proc_fd));
+
+ ASSERT_EQ(0, close(pipe_fds[0]));
+ ASSERT_EQ(0, close(pipe_fds[1]));
+}
+
+static void create_layout1_bind(struct __test_metadata *const _metadata)
+{
+ create_layout1(_metadata);
+
+ set_cap(_metadata, CAP_SYS_ADMIN);
+ ASSERT_EQ(0, mount(dir_s1d2, dir_s2d2, NULL, MS_BIND, NULL));
+ clear_cap(_metadata, CAP_SYS_ADMIN);
+}
+
+static void remove_layout1_bind(struct __test_metadata *const _metadata)
+{
+ set_cap(_metadata, CAP_SYS_ADMIN);
+ EXPECT_EQ(0, umount(dir_s2d2));
+ clear_cap(_metadata, CAP_SYS_ADMIN);
+
+ remove_layout1(_metadata);
+}
+
+FIXTURE(layout1_bind) {
+};
+
+FIXTURE_SETUP(layout1_bind)
+{
+ disable_caps(_metadata);
+ create_layout1_bind(_metadata);
+}
+
+FIXTURE_TEARDOWN(layout1_bind)
+{
+ remove_layout1_bind(_metadata);
+}
+
+static const char bind_dir_s1d3[] = TMP_DIR "/s2d1/s2d2/s1d3";
+static const char bind_file1_s1d3[] = TMP_DIR "/s2d1/s2d2/s1d3/f1";
+
+/*
+ * layout1_bind hierarchy:
+ *
+ * tmp
+ * ├── s1d1
+ * │   ├── f1
+ * │   ├── f2
+ * │   └── s1d2
+ * │   ├── f1
+ * │   ├── f2
+ * │   └── s1d3
+ * │   ├── f1
+ * │   └── f2
+ * ├── s2d1
+ * │   ├── f1
+ * │   └── s2d2
+ * │   ├── f1
+ * │   ├── f2
+ * │   └── s1d3
+ * │   ├── f1
+ * │   └── f2
+ * └── s3d1
+ * └── s3d2
+ * └── s3d3
+ */
+
+TEST_F_FORK(layout1_bind, no_restriction)
+{
+ ASSERT_EQ(0, test_open(dir_s1d1, O_RDONLY));
+ ASSERT_EQ(0, test_open(file1_s1d1, O_RDONLY));
+ ASSERT_EQ(0, test_open(dir_s1d2, O_RDONLY));
+ ASSERT_EQ(0, test_open(file1_s1d2, O_RDONLY));
+ ASSERT_EQ(0, test_open(dir_s1d3, O_RDONLY));
+ ASSERT_EQ(0, test_open(file1_s1d3, O_RDONLY));
+
+ ASSERT_EQ(0, test_open(dir_s2d1, O_RDONLY));
+ ASSERT_EQ(0, test_open(file1_s2d1, O_RDONLY));
+ ASSERT_EQ(0, test_open(dir_s2d2, O_RDONLY));
+ ASSERT_EQ(0, test_open(file1_s2d2, O_RDONLY));
+ ASSERT_EQ(ENOENT, test_open(dir_s2d3, O_RDONLY));
+ ASSERT_EQ(ENOENT, test_open(file1_s2d3, O_RDONLY));
+
+ ASSERT_EQ(0, test_open(bind_dir_s1d3, O_RDONLY));
+ ASSERT_EQ(0, test_open(bind_file1_s1d3, O_RDONLY));
+
+ ASSERT_EQ(0, test_open(dir_s3d1, O_RDONLY));
+}
+
+TEST_F_FORK(layout1_bind, same_content_same_file)
+{
+ /*
+ * Sets access rights on the parent directories of both the source
+ * and destination mount points.
+ */
+ const struct rule layer1_parent[] = {
+ {
+ .path = dir_s1d1,
+ .access = ACCESS_RO,
+ },
+ {
+ .path = dir_s2d1,
+ .access = ACCESS_RW,
+ },
+ {}
+ };
+ /*
+ * Sets access rights on the same bind-mounted directories. The result
+ * should be ACCESS_RW for both mount points, but not for both
+ * hierarchies because of the first layer.
+ */
+ const struct rule layer2_mount_point[] = {
+ {
+ .path = dir_s1d2,
+ .access = LANDLOCK_ACCESS_FS_READ_FILE,
+ },
+ {
+ .path = dir_s2d2,
+ .access = ACCESS_RW,
+ },
+ {}
+ };
+ /* Only allows read access to the s1d3 hierarchies. */
+ const struct rule layer3_source[] = {
+ {
+ .path = dir_s1d3,
+ .access = LANDLOCK_ACCESS_FS_READ_FILE,
+ },
+ {}
+ };
+ /* Removes all access rights. */
+ const struct rule layer4_destination[] = {
+ {
+ .path = bind_file1_s1d3,
+ .access = LANDLOCK_ACCESS_FS_WRITE_FILE,
+ },
+ {}
+ };
+ int ruleset_fd;
+
+ /* Sets rules for the parent directories. */
+ ruleset_fd = create_ruleset(_metadata, ACCESS_RW, layer1_parent);
+ ASSERT_LE(0, ruleset_fd);
+ enforce_ruleset(_metadata, ruleset_fd);
+ ASSERT_EQ(0, close(ruleset_fd));
+
+ /* Checks source hierarchy. */
+ ASSERT_EQ(0, test_open(file1_s1d1, O_RDONLY));
+ ASSERT_EQ(EACCES, test_open(file1_s1d1, O_WRONLY));
+ ASSERT_EQ(0, test_open(dir_s1d1, O_RDONLY | O_DIRECTORY));
+
+ ASSERT_EQ(0, test_open(file1_s1d2, O_RDONLY));
+ ASSERT_EQ(EACCES, test_open(file1_s1d2, O_WRONLY));
+ ASSERT_EQ(0, test_open(dir_s1d2, O_RDONLY | O_DIRECTORY));
+
+ /* Checks destination hierarchy. */
+ ASSERT_EQ(0, test_open(file1_s2d1, O_RDWR));
+ ASSERT_EQ(0, test_open(dir_s2d1, O_RDONLY | O_DIRECTORY));
+
+ ASSERT_EQ(0, test_open(file1_s2d2, O_RDWR));
+ ASSERT_EQ(0, test_open(dir_s2d2, O_RDONLY | O_DIRECTORY));
+
+ /* Sets rules for the mount points. */
+ ruleset_fd = create_ruleset(_metadata, ACCESS_RW, layer2_mount_point);
+ ASSERT_LE(0, ruleset_fd);
+ enforce_ruleset(_metadata, ruleset_fd);
+ ASSERT_EQ(0, close(ruleset_fd));
+
+ /* Checks source hierarchy. */
+ ASSERT_EQ(EACCES, test_open(file1_s1d1, O_RDONLY));
+ ASSERT_EQ(EACCES, test_open(file1_s1d1, O_WRONLY));
+ ASSERT_EQ(EACCES, test_open(dir_s1d1, O_RDONLY | O_DIRECTORY));
+
+ ASSERT_EQ(0, test_open(file1_s1d2, O_RDONLY));
+ ASSERT_EQ(EACCES, test_open(file1_s1d2, O_WRONLY));
+ ASSERT_EQ(0, test_open(dir_s1d2, O_RDONLY | O_DIRECTORY));
+
+ /* Checks destination hierarchy. */
+ ASSERT_EQ(EACCES, test_open(file1_s2d1, O_RDONLY));
+ ASSERT_EQ(EACCES, test_open(file1_s2d1, O_WRONLY));
+ ASSERT_EQ(EACCES, test_open(dir_s2d1, O_RDONLY | O_DIRECTORY));
+
+ ASSERT_EQ(0, test_open(file1_s2d2, O_RDWR));
+ ASSERT_EQ(0, test_open(dir_s2d2, O_RDONLY | O_DIRECTORY));
+ ASSERT_EQ(0, test_open(bind_dir_s1d3, O_RDONLY | O_DIRECTORY));
+
+ /* Sets a (shared) rule only on the source. */
+ ruleset_fd = create_ruleset(_metadata, ACCESS_RW, layer3_source);
+ ASSERT_LE(0, ruleset_fd);
+ enforce_ruleset(_metadata, ruleset_fd);
+ ASSERT_EQ(0, close(ruleset_fd));
+
+ /* Checks source hierarchy. */
+ ASSERT_EQ(EACCES, test_open(file1_s1d2, O_RDONLY));
+ ASSERT_EQ(EACCES, test_open(file1_s1d2, O_WRONLY));
+ ASSERT_EQ(EACCES, test_open(dir_s1d2, O_RDONLY | O_DIRECTORY));
+
+ ASSERT_EQ(0, test_open(file1_s1d3, O_RDONLY));
+ ASSERT_EQ(EACCES, test_open(file1_s1d3, O_WRONLY));
+ ASSERT_EQ(EACCES, test_open(dir_s1d3, O_RDONLY | O_DIRECTORY));
+
+ /* Checks destination hierarchy. */
+ ASSERT_EQ(EACCES, test_open(file1_s2d2, O_RDONLY));
+ ASSERT_EQ(EACCES, test_open(file1_s2d2, O_WRONLY));
+ ASSERT_EQ(EACCES, test_open(dir_s2d2, O_RDONLY | O_DIRECTORY));
+
+ ASSERT_EQ(0, test_open(bind_file1_s1d3, O_RDONLY));
+ ASSERT_EQ(EACCES, test_open(bind_file1_s1d3, O_WRONLY));
+ ASSERT_EQ(EACCES, test_open(bind_dir_s1d3, O_RDONLY | O_DIRECTORY));
+
+ /* Sets a (shared) rule only on the destination. */
+ ruleset_fd = create_ruleset(_metadata, ACCESS_RW, layer4_destination);
+ ASSERT_LE(0, ruleset_fd);
+ enforce_ruleset(_metadata, ruleset_fd);
+ ASSERT_EQ(0, close(ruleset_fd));
+
+ /* Checks source hierarchy. */
+ ASSERT_EQ(EACCES, test_open(file1_s1d3, O_RDONLY));
+ ASSERT_EQ(EACCES, test_open(file1_s1d3, O_WRONLY));
+
+ /* Checks destination hierarchy. */
+ ASSERT_EQ(EACCES, test_open(bind_file1_s1d3, O_RDONLY));
+ ASSERT_EQ(EACCES, test_open(bind_file1_s1d3, O_WRONLY));
+}
+
+#define LOWER_BASE TMP_DIR "/lower"
+#define LOWER_DATA LOWER_BASE "/data"
+static const char lower_fl1[] = LOWER_DATA "/fl1";
+static const char lower_dl1[] = LOWER_DATA "/dl1";
+static const char lower_dl1_fl2[] = LOWER_DATA "/dl1/fl2";
+static const char lower_fo1[] = LOWER_DATA "/fo1";
+static const char lower_do1[] = LOWER_DATA "/do1";
+static const char lower_do1_fo2[] = LOWER_DATA "/do1/fo2";
+static const char lower_do1_fl3[] = LOWER_DATA "/do1/fl3";
+
+static const char (*lower_base_files[])[] = {
+ &lower_fl1,
+ &lower_fo1,
+ NULL
+};
+static const char (*lower_base_directories[])[] = {
+ &lower_dl1,
+ &lower_do1,
+ NULL
+};
+static const char (*lower_sub_files[])[] = {
+ &lower_dl1_fl2,
+ &lower_do1_fo2,
+ &lower_do1_fl3,
+ NULL
+};
+
+#define UPPER_BASE TMP_DIR "/upper"
+#define UPPER_DATA UPPER_BASE "/data"
+#define UPPER_WORK UPPER_BASE "/work"
+static const char upper_fu1[] = UPPER_DATA "/fu1";
+static const char upper_du1[] = UPPER_DATA "/du1";
+static const char upper_du1_fu2[] = UPPER_DATA "/du1/fu2";
+static const char upper_fo1[] = UPPER_DATA "/fo1";
+static const char upper_do1[] = UPPER_DATA "/do1";
+static const char upper_do1_fo2[] = UPPER_DATA "/do1/fo2";
+static const char upper_do1_fu3[] = UPPER_DATA "/do1/fu3";
+
+static const char (*upper_base_files[])[] = {
+ &upper_fu1,
+ &upper_fo1,
+ NULL
+};
+static const char (*upper_base_directories[])[] = {
+ &upper_du1,
+ &upper_do1,
+ NULL
+};
+static const char (*upper_sub_files[])[] = {
+ &upper_du1_fu2,
+ &upper_do1_fo2,
+ &upper_do1_fu3,
+ NULL
+};
+
+#define MERGE_BASE TMP_DIR "/merge"
+#define MERGE_DATA MERGE_BASE "/data"
+static const char merge_fl1[] = MERGE_DATA "/fl1";
+static const char merge_dl1[] = MERGE_DATA "/dl1";
+static const char merge_dl1_fl2[] = MERGE_DATA "/dl1/fl2";
+static const char merge_fu1[] = MERGE_DATA "/fu1";
+static const char merge_du1[] = MERGE_DATA "/du1";
+static const char merge_du1_fu2[] = MERGE_DATA "/du1/fu2";
+static const char merge_fo1[] = MERGE_DATA "/fo1";
+static const char merge_do1[] = MERGE_DATA "/do1";
+static const char merge_do1_fo2[] = MERGE_DATA "/do1/fo2";
+static const char merge_do1_fl3[] = MERGE_DATA "/do1/fl3";
+static const char merge_do1_fu3[] = MERGE_DATA "/do1/fu3";
+
+static const char (*merge_base_files[])[] = {
+ &merge_fl1,
+ &merge_fu1,
+ &merge_fo1,
+ NULL
+};
+static const char (*merge_base_directories[])[] = {
+ &merge_dl1,
+ &merge_du1,
+ &merge_do1,
+ NULL
+};
+static const char (*merge_sub_files[])[] = {
+ &merge_dl1_fl2,
+ &merge_du1_fu2,
+ &merge_do1_fo2,
+ &merge_do1_fl3,
+ &merge_do1_fu3,
+ NULL
+};
+
+/*
+ * layout2_overlay hierarchy:
+ *
+ * tmp
+ * ├── lower
+ * │   └── data
+ * │   ├── dl1
+ * │   │   └── fl2
+ * │   ├── do1
+ * │   │   ├── fl3
+ * │   │   └── fo2
+ * │   ├── fl1
+ * │   └── fo1
+ * ├── merge
+ * │   └── data
+ * │   ├── dl1
+ * │   │   └── fl2
+ * │   ├── do1
+ * │   │   ├── fl3
+ * │   │   ├── fo2
+ * │   │   └── fu3
+ * │   ├── du1
+ * │   │   └── fu2
+ * │   ├── fl1
+ * │   ├── fo1
+ * │   └── fu1
+ * └── upper
+ * ├── data
+ * │   ├── do1
+ * │   │   ├── fo2
+ * │   │   └── fu3
+ * │   ├── du1
+ * │   │   └── fu2
+ * │   ├── fo1
+ * │   └── fu1
+ * └── work
+ * └── work
+ */
+
+FIXTURE(layout2_overlay) {
+};
+
+FIXTURE_SETUP(layout2_overlay)
+{
+ disable_caps(_metadata);
+
+ /* Do not pollute the rest of the system. */
+ set_cap(_metadata, CAP_SYS_ADMIN);
+ ASSERT_EQ(0, unshare(CLONE_NEWNS));
+ clear_cap(_metadata, CAP_SYS_ADMIN);
+ umask(0077);
+
+ create_directory(_metadata, LOWER_BASE);
+ set_cap(_metadata, CAP_SYS_ADMIN);
+ /* Creates tmpfs mount points to get deterministic overlayfs. */
+ ASSERT_EQ(0, mount("tmp", LOWER_BASE, "tmpfs", 0, "size=4m,mode=700"));
+ clear_cap(_metadata, CAP_SYS_ADMIN);
+ create_file(_metadata, lower_fl1);
+ create_file(_metadata, lower_dl1_fl2);
+ create_file(_metadata, lower_fo1);
+ create_file(_metadata, lower_do1_fo2);
+ create_file(_metadata, lower_do1_fl3);
+
+ create_directory(_metadata, UPPER_BASE);
+ set_cap(_metadata, CAP_SYS_ADMIN);
+ ASSERT_EQ(0, mount("tmp", UPPER_BASE, "tmpfs", 0, "size=4m,mode=700"));
+ clear_cap(_metadata, CAP_SYS_ADMIN);
+ create_file(_metadata, upper_fu1);
+ create_file(_metadata, upper_du1_fu2);
+ create_file(_metadata, upper_fo1);
+ create_file(_metadata, upper_do1_fo2);
+ create_file(_metadata, upper_do1_fu3);
+ ASSERT_EQ(0, mkdir(UPPER_WORK, 0700));
+
+ create_directory(_metadata, MERGE_DATA);
+ set_cap(_metadata, CAP_SYS_ADMIN);
+ set_cap(_metadata, CAP_DAC_OVERRIDE);
+ ASSERT_EQ(0, mount("overlay", MERGE_DATA, "overlay", 0,
+ "lowerdir=" LOWER_DATA
+ ",upperdir=" UPPER_DATA
+ ",workdir=" UPPER_WORK));
+ clear_cap(_metadata, CAP_DAC_OVERRIDE);
+ clear_cap(_metadata, CAP_SYS_ADMIN);
+}
+
+FIXTURE_TEARDOWN(layout2_overlay)
+{
+ EXPECT_EQ(0, remove_path(lower_do1_fl3));
+ EXPECT_EQ(0, remove_path(lower_dl1_fl2));
+ EXPECT_EQ(0, remove_path(lower_fl1));
+ EXPECT_EQ(0, remove_path(lower_do1_fo2));
+ EXPECT_EQ(0, remove_path(lower_fo1));
+ set_cap(_metadata, CAP_SYS_ADMIN);
+ EXPECT_EQ(0, umount(LOWER_BASE));
+ clear_cap(_metadata, CAP_SYS_ADMIN);
+ EXPECT_EQ(0, remove_path(LOWER_BASE));
+
+ EXPECT_EQ(0, remove_path(upper_do1_fu3));
+ EXPECT_EQ(0, remove_path(upper_du1_fu2));
+ EXPECT_EQ(0, remove_path(upper_fu1));
+ EXPECT_EQ(0, remove_path(upper_do1_fo2));
+ EXPECT_EQ(0, remove_path(upper_fo1));
+ EXPECT_EQ(0, remove_path(UPPER_WORK "/work"));
+ set_cap(_metadata, CAP_SYS_ADMIN);
+ EXPECT_EQ(0, umount(UPPER_BASE));
+ clear_cap(_metadata, CAP_SYS_ADMIN);
+ EXPECT_EQ(0, remove_path(UPPER_BASE));
+
+ set_cap(_metadata, CAP_SYS_ADMIN);
+ EXPECT_EQ(0, umount(MERGE_DATA));
+ clear_cap(_metadata, CAP_SYS_ADMIN);
+ EXPECT_EQ(0, remove_path(MERGE_DATA));
+
+ EXPECT_EQ(0, remove_path(TMP_DIR));
+}
+
+TEST_F_FORK(layout2_overlay, no_restriction)
+{
+ ASSERT_EQ(0, test_open(lower_fl1, O_RDONLY));
+ ASSERT_EQ(0, test_open(lower_dl1, O_RDONLY));
+ ASSERT_EQ(0, test_open(lower_dl1_fl2, O_RDONLY));
+ ASSERT_EQ(0, test_open(lower_fo1, O_RDONLY));
+ ASSERT_EQ(0, test_open(lower_do1, O_RDONLY));
+ ASSERT_EQ(0, test_open(lower_do1_fo2, O_RDONLY));
+ ASSERT_EQ(0, test_open(lower_do1_fl3, O_RDONLY));
+
+ ASSERT_EQ(0, test_open(upper_fu1, O_RDONLY));
+ ASSERT_EQ(0, test_open(upper_du1, O_RDONLY));
+ ASSERT_EQ(0, test_open(upper_du1_fu2, O_RDONLY));
+ ASSERT_EQ(0, test_open(upper_fo1, O_RDONLY));
+ ASSERT_EQ(0, test_open(upper_do1, O_RDONLY));
+ ASSERT_EQ(0, test_open(upper_do1_fo2, O_RDONLY));
+ ASSERT_EQ(0, test_open(upper_do1_fu3, O_RDONLY));
+
+ ASSERT_EQ(0, test_open(merge_fl1, O_RDONLY));
+ ASSERT_EQ(0, test_open(merge_dl1, O_RDONLY));
+ ASSERT_EQ(0, test_open(merge_dl1_fl2, O_RDONLY));
+ ASSERT_EQ(0, test_open(merge_fu1, O_RDONLY));
+ ASSERT_EQ(0, test_open(merge_du1, O_RDONLY));
+ ASSERT_EQ(0, test_open(merge_du1_fu2, O_RDONLY));
+ ASSERT_EQ(0, test_open(merge_fo1, O_RDONLY));
+ ASSERT_EQ(0, test_open(merge_do1, O_RDONLY));
+ ASSERT_EQ(0, test_open(merge_do1_fo2, O_RDONLY));
+ ASSERT_EQ(0, test_open(merge_do1_fl3, O_RDONLY));
+ ASSERT_EQ(0, test_open(merge_do1_fu3, O_RDONLY));
+}
+
+#define for_each_path(path_list, path_entry, i) \
+ for (i = 0, path_entry = *path_list[i]; path_list[i]; \
+ path_entry = *path_list[++i])
+
+TEST_F_FORK(layout2_overlay, same_content_different_file)
+{
+ /* Sets access rights on parent directories of both layers. */
+ const struct rule layer1_base[] = {
+ {
+ .path = LOWER_BASE,
+ .access = LANDLOCK_ACCESS_FS_READ_FILE,
+ },
+ {
+ .path = UPPER_BASE,
+ .access = LANDLOCK_ACCESS_FS_READ_FILE,
+ },
+ {
+ .path = MERGE_BASE,
+ .access = ACCESS_RW,
+ },
+ {}
+ };
+ const struct rule layer2_data[] = {
+ {
+ .path = LOWER_DATA,
+ .access = LANDLOCK_ACCESS_FS_READ_FILE,
+ },
+ {
+ .path = UPPER_DATA,
+ .access = LANDLOCK_ACCESS_FS_READ_FILE,
+ },
+ {
+ .path = MERGE_DATA,
+ .access = ACCESS_RW,
+ },
+ {}
+ };
+ /* Sets access rights on directories inside both layers. */
+ const struct rule layer3_subdirs[] = {
+ {
+ .path = lower_dl1,
+ .access = LANDLOCK_ACCESS_FS_READ_FILE,
+ },
+ {
+ .path = lower_do1,
+ .access = LANDLOCK_ACCESS_FS_READ_FILE,
+ },
+ {
+ .path = upper_du1,
+ .access = LANDLOCK_ACCESS_FS_READ_FILE,
+ },
+ {
+ .path = upper_do1,
+ .access = LANDLOCK_ACCESS_FS_READ_FILE,
+ },
+ {
+ .path = merge_dl1,
+ .access = ACCESS_RW,
+ },
+ {
+ .path = merge_du1,
+ .access = ACCESS_RW,
+ },
+ {
+ .path = merge_do1,
+ .access = ACCESS_RW,
+ },
+ {}
+ };
+ /* Tightens access rights on the files. */
+ const struct rule layer4_files[] = {
+ {
+ .path = lower_dl1_fl2,
+ .access = LANDLOCK_ACCESS_FS_READ_FILE,
+ },
+ {
+ .path = lower_do1_fo2,
+ .access = LANDLOCK_ACCESS_FS_READ_FILE,
+ },
+ {
+ .path = lower_do1_fl3,
+ .access = LANDLOCK_ACCESS_FS_READ_FILE,
+ },
+ {
+ .path = upper_du1_fu2,
+ .access = LANDLOCK_ACCESS_FS_READ_FILE,
+ },
+ {
+ .path = upper_do1_fo2,
+ .access = LANDLOCK_ACCESS_FS_READ_FILE,
+ },
+ {
+ .path = upper_do1_fu3,
+ .access = LANDLOCK_ACCESS_FS_READ_FILE,
+ },
+ {
+ .path = merge_dl1_fl2,
+ .access = LANDLOCK_ACCESS_FS_READ_FILE |
+ LANDLOCK_ACCESS_FS_WRITE_FILE,
+ },
+ {
+ .path = merge_du1_fu2,
+ .access = LANDLOCK_ACCESS_FS_READ_FILE |
+ LANDLOCK_ACCESS_FS_WRITE_FILE,
+ },
+ {
+ .path = merge_do1_fo2,
+ .access = LANDLOCK_ACCESS_FS_READ_FILE |
+ LANDLOCK_ACCESS_FS_WRITE_FILE,
+ },
+ {
+ .path = merge_do1_fl3,
+ .access = LANDLOCK_ACCESS_FS_READ_FILE |
+ LANDLOCK_ACCESS_FS_WRITE_FILE,
+ },
+ {
+ .path = merge_do1_fu3,
+ .access = LANDLOCK_ACCESS_FS_READ_FILE |
+ LANDLOCK_ACCESS_FS_WRITE_FILE,
+ },
+ {}
+ };
+ const struct rule layer5_merge_only[] = {
+ {
+ .path = MERGE_DATA,
+ .access = LANDLOCK_ACCESS_FS_READ_FILE |
+ LANDLOCK_ACCESS_FS_WRITE_FILE,
+ },
+ {}
+ };
+ int ruleset_fd;
+ size_t i;
+ const char *path_entry;
+
+ /* Sets rules on base directories (i.e. outside overlay scope). */
+ ruleset_fd = create_ruleset(_metadata, ACCESS_RW, layer1_base);
+ ASSERT_LE(0, ruleset_fd);
+ enforce_ruleset(_metadata, ruleset_fd);
+ ASSERT_EQ(0, close(ruleset_fd));
+
+ /* Checks lower layer. */
+ for_each_path(lower_base_files, path_entry, i) {
+ ASSERT_EQ(0, test_open(path_entry, O_RDONLY));
+ ASSERT_EQ(EACCES, test_open(path_entry, O_WRONLY));
+ }
+ for_each_path(lower_base_directories, path_entry, i) {
+ ASSERT_EQ(EACCES, test_open(path_entry, O_RDONLY | O_DIRECTORY));
+ }
+ for_each_path(lower_sub_files, path_entry, i) {
+ ASSERT_EQ(0, test_open(path_entry, O_RDONLY));
+ ASSERT_EQ(EACCES, test_open(path_entry, O_WRONLY));
+ }
+ /* Checks upper layer. */
+ for_each_path(upper_base_files, path_entry, i) {
+ ASSERT_EQ(0, test_open(path_entry, O_RDONLY));
+ ASSERT_EQ(EACCES, test_open(path_entry, O_WRONLY));
+ }
+ for_each_path(upper_base_directories, path_entry, i) {
+ ASSERT_EQ(EACCES, test_open(path_entry, O_RDONLY | O_DIRECTORY));
+ }
+ for_each_path(upper_sub_files, path_entry, i) {
+ ASSERT_EQ(0, test_open(path_entry, O_RDONLY));
+ ASSERT_EQ(EACCES, test_open(path_entry, O_WRONLY));
+ }
+ /*
+ * Checks that access rights are independent of the lower and upper
+ * layers: write access to upper files viewed through the merge point
+ * is still allowed, and so is write access to lower files viewed (and
+ * copied up) through the merge point.
+ */
+ for_each_path(merge_base_files, path_entry, i) {
+ ASSERT_EQ(0, test_open(path_entry, O_RDWR));
+ }
+ for_each_path(merge_base_directories, path_entry, i) {
+ ASSERT_EQ(0, test_open(path_entry, O_RDONLY | O_DIRECTORY));
+ }
+ for_each_path(merge_sub_files, path_entry, i) {
+ ASSERT_EQ(0, test_open(path_entry, O_RDWR));
+ }
+
+ /* Sets rules on data directories (i.e. inside overlay scope). */
+ ruleset_fd = create_ruleset(_metadata, ACCESS_RW, layer2_data);
+ ASSERT_LE(0, ruleset_fd);
+ enforce_ruleset(_metadata, ruleset_fd);
+ ASSERT_EQ(0, close(ruleset_fd));
+
+ /* Checks merge. */
+ for_each_path(merge_base_files, path_entry, i) {
+ ASSERT_EQ(0, test_open(path_entry, O_RDWR));
+ }
+ for_each_path(merge_base_directories, path_entry, i) {
+ ASSERT_EQ(0, test_open(path_entry, O_RDONLY | O_DIRECTORY));
+ }
+ for_each_path(merge_sub_files, path_entry, i) {
+ ASSERT_EQ(0, test_open(path_entry, O_RDWR));
+ }
+
+ /* Same checks with tighter rules. */
+ ruleset_fd = create_ruleset(_metadata, ACCESS_RW, layer3_subdirs);
+ ASSERT_LE(0, ruleset_fd);
+ enforce_ruleset(_metadata, ruleset_fd);
+ ASSERT_EQ(0, close(ruleset_fd));
+
+ /* Checks changes for lower layer. */
+ for_each_path(lower_base_files, path_entry, i) {
+ ASSERT_EQ(EACCES, test_open(path_entry, O_RDONLY));
+ }
+ /* Checks changes for upper layer. */
+ for_each_path(upper_base_files, path_entry, i) {
+ ASSERT_EQ(EACCES, test_open(path_entry, O_RDONLY));
+ }
+ /* Checks all merge accesses. */
+ for_each_path(merge_base_files, path_entry, i) {
+ ASSERT_EQ(EACCES, test_open(path_entry, O_RDWR));
+ }
+ for_each_path(merge_base_directories, path_entry, i) {
+ ASSERT_EQ(0, test_open(path_entry, O_RDONLY | O_DIRECTORY));
+ }
+ for_each_path(merge_sub_files, path_entry, i) {
+ ASSERT_EQ(0, test_open(path_entry, O_RDWR));
+ }
+
+ /* Sets rules directly on overlayed files. */
+ ruleset_fd = create_ruleset(_metadata, ACCESS_RW, layer4_files);
+ ASSERT_LE(0, ruleset_fd);
+ enforce_ruleset(_metadata, ruleset_fd);
+ ASSERT_EQ(0, close(ruleset_fd));
+
+ /* Checks unchanged accesses on lower layer. */
+ for_each_path(lower_sub_files, path_entry, i) {
+ ASSERT_EQ(0, test_open(path_entry, O_RDONLY));
+ ASSERT_EQ(EACCES, test_open(path_entry, O_WRONLY));
+ }
+ /* Checks unchanged accesses on upper layer. */
+ for_each_path(upper_sub_files, path_entry, i) {
+ ASSERT_EQ(0, test_open(path_entry, O_RDONLY));
+ ASSERT_EQ(EACCES, test_open(path_entry, O_WRONLY));
+ }
+ /* Checks all merge accesses. */
+ for_each_path(merge_base_files, path_entry, i) {
+ ASSERT_EQ(EACCES, test_open(path_entry, O_RDWR));
+ }
+ for_each_path(merge_base_directories, path_entry, i) {
+ ASSERT_EQ(EACCES, test_open(path_entry, O_RDONLY | O_DIRECTORY));
+ }
+ for_each_path(merge_sub_files, path_entry, i) {
+ ASSERT_EQ(0, test_open(path_entry, O_RDWR));
+ }
+
+ /* Only allows access to the merge hierarchy. */
+ ruleset_fd = create_ruleset(_metadata, ACCESS_RW, layer5_merge_only);
+ ASSERT_LE(0, ruleset_fd);
+ enforce_ruleset(_metadata, ruleset_fd);
+ ASSERT_EQ(0, close(ruleset_fd));
+
+ /* Checks new accesses on lower layer. */
+ for_each_path(lower_sub_files, path_entry, i) {
+ ASSERT_EQ(EACCES, test_open(path_entry, O_RDONLY));
+ }
+ /* Checks new accesses on upper layer. */
+ for_each_path(upper_sub_files, path_entry, i) {
+ ASSERT_EQ(EACCES, test_open(path_entry, O_RDONLY));
+ }
+ /* Checks all merge accesses. */
+ for_each_path(merge_base_files, path_entry, i) {
+ ASSERT_EQ(EACCES, test_open(path_entry, O_RDWR));
+ }
+ for_each_path(merge_base_directories, path_entry, i) {
+ ASSERT_EQ(EACCES, test_open(path_entry, O_RDONLY | O_DIRECTORY));
+ }
+ for_each_path(merge_sub_files, path_entry, i) {
+ ASSERT_EQ(0, test_open(path_entry, O_RDWR));
+ }
+}
+
+TEST_HARNESS_MAIN
diff --git a/tools/testing/selftests/landlock/ptrace_test.c b/tools/testing/selftests/landlock/ptrace_test.c
new file mode 100644
index 000000000000..961be120f245
--- /dev/null
+++ b/tools/testing/selftests/landlock/ptrace_test.c
@@ -0,0 +1,314 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Landlock tests - Ptrace
+ *
+ * Copyright © 2017-2020 Mickaël Salaün <[email protected]>
+ * Copyright © 2019-2020 ANSSI
+ */
+
+#define _GNU_SOURCE
+#include <errno.h>
+#include <fcntl.h>
+#include <linux/landlock.h>
+#include <signal.h>
+#include <sys/prctl.h>
+#include <sys/ptrace.h>
+#include <sys/types.h>
+#include <sys/wait.h>
+#include <unistd.h>
+
+#include "common.h"
+
+static void create_domain(struct __test_metadata *const _metadata)
+{
+ int ruleset_fd;
+ struct landlock_ruleset_attr ruleset_attr = {
+ .handled_access_fs = LANDLOCK_ACCESS_FS_READ_FILE,
+ };
+ struct landlock_path_beneath_attr path_beneath_attr = {
+ .allowed_access = LANDLOCK_ACCESS_FS_READ_FILE,
+ };
+
+ ruleset_fd = landlock_create_ruleset(&ruleset_attr,
+ sizeof(ruleset_attr), 0);
+ EXPECT_LE(0, ruleset_fd) {
+ TH_LOG("Failed to create a ruleset: %s", strerror(errno));
+ }
+ path_beneath_attr.parent_fd = open("/tmp", O_PATH | O_NOFOLLOW |
+ O_DIRECTORY | O_CLOEXEC);
+ EXPECT_LE(0, path_beneath_attr.parent_fd);
+ EXPECT_EQ(0, landlock_add_rule(ruleset_fd, LANDLOCK_RULE_PATH_BENEATH,
+ &path_beneath_attr, 0));
+ EXPECT_EQ(0, close(path_beneath_attr.parent_fd));
+
+ EXPECT_EQ(0, prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0));
+ EXPECT_EQ(0, landlock_restrict_self(ruleset_fd, 0));
+ EXPECT_EQ(0, close(ruleset_fd));
+}
+
+FIXTURE(hierarchy) { };
+
+FIXTURE_VARIANT(hierarchy) {
+ const bool domain_both;
+ const bool domain_parent;
+ const bool domain_child;
+};
+
+/*
+ * Test multiple tracing combinations between a parent process P1 and a child
+ * process P2.
+ *
+ * Yama's scoped ptrace is presumed disabled. If enabled, this optional
+ * restriction is enforced in addition to any Landlock check, which means that
+ * all P2 requests to trace P1 would be denied.
+ */
+
+/*
+ * No domain
+ *
+ * P1-. P1 -> P2 : allow
+ * \ P2 -> P1 : allow
+ * 'P2
+ */
+FIXTURE_VARIANT_ADD(hierarchy, allow_without_domain) {
+ .domain_both = false,
+ .domain_parent = false,
+ .domain_child = false,
+};
+
+/*
+ * Child domain
+ *
+ * P1--. P1 -> P2 : allow
+ * \ P2 -> P1 : deny
+ * .'-----.
+ * | P2 |
+ * '------'
+ */
+FIXTURE_VARIANT_ADD(hierarchy, allow_with_one_domain) {
+ .domain_both = false,
+ .domain_parent = false,
+ .domain_child = true,
+};
+
+/*
+ * Parent domain
+ * .------.
+ * | P1 --. P1 -> P2 : deny
+ * '------' \ P2 -> P1 : allow
+ * '
+ * P2
+ */
+FIXTURE_VARIANT_ADD(hierarchy, deny_with_parent_domain) {
+ .domain_both = false,
+ .domain_parent = true,
+ .domain_child = false,
+};
+
+/*
+ * Parent + child domain (siblings)
+ * .------.
+ * | P1 ---. P1 -> P2 : deny
+ * '------' \ P2 -> P1 : deny
+ * .---'--.
+ * | P2 |
+ * '------'
+ */
+FIXTURE_VARIANT_ADD(hierarchy, deny_with_sibling_domain) {
+ .domain_both = false,
+ .domain_parent = true,
+ .domain_child = true,
+};
+
+/*
+ * Same domain (inherited)
+ * .-------------.
+ * | P1----. | P1 -> P2 : allow
+ * | \ | P2 -> P1 : allow
+ * | ' |
+ * | P2 |
+ * '-------------'
+ */
+FIXTURE_VARIANT_ADD(hierarchy, allow_sibling_domain) {
+ .domain_both = true,
+ .domain_parent = false,
+ .domain_child = false,
+};
+
+/*
+ * Inherited + child domain
+ * .-----------------.
+ * | P1----. | P1 -> P2 : allow
+ * | \ | P2 -> P1 : deny
+ * | .-'----. |
+ * | | P2 | |
+ * | '------' |
+ * '-----------------'
+ */
+FIXTURE_VARIANT_ADD(hierarchy, allow_with_nested_domain) {
+ .domain_both = true,
+ .domain_parent = false,
+ .domain_child = true,
+};
+
+/*
+ * Inherited + parent domain
+ * .-----------------.
+ * |.------. | P1 -> P2 : deny
+ * || P1 ----. | P2 -> P1 : allow
+ * |'------' \ |
+ * | ' |
+ * | P2 |
+ * '-----------------'
+ */
+FIXTURE_VARIANT_ADD(hierarchy, deny_with_nested_and_parent_domain) {
+ .domain_both = true,
+ .domain_parent = true,
+ .domain_child = false,
+};
+
+/*
+ * Inherited + parent and child domain (siblings)
+ * .-----------------.
+ * | .------. | P1 -> P2 : deny
+ * | | P1 . | P2 -> P1 : deny
+ * | '------'\ |
+ * | \ |
+ * | .--'---. |
+ * | | P2 | |
+ * | '------' |
+ * '-----------------'
+ */
+FIXTURE_VARIANT_ADD(hierarchy, deny_with_forked_domain) {
+ .domain_both = true,
+ .domain_parent = true,
+ .domain_child = true,
+};
+
+FIXTURE_SETUP(hierarchy)
+{ }
+
+FIXTURE_TEARDOWN(hierarchy)
+{ }
+
+/* Test PTRACE_TRACEME and PTRACE_ATTACH for parent and child. */
+TEST_F(hierarchy, trace)
+{
+ pid_t child, parent;
+ int status;
+ int pipe_child[2], pipe_parent[2];
+ char buf_parent;
+ long ret;
+
+ disable_caps(_metadata);
+
+ parent = getpid();
+ ASSERT_EQ(0, pipe2(pipe_child, O_CLOEXEC));
+ ASSERT_EQ(0, pipe2(pipe_parent, O_CLOEXEC));
+ if (variant->domain_both) {
+ create_domain(_metadata);
+ if (!_metadata->passed)
+ /* Aborts before forking. */
+ return;
+ }
+
+ child = fork();
+ ASSERT_LE(0, child);
+ if (child == 0) {
+ char buf_child;
+
+ ASSERT_EQ(0, close(pipe_parent[1]));
+ ASSERT_EQ(0, close(pipe_child[0]));
+ if (variant->domain_child)
+ create_domain(_metadata);
+
+ /* Waits for the parent to be in a domain, if any. */
+ ASSERT_EQ(1, read(pipe_parent[0], &buf_child, 1));
+
+ /* Tests PTRACE_ATTACH on the parent. */
+ ret = ptrace(PTRACE_ATTACH, parent, NULL, 0);
+ if (variant->domain_child) {
+ EXPECT_EQ(-1, ret);
+ EXPECT_EQ(EPERM, errno);
+ } else {
+ EXPECT_EQ(0, ret);
+ }
+ if (ret == 0) {
+ ASSERT_EQ(parent, waitpid(parent, &status, 0));
+ ASSERT_EQ(1, WIFSTOPPED(status));
+ ASSERT_EQ(0, ptrace(PTRACE_DETACH, parent, NULL, 0));
+ }
+
+ /* Tests child PTRACE_TRACEME. */
+ ret = ptrace(PTRACE_TRACEME);
+ if (variant->domain_parent) {
+ EXPECT_EQ(-1, ret);
+ EXPECT_EQ(EPERM, errno);
+ } else {
+ EXPECT_EQ(0, ret);
+ }
+
+ /*
+ * Signals that the PTRACE_ATTACH test is done and the
+ * PTRACE_TRACEME test is ongoing.
+ */
+ ASSERT_EQ(1, write(pipe_child[1], ".", 1));
+
+ if (!variant->domain_parent) {
+ ASSERT_EQ(0, raise(SIGSTOP));
+ }
+
+ /* Waits for the parent PTRACE_ATTACH test. */
+ ASSERT_EQ(1, read(pipe_parent[0], &buf_child, 1));
+ _exit(_metadata->passed ? EXIT_SUCCESS : EXIT_FAILURE);
+ return;
+ }
+
+ ASSERT_EQ(0, close(pipe_child[1]));
+ ASSERT_EQ(0, close(pipe_parent[0]));
+ if (variant->domain_parent)
+ create_domain(_metadata);
+
+ /* Signals that the parent is in a domain, if any. */
+ ASSERT_EQ(1, write(pipe_parent[1], ".", 1));
+
+ /*
+ * Waits for the child to test PTRACE_ATTACH on the parent and start
+ * testing PTRACE_TRACEME.
+ */
+ ASSERT_EQ(1, read(pipe_child[0], &buf_parent, 1));
+
+ /* Tests child PTRACE_TRACEME. */
+ if (!variant->domain_parent) {
+ ASSERT_EQ(child, waitpid(child, &status, 0));
+ ASSERT_EQ(1, WIFSTOPPED(status));
+ ASSERT_EQ(0, ptrace(PTRACE_DETACH, child, NULL, 0));
+ } else {
+ /* The child should not be traced by the parent. */
+ EXPECT_EQ(-1, ptrace(PTRACE_DETACH, child, NULL, 0));
+ EXPECT_EQ(ESRCH, errno);
+ }
+
+ /* Tests PTRACE_ATTACH on the child. */
+ ret = ptrace(PTRACE_ATTACH, child, NULL, 0);
+ if (variant->domain_parent) {
+ EXPECT_EQ(-1, ret);
+ EXPECT_EQ(EPERM, errno);
+ } else {
+ EXPECT_EQ(0, ret);
+ }
+ if (ret == 0) {
+ ASSERT_EQ(child, waitpid(child, &status, 0));
+ ASSERT_EQ(1, WIFSTOPPED(status));
+ ASSERT_EQ(0, ptrace(PTRACE_DETACH, child, NULL, 0));
+ }
+
+ /* Signals that the parent PTRACE_ATTACH test is done. */
+ ASSERT_EQ(1, write(pipe_parent[1], ".", 1));
+ ASSERT_EQ(child, waitpid(child, &status, 0));
+ if (WIFSIGNALED(status) || !WIFEXITED(status) ||
+ WEXITSTATUS(status) != EXIT_SUCCESS)
+ _metadata->passed = 0;
+}
+
+TEST_HARNESS_MAIN
diff --git a/tools/testing/selftests/landlock/true.c b/tools/testing/selftests/landlock/true.c
new file mode 100644
index 000000000000..3f9ccbf52783
--- /dev/null
+++ b/tools/testing/selftests/landlock/true.c
@@ -0,0 +1,5 @@
+// SPDX-License-Identifier: GPL-2.0
+int main(void)
+{
+ return 0;
+}
--
2.30.0

2021-02-02 23:15:21

by Mickaël Salaün

[permalink] [raw]
Subject: [PATCH v28 04/12] landlock: Add ptrace restrictions

From: Mickaël Salaün <[email protected]>

Using ptrace(2) and related debug features on a target process can lead
to a privilege escalation. Indeed, ptrace(2) can be used by an attacker
to impersonate another task and to remain undetected while performing
malicious activities. Thanks to ptrace_may_access(), various parts of
the kernel can check if a tracer is more privileged than a tracee.

A landlocked process has fewer privileges than a non-landlocked process
and must then be subject to additional restrictions when manipulating
processes. To be allowed to use ptrace(2) and related syscalls on a
target process, a landlocked process must have a subset of the target
process's rules (i.e. the tracee must be in a sub-domain of the tracer).
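
As a rough illustration of this rule (a simplified sketch, not part of the
patch; the function and parameter names below are only illustrative), the
check implemented by domain_scope_le() in this patch amounts to:

static bool sketch_ptrace_allowed(
		const struct landlock_ruleset *const tracer_dom,
		const struct landlock_ruleset *const tracee_dom)
{
	const struct landlock_hierarchy *walker;

	if (!tracer_dom)
		/* A non-landlocked tracer is not restricted by this hook. */
		return true;
	if (!tracee_dom)
		/* A landlocked tracer cannot trace a less restricted task. */
		return false;
	for (walker = tracee_dom->hierarchy; walker; walker = walker->parent)
		if (walker == tracer_dom->hierarchy)
			/* The tracee is in a sub-domain of the tracer. */
			return true;
	return false;
}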

Cc: James Morris <[email protected]>
Cc: Kees Cook <[email protected]>
Cc: Serge E. Hallyn <[email protected]>
Signed-off-by: Mickaël Salaün <[email protected]>
Reviewed-by: Jann Horn <[email protected]>
---

Changes since v25:
* Rename function to landlock_add_ptrace_hooks().

Changes since v22:
* Add Reviewed-by: Jann Horn <[email protected]>

Changes since v21:
* Fix copyright dates.

Changes since v14:
* Constify variables.

Changes since v13:
* Make the ptrace restriction mandatory, like in the v10.
* Remove the eBPF dependency.

Previous changes:
https://lore.kernel.org/lkml/[email protected]/
---
security/landlock/Makefile | 2 +-
security/landlock/ptrace.c | 120 +++++++++++++++++++++++++++++++++++++
security/landlock/ptrace.h | 14 +++++
security/landlock/setup.c | 2 +
4 files changed, 137 insertions(+), 1 deletion(-)
create mode 100644 security/landlock/ptrace.c
create mode 100644 security/landlock/ptrace.h

diff --git a/security/landlock/Makefile b/security/landlock/Makefile
index 041ea242e627..f1d1eb72fa76 100644
--- a/security/landlock/Makefile
+++ b/security/landlock/Makefile
@@ -1,4 +1,4 @@
obj-$(CONFIG_SECURITY_LANDLOCK) := landlock.o

landlock-y := setup.o object.o ruleset.o \
- cred.o
+ cred.o ptrace.o
diff --git a/security/landlock/ptrace.c b/security/landlock/ptrace.c
new file mode 100644
index 000000000000..f55b82446de2
--- /dev/null
+++ b/security/landlock/ptrace.c
@@ -0,0 +1,120 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Landlock LSM - Ptrace hooks
+ *
+ * Copyright © 2017-2020 Mickaël Salaün <[email protected]>
+ * Copyright © 2019-2020 ANSSI
+ */
+
+#include <asm/current.h>
+#include <linux/cred.h>
+#include <linux/errno.h>
+#include <linux/kernel.h>
+#include <linux/lsm_hooks.h>
+#include <linux/rcupdate.h>
+#include <linux/sched.h>
+
+#include "common.h"
+#include "cred.h"
+#include "ptrace.h"
+#include "ruleset.h"
+#include "setup.h"
+
+/**
+ * domain_scope_le - Checks domain ordering for scoped ptrace
+ *
+ * @parent: Parent domain.
+ * @child: Potential child of @parent.
+ *
+ * Checks if the @parent domain is less than or equal to (i.e. an ancestor,
+ * which means a subset of) the @child domain.
+ */
+static bool domain_scope_le(const struct landlock_ruleset *const parent,
+ const struct landlock_ruleset *const child)
+{
+ const struct landlock_hierarchy *walker;
+
+ if (!parent)
+ return true;
+ if (!child)
+ return false;
+ for (walker = child->hierarchy; walker; walker = walker->parent) {
+ if (walker == parent->hierarchy)
+ /* @parent is in the scoped hierarchy of @child. */
+ return true;
+ }
+ /* There is no relationship between @parent and @child. */
+ return false;
+}
+
+static bool task_is_scoped(const struct task_struct *const parent,
+ const struct task_struct *const child)
+{
+ bool is_scoped;
+ const struct landlock_ruleset *dom_parent, *dom_child;
+
+ rcu_read_lock();
+ dom_parent = landlock_get_task_domain(parent);
+ dom_child = landlock_get_task_domain(child);
+ is_scoped = domain_scope_le(dom_parent, dom_child);
+ rcu_read_unlock();
+ return is_scoped;
+}
+
+static int task_ptrace(const struct task_struct *const parent,
+ const struct task_struct *const child)
+{
+ /* Quick return for non-landlocked tasks. */
+ if (!landlocked(parent))
+ return 0;
+ if (task_is_scoped(parent, child))
+ return 0;
+ return -EPERM;
+}
+
+/**
+ * hook_ptrace_access_check - Determines whether the current process may access
+ * another
+ *
+ * @child: Process to be accessed.
+ * @mode: Mode of attachment.
+ *
+ * If the current task has Landlock rules, then the child must have at least
+ * the same rules. Else denied.
+ *
+ * Determines whether a process may access another, returning 0 if permission
+ * granted, -errno if denied.
+ */
+static int hook_ptrace_access_check(struct task_struct *const child,
+ const unsigned int mode)
+{
+ return task_ptrace(current, child);
+}
+
+/**
+ * hook_ptrace_traceme - Determines whether another process may trace the
+ * current one
+ *
+ * @parent: Task proposed to be the tracer.
+ *
+ * If the parent has Landlock rules, then the current task must have the same
+ * or more rules. Else denied.
+ *
+ * Determines whether the nominated task is permitted to trace the current
+ * process, returning 0 if permission is granted, -errno if denied.
+ */
+static int hook_ptrace_traceme(struct task_struct *const parent)
+{
+ return task_ptrace(parent, current);
+}
+
+static struct security_hook_list landlock_hooks[] __lsm_ro_after_init = {
+ LSM_HOOK_INIT(ptrace_access_check, hook_ptrace_access_check),
+ LSM_HOOK_INIT(ptrace_traceme, hook_ptrace_traceme),
+};
+
+__init void landlock_add_ptrace_hooks(void)
+{
+ security_add_hooks(landlock_hooks, ARRAY_SIZE(landlock_hooks),
+ LANDLOCK_NAME);
+}
diff --git a/security/landlock/ptrace.h b/security/landlock/ptrace.h
new file mode 100644
index 000000000000..265b220ae3bf
--- /dev/null
+++ b/security/landlock/ptrace.h
@@ -0,0 +1,14 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Landlock LSM - Ptrace hooks
+ *
+ * Copyright © 2017-2019 Mickaël Salaün <[email protected]>
+ * Copyright © 2019 ANSSI
+ */
+
+#ifndef _SECURITY_LANDLOCK_PTRACE_H
+#define _SECURITY_LANDLOCK_PTRACE_H
+
+__init void landlock_add_ptrace_hooks(void);
+
+#endif /* _SECURITY_LANDLOCK_PTRACE_H */
diff --git a/security/landlock/setup.c b/security/landlock/setup.c
index 8661112fb238..a5d6ef334991 100644
--- a/security/landlock/setup.c
+++ b/security/landlock/setup.c
@@ -11,6 +11,7 @@

#include "common.h"
#include "cred.h"
+#include "ptrace.h"
#include "setup.h"

struct lsm_blob_sizes landlock_blob_sizes __lsm_ro_after_init = {
@@ -20,6 +21,7 @@ struct lsm_blob_sizes landlock_blob_sizes __lsm_ro_after_init = {
static int __init landlock_init(void)
{
landlock_add_cred_hooks();
+ landlock_add_ptrace_hooks();
pr_info("Up and running.\n");
return 0;
}
--
2.30.0

2021-02-02 23:15:22

by Mickaël Salaün

[permalink] [raw]
Subject: [PATCH v28 01/12] landlock: Add object management

From: Mickaël Salaün <[email protected]>

A Landlock object enables identifying a kernel object (e.g. an inode).
A Landlock rule is a set of access rights allowed on an object. Rules
are grouped in rulesets that may be tied to a set of processes (i.e.
subjects) to enforce a scoped access-control (i.e. a domain).

Because Landlock's goal is to empower processes (especially
unprivileged ones) to sandbox themselves, we cannot rely on
system-wide object identification such as file extended attributes.
Indeed, we need innocuous, composable and modular access-controls.

The main challenge with these constraints is to identify kernel objects
only while this identification is useful (i.e. while a security policy
makes use of the object). This identification data should be freed once
no policy is using it. Such ephemeral tagging should not and may not be
written to the filesystem. We then need to manage the lifetime of a
rule according to the lifetime of its objects. To avoid a global lock,
this implementation makes use of RCU and counters to safely reference
objects.

A following commit uses this generic object management for inodes.
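
As an illustration of the intended usage (the real user, for inodes, comes
in the following commit), a hypothetical consumer of this API could tie its
own kernel structure to a Landlock object as sketched below; struct my_obj,
my_obj_put() and the function names are made up for the example:

/* Hypothetical underlying kernel object, for the sake of the example. */
struct my_obj;
void my_obj_put(struct my_obj *obj);

/*
 * Called by landlock_put_object() with object->lock held, once the last
 * rule referencing @object is gone: detach the underlying object, release
 * the lock, then drop the underlying reference (which may sleep, e.g.
 * iput() for an inode).
 */
static void release_my_obj(struct landlock_object *const object)
	__releases(object->lock)
{
	struct my_obj *const obj = object->underobj;

	object->underobj = NULL;
	spin_unlock(&object->lock);
	if (obj)
		my_obj_put(obj);
}

static const struct landlock_object_underops my_obj_underops = {
	.release = release_my_obj,
};

/* A rule creation path would then do something like this. */
static struct landlock_object *tag_my_obj(struct my_obj *const obj)
{
	/* The caller holds a reference on @obj, transferred on success. */
	return landlock_create_object(&my_obj_underops, obj);
}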

Cc: James Morris <[email protected]>
Cc: Kees Cook <[email protected]>
Cc: Serge E. Hallyn <[email protected]>
Signed-off-by: Mickaël Salaün <[email protected]>
Reviewed-by: Jann Horn <[email protected]>
---

Changes since v27:
* Update Kconfig for landlock_restrict_self(2).
* Cosmetic fixes: use 80 columns in Kconfig and align Makefile
declarations.

Changes since v26:
* Update Kconfig for landlock_enforce_ruleset_self(2).
* Fix spelling.

Changes since v24:
* Fix typo in comment (spotted by Jann Horn).
* Add Reviewed-by: Jann Horn <[email protected]>

Changes since v23:
* Update landlock_create_object() to return error codes instead of NULL.
This helps error handling in callers.
* When using make oldconfig with a previous configuration already
including the CONFIG_LSM variable, no question is asked to update its
content. Update the Kconfig help to warn about LSM stacking
configuration.
* Constify variable (spotted by Vincent Dagonneau).

Changes since v22:
* Fix spelling (spotted by Jann Horn).

Changes since v21:
* Update Kconfig help.
* Clean up comments.

Changes since v18:
* Account objects to kmemcg.

Changes since v14:
* Simplify the object, rule and ruleset management at the expense of a
less aggressive memory freeing (contributed by Jann Horn, with
additional modifications):
- Remove object->list aggregating the rules tied to an object.
- Remove landlock_get_object(), landlock_drop_object(),
{get,put}_object_cleaner() and landlock_rule_is_disabled().
- Rewrite landlock_put_object() to use a more simple mechanism
(no tricky RCU).
- Replace enum landlock_object_type and landlock_release_object() with
landlock_object_underops->release()
- Adjust unions and Sparse annotations.
Cf. https://lore.kernel.org/lkml/CAG48ez21bEn0wL1bbmTiiu8j9jP5iEWtHOwz4tURUJ+ki0ydYw@mail.gmail.com/
* Merge struct landlock_rule into landlock_ruleset_elem to simplify the
rule management.
* Constify variables.
* Improve kernel documentation.
* Cosmetic variable renames.
* Remove the "default" in the Kconfig (suggested by Jann Horn).
* Only use refcount_inc() through getter helpers.
* Update Kconfig description.

Changes since v13:
* New dedicated implementation, removing the need for eBPF.

Previous changes:
https://lore.kernel.org/lkml/[email protected]/
---
MAINTAINERS | 10 +++++
security/Kconfig | 1 +
security/Makefile | 2 +
security/landlock/Kconfig | 21 +++++++++
security/landlock/Makefile | 3 ++
security/landlock/object.c | 67 ++++++++++++++++++++++++++++
security/landlock/object.h | 91 ++++++++++++++++++++++++++++++++++++++
7 files changed, 195 insertions(+)
create mode 100644 security/landlock/Kconfig
create mode 100644 security/landlock/Makefile
create mode 100644 security/landlock/object.c
create mode 100644 security/landlock/object.h

diff --git a/MAINTAINERS b/MAINTAINERS
index d3e847f7f3dc..a0e57ade0524 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -9936,6 +9936,16 @@ F: net/core/sock_map.c
F: net/ipv4/tcp_bpf.c
F: net/ipv4/udp_bpf.c

+LANDLOCK SECURITY MODULE
+M: Mickaël Salaün <[email protected]>
+L: [email protected]
+S: Supported
+W: https://landlock.io
+T: git https://github.com/landlock-lsm/linux.git
+F: security/landlock/
+K: landlock
+K: LANDLOCK
+
LANTIQ / INTEL Ethernet drivers
M: Hauke Mehrtens <[email protected]>
L: [email protected]
diff --git a/security/Kconfig b/security/Kconfig
index 7561f6f99f1d..15a4342b5d01 100644
--- a/security/Kconfig
+++ b/security/Kconfig
@@ -238,6 +238,7 @@ source "security/loadpin/Kconfig"
source "security/yama/Kconfig"
source "security/safesetid/Kconfig"
source "security/lockdown/Kconfig"
+source "security/landlock/Kconfig"

source "security/integrity/Kconfig"

diff --git a/security/Makefile b/security/Makefile
index 3baf435de541..47e432900e24 100644
--- a/security/Makefile
+++ b/security/Makefile
@@ -13,6 +13,7 @@ subdir-$(CONFIG_SECURITY_LOADPIN) += loadpin
subdir-$(CONFIG_SECURITY_SAFESETID) += safesetid
subdir-$(CONFIG_SECURITY_LOCKDOWN_LSM) += lockdown
subdir-$(CONFIG_BPF_LSM) += bpf
+subdir-$(CONFIG_SECURITY_LANDLOCK) += landlock

# always enable default capabilities
obj-y += commoncap.o
@@ -32,6 +33,7 @@ obj-$(CONFIG_SECURITY_SAFESETID) += safesetid/
obj-$(CONFIG_SECURITY_LOCKDOWN_LSM) += lockdown/
obj-$(CONFIG_CGROUPS) += device_cgroup.o
obj-$(CONFIG_BPF_LSM) += bpf/
+obj-$(CONFIG_SECURITY_LANDLOCK) += landlock/

# Object integrity file lists
subdir-$(CONFIG_INTEGRITY) += integrity
diff --git a/security/landlock/Kconfig b/security/landlock/Kconfig
new file mode 100644
index 000000000000..79b7d0c3b11e
--- /dev/null
+++ b/security/landlock/Kconfig
@@ -0,0 +1,21 @@
+# SPDX-License-Identifier: GPL-2.0-only
+
+config SECURITY_LANDLOCK
+ bool "Landlock support"
+ depends on SECURITY
+ select SECURITY_PATH
+ help
+ Landlock is a safe sandboxing mechanism that enables processes to
+ restrict themselves (and their future children) by gradually
+ enforcing tailored access control policies. A security policy is a
+ set of access rights (e.g. open a file in read-only, make a
+ directory, etc.) tied to a file hierarchy. Such policy can be
+ configured and enforced by any processes for themselves thanks to
+ dedicated system calls: landlock_create_ruleset(),
+ landlock_add_rule(), and landlock_restrict_self().
+
+ See Documentation/userspace-api/landlock.rst for further information.
+
+ If you are unsure how to answer this question, answer N. Otherwise,
+ you should also prepend "landlock," to the content of CONFIG_LSM to
+ enable Landlock at boot time.
diff --git a/security/landlock/Makefile b/security/landlock/Makefile
new file mode 100644
index 000000000000..cb6deefbf4c0
--- /dev/null
+++ b/security/landlock/Makefile
@@ -0,0 +1,3 @@
+obj-$(CONFIG_SECURITY_LANDLOCK) := landlock.o
+
+landlock-y := object.o
diff --git a/security/landlock/object.c b/security/landlock/object.c
new file mode 100644
index 000000000000..d674fdf9ff04
--- /dev/null
+++ b/security/landlock/object.c
@@ -0,0 +1,67 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Landlock LSM - Object management
+ *
+ * Copyright © 2016-2020 Mickaël Salaün <[email protected]>
+ * Copyright © 2018-2020 ANSSI
+ */
+
+#include <linux/bug.h>
+#include <linux/compiler_types.h>
+#include <linux/err.h>
+#include <linux/kernel.h>
+#include <linux/rcupdate.h>
+#include <linux/refcount.h>
+#include <linux/slab.h>
+#include <linux/spinlock.h>
+
+#include "object.h"
+
+struct landlock_object *landlock_create_object(
+ const struct landlock_object_underops *const underops,
+ void *const underobj)
+{
+ struct landlock_object *new_object;
+
+ if (WARN_ON_ONCE(!underops || !underobj))
+ return ERR_PTR(-ENOENT);
+ new_object = kzalloc(sizeof(*new_object), GFP_KERNEL_ACCOUNT);
+ if (!new_object)
+ return ERR_PTR(-ENOMEM);
+ refcount_set(&new_object->usage, 1);
+ spin_lock_init(&new_object->lock);
+ new_object->underops = underops;
+ new_object->underobj = underobj;
+ return new_object;
+}
+
+/*
+ * The caller must own the object (i.e. thanks to object->usage) to safely put
+ * it.
+ */
+void landlock_put_object(struct landlock_object *const object)
+{
+ /*
+ * The call to @object->underops->release(object) might sleep, e.g.
+ * because of iput().
+ */
+ might_sleep();
+ if (!object)
+ return;
+
+ /*
+ * If the @object's refcount cannot drop to zero, we can just decrement
+ * the refcount without holding a lock. Otherwise, the decrement must
+ * happen under @object->lock for synchronization with things like
+ * get_inode_object().
+ */
+ if (refcount_dec_and_lock(&object->usage, &object->lock)) {
+ __acquire(&object->lock);
+ /*
+ * With @object->lock initially held, remove the reference from
+ * @object->underobj to @object (if it still exists).
+ */
+ object->underops->release(object);
+ kfree_rcu(object, rcu_free);
+ }
+}
diff --git a/security/landlock/object.h b/security/landlock/object.h
new file mode 100644
index 000000000000..56f17c51df01
--- /dev/null
+++ b/security/landlock/object.h
@@ -0,0 +1,91 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Landlock LSM - Object management
+ *
+ * Copyright © 2016-2020 Mickaël Salaün <[email protected]>
+ * Copyright © 2018-2020 ANSSI
+ */
+
+#ifndef _SECURITY_LANDLOCK_OBJECT_H
+#define _SECURITY_LANDLOCK_OBJECT_H
+
+#include <linux/compiler_types.h>
+#include <linux/refcount.h>
+#include <linux/spinlock.h>
+
+struct landlock_object;
+
+/**
+ * struct landlock_object_underops - Operations on an underlying object
+ */
+struct landlock_object_underops {
+ /**
+ * @release: Releases the underlying object (e.g. iput() for an inode).
+ */
+ void (*release)(struct landlock_object *const object)
+ __releases(object->lock);
+};
+
+/**
+ * struct landlock_object - Security blob tied to a kernel object
+ *
+ * The goal of this structure is to enable tying a set of ephemeral access
+ * rights (pertaining to different domains) to a kernel object (e.g. an inode)
+ * in a safe way. This implies handling concurrent use and modification.
+ *
+ * The lifetime of a &struct landlock_object depends on the rules referring to
+ * it.
+ */
+struct landlock_object {
+ /**
+ * @usage: This counter is used to tie an object to the rules matching
+ * it or to keep it alive while adding a new rule. If this counter
+ * reaches zero, this struct must not be modified, but this counter can
+ * still be read from within an RCU read-side critical section. When
+ * adding a new rule to an object with a usage counter of zero, we must
+ * wait until the pointer to this object is set to NULL (or recycled).
+ */
+ refcount_t usage;
+ /**
+ * @lock: Guards against concurrent modifications. This lock must be
+ * held from the time @usage drops to zero until any weak references
+ * from @underobj to this object have been cleaned up.
+ *
+ * Lock ordering: inode->i_lock nests inside this.
+ */
+ spinlock_t lock;
+ /**
+ * @underobj: Used when cleaning up an object and to mark an object as
+ * tied to its underlying kernel structure. This pointer is protected
+ * by @lock. Cf. landlock_release_inodes() and release_inode().
+ */
+ void *underobj;
+ union {
+ /**
+ * @rcu_free: Enables lockless use of @usage, @lock and
+ * @underobj from within an RCU read-side critical section.
+ * @rcu_free and @underops are only used by
+ * landlock_put_object().
+ */
+ struct rcu_head rcu_free;
+ /**
+ * @underops: Enables landlock_put_object() to release the
+ * underlying object (e.g. inode).
+ */
+ const struct landlock_object_underops *underops;
+ };
+};
+
+struct landlock_object *landlock_create_object(
+ const struct landlock_object_underops *const underops,
+ void *const underobj);
+
+void landlock_put_object(struct landlock_object *const object);
+
+static inline void landlock_get_object(struct landlock_object *const object)
+{
+ if (object)
+ refcount_inc(&object->usage);
+}
+
+#endif /* _SECURITY_LANDLOCK_OBJECT_H */
--
2.30.0

2021-02-02 23:15:57

by Mickaël Salaün

[permalink] [raw]
Subject: [PATCH v28 02/12] landlock: Add ruleset and domain management

From: Mickaël Salaün <[email protected]>

A Landlock ruleset is mainly a red-black tree with Landlock rules as
nodes. This enables quick updates and lookups to match a requested
access, e.g. to a file. A ruleset is usable through a dedicated file
descriptor (cf. following commit implementing syscalls) which enables a
process to create and populate a ruleset with new rules.

A domain is a ruleset tied to a set of processes. This group of rules
defines the security policy enforced on these processes and their future
children. A domain can transition to a new domain which is the
intersection of all its constraints and those of a ruleset provided by
the current process. This modification only impacts the current process.
This means that a process can only gain more constraints (i.e. lose
accesses) over time.
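
As an illustrative sketch (not part of this patch), a later commit can use
the helpers introduced here to build a new, more constrained domain from the
current one roughly as follows; the function name and the assumption that
the caller owns a reference on *domain are made up for the example:

/*
 * Sketch: stack a new ruleset on top of the current domain.  *domain may
 * be NULL if the task is not yet restricted.
 */
static int sketch_restrict_domain(struct landlock_ruleset **const domain,
		const u32 fs_access_mask)
{
	struct landlock_ruleset *ruleset, *new_dom;

	ruleset = landlock_create_ruleset(fs_access_mask);
	if (IS_ERR(ruleset))
		return PTR_ERR(ruleset);
	/* ...populate @ruleset with landlock_insert_rule() under its lock... */

	/* The new domain is the intersection of *domain and @ruleset. */
	new_dom = landlock_merge_ruleset(*domain, ruleset);
	landlock_put_ruleset(ruleset);
	if (IS_ERR(new_dom))
		return PTR_ERR(new_dom);

	/* The task can only lose access rights across this transition. */
	landlock_put_ruleset(*domain);
	*domain = new_dom;
	return 0;
}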

Cc: James Morris <[email protected]>
Cc: Jann Horn <[email protected]>
Cc: Kees Cook <[email protected]>
Cc: Serge E. Hallyn <[email protected]>
Signed-off-by: Mickaël Salaün <[email protected]>
---

Changes since v27:
* Fix domains with layers of non-overlapping access rights.
* Add stricter limit checks (same semantic).
* Change the grow direction of a rule layer stack to make it the same as
the new ruleset fs_access_masks stack (cosmetic change).
* Cosmetic fix for a comment block.

Changes since v26:
* Fix spelling.

Changes since v25:
* Add build-time checks for the num_layers and num_rules variables
according to LANDLOCK_MAX_NUM_LAYERS and LANDLOCK_MAX_NUM_RULES, and
move these limits to a dedicated file.
* Cosmetic variable renames.

Changes since v24:
* Update struct landlock_rule with a layer stack. This reverts "Always
intersect access rights" from v24 and also adds the ability to tie
access rights with their policy layer. As noted by Jann Horn, always
intersecting access rights made some use cases uselessly more
difficult to handle in user space. Thanks to this new stack, we still
have a deterministic policy behavior whatever their level in the stack
of policies, while using a "union" of accesses when building a
ruleset. The implementation uses a FAM to keep the access checks quick
and memory efficient (4 bytes per layer per inode). Update
insert_rule() accordingly.

Changes since v23:
* Always intersect access rights. Following the filesystem change
logic, make ruleset updates more consistent by always intersecting
access rights (boolean AND) instead of combining them (boolean OR) for
the same layer. This defensive approach could also help prevent user
space from inadvertently allowing multiple access rights for the same
object (e.g. write and execute access on a path hierarchy) instead of
dealing with such inconsistency. This can happen when there is no
deduplication of objects (e.g. paths and underlying inodes) whereas
they get different access rights with landlock_add_rule(2).
* Add extra checks to make sure that:
- there is always an (allocated) object in each used rules;
- when updating a ruleset with a new rule (i.e. not merging two
rulesets), the ruleset doesn't contain multiple layers.
* Hide merge parameter from the public landlock_insert_rule() API. This
helps avoid misuse of this function.
* Replace a remaining hardcoded 1 with SINGLE_DEPTH_NESTING.

Changes since v22:
* Explicitly use RB_ROOT and SINGLE_DEPTH_NESTING (suggested by Jann
Horn).
* Improve comments and fix spelling (suggested by Jann Horn).

Changes since v21:
* Add and clean up comments.

Changes since v18:
* Account rulesets to kmemcg.
* Remove struct holes.
* Cosmetic changes.

Changes since v17:
* Move include/uapi/linux/landlock.h and _LANDLOCK_ACCESS_FS_* to a
following patch.

Changes since v16:
* Allow enforcement of empty ruleset, which enables deny-all policies.

Changes since v15:
* Replace layer_levels and layer_depth with a bitfield of layers, cf.
filesystem commit.
* Rename the LANDLOCK_ACCESS_FS_{UNLINK,RMDIR} with
LANDLOCK_ACCESS_FS_REMOVE_{FILE,DIR} because it makes sense to use
them for the action of renaming a file or a directory, which may lead
to the removal of the source file or directory. Removes the
LANDLOCK_ACCESS_FS_{LINK_TO,RENAME_FROM,RENAME_TO} which are now
replaced with LANDLOCK_ACCESS_FS_REMOVE_{FILE,DIR} and
LANDLOCK_ACCESS_FS_MAKE_* .
* Update the documentation accordingly and highlight how the access
rights are taken into account.
* Change nb_rules from atomic_t to u32 because it is not used anymore by
show_fdinfo().
* Add safeguard for level variables types.
* Check max number of rules.
* Replace struct landlock_access (self and beneath bitfields) with one
bitfield.
* Remove useless variable.
* Add comments.

Changes since v14:
* Simplify the object, rule and ruleset management at the expense of a
less aggressive memory freeing (contributed by Jann Horn, with
additional modifications):
- Make a domain immutable (remove the opportunistic cleaning).
- Remove RCU pointers.
- Merge struct landlock_ref and struct landlock_ruleset_elem into
landlock_rule: get rid of rule's RCU.
- Adjust union.
- Remove the landlock_insert_rule() check about a new object with the
same address as a previously disabled one, because it is not
possible to disable a rule anymore.
Cf. https://lore.kernel.org/lkml/CAG48ez21bEn0wL1bbmTiiu8j9jP5iEWtHOwz4tURUJ+ki0ydYw@mail.gmail.com/
* Fix nested domains by implementing a notion of layer level and depth:
- Update landlock_insert_rule() to manage such layers.
- Add an inherit_ruleset() helper to properly create a new domain.
- Rename landlock_find_access() to landlock_find_rule() and return a
full rule reference.
- Add a layer_level and a layer_depth fields to struct landlock_rule.
- Add a top_layer_level field to struct landlock_ruleset.
* Remove access rights that may be required for FD-only requests:
truncate, getattr, lock, chmod, chown, chgrp, ioctl. This will be
handled in a future evolution of Landlock, but right now the goal is to
lighten the code to ease review.
* Remove LANDLOCK_ACCESS_FS_OPEN and rename
LANDLOCK_ACCESS_FS_{READ,WRITE} with a FILE suffix.
* Rename LANDLOCK_ACCESS_FS_READDIR to match the *_FILE pattern.
* Remove LANDLOCK_ACCESS_FS_MAP which was useless.
* Fix memory leak in put_hierarchy() (reported by Jann Horn).
* Fix use-after-free and rename free_ruleset() (reported by Jann Horn).
* Replace the for loops with rbtree_postorder_for_each_entry_safe().
* Constify variables.
* Only use refcount_inc() through getter helpers.
* Change Landlock_insert_ruleset_access() to
Landlock_insert_ruleset_rule().
* Rename landlock_put_ruleset_enqueue() to landlock_put_ruleset_deferred().
* Improve kernel documentation and add a warning about the unhandled
access/syscall families.
* Move ABI check to syscall.c .

Changes since v13:
* New implementation, inspired by the previous inode eBPF map, but
agnostic to the underlying kernel object.

Previous changes:
https://lore.kernel.org/lkml/[email protected]/
---
security/landlock/Makefile | 2 +-
security/landlock/limits.h | 17 ++
security/landlock/ruleset.c | 469 ++++++++++++++++++++++++++++++++++++
security/landlock/ruleset.h | 165 +++++++++++++
4 files changed, 652 insertions(+), 1 deletion(-)
create mode 100644 security/landlock/limits.h
create mode 100644 security/landlock/ruleset.c
create mode 100644 security/landlock/ruleset.h

diff --git a/security/landlock/Makefile b/security/landlock/Makefile
index cb6deefbf4c0..d846eba445bb 100644
--- a/security/landlock/Makefile
+++ b/security/landlock/Makefile
@@ -1,3 +1,3 @@
obj-$(CONFIG_SECURITY_LANDLOCK) := landlock.o

-landlock-y := object.o
+landlock-y := object.o ruleset.o
diff --git a/security/landlock/limits.h b/security/landlock/limits.h
new file mode 100644
index 000000000000..b734f597bb0e
--- /dev/null
+++ b/security/landlock/limits.h
@@ -0,0 +1,17 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Landlock LSM - Limits for different components
+ *
+ * Copyright © 2016-2020 Mickaël Salaün <[email protected]>
+ * Copyright © 2018-2020 ANSSI
+ */
+
+#ifndef _SECURITY_LANDLOCK_LIMITS_H
+#define _SECURITY_LANDLOCK_LIMITS_H
+
+#include <linux/limits.h>
+
+#define LANDLOCK_MAX_NUM_LAYERS 64
+#define LANDLOCK_MAX_NUM_RULES U32_MAX
+
+#endif /* _SECURITY_LANDLOCK_LIMITS_H */
diff --git a/security/landlock/ruleset.c b/security/landlock/ruleset.c
new file mode 100644
index 000000000000..59c86126ea1c
--- /dev/null
+++ b/security/landlock/ruleset.c
@@ -0,0 +1,469 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Landlock LSM - Ruleset management
+ *
+ * Copyright © 2016-2020 Mickaël Salaün <[email protected]>
+ * Copyright © 2018-2020 ANSSI
+ */
+
+#include <linux/bits.h>
+#include <linux/bug.h>
+#include <linux/compiler_types.h>
+#include <linux/err.h>
+#include <linux/errno.h>
+#include <linux/kernel.h>
+#include <linux/lockdep.h>
+#include <linux/overflow.h>
+#include <linux/rbtree.h>
+#include <linux/refcount.h>
+#include <linux/slab.h>
+#include <linux/spinlock.h>
+#include <linux/workqueue.h>
+
+#include "limits.h"
+#include "object.h"
+#include "ruleset.h"
+
+static struct landlock_ruleset *create_ruleset(const u32 num_layers)
+{
+ struct landlock_ruleset *new_ruleset;
+
+ new_ruleset = kzalloc(struct_size(new_ruleset, fs_access_masks,
+ num_layers), GFP_KERNEL_ACCOUNT);
+ if (!new_ruleset)
+ return ERR_PTR(-ENOMEM);
+ refcount_set(&new_ruleset->usage, 1);
+ mutex_init(&new_ruleset->lock);
+ new_ruleset->root = RB_ROOT;
+ new_ruleset->num_layers = num_layers;
+ /*
+ * hierarchy = NULL
+ * num_rules = 0
+ * fs_access_masks[] = 0
+ */
+ return new_ruleset;
+}
+
+struct landlock_ruleset *landlock_create_ruleset(const u32 fs_access_mask)
+{
+ struct landlock_ruleset *new_ruleset;
+
+ /* Informs about useless ruleset. */
+ if (!fs_access_mask)
+ return ERR_PTR(-ENOMSG);
+ new_ruleset = create_ruleset(1);
+ if (!IS_ERR(new_ruleset))
+ new_ruleset->fs_access_masks[0] = fs_access_mask;
+ return new_ruleset;
+}
+
+static void build_check_rule(void)
+{
+ const struct landlock_rule rule = {
+ .num_layers = ~0,
+ };
+
+ BUILD_BUG_ON(rule.num_layers < LANDLOCK_MAX_NUM_LAYERS);
+}
+
+static struct landlock_rule *create_rule(
+ struct landlock_object *const object,
+ const struct landlock_layer (*const layers)[],
+ const u32 num_layers,
+ const struct landlock_layer *const new_layer)
+{
+ struct landlock_rule *new_rule;
+ u32 new_num_layers;
+
+ build_check_rule();
+ if (new_layer) {
+ /* Should already be checked by landlock_merge_ruleset(). */
+ if (WARN_ON_ONCE(num_layers >= LANDLOCK_MAX_NUM_LAYERS))
+ return ERR_PTR(-E2BIG);
+ new_num_layers = num_layers + 1;
+ } else {
+ new_num_layers = num_layers;
+ }
+ new_rule = kzalloc(struct_size(new_rule, layers, new_num_layers),
+ GFP_KERNEL_ACCOUNT);
+ if (!new_rule)
+ return ERR_PTR(-ENOMEM);
+ RB_CLEAR_NODE(&new_rule->node);
+ landlock_get_object(object);
+ new_rule->object = object;
+ new_rule->num_layers = new_num_layers;
+ /* Copies the original layer stack. */
+ memcpy(new_rule->layers, layers,
+ flex_array_size(new_rule, layers, num_layers));
+ if (new_layer)
+ /* Adds a copy of @new_layer on the layer stack. */
+ new_rule->layers[new_rule->num_layers - 1] = *new_layer;
+ return new_rule;
+}
+
+static void put_rule(struct landlock_rule *const rule)
+{
+ might_sleep();
+ if (!rule)
+ return;
+ landlock_put_object(rule->object);
+ kfree(rule);
+}
+
+static void build_check_ruleset(void)
+{
+ const struct landlock_ruleset ruleset = {
+ .num_rules = ~0,
+ .num_layers = ~0,
+ };
+
+ BUILD_BUG_ON(ruleset.num_rules < LANDLOCK_MAX_NUM_RULES);
+ BUILD_BUG_ON(ruleset.num_layers < LANDLOCK_MAX_NUM_LAYERS);
+}
+
+/**
+ * insert_rule - Create and insert a rule in a ruleset
+ *
+ * @ruleset: The ruleset to be updated.
+ * @object: The object to build the new rule with. The underlying kernel
+ * object must be held by the caller.
+ * @layers: One or multiple layers to be copied into the new rule.
+ * @num_layers: The number of @layers entries.
+ *
+ * When user space requests to add a new rule to a ruleset, @layers only
+ * contains one entry and this entry is not assigned to any level. In this
+ * case, the new rule will extend @ruleset, similarly to a boolean OR between
+ * access rights.
+ *
+ * When merging a ruleset in a domain, or copying a domain, @layers will be
+ * added to @ruleset as new constraints, similarly to a boolean AND between
+ * access rights.
+ */
+static int insert_rule(struct landlock_ruleset *const ruleset,
+ struct landlock_object *const object,
+ const struct landlock_layer (*const layers)[],
+ size_t num_layers)
+{
+ struct rb_node **walker_node;
+ struct rb_node *parent_node = NULL;
+ struct landlock_rule *new_rule;
+
+ might_sleep();
+ lockdep_assert_held(&ruleset->lock);
+ if (WARN_ON_ONCE(!object || !layers))
+ return -ENOENT;
+ walker_node = &(ruleset->root.rb_node);
+ while (*walker_node) {
+ struct landlock_rule *const this = rb_entry(*walker_node,
+ struct landlock_rule, node);
+
+ if (this->object != object) {
+ parent_node = *walker_node;
+ if (this->object < object)
+ walker_node = &((*walker_node)->rb_right);
+ else
+ walker_node = &((*walker_node)->rb_left);
+ continue;
+ }
+
+ /* Only a single-level layer should match an existing rule. */
+ if (WARN_ON_ONCE(num_layers != 1))
+ return -EINVAL;
+
+ /* If there is a matching rule, updates it. */
+ if ((*layers)[0].level == 0) {
+ /*
+ * Extends access rights when the request comes from
+ * landlock_add_rule(2), i.e. @ruleset is not a domain.
+ */
+ if (WARN_ON_ONCE(this->num_layers != 1))
+ return -EINVAL;
+ if (WARN_ON_ONCE(this->layers[0].level != 0))
+ return -EINVAL;
+ this->layers[0].access |= (*layers)[0].access;
+ return 0;
+ }
+
+ if (WARN_ON_ONCE(this->layers[0].level == 0))
+ return -EINVAL;
+
+ /*
+ * Intersects access rights when it is a merge between a
+ * ruleset and a domain.
+ */
+ new_rule = create_rule(object, &this->layers, this->num_layers,
+ &(*layers)[0]);
+ if (IS_ERR(new_rule))
+ return PTR_ERR(new_rule);
+ rb_replace_node(&this->node, &new_rule->node, &ruleset->root);
+ put_rule(this);
+ return 0;
+ }
+
+ /* There is no match for @object. */
+ build_check_ruleset();
+ if (ruleset->num_rules >= LANDLOCK_MAX_NUM_RULES)
+ return -E2BIG;
+ new_rule = create_rule(object, layers, num_layers, NULL);
+ if (IS_ERR(new_rule))
+ return PTR_ERR(new_rule);
+ rb_link_node(&new_rule->node, parent_node, walker_node);
+ rb_insert_color(&new_rule->node, &ruleset->root);
+ ruleset->num_rules++;
+ return 0;
+}
+
+static void build_check_layer(void)
+{
+ const struct landlock_layer layer = {
+ .level = ~0,
+ };
+
+ BUILD_BUG_ON(layer.level < LANDLOCK_MAX_NUM_LAYERS);
+}
+
+/* @ruleset must be locked by the caller. */
+int landlock_insert_rule(struct landlock_ruleset *const ruleset,
+ struct landlock_object *const object, const u32 access)
+{
+ struct landlock_layer layers[] = {{
+ .access = access,
+ /* When @level is zero, insert_rule() extends @ruleset. */
+ .level = 0,
+ }};
+
+ build_check_layer();
+ return insert_rule(ruleset, object, &layers, ARRAY_SIZE(layers));
+}
+
+static inline void get_hierarchy(struct landlock_hierarchy *const hierarchy)
+{
+ if (hierarchy)
+ refcount_inc(&hierarchy->usage);
+}
+
+static void put_hierarchy(struct landlock_hierarchy *hierarchy)
+{
+ while (hierarchy && refcount_dec_and_test(&hierarchy->usage)) {
+ const struct landlock_hierarchy *const freeme = hierarchy;
+
+ hierarchy = hierarchy->parent;
+ kfree(freeme);
+ }
+}
+
+static int merge_ruleset(struct landlock_ruleset *const dst,
+ struct landlock_ruleset *const src)
+{
+ struct landlock_rule *walker_rule, *next_rule;
+ int err = 0;
+
+ might_sleep();
+ /* Should already be checked by landlock_merge_ruleset() */
+ if (WARN_ON_ONCE(!src))
+ return 0;
+ /* Only merge into a domain. */
+ if (WARN_ON_ONCE(!dst || !dst->hierarchy))
+ return -EINVAL;
+
+ /* Locks @dst first because we are its only owner. */
+ mutex_lock(&dst->lock);
+ mutex_lock_nested(&src->lock, SINGLE_DEPTH_NESTING);
+
+ /* Stacks the new layer. */
+ if (WARN_ON_ONCE(src->num_layers != 1 || dst->num_layers < 1)) {
+ err = -EINVAL;
+ goto out_unlock;
+ }
+ dst->fs_access_masks[dst->num_layers - 1] = src->fs_access_masks[0];
+
+ /* Merges the @src tree. */
+ rbtree_postorder_for_each_entry_safe(walker_rule, next_rule,
+ &src->root, node) {
+ struct landlock_layer layers[] = {{
+ .level = dst->num_layers,
+ }};
+
+ if (WARN_ON_ONCE(walker_rule->num_layers != 1)) {
+ err = -EINVAL;
+ goto out_unlock;
+ }
+ if (WARN_ON_ONCE(walker_rule->layers[0].level != 0)) {
+ err = -EINVAL;
+ goto out_unlock;
+ }
+ layers[0].access = walker_rule->layers[0].access;
+ err = insert_rule(dst, walker_rule->object, &layers,
+ ARRAY_SIZE(layers));
+ if (err)
+ goto out_unlock;
+ }
+
+out_unlock:
+ mutex_unlock(&src->lock);
+ mutex_unlock(&dst->lock);
+ return err;
+}
+
+static int inherit_ruleset(struct landlock_ruleset *const parent,
+ struct landlock_ruleset *const child)
+{
+ struct landlock_rule *walker_rule, *next_rule;
+ int err = 0;
+
+ might_sleep();
+ if (!parent)
+ return 0;
+
+ /* Locks @child first because we are its only owner. */
+ mutex_lock(&child->lock);
+ mutex_lock_nested(&parent->lock, SINGLE_DEPTH_NESTING);
+
+ /* Copies the @parent tree. */
+ rbtree_postorder_for_each_entry_safe(walker_rule, next_rule,
+ &parent->root, node) {
+ err = insert_rule(child, walker_rule->object,
+ &walker_rule->layers, walker_rule->num_layers);
+ if (err)
+ goto out_unlock;
+ }
+
+ if (WARN_ON_ONCE(child->num_layers <= parent->num_layers)) {
+ err = -EINVAL;
+ goto out_unlock;
+ }
+ /* Copies the parent layer stack and leaves a space for the new layer. */
+ memcpy(child->fs_access_masks, parent->fs_access_masks,
+ flex_array_size(parent, fs_access_masks, parent->num_layers));
+
+ if (WARN_ON_ONCE(!parent->hierarchy)) {
+ err = -EINVAL;
+ goto out_unlock;
+ }
+ get_hierarchy(parent->hierarchy);
+ child->hierarchy->parent = parent->hierarchy;
+
+out_unlock:
+ mutex_unlock(&parent->lock);
+ mutex_unlock(&child->lock);
+ return err;
+}
+
+static void free_ruleset(struct landlock_ruleset *const ruleset)
+{
+ struct landlock_rule *freeme, *next;
+
+ might_sleep();
+ rbtree_postorder_for_each_entry_safe(freeme, next, &ruleset->root,
+ node)
+ put_rule(freeme);
+ put_hierarchy(ruleset->hierarchy);
+ kfree(ruleset);
+}
+
+void landlock_put_ruleset(struct landlock_ruleset *const ruleset)
+{
+ might_sleep();
+ if (ruleset && refcount_dec_and_test(&ruleset->usage))
+ free_ruleset(ruleset);
+}
+
+static void free_ruleset_work(struct work_struct *const work)
+{
+ struct landlock_ruleset *ruleset;
+
+ ruleset = container_of(work, struct landlock_ruleset, work_free);
+ free_ruleset(ruleset);
+}
+
+void landlock_put_ruleset_deferred(struct landlock_ruleset *const ruleset)
+{
+ if (ruleset && refcount_dec_and_test(&ruleset->usage)) {
+ INIT_WORK(&ruleset->work_free, free_ruleset_work);
+ schedule_work(&ruleset->work_free);
+ }
+}
+
+/**
+ * landlock_merge_ruleset - Merge a ruleset with a domain
+ *
+ * @parent: Parent domain.
+ * @ruleset: New ruleset to be merged.
+ *
+ * Returns the intersection of @parent and @ruleset, or returns @parent if
+ * @ruleset is empty, or returns a duplicate of @ruleset if @parent is empty.
+ */
+struct landlock_ruleset *landlock_merge_ruleset(
+ struct landlock_ruleset *const parent,
+ struct landlock_ruleset *const ruleset)
+{
+ struct landlock_ruleset *new_dom;
+ u32 num_layers;
+ int err;
+
+ might_sleep();
+ if (WARN_ON_ONCE(!ruleset || parent == ruleset))
+ return ERR_PTR(-EINVAL);
+
+ if (parent) {
+ if (parent->num_layers >= LANDLOCK_MAX_NUM_LAYERS)
+ return ERR_PTR(-E2BIG);
+ num_layers = parent->num_layers + 1;
+ } else {
+ num_layers = 1;
+ }
+
+ /* Creates a new domain... */
+ new_dom = create_ruleset(num_layers);
+ if (IS_ERR(new_dom))
+ return new_dom;
+ new_dom->hierarchy = kzalloc(sizeof(*new_dom->hierarchy),
+ GFP_KERNEL_ACCOUNT);
+ if (!new_dom->hierarchy) {
+ err = -ENOMEM;
+ goto out_put_dom;
+ }
+ refcount_set(&new_dom->hierarchy->usage, 1);
+
+ /* ...as a child of @parent... */
+ err = inherit_ruleset(parent, new_dom);
+ if (err)
+ goto out_put_dom;
+
+ /* ...and including @ruleset. */
+ err = merge_ruleset(new_dom, ruleset);
+ if (err)
+ goto out_put_dom;
+
+ return new_dom;
+
+out_put_dom:
+ landlock_put_ruleset(new_dom);
+ return ERR_PTR(err);
+}
+
+/*
+ * The returned access has the same lifetime as @ruleset.
+ */
+const struct landlock_rule *landlock_find_rule(
+ const struct landlock_ruleset *const ruleset,
+ const struct landlock_object *const object)
+{
+ const struct rb_node *node;
+
+ if (!object)
+ return NULL;
+ node = ruleset->root.rb_node;
+ while (node) {
+ struct landlock_rule *this = rb_entry(node,
+ struct landlock_rule, node);
+
+ if (this->object == object)
+ return this;
+ if (this->object < object)
+ node = node->rb_right;
+ else
+ node = node->rb_left;
+ }
+ return NULL;
+}
diff --git a/security/landlock/ruleset.h b/security/landlock/ruleset.h
new file mode 100644
index 000000000000..6b1198458b37
--- /dev/null
+++ b/security/landlock/ruleset.h
@@ -0,0 +1,165 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Landlock LSM - Ruleset management
+ *
+ * Copyright © 2016-2020 Mickaël Salaün <[email protected]>
+ * Copyright © 2018-2020 ANSSI
+ */
+
+#ifndef _SECURITY_LANDLOCK_RULESET_H
+#define _SECURITY_LANDLOCK_RULESET_H
+
+#include <linux/mutex.h>
+#include <linux/rbtree.h>
+#include <linux/refcount.h>
+#include <linux/workqueue.h>
+
+#include "object.h"
+
+/**
+ * struct landlock_layer - Access rights for a given layer
+ */
+struct landlock_layer {
+ /**
+ * @level: Position of this layer in the layer stack.
+ */
+ u16 level;
+ /**
+ * @access: Bitfield of allowed actions on the kernel object. They are
+ * relative to the object type (e.g. %LANDLOCK_ACCESS_FS_READ_FILE).
+ */
+ u16 access;
+};
+
+/**
+ * struct landlock_rule - Access rights tied to an object
+ */
+struct landlock_rule {
+ /**
+ * @node: Node in the ruleset's red-black tree.
+ */
+ struct rb_node node;
+ /**
+ * @object: Pointer to identify a kernel object (e.g. an inode). This
+ * is used as a key for this ruleset element. This pointer is set once
+ * and never modified. It always points to an allocated object because
+ * each rule increments the refcount of its object.
+ */
+ struct landlock_object *object;
+ /**
+ * @num_layers: Number of entries in @layers.
+ */
+ u32 num_layers;
+ /**
+ * @layers: Stack of layers, from the oldest to the newest, implemented
+ * as a flexible array member (FAM).
+ */
+ struct landlock_layer layers[];
+};
+
+/**
+ * struct landlock_hierarchy - Node in a ruleset hierarchy
+ */
+struct landlock_hierarchy {
+ /**
+ * @parent: Pointer to the parent node, or NULL if it is a root
+ * Landlock domain.
+ */
+ struct landlock_hierarchy *parent;
+ /**
+ * @usage: Number of potential children domains plus their parent
+ * domain.
+ */
+ refcount_t usage;
+};
+
+/**
+ * struct landlock_ruleset - Landlock ruleset
+ *
+ * This data structure must contain unique entries, be updatable, and quick to
+ * match an object.
+ */
+struct landlock_ruleset {
+ /**
+ * @root: Root of a red-black tree containing &struct landlock_rule
+ * nodes. Once a ruleset is tied to a process (i.e. as a domain), this
+ * tree is immutable until @usage reaches zero.
+ */
+ struct rb_root root;
+ /**
+ * @hierarchy: Enables hierarchy identification even when a parent
+ * domain vanishes. This is needed for the ptrace protection.
+ */
+ struct landlock_hierarchy *hierarchy;
+ union {
+ /**
+ * @work_free: Enables to free a ruleset within a lockless
+ * section. This is only used by
+ * landlock_put_ruleset_deferred() when @usage reaches zero.
+ * The fields @lock, @usage, @num_rules, @num_layers and
+ * @fs_access_masks are then unused.
+ */
+ struct work_struct work_free;
+ struct {
+ /**
+ * @lock: Guards against concurrent modifications of
+ * @root, if @usage is greater than zero.
+ */
+ struct mutex lock;
+ /**
+ * @usage: Number of processes (i.e. domains) or file
+ * descriptors referencing this ruleset.
+ */
+ refcount_t usage;
+ /**
+ * @num_rules: Number of non-overlapping (i.e. not for
+ * the same object) rules in this ruleset.
+ */
+ u32 num_rules;
+ /**
+ * @num_layers: Number of layers that are used in this
+ * ruleset. This enables to check that all the layers
+ * allow an access request. A value of 0 identifies a
+ * non-merged ruleset (i.e. not a domain).
+ */
+ u32 num_layers;
+ /**
+ * @fs_access_masks: Contains the subset of filesystem
+ * actions that are restricted by a ruleset. A domain
+ * saves all layers of merged rulesets in a stack
+ * (FAM), starting from the first layer to the last
+ * one. These layers are used when merging rulesets,
+ * for user space backward compatibility (i.e.
+ * future-proof), and to properly handle merged
+ * rulesets without overlapping access rights. These
+ * layers are set once and never changed for the
+ * lifetime of the ruleset.
+ */
+ u16 fs_access_masks[];
+ };
+ };
+};
+
+struct landlock_ruleset *landlock_create_ruleset(const u32 fs_access_mask);
+
+void landlock_put_ruleset(struct landlock_ruleset *const ruleset);
+void landlock_put_ruleset_deferred(struct landlock_ruleset *const ruleset);
+
+int landlock_insert_rule(struct landlock_ruleset *const ruleset,
+ struct landlock_object *const object, const u32 access);
+
+struct landlock_ruleset *landlock_merge_ruleset(
+ struct landlock_ruleset *const parent,
+ struct landlock_ruleset *const ruleset);
+
+const struct landlock_rule *landlock_find_rule(
+ const struct landlock_ruleset *const ruleset,
+ const struct landlock_object *const object);
+
+static inline void landlock_get_ruleset(struct landlock_ruleset *const ruleset)
+{
+ if (ruleset)
+ refcount_inc(&ruleset->usage);
+}
+
+#endif /* _SECURITY_LANDLOCK_RULESET_H */
--
2.30.0

2021-02-03 15:30:28

by Mickaël Salaün

[permalink] [raw]
Subject: Re: [PATCH v28 01/12] landlock: Add object management


On 03/02/2021 15:21, Serge E. Hallyn wrote:
> On Tue, Feb 02, 2021 at 05:26:59PM +0100, Mickaël Salaün wrote:
>> From: Mickaël Salaün <[email protected]>
>>
>> A Landlock object enables to identify a kernel object (e.g. an inode).
>> A Landlock rule is a set of access rights allowed on an object. Rules
>> are grouped in rulesets that may be tied to a set of processes (i.e.
>> subjects) to enforce a scoped access-control (i.e. a domain).
>>
>> Because Landlock's goal is to empower any process (especially
>> unprivileged ones) to sandbox themselves, we cannot rely on a
>> system-wide object identification such as file extended attributes.
>> Indeed, we need innocuous, composable and modular access-controls.
>>
>> The main challenge with these constraints is to identify kernel objects
>> while this identification is useful (i.e. when a security policy makes
>> use of this object). But this identification data should be freed once
>> no policy is using it. This ephemeral tagging should not and may not be
>> written in the filesystem. We then need to manage the lifetime of a
>> rule according to the lifetime of its objects. To avoid a global lock,
>> this implementation make use of RCU and counters to safely reference
>> objects.
>>
>> A following commit uses this generic object management for inodes.
>>
>> Cc: James Morris <[email protected]>
>> Cc: Kees Cook <[email protected]>
>> Cc: Serge E. Hallyn <[email protected]>
>
> Acked-by: Serge Hallyn <[email protected]>
>
> Just a few suggestions for the description below.
>
>> Signed-off-by: Mickaël Salaün <[email protected]>
>> Reviewed-by: Jann Horn <[email protected]>
>> diff --git a/security/landlock/Kconfig b/security/landlock/Kconfig
>> new file mode 100644
>> index 000000000000..79b7d0c3b11e
>> --- /dev/null
>> +++ b/security/landlock/Kconfig
>> @@ -0,0 +1,21 @@
>> +# SPDX-License-Identifier: GPL-2.0-only
>> +
>> +config SECURITY_LANDLOCK
>> + bool "Landlock support"
>> + depends on SECURITY
>> + select SECURITY_PATH
>> + help
>> + Landlock is a safe sandboxing mechanism that enables processes to
>
> "safe" probably doesn't need to be there :)
>
>> + restrict themselves (and their future children) by gradually
>> + enforcing tailored access control policies. A security policy is a
>
> You're redefining "security policy" which could be confusing. How about
> saying "a landlock security policy is a..."?
>
>> + set of access rights (e.g. open a file in read-only, make a
>> + directory, etc.) tied to a file hierarchy. Such policy can be
>> + configured and enforced by any processes for themselves thanks to
>
> s/thanks to/using the/ ?

OK for these three modifications. Thanks!

>
>> + dedicated system calls: landlock_create_ruleset(),
>> + landlock_add_rule(), and landlock_restrict_self().
>> +
>> + See Documentation/userspace-api/landlock.rst for further information.
>> +
>> + If you are unsure how to answer this question, answer N. Otherwise,
>> + you should also prepend "landlock," to the content of CONFIG_LSM to
>> + enable Landlock at boot time.
>> diff --git a/security/landlock/Makefile b/security/landlock/Makefile
>> new file mode 100644
>> index 000000000000..cb6deefbf4c0
>> --- /dev/null
>> +++ b/security/landlock/Makefile
>> @@ -0,0 +1,3 @@
>> +obj-$(CONFIG_SECURITY_LANDLOCK) := landlock.o
>> +
>> +landlock-y := object.o
>> diff --git a/security/landlock/object.c b/security/landlock/object.c
>> new file mode 100644
>> index 000000000000..d674fdf9ff04
>> --- /dev/null
>> +++ b/security/landlock/object.c
>> @@ -0,0 +1,67 @@
>> +// SPDX-License-Identifier: GPL-2.0-only
>> +/*
>> + * Landlock LSM - Object management
>> + *
>> + * Copyright ? 2016-2020 Micka?l Sala?n <[email protected]>
>> + * Copyright ? 2018-2020 ANSSI
>> + */
>> +
>> +#include <linux/bug.h>
>> +#include <linux/compiler_types.h>
>> +#include <linux/err.h>
>> +#include <linux/kernel.h>
>> +#include <linux/rcupdate.h>
>> +#include <linux/refcount.h>
>> +#include <linux/slab.h>
>> +#include <linux/spinlock.h>
>> +
>> +#include "object.h"
>> +
>> +struct landlock_object *landlock_create_object(
>> + const struct landlock_object_underops *const underops,
>> + void *const underobj)
>> +{
>> + struct landlock_object *new_object;
>> +
>> + if (WARN_ON_ONCE(!underops || !underobj))
>> + return ERR_PTR(-ENOENT);
>> + new_object = kzalloc(sizeof(*new_object), GFP_KERNEL_ACCOUNT);
>> + if (!new_object)
>> + return ERR_PTR(-ENOMEM);
>> + refcount_set(&new_object->usage, 1);
>> + spin_lock_init(&new_object->lock);
>> + new_object->underops = underops;
>> + new_object->underobj = underobj;
>> + return new_object;
>> +}
>> +
>> +/*
>> + * The caller must own the object (i.e. thanks to object->usage) to safely put
>> + * it.
>> + */
>> +void landlock_put_object(struct landlock_object *const object)
>> +{
>> + /*
>> + * The call to @object->underops->release(object) might sleep, e.g.
>> + * because of iput().
>> + */
>> + might_sleep();
>> + if (!object)
>> + return;
>> +
>> + /*
>> + * If the @object's refcount cannot drop to zero, we can just decrement
>> + * the refcount without holding a lock. Otherwise, the decrement must
>> + * happen under @object->lock for synchronization with things like
>> + * get_inode_object().
>> + */
>> + if (refcount_dec_and_lock(&object->usage, &object->lock)) {
>> + __acquire(&object->lock);
>> + /*
>> + * With @object->lock initially held, remove the reference from
>> + * @object->underobj to @object (if it still exists).
>> + */
>> + object->underops->release(object);
>> + kfree_rcu(object, rcu_free);
>> + }
>> +}
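
[Editor's note.] To illustrate why the zero transition above must happen under @object->lock, here is a hedged sketch of the getter side it synchronizes with. The real get_inode_object() only arrives with the filesystem patch; the inode_landlock() accessor and the function name below are hypothetical.

	/*
	 * Hypothetical getter sketch: the fast path takes a reference without
	 * any lock; if @usage already dropped to zero, the getter waits on
	 * @object->lock (still under RCU, so the object cannot be freed yet)
	 * for landlock_put_object() to finish detaching the underlying
	 * object, and lets the caller create a replacement object.
	 */
	static struct landlock_object *get_object_sketch(struct inode *const inode)
	{
		struct landlock_object *object;

		rcu_read_lock();
		object = rcu_dereference(inode_landlock(inode)->object);
		if (object && refcount_inc_not_zero(&object->usage)) {
			rcu_read_unlock();
			return object;
		}
		if (object) {
			/* Waits for the concurrent release to complete. */
			spin_lock(&object->lock);
			spin_unlock(&object->lock);
		}
		rcu_read_unlock();
		return NULL;
	}
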
>> diff --git a/security/landlock/object.h b/security/landlock/object.h
>> new file mode 100644
>> index 000000000000..56f17c51df01
>> --- /dev/null
>> +++ b/security/landlock/object.h
>> @@ -0,0 +1,91 @@
>> +/* SPDX-License-Identifier: GPL-2.0-only */
>> +/*
>> + * Landlock LSM - Object management
>> + *
>> + * Copyright © 2016-2020 Mickaël Salaün <[email protected]>
>> + * Copyright © 2018-2020 ANSSI
>> + */
>> +
>> +#ifndef _SECURITY_LANDLOCK_OBJECT_H
>> +#define _SECURITY_LANDLOCK_OBJECT_H
>> +
>> +#include <linux/compiler_types.h>
>> +#include <linux/refcount.h>
>> +#include <linux/spinlock.h>
>> +
>> +struct landlock_object;
>> +
>> +/**
>> + * struct landlock_object_underops - Operations on an underlying object
>> + */
>> +struct landlock_object_underops {
>> + /**
>> + * @release: Releases the underlying object (e.g. iput() for an inode).
>> + */
>> + void (*release)(struct landlock_object *const object)
>> + __releases(object->lock);
>> +};
>> +
>> +/**
>> + * struct landlock_object - Security blob tied to a kernel object
>> + *
>> + * The goal of this structure is to enable tying a set of ephemeral access
>> + * rights (pertaining to different domains) to a kernel object (e.g. an inode)
>> + * in a safe way. This implies handling concurrent use and modification.
>> + *
>> + * The lifetime of a &struct landlock_object depends on the rules referring to
>> + * it.
>> + */
>> +struct landlock_object {
>> + /**
>> + * @usage: This counter is used to tie an object to the rules matching
>> + * it or to keep it alive while adding a new rule. If this counter
>> + * reaches zero, this struct must not be modified, but this counter can
>> + * still be read from within an RCU read-side critical section. When
>> + * adding a new rule to an object with a usage counter of zero, we must
>> + * wait until the pointer to this object is set to NULL (or recycled).
>> + */
>> + refcount_t usage;
>> + /**
>> + * @lock: Guards against concurrent modifications. This lock must be
>> + * held from the time @usage drops to zero until any weak references
>> + * from @underobj to this object have been cleaned up.
>> + *
>> + * Lock ordering: inode->i_lock nests inside this.
>> + */
>> + spinlock_t lock;
>> + /**
>> + * @underobj: Used when cleaning up an object and to mark an object as
>> + * tied to its underlying kernel structure. This pointer is protected
>> + * by @lock. Cf. landlock_release_inodes() and release_inode().
>> + */
>> + void *underobj;
>> + union {
>> + /**
>> + * @rcu_free: Enables lockless use of @usage, @lock and
>> + * @underobj from within an RCU read-side critical section.
>> + * @rcu_free and @underops are only used by
>> + * landlock_put_object().
>> + */
>> + struct rcu_head rcu_free;
>> + /**
>> + * @underops: Enables landlock_put_object() to release the
>> + * underlying object (e.g. inode).
>> + */
>> + const struct landlock_object_underops *underops;
>> + };
>> +};
>> +
>> +struct landlock_object *landlock_create_object(
>> + const struct landlock_object_underops *const underops,
>> + void *const underobj);
>> +
>> +void landlock_put_object(struct landlock_object *const object);
>> +
>> +static inline void landlock_get_object(struct landlock_object *const object)
>> +{
>> + if (object)
>> + refcount_inc(&object->usage);
>> +}
>> +
>> +#endif /* _SECURITY_LANDLOCK_OBJECT_H */
>> --
>> 2.30.0

2021-02-04 03:33:50

by Serge E. Hallyn

[permalink] [raw]
Subject: Re: [PATCH v28 02/12] landlock: Add ruleset and domain management

On Tue, Feb 02, 2021 at 05:27:00PM +0100, Mickaël Salaün wrote:
> From: Mickaël Salaün <[email protected]>
>
> A Landlock ruleset is mainly a red-black tree with Landlock rules as
> nodes. This enables quick update and lookup to match a requested
> access, e.g. to a file. A ruleset is usable through a dedicated file
> descriptor (cf. following commit implementing syscalls) which enables a
> process to create and populate a ruleset with new rules.
>
> A domain is a ruleset tied to a set of processes. This group of rules
> defines the security policy enforced on these processes and their future
> children. A domain can transition to a new domain which is the
> intersection of all its constraints and those of a ruleset provided by
> the current process. This modification only impacts the current process.
> This means that a process can only gain more constraints (i.e. lose
> accesses) over time.
>
> Cc: James Morris <[email protected]>
> Cc: Jann Horn <[email protected]>
> Cc: Kees Cook <[email protected]>
> Cc: Serge E. Hallyn <[email protected]>

Acked-by: Serge Hallyn <[email protected]>

> Signed-off-by: Mickaël Salaün <[email protected]>
> ---
>
> Changes since v27:
> * Fix domains with layers of non-overlapping access rights.
> * Add stricter limit checks (same semantic).
> * Change the grow direction of a rule layer stack to make it the same as
> the new ruleset fs_access_masks stack (cosmetic change).
> * Cosmetic fix for a comment block.
>
> Changes since v26:
> * Fix spelling.
>
> Changes since v25:
> * Add build-time checks for the num_layers and num_rules variables
> according to LANDLOCK_MAX_NUM_LAYERS and LANDLOCK_MAX_NUM_RULES, and
> move these limits to a dedicated file.
> * Cosmetic variable renames.
>
> Changes since v24:
> * Update struct landlock_rule with a layer stack. This reverts "Always
> intersect access rights" from v24 and also adds the ability to tie
> access rights with their policy layer. As noted by Jann Horn, always
> intersecting access rights made some use cases uselessly more
> difficult to handle in user space. Thanks to this new stack, we still
> have a deterministic policy behavior whatever the level in the stack
> of policies, while using a "union" of accesses when building a
> ruleset. The implementation uses a FAM to keep the access checks quick
> and memory efficient (4 bytes per layer per inode). Update
> insert_rule() accordingly.
>
> Changes since v23:
> * Always intersect access rights. Following the filesystem change
> logic, make ruleset updates more consistent by always intersecting
> access rights (boolean AND) instead of combining them (boolean OR) for
> the same layer. This defensive approach could also help prevent user
> space from inadvertently allowing multiple access rights for the same
> object (e.g. write and execute access on a path hierarchy) instead of
> dealing with such inconsistency. This can happen when there is no
> deduplication of objects (e.g. paths and underlying inodes) whereas
> they get different access rights with landlock_add_rule(2).
> * Add extra checks to make sure that:
> - there is always an (allocated) object in each used rule;
> - when updating a ruleset with a new rule (i.e. not merging two
> rulesets), the ruleset doesn't contain multiple layers.
> * Hide merge parameter from the public landlock_insert_rule() API. This
> helps avoid misuse of this function.
> * Replace a remaining hardcoded 1 with SINGLE_DEPTH_NESTING.
>
> Changes since v22:
> * Explicitely use RB_ROOT and SINGLE_DEPTH_NESTING (suggested by Jann
> Horn).
> * Improve comments and fix spelling (suggested by Jann Horn).
>
> Changes since v21:
> * Add and clean up comments.
>
> Changes since v18:
> * Account rulesets to kmemcg.
> * Remove struct holes.
> * Cosmetic changes.
>
> Changes since v17:
> * Move include/uapi/linux/landlock.h and _LANDLOCK_ACCESS_FS_* to a
> following patch.
>
> Changes since v16:
> * Allow enforcement of empty ruleset, which enables deny-all policies.
>
> Changes since v15:
> * Replace layer_levels and layer_depth with a bitfield of layers, cf.
> filesystem commit.
> * Rename the LANDLOCK_ACCESS_FS_{UNLINK,RMDIR} with
> LANDLOCK_ACCESS_FS_REMOVE_{FILE,DIR} because it makes sense to use
> them for the action of renaming a file or a directory, which may lead
> to the removal of the source file or directory. Removes the
> LANDLOCK_ACCESS_FS_{LINK_TO,RENAME_FROM,RENAME_TO} which are now
> replaced with LANDLOCK_ACCESS_FS_REMOVE_{FILE,DIR} and
> LANDLOCK_ACCESS_FS_MAKE_* .
> * Update the documentation accordingly and highlight how the access
> rights are taken into account.
> * Change nb_rules from atomic_t to u32 because it is not used anymore by
> show_fdinfo().
> * Add safeguard for level variables types.
> * Check max number of rules.
> * Replace struct landlock_access (self and beneath bitfields) with one
> bitfield.
> * Remove useless variable.
> * Add comments.
>
> Changes since v14:
> * Simplify the object, rule and ruleset management at the expense of a
> less aggressive memory freeing (contributed by Jann Horn, with
> additional modifications):
> - Make a domain immutable (remove the opportunistic cleaning).
> - Remove RCU pointers.
> - Merge struct landlock_ref and struct landlock_ruleset_elem into
> landlock_rule: get rid of rule's RCU.
> - Adjust union.
> - Remove the landlock_insert_rule() check about a new object with the
> same address as a previously disabled one, because it is not
> possible to disable a rule anymore.
> Cf. https://lore.kernel.org/lkml/CAG48ez21bEn0wL1bbmTiiu8j9jP5iEWtHOwz4tURUJ+ki0ydYw@mail.gmail.com/
> * Fix nested domains by implementing a notion of layer level and depth:
> - Update landlock_insert_rule() to manage such layers.
> - Add an inherit_ruleset() helper to properly create a new domain.
> - Rename landlock_find_access() to landlock_find_rule() and return a
> full rule reference.
> - Add a layer_level and a layer_depth fields to struct landlock_rule.
> - Add a top_layer_level field to struct landlock_ruleset.
> * Remove access rights that may be required for FD-only requests:
> truncate, getattr, lock, chmod, chown, chgrp, ioctl. This will be
> handled in a future evolution of Landlock, but right now the goal is to
> lighten the code to ease review.
> * Remove LANDLOCK_ACCESS_FS_OPEN and rename
> LANDLOCK_ACCESS_FS_{READ,WRITE} with a FILE suffix.
> * Rename LANDLOCK_ACCESS_FS_READDIR to match the *_FILE pattern.
> * Remove LANDLOCK_ACCESS_FS_MAP which was useless.
> * Fix memory leak in put_hierarchy() (reported by Jann Horn).
> * Fix use-after-free and rename free_ruleset() (reported by Jann Horn).
> * Replace the for loops with rbtree_postorder_for_each_entry_safe().
> * Constify variables.
> * Only use refcount_inc() through getter helpers.
> * Change landlock_insert_ruleset_access() to
> landlock_insert_ruleset_rule().
> * Rename landlock_put_ruleset_enqueue() to landlock_put_ruleset_deferred().
> * Improve kernel documentation and add a warning about the unhandled
> access/syscall families.
> * Move ABI check to syscall.c .
>
> Changes since v13:
> * New implementation, inspired by the previous inode eBPF map, but
> agnostic to the underlying kernel object.
>
> Previous changes:
> https://lore.kernel.org/lkml/[email protected]/
> ---
> security/landlock/Makefile | 2 +-
> security/landlock/limits.h | 17 ++
> security/landlock/ruleset.c | 469 ++++++++++++++++++++++++++++++++++++
> security/landlock/ruleset.h | 165 +++++++++++++
> 4 files changed, 652 insertions(+), 1 deletion(-)
> create mode 100644 security/landlock/limits.h
> create mode 100644 security/landlock/ruleset.c
> create mode 100644 security/landlock/ruleset.h
>
> diff --git a/security/landlock/Makefile b/security/landlock/Makefile
> index cb6deefbf4c0..d846eba445bb 100644
> --- a/security/landlock/Makefile
> +++ b/security/landlock/Makefile
> @@ -1,3 +1,3 @@
> obj-$(CONFIG_SECURITY_LANDLOCK) := landlock.o
>
> -landlock-y := object.o
> +landlock-y := object.o ruleset.o
> diff --git a/security/landlock/limits.h b/security/landlock/limits.h
> new file mode 100644
> index 000000000000..b734f597bb0e
> --- /dev/null
> +++ b/security/landlock/limits.h
> @@ -0,0 +1,17 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +/*
> + * Landlock LSM - Limits for different components
> + *
> + * Copyright © 2016-2020 Mickaël Salaün <[email protected]>
> + * Copyright © 2018-2020 ANSSI
> + */
> +
> +#ifndef _SECURITY_LANDLOCK_LIMITS_H
> +#define _SECURITY_LANDLOCK_LIMITS_H
> +
> +#include <linux/limits.h>
> +
> +#define LANDLOCK_MAX_NUM_LAYERS 64
> +#define LANDLOCK_MAX_NUM_RULES U32_MAX
> +
> +#endif /* _SECURITY_LANDLOCK_LIMITS_H */
> diff --git a/security/landlock/ruleset.c b/security/landlock/ruleset.c
> new file mode 100644
> index 000000000000..59c86126ea1c
> --- /dev/null
> +++ b/security/landlock/ruleset.c
> @@ -0,0 +1,469 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +/*
> + * Landlock LSM - Ruleset management
> + *
> + * Copyright © 2016-2020 Mickaël Salaün <[email protected]>
> + * Copyright © 2018-2020 ANSSI
> + */
> +
> +#include <linux/bits.h>
> +#include <linux/bug.h>
> +#include <linux/compiler_types.h>
> +#include <linux/err.h>
> +#include <linux/errno.h>
> +#include <linux/kernel.h>
> +#include <linux/lockdep.h>
> +#include <linux/overflow.h>
> +#include <linux/rbtree.h>
> +#include <linux/refcount.h>
> +#include <linux/slab.h>
> +#include <linux/spinlock.h>
> +#include <linux/workqueue.h>
> +
> +#include "limits.h"
> +#include "object.h"
> +#include "ruleset.h"
> +
> +static struct landlock_ruleset *create_ruleset(const u32 num_layers)
> +{
> + struct landlock_ruleset *new_ruleset;
> +
> + new_ruleset = kzalloc(struct_size(new_ruleset, fs_access_masks,
> + num_layers), GFP_KERNEL_ACCOUNT);
> + if (!new_ruleset)
> + return ERR_PTR(-ENOMEM);
> + refcount_set(&new_ruleset->usage, 1);
> + mutex_init(&new_ruleset->lock);
> + new_ruleset->root = RB_ROOT;
> + new_ruleset->num_layers = num_layers;
> + /*
> + * hierarchy = NULL
> + * num_rules = 0
> + * fs_access_masks[] = 0
> + */
> + return new_ruleset;
> +}
> +
> +struct landlock_ruleset *landlock_create_ruleset(const u32 fs_access_mask)
> +{
> + struct landlock_ruleset *new_ruleset;
> +
> + /* Informs about useless ruleset. */
> + if (!fs_access_mask)
> + return ERR_PTR(-ENOMSG);
> + new_ruleset = create_ruleset(1);
> + if (!IS_ERR(new_ruleset))
> + new_ruleset->fs_access_masks[0] = fs_access_mask;
> + return new_ruleset;
> +}
> +
> +static void build_check_rule(void)
> +{
> + const struct landlock_rule rule = {
> + .num_layers = ~0,
> + };
> +
> + BUILD_BUG_ON(rule.num_layers < LANDLOCK_MAX_NUM_LAYERS);
> +}
> +
> +static struct landlock_rule *create_rule(
> + struct landlock_object *const object,
> + const struct landlock_layer (*const layers)[],
> + const u32 num_layers,
> + const struct landlock_layer *const new_layer)
> +{
> + struct landlock_rule *new_rule;
> + u32 new_num_layers;
> +
> + build_check_rule();
> + if (new_layer) {
> + /* Should already be checked by landlock_merge_ruleset(). */
> + if (WARN_ON_ONCE(num_layers >= LANDLOCK_MAX_NUM_LAYERS))
> + return ERR_PTR(-E2BIG);
> + new_num_layers = num_layers + 1;
> + } else {
> + new_num_layers = num_layers;
> + }
> + new_rule = kzalloc(struct_size(new_rule, layers, new_num_layers),
> + GFP_KERNEL_ACCOUNT);
> + if (!new_rule)
> + return ERR_PTR(-ENOMEM);
> + RB_CLEAR_NODE(&new_rule->node);
> + landlock_get_object(object);
> + new_rule->object = object;
> + new_rule->num_layers = new_num_layers;
> + /* Copies the original layer stack. */
> + memcpy(new_rule->layers, layers,
> + flex_array_size(new_rule, layers, num_layers));
> + if (new_layer)
> + /* Adds a copy of @new_layer on the layer stack. */
> + new_rule->layers[new_rule->num_layers - 1] = *new_layer;
> + return new_rule;
> +}
> +
> +static void put_rule(struct landlock_rule *const rule)
> +{
> + might_sleep();
> + if (!rule)
> + return;
> + landlock_put_object(rule->object);
> + kfree(rule);
> +}
> +
> +static void build_check_ruleset(void)
> +{
> + const struct landlock_ruleset ruleset = {
> + .num_rules = ~0,
> + .num_layers = ~0,
> + };
> +
> + BUILD_BUG_ON(ruleset.num_rules < LANDLOCK_MAX_NUM_RULES);
> + BUILD_BUG_ON(ruleset.num_layers < LANDLOCK_MAX_NUM_LAYERS);
> +}
> +
> +/**
> + * insert_rule - Create and insert a rule in a ruleset
> + *
> + * @ruleset: The ruleset to be updated.
> + * @object: The object to build the new rule with. The underlying kernel
> + * object must be held by the caller.
> + * @layers: One or multiple layers to be copied into the new rule.
> + * @num_layers: The number of @layers entries.
> + *
> + * When user space requests to add a new rule to a ruleset, @layers only
> + * contains one entry and this entry is not assigned to any level. In this
> + * case, the new rule will extend @ruleset, similarly to a boolean OR between
> + * access rights.
> + *
> + * When merging a ruleset in a domain, or copying a domain, @layers will be
> + * added to @ruleset as new constraints, similarly to a boolean AND between
> + * access rights.
> + */
> +static int insert_rule(struct landlock_ruleset *const ruleset,
> + struct landlock_object *const object,
> + const struct landlock_layer (*const layers)[],
> + size_t num_layers)
> +{
> + struct rb_node **walker_node;
> + struct rb_node *parent_node = NULL;
> + struct landlock_rule *new_rule;
> +
> + might_sleep();
> + lockdep_assert_held(&ruleset->lock);
> + if (WARN_ON_ONCE(!object || !layers))
> + return -ENOENT;
> + walker_node = &(ruleset->root.rb_node);
> + while (*walker_node) {
> + struct landlock_rule *const this = rb_entry(*walker_node,
> + struct landlock_rule, node);
> +
> + if (this->object != object) {
> + parent_node = *walker_node;
> + if (this->object < object)
> + walker_node = &((*walker_node)->rb_right);
> + else
> + walker_node = &((*walker_node)->rb_left);
> + continue;
> + }
> +
> + /* Only a single-level layer should match an existing rule. */
> + if (WARN_ON_ONCE(num_layers != 1))
> + return -EINVAL;
> +
> + /* If there is a matching rule, updates it. */
> + if ((*layers)[0].level == 0) {
> + /*
> + * Extends access rights when the request comes from
> + * landlock_add_rule(2), i.e. @ruleset is not a domain.
> + */
> + if (WARN_ON_ONCE(this->num_layers != 1))
> + return -EINVAL;
> + if (WARN_ON_ONCE(this->layers[0].level != 0))
> + return -EINVAL;
> + this->layers[0].access |= (*layers)[0].access;
> + return 0;
> + }
> +
> + if (WARN_ON_ONCE(this->layers[0].level == 0))
> + return -EINVAL;
> +
> + /*
> + * Intersects access rights when it is a merge between a
> + * ruleset and a domain.
> + */
> + new_rule = create_rule(object, &this->layers, this->num_layers,
> + &(*layers)[0]);
> + if (IS_ERR(new_rule))
> + return PTR_ERR(new_rule);
> + rb_replace_node(&this->node, &new_rule->node, &ruleset->root);
> + put_rule(this);
> + return 0;
> + }
> +
> + /* There is no match for @object. */
> + build_check_ruleset();
> + if (ruleset->num_rules >= LANDLOCK_MAX_NUM_RULES)
> + return -E2BIG;
> + new_rule = create_rule(object, layers, num_layers, NULL);
> + if (IS_ERR(new_rule))
> + return PTR_ERR(new_rule);
> + rb_link_node(&new_rule->node, parent_node, walker_node);
> + rb_insert_color(&new_rule->node, &ruleset->root);
> + ruleset->num_rules++;
> + return 0;
> +}
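
[Editor's illustration, not part of the patch.] A worked trace of the two semantics described in the kernel-doc above; READ and WRITE stand for arbitrary access bits.

	/*
	 * 1. User space adds two rules for the same object O to a ruleset R
	 *    via landlock_add_rule(2), which ends up calling:
	 *      insert_rule(R, O, {{.level = 0, .access = READ}}, 1)
	 *      insert_rule(R, O, {{.level = 0, .access = WRITE}}, 1)
	 *    The second call finds the existing rule and ORs the bits, so the
	 *    single rule on O now grants READ|WRITE at level 0.
	 *
	 * 2. When R is later merged into a domain D whose new layer is N,
	 *    merge_ruleset() re-inserts that rule as
	 *      insert_rule(D, O, {{.level = N, .access = READ|WRITE}}, 1)
	 *    which stacks a new struct landlock_layer on any existing rule
	 *    for O, so an access is granted only if every stacked layer
	 *    allows it.
	 */
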
> +
> +static void build_check_layer(void)
> +{
> + const struct landlock_layer layer = {
> + .level = ~0,
> + };
> +
> + BUILD_BUG_ON(layer.level < LANDLOCK_MAX_NUM_LAYERS);
> +}
> +
> +/* @ruleset must be locked by the caller. */
> +int landlock_insert_rule(struct landlock_ruleset *const ruleset,
> + struct landlock_object *const object, const u32 access)
> +{
> + struct landlock_layer layers[] = {{
> + .access = access,
> + /* When @level is zero, insert_rule() extends @ruleset. */
> + .level = 0,
> + }};
> +
> + build_check_layer();
> + return insert_rule(ruleset, object, &layers, ARRAY_SIZE(layers));
> +}
> +
> +static inline void get_hierarchy(struct landlock_hierarchy *const hierarchy)
> +{
> + if (hierarchy)
> + refcount_inc(&hierarchy->usage);
> +}
> +
> +static void put_hierarchy(struct landlock_hierarchy *hierarchy)
> +{
> + while (hierarchy && refcount_dec_and_test(&hierarchy->usage)) {
> + const struct landlock_hierarchy *const freeme = hierarchy;
> +
> + hierarchy = hierarchy->parent;
> + kfree(freeme);
> + }
> +}
> +
> +static int merge_ruleset(struct landlock_ruleset *const dst,
> + struct landlock_ruleset *const src)
> +{
> + struct landlock_rule *walker_rule, *next_rule;
> + int err = 0;
> +
> + might_sleep();
> + /* Should already be checked by landlock_merge_ruleset() */
> + if (WARN_ON_ONCE(!src))
> + return 0;
> + /* Only merge into a domain. */
> + if (WARN_ON_ONCE(!dst || !dst->hierarchy))
> + return -EINVAL;
> +
> + /* Locks @dst first because we are its only owner. */
> + mutex_lock(&dst->lock);
> + mutex_lock_nested(&src->lock, SINGLE_DEPTH_NESTING);
> +
> + /* Stacks the new layer. */
> + if (WARN_ON_ONCE(src->num_layers != 1 || dst->num_layers < 1)) {
> + err = -EINVAL;
> + goto out_unlock;
> + }
> + dst->fs_access_masks[dst->num_layers - 1] = src->fs_access_masks[0];
> +
> + /* Merges the @src tree. */
> + rbtree_postorder_for_each_entry_safe(walker_rule, next_rule,
> + &src->root, node) {
> + struct landlock_layer layers[] = {{
> + .level = dst->num_layers,
> + }};
> +
> + if (WARN_ON_ONCE(walker_rule->num_layers != 1)) {
> + err = -EINVAL;
> + goto out_unlock;
> + }
> + if (WARN_ON_ONCE(walker_rule->layers[0].level != 0)) {
> + err = -EINVAL;
> + goto out_unlock;
> + }
> + layers[0].access = walker_rule->layers[0].access;
> + err = insert_rule(dst, walker_rule->object, &layers,
> + ARRAY_SIZE(layers));
> + if (err)
> + goto out_unlock;
> + }
> +
> +out_unlock:
> + mutex_unlock(&src->lock);
> + mutex_unlock(&dst->lock);
> + return err;
> +}
> +
> +static int inherit_ruleset(struct landlock_ruleset *const parent,
> + struct landlock_ruleset *const child)
> +{
> + struct landlock_rule *walker_rule, *next_rule;
> + int err = 0;
> +
> + might_sleep();
> + if (!parent)
> + return 0;
> +
> + /* Locks @child first because we are its only owner. */
> + mutex_lock(&child->lock);
> + mutex_lock_nested(&parent->lock, SINGLE_DEPTH_NESTING);
> +
> + /* Copies the @parent tree. */
> + rbtree_postorder_for_each_entry_safe(walker_rule, next_rule,
> + &parent->root, node) {
> + err = insert_rule(child, walker_rule->object,
> + &walker_rule->layers, walker_rule->num_layers);
> + if (err)
> + goto out_unlock;
> + }
> +
> + if (WARN_ON_ONCE(child->num_layers <= parent->num_layers)) {
> + err = -EINVAL;
> + goto out_unlock;
> + }
> + /* Copies the parent layer stack and leaves a space for the new layer. */
> + memcpy(child->fs_access_masks, parent->fs_access_masks,
> + flex_array_size(parent, fs_access_masks, parent->num_layers));
> +
> + if (WARN_ON_ONCE(!parent->hierarchy)) {
> + err = -EINVAL;
> + goto out_unlock;
> + }
> + get_hierarchy(parent->hierarchy);
> + child->hierarchy->parent = parent->hierarchy;
> +
> +out_unlock:
> + mutex_unlock(&parent->lock);
> + mutex_unlock(&child->lock);
> + return err;
> +}
> +
> +static void free_ruleset(struct landlock_ruleset *const ruleset)
> +{
> + struct landlock_rule *freeme, *next;
> +
> + might_sleep();
> + rbtree_postorder_for_each_entry_safe(freeme, next, &ruleset->root,
> + node)
> + put_rule(freeme);
> + put_hierarchy(ruleset->hierarchy);
> + kfree(ruleset);
> +}
> +
> +void landlock_put_ruleset(struct landlock_ruleset *const ruleset)
> +{
> + might_sleep();
> + if (ruleset && refcount_dec_and_test(&ruleset->usage))
> + free_ruleset(ruleset);
> +}
> +
> +static void free_ruleset_work(struct work_struct *const work)
> +{
> + struct landlock_ruleset *ruleset;
> +
> + ruleset = container_of(work, struct landlock_ruleset, work_free);
> + free_ruleset(ruleset);
> +}
> +
> +void landlock_put_ruleset_deferred(struct landlock_ruleset *const ruleset)
> +{
> + if (ruleset && refcount_dec_and_test(&ruleset->usage)) {
> + INIT_WORK(&ruleset->work_free, free_ruleset_work);
> + schedule_work(&ruleset->work_free);
> + }
> +}
> +
> +/**
> + * landlock_merge_ruleset - Merge a ruleset with a domain
> + *
> + * @parent: Parent domain.
> + * @ruleset: New ruleset to be merged.
> + *
> + * Returns the intersection of @parent and @ruleset, or returns @parent if
> + * @ruleset is empty, or returns a duplicate of @ruleset if @parent is empty.
> + */
> +struct landlock_ruleset *landlock_merge_ruleset(
> + struct landlock_ruleset *const parent,
> + struct landlock_ruleset *const ruleset)
> +{
> + struct landlock_ruleset *new_dom;
> + u32 num_layers;
> + int err;
> +
> + might_sleep();
> + if (WARN_ON_ONCE(!ruleset || parent == ruleset))
> + return ERR_PTR(-EINVAL);
> +
> + if (parent) {
> + if (parent->num_layers >= LANDLOCK_MAX_NUM_LAYERS)
> + return ERR_PTR(-E2BIG);
> + num_layers = parent->num_layers + 1;
> + } else {
> + num_layers = 1;
> + }
> +
> + /* Creates a new domain... */
> + new_dom = create_ruleset(num_layers);
> + if (IS_ERR(new_dom))
> + return new_dom;
> + new_dom->hierarchy = kzalloc(sizeof(*new_dom->hierarchy),
> + GFP_KERNEL_ACCOUNT);
> + if (!new_dom->hierarchy) {
> + err = -ENOMEM;
> + goto out_put_dom;
> + }
> + refcount_set(&new_dom->hierarchy->usage, 1);
> +
> + /* ...as a child of @parent... */
> + err = inherit_ruleset(parent, new_dom);
> + if (err)
> + goto out_put_dom;
> +
> + /* ...and including @ruleset. */
> + err = merge_ruleset(new_dom, ruleset);
> + if (err)
> + goto out_put_dom;
> +
> + return new_dom;
> +
> +out_put_dom:
> + landlock_put_ruleset(new_dom);
> + return ERR_PTR(err);
> +}
> +
> +/*
> + * The returned access has the same lifetime as @ruleset.
> + */
> +const struct landlock_rule *landlock_find_rule(
> + const struct landlock_ruleset *const ruleset,
> + const struct landlock_object *const object)
> +{
> + const struct rb_node *node;
> +
> + if (!object)
> + return NULL;
> + node = ruleset->root.rb_node;
> + while (node) {
> + struct landlock_rule *this = rb_entry(node,
> + struct landlock_rule, node);
> +
> + if (this->object == object)
> + return this;
> + if (this->object < object)
> + node = node->rb_right;
> + else
> + node = node->rb_left;
> + }
> + return NULL;
> +}
> diff --git a/security/landlock/ruleset.h b/security/landlock/ruleset.h
> new file mode 100644
> index 000000000000..6b1198458b37
> --- /dev/null
> +++ b/security/landlock/ruleset.h
> @@ -0,0 +1,165 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +/*
> + * Landlock LSM - Ruleset management
> + *
> + * Copyright © 2016-2020 Mickaël Salaün <[email protected]>
> + * Copyright © 2018-2020 ANSSI
> + */
> +
> +#ifndef _SECURITY_LANDLOCK_RULESET_H
> +#define _SECURITY_LANDLOCK_RULESET_H
> +
> +#include <linux/mutex.h>
> +#include <linux/rbtree.h>
> +#include <linux/refcount.h>
> +#include <linux/workqueue.h>
> +
> +#include "object.h"
> +
> +/**
> + * struct landlock_layer - Access rights for a given layer
> + */
> +struct landlock_layer {
> + /**
> + * @level: Position of this layer in the layer stack.
> + */
> + u16 level;
> + /**
> + * @access: Bitfield of allowed actions on the kernel object. They are
> + * relative to the object type (e.g. %LANDLOCK_ACCESS_FS_READ_FILE).
> + */
> + u16 access;
> +};
> +
> +/**
> + * struct landlock_rule - Access rights tied to an object
> + */
> +struct landlock_rule {
> + /**
> + * @node: Node in the ruleset's red-black tree.
> + */
> + struct rb_node node;
> + /**
> + * @object: Pointer to identify a kernel object (e.g. an inode). This
> + * is used as a key for this ruleset element. This pointer is set once
> + * and never modified. It always points to an allocated object because
> + * each rule increments the refcount of its object.
> + */
> + struct landlock_object *object;
> + /**
> + * @num_layers: Number of entries in @layers.
> + */
> + u32 num_layers;
> + /**
> + * @layers: Stack of layers, from the oldest to the newest, implemented
> + * as a flexible array member (FAM).
> + */
> + struct landlock_layer layers[];
> +};
> +
> +/**
> + * struct landlock_hierarchy - Node in a ruleset hierarchy
> + */
> +struct landlock_hierarchy {
> + /**
> + * @parent: Pointer to the parent node, or NULL if it is a root
> + * Landlock domain.
> + */
> + struct landlock_hierarchy *parent;
> + /**
> + * @usage: Number of potential children domains plus their parent
> + * domain.
> + */
> + refcount_t usage;
> +};
> +
> +/**
> + * struct landlock_ruleset - Landlock ruleset
> + *
> + * This data structure must contain unique entries, be updatable, and quick to
> + * match an object.
> + */
> +struct landlock_ruleset {
> + /**
> + * @root: Root of a red-black tree containing &struct landlock_rule
> + * nodes. Once a ruleset is tied to a process (i.e. as a domain), this
> + * tree is immutable until @usage reaches zero.
> + */
> + struct rb_root root;
> + /**
> + * @hierarchy: Enables hierarchy identification even when a parent
> + * domain vanishes. This is needed for the ptrace protection.
> + */
> + struct landlock_hierarchy *hierarchy;
> + union {
> + /**
> + * @work_free: Enables freeing a ruleset within a lockless
> + * section. This is only used by
> + * landlock_put_ruleset_deferred() when @usage reaches zero.
> + * The fields @lock, @usage, @num_rules, @num_layers and
> + * @fs_access_masks are then unused.
> + */
> + struct work_struct work_free;
> + struct {
> + /**
> + * @lock: Guards against concurrent modifications of
> + * @root, if @usage is greater than zero.
> + */
> + struct mutex lock;
> + /**
> + * @usage: Number of processes (i.e. domains) or file
> + * descriptors referencing this ruleset.
> + */
> + refcount_t usage;
> + /**
> + * @num_rules: Number of non-overlapping (i.e. not for
> + * the same object) rules in this ruleset.
> + */
> + u32 num_rules;
> + /**
> + * @num_layers: Number of layers that are used in this
> + * ruleset. This enables checking that all the layers
> + * allow an access request. A value of 0 identifies a
> + * non-merged ruleset (i.e. not a domain).
> + */
> + u32 num_layers;
> + /**
> + * @fs_access_masks: Contains the subset of filesystem
> + * actions that are restricted by a ruleset. A domain
> + * saves all layers of merged rulesets in a stack
> + * (FAM), starting from the first layer to the last
> + * one. These layers are used when merging rulesets,
> + * for user space backward compatibility (i.e.
> + * future-proof), and to properly handle merged
> + * rulesets without overlapping access rights. These
> + * layers are set once and never changed for the
> + * lifetime of the ruleset.
> + */
> + u16 fs_access_masks[];
> + };
> + };
> +};
> +
> +struct landlock_ruleset *landlock_create_ruleset(const u32 fs_access_mask);
> +
> +void landlock_put_ruleset(struct landlock_ruleset *const ruleset);
> +void landlock_put_ruleset_deferred(struct landlock_ruleset *const ruleset);
> +
> +int landlock_insert_rule(struct landlock_ruleset *const ruleset,
> + struct landlock_object *const object, const u32 access);
> +
> +struct landlock_ruleset *landlock_merge_ruleset(
> + struct landlock_ruleset *const parent,
> + struct landlock_ruleset *const ruleset);
> +
> +const struct landlock_rule *landlock_find_rule(
> + const struct landlock_ruleset *const ruleset,
> + const struct landlock_object *const object);
> +
> +static inline void landlock_get_ruleset(struct landlock_ruleset *const ruleset)
> +{
> + if (ruleset)
> + refcount_inc(&ruleset->usage);
> +}
> +
> +#endif /* _SECURITY_LANDLOCK_RULESET_H */
> --
> 2.30.0

2021-02-05 13:57:42

by Serge E. Hallyn

[permalink] [raw]
Subject: Re: [PATCH v28 04/12] landlock: Add ptrace restrictions

On Tue, Feb 02, 2021 at 05:27:02PM +0100, Mickaël Salaün wrote:
> From: Mickaël Salaün <[email protected]>
>
> Using ptrace(2) and related debug features on a target process can lead
> to a privilege escalation. Indeed, ptrace(2) can be used by an attacker
> to impersonate another task and to remain undetected while performing
> malicious activities. Thanks to ptrace_may_access(), various parts of
> the kernel can check if a tracer is more privileged than a tracee.
>
> A landlocked process has fewer privileges than a non-landlocked process
> and must then be subject to additional restrictions when manipulating
> processes. To be allowed to use ptrace(2) and related syscalls on a
> target process, a landlocked process must have a subset of the target
> process's rules (i.e. the tracee must be in a sub-domain of the tracer).
>
> Cc: James Morris <[email protected]>
> Cc: Kees Cook <[email protected]>
> Cc: Serge E. Hallyn <[email protected]>

Acked-by: Serge Hallyn <[email protected]>

Thanks, I appreciate that things are well named and easy to reason
about.

> Signed-off-by: Mickaël Salaün <[email protected]>
> Reviewed-by: Jann Horn <[email protected]>
> ---
>
> Changes since v25:
> * Rename function to landlock_add_ptrace_hooks().
>
> Changes since v22:
> * Add Reviewed-by: Jann Horn <[email protected]>
>
> Changes since v21:
> * Fix copyright dates.
>
> Changes since v14:
> * Constify variables.
>
> Changes since v13:
> * Make the ptrace restriction mandatory, like in the v10.
> * Remove the eBPF dependency.
>
> Previous changes:
> https://lore.kernel.org/lkml/[email protected]/
> ---
> security/landlock/Makefile | 2 +-
> security/landlock/ptrace.c | 120 +++++++++++++++++++++++++++++++++++++
> security/landlock/ptrace.h | 14 +++++
> security/landlock/setup.c | 2 +
> 4 files changed, 137 insertions(+), 1 deletion(-)
> create mode 100644 security/landlock/ptrace.c
> create mode 100644 security/landlock/ptrace.h
>
> diff --git a/security/landlock/Makefile b/security/landlock/Makefile
> index 041ea242e627..f1d1eb72fa76 100644
> --- a/security/landlock/Makefile
> +++ b/security/landlock/Makefile
> @@ -1,4 +1,4 @@
> obj-$(CONFIG_SECURITY_LANDLOCK) := landlock.o
>
> landlock-y := setup.o object.o ruleset.o \
> - cred.o
> + cred.o ptrace.o
> diff --git a/security/landlock/ptrace.c b/security/landlock/ptrace.c
> new file mode 100644
> index 000000000000..f55b82446de2
> --- /dev/null
> +++ b/security/landlock/ptrace.c
> @@ -0,0 +1,120 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +/*
> + * Landlock LSM - Ptrace hooks
> + *
> + * Copyright © 2017-2020 Mickaël Salaün <[email protected]>
> + * Copyright © 2019-2020 ANSSI
> + */
> +
> +#include <asm/current.h>
> +#include <linux/cred.h>
> +#include <linux/errno.h>
> +#include <linux/kernel.h>
> +#include <linux/lsm_hooks.h>
> +#include <linux/rcupdate.h>
> +#include <linux/sched.h>
> +
> +#include "common.h"
> +#include "cred.h"
> +#include "ptrace.h"
> +#include "ruleset.h"
> +#include "setup.h"
> +
> +/**
> + * domain_scope_le - Checks domain ordering for scoped ptrace
> + *
> + * @parent: Parent domain.
> + * @child: Potential child of @parent.
> + *
> + * Checks if the @parent domain is less than or equal to (i.e. an ancestor, which
> + * means a subset of) the @child domain.
> + */
> +static bool domain_scope_le(const struct landlock_ruleset *const parent,
> + const struct landlock_ruleset *const child)
> +{
> + const struct landlock_hierarchy *walker;
> +
> + if (!parent)
> + return true;
> + if (!child)
> + return false;
> + for (walker = child->hierarchy; walker; walker = walker->parent) {
> + if (walker == parent->hierarchy)
> + /* @parent is in the scoped hierarchy of @child. */
> + return true;
> + }
> + /* There is no relationship between @parent and @child. */
> + return false;
> +}
> +
> +static bool task_is_scoped(const struct task_struct *const parent,
> + const struct task_struct *const child)
> +{
> + bool is_scoped;
> + const struct landlock_ruleset *dom_parent, *dom_child;
> +
> + rcu_read_lock();
> + dom_parent = landlock_get_task_domain(parent);
> + dom_child = landlock_get_task_domain(child);
> + is_scoped = domain_scope_le(dom_parent, dom_child);
> + rcu_read_unlock();
> + return is_scoped;
> +}
> +
> +static int task_ptrace(const struct task_struct *const parent,
> + const struct task_struct *const child)
> +{
> + /* Quick return for non-landlocked tasks. */
> + if (!landlocked(parent))
> + return 0;
> + if (task_is_scoped(parent, child))
> + return 0;
> + return -EPERM;
> +}
> +
> +/**
> + * hook_ptrace_access_check - Determines whether the current process may access
> + * another
> + *
> + * @child: Process to be accessed.
> + * @mode: Mode of attachment.
> + *
> + * If the current task has Landlock rules, then the child must have at least
> + * the same rules. Else denied.
> + *
> + * Determines whether a process may access another, returning 0 if permission
> + * granted, -errno if denied.
> + */
> +static int hook_ptrace_access_check(struct task_struct *const child,
> + const unsigned int mode)
> +{
> + return task_ptrace(current, child);
> +}
> +
> +/**
> + * hook_ptrace_traceme - Determines whether another process may trace the
> + * current one
> + *
> + * @parent: Task proposed to be the tracer.
> + *
> + * If the parent has Landlock rules, then the current task must have the same
> + * or more rules. Else denied.
> + *
> + * Determines whether the nominated task is permitted to trace the current
> + * process, returning 0 if permission is granted, -errno if denied.
> + */
> +static int hook_ptrace_traceme(struct task_struct *const parent)
> +{
> + return task_ptrace(parent, current);
> +}
> +
> +static struct security_hook_list landlock_hooks[] __lsm_ro_after_init = {
> + LSM_HOOK_INIT(ptrace_access_check, hook_ptrace_access_check),
> + LSM_HOOK_INIT(ptrace_traceme, hook_ptrace_traceme),
> +};
> +
> +__init void landlock_add_ptrace_hooks(void)
> +{
> + security_add_hooks(landlock_hooks, ARRAY_SIZE(landlock_hooks),
> + LANDLOCK_NAME);
> +}
> diff --git a/security/landlock/ptrace.h b/security/landlock/ptrace.h
> new file mode 100644
> index 000000000000..265b220ae3bf
> --- /dev/null
> +++ b/security/landlock/ptrace.h
> @@ -0,0 +1,14 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +/*
> + * Landlock LSM - Ptrace hooks
> + *
> + * Copyright © 2017-2019 Mickaël Salaün <[email protected]>
> + * Copyright © 2019 ANSSI
> + */
> +
> +#ifndef _SECURITY_LANDLOCK_PTRACE_H
> +#define _SECURITY_LANDLOCK_PTRACE_H
> +
> +__init void landlock_add_ptrace_hooks(void);
> +
> +#endif /* _SECURITY_LANDLOCK_PTRACE_H */
> diff --git a/security/landlock/setup.c b/security/landlock/setup.c
> index 8661112fb238..a5d6ef334991 100644
> --- a/security/landlock/setup.c
> +++ b/security/landlock/setup.c
> @@ -11,6 +11,7 @@
>
> #include "common.h"
> #include "cred.h"
> +#include "ptrace.h"
> #include "setup.h"
>
> struct lsm_blob_sizes landlock_blob_sizes __lsm_ro_after_init = {
> @@ -20,6 +21,7 @@ struct lsm_blob_sizes landlock_blob_sizes __lsm_ro_after_init = {
> static int __init landlock_init(void)
> {
> landlock_add_cred_hooks();
> + landlock_add_ptrace_hooks();
> pr_info("Up and running.\n");
> return 0;
> }
> --
> 2.30.0

2021-02-05 17:26:20

by Casey Schaufler

[permalink] [raw]
Subject: Re: [PATCH v28 05/12] LSM: Infrastructure management of the superblock

On 2/5/2021 6:17 AM, Serge E. Hallyn wrote:
> On Tue, Feb 02, 2021 at 05:27:03PM +0100, Mickaël Salaün wrote:
>> From: Casey Schaufler <[email protected]>
>>
>> Move management of the superblock->sb_security blob out of the
>> individual security modules and into the security infrastructure.
>> Instead of allocating the blobs from within the modules, the modules
>> tell the infrastructure how much space is required, and the space is
>> allocated there.
>>
>> Cc: Kees Cook <[email protected]>
>> Cc: John Johansen <[email protected]>
>> Signed-off-by: Casey Schaufler <[email protected]>
>> Signed-off-by: Mickaël Salaün <[email protected]>
>> Reviewed-by: Stephen Smalley <[email protected]>
> Acked-by: Serge Hallyn <[email protected]>
>
> I wonder how many out of tree modules this will impact :)

There are several blobs that have already been converted
to infrastructure management. Not a peep from out-of-tree
module developers/maintainers. I can only speculate that
OOT modules are either less common than we may think, using
alternative data management models (as does eBPF) or
sticking with very old kernels. It's also possible that
they're suffering in silence, which would be sad because
every module that's worth having should be in the tree.

> Actually
> if some new incoming module does an rcu callback to free the
> sb_security, then the security_sb_free will need an update, but
> that seems unlikely.

We're already doing that for the inode blob, so it's
really just a small matter of cut-n-paste and s/inode/sb/
to make that happen.
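
[Editor's note.] For readers following the thread, a hedged sketch of what a converted module ends up with under infrastructure-managed superblock blobs; the module name and structure below are hypothetical, and the SELinux hunk quoted later in the thread shows the real equivalent.

	/* Hypothetical module: declare the superblock blob space it needs... */
	struct example_superblock_security {
		u32 flags;
	};

	struct lsm_blob_sizes example_blob_sizes __lsm_ro_after_init = {
		.lbs_superblock = sizeof(struct example_superblock_security),
	};

	/* ...and locate its slice inside the blob allocated by the infrastructure. */
	static inline struct example_superblock_security *example_superblock(
		const struct super_block *const sb)
	{
		return sb->s_security + example_blob_sizes.lbs_superblock;
	}
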


2021-02-05 20:14:13

by Serge E. Hallyn

[permalink] [raw]
Subject: Re: [PATCH v28 06/12] fs,security: Add sb_delete hook

On Tue, Feb 02, 2021 at 05:27:04PM +0100, Mickaël Salaün wrote:
> From: Mickaël Salaün <[email protected]>
>
> The sb_delete security hook is called when shutting down a superblock,
> which may be useful to release kernel objects tied to the superblock's
> lifetime (e.g. inodes).
>
> This new hook is needed by Landlock to release (ephemerally) tagged
> struct inodes. This comes from the unprivileged nature of Landlock
> described in the next commit.
>
> Cc: Al Viro <[email protected]>
> Cc: James Morris <[email protected]>
> Cc: Kees Cook <[email protected]>
> Cc: Serge E. Hallyn <[email protected]>

One note below, but

Acked-by: Serge Hallyn <[email protected]>

> Signed-off-by: Mickaël Salaün <[email protected]>
> Reviewed-by: Jann Horn <[email protected]>
> ---
>
> Changes since v22:
> * Add Reviewed-by: Jann Horn <[email protected]>
>
> Changes since v17:
> * Initial patch to replace the direct call to landlock_release_inodes()
> (requested by James Morris).
> https://lore.kernel.org/lkml/[email protected]/
> ---
> fs/super.c | 1 +
> include/linux/lsm_hook_defs.h | 1 +
> include/linux/lsm_hooks.h | 2 ++
> include/linux/security.h | 4 ++++
> security/security.c | 5 +++++
> 5 files changed, 13 insertions(+)
>
> diff --git a/fs/super.c b/fs/super.c
> index 2c6cdea2ab2d..c3c5178cde65 100644
> --- a/fs/super.c
> +++ b/fs/super.c
> @@ -454,6 +454,7 @@ void generic_shutdown_super(struct super_block *sb)
> evict_inodes(sb);
> /* only nonzero refcount inodes can have marks */
> fsnotify_sb_delete(sb);
> + security_sb_delete(sb);
>
> if (sb->s_dio_done_wq) {
> destroy_workqueue(sb->s_dio_done_wq);
> diff --git a/include/linux/lsm_hook_defs.h b/include/linux/lsm_hook_defs.h
> index 7aaa753b8608..32472b3849bc 100644
> --- a/include/linux/lsm_hook_defs.h
> +++ b/include/linux/lsm_hook_defs.h
> @@ -59,6 +59,7 @@ LSM_HOOK(int, 0, fs_context_dup, struct fs_context *fc,
> LSM_HOOK(int, -ENOPARAM, fs_context_parse_param, struct fs_context *fc,
> struct fs_parameter *param)
> LSM_HOOK(int, 0, sb_alloc_security, struct super_block *sb)
> +LSM_HOOK(void, LSM_RET_VOID, sb_delete, struct super_block *sb)
> LSM_HOOK(void, LSM_RET_VOID, sb_free_security, struct super_block *sb)
> LSM_HOOK(void, LSM_RET_VOID, sb_free_mnt_opts, void *mnt_opts)
> LSM_HOOK(int, 0, sb_eat_lsm_opts, char *orig, void **mnt_opts)
> diff --git a/include/linux/lsm_hooks.h b/include/linux/lsm_hooks.h
> index 970106d98306..e339b201f79b 100644
> --- a/include/linux/lsm_hooks.h
> +++ b/include/linux/lsm_hooks.h
> @@ -108,6 +108,8 @@
> * allocated.
> * @sb contains the super_block structure to be modified.
> * Return 0 if operation was successful.
> + * @sb_delete:
> + * Release objects tied to a superblock (e.g. inodes).

It's customary here to add the line detailing the @sb argument.

> * @sb_free_security:
> * Deallocate and clear the sb->s_security field.
> * @sb contains the super_block structure to be modified.
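
[Editor's illustration.] A hedged sketch of how a module would wire the new hook, following the LSM_HOOK_INIT pattern used elsewhere in the series; the hook body is a placeholder, and Landlock's real user of sb_delete arrives with the filesystem patch.

	static void hook_sb_delete(struct super_block *const sb)
	{
		/*
		 * Hypothetical body: drop the module's references to objects
		 * tied to @sb (e.g. iput() previously tagged inodes) so that
		 * generic_shutdown_super() can evict them.
		 */
	}

	static struct security_hook_list example_hooks[] __lsm_ro_after_init = {
		LSM_HOOK_INIT(sb_delete, hook_sb_delete),
	};
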
> diff --git a/include/linux/security.h b/include/linux/security.h
> index c35ea0ffccd9..c41a94e29b62 100644
> --- a/include/linux/security.h
> +++ b/include/linux/security.h
> @@ -288,6 +288,7 @@ void security_bprm_committed_creds(struct linux_binprm *bprm);
> int security_fs_context_dup(struct fs_context *fc, struct fs_context *src_fc);
> int security_fs_context_parse_param(struct fs_context *fc, struct fs_parameter *param);
> int security_sb_alloc(struct super_block *sb);
> +void security_sb_delete(struct super_block *sb);
> void security_sb_free(struct super_block *sb);
> void security_free_mnt_opts(void **mnt_opts);
> int security_sb_eat_lsm_opts(char *options, void **mnt_opts);
> @@ -620,6 +621,9 @@ static inline int security_sb_alloc(struct super_block *sb)
> return 0;
> }
>
> +static inline void security_sb_delete(struct super_block *sb)
> +{ }
> +
> static inline void security_sb_free(struct super_block *sb)
> { }
>
> diff --git a/security/security.c b/security/security.c
> index 9f979d4afe6c..1b4a73b2549a 100644
> --- a/security/security.c
> +++ b/security/security.c
> @@ -900,6 +900,11 @@ int security_sb_alloc(struct super_block *sb)
> return rc;
> }
>
> +void security_sb_delete(struct super_block *sb)
> +{
> + call_void_hook(sb_delete, sb);
> +}
> +
> void security_sb_free(struct super_block *sb)
> {
> call_void_hook(sb_free_security, sb);
> --
> 2.30.0

2021-02-05 20:23:38

by Serge E. Hallyn

[permalink] [raw]
Subject: Re: [PATCH v28 05/12] LSM: Infrastructure management of the superblock

On Tue, Feb 02, 2021 at 05:27:03PM +0100, Mickaël Salaün wrote:
> From: Casey Schaufler <[email protected]>
>
> Move management of the superblock->sb_security blob out of the
> individual security modules and into the security infrastructure.
> Instead of allocating the blobs from within the modules, the modules
> tell the infrastructure how much space is required, and the space is
> allocated there.
>
> Cc: Kees Cook <[email protected]>
> Cc: John Johansen <[email protected]>
> Signed-off-by: Casey Schaufler <[email protected]>
> Signed-off-by: Mickaël Salaün <[email protected]>
> Reviewed-by: Stephen Smalley <[email protected]>

Acked-by: Serge Hallyn <[email protected]>

I wonder how many out of tree modules this will impact :) Actually
if some new incoming module does an rcu callback to free the
sb_security, then the security_sb_free will need an update, but
that seems unlikely.

> ---
>
> Changes since v26:
> * Rebase on commit b159e86b5a2a ("selinux: drop super_block backpointer
> from superblock_security_struct"). No change in the patch itself,
> only a trivial conflict because of an updated nearby line in
> selinux_set_mnt_opts() variable declarations.
>
> Changes since v20:
> * Remove all Reviewed-by except Stephen Smalley:
> https://lore.kernel.org/lkml/CAEjxPJ7ARJO57MBW66=xsBzMMRb=9uLgqocK5eskHCaiVMx7Vw@mail.gmail.com/
> * Cosmetic fix in the commit message.
>
> Changes since v17:
> * Rebase the original LSM stacking patch from v5.3 to v5.7: I fixed some
> diff conflicts caused by code moves and function renames in
> selinux/include/objsec.h and selinux/hooks.c . I checked that it
> builds but I didn't test the changes for SELinux nor SMACK.
> https://lore.kernel.org/r/[email protected]
> ---
> include/linux/lsm_hooks.h | 1 +
> security/security.c | 46 ++++++++++++++++++++----
> security/selinux/hooks.c | 58 ++++++++++++-------------------
> security/selinux/include/objsec.h | 6 ++++
> security/selinux/ss/services.c | 3 +-
> security/smack/smack.h | 6 ++++
> security/smack/smack_lsm.c | 35 +++++--------------
> 7 files changed, 85 insertions(+), 70 deletions(-)
>
> diff --git a/include/linux/lsm_hooks.h b/include/linux/lsm_hooks.h
> index a19adef1f088..970106d98306 100644
> --- a/include/linux/lsm_hooks.h
> +++ b/include/linux/lsm_hooks.h
> @@ -1563,6 +1563,7 @@ struct lsm_blob_sizes {
> int lbs_cred;
> int lbs_file;
> int lbs_inode;
> + int lbs_superblock;
> int lbs_ipc;
> int lbs_msg_msg;
> int lbs_task;
> diff --git a/security/security.c b/security/security.c
> index 7b09cfbae94f..9f979d4afe6c 100644
> --- a/security/security.c
> +++ b/security/security.c
> @@ -203,6 +203,7 @@ static void __init lsm_set_blob_sizes(struct lsm_blob_sizes *needed)
> lsm_set_blob_size(&needed->lbs_inode, &blob_sizes.lbs_inode);
> lsm_set_blob_size(&needed->lbs_ipc, &blob_sizes.lbs_ipc);
> lsm_set_blob_size(&needed->lbs_msg_msg, &blob_sizes.lbs_msg_msg);
> + lsm_set_blob_size(&needed->lbs_superblock, &blob_sizes.lbs_superblock);
> lsm_set_blob_size(&needed->lbs_task, &blob_sizes.lbs_task);
> }
>
> @@ -333,12 +334,13 @@ static void __init ordered_lsm_init(void)
> for (lsm = ordered_lsms; *lsm; lsm++)
> prepare_lsm(*lsm);
>
> - init_debug("cred blob size = %d\n", blob_sizes.lbs_cred);
> - init_debug("file blob size = %d\n", blob_sizes.lbs_file);
> - init_debug("inode blob size = %d\n", blob_sizes.lbs_inode);
> - init_debug("ipc blob size = %d\n", blob_sizes.lbs_ipc);
> - init_debug("msg_msg blob size = %d\n", blob_sizes.lbs_msg_msg);
> - init_debug("task blob size = %d\n", blob_sizes.lbs_task);
> + init_debug("cred blob size = %d\n", blob_sizes.lbs_cred);
> + init_debug("file blob size = %d\n", blob_sizes.lbs_file);
> + init_debug("inode blob size = %d\n", blob_sizes.lbs_inode);
> + init_debug("ipc blob size = %d\n", blob_sizes.lbs_ipc);
> + init_debug("msg_msg blob size = %d\n", blob_sizes.lbs_msg_msg);
> + init_debug("superblock blob size = %d\n", blob_sizes.lbs_superblock);
> + init_debug("task blob size = %d\n", blob_sizes.lbs_task);
>
> /*
> * Create any kmem_caches needed for blobs
> @@ -670,6 +672,27 @@ static void __init lsm_early_task(struct task_struct *task)
> panic("%s: Early task alloc failed.\n", __func__);
> }
>
> +/**
> + * lsm_superblock_alloc - allocate a composite superblock blob
> + * @sb: the superblock that needs a blob
> + *
> + * Allocate the superblock blob for all the modules
> + *
> + * Returns 0, or -ENOMEM if memory can't be allocated.
> + */
> +static int lsm_superblock_alloc(struct super_block *sb)
> +{
> + if (blob_sizes.lbs_superblock == 0) {
> + sb->s_security = NULL;
> + return 0;
> + }
> +
> + sb->s_security = kzalloc(blob_sizes.lbs_superblock, GFP_KERNEL);
> + if (sb->s_security == NULL)
> + return -ENOMEM;
> + return 0;
> +}
> +
> /*
> * The default value of the LSM hook is defined in linux/lsm_hook_defs.h and
> * can be accessed with:
> @@ -867,12 +890,21 @@ int security_fs_context_parse_param(struct fs_context *fc, struct fs_parameter *
>
> int security_sb_alloc(struct super_block *sb)
> {
> - return call_int_hook(sb_alloc_security, 0, sb);
> + int rc = lsm_superblock_alloc(sb);
> +
> + if (unlikely(rc))
> + return rc;
> + rc = call_int_hook(sb_alloc_security, 0, sb);
> + if (unlikely(rc))
> + security_sb_free(sb);
> + return rc;
> }
>
> void security_sb_free(struct super_block *sb)
> {
> call_void_hook(sb_free_security, sb);
> + kfree(sb->s_security);
> + sb->s_security = NULL;
> }
>
> void security_free_mnt_opts(void **mnt_opts)
> diff --git a/security/selinux/hooks.c b/security/selinux/hooks.c
> index 644b17ec9e63..ecf0ca8c3108 100644
> --- a/security/selinux/hooks.c
> +++ b/security/selinux/hooks.c
> @@ -322,7 +322,7 @@ static void inode_free_security(struct inode *inode)
>
> if (!isec)
> return;
> - sbsec = inode->i_sb->s_security;
> + sbsec = selinux_superblock(inode->i_sb);
> /*
> * As not all inode security structures are in a list, we check for
> * empty list outside of the lock to make sure that we won't waste
> @@ -340,13 +340,6 @@ static void inode_free_security(struct inode *inode)
> }
> }
>
> -static void superblock_free_security(struct super_block *sb)
> -{
> - struct superblock_security_struct *sbsec = sb->s_security;
> - sb->s_security = NULL;
> - kfree(sbsec);
> -}
> -
> struct selinux_mnt_opts {
> const char *fscontext, *context, *rootcontext, *defcontext;
> };
> @@ -458,7 +451,7 @@ static int selinux_is_genfs_special_handling(struct super_block *sb)
>
> static int selinux_is_sblabel_mnt(struct super_block *sb)
> {
> - struct superblock_security_struct *sbsec = sb->s_security;
> + struct superblock_security_struct *sbsec = selinux_superblock(sb);
>
> /*
> * IMPORTANT: Double-check logic in this function when adding a new
> @@ -486,7 +479,7 @@ static int selinux_is_sblabel_mnt(struct super_block *sb)
>
> static int sb_finish_set_opts(struct super_block *sb)
> {
> - struct superblock_security_struct *sbsec = sb->s_security;
> + struct superblock_security_struct *sbsec = selinux_superblock(sb);
> struct dentry *root = sb->s_root;
> struct inode *root_inode = d_backing_inode(root);
> int rc = 0;
> @@ -599,7 +592,7 @@ static int selinux_set_mnt_opts(struct super_block *sb,
> unsigned long *set_kern_flags)
> {
> const struct cred *cred = current_cred();
> - struct superblock_security_struct *sbsec = sb->s_security;
> + struct superblock_security_struct *sbsec = selinux_superblock(sb);
> struct dentry *root = sb->s_root;
> struct selinux_mnt_opts *opts = mnt_opts;
> struct inode_security_struct *root_isec;
> @@ -836,8 +829,8 @@ static int selinux_set_mnt_opts(struct super_block *sb,
> static int selinux_cmp_sb_context(const struct super_block *oldsb,
> const struct super_block *newsb)
> {
> - struct superblock_security_struct *old = oldsb->s_security;
> - struct superblock_security_struct *new = newsb->s_security;
> + struct superblock_security_struct *old = selinux_superblock(oldsb);
> + struct superblock_security_struct *new = selinux_superblock(newsb);
> char oldflags = old->flags & SE_MNTMASK;
> char newflags = new->flags & SE_MNTMASK;
>
> @@ -869,8 +862,9 @@ static int selinux_sb_clone_mnt_opts(const struct super_block *oldsb,
> unsigned long *set_kern_flags)
> {
> int rc = 0;
> - const struct superblock_security_struct *oldsbsec = oldsb->s_security;
> - struct superblock_security_struct *newsbsec = newsb->s_security;
> + const struct superblock_security_struct *oldsbsec =
> + selinux_superblock(oldsb);
> + struct superblock_security_struct *newsbsec = selinux_superblock(newsb);
>
> int set_fscontext = (oldsbsec->flags & FSCONTEXT_MNT);
> int set_context = (oldsbsec->flags & CONTEXT_MNT);
> @@ -1049,7 +1043,7 @@ static int show_sid(struct seq_file *m, u32 sid)
>
> static int selinux_sb_show_options(struct seq_file *m, struct super_block *sb)
> {
> - struct superblock_security_struct *sbsec = sb->s_security;
> + struct superblock_security_struct *sbsec = selinux_superblock(sb);
> int rc;
>
> if (!(sbsec->flags & SE_SBINITIALIZED))
> @@ -1399,7 +1393,7 @@ static int inode_doinit_with_dentry(struct inode *inode, struct dentry *opt_dent
> if (isec->sclass == SECCLASS_FILE)
> isec->sclass = inode_mode_to_security_class(inode->i_mode);
>
> - sbsec = inode->i_sb->s_security;
> + sbsec = selinux_superblock(inode->i_sb);
> if (!(sbsec->flags & SE_SBINITIALIZED)) {
> /* Defer initialization until selinux_complete_init,
> after the initial policy is loaded and the security
> @@ -1750,7 +1744,8 @@ selinux_determine_inode_label(const struct task_security_struct *tsec,
> const struct qstr *name, u16 tclass,
> u32 *_new_isid)
> {
> - const struct superblock_security_struct *sbsec = dir->i_sb->s_security;
> + const struct superblock_security_struct *sbsec =
> + selinux_superblock(dir->i_sb);
>
> if ((sbsec->flags & SE_SBINITIALIZED) &&
> (sbsec->behavior == SECURITY_FS_USE_MNTPOINT)) {
> @@ -1781,7 +1776,7 @@ static int may_create(struct inode *dir,
> int rc;
>
> dsec = inode_security(dir);
> - sbsec = dir->i_sb->s_security;
> + sbsec = selinux_superblock(dir->i_sb);
>
> sid = tsec->sid;
>
> @@ -1930,7 +1925,7 @@ static int superblock_has_perm(const struct cred *cred,
> struct superblock_security_struct *sbsec;
> u32 sid = cred_sid(cred);
>
> - sbsec = sb->s_security;
> + sbsec = selinux_superblock(sb);
> return avc_has_perm(&selinux_state,
> sid, sbsec->sid, SECCLASS_FILESYSTEM, perms, ad);
> }
> @@ -2559,11 +2554,7 @@ static void selinux_bprm_committed_creds(struct linux_binprm *bprm)
>
> static int selinux_sb_alloc_security(struct super_block *sb)
> {
> - struct superblock_security_struct *sbsec;
> -
> - sbsec = kzalloc(sizeof(struct superblock_security_struct), GFP_KERNEL);
> - if (!sbsec)
> - return -ENOMEM;
> + struct superblock_security_struct *sbsec = selinux_superblock(sb);
>
> mutex_init(&sbsec->lock);
> INIT_LIST_HEAD(&sbsec->isec_head);
> @@ -2571,16 +2562,10 @@ static int selinux_sb_alloc_security(struct super_block *sb)
> sbsec->sid = SECINITSID_UNLABELED;
> sbsec->def_sid = SECINITSID_FILE;
> sbsec->mntpoint_sid = SECINITSID_UNLABELED;
> - sb->s_security = sbsec;
>
> return 0;
> }
>
> -static void selinux_sb_free_security(struct super_block *sb)
> -{
> - superblock_free_security(sb);
> -}
> -
> static inline int opt_len(const char *s)
> {
> bool open_quote = false;
> @@ -2659,7 +2644,7 @@ static int selinux_sb_eat_lsm_opts(char *options, void **mnt_opts)
> static int selinux_sb_remount(struct super_block *sb, void *mnt_opts)
> {
> struct selinux_mnt_opts *opts = mnt_opts;
> - struct superblock_security_struct *sbsec = sb->s_security;
> + struct superblock_security_struct *sbsec = selinux_superblock(sb);
> u32 sid;
> int rc;
>
> @@ -2897,7 +2882,7 @@ static int selinux_inode_init_security(struct inode *inode, struct inode *dir,
> int rc;
> char *context;
>
> - sbsec = dir->i_sb->s_security;
> + sbsec = selinux_superblock(dir->i_sb);
>
> newsid = tsec->create_sid;
>
> @@ -3142,7 +3127,7 @@ static int selinux_inode_setxattr(struct dentry *dentry, const char *name,
> if (!selinux_initialized(&selinux_state))
> return (inode_owner_or_capable(inode) ? 0 : -EPERM);
>
> - sbsec = inode->i_sb->s_security;
> + sbsec = selinux_superblock(inode->i_sb);
> if (!(sbsec->flags & SBLABEL_MNT))
> return -EOPNOTSUPP;
>
> @@ -3384,13 +3369,14 @@ static int selinux_inode_setsecurity(struct inode *inode, const char *name,
> const void *value, size_t size, int flags)
> {
> struct inode_security_struct *isec = inode_security_novalidate(inode);
> - struct superblock_security_struct *sbsec = inode->i_sb->s_security;
> + struct superblock_security_struct *sbsec;
> u32 newsid;
> int rc;
>
> if (strcmp(name, XATTR_SELINUX_SUFFIX))
> return -EOPNOTSUPP;
>
> + sbsec = selinux_superblock(inode->i_sb);
> if (!(sbsec->flags & SBLABEL_MNT))
> return -EOPNOTSUPP;
>
> @@ -6882,6 +6868,7 @@ struct lsm_blob_sizes selinux_blob_sizes __lsm_ro_after_init = {
> .lbs_inode = sizeof(struct inode_security_struct),
> .lbs_ipc = sizeof(struct ipc_security_struct),
> .lbs_msg_msg = sizeof(struct msg_security_struct),
> + .lbs_superblock = sizeof(struct superblock_security_struct),
> };
>
> #ifdef CONFIG_PERF_EVENTS
> @@ -6982,7 +6969,6 @@ static struct security_hook_list selinux_hooks[] __lsm_ro_after_init = {
> LSM_HOOK_INIT(bprm_committing_creds, selinux_bprm_committing_creds),
> LSM_HOOK_INIT(bprm_committed_creds, selinux_bprm_committed_creds),
>
> - LSM_HOOK_INIT(sb_free_security, selinux_sb_free_security),
> LSM_HOOK_INIT(sb_free_mnt_opts, selinux_free_mnt_opts),
> LSM_HOOK_INIT(sb_remount, selinux_sb_remount),
> LSM_HOOK_INIT(sb_kern_mount, selinux_sb_kern_mount),
> diff --git a/security/selinux/include/objsec.h b/security/selinux/include/objsec.h
> index ca4d7ab6a835..2953132408bf 100644
> --- a/security/selinux/include/objsec.h
> +++ b/security/selinux/include/objsec.h
> @@ -188,4 +188,10 @@ static inline u32 current_sid(void)
> return tsec->sid;
> }
>
> +static inline struct superblock_security_struct *selinux_superblock(
> + const struct super_block *superblock)
> +{
> + return superblock->s_security + selinux_blob_sizes.lbs_superblock;
> +}
> +
> #endif /* _SELINUX_OBJSEC_H_ */
> diff --git a/security/selinux/ss/services.c b/security/selinux/ss/services.c
> index 597b79703584..74e3905dd9c5 100644
> --- a/security/selinux/ss/services.c
> +++ b/security/selinux/ss/services.c
> @@ -47,6 +47,7 @@
> #include <linux/sched.h>
> #include <linux/audit.h>
> #include <linux/vmalloc.h>
> +#include <linux/lsm_hooks.h>
> #include <net/netlabel.h>
>
> #include "flask.h"
> @@ -2873,7 +2874,7 @@ int security_fs_use(struct selinux_state *state, struct super_block *sb)
> struct sidtab *sidtab;
> int rc = 0;
> struct ocontext *c;
> - struct superblock_security_struct *sbsec = sb->s_security;
> + struct superblock_security_struct *sbsec = selinux_superblock(sb);
> const char *fstype = sb->s_type->name;
>
> if (!selinux_initialized(state)) {
> diff --git a/security/smack/smack.h b/security/smack/smack.h
> index a9768b12716b..7077b18c79ec 100644
> --- a/security/smack/smack.h
> +++ b/security/smack/smack.h
> @@ -357,6 +357,12 @@ static inline struct smack_known **smack_ipc(const struct kern_ipc_perm *ipc)
> return ipc->security + smack_blob_sizes.lbs_ipc;
> }
>
> +static inline struct superblock_smack *smack_superblock(
> + const struct super_block *superblock)
> +{
> + return superblock->s_security + smack_blob_sizes.lbs_superblock;
> +}
> +
> /*
> * Is the directory transmuting?
> */
> diff --git a/security/smack/smack_lsm.c b/security/smack/smack_lsm.c
> index f69c3dd9a0c6..767084dc2c29 100644
> --- a/security/smack/smack_lsm.c
> +++ b/security/smack/smack_lsm.c
> @@ -535,12 +535,7 @@ static int smack_syslog(int typefrom_file)
> */
> static int smack_sb_alloc_security(struct super_block *sb)
> {
> - struct superblock_smack *sbsp;
> -
> - sbsp = kzalloc(sizeof(struct superblock_smack), GFP_KERNEL);
> -
> - if (sbsp == NULL)
> - return -ENOMEM;
> + struct superblock_smack *sbsp = smack_superblock(sb);
>
> sbsp->smk_root = &smack_known_floor;
> sbsp->smk_default = &smack_known_floor;
> @@ -549,22 +544,10 @@ static int smack_sb_alloc_security(struct super_block *sb)
> /*
> * SMK_SB_INITIALIZED will be zero from kzalloc.
> */
> - sb->s_security = sbsp;
>
> return 0;
> }
>
> -/**
> - * smack_sb_free_security - free a superblock blob
> - * @sb: the superblock getting the blob
> - *
> - */
> -static void smack_sb_free_security(struct super_block *sb)
> -{
> - kfree(sb->s_security);
> - sb->s_security = NULL;
> -}
> -
> struct smack_mnt_opts {
> const char *fsdefault, *fsfloor, *fshat, *fsroot, *fstransmute;
> };
> @@ -772,7 +755,7 @@ static int smack_set_mnt_opts(struct super_block *sb,
> {
> struct dentry *root = sb->s_root;
> struct inode *inode = d_backing_inode(root);
> - struct superblock_smack *sp = sb->s_security;
> + struct superblock_smack *sp = smack_superblock(sb);
> struct inode_smack *isp;
> struct smack_known *skp;
> struct smack_mnt_opts *opts = mnt_opts;
> @@ -871,7 +854,7 @@ static int smack_set_mnt_opts(struct super_block *sb,
> */
> static int smack_sb_statfs(struct dentry *dentry)
> {
> - struct superblock_smack *sbp = dentry->d_sb->s_security;
> + struct superblock_smack *sbp = smack_superblock(dentry->d_sb);
> int rc;
> struct smk_audit_info ad;
>
> @@ -905,7 +888,7 @@ static int smack_bprm_creds_for_exec(struct linux_binprm *bprm)
> if (isp->smk_task == NULL || isp->smk_task == bsp->smk_task)
> return 0;
>
> - sbsp = inode->i_sb->s_security;
> + sbsp = smack_superblock(inode->i_sb);
> if ((sbsp->smk_flags & SMK_SB_UNTRUSTED) &&
> isp->smk_task != sbsp->smk_root)
> return 0;
> @@ -1157,7 +1140,7 @@ static int smack_inode_rename(struct inode *old_inode,
> */
> static int smack_inode_permission(struct inode *inode, int mask)
> {
> - struct superblock_smack *sbsp = inode->i_sb->s_security;
> + struct superblock_smack *sbsp = smack_superblock(inode->i_sb);
> struct smk_audit_info ad;
> int no_block = mask & MAY_NOT_BLOCK;
> int rc;
> @@ -1398,7 +1381,7 @@ static int smack_inode_removexattr(struct dentry *dentry, const char *name)
> */
> if (strcmp(name, XATTR_NAME_SMACK) == 0) {
> struct super_block *sbp = dentry->d_sb;
> - struct superblock_smack *sbsp = sbp->s_security;
> + struct superblock_smack *sbsp = smack_superblock(sbp);
>
> isp->smk_inode = sbsp->smk_default;
> } else if (strcmp(name, XATTR_NAME_SMACKEXEC) == 0)
> @@ -1668,7 +1651,7 @@ static int smack_mmap_file(struct file *file,
> isp = smack_inode(file_inode(file));
> if (isp->smk_mmap == NULL)
> return 0;
> - sbsp = file_inode(file)->i_sb->s_security;
> + sbsp = smack_superblock(file_inode(file)->i_sb);
> if (sbsp->smk_flags & SMK_SB_UNTRUSTED &&
> isp->smk_mmap != sbsp->smk_root)
> return -EACCES;
> @@ -3283,7 +3266,7 @@ static void smack_d_instantiate(struct dentry *opt_dentry, struct inode *inode)
> return;
>
> sbp = inode->i_sb;
> - sbsp = sbp->s_security;
> + sbsp = smack_superblock(sbp);
> /*
> * We're going to use the superblock default label
> * if there's no label on the file.
> @@ -4696,6 +4679,7 @@ struct lsm_blob_sizes smack_blob_sizes __lsm_ro_after_init = {
> .lbs_inode = sizeof(struct inode_smack),
> .lbs_ipc = sizeof(struct smack_known *),
> .lbs_msg_msg = sizeof(struct smack_known *),
> + .lbs_superblock = sizeof(struct superblock_smack),
> };
>
> static struct security_hook_list smack_hooks[] __lsm_ro_after_init = {
> @@ -4707,7 +4691,6 @@ static struct security_hook_list smack_hooks[] __lsm_ro_after_init = {
> LSM_HOOK_INIT(fs_context_parse_param, smack_fs_context_parse_param),
>
> LSM_HOOK_INIT(sb_alloc_security, smack_sb_alloc_security),
> - LSM_HOOK_INIT(sb_free_security, smack_sb_free_security),
> LSM_HOOK_INIT(sb_free_mnt_opts, smack_free_mnt_opts),
> LSM_HOOK_INIT(sb_eat_lsm_opts, smack_sb_eat_lsm_opts),
> LSM_HOOK_INIT(sb_statfs, smack_sb_statfs),
> --
> 2.30.0
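
The conversion above follows a single pattern: instead of each module
allocating sb->s_security itself, the LSM framework allocates one shared
blob sized from the registered lbs_superblock values, and each module
reads its own slice at its registered offset. A minimal sketch of that
pattern for a hypothetical LSM "foo" (the name, struct and flag are
illustrative only, not part of the patch):

#include <linux/fs.h>
#include <linux/lsm_hooks.h>

struct foo_superblock {
	u32 flags;
};

/* Ask the framework to reserve space in the shared superblock blob. */
struct lsm_blob_sizes foo_blob_sizes __lsm_ro_after_init = {
	.lbs_superblock = sizeof(struct foo_superblock),
};

static inline struct foo_superblock *foo_superblock(
	const struct super_block *sb)
{
	/* s_security points at the shared blob; add this LSM's offset. */
	return sb->s_security + foo_blob_sizes.lbs_superblock;
}

static int foo_sb_alloc_security(struct super_block *sb)
{
	struct foo_superblock *sbsec = foo_superblock(sb);

	/* The blob is zero-initialized by the framework. */
	sbsec->flags = 0;
	return 0;
}

With this scheme a per-module sb_free_security hook is no longer needed,
which is why the SELinux and Smack variants are removed in the diff above.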

2021-02-05 23:14:27

by Mickaël Salaün

[permalink] [raw]
Subject: Re: [PATCH v28 06/12] fs,security: Add sb_delete hook


On 05/02/2021 15:21, Serge E. Hallyn wrote:
> On Tue, Feb 02, 2021 at 05:27:04PM +0100, Mickaël Salaün wrote:
>> From: Mickaël Salaün <[email protected]>
>>
>> The sb_delete security hook is called when shutting down a superblock,
>> which may be useful to release kernel objects tied to the superblock's
>> lifetime (e.g. inodes).
>>
>> This new hook is needed by Landlock to release (ephemerally) tagged
>> struct inodes. This comes from the unprivileged nature of Landlock
>> described in the next commit.
>>
>> Cc: Al Viro <[email protected]>
>> Cc: James Morris <[email protected]>
>> Cc: Kees Cook <[email protected]>
>> Cc: Serge E. Hallyn <[email protected]>
>
> One note below, but
>
> Acked-by: Serge Hallyn <[email protected]>
>
>> Signed-off-by: Mickaël Salaün <[email protected]>
>> Reviewed-by: Jann Horn <[email protected]>
>> ---
>>
>> Changes since v22:
>> * Add Reviewed-by: Jann Horn <[email protected]>
>>
>> Changes since v17:
>> * Initial patch to replace the direct call to landlock_release_inodes()
>> (requested by James Morris).
>> https://lore.kernel.org/lkml/[email protected]/
>> ---
>> fs/super.c | 1 +
>> include/linux/lsm_hook_defs.h | 1 +
>> include/linux/lsm_hooks.h | 2 ++
>> include/linux/security.h | 4 ++++
>> security/security.c | 5 +++++
>> 5 files changed, 13 insertions(+)
>>
>> diff --git a/fs/super.c b/fs/super.c
>> index 2c6cdea2ab2d..c3c5178cde65 100644
>> --- a/fs/super.c
>> +++ b/fs/super.c
>> @@ -454,6 +454,7 @@ void generic_shutdown_super(struct super_block *sb)
>> evict_inodes(sb);
>> /* only nonzero refcount inodes can have marks */
>> fsnotify_sb_delete(sb);
>> + security_sb_delete(sb);
>>
>> if (sb->s_dio_done_wq) {
>> destroy_workqueue(sb->s_dio_done_wq);
>> diff --git a/include/linux/lsm_hook_defs.h b/include/linux/lsm_hook_defs.h
>> index 7aaa753b8608..32472b3849bc 100644
>> --- a/include/linux/lsm_hook_defs.h
>> +++ b/include/linux/lsm_hook_defs.h
>> @@ -59,6 +59,7 @@ LSM_HOOK(int, 0, fs_context_dup, struct fs_context *fc,
>> LSM_HOOK(int, -ENOPARAM, fs_context_parse_param, struct fs_context *fc,
>> struct fs_parameter *param)
>> LSM_HOOK(int, 0, sb_alloc_security, struct super_block *sb)
>> +LSM_HOOK(void, LSM_RET_VOID, sb_delete, struct super_block *sb)
>> LSM_HOOK(void, LSM_RET_VOID, sb_free_security, struct super_block *sb)
>> LSM_HOOK(void, LSM_RET_VOID, sb_free_mnt_opts, void *mnt_opts)
>> LSM_HOOK(int, 0, sb_eat_lsm_opts, char *orig, void **mnt_opts)
>> diff --git a/include/linux/lsm_hooks.h b/include/linux/lsm_hooks.h
>> index 970106d98306..e339b201f79b 100644
>> --- a/include/linux/lsm_hooks.h
>> +++ b/include/linux/lsm_hooks.h
>> @@ -108,6 +108,8 @@
>> * allocated.
>> * @sb contains the super_block structure to be modified.
>> * Return 0 if operation was successful.
>> + * @sb_delete:
>> + * Release objects tied to a superblock (e.g. inodes).
>
> It's customary here to add the line detailing the @sb argument.

What about "@sb contains the super_block structure being released."?

>
>> * @sb_free_security:
>> * Deallocate and clear the sb->s_security field.
>> * @sb contains the super_block structure to be modified.
>> diff --git a/include/linux/security.h b/include/linux/security.h
>> index c35ea0ffccd9..c41a94e29b62 100644
>> --- a/include/linux/security.h
>> +++ b/include/linux/security.h
>> @@ -288,6 +288,7 @@ void security_bprm_committed_creds(struct linux_binprm *bprm);
>> int security_fs_context_dup(struct fs_context *fc, struct fs_context *src_fc);
>> int security_fs_context_parse_param(struct fs_context *fc, struct fs_parameter *param);
>> int security_sb_alloc(struct super_block *sb);
>> +void security_sb_delete(struct super_block *sb);
>> void security_sb_free(struct super_block *sb);
>> void security_free_mnt_opts(void **mnt_opts);
>> int security_sb_eat_lsm_opts(char *options, void **mnt_opts);
>> @@ -620,6 +621,9 @@ static inline int security_sb_alloc(struct super_block *sb)
>> return 0;
>> }
>>
>> +static inline void security_sb_delete(struct super_block *sb)
>> +{ }
>> +
>> static inline void security_sb_free(struct super_block *sb)
>> { }
>>
>> diff --git a/security/security.c b/security/security.c
>> index 9f979d4afe6c..1b4a73b2549a 100644
>> --- a/security/security.c
>> +++ b/security/security.c
>> @@ -900,6 +900,11 @@ int security_sb_alloc(struct super_block *sb)
>> return rc;
>> }
>>
>> +void security_sb_delete(struct super_block *sb)
>> +{
>> + call_void_hook(sb_delete, sb);
>> +}
>> +
>> void security_sb_free(struct super_block *sb)
>> {
>> call_void_hook(sb_free_security, sb);
>> --
>> 2.30.0
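
The patch quoted above only adds the sb_delete plumbing; Landlock's own
implementation lands in a later patch of the series. As an illustration
of how an LSM would wire up the new hook (the LSM name and hook body are
hypothetical, not Landlock's actual code):

#include <linux/lsm_hooks.h>

static void foo_sb_delete(struct super_block *const sb)
{
	/*
	 * Drop whatever references this LSM holds on objects tied to
	 * @sb's lifetime (for Landlock: ephemerally tagged inodes),
	 * before generic_shutdown_super() continues tearing it down.
	 */
}

static struct security_hook_list foo_hooks[] __lsm_ro_after_init = {
	LSM_HOOK_INIT(sb_delete, foo_sb_delete),
};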

2021-02-07 04:21:51

by Serge E. Hallyn

[permalink] [raw]
Subject: Re: [PATCH v28 06/12] fs,security: Add sb_delete hook

On Fri, Feb 05, 2021 at 03:57:37PM +0100, Mickaël Salaün wrote:
>
> On 05/02/2021 15:21, Serge E. Hallyn wrote:
> > On Tue, Feb 02, 2021 at 05:27:04PM +0100, Mickaël Salaün wrote:
> >> From: Mickaël Salaün <[email protected]>
> >>
> >> The sb_delete security hook is called when shutting down a superblock,
> >> which may be useful to release kernel objects tied to the superblock's
> >> lifetime (e.g. inodes).
> >>
> >> This new hook is needed by Landlock to release (ephemerally) tagged
> >> struct inodes. This comes from the unprivileged nature of Landlock
> >> described in the next commit.
> >>
> >> Cc: Al Viro <[email protected]>
> >> Cc: James Morris <[email protected]>
> >> Cc: Kees Cook <[email protected]>
> >> Cc: Serge E. Hallyn <[email protected]>
> >
> > One note below, but
> >
> > Acked-by: Serge Hallyn <[email protected]>
> >
> >> Signed-off-by: Mickaël Salaün <[email protected]>
> >> Reviewed-by: Jann Horn <[email protected]>
> >> ---
> >>
> >> Changes since v22:
> >> * Add Reviewed-by: Jann Horn <[email protected]>
> >>
> >> Changes since v17:
> >> * Initial patch to replace the direct call to landlock_release_inodes()
> >> (requested by James Morris).
> >> https://lore.kernel.org/lkml/[email protected]/
> >> ---
> >> fs/super.c | 1 +
> >> include/linux/lsm_hook_defs.h | 1 +
> >> include/linux/lsm_hooks.h | 2 ++
> >> include/linux/security.h | 4 ++++
> >> security/security.c | 5 +++++
> >> 5 files changed, 13 insertions(+)
> >>
> >> diff --git a/fs/super.c b/fs/super.c
> >> index 2c6cdea2ab2d..c3c5178cde65 100644
> >> --- a/fs/super.c
> >> +++ b/fs/super.c
> >> @@ -454,6 +454,7 @@ void generic_shutdown_super(struct super_block *sb)
> >> evict_inodes(sb);
> >> /* only nonzero refcount inodes can have marks */
> >> fsnotify_sb_delete(sb);
> >> + security_sb_delete(sb);
> >>
> >> if (sb->s_dio_done_wq) {
> >> destroy_workqueue(sb->s_dio_done_wq);
> >> diff --git a/include/linux/lsm_hook_defs.h b/include/linux/lsm_hook_defs.h
> >> index 7aaa753b8608..32472b3849bc 100644
> >> --- a/include/linux/lsm_hook_defs.h
> >> +++ b/include/linux/lsm_hook_defs.h
> >> @@ -59,6 +59,7 @@ LSM_HOOK(int, 0, fs_context_dup, struct fs_context *fc,
> >> LSM_HOOK(int, -ENOPARAM, fs_context_parse_param, struct fs_context *fc,
> >> struct fs_parameter *param)
> >> LSM_HOOK(int, 0, sb_alloc_security, struct super_block *sb)
> >> +LSM_HOOK(void, LSM_RET_VOID, sb_delete, struct super_block *sb)
> >> LSM_HOOK(void, LSM_RET_VOID, sb_free_security, struct super_block *sb)
> >> LSM_HOOK(void, LSM_RET_VOID, sb_free_mnt_opts, void *mnt_opts)
> >> LSM_HOOK(int, 0, sb_eat_lsm_opts, char *orig, void **mnt_opts)
> >> diff --git a/include/linux/lsm_hooks.h b/include/linux/lsm_hooks.h
> >> index 970106d98306..e339b201f79b 100644
> >> --- a/include/linux/lsm_hooks.h
> >> +++ b/include/linux/lsm_hooks.h
> >> @@ -108,6 +108,8 @@
> >> * allocated.
> >> * @sb contains the super_block structure to be modified.
> >> * Return 0 if operation was successful.
> >> + * @sb_delete:
> >> + * Release objects tied to a superblock (e.g. inodes).
> >
> > It's customary here to add the line detailing the @sb argument.
>
> What about "@sb contains the super_block structure being released."?

That's good. Thanks.

> >
> >> * @sb_free_security:
> >> * Deallocate and clear the sb->s_security field.
> >> * @sb contains the super_block structure to be modified.
> >> diff --git a/include/linux/security.h b/include/linux/security.h
> >> index c35ea0ffccd9..c41a94e29b62 100644
> >> --- a/include/linux/security.h
> >> +++ b/include/linux/security.h
> >> @@ -288,6 +288,7 @@ void security_bprm_committed_creds(struct linux_binprm *bprm);
> >> int security_fs_context_dup(struct fs_context *fc, struct fs_context *src_fc);
> >> int security_fs_context_parse_param(struct fs_context *fc, struct fs_parameter *param);
> >> int security_sb_alloc(struct super_block *sb);
> >> +void security_sb_delete(struct super_block *sb);
> >> void security_sb_free(struct super_block *sb);
> >> void security_free_mnt_opts(void **mnt_opts);
> >> int security_sb_eat_lsm_opts(char *options, void **mnt_opts);
> >> @@ -620,6 +621,9 @@ static inline int security_sb_alloc(struct super_block *sb)
> >> return 0;
> >> }
> >>
> >> +static inline void security_sb_delete(struct super_block *sb)
> >> +{ }
> >> +
> >> static inline void security_sb_free(struct super_block *sb)
> >> { }
> >>
> >> diff --git a/security/security.c b/security/security.c
> >> index 9f979d4afe6c..1b4a73b2549a 100644
> >> --- a/security/security.c
> >> +++ b/security/security.c
> >> @@ -900,6 +900,11 @@ int security_sb_alloc(struct super_block *sb)
> >> return rc;
> >> }
> >>
> >> +void security_sb_delete(struct super_block *sb)
> >> +{
> >> + call_void_hook(sb_delete, sb);
> >> +}
> >> +
> >> void security_sb_free(struct super_block *sb)
> >> {
> >> call_void_hook(sb_free_security, sb);
> >> --
> >> 2.30.0
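
With the @sb line agreed on in this exchange, the @sb_delete entry in
include/linux/lsm_hooks.h would presumably end up reading:

 * @sb_delete:
 *	Release objects tied to a superblock (e.g. inodes).
 *	@sb contains the super_block structure being released.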