Hi,
This third patch series is mostly a rebase with some whitespace changes
because of clang-format. There is also some new "unlikely()" calls and
minor code cleanup.
Test coverage for security/landlock was 94.4% of 504 lines (with the
previous patch series), and it is now 95.4% of 604 lines according to
gcc/gcov-11.
Problem
=======
One of the most annoying limitations of Landlock is that sandboxed
processes can only link and rename files to the same directory (i.e.
file reparenting is always denied). Indeed, because of the unprivileged
nature of Landlock, file hierarchy are identified thanks to ephemeral
inode tagging, which may cause arbitrary renaming and linking to change
the security policy in an unexpected way.
Solution
========
This patch series brings a new access right, LANDLOCK_ACCESS_FS_REFER,
which enables to allow safe file linking and renaming. In a nutshell,
Landlock checks that the inherited access rights of a moved or renamed
file cannot increase but only reduce. Eleven new test suits cover file
renaming and linking, which improves test coverage.
The documentation and the tutorial is extended with this new access
right, along with more explanations about backward and forward
compatibility, good practices, and a bit about the current access
rights rational.
While developing this new feature, I also found an issue with the
current implementation of Landlock. In some (rare) cases, sandboxed
processes may be more restricted than intended. Indeed, because of the
current way to check file hierarchy access rights, composition of rules
may be incomplete when requesting multiple accesses at the same time.
This is fixed with a dedicated patch involving some refactoring. A new
test suite checks relevant new edge cases.
As a side effect, and to limit the increased use of the stack, I reduced
the number of Landlock nested domains from 64 to 16. I think this
should be more than enough for legitimate use cases, but feel free to
challenge this decision with real and legitimate use cases.
Additionally, a new dedicated syzkaller test has been developed to cover
new paths.
This patch series is based on and was developed with some complementary
new tests sent in a standalone patch series:
https://lore.kernel.org/r/[email protected]
Previous versions:
v2: https://lore.kernel.org/r/[email protected]
v1: https://lore.kernel.org/r/[email protected]
Regards,
Mickaël Salaün (12):
landlock: Define access_mask_t to enforce a consistent access mask
size
landlock: Reduce the maximum number of layers to 16
landlock: Create find_rule() from unmask_layers()
landlock: Fix same-layer rule unions
landlock: Move filesystem helpers and add a new one
LSM: Remove double path_rename hook calls for RENAME_EXCHANGE
landlock: Add support for file reparenting with
LANDLOCK_ACCESS_FS_REFER
selftests/landlock: Add 11 new test suites dedicated to file
reparenting
samples/landlock: Add support for file reparenting
landlock: Document LANDLOCK_ACCESS_FS_REFER and ABI versioning
landlock: Document good practices about filesystem policies
landlock: Add design choices documentation for filesystem access
rights
Documentation/security/landlock.rst | 17 +-
Documentation/userspace-api/landlock.rst | 151 ++-
include/linux/lsm_hook_defs.h | 2 +-
include/linux/lsm_hooks.h | 1 +
include/uapi/linux/landlock.h | 27 +-
samples/landlock/sandboxer.c | 40 +-
security/apparmor/lsm.c | 30 +-
security/landlock/fs.c | 771 ++++++++++---
security/landlock/fs.h | 2 +-
security/landlock/limits.h | 6 +-
security/landlock/ruleset.c | 6 +-
security/landlock/ruleset.h | 22 +-
security/landlock/syscalls.c | 2 +-
security/security.c | 9 +-
security/tomoyo/tomoyo.c | 11 +-
tools/testing/selftests/landlock/base_test.c | 2 +-
tools/testing/selftests/landlock/fs_test.c | 1039 ++++++++++++++++--
17 files changed, 1853 insertions(+), 285 deletions(-)
base-commit: 4b0cdb0cf6eefa7521322007931ccfb7edc96c53
--
2.35.1
In order to be able to identify a file exchange with renameat2(2) and
RENAME_EXCHANGE, which will be useful for Landlock [1], propagate the
rename flags to LSMs. This may also improve performance because of the
switch from two set of LSM hook calls to only one, and because LSMs
using this hook may optimize the double check (e.g. only one lock,
reduce the number of path walks).
AppArmor, Landlock and Tomoyo are updated to leverage this change. This
should not change the current behavior (same check order), except
(different level of) speed boosts.
[1] https://lore.kernel.org/r/[email protected]
Cc: James Morris <[email protected]>
Cc: Kentaro Takeda <[email protected]>
Cc: Serge E. Hallyn <[email protected]>
Acked-by: John Johansen <[email protected]>
Acked-by: Tetsuo Handa <[email protected]>
Reviewed-by: Paul Moore <[email protected]>
Signed-off-by: Mickaël Salaün <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
---
Changes since v2:
* Add Reviewed-by: Paul Moore.
* Format security/landlock/fs.c with clang-format.
Changes since v1:
* Import patch from
https://lore.kernel.org/r/[email protected]
* Add Acked-by: Tetsuo Handa.
* Add Acked-by: John Johansen.
---
include/linux/lsm_hook_defs.h | 2 +-
include/linux/lsm_hooks.h | 1 +
security/apparmor/lsm.c | 30 +++++++++++++++++++++++++-----
security/landlock/fs.c | 11 ++++++++++-
security/security.c | 9 +--------
security/tomoyo/tomoyo.c | 11 ++++++++++-
6 files changed, 48 insertions(+), 16 deletions(-)
diff --git a/include/linux/lsm_hook_defs.h b/include/linux/lsm_hook_defs.h
index db924fe379c9..eafa1d2489fd 100644
--- a/include/linux/lsm_hook_defs.h
+++ b/include/linux/lsm_hook_defs.h
@@ -100,7 +100,7 @@ LSM_HOOK(int, 0, path_link, struct dentry *old_dentry,
const struct path *new_dir, struct dentry *new_dentry)
LSM_HOOK(int, 0, path_rename, const struct path *old_dir,
struct dentry *old_dentry, const struct path *new_dir,
- struct dentry *new_dentry)
+ struct dentry *new_dentry, unsigned int flags)
LSM_HOOK(int, 0, path_chmod, const struct path *path, umode_t mode)
LSM_HOOK(int, 0, path_chown, const struct path *path, kuid_t uid, kgid_t gid)
LSM_HOOK(int, 0, path_chroot, const struct path *path)
diff --git a/include/linux/lsm_hooks.h b/include/linux/lsm_hooks.h
index 419b5febc3ca..9acf5e368d73 100644
--- a/include/linux/lsm_hooks.h
+++ b/include/linux/lsm_hooks.h
@@ -358,6 +358,7 @@
* @old_dentry contains the dentry structure of the old link.
* @new_dir contains the path structure for parent of the new link.
* @new_dentry contains the dentry structure of the new link.
+ * @flags may contain rename options such as RENAME_EXCHANGE.
* Return 0 if permission is granted.
* @path_chmod:
* Check for permission to change a mode of the file @path. The new
diff --git a/security/apparmor/lsm.c b/security/apparmor/lsm.c
index 4f0eecb67dde..900bc540656a 100644
--- a/security/apparmor/lsm.c
+++ b/security/apparmor/lsm.c
@@ -354,13 +354,16 @@ static int apparmor_path_link(struct dentry *old_dentry, const struct path *new_
}
static int apparmor_path_rename(const struct path *old_dir, struct dentry *old_dentry,
- const struct path *new_dir, struct dentry *new_dentry)
+ const struct path *new_dir, struct dentry *new_dentry,
+ const unsigned int flags)
{
struct aa_label *label;
int error = 0;
if (!path_mediated_fs(old_dentry))
return 0;
+ if ((flags & RENAME_EXCHANGE) && !path_mediated_fs(new_dentry))
+ return 0;
label = begin_current_label_crit_section();
if (!unconfined(label)) {
@@ -374,10 +377,27 @@ static int apparmor_path_rename(const struct path *old_dir, struct dentry *old_d
d_backing_inode(old_dentry)->i_mode
};
- error = aa_path_perm(OP_RENAME_SRC, label, &old_path, 0,
- MAY_READ | AA_MAY_GETATTR | MAY_WRITE |
- AA_MAY_SETATTR | AA_MAY_DELETE,
- &cond);
+ if (flags & RENAME_EXCHANGE) {
+ struct path_cond cond_exchange = {
+ i_uid_into_mnt(mnt_userns, d_backing_inode(new_dentry)),
+ d_backing_inode(new_dentry)->i_mode
+ };
+
+ error = aa_path_perm(OP_RENAME_SRC, label, &new_path, 0,
+ MAY_READ | AA_MAY_GETATTR | MAY_WRITE |
+ AA_MAY_SETATTR | AA_MAY_DELETE,
+ &cond_exchange);
+ if (!error)
+ error = aa_path_perm(OP_RENAME_DEST, label, &old_path,
+ 0, MAY_WRITE | AA_MAY_SETATTR |
+ AA_MAY_CREATE, &cond_exchange);
+ }
+
+ if (!error)
+ error = aa_path_perm(OP_RENAME_SRC, label, &old_path, 0,
+ MAY_READ | AA_MAY_GETATTR | MAY_WRITE |
+ AA_MAY_SETATTR | AA_MAY_DELETE,
+ &cond);
if (!error)
error = aa_path_perm(OP_RENAME_DEST, label, &new_path,
0, MAY_WRITE | AA_MAY_SETATTR |
diff --git a/security/landlock/fs.c b/security/landlock/fs.c
index 7b7860039a08..30b42cdee52e 100644
--- a/security/landlock/fs.c
+++ b/security/landlock/fs.c
@@ -622,10 +622,12 @@ static int hook_path_link(struct dentry *const old_dentry,
static int hook_path_rename(const struct path *const old_dir,
struct dentry *const old_dentry,
const struct path *const new_dir,
- struct dentry *const new_dentry)
+ struct dentry *const new_dentry,
+ const unsigned int flags)
{
const struct landlock_ruleset *const dom =
landlock_get_current_domain();
+ u32 exchange_access = 0;
if (!dom)
return 0;
@@ -633,12 +635,19 @@ static int hook_path_rename(const struct path *const old_dir,
if (old_dir->dentry != new_dir->dentry)
/* Gracefully forbids reparenting. */
return -EXDEV;
+ if (flags & RENAME_EXCHANGE) {
+ if (unlikely(d_is_negative(new_dentry)))
+ return -ENOENT;
+ exchange_access =
+ get_mode_access(d_backing_inode(new_dentry)->i_mode);
+ }
if (unlikely(d_is_negative(old_dentry)))
return -ENOENT;
/* RENAME_EXCHANGE is handled because directories are the same. */
return check_access_path(
dom, old_dir,
maybe_remove(old_dentry) | maybe_remove(new_dentry) |
+ exchange_access |
get_mode_access(d_backing_inode(old_dentry)->i_mode));
}
diff --git a/security/security.c b/security/security.c
index b7cf5cbfdc67..c9bfc0b70b28 100644
--- a/security/security.c
+++ b/security/security.c
@@ -1197,15 +1197,8 @@ int security_path_rename(const struct path *old_dir, struct dentry *old_dentry,
(d_is_positive(new_dentry) && IS_PRIVATE(d_backing_inode(new_dentry)))))
return 0;
- if (flags & RENAME_EXCHANGE) {
- int err = call_int_hook(path_rename, 0, new_dir, new_dentry,
- old_dir, old_dentry);
- if (err)
- return err;
- }
-
return call_int_hook(path_rename, 0, old_dir, old_dentry, new_dir,
- new_dentry);
+ new_dentry, flags);
}
EXPORT_SYMBOL(security_path_rename);
diff --git a/security/tomoyo/tomoyo.c b/security/tomoyo/tomoyo.c
index b6a31901f289..71e82d855ebf 100644
--- a/security/tomoyo/tomoyo.c
+++ b/security/tomoyo/tomoyo.c
@@ -264,17 +264,26 @@ static int tomoyo_path_link(struct dentry *old_dentry, const struct path *new_di
* @old_dentry: Pointer to "struct dentry".
* @new_parent: Pointer to "struct path".
* @new_dentry: Pointer to "struct dentry".
+ * @flags: Rename options.
*
* Returns 0 on success, negative value otherwise.
*/
static int tomoyo_path_rename(const struct path *old_parent,
struct dentry *old_dentry,
const struct path *new_parent,
- struct dentry *new_dentry)
+ struct dentry *new_dentry,
+ const unsigned int flags)
{
struct path path1 = { .mnt = old_parent->mnt, .dentry = old_dentry };
struct path path2 = { .mnt = new_parent->mnt, .dentry = new_dentry };
+ if (flags & RENAME_EXCHANGE) {
+ const int err = tomoyo_path2_perm(TOMOYO_TYPE_RENAME, &path2,
+ &path1);
+
+ if (err)
+ return err;
+ }
return tomoyo_path2_perm(TOMOYO_TYPE_RENAME, &path1, &path2);
}
--
2.35.1
Add a new LANDLOCK_ACCESS_FS_REFER access right to enable policy writers
to allow sandboxed processes to link and rename files from and to a
specific set of file hierarchies. This access right should be composed
with LANDLOCK_ACCESS_FS_MAKE_* for the destination of a link or rename,
and with LANDLOCK_ACCESS_FS_REMOVE_* for a source of a rename. This
lift a Landlock limitation that always denied changing the parent of an
inode.
Renaming or linking to the same directory is still always allowed,
whatever LANDLOCK_ACCESS_FS_REFER is used or not, because it is not
considered a threat to user data.
However, creating multiple links or renaming to a different parent
directory may lead to privilege escalations if not handled properly.
Indeed, we must be sure that the source doesn't gain more privileges by
being accessible from the destination. This is handled by making sure
that the source hierarchy (including the referenced file or directory
itself) restricts at least as much the destination hierarchy. If it is
not the case, an EXDEV error is returned, making it potentially possible
for user space to copy the file hierarchy instead of moving or linking
it.
Instead of creating different access rights for the source and the
destination, we choose to make it simple and consistent for users.
Indeed, considering the previous constraint, it would be weird to
require such destination access right to be also granted to the source
(to make it a superset). Moreover, RENAME_EXCHANGE would also add to
the confusion because of paths being both a source and a destination.
See the provided documentation for additional details.
New tests are provided with a following commit.
Reviewed-by: Paul Moore <[email protected]>
Signed-off-by: Mickaël Salaün <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
---
Changes since v2:
* Add Reviewed-by: Paul Moore.
* Update collect_domain_accesses() documentation for the RENAME_EXCHANGE
case and fix spelling.
* Add unlikely() branches in check_access_path_dual() to favor single
access request (i.e. non-refer actions).
* Remove useless ret assignment in collect_domain_accesses().
* Format with clang-format and rebase.
Changes since v1:
* Update current_check_access_path() to efficiently handle
RENAME_EXCHANGE thanks to the updated LSM hook (see previous patch).
Only one path walk is performed per rename arguments until their
common mount point is reached. Superset of access rights is correctly
checked, including when exchanging a file with a directory. This
requires to store another matrix of layer masks.
* Reorder and rename check_access_path_dual() arguments in a more
generic way: switch from src/dst to 1/2. This makes it easier to
understand the RENAME_EXCHANGE cases alongs with the others. Update
and improve check_access_path_dual() documentation accordingly.
* Clean up the check_access_path_dual() loop: set both allowed_parent*
when reaching internal filesystems and remove a useless one. This
allows potential renames in internal filesystems (like for other
operations).
* Move the function arguments checks from BUILD_BUG_ON() to
WARN_ON_ONCE() to avoid clang build error.
* Rename is_superset() to no_more_access() and make it handle superset
checks of source and destination for simple and exchange cases.
* Move the layer_masks_child* creation from current_check_refer_path()
to check_access_path_dual(): this is simpler and less error-prone,
especially with RENAME_EXCHANGE.
* Remove one optimization in current_check_refer_path() to make the code
simpler, especially with the RENAME_EXCHANGE handling.
* Remove overzealous WARN_ON_ONCE() for !access_request check in
init_layer_masks().
---
include/uapi/linux/landlock.h | 27 +-
security/landlock/fs.c | 600 ++++++++++++++++---
security/landlock/limits.h | 2 +-
security/landlock/syscalls.c | 2 +-
tools/testing/selftests/landlock/base_test.c | 2 +-
tools/testing/selftests/landlock/fs_test.c | 3 +-
6 files changed, 556 insertions(+), 80 deletions(-)
diff --git a/include/uapi/linux/landlock.h b/include/uapi/linux/landlock.h
index 21c8d58283c9..23df4e0e8ace 100644
--- a/include/uapi/linux/landlock.h
+++ b/include/uapi/linux/landlock.h
@@ -21,8 +21,14 @@ struct landlock_ruleset_attr {
/**
* @handled_access_fs: Bitmask of actions (cf. `Filesystem flags`_)
* that is handled by this ruleset and should then be forbidden if no
- * rule explicitly allow them. This is needed for backward
- * compatibility reasons.
+ * rule explicitly allow them: it is a deny-by-default list that should
+ * contain as much Landlock access rights as possible. Indeed, all
+ * Landlock filesystem access rights that are not part of
+ * handled_access_fs are allowed. This is needed for backward
+ * compatibility reasons. One exception is the
+ * LANDLOCK_ACCESS_FS_REFER access right, which is always implicitly
+ * handled, but must still be explicitly handled to add new rules with
+ * this access right.
*/
__u64 handled_access_fs;
};
@@ -112,6 +118,22 @@ struct landlock_path_beneath_attr {
* - %LANDLOCK_ACCESS_FS_MAKE_FIFO: Create (or rename or link) a named pipe.
* - %LANDLOCK_ACCESS_FS_MAKE_BLOCK: Create (or rename or link) a block device.
* - %LANDLOCK_ACCESS_FS_MAKE_SYM: Create (or rename or link) a symbolic link.
+ * - %LANDLOCK_ACCESS_FS_REFER: Link or rename a file from or to a different
+ * directory (i.e. reparent a file hierarchy). This access right is
+ * available since the second version of the Landlock ABI. This is also the
+ * only access right which is always considered handled by any ruleset in
+ * such a way that reparenting a file hierarchy is always denied by default.
+ * To avoid privilege escalation, it is not enough to add a rule with this
+ * access right. When linking or renaming a file, the destination directory
+ * hierarchy must also always have the same or a superset of restrictions of
+ * the source hierarchy. If it is not the case, or if the domain doesn't
+ * handle this access right, such actions are denied by default with errno
+ * set to EXDEV. Linking also requires a LANDLOCK_ACCESS_FS_MAKE_* access
+ * right on the destination directory, and renaming also requires a
+ * LANDLOCK_ACCESS_FS_REMOVE_* access right on the source's (file or
+ * directory) parent. Otherwise, such actions are denied with errno set to
+ * EACCES. The EACCES errno prevails over EXDEV to let user space
+ * efficiently deal with an unrecoverable error.
*
* .. warning::
*
@@ -137,6 +159,7 @@ struct landlock_path_beneath_attr {
#define LANDLOCK_ACCESS_FS_MAKE_FIFO (1ULL << 10)
#define LANDLOCK_ACCESS_FS_MAKE_BLOCK (1ULL << 11)
#define LANDLOCK_ACCESS_FS_MAKE_SYM (1ULL << 12)
+#define LANDLOCK_ACCESS_FS_REFER (1ULL << 13)
/* clang-format on */
#endif /* _UAPI_LINUX_LANDLOCK_H */
diff --git a/security/landlock/fs.c b/security/landlock/fs.c
index 30b42cdee52e..ec5a6247cd3e 100644
--- a/security/landlock/fs.c
+++ b/security/landlock/fs.c
@@ -4,6 +4,7 @@
*
* Copyright © 2016-2020 Mickaël Salaün <[email protected]>
* Copyright © 2018-2020 ANSSI
+ * Copyright © 2021-2022 Microsoft Corporation
*/
#include <linux/atomic.h>
@@ -273,40 +274,262 @@ static inline bool is_nouser_or_private(const struct dentry *dentry)
unlikely(IS_PRIVATE(d_backing_inode(dentry))));
}
-static int check_access_path(const struct landlock_ruleset *const domain,
- const struct path *const path,
- const access_mask_t access_request)
+static inline access_mask_t
+get_handled_accesses(const struct landlock_ruleset *const domain)
{
- layer_mask_t layer_masks[LANDLOCK_NUM_ACCESS_FS] = {};
- bool allowed = false, has_access = false;
- struct path walker_path;
- size_t i;
+ access_mask_t access_dom = 0;
+ unsigned long access_bit;
+
+ for (access_bit = 0; access_bit < LANDLOCK_NUM_ACCESS_FS;
+ access_bit++) {
+ size_t layer_level;
+
+ for (layer_level = 0; layer_level < domain->num_layers;
+ layer_level++) {
+ if (domain->fs_access_masks[layer_level] &
+ BIT_ULL(access_bit)) {
+ access_dom |= BIT_ULL(access_bit);
+ break;
+ }
+ }
+ }
+ return access_dom;
+}
+
+static inline access_mask_t
+init_layer_masks(const struct landlock_ruleset *const domain,
+ const access_mask_t access_request,
+ layer_mask_t (*const layer_masks)[LANDLOCK_NUM_ACCESS_FS])
+{
+ access_mask_t handled_accesses = 0;
+ size_t layer_level;
+ memset(layer_masks, 0, sizeof(*layer_masks));
+ /* An empty access request can happen because of O_WRONLY | O_RDWR. */
if (!access_request)
return 0;
- if (WARN_ON_ONCE(!domain || !path))
- return 0;
- if (is_nouser_or_private(path->dentry))
- return 0;
- if (WARN_ON_ONCE(domain->num_layers < 1))
- return -EACCES;
- /* Saves all layers handling a subset of requested accesses. */
- for (i = 0; i < domain->num_layers; i++) {
+ /* Saves all handled accesses per layer. */
+ for (layer_level = 0; layer_level < domain->num_layers; layer_level++) {
const unsigned long access_req = access_request;
unsigned long access_bit;
for_each_set_bit(access_bit, &access_req,
- ARRAY_SIZE(layer_masks)) {
- if (domain->fs_access_masks[i] & BIT_ULL(access_bit)) {
- layer_masks[access_bit] |= BIT_ULL(i);
- has_access = true;
+ ARRAY_SIZE(*layer_masks)) {
+ if (domain->fs_access_masks[layer_level] &
+ BIT_ULL(access_bit)) {
+ (*layer_masks)[access_bit] |=
+ BIT_ULL(layer_level);
+ handled_accesses |= BIT_ULL(access_bit);
}
}
}
- /* An access request not handled by the domain is allowed. */
- if (!has_access)
+ return handled_accesses;
+}
+
+/*
+ * Check that a destination file hierarchy has more restrictions than a source
+ * file hierarchy. This is only used for link and rename actions.
+ *
+ * @layer_masks_child2: Optional child masks.
+ */
+static inline bool no_more_access(
+ const layer_mask_t (*const layer_masks_parent1)[LANDLOCK_NUM_ACCESS_FS],
+ const layer_mask_t (*const layer_masks_child1)[LANDLOCK_NUM_ACCESS_FS],
+ const bool child1_is_directory,
+ const layer_mask_t (*const layer_masks_parent2)[LANDLOCK_NUM_ACCESS_FS],
+ const layer_mask_t (*const layer_masks_child2)[LANDLOCK_NUM_ACCESS_FS],
+ const bool child2_is_directory)
+{
+ unsigned long access_bit;
+
+ for (access_bit = 0; access_bit < ARRAY_SIZE(*layer_masks_parent2);
+ access_bit++) {
+ /* Ignores accesses that only make sense for directories. */
+ const bool is_file_access =
+ !!(BIT_ULL(access_bit) & ACCESS_FILE);
+
+ if (child1_is_directory || is_file_access) {
+ /*
+ * Checks if the destination restrictions are a
+ * superset of the source ones (i.e. inherited access
+ * rights without child exceptions):
+ * restrictions(parent2) >= restrictions(child1)
+ */
+ if ((((*layer_masks_parent1)[access_bit] &
+ (*layer_masks_child1)[access_bit]) |
+ (*layer_masks_parent2)[access_bit]) !=
+ (*layer_masks_parent2)[access_bit])
+ return false;
+ }
+
+ if (!layer_masks_child2)
+ continue;
+ if (child2_is_directory || is_file_access) {
+ /*
+ * Checks inverted restrictions for RENAME_EXCHANGE:
+ * restrictions(parent1) >= restrictions(child2)
+ */
+ if ((((*layer_masks_parent2)[access_bit] &
+ (*layer_masks_child2)[access_bit]) |
+ (*layer_masks_parent1)[access_bit]) !=
+ (*layer_masks_parent1)[access_bit])
+ return false;
+ }
+ }
+ return true;
+}
+
+/*
+ * Removes @layer_masks accesses that are not requested.
+ *
+ * Returns true if the request is allowed, false otherwise.
+ */
+static inline bool
+scope_to_request(const access_mask_t access_request,
+ layer_mask_t (*const layer_masks)[LANDLOCK_NUM_ACCESS_FS])
+{
+ const unsigned long access_req = access_request;
+ unsigned long access_bit;
+
+ if (WARN_ON_ONCE(!layer_masks))
+ return true;
+
+ for_each_clear_bit(access_bit, &access_req, ARRAY_SIZE(*layer_masks))
+ (*layer_masks)[access_bit] = 0;
+ return !memchr_inv(layer_masks, 0, sizeof(*layer_masks));
+}
+
+/*
+ * Returns true if there is at least one access right different than
+ * LANDLOCK_ACCESS_FS_REFER.
+ */
+static inline bool
+is_eacces(const layer_mask_t (*const layer_masks)[LANDLOCK_NUM_ACCESS_FS],
+ const access_mask_t access_request)
+{
+ unsigned long access_bit;
+ /* LANDLOCK_ACCESS_FS_REFER alone must return -EXDEV. */
+ const unsigned long access_check = access_request &
+ ~LANDLOCK_ACCESS_FS_REFER;
+
+ if (!layer_masks)
+ return false;
+
+ for_each_set_bit(access_bit, &access_check, ARRAY_SIZE(*layer_masks)) {
+ if ((*layer_masks)[access_bit])
+ return true;
+ }
+ return false;
+}
+
+/**
+ * check_access_path_dual - Check accesses for requests with a common path
+ *
+ * @domain: Domain to check against.
+ * @path: File hierarchy to walk through.
+ * @access_request_parent1: Accesses to check, once @layer_masks_parent1 is
+ * equal to @layer_masks_parent2 (if any). This is tied to the unique
+ * requested path for most actions, or the source in case of a refer action
+ * (i.e. rename or link), or the source and destination in case of
+ * RENAME_EXCHANGE.
+ * @layer_masks_parent1: Pointer to a matrix of layer masks per access
+ * masks, identifying the layers that forbid a specific access. Bits from
+ * this matrix can be unset according to the @path walk. An empty matrix
+ * means that @domain allows all possible Landlock accesses (i.e. not only
+ * those identified by @access_request_parent1). This matrix can
+ * initially refer to domain layer masks and, when the accesses for the
+ * destination and source are the same, to requested layer masks.
+ * @dentry_child1: Dentry to the initial child of the parent1 path. This
+ * pointer must be NULL for non-refer actions (i.e. not link nor rename).
+ * @access_request_parent2: Similar to @access_request_parent1 but for a
+ * request involving a source and a destination. This refers to the
+ * destination, except in case of RENAME_EXCHANGE where it also refers to
+ * the source. Must be set to 0 when using a simple path request.
+ * @layer_masks_parent2: Similar to @layer_masks_parent1 but for a refer
+ * action. This must be NULL otherwise.
+ * @dentry_child2: Dentry to the initial child of the parent2 path. This
+ * pointer is only set for RENAME_EXCHANGE actions and must be NULL
+ * otherwise.
+ *
+ * This helper first checks that the destination has a superset of restrictions
+ * compared to the source (if any) for a common path. Because of
+ * RENAME_EXCHANGE actions, source and destinations may be swapped. It then
+ * checks that the collected accesses and the remaining ones are enough to
+ * allow the request.
+ *
+ * Returns:
+ * - 0 if the access request is granted;
+ * - -EACCES if it is denied because of access right other than
+ * LANDLOCK_ACCESS_FS_REFER;
+ * - -EXDEV if the renaming or linking would be a privileged escalation
+ * (according to each layered policies), or if LANDLOCK_ACCESS_FS_REFER is
+ * not allowed by the source or the destination.
+ */
+static int check_access_path_dual(
+ const struct landlock_ruleset *const domain,
+ const struct path *const path,
+ const access_mask_t access_request_parent1,
+ layer_mask_t (*const layer_masks_parent1)[LANDLOCK_NUM_ACCESS_FS],
+ const struct dentry *const dentry_child1,
+ const access_mask_t access_request_parent2,
+ layer_mask_t (*const layer_masks_parent2)[LANDLOCK_NUM_ACCESS_FS],
+ const struct dentry *const dentry_child2)
+{
+ bool allowed_parent1 = false, allowed_parent2 = false, is_dom_check,
+ child1_is_directory = true, child2_is_directory = true;
+ struct path walker_path;
+ access_mask_t access_masked_parent1, access_masked_parent2;
+ layer_mask_t _layer_masks_child1[LANDLOCK_NUM_ACCESS_FS],
+ _layer_masks_child2[LANDLOCK_NUM_ACCESS_FS];
+ layer_mask_t(*layer_masks_child1)[LANDLOCK_NUM_ACCESS_FS] = NULL,
+ (*layer_masks_child2)[LANDLOCK_NUM_ACCESS_FS] = NULL;
+
+ if (!access_request_parent1 && !access_request_parent2)
return 0;
+ if (WARN_ON_ONCE(!domain || !path))
+ return 0;
+ if (is_nouser_or_private(path->dentry))
+ return 0;
+ if (WARN_ON_ONCE(domain->num_layers < 1 || !layer_masks_parent1))
+ return -EACCES;
+
+ if (unlikely(layer_masks_parent2)) {
+ if (WARN_ON_ONCE(!dentry_child1))
+ return -EACCES;
+ /*
+ * For a double request, first check for potential privilege
+ * escalation by looking at domain handled accesses (which are
+ * a superset of the meaningful requested accesses).
+ */
+ access_masked_parent1 = access_masked_parent2 =
+ get_handled_accesses(domain);
+ is_dom_check = true;
+ } else {
+ if (WARN_ON_ONCE(dentry_child1 || dentry_child2))
+ return -EACCES;
+ /* For a simple request, only check for requested accesses. */
+ access_masked_parent1 = access_request_parent1;
+ access_masked_parent2 = access_request_parent2;
+ is_dom_check = false;
+ }
+
+ if (unlikely(dentry_child1)) {
+ unmask_layers(find_rule(domain, dentry_child1),
+ init_layer_masks(domain, LANDLOCK_MASK_ACCESS_FS,
+ &_layer_masks_child1),
+ &_layer_masks_child1);
+ layer_masks_child1 = &_layer_masks_child1;
+ child1_is_directory = d_is_dir(dentry_child1);
+ }
+ if (unlikely(dentry_child2)) {
+ unmask_layers(find_rule(domain, dentry_child2),
+ init_layer_masks(domain, LANDLOCK_MASK_ACCESS_FS,
+ &_layer_masks_child2),
+ &_layer_masks_child2);
+ layer_masks_child2 = &_layer_masks_child2;
+ child2_is_directory = d_is_dir(dentry_child2);
+ }
walker_path = *path;
path_get(&walker_path);
@@ -316,11 +539,52 @@ static int check_access_path(const struct landlock_ruleset *const domain,
*/
while (true) {
struct dentry *parent_dentry;
+ const struct landlock_rule *rule;
+
+ /*
+ * If at least all accesses allowed on the destination are
+ * already allowed on the source, respectively if there is at
+ * least as much as restrictions on the destination than on the
+ * source, then we can safely refer files from the source to
+ * the destination without risking a privilege escalation.
+ * This also applies in the case of RENAME_EXCHANGE, which
+ * implies checks on both direction. This is crucial for
+ * standalone multilayered security policies. Furthermore,
+ * this helps avoid policy writers to shoot themselves in the
+ * foot.
+ */
+ if (unlikely(is_dom_check &&
+ no_more_access(
+ layer_masks_parent1, layer_masks_child1,
+ child1_is_directory, layer_masks_parent2,
+ layer_masks_child2,
+ child2_is_directory))) {
+ allowed_parent1 = scope_to_request(
+ access_request_parent1, layer_masks_parent1);
+ allowed_parent2 = scope_to_request(
+ access_request_parent2, layer_masks_parent2);
+
+ /* Stops when all accesses are granted. */
+ if (allowed_parent1 && allowed_parent2)
+ break;
- allowed = unmask_layers(find_rule(domain, walker_path.dentry),
- access_request, &layer_masks);
- if (allowed)
- /* Stops when a rule from each layer grants access. */
+ /*
+ * Now, downgrades the remaining checks from domain
+ * handled accesses to requested accesses.
+ */
+ is_dom_check = false;
+ access_masked_parent1 = access_request_parent1;
+ access_masked_parent2 = access_request_parent2;
+ }
+
+ rule = find_rule(domain, walker_path.dentry);
+ allowed_parent1 = unmask_layers(rule, access_masked_parent1,
+ layer_masks_parent1);
+ allowed_parent2 = unmask_layers(rule, access_masked_parent2,
+ layer_masks_parent2);
+
+ /* Stops when a rule from each layer grants access. */
+ if (allowed_parent1 && allowed_parent2)
break;
jump_up:
@@ -333,7 +597,6 @@ static int check_access_path(const struct landlock_ruleset *const domain,
* Stops at the real root. Denies access
* because not all layers have granted access.
*/
- allowed = false;
break;
}
}
@@ -343,7 +606,8 @@ static int check_access_path(const struct landlock_ruleset *const domain,
* access to internal filesystems (e.g. nsfs, which is
* reachable through /proc/<pid>/ns/<namespace>).
*/
- allowed = !!(walker_path.mnt->mnt_flags & MNT_INTERNAL);
+ allowed_parent1 = allowed_parent2 =
+ !!(walker_path.mnt->mnt_flags & MNT_INTERNAL);
break;
}
parent_dentry = dget_parent(walker_path.dentry);
@@ -351,7 +615,36 @@ static int check_access_path(const struct landlock_ruleset *const domain,
walker_path.dentry = parent_dentry;
}
path_put(&walker_path);
- return allowed ? 0 : -EACCES;
+
+ if (allowed_parent1 && allowed_parent2)
+ return 0;
+
+ /*
+ * This prioritizes EACCES over EXDEV for all actions, including
+ * renames with RENAME_EXCHANGE.
+ */
+ if (likely(is_eacces(layer_masks_parent1, access_request_parent1) ||
+ is_eacces(layer_masks_parent2, access_request_parent2)))
+ return -EACCES;
+
+ /*
+ * Gracefully forbids reparenting if the destination directory
+ * hierarchy is not a superset of restrictions of the source directory
+ * hierarchy, or if LANDLOCK_ACCESS_FS_REFER is not allowed by the
+ * source or the destination.
+ */
+ return -EXDEV;
+}
+
+static inline int check_access_path(const struct landlock_ruleset *const domain,
+ const struct path *const path,
+ access_mask_t access_request)
+{
+ layer_mask_t layer_masks[LANDLOCK_NUM_ACCESS_FS] = {};
+
+ access_request = init_layer_masks(domain, access_request, &layer_masks);
+ return check_access_path_dual(domain, path, access_request,
+ &layer_masks, NULL, 0, NULL, NULL);
}
static inline int current_check_access_path(const struct path *const path,
@@ -398,6 +691,206 @@ static inline access_mask_t maybe_remove(const struct dentry *const dentry)
LANDLOCK_ACCESS_FS_REMOVE_FILE;
}
+/**
+ * collect_domain_accesses - Walk through a file path and collect accesses
+ *
+ * @domain: Domain to check against.
+ * @mnt_root: Last directory to check.
+ * @dir: Directory to start the walk from.
+ * @layer_masks_dom: Where to store the collected accesses.
+ *
+ * This helper is useful to begin a path walk from the @dir directory to a
+ * @mnt_root directory used as a mount point. This mount point is the common
+ * ancestor between the source and the destination of a renamed and linked
+ * file. While walking from @dir to @mnt_root, we record all the domain's
+ * allowed accesses in @layer_masks_dom.
+ *
+ * This is similar to check_access_path_dual() but much simpler because it only
+ * handles walking on the same mount point and only check one set of accesses.
+ *
+ * Returns:
+ * - true if all the domain access rights are allowed for @dir;
+ * - false if the walk reached @mnt_root.
+ */
+static bool collect_domain_accesses(
+ const struct landlock_ruleset *const domain,
+ const struct dentry *const mnt_root, struct dentry *dir,
+ layer_mask_t (*const layer_masks_dom)[LANDLOCK_NUM_ACCESS_FS])
+{
+ unsigned long access_dom;
+ bool ret = false;
+
+ if (WARN_ON_ONCE(!domain || !mnt_root || !dir || !layer_masks_dom))
+ return true;
+ if (is_nouser_or_private(dir))
+ return true;
+
+ access_dom = init_layer_masks(domain, LANDLOCK_MASK_ACCESS_FS,
+ layer_masks_dom);
+
+ dget(dir);
+ while (true) {
+ struct dentry *parent_dentry;
+
+ /* Gets all layers allowing all domain accesses. */
+ if (unmask_layers(find_rule(domain, dir), access_dom,
+ layer_masks_dom)) {
+ /*
+ * Stops when all handled accesses are allowed by at
+ * least one rule in each layer.
+ */
+ ret = true;
+ break;
+ }
+
+ /* We should not reach a root other than @mnt_root. */
+ if (dir == mnt_root || WARN_ON_ONCE(IS_ROOT(dir)))
+ break;
+
+ parent_dentry = dget_parent(dir);
+ dput(dir);
+ dir = parent_dentry;
+ }
+ dput(dir);
+ return ret;
+}
+
+/**
+ * current_check_refer_path - Check if a rename or link action is allowed
+ *
+ * @old_dentry: File or directory requested to be moved or linked.
+ * @new_dir: Destination parent directory.
+ * @new_dentry: Destination file or directory.
+ * @removable: Sets to true if it is a rename operation.
+ * @exchange: Sets to true if it is a rename operation with RENAME_EXCHANGE.
+ *
+ * Because of its unprivileged constraints, Landlock relies on file hierarchies
+ * (and not only inodes) to tie access rights to files. Being able to link or
+ * rename a file hierarchy brings some challenges. Indeed, moving or linking a
+ * file (i.e. creating a new reference to an inode) can have an impact on the
+ * actions allowed for a set of files if it would change its parent directory
+ * (i.e. reparenting).
+ *
+ * To avoid trivial access right bypasses, Landlock first checks if the file or
+ * directory requested to be moved would gain new access rights inherited from
+ * its new hierarchy. Before returning any error, Landlock then checks that
+ * the parent source hierarchy and the destination hierarchy would allow the
+ * link or rename action. If it is not the case, an error with EACCES is
+ * returned to inform user space that there is no way to remove or create the
+ * requested source file type. If it should be allowed but the new inherited
+ * access rights would be greater than the source access rights, then the
+ * kernel returns an error with EXDEV. Prioritizing EACCES over EXDEV enables
+ * user space to abort the whole operation if there is no way to do it, or to
+ * manually copy the source to the destination if this remains allowed, e.g.
+ * because file creation is allowed on the destination directory but not direct
+ * linking.
+ *
+ * To achieve this goal, the kernel needs to compare two file hierarchies: the
+ * one identifying the source file or directory (including itself), and the
+ * destination one. This can be seen as a multilayer partial ordering problem.
+ * The kernel walks through these paths and collects in a matrix the access
+ * rights that are denied per layer. These matrices are then compared to see
+ * if the destination one has more (or the same) restrictions as the source
+ * one. If this is the case, the requested action will not return EXDEV, which
+ * doesn't mean the action is allowed. The parent hierarchy of the source
+ * (i.e. parent directory), and the destination hierarchy must also be checked
+ * to verify that they explicitly allow such action (i.e. referencing,
+ * creation and potentially removal rights). The kernel implementation is then
+ * required to rely on potentially four matrices of access rights: one for the
+ * source file or directory (i.e. the child), a potentially other one for the
+ * other source/destination (in case of RENAME_EXCHANGE), one for the source
+ * parent hierarchy and a last one for the destination hierarchy. These
+ * ephemeral matrices take some space on the stack, which limits the number of
+ * layers to a deemed reasonable number: 16.
+ *
+ * Returns:
+ * - 0 if access is allowed;
+ * - -EXDEV if @old_dentry would inherit new access rights from @new_dir;
+ * - -EACCES if file removal or creation is denied.
+ */
+static int current_check_refer_path(struct dentry *const old_dentry,
+ const struct path *const new_dir,
+ struct dentry *const new_dentry,
+ const bool removable, const bool exchange)
+{
+ const struct landlock_ruleset *const dom =
+ landlock_get_current_domain();
+ bool allow_parent1, allow_parent2;
+ access_mask_t access_request_parent1, access_request_parent2;
+ struct path mnt_dir;
+ layer_mask_t layer_masks_parent1[LANDLOCK_NUM_ACCESS_FS],
+ layer_masks_parent2[LANDLOCK_NUM_ACCESS_FS];
+
+ if (!dom)
+ return 0;
+ if (WARN_ON_ONCE(dom->num_layers < 1))
+ return -EACCES;
+ if (unlikely(d_is_negative(old_dentry)))
+ return -ENOENT;
+ if (exchange) {
+ if (unlikely(d_is_negative(new_dentry)))
+ return -ENOENT;
+ access_request_parent1 =
+ get_mode_access(d_backing_inode(new_dentry)->i_mode);
+ } else {
+ access_request_parent1 = 0;
+ }
+ access_request_parent2 =
+ get_mode_access(d_backing_inode(old_dentry)->i_mode);
+ if (removable) {
+ access_request_parent1 |= maybe_remove(old_dentry);
+ access_request_parent2 |= maybe_remove(new_dentry);
+ }
+
+ /* The mount points are the same for old and new paths, cf. EXDEV. */
+ if (old_dentry->d_parent == new_dir->dentry) {
+ /*
+ * The LANDLOCK_ACCESS_FS_REFER access right is not required
+ * for same-directory referer (i.e. no reparenting).
+ */
+ access_request_parent1 = init_layer_masks(
+ dom, access_request_parent1 | access_request_parent2,
+ &layer_masks_parent1);
+ return check_access_path_dual(dom, new_dir,
+ access_request_parent1,
+ &layer_masks_parent1, NULL, 0,
+ NULL, NULL);
+ }
+
+ /* Backward compatibility: no reparenting support. */
+ if (!(get_handled_accesses(dom) & LANDLOCK_ACCESS_FS_REFER))
+ return -EXDEV;
+
+ access_request_parent1 |= LANDLOCK_ACCESS_FS_REFER;
+ access_request_parent2 |= LANDLOCK_ACCESS_FS_REFER;
+
+ /* Saves the common mount point. */
+ mnt_dir.mnt = new_dir->mnt;
+ mnt_dir.dentry = new_dir->mnt->mnt_root;
+
+ /* new_dir->dentry is equal to new_dentry->d_parent */
+ allow_parent1 = collect_domain_accesses(dom, mnt_dir.dentry,
+ old_dentry->d_parent,
+ &layer_masks_parent1);
+ allow_parent2 = collect_domain_accesses(
+ dom, mnt_dir.dentry, new_dir->dentry, &layer_masks_parent2);
+
+ if (allow_parent1 && allow_parent2)
+ return 0;
+
+ /*
+ * To be able to compare source and destination domain access rights,
+ * take into account the @old_dentry access rights aggregated with its
+ * parent access rights. This will be useful to compare with the
+ * destination parent access rights.
+ */
+ return check_access_path_dual(dom, &mnt_dir, access_request_parent1,
+ &layer_masks_parent1, old_dentry,
+ access_request_parent2,
+ &layer_masks_parent2,
+ exchange ? new_dentry : NULL);
+}
+
/* Inode hooks */
static void hook_inode_free_security(struct inode *const inode)
@@ -591,32 +1084,12 @@ static int hook_sb_pivotroot(const struct path *const old_path,
/* Path hooks */
-/*
- * Creating multiple links or renaming may lead to privilege escalations if not
- * handled properly. Indeed, we must be sure that the source doesn't gain more
- * privileges by being accessible from the destination. This is getting more
- * complex when dealing with multiple layers. The whole picture can be seen as
- * a multilayer partial ordering problem. A future version of Landlock will
- * deal with that.
- */
static int hook_path_link(struct dentry *const old_dentry,
const struct path *const new_dir,
struct dentry *const new_dentry)
{
- const struct landlock_ruleset *const dom =
- landlock_get_current_domain();
-
- if (!dom)
- return 0;
- /* The mount points are the same for old and new paths, cf. EXDEV. */
- if (old_dentry->d_parent != new_dir->dentry)
- /* Gracefully forbids reparenting. */
- return -EXDEV;
- if (unlikely(d_is_negative(old_dentry)))
- return -ENOENT;
- return check_access_path(
- dom, new_dir,
- get_mode_access(d_backing_inode(old_dentry)->i_mode));
+ return current_check_refer_path(old_dentry, new_dir, new_dentry, false,
+ false);
}
static int hook_path_rename(const struct path *const old_dir,
@@ -625,30 +1098,9 @@ static int hook_path_rename(const struct path *const old_dir,
struct dentry *const new_dentry,
const unsigned int flags)
{
- const struct landlock_ruleset *const dom =
- landlock_get_current_domain();
- u32 exchange_access = 0;
-
- if (!dom)
- return 0;
- /* The mount points are the same for old and new paths, cf. EXDEV. */
- if (old_dir->dentry != new_dir->dentry)
- /* Gracefully forbids reparenting. */
- return -EXDEV;
- if (flags & RENAME_EXCHANGE) {
- if (unlikely(d_is_negative(new_dentry)))
- return -ENOENT;
- exchange_access =
- get_mode_access(d_backing_inode(new_dentry)->i_mode);
- }
- if (unlikely(d_is_negative(old_dentry)))
- return -ENOENT;
- /* RENAME_EXCHANGE is handled because directories are the same. */
- return check_access_path(
- dom, old_dir,
- maybe_remove(old_dentry) | maybe_remove(new_dentry) |
- exchange_access |
- get_mode_access(d_backing_inode(old_dentry)->i_mode));
+ /* old_dir refers to old_dentry->d_parent and new_dir->mnt */
+ return current_check_refer_path(old_dentry, new_dir, new_dentry, true,
+ !!(flags & RENAME_EXCHANGE));
}
static int hook_path_mkdir(const struct path *const dir,
diff --git a/security/landlock/limits.h b/security/landlock/limits.h
index 17c2a2e7fe1e..b54184ab9439 100644
--- a/security/landlock/limits.h
+++ b/security/landlock/limits.h
@@ -18,7 +18,7 @@
#define LANDLOCK_MAX_NUM_LAYERS 16
#define LANDLOCK_MAX_NUM_RULES U32_MAX
-#define LANDLOCK_LAST_ACCESS_FS LANDLOCK_ACCESS_FS_MAKE_SYM
+#define LANDLOCK_LAST_ACCESS_FS LANDLOCK_ACCESS_FS_REFER
#define LANDLOCK_MASK_ACCESS_FS ((LANDLOCK_LAST_ACCESS_FS << 1) - 1)
#define LANDLOCK_NUM_ACCESS_FS __const_hweight64(LANDLOCK_MASK_ACCESS_FS)
diff --git a/security/landlock/syscalls.c b/security/landlock/syscalls.c
index 507d43827afe..735a0865ea11 100644
--- a/security/landlock/syscalls.c
+++ b/security/landlock/syscalls.c
@@ -129,7 +129,7 @@ static const struct file_operations ruleset_fops = {
.write = fop_dummy_write,
};
-#define LANDLOCK_ABI_VERSION 1
+#define LANDLOCK_ABI_VERSION 2
/**
* sys_landlock_create_ruleset - Create a new ruleset
diff --git a/tools/testing/selftests/landlock/base_test.c b/tools/testing/selftests/landlock/base_test.c
index 35f64832b869..da9290817866 100644
--- a/tools/testing/selftests/landlock/base_test.c
+++ b/tools/testing/selftests/landlock/base_test.c
@@ -75,7 +75,7 @@ TEST(abi_version)
const struct landlock_ruleset_attr ruleset_attr = {
.handled_access_fs = LANDLOCK_ACCESS_FS_READ_FILE,
};
- ASSERT_EQ(1, landlock_create_ruleset(NULL, 0,
+ ASSERT_EQ(2, landlock_create_ruleset(NULL, 0,
LANDLOCK_CREATE_RULESET_VERSION));
ASSERT_EQ(-1, landlock_create_ruleset(&ruleset_attr, 0,
diff --git a/tools/testing/selftests/landlock/fs_test.c b/tools/testing/selftests/landlock/fs_test.c
index a4fdcda62bde..69f9c7409198 100644
--- a/tools/testing/selftests/landlock/fs_test.c
+++ b/tools/testing/selftests/landlock/fs_test.c
@@ -401,7 +401,7 @@ TEST_F_FORK(layout1, inval)
LANDLOCK_ACCESS_FS_WRITE_FILE | \
LANDLOCK_ACCESS_FS_READ_FILE)
-#define ACCESS_LAST LANDLOCK_ACCESS_FS_MAKE_SYM
+#define ACCESS_LAST LANDLOCK_ACCESS_FS_REFER
#define ACCESS_ALL ( \
ACCESS_FILE | \
@@ -414,6 +414,7 @@ TEST_F_FORK(layout1, inval)
LANDLOCK_ACCESS_FS_MAKE_SOCK | \
LANDLOCK_ACCESS_FS_MAKE_FIFO | \
LANDLOCK_ACCESS_FS_MAKE_BLOCK | \
+ LANDLOCK_ACCESS_FS_MAKE_SYM | \
ACCESS_LAST)
/* clang-format on */
--
2.35.1
The four related patch series are available here:
https://git.kernel.org/pub/scm/linux/kernel/git/mic/linux.git/log/?h=landlock-wip
On 06/05/2022 18:10, Mickaël Salaün wrote:
> Hi,
>
> This third patch series is mostly a rebase with some whitespace changes
> because of clang-format. There is also some new "unlikely()" calls and
> minor code cleanup.
>
> Test coverage for security/landlock was 94.4% of 504 lines (with the
> previous patch series), and it is now 95.4% of 604 lines according to
> gcc/gcov-11.
>
> Problem
> =======
>
> One of the most annoying limitations of Landlock is that sandboxed
> processes can only link and rename files to the same directory (i.e.
> file reparenting is always denied). Indeed, because of the unprivileged
> nature of Landlock, file hierarchy are identified thanks to ephemeral
> inode tagging, which may cause arbitrary renaming and linking to change
> the security policy in an unexpected way.
>
> Solution
> ========
>
> This patch series brings a new access right, LANDLOCK_ACCESS_FS_REFER,
> which enables to allow safe file linking and renaming. In a nutshell,
> Landlock checks that the inherited access rights of a moved or renamed
> file cannot increase but only reduce. Eleven new test suits cover file
> renaming and linking, which improves test coverage.
>
> The documentation and the tutorial is extended with this new access
> right, along with more explanations about backward and forward
> compatibility, good practices, and a bit about the current access
> rights rational.
>
> While developing this new feature, I also found an issue with the
> current implementation of Landlock. In some (rare) cases, sandboxed
> processes may be more restricted than intended. Indeed, because of the
> current way to check file hierarchy access rights, composition of rules
> may be incomplete when requesting multiple accesses at the same time.
> This is fixed with a dedicated patch involving some refactoring. A new
> test suite checks relevant new edge cases.
>
> As a side effect, and to limit the increased use of the stack, I reduced
> the number of Landlock nested domains from 64 to 16. I think this
> should be more than enough for legitimate use cases, but feel free to
> challenge this decision with real and legitimate use cases.
>
> Additionally, a new dedicated syzkaller test has been developed to cover
> new paths.
>
> This patch series is based on and was developed with some complementary
> new tests sent in a standalone patch series:
> https://lore.kernel.org/r/[email protected]
>
> Previous versions:
> v2: https://lore.kernel.org/r/[email protected]
> v1: https://lore.kernel.org/r/[email protected]
>
> Regards,
>
> Mickaël Salaün (12):
> landlock: Define access_mask_t to enforce a consistent access mask
> size
> landlock: Reduce the maximum number of layers to 16
> landlock: Create find_rule() from unmask_layers()
> landlock: Fix same-layer rule unions
> landlock: Move filesystem helpers and add a new one
> LSM: Remove double path_rename hook calls for RENAME_EXCHANGE
> landlock: Add support for file reparenting with
> LANDLOCK_ACCESS_FS_REFER
> selftests/landlock: Add 11 new test suites dedicated to file
> reparenting
> samples/landlock: Add support for file reparenting
> landlock: Document LANDLOCK_ACCESS_FS_REFER and ABI versioning
> landlock: Document good practices about filesystem policies
> landlock: Add design choices documentation for filesystem access
> rights
>
> Documentation/security/landlock.rst | 17 +-
> Documentation/userspace-api/landlock.rst | 151 ++-
> include/linux/lsm_hook_defs.h | 2 +-
> include/linux/lsm_hooks.h | 1 +
> include/uapi/linux/landlock.h | 27 +-
> samples/landlock/sandboxer.c | 40 +-
> security/apparmor/lsm.c | 30 +-
> security/landlock/fs.c | 771 ++++++++++---
> security/landlock/fs.h | 2 +-
> security/landlock/limits.h | 6 +-
> security/landlock/ruleset.c | 6 +-
> security/landlock/ruleset.h | 22 +-
> security/landlock/syscalls.c | 2 +-
> security/security.c | 9 +-
> security/tomoyo/tomoyo.c | 11 +-
> tools/testing/selftests/landlock/base_test.c | 2 +-
> tools/testing/selftests/landlock/fs_test.c | 1039 ++++++++++++++++--
> 17 files changed, 1853 insertions(+), 285 deletions(-)
>
>
> base-commit: 4b0cdb0cf6eefa7521322007931ccfb7edc96c53
This refactoring will be useful in a following commit.
Reviewed-by: Paul Moore <[email protected]>
Signed-off-by: Mickaël Salaün <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
---
Changes since v2:
* Format with clang-format and rebase.
Changes since v1:
* Add Reviewed-by: Paul Moore.
---
security/landlock/fs.c | 41 ++++++++++++++++++++++++++++-------------
1 file changed, 28 insertions(+), 13 deletions(-)
diff --git a/security/landlock/fs.c b/security/landlock/fs.c
index f48c0a3b1e75..20953bff8fd5 100644
--- a/security/landlock/fs.c
+++ b/security/landlock/fs.c
@@ -183,23 +183,36 @@ int landlock_append_fs_rule(struct landlock_ruleset *const ruleset,
/* Access-control management */
-static inline layer_mask_t
-unmask_layers(const struct landlock_ruleset *const domain,
- const struct path *const path, const access_mask_t access_request,
- layer_mask_t layer_mask)
+/*
+ * The lifetime of the returned rule is tied to @domain.
+ *
+ * Returns NULL if no rule is found or if @dentry is negative.
+ */
+static inline const struct landlock_rule *
+find_rule(const struct landlock_ruleset *const domain,
+ const struct dentry *const dentry)
{
const struct landlock_rule *rule;
const struct inode *inode;
- size_t i;
- if (d_is_negative(path->dentry))
- /* Ignore nonexistent leafs. */
- return layer_mask;
- inode = d_backing_inode(path->dentry);
+ /* Ignores nonexistent leafs. */
+ if (d_is_negative(dentry))
+ return NULL;
+
+ inode = d_backing_inode(dentry);
rcu_read_lock();
rule = landlock_find_rule(
domain, rcu_dereference(landlock_inode(inode)->object));
rcu_read_unlock();
+ return rule;
+}
+
+static inline layer_mask_t unmask_layers(const struct landlock_rule *const rule,
+ const access_mask_t access_request,
+ layer_mask_t layer_mask)
+{
+ size_t layer_level;
+
if (!rule)
return layer_mask;
@@ -210,8 +223,9 @@ unmask_layers(const struct landlock_ruleset *const domain,
* the remaining layers for each inode, from the first added layer to
* the last one.
*/
- for (i = 0; i < rule->num_layers; i++) {
- const struct landlock_layer *const layer = &rule->layers[i];
+ for (layer_level = 0; layer_level < rule->num_layers; layer_level++) {
+ const struct landlock_layer *const layer =
+ &rule->layers[layer_level];
const layer_mask_t layer_bit = BIT_ULL(layer->level - 1);
/* Checks that the layer grants access to the full request. */
@@ -269,8 +283,9 @@ static int check_access_path(const struct landlock_ruleset *const domain,
while (true) {
struct dentry *parent_dentry;
- layer_mask = unmask_layers(domain, &walker_path, access_request,
- layer_mask);
+ layer_mask =
+ unmask_layers(find_rule(domain, walker_path.dentry),
+ access_request, layer_mask);
if (layer_mask == 0) {
/* Stops when a rule from each layer grants access. */
allowed = true;
--
2.35.1
Add LANDLOCK_ACCESS_FS_REFER in the example and properly check to only
use it if the current kernel support it thanks to the Landlock ABI
version.
Move the file renaming and linking limitation to a new "Previous
limitations" section.
Improve documentation about the backward and forward compatibility,
including the rational for ruleset's handled_access_fs.
Update the document date.
Reviewed-by: Paul Moore <[email protected]>
Signed-off-by: Mickaël Salaün <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
---
Changes since v2:
* Fix spelling spotted by Paul.
* Explain a bit more the renaming and linking constraint alternative
(previous limitations), which is implemented with
LANDLOCK_ACCESS_FS_REFER.
* Update date.
Changes since v1:
* Add Reviewed-by: Paul Moore.
* Update date.
---
Documentation/userspace-api/landlock.rst | 126 +++++++++++++++++++----
1 file changed, 106 insertions(+), 20 deletions(-)
diff --git a/Documentation/userspace-api/landlock.rst b/Documentation/userspace-api/landlock.rst
index b68e7a51009f..ae2aea986aa6 100644
--- a/Documentation/userspace-api/landlock.rst
+++ b/Documentation/userspace-api/landlock.rst
@@ -8,7 +8,7 @@ Landlock: unprivileged access control
=====================================
:Author: Mickaël Salaün
-:Date: March 2021
+:Date: May 2022
The goal of Landlock is to enable to restrict ambient rights (e.g. global
filesystem access) for a set of processes. Because Landlock is a stackable
@@ -29,14 +29,15 @@ the thread enforcing it, and its future children.
Defining and enforcing a security policy
----------------------------------------
-We first need to create the ruleset that will contain our rules. For this
+We first need to define the ruleset that will contain our rules. For this
example, the ruleset will contain rules that only allow read actions, but write
actions will be denied. The ruleset then needs to handle both of these kind of
-actions.
+actions. This is required for backward and forward compatibility (i.e. the
+kernel and user space may not know each other's supported restrictions), hence
+the need to be explicit about the denied-by-default access rights.
.. code-block:: c
- int ruleset_fd;
struct landlock_ruleset_attr ruleset_attr = {
.handled_access_fs =
LANDLOCK_ACCESS_FS_EXECUTE |
@@ -51,9 +52,34 @@ actions.
LANDLOCK_ACCESS_FS_MAKE_SOCK |
LANDLOCK_ACCESS_FS_MAKE_FIFO |
LANDLOCK_ACCESS_FS_MAKE_BLOCK |
- LANDLOCK_ACCESS_FS_MAKE_SYM,
+ LANDLOCK_ACCESS_FS_MAKE_SYM |
+ LANDLOCK_ACCESS_FS_REFER,
};
+Because we may not know on which kernel version an application will be
+executed, it is safer to follow a best-effort security approach. Indeed, we
+should try to protect users as much as possible whatever the kernel they are
+using. To avoid binary enforcement (i.e. either all security features or
+none), we can leverage a dedicated Landlock command to get the current version
+of the Landlock ABI and adapt the handled accesses. Let's check if we should
+remove the `LANDLOCK_ACCESS_FS_REFER` access right which is only supported
+starting with the second version of the ABI.
+
+.. code-block:: c
+
+ int abi;
+
+ abi = landlock_create_ruleset(NULL, 0, LANDLOCK_CREATE_RULESET_VERSION);
+ if (abi < 2) {
+ ruleset_attr.handled_access_fs &= ~LANDLOCK_ACCESS_FS_REFER;
+ }
+
+This enables to create an inclusive ruleset that will contain our rules.
+
+.. code-block:: c
+
+ int ruleset_fd;
+
ruleset_fd = landlock_create_ruleset(&ruleset_attr, sizeof(ruleset_attr), 0);
if (ruleset_fd < 0) {
perror("Failed to create a ruleset");
@@ -92,6 +118,11 @@ descriptor.
return 1;
}
+It may also be required to create rules following the same logic as explained
+for the ruleset creation, by filtering access rights according to the Landlock
+ABI version. In this example, this is not required because
+`LANDLOCK_ACCESS_FS_REFER` is not allowed by any rule.
+
We now have a ruleset with one rule allowing read access to ``/usr`` while
denying all other handled accesses for the filesystem. The next step is to
restrict the current thread from gaining more privileges (e.g. thanks to a SUID
@@ -192,6 +223,56 @@ To be allowed to use :manpage:`ptrace(2)` and related syscalls on a target
process, a sandboxed process should have a subset of the target process rules,
which means the tracee must be in a sub-domain of the tracer.
+Compatibility
+=============
+
+Backward and forward compatibility
+----------------------------------
+
+Landlock is designed to be compatible with past and future versions of the
+kernel. This is achieved thanks to the system call attributes and the
+associated bitflags, particularly the ruleset's `handled_access_fs`. Making
+handled access right explicit enables the kernel and user space to have a clear
+contract with each other. This is required to make sure sandboxing will not
+get stricter with a system update, which could break applications.
+
+Developers can subscribe to the `Landlock mailing list
+<https://subspace.kernel.org/lists.linux.dev.html>`_ to knowingly update and
+test their applications with the latest available features. In the interest of
+users, and because they may use different kernel versions, it is strongly
+encouraged to follow a best-effort security approach by checking the Landlock
+ABI version at runtime and only enforcing the supported features.
+
+Landlock ABI versions
+---------------------
+
+The Landlock ABI version can be read with the sys_landlock_create_ruleset()
+system call:
+
+.. code-block:: c
+
+ int abi;
+
+ abi = landlock_create_ruleset(NULL, 0, LANDLOCK_CREATE_RULESET_VERSION);
+ if (abi < 0) {
+ switch (errno) {
+ case ENOSYS:
+ printf("Landlock is not supported by the current kernel.\n");
+ break;
+ case EOPNOTSUPP:
+ printf("Landlock is currently disabled.\n");
+ break;
+ }
+ return 0;
+ }
+ if (abi >= 2) {
+ printf("Landlock supports LANDLOCK_ACCESS_FS_REFER.\n");
+ }
+
+The following kernel interfaces are implicitly supported by the first ABI
+version. Features only supported from a specific version are explicitly marked
+as such.
+
Kernel interface
================
@@ -228,21 +309,6 @@ Enforcing a ruleset
Current limitations
===================
-File renaming and linking
--------------------------
-
-Because Landlock targets unprivileged access controls, it is needed to properly
-handle composition of rules. Such property also implies rules nesting.
-Properly handling multiple layers of ruleset, each one of them able to restrict
-access to files, also implies to inherit the ruleset restrictions from a parent
-to its hierarchy. Because files are identified and restricted by their
-hierarchy, moving or linking a file from one directory to another implies to
-propagate the hierarchy constraints. To protect against privilege escalations
-through renaming or linking, and for the sake of simplicity, Landlock currently
-limits linking and renaming to the same directory. Future Landlock evolutions
-will enable more flexibility for renaming and linking, with dedicated ruleset
-flags.
-
Filesystem topology modification
--------------------------------
@@ -281,6 +347,26 @@ Memory usage
Kernel memory allocated to create rulesets is accounted and can be restricted
by the Documentation/admin-guide/cgroup-v1/memory.rst.
+Previous limitations
+====================
+
+File renaming and linking (ABI 1)
+---------------------------------
+
+Because Landlock targets unprivileged access controls, it needs to properly
+handle composition of rules. Such property also implies rules nesting.
+Properly handling multiple layers of rulesets, each one of them able to
+restrict access to files, also implies inheritance of the ruleset restrictions
+from a parent to its hierarchy. Because files are identified and restricted by
+their hierarchy, moving or linking a file from one directory to another implies
+propagation of the hierarchy constraints, or restriction of these actions
+according to the potentially lost constraints. To protect against privilege
+escalations through renaming or linking, and for the sake of simplicity,
+Landlock previously limited linking and renaming to the same directory.
+Starting with the Landlock ABI version 2, it is now possible to securely
+control renaming and linking thanks to the new `LANDLOCK_ACCESS_FS_REFER`
+access right.
+
Questions and answers
=====================
--
2.35.1