2022-02-22 01:16:23

by Mickaël Salaün

Subject: [PATCH v1 00/11] Landlock: file linking and renaming support

Hi,

One of the most annoying limitations of Landlock is that sandboxed
processes can only link and rename files to the same directory (i.e.
file reparenting is always denied). Indeed, because of the unprivileged
nature of Landlock, file hierarchies are identified thanks to ephemeral
inode tagging, which may cause arbitrary renaming and linking to change
the security policy in an unexpected way.

This patch series brings a new access right, LANDLOCK_ACCESS_FS_REFER,
which makes it possible to allow safe file linking and renaming. In a
nutshell, Landlock checks that the inherited access rights of a moved or
renamed file can only be reduced, never increased. Six new test suites
cover file renaming and linking, which brings coverage for
security/landlock/ from 93.5% of lines to 94.4%.
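
To illustrate the "never increased" rule, here is a minimal user-space
model in C. It is only a sketch under simplifying assumptions, not the
kernel implementation: the MODEL_ACCESS_* bits, model_access_t and
model_reparent_check() are hypothetical names, and each layer is reduced
to the set of accesses that a file inherits from its parent hierarchy.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical access bits for this model; not the Landlock UAPI values. */
#define MODEL_ACCESS_READ	(1u << 0)
#define MODEL_ACCESS_WRITE	(1u << 1)
#define MODEL_ACCESS_EXECUTE	(1u << 2)

typedef uint16_t model_access_t;

/*
 * For each layer, src_allowed[i] and dst_allowed[i] hold the accesses that a
 * file inherits from the source and from the destination hierarchies.
 *
 * Returns 0 if reparenting cannot grant any new access in any layer, or
 * -EXDEV if at least one layer would allow more at the destination.
 */
static int model_reparent_check(const model_access_t *src_allowed,
				const model_access_t *dst_allowed,
				size_t num_layers)
{
	size_t i;

	for (i = 0; i < num_layers; i++) {
		/* Bits allowed at the destination but not at the source. */
		if (dst_allowed[i] & ~src_allowed[i])
			return -EXDEV;
	}
	return 0;
}
```

Moving a file from a read-write hierarchy to a read-only one passes this
check in the model, while the opposite direction is refused with -EXDEV.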

The documentation and the tutorial are extended with this new access
right, along with more explanations about backward and forward
compatibility, good practices, and a bit about the rationale for the
current access rights.

While developing this new feature, I also found an issue with the
current implementation of Landlock. In some (rare) cases, sandboxed
processes may be more restricted than intended. Indeed, because of the
current way to check file hierarchy access rights, composition of rules
may be incomplete when requesting multiple accesses at the same time.
This is fixed with a dedicated patch involving some refactoring. A new
test suite checks relevant new edge cases.

As a side effect, and to limit the increased use of the stack, I reduced
the number of Landlock nested domains from 64 to 16. I think this
should be more than enough for legitimate use cases, but feel free to
challenge this decision with real and legitimate use cases.

Because of the current path_rename security hook, Landlock cannot yet
return consistent error codes with RENAME_EXCHANGE. I plan to address
this issue in a follow-up series.

This patch series was developed with some complementary new tests sent
in a standalone patch series:
https://lore.kernel.org/r/[email protected]

Additionally, a new dedicated syzkaller test has been developed to cover
new paths.

Regards,

Mickaël Salaün (11):
  landlock: Define access_mask_t to enforce a consistent access mask
    size
  landlock: Reduce the maximum number of layers to 16
  landlock: Create find_rule() from unmask_layers()
  landlock: Fix same-layer rule unions
  landlock: Move filesystem helpers and add a new one
  landlock: Add support for file reparenting with
    LANDLOCK_ACCESS_FS_REFER
  selftest/landlock: Add 6 new test suites dedicated to file reparenting
  samples/landlock: Add support for file reparenting
  landlock: Document LANDLOCK_ACCESS_FS_REFER and ABI versioning
  landlock: Document good practices about filesystem policies
  landlock: Add design choices documentation for filesystem access
    rights

Documentation/security/landlock.rst | 17 +-
Documentation/userspace-api/landlock.rst | 145 +++-
include/uapi/linux/landlock.h | 27 +-
samples/landlock/sandboxer.c | 37 +-
security/landlock/fs.c | 721 +++++++++++++++----
security/landlock/fs.h | 2 +-
security/landlock/limits.h | 6 +-
security/landlock/ruleset.c | 6 +-
security/landlock/ruleset.h | 23 +-
security/landlock/syscalls.c | 2 +-
tools/testing/selftests/landlock/base_test.c | 2 +-
tools/testing/selftests/landlock/fs_test.c | 634 +++++++++++++++-
12 files changed, 1447 insertions(+), 175 deletions(-)


base-commit: cfb92440ee71adcc2105b0890bb01ac3cddb8507
--
2.35.1


2022-02-22 01:16:27

by Mickaël Salaün

Subject: [PATCH v1 01/11] landlock: Define access_mask_t to enforce a consistent access mask size

From: Mickaël Salaün <[email protected]>

Create and use the access_mask_t typedef to enforce a consistent access
mask size and uniformly use a 16-bit type. This will help the transition
to a 32-bit value one day.

Add a build check to make sure all (filesystem) access rights fit in.
This will be extended in a following commit.
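
As a rough user-space illustration of this build check (a sketch only,
not the kernel code), the snippet below recreates the typedef and the
assertion with standard C11. The SKETCH_* macros and
sketch_is_valid_access() are hypothetical names; SKETCH_HWEIGHT16 stands
in for the kernel's __const_hweight64().

```c
#include <assert.h>	/* static_assert (C11) */
#include <limits.h>	/* CHAR_BIT */
#include <stdint.h>

/* Constant-expression population count, in the spirit of __const_hweight64. */
#define SKETCH_HWEIGHT8(w)					\
	(!!((w) & (1ULL << 0)) + !!((w) & (1ULL << 1)) +	\
	 !!((w) & (1ULL << 2)) + !!((w) & (1ULL << 3)) +	\
	 !!((w) & (1ULL << 4)) + !!((w) & (1ULL << 5)) +	\
	 !!((w) & (1ULL << 6)) + !!((w) & (1ULL << 7)))
#define SKETCH_HWEIGHT16(w) \
	(SKETCH_HWEIGHT8(w) + SKETCH_HWEIGHT8((w) >> 8))

/* Last filesystem access right before this series: MAKE_SYM, bit 12. */
#define SKETCH_LAST_ACCESS_FS	(1ULL << 12)
#define SKETCH_MASK_ACCESS_FS	((SKETCH_LAST_ACCESS_FS << 1) - 1)
#define SKETCH_NUM_ACCESS_FS	SKETCH_HWEIGHT16(SKETCH_MASK_ACCESS_FS)

typedef uint16_t access_mask_t;

/* Fails the build if a new access right no longer fits in the type. */
static_assert(sizeof(access_mask_t) * CHAR_BIT >= SKETCH_NUM_ACCESS_FS,
	      "access_mask_t is too small for all filesystem access rights");

/* Returns 1 if @mask only contains known filesystem access rights. */
static inline int sketch_is_valid_access(uint64_t mask)
{
	return (mask & ~SKETCH_MASK_ACCESS_FS) == 0;
}
```

Raising SKETCH_LAST_ACCESS_FS past bit 15 in this sketch breaks the
build until access_mask_t is widened.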

Signed-off-by: Mickaël Salaün <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
---
security/landlock/fs.c | 19 ++++++++++---------
security/landlock/fs.h | 2 +-
security/landlock/limits.h | 2 ++
security/landlock/ruleset.c | 6 ++++--
security/landlock/ruleset.h | 17 +++++++++++++----
5 files changed, 30 insertions(+), 16 deletions(-)

diff --git a/security/landlock/fs.c b/security/landlock/fs.c
index 97b8e421f617..9de2a460a762 100644
--- a/security/landlock/fs.c
+++ b/security/landlock/fs.c
@@ -150,7 +150,7 @@ static struct landlock_object *get_inode_object(struct inode *const inode)
* @path: Should have been checked by get_path_from_fd().
*/
int landlock_append_fs_rule(struct landlock_ruleset *const ruleset,
- const struct path *const path, u32 access_rights)
+ const struct path *const path, access_mask_t access_rights)
{
int err;
struct landlock_object *object;
@@ -182,8 +182,8 @@ int landlock_append_fs_rule(struct landlock_ruleset *const ruleset,

static inline u64 unmask_layers(
const struct landlock_ruleset *const domain,
- const struct path *const path, const u32 access_request,
- u64 layer_mask)
+ const struct path *const path,
+ const access_mask_t access_request, u64 layer_mask)
{
const struct landlock_rule *rule;
const struct inode *inode;
@@ -223,7 +223,8 @@ static inline u64 unmask_layers(
}

static int check_access_path(const struct landlock_ruleset *const domain,
- const struct path *const path, u32 access_request)
+ const struct path *const path,
+ const access_mask_t access_request)
{
bool allowed = false;
struct path walker_path;
@@ -308,7 +309,7 @@ static int check_access_path(const struct landlock_ruleset *const domain,
}

static inline int current_check_access_path(const struct path *const path,
- const u32 access_request)
+ const access_mask_t access_request)
{
const struct landlock_ruleset *const dom =
landlock_get_current_domain();
@@ -511,7 +512,7 @@ static int hook_sb_pivotroot(const struct path *const old_path,

/* Path hooks */

-static inline u32 get_mode_access(const umode_t mode)
+static inline access_mask_t get_mode_access(const umode_t mode)
{
switch (mode & S_IFMT) {
case S_IFLNK:
@@ -563,7 +564,7 @@ static int hook_path_link(struct dentry *const old_dentry,
get_mode_access(d_backing_inode(old_dentry)->i_mode));
}

-static inline u32 maybe_remove(const struct dentry *const dentry)
+static inline access_mask_t maybe_remove(const struct dentry *const dentry)
{
if (d_is_negative(dentry))
return 0;
@@ -631,9 +632,9 @@ static int hook_path_rmdir(const struct path *const dir,

/* File hooks */

-static inline u32 get_file_access(const struct file *const file)
+static inline access_mask_t get_file_access(const struct file *const file)
{
- u32 access = 0;
+ access_mask_t access = 0;

if (file->f_mode & FMODE_READ) {
/* A directory can only be opened in read mode. */
diff --git a/security/landlock/fs.h b/security/landlock/fs.h
index 187284b421c9..74be312aad96 100644
--- a/security/landlock/fs.h
+++ b/security/landlock/fs.h
@@ -65,6 +65,6 @@ static inline struct landlock_superblock_security *landlock_superblock(
__init void landlock_add_fs_hooks(void);

int landlock_append_fs_rule(struct landlock_ruleset *const ruleset,
- const struct path *const path, u32 access_hierarchy);
+ const struct path *const path, access_mask_t access_hierarchy);

#endif /* _SECURITY_LANDLOCK_FS_H */
diff --git a/security/landlock/limits.h b/security/landlock/limits.h
index 2a0a1095ee27..458d1de32ed5 100644
--- a/security/landlock/limits.h
+++ b/security/landlock/limits.h
@@ -9,6 +9,7 @@
#ifndef _SECURITY_LANDLOCK_LIMITS_H
#define _SECURITY_LANDLOCK_LIMITS_H

+#include <linux/bitops.h>
#include <linux/limits.h>
#include <uapi/linux/landlock.h>

@@ -17,5 +18,6 @@

#define LANDLOCK_LAST_ACCESS_FS LANDLOCK_ACCESS_FS_MAKE_SYM
#define LANDLOCK_MASK_ACCESS_FS ((LANDLOCK_LAST_ACCESS_FS << 1) - 1)
+#define LANDLOCK_NUM_ACCESS_FS __const_hweight64(LANDLOCK_MASK_ACCESS_FS)

#endif /* _SECURITY_LANDLOCK_LIMITS_H */
diff --git a/security/landlock/ruleset.c b/security/landlock/ruleset.c
index ec72b9262bf3..4e7aa8024fff 100644
--- a/security/landlock/ruleset.c
+++ b/security/landlock/ruleset.c
@@ -44,7 +44,8 @@ static struct landlock_ruleset *create_ruleset(const u32 num_layers)
return new_ruleset;
}

-struct landlock_ruleset *landlock_create_ruleset(const u32 fs_access_mask)
+struct landlock_ruleset *landlock_create_ruleset(
+ const access_mask_t fs_access_mask)
{
struct landlock_ruleset *new_ruleset;

@@ -228,7 +229,8 @@ static void build_check_layer(void)

/* @ruleset must be locked by the caller. */
int landlock_insert_rule(struct landlock_ruleset *const ruleset,
- struct landlock_object *const object, const u32 access)
+ struct landlock_object *const object,
+ const access_mask_t access)
{
struct landlock_layer layers[] = {{
.access = access,
diff --git a/security/landlock/ruleset.h b/security/landlock/ruleset.h
index 2d3ed7ec5a0a..7e7cac68e443 100644
--- a/security/landlock/ruleset.h
+++ b/security/landlock/ruleset.h
@@ -9,13 +9,20 @@
#ifndef _SECURITY_LANDLOCK_RULESET_H
#define _SECURITY_LANDLOCK_RULESET_H

+#include <linux/bitops.h>
+#include <linux/build_bug.h>
#include <linux/mutex.h>
#include <linux/rbtree.h>
#include <linux/refcount.h>
#include <linux/workqueue.h>

+#include "limits.h"
#include "object.h"

+typedef u16 access_mask_t;
+/* Makes sure all filesystem access rights can be stored. */
+static_assert(BITS_PER_TYPE(access_mask_t) >= LANDLOCK_NUM_ACCESS_FS);
+
/**
* struct landlock_layer - Access rights for a given layer
*/
@@ -28,7 +35,7 @@ struct landlock_layer {
* @access: Bitfield of allowed actions on the kernel object. They are
* relative to the object type (e.g. %LANDLOCK_ACTION_FS_READ).
*/
- u16 access;
+ access_mask_t access;
};

/**
@@ -135,18 +142,20 @@ struct landlock_ruleset {
* layers are set once and never changed for the
* lifetime of the ruleset.
*/
- u16 fs_access_masks[];
+ access_mask_t fs_access_masks[];
};
};
};

-struct landlock_ruleset *landlock_create_ruleset(const u32 fs_access_mask);
+struct landlock_ruleset *landlock_create_ruleset(
+ const access_mask_t fs_access_mask);

void landlock_put_ruleset(struct landlock_ruleset *const ruleset);
void landlock_put_ruleset_deferred(struct landlock_ruleset *const ruleset);

int landlock_insert_rule(struct landlock_ruleset *const ruleset,
- struct landlock_object *const object, const u32 access);
+ struct landlock_object *const object,
+ const access_mask_t access);

struct landlock_ruleset *landlock_merge_ruleset(
struct landlock_ruleset *const parent,
--
2.35.1

2022-02-22 04:22:47

by Mickaël Salaün

Subject: [PATCH v1 06/11] landlock: Add support for file reparenting with LANDLOCK_ACCESS_FS_REFER

From: Mickaël Salaün <[email protected]>

Add a new LANDLOCK_ACCESS_FS_REFER access right to enable policy writers
to allow sandboxed processes to link and rename files from and to a
specific set of file hierarchies. This access right should be composed
with LANDLOCK_ACCESS_FS_MAKE_* for the destination of a link or rename,
and with LANDLOCK_ACCESS_FS_REMOVE_* for a source of a rename. This
lifts a Landlock limitation that always denied changing the parent of an
inode.

Renaming or linking to the same directory is still always allowed,
whether LANDLOCK_ACCESS_FS_REFER is used or not, because it is not
considered a threat to user data.

However, creating multiple links or renaming to a different parent
directory may lead to privilege escalations if not handled properly.
Indeed, we must be sure that the source doesn't gain more privileges by
being accessible from the destination. This is handled by making sure
that the destination hierarchy restricts at least as much as the source
hierarchy (including the referenced file or directory itself). If it is
not the case, an EXDEV error is returned, making it potentially possible
for user space to copy the file hierarchy instead of moving or linking
it.

Instead of creating different access rights for the source and the
destination, we choose to make it simple and consistent for users.
Indeed, considering the previous constraint, it would be weird to
require such destination access right to be also granted to the source
(to make it a superset).

See the provided documentation for additional details.

New tests are provided in a following commit.

Signed-off-by: Mickaël Salaün <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
---
include/uapi/linux/landlock.h | 27 +-
security/landlock/fs.c | 550 ++++++++++++++++---
security/landlock/limits.h | 2 +-
security/landlock/syscalls.c | 2 +-
tools/testing/selftests/landlock/base_test.c | 2 +-
tools/testing/selftests/landlock/fs_test.c | 3 +-
6 files changed, 516 insertions(+), 70 deletions(-)

diff --git a/include/uapi/linux/landlock.h b/include/uapi/linux/landlock.h
index b3d952067f59..f433d58a58f2 100644
--- a/include/uapi/linux/landlock.h
+++ b/include/uapi/linux/landlock.h
@@ -21,8 +21,14 @@ struct landlock_ruleset_attr {
/**
* @handled_access_fs: Bitmask of actions (cf. `Filesystem flags`_)
* that is handled by this ruleset and should then be forbidden if no
- * rule explicitly allow them. This is needed for backward
- * compatibility reasons.
+ * rule explicitly allows them: it is a deny-by-default list that should
+ * contain as many Landlock access rights as possible. Indeed, all
+ * Landlock filesystem access rights that are not part of
+ * handled_access_fs are allowed. This is needed for backward
+ * compatibility reasons. One exception is the
+ * LANDLOCK_ACCESS_FS_REFER access right, which is always implicitly
+ * handled, but must still be explicitly handled to add new rules with
+ * this access right.
*/
__u64 handled_access_fs;
};
@@ -109,6 +115,22 @@ struct landlock_path_beneath_attr {
* - %LANDLOCK_ACCESS_FS_MAKE_FIFO: Create (or rename or link) a named pipe.
* - %LANDLOCK_ACCESS_FS_MAKE_BLOCK: Create (or rename or link) a block device.
* - %LANDLOCK_ACCESS_FS_MAKE_SYM: Create (or rename or link) a symbolic link.
+ * - %LANDLOCK_ACCESS_FS_REFER: Link or rename a file from or to a different
+ * directory (i.e. reparent a file hierarchy). This access right is
+ * available since the second version of the Landlock ABI. This is also the
+ * only access right which is always considered handled by any ruleset in
+ * such a way that reparenting a file hierarchy is always denied by default.
+ * To avoid privilege escalation, it is not enough to add a rule with this
+ * access right. When linking or renaming a file, the destination directory
+ * hierarchy must also always have the same or a superset of restrictions of
+ * the source hierarchy. If it is not the case, or if the domain doesn't
+ * handle this access right, such actions are denied by default with errno
+ * set to EXDEV. Linking also requires a LANDLOCK_ACCESS_FS_MAKE_* access
+ * right on the destination directory, and renaming also requires a
+ * LANDLOCK_ACCESS_FS_REMOVE_* access right on the source's (file or
+ * directory) parent. Otherwise, such actions are denied with errno set to
+ * EACCES. The EACCES errno prevails over EXDEV to let user space
+ * efficiently deal with an unrecoverable error.
*
* .. warning::
*
@@ -133,5 +155,6 @@ struct landlock_path_beneath_attr {
#define LANDLOCK_ACCESS_FS_MAKE_FIFO (1ULL << 10)
#define LANDLOCK_ACCESS_FS_MAKE_BLOCK (1ULL << 11)
#define LANDLOCK_ACCESS_FS_MAKE_SYM (1ULL << 12)
+#define LANDLOCK_ACCESS_FS_REFER (1ULL << 13)

#endif /* _UAPI_LINUX_LANDLOCK_H */
diff --git a/security/landlock/fs.c b/security/landlock/fs.c
index 3886f9ad1a60..c7c7ce4e7cd5 100644
--- a/security/landlock/fs.c
+++ b/security/landlock/fs.c
@@ -4,6 +4,7 @@
*
* Copyright © 2016-2020 Mickaël Salaün <[email protected]>
* Copyright © 2018-2020 ANSSI
+ * Copyright © 2021-2022 Microsoft Corporation
*/

#include <linux/atomic.h>
@@ -269,16 +270,188 @@ static inline bool is_nouser_or_private(const struct dentry *dentry)
unlikely(IS_PRIVATE(d_backing_inode(dentry))));
}

-static int check_access_path(const struct landlock_ruleset *const domain,
- const struct path *const path,
+static inline access_mask_t get_handled_accesses(
+ const struct landlock_ruleset *const domain)
+{
+ access_mask_t access_dom = 0;
+ unsigned long access_bit;
+
+ for (access_bit = 0; access_bit < LANDLOCK_NUM_ACCESS_FS;
+ access_bit++) {
+ size_t layer_level;
+
+ for (layer_level = 0; layer_level < domain->num_layers;
+ layer_level++) {
+ if (domain->fs_access_masks[layer_level] &
+ BIT_ULL(access_bit)) {
+ access_dom |= BIT_ULL(access_bit);
+ break;
+ }
+ }
+ }
+ return access_dom;
+}
+
+static inline access_mask_t init_layer_masks(
+ const struct landlock_ruleset *const domain,
+ const access_mask_t access_request,
+ layer_mask_t (*const layer_masks)[LANDLOCK_NUM_ACCESS_FS])
+{
+ access_mask_t handled_accesses = 0;
+ size_t layer_level;
+
+ memset(layer_masks, 0, sizeof(*layer_masks));
+ if (WARN_ON_ONCE(!access_request))
+ return 0;
+
+ /* Saves all handled accesses per layer. */
+ for (layer_level = 0; layer_level < domain->num_layers;
+ layer_level++) {
+ const unsigned long access_req = access_request;
+ unsigned long access_bit;
+
+ for_each_set_bit(access_bit, &access_req,
+ ARRAY_SIZE(*layer_masks)) {
+ if (domain->fs_access_masks[layer_level] &
+ BIT_ULL(access_bit)) {
+ (*layer_masks)[access_bit] |=
+ BIT_ULL(layer_level);
+ handled_accesses |= BIT_ULL(access_bit);
+ }
+ }
+ }
+ return handled_accesses;
+}
+
+/*
+ * Check that a destination file hierarchy has more restrictions than a source
+ * file hierarchy. This is only used for link and rename actions.
+ */
+static inline bool is_superset(bool child_is_directory,
+ const layer_mask_t (*const
+ layer_masks_dst_parent)[LANDLOCK_NUM_ACCESS_FS],
+ const layer_mask_t (*const
+ layer_masks_src_parent)[LANDLOCK_NUM_ACCESS_FS],
+ const layer_mask_t (*const
+ layer_masks_child)[LANDLOCK_NUM_ACCESS_FS])
+{
+ unsigned long access_bit;
+
+ for (access_bit = 0; access_bit < ARRAY_SIZE(*layer_masks_dst_parent);
+ access_bit++) {
+ /* Ignores accesses that only make sense for directories. */
+ if (!child_is_directory && !(BIT_ULL(access_bit) & ACCESS_FILE))
+ continue;
+
+ /*
+ * Checks if the destination restrictions are a superset of the
+ * source ones (i.e. inherited access rights without child
+ * exceptions).
+ */
+ if ((((*layer_masks_src_parent)[access_bit] & (*layer_masks_child)[access_bit]) |
+ (*layer_masks_dst_parent)[access_bit]) !=
+ (*layer_masks_dst_parent)[access_bit])
+ return false;
+ }
+ return true;
+}
+
+/*
+ * Removes @layer_masks accesses that are not requested.
+ *
+ * Returns true if the request is allowed, false otherwise.
+ */
+static inline bool scope_to_request(const access_mask_t access_request,
+ layer_mask_t (*const layer_masks)[LANDLOCK_NUM_ACCESS_FS])
+{
+ const unsigned long access_req = access_request;
+ unsigned long access_bit;
+
+ if (WARN_ON_ONCE(!layer_masks))
+ return true;
+
+ for_each_clear_bit(access_bit, &access_req, ARRAY_SIZE(*layer_masks))
+ (*layer_masks)[access_bit] = 0;
+ return !memchr_inv(layer_masks, 0, sizeof(*layer_masks));
+}
+
+/*
+ * Returns true if there is at least one access right different than
+ * LANDLOCK_ACCESS_FS_REFER.
+ */
+static inline bool is_eacces(
+ const layer_mask_t (*const
+ layer_masks)[LANDLOCK_NUM_ACCESS_FS],
const access_mask_t access_request)
{
- layer_mask_t layer_masks[LANDLOCK_NUM_ACCESS_FS] = {};
- bool allowed = false, has_access = false;
+ unsigned long access_bit;
+ /* LANDLOCK_ACCESS_FS_REFER alone must return -EXDEV. */
+ const unsigned long access_check = access_request &
+ ~LANDLOCK_ACCESS_FS_REFER;
+
+ if (!layer_masks)
+ return false;
+
+ for_each_set_bit(access_bit, &access_check, ARRAY_SIZE(*layer_masks)) {
+ if ((*layer_masks)[access_bit])
+ return true;
+ }
+ return false;
+}
+
+/**
+ * check_access_path_dual - Check a source and a destination accesses
+ *
+ * @domain: Domain to check against.
+ * @path: File hierarchy to walk through.
+ * @child_is_directory: Must be set to true if the (original) leaf is a
+ * directory, false otherwise.
+ * @access_request_dst_parent: Accesses to check, once @layer_masks_dst_parent
+ * is equal to @layer_masks_src_parent (if any).
+ * @layer_masks_dst_parent: Pointer to a matrix of layer masks per access
+ * masks, identifying the layers that forbid a specific access. Bits from
+ * this matrix can be unset according to the @path walk. An empty matrix
+ * means that @domain allows all possible Landlock accesses (i.e. not only
+ * those identified by @access_request_dst_parent). This matrix can
+ * initially refer to domain layer masks and, when the accesses for the
+ * destination and source are the same, to request layer masks.
+ * @access_request_src_parent: Similar to @access_request_dst_parent but for an
+ * initial source path request. Only taken into account if
+ * @layer_masks_src_parent is not NULL.
+ * @layer_masks_src_parent: Similar to @layer_masks_dst_parent but for an
+ * initial source path walk. This can be NULL if only dealing with a
+ * destination access request (i.e. not a rename nor a link action).
+ * @layer_masks_child: Similar to @layer_masks_src_parent but only for the
+ * linked or renamed inode (without hierarchy). This is only used if
+ * @layer_masks_src_parent is not NULL.
+ *
+ * This helper first checks that the destination has a superset of restrictions
+ * compared to the source (if any) for a common path. It then checks that the
+ * collected accesses and the remaining ones are enough to allow the request.
+ *
+ * Returns:
+ * - 0 if the access request is granted;
+ * - -EACCES if it is denied because of an access right other than
+ * LANDLOCK_ACCESS_FS_REFER;
+ * - -EXDEV if the renaming or linking would be a privilege escalation
+ * (according to each layered policy), or if LANDLOCK_ACCESS_FS_REFER is
+ * not allowed by the source or the destination.
+ */
+static int check_access_path_dual(const struct landlock_ruleset *const domain,
+ const struct path *const path,
+ bool child_is_directory,
+ const access_mask_t access_request_dst_parent,
+ layer_mask_t (*const
+ layer_masks_dst_parent)[LANDLOCK_NUM_ACCESS_FS],
+ const access_mask_t access_request_src_parent,
+ layer_mask_t (*layer_masks_src_parent)[LANDLOCK_NUM_ACCESS_FS],
+ layer_mask_t (*layer_masks_child)[LANDLOCK_NUM_ACCESS_FS])
+{
+ bool allowed_dst_parent = false, allowed_src_parent = false, is_dom_check;
struct path walker_path;
- size_t i;
+ access_mask_t access_masked_dst_parent, access_masked_src_parent;

- if (!access_request)
+ if (!access_request_dst_parent && !access_request_src_parent)
return 0;
if (WARN_ON_ONCE(!domain || !path))
return 0;
@@ -287,22 +460,20 @@ static int check_access_path(const struct landlock_ruleset *const domain,
if (WARN_ON_ONCE(domain->num_layers < 1))
return -EACCES;

- /* Saves all layers handling a subset of requested accesses. */
- for (i = 0; i < domain->num_layers; i++) {
- const unsigned long access_req = access_request;
- unsigned long access_bit;
-
- for_each_set_bit(access_bit, &access_req,
- ARRAY_SIZE(layer_masks)) {
- if (domain->fs_access_masks[i] & BIT_ULL(access_bit)) {
- layer_masks[access_bit] |= BIT_ULL(i);
- has_access = true;
- }
- }
+ BUILD_BUG_ON(!layer_masks_dst_parent);
+ if (layer_masks_src_parent) {
+ if (WARN_ON_ONCE(!layer_masks_child))
+ return -EACCES;
+ access_masked_dst_parent = access_masked_src_parent =
+ get_handled_accesses(domain);
+ is_dom_check = true;
+ } else {
+ if (WARN_ON_ONCE(layer_masks_child))
+ return -EACCES;
+ access_masked_dst_parent = access_request_dst_parent;
+ access_masked_src_parent = access_request_src_parent;
+ is_dom_check = false;
}
- /* An access request not handled by the domain is allowed. */
- if (!has_access)
- return 0;

walker_path = *path;
path_get(&walker_path);
@@ -312,11 +483,50 @@ static int check_access_path(const struct landlock_ruleset *const domain,
*/
while (true) {
struct dentry *parent_dentry;
+ const struct landlock_rule *rule;
+
+ /*
+ * If at least all accesses allowed on the destination are
+ * already allowed on the source, respectively if there are at
+ * least as many restrictions on the destination as on the
+ * source, then we can safely refer files from the source to
+ * the destination without risking a privilege escalation.
+ * This is crucial for standalone multilayered security
+ * policies. Furthermore, this helps policy writers avoid
+ * shooting themselves in the foot.
+ */
+ if (is_dom_check && is_superset(child_is_directory,
+ layer_masks_dst_parent,
+ layer_masks_src_parent,
+ layer_masks_child)) {
+ allowed_dst_parent =
+ scope_to_request(access_request_dst_parent,
+ layer_masks_dst_parent);
+ allowed_src_parent =
+ scope_to_request(access_request_src_parent,
+ layer_masks_src_parent);
+
+ /* Stops when all accesses are granted. */
+ if (allowed_dst_parent && allowed_src_parent)
+ break;
+
+ /*
+ * Downgrades checks from domain handled accesses to
+ * requested accesses.
+ */
+ is_dom_check = false;
+ access_masked_dst_parent = access_request_dst_parent;
+ access_masked_src_parent = access_request_src_parent;
+ }
+
+ rule = find_rule(domain, walker_path.dentry);
+ allowed_dst_parent = unmask_layers(rule, access_masked_dst_parent,
+ layer_masks_dst_parent);
+ allowed_src_parent = unmask_layers(rule, access_masked_src_parent,
+ layer_masks_src_parent);

- allowed = unmask_layers(find_rule(domain, walker_path.dentry),
- access_request, &layer_masks);
- if (allowed)
- /* Stops when a rule from each layer grants access. */
+ /* Stops when a rule from each layer grants access. */
+ if (allowed_dst_parent && allowed_src_parent)
break;

jump_up:
@@ -329,7 +539,7 @@ static int check_access_path(const struct landlock_ruleset *const domain,
* Stops at the real root. Denies access
* because not all layers have granted access.
*/
- allowed = false;
+ allowed_dst_parent = false;
break;
}
}
@@ -339,7 +549,8 @@ static int check_access_path(const struct landlock_ruleset *const domain,
* access to internal filesystems (e.g. nsfs, which is
* reachable through /proc/<pid>/ns/<namespace>).
*/
- allowed = !!(walker_path.mnt->mnt_flags & MNT_INTERNAL);
+ allowed_dst_parent = !!(walker_path.mnt->mnt_flags &
+ MNT_INTERNAL);
break;
}
parent_dentry = dget_parent(walker_path.dentry);
@@ -347,7 +558,40 @@ static int check_access_path(const struct landlock_ruleset *const domain,
walker_path.dentry = parent_dentry;
}
path_put(&walker_path);
- return allowed ? 0 : -EACCES;
+
+ if (allowed_dst_parent && allowed_src_parent)
+ return 0;
+
+ /*
+ * Unfortunately, we cannot prioritize EACCES over EXDEV for all
+ * RENAME_EXCHANGE cases because it depends on the source and
+ * destination order. This could be changed with a new
+ * security_path_rename hook implementation.
+ */
+ if (likely(is_eacces(layer_masks_dst_parent, access_request_dst_parent)
+ || is_eacces(layer_masks_src_parent,
+ access_request_src_parent)))
+ return -EACCES;
+
+ /*
+ * Gracefully forbids reparenting if the destination directory
+ * hierarchy is not a superset of restrictions of the source directory
+ * hierarchy, or if LANDLOCK_ACCESS_FS_REFER is not allowed by the
+ * source or the destination.
+ */
+ return -EXDEV;
+}
+
+static inline int check_access_path(const struct landlock_ruleset *const domain,
+ const struct path *const path,
+ access_mask_t access_request)
+{
+ layer_mask_t layer_masks[LANDLOCK_NUM_ACCESS_FS] = {};
+
+ access_request = init_layer_masks(domain, access_request,
+ &layer_masks);
+ return check_access_path_dual(domain, path, d_is_dir(path->dentry),
+ access_request, &layer_masks, 0, NULL, NULL);
}

static inline int current_check_access_path(const struct path *const path,
@@ -394,6 +638,217 @@ static inline access_mask_t maybe_remove(const struct dentry *const dentry)
LANDLOCK_ACCESS_FS_REMOVE_FILE;
}

+/**
+ * collect_domain_accesses - Walk through a file path and collect accesses
+ *
+ * @domain: Domain to check against.
+ * @mnt_root: Last directory to check.
+ * @dir: Directory to start the walk from.
+ * @layer_masks_dom: Where to store the collected accesses.
+ *
+ * This helper is useful to begin a path walk from the @dir directory to a
+ * @mnt_root directory used as a mount point. This mount point is the common
+ * ancestor between the source and the destination of a renamed and linked
+ * file. While walking from @dir to @mnt_root, we record all the domain's
+ * allowed accesses in @layer_masks_dom.
+ *
+ * This is similar to check_access_path_dual() but much simpler because it only
+ * handles walking on the same mount point and only checks one set of accesses.
+ *
+ * Returns:
+ * - true if all the domain access rights are allowed for @dir;
+ * - false if the walk reached @mnt_root.
+ */
+static bool collect_domain_accesses(
+ const struct landlock_ruleset *const domain,
+ const struct dentry *const mnt_root, struct dentry *dir,
+ layer_mask_t (*const layer_masks_dom)[LANDLOCK_NUM_ACCESS_FS])
+{
+ unsigned long access_dom;
+ bool ret = false;
+
+ BUILD_BUG_ON(!layer_masks_dom);
+ if (WARN_ON_ONCE(!domain || !mnt_root || !dir))
+ return true;
+ if (is_nouser_or_private(dir))
+ return true;
+
+ access_dom = init_layer_masks(domain, LANDLOCK_MASK_ACCESS_FS,
+ layer_masks_dom);
+
+ dget(dir);
+ while (true) {
+ struct dentry *parent_dentry;
+
+ /* Gets all layers allowing all domain accesses. */
+ if (unmask_layers(find_rule(domain, dir), access_dom,
+ layer_masks_dom)) {
+ /*
+ * Stops when all handled accesses are allowed by at
+ * least one rule in each layer.
+ */
+ ret = true;
+ break;
+ }
+
+ /* We should not reach a root other than @mnt_root. */
+ if (dir == mnt_root || WARN_ON_ONCE(IS_ROOT(dir))) {
+ ret = false;
+ break;
+ }
+
+ parent_dentry = dget_parent(dir);
+ dput(dir);
+ dir = parent_dentry;
+ }
+ dput(dir);
+ return ret;
+}
+
+/**
+ * current_check_refer_path - Check if a rename or link action is allowed
+ *
+ * @old_dentry: File or directory requested to be moved or linked.
+ * @new_dir: Destination parent directory.
+ * @new_dentry: Destination file or directory.
+ * @removable: Sets to true if it is a rename operation.
+ *
+ * Because of its unprivileged constraints, Landlock relies on file hierarchies
+ * (and not only inodes) to tie access rights to files. Being able to link or
+ * rename a file hierarchy brings some challenges. Indeed, moving or linking a
+ * file (i.e. creating a new reference to an inode) can have an impact on the
+ * actions allowed for a set of files if it would change its parent directory
+ * (i.e. reparenting).
+ *
+ * To avoid trivial access right bypasses, Landlock first checks if the file or
+ * directory requested to be moved would gain new access rights inherited from
+ * its new hierarchy. Before returning any error, Landlock then checks that
+ * the parent source hierarchy and the destination hierarchy would allow the
+ * link or rename action. If it is not the case, an error with EACCES is
+ * returned to inform user space that there is no way to remove or create the
+ * requested source file type. If it should be allowed but the new inherited
+ * access rights would be greater than the source access rights, then the
+ * kernel returns an error with EXDEV. Prioritizing EACCES over EXDEV enables
+ * user space to abort the whole operation if there is no way to do it, or to
+ * manually copy the source to the destination if this remains allowed, e.g.
+ * because file creation is allowed on the destination directory but not direct
+ * linking.
+ *
+ * To achieve this goal, the kernel needs to compare two file hierarchies: the
+ * one identifying the source file or directory (including itself), and the
+ * destination one. This can be seen as a multilayer partial ordering problem.
+ * The kernel walks through these paths and collects in a matrix the access
+ * rights that are denied per layer. These matrices are then compared to see
+ * if the destination one has more (or the same) restrictions as the source
+ * one. If this is the case, the requested action will not return EXDEV, which
+ * doesn't mean the action is allowed. The parent hierarchy of the source
+ * (i.e. parent directory), and the destination hierarchy must also be checked
+ * to verify that they explicitly allow such action (i.e. referencing,
+ * creation and potentially removal rights). The kernel implementation is then
+ * required to rely on three matrices of access rights: one for the source file
+ * or directory (i.e. the child), one for the source parent hierarchy and one
+ * for the destination hierarchy. These ephemeral matrices take some space on
+ * the stack, which limits the number of layers to a deemed reasonable number:
+ * 16.
+ *
+ * Returns:
+ * - 0 if access is allowed;
+ * - -EXDEV if @old_dentry would inherit new access rights from @new_dir;
+ * - -EACCES if file removal or creation is denied.
+ */
+static int current_check_refer_path(struct dentry *const old_dentry,
+ const struct path *const new_dir,
+ struct dentry *const new_dentry,
+ bool removable)
+{
+ const struct landlock_ruleset *const dom =
+ landlock_get_current_domain();
+ bool allow_dst_parent, allow_src_parent;
+ access_mask_t access_request_dst_parent, access_request_src_parent,
+ access_child;
+ struct path mnt_dir;
+ layer_mask_t layer_masks_dst_parent[LANDLOCK_NUM_ACCESS_FS],
+ layer_masks_src_parent[LANDLOCK_NUM_ACCESS_FS],
+ layer_masks_child[LANDLOCK_NUM_ACCESS_FS];
+
+ if (!dom)
+ return 0;
+ if (WARN_ON_ONCE(dom->num_layers < 1))
+ return -EACCES;
+ if (unlikely(d_is_negative(old_dentry)))
+ return -ENOENT;
+
+ access_request_dst_parent =
+ get_mode_access(d_backing_inode(old_dentry)->i_mode);
+ access_request_src_parent = 0;
+ if (removable) {
+ access_request_dst_parent |= maybe_remove(new_dentry);
+ access_request_src_parent |= maybe_remove(old_dentry);
+ }
+
+ /* The mount points are the same for old and new paths, cf. EXDEV. */
+ if (old_dentry->d_parent == new_dir->dentry) {
+ /*
+ * The LANDLOCK_ACCESS_FS_REFER access right is not required
+ * for same-directory referer (i.e. no reparenting).
+ */
+ access_request_dst_parent = init_layer_masks(dom,
+ access_request_dst_parent | access_request_src_parent,
+ &layer_masks_dst_parent);
+ return check_access_path_dual(dom, new_dir, d_is_dir(old_dentry),
+ access_request_dst_parent, &layer_masks_dst_parent,
+ 0, NULL, NULL);
+ }
+
+ /* Backward compatibility: no reparenting support. */
+ if (!(get_handled_accesses(dom) & LANDLOCK_ACCESS_FS_REFER))
+ return -EXDEV;
+
+ access_request_src_parent |= LANDLOCK_ACCESS_FS_REFER;
+ access_request_dst_parent |= LANDLOCK_ACCESS_FS_REFER;
+
+ /* Saves the common mount point. */
+ mnt_dir.mnt = new_dir->mnt;
+ mnt_dir.dentry = new_dir->mnt->mnt_root;
+
+ /* new_dir->dentry is equal to new_dentry->d_parent */
+ allow_dst_parent = collect_domain_accesses(dom, mnt_dir.dentry,
+ new_dir->dentry, &layer_masks_dst_parent);
+ allow_src_parent = collect_domain_accesses(dom, mnt_dir.dentry,
+ old_dentry->d_parent, &layer_masks_src_parent);
+
+ if (allow_src_parent) {
+ /* No need to go further if everything is allowed. */
+ if (allow_dst_parent)
+ return 0;
+
+ /* @new_dentry can only gain more restrictions. */
+ if (scope_to_request(access_request_dst_parent,
+ &layer_masks_dst_parent))
+ return 0;
+
+ return check_access_path_dual(dom, &mnt_dir, d_is_dir(old_dentry),
+ access_request_dst_parent, &layer_masks_dst_parent,
+ 0, NULL, NULL);
+ }
+
+ /*
+ * To be able to compare source and destination domain access rights,
+ * take into account the @old_dentry access rights aggregated with its
+ * parent access rights. This will be useful to compare with the
+ * destination parent access rights.
+ */
+ access_child = init_layer_masks(dom, LANDLOCK_MASK_ACCESS_FS,
+ &layer_masks_child);
+ unmask_layers(find_rule(dom, old_dentry), access_child,
+ &layer_masks_child);
+
+ return check_access_path_dual(dom, &mnt_dir, d_is_dir(old_dentry),
+ access_request_dst_parent, &layer_masks_dst_parent,
+ access_request_src_parent, &layer_masks_src_parent,
+ &layer_masks_child);
+}
+
/* Inode hooks */

static void hook_inode_free_security(struct inode *const inode)
@@ -587,31 +1042,11 @@ static int hook_sb_pivotroot(const struct path *const old_path,

/* Path hooks */

-/*
- * Creating multiple links or renaming may lead to privilege escalations if not
- * handled properly. Indeed, we must be sure that the source doesn't gain more
- * privileges by being accessible from the destination. This is getting more
- * complex when dealing with multiple layers. The whole picture can be seen as
- * a multilayer partial ordering problem. A future version of Landlock will
- * deal with that.
- */
static int hook_path_link(struct dentry *const old_dentry,
const struct path *const new_dir,
struct dentry *const new_dentry)
{
- const struct landlock_ruleset *const dom =
- landlock_get_current_domain();
-
- if (!dom)
- return 0;
- /* The mount points are the same for old and new paths, cf. EXDEV. */
- if (old_dentry->d_parent != new_dir->dentry)
- /* Gracefully forbids reparenting. */
- return -EXDEV;
- if (unlikely(d_is_negative(old_dentry)))
- return -ENOENT;
- return check_access_path(dom, new_dir,
- get_mode_access(d_backing_inode(old_dentry)->i_mode));
+ return current_check_refer_path(old_dentry, new_dir, new_dentry, false);
}

static int hook_path_rename(const struct path *const old_dir,
@@ -619,21 +1054,8 @@ static int hook_path_rename(const struct path *const old_dir,
const struct path *const new_dir,
struct dentry *const new_dentry)
{
- const struct landlock_ruleset *const dom =
- landlock_get_current_domain();
-
- if (!dom)
- return 0;
- /* The mount points are the same for old and new paths, cf. EXDEV. */
- if (old_dir->dentry != new_dir->dentry)
- /* Gracefully forbids reparenting. */
- return -EXDEV;
- if (unlikely(d_is_negative(old_dentry)))
- return -ENOENT;
- /* RENAME_EXCHANGE is handled because directories are the same. */
- return check_access_path(dom, old_dir, maybe_remove(old_dentry) |
- maybe_remove(new_dentry) |
- get_mode_access(d_backing_inode(old_dentry)->i_mode));
+ /* old_dir refers to old_dentry->d_parent and new_dir->mnt */
+ return current_check_refer_path(old_dentry, new_dir, new_dentry, true);
}

static int hook_path_mkdir(const struct path *const dir,
diff --git a/security/landlock/limits.h b/security/landlock/limits.h
index 126d1ec04d34..26c8166d0265 100644
--- a/security/landlock/limits.h
+++ b/security/landlock/limits.h
@@ -16,7 +16,7 @@
#define LANDLOCK_MAX_NUM_LAYERS 16
#define LANDLOCK_MAX_NUM_RULES U32_MAX

-#define LANDLOCK_LAST_ACCESS_FS LANDLOCK_ACCESS_FS_MAKE_SYM
+#define LANDLOCK_LAST_ACCESS_FS LANDLOCK_ACCESS_FS_REFER
#define LANDLOCK_MASK_ACCESS_FS ((LANDLOCK_LAST_ACCESS_FS << 1) - 1)
#define LANDLOCK_NUM_ACCESS_FS __const_hweight64(LANDLOCK_MASK_ACCESS_FS)

diff --git a/security/landlock/syscalls.c b/security/landlock/syscalls.c
index 32396962f04d..fa14f09b6bf4 100644
--- a/security/landlock/syscalls.c
+++ b/security/landlock/syscalls.c
@@ -128,7 +128,7 @@ static const struct file_operations ruleset_fops = {
.write = fop_dummy_write,
};

-#define LANDLOCK_ABI_VERSION 1
+#define LANDLOCK_ABI_VERSION 2

/**
* sys_landlock_create_ruleset - Create a new ruleset
diff --git a/tools/testing/selftests/landlock/base_test.c b/tools/testing/selftests/landlock/base_test.c
index ca40abe9daa8..99aab93d50e1 100644
--- a/tools/testing/selftests/landlock/base_test.c
+++ b/tools/testing/selftests/landlock/base_test.c
@@ -67,7 +67,7 @@ TEST(abi_version) {
const struct landlock_ruleset_attr ruleset_attr = {
.handled_access_fs = LANDLOCK_ACCESS_FS_READ_FILE,
};
- ASSERT_EQ(1, landlock_create_ruleset(NULL, 0,
+ ASSERT_EQ(2, landlock_create_ruleset(NULL, 0,
LANDLOCK_CREATE_RULESET_VERSION));

ASSERT_EQ(-1, landlock_create_ruleset(&ruleset_attr, 0,
diff --git a/tools/testing/selftests/landlock/fs_test.c b/tools/testing/selftests/landlock/fs_test.c
index 1ac41bfa7382..0568d1193492 100644
--- a/tools/testing/selftests/landlock/fs_test.c
+++ b/tools/testing/selftests/landlock/fs_test.c
@@ -381,7 +381,7 @@ TEST_F_FORK(layout1, inval)
LANDLOCK_ACCESS_FS_WRITE_FILE | \
LANDLOCK_ACCESS_FS_READ_FILE)

-#define ACCESS_LAST LANDLOCK_ACCESS_FS_MAKE_SYM
+#define ACCESS_LAST LANDLOCK_ACCESS_FS_REFER

#define ACCESS_ALL ( \
ACCESS_FILE | \
@@ -394,6 +394,7 @@ TEST_F_FORK(layout1, inval)
LANDLOCK_ACCESS_FS_MAKE_SOCK | \
LANDLOCK_ACCESS_FS_MAKE_FIFO | \
LANDLOCK_ACCESS_FS_MAKE_BLOCK | \
+ LANDLOCK_ACCESS_FS_MAKE_SYM | \
ACCESS_LAST)

TEST_F_FORK(layout1, file_access_rights)
--
2.35.1
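
A note on the error codes documented for current_check_refer_path() in
this patch: EACCES means the operation cannot be done at all, while
EXDEV means a direct link or rename is refused but a manual copy may
still be allowed. The following is a minimal user-space sketch of how a
caller might act on that distinction. The enum and both helper names
are hypothetical and only serve the illustration; they are not part of
Landlock or of this series.

```c
#include <errno.h>
#include <stdio.h>

/* Hypothetical outcome categories for a caller (not a Landlock API). */
enum move_action {
	MOVE_DONE,		/* rename(2) succeeded */
	MOVE_FALLBACK_COPY,	/* EXDEV: a copy + unlink may still work */
	MOVE_ABORT,		/* EACCES or any other error: give up */
};

/* Maps a rename(2) return value and errno to the caller's next step. */
static enum move_action classify_rename_result(int ret, int err)
{
	if (ret == 0)
		return MOVE_DONE;
	if (err == EXDEV)
		return MOVE_FALLBACK_COPY;
	return MOVE_ABORT;
}

/* Tries a direct rename and reports what the caller should do next. */
static enum move_action try_move(const char *src, const char *dst)
{
	const int ret = rename(src, dst);

	return classify_rename_result(ret, ret ? errno : 0);
}
```

EXDEV is also what rename(2) returns for a real cross-mount move, so a
caller that already falls back to copying in that case should need no
Landlock-specific handling.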

2022-02-22 05:10:24

by Mickaël Salaün

Subject: [PATCH v1 05/11] landlock: Move filesystem helpers and add a new one

From: Mickaël Salaün <[email protected]>

Move the SB_NOUSER and IS_PRIVATE dentry check to a standalone
is_nouser_or_private() helper. This will be useful for a following
commit.

Move get_mode_access() and maybe_remove() to make them usable by new
code provided by a following commit.

Signed-off-by: Mickaël Salaün <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
---
security/landlock/fs.c | 87 ++++++++++++++++++++++--------------------
1 file changed, 46 insertions(+), 41 deletions(-)

diff --git a/security/landlock/fs.c b/security/landlock/fs.c
index 9662f9fb3cd0..3886f9ad1a60 100644
--- a/security/landlock/fs.c
+++ b/security/landlock/fs.c
@@ -257,6 +257,18 @@ static inline bool unmask_layers(const struct landlock_rule *const rule,
return false;
}

+static inline bool is_nouser_or_private(const struct dentry *dentry)
+{
+ /*
+ * Allows access to pseudo filesystems that will never be mountable
+ * (e.g. sockfs, pipefs), but can still be reachable through
+ * /proc/<pid>/fd/<file-descriptor> .
+ */
+ return (dentry->d_sb->s_flags & SB_NOUSER) ||
+ (d_is_positive(dentry) &&
+ unlikely(IS_PRIVATE(d_backing_inode(dentry))));
+}
+
static int check_access_path(const struct landlock_ruleset *const domain,
const struct path *const path,
const access_mask_t access_request)
@@ -270,14 +282,7 @@ static int check_access_path(const struct landlock_ruleset *const domain,
return 0;
if (WARN_ON_ONCE(!domain || !path))
return 0;
- /*
- * Allows access to pseudo filesystems that will never be mountable
- * (e.g. sockfs, pipefs), but can still be reachable through
- * /proc/<pid>/fd/<file-descriptor> .
- */
- if ((path->dentry->d_sb->s_flags & SB_NOUSER) ||
- (d_is_positive(path->dentry) &&
- unlikely(IS_PRIVATE(d_backing_inode(path->dentry)))))
+ if (is_nouser_or_private(path->dentry))
return 0;
if (WARN_ON_ONCE(domain->num_layers < 1))
return -EACCES;
@@ -356,6 +361,39 @@ static inline int current_check_access_path(const struct path *const path,
return check_access_path(dom, path, access_request);
}

+static inline access_mask_t get_mode_access(const umode_t mode)
+{
+ switch (mode & S_IFMT) {
+ case S_IFLNK:
+ return LANDLOCK_ACCESS_FS_MAKE_SYM;
+ case 0:
+ /* A zero mode translates to S_IFREG. */
+ case S_IFREG:
+ return LANDLOCK_ACCESS_FS_MAKE_REG;
+ case S_IFDIR:
+ return LANDLOCK_ACCESS_FS_MAKE_DIR;
+ case S_IFCHR:
+ return LANDLOCK_ACCESS_FS_MAKE_CHAR;
+ case S_IFBLK:
+ return LANDLOCK_ACCESS_FS_MAKE_BLOCK;
+ case S_IFIFO:
+ return LANDLOCK_ACCESS_FS_MAKE_FIFO;
+ case S_IFSOCK:
+ return LANDLOCK_ACCESS_FS_MAKE_SOCK;
+ default:
+ WARN_ON_ONCE(1);
+ return 0;
+ }
+}
+
+static inline access_mask_t maybe_remove(const struct dentry *const dentry)
+{
+ if (d_is_negative(dentry))
+ return 0;
+ return d_is_dir(dentry) ? LANDLOCK_ACCESS_FS_REMOVE_DIR :
+ LANDLOCK_ACCESS_FS_REMOVE_FILE;
+}
+
/* Inode hooks */

static void hook_inode_free_security(struct inode *const inode)
@@ -549,31 +587,6 @@ static int hook_sb_pivotroot(const struct path *const old_path,

/* Path hooks */

-static inline access_mask_t get_mode_access(const umode_t mode)
-{
- switch (mode & S_IFMT) {
- case S_IFLNK:
- return LANDLOCK_ACCESS_FS_MAKE_SYM;
- case 0:
- /* A zero mode translates to S_IFREG. */
- case S_IFREG:
- return LANDLOCK_ACCESS_FS_MAKE_REG;
- case S_IFDIR:
- return LANDLOCK_ACCESS_FS_MAKE_DIR;
- case S_IFCHR:
- return LANDLOCK_ACCESS_FS_MAKE_CHAR;
- case S_IFBLK:
- return LANDLOCK_ACCESS_FS_MAKE_BLOCK;
- case S_IFIFO:
- return LANDLOCK_ACCESS_FS_MAKE_FIFO;
- case S_IFSOCK:
- return LANDLOCK_ACCESS_FS_MAKE_SOCK;
- default:
- WARN_ON_ONCE(1);
- return 0;
- }
-}
-
/*
* Creating multiple links or renaming may lead to privilege escalations if not
* handled properly. Indeed, we must be sure that the source doesn't gain more
@@ -601,14 +614,6 @@ static int hook_path_link(struct dentry *const old_dentry,
get_mode_access(d_backing_inode(old_dentry)->i_mode));
}

-static inline access_mask_t maybe_remove(const struct dentry *const dentry)
-{
- if (d_is_negative(dentry))
- return 0;
- return d_is_dir(dentry) ? LANDLOCK_ACCESS_FS_REMOVE_DIR :
- LANDLOCK_ACCESS_FS_REMOVE_FILE;
-}
-
static int hook_path_rename(const struct path *const old_dir,
struct dentry *const old_dentry,
const struct path *const new_dir,
--
2.35.1
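
For readers who want to experiment with the mapping done by
get_mode_access() outside the kernel, here is a rough user-space
analogue. The EX_MAKE_* constants are stand-in bit values chosen for
this sketch only; real code should use the LANDLOCK_ACCESS_FS_MAKE_*
definitions from <linux/landlock.h>. The function name is hypothetical.

```c
#define _DEFAULT_SOURCE
#include <stdint.h>
#include <sys/stat.h>

/* Stand-in bits for illustration; not the UAPI values. */
enum {
	EX_MAKE_CHAR  = 1 << 0,
	EX_MAKE_DIR   = 1 << 1,
	EX_MAKE_REG   = 1 << 2,
	EX_MAKE_SOCK  = 1 << 3,
	EX_MAKE_FIFO  = 1 << 4,
	EX_MAKE_BLOCK = 1 << 5,
	EX_MAKE_SYM   = 1 << 6,
};

/* Returns the "make" right that matches the file type bits of @mode. */
static uint64_t ex_mode_to_make_access(mode_t mode)
{
	switch (mode & S_IFMT) {
	case S_IFLNK:
		return EX_MAKE_SYM;
	case 0:
		/* A zero file type is treated as a regular file. */
	case S_IFREG:
		return EX_MAKE_REG;
	case S_IFDIR:
		return EX_MAKE_DIR;
	case S_IFCHR:
		return EX_MAKE_CHAR;
	case S_IFBLK:
		return EX_MAKE_BLOCK;
	case S_IFIFO:
		return EX_MAKE_FIFO;
	case S_IFSOCK:
		return EX_MAKE_SOCK;
	default:
		/* Unknown type: request nothing (the kernel helper warns). */
		return 0;
	}
}
```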

2022-02-22 05:14:22

by Mickaël Salaün

Subject: [PATCH v1 08/11] samples/landlock: Add support for file reparenting

From: Mickaël Salaün <[email protected]>

Add LANDLOCK_ACCESS_FS_REFER to the "roughly write" access rights and
leverage the Landlock ABI version to only try to enforce it if it is
supported by the running kernel.

Signed-off-by: Mickaël Salaün <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
---
samples/landlock/sandboxer.c | 37 +++++++++++++++++++++++++-----------
1 file changed, 26 insertions(+), 11 deletions(-)

diff --git a/samples/landlock/sandboxer.c b/samples/landlock/sandboxer.c
index 7a15910d2171..8509543fcbbb 100644
--- a/samples/landlock/sandboxer.c
+++ b/samples/landlock/sandboxer.c
@@ -153,16 +153,21 @@ static int populate_ruleset(
LANDLOCK_ACCESS_FS_MAKE_SOCK | \
LANDLOCK_ACCESS_FS_MAKE_FIFO | \
LANDLOCK_ACCESS_FS_MAKE_BLOCK | \
- LANDLOCK_ACCESS_FS_MAKE_SYM)
+ LANDLOCK_ACCESS_FS_MAKE_SYM | \
+ LANDLOCK_ACCESS_FS_REFER)
+
+#define ACCESS_ABI_2 ( \
+ LANDLOCK_ACCESS_FS_REFER)

int main(const int argc, char *const argv[], char *const *const envp)
{
const char *cmd_path;
char *const *cmd_argv;
- int ruleset_fd;
+ int ruleset_fd, abi;
+ __u64 access_fs_ro = ACCESS_FS_ROUGHLY_READ,
+ access_fs_rw = ACCESS_FS_ROUGHLY_READ | ACCESS_FS_ROUGHLY_WRITE;
struct landlock_ruleset_attr ruleset_attr = {
- .handled_access_fs = ACCESS_FS_ROUGHLY_READ |
- ACCESS_FS_ROUGHLY_WRITE,
+ .handled_access_fs = access_fs_rw,
};

if (argc < 2) {
@@ -183,11 +188,11 @@ int main(const int argc, char *const argv[], char *const *const envp)
return 1;
}

- ruleset_fd = landlock_create_ruleset(&ruleset_attr, sizeof(ruleset_attr), 0);
- if (ruleset_fd < 0) {
+ abi = landlock_create_ruleset(NULL, 0, LANDLOCK_CREATE_RULESET_VERSION);
+ if (abi < 0) {
const int err = errno;

- perror("Failed to create a ruleset");
+ perror("Failed to check Landlock compatibility");
switch (err) {
case ENOSYS:
fprintf(stderr, "Hint: Landlock is not supported by the current kernel. "
@@ -205,12 +210,22 @@ int main(const int argc, char *const argv[], char *const *const envp)
}
return 1;
}
- if (populate_ruleset(ENV_FS_RO_NAME, ruleset_fd,
- ACCESS_FS_ROUGHLY_READ)) {
+ /* Best-effort security. */
+ if (abi < 2) {
+ ruleset_attr.handled_access_fs &= ~ACCESS_ABI_2;
+ access_fs_ro &= ~ACCESS_ABI_2;
+ access_fs_rw &= ~ACCESS_ABI_2;
+ }
+
+ ruleset_fd = landlock_create_ruleset(&ruleset_attr, sizeof(ruleset_attr), 0);
+ if (ruleset_fd < 0) {
+ perror("Failed to create a ruleset");
+ return 1;
+ }
+ if (populate_ruleset(ENV_FS_RO_NAME, ruleset_fd, access_fs_ro)) {
goto err_close_ruleset;
}
- if (populate_ruleset(ENV_FS_RW_NAME, ruleset_fd,
- ACCESS_FS_ROUGHLY_READ | ACCESS_FS_ROUGHLY_WRITE)) {
+ if (populate_ruleset(ENV_FS_RW_NAME, ruleset_fd, access_fs_rw)) {
goto err_close_ruleset;
}
if (prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0)) {
--
2.35.1
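
The best-effort logic added to the sandboxer can be reduced to a small
pure function, which may be easier to reuse and to test. This is only a
sketch: EX_ACCESS_FS_REFER is a stand-in bit for this example (real
code should use LANDLOCK_ACCESS_FS_REFER from <linux/landlock.h>), and
ex_best_effort_mask() is a hypothetical helper name.

```c
#include <stdint.h>

/* Stand-in bit for illustration; not taken from the UAPI header. */
#define EX_ACCESS_FS_REFER	(1ULL << 13)

/* Access rights that only exist starting with ABI version 2. */
#define EX_ACCESS_ABI_2		(EX_ACCESS_FS_REFER)

/*
 * Drops from @wanted the access rights that a kernel reporting @abi
 * does not know about, so that ruleset creation does not fail.
 */
static uint64_t ex_best_effort_mask(uint64_t wanted, int abi)
{
	if (abi < 2)
		wanted &= ~EX_ACCESS_ABI_2;
	return wanted;
}
```

The abi argument is expected to come from
landlock_create_ruleset(NULL, 0, LANDLOCK_CREATE_RULESET_VERSION), as
done in the patch above. The same mask should be applied to
handled_access_fs and to the access rights of every rule.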

2022-02-22 05:20:33

by Mickaël Salaün

Subject: [PATCH v1 07/11] selftest/landlock: Add 6 new test suites dedicated to file reparenting

From: Mickaël Salaün <[email protected]>

These test suites try to check all edge cases for directory and file
renaming or linking involving a new parent directory, with and without
LANDLOCK_ACCESS_FS_REFER and other access rights.

layout1:
* reparent_refer: Tests simple FS_REFER usage.
* reparent_link: Tests a mix of FS_MAKE_REG and FS_REFER with links.
* reparent_rename: Tests a mix of FS_MAKE_REG and FS_REFER with renames.
* reparent_exdev_layers: Tests with two layers.
* reparent_dom_superset: Tests access partial ordering.

layout1_bind:
* reparent_cross_mount: Tests FS_REFER propagation across mount points.

Signed-off-by: Mickaël Salaün <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
---
tools/testing/selftests/landlock/fs_test.c | 522 +++++++++++++++++++++
1 file changed, 522 insertions(+)

diff --git a/tools/testing/selftests/landlock/fs_test.c b/tools/testing/selftests/landlock/fs_test.c
index 0568d1193492..c42fcd9e62ec 100644
--- a/tools/testing/selftests/landlock/fs_test.c
+++ b/tools/testing/selftests/landlock/fs_test.c
@@ -1851,6 +1851,491 @@ TEST_F_FORK(layout1, rename_dir)
ASSERT_EQ(0, rmdir(dir_s1d3));
}

+TEST_F_FORK(layout1, reparent_refer)
+{
+ const struct rule layer1[] = {
+ {
+ .path = dir_s1d2,
+ .access = LANDLOCK_ACCESS_FS_REFER,
+ },
+ {
+ .path = dir_s2d2,
+ .access = LANDLOCK_ACCESS_FS_REFER,
+ },
+ {}
+ };
+ int ruleset_fd = create_ruleset(_metadata, LANDLOCK_ACCESS_FS_REFER,
+ layer1);
+
+ ASSERT_LE(0, ruleset_fd);
+ enforce_ruleset(_metadata, ruleset_fd);
+ ASSERT_EQ(0, close(ruleset_fd));
+
+ ASSERT_EQ(-1, rename(dir_s1d2, dir_s2d1));
+ ASSERT_EQ(EXDEV, errno);
+ ASSERT_EQ(-1, rename(dir_s1d2, dir_s2d2));
+ ASSERT_EQ(EXDEV, errno);
+ ASSERT_EQ(-1, rename(dir_s1d2, dir_s2d3));
+ ASSERT_EQ(EXDEV, errno);
+
+ ASSERT_EQ(-1, rename(dir_s1d3, dir_s2d1));
+ ASSERT_EQ(EXDEV, errno);
+ ASSERT_EQ(-1, rename(dir_s1d3, dir_s2d2));
+ ASSERT_EQ(EXDEV, errno);
+ /*
+ * Moving should only be allowed when the source and the destination
+ * parent directory have REFER.
+ */
+ ASSERT_EQ(-1, rename(dir_s1d3, dir_s2d3));
+ ASSERT_EQ(ENOTEMPTY, errno);
+ ASSERT_EQ(0, unlink(file1_s2d3));
+ ASSERT_EQ(0, unlink(file2_s2d3));
+ ASSERT_EQ(0, rename(dir_s1d3, dir_s2d3));
+}
+
+TEST_F_FORK(layout1, reparent_link)
+{
+ const struct rule layer1[] = {
+ {
+ .path = dir_s1d2,
+ .access = LANDLOCK_ACCESS_FS_MAKE_REG,
+ },
+ {
+ .path = dir_s1d3,
+ .access = LANDLOCK_ACCESS_FS_REFER,
+ },
+ {
+ .path = dir_s2d2,
+ .access = LANDLOCK_ACCESS_FS_REFER,
+ },
+ {
+ .path = dir_s2d3,
+ .access = LANDLOCK_ACCESS_FS_MAKE_REG,
+ },
+ {}
+ };
+ const int ruleset_fd = create_ruleset(_metadata,
+ LANDLOCK_ACCESS_FS_MAKE_REG | LANDLOCK_ACCESS_FS_REFER,
+ layer1);
+
+ ASSERT_LE(0, ruleset_fd);
+ enforce_ruleset(_metadata, ruleset_fd);
+ ASSERT_EQ(0, close(ruleset_fd));
+
+ ASSERT_EQ(0, unlink(file1_s1d1));
+ ASSERT_EQ(0, unlink(file1_s1d2));
+ ASSERT_EQ(0, unlink(file1_s1d3));
+
+ /* Denies linking because of missing MAKE_REG. */
+ ASSERT_EQ(-1, link(file2_s1d1, file1_s1d1));
+ ASSERT_EQ(EACCES, errno);
+ /* Denies linking because of missing source and destination REFER. */
+ ASSERT_EQ(-1, link(file1_s2d1, file1_s1d2));
+ ASSERT_EQ(EXDEV, errno);
+ /* Denies linking because of missing source REFER. */
+ ASSERT_EQ(-1, link(file1_s2d1, file1_s1d3));
+ ASSERT_EQ(EXDEV, errno);
+
+ /* Denies linking because of missing MAKE_REG. */
+ ASSERT_EQ(-1, link(file1_s2d2, file1_s1d1));
+ ASSERT_EQ(EACCES, errno);
+ /* Denies linking because of missing destination REFER. */
+ ASSERT_EQ(-1, link(file1_s2d2, file1_s1d2));
+ ASSERT_EQ(EXDEV, errno);
+
+ /* Allows linking because of REFER and MAKE_REG. */
+ ASSERT_EQ(0, link(file1_s2d2, file1_s1d3));
+ ASSERT_EQ(0, unlink(file1_s2d2));
+ /* Reverse linking denied because of missing MAKE_REG. */
+ ASSERT_EQ(-1, link(file1_s1d3, file1_s2d2));
+ ASSERT_EQ(EACCES, errno);
+ ASSERT_EQ(0, unlink(file1_s2d3));
+ /* Checks reverse linking. */
+ ASSERT_EQ(0, link(file1_s1d3, file1_s2d3));
+ ASSERT_EQ(0, unlink(file1_s1d3));
+
+ /*
+ * This is OK for a file link, but it should not be allowed for a
+ * directory rename (because of the superset of access rights).
+ */
+ ASSERT_EQ(0, link(file1_s2d3, file1_s1d3));
+ ASSERT_EQ(0, unlink(file1_s1d3));
+
+ ASSERT_EQ(-1, link(file2_s1d2, file1_s1d3));
+ ASSERT_EQ(EXDEV, errno);
+ ASSERT_EQ(-1, link(file2_s1d3, file1_s1d2));
+ ASSERT_EQ(EXDEV, errno);
+
+ ASSERT_EQ(0, link(file2_s1d2, file1_s1d2));
+ ASSERT_EQ(0, link(file2_s1d3, file1_s1d3));
+}
+
+TEST_F_FORK(layout1, reparent_rename)
+{
+ /* Same rules as for reparent_link. */
+ const struct rule layer1[] = {
+ {
+ .path = dir_s1d2,
+ .access = LANDLOCK_ACCESS_FS_MAKE_REG,
+ },
+ {
+ .path = dir_s1d3,
+ .access = LANDLOCK_ACCESS_FS_REFER,
+ },
+ {
+ .path = dir_s2d2,
+ .access = LANDLOCK_ACCESS_FS_REFER,
+ },
+ {
+ .path = dir_s2d3,
+ .access = LANDLOCK_ACCESS_FS_MAKE_REG,
+ },
+ {}
+ };
+ const int ruleset_fd = create_ruleset(_metadata,
+ LANDLOCK_ACCESS_FS_MAKE_REG | LANDLOCK_ACCESS_FS_REFER,
+ layer1);
+
+ ASSERT_LE(0, ruleset_fd);
+ enforce_ruleset(_metadata, ruleset_fd);
+ ASSERT_EQ(0, close(ruleset_fd));
+
+ ASSERT_EQ(0, unlink(file1_s1d2));
+ ASSERT_EQ(0, unlink(file1_s1d3));
+
+ /* Denies renaming because of missing MAKE_REG. */
+ ASSERT_EQ(-1, renameat2(AT_FDCWD, file2_s1d1, AT_FDCWD, file1_s1d1,
+ RENAME_EXCHANGE));
+ ASSERT_EQ(EACCES, errno);
+ ASSERT_EQ(-1, renameat2(AT_FDCWD, file1_s1d1, AT_FDCWD, file2_s1d1,
+ RENAME_EXCHANGE));
+ ASSERT_EQ(EACCES, errno);
+ ASSERT_EQ(0, unlink(file1_s1d1));
+ ASSERT_EQ(-1, rename(file2_s1d1, file1_s1d1));
+ ASSERT_EQ(EACCES, errno);
+ /* Even denies same file exchange. */
+ ASSERT_EQ(-1, renameat2(AT_FDCWD, file2_s1d1, AT_FDCWD, file2_s1d1,
+ RENAME_EXCHANGE));
+ ASSERT_EQ(EACCES, errno);
+
+ /* Denies renaming because of missing source and destination REFER. */
+ ASSERT_EQ(-1, rename(file1_s2d1, file1_s1d2));
+ ASSERT_EQ(EXDEV, errno);
+ /*
+ * Denies renaming because of missing MAKE_REG, source and destination
+ * REFER.
+ */
+ ASSERT_EQ(-1, renameat2(AT_FDCWD, file1_s2d1, AT_FDCWD, file2_s1d1,
+ RENAME_EXCHANGE));
+ ASSERT_EQ(EACCES, errno);
+ ASSERT_EQ(-1, renameat2(AT_FDCWD, file2_s1d1, AT_FDCWD, file1_s2d1,
+ RENAME_EXCHANGE));
+ ASSERT_EQ(EACCES, errno);
+
+ /* Denies renaming because of missing source REFER. */
+ ASSERT_EQ(-1, rename(file1_s2d1, file1_s1d3));
+ ASSERT_EQ(EXDEV, errno);
+ /* Denies renaming because of missing MAKE_REG. */
+ ASSERT_EQ(-1, renameat2(AT_FDCWD, file1_s2d1, AT_FDCWD, file2_s1d3,
+ RENAME_EXCHANGE));
+ ASSERT_EQ(EACCES, errno);
+
+ /* Denies renaming because of missing MAKE_REG. */
+ ASSERT_EQ(-1, rename(file1_s2d2, file1_s1d1));
+ ASSERT_EQ(EACCES, errno);
+ /* Denies renaming because of missing destination REFER. */
+ ASSERT_EQ(-1, rename(file1_s2d2, file1_s1d2));
+ ASSERT_EQ(EXDEV, errno);
+
+ /* Denies exchange because of one missing MAKE_REG. */
+ ASSERT_EQ(-1, renameat2(AT_FDCWD, file1_s2d2, AT_FDCWD, file2_s1d3,
+ RENAME_EXCHANGE));
+ ASSERT_EQ(EACCES, errno);
+ /* Allows renaming because of REFER and MAKE_REG. */
+ ASSERT_EQ(0, rename(file1_s2d2, file1_s1d3));
+
+ /* Reverse renaming denied because of missing MAKE_REG. */
+ ASSERT_EQ(-1, rename(file1_s1d3, file1_s2d2));
+ ASSERT_EQ(EACCES, errno);
+ ASSERT_EQ(0, unlink(file1_s2d3));
+ ASSERT_EQ(0, rename(file1_s1d3, file1_s2d3));
+
+ /* Tests reverse renaming. */
+ ASSERT_EQ(0, rename(file1_s2d3, file1_s1d3));
+ ASSERT_EQ(0, renameat2(AT_FDCWD, file2_s2d3, AT_FDCWD, file1_s1d3,
+ RENAME_EXCHANGE));
+ ASSERT_EQ(0, rename(file1_s1d3, file1_s2d3));
+
+ /*
+ * This is OK for a file rename, but it should not be allowed for a
+ * directory rename (because of the superset of access rights).
+ */
+ ASSERT_EQ(0, rename(file1_s2d3, file1_s1d3));
+ ASSERT_EQ(0, rename(file1_s1d3, file1_s2d3));
+
+ /*
+ * Tests superset restrictions applied to directories. Not only the
+ * dir_s2d3's parent (dir_s2d2) should be taken into account but also
+ * access rights tied to dir_s2d3. dir_s2d2 is missing one access right
+ * compared to dir_s1d3/file1_s1d3 (MAKE_REG) but it is provided
+ * directly by the moved dir_s2d3.
+ */
+ ASSERT_EQ(0, rename(dir_s2d3, file1_s1d3));
+ ASSERT_EQ(0, rename(file1_s1d3, dir_s2d3));
+ /*
+ * The first rename is allowed but not the exchange because dir_s1d3's
+ * parent (dir_s1d2) doesn't have REFER.
+ */
+ ASSERT_EQ(-1, renameat2(AT_FDCWD, file1_s2d3, AT_FDCWD, dir_s1d3,
+ RENAME_EXCHANGE));
+ ASSERT_EQ(EXDEV, errno);
+ ASSERT_EQ(-1, renameat2(AT_FDCWD, dir_s1d3, AT_FDCWD, file1_s2d3,
+ RENAME_EXCHANGE));
+ ASSERT_EQ(EXDEV, errno);
+ ASSERT_EQ(-1, rename(file1_s2d3, dir_s1d3));
+ ASSERT_EQ(EXDEV, errno);
+
+ ASSERT_EQ(-1, rename(file2_s1d2, file1_s1d3));
+ ASSERT_EQ(EXDEV, errno);
+ ASSERT_EQ(-1, rename(file2_s1d3, file1_s1d2));
+ ASSERT_EQ(EXDEV, errno);
+
+ /* Renaming in the same directory is always allowed. */
+ ASSERT_EQ(0, rename(file2_s1d2, file1_s1d2));
+ ASSERT_EQ(0, rename(file2_s1d3, file1_s1d3));
+
+ ASSERT_EQ(0, unlink(file1_s1d2));
+ /* Denies because of missing source MAKE_REG and destination REFER. */
+ ASSERT_EQ(-1, rename(dir_s2d3, file1_s1d2));
+ ASSERT_EQ(EXDEV, errno);
+
+ ASSERT_EQ(0, unlink(file1_s1d3));
+ /* Denies because of missing source MAKE_REG and REFER. */
+ ASSERT_EQ(-1, rename(dir_s2d2, file1_s1d3));
+ ASSERT_EQ(EXDEV, errno);
+}
+
+TEST_F_FORK(layout1, reparent_exdev_layers)
+{
+ const struct rule layer1[] = {
+ {
+ .path = dir_s1d2,
+ .access = LANDLOCK_ACCESS_FS_REFER,
+ },
+ {
+ /* Interesting for the layer2 tests. */
+ .path = dir_s1d3,
+ .access = LANDLOCK_ACCESS_FS_MAKE_REG,
+ },
+ {
+ .path = dir_s2d2,
+ .access = LANDLOCK_ACCESS_FS_REFER,
+ },
+ {
+ .path = dir_s2d3,
+ .access = LANDLOCK_ACCESS_FS_MAKE_REG,
+ },
+ {}
+ };
+ const struct rule layer2[] = {
+ {
+ .path = dir_s2d3,
+ .access = LANDLOCK_ACCESS_FS_MAKE_DIR,
+ },
+ {}
+ };
+ int ruleset_fd = create_ruleset(_metadata,
+ LANDLOCK_ACCESS_FS_MAKE_REG | LANDLOCK_ACCESS_FS_REFER,
+ layer1);
+
+ ASSERT_LE(0, ruleset_fd);
+ enforce_ruleset(_metadata, ruleset_fd);
+ ASSERT_EQ(0, close(ruleset_fd));
+
+ /* Checks EACCES predominance over EXDEV. */
+ ASSERT_EQ(-1, rename(file1_s1d1, file1_s2d2));
+ ASSERT_EQ(EACCES, errno);
+ ASSERT_EQ(-1, rename(file1_s1d2, file1_s2d2));
+ ASSERT_EQ(EACCES, errno);
+ ASSERT_EQ(-1, rename(file1_s1d1, file1_s2d3));
+ ASSERT_EQ(EXDEV, errno);
+ ASSERT_EQ(0, rename(file1_s1d2, file1_s2d3));
+
+ /* Without REFER source. */
+ ASSERT_EQ(-1, rename(dir_s1d1, file1_s2d2));
+ ASSERT_EQ(EXDEV, errno);
+ ASSERT_EQ(-1, rename(dir_s1d2, file1_s2d2));
+ ASSERT_EQ(EXDEV, errno);
+
+ /*
+ * Moving the dir_s1d3 directory below dir_s2d2 is allowed by Landlock
+ * because it doesn't inherit new access rights.
+ */
+ ASSERT_EQ(-1, rename(dir_s1d3, file1_s2d2));
+ ASSERT_EQ(ENOTDIR, errno);
+ ASSERT_EQ(0, unlink(file1_s2d2));
+ ASSERT_EQ(0, rename(dir_s1d3, file1_s2d2));
+ ASSERT_EQ(0, rename(file1_s2d2, dir_s1d3));
+
+ /*
+ * Moving the dir_s1d3 directory below dir_s2d3 is allowed, even if it
+ * gets a new inherited access right (MAKE_REG), because MAKE_REG is
+ * already allowed for dir_s1d3.
+ */
+ ASSERT_EQ(-1, rename(dir_s1d3, file1_s2d3));
+ ASSERT_EQ(ENOTDIR, errno);
+ ASSERT_EQ(0, unlink(file1_s2d3));
+ ASSERT_EQ(0, rename(dir_s1d3, file1_s2d3));
+ ASSERT_EQ(0, rename(file1_s2d3, dir_s1d3));
+
+ /*
+ * However, moving the file1_s1d3 file below dir_s2d3 is allowed
+ * because it cannot inherit the MAKE_REG right (which is dedicated to
+ * directories).
+ */
+ ASSERT_EQ(0, rename(file1_s1d3, file1_s2d3));
+
+ /*
+ * Same checks as before but with a second layer and a new MAKE_DIR
+ * rule (and no explicit handling of REFER).
+ */
+ ruleset_fd = create_ruleset(_metadata, LANDLOCK_ACCESS_FS_MAKE_DIR,
+ layer2);
+ ASSERT_LE(0, ruleset_fd);
+ enforce_ruleset(_metadata, ruleset_fd);
+ ASSERT_EQ(0, close(ruleset_fd));
+
+ /* Checks EACCES predominance over EXDEV. */
+ ASSERT_EQ(-1, rename(file1_s1d1, file1_s2d2));
+ ASSERT_EQ(EACCES, errno);
+ /* Checks with actual file2_s1d2. */
+ ASSERT_EQ(-1, rename(file2_s1d2, file1_s2d2));
+ ASSERT_EQ(EACCES, errno);
+ ASSERT_EQ(-1, rename(file1_s1d1, file1_s2d3));
+ ASSERT_EQ(EXDEV, errno);
+ ASSERT_EQ(0, rename(file2_s1d2, file1_s2d3));
+
+ /* Without REFER source, EACCES wins over EXDEV. */
+ ASSERT_EQ(-1, rename(dir_s1d1, file1_s2d2));
+ ASSERT_EQ(EACCES, errno);
+ ASSERT_EQ(-1, rename(dir_s1d2, file1_s2d2));
+ ASSERT_EQ(EACCES, errno);
+
+ /*
+ * Moving the dir_s1d3 directory below dir_s2d2 is now denied because
+ * MAKE_DIR is not tied to dir_s2d2.
+ */
+ ASSERT_EQ(-1, rename(dir_s1d3, file1_s2d2));
+ ASSERT_EQ(EACCES, errno);
+
+ /*
+ * Moving the dir_s1d3 directory below dir_s2d3 is forbidden because it
+ * would grant MAKE_REG and MAKE_DIR rights to it.
+ */
+ ASSERT_EQ(-1, rename(dir_s1d3, file1_s2d3));
+ ASSERT_EQ(EXDEV, errno);
+
+ /*
+ * However, moving the file2_s1d3 file below dir_s2d3 is allowed
+ * because it cannot inherit MAKE_REG nor MAKE_DIR rights (which are
+ * dedicated to directories).
+ */
+ ASSERT_EQ(0, rename(file2_s1d3, file1_s2d3));
+}
+
+TEST_F_FORK(layout1, reparent_dom_superset)
+{
+ const struct rule layer1[] = {
+ {
+ .path = dir_s1d2,
+ .access = LANDLOCK_ACCESS_FS_REFER,
+ },
+ {
+ .path = file1_s1d2,
+ .access = LANDLOCK_ACCESS_FS_EXECUTE,
+ },
+ {
+ .path = dir_s1d3,
+ .access = LANDLOCK_ACCESS_FS_MAKE_SOCK |
+ LANDLOCK_ACCESS_FS_EXECUTE,
+ },
+ {
+ .path = dir_s2d2,
+ .access = LANDLOCK_ACCESS_FS_REFER |
+ LANDLOCK_ACCESS_FS_EXECUTE |
+ LANDLOCK_ACCESS_FS_MAKE_SOCK,
+ },
+ {
+ .path = dir_s2d3,
+ .access = LANDLOCK_ACCESS_FS_READ_FILE |
+ LANDLOCK_ACCESS_FS_MAKE_FIFO,
+ },
+ {}
+ };
+ int ruleset_fd = create_ruleset(_metadata,
+ LANDLOCK_ACCESS_FS_REFER |
+ LANDLOCK_ACCESS_FS_EXECUTE |
+ LANDLOCK_ACCESS_FS_MAKE_SOCK |
+ LANDLOCK_ACCESS_FS_READ_FILE |
+ LANDLOCK_ACCESS_FS_MAKE_FIFO,
+ layer1);
+
+ ASSERT_LE(0, ruleset_fd);
+ enforce_ruleset(_metadata, ruleset_fd);
+ ASSERT_EQ(0, close(ruleset_fd));
+
+ ASSERT_EQ(-1, rename(file1_s1d2, file1_s2d1));
+ ASSERT_EQ(EXDEV, errno);
+ /*
+ * Moving file1_s1d2 beneath dir_s2d3 would grant it the READ_FILE
+ * access right.
+ */
+ ASSERT_EQ(-1, rename(file1_s1d2, file1_s2d3));
+ ASSERT_EQ(EXDEV, errno);
+ /*
+ * Moving file1_s1d2 should be allowed even if dir_s2d2 grants a
+ * superset of access rights compared to dir_s1d2, because file1_s1d2
+ * already has these access rights anyway.
+ */
+ ASSERT_EQ(0, rename(file1_s1d2, file1_s2d2));
+ ASSERT_EQ(0, rename(file1_s2d2, file1_s1d2));
+
+ ASSERT_EQ(-1, rename(dir_s1d3, file1_s2d1));
+ ASSERT_EQ(EXDEV, errno);
+ /*
+ * Moving dir_s1d3 beneath dir_s2d3 would grant it the MAKE_FIFO access
+ * right.
+ */
+ ASSERT_EQ(-1, rename(dir_s1d3, file1_s2d3));
+ ASSERT_EQ(EXDEV, errno);
+ /*
+ * Moving dir_s1d3 should be allowed even if dir_s2d2 grants a superset
+ * of access rights compared to dir_s1d2, because dir_s1d3 already has
+ * these access rights anyway.
+ */
+ ASSERT_EQ(0, rename(dir_s1d3, file1_s2d2));
+ ASSERT_EQ(0, rename(file1_s2d2, dir_s1d3));
+
+ /*
+ * Moving file1_s2d3 beneath dir_s1d2 is allowed, but moving it back
+ * will be denied because the new inherited access rights from dir_s1d2
+ * will be less than the destination (original) dir_s2d3. This is a
+ * sinkhole scenario where we cannot move back files or directories.
+ */
+ ASSERT_EQ(0, rename(file1_s2d3, file2_s1d2));
+ ASSERT_EQ(-1, rename(file2_s1d2, file1_s2d3));
+ ASSERT_EQ(EXDEV, errno);
+ ASSERT_EQ(0, unlink(file2_s1d2));
+ ASSERT_EQ(0, unlink(file2_s2d3));
+ /*
+ * Checks similar directory one-way move: dir_s2d3 loses EXECUTE and
+ * MAKE_SOCK which were inherited from dir_s2d2.
+ */
+ ASSERT_EQ(0, rename(dir_s2d3, file2_s1d2));
+ ASSERT_EQ(-1, rename(file2_s1d2, dir_s2d3));
+ ASSERT_EQ(EXDEV, errno);
+}
+
TEST_F_FORK(layout1, remove_dir)
{
const struct rule rules[] = {
@@ -2390,6 +2875,43 @@ TEST_F_FORK(layout1_bind, same_content_same_file)
ASSERT_EQ(EACCES, test_open(bind_file1_s1d3, O_WRONLY));
}

+TEST_F_FORK(layout1_bind, reparent_cross_mount)
+{
+ const struct rule layer1[] = {
+ {
+ /* dir_s2d1 is beneath the dir_s2d2 mount point. */
+ .path = dir_s2d1,
+ .access = LANDLOCK_ACCESS_FS_REFER,
+ },
+ {
+ .path = bind_dir_s1d3,
+ .access = LANDLOCK_ACCESS_FS_EXECUTE,
+ },
+ {}
+ };
+ int ruleset_fd = create_ruleset(_metadata, LANDLOCK_ACCESS_FS_REFER |
+ LANDLOCK_ACCESS_FS_EXECUTE, layer1);
+
+ ASSERT_LE(0, ruleset_fd);
+ enforce_ruleset(_metadata, ruleset_fd);
+ ASSERT_EQ(0, close(ruleset_fd));
+
+ /* Checks basic denied move. */
+ ASSERT_EQ(-1, rename(file1_s1d1, file1_s1d2));
+ ASSERT_EQ(EXDEV, errno);
+
+ /* Checks real cross-mount move (Landlock is not involved). */
+ ASSERT_EQ(-1, rename(file1_s2d1, file1_s2d2));
+ ASSERT_EQ(EXDEV, errno);
+
+ /* Checks move that will give more accesses. */
+ ASSERT_EQ(-1, rename(file1_s2d2, bind_file1_s1d3));
+ ASSERT_EQ(EXDEV, errno);
+
+ /* Checks legitimate downgrade move. */
+ ASSERT_EQ(0, rename(bind_file1_s1d3, file1_s2d2));
+}
+
#define LOWER_BASE TMP_DIR "/lower"
#define LOWER_DATA LOWER_BASE "/data"
static const char lower_fl1[] = LOWER_DATA "/fl1";
--
2.35.1

2022-02-22 05:28:41

by Mickaël Salaün

Subject: [PATCH v1 09/11] landlock: Document LANDLOCK_ACCESS_FS_REFER and ABI versioning

From: Mickaël Salaün <[email protected]>

Add LANDLOCK_ACCESS_FS_REFER to the example, and use the Landlock ABI
version to make sure it is only used if the running kernel supports
it.

Move the file renaming and linking limitation to a new "Previous
limitations" section.

Improve documentation about backward and forward compatibility,
including the rationale for the ruleset's handled_access_fs.
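
As an illustration of the best-effort approach described above, here is a
minimal user space sketch. It is not part of the patch: the sketch_* names
are hypothetical, the flag and the REFER bit are redefined locally (assumed
to match the UAPI header of this series) so that it builds without
<linux/landlock.h>, and the raw syscall is used because no libc wrapper is
assumed.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Local copy of the UAPI flag, so the sketch builds on older headers. */
#ifndef LANDLOCK_CREATE_RULESET_VERSION
#define LANDLOCK_CREATE_RULESET_VERSION (1U << 0)
#endif

/* Assumption: REFER is the 14th access right, i.e. bit 13. */
#define SKETCH_ACCESS_FS_REFER (1ULL << 13)

/* The kernel side stores access rights in a 16-bit access_mask_t. */
static_assert(SKETCH_ACCESS_FS_REFER < (1ULL << 16),
	      "REFER must fit in a 16-bit access mask");

/*
 * Hypothetical helper: returns the Landlock ABI version (>= 1), or 0 when
 * Landlock cannot be used (unknown syscall, not built in, or disabled).
 */
int sketch_landlock_abi(void)
{
#ifdef __NR_landlock_create_ruleset
	const long abi = syscall(__NR_landlock_create_ruleset, NULL,
				 (size_t)0, LANDLOCK_CREATE_RULESET_VERSION);

	return abi < 0 ? 0 : (int)abi;
#else
	return 0;
#endif
}

/* Hypothetical helper: drops the rights that the given ABI does not know. */
uint64_t sketch_filter_handled(uint64_t handled, int abi)
{
	if (abi < 2)
		handled &= ~SKETCH_ACCESS_FS_REFER;
	return handled;
}
```

A caller would typically pass the result of sketch_landlock_abi() to
sketch_filter_handled() before filling handled_access_fs, and skip
sandboxing entirely when the returned version is 0.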

Signed-off-by: Mickaël Salaün <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
---
Documentation/userspace-api/landlock.rst | 124 +++++++++++++++++++----
1 file changed, 104 insertions(+), 20 deletions(-)

diff --git a/Documentation/userspace-api/landlock.rst b/Documentation/userspace-api/landlock.rst
index f35552ff19ba..97db09d36a5c 100644
--- a/Documentation/userspace-api/landlock.rst
+++ b/Documentation/userspace-api/landlock.rst
@@ -8,7 +8,7 @@ Landlock: unprivileged access control
=====================================

:Author: Mickaël Salaün
-:Date: March 2021
+:Date: February 2022

The goal of Landlock is to enable to restrict ambient rights (e.g. global
filesystem access) for a set of processes. Because Landlock is a stackable
@@ -29,14 +29,15 @@ the thread enforcing it, and its future children.
Defining and enforcing a security policy
----------------------------------------

-We first need to create the ruleset that will contain our rules. For this
+We first need to define the ruleset that will contain our rules. For this
example, the ruleset will contain rules that only allow read actions, but write
actions will be denied. The ruleset then needs to handle both of these kind of
-actions.
+actions. This is required for backward and forward compatibility (i.e. the
+kernel and user space may not know each other's supported restrictions), hence
+the need to be explicit about the denied-by-default access rights.

.. code-block:: c

- int ruleset_fd;
struct landlock_ruleset_attr ruleset_attr = {
.handled_access_fs =
LANDLOCK_ACCESS_FS_EXECUTE |
@@ -51,9 +52,34 @@ actions.
LANDLOCK_ACCESS_FS_MAKE_SOCK |
LANDLOCK_ACCESS_FS_MAKE_FIFO |
LANDLOCK_ACCESS_FS_MAKE_BLOCK |
- LANDLOCK_ACCESS_FS_MAKE_SYM,
+ LANDLOCK_ACCESS_FS_MAKE_SYM |
+ LANDLOCK_ACCESS_FS_REFER,
};

+Because we may not know on which kernel version an application will be
+executed, it is safer to follow a best-effort security approach. Indeed, we
+should try to protect users as much as possible whatever the kernel they are
+using. To avoid binary enforcement (i.e. either all security features or
+none), we can leverage a dedicated Landlock command to get the current version
+of the Landlock ABI and adapt the handled accesses. Let's check if we should
+remove the `LANDLOCK_ACCESS_FS_REFER` access right which is only supported
+starting with the second version of the ABI.
+
+.. code-block:: c
+
+ int abi;
+
+ abi = landlock_create_ruleset(NULL, 0, LANDLOCK_CREATE_RULESET_VERSION);
+ if (abi < 2) {
+ ruleset_attr.handled_access_fs &= ~LANDLOCK_ACCESS_FS_REFER;
+ }
+
+This enables the creation of an inclusive ruleset that will contain our rules.
+
+.. code-block:: c
+
+ int ruleset_fd;
+
ruleset_fd = landlock_create_ruleset(&ruleset_attr, sizeof(ruleset_attr), 0);
if (ruleset_fd < 0) {
perror("Failed to create a ruleset");
@@ -92,6 +118,11 @@ descriptor.
return 1;
}

+It may also be required to create rules following the same logic as explained
+for the ruleset creation, by filtering access rights according to the Landlock
+ABI version. In this example, this is not required because
+`LANDLOCK_ACCESS_FS_REFER` is not allowed by any rule.
+
We now have a ruleset with one rule allowing read access to ``/usr`` while
denying all other handled accesses for the filesystem. The next step is to
restrict the current thread from gaining more privileges (e.g. thanks to a SUID
@@ -192,6 +223,56 @@ To be allowed to use :manpage:`ptrace(2)` and related syscalls on a target
process, a sandboxed process should have a subset of the target process rules,
which means the tracee must be in a sub-domain of the tracer.

+Compatibility
+=============
+
+Backward and forward compatibility
+----------------------------------
+
+Landlock is designed to be compatible with past and future versions of the
+kernel. This is achieved thanks to the system call attributes and the
+associated bitflags, particularly the ruleset's `handled_access_fs`. Making
+handled access rights explicit enables the kernel and user space to have a clear
+contract with each other. This is required to make sure sandboxing will not
+get stricter with a system update, which could break applications.
+
+Developers can subscribe to the `Landlock mailing list
+<https://subspace.kernel.org/lists.linux.dev.html>`_ to knowingly update and
+test their applications with the latest available features. In the interest of
+users, and because they may use different kernel versions, it is strongly
+encouraged to follow a best-effort security approach by checking the Landlock
+ABI version at runtime and only enforcing the supported features.
+
+Landlock ABI versions
+---------------------
+
+The Landlock ABI version can be read with the sys_landlock_create_ruleset()
+system call:
+
+.. code-block:: c
+
+ int abi;
+
+ abi = landlock_create_ruleset(NULL, 0, LANDLOCK_CREATE_RULESET_VERSION);
+ if (abi < 0) {
+ switch (errno) {
+ case ENOSYS:
+ printf("Landlock is not supported by the current kernel.\n");
+ break;
+ case EOPNOTSUPP:
+ printf("Landlock is currently disabled.\n");
+ break;
+ }
+ return 0;
+ }
+ if (abi >= 2) {
+ printf("Landlock supports LANDLOCK_ACCESS_FS_REFER.\n");
+ }
+
+The following kernel interfaces are implicitly supported by the first ABI
+version. Features only supported from a specific version are explicitly marked
+as such.
+
Kernel interface
================

@@ -228,21 +309,6 @@ Enforcing a ruleset
Current limitations
===================

-File renaming and linking
--------------------------
-
-Because Landlock targets unprivileged access controls, it is needed to properly
-handle composition of rules. Such property also implies rules nesting.
-Properly handling multiple layers of ruleset, each one of them able to restrict
-access to files, also implies to inherit the ruleset restrictions from a parent
-to its hierarchy. Because files are identified and restricted by their
-hierarchy, moving or linking a file from one directory to another implies to
-propagate the hierarchy constraints. To protect against privilege escalations
-through renaming or linking, and for the sake of simplicity, Landlock currently
-limits linking and renaming to the same directory. Future Landlock evolutions
-will enable more flexibility for renaming and linking, with dedicated ruleset
-flags.
-
Filesystem topology modification
--------------------------------

@@ -281,6 +347,24 @@ Memory usage
Kernel memory allocated to create rulesets is accounted and can be restricted
by the Documentation/admin-guide/cgroup-v1/memory.rst.

+Previous limitations
+====================
+
+File renaming and linking (ABI 1)
+---------------------------------
+
+Because Landlock targets unprivileged access controls, it is needed to properly
+handle composition of rules. Such property also implies rules nesting.
+Properly handling multiple layers of ruleset, each one of them able to restrict
+access to files, also implies to inherit the ruleset restrictions from a parent
+to its hierarchy. Because files are identified and restricted by their
+hierarchy, moving or linking a file from one directory to another implies to
+propagate the hierarchy constraints. To protect against privilege escalations
+through renaming or linking, and for the sake of simplicity, Landlock previously
+limited linking and renaming to the same directory. Starting with the Landlock
+ABI version 2, it is now possible to securely control renaming and linking
+thanks to the new `LANDLOCK_ACCESS_FS_REFER` access right.
+
Questions and answers
=====================

--
2.35.1

2022-02-22 05:29:45

by Mickaël Salaün

Subject: [PATCH v1 02/11] landlock: Reduce the maximum number of layers to 16

From: Mickaël Salaün <[email protected]>

The maximum number of nested Landlock domains is currently 64. Because
of the following fix and to help reduce the stack size, let's reduce it
to 16. This seems large enough for a lot of use cases (e.g. sandboxed
init service, spawning a sandboxed SSH service, in nested sandboxed
containers). Reducing the number of nested domains may also help to
discover misuse of Landlock (e.g. creating a domain per rule).

Add and use a dedicated layer_mask_t typedef to fit with the number of
layers. This might be useful when changing it and to keep it consistent
with the maximum number of layers.
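
The idea of tying the mask type to the layer limit can be sketched outside
of the kernel as follows. This is only an illustration under assumptions:
the sketch_* names are hypothetical, uint16_t stands in for u16, and the
C11 static_assert plays the role of the kernel's build-time check.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

#define SKETCH_MAX_NUM_LAYERS 16

typedef uint16_t sketch_layer_mask_t;

/* Build-time check: one bit per layer must fit in the mask type. */
static_assert(sizeof(sketch_layer_mask_t) * CHAR_BIT >= SKETCH_MAX_NUM_LAYERS,
	      "layer mask type too small for the maximum number of layers");

/*
 * Returns a mask with one bit set for each of the first num_layers layers,
 * clamped to the maximum number of layers.
 */
sketch_layer_mask_t sketch_all_layers(unsigned int num_layers)
{
	if (num_layers >= SKETCH_MAX_NUM_LAYERS)
		return (sketch_layer_mask_t)~0u;
	return (sketch_layer_mask_t)((1u << num_layers) - 1u);
}
```

Raising the limit without widening the typedef would then fail at build
time instead of silently dropping layers.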

Signed-off-by: Mickaël Salaün <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
---
security/landlock/fs.c | 13 +++++--------
security/landlock/limits.h | 2 +-
security/landlock/ruleset.h | 4 ++++
tools/testing/selftests/landlock/fs_test.c | 2 +-
4 files changed, 11 insertions(+), 10 deletions(-)

diff --git a/security/landlock/fs.c b/security/landlock/fs.c
index 9de2a460a762..4048e3c04d75 100644
--- a/security/landlock/fs.c
+++ b/security/landlock/fs.c
@@ -180,10 +180,10 @@ int landlock_append_fs_rule(struct landlock_ruleset *const ruleset,

/* Access-control management */

-static inline u64 unmask_layers(
+static inline layer_mask_t unmask_layers(
const struct landlock_ruleset *const domain,
const struct path *const path,
- const access_mask_t access_request, u64 layer_mask)
+ const access_mask_t access_request, layer_mask_t layer_mask)
{
const struct landlock_rule *rule;
const struct inode *inode;
@@ -209,11 +209,11 @@ static inline u64 unmask_layers(
*/
for (i = 0; i < rule->num_layers; i++) {
const struct landlock_layer *const layer = &rule->layers[i];
- const u64 layer_level = BIT_ULL(layer->level - 1);
+ const layer_mask_t layer_bit = BIT_ULL(layer->level - 1);

/* Checks that the layer grants access to the full request. */
if ((layer->access & access_request) == access_request) {
- layer_mask &= ~layer_level;
+ layer_mask &= ~layer_bit;

if (layer_mask == 0)
return layer_mask;
@@ -228,12 +228,9 @@ static int check_access_path(const struct landlock_ruleset *const domain,
{
bool allowed = false;
struct path walker_path;
- u64 layer_mask;
+ layer_mask_t layer_mask;
size_t i;

- /* Make sure all layers can be checked. */
- BUILD_BUG_ON(BITS_PER_TYPE(layer_mask) < LANDLOCK_MAX_NUM_LAYERS);
-
if (!access_request)
return 0;
if (WARN_ON_ONCE(!domain || !path))
diff --git a/security/landlock/limits.h b/security/landlock/limits.h
index 458d1de32ed5..126d1ec04d34 100644
--- a/security/landlock/limits.h
+++ b/security/landlock/limits.h
@@ -13,7 +13,7 @@
#include <linux/limits.h>
#include <uapi/linux/landlock.h>

-#define LANDLOCK_MAX_NUM_LAYERS 64
+#define LANDLOCK_MAX_NUM_LAYERS 16
#define LANDLOCK_MAX_NUM_RULES U32_MAX

#define LANDLOCK_LAST_ACCESS_FS LANDLOCK_ACCESS_FS_MAKE_SYM
diff --git a/security/landlock/ruleset.h b/security/landlock/ruleset.h
index 7e7cac68e443..0128c56ee7ff 100644
--- a/security/landlock/ruleset.h
+++ b/security/landlock/ruleset.h
@@ -23,6 +23,10 @@ typedef u16 access_mask_t;
/* Makes sure all filesystem access rights can be stored. */
static_assert(BITS_PER_TYPE(access_mask_t) >= LANDLOCK_NUM_ACCESS_FS);

+typedef u16 layer_mask_t;
+/* Makes sure all layers can be checked. */
+static_assert(BITS_PER_TYPE(layer_mask_t) >= LANDLOCK_MAX_NUM_LAYERS);
+
/**
* struct landlock_layer - Access rights for a given layer
*/
diff --git a/tools/testing/selftests/landlock/fs_test.c b/tools/testing/selftests/landlock/fs_test.c
index 10c9a1e4ebd9..99838cac970b 100644
--- a/tools/testing/selftests/landlock/fs_test.c
+++ b/tools/testing/selftests/landlock/fs_test.c
@@ -1080,7 +1080,7 @@ TEST_F_FORK(layout1, max_layers)
const int ruleset_fd = create_ruleset(_metadata, ACCESS_RW, rules);

ASSERT_LE(0, ruleset_fd);
- for (i = 0; i < 64; i++)
+ for (i = 0; i < 16; i++)
enforce_ruleset(_metadata, ruleset_fd);

for (i = 0; i < 2; i++) {
--
2.35.1

2022-02-22 05:43:07

by Mickaël Salaün

Subject: [PATCH v1 03/11] landlock: Create find_rule() from unmask_layers()

From: Mickaël Salaün <[email protected]>

This refactoring will be useful in a following commit.

Signed-off-by: Mickaël Salaün <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
---
security/landlock/fs.c | 39 +++++++++++++++++++++++++++------------
1 file changed, 27 insertions(+), 12 deletions(-)

diff --git a/security/landlock/fs.c b/security/landlock/fs.c
index 4048e3c04d75..0bcb27f2360a 100644
--- a/security/landlock/fs.c
+++ b/security/landlock/fs.c
@@ -180,23 +180,36 @@ int landlock_append_fs_rule(struct landlock_ruleset *const ruleset,

/* Access-control management */

-static inline layer_mask_t unmask_layers(
+/*
+ * The lifetime of the returned rule is tied to @domain.
+ *
+ * Returns NULL if no rule is found or if @dentry is negative.
+ */
+static inline const struct landlock_rule *find_rule(
const struct landlock_ruleset *const domain,
- const struct path *const path,
- const access_mask_t access_request, layer_mask_t layer_mask)
+ const struct dentry *const dentry)
{
const struct landlock_rule *rule;
const struct inode *inode;
- size_t i;

- if (d_is_negative(path->dentry))
- /* Ignore nonexistent leafs. */
- return layer_mask;
- inode = d_backing_inode(path->dentry);
+ /* Ignores nonexistent leafs. */
+ if (d_is_negative(dentry))
+ return NULL;
+
+ inode = d_backing_inode(dentry);
rcu_read_lock();
rule = landlock_find_rule(domain,
rcu_dereference(landlock_inode(inode)->object));
rcu_read_unlock();
+ return rule;
+}
+
+static inline layer_mask_t unmask_layers(
+ const struct landlock_rule *const rule,
+ const access_mask_t access_request, layer_mask_t layer_mask)
+{
+ size_t layer_level;
+
if (!rule)
return layer_mask;

@@ -207,8 +220,9 @@ static inline layer_mask_t unmask_layers(
* the remaining layers for each inode, from the first added layer to
* the last one.
*/
- for (i = 0; i < rule->num_layers; i++) {
- const struct landlock_layer *const layer = &rule->layers[i];
+ for (layer_level = 0; layer_level < rule->num_layers; layer_level++) {
+ const struct landlock_layer *const layer =
+ &rule->layers[layer_level];
const layer_mask_t layer_bit = BIT_ULL(layer->level - 1);

/* Checks that the layer grants access to the full request. */
@@ -266,8 +280,9 @@ static int check_access_path(const struct landlock_ruleset *const domain,
while (true) {
struct dentry *parent_dentry;

- layer_mask = unmask_layers(domain, &walker_path,
- access_request, layer_mask);
+ layer_mask = unmask_layers(find_rule(domain,
+ walker_path.dentry), access_request,
+ layer_mask);
if (layer_mask == 0) {
/* Stops when a rule from each layer grants access. */
allowed = true;
--
2.35.1

2022-02-22 05:47:04

by Mickaël Salaün

Subject: [PATCH v1 04/11] landlock: Fix same-layer rule unions

From: Mickaël Salaün <[email protected]>

The original behavior was to check if the full set of requested accesses
was allowed by at least a rule of every relevant layer. This didn't
take into account requests for multiple accesses and same-layer rules
allowing the union of these accesses in a complementary way. As a
result, multiple accesses requested on a file hierarchy matching rules
that, together, allowed these accesses, but without a unique rule
allowing all of them, were illegitimately denied. This case should be
rare in practice and it can only be triggered by the path_rename or
file_open hook implementations.

For instance, if, for the same layer, a rule allows execution
beneath /a/b and another rule allows read beneath /a, requesting access
to read and execute at the same time for /a/b should be allowed for this
layer.

This was an inconsistency because the union of same-layer rule accesses
was already allowed if requested one at a time anyway.

This fix changes the way allowed accesses are gathered over a path walk.
To take into account all these rule accesses, we store in a matrix all
layers granting the set of requested accesses, according to the handled
accesses. To avoid heap allocation, we use an array on the stack which
is 2*13 bytes. A following commit bringing the LANDLOCK_ACCESS_FS_REFER
access right will increase this size to reach 84 bytes (2*14*3) in case
of link or rename actions.
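
The matrix approach can be sketched in plain user space C as follows. This
is a simplified model and not the kernel code: all sketch_* names are
hypothetical, only two access bits are named, and a "rule" is reduced to an
array of (layer level, granted accesses) pairs found on one path component.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed for this sketch: 13 access rights, as before REFER is added. */
#define SKETCH_NUM_ACCESS 13
#define SKETCH_EXECUTE (1u << 0)
#define SKETCH_READ_FILE (1u << 2)

typedef uint16_t sketch_access_t;
typedef uint16_t sketch_layer_mask_t;

/* The matrix is 2*13 = 26 bytes, small enough to live on the stack. */
static_assert(sizeof(sketch_layer_mask_t[SKETCH_NUM_ACCESS]) == 26,
	      "unexpected matrix size");

/* Accesses granted by one rule for one layer (level starts at 1). */
struct sketch_layer {
	unsigned int level;
	sketch_access_t access;
};

/* For each requested access, records every layer that handles it. */
void sketch_init_masks(const sketch_access_t *handled, size_t num_layers,
		       sketch_access_t request,
		       sketch_layer_mask_t masks[SKETCH_NUM_ACCESS])
{
	for (size_t bit = 0; bit < SKETCH_NUM_ACCESS; bit++) {
		masks[bit] = 0;
		if (!(request & (1u << bit)))
			continue;
		for (size_t i = 0; i < num_layers; i++) {
			if (handled[i] & (1u << bit))
				masks[bit] |= (sketch_layer_mask_t)(1u << i);
		}
	}
}

/*
 * For each requested access, clears the layers for which the rule grants
 * it.  Returns true once no layer remains for any requested access.
 */
bool sketch_unmask(const struct sketch_layer *layers, size_t num_rule_layers,
		   sketch_access_t request,
		   sketch_layer_mask_t masks[SKETCH_NUM_ACCESS])
{
	bool is_empty = true;

	for (size_t bit = 0; bit < SKETCH_NUM_ACCESS; bit++) {
		if (!(request & (1u << bit)))
			continue;
		for (size_t i = 0; i < num_rule_layers; i++) {
			if (layers[i].access & (1u << bit))
				masks[bit] &= (sketch_layer_mask_t)
					~(1u << (layers[i].level - 1));
		}
		if (masks[bit])
			is_empty = false;
	}
	return is_empty;
}
```

Calling sketch_unmask() once per path component, from the leaf up to the
root, models the path walk: with a rule granting execute on /a/b and
another granting read on /a for the same layer, a combined read and execute
request is only satisfied after the second call.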

Add a new layout1.layer_rule_unions test to check that accesses from
different rules pertaining to the same layer are ORed in a file
hierarchy. Also test that it is not the case for rules from different
layers.

Signed-off-by: Mickaël Salaün <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
---
security/landlock/fs.c | 77 ++++++++++-----
security/landlock/ruleset.h | 2 +
tools/testing/selftests/landlock/fs_test.c | 107 +++++++++++++++++++++
3 files changed, 160 insertions(+), 26 deletions(-)

diff --git a/security/landlock/fs.c b/security/landlock/fs.c
index 0bcb27f2360a..9662f9fb3cd0 100644
--- a/security/landlock/fs.c
+++ b/security/landlock/fs.c
@@ -204,45 +204,66 @@ static inline const struct landlock_rule *find_rule(
return rule;
}

-static inline layer_mask_t unmask_layers(
- const struct landlock_rule *const rule,
- const access_mask_t access_request, layer_mask_t layer_mask)
+/*
+ * @layer_masks is read and may be updated according to the access request and
+ * the matching rule.
+ *
+ * Returns true if the request is allowed (i.e. relevant layer masks for the
+ * request are empty).
+ */
+static inline bool unmask_layers(const struct landlock_rule *const rule,
+ const access_mask_t access_request,
+ layer_mask_t (*const layer_masks)[LANDLOCK_NUM_ACCESS_FS])
{
size_t layer_level;

+ if (!access_request || !layer_masks)
+ return true;
if (!rule)
- return layer_mask;
+ return false;

/*
* An access is granted if, for each policy layer, at least one rule
- * encountered on the pathwalk grants the requested accesses,
- * regardless of their position in the layer stack. We must then check
+ * encountered on the pathwalk grants the requested access,
+ * regardless of its position in the layer stack. We must then check
* the remaining layers for each inode, from the first added layer to
- * the last one.
+ * the last one. When there are multiple requested accesses, for each
+ * policy layer, the full set of requested accesses may not be granted
+ * by only one rule, but by the union (binary OR) of multiple rules.
+ * E.g. /a/b <execute> + /a <read> = /a/b <execute + read>
*/
for (layer_level = 0; layer_level < rule->num_layers; layer_level++) {
const struct landlock_layer *const layer =
&rule->layers[layer_level];
const layer_mask_t layer_bit = BIT_ULL(layer->level - 1);
+ const unsigned long access_req = access_request;
+ unsigned long access_bit;
+ bool is_empty;

- /* Checks that the layer grants access to the full request. */
- if ((layer->access & access_request) == access_request) {
- layer_mask &= ~layer_bit;
-
- if (layer_mask == 0)
- return layer_mask;
+ /*
+ * Records in @layer_masks which layer grants access to each
+ * requested access.
+ */
+ is_empty = true;
+ for_each_set_bit(access_bit, &access_req,
+ ARRAY_SIZE(*layer_masks)) {
+ if (layer->access & BIT_ULL(access_bit))
+ (*layer_masks)[access_bit] &= ~layer_bit;
+ is_empty = is_empty && !(*layer_masks)[access_bit];
}
+ if (is_empty)
+ return true;
}
- return layer_mask;
+ return false;
}

static int check_access_path(const struct landlock_ruleset *const domain,
const struct path *const path,
const access_mask_t access_request)
{
- bool allowed = false;
+ layer_mask_t layer_masks[LANDLOCK_NUM_ACCESS_FS] = {};
+ bool allowed = false, has_access = false;
struct path walker_path;
- layer_mask_t layer_mask;
size_t i;

if (!access_request)
@@ -262,13 +283,20 @@ static int check_access_path(const struct landlock_ruleset *const domain,
return -EACCES;

/* Saves all layers handling a subset of requested accesses. */
- layer_mask = 0;
for (i = 0; i < domain->num_layers; i++) {
- if (domain->fs_access_masks[i] & access_request)
- layer_mask |= BIT_ULL(i);
+ const unsigned long access_req = access_request;
+ unsigned long access_bit;
+
+ for_each_set_bit(access_bit, &access_req,
+ ARRAY_SIZE(layer_masks)) {
+ if (domain->fs_access_masks[i] & BIT_ULL(access_bit)) {
+ layer_masks[access_bit] |= BIT_ULL(i);
+ has_access = true;
+ }
+ }
}
/* An access request not handled by the domain is allowed. */
- if (layer_mask == 0)
+ if (!has_access)
return 0;

walker_path = *path;
@@ -280,14 +308,11 @@ static int check_access_path(const struct landlock_ruleset *const domain,
while (true) {
struct dentry *parent_dentry;

- layer_mask = unmask_layers(find_rule(domain,
- walker_path.dentry), access_request,
- layer_mask);
- if (layer_mask == 0) {
+ allowed = unmask_layers(find_rule(domain, walker_path.dentry),
+ access_request, &layer_masks);
+ if (allowed)
/* Stops when a rule from each layer grants access. */
- allowed = true;
break;
- }

jump_up:
if (walker_path.dentry == walker_path.mnt->mnt_root) {
diff --git a/security/landlock/ruleset.h b/security/landlock/ruleset.h
index 0128c56ee7ff..fa17cc1f82db 100644
--- a/security/landlock/ruleset.h
+++ b/security/landlock/ruleset.h
@@ -22,6 +22,8 @@
typedef u16 access_mask_t;
/* Makes sure all filesystem access rights can be stored. */
static_assert(BITS_PER_TYPE(access_mask_t) >= LANDLOCK_NUM_ACCESS_FS);
+/* Makes sure for_each_set_bit() and for_each_clear_bit() calls are OK. */
+static_assert(sizeof(unsigned long) >= sizeof(access_mask_t));

typedef u16 layer_mask_t;
/* Makes sure all layers can be checked. */
diff --git a/tools/testing/selftests/landlock/fs_test.c b/tools/testing/selftests/landlock/fs_test.c
index 99838cac970b..1ac41bfa7382 100644
--- a/tools/testing/selftests/landlock/fs_test.c
+++ b/tools/testing/selftests/landlock/fs_test.c
@@ -687,6 +687,113 @@ TEST_F_FORK(layout1, ruleset_overlap)
ASSERT_EQ(0, test_open(dir_s1d3, O_RDONLY | O_DIRECTORY));
}

+TEST_F_FORK(layout1, layer_rule_unions)
+{
+ const struct rule layer1[] = {
+ {
+ .path = dir_s1d2,
+ .access = LANDLOCK_ACCESS_FS_READ_FILE,
+ },
+ /* dir_s1d3 should allow READ_FILE and WRITE_FILE (O_RDWR). */
+ {
+ .path = dir_s1d3,
+ .access = LANDLOCK_ACCESS_FS_WRITE_FILE,
+ },
+ {}
+ };
+ const struct rule layer2[] = {
+ /* Doesn't change anything from layer1. */
+ {
+ .path = dir_s1d2,
+ .access = LANDLOCK_ACCESS_FS_READ_FILE |
+ LANDLOCK_ACCESS_FS_WRITE_FILE,
+ },
+ {}
+ };
+ const struct rule layer3[] = {
+ /* Only allows write (but not read) to dir_s1d3. */
+ {
+ .path = dir_s1d2,
+ .access = LANDLOCK_ACCESS_FS_WRITE_FILE,
+ },
+ {}
+ };
+ int ruleset_fd = create_ruleset(_metadata, ACCESS_RW, layer1);
+
+ ASSERT_LE(0, ruleset_fd);
+ enforce_ruleset(_metadata, ruleset_fd);
+ ASSERT_EQ(0, close(ruleset_fd));
+
+ /* Checks s1d1 hierarchy with layer1. */
+ ASSERT_EQ(EACCES, test_open(file1_s1d1, O_RDONLY));
+ ASSERT_EQ(EACCES, test_open(file1_s1d1, O_WRONLY));
+ ASSERT_EQ(EACCES, test_open(file1_s1d1, O_RDWR));
+ ASSERT_EQ(EACCES, test_open(dir_s1d1, O_RDONLY | O_DIRECTORY));
+
+ /* Checks s1d2 hierarchy with layer1. */
+ ASSERT_EQ(0, test_open(file1_s1d2, O_RDONLY));
+ ASSERT_EQ(EACCES, test_open(file1_s1d2, O_WRONLY));
+ ASSERT_EQ(EACCES, test_open(file1_s1d2, O_RDWR));
+ ASSERT_EQ(EACCES, test_open(dir_s1d1, O_RDONLY | O_DIRECTORY));
+
+ /* Checks s1d3 hierarchy with layer1. */
+ ASSERT_EQ(0, test_open(file1_s1d3, O_RDONLY));
+ ASSERT_EQ(0, test_open(file1_s1d3, O_WRONLY));
+ /* dir_s1d3 should allow READ_FILE and WRITE_FILE (O_RDWR). */
+ ASSERT_EQ(0, test_open(file1_s1d3, O_RDWR));
+ ASSERT_EQ(EACCES, test_open(dir_s1d1, O_RDONLY | O_DIRECTORY));
+
+ /* Doesn't change anything from layer1. */
+ ruleset_fd = create_ruleset(_metadata, ACCESS_RW, layer2);
+ ASSERT_LE(0, ruleset_fd);
+ enforce_ruleset(_metadata, ruleset_fd);
+ ASSERT_EQ(0, close(ruleset_fd));
+
+ /* Checks s1d1 hierarchy with layer2. */
+ ASSERT_EQ(EACCES, test_open(file1_s1d1, O_RDONLY));
+ ASSERT_EQ(EACCES, test_open(file1_s1d1, O_WRONLY));
+ ASSERT_EQ(EACCES, test_open(file1_s1d1, O_RDWR));
+ ASSERT_EQ(EACCES, test_open(dir_s1d1, O_RDONLY | O_DIRECTORY));
+
+ /* Checks s1d2 hierarchy with layer2. */
+ ASSERT_EQ(0, test_open(file1_s1d2, O_RDONLY));
+ ASSERT_EQ(EACCES, test_open(file1_s1d2, O_WRONLY));
+ ASSERT_EQ(EACCES, test_open(file1_s1d2, O_RDWR));
+ ASSERT_EQ(EACCES, test_open(dir_s1d1, O_RDONLY | O_DIRECTORY));
+
+ /* Checks s1d3 hierarchy with layer2. */
+ ASSERT_EQ(0, test_open(file1_s1d3, O_RDONLY));
+ ASSERT_EQ(0, test_open(file1_s1d3, O_WRONLY));
+ /* dir_s1d3 should allow READ_FILE and WRITE_FILE (O_RDWR). */
+ ASSERT_EQ(0, test_open(file1_s1d3, O_RDWR));
+ ASSERT_EQ(EACCES, test_open(dir_s1d1, O_RDONLY | O_DIRECTORY));
+
+ /* Only allows write (but not read) to dir_s1d3. */
+ ruleset_fd = create_ruleset(_metadata, ACCESS_RW, layer3);
+ ASSERT_LE(0, ruleset_fd);
+ enforce_ruleset(_metadata, ruleset_fd);
+ ASSERT_EQ(0, close(ruleset_fd));
+
+ /* Checks s1d1 hierarchy with layer3. */
+ ASSERT_EQ(EACCES, test_open(file1_s1d1, O_RDONLY));
+ ASSERT_EQ(EACCES, test_open(file1_s1d1, O_WRONLY));
+ ASSERT_EQ(EACCES, test_open(file1_s1d1, O_RDWR));
+ ASSERT_EQ(EACCES, test_open(dir_s1d1, O_RDONLY | O_DIRECTORY));
+
+ /* Checks s1d2 hierarchy with layer3. */
+ ASSERT_EQ(EACCES, test_open(file1_s1d2, O_RDONLY));
+ ASSERT_EQ(EACCES, test_open(file1_s1d2, O_WRONLY));
+ ASSERT_EQ(EACCES, test_open(file1_s1d2, O_RDWR));
+ ASSERT_EQ(EACCES, test_open(dir_s1d1, O_RDONLY | O_DIRECTORY));
+
+ /* Checks s1d3 hierarchy with layer3. */
+ ASSERT_EQ(EACCES, test_open(file1_s1d3, O_RDONLY));
+ ASSERT_EQ(0, test_open(file1_s1d3, O_WRONLY));
+ /* dir_s1d3 should now deny READ_FILE and WRITE_FILE (O_RDWR). */
+ ASSERT_EQ(EACCES, test_open(file1_s1d3, O_RDWR));
+ ASSERT_EQ(EACCES, test_open(dir_s1d1, O_RDONLY | O_DIRECTORY));
+}
+
TEST_F_FORK(layout1, non_overlapping_accesses)
{
const struct rule layer1[] = {
--
2.35.1

2022-02-22 05:48:44

by Mickaël Salaün

Subject: [PATCH v1 10/11] landlock: Document good practices about filesystem policies

From: Mickaël Salaün <[email protected]>

Signed-off-by: Mickaël Salaün <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
---
Documentation/userspace-api/landlock.rst | 21 +++++++++++++++++++++
1 file changed, 21 insertions(+)

diff --git a/Documentation/userspace-api/landlock.rst b/Documentation/userspace-api/landlock.rst
index 97db09d36a5c..cc3b52f65f99 100644
--- a/Documentation/userspace-api/landlock.rst
+++ b/Documentation/userspace-api/landlock.rst
@@ -156,6 +156,27 @@ ruleset.

Full working code can be found in `samples/landlock/sandboxer.c`_.

+Good practices
+--------------
+
+It is recommended to set access rights to file hierarchy leaves as much as
+possible. For instance, it is better to be able to have ``~/doc/`` as a
+read-only hierarchy and ``~/tmp/`` as a read-write hierarchy, compared to
+``~/`` as a read-only hierarchy and ``~/tmp/`` as a read-write hierarchy.
+Following this good practice leads to self-sufficient hierarchies that don't
+depend on their location (i.e. parent directories). This is particularly
+relevant when we want to allow linking or renaming. Indeed, having consistent
+access rights per directory makes it possible to relocate such a directory
+without relying on the destination directory access rights (except those that
+are required for this operation, see `LANDLOCK_ACCESS_FS_REFER` documentation).
+Having self-sufficient hierarchies also helps to tighten the required access
+rights to the minimal set of data. This also helps avoid sinkhole directories,
+i.e. directories where data can be linked to but not linked from. However,
+this depends on data organization, which might not be controlled by developers.
+In this case, granting read-write access to ``~/tmp/``, instead of write-only
+access, would potentially allow moving ``~/tmp/`` to a non-readable directory
+and still keep the ability to list the content of ``~/tmp/``.
+
Layers of file path access rights
---------------------------------

--
2.35.1

2022-03-17 03:51:15

by Paul Moore

Subject: Re: [PATCH v1 01/11] landlock: Define access_mask_t to enforce a consistent access mask size

On Mon, Feb 21, 2022 at 4:15 PM Mickaël Salaün <[email protected]> wrote:
>
> From: Mickaël Salaün <[email protected]>
>
> Create and use the access_mask_t typedef to enforce a consistent access
> mask size and uniformly use a 16-bits type. This will help transition
> to a 32-bits value one day.
>
> Add a build check to make sure all (filesystem) access rights fit in.
> This will be extended with a following commit.
>
> Signed-off-by: Mickaël Salaün <[email protected]>
> Link: https://lore.kernel.org/r/[email protected]
> ---
> security/landlock/fs.c | 19 ++++++++++---------
> security/landlock/fs.h | 2 +-
> security/landlock/limits.h | 2 ++
> security/landlock/ruleset.c | 6 ++++--
> security/landlock/ruleset.h | 17 +++++++++++++----
> 5 files changed, 30 insertions(+), 16 deletions(-)
>
> diff --git a/security/landlock/fs.c b/security/landlock/fs.c
> index 97b8e421f617..9de2a460a762 100644
> --- a/security/landlock/fs.c
> +++ b/security/landlock/fs.c
> @@ -150,7 +150,7 @@ static struct landlock_object *get_inode_object(struct inode *const inode)
> * @path: Should have been checked by get_path_from_fd().
> */
> int landlock_append_fs_rule(struct landlock_ruleset *const ruleset,
> - const struct path *const path, u32 access_rights)
> + const struct path *const path, access_mask_t access_rights)
> {
> int err;
> struct landlock_object *object;
> @@ -182,8 +182,8 @@ int landlock_append_fs_rule(struct landlock_ruleset *const ruleset,
>
> static inline u64 unmask_layers(
> const struct landlock_ruleset *const domain,
> - const struct path *const path, const u32 access_request,
> - u64 layer_mask)
> + const struct path *const path,
> + const access_mask_t access_request, u64 layer_mask)
> {
> const struct landlock_rule *rule;
> const struct inode *inode;
> @@ -223,7 +223,8 @@ static inline u64 unmask_layers(
> }
>
> static int check_access_path(const struct landlock_ruleset *const domain,
> - const struct path *const path, u32 access_request)
> + const struct path *const path,
> + const access_mask_t access_request)
> {
> bool allowed = false;
> struct path walker_path;
> @@ -308,7 +309,7 @@ static int check_access_path(const struct landlock_ruleset *const domain,
> }
>
> static inline int current_check_access_path(const struct path *const path,
> - const u32 access_request)
> + const access_mask_t access_request)
> {
> const struct landlock_ruleset *const dom =
> landlock_get_current_domain();
> @@ -511,7 +512,7 @@ static int hook_sb_pivotroot(const struct path *const old_path,
>
> /* Path hooks */
>
> -static inline u32 get_mode_access(const umode_t mode)
> +static inline access_mask_t get_mode_access(const umode_t mode)
> {
> switch (mode & S_IFMT) {
> case S_IFLNK:
> @@ -563,7 +564,7 @@ static int hook_path_link(struct dentry *const old_dentry,
> get_mode_access(d_backing_inode(old_dentry)->i_mode));
> }
>
> -static inline u32 maybe_remove(const struct dentry *const dentry)
> +static inline access_mask_t maybe_remove(const struct dentry *const dentry)
> {
> if (d_is_negative(dentry))
> return 0;
> @@ -631,9 +632,9 @@ static int hook_path_rmdir(const struct path *const dir,
>
> /* File hooks */
>
> -static inline u32 get_file_access(const struct file *const file)
> +static inline access_mask_t get_file_access(const struct file *const file)
> {
> - u32 access = 0;
> + access_mask_t access = 0;
>
> if (file->f_mode & FMODE_READ) {
> /* A directory can only be opened in read mode. */
> diff --git a/security/landlock/fs.h b/security/landlock/fs.h
> index 187284b421c9..74be312aad96 100644
> --- a/security/landlock/fs.h
> +++ b/security/landlock/fs.h
> @@ -65,6 +65,6 @@ static inline struct landlock_superblock_security *landlock_superblock(
> __init void landlock_add_fs_hooks(void);
>
> int landlock_append_fs_rule(struct landlock_ruleset *const ruleset,
> - const struct path *const path, u32 access_hierarchy);
> + const struct path *const path, access_mask_t access_hierarchy);
>
> #endif /* _SECURITY_LANDLOCK_FS_H */
> diff --git a/security/landlock/limits.h b/security/landlock/limits.h
> index 2a0a1095ee27..458d1de32ed5 100644
> --- a/security/landlock/limits.h
> +++ b/security/landlock/limits.h
> @@ -9,6 +9,7 @@
> #ifndef _SECURITY_LANDLOCK_LIMITS_H
> #define _SECURITY_LANDLOCK_LIMITS_H
>
> +#include <linux/bitops.h>
> #include <linux/limits.h>
> #include <uapi/linux/landlock.h>
>
> @@ -17,5 +18,6 @@
>
> #define LANDLOCK_LAST_ACCESS_FS LANDLOCK_ACCESS_FS_MAKE_SYM
> #define LANDLOCK_MASK_ACCESS_FS ((LANDLOCK_LAST_ACCESS_FS << 1) - 1)
> +#define LANDLOCK_NUM_ACCESS_FS __const_hweight64(LANDLOCK_MASK_ACCESS_FS)

The line above, and the static_assert() in ruleset.h are clever. I'll
admit I didn't even know the hweightX() macros existed until looking
at this code :)

However, the LANDLOCK_NUM_ACCESS_FS is never really going to be used
outside the static_assert() in ruleset.h, is it? I wonder if it would
be better to skip the extra macro and rewrite the static_assert like
this:

  static_assert(BITS_PER_TYPE(access_mask_t) >=
                __const_hweight64(LANDLOCK_MASK_ACCESS_FS));

If not, I might suggest changing LANDLOCK_NUM_ACCESS_FS to
LANDLOCK_BITS_ACCESS_FS or something similar.
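For readers who have not met the hweight helpers either, here is a minimal userspace sketch of the same build-time check. Every SKETCH_* name is a hypothetical stand-in, not a kernel macro, and the assumption is 13 access rights with the last one on bit 12:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical userspace stand-in for the kernel typedef quoted above. */
typedef uint16_t access_mask_t;

/* Assumption: 13 access rights, the last one stored on bit 12. */
#define SKETCH_LAST_ACCESS_FS (1ULL << 12)
#define SKETCH_MASK_ACCESS_FS ((SKETCH_LAST_ACCESS_FS << 1) - 1)

/* Number of bits a type can hold. */
#define SKETCH_BITS_PER_TYPE(type) (sizeof(type) * 8)

/* Population count as a constant expression, written out bit by bit. */
#define SKETCH_HWEIGHT8(w)                                            \
	((unsigned int)(!!((w) & (1ULL << 0)) + !!((w) & (1ULL << 1)) + \
			!!((w) & (1ULL << 2)) + !!((w) & (1ULL << 3)) + \
			!!((w) & (1ULL << 4)) + !!((w) & (1ULL << 5)) + \
			!!((w) & (1ULL << 6)) + !!((w) & (1ULL << 7))))
#define SKETCH_HWEIGHT16(w) \
	(SKETCH_HWEIGHT8(w) + SKETCH_HWEIGHT8((w) >> 8))

/* Fails the build if the mask type cannot hold one bit per access right. */
static_assert(SKETCH_BITS_PER_TYPE(access_mask_t) >=
		      SKETCH_HWEIGHT16(SKETCH_MASK_ACCESS_FS),
	      "access_mask_t is too small for all filesystem access rights");

/* Exposes the computed count so it can be inspected at run time. */
unsigned int sketch_num_access_fs(void)
{
	return SKETCH_HWEIGHT16(SKETCH_MASK_ACCESS_FS);
}
```

Widening the mask past 16 bits without changing the typedef would make this fail at compile time, which is the point of the check in ruleset.h.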


> diff --git a/security/landlock/ruleset.h b/security/landlock/ruleset.h
> index 2d3ed7ec5a0a..7e7cac68e443 100644
> --- a/security/landlock/ruleset.h
> +++ b/security/landlock/ruleset.h
> @@ -9,13 +9,20 @@
> #ifndef _SECURITY_LANDLOCK_RULESET_H
> #define _SECURITY_LANDLOCK_RULESET_H
>
> +#include <linux/bitops.h>
> +#include <linux/build_bug.h>
> #include <linux/mutex.h>
> #include <linux/rbtree.h>
> #include <linux/refcount.h>
> #include <linux/workqueue.h>
>
> +#include "limits.h"
> #include "object.h"
>
> +typedef u16 access_mask_t;
> +/* Makes sure all filesystem access rights can be stored. */
> +static_assert(BITS_PER_TYPE(access_mask_t) >= LANDLOCK_NUM_ACCESS_FS);

--
paul-moore.com

2022-03-17 04:04:04

by Paul Moore

Subject: Re: [PATCH v1 10/11] landlock: Document good practices about filesystem policies

On Mon, Feb 21, 2022 at 4:15 PM Mickaël Salaün <[email protected]> wrote:
>
> From: Mickaël Salaün <[email protected]>
>
> Signed-off-by: Mickaël Salaün <[email protected]>
> Link: https://lore.kernel.org/r/[email protected]
> ---
> Documentation/userspace-api/landlock.rst | 21 +++++++++++++++++++++
> 1 file changed, 21 insertions(+)

Reviewed-by: Paul Moore <[email protected]>


--
paul-moore.com

2022-03-17 05:01:30

by Paul Moore

Subject: Re: [PATCH v1 06/11] landlock: Add support for file reparenting with LANDLOCK_ACCESS_FS_REFER

On Mon, Feb 21, 2022 at 4:15 PM Mickaël Salaün <[email protected]> wrote:
>
> From: Mickaël Salaün <[email protected]>
>
> Add a new LANDLOCK_ACCESS_FS_REFER access right to enable policy writers
> to allow sandboxed processes to link and rename files from and to a
> specific set of file hierarchies. This access right should be composed
> with LANDLOCK_ACCESS_FS_MAKE_* for the destination of a link or rename,
> and with LANDLOCK_ACCESS_FS_REMOVE_* for a source of a rename. This
> lifts a Landlock limitation that always denied changing the parent of an
> inode.
>
> Renaming or linking to the same directory is still always allowed,
> whether LANDLOCK_ACCESS_FS_REFER is used or not, because it is not
> considered a threat to user data.
>
> However, creating multiple links or renaming to a different parent
> directory may lead to privilege escalations if not handled properly.
> Indeed, we must be sure that the source doesn't gain more privileges by
> being accessible from the destination. This is handled by making sure
> that the source hierarchy (including the referenced file or directory
> itself) restricts at least as much the destination hierarchy. If it is
> not the case, an EXDEV error is returned, making it potentially possible
> for user space to copy the file hierarchy instead of moving or linking
> it.
>
> Instead of creating different access rights for the source and the
> destination, we choose to make it simple and consistent for users.
> Indeed, considering the previous constraint, it would be weird to
> require such destination access right to be also granted to the source
> (to make it a superset).
>
> See the provided documentation for additional details.
>
> New tests are provided with a following commit.
>
> Signed-off-by: Mickaël Salaün <[email protected]>
> Link: https://lore.kernel.org/r/[email protected]
> ---
> include/uapi/linux/landlock.h | 27 +-
> security/landlock/fs.c | 550 ++++++++++++++++---
> security/landlock/limits.h | 2 +-
> security/landlock/syscalls.c | 2 +-
> tools/testing/selftests/landlock/base_test.c | 2 +-
> tools/testing/selftests/landlock/fs_test.c | 3 +-
> 6 files changed, 516 insertions(+), 70 deletions(-)

...

> diff --git a/security/landlock/fs.c b/security/landlock/fs.c
> index 3886f9ad1a60..c7c7ce4e7cd5 100644
> --- a/security/landlock/fs.c
> +++ b/security/landlock/fs.c
> @@ -4,6 +4,7 @@
> *
> * Copyright © 2016-2020 Mickaël Salaün <[email protected]>
> * Copyright © 2018-2020 ANSSI
> + * Copyright © 2021-2022 Microsoft Corporation
> */
>
> #include <linux/atomic.h>
> @@ -269,16 +270,188 @@ static inline bool is_nouser_or_private(const struct dentry *dentry)
> unlikely(IS_PRIVATE(d_backing_inode(dentry))));
> }
>
> -static int check_access_path(const struct landlock_ruleset *const domain,
> - const struct path *const path,
> +static inline access_mask_t get_handled_accesses(
> + const struct landlock_ruleset *const domain)
> +{
> + access_mask_t access_dom = 0;
> + unsigned long access_bit;

Would it be better to declare @access_bit as an access_mask_t type?
You're not using any macros like for_each_set_bit() in this function
so I believe it should be safe.

> + for (access_bit = 0; access_bit < LANDLOCK_NUM_ACCESS_FS;
> + access_bit++) {
> + size_t layer_level;

Considering the number of layers has dropped down to 16, it seems like
a normal unsigned int might be big enough for @layer_level :)
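As a side note on what get_handled_accesses() computes: the result is the union of every layer's handled access mask. A rough userspace sketch, in which struct domain_sketch and both function names are hypothetical and 14 access bits are assumed, compares the nested-loop form from the patch with a plain OR over the layers:

```c
#include <stdint.h>

typedef uint16_t access_mask_t;

#define SKETCH_NUM_ACCESS_FS 14 /* assumption: 13 rights plus REFER */
#define SKETCH_MAX_LAYERS 16

/* Hypothetical, reduced stand-in for struct landlock_ruleset. */
struct domain_sketch {
	unsigned int num_layers;
	access_mask_t fs_access_masks[SKETCH_MAX_LAYERS];
};

/* Same shape as the quoted loop: per access bit, look for a handling layer. */
access_mask_t handled_accesses_loop(const struct domain_sketch *dom)
{
	access_mask_t access_dom = 0;
	unsigned int access_bit, layer_level;

	for (access_bit = 0; access_bit < SKETCH_NUM_ACCESS_FS; access_bit++) {
		for (layer_level = 0; layer_level < dom->num_layers;
		     layer_level++) {
			if (dom->fs_access_masks[layer_level] &
			    (1U << access_bit)) {
				access_dom |= (access_mask_t)(1U << access_bit);
				break;
			}
		}
	}
	return access_dom;
}

/* Equivalent result: OR the layer masks together, then clip to valid bits. */
access_mask_t handled_accesses_or(const struct domain_sketch *dom)
{
	access_mask_t access_dom = 0;
	unsigned int layer_level;

	for (layer_level = 0; layer_level < dom->num_layers; layer_level++)
		access_dom |= dom->fs_access_masks[layer_level];
	return access_dom & ((1U << SKETCH_NUM_ACCESS_FS) - 1);
}
```

Both loop counters fit comfortably in an unsigned int here, since neither bound exceeds 16.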

> + for (layer_level = 0; layer_level < domain->num_layers;
> + layer_level++) {
> + if (domain->fs_access_masks[layer_level] &
> + BIT_ULL(access_bit)) {
> + access_dom |= BIT_ULL(access_bit);
> + break;
> + }
> + }
> + }
> + return access_dom;
> +}
> +
> +static inline access_mask_t init_layer_masks(
> + const struct landlock_ruleset *const domain,
> + const access_mask_t access_request,
> + layer_mask_t (*const layer_masks)[LANDLOCK_NUM_ACCESS_FS])
> +{
> + access_mask_t handled_accesses = 0;
> + size_t layer_level;
> +
> + memset(layer_masks, 0, sizeof(*layer_masks));
> + if (WARN_ON_ONCE(!access_request))
> + return 0;
> +
> + /* Saves all handled accesses per layer. */
> + for (layer_level = 0; layer_level < domain->num_layers;
> + layer_level++) {
> + const unsigned long access_req = access_request;
> + unsigned long access_bit;
> +
> + for_each_set_bit(access_bit, &access_req,
> + ARRAY_SIZE(*layer_masks)) {
> + if (domain->fs_access_masks[layer_level] &
> + BIT_ULL(access_bit)) {
> + (*layer_masks)[access_bit] |=
> + BIT_ULL(layer_level);
> + handled_accesses |= BIT_ULL(access_bit);
> + }
> + }
> + }
> + return handled_accesses;
> +}
> +
> +/*
> + * Check that a destination file hierarchy has more restrictions than a source
> + * file hierarchy. This is only used for link and rename actions.
> + */
> +static inline bool is_superset(bool child_is_directory,
> + const layer_mask_t (*const
> + layer_masks_dst_parent)[LANDLOCK_NUM_ACCESS_FS],
> + const layer_mask_t (*const
> + layer_masks_src_parent)[LANDLOCK_NUM_ACCESS_FS],
> + const layer_mask_t (*const
> + layer_masks_child)[LANDLOCK_NUM_ACCESS_FS])
> +{
> + unsigned long access_bit;
> +
> + for (access_bit = 0; access_bit < ARRAY_SIZE(*layer_masks_dst_parent);
> + access_bit++) {
> + /* Ignores accesses that only make sense for directories. */
> + if (!child_is_directory && !(BIT_ULL(access_bit) & ACCESS_FILE))
> + continue;
> +
> + /*
> + * Checks if the destination restrictions are a superset of the
> + * source ones (i.e. inherited access rights without child
> + * exceptions).
> + */
> + if ((((*layer_masks_src_parent)[access_bit] & (*layer_masks_child)[access_bit]) |
> + (*layer_masks_dst_parent)[access_bit]) !=
> + (*layer_masks_dst_parent)[access_bit])
> + return false;
> + }
> + return true;
> +}
> +
> +/*
> + * Removes @layer_masks accesses that are not requested.
> + *
> + * Returns true if the request is allowed, false otherwise.
> + */
> +static inline bool scope_to_request(const access_mask_t access_request,
> + layer_mask_t (*const layer_masks)[LANDLOCK_NUM_ACCESS_FS])
> +{
> + const unsigned long access_req = access_request;
> + unsigned long access_bit;
> +
> + if (WARN_ON_ONCE(!layer_masks))
> + return true;
> +
> + for_each_clear_bit(access_bit, &access_req, ARRAY_SIZE(*layer_masks))
> + (*layer_masks)[access_bit] = 0;
> + return !memchr_inv(layer_masks, 0, sizeof(*layer_masks));
> +}
> +
> +/*
> + * Returns true if there is at least one access right different than
> + * LANDLOCK_ACCESS_FS_REFER.
> + */
> +static inline bool is_eacces(
> + const layer_mask_t (*const
> + layer_masks)[LANDLOCK_NUM_ACCESS_FS],
> const access_mask_t access_request)
> {

Granted, I don't have as deep an understanding of Landlock as you
do, but the function name "is_eacces" seems a little odd given the
nature of the function. Perhaps "is_fsrefer"?

> - layer_mask_t layer_masks[LANDLOCK_NUM_ACCESS_FS] = {};
> - bool allowed = false, has_access = false;
> + unsigned long access_bit;
> + /* LANDLOCK_ACCESS_FS_REFER alone must return -EXDEV. */
> + const unsigned long access_check = access_request &
> + ~LANDLOCK_ACCESS_FS_REFER;
> +
> + if (!layer_masks)
> + return false;
> +
> + for_each_set_bit(access_bit, &access_check, ARRAY_SIZE(*layer_masks)) {
> + if ((*layer_masks)[access_bit])
> + return true;
> + }

Is calling for_each_set_bit() overkill here? @access_check should
only ever have at most one bit set (LANDLOCK_ACCESS_FS_REFER), yes?
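For reference, the quoted hunk computes @access_check as access_request & ~LANDLOCK_ACCESS_FS_REFER, so it keeps every requested bit except REFER. A small userspace sketch of that masking follows; is_eacces_sketch() and the SKETCH_* constants are hypothetical names, with REFER assumed on bit 13:

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint16_t access_mask_t;
typedef uint16_t layer_mask_t;

#define SKETCH_NUM_ACCESS_FS 14
#define SKETCH_ACCESS_FS_REFER (1U << 13) /* assumed bit position */

/*
 * Returns true if a requested access other than REFER is still denied by
 * at least one layer, i.e. its layer mask is not empty.
 */
bool is_eacces_sketch(const layer_mask_t layer_masks[SKETCH_NUM_ACCESS_FS],
		      access_mask_t access_request)
{
	/* Drops only the REFER bit; all other requested bits remain. */
	const access_mask_t access_check =
		access_request & (access_mask_t)~SKETCH_ACCESS_FS_REFER;
	unsigned int access_bit;

	if (!layer_masks)
		return false;

	for (access_bit = 0; access_bit < SKETCH_NUM_ACCESS_FS; access_bit++) {
		if ((access_check & (1U << access_bit)) &&
		    layer_masks[access_bit])
			return true;
	}
	return false;
}
```

With a request of REFER plus one other right, only the other right can trigger the -EACCES path in this sketch; a denied REFER alone falls through to false.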

> + return false;
> +}
> +
> +/**
> + * check_access_path_dual - Check a source and a destination accesses
> + *
> + * @domain: Domain to check against.
> + * @path: File hierarchy to walk through.
> + * @child_is_directory: Must be set to true if the (original) leaf is a
> + * directory, false otherwise.
> + * @access_request_dst_parent: Accesses to check, once @layer_masks_dst_parent
> + * is equal to @layer_masks_src_parent (if any).
> + * @layer_masks_dst_parent: Pointer to a matrix of layer masks per access
> + * masks, identifying the layers that forbid a specific access. Bits from
> + * this matrix can be unset according to the @path walk. An empty matrix
> + * means that @domain allows all possible Landlock accesses (i.e. not only
> + * those identified by @access_request_dst_parent). This matrix can
> + * initially refer to domain layer masks and, when the accesses for the
> + * destination and source are the same, to request layer masks.
> + * @access_request_src_parent: Similar to @access_request_dst_parent but for an
> + * initial source path request. Only taken into account if
> + * @layer_masks_src_parent is not NULL.
> + * @layer_masks_src_parent: Similar to @layer_masks_dst_parent but for an
> + * initial source path walk. This can be NULL if only dealing with a
> + * destination access request (i.e. not a rename nor a link action).
> + * @layer_masks_child: Similar to @layer_masks_src_parent but only for the
> + * linked or renamed inode (without hierarchy). This is only used if
> + * @layer_masks_src_parent is not NULL.
> + *
> + * This helper first checks that the destination has a superset of restrictions
> + * compared to the source (if any) for a common path. It then checks that the
> + * collected accesses and the remaining ones are enough to allow the request.
> + *
> + * Returns:
> + * - 0 if the access request is granted;
> + * - -EACCES if it is denied because of access right other than
> + * LANDLOCK_ACCESS_FS_REFER;
> + * - -EXDEV if the renaming or linking would be a privileged escalation
> + * (according to each layered policies), or if LANDLOCK_ACCESS_FS_REFER is
> + * not allowed by the source or the destination.
> + */
> +static int check_access_path_dual(const struct landlock_ruleset *const domain,
> + const struct path *const path,
> + bool child_is_directory,
> + const access_mask_t access_request_dst_parent,
> + layer_mask_t (*const
> + layer_masks_dst_parent)[LANDLOCK_NUM_ACCESS_FS],
> + const access_mask_t access_request_src_parent,
> + layer_mask_t (*layer_masks_src_parent)[LANDLOCK_NUM_ACCESS_FS],
> + layer_mask_t (*layer_masks_child)[LANDLOCK_NUM_ACCESS_FS])
> +{
> + bool allowed_dst_parent = false, allowed_src_parent = false, is_dom_check;
> struct path walker_path;
> - size_t i;
> + access_mask_t access_masked_dst_parent, access_masked_src_parent;
>
> - if (!access_request)
> + if (!access_request_dst_parent && !access_request_src_parent)
> return 0;
> if (WARN_ON_ONCE(!domain || !path))
> return 0;
> @@ -287,22 +460,20 @@ static int check_access_path(const struct landlock_ruleset *const domain,
> if (WARN_ON_ONCE(domain->num_layers < 1))
> return -EACCES;
>
> - /* Saves all layers handling a subset of requested accesses. */
> - for (i = 0; i < domain->num_layers; i++) {
> - const unsigned long access_req = access_request;
> - unsigned long access_bit;
> -
> - for_each_set_bit(access_bit, &access_req,
> - ARRAY_SIZE(layer_masks)) {
> - if (domain->fs_access_masks[i] & BIT_ULL(access_bit)) {
> - layer_masks[access_bit] |= BIT_ULL(i);
> - has_access = true;
> - }
> - }
> + BUILD_BUG_ON(!layer_masks_dst_parent);

I know the kbuild robot already flagged this, but checking function
parameters with BUILD_BUG_ON() does seem a bit ... unusual :)

> + if (layer_masks_src_parent) {
> + if (WARN_ON_ONCE(!layer_masks_child))
> + return -EACCES;
> + access_masked_dst_parent = access_masked_src_parent =
> + get_handled_accesses(domain);
> + is_dom_check = true;
> + } else {
> + if (WARN_ON_ONCE(layer_masks_child))
> + return -EACCES;
> + access_masked_dst_parent = access_request_dst_parent;
> + access_masked_src_parent = access_request_src_parent;
> + is_dom_check = false;
> }
> - /* An access request not handled by the domain is allowed. */
> - if (!has_access)
> - return 0;
>
> walker_path = *path;
> path_get(&walker_path);
> @@ -312,11 +483,50 @@ static int check_access_path(const struct landlock_ruleset *const domain,
> */
> while (true) {
> struct dentry *parent_dentry;
> + const struct landlock_rule *rule;
> +
> + /*
> + * If at least all accesses allowed on the destination are
> + * already allowed on the source, respectively if there is at
> + * least as much as restrictions on the destination than on the
> + * source, then we can safely refer files from the source to
> + * the destination without risking a privilege escalation.
> + * This is crucial for standalone multilayered security
> + * policies. Furthermore, this helps avoid policy writers to
> + * shoot themselves in the foot.
> + */
> + if (is_dom_check && is_superset(child_is_directory,
> + layer_masks_dst_parent,
> + layer_masks_src_parent,
> + layer_masks_child)) {
> + allowed_dst_parent =
> + scope_to_request(access_request_dst_parent,
> + layer_masks_dst_parent);
> + allowed_src_parent =
> + scope_to_request(access_request_src_parent,
> + layer_masks_src_parent);
> +
> + /* Stops when all accesses are granted. */
> + if (allowed_dst_parent && allowed_src_parent)
> + break;
> +
> + /*
> + * Downgrades checks from domain handled accesses to
> + * requested accesses.
> + */
> + is_dom_check = false;
> + access_masked_dst_parent = access_request_dst_parent;
> + access_masked_src_parent = access_request_src_parent;
> + }
> +
> + rule = find_rule(domain, walker_path.dentry);
> + allowed_dst_parent = unmask_layers(rule, access_masked_dst_parent,
> + layer_masks_dst_parent);
> + allowed_src_parent = unmask_layers(rule, access_masked_src_parent,
> + layer_masks_src_parent);
>
> - allowed = unmask_layers(find_rule(domain, walker_path.dentry),
> - access_request, &layer_masks);
> - if (allowed)
> - /* Stops when a rule from each layer grants access. */
> + /* Stops when a rule from each layer grants access. */
> + if (allowed_dst_parent && allowed_src_parent)
> break;

If "(allowed_dst_parent && allowed_src_parent)" is true, you break out
of the while loop only to do a path_put(), check the two booleans once
more, and then return zero, yes? Why not just do the path_put() and
return zero here?


--
paul-moore.com

2022-03-17 05:01:35

by Paul Moore

Subject: Re: [PATCH v1 02/11] landlock: Reduce the maximum number of layers to 16

On Mon, Feb 21, 2022 at 4:15 PM Mickaël Salaün <[email protected]> wrote:
>
> From: Mickaël Salaün <[email protected]>
>
> The maximum number of nested Landlock domains is currently 64. Because
> of the following fix and to help reduce the stack size, let's reduce it
> to 16. This seems large enough for a lot of use cases (e.g. sandboxed
> init service, spawning a sandboxed SSH service, in nested sandboxed
> containers). Reducing the number of nested domains may also help to
> discover misuse of Landlock (e.g. creating a domain per rule).
>
> Add and use a dedicated layer_mask_t typedef to fit with the number of
> layers. This might be useful when changing it and to keep it consistent
> with the maximum number of layers.
>
> Signed-off-by: Mickaël Salaün <[email protected]>
> Link: https://lore.kernel.org/r/[email protected]
> ---
> security/landlock/fs.c | 13 +++++--------
> security/landlock/limits.h | 2 +-
> security/landlock/ruleset.h | 4 ++++
> tools/testing/selftests/landlock/fs_test.c | 2 +-
> 4 files changed, 11 insertions(+), 10 deletions(-)

I'm assuming that the drop in Landlock nesting down to 16 isn't going
to cause any userspace breakage :)

Reviewed-by: Paul Moore <[email protected]>


--
paul-moore.com

2022-03-17 05:21:51

by Paul Moore

Subject: Re: [PATCH v1 03/11] landlock: Create find_rule() from unmask_layers()

On Mon, Feb 21, 2022 at 4:15 PM Mickaël Salaün <[email protected]> wrote:
>
> From: Mickaël Salaün <[email protected]>
>
> This refactoring will be useful in a following commit.
>
> Signed-off-by: Mickaël Salaün <[email protected]>
> Link: https://lore.kernel.org/r/[email protected]
> ---
> security/landlock/fs.c | 39 +++++++++++++++++++++++++++------------
> 1 file changed, 27 insertions(+), 12 deletions(-)

Reviewed-by: Paul Moore <[email protected]>


--
paul-moore.com

2022-03-17 05:40:53

by Paul Moore

Subject: Re: [PATCH v1 04/11] landlock: Fix same-layer rule unions

On Mon, Feb 21, 2022 at 4:15 PM Mickaël Salaün <[email protected]> wrote:
>
> From: Mickaël Salaün <[email protected]>
>
> The original behavior was to check if the full set of requested accesses
> was allowed by at least a rule of every relevant layer. This didn't
> take into account requests for multiple accesses and same-layer rules
> allowing the union of these accesses in a complementary way. As a
> result, multiple accesses requested on a file hierarchy matching rules
> that, together, allowed these accesses, but without a unique rule
> allowing all of them, were illegitimately denied. This case should be
> rare in practice and it can only be triggered by the path_rename or
> file_open hook implementations.
>
> For instance, if, for the same layer, a rule allows execution
> beneath /a/b and another rule allows read beneath /a, requesting access
> to read and execute at the same time for /a/b should be allowed for this
> layer.
>
> This was an inconsistency because the union of same-layer rule accesses
> was already allowed if requested once at a time anyway.
>
> This fix changes the way allowed accesses are gathered over a path walk.
> To take into account all these rule accesses, we store in a matrix all
> layers granting the set of requested accesses, according to the handled
> accesses. To avoid heap allocation, we use an array on the stack which
> is 2*13 bytes. A following commit bringing the LANDLOCK_ACCESS_FS_REFER
> access right will increase this size to reach 84 bytes (2*14*3) in case
> of link or rename actions.
>
> Add a new layout1.layer_rule_unions test to check that accesses from
> different rules pertaining to the same layer are ORed in a file
> hierarchy. Also test that it is not the case for rules from different
> layers.
>
> Signed-off-by: Mickaël Salaün <[email protected]>
> Link: https://lore.kernel.org/r/[email protected]
> ---
> security/landlock/fs.c | 77 ++++++++++-----
> security/landlock/ruleset.h | 2 +
> tools/testing/selftests/landlock/fs_test.c | 107 +++++++++++++++++++++
> 3 files changed, 160 insertions(+), 26 deletions(-)
>
> diff --git a/security/landlock/fs.c b/security/landlock/fs.c
> index 0bcb27f2360a..9662f9fb3cd0 100644
> --- a/security/landlock/fs.c
> +++ b/security/landlock/fs.c
> @@ -204,45 +204,66 @@ static inline const struct landlock_rule *find_rule(
> return rule;
> }
>
> -static inline layer_mask_t unmask_layers(
> - const struct landlock_rule *const rule,
> - const access_mask_t access_request, layer_mask_t layer_mask)
> +/*
> + * @layer_masks is read and may be updated according to the access request and
> + * the matching rule.
> + *
> + * Returns true if the request is allowed (i.e. relevant layer masks for the
> + * request are empty).
> + */
> +static inline bool unmask_layers(const struct landlock_rule *const rule,
> + const access_mask_t access_request,
> + layer_mask_t (*const layer_masks)[LANDLOCK_NUM_ACCESS_FS])
> {
> size_t layer_level;
>
> + if (!access_request || !layer_masks)
> + return true;
> if (!rule)
> - return layer_mask;
> + return false;
>
> /*
> * An access is granted if, for each policy layer, at least one rule
> - * encountered on the pathwalk grants the requested accesses,
> - * regardless of their position in the layer stack. We must then check
> + * encountered on the pathwalk grants the requested access,
> + * regardless of its position in the layer stack. We must then check
> * the remaining layers for each inode, from the first added layer to
> - * the last one.
> + * the last one. When there is multiple requested accesses, for each
> + * policy layer, the full set of requested accesses may not be granted
> + * by only one rule, but by the union (binary OR) of multiple rules.
> + * E.g. /a/b <execute> + /a <read> = /a/b <execute + read>
> */
> for (layer_level = 0; layer_level < rule->num_layers; layer_level++) {
> const struct landlock_layer *const layer =
> &rule->layers[layer_level];
> const layer_mask_t layer_bit = BIT_ULL(layer->level - 1);
> + const unsigned long access_req = access_request;
> + unsigned long access_bit;
> + bool is_empty;
>
> - /* Checks that the layer grants access to the full request. */
> - if ((layer->access & access_request) == access_request) {
> - layer_mask &= ~layer_bit;
> -
> - if (layer_mask == 0)
> - return layer_mask;
> + /*
> + * Records in @layer_masks which layer grants access to each
> + * requested access.
> + */
> + is_empty = true;
> + for_each_set_bit(access_bit, &access_req,
> + ARRAY_SIZE(*layer_masks)) {
> + if (layer->access & BIT_ULL(access_bit))
> + (*layer_masks)[access_bit] &= ~layer_bit;
> + is_empty = is_empty && !(*layer_masks)[access_bit];

From what I can see, the only reason not to return immediately once
@is_empty is true is the need to update @layer_masks. However, the
only caller that I can see (up to patch 4/11) is check_access_path(),
which, thanks to this patch, no longer needs to reference @layer_masks
after the call to unmask_layers() returns true. Assuming that to be
the case, is there a reason we can't return immediately after finding
@is_empty true, or am I missing something?
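To make the same-layer union from the commit message concrete (/a/b <execute> + /a <read>), here is a simplified userspace sketch of the per-access layer mask matrix. The *_sketch types and function are hypothetical reductions of the kernel structures, and the access bit positions are assumptions:

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint16_t access_mask_t;
typedef uint16_t layer_mask_t;

#define SKETCH_NUM_ACCESS_FS 13
#define SKETCH_ACCESS_EXECUTE (1U << 0)   /* assumed bit position */
#define SKETCH_ACCESS_READ_FILE (1U << 2) /* assumed bit position */

/* Hypothetical, reduced stand-ins for landlock_layer and landlock_rule. */
struct layer_sketch {
	unsigned int level; /* 1-based, as in the quoted code */
	access_mask_t access;
};

struct rule_sketch {
	unsigned int num_layers;
	struct layer_sketch layers[16];
};

/*
 * Clears, for each requested access, the bit of every layer whose rule
 * grants it. Returns true once no requested access is denied anymore.
 */
bool unmask_layers_sketch(const struct rule_sketch *rule,
			  access_mask_t access_request,
			  layer_mask_t layer_masks[SKETCH_NUM_ACCESS_FS])
{
	unsigned int i, access_bit;

	if (!access_request || !layer_masks)
		return true;
	if (!rule)
		return false;

	for (i = 0; i < rule->num_layers; i++) {
		const struct layer_sketch *layer = &rule->layers[i];
		const layer_mask_t layer_bit =
			(layer_mask_t)(1U << (layer->level - 1));
		bool is_empty = true;

		for (access_bit = 0; access_bit < SKETCH_NUM_ACCESS_FS;
		     access_bit++) {
			if (!(access_request & (1U << access_bit)))
				continue;
			if (layer->access & (1U << access_bit))
				layer_masks[access_bit] &=
					(layer_mask_t)~layer_bit;
			is_empty = is_empty && !layer_masks[access_bit];
		}
		if (is_empty)
			return true;
	}
	return false;
}
```

Walking from /a/b up to /a with a single layer, the first call clears the execute entry and returns false, and the second call clears the read entry and returns true, because the matrix carries state between the two rules.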


> }
> + if (is_empty)
> + return true;
> }
> - return layer_mask;
> + return false;
> }
>
> static int check_access_path(const struct landlock_ruleset *const domain,
> const struct path *const path,
> const access_mask_t access_request)
> {
> - bool allowed = false;
> + layer_mask_t layer_masks[LANDLOCK_NUM_ACCESS_FS] = {};
> + bool allowed = false, has_access = false;
> struct path walker_path;
> - layer_mask_t layer_mask;
> size_t i;
>
> if (!access_request)
> @@ -262,13 +283,20 @@ static int check_access_path(const struct landlock_ruleset *const domain,
> return -EACCES;
>
> /* Saves all layers handling a subset of requested accesses. */
> - layer_mask = 0;
> for (i = 0; i < domain->num_layers; i++) {
> - if (domain->fs_access_masks[i] & access_request)
> - layer_mask |= BIT_ULL(i);
> + const unsigned long access_req = access_request;
> + unsigned long access_bit;
> +
> + for_each_set_bit(access_bit, &access_req,
> + ARRAY_SIZE(layer_masks)) {
> + if (domain->fs_access_masks[i] & BIT_ULL(access_bit)) {
> + layer_masks[access_bit] |= BIT_ULL(i);
> + has_access = true;
> + }
> + }
> }
> /* An access request not handled by the domain is allowed. */
> - if (layer_mask == 0)
> + if (!has_access)
> return 0;
>
> walker_path = *path;
> @@ -280,14 +308,11 @@ static int check_access_path(const struct landlock_ruleset *const domain,
> while (true) {
> struct dentry *parent_dentry;
>
> - layer_mask = unmask_layers(find_rule(domain,
> - walker_path.dentry), access_request,
> - layer_mask);
> - if (layer_mask == 0) {
> + allowed = unmask_layers(find_rule(domain, walker_path.dentry),
> + access_request, &layer_masks);
> + if (allowed)
> /* Stops when a rule from each layer grants access. */
> - allowed = true;
> break;
> - }
>
> jump_up:
> if (walker_path.dentry == walker_path.mnt->mnt_root) {

--
paul-moore.com

2022-03-17 05:50:38

by Paul Moore

Subject: Re: [PATCH v1 08/11] samples/landlock: Add support for file reparenting

On Mon, Feb 21, 2022 at 4:15 PM Mickaël Salaün <[email protected]> wrote:
>
> From: Mickaël Salaün <[email protected]>
>
> Add LANDLOCK_ACCESS_FS_REFER to the "roughly write" access rights and
> leverage the Landlock ABI version to only try to enforce it if it is
> supported by the running kernel.
>
> Signed-off-by: Mickaël Salaün <[email protected]>
> Link: https://lore.kernel.org/r/[email protected]
> ---
> samples/landlock/sandboxer.c | 37 +++++++++++++++++++++++++-----------
> 1 file changed, 26 insertions(+), 11 deletions(-)

Reviewed-by: Paul Moore <[email protected]>


--
paul-moore.com

2022-03-17 05:52:24

by Paul Moore

Subject: Re: [PATCH v1 09/11] landlock: Document LANDLOCK_ACCESS_FS_REFER and ABI versioning

On Mon, Feb 21, 2022 at 4:15 PM Mickaël Salaün <[email protected]> wrote:
>
> From: Mickaël Salaün <[email protected]>
>
> Add LANDLOCK_ACCESS_FS_REFER in the example and properly check to only
> use it if the current kernel supports it thanks to the Landlock ABI
> version.
>
> Move the file renaming and linking limitation to a new "Previous
> limitations" section.
>
> Improve documentation about the backward and forward compatibility,
> including the rationale for ruleset's handled_access_fs.
>
> Signed-off-by: Mickaël Salaün <[email protected]>
> Link: https://lore.kernel.org/r/[email protected]
> ---
> Documentation/userspace-api/landlock.rst | 124 +++++++++++++++++++----
> 1 file changed, 104 insertions(+), 20 deletions(-)

Thanks for remembering to update the docs :) I made a few phrasing
suggestions below, but otherwise it looks good to me.

Reviewed-by: Paul Moore <[email protected]>

> diff --git a/Documentation/userspace-api/landlock.rst b/Documentation/userspace-api/landlock.rst
> index f35552ff19ba..97db09d36a5c 100644
> --- a/Documentation/userspace-api/landlock.rst
> +++ b/Documentation/userspace-api/landlock.rst
> @@ -281,6 +347,24 @@ Memory usage
> Kernel memory allocated to create rulesets is accounted and can be restricted
> by the Documentation/admin-guide/cgroup-v1/memory.rst.
>
> +Previous limitations
> +====================
> +
> +File renaming and linking (ABI 1)
> +---------------------------------
> +
> +Because Landlock targets unprivileged access controls, it is needed to properly
                                                          ^^^^^
"... controls, it needs to ..."

> +handle composition of rules. Such property also implies rules nesting.
> +Properly handling multiple layers of ruleset, each one of them able to restrict
                                        ^^^^^^^
"rulesets,"

> +access to files, also implies to inherit the ruleset restrictions from a parent
                                 ^^^^^^^^^^
"... implies inheritance of the ..."

> +to its hierarchy. Because files are identified and restricted by their
> +hierarchy, moving or linking a file from one directory to another implies to
> +propagate the hierarchy constraints.

"... one directory to another implies propagation of the hierarchy constraints."

> + To protect against privilege escalations

> +through renaming or linking, and for the sake of simplicity, Landlock previously
> +limited linking and renaming to the same directory. Starting with the Landlock
> +ABI version 2, it is now possible to securely control renaming and linking
> +thanks to the new `LANDLOCK_ACCESS_FS_REFER` access right.
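The ABI check described above could look roughly like the sketch below. handled_fs_for_abi() is a hypothetical helper and SKETCH_ACCESS_FS_REFER is a local copy of the access right's value (bit 13). In a real program the abi argument would come from landlock_create_ruleset(NULL, 0, LANDLOCK_CREATE_RULESET_VERSION):

```c
#include <stdint.h>

/* Local copy of the LANDLOCK_ACCESS_FS_REFER value, for illustration only. */
#define SKETCH_ACCESS_FS_REFER (1ULL << 13)

/*
 * Returns the access rights a ruleset may handle on a kernel that reports
 * the given Landlock ABI version. REFER is only known from ABI 2 onward,
 * so it is dropped for older kernels instead of failing ruleset creation.
 */
uint64_t handled_fs_for_abi(int abi, uint64_t wanted)
{
	if (abi < 2)
		wanted &= ~SKETCH_ACCESS_FS_REFER;
	return wanted;
}
```

On ABI 1 the sandbox then falls back to the previous behavior, where reparenting is always denied.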

--
paul-moore.com

2022-03-17 06:41:04

by Paul Moore

Subject: Re: [PATCH v1 05/11] landlock: Move filesystem helpers and add a new one

On Mon, Feb 21, 2022 at 4:15 PM Mickaël Salaün <[email protected]> wrote:
>
> From: Mickaël Salaün <[email protected]>
>
> Move the SB_NOUSER and IS_PRIVATE dentry check to a standalone
> is_nouser_or_private() helper. This will be useful for a following
> commit.
>
> Move get_mode_access() and maybe_remove() to make them usable by new
> code provided by a following commit.
>
> Signed-off-by: Mickaël Salaün <[email protected]>
> Link: https://lore.kernel.org/r/[email protected]
> ---
> security/landlock/fs.c | 87 ++++++++++++++++++++++--------------------
> 1 file changed, 46 insertions(+), 41 deletions(-)

One nit-picky comment below, otherwise it looks fine to me.

Reviewed-by: Paul Moore <[email protected]>

> diff --git a/security/landlock/fs.c b/security/landlock/fs.c
> index 9662f9fb3cd0..3886f9ad1a60 100644
> --- a/security/landlock/fs.c
> +++ b/security/landlock/fs.c
> @@ -257,6 +257,18 @@ static inline bool unmask_layers(const struct landlock_rule *const rule,
> return false;
> }
>
> +static inline bool is_nouser_or_private(const struct dentry *dentry)
> +{
> + /*
> + * Allows access to pseudo filesystems that will never be mountable
> + * (e.g. sockfs, pipefs), but can still be reachable through
> + * /proc/<pid>/fd/<file-descriptor> .
> + */

I might suggest moving this explanation up to a function header comment block.


> + return (dentry->d_sb->s_flags & SB_NOUSER) ||
> + (d_is_positive(dentry) &&
> + unlikely(IS_PRIVATE(d_backing_inode(dentry))));
> +}
> +
> static int check_access_path(const struct landlock_ruleset *const domain,
> const struct path *const path,
> const access_mask_t access_request)

--
paul-moore.com

2022-03-17 10:05:26

by Mickaël Salaün

Subject: Re: [PATCH v1 01/11] landlock: Define access_mask_t to enforce a consistent access mask size


On 17/03/2022 02:26, Paul Moore wrote:
> On Mon, Feb 21, 2022 at 4:15 PM Mickaël Salaün <[email protected]> wrote:
>>
>> From: Mickaël Salaün <[email protected]>
>>
>> Create and use the access_mask_t typedef to enforce a consistent access
>> mask size and uniformly use a 16-bits type. This will helps transition
>> to a 32-bits value one day.
>>
>> Add a build check to make sure all (filesystem) access rights fit in.
>> This will be extended with a following commit.
>>
>> Signed-off-by: Mickaël Salaün <[email protected]>
>> Link: https://lore.kernel.org/r/[email protected]
>> ---
>> security/landlock/fs.c | 19 ++++++++++---------
>> security/landlock/fs.h | 2 +-
>> security/landlock/limits.h | 2 ++
>> security/landlock/ruleset.c | 6 ++++--
>> security/landlock/ruleset.h | 17 +++++++++++++----
>> 5 files changed, 30 insertions(+), 16 deletions(-)
>>
>> diff --git a/security/landlock/fs.c b/security/landlock/fs.c
>> index 97b8e421f617..9de2a460a762 100644
>> --- a/security/landlock/fs.c
>> +++ b/security/landlock/fs.c
>> @@ -150,7 +150,7 @@ static struct landlock_object *get_inode_object(struct inode *const inode)
>> * @path: Should have been checked by get_path_from_fd().
>> */
>> int landlock_append_fs_rule(struct landlock_ruleset *const ruleset,
>> - const struct path *const path, u32 access_rights)
>> + const struct path *const path, access_mask_t access_rights)
>> {
>> int err;
>> struct landlock_object *object;
>> @@ -182,8 +182,8 @@ int landlock_append_fs_rule(struct landlock_ruleset *const ruleset,
>>
>> static inline u64 unmask_layers(
>> const struct landlock_ruleset *const domain,
>> - const struct path *const path, const u32 access_request,
>> - u64 layer_mask)
>> + const struct path *const path,
>> + const access_mask_t access_request, u64 layer_mask)
>> {
>> const struct landlock_rule *rule;
>> const struct inode *inode;
>> @@ -223,7 +223,8 @@ static inline u64 unmask_layers(
>> }
>>
>> static int check_access_path(const struct landlock_ruleset *const domain,
>> - const struct path *const path, u32 access_request)
>> + const struct path *const path,
>> + const access_mask_t access_request)
>> {
>> bool allowed = false;
>> struct path walker_path;
>> @@ -308,7 +309,7 @@ static int check_access_path(const struct landlock_ruleset *const domain,
>> }
>>
>> static inline int current_check_access_path(const struct path *const path,
>> - const u32 access_request)
>> + const access_mask_t access_request)
>> {
>> const struct landlock_ruleset *const dom =
>> landlock_get_current_domain();
>> @@ -511,7 +512,7 @@ static int hook_sb_pivotroot(const struct path *const old_path,
>>
>> /* Path hooks */
>>
>> -static inline u32 get_mode_access(const umode_t mode)
>> +static inline access_mask_t get_mode_access(const umode_t mode)
>> {
>> switch (mode & S_IFMT) {
>> case S_IFLNK:
>> @@ -563,7 +564,7 @@ static int hook_path_link(struct dentry *const old_dentry,
>> get_mode_access(d_backing_inode(old_dentry)->i_mode));
>> }
>>
>> -static inline u32 maybe_remove(const struct dentry *const dentry)
>> +static inline access_mask_t maybe_remove(const struct dentry *const dentry)
>> {
>> if (d_is_negative(dentry))
>> return 0;
>> @@ -631,9 +632,9 @@ static int hook_path_rmdir(const struct path *const dir,
>>
>> /* File hooks */
>>
>> -static inline u32 get_file_access(const struct file *const file)
>> +static inline access_mask_t get_file_access(const struct file *const file)
>> {
>> - u32 access = 0;
>> + access_mask_t access = 0;
>>
>> if (file->f_mode & FMODE_READ) {
>> /* A directory can only be opened in read mode. */
>> diff --git a/security/landlock/fs.h b/security/landlock/fs.h
>> index 187284b421c9..74be312aad96 100644
>> --- a/security/landlock/fs.h
>> +++ b/security/landlock/fs.h
>> @@ -65,6 +65,6 @@ static inline struct landlock_superblock_security *landlock_superblock(
>> __init void landlock_add_fs_hooks(void);
>>
>> int landlock_append_fs_rule(struct landlock_ruleset *const ruleset,
>> - const struct path *const path, u32 access_hierarchy);
>> + const struct path *const path, access_mask_t access_hierarchy);
>>
>> #endif /* _SECURITY_LANDLOCK_FS_H */
>> diff --git a/security/landlock/limits.h b/security/landlock/limits.h
>> index 2a0a1095ee27..458d1de32ed5 100644
>> --- a/security/landlock/limits.h
>> +++ b/security/landlock/limits.h
>> @@ -9,6 +9,7 @@
>> #ifndef _SECURITY_LANDLOCK_LIMITS_H
>> #define _SECURITY_LANDLOCK_LIMITS_H
>>
>> +#include <linux/bitops.h>
>> #include <linux/limits.h>
>> #include <uapi/linux/landlock.h>
>>
>> @@ -17,5 +18,6 @@
>>
>> #define LANDLOCK_LAST_ACCESS_FS LANDLOCK_ACCESS_FS_MAKE_SYM
>> #define LANDLOCK_MASK_ACCESS_FS ((LANDLOCK_LAST_ACCESS_FS << 1) - 1)
>> +#define LANDLOCK_NUM_ACCESS_FS __const_hweight64(LANDLOCK_MASK_ACCESS_FS)
>
> The line above, and the static_assert() in ruleset.h are clever. I'll
> admit I didn't even know the hweightX() macros existed until looking
> at this code :)
>
> However, the LANDLOCK_NUM_ACCESS_FS is never really going to be used
> outside the static_assert() in ruleset.h is it? I wonder if it would
> be better to skip the extra macro and rewrite the static_assert like
> this:
>
> static_assert(BITS_PER_TYPE(access_mask_t) >=
> __const_hweight64(LANDLOCK_MASK_ACCESS_FS));
>
> If not, I might suggest changing LANDLOCK_NUM_ACCESS_FS to
> LANDLOCK_BITS_ACCESS_FS or something similar.

I declared LANDLOCK_NUM_ACCESS_FS in this patch to be able to have the
static_assert() here and ease the review, but LANDLOCK_NUM_ACCESS_FS is
really used in patch 6/11 to define an array size:
get_handled_accesses(), init_layer_masks(), is_superset(),
check_access_path_dual()…
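
To illustrate how the same constant serves both purposes, here is a minimal
user-space sketch (not the kernel code). The SKETCH_ names are hypothetical,
the hweight macro is a plain C stand-in for __const_hweight64(), and the
values assume 13 access rights (last one at bit 12) and a 16-bit
layer_mask_t, consistent with the "2*13 bytes" figure from patch 4/11.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* Constant-expression population count, usable in array sizes. */
#define SKETCH_HWEIGHT8(w)                                               \
	((unsigned int)(!!((w) & 0x01ULL) + !!((w) & 0x02ULL) +          \
			!!((w) & 0x04ULL) + !!((w) & 0x08ULL) +          \
			!!((w) & 0x10ULL) + !!((w) & 0x20ULL) +          \
			!!((w) & 0x40ULL) + !!((w) & 0x80ULL)))
#define SKETCH_HWEIGHT16(w) (SKETCH_HWEIGHT8(w) + SKETCH_HWEIGHT8((w) >> 8))

/* Assumed values: the last access right sits at bit 12. */
#define SKETCH_LAST_ACCESS_FS (1ULL << 12)
#define SKETCH_MASK_ACCESS_FS ((SKETCH_LAST_ACCESS_FS << 1) - 1)
#define SKETCH_NUM_ACCESS_FS SKETCH_HWEIGHT16(SKETCH_MASK_ACCESS_FS)

typedef uint16_t access_mask_t;
typedef uint16_t layer_mask_t;

/* Build-time check: every access right must fit in access_mask_t. */
static_assert(sizeof(access_mask_t) * CHAR_BIT >= SKETCH_NUM_ACCESS_FS,
	      "access_mask_t is too small");

/* The same constant sizes the per-access array of layer masks. */
struct sketch_walk_state {
	layer_mask_t layer_masks[SKETCH_NUM_ACCESS_FS];
};

/* Size in bytes of the per-access array. */
static size_t sketch_layer_masks_bytes(void)
{
	return sizeof(((struct sketch_walk_state *)0)->layer_masks);
}
```

Keeping a named constant avoids repeating the hweight expression at every
array declaration.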


>
>
>> diff --git a/security/landlock/ruleset.h b/security/landlock/ruleset.h
>> index 2d3ed7ec5a0a..7e7cac68e443 100644
>> --- a/security/landlock/ruleset.h
>> +++ b/security/landlock/ruleset.h
>> @@ -9,13 +9,20 @@
>> #ifndef _SECURITY_LANDLOCK_RULESET_H
>> #define _SECURITY_LANDLOCK_RULESET_H
>>
>> +#include <linux/bitops.h>
>> +#include <linux/build_bug.h>
>> #include <linux/mutex.h>
>> #include <linux/rbtree.h>
>> #include <linux/refcount.h>
>> #include <linux/workqueue.h>
>>
>> +#include "limits.h"
>> #include "object.h"
>>
>> +typedef u16 access_mask_t;
>> +/* Makes sure all filesystem access rights can be stored. */
>> +static_assert(BITS_PER_TYPE(access_mask_t) >= LANDLOCK_NUM_ACCESS_FS);
>
> --
> paul-moore.com

2022-03-17 10:44:25

by Mickaël Salaün

Subject: Re: [PATCH v1 05/11] landlock: Move filesystem helpers and add a new one


On 17/03/2022 02:26, Paul Moore wrote:
> On Mon, Feb 21, 2022 at 4:15 PM Mickaël Salaün <[email protected]> wrote:
>>
>> From: Mickaël Salaün <[email protected]>
>>
>> Move the SB_NOUSER and IS_PRIVATE dentry check to a standalone
>> is_nouser_or_private() helper. This will be useful for a following
>> commit.
>>
>> Move get_mode_access() and maybe_remove() to make them usable by new
>> code provided by a following commit.
>>
>> Signed-off-by: Mickaël Salaün <[email protected]>
>> Link: https://lore.kernel.org/r/[email protected]
>> ---
>> security/landlock/fs.c | 87 ++++++++++++++++++++++--------------------
>> 1 file changed, 46 insertions(+), 41 deletions(-)
>
> One nit-picky comment below, otherwise it looks fine to me.
>
> Reviewed-by: Paul Moore <[email protected]>
>
>> diff --git a/security/landlock/fs.c b/security/landlock/fs.c
>> index 9662f9fb3cd0..3886f9ad1a60 100644
>> --- a/security/landlock/fs.c
>> +++ b/security/landlock/fs.c
>> @@ -257,6 +257,18 @@ static inline bool unmask_layers(const struct landlock_rule *const rule,
>> return false;
>> }
>>
>> +static inline bool is_nouser_or_private(const struct dentry *dentry)
>> +{
>> + /*
>> + * Allows access to pseudo filesystems that will never be mountable
>> + * (e.g. sockfs, pipefs), but can still be reachable through
>> + * /proc/<pid>/fd/<file-descriptor> .
>> + */
>
> I might suggest moving this explanation up to a function header comment block.

Sounds good.


>
>
>> + return (dentry->d_sb->s_flags & SB_NOUSER) ||
>> + (d_is_positive(dentry) &&
>> + unlikely(IS_PRIVATE(d_backing_inode(dentry))));
>> +}
>> +
>> static int check_access_path(const struct landlock_ruleset *const domain,
>> const struct path *const path,
>> const access_mask_t access_request)
>
> --
> paul-moore.com

2022-03-17 12:55:34

by Mickaël Salaün

Subject: Re: [PATCH v1 04/11] landlock: Fix same-layer rule unions


On 17/03/2022 02:26, Paul Moore wrote:
> On Mon, Feb 21, 2022 at 4:15 PM Mickaël Salaün <[email protected]> wrote:
>>
>> From: Mickaël Salaün <[email protected]>
>>
>> The original behavior was to check if the full set of requested accesses
>> was allowed by at least a rule of every relevant layer. This didn't
>> take into account requests for multiple accesses and same-layer rules
>> allowing the union of these accesses in a complementary way. As a
>> result, multiple accesses requested on a file hierarchy matching rules
>> that, together, allowed these accesses, but without a unique rule
>> allowing all of them, was illegitimately denied. This case should be
>> rare in practice and it can only be triggered by the path_rename or
>> file_open hook implementations.
>>
>> For instance, if, for the same layer, a rule allows execution
>> beneath /a/b and another rule allows read beneath /a, requesting access
>> to read and execute at the same time for /a/b should be allowed for this
>> layer.
>>
>> This was an inconsistency because the union of same-layer rule accesses
>> was already allowed if requested once at a time anyway.
>>
>> This fix changes the way allowed accesses are gathered over a path walk.
>> To take into account all these rule accesses, we store in a matrix all
>> layer granting the set of requested accesses, according to the handled
>> accesses. To avoid heap allocation, we use an array on the stack which
>> is 2*13 bytes. A following commit bringing the LANDLOCK_ACCESS_FS_REFER
>> access right will increase this size to reach 84 bytes (2*14*3) in case
>> of link or rename actions.
>>
>> Add a new layout1.layer_rule_unions test to check that accesses from
>> different rules pertaining to the same layer are ORed in a file
>> hierarchy. Also test that it is not the case for rules from different
>> layers.
>>
>> Signed-off-by: Mickaël Salaün <[email protected]>
>> Link: https://lore.kernel.org/r/[email protected]
>> ---
>> security/landlock/fs.c | 77 ++++++++++-----
>> security/landlock/ruleset.h | 2 +
>> tools/testing/selftests/landlock/fs_test.c | 107 +++++++++++++++++++++
>> 3 files changed, 160 insertions(+), 26 deletions(-)
>>
>> diff --git a/security/landlock/fs.c b/security/landlock/fs.c
>> index 0bcb27f2360a..9662f9fb3cd0 100644
>> --- a/security/landlock/fs.c
>> +++ b/security/landlock/fs.c
>> @@ -204,45 +204,66 @@ static inline const struct landlock_rule *find_rule(
>> return rule;
>> }
>>
>> -static inline layer_mask_t unmask_layers(
>> - const struct landlock_rule *const rule,
>> - const access_mask_t access_request, layer_mask_t layer_mask)
>> +/*
>> + * @layer_masks is read and may be updated according to the access request and
>> + * the matching rule.
>> + *
>> + * Returns true if the request is allowed (i.e. relevant layer masks for the
>> + * request are empty).
>> + */
>> +static inline bool unmask_layers(const struct landlock_rule *const rule,
>> + const access_mask_t access_request,
>> + layer_mask_t (*const layer_masks)[LANDLOCK_NUM_ACCESS_FS])
>> {
>> size_t layer_level;
>>
>> + if (!access_request || !layer_masks)
>> + return true;
>> if (!rule)
>> - return layer_mask;
>> + return false;
>>
>> /*
>> * An access is granted if, for each policy layer, at least one rule
>> - * encountered on the pathwalk grants the requested accesses,
>> - * regardless of their position in the layer stack. We must then check
>> + * encountered on the pathwalk grants the requested access,
>> + * regardless of its position in the layer stack. We must then check
>> * the remaining layers for each inode, from the first added layer to
>> - * the last one.
>> + * the last one. When there is multiple requested accesses, for each
>> + * policy layer, the full set of requested accesses may not be granted
>> + * by only one rule, but by the union (binary OR) of multiple rules.
>> + * E.g. /a/b <execute> + /a <read> = /a/b <execute + read>
>> */
>> for (layer_level = 0; layer_level < rule->num_layers; layer_level++) {
>> const struct landlock_layer *const layer =
>> &rule->layers[layer_level];
>> const layer_mask_t layer_bit = BIT_ULL(layer->level - 1);
>> + const unsigned long access_req = access_request;
>> + unsigned long access_bit;
>> + bool is_empty;
>>
>> - /* Checks that the layer grants access to the full request. */
>> - if ((layer->access & access_request) == access_request) {
>> - layer_mask &= ~layer_bit;
>> -
>> - if (layer_mask == 0)
>> - return layer_mask;
>> + /*
>> + * Records in @layer_masks which layer grants access to each
>> + * requested access.
>> + */
>> + is_empty = true;
>> + for_each_set_bit(access_bit, &access_req,
>> + ARRAY_SIZE(*layer_masks)) {
>> + if (layer->access & BIT_ULL(access_bit))
>> + (*layer_masks)[access_bit] &= ~layer_bit;
>> + is_empty = is_empty && !(*layer_masks)[access_bit];
>
> From what I can see the only reason not to return immediately once
> @is_empty is true is the need to update @layer_masks. However, the
> only caller that I can see (up to patch 4/11) is check_access_path()
> which thanks to this patch no longer needs to reference @layer_masks
> after the call to unmask_layers() returns true. Assuming that to be
> the case, is there a reason we can't return immediately after finding
> @is_empty true, or am I missing something?

Because @is_empty is initialized to true, and because each access
right/bit must be checked by this loop, we cannot return earlier than
the following if statement. Not returning in this loop also makes this
helper safer (for potential future use) because @layer_masks will never
be partially updated, which could lead to an inconsistent state.
Moreover, finishing this bit-check loop makes the code simpler and has
a negligible performance impact.
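
As a rough illustration, here is a simplified user-space sketch of that loop
(not the kernel helper itself). It uses a reduced set of three access bits
and a hypothetical struct sketch_layer; all SKETCH_/sketch_ names are made up
for the example. It replays the commit message case: execute granted beneath
/a/b and read granted beneath /a, for the same layer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_NUM_ACCESS 3 /* reduced set: execute, write, read */
#define SKETCH_EXECUTE (1u << 0)
#define SKETCH_READ (1u << 2)

typedef uint16_t access_mask_t;
typedef uint16_t layer_mask_t;

/* Accesses granted by one rule for one layer (simplified layout). */
struct sketch_layer {
	unsigned int level;   /* 1-based layer level */
	access_mask_t access; /* accesses granted for that layer */
};

/*
 * Clears, for each requested access, the bit of every layer granting it.
 * Returns true once no layer denies any requested access anymore.
 */
static bool sketch_unmask_layers(const struct sketch_layer *layers,
				 size_t num_layers,
				 access_mask_t access_request,
				 layer_mask_t (*layer_masks)[SKETCH_NUM_ACCESS])
{
	for (size_t i = 0; i < num_layers; i++) {
		layer_mask_t layer_bit;
		bool is_empty = true;

		assert(layers[i].level >= 1);
		layer_bit = (layer_mask_t)(1u << (layers[i].level - 1));

		/* Visits every requested bit: no partial update of the matrix. */
		for (size_t bit = 0; bit < SKETCH_NUM_ACCESS; bit++) {
			if (!(access_request & (1u << bit)))
				continue;
			if (layers[i].access & (1u << bit))
				(*layer_masks)[bit] &= (layer_mask_t)~layer_bit;
			is_empty = is_empty && !(*layer_masks)[bit];
		}
		if (is_empty)
			return true;
	}
	return false;
}

/* Walks /a/b then /a for a read + execute request within a single layer. */
static bool sketch_demo_union(void)
{
	layer_mask_t layer_masks[SKETCH_NUM_ACCESS] = { 0 };
	const access_mask_t request = SKETCH_EXECUTE | SKETCH_READ;
	const struct sketch_layer rule_a_b = { .level = 1, .access = SKETCH_EXECUTE };
	const struct sketch_layer rule_a = { .level = 1, .access = SKETCH_READ };

	/* Layer 1 handles both accesses, so it initially denies both. */
	layer_masks[0] = 1;
	layer_masks[2] = 1;

	/* The /a/b rule alone must not be enough. */
	if (sketch_unmask_layers(&rule_a_b, 1, request, &layer_masks))
		return false;
	/* The /a rule completes the union. */
	return sketch_unmask_layers(&rule_a, 1, request, &layer_masks);
}
```

The check on is_empty only happens after the inner loop, so each call leaves
the matrix fully updated for the visited layer.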


>
>
>> }
>> + if (is_empty)
>> + return true;
>> }
>> - return layer_mask;
>> + return false;
>> }
>>
>> static int check_access_path(const struct landlock_ruleset *const domain,
>> const struct path *const path,
>> const access_mask_t access_request)
>> {
>> - bool allowed = false;
>> + layer_mask_t layer_masks[LANDLOCK_NUM_ACCESS_FS] = {};
>> + bool allowed = false, has_access = false;
>> struct path walker_path;
>> - layer_mask_t layer_mask;
>> size_t i;
>>
>> if (!access_request)
>> @@ -262,13 +283,20 @@ static int check_access_path(const struct landlock_ruleset *const domain,
>> return -EACCES;
>>
>> /* Saves all layers handling a subset of requested accesses. */
>> - layer_mask = 0;
>> for (i = 0; i < domain->num_layers; i++) {
>> - if (domain->fs_access_masks[i] & access_request)
>> - layer_mask |= BIT_ULL(i);
>> + const unsigned long access_req = access_request;
>> + unsigned long access_bit;
>> +
>> + for_each_set_bit(access_bit, &access_req,
>> + ARRAY_SIZE(layer_masks)) {
>> + if (domain->fs_access_masks[i] & BIT_ULL(access_bit)) {
>> + layer_masks[access_bit] |= BIT_ULL(i);
>> + has_access = true;
>> + }
>> + }
>> }
>> /* An access request not handled by the domain is allowed. */
>> - if (layer_mask == 0)
>> + if (!has_access)
>> return 0;
>>
>> walker_path = *path;
>> @@ -280,14 +308,11 @@ static int check_access_path(const struct landlock_ruleset *const domain,
>> while (true) {
>> struct dentry *parent_dentry;
>>
>> - layer_mask = unmask_layers(find_rule(domain,
>> - walker_path.dentry), access_request,
>> - layer_mask);
>> - if (layer_mask == 0) {
>> + allowed = unmask_layers(find_rule(domain, walker_path.dentry),
>> + access_request, &layer_masks);
>> + if (allowed)
>> /* Stops when a rule from each layer grants access. */
>> - allowed = true;
>> break;
>> - }
>>
>> jump_up:
>> if (walker_path.dentry == walker_path.mnt->mnt_root) {
>
> --
> paul-moore.com

2022-03-17 13:20:18

by Mickaël Salaün

Subject: Re: [PATCH v1 06/11] landlock: Add support for file reparenting with LANDLOCK_ACCESS_FS_REFER


On 17/03/2022 02:26, Paul Moore wrote:
> On Mon, Feb 21, 2022 at 4:15 PM Mickaël Salaün <[email protected]> wrote:
>>
>> From: Mickaël Salaün <[email protected]>
>>
>> Add a new LANDLOCK_ACCESS_FS_REFER access right to enable policy writers
>> to allow sandboxed processes to link and rename files from and to a
>> specific set of file hierarchies. This access right should be composed
>> with LANDLOCK_ACCESS_FS_MAKE_* for the destination of a link or rename,
>> and with LANDLOCK_ACCESS_FS_REMOVE_* for a source of a rename. This
>> lift a Landlock limitation that always denied changing the parent of an
>> inode.
>>
>> Renaming or linking to the same directory is still always allowed,
>> whatever LANDLOCK_ACCESS_FS_REFER is used or not, because it is not
>> considered a threat to user data.
>>
>> However, creating multiple links or renaming to a different parent
>> directory may lead to privilege escalations if not handled properly.
>> Indeed, we must be sure that the source doesn't gain more privileges by
>> being accessible from the destination. This is handled by making sure
>> that the source hierarchy (including the referenced file or directory
>> itself) restricts at least as much the destination hierarchy. If it is
>> not the case, an EXDEV error is returned, making it potentially possible
>> for user space to copy the file hierarchy instead of moving or linking
>> it.
>>
>> Instead of creating different access rights for the source and the
>> destination, we choose to make it simple and consistent for users.
>> Indeed, considering the previous constraint, it would be weird to
>> require such destination access right to be also granted to the source
>> (to make it a superset).
>>
>> See the provided documentation for additional details.
>>
>> New tests are provided with a following commit.
>>
>> Signed-off-by: Mickaël Salaün <[email protected]>
>> Link: https://lore.kernel.org/r/[email protected]
>> ---
>> include/uapi/linux/landlock.h | 27 +-
>> security/landlock/fs.c | 550 ++++++++++++++++---
>> security/landlock/limits.h | 2 +-
>> security/landlock/syscalls.c | 2 +-
>> tools/testing/selftests/landlock/base_test.c | 2 +-
>> tools/testing/selftests/landlock/fs_test.c | 3 +-
>> 6 files changed, 516 insertions(+), 70 deletions(-)
>
> ...
>
>> diff --git a/security/landlock/fs.c b/security/landlock/fs.c
>> index 3886f9ad1a60..c7c7ce4e7cd5 100644
>> --- a/security/landlock/fs.c
>> +++ b/security/landlock/fs.c
>> @@ -4,6 +4,7 @@
>> *
>> * Copyright © 2016-2020 Mickaël Salaün <[email protected]>
>> * Copyright © 2018-2020 ANSSI
>> + * Copyright © 2021-2022 Microsoft Corporation
>> */
>>
>> #include <linux/atomic.h>
>> @@ -269,16 +270,188 @@ static inline bool is_nouser_or_private(const struct dentry *dentry)
>> unlikely(IS_PRIVATE(d_backing_inode(dentry))));
>> }
>>
>> -static int check_access_path(const struct landlock_ruleset *const domain,
>> - const struct path *const path,
>> +static inline access_mask_t get_handled_accesses(
>> + const struct landlock_ruleset *const domain)
>> +{
>> + access_mask_t access_dom = 0;
>> + unsigned long access_bit;
>
> Would it be better to declare @access_bit as an access_mask_t type?
> You're not using any macros like for_each_set_bit() in this function
> so I believe it should be safe.

Right, I'll change that.


>
>> + for (access_bit = 0; access_bit < LANDLOCK_NUM_ACCESS_FS;
>> + access_bit++) {
>> + size_t layer_level;
>
> Considering the number of layers has dropped down to 16, it seems like
> a normal unsigned int might be big enough for @layer_level :)

We could switch to u8, but I prefer to stick to size_t for array indexes,
which reduces the cognitive workload related to the size of such
arrays. ;) I guess there is enough info for compilers to optimize
such code anyway.


>
>> + for (layer_level = 0; layer_level < domain->num_layers;
>> + layer_level++) {
>> + if (domain->fs_access_masks[layer_level] &
>> + BIT_ULL(access_bit)) {
>> + access_dom |= BIT_ULL(access_bit);
>> + break;
>> + }
>> + }
>> + }
>> + return access_dom;
>> +}
>> +
>> +static inline access_mask_t init_layer_masks(
>> + const struct landlock_ruleset *const domain,
>> + const access_mask_t access_request,
>> + layer_mask_t (*const layer_masks)[LANDLOCK_NUM_ACCESS_FS])
>> +{
>> + access_mask_t handled_accesses = 0;
>> + size_t layer_level;
>> +
>> + memset(layer_masks, 0, sizeof(*layer_masks));
>> + if (WARN_ON_ONCE(!access_request))
>> + return 0;
>> +
>> + /* Saves all handled accesses per layer. */
>> + for (layer_level = 0; layer_level < domain->num_layers;
>> + layer_level++) {
>> + const unsigned long access_req = access_request;
>> + unsigned long access_bit;
>> +
>> + for_each_set_bit(access_bit, &access_req,
>> + ARRAY_SIZE(*layer_masks)) {
>> + if (domain->fs_access_masks[layer_level] &
>> + BIT_ULL(access_bit)) {
>> + (*layer_masks)[access_bit] |=
>> + BIT_ULL(layer_level);
>> + handled_accesses |= BIT_ULL(access_bit);
>> + }
>> + }
>> + }
>> + return handled_accesses;
>> +}
>> +
>> +/*
>> + * Check that a destination file hierarchy has more restrictions than a source
>> + * file hierarchy. This is only used for link and rename actions.
>> + */
>> +static inline bool is_superset(bool child_is_directory,
>> + const layer_mask_t (*const
>> + layer_masks_dst_parent)[LANDLOCK_NUM_ACCESS_FS],
>> + const layer_mask_t (*const
>> + layer_masks_src_parent)[LANDLOCK_NUM_ACCESS_FS],
>> + const layer_mask_t (*const
>> + layer_masks_child)[LANDLOCK_NUM_ACCESS_FS])
>> +{
>> + unsigned long access_bit;
>> +
>> + for (access_bit = 0; access_bit < ARRAY_SIZE(*layer_masks_dst_parent);
>> + access_bit++) {
>> + /* Ignores accesses that only make sense for directories. */
>> + if (!child_is_directory && !(BIT_ULL(access_bit) & ACCESS_FILE))
>> + continue;
>> +
>> + /*
>> + * Checks if the destination restrictions are a superset of the
>> + * source ones (i.e. inherited access rights without child
>> + * exceptions).
>> + */
>> + if ((((*layer_masks_src_parent)[access_bit] & (*layer_masks_child)[access_bit]) |
>> + (*layer_masks_dst_parent)[access_bit]) !=
>> + (*layer_masks_dst_parent)[access_bit])
>> + return false;
>> + }
>> + return true;
>> +}
>> +
>> +/*
>> + * Removes @layer_masks accesses that are not requested.
>> + *
>> + * Returns true if the request is allowed, false otherwise.
>> + */
>> +static inline bool scope_to_request(const access_mask_t access_request,
>> + layer_mask_t (*const layer_masks)[LANDLOCK_NUM_ACCESS_FS])
>> +{
>> + const unsigned long access_req = access_request;
>> + unsigned long access_bit;
>> +
>> + if (WARN_ON_ONCE(!layer_masks))
>> + return true;
>> +
>> + for_each_clear_bit(access_bit, &access_req, ARRAY_SIZE(*layer_masks))
>> + (*layer_masks)[access_bit] = 0;
>> + return !memchr_inv(layer_masks, 0, sizeof(*layer_masks));
>> +}
>> +
>> +/*
>> + * Returns true if there is at least one access right different than
>> + * LANDLOCK_ACCESS_FS_REFER.
>> + */
>> +static inline bool is_eacces(
>> + const layer_mask_t (*const
>> + layer_masks)[LANDLOCK_NUM_ACCESS_FS],
>> const access_mask_t access_request)
>> {
>
> Granted, I don't have as deep of an understanding of Landlock as you
> do, but the function name "is_eacces" seems a little odd given the
> nature of the function. Perhaps "is_fsrefer"?

Hmm, this helper does multiple things which are necessary to know if we
need to return -EACCES or -EXDEV. Renaming it to is_fsrefer() would
require inverting the logic and using boolean negations in the callers
(because of ordering). Renaming it to something like without_fs_refer()
would not be completely correct because we also check if there is no
layer_masks, which indicates that it doesn't contain an access right
that should return -EACCES. This helper is named as such because the
underlying semantic is to check for such an error code, which is tricky.
I can rename it to contains_eacces() or something, but a longer name
would require cutting the caller lines to fit 80 columns. :|


>
>> - layer_mask_t layer_masks[LANDLOCK_NUM_ACCESS_FS] = {};
>> - bool allowed = false, has_access = false;
>> + unsigned long access_bit;
>> + /* LANDLOCK_ACCESS_FS_REFER alone must return -EXDEV. */
>> + const unsigned long access_check = access_request &
>> + ~LANDLOCK_ACCESS_FS_REFER;
>> +
>> + if (!layer_masks)
>> + return false;
>> +
>> + for_each_set_bit(access_bit, &access_check, ARRAY_SIZE(*layer_masks)) {
>> + if ((*layer_masks)[access_bit])
>> + return true;
>> + }
>
> Is calling for_each_set_bit() overkill here? @access_check should
> only ever have at most one bit set (LANDLOCK_ACCESS_FS_REFER), yes?

No, it is the contrary: the bitmask is inverted and this loop checks for
non-FS_REFER access rights that should then return -EACCES. For
instance, if a sandbox handles (and then restricts) MAKE_REG and REFER,
a request to link a regular file would contain both of these bits, and
the kernel should return -EACCES if MAKE_REG is not granted or -EXDEV if
the request is only denied because of REFER. The reparent_* tests check
the consistency of this behavior (with the exception of a
RENAME_EXCHANGE case, see [1]).

[1] https://lore.kernel.org/r/[email protected]
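
To make the error-code selection concrete, here is a minimal user-space
sketch (not the kernel helper). The sketch_ names are hypothetical, the
layer masks are passed as a plain array for brevity, and the bit positions
mirror MAKE_REG and REFER from the UAPI header. A non-zero entry in the
array means that at least one layer still denies that access.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_NUM_ACCESS 14
#define SKETCH_MAKE_REG (1u << 8)
#define SKETCH_REFER (1u << 13)

typedef uint16_t access_mask_t;
typedef uint16_t layer_mask_t;

/* Returns true if a requested access other than REFER is still denied. */
static bool sketch_is_eacces(const layer_mask_t layer_masks[SKETCH_NUM_ACCESS],
			     access_mask_t access_request)
{
	/* REFER is masked out: denied alone, it must lead to -EXDEV. */
	const access_mask_t access_check =
		access_request & (access_mask_t)~SKETCH_REFER;

	if (!layer_masks)
		return false;

	for (size_t bit = 0; bit < SKETCH_NUM_ACCESS; bit++) {
		if ((access_check & (1u << bit)) && layer_masks[bit])
			return true;
	}
	return false;
}

/* Error code for an already denied link or rename request. */
static int sketch_deny_errno(const layer_mask_t layer_masks[SKETCH_NUM_ACCESS],
			     access_mask_t access_request)
{
	return sketch_is_eacces(layer_masks, access_request) ? -EACCES : -EXDEV;
}

/* Links a regular file while REFER is denied by one layer. */
static int sketch_demo_link(bool make_reg_denied)
{
	layer_mask_t layer_masks[SKETCH_NUM_ACCESS] = { 0 };

	layer_masks[13] = 1; /* REFER denied by layer 1 */
	if (make_reg_denied)
		layer_masks[8] = 1; /* MAKE_REG denied by layer 1 */
	return sketch_deny_errno(layer_masks, SKETCH_MAKE_REG | SKETCH_REFER);
}
```

A caller seeing -EXDEV may fall back to copying the file, which would be
pointless if the real reason was a missing MAKE_REG.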


>
>> + return false;
>> +}
>> +
>> +/**
>> + * check_access_path_dual - Check a source and a destination accesses
>> + *
>> + * @domain: Domain to check against.
>> + * @path: File hierarchy to walk through.
>> + * @child_is_directory: Must be set to true if the (original) leaf is a
>> + * directory, false otherwise.
>> + * @access_request_dst_parent: Accesses to check, once @layer_masks_dst_parent
>> + * is equal to @layer_masks_src_parent (if any).
>> + * @layer_masks_dst_parent: Pointer to a matrix of layer masks per access
>> + * masks, identifying the layers that forbid a specific access. Bits from
>> + * this matrix can be unset according to the @path walk. An empty matrix
>> + * means that @domain allows all possible Landlock accesses (i.e. not only
>> + * those identified by @access_request_dst_parent). This matrix can
>> + * initially refer to domain layer masks and, when the accesses for the
>> + * destination and source are the same, to request layer masks.
>> + * @access_request_src_parent: Similar to @access_request_dst_parent but for an
>> + * initial source path request. Only taken into account if
>> + * @layer_masks_src_parent is not NULL.
>> + * @layer_masks_src_parent: Similar to @layer_masks_dst_parent but for an
>> + * initial source path walk. This can be NULL if only dealing with a
>> + * destination access request (i.e. not a rename nor a link action).
>> + * @layer_masks_child: Similar to @layer_masks_src_parent but only for the
>> + * linked or renamed inode (without hierarchy). This is only used if
>> + * @layer_masks_src_parent is not NULL.
>> + *
>> + * This helper first checks that the destination has a superset of restrictions
>> + * compared to the source (if any) for a common path. It then checks that the
>> + * collected accesses and the remaining ones are enough to allow the request.
>> + *
>> + * Returns:
>> + * - 0 if the access request is granted;
>> + * - -EACCES if it is denied because of access right other than
>> + * LANDLOCK_ACCESS_FS_REFER;
>> + * - -EXDEV if the renaming or linking would be a privileged escalation
>> + * (according to each layered policies), or if LANDLOCK_ACCESS_FS_REFER is
>> + * not allowed by the source or the destination.
>> + */
>> +static int check_access_path_dual(const struct landlock_ruleset *const domain,
>> + const struct path *const path,
>> + bool child_is_directory,
>> + const access_mask_t access_request_dst_parent,
>> + layer_mask_t (*const
>> + layer_masks_dst_parent)[LANDLOCK_NUM_ACCESS_FS],
>> + const access_mask_t access_request_src_parent,
>> + layer_mask_t (*layer_masks_src_parent)[LANDLOCK_NUM_ACCESS_FS],
>> + layer_mask_t (*layer_masks_child)[LANDLOCK_NUM_ACCESS_FS])
>> +{
>> + bool allowed_dst_parent = false, allowed_src_parent = false, is_dom_check;
>> struct path walker_path;
>> - size_t i;
>> + access_mask_t access_masked_dst_parent, access_masked_src_parent;
>>
>> - if (!access_request)
>> + if (!access_request_dst_parent && !access_request_src_parent)
>> return 0;
>> if (WARN_ON_ONCE(!domain || !path))
>> return 0;
>> @@ -287,22 +460,20 @@ static int check_access_path(const struct landlock_ruleset *const domain,
>> if (WARN_ON_ONCE(domain->num_layers < 1))
>> return -EACCES;
>>
>> - /* Saves all layers handling a subset of requested accesses. */
>> - for (i = 0; i < domain->num_layers; i++) {
>> - const unsigned long access_req = access_request;
>> - unsigned long access_bit;
>> -
>> - for_each_set_bit(access_bit, &access_req,
>> - ARRAY_SIZE(layer_masks)) {
>> - if (domain->fs_access_masks[i] & BIT_ULL(access_bit)) {
>> - layer_masks[access_bit] |= BIT_ULL(i);
>> - has_access = true;
>> - }
>> - }
>> + BUILD_BUG_ON(!layer_masks_dst_parent);
>
> I know the kbuild robot already flagged this, but checking function
> parameters with BUILD_BUG_ON() does seem a bit ... unusual :)

Yeah, I like such a guarantee but it may not work without __always_inline.
I moved this check into the previous WARN_ON_ONCE().


>
>> + if (layer_masks_src_parent) {
>> + if (WARN_ON_ONCE(!layer_masks_child))
>> + return -EACCES;
>> + access_masked_dst_parent = access_masked_src_parent =
>> + get_handled_accesses(domain);
>> + is_dom_check = true;
>> + } else {
>> + if (WARN_ON_ONCE(layer_masks_child))
>> + return -EACCES;
>> + access_masked_dst_parent = access_request_dst_parent;
>> + access_masked_src_parent = access_request_src_parent;
>> + is_dom_check = false;
>> }
>> - /* An access request not handled by the domain is allowed. */
>> - if (!has_access)
>> - return 0;
>>
>> walker_path = *path;
>> path_get(&walker_path);
>> @@ -312,11 +483,50 @@ static int check_access_path(const struct landlock_ruleset *const domain,
>> */
>> while (true) {
>> struct dentry *parent_dentry;
>> + const struct landlock_rule *rule;
>> +
>> + /*
>> + * If at least all accesses allowed on the destination are
>> + * already allowed on the source, respectively if there is at
>> + * least as much as restrictions on the destination than on the
>> + * source, then we can safely refer files from the source to
>> + * the destination without risking a privilege escalation.
>> + * This is crucial for standalone multilayered security
>> + * policies. Furthermore, this helps avoid policy writers to
>> + * shoot themselves in the foot.
>> + */
>> + if (is_dom_check && is_superset(child_is_directory,
>> + layer_masks_dst_parent,
>> + layer_masks_src_parent,
>> + layer_masks_child)) {
>> + allowed_dst_parent =
>> + scope_to_request(access_request_dst_parent,
>> + layer_masks_dst_parent);
>> + allowed_src_parent =
>> + scope_to_request(access_request_src_parent,
>> + layer_masks_src_parent);
>> +
>> + /* Stops when all accesses are granted. */
>> + if (allowed_dst_parent && allowed_src_parent)
>> + break;
>> +
>> + /*
>> + * Downgrades checks from domain handled accesses to
>> + * requested accesses.
>> + */
>> + is_dom_check = false;
>> + access_masked_dst_parent = access_request_dst_parent;
>> + access_masked_src_parent = access_request_src_parent;
>> + }
>> +
>> + rule = find_rule(domain, walker_path.dentry);
>> + allowed_dst_parent = unmask_layers(rule, access_masked_dst_parent,
>> + layer_masks_dst_parent);
>> + allowed_src_parent = unmask_layers(rule, access_masked_src_parent,
>> + layer_masks_src_parent);
>>
>> - allowed = unmask_layers(find_rule(domain, walker_path.dentry),
>> - access_request, &layer_masks);
>> - if (allowed)
>> - /* Stops when a rule from each layer grants access. */
>> + /* Stops when a rule from each layer grants access. */
>> + if (allowed_dst_parent && allowed_src_parent)
>> break;
>
> If "(allowed_dst_parent && allowed_src_parent)" is true, you break out
> of the while loop only to do a path_put(), check the two booleans once
> more, and then return zero, yes? Why not just do the path_put() and
> return zero here?

Correct, that would work, but I prefer not to duplicate the logic of
granting access if it doesn't make the code more complex, which I think
is not the case here, and I'm reluctant to duplicate path_get/put()
calls. This loop break is a small optimization to avoid walking the path
one more step, and writing it this way looks cleaner and less
error-prone from my point of view.
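
For readers following the thread, the single-exit shape defended above can be sketched in user space. This is only an illustration under stated assumptions: `struct node`, `node_get()`, `node_put()` and `check_walk()` are hypothetical stand-ins for a dentry, path_get()/path_put() and the path walk, and -1 stands in for the kernel's error code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a dentry: a parent link, a flag, a refcount. */
struct node {
	struct node *parent;
	bool grants;	/* pretend a rule tied to this node grants the request */
	int refcount;
};

static void node_get(struct node *n) { n->refcount++; }
static void node_put(struct node *n) { n->refcount--; }

/* Returns 0 if the node or one of its ancestors grants access, -1 otherwise. */
static int check_walk(struct node *start)
{
	struct node *walker = start;
	bool allowed = false;

	assert(start != NULL);
	node_get(walker);
	while (true) {
		struct node *parent;

		allowed = walker->grants;
		if (allowed)
			break;	/* small optimization: skip the rest of the walk */

		parent = walker->parent;
		if (!parent)
			break;	/* reached the root without a grant */

		node_get(parent);
		node_put(walker);
		walker = parent;
	}
	node_put(walker);		/* the only release of the last reference */
	return allowed ? 0 : -1;	/* the only place the decision is made */
}
```

Both exits from the loop reach the same release and the same return statement, so adding a third exit later would not need another release call.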

2022-03-17 14:43:16

by Mickaël Salaün

[permalink] [raw]
Subject: Re: [PATCH v1 09/11] landlock: Document LANDLOCK_ACCESS_FS_REFER and ABI versioning


On 17/03/2022 02:27, Paul Moore wrote:
> On Mon, Feb 21, 2022 at 4:15 PM Mickaël Salaün <[email protected]> wrote:
>>
>> From: Mickaël Salaün <[email protected]>
>>
>> Add LANDLOCK_ACCESS_FS_REFER in the example and properly check to only
>> use it if the current kernel support it thanks to the Landlock ABI
>> version.
>>
>> Move the file renaming and linking limitation to a new "Previous
>> limitations" section.
>>
>> Improve documentation about the backward and forward compatibility,
>> including the rational for ruleset's handled_access_fs.
>>
>> Signed-off-by: Mickaël Salaün <[email protected]>
>> Link: https://lore.kernel.org/r/[email protected]
>> ---
>> Documentation/userspace-api/landlock.rst | 124 +++++++++++++++++++----
>> 1 file changed, 104 insertions(+), 20 deletions(-)
>
> Thanks for remembering to update the docs :) I made a few phrasing
> suggestions below, but otherwise it looks good to me.

Thanks Paul! I'll take them.


>
> Reviewed-by: Paul Moore <[email protected]>
>
>> diff --git a/Documentation/userspace-api/landlock.rst b/Documentation/userspace-api/landlock.rst
>> index f35552ff19ba..97db09d36a5c 100644
>> --- a/Documentation/userspace-api/landlock.rst
>> +++ b/Documentation/userspace-api/landlock.rst
>> @@ -281,6 +347,24 @@ Memory usage
>> Kernel memory allocated to create rulesets is accounted and can be restricted
>> by the Documentation/admin-guide/cgroup-v1/memory.rst.
>>
>> +Previous limitations
>> +====================
>> +
>> +File renaming and linking (ABI 1)
>> +---------------------------------
>> +
>> +Because Landlock targets unprivileged access controls, it is needed to properly
> ^^^^^
> "... controls, it needs to ..."
>
>> +handle composition of rules. Such property also implies rules nesting.
>> +Properly handling multiple layers of ruleset, each one of them able to restrict
> ^^^^^^^
> "rulesets,"
>
>> +access to files, also implies to inherit the ruleset restrictions from a parent
> ^^^^^^^^^^
> "... implies inheritance of the ..."
>
>> +to its hierarchy. Because files are identified and restricted by their
>> +hierarchy, moving or linking a file from one directory to another implies to
>> +propagate the hierarchy constraints.
>
> "... one directory to another implies propagation of the hierarchy constraints."
>
>> + To protect against privilege escalations
>
>> +through renaming or linking, and for the sake of simplicity, Landlock previously
>> +limited linking and renaming to the same directory. Starting with the Landlock
>> +ABI version 2, it is now possible to securely control renaming and linking
>> +thanks to the new `LANDLOCK_ACCESS_FS_REFER` access right.
>
> --
> paul-moore.com
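
As a rough illustration of the ABI versioning described in this patch, the sketch below shows the best-effort masking a program could apply once it knows the running kernel's Landlock ABI version (user space obtains it by calling landlock_create_ruleset() with the LANDLOCK_CREATE_RULESET_VERSION flag). The helper name `handled_access_for_abi()` and the shortened macro names are assumptions made for this example; only the bit positions mirror the uapi header.

```c
#include <assert.h>
#include <stdint.h>

/* Bit positions mirror include/uapi/linux/landlock.h; names are shortened. */
#define ACCESS_FS_MAKE_SYM	(1ULL << 12)	/* last right known to ABI 1 */
#define ACCESS_FS_REFER		(1ULL << 13)	/* introduced with ABI 2 */
#define ACCESS_FS_ALL_V2	((ACCESS_FS_REFER << 1) - 1)

/*
 * Hypothetical helper: returns the subset of @wanted that a kernel exposing
 * ABI version @abi can handle. A ruleset asking an ABI 1 kernel to handle
 * ACCESS_FS_REFER would be rejected, so the bit is dropped there.
 */
static uint64_t handled_access_for_abi(uint64_t wanted, int abi)
{
	if (abi < 1)
		return 0;	/* Landlock unavailable: nothing can be handled */
	if (abi < 2)
		wanted &= ~ACCESS_FS_REFER;
	return wanted;
}
```

With ABI 1 the program simply keeps the old behavior (no reparenting at all), which is the fallback the "Previous limitations" section describes.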

2022-03-17 21:56:50

by Paul Moore

[permalink] [raw]
Subject: Re: [PATCH v1 01/11] landlock: Define access_mask_t to enforce a consistent access mask size

On Thu, Mar 17, 2022 at 4:35 AM Mickaël Salaün <[email protected]> wrote:
> On 17/03/2022 02:26, Paul Moore wrote:
> > On Mon, Feb 21, 2022 at 4:15 PM Mickaël Salaün <[email protected]> wrote:
> >>
> >> From: Mickaël Salaün <[email protected]>
> >>
> >> Create and use the access_mask_t typedef to enforce a consistent access
> >> mask size and uniformly use a 16-bits type. This will helps transition
> >> to a 32-bits value one day.
> >>
> >> Add a build check to make sure all (filesystem) access rights fit in.
> >> This will be extended with a following commit.
> >>
> >> Signed-off-by: Mickaël Salaün <[email protected]>
> >> Link: https://lore.kernel.org/r/[email protected]
> >> ---
> >> security/landlock/fs.c | 19 ++++++++++---------
> >> security/landlock/fs.h | 2 +-
> >> security/landlock/limits.h | 2 ++
> >> security/landlock/ruleset.c | 6 ++++--
> >> security/landlock/ruleset.h | 17 +++++++++++++----
> >> 5 files changed, 30 insertions(+), 16 deletions(-)

...

> >> diff --git a/security/landlock/limits.h b/security/landlock/limits.h
> >> index 2a0a1095ee27..458d1de32ed5 100644
> >> --- a/security/landlock/limits.h
> >> +++ b/security/landlock/limits.h
> >> @@ -9,6 +9,7 @@
> >> #ifndef _SECURITY_LANDLOCK_LIMITS_H
> >> #define _SECURITY_LANDLOCK_LIMITS_H
> >>
> >> +#include <linux/bitops.h>
> >> #include <linux/limits.h>
> >> #include <uapi/linux/landlock.h>
> >>
> >> @@ -17,5 +18,6 @@
> >>
> >> #define LANDLOCK_LAST_ACCESS_FS LANDLOCK_ACCESS_FS_MAKE_SYM
> >> #define LANDLOCK_MASK_ACCESS_FS ((LANDLOCK_LAST_ACCESS_FS << 1) - 1)
> >> +#define LANDLOCK_NUM_ACCESS_FS __const_hweight64(LANDLOCK_MASK_ACCESS_FS)
> >
> > The line above, and the static_assert() in ruleset.h are clever. I'll
> > admit I didn't even know the hweightX() macros existed until looking
> > at this code :)
> >
> > However, the LANDLOCK_NUM_ACCESS_FS is never really going to be used
> > outside the static_assert() in ruleset.h is it? I wonder if it would
> > be better to skip the extra macro and rewrite the static_assert like
> > this:
> >
> > static_assert(BITS_PER_TYPE(access_mask_t) >=
> > __const_hweight64(LANDLOCK_MASK_ACCESS_FS));
> >
> > If not, I might suggest changing LANDLOCK_NUM_ACCESS_FS to
> > LANDLOCK_BITS_ACCESS_FS or something similar.
>
> I declared LANDLOCK_NUM_ACCESS_FS in this patch to be able to have the
> static_assert() here and ease the review, but LANDLOCK_NUM_ACCESS_FS is
> really used in patch 6/11 to define an array size:
> get_handled_acceses(), init_layer_masks(), is_superset(),
> check_access_path_dual()…

I wrote my comments as I was working my way through the patchset and
didn't think to go back and check this when I hit patch 6/11 :)

Looks good to me, sorry for the noise.

Reviewed-by: Paul Moore <[email protected]>

--
paul-moore.com
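
The build check discussed in this message can be approximated outside the kernel. The sketch below is a user-space imitation, not the kernel code: `HWEIGHT8()` through `HWEIGHT64()` are simplified stand-ins for the `__const_hweight*()` macros, and the shortened macro names and `num_access_fs()` are assumptions for the example.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

typedef uint16_t access_mask_t;

/* Constant-expression population count, in the spirit of __const_hweight64(). */
#define HWEIGHT8(w)						\
	((unsigned int)(!!((w) & (1ULL << 0)) +			\
			!!((w) & (1ULL << 1)) +			\
			!!((w) & (1ULL << 2)) +			\
			!!((w) & (1ULL << 3)) +			\
			!!((w) & (1ULL << 4)) +			\
			!!((w) & (1ULL << 5)) +			\
			!!((w) & (1ULL << 6)) +			\
			!!((w) & (1ULL << 7))))
#define HWEIGHT16(w) (HWEIGHT8(w) + HWEIGHT8((w) >> 8))
#define HWEIGHT32(w) (HWEIGHT16(w) + HWEIGHT16((w) >> 16))
#define HWEIGHT64(w) (HWEIGHT32(w) + HWEIGHT32((w) >> 32))

#define LAST_ACCESS_FS	(1ULL << 12)	/* value of LANDLOCK_ACCESS_FS_MAKE_SYM */
#define MASK_ACCESS_FS	((LAST_ACCESS_FS << 1) - 1)
#define NUM_ACCESS_FS	HWEIGHT64(MASK_ACCESS_FS)
#define BITS_PER_TYPE(t) (sizeof(t) * CHAR_BIT)

/* Fails the build if a new access right no longer fits in access_mask_t. */
static_assert(BITS_PER_TYPE(access_mask_t) >= (size_t)NUM_ACCESS_FS,
	      "access_mask_t is too small for all filesystem access rights");

/* The same constant can also size an array, as patch 6/11 is said to do. */
static unsigned int num_access_fs(void)
{
	return NUM_ACCESS_FS;
}
```

Because the count is a constant expression, it works both in the assertion and as an array bound, which is the reason given above for keeping the separate macro.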

2022-03-17 22:00:15

by Paul Moore

[permalink] [raw]
Subject: Re: [PATCH v1 04/11] landlock: Fix same-layer rule unions

On Thu, Mar 17, 2022 at 6:40 AM Mickaël Salaün <[email protected]> wrote:
> On 17/03/2022 02:26, Paul Moore wrote:
> > On Mon, Feb 21, 2022 at 4:15 PM Mickaël Salaün <[email protected]> wrote:
> >>
> >> From: Mickaël Salaün <[email protected]>
> >>
> >> The original behavior was to check if the full set of requested accesses
> >> was allowed by at least a rule of every relevant layer. This didn't
> >> take into account requests for multiple accesses and same-layer rules
> >> allowing the union of these accesses in a complementary way. As a
> >> result, multiple accesses requested on a file hierarchy matching rules
> >> that, together, allowed these accesses, but without a unique rule
> >> allowing all of them, was illegitimately denied. This case should be
> >> rare in practice and it can only be triggered by the path_rename or
> >> file_open hook implementations.
> >>
> >> For instance, if, for the same layer, a rule allows execution
> >> beneath /a/b and another rule allows read beneath /a, requesting access
> >> to read and execute at the same time for /a/b should be allowed for this
> >> layer.
> >>
> >> This was an inconsistency because the union of same-layer rule accesses
> >> was already allowed if requested once at a time anyway.
> >>
> >> This fix changes the way allowed accesses are gathered over a path walk.
> >> To take into account all these rule accesses, we store in a matrix all
> >> layer granting the set of requested accesses, according to the handled
> >> accesses. To avoid heap allocation, we use an array on the stack which
> >> is 2*13 bytes. A following commit bringing the LANDLOCK_ACCESS_FS_REFER
> >> access right will increase this size to reach 84 bytes (2*14*3) in case
> >> of link or rename actions.
> >>
> >> Add a new layout1.layer_rule_unions test to check that accesses from
> >> different rules pertaining to the same layer are ORed in a file
> >> hierarchy. Also test that it is not the case for rules from different
> >> layers.
> >>
> >> Signed-off-by: Mickaël Salaün <[email protected]>
> >> Link: https://lore.kernel.org/r/[email protected]
> >> ---
> >> security/landlock/fs.c | 77 ++++++++++-----
> >> security/landlock/ruleset.h | 2 +
> >> tools/testing/selftests/landlock/fs_test.c | 107 +++++++++++++++++++++
> >> 3 files changed, 160 insertions(+), 26 deletions(-)
> >>
> >> diff --git a/security/landlock/fs.c b/security/landlock/fs.c
> >> index 0bcb27f2360a..9662f9fb3cd0 100644
> >> --- a/security/landlock/fs.c
> >> +++ b/security/landlock/fs.c
> >> @@ -204,45 +204,66 @@ static inline const struct landlock_rule *find_rule(
> >> return rule;
> >> }
> >>
> >> -static inline layer_mask_t unmask_layers(
> >> - const struct landlock_rule *const rule,
> >> - const access_mask_t access_request, layer_mask_t layer_mask)
> >> +/*
> >> + * @layer_masks is read and may be updated according to the access request and
> >> + * the matching rule.
> >> + *
> >> + * Returns true if the request is allowed (i.e. relevant layer masks for the
> >> + * request are empty).
> >> + */
> >> +static inline bool unmask_layers(const struct landlock_rule *const rule,
> >> + const access_mask_t access_request,
> >> + layer_mask_t (*const layer_masks)[LANDLOCK_NUM_ACCESS_FS])
> >> {
> >> size_t layer_level;
> >>
> >> + if (!access_request || !layer_masks)
> >> + return true;
> >> if (!rule)
> >> - return layer_mask;
> >> + return false;
> >>
> >> /*
> >> * An access is granted if, for each policy layer, at least one rule
> >> - * encountered on the pathwalk grants the requested accesses,
> >> - * regardless of their position in the layer stack. We must then check
> >> + * encountered on the pathwalk grants the requested access,
> >> + * regardless of its position in the layer stack. We must then check
> >> * the remaining layers for each inode, from the first added layer to
> >> - * the last one.
> >> + * the last one. When there is multiple requested accesses, for each
> >> + * policy layer, the full set of requested accesses may not be granted
> >> + * by only one rule, but by the union (binary OR) of multiple rules.
> >> + * E.g. /a/b <execute> + /a <read> = /a/b <execute + read>
> >> */
> >> for (layer_level = 0; layer_level < rule->num_layers; layer_level++) {
> >> const struct landlock_layer *const layer =
> >> &rule->layers[layer_level];
> >> const layer_mask_t layer_bit = BIT_ULL(layer->level - 1);
> >> + const unsigned long access_req = access_request;
> >> + unsigned long access_bit;
> >> + bool is_empty;
> >>
> >> - /* Checks that the layer grants access to the full request. */
> >> - if ((layer->access & access_request) == access_request) {
> >> - layer_mask &= ~layer_bit;
> >> -
> >> - if (layer_mask == 0)
> >> - return layer_mask;
> >> + /*
> >> + * Records in @layer_masks which layer grants access to each
> >> + * requested access.
> >> + */
> >> + is_empty = true;
> >> + for_each_set_bit(access_bit, &access_req,
> >> + ARRAY_SIZE(*layer_masks)) {
> >> + if (layer->access & BIT_ULL(access_bit))
> >> + (*layer_masks)[access_bit] &= ~layer_bit;
> >> + is_empty = is_empty && !(*layer_masks)[access_bit];
> >
> > > From what I can see the only reason not to return immediately once
> > @is_empty is true is the need to update @layer_masks. However, the
> > only caller that I can see (up to patch 4/11) is check_access_path()
> > which thanks to this patch no longer needs to reference @layer_masks
> > after the call to unmask_layers() returns true. Assuming that to be
> > the case, is there a reason we can't return immediately after finding
> > @is_empty true, or am I missing something?
>
> Because @is_empty is initialized to true, and because each access
> right/bit must be checked by this loop, we cannot return earlier than
> the following if statement. Not returning in this loop also makes this
> helper safer (for potential future use) because @layer_mask will never
> be partially updated, which could lead to an inconsistent state.
> Moreover finishing this bits check loop makes the code simpler and have
> a negligible performance impact.

My apologies, I must have spaced-out a bit and read the 'is_empty =
true;' initializer as 'is_empty = false;'.

Reviewed-by: Paul Moore <[email protected]>

--
paul-moore.com
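
The same-layer union from the commit message (execute beneath /a/b plus read beneath /a) can be traced with a small user-space model of the loop under discussion. This is a sketch that assumes a toy set of three access bits and at most two layers per rule; `toy_layer`, `toy_rule` and `toy_unmask_layers()` are hypothetical names, not the kernel structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define NUM_ACCESS	3		/* toy set: bit 0 execute, bit 2 read */
#define ACCESS_EXECUTE	(1u << 0)
#define ACCESS_READ	(1u << 2)

typedef uint16_t access_mask_t;
typedef uint16_t layer_mask_t;

struct toy_layer {
	unsigned int level;	/* 1-based, like layer->level in the patch */
	access_mask_t access;
};

struct toy_rule {
	size_t num_layers;
	struct toy_layer layers[2];
};

/*
 * (*layer_masks)[bit] holds the layers that have not yet granted access
 * 'bit'. Returns true once no requested access is still denied by a layer.
 */
static bool toy_unmask_layers(const struct toy_rule *const rule,
			      const access_mask_t access_request,
			      layer_mask_t (*const layer_masks)[NUM_ACCESS])
{
	size_t i, bit;

	if (!access_request || !layer_masks)
		return true;
	if (!rule)
		return false;

	for (i = 0; i < rule->num_layers; i++) {
		const struct toy_layer *const layer = &rule->layers[i];
		const layer_mask_t layer_bit =
			(layer_mask_t)(1u << (layer->level - 1));
		bool is_empty = true;

		/* Every requested bit is visited, so the matrix is never left half updated. */
		for (bit = 0; bit < NUM_ACCESS; bit++) {
			if (!(access_request & (1u << bit)))
				continue;
			if (layer->access & (1u << bit))
				(*layer_masks)[bit] &= (layer_mask_t)~layer_bit;
			is_empty = is_empty && !(*layer_masks)[bit];
		}
		if (is_empty)
			return true;
	}
	return false;
}
```

Calling it first with the /a/b rule and then with the /a rule on the same matrix clears the execute column, then the read column, and only the second call returns true.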

2022-03-17 22:01:01

by Paul Moore

[permalink] [raw]
Subject: Re: [PATCH v1 06/11] landlock: Add support for file reparenting with LANDLOCK_ACCESS_FS_REFER

On Thu, Mar 17, 2022 at 8:03 AM Mickaël Salaün <[email protected]> wrote:
> On 17/03/2022 02:26, Paul Moore wrote:
> > On Mon, Feb 21, 2022 at 4:15 PM Mickaël Salaün <[email protected]> wrote:
> >>
> >> From: Mickaël Salaün <[email protected]>
> >>
> >> Add a new LANDLOCK_ACCESS_FS_REFER access right to enable policy writers
> >> to allow sandboxed processes to link and rename files from and to a
> >> specific set of file hierarchies. This access right should be composed
> >> with LANDLOCK_ACCESS_FS_MAKE_* for the destination of a link or rename,
> >> and with LANDLOCK_ACCESS_FS_REMOVE_* for a source of a rename. This
> >> lift a Landlock limitation that always denied changing the parent of an
> >> inode.
> >>
> >> Renaming or linking to the same directory is still always allowed,
> >> whatever LANDLOCK_ACCESS_FS_REFER is used or not, because it is not
> >> considered a threat to user data.
> >>
> >> However, creating multiple links or renaming to a different parent
> >> directory may lead to privilege escalations if not handled properly.
> >> Indeed, we must be sure that the source doesn't gain more privileges by
> >> being accessible from the destination. This is handled by making sure
> >> that the source hierarchy (including the referenced file or directory
> >> itself) restricts at least as much the destination hierarchy. If it is
> >> not the case, an EXDEV error is returned, making it potentially possible
> >> for user space to copy the file hierarchy instead of moving or linking
> >> it.
> >>
> >> Instead of creating different access rights for the source and the
> >> destination, we choose to make it simple and consistent for users.
> >> Indeed, considering the previous constraint, it would be weird to
> >> require such destination access right to be also granted to the source
> >> (to make it a superset).
> >>
> >> See the provided documentation for additional details.
> >>
> >> New tests are provided with a following commit.
> >>
> >> Signed-off-by: Mickaël Salaün <[email protected]>
> >> Link: https://lore.kernel.org/r/[email protected]
> >> ---
> >> include/uapi/linux/landlock.h | 27 +-
> >> security/landlock/fs.c | 550 ++++++++++++++++---
> >> security/landlock/limits.h | 2 +-
> >> security/landlock/syscalls.c | 2 +-
> >> tools/testing/selftests/landlock/base_test.c | 2 +-
> >> tools/testing/selftests/landlock/fs_test.c | 3 +-
> >> 6 files changed, 516 insertions(+), 70 deletions(-)

...

> >> +/*
> >> + * Returns true if there is at least one access right different than
> >> + * LANDLOCK_ACCESS_FS_REFER.
> >> + */
> >> +static inline bool is_eacces(
> >> + const layer_mask_t (*const
> >> + layer_masks)[LANDLOCK_NUM_ACCESS_FS],
> >> const access_mask_t access_request)
> >> {
> >
> > Granted, I don't have as deep of an understanding of Landlock as you
> > do, but the function name "is_eacces" seems a little odd given the
> > nature of the function. Perhaps "is_fsrefer"?
>
> Hmm, this helper does multiple things which are necessary to know if we
> need to return -EACCES or -EXDEV. Renaming it to is_fsrefer() would
> require to inverse the logic and use boolean negations in the callers
> (because of ordering). Renaming to something like without_fs_refer()
> would not be completely correct because we also check if there is no
> layer_masks, which indicated that it doesn't contain an access right
> that should return -EACCES. This helper is named as such because the
> underlying semantic is to check for such error code, which is tricky.
> I can rename it to contains_eacces() or something, but a longer name
> would require to cut the caller lines to fit 80 columns. :|

You know the Landlock code better than I do, if you like
"is_eacces()", then leave it as it is.

> >> - layer_mask_t layer_masks[LANDLOCK_NUM_ACCESS_FS] = {};
> >> - bool allowed = false, has_access = false;
> >> + unsigned long access_bit;
> >> + /* LANDLOCK_ACCESS_FS_REFER alone must return -EXDEV. */
> >> + const unsigned long access_check = access_request &
> >> + ~LANDLOCK_ACCESS_FS_REFER;
> >> +
> >> + if (!layer_masks)
> >> + return false;
> >> +
> >> + for_each_set_bit(access_bit, &access_check, ARRAY_SIZE(*layer_masks)) {
> >> + if ((*layer_masks)[access_bit])
> >> + return true;
> >> + }
> >
> > Is calling for_each_set_bit() overkill here? @access_check should
> > only ever have at most one bit set (LANDLOCK_ACCESS_FS_REFER), yes?
>
> No, it is the contrary ...

Gotcha. Thanks for the clarification, I must have missed that when I
was looking at it last night.
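
To make the point concrete: @access_check holds every requested bit except LANDLOCK_ACCESS_FS_REFER, so the loop may visit several bits. A user-space sketch of that logic follows, assuming 14 access rights and the hypothetical names `toy_is_eacces()`, `ACCESS_FS_MAKE_REG` and `ACCESS_FS_REFER` (bit positions as in the uapi header):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define NUM_ACCESS_FS		14
#define ACCESS_FS_MAKE_REG	(1UL << 8)
#define ACCESS_FS_REFER		(1UL << 13)

typedef uint16_t access_mask_t;
typedef uint16_t layer_mask_t;

/*
 * Returns true if a requested access other than REFER is still denied by
 * some layer, i.e. the caller should report EACCES rather than EXDEV.
 */
static bool toy_is_eacces(const layer_mask_t (*const layer_masks)[NUM_ACCESS_FS],
			  const access_mask_t access_request)
{
	/* Everything that was requested, minus the REFER bit. */
	const unsigned long access_check = access_request & ~ACCESS_FS_REFER;
	size_t access_bit;

	if (!layer_masks)
		return false;

	for (access_bit = 0; access_bit < NUM_ACCESS_FS; access_bit++) {
		if ((access_check & (1UL << access_bit)) &&
		    (*layer_masks)[access_bit])
			return true;
	}
	return false;
}
```

A request for REFER plus MAKE_REG where only REFER is still denied yields false (the EXDEV case), while a denied MAKE_REG yields true.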

> >> @@ -287,22 +460,20 @@ static int check_access_path(const struct landlock_ruleset *const domain,
> >> if (WARN_ON_ONCE(domain->num_layers < 1))
> >> return -EACCES;
> >>
> >> - /* Saves all layers handling a subset of requested accesses. */
> >> - for (i = 0; i < domain->num_layers; i++) {
> >> - const unsigned long access_req = access_request;
> >> - unsigned long access_bit;
> >> -
> >> - for_each_set_bit(access_bit, &access_req,
> >> - ARRAY_SIZE(layer_masks)) {
> >> - if (domain->fs_access_masks[i] & BIT_ULL(access_bit)) {
> >> - layer_masks[access_bit] |= BIT_ULL(i);
> >> - has_access = true;
> >> - }
> >> - }
> >> + BUILD_BUG_ON(!layer_masks_dst_parent);
> >
> > I know the kbuild robot already flagged this, but checking function
> > parameters with BUILD_BUG_ON() does seem a bit ... unusual :)
>
> Yeah, I like such guarantee but it may not work without __always_inline.
> I moved this check in the previous WARN_ON_ONCE().

That sounds good to me.

> >> @@ -312,11 +483,50 @@ static int check_access_path(const struct landlock_ruleset *const domain,
> >> */
> >> while (true) {
> >> struct dentry *parent_dentry;
> >> + const struct landlock_rule *rule;
> >> +
> >> + /*
> >> + * If at least all accesses allowed on the destination are
> >> + * already allowed on the source, respectively if there is at
> >> + * least as much as restrictions on the destination than on the
> >> + * source, then we can safely refer files from the source to
> >> + * the destination without risking a privilege escalation.
> >> + * This is crucial for standalone multilayered security
> >> + * policies. Furthermore, this helps avoid policy writers to
> >> + * shoot themselves in the foot.
> >> + */
> >> + if (is_dom_check && is_superset(child_is_directory,
> >> + layer_masks_dst_parent,
> >> + layer_masks_src_parent,
> >> + layer_masks_child)) {
> >> + allowed_dst_parent =
> >> + scope_to_request(access_request_dst_parent,
> >> + layer_masks_dst_parent);
> >> + allowed_src_parent =
> >> + scope_to_request(access_request_src_parent,
> >> + layer_masks_src_parent);
> >> +
> >> + /* Stops when all accesses are granted. */
> >> + if (allowed_dst_parent && allowed_src_parent)
> >> + break;
> >> +
> >> + /*
> >> + * Downgrades checks from domain handled accesses to
> >> + * requested accesses.
> >> + */
> >> + is_dom_check = false;
> >> + access_masked_dst_parent = access_request_dst_parent;
> >> + access_masked_src_parent = access_request_src_parent;
> >> + }
> >> +
> >> + rule = find_rule(domain, walker_path.dentry);
> >> + allowed_dst_parent = unmask_layers(rule, access_masked_dst_parent,
> >> + layer_masks_dst_parent);
> >> + allowed_src_parent = unmask_layers(rule, access_masked_src_parent,
> >> + layer_masks_src_parent);
> >>
> >> - allowed = unmask_layers(find_rule(domain, walker_path.dentry),
> >> - access_request, &layer_masks);
> >> - if (allowed)
> >> - /* Stops when a rule from each layer grants access. */
> >> + /* Stops when a rule from each layer grants access. */
> >> + if (allowed_dst_parent && allowed_src_parent)
> >> break;
> >
> > If "(allowed_dst_parent && allowed_src_parent)" is true, you break out
> > of the while loop only to do a path_put(), check the two booleans once
> > more, and then return zero, yes? Why not just do the path_put() and
> > return zero here?
>
> Correct, that would work, but I prefer not to duplicate the logic of
> granting access if it doesn't make the code more complex, which I think
> is not the case here, and I'm reluctant to duplicate path_get/put()
> calls. This loop break is a small optimization to avoid walking the path
> one more step, and writing it this way looks cleaner and less
> error-prone from my point of view.

I'm a big fan of maintainable code, and since you are the maintainer,
if you prefer this approach I say stick with what you have :)

--
paul-moore.com

2022-03-25 15:10:51

by Mickaël Salaün

[permalink] [raw]
Subject: Re: [PATCH v1 06/11] landlock: Add support for file reparenting with LANDLOCK_ACCESS_FS_REFER


On 17/03/2022 13:04, Mickaël Salaün wrote:
>
> On 17/03/2022 02:26, Paul Moore wrote:

[...]

>>> @@ -269,16 +270,188 @@ static inline bool is_nouser_or_private(const
>>> struct dentry *dentry)
>>>
>>> unlikely(IS_PRIVATE(d_backing_inode(dentry))));
>>>   }
>>>
>>> -static int check_access_path(const struct landlock_ruleset *const
>>> domain,
>>> -               const struct path *const path,
>>> +static inline access_mask_t get_handled_accesses(
>>> +               const struct landlock_ruleset *const domain)
>>> +{
>>> +       access_mask_t access_dom = 0;
>>> +       unsigned long access_bit;
>>
>> Would it be better to declare @access_bit as an access_mask_t type?
>> You're not using any macros like for_each_set_bit() in this function
>> so I believe it should be safe.
>
> Right, I'll change that.

Well, thinking about it again, access_bit is not an access mask but an
index into such a mask. access_mask_t gives enough space for such an index
but it definitely has the wrong semantics. The best type would be size_t,
but I prefer to stick to unsigned long (used for size_t anyway) for
consistency with the other access_bit variable types. There is no need
to use for_each_set_bit() here now but that could change, and I prefer
to do my best to prevent future issues. ;)
Anyway, I guess the compiler can optimize such code.
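
The index-versus-mask distinction can be shown with a small user-space sketch. The body of get_handled_accesses() is elided in the quote above, so the helper below is only a guess at the general idea (collect every access handled by at least one layer); `toy_domain` and `toy_handled_accesses()` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NUM_ACCESS_FS	14
#define MAX_LAYERS	4	/* arbitrary bound for this toy */

typedef uint16_t access_mask_t;

struct toy_domain {
	size_t num_layers;
	access_mask_t fs_access_masks[MAX_LAYERS];
};

static access_mask_t toy_handled_accesses(const struct toy_domain *const domain)
{
	access_mask_t access_dom = 0;	/* a mask: one bit per access right */
	unsigned long access_bit;	/* an index: which bit of the mask */

	for (access_bit = 0; access_bit < NUM_ACCESS_FS; access_bit++) {
		size_t layer;

		for (layer = 0; layer < domain->num_layers; layer++) {
			if (domain->fs_access_masks[layer] &
			    (1UL << access_bit)) {
				access_dom |= (access_mask_t)(1UL << access_bit);
				break;
			}
		}
	}
	return access_dom;
}
```

Here access_bit only ever takes values from 0 to 13, whereas access_dom accumulates bit patterns, which is why the two variables get different types.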