Hi,
This fourth RFC brings some improvements over the previous one [1]. An important
new point is the abstraction from the raw types of LSM hook arguments. It is
now possible to call a Landlock function the same way for LSM hooks with
different internal argument types. Some parts of the code are revamped with RCU
to properly deal with concurrency. From a userland point of view, the only
remaining link with seccomp-bpf is the ability to use the seccomp(2) syscall to
load and enforce a Landlock rule. Seccomp filters cannot trigger Landlock rules
anymore. For now, it is no longer possible for an unprivileged user to enforce a
Landlock rule on a cgroup through delegation.
As suggested, I plan to write documentation for userland and kernel developers
with some kind of guiding principles. A remaining question is how to enforce
limitations on rule creation.
# Landlock LSM
The goal of this new stackable Linux Security Module (LSM) called Landlock is
to allow any process, including unprivileged ones, to create powerful security
sandboxes comparable to the Seatbelt/XNU Sandbox or the OpenBSD Pledge. This
kind of sandbox is expected to help mitigate the security impact of bugs or
unexpected/malicious behaviors in userland applications.
eBPF programs are used to create a security rule. They are very limited (i.e.
can only call a whitelist of functions) and cannot cause a denial of service
(i.e. no loops). A new dedicated eBPF map makes it possible to collect Landlock
handles and compare them with system resources (e.g. files or network
connections).
The approach taken is to add the minimum amount of code while still allowing
userland to create quite complex access rules. A dedicated security policy
language such as the ones used by SELinux, AppArmor and other major LSMs
involves a lot of code and is usually dedicated to a trusted user (i.e. root).
# eBPF
To get an expressive language while still being safe and small, Landlock is
based on eBPF. Landlock should be usable by untrusted processes and must
therefore expose a minimal attack surface. The eBPF bytecode is minimal yet
powerful, widely used and designed to be used by not fully trusted
applications. Reusing this code avoids reproducing the same mistakes and
minimizes new code while still taking a generic approach. Only a few additional
features are added, such as a new kind of arraymap and some dedicated eBPF
functions.
An eBPF program has access to an eBPF context which contains the LSM hook
arguments (as seccomp-bpf does with syscall arguments). They can be used
directly or passed to helper functions according to their types. It is then
possible to do complex access checks without race conditions or inconsistent
evaluation (i.e. incorrect mirroring of the OS code and state [2]).
There is one eBPF program subtype per LSM hook. This makes it possible to
statically check which context accesses are performed by an eBPF program. This
is needed to prevent kernel address leaks and to ensure the right use of LSM
hook arguments with eBPF functions. Moreover, this safe pointer handling
removes the need for runtime checks or abstract data, which improves
performance. Any user can add multiple Landlock eBPF programs per LSM hook.
They are stacked and evaluated one after the other (cf. seccomp-bpf).
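As a rough illustration of the static check described above, the following userspace sketch (not code from this series) decides once, at load time, whether a program of a given subtype may read a context slot as a given type. The six-slot argument array mirrors the `__u64 args[6]` taken by landlock_enforce() in this series; the hook identifiers, the `arg_type` enum, the per-hook table and both function names are hypothetical and only serve the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define CTX_NARGS 6 /* landlock_enforce() takes __u64 args[6] */

/* Hypothetical abstract argument types, for illustration only. */
enum arg_type { ARG_NONE = 0, ARG_FS, ARG_INT };

/* Hypothetical hook identifiers: one program subtype per hook. */
enum hook_id { HOOK_FILE_OPEN = 0, HOOK_FILE_PERMISSION, HOOK_COUNT };

/* Simplified model of the context handed to a program. */
struct hook_ctx {
	uint64_t args[CTX_NARGS];
};

/* What each context slot holds, per hook (made-up layout). */
static const enum arg_type hook_args[HOOK_COUNT][CTX_NARGS] = {
	[HOOK_FILE_OPEN]       = { ARG_FS },
	[HOOK_FILE_PERMISSION] = { ARG_FS, ARG_INT },
};

/*
 * Load-time check: may a program of subtype @hook read slot @idx as a
 * value of type @want?  The answer depends only on the subtype, so it
 * can be computed before the program ever runs.
 */
bool ctx_access_is_valid(enum hook_id hook, size_t idx, enum arg_type want)
{
	if ((unsigned int)hook >= HOOK_COUNT || idx >= CTX_NARGS ||
	    want == ARG_NONE)
		return false;
	return hook_args[hook][idx] == want;
}

/*
 * Run-time read: once the access was validated at load time, no type
 * check is needed here.  The assert only documents the invariant.
 */
uint64_t ctx_read(const struct hook_ctx *ctx, size_t idx)
{
	assert(ctx != NULL && idx < CTX_NARGS);
	return ctx->args[idx];
}
```

The point of the split is that all type decisions live in ctx_access_is_valid(), so the read path stays a plain array access.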
# LSM hooks
Unlike syscalls, LSM hooks are security checkpoints and are not architecture
dependent. They are designed to match a security need associated with a
security policy (e.g. access to a file). Exposing parts of some LSM hooks
instead of using the syscall API for sandboxing should help to avoid bugs and
hacks as encountered by the first RFC. Instead of redoing the work of the LSM
hooks through syscalls, we should use and expose them as the policies of
access control LSMs do.
Only a subset of the hooks is meaningful for an unprivileged sandbox mechanism
(e.g. file system or network access control). Landlock uses an abstraction of
raw LSM hooks, which makes it possible to deal with future changes of the LSM
hook API. Moreover, thanks to the eBPF program typing (per LSM hook) used by
Landlock, it should not be hard to make such evolutions backward compatible.
# Use case scenario
First, a process needs to create a new dedicated eBPF map containing handles.
These handles are references to system resources (e.g. file or directory) and
are grouped in one or multiple maps to be efficiently managed and checked in
batches. This kind of map can be passed to Landlock eBPF functions to compare,
for example, with a file access request. The handles are only accessible from
the eBPF programs created by the same thread.
The loaded Landlock eBPF programs can then be enforced on the current process
with the seccomp(2) syscall. As with a seccomp filter, the resulting rule is
inherited by the children of the process. Seccomp filters can no longer
trigger Landlock programs or pass them a cookie.
Another way to enforce a Landlock security policy is to attach Landlock
programs to a dedicated cgroup. All the processes in this cgroup will then be
subject to this policy. For now, this is not available to unprivileged
processes because cgroup delegation handling has been removed in this version.
A triggered Landlock eBPF program can allow or deny an access, according to
its subtype (i.e. LSM hook), thanks to errno return values.
# Sandbox example with process hierarchy sandboxing (seccomp)
$ ls /home
user1
$ LANDLOCK_ALLOWED='/bin:/lib:/usr:/tmp:/proc/self/fd/0' \
./samples/landlock/sandbox /bin/sh -i
Launching a new sandboxed process.
$ ls /home
ls: cannot access '/home': No such file or directory
# Sandbox example with conditional access control depending on a cgroup
$ mkdir /sys/fs/cgroup/sandboxed
$ ls /home
user1
$ LANDLOCK_CGROUPS='/sys/fs/cgroup/sandboxed' \
LANDLOCK_ALLOWED='/bin:/lib:/usr:/tmp:/proc/self/fd/0' \
./samples/landlock/sandbox
Ready to sandbox with cgroups.
$ ls /home
user1
$ echo $$ > /sys/fs/cgroup/sandboxed/cgroup.procs
$ ls /home
ls: cannot access '/home': No such file or directory
# Current limitations and possible improvements
For now, eBPF programs can only return an errno code. It may be interesting to
be able to do other actions like seccomp-bpf does (e.g. kill process). Such
features can easily be implemented but the main advantage of the current
approach is to be able to only execute eBPF programs until one returns an errno
code instead of executing all programs like seccomp-bpf does.
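The evaluation strategy described above can be modeled with a few lines of plain C. This is a userspace sketch, not the kernel code: `rule_fn`, `run_rules()` and the two sample rules are hypothetical names. It assumes, as landlock_enforce() does in this series, that a program returns 0 to allow or a positive errno value to deny, and that the hook returns the negated value.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Stand-in for a loaded program: 0 allows, a positive errno denies. */
typedef int (*rule_fn)(const void *ctx);

/* Two trivial sample rules used to exercise the loop. */
int rule_allow(const void *ctx)
{
	(void)ctx;
	return 0;
}

int rule_deny(const void *ctx)
{
	(void)ctx;
	return EACCES;
}

/*
 * Run the stacked rules in order and stop at the first one that
 * returns an errno code.  If @ran is not NULL, it receives the number
 * of rules that were actually executed.  Returns 0 if every rule
 * allowed the access, or the negated errno of the first denial.
 */
int run_rules(const rule_fn *rules, size_t count, const void *ctx,
	      size_t *ran)
{
	size_t i;

	assert(rules != NULL || count == 0);
	for (i = 0; i < count; i++) {
		int ret = rules[i](ctx);

		if (ret != 0) {
			if (ran != NULL)
				*ran = i + 1;
			return -ret;
		}
	}
	if (ran != NULL)
		*ran = count;
	return 0;
}
```

With the rules { allow, deny, allow }, only the first two run: the third is never evaluated, which is the saving compared to running every filter.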
It is quite easy to add new eBPF functions to extend Landlock. The main concern
should be the possibility of leaking information from the current process to
another one (e.g. through maps), so as not to reproduce the same
security-sensitive behavior as ptrace.
This design does not seem too intrusive but is flexible enough to allow a
powerful sandbox mechanism accessible by any process on Linux. seccomp and
Landlock are easier to use with the help of a userland library (e.g.
libseccomp) that could provide a high-level language to express a security
policy instead of raw eBPF programs. Moreover, thanks to LLVM, it is
possible to express an eBPF program with a subset of C.
# FAQ
## Why is seccomp-bpf not enough?
A seccomp filter can only access raw syscall arguments, which means that it is
not possible to filter according to pointed-to data such as a file path. As the
first version of this patch series demonstrated, filtering at the syscall level
is complicated (e.g. need to take care of race conditions). This is mainly
because the access control checkpoints of the kernel are not at this high level
but deeper, at the LSM hook level. The LSM hooks are designed to handle this
kind of check. This series uses this approach to leverage the ability of
unprivileged users to limit themselves.
Cf. "What it isn't?" in Documentation/prctl/seccomp_filter.txt
## Why use the seccomp(2) syscall?
Landlock uses the same semantics as seccomp to apply access rule restrictions.
It adds a new layer of security for the current process which is inherited by
its children. It makes sense to use a unique access-restricting syscall (that
should be allowed by seccomp-bpf rules) which can only drop privileges. Moreover, a
Landlock eBPF program could come from outside a process (e.g. passed through a
UNIX socket). It is then useful to differentiate the creation/load of Landlock
eBPF programs via bpf(2), from rule enforcing via seccomp(2).
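The enforcing half of this split could look as follows from userland. This is a sketch under assumptions, not a reference implementation: it only makes sense on a kernel carrying this series, where SECCOMP_ADD_LANDLOCK_RULE is operation 2, the flags must be 0 and the argument is a pointer to the file descriptor of a program already loaded through bpf(2). On other kernels, seccomp operation 2 may mean something else. The function name `enforce_landlock_rule()` is hypothetical.

```c
#include <errno.h>
#include <sys/prctl.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Value added to include/uapi/linux/seccomp.h by this series. */
#ifndef SECCOMP_ADD_LANDLOCK_RULE
#define SECCOMP_ADD_LANDLOCK_RULE 2
#endif

/*
 * Enforce an already loaded Landlock program on the current process.
 * @bpf_fd is assumed to come from a previous bpf(2) program load (or
 * from another process, e.g. over a UNIX socket).  Returns 0 on
 * success or a negative errno value.
 */
int enforce_landlock_rule(int bpf_fd)
{
	if (bpf_fd < 0)
		return -EBADF;
	/* An unprivileged caller must first give up new privileges. */
	if (prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0))
		return -errno;
	/* flags must be 0; the kernel reads the fd through the pointer. */
	if (syscall(__NR_seccomp, SECCOMP_ADD_LANDLOCK_RULE, 0, &bpf_fd))
		return -errno;
	return 0;
}
```

Keeping the load step (bpf(2)) out of this function reflects the separation argued for above: the code that builds a program and the code that drops privileges with it do not have to live in the same process.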
## Why use cgroups?
cgroups are designed to handle groups of processes. One use case is to manage
containers. Sandboxing based on process hierarchy (seccomp) is designed to
handle immutable security policies, which is a good security property but does
not match all use cases. A user can attach Landlock rules to a cgroup. Doing so,
all the processes in that cgroup will be subject to the security policy.
However, if the user is allowed to manage this cgroup, they could dynamically
move this group of processes to a cgroup with another security policy (or
none). Landlock rules can be applied either on a process hierarchy (e.g.
application with built-in sandboxing) or a group of processes (e.g. container
sandboxing). Both approaches can be combined for the same process.
## Can Landlock limit network access or other resources?
Limiting network access is obviously in the scope of Landlock but it is not yet
implemented. The main goal now is to get feedback about the whole concept, the
API and the file access control part. More access control types could be
implemented in the future.
Sargun Dhillon sent an RFC (Checmate) [4] to deal with network manipulation.
This could be implemented on top of the Landlock framework.
## Why a new LSM? Are SELinux, AppArmor, Smack or Tomoyo not good enough?
The current access control LSMs are fine for their purpose which is to give
*root* the ability to enforce a security policy for the *system*. What is
missing is a way to enforce a security policy for any application by its
developer and *unprivileged user* as seccomp can do for raw syscall filtering.
Moreover, Landlock handles stacked hook programs from different users. It must
then ensure there are no possible malicious interactions between these programs.
Differences with other (access control) LSMs:
* not only dedicated to administrators (i.e. no_new_privs);
* limited kernel attack surface (e.g. policy parsing);
* helpers to compare complex objects (path/FD), no access to internal kernel
data (do not leak addresses);
* constrained policy rules/programs (no DoS: deterministic execution time);
* do not leak more information than the loader process can legitimately have
access to (minimize metadata inference): must compare from an already allowed
file (through a handle).
## Why not use a policy language like the ones used by SELinux or AppArmor?
This kind of LSM is dedicated to administrators. They already manage the
system and are not a threat to the system security. However, seccomp, and
Landlock too, should be available to anyone, which potentially includes
untrusted users and processes. To reduce the attack surface, Landlock should
expose the minimum amount of code, hence minimal complexity. Moreover, another
threat is giving malicious code a new way to gain more
information. For example, Landlock features should not allow a program to get
the file owner if the directory containing this file is not readable. This data
could then be exfiltrated thanks to the access result. Thus, we should limit
the expressiveness of the available checks. The current approach is to do the
checks in such a way that only a comparison with an already accessed resource
(e.g. file descriptor) is possible. This provides a reference to compare
with, without exposing much information.
## As a developer, why do I need this feature?
Landlock's goal is to help userland to limit its attack surface.
Security-conscious developers would like to protect users from a security bug
in their applications and the third-party dependencies they are using. Such a
bug can compromise all the user data and help an attacker to perform a
privilege escalation. Using an *unprivileged sandbox* feature such as Landlock
empowers the developer with the ability to properly compartmentalize their
software and limit the impact of vulnerabilities.
## As a user, why do I need this feature?
Any user can already use seccomp-bpf to whitelist a set of syscalls to
reduce the kernel attack surface for a predefined set of processes. However, an
unprivileged user can't create a security policy like the root user can thanks to
SELinux and other access control LSMs. Landlock allows any unprivileged user to
restrict access to their data to an identified subset of the processes they run.
User tools can be created to help create such a high-level
access control policy. This policy may not be powerful enough to express the
same policies as the current access control LSMs, because of the threat an
unprivileged user can be to the system, but it should be enough for most
use-cases (e.g. blacklist or whitelist a set of file hierarchies).
# Changes since RFC v3
* use abstract LSM hook arguments with custom types (e.g. *_LANDLOCK_ARG_FS for
struct file, struct inode and struct path)
* add more LSM hooks to support full file system access control
* improve the sandbox example
* fix races and RCU issues:
* eBPF program execution and eBPF helpers
* revamp the arraymap of handles to cleanly deal with update/delete
* eBPF program subtype for Landlock:
* remove the "origin" field
* add an "option" field
* rebase onto Daniel Mack's patches v7 [3]
* remove merged commit 1955351da41c ("bpf: Set register type according to
is_valid_access()")
* fix spelling mistakes
* clean up some type and variable names
* split patches
* for now, remove cgroup delegation handling for unprivileged user
* remove extra access check for cgroup_get_from_fd()
* remove unused example code dealing with skb
* remove seccomp-bpf link:
* no more seccomp cookie
* for now, it is no longer possible to check the current syscall properties
# Changes since RFC v2
* revamp cgroup handling:
* use Daniel Mack's patches "Add eBPF hooks for cgroups" v5
* remove bpf_landlock_cmp_cgroup_beneath()
* make BPF_PROG_ATTACH usable with delegated cgroups
* add a new CGRP_NO_NEW_PRIVS flag for safe cgroups
* handle Landlock sandboxing for cgroups hierarchy
* allow unprivileged processes to attach Landlock eBPF program to cgroups
* add subtype to eBPF programs:
* replace Landlock hook identification by custom eBPF program types with a
dedicated subtype field
* manage fine-grained privileged Landlock programs
* register Landlock programs for dedicated trigger origins (e.g. syscall,
return from seccomp filter and/or interruption)
* performance and memory optimizations: use an array to access Landlock hooks
directly but do not duplicate it for each thread (seccomp-based)
* allow running Landlock programs without seccomp filter
* fix seccomp-related issues
* remove extra errno bounding check for Landlock programs
* add some examples for optional eBPF functions or context access (network
related) according to security checks to allow more features for privileged
programs (e.g. Checmate)
# Changes since RFC v1
* focus on the LSM hooks, not the syscalls:
* much simpler implementation
* does not need audit cache tricks to avoid race conditions
* simpler to use and more generic because using the LSM hook abstraction
directly
* more efficient because only checking in LSM hooks
* architecture agnostic
* switch from cBPF to eBPF:
* new eBPF program types dedicated to Landlock
* custom functions used by the eBPF program
* gain some new features (e.g. 10 registers, can load values of different
size, LLVM translator) but only a few functions allowed and a dedicated map
type
* new context: LSM hook ID, cookie and LSM hook arguments
* need to set the sysctl kernel.unprivileged_bpf_disabled to 0 (default value)
to be able to load hook filters as unprivileged users
* smaller and simpler:
* no more checker groups but dedicated arraymap of handles
* simpler userland structs thanks to eBPF functions
* distinctive name: Landlock
This series can be applied on top of Daniel Mack's patches for BPF_PROG_ATTACH
v7 [3] on Linux v4.9-rc2. This can be tested with CONFIG_SECURITY_LANDLOCK,
CONFIG_SECCOMP_FILTER and CONFIG_CGROUP_BPF. I would really appreciate
constructive comments on the usability, architecture, code and userland API of
Landlock LSM.
[1] https://lkml.kernel.org/r/[email protected]
[2] https://crypto.stanford.edu/cs155/papers/traps.pdf
[3] https://lkml.kernel.org/r/[email protected]
[4] https://lkml.kernel.org/r/[email protected]
Regards,
Mickaël Salaün (18):
landlock: Add Kconfig
bpf: Move u64_to_ptr() to BPF headers and inline it
bpf,landlock: Add a new arraymap type to deal with (Landlock) handles
bpf,landlock: Add eBPF program subtype and is_valid_subtype() verifier
bpf,landlock: Define an eBPF program type for Landlock
fs: Constify path_is_under()'s arguments
landlock: Add LSM hooks
landlock: Handle file comparisons
landlock: Add manager functions
seccomp: Split put_seccomp_filter() with put_seccomp()
seccomp,landlock: Handle Landlock hooks per process hierarchy
bpf: Cosmetic change for bpf_prog_attach()
bpf/cgroup: Replace struct bpf_prog with struct bpf_object
bpf/cgroup: Make cgroup_bpf_update() return an error code
bpf/cgroup: Move capability check
bpf/cgroup,landlock: Handle Landlock hooks per cgroup
landlock: Add update and debug access flags
samples/landlock: Add sandbox example
fs/namespace.c | 2 +-
include/linux/bpf-cgroup.h | 19 +-
include/linux/bpf.h | 44 +++-
include/linux/cgroup-defs.h | 2 +
include/linux/filter.h | 1 +
include/linux/fs.h | 2 +-
include/linux/landlock.h | 95 +++++++++
include/linux/lsm_hooks.h | 5 +
include/linux/seccomp.h | 12 +-
include/uapi/linux/bpf.h | 105 ++++++++++
include/uapi/linux/seccomp.h | 1 +
kernel/bpf/arraymap.c | 270 +++++++++++++++++++++++++
kernel/bpf/cgroup.c | 139 ++++++++++---
kernel/bpf/syscall.c | 71 ++++---
kernel/bpf/verifier.c | 35 +++-
kernel/cgroup.c | 6 +-
kernel/fork.c | 15 +-
kernel/seccomp.c | 26 ++-
kernel/trace/bpf_trace.c | 12 +-
net/core/filter.c | 26 ++-
samples/Makefile | 2 +-
samples/bpf/bpf_helpers.h | 5 +
samples/landlock/.gitignore | 1 +
samples/landlock/Makefile | 16 ++
samples/landlock/sandbox.c | 405 +++++++++++++++++++++++++++++++++++++
security/Kconfig | 1 +
security/Makefile | 2 +
security/landlock/Kconfig | 23 +++
security/landlock/Makefile | 3 +
security/landlock/checker_fs.c | 152 ++++++++++++++
security/landlock/checker_fs.h | 20 ++
security/landlock/common.h | 58 ++++++
security/landlock/lsm.c | 449 +++++++++++++++++++++++++++++++++++++++++
security/landlock/manager.c | 379 ++++++++++++++++++++++++++++++++++
security/security.c | 1 +
35 files changed, 2309 insertions(+), 96 deletions(-)
create mode 100644 include/linux/landlock.h
create mode 100644 samples/landlock/.gitignore
create mode 100644 samples/landlock/Makefile
create mode 100644 samples/landlock/sandbox.c
create mode 100644 security/landlock/Kconfig
create mode 100644 security/landlock/Makefile
create mode 100644 security/landlock/checker_fs.c
create mode 100644 security/landlock/checker_fs.h
create mode 100644 security/landlock/common.h
create mode 100644 security/landlock/lsm.c
create mode 100644 security/landlock/manager.c
--
2.9.3
This will be useful to be able to add more BPF attach types with
different capability checks.
Signed-off-by: Mickaël Salaün <[email protected]>
Cc: Alexei Starovoitov <[email protected]>
Cc: Daniel Borkmann <[email protected]>
Cc: Daniel Mack <[email protected]>
---
kernel/bpf/syscall.c | 12 ++++++------
1 file changed, 6 insertions(+), 6 deletions(-)
diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index e62123aeb202..128acb4f7177 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -833,15 +833,15 @@ static int bpf_prog_attach(const union bpf_attr *attr)
struct cgroup *cgrp;
int result;
- if (!capable(CAP_NET_ADMIN))
- return -EPERM;
-
if (CHECK_ATTR(BPF_PROG_ATTACH))
return -EINVAL;
switch (attr->attach_type) {
case BPF_CGROUP_INET_INGRESS:
case BPF_CGROUP_INET_EGRESS:
+ if (!capable(CAP_NET_ADMIN))
+ return -EPERM;
+
prog = bpf_prog_get_type(attr->attach_bpf_fd,
BPF_PROG_TYPE_CGROUP_SKB);
break;
@@ -872,15 +872,15 @@ static int bpf_prog_detach(const union bpf_attr *attr)
struct cgroup *cgrp;
int result = 0;
- if (!capable(CAP_NET_ADMIN))
- return -EPERM;
-
if (CHECK_ATTR(BPF_PROG_DETACH))
return -EINVAL;
switch (attr->attach_type) {
case BPF_CGROUP_INET_INGRESS:
case BPF_CGROUP_INET_EGRESS:
+ if (!capable(CAP_NET_ADMIN))
+ return -EPERM;
+
cgrp = cgroup_get_from_fd(attr->target_fd);
if (IS_ERR(cgrp))
return PTR_ERR(cgrp);
--
2.9.3
The seccomp(2) syscall can be used to apply a Landlock rule to the
current process. As with a seccomp filter, the Landlock rule is enforced
for all its future children. An inherited rule tree can be updated
(append-only) by the owner of inherited Landlock nodes (e.g. a parent
process that creates a new rule). However, an intermediate task, which
did not create a rule, will not be able to update its children's rules.
Changes since v3:
* remove the hard link with seccomp (suggested by Andy Lutomirski and
Kees Cook):
* remove the cookie which could imply multiple evaluation of Landlock
rules
* remove the origin field in struct landlock_data
* remove documentation fix (merged upstream)
* rename the new seccomp command to SECCOMP_ADD_LANDLOCK_RULE
* internal renaming
Changes since v2:
* Landlock programs can now be run without seccomp filter but for any
syscall (from the process) or interruption
* move Landlock related functions and structs into security/landlock/*
(to manage cgroups as well)
* fix seccomp filter handling: run Landlock programs for each of their
legitimate seccomp filter
* properly clean up all seccomp results
* cosmetic changes to ease the understanding
* fix some ifdef
Signed-off-by: Mickaël Salaün <[email protected]>
Cc: Kees Cook <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Will Drewry <[email protected]>
Cc: Andrew Morton <[email protected]>
Link: https://lkml.kernel.org/r/CAGXu5j+qowiyQuhifOBtupfPxp6XevdgF08BW4yzkVDTCha0xA@mail.gmail.com
---
include/linux/landlock.h | 5 +++++
include/linux/seccomp.h | 8 +++++++
include/uapi/linux/seccomp.h | 1 +
kernel/fork.c | 13 +++++++++--
kernel/seccomp.c | 8 +++++++
security/landlock/lsm.c | 8 +++++--
security/landlock/manager.c | 51 ++++++++++++++++++++++++++++++++++++++++++++
7 files changed, 90 insertions(+), 4 deletions(-)
diff --git a/include/linux/landlock.h b/include/linux/landlock.h
index 263be3cf0b48..72b4235d255f 100644
--- a/include/linux/landlock.h
+++ b/include/linux/landlock.h
@@ -74,5 +74,10 @@ struct landlock_hooks {
void put_landlock_hooks(struct landlock_hooks *hooks);
+#ifdef CONFIG_SECCOMP_FILTER
+int landlock_seccomp_append_prog(unsigned int flags,
+ const char __user *user_bpf_fd);
+#endif /* CONFIG_SECCOMP_FILTER */
+
#endif /* CONFIG_SECURITY_LANDLOCK */
#endif /* _LINUX_LANDLOCK_H */
diff --git a/include/linux/seccomp.h b/include/linux/seccomp.h
index e25aee2cdfc0..4a8ccc7ff976 100644
--- a/include/linux/seccomp.h
+++ b/include/linux/seccomp.h
@@ -10,6 +10,10 @@
#include <linux/thread_info.h>
#include <asm/seccomp.h>
+#if defined(CONFIG_SECCOMP_FILTER) && defined(CONFIG_SECURITY_LANDLOCK)
+#include <linux/landlock.h>
+#endif /* CONFIG_SECCOMP_FILTER && CONFIG_SECURITY_LANDLOCK */
+
struct seccomp_filter;
/**
* struct seccomp - the state of a seccomp'ed process
@@ -18,6 +22,7 @@ struct seccomp_filter;
* system calls available to a process.
* @filter: must always point to a valid seccomp-filter or NULL as it is
* accessed without locking during system call entry.
+ * @landlock_hooks: contains an array of Landlock programs.
*
* @filter must only be accessed from the context of current as there
* is no read locking.
@@ -25,6 +30,9 @@ struct seccomp_filter;
struct seccomp {
int mode;
struct seccomp_filter *filter;
+#if defined(CONFIG_SECCOMP_FILTER) && defined(CONFIG_SECURITY_LANDLOCK)
+ struct landlock_hooks *landlock_hooks;
+#endif /* CONFIG_SECCOMP_FILTER && CONFIG_SECURITY_LANDLOCK */
};
#ifdef CONFIG_HAVE_ARCH_SECCOMP_FILTER
diff --git a/include/uapi/linux/seccomp.h b/include/uapi/linux/seccomp.h
index 0f238a43ff1e..56dd692cddac 100644
--- a/include/uapi/linux/seccomp.h
+++ b/include/uapi/linux/seccomp.h
@@ -13,6 +13,7 @@
/* Valid operations for seccomp syscall. */
#define SECCOMP_SET_MODE_STRICT 0
#define SECCOMP_SET_MODE_FILTER 1
+#define SECCOMP_ADD_LANDLOCK_RULE 2
/* Valid flags for SECCOMP_SET_MODE_FILTER */
#define SECCOMP_FILTER_FLAG_TSYNC 1
diff --git a/kernel/fork.c b/kernel/fork.c
index 0690e43bdda5..d8af3ba554fa 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -510,7 +510,10 @@ static struct task_struct *dup_task_struct(struct task_struct *orig, int node)
* the usage counts on the error path calling free_task.
*/
tsk->seccomp.filter = NULL;
-#endif
+#ifdef CONFIG_SECURITY_LANDLOCK
+ tsk->seccomp.landlock_hooks = NULL;
+#endif /* CONFIG_SECURITY_LANDLOCK */
+#endif /* CONFIG_SECCOMP */
setup_thread_stack(tsk, orig);
clear_user_return_notifier(tsk);
@@ -1378,7 +1381,13 @@ static void copy_seccomp(struct task_struct *p)
/* Ref-count the new filter user, and assign it. */
get_seccomp_filter(current);
- p->seccomp = current->seccomp;
+ p->seccomp.mode = current->seccomp.mode;
+ p->seccomp.filter = current->seccomp.filter;
+#if defined(CONFIG_SECCOMP_FILTER) && defined(CONFIG_SECURITY_LANDLOCK)
+ p->seccomp.landlock_hooks = current->seccomp.landlock_hooks;
+ if (p->seccomp.landlock_hooks)
+ atomic_inc(&p->seccomp.landlock_hooks->usage);
+#endif /* CONFIG_SECCOMP_FILTER && CONFIG_SECURITY_LANDLOCK */
/*
* Explicitly enable no_new_privs here in case it got set
diff --git a/kernel/seccomp.c b/kernel/seccomp.c
index e741a82eab4d..f967ddf9d4b5 100644
--- a/kernel/seccomp.c
+++ b/kernel/seccomp.c
@@ -32,6 +32,7 @@
#include <linux/security.h>
#include <linux/tracehook.h>
#include <linux/uaccess.h>
+#include <linux/landlock.h>
/**
* struct seccomp_filter - container for seccomp BPF programs
@@ -493,6 +494,9 @@ static void put_seccomp_filter(struct seccomp_filter *filter)
void put_seccomp(struct task_struct *tsk)
{
put_seccomp_filter(tsk->seccomp.filter);
+#ifdef CONFIG_SECURITY_LANDLOCK
+ put_landlock_hooks(tsk->seccomp.landlock_hooks);
+#endif /* CONFIG_SECURITY_LANDLOCK */
}
/**
@@ -797,6 +801,10 @@ static long do_seccomp(unsigned int op, unsigned int flags,
return seccomp_set_mode_strict();
case SECCOMP_SET_MODE_FILTER:
return seccomp_set_mode_filter(flags, uargs);
+#if defined(CONFIG_SECCOMP_FILTER) && defined(CONFIG_SECURITY_LANDLOCK)
+ case SECCOMP_ADD_LANDLOCK_RULE:
+ return landlock_seccomp_append_prog(flags, uargs);
+#endif /* CONFIG_SECCOMP_FILTER && CONFIG_SECURITY_LANDLOCK */
default:
return -EINVAL;
}
diff --git a/security/landlock/lsm.c b/security/landlock/lsm.c
index b3c107244df9..572f4f7f9f19 100644
--- a/security/landlock/lsm.c
+++ b/security/landlock/lsm.c
@@ -8,6 +8,7 @@
* published by the Free Software Foundation.
*/
+#include <asm/current.h>
#include <linux/bpf.h> /* enum bpf_reg_type, struct landlock_data */
#include <linux/cred.h>
#include <linux/err.h> /* MAX_ERRNO */
@@ -15,6 +16,7 @@
#include <linux/kernel.h> /* FIELD_SIZEOF() */
#include <linux/landlock.h>
#include <linux/lsm_hooks.h>
+#include <linux/seccomp.h> /* struct seccomp_* */
#include <linux/types.h> /* uintptr_t */
/* hook arguments */
@@ -161,8 +163,10 @@ static int landlock_enforce(enum landlock_hook hook, __u64 args[6])
.args[5] = args[5],
};
- /* placeholder for seccomp and cgroup managers */
- ret = landlock_run_prog(hook_idx, &ctx, NULL);
+#ifdef CONFIG_SECCOMP_FILTER
+ ret = landlock_run_prog(hook_idx, &ctx,
+ current->seccomp.landlock_hooks);
+#endif /* CONFIG_SECCOMP_FILTER */
return -ret;
}
diff --git a/security/landlock/manager.c b/security/landlock/manager.c
index f3f03b64ebef..56e99ccd5708 100644
--- a/security/landlock/manager.c
+++ b/security/landlock/manager.c
@@ -14,8 +14,11 @@
#include <linux/filter.h> /* struct bpf_prog */
#include <linux/kernel.h> /* round_up() */
#include <linux/landlock.h>
+#include <linux/sched.h> /* current_cred(), task_no_new_privs() */
+#include <linux/security.h> /* security_capable_noaudit() */
#include <linux/slab.h> /* alloc(), kfree() */
#include <linux/types.h> /* atomic_t */
+#include <linux/uaccess.h> /* copy_from_user() */
#include "common.h"
@@ -263,3 +266,51 @@ static struct landlock_hooks *landlock_append_prog(
put_landlock_rule(rule);
return new_hooks;
}
+
+/**
+ * landlock_seccomp_append_prog - attach a Landlock program to the current process
+ *
+ * current->seccomp.landlock_hooks is lazily allocated. When a process forks,
+ * only a pointer is copied. When a new hook is added by a process, if there are
+ * other references to this process' landlock_hooks, then a new allocation is
+ * made to contain an array pointing to Landlock program lists. This design
+ * has a low performance impact and is memory efficient while keeping the
+ * property of append-only programs.
+ *
+ * @flags: not used for now, but could be used for TSYNC
+ * @user_bpf_fd: file descriptor pointing to a loaded/checked eBPF program
+ * dedicated to Landlock
+ */
+#ifdef CONFIG_SECCOMP_FILTER
+int landlock_seccomp_append_prog(unsigned int flags, const char __user *user_bpf_fd)
+{
+ struct landlock_hooks *new_hooks;
+ struct bpf_prog *prog;
+ int bpf_fd;
+
+ if (!task_no_new_privs(current) &&
+ security_capable_noaudit(current_cred(),
+ current_user_ns(), CAP_SYS_ADMIN) != 0)
+ return -EPERM;
+ if (!user_bpf_fd)
+ return -EINVAL;
+ if (flags)
+ return -EINVAL;
+ if (copy_from_user(&bpf_fd, user_bpf_fd, sizeof(bpf_fd)))
+ return -EFAULT;
+ prog = bpf_prog_get(bpf_fd);
+ if (IS_ERR(prog))
+ return PTR_ERR(prog);
+
+ /*
+ * We don't need to lock anything for the current process hierarchy,
+ * everything is guarded by the atomic counters.
+ */
+ new_hooks = landlock_append_prog(current->seccomp.landlock_hooks, prog);
+ /* @prog is managed/freed by landlock_append_prog() */
+ if (IS_ERR(new_hooks))
+ return PTR_ERR(new_hooks);
+ current->seccomp.landlock_hooks = new_hooks;
+ return 0;
+}
+#endif /* CONFIG_SECCOMP_FILTER */
--
2.9.3
Add a basic sandbox tool to create a process isolated from some part of
the system. This can depend on the current cgroup.
A sandboxed process can stat the directories from the root up to the
allowed files. This then allows it to stat ".." in an allowed directory.
Access to other sibling files (which are not parents of allowed files) is
denied with ENOENT to prevent file name discovery by brute force.
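The visibility policy described above can be summarized by a small userspace model. It is a sketch only, not the logic of samples/landlock/sandbox.c: it compares normalized absolute path strings (no trailing slash, no "." or ".." components), whereas the real checks compare kernel objects through handles. The names `path_class`, `classify_path()` and `stat_errno()` are hypothetical.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

enum path_class {
	PATH_ALLOWED,	/* inside an allowed hierarchy */
	PATH_ANCESTOR,	/* directory leading to an allowed file */
	PATH_HIDDEN,	/* anything else */
};

/* True if @path is @dir itself or lies somewhere below it. */
static bool is_beneath(const char *dir, const char *path)
{
	size_t len = strlen(dir);

	if (strcmp(dir, "/") == 0)
		return path[0] == '/';
	return strncmp(dir, path, len) == 0 &&
	       (path[len] == '/' || path[len] == '\0');
}

/* Classify @path against the @count entries of @allowed. */
enum path_class classify_path(const char *const *allowed, size_t count,
			      const char *path)
{
	size_t i;

	for (i = 0; i < count; i++)
		if (is_beneath(allowed[i], path))
			return PATH_ALLOWED;
	for (i = 0; i < count; i++)
		if (is_beneath(path, allowed[i]))
			return PATH_ANCESTOR;
	return PATH_HIDDEN;
}

/* Result of a stat() on a path of the given class: 0 or ENOENT. */
int stat_errno(enum path_class cls)
{
	return cls == PATH_HIDDEN ? ENOENT : 0;
}
```

With /bin, /lib, /usr and /tmp allowed, "/" is an ancestor and can be stat'ed, while /home is hidden, so a stat on it yields ENOENT, as in the `ls /home` output of the examples.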
Example with the current process hierarchy (seccomp):
$ ls /home
user1
$ LANDLOCK_ALLOWED='/bin:/lib:/usr:/tmp:/proc/self/fd/0' \
./samples/landlock/sandbox /bin/sh -i
Launching a new sandboxed process.
$ ls /home
ls: cannot access '/home': No such file or directory
Example with a cgroup:
$ mkdir /sys/fs/cgroup/sandboxed
$ ls /home
user1
$ LANDLOCK_CGROUPS='/sys/fs/cgroup/sandboxed' \
LANDLOCK_ALLOWED='/bin:/lib:/usr:/tmp:/proc/self/fd/0' \
./samples/landlock/sandbox
Ready to sandbox with cgroups.
$ ls /home
user1
$ echo $$ > /sys/fs/cgroup/sandboxed/cgroup.procs
$ ls /home
ls: cannot access '/home': No such file or directory
Changes since v3:
* remove seccomp and origin field: completely free from seccomp programs
* handle more FS-related hooks
* handle inode hooks and directory traversal
* add faked but consistent view thanks to ENOENT
* add /lib64 in the example
* fix spelling
* rename some types and definitions (e.g. SECCOMP_ADD_LANDLOCK_RULE)
Changes since v2:
* use BPF_PROG_ATTACH for cgroup handling
Signed-off-by: Mickaël Salaün <[email protected]>
Cc: Alexei Starovoitov <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Daniel Borkmann <[email protected]>
Cc: David S. Miller <[email protected]>
Cc: James Morris <[email protected]>
Cc: Kees Cook <[email protected]>
Cc: Serge E. Hallyn <[email protected]>
---
samples/Makefile | 2 +-
samples/landlock/.gitignore | 1 +
samples/landlock/Makefile | 16 ++
samples/landlock/sandbox.c | 405 ++++++++++++++++++++++++++++++++++++++++++++
4 files changed, 423 insertions(+), 1 deletion(-)
create mode 100644 samples/landlock/.gitignore
create mode 100644 samples/landlock/Makefile
create mode 100644 samples/landlock/sandbox.c
diff --git a/samples/Makefile b/samples/Makefile
index e17d66d77f09..9f1b87bad1c0 100644
--- a/samples/Makefile
+++ b/samples/Makefile
@@ -2,4 +2,4 @@
obj-$(CONFIG_SAMPLES) += kobject/ kprobes/ trace_events/ livepatch/ \
hw_breakpoint/ kfifo/ kdb/ hidraw/ rpmsg/ seccomp/ \
- configfs/ connector/ v4l/ trace_printk/ blackfin/
+ configfs/ connector/ v4l/ trace_printk/ blackfin/ landlock/
diff --git a/samples/landlock/.gitignore b/samples/landlock/.gitignore
new file mode 100644
index 000000000000..f6c6da930a30
--- /dev/null
+++ b/samples/landlock/.gitignore
@@ -0,0 +1 @@
+/sandbox
diff --git a/samples/landlock/Makefile b/samples/landlock/Makefile
new file mode 100644
index 000000000000..d1044b2afd27
--- /dev/null
+++ b/samples/landlock/Makefile
@@ -0,0 +1,16 @@
+# kbuild trick to avoid linker error. Can be omitted if a module is built.
+obj- := dummy.o
+
+hostprogs-$(CONFIG_SECURITY_LANDLOCK) := sandbox
+sandbox-objs := sandbox.o
+
+always := $(hostprogs-y)
+
+HOSTCFLAGS += -I$(objtree)/usr/include
+
+# Trick to allow make to be run from this directory
+all:
+ $(MAKE) -C ../../ $$PWD/
+
+clean:
+ $(MAKE) -C ../../ M=$$PWD clean
diff --git a/samples/landlock/sandbox.c b/samples/landlock/sandbox.c
new file mode 100644
index 000000000000..9a36ebdf02d8
--- /dev/null
+++ b/samples/landlock/sandbox.c
@@ -0,0 +1,405 @@
+/*
+ * Landlock LSM - Sandbox example
+ *
+ * Copyright (C) 2016 Mickaël Salaün <[email protected]>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2, as
+ * published by the Free Software Foundation.
+ */
+
+#define _GNU_SOURCE
+#include <errno.h>
+#include <fcntl.h> /* open() */
+#include <linux/bpf.h>
+#include <linux/filter.h>
+#include <linux/prctl.h>
+#include <linux/seccomp.h>
+#include <linux/stat.h> /* S_IFDIR() */
+#include <stddef.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+#include <sys/prctl.h>
+#include <sys/syscall.h>
+#include <unistd.h>
+
+#include "../../tools/include/linux/filter.h"
+
+#include "../bpf/libbpf.c"
+
+#ifndef seccomp
+static int seccomp(unsigned int op, unsigned int flags, void *args)
+{
+ errno = 0;
+ return syscall(__NR_seccomp, op, flags, args);
+}
+#endif
+
+static int landlock_prog_load(const struct bpf_insn *insns, int prog_len,
+ enum landlock_hook hook, __u64 access)
+{
+ union bpf_attr attr = {
+ .prog_type = BPF_PROG_TYPE_LANDLOCK,
+ .insns = ptr_to_u64((void *) insns),
+ .insn_cnt = prog_len / sizeof(struct bpf_insn),
+ .license = ptr_to_u64((void *) "GPL"),
+ .log_buf = ptr_to_u64(bpf_log_buf),
+ .log_size = LOG_BUF_SIZE,
+ .log_level = 1,
+ .prog_subtype.landlock_rule = {
+ .hook = hook,
+ .access = access,
+ },
+ };
+
+ /* assign one field outside of struct init to make sure any
+ * padding is zero initialized
+ */
+ attr.kern_version = 0;
+
+ bpf_log_buf[0] = 0;
+
+ return syscall(__NR_bpf, BPF_PROG_LOAD, &attr, sizeof(attr));
+}
+
+#define ARRAY_SIZE(a) (sizeof(a) / sizeof(a[0]))
+#define MAX_ERRNO 4095
+#define MAY_EXEC 0x00000001
+
+struct landlock_rule {
+ enum landlock_hook hook;
+ struct bpf_insn *bpf;
+ size_t size;
+};
+
+static int apply_sandbox(const char **allowed_paths, int path_nb, const char
+ **cgroup_paths, int cgroup_nb)
+{
+ __u32 key;
+ int i, ret = 0, map_fs = -1;
+
+ /* set up the test sandbox */
+ if (prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0)) {
+ perror("prctl(no_new_priv)");
+ return 1;
+ }
+
+ if (path_nb) {
+ map_fs = bpf_create_map(BPF_MAP_TYPE_LANDLOCK_ARRAY,
+ sizeof(key), sizeof(struct landlock_handle),
+ 10, 0);
+ if (map_fs < 0) {
+ fprintf(stderr, "bpf_create_map(fs): %s\n",
+ strerror(errno));
+ return 1;
+ }
+ for (key = 0; key < path_nb; key++) {
+ int fd = open(allowed_paths[key],
+ O_RDONLY | O_CLOEXEC);
+ if (fd < 0) {
+ fprintf(stderr, "open(fs: \"%s\"): %s\n",
+ allowed_paths[key],
+ strerror(errno));
+ return 1;
+ }
+ struct landlock_handle handle = {
+ .type = BPF_MAP_HANDLE_TYPE_LANDLOCK_FS_FD,
+ .fd = (__u64)fd,
+ };
+
+ /* register a new LSM handle */
+ if (bpf_update_elem(map_fs, &key, &handle, BPF_ANY)) {
+ fprintf(stderr, "bpf_update_elem(fs: \"%s\"): %s\n",
+ allowed_paths[key],
+ strerror(errno));
+ close(fd);
+ return 1;
+ }
+ close(fd);
+ }
+ }
+
+ /* Landlock rule for file-based and path-based hooks */
+ struct bpf_insn hook_file[] = {
+ /* save context */
+ BPF_MOV64_REG(BPF_REG_6, BPF_REG_1),
+ /* specify an option, if any */
+ BPF_MOV32_IMM(BPF_REG_1, 0),
+ /* handles to compare with */
+ BPF_LD_MAP_FD(BPF_REG_2, map_fs),
+ BPF_MOV64_IMM(BPF_REG_3, BPF_MAP_ARRAY_OP_OR),
+ /* hook argument */
+ BPF_LDX_MEM(BPF_DW, BPF_REG_4, BPF_REG_6, offsetof(struct
+ landlock_data, args[0])),
+ /* checker function */
+ BPF_EMIT_CALL(BPF_FUNC_landlock_cmp_fs_beneath),
+ /* if the checked path is beneath the handle */
+ BPF_JMP_IMM(BPF_JNE, BPF_REG_0, 0, 2),
+ BPF_MOV32_IMM(BPF_REG_0, 0),
+ BPF_EXIT_INSN(),
+ /* allow anonymous mapping */
+ BPF_JMP_IMM(BPF_JNE, BPF_REG_0, -ENOENT, 2),
+ BPF_MOV32_IMM(BPF_REG_0, 0),
+ BPF_EXIT_INSN(),
+ /* deny by default, if any error */
+ BPF_MOV32_IMM(BPF_REG_0, EACCES),
+ BPF_EXIT_INSN(),
+ };
+
+ /* Landlock rule for inode-based hooks */
+ struct bpf_insn hook_inode[] = {
+ /* save context */
+ BPF_MOV64_REG(BPF_REG_6, BPF_REG_1),
+ /* specify an option, if any */
+ BPF_MOV32_IMM(BPF_REG_1, 0),
+ /* handles to compare with */
+ BPF_LD_MAP_FD(BPF_REG_2, map_fs),
+ BPF_MOV64_IMM(BPF_REG_3, BPF_MAP_ARRAY_OP_OR),
+ /* hook argument */
+ BPF_LDX_MEM(BPF_DW, BPF_REG_4, BPF_REG_6, offsetof(struct
+ landlock_data, args[0])),
+ /* checker function */
+ BPF_EMIT_CALL(BPF_FUNC_landlock_cmp_fs_beneath),
+ /* if the checked path is beneath the handle */
+ BPF_JMP_IMM(BPF_JNE, BPF_REG_0, 0, 2),
+ BPF_MOV32_IMM(BPF_REG_0, 0),
+ BPF_EXIT_INSN(),
+
+ /*
+ * We must allow MAY_EXEC access on directories from the root to the
+ * handles, otherwise they are not reachable.
+ */
+
+ /* hook argument */
+ BPF_LDX_MEM(BPF_DW, BPF_REG_1, BPF_REG_6, offsetof(struct
+ landlock_data, args[0])),
+ /* checker function */
+ BPF_EMIT_CALL(BPF_FUNC_landlock_get_fs_mode),
+ /* check if it returned an error */
+ BPF_MOV64_IMM(BPF_REG_7, 0),
+ BPF_ALU64_IMM(BPF_SUB, BPF_REG_7, MAX_ERRNO),
+ BPF_JMP_REG(BPF_JGE, BPF_REG_0, BPF_REG_7, 2),
+ /* check if the inode is a directory */
+ BPF_ALU64_IMM(BPF_AND, BPF_REG_0, S_IFMT),
+ BPF_JMP_IMM(BPF_JEQ, BPF_REG_0, S_IFDIR, 2),
+ /* no entry by default, if any error */
+ BPF_MOV32_IMM(BPF_REG_0, ENOENT),
+ BPF_EXIT_INSN(),
+
+ /* specify an option, if any */
+ BPF_MOV32_IMM(BPF_REG_1, LANDLOCK_FLAG_OPT_REVERSE),
+ /* handles to compare with */
+ BPF_LD_MAP_FD(BPF_REG_2, map_fs),
+ BPF_MOV64_IMM(BPF_REG_3, BPF_MAP_ARRAY_OP_OR),
+ /* hook argument */
+ BPF_LDX_MEM(BPF_DW, BPF_REG_4, BPF_REG_6, offsetof(struct
+ landlock_data, args[0])),
+ /* checker function */
+ BPF_EMIT_CALL(BPF_FUNC_landlock_cmp_fs_beneath),
+ /* if one handle is not beneath the checked path */
+ BPF_JMP_IMM(BPF_JEQ, BPF_REG_0, 0, 2),
+ BPF_MOV32_IMM(BPF_REG_0, ENOENT),
+ BPF_EXIT_INSN(),
+
+ /* check access mask */
+ BPF_LDX_MEM(BPF_DW, BPF_REG_7, BPF_REG_6, offsetof(struct
+ landlock_data, args[1])),
+ BPF_JMP_IMM(BPF_JNE, BPF_REG_7, MAY_EXEC, 2),
+ BPF_MOV32_IMM(BPF_REG_0, 0),
+ BPF_EXIT_INSN(),
+ BPF_MOV32_IMM(BPF_REG_0, EACCES),
+ BPF_EXIT_INSN(),
+ };
+
+ /* Landlock rule for the stat hook */
+ struct bpf_insn hook_stat[] = {
+ /* save context */
+ BPF_MOV64_REG(BPF_REG_6, BPF_REG_1),
+ /* specify an option, if any */
+ BPF_MOV32_IMM(BPF_REG_1, 0),
+ /* handles to compare with */
+ BPF_LD_MAP_FD(BPF_REG_2, map_fs),
+ BPF_MOV64_IMM(BPF_REG_3, BPF_MAP_ARRAY_OP_OR),
+ /* hook argument */
+ BPF_LDX_MEM(BPF_DW, BPF_REG_4, BPF_REG_6, offsetof(struct
+ landlock_data, args[0])),
+ /* checker function */
+ BPF_EMIT_CALL(BPF_FUNC_landlock_cmp_fs_beneath),
+ /* if the checked path is beneath the handle */
+ BPF_JMP_IMM(BPF_JNE, BPF_REG_0, 0, 2),
+ BPF_MOV32_IMM(BPF_REG_0, 0),
+ BPF_EXIT_INSN(),
+
+ /*
+ * We may want to allow discovery of the directories hierarchy
+ * (from the root to the handles).
+ */
+
+ /* hook argument */
+ BPF_LDX_MEM(BPF_DW, BPF_REG_1, BPF_REG_6, offsetof(struct
+ landlock_data, args[0])),
+ /* checker function */
+ BPF_EMIT_CALL(BPF_FUNC_landlock_get_fs_mode),
+ /* check if it returned an error */
+ BPF_MOV64_IMM(BPF_REG_7, 0),
+ BPF_ALU64_IMM(BPF_SUB, BPF_REG_7, MAX_ERRNO),
+ BPF_JMP_REG(BPF_JGE, BPF_REG_0, BPF_REG_7, 2),
+ /* check if the inode is a directory */
+ BPF_ALU64_IMM(BPF_AND, BPF_REG_0, S_IFMT),
+ BPF_JMP_IMM(BPF_JEQ, BPF_REG_0, S_IFDIR, 2),
+ /* no entry by default, if any error */
+ BPF_MOV32_IMM(BPF_REG_0, ENOENT),
+ BPF_EXIT_INSN(),
+
+ /* specify an option, if any */
+ BPF_MOV32_IMM(BPF_REG_1, LANDLOCK_FLAG_OPT_REVERSE),
+ /* handles to compare with */
+ BPF_LD_MAP_FD(BPF_REG_2, map_fs),
+ BPF_MOV64_IMM(BPF_REG_3, BPF_MAP_ARRAY_OP_OR),
+ /* hook argument */
+ BPF_LDX_MEM(BPF_DW, BPF_REG_4, BPF_REG_6, offsetof(struct
+ landlock_data, args[0])),
+ /* checker function */
+ BPF_EMIT_CALL(BPF_FUNC_landlock_cmp_fs_beneath),
+ /* if one handle is not beneath the checked path */
+ BPF_JMP_IMM(BPF_JEQ, BPF_REG_0, 0, 2),
+ BPF_MOV32_IMM(BPF_REG_0, ENOENT),
+ BPF_EXIT_INSN(),
+ BPF_MOV32_IMM(BPF_REG_0, 0),
+ BPF_EXIT_INSN(),
+ };
+
+ struct landlock_rule rules[] = {
+ {
+ .hook = LANDLOCK_HOOK_FILE_OPEN,
+ .bpf = hook_file,
+ .size = sizeof(hook_file),
+ },
+ {
+ .hook = LANDLOCK_HOOK_FILE_PERMISSION,
+ .bpf = hook_file,
+ .size = sizeof(hook_file),
+ },
+ {
+ .hook = LANDLOCK_HOOK_MMAP_FILE,
+ .bpf = hook_file,
+ .size = sizeof(hook_file),
+ },
+ {
+ .hook = LANDLOCK_HOOK_INODE_PERMISSION,
+ .bpf = hook_inode,
+ .size = sizeof(hook_inode),
+ },
+ {
+ .hook = LANDLOCK_HOOK_INODE_GETATTR,
+ .bpf = hook_stat,
+ .size = sizeof(hook_stat),
+ },
+ };
+ for (i = 0; i < ARRAY_SIZE(rules) && !ret; i++) {
+ int bpf0 = landlock_prog_load(rules[i].bpf, rules[i].size, rules[i].hook, 0);
+ if (bpf0 == -1) {
+ perror("prog_load");
+ fprintf(stderr, "%s", bpf_log_buf);
+ ret = 1;
+ break;
+ }
+ if (!cgroup_nb) {
+ if (seccomp(SECCOMP_ADD_LANDLOCK_RULE, 0, &bpf0)) {
+ perror("seccomp(set_hook)");
+ ret = 1;
+ }
+ } else {
+ for (key = 0; key < cgroup_nb && !ret; key++) {
+ int fd = open(cgroup_paths[key],
+ O_DIRECTORY | O_CLOEXEC);
+ if (fd < 0) {
+ fprintf(stderr, "open(cgroup: \"%s\"): %s\n",
+ cgroup_paths[key], strerror(errno));
+ ret = 1;
+ break;
+ }
+ if (bpf_prog_attach(bpf0, fd, BPF_CGROUP_LANDLOCK)) {
+ fprintf(stderr, "bpf_prog_attach(cgroup: \"%s\"): %s\n",
+ cgroup_paths[key], strerror(errno));
+ ret = 1;
+ }
+ close(fd);
+ }
+ }
+ close(bpf0);
+ }
+
+ if (path_nb) {
+ close(map_fs);
+ }
+ return ret;
+}
+
+#define ENV_FS_PATH_NAME "LANDLOCK_ALLOWED"
+#define ENV_CGROUP_PATH_NAME "LANDLOCK_CGROUPS"
+#define ENV_PATH_TOKEN ":"
+
+static int parse_path(char *env_path, const char ***path_list)
+{
+ int i, path_nb = 0;
+
+ if (env_path) {
+ path_nb++;
+ for (i = 0; env_path[i]; i++) {
+ if (env_path[i] == ENV_PATH_TOKEN[0]) {
+ path_nb++;
+ }
+ }
+ }
+ *path_list = malloc(path_nb * sizeof(**path_list));
+ for (i = 0; i < path_nb; i++) {
+ (*path_list)[i] = strsep(&env_path, ENV_PATH_TOKEN);
+ }
+
+ return path_nb;
+}
+
+int main(int argc, char * const argv[], char * const *envp)
+{
+ char *cmd_path;
+ char *env_path_allowed, *env_path_cgroup;
+ int path_nb, cgroup_nb;
+ const char **sb_paths = NULL;
+ const char **cg_paths = NULL;
+ char * const *cmd_argv;
+
+ env_path_allowed = getenv(ENV_FS_PATH_NAME);
+ if (env_path_allowed)
+ env_path_allowed = strdup(env_path_allowed);
+ env_path_cgroup = getenv(ENV_CGROUP_PATH_NAME);
+ if (env_path_cgroup)
+ env_path_cgroup = strdup(env_path_cgroup);
+
+ path_nb = parse_path(env_path_allowed, &sb_paths);
+ cgroup_nb = parse_path(env_path_cgroup, &cg_paths);
+ if (argc < 2 && !cgroup_nb) {
+ fprintf(stderr, "usage: %s <cmd> [args]...\n\n", argv[0]);
+ fprintf(stderr, "Environment variables containing paths, each separated by a colon:\n");
+ fprintf(stderr, "* %s (whitelist of allowed files and directories)\n",
+ ENV_FS_PATH_NAME);
+ fprintf(stderr, "* %s (optional cgroup paths for which the sandbox is enabled)\n",
+ ENV_CGROUP_PATH_NAME);
+ fprintf(stderr, "\nexample:\n%s='/bin:/lib:/lib64:/usr:/tmp:/proc/self/fd/0' %s /bin/sh -i\n",
+ ENV_FS_PATH_NAME, argv[0]);
+ return 1;
+ }
+ if (apply_sandbox(sb_paths, path_nb, cg_paths, cgroup_nb))
+ return 1;
+ if (!cgroup_nb) {
+ cmd_path = argv[1];
+ cmd_argv = argv + 1;
+ fprintf(stderr, "Launching a new sandboxed process.\n");
+ execve(cmd_path, cmd_argv, envp);
+ perror("execve");
+ return 1;
+ }
+ fprintf(stderr, "Ready to sandbox with cgroups.\n");
+ return 0;
+}
--
2.9.3
This allows adding new eBPF programs to the Landlock hooks dedicated to a
cgroup thanks to the BPF_PROG_ATTACH command. The Landlock hooks
attached to a cgroup are propagated to its child cgroups. When a new
Landlock program is attached to one of these nested cgroups, this cgroup
hierarchy forks the Landlock hooks but still gets the updates from
its parents. This design is easy to deal with. The main difference with
BPF_PROG_TYPE_CGROUP_SKB lies in the fact that Landlock rules can
only be stacked but not removed. This append-only behavior is consistent
throughout the cgroup hierarchy.
A node references a rule list and an optional parent node. A node can be
updated on the fly by its owner to point to a new rule list. This is
useful to be able to atomically and consistently update an inherited
part of a chain of rules. This way, when a parent cgroup adds a new
Landlock rule, all the child cgroups atomically update their rules to
include this new one. This is the behavior expected for a hierarchy of
rules where the parent can enforce a rule on all its children at any
time.
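The node design described above can be illustrated with a small
userland model. This is a simplified sketch and not the kernel
implementation: struct rule_list, struct node, node_append() and
node_count_rules() are hypothetical names, rules are plain integers
standing in for eBPF programs, and the RCU publication and reference
counting done by the kernel are left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define MAX_RULES 8

/* Snapshot of the rules owned by one cgroup (hypothetical layout). */
struct rule_list {
	size_t nb;
	int rules[MAX_RULES];	/* stand-ins for eBPF programs */
};

/* A node references a rule list and an optional parent node. */
struct node {
	struct rule_list *list;
	const struct node *prev;
};

/* Append-only update: build a new list and switch the node to it. */
static int node_append(struct node *node, int rule)
{
	struct rule_list *new_list;

	assert(node);
	new_list = calloc(1, sizeof(*new_list));
	if (!new_list)
		return -1;
	if (node->list)
		*new_list = *node->list;
	if (new_list->nb == MAX_RULES) {
		free(new_list);
		return -1;
	}
	new_list->rules[new_list->nb++] = rule;
	/* single pointer switch; the kernel would publish it with RCU */
	free(node->list);
	node->list = new_list;
	return 0;
}

/* Count every rule that applies to a node, walking up to the root. */
static size_t node_count_rules(const struct node *node)
{
	size_t total = 0;

	for (; node; node = node->prev)
		if (node->list)
			total += node->list->nb;
	return total;
}
```

Because a child only keeps a pointer to its parent node, a rule
appended to the parent after the child forked is seen by the next walk
from the child, without touching the child's own list.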
Changes since v3:
* keep the Landlock rules consistent over all the cgroup hierarchy:
update/insert new parent rules over all its children
* do not use a union of pointers but a struct (suggested by Kees Cook)
* internal renaming
Changes since v2:
* new design based on BPF_PROG_ATTACH (suggested by Alexei Starovoitov)
Signed-off-by: Mickaël Salaün <[email protected]>
Cc: Alexei Starovoitov <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Daniel Borkmann <[email protected]>
Cc: Daniel Mack <[email protected]>
Cc: David S. Miller <[email protected]>
Cc: Kees Cook <[email protected]>
Cc: Tejun Heo <[email protected]>
---
include/linux/bpf-cgroup.h | 7 +++
include/linux/cgroup-defs.h | 2 +
include/linux/landlock.h | 12 +++++
include/uapi/linux/bpf.h | 1 +
kernel/bpf/cgroup.c | 117 ++++++++++++++++++++++++++++++++++++--------
kernel/bpf/syscall.c | 11 +++++
security/landlock/lsm.c | 34 ++++++++++++-
security/landlock/manager.c | 63 ++++++++++++++++++++++++
8 files changed, 226 insertions(+), 21 deletions(-)
diff --git a/include/linux/bpf-cgroup.h b/include/linux/bpf-cgroup.h
index aab1aa91c064..1bf77c5a6895 100644
--- a/include/linux/bpf-cgroup.h
+++ b/include/linux/bpf-cgroup.h
@@ -14,8 +14,15 @@ struct sk_buff;
extern struct static_key_false cgroup_bpf_enabled_key;
#define cgroup_bpf_enabled static_branch_unlikely(&cgroup_bpf_enabled_key)
+#ifdef CONFIG_SECURITY_LANDLOCK
+struct landlock_hooks;
+#endif /* CONFIG_SECURITY_LANDLOCK */
+
struct bpf_object {
struct bpf_prog *prog;
+#ifdef CONFIG_SECURITY_LANDLOCK
+ struct landlock_hooks *hooks;
+#endif /* CONFIG_SECURITY_LANDLOCK */
};
struct cgroup_bpf {
diff --git a/include/linux/cgroup-defs.h b/include/linux/cgroup-defs.h
index 861b4677fc5b..fe1023bf7b9d 100644
--- a/include/linux/cgroup-defs.h
+++ b/include/linux/cgroup-defs.h
@@ -301,8 +301,10 @@ struct cgroup {
/* used to schedule release agent */
struct work_struct release_agent_work;
+#ifdef CONFIG_CGROUP_BPF
/* used to store eBPF programs */
struct cgroup_bpf bpf;
+#endif /* CONFIG_CGROUP_BPF */
/* ids of the ancestors at each level including self */
int ancestor_ids[];
diff --git a/include/linux/landlock.h b/include/linux/landlock.h
index 72b4235d255f..eb2d5986c980 100644
--- a/include/linux/landlock.h
+++ b/include/linux/landlock.h
@@ -19,6 +19,10 @@
#include <linux/seccomp.h> /* struct seccomp_filter */
#endif /* CONFIG_SECCOMP_FILTER */
+#ifdef CONFIG_CGROUP_BPF
+#include <linux/cgroup-defs.h> /* struct cgroup */
+#endif /* CONFIG_CGROUP_BPF */
+
struct landlock_rule;
/**
@@ -73,11 +77,19 @@ struct landlock_hooks {
};
void put_landlock_hooks(struct landlock_hooks *hooks);
+void get_landlock_hooks(struct landlock_hooks *hooks);
#ifdef CONFIG_SECCOMP_FILTER
int landlock_seccomp_append_prog(unsigned int flags,
const char __user *user_bpf_fd);
#endif /* CONFIG_SECCOMP_FILTER */
+#ifdef CONFIG_CGROUP_BPF
+struct landlock_hooks *landlock_cgroup_append_prog(struct cgroup *cgrp,
+ struct bpf_prog *prog);
+void landlock_insert_node(struct landlock_hooks *dst,
+ enum landlock_hook hook, struct landlock_hooks *src);
+#endif /* CONFIG_CGROUP_BPF */
+
#endif /* CONFIG_SECURITY_LANDLOCK */
#endif /* _LINUX_LANDLOCK_H */
diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index 5f09eda3ab68..1d36f7d99288 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -124,6 +124,7 @@ enum bpf_prog_type {
enum bpf_attach_type {
BPF_CGROUP_INET_INGRESS,
BPF_CGROUP_INET_EGRESS,
+ BPF_CGROUP_LANDLOCK,
__MAX_BPF_ATTACH_TYPE
};
diff --git a/kernel/bpf/cgroup.c b/kernel/bpf/cgroup.c
index 269b410d890c..c2417416abdf 100644
--- a/kernel/bpf/cgroup.c
+++ b/kernel/bpf/cgroup.c
@@ -15,6 +15,7 @@
#include <linux/bpf.h>
#include <linux/bpf-cgroup.h>
#include <net/sock.h>
+#include <linux/landlock.h>
DEFINE_STATIC_KEY_FALSE(cgroup_bpf_enabled_key);
EXPORT_SYMBOL(cgroup_bpf_enabled_key);
@@ -31,7 +32,15 @@ void cgroup_bpf_put(struct cgroup *cgrp)
struct bpf_object pinned = cgrp->bpf.pinned[type];
if (pinned.prog) {
- bpf_prog_put(pinned.prog);
+ switch (type) {
+ case BPF_CGROUP_LANDLOCK:
+#ifdef CONFIG_SECURITY_LANDLOCK
+ put_landlock_hooks(pinned.hooks);
+ break;
+#endif /* CONFIG_SECURITY_LANDLOCK */
+ default:
+ bpf_prog_put(pinned.prog);
+ }
static_branch_dec(&cgroup_bpf_enabled_key);
}
}
@@ -48,11 +57,30 @@ void cgroup_bpf_inherit(struct cgroup *cgrp, struct cgroup *parent)
for (type = 0; type < ARRAY_SIZE(cgrp->bpf.effective); type++) {
struct bpf_prog *prog;
-
- prog = rcu_dereference_protected(
- parent->bpf.effective[type].prog,
- lockdep_is_held(&cgroup_mutex));
- rcu_assign_pointer(cgrp->bpf.effective[type].prog, prog);
+#ifdef CONFIG_SECURITY_LANDLOCK
+ struct landlock_hooks *hooks;
+#endif /* CONFIG_SECURITY_LANDLOCK */
+
+ switch (type) {
+ case BPF_CGROUP_INET_INGRESS:
+ case BPF_CGROUP_INET_EGRESS:
+ prog = rcu_dereference_protected(
+ parent->bpf.effective[type].prog,
+ lockdep_is_held(&cgroup_mutex));
+ rcu_assign_pointer(cgrp->bpf.effective[type].prog, prog);
+ break;
+ case BPF_CGROUP_LANDLOCK:
+#ifdef CONFIG_SECURITY_LANDLOCK
+ hooks = rcu_dereference_protected(
+ parent->bpf.effective[type].hooks,
+ lockdep_is_held(&cgroup_mutex));
+ rcu_assign_pointer(cgrp->bpf.effective[type].hooks, hooks);
+ get_landlock_hooks(hooks);
+ break;
+#endif /* CONFIG_SECURITY_LANDLOCK */
+ default:
+ WARN_ON(1);
+ }
}
}
@@ -89,31 +117,80 @@ int __cgroup_bpf_update(struct cgroup *cgrp,
enum bpf_attach_type type)
{
struct bpf_prog *old_prog = NULL, *effective_prog;
+#ifdef CONFIG_SECURITY_LANDLOCK
+ struct landlock_hooks *effective_hooks;
+#endif /* CONFIG_SECURITY_LANDLOCK */
struct cgroup_subsys_state *pos;
-
- old_prog = xchg(&cgrp->bpf.pinned[type].prog, prog);
-
- effective_prog = (!prog && parent) ?
- rcu_dereference_protected(parent->bpf.effective[type].prog,
- lockdep_is_held(&cgroup_mutex)) :
- prog;
+ bool had_obj = false;
+
+ switch (type) {
+ case BPF_CGROUP_INET_INGRESS:
+ case BPF_CGROUP_INET_EGRESS:
+ old_prog = xchg(&cgrp->bpf.pinned[type].prog, prog);
+ if (old_prog)
+ had_obj = true;
+ effective_prog = (!prog && parent) ? rcu_dereference_protected(
+ parent->bpf.effective[type].prog,
+ lockdep_is_held(&cgroup_mutex)) : prog;
+ break;
+ case BPF_CGROUP_LANDLOCK:
+#ifdef CONFIG_SECURITY_LANDLOCK
+ /* append hook */
+ had_obj = !!rcu_dereference_protected(
+ cgrp->bpf.pinned[type].hooks,
+ lockdep_is_held(&cgroup_mutex));
+ effective_hooks = landlock_cgroup_append_prog(cgrp, prog);
+ if (IS_ERR(effective_hooks))
+ return PTR_ERR(effective_hooks);
+ break;
+#endif /* CONFIG_SECURITY_LANDLOCK */
+ default:
+ return -EINVAL;
+ }
css_for_each_descendant_pre(pos, &cgrp->self) {
struct cgroup *desc = container_of(pos, struct cgroup, self);
- /* skip the subtree if the descendant has its own program */
- if (desc->bpf.pinned[type].prog && desc != cgrp)
- pos = css_rightmost_descendant(pos);
- else
+ switch (type) {
+ case BPF_CGROUP_INET_INGRESS:
+ case BPF_CGROUP_INET_EGRESS:
+ /*
+ * skip the subtree if the descendant has its own
+ * program
+ */
+ if (desc->bpf.pinned[type].prog && desc != cgrp) {
+ pos = css_rightmost_descendant(pos);
+ break;
+ }
rcu_assign_pointer(desc->bpf.effective[type].prog,
effective_prog);
+ break;
+ case BPF_CGROUP_LANDLOCK:
+#ifdef CONFIG_SECURITY_LANDLOCK
+ /*
+ * extend the subtree hooks if the descendant has its
+ * own hooks
+ */
+ if (desc->bpf.pinned[type].hooks && desc != cgrp) {
+ landlock_insert_node(desc->bpf.pinned[type].hooks,
+ prog->subtype.landlock_rule.hook,
+ effective_hooks);
+ break;
+ }
+ rcu_assign_pointer(desc->bpf.effective[type].hooks,
+ effective_hooks);
+ break;
+#endif /* CONFIG_SECURITY_LANDLOCK */
+ default:
+ WARN_ON(1);
+ }
}
if (prog)
static_branch_inc(&cgroup_bpf_enabled_key);
-
- if (old_prog) {
- bpf_prog_put(old_prog);
+ if (had_obj) {
+ if (old_prog)
+ bpf_prog_put(old_prog);
static_branch_dec(&cgroup_bpf_enabled_key);
}
return 0;
diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index 128acb4f7177..8980b3218203 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -846,6 +846,16 @@ static int bpf_prog_attach(const union bpf_attr *attr)
BPF_PROG_TYPE_CGROUP_SKB);
break;
+ case BPF_CGROUP_LANDLOCK:
+#ifdef CONFIG_SECURITY_LANDLOCK
+ if (!capable(CAP_SYS_ADMIN))
+ return -EPERM;
+
+ prog = bpf_prog_get_type(attr->attach_bpf_fd,
+ BPF_PROG_TYPE_LANDLOCK);
+ break;
+#endif /* CONFIG_SECURITY_LANDLOCK */
+
default:
return -EINVAL;
}
@@ -889,6 +899,7 @@ static int bpf_prog_detach(const union bpf_attr *attr)
cgroup_put(cgrp);
break;
+ case BPF_CGROUP_LANDLOCK:
default:
return -EINVAL;
}
diff --git a/security/landlock/lsm.c b/security/landlock/lsm.c
index 572f4f7f9f19..b5180aa7291f 100644
--- a/security/landlock/lsm.c
+++ b/security/landlock/lsm.c
@@ -9,6 +9,7 @@
*/
#include <asm/current.h>
+#include <linux/bpf-cgroup.h> /* cgroup_bpf_enabled */
#include <linux/bpf.h> /* enum bpf_reg_type, struct landlock_data */
#include <linux/cred.h>
#include <linux/err.h> /* MAX_ERRNO */
@@ -24,6 +25,10 @@
#include <linux/fs.h> /* struct inode */
#include <linux/path.h> /* struct path */
+#ifdef CONFIG_CGROUP_BPF
+#include <linux/cgroup-defs.h> /* struct cgroup */
+#endif /* CONFIG_CGROUP_BPF */
+
#include "checker_fs.h"
#include "common.h"
@@ -151,6 +156,9 @@ static u32 landlock_run_prog(u32 hook_idx, const struct landlock_data *ctx,
static int landlock_enforce(enum landlock_hook hook, __u64 args[6])
{
u32 ret = 0;
+#ifdef CONFIG_CGROUP_BPF
+ struct cgroup *cgrp;
+#endif /* CONFIG_CGROUP_BPF */
u32 hook_idx = get_index(hook);
struct landlock_data ctx = {
@@ -166,8 +174,20 @@ static int landlock_enforce(enum landlock_hook hook, __u64 args[6])
#ifdef CONFIG_SECCOMP_FILTER
ret = landlock_run_prog(hook_idx, &ctx,
current->seccomp.landlock_hooks);
+ if (ret)
+ goto out;
#endif /* CONFIG_SECCOMP_FILTER */
+#ifdef CONFIG_CGROUP_BPF
+ if (cgroup_bpf_enabled) {
+ /* get the default cgroup associated with the current thread */
+ cgrp = task_css_set(current)->dfl_cgrp;
+ ret = landlock_run_prog(hook_idx, &ctx,
+ cgrp->bpf.effective[BPF_CGROUP_LANDLOCK].hooks);
+ }
+#endif /* CONFIG_CGROUP_BPF */
+
+out:
return -ret;
}
@@ -282,9 +302,21 @@ static struct security_hook_list landlock_hooks[] = {
LANDLOCK_HOOK_INIT(inode_getattr),
};
+#ifdef CONFIG_SECCOMP_FILTER
+#ifdef CONFIG_CGROUP_BPF
+#define LANDLOCK_MANAGERS "seccomp and cgroups"
+#else /* CONFIG_CGROUP_BPF */
+#define LANDLOCK_MANAGERS "seccomp"
+#endif /* CONFIG_CGROUP_BPF */
+#elif defined(CONFIG_CGROUP_BPF)
+#define LANDLOCK_MANAGERS "cgroups"
+#else
+#error "Need CONFIG_SECCOMP_FILTER or CONFIG_CGROUP_BPF"
+#endif /* CONFIG_SECCOMP_FILTER */
+
void __init landlock_add_hooks(void)
{
- pr_info("landlock: Becoming ready to sandbox with seccomp\n");
+ pr_info("landlock: Becoming ready to sandbox with " LANDLOCK_MANAGERS "\n");
security_add_hooks(landlock_hooks, ARRAY_SIZE(landlock_hooks));
}
diff --git a/security/landlock/manager.c b/security/landlock/manager.c
index 56e99ccd5708..18ae6c0c4dbf 100644
--- a/security/landlock/manager.c
+++ b/security/landlock/manager.c
@@ -20,6 +20,11 @@
#include <linux/types.h> /* atomic_t */
#include <linux/uaccess.h> /* copy_from_user() */
+#ifdef CONFIG_CGROUP_BPF
+#include <linux/bpf-cgroup.h> /* struct cgroup_bpf */
+#include <linux/cgroup-defs.h> /* struct cgroup */
+#endif /* CONFIG_CGROUP_BPF */
+
#include "common.h"
static void put_landlock_rule(struct landlock_rule *rule)
@@ -68,6 +73,12 @@ void put_landlock_hooks(struct landlock_hooks *hooks)
}
}
+void get_landlock_hooks(struct landlock_hooks *hooks)
+{
+ if (hooks)
+ atomic_inc(&hooks->usage);
+}
+
static struct landlock_hooks *new_raw_landlock_hooks(void)
{
struct landlock_hooks *ret;
@@ -314,3 +325,55 @@ int landlock_seccomp_append_prog(unsigned int flags, const char __user *user_bpf
return 0;
}
#endif /* CONFIG_SECCOMP_FILTER */
+
+/**
+ * landlock_cgroup_append_prog - attach a Landlock program to a cgroup
+ *
+ * Must be called with cgroup_mutex held.
+ *
+ * @cgrp: non-NULL cgroup pointer to attach to
+ * @prog: Landlock program pointer
+ */
+#ifdef CONFIG_CGROUP_BPF
+struct landlock_hooks *landlock_cgroup_append_prog(struct cgroup *cgrp,
+ struct bpf_prog *prog)
+{
+ if (!prog)
+ return ERR_PTR(-EINVAL);
+
+ /* copy the inherited hooks and append a new one */
+ return landlock_append_prog(cgrp->bpf.effective[BPF_CGROUP_LANDLOCK].hooks,
+ prog);
+}
+
+/**
+ * landlock_insert_node - insert a Landlock node in an existing hook
+ *
+ * This is useful to keep a consistent hierarchy tree whenever a branch adds
+ * its own rules. However, this must be called at every new rule addition to
+ * keep it consistent.
+ *
+ * @dst: Landlock hooks to update. They must not have more than one
+ * missing/desynchronized node to keep the same hierarchy as @src.
+ * @hook: hook to synchronize.
+ * @src: Landlock hooks reference.
+ */
+void landlock_insert_node(struct landlock_hooks *dst,
+ enum landlock_hook hook, struct landlock_hooks *src)
+{
+ struct landlock_node **walker;
+ u32 hook_idx = get_index(hook);
+
+ for (walker = &dst->nodes[hook_idx]; *walker;
+ walker = &(*walker)->prev) {
+ if (*walker == src->nodes[hook_idx])
+ return;
+ /* assume that the parent node was inherited */
+ if (*walker == src->nodes[hook_idx]->prev)
+ break;
+ }
+ atomic_inc(&src->nodes[hook_idx]->usage);
+ put_landlock_node(*walker);
+ smp_store_release(walker, src->nodes[hook_idx]);
+}
+#endif /* CONFIG_CGROUP_BPF */
--
2.9.3
For now, the update and debug accesses are only available to a process
with CAP_SYS_ADMIN. This could change in the future.
The capability check is done statically when loading an eBPF program,
according to the current process. If the process has enough rights and
sets the appropriate access flags, then the dedicated functions or data
will be accessible.
With the update access, the following functions are available:
* bpf_map_lookup_elem
* bpf_map_update_elem
* bpf_map_delete_elem
* bpf_tail_call
With the debug access, the following functions are available:
* bpf_trace_printk
* bpf_get_prandom_u32
* bpf_get_current_pid_tgid
* bpf_get_current_uid_gid
* bpf_get_current_comm
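The gating described above can be summarized with the following
userland sketch. The two access flags use the values from this patch;
everything else (ACCESS_MASK, enum helper_class, access_is_valid(),
helper_is_allowed() and the has_cap_sys_admin parameter) is
hypothetical and only models the logic. Rejecting unknown access bits
is an assumption based on the existing access mask.

```c
#include <assert.h>
#include <stdint.h>

#define LANDLOCK_SUBTYPE_ACCESS_UPDATE	(1 << 0)
#define LANDLOCK_SUBTYPE_ACCESS_DEBUG	(1 << 1)
/* hypothetical name for the mask of known access bits */
#define ACCESS_MASK			((1ULL << 2) - 1)

/* Hypothetical classes standing in for groups of eBPF function IDs. */
enum helper_class {
	HELPER_LANDLOCK,	/* bpf_landlock_* helpers, always available */
	HELPER_UPDATE,		/* map lookup/update/delete and tail call */
	HELPER_DEBUG,		/* trace_printk, prandom, current pid/uid/comm */
};

/* Load-time check: privileged access bits need the capability. */
static int access_is_valid(uint64_t access, int has_cap_sys_admin)
{
	/* assumption: unknown bits are refused */
	if (access & ~ACCESS_MASK)
		return 0;
	if ((access & (LANDLOCK_SUBTYPE_ACCESS_UPDATE |
			LANDLOCK_SUBTYPE_ACCESS_DEBUG)) &&
			!has_cap_sys_admin)
		return 0;
	return 1;
}

/* Verifier-time check: is a helper of this class usable by the rule? */
static int helper_is_allowed(enum helper_class class, uint64_t access)
{
	switch (class) {
	case HELPER_LANDLOCK:
		return 1;
	case HELPER_UPDATE:
		return !!(access & LANDLOCK_SUBTYPE_ACCESS_UPDATE);
	case HELPER_DEBUG:
		return !!(access & LANDLOCK_SUBTYPE_ACCESS_DEBUG);
	}
	return 0;
}
```

Splitting the check in two steps mirrors the text: the capability is
verified once at load time, and the access flags then select which
functions the program is allowed to call.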
Signed-off-by: Mickaël Salaün <[email protected]>
Cc: Alexei Starovoitov <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Daniel Borkmann <[email protected]>
Cc: David S. Miller <[email protected]>
Cc: Kees Cook <[email protected]>
Cc: Sargun Dhillon <[email protected]>
---
include/uapi/linux/bpf.h | 4 +++-
security/landlock/lsm.c | 53 ++++++++++++++++++++++++++++++++++++++++++++++++
2 files changed, 56 insertions(+), 1 deletion(-)
diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index 1d36f7d99288..013f661e27f8 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -607,7 +607,9 @@ enum landlock_hook {
#define _LANDLOCK_HOOK_LAST LANDLOCK_HOOK_INODE_GETATTR
/* eBPF context and functions allowed for a rule */
-#define _LANDLOCK_SUBTYPE_ACCESS_MASK ((1ULL << 0) - 1)
+#define LANDLOCK_SUBTYPE_ACCESS_UPDATE (1 << 0)
+#define LANDLOCK_SUBTYPE_ACCESS_DEBUG (1 << 1)
+#define _LANDLOCK_SUBTYPE_ACCESS_MASK ((1ULL << 2) - 1)
/*
* (future) options for a Landlock rule (e.g. run even if a previous rule
diff --git a/security/landlock/lsm.c b/security/landlock/lsm.c
index b5180aa7291f..1d924d2414f2 100644
--- a/security/landlock/lsm.c
+++ b/security/landlock/lsm.c
@@ -194,12 +194,57 @@ static int landlock_enforce(enum landlock_hook hook, __u64 args[6])
static const struct bpf_func_proto *bpf_landlock_func_proto(
enum bpf_func_id func_id, union bpf_prog_subtype *prog_subtype)
{
+ bool access_update = !!(prog_subtype->landlock_rule.access &
+ LANDLOCK_SUBTYPE_ACCESS_UPDATE);
+ bool access_debug = !!(prog_subtype->landlock_rule.access &
+ LANDLOCK_SUBTYPE_ACCESS_DEBUG);
+
switch (func_id) {
case BPF_FUNC_landlock_get_fs_mode:
return &bpf_landlock_get_fs_mode_proto;
case BPF_FUNC_landlock_cmp_fs_beneath:
return &bpf_landlock_cmp_fs_beneath_proto;
+ /* access_update */
+ case BPF_FUNC_map_lookup_elem:
+ if (access_update)
+ return &bpf_map_lookup_elem_proto;
+ return NULL;
+ case BPF_FUNC_map_update_elem:
+ if (access_update)
+ return &bpf_map_update_elem_proto;
+ return NULL;
+ case BPF_FUNC_map_delete_elem:
+ if (access_update)
+ return &bpf_map_delete_elem_proto;
+ return NULL;
+ case BPF_FUNC_tail_call:
+ if (access_update)
+ return &bpf_tail_call_proto;
+ return NULL;
+
+ /* access_debug */
+ case BPF_FUNC_trace_printk:
+ if (access_debug)
+ return bpf_get_trace_printk_proto();
+ return NULL;
+ case BPF_FUNC_get_prandom_u32:
+ if (access_debug)
+ return &bpf_get_prandom_u32_proto;
+ return NULL;
+ case BPF_FUNC_get_current_pid_tgid:
+ if (access_debug)
+ return &bpf_get_current_pid_tgid_proto;
+ return NULL;
+ case BPF_FUNC_get_current_uid_gid:
+ if (access_debug)
+ return &bpf_get_current_uid_gid_proto;
+ return NULL;
+ case BPF_FUNC_get_current_comm:
+ if (access_debug)
+ return &bpf_get_current_comm_proto;
+ return NULL;
+
default:
return NULL;
}
@@ -373,6 +418,14 @@ static inline bool bpf_landlock_is_valid_subtype(
if (prog_subtype->landlock_rule.option & ~_LANDLOCK_SUBTYPE_OPTION_MASK)
return false;
+ /* check access flags */
+ if (prog_subtype->landlock_rule.access & LANDLOCK_SUBTYPE_ACCESS_UPDATE &&
+ !capable(CAP_SYS_ADMIN))
+ return false;
+ if (prog_subtype->landlock_rule.access & LANDLOCK_SUBTYPE_ACCESS_DEBUG &&
+ !capable(CAP_SYS_ADMIN))
+ return false;
+
return true;
}
--
2.9.3
Add eBPF functions to compare file system access with a Landlock file
system handle:
* bpf_landlock_cmp_fs_beneath(opt, map, map_op, fs_arg)
This function allows an eBPF program to check if the currently accessed
file is the same as, or in the hierarchy of, a reference handle.
* bpf_landlock_get_fs_mode(arg_fs)
This function returns the mode of a file. This is useful to check if
a process tries to walk through a directory.
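As an illustration, here is how the value returned by
bpf_landlock_get_fs_mode() could be interpreted, written as plain C
instead of eBPF instructions. It follows the checks done in the sandbox
sample (error range first, then file type). The names is_err_value()
and mode_is_dir() are hypothetical, and treating the returned value as
an unsigned 64-bit quantity is an assumption taken from that sample.

```c
#define _DEFAULT_SOURCE	/* S_IFMT and S_IFDIR */
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <sys/stat.h>

#define MAX_ERRNO 4095

/* An error is encoded in the last MAX_ERRNO values of the u64 range. */
static int is_err_value(uint64_t ret)
{
	return ret >= (uint64_t)-MAX_ERRNO;
}

/* Return 1 only for a valid mode which describes a directory. */
static int mode_is_dir(uint64_t ret)
{
	if (is_err_value(ret))
		return 0;
	return (ret & S_IFMT) == S_IFDIR;
}
```

Checking the error range before masking with S_IFMT matters: a negative
errno stored in an unsigned value has most of its bits set, so it must
be ruled out before the file type bits are compared.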
The goal of a file system handle is to abstract kernel objects such as a
struct file or a struct inode. Userland can create this kind of handle
with the BPF_MAP_UPDATE_ELEM command. The element is a struct
landlock_handle containing the handle type (e.g.
BPF_MAP_HANDLE_TYPE_LANDLOCK_FS_FD) and a file descriptor. This could
also be any description able to match a struct file or a struct inode
(e.g. path or glob string).
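To illustrate the return convention of bpf_landlock_cmp_fs_beneath(), here
is a minimal userspace sketch. It is only a model: it compares absolute path
strings instead of struct path or dentry objects, and every model_* name is
hypothetical and not part of this patch. It assumes non-empty absolute paths
and only mirrors the BPF_MAP_ARRAY_OP_OR behavior and the
LANDLOCK_FLAG_OPT_REVERSE option.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for LANDLOCK_FLAG_OPT_REVERSE. */
#define MODEL_OPT_REVERSE (1 << 0)

/* True if path is the same as dir or lies beneath it (string model only). */
static bool model_is_under(const char *path, const char *dir)
{
	size_t len = strlen(dir);

	if (strncmp(path, dir, len) != 0)
		return false;
	/* a dir ending with '/' (e.g. the root) already ends on a boundary */
	if (len > 0 && dir[len - 1] == '/')
		return true;
	/* match on a component boundary: "/usr" covers "/usr/lib", not "/usrx" */
	return path[len] == '\0' || path[len] == '/';
}

/*
 * Return 0 if accessed is the same as or beneath at least one handle
 * (OR semantics), 1 otherwise, or a negative value on invalid input.
 * With MODEL_OPT_REVERSE, the roles are swapped: the check succeeds if a
 * handle is the same as or beneath the accessed path.
 */
static int model_cmp_fs_beneath(unsigned int opt, const char *const *handles,
				size_t n_handles, const char *accessed)
{
	size_t i;

	assert(handles || n_handles == 0);
	if (!accessed)
		return -1;
	/* reject unknown option bits */
	if (opt & ~MODEL_OPT_REVERSE)
		return -1;

	for (i = 0; i < n_handles; i++) {
		const char *child = accessed, *parent = handles[i];

		if (opt & MODEL_OPT_REVERSE) {
			child = handles[i];
			parent = accessed;
		}
		if (model_is_under(child, parent))
			return 0;
	}
	return 1;
}
```

A rule built on the real helper would typically allow the access when the
helper returns 0 and deny it otherwise. The actual policy decision is left to
the eBPF program.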
Changes since v3:
* remove bpf_landlock_cmp_fs_prop() (suggested by Alexei Starovoitov)
* add hooks dealing with struct inode and struct path pointers:
inode_permission and inode_getattr
* add abstraction over eBPF helper arguments thanks to wrapping structs
* add bpf_landlock_get_fs_mode() helper to check file type and mode
* merge WARN_ON() (suggested by Kees Cook)
* fix and update bpf_helpers.h
* use BPF_CALL_* for eBPF helpers (suggested by Alexei Starovoitov)
* make handle arraymap safe (RCU) and remove buggy synchronize_rcu()
* factor out the arraymap walk
* use size_t to index array (suggested by Jann Horn)
Changes since v2:
* add MNT_INTERNAL check to only add file handle from user-visible FS
(e.g. no anonymous inode)
* replace struct file* with struct path* in map_landlock_handle
* add BPF protos
* fix bpf_landlock_cmp_fs_prop_with_struct_file()
Signed-off-by: Mickaël Salaün <[email protected]>
Cc: Alexei Starovoitov <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Daniel Borkmann <[email protected]>
Cc: David S. Miller <[email protected]>
Cc: James Morris <[email protected]>
Cc: Kees Cook <[email protected]>
Cc: Serge E. Hallyn <[email protected]>
Cc: Jann Horn <[email protected]>
Link: https://lkml.kernel.org/r/CALCETrWwTiz3kZTkEgOW24-DvhQq6LftwEXh77FD2G5o71yD7g@mail.gmail.com
Link: https://lkml.kernel.org/r/[email protected]
---
include/linux/bpf.h | 5 ++
include/uapi/linux/bpf.h | 35 ++++++++++
samples/bpf/bpf_helpers.h | 5 ++
security/landlock/Makefile | 2 +-
security/landlock/checker_fs.c | 152 +++++++++++++++++++++++++++++++++++++++++
security/landlock/checker_fs.h | 20 ++++++
security/landlock/common.h | 13 ++++
security/landlock/lsm.c | 6 ++
8 files changed, 237 insertions(+), 1 deletion(-)
create mode 100644 security/landlock/checker_fs.c
create mode 100644 security/landlock/checker_fs.h
diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index e7ce49642f50..50fbeaac03fe 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -363,6 +363,11 @@ extern const struct bpf_func_proto bpf_skb_vlan_push_proto;
extern const struct bpf_func_proto bpf_skb_vlan_pop_proto;
extern const struct bpf_func_proto bpf_get_stackid_proto;
+#ifdef CONFIG_SECURITY_LANDLOCK
+extern const struct bpf_func_proto bpf_landlock_cmp_fs_beneath_proto;
+extern const struct bpf_func_proto bpf_landlock_get_fs_mode_proto;
+#endif /* CONFIG_SECURITY_LANDLOCK */
+
/* Shared helpers among cBPF and eBPF. */
void bpf_user_rnd_init_once(void);
u64 bpf_user_rnd_u32(u64 r1, u64 r2, u64 r3, u64 r4, u64 r5);
diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index b6b531a868c0..5f09eda3ab68 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -101,6 +101,13 @@ enum bpf_map_handle_type {
/* BPF_MAP_HANDLE_TYPE_LANDLOCK_FS_GLOB, */
};
+enum bpf_map_array_op {
+ BPF_MAP_ARRAY_OP_UNSPEC,
+ BPF_MAP_ARRAY_OP_OR,
+ BPF_MAP_ARRAY_OP_AND,
+ BPF_MAP_ARRAY_OP_XOR,
+};
+
enum bpf_prog_type {
BPF_PROG_TYPE_UNSPEC,
BPF_PROG_TYPE_SOCKET_FILTER,
@@ -465,6 +472,30 @@ enum bpf_func_id {
*/
BPF_FUNC_set_hash_invalid,
+ /**
+ * bpf_landlock_cmp_fs_beneath(opt, map, map_op, arg_fs)
+ * Check if a struct inode is a leaf of file system handles
+ *
+ * @opt: check options (e.g. LANDLOCK_FLAG_OPT_REVERSE)
+ * @map: handles to compare against
+ * @map_op: which elements of the map to use (e.g. BPF_MAP_ARRAY_OP_OR)
+ * @arg_fs: struct landlock_arg_fs address to compare with
+ *
+ * Return: 0 if the file is the same or beneath the handles,
+ * 1 otherwise, or a negative value if an error occurred.
+ */
+ BPF_FUNC_landlock_cmp_fs_beneath,
+
+ /**
+ * bpf_landlock_get_fs_mode(arg_fs)
+ * Get the mode of a struct landlock_arg_fs
+ *
+ * @arg_fs: struct landlock_arg_fs address
+ *
+ * Return: the file mode
+ */
+ BPF_FUNC_landlock_get_fs_mode,
+
__BPF_FUNC_MAX_ID,
};
@@ -583,6 +614,10 @@ enum landlock_hook {
*/
#define _LANDLOCK_SUBTYPE_OPTION_MASK ((1ULL << 0) - 1)
+/* Handle option flags */
+#define LANDLOCK_FLAG_OPT_REVERSE (1<<0)
+#define _LANDLOCK_FLAG_OPT_MASK ((1ULL << 1) - 1)
+
/* Map handle entry */
struct landlock_handle {
__u32 type; /* enum bpf_map_handle_type */
diff --git a/samples/bpf/bpf_helpers.h b/samples/bpf/bpf_helpers.h
index 90f44bd2045e..52fa1ab1c0c4 100644
--- a/samples/bpf/bpf_helpers.h
+++ b/samples/bpf/bpf_helpers.h
@@ -57,6 +57,11 @@ static int (*bpf_skb_set_tunnel_opt)(void *ctx, void *md, int size) =
(void *) BPF_FUNC_skb_set_tunnel_opt;
static unsigned long long (*bpf_get_prandom_u32)(void) =
(void *) BPF_FUNC_get_prandom_u32;
+static unsigned long long (*bpf_landlock_cmp_fs_beneath)
+ (int option, void *map, int map_op, void *arg_fs) =
+ (void *) BPF_FUNC_landlock_cmp_fs_beneath;
+static unsigned long long (*bpf_landlock_get_fs_mode)(void *arg_fs) =
+ (void *) BPF_FUNC_landlock_get_fs_mode;
/* llvm builtin functions that eBPF C program may use to
* emit BPF_LD_ABS and BPF_LD_IND instructions
diff --git a/security/landlock/Makefile b/security/landlock/Makefile
index 59669d70bc7e..27f359a8cfaa 100644
--- a/security/landlock/Makefile
+++ b/security/landlock/Makefile
@@ -1,3 +1,3 @@
obj-$(CONFIG_SECURITY_LANDLOCK) := landlock.o
-landlock-y := lsm.o
+landlock-y := lsm.o checker_fs.o
diff --git a/security/landlock/checker_fs.c b/security/landlock/checker_fs.c
new file mode 100644
index 000000000000..01a929a269e6
--- /dev/null
+++ b/security/landlock/checker_fs.c
@@ -0,0 +1,152 @@
+/*
+ * Landlock LSM - File System Checkers
+ *
+ * Copyright (C) 2016 Mickaël Salaün <[email protected]>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2, as
+ * published by the Free Software Foundation.
+ */
+
+#include <linux/bpf.h> /* enum bpf_map_array_op */
+#include <linux/errno.h>
+#include <linux/filter.h> /* BPF_CALL*() */
+#include <linux/fs.h> /* path_is_under() */
+#include <linux/path.h> /* struct path */
+
+#include "common.h" /* struct landlock_arg_fs */
+#include "checker_fs.h"
+
+/*
+ * bpf_landlock_cmp_fs_beneath
+ *
+ * Cf. include/uapi/linux/bpf.h
+ */
+BPF_CALL_4(bpf_landlock_cmp_fs_beneath, u8, option, struct bpf_map *, map,
+ enum bpf_map_array_op, map_op,
+ struct landlock_arg_fs *, arg_fs)
+{
+ struct bpf_array *array = container_of(map, struct bpf_array, map);
+ const struct path *p1 = NULL, *p2 = NULL;
+ struct dentry *d1 = NULL, *d2 = NULL;
+ struct map_landlock_handle *handle;
+ size_t i;
+
+ if (WARN_ON(!map))
+ return -EFAULT;
+ if (WARN_ON(!arg_fs))
+ return -EFAULT;
+ if (unlikely((option | _LANDLOCK_FLAG_OPT_MASK) != _LANDLOCK_FLAG_OPT_MASK))
+ return -EINVAL;
+
+ if (!arg_fs->file) {
+ /* file can be null for anonymous mmap */
+ WARN_ON(arg_fs->type != LANDLOCK_ARGTYPE_FILE);
+ return -ENOENT;
+ }
+
+ /* for now, only handle OP_OR */
+ switch (map_op) {
+ case BPF_MAP_ARRAY_OP_OR:
+ break;
+ case BPF_MAP_ARRAY_OP_UNSPEC:
+ case BPF_MAP_ARRAY_OP_AND:
+ case BPF_MAP_ARRAY_OP_XOR:
+ default:
+ return -EINVAL;
+ }
+ switch (arg_fs->type) {
+ case LANDLOCK_ARGTYPE_FILE:
+ p1 = &arg_fs->file->f_path;
+ break;
+ case LANDLOCK_ARGTYPE_PATH:
+ p1 = arg_fs->path;
+ break;
+ case LANDLOCK_ARGTYPE_INODE:
+ d1 = d_find_alias(arg_fs->inode);
+ if (WARN_ON(!d1))
+ return -ENOENT;
+ break;
+ case LANDLOCK_ARGTYPE_NONE:
+ default:
+ WARN_ON(1);
+ return -EFAULT;
+ }
+ /* {p,d}1 and {p,d}2 will be set correctly in the loop */
+ p2 = p1;
+ d2 = d1;
+
+ if (p1) {
+ for_each_handle(i, handle, array) {
+ if (WARN_ON(handle->type != BPF_MAP_HANDLE_TYPE_LANDLOCK_FS_FD))
+ return -EINVAL;
+
+ if (option & LANDLOCK_FLAG_OPT_REVERSE)
+ p2 = &handle->path;
+ else
+ p1 = &handle->path;
+
+ if (path_is_under(p2, p1))
+ return 0;
+ }
+ } else if (d1) {
+ for_each_handle(i, handle, array) {
+ if (WARN_ON(handle->type != BPF_MAP_HANDLE_TYPE_LANDLOCK_FS_FD))
+ return -EINVAL;
+
+ if (option & LANDLOCK_FLAG_OPT_REVERSE)
+ d2 = handle->path.dentry;
+ else
+ d1 = handle->path.dentry;
+
+ if (is_subdir(d2, d1))
+ return 0;
+ }
+ }
+ return 1;
+}
+
+const struct bpf_func_proto bpf_landlock_cmp_fs_beneath_proto = {
+ .func = bpf_landlock_cmp_fs_beneath,
+ .gpl_only = true,
+ .ret_type = RET_INTEGER,
+ .arg1_type = ARG_ANYTHING,
+ .arg2_type = ARG_CONST_PTR_TO_LANDLOCK_HANDLE_FS,
+ .arg3_type = ARG_ANYTHING,
+ .arg4_type = ARG_CONST_PTR_TO_LANDLOCK_ARG_FS,
+};
+
+BPF_CALL_1(bpf_landlock_get_fs_mode, struct landlock_arg_fs *, arg_fs)
+{
+ if (WARN_ON(!arg_fs))
+ return -EFAULT;
+ if (!arg_fs->file) {
+ /* file can be null for anonymous mmap */
+ WARN_ON(arg_fs->type != LANDLOCK_ARGTYPE_FILE);
+ return -ENOENT;
+ }
+ switch (arg_fs->type) {
+ case LANDLOCK_ARGTYPE_FILE:
+ if (WARN_ON(!arg_fs->file->f_inode))
+ return -ENOENT;
+ return arg_fs->file->f_inode->i_mode;
+ case LANDLOCK_ARGTYPE_INODE:
+ return arg_fs->inode->i_mode;
+ case LANDLOCK_ARGTYPE_PATH:
+ if (WARN_ON(!arg_fs->path->dentry ||
+ !arg_fs->path->dentry->d_inode))
+ return -ENOENT;
+ return arg_fs->path->dentry->d_inode->i_mode;
+ case LANDLOCK_ARGTYPE_NONE:
+ default:
+ WARN_ON(1);
+ return -EFAULT;
+ }
+}
+
+const struct bpf_func_proto bpf_landlock_get_fs_mode_proto = {
+ .func = bpf_landlock_get_fs_mode,
+ .gpl_only = true,
+ .ret_type = RET_INTEGER,
+ .arg1_type = ARG_CONST_PTR_TO_LANDLOCK_ARG_FS,
+};
diff --git a/security/landlock/checker_fs.h b/security/landlock/checker_fs.h
new file mode 100644
index 000000000000..8bcdc9cba2b8
--- /dev/null
+++ b/security/landlock/checker_fs.h
@@ -0,0 +1,20 @@
+/*
+ * Landlock LSM - File System Checkers
+ *
+ * Copyright (C) 2016 Mickaël Salaün <[email protected]>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2, as
+ * published by the Free Software Foundation.
+ */
+
+#ifndef _SECURITY_LANDLOCK_CHECKER_FS_H
+#define _SECURITY_LANDLOCK_CHECKER_FS_H
+
+#include <linux/fs.h>
+#include <linux/seccomp.h>
+
+extern const struct bpf_func_proto bpf_landlock_cmp_fs_beneath_proto;
+extern const struct bpf_func_proto bpf_landlock_get_fs_mode_proto;
+
+#endif /* _SECURITY_LANDLOCK_CHECKER_FS_H */
diff --git a/security/landlock/common.h b/security/landlock/common.h
index dd64e6391dd8..a30aa93dc1ae 100644
--- a/security/landlock/common.h
+++ b/security/landlock/common.h
@@ -15,6 +15,19 @@
#include <linux/fs.h> /* struct file, struct inode */
#include <linux/path.h> /* struct path */
+/**
+ * for_each_handle - iterate over all handles of an arraymap
+ *
+ * @i: index in the arraymap
+ * @handle: struct map_landlock_handle pointer
+ * @array: struct bpf_array pointer to walk through
+ */
+#define for_each_handle(i, handle, array) \
+ for (i = 0; i < atomic_read(&array->n_entries) && \
+ (handle = *((struct map_landlock_handle **) \
+ (array->value + array->elem_size * i)));\
+ i++)
+
enum landlock_argtype {
LANDLOCK_ARGTYPE_NONE,
LANDLOCK_ARGTYPE_FILE,
diff --git a/security/landlock/lsm.c b/security/landlock/lsm.c
index b3d154275be6..b3c107244df9 100644
--- a/security/landlock/lsm.c
+++ b/security/landlock/lsm.c
@@ -22,6 +22,7 @@
#include <linux/fs.h> /* struct inode */
#include <linux/path.h> /* struct path */
+#include "checker_fs.h"
#include "common.h"
#define MAP0(s, m, ...)
@@ -170,6 +171,11 @@ static const struct bpf_func_proto *bpf_landlock_func_proto(
enum bpf_func_id func_id, union bpf_prog_subtype *prog_subtype)
{
switch (func_id) {
+ case BPF_FUNC_landlock_get_fs_mode:
+ return &bpf_landlock_get_fs_mode_proto;
+ case BPF_FUNC_landlock_cmp_fs_beneath:
+ return &bpf_landlock_cmp_fs_beneath_proto;
+
default:
return NULL;
}
--
2.9.3
Initial Landlock Kconfig needed to split the Landlock eBPF and seccomp
parts to ease the review.
Changes since v2:
* add a dependency on seccomp filter or cgroups (with support for
attached eBPF programs)
Signed-off-by: Mickaël Salaün <[email protected]>
Cc: James Morris <[email protected]>
Cc: Kees Cook <[email protected]>
Cc: Serge E. Hallyn <[email protected]>
---
security/Kconfig | 1 +
security/landlock/Kconfig | 23 +++++++++++++++++++++++
2 files changed, 24 insertions(+)
create mode 100644 security/landlock/Kconfig
diff --git a/security/Kconfig b/security/Kconfig
index 118f4549404e..c63194c561c5 100644
--- a/security/Kconfig
+++ b/security/Kconfig
@@ -164,6 +164,7 @@ source security/tomoyo/Kconfig
source security/apparmor/Kconfig
source security/loadpin/Kconfig
source security/yama/Kconfig
+source security/landlock/Kconfig
source security/integrity/Kconfig
diff --git a/security/landlock/Kconfig b/security/landlock/Kconfig
new file mode 100644
index 000000000000..dec64270b06d
--- /dev/null
+++ b/security/landlock/Kconfig
@@ -0,0 +1,23 @@
+config SECURITY_LANDLOCK
+ bool "Landlock sandbox support"
+ depends on SECURITY
+ depends on BPF_SYSCALL
+ depends on SECCOMP_FILTER || CGROUP_BPF
+ default y
+ help
+ Landlock is a stacked LSM which allows any user to load a security
+ policy to restrict their processes (i.e. create a sandbox). The
+ policy is a list of stacked eBPF programs for some LSM hooks. Each
+ program can do some access comparison to check if an access request
+ is legitimate.
+
+ You need to enable seccomp filter and/or cgroups (with eBPF programs
+ attached support) to apply a security policy to either a process
+ hierarchy (e.g. application with built-in sandboxing) or a group of
+ processes (e.g. container sandboxing). It is recommended to enable
+ both seccomp filter and cgroups.
+
+ Further information about eBPF can be found in
+ Documentation/networking/filter.txt
+
+ If you are unsure how to answer this question, answer Y.
--
2.9.3
Add 8 file system-related hooks:
* file_open
* file_permission
* mmap_file
* inode_create
* inode_link
* inode_unlink
* inode_permission
* inode_getattr
These hook arguments are available to the Landlock rules in the eBPF
context as pointers. These pointers are an abstraction over the
underlying raw types. For now, the ARG_CONST_PTR_TO_LANDLOCK_ARG_FS type
is used for struct file, struct inode and struct path pointers.
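The wrapping idea can be sketched in plain userspace C as a tagged union
with a single accessor that dispatches on the tag. This is only a model of
the approach: the model_* types are hypothetical stand-ins for struct file,
struct inode and struct path, and -1 stands in for the negative error codes
of the real helper.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel objects. */
struct model_inode { unsigned int mode; };
struct model_file { struct model_inode *inode; };
struct model_path { struct model_inode *inode; };

enum model_argtype {
	MODEL_ARGTYPE_NONE,
	MODEL_ARGTYPE_FILE,
	MODEL_ARGTYPE_INODE,
	MODEL_ARGTYPE_PATH,
};

/* One wrapper type for every file system hook argument. */
struct model_arg_fs {
	enum model_argtype type;
	union {
		struct model_file *file;
		struct model_inode *inode;
		const struct model_path *path;
	};
};

/* Return the mode of the wrapped object, or -1 if it is not available. */
static long model_get_fs_mode(const struct model_arg_fs *arg)
{
	assert(arg);

	switch (arg->type) {
	case MODEL_ARGTYPE_FILE:
		if (!arg->file || !arg->file->inode)
			return -1;
		return (long)arg->file->inode->mode;
	case MODEL_ARGTYPE_INODE:
		if (!arg->inode)
			return -1;
		return (long)arg->inode->mode;
	case MODEL_ARGTYPE_PATH:
		if (!arg->path || !arg->path->inode)
			return -1;
		return (long)arg->path->inode->mode;
	case MODEL_ARGTYPE_NONE:
	default:
		return -1;
	}
}
```

The caller never needs to know which raw type a hook provides. This is the
property that lets the same helper serve hooks with different argument types.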
Changes since v3:
* split commit
* add hooks dealing with struct inode and struct path pointers:
inode_permission and inode_getattr
* add abstraction over eBPF helper arguments thanks to wrapping structs
Signed-off-by: Mickaël Salaün <[email protected]>
Cc: Alexei Starovoitov <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Daniel Borkmann <[email protected]>
Cc: David S. Miller <[email protected]>
Cc: James Morris <[email protected]>
Cc: Kees Cook <[email protected]>
Cc: Serge E. Hallyn <[email protected]>
---
include/linux/bpf.h | 2 +
include/linux/lsm_hooks.h | 5 ++
include/uapi/linux/bpf.h | 10 ++-
kernel/bpf/verifier.c | 6 ++
security/landlock/common.h | 18 +++++
security/landlock/lsm.c | 173 +++++++++++++++++++++++++++++++++++++++++++++
security/security.c | 1 +
7 files changed, 214 insertions(+), 1 deletion(-)
diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index 2cca9fc8b72b..e7ce49642f50 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -88,6 +88,7 @@ enum bpf_arg_type {
ARG_ANYTHING, /* any (initialized) argument is ok */
ARG_CONST_PTR_TO_LANDLOCK_HANDLE_FS, /* pointer to Landlock FS map handle */
+ ARG_CONST_PTR_TO_LANDLOCK_ARG_FS, /* pointer to Landlock FS hook argument */
};
/* type of values returned from helper functions */
@@ -157,6 +158,7 @@ enum bpf_reg_type {
/* Landlock */
CONST_PTR_TO_LANDLOCK_HANDLE_FS,
+ CONST_PTR_TO_LANDLOCK_ARG_FS,
};
struct bpf_prog;
diff --git a/include/linux/lsm_hooks.h b/include/linux/lsm_hooks.h
index 558adfa5c8a8..069af34301d4 100644
--- a/include/linux/lsm_hooks.h
+++ b/include/linux/lsm_hooks.h
@@ -1933,5 +1933,10 @@ void __init loadpin_add_hooks(void);
#else
static inline void loadpin_add_hooks(void) { };
#endif
+#ifdef CONFIG_SECURITY_LANDLOCK
+extern void __init landlock_add_hooks(void);
+#else
+static inline void __init landlock_add_hooks(void) { }
+#endif /* CONFIG_SECURITY_LANDLOCK */
#endif /* ! __LINUX_LSM_HOOKS_H */
diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index 335616ab63ff..b6b531a868c0 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -563,8 +563,16 @@ struct xdp_md {
/* LSM hooks */
enum landlock_hook {
LANDLOCK_HOOK_UNSPEC,
+ LANDLOCK_HOOK_FILE_OPEN,
+ LANDLOCK_HOOK_FILE_PERMISSION,
+ LANDLOCK_HOOK_MMAP_FILE,
+ LANDLOCK_HOOK_INODE_CREATE,
+ LANDLOCK_HOOK_INODE_LINK,
+ LANDLOCK_HOOK_INODE_UNLINK,
+ LANDLOCK_HOOK_INODE_PERMISSION,
+ LANDLOCK_HOOK_INODE_GETATTR,
};
-#define _LANDLOCK_HOOK_LAST LANDLOCK_HOOK_UNSPEC
+#define _LANDLOCK_HOOK_LAST LANDLOCK_HOOK_INODE_GETATTR
/* eBPF context and functions allowed for a rule */
#define _LANDLOCK_SUBTYPE_ACCESS_MASK ((1ULL << 0) - 1)
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 9b921a9afa3c..32b7941476ec 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -189,6 +189,7 @@ static const char * const reg_type_str[] = {
[PTR_TO_PACKET] = "pkt",
[PTR_TO_PACKET_END] = "pkt_end",
[CONST_PTR_TO_LANDLOCK_HANDLE_FS] = "landlock_handle_fs",
+ [CONST_PTR_TO_LANDLOCK_ARG_FS] = "landlock_arg_fs",
};
static void print_verifier_state(struct bpf_verifier_state *state)
@@ -515,6 +516,7 @@ static bool is_spillable_regtype(enum bpf_reg_type type)
case FRAME_PTR:
case CONST_PTR_TO_MAP:
case CONST_PTR_TO_LANDLOCK_HANDLE_FS:
+ case CONST_PTR_TO_LANDLOCK_ARG_FS:
return true;
default:
return false;
@@ -980,6 +982,10 @@ static int check_func_arg(struct bpf_verifier_env *env, u32 regno,
expected_type = CONST_PTR_TO_LANDLOCK_HANDLE_FS;
if (type != expected_type)
goto err_type;
+ } else if (arg_type == ARG_CONST_PTR_TO_LANDLOCK_ARG_FS) {
+ expected_type = CONST_PTR_TO_LANDLOCK_ARG_FS;
+ if (type != expected_type)
+ goto err_type;
} else if (arg_type == ARG_PTR_TO_STACK ||
arg_type == ARG_PTR_TO_RAW_STACK) {
expected_type = PTR_TO_STACK;
diff --git a/security/landlock/common.h b/security/landlock/common.h
index 0b5aad4a7aaa..dd64e6391dd8 100644
--- a/security/landlock/common.h
+++ b/security/landlock/common.h
@@ -12,6 +12,24 @@
#define _SECURITY_LANDLOCK_COMMON_H
#include <linux/bpf.h> /* enum landlock_hook */
+#include <linux/fs.h> /* struct file, struct inode */
+#include <linux/path.h> /* struct path */
+
+enum landlock_argtype {
+ LANDLOCK_ARGTYPE_NONE,
+ LANDLOCK_ARGTYPE_FILE,
+ LANDLOCK_ARGTYPE_INODE,
+ LANDLOCK_ARGTYPE_PATH,
+};
+
+struct landlock_arg_fs {
+ enum landlock_argtype type;
+ union {
+ struct file *file;
+ struct inode *inode;
+ const struct path *path;
+ };
+};
/**
* get_index - get an index for the rules of struct landlock_hooks
diff --git a/security/landlock/lsm.c b/security/landlock/lsm.c
index d7564540c493..b3d154275be6 100644
--- a/security/landlock/lsm.c
+++ b/security/landlock/lsm.c
@@ -15,9 +15,99 @@
#include <linux/kernel.h> /* FIELD_SIZEOF() */
#include <linux/landlock.h>
#include <linux/lsm_hooks.h>
+#include <linux/types.h> /* uintptr_t */
+
+/* hook arguments */
+#include <linux/dcache.h> /* struct dentry */
+#include <linux/fs.h> /* struct inode */
+#include <linux/path.h> /* struct path */
#include "common.h"
+#define MAP0(s, m, ...)
+#define MAP1(s, m, d, t, a) m(d, t, a)
+#define MAP2(s, m, d, t, a, ...) m(d, t, a) s() MAP1(s, m, __VA_ARGS__)
+#define MAP3(s, m, d, t, a, ...) m(d, t, a) s() MAP2(s, m, __VA_ARGS__)
+#define MAP4(s, m, d, t, a, ...) m(d, t, a) s() MAP3(s, m, __VA_ARGS__)
+#define MAP5(s, m, d, t, a, ...) m(d, t, a) s() MAP4(s, m, __VA_ARGS__)
+#define MAP6(s, m, d, t, a, ...) m(d, t, a) s() MAP5(s, m, __VA_ARGS__)
+
+/* separators */
+#define SEP_COMMA() ,
+#define SEP_NONE()
+
+/* arguments */
+#define ARG_MAP(n, ...) MAP##n(SEP_COMMA, __VA_ARGS__)
+#define ARG_REGTYPE(d, t, a) d##_REGTYPE
+#define ARG_TA(d, t, a) t a
+#define ARG_GET(d, t, a) ((u64) d##_GET(a))
+
+/* declarations */
+#define DEC_MAP(n, ...) MAP##n(SEP_NONE, DEC, __VA_ARGS__)
+#define DEC(d, t, a) d##_DEC(a)
+
+#define LANDLOCK_HOOKx(X, NAME, CNAME, ...) \
+ static inline int landlock_hook_##NAME( \
+ ARG_MAP(X, ARG_TA, __VA_ARGS__)) \
+ { \
+ DEC_MAP(X, __VA_ARGS__) \
+ __u64 args[6] = { \
+ ARG_MAP(X, ARG_GET, __VA_ARGS__) \
+ }; \
+ return landlock_enforce(LANDLOCK_HOOK_##CNAME, args); \
+ } \
+ static inline bool __is_valid_access_hook_##CNAME( \
+ int off, int size, enum bpf_access_type type, \
+ enum bpf_reg_type *reg_type, \
+ union bpf_prog_subtype *prog_subtype) \
+ { \
+ enum bpf_reg_type arg_types[6] = { \
+ ARG_MAP(X, ARG_REGTYPE, __VA_ARGS__) \
+ }; \
+ return __is_valid_access(off, size, type, reg_type, \
+ arg_types, prog_subtype); \
+ } \
+
+#define LANDLOCK_HOOK1(NAME, ...) LANDLOCK_HOOKx(1, NAME, __VA_ARGS__)
+#define LANDLOCK_HOOK2(NAME, ...) LANDLOCK_HOOKx(2, NAME, __VA_ARGS__)
+#define LANDLOCK_HOOK3(NAME, ...) LANDLOCK_HOOKx(3, NAME, __VA_ARGS__)
+#define LANDLOCK_HOOK4(NAME, ...) LANDLOCK_HOOKx(4, NAME, __VA_ARGS__)
+#define LANDLOCK_HOOK5(NAME, ...) LANDLOCK_HOOKx(5, NAME, __VA_ARGS__)
+#define LANDLOCK_HOOK6(NAME, ...) LANDLOCK_HOOKx(6, NAME, __VA_ARGS__)
+
+#define LANDLOCK_HOOK_INIT(NAME) LSM_HOOK_INIT(NAME, landlock_hook_##NAME)
+
+/* LANDLOCK_WRAPARG_NONE */
+#define LANDLOCK_WRAPARG_NONE_REGTYPE NOT_INIT
+#define LANDLOCK_WRAPARG_NONE_DEC(arg)
+#define LANDLOCK_WRAPARG_NONE_GET(arg) 0
+
+/* LANDLOCK_WRAPARG_RAW */
+#define LANDLOCK_WRAPARG_RAW_REGTYPE UNKNOWN_VALUE
+#define LANDLOCK_WRAPARG_RAW_DEC(arg)
+#define LANDLOCK_WRAPARG_RAW_GET(arg) arg
+
+/* LANDLOCK_WRAPARG_FILE */
+#define LANDLOCK_WRAPARG_FILE_REGTYPE CONST_PTR_TO_LANDLOCK_ARG_FS
+#define LANDLOCK_WRAPARG_FILE_DEC(arg) \
+ const struct landlock_arg_fs wrap_##arg = \
+ { .type = LANDLOCK_ARGTYPE_FILE, .file = arg };
+#define LANDLOCK_WRAPARG_FILE_GET(arg) (uintptr_t)&wrap_##arg
+
+/* LANDLOCK_WRAPARG_INODE */
+#define LANDLOCK_WRAPARG_INODE_REGTYPE CONST_PTR_TO_LANDLOCK_ARG_FS
+#define LANDLOCK_WRAPARG_INODE_DEC(arg) \
+ const struct landlock_arg_fs wrap_##arg = \
+ { .type = LANDLOCK_ARGTYPE_INODE, .inode = arg };
+#define LANDLOCK_WRAPARG_INODE_GET(arg) (uintptr_t)&wrap_##arg
+
+/* LANDLOCK_WRAPARG_PATH */
+#define LANDLOCK_WRAPARG_PATH_REGTYPE CONST_PTR_TO_LANDLOCK_ARG_FS
+#define LANDLOCK_WRAPARG_PATH_DEC(arg) \
+ const struct landlock_arg_fs wrap_##arg = \
+ { .type = LANDLOCK_ARGTYPE_PATH, .path = arg };
+#define LANDLOCK_WRAPARG_PATH_GET(arg) (uintptr_t)&wrap_##arg
+
/**
* landlock_run_prog - run Landlock program for a syscall
*
@@ -127,6 +217,72 @@ static bool __is_valid_access(int off, int size, enum bpf_access_type type,
return true;
}
+LANDLOCK_HOOK2(file_open, FILE_OPEN,
+ LANDLOCK_WRAPARG_FILE, struct file *, file,
+ LANDLOCK_WRAPARG_NONE, const struct cred *, cred
+)
+
+LANDLOCK_HOOK2(file_permission, FILE_PERMISSION,
+ LANDLOCK_WRAPARG_FILE, struct file *, file,
+ LANDLOCK_WRAPARG_RAW, int, mask
+)
+
+LANDLOCK_HOOK4(mmap_file, MMAP_FILE,
+ LANDLOCK_WRAPARG_FILE, struct file *, file,
+ LANDLOCK_WRAPARG_RAW, unsigned long, reqprot,
+ LANDLOCK_WRAPARG_RAW, unsigned long, prot,
+ LANDLOCK_WRAPARG_RAW, unsigned long, flags
+)
+
+/* a directory inode contains only one dentry */
+LANDLOCK_HOOK3(inode_create, INODE_CREATE,
+ LANDLOCK_WRAPARG_INODE, struct inode *, dir,
+ LANDLOCK_WRAPARG_NONE, struct dentry *, dentry,
+ LANDLOCK_WRAPARG_RAW, umode_t, mode
+)
+
+LANDLOCK_HOOK3(inode_link, INODE_LINK,
+ LANDLOCK_WRAPARG_NONE, struct dentry *, old_dentry,
+ LANDLOCK_WRAPARG_INODE, struct inode *, dir,
+ LANDLOCK_WRAPARG_NONE, struct dentry *, new_dentry
+)
+
+LANDLOCK_HOOK2(inode_unlink, INODE_UNLINK,
+ LANDLOCK_WRAPARG_INODE, struct inode *, dir,
+ LANDLOCK_WRAPARG_NONE, struct dentry *, dentry
+)
+
+LANDLOCK_HOOK2(inode_permission, INODE_PERMISSION,
+ LANDLOCK_WRAPARG_INODE, struct inode *, inode,
+ LANDLOCK_WRAPARG_RAW, int, mask
+)
+
+LANDLOCK_HOOK1(inode_getattr, INODE_GETATTR,
+ LANDLOCK_WRAPARG_PATH, const struct path *, path
+)
+
+static struct security_hook_list landlock_hooks[] = {
+ LANDLOCK_HOOK_INIT(file_open),
+ LANDLOCK_HOOK_INIT(file_permission),
+ LANDLOCK_HOOK_INIT(mmap_file),
+ LANDLOCK_HOOK_INIT(inode_create),
+ LANDLOCK_HOOK_INIT(inode_link),
+ LANDLOCK_HOOK_INIT(inode_unlink),
+ LANDLOCK_HOOK_INIT(inode_permission),
+ LANDLOCK_HOOK_INIT(inode_getattr),
+};
+
+void __init landlock_add_hooks(void)
+{
+ pr_info("landlock: Becoming ready to sandbox with seccomp\n");
+ security_add_hooks(landlock_hooks, ARRAY_SIZE(landlock_hooks));
+}
+
+#define LANDLOCK_CASE_ACCESS_HOOK(CNAME) \
+ case LANDLOCK_HOOK_##CNAME: \
+ return __is_valid_access_hook_##CNAME( \
+ off, size, type, reg_type, prog_subtype);
+
static inline bool bpf_landlock_is_valid_access(int off, int size,
enum bpf_access_type type, enum bpf_reg_type *reg_type,
union bpf_prog_subtype *prog_subtype)
@@ -134,6 +290,14 @@ static inline bool bpf_landlock_is_valid_access(int off, int size,
enum landlock_hook hook = prog_subtype->landlock_rule.hook;
switch (hook) {
+ LANDLOCK_CASE_ACCESS_HOOK(FILE_OPEN)
+ LANDLOCK_CASE_ACCESS_HOOK(FILE_PERMISSION)
+ LANDLOCK_CASE_ACCESS_HOOK(MMAP_FILE)
+ LANDLOCK_CASE_ACCESS_HOOK(INODE_CREATE)
+ LANDLOCK_CASE_ACCESS_HOOK(INODE_LINK)
+ LANDLOCK_CASE_ACCESS_HOOK(INODE_UNLINK)
+ LANDLOCK_CASE_ACCESS_HOOK(INODE_PERMISSION)
+ LANDLOCK_CASE_ACCESS_HOOK(INODE_GETATTR)
case LANDLOCK_HOOK_UNSPEC:
default:
return false;
@@ -146,6 +310,15 @@ static inline bool bpf_landlock_is_valid_subtype(
enum landlock_hook hook = prog_subtype->landlock_rule.hook;
switch (hook) {
+ case LANDLOCK_HOOK_FILE_OPEN:
+ case LANDLOCK_HOOK_FILE_PERMISSION:
+ case LANDLOCK_HOOK_MMAP_FILE:
+ case LANDLOCK_HOOK_INODE_CREATE:
+ case LANDLOCK_HOOK_INODE_LINK:
+ case LANDLOCK_HOOK_INODE_UNLINK:
+ case LANDLOCK_HOOK_INODE_PERMISSION:
+ case LANDLOCK_HOOK_INODE_GETATTR:
+ break;
case LANDLOCK_HOOK_UNSPEC:
default:
return false;
diff --git a/security/security.c b/security/security.c
index f825304f04a7..92f0f1f209b6 100644
--- a/security/security.c
+++ b/security/security.c
@@ -61,6 +61,7 @@ int __init security_init(void)
capability_add_hooks();
yama_add_hooks();
loadpin_add_hooks();
+ landlock_add_hooks();
/*
* Load all the remaining security modules.
--
2.9.3
This helper will be useful for arraymap (next commit).
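For readers unfamiliar with the convention, here is a small userspace sketch
of how a pointer travels through a 64-bit field and back. The model_* names
are hypothetical, and uintptr_t replaces the kernel's unsigned long cast; the
kernel helper itself is the one moved by this patch.

```c
#include <assert.h>
#include <stdint.h>

/* A pointer must fit in the 64-bit field for the round trip to be lossless. */
static_assert(sizeof(uintptr_t) <= sizeof(uint64_t),
	      "pointers must fit in a 64-bit field");

/* Store a pointer in a fixed-size 64-bit field (userland side). */
static inline uint64_t model_ptr_to_u64(const void *ptr)
{
	return (uint64_t)(uintptr_t)ptr;
}

/* Recover the pointer from the 64-bit field (kernel side in the real code). */
static inline void *model_u64_to_ptr(uint64_t val)
{
	return (void *)(uintptr_t)val;
}
```

Using a fixed 64-bit field keeps the structure layout identical for 32-bit
and 64-bit userland.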
Signed-off-by: Mickaël Salaün <[email protected]>
Cc: Alexei Starovoitov <[email protected]>
Cc: David S. Miller <[email protected]>
Cc: Daniel Borkmann <[email protected]>
---
include/linux/bpf.h | 6 ++++++
kernel/bpf/syscall.c | 6 ------
2 files changed, 6 insertions(+), 6 deletions(-)
diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index c201017b5730..cf87db6daf27 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -283,6 +283,12 @@ static inline void bpf_long_memcpy(void *dst, const void *src, u32 size)
/* verify correctness of eBPF program */
int bpf_check(struct bpf_prog **fp, union bpf_attr *attr);
+
+/* helper to convert user pointers passed inside __aligned_u64 fields */
+static inline void __user *u64_to_ptr(__u64 val)
+{
+ return (void __user *) (unsigned long) val;
+}
#else
static inline void bpf_register_prog_type(struct bpf_prog_type_list *tl)
{
diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index 1814c010ace6..13149c9cb3a4 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -252,12 +252,6 @@ struct bpf_map *bpf_map_get_with_uref(u32 ufd)
return map;
}
-/* helper to convert user pointers passed inside __aligned_u64 fields */
-static void __user *u64_to_ptr(__u64 val)
-{
- return (void __user *) (unsigned long) val;
-}
-
int __weak bpf_stackmap_copy(struct bpf_map *map, void *key, void *value)
{
return -ENOTSUPP;
--
2.9.3
This is the main code usable by the seccomp and cgroup managers. It
manages the landlock_hooks, landlock_node and landlock_rule
structs.
Landlock rules can be tied to an LSM hook. When such a hook is triggered,
a tree of rules can be evaluated. A tree is created with a first node.
This node references a list of rules and an optional parent node. Each
rule returns a 32-bit value which can interrupt the evaluation with a
non-zero value. This value is then returned by the syscall as an errno
code. If every rule returns zero, the evaluation continues with the
rule list of the parent node, until the end of the tree.
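The evaluation order described above can be sketched in userspace C as two
nested walks. This is a simplified model, not the code of this patch: the
model_* names are hypothetical, a function pointer stands in for the eBPF
program, and reference counting and RCU are left out.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

struct model_rule {
	uint32_t (*run)(const void *ctx);	/* stands in for the eBPF program */
	struct model_rule *prev;		/* previous rule in the same node */
};

struct model_node {
	struct model_rule *rule;		/* most recently added rule */
	struct model_node *prev;		/* optional parent node */
};

/* Sample rules: allow everything, or deny with EACCES. */
static uint32_t model_allow(const void *ctx)
{
	(void)ctx;
	return 0;
}

static uint32_t model_deny(const void *ctx)
{
	(void)ctx;
	return EACCES;
}

/*
 * Walk the rules of the first node, then those of each parent node.
 * The first non-zero value stops the walk and is returned; 0 means that
 * every rule accepted the access.
 */
static uint32_t model_enforce(const struct model_node *node, const void *ctx)
{
	for (; node; node = node->prev) {
		const struct model_rule *rule;

		for (rule = node->rule; rule; rule = rule->prev) {
			uint32_t ret;

			assert(rule->run);
			ret = rule->run(ctx);
			if (ret)
				return ret;
		}
	}
	return 0;
}
```

In this model a child can only add restrictions: the rules of its parents are
still evaluated whenever its own rules all return zero.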
Changes since v3:
* split commit
* new design to be able to inherit on the fly the parent rules
Signed-off-by: Mickaël Salaün <[email protected]>
Cc: Alexei Starovoitov <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: James Morris <[email protected]>
Cc: Kees Cook <[email protected]>
Cc: Serge E. Hallyn <[email protected]>
---
include/linux/landlock.h | 2 +
security/landlock/Makefile | 2 +-
security/landlock/manager.c | 265 ++++++++++++++++++++++++++++++++++++++++++++
3 files changed, 268 insertions(+), 1 deletion(-)
create mode 100644 security/landlock/manager.c
diff --git a/include/linux/landlock.h b/include/linux/landlock.h
index 2ab2be8e3e6e..263be3cf0b48 100644
--- a/include/linux/landlock.h
+++ b/include/linux/landlock.h
@@ -72,5 +72,7 @@ struct landlock_hooks {
struct landlock_node *nodes[_LANDLOCK_HOOK_LAST];
};
+void put_landlock_hooks(struct landlock_hooks *hooks);
+
#endif /* CONFIG_SECURITY_LANDLOCK */
#endif /* _LINUX_LANDLOCK_H */
diff --git a/security/landlock/Makefile b/security/landlock/Makefile
index 27f359a8cfaa..1a77e54d8041 100644
--- a/security/landlock/Makefile
+++ b/security/landlock/Makefile
@@ -1,3 +1,3 @@
obj-$(CONFIG_SECURITY_LANDLOCK) := landlock.o
-landlock-y := lsm.o checker_fs.o
+landlock-y := lsm.o checker_fs.o manager.o
diff --git a/security/landlock/manager.c b/security/landlock/manager.c
new file mode 100644
index 000000000000..f3f03b64ebef
--- /dev/null
+++ b/security/landlock/manager.c
@@ -0,0 +1,265 @@
+/*
+ * Landlock LSM - seccomp and cgroups managers
+ *
+ * Copyright (C) 2016 Mickaël Salaün <[email protected]>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2, as
+ * published by the Free Software Foundation.
+ */
+
+#include <asm/page.h> /* PAGE_SIZE */
+#include <linux/atomic.h> /* atomic_*(), smp_store_release() */
+#include <linux/bpf.h> /* bpf_prog_put() */
+#include <linux/filter.h> /* struct bpf_prog */
+#include <linux/kernel.h> /* round_up() */
+#include <linux/landlock.h>
+#include <linux/slab.h> /* kzalloc(), kfree() */
+#include <linux/types.h> /* atomic_t */
+
+#include "common.h"
+
+static void put_landlock_rule(struct landlock_rule *rule)
+{
+ struct landlock_rule *orig = rule;
+
+ /* clean up single-reference branches iteratively */
+ while (orig && atomic_dec_and_test(&orig->usage)) {
+ struct landlock_rule *freeme = orig;
+
+ bpf_prog_put(orig->prog);
+ orig = orig->prev;
+ kfree(freeme);
+ }
+}
+
+static void put_landlock_node(struct landlock_node *node)
+{
+ struct landlock_node *orig = node;
+
+ /* clean up single-reference branches iteratively */
+ while (orig && atomic_dec_and_test(&orig->usage)) {
+ struct landlock_node *freeme = orig;
+
+ put_landlock_rule(orig->rule);
+ orig = orig->prev;
+ kfree(freeme);
+ }
+}
+
+void put_landlock_hooks(struct landlock_hooks *hooks)
+{
+ if (hooks && atomic_dec_and_test(&hooks->usage)) {
+ size_t i;
+
+ /* XXX: Do we need to use lockless_dereference() here? */
+ for (i = 0; i < ARRAY_SIZE(hooks->nodes); i++) {
+ if (!hooks->nodes[i])
+ continue;
+ /* Are we the owner of this node? */
+ if (hooks->nodes[i]->owner == &hooks->nodes[i])
+ hooks->nodes[i]->owner = NULL;
+ put_landlock_node(hooks->nodes[i]);
+ }
+ kfree(hooks);
+ }
+}
+
+static struct landlock_hooks *new_raw_landlock_hooks(void)
+{
+ struct landlock_hooks *ret;
+
+ /* array filled with NULL values */
+ ret = kzalloc(sizeof(*ret), GFP_KERNEL);
+ if (!ret)
+ return ERR_PTR(-ENOMEM);
+ atomic_set(&ret->usage, 1);
+ return ret;
+}
+
+static struct landlock_hooks *new_filled_landlock_hooks(void)
+{
+ size_t i;
+ struct landlock_hooks *ret;
+
+ ret = new_raw_landlock_hooks();
+ if (IS_ERR(ret))
+ return ret;
+ /*
+ * We need to initially allocate every node to be able to update the
+ * rules they are pointing to, across all (future) children of the
+ * current task.
+ */
+ for (i = 0; i < ARRAY_SIZE(ret->nodes); i++) {
+ struct landlock_node *node;
+
+ node = kzalloc(sizeof(*node), GFP_KERNEL);
+ if (!node)
+ goto put_hooks;
+ atomic_set(&node->usage, 1);
+ /* We are the owner of this node. */
+ node->owner = &ret->nodes[i];
+ ret->nodes[i] = node;
+ }
+ return ret;
+
+put_hooks:
+ put_landlock_hooks(ret);
+ return ERR_PTR(-ENOMEM);
+}
+
+static void add_landlock_rule(struct landlock_hooks *hooks,
+ struct landlock_rule *rule)
+{
+ /* subtype.landlock_rule.hook > 0 for loaded programs */
+ u32 hook_idx = get_index(rule->prog->subtype.landlock_rule.hook);
+
+ rule->prev = hooks->nodes[hook_idx]->rule;
+ WARN_ON(atomic_read(&rule->usage));
+ atomic_set(&rule->usage, 1);
+ /* do not increment the previous rule usage */
+ smp_store_release(&hooks->nodes[hook_idx]->rule, rule);
+}
+
+/* Limit Landlock hooks to 256KB. */
+#define LANDLOCK_HOOKS_MAX_PAGES (1 << 6)
+
+/**
+ * landlock_append_prog - attach a Landlock program to @current_hooks
+ *
+ * @current_hooks: landlock_hooks pointer, must be locked (if needed) to
+ * prevent a concurrent put/free. This pointer must not be
+ * freed after the call.
+ * @prog: non-NULL Landlock program to append to @current_hooks. @prog will be
+ * owned by landlock_append_prog() and freed if an error happens.
+ *
+ * Return @current_hooks or a new pointer when OK. Return an error pointer
+ * otherwise.
+ */
+static struct landlock_hooks *landlock_append_prog(
+ struct landlock_hooks *current_hooks, struct bpf_prog *prog)
+{
+ struct landlock_hooks *new_hooks = current_hooks;
+ unsigned long pages;
+ struct landlock_rule *rule;
+ u32 hook_idx;
+
+ if (prog->type != BPF_PROG_TYPE_LANDLOCK) {
+ new_hooks = ERR_PTR(-EINVAL);
+ goto put_prog;
+ }
+
+ /* validate memory size allocation */
+ pages = prog->pages;
+ if (current_hooks) {
+ size_t i;
+
+ for (i = 0; i < ARRAY_SIZE(current_hooks->nodes); i++) {
+ struct landlock_node *walker_n;
+
+ for (walker_n = current_hooks->nodes[i];
+ walker_n;
+ walker_n = walker_n->prev) {
+ struct landlock_rule *walker_r;
+
+ for (walker_r = walker_n->rule;
+ walker_r;
+ walker_r = walker_r->prev)
+ pages += walker_r->prog->pages;
+ }
+ }
+ /* count a struct landlock_hooks if we need to allocate one */
+ if (atomic_read(&current_hooks->usage) != 1)
+ pages += round_up(sizeof(*current_hooks), PAGE_SIZE) /
+ PAGE_SIZE;
+ }
+ if (pages > LANDLOCK_HOOKS_MAX_PAGES) {
+ new_hooks = ERR_PTR(-E2BIG);
+ goto put_prog;
+ }
+
+ rule = kzalloc(sizeof(*rule), GFP_KERNEL);
+ if (!rule) {
+ new_hooks = ERR_PTR(-ENOMEM);
+ goto put_prog;
+ }
+ rule->prog = prog;
+
+ /* subtype.landlock_rule.hook > 0 for loaded programs */
+ hook_idx = get_index(rule->prog->subtype.landlock_rule.hook);
+
+ if (!current_hooks) {
+ /* add a new landlock_hooks, if needed */
+ new_hooks = new_filled_landlock_hooks();
+ if (IS_ERR(new_hooks))
+ goto put_rule;
+ add_landlock_rule(new_hooks, rule);
+ } else {
+ if (new_hooks->nodes[hook_idx]->owner == &new_hooks->nodes[hook_idx]) {
+ /* We are the owner, we can then update the node. */
+ add_landlock_rule(new_hooks, rule);
+ } else if (atomic_read(&current_hooks->usage) == 1) {
+ WARN_ON(new_hooks->nodes[hook_idx]->owner);
+ /*
+ * We can become the new owner if no other task uses it.
+ * This avoids an unnecessary allocation.
+ */
+ new_hooks->nodes[hook_idx]->owner =
+ &new_hooks->nodes[hook_idx];
+ add_landlock_rule(new_hooks, rule);
+ } else {
+ /*
+ * We are not the owner, we need to fork current_hooks
+ * and then add a new node.
+ */
+ struct landlock_node *node;
+ size_t i;
+
+ node = kmalloc(sizeof(*node), GFP_KERNEL);
+ if (!node) {
+ new_hooks = ERR_PTR(-ENOMEM);
+ goto put_rule;
+ }
+ atomic_set(&node->usage, 1);
+ /* set the previous node after the new_hooks allocation */
+ node->prev = NULL;
+ /* do not increment the previous node usage */
+ node->owner = &new_hooks->nodes[hook_idx];
+ /* rule->prev is already NULL */
+ atomic_set(&rule->usage, 1);
+ node->rule = rule;
+
+ new_hooks = new_raw_landlock_hooks();
+ if (IS_ERR(new_hooks)) {
+ /* put the rule as well */
+ put_landlock_node(node);
+ return ERR_PTR(-ENOMEM);
+ }
+ for (i = 0; i < ARRAY_SIZE(new_hooks->nodes); i++) {
+ new_hooks->nodes[i] = lockless_dereference(current_hooks->nodes[i]);
+ if (i == hook_idx)
+ node->prev = new_hooks->nodes[i];
+ if (!WARN_ON(!new_hooks->nodes[i]))
+ atomic_inc(&new_hooks->nodes[i]->usage);
+ }
+ new_hooks->nodes[hook_idx] = node;
+
+ /*
+ * @current_hooks will not be freed here because its usage
+ * field is > 1. It is only protected from being freed by another
+ * subject by the caller of landlock_append_prog(), which
+ * should hold a lock if needed.
+ */
+ put_landlock_hooks(current_hooks);
+ }
+ }
+ return new_hooks;
+
+put_prog:
+ bpf_prog_put(prog);
+ return new_hooks;
+
+put_rule:
+ put_landlock_rule(rule);
+ return new_hooks;
+}
--
2.9.3
Add a new type of eBPF program used by Landlock rules and the functions
to verify and evaluate them.
Changes since v3:
* split commit
Signed-off-by: Mickaël Salaün <[email protected]>
Cc: Alexei Starovoitov <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Daniel Borkmann <[email protected]>
Cc: David S. Miller <[email protected]>
Cc: James Morris <[email protected]>
Cc: Kees Cook <[email protected]>
Cc: Serge E. Hallyn <[email protected]>
---
include/linux/landlock.h | 76 +++++++++++++++++++
include/uapi/linux/bpf.h | 20 +++++
kernel/bpf/syscall.c | 10 ++-
security/Makefile | 2 +
security/landlock/Makefile | 3 +
security/landlock/common.h | 27 +++++++
security/landlock/lsm.c | 181 +++++++++++++++++++++++++++++++++++++++++++++
7 files changed, 317 insertions(+), 2 deletions(-)
create mode 100644 include/linux/landlock.h
create mode 100644 security/landlock/Makefile
create mode 100644 security/landlock/common.h
create mode 100644 security/landlock/lsm.c
diff --git a/include/linux/landlock.h b/include/linux/landlock.h
new file mode 100644
index 000000000000..2ab2be8e3e6e
--- /dev/null
+++ b/include/linux/landlock.h
@@ -0,0 +1,76 @@
+/*
+ * Landlock LSM - Public headers
+ *
+ * Copyright (C) 2016 Mickaël Salaün <[email protected]>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2, as
+ * published by the Free Software Foundation.
+ */
+
+#ifndef _LINUX_LANDLOCK_H
+#define _LINUX_LANDLOCK_H
+#ifdef CONFIG_SECURITY_LANDLOCK
+
+#include <linux/bpf.h> /* _LANDLOCK_HOOK_LAST */
+#include <linux/types.h> /* atomic_t */
+
+#ifdef CONFIG_SECCOMP_FILTER
+#include <linux/seccomp.h> /* struct seccomp_filter */
+#endif /* CONFIG_SECCOMP_FILTER */
+
+struct landlock_rule;
+
+/**
+ * struct landlock_node - node in the rule hierarchy
+ *
+ * This is created when a task inserts its first rule in the Landlock rule
+ * hierarchy. The set of Landlock rules referenced by this node is then
+ * enforced for all the tasks that inherit this node. However, if a task is
+ * cloned before inserting new rules, it doesn't get a dedicated node and its
+ * children will not inherit these new rules.
+ *
+ * @usage: reference count to manage the node lifetime.
+ * @rule: list of Landlock rules managed by this node.
+ * @prev: reference the parent node.
+ * @owner: reference the address of the node in the struct landlock_hooks. This
+ * is needed to know if we need to append a rule to the current node or
+ * create a new node.
+ */
+struct landlock_node {
+ atomic_t usage;
+ struct landlock_rule *rule;
+ struct landlock_node *prev;
+ struct landlock_node **owner;
+};
+
+struct landlock_rule {
+ atomic_t usage;
+ struct landlock_rule *prev;
+ struct bpf_prog *prog;
+};
+
+/**
+ * struct landlock_hooks - Landlock hook programs enforced on a thread
+ *
+ * This is used for low performance impact when forking a process. Instead of
+ * copying the full array and incrementing the usage field of each entry,
+ * only create a pointer to struct landlock_hooks and increment the usage
+ * field.
+ *
+ * A new struct landlock_hooks must be created thanks to a call to
+ * new_landlock_hooks().
+ *
+ * @usage: reference count to manage the object lifetime. When a thread needs
+ * to add Landlock programs and if @usage is greater than 1, then the
+ * thread must duplicate struct landlock_hooks to not change the
+ * children's rules as well. FIXME
+ * @nodes: array of non-NULL struct landlock_node pointers.
+ */
+struct landlock_hooks {
+ atomic_t usage;
+ struct landlock_node *nodes[_LANDLOCK_HOOK_LAST];
+};
+
+#endif /* CONFIG_SECURITY_LANDLOCK */
+#endif /* _LINUX_LANDLOCK_H */
diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index 06621c401bc0..335616ab63ff 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -111,6 +111,7 @@ enum bpf_prog_type {
BPF_PROG_TYPE_XDP,
BPF_PROG_TYPE_PERF_EVENT,
BPF_PROG_TYPE_CGROUP_SKB,
+ BPF_PROG_TYPE_LANDLOCK,
};
enum bpf_attach_type {
@@ -559,6 +560,12 @@ struct xdp_md {
__u32 data_end;
};
+/* LSM hooks */
+enum landlock_hook {
+ LANDLOCK_HOOK_UNSPEC,
+};
+#define _LANDLOCK_HOOK_LAST LANDLOCK_HOOK_UNSPEC
+
/* eBPF context and functions allowed for a rule */
#define _LANDLOCK_SUBTYPE_ACCESS_MASK ((1ULL << 0) - 1)
@@ -577,4 +584,17 @@ struct landlock_handle {
};
} __attribute__((aligned(8)));
+/**
+ * struct landlock_data
+ *
+ * @hook: LSM hook ID (e.g. BPF_PROG_TYPE_LANDLOCK_FILE_OPEN)
+ * @args: LSM hook arguments, see include/linux/lsm_hooks.h for their
+ * description and the LANDLOCK_HOOK* definitions from
+ * security/landlock/lsm.c for their types.
+ */
+struct landlock_data {
+ __u32 hook; /* enum landlock_hook */
+ __u64 args[6];
+};
+
#endif /* _UAPI__LINUX_BPF_H__ */
diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index 6eef1da1e8a3..0f7faa9d2262 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -739,8 +739,14 @@ static int bpf_prog_load(union bpf_attr *attr)
attr->kern_version != LINUX_VERSION_CODE)
return -EINVAL;
- if (type != BPF_PROG_TYPE_SOCKET_FILTER && !capable(CAP_SYS_ADMIN))
- return -EPERM;
+ switch (type) {
+ case BPF_PROG_TYPE_SOCKET_FILTER:
+ case BPF_PROG_TYPE_LANDLOCK:
+ break;
+ default:
+ if (!capable(CAP_SYS_ADMIN))
+ return -EPERM;
+ }
/* plain bpf_prog allocation */
prog = bpf_prog_alloc(bpf_prog_size(attr->insn_cnt), GFP_USER);
diff --git a/security/Makefile b/security/Makefile
index f2d71cdb8e19..3fdc2f19dc48 100644
--- a/security/Makefile
+++ b/security/Makefile
@@ -9,6 +9,7 @@ subdir-$(CONFIG_SECURITY_TOMOYO) += tomoyo
subdir-$(CONFIG_SECURITY_APPARMOR) += apparmor
subdir-$(CONFIG_SECURITY_YAMA) += yama
subdir-$(CONFIG_SECURITY_LOADPIN) += loadpin
+subdir-$(CONFIG_SECURITY_LANDLOCK) += landlock
# always enable default capabilities
obj-y += commoncap.o
@@ -24,6 +25,7 @@ obj-$(CONFIG_SECURITY_TOMOYO) += tomoyo/
obj-$(CONFIG_SECURITY_APPARMOR) += apparmor/
obj-$(CONFIG_SECURITY_YAMA) += yama/
obj-$(CONFIG_SECURITY_LOADPIN) += loadpin/
+obj-$(CONFIG_SECURITY_LANDLOCK) += landlock/
obj-$(CONFIG_CGROUP_DEVICE) += device_cgroup.o
# Object integrity file lists
diff --git a/security/landlock/Makefile b/security/landlock/Makefile
new file mode 100644
index 000000000000..59669d70bc7e
--- /dev/null
+++ b/security/landlock/Makefile
@@ -0,0 +1,3 @@
+obj-$(CONFIG_SECURITY_LANDLOCK) := landlock.o
+
+landlock-y := lsm.o
diff --git a/security/landlock/common.h b/security/landlock/common.h
new file mode 100644
index 000000000000..0b5aad4a7aaa
--- /dev/null
+++ b/security/landlock/common.h
@@ -0,0 +1,27 @@
+/*
+ * Landlock LSM - private headers
+ *
+ * Copyright (C) 2016 Mickaël Salaün <[email protected]>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2, as
+ * published by the Free Software Foundation.
+ */
+
+#ifndef _SECURITY_LANDLOCK_COMMON_H
+#define _SECURITY_LANDLOCK_COMMON_H
+
+#include <linux/bpf.h> /* enum landlock_hook */
+
+/**
+ * get_index - get an index for the rules of struct landlock_hooks
+ *
+ * @hook: a Landlock hook ID
+ */
+static inline int get_index(enum landlock_hook hook)
+{
+ /* hook ID > 0 for loaded programs */
+ return hook - 1;
+}
+
+#endif /* _SECURITY_LANDLOCK_COMMON_H */
diff --git a/security/landlock/lsm.c b/security/landlock/lsm.c
new file mode 100644
index 000000000000..d7564540c493
--- /dev/null
+++ b/security/landlock/lsm.c
@@ -0,0 +1,181 @@
+/*
+ * Landlock LSM
+ *
+ * Copyright (C) 2016 Mickaël Salaün <[email protected]>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2, as
+ * published by the Free Software Foundation.
+ */
+
+#include <linux/bpf.h> /* enum bpf_reg_type, struct landlock_data */
+#include <linux/cred.h>
+#include <linux/err.h> /* MAX_ERRNO */
+#include <linux/filter.h> /* struct bpf_prog, BPF_PROG_RUN() */
+#include <linux/kernel.h> /* FIELD_SIZEOF() */
+#include <linux/landlock.h>
+#include <linux/lsm_hooks.h>
+
+#include "common.h"
+
+/**
+ * landlock_run_prog - run Landlock program for a syscall
+ *
+ * @hook_idx: hook index in the rules array
+ * @ctx: non-NULL eBPF context
+ * @hooks: Landlock hooks pointer
+ */
+static u32 landlock_run_prog(u32 hook_idx, const struct landlock_data *ctx,
+ struct landlock_hooks *hooks)
+{
+ struct landlock_node *node;
+ u32 ret = 0;
+
+ if (!hooks)
+ return 0;
+
+ for (node = hooks->nodes[hook_idx]; node; node = node->prev) {
+ struct landlock_rule *rule;
+
+ for (rule = node->rule; rule; rule = rule->prev) {
+ if (WARN_ON(!rule->prog))
+ continue;
+ rcu_read_lock();
+ ret = BPF_PROG_RUN(rule->prog, (void *)ctx);
+ rcu_read_unlock();
+ if (ret) {
+ if (ret > MAX_ERRNO)
+ ret = MAX_ERRNO;
+ goto out;
+ }
+ }
+ }
+
+out:
+ return ret;
+}
+
+static int landlock_enforce(enum landlock_hook hook, __u64 args[6])
+{
+ u32 ret = 0;
+ u32 hook_idx = get_index(hook);
+
+ struct landlock_data ctx = {
+ .hook = hook,
+ .args[0] = args[0],
+ .args[1] = args[1],
+ .args[2] = args[2],
+ .args[3] = args[3],
+ .args[4] = args[4],
+ .args[5] = args[5],
+ };
+
+ /* placeholder for seccomp and cgroup managers */
+ ret = landlock_run_prog(hook_idx, &ctx, NULL);
+
+ return -ret;
+}
+
+static const struct bpf_func_proto *bpf_landlock_func_proto(
+ enum bpf_func_id func_id, union bpf_prog_subtype *prog_subtype)
+{
+ switch (func_id) {
+ default:
+ return NULL;
+ }
+}
+
+static bool __is_valid_access(int off, int size, enum bpf_access_type type,
+ enum bpf_reg_type *reg_type,
+ enum bpf_reg_type arg_types[6],
+ union bpf_prog_subtype *prog_subtype)
+{
+ int arg_nb, expected_size;
+
+ if (type != BPF_READ)
+ return false;
+ if (off < 0 || off >= sizeof(struct landlock_data))
+ return false;
+
+ /* check size */
+ switch (off) {
+ case offsetof(struct landlock_data, hook):
+ expected_size = sizeof(__u32);
+ break;
+ case offsetof(struct landlock_data, args[0]) ...
+ offsetof(struct landlock_data, args[5]):
+ expected_size = sizeof(__u64);
+ break;
+ default:
+ return false;
+ }
+ if (expected_size != size)
+ return false;
+
+ /* check pointer access and set pointer type */
+ switch (off) {
+ case offsetof(struct landlock_data, args[0]) ...
+ offsetof(struct landlock_data, args[5]):
+ arg_nb = (off - offsetof(struct landlock_data, args[0]))
+ / FIELD_SIZEOF(struct landlock_data, args[0]);
+ *reg_type = arg_types[arg_nb];
+ if (*reg_type == NOT_INIT)
+ return false;
+ break;
+ }
+
+ return true;
+}
+
+static inline bool bpf_landlock_is_valid_access(int off, int size,
+ enum bpf_access_type type, enum bpf_reg_type *reg_type,
+ union bpf_prog_subtype *prog_subtype)
+{
+ enum landlock_hook hook = prog_subtype->landlock_rule.hook;
+
+ switch (hook) {
+ case LANDLOCK_HOOK_UNSPEC:
+ default:
+ return false;
+ }
+}
+
+static inline bool bpf_landlock_is_valid_subtype(
+ union bpf_prog_subtype *prog_subtype)
+{
+ enum landlock_hook hook = prog_subtype->landlock_rule.hook;
+
+ switch (hook) {
+ case LANDLOCK_HOOK_UNSPEC:
+ default:
+ return false;
+ }
+ if (!prog_subtype->landlock_rule.hook ||
+ prog_subtype->landlock_rule.hook > _LANDLOCK_HOOK_LAST)
+ return false;
+ if (prog_subtype->landlock_rule.access & ~_LANDLOCK_SUBTYPE_ACCESS_MASK)
+ return false;
+ if (prog_subtype->landlock_rule.option & ~_LANDLOCK_SUBTYPE_OPTION_MASK)
+ return false;
+
+ return true;
+}
+
+static const struct bpf_verifier_ops bpf_landlock_ops = {
+ .get_func_proto = bpf_landlock_func_proto,
+ .is_valid_access = bpf_landlock_is_valid_access,
+ .is_valid_subtype = bpf_landlock_is_valid_subtype,
+};
+
+static struct bpf_prog_type_list bpf_landlock_type __read_mostly = {
+ .ops = &bpf_landlock_ops,
+ .type = BPF_PROG_TYPE_LANDLOCK,
+};
+
+static int __init register_landlock_filter_ops(void)
+{
+ bpf_register_prog_type(&bpf_landlock_type);
+ return 0;
+}
+
+late_initcall(register_landlock_filter_ops);
--
2.9.3
Signed-off-by: Mickaël Salaün <[email protected]>
Cc: Alexander Viro <[email protected]>
---
fs/namespace.c | 2 +-
include/linux/fs.h | 2 +-
2 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/fs/namespace.c b/fs/namespace.c
index e6c234b1a645..4d80a5066a1f 100644
--- a/fs/namespace.c
+++ b/fs/namespace.c
@@ -2997,7 +2997,7 @@ bool is_path_reachable(struct mount *mnt, struct dentry *dentry,
return &mnt->mnt == root->mnt && is_subdir(dentry, root->dentry);
}
-bool path_is_under(struct path *path1, struct path *path2)
+bool path_is_under(const struct path *path1, const struct path *path2)
{
bool res;
read_seqlock_excl(&mount_lock);
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 16d2b6e874d6..abbaf162f70e 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -2709,7 +2709,7 @@ extern struct file * open_exec(const char *);
/* fs/dcache.c -- generic fs support functions */
extern bool is_subdir(struct dentry *, struct dentry *);
-extern bool path_is_under(struct path *, struct path *);
+extern bool path_is_under(const struct path *, const struct path *);
extern char *file_path(struct file *, char *, int);
--
2.9.3
This will be useful to support Landlock in the next commits.
Signed-off-by: Mickaël Salaün <[email protected]>
Cc: Alexei Starovoitov <[email protected]>
Cc: Daniel Borkmann <[email protected]>
Cc: Daniel Mack <[email protected]>
Cc: David S. Miller <[email protected]>
Cc: Tejun Heo <[email protected]>
---
include/linux/bpf-cgroup.h | 4 ++--
kernel/bpf/cgroup.c | 3 ++-
kernel/bpf/syscall.c | 10 ++++++----
kernel/cgroup.c | 6 ++++--
4 files changed, 14 insertions(+), 9 deletions(-)
diff --git a/include/linux/bpf-cgroup.h b/include/linux/bpf-cgroup.h
index 2f608e5d94e9..aab1aa91c064 100644
--- a/include/linux/bpf-cgroup.h
+++ b/include/linux/bpf-cgroup.h
@@ -31,13 +31,13 @@ struct cgroup_bpf {
void cgroup_bpf_put(struct cgroup *cgrp);
void cgroup_bpf_inherit(struct cgroup *cgrp, struct cgroup *parent);
-void __cgroup_bpf_update(struct cgroup *cgrp,
+int __cgroup_bpf_update(struct cgroup *cgrp,
struct cgroup *parent,
struct bpf_prog *prog,
enum bpf_attach_type type);
/* Wrapper for __cgroup_bpf_update() protected by cgroup_mutex */
-void cgroup_bpf_update(struct cgroup *cgrp,
+int cgroup_bpf_update(struct cgroup *cgrp,
struct bpf_prog *prog,
enum bpf_attach_type type);
diff --git a/kernel/bpf/cgroup.c b/kernel/bpf/cgroup.c
index 75482cd92d56..269b410d890c 100644
--- a/kernel/bpf/cgroup.c
+++ b/kernel/bpf/cgroup.c
@@ -83,7 +83,7 @@ void cgroup_bpf_inherit(struct cgroup *cgrp, struct cgroup *parent)
*
* Must be called with cgroup_mutex held.
*/
-void __cgroup_bpf_update(struct cgroup *cgrp,
+int __cgroup_bpf_update(struct cgroup *cgrp,
struct cgroup *parent,
struct bpf_prog *prog,
enum bpf_attach_type type)
@@ -116,6 +116,7 @@ void __cgroup_bpf_update(struct cgroup *cgrp,
bpf_prog_put(old_prog);
static_branch_dec(&cgroup_bpf_enabled_key);
}
+ return 0;
}
/**
diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index ac4cbb98596d..e62123aeb202 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -831,6 +831,7 @@ static int bpf_prog_attach(const union bpf_attr *attr)
{
struct bpf_prog *prog;
struct cgroup *cgrp;
+ int result;
if (!capable(CAP_NET_ADMIN))
return -EPERM;
@@ -858,10 +859,10 @@ static int bpf_prog_attach(const union bpf_attr *attr)
return PTR_ERR(cgrp);
}
- cgroup_bpf_update(cgrp, prog, attr->attach_type);
+ result = cgroup_bpf_update(cgrp, prog, attr->attach_type);
cgroup_put(cgrp);
- return 0;
+ return result;
}
#define BPF_PROG_DETACH_LAST_FIELD attach_type
@@ -869,6 +870,7 @@ static int bpf_prog_attach(const union bpf_attr *attr)
static int bpf_prog_detach(const union bpf_attr *attr)
{
struct cgroup *cgrp;
+ int result = 0;
if (!capable(CAP_NET_ADMIN))
return -EPERM;
@@ -883,7 +885,7 @@ static int bpf_prog_detach(const union bpf_attr *attr)
if (IS_ERR(cgrp))
return PTR_ERR(cgrp);
- cgroup_bpf_update(cgrp, NULL, attr->attach_type);
+ result = cgroup_bpf_update(cgrp, NULL, attr->attach_type);
cgroup_put(cgrp);
break;
@@ -891,7 +893,7 @@ static int bpf_prog_detach(const union bpf_attr *attr)
return -EINVAL;
}
- return 0;
+ return result;
}
#endif /* CONFIG_CGROUP_BPF */
diff --git a/kernel/cgroup.c b/kernel/cgroup.c
index 2ee9ec3051b2..f77a974eb960 100644
--- a/kernel/cgroup.c
+++ b/kernel/cgroup.c
@@ -6501,15 +6501,17 @@ static __init int cgroup_namespaces_init(void)
subsys_initcall(cgroup_namespaces_init);
#ifdef CONFIG_CGROUP_BPF
-void cgroup_bpf_update(struct cgroup *cgrp,
+int cgroup_bpf_update(struct cgroup *cgrp,
struct bpf_prog *prog,
enum bpf_attach_type type)
{
struct cgroup *parent = cgroup_parent(cgrp);
+ int result;
mutex_lock(&cgroup_mutex);
- __cgroup_bpf_update(cgrp, parent, prog, type);
+ result = __cgroup_bpf_update(cgrp, parent, prog, type);
mutex_unlock(&cgroup_mutex);
+ return result;
}
#endif /* CONFIG_CGROUP_BPF */
--
2.9.3
Move code outside a switch/case to ease code factoring (cf. next
commit).
This applies on top of Daniel Mack's "Add eBPF hooks for cgroups" v7:
https://lkml.kernel.org/r/[email protected]
Signed-off-by: Mickaël Salaün <[email protected]>
Cc: Alexei Starovoitov <[email protected]>
Cc: Daniel Borkmann <[email protected]>
Cc: Daniel Mack <[email protected]>
---
kernel/bpf/syscall.c | 23 ++++++++++++-----------
1 file changed, 12 insertions(+), 11 deletions(-)
diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index 0f7faa9d2262..ac4cbb98596d 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -843,23 +843,24 @@ static int bpf_prog_attach(const union bpf_attr *attr)
case BPF_CGROUP_INET_EGRESS:
prog = bpf_prog_get_type(attr->attach_bpf_fd,
BPF_PROG_TYPE_CGROUP_SKB);
- if (IS_ERR(prog))
- return PTR_ERR(prog);
-
- cgrp = cgroup_get_from_fd(attr->target_fd);
- if (IS_ERR(cgrp)) {
- bpf_prog_put(prog);
- return PTR_ERR(cgrp);
- }
-
- cgroup_bpf_update(cgrp, prog, attr->attach_type);
- cgroup_put(cgrp);
break;
default:
return -EINVAL;
}
+ if (IS_ERR(prog))
+ return PTR_ERR(prog);
+
+ cgrp = cgroup_get_from_fd(attr->target_fd);
+ if (IS_ERR(cgrp)) {
+ bpf_prog_put(prog);
+ return PTR_ERR(cgrp);
+ }
+
+ cgroup_bpf_update(cgrp, prog, attr->attach_type);
+ cgroup_put(cgrp);
+
return 0;
}
--
2.9.3
This allows CONFIG_CGROUP_BPF to manage different types of pointers
instead of only eBPF programs. This will be useful for the next commits
to support Landlock with cgroups.
Changes since v3:
* do not use a union of pointers but a struct (suggested by Kees Cook)
Signed-off-by: Mickaël Salaün <[email protected]>
Cc: Alexei Starovoitov <[email protected]>
Cc: Daniel Borkmann <[email protected]>
Cc: Daniel Mack <[email protected]>
Cc: David S. Miller <[email protected]>
Cc: Tejun Heo <[email protected]>
Link: https://lkml.kernel.org/r/CAGXu5j+F80XCPSVL0VxAAoTiYk5D1NKKC3jyAU=Z0Gi7L9S0aw@mail.gmail.com
---
include/linux/bpf-cgroup.h | 8 ++++++--
kernel/bpf/cgroup.c | 35 ++++++++++++++++++-----------------
2 files changed, 24 insertions(+), 19 deletions(-)
diff --git a/include/linux/bpf-cgroup.h b/include/linux/bpf-cgroup.h
index fc076de74ab9..2f608e5d94e9 100644
--- a/include/linux/bpf-cgroup.h
+++ b/include/linux/bpf-cgroup.h
@@ -14,14 +14,18 @@ struct sk_buff;
extern struct static_key_false cgroup_bpf_enabled_key;
#define cgroup_bpf_enabled static_branch_unlikely(&cgroup_bpf_enabled_key)
+struct bpf_object {
+ struct bpf_prog *prog;
+};
+
struct cgroup_bpf {
/*
* Store two sets of bpf_prog pointers, one for programs that are
* pinned directly to this cgroup, and one for those that are effective
* when this cgroup is accessed.
*/
- struct bpf_prog *prog[MAX_BPF_ATTACH_TYPE];
- struct bpf_prog *effective[MAX_BPF_ATTACH_TYPE];
+ struct bpf_object pinned[MAX_BPF_ATTACH_TYPE];
+ struct bpf_object effective[MAX_BPF_ATTACH_TYPE];
};
void cgroup_bpf_put(struct cgroup *cgrp);
diff --git a/kernel/bpf/cgroup.c b/kernel/bpf/cgroup.c
index a0ab43f264b0..75482cd92d56 100644
--- a/kernel/bpf/cgroup.c
+++ b/kernel/bpf/cgroup.c
@@ -20,18 +20,18 @@ DEFINE_STATIC_KEY_FALSE(cgroup_bpf_enabled_key);
EXPORT_SYMBOL(cgroup_bpf_enabled_key);
/**
- * cgroup_bpf_put() - put references of all bpf programs
+ * cgroup_bpf_put() - put references of all bpf objects
* @cgrp: the cgroup to modify
*/
void cgroup_bpf_put(struct cgroup *cgrp)
{
unsigned int type;
- for (type = 0; type < ARRAY_SIZE(cgrp->bpf.prog); type++) {
- struct bpf_prog *prog = cgrp->bpf.prog[type];
+ for (type = 0; type < ARRAY_SIZE(cgrp->bpf.pinned); type++) {
+ struct bpf_object pinned = cgrp->bpf.pinned[type];
- if (prog) {
- bpf_prog_put(prog);
+ if (pinned.prog) {
+ bpf_prog_put(pinned.prog);
static_branch_dec(&cgroup_bpf_enabled_key);
}
}
@@ -47,11 +47,12 @@ void cgroup_bpf_inherit(struct cgroup *cgrp, struct cgroup *parent)
unsigned int type;
for (type = 0; type < ARRAY_SIZE(cgrp->bpf.effective); type++) {
- struct bpf_prog *e;
+ struct bpf_prog *prog;
- e = rcu_dereference_protected(parent->bpf.effective[type],
- lockdep_is_held(&cgroup_mutex));
- rcu_assign_pointer(cgrp->bpf.effective[type], e);
+ prog = rcu_dereference_protected(
+ parent->bpf.effective[type].prog,
+ lockdep_is_held(&cgroup_mutex));
+ rcu_assign_pointer(cgrp->bpf.effective[type].prog, prog);
}
}
@@ -87,13 +88,13 @@ void __cgroup_bpf_update(struct cgroup *cgrp,
struct bpf_prog *prog,
enum bpf_attach_type type)
{
- struct bpf_prog *old_prog, *effective;
+ struct bpf_prog *old_prog = NULL, *effective_prog;
struct cgroup_subsys_state *pos;
- old_prog = xchg(cgrp->bpf.prog + type, prog);
+ old_prog = xchg(&cgrp->bpf.pinned[type].prog, prog);
- effective = (!prog && parent) ?
- rcu_dereference_protected(parent->bpf.effective[type],
+ effective_prog = (!prog && parent) ?
+ rcu_dereference_protected(parent->bpf.effective[type].prog,
lockdep_is_held(&cgroup_mutex)) :
prog;
@@ -101,11 +102,11 @@ void __cgroup_bpf_update(struct cgroup *cgrp,
struct cgroup *desc = container_of(pos, struct cgroup, self);
/* skip the subtree if the descendant has its own program */
- if (desc->bpf.prog[type] && desc != cgrp)
+ if (desc->bpf.pinned[type].prog && desc != cgrp)
pos = css_rightmost_descendant(pos);
else
- rcu_assign_pointer(desc->bpf.effective[type],
- effective);
+ rcu_assign_pointer(desc->bpf.effective[type].prog,
+ effective_prog);
}
if (prog)
@@ -151,7 +152,7 @@ int __cgroup_bpf_run_filter(struct sock *sk,
rcu_read_lock();
- prog = rcu_dereference(cgrp->bpf.effective[type]);
+ prog = rcu_dereference(cgrp->bpf.effective[type].prog);
if (prog) {
unsigned int offset = skb->data - skb_network_header(skb);
--
2.9.3
The goal of the program subtype is to allow different static
fine-grained verifications for a single program type.
The struct bpf_verifier_ops gets a new optional function:
is_valid_subtype(). This new verifier is called at the beginning of the
eBPF program verification to check if the (optional) program subtype is
valid.
For now, only Landlock eBPF programs are using a program subtype but
this could be used by other program types in the future.
Cf. the next commit to see how the subtype is used by Landlock LSM.
Changes since v3:
* remove the "origin" field
* add an "option" field
* cleanup comments
Signed-off-by: Mickaël Salaün <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Cc: Alexei Starovoitov <[email protected]>
Cc: Daniel Borkmann <[email protected]>
Cc: David S. Miller <[email protected]>
---
include/linux/bpf.h | 7 +++++--
include/linux/filter.h | 1 +
include/uapi/linux/bpf.h | 18 ++++++++++++++++++
kernel/bpf/syscall.c | 5 +++--
kernel/bpf/verifier.c | 9 +++++++--
kernel/trace/bpf_trace.c | 12 ++++++++----
net/core/filter.c | 26 ++++++++++++++++----------
7 files changed, 58 insertions(+), 20 deletions(-)
diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index 34b9e9cd1af7..2cca9fc8b72b 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -163,18 +163,21 @@ struct bpf_prog;
struct bpf_verifier_ops {
/* return eBPF function prototype for verification */
- const struct bpf_func_proto *(*get_func_proto)(enum bpf_func_id func_id);
+ const struct bpf_func_proto *(*get_func_proto)(enum bpf_func_id func_id,
+ union bpf_prog_subtype *prog_subtype);
/* return true if 'size' wide access at offset 'off' within bpf_context
* with 'type' (read or write) is allowed
*/
bool (*is_valid_access)(int off, int size, enum bpf_access_type type,
- enum bpf_reg_type *reg_type);
+ enum bpf_reg_type *reg_type,
+ union bpf_prog_subtype *prog_subtype);
int (*gen_prologue)(struct bpf_insn *insn, bool direct_write,
const struct bpf_prog *prog);
u32 (*convert_ctx_access)(enum bpf_access_type type, int dst_reg,
int src_reg, int ctx_off,
struct bpf_insn *insn, struct bpf_prog *prog);
+ bool (*is_valid_subtype)(union bpf_prog_subtype *prog_subtype);
};
struct bpf_prog_type_list {
diff --git a/include/linux/filter.h b/include/linux/filter.h
index 1f09c521adfe..88470cdd3ee1 100644
--- a/include/linux/filter.h
+++ b/include/linux/filter.h
@@ -406,6 +406,7 @@ struct bpf_prog {
kmemcheck_bitfield_end(meta);
u32 len; /* Number of filter blocks */
enum bpf_prog_type type; /* Type of BPF program */
+ union bpf_prog_subtype subtype; /* For fine-grained verifications */
struct bpf_prog_aux *aux; /* Auxiliary fields */
struct sock_fprog_kern *orig_prog; /* Original BPF program */
unsigned int (*bpf_func)(const struct sk_buff *skb,
diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index 339a9307ba6e..06621c401bc0 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -130,6 +130,14 @@ enum bpf_attach_type {
#define BPF_F_NO_PREALLOC (1U << 0)
+union bpf_prog_subtype {
+ struct {
+ __u32 hook; /* enum landlock_hook */
+ __aligned_u64 access; /* LANDLOCK_SUBTYPE_ACCESS_* */
+ __aligned_u64 option; /* LANDLOCK_SUBTYPE_OPTION_* */
+ } landlock_rule;
+} __attribute__((aligned(8)));
+
union bpf_attr {
struct { /* anonymous struct used by BPF_MAP_CREATE command */
__u32 map_type; /* one of enum bpf_map_type */
@@ -158,6 +166,7 @@ union bpf_attr {
__u32 log_size; /* size of user buffer */
__aligned_u64 log_buf; /* user supplied buffer */
__u32 kern_version; /* checked when prog_type=kprobe */
+ union bpf_prog_subtype prog_subtype;
};
struct { /* anonymous struct used by BPF_OBJ_* commands */
@@ -550,6 +559,15 @@ struct xdp_md {
__u32 data_end;
};
+/* eBPF context and functions allowed for a rule */
+#define _LANDLOCK_SUBTYPE_ACCESS_MASK ((1ULL << 0) - 1)
+
+/*
+ * (future) options for a Landlock rule (e.g. run even if a previous rule
+ * returned an errno code)
+ */
+#define _LANDLOCK_SUBTYPE_OPTION_MASK ((1ULL << 0) - 1)
+
/* Map handle entry */
struct landlock_handle {
__u32 type; /* enum bpf_map_handle_type */
diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index 13149c9cb3a4..6eef1da1e8a3 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -572,7 +572,7 @@ static void fixup_bpf_calls(struct bpf_prog *prog)
continue;
}
- fn = prog->aux->ops->get_func_proto(insn->imm);
+ fn = prog->aux->ops->get_func_proto(insn->imm, &prog->subtype);
/* all functions that have prototype and verifier allowed
* programs to call them, must be real in-kernel functions
*/
@@ -710,7 +710,7 @@ struct bpf_prog *bpf_prog_get_type(u32 ufd, enum bpf_prog_type type)
EXPORT_SYMBOL_GPL(bpf_prog_get_type);
/* last field in 'union bpf_attr' used by this command */
-#define BPF_PROG_LOAD_LAST_FIELD kern_version
+#define BPF_PROG_LOAD_LAST_FIELD prog_subtype
static int bpf_prog_load(union bpf_attr *attr)
{
@@ -768,6 +768,7 @@ static int bpf_prog_load(union bpf_attr *attr)
err = find_prog_type(type, prog);
if (err < 0)
goto free_prog;
+ prog->subtype = attr->prog_subtype;
/* run eBPF verifier */
err = bpf_check(&prog, attr);
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 1bc7701466b0..9b921a9afa3c 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -655,7 +655,8 @@ static int check_ctx_access(struct bpf_verifier_env *env, int off, int size,
return 0;
if (env->prog->aux->ops->is_valid_access &&
- env->prog->aux->ops->is_valid_access(off, size, t, reg_type)) {
+ env->prog->aux->ops->is_valid_access(off, size, t, reg_type,
+ &env->prog->subtype)) {
/* remember the offset of last byte accessed in ctx */
if (env->prog->aux->max_ctx_offset < off + size)
env->prog->aux->max_ctx_offset = off + size;
@@ -1181,7 +1182,7 @@ static int check_call(struct bpf_verifier_env *env, int func_id)
}
if (env->prog->aux->ops->get_func_proto)
- fn = env->prog->aux->ops->get_func_proto(func_id);
+ fn = env->prog->aux->ops->get_func_proto(func_id, &env->prog->subtype);
if (!fn) {
verbose("unknown func %d\n", func_id);
@@ -3065,6 +3066,10 @@ int bpf_check(struct bpf_prog **prog, union bpf_attr *attr)
if ((*prog)->len <= 0 || (*prog)->len > BPF_MAXINSNS)
return -E2BIG;
+ if ((*prog)->aux->ops->is_valid_subtype &&
+ !(*prog)->aux->ops->is_valid_subtype(&(*prog)->subtype))
+ return -EINVAL;
+
/* 'struct bpf_verifier_env' can be global, but since it's not small,
* allocate/free it every time bpf_check() is called
*/
diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
index 5dcb99281259..51cf0f254bf2 100644
--- a/kernel/trace/bpf_trace.c
+++ b/kernel/trace/bpf_trace.c
@@ -435,7 +435,8 @@ static const struct bpf_func_proto *tracing_func_proto(enum bpf_func_id func_id)
}
}
-static const struct bpf_func_proto *kprobe_prog_func_proto(enum bpf_func_id func_id)
+static const struct bpf_func_proto *kprobe_prog_func_proto(enum bpf_func_id func_id,
+ union bpf_prog_subtype *prog_subtype)
{
switch (func_id) {
case BPF_FUNC_perf_event_output:
@@ -449,7 +450,8 @@ static const struct bpf_func_proto *kprobe_prog_func_proto(enum bpf_func_id func
/* bpf+kprobe programs can access fields of 'struct pt_regs' */
static bool kprobe_prog_is_valid_access(int off, int size, enum bpf_access_type type,
- enum bpf_reg_type *reg_type)
+ enum bpf_reg_type *reg_type,
+ union bpf_prog_subtype *prog_subtype)
{
if (off < 0 || off >= sizeof(struct pt_regs))
return false;
@@ -517,7 +519,8 @@ static const struct bpf_func_proto bpf_get_stackid_proto_tp = {
.arg3_type = ARG_ANYTHING,
};
-static const struct bpf_func_proto *tp_prog_func_proto(enum bpf_func_id func_id)
+static const struct bpf_func_proto *tp_prog_func_proto(enum bpf_func_id func_id,
+ union bpf_prog_subtype *prog_subtype)
{
switch (func_id) {
case BPF_FUNC_perf_event_output:
@@ -530,7 +533,8 @@ static const struct bpf_func_proto *tp_prog_func_proto(enum bpf_func_id func_id)
}
static bool tp_prog_is_valid_access(int off, int size, enum bpf_access_type type,
- enum bpf_reg_type *reg_type)
+ enum bpf_reg_type *reg_type,
+ union bpf_prog_subtype *prog_subtype)
{
if (off < sizeof(void *) || off >= PERF_MAX_TRACE_SIZE)
return false;
diff --git a/net/core/filter.c b/net/core/filter.c
index bd6eebeed5c6..a39f5956f31a 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -2483,7 +2483,8 @@ static const struct bpf_func_proto bpf_xdp_event_output_proto = {
};
static const struct bpf_func_proto *
-sk_filter_func_proto(enum bpf_func_id func_id)
+sk_filter_func_proto(enum bpf_func_id func_id,
+ union bpf_prog_subtype *prog_subtype)
{
switch (func_id) {
case BPF_FUNC_map_lookup_elem:
@@ -2509,7 +2510,8 @@ sk_filter_func_proto(enum bpf_func_id func_id)
}
static const struct bpf_func_proto *
-tc_cls_act_func_proto(enum bpf_func_id func_id)
+tc_cls_act_func_proto(enum bpf_func_id func_id,
+ union bpf_prog_subtype *prog_subtype)
{
switch (func_id) {
case BPF_FUNC_skb_store_bytes:
@@ -2563,12 +2565,12 @@ tc_cls_act_func_proto(enum bpf_func_id func_id)
case BPF_FUNC_skb_under_cgroup:
return &bpf_skb_under_cgroup_proto;
default:
- return sk_filter_func_proto(func_id);
+ return sk_filter_func_proto(func_id, prog_subtype);
}
}
static const struct bpf_func_proto *
-xdp_func_proto(enum bpf_func_id func_id)
+xdp_func_proto(enum bpf_func_id func_id, union bpf_prog_subtype *prog_subtype)
{
switch (func_id) {
case BPF_FUNC_perf_event_output:
@@ -2576,18 +2578,19 @@ xdp_func_proto(enum bpf_func_id func_id)
case BPF_FUNC_get_smp_processor_id:
return &bpf_get_smp_processor_id_proto;
default:
- return sk_filter_func_proto(func_id);
+ return sk_filter_func_proto(func_id, prog_subtype);
}
}
static const struct bpf_func_proto *
-cg_skb_func_proto(enum bpf_func_id func_id)
+cg_skb_func_proto(enum bpf_func_id func_id,
+ union bpf_prog_subtype *prog_subtype)
{
switch (func_id) {
case BPF_FUNC_skb_load_bytes:
return &bpf_skb_load_bytes_proto;
default:
- return sk_filter_func_proto(func_id);
+ return sk_filter_func_proto(func_id, prog_subtype);
}
}
@@ -2606,7 +2609,8 @@ static bool __is_valid_access(int off, int size, enum bpf_access_type type)
static bool sk_filter_is_valid_access(int off, int size,
enum bpf_access_type type,
- enum bpf_reg_type *reg_type)
+ enum bpf_reg_type *reg_type,
+ union bpf_prog_subtype *prog_subtype)
{
switch (off) {
case offsetof(struct __sk_buff, tc_classid):
@@ -2669,7 +2673,8 @@ static int tc_cls_act_prologue(struct bpf_insn *insn_buf, bool direct_write,
static bool tc_cls_act_is_valid_access(int off, int size,
enum bpf_access_type type,
- enum bpf_reg_type *reg_type)
+ enum bpf_reg_type *reg_type,
+ union bpf_prog_subtype *prog_subtype)
{
if (type == BPF_WRITE) {
switch (off) {
@@ -2712,7 +2717,8 @@ static bool __is_valid_xdp_access(int off, int size,
static bool xdp_is_valid_access(int off, int size,
enum bpf_access_type type,
- enum bpf_reg_type *reg_type)
+ enum bpf_reg_type *reg_type,
+ union bpf_prog_subtype *prog_subtype)
{
if (type == BPF_WRITE)
return false;
--
2.9.3
This new arraymap looks like a set and brings new properties:
* strong typing of entries: the eBPF functions get the array type of
elements instead of CONST_PTR_TO_MAP (e.g.
CONST_PTR_TO_LANDLOCK_HANDLE_FS);
* forced sequential filling (i.e. replace or append-only update), which
allows quick browsing of all entries.
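The sequential-filling property can be sketched in userspace as follows.
This is only a model under stated assumptions: struct handle_set,
SET_CAPACITY and the handle_set_*() helpers are hypothetical names, and the
sketch leaves out the locking and RCU handling of the kernel code.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define SET_CAPACITY 8 /* hypothetical stand-in for map.max_entries */

/* Hypothetical userspace model of a handle array, not the kernel structure. */
struct handle_set {
	size_t n_entries;        /* number of filled slots */
	int slots[SET_CAPACITY]; /* slots[0..n_entries-1] are always valid */
};

/* Replace an existing slot or append right after the last one. */
static int handle_set_update(struct handle_set *set, size_t index, int value)
{
	if (index >= SET_CAPACITY)
		return -E2BIG;
	if (index > set->n_entries)
		return -EINVAL; /* would leave a hole */
	set->slots[index] = value;
	if (index == set->n_entries)
		set->n_entries++;
	return 0;
}

/* Only the last entry may be deleted, so the array never has holes. */
static int handle_set_delete(struct handle_set *set, size_t index)
{
	if (index >= SET_CAPACITY)
		return -E2BIG;
	if (set->n_entries == 0 || index != set->n_entries - 1)
		return -EINVAL;
	set->n_entries--;
	return 0;
}

/* Browsing is a plain loop over the dense prefix. */
static int handle_set_contains(const struct handle_set *set, int value)
{
	for (size_t i = 0; i < set->n_entries; i++)
		if (set->slots[i] == value)
			return 1;
	return 0;
}
```

Because entries always form a dense prefix, a walk needs no per-slot
validity check, which is presumably what makes browsing quick.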
This strong typing is useful to statically check if the content of a map
can be passed to an eBPF function. For example, Landlock uses it to store
and manage kernel objects (e.g. struct file) instead of dealing with
userland raw data. This improves efficiency and ensures that an eBPF
program can only call functions with the right high-level arguments.
The enum bpf_map_handle_type lists low-level types (e.g.
BPF_MAP_HANDLE_TYPE_LANDLOCK_FS_FD) which are identified when
updating a map entry (handle). These handle types are used to infer a
high-level arraymap type, one of those listed in enum bpf_map_array_type
(e.g. BPF_MAP_ARRAY_TYPE_LANDLOCK_FS).
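A minimal userspace sketch of this inference, assuming that the first
update fixes the type of the whole map. The enum and function names below
are shortened stand-ins, and the *_EXAMPLE_OTHER values are hypothetical,
added only so that a mismatch can be shown.

```c
#include <assert.h>
#include <errno.h>

/* Shortened stand-ins for the enums of this patch. */
enum handle_type {
	HANDLE_TYPE_UNSPEC,
	HANDLE_TYPE_LANDLOCK_FS_FD,
	HANDLE_TYPE_EXAMPLE_OTHER, /* hypothetical, not part of the patch */
};

enum array_type {
	ARRAY_TYPE_UNSPEC,
	ARRAY_TYPE_LANDLOCK_FS,
	ARRAY_TYPE_EXAMPLE_OTHER, /* hypothetical, not part of the patch */
};

struct typed_map {
	enum array_type array_type; /* UNSPEC until the first update */
};

/* Infer the high-level array type from a low-level handle type. */
static int array_type_from_handle(enum handle_type type, enum array_type *out)
{
	switch (type) {
	case HANDLE_TYPE_LANDLOCK_FS_FD:
		*out = ARRAY_TYPE_LANDLOCK_FS;
		return 0;
	case HANDLE_TYPE_EXAMPLE_OTHER:
		*out = ARRAY_TYPE_EXAMPLE_OTHER;
		return 0;
	case HANDLE_TYPE_UNSPEC:
	default:
		return -EINVAL;
	}
}

/* The first update fixes the map type; later updates must agree with it. */
static int typed_map_check_update(struct typed_map *map, enum handle_type type)
{
	enum array_type inferred;
	int err = array_type_from_handle(type, &inferred);

	if (err)
		return err;
	if (map->array_type == ARRAY_TYPE_UNSPEC)
		map->array_type = inferred;
	else if (map->array_type != inferred)
		return -EINVAL;
	return 0;
}
```

The sketch reports errors through an int and an out parameter rather than
through the enum itself: whether a comparison such as "array_type < 0" can
ever be true depends on the underlying type the compiler picks for the enum.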
For now, this new arraymap is only used by Landlock LSM (cf. next
commits) but it could be useful for other needs.
Changes since v3:
* make handle arraymap safe (RCU) and remove buggy synchronize_rcu()
* factor out the arraymap walk
Changes since v2:
* add a RLIMIT_NOFILE-based limit to the maximum number of arraymap
handle entries (suggested by Andy Lutomirski)
* remove useless checks
Changes since v1:
* arraymap of handles replaces custom checker groups
* simpler userland API
Signed-off-by: Mickaël Salaün <[email protected]>
Cc: Alexei Starovoitov <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Daniel Borkmann <[email protected]>
Cc: David S. Miller <[email protected]>
Cc: Kees Cook <[email protected]>
Link: https://lkml.kernel.org/r/CALCETrWwTiz3kZTkEgOW24-DvhQq6LftwEXh77FD2G5o71yD7g@mail.gmail.com
---
include/linux/bpf.h | 24 +++++
include/uapi/linux/bpf.h | 21 ++++
kernel/bpf/arraymap.c | 270 +++++++++++++++++++++++++++++++++++++++++++++++
kernel/bpf/verifier.c | 20 +++-
4 files changed, 334 insertions(+), 1 deletion(-)
diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index cf87db6daf27..34b9e9cd1af7 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -13,6 +13,11 @@
#include <linux/percpu.h>
#include <linux/err.h>
+#ifdef CONFIG_SECURITY_LANDLOCK
+#include <linux/fs.h> /* struct file */
+#include <linux/spinlock_types.h> /* spinlock_t */
+#endif /* CONFIG_SECURITY_LANDLOCK */
+
struct perf_event;
struct bpf_map;
@@ -38,6 +43,7 @@ struct bpf_map_ops {
struct bpf_map {
atomic_t refcnt;
enum bpf_map_type map_type;
+ enum bpf_map_array_type map_array_type;
u32 key_size;
u32 value_size;
u32 max_entries;
@@ -80,6 +86,8 @@ enum bpf_arg_type {
ARG_PTR_TO_CTX, /* pointer to context */
ARG_ANYTHING, /* any (initialized) argument is ok */
+
+ ARG_CONST_PTR_TO_LANDLOCK_HANDLE_FS, /* pointer to Landlock FS map handle */
};
/* type of values returned from helper functions */
@@ -146,6 +154,9 @@ enum bpf_reg_type {
* map element.
*/
PTR_TO_MAP_VALUE_ADJ,
+
+ /* Landlock */
+ CONST_PTR_TO_LANDLOCK_HANDLE_FS,
};
struct bpf_prog;
@@ -196,6 +207,10 @@ struct bpf_array {
*/
enum bpf_prog_type owner_prog_type;
bool owner_jited;
+#ifdef CONFIG_SECURITY_LANDLOCK
+ atomic_t n_entries; /* number of entries in a handle array */
+ raw_spinlock_t update; /* protect n_entries consistency */
+#endif /* CONFIG_SECURITY_LANDLOCK */
union {
char value[0] __aligned(8);
void *ptrs[0] __aligned(8);
@@ -203,6 +218,15 @@ struct bpf_array {
};
};
+#ifdef CONFIG_SECURITY_LANDLOCK
+struct map_landlock_handle {
+ u32 type; /* enum bpf_map_handle_type */
+ union {
+ struct path path;
+ };
+};
+#endif /* CONFIG_SECURITY_LANDLOCK */
+
#define MAX_TAIL_CALL_CNT 32
struct bpf_event_entry {
diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index f31b655f93cf..339a9307ba6e 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -87,6 +87,18 @@ enum bpf_map_type {
BPF_MAP_TYPE_PERCPU_ARRAY,
BPF_MAP_TYPE_STACK_TRACE,
BPF_MAP_TYPE_CGROUP_ARRAY,
+ BPF_MAP_TYPE_LANDLOCK_ARRAY,
+};
+
+enum bpf_map_array_type {
+ BPF_MAP_ARRAY_TYPE_UNSPEC,
+ BPF_MAP_ARRAY_TYPE_LANDLOCK_FS,
+};
+
+enum bpf_map_handle_type {
+ BPF_MAP_HANDLE_TYPE_UNSPEC,
+ BPF_MAP_HANDLE_TYPE_LANDLOCK_FS_FD,
+ /* BPF_MAP_HANDLE_TYPE_LANDLOCK_FS_GLOB, */
};
enum bpf_prog_type {
@@ -538,4 +550,13 @@ struct xdp_md {
__u32 data_end;
};
+/* Map handle entry */
+struct landlock_handle {
+ __u32 type; /* enum bpf_map_handle_type */
+ union {
+ __u32 fd;
+ __aligned_u64 glob;
+ };
+} __attribute__((aligned(8)));
+
#endif /* _UAPI__LINUX_BPF_H__ */
diff --git a/kernel/bpf/arraymap.c b/kernel/bpf/arraymap.c
index a2ac051c342f..3d045ee71eef 100644
--- a/kernel/bpf/arraymap.c
+++ b/kernel/bpf/arraymap.c
@@ -16,6 +16,15 @@
#include <linux/mm.h>
#include <linux/filter.h>
#include <linux/perf_event.h>
+#include <linux/file.h> /* fput() */
+#include <linux/fs.h> /* struct file */
+
+#ifdef CONFIG_SECURITY_LANDLOCK
+#include <asm/resource.h> /* RLIMIT_NOFILE */
+#include <linux/mount.h> /* struct vfsmount, MNT_INTERNAL */
+#include <linux/path.h> /* path_get(), path_put() */
+#include <linux/sched.h> /* rlimit() */
+#endif /* CONFIG_SECURITY_LANDLOCK */
static void bpf_array_free_percpu(struct bpf_array *array)
{
@@ -89,6 +98,10 @@ static struct bpf_map *array_map_alloc(union bpf_attr *attr)
array->map.value_size = attr->value_size;
array->map.max_entries = attr->max_entries;
array->elem_size = elem_size;
+#ifdef CONFIG_SECURITY_LANDLOCK
+ atomic_set(&array->n_entries, 0);
+ raw_spin_lock_init(&array->update);
+#endif /* CONFIG_SECURITY_LANDLOCK */
if (!percpu)
goto out;
@@ -580,3 +593,260 @@ static int __init register_cgroup_array_map(void)
}
late_initcall(register_cgroup_array_map);
#endif
+
+#ifdef CONFIG_SECURITY_LANDLOCK
+
+static struct bpf_map *landlock_array_map_alloc(union bpf_attr *attr)
+{
+ if (attr->value_size != sizeof(struct landlock_handle))
+ return ERR_PTR(-EINVAL);
+ /* XXX: FD arraymap works because elem_size = round_up(attr->value_size, 8) */
+ /* XXX: do we want memory with GFP_USER? */
+ return array_map_alloc(attr);
+}
+
+static void landlock_free_handle(struct map_landlock_handle *handle)
+{
+ enum bpf_map_handle_type handle_type;
+
+ if (WARN_ON(!handle))
+ return;
+ handle_type = handle->type;
+
+ switch (handle_type) {
+ case BPF_MAP_HANDLE_TYPE_LANDLOCK_FS_FD:
+ path_put(&handle->path);
+ break;
+ case BPF_MAP_HANDLE_TYPE_UNSPEC:
+ default:
+ WARN_ON(1);
+ }
+ kfree(handle);
+}
+
+/* called when map->refcnt goes to zero, either from workqueue or from syscall */
+static void landlock_array_map_free(struct bpf_map *map)
+{
+ struct bpf_array *array = container_of(map, struct bpf_array, map);
+ struct map_landlock_handle **handle;
+ size_t i;
+
+ /* wait for all eBPF programs to complete before freeing the map */
+ synchronize_rcu();
+
+ for (i = 0, handle = (struct map_landlock_handle **) array->value;
+ i < atomic_read(&array->n_entries);
+ i++, handle = (struct map_landlock_handle **)
+ (array->value + array->elem_size * i)) {
+ landlock_free_handle(*handle);
+ }
+ kvfree(array);
+}
+
+static enum bpf_map_array_type landlock_get_array_type(
+ enum bpf_map_handle_type handle_type)
+{
+ switch (handle_type) {
+ case BPF_MAP_HANDLE_TYPE_LANDLOCK_FS_FD:
+ return BPF_MAP_ARRAY_TYPE_LANDLOCK_FS;
+ case BPF_MAP_HANDLE_TYPE_UNSPEC:
+ default:
+ return -EINVAL;
+ }
+}
+
+/**
+ * landlock_new_handle - store an user handle in an arraymap entry
+ *
+ * @handle: non-NULL user-side Landlock handle source
+ *
+ * Return a new Landlock handle
+ */
+static inline struct map_landlock_handle *landlock_new_handle(
+ struct landlock_handle *handle)
+{
+ enum bpf_map_handle_type handle_type = handle->type;
+ struct file *handle_file;
+ struct map_landlock_handle *ret;
+
+ /* access control already done for the FD */
+
+ switch (handle_type) {
+ case BPF_MAP_HANDLE_TYPE_LANDLOCK_FS_FD:
+ handle_file = fget(handle->fd);
+ if (IS_ERR(handle_file))
+ return ERR_CAST(handle_file);
+ /* check if the FD is tied to a user mount point */
+ if (unlikely(handle_file->f_path.mnt->mnt_flags & MNT_INTERNAL)) {
+ fput(handle_file);
+ return ERR_PTR(-EINVAL);
+ }
+ path_get(&handle_file->f_path);
+ ret = kmalloc(sizeof(*ret), GFP_KERNEL);
+ ret->path = handle_file->f_path;
+ fput(handle_file);
+ break;
+ case BPF_MAP_HANDLE_TYPE_UNSPEC:
+ default:
+ return ERR_PTR(-EINVAL);
+ }
+ ret->type = handle_type;
+ return ret;
+}
+
+static void *nop_map_lookup_elem(struct bpf_map *map, void *key)
+{
+ return ERR_PTR(-EINVAL);
+}
+
+/* called from syscall or from eBPF program */
+static int landlock_array_map_update_elem(struct bpf_map *map, void *key,
+ void *value, u64 map_flags)
+{
+ struct bpf_array *array = container_of(map, struct bpf_array, map);
+ u32 index = *(u32 *)key;
+ enum bpf_map_array_type array_type;
+ int ret, n_entries;
+ struct landlock_handle *khandle = (struct landlock_handle *)value;
+ struct map_landlock_handle **handle_ref, *handle_old, *handle_new;
+ unsigned long flags;
+
+ if (unlikely(map_flags > BPF_EXIST))
+ /* unknown flags */
+ return -EINVAL;
+
+ /*
+ * Limit number of entries in an arraymap of handles to the maximum
+ * number of open files for the current process. The maximum number of
+ * handle entries (including all arraymaps) for a process is then
+ * (RLIMIT_NOFILE - 1) * RLIMIT_NOFILE. If the process' RLIMIT_NOFILE
+ * is 0, then any entry update is forbidden.
+ *
+ * An eBPF program can inherit all the arraymap FD. The worse case is
+ * to fill a bunch of arraymaps, create an eBPF program, close the
+ * arraymap FDs, and start again. The maximum number of arraymap
+ * entries can then be close to RLIMIT_NOFILE^3.
+ *
+ * FIXME: This should be improved... any idea?
+ */
+ if (unlikely(index >= rlimit(RLIMIT_NOFILE)))
+ return -EMFILE;
+
+ if (unlikely(index >= array->map.max_entries))
+ /* all elements were pre-allocated, cannot insert a new one */
+ return -E2BIG;
+
+ /* TODO: handle all flags, not only BPF_ANY */
+ if (unlikely(map_flags == BPF_NOEXIST))
+ /* all elements already exist */
+ return -EEXIST;
+
+ if (unlikely(!khandle))
+ return -EINVAL;
+
+ array_type = landlock_get_array_type(khandle->type);
+ if (array_type < 0)
+ return array_type;
+
+ if (!map->map_array_type) {
+ /* set the initial set type */
+ map->map_array_type = array_type;
+ } else if (map->map_array_type != array_type) {
+ return -EINVAL;
+ }
+
+ WARN_ON_ONCE(!rcu_read_lock_held());
+
+ /* bpf_map_update_elem() can be called in_irq() */
+ raw_spin_lock_irqsave(&array->update, flags);
+ n_entries = atomic_read(&array->n_entries);
+
+ if (unlikely(index > n_entries)) {
+ /* only replace an existing entry or append a new one */
+ ret = -EINVAL;
+ goto err;
+ }
+
+ handle_new = landlock_new_handle(khandle);
+ if (IS_ERR(handle_new)) {
+ ret = PTR_ERR(handle_new);
+ goto err;
+ }
+
+ handle_ref = (struct map_landlock_handle **)
+ (array->value + array->elem_size * index);
+ handle_old = xchg(handle_ref, handle_new);
+ if (index == n_entries)
+ atomic_inc(&array->n_entries);
+ raw_spin_unlock_irqrestore(&array->update, flags);
+
+ if (index != n_entries) {
+ synchronize_rcu();
+ landlock_free_handle(handle_old);
+ }
+ return 0;
+
+err:
+ raw_spin_unlock_irqrestore(&array->update, flags);
+ return ret;
+}
+
+/* called from syscall or from eBPF program */
+static int landlock_array_map_delete_elem(struct bpf_map *map, void *key)
+{
+ struct bpf_array *array = container_of(map, struct bpf_array, map);
+ u32 index = *(u32 *)key;
+ struct map_landlock_handle *handle_old, **handle_ref;
+ unsigned long flags;
+ int n_entries;
+
+ if (unlikely(index >= array->map.max_entries))
+ return -E2BIG;
+
+ WARN_ON_ONCE(!rcu_read_lock_held());
+
+ /* bpf_map_delete_elem() can be called in_irq() */
+ raw_spin_lock_irqsave(&array->update, flags);
+ n_entries = atomic_read(&array->n_entries);
+
+ /* only delete the last element: forbid holes in the array */
+ if (!n_entries || index != (n_entries - 1))
+ goto err;
+
+ atomic_dec(&array->n_entries);
+ handle_ref = (struct map_landlock_handle **)
+ (array->value + array->elem_size * index);
+ handle_old = xchg(handle_ref, NULL);
+ raw_spin_unlock_irqrestore(&array->update, flags);
+
+ synchronize_rcu();
+ landlock_free_handle(handle_old);
+ return 0;
+
+err:
+ raw_spin_unlock_irqrestore(&array->update, flags);
+ return -EINVAL;
+}
+
+static const struct bpf_map_ops landlock_array_ops = {
+ .map_alloc = landlock_array_map_alloc,
+ .map_free = landlock_array_map_free,
+ .map_get_next_key = array_map_get_next_key,
+ .map_lookup_elem = nop_map_lookup_elem,
+ .map_update_elem = landlock_array_map_update_elem,
+ .map_delete_elem = landlock_array_map_delete_elem,
+};
+
+static struct bpf_map_type_list landlock_array_type __read_mostly = {
+ .ops = &landlock_array_ops,
+ .type = BPF_MAP_TYPE_LANDLOCK_ARRAY,
+};
+
+static int __init register_landlock_array_map(void)
+{
+ bpf_register_map_type(&landlock_array_type);
+ return 0;
+}
+
+late_initcall(register_landlock_array_map);
+#endif /* CONFIG_SECURITY_LANDLOCK */
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 99a7e5b388f2..1bc7701466b0 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -188,6 +188,7 @@ static const char * const reg_type_str[] = {
[CONST_IMM] = "imm",
[PTR_TO_PACKET] = "pkt",
[PTR_TO_PACKET_END] = "pkt_end",
+ [CONST_PTR_TO_LANDLOCK_HANDLE_FS] = "landlock_handle_fs",
};
static void print_verifier_state(struct bpf_verifier_state *state)
@@ -513,6 +514,7 @@ static bool is_spillable_regtype(enum bpf_reg_type type)
case PTR_TO_PACKET_END:
case FRAME_PTR:
case CONST_PTR_TO_MAP:
+ case CONST_PTR_TO_LANDLOCK_HANDLE_FS:
return true;
default:
return false;
@@ -973,6 +975,10 @@ static int check_func_arg(struct bpf_verifier_env *env, u32 regno,
expected_type = PTR_TO_CTX;
if (type != expected_type)
goto err_type;
+ } else if (arg_type == ARG_CONST_PTR_TO_LANDLOCK_HANDLE_FS) {
+ expected_type = CONST_PTR_TO_LANDLOCK_HANDLE_FS;
+ if (type != expected_type)
+ goto err_type;
} else if (arg_type == ARG_PTR_TO_STACK ||
arg_type == ARG_PTR_TO_RAW_STACK) {
expected_type = PTR_TO_STACK;
@@ -2031,6 +2037,17 @@ static struct bpf_map *ld_imm64_to_map_ptr(struct bpf_insn *insn)
return (struct bpf_map *) (unsigned long) imm64;
}
+static inline enum bpf_reg_type bpf_reg_type_from_map(struct bpf_map *map)
+{
+ switch (map->map_array_type) {
+ case BPF_MAP_ARRAY_TYPE_LANDLOCK_FS:
+ return CONST_PTR_TO_LANDLOCK_HANDLE_FS;
+ case BPF_MAP_ARRAY_TYPE_UNSPEC:
+ default:
+ return CONST_PTR_TO_MAP;
+ }
+}
+
/* verify BPF_LD_IMM64 instruction */
static int check_ld_imm(struct bpf_verifier_env *env, struct bpf_insn *insn)
{
@@ -2067,8 +2084,9 @@ static int check_ld_imm(struct bpf_verifier_env *env, struct bpf_insn *insn)
/* replace_map_fd_with_map_ptr() should have caught bad ld_imm64 */
BUG_ON(insn->src_reg != BPF_PSEUDO_MAP_FD);
- regs[insn->dst_reg].type = CONST_PTR_TO_MAP;
regs[insn->dst_reg].map_ptr = ld_imm64_to_map_ptr(insn);
+ regs[insn->dst_reg].type =
+ bpf_reg_type_from_map(regs[insn->dst_reg].map_ptr);
return 0;
}
--
2.9.3
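The semantics are unchanged: put_seccomp_filter() now takes a filter
instead of a task, and the new put_seccomp() wrapper keeps the task-based
interface for callers outside of seccomp.c. This will be useful for the
Landlock integration with seccomp (next commit).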
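The shape of this split can be sketched in userspace with C11 atomics. All
names below (struct filter, filter_put(), task_put(), freed_count) are
hypothetical stand-ins, and the model assumes that each filter holds one
reference on its parent through a prev link.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical stand-in for a filter: only a refcount and a parent link. */
struct filter {
	atomic_int usage;
	struct filter *prev; /* parent filter, referenced by this node */
};

static int freed_count; /* bookkeeping for this example only */

static struct filter *filter_new(struct filter *prev)
{
	struct filter *f = calloc(1, sizeof(*f));

	if (!f)
		return NULL;
	atomic_init(&f->usage, 1);
	f->prev = prev; /* takes over the caller's reference on prev */
	return f;
}

static void filter_get(struct filter *f)
{
	atomic_fetch_add(&f->usage, 1);
}

/* Drop one reference, given the filter itself rather than its task. */
static void filter_put(struct filter *f)
{
	/* Walk up the chain while each node loses its last reference. */
	while (f && atomic_fetch_sub(&f->usage, 1) == 1) {
		struct filter *freeme = f;

		f = f->prev;
		free(freeme);
		freed_count++;
	}
}

/* Task-level wrapper, analogous in shape to put_seccomp(). */
struct task {
	struct filter *filter;
};

static void task_put(struct task *t)
{
	filter_put(t->filter);
}
```

With the filter-based function, other code holding a filter pointer can
release it without going through a task.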
Signed-off-by: Mickaël Salaün <[email protected]>
Cc: Kees Cook <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Will Drewry <[email protected]>
---
include/linux/seccomp.h | 4 ++--
kernel/fork.c | 2 +-
kernel/seccomp.c | 18 +++++++++++++-----
3 files changed, 16 insertions(+), 8 deletions(-)
diff --git a/include/linux/seccomp.h b/include/linux/seccomp.h
index ecc296c137cd..e25aee2cdfc0 100644
--- a/include/linux/seccomp.h
+++ b/include/linux/seccomp.h
@@ -77,10 +77,10 @@ static inline int seccomp_mode(struct seccomp *s)
#endif /* CONFIG_SECCOMP */
#ifdef CONFIG_SECCOMP_FILTER
-extern void put_seccomp_filter(struct task_struct *tsk);
+extern void put_seccomp(struct task_struct *tsk);
extern void get_seccomp_filter(struct task_struct *tsk);
#else /* CONFIG_SECCOMP_FILTER */
-static inline void put_seccomp_filter(struct task_struct *tsk)
+static inline void put_seccomp(struct task_struct *tsk)
{
return;
}
diff --git a/kernel/fork.c b/kernel/fork.c
index 623259fc794d..0690e43bdda5 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -349,7 +349,7 @@ void free_task(struct task_struct *tsk)
#endif
rt_mutex_debug_task_free(tsk);
ftrace_graph_exit_task(tsk);
- put_seccomp_filter(tsk);
+ put_seccomp(tsk);
arch_release_task_struct(tsk);
free_task_struct(tsk);
}
diff --git a/kernel/seccomp.c b/kernel/seccomp.c
index 0db7c8a2afe2..e741a82eab4d 100644
--- a/kernel/seccomp.c
+++ b/kernel/seccomp.c
@@ -63,6 +63,8 @@ struct seccomp_filter {
/* Limit any path through the tree to 256KB worth of instructions. */
#define MAX_INSNS_PER_PATH ((1 << 18) / sizeof(struct sock_filter))
+static void put_seccomp_filter(struct seccomp_filter *filter);
+
/*
* Endianness is explicitly ignored and left for BPF program authors to manage
* as per the specific architecture.
@@ -313,7 +315,7 @@ static inline void seccomp_sync_threads(void)
* current's path will hold a reference. (This also
* allows a put before the assignment.)
*/
- put_seccomp_filter(thread);
+ put_seccomp_filter(thread->seccomp.filter);
smp_store_release(&thread->seccomp.filter,
caller->seccomp.filter);
@@ -475,10 +477,11 @@ static inline void seccomp_filter_free(struct seccomp_filter *filter)
}
}
-/* put_seccomp_filter - decrements the ref count of tsk->seccomp.filter */
-void put_seccomp_filter(struct task_struct *tsk)
+/* put_seccomp_filter - decrements the ref count of a filter */
+static void put_seccomp_filter(struct seccomp_filter *filter)
{
- struct seccomp_filter *orig = tsk->seccomp.filter;
+ struct seccomp_filter *orig = filter;
+
/* Clean up single-reference branches iteratively. */
while (orig && atomic_dec_and_test(&orig->usage)) {
struct seccomp_filter *freeme = orig;
@@ -487,6 +490,11 @@ void put_seccomp_filter(struct task_struct *tsk)
}
}
+void put_seccomp(struct task_struct *tsk)
+{
+ put_seccomp_filter(tsk->seccomp.filter);
+}
+
/**
* seccomp_send_sigsys - signals the task to allow in-process syscall emulation
* @syscall: syscall number to send to userland
@@ -898,7 +906,7 @@ long seccomp_get_filter(struct task_struct *task, unsigned long filter_off,
if (copy_to_user(data, fprog->filter, bpf_classic_proglen(fprog)))
ret = -EFAULT;
- put_seccomp_filter(task);
+ put_seccomp_filter(task->seccomp.filter);
return ret;
out:
--
2.9.3
On Wednesday, October 26, 2016 8:56:38 AM CEST Mickaël Salaün wrote:
> include/linux/bpf.h | 6 ++++++
> kernel/bpf/syscall.c | 6 ------
> 2 files changed, 6 insertions(+), 6 deletions(-)
>
> diff --git a/include/linux/bpf.h b/include/linux/bpf.h
> index c201017b5730..cf87db6daf27 100644
> --- a/include/linux/bpf.h
> +++ b/include/linux/bpf.h
> @@ -283,6 +283,12 @@ static inline void bpf_long_memcpy(void *dst, const void *src, u32 size)
>
> /* verify correctness of eBPF program */
> int bpf_check(struct bpf_prog **fp, union bpf_attr *attr);
> +
> +/* helper to convert user pointers passed inside __aligned_u64 fields */
> +static inline void __user *u64_to_ptr(__u64 val)
> +{
> + return (void __user *) (unsigned long) val;
> +}
> #else
>
We already have at least six copies of this helper:
fs/btrfs/qgroup.c:#define u64_to_ptr(x) ((struct btrfs_qgroup *)(uintptr_t)x)
kernel/bpf/syscall.c:static void __user *u64_to_ptr(__u64 val)
drivers/staging/android/ion/ion_test.c:#define u64_to_uptr(x) ((void __user *)(unsigned long)(x))
drivers/firewire/core-cdev.c:static void __user *u64_to_uptr(u64 value)
drivers/staging/android/ion/ion_test.c:#define u64_to_uptr(x) ((void __user *)(unsigned long)(x))
include/linux/kernel.h:#define u64_to_user_ptr(x) ( \
Just use the one in linux/kernel.h
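For readers unfamiliar with the pattern, here is a userspace illustration
of what such a helper does. The *_example names are hypothetical; kernel
code would call u64_to_user_ptr() and keep the __user annotation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace illustration only, not the kernel macro. */
static inline void *u64_to_ptr_example(uint64_t val)
{
	/* Go through uintptr_t so the narrowing on 32-bit targets is explicit. */
	return (void *)(uintptr_t)val;
}

/* Inverse conversion, as a userland caller would do when filling a u64 field. */
static inline uint64_t ptr_to_u64_example(const void *ptr)
{
	return (uint64_t)(uintptr_t)ptr;
}
```

Carrying pointers in fixed 64-bit fields keeps the structure layout
identical for 32-bit and 64-bit userland.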
Arnd
On Wed, Oct 26, 2016 at 09:19:08AM +0200, Arnd Bergmann wrote:
> On Wednesday, October 26, 2016 8:56:38 AM CEST Mickaël Salaün wrote:
> > include/linux/bpf.h | 6 ++++++
> > kernel/bpf/syscall.c | 6 ------
> > 2 files changed, 6 insertions(+), 6 deletions(-)
> >
> > diff --git a/include/linux/bpf.h b/include/linux/bpf.h
> > index c201017b5730..cf87db6daf27 100644
> > --- a/include/linux/bpf.h
> > +++ b/include/linux/bpf.h
> > @@ -283,6 +283,12 @@ static inline void bpf_long_memcpy(void *dst, const void *src, u32 size)
> >
> > /* verify correctness of eBPF program */
> > int bpf_check(struct bpf_prog **fp, union bpf_attr *attr);
> > +
> > +/* helper to convert user pointers passed inside __aligned_u64 fields */
> > +static inline void __user *u64_to_ptr(__u64 val)
> > +{
> > + return (void __user *) (unsigned long) val;
> > +}
> > #else
> >
>
>
> We already have at least six copies of this helper:
>
> fs/btrfs/qgroup.c:#define u64_to_ptr(x) ((struct btrfs_qgroup *)(uintptr_t)x)
This one does not take __user pointers, unlike the rest. Anyway, the name is
misleading, I'll send a cleanup. Thanks.
> kernel/bpf/syscall.c:static void __user *u64_to_ptr(__u64 val)
> drivers/staging/android/ion/ion_test.c:#define u64_to_uptr(x) ((void __user *)(unsigned long)(x))
> drivers/firewire/core-cdev.c:static void __user *u64_to_uptr(u64 value)
> drivers/staging/android/ion/ion_test.c:#define u64_to_uptr(x) ((void __user *)(unsigned long)(x))
> include/linux/kernel.h:#define u64_to_user_ptr(x) ( \
>
> Just use the one in linux/kernel.h
>
> Arnd
On Wed, Oct 26, 2016 at 08:56:36AM +0200, Mickaël Salaün wrote:
> The loaded Landlock eBPF programs can be triggered by a seccomp filter
> returning RET_LANDLOCK. In addition, a cookie (16-bit value) can be passed from
> a seccomp filter to eBPF programs. This allows flexible security policies
> between seccomp and Landlock.
Is this still up to date, or was that removed in v3?
On 26/10/2016 16:52, Jann Horn wrote:
> On Wed, Oct 26, 2016 at 08:56:36AM +0200, Mickaël Salaün wrote:
>> The loaded Landlock eBPF programs can be triggered by a seccomp filter
>> returning RET_LANDLOCK. In addition, a cookie (16-bit value) can be passed from
>> a seccomp filter to eBPF programs. This allows flexible security policies
>> between seccomp and Landlock.
>
> Is this still up to date, or was that removed in v3?
>
I forgot to remove this part. In this v4 series, as described in the
(small) patch 11/18, a Landlock rule cannot be triggered by a seccomp
filter, so there is no longer any RET_LANDLOCK value or cookie.
On 26/10/2016 18:56, Mickaël Salaün wrote:
>
> On 26/10/2016 16:52, Jann Horn wrote:
>> On Wed, Oct 26, 2016 at 08:56:36AM +0200, Mickaël Salaün wrote:
>>> The loaded Landlock eBPF programs can be triggered by a seccomp filter
>>> returning RET_LANDLOCK. In addition, a cookie (16-bit value) can be passed from
>>> a seccomp filter to eBPF programs. This allows flexible security policies
>>> between seccomp and Landlock.
>>
>> Is this still up to date, or was that removed in v3?
>>
>
> I forgot to remove this part. In this v4 series, as described in the
> (small) patch 11/18, a Landlock rule cannot be triggered by a seccomp
> filter, so there is no longer any RET_LANDLOCK value or cookie.
>
Here is an up-to-date version:
# Use case scenario
First, a process needs to create a new dedicated eBPF map containing handles.
These handles are references to system resources (e.g. a file or a directory).
They are grouped in one or multiple maps to be efficiently managed and checked
in batches. This kind of map can be passed to Landlock eBPF functions to
compare its entries, for example, with a file access request.
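As a rough userspace sketch, a handle entry for an already-open file
descriptor could be prepared as below. The structure mirrors the uapi
layout proposed in this series; make_fs_handle() is a hypothetical helper,
and the map type that would accept such a value exists only with this
series applied.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Userspace mirror of the uapi definitions proposed in this series. */
enum bpf_map_handle_type {
	BPF_MAP_HANDLE_TYPE_UNSPEC,
	BPF_MAP_HANDLE_TYPE_LANDLOCK_FS_FD,
};

struct landlock_handle {
	uint32_t type; /* enum bpf_map_handle_type */
	union {
		uint32_t fd;
		uint64_t glob __attribute__((aligned(8)));
	};
} __attribute__((aligned(8)));

/* Hypothetical helper: describe an already-open file or directory FD. */
static struct landlock_handle make_fs_handle(int fd)
{
	struct landlock_handle handle;

	memset(&handle, 0, sizeof(handle)); /* no stale bytes in padding */
	handle.type = BPF_MAP_HANDLE_TYPE_LANDLOCK_FS_FD;
	handle.fd = (uint32_t)fd;
	return handle;
}
```

The resulting value would then be stored at the next free index of the
handle map through a regular map update, after which the FD itself can be
closed if the kernel side keeps its own reference, as the patch appears to do.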
Then, a task needs to create or receive a Landlock rule. This rule is a
dedicated eBPF program tied to one of the Landlock hooks, which are a subset of
LSM hooks. Once loaded, a Landlock rule can be enforced through the seccomp(2)
syscall for the current thread and its (future) children, like a seccomp
filter.
Another way to enforce a Landlock security policy is to attach Landlock rules
to a cgroup. All the processes in this cgroup will then be subject to this
policy.
A triggered Landlock eBPF program can allow or deny an access, according to
its subtype (i.e. LSM hook), thanks to errno return values.
On Wed, Oct 26, 2016 at 08:56:39AM +0200, Mickaël Salaün wrote:
> This new arraymap looks like a set and brings new properties:
> * strong typing of entries: the eBPF functions get the array type of
> elements instead of CONST_PTR_TO_MAP (e.g.
> CONST_PTR_TO_LANDLOCK_HANDLE_FS);
> * forced sequential filling (i.e. replace or append-only update), which
> allows quick browsing of all entries.
>
> This strong typing is useful to statically check if the content of a map
> can be passed to an eBPF function. For example, Landlock uses it to store
> and manage kernel objects (e.g. struct file) instead of dealing with
> userland raw data. This improves efficiency and ensures that an eBPF
> program can only call functions with the right high-level arguments.
>
> The enum bpf_map_handle_type lists low-level types (e.g.
> BPF_MAP_HANDLE_TYPE_LANDLOCK_FS_FD) which are identified when
> updating a map entry (handle). These handle types are used to infer a
> high-level arraymap type, one of those listed in enum bpf_map_array_type
> (e.g. BPF_MAP_ARRAY_TYPE_LANDLOCK_FS).
>
> For now, this new arraymap is only used by Landlock LSM (cf. next
> commits) but it could be useful for other needs.
>
> Changes since v3:
> * make handle arraymap safe (RCU) and remove buggy synchronize_rcu()
> * factor out the arraymap walk
>
> Changes since v2:
> * add a RLIMIT_NOFILE-based limit to the maximum number of arraymap
> handle entries (suggested by Andy Lutomirski)
> * remove useless checks
>
> Changes since v1:
> * arraymap of handles replaces custom checker groups
> * simpler userland API
[...]
> + case BPF_MAP_HANDLE_TYPE_LANDLOCK_FS_FD:
> + handle_file = fget(handle->fd);
> + if (IS_ERR(handle_file))
> + return ERR_CAST(handle_file);
> + /* check if the FD is tied to a user mount point */
> + if (unlikely(handle_file->f_path.mnt->mnt_flags & MNT_INTERNAL)) {
> + fput(handle_file);
> + return ERR_PTR(-EINVAL);
> + }
> + path_get(&handle_file->f_path);
> + ret = kmalloc(sizeof(*ret), GFP_KERNEL);
> + ret->path = handle_file->f_path;
> + fput(handle_file);
You can use fdget() and fdput() here because the reference to
handle_file is dropped before the end of the syscall.
> + break;
> + case BPF_MAP_HANDLE_TYPE_UNSPEC:
> + default:
> + return ERR_PTR(-EINVAL);
> + }
> + ret->type = handle_type;
> + return ret;
> +}
> +
> +static void *nop_map_lookup_elem(struct bpf_map *map, void *key)
> +{
> + return ERR_PTR(-EINVAL);
> +}
> +
> +/* called from syscall or from eBPF program */
> +static int landlock_array_map_update_elem(struct bpf_map *map, void *key,
> + void *value, u64 map_flags)
> +{
This being callable from eBPF context is IMO pretty surprising and should
at least be well-documented. Also: What is the use case here?
On 26/10/2016 21:01, Jann Horn wrote:
> On Wed, Oct 26, 2016 at 08:56:39AM +0200, Mickaël Salaün wrote:
>> This new arraymap looks like a set and brings new properties:
>> * strong typing of entries: the eBPF functions get the array type of
>> elements instead of CONST_PTR_TO_MAP (e.g.
>> CONST_PTR_TO_LANDLOCK_HANDLE_FS);
>> * force sequential filling (i.e. replace or append-only update), which
>> allow quick browsing of all entries.
>>
>> This strong typing is useful to statically check if the content of a map
>> can be passed to an eBPF function. For example, Landlock use it to store
>> and manage kernel objects (e.g. struct file) instead of dealing with
>> userland raw data. This improve efficiency and ensure that an eBPF
>> program can only call functions with the right high-level arguments.
>>
>> The enum bpf_map_handle_type list low-level types (e.g.
>> BPF_MAP_HANDLE_TYPE_LANDLOCK_FS_FD) which are identified when
>> updating a map entry (handle). This handle types are used to infer a
>> high-level arraymap type which are listed in enum bpf_map_array_type
>> (e.g. BPF_MAP_ARRAY_TYPE_LANDLOCK_FS).
>>
>> For now, this new arraymap is only used by Landlock LSM (cf. next
>> commits) but it could be useful for other needs.
>>
>> Changes since v3:
>> * make handle arraymap safe (RCU) and remove buggy synchronize_rcu()
>> * factor out the arraymap walk
>>
>> Changes since v2:
>> * add a RLIMIT_NOFILE-based limit to the maximum number of arraymap
>> handle entries (suggested by Andy Lutomirski)
>> * remove useless checks
>>
>> Changes since v1:
>> * arraymap of handles replace custom checker groups
>> * simpler userland API
> [...]
>> + case BPF_MAP_HANDLE_TYPE_LANDLOCK_FS_FD:
>> + handle_file = fget(handle->fd);
>> + if (IS_ERR(handle_file))
>> + return ERR_CAST(handle_file);
>> + /* check if the FD is tied to a user mount point */
>> + if (unlikely(handle_file->f_path.mnt->mnt_flags & MNT_INTERNAL)) {
>> + fput(handle_file);
>> + return ERR_PTR(-EINVAL);
>> + }
>> + path_get(&handle_file->f_path);
>> + ret = kmalloc(sizeof(*ret), GFP_KERNEL);
>> + ret->path = handle_file->f_path;
>> + fput(handle_file);
>
> You can use fdget() and fdput() here because the reference to
> handle_file is dropped before the end of the syscall.
The reference to handle_file is dropped but not the reference to the
(inner) path thanks to path_get().
>
>
>> + break;
>> + case BPF_MAP_HANDLE_TYPE_UNSPEC:
>> + default:
>> + return ERR_PTR(-EINVAL);
>> + }
>> + ret->type = handle_type;
>> + return ret;
>> +}
>> +
>> +static void *nop_map_lookup_elem(struct bpf_map *map, void *key)
>> +{
>> + return ERR_PTR(-EINVAL);
>> +}
>> +
>> +/* called from syscall or from eBPF program */
>> +static int landlock_array_map_update_elem(struct bpf_map *map, void *key,
>> + void *value, u64 map_flags)
>> +{
>
> This being callable from eBPF context is IMO pretty surprising and should
> at least be well-documented. Also: What is the usecase here?
>
This may be callable but is restricted to CAP_SYS_ADMIN. Any update with
an FD should indeed be denied, but updates with raw values (e.g. GLOB,
IP or port numbers, not yet implemented) may be allowed. Because an eBPF
program is trusted by the process which loaded it (and has the same
rights), this program should be able to update a map like its creator
process. One use case may be to adjust a map of handles by removing
entries or tightening them (i.e. dropping privileges) when a specific
behavior of a monitored process is detected by the eBPF program.
I'm going to fix this update-with-FD case, which makes no sense anyway.
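A minimal sketch of the intended policy, assuming a hypothetical helper
that knows who triggered the update. The toy_* names, the raw-value
handle type and the -EINVAL return code are illustrative assumptions,
not part of the patch:

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical handle kinds: an FD-backed handle and a raw-value handle. */
enum toy_handle_type { TOY_HANDLE_UNSPEC, TOY_HANDLE_FS_FD, TOY_HANDLE_RAW };

/* Hypothetical origin of the update request. */
enum toy_caller { TOY_CALLER_SYSCALL, TOY_CALLER_EBPF };

/* Return 0 if the update may proceed, a negative errno otherwise. */
static int toy_update_allowed(enum toy_caller caller,
			      enum toy_handle_type type)
{
	switch (type) {
	case TOY_HANDLE_FS_FD:
		/* An FD is only accepted when it comes from the syscall path. */
		return caller == TOY_CALLER_SYSCALL ? 0 : -EINVAL;
	case TOY_HANDLE_RAW:
		/* Raw values carry no per-task state: allowed for both callers. */
		return 0;
	case TOY_HANDLE_UNSPEC:
	default:
		return -EINVAL;
	}
}
```

With such a check, an eBPF program could still tighten a map with raw
values, while FD-based entries could only be added through bpf(2).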
Thanks,
Mickaël
On Wed, Oct 26, 2016 at 10:03:09PM +0200, Mickaël Salaün wrote:
> On 26/10/2016 21:01, Jann Horn wrote:
> > On Wed, Oct 26, 2016 at 08:56:39AM +0200, Mickaël Salaün wrote:
> >> This new arraymap looks like a set and brings new properties:
> >> * strong typing of entries: the eBPF functions get the array type of
> >> elements instead of CONST_PTR_TO_MAP (e.g.
> >> CONST_PTR_TO_LANDLOCK_HANDLE_FS);
> >> * force sequential filling (i.e. replace or append-only update), which
> >> allow quick browsing of all entries.
> >>
> >> This strong typing is useful to statically check if the content of a map
> >> can be passed to an eBPF function. For example, Landlock use it to store
> >> and manage kernel objects (e.g. struct file) instead of dealing with
> >> userland raw data. This improve efficiency and ensure that an eBPF
> >> program can only call functions with the right high-level arguments.
> >>
> >> The enum bpf_map_handle_type list low-level types (e.g.
> >> BPF_MAP_HANDLE_TYPE_LANDLOCK_FS_FD) which are identified when
> >> updating a map entry (handle). This handle types are used to infer a
> >> high-level arraymap type which are listed in enum bpf_map_array_type
> >> (e.g. BPF_MAP_ARRAY_TYPE_LANDLOCK_FS).
> >>
> >> For now, this new arraymap is only used by Landlock LSM (cf. next
> >> commits) but it could be useful for other needs.
> >>
> >> Changes since v3:
> >> * make handle arraymap safe (RCU) and remove buggy synchronize_rcu()
> >> * factor out the arraymap walk
> >>
> >> Changes since v2:
> >> * add a RLIMIT_NOFILE-based limit to the maximum number of arraymap
> >> handle entries (suggested by Andy Lutomirski)
> >> * remove useless checks
> >>
> >> Changes since v1:
> >> * arraymap of handles replace custom checker groups
> >> * simpler userland API
> > [...]
> >> + case BPF_MAP_HANDLE_TYPE_LANDLOCK_FS_FD:
> >> + handle_file = fget(handle->fd);
> >> + if (IS_ERR(handle_file))
> >> + return ERR_CAST(handle_file);
> >> + /* check if the FD is tied to a user mount point */
> >> + if (unlikely(handle_file->f_path.mnt->mnt_flags & MNT_INTERNAL)) {
> >> + fput(handle_file);
> >> + return ERR_PTR(-EINVAL);
> >> + }
> >> + path_get(&handle_file->f_path);
> >> + ret = kmalloc(sizeof(*ret), GFP_KERNEL);
> >> + ret->path = handle_file->f_path;
> >> + fput(handle_file);
> >
> > You can use fdget() and fdput() here because the reference to
> > handle_file is dropped before the end of the syscall.
>
> The reference to handle_file is dropped but not the reference to the
> (inner) path thanks to path_get().
That's irrelevant. As long as you promise to fdput() any references
acquired using fdget() before any of the following can happen, using
fdget() is okay:
- the syscall exits
- the fd table is shared with a process that might write to it
- an fd is closed by the kernel
In other words, you must be able to prove that nobody can remove the
struct file * from your fd table before you fdput().
Taking a long-term reference to an object pointed to by a struct file
that was looked up with fdget() is fine.
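The rule above can be illustrated with a toy userspace model. This is
only a sketch, not kernel code: the toy_* names are hypothetical
stand-ins for fdget(), fdput() and path_get(), and the reference
counters are plain integers. It assumes the borrow takes a file
reference only when the fd table is shared, and shows that the path
reference taken in between outlives the borrow either way:

```c
#include <assert.h>
#include <stdbool.h>

/* Toy objects with plain integer reference counters. */
struct toy_path { int refs; };
struct toy_file { int refs; struct toy_path path; };

/* Result of a short-term lookup: remembers whether a reference was taken. */
struct toy_fd { struct toy_file *file; bool owns_ref; };

/* Borrow the file; take a real reference only if the fd table is shared. */
static struct toy_fd toy_fdget(struct toy_file *f, bool table_shared)
{
	struct toy_fd fd = { .file = f, .owns_ref = table_shared };

	if (table_shared)
		f->refs++;
	return fd;
}

/* End the borrow, dropping the reference only if one was taken. */
static void toy_fdput(struct toy_fd fd)
{
	if (fd.owns_ref)
		fd.file->refs--;
}

/* Long-term reference on the inner object. */
static void toy_path_get(struct toy_path *p)
{
	p->refs++;
}

/* Look up the file briefly, keep only a long-term reference on its path. */
static struct toy_path *toy_grab_path(struct toy_file *f, bool table_shared)
{
	struct toy_fd fd = toy_fdget(f, table_shared);
	struct toy_path *p = &fd.file->path;

	toy_path_get(p);	/* this reference survives the borrow */
	toy_fdput(fd);		/* the borrow ends before "the syscall" returns */
	return p;
}
```

In both cases the file's counter is back to its initial value once
toy_grab_path() returns, while the path keeps the extra reference.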
Hi,
After the BoF at LPC last week, we came to a multi-step roadmap to
upstream Landlock.
The first patch series will contain the basic properties needed for a
"minimum viable product", which means being able to test it, without
full features. The idea is to put in place the main components, which
include the LSM part (some hooks with the manager logic) and the new
eBPF type. To keep the amount of code minimal, the first userland entry
point will be the seccomp syscall. This doesn't require non-upstream
patches and should be simpler. For the sake of simplicity and to
ease the review, this first series will only be dedicated to privileged
processes (i.e. with CAP_SYS_ADMIN). We may want to only allow one level
of rules at first, instead of dealing with more complex rule inheritance
(like seccomp-bpf can do).
The second series will focus on the cgroup manager. It will follow the
same rules of inheritance as Daniel Mack's patches do.
The third series will try to bring a BPF map of handles for Landlock and
the dedicated BPF helpers.
Finally, the fourth series will bring back the unprivileged mode (with
no_new_privs), at least for process hierarchies (via seccomp). This also
implies handling multiple levels of rules.
Right now, an important point of attention is the userland ABI. We don't
want LSM hooks to be exposed "as is" to userland. This may have some
future implications if their semantics and/or enforcement point(s)
change. In the next series, I will propose a new abstraction over the
currently used LSM hooks. I'll also propose a new way to deal with
resource accountability. Finally, I plan to create a minimal (kernel)
developer documentation and a test suite.
Regards,
Mickaël
On 26/10/2016 08:56, Mickaël Salaün wrote:
> Hi,
>
> This fourth RFC brings some improvements over the previous one [1]. An important
> new point is the abstraction from the raw types of LSM hook arguments. It is
> now possible to call a Landlock function the same way for LSM hooks with
> different internal argument types. Some parts of the code are revamped with RCU
> to properly deal with concurrency. From a userland point of view, the only
> remaining link with seccomp-bpf is the ability to use the seccomp(2) syscall to
> load and enforce a Landlock rule. Seccomp filters cannot trigger Landlock rules
> anymore. For now, it is no more possible for an unprivileged user to enforce a
> Landlock rule on a cgroup through delegation.
>
> As suggested, I plan to write documentation for userland and kernel developers
> with some kind of guiding principles. A remaining question is how to enforce
> limitations for the rule creation?
>
>
> # Landlock LSM
>
> The goal of this new stackable Linux Security Module (LSM) called Landlock is
> to allow any process, including unprivileged ones, to create powerful security
> sandboxes comparable to the Seatbelt/XNU Sandbox or the OpenBSD Pledge. This
> kind of sandbox is expected to help mitigate the security impact of bugs or
> unexpected/malicious behaviors in userland applications.
>
> eBPF programs are used to create a security rule. They are very limited (i.e.
> can only call a whitelist of functions) and cannot do a denial of service (i.e.
> no loop). A new dedicated eBPF map allows to collect and compare Landlock
> handles with system resources (e.g. files or network connections).
>
> The approach taken is to add the minimum amount of code while still allowing
> the userland to create quite complex access rules. A dedicated security policy
> language as the one used by SELinux, AppArmor and other major LSMs involves a
> lot of code and is usually dedicated to a trusted user (i.e. root).
>
>
> # eBPF
>
> To get an expressive language while still being safe and small, Landlock is
> based on eBPF. Landlock should be usable by untrusted processes and must then
> expose a minimal attack surface. The eBPF bytecode is minimal while powerful,
> widely used and designed to be used by not so trusted application. Reusing this
> code allows to not reproduce the same mistakes and minimize new code while
> still taking a generic approach. Only a few additional features are added like
> a new kind of arraymap and some dedicated eBPF functions.
>
> An eBPF program has access to an eBPF context which contains the LSM hook
> arguments (as does seccomp-bpf with syscall arguments). They can be used
> directly or passed to helper functions according to their types. It is then
> possible to do complex access checks without race conditions nor inconsistent
> evaluation (i.e. incorrect mirroring of the OS code and state [2]).
>
> There is one eBPF program subtype per LSM hook. This allows to statically check
> which context access is performed by an eBPF program. This is needed to deny
> kernel address leak and ensure the right use of LSM hook arguments with eBPF
> functions. Moreover, this safe pointer handling removes the need for runtime
> check or abstract data, which improves performances. Any user can add multiple
> Landlock eBPF programs per LSM hook. They are stacked and evaluated one after
> the other (cf. seccomp-bpf).
>
>
> # LSM hooks
>
> Unlike syscalls, LSM hooks are security checkpoints and are not architecture
> dependent. They are designed to match a security need associated with a
> security policy (e.g. access to a file). Exposing parts of some LSM hooks
> instead of using the syscall API for sandboxing should help to avoid bugs and
> hacks as encountered by the first RFC. Instead of redoing the work of the LSM
> hooks through syscalls, we should use and expose them as does policies of
> access control LSM.
>
> Only a subset of the hooks are meaningful for an unprivileged sandbox mechanism
> (e.g. file system or network access control). Landlock uses an abstraction of
> raw LSM hooks, which allows dealing with possible future changes of the LSM
> hook API. Moreover, thanks to the eBPF program typing (per LSM hook) used by
> Landlock, it should not be hard to make such evolutions backward compatible.
>
>
> # Use case scenario
>
> First, a process needs to create a new dedicated eBPF map containing handles.
> These handles are references to system resources (e.g. file or directory) and
> grouped in one or multiple maps to be efficiently managed and checked in
> batches. This kind of map can be passed to Landlock eBPF functions to compare,
> for example, with a file access request. The handles are only accessible from
> the eBPF programs created by the same thread.
>
> The loaded Landlock eBPF programs can be triggered by a seccomp filter
> returning RET_LANDLOCK. In addition, a cookie (16-bit value) can be passed from
> a seccomp filter to eBPF programs. This allow flexible security policies
> between seccomp and Landlock.
>
> Another way to enforce a Landlock security policy is to attach Landlock
> programs to a dedicated cgroup. All the processes in this cgroup will then be
> subject to this policy. For unprivileged processes, this can be done thanks to
> cgroup delegation.
>
> A triggered Landlock eBPF program can allow or deny an access, according to
> its subtype (i.e. LSM hook), thanks to errno return values.
>
>
> # Sandbox example with process hierarchy sandboxing (seccomp)
>
> $ ls /home
> user1
> $ LANDLOCK_ALLOWED='/bin:/lib:/usr:/tmp:/proc/self/fd/0' \
> ./samples/landlock/sandbox /bin/sh -i
> Launching a new sandboxed process.
> $ ls /home
> ls: cannot access '/home': No such file or directory
>
>
> # Sandbox example with conditional access control depending on a cgroup
>
> $ mkdir /sys/fs/cgroup/sandboxed
> $ ls /home
> user1
> $ LANDLOCK_CGROUPS='/sys/fs/cgroup/sandboxed' \
> LANDLOCK_ALLOWED='/bin:/lib:/usr:/tmp:/proc/self/fd/0' \
> ./samples/landlock/sandbox
> Ready to sandbox with cgroups.
> $ ls /home
> user1
> $ echo $$ > /sys/fs/cgroup/sandboxed/cgroup.procs
> $ ls /home
> ls: cannot access '/home': No such file or directory
>
>
> # Current limitations and possible improvements
>
> For now, eBPF programs can only return an errno code. It may be interesting to
> be able to do other actions like seccomp-bpf does (e.g. kill process). Such
> features can easily be implemented but the main advantage of the current
> approach is to be able to only execute eBPF programs until one returns an errno
> code instead of executing all programs like seccomp-bpf does.
>
> It is quite easy to add new eBPF functions to extend Landlock. The main concern
> should be about the possibility to leak information from current process to
> another one (e.g. through maps) to not reproduce the same security sensitive
> behavior as ptrace.
>
> This design does not seem too intrusive but is flexible enough to allow a
> powerful sandbox mechanism accessible by any process on Linux. The use of
> seccomp and Landlock is more suitable with the help of a userland library (e.g.
> libseccomp) that could help to specify a high-level language to express a
> security policy instead of raw eBPF programs. Moreover, thanks to LLVM, it is
> possible to express an eBPF program with a subset of C.
>
>
> # FAQ
>
> ## Why does seccomp-bpf is not enough?
>
> A seccomp filter can only access raw syscall arguments, which means that it is not
> possible to filter according to pointed-to data such as a file path. As the first
> version of this patch series demonstrated, filtering at the syscall level is
> complicated (e.g. need to take care of race conditions). This is mainly because
> the access control checkpoints of the kernel are not at this high-level but
> more underneath, at LSM hooks level. The LSM hooks are designed to handle this
> kind of checks. This series use this approach to leverage the ability of
> unprivileged users to limit themselves.
>
> Cf. "What it isn't?" in Documentation/prctl/seccomp_filter.txt
>
>
> ## Why using the seccomp(2) syscall?
>
> Landlock use the same semantic as seccomp to apply access rule restrictions. It
> add a new layer of security for the current process which is inherited by its
> childs. It makes sense to use an unique access-restricting syscall (that should
> be allowed by seccomp-bpf rules) which can only drop privileges. Moreover, a
> Landlock eBPF program could come from outside a process (e.g. passed through a
> UNIX socket). It is then useful to differentiate the creation/load of Landlock
> eBPF programs via bpf(2), from rule enforcing via seccomp(2).
>
>
> ## Why using cgroups?
>
> cgroups are designed to handle groups of processes. One use case is to manage
> containers. Sandboxing based on process hierarchy (seccomp) is design to handle
> immutable security policies, which is a good security property but does not
> match all use cases. A user can attach Landlock rules to a cgroup. Doing so,
> all the processes in that cgroup will be subject to the security policy.
> However, if the user is allowed to manage this cgroup, it could dynamically
> move this group of processes to a cgroup with another security policy (or
> none). Landlock rules can be applied either on a process hierarchy (e.g.
> application with built-in sandboxing) or a group of processes (e.g. container
> sandboxing). Both approaches can be combined for the same process.
>
>
> ## Does Landlock can limit network access or other resources?
>
> Limiting network access is obviously in the scope of Landlock but it is not yet
> implemented. The main goal now is to get feedback about the whole concept, the
> API and the file access control part. More access control types could be
> implemented in the future.
>
> Sargun Dhillon sent a RFC (Checmate) [4] to deal with network manipulation.
> This could be implemented on top of the Landlock framework.
>
>
> ## Why a new LSM? Are SELinux, AppArmor, Smack or Tomoyo not good enough?
>
> The current access control LSMs are fine for their purpose which is to give the
> *root* the ability to enforce a security policy for the *system*. What is
> missing is a way to enforce a security policy for any applications by its
> developer and *unprivileged user* as seccomp can do for raw syscall filtering.
> Moreover, Landlock handles stacked hook programs from different users. It must
> then ensure there is no possible malicious interactions between these programs.
>
> Differences with other (access control) LSMs:
> * not only dedicated to administrators (i.e. no_new_priv);
> * limited kernel attack surface (e.g. policy parsing);
> * helpers to compare complex objects (path/FD), no access to internal kernel
> data (do not leak addresses);
> * constrained policy rules/programs (no DoS: deterministic execution time);
> * do not leak more information than the loader process can legitimately have
> access to (minimize metadata inference): must compare from an already allowed
> file (through a handle).
>
>
> ## Why not use a policy language like used by SElinux or AppArmor?
>
> This kind of LSMs are dedicated to administrators. They already manage the
> system and are not a threat to the system security. However, seccomp, and
> Landlock too, should be available to anyone, which potentially include
> untrusted users and processes. To reduce the attack surface, Landlock should
> expose the minimum amount of code, hence minimal complexity. Moreover, another
> threat is to make accessible to a malicious code a new way to gain more
> information. For example, Landlock features should not allow a program to get
> the file owner if the directory containing this file is not readable. This data
> could then be exfiltrated thanks to the access result. Thus, we should limit
> the expressiveness of the available checks. The current approach is to do the
> checks in such a way that only a comparison with an already accessed resource
> (e.g. file descriptor) is possible. This allow to have a reference to compare
> with, without exposing much information.
>
>
> ## As a developer, why do I need this feature?
>
> Landlock's goal is to help userland to limit its attack surface.
> Security-conscious developers would like to protect users from a security bug
> in their applications and the third-party dependencies they are using. Such a
> bug can compromise all the user data and help an attacker to perform a
> privilege escalation. Using an *unprivileged sandbox* feature such as Landlock
> empowers the developer with the ability to properly compartmentalize its
> software and limit the impact of vulnerabilities.
>
>
> ## As a user, why do I need this feature?
>
> Any user can already use seccomp-bpf to whitelist a set of syscalls to
> reduce the kernel attack surface for a predefined set of processes. However an
> unprivileged user can't create a security policy like the root user can thanks to
> SELinux and other access control LSMs. Landlock allows any unprivileged user to
> protect their data from being accessed by any process they run but only an
> identified subset. User tools can be created to help create such a high-level
> access control policy. This policy may not be powerful enough to express the
> same policies as the current access control LSMs, because of the threat an
> unprivileged user can be to the system, but it should be enough for most
> use-cases (e.g. blacklist or whitelist a set of file hierarchies).
>
>
> # Changes since RFC v3
>
> * use abstract LSM hook arguments with custom types (e.g. *_LANDLOCK_ARG_FS for
> struct file, struct inode and struct path)
> * add more LSM hooks to support full file system access control
> * improve the sandbox example
> * fix races and RCU issues:
> * eBPF program execution and eBPF helpers
> * revamp the arraymap of handles to cleanly deal with update/delete
> * eBPF program subtype for Landlock:
> * remove the "origin" field
> * add an "option" field
> * rebase onto Daniel Mack's patches v7 [3]
> * remove merged commit 1955351da41c ("bpf: Set register type according to
> is_valid_access()")
> * fix spelling mistakes
> * cleanup some type and variable names
> * split patches
> * for now, remove cgroup delegation handling for unprivileged user
> * remove extra access check for cgroup_get_from_fd()
> * remove unused example code dealing with skb
> * remove seccomp-bpf link:
> * no more seccomp cookie
> * for now, it is no more possible to check the current syscall properties
>
>
> # Changes since RFC v2
>
> * revamp cgroup handling:
> * use Daniel Mack's patches "Add eBPF hooks for cgroups" v5
> * remove bpf_landlock_cmp_cgroup_beneath()
> * make BPF_PROG_ATTACH usable with delegated cgroups
> * add a new CGRP_NO_NEW_PRIVS flag for safe cgroups
> * handle Landlock sandboxing for cgroups hierarchy
> * allow unprivileged processes to attach Landlock eBPF program to cgroups
> * add subtype to eBPF programs:
> * replace Landlock hook identification by custom eBPF program types with a
> dedicated subtype field
> * manage fine-grained privileged Landlock programs
> * register Landlock programs for dedicated trigger origins (e.g. syscall,
> return from seccomp filter and/or interruption)
> * performance and memory optimizations: use an array to access Landlock hooks
> directly but do not duplicate it for each thread (seccomp-based)
> * allow running Landlock programs without seccomp filter
> * fix seccomp-related issues
> * remove extra errno bounding check for Landlock programs
> * add some examples for optional eBPF functions or context access (network
> related) according to security checks to allow more features for privileged
> programs (e.g. Checmate)
>
>
> # Changes since RFC v1
>
> * focus on the LSM hooks, not the syscalls:
> * much more simple implementation
> * does not need audit cache tricks to avoid race conditions
> * more simple to use and more generic because using the LSM hook abstraction
> directly
> * more efficient because only checking in LSM hooks
> * architecture agnostic
> * switch from cBPF to eBPF:
> * new eBPF program types dedicated to Landlock
> * custom functions used by the eBPF program
> * gain some new features (e.g. 10 registers, can load values of different
> size, LLVM translator) but only a few functions allowed and a dedicated map
> type
> * new context: LSM hook ID, cookie and LSM hook arguments
> * need to set the sysctl kernel.unprivileged_bpf_disable to 0 (default value)
> to be able to load hook filters as unprivileged users
> * smaller and simpler:
> * no more checker groups but dedicated arraymap of handles
> * simpler userland structs thanks to eBPF functions
> * distinctive name: Landlock
>
>
> This series can be applied on top of Daniel Mack's patches for BPF_PROG_ATTACH
> v7 [3] on Linux v4.9-rc2. This can be tested with CONFIG_SECURITY_LANDLOCK,
> CONFIG_SECCOMP_FILTER and CONFIG_CGROUP_BPF. I would really appreciate
> constructive comments on the usability, architecture, code and userland API of
> Landlock LSM.
>
> [1] https://lkml.kernel.org/r/[email protected]
> [2] https://crypto.stanford.edu/cs155/papers/traps.pdf
> [3] https://lkml.kernel.org/r/[email protected]
> [4] https://lkml.kernel.org/r/[email protected]
>
> Regards,
>
> Mickaël Salaün (18):
> landlock: Add Kconfig
> bpf: Move u64_to_ptr() to BPF headers and inline it
> bpf,landlock: Add a new arraymap type to deal with (Landlock) handles
> bpf,landlock: Add eBPF program subtype and is_valid_subtype() verifier
> bpf,landlock: Define an eBPF program type for Landlock
> fs: Constify path_is_under()'s arguments
> landlock: Add LSM hooks
> landlock: Handle file comparisons
> landlock: Add manager functions
> seccomp: Split put_seccomp_filter() with put_seccomp()
> seccomp,landlock: Handle Landlock hooks per process hierarchy
> bpf: Cosmetic change for bpf_prog_attach()
> bpf/cgroup: Replace struct bpf_prog with struct bpf_object
> bpf/cgroup: Make cgroup_bpf_update() return an error code
> bpf/cgroup: Move capability check
> bpf/cgroup,landlock: Handle Landlock hooks per cgroup
> landlock: Add update and debug access flags
> samples/landlock: Add sandbox example
>
> fs/namespace.c | 2 +-
> include/linux/bpf-cgroup.h | 19 +-
> include/linux/bpf.h | 44 +++-
> include/linux/cgroup-defs.h | 2 +
> include/linux/filter.h | 1 +
> include/linux/fs.h | 2 +-
> include/linux/landlock.h | 95 +++++++++
> include/linux/lsm_hooks.h | 5 +
> include/linux/seccomp.h | 12 +-
> include/uapi/linux/bpf.h | 105 ++++++++++
> include/uapi/linux/seccomp.h | 1 +
> kernel/bpf/arraymap.c | 270 +++++++++++++++++++++++++
> kernel/bpf/cgroup.c | 139 ++++++++++---
> kernel/bpf/syscall.c | 71 ++++---
> kernel/bpf/verifier.c | 35 +++-
> kernel/cgroup.c | 6 +-
> kernel/fork.c | 15 +-
> kernel/seccomp.c | 26 ++-
> kernel/trace/bpf_trace.c | 12 +-
> net/core/filter.c | 26 ++-
> samples/Makefile | 2 +-
> samples/bpf/bpf_helpers.h | 5 +
> samples/landlock/.gitignore | 1 +
> samples/landlock/Makefile | 16 ++
> samples/landlock/sandbox.c | 405 +++++++++++++++++++++++++++++++++++++
> security/Kconfig | 1 +
> security/Makefile | 2 +
> security/landlock/Kconfig | 23 +++
> security/landlock/Makefile | 3 +
> security/landlock/checker_fs.c | 152 ++++++++++++++
> security/landlock/checker_fs.h | 20 ++
> security/landlock/common.h | 58 ++++++
> security/landlock/lsm.c | 449 +++++++++++++++++++++++++++++++++++++++++
> security/landlock/manager.c | 379 ++++++++++++++++++++++++++++++++++
> security/security.c | 1 +
> 35 files changed, 2309 insertions(+), 96 deletions(-)
> create mode 100644 include/linux/landlock.h
> create mode 100644 samples/landlock/.gitignore
> create mode 100644 samples/landlock/Makefile
> create mode 100644 samples/landlock/sandbox.c
> create mode 100644 security/landlock/Kconfig
> create mode 100644 security/landlock/Makefile
> create mode 100644 security/landlock/checker_fs.c
> create mode 100644 security/landlock/checker_fs.h
> create mode 100644 security/landlock/common.h
> create mode 100644 security/landlock/lsm.c
> create mode 100644 security/landlock/manager.c
>
On Sun, Nov 13, 2016 at 6:23 AM, Mickaël Salaün <[email protected]> wrote:
> Hi,
>
> After the BoF at LPC last week, we came to a multi-step roadmap to
> upstream Landlock.
>
> A first patch series containing the basic properties needed for a
> "minimum viable product", which means being able to test it, without
> full features. The idea is to set in place the main components which
> include the LSM part (some hooks with the manager logic) and the new
> eBPF type. To have a minimum amount of code, the first userland entry
> point will be the seccomp syscall. This doesn't imply non-upstream
> patches and should be more simple. For the sake of simplicity and to
> ease the review, this first series will only be dedicated to privileged
> processes (i.e. with CAP_SYS_ADMIN). We may want to only allow one level
> of rules at first, instead of dealing with more complex rule inheritance
> (like seccomp-bpf can do).
>
> The second series will focus on the cgroup manager. It will follow the
> same rules of inheritance as the Daniel Mack's patches does.
>
> The third series will try to bring a BPF map of handles for Landlock and
> the dedicated BPF helpers.
>
> Finally, the fourth series will bring back the unprivileged mode (with
> no_new_privs), at least for process hierarchies (via seccomp). This also
> imply to handle multi-level of rules.
>
> Right now, an important point of attention is the userland ABI. We don't
> want LSM hooks to be exposed "as is" to userland. This may have some
> future implications if their semantic and/or enforcement point(s)
> change. In the next series, I will propose a new abstraction over the
> currently used LSM hooks. I'll also propose a new way to deal with
> resource accountability. Finally, I plan to create a minimal (kernel)
> developer documentation and a test suite.
Thanks for the summary.
That's exactly what we discussed and agreed upon.
On Sun, Nov 13, 2016 at 6:23 AM, Mickaël Salaün <[email protected]> wrote:
> Hi,
>
> After the BoF at LPC last week, we came to a multi-step roadmap to
> upstream Landlock.
>
> A first patch series containing the basic properties needed for a
> "minimum viable product", which means being able to test it, without
> full features. The idea is to set in place the main components which
> include the LSM part (some hooks with the manager logic) and the new
> eBPF type. To have a minimum amount of code, the first userland entry
> point will be the seccomp syscall. This doesn't imply non-upstream
> patches and should be more simple. For the sake of simplicity and to
> ease the review, this first series will only be dedicated to privileged
> processes (i.e. with CAP_SYS_ADMIN). We may want to only allow one level
> of rules at first, instead of dealing with more complex rule inheritance
> (like seccomp-bpf can do).
>
> The second series will focus on the cgroup manager. It will follow the
> same rules of inheritance as the Daniel Mack's patches does.
>
> The third series will try to bring a BPF map of handles for Landlock and
> the dedicated BPF helpers.
>
> Finally, the fourth series will bring back the unprivileged mode (with
> no_new_privs), at least for process hierarchies (via seccomp). This also
> implies handling multiple levels of rules.
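
For reference, no_new_privs is the existing per-thread flag that seccomp
already requires from unprivileged callers. A minimal sketch of setting and
checking it from userland, using only prctl(2). The helper name
enable_no_new_privs() is made up for illustration and nothing here is
Landlock-specific:

```c
#include <stdio.h>
#include <sys/prctl.h>

/*
 * Set no_new_privs for the calling thread and confirm it took effect.
 * Returns 0 on success, -1 on error.
 * The flag is inherited across fork/execve and cannot be unset.
 * Hypothetical helper name, not part of the patch series.
 */
static int enable_no_new_privs(void)
{
	if (prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0)) {
		perror("prctl(PR_SET_NO_NEW_PRIVS)");
		return -1;
	}
	/* PR_GET_NO_NEW_PRIVS returns 1 once the flag is set. */
	return prctl(PR_GET_NO_NEW_PRIVS, 0, 0, 0, 0) == 1 ? 0 : -1;
}
```

A loader would presumably call this before enforcing any rule, the same way
unprivileged seccomp-bpf users do today.
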
>
> Right now, an important point of attention is the userland ABI. We don't
> want LSM hooks to be exposed "as is" to userland. This may have some
> future implications if their semantics and/or enforcement point(s)
> change. In the next series, I will propose a new abstraction over the
> currently used LSM hooks. I'll also propose a new way to deal with
> resource accountability. Finally, I plan to create a minimal (kernel)
> developer documentation and a test suite.
>
> Regards,
> Mickaël
>
>
> On 26/10/2016 08:56, Mickaël Salaün wrote:
>> Hi,
>>
>> This fourth RFC brings some improvements over the previous one [1]. An important
>> new point is the abstraction from the raw types of LSM hook arguments. It is
>> now possible to call a Landlock function the same way for LSM hooks with
>> different internal argument types. Some parts of the code are revamped with RCU
>> to properly deal with concurrency. From a userland point of view, the only
>> remaining link with seccomp-bpf is the ability to use the seccomp(2) syscall to
>> load and enforce a Landlock rule. Seccomp filters cannot trigger Landlock rules
>> anymore. For now, it is no longer possible for an unprivileged user to enforce a
>> Landlock rule on a cgroup through delegation.
>>
>> As suggested, I plan to write documentation for userland and kernel developers
>> with some kind of guiding principles. A remaining question is how to enforce
>> limitations for the rule creation?
>>
>>
>> # Landlock LSM
>>
>> The goal of this new stackable Linux Security Module (LSM) called Landlock is
>> to allow any process, including unprivileged ones, to create powerful security
>> sandboxes comparable to the Seatbelt/XNU Sandbox or the OpenBSD Pledge. This
>> kind of sandbox is expected to help mitigate the security impact of bugs or
>> unexpected/malicious behaviors in userland applications.
>>
>> eBPF programs are used to create a security rule. They are very limited (i.e.
>> can only call a whitelist of functions) and cannot do a denial of service (i.e.
>> no loop). A new dedicated eBPF map makes it possible to collect and compare
>> Landlock handles with system resources (e.g. files or network connections).
>>
>> The approach taken is to add the minimum amount of code while still allowing
>> the userland to create quite complex access rules. A dedicated security policy
>> language as the one used by SELinux, AppArmor and other major LSMs involves a
>> lot of code and is usually dedicated to a trusted user (i.e. root).
>>
>>
>> # eBPF
>>
>> To get an expressive language while still being safe and small, Landlock is
>> based on eBPF. Landlock should be usable by untrusted processes and must then
>> expose a minimal attack surface. The eBPF bytecode is minimal while powerful,
>> widely used and designed to be used by not-so-trusted applications. Reusing
>> this code avoids reproducing the same mistakes and minimizes new code while
>> still taking a generic approach. Only a few additional features are added, like
>> a new kind of arraymap and some dedicated eBPF functions.
>>
>> An eBPF program has access to an eBPF context which contains the LSM hook
>> arguments (as does seccomp-bpf with syscall arguments). They can be used
>> directly or passed to helper functions according to their types. It is then
>> possible to do complex access checks without race conditions or inconsistent
>> evaluation (i.e. incorrect mirroring of the OS code and state [2]).
>>
>> There is one eBPF program subtype per LSM hook. This makes it possible to
>> statically check which context access is performed by an eBPF program. This is
>> needed to prevent kernel address leaks and to ensure the right use of LSM hook
>> arguments with eBPF functions. Moreover, this safe pointer handling removes the
>> need for runtime checks or abstract data, which improves performance. Any user
>> can add multiple Landlock eBPF programs per LSM hook. They are stacked and
>> evaluated one after the other (cf. seccomp-bpf).
>>
>>
>> # LSM hooks
>>
>> Unlike syscalls, LSM hooks are security checkpoints and are not architecture
>> dependent. They are designed to match a security need associated with a
>> security policy (e.g. access to a file). Exposing parts of some LSM hooks
>> instead of using the syscall API for sandboxing should help to avoid bugs and
>> hacks as encountered by the first RFC. Instead of redoing the work of the LSM
>> hooks through syscalls, we should use and expose them as the policies of
>> access control LSMs do.
>>
>> Only a subset of the hooks is meaningful for an unprivileged sandbox mechanism
>> (e.g. file system or network access control). Landlock uses an abstraction of
>> raw LSM hooks, which makes it possible to deal with future changes of the LSM
>> hook API. Moreover, thanks to the eBPF program typing (per LSM hook) used by
>> Landlock, it should not be hard to make such evolutions backward compatible.
>>
>>
>> # Use case scenario
>>
>> First, a process needs to create a new dedicated eBPF map containing handles.
>> These handles are references to system resources (e.g. file or directory) and
>> are grouped in one or multiple maps to be efficiently managed and checked in
>> batches. This kind of map can be passed to Landlock eBPF functions to compare,
>> for example, with a file access request. The handles are only accessible from
>> the eBPF programs created by the same thread.
>>
>> The loaded Landlock eBPF programs can be triggered by a seccomp filter
>> returning RET_LANDLOCK. In addition, a cookie (16-bit value) can be passed from
>> a seccomp filter to eBPF programs. This allows flexible security policies
>> between seccomp and Landlock.
>>
>> Another way to enforce a Landlock security policy is to attach Landlock
>> programs to a dedicated cgroup. All the processes in this cgroup will then be
>> subject to this policy. For unprivileged processes, this can be done thanks to
>> cgroup delegation.
>>
>> A triggered Landlock eBPF program can allow or deny an access, according to
>> its subtype (i.e. LSM hook), thanks to errno return values.
>>
>>
>> # Sandbox example with process hierarchy sandboxing (seccomp)
>>
>> $ ls /home
>> user1
>> $ LANDLOCK_ALLOWED='/bin:/lib:/usr:/tmp:/proc/self/fd/0' \
>> ./samples/landlock/sandbox /bin/sh -i
>> Launching a new sandboxed process.
>> $ ls /home
>> ls: cannot access '/home': No such file or directory
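
To make the example above concrete: LANDLOCK_ALLOWED is a colon-separated list
of paths. A rough sketch of how a loader could split such a list follows. It is
not the code from samples/landlock/sandbox.c, and split_paths() is a
hypothetical helper. A real loader would then open each path and keep the
resulting file descriptor as a handle.

```c
#include <stddef.h>
#include <string.h>

/*
 * Split a colon-separated list in place and store up to max entries in out[].
 * Empty entries (e.g. "a::b") are skipped. Returns the number of entries.
 * The list buffer is modified: ':' separators are replaced by NUL bytes.
 */
static size_t split_paths(char *list, char **out, size_t max)
{
	size_t n = 0;
	char *cur = list;

	while (cur && n < max) {
		char *sep = strchr(cur, ':');

		if (sep)
			*sep = '\0';	/* terminate the current entry */
		if (*cur != '\0')	/* skip empty entries */
			out[n++] = cur;
		cur = sep ? sep + 1 : NULL;
	}
	return n;
}
```

The caller has to pass a writable copy of the environment value (e.g. from
strdup()), since the buffer is modified.
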
>>
>>
>> # Sandbox example with conditional access control depending on a cgroup
>>
>> $ mkdir /sys/fs/cgroup/sandboxed
>> $ ls /home
>> user1
>> $ LANDLOCK_CGROUPS='/sys/fs/cgroup/sandboxed' \
>> LANDLOCK_ALLOWED='/bin:/lib:/usr:/tmp:/proc/self/fd/0' \
>> ./samples/landlock/sandbox
>> Ready to sandbox with cgroups.
>> $ ls /home
>> user1
>> $ echo $$ > /sys/fs/cgroup/sandboxed/cgroup.procs
>> $ ls /home
>> ls: cannot access '/home': No such file or directory
>>
>>
>> # Current limitations and possible improvements
>>
>> For now, eBPF programs can only return an errno code. It may be interesting to
>> be able to do other actions like seccomp-bpf does (e.g. kill process). Such
>> features can easily be implemented but the main advantage of the current
>> approach is to be able to only execute eBPF programs until one returns an errno
>> code instead of executing all programs like seccomp-bpf does.
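
The early-exit evaluation described above can be illustrated with a plain
userland C sketch. This is not the kernel code: rule_fn, run_rules() and the
two example rules are hypothetical, and a real rule would be an eBPF program
run from an LSM hook.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* A rule returns 0 to allow, or a positive errno value to deny. */
typedef int (*rule_fn)(const char *path);

/*
 * Run the stacked rules in order and stop at the first one that returns an
 * errno code. *evaluated receives the number of rules actually executed.
 */
static int run_rules(const rule_fn *rules, size_t count, const char *path,
		     size_t *evaluated)
{
	size_t i;

	for (i = 0; i < count; i++) {
		int ret = rules[i](path);

		if (ret) {
			*evaluated = i + 1;
			return -ret;	/* kernel style: negative errno */
		}
	}
	*evaluated = count;
	return 0;
}

/* Example rule: never denies. */
static int allow_all(const char *path)
{
	(void)path;
	return 0;
}

/* Example rule: denies anything whose path starts with "/home". */
static int deny_home(const char *path)
{
	return strncmp(path, "/home", 5) == 0 ? EACCES : 0;
}
```

With the stack { allow_all, deny_home, allow_all }, an access to /home/user1
stops after the second rule, whereas seccomp-bpf would run all three filters
and then pick the most restrictive result.
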
>>
>> It is quite easy to add new eBPF functions to extend Landlock. The main concern
>> should be about the possibility to leak information from current process to
>> another one (e.g. through maps) to not reproduce the same security sensitive
>> behavior as ptrace.
>>
>> This design does not seem too intrusive but is flexible enough to allow a
>> powerful sandbox mechanism accessible by any process on Linux. The use of
>> seccomp and Landlock is more suitable with the help of a userland library (e.g.
>> libseccomp) that could help to specify a high-level language to express a
>> security policy instead of raw eBPF programs. Moreover, thanks to LLVM, it is
>> possible to express an eBPF program with a subset of C.
>>
>>
>> # FAQ
>>
>> ## Why is seccomp-bpf not enough?
>>
>> A seccomp filter can only access raw syscall arguments, which means that it is
>> not possible to filter according to pointed-to data such as a file path. As the
>> first version of this patch series demonstrated, filtering at the syscall level
>> is complicated (e.g. need to take care of race conditions). This is mainly
>> because the access control checkpoints of the kernel are not at this high level
>> but further down, at the LSM hook level. The LSM hooks are designed to handle
>> this kind of check. This series uses this approach to leverage the ability of
>> unprivileged users to limit themselves.
>>
>> Cf. "What it isn't?" in Documentation/prctl/seccomp_filter.txt
>>
>>
>> ## Why use the seccomp(2) syscall?
>>
>> Landlock uses the same semantics as seccomp to apply access rule restrictions.
>> It adds a new layer of security for the current process which is inherited by
>> its children. It makes sense to use a unique access-restricting syscall (that
>> should be allowed by seccomp-bpf rules) which can only drop privileges.
>> Moreover, a Landlock eBPF program could come from outside a process (e.g.
>> passed through a UNIX socket). It is then useful to differentiate the
>> creation/load of Landlock eBPF programs via bpf(2), from rule enforcing via
>> seccomp(2).
>>
>>
>> ## Why use cgroups?
>>
>> cgroups are designed to handle groups of processes. One use case is to manage
>> containers. Sandboxing based on process hierarchy (seccomp) is designed to handle
>> immutable security policies, which is a good security property but does not
>> match all use cases. A user can attach Landlock rules to a cgroup. Doing so,
>> all the processes in that cgroup will be subject to the security policy.
>> However, if the user is allowed to manage this cgroup, it could dynamically
>> move this group of processes to a cgroup with another security policy (or
>> none). Landlock rules can be applied either on a process hierarchy (e.g.
>> application with built-in sandboxing) or a group of processes (e.g. container
>> sandboxing). Both approaches can be combined for the same process.
>>
>>
>> ## Can Landlock limit network access or other resources?
>>
>> Limiting network access is obviously in the scope of Landlock but it is not yet
>> implemented. The main goal now is to get feedback about the whole concept, the
>> API and the file access control part. More access control types could be
>> implemented in the future.
>>
>> Sargun Dhillon sent an RFC (Checmate) [4] to deal with network manipulation.
>> This could be implemented on top of the Landlock framework.
>>
>>
>> ## Why a new LSM? Are SELinux, AppArmor, Smack or Tomoyo not good enough?
>>
>> The current access control LSMs are fine for their purpose which is to give the
>> *root* the ability to enforce a security policy for the *system*. What is
>> missing is a way to enforce a security policy for any application by its
>> developer and *unprivileged user* as seccomp can do for raw syscall filtering.
>> Moreover, Landlock handles stacked hook programs from different users. It must
>> then ensure there are no possible malicious interactions between these programs.
>>
>> Differences with other (access control) LSMs:
>> * not only dedicated to administrators (i.e. no_new_privs);
>> * limited kernel attack surface (e.g. policy parsing);
>> * helpers to compare complex objects (path/FD), no access to internal kernel
>> data (do not leak addresses);
>> * constrained policy rules/programs (no DoS: deterministic execution time);
>> * do not leak more information than the loader process can legitimately have
>> access to (minimize metadata inference): must compare from an already allowed
>> file (through a handle).
>>
>>
>> ## Why not use a policy language like the one used by SELinux or AppArmor?
>>
>> These LSMs are dedicated to administrators. They already manage the
>> system and are not a threat to the system security. However, seccomp, and
>> Landlock too, should be available to anyone, which potentially includes
>> untrusted users and processes. To reduce the attack surface, Landlock should
>> expose the minimum amount of code, hence minimal complexity. Moreover, another
>> threat is giving malicious code a new way to gain more information. For
>> example, Landlock features should not allow a program to get the file owner if
>> the directory containing this file is not readable. This data could then be
>> exfiltrated thanks to the access result. Thus, we should limit the
>> expressiveness of the available checks. The current approach is to do the
>> checks in such a way that only a comparison with an already accessed resource
>> (e.g. file descriptor) is possible. This provides a reference to compare with,
>> without exposing much information.
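
As a userland analogy only (this is not the Landlock helper, and same_file() is
a hypothetical name), comparing against an already accessed resource can be as
simple as checking whether two open file descriptors point to the same inode.
The caller learns nothing beyond "same" or "different":

```c
#include <stdbool.h>
#include <sys/stat.h>

/*
 * Return true if both descriptors refer to the same file, i.e. same device
 * and same inode. Only metadata of already opened files is used.
 * Returns false if fstat() fails on either descriptor.
 */
static bool same_file(int fd_a, int fd_b)
{
	struct stat a, b;

	if (fstat(fd_a, &a) || fstat(fd_b, &b))
		return false;
	return a.st_dev == b.st_dev && a.st_ino == b.st_ino;
}
```

The kernel-side checks in this series work on struct file/path/inode rather
than on descriptors, so this only shows the shape of the comparison.
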
>>
>>
>> ## As a developer, why do I need this feature?
>>
>> Landlock's goal is to help userland to limit its attack surface.
>> Security-conscious developers would like to protect users from a security bug
>> in their applications and the third-party dependencies they are using. Such a
>> bug can compromise all the user data and help an attacker to perform a
>> privilege escalation. Using an *unprivileged sandbox* feature such as Landlock
>> empowers the developer with the ability to properly compartmentalize their
>> software and limit the impact of vulnerabilities.
>>
>>
>> ## As a user, why do I need this feature?
>>
>> Any user can already use seccomp-bpf to whitelist a set of syscalls to
>> reduce the kernel attack surface for a predefined set of processes. However an
>> unprivileged user can't create a security policy like the root user can thanks to
>> SELinux and other access control LSMs. Landlock allows any unprivileged user to
>> protect their data from being accessed by any process they run but only an
>> identified subset. User tools can be created to help create such a high-level
>> access control policy. This policy may not be powerful enough to express the
>> same policies as the current access control LSMs, because of the threat an
>> unprivileged user can be to the system, but it should be enough for most
>> use-cases (e.g. blacklist or whitelist a set of file hierarchies).
>>
>>
>> # Changes since RFC v3
>>
>> * use abstract LSM hook arguments with custom types (e.g. *_LANDLOCK_ARG_FS for
>> struct file, struct inode and struct path)
>> * add more LSM hooks to support full file system access control
>> * improve the sandbox example
>> * fix races and RCU issues:
>> * eBPF program execution and eBPF helpers
>> * revamp the arraymap of handles to cleanly deal with update/delete
>> * eBPF program subtype for Landlock:
>> * remove the "origin" field
>> * add an "option" field
>> * rebase onto Daniel Mack's patches v7 [3]
>> * remove merged commit 1955351da41c ("bpf: Set register type according to
>> is_valid_access()")
>> * fix spelling mistakes
>> * cleanup some type and variable names
>> * split patches
>> * for now, remove cgroup delegation handling for unprivileged user
>> * remove extra access check for cgroup_get_from_fd()
>> * remove unused example code dealing with skb
>> * remove seccomp-bpf link:
>> * no more seccomp cookie
>> * for now, it is no longer possible to check the current syscall properties
>>
>>
>> # Changes since RFC v2
>>
>> * revamp cgroup handling:
>> * use Daniel Mack's patches "Add eBPF hooks for cgroups" v5
>> * remove bpf_landlock_cmp_cgroup_beneath()
>> * make BPF_PROG_ATTACH usable with delegated cgroups
>> * add a new CGRP_NO_NEW_PRIVS flag for safe cgroups
>> * handle Landlock sandboxing for cgroups hierarchy
>> * allow unprivileged processes to attach Landlock eBPF program to cgroups
>> * add subtype to eBPF programs:
>> * replace Landlock hook identification by custom eBPF program types with a
>> dedicated subtype field
>> * manage fine-grained privileged Landlock programs
>> * register Landlock programs for dedicated trigger origins (e.g. syscall,
>> return from seccomp filter and/or interruption)
>> * performance and memory optimizations: use an array to access Landlock hooks
>> directly but do not duplicate it for each thread (seccomp-based)
>> * allow running Landlock programs without seccomp filter
>> * fix seccomp-related issues
>> * remove extra errno bounding check for Landlock programs
>> * add some examples for optional eBPF functions or context access (network
>> related) according to security checks to allow more features for privileged
>> programs (e.g. Checmate)
>>
>>
>> # Changes since RFC v1
>>
>> * focus on the LSM hooks, not the syscalls:
>> * much simpler implementation
>> * does not need audit cache tricks to avoid race conditions
>> * simpler to use and more generic because using the LSM hook abstraction
>> directly
>> * more efficient because only checking in LSM hooks
>> * architecture agnostic
>> * switch from cBPF to eBPF:
>> * new eBPF program types dedicated to Landlock
>> * custom functions used by the eBPF program
>> * gain some new features (e.g. 10 registers, can load values of different
>> size, LLVM translator) but only a few functions allowed and a dedicated map
>> type
>> * new context: LSM hook ID, cookie and LSM hook arguments
>> * need to set the sysctl kernel.unprivileged_bpf_disable to 0 (default value)
>> to be able to load hook filters as unprivileged users
>> * smaller and simpler:
>> * no more checker groups but dedicated arraymap of handles
>> * simpler userland structs thanks to eBPF functions
>> * distinctive name: Landlock
>>
>>
>> This series can be applied on top of Daniel Mack's patches for BPF_PROG_ATTACH
>> v7 [3] on Linux v4.9-rc2. This can be tested with CONFIG_SECURITY_LANDLOCK,
>> CONFIG_SECCOMP_FILTER and CONFIG_CGROUP_BPF. I would really appreciate
>> constructive comments on the usability, architecture, code and userland API of
>> Landlock LSM.
>>
>> [1] https://lkml.kernel.org/r/[email protected]
>> [2] https://crypto.stanford.edu/cs155/papers/traps.pdf
>> [3] https://lkml.kernel.org/r/[email protected]
>> [4] https://lkml.kernel.org/r/[email protected]
>>
>> Regards,
>>
>> Mickaël Salaün (18):
>> landlock: Add Kconfig
>> bpf: Move u64_to_ptr() to BPF headers and inline it
>> bpf,landlock: Add a new arraymap type to deal with (Landlock) handles
>> bpf,landlock: Add eBPF program subtype and is_valid_subtype() verifier
>> bpf,landlock: Define an eBPF program type for Landlock
>> fs: Constify path_is_under()'s arguments
>> landlock: Add LSM hooks
>> landlock: Handle file comparisons
>> landlock: Add manager functions
>> seccomp: Split put_seccomp_filter() with put_seccomp()
>> seccomp,landlock: Handle Landlock hooks per process hierarchy
>> bpf: Cosmetic change for bpf_prog_attach()
>> bpf/cgroup: Replace struct bpf_prog with struct bpf_object
>> bpf/cgroup: Make cgroup_bpf_update() return an error code
>> bpf/cgroup: Move capability check
>> bpf/cgroup,landlock: Handle Landlock hooks per cgroup
>> landlock: Add update and debug access flags
>> samples/landlock: Add sandbox example
>>
>> fs/namespace.c | 2 +-
>> include/linux/bpf-cgroup.h | 19 +-
>> include/linux/bpf.h | 44 +++-
>> include/linux/cgroup-defs.h | 2 +
>> include/linux/filter.h | 1 +
>> include/linux/fs.h | 2 +-
>> include/linux/landlock.h | 95 +++++++++
>> include/linux/lsm_hooks.h | 5 +
>> include/linux/seccomp.h | 12 +-
>> include/uapi/linux/bpf.h | 105 ++++++++++
>> include/uapi/linux/seccomp.h | 1 +
>> kernel/bpf/arraymap.c | 270 +++++++++++++++++++++++++
>> kernel/bpf/cgroup.c | 139 ++++++++++---
>> kernel/bpf/syscall.c | 71 ++++---
>> kernel/bpf/verifier.c | 35 +++-
>> kernel/cgroup.c | 6 +-
>> kernel/fork.c | 15 +-
>> kernel/seccomp.c | 26 ++-
>> kernel/trace/bpf_trace.c | 12 +-
>> net/core/filter.c | 26 ++-
>> samples/Makefile | 2 +-
>> samples/bpf/bpf_helpers.h | 5 +
>> samples/landlock/.gitignore | 1 +
>> samples/landlock/Makefile | 16 ++
>> samples/landlock/sandbox.c | 405 +++++++++++++++++++++++++++++++++++++
>> security/Kconfig | 1 +
>> security/Makefile | 2 +
>> security/landlock/Kconfig | 23 +++
>> security/landlock/Makefile | 3 +
>> security/landlock/checker_fs.c | 152 ++++++++++++++
>> security/landlock/checker_fs.h | 20 ++
>> security/landlock/common.h | 58 ++++++
>> security/landlock/lsm.c | 449 +++++++++++++++++++++++++++++++++++++++++
>> security/landlock/manager.c | 379 ++++++++++++++++++++++++++++++++++
>> security/security.c | 1 +
>> 35 files changed, 2309 insertions(+), 96 deletions(-)
>> create mode 100644 include/linux/landlock.h
>> create mode 100644 samples/landlock/.gitignore
>> create mode 100644 samples/landlock/Makefile
>> create mode 100644 samples/landlock/sandbox.c
>> create mode 100644 security/landlock/Kconfig
>> create mode 100644 security/landlock/Makefile
>> create mode 100644 security/landlock/checker_fs.c
>> create mode 100644 security/landlock/checker_fs.h
>> create mode 100644 security/landlock/common.h
>> create mode 100644 security/landlock/lsm.c
>> create mode 100644 security/landlock/manager.c
>>
>
Was there a plan around getting Daniel's patches in as well? Also,
rather than making these handles landlock-specific, can they be
implemented in such a way where we can keep track of (some of) these
in other types of programs?
On 14/11/2016 11:35, Sargun Dhillon wrote:
> Was there a plan around getting Daniel's patches in as well? Also,
> rather than making these handles landlock-specific, can they be
> implemented in such a way where we can keep track of (some of) these
> in other types of programs?
>
About the map of handles, this is only a new type of map so it's not
particularly Landlock-specific. Anyway, we'll see that in the third step.
Mickaël