2024-03-15 11:39:32

by Christian Göttsche

Subject: [PATCH 01/10] capability: introduce new capable flag CAP_OPT_NOAUDIT_ONDENY

Introduce a new capable flag, CAP_OPT_NOAUDIT_ONDENY, to not generate
an audit event if the requested capability is not granted. This will be
used in a new capable_any() functionality to reduce the number of
necessary capable calls.

Handle the flag accordingly in AppArmor and SELinux.
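
The audit decision this flag introduces can be sketched in user space
as below. This is only an illustration: should_audit() is a
hypothetical helper that does not exist in the patch, it follows the
SELinux hunk, and it ignores AppArmor's complain-mode special case.

```c
#include <stdbool.h>

/* Flag values mirror the patch; BIT(n) is spelled out for user space. */
#define CAP_OPT_NOAUDIT        (1u << 1)
#define CAP_OPT_NOAUDIT_ONDENY (1u << 3)

/*
 * Hypothetical helper (not part of the patch): decide whether a
 * capability check should reach the audit code.
 * rc is 0 when the capability is granted and non-zero when denied.
 */
static bool should_audit(unsigned int opts, int rc)
{
	/* CAP_OPT_NOAUDIT suppresses auditing unconditionally. */
	if (opts & CAP_OPT_NOAUDIT)
		return false;

	/* CAP_OPT_NOAUDIT_ONDENY suppresses auditing only for denials. */
	if ((opts & CAP_OPT_NOAUDIT_ONDENY) && rc)
		return false;

	return true;
}
```

With CAP_OPT_NOAUDIT_ONDENY set, a granted request is still passed on
for auditing while a denied one is not.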

CC: [email protected]
Suggested-by: Paul Moore <[email protected]>
Signed-off-by: Christian Göttsche <[email protected]>
---
v5:
rename flag to CAP_OPT_NOAUDIT_ONDENY, suggested by Serge:
https://lore.kernel.org/all/[email protected]/
---
include/linux/security.h | 2 ++
security/apparmor/capability.c | 8 +++++---
security/selinux/hooks.c | 14 ++++++++------
3 files changed, 15 insertions(+), 9 deletions(-)

diff --git a/include/linux/security.h b/include/linux/security.h
index 41a8f667bdfa..c60cae78ff8b 100644
--- a/include/linux/security.h
+++ b/include/linux/security.h
@@ -70,6 +70,8 @@ struct lsm_ctx;
#define CAP_OPT_NOAUDIT BIT(1)
/* If capable is being called by a setid function */
#define CAP_OPT_INSETID BIT(2)
+/* If capable should audit the security request for authorized requests only */
+#define CAP_OPT_NOAUDIT_ONDENY BIT(3)

/* LSM Agnostic defines for security_sb_set_mnt_opts() flags */
#define SECURITY_LSM_NATIVE_LABELS 1
diff --git a/security/apparmor/capability.c b/security/apparmor/capability.c
index 9934df16c843..08c9c9a0fc19 100644
--- a/security/apparmor/capability.c
+++ b/security/apparmor/capability.c
@@ -108,7 +108,8 @@ static int audit_caps(struct apparmor_audit_data *ad, struct aa_profile *profile
* profile_capable - test if profile allows use of capability @cap
* @profile: profile being enforced (NOT NULL, NOT unconfined)
* @cap: capability to test if allowed
- * @opts: CAP_OPT_NOAUDIT bit determines whether audit record is generated
+ * @opts: CAP_OPT_NOAUDIT/CAP_OPT_NOAUDIT_ONDENY bit determines whether audit
+ * record is generated
* @ad: audit data (MAY BE NULL indicating no auditing)
*
* Returns: 0 if allowed else -EPERM
@@ -126,7 +127,7 @@ static int profile_capable(struct aa_profile *profile, int cap,
else
error = -EPERM;

- if (opts & CAP_OPT_NOAUDIT) {
+ if ((opts & CAP_OPT_NOAUDIT) || ((opts & CAP_OPT_NOAUDIT_ONDENY) && error)) {
if (!COMPLAIN_MODE(profile))
return error;
/* audit the cap request in complain mode but note that it
@@ -143,7 +144,8 @@ static int profile_capable(struct aa_profile *profile, int cap,
* @subj_cred: cred we are testing capability against
* @label: label being tested for capability (NOT NULL)
* @cap: capability to be tested
- * @opts: CAP_OPT_NOAUDIT bit determines whether audit record is generated
+ * @opts: CAP_OPT_NOAUDIT/CAP_OPT_NOAUDIT_ONDENY bit determines whether audit
+ * record is generated
*
* Look up capability in profile capability set.
*
diff --git a/security/selinux/hooks.c b/security/selinux/hooks.c
index 3448454c82d0..1a2c7c1a89be 100644
--- a/security/selinux/hooks.c
+++ b/security/selinux/hooks.c
@@ -1624,7 +1624,7 @@ static int cred_has_capability(const struct cred *cred,
u16 sclass;
u32 sid = cred_sid(cred);
u32 av = CAP_TO_MASK(cap);
- int rc;
+ int rc, rc2;

ad.type = LSM_AUDIT_DATA_CAP;
ad.u.cap = cap;
@@ -1643,11 +1643,13 @@ static int cred_has_capability(const struct cred *cred,
}

rc = avc_has_perm_noaudit(sid, sid, sclass, av, 0, &avd);
- if (!(opts & CAP_OPT_NOAUDIT)) {
- int rc2 = avc_audit(sid, sid, sclass, av, &avd, rc, &ad);
- if (rc2)
- return rc2;
- }
+ if ((opts & CAP_OPT_NOAUDIT) || ((opts & CAP_OPT_NOAUDIT_ONDENY) && rc))
+ return rc;
+
+ rc2 = avc_audit(sid, sid, sclass, av, &avd, rc, &ad);
+ if (rc2)
+ return rc2;
+
return rc;
}

--
2.43.0



2024-03-15 11:39:35

by Christian Göttsche

Subject: [PATCH 02/10] capability: add any wrappers to test for multiple caps with exactly one audit message

Add the interfaces `capable_any()` and `ns_capable_any()` as an
alternative to multiple `capable()`/`ns_capable()` calls, like
`capable_any(CAP_SYS_NICE, CAP_SYS_ADMIN)` instead of
`capable(CAP_SYS_NICE) || capable(CAP_SYS_ADMIN)`.

`capable_any()`/`ns_capable_any()` generate exactly one audit message:
for the leftmost capability that is in effect or, if the task has
neither, for the first one.

This is especially helpful with SELinux, where every audited check of
a capability that the policy does not allow produces a denial message.
Passing the least invasive capability as the leftmost argument (e.g.
CAP_SYS_NICE before CAP_SYS_ADMIN) lets policy writers grant only the
least invasive one to the particular subject instead of both.
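
The "exactly one audit message" behaviour can be checked with a small
user-space model. Everything prefixed with model_/MODEL_ is
hypothetical scaffolding, not kernel API; the model simply counts one
record per audited check, which is a simplification of what an LSM
actually emits.

```c
#include <stdbool.h>

#define MODEL_OPT_NONE           0u
#define MODEL_OPT_NOAUDIT_ONDENY (1u << 3)

/* Capability numbers as in include/uapi/linux/capability.h. */
#define MODEL_CAP_SYS_ADMIN 21
#define MODEL_CAP_SYS_NICE  23

/* Toy state: bitmask of granted capabilities and a tiny audit log. */
static unsigned long long model_granted;
static int model_audit_count;
static int model_last_audited = -1;

static void model_reset(unsigned long long granted)
{
	model_granted = granted;
	model_audit_count = 0;
	model_last_audited = -1;
}

/* One check; audited unless it is denied with NOAUDIT_ONDENY set. */
static bool model_capable_common(int cap, unsigned int opts)
{
	bool ok = (model_granted >> cap) & 1ULL;

	if (ok || !(opts & MODEL_OPT_NOAUDIT_ONDENY)) {
		model_audit_count++;
		model_last_audited = cap;
	}
	return ok;
}

/* Same control flow as ns_capable_any() in this patch. */
static bool model_capable_any(int cap1, int cap2)
{
	if (cap1 == cap2)
		return model_capable_common(cap1, MODEL_OPT_NONE);

	if (model_capable_common(cap1, MODEL_OPT_NOAUDIT_ONDENY))
		return true;

	if (model_capable_common(cap2, MODEL_OPT_NOAUDIT_ONDENY))
		return true;

	/* Both denied: repeat the first check with auditing enabled. */
	return model_capable_common(cap1, MODEL_OPT_NONE);
}
```

In this model a task holding only CAP_SYS_ADMIN gets a single record
for CAP_SYS_ADMIN, and a task holding neither gets a single record for
the first argument.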

CC: [email protected]
Signed-off-by: Christian Göttsche <[email protected]>
---
v5:
- add check for identical passed capabilities
- rename internal helper according to flag rename to
ns_capable_noauditondeny()
v4:
Use CAP_OPT_NODENYAUDIT via added ns_capable_nodenyaudit()
v3:
- rename to capable_any()
- fix typo in function documentation
- add ns_capable_any()
v2:
avoid varargs and fix to two capabilities; capable_or3() can be added
later if needed
---
include/linux/capability.h | 10 ++++++
kernel/capability.c | 73 ++++++++++++++++++++++++++++++++++++++
2 files changed, 83 insertions(+)

diff --git a/include/linux/capability.h b/include/linux/capability.h
index 0c356a517991..eeb958440656 100644
--- a/include/linux/capability.h
+++ b/include/linux/capability.h
@@ -146,7 +146,9 @@ extern bool has_capability_noaudit(struct task_struct *t, int cap);
extern bool has_ns_capability_noaudit(struct task_struct *t,
struct user_namespace *ns, int cap);
extern bool capable(int cap);
+extern bool capable_any(int cap1, int cap2);
extern bool ns_capable(struct user_namespace *ns, int cap);
+extern bool ns_capable_any(struct user_namespace *ns, int cap1, int cap2);
extern bool ns_capable_noaudit(struct user_namespace *ns, int cap);
extern bool ns_capable_setid(struct user_namespace *ns, int cap);
#else
@@ -172,10 +174,18 @@ static inline bool capable(int cap)
{
return true;
}
+static inline bool capable_any(int cap1, int cap2)
+{
+ return true;
+}
static inline bool ns_capable(struct user_namespace *ns, int cap)
{
return true;
}
+static inline bool ns_capable_any(struct user_namespace *ns, int cap1, int cap2)
+{
+ return true;
+}
static inline bool ns_capable_noaudit(struct user_namespace *ns, int cap)
{
return true;
diff --git a/kernel/capability.c b/kernel/capability.c
index dac4df77e376..73358abfe2e1 100644
--- a/kernel/capability.c
+++ b/kernel/capability.c
@@ -402,6 +402,23 @@ bool ns_capable_noaudit(struct user_namespace *ns, int cap)
}
EXPORT_SYMBOL(ns_capable_noaudit);

+/**
+ * ns_capable_noauditondeny - Determine if the current task has a superior capability
+ * (unaudited when unauthorized) in effect
+ * @ns: The usernamespace we want the capability in
+ * @cap: The capability to be tested for
+ *
+ * Return true if the current task has the given superior capability currently
+ * available for use, false if not.
+ *
+ * This sets PF_SUPERPRIV on the task if the capability is available on the
+ * assumption that it's about to be used.
+ */
+static bool ns_capable_noauditondeny(struct user_namespace *ns, int cap)
+{
+ return ns_capable_common(ns, cap, CAP_OPT_NOAUDIT_ONDENY);
+}
+
/**
* ns_capable_setid - Determine if the current task has a superior capability
* in effect, while signalling that this check is being done from within a
@@ -421,6 +438,62 @@ bool ns_capable_setid(struct user_namespace *ns, int cap)
}
EXPORT_SYMBOL(ns_capable_setid);

+/**
+ * ns_capable_any - Determine if the current task has one of two superior capabilities in effect
+ * @ns: The usernamespace we want the capability in
+ * @cap1: The capability to be tested first
+ * @cap2: The capability to be tested second
+ *
+ * Return true if the current task has at least one of the two given superior
+ * capabilities currently available for use, false if not.
+ *
+ * In contrast to or'ing capable() this call will create exactly one audit
+ * message, either for @cap1, if it is granted or both are not permitted,
+ * or @cap2, if it is granted while the other one is not.
+ *
+ * The capabilities should be ordered from least to most invasive, i.e. CAP_SYS_ADMIN last.
+ *
+ * This sets PF_SUPERPRIV on the task if the capability is available on the
+ * assumption that it's about to be used.
+ */
+bool ns_capable_any(struct user_namespace *ns, int cap1, int cap2)
+{
+ if (cap1 == cap2)
+ return ns_capable(ns, cap1);
+
+ if (ns_capable_noauditondeny(ns, cap1))
+ return true;
+
+ if (ns_capable_noauditondeny(ns, cap2))
+ return true;
+
+ return ns_capable(ns, cap1);
+}
+EXPORT_SYMBOL(ns_capable_any);
+
+/**
+ * capable_any - Determine if the current task has one of two superior capabilities in effect
+ * @cap1: The capability to be tested first
+ * @cap2: The capability to be tested second
+ *
+ * Return true if the current task has at least one of the two given superior
+ * capabilities currently available for use, false if not.
+ *
+ * In contrast to or'ing capable() this call will create exactly one audit
+ * message, either for @cap1, if it is granted or both are not permitted,
+ * or @cap2, if it is granted while the other one is not.
+ *
+ * The capabilities should be ordered from least to most invasive, i.e. CAP_SYS_ADMIN last.
+ *
+ * This sets PF_SUPERPRIV on the task if the capability is available on the
+ * assumption that it's about to be used.
+ */
+bool capable_any(int cap1, int cap2)
+{
+ return ns_capable_any(&init_user_ns, cap1, cap2);
+}
+EXPORT_SYMBOL(capable_any);
+
/**
* capable - Determine if the current task has a superior capability in effect
* @cap: The capability to be tested for
--
2.43.0


2024-03-15 11:39:37

by Christian Göttsche

Subject: [PATCH 03/10] capability: use new capable_any functionality

Use the newly added capable_any function in appropriate cases, where a
task is required to have any of two capabilities.

Signed-off-by: Christian Göttsche <[email protected]>
---
v3:
- rename to capable_any()
- simplify checkpoint_restore_ns_capable()
---
include/linux/capability.h | 7 +++----
1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/include/linux/capability.h b/include/linux/capability.h
index eeb958440656..4db0ffb47271 100644
--- a/include/linux/capability.h
+++ b/include/linux/capability.h
@@ -204,18 +204,17 @@ extern bool file_ns_capable(const struct file *file, struct user_namespace *ns,
extern bool ptracer_capable(struct task_struct *tsk, struct user_namespace *ns);
static inline bool perfmon_capable(void)
{
- return capable(CAP_PERFMON) || capable(CAP_SYS_ADMIN);
+ return capable_any(CAP_PERFMON, CAP_SYS_ADMIN);
}

static inline bool bpf_capable(void)
{
- return capable(CAP_BPF) || capable(CAP_SYS_ADMIN);
+ return capable_any(CAP_BPF, CAP_SYS_ADMIN);
}

static inline bool checkpoint_restore_ns_capable(struct user_namespace *ns)
{
- return ns_capable(ns, CAP_CHECKPOINT_RESTORE) ||
- ns_capable(ns, CAP_SYS_ADMIN);
+ return ns_capable_any(ns, CAP_CHECKPOINT_RESTORE, CAP_SYS_ADMIN);
}

/* audit system wants to get cap info from files as well */
--
2.43.0


2024-03-15 11:40:21

by Christian Göttsche

Subject: [PATCH 04/10] block: use new capable_any functionality

Use the newly added capable_any function in appropriate cases, where a
task is required to have any of two capabilities.

Reorder CAP_SYS_ADMIN last.

Fixes: 94c4b4fd25e6 ("block: Check ADMIN before NICE for IOPRIO_CLASS_RT")
Signed-off-by: Christian Göttsche <[email protected]>
---
v3:
rename to capable_any()
---
block/ioprio.c | 9 +--------
1 file changed, 1 insertion(+), 8 deletions(-)

diff --git a/block/ioprio.c b/block/ioprio.c
index 73301a261429..6e1291679ea0 100644
--- a/block/ioprio.c
+++ b/block/ioprio.c
@@ -37,14 +37,7 @@ int ioprio_check_cap(int ioprio)

switch (class) {
case IOPRIO_CLASS_RT:
- /*
- * Originally this only checked for CAP_SYS_ADMIN,
- * which was implicitly allowed for pid 0 by security
- * modules such as SELinux. Make sure we check
- * CAP_SYS_ADMIN first to avoid a denial/avc for
- * possibly missing CAP_SYS_NICE permission.
- */
- if (!capable(CAP_SYS_ADMIN) && !capable(CAP_SYS_NICE))
+ if (!capable_any(CAP_SYS_NICE, CAP_SYS_ADMIN))
return -EPERM;
fallthrough;
/* rt has prio field too */
--
2.43.0


2024-03-15 11:40:34

by Christian Göttsche

Subject: [PATCH 05/10] drivers: use new capable_any functionality

Use the newly added capable_any function in appropriate cases, where a
task is required to have any of two capabilities.

Reorder CAP_SYS_ADMIN last.

Signed-off-by: Christian Göttsche <[email protected]>
Acked-by: Alexander Gordeev <[email protected]> (s390 portion)
---
v4:
Additional usage in kfd_ioctl()
v3:
rename to capable_any()
---
drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 3 +--
drivers/net/caif/caif_serial.c | 2 +-
drivers/s390/block/dasd_eckd.c | 2 +-
3 files changed, 3 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
index dfa8c69532d4..8c7ebca01c17 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
@@ -3290,8 +3290,7 @@ static long kfd_ioctl(struct file *filep, unsigned int cmd, unsigned long arg)
* more priviledged access.
*/
if (unlikely(ioctl->flags & KFD_IOC_FLAG_CHECKPOINT_RESTORE)) {
- if (!capable(CAP_CHECKPOINT_RESTORE) &&
- !capable(CAP_SYS_ADMIN)) {
+ if (!capable_any(CAP_CHECKPOINT_RESTORE, CAP_SYS_ADMIN)) {
retcode = -EACCES;
goto err_i1;
}
diff --git a/drivers/net/caif/caif_serial.c b/drivers/net/caif/caif_serial.c
index ed3a589def6b..e908b9ce57dc 100644
--- a/drivers/net/caif/caif_serial.c
+++ b/drivers/net/caif/caif_serial.c
@@ -326,7 +326,7 @@ static int ldisc_open(struct tty_struct *tty)
/* No write no play */
if (tty->ops->write == NULL)
return -EOPNOTSUPP;
- if (!capable(CAP_SYS_ADMIN) && !capable(CAP_SYS_TTY_CONFIG))
+ if (!capable_any(CAP_SYS_TTY_CONFIG, CAP_SYS_ADMIN))
return -EPERM;

/* release devices to avoid name collision */
diff --git a/drivers/s390/block/dasd_eckd.c b/drivers/s390/block/dasd_eckd.c
index 373c1a86c33e..8f9a5136306a 100644
--- a/drivers/s390/block/dasd_eckd.c
+++ b/drivers/s390/block/dasd_eckd.c
@@ -5384,7 +5384,7 @@ static int dasd_symm_io(struct dasd_device *device, void __user *argp)
char psf0, psf1;
int rc;

- if (!capable(CAP_SYS_ADMIN) && !capable(CAP_SYS_RAWIO))
+ if (!capable_any(CAP_SYS_RAWIO, CAP_SYS_ADMIN))
return -EACCES;
psf0 = psf1 = 0;

--
2.43.0


2024-03-15 11:41:26

by Christian Göttsche

Subject: [PATCH 06/10] fs: use new capable_any functionality

Use the newly added capable_any function in appropriate cases, where a
task is required to have any of two capabilities.

Signed-off-by: Christian Göttsche <[email protected]>
Acked-by: Christian Brauner <[email protected]>
---
v3:
rename to capable_any()
---
fs/pipe.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/pipe.c b/fs/pipe.c
index 50c8a8596b52..9d02698ed5d4 100644
--- a/fs/pipe.c
+++ b/fs/pipe.c
@@ -784,7 +784,7 @@ bool too_many_pipe_buffers_hard(unsigned long user_bufs)

bool pipe_is_unprivileged_user(void)
{
- return !capable(CAP_SYS_RESOURCE) && !capable(CAP_SYS_ADMIN);
+ return !capable_any(CAP_SYS_RESOURCE, CAP_SYS_ADMIN);
}

struct pipe_inode_info *alloc_pipe_info(void)
--
2.43.0


2024-03-15 11:41:34

by Christian Göttsche

Subject: [PATCH 08/10] net: use new capable_any functionality

Use the newly added capable_any function in appropriate cases, where a
task is required to have any of two capabilities.

Add a sockopt_ns_capable_any() wrapper, similar to the existing
sockopt_ns_capable().
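
As a rough user-space sketch of that wrapper's short-circuit: when a
BPF context is active, the capability check (and with it any audit
record) is skipped entirely. All toy_ names are hypothetical stand-ins
for has_current_bpf_ctx() and ns_capable_any(), and the namespace
argument is omitted for brevity.

```c
#include <stdbool.h>

/* Capability numbers as in include/uapi/linux/capability.h. */
#define TOY_CAP_NET_ADMIN 12
#define TOY_CAP_NET_RAW   13

/* Hypothetical stand-in for has_current_bpf_ctx(). */
static bool toy_in_bpf_ctx;

/* Counts how often the underlying capability check is reached. */
static int toy_cap_checks;

/* Hypothetical stand-in for ns_capable_any(); always denies here. */
static bool toy_ns_capable_any(int cap1, int cap2)
{
	(void)cap1;
	(void)cap2;
	toy_cap_checks++;
	return false;
}

/* Same shape as sockopt_ns_capable_any() in this patch. */
static bool toy_sockopt_ns_capable_any(int cap1, int cap2)
{
	return toy_in_bpf_ctx || toy_ns_capable_any(cap1, cap2);
}
```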

Reorder CAP_SYS_ADMIN last.

Signed-off-by: Christian Göttsche <[email protected]>
Reviewed-by: Miquel Raynal <[email protected]> (ieee802154 portion)
---
v4:
- introduce sockopt_ns_capable_any()
v3:
- rename to capable_any()
- make use of ns_capable_any
---
include/net/sock.h | 1 +
net/caif/caif_socket.c | 2 +-
net/core/sock.c | 15 +++++++++------
net/ieee802154/socket.c | 6 ++----
net/ipv4/ip_sockglue.c | 5 +++--
net/ipv6/ipv6_sockglue.c | 3 +--
net/unix/af_unix.c | 2 +-
7 files changed, 18 insertions(+), 16 deletions(-)

diff --git a/include/net/sock.h b/include/net/sock.h
index b5e00702acc1..2e64a80c8fca 100644
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -1736,6 +1736,7 @@ static inline void unlock_sock_fast(struct sock *sk, bool slow)
void sockopt_lock_sock(struct sock *sk);
void sockopt_release_sock(struct sock *sk);
bool sockopt_ns_capable(struct user_namespace *ns, int cap);
+bool sockopt_ns_capable_any(struct user_namespace *ns, int cap1, int cap2);
bool sockopt_capable(int cap);

/* Used by processes to "lock" a socket state, so that
diff --git a/net/caif/caif_socket.c b/net/caif/caif_socket.c
index 039dfbd367c9..2d811037e378 100644
--- a/net/caif/caif_socket.c
+++ b/net/caif/caif_socket.c
@@ -1026,7 +1026,7 @@ static int caif_create(struct net *net, struct socket *sock, int protocol,
.usersize = sizeof_field(struct caifsock, conn_req.param)
};

- if (!capable(CAP_SYS_ADMIN) && !capable(CAP_NET_ADMIN))
+ if (!capable_any(CAP_NET_ADMIN, CAP_SYS_ADMIN))
return -EPERM;
/*
* The sock->type specifies the socket type to use.
diff --git a/net/core/sock.c b/net/core/sock.c
index 43bf3818c19e..fa9edcc3e23d 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -1077,6 +1077,12 @@ bool sockopt_ns_capable(struct user_namespace *ns, int cap)
}
EXPORT_SYMBOL(sockopt_ns_capable);

+bool sockopt_ns_capable_any(struct user_namespace *ns, int cap1, int cap2)
+{
+ return has_current_bpf_ctx() || ns_capable_any(ns, cap1, cap2);
+}
+EXPORT_SYMBOL(sockopt_ns_capable_any);
+
bool sockopt_capable(int cap)
{
return has_current_bpf_ctx() || capable(cap);
@@ -1118,8 +1124,7 @@ int sk_setsockopt(struct sock *sk, int level, int optname,
switch (optname) {
case SO_PRIORITY:
if ((val >= 0 && val <= 6) ||
- sockopt_ns_capable(sock_net(sk)->user_ns, CAP_NET_RAW) ||
- sockopt_ns_capable(sock_net(sk)->user_ns, CAP_NET_ADMIN)) {
+ sockopt_ns_capable_any(sock_net(sk)->user_ns, CAP_NET_RAW, CAP_NET_ADMIN)) {
sock_set_priority(sk, val);
return 0;
}
@@ -1422,8 +1427,7 @@ int sk_setsockopt(struct sock *sk, int level, int optname,
break;

case SO_MARK:
- if (!sockopt_ns_capable(sock_net(sk)->user_ns, CAP_NET_RAW) &&
- !sockopt_ns_capable(sock_net(sk)->user_ns, CAP_NET_ADMIN)) {
+ if (!sockopt_ns_capable_any(sock_net(sk)->user_ns, CAP_NET_RAW, CAP_NET_ADMIN)) {
ret = -EPERM;
break;
}
@@ -2813,8 +2817,7 @@ int __sock_cmsg_send(struct sock *sk, struct cmsghdr *cmsg,

switch (cmsg->cmsg_type) {
case SO_MARK:
- if (!ns_capable(sock_net(sk)->user_ns, CAP_NET_RAW) &&
- !ns_capable(sock_net(sk)->user_ns, CAP_NET_ADMIN))
+ if (!ns_capable_any(sock_net(sk)->user_ns, CAP_NET_RAW, CAP_NET_ADMIN))
return -EPERM;
if (cmsg->cmsg_len != CMSG_LEN(sizeof(u32)))
return -EINVAL;
diff --git a/net/ieee802154/socket.c b/net/ieee802154/socket.c
index 990a83455dcf..42b3b12eb493 100644
--- a/net/ieee802154/socket.c
+++ b/net/ieee802154/socket.c
@@ -902,8 +902,7 @@ static int dgram_setsockopt(struct sock *sk, int level, int optname,
ro->want_lqi = !!val;
break;
case WPAN_SECURITY:
- if (!ns_capable(net->user_ns, CAP_NET_ADMIN) &&
- !ns_capable(net->user_ns, CAP_NET_RAW)) {
+ if (!ns_capable_any(net->user_ns, CAP_NET_ADMIN, CAP_NET_RAW)) {
err = -EPERM;
break;
}
@@ -926,8 +925,7 @@ static int dgram_setsockopt(struct sock *sk, int level, int optname,
}
break;
case WPAN_SECURITY_LEVEL:
- if (!ns_capable(net->user_ns, CAP_NET_ADMIN) &&
- !ns_capable(net->user_ns, CAP_NET_RAW)) {
+ if (!ns_capable_any(net->user_ns, CAP_NET_ADMIN, CAP_NET_RAW)) {
err = -EPERM;
break;
}
diff --git a/net/ipv4/ip_sockglue.c b/net/ipv4/ip_sockglue.c
index cf377377b52d..5a1e5ee20ddd 100644
--- a/net/ipv4/ip_sockglue.c
+++ b/net/ipv4/ip_sockglue.c
@@ -1008,8 +1008,9 @@ int do_ip_setsockopt(struct sock *sk, int level, int optname,
inet_assign_bit(MC_ALL, sk, val);
return 0;
case IP_TRANSPARENT:
- if (!!val && !sockopt_ns_capable(sock_net(sk)->user_ns, CAP_NET_RAW) &&
- !sockopt_ns_capable(sock_net(sk)->user_ns, CAP_NET_ADMIN))
+ if (!!val &&
+ !sockopt_ns_capable_any(sock_net(sk)->user_ns,
+ CAP_NET_RAW, CAP_NET_ADMIN))
return -EPERM;
if (optlen < 1)
return -EINVAL;
diff --git a/net/ipv6/ipv6_sockglue.c b/net/ipv6/ipv6_sockglue.c
index d4c28ec1bc51..e46b11b5d3dd 100644
--- a/net/ipv6/ipv6_sockglue.c
+++ b/net/ipv6/ipv6_sockglue.c
@@ -773,8 +773,7 @@ int do_ipv6_setsockopt(struct sock *sk, int level, int optname,
break;

case IPV6_TRANSPARENT:
- if (valbool && !sockopt_ns_capable(net->user_ns, CAP_NET_RAW) &&
- !sockopt_ns_capable(net->user_ns, CAP_NET_ADMIN)) {
+ if (valbool && !sockopt_ns_capable_any(net->user_ns, CAP_NET_RAW, CAP_NET_ADMIN)) {
retv = -EPERM;
break;
}
diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c
index 5b41e2321209..acc36b2d25d7 100644
--- a/net/unix/af_unix.c
+++ b/net/unix/af_unix.c
@@ -1783,7 +1783,7 @@ static inline bool too_many_unix_fds(struct task_struct *p)
struct user_struct *user = current_user();

if (unlikely(READ_ONCE(user->unix_inflight) > task_rlimit(p, RLIMIT_NOFILE)))
- return !capable(CAP_SYS_RESOURCE) && !capable(CAP_SYS_ADMIN);
+ return !capable_any(CAP_SYS_RESOURCE, CAP_SYS_ADMIN);
return false;
}

--
2.43.0


2024-03-15 11:41:41

by Christian Göttsche

Subject: [PATCH 09/10] bpf: use new capable_any functionality

Use the newly added capable_any function in bpf_token_capable() and
bpf_net_capable() implementations.

Signed-off-by: Christian Göttsche <[email protected]>
---
v5:
add patch
---
include/linux/bpf.h | 2 +-
kernel/bpf/syscall.c | 2 +-
kernel/bpf/token.c | 2 +-
3 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index 4f20f62f9d63..bdadf3291bec 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -2701,7 +2701,7 @@ static inline int bpf_obj_get_user(const char __user *pathname, int flags)

static inline bool bpf_token_capable(const struct bpf_token *token, int cap)
{
- return capable(cap) || (cap != CAP_SYS_ADMIN && capable(CAP_SYS_ADMIN));
+ return capable_any(cap, CAP_SYS_ADMIN);
}

static inline void bpf_token_inc(struct bpf_token *token)
diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index ae2ff73bde7e..a10e6f77002c 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -1175,7 +1175,7 @@ static int map_check_btf(struct bpf_map *map, struct bpf_token *token,

static bool bpf_net_capable(void)
{
- return capable(CAP_NET_ADMIN) || capable(CAP_SYS_ADMIN);
+ return capable_any(CAP_NET_ADMIN, CAP_SYS_ADMIN);
}

#define BPF_MAP_CREATE_LAST_FIELD map_token_fd
diff --git a/kernel/bpf/token.c b/kernel/bpf/token.c
index d6ccf8d00eab..53f491046a8d 100644
--- a/kernel/bpf/token.c
+++ b/kernel/bpf/token.c
@@ -11,7 +11,7 @@

static bool bpf_ns_capable(struct user_namespace *ns, int cap)
{
- return ns_capable(ns, cap) || (cap != CAP_SYS_ADMIN && ns_capable(ns, CAP_SYS_ADMIN));
+ return ns_capable_any(ns, cap, CAP_SYS_ADMIN);
}

bool bpf_token_capable(const struct bpf_token *token, int cap)
--
2.43.0


2024-03-15 11:41:47

by Christian Göttsche

Subject: [PATCH 07/10] kernel: use new capable_any functionality

Use the newly added capable_any function in appropriate cases, where a
task is required to have any of two capabilities.

Signed-off-by: Christian Göttsche <[email protected]>
Reviewed-by: Christian Brauner <[email protected]>
---
v3:
rename to capable_any()
---
kernel/fork.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/fork.c b/kernel/fork.c
index 39a5046c2f0b..645ab8060407 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -2257,7 +2257,7 @@ __latent_entropy struct task_struct *copy_process(
retval = -EAGAIN;
if (is_rlimit_overlimit(task_ucounts(p), UCOUNT_RLIMIT_NPROC, rlimit(RLIMIT_NPROC))) {
if (p->real_cred->user != INIT_USER &&
- !capable(CAP_SYS_RESOURCE) && !capable(CAP_SYS_ADMIN))
+ !capable_any(CAP_SYS_RESOURCE, CAP_SYS_ADMIN))
goto bad_fork_cleanup_count;
}
current->flags &= ~PF_NPROC_EXCEEDED;
--
2.43.0


2024-03-15 11:42:56

by Christian Göttsche

Subject: [PATCH 10/10] coccinelle: add script for capable_any()

Add a script to find chained capable() calls and replace them with
capable_any().
Also find capable_any() calls where CAP_SYS_ADMIN is passed as the
first argument and reorder them so that CAP_SYS_ADMIN comes last.
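
The negated rewrite performed by the script relies on De Morgan's law:
"!capable(a) && !capable(b)" and "!capable_any(a, b)" return the same
value. A minimal user-space sketch, in which every toy_ name is a
hypothetical stand-in rather than kernel API and auditing is not
modelled:

```c
#include <stdbool.h>

/* Toy state: bitmask of capabilities the pretend task holds. */
static unsigned int toy_caps;

static bool toy_capable(int cap)
{
	return (toy_caps >> cap) & 1u;
}

/* Return-value behaviour of capable_any() only; no audit handling. */
static bool toy_capable_any(int cap1, int cap2)
{
	return toy_capable(cap1) || toy_capable(cap2);
}

/* Before: the chained form the script searches for. */
static bool toy_denied_before(int a, int b)
{
	return !toy_capable(a) && !toy_capable(b);
}

/* After: the form the script rewrites it to. */
static bool toy_denied_after(int a, int b)
{
	return !toy_capable_any(a, b);
}
```

The two forms differ only in how many audit records a real kernel
would emit, which is the point of the series.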

Signed-off-by: Christian Göttsche <[email protected]>
---
v5:
add patch
---
MAINTAINERS | 1 +
scripts/coccinelle/api/capable_any.cocci | 164 +++++++++++++++++++++++
2 files changed, 165 insertions(+)
create mode 100644 scripts/coccinelle/api/capable_any.cocci

diff --git a/MAINTAINERS b/MAINTAINERS
index f4d7f7cb7577..32349e4c5f56 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -4731,6 +4731,7 @@ S: Supported
F: include/linux/capability.h
F: include/uapi/linux/capability.h
F: kernel/capability.c
+F: scripts/coccinelle/api/capable_any.cocci
F: security/commoncap.c

CAPELLA MICROSYSTEMS LIGHT SENSOR DRIVER
diff --git a/scripts/coccinelle/api/capable_any.cocci b/scripts/coccinelle/api/capable_any.cocci
new file mode 100644
index 000000000000..83aedd3bf81d
--- /dev/null
+++ b/scripts/coccinelle/api/capable_any.cocci
@@ -0,0 +1,164 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/// Use capable_any rather than chaining capable and order CAP_SYS_ADMIN last
+///
+// Confidence: High
+// Copyright: (C) 2024 Christian Göttsche.
+// URL: https://coccinelle.gitlabpages.inria.fr/website
+// Options: --no-includes --include-headers
+// Keywords: capable, capable_any, ns_capable, ns_capable_any, sockopt_ns_capable, sockopt_ns_capable_any
+
+virtual patch
+virtual context
+virtual org
+virtual report
+
+//----------------------------------------------------------
+// For patch mode
+//----------------------------------------------------------
+
+@ depends on patch@
+binary operator op;
+expression cap1,cap2,E;
+expression ns;
+@@
+
+(
+- capable(cap1) || capable(cap2)
++ capable_any(cap1, cap2)
+|
+- E op capable(cap1) || capable(cap2)
++ E op capable_any(cap1, cap2)
+|
+- !capable(cap1) && !capable(cap2)
++ !capable_any(cap1, cap2)
+|
+- E op !capable(cap1) && !capable(cap2)
++ E op !capable_any(cap1, cap2)
+|
+- ns_capable(ns, cap1) || ns_capable(ns, cap2)
++ ns_capable_any(ns, cap1, cap2)
+|
+- E op ns_capable(ns, cap1) || ns_capable(ns, cap2)
++ E op ns_capable_any(ns, cap1, cap2)
+|
+- !ns_capable(ns, cap1) && !ns_capable(ns, cap2)
++ !ns_capable_any(ns, cap1, cap2)
+|
+- E op !ns_capable(ns, cap1) && !ns_capable(ns, cap2)
++ E op !ns_capable_any(ns, cap1, cap2)
+|
+- sockopt_ns_capable(ns, cap1) || sockopt_ns_capable(ns, cap2)
++ sockopt_ns_capable_any(ns, cap1, cap2)
+|
+- E op sockopt_ns_capable(ns, cap1) || sockopt_ns_capable(ns, cap2)
++ E op sockopt_ns_capable_any(ns, cap1, cap2)
+|
+- !sockopt_ns_capable(ns, cap1) && !sockopt_ns_capable(ns, cap2)
++ !sockopt_ns_capable_any(ns, cap1, cap2)
+|
+- E op !sockopt_ns_capable(ns, cap1) && !sockopt_ns_capable(ns, cap2)
++ E op !sockopt_ns_capable_any(ns, cap1, cap2)
+)
+
+@ depends on patch@
+identifier func = { capable_any, ns_capable_any, sockopt_ns_capable_any };
+expression cap;
+expression ns;
+@@
+
+(
+- func(CAP_SYS_ADMIN, cap)
++ func(cap, CAP_SYS_ADMIN)
+|
+- func(ns, CAP_SYS_ADMIN, cap)
++ func(ns, cap, CAP_SYS_ADMIN)
+)
+
+//----------------------------------------------------------
+// For context mode
+//----------------------------------------------------------
+
+@r1 depends on !patch exists@
+binary operator op;
+expression cap1,cap2,E;
+expression ns;
+position p1,p2;
+@@
+
+(
+* capable@p1(cap1) || capable@p2(cap2)
+|
+* E op capable@p1(cap1) || capable@p2(cap2)
+|
+* !capable@p1(cap1) && !capable@p2(cap2)
+|
+* E op !capable@p1(cap1) && !capable@p2(cap2)
+|
+* ns_capable@p1(ns, cap1) || ns_capable@p2(ns, cap2)
+|
+* E op ns_capable@p1(ns, cap1) || ns_capable@p2(ns, cap2)
+|
+* !ns_capable@p1(ns, cap1) && !ns_capable@p2(ns, cap2)
+|
+* E op !ns_capable@p1(ns, cap1) && !ns_capable@p2(ns, cap2)
+|
+* sockopt_ns_capable@p1(ns, cap1) || sockopt_ns_capable@p2(ns, cap2)
+|
+* E op sockopt_ns_capable@p1(ns, cap1) || sockopt_ns_capable@p2(ns, cap2)
+|
+* !sockopt_ns_capable@p1(ns, cap1) && !sockopt_ns_capable@p2(ns, cap2)
+|
+* E op !sockopt_ns_capable@p1(ns, cap1) && !sockopt_ns_capable@p2(ns, cap2)
+)
+
+@r2 depends on !patch exists@
+identifier func = { capable_any, ns_capable_any, sockopt_ns_capable_any };
+expression cap;
+expression ns;
+position p;
+@@
+
+(
+* func@p(CAP_SYS_ADMIN, cap)
+|
+* func@p(ns, CAP_SYS_ADMIN, cap)
+)
+
+//----------------------------------------------------------
+// For org mode
+//----------------------------------------------------------
+
+@script:python depends on org@
+p1 << r1.p1;
+p2 << r1.p2;
+@@
+
+cocci.print_main("WARNING opportunity for capable_any",p1)
+cocci.print_secs("chained capable",p2)
+
+@script:python depends on org@
+p << r2.p;
+f << r2.func;
+@@
+
+cocci.print_main("WARNING " + f + " arguments should be reordered",p)
+
+//----------------------------------------------------------
+// For report mode
+//----------------------------------------------------------
+
+@script:python depends on report@
+p1 << r1.p1;
+p2 << r1.p2;
+@@
+
+msg = "WARNING opportunity for capable_any (chained capable line %s)" % (p2[0].line)
+coccilib.report.print_report(p1[0], msg)
+
+@script:python depends on report@
+p << r2.p;
+f << r2.func;
+@@
+
+msg = "WARNING %s arguments should be reordered" % (f)
+coccilib.report.print_report(p[0], msg)
--
2.43.0


2024-03-15 15:03:42

by Tycho Andersen

Subject: Re: [PATCH 07/10] kernel: use new capable_any functionality

On Fri, Mar 15, 2024 at 5:39 AM Christian Göttsche
<[email protected]> wrote:
>
> Use the new added capable_any function in appropriate cases, where a
> task is required to have any of two capabilities.
>
> Signed-off-by: Christian Göttsche <[email protected]>
> Reviewed-by: Christian Brauner <[email protected]>


Reviewed-by: Tycho Andersen <[email protected]>

2024-03-15 15:04:42

by Felix Kuehling

Subject: Re: [PATCH 05/10] drivers: use new capable_any functionality

On 2024-03-15 7:37, Christian Göttsche wrote:
> Use the new added capable_any function in appropriate cases, where a
> task is required to have any of two capabilities.
>
> Reorder CAP_SYS_ADMIN last.
>
> Signed-off-by: Christian Göttsche <[email protected]>
> Acked-by: Alexander Gordeev <[email protected]> (s390 portion)

Acked-by: Felix Kuehling <[email protected]> (amdkfd portion)


> ---
> v4:
> Additional usage in kfd_ioctl()
> v3:
> rename to capable_any()
> ---
> drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 3 +--
> drivers/net/caif/caif_serial.c | 2 +-
> drivers/s390/block/dasd_eckd.c | 2 +-
> 3 files changed, 3 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
> index dfa8c69532d4..8c7ebca01c17 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
> @@ -3290,8 +3290,7 @@ static long kfd_ioctl(struct file *filep, unsigned int cmd, unsigned long arg)
> * more priviledged access.
> */
> if (unlikely(ioctl->flags & KFD_IOC_FLAG_CHECKPOINT_RESTORE)) {
> - if (!capable(CAP_CHECKPOINT_RESTORE) &&
> - !capable(CAP_SYS_ADMIN)) {
> + if (!capable_any(CAP_CHECKPOINT_RESTORE, CAP_SYS_ADMIN)) {
> retcode = -EACCES;
> goto err_i1;
> }
> diff --git a/drivers/net/caif/caif_serial.c b/drivers/net/caif/caif_serial.c
> index ed3a589def6b..e908b9ce57dc 100644
> --- a/drivers/net/caif/caif_serial.c
> +++ b/drivers/net/caif/caif_serial.c
> @@ -326,7 +326,7 @@ static int ldisc_open(struct tty_struct *tty)
> /* No write no play */
> if (tty->ops->write == NULL)
> return -EOPNOTSUPP;
> - if (!capable(CAP_SYS_ADMIN) && !capable(CAP_SYS_TTY_CONFIG))
> + if (!capable_any(CAP_SYS_TTY_CONFIG, CAP_SYS_ADMIN))
> return -EPERM;
>
> /* release devices to avoid name collision */
> diff --git a/drivers/s390/block/dasd_eckd.c b/drivers/s390/block/dasd_eckd.c
> index 373c1a86c33e..8f9a5136306a 100644
> --- a/drivers/s390/block/dasd_eckd.c
> +++ b/drivers/s390/block/dasd_eckd.c
> @@ -5384,7 +5384,7 @@ static int dasd_symm_io(struct dasd_device *device, void __user *argp)
> char psf0, psf1;
> int rc;
>
> - if (!capable(CAP_SYS_ADMIN) && !capable(CAP_SYS_RAWIO))
> + if (!capable_any(CAP_SYS_RAWIO, CAP_SYS_ADMIN))
> return -EACCES;
> psf0 = psf1 = 0;
>

2024-03-15 16:47:03

by Andrii Nakryiko

Subject: Re: [PATCH 09/10] bpf: use new capable_any functionality

On Fri, Mar 15, 2024 at 4:39 AM Christian Göttsche
<[email protected]> wrote:
>
> Use the new added capable_any function in bpf_token_capable() and
> bpf_net_capable() implementations.
>
> Signed-off-by: Christian Göttsche <[email protected]>
> ---
> v5:
> add patch
> ---
> include/linux/bpf.h | 2 +-
> kernel/bpf/syscall.c | 2 +-
> kernel/bpf/token.c | 2 +-
> 3 files changed, 3 insertions(+), 3 deletions(-)
>

It's actually a nice readability improvement, thanks!

Acked-by: Andrii Nakryiko <[email protected]>

> diff --git a/include/linux/bpf.h b/include/linux/bpf.h
> index 4f20f62f9d63..bdadf3291bec 100644
> --- a/include/linux/bpf.h
> +++ b/include/linux/bpf.h
> @@ -2701,7 +2701,7 @@ static inline int bpf_obj_get_user(const char __user *pathname, int flags)
>
> static inline bool bpf_token_capable(const struct bpf_token *token, int cap)
> {
> - return capable(cap) || (cap != CAP_SYS_ADMIN && capable(CAP_SYS_ADMIN));
> + return capable_any(cap, CAP_SYS_ADMIN);
> }
>
> static inline void bpf_token_inc(struct bpf_token *token)
> diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
> index ae2ff73bde7e..a10e6f77002c 100644
> --- a/kernel/bpf/syscall.c
> +++ b/kernel/bpf/syscall.c
> @@ -1175,7 +1175,7 @@ static int map_check_btf(struct bpf_map *map, struct bpf_token *token,
>
> static bool bpf_net_capable(void)
> {
> - return capable(CAP_NET_ADMIN) || capable(CAP_SYS_ADMIN);
> + return capable_any(CAP_NET_ADMIN, CAP_SYS_ADMIN);
> }
>
> #define BPF_MAP_CREATE_LAST_FIELD map_token_fd
> diff --git a/kernel/bpf/token.c b/kernel/bpf/token.c
> index d6ccf8d00eab..53f491046a8d 100644
> --- a/kernel/bpf/token.c
> +++ b/kernel/bpf/token.c
> @@ -11,7 +11,7 @@
>
> static bool bpf_ns_capable(struct user_namespace *ns, int cap)
> {
> - return ns_capable(ns, cap) || (cap != CAP_SYS_ADMIN && ns_capable(ns, CAP_SYS_ADMIN));
> + return ns_capable_any(ns, cap, CAP_SYS_ADMIN);
> }
>
> bool bpf_token_capable(const struct bpf_token *token, int cap)
> --
> 2.43.0
>

2024-03-15 16:57:07

by Andrii Nakryiko

Subject: Re: [PATCH 03/10] capability: use new capable_any functionality

On Fri, Mar 15, 2024 at 4:39 AM Christian Göttsche
<[email protected]> wrote:
>
> Use the new added capable_any function in appropriate cases, where a
> task is required to have any of two capabilities.
>
> Signed-off-by: Christian Göttsche <[email protected]>
> ---
> v3:
> - rename to capable_any()
> - simplify checkpoint_restore_ns_capable()
> ---
> include/linux/capability.h | 7 +++----
> 1 file changed, 3 insertions(+), 4 deletions(-)
>

Acked-by: Andrii Nakryiko <[email protected]>

> diff --git a/include/linux/capability.h b/include/linux/capability.h
> index eeb958440656..4db0ffb47271 100644
> --- a/include/linux/capability.h
> +++ b/include/linux/capability.h
> @@ -204,18 +204,17 @@ extern bool file_ns_capable(const struct file *file, struct user_namespace *ns,
> extern bool ptracer_capable(struct task_struct *tsk, struct user_namespace *ns);
> static inline bool perfmon_capable(void)
> {
> - return capable(CAP_PERFMON) || capable(CAP_SYS_ADMIN);
> + return capable_any(CAP_PERFMON, CAP_SYS_ADMIN);
> }
>
> static inline bool bpf_capable(void)
> {
> - return capable(CAP_BPF) || capable(CAP_SYS_ADMIN);
> + return capable_any(CAP_BPF, CAP_SYS_ADMIN);
> }
>
> static inline bool checkpoint_restore_ns_capable(struct user_namespace *ns)
> {
> - return ns_capable(ns, CAP_CHECKPOINT_RESTORE) ||
> - ns_capable(ns, CAP_SYS_ADMIN);
> + return ns_capable_any(ns, CAP_CHECKPOINT_RESTORE, CAP_SYS_ADMIN);
> }
>
> /* audit system wants to get cap info from files as well */
> --
> 2.43.0
>
>

2024-03-15 16:57:28

by Andrii Nakryiko

Subject: Re: [PATCH 02/10] capability: add any wrappers to test for multiple caps with exactly one audit message

On Fri, Mar 15, 2024 at 4:39 AM Christian Göttsche
<[email protected]> wrote:
>
> Add the interfaces `capable_any()` and `ns_capable_any()` as an
> alternative to multiple `capable()`/`ns_capable()` calls, like
> `capable_any(CAP_SYS_NICE, CAP_SYS_ADMIN)` instead of
> `capable(CAP_SYS_NICE) || capable(CAP_SYS_ADMIN)`.
>
> `capable_any()`/`ns_capable_any()` will in particular generate exactly
> one audit message, either for the left most capability in effect or, if
> the task has none, the first one.
>
> This is especially helpful with regard to SELinux, where each audit
> message about a not allowed capability request will create a denial
> message. Using this new wrapper with the least invasive capability as
> left most argument (e.g. CAP_SYS_NICE before CAP_SYS_ADMIN) enables
> policy writers to only grant the least invasive one for the particular
> subject instead of both.
>
> CC: [email protected]
> Signed-off-by: Christian Göttsche <[email protected]>
> ---
> v5:
> - add check for identical passed capabilities
> - rename internal helper according to flag rename to
> ns_capable_noauditondeny()
> v4:
> Use CAP_OPT_NODENYAUDIT via added ns_capable_nodenyaudit()
> v3:
> - rename to capable_any()
> - fix typo in function documentation
> - add ns_capable_any()
> v2:
> avoid varargs and fix to two capabilities; capable_or3() can be added
> later if needed
> ---
> include/linux/capability.h | 10 ++++++
> kernel/capability.c | 73 ++++++++++++++++++++++++++++++++++++++
> 2 files changed, 83 insertions(+)
>

[...]

>
> +/**
> + * ns_capable_any - Determine if the current task has one of two superior capabilities in effect
> + * @ns: The usernamespace we want the capability in
> + * @cap1: The capabilities to be tested for first
> + * @cap2: The capabilities to be tested for secondly
> + *
> + * Return true if the current task has at least one of the two given superior
> + * capabilities currently available for use, false if not.
> + *
> + * In contrast to or'ing capable() this call will create exactly one audit
> + * message, either for @cap1, if it is granted or both are not permitted,
> + * or @cap2, if it is granted while the other one is not.
> + *
> + * The capabilities should be ordered from least to most invasive, i.e. CAP_SYS_ADMIN last.
> + *
> + * This sets PF_SUPERPRIV on the task if the capability is available on the
> + * assumption that it's about to be used.
> + */
> +bool ns_capable_any(struct user_namespace *ns, int cap1, int cap2)
> +{
> + if (cap1 == cap2)
> + return ns_capable(ns, cap1);
> +
> + if (ns_capable_noauditondeny(ns, cap1))
> + return true;
> +
> + if (ns_capable_noauditondeny(ns, cap2))
> + return true;
> +
> + return ns_capable(ns, cap1);

this will incur an extra capable() check (with all the LSMs involved,
etc), and so for some cases where capability is expected to not be
present, this will be a regression. Is there some way to not redo the
check, but just audit the failure? At this point we do know that cap1
failed before, so might as well just log that.

> +}
> +EXPORT_SYMBOL(ns_capable_any);
> +
> +/**
> + * capable_any - Determine if the current task has one of two superior capabilities in effect
> + * @cap1: The capabilities to be tested for first
> + * @cap2: The capabilities to be tested for secondly
> + *
> + * Return true if the current task has at least one of the two given superior
> + * capabilities currently available for use, false if not.
> + *
> + * In contrast to or'ing capable() this call will create exactly one audit
> + * message, either for @cap1, if it is granted or both are not permitted,
> + * or @cap2, if it is granted while the other one is not.
> + *
> + * The capabilities should be ordered from least to most invasive, i.e. CAP_SYS_ADMIN last.
> + *
> + * This sets PF_SUPERPRIV on the task if the capability is available on the
> + * assumption that it's about to be used.
> + */
> +bool capable_any(int cap1, int cap2)
> +{
> + return ns_capable_any(&init_user_ns, cap1, cap2);
> +}
> +EXPORT_SYMBOL(capable_any);
> +
> /**
> * capable - Determine if the current task has a superior capability in effect
> * @cap: The capability to be tested for
> --
> 2.43.0
>
>
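
To make the cost raised in this review concrete, here is a minimal userspace sketch of the control flow in the quoted ns_capable_any(). It is a model only: struct model_task, model_check() and model_capable_any() are hypothetical names, not kernel APIs, and the capability numbers are plain array indices. It also assumes, as a simplification, that a granted check always emits one audit record, which real LSMs do not necessarily do.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace model only: none of these names exist in the kernel. */
enum { MODEL_NCAPS = 64 };
enum { MODEL_CAP_SYS_ADMIN = 21, MODEL_CAP_SYS_NICE = 23 };

struct model_task {
	bool granted[MODEL_NCAPS]; /* capabilities the modelled task holds */
	int checks;                /* underlying capability checks performed */
	int audits;                /* audit records emitted */
	int last_audited;          /* capability named in the latest record, -1 if none */
};

static struct model_task model_task_init(void)
{
	struct model_task t = { .checks = 0, .audits = 0, .last_audited = -1 };
	return t;
}

/*
 * One underlying check. audit_on_deny == false stands in for a check made
 * with CAP_OPT_NOAUDIT_ONDENY: a denial is then not recorded.
 */
static bool model_check(struct model_task *t, int cap, bool audit_on_deny)
{
	assert(cap >= 0 && cap < MODEL_NCAPS);
	t->checks++;
	bool ok = t->granted[cap];
	if (ok || audit_on_deny) {
		t->audits++;
		t->last_audited = cap;
	}
	return ok;
}

/* Same branch structure as the quoted ns_capable_any(). */
static bool model_capable_any(struct model_task *t, int cap1, int cap2)
{
	if (cap1 == cap2)
		return model_check(t, cap1, true);
	if (model_check(t, cap1, false))
		return true;
	if (model_check(t, cap2, false))
		return true;
	/* Both denied: the first check is repeated only to record the denial. */
	return model_check(t, cap1, true);
}
```

In this model a task holding neither capability goes through three checks and produces one record naming the first argument, while a task holding only the second capability goes through two checks. The third check in the all-denied case is the extra work being questioned above.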

2024-03-15 18:41:59

by Jens Axboe

Subject: Re: [PATCH 02/10] capability: add any wrappers to test for multiple caps with exactly one audit message

On 3/15/24 10:45 AM, Andrii Nakryiko wrote:
>> +/**
>> + * ns_capable_any - Determine if the current task has one of two superior capabilities in effect
>> + * @ns: The usernamespace we want the capability in
>> + * @cap1: The capabilities to be tested for first
>> + * @cap2: The capabilities to be tested for secondly
>> + *
>> + * Return true if the current task has at least one of the two given superior
>> + * capabilities currently available for use, false if not.
>> + *
>> + * In contrast to or'ing capable() this call will create exactly one audit
>> + * message, either for @cap1, if it is granted or both are not permitted,
>> + * or @cap2, if it is granted while the other one is not.
>> + *
>> + * The capabilities should be ordered from least to most invasive, i.e. CAP_SYS_ADMIN last.
>> + *
>> + * This sets PF_SUPERPRIV on the task if the capability is available on the
>> + * assumption that it's about to be used.
>> + */
>> +bool ns_capable_any(struct user_namespace *ns, int cap1, int cap2)
>> +{
>> + if (cap1 == cap2)
>> + return ns_capable(ns, cap1);
>> +
>> + if (ns_capable_noauditondeny(ns, cap1))
>> + return true;
>> +
>> + if (ns_capable_noauditondeny(ns, cap2))
>> + return true;
>> +
>> + return ns_capable(ns, cap1);
>
> this will incur an extra capable() check (with all the LSMs involved,
> etc), and so for some cases where capability is expected to not be
> present, this will be a regression. Is there some way to not redo the
> check, but just audit the failure? At this point we do know that cap1
> failed before, so might as well just log that.

Not sure why that's important - if it's a failure case, and any audit
failure should be, then why would we care if that's now doing a bit of
extra work?

I say this not knowing the full picture, as I unhelpfully was only CC'ed
on two of the patches... Please don't do that when sending patchsets.

--
Jens Axboe


2024-03-15 19:49:31

by Paul Moore

Subject: Re: [PATCH 02/10] capability: add any wrappers to test for multiple caps with exactly one audit message

On Fri, Mar 15, 2024 at 2:41 PM Jens Axboe <[email protected]> wrote:
> On 3/15/24 10:45 AM, Andrii Nakryiko wrote:
> >> +/**
> >> + * ns_capable_any - Determine if the current task has one of two superior capabilities in effect
> >> + * @ns: The usernamespace we want the capability in
> >> + * @cap1: The capabilities to be tested for first
> >> + * @cap2: The capabilities to be tested for secondly
> >> + *
> >> + * Return true if the current task has at least one of the two given superior
> >> + * capabilities currently available for use, false if not.
> >> + *
> >> + * In contrast to or'ing capable() this call will create exactly one audit
> >> + * message, either for @cap1, if it is granted or both are not permitted,
> >> + * or @cap2, if it is granted while the other one is not.
> >> + *
> >> + * The capabilities should be ordered from least to most invasive, i.e. CAP_SYS_ADMIN last.
> >> + *
> >> + * This sets PF_SUPERPRIV on the task if the capability is available on the
> >> + * assumption that it's about to be used.
> >> + */
> >> +bool ns_capable_any(struct user_namespace *ns, int cap1, int cap2)
> >> +{
> >> + if (cap1 == cap2)
> >> + return ns_capable(ns, cap1);
> >> +
> >> + if (ns_capable_noauditondeny(ns, cap1))
> >> + return true;
> >> +
> >> + if (ns_capable_noauditondeny(ns, cap2))
> >> + return true;
> >> +
> >> + return ns_capable(ns, cap1);
> >
> > this will incur an extra capable() check (with all the LSMs involved,
> > etc), and so for some cases where capability is expected to not be
> > present, this will be a regression. Is there some way to not redo the
> > check, but just audit the failure? At this point we do know that cap1
> > failed before, so might as well just log that.
>
> Not sure why that's important - if it's a failure case, and any audit
> failure should be, then why would we care if that's now doing a bit of
> extra work?

Exactly. We discussed this in an earlier patchset in 2022 (lore link below):

https://lore.kernel.org/all/CAHC9VhS8ASN+BB7adi=uoAj=LeNhiD4LEidbMc=_bcD3UTqabg@mail.gmail.com

> I say this not knowing the full picture, as I unhelpfully was only CC'ed
> on two of the patches... Please don't do that when sending patchsets.

Agreed, if the patchset touches anything in the audit, LSM, or SELinux
code please send the full patchset to the related lists. If I have to
dig the full patchset out of lore for review it makes me grumpy.
Don't resend the patchset for just this reason, but please keep it in
mind for future patchsets.

--
paul-moore.com

2024-03-15 19:59:47

by Serge E. Hallyn

Subject: Re: [PATCH 01/10] capability: introduce new capable flag CAP_OPT_NOAUDIT_ONDENY

On Fri, Mar 15, 2024 at 12:37:22PM +0100, Christian Göttsche wrote:
> Introduce a new capable flag, CAP_OPT_NOAUDIT_ONDENY, to not generate
> an audit event if the requested capability is not granted. This will be
> used in a new capable_any() functionality to reduce the number of
> necessary capable calls.
>
> Handle the flag accordingly in AppArmor and SELinux.
>
> CC: [email protected]
> Suggested-by: Paul Moore <[email protected]>
> Signed-off-by: Christian Göttsche <[email protected]>

Thanks.

Reviewed-by: Serge Hallyn <[email protected]>

> ---
> v5:
> rename flag to CAP_OPT_NOAUDIT_ONDENY, suggested by Serge:
> https://lore.kernel.org/all/[email protected]/
> ---
> include/linux/security.h | 2 ++
> security/apparmor/capability.c | 8 +++++---
> security/selinux/hooks.c | 14 ++++++++------
> 3 files changed, 15 insertions(+), 9 deletions(-)
>
> diff --git a/include/linux/security.h b/include/linux/security.h
> index 41a8f667bdfa..c60cae78ff8b 100644
> --- a/include/linux/security.h
> +++ b/include/linux/security.h
> @@ -70,6 +70,8 @@ struct lsm_ctx;
> #define CAP_OPT_NOAUDIT BIT(1)
> /* If capable is being called by a setid function */
> #define CAP_OPT_INSETID BIT(2)
> +/* If capable should audit the security request for authorized requests only */
> +#define CAP_OPT_NOAUDIT_ONDENY BIT(3)
>
> /* LSM Agnostic defines for security_sb_set_mnt_opts() flags */
> #define SECURITY_LSM_NATIVE_LABELS 1
> diff --git a/security/apparmor/capability.c b/security/apparmor/capability.c
> index 9934df16c843..08c9c9a0fc19 100644
> --- a/security/apparmor/capability.c
> +++ b/security/apparmor/capability.c
> @@ -108,7 +108,8 @@ static int audit_caps(struct apparmor_audit_data *ad, struct aa_profile *profile
> * profile_capable - test if profile allows use of capability @cap
> * @profile: profile being enforced (NOT NULL, NOT unconfined)
> * @cap: capability to test if allowed
> - * @opts: CAP_OPT_NOAUDIT bit determines whether audit record is generated
> + * @opts: CAP_OPT_NOAUDIT/CAP_OPT_NOAUDIT_ONDENY bit determines whether audit
> + * record is generated
> * @ad: audit data (MAY BE NULL indicating no auditing)
> *
> * Returns: 0 if allowed else -EPERM
> @@ -126,7 +127,7 @@ static int profile_capable(struct aa_profile *profile, int cap,
> else
> error = -EPERM;
>
> - if (opts & CAP_OPT_NOAUDIT) {
> + if ((opts & CAP_OPT_NOAUDIT) || ((opts & CAP_OPT_NOAUDIT_ONDENY) && error)) {
> if (!COMPLAIN_MODE(profile))
> return error;
> /* audit the cap request in complain mode but note that it
> @@ -143,7 +144,8 @@ static int profile_capable(struct aa_profile *profile, int cap,
> * @subj_cred: cred we are testing capability against
> * @label: label being tested for capability (NOT NULL)
> * @cap: capability to be tested
> - * @opts: CAP_OPT_NOAUDIT bit determines whether audit record is generated
> + * @opts: CAP_OPT_NOAUDIT/CAP_OPT_NOAUDIT_ONDENY bit determines whether audit
> + * record is generated
> *
> * Look up capability in profile capability set.
> *
> diff --git a/security/selinux/hooks.c b/security/selinux/hooks.c
> index 3448454c82d0..1a2c7c1a89be 100644
> --- a/security/selinux/hooks.c
> +++ b/security/selinux/hooks.c
> @@ -1624,7 +1624,7 @@ static int cred_has_capability(const struct cred *cred,
> u16 sclass;
> u32 sid = cred_sid(cred);
> u32 av = CAP_TO_MASK(cap);
> - int rc;
> + int rc, rc2;
>
> ad.type = LSM_AUDIT_DATA_CAP;
> ad.u.cap = cap;
> @@ -1643,11 +1643,13 @@ static int cred_has_capability(const struct cred *cred,
> }
>
> rc = avc_has_perm_noaudit(sid, sid, sclass, av, 0, &avd);
> - if (!(opts & CAP_OPT_NOAUDIT)) {
> - int rc2 = avc_audit(sid, sid, sclass, av, &avd, rc, &ad);
> - if (rc2)
> - return rc2;
> - }
> + if ((opts & CAP_OPT_NOAUDIT) || ((opts & CAP_OPT_NOAUDIT_ONDENY) && rc))
> + return rc;
> +
> + rc2 = avc_audit(sid, sid, sclass, av, &avd, rc, &ad);
> + if (rc2)
> + return rc2;
> +
> return rc;
> }
>
> --
> 2.43.0
>
>
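
The condition added in both the AppArmor and SELinux hunks above boils down to one predicate over the option bits and the check result. The following is a userspace sketch of that predicate, not kernel code: the MODEL_ prefixed macros and model_skip_audit() are hypothetical names, with bit positions copied from the quoted include/linux/security.h hunk. It leaves out AppArmor's complain-mode handling, which can still audit the request.

```c
#include <assert.h>
#include <stdbool.h>

/* Bit positions copied from the quoted security.h hunk; names are model-only. */
#define MODEL_BIT(n)                 (1U << (n))
#define MODEL_CAP_OPT_NOAUDIT        MODEL_BIT(1)
#define MODEL_CAP_OPT_INSETID        MODEL_BIT(2)
#define MODEL_CAP_OPT_NOAUDIT_ONDENY MODEL_BIT(3)

static_assert((MODEL_CAP_OPT_NOAUDIT & MODEL_CAP_OPT_NOAUDIT_ONDENY) == 0,
	      "the two audit-suppression flags must be distinct bits");

/*
 * rc follows the kernel convention used in the hunks: 0 means the
 * capability was granted, non-zero means it was denied.
 * Returns true when no audit record should be generated.
 */
static bool model_skip_audit(unsigned int opts, int rc)
{
	return (opts & MODEL_CAP_OPT_NOAUDIT) ||
	       ((opts & MODEL_CAP_OPT_NOAUDIT_ONDENY) && rc != 0);
}
```

With only MODEL_CAP_OPT_NOAUDIT_ONDENY set, a granted request (rc == 0) is still audited and a denied one is not, which is the behaviour the commit message describes.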

2024-03-15 21:17:33

by Andrii Nakryiko

Subject: Re: [PATCH 02/10] capability: add any wrappers to test for multiple caps with exactly one audit message

On Fri, Mar 15, 2024 at 11:41 AM Jens Axboe <[email protected]> wrote:
>
> On 3/15/24 10:45 AM, Andrii Nakryiko wrote:
> >> +/**
> >> + * ns_capable_any - Determine if the current task has one of two superior capabilities in effect
> >> + * @ns: The usernamespace we want the capability in
> >> + * @cap1: The capabilities to be tested for first
> >> + * @cap2: The capabilities to be tested for secondly
> >> + *
> >> + * Return true if the current task has at least one of the two given superior
> >> + * capabilities currently available for use, false if not.
> >> + *
> >> + * In contrast to or'ing capable() this call will create exactly one audit
> >> + * message, either for @cap1, if it is granted or both are not permitted,
> >> + * or @cap2, if it is granted while the other one is not.
> >> + *
> >> + * The capabilities should be ordered from least to most invasive, i.e. CAP_SYS_ADMIN last.
> >> + *
> >> + * This sets PF_SUPERPRIV on the task if the capability is available on the
> >> + * assumption that it's about to be used.
> >> + */
> >> +bool ns_capable_any(struct user_namespace *ns, int cap1, int cap2)
> >> +{
> >> + if (cap1 == cap2)
> >> + return ns_capable(ns, cap1);
> >> +
> >> + if (ns_capable_noauditondeny(ns, cap1))
> >> + return true;
> >> +
> >> + if (ns_capable_noauditondeny(ns, cap2))
> >> + return true;
> >> +
> >> + return ns_capable(ns, cap1);
> >
> > this will incur an extra capable() check (with all the LSMs involved,
> > etc), and so for some cases where capability is expected to not be
> > present, this will be a regression. Is there some way to not redo the
> > check, but just audit the failure? At this point we do know that cap1
> > failed before, so might as well just log that.
>
> Not sure why that's important - if it's a failure case, and any audit
> failure should be, then why would we care if that's now doing a bit of
> extra work?

Lack of capability doesn't necessarily mean "failure". E.g., in FUSE
there are at least a few places where the code checks
capable(CAP_SYS_ADMIN), and based on that decides on some limit values
or extra checks. So if !capable(CAP_SYS_ADMIN), operation doesn't
necessarily fail outright, it just has some more restricted resources
or something.

Luckily in FUSE's case it's a singular capable() check, so capable_any()
won't incur extra overhead. But I was just wondering if it would be
possible to avoid this with capable_any() as well, so that no one has
to do these trade-offs.

We also had cases in production of some BPF applications tracing
cap_capable() calls, so each extra triggering of it would be a bit of
added overhead, as a general rule.

Having said the above, I do like capable_any() changes (which is why I
acked BPF side of things).

>
> I say this not knowing the full picture, as I unhelpfully was only CC'ed
> on two of the patches... Please don't do that when sending patchsets.
>
> --
> Jens Axboe
>

2024-03-15 23:12:21

by Kuniyuki Iwashima

Subject: Re: [PATCH 08/10] net: use new capable_any functionality

From: Christian Göttsche <[email protected]>
Date: Fri, 15 Mar 2024 12:37:29 +0100
> Use the new added capable_any function in appropriate cases, where a
> task is required to have any of two capabilities.
>
> Add sock_ns_capable_any() wrapper similar to existing sock_ns_capable()
> one.
>
> Reorder CAP_SYS_ADMIN last.
>
> Signed-off-by: Christian Göttsche <[email protected]>
> Reviewed-by: Miquel Raynal <[email protected]> (ieee802154 portion)
> ---
> v4:
> - introduce sockopt_ns_capable_any()
> v3:
> - rename to capable_any()
> - make use of ns_capable_any
> ---
> include/net/sock.h | 1 +
> net/caif/caif_socket.c | 2 +-
> net/core/sock.c | 15 +++++++++------
> net/ieee802154/socket.c | 6 ++----
> net/ipv4/ip_sockglue.c | 5 +++--
> net/ipv6/ipv6_sockglue.c | 3 +--
> net/unix/af_unix.c | 2 +-
> 7 files changed, 18 insertions(+), 16 deletions(-)
>
> diff --git a/include/net/sock.h b/include/net/sock.h
> index b5e00702acc1..2e64a80c8fca 100644
> --- a/include/net/sock.h
> +++ b/include/net/sock.h
> @@ -1736,6 +1736,7 @@ static inline void unlock_sock_fast(struct sock *sk, bool slow)
> void sockopt_lock_sock(struct sock *sk);
> void sockopt_release_sock(struct sock *sk);
> bool sockopt_ns_capable(struct user_namespace *ns, int cap);
> +bool sockopt_ns_capable_any(struct user_namespace *ns, int cap1, int cap2);
> bool sockopt_capable(int cap);
>
> /* Used by processes to "lock" a socket state, so that
> diff --git a/net/caif/caif_socket.c b/net/caif/caif_socket.c
> index 039dfbd367c9..2d811037e378 100644
> --- a/net/caif/caif_socket.c
> +++ b/net/caif/caif_socket.c
> @@ -1026,7 +1026,7 @@ static int caif_create(struct net *net, struct socket *sock, int protocol,
> .usersize = sizeof_field(struct caifsock, conn_req.param)
> };
>
> - if (!capable(CAP_SYS_ADMIN) && !capable(CAP_NET_ADMIN))
> + if (!capable_any(CAP_NET_ADMIN, CAP_SYS_ADMIN))
> return -EPERM;
> /*
> * The sock->type specifies the socket type to use.
> diff --git a/net/core/sock.c b/net/core/sock.c
> index 43bf3818c19e..fa9edcc3e23d 100644
> --- a/net/core/sock.c
> +++ b/net/core/sock.c
> @@ -1077,6 +1077,12 @@ bool sockopt_ns_capable(struct user_namespace *ns, int cap)
> }
> EXPORT_SYMBOL(sockopt_ns_capable);
>
> +bool sockopt_ns_capable_any(struct user_namespace *ns, int cap1, int cap2)
> +{
> + return has_current_bpf_ctx() || ns_capable_any(ns, cap1, cap2);
> +}
> +EXPORT_SYMBOL(sockopt_ns_capable_any);
> +
> bool sockopt_capable(int cap)
> {
> return has_current_bpf_ctx() || capable(cap);
> @@ -1118,8 +1124,7 @@ int sk_setsockopt(struct sock *sk, int level, int optname,
> switch (optname) {
> case SO_PRIORITY:
> if ((val >= 0 && val <= 6) ||
> - sockopt_ns_capable(sock_net(sk)->user_ns, CAP_NET_RAW) ||
> - sockopt_ns_capable(sock_net(sk)->user_ns, CAP_NET_ADMIN)) {
> + sockopt_ns_capable_any(sock_net(sk)->user_ns, CAP_NET_RAW, CAP_NET_ADMIN)) {
> sock_set_priority(sk, val);
> return 0;
> }
> @@ -1422,8 +1427,7 @@ int sk_setsockopt(struct sock *sk, int level, int optname,
> break;
>
> case SO_MARK:
> - if (!sockopt_ns_capable(sock_net(sk)->user_ns, CAP_NET_RAW) &&
> - !sockopt_ns_capable(sock_net(sk)->user_ns, CAP_NET_ADMIN)) {
> + if (!sockopt_ns_capable_any(sock_net(sk)->user_ns, CAP_NET_RAW, CAP_NET_ADMIN)) {
> ret = -EPERM;
> break;
> }
> @@ -2813,8 +2817,7 @@ int __sock_cmsg_send(struct sock *sk, struct cmsghdr *cmsg,
>
> switch (cmsg->cmsg_type) {
> case SO_MARK:
> - if (!ns_capable(sock_net(sk)->user_ns, CAP_NET_RAW) &&
> - !ns_capable(sock_net(sk)->user_ns, CAP_NET_ADMIN))
> + if (!ns_capable_any(sock_net(sk)->user_ns, CAP_NET_RAW, CAP_NET_ADMIN))
> return -EPERM;
> if (cmsg->cmsg_len != CMSG_LEN(sizeof(u32)))
> return -EINVAL;
> diff --git a/net/ieee802154/socket.c b/net/ieee802154/socket.c
> index 990a83455dcf..42b3b12eb493 100644
> --- a/net/ieee802154/socket.c
> +++ b/net/ieee802154/socket.c
> @@ -902,8 +902,7 @@ static int dgram_setsockopt(struct sock *sk, int level, int optname,
> ro->want_lqi = !!val;
> break;
> case WPAN_SECURITY:
> - if (!ns_capable(net->user_ns, CAP_NET_ADMIN) &&
> - !ns_capable(net->user_ns, CAP_NET_RAW)) {
> + if (!ns_capable_any(net->user_ns, CAP_NET_ADMIN, CAP_NET_RAW)) {

IIUC, should CAP_NET_RAW be tested first ?

Then, perhaps you should remove the Reviewed-by tag.


> err = -EPERM;
> break;
> }
> @@ -926,8 +925,7 @@ static int dgram_setsockopt(struct sock *sk, int level, int optname,
> }
> break;
> case WPAN_SECURITY_LEVEL:
> - if (!ns_capable(net->user_ns, CAP_NET_ADMIN) &&
> - !ns_capable(net->user_ns, CAP_NET_RAW)) {
> + if (!ns_capable_any(net->user_ns, CAP_NET_ADMIN, CAP_NET_RAW)) {

Same here.

Thanks!


> err = -EPERM;
> break;
> }
> diff --git a/net/ipv4/ip_sockglue.c b/net/ipv4/ip_sockglue.c
> index cf377377b52d..5a1e5ee20ddd 100644
> --- a/net/ipv4/ip_sockglue.c
> +++ b/net/ipv4/ip_sockglue.c
> @@ -1008,8 +1008,9 @@ int do_ip_setsockopt(struct sock *sk, int level, int optname,
> inet_assign_bit(MC_ALL, sk, val);
> return 0;
> case IP_TRANSPARENT:
> - if (!!val && !sockopt_ns_capable(sock_net(sk)->user_ns, CAP_NET_RAW) &&
> - !sockopt_ns_capable(sock_net(sk)->user_ns, CAP_NET_ADMIN))
> + if (!!val &&
> + !sockopt_ns_capable_any(sock_net(sk)->user_ns,
> + CAP_NET_RAW, CAP_NET_ADMIN))
> return -EPERM;
> if (optlen < 1)
> return -EINVAL;
> diff --git a/net/ipv6/ipv6_sockglue.c b/net/ipv6/ipv6_sockglue.c
> index d4c28ec1bc51..e46b11b5d3dd 100644
> --- a/net/ipv6/ipv6_sockglue.c
> +++ b/net/ipv6/ipv6_sockglue.c
> @@ -773,8 +773,7 @@ int do_ipv6_setsockopt(struct sock *sk, int level, int optname,
> break;
>
> case IPV6_TRANSPARENT:
> - if (valbool && !sockopt_ns_capable(net->user_ns, CAP_NET_RAW) &&
> - !sockopt_ns_capable(net->user_ns, CAP_NET_ADMIN)) {
> + if (valbool && !sockopt_ns_capable_any(net->user_ns, CAP_NET_RAW, CAP_NET_ADMIN)) {
> retv = -EPERM;
> break;
> }
> diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c
> index 5b41e2321209..acc36b2d25d7 100644
> --- a/net/unix/af_unix.c
> +++ b/net/unix/af_unix.c
> @@ -1783,7 +1783,7 @@ static inline bool too_many_unix_fds(struct task_struct *p)
> struct user_struct *user = current_user();
>
> if (unlikely(READ_ONCE(user->unix_inflight) > task_rlimit(p, RLIMIT_NOFILE)))
> - return !capable(CAP_SYS_RESOURCE) && !capable(CAP_SYS_ADMIN);
> + return !capable_any(CAP_SYS_RESOURCE, CAP_SYS_ADMIN);
> return false;
> }
>
> --
> 2.43.0
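
The ordering question raised in this review matters because, per the quoted documentation of ns_capable_any(), a denial is reported against the first argument. Here is a small userspace sketch of that rule, assuming the documented behaviour and nothing more. struct model_result and model_any() are hypothetical names, and the held set is a plain bitmask rather than a real credential.

```c
#include <assert.h>
#include <stdbool.h>

/* Model-only identifiers; the numbers are just bit positions in the mask. */
enum { MODEL_CAP_NET_ADMIN = 12, MODEL_CAP_NET_RAW = 13 };

struct model_result {
	bool allowed;    /* outcome of the combined check */
	int audited_cap; /* capability the single audit record would name */
};

/*
 * Mirrors the documented rule: report cap1 if it is held or if both are
 * missing, report cap2 only if it is held while cap1 is not.
 */
static struct model_result model_any(unsigned long long held, int cap1, int cap2)
{
	assert(cap1 >= 0 && cap1 < 64 && cap2 >= 0 && cap2 < 64);
	if (held & (1ULL << cap1))
		return (struct model_result){ true, cap1 };
	if (held & (1ULL << cap2))
		return (struct model_result){ true, cap2 };
	return (struct model_result){ false, cap1 };
}
```

For a task holding neither capability, model_any(0, MODEL_CAP_NET_ADMIN, MODEL_CAP_NET_RAW) names CAP_NET_ADMIN in the denial, while swapping the arguments names CAP_NET_RAW. If CAP_NET_RAW is considered the less invasive of the two, as the other hunks in this patch suggest, the swapped order is the one that steers policy writers toward granting it.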

2024-03-16 17:17:51

by Jens Axboe

Subject: Re: [PATCH 02/10] capability: add any wrappers to test for multiple caps with exactly one audit message

On 3/15/24 3:16 PM, Andrii Nakryiko wrote:
> On Fri, Mar 15, 2024 at 11:41 AM Jens Axboe <[email protected]> wrote:
>>
>> On 3/15/24 10:45 AM, Andrii Nakryiko wrote:
>>>> +/**
>>>> + * ns_capable_any - Determine if the current task has one of two superior capabilities in effect
>>>> + * @ns: The usernamespace we want the capability in
>>>> + * @cap1: The capabilities to be tested for first
>>>> + * @cap2: The capabilities to be tested for secondly
>>>> + *
>>>> + * Return true if the current task has at least one of the two given superior
>>>> + * capabilities currently available for use, false if not.
>>>> + *
>>>> + * In contrast to or'ing capable() this call will create exactly one audit
>>>> + * message, either for @cap1, if it is granted or both are not permitted,
>>>> + * or @cap2, if it is granted while the other one is not.
>>>> + *
>>>> + * The capabilities should be ordered from least to most invasive, i.e. CAP_SYS_ADMIN last.
>>>> + *
>>>> + * This sets PF_SUPERPRIV on the task if the capability is available on the
>>>> + * assumption that it's about to be used.
>>>> + */
>>>> +bool ns_capable_any(struct user_namespace *ns, int cap1, int cap2)
>>>> +{
>>>> + if (cap1 == cap2)
>>>> + return ns_capable(ns, cap1);
>>>> +
>>>> + if (ns_capable_noauditondeny(ns, cap1))
>>>> + return true;
>>>> +
>>>> + if (ns_capable_noauditondeny(ns, cap2))
>>>> + return true;
>>>> +
>>>> + return ns_capable(ns, cap1);
>>>
>>> this will incur an extra capable() check (with all the LSMs involved,
>>> etc), and so for some cases where capability is expected to not be
>>> present, this will be a regression. Is there some way to not redo the
>>> check, but just audit the failure? At this point we do know that cap1
>>> failed before, so might as well just log that.
>>
>> Not sure why that's important - if it's a failure case, and any audit
>> failure should be, then why would we care if that's now doing a bit of
>> extra work?
>
> Lack of capability doesn't necessarily mean "failure". E.g., in FUSE
> there are at least a few places where the code checks
> capable(CAP_SYS_ADMIN), and based on that decides on some limit values
> or extra checks. So if !capable(CAP_SYS_ADMIN), operation doesn't
> necessarily fail outright, it just has some more restricted resources
> or something.
>
> Luckily in FUSE's case it's a singular capable() check, so capable_any()
> won't incur extra overhead. But I was just wondering if it would be
> possible to avoid this with capable_any() as well, so that no one has
> to do these trade-offs.

That's certainly a special and odd case, as most other cases really
would be of the:

if (capable(SOMETHING))
return -EFAULT;

Might make more sense to special case the FUSE thing then, or provide a
cheap way for it to do what it needs to do. I really don't think that
kind of:

if (capable(SOMETHING))
do something since I can
else
bummer, do something else then

is a common occurrence.

> We also had cases in production of some BPF applications tracing
> cap_capable() calls, so each extra triggering of it would be a bit of
> added overhead, as a general rule.
>
> Having said the above, I do like capable_any() changes (which is why I
> acked BPF side of things).

Yes, the BPF tracking capable in production is a pain in the butt, as it
slows down any valid fast path capable checking by a substantial amount.
We've had to work around that on the block side, unfortunately. These
are obviously cases where you expect success, and any failure is
permanent as far as that operation goes.

--
Jens Axboe


2024-03-15 18:31:12

by Andrii Nakryiko

Subject: Re: [PATCH 02/10] capability: add any wrappers to test for multiple caps with exactly one audit message

On Fri, Mar 15, 2024 at 11:27 AM Christian Göttsche
<[email protected]> wrote:
>
> On Fri, 15 Mar 2024 at 17:46, Andrii Nakryiko <[email protected]> wrote:
> >
> > On Fri, Mar 15, 2024 at 4:39 AM Christian Göttsche
> > <[email protected]> wrote:
> > >
> > > Add the interfaces `capable_any()` and `ns_capable_any()` as an
> > > alternative to multiple `capable()`/`ns_capable()` calls, like
> > > `capable_any(CAP_SYS_NICE, CAP_SYS_ADMIN)` instead of
> > > `capable(CAP_SYS_NICE) || capable(CAP_SYS_ADMIN)`.
> > >
> > > `capable_any()`/`ns_capable_any()` will in particular generate exactly
> > > one audit message, either for the left most capability in effect or, if
> > > the task has none, the first one.
> > >
> > > This is especially helpful with regard to SELinux, where each audit
> > > message about a not allowed capability request will create a denial
> > > message. Using this new wrapper with the least invasive capability as
> > > left most argument (e.g. CAP_SYS_NICE before CAP_SYS_ADMIN) enables
> > > policy writers to only grant the least invasive one for the particular
> > > subject instead of both.
> > >
> > > CC: [email protected]
> > > Signed-off-by: Christian Göttsche <[email protected]>
> > > ---
> > > v5:
> > > - add check for identical passed capabilities
> > > - rename internal helper according to flag rename to
> > > ns_capable_noauditondeny()
> > > v4:
> > > Use CAP_OPT_NODENYAUDIT via added ns_capable_nodenyaudit()
> > > v3:
> > > - rename to capable_any()
> > > - fix typo in function documentation
> > > - add ns_capable_any()
> > > v2:
> > > avoid varargs and fix to two capabilities; capable_or3() can be added
> > > later if needed
> > > ---
> > > include/linux/capability.h | 10 ++++++
> > > kernel/capability.c | 73 ++++++++++++++++++++++++++++++++++++++
> > > 2 files changed, 83 insertions(+)
> > >
> >
> > [...]
> >
> > >
> > > +/**
> > > + * ns_capable_any - Determine if the current task has one of two superior capabilities in effect
> > > + * @ns: The usernamespace we want the capability in
> > > + * @cap1: The capabilities to be tested for first
> > > + * @cap2: The capabilities to be tested for secondly
> > > + *
> > > + * Return true if the current task has at least one of the two given superior
> > > + * capabilities currently available for use, false if not.
> > > + *
> > > + * In contrast to or'ing capable() this call will create exactly one audit
> > > + * message, either for @cap1, if it is granted or both are not permitted,
> > > + * or @cap2, if it is granted while the other one is not.
> > > + *
> > > + * The capabilities should be ordered from least to most invasive, i.e. CAP_SYS_ADMIN last.
> > > + *
> > > + * This sets PF_SUPERPRIV on the task if the capability is available on the
> > > + * assumption that it's about to be used.
> > > + */
> > > +bool ns_capable_any(struct user_namespace *ns, int cap1, int cap2)
> > > +{
> > > + if (cap1 == cap2)
> > > + return ns_capable(ns, cap1);
> > > +
> > > + if (ns_capable_noauditondeny(ns, cap1))
> > > + return true;
> > > +
> > > + if (ns_capable_noauditondeny(ns, cap2))
> > > + return true;
> > > +
> > > + return ns_capable(ns, cap1);
> >
> > this will incur an extra capable() check (with all the LSMs involved,
> > etc), and so for some cases where capability is expected to not be
> > present, this will be a regression. Is there some way to not redo the
> > check, but just audit the failure? At this point we do know that cap1
> > failed before, so might as well just log that.
>
> Logging the failure works quite differently in AppArmor and SELinux, so
> just logging it might not be so easy.
> One option would be to change the entire LSM hook security_capable()
> to take two capability arguments, and let the LSMs handle the any
> logic.

that sounds like an even bigger overkill, probably not worth it

>
> > > +}
> > > +EXPORT_SYMBOL(ns_capable_any);
> > > +
> > > +/**
> > > + * capable_any - Determine if the current task has one of two superior capabilities in effect
> > > + * @cap1: The capabilities to be tested for first
> > > + * @cap2: The capabilities to be tested for secondly
> > > + *
> > > + * Return true if the current task has at least one of the two given superior
> > > + * capabilities currently available for use, false if not.
> > > + *
> > > + * In contrast to or'ing capable() this call will create exactly one audit
> > > + * message, either for @cap1, if it is granted or both are not permitted,
> > > + * or @cap2, if it is granted while the other one is not.
> > > + *
> > > + * The capabilities should be ordered from least to most invasive, i.e. CAP_SYS_ADMIN last.
> > > + *
> > > + * This sets PF_SUPERPRIV on the task if the capability is available on the
> > > + * assumption that it's about to be used.
> > > + */
> > > +bool capable_any(int cap1, int cap2)
> > > +{
> > > + return ns_capable_any(&init_user_ns, cap1, cap2);
> > > +}
> > > +EXPORT_SYMBOL(capable_any);
> > > +
> > > /**
> > > * capable - Determine if the current task has a superior capability in effect
> > > * @cap: The capability to be tested for
> > > --
> > > 2.43.0
> > >
> > >

2024-03-15 18:27:47

by Christian Göttsche

Subject: Re: [PATCH 02/10] capability: add any wrappers to test for multiple caps with exactly one audit message

On Fri, 15 Mar 2024 at 17:46, Andrii Nakryiko <[email protected]> wrote:
>
> On Fri, Mar 15, 2024 at 4:39 AM Christian Göttsche
> <[email protected]> wrote:
> >
> > Add the interfaces `capable_any()` and `ns_capable_any()` as an
> > alternative to multiple `capable()`/`ns_capable()` calls, like
> > `capable_any(CAP_SYS_NICE, CAP_SYS_ADMIN)` instead of
> > `capable(CAP_SYS_NICE) || capable(CAP_SYS_ADMIN)`.
> >
> > `capable_any()`/`ns_capable_any()` will in particular generate exactly
> > one audit message, either for the left most capability in effect or, if
> > the task has none, the first one.
> >
> > This is especially helpful with regard to SELinux, where each audit
> > message about a not allowed capability request will create a denial
> > message. Using this new wrapper with the least invasive capability as
> > left most argument (e.g. CAP_SYS_NICE before CAP_SYS_ADMIN) enables
> > policy writers to only grant the least invasive one for the particular
> > subject instead of both.
> >
> > CC: [email protected]
> > Signed-off-by: Christian Göttsche <[email protected]>
> > ---
> > v5:
> > - add check for identical passed capabilities
> > - rename internal helper according to flag rename to
> > ns_capable_noauditondeny()
> > v4:
> > Use CAP_OPT_NODENYAUDIT via added ns_capable_nodenyaudit()
> > v3:
> > - rename to capable_any()
> > - fix typo in function documentation
> > - add ns_capable_any()
> > v2:
> > avoid varargs and fix to two capabilities; capable_or3() can be added
> > later if needed
> > ---
> > include/linux/capability.h | 10 ++++++
> > kernel/capability.c | 73 ++++++++++++++++++++++++++++++++++++++
> > 2 files changed, 83 insertions(+)
> >
>
> [...]
>
> >
> > +/**
> > + * ns_capable_any - Determine if the current task has one of two superior capabilities in effect
> > + * @ns: The usernamespace we want the capability in
> > + * @cap1: The capabilities to be tested for first
> > + * @cap2: The capabilities to be tested for secondly
> > + *
> > + * Return true if the current task has at least one of the two given superior
> > + * capabilities currently available for use, false if not.
> > + *
> > + * In contrast to or'ing capable() this call will create exactly one audit
> > + * message, either for @cap1, if it is granted or both are not permitted,
> > + * or @cap2, if it is granted while the other one is not.
> > + *
> > + * The capabilities should be ordered from least to most invasive, i.e. CAP_SYS_ADMIN last.
> > + *
> > + * This sets PF_SUPERPRIV on the task if the capability is available on the
> > + * assumption that it's about to be used.
> > + */
> > +bool ns_capable_any(struct user_namespace *ns, int cap1, int cap2)
> > +{
> > + if (cap1 == cap2)
> > + return ns_capable(ns, cap1);
> > +
> > + if (ns_capable_noauditondeny(ns, cap1))
> > + return true;
> > +
> > + if (ns_capable_noauditondeny(ns, cap2))
> > + return true;
> > +
> > + return ns_capable(ns, cap1);
>
> this will incur an extra capable() check (with all the LSMs involved,
> etc), and so for some cases where capability is expected to not be
> present, this will be a regression. Is there some way to not redo the
> check, but just audit the failure? At this point we do know that cap1
> failed before, so might as well just log that.

Logging the failure works quite differently in AppArmor and SELinux, so
just logging it might not be so easy.
One option would be to change the entire LSM hook security_capable()
to take two capability arguments, and let the LSMs handle the any
logic.
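
To make the cost being discussed concrete, here is a minimal userspace
sketch of the control flow in ns_capable_any(). It is not kernel code:
struct model_task, model_check() and the MODEL_CAP_* constants are
hypothetical names invented for this illustration, and the audit
behaviour is simplified to "one record per audited check". It only
counts how many checks each case performs, which is 1 when cap1 is
held, 2 when only cap2 is held, and 3 when both are denied.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model, not kernel code. Bit positions are illustrative. */
enum { MODEL_CAP_SYS_ADMIN = 21, MODEL_CAP_SYS_NICE = 23 };

struct model_task {
	unsigned long long held;  /* bitmask of capabilities the task holds */
	int checks;               /* simulated capability checks performed */
	int audits;               /* simulated audit records emitted */
	int last_audited_cap;     /* capability named in the last record, -1 if none */
};

/*
 * One simulated check. A denial is only recorded when audit_on_deny is
 * true, which stands in for a check made without CAP_OPT_NOAUDIT_ONDENY.
 */
static bool model_check(struct model_task *t, int cap, bool audit_on_deny)
{
	bool granted = (t->held >> cap) & 1ULL;

	t->checks++;
	if (granted || audit_on_deny) {
		t->audits++;
		t->last_audited_cap = cap;
	}
	return granted;
}

/* Mirrors the structure of ns_capable_any() from the patch. */
static bool model_capable_any(struct model_task *t, int cap1, int cap2)
{
	if (cap1 == cap2)
		return model_check(t, cap1, true);

	if (model_check(t, cap1, false))
		return true;

	if (model_check(t, cap2, false))
		return true;

	/* Both denied: third check, made only to emit the denial for cap1. */
	return model_check(t, cap1, true);
}
```

In this model the third check is the one questioned in the review: its
result is already known, and it exists only to produce the single
denial record for cap1.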

> > +}
> > +EXPORT_SYMBOL(ns_capable_any);
> > +
> > +/**
> > + * capable_any - Determine if the current task has one of two superior capabilities in effect
> > + * @cap1: The capabilities to be tested for first
> > + * @cap2: The capabilities to be tested for secondly
> > + *
> > + * Return true if the current task has at least one of the two given superior
> > + * capabilities currently available for use, false if not.
> > + *
> > + * In contrast to or'ing capable() this call will create exactly one audit
> > + * message, either for @cap1, if it is granted or both are not permitted,
> > + * or @cap2, if it is granted while the other one is not.
> > + *
> > + * The capabilities should be ordered from least to most invasive, i.e. CAP_SYS_ADMIN last.
> > + *
> > + * This sets PF_SUPERPRIV on the task if the capability is available on the
> > + * assumption that it's about to be used.
> > + */
> > +bool capable_any(int cap1, int cap2)
> > +{
> > + return ns_capable_any(&init_user_ns, cap1, cap2);
> > +}
> > +EXPORT_SYMBOL(capable_any);
> > +
> > /**
> > * capable - Determine if the current task has a superior capability in effect
> > * @cap: The capability to be tested for
> > --
> > 2.43.0
> >
> >

2024-03-15 20:19:38

by Serge E. Hallyn

Subject: Re: [PATCH 02/10] capability: add any wrappers to test for multiple caps with exactly one audit message

On Fri, Mar 15, 2024 at 12:37:23PM +0100, Christian Göttsche wrote:
> Add the interfaces `capable_any()` and `ns_capable_any()` as an
> alternative to multiple `capable()`/`ns_capable()` calls, like
> `capable_any(CAP_SYS_NICE, CAP_SYS_ADMIN)` instead of
> `capable(CAP_SYS_NICE) || capable(CAP_SYS_ADMIN)`.
>
> `capable_any()`/`ns_capable_any()` will in particular generate exactly
> one audit message, either for the left most capability in effect or, if
> the task has none, the first one.
>
> This is especially helpful with regard to SELinux, where each audit
> message about a not allowed capability request will create a denial
> message. Using this new wrapper with the least invasive capability as
> left most argument (e.g. CAP_SYS_NICE before CAP_SYS_ADMIN) enables
> policy writers to only grant the least invasive one for the particular
> subject instead of both.
>
> CC: [email protected]
> Signed-off-by: Christian Göttsche <[email protected]>

Reviewed-by: Serge Hallyn <[email protected]>

> ---
> v5:
> - add check for identical passed capabilities
> - rename internal helper according to flag rename to
> ns_capable_noauditondeny()
> v4:
> Use CAP_OPT_NODENYAUDIT via added ns_capable_nodenyaudit()
> v3:
> - rename to capable_any()
> - fix typo in function documentation
> - add ns_capable_any()
> v2:
> avoid varargs and fix to two capabilities; capable_or3() can be added
> later if needed
> ---
> include/linux/capability.h | 10 ++++++
> kernel/capability.c | 73 ++++++++++++++++++++++++++++++++++++++
> 2 files changed, 83 insertions(+)
>
> diff --git a/include/linux/capability.h b/include/linux/capability.h
> index 0c356a517991..eeb958440656 100644
> --- a/include/linux/capability.h
> +++ b/include/linux/capability.h
> @@ -146,7 +146,9 @@ extern bool has_capability_noaudit(struct task_struct *t, int cap);
> extern bool has_ns_capability_noaudit(struct task_struct *t,
> struct user_namespace *ns, int cap);
> extern bool capable(int cap);
> +extern bool capable_any(int cap1, int cap2);
> extern bool ns_capable(struct user_namespace *ns, int cap);
> +extern bool ns_capable_any(struct user_namespace *ns, int cap1, int cap2);
> extern bool ns_capable_noaudit(struct user_namespace *ns, int cap);
> extern bool ns_capable_setid(struct user_namespace *ns, int cap);
> #else
> @@ -172,10 +174,18 @@ static inline bool capable(int cap)
> {
> return true;
> }
> +static inline bool capable_any(int cap1, int cap2)
> +{
> + return true;
> +}
> static inline bool ns_capable(struct user_namespace *ns, int cap)
> {
> return true;
> }
> +static inline bool ns_capable_any(struct user_namespace *ns, int cap1, int cap2)
> +{
> + return true;
> +}
> static inline bool ns_capable_noaudit(struct user_namespace *ns, int cap)
> {
> return true;
> diff --git a/kernel/capability.c b/kernel/capability.c
> index dac4df77e376..73358abfe2e1 100644
> --- a/kernel/capability.c
> +++ b/kernel/capability.c
> @@ -402,6 +402,23 @@ bool ns_capable_noaudit(struct user_namespace *ns, int cap)
> }
> EXPORT_SYMBOL(ns_capable_noaudit);
>
> +/**
> + * ns_capable_noauditondeny - Determine if the current task has a superior capability
> + * (unaudited when unauthorized) in effect
> + * @ns: The usernamespace we want the capability in
> + * @cap: The capability to be tested for
> + *
> + * Return true if the current task has the given superior capability currently
> + * available for use, false if not.
> + *
> + * This sets PF_SUPERPRIV on the task if the capability is available on the
> + * assumption that it's about to be used.
> + */
> +static bool ns_capable_noauditondeny(struct user_namespace *ns, int cap)
> +{
> + return ns_capable_common(ns, cap, CAP_OPT_NOAUDIT_ONDENY);
> +}
> +
> /**
> * ns_capable_setid - Determine if the current task has a superior capability
> * in effect, while signalling that this check is being done from within a
> @@ -421,6 +438,62 @@ bool ns_capable_setid(struct user_namespace *ns, int cap)
> }
> EXPORT_SYMBOL(ns_capable_setid);
>
> +/**
> + * ns_capable_any - Determine if the current task has one of two superior capabilities in effect
> + * @ns: The usernamespace we want the capability in
> + * @cap1: The capabilities to be tested for first
> + * @cap2: The capabilities to be tested for secondly
> + *
> + * Return true if the current task has at least one of the two given superior
> + * capabilities currently available for use, false if not.
> + *
> + * In contrast to or'ing capable() this call will create exactly one audit
> + * message, either for @cap1, if it is granted or both are not permitted,
> + * or @cap2, if it is granted while the other one is not.
> + *
> + * The capabilities should be ordered from least to most invasive, i.e. CAP_SYS_ADMIN last.
> + *
> + * This sets PF_SUPERPRIV on the task if the capability is available on the
> + * assumption that it's about to be used.
> + */
> +bool ns_capable_any(struct user_namespace *ns, int cap1, int cap2)
> +{
> + if (cap1 == cap2)
> + return ns_capable(ns, cap1);
> +
> + if (ns_capable_noauditondeny(ns, cap1))
> + return true;
> +
> + if (ns_capable_noauditondeny(ns, cap2))
> + return true;
> +
> + return ns_capable(ns, cap1);
> +}
> +EXPORT_SYMBOL(ns_capable_any);
> +
> +/**
> + * capable_any - Determine if the current task has one of two superior capabilities in effect
> + * @cap1: The capabilities to be tested for first
> + * @cap2: The capabilities to be tested for secondly
> + *
> + * Return true if the current task has at least one of the two given superior
> + * capabilities currently available for use, false if not.
> + *
> + * In contrast to or'ing capable() this call will create exactly one audit
> + * message, either for @cap1, if it is granted or both are not permitted,
> + * or @cap2, if it is granted while the other one is not.
> + *
> + * The capabilities should be ordered from least to most invasive, i.e. CAP_SYS_ADMIN last.
> + *
> + * This sets PF_SUPERPRIV on the task if the capability is available on the
> + * assumption that it's about to be used.
> + */
> +bool capable_any(int cap1, int cap2)
> +{
> + return ns_capable_any(&init_user_ns, cap1, cap2);
> +}
> +EXPORT_SYMBOL(capable_any);
> +
> /**
> * capable - Determine if the current task has a superior capability in effect
> * @cap: The capability to be tested for
> --
> 2.43.0
>
>
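
The audit behaviour described in the kernel-doc above can be compared
against or'ing two capable() calls with a small userspace sketch. This
is a simplified model under stated assumptions, not the kernel
implementation: struct sim_audit_log, sim_check() and the SIM_CAP_*
constants are hypothetical, and every granted check is assumed to
produce one record, whereas a real LSM may audit grants only under
specific policy.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model, not kernel code. Bit positions are illustrative. */
enum { SIM_CAP_SYS_ADMIN = 21, SIM_CAP_SYS_NICE = 23 };

struct sim_audit_log {
	int grants;   /* records for granted capabilities */
	int denials;  /* records for denied capabilities */
};

/* One simulated check; denials are logged only if audit_on_deny is set. */
static bool sim_check(unsigned long long held, int cap, bool audit_on_deny,
		      struct sim_audit_log *log)
{
	bool granted = (held >> cap) & 1ULL;

	if (granted)
		log->grants++;
	else if (audit_on_deny)
		log->denials++;
	return granted;
}

/* Models capable(cap1) || capable(cap2): every denial is audited. */
static bool sim_capable_or(unsigned long long held, int cap1, int cap2,
			   struct sim_audit_log *log)
{
	return sim_check(held, cap1, true, log) ||
	       sim_check(held, cap2, true, log);
}

/* Models capable_any(cap1, cap2) as structured in the patch. */
static bool sim_capable_any(unsigned long long held, int cap1, int cap2,
			    struct sim_audit_log *log)
{
	if (cap1 == cap2)
		return sim_check(held, cap1, true, log);
	if (sim_check(held, cap1, false, log))
		return true;
	if (sim_check(held, cap2, false, log))
		return true;
	/* Both denied: audit exactly one denial, for cap1. */
	return sim_check(held, cap1, true, log);
}
```

With only the second capability held, the or'ed form in this model logs
a denial for the first capability plus a grant for the second, while
the any form logs the grant alone. With neither held, the or'ed form
logs two denials and the any form logs one.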