This is an update to the previous seccomp logging patch sets. The main
difference in this revision compared to the last is that the application now
has the ability to request that all actions in a filter, except for RET_ALLOW,
should be logged. This is done with a new filter flag. In support of that
change, the log_max_action sysctl was renamed to actions_logged as it now lists
the actions that an admin has allowed to be logged. The admin has the final say
in what actions get logged.
Please see the individual patches for summaries of changes since the last
revision.
Thanks!
Tyler
This patch creates a read-only sysctl containing an ordered list of
seccomp actions that the kernel supports. The ordering, from left to
right, is the lowest action value (kill) to the highest action value
(allow). Currently, a read of the sysctl file would return "kill trap
errno trace allow". The contents of this sysctl file can be useful for
userspace code as well as the system administrator.
The path to the sysctl is:
/proc/sys/kernel/seccomp/actions_avail
libseccomp and other userspace code can easily determine which actions
the current kernel supports. The set of actions supported by the current
kernel may be different than the set of action macros found in kernel
headers that were installed where the userspace code was built.
In addition, this sysctl will allow system administrators to know which
actions are supported by the kernel and make it easier to configure
exactly what seccomp logs through the audit subsystem. Support for this
level of logging configuration will come in a future patch.
Signed-off-by: Tyler Hicks <[email protected]>
---
* Changes since v4:
- move device_initcall() into CONFIG_SYSCTL ifdef
- mark the seccomp_actions_avail string as const
- adjust for new reStructuredText format
Documentation/sysctl/kernel.txt | 1 +
Documentation/userspace-api/seccomp_filter.rst | 16 ++++++++
kernel/seccomp.c | 51 ++++++++++++++++++++++++++
3 files changed, 68 insertions(+)
diff --git a/Documentation/sysctl/kernel.txt b/Documentation/sysctl/kernel.txt
index bac23c1..995c42c 100644
--- a/Documentation/sysctl/kernel.txt
+++ b/Documentation/sysctl/kernel.txt
@@ -74,6 +74,7 @@ show up in /proc/sys/kernel:
- reboot-cmd [ SPARC only ]
- rtsig-max
- rtsig-nr
+- seccomp/ ==> Documentation/userspace-api/seccomp_filter.rst
- sem
- sem_next_id [ sysv ipc ]
- sg-big-buff [ generic SCSI device (sg) ]
diff --git a/Documentation/userspace-api/seccomp_filter.rst b/Documentation/userspace-api/seccomp_filter.rst
index f71eb5e..35fc7cb 100644
--- a/Documentation/userspace-api/seccomp_filter.rst
+++ b/Documentation/userspace-api/seccomp_filter.rst
@@ -169,7 +169,23 @@ The ``samples/seccomp/`` directory contains both an x86-specific example
and a more generic example of a higher level macro interface for BPF
program generation.
+Sysctls
+=======
+
+Seccomp's sysctl files can be found in the ``/proc/sys/kernel/seccomp/``
+directory. Here's a description of each file in that directory:
+
+``actions_avail``:
+ A read-only ordered list of seccomp return values (refer to the
+ ``SECCOMP_RET_*`` macros above) in string form. The ordering, from
+ left-to-right, is the least permissive return value to the most
+ permissive return value.
+ The list represents the set of seccomp return values supported
+ by the kernel. A userspace program may use this list to
+ determine if the actions found in the ``seccomp.h``, when the
+ program was built, differs from the set of actions actually
+ supported in the current running kernel.
Adding architecture support
===========================
diff --git a/kernel/seccomp.c b/kernel/seccomp.c
index 98b59b5..6bff068 100644
--- a/kernel/seccomp.c
+++ b/kernel/seccomp.c
@@ -17,11 +17,13 @@
#include <linux/audit.h>
#include <linux/compat.h>
#include <linux/coredump.h>
+#include <linux/kmemleak.h>
#include <linux/sched.h>
#include <linux/sched/task_stack.h>
#include <linux/seccomp.h>
#include <linux/slab.h>
#include <linux/syscalls.h>
+#include <linux/sysctl.h>
#ifdef CONFIG_HAVE_ARCH_SECCOMP_FILTER
#include <asm/syscall.h>
@@ -922,3 +924,52 @@ long seccomp_get_filter(struct task_struct *task, unsigned long filter_off,
return ret;
}
#endif
+
+#ifdef CONFIG_SYSCTL
+
+/* Human readable action names for friendly sysctl interaction */
+#define SECCOMP_RET_KILL_NAME "kill"
+#define SECCOMP_RET_TRAP_NAME "trap"
+#define SECCOMP_RET_ERRNO_NAME "errno"
+#define SECCOMP_RET_TRACE_NAME "trace"
+#define SECCOMP_RET_ALLOW_NAME "allow"
+
+static const char seccomp_actions_avail[] = SECCOMP_RET_KILL_NAME " "
+ SECCOMP_RET_TRAP_NAME " "
+ SECCOMP_RET_ERRNO_NAME " "
+ SECCOMP_RET_TRACE_NAME " "
+ SECCOMP_RET_ALLOW_NAME;
+
+static struct ctl_path seccomp_sysctl_path[] = {
+ { .procname = "kernel", },
+ { .procname = "seccomp", },
+ { }
+};
+
+static struct ctl_table seccomp_sysctl_table[] = {
+ {
+ .procname = "actions_avail",
+ .data = (void *) &seccomp_actions_avail,
+ .maxlen = sizeof(seccomp_actions_avail),
+ .mode = 0444,
+ .proc_handler = proc_dostring,
+ },
+ { }
+};
+
+static int __init seccomp_sysctl_init(void)
+{
+ struct ctl_table_header *hdr;
+
+ hdr = register_sysctl_paths(seccomp_sysctl_path, seccomp_sysctl_table);
+ if (!hdr)
+ pr_warn("seccomp: sysctl registration failed\n");
+ else
+ kmemleak_not_leak(hdr);
+
+ return 0;
+}
+
+device_initcall(seccomp_sysctl_init)
+
+#endif /* CONFIG_SYSCTL */
--
2.7.4
Userspace code that needs to check if the kernel supports a given action
may not be able to use the /proc/sys/kernel/seccomp/actions_avail
sysctl. The process may be running in a sandbox and, therefore,
sufficient filesystem access may not be available. This patch adds an
operation to the seccomp(2) syscall that allows userspace code to ask
the kernel if a given action is available.
If the action is supported by the kernel, 0 is returned. If the action
is not supported by the kernel, -1 is returned with errno set to
-EOPNOTSUPP. If this check is attempted on a kernel that doesn't support
this new operation, -1 is returned with errno set to -EINVAL meaning
that userspace code will have the ability to differentiate between the
two error cases.
Signed-off-by: Tyler Hicks <[email protected]>
---
* Changes since v4:
- This is new patch to allow applications to check if an action is supported
without having to consult the actions_avail sysctl
include/uapi/linux/seccomp.h | 5 ++--
kernel/seccomp.c | 26 +++++++++++++++++++
tools/testing/selftests/seccomp/seccomp_bpf.c | 36 +++++++++++++++++++++++++++
3 files changed, 65 insertions(+), 2 deletions(-)
diff --git a/include/uapi/linux/seccomp.h b/include/uapi/linux/seccomp.h
index 82c823c..19a611d 100644
--- a/include/uapi/linux/seccomp.h
+++ b/include/uapi/linux/seccomp.h
@@ -11,8 +11,9 @@
#define SECCOMP_MODE_FILTER 2 /* uses user-supplied filter. */
/* Valid operations for seccomp syscall. */
-#define SECCOMP_SET_MODE_STRICT 0
-#define SECCOMP_SET_MODE_FILTER 1
+#define SECCOMP_SET_MODE_STRICT 0
+#define SECCOMP_SET_MODE_FILTER 1
+#define SECCOMP_GET_ACTION_AVAIL 2
/* Valid flags for SECCOMP_SET_MODE_FILTER */
#define SECCOMP_FILTER_FLAG_TSYNC 1
diff --git a/kernel/seccomp.c b/kernel/seccomp.c
index 1c4c496..03ad3ba 100644
--- a/kernel/seccomp.c
+++ b/kernel/seccomp.c
@@ -858,6 +858,27 @@ static inline long seccomp_set_mode_filter(unsigned int flags,
}
#endif
+static long seccomp_get_action_avail(const char __user *uaction)
+{
+ u32 action;
+
+ if (copy_from_user(&action, uaction, sizeof(action)))
+ return -EFAULT;
+
+ switch (action) {
+ case SECCOMP_RET_KILL:
+ case SECCOMP_RET_TRAP:
+ case SECCOMP_RET_ERRNO:
+ case SECCOMP_RET_TRACE:
+ case SECCOMP_RET_ALLOW:
+ break;
+ default:
+ return -EOPNOTSUPP;
+ }
+
+ return 0;
+}
+
/* Common entry point for both prctl and syscall. */
static long do_seccomp(unsigned int op, unsigned int flags,
const char __user *uargs)
@@ -869,6 +890,11 @@ static long do_seccomp(unsigned int op, unsigned int flags,
return seccomp_set_mode_strict();
case SECCOMP_SET_MODE_FILTER:
return seccomp_set_mode_filter(flags, uargs);
+ case SECCOMP_GET_ACTION_AVAIL:
+ if (flags != 0)
+ return -EINVAL;
+
+ return seccomp_get_action_avail(uargs);
default:
return -EINVAL;
}
diff --git a/tools/testing/selftests/seccomp/seccomp_bpf.c b/tools/testing/selftests/seccomp/seccomp_bpf.c
index eeb4f7a..8f0872b 100644
--- a/tools/testing/selftests/seccomp/seccomp_bpf.c
+++ b/tools/testing/selftests/seccomp/seccomp_bpf.c
@@ -1683,6 +1683,10 @@ TEST_F_SIGNAL(TRACE_syscall, kill_after_ptrace, SIGSYS)
#define SECCOMP_SET_MODE_FILTER 1
#endif
+#ifndef SECCOMP_GET_ACTION_AVAIL
+#define SECCOMP_GET_ACTION_AVAIL 2
+#endif
+
#ifndef SECCOMP_FILTER_FLAG_TSYNC
#define SECCOMP_FILTER_FLAG_TSYNC 1
#endif
@@ -2486,6 +2490,38 @@ TEST_SIGNAL(filter_flag_log, SIGSYS)
EXPECT_EQ(0, syscall(__NR_getpid));
}
+TEST(get_action_avail)
+{
+ __u32 actions[] = { SECCOMP_RET_KILL, SECCOMP_RET_TRAP,
+ SECCOMP_RET_ERRNO, SECCOMP_RET_TRACE,
+ SECCOMP_RET_ALLOW };
+ __u32 unknown_action = 0x10000000U;
+ int i;
+ long ret;
+
+ ret = seccomp(SECCOMP_GET_ACTION_AVAIL, 0, &actions[0]);
+ ASSERT_NE(ENOSYS, errno) {
+ TH_LOG("Kernel does not support seccomp syscall!");
+ }
+ ASSERT_NE(EINVAL, errno) {
+ TH_LOG("Kernel does not support SECCOMP_GET_ACTION_AVAIL operation!");
+ }
+ EXPECT_EQ(ret, 0);
+
+ for (i = 0; i < ARRAY_SIZE(actions); i++) {
+ ret = seccomp(SECCOMP_GET_ACTION_AVAIL, 0, &actions[i]);
+ EXPECT_EQ(ret, 0) {
+ TH_LOG("Expected action (0x%X) not available!",
+ actions[i]);
+ }
+ }
+
+ /* Check that an unknown action is handled properly (EOPNOTSUPP) */
+ ret = seccomp(SECCOMP_GET_ACTION_AVAIL, 0, &unknown_action);
+ EXPECT_EQ(ret, -1);
+ EXPECT_EQ(errno, EOPNOTSUPP);
+}
+
/*
* TODO:
* - add microbenchmarks
--
2.7.4
Add a new filter flag, SECCOMP_FILTER_FLAG_LOG, that enables logging for
all actions except for SECCOMP_RET_ALLOW for the given filter.
SECCOMP_RET_KILL actions are always logged, when "kill" is in the
actions_logged sysctl, and SECCOMP_RET_ALLOW actions are never logged,
regardless of this flag.
This flag can be used to create noisy filters that result in all
non-allowed actions to be logged. A process may have one noisy filter,
which is loaded with this flag, as well as a quiet filter that's not
loaded with this flag. This allows for the actions in a set of filters
to be selectively conveyed to the admin.
Since a system could have a large number of allocated seccomp_filter
structs, struct packing was taken in consideration. On 64 bit x86, the
new log member takes up one byte of an existing four byte hole in the
struct. On 32 bit x86, the new log member creates a new four byte hole
(unavoidable) and consumes one of those bytes.
Unfortunately, the tests added for SECCOMP_FILTER_FLAG_LOG are not
capable of inspecting the audit log to verify that the actions taken in
the filter were logged.
Signed-off-by: Tyler Hicks <[email protected]>
---
* Changes since v4:
- This is a new patch that allows the application to have a say in what gets
logged
include/linux/seccomp.h | 3 +-
include/uapi/linux/seccomp.h | 1 +
kernel/seccomp.c | 78 ++++++++++++++++-----------
tools/testing/selftests/seccomp/seccomp_bpf.c | 66 +++++++++++++++++++++++
4 files changed, 116 insertions(+), 32 deletions(-)
diff --git a/include/linux/seccomp.h b/include/linux/seccomp.h
index ecc296c..c8bef43 100644
--- a/include/linux/seccomp.h
+++ b/include/linux/seccomp.h
@@ -3,7 +3,8 @@
#include <uapi/linux/seccomp.h>
-#define SECCOMP_FILTER_FLAG_MASK (SECCOMP_FILTER_FLAG_TSYNC)
+#define SECCOMP_FILTER_FLAG_MASK (SECCOMP_FILTER_FLAG_TSYNC | \
+ SECCOMP_FILTER_FLAG_LOG)
#ifdef CONFIG_SECCOMP
diff --git a/include/uapi/linux/seccomp.h b/include/uapi/linux/seccomp.h
index 0f238a4..82c823c 100644
--- a/include/uapi/linux/seccomp.h
+++ b/include/uapi/linux/seccomp.h
@@ -16,6 +16,7 @@
/* Valid flags for SECCOMP_SET_MODE_FILTER */
#define SECCOMP_FILTER_FLAG_TSYNC 1
+#define SECCOMP_FILTER_FLAG_LOG 2
/*
* All BPF programs must return a 32-bit value.
diff --git a/kernel/seccomp.c b/kernel/seccomp.c
index 87257f2..1c4c496 100644
--- a/kernel/seccomp.c
+++ b/kernel/seccomp.c
@@ -44,6 +44,7 @@
* get/put helpers should be used when accessing an instance
* outside of a lifetime-guarded section. In general, this
* is only needed for handling filters shared across tasks.
+ * @log: true if all actions except for SECCOMP_RET_ALLOW should be logged
* @prev: points to a previously installed, or inherited, filter
* @prog: the BPF program to evaluate
*
@@ -59,6 +60,7 @@
*/
struct seccomp_filter {
refcount_t usage;
+ bool log;
struct seccomp_filter *prev;
struct bpf_prog *prog;
};
@@ -172,11 +174,14 @@ static int seccomp_check_filter(struct sock_filter *filter, unsigned int flen)
/**
* seccomp_run_filters - evaluates all seccomp filters against @sd
+ * @filter: upon return, points to the matched filter but may be NULL in some
+ * unexpected situations
* @sd: optional seccomp data to be passed to filters
*
* Returns valid seccomp BPF response codes.
*/
-static u32 seccomp_run_filters(const struct seccomp_data *sd)
+static u32 seccomp_run_filters(struct seccomp_filter **filter,
+ const struct seccomp_data *sd)
{
struct seccomp_data sd_local;
u32 ret = SECCOMP_RET_ALLOW;
@@ -184,6 +189,8 @@ static u32 seccomp_run_filters(const struct seccomp_data *sd)
struct seccomp_filter *f =
lockless_dereference(current->seccomp.filter);
+ *filter = f;
+
/* Ensure unexpected behavior doesn't result in failing open. */
if (unlikely(WARN_ON(f == NULL)))
return SECCOMP_RET_KILL;
@@ -200,8 +207,10 @@ static u32 seccomp_run_filters(const struct seccomp_data *sd)
for (; f; f = f->prev) {
u32 cur_ret = BPF_PROG_RUN(f->prog, sd);
- if ((cur_ret & SECCOMP_RET_ACTION) < (ret & SECCOMP_RET_ACTION))
+ if ((cur_ret & SECCOMP_RET_ACTION) < (ret & SECCOMP_RET_ACTION)) {
ret = cur_ret;
+ *filter = f;
+ }
}
return ret;
}
@@ -446,6 +455,9 @@ static long seccomp_attach_filter(unsigned int flags,
return ret;
}
+ if (flags & SECCOMP_FILTER_FLAG_LOG)
+ filter->log = true;
+
/*
* If there is an existing filter, make it the prev and don't drop its
* task reference.
@@ -526,36 +538,39 @@ static void seccomp_send_sigsys(int syscall, int reason)
static u32 seccomp_actions_logged = SECCOMP_LOG_KILL | SECCOMP_LOG_TRAP |
SECCOMP_LOG_ERRNO | SECCOMP_LOG_TRACE;
-static inline void seccomp_log(unsigned long syscall, long signr, u32 action)
+static inline void seccomp_log(unsigned long syscall, long signr, u32 action,
+ bool requested)
{
- bool log;
+ if (requested) {
+ bool log;
+
+ switch (action) {
+ case SECCOMP_RET_TRAP:
+ log = seccomp_actions_logged & SECCOMP_LOG_TRAP;
+ break;
+ case SECCOMP_RET_ERRNO:
+ log = seccomp_actions_logged & SECCOMP_LOG_ERRNO;
+ break;
+ case SECCOMP_RET_TRACE:
+ log = seccomp_actions_logged & SECCOMP_LOG_TRACE;
+ break;
+ case SECCOMP_RET_ALLOW:
+ log = false;
+ break;
+ case SECCOMP_RET_KILL:
+ default:
+ log = seccomp_actions_logged & SECCOMP_LOG_KILL;
+ }
- switch (action) {
- case SECCOMP_RET_TRAP:
- log = seccomp_actions_logged & SECCOMP_LOG_TRAP;
- break;
- case SECCOMP_RET_ERRNO:
- log = seccomp_actions_logged & SECCOMP_LOG_ERRNO;
- break;
- case SECCOMP_RET_TRACE:
- log = seccomp_actions_logged & SECCOMP_LOG_TRACE;
- break;
- case SECCOMP_RET_ALLOW:
- log = false;
- break;
- case SECCOMP_RET_KILL:
- default:
- log = seccomp_actions_logged & SECCOMP_LOG_KILL;
+ /*
+ * Force an audit message to be emitted when the action is
+ * allowed to be logged by the admin.
+ */
+ if (log)
+ return __audit_seccomp(syscall, signr, action);
}
/*
- * Force an audit message to be emitted when the action is allowed to
- * be logged by the admin.
- */
- if (log)
- return __audit_seccomp(syscall, signr, action);
-
- /*
* Let the audit subsystem decide if the action should be audited based
* on whether the current task itself is being audited.
*/
@@ -587,7 +602,7 @@ static void __secure_computing_strict(int this_syscall)
#ifdef SECCOMP_DEBUG
dump_stack();
#endif
- seccomp_log(this_syscall, SIGKILL, SECCOMP_RET_KILL);
+ seccomp_log(this_syscall, SIGKILL, SECCOMP_RET_KILL, true);
do_exit(SIGKILL);
}
@@ -613,6 +628,7 @@ void secure_computing_strict(int this_syscall)
static int __seccomp_filter(int this_syscall, const struct seccomp_data *sd,
const bool recheck_after_trace)
{
+ struct seccomp_filter *filter;
u32 filter_ret, action;
int data;
@@ -622,7 +638,7 @@ static int __seccomp_filter(int this_syscall, const struct seccomp_data *sd,
*/
rmb();
- filter_ret = seccomp_run_filters(sd);
+ filter_ret = seccomp_run_filters(&filter, sd);
data = filter_ret & SECCOMP_RET_DATA;
action = filter_ret & SECCOMP_RET_ACTION;
@@ -690,7 +706,7 @@ static int __seccomp_filter(int this_syscall, const struct seccomp_data *sd,
case SECCOMP_RET_KILL:
default:
- seccomp_log(this_syscall, SIGSYS, action);
+ seccomp_log(this_syscall, SIGSYS, action, true);
/* Dump core only if this is the last remaining thread. */
if (get_nr_threads(current) == 1) {
siginfo_t info;
@@ -707,7 +723,7 @@ static int __seccomp_filter(int this_syscall, const struct seccomp_data *sd,
unreachable();
skip:
- seccomp_log(this_syscall, 0, action);
+ seccomp_log(this_syscall, 0, action, filter ? filter->log : false);
return -1;
}
#else
diff --git a/tools/testing/selftests/seccomp/seccomp_bpf.c b/tools/testing/selftests/seccomp/seccomp_bpf.c
index 73f5ea6..eeb4f7a 100644
--- a/tools/testing/selftests/seccomp/seccomp_bpf.c
+++ b/tools/testing/selftests/seccomp/seccomp_bpf.c
@@ -1687,6 +1687,10 @@ TEST_F_SIGNAL(TRACE_syscall, kill_after_ptrace, SIGSYS)
#define SECCOMP_FILTER_FLAG_TSYNC 1
#endif
+#ifndef SECCOMP_FILTER_FLAG_LOG
+#define SECCOMP_FILTER_FLAG_LOG 2
+#endif
+
#ifndef seccomp
int seccomp(unsigned int op, unsigned int flags, void *args)
{
@@ -2421,6 +2425,67 @@ TEST(syscall_restart)
_metadata->passed = 0;
}
+TEST_SIGNAL(filter_flag_log, SIGSYS)
+{
+ struct sock_filter allow_filter[] = {
+ BPF_STMT(BPF_RET|BPF_K, SECCOMP_RET_ALLOW),
+ };
+ struct sock_filter kill_filter[] = {
+ BPF_STMT(BPF_LD|BPF_W|BPF_ABS,
+ offsetof(struct seccomp_data, nr)),
+ BPF_JUMP(BPF_JMP|BPF_JEQ|BPF_K, __NR_getpid, 0, 1),
+ BPF_STMT(BPF_RET|BPF_K, SECCOMP_RET_KILL),
+ BPF_STMT(BPF_RET|BPF_K, SECCOMP_RET_ALLOW),
+ };
+ struct sock_fprog allow_prog = {
+ .len = (unsigned short)ARRAY_SIZE(allow_filter),
+ .filter = allow_filter,
+ };
+ struct sock_fprog kill_prog = {
+ .len = (unsigned short)ARRAY_SIZE(kill_filter),
+ .filter = kill_filter,
+ };
+ long ret;
+ pid_t parent = getppid();
+
+ ret = prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0);
+ ASSERT_EQ(0, ret);
+
+ /* Verify that the FILTER_FLAG_LOG flag isn't accepted in strict mode */
+ ret = seccomp(SECCOMP_SET_MODE_STRICT, SECCOMP_FILTER_FLAG_LOG,
+ &allow_prog);
+ ASSERT_NE(ENOSYS, errno) {
+ TH_LOG("Kernel does not support seccomp syscall!");
+ }
+ EXPECT_NE(0, ret) {
+ TH_LOG("Kernel accepted FILTER_FLAG_LOG flag in strict mode!");
+ }
+ EXPECT_EQ(EINVAL, errno) {
+ TH_LOG("Kernel returned unexpected errno for FILTER_FLAG_LOG flag in strict mode!");
+ }
+
+ /* Verify that a simple, permissive filter can be added with no flags */
+ ret = seccomp(SECCOMP_SET_MODE_FILTER, 0, &allow_prog);
+ EXPECT_EQ(0, ret);
+
+ /* See if the same filter can be added with the FILTER_FLAG_LOG flag */
+ ret = seccomp(SECCOMP_SET_MODE_FILTER, SECCOMP_FILTER_FLAG_LOG,
+ &allow_prog);
+ ASSERT_NE(EINVAL, errno) {
+ TH_LOG("Kernel does not support the FILTER_FLAG_LOG flag!");
+ }
+ EXPECT_EQ(0, ret);
+
+ /* Ensure that the kill filter works with the FILTER_FLAG_LOG flag */
+ ret = seccomp(SECCOMP_SET_MODE_FILTER, SECCOMP_FILTER_FLAG_LOG,
+ &kill_prog);
+ EXPECT_EQ(0, ret);
+
+ EXPECT_EQ(parent, syscall(__NR_getppid));
+ /* getpid() should never return. */
+ EXPECT_EQ(0, syscall(__NR_getpid));
+}
+
/*
* TODO:
* - add microbenchmarks
@@ -2429,6 +2494,7 @@ TEST(syscall_restart)
* - endianness checking when appropriate
* - 64-bit arg prodding
* - arch value testing (x86 modes especially)
+ * - verify that FILTER_FLAG_LOG filters generate log messages
* - ...
*/
--
2.7.4
Add a new action, SECCOMP_RET_LOG, that logs a syscall before allowing
the syscall. At the implementation level, this action is identical to
the existing SECCOMP_RET_ALLOW action. However, it can be very useful when
initially developing a seccomp filter for an application. The developer
can set the default action to be SECCOMP_RET_LOG, maybe mark any
obviously needed syscalls with SECCOMP_RET_ALLOW, and then put the
application through its paces. A list of syscalls that triggered the
default action (SECCOMP_RET_LOG) can be easily gleaned from the logs and
that list can be used to build the syscall whitelist. Finally, the
developer can change the default action to the desired value.
This provides a more friendly experience than seeing the application get
killed, then updating the filter and rebuilding the app, seeing the
application get killed due to a different syscall, then updating the
filter and rebuilding the app, etc.
The functionality is similar to what's supported by the various LSMs.
SELinux has permissive mode, AppArmor has complain mode, SMACK has
bring-up mode, etc.
SECCOMP_RET_LOG is given a lower value than SECCOMP_RET_ALLOW as allow
while logging is slightly more restrictive than quietly allowing.
Unfortunately, the tests added for SECCOMP_RET_LOG are not capable of
inspecting the audit log to verify that the syscall was logged.
Signed-off-by: Tyler Hicks <[email protected]>
---
* Change since v4:
- folded the previously separate selftest patch into this patch
- add TODO to verify that RET_LOG generates log messages in selftests
- adjust for new reStructuredText format
Documentation/userspace-api/seccomp_filter.rst | 6 ++
include/uapi/linux/seccomp.h | 1 +
kernel/seccomp.c | 17 ++++-
tools/testing/selftests/seccomp/seccomp_bpf.c | 97 +++++++++++++++++++++++++-
4 files changed, 118 insertions(+), 3 deletions(-)
diff --git a/Documentation/userspace-api/seccomp_filter.rst b/Documentation/userspace-api/seccomp_filter.rst
index 2d1d8ab..2ece856 100644
--- a/Documentation/userspace-api/seccomp_filter.rst
+++ b/Documentation/userspace-api/seccomp_filter.rst
@@ -141,6 +141,12 @@ In precedence order, they are:
allow use of ptrace, even of other sandboxed processes, without
extreme care; ptracers can use this mechanism to escape.)
+``SECCOMP_RET_LOG``:
+ Results in the system call being executed after it is logged. This
+ should be used by application developers to learn which syscalls their
+ application needs without having to iterate through multiple test and
+ development cycles to build the list.
+
``SECCOMP_RET_ALLOW``:
Results in the system call being executed.
diff --git a/include/uapi/linux/seccomp.h b/include/uapi/linux/seccomp.h
index 19a611d..f944332 100644
--- a/include/uapi/linux/seccomp.h
+++ b/include/uapi/linux/seccomp.h
@@ -31,6 +31,7 @@
#define SECCOMP_RET_TRAP 0x00030000U /* disallow and force a SIGSYS */
#define SECCOMP_RET_ERRNO 0x00050000U /* returns an errno */
#define SECCOMP_RET_TRACE 0x7ff00000U /* pass to a tracer or disallow */
+#define SECCOMP_RET_LOG 0x7ffc0000U /* allow after logging */
#define SECCOMP_RET_ALLOW 0x7fff0000U /* allow */
/* Masks for the return value sections. */
diff --git a/kernel/seccomp.c b/kernel/seccomp.c
index 03ad3ba..c12d3dd 100644
--- a/kernel/seccomp.c
+++ b/kernel/seccomp.c
@@ -533,10 +533,12 @@ static void seccomp_send_sigsys(int syscall, int reason)
#define SECCOMP_LOG_TRAP (1 << 2)
#define SECCOMP_LOG_ERRNO (1 << 3)
#define SECCOMP_LOG_TRACE (1 << 4)
-#define SECCOMP_LOG_ALLOW (1 << 5)
+#define SECCOMP_LOG_LOG (1 << 5)
+#define SECCOMP_LOG_ALLOW (1 << 6)
static u32 seccomp_actions_logged = SECCOMP_LOG_KILL | SECCOMP_LOG_TRAP |
- SECCOMP_LOG_ERRNO | SECCOMP_LOG_TRACE;
+ SECCOMP_LOG_ERRNO | SECCOMP_LOG_TRACE |
+ SECCOMP_LOG_LOG;
static inline void seccomp_log(unsigned long syscall, long signr, u32 action,
bool requested)
@@ -554,6 +556,9 @@ static inline void seccomp_log(unsigned long syscall, long signr, u32 action,
case SECCOMP_RET_TRACE:
log = seccomp_actions_logged & SECCOMP_LOG_TRACE;
break;
+ case SECCOMP_RET_LOG:
+ log = seccomp_actions_logged & SECCOMP_LOG_LOG;
+ break;
case SECCOMP_RET_ALLOW:
log = false;
break;
@@ -701,6 +706,10 @@ static int __seccomp_filter(int this_syscall, const struct seccomp_data *sd,
return 0;
+ case SECCOMP_RET_LOG:
+ seccomp_log(this_syscall, 0, action, true);
+ return 0;
+
case SECCOMP_RET_ALLOW:
return 0;
@@ -870,6 +879,7 @@ static long seccomp_get_action_avail(const char __user *uaction)
case SECCOMP_RET_TRAP:
case SECCOMP_RET_ERRNO:
case SECCOMP_RET_TRACE:
+ case SECCOMP_RET_LOG:
case SECCOMP_RET_ALLOW:
break;
default:
@@ -1020,12 +1030,14 @@ long seccomp_get_filter(struct task_struct *task, unsigned long filter_off,
#define SECCOMP_RET_TRAP_NAME "trap"
#define SECCOMP_RET_ERRNO_NAME "errno"
#define SECCOMP_RET_TRACE_NAME "trace"
+#define SECCOMP_RET_LOG_NAME "log"
#define SECCOMP_RET_ALLOW_NAME "allow"
static const char seccomp_actions_avail[] = SECCOMP_RET_KILL_NAME " "
SECCOMP_RET_TRAP_NAME " "
SECCOMP_RET_ERRNO_NAME " "
SECCOMP_RET_TRACE_NAME " "
+ SECCOMP_RET_LOG_NAME " "
SECCOMP_RET_ALLOW_NAME;
#define SECCOMP_ACTIONS_AVAIL_LEN strlen(seccomp_actions_avail)
@@ -1040,6 +1052,7 @@ static const struct seccomp_log_name seccomp_log_names[] = {
{ SECCOMP_LOG_TRAP, SECCOMP_RET_TRAP_NAME },
{ SECCOMP_LOG_ERRNO, SECCOMP_RET_ERRNO_NAME },
{ SECCOMP_LOG_TRACE, SECCOMP_RET_TRACE_NAME },
+ { SECCOMP_LOG_LOG, SECCOMP_RET_LOG_NAME },
{ SECCOMP_LOG_ALLOW, SECCOMP_RET_ALLOW_NAME },
{ }
};
diff --git a/tools/testing/selftests/seccomp/seccomp_bpf.c b/tools/testing/selftests/seccomp/seccomp_bpf.c
index 8f0872b..040e875 100644
--- a/tools/testing/selftests/seccomp/seccomp_bpf.c
+++ b/tools/testing/selftests/seccomp/seccomp_bpf.c
@@ -87,6 +87,10 @@ struct seccomp_data {
};
#endif
+#ifndef SECCOMP_RET_LOG
+#define SECCOMP_RET_LOG 0x7ffc0000U /* allow after logging */
+#endif
+
#if __BYTE_ORDER == __LITTLE_ENDIAN
#define syscall_arg(_n) (offsetof(struct seccomp_data, args[_n]))
#elif __BYTE_ORDER == __BIG_ENDIAN
@@ -342,6 +346,28 @@ TEST(empty_prog)
EXPECT_EQ(EINVAL, errno);
}
+TEST(log_all)
+{
+ struct sock_filter filter[] = {
+ BPF_STMT(BPF_RET|BPF_K, SECCOMP_RET_LOG),
+ };
+ struct sock_fprog prog = {
+ .len = (unsigned short)ARRAY_SIZE(filter),
+ .filter = filter,
+ };
+ long ret;
+ pid_t parent = getppid();
+
+ ret = prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0);
+ ASSERT_EQ(0, ret);
+
+ ret = prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &prog);
+ ASSERT_EQ(0, ret);
+
+ /* getppid() should succeed and be logged (no check for logging) */
+ EXPECT_EQ(parent, syscall(__NR_getppid));
+}
+
TEST_SIGNAL(unknown_ret_is_kill_inside, SIGSYS)
{
struct sock_filter filter[] = {
@@ -735,6 +761,7 @@ TEST_F(TRAP, handler)
FIXTURE_DATA(precedence) {
struct sock_fprog allow;
+ struct sock_fprog log;
struct sock_fprog trace;
struct sock_fprog error;
struct sock_fprog trap;
@@ -746,6 +773,13 @@ FIXTURE_SETUP(precedence)
struct sock_filter allow_insns[] = {
BPF_STMT(BPF_RET|BPF_K, SECCOMP_RET_ALLOW),
};
+ struct sock_filter log_insns[] = {
+ BPF_STMT(BPF_LD|BPF_W|BPF_ABS,
+ offsetof(struct seccomp_data, nr)),
+ BPF_JUMP(BPF_JMP|BPF_JEQ|BPF_K, __NR_getpid, 1, 0),
+ BPF_STMT(BPF_RET|BPF_K, SECCOMP_RET_ALLOW),
+ BPF_STMT(BPF_RET|BPF_K, SECCOMP_RET_LOG),
+ };
struct sock_filter trace_insns[] = {
BPF_STMT(BPF_LD|BPF_W|BPF_ABS,
offsetof(struct seccomp_data, nr)),
@@ -782,6 +816,7 @@ FIXTURE_SETUP(precedence)
memcpy(self->_x.filter, &_x##_insns, sizeof(_x##_insns)); \
self->_x.len = (unsigned short)ARRAY_SIZE(_x##_insns)
FILTER_ALLOC(allow);
+ FILTER_ALLOC(log);
FILTER_ALLOC(trace);
FILTER_ALLOC(error);
FILTER_ALLOC(trap);
@@ -792,6 +827,7 @@ FIXTURE_TEARDOWN(precedence)
{
#define FILTER_FREE(_x) if (self->_x.filter) free(self->_x.filter)
FILTER_FREE(allow);
+ FILTER_FREE(log);
FILTER_FREE(trace);
FILTER_FREE(error);
FILTER_FREE(trap);
@@ -809,6 +845,8 @@ TEST_F(precedence, allow_ok)
ret = prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &self->allow);
ASSERT_EQ(0, ret);
+ ret = prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &self->log);
+ ASSERT_EQ(0, ret);
ret = prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &self->trace);
ASSERT_EQ(0, ret);
ret = prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &self->error);
@@ -833,6 +871,8 @@ TEST_F_SIGNAL(precedence, kill_is_highest, SIGSYS)
ret = prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &self->allow);
ASSERT_EQ(0, ret);
+ ret = prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &self->log);
+ ASSERT_EQ(0, ret);
ret = prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &self->trace);
ASSERT_EQ(0, ret);
ret = prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &self->error);
@@ -864,6 +904,8 @@ TEST_F_SIGNAL(precedence, kill_is_highest_in_any_order, SIGSYS)
ASSERT_EQ(0, ret);
ret = prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &self->error);
ASSERT_EQ(0, ret);
+ ret = prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &self->log);
+ ASSERT_EQ(0, ret);
ret = prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &self->trace);
ASSERT_EQ(0, ret);
ret = prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &self->trap);
@@ -885,6 +927,8 @@ TEST_F_SIGNAL(precedence, trap_is_second, SIGSYS)
ret = prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &self->allow);
ASSERT_EQ(0, ret);
+ ret = prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &self->log);
+ ASSERT_EQ(0, ret);
ret = prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &self->trace);
ASSERT_EQ(0, ret);
ret = prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &self->error);
@@ -910,6 +954,8 @@ TEST_F_SIGNAL(precedence, trap_is_second_in_any_order, SIGSYS)
ASSERT_EQ(0, ret);
ret = prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &self->trap);
ASSERT_EQ(0, ret);
+ ret = prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &self->log);
+ ASSERT_EQ(0, ret);
ret = prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &self->trace);
ASSERT_EQ(0, ret);
ret = prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &self->error);
@@ -931,6 +977,8 @@ TEST_F(precedence, errno_is_third)
ret = prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &self->allow);
ASSERT_EQ(0, ret);
+ ret = prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &self->log);
+ ASSERT_EQ(0, ret);
ret = prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &self->trace);
ASSERT_EQ(0, ret);
ret = prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &self->error);
@@ -949,6 +997,8 @@ TEST_F(precedence, errno_is_third_in_any_order)
ret = prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0);
ASSERT_EQ(0, ret);
+ ret = prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &self->log);
+ ASSERT_EQ(0, ret);
ret = prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &self->error);
ASSERT_EQ(0, ret);
ret = prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &self->trace);
@@ -971,6 +1021,8 @@ TEST_F(precedence, trace_is_fourth)
ret = prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &self->allow);
ASSERT_EQ(0, ret);
+ ret = prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &self->log);
+ ASSERT_EQ(0, ret);
ret = prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &self->trace);
ASSERT_EQ(0, ret);
/* Should work just fine. */
@@ -992,12 +1044,54 @@ TEST_F(precedence, trace_is_fourth_in_any_order)
ASSERT_EQ(0, ret);
ret = prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &self->allow);
ASSERT_EQ(0, ret);
+ ret = prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &self->log);
+ ASSERT_EQ(0, ret);
/* Should work just fine. */
EXPECT_EQ(parent, syscall(__NR_getppid));
/* No ptracer */
EXPECT_EQ(-1, syscall(__NR_getpid));
}
+TEST_F(precedence, log_is_fifth)
+{
+ pid_t mypid, parent;
+ long ret;
+
+ mypid = getpid();
+ parent = getppid();
+ ret = prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0);
+ ASSERT_EQ(0, ret);
+
+ ret = prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &self->allow);
+ ASSERT_EQ(0, ret);
+ ret = prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &self->log);
+ ASSERT_EQ(0, ret);
+ /* Should work just fine. */
+ EXPECT_EQ(parent, syscall(__NR_getppid));
+ /* Should also work just fine */
+ EXPECT_EQ(mypid, syscall(__NR_getpid));
+}
+
+TEST_F(precedence, log_is_fifth_in_any_order)
+{
+ pid_t mypid, parent;
+ long ret;
+
+ mypid = getpid();
+ parent = getppid();
+ ret = prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0);
+ ASSERT_EQ(0, ret);
+
+ ret = prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &self->log);
+ ASSERT_EQ(0, ret);
+ ret = prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &self->allow);
+ ASSERT_EQ(0, ret);
+ /* Should work just fine. */
+ EXPECT_EQ(parent, syscall(__NR_getppid));
+ /* Should also work just fine */
+ EXPECT_EQ(mypid, syscall(__NR_getpid));
+}
+
#ifndef PTRACE_O_TRACESECCOMP
#define PTRACE_O_TRACESECCOMP 0x00000080
#endif
@@ -2494,7 +2588,7 @@ TEST(get_action_avail)
{
__u32 actions[] = { SECCOMP_RET_KILL, SECCOMP_RET_TRAP,
SECCOMP_RET_ERRNO, SECCOMP_RET_TRACE,
- SECCOMP_RET_ALLOW };
+ SECCOMP_RET_LOG, SECCOMP_RET_ALLOW };
__u32 unknown_action = 0x10000000U;
int i;
long ret;
@@ -2531,6 +2625,7 @@ TEST(get_action_avail)
* - 64-bit arg prodding
* - arch value testing (x86 modes especially)
* - verify that FILTER_FLAG_LOG filters generate log messages
+ * - verify that RET_LOG generates log messages
* - ...
*/
--
2.7.4
Adminstrators can write to this sysctl to set the seccomp actions that
are allowed to be logged. Any actions not found in this sysctl will not
be logged.
For example, all SECCOMP_RET_KILL, SECCOMP_RET_TRAP, and
SECCOMP_RET_ERRNO actions would be loggable if "kill trap errno" were
written to the sysctl. SECCOMP_RET_TRACE actions would not be logged
since its string representation ("trace") wasn't present in the sysctl
value.
The path to the sysctl is:
/proc/sys/kernel/seccomp/actions_logged
The actions_avail sysctl can be read to discover the valid action names
that can be written to the actions_logged sysctl with the exception of
SECCOMP_RET_ALLOW. It cannot be configured for logging.
The default setting for the sysctl is to allow all actions to be logged
except SECCOMP_RET_ALLOW.
There's one important exception to this sysctl. If a task is
specifically being audited, meaning that an audit context has been
allocated for the task, seccomp will log all actions other than
SECCOMP_RET_ALLOW despite the value of actions_logged. This exception
preserves the existing auditing behavior of tasks with an allocated
audit context.
Signed-off-by: Tyler Hicks <[email protected]>
---
* Changes since v4:
- the sysctl is now a list of actions that are allowed by the admin to be
logged rather than a list of actions that should be logged
+ a follow up patch will let applications have a say in what should be
logged but the admin has the final say with this sysctl
+ RET_ALLOW cannot be allowed to be logged
- fix comment style
- mark the seccomp_action_names array as const
- adjust for new reStructuredText format
Documentation/userspace-api/seccomp_filter.rst | 18 +++
include/linux/audit.h | 6 +-
kernel/seccomp.c | 180 ++++++++++++++++++++++++-
3 files changed, 196 insertions(+), 8 deletions(-)
diff --git a/Documentation/userspace-api/seccomp_filter.rst b/Documentation/userspace-api/seccomp_filter.rst
index 35fc7cb..2d1d8ab 100644
--- a/Documentation/userspace-api/seccomp_filter.rst
+++ b/Documentation/userspace-api/seccomp_filter.rst
@@ -187,6 +187,24 @@ directory. Here's a description of each file in that directory:
program was built, differs from the set of actions actually
supported in the current running kernel.
+``actions_logged``:
+ A read-write ordered list of seccomp return values (refer to the
+ ``SECCOMP_RET_*`` macros above) that are allowed to be logged. Writes
+ to the file do not need to be in ordered form but reads from the file
+ will be ordered in the same way as the actions_avail sysctl.
+
+ It is important to note that the value of ``actions_logged`` does not
+ prevent certain actions from being logged when the audit subsystem is
+ configured to audit a task. If the action is not found in
+ ``actions_logged`` list, the final decision on whether to audit the
+ action for that task is ultimately left up to the audit subsystem to
+ decide for all seccomp return values other than ``SECCOMP_RET_ALLOW``.
+
+ The ``allow`` string is not accepted in the ``actions_logged`` sysctl
+ as it is not possible to log ``SECCOMP_RET_ALLOW`` actions. Attempting
+ to write ``allow`` to the sysctl will result in an EINVAL being
+ returned.
+
Adding architecture support
===========================
diff --git a/include/linux/audit.h b/include/linux/audit.h
index 2150bdc..8c30f06 100644
--- a/include/linux/audit.h
+++ b/include/linux/audit.h
@@ -314,11 +314,7 @@ void audit_core_dumps(long signr);
static inline void audit_seccomp(unsigned long syscall, long signr, int code)
{
- if (!audit_enabled)
- return;
-
- /* Force a record to be reported if a signal was delivered. */
- if (signr || unlikely(!audit_dummy_context()))
+ if (audit_enabled && unlikely(!audit_dummy_context()))
__audit_seccomp(syscall, signr, code);
}
diff --git a/kernel/seccomp.c b/kernel/seccomp.c
index 6bff068..87257f2 100644
--- a/kernel/seccomp.c
+++ b/kernel/seccomp.c
@@ -516,6 +516,52 @@ static void seccomp_send_sigsys(int syscall, int reason)
}
#endif /* CONFIG_SECCOMP_FILTER */
+/* For use with seccomp_actions_logged */
+#define SECCOMP_LOG_KILL (1 << 0)
+#define SECCOMP_LOG_TRAP (1 << 2)
+#define SECCOMP_LOG_ERRNO (1 << 3)
+#define SECCOMP_LOG_TRACE (1 << 4)
+#define SECCOMP_LOG_ALLOW (1 << 5)
+
+static u32 seccomp_actions_logged = SECCOMP_LOG_KILL | SECCOMP_LOG_TRAP |
+ SECCOMP_LOG_ERRNO | SECCOMP_LOG_TRACE;
+
+static inline void seccomp_log(unsigned long syscall, long signr, u32 action)
+{
+ bool log;
+
+ switch (action) {
+ case SECCOMP_RET_TRAP:
+ log = seccomp_actions_logged & SECCOMP_LOG_TRAP;
+ break;
+ case SECCOMP_RET_ERRNO:
+ log = seccomp_actions_logged & SECCOMP_LOG_ERRNO;
+ break;
+ case SECCOMP_RET_TRACE:
+ log = seccomp_actions_logged & SECCOMP_LOG_TRACE;
+ break;
+ case SECCOMP_RET_ALLOW:
+ log = false;
+ break;
+ case SECCOMP_RET_KILL:
+ default:
+ log = seccomp_actions_logged & SECCOMP_LOG_KILL;
+ }
+
+ /*
+ * Force an audit message to be emitted when the action is allowed to
+ * be logged by the admin.
+ */
+ if (log)
+ return __audit_seccomp(syscall, signr, action);
+
+ /*
+ * Let the audit subsystem decide if the action should be audited based
+ * on whether the current task itself is being audited.
+ */
+ return audit_seccomp(syscall, signr, action);
+}
+
/*
* Secure computing mode 1 allows only read/write/exit/sigreturn.
* To be fully secure this must be combined with rlimit
@@ -541,7 +587,7 @@ static void __secure_computing_strict(int this_syscall)
#ifdef SECCOMP_DEBUG
dump_stack();
#endif
- audit_seccomp(this_syscall, SIGKILL, SECCOMP_RET_KILL);
+ seccomp_log(this_syscall, SIGKILL, SECCOMP_RET_KILL);
do_exit(SIGKILL);
}
@@ -644,7 +690,7 @@ static int __seccomp_filter(int this_syscall, const struct seccomp_data *sd,
case SECCOMP_RET_KILL:
default:
- audit_seccomp(this_syscall, SIGSYS, action);
+ seccomp_log(this_syscall, SIGSYS, action);
/* Dump core only if this is the last remaining thread. */
if (get_nr_threads(current) == 1) {
siginfo_t info;
@@ -661,7 +707,7 @@ static int __seccomp_filter(int this_syscall, const struct seccomp_data *sd,
unreachable();
skip:
- audit_seccomp(this_syscall, 0, action);
+ seccomp_log(this_syscall, 0, action);
return -1;
}
#else
@@ -940,6 +986,129 @@ static const char seccomp_actions_avail[] = SECCOMP_RET_KILL_NAME " "
SECCOMP_RET_TRACE_NAME " "
SECCOMP_RET_ALLOW_NAME;
+#define SECCOMP_ACTIONS_AVAIL_LEN strlen(seccomp_actions_avail)
+
+struct seccomp_log_name {
+ u32 log;
+ const char *name;
+};
+
+static const struct seccomp_log_name seccomp_log_names[] = {
+ { SECCOMP_LOG_KILL, SECCOMP_RET_KILL_NAME },
+ { SECCOMP_LOG_TRAP, SECCOMP_RET_TRAP_NAME },
+ { SECCOMP_LOG_ERRNO, SECCOMP_RET_ERRNO_NAME },
+ { SECCOMP_LOG_TRACE, SECCOMP_RET_TRACE_NAME },
+ { SECCOMP_LOG_ALLOW, SECCOMP_RET_ALLOW_NAME },
+ { }
+};
+
+static bool seccomp_names_from_actions_logged(char *names, size_t size,
+ u32 actions_logged)
+{
+ const struct seccomp_log_name *cur;
+ bool append_space = false;
+
+ for (cur = seccomp_log_names; cur->name && size; cur++) {
+ ssize_t ret;
+
+ if (!(actions_logged & cur->log))
+ continue;
+
+ if (append_space) {
+ ret = strscpy(names, " ", size);
+ if (ret < 0)
+ return false;
+
+ names += ret;
+ size -= ret;
+ } else
+ append_space = true;
+
+ ret = strscpy(names, cur->name, size);
+ if (ret < 0)
+ return false;
+
+ names += ret;
+ size -= ret;
+ }
+
+ return true;
+}
+
+static bool seccomp_action_logged_from_name(u32 *action_logged,
+ const char *name)
+{
+ const struct seccomp_log_name *cur;
+
+ for (cur = seccomp_log_names; cur->name; cur++) {
+ if (!strcmp(cur->name, name)) {
+ *action_logged = cur->log;
+ return true;
+ }
+ }
+
+ return false;
+}
+
+static bool seccomp_actions_logged_from_names(u32 *actions_logged, char *names)
+{
+ char *name;
+
+ *actions_logged = 0;
+ while ((name = strsep(&names, " ")) && *name) {
+ u32 action_logged = 0;
+
+ if (!seccomp_action_logged_from_name(&action_logged, name))
+ return false;
+
+ *actions_logged |= action_logged;
+ }
+
+ return true;
+}
+
+static int seccomp_actions_logged_handler(struct ctl_table *ro_table, int write,
+ void __user *buffer, size_t *lenp,
+ loff_t *ppos)
+{
+ char names[SECCOMP_ACTIONS_AVAIL_LEN + 1];
+ struct ctl_table table;
+ int ret;
+
+ if (write && !capable(CAP_SYS_ADMIN))
+ return -EPERM;
+
+ memset(names, 0, sizeof(names));
+
+ if (!write) {
+ if (!seccomp_names_from_actions_logged(names, sizeof(names),
+ seccomp_actions_logged))
+ return -EINVAL;
+ }
+
+ table = *ro_table;
+ table.data = names;
+ table.maxlen = sizeof(names);
+ ret = proc_dostring(&table, write, buffer, lenp, ppos);
+ if (ret)
+ return ret;
+
+ if (write) {
+ u32 actions_logged;
+
+ if (!seccomp_actions_logged_from_names(&actions_logged,
+ table.data))
+ return -EINVAL;
+
+ if (actions_logged & SECCOMP_LOG_ALLOW)
+ return -EINVAL;
+
+ seccomp_actions_logged = actions_logged;
+ }
+
+ return 0;
+}
+
static struct ctl_path seccomp_sysctl_path[] = {
{ .procname = "kernel", },
{ .procname = "seccomp", },
@@ -954,6 +1123,11 @@ static struct ctl_table seccomp_sysctl_table[] = {
.mode = 0444,
.proc_handler = proc_dostring,
},
+ {
+ .procname = "actions_logged",
+ .mode = 0644,
+ .proc_handler = seccomp_actions_logged_handler,
+ },
{ }
};
--
2.7.4
Userspace needs to be able to reliably detect the support of a filter
flag. A good way of doing that is by attempting to enter filter mode,
with the flag bit(s) in question set, and a NULL pointer for the args
parameter of seccomp(2). EFAULT indicates that the flag is valid and
EINVAL indicates that the flag is invalid.
This patch adds a selftest that can be used to test this method of
detection in userspace.
Signed-off-by: Tyler Hicks <[email protected]>
---
* Changes since v4:
- This is a new patch
tools/testing/selftests/seccomp/seccomp_bpf.c | 58 +++++++++++++++++++++++++++
1 file changed, 58 insertions(+)
diff --git a/tools/testing/selftests/seccomp/seccomp_bpf.c b/tools/testing/selftests/seccomp/seccomp_bpf.c
index 040e875..d221437 100644
--- a/tools/testing/selftests/seccomp/seccomp_bpf.c
+++ b/tools/testing/selftests/seccomp/seccomp_bpf.c
@@ -1885,6 +1885,64 @@ TEST(seccomp_syscall_mode_lock)
}
}
+/* Test detection of known and unknown filter flags. Userspace needs to be able
+ * to check if a filter flag is support by the current kernel and a good way of
+ * doing that is by attempting to enter filter mode, with the flag bit in
+ * question set, and a NULL pointer for the _args_ parameter. EFAULT indicates
+ * that the flag is valid and EINVAL indicates that the flag is invalid.
+ */
+TEST(detect_seccomp_filter_flags)
+{
+ unsigned int flags[] = { SECCOMP_FILTER_FLAG_TSYNC,
+ SECCOMP_FILTER_FLAG_LOG };
+ unsigned int flag, all_flags;
+ int i;
+ long ret;
+
+ /* Test detection of known-good filter flags */
+ for (i = 0, all_flags = 0; i < ARRAY_SIZE(flags); i++) {
+ flag = flags[i];
+ ret = seccomp(SECCOMP_SET_MODE_FILTER, flag, NULL);
+ ASSERT_NE(ENOSYS, errno) {
+ TH_LOG("Kernel does not support seccomp syscall!");
+ }
+ EXPECT_EQ(-1, ret);
+ EXPECT_EQ(EFAULT, errno) {
+ TH_LOG("Failed to detect that a known-good filter flag (0x%X) is supported!",
+ flag);
+ }
+
+ all_flags |= flag;
+ }
+
+ /* Test detection of all known-good filter flags */
+ ret = seccomp(SECCOMP_SET_MODE_FILTER, all_flags, NULL);
+ EXPECT_EQ(-1, ret);
+ EXPECT_EQ(EFAULT, errno) {
+ TH_LOG("Failed to detect that all known-good filter flags (0x%X) are supported!",
+ all_flags);
+ }
+
+ /* Test detection of an unknown filter flag */
+ flag = -1;
+ ret = seccomp(SECCOMP_SET_MODE_FILTER, flag, NULL);
+ EXPECT_EQ(-1, ret);
+ EXPECT_EQ(EINVAL, errno) {
+ TH_LOG("Failed to detect that an unknown filter flag (0x%X) is unsupported!",
+ flag);
+ }
+
+ /* Test detection of an unknown filter flag that may simply need to be
+ * added to this test */
+ flag = flags[ARRAY_SIZE(flags) - 1] << 1;
+ ret = seccomp(SECCOMP_SET_MODE_FILTER, flag, NULL);
+ EXPECT_EQ(-1, ret);
+ EXPECT_EQ(EINVAL, errno) {
+ TH_LOG("Failed to detect that an unknown filter flag (0x%X) is unsupported! Does a new flag need to be added to this test?",
+ flag);
+ }
+}
+
TEST(TSYNC_first)
{
struct sock_filter filter[] = {
--
2.7.4
On Fri, Jul 28, 2017 at 1:55 PM, Tyler Hicks <[email protected]> wrote:
> Adminstrators can write to this sysctl to set the seccomp actions that
> are allowed to be logged. Any actions not found in this sysctl will not
> be logged.
>
> For example, all SECCOMP_RET_KILL, SECCOMP_RET_TRAP, and
> SECCOMP_RET_ERRNO actions would be loggable if "kill trap errno" were
> written to the sysctl. SECCOMP_RET_TRACE actions would not be logged
> since its string representation ("trace") wasn't present in the sysctl
> value.
Just to make sure I'm clear on this, the key word above is "loggable",
in that filters requesting logging will be seen.
i.e. at the end of the series, the final state of "will it be logged?" is:
if action==RET_ALLOW:
do not log
else if action==RET_KILL || audit-enabled:
log
else if filter-requests-logging && action in actions_logged:
log
else:
do not log
> The path to the sysctl is:
>
> /proc/sys/kernel/seccomp/actions_logged
>
> The actions_avail sysctl can be read to discover the valid action names
> that can be written to the actions_logged sysctl with the exception of
> SECCOMP_RET_ALLOW. It cannot be configured for logging.
>
> The default setting for the sysctl is to allow all actions to be logged
> except SECCOMP_RET_ALLOW.
>
> There's one important exception to this sysctl. If a task is
> specifically being audited, meaning that an audit context has been
> allocated for the task, seccomp will log all actions other than
> SECCOMP_RET_ALLOW despite the value of actions_logged. This exception
> preserves the existing auditing behavior of tasks with an allocated
> audit context.
>
> Signed-off-by: Tyler Hicks <[email protected]>
> ---
>
> * Changes since v4:
> - the sysctl is now a list of actions that are allowed by the admin to be
> logged rather than a list of actions that should be logged
> + a follow up patch will let applications have a say in what should be
> logged but the admin has the final say with this sysctl
> + RET_ALLOW cannot be allowed to be logged
> - fix comment style
> - mark the seccomp_action_names array as const
> - adjust for new reStructuredText format
>
> Documentation/userspace-api/seccomp_filter.rst | 18 +++
> include/linux/audit.h | 6 +-
> kernel/seccomp.c | 180 ++++++++++++++++++++++++-
> 3 files changed, 196 insertions(+), 8 deletions(-)
>
> diff --git a/Documentation/userspace-api/seccomp_filter.rst b/Documentation/userspace-api/seccomp_filter.rst
> index 35fc7cb..2d1d8ab 100644
> --- a/Documentation/userspace-api/seccomp_filter.rst
> +++ b/Documentation/userspace-api/seccomp_filter.rst
> @@ -187,6 +187,24 @@ directory. Here's a description of each file in that directory:
> program was built, differs from the set of actions actually
> supported in the current running kernel.
>
> +``actions_logged``:
> + A read-write ordered list of seccomp return values (refer to the
> + ``SECCOMP_RET_*`` macros above) that are allowed to be logged. Writes
> + to the file do not need to be in ordered form but reads from the file
> + will be ordered in the same way as the actions_avail sysctl.
> +
> + It is important to note that the value of ``actions_logged`` does not
> + prevent certain actions from being logged when the audit subsystem is
> + configured to audit a task. If the action is not found in
> + ``actions_logged`` list, the final decision on whether to audit the
> + action for that task is ultimately left up to the audit subsystem to
> + decide for all seccomp return values other than ``SECCOMP_RET_ALLOW``.
> +
> + The ``allow`` string is not accepted in the ``actions_logged`` sysctl
> + as it is not possible to log ``SECCOMP_RET_ALLOW`` actions. Attempting
> + to write ``allow`` to the sysctl will result in an EINVAL being
> + returned.
> +
> Adding architecture support
> ===========================
>
> diff --git a/include/linux/audit.h b/include/linux/audit.h
> index 2150bdc..8c30f06 100644
> --- a/include/linux/audit.h
> +++ b/include/linux/audit.h
> @@ -314,11 +314,7 @@ void audit_core_dumps(long signr);
>
> static inline void audit_seccomp(unsigned long syscall, long signr, int code)
> {
> - if (!audit_enabled)
> - return;
> -
> - /* Force a record to be reported if a signal was delivered. */
> - if (signr || unlikely(!audit_dummy_context()))
> + if (audit_enabled && unlikely(!audit_dummy_context()))
> __audit_seccomp(syscall, signr, code);
> }
>
> diff --git a/kernel/seccomp.c b/kernel/seccomp.c
> index 6bff068..87257f2 100644
> --- a/kernel/seccomp.c
> +++ b/kernel/seccomp.c
> @@ -516,6 +516,52 @@ static void seccomp_send_sigsys(int syscall, int reason)
> }
> #endif /* CONFIG_SECCOMP_FILTER */
>
> +/* For use with seccomp_actions_logged */
> +#define SECCOMP_LOG_KILL (1 << 0)
> +#define SECCOMP_LOG_TRAP (1 << 2)
> +#define SECCOMP_LOG_ERRNO (1 << 3)
> +#define SECCOMP_LOG_TRACE (1 << 4)
> +#define SECCOMP_LOG_ALLOW (1 << 5)
> +
> +static u32 seccomp_actions_logged = SECCOMP_LOG_KILL | SECCOMP_LOG_TRAP |
> + SECCOMP_LOG_ERRNO | SECCOMP_LOG_TRACE;
> +
> +static inline void seccomp_log(unsigned long syscall, long signr, u32 action)
> +{
> + bool log;
> +
> + switch (action) {
> + case SECCOMP_RET_TRAP:
> + log = seccomp_actions_logged & SECCOMP_LOG_TRAP;
> + break;
> + case SECCOMP_RET_ERRNO:
> + log = seccomp_actions_logged & SECCOMP_LOG_ERRNO;
> + break;
> + case SECCOMP_RET_TRACE:
> + log = seccomp_actions_logged & SECCOMP_LOG_TRACE;
> + break;
> + case SECCOMP_RET_ALLOW:
> + log = false;
> + break;
> + case SECCOMP_RET_KILL:
> + default:
> + log = seccomp_actions_logged & SECCOMP_LOG_KILL;
> + }
> +
> + /*
> + * Force an audit message to be emitted when the action is allowed to
> + * be logged by the admin.
> + */
> + if (log)
> + return __audit_seccomp(syscall, signr, action);
At this point in the patch series, there's no filter-requested-logging
flag, so I think the above logic isn't needed until later in the
series (or rather, only RET_KILL should be checked).
> +
> + /*
> + * Let the audit subsystem decide if the action should be audited based
> + * on whether the current task itself is being audited.
> + */
> + return audit_seccomp(syscall, signr, action);
With audit_seccomp() being a single if, maybe it should just be
collapsed into this function?
if (log || (audit_enabled && unlikely(!audit_dummy_context()))
audit_seccomp(...)
I do like the change in name, though: this new function is correctly
named seccomp_log().
-Kees
--
Kees Cook
Pixel Security
On Fri, Jul 28, 2017 at 1:55 PM, Tyler Hicks <[email protected]> wrote:
> This patch creates a read-only sysctl containing an ordered list of
> seccomp actions that the kernel supports. The ordering, from left to
> right, is the lowest action value (kill) to the highest action value
> (allow). Currently, a read of the sysctl file would return "kill trap
> errno trace allow". The contents of this sysctl file can be useful for
> userspace code as well as the system administrator.
>
> The path to the sysctl is:
>
> /proc/sys/kernel/seccomp/actions_avail
>
> libseccomp and other userspace code can easily determine which actions
> the current kernel supports. The set of actions supported by the current
> kernel may be different than the set of action macros found in kernel
> headers that were installed where the userspace code was built.
>
> In addition, this sysctl will allow system administrators to know which
> actions are supported by the kernel and make it easier to configure
> exactly what seccomp logs through the audit subsystem. Support for this
> level of logging configuration will come in a future patch.
>
> Signed-off-by: Tyler Hicks <[email protected]>
> ---
>
> * Changes since v4:
> - move device_initcall() into CONFIG_SYSCTL ifdef
> - mark the seccomp_actions_avail string as const
> - adjust for new reStructuredText format
>
> Documentation/sysctl/kernel.txt | 1 +
> Documentation/userspace-api/seccomp_filter.rst | 16 ++++++++
> kernel/seccomp.c | 51 ++++++++++++++++++++++++++
> 3 files changed, 68 insertions(+)
>
> diff --git a/Documentation/sysctl/kernel.txt b/Documentation/sysctl/kernel.txt
> index bac23c1..995c42c 100644
> --- a/Documentation/sysctl/kernel.txt
> +++ b/Documentation/sysctl/kernel.txt
> @@ -74,6 +74,7 @@ show up in /proc/sys/kernel:
> - reboot-cmd [ SPARC only ]
> - rtsig-max
> - rtsig-nr
> +- seccomp/ ==> Documentation/userspace-api/seccomp_filter.rst
> - sem
> - sem_next_id [ sysv ipc ]
> - sg-big-buff [ generic SCSI device (sg) ]
> diff --git a/Documentation/userspace-api/seccomp_filter.rst b/Documentation/userspace-api/seccomp_filter.rst
> index f71eb5e..35fc7cb 100644
> --- a/Documentation/userspace-api/seccomp_filter.rst
> +++ b/Documentation/userspace-api/seccomp_filter.rst
> @@ -169,7 +169,23 @@ The ``samples/seccomp/`` directory contains both an x86-specific example
> and a more generic example of a higher level macro interface for BPF
> program generation.
>
> +Sysctls
> +=======
> +
> +Seccomp's sysctl files can be found in the ``/proc/sys/kernel/seccomp/``
> +directory. Here's a description of each file in that directory:
> +
> +``actions_avail``:
> + A read-only ordered list of seccomp return values (refer to the
> + ``SECCOMP_RET_*`` macros above) in string form. The ordering, from
> + left-to-right, is the least permissive return value to the most
> + permissive return value.
>
> + The list represents the set of seccomp return values supported
> + by the kernel. A userspace program may use this list to
> + determine if the actions found in the ``seccomp.h``, when the
> + program was built, differs from the set of actions actually
> + supported in the current running kernel.
>
> Adding architecture support
> ===========================
> diff --git a/kernel/seccomp.c b/kernel/seccomp.c
> index 98b59b5..6bff068 100644
> --- a/kernel/seccomp.c
> +++ b/kernel/seccomp.c
> @@ -17,11 +17,13 @@
> #include <linux/audit.h>
> #include <linux/compat.h>
> #include <linux/coredump.h>
> +#include <linux/kmemleak.h>
> #include <linux/sched.h>
> #include <linux/sched/task_stack.h>
> #include <linux/seccomp.h>
> #include <linux/slab.h>
> #include <linux/syscalls.h>
> +#include <linux/sysctl.h>
>
> #ifdef CONFIG_HAVE_ARCH_SECCOMP_FILTER
> #include <asm/syscall.h>
> @@ -922,3 +924,52 @@ long seccomp_get_filter(struct task_struct *task, unsigned long filter_off,
> return ret;
> }
> #endif
> +
> +#ifdef CONFIG_SYSCTL
> +
> +/* Human readable action names for friendly sysctl interaction */
> +#define SECCOMP_RET_KILL_NAME "kill"
> +#define SECCOMP_RET_TRAP_NAME "trap"
> +#define SECCOMP_RET_ERRNO_NAME "errno"
> +#define SECCOMP_RET_TRACE_NAME "trace"
> +#define SECCOMP_RET_ALLOW_NAME "allow"
> +
> +static const char seccomp_actions_avail[] = SECCOMP_RET_KILL_NAME " "
> + SECCOMP_RET_TRAP_NAME " "
> + SECCOMP_RET_ERRNO_NAME " "
> + SECCOMP_RET_TRACE_NAME " "
> + SECCOMP_RET_ALLOW_NAME;
> +
> +static struct ctl_path seccomp_sysctl_path[] = {
> + { .procname = "kernel", },
> + { .procname = "seccomp", },
> + { }
> +};
> +
> +static struct ctl_table seccomp_sysctl_table[] = {
> + {
> + .procname = "actions_avail",
> + .data = (void *) &seccomp_actions_avail,
> + .maxlen = sizeof(seccomp_actions_avail),
> + .mode = 0444,
> + .proc_handler = proc_dostring,
> + },
> + { }
> +};
> +
> +static int __init seccomp_sysctl_init(void)
> +{
> + struct ctl_table_header *hdr;
> +
> + hdr = register_sysctl_paths(seccomp_sysctl_path, seccomp_sysctl_table);
> + if (!hdr)
> + pr_warn("seccomp: sysctl registration failed\n");
> + else
> + kmemleak_not_leak(hdr);
> +
> + return 0;
> +}
> +
> +device_initcall(seccomp_sysctl_init)
> +
> +#endif /* CONFIG_SYSCTL */
Looks good. And for the record, the BPF return values, while not
checked in seccomp_check_filter(), are part of ABI and the kernel will
behave differently for unexpected values. For example, an older kernel
encountering the future SECCOMP_RET_LOG will treat it as
SECCOMP_RET_KILL since it's missing from the switch statement in
__seccomp_filter().
A question about patch ordering: should the new seccomp action
introspection patch maybe follow this one, so they're together in the
series (they provide the same information)?
-Kees
--
Kees Cook
Pixel Security
On Fri, Jul 28, 2017 at 1:55 PM, Tyler Hicks <[email protected]> wrote:
> Userspace code that needs to check if the kernel supports a given action
> may not be able to use the /proc/sys/kernel/seccomp/actions_avail
> sysctl. The process may be running in a sandbox and, therefore,
> sufficient filesystem access may not be available. This patch adds an
> operation to the seccomp(2) syscall that allows userspace code to ask
> the kernel if a given action is available.
>
> If the action is supported by the kernel, 0 is returned. If the action
> is not supported by the kernel, -1 is returned with errno set to
> -EOPNOTSUPP. If this check is attempted on a kernel that doesn't support
> this new operation, -1 is returned with errno set to -EINVAL meaning
> that userspace code will have the ability to differentiate between the
> two error cases.
>
> Signed-off-by: Tyler Hicks <[email protected]>
> ---
>
> * Changes since v4:
> - This is new patch to allow applications to check if an action is supported
> without having to consult the actions_avail sysctl
>
> include/uapi/linux/seccomp.h | 5 ++--
> kernel/seccomp.c | 26 +++++++++++++++++++
> tools/testing/selftests/seccomp/seccomp_bpf.c | 36 +++++++++++++++++++++++++++
> 3 files changed, 65 insertions(+), 2 deletions(-)
>
> diff --git a/include/uapi/linux/seccomp.h b/include/uapi/linux/seccomp.h
> index 82c823c..19a611d 100644
> --- a/include/uapi/linux/seccomp.h
> +++ b/include/uapi/linux/seccomp.h
> @@ -11,8 +11,9 @@
> #define SECCOMP_MODE_FILTER 2 /* uses user-supplied filter. */
>
> /* Valid operations for seccomp syscall. */
> -#define SECCOMP_SET_MODE_STRICT 0
> -#define SECCOMP_SET_MODE_FILTER 1
> +#define SECCOMP_SET_MODE_STRICT 0
> +#define SECCOMP_SET_MODE_FILTER 1
> +#define SECCOMP_GET_ACTION_AVAIL 2
>
> /* Valid flags for SECCOMP_SET_MODE_FILTER */
> #define SECCOMP_FILTER_FLAG_TSYNC 1
> diff --git a/kernel/seccomp.c b/kernel/seccomp.c
> index 1c4c496..03ad3ba 100644
> --- a/kernel/seccomp.c
> +++ b/kernel/seccomp.c
> @@ -858,6 +858,27 @@ static inline long seccomp_set_mode_filter(unsigned int flags,
> }
> #endif
>
> +static long seccomp_get_action_avail(const char __user *uaction)
> +{
> + u32 action;
> +
> + if (copy_from_user(&action, uaction, sizeof(action)))
> + return -EFAULT;
> +
> + switch (action) {
> + case SECCOMP_RET_KILL:
> + case SECCOMP_RET_TRAP:
> + case SECCOMP_RET_ERRNO:
> + case SECCOMP_RET_TRACE:
> + case SECCOMP_RET_ALLOW:
> + break;
> + default:
> + return -EOPNOTSUPP;
> + }
> +
> + return 0;
> +}
> +
> /* Common entry point for both prctl and syscall. */
> static long do_seccomp(unsigned int op, unsigned int flags,
> const char __user *uargs)
> @@ -869,6 +890,11 @@ static long do_seccomp(unsigned int op, unsigned int flags,
> return seccomp_set_mode_strict();
> case SECCOMP_SET_MODE_FILTER:
> return seccomp_set_mode_filter(flags, uargs);
> + case SECCOMP_GET_ACTION_AVAIL:
> + if (flags != 0)
> + return -EINVAL;
> +
> + return seccomp_get_action_avail(uargs);
> default:
> return -EINVAL;
> }
> diff --git a/tools/testing/selftests/seccomp/seccomp_bpf.c b/tools/testing/selftests/seccomp/seccomp_bpf.c
> index eeb4f7a..8f0872b 100644
> --- a/tools/testing/selftests/seccomp/seccomp_bpf.c
> +++ b/tools/testing/selftests/seccomp/seccomp_bpf.c
> @@ -1683,6 +1683,10 @@ TEST_F_SIGNAL(TRACE_syscall, kill_after_ptrace, SIGSYS)
> #define SECCOMP_SET_MODE_FILTER 1
> #endif
>
> +#ifndef SECCOMP_GET_ACTION_AVAIL
> +#define SECCOMP_GET_ACTION_AVAIL 2
> +#endif
> +
> #ifndef SECCOMP_FILTER_FLAG_TSYNC
> #define SECCOMP_FILTER_FLAG_TSYNC 1
> #endif
> @@ -2486,6 +2490,38 @@ TEST_SIGNAL(filter_flag_log, SIGSYS)
> EXPECT_EQ(0, syscall(__NR_getpid));
> }
>
> +TEST(get_action_avail)
> +{
> + __u32 actions[] = { SECCOMP_RET_KILL, SECCOMP_RET_TRAP,
> + SECCOMP_RET_ERRNO, SECCOMP_RET_TRACE,
> + SECCOMP_RET_ALLOW };
> + __u32 unknown_action = 0x10000000U;
> + int i;
> + long ret;
> +
> + ret = seccomp(SECCOMP_GET_ACTION_AVAIL, 0, &actions[0]);
> + ASSERT_NE(ENOSYS, errno) {
> + TH_LOG("Kernel does not support seccomp syscall!");
> + }
> + ASSERT_NE(EINVAL, errno) {
> + TH_LOG("Kernel does not support SECCOMP_GET_ACTION_AVAIL operation!");
> + }
> + EXPECT_EQ(ret, 0);
> +
> + for (i = 0; i < ARRAY_SIZE(actions); i++) {
> + ret = seccomp(SECCOMP_GET_ACTION_AVAIL, 0, &actions[i]);
> + EXPECT_EQ(ret, 0) {
> + TH_LOG("Expected action (0x%X) not available!",
> + actions[i]);
> + }
> + }
> +
> + /* Check that an unknown action is handled properly (EOPNOTSUPP) */
> + ret = seccomp(SECCOMP_GET_ACTION_AVAIL, 0, &unknown_action);
> + EXPECT_EQ(ret, -1);
> + EXPECT_EQ(errno, EOPNOTSUPP);
> +}
> +
> /*
> * TODO:
> * - add microbenchmarks
> --
> 2.7.4
>
I like this a lot. I think it should follow the sysctl patch in the
series, but otherwise looks great.
-Kees
--
Kees Cook
Pixel Security
On Fri, Jul 28, 2017 at 1:55 PM, Tyler Hicks <[email protected]> wrote:
> Add a new filter flag, SECCOMP_FILTER_FLAG_LOG, that enables logging for
> all actions except for SECCOMP_RET_ALLOW for the given filter.
>
> SECCOMP_RET_KILL actions are always logged, when "kill" is in the
> actions_logged sysctl, and SECCOMP_RET_ALLOW actions are never logged,
> regardless of this flag.
>
> This flag can be used to create noisy filters that result in all
> non-allowed actions to be logged. A process may have one noisy filter,
> which is loaded with this flag, as well as a quiet filter that's not
> loaded with this flag. This allows for the actions in a set of filters
> to be selectively conveyed to the admin.
>
> Since a system could have a large number of allocated seccomp_filter
> structs, struct packing was taken in consideration. On 64 bit x86, the
> new log member takes up one byte of an existing four byte hole in the
> struct. On 32 bit x86, the new log member creates a new four byte hole
> (unavoidable) and consumes one of those bytes.
Ah, good note about packing. I'll adjust my other patch to move into
the hole in 64-bit.
FWIW, I would have asked to extract the filter-assignment logic into a
separate patch (since it's a distinct logical piece), but since I've
already stolen it for the kill-process series, that's all done now. ;)
I think it's a good sign that both that and this need similar
functionality, so thank you for implementing that!
> Unfortunately, the tests added for SECCOMP_FILTER_FLAG_LOG are not
> capable of inspecting the audit log to verify that the actions taken in
> the filter were logged.
Strictly speaking, the test could probably do some horrible stuff to
access the kernel log, but yes, I'm fine with this just getting listed
in the test TODO.
>
> Signed-off-by: Tyler Hicks <[email protected]>
> ---
>
> * Changes since v4:
> - This is a new patch that allows the application to have a say in what gets
> logged
>
> include/linux/seccomp.h | 3 +-
> include/uapi/linux/seccomp.h | 1 +
> kernel/seccomp.c | 78 ++++++++++++++++-----------
> tools/testing/selftests/seccomp/seccomp_bpf.c | 66 +++++++++++++++++++++++
> 4 files changed, 116 insertions(+), 32 deletions(-)
>
> diff --git a/include/linux/seccomp.h b/include/linux/seccomp.h
> index ecc296c..c8bef43 100644
> --- a/include/linux/seccomp.h
> +++ b/include/linux/seccomp.h
> @@ -3,7 +3,8 @@
>
> #include <uapi/linux/seccomp.h>
>
> -#define SECCOMP_FILTER_FLAG_MASK (SECCOMP_FILTER_FLAG_TSYNC)
> +#define SECCOMP_FILTER_FLAG_MASK (SECCOMP_FILTER_FLAG_TSYNC | \
> + SECCOMP_FILTER_FLAG_LOG)
>
> #ifdef CONFIG_SECCOMP
>
> diff --git a/include/uapi/linux/seccomp.h b/include/uapi/linux/seccomp.h
> index 0f238a4..82c823c 100644
> --- a/include/uapi/linux/seccomp.h
> +++ b/include/uapi/linux/seccomp.h
> @@ -16,6 +16,7 @@
>
> /* Valid flags for SECCOMP_SET_MODE_FILTER */
> #define SECCOMP_FILTER_FLAG_TSYNC 1
> +#define SECCOMP_FILTER_FLAG_LOG 2
I can sort out the flag collision with ..._KILL_PROCESS.
>
> /*
> * All BPF programs must return a 32-bit value.
> diff --git a/kernel/seccomp.c b/kernel/seccomp.c
> index 87257f2..1c4c496 100644
> --- a/kernel/seccomp.c
> +++ b/kernel/seccomp.c
> @@ -44,6 +44,7 @@
> * get/put helpers should be used when accessing an instance
> * outside of a lifetime-guarded section. In general, this
> * is only needed for handling filters shared across tasks.
> + * @log: true if all actions except for SECCOMP_RET_ALLOW should be logged
> * @prev: points to a previously installed, or inherited, filter
> * @prog: the BPF program to evaluate
> *
> @@ -59,6 +60,7 @@
> */
> struct seccomp_filter {
> refcount_t usage;
> + bool log;
> struct seccomp_filter *prev;
> struct bpf_prog *prog;
> };
> @@ -172,11 +174,14 @@ static int seccomp_check_filter(struct sock_filter *filter, unsigned int flen)
>
> /**
> * seccomp_run_filters - evaluates all seccomp filters against @sd
> + * @filter: upon return, points to the matched filter but may be NULL in some
> + * unexpected situations
> * @sd: optional seccomp data to be passed to filters
> *
> * Returns valid seccomp BPF response codes.
> */
> -static u32 seccomp_run_filters(const struct seccomp_data *sd)
> +static u32 seccomp_run_filters(struct seccomp_filter **filter,
> + const struct seccomp_data *sd)
> {
> struct seccomp_data sd_local;
> u32 ret = SECCOMP_RET_ALLOW;
> @@ -184,6 +189,8 @@ static u32 seccomp_run_filters(const struct seccomp_data *sd)
> struct seccomp_filter *f =
> lockless_dereference(current->seccomp.filter);
>
> + *filter = f;
> +
> /* Ensure unexpected behavior doesn't result in failing open. */
> if (unlikely(WARN_ON(f == NULL)))
> return SECCOMP_RET_KILL;
> @@ -200,8 +207,10 @@ static u32 seccomp_run_filters(const struct seccomp_data *sd)
> for (; f; f = f->prev) {
> u32 cur_ret = BPF_PROG_RUN(f->prog, sd);
>
> - if ((cur_ret & SECCOMP_RET_ACTION) < (ret & SECCOMP_RET_ACTION))
> + if ((cur_ret & SECCOMP_RET_ACTION) < (ret & SECCOMP_RET_ACTION)) {
> ret = cur_ret;
> + *filter = f;
> + }
> }
> return ret;
> }
> @@ -446,6 +455,9 @@ static long seccomp_attach_filter(unsigned int flags,
> return ret;
> }
>
> + if (flags & SECCOMP_FILTER_FLAG_LOG)
> + filter->log = true;
> +
> /*
> * If there is an existing filter, make it the prev and don't drop its
> * task reference.
> @@ -526,36 +538,39 @@ static void seccomp_send_sigsys(int syscall, int reason)
> static u32 seccomp_actions_logged = SECCOMP_LOG_KILL | SECCOMP_LOG_TRAP |
> SECCOMP_LOG_ERRNO | SECCOMP_LOG_TRACE;
>
> -static inline void seccomp_log(unsigned long syscall, long signr, u32 action)
> +static inline void seccomp_log(unsigned long syscall, long signr, u32 action,
> + bool requested)
> {
> - bool log;
> + if (requested) {
> + bool log;
> +
> + switch (action) {
> + case SECCOMP_RET_TRAP:
> + log = seccomp_actions_logged & SECCOMP_LOG_TRAP;
> + break;
> + case SECCOMP_RET_ERRNO:
> + log = seccomp_actions_logged & SECCOMP_LOG_ERRNO;
> + break;
> + case SECCOMP_RET_TRACE:
> + log = seccomp_actions_logged & SECCOMP_LOG_TRACE;
> + break;
> + case SECCOMP_RET_ALLOW:
> + log = false;
> + break;
> + case SECCOMP_RET_KILL:
> + default:
> + log = seccomp_actions_logged & SECCOMP_LOG_KILL;
> + }
>
> - switch (action) {
> - case SECCOMP_RET_TRAP:
> - log = seccomp_actions_logged & SECCOMP_LOG_TRAP;
> - break;
> - case SECCOMP_RET_ERRNO:
> - log = seccomp_actions_logged & SECCOMP_LOG_ERRNO;
> - break;
> - case SECCOMP_RET_TRACE:
> - log = seccomp_actions_logged & SECCOMP_LOG_TRACE;
> - break;
> - case SECCOMP_RET_ALLOW:
> - log = false;
> - break;
> - case SECCOMP_RET_KILL:
> - default:
> - log = seccomp_actions_logged & SECCOMP_LOG_KILL;
> + /*
> + * Force an audit message to be emitted when the action is
> + * allowed to be logged by the admin.
> + */
> + if (log)
> + return __audit_seccomp(syscall, signr, action);
> }
>
> /*
> - * Force an audit message to be emitted when the action is allowed to
> - * be logged by the admin.
> - */
> - if (log)
> - return __audit_seccomp(syscall, signr, action);
> -
> - /*
> * Let the audit subsystem decide if the action should be audited based
> * on whether the current task itself is being audited.
> */
> @@ -587,7 +602,7 @@ static void __secure_computing_strict(int this_syscall)
> #ifdef SECCOMP_DEBUG
> dump_stack();
> #endif
> - seccomp_log(this_syscall, SIGKILL, SECCOMP_RET_KILL);
> + seccomp_log(this_syscall, SIGKILL, SECCOMP_RET_KILL, true);
I had to read these patches a few times to convince myself that the
logic was correct for RET_KILL logging, etc. As it stands, I'd kind of
like to rearrange the checks in seccomp_log() so the action stays at
the top-level, so it's slightly easier to scan for the logic of how an
action is logged. For example:
bool log = false;
switch (action) {
case SECCOMP_RET_ALLOW:
break;
case SECCOMP_RET_TRAP:
log = requested && seccomp_actions_logged & SECCOMP_LOG_TRAP;
break;
case SECCOMP_RET_ERRNO:
log = requested && seccomp_actions_logged & SECCOMP_LOG_ERRNO;
break;
case SECCOMP_RET_TRACE:
log = requested && seccomp_actions_logged & SECCOMP_LOG_TRACE;
break; case SECCOMP_RET_KILL:
default:
log = seccomp_actions_logged & SECCOMP_LOG_KILL;
i.e. ALLOW and KILL are clear about how they're special. ALLOW is
never logged, and KILL is only logged if admin wants it (which is the
default sysctl value).
> do_exit(SIGKILL);
> }
>
> @@ -613,6 +628,7 @@ void secure_computing_strict(int this_syscall)
> static int __seccomp_filter(int this_syscall, const struct seccomp_data *sd,
> const bool recheck_after_trace)
> {
> + struct seccomp_filter *filter;
Out of paranoia I set this to NULL by default in the extracted
filter-assignment patch.
> u32 filter_ret, action;
> int data;
>
> @@ -622,7 +638,7 @@ static int __seccomp_filter(int this_syscall, const struct seccomp_data *sd,
> */
> rmb();
>
> - filter_ret = seccomp_run_filters(sd);
> + filter_ret = seccomp_run_filters(&filter, sd);
> data = filter_ret & SECCOMP_RET_DATA;
> action = filter_ret & SECCOMP_RET_ACTION;
>
> @@ -690,7 +706,7 @@ static int __seccomp_filter(int this_syscall, const struct seccomp_data *sd,
>
> case SECCOMP_RET_KILL:
> default:
> - seccomp_log(this_syscall, SIGSYS, action);
> + seccomp_log(this_syscall, SIGSYS, action, true);
> /* Dump core only if this is the last remaining thread. */
> if (get_nr_threads(current) == 1) {
> siginfo_t info;
> @@ -707,7 +723,7 @@ static int __seccomp_filter(int this_syscall, const struct seccomp_data *sd,
> unreachable();
>
> skip:
> - seccomp_log(this_syscall, 0, action);
> + seccomp_log(this_syscall, 0, action, filter ? filter->log : false);
Yes, thanks for the double-check on the filter being non-NULL! :)
> return -1;
> }
> #else
> diff --git a/tools/testing/selftests/seccomp/seccomp_bpf.c b/tools/testing/selftests/seccomp/seccomp_bpf.c
> index 73f5ea6..eeb4f7a 100644
> --- a/tools/testing/selftests/seccomp/seccomp_bpf.c
> +++ b/tools/testing/selftests/seccomp/seccomp_bpf.c
> @@ -1687,6 +1687,10 @@ TEST_F_SIGNAL(TRACE_syscall, kill_after_ptrace, SIGSYS)
> #define SECCOMP_FILTER_FLAG_TSYNC 1
> #endif
>
> +#ifndef SECCOMP_FILTER_FLAG_LOG
> +#define SECCOMP_FILTER_FLAG_LOG 2
> +#endif
> +
> #ifndef seccomp
> int seccomp(unsigned int op, unsigned int flags, void *args)
> {
> @@ -2421,6 +2425,67 @@ TEST(syscall_restart)
> _metadata->passed = 0;
> }
>
> +TEST_SIGNAL(filter_flag_log, SIGSYS)
> +{
> + struct sock_filter allow_filter[] = {
> + BPF_STMT(BPF_RET|BPF_K, SECCOMP_RET_ALLOW),
> + };
> + struct sock_filter kill_filter[] = {
> + BPF_STMT(BPF_LD|BPF_W|BPF_ABS,
> + offsetof(struct seccomp_data, nr)),
> + BPF_JUMP(BPF_JMP|BPF_JEQ|BPF_K, __NR_getpid, 0, 1),
> + BPF_STMT(BPF_RET|BPF_K, SECCOMP_RET_KILL),
> + BPF_STMT(BPF_RET|BPF_K, SECCOMP_RET_ALLOW),
> + };
> + struct sock_fprog allow_prog = {
> + .len = (unsigned short)ARRAY_SIZE(allow_filter),
> + .filter = allow_filter,
> + };
> + struct sock_fprog kill_prog = {
> + .len = (unsigned short)ARRAY_SIZE(kill_filter),
> + .filter = kill_filter,
> + };
> + long ret;
> + pid_t parent = getppid();
> +
> + ret = prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0);
> + ASSERT_EQ(0, ret);
> +
> + /* Verify that the FILTER_FLAG_LOG flag isn't accepted in strict mode */
> + ret = seccomp(SECCOMP_SET_MODE_STRICT, SECCOMP_FILTER_FLAG_LOG,
> + &allow_prog);
> + ASSERT_NE(ENOSYS, errno) {
> + TH_LOG("Kernel does not support seccomp syscall!");
> + }
> + EXPECT_NE(0, ret) {
> + TH_LOG("Kernel accepted FILTER_FLAG_LOG flag in strict mode!");
> + }
> + EXPECT_EQ(EINVAL, errno) {
> + TH_LOG("Kernel returned unexpected errno for FILTER_FLAG_LOG flag in strict mode!");
> + }
> +
> + /* Verify that a simple, permissive filter can be added with no flags */
> + ret = seccomp(SECCOMP_SET_MODE_FILTER, 0, &allow_prog);
> + EXPECT_EQ(0, ret);
> +
> + /* See if the same filter can be added with the FILTER_FLAG_LOG flag */
> + ret = seccomp(SECCOMP_SET_MODE_FILTER, SECCOMP_FILTER_FLAG_LOG,
> + &allow_prog);
> + ASSERT_NE(EINVAL, errno) {
> + TH_LOG("Kernel does not support the FILTER_FLAG_LOG flag!");
> + }
> + EXPECT_EQ(0, ret);
> +
> + /* Ensure that the kill filter works with the FILTER_FLAG_LOG flag */
> + ret = seccomp(SECCOMP_SET_MODE_FILTER, SECCOMP_FILTER_FLAG_LOG,
> + &kill_prog);
> + EXPECT_EQ(0, ret);
> +
> + EXPECT_EQ(parent, syscall(__NR_getppid));
> + /* getpid() should never return. */
> + EXPECT_EQ(0, syscall(__NR_getpid));
> +}
> +
> /*
> * TODO:
> * - add microbenchmarks
> @@ -2429,6 +2494,7 @@ TEST(syscall_restart)
> * - endianness checking when appropriate
> * - 64-bit arg prodding
> * - arch value testing (x86 modes especially)
> + * - verify that FILTER_FLAG_LOG filters generate log messages
> * - ...
> */
>
> --
> 2.7.4
>
Test looks good, thanks!
-Kees
--
Kees Cook
Pixel Security
On Fri, Jul 28, 2017 at 1:55 PM, Tyler Hicks <[email protected]> wrote:
> Add a new action, SECCOMP_RET_LOG, that logs a syscall before allowing
> the syscall. At the implementation level, this action is identical to
> the existing SECCOMP_RET_ALLOW action. However, it can be very useful when
> initially developing a seccomp filter for an application. The developer
> can set the default action to be SECCOMP_RET_LOG, maybe mark any
> obviously needed syscalls with SECCOMP_RET_ALLOW, and then put the
> application through its paces. A list of syscalls that triggered the
> default action (SECCOMP_RET_LOG) can be easily gleaned from the logs and
> that list can be used to build the syscall whitelist. Finally, the
> developer can change the default action to the desired value.
>
> This provides a more friendly experience than seeing the application get
> killed, then updating the filter and rebuilding the app, seeing the
> application get killed due to a different syscall, then updating the
> filter and rebuilding the app, etc.
>
> The functionality is similar to what's supported by the various LSMs.
> SELinux has permissive mode, AppArmor has complain mode, SMACK has
> bring-up mode, etc.
>
> SECCOMP_RET_LOG is given a lower value than SECCOMP_RET_ALLOW as allow
> while logging is slightly more restrictive than quietly allowing.
>
> Unfortunately, the tests added for SECCOMP_RET_LOG are not capable of
> inspecting the audit log to verify that the syscall was logged.
>
> Signed-off-by: Tyler Hicks <[email protected]>
> ---
>
> * Change since v4:
> - folded the previously separate selftest patch into this patch
> - add TODO to verify that RET_LOG generates log messages in selftests
> - adjust for new reStructuredText format
>
> Documentation/userspace-api/seccomp_filter.rst | 6 ++
> include/uapi/linux/seccomp.h | 1 +
> kernel/seccomp.c | 17 ++++-
> tools/testing/selftests/seccomp/seccomp_bpf.c | 97 +++++++++++++++++++++++++-
> 4 files changed, 118 insertions(+), 3 deletions(-)
>
> diff --git a/Documentation/userspace-api/seccomp_filter.rst b/Documentation/userspace-api/seccomp_filter.rst
> index 2d1d8ab..2ece856 100644
> --- a/Documentation/userspace-api/seccomp_filter.rst
> +++ b/Documentation/userspace-api/seccomp_filter.rst
> @@ -141,6 +141,12 @@ In precedence order, they are:
> allow use of ptrace, even of other sandboxed processes, without
> extreme care; ptracers can use this mechanism to escape.)
>
> +``SECCOMP_RET_LOG``:
> + Results in the system call being executed after it is logged. This
> + should be used by application developers to learn which syscalls their
> + application needs without having to iterate through multiple test and
> + development cycles to build the list.
Perhaps this should note that it'll only be logged if the admin has
left the default sysctl value alone.
> +
> ``SECCOMP_RET_ALLOW``:
> Results in the system call being executed.
>
> diff --git a/include/uapi/linux/seccomp.h b/include/uapi/linux/seccomp.h
> index 19a611d..f944332 100644
> --- a/include/uapi/linux/seccomp.h
> +++ b/include/uapi/linux/seccomp.h
> @@ -31,6 +31,7 @@
> #define SECCOMP_RET_TRAP 0x00030000U /* disallow and force a SIGSYS */
> #define SECCOMP_RET_ERRNO 0x00050000U /* returns an errno */
> #define SECCOMP_RET_TRACE 0x7ff00000U /* pass to a tracer or disallow */
> +#define SECCOMP_RET_LOG 0x7ffc0000U /* allow after logging */
> #define SECCOMP_RET_ALLOW 0x7fff0000U /* allow */
>
> /* Masks for the return value sections. */
> diff --git a/kernel/seccomp.c b/kernel/seccomp.c
> index 03ad3ba..c12d3dd 100644
> --- a/kernel/seccomp.c
> +++ b/kernel/seccomp.c
> @@ -533,10 +533,12 @@ static void seccomp_send_sigsys(int syscall, int reason)
> #define SECCOMP_LOG_TRAP (1 << 2)
> #define SECCOMP_LOG_ERRNO (1 << 3)
> #define SECCOMP_LOG_TRACE (1 << 4)
> -#define SECCOMP_LOG_ALLOW (1 << 5)
> +#define SECCOMP_LOG_LOG (1 << 5)
> +#define SECCOMP_LOG_ALLOW (1 << 6)
>
> static u32 seccomp_actions_logged = SECCOMP_LOG_KILL | SECCOMP_LOG_TRAP |
> - SECCOMP_LOG_ERRNO | SECCOMP_LOG_TRACE;
> + SECCOMP_LOG_ERRNO | SECCOMP_LOG_TRACE |
> + SECCOMP_LOG_LOG;
>
> static inline void seccomp_log(unsigned long syscall, long signr, u32 action,
> bool requested)
> @@ -554,6 +556,9 @@ static inline void seccomp_log(unsigned long syscall, long signr, u32 action,
> case SECCOMP_RET_TRACE:
> log = seccomp_actions_logged & SECCOMP_LOG_TRACE;
> break;
> + case SECCOMP_RET_LOG:
> + log = seccomp_actions_logged & SECCOMP_LOG_LOG;
> + break;
> case SECCOMP_RET_ALLOW:
> log = false;
> break;
> @@ -701,6 +706,10 @@ static int __seccomp_filter(int this_syscall, const struct seccomp_data *sd,
>
> return 0;
>
> + case SECCOMP_RET_LOG:
> + seccomp_log(this_syscall, 0, action, true);
> + return 0;
> +
> case SECCOMP_RET_ALLOW:
> return 0;
>
> @@ -870,6 +879,7 @@ static long seccomp_get_action_avail(const char __user *uaction)
> case SECCOMP_RET_TRAP:
> case SECCOMP_RET_ERRNO:
> case SECCOMP_RET_TRACE:
> + case SECCOMP_RET_LOG:
> case SECCOMP_RET_ALLOW:
> break;
> default:
> @@ -1020,12 +1030,14 @@ long seccomp_get_filter(struct task_struct *task, unsigned long filter_off,
> #define SECCOMP_RET_TRAP_NAME "trap"
> #define SECCOMP_RET_ERRNO_NAME "errno"
> #define SECCOMP_RET_TRACE_NAME "trace"
> +#define SECCOMP_RET_LOG_NAME "log"
> #define SECCOMP_RET_ALLOW_NAME "allow"
>
> static const char seccomp_actions_avail[] = SECCOMP_RET_KILL_NAME " "
> SECCOMP_RET_TRAP_NAME " "
> SECCOMP_RET_ERRNO_NAME " "
> SECCOMP_RET_TRACE_NAME " "
> + SECCOMP_RET_LOG_NAME " "
> SECCOMP_RET_ALLOW_NAME;
>
> #define SECCOMP_ACTIONS_AVAIL_LEN strlen(seccomp_actions_avail)
> @@ -1040,6 +1052,7 @@ static const struct seccomp_log_name seccomp_log_names[] = {
> { SECCOMP_LOG_TRAP, SECCOMP_RET_TRAP_NAME },
> { SECCOMP_LOG_ERRNO, SECCOMP_RET_ERRNO_NAME },
> { SECCOMP_LOG_TRACE, SECCOMP_RET_TRACE_NAME },
> + { SECCOMP_LOG_LOG, SECCOMP_RET_LOG_NAME },
> { SECCOMP_LOG_ALLOW, SECCOMP_RET_ALLOW_NAME },
> { }
> };
> diff --git a/tools/testing/selftests/seccomp/seccomp_bpf.c b/tools/testing/selftests/seccomp/seccomp_bpf.c
> index 8f0872b..040e875 100644
> --- a/tools/testing/selftests/seccomp/seccomp_bpf.c
> +++ b/tools/testing/selftests/seccomp/seccomp_bpf.c
> @@ -87,6 +87,10 @@ struct seccomp_data {
> };
> #endif
>
> +#ifndef SECCOMP_RET_LOG
> +#define SECCOMP_RET_LOG 0x7ffc0000U /* allow after logging */
> +#endif
> +
> #if __BYTE_ORDER == __LITTLE_ENDIAN
> #define syscall_arg(_n) (offsetof(struct seccomp_data, args[_n]))
> #elif __BYTE_ORDER == __BIG_ENDIAN
> @@ -342,6 +346,28 @@ TEST(empty_prog)
> EXPECT_EQ(EINVAL, errno);
> }
>
> +TEST(log_all)
> +{
> + struct sock_filter filter[] = {
> + BPF_STMT(BPF_RET|BPF_K, SECCOMP_RET_LOG),
> + };
> + struct sock_fprog prog = {
> + .len = (unsigned short)ARRAY_SIZE(filter),
> + .filter = filter,
> + };
> + long ret;
> + pid_t parent = getppid();
> +
> + ret = prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0);
> + ASSERT_EQ(0, ret);
> +
> + ret = prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &prog);
> + ASSERT_EQ(0, ret);
> +
> + /* getppid() should succeed and be logged (no check for logging) */
> + EXPECT_EQ(parent, syscall(__NR_getppid));
> +}
> +
> TEST_SIGNAL(unknown_ret_is_kill_inside, SIGSYS)
> {
> struct sock_filter filter[] = {
> @@ -735,6 +761,7 @@ TEST_F(TRAP, handler)
>
> FIXTURE_DATA(precedence) {
> struct sock_fprog allow;
> + struct sock_fprog log;
> struct sock_fprog trace;
> struct sock_fprog error;
> struct sock_fprog trap;
> @@ -746,6 +773,13 @@ FIXTURE_SETUP(precedence)
> struct sock_filter allow_insns[] = {
> BPF_STMT(BPF_RET|BPF_K, SECCOMP_RET_ALLOW),
> };
> + struct sock_filter log_insns[] = {
> + BPF_STMT(BPF_LD|BPF_W|BPF_ABS,
> + offsetof(struct seccomp_data, nr)),
> + BPF_JUMP(BPF_JMP|BPF_JEQ|BPF_K, __NR_getpid, 1, 0),
> + BPF_STMT(BPF_RET|BPF_K, SECCOMP_RET_ALLOW),
> + BPF_STMT(BPF_RET|BPF_K, SECCOMP_RET_LOG),
> + };
> struct sock_filter trace_insns[] = {
> BPF_STMT(BPF_LD|BPF_W|BPF_ABS,
> offsetof(struct seccomp_data, nr)),
> @@ -782,6 +816,7 @@ FIXTURE_SETUP(precedence)
> memcpy(self->_x.filter, &_x##_insns, sizeof(_x##_insns)); \
> self->_x.len = (unsigned short)ARRAY_SIZE(_x##_insns)
> FILTER_ALLOC(allow);
> + FILTER_ALLOC(log);
> FILTER_ALLOC(trace);
> FILTER_ALLOC(error);
> FILTER_ALLOC(trap);
> @@ -792,6 +827,7 @@ FIXTURE_TEARDOWN(precedence)
> {
> #define FILTER_FREE(_x) if (self->_x.filter) free(self->_x.filter)
> FILTER_FREE(allow);
> + FILTER_FREE(log);
> FILTER_FREE(trace);
> FILTER_FREE(error);
> FILTER_FREE(trap);
> @@ -809,6 +845,8 @@ TEST_F(precedence, allow_ok)
>
> ret = prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &self->allow);
> ASSERT_EQ(0, ret);
> + ret = prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &self->log);
> + ASSERT_EQ(0, ret);
> ret = prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &self->trace);
> ASSERT_EQ(0, ret);
> ret = prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &self->error);
> @@ -833,6 +871,8 @@ TEST_F_SIGNAL(precedence, kill_is_highest, SIGSYS)
>
> ret = prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &self->allow);
> ASSERT_EQ(0, ret);
> + ret = prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &self->log);
> + ASSERT_EQ(0, ret);
> ret = prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &self->trace);
> ASSERT_EQ(0, ret);
> ret = prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &self->error);
> @@ -864,6 +904,8 @@ TEST_F_SIGNAL(precedence, kill_is_highest_in_any_order, SIGSYS)
> ASSERT_EQ(0, ret);
> ret = prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &self->error);
> ASSERT_EQ(0, ret);
> + ret = prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &self->log);
> + ASSERT_EQ(0, ret);
> ret = prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &self->trace);
> ASSERT_EQ(0, ret);
> ret = prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &self->trap);
> @@ -885,6 +927,8 @@ TEST_F_SIGNAL(precedence, trap_is_second, SIGSYS)
>
> ret = prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &self->allow);
> ASSERT_EQ(0, ret);
> + ret = prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &self->log);
> + ASSERT_EQ(0, ret);
> ret = prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &self->trace);
> ASSERT_EQ(0, ret);
> ret = prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &self->error);
> @@ -910,6 +954,8 @@ TEST_F_SIGNAL(precedence, trap_is_second_in_any_order, SIGSYS)
> ASSERT_EQ(0, ret);
> ret = prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &self->trap);
> ASSERT_EQ(0, ret);
> + ret = prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &self->log);
> + ASSERT_EQ(0, ret);
> ret = prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &self->trace);
> ASSERT_EQ(0, ret);
> ret = prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &self->error);
> @@ -931,6 +977,8 @@ TEST_F(precedence, errno_is_third)
>
> ret = prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &self->allow);
> ASSERT_EQ(0, ret);
> + ret = prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &self->log);
> + ASSERT_EQ(0, ret);
> ret = prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &self->trace);
> ASSERT_EQ(0, ret);
> ret = prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &self->error);
> @@ -949,6 +997,8 @@ TEST_F(precedence, errno_is_third_in_any_order)
> ret = prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0);
> ASSERT_EQ(0, ret);
>
> + ret = prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &self->log);
> + ASSERT_EQ(0, ret);
> ret = prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &self->error);
> ASSERT_EQ(0, ret);
> ret = prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &self->trace);
> @@ -971,6 +1021,8 @@ TEST_F(precedence, trace_is_fourth)
>
> ret = prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &self->allow);
> ASSERT_EQ(0, ret);
> + ret = prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &self->log);
> + ASSERT_EQ(0, ret);
> ret = prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &self->trace);
> ASSERT_EQ(0, ret);
> /* Should work just fine. */
> @@ -992,12 +1044,54 @@ TEST_F(precedence, trace_is_fourth_in_any_order)
> ASSERT_EQ(0, ret);
> ret = prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &self->allow);
> ASSERT_EQ(0, ret);
> + ret = prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &self->log);
> + ASSERT_EQ(0, ret);
> /* Should work just fine. */
> EXPECT_EQ(parent, syscall(__NR_getppid));
> /* No ptracer */
> EXPECT_EQ(-1, syscall(__NR_getpid));
> }
>
> +TEST_F(precedence, log_is_fifth)
> +{
> + pid_t mypid, parent;
> + long ret;
> +
> + mypid = getpid();
> + parent = getppid();
> + ret = prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0);
> + ASSERT_EQ(0, ret);
> +
> + ret = prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &self->allow);
> + ASSERT_EQ(0, ret);
> + ret = prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &self->log);
> + ASSERT_EQ(0, ret);
> + /* Should work just fine. */
> + EXPECT_EQ(parent, syscall(__NR_getppid));
> + /* Should also work just fine */
> + EXPECT_EQ(mypid, syscall(__NR_getpid));
> +}
> +
> +TEST_F(precedence, log_is_fifth_in_any_order)
> +{
> + pid_t mypid, parent;
> + long ret;
> +
> + mypid = getpid();
> + parent = getppid();
> + ret = prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0);
> + ASSERT_EQ(0, ret);
> +
> + ret = prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &self->log);
> + ASSERT_EQ(0, ret);
> + ret = prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &self->allow);
> + ASSERT_EQ(0, ret);
> + /* Should work just fine. */
> + EXPECT_EQ(parent, syscall(__NR_getppid));
> + /* Should also work just fine */
> + EXPECT_EQ(mypid, syscall(__NR_getpid));
> +}
> +
> #ifndef PTRACE_O_TRACESECCOMP
> #define PTRACE_O_TRACESECCOMP 0x00000080
> #endif
> @@ -2494,7 +2588,7 @@ TEST(get_action_avail)
> {
> __u32 actions[] = { SECCOMP_RET_KILL, SECCOMP_RET_TRAP,
> SECCOMP_RET_ERRNO, SECCOMP_RET_TRACE,
> - SECCOMP_RET_ALLOW };
> + SECCOMP_RET_LOG, SECCOMP_RET_ALLOW };
> __u32 unknown_action = 0x10000000U;
> int i;
> long ret;
> @@ -2531,6 +2625,7 @@ TEST(get_action_avail)
> * - 64-bit arg prodding
> * - arch value testing (x86 modes especially)
> * - verify that FILTER_FLAG_LOG filters generate log messages
> + * - verify that RET_LOG generates log messages
> * - ...
> */
>
> --
> 2.7.4
>
Test updates look great, thanks!
-Kees
--
Kees Cook
Pixel Security
On Fri, Jul 28, 2017 at 1:55 PM, Tyler Hicks <[email protected]> wrote:
> Userspace needs to be able to reliably detect the support of a filter
> flag. A good way of doing that is by attempting to enter filter mode,
> with the flag bit(s) in question set, and a NULL pointer for the args
> parameter of seccomp(2). EFAULT indicates that the flag is valid and
> EINVAL indicates that the flag is invalid.
>
> This patch adds a selftest that can be used to test this method of
> detection in userspace.
>
> Signed-off-by: Tyler Hicks <[email protected]>
> ---
>
> * Changes since v4:
> - This is a new patch
>
> tools/testing/selftests/seccomp/seccomp_bpf.c | 58 +++++++++++++++++++++++++++
> 1 file changed, 58 insertions(+)
>
> diff --git a/tools/testing/selftests/seccomp/seccomp_bpf.c b/tools/testing/selftests/seccomp/seccomp_bpf.c
> index 040e875..d221437 100644
> --- a/tools/testing/selftests/seccomp/seccomp_bpf.c
> +++ b/tools/testing/selftests/seccomp/seccomp_bpf.c
> @@ -1885,6 +1885,64 @@ TEST(seccomp_syscall_mode_lock)
> }
> }
>
> +/* Test detection of known and unknown filter flags. Userspace needs to be able
> + * to check if a filter flag is support by the current kernel and a good way of
> + * doing that is by attempting to enter filter mode, with the flag bit in
> + * question set, and a NULL pointer for the _args_ parameter. EFAULT indicates
> + * that the flag is valid and EINVAL indicates that the flag is invalid.
> + */
> +TEST(detect_seccomp_filter_flags)
> +{
> + unsigned int flags[] = { SECCOMP_FILTER_FLAG_TSYNC,
> + SECCOMP_FILTER_FLAG_LOG };
> + unsigned int flag, all_flags;
> + int i;
> + long ret;
> +
> + /* Test detection of known-good filter flags */
> + for (i = 0, all_flags = 0; i < ARRAY_SIZE(flags); i++) {
> + flag = flags[i];
> + ret = seccomp(SECCOMP_SET_MODE_FILTER, flag, NULL);
> + ASSERT_NE(ENOSYS, errno) {
> + TH_LOG("Kernel does not support seccomp syscall!");
> + }
> + EXPECT_EQ(-1, ret);
> + EXPECT_EQ(EFAULT, errno) {
> + TH_LOG("Failed to detect that a known-good filter flag (0x%X) is supported!",
> + flag);
> + }
> +
> + all_flags |= flag;
> + }
> +
> + /* Test detection of all known-good filter flags */
> + ret = seccomp(SECCOMP_SET_MODE_FILTER, all_flags, NULL);
> + EXPECT_EQ(-1, ret);
> + EXPECT_EQ(EFAULT, errno) {
> + TH_LOG("Failed to detect that all known-good filter flags (0x%X) are supported!",
> + all_flags);
> + }
> +
> + /* Test detection of an unknown filter flag */
> + flag = -1;
> + ret = seccomp(SECCOMP_SET_MODE_FILTER, flag, NULL);
> + EXPECT_EQ(-1, ret);
> + EXPECT_EQ(EINVAL, errno) {
> + TH_LOG("Failed to detect that an unknown filter flag (0x%X) is unsupported!",
> + flag);
> + }
> +
> + /* Test detection of an unknown filter flag that may simply need to be
> + * added to this test */
> + flag = flags[ARRAY_SIZE(flags) - 1] << 1;
> + ret = seccomp(SECCOMP_SET_MODE_FILTER, flag, NULL);
> + EXPECT_EQ(-1, ret);
> + EXPECT_EQ(EINVAL, errno) {
> + TH_LOG("Failed to detect that an unknown filter flag (0x%X) is unsupported! Does a new flag need to be added to this test?",
> + flag);
> + }
> +}
> +
> TEST(TSYNC_first)
> {
> struct sock_filter filter[] = {
> --
> 2.7.4
>
This is good, yes. Can you actually move it earlier in the series, so
it will pass before adding ..._FLAG_LOG, and then the patch adding
..._FLAG_LOG will add it to this test too?
Thanks!
-Kees
--
Kees Cook
Pixel Security
On 08/03/2017 11:37 AM, Kees Cook wrote:
> On Fri, Jul 28, 2017 at 1:55 PM, Tyler Hicks <[email protected]> wrote:
>> This patch creates a read-only sysctl containing an ordered list of
>> seccomp actions that the kernel supports. The ordering, from left to
>> right, is the lowest action value (kill) to the highest action value
>> (allow). Currently, a read of the sysctl file would return "kill trap
>> errno trace allow". The contents of this sysctl file can be useful for
>> userspace code as well as the system administrator.
>>
>> The path to the sysctl is:
>>
>> /proc/sys/kernel/seccomp/actions_avail
>>
>> libseccomp and other userspace code can easily determine which actions
>> the current kernel supports. The set of actions supported by the current
>> kernel may be different than the set of action macros found in kernel
>> headers that were installed where the userspace code was built.
>>
>> In addition, this sysctl will allow system administrators to know which
>> actions are supported by the kernel and make it easier to configure
>> exactly what seccomp logs through the audit subsystem. Support for this
>> level of logging configuration will come in a future patch.
>>
>> Signed-off-by: Tyler Hicks <[email protected]>
>> ---
>>
>> * Changes since v4:
>> - move device_initcall() into CONFIG_SYSCTL ifdef
>> - mark the seccomp_actions_avail string as const
>> - adjust for new reStructuredText format
>>
>> Documentation/sysctl/kernel.txt | 1 +
>> Documentation/userspace-api/seccomp_filter.rst | 16 ++++++++
>> kernel/seccomp.c | 51 ++++++++++++++++++++++++++
>> 3 files changed, 68 insertions(+)
>>
>> diff --git a/Documentation/sysctl/kernel.txt b/Documentation/sysctl/kernel.txt
>> index bac23c1..995c42c 100644
>> --- a/Documentation/sysctl/kernel.txt
>> +++ b/Documentation/sysctl/kernel.txt
>> @@ -74,6 +74,7 @@ show up in /proc/sys/kernel:
>> - reboot-cmd [ SPARC only ]
>> - rtsig-max
>> - rtsig-nr
>> +- seccomp/ ==> Documentation/userspace-api/seccomp_filter.rst
>> - sem
>> - sem_next_id [ sysv ipc ]
>> - sg-big-buff [ generic SCSI device (sg) ]
>> diff --git a/Documentation/userspace-api/seccomp_filter.rst b/Documentation/userspace-api/seccomp_filter.rst
>> index f71eb5e..35fc7cb 100644
>> --- a/Documentation/userspace-api/seccomp_filter.rst
>> +++ b/Documentation/userspace-api/seccomp_filter.rst
>> @@ -169,7 +169,23 @@ The ``samples/seccomp/`` directory contains both an x86-specific example
>> and a more generic example of a higher level macro interface for BPF
>> program generation.
>>
>> +Sysctls
>> +=======
>> +
>> +Seccomp's sysctl files can be found in the ``/proc/sys/kernel/seccomp/``
>> +directory. Here's a description of each file in that directory:
>> +
>> +``actions_avail``:
>> + A read-only ordered list of seccomp return values (refer to the
>> + ``SECCOMP_RET_*`` macros above) in string form. The ordering, from
>> + left-to-right, is the least permissive return value to the most
>> + permissive return value.
>>
>> + The list represents the set of seccomp return values supported
>> + by the kernel. A userspace program may use this list to
>> + determine if the actions found in the ``seccomp.h``, when the
>> + program was built, differs from the set of actions actually
>> + supported in the current running kernel.
>>
>> Adding architecture support
>> ===========================
>> diff --git a/kernel/seccomp.c b/kernel/seccomp.c
>> index 98b59b5..6bff068 100644
>> --- a/kernel/seccomp.c
>> +++ b/kernel/seccomp.c
>> @@ -17,11 +17,13 @@
>> #include <linux/audit.h>
>> #include <linux/compat.h>
>> #include <linux/coredump.h>
>> +#include <linux/kmemleak.h>
>> #include <linux/sched.h>
>> #include <linux/sched/task_stack.h>
>> #include <linux/seccomp.h>
>> #include <linux/slab.h>
>> #include <linux/syscalls.h>
>> +#include <linux/sysctl.h>
>>
>> #ifdef CONFIG_HAVE_ARCH_SECCOMP_FILTER
>> #include <asm/syscall.h>
>> @@ -922,3 +924,52 @@ long seccomp_get_filter(struct task_struct *task, unsigned long filter_off,
>> return ret;
>> }
>> #endif
>> +
>> +#ifdef CONFIG_SYSCTL
>> +
>> +/* Human readable action names for friendly sysctl interaction */
>> +#define SECCOMP_RET_KILL_NAME "kill"
>> +#define SECCOMP_RET_TRAP_NAME "trap"
>> +#define SECCOMP_RET_ERRNO_NAME "errno"
>> +#define SECCOMP_RET_TRACE_NAME "trace"
>> +#define SECCOMP_RET_ALLOW_NAME "allow"
>> +
>> +static const char seccomp_actions_avail[] = SECCOMP_RET_KILL_NAME " "
>> + SECCOMP_RET_TRAP_NAME " "
>> + SECCOMP_RET_ERRNO_NAME " "
>> + SECCOMP_RET_TRACE_NAME " "
>> + SECCOMP_RET_ALLOW_NAME;
>> +
>> +static struct ctl_path seccomp_sysctl_path[] = {
>> + { .procname = "kernel", },
>> + { .procname = "seccomp", },
>> + { }
>> +};
>> +
>> +static struct ctl_table seccomp_sysctl_table[] = {
>> + {
>> + .procname = "actions_avail",
>> + .data = (void *) &seccomp_actions_avail,
>> + .maxlen = sizeof(seccomp_actions_avail),
>> + .mode = 0444,
>> + .proc_handler = proc_dostring,
>> + },
>> + { }
>> +};
>> +
>> +static int __init seccomp_sysctl_init(void)
>> +{
>> + struct ctl_table_header *hdr;
>> +
>> + hdr = register_sysctl_paths(seccomp_sysctl_path, seccomp_sysctl_table);
>> + if (!hdr)
>> + pr_warn("seccomp: sysctl registration failed\n");
>> + else
>> + kmemleak_not_leak(hdr);
>> +
>> + return 0;
>> +}
>> +
>> +device_initcall(seccomp_sysctl_init)
>> +
>> +#endif /* CONFIG_SYSCTL */
>
> Looks good. And for the record, the BPF return values, while not
> checked in seccomp_check_filter(), are part of ABI and the kernel will
> behave differently for unexpected values. For example, an older kernel
> encountering the future SECCOMP_RET_LOG will treat it as
> SECCOMP_RET_KILL since it's missing from the switch statement in
> __seccomp_filter().
>
> A question about patch ordering: should the new seccomp action
> introspection patch maybe follow this one, so they're together in the
> series (they provide the same information)?
That would be fine. I'll move it to patch #2.
Tyler
>
> -Kees
>
On 08/03/2017 11:33 AM, Kees Cook wrote:
> On Fri, Jul 28, 2017 at 1:55 PM, Tyler Hicks <[email protected]> wrote:
>> Adminstrators can write to this sysctl to set the seccomp actions that
>> are allowed to be logged. Any actions not found in this sysctl will not
>> be logged.
>>
>> For example, all SECCOMP_RET_KILL, SECCOMP_RET_TRAP, and
>> SECCOMP_RET_ERRNO actions would be loggable if "kill trap errno" were
>> written to the sysctl. SECCOMP_RET_TRACE actions would not be logged
>> since its string representation ("trace") wasn't present in the sysctl
>> value.
>
> Just to make sure I'm clear on this, the key word above is "loggable",
> in that filters requesting logging will be seen.
>
> i.e. at the end of the series, the final state of "will it be logged?" is:
>
> if action==RET_ALLOW:
> do not log
> else if action==RET_KILL || audit-enabled:
> log
> else if filter-requests-logging && action in actions_logged:
> log
> else:
> do not log
Not quite. You didn't mention RET_LOG, RET_KILL (and RET_LOG) can be
quieted by the admin, and the audit behavior is different. It is like this:
if action == RET_ALLOW:
do not log
else if action == RET_KILL && RET_KILL in actions_logged:
log
else if action == RET_LOG && RET_LOG in actions_logged:
log
else if filter-requests-logging && action in actions_logged:
log
else if audit_enabled && process-is-being-audited:
log
else:
do not log
Writing that up made me realize that there is a behavior change that my
patch set introduces when the process is being audited. Before my patch
set, this was the behavior:
...
else if action == RET_KILL && audit_enabled && process-is-being-audited:
log
...
Now it is:
...
else if audit_enabled && process-is-being-audited:
log
...
Should I go back to only logging RET_KILL actions in that situation?
>
>> The path to the sysctl is:
>>
>> /proc/sys/kernel/seccomp/actions_logged
>>
>> The actions_avail sysctl can be read to discover the valid action names
>> that can be written to the actions_logged sysctl with the exception of
>> SECCOMP_RET_ALLOW. It cannot be configured for logging.
>>
>> The default setting for the sysctl is to allow all actions to be logged
>> except SECCOMP_RET_ALLOW.
>>
>> There's one important exception to this sysctl. If a task is
>> specifically being audited, meaning that an audit context has been
>> allocated for the task, seccomp will log all actions other than
>> SECCOMP_RET_ALLOW despite the value of actions_logged. This exception
>> preserves the existing auditing behavior of tasks with an allocated
>> audit context.
>>
>> Signed-off-by: Tyler Hicks <[email protected]>
>> ---
>>
>> * Changes since v4:
>> - the sysctl is now a list of actions that are allowed by the admin to be
>> logged rather than a list of actions that should be logged
>> + a follow up patch will let applications have a say in what should be
>> logged but the admin has the final say with this sysctl
>> + RET_ALLOW cannot be allowed to be logged
>> - fix comment style
>> - mark the seccomp_action_names array as const
>> - adjust for new reStructuredText format
>>
>> Documentation/userspace-api/seccomp_filter.rst | 18 +++
>> include/linux/audit.h | 6 +-
>> kernel/seccomp.c | 180 ++++++++++++++++++++++++-
>> 3 files changed, 196 insertions(+), 8 deletions(-)
>>
>> diff --git a/Documentation/userspace-api/seccomp_filter.rst b/Documentation/userspace-api/seccomp_filter.rst
>> index 35fc7cb..2d1d8ab 100644
>> --- a/Documentation/userspace-api/seccomp_filter.rst
>> +++ b/Documentation/userspace-api/seccomp_filter.rst
>> @@ -187,6 +187,24 @@ directory. Here's a description of each file in that directory:
>> program was built, differs from the set of actions actually
>> supported in the current running kernel.
>>
>> +``actions_logged``:
>> + A read-write ordered list of seccomp return values (refer to the
>> + ``SECCOMP_RET_*`` macros above) that are allowed to be logged. Writes
>> + to the file do not need to be in ordered form but reads from the file
>> + will be ordered in the same way as the actions_avail sysctl.
>> +
>> + It is important to note that the value of ``actions_logged`` does not
>> + prevent certain actions from being logged when the audit subsystem is
>> + configured to audit a task. If the action is not found in
>> + ``actions_logged`` list, the final decision on whether to audit the
>> + action for that task is ultimately left up to the audit subsystem to
>> + decide for all seccomp return values other than ``SECCOMP_RET_ALLOW``.
>> +
>> + The ``allow`` string is not accepted in the ``actions_logged`` sysctl
>> + as it is not possible to log ``SECCOMP_RET_ALLOW`` actions. Attempting
>> + to write ``allow`` to the sysctl will result in an EINVAL being
>> + returned.
>> +
>> Adding architecture support
>> ===========================
>>
>> diff --git a/include/linux/audit.h b/include/linux/audit.h
>> index 2150bdc..8c30f06 100644
>> --- a/include/linux/audit.h
>> +++ b/include/linux/audit.h
>> @@ -314,11 +314,7 @@ void audit_core_dumps(long signr);
>>
>> static inline void audit_seccomp(unsigned long syscall, long signr, int code)
>> {
>> - if (!audit_enabled)
>> - return;
>> -
>> - /* Force a record to be reported if a signal was delivered. */
>> - if (signr || unlikely(!audit_dummy_context()))
>> + if (audit_enabled && unlikely(!audit_dummy_context()))
>> __audit_seccomp(syscall, signr, code);
>> }
>>
>> diff --git a/kernel/seccomp.c b/kernel/seccomp.c
>> index 6bff068..87257f2 100644
>> --- a/kernel/seccomp.c
>> +++ b/kernel/seccomp.c
>> @@ -516,6 +516,52 @@ static void seccomp_send_sigsys(int syscall, int reason)
>> }
>> #endif /* CONFIG_SECCOMP_FILTER */
>>
>> +/* For use with seccomp_actions_logged */
>> +#define SECCOMP_LOG_KILL (1 << 0)
>> +#define SECCOMP_LOG_TRAP (1 << 2)
>> +#define SECCOMP_LOG_ERRNO (1 << 3)
>> +#define SECCOMP_LOG_TRACE (1 << 4)
>> +#define SECCOMP_LOG_ALLOW (1 << 5)
>> +
>> +static u32 seccomp_actions_logged = SECCOMP_LOG_KILL | SECCOMP_LOG_TRAP |
>> + SECCOMP_LOG_ERRNO | SECCOMP_LOG_TRACE;
>> +
>> +static inline void seccomp_log(unsigned long syscall, long signr, u32 action)
>> +{
>> + bool log;
>> +
>> + switch (action) {
>> + case SECCOMP_RET_TRAP:
>> + log = seccomp_actions_logged & SECCOMP_LOG_TRAP;
>> + break;
>> + case SECCOMP_RET_ERRNO:
>> + log = seccomp_actions_logged & SECCOMP_LOG_ERRNO;
>> + break;
>> + case SECCOMP_RET_TRACE:
>> + log = seccomp_actions_logged & SECCOMP_LOG_TRACE;
>> + break;
>> + case SECCOMP_RET_ALLOW:
>> + log = false;
>> + break;
>> + case SECCOMP_RET_KILL:
>> + default:
>> + log = seccomp_actions_logged & SECCOMP_LOG_KILL;
>> + }
>> +
>> + /*
>> + * Force an audit message to be emitted when the action is allowed to
>> + * be logged by the admin.
>> + */
>> + if (log)
>> + return __audit_seccomp(syscall, signr, action);
>
> At this point in the patch series, there's no filter-requested-logging
> flag, so I think the above logic isn't needed until later in the
> series (or rather, only RET_KILL should be checked).
Hmmm... you're right. This slipped in since the sysctl's purpose morphed
from configuring what actions should be logged to configuring what
actions are allowed to be logged.
>
>> +
>> + /*
>> + * Let the audit subsystem decide if the action should be audited based
>> + * on whether the current task itself is being audited.
>> + */
>> + return audit_seccomp(syscall, signr, action);
>
> With audit_seccomp() being a single if, maybe it should just be
> collapsed into this function?
>
> if (log || (audit_enabled && unlikely(!audit_dummy_context()))
> audit_seccomp(...)
I think that would be fine. Unless you say otherwise, I'll also rename
__audit_seccomp() to audit_seccomp() after doing that.
Tyler
>
> I do like the change in name, though: this new function is correctly
> named seccomp_log().
>
> -Kees
>
On 08/03/2017 11:51 AM, Kees Cook wrote:
> On Fri, Jul 28, 2017 at 1:55 PM, Tyler Hicks <[email protected]> wrote:
>> Add a new filter flag, SECCOMP_FILTER_FLAG_LOG, that enables logging for
>> all actions except for SECCOMP_RET_ALLOW for the given filter.
>>
>> SECCOMP_RET_KILL actions are always logged, when "kill" is in the
>> actions_logged sysctl, and SECCOMP_RET_ALLOW actions are never logged,
>> regardless of this flag.
>>
>> This flag can be used to create noisy filters that result in all
>> non-allowed actions to be logged. A process may have one noisy filter,
>> which is loaded with this flag, as well as a quiet filter that's not
>> loaded with this flag. This allows for the actions in a set of filters
>> to be selectively conveyed to the admin.
>>
>> Since a system could have a large number of allocated seccomp_filter
>> structs, struct packing was taken in consideration. On 64 bit x86, the
>> new log member takes up one byte of an existing four byte hole in the
>> struct. On 32 bit x86, the new log member creates a new four byte hole
>> (unavoidable) and consumes one of those bytes.
>
> Ah, good note about packing. I'll adjust my other patch to move into
> the hole in 64-bit.
>
> FWIW, I would have asked to extract the filter-assignment logic into a
> separate patch (since it's a distinct logical piece), but since I've
> already stolen it for the kill-process series, that's all done now. ;)
> I think it's a good sign that both that and this need similar
> functionality, so thank you for implementing that!
>
>> Unfortunately, the tests added for SECCOMP_FILTER_FLAG_LOG are not
>> capable of inspecting the audit log to verify that the actions taken in
>> the filter were logged.
>
> Strictly speaking, the test could probably do some horrible stuff to
> access the kernel log, but yes, I'm fine with this just getting listed
> in the test TODO.
>
>>
>> Signed-off-by: Tyler Hicks <[email protected]>
>> ---
>>
>> * Changes since v4:
>> - This is a new patch that allows the application to have a say in what gets
>> logged
>>
>> include/linux/seccomp.h | 3 +-
>> include/uapi/linux/seccomp.h | 1 +
>> kernel/seccomp.c | 78 ++++++++++++++++-----------
>> tools/testing/selftests/seccomp/seccomp_bpf.c | 66 +++++++++++++++++++++++
>> 4 files changed, 116 insertions(+), 32 deletions(-)
>>
>> diff --git a/include/linux/seccomp.h b/include/linux/seccomp.h
>> index ecc296c..c8bef43 100644
>> --- a/include/linux/seccomp.h
>> +++ b/include/linux/seccomp.h
>> @@ -3,7 +3,8 @@
>>
>> #include <uapi/linux/seccomp.h>
>>
>> -#define SECCOMP_FILTER_FLAG_MASK (SECCOMP_FILTER_FLAG_TSYNC)
>> +#define SECCOMP_FILTER_FLAG_MASK (SECCOMP_FILTER_FLAG_TSYNC | \
>> + SECCOMP_FILTER_FLAG_LOG)
>>
>> #ifdef CONFIG_SECCOMP
>>
>> diff --git a/include/uapi/linux/seccomp.h b/include/uapi/linux/seccomp.h
>> index 0f238a4..82c823c 100644
>> --- a/include/uapi/linux/seccomp.h
>> +++ b/include/uapi/linux/seccomp.h
>> @@ -16,6 +16,7 @@
>>
>> /* Valid flags for SECCOMP_SET_MODE_FILTER */
>> #define SECCOMP_FILTER_FLAG_TSYNC 1
>> +#define SECCOMP_FILTER_FLAG_LOG 2
>
> I can sort out the flag collision with ..._KILL_PROCESS.
>
>>
>> /*
>> * All BPF programs must return a 32-bit value.
>> diff --git a/kernel/seccomp.c b/kernel/seccomp.c
>> index 87257f2..1c4c496 100644
>> --- a/kernel/seccomp.c
>> +++ b/kernel/seccomp.c
>> @@ -44,6 +44,7 @@
>> * get/put helpers should be used when accessing an instance
>> * outside of a lifetime-guarded section. In general, this
>> * is only needed for handling filters shared across tasks.
>> + * @log: true if all actions except for SECCOMP_RET_ALLOW should be logged
>> * @prev: points to a previously installed, or inherited, filter
>> * @prog: the BPF program to evaluate
>> *
>> @@ -59,6 +60,7 @@
>> */
>> struct seccomp_filter {
>> refcount_t usage;
>> + bool log;
>> struct seccomp_filter *prev;
>> struct bpf_prog *prog;
>> };
>> @@ -172,11 +174,14 @@ static int seccomp_check_filter(struct sock_filter *filter, unsigned int flen)
>>
>> /**
>> * seccomp_run_filters - evaluates all seccomp filters against @sd
>> + * @filter: upon return, points to the matched filter but may be NULL in some
>> + * unexpected situations
>> * @sd: optional seccomp data to be passed to filters
>> *
>> * Returns valid seccomp BPF response codes.
>> */
>> -static u32 seccomp_run_filters(const struct seccomp_data *sd)
>> +static u32 seccomp_run_filters(struct seccomp_filter **filter,
>> + const struct seccomp_data *sd)
>> {
>> struct seccomp_data sd_local;
>> u32 ret = SECCOMP_RET_ALLOW;
>> @@ -184,6 +189,8 @@ static u32 seccomp_run_filters(const struct seccomp_data *sd)
>> struct seccomp_filter *f =
>> lockless_dereference(current->seccomp.filter);
>>
>> + *filter = f;
>> +
>> /* Ensure unexpected behavior doesn't result in failing open. */
>> if (unlikely(WARN_ON(f == NULL)))
>> return SECCOMP_RET_KILL;
>> @@ -200,8 +207,10 @@ static u32 seccomp_run_filters(const struct seccomp_data *sd)
>> for (; f; f = f->prev) {
>> u32 cur_ret = BPF_PROG_RUN(f->prog, sd);
>>
>> - if ((cur_ret & SECCOMP_RET_ACTION) < (ret & SECCOMP_RET_ACTION))
>> + if ((cur_ret & SECCOMP_RET_ACTION) < (ret & SECCOMP_RET_ACTION)) {
>> ret = cur_ret;
>> + *filter = f;
>> + }
>> }
>> return ret;
>> }
>> @@ -446,6 +455,9 @@ static long seccomp_attach_filter(unsigned int flags,
>> return ret;
>> }
>>
>> + if (flags & SECCOMP_FILTER_FLAG_LOG)
>> + filter->log = true;
>> +
>> /*
>> * If there is an existing filter, make it the prev and don't drop its
>> * task reference.
>> @@ -526,36 +538,39 @@ static void seccomp_send_sigsys(int syscall, int reason)
>> static u32 seccomp_actions_logged = SECCOMP_LOG_KILL | SECCOMP_LOG_TRAP |
>> SECCOMP_LOG_ERRNO | SECCOMP_LOG_TRACE;
>>
>> -static inline void seccomp_log(unsigned long syscall, long signr, u32 action)
>> +static inline void seccomp_log(unsigned long syscall, long signr, u32 action,
>> + bool requested)
>> {
>> - bool log;
>> + if (requested) {
>> + bool log;
>> +
>> + switch (action) {
>> + case SECCOMP_RET_TRAP:
>> + log = seccomp_actions_logged & SECCOMP_LOG_TRAP;
>> + break;
>> + case SECCOMP_RET_ERRNO:
>> + log = seccomp_actions_logged & SECCOMP_LOG_ERRNO;
>> + break;
>> + case SECCOMP_RET_TRACE:
>> + log = seccomp_actions_logged & SECCOMP_LOG_TRACE;
>> + break;
>> + case SECCOMP_RET_ALLOW:
>> + log = false;
>> + break;
>> + case SECCOMP_RET_KILL:
>> + default:
>> + log = seccomp_actions_logged & SECCOMP_LOG_KILL;
>> + }
>>
>> - switch (action) {
>> - case SECCOMP_RET_TRAP:
>> - log = seccomp_actions_logged & SECCOMP_LOG_TRAP;
>> - break;
>> - case SECCOMP_RET_ERRNO:
>> - log = seccomp_actions_logged & SECCOMP_LOG_ERRNO;
>> - break;
>> - case SECCOMP_RET_TRACE:
>> - log = seccomp_actions_logged & SECCOMP_LOG_TRACE;
>> - break;
>> - case SECCOMP_RET_ALLOW:
>> - log = false;
>> - break;
>> - case SECCOMP_RET_KILL:
>> - default:
>> - log = seccomp_actions_logged & SECCOMP_LOG_KILL;
>> + /*
>> + * Force an audit message to be emitted when the action is
>> + * allowed to be logged by the admin.
>> + */
>> + if (log)
>> + return __audit_seccomp(syscall, signr, action);
>> }
>>
>> /*
>> - * Force an audit message to be emitted when the action is allowed to
>> - * be logged by the admin.
>> - */
>> - if (log)
>> - return __audit_seccomp(syscall, signr, action);
>> -
>> - /*
>> * Let the audit subsystem decide if the action should be audited based
>> * on whether the current task itself is being audited.
>> */
>> @@ -587,7 +602,7 @@ static void __secure_computing_strict(int this_syscall)
>> #ifdef SECCOMP_DEBUG
>> dump_stack();
>> #endif
>> - seccomp_log(this_syscall, SIGKILL, SECCOMP_RET_KILL);
>> + seccomp_log(this_syscall, SIGKILL, SECCOMP_RET_KILL, true);
>
> I had to read these patches a few times to convince myself that the
> logic was correct for RET_KILL logging, etc. As it stands, I'd kind of
> like to rearrange the checks in seccomp_log() so the action stays at
> the top-level, so it's slightly easier to scan for the logic of how an
> action is logged. For example:
>
> bool log = false;
>
> switch (action) {
> case SECCOMP_RET_ALLOW:
> break;
> case SECCOMP_RET_TRAP:
> log = requested && seccomp_actions_logged & SECCOMP_LOG_TRAP;
> break;
> case SECCOMP_RET_ERRNO:
> log = requested && seccomp_actions_logged & SECCOMP_LOG_ERRNO;
> break;
> case SECCOMP_RET_TRACE:
> log = requested && seccomp_actions_logged & SECCOMP_LOG_TRACE;
> break; case SECCOMP_RET_KILL:
> default:
> log = seccomp_actions_logged & SECCOMP_LOG_KILL;
>
> i.e. ALLOW and KILL are clear about how they're special. ALLOW is
> never logged, and KILL is only logged if admin wants it (which is the
> default sysctl value).
I will make these changes.
>
>
>
>> do_exit(SIGKILL);
>> }
>>
>> @@ -613,6 +628,7 @@ void secure_computing_strict(int this_syscall)
>> static int __seccomp_filter(int this_syscall, const struct seccomp_data *sd,
>> const bool recheck_after_trace)
>> {
>> + struct seccomp_filter *filter;
>
> Out of paranoia I set this to NULL by default in the extracted
> filter-assignment patch.
Good idea. I went back and forth on if I should do that. :)
Tyler
>
>> u32 filter_ret, action;
>> int data;
>>
>> @@ -622,7 +638,7 @@ static int __seccomp_filter(int this_syscall, const struct seccomp_data *sd,
>> */
>> rmb();
>>
>> - filter_ret = seccomp_run_filters(sd);
>> + filter_ret = seccomp_run_filters(&filter, sd);
>> data = filter_ret & SECCOMP_RET_DATA;
>> action = filter_ret & SECCOMP_RET_ACTION;
>>
>> @@ -690,7 +706,7 @@ static int __seccomp_filter(int this_syscall, const struct seccomp_data *sd,
>>
>> case SECCOMP_RET_KILL:
>> default:
>> - seccomp_log(this_syscall, SIGSYS, action);
>> + seccomp_log(this_syscall, SIGSYS, action, true);
>> /* Dump core only if this is the last remaining thread. */
>> if (get_nr_threads(current) == 1) {
>> siginfo_t info;
>> @@ -707,7 +723,7 @@ static int __seccomp_filter(int this_syscall, const struct seccomp_data *sd,
>> unreachable();
>>
>> skip:
>> - seccomp_log(this_syscall, 0, action);
>> + seccomp_log(this_syscall, 0, action, filter ? filter->log : false);
>
> Yes, thanks for the double-check on the filter being non-NULL! :)
>
>> return -1;
>> }
>> #else
>> diff --git a/tools/testing/selftests/seccomp/seccomp_bpf.c b/tools/testing/selftests/seccomp/seccomp_bpf.c
>> index 73f5ea6..eeb4f7a 100644
>> --- a/tools/testing/selftests/seccomp/seccomp_bpf.c
>> +++ b/tools/testing/selftests/seccomp/seccomp_bpf.c
>> @@ -1687,6 +1687,10 @@ TEST_F_SIGNAL(TRACE_syscall, kill_after_ptrace, SIGSYS)
>> #define SECCOMP_FILTER_FLAG_TSYNC 1
>> #endif
>>
>> +#ifndef SECCOMP_FILTER_FLAG_LOG
>> +#define SECCOMP_FILTER_FLAG_LOG 2
>> +#endif
>> +
>> #ifndef seccomp
>> int seccomp(unsigned int op, unsigned int flags, void *args)
>> {
>> @@ -2421,6 +2425,67 @@ TEST(syscall_restart)
>> _metadata->passed = 0;
>> }
>>
>> +TEST_SIGNAL(filter_flag_log, SIGSYS)
>> +{
>> + struct sock_filter allow_filter[] = {
>> + BPF_STMT(BPF_RET|BPF_K, SECCOMP_RET_ALLOW),
>> + };
>> + struct sock_filter kill_filter[] = {
>> + BPF_STMT(BPF_LD|BPF_W|BPF_ABS,
>> + offsetof(struct seccomp_data, nr)),
>> + BPF_JUMP(BPF_JMP|BPF_JEQ|BPF_K, __NR_getpid, 0, 1),
>> + BPF_STMT(BPF_RET|BPF_K, SECCOMP_RET_KILL),
>> + BPF_STMT(BPF_RET|BPF_K, SECCOMP_RET_ALLOW),
>> + };
>> + struct sock_fprog allow_prog = {
>> + .len = (unsigned short)ARRAY_SIZE(allow_filter),
>> + .filter = allow_filter,
>> + };
>> + struct sock_fprog kill_prog = {
>> + .len = (unsigned short)ARRAY_SIZE(kill_filter),
>> + .filter = kill_filter,
>> + };
>> + long ret;
>> + pid_t parent = getppid();
>> +
>> + ret = prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0);
>> + ASSERT_EQ(0, ret);
>> +
>> + /* Verify that the FILTER_FLAG_LOG flag isn't accepted in strict mode */
>> + ret = seccomp(SECCOMP_SET_MODE_STRICT, SECCOMP_FILTER_FLAG_LOG,
>> + &allow_prog);
>> + ASSERT_NE(ENOSYS, errno) {
>> + TH_LOG("Kernel does not support seccomp syscall!");
>> + }
>> + EXPECT_NE(0, ret) {
>> + TH_LOG("Kernel accepted FILTER_FLAG_LOG flag in strict mode!");
>> + }
>> + EXPECT_EQ(EINVAL, errno) {
>> + TH_LOG("Kernel returned unexpected errno for FILTER_FLAG_LOG flag in strict mode!");
>> + }
>> +
>> + /* Verify that a simple, permissive filter can be added with no flags */
>> + ret = seccomp(SECCOMP_SET_MODE_FILTER, 0, &allow_prog);
>> + EXPECT_EQ(0, ret);
>> +
>> + /* See if the same filter can be added with the FILTER_FLAG_LOG flag */
>> + ret = seccomp(SECCOMP_SET_MODE_FILTER, SECCOMP_FILTER_FLAG_LOG,
>> + &allow_prog);
>> + ASSERT_NE(EINVAL, errno) {
>> + TH_LOG("Kernel does not support the FILTER_FLAG_LOG flag!");
>> + }
>> + EXPECT_EQ(0, ret);
>> +
>> + /* Ensure that the kill filter works with the FILTER_FLAG_LOG flag */
>> + ret = seccomp(SECCOMP_SET_MODE_FILTER, SECCOMP_FILTER_FLAG_LOG,
>> + &kill_prog);
>> + EXPECT_EQ(0, ret);
>> +
>> + EXPECT_EQ(parent, syscall(__NR_getppid));
>> + /* getpid() should never return. */
>> + EXPECT_EQ(0, syscall(__NR_getpid));
>> +}
>> +
>> /*
>> * TODO:
>> * - add microbenchmarks
>> @@ -2429,6 +2494,7 @@ TEST(syscall_restart)
>> * - endianness checking when appropriate
>> * - 64-bit arg prodding
>> * - arch value testing (x86 modes especially)
>> + * - verify that FILTER_FLAG_LOG filters generate log messages
>> * - ...
>> */
>>
>> --
>> 2.7.4
>>
>
> Test looks good, thanks!
>
> -Kees
>
On 08/03/2017 11:54 AM, Kees Cook wrote:
> On Fri, Jul 28, 2017 at 1:55 PM, Tyler Hicks <[email protected]> wrote:
>> Userspace code that needs to check if the kernel supports a given action
>> may not be able to use the /proc/sys/kernel/seccomp/actions_avail
>> sysctl. The process may be running in a sandbox and, therefore,
>> sufficient filesystem access may not be available. This patch adds an
>> operation to the seccomp(2) syscall that allows userspace code to ask
>> the kernel if a given action is available.
>>
>> If the action is supported by the kernel, 0 is returned. If the action
>> is not supported by the kernel, -1 is returned with errno set to
>> -EOPNOTSUPP. If this check is attempted on a kernel that doesn't support
>> this new operation, -1 is returned with errno set to -EINVAL meaning
>> that userspace code will have the ability to differentiate between the
>> two error cases.
>>
>> Signed-off-by: Tyler Hicks <[email protected]>
>> ---
>>
>> * Changes since v4:
>> - This is new patch to allow applications to check if an action is supported
>> without having to consult the actions_avail sysctl
>>
>> include/uapi/linux/seccomp.h | 5 ++--
>> kernel/seccomp.c | 26 +++++++++++++++++++
>> tools/testing/selftests/seccomp/seccomp_bpf.c | 36 +++++++++++++++++++++++++++
>> 3 files changed, 65 insertions(+), 2 deletions(-)
>>
>> diff --git a/include/uapi/linux/seccomp.h b/include/uapi/linux/seccomp.h
>> index 82c823c..19a611d 100644
>> --- a/include/uapi/linux/seccomp.h
>> +++ b/include/uapi/linux/seccomp.h
>> @@ -11,8 +11,9 @@
>> #define SECCOMP_MODE_FILTER 2 /* uses user-supplied filter. */
>>
>> /* Valid operations for seccomp syscall. */
>> -#define SECCOMP_SET_MODE_STRICT 0
>> -#define SECCOMP_SET_MODE_FILTER 1
>> +#define SECCOMP_SET_MODE_STRICT 0
>> +#define SECCOMP_SET_MODE_FILTER 1
>> +#define SECCOMP_GET_ACTION_AVAIL 2
>>
>> /* Valid flags for SECCOMP_SET_MODE_FILTER */
>> #define SECCOMP_FILTER_FLAG_TSYNC 1
>> diff --git a/kernel/seccomp.c b/kernel/seccomp.c
>> index 1c4c496..03ad3ba 100644
>> --- a/kernel/seccomp.c
>> +++ b/kernel/seccomp.c
>> @@ -858,6 +858,27 @@ static inline long seccomp_set_mode_filter(unsigned int flags,
>> }
>> #endif
>>
>> +static long seccomp_get_action_avail(const char __user *uaction)
>> +{
>> + u32 action;
>> +
>> + if (copy_from_user(&action, uaction, sizeof(action)))
>> + return -EFAULT;
>> +
>> + switch (action) {
>> + case SECCOMP_RET_KILL:
>> + case SECCOMP_RET_TRAP:
>> + case SECCOMP_RET_ERRNO:
>> + case SECCOMP_RET_TRACE:
>> + case SECCOMP_RET_ALLOW:
>> + break;
>> + default:
>> + return -EOPNOTSUPP;
>> + }
>> +
>> + return 0;
>> +}
>> +
>> /* Common entry point for both prctl and syscall. */
>> static long do_seccomp(unsigned int op, unsigned int flags,
>> const char __user *uargs)
>> @@ -869,6 +890,11 @@ static long do_seccomp(unsigned int op, unsigned int flags,
>> return seccomp_set_mode_strict();
>> case SECCOMP_SET_MODE_FILTER:
>> return seccomp_set_mode_filter(flags, uargs);
>> + case SECCOMP_GET_ACTION_AVAIL:
>> + if (flags != 0)
>> + return -EINVAL;
>> +
>> + return seccomp_get_action_avail(uargs);
>> default:
>> return -EINVAL;
>> }
>> diff --git a/tools/testing/selftests/seccomp/seccomp_bpf.c b/tools/testing/selftests/seccomp/seccomp_bpf.c
>> index eeb4f7a..8f0872b 100644
>> --- a/tools/testing/selftests/seccomp/seccomp_bpf.c
>> +++ b/tools/testing/selftests/seccomp/seccomp_bpf.c
>> @@ -1683,6 +1683,10 @@ TEST_F_SIGNAL(TRACE_syscall, kill_after_ptrace, SIGSYS)
>> #define SECCOMP_SET_MODE_FILTER 1
>> #endif
>>
>> +#ifndef SECCOMP_GET_ACTION_AVAIL
>> +#define SECCOMP_GET_ACTION_AVAIL 2
>> +#endif
>> +
>> #ifndef SECCOMP_FILTER_FLAG_TSYNC
>> #define SECCOMP_FILTER_FLAG_TSYNC 1
>> #endif
>> @@ -2486,6 +2490,38 @@ TEST_SIGNAL(filter_flag_log, SIGSYS)
>> EXPECT_EQ(0, syscall(__NR_getpid));
>> }
>>
>> +TEST(get_action_avail)
>> +{
>> + __u32 actions[] = { SECCOMP_RET_KILL, SECCOMP_RET_TRAP,
>> + SECCOMP_RET_ERRNO, SECCOMP_RET_TRACE,
>> + SECCOMP_RET_ALLOW };
>> + __u32 unknown_action = 0x10000000U;
>> + int i;
>> + long ret;
>> +
>> + ret = seccomp(SECCOMP_GET_ACTION_AVAIL, 0, &actions[0]);
>> + ASSERT_NE(ENOSYS, errno) {
>> + TH_LOG("Kernel does not support seccomp syscall!");
>> + }
>> + ASSERT_NE(EINVAL, errno) {
>> + TH_LOG("Kernel does not support SECCOMP_GET_ACTION_AVAIL operation!");
>> + }
>> + EXPECT_EQ(ret, 0);
>> +
>> + for (i = 0; i < ARRAY_SIZE(actions); i++) {
>> + ret = seccomp(SECCOMP_GET_ACTION_AVAIL, 0, &actions[i]);
>> + EXPECT_EQ(ret, 0) {
>> + TH_LOG("Expected action (0x%X) not available!",
>> + actions[i]);
>> + }
>> + }
>> +
>> + /* Check that an unknown action is handled properly (EOPNOTSUPP) */
>> + ret = seccomp(SECCOMP_GET_ACTION_AVAIL, 0, &unknown_action);
>> + EXPECT_EQ(ret, -1);
>> + EXPECT_EQ(errno, EOPNOTSUPP);
>> +}
>> +
>> /*
>> * TODO:
>> * - add microbenchmarks
>> --
>> 2.7.4
>>
>
> I like this a lot. I think it should follow the sysctl patch in the
> series, but otherwise looks great.
Good to hear! I like it a lot, as well. I'm pretty sure Andy suggested
it so I'll add a Suggested-by tag in the next revision.
Tyler
On 08/03/2017 11:56 AM, Kees Cook wrote:
> On Fri, Jul 28, 2017 at 1:55 PM, Tyler Hicks <[email protected]> wrote:
>> Add a new action, SECCOMP_RET_LOG, that logs a syscall before allowing
>> the syscall. At the implementation level, this action is identical to
>> the existing SECCOMP_RET_ALLOW action. However, it can be very useful when
>> initially developing a seccomp filter for an application. The developer
>> can set the default action to be SECCOMP_RET_LOG, maybe mark any
>> obviously needed syscalls with SECCOMP_RET_ALLOW, and then put the
>> application through its paces. A list of syscalls that triggered the
>> default action (SECCOMP_RET_LOG) can be easily gleaned from the logs and
>> that list can be used to build the syscall whitelist. Finally, the
>> developer can change the default action to the desired value.
>>
>> This provides a more friendly experience than seeing the application get
>> killed, then updating the filter and rebuilding the app, seeing the
>> application get killed due to a different syscall, then updating the
>> filter and rebuilding the app, etc.
>>
>> The functionality is similar to what's supported by the various LSMs.
>> SELinux has permissive mode, AppArmor has complain mode, SMACK has
>> bring-up mode, etc.
>>
>> SECCOMP_RET_LOG is given a lower value than SECCOMP_RET_ALLOW as allow
>> while logging is slightly more restrictive than quietly allowing.
>>
>> Unfortunately, the tests added for SECCOMP_RET_LOG are not capable of
>> inspecting the audit log to verify that the syscall was logged.
>>
>> Signed-off-by: Tyler Hicks <[email protected]>
>> ---
>>
>> * Change since v4:
>> - folded the previously separate selftest patch into this patch
>> - add TODO to verify that RET_LOG generates log messages in selftests
>> - adjust for new reStructuredText format
>>
>> Documentation/userspace-api/seccomp_filter.rst | 6 ++
>> include/uapi/linux/seccomp.h | 1 +
>> kernel/seccomp.c | 17 ++++-
>> tools/testing/selftests/seccomp/seccomp_bpf.c | 97 +++++++++++++++++++++++++-
>> 4 files changed, 118 insertions(+), 3 deletions(-)
>>
>> diff --git a/Documentation/userspace-api/seccomp_filter.rst b/Documentation/userspace-api/seccomp_filter.rst
>> index 2d1d8ab..2ece856 100644
>> --- a/Documentation/userspace-api/seccomp_filter.rst
>> +++ b/Documentation/userspace-api/seccomp_filter.rst
>> @@ -141,6 +141,12 @@ In precedence order, they are:
>> allow use of ptrace, even of other sandboxed processes, without
>> extreme care; ptracers can use this mechanism to escape.)
>>
>> +``SECCOMP_RET_LOG``:
>> + Results in the system call being executed after it is logged. This
>> + should be used by application developers to learn which syscalls their
>> + application needs without having to iterate through multiple test and
>> + development cycles to build the list.
>
> Perhaps this should note that it'll only be logged if the admin has
> left the default sysctl value alone.
>
Good idea. I'll add something about that.
Tyler
>> +
>> ``SECCOMP_RET_ALLOW``:
>> Results in the system call being executed.
>>
>> diff --git a/include/uapi/linux/seccomp.h b/include/uapi/linux/seccomp.h
>> index 19a611d..f944332 100644
>> --- a/include/uapi/linux/seccomp.h
>> +++ b/include/uapi/linux/seccomp.h
>> @@ -31,6 +31,7 @@
>> #define SECCOMP_RET_TRAP 0x00030000U /* disallow and force a SIGSYS */
>> #define SECCOMP_RET_ERRNO 0x00050000U /* returns an errno */
>> #define SECCOMP_RET_TRACE 0x7ff00000U /* pass to a tracer or disallow */
>> +#define SECCOMP_RET_LOG 0x7ffc0000U /* allow after logging */
>> #define SECCOMP_RET_ALLOW 0x7fff0000U /* allow */
>>
>> /* Masks for the return value sections. */
>> diff --git a/kernel/seccomp.c b/kernel/seccomp.c
>> index 03ad3ba..c12d3dd 100644
>> --- a/kernel/seccomp.c
>> +++ b/kernel/seccomp.c
>> @@ -533,10 +533,12 @@ static void seccomp_send_sigsys(int syscall, int reason)
>> #define SECCOMP_LOG_TRAP (1 << 2)
>> #define SECCOMP_LOG_ERRNO (1 << 3)
>> #define SECCOMP_LOG_TRACE (1 << 4)
>> -#define SECCOMP_LOG_ALLOW (1 << 5)
>> +#define SECCOMP_LOG_LOG (1 << 5)
>> +#define SECCOMP_LOG_ALLOW (1 << 6)
>>
>> static u32 seccomp_actions_logged = SECCOMP_LOG_KILL | SECCOMP_LOG_TRAP |
>> - SECCOMP_LOG_ERRNO | SECCOMP_LOG_TRACE;
>> + SECCOMP_LOG_ERRNO | SECCOMP_LOG_TRACE |
>> + SECCOMP_LOG_LOG;
>>
>> static inline void seccomp_log(unsigned long syscall, long signr, u32 action,
>> bool requested)
>> @@ -554,6 +556,9 @@ static inline void seccomp_log(unsigned long syscall, long signr, u32 action,
>> case SECCOMP_RET_TRACE:
>> log = seccomp_actions_logged & SECCOMP_LOG_TRACE;
>> break;
>> + case SECCOMP_RET_LOG:
>> + log = seccomp_actions_logged & SECCOMP_LOG_LOG;
>> + break;
>> case SECCOMP_RET_ALLOW:
>> log = false;
>> break;
>> @@ -701,6 +706,10 @@ static int __seccomp_filter(int this_syscall, const struct seccomp_data *sd,
>>
>> return 0;
>>
>> + case SECCOMP_RET_LOG:
>> + seccomp_log(this_syscall, 0, action, true);
>> + return 0;
>> +
>> case SECCOMP_RET_ALLOW:
>> return 0;
>>
>> @@ -870,6 +879,7 @@ static long seccomp_get_action_avail(const char __user *uaction)
>> case SECCOMP_RET_TRAP:
>> case SECCOMP_RET_ERRNO:
>> case SECCOMP_RET_TRACE:
>> + case SECCOMP_RET_LOG:
>> case SECCOMP_RET_ALLOW:
>> break;
>> default:
>> @@ -1020,12 +1030,14 @@ long seccomp_get_filter(struct task_struct *task, unsigned long filter_off,
>> #define SECCOMP_RET_TRAP_NAME "trap"
>> #define SECCOMP_RET_ERRNO_NAME "errno"
>> #define SECCOMP_RET_TRACE_NAME "trace"
>> +#define SECCOMP_RET_LOG_NAME "log"
>> #define SECCOMP_RET_ALLOW_NAME "allow"
>>
>> static const char seccomp_actions_avail[] = SECCOMP_RET_KILL_NAME " "
>> SECCOMP_RET_TRAP_NAME " "
>> SECCOMP_RET_ERRNO_NAME " "
>> SECCOMP_RET_TRACE_NAME " "
>> + SECCOMP_RET_LOG_NAME " "
>> SECCOMP_RET_ALLOW_NAME;
>>
>> #define SECCOMP_ACTIONS_AVAIL_LEN strlen(seccomp_actions_avail)
>> @@ -1040,6 +1052,7 @@ static const struct seccomp_log_name seccomp_log_names[] = {
>> { SECCOMP_LOG_TRAP, SECCOMP_RET_TRAP_NAME },
>> { SECCOMP_LOG_ERRNO, SECCOMP_RET_ERRNO_NAME },
>> { SECCOMP_LOG_TRACE, SECCOMP_RET_TRACE_NAME },
>> + { SECCOMP_LOG_LOG, SECCOMP_RET_LOG_NAME },
>> { SECCOMP_LOG_ALLOW, SECCOMP_RET_ALLOW_NAME },
>> { }
>> };
>> diff --git a/tools/testing/selftests/seccomp/seccomp_bpf.c b/tools/testing/selftests/seccomp/seccomp_bpf.c
>> index 8f0872b..040e875 100644
>> --- a/tools/testing/selftests/seccomp/seccomp_bpf.c
>> +++ b/tools/testing/selftests/seccomp/seccomp_bpf.c
>> @@ -87,6 +87,10 @@ struct seccomp_data {
>> };
>> #endif
>>
>> +#ifndef SECCOMP_RET_LOG
>> +#define SECCOMP_RET_LOG 0x7ffc0000U /* allow after logging */
>> +#endif
>> +
>> #if __BYTE_ORDER == __LITTLE_ENDIAN
>> #define syscall_arg(_n) (offsetof(struct seccomp_data, args[_n]))
>> #elif __BYTE_ORDER == __BIG_ENDIAN
>> @@ -342,6 +346,28 @@ TEST(empty_prog)
>> EXPECT_EQ(EINVAL, errno);
>> }
>>
>> +TEST(log_all)
>> +{
>> + struct sock_filter filter[] = {
>> + BPF_STMT(BPF_RET|BPF_K, SECCOMP_RET_LOG),
>> + };
>> + struct sock_fprog prog = {
>> + .len = (unsigned short)ARRAY_SIZE(filter),
>> + .filter = filter,
>> + };
>> + long ret;
>> + pid_t parent = getppid();
>> +
>> + ret = prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0);
>> + ASSERT_EQ(0, ret);
>> +
>> + ret = prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &prog);
>> + ASSERT_EQ(0, ret);
>> +
>> + /* getppid() should succeed and be logged (no check for logging) */
>> + EXPECT_EQ(parent, syscall(__NR_getppid));
>> +}
>> +
>> TEST_SIGNAL(unknown_ret_is_kill_inside, SIGSYS)
>> {
>> struct sock_filter filter[] = {
>> @@ -735,6 +761,7 @@ TEST_F(TRAP, handler)
>>
>> FIXTURE_DATA(precedence) {
>> struct sock_fprog allow;
>> + struct sock_fprog log;
>> struct sock_fprog trace;
>> struct sock_fprog error;
>> struct sock_fprog trap;
>> @@ -746,6 +773,13 @@ FIXTURE_SETUP(precedence)
>> struct sock_filter allow_insns[] = {
>> BPF_STMT(BPF_RET|BPF_K, SECCOMP_RET_ALLOW),
>> };
>> + struct sock_filter log_insns[] = {
>> + BPF_STMT(BPF_LD|BPF_W|BPF_ABS,
>> + offsetof(struct seccomp_data, nr)),
>> + BPF_JUMP(BPF_JMP|BPF_JEQ|BPF_K, __NR_getpid, 1, 0),
>> + BPF_STMT(BPF_RET|BPF_K, SECCOMP_RET_ALLOW),
>> + BPF_STMT(BPF_RET|BPF_K, SECCOMP_RET_LOG),
>> + };
>> struct sock_filter trace_insns[] = {
>> BPF_STMT(BPF_LD|BPF_W|BPF_ABS,
>> offsetof(struct seccomp_data, nr)),
>> @@ -782,6 +816,7 @@ FIXTURE_SETUP(precedence)
>> memcpy(self->_x.filter, &_x##_insns, sizeof(_x##_insns)); \
>> self->_x.len = (unsigned short)ARRAY_SIZE(_x##_insns)
>> FILTER_ALLOC(allow);
>> + FILTER_ALLOC(log);
>> FILTER_ALLOC(trace);
>> FILTER_ALLOC(error);
>> FILTER_ALLOC(trap);
>> @@ -792,6 +827,7 @@ FIXTURE_TEARDOWN(precedence)
>> {
>> #define FILTER_FREE(_x) if (self->_x.filter) free(self->_x.filter)
>> FILTER_FREE(allow);
>> + FILTER_FREE(log);
>> FILTER_FREE(trace);
>> FILTER_FREE(error);
>> FILTER_FREE(trap);
>> @@ -809,6 +845,8 @@ TEST_F(precedence, allow_ok)
>>
>> ret = prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &self->allow);
>> ASSERT_EQ(0, ret);
>> + ret = prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &self->log);
>> + ASSERT_EQ(0, ret);
>> ret = prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &self->trace);
>> ASSERT_EQ(0, ret);
>> ret = prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &self->error);
>> @@ -833,6 +871,8 @@ TEST_F_SIGNAL(precedence, kill_is_highest, SIGSYS)
>>
>> ret = prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &self->allow);
>> ASSERT_EQ(0, ret);
>> + ret = prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &self->log);
>> + ASSERT_EQ(0, ret);
>> ret = prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &self->trace);
>> ASSERT_EQ(0, ret);
>> ret = prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &self->error);
>> @@ -864,6 +904,8 @@ TEST_F_SIGNAL(precedence, kill_is_highest_in_any_order, SIGSYS)
>> ASSERT_EQ(0, ret);
>> ret = prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &self->error);
>> ASSERT_EQ(0, ret);
>> + ret = prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &self->log);
>> + ASSERT_EQ(0, ret);
>> ret = prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &self->trace);
>> ASSERT_EQ(0, ret);
>> ret = prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &self->trap);
>> @@ -885,6 +927,8 @@ TEST_F_SIGNAL(precedence, trap_is_second, SIGSYS)
>>
>> ret = prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &self->allow);
>> ASSERT_EQ(0, ret);
>> + ret = prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &self->log);
>> + ASSERT_EQ(0, ret);
>> ret = prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &self->trace);
>> ASSERT_EQ(0, ret);
>> ret = prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &self->error);
>> @@ -910,6 +954,8 @@ TEST_F_SIGNAL(precedence, trap_is_second_in_any_order, SIGSYS)
>> ASSERT_EQ(0, ret);
>> ret = prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &self->trap);
>> ASSERT_EQ(0, ret);
>> + ret = prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &self->log);
>> + ASSERT_EQ(0, ret);
>> ret = prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &self->trace);
>> ASSERT_EQ(0, ret);
>> ret = prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &self->error);
>> @@ -931,6 +977,8 @@ TEST_F(precedence, errno_is_third)
>>
>> ret = prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &self->allow);
>> ASSERT_EQ(0, ret);
>> + ret = prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &self->log);
>> + ASSERT_EQ(0, ret);
>> ret = prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &self->trace);
>> ASSERT_EQ(0, ret);
>> ret = prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &self->error);
>> @@ -949,6 +997,8 @@ TEST_F(precedence, errno_is_third_in_any_order)
>> ret = prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0);
>> ASSERT_EQ(0, ret);
>>
>> + ret = prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &self->log);
>> + ASSERT_EQ(0, ret);
>> ret = prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &self->error);
>> ASSERT_EQ(0, ret);
>> ret = prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &self->trace);
>> @@ -971,6 +1021,8 @@ TEST_F(precedence, trace_is_fourth)
>>
>> ret = prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &self->allow);
>> ASSERT_EQ(0, ret);
>> + ret = prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &self->log);
>> + ASSERT_EQ(0, ret);
>> ret = prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &self->trace);
>> ASSERT_EQ(0, ret);
>> /* Should work just fine. */
>> @@ -992,12 +1044,54 @@ TEST_F(precedence, trace_is_fourth_in_any_order)
>> ASSERT_EQ(0, ret);
>> ret = prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &self->allow);
>> ASSERT_EQ(0, ret);
>> + ret = prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &self->log);
>> + ASSERT_EQ(0, ret);
>> /* Should work just fine. */
>> EXPECT_EQ(parent, syscall(__NR_getppid));
>> /* No ptracer */
>> EXPECT_EQ(-1, syscall(__NR_getpid));
>> }
>>
>> +TEST_F(precedence, log_is_fifth)
>> +{
>> + pid_t mypid, parent;
>> + long ret;
>> +
>> + mypid = getpid();
>> + parent = getppid();
>> + ret = prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0);
>> + ASSERT_EQ(0, ret);
>> +
>> + ret = prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &self->allow);
>> + ASSERT_EQ(0, ret);
>> + ret = prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &self->log);
>> + ASSERT_EQ(0, ret);
>> + /* Should work just fine. */
>> + EXPECT_EQ(parent, syscall(__NR_getppid));
>> + /* Should also work just fine */
>> + EXPECT_EQ(mypid, syscall(__NR_getpid));
>> +}
>> +
>> +TEST_F(precedence, log_is_fifth_in_any_order)
>> +{
>> + pid_t mypid, parent;
>> + long ret;
>> +
>> + mypid = getpid();
>> + parent = getppid();
>> + ret = prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0);
>> + ASSERT_EQ(0, ret);
>> +
>> + ret = prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &self->log);
>> + ASSERT_EQ(0, ret);
>> + ret = prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &self->allow);
>> + ASSERT_EQ(0, ret);
>> + /* Should work just fine. */
>> + EXPECT_EQ(parent, syscall(__NR_getppid));
>> + /* Should also work just fine */
>> + EXPECT_EQ(mypid, syscall(__NR_getpid));
>> +}
>> +
>> #ifndef PTRACE_O_TRACESECCOMP
>> #define PTRACE_O_TRACESECCOMP 0x00000080
>> #endif
>> @@ -2494,7 +2588,7 @@ TEST(get_action_avail)
>> {
>> __u32 actions[] = { SECCOMP_RET_KILL, SECCOMP_RET_TRAP,
>> SECCOMP_RET_ERRNO, SECCOMP_RET_TRACE,
>> - SECCOMP_RET_ALLOW };
>> + SECCOMP_RET_LOG, SECCOMP_RET_ALLOW };
>> __u32 unknown_action = 0x10000000U;
>> int i;
>> long ret;
>> @@ -2531,6 +2625,7 @@ TEST(get_action_avail)
>> * - 64-bit arg prodding
>> * - arch value testing (x86 modes especially)
>> * - verify that FILTER_FLAG_LOG filters generate log messages
>> + * - verify that RET_LOG generates log messages
>> * - ...
>> */
>>
>> --
>> 2.7.4
>>
>
> Test updates look great, thanks!
>
> -Kees
>
On 08/03/2017 11:58 AM, Kees Cook wrote:
> On Fri, Jul 28, 2017 at 1:55 PM, Tyler Hicks <[email protected]> wrote:
>> Userspace needs to be able to reliably detect the support of a filter
>> flag. A good way of doing that is by attempting to enter filter mode,
>> with the flag bit(s) in question set, and a NULL pointer for the args
>> parameter of seccomp(2). EFAULT indicates that the flag is valid and
>> EINVAL indicates that the flag is invalid.
>>
>> This patch adds a selftest that can be used to test this method of
>> detection in userspace.
>>
>> Signed-off-by: Tyler Hicks <[email protected]>
>> ---
>>
>> * Changes since v4:
>> - This is a new patch
>>
>> tools/testing/selftests/seccomp/seccomp_bpf.c | 58 +++++++++++++++++++++++++++
>> 1 file changed, 58 insertions(+)
>>
>> diff --git a/tools/testing/selftests/seccomp/seccomp_bpf.c b/tools/testing/selftests/seccomp/seccomp_bpf.c
>> index 040e875..d221437 100644
>> --- a/tools/testing/selftests/seccomp/seccomp_bpf.c
>> +++ b/tools/testing/selftests/seccomp/seccomp_bpf.c
>> @@ -1885,6 +1885,64 @@ TEST(seccomp_syscall_mode_lock)
>> }
>> }
>>
>> +/* Test detection of known and unknown filter flags. Userspace needs to be able
>> + * to check if a filter flag is support by the current kernel and a good way of
>> + * doing that is by attempting to enter filter mode, with the flag bit in
>> + * question set, and a NULL pointer for the _args_ parameter. EFAULT indicates
>> + * that the flag is valid and EINVAL indicates that the flag is invalid.
>> + */
>> +TEST(detect_seccomp_filter_flags)
>> +{
>> + unsigned int flags[] = { SECCOMP_FILTER_FLAG_TSYNC,
>> + SECCOMP_FILTER_FLAG_LOG };
>> + unsigned int flag, all_flags;
>> + int i;
>> + long ret;
>> +
>> + /* Test detection of known-good filter flags */
>> + for (i = 0, all_flags = 0; i < ARRAY_SIZE(flags); i++) {
>> + flag = flags[i];
>> + ret = seccomp(SECCOMP_SET_MODE_FILTER, flag, NULL);
>> + ASSERT_NE(ENOSYS, errno) {
>> + TH_LOG("Kernel does not support seccomp syscall!");
>> + }
>> + EXPECT_EQ(-1, ret);
>> + EXPECT_EQ(EFAULT, errno) {
>> + TH_LOG("Failed to detect that a known-good filter flag (0x%X) is supported!",
>> + flag);
>> + }
>> +
>> + all_flags |= flag;
>> + }
>> +
>> + /* Test detection of all known-good filter flags */
>> + ret = seccomp(SECCOMP_SET_MODE_FILTER, all_flags, NULL);
>> + EXPECT_EQ(-1, ret);
>> + EXPECT_EQ(EFAULT, errno) {
>> + TH_LOG("Failed to detect that all known-good filter flags (0x%X) are supported!",
>> + all_flags);
>> + }
>> +
>> + /* Test detection of an unknown filter flag */
>> + flag = -1;
>> + ret = seccomp(SECCOMP_SET_MODE_FILTER, flag, NULL);
>> + EXPECT_EQ(-1, ret);
>> + EXPECT_EQ(EINVAL, errno) {
>> + TH_LOG("Failed to detect that an unknown filter flag (0x%X) is unsupported!",
>> + flag);
>> + }
>> +
>> + /* Test detection of an unknown filter flag that may simply need to be
>> + * added to this test */
>> + flag = flags[ARRAY_SIZE(flags) - 1] << 1;
>> + ret = seccomp(SECCOMP_SET_MODE_FILTER, flag, NULL);
>> + EXPECT_EQ(-1, ret);
>> + EXPECT_EQ(EINVAL, errno) {
>> + TH_LOG("Failed to detect that an unknown filter flag (0x%X) is unsupported! Does a new flag need to be added to this test?",
>> + flag);
>> + }
>> +}
>> +
>> TEST(TSYNC_first)
>> {
>> struct sock_filter filter[] = {
>> --
>> 2.7.4
>>
>
> This is good, yes. Can you actually move it earlier in the series, so
> it will pass before adding ..._FLAG_LOG, and then the patch adding
> ..._FLAG_LOG will add it to this test too?
Yeah, that's the correct way to order it.
Tyler
>
> Thanks!
>
> -Kees
>
On 2017-08-04 17:24:00, Tyler Hicks wrote:
> On 08/03/2017 11:33 AM, Kees Cook wrote:
> > On Fri, Jul 28, 2017 at 1:55 PM, Tyler Hicks <[email protected]> wrote:
> >> Adminstrators can write to this sysctl to set the seccomp actions that
> >> are allowed to be logged. Any actions not found in this sysctl will not
> >> be logged.
> >>
> >> For example, all SECCOMP_RET_KILL, SECCOMP_RET_TRAP, and
> >> SECCOMP_RET_ERRNO actions would be loggable if "kill trap errno" were
> >> written to the sysctl. SECCOMP_RET_TRACE actions would not be logged
> >> since its string representation ("trace") wasn't present in the sysctl
> >> value.
> >
> > Just to make sure I'm clear on this, the key word above is "loggable",
> > in that filters requesting logging will be seen.
> >
> > i.e. at the end of the series, the final state of "will it be logged?" is:
> >
> > if action==RET_ALLOW:
> > do not log
> > else if action==RET_KILL || audit-enabled:
> > log
> > else if filter-requests-logging && action in actions_logged:
> > log
> > else:
> > do not log
>
> Not quite. You didn't mention RET_LOG, RET_KILL (and RET_LOG) can be
> quieted by the admin, and the audit behavior is different. It is like this:
>
> if action == RET_ALLOW:
> do not log
> else if action == RET_KILL && RET_KILL in actions_logged:
> log
> else if action == RET_LOG && RET_LOG in actions_logged:
> log
> else if filter-requests-logging && action in actions_logged:
> log
> else if audit_enabled && process-is-being-audited:
> log
> else:
> do not log
>
> Writing that up made me realize that there is a behavior change that my
> patch set introduces when the process is being audited. Before my patch
> set, this was the behavior:
>
> ...
> else if action == RET_KILL && audit_enabled && process-is-being-audited:
> log
> ...
>
> Now it is:
>
> ...
> else if audit_enabled && process-is-being-audited:
> log
> ...
>
> Should I go back to only logging RET_KILL actions in that situation?
>
> >
> >> The path to the sysctl is:
> >>
> >> /proc/sys/kernel/seccomp/actions_logged
> >>
> >> The actions_avail sysctl can be read to discover the valid action names
> >> that can be written to the actions_logged sysctl with the exception of
> >> SECCOMP_RET_ALLOW. It cannot be configured for logging.
> >>
> >> The default setting for the sysctl is to allow all actions to be logged
> >> except SECCOMP_RET_ALLOW.
> >>
> >> There's one important exception to this sysctl. If a task is
> >> specifically being audited, meaning that an audit context has been
> >> allocated for the task, seccomp will log all actions other than
> >> SECCOMP_RET_ALLOW despite the value of actions_logged. This exception
> >> preserves the existing auditing behavior of tasks with an allocated
> >> audit context.
> >>
> >> Signed-off-by: Tyler Hicks <[email protected]>
> >> ---
> >>
> >> * Changes since v4:
> >> - the sysctl is now a list of actions that are allowed by the admin to be
> >> logged rather than a list of actions that should be logged
> >> + a follow up patch will let applications have a say in what should be
> >> logged but the admin has the final say with this sysctl
> >> + RET_ALLOW cannot be allowed to be logged
> >> - fix comment style
> >> - mark the seccomp_action_names array as const
> >> - adjust for new reStructuredText format
> >>
> >> Documentation/userspace-api/seccomp_filter.rst | 18 +++
> >> include/linux/audit.h | 6 +-
> >> kernel/seccomp.c | 180 ++++++++++++++++++++++++-
> >> 3 files changed, 196 insertions(+), 8 deletions(-)
> >>
> >> diff --git a/Documentation/userspace-api/seccomp_filter.rst b/Documentation/userspace-api/seccomp_filter.rst
> >> index 35fc7cb..2d1d8ab 100644
> >> --- a/Documentation/userspace-api/seccomp_filter.rst
> >> +++ b/Documentation/userspace-api/seccomp_filter.rst
> >> @@ -187,6 +187,24 @@ directory. Here's a description of each file in that directory:
> >> program was built, differs from the set of actions actually
> >> supported in the current running kernel.
> >>
> >> +``actions_logged``:
> >> + A read-write ordered list of seccomp return values (refer to the
> >> + ``SECCOMP_RET_*`` macros above) that are allowed to be logged. Writes
> >> + to the file do not need to be in ordered form but reads from the file
> >> + will be ordered in the same way as the actions_avail sysctl.
> >> +
> >> + It is important to note that the value of ``actions_logged`` does not
> >> + prevent certain actions from being logged when the audit subsystem is
> >> + configured to audit a task. If the action is not found in
> >> + ``actions_logged`` list, the final decision on whether to audit the
> >> + action for that task is ultimately left up to the audit subsystem to
> >> + decide for all seccomp return values other than ``SECCOMP_RET_ALLOW``.
> >> +
> >> + The ``allow`` string is not accepted in the ``actions_logged`` sysctl
> >> + as it is not possible to log ``SECCOMP_RET_ALLOW`` actions. Attempting
> >> + to write ``allow`` to the sysctl will result in an EINVAL being
> >> + returned.
> >> +
> >> Adding architecture support
> >> ===========================
> >>
> >> diff --git a/include/linux/audit.h b/include/linux/audit.h
> >> index 2150bdc..8c30f06 100644
> >> --- a/include/linux/audit.h
> >> +++ b/include/linux/audit.h
> >> @@ -314,11 +314,7 @@ void audit_core_dumps(long signr);
> >>
> >> static inline void audit_seccomp(unsigned long syscall, long signr, int code)
> >> {
> >> - if (!audit_enabled)
> >> - return;
> >> -
> >> - /* Force a record to be reported if a signal was delivered. */
> >> - if (signr || unlikely(!audit_dummy_context()))
> >> + if (audit_enabled && unlikely(!audit_dummy_context()))
> >> __audit_seccomp(syscall, signr, code);
> >> }
> >>
> >> diff --git a/kernel/seccomp.c b/kernel/seccomp.c
> >> index 6bff068..87257f2 100644
> >> --- a/kernel/seccomp.c
> >> +++ b/kernel/seccomp.c
> >> @@ -516,6 +516,52 @@ static void seccomp_send_sigsys(int syscall, int reason)
> >> }
> >> #endif /* CONFIG_SECCOMP_FILTER */
> >>
> >> +/* For use with seccomp_actions_logged */
> >> +#define SECCOMP_LOG_KILL (1 << 0)
> >> +#define SECCOMP_LOG_TRAP (1 << 2)
> >> +#define SECCOMP_LOG_ERRNO (1 << 3)
> >> +#define SECCOMP_LOG_TRACE (1 << 4)
> >> +#define SECCOMP_LOG_ALLOW (1 << 5)
> >> +
> >> +static u32 seccomp_actions_logged = SECCOMP_LOG_KILL | SECCOMP_LOG_TRAP |
> >> + SECCOMP_LOG_ERRNO | SECCOMP_LOG_TRACE;
> >> +
> >> +static inline void seccomp_log(unsigned long syscall, long signr, u32 action)
> >> +{
> >> + bool log;
> >> +
> >> + switch (action) {
> >> + case SECCOMP_RET_TRAP:
> >> + log = seccomp_actions_logged & SECCOMP_LOG_TRAP;
> >> + break;
> >> + case SECCOMP_RET_ERRNO:
> >> + log = seccomp_actions_logged & SECCOMP_LOG_ERRNO;
> >> + break;
> >> + case SECCOMP_RET_TRACE:
> >> + log = seccomp_actions_logged & SECCOMP_LOG_TRACE;
> >> + break;
> >> + case SECCOMP_RET_ALLOW:
> >> + log = false;
> >> + break;
> >> + case SECCOMP_RET_KILL:
> >> + default:
> >> + log = seccomp_actions_logged & SECCOMP_LOG_KILL;
> >> + }
> >> +
> >> + /*
> >> + * Force an audit message to be emitted when the action is allowed to
> >> + * be logged by the admin.
> >> + */
> >> + if (log)
> >> + return __audit_seccomp(syscall, signr, action);
> >
> > At this point in the patch series, there's no filter-requested-logging
> > flag, so I think the above logic isn't needed until later in the
> > series (or rather, only RET_KILL should be checked).
>
> Hmmm... you're right. This slipped in since the sysctl's purpose morphed
> from configuring what actions should be logged to configuring what
> actions are allowed to be logged.
>
> >
> >> +
> >> + /*
> >> + * Let the audit subsystem decide if the action should be audited based
> >> + * on whether the current task itself is being audited.
> >> + */
> >> + return audit_seccomp(syscall, signr, action);
> >
> > With audit_seccomp() being a single if, maybe it should just be
> > collapsed into this function?
> >
> > if (log || (audit_enabled && unlikely(!audit_dummy_context()))
> > audit_seccomp(...)
>
> I think that would be fine. Unless you say otherwise, I'll also rename
> __audit_seccomp() to audit_seccomp() after doing that.
After looking at making this change, the common pattern is for
include/linux/audit.h to have a function such as this...
static inline audit_foo(...)
{
if (unlikely(!audit_dummy_context()))
__audit_foo(...);
}
... and then for kernel/auditsc.c to contain __audit_foo() which
actually constructs and emits the audit message.
I don't feel like I should deviate from this pattern and should leave
this part of the patch alone.
Tyler
> >
> > I do like the change in name, though: this new function is correctly
> > named seccomp_log().
> >
> > -Kees
> >
>
>
On 08/04/2017 05:24 PM, Tyler Hicks wrote:
> On 08/03/2017 11:33 AM, Kees Cook wrote:
>> On Fri, Jul 28, 2017 at 1:55 PM, Tyler Hicks <[email protected]> wrote:
>>> Adminstrators can write to this sysctl to set the seccomp actions that
>>> are allowed to be logged. Any actions not found in this sysctl will not
>>> be logged.
>>>
>>> For example, all SECCOMP_RET_KILL, SECCOMP_RET_TRAP, and
>>> SECCOMP_RET_ERRNO actions would be loggable if "kill trap errno" were
>>> written to the sysctl. SECCOMP_RET_TRACE actions would not be logged
>>> since its string representation ("trace") wasn't present in the sysctl
>>> value.
>>
>> Just to make sure I'm clear on this, the key word above is "loggable",
>> in that filters requesting logging will be seen.
>>
>> i.e. at the end of the series, the final state of "will it be logged?" is:
>>
>> if action==RET_ALLOW:
>> do not log
>> else if action==RET_KILL || audit-enabled:
>> log
>> else if filter-requests-logging && action in actions_logged:
>> log
>> else:
>> do not log
>
> Not quite. You didn't mention RET_LOG, RET_KILL (and RET_LOG) can be
> quieted by the admin, and the audit behavior is different. It is like this:
>
> if action == RET_ALLOW:
> do not log
> else if action == RET_KILL && RET_KILL in actions_logged:
> log
> else if action == RET_LOG && RET_LOG in actions_logged:
> log
> else if filter-requests-logging && action in actions_logged:
> log
> else if audit_enabled && process-is-being-audited:
> log
> else:
> do not log
>
> Writing that up made me realize that there is a behavior change that my
> patch set introduces when the process is being audited. Before my patch
> set, this was the behavior:
>
> ...
> else if action == RET_KILL && audit_enabled && process-is-being-audited:
> log
> ...
>
> Now it is:
>
> ...
> else if audit_enabled && process-is-being-audited:
> log
> ...
>
> Should I go back to only logging RET_KILL actions in that situation?
After going back and manually testing, I was wrong about this. There's
no change in behavior for tasks that are being audited. The original
behavior is preserved and I even say that in this patch's commit
message. Sorry for the noise here.
Tyler
>
>>
>>> The path to the sysctl is:
>>>
>>> /proc/sys/kernel/seccomp/actions_logged
>>>
>>> The actions_avail sysctl can be read to discover the valid action names
>>> that can be written to the actions_logged sysctl with the exception of
>>> SECCOMP_RET_ALLOW. It cannot be configured for logging.
>>>
>>> The default setting for the sysctl is to allow all actions to be logged
>>> except SECCOMP_RET_ALLOW.
>>>
>>> There's one important exception to this sysctl. If a task is
>>> specifically being audited, meaning that an audit context has been
>>> allocated for the task, seccomp will log all actions other than
>>> SECCOMP_RET_ALLOW despite the value of actions_logged. This exception
>>> preserves the existing auditing behavior of tasks with an allocated
>>> audit context.
>>>
>>> Signed-off-by: Tyler Hicks <[email protected]>
>>> ---
>>>
>>> * Changes since v4:
>>> - the sysctl is now a list of actions that are allowed by the admin to be
>>> logged rather than a list of actions that should be logged
>>> + a follow up patch will let applications have a say in what should be
>>> logged but the admin has the final say with this sysctl
>>> + RET_ALLOW cannot be allowed to be logged
>>> - fix comment style
>>> - mark the seccomp_action_names array as const
>>> - adjust for new reStructuredText format
>>>
>>> Documentation/userspace-api/seccomp_filter.rst | 18 +++
>>> include/linux/audit.h | 6 +-
>>> kernel/seccomp.c | 180 ++++++++++++++++++++++++-
>>> 3 files changed, 196 insertions(+), 8 deletions(-)
>>>
>>> diff --git a/Documentation/userspace-api/seccomp_filter.rst b/Documentation/userspace-api/seccomp_filter.rst
>>> index 35fc7cb..2d1d8ab 100644
>>> --- a/Documentation/userspace-api/seccomp_filter.rst
>>> +++ b/Documentation/userspace-api/seccomp_filter.rst
>>> @@ -187,6 +187,24 @@ directory. Here's a description of each file in that directory:
>>> program was built, differs from the set of actions actually
>>> supported in the current running kernel.
>>>
>>> +``actions_logged``:
>>> + A read-write ordered list of seccomp return values (refer to the
>>> + ``SECCOMP_RET_*`` macros above) that are allowed to be logged. Writes
>>> + to the file do not need to be in ordered form but reads from the file
>>> + will be ordered in the same way as the actions_avail sysctl.
>>> +
>>> + It is important to note that the value of ``actions_logged`` does not
>>> + prevent certain actions from being logged when the audit subsystem is
>>> + configured to audit a task. If the action is not found in
>>> + ``actions_logged`` list, the final decision on whether to audit the
>>> + action for that task is ultimately left up to the audit subsystem to
>>> + decide for all seccomp return values other than ``SECCOMP_RET_ALLOW``.
>>> +
>>> + The ``allow`` string is not accepted in the ``actions_logged`` sysctl
>>> + as it is not possible to log ``SECCOMP_RET_ALLOW`` actions. Attempting
>>> + to write ``allow`` to the sysctl will result in an EINVAL being
>>> + returned.
>>> +
>>> Adding architecture support
>>> ===========================
>>>
>>> diff --git a/include/linux/audit.h b/include/linux/audit.h
>>> index 2150bdc..8c30f06 100644
>>> --- a/include/linux/audit.h
>>> +++ b/include/linux/audit.h
>>> @@ -314,11 +314,7 @@ void audit_core_dumps(long signr);
>>>
>>> static inline void audit_seccomp(unsigned long syscall, long signr, int code)
>>> {
>>> - if (!audit_enabled)
>>> - return;
>>> -
>>> - /* Force a record to be reported if a signal was delivered. */
>>> - if (signr || unlikely(!audit_dummy_context()))
>>> + if (audit_enabled && unlikely(!audit_dummy_context()))
>>> __audit_seccomp(syscall, signr, code);
>>> }
>>>
>>> diff --git a/kernel/seccomp.c b/kernel/seccomp.c
>>> index 6bff068..87257f2 100644
>>> --- a/kernel/seccomp.c
>>> +++ b/kernel/seccomp.c
>>> @@ -516,6 +516,52 @@ static void seccomp_send_sigsys(int syscall, int reason)
>>> }
>>> #endif /* CONFIG_SECCOMP_FILTER */
>>>
>>> +/* For use with seccomp_actions_logged */
>>> +#define SECCOMP_LOG_KILL (1 << 0)
>>> +#define SECCOMP_LOG_TRAP (1 << 2)
>>> +#define SECCOMP_LOG_ERRNO (1 << 3)
>>> +#define SECCOMP_LOG_TRACE (1 << 4)
>>> +#define SECCOMP_LOG_ALLOW (1 << 5)
>>> +
>>> +static u32 seccomp_actions_logged = SECCOMP_LOG_KILL | SECCOMP_LOG_TRAP |
>>> + SECCOMP_LOG_ERRNO | SECCOMP_LOG_TRACE;
>>> +
>>> +static inline void seccomp_log(unsigned long syscall, long signr, u32 action)
>>> +{
>>> + bool log;
>>> +
>>> + switch (action) {
>>> + case SECCOMP_RET_TRAP:
>>> + log = seccomp_actions_logged & SECCOMP_LOG_TRAP;
>>> + break;
>>> + case SECCOMP_RET_ERRNO:
>>> + log = seccomp_actions_logged & SECCOMP_LOG_ERRNO;
>>> + break;
>>> + case SECCOMP_RET_TRACE:
>>> + log = seccomp_actions_logged & SECCOMP_LOG_TRACE;
>>> + break;
>>> + case SECCOMP_RET_ALLOW:
>>> + log = false;
>>> + break;
>>> + case SECCOMP_RET_KILL:
>>> + default:
>>> + log = seccomp_actions_logged & SECCOMP_LOG_KILL;
>>> + }
>>> +
>>> + /*
>>> + * Force an audit message to be emitted when the action is allowed to
>>> + * be logged by the admin.
>>> + */
>>> + if (log)
>>> + return __audit_seccomp(syscall, signr, action);
>>
>> At this point in the patch series, there's no filter-requested-logging
>> flag, so I think the above logic isn't needed until later in the
>> series (or rather, only RET_KILL should be checked).
>
> Hmmm... you're right. This slipped in since the sysctl's purpose morphed
> from configuring what actions should be logged to configuring what
> actions are allowed to be logged.
>
>>
>>> +
>>> + /*
>>> + * Let the audit subsystem decide if the action should be audited based
>>> + * on whether the current task itself is being audited.
>>> + */
>>> + return audit_seccomp(syscall, signr, action);
>>
>> With audit_seccomp() being a single if, maybe it should just be
>> collapsed into this function?
>>
>> if (log || (audit_enabled && unlikely(!audit_dummy_context()))
>> audit_seccomp(...)
>
> I think that would be fine. Unless you say otherwise, I'll also rename
> __audit_seccomp() to audit_seccomp() after doing that.
>
> Tyler
>
>>
>> I do like the change in name, though: this new function is correctly
>> named seccomp_log().
>>
>> -Kees
>>
>
>