I am in the process of cleaning up the process exit path in the kernel, and
as part of that I am looking at the callers of do_exit. A very
interesting one is __secure_computing_strict.
Looking at the code, it is very clear that if a system call is attempted
that is not in the table, the thread attempting to execute that system
call is terminated.
Reading the man page for seccomp it says that the process is delivered
SIGKILL.
The practical difference is what happens for multi-threaded
applications.
What are the desired semantics for a multi-threaded application if one
thread attempts to use an unsupported system call? Should the thread be
terminated or the entire application?
Do we need to fix the kernel, or do we need to fix the manpages?
Thank you,
Eric
On Tue, Jun 29, 2021 at 05:54:24PM -0500, Eric W. Biederman wrote:
>
> I am in the process of cleaning up the process exit path in the kernel, and
> as part of that I am looking at the callers of do_exit. A very
> interesting one is __secure_computing_strict.
>
> Looking at the code, it is very clear that if a system call is attempted
> that is not in the table, the thread attempting to execute that system
> call is terminated.
>
> Reading the man page for seccomp it says that the process is delivered
> SIGKILL.
>
> The practical difference is what happens for multi-threaded
> applications.
>
> What are the desired semantics for a multi-threaded application if one
> thread attempts to use an unsupported system call? Should the thread be
> terminated or the entire application?
>
> Do we need to fix the kernel, or do we need to fix the manpages?
I don't know of anyone actually using SECCOMP_MODE_STRICT, but the
original implementation was (perhaps accidentally) thread-killing. It
turns out this is not a particularly desirable situation, and when
SECCOMP_MODE_FILTER was created, it continued with that semantic,
but later grew a process-killing flag, as that's what most programs
actually wanted.
It's likely the manpage needs fixing (we had to make similar updates
for SECCOMP_MODE_FILTER), since some of the early examples of using
SECCOMP_MODE_STRICT were basically "fork, calculate, write result to
fd, exit".
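For illustration, here is a minimal sketch of that "fork, calculate, write
result to fd, exit" pattern, assuming a Linux system where strict mode can
be enabled with prctl(PR_SET_SECCOMP, SECCOMP_MODE_STRICT). The helper name
strict_sum, the outcome enum, and the exit code 42 are made up for this
sketch; none of them come from the kernel or the selftests. Note that the
child leaves through a raw exit(2), since glibc's _exit() uses
exit_group(2), which strict mode does not allow.

```c
#include <linux/seccomp.h>   /* SECCOMP_MODE_STRICT */
#include <signal.h>          /* SIGKILL */
#include <sys/prctl.h>       /* prctl, PR_SET_SECCOMP */
#include <sys/syscall.h>     /* SYS_exit, SYS_getpid */
#include <sys/wait.h>        /* waitpid and status macros */
#include <unistd.h>          /* fork, pipe, read, write, syscall */

/* Hypothetical outcome codes for this sketch. */
enum strict_outcome {
    STRICT_ERROR = -1,       /* pipe/fork/waitpid failed, or odd exit */
    STRICT_UNAVAILABLE = 0,  /* prctl() refused to enable strict mode */
    STRICT_EXITED = 1,       /* child wrote its result and left via exit(2) */
    STRICT_KILLED = 2        /* child was terminated with SIGKILL */
};

/*
 * Hypothetical helper: sum 1..n in a strict-mode child and hand the
 * result back over a pipe. If misbehave is nonzero, the child also
 * attempts getpid(2), which is not in the strict-mode table.
 */
static enum strict_outcome strict_sum(long n, int misbehave, long *result)
{
    int fds[2];
    int status;
    ssize_t got;
    pid_t pid;

    if (pipe(fds) != 0)
        return STRICT_ERROR;

    pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return STRICT_ERROR;
    }

    if (pid == 0) {
        long sum = 0;

        close(fds[0]);
        /* Still unrestricted here, so exit_group(2) via _exit() is fine. */
        if (prctl(PR_SET_SECCOMP, SECCOMP_MODE_STRICT, 0, 0, 0) != 0)
            _exit(42);

        for (long i = 1; i <= n; i++)
            sum += i;

        if (misbehave)
            syscall(SYS_getpid);        /* disallowed in strict mode */

        if (write(fds[1], &sum, sizeof(sum)) != (ssize_t)sizeof(sum))
            syscall(SYS_exit, 1);
        syscall(SYS_exit, 0);           /* raw exit(2), not exit_group(2) */
    }

    close(fds[1]);
    got = read(fds[0], result, sizeof(*result));
    close(fds[0]);

    if (waitpid(pid, &status, 0) != pid)
        return STRICT_ERROR;
    if (WIFSIGNALED(status) && WTERMSIG(status) == SIGKILL)
        return STRICT_KILLED;
    if (WIFEXITED(status) && WEXITSTATUS(status) == 42)
        return STRICT_UNAVAILABLE;
    if (WIFEXITED(status) && WEXITSTATUS(status) == 0 &&
        got == (ssize_t)sizeof(*result))
        return STRICT_EXITED;
    return STRICT_ERROR;
}
```

Calling strict_sum(100, 0, &r) should yield STRICT_EXITED with the sum in r,
while strict_sum(100, 1, &r) should yield STRICT_KILLED, because the child
has only one thread. Where strict mode cannot be enabled (for example when a
seccomp filter is already installed), both calls report STRICT_UNAVAILABLE.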
FWIW the seccomp selftests don't even check for the thread-vs-process
SIGKILL of SECCOMP_MODE_STRICT. :)
--
Kees Cook
On 2021-06-29, Eric W. Biederman <[email protected]> wrote:
>
> I am in the process of cleaning up the process exit path in the kernel, and
> as part of that I am looking at the callers of do_exit. A very
> interesting one is __secure_computing_strict.
>
> Looking at the code, it is very clear that if a system call is attempted
> that is not in the table, the thread attempting to execute that system
> call is terminated.
>
> Reading the man page for seccomp it says that the process is delivered
> SIGKILL.
>
> The practical difference is what happens for multi-threaded
> applications.
>
> What are the desired semantics for a multi-threaded application if one
> thread attempts to use an unsupported system call? Should the thread be
> terminated or the entire application?
>
> Do we need to fix the kernel, or do we need to fix the manpages?
My expectation is that the correct action should be the equivalent of
SECCOMP_RET_KILL(_THREAD) which kills the thread and is the current
behaviour (SECCOMP_RET_KILL_PROCESS is relatively speaking quite new).
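To make the thread-vs-process distinction concrete, here is a rough probe,
assuming Linux with POSIX threads and that only the thread calling prctl()
enters strict mode. The names offender and strict_kill_scope, the flag
variables, and the exit codes 2, 3, and 7 are invented for this sketch. It
relies on pthread_join() returning once the kernel has torn down the
offending thread.

```c
#include <linux/seccomp.h>   /* SECCOMP_MODE_STRICT */
#include <pthread.h>         /* pthread_create, pthread_join */
#include <signal.h>          /* SIGKILL */
#include <sys/prctl.h>       /* prctl, PR_SET_SECCOMP */
#include <sys/syscall.h>     /* SYS_getpid */
#include <sys/wait.h>        /* waitpid and status macros */
#include <unistd.h>          /* fork, _exit, syscall */

/* Flags shared between the offending thread and the main thread. */
static volatile int entered_strict;
static volatile int survived_call;

/* Secondary thread: enter strict mode, then make a disallowed call. */
static void *offender(void *arg)
{
    (void)arg;
    if (prctl(PR_SET_SECCOMP, SECCOMP_MODE_STRICT, 0, 0, 0) != 0)
        return NULL;                 /* strict mode unavailable */
    entered_strict = 1;
    syscall(SYS_getpid);             /* not in the strict-mode table */
    survived_call = 1;               /* only reached if nothing was killed */
    return NULL;
}

/*
 * Hypothetical probe, run in a forked child so the caller is not at risk.
 * Returns 1 if only the offending thread died, 2 if the whole process was
 * killed with SIGKILL, 0 if strict mode could not be enabled, and -1 on
 * any other failure.
 */
static int strict_kill_scope(void)
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;

    if (pid == 0) {
        pthread_t t;

        if (pthread_create(&t, NULL, offender, NULL) != 0)
            _exit(3);
        pthread_join(t, NULL);
        if (!entered_strict)
            _exit(2);                /* could not enable strict mode */
        if (survived_call)
            _exit(3);                /* the call was unexpectedly allowed */
        _exit(7);                    /* main thread outlived the offender */
    }

    if (waitpid(pid, &status, 0) != pid)
        return -1;
    if (WIFSIGNALED(status) && WTERMSIG(status) == SIGKILL)
        return 2;
    if (WIFEXITED(status) && WEXITSTATUS(status) == 7)
        return 1;
    if (WIFEXITED(status) && WEXITSTATUS(status) == 2)
        return 0;
    return -1;
}
```

With the thread-killing behaviour described above, strict_kill_scope()
should return 1; a process-killing implementation would show up as 2.
Older toolchains may need -pthread when building this.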
--
Aleksa Sarai
Senior Software Engineer (Containers)
SUSE Linux GmbH
<https://www.cyphar.com/>
Signed-off-by: Eric W. Biederman <[email protected]>
---
man2/seccomp.2 | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/man2/seccomp.2 b/man2/seccomp.2
index a3421871f0f4..bde54c3e3e99 100644
--- a/man2/seccomp.2
+++ b/man2/seccomp.2
@@ -69,9 +69,10 @@ The only system calls that the calling thread is permitted to make are
.BR exit_group (2)),
and
.BR sigreturn (2).
-Other system calls result in the delivery of a
+Other system calls result in the termination of the calling thread,
+or termination of the entire process with the
.BR SIGKILL
-signal.
+signal when there is only one thread.
Strict secure computing mode is useful for number-crunching
applications that may need to execute untrusted byte code, perhaps
obtained by reading from a pipe or socket.
--
2.29.2
On Wed, Jun 30, 2021 at 03:11:23PM -0500, Eric W. Biederman wrote:
>
> Signed-off-by: Eric W. Biederman <[email protected]>
> ---
> man2/seccomp.2 | 5 +++--
> 1 file changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/man2/seccomp.2 b/man2/seccomp.2
> index a3421871f0f4..bde54c3e3e99 100644
> --- a/man2/seccomp.2
> +++ b/man2/seccomp.2
> @@ -69,9 +69,10 @@ The only system calls that the calling thread is permitted to make are
> .BR exit_group (2)),
> and
> .BR sigreturn (2).
> -Other system calls result in the delivery of a
> +Other system calls result in the termination of the calling thread,
> +or termination of the entire process with the
> .BR SIGKILL
> -signal.
> +signal when there is only one thread.
> Strict secure computing mode is useful for number-crunching
> applications that may need to execute untrusted byte code, perhaps
> obtained by reading from a pipe or socket.
Thanks!
Acked-by: Kees Cook <[email protected]>
--
Kees Cook
Hi Eric,
On 6/30/21 10:11 PM, Eric W. Biederman wrote:
>
> Signed-off-by: Eric W. Biederman <[email protected]>
Thanks. Patch applied, with Kees' Ack.
Cheers,
Michael
> ---
> man2/seccomp.2 | 5 +++--
> 1 file changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/man2/seccomp.2 b/man2/seccomp.2
> index a3421871f0f4..bde54c3e3e99 100644
> --- a/man2/seccomp.2
> +++ b/man2/seccomp.2
> @@ -69,9 +69,10 @@ The only system calls that the calling thread is permitted to make are
> .BR exit_group (2)),
> and
> .BR sigreturn (2).
> -Other system calls result in the delivery of a
> +Other system calls result in the termination of the calling thread,
> +or termination of the entire process with the
> .BR SIGKILL
> -signal.
> +signal when there is only one thread.
> Strict secure computing mode is useful for number-crunching
> applications that may need to execute untrusted byte code, perhaps
> obtained by reading from a pipe or socket.
>
--
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/