On Thu, Oct 28, 2021 at 12:26:26PM -0500, Eric W. Biederman wrote:
> Kees Cook <[email protected]> writes:
>
> > On Thu, Oct 28, 2021 at 06:21:12PM +0200, Andrea Righi wrote:
> >> The following sub-tests are failing in seccomp_bpf selftest:
> >>
> >> 18:56:54 DEBUG| [stdout] # selftests: seccomp: seccomp_bpf
> >> ...
> >> 18:56:57 DEBUG| [stdout] # # RUN TRACE_syscall.ptrace.kill_after ...
> >> 18:56:57 DEBUG| [stdout] # # seccomp_bpf.c:2023:kill_after:Expected entry ? PTRACE_EVENTMSG_SYSCALL_ENTRY : PTRACE_EVENTMSG_SYSCALL_EXIT (1) == msg (0)
> >> 18:56:57 DEBUG| [stdout] # # seccomp_bpf.c:2023:kill_after:Expected entry ? PTRACE_EVENTMSG_SYSCALL_ENTRY : PTRACE_EVENTMSG_SYSCALL_EXIT (2) == msg (1)
> >> 18:56:57 DEBUG| [stdout] # # seccomp_bpf.c:2023:kill_after:Expected entry ? PTRACE_EVENTMSG_SYSCALL_ENTRY : PTRACE_EVENTMSG_SYSCALL_EXIT (1) == msg (2)
> >> 18:56:57 DEBUG| [stdout] # # kill_after: Test exited normally instead of by signal (code: 12)
> >> 18:56:57 DEBUG| [stdout] # # FAIL TRACE_syscall.ptrace.kill_after
> >> ...
> >> 18:56:57 DEBUG| [stdout] # # RUN TRACE_syscall.seccomp.kill_after ...
> >> 18:56:57 DEBUG| [stdout] # # seccomp_bpf.c:1547:kill_after:Expected !ptrace_syscall (1) == IS_SECCOMP_EVENT(status) (0)
> >> 18:56:57 DEBUG| [stdout] # # kill_after: Test exited normally instead of by signal (code: 0)
> >> 18:56:57 DEBUG| [stdout] # # FAIL TRACE_syscall.seccomp.kill_after
> >> 18:56:57 DEBUG| [stdout] # not ok 80 TRACE_syscall.seccomp.kill_after
> >> ...
> >> 18:56:57 DEBUG| [stdout] # # FAILED: 85 / 87 tests passed.
> >> 18:56:57 DEBUG| [stdout] # # Totals: pass:85 fail:2 xfail:0 xpass:0 skip:0 error:0
> >> 18:56:57 DEBUG| [stdout] not ok 1 selftests: seccomp: seccomp_bpf # exit=1
> >>
> >> I did some bisecting and found that the failures started to happen with:
> >>
> >> 307d522f5eb8 ("signal/seccomp: Refactor seccomp signal and coredump generation")
> >>
> >> Not sure if the test needs to be fixed after this commit, or if the
> >> commit is actually introducing an issue. I'll investigate more, unless
> >> someone knows already what's going on.
> >
> > Ah thanks for noticing; I will investigate...
>
>
> I just did a quick read through of the test and while
> I don't understand everything having a failure seems
> very weird.
>
> I don't understand the comment:
> /* Tracer will redirect getpid to getppid, and we should die. */
>
> As I think what happens is it the bpf programs loads the signal
> number. Tests to see if the signal number if GETPPID and allows
> that system call and causes any other system call to be terminated.
The test suite runs a series of seccomp filter vs syscalls under tracing,
either with ptrace or with seccomp SECCOMP_RET_TRACE, to validate the
expected behavioral states. It seems that what's happened is that the
SIGSYS has suddenly become non-killing:
# RUN TRACE_syscall.ptrace.kill_after ...
# seccomp_bpf.c:1555:kill_after:Expected WSTOPSIG(status) & 0x80 (0) == 0x80 (128)
# seccomp_bpf.c:1556:kill_after:WSTOPSIG: 31
# kill_after: Test exited normally instead of by signal (code: 12)
# FAIL TRACE_syscall.ptrace.kill_after
i.e. the ptracer no longer sees a dead tracee, which would pass through
here:
if (WIFSIGNALED(status) || WIFEXITED(status))
/* Child is dead. Time to go. */
return;
So the above saw a SIG_TRAP|SIGSYS rather than a killing SIGSYS. i.e.
instead of WIFSIGNALED(stauts) being true, it instead catches a
PTRACE_EVENT_STOP for SIGSYS, which should be impossible (the process
should be getting killed).
> Which being single threaded would seem to cause the kernel to execute
> the changed code.
>
> How there kernel at that point is having the process exit with anything
> except SIGSYS I am not immediately seeing.
I've run out of time at the moment to debug further, but I've appended
my changes to the test, and a brute-force change to kernel/seccomp.c to
restore original behavior (though I haven't tested if coredumping works
still). I'll return to this in a few hours...
>
> The logic is the same as that for SECCOMP_RET_TRAP is there a test for
> that, that is also failing?
>
> How do you run that test anyway?
cd tools/testing/selftests/seccomp
make seccomp_bpf
scp seccomp_bpf target:
ssh target ./seccomp_bpf
diff --git a/kernel/seccomp.c b/kernel/seccomp.c
index 4d8f44a17727..b6c8c8f8bd69 100644
--- a/kernel/seccomp.c
+++ b/kernel/seccomp.c
@@ -1269,10 +1269,12 @@ static int __seccomp_filter(int this_syscall, const struct seccomp_data *sd,
syscall_rollback(current, current_pt_regs());
/* Trigger a coredump with SIGSYS */
force_sig_seccomp(this_syscall, data, true);
- } else {
- do_exit(SIGSYS);
+ do_group_exit(SIGSYS);
}
- return -1; /* skip the syscall go directly to signal handling */
+ if (action == SECCOMP_RET_KILL_THREAD)
+ do_exit(SIGSYS);
+ else
+ do_group_exit(SIGSYS);
}
unreachable();
diff --git a/tools/testing/selftests/seccomp/seccomp_bpf.c b/tools/testing/selftests/seccomp/seccomp_bpf.c
index 1d64891e6492..8f8c1df885d6 100644
--- a/tools/testing/selftests/seccomp/seccomp_bpf.c
+++ b/tools/testing/selftests/seccomp/seccomp_bpf.c
@@ -1487,7 +1487,7 @@ TEST_F(precedence, log_is_fifth_in_any_order)
#define PTRACE_EVENT_SECCOMP 7
#endif
-#define IS_SECCOMP_EVENT(status) ((status >> 16) == PTRACE_EVENT_SECCOMP)
+#define PTRACE_EVENT_MASK(status) ((status) >> 16)
bool tracer_running;
void tracer_stop(int sig)
{
@@ -1536,17 +1536,34 @@ void start_tracer(struct __test_metadata *_metadata, int fd, pid_t tracee,
/* Run until we're shut down. Must assert to stop execution. */
while (tracer_running) {
int status;
+ bool run_callback = true;
if (wait(&status) != tracee)
continue;
+
if (WIFSIGNALED(status) || WIFEXITED(status))
/* Child is dead. Time to go. */
return;
- /* Check if this is a seccomp event. */
- ASSERT_EQ(!ptrace_syscall, IS_SECCOMP_EVENT(status));
+ /* Check if we got an expected event. */
+ ASSERT_EQ(WIFCONTINUED(status), false);
+ ASSERT_EQ(WIFSTOPPED(status), true);
+ ASSERT_EQ(WSTOPSIG(status) & SIGTRAP, SIGTRAP) {
+ TH_LOG("WSTOPSIG: %d", WSTOPSIG(status));
+ }
+ if (ptrace_syscall) {
+ EXPECT_EQ(WSTOPSIG(status) & 0x80, 0x80) {
+ TH_LOG("WSTOPSIG: %d", WSTOPSIG(status));
+ run_callback = false;
+ };
+ } else {
+ EXPECT_EQ(PTRACE_EVENT_MASK(status), PTRACE_EVENT_SECCOMP) {
+ run_callback = false;
+ };
+ }
- tracer_func(_metadata, tracee, status, args);
+ if (run_callback)
+ tracer_func(_metadata, tracee, status, args);
ret = ptrace(ptrace_syscall ? PTRACE_SYSCALL : PTRACE_CONT,
tracee, NULL, 0);
--
Kees Cook
Kees Cook <[email protected]> writes:
> On Thu, Oct 28, 2021 at 12:26:26PM -0500, Eric W. Biederman wrote:
>> Kees Cook <[email protected]> writes:
>>
>> > On Thu, Oct 28, 2021 at 06:21:12PM +0200, Andrea Righi wrote:
>> >> The following sub-tests are failing in seccomp_bpf selftest:
>> >>
>> >> 18:56:54 DEBUG| [stdout] # selftests: seccomp: seccomp_bpf
>> >> ...
>> >> 18:56:57 DEBUG| [stdout] # # RUN TRACE_syscall.ptrace.kill_after ...
>> >> 18:56:57 DEBUG| [stdout] # # seccomp_bpf.c:2023:kill_after:Expected entry ? PTRACE_EVENTMSG_SYSCALL_ENTRY : PTRACE_EVENTMSG_SYSCALL_EXIT (1) == msg (0)
>> >> 18:56:57 DEBUG| [stdout] # # seccomp_bpf.c:2023:kill_after:Expected entry ? PTRACE_EVENTMSG_SYSCALL_ENTRY : PTRACE_EVENTMSG_SYSCALL_EXIT (2) == msg (1)
>> >> 18:56:57 DEBUG| [stdout] # # seccomp_bpf.c:2023:kill_after:Expected entry ? PTRACE_EVENTMSG_SYSCALL_ENTRY : PTRACE_EVENTMSG_SYSCALL_EXIT (1) == msg (2)
>> >> 18:56:57 DEBUG| [stdout] # # kill_after: Test exited normally instead of by signal (code: 12)
>> >> 18:56:57 DEBUG| [stdout] # # FAIL TRACE_syscall.ptrace.kill_after
>> >> ...
>> >> 18:56:57 DEBUG| [stdout] # # RUN TRACE_syscall.seccomp.kill_after ...
>> >> 18:56:57 DEBUG| [stdout] # # seccomp_bpf.c:1547:kill_after:Expected !ptrace_syscall (1) == IS_SECCOMP_EVENT(status) (0)
>> >> 18:56:57 DEBUG| [stdout] # # kill_after: Test exited normally instead of by signal (code: 0)
>> >> 18:56:57 DEBUG| [stdout] # # FAIL TRACE_syscall.seccomp.kill_after
>> >> 18:56:57 DEBUG| [stdout] # not ok 80 TRACE_syscall.seccomp.kill_after
>> >> ...
>> >> 18:56:57 DEBUG| [stdout] # # FAILED: 85 / 87 tests passed.
>> >> 18:56:57 DEBUG| [stdout] # # Totals: pass:85 fail:2 xfail:0 xpass:0 skip:0 error:0
>> >> 18:56:57 DEBUG| [stdout] not ok 1 selftests: seccomp: seccomp_bpf # exit=1
>> >>
>> >> I did some bisecting and found that the failures started to happen with:
>> >>
>> >> 307d522f5eb8 ("signal/seccomp: Refactor seccomp signal and coredump generation")
>> >>
>> >> Not sure if the test needs to be fixed after this commit, or if the
>> >> commit is actually introducing an issue. I'll investigate more, unless
>> >> someone knows already what's going on.
>> >
>> > Ah thanks for noticing; I will investigate...
>>
>>
>> I just did a quick read through of the test and while
>> I don't understand everything having a failure seems
>> very weird.
>>
>> I don't understand the comment:
>> /* Tracer will redirect getpid to getppid, and we should die. */
>>
>> As I think what happens is it the bpf programs loads the signal
>> number. Tests to see if the signal number if GETPPID and allows
>> that system call and causes any other system call to be terminated.
>
> The test suite runs a series of seccomp filter vs syscalls under tracing,
> either with ptrace or with seccomp SECCOMP_RET_TRACE, to validate the
> expected behavioral states. It seems that what's happened is that the
> SIGSYS has suddenly become non-killing:
>
> # RUN TRACE_syscall.ptrace.kill_after ...
> # seccomp_bpf.c:1555:kill_after:Expected WSTOPSIG(status) & 0x80 (0) == 0x80 (128)
> # seccomp_bpf.c:1556:kill_after:WSTOPSIG: 31
> # kill_after: Test exited normally instead of by signal (code: 12)
> # FAIL TRACE_syscall.ptrace.kill_after
>
> i.e. the ptracer no longer sees a dead tracee, which would pass through
> here:
>
> if (WIFSIGNALED(status) || WIFEXITED(status))
> /* Child is dead. Time to go. */
> return;
>
> So the above saw a SIG_TRAP|SIGSYS rather than a killing SIGSYS. i.e.
> instead of WIFSIGNALED(stauts) being true, it instead catches a
> PTRACE_EVENT_STOP for SIGSYS, which should be impossible (the process
> should be getting killed).
Oh. This is being ptraced as part of the test?
Yes. The signal started being delivered. As far as that goes that
sounds correct.
Ptrace is allowed to intercept even fatal signals. Everything except
SIGKILL.
Is this a condition we don't want even ptrace to be able to catch?
I think we can arrange it so that even ptrace can't intercept this
signal. I need to sit this problem on the back burner for a few
minutes. It is an angle I had not considered.
Is it a problem that the debugger can see the signal if the process does
not?
Eric
On Thu, Oct 28, 2021 at 05:06:53PM -0500, Eric W. Biederman wrote:
> Kees Cook <[email protected]> writes:
>
> > On Thu, Oct 28, 2021 at 12:26:26PM -0500, Eric W. Biederman wrote:
> >> Kees Cook <[email protected]> writes:
> >>
> >> > On Thu, Oct 28, 2021 at 06:21:12PM +0200, Andrea Righi wrote:
> >> >> The following sub-tests are failing in seccomp_bpf selftest:
> >> >>
> >> >> 18:56:54 DEBUG| [stdout] # selftests: seccomp: seccomp_bpf
> >> >> ...
> >> >> 18:56:57 DEBUG| [stdout] # # RUN TRACE_syscall.ptrace.kill_after ...
> >> >> 18:56:57 DEBUG| [stdout] # # seccomp_bpf.c:2023:kill_after:Expected entry ? PTRACE_EVENTMSG_SYSCALL_ENTRY : PTRACE_EVENTMSG_SYSCALL_EXIT (1) == msg (0)
> >> >> 18:56:57 DEBUG| [stdout] # # seccomp_bpf.c:2023:kill_after:Expected entry ? PTRACE_EVENTMSG_SYSCALL_ENTRY : PTRACE_EVENTMSG_SYSCALL_EXIT (2) == msg (1)
> >> >> 18:56:57 DEBUG| [stdout] # # seccomp_bpf.c:2023:kill_after:Expected entry ? PTRACE_EVENTMSG_SYSCALL_ENTRY : PTRACE_EVENTMSG_SYSCALL_EXIT (1) == msg (2)
> >> >> 18:56:57 DEBUG| [stdout] # # kill_after: Test exited normally instead of by signal (code: 12)
> >> >> 18:56:57 DEBUG| [stdout] # # FAIL TRACE_syscall.ptrace.kill_after
> >> >> ...
> >> >> 18:56:57 DEBUG| [stdout] # # RUN TRACE_syscall.seccomp.kill_after ...
> >> >> 18:56:57 DEBUG| [stdout] # # seccomp_bpf.c:1547:kill_after:Expected !ptrace_syscall (1) == IS_SECCOMP_EVENT(status) (0)
> >> >> 18:56:57 DEBUG| [stdout] # # kill_after: Test exited normally instead of by signal (code: 0)
> >> >> 18:56:57 DEBUG| [stdout] # # FAIL TRACE_syscall.seccomp.kill_after
> >> >> 18:56:57 DEBUG| [stdout] # not ok 80 TRACE_syscall.seccomp.kill_after
> >> >> ...
> >> >> 18:56:57 DEBUG| [stdout] # # FAILED: 85 / 87 tests passed.
> >> >> 18:56:57 DEBUG| [stdout] # # Totals: pass:85 fail:2 xfail:0 xpass:0 skip:0 error:0
> >> >> 18:56:57 DEBUG| [stdout] not ok 1 selftests: seccomp: seccomp_bpf # exit=1
> >> >>
> >> >> I did some bisecting and found that the failures started to happen with:
> >> >>
> >> >> 307d522f5eb8 ("signal/seccomp: Refactor seccomp signal and coredump generation")
> >> >>
> >> >> Not sure if the test needs to be fixed after this commit, or if the
> >> >> commit is actually introducing an issue. I'll investigate more, unless
> >> >> someone knows already what's going on.
> >> >
> >> > Ah thanks for noticing; I will investigate...
> >>
> >>
> >> I just did a quick read through of the test and while
> >> I don't understand everything having a failure seems
> >> very weird.
> >>
> >> I don't understand the comment:
> >> /* Tracer will redirect getpid to getppid, and we should die. */
> >>
> >> As I think what happens is it the bpf programs loads the signal
> >> number. Tests to see if the signal number if GETPPID and allows
> >> that system call and causes any other system call to be terminated.
> >
> > The test suite runs a series of seccomp filter vs syscalls under tracing,
> > either with ptrace or with seccomp SECCOMP_RET_TRACE, to validate the
> > expected behavioral states. It seems that what's happened is that the
> > SIGSYS has suddenly become non-killing:
> >
> > # RUN TRACE_syscall.ptrace.kill_after ...
> > # seccomp_bpf.c:1555:kill_after:Expected WSTOPSIG(status) & 0x80 (0) == 0x80 (128)
> > # seccomp_bpf.c:1556:kill_after:WSTOPSIG: 31
> > # kill_after: Test exited normally instead of by signal (code: 12)
> > # FAIL TRACE_syscall.ptrace.kill_after
> >
> > i.e. the ptracer no longer sees a dead tracee, which would pass through
> > here:
> >
> > if (WIFSIGNALED(status) || WIFEXITED(status))
> > /* Child is dead. Time to go. */
> > return;
> >
> > So the above saw a SIG_TRAP|SIGSYS rather than a killing SIGSYS. i.e.
> > instead of WIFSIGNALED(stauts) being true, it instead catches a
> > PTRACE_EVENT_STOP for SIGSYS, which should be impossible (the process
> > should be getting killed).
>
> Oh. This is being ptraced as part of the test?
>
> Yes. The signal started being delivered. As far as that goes that
> sounds correct.
>
> Ptrace is allowed to intercept even fatal signals. Everything except
> SIGKILL.
>
> Is this a condition we don't want even ptrace to be able to catch?
>
> I think we can arrange it so that even ptrace can't intercept this
> signal. I need to sit this problem on the back burner for a few
> minutes. It is an angle I had not considered.
>
> Is it a problem that the debugger can see the signal if the process does
> not?
Right, I'm trying to understand that too. However, my neighbor just lost
power. :|
What I was in the middle of checking was what ptrace "sees" going
through a fatal SIGSYS; my initial debugging attempts were weird.
-Kees
--
Kees Cook
As Andy pointed out that there are races between
force_sig_info_to_task and sigaction[1] when force_sig_info_task. As
Kees discovered[2] ptrace is also able to change these signals.
In the case of seeccomp killing a process with a signal it is a
security violation to allow the signal to be caught or manipulated.
Solve this problem by introducing a new flag SA_IMMUTABLE that
prevents sigaction and ptrace from modifying these forced signals.
This flag is carefully made kernel internal so that no new ABI is
introduced.
Longer term I think this can be solved by guaranteeing short circuit
delivery of signals in this case. Unfortunately reliable and
guaranteed short circuit delivery of these signals is still a ways off
from being implemented, tested, and merged. So I have implemented a much
simpler alternative for now.
[1] https://lkml.kernel.org/r/[email protected]
[2] https://lkml.kernel.org/r/202110281136.5CE65399A7@keescook
Cc: [email protected]
Fixes: 307d522f5eb8 ("signal/seccomp: Refactor seccomp signal and coredump generation")
Signed-off-by: "Eric W. Biederman" <[email protected]>
---
I have tested this patch and this changed works for me to fix the issue.
I believe this closes all of the races that force_sig_info_to_task
has when sigdfl is specified. So this should be enough for anything
that needs a guaranteed that userspace can not race with the kernel
is handled.
Can folks look this over and see if I missed something?
Thank you,
Eric
include/linux/signal_types.h | 3 +++
include/uapi/asm-generic/signal-defs.h | 1 +
kernel/signal.c | 8 +++++++-
3 files changed, 11 insertions(+), 1 deletion(-)
diff --git a/include/linux/signal_types.h b/include/linux/signal_types.h
index 34cb28b8f16c..927f7c0e5bff 100644
--- a/include/linux/signal_types.h
+++ b/include/linux/signal_types.h
@@ -70,6 +70,9 @@ struct ksignal {
int sig;
};
+/* Used to kill the race between sigaction and forced signals */
+#define SA_IMMUTABLE 0x008000000
+
#ifndef __ARCH_UAPI_SA_FLAGS
#ifdef SA_RESTORER
#define __ARCH_UAPI_SA_FLAGS SA_RESTORER
diff --git a/include/uapi/asm-generic/signal-defs.h b/include/uapi/asm-generic/signal-defs.h
index fe929e7b77ca..7572f2f46ee8 100644
--- a/include/uapi/asm-generic/signal-defs.h
+++ b/include/uapi/asm-generic/signal-defs.h
@@ -45,6 +45,7 @@
#define SA_UNSUPPORTED 0x00000400
#define SA_EXPOSE_TAGBITS 0x00000800
/* 0x00010000 used on mips */
+/* 0x00800000 used for internal SA_IMMUTABLE */
/* 0x01000000 used on x86 */
/* 0x02000000 used on x86 */
/*
diff --git a/kernel/signal.c b/kernel/signal.c
index 6a5e1802b9a2..056a107e3cbc 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -1336,6 +1336,7 @@ force_sig_info_to_task(struct kernel_siginfo *info, struct task_struct *t, bool
blocked = sigismember(&t->blocked, sig);
if (blocked || ignored || sigdfl) {
action->sa.sa_handler = SIG_DFL;
+ action->sa.sa_flags |= SA_IMMUTABLE;
if (blocked) {
sigdelset(&t->blocked, sig);
recalc_sigpending_and_wake(t);
@@ -2760,7 +2761,8 @@ bool get_signal(struct ksignal *ksig)
if (!signr)
break; /* will return 0 */
- if (unlikely(current->ptrace) && signr != SIGKILL) {
+ if (unlikely(current->ptrace) && (signr != SIGKILL) &&
+ !(sighand->action[signr -1].sa.sa_flags & SA_IMMUTABLE)) {
signr = ptrace_signal(signr, &ksig->info);
if (!signr)
continue;
@@ -4110,6 +4112,10 @@ int do_sigaction(int sig, struct k_sigaction *act, struct k_sigaction *oact)
k = &p->sighand->action[sig-1];
spin_lock_irq(&p->sighand->siglock);
+ if (k->sa.sa_flags & SA_IMMUTABLE) {
+ spin_unlock_irq(&p->sighand->siglock);
+ return -EINVAL;
+ }
if (oact)
*oact = *k;
--
2.20.1
Kees Cook <[email protected]> writes:
> On Thu, Oct 28, 2021 at 05:06:53PM -0500, Eric W. Biederman wrote:
>> Kees Cook <[email protected]> writes:
>>
>> > On Thu, Oct 28, 2021 at 12:26:26PM -0500, Eric W. Biederman wrote:
>> >> Kees Cook <[email protected]> writes:
>> >>
>> >> > On Thu, Oct 28, 2021 at 06:21:12PM +0200, Andrea Righi wrote:
>> >> >> The following sub-tests are failing in seccomp_bpf selftest:
>> >> >>
>> >> >> 18:56:54 DEBUG| [stdout] # selftests: seccomp: seccomp_bpf
>> >> >> ...
>> >> >> 18:56:57 DEBUG| [stdout] # # RUN TRACE_syscall.ptrace.kill_after ...
>> >> >> 18:56:57 DEBUG| [stdout] # # seccomp_bpf.c:2023:kill_after:Expected entry ? PTRACE_EVENTMSG_SYSCALL_ENTRY : PTRACE_EVENTMSG_SYSCALL_EXIT (1) == msg (0)
>> >> >> 18:56:57 DEBUG| [stdout] # # seccomp_bpf.c:2023:kill_after:Expected entry ? PTRACE_EVENTMSG_SYSCALL_ENTRY : PTRACE_EVENTMSG_SYSCALL_EXIT (2) == msg (1)
>> >> >> 18:56:57 DEBUG| [stdout] # # seccomp_bpf.c:2023:kill_after:Expected entry ? PTRACE_EVENTMSG_SYSCALL_ENTRY : PTRACE_EVENTMSG_SYSCALL_EXIT (1) == msg (2)
>> >> >> 18:56:57 DEBUG| [stdout] # # kill_after: Test exited normally instead of by signal (code: 12)
>> >> >> 18:56:57 DEBUG| [stdout] # # FAIL TRACE_syscall.ptrace.kill_after
>> >> >> ...
>> >> >> 18:56:57 DEBUG| [stdout] # # RUN TRACE_syscall.seccomp.kill_after ...
>> >> >> 18:56:57 DEBUG| [stdout] # # seccomp_bpf.c:1547:kill_after:Expected !ptrace_syscall (1) == IS_SECCOMP_EVENT(status) (0)
>> >> >> 18:56:57 DEBUG| [stdout] # # kill_after: Test exited normally instead of by signal (code: 0)
>> >> >> 18:56:57 DEBUG| [stdout] # # FAIL TRACE_syscall.seccomp.kill_after
>> >> >> 18:56:57 DEBUG| [stdout] # not ok 80 TRACE_syscall.seccomp.kill_after
>> >> >> ...
>> >> >> 18:56:57 DEBUG| [stdout] # # FAILED: 85 / 87 tests passed.
>> >> >> 18:56:57 DEBUG| [stdout] # # Totals: pass:85 fail:2 xfail:0 xpass:0 skip:0 error:0
>> >> >> 18:56:57 DEBUG| [stdout] not ok 1 selftests: seccomp: seccomp_bpf # exit=1
>> >> >>
>> >> >> I did some bisecting and found that the failures started to happen with:
>> >> >>
>> >> >> 307d522f5eb8 ("signal/seccomp: Refactor seccomp signal and coredump generation")
>> >> >>
>> >> >> Not sure if the test needs to be fixed after this commit, or if the
>> >> >> commit is actually introducing an issue. I'll investigate more, unless
>> >> >> someone knows already what's going on.
>> >> >
>> >> > Ah thanks for noticing; I will investigate...
>> >>
>> >>
>> >> I just did a quick read through of the test and while
>> >> I don't understand everything having a failure seems
>> >> very weird.
>> >>
>> >> I don't understand the comment:
>> >> /* Tracer will redirect getpid to getppid, and we should die. */
>> >>
>> >> As I think what happens is it the bpf programs loads the signal
>> >> number. Tests to see if the signal number if GETPPID and allows
>> >> that system call and causes any other system call to be terminated.
>> >
>> > The test suite runs a series of seccomp filter vs syscalls under tracing,
>> > either with ptrace or with seccomp SECCOMP_RET_TRACE, to validate the
>> > expected behavioral states. It seems that what's happened is that the
>> > SIGSYS has suddenly become non-killing:
>> >
>> > # RUN TRACE_syscall.ptrace.kill_after ...
>> > # seccomp_bpf.c:1555:kill_after:Expected WSTOPSIG(status) & 0x80 (0) == 0x80 (128)
>> > # seccomp_bpf.c:1556:kill_after:WSTOPSIG: 31
>> > # kill_after: Test exited normally instead of by signal (code: 12)
>> > # FAIL TRACE_syscall.ptrace.kill_after
>> >
>> > i.e. the ptracer no longer sees a dead tracee, which would pass through
>> > here:
>> >
>> > if (WIFSIGNALED(status) || WIFEXITED(status))
>> > /* Child is dead. Time to go. */
>> > return;
>> >
>> > So the above saw a SIG_TRAP|SIGSYS rather than a killing SIGSYS. i.e.
>> > instead of WIFSIGNALED(stauts) being true, it instead catches a
>> > PTRACE_EVENT_STOP for SIGSYS, which should be impossible (the process
>> > should be getting killed).
>>
>> Oh. This is being ptraced as part of the test?
>>
>> Yes. The signal started being delivered. As far as that goes that
>> sounds correct.
>>
>> Ptrace is allowed to intercept even fatal signals. Everything except
>> SIGKILL.
>>
>> Is this a condition we don't want even ptrace to be able to catch?
>>
>> I think we can arrange it so that even ptrace can't intercept this
>> signal. I need to sit this problem on the back burner for a few
>> minutes. It is an angle I had not considered.
>>
>> Is it a problem that the debugger can see the signal if the process does
>> not?
>
> Right, I'm trying to understand that too. However, my neighbor just lost
> power. :|
>
> What I was in the middle of checking was what ptrace "sees" going
> through a fatal SIGSYS; my initial debugging attempts were weird.
If we don't allow ptrace to see these signals, then it makes it possible
for complete_signal to short circuit deliver them and ignore ptrace
later on. Which seems nice, and allows for not needing to change
sigaction at all in the future.
I don't know if it is strictly necessary. It is not like people using
debuggers have complained yet.
I just posted a patch that solves this by setting an extra flag called
SA_IMMUTABLE and disabling sigaction and ptrace when the flag is set.
I think that is a simple patch that sorts this out.
Eric
On Fri, Oct 29, 2021 at 10:09:04AM -0500, Eric W. Biederman wrote:
>
> As Andy pointed out that there are races between
> force_sig_info_to_task and sigaction[1] when force_sig_info_task. As
> Kees discovered[2] ptrace is also able to change these signals.
>
> In the case of seeccomp killing a process with a signal it is a
> security violation to allow the signal to be caught or manipulated.
>
> Solve this problem by introducing a new flag SA_IMMUTABLE that
> prevents sigaction and ptrace from modifying these forced signals.
> This flag is carefully made kernel internal so that no new ABI is
> introduced.
>
> Longer term I think this can be solved by guaranteeing short circuit
> delivery of signals in this case. Unfortunately reliable and
> guaranteed short circuit delivery of these signals is still a ways off
> from being implemented, tested, and merged. So I have implemented a much
> simpler alternative for now.
>
> [1] https://lkml.kernel.org/r/[email protected]
> [2] https://lkml.kernel.org/r/202110281136.5CE65399A7@keescook
> Cc: [email protected]
> Fixes: 307d522f5eb8 ("signal/seccomp: Refactor seccomp signal and coredump generation")
> Signed-off-by: "Eric W. Biederman" <[email protected]>
> ---
FWIW I've tested this patch and I confirm that it fixes the failure that
I reported with the seccomp_bpf selftest.
Tested-by: Andrea Righi <[email protected]>
Thanks!
-Andrea
>
> I have tested this patch and this changed works for me to fix the issue.
>
> I believe this closes all of the races that force_sig_info_to_task
> has when sigdfl is specified. So this should be enough for anything
> that needs a guaranteed that userspace can not race with the kernel
> is handled.
>
> Can folks look this over and see if I missed something?
> Thank you,
> Eric
>
>
> include/linux/signal_types.h | 3 +++
> include/uapi/asm-generic/signal-defs.h | 1 +
> kernel/signal.c | 8 +++++++-
> 3 files changed, 11 insertions(+), 1 deletion(-)
>
> diff --git a/include/linux/signal_types.h b/include/linux/signal_types.h
> index 34cb28b8f16c..927f7c0e5bff 100644
> --- a/include/linux/signal_types.h
> +++ b/include/linux/signal_types.h
> @@ -70,6 +70,9 @@ struct ksignal {
> int sig;
> };
>
> +/* Used to kill the race between sigaction and forced signals */
> +#define SA_IMMUTABLE 0x008000000
> +
> #ifndef __ARCH_UAPI_SA_FLAGS
> #ifdef SA_RESTORER
> #define __ARCH_UAPI_SA_FLAGS SA_RESTORER
> diff --git a/include/uapi/asm-generic/signal-defs.h b/include/uapi/asm-generic/signal-defs.h
> index fe929e7b77ca..7572f2f46ee8 100644
> --- a/include/uapi/asm-generic/signal-defs.h
> +++ b/include/uapi/asm-generic/signal-defs.h
> @@ -45,6 +45,7 @@
> #define SA_UNSUPPORTED 0x00000400
> #define SA_EXPOSE_TAGBITS 0x00000800
> /* 0x00010000 used on mips */
> +/* 0x00800000 used for internal SA_IMMUTABLE */
> /* 0x01000000 used on x86 */
> /* 0x02000000 used on x86 */
> /*
> diff --git a/kernel/signal.c b/kernel/signal.c
> index 6a5e1802b9a2..056a107e3cbc 100644
> --- a/kernel/signal.c
> +++ b/kernel/signal.c
> @@ -1336,6 +1336,7 @@ force_sig_info_to_task(struct kernel_siginfo *info, struct task_struct *t, bool
> blocked = sigismember(&t->blocked, sig);
> if (blocked || ignored || sigdfl) {
> action->sa.sa_handler = SIG_DFL;
> + action->sa.sa_flags |= SA_IMMUTABLE;
> if (blocked) {
> sigdelset(&t->blocked, sig);
> recalc_sigpending_and_wake(t);
> @@ -2760,7 +2761,8 @@ bool get_signal(struct ksignal *ksig)
> if (!signr)
> break; /* will return 0 */
>
> - if (unlikely(current->ptrace) && signr != SIGKILL) {
> + if (unlikely(current->ptrace) && (signr != SIGKILL) &&
> + !(sighand->action[signr -1].sa.sa_flags & SA_IMMUTABLE)) {
> signr = ptrace_signal(signr, &ksig->info);
> if (!signr)
> continue;
> @@ -4110,6 +4112,10 @@ int do_sigaction(int sig, struct k_sigaction *act, struct k_sigaction *oact)
> k = &p->sighand->action[sig-1];
>
> spin_lock_irq(&p->sighand->siglock);
> + if (k->sa.sa_flags & SA_IMMUTABLE) {
> + spin_unlock_irq(&p->sighand->siglock);
> + return -EINVAL;
> + }
> if (oact)
> *oact = *k;
>
> --
> 2.20.1
Andrea Righi <[email protected]> writes:
> On Fri, Oct 29, 2021 at 10:09:04AM -0500, Eric W. Biederman wrote:
>>
>> As Andy pointed out that there are races between
>> force_sig_info_to_task and sigaction[1] when force_sig_info_task. As
>> Kees discovered[2] ptrace is also able to change these signals.
>>
>> In the case of seeccomp killing a process with a signal it is a
>> security violation to allow the signal to be caught or manipulated.
>>
>> Solve this problem by introducing a new flag SA_IMMUTABLE that
>> prevents sigaction and ptrace from modifying these forced signals.
>> This flag is carefully made kernel internal so that no new ABI is
>> introduced.
>>
>> Longer term I think this can be solved by guaranteeing short circuit
>> delivery of signals in this case. Unfortunately reliable and
>> guaranteed short circuit delivery of these signals is still a ways off
>> from being implemented, tested, and merged. So I have implemented a much
>> simpler alternative for now.
>>
>> [1] https://lkml.kernel.org/r/[email protected]
>> [2] https://lkml.kernel.org/r/202110281136.5CE65399A7@keescook
>> Cc: [email protected]
>> Fixes: 307d522f5eb8 ("signal/seccomp: Refactor seccomp signal and coredump generation")
>> Signed-off-by: "Eric W. Biederman" <[email protected]>
>> ---
>
> FWIW I've tested this patch and I confirm that it fixes the failure that
> I reported with the seccomp_bpf selftest.
>
> Tested-by: Andrea Righi <[email protected]>
Sigh. Except for the extra 0 in the definition of SA_IMMUTABLE
that caused it to conflict with the x86 specific signal numbers.
Eric
Kees Cook <[email protected]> writes:
> On Thu, Oct 28, 2021 at 05:06:53PM -0500, Eric W. Biederman wrote:
>> Kees Cook <[email protected]> writes:
>>
>> > On Thu, Oct 28, 2021 at 12:26:26PM -0500, Eric W. Biederman wrote:
>>
>> Is it a problem that the debugger can see the signal if the process does
>> not?
>
> Right, I'm trying to understand that too. However, my neighbor just lost
> power. :|
>
> What I was in the middle of checking was what ptrace "sees" going
> through a fatal SIGSYS; my initial debugging attempts were weird.
Kees have you regained power and had a chance to see my SA_IMMUTABLE
patch?
Does what I implemented seem like it will work for you?
I think it is a solid and simple solution to a pair of problems with my
change to use the ordinary coredump path for seccomp. But I would very
much love to hear it seems reasonable to you, as you were looking at the
problem as well.
Eric
On Tue, Nov 02, 2021 at 01:22:19PM -0500, Eric W. Biederman wrote:
> Kees Cook <[email protected]> writes:
>
> > On Thu, Oct 28, 2021 at 05:06:53PM -0500, Eric W. Biederman wrote:
> >> Kees Cook <[email protected]> writes:
> >>
> >> > On Thu, Oct 28, 2021 at 12:26:26PM -0500, Eric W. Biederman wrote:
> >>
> >> Is it a problem that the debugger can see the signal if the process does
> >> not?
> >
> > Right, I'm trying to understand that too. However, my neighbor just lost
> > power. :|
> >
> > What I was in the middle of checking was what ptrace "sees" going
> > through a fatal SIGSYS; my initial debugging attempts were weird.
>
> Kees have you regained power and had a chance to see my SA_IMMUTABLE
> patch?
Apologies; I got busy with other stuff, but I've tested this now. It's
happy and I see the expected behaviors again. Note that I used the patch
with this change:
-#define SA_IMMUTABLE 0x008000000
+#define SA_IMMUTABLE 0x00800000
Tested-by: Kees Cook <[email protected]>
Thanks!
-Kees
--
Kees Cook
Kees Cook <[email protected]> writes:
> On Tue, Nov 02, 2021 at 01:22:19PM -0500, Eric W. Biederman wrote:
>> Kees Cook <[email protected]> writes:
>>
>> > On Thu, Oct 28, 2021 at 05:06:53PM -0500, Eric W. Biederman wrote:
>> >> Kees Cook <[email protected]> writes:
>> >>
>> >> > On Thu, Oct 28, 2021 at 12:26:26PM -0500, Eric W. Biederman wrote:
>> >>
>> >> Is it a problem that the debugger can see the signal if the process does
>> >> not?
>> >
>> > Right, I'm trying to understand that too. However, my neighbor just lost
>> > power. :|
>> >
>> > What I was in the middle of checking was what ptrace "sees" going
>> > through a fatal SIGSYS; my initial debugging attempts were weird.
>>
>> Kees have you regained power and had a chance to see my SA_IMMUTABLE
>> patch?
>
> Apologies; I got busy with other stuff, but I've tested this now. It's
> happy and I see the expected behaviors again. Note that I used the patch
> with this change:
>
> -#define SA_IMMUTABLE 0x008000000
> +#define SA_IMMUTABLE 0x00800000
>
> Tested-by: Kees Cook <[email protected]>
Thanks.
And I see you have written a test as well that should keep
this kind of bug from happening again.
Eric