Date: Tue, 15 Oct 2019 09:27:39 -0700
From: Kees Cook
To: Paul Walmsley
Cc: Shuah Khan, Palmer Dabbelt, David Abdurachmanov, Albert Ou,
 Oleg Nesterov, Andy Lutomirski, Will Drewry, Alexei Starovoitov,
 Daniel Borkmann, Martin KaFai Lau, Song Liu, Yonghong Song,
 David Abdurachmanov, Thomas Gleixner, Allison Randal, Alexios Zavras,
 Anup Patel, Vincent Chen, Alan Kao, linux-riscv@lists.infradead.org,
 linux-kernel@vger.kernel.org, linux-kselftest@vger.kernel.org,
 netdev@vger.kernel.org, bpf@vger.kernel.org, me@carlosedp.com
Subject: Re: [PATCH v2] riscv: add support for SECCOMP and SECCOMP_FILTER
Message-ID: <201910150926.E621A5B@keescook>
References: <20190822205533.4877-1-david.abdurachmanov@sifive.com>

On Mon, Oct 14, 2019 at 02:06:07PM -0700, Paul Walmsley wrote:
> Shuah,
>
> Could you please take a quick look at this and ack it if you're OK with
> the tools/testing change?  We'd like to get this merged soon.

FWIW, I regularly carry these kinds of selftest changes via my seccomp
tree, so if Shuah is busy, I think it'll be fine to take this in riscv.
If not, I'll take responsibility of apologizing to Shuah!
:) :)

-Kees

>
> - Paul
>
>
> On Fri, 4 Oct 2019, Paul Walmsley wrote:
>
> > Hello Shuah,
> >
> > On Thu, 22 Aug 2019, David Abdurachmanov wrote:
> >
> > > This patch was extensively tested on Fedora/RISCV (applied by default on
> > > top of 5.2-rc7 kernel for <2 months). The patch was also tested with 5.3-rc
> > > on QEMU and SiFive Unleashed board.
> > >
> > > libseccomp (userspace) was rebased:
> > > https://github.com/seccomp/libseccomp/pull/134
> > >
> > > Fully passes libseccomp regression testing (simulation and live).
> > >
> > > There is one failing kernel selftest: global.user_notification_signal
> > >
> > > v1 -> v2:
> > > - return immediately if secure_computing(NULL) returns -1
> > > - fixed whitespace issues
> > > - add missing seccomp.h
> > > - remove patch #2 (solved now)
> > > - add riscv to seccomp kernel selftest
> > >
> > > Cc: keescook@chromium.org
> > > Cc: me@carlosedp.com
> > >
> > > Signed-off-by: David Abdurachmanov
> >
> > We'd like to merge this patch through the RISC-V tree.
> > Care to ack the change to tools/testing/selftests/seccomp/seccomp_bpf.c ?
> >
> > Kees has already reviewed it:
> >
> > https://lore.kernel.org/linux-riscv/CAJr-aD=UnCN9E_mdVJ2H5nt=6juRSWikZnA5HxDLQxXLbsRz-w@mail.gmail.com/
> >
> >
> > - Paul
> >
> >
> > > ---
> > >  arch/riscv/Kconfig                            | 14 ++++++++++
> > >  arch/riscv/include/asm/seccomp.h              | 10 +++++++
> > >  arch/riscv/include/asm/thread_info.h          |  5 +++-
> > >  arch/riscv/kernel/entry.S                     | 27 +++++++++++++++++--
> > >  arch/riscv/kernel/ptrace.c                    | 10 +++++++
> > >  tools/testing/selftests/seccomp/seccomp_bpf.c |  8 +++++-
> > >  6 files changed, 70 insertions(+), 4 deletions(-)
> > >  create mode 100644 arch/riscv/include/asm/seccomp.h
> > >
> > > diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
> > > index 59a4727ecd6c..441e63ff5adc 100644
> > > --- a/arch/riscv/Kconfig
> > > +++ b/arch/riscv/Kconfig
> > > @@ -31,6 +31,7 @@ config RISCV
> > >  	select GENERIC_SMP_IDLE_THREAD
> > >  	select GENERIC_ATOMIC64 if !64BIT
> > >  	select HAVE_ARCH_AUDITSYSCALL
> > > +	select HAVE_ARCH_SECCOMP_FILTER
> > >  	select HAVE_MEMBLOCK_NODE_MAP
> > >  	select HAVE_DMA_CONTIGUOUS
> > >  	select HAVE_FUTEX_CMPXCHG if FUTEX
> > > @@ -235,6 +236,19 @@ menu "Kernel features"
> > >
> > >  source "kernel/Kconfig.hz"
> > >
> > > +config SECCOMP
> > > +	bool "Enable seccomp to safely compute untrusted bytecode"
> > > +	help
> > > +	  This kernel feature is useful for number crunching applications
> > > +	  that may need to compute untrusted bytecode during their
> > > +	  execution. By using pipes or other transports made available to
> > > +	  the process as file descriptors supporting the read/write
> > > +	  syscalls, it's possible to isolate those applications in
> > > +	  their own address space using seccomp. Once seccomp is
> > > +	  enabled via prctl(PR_SET_SECCOMP), it cannot be disabled
> > > +	  and the task is only allowed to execute a few safe syscalls
> > > +	  defined by each seccomp mode.
> > > +
> > >  endmenu
> > >
> > >  menu "Boot options"
> > > diff --git a/arch/riscv/include/asm/seccomp.h b/arch/riscv/include/asm/seccomp.h
> > > new file mode 100644
> > > index 000000000000..bf7744ee3b3d
> > > --- /dev/null
> > > +++ b/arch/riscv/include/asm/seccomp.h
> > > @@ -0,0 +1,10 @@
> > > +/* SPDX-License-Identifier: GPL-2.0 */
> > > +
> > > +#ifndef _ASM_SECCOMP_H
> > > +#define _ASM_SECCOMP_H
> > > +
> > > +#include
> > > +
> > > +#include
> > > +
> > > +#endif /* _ASM_SECCOMP_H */
> > > diff --git a/arch/riscv/include/asm/thread_info.h b/arch/riscv/include/asm/thread_info.h
> > > index 905372d7eeb8..a0b2a29a0da1 100644
> > > --- a/arch/riscv/include/asm/thread_info.h
> > > +++ b/arch/riscv/include/asm/thread_info.h
> > > @@ -75,6 +75,7 @@ struct thread_info {
> > >  #define TIF_MEMDIE 5 /* is terminating due to OOM killer */
> > >  #define TIF_SYSCALL_TRACEPOINT 6 /* syscall tracepoint instrumentation */
> > >  #define TIF_SYSCALL_AUDIT 7 /* syscall auditing */
> > > +#define TIF_SECCOMP 8 /* syscall secure computing */
> > >
> > >  #define _TIF_SYSCALL_TRACE (1 << TIF_SYSCALL_TRACE)
> > >  #define _TIF_NOTIFY_RESUME (1 << TIF_NOTIFY_RESUME)
> > > @@ -82,11 +83,13 @@ struct thread_info {
> > >  #define _TIF_NEED_RESCHED (1 << TIF_NEED_RESCHED)
> > >  #define _TIF_SYSCALL_TRACEPOINT (1 << TIF_SYSCALL_TRACEPOINT)
> > >  #define _TIF_SYSCALL_AUDIT (1 << TIF_SYSCALL_AUDIT)
> > > +#define _TIF_SECCOMP (1 << TIF_SECCOMP)
> > >
> > >  #define _TIF_WORK_MASK \
> > >  	(_TIF_NOTIFY_RESUME | _TIF_SIGPENDING | _TIF_NEED_RESCHED)
> > >
> > >  #define _TIF_SYSCALL_WORK \
> > > -	(_TIF_SYSCALL_TRACE | _TIF_SYSCALL_TRACEPOINT | _TIF_SYSCALL_AUDIT)
> > > +	(_TIF_SYSCALL_TRACE | _TIF_SYSCALL_TRACEPOINT | _TIF_SYSCALL_AUDIT | \
> > > +	 _TIF_SECCOMP )
> > >
> > >  #endif /* _ASM_RISCV_THREAD_INFO_H */
> > > diff --git a/arch/riscv/kernel/entry.S b/arch/riscv/kernel/entry.S
> > > index bc7a56e1ca6f..0bbedfa3e47d 100644
> > > --- a/arch/riscv/kernel/entry.S
> > > +++ b/arch/riscv/kernel/entry.S
> > > @@ -203,8 +203,25 @@ check_syscall_nr:
> > >  	/* Check to make sure we don't jump to a bogus syscall number. */
> > >  	li t0, __NR_syscalls
> > >  	la s0, sys_ni_syscall
> > > -	/* Syscall number held in a7 */
> > > -	bgeu a7, t0, 1f
> > > +	/*
> > > +	 * The tracer can change syscall number to valid/invalid value.
> > > +	 * We use syscall_set_nr helper in syscall_trace_enter thus we
> > > +	 * cannot trust the current value in a7 and have to reload from
> > > +	 * the current task pt_regs.
> > > +	 */
> > > +	REG_L a7, PT_A7(sp)
> > > +	/*
> > > +	 * Syscall number held in a7.
> > > +	 * If syscall number is above allowed value, redirect to ni_syscall.
> > > +	 */
> > > +	bge a7, t0, 1f
> > > +	/*
> > > +	 * Check if syscall is rejected by tracer or seccomp, i.e., a7 == -1.
> > > +	 * If yes, we pretend it was executed.
> > > +	 */
> > > +	li t1, -1
> > > +	beq a7, t1, ret_from_syscall_rejected
> > > +	/* Call syscall */
> > >  	la s0, sys_call_table
> > >  	slli t0, a7, RISCV_LGPTR
> > >  	add s0, s0, t0
> > > @@ -215,6 +232,12 @@ check_syscall_nr:
> > >  ret_from_syscall:
> > >  	/* Set user a0 to kernel a0 */
> > >  	REG_S a0, PT_A0(sp)
> > > +	/*
> > > +	 * We didn't execute the actual syscall.
> > > +	 * Seccomp already set return value for the current task pt_regs.
> > > +	 * (If it was configured with SECCOMP_RET_ERRNO/TRACE)
> > > +	 */
> > > +ret_from_syscall_rejected:
> > >  	/* Trace syscalls, but only if requested by the user. */
> > >  	REG_L t0, TASK_TI_FLAGS(tp)
> > >  	andi t0, t0, _TIF_SYSCALL_WORK
> > > diff --git a/arch/riscv/kernel/ptrace.c b/arch/riscv/kernel/ptrace.c
> > > index 368751438366..63e47c9f85f0 100644
> > > --- a/arch/riscv/kernel/ptrace.c
> > > +++ b/arch/riscv/kernel/ptrace.c
> > > @@ -154,6 +154,16 @@ void do_syscall_trace_enter(struct pt_regs *regs)
> > >  		if (tracehook_report_syscall_entry(regs))
> > >  			syscall_set_nr(current, regs, -1);
> > >
> > > +	/*
> > > +	 * Do the secure computing after ptrace; failures should be fast.
> > > +	 * If this fails we might have return value in a0 from seccomp
> > > +	 * (via SECCOMP_RET_ERRNO/TRACE).
> > > +	 */
> > > +	if (secure_computing(NULL) == -1) {
> > > +		syscall_set_nr(current, regs, -1);
> > > +		return;
> > > +	}
> > > +
> > >  #ifdef CONFIG_HAVE_SYSCALL_TRACEPOINTS
> > >  	if (test_thread_flag(TIF_SYSCALL_TRACEPOINT))
> > >  		trace_sys_enter(regs, syscall_get_nr(current, regs));
> > > diff --git a/tools/testing/selftests/seccomp/seccomp_bpf.c b/tools/testing/selftests/seccomp/seccomp_bpf.c
> > > index 6ef7f16c4cf5..492e0adad9d3 100644
> > > --- a/tools/testing/selftests/seccomp/seccomp_bpf.c
> > > +++ b/tools/testing/selftests/seccomp/seccomp_bpf.c
> > > @@ -112,6 +112,8 @@ struct seccomp_data {
> > >  # define __NR_seccomp 383
> > >  # elif defined(__aarch64__)
> > >  # define __NR_seccomp 277
> > > +# elif defined(__riscv)
> > > +# define __NR_seccomp 277
> > >  # elif defined(__hppa__)
> > >  # define __NR_seccomp 338
> > >  # elif defined(__powerpc__)
> > > @@ -1582,6 +1584,10 @@ TEST_F(TRACE_poke, getpid_runs_normally)
> > >  # define ARCH_REGS struct user_pt_regs
> > >  # define SYSCALL_NUM regs[8]
> > >  # define SYSCALL_RET regs[0]
> > > +#elif defined(__riscv) && __riscv_xlen == 64
> > > +# define ARCH_REGS struct user_regs_struct
> > > +# define SYSCALL_NUM a7
> > > +# define SYSCALL_RET a0
> > >  #elif defined(__hppa__)
> > >  # define ARCH_REGS struct user_regs_struct
> > >  # define SYSCALL_NUM gr[20]
> > > @@ -1671,7 +1677,7 @@ void change_syscall(struct __test_metadata *_metadata,
> > >  	EXPECT_EQ(0, ret) {}
> > >
> > >  #if defined(__x86_64__) || defined(__i386__) || defined(__powerpc__) || \
> > > -	defined(__s390__) || defined(__hppa__)
> > > +	defined(__s390__) || defined(__hppa__) || defined(__riscv)
> > >  	{
> > >  		regs.SYSCALL_NUM = syscall;
> > >  	}
> > > --
> > > 2.21.0
> > >
> > >
> >
> >
> > - Paul
> >
>
>

--
Kees Cook