Date: Mon, 7 Sep 2020 12:15:22 +0200
From: Christian Brauner
To: Gabriel Krisman Bertazi
Cc: luto@kernel.org, tglx@linutronix.de, keescook@chromium.org, x86@kernel.org, linux-kernel@vger.kernel.org, linux-api@vger.kernel.org, willy@infradead.org, linux-kselftest@vger.kernel.org, shuah@kernel.org, kernel@collabora.com
Subject: Re: [PATCH v6 6/9] kernel: entry: Support Syscall User Dispatch for common syscall entry

On Fri, Sep 04, 2020 at 04:31:44PM -0400, Gabriel Krisman Bertazi
wrote:
> Syscall User Dispatch (SUD) must take precedence over seccomp, since the
> use case is emulation (it can be invoked with a different ABI) such that
> seccomp filtering by syscall number doesn't make sense in the first
> place. In addition, either the syscall is dispatched back to userspace,
> in which case there is no resource for seccomp to protect, or the

Tbh, I'm torn here. I'm not a super clever attacker, but it feels to me
that this is still at least a clever way to circumvent a seccomp
sandbox.
If I were confined by a seccomp profile that would cause me to be
SIGKILLed when I try to do open(), I could prctl() myself to do user
dispatch to prevent that from happening, no?

> syscall will be executed, and seccomp will execute next.
>
> Regarding ptrace, I experimented with before and after, and while the
> same ABI argument applies, I felt it was easier to debug if I let ptrace
> happen for syscalls that are dispatched back to userspace. In addition,
> doing it after ptrace makes the code in syscall_exit_work slightly
> simpler, since it doesn't require special handling for this feature.
>
> Signed-off-by: Gabriel Krisman Bertazi
> ---
>  kernel/entry/common.c | 8 ++++++++
>  1 file changed, 8 insertions(+)
>
> diff --git a/kernel/entry/common.c b/kernel/entry/common.c
> index 44fd089d59da..fdb0c543539d 100644
> --- a/kernel/entry/common.c
> +++ b/kernel/entry/common.c
> @@ -6,6 +6,8 @@
>  #include
>  #include
>
> +#include "common.h"
> +
>  #define CREATE_TRACE_POINTS
>  #include
>
> @@ -47,6 +49,12 @@ static inline long do_syscall_intercept(struct pt_regs *regs)
>  	int sysint_work = READ_ONCE(current->syscall_intercept);
>  	int ret;
>
> +	if (sysint_work & SYSINT_USER_DISPATCH) {
> +		ret = do_syscall_user_dispatch(regs);
> +		if (ret == -1L)
> +			return ret;
> +	}
> +
>  	if (sysint_work & SYSINT_SECCOMP) {
>  		ret = __secure_computing(NULL);
>  		if (ret == -1L)
> --
> 2.28.0
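
For anyone following the ordering argument, a minimal userspace sketch of the control flow in the do_syscall_intercept() hunk above may help. It is not kernel code: the flag values, the stub functions (stub_user_dispatch, stub_seccomp) and the intercept() signature are assumptions made up for illustration. Only the order of the two checks is taken from the patch.

```c
#include <stdio.h>

/* Hypothetical flag values; the quoted patch does not show the real ones. */
#define SYSINT_USER_DISPATCH 0x1
#define SYSINT_SECCOMP       0x2

/* Counts how often the seccomp stand-in was consulted. */
static int seccomp_calls;

/*
 * Stand-in for do_syscall_user_dispatch(): returns -1 when the syscall is
 * redirected back to userspace, 0 when it should continue into the kernel.
 */
static long stub_user_dispatch(int redirect)
{
	return redirect ? -1L : 0L;
}

/*
 * Stand-in for __secure_computing(): records the call and returns -1 when
 * the (hypothetical) filter denies the syscall.
 */
static long stub_seccomp(int deny)
{
	seccomp_calls++;
	return deny ? -1L : 0L;
}

/* Mirrors the check order in the hunk: user dispatch first, then seccomp. */
static long intercept(int work, int redirect, int deny)
{
	long ret = 0;

	if (work & SYSINT_USER_DISPATCH) {
		ret = stub_user_dispatch(redirect);
		if (ret == -1L)
			return ret;	/* seccomp is never consulted */
	}

	if (work & SYSINT_SECCOMP) {
		ret = stub_seccomp(deny);
		if (ret == -1L)
			return ret;
	}

	return ret;
}

/* Small helper to print one scenario. */
static void show(const char *label, int work, int redirect, int deny)
{
	long ret = intercept(work, redirect, deny);

	printf("%s: ret=%ld seccomp_calls=%d\n", label, ret, seccomp_calls);
}
```

With both flags set, intercept(SYSINT_USER_DISPATCH | SYSINT_SECCOMP, 1, 1) returns -1 without calling the seccomp stub, which is the case the question above is about. With redirect set to 0, the seccomp stub runs and its verdict is returned.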