Received: by 2002:a05:6902:102b:0:0:0:0 with SMTP id x11csp1341497ybt; Thu, 9 Jul 2020 04:48:41 -0700 (PDT) X-Google-Smtp-Source: ABdhPJwCplPZTPyHKILWaI1uFdltZPfCbw6gFmQwpTwvSf2LnlPUPrxY39JuJERxD20dAz/MTO49 X-Received: by 2002:a17:906:4d4c:: with SMTP id b12mr54834496ejv.506.1594295321793; Thu, 09 Jul 2020 04:48:41 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1594295321; cv=none; d=google.com; s=arc-20160816; b=bSaazCUFRHPFpWdcHhF+WAmT7xNw+0O6T4WUmoK/FPOZy+KqHG5Q6XMgktekHGnl0/ theWtnqAeZfeOA40GYeUIh8G2wSW/iVRa4/nQFbMcbOF67MTcYbt4fyvPzMMl0YQo01A u2KE8xAAhMd+L5U9l40WV4+Wjoe1GafzJAW/GkiMkrP4rd55Do86l7Ut/U2nbcmvlTMD x11Bq6zKdcLu9RurJpVA4uCcYoFp/QXAt1PJbJm+1Etm16/vX2I2KqXp3bgijXepIEvC MQj+lUsaHCje6WgZaSqYztA+s+HY3YMVPQ98AzTNaEyAzaLkm5aGsXjWRaPAl03RX0LO UohA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:in-reply-to:content-disposition :mime-version:references:message-id:subject:cc:to:from:date :dkim-signature; bh=jwBkggp/9SDR2NBbKDrX7UsLr9nx6UUjUaSdXDeg+/c=; b=NaYJm0Iyv3+6pN5rAQA7jx282owbJiZ3yxKj4O3psnQE612iRSUeOl/nAEiQ79oqeG doLJ48JWWxjPnaY/mycSkZkxQC3I03epjmZhJtnDFfMzX9j7GspJWZ8pvuqfMy6lDrY+ 8uZRSIKq1QydBOVU8A0R0N4zNH6jpugfR+DKWkjPEuxqdewpsqosRDeq4uhRawhCMEhZ i+qQ+TgUK2Vp9zlzEf0IcfIN0hC9DF5fXJPmadKmGHPSLiyAPUc3w1ThE4eIssU89LE7 obRdWpZApAbfOHYgmkMIt+lGECcJ3jGejbCqRPg75dH+yybX+8VtofRzEFKZBGJmRWMv VekA== ARC-Authentication-Results: i=1; mx.google.com; dkim=fail header.i=@infradead.org header.s=casper.20170209 header.b=H1VGQHOB; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id j18si1861270edp.389.2020.07.09.04.48.17; Thu, 09 Jul 2020 04:48:41 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; dkim=fail header.i=@infradead.org header.s=casper.20170209 header.b=H1VGQHOB; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726830AbgGILrK (ORCPT + 99 others); Thu, 9 Jul 2020 07:47:10 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:49718 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726387AbgGILrK (ORCPT ); Thu, 9 Jul 2020 07:47:10 -0400 Received: from casper.infradead.org (casper.infradead.org [IPv6:2001:8b0:10b:1236::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 42227C061A0B for ; Thu, 9 Jul 2020 04:47:10 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=casper.20170209; h=In-Reply-To:Content-Type:MIME-Version: References:Message-ID:Subject:Cc:To:From:Date:Sender:Reply-To: Content-Transfer-Encoding:Content-ID:Content-Description; bh=jwBkggp/9SDR2NBbKDrX7UsLr9nx6UUjUaSdXDeg+/c=; b=H1VGQHOBrIsAiZUkDj9nu8eOaW Zp3FQ6s558T2eW/iAvO8B4Yo6qVV72ZfWuhKt1v/+WUJVEGqVy+AWQ0zXswqJrZ/PEdIlOGz8e5Xu 2rSxELHP11pEsakqwu6SIDfLWAJfjElYtkLjuMSAFPVczJAjEuxHLAp4oBy9lw6fW31nCtsyIy0XD NAcPvQGKL1OPJLOm6I5++2V4QLM6RvorJBoQ1pQmH4yIJSjHf/cUOBAy646UHeABXpiTa5JeXmkVi ffd4wGQAK+zc1TdcpkZUKATzc6NgIhEhuh0jYY8rOqx7SmAMYcsgH3OcrwD9YNEUoLr973Blzx+Ew dG8VrBhg==; Received: from willy by casper.infradead.org with local (Exim 4.92.3 #3 (Red Hat Linux)) id 1jtV18-00057j-TW; Thu, 09 Jul 2020 11:47:07 +0000 Date: Thu, 9 Jul 2020 12:47:06 +0100 From: Matthew Wilcox To: Gabriel Krisman Bertazi Cc: tglx@linutronix.de, linux-kernel@vger.kernel.org, kernel@collabora.com, Andy Lutomirski , Paul Gofman , Kees Cook Subject: Re: [PATCH v2] kernel: Implement selective syscall userspace redirection Message-ID: <20200709114706.GB12769@casper.infradead.org> References: <20200709043840.4189530-1-krisman@collabora.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20200709043840.4189530-1-krisman@collabora.com> Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Thu, Jul 09, 2020 at 12:38:40AM -0400, Gabriel Krisman Bertazi wrote: > The proposed interface looks like this: > > prctl(PR_SET_SYSCALL_USER_DISPATCH, , , [selector]) > > Dispatcher is the address of a syscall instruction that is allowed to > by-pass the blockage, such that in fast paths you don't need to disable > the trap nor check the selector. This is essential to return from > SIGSYS to a blocked area without triggering another SIGSYS from the > rt_sigreturn. Should be a single pointer or should the interface specify a range from which syscalls may be made without being redirected? eg, one could specify the whole of libc. prctl(PR_SET_SYSCALL_USER_DISPATCH, , , , [selector]) > +++ b/include/linux/syscall_user_dispatch.h > @@ -0,0 +1,45 @@ > +/* SPDX-License-Identifier: GPL-2.0 */ > +#ifndef _SYSCALL_USER_DISPATCH_H > +#define _SYSCALL_USER_DISPATCH_H > + > +struct task_struct; > +static void clear_tsk_thread_flag(struct task_struct *tsk, int flag); > + > +#ifdef CONFIG_SYSCALL_USER_DISPATCH > +struct syscall_user_dispatch { > + int __user *selector; > + unsigned long __user dispatcher; The __user annotation is on the pointer, not the value. ie, it's unsigned long foo; unsigned long __user *p; get_user(foo, p) > +++ b/include/uapi/asm-generic/siginfo.h > @@ -285,6 +285,7 @@ typedef struct siginfo { > */ > #define SYS_SECCOMP 1 /* seccomp triggered */ > #define NSIGSYS 1 > +#define SYS_USER_REDIRECT 2 I'd suggest that SYS_USER_REDIRECT should be moved up by one line. > +int set_syscall_user_dispatch(int mode, unsigned long __user dispatcher, > + int __user *selector) > +{ > + switch (mode) { > + case PR_SYSCALL_DISPATCH_DISABLE: > + if (dispatcher || selector) > + return -EINVAL; > + break; > + case PR_SYSCALL_DISPATCH_ENABLE: > + break; > + default: > + return -EINVAL; > + } > + > + if (selector) { > + if (!access_ok(selector, sizeof(int))) > + return -EFAULT; > + } You're not enforcing the alignment requirement here. > + spin_lock_irq(¤t->sighand->siglock); > + > + current->syscall_dispatch.selector = selector; > + current->syscall_dispatch.dispatcher = dispatcher; > + > + /* make sure fastlock is committed before setting the flag. */ fastlock? ;-) I don't think you actually need this. You're setting per-thread state on yourself, so what's the race that you're concerned about? > + smp_mb__before_atomic(); > + > + if (mode == PR_SYSCALL_DISPATCH_ENABLE) > + set_tsk_thread_flag(current, TIF_SYSCALL_USER_DISPATCH); > + else > + clear_tsk_thread_flag(current, TIF_SYSCALL_USER_DISPATCH); > + > + spin_unlock_irq(¤t->sighand->siglock); > + > + return 0; > +} > -- > 2.27.0 >