Date: Fri, 16 Mar 2018 17:40:48 +0100
From: Christian Brauner
To: Andy Lutomirski
Cc: Andy Lutomirski, Tycho Andersen, Kees Cook, Linux Containers, Akihiro Suda, LKML, Oleg Nesterov, Christian Brauner, "Eric W . Biederman", Christian Brauner, Tyler Hicks, Alexei Starovoitov
Subject: Re: [RFC 0/3] seccomp trap to userspace
Message-ID: <20180316164048.GA30454@mailbox.org>
References: <20180204104946.25559-1-tycho@tycho.ws> <20180315160924.GA12744@gmail.com> <20180315170509.GA32766@mail.hallyn.com> <20180315173524.k7vwnvnhomg2j5yv@smitten> <20180316144751.GA3304@mailbox.org>
X-Mailing-List: linux-kernel@vger.kernel.org

On Fri, Mar 16, 2018 at 09:01:47AM -0700, Andy Lutomirski wrote:
>
>
> > On Mar 16, 2018, at 7:47 AM, Christian Brauner wrote:
> >
> >> On Fri, Mar 16, 2018 at 12:46:55AM +0000, Andy Lutomirski wrote:
>
>
> I bet I confused everyone with a blatant typo:
>
> >>
> >> Hmm, I think we have to be very careful to avoid nasty races. I think
> >> the correct approach is to notice the signal and send a message to the
> >> listener that a signal is pending but to take no additional action.
> >> If the handler ends up completing the syscall with a successful
> >> return, we don't want to replace it with -EINTR. IOW the code looks
> >> kind of like:
> >>
> >> send_to_listener("hey I got a signal");
>
> That should be “hey I got a syscall”. D’oh!

Ha ok, that's what led me to believe that listener != handler and I was
trying to make sense of this. :)

Thanks!
Christian

>
> >> wait_ret = wait_interruptible for the listener to reply;
> >> if (wait_ret == -EINTR) {
> >
> > Hm, so from the pseudo-code it looks like: The handler would inform the
> > listener that it received a signal (either from the syscall requester or
> > from somewhere else) and then wait for the listener to reply to that
> > message. This would allow the listener to decide what action it wants
> > the handler to take based on the signal, i.e. either cancel the request
> > or retry? The comment makes it sound like that the handler doesn't
> > really wait on the listener when it receives a signal it simply moves
> > on.
>
> It keeps waiting killably but not interruptibly.
>
> > So no "taking no additional action" here means not have the handler
> > decide to abort but the listener?
>
> If by “handler” you mean kernel, then yes.
>
> There’s no userspace syscall handler involved. From the kernel’s
> perspective, a syscall is never still in progress when a signal handler
> is invoked; we only actually invoke signal handlers in
> prepare_exit_to_usermode() or the non-x86 equivalent and the functions
> it calls. While a syscall is running, the kernel might notice that a
> signal is pending and do one of a few things:
>
> 1. Just keep going. Not all syscalls can be interrupted.
>
> 2. Try to finish early. If a send() call has already sent some but not
>    all data, it can stop waiting and return the number of bytes sent.
>
> 3. Abort with -EINTR.
>
> 4. Abort with -ERESTARTSYS or one of its relatives. These fiddle with
>    user registers in a somewhat unpleasant way to pretend that the
>    syscall never actually happened. This works for syscalls that wait
>    with an absolute timeout, for example.
>
> 5. Set up restart_syscall() magic, rewrite regs so it looks like the
>    user was about to call restart_syscall() when the signal happened,
>    and abort.
>
> In all cases, the signal is dealt with afterwards. This could result in
> changing regs to call the handler or in simply returning.
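
For reference, outcomes 3 and 4 in the list above can be observed from
userspace. The following is only a minimal sketch, assuming a Linux/POSIX
system where a blocking read() on an empty pipe is restartable. The names
interrupted_read(), on_alarm(), tick_count and wake_fd are made up for
this illustration and are not part of the RFC. Without SA_RESTART the
caller sees -EINTR; with SA_RESTART the kernel transparently re-issues
the read(), which completes once the handler has put a byte into the pipe.

```c
#define _XOPEN_SOURCE 700
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <sys/time.h>
#include <unistd.h>

/* Hypothetical helpers for this sketch only. */
static volatile sig_atomic_t tick_count;
static volatile sig_atomic_t wake_fd = -1; /* pipe write end, or -1 */

static void on_alarm(int signo)
{
	int saved_errno = errno;

	(void)signo;
	tick_count++;
	/* From the second tick on, make data available so that a
	 * restarted read() can complete. write() is async-signal-safe. */
	if (wake_fd >= 0 && tick_count >= 2) {
		ssize_t r = write(wake_fd, "x", 1);
		(void)r;
	}
	errno = saved_errno;
}

/* Block in read() on an empty pipe while SIGALRM fires every 20 ms.
 * Returns the number of bytes read, or -errno on failure. */
long interrupted_read(int use_sa_restart)
{
	struct itimerval tick = {
		.it_interval = { .tv_sec = 0, .tv_usec = 20000 },
		.it_value    = { .tv_sec = 0, .tv_usec = 20000 },
	};
	struct itimerval stop = { { 0, 0 }, { 0, 0 } };
	struct sigaction sa, old_sa;
	int fds[2];
	char byte;
	long ret;

	if (pipe(fds) < 0)
		return -errno;

	tick_count = 0;
	/* Only the SA_RESTART case ever gets data written to the pipe. */
	wake_fd = use_sa_restart ? fds[1] : -1;

	memset(&sa, 0, sizeof(sa));
	sa.sa_handler = on_alarm;
	sigemptyset(&sa.sa_mask);
	sa.sa_flags = use_sa_restart ? SA_RESTART : 0;
	sigaction(SIGALRM, &sa, &old_sa);
	setitimer(ITIMER_REAL, &tick, NULL);

	ssize_t n = read(fds[0], &byte, 1);
	ret = n < 0 ? -errno : n;

	/* Disarm the timer before restoring the old disposition. */
	setitimer(ITIMER_REAL, &stop, NULL);
	sigaction(SIGALRM, &old_sa, NULL);
	wake_fd = -1;
	close(fds[0]);
	close(fds[1]);
	return ret;
}
```

Calling interrupted_read(0) should yield -EINTR and interrupted_read(1)
should yield 1. In both cases the kernel-side read was aborted by the
signal; the only difference is whether the signal code turned that into
-EINTR or rewound the registers to run the syscall again.
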
>
> 1-3 should work fully in seccomp. The only issue is that the kernel
> doesn’t know *which* to do, nor can the kernel force the listener to
> abort cleanly, so I think we have no real choice but to let the
> listener decide.
>
> 4 could be supported just like 1-3. 5 is awful, and I don’t think we
> should support it for user listeners.
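
To make the flow discussed above concrete, here is a small userspace
model of it, written as a plain state machine. It is only a sketch under
stated assumptions: every name in it (trap_to_listener(), struct event,
NOTE_SYSCALL, NOTE_SIGNAL_PENDING, and so on) is hypothetical and does
not come from the RFC patches, and the real waits are replaced by a
scripted list of events. It shows the two properties from the thread:
the kernel tells the listener about a pending signal once and then keeps
waiting killably, and the value the listener replies with is returned
unchanged rather than being replaced with -EINTR.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical messages the kernel side would send to the listener. */
enum note { NOTE_SYSCALL, NOTE_SIGNAL_PENDING };

/* Hypothetical events that can wake the trapped task while it waits. */
enum event_kind { EV_SIGNAL, EV_FATAL_SIGNAL, EV_REPLY };

struct event {
	enum event_kind kind;
	long reply; /* return value chosen by the listener (EV_REPLY only) */
};

/* Records what was sent to the listener, for inspection afterwards. */
struct trace {
	enum note sent[4];
	size_t nsent;
};

static void send_to_listener(struct trace *t, enum note n)
{
	if (t->nsent < sizeof(t->sent) / sizeof(t->sent[0]))
		t->sent[t->nsent++] = n;
}

/* Model of the trapped task. Returns the value the syscall would
 * return to the caller. A fatal signal, or a script that ends without
 * a reply, is modelled as -EINTR (an assumption of this sketch; a task
 * killed for real never sees a return value). */
long trap_to_listener(const struct event *script, size_t n, struct trace *t)
{
	int signal_reported = 0;

	t->nsent = 0;
	send_to_listener(t, NOTE_SYSCALL); /* "hey I got a syscall" */

	for (size_t i = 0; i < n; i++) {
		switch (script[i].kind) {
		case EV_REPLY:
			/* The listener decides the outcome, signal or not. */
			return script[i].reply;
		case EV_SIGNAL:
			/* First signal ends the interruptible wait: tell the
			 * listener, take no other action, wait killably. */
			if (!signal_reported) {
				send_to_listener(t, NOTE_SIGNAL_PENDING);
				signal_reported = 1;
			}
			break;
		case EV_FATAL_SIGNAL:
			return -EINTR;
		}
	}
	return -EINTR;
}
```

In this model a listener that wants outcome 1 or 2 simply replies with a
success value or a partial count after seeing NOTE_SIGNAL_PENDING, and
one that wants outcome 3 replies with -EINTR itself. Outcome 5 has no
counterpart here, in line with not supporting it for user listeners.
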