Date: Thu, 15 Mar 2018 18:25:59 +0100
From: Christian Brauner
To: Andy Lutomirski
Cc: "Serge E. Hallyn", Tycho Andersen, LKML, Linux Containers, Kees Cook, Oleg Nesterov, "Eric W . Biederman", Christian Brauner, Tyler Hicks, Akihiro Suda
Subject: Re: [RFC 0/3] seccomp trap to userspace
Message-ID: <20180315172558.GA28108@gmail.com>
References: <20180315170509.GA32766@mail.hallyn.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, Mar 15, 2018 at 05:11:32PM +0000, Andy Lutomirski wrote:
> On Thu, Mar 15, 2018 at 5:05 PM, Serge E. Hallyn wrote:
> > Quoting Andy Lutomirski (luto@kernel.org):
> >> On Thu, Mar 15, 2018 at 4:09 PM, Christian Brauner
> >> wrote:
> >> > On Sun, Feb 04, 2018 at 11:49:43AM +0100, Tycho Andersen wrote:
> >> >> Several months ago at Linux Plumber's, we had a discussion about adding a
> >> >> feature to seccomp which would allow seccomp to trigger a notification for some
> >> >> other process. Here's a draft of that feature.
> >> >>
> >> >> Patch 1 contains the bulk of it, patches 2 & 3 offer an alternative way to
> >> >> acquire the fd that receives notifications via ptrace (the method in patch 1
> >> >> poses some problems).
> >> >> Other suggestions for how to acquire one of these fds
> >> >> would be welcome.
> >> >>
> >> >> Take a close look at the synchronization. I think I've got it right, but I
> >> >> probably don't :)
> >> >>
> >> >> Thanks!
> >> >>
> >> >> Tycho Andersen (3):
> >> >>   seccomp: add a return code to trap to userspace
> >> >>   seccomp: hoist out filter resolving logic
> >> >>   seccomp: add a way to get a listener fd from ptrace
> >> >>
> >> >>  arch/Kconfig                                  |   7 +
> >> >>  include/linux/seccomp.h                       |  14 +-
> >> >>  include/uapi/linux/ptrace.h                   |   1 +
> >> >>  include/uapi/linux/seccomp.h                  |  18 +-
> >> >>  kernel/ptrace.c                               |   4 +
> >> >>  kernel/seccomp.c                              | 467 ++++++++++++++++++++++++--
> >> >>  tools/testing/selftests/seccomp/seccomp_bpf.c | 180 +++++++++-
> >> >>  7 files changed, 653 insertions(+), 38 deletions(-)
> >> >
> >> > Hey,
> >> >
> >> > So, I've been following the discussion silently in the background and I
> >> > see that it got sidetracked into seccomp + ebpf. While I can see that
> >> > there is value in adding epbf support to seccomp I'd really like to see
> >> > this decoupled from this patchset. Afaict, this patchset would just work
> >> > fine without the ebpf portion (but I might be just have missed the
> >> > point). So if possible I would like to see a second version of this with
> >> > the comments accounted for and - if possible - have this up for merging
> >> > independent of the ebpf patchset that's floating around.
> >> >
> >>
> >> The issue is that it might be (and, then again, might not be) nicer to
> >> to *synchronously* call out to the monitor in the filter. eBPF can do
> >> that very cleanly, whereas classic BPF can't.
> >
> > Hm, synchronously - that brings to mind a thought... I should re-look at
> > Tycho's patches first, but, if I'm in a container, start some syscall that
> > gets trapped to userspace, then I hit ctrl-c. I'd like to be able to have
> > the handler be interrupted and have it return -EINTR.
> > Is that going to
> > be possible with the synchronous approach?
>
> I think so, but it should be possible with the classic async approach
> too. The main issue is the difference between a classic filter like
> this (pseudocode):
>
> if (nr == SYS_mount) return TRAP_TO_USERSPACE;
>
> and the eBPF variant:
>
> if (nr == SYS_mount) trap_to_userspace();
>
> I admit that it's still not 100% clear to me that the latter is
> genuinely more useful than the former.

We've just discussed this on IRC, and the fact that most problems can be
addressed by interfaces we already have makes it questionable what eBPF
brings to the game here, especially since the discussion gave the
impression that if eBPF ever makes it into seccomp, it will basically be
because it allows a nice implementation of the trap to userspace. If it's
unclear whether it is even the better choice for this task, then we could
consider not trying to make this patchset use it. (I probably sound way
more polemical than I intend to.)

>
> The case where I think the synchronous function call is a huge win is this one:
>
> if (nr == SYS_mount) {
>   log("Someone called mount with args %lx\n", ...);
>   return RET_KILL;
> }
>
> The idea being that the log message wouldn't show up in the kernel log
> -- it would get sent to the listener socket belonging to whoever
> created the filter, and that process could then go and log it
> properly. This would work perfectly in containers and in totally
> unprivileged applications like Chromium.

Hm, that is a decent point, but that's also a non-essential feature. I
also wonder if there's any reason not to simply extend it to use eBPF
later, if seccomp ever uses it?

Christian
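
For anyone trying to picture the difference between the two pseudocode
variants quoted above, here is a rough userspace-only sketch. It is a toy
model, not kernel code and not anything from Tycho's patches: the action
codes, the NR_MOUNT/NR_READ numbers, struct listener, listener_send() and
run_classic() are all hypothetical names made up for illustration. The
point it tries to show is only that the classic filter can return a single
verdict which the caller then acts on, while the synchronous variant can
message the listener and still pick its own verdict.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical action codes; the names mirror the pseudocode in the
 * thread and are not real kernel constants. */
enum action { RET_ALLOW, RET_KILL, TRAP_TO_USERSPACE };

/* Stand-in syscall numbers (assumption: real values are arch-specific). */
enum { NR_READ = 0, NR_MOUNT = 165 };

/* Stand-in for the listener fd: a buffer that collects messages. */
struct listener {
	char buf[256];
	size_t len;
};

/* Append a message for the listener; silently drop it if it won't fit. */
void listener_send(struct listener *l, const char *msg)
{
	size_t n = strlen(msg);

	assert(l != NULL);
	if (l->len + n >= sizeof(l->buf))
		return;
	memcpy(l->buf + l->len, msg, n + 1);
	l->len += n;
}

/* Classic style: the filter is a pure function returning one verdict. */
enum action classic_filter(int nr)
{
	if (nr == NR_MOUNT)
		return TRAP_TO_USERSPACE;
	return RET_ALLOW;
}

/* Caller side for the classic style: the notification is a consequence
 * of the verdict, so the filter cannot notify and kill at once. */
enum action run_classic(int nr, struct listener *l)
{
	enum action a = classic_filter(nr);

	if (a == TRAP_TO_USERSPACE)
		listener_send(l, "trap\n");
	return a;
}

/* Synchronous style: the filter calls out itself, then returns any
 * verdict it likes (here: log to the listener, then kill). */
enum action sync_filter(int nr, unsigned long arg0, struct listener *l)
{
	if (nr == NR_MOUNT) {
		char msg[64];

		snprintf(msg, sizeof(msg), "mount called with arg %lx\n", arg0);
		listener_send(l, msg);
		return RET_KILL;
	}
	return RET_ALLOW;
}
```

Calling run_classic(NR_MOUNT, &l) leaves a "trap" message with the listener
and hands back TRAP_TO_USERSPACE, whereas sync_filter(NR_MOUNT, arg, &l)
leaves a formatted log line and hands back RET_KILL. Whether that extra
flexibility is worth tying the feature to eBPF is exactly the open question
in this thread.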