Date: Thu, 18 Mar 2021 15:54:54 +0100
From: Christian Brauner
To: Kees Cook
Cc: Sargun Dhillon, LKML, Giuseppe Scrivano, Tycho Andersen,
    Hariharan Ananthakrishnan, Keerti Lakshminarayan, Kyle Anderson,
    Linux Containers List, stgraber@ubuntu.com, Andy Lutomirski
Subject: Re: seccomp: Delay filter activation
Message-ID: <20210318145454.d2xbetk2werv7j2u@wittgenstein>
In-Reply-To: <202103011515.3A941F6@keescook>

Sorry, I just found that mail.

On Mon, Mar 01, 2021 at 03:44:06PM -0800, Kees Cook wrote:
> On Mon, Mar 01, 2021 at 02:21:56PM +0100, Christian Brauner wrote:
> > On Mon, Mar 01, 2021 at 12:09:09PM +0100, Christian Brauner wrote:
> > > On Sat, Feb 20, 2021 at 01:31:57AM -0800, Sargun Dhillon wrote:
> > > > We've run into a problem where attaching a filter can be quite messy
> > > > business because the filter itself intercepts sendmsg, and other
> > > > syscalls related to exfiltrating the listener FD. I believe that this
> > > > problem set has been brought up before, and although there are
> > > > "simpler" methods of exfiltrating the listener, like clone3 or
> > > > pidfd_getfd, but these are still less than ideal.
>
> I'm trying to make sure I understand: the target process would like to
> have a filter attached that blocks sendmsg, but that would mean it has
> no way to send the listener FD to its manager?

With pidfd_getfd() that wouldn't be a problem, I think, which is what I
was trying to say. Unless the supervising task doesn't have enough
privilege over the supervised task, which seems like an odd scenario but
is technically possible, I guess.

>
> And you'd want to have listening working for sendmsg (otherwise you
> could do it with two filters, I imagine)?
>
> > > int fd_filter = seccomp(SECCOMP_SET_MODE_FILTER, SECCOMP_FILTER_DETACHED, &prog);
> > >
> > > BARRIER_WAIT_SETUP_DONE;
> > >
> > > int ret = seccomp(SECCOMP_ATTACH_FILTER, 0, INT_TO_PTR(fd_listener));
> >
> > This obviously should've been sm like:
> >
> > struct seccomp_filter_attach {
> > 	union {
> > 		__s32 pidfd;
> > 		__s32 pid;
> > 	};
> > 	__u32 fd_filter;
> > };
> >
> > and then
> >
> > int ret = seccomp(SECCOMP_ATTACH_FILTER, 0, seccomp_filter_attach);
>
> Given the difficulty with TSYNC, I'm not excited about adding an
> "apply this filter to another process" API.
> :)

Just to give a more complete reason for suggesting something like this
without trying to argue that we must have this:

seccomp() has so far been an API that is caller-centric, and by that I
mean that the caller loaded its seccomp profile and sandboxed itself. As
such, seccomp is an example of "caller-managed" security. This security
model has obvious advantages and fits into the general fork()-like world
of unix. But imho that self-management model breaks down as soon as a
file descriptor that can be used to refer to the object in question
enters into the picture. For seccomp this "breaking point" was the
seccomp notifier fd. Because with the introduction of that fd we have
introduced the concept of supervisor and supervisee for seccomp, which
imho didn't really exist in the same way before. It's pretty obvious
from the type of language that we now use both in userspace and in
kernelspace when we talk about the seccomp notifier.

At the current point we're somewhere in the middle between
caller-managed and supervised seccomp, which brings up funny problems
and edge-cases. One of the most obvious examples is in fact the question
of how to get the seccomp notify fd out of the supervised task. This
clearly points to the fact that we're missing one of the fundamentals of
an fd-based supervision model: open(). This is why I was suggesting the
SECCOMP_ATTACH_FILTER command. It's in a sense an open-call for the
seccomp notify fd.

That all being said, I know that it can be weird to implement this, and
if you prefer we go with another simpler model to work around such
things then I fully understand.

Christian
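
For reference, the pidfd_getfd() route mentioned above can be sketched
roughly as follows. This is only an illustration under stated
assumptions, not code from this thread: the helper name fetch_remote_fd()
is hypothetical, it needs Linux 5.6 or later, the caller needs
ptrace-attach level access to the target, and how the supervisor learns
the fd number inside the supervised task is left out entirely.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Fallbacks for older headers; most architectures share these numbers. */
#ifndef SYS_pidfd_open
#define SYS_pidfd_open 434
#endif
#ifndef SYS_pidfd_getfd
#define SYS_pidfd_getfd 438
#endif

/*
 * Hypothetical helper: duplicate file descriptor `target_fd` of process
 * `pid` into the calling process, e.g. a seccomp notify fd held by a
 * supervised task. Returns the new fd, or -1 with errno set.
 */
static int fetch_remote_fd(pid_t pid, int target_fd)
{
	int pidfd, fd, saved_errno;

	/* Get a stable handle on the target process. */
	pidfd = (int)syscall(SYS_pidfd_open, pid, 0);
	if (pidfd < 0)
		return -1;

	/* Fails with EPERM if we lack ptrace-attach access to the target. */
	fd = (int)syscall(SYS_pidfd_getfd, pidfd, target_fd, 0);

	/* Preserve the errno from pidfd_getfd() across close(). */
	saved_errno = errno;
	close(pidfd);
	errno = saved_errno;

	return fd;
}
```

Note that this only covers the case where the supervisor has enough
privilege over the supervised task; the odd scenario where it does not
is exactly what such a helper cannot handle.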