References:
 <202010281503.3D1FCFE0@keescook>
In-Reply-To: <202010281503.3D1FCFE0@keescook>
From: Andy Lutomirski
Date: Wed, 28 Oct 2020 16:59:51 -0700
Subject: Re: [seccomp] Request for a "enable on execve" mode for Seccomp filters
To: Kees Cook
Cc: Camille Mougey, lkml, Jann Horn, Tycho Andersen, Rich Felker, Sargun Dhillon, Christian Brauner, "Michael Kerrisk (man-pages)", Denis Efremov, Andy Lutomirski
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, Oct 28, 2020 at 3:47 PM Kees Cook wrote:
>
> On Wed, Oct 28, 2020 at 12:18:47PM +0100, Camille Mougey wrote:
> > (This is my first message to the kernel list, I hope I'm doing it right)
>
> 1- self-confinement
> 2- launching external processes
>    a) cooperating
>    b) oblivious

I remain quite unconvinced that delayed filters will solve a real
problem.

As you described, 2a could just confine itself. There's an obvious
synchronization point -- sd_notify(). I bet sd_notify() could be
rigged up to apply externally-supplied filters, or sd_notify() could
interact with user notifiers to get some assistance.

2b is nasty. In an ideal world, we would materialize a fully formed
process with filters installed. The problem is that processes don't
generally come fully formed. Almost all interesting processes are
dynamically linked, and they get to specify their own dynamic linkers.
Even if we limit ourselves to a known dynamic linker, we would want to
make sure that the dynamic linker is hardened against various escape
techniques. For dynamic linking, we would probably want to start out
with one set of privileges (loading libraries) and then switch.

I have an alternative suggestion to try to address some of the above:
allow a notifier to run in a mode in which it can replace the BPF
program outright. This would be something like:

    if (fork() != 0)
        return;  // do parent stuff

    // Start up. Set a BPF program that directs pretty much everything
    // at the listener.
    int fd = seccomp(..., SECCOMP_FILTER_FLAG_NEW_LISTENER |
                     SECCOMP_FILTER_FLAG_ALLOW_REPLACEMENT, ...);
    // Set up other things if needed.
    execve();

Now, in the parent, once the child is ready for its final filters:

    // Replace the filter on *all* processes using the filter to which
    // we're attached.
    // I think the locking for this should be straightforward.
    // Optional flag here to remove the ALLOW_REPLACEMENT flag, but it's
    // not really necessary since we're about to close() the listener.
    ioctl(fd, SECCOMP_IOCTL_NOTIF_REPLACE_FILTER, new_filter);

    // Call recv in a loop to drain and handle notifications.
    for (...) {
        ioctl(fd, SECCOMP_IOCTL_NOTIF_RECV, ...);
        ...
    }

    close(fd);

And now we're done. We can make the synchronization point be anything
we like.

What do you all think? For people who really want
delay-until-execve(), this can emulate it efficiently.
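
For reference, the "BPF program that directs pretty much everything at
the listener" step can already be approximated with the existing user
notification API. The following is only a minimal sketch under stated
assumptions: build_notify_filter() and install_with_listener() are
hypothetical helper names that do not appear in the proposal, the
allow-list is whatever the caller passes in, and the proposed
SECCOMP_FILTER_FLAG_ALLOW_REPLACEMENT flag and
SECCOMP_IOCTL_NOTIF_REPLACE_FILTER ioctl are left out because they are
only a suggestion in this thread. A production filter would need more
care (for example around the x32 ABI).

```c
#include <stddef.h>
#include <linux/audit.h>
#include <linux/filter.h>
#include <linux/seccomp.h>
#include <sys/prctl.h>
#include <sys/syscall.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of the proposal): build a classic-BPF
 * seccomp program that allows the syscalls listed in allowed[] and sends
 * every other syscall to the user-notification listener.
 * Returns the number of instructions written, or -1 if they do not fit.
 */
int build_notify_filter(struct sock_filter *out, size_t cap, __u32 arch,
                        const int *allowed, size_t n_allowed)
{
    size_t i, n = 0;

    /* Jump offsets are 8 bits wide, so keep the allow-list short. */
    if (n_allowed > 255 || cap < n_allowed + 6)
        return -1;

    /* Kill the process if the syscall ABI is not the expected one. */
    out[n++] = (struct sock_filter)BPF_STMT(BPF_LD | BPF_W | BPF_ABS,
                                            offsetof(struct seccomp_data, arch));
    out[n++] = (struct sock_filter)BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K,
                                            arch, 1, 0);
    out[n++] = (struct sock_filter)BPF_STMT(BPF_RET | BPF_K,
                                            SECCOMP_RET_KILL_PROCESS);

    /* Load the syscall number and compare it with each allowed entry. */
    out[n++] = (struct sock_filter)BPF_STMT(BPF_LD | BPF_W | BPF_ABS,
                                            offsetof(struct seccomp_data, nr));
    for (i = 0; i < n_allowed; i++) {
        /* On a match, skip the remaining tests and the USER_NOTIF return. */
        out[n++] = (struct sock_filter)BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K,
                                                (__u32)allowed[i],
                                                (__u8)(n_allowed - i), 0);
    }

    /* Default: hand the syscall to the listener. */
    out[n++] = (struct sock_filter)BPF_STMT(BPF_RET | BPF_K,
                                            SECCOMP_RET_USER_NOTIF);
    /* Target of the allow-list jumps. */
    out[n++] = (struct sock_filter)BPF_STMT(BPF_RET | BPF_K,
                                            SECCOMP_RET_ALLOW);
    return (int)n;
}

/*
 * Hypothetical helper: install the program on the calling thread and
 * return the listener fd (or -1 on error). Uses only existing flags.
 */
int install_with_listener(struct sock_filter *prog, unsigned short len)
{
    struct sock_fprog fprog = { .len = len, .filter = prog };

    /* Required for unprivileged callers before installing a filter. */
    if (prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0) != 0)
        return -1;

    return (int)syscall(SYS_seccomp, SECCOMP_SET_MODE_FILTER,
                        SECCOMP_FILTER_FLAG_NEW_LISTENER, &fprog);
}
```

In this sketch the child would call install_with_listener() before
execve() and pass the returned fd to the supervisor (for instance over
a Unix socket with SCM_RIGHTS); any syscall the child needs for that
hand-off has to be in the allow-list. The filter replacement step
itself has no equivalent in the existing API.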