From: Camille Mougey
Date: Wed, 28 Oct 2020 12:18:47 +0100
Subject: [seccomp] Request for a "enable on execve" mode for Seccomp filters
To: lkml
Cc: Kees Cook, Jann Horn, Tycho Andersen, Rich Felker, Sargun Dhillon,
    Christian Brauner, "Michael Kerrisk (man-pages)", Denis Efremov

Hello,

(This is my first message to the kernel list, I hope I'm doing it right.)

From my understanding, there is no way to delay the activation of
seccomp filters, for instance "until an _execve_ call".
But this might be useful, especially for tools that sandbox other,
non-cooperative executables, such as "systemd" or "FireJail".

It seems to be a caveat of seccomp specific to the system call
_execve_. For now, some tools such as "systemd" explicitly mention this
exception, and do not support it (from the man page):

> Note that strict system call filters may impact execution and error
> handling code paths of the service invocation.
> Specifically, access to the execve system call is required for the
> execution of the service binary -- if it is blocked service invocation
> will necessarily fail.

"FireJail" takes a different approach[1], with a kind of workaround: the
project uses an external library loaded through the LD_PRELOAD
mechanism, in order to install filters during the loader stage.
This approach, a bit hacky, also has several caveats:
* _openat_, _mmap_, etc. must be allowed in order to reach the
  LD_PRELOAD mechanism, and for the crafted library to work;
* it doesn't work for static binaries.

I only see hackish ways to restrict the use of _execve_ in a
non-cooperative executable. These methods seem bypassable overall and
unsatisfactory from a security point of view.

IMHO, a way to prepare filters and enable them only on the next _execve_
would have some benefits:
* have a way to restrict _execve_ in a non-cooperative executable;
* install filters atomically, i.e. before the _execve_ system call
  returns. That would limit racy situations, and have the very first
  instructions of potentially untrusted binaries already subject to
  seccomp filters. It would also ensure there is only one thread running
  at the time the filters are enabled.

From what I understand, there is a related use case[2] where the "enable
on exec" mode would also be a solution.

Thanks for your attention,
C. Mougey

[1]: https://github.com/netblue30/firejail/issues/3685
[2]: https://lore.kernel.org/linux-man/202010250759.F9745E0B6@keescook/
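
To illustrate the _execve_ caveat discussed in this message, here is a
minimal C sketch, assuming Linux with seccomp filter mode available. It
installs a filter that denies _execve_ and then tries to exec a program
from the same process, which is the position a sandboxing tool is in
today. The helper names (install_no_execve_filter, execve_under_filter)
are hypothetical, and the filter skips the architecture check that a
real filter should perform.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <stddef.h>
#include <linux/filter.h>
#include <linux/seccomp.h>
#include <sys/prctl.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: install a filter that makes execve fail with EPERM
 * and allows every other system call.
 * Simplification (assumption): no check of seccomp_data.arch. */
static int install_no_execve_filter(void)
{
    struct sock_filter filter[] = {
        /* Load the system call number. */
        BPF_STMT(BPF_LD | BPF_W | BPF_ABS, offsetof(struct seccomp_data, nr)),
        /* execve: fall through to the ERRNO return; otherwise skip it. */
        BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, __NR_execve, 0, 1),
        BPF_STMT(BPF_RET | BPF_K,
                 SECCOMP_RET_ERRNO | (EPERM & SECCOMP_RET_DATA)),
        BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_ALLOW),
    };
    struct sock_fprog prog = {
        .len = (unsigned short)(sizeof(filter) / sizeof(filter[0])),
        .filter = filter,
    };

    /* Needed so that an unprivileged process may install a filter. */
    if (prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0) != 0)
        return -1;
    /* The filter is active as soon as this call returns. */
    return prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &prog);
}

/* Hypothetical helper: in a child process, install the filter and then
 * try to exec `path`. Returns the child's exit status, which is the errno
 * of the failed execve, or the exec'd program's own status if the exec
 * went through. Returns -1 if the setup itself failed. */
int execve_under_filter(const char *path)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        char *const argv[] = { (char *)path, NULL };
        char *const envp[] = { NULL };

        if (install_no_execve_filter() != 0)
            _exit(100);          /* could not install the filter */
        execve(path, argv, envp);
        _exit(errno);            /* reached only if execve failed */
    }

    int status;
    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status) == 100 ? -1 : WEXITSTATUS(status);
}
```

On such a system, a call like execve_under_filter("/bin/true") is
expected to report EPERM: the filter is already active when the sandboxer
issues its own _execve_, so the target program never starts. An "enable
on execve" mode would let the same filter be prepared beforehand and take
effect only once that _execve_ has succeeded.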