From: Aleksa Sarai
To: Jeff Layton, "J.
    Bruce Fields", Al Viro, Arnd Bergmann, Shuah Khan
Cc: David Howells, Andy Lutomirski, Christian Brauner, Eric Biederman,
    Tycho Andersen, linux-kernel@vger.kernel.org,
    linux-fsdevel@vger.kernel.org, linux-arch@vger.kernel.org,
    linux-kselftest@vger.kernel.org, dev@opencontainers.org,
    containers@lists.linux-foundation.org, Aleksa Sarai
Subject: [PATCH 0/3] namei: implement various scoping AT_* flags
Date: Sat, 29 Sep 2018 20:34:50 +1000
Message-Id: <20180929103453.12025-1-cyphar@cyphar.com>
X-Mailing-List: linux-kernel@vger.kernel.org

The need for some sort of control over VFS's path resolution (to avoid
malicious paths resulting in inadvertent breakouts) has been a very
long-standing desire of many userspace applications. This patchset is a
revival of Al Viro's old AT_NO_JUMPS[1] patchset with a few additions.

The most obvious change is that AT_NO_JUMPS has been split as discussed
in the original thread, along with a further split of AT_NO_PROCLINKS,
which means that each individual property of AT_NO_JUMPS is now a
separate flag:

  * Path-based escapes from the starting-point using "/" or ".." are
    blocked by AT_BENEATH.
  * Mountpoint crossings are blocked by AT_XDEV.
  * /proc/$pid/fd/$fd resolution is blocked by AT_NO_PROCLINKS (more
    correctly it actually blocks any user of nd_jump_link() because it
    allows out-of-VFS path resolution manipulation).

AT_NO_JUMPS is now effectively (AT_BENEATH|AT_XDEV|AT_NO_PROCLINKS). At
Linus' suggestion in the original thread, I've also implemented
AT_NO_SYMLINKS which just denies _all_ symlink resolution (including
"proclink" resolution).

An additional improvement was made to AT_XDEV. The original AT_NO_JUMPS
patch didn't consider "/tmp/.." as a mountpoint crossing -- this patch
blocks this as well (feel free to ask me to remove it if you feel this
is not sane).
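
To make the AT_BENEATH rule above concrete, here is a rough userspace
model of it. This is not code from the series: stays_beneath() is a
hypothetical helper that walks the path string purely lexically, so it
knows nothing about symlinks, mounts or concurrent renames, all of which
the in-kernel check has to handle. It also assumes that ".." is only
refused when it would climb above the starting point, which is one
reading of "escapes" in the description.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/*
 * Userspace model only (hypothetical helper, not part of the patches):
 * returns true if a purely lexical walk of `path` never leaves the
 * starting directory.
 */
static bool stays_beneath(const char *path)
{
	int depth = 0;		/* components below the starting point */
	const char *p = path;

	assert(path != NULL);
	if (*p == '/')		/* an absolute path jumps to the root */
		return false;

	while (*p != '\0') {
		size_t len = strcspn(p, "/");	/* length of this component */

		if (len == 2 && p[0] == '.' && p[1] == '.') {
			if (depth == 0)	/* ".." would climb above the start */
				return false;
			depth--;
		} else if (len > 0 && !(len == 1 && p[0] == '.')) {
			depth++;	/* ordinary component; "." is skipped */
		}
		p += len;
		while (*p == '/')	/* skip separators, including repeats */
			p++;
	}
	return true;
}
```

With this model "a/../b" is accepted, while "a/../../b", ".." and
"/etc/passwd" are refused.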

Currently I've only enabled these for openat(2) and the stat(2) family.
I would hope we could enable it for basically every *at(2) syscall --
but many of them appear to not have a @flags argument and thus we'll
need to add several new syscalls to do this. I'm more than happy to send
those patches, but I'd prefer to know that this preliminary work is
acceptable before doing a bunch of copy-paste to add new sets of *at(2)
syscalls.

One additional feature I've implemented is AT_THIS_ROOT (I imagine this
is probably going to be more contentious than the refresh of
AT_NO_JUMPS, so I've included it in a separate patch). The patch itself
describes my reasoning, but the shortened version of the premise is that
container runtimes need to have a way to resolve paths within a
potentially malicious rootfs. Container runtimes currently do this in
userspace[2], which has implicit race conditions that are not resolvable
in userspace (or use fork+exec+chroot and SCM_RIGHTS passing, which is
inefficient). AT_THIS_ROOT allows for per-call chroot-like semantics for
path resolution, which would be invaluable for us -- and the
implementation is basically identical to AT_BENEATH (except that we
don't return errors when someone actually hits the root).

I've added some selftests for this, but it's not clear to me whether
they should live here or in xfstests (as far as I can tell there are no
other VFS tests in selftests, while there are some tests that look like
generic VFS tests in xfstests). If you'd prefer them to be included in
xfstests, let me know.
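
The difference between the AT_THIS_ROOT and AT_BENEATH semantics
described above can be modelled in userspace along the following lines.
Again this is only a sketch under assumptions: scoped_resolve() is a
hypothetical name, the walk is purely lexical (no symlinks, no mounts),
and it does nothing about the races mentioned above, since the tree can
still change between computing the result and using it.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Userspace model only (hypothetical helper, not part of the patches):
 * lexically resolve `path` as if the starting directory were "/" and
 * write the result, which always begins with '/', to `out`.
 * Returns 0 on success, -1 if `out` is too small.
 */
static int scoped_resolve(const char *path, char *out, size_t outsz)
{
	size_t n = 0;		/* current length of the string in out */
	const char *p = path;

	assert(path != NULL && out != NULL);
	if (outsz < 2)		/* need room for at least "/" */
		return -1;
	out[0] = '\0';

	while (*p != '\0') {
		size_t len = strcspn(p, "/");	/* length of this component */

		if (len == 2 && p[0] == '.' && p[1] == '.') {
			/* drop the last component; a no-op at the root */
			while (n > 0 && out[--n] != '/')
				;
			out[n] = '\0';
		} else if (len > 0 && !(len == 1 && p[0] == '.')) {
			if (n + 1 + len + 1 > outsz)
				return -1;
			out[n++] = '/';
			memcpy(out + n, p, len);
			n += len;
			out[n] = '\0';
		}
		p += len;
		while (*p == '/')	/* a leading "/" just means the root */
			p++;
	}
	if (n == 0) {		/* nothing left: the result is the root */
		out[0] = '/';
		out[1] = '\0';
	}
	return 0;
}
```

Where the AT_BENEATH rule would refuse "../../etc/passwd", this model
clamps at the root and yields "/etc/passwd" relative to the starting
directory, and an absolute path is likewise interpreted relative to it.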

[1]: https://lore.kernel.org/patchwork/patch/784221/
[2]: https://github.com/cyphar/filepath-securejoin

Aleksa Sarai (3):
  namei: implement O_BENEATH-style AT_* flags
  namei: implement AT_THIS_ROOT chroot-like path resolution
  selftests: vfs: add AT_* path resolution tests

 fs/fcntl.c                                    |   2 +-
 fs/namei.c                                    | 158 ++++++++++++------
 fs/open.c                                     |  10 ++
 fs/stat.c                                     |  15 +-
 include/linux/fcntl.h                         |   3 +-
 include/linux/namei.h                         |   8 +
 include/uapi/asm-generic/fcntl.h              |  20 +++
 include/uapi/linux/fcntl.h                    |  10 ++
 tools/testing/selftests/Makefile              |   1 +
 tools/testing/selftests/vfs/.gitignore        |   1 +
 tools/testing/selftests/vfs/Makefile          |  13 ++
 tools/testing/selftests/vfs/at_flags.h        |  40 +++++
 tools/testing/selftests/vfs/common.sh         |  37 ++++
 .../selftests/vfs/tests/0001_at_beneath.sh    |  72 ++++++++
 .../selftests/vfs/tests/0002_at_xdev.sh       |  54 ++++++
 .../vfs/tests/0003_at_no_proclinks.sh         |  50 ++++++
 .../vfs/tests/0004_at_no_symlinks.sh          |  49 ++++++
 .../selftests/vfs/tests/0005_at_this_root.sh  |  66 ++++++++
 tools/testing/selftests/vfs/vfs_helper.c      | 154 +++++++++++++++++
 19 files changed, 707 insertions(+), 56 deletions(-)
 create mode 100644 tools/testing/selftests/vfs/.gitignore
 create mode 100644 tools/testing/selftests/vfs/Makefile
 create mode 100644 tools/testing/selftests/vfs/at_flags.h
 create mode 100644 tools/testing/selftests/vfs/common.sh
 create mode 100755 tools/testing/selftests/vfs/tests/0001_at_beneath.sh
 create mode 100755 tools/testing/selftests/vfs/tests/0002_at_xdev.sh
 create mode 100755 tools/testing/selftests/vfs/tests/0003_at_no_proclinks.sh
 create mode 100755 tools/testing/selftests/vfs/tests/0004_at_no_symlinks.sh
 create mode 100755 tools/testing/selftests/vfs/tests/0005_at_this_root.sh
 create mode 100644 tools/testing/selftests/vfs/vfs_helper.c

--
2.19.0