From: Aleksa Sarai
To: Al Viro, Eric Biederman
Cc: Aleksa Sarai, Andy Lutomirski, David Howells, Jann Horn,
    Christian Brauner, David Drysdale,
    containers@lists.linux-foundation.org, linux-fsdevel@vger.kernel.org,
    linux-api@vger.kernel.org, Jeff Layton, "J. Bruce Fields",
    Arnd Bergmann, Tycho Andersen, dev@opencontainers.org,
    linux-kernel@vger.kernel.org, linux-arch@vger.kernel.org
Subject: [PATCH v3 0/3] namei: implement various lookup restriction AT_* flags
Date: Tue, 9 Oct 2018 18:02:27 +1100
Message-Id: <20181009070230.12884-1-cyphar@cyphar.com>
X-Mailer: git-send-email 2.19.0
X-Mailing-List: linux-kernel@vger.kernel.org

The need for some sort of control over VFS's path resolution (to avoid
malicious paths resulting in inadvertent breakouts) has been a very
long-standing desire of many userspace applications. This patchset is a
revival of Al Viro's old AT_NO_JUMPS[1,2] patchset (which was a variant
of David Drysdale's O_BENEATH patchset[3] which was a spin-off of the
Capsicum project[4]) with a few additions and changes made based on the
previous discussion within [5] as well as others I felt were useful.

As per the discussion in the AT_NO_JUMPS thread, AT_NO_JUMPS has been
split into separate flags.

  * AT_XDEV blocks mountpoint crossings (both upwards and downwards).

      openat("/", "tmp", AT_XDEV); // blocked
      openat("/tmp", "..", AT_XDEV); // blocked
      openat("/tmp", "/", AT_XDEV); // blocked

  * AT_NO_PROCLINKS blocks all resolution through /proc/$pid/fd/$fd
    "symlinks". Specifically, this blocks all jumps caused by a
    filesystem using nd_jump_link() to shove you around in the
    filesystem tree (these are referred to as "proclinks" in lieu of a
    better name).

      openat(AT_FDCWD, "/proc/self/root", AT_NO_PROCLINKS); // blocked
      openat(AT_FDCWD, "/proc/self/fd/0", AT_NO_PROCLINKS); // blocked
      openat(AT_FDCWD, "/proc/self/ns/mnt", AT_NO_PROCLINKS); // blocked

  * AT_BENEATH disallows escapes from the starting dirfd using ".." or
    absolute paths (either in the path or during symlink resolution).
    Conceptually this flag ensures that you "stay below" the starting
    point in the filesystem tree. ".." resolution is allowed if it
    doesn't land you outside of the starting point (this is made safe
    against races by patch 3 in this series).

      openat("/root", "foo", AT_BENEATH); // *not* blocked
      openat("/root", "a/../b", AT_BENEATH); // *not* blocked
      openat("/root", "a/../../root/b", AT_BENEATH); // blocked
      openat("/root", "/root", AT_BENEATH); // blocked

    AT_BENEATH also currently disallows all "proclink" resolution
    because they can trivially throw you outside of the starting point.
    In a future patch we might allow such resolution (as long as it
    stays within the root).

      openat("/", "proc/self/exe", AT_BENEATH); // blocked

In addition, two more flags have been added to the series:

  * AT_NO_SYMLINKS disallows *all* symlink resolution, and thus implies
    AT_NO_PROCLINKS. Linus mentioned this is something that git would
    like to have in the original discussion[5].

      // assuming 'ln -s / /usr'
      openat("/", "/usr/bin", AT_NO_SYMLINKS); // blocked
      openat("/", "/proc/self/root", AT_NO_PROCLINKS); // blocked

  * AT_THIS_ROOT is a very similar idea to AT_BENEATH, but it serves a
    very different purpose. Rather than blocking resolutions if they
    would go outside of the starting point, it treats the starting
    point as a form of chroot(2). Container runtimes are one of the
    primary justifications for this flag, as they currently have to
    implement this sort of path handling racily in userspace[6].

    The restrictions on "proclink" resolution are the same as with
    AT_BENEATH (though in AT_THIS_ROOT's case it's not really clear how
    "proclink" jumps outside of the root should be handled), and patch
    3 in this series was also required to make ".." resolution safe.

Currently all of these flags are only enabled for openat(2) (and thus
have their own O_* flag names), but the corresponding AT_* flags have
been reserved so they can be added to syscalls where openat(O_PATH) is
not sufficient.

Patch changelog:
  v2:
    * Made ".." resolution with AT_THIS_ROOT and AT_BENEATH safe(r) with
      some semi-aggressive __d_path checking (see patch 3).
    * Disallowed "proclinks" with AT_THIS_ROOT and AT_BENEATH, in the
      hopes they can be re-enabled once safe.
    * Removed the selftests as they will be reimplemented as xfstests.
    * Removed stat(2) support, since you can already get it through
      O_PATH and fstatat(2).

[1]: https://lwn.net/Articles/721443/
[2]: https://lore.kernel.org/patchwork/patch/784221/
[3]: https://lwn.net/Articles/619151/
[4]: https://lwn.net/Articles/603929/
[5]: https://lwn.net/Articles/723057/
[6]: https://github.com/cyphar/filepath-securejoin

Cc: Al Viro
Cc: Eric Biederman
Cc: Andy Lutomirski
Cc: David Howells
Cc: Jann Horn
Cc: Christian Brauner
Cc: David Drysdale

Aleksa Sarai (3):
  namei: implement O_BENEATH-style AT_* flags
  namei: implement AT_THIS_ROOT chroot-like path resolution
  namei: aggressively check for nd->root escape on ".." resolution

 fs/fcntl.c                       |   2 +-
 fs/namei.c                       | 241 +++++++++++++++++++++++--------
 fs/open.c                        |  10 ++
 fs/stat.c                        |   4 +-
 include/linux/fcntl.h            |   3 +-
 include/linux/namei.h            |   8 +
 include/uapi/asm-generic/fcntl.h |  20 +++
 include/uapi/linux/fcntl.h       |  10 ++
 8 files changed, 230 insertions(+), 68 deletions(-)

-- 
2.19.0
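
Illustrative note on the AT_BENEATH examples in this cover letter: the
".." and absolute-path cases can be modelled in userspace with a purely
lexical walk. The sketch below is only that, a model. It is not the
code in fs/namei.c, beneath_allows() is a hypothetical name that does
not appear in the series, and it assumes there are no symlinks, mount
points, "proclinks" or concurrent renames, which are exactly the cases
the kernel patches have to handle.

```c
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical helper, NOT kernel code: a purely lexical model of the
 * AT_BENEATH rule. It walks the components of `path` (interpreted
 * relative to the starting directory) and tracks how far below that
 * directory the walk currently is. Symlinks, mounts, "proclinks" and
 * races are ignored by assumption.
 */
bool beneath_allows(const char *path)
{
	int depth = 0;		/* 0 means "at the starting directory" */
	const char *p = path;

	if (*p == '/')		/* absolute paths are always rejected */
		return false;

	while (*p) {
		size_t len = strcspn(p, "/");	/* length of this component */

		if (len == 2 && p[0] == '.' && p[1] == '.') {
			if (depth == 0)	/* ".." would step above the start */
				return false;
			depth--;
		} else if (!(len == 1 && p[0] == '.')) {
			depth++;	/* ordinary component: one level down */
		}

		p += len;
		while (*p == '/')	/* skip one or more separators */
			p++;
	}
	return true;
}
```

With this model, "foo" and "a/../b" are allowed while "a/../../root/b"
and "/root" are rejected, matching the four openat("/root", ...) cases
listed for AT_BENEATH. A check like this run on a path string before
calling openat(2) is still racy, which is why the series does the
checking during the kernel's own path walk.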