From: Aleksa Sarai
To: Al Viro, Jeff Layton, "J.
 Bruce Fields", Arnd Bergmann, David Howells
Cc: Aleksa Sarai, Eric Biederman, Christian Brauner, Kees Cook,
 Andy Lutomirski, Andrew Morton, Alexei Starovoitov, Jann Horn,
 Tycho Andersen, David Drysdale, Chanho Min, Oleg Nesterov,
 Aleksa Sarai, Linus Torvalds,
 containers@lists.linux-foundation.org, linux-fsdevel@vger.kernel.org,
 linux-api@vger.kernel.org, linux-kernel@vger.kernel.org,
 linux-arch@vger.kernel.org
Subject: [PATCH v6 3/6] namei: LOOKUP_IN_ROOT: chroot-like path resolution
Date: Tue, 7 May 2019 02:54:36 +1000
Message-Id: <20190506165439.9155-4-cyphar@cyphar.com>
In-Reply-To: <20190506165439.9155-1-cyphar@cyphar.com>
References: <20190506165439.9155-1-cyphar@cyphar.com>
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
Sender: linux-kernel-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

The primary motivation for this flag is container runtimes, which have
to interact with malicious root filesystems in the host namespaces. One
of the first requirements for a container runtime to be secure against a
malicious rootfs is that it correctly scopes symlinks (that is, they
should be resolved as though the runtime were chroot(2)ed into the
container's rootfs) and ".."-style paths[*]. The already-existing O_XDEV
and O_NOMAGICLINKS[**] help defend against other potential attacks in a
malicious rootfs scenario.

Currently most container runtimes try to do this resolution in
userspace[1], causing many potential race conditions. In addition, the
"obvious" alternative (actually performing a {ch,pivot_}root(2))
requires a fork+exec (for some runtimes) which is *very* costly if
necessary for every filesystem operation involving a container.

[*] At the moment, ".." and "magic link" jumping are disallowed for the
same reason they are disabled for LOOKUP_BENEATH -- currently it is not
safe to allow it.
Future patches may enable it unconditionally once we have resolved the
possible races (for "..") and semantics (for "magic link" jumping).

The most significant openat(2) semantic change with LOOKUP_IN_ROOT is
that absolute pathnames no longer cause dirfd to be ignored completely.
The rationale is that LOOKUP_IN_ROOT must necessarily chroot-scope
symlinks with absolute paths to dirfd, and so doing it for the base
path seems to be the most consistent behaviour (and also avoids
foot-gunning users who want to scope paths that are absolute).

[1]: https://github.com/cyphar/filepath-securejoin

Cc: Eric Biederman
Cc: Christian Brauner
Cc: Kees Cook
Signed-off-by: Aleksa Sarai
---
 fs/namei.c            | 6 +++---
 include/linux/namei.h | 1 +
 2 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/fs/namei.c b/fs/namei.c
index e13a02720a9d..3a3cba593b85 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -1095,7 +1095,7 @@ const char *get_link(struct nameidata *nd)
 			if (unlikely(nd->flags & LOOKUP_NO_MAGICLINKS))
 				return ERR_PTR(-ELOOP);
 			/* Not currently safe. */
-			if (unlikely(nd->flags & LOOKUP_BENEATH))
+			if (unlikely(nd->flags & (LOOKUP_BENEATH | LOOKUP_IN_ROOT)))
 				return ERR_PTR(-EXDEV);
 		}
 		if (IS_ERR_OR_NULL(res))
@@ -1744,7 +1744,7 @@ static inline int handle_dots(struct nameidata *nd, int type)
 		 * cause our parent to have moved outside of the root and us to skip
 		 * over it.
 		 */
-		if (unlikely(nd->flags & LOOKUP_BENEATH))
+		if (unlikely(nd->flags & (LOOKUP_BENEATH | LOOKUP_IN_ROOT)))
 			return -EXDEV;
 		if (!nd->root.mnt)
 			set_root(nd);
@@ -2295,7 +2295,7 @@ static const char *path_init(struct nameidata *nd, unsigned flags)
 
 	nd->m_seq = read_seqbegin(&mount_lock);
 
-	if (unlikely(nd->flags & LOOKUP_BENEATH)) {
+	if (unlikely(nd->flags & (LOOKUP_BENEATH | LOOKUP_IN_ROOT))) {
 		error = dirfd_path_init(nd);
 		if (unlikely(error))
 			return ERR_PTR(error);
diff --git a/include/linux/namei.h b/include/linux/namei.h
index 7bc819ad0cd3..4b1ee717cb14 100644
--- a/include/linux/namei.h
+++ b/include/linux/namei.h
@@ -56,6 +56,7 @@ enum {LAST_NORM, LAST_ROOT, LAST_DOT, LAST_DOTDOT, LAST_BIND};
 #define LOOKUP_NO_MAGICLINKS	0x040000 /* No /proc/$pid/fd/ "symlink" crossing. */
 #define LOOKUP_NO_SYMLINKS	0x080000 /* No symlink crossing *at all*.
 					    Implies LOOKUP_NO_MAGICLINKS. */
+#define LOOKUP_IN_ROOT		0x100000 /* Treat dirfd as %current->fs->root. */
 
 extern int path_pts(struct path *path);
 
-- 
2.21.0
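
As an illustration only (this is not part of the patch): the two scoping
rules in the commit message, absolute paths being interpreted relative to
dirfd and ".." being refused with -EXDEV, can be mimicked lexically in
userspace. The helper name scoped_join() below is hypothetical, and the
sketch deliberately ignores symlinks, which is the racy part that
motivates doing the resolution in the kernel.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

/*
 * Hypothetical userspace helper, not something this series adds.
 * Lexically joins an untrusted path onto root:
 *   - a leading '/' in path is treated as "start at root", not at "/";
 *   - any ".." component is refused with -EXDEV;
 *   - "." components and repeated slashes are dropped.
 * The filesystem is never consulted, so symlinks are NOT handled.
 */
int scoped_join(const char *root, const char *path, char *out, size_t outsz)
{
	const char *p = path;
	size_t len;

	assert(root && path && out);

	/* Copy root, trimming trailing slashes (but keep a lone "/"). */
	len = strlen(root);
	while (len > 1 && root[len - 1] == '/')
		len--;
	if (len + 1 > outsz)
		return -ENAMETOOLONG;
	memcpy(out, root, len);
	out[len] = '\0';

	for (;;) {
		size_t n;
		int need_sep;

		/* Skipping separators keeps an absolute path inside root. */
		while (*p == '/')
			p++;
		n = strcspn(p, "/");
		if (n == 0)
			break;			/* end of path */

		if (n == 2 && p[0] == '.' && p[1] == '.')
			return -EXDEV;		/* ".." is not allowed */

		if (!(n == 1 && p[0] == '.')) {
			need_sep = !(len > 0 && out[len - 1] == '/');
			/* Room for optional '/', the component, and NUL. */
			if (len + need_sep + n + 1 > outsz)
				return -ENAMETOOLONG;
			if (need_sep)
				out[len++] = '/';
			memcpy(out + len, p, n);
			len += n;
			out[len] = '\0';
		}
		p += n;
	}
	return 0;
}
```

With root "/var/lib/ctr/rootfs", the path "/etc/passwd" comes out as
"/var/lib/ctr/rootfs/etc/passwd", while "../etc/shadow" returns -EXDEV.
A real runtime would still have to deal with symlinks inside the rootfs,
which a lexical join cannot do without racing against the filesystem.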