From: Aleksa Sarai
To: Al Viro, Jeff Layton, "J.
 Bruce Fields", Arnd Bergmann, David Howells, Shuah Khan, Shuah Khan
Cc: Aleksa Sarai, Christian Brauner, Eric Biederman, Andy Lutomirski,
 Andrew Morton, Alexei Starovoitov, Kees Cook, Jann Horn, Tycho Andersen,
 David Drysdale, Chanho Min, Oleg Nesterov, Aleksa Sarai, Linus Torvalds,
 containers@lists.linux-foundation.org, linux-kselftest@vger.kernel.org,
 linux-fsdevel@vger.kernel.org, linux-api@vger.kernel.org,
 linux-kernel@vger.kernel.org, linux-arch@vger.kernel.org
Subject: [PATCH RFC v8 06/10] namei: LOOKUP_IN_ROOT: chroot-like path resolution
Date: Mon, 20 May 2019 23:33:01 +1000
Message-Id: <20190520133305.11925-7-cyphar@cyphar.com>
In-Reply-To: <20190520133305.11925-1-cyphar@cyphar.com>
References: <20190520133305.11925-1-cyphar@cyphar.com>
X-Mailing-List: linux-kernel@vger.kernel.org

The primary motivation for this flag is container runtimes, which have to
interact with malicious root filesystems in the host namespaces. One of
the first requirements for a container runtime to be secure against a
malicious rootfs is that it correctly scopes symlinks (that is, they
should be resolved as though the runtime were chroot(2)ed into the
container's rootfs) and ".."-style paths[*]. The already-existing
LOOKUP_XDEV and LOOKUP_NO_MAGICLINKS help defend against other potential
attacks in a malicious rootfs scenario.

Currently most container runtimes try to do this resolution in
userspace[1], which opens up many potential race conditions. In addition,
the "obvious" alternative (actually performing a {ch,pivot_}root(2))
requires a fork+exec (for some runtimes), which is *very* costly if it is
needed for every filesystem operation involving a container.

[*] At the moment, ".." and magic-link jumping are disallowed for the
    same reason they are disallowed for LOOKUP_BENEATH -- currently it is
    not safe to allow it.
    Future patches may enable it unconditionally once we have resolved
    the possible races (for "..") and semantics (for magic-link jumping).

The most significant *at(2) semantic change with LOOKUP_IN_ROOT is that
absolute pathnames no longer cause dirfd to be ignored completely. The
rationale is that LOOKUP_IN_ROOT must necessarily chroot-scope symlinks
with absolute paths to dirfd, and so doing it for the base path seems to
be the most consistent behaviour (and also avoids foot-gunning users who
want to scope paths that are absolute).

[1]: https://github.com/cyphar/filepath-securejoin

Co-developed-by: Christian Brauner
Signed-off-by: Aleksa Sarai
---
 fs/namei.c            | 6 +++---
 include/linux/namei.h | 1 +
 2 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/fs/namei.c b/fs/namei.c
index f997c82eb9c2..d18671a06bdb 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -1137,7 +1137,7 @@ const char *get_link(struct nameidata *nd, bool trailing)
 			if (unlikely(nd->flags & LOOKUP_NO_MAGICLINKS))
 				return ERR_PTR(-ELOOP);
 			/* Not currently safe. */
-			if (unlikely(nd->flags & LOOKUP_BENEATH))
+			if (unlikely(nd->flags & (LOOKUP_BENEATH | LOOKUP_IN_ROOT)))
 				return ERR_PTR(-EXDEV);
 			/*
 			 * For trailing_symlink we check whether the symlink's
@@ -1827,7 +1827,7 @@ static inline int handle_dots(struct nameidata *nd, int type)
 		 * cause our parent to have moved outside of the root and us to skip
 		 * over it.
 		 */
-		if (unlikely(nd->flags & LOOKUP_BENEATH))
+		if (unlikely(nd->flags & (LOOKUP_BENEATH | LOOKUP_IN_ROOT)))
 			return -EXDEV;
 		if (!nd->root.mnt)
 			set_root(nd);
@@ -2378,7 +2378,7 @@ static const char *path_init(struct nameidata *nd, unsigned flags)
 
 	nd->m_seq = read_seqbegin(&mount_lock);
 
-	if (unlikely(nd->flags & LOOKUP_BENEATH)) {
+	if (unlikely(nd->flags & (LOOKUP_BENEATH | LOOKUP_IN_ROOT))) {
 		error = dirfd_path_init(nd);
 		if (unlikely(error))
 			return ERR_PTR(error);
diff --git a/include/linux/namei.h b/include/linux/namei.h
index 7bc819ad0cd3..4b1ee717cb14 100644
--- a/include/linux/namei.h
+++ b/include/linux/namei.h
@@ -56,6 +56,7 @@ enum {LAST_NORM, LAST_ROOT, LAST_DOT, LAST_DOTDOT, LAST_BIND};
 #define LOOKUP_NO_MAGICLINKS	0x040000 /* No /proc/$pid/fd/ "symlink" crossing. */
 #define LOOKUP_NO_SYMLINKS	0x080000 /* No symlink crossing *at all*.
 					    Implies LOOKUP_NO_MAGICLINKS. */
+#define LOOKUP_IN_ROOT		0x100000 /* Treat dirfd as %current->fs->root. */
 
 extern int path_pts(struct path *path);
 
-- 
2.21.0
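
For readers who want a concrete picture of the base-path rule described in the commit message, here is a rough userspace model of it. This is only a sketch and not kernel code: in_root_model() is a hypothetical name that does not appear in the patch, and the model is purely lexical. It re-bases an absolute path onto dirfd by dropping the leading slashes, and it refuses any ".." component with -EXDEV, which mirrors the handle_dots() hunk. It does not look at symlinks at all, so it cannot replace the in-kernel resolution that this series is about.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

/*
 * Hypothetical userspace model (assumption, not part of the patch).
 * On success returns 0 and stores in *rel the path that would be
 * resolved relative to dirfd. Returns -EXDEV if any component is "..".
 * Symlinks are never examined.
 */
static int in_root_model(const char *path, const char **rel)
{
	const char *p;

	assert(path != NULL && rel != NULL);

	/* An absolute path is re-based onto dirfd: skip leading slashes. */
	path += strspn(path, "/");

	/* Walk the components; ".." is refused, as in handle_dots(). */
	for (p = path; *p != '\0'; ) {
		size_t len = strcspn(p, "/");

		if (len == 2 && p[0] == '.' && p[1] == '.')
			return -EXDEV;
		p += len;
		p += strspn(p, "/");
	}

	/* A path made only of slashes names the root, i.e. dirfd itself. */
	*rel = (*path != '\0') ? path : ".";
	return 0;
}
```

Under this model "/etc/passwd" and "etc/passwd" both name the same file below dirfd, while "usr/../etc" is rejected outright instead of being normalised. The resulting relative path could then be handed to openat(2) together with dirfd, with the caveat above that any symlink inside the tree can still escape in that case.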