Subject: Re: [PATCH v5 1/1] fs: Allow no_new_privs tasks to call chroot(2)
To: Casey Schaufler, Al Viro, James Morris, Serge Hallyn, Andrew Morton
Cc: Andy Lutomirski, Christian Brauner, Christoph Hellwig, David Howells,
 Dominik Brodowski, "Eric W. Biederman", Jann Horn, John Johansen,
 Kees Cook, Kentaro Takeda, Tetsuo Handa,
 kernel-hardening@lists.openwall.com, linux-fsdevel@vger.kernel.org,
 linux-kernel@vger.kernel.org, linux-security-module@vger.kernel.org,
 Mickaël Salaün
References: <20210316203633.424794-1-mic@digikod.net>
 <20210316203633.424794-2-mic@digikod.net>
 <85ebb3a1-bd5e-9f12-6d02-c08d2c0acff5@schaufler-ca.com>
 <77ec5d18-f88e-5c7c-7450-744f69654f69@schaufler-ca.com>
From: Mickaël Salaün
Date: Tue, 30 Mar 2021 21:28:25 +0200
In-Reply-To: <77ec5d18-f88e-5c7c-7450-744f69654f69@schaufler-ca.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On 30/03/2021 20:40, Casey Schaufler wrote:
> On 3/30/2021 11:11 AM, Mickaël Salaün wrote:
>> On 30/03/2021 19:19, Casey Schaufler wrote:
>>> On 3/30/2021 10:01 AM, Mickaël Salaün wrote:
>>>> Hi,
>>>>
>>>> Are there new comments on this patch? Could we move forward?
>>> I don't see that new comments are necessary when I don't see
>>> that you've provided compelling counters to some of the old ones.
>> Which ones? I don't buy your argument about the beauty of CAP_SYS_CHROOT.
>
> CAP_SYS_CHROOT, namespaces. Bind mounts. The restrictions on
> "unprivileged" chroot being sufficiently onerous to make it
> unlikely to be usable.

There are multiple use cases for these features.

>
>>> It's possible to use minimal privilege with CAP_SYS_CHROOT.
>> CAP_SYS_CHROOT can lead to privilege escalation.
>
> Not when used in conjunction with the same set of
> restrictions you're requiring for "unprivileged" chroot.

I'm talking about security with the principle of least privilege: when
we consider that a process may be(come) malicious but should still be
able to drop (more) access rights, e.g.
with prctl(PR_SET_NO_NEW_PRIVS) *then* chroot().

>
>>> It looks like namespaces provide alternatives for all your
>>> use cases.
>> I explained in the commit message why it is not the case. In a nutshell,
>> namespaces bring complexity which may not be required.
>
> So? I can use a Swiss Army Knife to cut a string even though it
> has a corkscrew.

Complexity leads to (security) issues. In secure systems, we want to
reduce the attack surface. There are some pointers here:
https://lwn.net/Articles/673597/

>
>> When designing a
>> secure system, we want to avoid giving access to such complexity to
>> untrusted processes (i.e. more complexity leads to more bugs).
>
> If you're *really* designing a secure system you can design it to
> use existing mechanisms, like CAP_SYS_CHROOT!

Not always. For instance, in the case of a web browser, we don't want to
give CAP_SYS_CHROOT to every user just because their browser could
(legitimately) use it as a security sandbox mechanism. The same
principle can be applied to a lot of use cases, e.g. network services,
file parsers, etc.

>
>> An
>> unprivileged chroot would provide just the minimum feature needed to
>> drop some access. Of course it is not enough on its own, but it can be
>> combined with existing (and future) security features.
>
> Like NO_NEW_PRIVS, namespaces and capabilities!
> You don't need anything new!

If a process is compromised before it chroots itself and drops
CAP_SYS_CHROOT, the attacker holds that capability, which is a bigger
security issue than if the process never had CAP_SYS_CHROOT.

>
>>> The constraints required to make this work are quite
>>> limiting. Where is the real value add?
>> As explained in the commit message, it is useful when hardening
>> applications (e.g. network services, browsers, parsers, etc.). We don't
>> want an untrusted (or compromised) application to have CAP_SYS_CHROOT
>> nor (complex) namespace access.
>
> If you can ensure that an unprivileged application is
> always run with NO_NEW_PRIVS you could also ensure that
> it runs with only CAP_SYS_CHROOT or in an appropriate
> namespace. I believe that it would be easier for your
> particular use case. I don't believe that is sufficient.

You can't always have this assertion, e.g. because a user may need to
run (legitimate) SETUID binaries… For everyone following a
defense-in-depth approach (i.e. multiple layers of security), an
unprivileged chroot is valuable.

>
>>>> Regards,
>>>> Mickaël
>>>>
>>>>
>>>> On 16/03/2021 21:36, Mickaël Salaün wrote:
>>>>> From: Mickaël Salaün
>>>>>
>>>>> Being able to easily change root directories eases some
>>>>> development workflows and can be used as a tool to strengthen
>>>>> unprivileged security sandboxes. chroot(2) is not an access-control
>>>>> mechanism per se, but it can be used to limit the absolute view of the
>>>>> filesystem, and then limit ways to access data and kernel interfaces
>>>>> (e.g. /proc, /sys, /dev, etc.).
>>>>>
>>>>> Users may not wish to expose namespace complexity to potentially
>>>>> malicious processes, or may limit their use because of limited resources.
>>>>> The chroot feature is much simpler (and more limited) than the mount
>>>>> namespace, but can still be useful. As for containers, users of
>>>>> chroot(2) should take care of file descriptors or data accessible by
>>>>> other means (e.g. current working directory, leaked FDs, passed FDs,
>>>>> devices, mount points, etc.). There is a lot of literature that discusses
>>>>> the limitations of chroot, and users of this feature should be aware of
>>>>> the multiple ways to bypass it. Using chroot(2) for security purposes
>>>>> can make sense if it is combined with other features (e.g. dedicated
>>>>> user, seccomp, LSM access-controls, etc.).
>>>>>
>>>>> One could argue that chroot(2) is useless without a properly populated
>>>>> root hierarchy (i.e. without /dev and /proc).
>>>>> However, there are multiple use cases that don't require the
>>>>> chrooting process to create file hierarchies with special files or
>>>>> mount points, e.g.:
>>>>> * A process sandboxing itself, once all its libraries are loaded, may
>>>>>   not need files other than regular files, or even no file at all.
>>>>> * Some pre-populated root hierarchies could be used to chroot into,
>>>>>   provided for instance by development environments or tailored
>>>>>   distributions.
>>>>> * Processes executed in a chroot may not require access to these special
>>>>>   files (e.g. with minimal runtimes, or by emulating some special files
>>>>>   with a LD_PRELOADed library or seccomp).
>>>>>
>>>>> Allowing a task to change its own root directory is not a threat to the
>>>>> system if we can prevent confused deputy attacks, which could be
>>>>> performed through execution of SUID-like binaries. This can be
>>>>> prevented if the calling task sets PR_SET_NO_NEW_PRIVS on itself with
>>>>> prctl(2). To only affect this task, its filesystem information must not
>>>>> be shared with other tasks, which can be achieved by not passing
>>>>> CLONE_FS to clone(2). A similar no_new_privs check is already used by
>>>>> seccomp to avoid the same kind of security issues. Furthermore, because
>>>>> of its security use and to avoid giving a new way for attackers to get
>>>>> out of a chroot (e.g. using /proc/<pid>/root, or chroot/chdir), an
>>>>> unprivileged chroot is only allowed if the calling process is not
>>>>> already chrooted. This limitation is the same as for creating user
>>>>> namespaces.
>>>>>
>>>>> This change may not impact systems relying on other permission models
>>>>> than POSIX capabilities (e.g. Tomoyo). Being able to use chroot(2) on
>>>>> such systems may require updating their security policies.
>>>>>
>>>>> Only the chroot system call is relaxed with this no_new_privs check; the
>>>>> init_chroot() helper doesn't require such a change.
>>>>>
>>>>> Allowing unprivileged users to use chroot(2) is one of the initial
>>>>> objectives of no_new_privs:
>>>>> https://www.kernel.org/doc/html/latest/userspace-api/no_new_privs.html
>>>>> This patch is a follow-up of a previous one sent by Andy Lutomirski:
>>>>> https://lore.kernel.org/lkml/0e2f0f54e19bff53a3739ecfddb4ffa9a6dbde4d.1327858005.git.luto@amacapital.net/
>>>>>
>>>>> Cc: Al Viro
>>>>> Cc: Andy Lutomirski
>>>>> Cc: Christian Brauner
>>>>> Cc: Christoph Hellwig
>>>>> Cc: David Howells
>>>>> Cc: Dominik Brodowski
>>>>> Cc: Eric W. Biederman
>>>>> Cc: James Morris
>>>>> Cc: Jann Horn
>>>>> Cc: John Johansen
>>>>> Cc: Kentaro Takeda
>>>>> Cc: Serge Hallyn
>>>>> Cc: Tetsuo Handa
>>>>> Signed-off-by: Mickaël Salaün
>>>>> Reviewed-by: Kees Cook
>>>>> Link: https://lore.kernel.org/r/20210316203633.424794-2-mic@digikod.net
>>>>> ---
>>>>>
>>>>> Changes since v4:
>>>>> * Use READ_ONCE(current->fs->users) (found by Jann Horn).
>>>>> * Remove ambiguous example in commit description.
>>>>> * Add Reviewed-by Kees Cook.
>>>>>
>>>>> Changes since v3:
>>>>> * Move the new permission checks to a dedicated helper
>>>>>   current_chroot_allowed() to make the code easier to read and align
>>>>>   with user_path_at(), path_permission() and security_path_chroot()
>>>>>   calls (suggested by Kees Cook).
>>>>> * Remove now useless included file.
>>>>> * Extend commit description.
>>>>> * Rebase on v5.12-rc3.
>>>>>
>>>>> Changes since v2:
>>>>> * Replace path_is_under() check with current_chrooted() to gain the same
>>>>>   protection as create_user_ns() (suggested by Jann Horn). See commit
>>>>>   3151527ee007 ("userns: Don't allow creation if the user is chrooted")
>>>>>
>>>>> Changes since v1:
>>>>> * Replace custom is_path_beneath() with existing path_is_under().
>>>>> ---
>>>>>  fs/open.c | 23 +++++++++++++++++++++--
>>>>>  1 file changed, 21 insertions(+), 2 deletions(-)
>>>>>
>>>>> diff --git a/fs/open.c b/fs/open.c
>>>>> index e53af13b5835..480010a551b2 100644
>>>>> --- a/fs/open.c
>>>>> +++ b/fs/open.c
>>>>> @@ -532,6 +532,24 @@ SYSCALL_DEFINE1(fchdir, unsigned int, fd)
>>>>>  	return error;
>>>>>  }
>>>>>
>>>>> +static inline int current_chroot_allowed(void)
>>>>> +{
>>>>> +	/*
>>>>> +	 * Changing the root directory for the calling task (and its future
>>>>> +	 * children) requires that this task has CAP_SYS_CHROOT in its
>>>>> +	 * namespace, or be running with no_new_privs and not sharing its
>>>>> +	 * fs_struct and not escaping its current root (cf. create_user_ns()).
>>>>> +	 * As for seccomp, checking no_new_privs avoids scenarios where
>>>>> +	 * unprivileged tasks can affect the behavior of privileged children.
>>>>> +	 */
>>>>> +	if (task_no_new_privs(current) && READ_ONCE(current->fs->users) == 1 &&
>>>>> +			!current_chrooted())
>>>>> +		return 0;
>>>>> +	if (ns_capable(current_user_ns(), CAP_SYS_CHROOT))
>>>>> +		return 0;
>>>>> +	return -EPERM;
>>>>> +}
>>>>> +
>>>>>  SYSCALL_DEFINE1(chroot, const char __user *, filename)
>>>>>  {
>>>>>  	struct path path;
>>>>> @@ -546,9 +564,10 @@ SYSCALL_DEFINE1(chroot, const char __user *, filename)
>>>>>  	if (error)
>>>>>  		goto dput_and_out;
>>>>>
>>>>> -	error = -EPERM;
>>>>> -	if (!ns_capable(current_user_ns(), CAP_SYS_CHROOT))
>>>>> +	error = current_chroot_allowed();
>>>>> +	if (error)
>>>>>  		goto dput_and_out;
>>>>> +
>>>>>  	error = security_path_chroot(&path);
>>>>>  	if (error)
>>>>>  		goto dput_and_out;
>>>>>
>
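
To make the discussion above concrete, here is a minimal userspace sketch.
It assumes a kernel carrying the quoted patch; on other kernels the
chroot(2) call is expected to fail unless the task has CAP_SYS_CHROOT.
chroot_allowed_model() only mirrors the decision taken by the quoted
current_chroot_allowed(); its name and parameters are illustrative and
are not part of the patch. enter_unprivileged_chroot() is likewise a
hypothetical helper showing the call order the thread describes: set
no_new_privs with prctl(2), call chroot(2), then chdir(2) into the new
root.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <stdbool.h>
#include <sys/prctl.h>
#include <unistd.h>

/*
 * Userspace model of the check in the quoted current_chroot_allowed().
 * Illustrative only: in the kernel the inputs come from
 * task_no_new_privs(), current->fs->users, current_chrooted() and
 * ns_capable().
 */
int chroot_allowed_model(bool no_new_privs, int fs_users,
			 bool already_chrooted, bool has_cap_sys_chroot)
{
	/* Unprivileged path: all three conditions must hold. */
	if (no_new_privs && fs_users == 1 && !already_chrooted)
		return 0;
	/* Privileged path: unchanged from the pre-patch behavior. */
	if (has_cap_sys_chroot)
		return 0;
	return -EPERM;
}

/*
 * Hypothetical helper: give up the ability to gain privileges, then
 * change the root directory. Returns 0 on success or a negative errno.
 */
int enter_unprivileged_chroot(const char *new_root)
{
	/* Irreversible for this task and inherited by its children. */
	if (prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0) != 0)
		return -errno;
	if (chroot(new_root) != 0)
		return -errno;
	/* chroot(2) does not change the current working directory. */
	if (chdir("/") != 0)
		return -errno;
	return 0;
}
```

A caller would typically invoke enter_unprivileged_chroot() once all
required libraries and files are already open, and combine it with other
mechanisms (e.g. seccomp or an LSM), since chroot(2) alone is not an
access-control mechanism.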