From: ebiederm@xmission.com (Eric W. Biederman)
To: Andy Lutomirski <luto@amacapital.net>
Cc: "Serge E. Hallyn" <serge@hallyn.com>,
        Seth Forshee <seth.forshee@canonical.com>,
        Alexander Viro <viro@zeniv.linux.org.uk>,
        Serge Hallyn <serge.hallyn@canonical.com>,
        James Morris <james.l.morris@oracle.com>,
        Linux FS Devel <linux-fsdevel@vger.kernel.org>,
        LSM List <linux-security-module@vger.kernel.org>,
        SELinux-NSA <selinux@tycho.nsa.gov>,
        "linux-kernel\@vger.kernel.org" <linux-kernel@vger.kernel.org>
References: <1436989569-69582-1-git-send-email-seth.forshee@canonical.com>
	<1436989569-69582-4-git-send-email-seth.forshee@canonical.com>
	<20150715214848.GA24204@mail.hallyn.com>
	<CALCETrVTGC8NJdpOhkqDaE8HdD95y-K2-VnsL+Oo4NjVUyovsA@mail.gmail.com>
	<87wpy1camr.fsf@x220.int.ebiederm.org>
	<CALCETrULyv9pfwAeyNrvJ06o+bW6JbWgBOsi4b4OUhbPjD4FnQ@mail.gmail.com>
	<87io9kzq5g.fsf@x220.int.ebiederm.org>
	<CALCETrWrFQJev_yn00BgMYbRhGGGa6nDBqmodbZxBrDO9xVSNw@mail.gmail.com>
Date: Thu, 16 Jul 2015 00:04:43 -0500
In-Reply-To: <CALCETrWrFQJev_yn00BgMYbRhGGGa6nDBqmodbZxBrDO9xVSNw@mail.gmail.com>
	(Andy Lutomirski's message of "Wed, 15 Jul 2015 21:49:16 -0700")
Message-ID: <87wpy0u1zo.fsf@x220.int.ebiederm.org>
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/24.3 (gnu/linux)
MIME-Version: 1.0
Content-Type: text/plain
Subject: Re: [PATCH 3/7] fs: Ignore file caps in mounts from other user namespaces
Sender: linux-kernel-owner@vger.kernel.org
Content-Length: 4833
Lines: 123

Andy Lutomirski <luto@amacapital.net> writes:

> On Wed, Jul 15, 2015 at 9:23 PM, Eric W. Biederman
> <ebiederm@xmission.com> wrote:
>>
>> Ok.  Andy I have stopped and really looked at your patch that is 4/7 in
>> this series.  Something I had not done before since it sounded totally
>> wrong.
>>
>> That combined with your earlier comments I think I can say something
>> meaningful.
>>
>> Andy as I read your patch the thread you are primarily worried about is
>> chdir(/some/directory/in/another/mnt/ns).  I think enhancing nosuid to
>> deal with that case is reasonable, and is unlikely to break userspace.
>> It is one of those hairy security things so we need to be careful not to
>> introduce a regression.
>>
>
> Indeed.  It's plausible this could regress something, but it would be
> really weird.
>
>> I think a top down enhancement of nosuid to just block funny cases that
>> no one cares about is completely sensible.    Removing goofy corner
>> that no one cares about and that are only good for security exploits
>> seems reasonable.
>>
>
> Agreed.
>
>> I am a little concerned that smack does not seem to respect nosuid
>> on filesystems.  But that is an issue with nosuid not with your enhanced
>> nosuid.
>>
>>
>>
>>
>> Now this patch 3/7 really should be entitled:
>> "Limit file caps to the userns of the super block".
>>
>> It really really is doing something different.   This change is about a
>> bottom up understanding of what file caps means on a filesystem mounted
>> by a user namespace root.
>>
>> That is file caps should only apply to the user namespace root of the
>> root user who mounted the filesystem, because that is all the privileges
>> the mounter of the filesystem had.
>>
>> This guarantees that even if the filesystem somehow propagates with
>> mount propagation that there will be no issues.  I think I know how to
>> make that happen...
>>
>>
>>
>>
>> But deeply and fundamentally limiting a filesystem to only the
>> privilieges of it's user namespace root, and enhancing nosuid
>> protections are rather different things.
>>
>
> So here's the semantic question:
>
> Suppose an unprivileged user (uid 1000) creates a user namespace and a
> mount namespace.  They stick a file (owned by uid 1000 as seen by
> init_user_ns) in there and mark it setuid root and give it fcaps.

To make this make sense I have to ask, is this file on a filesystem
where uid 1000 as seen by the init_user_ns stored as uid 1000 on
the filesystem?  Or is this uid 0 as seen by the filesystem?

I assume this is uid 0 on the filesystem in question or else your
unprivileged user would not have sufficient privileges over the
filesystem to setup fcaps.

> Then global root gets an fd to this filesystem.  If they execve the
> file directly, then, with my patch 4, it won't act as setuid 1000 and
> the fcaps will be ignored.  Even with my patch 4, though, if they bind
> mount the fs and execve the file from their bind mount, it will act as
> setuid 1000.  Maybe this is odd.  However, with Seth's patch 3, the
> fcaps will (correctly) not be honored.

With patch 3 you can also think of it as fcaps being honored and you
get all the caps in the appropriate user namespace, but since you are
not in that user namespace and so don't have a place to store them
in struct cred you don't get the file caps.

>From the philosophy of interpreting the file as defined by the
filesystem in principle we could extend struct cred so you actually
get the creds just in uid 1000s user namespace, but that is very
unlikely to be worth it.

> I tend to thing that, if we're not honoring the fcaps, we shouldn't be
> honoring the setuid bit either.  After all, it's really not a trusted
> file, even though the only user who could have messed with it really
> is the apparent owner.

For the file caps we can't honor them because you don't have the bits
in struct cred.

For setuid we can honor it, and setuid is something that the user
namespace allows.

> And, if we're going to say we don't trust the file and shouldn't honor
> setuid or fcaps, then merging all the functionality into mnt_may_suid
> could make sense.  Yes, these two things do different things, but they
> could hook in to the same place.

There are really two separate questions:
- Do we trust this filesystem?
- Do you have the bits to implement this concept?

Even if in this specific context the two questions wind up looking
exactly the same. I think it makes a lot of sense to ask the two
questions separately.  As future maintenance changes may cause the
implementation of the questions to diverge.

Eric

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/