Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1755941AbbKCX36 (ORCPT ); Tue, 3 Nov 2015 18:29:58 -0500 Received: from mail-ig0-f173.google.com ([209.85.213.173]:37451 "EHLO mail-ig0-f173.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751258AbbKCX34 (ORCPT ); Tue, 3 Nov 2015 18:29:56 -0500 MIME-Version: 1.0 In-Reply-To: <20151104002132.010ccd1d@rsjtdrjgfuzkfg.com> References: <1446511187-9131-1-git-send-email-public@rsjtdrjgfuzkfg.com> <20151104002132.010ccd1d@rsjtdrjgfuzkfg.com> Date: Tue, 3 Nov 2015 15:29:55 -0800 X-Google-Sender-Auth: _HeKUzOnU7ZMXXmj3WmApvj6wo4 Message-ID: Subject: Re: [RFC] namei: prevent sgid-hardlinks for unmapped gids From: Kees Cook To: Dirk Steinmetz , Michael Kerrisk-manpages Cc: Serge Hallyn , Seth Forshee , Alexander Viro , Linux FS Devel , LKML , "Eric W . Biederman" , Serge Hallyn , "security@kernel.org" Content-Type: text/plain; charset=UTF-8 Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 11626 Lines: 310 On Tue, Nov 3, 2015 at 3:21 PM, Dirk Steinmetz wrote: > On Tue, 3 Nov 2015 10:20:38 -0800, Kees Cook wrote: >> On Mon, Nov 2, 2015 at 4:39 PM, Dirk Steinmetz >> wrote: >> > In order to hardlink to a sgid-executable, it is sufficient to be the >> > file's owner. When hardlinking within an unprivileged user namespace, the >> > users of that namespace could thus use hardlinks to pin setgid binaries >> > owned by themselves (or any mapped uid, with CAP_FOWNER) and a gid outside >> > of the namespace. This is a possible security risk. >> >> How would such a file appear within the namespace? Wouldn't the gid >> have to map to something inside the namespace? >> >> > >> > This change prevents hardlinking of sgid-executables within user >> > namespaces, if the file is not owned by a mapped gid. >> >> For clarity, this should say "... is not group-owned by a mapped git." correct? >> >> > Signed-off-by: Dirk Steinmetz >> > --- >> > >> > MISSING: Documentation/sysctl/fs.txt not updated, as this patch is >> > intended for discussion. >> > >> > If there are no further misunderstandings on my side, this patch is what >> > Serge and I agree on (modulo my not-that-much-linux-kernel-experience >> > codestyle, feel free to suggest improvements!). >> > >> > The new condition for sgid-executables is equivalent to >> >> inode_owner_or_capable(inode) && kgid_has_mapping(ns, inode->i_gid) >> > which, as recommended by Serge, does not change the behaviour for the init >> > namespace. It fixes the problem of pinning parent namespace's gids. >> > >> > However, I think the "same" security issue is also valid within any >> > namespace, for regular users pinning other gids within the same namespace. >> > I already presented an example for that in a previous mail: >> > - A file has the setgid and user/group executable bits set, and is owned >> > by user:group. >> > - The user 'user' is not in the group 'group', and does not have any >> > capabilities. >> > - The user 'user' hardlinks the file. The permission check will succeed, >> > as the user is the owner of the file. >> > - The file is replaced with a newer version (for example fixing a security >> > issue) >> > - Now user can still use the hardlink-pinned version to execute the file >> > as 'user:group' (and for example exploit the security issue). >> >> I believe this to be an unneeded check is the stated configuration >> (setgid but without group ownership) is itself a security flaw. This >> allows the user already to gain those group privileges even without >> needing to pin the file or do anything: >> >> setgid executable that reports euid and egid: >> >> $ cat poof.c >> #include >> >> int main(void) >> { >> printf("%d:%d\n", geteuid(), getegid()); >> return 0; >> } >> $ make poof >> cc poof.c -o poof >> $ sudo chgrp root poof && sudo chmod g+s poof >> $ ls -la poof >> -rwxr-s--- 1 keescook root 8658 Nov 3 10:14 poof >> $ ./poof >> 149786:0 >> >> I am not a member of the 0 group: >> >> $ id >> uid=149786(keescook) gid=5000(eng) >> groups=5000(eng),4(adm),20(dialout),21(fax),24(cdrom),25(floppy),26(tape),27(sudo),30(dip),44(video),46(plugdev),106(fuse),110(lpadmin),124(sambashare),129(pkcs11),133(libvirtd),999(logindev) >> >> Now I mmap the file, and rewrite it (here I change the format string >> from a : separator to a -, but we just just as easily injected code): >> >> $ cat mod.c >> #include >> #include >> #include >> #include >> #include >> #include >> >> int main(void) >> { >> int fd; >> struct stat info; >> unsigned char *ptr; >> off_t i; >> >> fd = open("poof", O_RDWR); >> fstat(fd, &info); >> ptr = mmap(NULL, info.st_size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0); >> close(fd); >> >> for (i = 0; i < info.st_size; i++) { >> if (0 == strncmp(ptr + i, "%d:%d", 5)) { >> ptr[i + 2] = '-'; >> } >> } >> munmap(ptr, info.st_size); >> >> system("./poof"); >> >> return 0; >> } >> $ make mod >> cc mod.c -o mod >> $ ./mod >> 149786-0 >> $ ls -la poof >> -rwxr-s--- 1 keescook root 8658 Nov 3 10:17 poof >> >> So, I don't think this patch actually makes anything safer, though >> there might be a namespace mapping element I've not understood. > > Thank you for that beautiful demonstration! You're welcome. It was a fun diversion. Weirdly, it's only possible via mmap. Using "write" does kill the set-gid bit. I haven't looked at why. Al or anyone else, is there a meaningful distinction here? Should the mmap MAP_SHARED-write trigger the loss of the set-gid bit too? While holding the file open with either open or mmap, I get a Text-in-use error, so I would kind of expect the same behavior between either close() and munmap(). I wonder if this is a bug, and if so, then your link patch is indeed useful again. :) > I agree. I was unaware that sgid-executables on foreign groups were already > considered as insecure, as the documentation states > "As a security measure, depending on the filesystem, the set-user-ID > and set-group-ID execution bits may be turned off if a file is written. > (On Linux this occurs if the writing process does not have the > CAP_FSETID capability.)" [1] -- I blindly assumed that this would be the > case for most filesystems and incorrectly derived that sgid with foreign > groups are a supported scenario. As you demonstrated, they are not, and > thus this patch and the sgid-issue discussed are irrelevant, as the > scenario is unsafe by design and not by error. > > The only question remaining is whether the documentation should be > re-worded to prevent other people from doing the same mistake and assuming > the scenario would be supported. > > However, I'm unsure how to word it properly; my best guess would be > "The set-user-ID and set-group-ID execution bits may be turned off if a > file is written. (On Linux, this is guaranteed to only happen if the > writing process does not have the CAP_FSETID capability. You should not > rely on this behavior as a security measure: any user having write access > to or owning the file should be considered as able to impersonate the > file's owner/group-owner)" That wording seems good to me. Adding Michael for his thoughts. -Kees > > - Dirk > > > Reference: > [1] > >> >> -Kees >> >> > >> > To prevent that, the condition would need to be changed to something like >> > inode_group_or_capable, resembling inode_owner_or_capable, but checking >> > that the caller is in the group the inode belongs to or has some >> > capability (for consistency with former behaviour, CAP_FOWNER? for >> > consistency with the documentation, CAP_FSETID?). However, this would >> > change userland behaviour outside of userns. Thus my main question: >> > Is the scenario above bad enough to change userland behaviour? >> > >> > I'd apprechiate your comments. >> > >> > - Dirk >> > >> > >> > Diffstat: >> > fs/namei.c | 47 ++++++++++++++++++++++++++++++++++++++--------- >> > 1 file changed, 38 insertions(+), 9 deletions(-) >> > >> > diff --git a/fs/namei.c b/fs/namei.c >> > index 29fc6a6..9c6c2e2 100644 >> > --- a/fs/namei.c >> > +++ b/fs/namei.c >> > @@ -913,18 +913,19 @@ static inline int may_follow_link(struct nameidata *nd) >> > } >> > >> > /** >> > - * safe_hardlink_source - Check for safe hardlink conditions >> > + * safe_hardlink_source_uid - Check for safe hardlink conditions not dependent >> > + * on the inode's group. These conditions may be overridden by inode ownership >> > + * or CAP_FOWNER with respect to the inode's uid >> > * @inode: the source inode to hardlink from >> > * >> > * Return false if at least one of the following conditions: >> > * - inode is not a regular file >> > * - inode is setuid >> > - * - inode is setgid and group-exec >> > * - access failure for read and write >> > * >> > * Otherwise returns true. >> > */ >> > -static bool safe_hardlink_source(struct inode *inode) >> > +static bool safe_hardlink_source_uid(struct inode *inode) >> > { >> > umode_t mode = inode->i_mode; >> > >> > @@ -936,10 +937,6 @@ static bool safe_hardlink_source(struct inode *inode) >> > if (mode & S_ISUID) >> > return false; >> > >> > - /* Executable setgid files should not get pinned to the filesystem. */ >> > - if ((mode & (S_ISGID | S_IXGRP)) == (S_ISGID | S_IXGRP)) >> > - return false; >> > - >> > /* Hardlinking to unreadable or unwritable sources is dangerous. */ >> > if (inode_permission(inode, MAY_READ | MAY_WRITE)) >> > return false; >> > @@ -948,30 +945,62 @@ static bool safe_hardlink_source(struct inode *inode) >> > } >> > >> > /** >> > + * safe_hardlink_source_gid - Check for safe hardlink conditions dependent >> > + * on the inode's group. These conditions may be overridden by inode ownership >> > + * or CAP_FOWNER with respect to the inode's gid >> > + * @inode: the source inode to hardlink from >> > + * >> > + * Return false if inode is setgid and group-exec >> > + * >> > + * Otherwise returns true. >> > + */ >> > +static bool safe_hardlink_source_gid(struct inode *inode) >> > +{ >> > + umode_t mode = inode->i_mode; >> > + >> > + /* Executable setgid files should not get pinned to the filesystem. */ >> > + if ((mode & (S_ISGID | S_IXGRP)) == (S_ISGID | S_IXGRP)) >> > + return false; >> > + >> > + return true; >> > +} >> > + >> > +/** >> > * may_linkat - Check permissions for creating a hardlink >> > * @link: the source to hardlink from >> > * >> > * Block hardlink when all of: >> > * - sysctl_protected_hardlinks enabled >> > * - fsuid does not match inode >> > - * - hardlink source is unsafe (see safe_hardlink_source() above) >> > + * - hardlink source is unsafe (see safe_hardlink_source_*() above) >> > * - not CAP_FOWNER in a namespace with the inode owner uid mapped >> > + * (and inode gid mapped, if hardlink conditions depending on the inode's >> > + * group are not satisfied) >> > * >> > * Returns 0 if successful, -ve on error. >> > */ >> > static int may_linkat(struct path *link) >> > { >> > struct inode *inode; >> > + struct user_namespace *ns; >> > + bool owner; >> > + bool safe_uid; >> > + bool safe_gid; >> > >> > if (!sysctl_protected_hardlinks) >> > return 0; >> > >> > inode = link->dentry->d_inode; >> > + ns = current_user_ns(); >> > >> > /* Source inode owner (or CAP_FOWNER) can hardlink all they like, >> > * otherwise, it must be a safe source. >> > */ >> > - if (inode_owner_or_capable(inode) || safe_hardlink_source(inode)) >> > + owner = inode_owner_or_capable(inode); >> > + safe_uid = safe_hardlink_source_uid(inode) || owner; >> > + safe_gid = safe_hardlink_source_gid(inode) || >> > + (owner && kgid_has_mapping(ns, inode->i_gid)); >> > + if (safe_uid && safe_gid) >> > return 0; >> > >> > audit_log_link_denied("linkat", link); >> > -- >> > 2.1.4 >> > >> >> >> > -- Kees Cook Chrome OS Security -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/