Date: Sat, 26 Dec 2015 15:08:25 -0600
From: "Serge E. Hallyn"
To: Jann Horn
Cc: "Serge E. Hallyn", Roland McGrath, Oleg Nesterov, linux-kernel@vger.kernel.org, security@kernel.org, Serge Hallyn, Andy Lutomirski, "Eric W. Biederman"
Subject: Re: [PATCH] ptrace: being capable wrt a process requires mapped uids/gids
Message-ID: <20151226210825.GB19815@mail.hallyn.com>
References: <1449951161-4850-1-git-send-email-jann@thejh.net> <20151226011038.GA25455@pc.thejh.net> <20151226202345.GA19815@mail.hallyn.com> <20151226205550.GA29895@pc.thejh.net>
In-Reply-To: <20151226205550.GA29895@pc.thejh.net>

On Sat, Dec 26, 2015 at 09:55:50PM +0100, Jann Horn wrote:
> On Sat, Dec 26, 2015 at 02:23:45PM -0600, Serge E. Hallyn wrote:
> > On Sat, Dec 26, 2015 at 02:10:38AM +0100, Jann Horn wrote:
> > > On Sat, Dec 12, 2015 at 09:12:41PM +0100, Jann Horn wrote:
> > > > With this change, the entering process can first enter the
> > > > namespace and then safely inspect the namespace's
> > > > properties, e.g. through /proc/self/{uid_map,gid_map},
> > > > assuming that the namespace owner doesn't have access to
> > > > uid 0.
> > >
> > > Actually, I think I missed something there. Well, at least it
> > > should not directly lead to a container escape.
> > >
> > >
> > > > -static int ptrace_has_cap(struct user_namespace *ns, unsigned int mode)
> > > > +static bool ptrace_has_cap(const struct cred *tcred, unsigned int mode)
> > > >  {
> > > > +	struct user_namespace *tns = tcred->user_ns;
> > > > +	struct user_namespace *curns = current_cred()->user_ns;
> > > > +
> > > > +	/* When a root-owned process enters a user namespace created by a
> > > > +	 * malicious user, the user shouldn't be able to execute code under
> > > > +	 * uid 0 by attaching to the root-owned process via ptrace.
> > > > +	 * Therefore, similar to the capable_wrt_inode_uidgid() check,
> > > > +	 * verify that all the uids and gids of the target process are
> > > > +	 * mapped into the current namespace.
> > > > +	 * No fsuid/fsgid check because __ptrace_may_access doesn't do it
> > > > +	 * either.
> > > > +	 */
> > > > +	if (!kuid_has_mapping(curns, tcred->euid) ||
> > > > +	    !kuid_has_mapping(curns, tcred->suid) ||
> > > > +	    !kuid_has_mapping(curns, tcred->uid) ||
> > > > +	    !kgid_has_mapping(curns, tcred->egid) ||
> > > > +	    !kgid_has_mapping(curns, tcred->sgid) ||
> > > > +	    !kgid_has_mapping(curns, tcred->gid))
> > > > +		return false;
> > > > +
> > > >  	if (mode & PTRACE_MODE_NOAUDIT)
> > > > -		return has_ns_capability_noaudit(current, ns, CAP_SYS_PTRACE);
> > > > +		return has_ns_capability_noaudit(current, tns, CAP_SYS_PTRACE);
> > > >  	else
> > > > -		return has_ns_capability(current, ns, CAP_SYS_PTRACE);
> > > > +		return has_ns_capability(current, tns, CAP_SYS_PTRACE);
> > > >  }
> > >
> > > If the namespace owner can run code in the init namespace, the kuids are
> > > mapped into curns but he is still capable wrt the target namespace.
> > >
> > > I think a proper fix should first determine the highest parent of
> > > tcred->user_ns in which the caller still has privs, then do the
> > > kxid_has_mapping() checks in there.
> >
> > Hi,
> >
> > I don't quite follow what you are concerned about.
> > Based on the new patch you sent, I assume it's not the case where
> > the tcred's kuid is actually mapped into the container. So is it the
> > case where I unshare a userns which unshares a userns, then setns from
> > the grandparent into the child? And if so, the concern is that if the
> > setns()ing task's kuid is mappable all along into the grandchild, then
> > container root should be able to ptrace it?
>
> Consider the following scenario:
>
> init_user_ns has a child namespace (I'll call it child_ns).
> child_ns is owned by an attacker (child_ns->owner == attacker_kuid).
> The attacking process has current_cred()->euid == attacker_kuid and lives
> in init_user_ns (which means it's capable in child_ns).

Ah, right.  Special.  Thanks.

-serge
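
For readers following the thread, the scenario quoted above can be modeled in user space. The sketch below is a toy model under stated assumptions, not kernel code: struct toy_userns, struct toy_cred, toy_ns_capable() and toy_highest_capable_ns() are hypothetical names invented for illustration, and the capability rule is a simplified reading of "a process living in the parent namespace whose euid equals the child's owner is capable in the child". It illustrates why an attacker living in init_user_ns is capable in child_ns, and what "the highest parent of tcred->user_ns in which the caller still has privs" would be in that case.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for illustration only; NOT the kernel's structures. */
struct toy_userns {
	const struct toy_userns *parent; /* NULL for the initial namespace */
	unsigned int owner;              /* kuid of the namespace creator */
};

struct toy_cred {
	const struct toy_userns *user_ns; /* namespace the task lives in */
	unsigned int euid;                /* effective kuid */
	bool has_cap;                     /* CAP_SYS_PTRACE in its own ns */
};

/*
 * Simplified (assumed) capability rule: walk from the target namespace
 * towards the root. The task is capable if it holds the capability in
 * its own namespace and that namespace is an ancestor-or-self of the
 * target, or if it lives in the parent of a namespace on the path and
 * its euid owns that namespace.
 */
static bool toy_ns_capable(const struct toy_cred *cred,
			   const struct toy_userns *target)
{
	const struct toy_userns *ns = target;

	assert(cred != NULL && target != NULL);
	for (;;) {
		if (ns == cred->user_ns)
			return cred->has_cap;
		if (ns->parent == NULL)
			return false; /* reached the root without a match */
		if (ns->parent == cred->user_ns && ns->owner == cred->euid)
			return true;  /* owner of ns, living in its parent */
		ns = ns->parent;
	}
}

/*
 * Highest ancestor-or-self of 'target' in which 'cred' is still capable,
 * or NULL if there is none.
 */
static const struct toy_userns *
toy_highest_capable_ns(const struct toy_cred *cred,
		       const struct toy_userns *target)
{
	const struct toy_userns *best = NULL;
	const struct toy_userns *ns;

	for (ns = target; ns != NULL; ns = ns->parent)
		if (toy_ns_capable(cred, ns))
			best = ns;
	return best;
}

/* The scenario from the mail: the attacker (kuid 1000, chosen arbitrarily)
 * owns child_ns but lives in the initial namespace without capabilities. */
static const struct toy_userns toy_init_ns = { .parent = NULL, .owner = 0 };
static const struct toy_userns toy_child_ns = { .parent = &toy_init_ns,
						.owner = 1000 };
static const struct toy_cred toy_attacker = { .user_ns = &toy_init_ns,
					      .euid = 1000,
					      .has_cap = false };
```

In this model the attacker is capable in toy_child_ns but not in toy_init_ns, so the highest namespace in which it keeps privileges is toy_child_ns. A mapping check done against the attacker's own namespace (the initial one, where every kuid is mapped) would therefore always pass, which is the gap the quoted scenario points at.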