Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752749AbbLZUzy (ORCPT ); Sat, 26 Dec 2015 15:55:54 -0500 Received: from thejh.net ([37.221.195.125]:34311 "EHLO thejh.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751878AbbLZUzx (ORCPT ); Sat, 26 Dec 2015 15:55:53 -0500 Date: Sat, 26 Dec 2015 21:55:50 +0100 From: Jann Horn To: "Serge E. Hallyn" Cc: Roland McGrath , Oleg Nesterov , linux-kernel@vger.kernel.org, security@kernel.org, Serge Hallyn , Andy Lutomirski , "Eric W. Biederman" Subject: Re: [PATCH] ptrace: being capable wrt a process requires mapped uids/gids Message-ID: <20151226205550.GA29895@pc.thejh.net> References: <1449951161-4850-1-git-send-email-jann@thejh.net> <20151226011038.GA25455@pc.thejh.net> <20151226202345.GA19815@mail.hallyn.com> MIME-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="J/dobhs11T7y2rNN" Content-Disposition: inline In-Reply-To: <20151226202345.GA19815@mail.hallyn.com> User-Agent: Mutt/1.5.23 (2014-03-12) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 5910 Lines: 136 --J/dobhs11T7y2rNN Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On Sat, Dec 26, 2015 at 02:23:45PM -0600, Serge E. Hallyn wrote: > On Sat, Dec 26, 2015 at 02:10:38AM +0100, Jann Horn wrote: > > On Sat, Dec 12, 2015 at 09:12:41PM +0100, Jann Horn wrote: > > > With this change, the entering process can first enter the > > > namespace and then safely inspect the namespace's > > > properties, e.g. through /proc/self/{uid_map,gid_map}, > > > assuming that the namespace owner doesn't have access to > > > uid 0. > >=20 > > Actually, I think I missed something there. Well, at least it > > should not directly lead to a container escape. > >=20 > >=20 > > > -static int ptrace_has_cap(struct user_namespace *ns, unsigned int mo= de) > > > +static bool ptrace_has_cap(const struct cred *tcred, unsigned int mo= de) > > > { > > > + struct user_namespace *tns =3D tcred->user_ns; > > > + struct user_namespace *curns =3D current_cred()->user_ns; > > > + > > > + /* When a root-owned process enters a user namespace created by a > > > + * malicious user, the user shouldn't be able to execute code under > > > + * uid 0 by attaching to the root-owned process via ptrace. > > > + * Therefore, similar to the capable_wrt_inode_uidgid() check, > > > + * verify that all the uids and gids of the target process are > > > + * mapped into the current namespace. > > > + * No fsuid/fsgid check because __ptrace_may_access doesn't do it > > > + * either. > > > + */ > > > + if (!kuid_has_mapping(curns, tcred->euid) || > > > + !kuid_has_mapping(curns, tcred->suid) || > > > + !kuid_has_mapping(curns, tcred->uid) || > > > + !kgid_has_mapping(curns, tcred->egid) || > > > + !kgid_has_mapping(curns, tcred->sgid) || > > > + !kgid_has_mapping(curns, tcred->gid)) > > > + return false; > > > + > > > if (mode & PTRACE_MODE_NOAUDIT) > > > - return has_ns_capability_noaudit(current, ns, CAP_SYS_PTRACE); > > > + return has_ns_capability_noaudit(current, tns, CAP_SYS_PTRACE); > > > else > > > - return has_ns_capability(current, ns, CAP_SYS_PTRACE); > > > + return has_ns_capability(current, tns, CAP_SYS_PTRACE); > > > } > >=20 > > If the namespace owner can run code in the init namespace, the kuids are > > mapped into curns but he is still capable wrt the target namespace. > >=20 > > I think a proper fix should first determine the highest parent of > > tcred->user_ns in which the caller still has privs, then do the > > kxid_has_mapping() checks in there. >=20 > Hi, >=20 > I don't quite follow what you are concerned about. Based on the new > patch you sent, I assume it's not the case where the tcred's kuid is > actually mapped into the container. So is it the case where I > unshare a userns which unshares a userns, then setns from the grandparent > into the child? And if so, the concern is that if the setns()ing task's > kuid is mappable all along into the grandhild, then container root should > be able to ptrace it? Consider the following scenario: init_user_ns has a child namespace (I'll call it child_ns). child_ns is owned by an attacker (child_ns->owner =3D=3D attacker_kuid). The attacking process has current_cred()->euid =3D=3D attacker_kuid and liv= es in init_user_ns (which means it's capable in child_ns). The victim process (with euid=3D=3D0) enters the namespace, then the attack= ing process tries to attach to it. ptrace_has_cap(tcred, mode) would, with my old patch, still be true because current is capable in tcred->user_ns and all kuids are mapped into init_user_ns. The new patch uses the following rule for cap checks: The caller is capable relative to a target if a user namespace exists for which all of the following conditions are true: - The caller has the requested capability inside the namespace. - The target is inside the namespace (either directly or in a child). - The target's kuids and kgids are mapped into the namespace. The first rule is enforced by the has_ns_capability(..., tns, ...) or has_ns_capability_noaudit(..., tns, ...) call at the bottom. The second rule is implicitly true because tns initially is the target's user namespace and then moves up through ->parent. The third rule is enforced by the while loop. This prevents the attack I described, but e.g. still allows someone who is capable in init_user_ns to ptrace anything, no matter in what weird namespace the target is - if a task was ptrace-able for a process before it called clone(CLONE_NEWUSER) / unshare(CLONE_NEWUSER) / setns(, CLONE_NEWUSER), it should still be ptrace-able afterwards. (Unless access was permitted based on the introspection rule.) --J/dobhs11T7y2rNN Content-Type: application/pgp-signature; name="signature.asc" Content-Description: Digital signature -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAEBAgAGBQJWfv7WAAoJED4KNFJOeCOoOf0P/1LHog2febJXSt+SFsr16MjJ i7InurlVZXD+Th/jopvCdYBrCUTXxyk+U4XJiQXn9SZ1vytWy5M5d8wOAnQtyRfv 2qKMMLLIiaHo7Wu+MePpuLnhIOePOIQxIpTJJYG/EtoijnkDNPcssr6LdjpQBwIQ OgWw6mj1oIcC2BNKez1ADFC1NzoM16pTALsYc1GkFFXmeWraH2IbFlbtGM7lv4gW gA5C/3woEQgF3fd5Qn0mI8fyOfEVpG/+XFYnwmnOOx4L4nddlyhzRKDBHUWbQ9rd 7DBnrM6yIC6pptj+8WUgYHVdjNNZe8DnGoyda/ZYC5Kp9l09BSapjBzElZDhI2IG uFjqtjO3M+aDeAXorR+xHi0Mnkmwdj92pnqZYZ8n+6aiA48Sb2em40lZxr55g8gB w0sgXWniBtcgmCP6LQ/SJFTvgpPSb0/AyRdeQ/P//F9S4VGtLn2krP34M0e0opb6 XPGrU2bPH6GpMh67QOMWYH67KT/KUSJxe/DY5HIBqEEzKsYrn9uA2L57Ic7bqFxz qH84Ylhfui7kGrS/QX5TNZGnNNvdNmM5tMB28+TO4rdkki6aMDMsqXlr6eWPUBgy 0bPB7Ec2gLCXU1DYLct2q08F1No06IT12FnHP5gE7HapaIBQO2juTdesQvA0kIdw TXHmvFk6DzuZFEFEfWuV =oG3i -----END PGP SIGNATURE----- --J/dobhs11T7y2rNN-- -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/