Received: by 10.223.185.116 with SMTP id b49csp6189780wrg; Thu, 8 Mar 2018 03:24:39 -0800 (PST) X-Google-Smtp-Source: AG47ELv2CishbCsaFmUBuI+hm9s7W5Htw1aY60AxNygedJ8ngQfkkH/QMBVUfJVFvotWj6r82QF6 X-Received: by 10.98.13.24 with SMTP id v24mr25846930pfi.89.1520508279513; Thu, 08 Mar 2018 03:24:39 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1520508279; cv=none; d=google.com; s=arc-20160816; b=AHLBavReLfiqX11qbMi/urpptZMuCfnWIMdmnddTipLTmF6PNJADORpnuG0BupaQIa 7Dz4CrU4bO9nxa417Ol2aI0+DKtD7sGCRENm+d5LoRB43Ro9Aes5/mKHhdYiSRKvv1t1 KL6Nms1pyrtcKqdVq5dyoZQg+GI7zfNRLEROt9m2N9pKzp+SJIZRex4elO1IDNHFX5bW TJtFWU1KB92NpJ95pIi+nLWKxkBQomRlSPIsfyejD2kT9LBzV515aAPyEuTxvYv7mtwH AP8f4psPcbvPMzUDHzSemHNTbymYvrWjvjO9NF3CXtCC0Abunx0EZCpo/geuV8WNsnJf SV5g== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:user-agent:in-reply-to :content-transfer-encoding:content-disposition:mime-version :references:message-id:subject:cc:to:date:from :arc-authentication-results; bh=TTJj1i6FXvQtolBsmrAr4HdAlAru/CUbZhi4r1Os6qk=; b=xjxdQpTRY2Mxxhdf67XvbTPAX3vyiamYV0qRMU0+XaXxp5/C0DmbjfOz6k3Hb4+eAm O55T6AVeq8UnKsS+lZjhByKmzqqPRpzc00WatFLReG8uBAfOFwA4I8ARuOBedI1g73z/ FyC2z5IogjZOQ7ODfhf5K0OQufBJKZL2Dh80igAREEqlq+FH8DguH18e5VQu3g9V+k9P asdA0Wy54nyBhOwqSK1I7W3Xp6eqlA5fDo0Q5K8zpHx8SLB+zzK7tJ0DVKs0fxL/uhuQ k467soPL7xTLJ4ThWQ/CFATHm0ZIzISRmZUsTJq2WTV3bWzwd7qwOKKjDJM3YcjX1cde tnpQ== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=canonical.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id u3-v6si14524001plz.494.2018.03.08.03.24.24; Thu, 08 Mar 2018 03:24:39 -0800 (PST) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=canonical.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S935333AbeCHLW6 (ORCPT + 99 others); Thu, 8 Mar 2018 06:22:58 -0500 Received: from youngberry.canonical.com ([91.189.89.112]:48586 "EHLO youngberry.canonical.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751434AbeCHLW4 (ORCPT ); Thu, 8 Mar 2018 06:22:56 -0500 Received: from mail-wm0-f69.google.com ([74.125.82.69]) by youngberry.canonical.com with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.76) (envelope-from ) id 1ettdP-0004HX-VZ for linux-kernel@vger.kernel.org; Thu, 08 Mar 2018 11:22:55 +0000 Received: by mail-wm0-f69.google.com with SMTP id p14so7655084wmc.0 for ; Thu, 08 Mar 2018 03:22:55 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:date:to:cc:subject:message-id:references :mime-version:content-disposition:content-transfer-encoding :in-reply-to:user-agent; bh=TTJj1i6FXvQtolBsmrAr4HdAlAru/CUbZhi4r1Os6qk=; b=Es5glDCUAW76EUl8YqC21u4ziuMMeKop1ER/NvXeAJk3LpogIunxOki9+Bfde8GCUk 2M1E5bqef0Ch37JCJ5wqfUyvjRLrvFaxjKKOOaiS+1PLz1HKcFgO8rzWvaUzwEGOda42 s3QFmzBO2UmVYhMWfiBw/LRrtj4M1u8IBBW0gwIlU7Ml1a4ZF0d6PRxv0FkJDPo/xryq bc8nh8T6QiKVg67Sv9rZwnwI3lgfYyLdhGZUEI+oDP6bjaY3MtsxYX8BKLmbGMHmCWPK jErw3uvueknxiwlC8RsdtRIpg0I4M/scYg2oG7EVTSYIrVfcuMcMx0cATOw1/6ik8xHE CGJw== X-Gm-Message-State: AElRT7G9FvwzAAAnS3gcmtWskix4inBSQD5/IVVtVnBo8JQ6k2GwWTWq MC65jw+gTY+NbAlwlQEJsZYauVESgV6PxNDmYRk4W2esRjFDYNaq8KYzIJ1QWdqRLxobQpfnvXZ DmOzhZoA2kO5ZK8b3T2GpISUBnhxE39ql5dQBNV1evw== X-Received: by 10.28.224.135 with SMTP id x129mr18501299wmg.57.1520508175454; Thu, 08 Mar 2018 03:22:55 -0800 (PST) X-Received: by 10.28.224.135 with SMTP id x129mr18501281wmg.57.1520508175165; Thu, 08 Mar 2018 03:22:55 -0800 (PST) Received: from gmail.com ([37.220.133.201]) by smtp.gmail.com with ESMTPSA id m187sm19348367wmg.0.2018.03.08.03.22.52 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Thu, 08 Mar 2018 03:22:54 -0800 (PST) From: Christian Brauner X-Google-Original-From: Christian Brauner Date: Thu, 8 Mar 2018 12:22:49 +0100 To: "Eric W. Biederman" Cc: linux-kernel@vger.kernel.org, torvalds@linux-foundation.org Subject: Re: Invalid /proc//fd/{0,1,2} symlinks with TIOCGPTPEER Message-ID: <20180308112248.GA5918@gmail.com> References: <20180307161744.GA17562@gmail.com> <878tb3u1hv.fsf@xmission.com> <20180308082228.GD22728@gmail.com> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: <20180308082228.GD22728@gmail.com> User-Agent: Mutt/1.9.3 (2018-01-21) Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Thu, Mar 08, 2018 at 09:22:29AM +0100, Christian Brauner wrote: > On Wed, Mar 07, 2018 at 01:30:52PM -0600, Eric W. Biederman wrote: > > Christian Brauner writes: > > > > > Hey, > > > > > > We discovered a potential bug in the devpts implementation via > > > TIOCGPTPEER ioctl()s today. We've tackled a similar problem already in: > > > > > > commit 311fc65c9fb9c966bca8e6f3ff8132ce57344ab9 > > > Author: Eric W. Biederman > > > Date: Thu Aug 24 15:13:29 2017 -0500 > > > > > > pty: Repair TIOCGPTPEER > > > > > > Most libcs will *still* look at /dev/ptmx when opening the master fd of > > > pty device. Usually, /dev/ptmx will nowadays be either a symlink to > > > /dev/pts/ptmx or it will be a second device node with permissions 666 > > > whereas /dev/pts/ptmx will usually have permissions 000. Afaik, we've > > > also always supported making /dev/ptmx a bind-mount to /dev/pts/ptmx or > > > at least I haven't observed any issues with this so far and it's > > > something fairly common in containers. So in short, it should be legal > > > to do: > > > > > > mount --bind /dev/pts/ptmx /dev/ptmx > > > chmod 666 /dev/ptmx > > > > > > However, for any libc implementation or program that uses TIOCGPTPEER > > > the /proc//fd/{0,1,2} symlinks are broken (currently affects at > > > least glibc 2.27) with bind-mounts of /dev/pts/ptmx to /dev/ptmx. A > > > quick reproducer is: > > > > > > unshare --mount > > > mount --bind /dev/pts/ptmx /dev/ptmx > > > chmod 666 /dev/ptmx > > > script > > > ls -al /proc/self/fd/0 > > > > > > Let's assume the slave device index I received was 5 then I would expect to > > > see: > > > > > > ls -al /proc/self/fd/0 > > > lrwx------ 1 chb chb 64 Mar 7 16:41 /proc/self/fd/0 -> /dev/pts/5 > > > > > > But what I actually see is: > > > > > > ls -al /proc/self/fd/0 > > > lrwx------ 1 chb chb 64 Mar 7 16:41 /proc/self/fd/0 -> / > > > > > > I think the explanation for this is fairly straightforward. When > > > userspace does: > > > > > > master = open("/dev/ptmx", O_RDWR | O_NOCTTY); > > > slave = ioctl(master, TIOCGPTPEER, O_RDWR | O_NOCTTY); > > > > > > and /dev/ptmx is a bind-mount of /dev/pts/ptmx looking up the root mount > > > of the dentry for the slave it appears to the kernel as if the dentry is > > > escaping it's bind-mount: > > > > > > ├─/dev udev devtmpfs rw,nosuid,relatime,size=4001260k,nr_inodes=1000315,mode=755 > > > │ ├─/dev/pts devpts devpts rw,nosuid,noexec,relatime,gid=5,mode=620,ptmxmode=000 > > > │ └─/dev/ptmx devpts[/ptmx] devpts rw,nosuid,noexec,relatime,gid=5,mode=620,ptmxmode=000 > > > > > > since the root mount of the dentry is /dev/pts but the root mount of > > > /dev/ptmx is /dev if I'm correct so similar to what Linus pointed out in > > > a previous discussion (see [1]) before. So we still record the "wrong" > > > vfsmount when /dev/ptmx is a bind-mount and then hit the problem when we > > > call devpts_mntget() in drivers/tty/pty.c. > > > > I think your analysis of why we return / is correct. If the root of the > > mount is a file (aka /dev/pts/ptmx). Then any other file will on that > > mount will not be under the root of the mount, and will be displayed > > as '/'. Because we have in fact escaped the root of the mount. > > > > I think this is more of a quality of implementation issue more than a > > bug per se. > > It's at least a regression since this used to work before. :) > > > > > > So I thought about this and - in case my analysis is correct - the > > > solution didn't seem obvious to me as a bind-mount has no concept of > > > what it's "parent" is (Which in this case should be the devpts mount at > > > /dev/pts.). > > > > We might be able to improve the quality of the implementation, by > > noticing this case early (sb->s_root != mnt->mnt_root) and using the > > same tricks on /dev/pts/ptmx as we do on /dev/ptmx. That is looking > > in ../pts and see if the filesystem we want is there. > > > > It would be a wee bit tricky but doable. The practical question becomes > > what breaks and what makes it worth maintaining such a mechanism. > > > > I don't remember how important it is to have a valid path in proc. So > > I won't comment on how important it is to improve the quality of > > the implementation. > > It's quite important for containers. The problem is that we can't (yet) > mknod() in a user namespace and making /dev/ptmx a symlink to > /dev/pts/ptmx will cause issues when used together with path-based LSMs > like AppArmor so a bind-mount is the only reliable option. > > > > > The code can be improved by doing something like: > > Right, what do you think about Linus suggestion? I'm happy to look into > it. Ah you're pointing at a similar solution. Sorry, I missed it on the first pass. I'll send a patch soon. Christian > > Christian > > > > > static int devpts_ptmx_pts_path(struct path *path) > > { > > struct super_block *sb; > > int err; > > > > /* Is a devpts filesystem at "pts" in the same directory? */ > > err = path_pts(path); > > if (err) > > return err; > > > > /* Is the path the root of a devpts filesystem? */ > > sb = path->mnt->mnt_sb; > > if ((sb->s_magic != DEVPTS_SUPER_MAGIC) || > > (path->mnt->mnt_root != sb->s_root)) > > return -ENODEV; > > > > return 0; > > } > > > > .... > > if ((DEVPTS_SB(path.mnt->mnt_sb) == fsi) && > > (path.mnt->mnt_root == fsi->ptmx_dentry)) { > > /* While the start point is a bind mount of single file > > * walk upwards. > > */ > > while ((path.mnt->mnt_root == path.dentry) && follow_up(&path)) > > ; > > if (devpts_ptmx_pts_path(&path) == 0) { > > dput(path.dentry); > > return path.mnt; > > } > > /* No luck fall through to the old code */ > > path_put(path); > > path = filp->f_path; > > path_get(&path); > > } > > > > The fall through vs fail would be a judgement on how important it is to > > have a useable path in proc for TIOCPTPEER. > > > > Eric