Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S932844AbcDEBl5 (ORCPT ); Mon, 4 Apr 2016 21:41:57 -0400 Received: from out03.mta.xmission.com ([166.70.13.233]:50103 "EHLO out03.mta.xmission.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S932822AbcDEBly (ORCPT ); Mon, 4 Apr 2016 21:41:54 -0400 From: "Eric W. Biederman" To: Linus Torvalds Cc: "H. Peter Anvin" , Peter Hurley , Greg KH , Jiri Slaby , Aurelien Jarno , Andy Lutomirski , Florian Weimer , Al Viro , Serge Hallyn , Jann Horn , "security@kernel.org" , security@ubuntu.com, security@debian.org, Willy Tarreau , Linux Kernel Mailing List , "Eric W. Biederman" Date: Mon, 4 Apr 2016 20:29:29 -0500 Message-Id: <1459819769-30387-13-git-send-email-ebiederm@xmission.com> X-Mailer: git-send-email 2.8.1 In-Reply-To: <1459819769-30387-1-git-send-email-ebiederm@xmission.com> References: <878u0s3orx.fsf_-_@x220.int.ebiederm.org> <1459819769-30387-1-git-send-email-ebiederm@xmission.com> X-XM-AID: U2FsdGVkX18Vt4nrPZYWD+cVdmwEGRRQUPHWZixt1pw= X-SA-Exim-Connect-IP: 67.3.249.252 X-SA-Exim-Mail-From: ebiederm@xmission.com X-Spam-Report: * -1.0 ALL_TRUSTED Passed through trusted hosts only via SMTP * 0.7 XMSubLong Long Subject * 1.5 XMNoVowels Alpha-numberic number with no vowels * 0.0 TVD_RCVD_IP Message was received from an IP address * 0.8 BAYES_50 BODY: Bayes spam probability is 40 to 60% * [score: 0.5007] * -0.0 DCC_CHECK_NEGATIVE Not listed in DCC * [sa07 1397; Body=1 Fuz1=1 Fuz2=1] * 0.0 T_TooManySym_01 4+ unique symbols in subject X-Spam-DCC: XMission; sa07 1397; Body=1 Fuz1=1 Fuz2=1 X-Spam-Combo: **;Linus Torvalds X-Spam-Relay-Country: X-Spam-Timing: total 1231 ms - load_scoreonly_sql: 0.11 (0.0%), signal_user_changed: 5.0 (0.4%), b_tie_ro: 3.4 (0.3%), parse: 1.79 (0.1%), extract_message_metadata: 20 (1.6%), get_uri_detail_list: 8 (0.7%), tests_pri_-1000: 7 (0.5%), tests_pri_-950: 1.34 (0.1%), tests_pri_-900: 1.10 (0.1%), tests_pri_-400: 54 (4.4%), check_bayes: 53 (4.3%), b_tokenize: 23 (1.9%), b_tok_get_all: 15 (1.2%), b_comp_prob: 6 (0.5%), b_tok_touch_all: 6 (0.5%), b_finish: 0.81 (0.1%), tests_pri_0: 1129 (91.7%), check_dkim_signature: 0.70 (0.1%), check_dkim_adsp: 2.8 (0.2%), tests_pri_500: 7 (0.6%), rewrite_mail: 0.00 (0.0%) Subject: [PATCH 13/13] devpts: Kill the DEVPTS_MULTIPLE_INSTANCE config option X-Spam-Flag: No X-SA-Exim-Version: 4.2.1 (built Wed, 24 Sep 2014 11:00:52 -0600) X-SA-Exim-Scanned: Yes (on in02.mta.xmission.com) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 13992 Lines: 363 Retain the code that was previously enabled with DEVPTS_MULTIPLE_INSTANCES and remove the config option, and kill the little bit of code that existed only when DEVPTS_MULTIPLE_INSTANCES was not selected. With the recently updated semantics userspace actively depends on having multiple instances of devpts for correct operation. Having each mount of devpts return a distinct instance ensures that user space will not accidentally stomp gid or mode devpts options, of the primary system devpts, by mounting devpts in a chroot environment. A guarantee that userspace will not stomp attributes of system devpts removes the need for a setuid root pt_chown executable. Running a userspace without DEVPTS_MULTIPLE_INSTANCES that has depends on the current behavior and has removed a setuid root pt_chown exectuable will allow the system devpts gid and mode options to be stomped breaking userspace. Which makes the ability to disable the code previously selected by DEVPTS_MULTIPLE_INSTANCES actively wrong. The size increase by always using the DEVPTS_MULTIPLE_INSTANCE path is minimal, and the code is much easier to maintain and use without having two different code paths to worry about. The documentation has been updated to relfect this change. Signed-off-by: "Eric W. Biederman" --- Documentation/filesystems/devpts.txt | 122 ++++++++++++++--------------------- drivers/tty/Kconfig | 11 ---- fs/devpts/inode.c | 41 ------------ 3 files changed, 47 insertions(+), 127 deletions(-) diff --git a/Documentation/filesystems/devpts.txt b/Documentation/filesystems/devpts.txt index 30d2fcb32f72..984b645ac341 100644 --- a/Documentation/filesystems/devpts.txt +++ b/Documentation/filesystems/devpts.txt @@ -1,38 +1,32 @@ -To support containers, we now allow multiple instances of devpts filesystem, -such that indices of ptys allocated in one instance are independent of indices -allocated in other instances of devpts. - -To preserve backward compatibility, this support for multiple instances is -enabled only if: - - - CONFIG_DEVPTS_MULTIPLE_INSTANCES=y, and - - '-o newinstance' mount option is specified while mounting devpts - -IOW, devpts now supports both single-instance and multi-instance semantics. - -If CONFIG_DEVPTS_MULTIPLE_INSTANCES=n, there is no change in behavior and -this referred to as the "legacy" mode. In this mode, the new mount options -(-o newinstance and -o ptmxmode) will be ignored with a 'bogus option' message -on console. - -If CONFIG_DEVPTS_MULTIPLE_INSTANCES=y and devpts is mounted without the -'newinstance' option (as in current start-up scripts) the new mount binds -to the initial kernel mount of devpts. This mode is referred to as the -'single-instance' mode and the current, single-instance semantics are -preserved, i.e PTYs are common across the system. - -The only difference between this single-instance mode and the legacy mode -is the presence of new, '/dev/pts/ptmx' node with permissions 0000, which -can safely be ignored. - -If CONFIG_DEVPTS_MULTIPLE_INSTANCES=y and 'newinstance' option is specified, -the mount is considered to be in the multi-instance mode and a new instance -of the devpts fs is created. Any ptys created in this instance are independent -of ptys in other instances of devpts. Like in the single-instance mode, the -/dev/pts/ptmx node is present. To effectively use the multi-instance mode, -open of /dev/ptmx must be a redirected to '/dev/pts/ptmx' using a symlink or -bind-mount. +Each mount of the devpts filesystem is now a distinct instance from +all other mounts of devpts. Mount options and indicies of ptys +allocated in one instance are independent of indices allocated in +other instances of devpts. + +If devpts is mounted without the 'newinstance' option (as in current +start-up scripts) and if the initial kernel mount of devpts has not +been exported to userspace the new mount binds to the initial kernel +mount of devpts. + +If devpts is mounted with the 'newinstance' option or the initial +internal mount of devpts has already been mounted a new instance of +the devpts filesystem is created. Any ptys created in this instance +are independent of the ptys created in other instances of devpts, and +the initial permissions of those ptys are independent from the initial +permissions from any other instance of the devpts filesystem. + +Ideally people will make use of the /dev/pts/ptmx device node to +create ptys on the devpts filesystem. This can be done by updating +userspace, bind mounting /dev/pts/ptmx onto /dev/ptmx or making +/dev/ptmx a symlink to /dev/pts/ptmx. + +To be seemlessly backwards compatible an open of /dev/ptmx will look +to see if the name pts in the same directory is the root directory of +a devpts filesystem. If that is the case and it is not the initial +instance of devpts /dev/pts/ptmx is opened. Otherwise the initial +instance of devpts is opened. In older kernels /dev/ptmx did not +perform this redirection. Eg: A container startup script could do the following: @@ -60,44 +54,34 @@ Per-instance limit could be set by adding mount option "max=". This feature was added in kernel 3.4 together with sysctl kernel.pty.reserve. In kernels older than 3.4 sysctl kernel.pty.max works as per-instance limit. -User-space changes ------------------- +What user-space needs to do +---------------------------- -In multi-instance mode (i.e '-o newinstance' mount option is specified at least -once), following user-space issues should be noted. +1. If the devpts filesystem is only mounted once /dev/pts/ptmx can be + ignored and no change is needed to system-startup scripts. -1. If -o newinstance mount option is never used, /dev/pts/ptmx can be ignored - and no change is needed to system-startup scripts. +2. For best results userspace libraries and applications should be updated + to try opening /dev/pts/ptmx before /dev/ptmx, as /dev/pts/ptmx is less + ambiguous and higher performance. -2. To effectively use multi-instance mode (i.e -o newinstance is specified) - administrators or startup scripts should "redirect" open of /dev/ptmx to - /dev/pts/ptmx using either a bind mount or symlink. +3. To effectively use a new instance of devpts open of /dev/ptmx should + be redirected to /dev/pts/ptmx using either a bind mount or symlink. $ mount -t devpts -o newinstance devpts /dev/pts followed by either - $ rm /dev/ptmx $ ln -s pts/ptmx /dev/ptmx $ chmod 666 /dev/pts/ptmx or $ mount -o bind /dev/pts/ptmx /dev/ptmx -3. The '/dev/ptmx -> pts/ptmx' symlink is the preferred method since it - enables better error-reporting and treats both single-instance and - multi-instance mounts similarly. - - But this method requires that system-startup scripts set the mode of - /dev/pts/ptmx correctly (default mode is 0000). The scripts can set the - mode by, either +4. The '/dev/ptmx -> pts/ptmx' symlink is the preferred method since it + enables better error-reporting and treats all cases the same. - - adding ptmxmode mount option to devpts entry in /etc/fstab, or - - using 'chmod 0666 /dev/pts/ptmx' - -4. If multi-instance mode mount is needed for containers, but the system - startup scripts have not yet been updated, container-startup scripts - should bind mount /dev/ptmx to /dev/pts/ptmx to avoid breaking single- - instance mounts. +5. If the system startup scripts do not create /dev/ptmx as a symlink, + container-startup scripts should bind mount /dev/ptmx to /dev/pts/ptmx + to avoid breaking the rest of the system. Or, in general, container-startup scripts should use: @@ -106,11 +90,9 @@ once), following user-space issues should be noted. mount -o bind /dev/pts/ptmx /dev/ptmx fi - When all devpts mounts are multi-instance, /dev/ptmx can permanently be - a symlink to pts/ptmx and the bind mount can be ignored. - -5. A multi-instance mount that is not accompanied by the /dev/ptmx to - /dev/pts/ptmx redirection would result in an unusable/unreachable pty. +6. A mount of devpts that is not accompanied by the /dev/ptmx to + /dev/pts/ptmx redirection will result in an unusable/unreachable pty, + on older kernels. mount -t devpts -o newinstance lxcpts /dev/pts @@ -121,21 +103,11 @@ once), following user-space issues should be noted. would create a pty, say /dev/pts/7, in the initial kernel mount. But /dev/pts/7 would be invisible in the new mount. -6. The permissions for /dev/pts/ptmx node should be specified when mounting - /dev/pts, using the '-o ptmxmode=%o' mount option (default is 0000). +7. The permissions for /dev/pts/ptmx node should be specified when mounting + /dev/pts, using the '-o ptmxmode=%o' mount option (default is 0666 was 0000). mount -t devpts -o newinstance -o ptmxmode=0644 devpts /dev/pts The permissions can be later be changed as usual with 'chmod'. chmod 666 /dev/pts/ptmx - -7. A mount of devpts without the 'newinstance' option results in binding to - initial kernel mount. This behavior while preserving legacy semantics, - does not provide strict isolation in a container environment. i.e by - mounting devpts without the 'newinstance' option, a container could - get visibility into the 'host' or root container's devpts. - - To workaround this and have strict isolation, all mounts of devpts, - including the mount in the root container, should use the newinstance - option. diff --git a/drivers/tty/Kconfig b/drivers/tty/Kconfig index 82c4d2e45319..95103054c0e4 100644 --- a/drivers/tty/Kconfig +++ b/drivers/tty/Kconfig @@ -120,17 +120,6 @@ config UNIX98_PTYS All modern Linux systems use the Unix98 ptys. Say Y unless you're on an embedded system and want to conserve memory. -config DEVPTS_MULTIPLE_INSTANCES - bool "Support multiple instances of devpts" - depends on UNIX98_PTYS - default n - ---help--- - Enable support for multiple instances of devpts filesystem. - If you want to have isolated PTY namespaces (eg: in containers), - say Y here. Otherwise, say N. If enabled, each mount of devpts - filesystem with the '-o newinstance' option will create an - independent PTY namespace. - config LEGACY_PTYS bool "Legacy (BSD) PTY support" default y diff --git a/fs/devpts/inode.c b/fs/devpts/inode.c index f86fae8dac0b..9bed39d89cb3 100644 --- a/fs/devpts/inode.c +++ b/fs/devpts/inode.c @@ -109,11 +109,9 @@ static const match_table_t tokens = { {Opt_uid, "uid=%u"}, {Opt_gid, "gid=%u"}, {Opt_mode, "mode=%o"}, -#ifdef CONFIG_DEVPTS_MULTIPLE_INSTANCES {Opt_ptmxmode, "ptmxmode=%o"}, {Opt_newinstance, "newinstance"}, {Opt_max, "max=%d"}, -#endif {Opt_err, NULL} }; @@ -128,7 +126,6 @@ static inline struct pts_fs_info *DEVPTS_SB(struct super_block *sb) return sb->s_fs_info; } -#ifdef CONFIG_DEVPTS_MULTIPLE_INSTANCES static int devpts_path_ptmx(struct file *filp) { struct pts_fs_info *fsi; @@ -186,12 +183,6 @@ fail: path_put(&path); return err; } -#else -static inline int devpts_path_ptmx(struct file *filp) -{ - return -ENOENT; -} -#endif struct inode *devpts_ptmx(struct inode *inode, struct file *filp) { @@ -210,10 +201,8 @@ struct inode *devpts_ptmx(struct inode *inode, struct file *filp) static inline struct super_block *pts_sb_from_inode(struct inode *inode) { -#ifdef CONFIG_DEVPTS_MULTIPLE_INSTANCES if (inode->i_sb->s_magic == DEVPTS_SUPER_MAGIC) return inode->i_sb; -#endif if (!devpts_mnt) return NULL; return devpts_mnt->mnt_sb; @@ -289,7 +278,6 @@ static int parse_mount_options(char *data, struct pts_mount_opts *opts) return -EINVAL; opts->mode = option & S_IALLUGO; break; -#ifdef CONFIG_DEVPTS_MULTIPLE_INSTANCES case Opt_ptmxmode: if (match_octal(&args[0], &option)) return -EINVAL; @@ -303,7 +291,6 @@ static int parse_mount_options(char *data, struct pts_mount_opts *opts) return -EINVAL; opts->max = option; break; -#endif default: pr_err("called with bogus options\n"); return -EINVAL; @@ -313,7 +300,6 @@ static int parse_mount_options(char *data, struct pts_mount_opts *opts) return 0; } -#ifdef CONFIG_DEVPTS_MULTIPLE_INSTANCES static int mknod_ptmx(struct super_block *sb) { int mode; @@ -374,16 +360,6 @@ static void update_ptmx_mode(struct pts_fs_info *fsi) inode->i_mode = S_IFCHR|fsi->mount_opts.ptmxmode; } } -#else -static inline void update_ptmx_mode(struct pts_fs_info *fsi) -{ - return; -} -static inline int mknod_ptmx(struct super_block *sb) -{ - return 0; -} -#endif static int devpts_remount(struct super_block *sb, int *flags, char *data) { @@ -417,11 +393,9 @@ static int devpts_show_options(struct seq_file *seq, struct dentry *root) seq_printf(seq, ",gid=%u", from_kgid_munged(&init_user_ns, opts->gid)); seq_printf(seq, ",mode=%03o", opts->mode); -#ifdef CONFIG_DEVPTS_MULTIPLE_INSTANCES seq_printf(seq, ",ptmxmode=%03o", opts->ptmxmode); if (opts->max < NR_UNIX98_PTY_MAX) seq_printf(seq, ",max=%d", opts->max); -#endif return 0; } @@ -512,7 +486,6 @@ fail: return error; } -#ifdef CONFIG_DEVPTS_MULTIPLE_INSTANCES /* * devpts_mount() * @@ -565,18 +538,6 @@ static struct dentry *devpts_mount(struct file_system_type *fs_type, return root; } -#else -/* - * This supports only the legacy single-instance semantics (no - * multiple-instance semantics) - */ -static struct dentry *devpts_mount(struct file_system_type *fs_type, int flags, - const char *dev_name, void *data) -{ - return mount_single(fs_type, flags, data, devpts_fill_super); -} -#endif - static void devpts_kill_sb(struct super_block *sb) { struct pts_fs_info *fsi = DEVPTS_SB(sb); @@ -591,9 +552,7 @@ static struct file_system_type devpts_fs_type = { .name = "devpts", .mount = devpts_mount, .kill_sb = devpts_kill_sb, -#ifdef CONFIG_DEVPTS_MULTIPLE_INSTANCES .fs_flags = FS_USERNS_MOUNT | FS_USERNS_DEV_MOUNT, -#endif }; /* -- 2.6.3