From: ebiederm@xmission.com (Eric W. Biederman)
To: Alistair Strachan
Cc: linux-fsdevel@vger.kernel.org, Seth Forshee, Djalal Harouni, kernel-team@android.com, linux-kernel@vger.kernel.org, Linux Containers
References: <20180611195744.154962-1-astrachan@google.com>
Date: Mon, 11 Jun 2018 20:22:03 -0500
In-Reply-To: <20180611195744.154962-1-astrachan@google.com> (Alistair Strachan's message of "Mon, 11 Jun 2018 12:57:44 -0700")
Message-ID: <87bmcgpzno.fsf@xmission.com>
Subject: Re: [PATCH] proc: Fix parsing of mount parameters.
X-Mailing-List: linux-kernel@vger.kernel.org

Alistair Strachan writes:

> In commit e94591d0d90c "proc: Convert proc_mount to use mount_ns"
> the parsing of mount parameters for the proc filesystem was broken.
>
> The SB_KERNMOUNT for procfs happens via:
>
>   start_kernel()
>     rest_init()
>       kernel_thread()
>         _do_fork()
>           copy_process()
>             alloc_pid()
>               pid_ns_prepare_proc()
>                 kern_mount_data()
>                   proc_mount()
>                     mount_ns()
>
> In mount_ns(), the kernel calls proc_fill_super() only if the superblock
> has not previously been set up (i.e. the first mount reference),
> regardless of SB_KERNMOUNT. Because the call to proc_parse_options() had
> been moved inside here, and the SB_KERNMOUNT uses no mount options, the
> option parser became a no-op.
>
> When userspace later mounted procfs with e.g. hidepid=2, the options
> would be ignored.
>
> This change backs out a part of the original cleanup and parses the
> procfs mount options at every mount call. Because the options currently
> only update the pid_ns for the mount, they are applied for all mounts of
> proc by that pid or children of that pid, instantaneously. This is the
> same behavior as the original code.

Two years for a regression to be reported is a little long. I think
that gets us out of the knee-jerk immediate fix-or-revert phase and into
thinking a little bit about what makes sense in this code.

As we saw with devpts, there is a very real danger, if we go with your
proposed change in behavior, of someone mounting a second instance of
proc in a chroot and causing problems by either strengthening or
weakening the hidepid protections for the entire pid namespace.

Ordinary block device filesystems (like ext4) avoid this problem by
allowing a second mount and by not parsing the mount options except on
remount, which is what proc currently does. So I think it can reasonably
be argued that the change in behavior was an unintentional fix.

I can see an argument for failing the mount of proc if mount options are
specified, or if those mount options differ from the existing mount
options.

proc_remount's call of proc_parse_options is definitely buggy, as it can
partially succeed: it can change the pid namespace and still return an
error code. That is bad error handling.

There may be an argument for making these options available in something
other than a mount of proc, as they are pid-namespace wide.

There may be an argument for multiple instances of proc, so that it
makes sense to process these options during an ordinary mount.

Ultimately, what I see is a difficult area of semantics with at least a
little room for improvement, but it is not as simple as this proposed
change.

> Fixes: e94591d0d90c ("proc: Convert proc_mount to use mount_ns")
> Signed-off-by: Alistair Strachan
> Cc: Seth Forshee
> Cc: Djalal Harouni
> Cc: "Eric W. Biederman"
> Cc: kernel-team@android.com
> Cc: linux-kernel@vger.kernel.org
> ---
>  fs/proc/inode.c    | 4 ----
>  fs/proc/internal.h | 1 -
>  fs/proc/root.c     | 5 ++++-
>  3 files changed, 4 insertions(+), 6 deletions(-)
>
> diff --git a/fs/proc/inode.c b/fs/proc/inode.c
> index 2cf3b74391ca..bbbbf348be0a 100644
> --- a/fs/proc/inode.c
> +++ b/fs/proc/inode.c
> @@ -492,13 +492,9 @@ struct inode *proc_get_inode(struct super_block *sb, struct proc_dir_entry *de)
>
>  int proc_fill_super(struct super_block *s, void *data, int silent)
>  {
> -	struct pid_namespace *ns = get_pid_ns(s->s_fs_info);
>  	struct inode *root_inode;
>  	int ret;
>
> -	if (!proc_parse_options(data, ns))
> -		return -EINVAL;
> -
>  	/* User space would break if executables or devices appear on proc */
>  	s->s_iflags |= SB_I_USERNS_VISIBLE | SB_I_NOEXEC | SB_I_NODEV;
>  	s->s_flags |= SB_NODIRATIME | SB_NOSUID | SB_NOEXEC;
> diff --git a/fs/proc/internal.h b/fs/proc/internal.h
> index 50cb22a08c2f..89b7e845b000 100644
> --- a/fs/proc/internal.h
> +++ b/fs/proc/internal.h
> @@ -264,7 +264,6 @@ static inline void proc_tty_init(void) {}
>   * root.c
>   */
>  extern struct proc_dir_entry proc_root;
> -extern int proc_parse_options(char *options, struct pid_namespace *pid);
>
>  extern void proc_self_init(void);
>  extern int proc_remount(struct super_block *, int *, char *);
> diff --git a/fs/proc/root.c b/fs/proc/root.c
> index 61b7340b357a..d40676a5dd6c 100644
> --- a/fs/proc/root.c
> +++ b/fs/proc/root.c
> @@ -36,7 +36,7 @@ static const match_table_t tokens = {
>  	{Opt_err, NULL},
>  };
>
> -int proc_parse_options(char *options, struct pid_namespace *pid)
> +static int proc_parse_options(char *options, struct pid_namespace *pid)
>  {
>  	char *p;
>  	substring_t args[MAX_OPT_ARGS];
> @@ -98,6 +98,9 @@ static struct dentry *proc_mount(struct file_system_type *fs_type,
>  		ns = task_active_pid_ns(current);
>  	}
>
> +	if (!proc_parse_options(data, ns))
> +		return ERR_PTR(-EINVAL);
> +
>  	return mount_ns(fs_type, flags, data, ns, ns->user_ns, proc_fill_super);
>  }
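
To make the proc_remount error-handling concern concrete, here is a
minimal user-space sketch of one way to avoid a partial update: parse
into a staged copy first and commit only if every option is valid. This
is an illustration under stated assumptions, not the kernel code and not
a proposed patch. struct proc_opts, parse_number(), parse_opts() and
apply_opts() are hypothetical names, and the accepted range 0..2 for
hidepid is an assumption made for the example.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the per-pid-namespace settings. */
struct proc_opts {
	int hide_pid;
	unsigned int pid_gid;
};

/* Parse an unsigned decimal that must span exactly [s, tok_end). */
static int parse_number(const char *s, const char *tok_end, unsigned long *val)
{
	char *end;

	if (s == tok_end || *s < '0' || *s > '9')
		return -1;
	*val = strtoul(s, &end, 10);
	return end == tok_end ? 0 : -1;
}

/*
 * Parse a comma-separated option string into *out only.
 * Returns 0 on success, -1 on the first invalid option.
 * On failure *out may hold some already-parsed values, which is
 * why callers must never pass the live state here.
 */
static int parse_opts(const char *options, struct proc_opts *out)
{
	const char *p = options;

	while (p && *p) {
		size_t len = strcspn(p, ",");
		const char *tok_end = p + len;
		unsigned long v;

		if (len == 0) {
			/* Empty token (e.g. ",,"): nothing to do. */
		} else if (len > 8 && strncmp(p, "hidepid=", 8) == 0) {
			/* Assumed valid range for this sketch: 0..2. */
			if (parse_number(p + 8, tok_end, &v) < 0 || v > 2)
				return -1;
			out->hide_pid = (int)v;
		} else if (len > 4 && strncmp(p, "gid=", 4) == 0) {
			if (parse_number(p + 4, tok_end, &v) < 0)
				return -1;
			out->pid_gid = (unsigned int)v;
		} else {
			return -1;
		}

		p = tok_end;
		if (*p == ',')
			p++;
	}
	return 0;
}

/* All-or-nothing update: *live changes only if parsing fully succeeds. */
static int apply_opts(struct proc_opts *live, const char *options)
{
	struct proc_opts staged;

	assert(live != NULL);
	staged = *live;
	if (parse_opts(options, &staged) < 0)
		return -1;
	*live = staged;
	return 0;
}
```

With this shape, a call such as apply_opts(&live, "hidepid=1,bogus")
fails and leaves live untouched, whereas a parser that writes straight
into the live state would already have changed hide_pid before it
reached the bad option.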