From: ebiederm@xmission.com (Eric W. Biederman)
To: Zhao Lei
Cc: , , Mateusz Guzik
Subject: Re: [PATCH v2 3/3] Make core_pattern support namespace
Date: Mon, 21 Mar 2016 01:00:27 -0500
Message-ID: <87twk0tlok.fsf@x220.int.ebiederm.org>
In-Reply-To: <77053bb2bdd21489e09b6ef362044d283e1ba12b.1458305141.git.zhaolei@cn.fujitsu.com> (Zhao Lei's message of "Fri, 18 Mar 2016 20:48:35 +0800")
X-Mailing-List: linux-kernel@vger.kernel.org

Zhao Lei writes:

> Currently, each container shared one copy of coredump setting
> with the host system, if host system changed the setting, each
> running containers will be affected.
>
> Moreover, it is not easy to let each container keeping their own
> coredump setting.
>
> We can use some workaround as pipe program to make the second
> requirement possible, but it is not simple, and both host and
> container are limited to set to fixed pipe program.
> In one word, for host running container, we can't change core_pattern
> anymore.
> To make the problem more hard, if a host running more than one
> container product, each product will try to snatch the global
> coredump setting to fit their own requirement.
>
> For container based on namespace design, it is good to allow
> each container keeping their own coredump setting.
>
> It will bring us following benefit:
> 1: Each container can change their own coredump setting
>    based on operation on /proc/sys/kernel/core_pattern
> 2: Coredump setting changed in host will not affect
>    running containers.
> 3: Support both case of "putting coredump in guest" and
>    "putting coredump in host".
>
> Each namespace-based software(lxc, docker, ..) can use this function
> to custom their dump setting.
>
> And this function makes each container working as separate system,
> it fit for design goal of namespace

There are a lot of questionable things with this patchset.

> @@ -183,7 +182,7 @@ put_exe_file:
>  static int format_corename(struct core_name *cn, struct coredump_params *cprm)
>  {
>      const struct cred *cred = current_cred();
> -    const char *pat_ptr = core_pattern;
> +    const char *pat_ptr = current->nsproxy->pid_ns_for_children->core_pattern;

current->nsproxy->pid_ns_for_children, as the name implies, is the pid
namespace that future children will be created in. It is completely
inappropriate for getting the pid namespace of the current task.

This should use task_active_pid_ns().
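
To make the distinction concrete, here is a minimal userspace model of the
two lookups. It is only a sketch and not kernel code: struct task, its
active_pid_ns field, model_unshare_newpid(), pattern_via_children() and
pattern_via_active() are hypothetical names invented for illustration, and
active_pid_ns merely stands in for what the kernel's task_active_pid_ns()
returns. The one fact it relies on is that unshare(CLONE_NEWPID) does not
move the caller into the new pid namespace; only its future children are
created there.

```c
/* Userspace model only. The struct and field names echo the kernel's,
 * but these are simplified stand-ins, not the kernel definitions. */
#include <string.h>

#define CORENAME_MAX_SIZE 128	/* buffer size used by this model */

struct pid_namespace {
	char core_pattern[CORENAME_MAX_SIZE];
};

struct nsproxy {
	/* Namespace that future children will be created in. */
	struct pid_namespace *pid_ns_for_children;
};

struct task {
	/* Hypothetical field: the namespace the task itself lives in. */
	struct pid_namespace *active_pid_ns;
	struct nsproxy nsproxy;
};

/* Model of unshare(CLONE_NEWPID): the new namespace inherits the
 * pattern, and only pid_ns_for_children is switched. The caller's own
 * namespace (active_pid_ns) is deliberately left untouched. */
void model_unshare_newpid(struct task *t, struct pid_namespace *new_ns)
{
	strncpy(new_ns->core_pattern, t->active_pid_ns->core_pattern,
		sizeof(new_ns->core_pattern) - 1);
	new_ns->core_pattern[sizeof(new_ns->core_pattern) - 1] = '\0';
	t->nsproxy.pid_ns_for_children = new_ns;
}

/* Lookup as written in the patch: follows pid_ns_for_children. */
const char *pattern_via_children(const struct task *t)
{
	return t->nsproxy.pid_ns_for_children->core_pattern;
}

/* Lookup through the namespace the task actually lives in. */
const char *pattern_via_active(const struct task *t)
{
	return t->active_pid_ns->core_pattern;
}
```

In this model, once a task has called model_unshare_newpid() and the new
namespace's pattern is changed, pattern_via_children() returns the new
namespace's pattern for a task that never left its original namespace,
while pattern_via_active() still returns the original one.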
>      int ispipe = (*pat_ptr == '|');
>      int pid_in_pattern = 0;
>      int err = 0;
> diff --git a/include/linux/pid_namespace.h b/include/linux/pid_namespace.h
> index 918b117..a5af1e9 100644
> --- a/include/linux/pid_namespace.h
> +++ b/include/linux/pid_namespace.h
> @@ -9,6 +9,7 @@
>  #include
>  #include
>  #include
> +#include
>
>  struct pidmap {
>      atomic_t nr_free;
> @@ -45,6 +46,7 @@ struct pid_namespace {
>      int hide_pid;
>      int reboot; /* group exit code if this pidns was rebooted */
>      struct ns_common ns;
> +    char core_pattern[CORENAME_MAX_SIZE];
>  };
>
>  extern struct pid_namespace init_pid_ns;
> diff --git a/kernel/pid.c b/kernel/pid.c
> index 4d73a83..c79c1d5 100644
> --- a/kernel/pid.c
> +++ b/kernel/pid.c
> @@ -83,6 +83,7 @@ struct pid_namespace init_pid_ns = {
>  #ifdef CONFIG_PID_NS
>      .ns.ops = &pidns_operations,
>  #endif
> +    .core_pattern = "core",
>  };
>  EXPORT_SYMBOL_GPL(init_pid_ns);
>
> diff --git a/kernel/pid_namespace.c b/kernel/pid_namespace.c
> index a65ba13..16d6d21 100644
> --- a/kernel/pid_namespace.c
> +++ b/kernel/pid_namespace.c
> @@ -123,6 +123,9 @@ static struct pid_namespace *create_pid_namespace(struct user_namespace *user_ns
>      for (i = 1; i < PIDMAP_ENTRIES; i++)
>          atomic_set(&ns->pidmap[i].nr_free, BITS_PER_PAGE);
>
> +    strncpy(ns->core_pattern, parent_pid_ns->core_pattern,
> +        sizeof(ns->core_pattern));
> +

This is pretty horrible. You are giving unprivileged processes the
ability to run an already specified core dump helper in a pid namespace
of their choosing.

That is not backwards compatible, and it is possible this can lead to
privilege escalation by tricking a privileged dump process into doing
something silly because it is running in the wrong pid namespace.

Similarly, the entire concept of forking from the program dumping core
suffers from the same problem, but for all other namespaces.

I was hoping that I would see a justification somewhere in the patch
descriptions describing why this set of decisions could be safe.
I do not, and so I assume this case was not considered.

If you had managed to fork from the child_reaper of the pid_namespace
that set the core pattern (as has been suggested), there would be some
chance that things would work correctly. As you are forking from the
program actually dumping core, I see no chance that this patchset is
either safe or backwards compatible as currently written.

Eric