2016-10-18 10:06:39

by 曹树烽

[permalink] [raw]
Subject: [PATCH v3.1 0/3] Make core_pattern support namespace

This patchset includes following function points:
1: Let usermodehelper function possible to set pid namespace
done by: [PATCH v3.1 1/3] Make call_usermodehelper_exec possible
to set pid namespace.
2: Let pipe_type core_pattern write dump into container's rootfs
done by: [PATCH v3.1 2/3] Limit dump_pipe program's permission to
init for container.
2: Make separate core_pattern setting for each container
done by: [PATCH v3.1 3/3] Make core_pattern support namespace
3: Compatibility with current system
also included in: [PATCH v3.1 3/3] Make core_pattern support namespace
If container hadn't change core_pattern setting, it will keep
same setting with host.

Test:
1: Pass a test script for each function of this patchset
## TEST IN HOST ##
[root@kerneldev dumptest]# ./test_host
Set file core_pattern: OK
./test_host: line 41: 2366 Segmentation fault (core dumped) "$SCRI=
PT_BASE_DIR"/make_dump
Checking dumpfile: OK
Set file core_pattern: OK
./test_host: line 41: 2369 Segmentation fault (core dumped) "$SCRI=
PT_BASE_DIR"/make_dump
Checking dump_pipe triggered: OK
Checking rootfs: OK
Checking dumpfile: OK
Checking namespace: OK
Checking process list: OK
Checking capabilities: OK

## TEST IN GUEST ##
# ./test
Segmentation fault (core dumped)
Checking dump_pipe triggered: OK
Checking rootfs: OK
Checking dumpfile: OK
Checking namespace: OK
Checking process list: OK
Checking cg pids: OK
Checking capabilities: OK
[ 64.940734] make_dump[2432]: segfault at 0 ip 000000000040049d sp 000=
07ffc4af025f0 error 6 in make_dump[400000+a6000]
#
2: Pass other test(which is not easy to do in script) by hand.

Changelog v3-v3.1:
1. Switch "pwd" of pipe program to container's root fs.
2. Rebase on top of v4.9-rc1

Changelog v2->v3:
1: Fix problem of setting pid namespace, pointed out by:
Andrei Vagin <[email protected]>

Changelog v1(RFC)->v2:
1: Add [PATCH 2/2] which was todo in [RFC v1].
2: Pass a test script for each function.
3: Rebase on top of v4.7.

Suggested-by: Eric W. Biederman <[email protected]>
Suggested-by: KOSAKI Motohiro <[email protected]>
Signed-off-by: Zhao Lei <[email protected]>

Zhao Lei (3):
Make call_usermodehelper_exec possible to set pid namespace
Limit dump_pipe program's permission to init for container
Make core_pattern support namespace

fs/coredump.c | 127 +++++++++++++++++++++++++++++++++++++---
include/linux/binfmts.h | 1 +
include/linux/kmod.h | 2 +
include/linux/pid_namespace.h | 3 +
init/do_mounts_initrd.c | 3 +-
kernel/kmod.c | 133 +++++++++++++++++++++++++++++++++++++-----
kernel/pid.c | 2 +
kernel/pid_namespace.c | 2 +
kernel/sysctl.c | 50 ++++++++++++++--
lib/kobject_uevent.c | 3 +-
security/keys/request_key.c | 4 +-
11 files changed, 297 insertions(+), 33 deletions(-)

--
2.7.4




2016-10-18 10:06:55

by 曹树烽

[permalink] [raw]
Subject: [PATCH v3.1 2/3] Limit dump_pipe program's permission to init for container

From: Zhao Lei <[email protected]>

Currently when we set core_pattern to a pipe, the pipe program is
forked by kthread running with root's permission, and write dumpfile
into host's filesystem.
Same thing happened for container, the dumper and dumpfile are also
in host(not in container).

It have following program:
1: Not consistent with file_type core_pattern
When we set core_pattern to a file, the container will write dump
into container's filesystem instead of host.
2: Not safe for privileged container
In a privileged container, user can destroy host system by following
command:
# # In a container
# echo "|/bin/dd of=/boot/vmlinuz" >/proc/sys/kernel/core_pattern
# make_dump

This patch switch dumper program's environment to init task, so, for
container, dumper program have same environment with init task in
container, which make dumper program put in container's filesystem, and
write coredump into container's filesystem.
The dumper's permission is also limited into subset of container's init
process.

Suggested-by: Eric W. Biederman <[email protected]>
Suggested-by: KOSAKI Motohiro <[email protected]>

Signed-off-by: Zhao Lei <[email protected]>
Co-Authored-By: Cao ShuFeng<[email protected]>
---
fs/coredump.c | 103 +++++++++++++++++++++++++++++++++++++++++++++++-
include/linux/binfmts.h | 1 +
2 files changed, 102 insertions(+), 2 deletions(-)

diff --git a/fs/coredump.c b/fs/coredump.c
index ceb0ee8..8ea8bc1 100644
--- a/fs/coredump.c
+++ b/fs/coredump.c
@@ -501,6 +501,23 @@ static void wait_for_dump_helpers(struct file *file)
pipe_unlock(pipe);
}

+static int umh_ns_setup(struct subprocess_info *info)
+{
+ struct coredump_params *cp = (struct coredump_params *)info->data;
+ struct task_struct *base_task = cp->base_task;
+
+ if (base_task) {
+ /* Set namespaces to base_task */
+ get_nsproxy(base_task->nsproxy);
+ switch_task_namespaces(current, base_task->nsproxy);
+
+ /* Return -EAGAIN to notice caller to refork */
+ return -EAGAIN;
+ }
+
+ return 0;
+}
+
/*
* umh_pipe_setup
* helper function to customize the process used
@@ -516,6 +533,8 @@ static int umh_pipe_setup(struct subprocess_info *info, struct cred *new)
{
struct file *files[2];
struct coredump_params *cp = (struct coredump_params *)info->data;
+ struct task_struct *base_task;
+
int err = create_pipe_files(files, 0);
if (err)
return err;
@@ -524,10 +543,76 @@ static int umh_pipe_setup(struct subprocess_info *info, struct cred *new)

err = replace_fd(0, files[0], 0);
fput(files[0]);
+ if (err)
+ return err;
+
/* and disallow core files too */
current->signal->rlim[RLIMIT_CORE] = (struct rlimit){1, 1};

- return err;
+ base_task = cp->base_task;
+ if (base_task) {
+ const struct cred *base_cred;
+
+ /* Set fs_root to base_task */
+ spin_lock(&base_task->fs->lock);
+ set_fs_root(current->fs, &base_task->fs->root);
+ set_fs_pwd(current->fs, &base_task->fs->pwd);
+ spin_unlock(&base_task->fs->lock);
+
+ /* Set cgroup to base_task */
+ current->flags &= ~PF_NO_SETAFFINITY;
+ err = cgroup_attach_task_all(base_task, current);
+ if (err < 0)
+ return err;
+
+ /* Set cred to base_task */
+ base_cred = get_task_cred(base_task);
+
+ new->uid = base_cred->uid;
+ new->gid = base_cred->gid;
+ new->suid = base_cred->suid;
+ new->sgid = base_cred->sgid;
+ new->euid = base_cred->euid;
+ new->egid = base_cred->egid;
+ new->fsuid = base_cred->fsuid;
+ new->fsgid = base_cred->fsgid;
+
+ new->securebits = base_cred->securebits;
+
+ new->cap_inheritable = base_cred->cap_inheritable;
+ new->cap_permitted = base_cred->cap_permitted;
+ new->cap_effective = base_cred->cap_effective;
+ new->cap_bset = base_cred->cap_bset;
+ new->cap_ambient = base_cred->cap_ambient;
+
+ security_cred_free(new);
+#ifdef CONFIG_SECURITY
+ new->security = NULL;
+#endif
+ err = security_prepare_creds(new, base_cred, GFP_KERNEL);
+ if (err < 0) {
+ put_cred(base_cred);
+ return err;
+ }
+
+ free_uid(new->user);
+ new->user = base_cred->user;
+ get_uid(new->user);
+
+ put_user_ns(new->user_ns);
+ new->user_ns = base_cred->user_ns;
+ get_user_ns(new->user_ns);
+
+ put_group_info(new->group_info);
+ new->group_info = base_cred->group_info;
+ get_group_info(new->group_info);
+
+ put_cred(base_cred);
+
+ validate_creds(new);
+ }
+
+ return 0;
}

void do_coredump(const siginfo_t *siginfo)
@@ -590,6 +675,7 @@ void do_coredump(const siginfo_t *siginfo)

if (ispipe) {
int dump_count;
+ struct task_struct *vinit_task;
char **helper_argv;
struct subprocess_info *sub_info;

@@ -631,6 +717,14 @@ void do_coredump(const siginfo_t *siginfo)
goto fail_dropcount;
}

+ rcu_read_lock();
+ vinit_task = find_task_by_vpid(1);
+ rcu_read_unlock();
+ if (!vinit_task) {
+ printk(KERN_WARNING "failed getting init task info, skipping core dump\n");
+ goto fail_dropcount;
+ }
+
helper_argv = argv_split(GFP_KERNEL, cn.corename, NULL);
if (!helper_argv) {
printk(KERN_WARNING "%s failed to allocate memory\n",
@@ -638,15 +732,20 @@ void do_coredump(const siginfo_t *siginfo)
goto fail_dropcount;
}

+ get_task_struct(vinit_task);
+
+ cprm.base_task = vinit_task;
+
retval = -ENOMEM;
sub_info = call_usermodehelper_setup(helper_argv[0],
helper_argv, NULL, GFP_KERNEL,
- NULL, umh_pipe_setup,
+ umh_ns_setup, umh_pipe_setup,
NULL, &cprm);
if (sub_info)
retval = call_usermodehelper_exec(sub_info,
UMH_WAIT_EXEC);

+ put_task_struct(vinit_task);
argv_free(helper_argv);
if (retval) {
printk(KERN_INFO "Core dump to |%s pipe failed\n",
diff --git a/include/linux/binfmts.h b/include/linux/binfmts.h
index 1303b57..193d269 100644
--- a/include/linux/binfmts.h
+++ b/include/linux/binfmts.h
@@ -59,6 +59,7 @@ struct linux_binprm {

/* Function parameter for binfmt->coredump */
struct coredump_params {
+ struct task_struct *base_task;
const siginfo_t *siginfo;
struct pt_regs *regs;
struct file *file;
--
2.7.4



2016-10-18 10:07:14

by 曹树烽

[permalink] [raw]
Subject: [PATCH v3.1 1/3] Make call_usermodehelper_exec possible to set pid namespace

From: Zhao Lei <[email protected]>

Current call_usermodehelper_exec() can not set pid namespace for
the executed program, because we need addition fork to make pid
namespace active.

This patch add above function for call_usermodehelper_exec().
When init_intermediate callback return -EAGAIN, the usermodehelper
will fork again to make pid namespace active, and run program
in the child process.

This function is helpful for coredump to run pipe_program in
specific container environment.

Signed-off-by: Zhao Lei <[email protected]>
---
fs/coredump.c | 3 +-
include/linux/kmod.h | 2 +
init/do_mounts_initrd.c | 3 +-
kernel/kmod.c | 133 ++++++++++++++++++++++++++++++++++++++------
lib/kobject_uevent.c | 3 +-
security/keys/request_key.c | 4 +-
6 files changed, 127 insertions(+), 21 deletions(-)

diff --git a/fs/coredump.c b/fs/coredump.c
index 281b768..ceb0ee8 100644
--- a/fs/coredump.c
+++ b/fs/coredump.c
@@ -641,7 +641,8 @@ void do_coredump(const siginfo_t *siginfo)
retval = -ENOMEM;
sub_info = call_usermodehelper_setup(helper_argv[0],
helper_argv, NULL, GFP_KERNEL,
- umh_pipe_setup, NULL, &cprm);
+ NULL, umh_pipe_setup,
+ NULL, &cprm);
if (sub_info)
retval = call_usermodehelper_exec(sub_info,
UMH_WAIT_EXEC);
diff --git a/include/linux/kmod.h b/include/linux/kmod.h
index fcfd2bf..8fb8c0e 100644
--- a/include/linux/kmod.h
+++ b/include/linux/kmod.h
@@ -61,6 +61,7 @@ struct subprocess_info {
char **envp;
int wait;
int retval;
+ int (*init_intermediate)(struct subprocess_info *info);
int (*init)(struct subprocess_info *info, struct cred *new);
void (*cleanup)(struct subprocess_info *info);
void *data;
@@ -71,6 +72,7 @@ call_usermodehelper(char *path, char **argv, char **envp, int wait);

extern struct subprocess_info *
call_usermodehelper_setup(char *path, char **argv, char **envp, gfp_t gfp_mask,
+ int (*init_intermediate)(struct subprocess_info *info),
int (*init)(struct subprocess_info *info, struct cred *new),
void (*cleanup)(struct subprocess_info *), void *data);

diff --git a/init/do_mounts_initrd.c b/init/do_mounts_initrd.c
index a1000ca..bb5dce5 100644
--- a/init/do_mounts_initrd.c
+++ b/init/do_mounts_initrd.c
@@ -72,7 +72,8 @@ static void __init handle_initrd(void)
current->flags |= PF_FREEZER_SKIP;

info = call_usermodehelper_setup("/linuxrc", argv, envp_init,
- GFP_KERNEL, init_linuxrc, NULL, NULL);
+ GFP_KERNEL, NULL, init_linuxrc, NULL,
+ NULL);
if (!info)
return;
call_usermodehelper_exec(info, UMH_WAIT_PROC);
diff --git a/kernel/kmod.c b/kernel/kmod.c
index 0277d12..30a5802 100644
--- a/kernel/kmod.c
+++ b/kernel/kmod.c
@@ -91,7 +91,7 @@ static int call_modprobe(char *module_name, int wait)
argv[4] = NULL;

info = call_usermodehelper_setup(modprobe_path, argv, envp, GFP_KERNEL,
- NULL, free_modprobe_argv, NULL);
+ NULL, NULL, free_modprobe_argv, NULL);
if (!info)
goto free_module_name;

@@ -209,14 +209,11 @@ static void umh_complete(struct subprocess_info *sub_info)
call_usermodehelper_freeinfo(sub_info);
}

-/*
- * This is the task which runs the usermode application
- */
-static int call_usermodehelper_exec_async(void *data)
+static int __call_usermodehelper_exec_doexec(void *data)
{
struct subprocess_info *sub_info = data;
struct cred *new;
- int retval;
+ int retval = 0;

spin_lock_irq(&current->sighand->siglock);
flush_signal_handlers(current, 1);
@@ -228,10 +225,11 @@ static int call_usermodehelper_exec_async(void *data)
*/
set_user_nice(current, 0);

- retval = -ENOMEM;
new = prepare_kernel_cred(current);
- if (!new)
+ if (!new) {
+ retval = -ENOMEM;
goto out;
+ }

spin_lock(&umh_sysctl_lock);
new->cap_bset = cap_intersect(usermodehelper_bset, new->cap_bset);
@@ -248,20 +246,121 @@ static int call_usermodehelper_exec_async(void *data)
}

commit_creds(new);
-
retval = do_execve(getname_kernel(sub_info->path),
- (const char __user *const __user *)sub_info->argv,
- (const char __user *const __user *)sub_info->envp);
+ (const char __user *const __user *)sub_info->argv,
+ (const char __user *const __user *)sub_info->envp);
+
out:
- sub_info->retval = retval;
+ return retval;
+}
+
+static int call_usermodehelper_exec_doexec(void *data)
+{
+ struct subprocess_info *sub_info = data;
+ int ret = __call_usermodehelper_exec_doexec(data);
+
/*
- * call_usermodehelper_exec_sync() will call umh_complete
- * if UHM_WAIT_PROC.
+ * If it is called in non-sync mode:
+ * On fail:
+ * should set sub_info->retval, call umh_complete and exit process.
+ * On success:
+ * should set sub_info->retval, call umh_complete and return to
+ * user-mode program.
+ * It it is called in sync mode:
+ * On fail:
+ * should set sub_info->retval and exit process, the parent process
+ * will wait and call umh_complete()
+ * On success:
+ * should return to user-mode program, the parent process will wait
+ * and get program's ret from wait(), and call umh_complete().
*/
+ sub_info->retval = ret;
+
if (!(sub_info->wait & UMH_WAIT_PROC))
umh_complete(sub_info);
- if (!retval)
+
+ if (!ret)
return 0;
+
+ /*
+ * see comment in call_usermodehelper_exec_sync() before change
+ * it to do_exit(ret).
+ */
+ do_exit(0);
+}
+
+/*
+ * This is the task which runs the usermode application
+ */
+static int call_usermodehelper_exec_async(void *data)
+{
+ struct subprocess_info *sub_info = data;
+ int ret = 0;
+ int refork = 0;
+
+ if (sub_info->init_intermediate) {
+ ret = sub_info->init_intermediate(sub_info);
+ switch (ret) {
+ case 0:
+ break;
+ case -EAGAIN:
+ /*
+ * if pid namespace is changed,
+ * we need refork to make it active.
+ */
+ ret = 0;
+ refork = 1;
+ break;
+ default:
+ goto out;
+ }
+ }
+
+ if (refork) {
+ /*
+ * Code copyed from call_usermodehelper_exec_work() and
+ * call_usermodehelper_exec_sync(), see comments in these
+ * functions for detail.
+ */
+ pid_t pid;
+
+ if (sub_info->wait & UMH_WAIT_PROC) {
+ kernel_sigaction(SIGCHLD, SIG_DFL);
+ pid = kernel_thread(call_usermodehelper_exec_doexec,
+ sub_info, SIGCHLD);
+ if (pid < 0) {
+ ret = pid;
+ } else {
+ ret = -ECHILD;
+ sys_wait4(pid, (int __user *)&ret, 0, NULL);
+ }
+ kernel_sigaction(SIGCHLD, SIG_IGN);
+
+ sub_info->retval = ret;
+ } else {
+ pid = kernel_thread(call_usermodehelper_exec_doexec,
+ sub_info, SIGCHLD);
+ if (pid < 0) {
+ sub_info->retval = pid;
+ umh_complete(sub_info);
+ }
+ }
+ } else {
+ ret = __call_usermodehelper_exec_doexec(data);
+
+out:
+ /*
+ * see comment in call_usermodehelper_exec_doexec() for detail
+ */
+ sub_info->retval = ret;
+
+ if (!(sub_info->wait & UMH_WAIT_PROC))
+ umh_complete(sub_info);
+
+ if (!ret)
+ return 0;
+ }
+
do_exit(0);
}

@@ -518,6 +617,7 @@ static void helper_unlock(void)
*/
struct subprocess_info *call_usermodehelper_setup(char *path, char **argv,
char **envp, gfp_t gfp_mask,
+ int (*init_intermediate)(struct subprocess_info *info),
int (*init)(struct subprocess_info *info, struct cred *new),
void (*cleanup)(struct subprocess_info *info),
void *data)
@@ -533,6 +633,7 @@ struct subprocess_info *call_usermodehelper_setup(char *path, char **argv,
sub_info->envp = envp;

sub_info->cleanup = cleanup;
+ sub_info->init_intermediate = init_intermediate;
sub_info->init = init;
sub_info->data = data;
out:
@@ -619,7 +720,7 @@ int call_usermodehelper(char *path, char **argv, char **envp, int wait)
gfp_t gfp_mask = (wait == UMH_NO_WAIT) ? GFP_ATOMIC : GFP_KERNEL;

info = call_usermodehelper_setup(path, argv, envp, gfp_mask,
- NULL, NULL, NULL);
+ NULL, NULL, NULL, NULL);
if (info == NULL)
return -ENOMEM;

diff --git a/lib/kobject_uevent.c b/lib/kobject_uevent.c
index f6c2c1e..7e571f0 100644
--- a/lib/kobject_uevent.c
+++ b/lib/kobject_uevent.c
@@ -345,7 +345,8 @@ int kobject_uevent_env(struct kobject *kobj, enum kobject_action action,
retval = -ENOMEM;
info = call_usermodehelper_setup(env->argv[0], env->argv,
env->envp, GFP_KERNEL,
- NULL, cleanup_uevent_env, env);
+ NULL, NULL, cleanup_uevent_env,
+ env);
if (info) {
retval = call_usermodehelper_exec(info, UMH_NO_WAIT);
env = NULL; /* freed by cleanup_uevent_env */
diff --git a/security/keys/request_key.c b/security/keys/request_key.c
index 43affcf..3abff78 100644
--- a/security/keys/request_key.c
+++ b/security/keys/request_key.c
@@ -78,8 +78,8 @@ static int call_usermodehelper_keys(char *path, char **argv, char **envp,
struct subprocess_info *info;

info = call_usermodehelper_setup(path, argv, envp, GFP_KERNEL,
- umh_keys_init, umh_keys_cleanup,
- session_keyring);
+ NULL, umh_keys_init, umh_keys_cleanup,
+ session_keyring);
if (!info)
return -ENOMEM;

--
2.7.4



2016-10-18 10:06:51

by 曹树烽

[permalink] [raw]
Subject: [PATCH v3.1 3/3] Make core_pattern support namespace

From: Zhao Lei <[email protected]>

Currently, each container shared one copy of coredump setting
with the host system, if host system changed the setting, each
running containers will be affected.
Same story happened when container changed core_pattern, both
host and other container will be affected.

For container based on namespace design, it is good to allow
each container keeping their own coredump setting.

It will bring us following benefit:
1: Each container can change their own coredump setting
based on operation on /proc/sys/kernel/core_pattern
2: Coredump setting changed in host will not affect
running containers.
3: Support both case of "putting coredump in guest" and
"putting curedump in host".

Each namespace-based software(lxc, docker, ..) can use this function
to custom their dump setting.

And this function makes each continer working as separate system,
it fit for design goal of namespace.

Test(in lxc):
# In the host
# ----------------
# echo host_core >/proc/sys/kernel/core_pattern
# cat /proc/sys/kernel/core_pattern
host_core
# ulimit -c 1024000
# ./make_dump
Segmentation fault (core dumped)
# ls -l
-rw------- 1 root root 331776 Feb 4 18:02 host_core.2175
-rwxr-xr-x 1 root root 759731 Feb 4 18:01 make_dump
#

# In the container
# ----------------
# cat /proc/sys/kernel/core_pattern
host_core
# echo container_core >/proc/sys/kernel/core_pattern
# ./make_dump
Segmentation fault (core dumped)
# ls -l
-rwxr-xr-x 1 root root 759731 Feb 4 10:45 make_dump
-rw------- 1 root root 331776 Feb 4 10:45 container_core.16
#

# Return to host
# ----------------
# cat /proc/sys/kernel/core_pattern
host_core
# ls
host_core.2175 make_dump make_dump.c
# rm -f host_core.2175
# ./make_dump
Segmentation fault (core dumped)
# ls -l
-rw------- 1 root root 331776 Feb 4 18:49 host_core.2351
-rwxr-xr-x 1 root root 759731 Feb 4 18:01 make_dump
#

Signed-off-by: Zhao Lei <[email protected]>
---
fs/coredump.c | 25 ++++++++++++++++------
include/linux/pid_namespace.h | 3 +++
kernel/pid.c | 2 ++
kernel/pid_namespace.c | 2 ++
kernel/sysctl.c | 50 ++++++++++++++++++++++++++++++++++++++-----
5 files changed, 70 insertions(+), 12 deletions(-)

diff --git a/fs/coredump.c b/fs/coredump.c
index 8ea8bc1..b854541 100644
--- a/fs/coredump.c
+++ b/fs/coredump.c
@@ -49,7 +49,6 @@

int core_uses_pid;
unsigned int core_pipe_limit;
-char core_pattern[CORENAME_MAX_SIZE] = "core";
static int core_name_size = CORENAME_MAX_SIZE;

struct core_name {
@@ -57,8 +56,6 @@ struct core_name {
int used, size;
};

-/* The maximal length of core_pattern is also specified in sysctl.c */
-
static int expand_corename(struct core_name *cn, int size)
{
char *corename = krealloc(cn->corename, size, GFP_KERNEL);
@@ -183,10 +180,10 @@ static int cn_print_exe_file(struct core_name *cn)
* name into corename, which must have space for at least
* CORENAME_MAX_SIZE bytes plus one byte for the zero terminator.
*/
-static int format_corename(struct core_name *cn, struct coredump_params *cprm)
+static int format_corename(struct core_name *cn, const char *pat_ptr,
+ struct coredump_params *cprm)
{
const struct cred *cred = current_cred();
- const char *pat_ptr = core_pattern;
int ispipe = (*pat_ptr == '|');
int pid_in_pattern = 0;
int err = 0;
@@ -641,6 +638,8 @@ void do_coredump(const siginfo_t *siginfo)
*/
.mm_flags = mm->flags,
};
+ struct pid_namespace *pid_ns;
+ char core_pattern[CORENAME_MAX_SIZE];

audit_core_dumps(siginfo->si_signo);

@@ -650,6 +649,18 @@ void do_coredump(const siginfo_t *siginfo)
if (!__get_dumpable(cprm.mm_flags))
goto fail;

+ pid_ns = task_active_pid_ns(current);
+ spin_lock(&pid_ns->core_pattern_lock);
+ while (pid_ns != &init_pid_ns) {
+ if (pid_ns->core_pattern[0])
+ break;
+ spin_unlock(&pid_ns->core_pattern_lock);
+ pid_ns = pid_ns->parent,
+ spin_lock(&pid_ns->core_pattern_lock);
+ }
+ strcpy(core_pattern, pid_ns->core_pattern);
+ spin_unlock(&pid_ns->core_pattern_lock);
+
cred = prepare_creds();
if (!cred)
goto fail;
@@ -671,7 +682,7 @@ void do_coredump(const siginfo_t *siginfo)

old_cred = override_creds(cred);

- ispipe = format_corename(&cn, &cprm);
+ ispipe = format_corename(&cn, core_pattern, &cprm);

if (ispipe) {
int dump_count;
@@ -718,7 +729,7 @@ void do_coredump(const siginfo_t *siginfo)
}

rcu_read_lock();
- vinit_task = find_task_by_vpid(1);
+ vinit_task = find_task_by_pid_ns(1, pid_ns);
rcu_read_unlock();
if (!vinit_task) {
printk(KERN_WARNING "failed getting init task info, skipping core dump\n");
diff --git a/include/linux/pid_namespace.h b/include/linux/pid_namespace.h
index 34cce96..f0b0c21 100644
--- a/include/linux/pid_namespace.h
+++ b/include/linux/pid_namespace.h
@@ -9,6 +9,7 @@
#include <linux/nsproxy.h>
#include <linux/kref.h>
#include <linux/ns_common.h>
+#include <linux/binfmts.h>

struct pidmap {
atomic_t nr_free;
@@ -46,6 +47,8 @@ struct pid_namespace {
int hide_pid;
int reboot; /* group exit code if this pidns was rebooted */
struct ns_common ns;
+ spinlock_t core_pattern_lock;
+ char core_pattern[CORENAME_MAX_SIZE];
};

extern struct pid_namespace init_pid_ns;
diff --git a/kernel/pid.c b/kernel/pid.c
index f66162f..e7ee122 100644
--- a/kernel/pid.c
+++ b/kernel/pid.c
@@ -83,6 +83,8 @@ struct pid_namespace init_pid_ns = {
#ifdef CONFIG_PID_NS
.ns.ops = &pidns_operations,
#endif
+ .core_pattern_lock = __SPIN_LOCK_UNLOCKED(init_pid_ns.core_pattern_lock),
+ .core_pattern = "core",
};
EXPORT_SYMBOL_GPL(init_pid_ns);

diff --git a/kernel/pid_namespace.c b/kernel/pid_namespace.c
index df9e8e9..c70ca5d 100644
--- a/kernel/pid_namespace.c
+++ b/kernel/pid_namespace.c
@@ -137,6 +137,8 @@ static struct pid_namespace *create_pid_namespace(struct user_namespace *user_ns
for (i = 1; i < PIDMAP_ENTRIES; i++)
atomic_set(&ns->pidmap[i].nr_free, BITS_PER_PAGE);

+ spin_lock_init(&ns->core_pattern_lock);
+
return ns;

out_free_map:
diff --git a/kernel/sysctl.c b/kernel/sysctl.c
index 706309f..c73ced5 100644
--- a/kernel/sysctl.c
+++ b/kernel/sysctl.c
@@ -484,7 +484,7 @@ static struct ctl_table kern_table[] = {
},
{
.procname = "core_pattern",
- .data = core_pattern,
+ .data = NULL,
.maxlen = CORENAME_MAX_SIZE,
.mode = 0644,
.proc_handler = proc_dostring_coredump,
@@ -2401,12 +2401,20 @@ int proc_dointvec_minmax(struct ctl_table *table, int write,
static void validate_coredump_safety(void)
{
#ifdef CONFIG_COREDUMP
+ struct pid_namespace *pid_ns = task_active_pid_ns(current);
+ const char *core_pattern;
+
+ spin_lock(&pid_ns->core_pattern_lock);
+ core_pattern = pid_ns->core_pattern;
+
if (suid_dumpable == SUID_DUMP_ROOT &&
core_pattern[0] != '/' && core_pattern[0] != '|') {
printk(KERN_WARNING "Unsafe core_pattern used with "\
"suid_dumpable=2. Pipe handler or fully qualified "\
"core dump path required.\n");
}
+
+ spin_unlock(&pid_ns->core_pattern_lock);
#endif
}

@@ -2423,10 +2431,42 @@ static int proc_dointvec_minmax_coredump(struct ctl_table *table, int write,
static int proc_dostring_coredump(struct ctl_table *table, int write,
void __user *buffer, size_t *lenp, loff_t *ppos)
{
- int error = proc_dostring(table, write, buffer, lenp, ppos);
- if (!error)
- validate_coredump_safety();
- return error;
+ int ret;
+ char core_pattern[CORENAME_MAX_SIZE];
+ struct pid_namespace *pid_ns = task_active_pid_ns(current);
+
+ if (write) {
+ if (*ppos && sysctl_writes_strict == SYSCTL_WRITES_WARN)
+ warn_sysctl_write(table);
+
+ ret = _proc_do_string(core_pattern, table->maxlen, write,
+ (char __user *)buffer, lenp, ppos);
+ if (ret)
+ return ret;
+
+ spin_lock(&pid_ns->core_pattern_lock);
+ strcpy(pid_ns->core_pattern, core_pattern);
+ spin_unlock(&pid_ns->core_pattern_lock);
+ } else {
+ spin_lock(&pid_ns->core_pattern_lock);
+ while (pid_ns != &init_pid_ns) {
+ if (pid_ns->core_pattern[0])
+ break;
+ spin_unlock(&pid_ns->core_pattern_lock);
+ pid_ns = pid_ns->parent,
+ spin_lock(&pid_ns->core_pattern_lock);
+ }
+ strcpy(core_pattern, pid_ns->core_pattern);
+ spin_unlock(&pid_ns->core_pattern_lock);
+
+ ret = _proc_do_string(core_pattern, table->maxlen, write,
+ (char __user *)buffer, lenp, ppos);
+ if (ret)
+ return ret;
+ }
+
+ validate_coredump_safety();
+ return 0;
}
#endif

--
2.7.4