2012-10-04 21:56:58

by Andrei Vagin

Subject: [PATCH] pidns: remove recursion from free_pid_ns

Here is a stack trace of the recursion:
free_pid_ns(parent)
  put_pid_ns(parent)
    kref_put(&ns->kref, free_pid_ns);
      free_pid_ns

This patch turns the recursion into a loop.

Pid namespaces can be nested many levels deep, so with the recursive
version a simple user-space program can provoke a kernel panic by
overflowing the kernel stack.

Cc: Andrew Morton <[email protected]>
Cc: Oleg Nesterov <[email protected]>
Cc: "Eric W. Biederman" <[email protected]>
Cc: Cyrill Gorcunov <[email protected]>
Cc: Pavel Emelyanov <[email protected]>
Signed-off-by: Andrew Vagin <[email protected]>
---
include/linux/kref.h | 12 ++++++++++++
kernel/pid_namespace.c | 17 +++++++++++++----
2 files changed, 25 insertions(+), 4 deletions(-)

diff --git a/include/linux/kref.h b/include/linux/kref.h
index 65af688..d234199 100644
--- a/include/linux/kref.h
+++ b/include/linux/kref.h
@@ -95,6 +95,18 @@ static inline int kref_put(struct kref *kref, void (*release)(struct kref *kref)
 	return kref_sub(kref, 1, release);
 }

+/**
+ * __kref_put - decrement refcount for object.
+ * @kref: object.
+ *
+ * Decrement the refcount.
+ * Return 1 if the refcount reached zero.
+ */
+static inline int __kref_put(struct kref *kref)
+{
+	return atomic_sub_and_test(1, &kref->refcount);
+}
+
 static inline int kref_put_mutex(struct kref *kref,
 				 void (*release)(struct kref *kref),
 				 struct mutex *lock)
diff --git a/kernel/pid_namespace.c b/kernel/pid_namespace.c
index b17bf93..632eb88 100644
--- a/kernel/pid_namespace.c
+++ b/kernel/pid_namespace.c
@@ -138,11 +138,20 @@ void free_pid_ns(struct kref *kref)

 	ns = container_of(kref, struct pid_namespace, kref);

-	parent = ns->parent;
-	destroy_pid_namespace(ns);
+	while (1) {

-	if (parent != NULL)
-		put_pid_ns(parent);
+		parent = ns->parent;
+		destroy_pid_namespace(ns);
+
+		if (parent == NULL || parent == &init_pid_ns)
+			break;
+
+		/* kref_put cannot be used for avoiding recursion */
+		if (__kref_put(&parent->kref) == 0)
+			break;
+
+		ns = parent;
+	}
 }

 void zap_pid_ns_processes(struct pid_namespace *pid_ns)
--
1.7.1


2012-10-05 06:43:05

by Cyrill Gorcunov

Subject: Re: [PATCH] pidns: remove recursion from free_pid_ns

On Fri, Oct 05, 2012 at 01:21:02AM +0400, Andrew Vagin wrote:
> Here is a stack trace of the recursion:
> free_pid_ns(parent)
>   put_pid_ns(parent)
>     kref_put(&ns->kref, free_pid_ns);
>       free_pid_ns
>
> This patch turns the recursion into a loop.
>
> Pid namespaces can be nested many levels deep, so with the recursive
> version a simple user-space program can provoke a kernel panic by
> overflowing the kernel stack.

Acked-by: Cyrill Gorcunov <[email protected]>

Looks good to me. Thanks Andrew!

2012-10-05 14:45:58

by Oleg Nesterov

Subject: Re: [PATCH] pidns: remove recursion from free_pid_ns

On 10/05, Andrew Vagin wrote:
>
> Here is a stack trace of the recursion:
> free_pid_ns(parent)
>   put_pid_ns(parent)
>     kref_put(&ns->kref, free_pid_ns);
>       free_pid_ns
>
> This patch turns the recursion into a loop.

I think the patch is correct; just a couple of minor nits.

> +static inline int __kref_put(struct kref *kref)
> +{
> +	return atomic_sub_and_test(1, &kref->refcount);

Perhaps atomic_dec_and_test(&kref->refcount) makes more sense?
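
For readers comparing the two calls: in the kernel, atomic_dec_and_test(v)
gives the same result as atomic_sub_and_test(1, v); both report whether the
counter reached zero. Below is a minimal user-space sketch of that
equivalence using C11 atomics. The model_* helpers and check_helpers_agree()
are hypothetical stand-ins written for this illustration, not the kernel
implementations.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical model: subtract i and report whether the result is zero. */
static bool model_sub_and_test(int i, atomic_int *v)
{
	/* atomic_fetch_sub() returns the value held before the subtraction. */
	return atomic_fetch_sub(v, i) - i == 0;
}

/* Hypothetical model: decrement by one and report whether the result is zero. */
static bool model_dec_and_test(atomic_int *v)
{
	return atomic_fetch_sub(v, 1) == 1;
}

/*
 * Drop `refs` references through each helper in lockstep and assert that
 * both report "reached zero" on exactly the same step. Returns the number
 * of the step on which zero was reported.
 */
static int check_helpers_agree(int refs)
{
	atomic_int a, b;
	int zero_step = 0;

	atomic_init(&a, refs);
	atomic_init(&b, refs);

	for (int step = 1; step <= refs; step++) {
		bool s = model_sub_and_test(1, &a);
		bool d = model_dec_and_test(&b);

		assert(s == d);
		if (s)
			zero_step = step;
	}
	return zero_step;
}
```

With three references, for example, both helpers stay false for the first
two drops and turn true only on the third, so the choice between them is a
matter of readability rather than behaviour.
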

> +}
> @@ -138,11 +138,20 @@ void free_pid_ns(struct kref *kref)
>
>  	ns = container_of(kref, struct pid_namespace, kref);
>
> -	parent = ns->parent;
> -	destroy_pid_namespace(ns);
> +	while (1) {
>
> -	if (parent != NULL)
> -		put_pid_ns(parent);
> +		parent = ns->parent;
> +		destroy_pid_namespace(ns);
> +
> +		if (parent == NULL || parent == &init_pid_ns)
		    ^^^^^^^^^^^^^^

Why this check? ns->parent == NULL is only possible if ns == &init_pid_ns,
right? But in that case we should not be here. The caller verifies that the
initial ns != &init_pid_ns, and this loop should stop once we reach
init_pid_ns.

Oleg.
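
The termination argument above can be checked on a small user-space model.
This is only a sketch under stated assumptions: struct model_ns,
model_init_ns, and the model_* functions are hypothetical names invented for
the illustration, the refcount is a plain int rather than an atomic kref,
and, like get_pid_ns()/put_pid_ns(), the model never counts references on
the root namespace. The loop in model_free() has no NULL test on the parent
and still stops at the root.

```c
#include <assert.h>
#include <stdlib.h>

struct model_ns {
	int refcount;             /* plain int; the kernel uses a kref */
	struct model_ns *parent;
};

/* Stand-in for init_pid_ns: the only namespace whose parent is NULL. */
static struct model_ns model_init_ns = { .refcount = 1, .parent = NULL };
static int destroyed;             /* namespaces freed so far */

static struct model_ns *model_create(struct model_ns *parent)
{
	struct model_ns *ns = malloc(sizeof(*ns));

	if (!ns)
		abort();
	ns->refcount = 1;
	ns->parent = parent;
	if (parent != &model_init_ns)
		parent->refcount++;   /* a child pins its parent */
	return ns;
}

/* Iterative release; the caller guarantees ns != &model_init_ns. */
static void model_free(struct model_ns *ns)
{
	for (;;) {
		struct model_ns *parent = ns->parent;

		free(ns);
		destroyed++;
		if (parent == &model_init_ns)
			break;        /* reached the root: stop */
		if (--parent->refcount != 0)
			break;        /* parent is still in use */
		ns = parent;          /* parent died too: keep walking up */
	}
}

static void model_put(struct model_ns *ns)
{
	if (ns != &model_init_ns && --ns->refcount == 0)
		model_free(ns);
}

/*
 * Build a chain `depth` levels deep, keeping a reference only to the
 * innermost namespace, then drop it. Returns how many were destroyed.
 */
static int model_release_chain(int depth)
{
	struct model_ns *ns = &model_init_ns;

	destroyed = 0;
	for (int i = 0; i < depth; i++) {
		struct model_ns *child = model_create(ns);

		model_put(ns);        /* the child now holds the parent */
		ns = child;
	}
	model_put(ns);
	return destroyed;
}
```

Because the walk is a loop, the stack depth stays constant however deep the
chain is; releasing a chain of 100000 model namespaces uses the same single
frame as releasing one.
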