2012-10-10 20:43:01

by Cyrill Gorcunov

Subject: [PATCH] pidns: remove recursion from free_pid_ns() v5

free_pid_ns() is currently implemented recursively:

free_pid_ns(parent)
  put_pid_ns(parent)
    kref_put(&ns->kref, free_pid_ns);
      free_pid_ns

Thus, with deeply nested namespaces, userspace can trigger
an avalanche of free_pid_ns calls that exhausts the kernel
stack and eventually causes a panic.

This patch turns the recursion into an iterative loop.

v5 (from oleg@):
- Drop @ret variable
- Make put_pid_ns non-inline since it grows in size,
in turn make free_pid_ns static

Based-on-patch-by: Andrew Vagin <[email protected]>
Signed-off-by: Cyrill Gorcunov <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Oleg Nesterov <[email protected]>
Cc: "Eric W. Biederman" <[email protected]>
Cc: Pavel Emelyanov <[email protected]>
Cc: Greg KH <[email protected]>
---
include/linux/pid_namespace.h | 8 +-------
kernel/pid_namespace.c | 19 +++++++++++++------
2 files changed, 14 insertions(+), 13 deletions(-)

Index: linux-2.6.git/include/linux/pid_namespace.h
===================================================================
--- linux-2.6.git.orig/include/linux/pid_namespace.h
+++ linux-2.6.git/include/linux/pid_namespace.h
@@ -47,15 +47,9 @@ static inline struct pid_namespace *get_
 }

 extern struct pid_namespace *copy_pid_ns(unsigned long flags, struct pid_namespace *ns);
-extern void free_pid_ns(struct kref *kref);
 extern void zap_pid_ns_processes(struct pid_namespace *pid_ns);
 extern int reboot_pid_ns(struct pid_namespace *pid_ns, int cmd);
-
-static inline void put_pid_ns(struct pid_namespace *ns)
-{
-	if (ns != &init_pid_ns)
-		kref_put(&ns->kref, free_pid_ns);
-}
+extern void put_pid_ns(struct pid_namespace *ns);

 #else /* !CONFIG_PID_NS */
 #include <linux/err.h>
Index: linux-2.6.git/kernel/pid_namespace.c
===================================================================
--- linux-2.6.git.orig/kernel/pid_namespace.c
+++ linux-2.6.git/kernel/pid_namespace.c
@@ -132,17 +132,24 @@ struct pid_namespace *copy_pid_ns(unsign
 	return create_pid_namespace(old_ns);
 }

-void free_pid_ns(struct kref *kref)
+static void free_pid_ns(struct kref *kref)
 {
-	struct pid_namespace *ns, *parent;
+	struct pid_namespace *ns;

 	ns = container_of(kref, struct pid_namespace, kref);
-
-	parent = ns->parent;
 	destroy_pid_namespace(ns);
+}
+
+void put_pid_ns(struct pid_namespace *ns)
+{
+	struct pid_namespace *parent;

-	if (parent != NULL)
-		put_pid_ns(parent);
+	while (ns != &init_pid_ns) {
+		parent = ns->parent;
+		if (!kref_put(&ns->kref, free_pid_ns))
+			break;
+		ns = parent;
+	}
 }

 void zap_pid_ns_processes(struct pid_namespace *pid_ns)
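
For readers who want to try the loop outside the kernel, here is a minimal user-space sketch of the same pattern. It is not kernel code: struct ns_model, ref_put(), put_ns_model() and demo_release_chain() are hypothetical names made up for this illustration, a plain counter stands in for the kref, and a NULL parent stands in for init_pid_ns. Each node's single reference is assumed to be held by its child (or by the caller for the innermost node), so dropping the innermost reference releases the whole chain.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical user-space stand-in for struct pid_namespace. */
struct ns_model {
	unsigned int refcount;		/* plays the role of ns->kref */
	struct ns_model *parent;	/* NULL stands in for init_pid_ns */
};

static size_t freed_count;		/* bookkeeping for the demo only */

/* Drop one reference; return 1 if it was the last one and ns was freed. */
static int ref_put(struct ns_model *ns)
{
	assert(ns->refcount > 0);
	if (--ns->refcount != 0)
		return 0;
	free(ns);
	freed_count++;
	return 1;
}

/* Iterative release: walk up the parent chain instead of recursing. */
static void put_ns_model(struct ns_model *ns)
{
	struct ns_model *parent;

	while (ns != NULL) {
		parent = ns->parent;	/* read before ns may be freed */
		if (!ref_put(ns))
			break;		/* someone else still holds ns */
		ns = parent;
	}
}

/*
 * Build a chain of `depth` nested nodes, drop the only reference to the
 * innermost one, and return how many nodes were freed as a result.
 */
size_t demo_release_chain(size_t depth)
{
	struct ns_model *parent = NULL, *ns = NULL;
	size_t i;

	freed_count = 0;
	for (i = 0; i < depth; i++) {
		ns = malloc(sizeof(*ns));
		assert(ns != NULL);
		ns->refcount = 1;	/* held by the child, or by us for the last node */
		ns->parent = parent;
		parent = ns;
	}
	put_ns_model(ns);
	return freed_count;
}
```

Calling demo_release_chain() with a large depth releases every node with constant stack usage, because each step only saves the parent pointer before the current node is dropped.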


2012-10-10 20:54:13

by Andrew Morton

Subject: Re: [PATCH] pidns: remove recursion from free_pid_ns() v5

On Thu, 11 Oct 2012 00:42:56 +0400
Cyrill Gorcunov <[email protected]> wrote:

> free_pid_ns() is currently implemented recursively:
>
> free_pid_ns(parent)
>   put_pid_ns(parent)
>     kref_put(&ns->kref, free_pid_ns);
>       free_pid_ns
>
> Thus, with deeply nested namespaces, userspace can trigger
> an avalanche of free_pid_ns calls that exhausts the kernel
> stack and eventually causes a panic.
>
> This patch turns the recursion into an iterative loop.
>
> v5 (from oleg@):
> - Drop @ret variable
> - Make put_pid_ns non-inline since it grows in size,
> in turn make free_pid_ns static

OK, let's try that. I'll sit on this until -rc2 to give it a bit of
time to cook.

A -stable backport might be needed. What capabilities does userspace
need to be able to trigger the kernel stack overflow?

2012-10-10 20:59:52

by Eric W. Biederman

Subject: Re: [PATCH] pidns: remove recursion from free_pid_ns() v5

Andrew Morton <[email protected]> writes:

> On Thu, 11 Oct 2012 00:42:56 +0400
> Cyrill Gorcunov <[email protected]> wrote:
>
>> free_pid_ns() is currently implemented recursively:
>>
>> free_pid_ns(parent)
>>   put_pid_ns(parent)
>>     kref_put(&ns->kref, free_pid_ns);
>>       free_pid_ns
>>
>> Thus, with deeply nested namespaces, userspace can trigger
>> an avalanche of free_pid_ns calls that exhausts the kernel
>> stack and eventually causes a panic.
>>
>> This patch turns the recursion into an iterative loop.
>>
>> v5 (from oleg@):
>> - Drop @ret variable
>> - Make put_pid_ns non-inline since it grows in size,
>> in turn make free_pid_ns static
>
> OK, let's try that. I'll sit on this until -rc2 to give it a bit of
> time to cook.
>
> A -stable backport might be needed. What capabilities does userspace
> need to be able to trigger the kernel stack overflow?

CAP_SYS_ADMIN is required to create a new pid namespace today.

With a little luck, the user namespace bits that allow unprivileged
creation of pid namespaces will be ready for 3.8.

Eric
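
The privilege requirement can be probed from userspace with a small sketch like the one below. It assumes a kernel where unshare(2) accepts CLONE_NEWPID (Linux 3.8 or later, which postdates this thread); probe_new_pidns() is a hypothetical helper name, not an existing API. The attempt runs in a forked child so the caller's own namespace setup is left untouched.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <sched.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: returns 0 if a new pid namespace could be created,
 * a positive errno value if the kernel refused (EPERM is expected without
 * CAP_SYS_ADMIN), or -1 if the probe itself could not be run.
 */
int probe_new_pidns(void)
{
	int status;
	pid_t pid = fork();

	if (pid < 0)
		return -1;

	if (pid == 0) {
		/* Child: try to create the namespace, report errno via exit status. */
		if (unshare(CLONE_NEWPID) == 0)
			_exit(0);
		_exit(errno & 0xff);
	}

	if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
		return -1;

	return WEXITSTATUS(status);
}
```

Run as an ordinary user, the helper would be expected to return EPERM; run with CAP_SYS_ADMIN it would be expected to return 0.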

2012-10-10 21:14:38

by Cyrill Gorcunov

Subject: Re: [PATCH] pidns: remove recursion from free_pid_ns() v5

On Wed, Oct 10, 2012 at 01:54:08PM -0700, Andrew Morton wrote:
> On Thu, 11 Oct 2012 00:42:56 +0400
> Cyrill Gorcunov <[email protected]> wrote:
>
> > free_pid_ns() is currently implemented recursively:
> >
> > free_pid_ns(parent)
> >   put_pid_ns(parent)
> >     kref_put(&ns->kref, free_pid_ns);
> >       free_pid_ns
> >
> > Thus, with deeply nested namespaces, userspace can trigger
> > an avalanche of free_pid_ns calls that exhausts the kernel
> > stack and eventually causes a panic.
> >
> > This patch turns the recursion into an iterative loop.
> >
> > v5 (from oleg@):
> > - Drop @ret variable
> > - Make put_pid_ns non-inline since it grows in size,
> > in turn make free_pid_ns static
>
> OK, let's try that. I'll sit on this until -rc2 to give it a bit of
> time to cook.
>
> A -stable backport might be needed. What capabilities does userspace
> need to be able to trigger the kernel stack overflow?

I believe it will apply to -stable even in its current form. As Eric
mentioned, CAP_SYS_ADMIN is required, so I don't think it's that urgent.