2013-06-10 21:58:37

by Raphael S Carvalho

[permalink] [raw]
Subject: [PATCH 1/1] kernel/pid.c: Masking the flag out to get the actual value.

This patch shouldn't be applied if those branches must only be taken when
the pid_allocation(PIDNS_HASH_ADDING) flag was turned off.
Otherwise, we must turn the PIDNS_HASH_ADDING flag (1U << 31) off
before getting into the switch-cases.

Signed-off-by: Raphael S. Carvalho <[email protected]>
---
kernel/pid.c | 5 ++++-
1 files changed, 4 insertions(+), 1 deletions(-)

diff --git a/kernel/pid.c b/kernel/pid.c
index 0db3e79..6bda527 100644
--- a/kernel/pid.c
+++ b/kernel/pid.c
@@ -263,7 +263,10 @@ void free_pid(struct pid *pid)
struct upid *upid = pid->numbers + i;
struct pid_namespace *ns = upid->ns;
hlist_del_rcu(&upid->pid_chain);
- switch(--ns->nr_hashed) {
+
+ /* We must turn the PIDNS_HASH_ADDING flag off to
+ get the actual value of nr_hashed */
+ switch ((--ns->nr_hashed) & ~(PIDNS_HASH_ADDING)) {
case 1:
/* When all that is left in the pid namespace
* is the reaper wake up the reaper. The reaper
--
1.7.2.5


2013-06-11 22:45:29

by Andrew Morton

[permalink] [raw]
Subject: Re: [PATCH 1/1] kernel/pid.c: Masking the flag out to get the actual value.

On Mon, 10 Jun 2013 18:56:38 -0300 "Raphael S. Carvalho" <[email protected]> wrote:

> This patch shouldn't be applied if those branches must only be taken when
> the pid_allocation(PIDNS_HASH_ADDING) flag was turned off.

Well yes - hopefully this is the case. Otherwise we're missing some
rather important-looking wakeups.


> Otherwise, we must turn the PIDNS_HASH_ADDING flag (1U << 31) off
> before getting into the switch-cases.
>
> ...
>
> --- a/kernel/pid.c
> +++ b/kernel/pid.c
> @@ -263,7 +263,10 @@ void free_pid(struct pid *pid)
> struct upid *upid = pid->numbers + i;
> struct pid_namespace *ns = upid->ns;
> hlist_del_rcu(&upid->pid_chain);
> - switch(--ns->nr_hashed) {
> +
> + /* We must turn the PIDNS_HASH_ADDING flag off to
> + get the actual value of nr_hashed */
> + switch ((--ns->nr_hashed) & ~(PIDNS_HASH_ADDING)) {
> case 1:
> /* When all that is left in the pid namespace
> * is the reaper wake up the reaper. The reaper

Eric, can you please take a look? Presumably and hopefully
PIDNS_HASH_ADDING cannot be set here, but what guarantees this?

Hopefully we can fix this one by adding the missing comment.

2013-06-12 01:17:18

by Eric W. Biederman

[permalink] [raw]
Subject: Re: [PATCH 1/1] kernel/pid.c: Masking the flag out to get the actual value.

Andrew Morton <[email protected]> writes:

> On Mon, 10 Jun 2013 18:56:38 -0300 "Raphael S. Carvalho" <[email protected]> wrote:
>
>> This patch shouldn't be applied if those branches must only be taken when
>> the pid_allocation(PIDNS_HASH_ADDING) flag was turned off.

That is correct. We should not encounter the last pid in the pid
namespace while still allowing processes to be created in the pid
namespace.

So the patch I am seeing quoted below is unnecessary.

> Well yes - hopefully this is the case. Otherwise we're missing some
> rather important-looking wakeups.
>
>
>> Otherwise, we must turn the PIDNS_HASH_ADDING flag (1U << 31) off
>> before getting into the switch-cases.
>>
>> ...
>>
>> --- a/kernel/pid.c
>> +++ b/kernel/pid.c
>> @@ -263,7 +263,10 @@ void free_pid(struct pid *pid)
>> struct upid *upid = pid->numbers + i;
>> struct pid_namespace *ns = upid->ns;
>> hlist_del_rcu(&upid->pid_chain);
>> - switch(--ns->nr_hashed) {
>> +
>> + /* We must turn the PIDNS_HASH_ADDING flag off to
>> + get the actual value of nr_hashed */
>> + switch ((--ns->nr_hashed) & ~(PIDNS_HASH_ADDING)) {
>> case 1:
>> /* When all that is left in the pid namespace
>> * is the reaper wake up the reaper. The reaper
>
> Eric, can you please take a look? Presumably and hopefully
> PIDNS_HASH_ADDING cannot be set here, but what guarantees this?

The init process has not exited, and called zap_pid_ns_processes.

In fact there is a case where nr_hashed can be 0 | PIDNS_HASH_ADDING
where we absolutely don't want to do these things. Of course there
are no pids allocated in that case to free so we can't possible get
to the switch in free pid either.

> Hopefully we can fix this one by adding the missing comment.

Perhaps we can fix this one by having people who care read the code and
think about what it means? Seriously if we are adding pids/processes in
the pid namespace why would to clean up the pid namespace?

Eric

2013-06-12 01:29:01

by Andrew Morton

[permalink] [raw]
Subject: Re: [PATCH 1/1] kernel/pid.c: Masking the flag out to get the actual value.

On Tue, 11 Jun 2013 18:16:50 -0700 [email protected] (Eric W. Biederman) wrote:

>
> > Hopefully we can fix this one by adding the missing comment.
>
> Perhaps we can fix this one by having people who care read the code and
> think about what it means?

As is obvious from this thread, that approach isn't working.

> Seriously if we are adding pids/processes in
> the pid namespace why would to clean up the pid namespace?

A good way to communicate the design would be to describe the semantics
of PIDNS_HASH_ADDING, at its definition site.

[idly wonders what the heck pid_namespace.level and pid.level do, sigh]

2013-06-12 01:54:23

by Eric W. Biederman

[permalink] [raw]
Subject: Re: [PATCH 1/1] kernel/pid.c: Masking the flag out to get the actual value.

Andrew Morton <[email protected]> writes:

> On Tue, 11 Jun 2013 18:16:50 -0700 [email protected] (Eric W. Biederman) wrote:
>
>>
>> > Hopefully we can fix this one by adding the missing comment.
>>
>> Perhaps we can fix this one by having people who care read the code and
>> think about what it means?
>
> As is obvious from this thread, that approach isn't working.
>
>> Seriously if we are adding pids/processes in
>> the pid namespace why would to clean up the pid namespace?
>
> A good way to communicate the design would be to describe the semantics
> of PIDNS_HASH_ADDING, at its definition site.
>
> [idly wonders what the heck pid_namespace.level and pid.level do,
> sigh]

Explaining the semantics a bit more seems reasonable.

Something like:

unsigned int level; /* How deeply nested is this pid namespace */

#define PIDNS_HASH_ADDING (1U << 31) /* Process are still entering the pid namespace */

Sorry I don't have the focus to make that into a proper patch.


Eric