2008-07-31 03:24:24

by Lai Jiangshan

[permalink] [raw]
Subject: [PATCH] cpuset: make ntasks to be a monotonic increasing value


ntasks is not a monotonic increasing value,
So maybe fudge+1 processes are created when kmalloc and killed
when kfree in every loop. And the loop will not end or
repetition a long time.

This patch prevent this kind of attack.

Signed-off-by: Lai Jiangshan <[email protected]>
---
diff --git a/kernel/cpuset.c b/kernel/cpuset.c
index d5ab79c..65eaa2b 100644
--- a/kernel/cpuset.c
+++ b/kernel/cpuset.c
@@ -949,16 +949,20 @@ static int update_tasks_nodemask(struct cpuset *cs, const nodemask_t *oldmem)
* few more lines of code, we can retry until we get a big
* enough mmarray[] w/o using GFP_ATOMIC.
*/
+ ntasks = cgroup_task_count(cs->css.cgroup); /* guess */
while (1) {
- ntasks = cgroup_task_count(cs->css.cgroup); /* guess */
+ int ntasks_now;
ntasks += fudge;
mmarray = kmalloc(ntasks * sizeof(*mmarray), GFP_KERNEL);
if (!mmarray)
goto done;
read_lock(&tasklist_lock); /* block fork */
- if (cgroup_task_count(cs->css.cgroup) <= ntasks)
+ ntasks_now = cgroup_task_count(cs->css.cgroup);
+ if (ntasks_now <= ntasks)
break; /* got enough */
read_unlock(&tasklist_lock); /* try again */
+ ntasks = ntasks_now;
+ fudge += fudge >> 3;
kfree(mmarray);
}



2008-07-31 07:12:13

by Li Zefan

[permalink] [raw]
Subject: Re: [PATCH] cpuset: make ntasks to be a monotonic increasing value

Lai Jiangshan wrote:
> ntasks is not a monotonic increasing value,
> So maybe fudge+1 processes are created when kmalloc and killed
> when kfree in every loop. And the loop will not end or
> repetition a long time.
>
> This patch prevent this kind of attack.
>

Could you demonstrate how to manage to do this so-called attack in
real-life?

2008-07-31 07:47:53

by Lai Jiangshan

[permalink] [raw]
Subject: Re: [PATCH] cpuset: make ntasks to be a monotonic increasing value


Er, I can't. But this attack is existence theoretically indeed.
And IMO a monotonic increasing value is very helpful for this loop.


Li Zefan wrote:
> Lai Jiangshan wrote:
>> ntasks is not a monotonic increasing value,
>> So maybe fudge+1 processes are created when kmalloc and killed
>> when kfree in every loop. And the loop will not end or
>> repetition a long time.
>>
>> This patch prevent this kind of attack.
>>
>
> Could you demonstrate how to manage to do this so-called attack in
> real-life?
>
>
>
>

2008-07-31 12:24:17

by Paul Jackson

[permalink] [raw]
Subject: Re: [PATCH] cpuset: make ntasks to be a monotonic increasing value

I agree that in theory, this kernel/cpuset.c update_tasks_nodemask()
loop could loop forever, and that by forcing ntasks to keep increasing
monotonically, this guarantees that it cannot loop forever.

I also agree that no known exploit of this exists, and doubt that
any could be created.

I did find the added code logic to be a tad more difficult to read
than I'd like. How about the following patch, instead:

---
kernel/cpuset.c | 3 +++
1 file changed, 3 insertions(+)

--- 2.6.25-mm1.orig/kernel/cpuset.c 2008-07-31 07:05:23.000000000 -0500
+++ 2.6.25-mm1/kernel/cpuset.c 2008-07-31 07:13:48.000000000 -0500
@@ -880,6 +880,7 @@ static int update_nodemask(struct cpuset
struct task_struct *p;
struct mm_struct **mmarray;
int i, n, ntasks;
+ int prev_ntasks = 0;
int migrate;
int fudge;
int retval;
@@ -939,7 +940,9 @@ static int update_nodemask(struct cpuset
*/
while (1) {
ntasks = cgroup_task_count(cs->css.cgroup); /* guess */
+ ntasks = max(ntasks, prev_ntasks); /* keep increasing */
ntasks += fudge;
+ prev_ntasks = ntasks;
mmarray = kmalloc(ntasks * sizeof(*mmarray), GFP_KERNEL);
if (!mmarray)
goto done;


--
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson <[email protected]> 1.940.382.4214

2008-07-31 13:13:15

by Lai Jiangshan

[permalink] [raw]
Subject: Re: [PATCH] cpuset: make ntasks to be a monotonic increasing value


cgroup_task_count() was called twice in every loop. IMO, it's not need.
task number maybe have been increased after kfree. But kfree is generally
quicker than kmalloc. So considering the increasing task number when kfree
is not so useful.

My patch has removed one cgroup_task_count() in every loop.

My patch has an additional line: fudge += fudge >> 3;
This line will reduce loop times remarkably when loop times is large.
(but also loop times is large just in theory)



Paul Jackson wrote:
> I agree that in theory, this kernel/cpuset.c update_tasks_nodemask()
> loop could loop forever, and that by forcing ntasks to keep increasing
> monotonically, this guarantees that it cannot loop forever.
>
> I also agree that no known exploit of this exists, and doubt that
> any could be created.
>
> I did find the added code logic to be a tad more difficult to read
> than I'd like. How about the following patch, instead:
>
> ---
> kernel/cpuset.c | 3 +++
> 1 file changed, 3 insertions(+)
>
> --- 2.6.25-mm1.orig/kernel/cpuset.c 2008-07-31 07:05:23.000000000 -0500
> +++ 2.6.25-mm1/kernel/cpuset.c 2008-07-31 07:13:48.000000000 -0500
> @@ -880,6 +880,7 @@ static int update_nodemask(struct cpuset
> struct task_struct *p;
> struct mm_struct **mmarray;
> int i, n, ntasks;
> + int prev_ntasks = 0;
> int migrate;
> int fudge;
> int retval;
> @@ -939,7 +940,9 @@ static int update_nodemask(struct cpuset
> */
> while (1) {
> ntasks = cgroup_task_count(cs->css.cgroup); /* guess */
> + ntasks = max(ntasks, prev_ntasks); /* keep increasing */
> ntasks += fudge;
> + prev_ntasks = ntasks;
> mmarray = kmalloc(ntasks * sizeof(*mmarray), GFP_KERNEL);
> if (!mmarray)
> goto done;
>
>

2008-07-31 13:37:22

by Paul Jackson

[permalink] [raw]
Subject: Re: [PATCH] cpuset: make ntasks to be a monotonic increasing value

Lai wrote:
> cgroup_task_count() was called twice in every loop. IMO, it's not need.

Ah - true - but I suspect you are trying to optimize the code runtime
(reduce CPU cycles) whereas I am trying to optimize the source code
readability for humans.

Optimizing out the second cgroup_task_count() then requires more
subtle semantics on the pre and post conditions on the ntasks variable
at various points in the code. This makes it slightly harder for
humans to understand the code. That in turn increases the chances
of a subsequent change to the code introducing a bug, because the
author of that subsequent change didn't quite realize all the details
that mattered. Contributing to the introduction of just one bug in
that code loop, at anytime in our lifetimes, would probably cause far
more grief than anything we are trying to fix today.

> My patch has an additional line: fudge += fudge >> 3;
> This line will reduce loop times remarkably when loop times is large.
> (but also loop times is large just in theory)

Agreed to this much at least: "just in theory".

I don't usually add code lines to optimize a case that is just in
theory, in code paths that are not critical, when even without the
added code line, it would still work just fine. For one thing, that
hurts all the normal cases by slightly increasing the kernel text size,
hence slightly increasing the number of cache hit misses executing this
piece of code.

But more importantly, it is one more bit of stuff for humans to
have to read in the code.

I prefer to only add kernel source code complexity when it is needed
in practice for correct function or necessary performance. The above
more rapid growth of fudge is not needed for either reason, so far as
I can tell.

Bottom line - my priorities for non-critical code paths, most important first:
1) It must work.
2) Keep it easy for humans to understand.
3) Reduce kernel text size.
4) Reduce CPU cycles.

--
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson <[email protected]> 1.940.382.4214

2008-07-31 17:06:58

by Paul Menage

[permalink] [raw]
Subject: Re: [PATCH] cpuset: make ntasks to be a monotonic increasing value

On Thu, Jul 31, 2008 at 6:37 AM, Paul Jackson <[email protected]> wrote:
> I prefer to only add kernel source code complexity when it is needed
> in practice for correct function or necessary performance. The above
> more rapid growth of fudge is not needed for either reason, so far as
> I can tell.

That loop really could do with some updates though - currently it
looks at the mm for every task in the cpuset, rather than filtering
duplicate mms from threaded applications.

Paul

2008-07-31 19:38:39

by Paul Jackson

[permalink] [raw]
Subject: Re: [PATCH] cpuset: make ntasks to be a monotonic increasing value

Paul M wrote:
> That loop really could do with some updates though - currently it
> looks at the mm for every task in the cpuset, rather than filtering
> duplicate mms from threaded applications.

Interesting.

After a quick glance, I suppose that we'd still have:
1) allocate an mmarray[] in that particular loop as we do now,
sized large enough for all tasks,
2) convert each task to it's mm, in the next code chunk, with:
mm = get_task_mm(p);

but that then, before we call "mpol_rebind_mm()" for each such
mm, we could essentially do a "sort -u" (sort unique) on that
mmarray[], to remove duplicate mm's. This would not change any
of the existing loops; rather just add one more code paragraph,
to remove the duplicate mm's.

Is that what you're thinking, Paul M?

--
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson <[email protected]> 1.940.382.4214

2008-07-31 20:02:51

by Paul Menage

[permalink] [raw]
Subject: Re: [PATCH] cpuset: make ntasks to be a monotonic increasing value

On Thu, Jul 31, 2008 at 12:38 PM, Paul Jackson <[email protected]> wrote:
> Paul M wrote:
>> That loop really could do with some updates though - currently it
>> looks at the mm for every task in the cpuset, rather than filtering
>> duplicate mms from threaded applications.
>
> Interesting.
>
> After a quick glance, I suppose that we'd still have:
> 1) allocate an mmarray[] in that particular loop as we do now,
> sized large enough for all tasks,
> 2) convert each task to it's mm, in the next code chunk, with:
> mm = get_task_mm(p);
>
> but that then, before we call "mpol_rebind_mm()" for each such
> mm, we could essentially do a "sort -u" (sort unique) on that
> mmarray[], to remove duplicate mm's. This would not change any
> of the existing loops; rather just add one more code paragraph,
> to remove the duplicate mm's.

Yes, something like that.

Paul

2008-08-01 01:37:32

by Lai Jiangshan

[permalink] [raw]
Subject: Re: [PATCH] cpuset: make ntasks to be a monotonic increasing value

Paul Jackson wrote:
> Bottom line - my priorities for non-critical code paths, most important first:
> 1) It must work.
> 2) Keep it easy for humans to understand.
> 3) Reduce kernel text size.
> 4) Reduce CPU cycles.
>

I agree, Thank you!

How about this one?

The loop after this patch applied is:

ntasks_now = cgroup_task_count(cs->css.cgroup);
while (1) {
ntasks = ntasks_now; /* guess */
ntasks += fudge;
mmarray = kmalloc(ntasks * sizeof(*mmarray), GFP_KERNEL);
if (!mmarray)
goto done;
read_lock(&tasklist_lock); /* block fork */
ntasks_now = cgroup_task_count(cs->css.cgroup);
if (ntasks_now <= ntasks)
break; /* got enough */
read_unlock(&tasklist_lock); /* try again */
kfree(mmarray);
}

I think the readability of this code is as good as original's.
And it's better in semantic meaning.

The old non monotonic increasing value is caused by the redundant
cgroup_task_count(). Removing it is good for keeping ntasks
increasing(not need additional arithmetic compare or max statement).

Signed-off-by: Lai Jiangshan <[email protected]>
---
diff --git a/kernel/cpuset.c b/kernel/cpuset.c
index d5ab79c..56a057f 100644
--- a/kernel/cpuset.c
+++ b/kernel/cpuset.c
@@ -930,7 +930,7 @@ static int update_tasks_nodemask(struct cpuset *cs, const nodemask_t *oldmem)
{
struct task_struct *p;
struct mm_struct **mmarray;
- int i, n, ntasks;
+ int i, n, ntasks, ntasks_now;
int migrate;
int fudge;
struct cgroup_iter it;
@@ -949,14 +949,16 @@ static int update_tasks_nodemask(struct cpuset *cs, const nodemask_t *oldmem)
* few more lines of code, we can retry until we get a big
* enough mmarray[] w/o using GFP_ATOMIC.
*/
+ ntasks_now = cgroup_task_count(cs->css.cgroup);
while (1) {
- ntasks = cgroup_task_count(cs->css.cgroup); /* guess */
+ ntasks = ntasks_now; /* guess */
ntasks += fudge;
mmarray = kmalloc(ntasks * sizeof(*mmarray), GFP_KERNEL);
if (!mmarray)
goto done;
read_lock(&tasklist_lock); /* block fork */
- if (cgroup_task_count(cs->css.cgroup) <= ntasks)
+ ntasks_now = cgroup_task_count(cs->css.cgroup);
+ if (ntasks_now <= ntasks)
break; /* got enough */
read_unlock(&tasklist_lock); /* try again */
kfree(mmarray);