2006-01-27 23:07:12

by Jack Steiner

Subject: 2.6.16 - sys_sched_getaffinity & hotplug


It appears if CONFIG_HOTPLUG_CPU is enabled, then all possible
cpus (0 .. NR_CPUS-1) are set in the cpu_possible_map on IA64.

void __init
smp_build_cpu_map (void)
{
	...
	for (cpu = 0; cpu < NR_CPUS; cpu++) {
		ia64_cpu_to_sapicid[cpu] = -1;
#ifdef CONFIG_HOTPLUG_CPU			<<<<
		cpu_set(cpu, cpu_possible_map);	<<<<
#endif						<<<<
	}


sched_getaffinity() returns the cpu_possible_map and'd with the current
task p->cpus_allowed. The default cpus_allowed is all ones.

This is causing problems for apps that use sched_getaffinity()
to determine which cpus they are allowed to run on.
The call to sched_getaffinity returns:

(from strace on a 2 cpu system with NR_CPUS = 512)
sched_getaffinity(0, 1024, { ffffffffffffffff, ffffff ...
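
A user-space check for the mismatch described above could look like the sketch below. It is only an illustration, not code from any of the affected apps: the helper names allowed_cpu_count(), online_cpu_count() and mask_exceeds_online() are made up here, and it assumes glibc's cpu_set_t wrappers plus sysconf(_SC_NPROCESSORS_ONLN) as the source of the online count.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sched.h>
#include <unistd.h>

/*
 * Hypothetical helper: number of CPUs set in the calling task's
 * affinity mask, or -1 if the syscall fails (for example when the
 * kernel's mask is wider than cpu_set_t).
 */
int allowed_cpu_count(void)
{
	cpu_set_t mask;

	CPU_ZERO(&mask);
	if (sched_getaffinity(0, sizeof(mask), &mask) != 0)
		return -1;

	/* A running task always has at least one CPU it may run on. */
	assert(CPU_COUNT(&mask) > 0);
	return CPU_COUNT(&mask);
}

/* Hypothetical helper: number of CPUs online right now. */
long online_cpu_count(void)
{
	return sysconf(_SC_NPROCESSORS_ONLN);
}

/* Nonzero if the affinity mask names more CPUs than are online. */
int mask_exceeds_online(void)
{
	return allowed_cpu_count() > online_cpu_count();
}
```

On the 2 cpu system above, a hotplug kernel that ANDs with cpu_possible_map would be expected to make mask_exceeds_online() return nonzero, which is exactly the case an app placing one thread per set bit cannot handle.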



The man page for sched_getaffinity() is ambiguous. It says:
- A set bit corresponds to a legally schedulable CPU

But it also says:
- Usually, all bits in the mask are set.


Should the following change be made to sched_getaffinity()?

Index: linux/kernel/sched.c
===================================================================
--- linux.orig/kernel/sched.c 2006-01-25 08:50:21.401747695 -0600
+++ linux/kernel/sched.c 2006-01-27 16:57:24.504871895 -0600
@@ -4031,7 +4031,7 @@ long sched_getaffinity(pid_t pid, cpumas
 		goto out_unlock;

 	retval = 0;
-	cpus_and(*mask, p->cpus_allowed, cpu_possible_map);
+	cpus_and(*mask, p->cpus_allowed, cpu_online_map);

 out_unlock:
 	read_unlock(&tasklist_lock);

--
Thanks

Jack Steiner ([email protected])



2006-01-28 02:59:00

by Nathan Lynch

Subject: Re: 2.6.16 - sys_sched_getaffinity & hotplug

Jack Steiner wrote:
>
> It appears if CONFIG_HOTPLUG_CPU is enabled, then all possible
> cpus (0 .. NR_CPUS-1) are set in the cpu_possible_map on IA64.

That's too bad...


> sched_getaffinity() returns the cpu_possible_map and'd with the current
> task p->cpus_allowed. The default cpus_allowed is all ones.
>
> This is causing problems for apps that use sched_getaffinity()
> to determine which cpus they are allowed to run on.

How? Are these apps expecting all set bits to correspond to online
cpus?


> The call to sched_getaffinity returns:
>
> (from strace on a 2 cpu system with NR_CPUS = 512)
> sched_getaffinity(0, 1024, { ffffffffffffffff, ffffff ...
>
>
>
> The man page for sched_getaffinity() is ambiguous. It says:
> - A set bit corresponds to a legally schedulable CPU
>
> But it also says:
> - Usually, all bits in the mask are set.
>
>
> Should the following change be made to sched_getaffinity().
>
> Index: linux/kernel/sched.c
> ===================================================================
> --- linux.orig/kernel/sched.c 2006-01-25 08:50:21.401747695 -0600
> +++ linux/kernel/sched.c 2006-01-27 16:57:24.504871895 -0600
> @@ -4031,7 +4031,7 @@ long sched_getaffinity(pid_t pid, cpumas
> goto out_unlock;
>
> retval = 0;
> - cpus_and(*mask, p->cpus_allowed, cpu_possible_map);
> + cpus_and(*mask, p->cpus_allowed, cpu_online_map);


I don't think so.

For one, that would be mucking around with a kernel/userspace ABI, I
guess.

Additionally, it would mean that the result of sched_getaffinity would
vary with the number of online cpus in the system, which I don't think
is desirable.

2006-01-28 03:14:12

by Paul Jackson

Subject: Re: 2.6.16 - sys_sched_getaffinity & hotplug

Jack wrote:
> Should the following change be made to sched_getaffinity().
>
> Index: linux/kernel/sched.c
> ===================================================================
> --- linux.orig/kernel/sched.c 2006-01-25 08:50:21.401747695 -0600
> +++ linux/kernel/sched.c 2006-01-27 16:57:24.504871895 -0600
> @@ -4031,7 +4031,7 @@ long sched_getaffinity(pid_t pid, cpumas
> goto out_unlock;
>
> retval = 0;
> - cpus_and(*mask, p->cpus_allowed, cpu_possible_map);
> + cpus_and(*mask, p->cpus_allowed, cpu_online_map);

Adding Robert Love to the cc list, as he is Mr. sched_getaffinity,
I believe.

I ended up doing a similar change, to the cpus (and mems) masks
in the root (all encompassing) cpuset. These now show the values
of cpu_online_map and node_online_map, not *_MASK_ALL.

My hunches are:
* This change to cpu_online_map is a good one.
* The man page sentence "Usually, all bits in the mask are set."
might have meant something when it was written, but it is not
now clear what.

--
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson <[email protected]> 1.925.600.0401

2006-01-28 03:42:48

by Nathan Lynch

Subject: Re: 2.6.16 - sys_sched_getaffinity & hotplug

Paul Jackson wrote:
> Jack wrote:
> > Should the following change be made to sched_getaffinity().
> >
> > Index: linux/kernel/sched.c
> > ===================================================================
> > --- linux.orig/kernel/sched.c 2006-01-25 08:50:21.401747695 -0600
> > +++ linux/kernel/sched.c 2006-01-27 16:57:24.504871895 -0600
> > @@ -4031,7 +4031,7 @@ long sched_getaffinity(pid_t pid, cpumas
> > goto out_unlock;
> >
> > retval = 0;
> > - cpus_and(*mask, p->cpus_allowed, cpu_possible_map);
> > + cpus_and(*mask, p->cpus_allowed, cpu_online_map);
>
> Adding Robert Love to the cc list, as he is Mr. sched_getaffinity,
> I believe.
>
> I ended up doing a similar change, to the cpus (and mems) masks
> in the root (all encompassing) cpuset.

Which is problematic, because cpuset_cpus_allowed ->
guarantee_online_cpus restricts the task->cpus_allowed mask to cpus
which happen to be online at the time of the call to
sched_setaffinity. If more cpus come online later, that task can't be
migrated to them.

> These now show the values
> of cpu_online_map and node_online_map, not *_MASK_ALL.
>
> My hunches are:
> * This change to cpu_online_map is a good one.

It's not.

> * The man page sentence "Usually, all bits in the mask are set."
> might have meant something when it was written, but it is not
> now clear what.

I think it could reasonably be interpreted as all bits in the mask are
set unless the task's affinity has been modified.

2006-01-28 04:59:01

by Paul Jackson

Subject: Re: 2.6.16 - sys_sched_getaffinity & hotplug

Nathan wrote:
> Which is problematic, because cpuset_cpus_allowed ->
> guarantee_online_cpus restricts the task->cpus_allowed mask to cpus
> which happen to be online at the time of the call to
> sched_setaffinity. If more cpus come online later, that task can't be
> migrated to them.

Well, sort of.

A task could always migrate - just because a sched_getaffinity
the task did in the past doesn't show a CPU as valid, doesn't stop
the task from asking to pin to that CPU now.

One of three lessons could be taken from your example:
1) return all possible CPUS (CPU_MASK_ALL, likely), as you recommend
2) tell the task to not stash possibly stale returns from sched_getaffinity
3) virtualize app CPU numbers relative to their containing cpuset,
using an additional layer of user code.

I don't think we (or at least I ;) have an adequate understanding yet
of how hotplug will interact with the CPU affinity and Memory Node
mempolicy system calls, both of which are easier to use if things
don't come and go. These calls are still, of course, usable, but
the possibilities for the task confusing itself with stale data
increase, and the simple system numbering of CPUs and Nodes by these
system calls makes (properly so) no effort to hide^Wvirtualize
these changes.

I tend to prefer lesson (3) above, but haven't yet delivered the
libraries or tools needed to support this as Open Source, so can't
really expect that preference to be very persuasive to others.

--
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson <[email protected]> 1.925.600.0401

2006-01-28 05:24:04

by Nathan Lynch

Subject: Re: 2.6.16 - sys_sched_getaffinity & hotplug

Paul Jackson wrote:
> Nathan wrote:
> > Which is problematic, because cpuset_cpus_allowed ->
> > guarantee_online_cpus restricts the task->cpus_allowed mask to cpus
> > which happen to be online at the time of the call to
> > sched_setaffinity. If more cpus come online later, that task can't be
> > migrated to them.
>
> Well, sort of.
>
> A task could always migrate - just because a sched_getaffinity
> the task did in the past doesn't show a CPU as valid, doesn't stop
> the task from asking to pin to that CPU now.

I was speaking of the setaffinity (not getaffinity) case -- I assumed
this was what you were referring to since I couldn't find any calls to
the cpuset code in the getaffinity path.


> One of three lessons could be taken from your example:
> 1) return all possible CPUS (CPU_MASK_ALL, likely), as you
> recommend

I'm only recommending not changing the current behavior of
sched_getaffinity.

(BTW - cpu_possible_map can be a subset of CPU_MASK_ALL on some
platforms -- powerpc, at least, since we can discover the number of
truly possible cpus early in boot.)

2006-01-28 06:41:29

by Paul Jackson

Subject: Re: 2.6.16 - sys_sched_getaffinity & hotplug

Nathan, responding to pj, responding to Nathan:
> > Nathan wrote:
> > > Which is problematic, because cpuset_cpus_allowed ->
> > > guarantee_online_cpus restricts the task->cpus_allowed mask to cpus
> > > which happen to be online at the time of the call to
> > > sched_setaffinity. If more cpus come online later, that task can't be
> > > migrated to them.
> >
> > Well, sort of.
> >
> > A task could always migrate - just because a sched_getaffinity
> > the task did in the past doesn't show a CPU as valid, doesn't stop
> > the task from asking to pin to that CPU now.
>
> I was speaking of the setaffinity (not getaffinity) case -- I assumed
> this was what you were referring to since I couldn't find any calls to
> the cpuset code in the getaffinity path.


Oh dear ... and you said 'setaffinity' quite clearly. Though
Jack's original post only dealt with getaffinity.

I think this discussion is getting quite confused, for which
I can take at least some of the credit.

You observe, correctly, that the call chain:
    sched_setaffinity
      cpuset_cpus_allowed
        guarantee_online_cpus
restricts a sched_setaffinity to CPUs online at the time of that
sched_setaffinity call.

However, I have no clue how you conclude from this that "If more cpus
come online later, that task can't be migrated to them."

At any time that some system service or batch scheduler wants to
migrate a task to some different CPUs (whether or not those CPUs were
once offline), it can either attach that task to a different cpuset,
or change the 'cpus' of its current cpuset.

Then if it wants to properly keep that task's placement relative to its
new cpuset, it can reissue a sched_setaffinity on that task's behalf,
to again set that task's cpus_allowed to the same CPUs as before,
relative to the containing cpuset.

Nothing in the behaviour of sched_getaffinity, that Jack was
considering, nor in the behaviour of sched_setaffinity, that
you thought I must be considering, has any impact on which CPUs
a task can be migrated to.

--
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson <[email protected]> 1.925.600.0401

2006-01-28 07:04:40

by Paul Jackson

Subject: Re: 2.6.16 - sys_sched_getaffinity & hotplug

Nathan wrote:
> I'm only recommending not changing the current behavior of
> sched_getaffinity.

Jack is essentially recommending -unchanging- the behaviour of
sched_getaffinity. CONFIG_HOTPLUG_CPU changed it, as an unintended
side effect, and Jack is asking if we should revert that change.

Prior to CONFIG_HOTPLUG_CPU, on (for example) an ia64 SN2, which is
compiled with 512 or 1024 NR_CPUS, the sched_getaffinity call returned
a mask with at most as many bits set as there were CPUs online.

For example, on an 8 CPU SN2 system (compiled NR_CPUS 512) that is at
hand to me right now, compiled without CONFIG_HOTPLUG_CPU, the command:
strace -etrace=sched_getaffinity taskset -p $$

produces the strace output:
sched_getaffinity(13282, 128, { ff, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 }) = 128

and produces the taskset output:
pid 13282's current affinity mask: ff

(why the particular taskset binary I am invoking is compiled
for just 128 CPUs beats me ;).

This is the sort of behaviour that apps might have come to
expect. And it is not clear that apps would clearly distinguish
between CPUs online at the moment, and possibly online after
some future hotplug event. Given the paucity of hotpluggable
CPUs, it is a safe bet most apps doing this have not clearly
distinguished these two cases.

Now when we introduce CONFIG_HOTPLUG_CPU, as Jack reports, this
sched_getaffinity call is returning with all bits set (Jack was
apparently using a sched_getaffinity call from an app compiled
for 1024 CPUs, on a 512 NR_CPUS kernel):

(from strace on a 2 cpu system with NR_CPUS = 512)
sched_getaffinity(0, 1024, { ffffffffffffffff, ffffff ...

This will break code that takes this return to mean that all those
CPUs are actually available right now.

The addition of CONFIG_HOTPLUG_CPU has changed the apparent (what
-seems- to be happening) behaviour of sched_getaffinity. Without
it, on a small system running a big NR_CPUS kernel, just a small
number of bits were set. With it, all the bits are set.

We need to choose, with the advent of hotplug, whether the
sched_getaffinity means:
1) at most, the CPUs online now, or
2) at most, all possible online CPUs.

This choice did not exist before. I recommend choosing the way that
will be the "least surprising" to existing code. I believe that this
would be (1) the CPUs online now, as Jack's patch accomplishes.
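
The two choices can be compared with a toy model. This is a sketch only: toy_mask is a 64-bit stand-in for the kernel's cpumask_t, and the function names are invented for illustration, not kernel symbols.

```c
#include <stdint.h>

/* Toy 64-bit stand-in for cpumask_t; bit n represents CPU n. */
typedef uint64_t toy_mask;

/* Choice (2): cpus_allowed ANDed with all possible CPUs. */
toy_mask getaffinity_possible(toy_mask allowed, toy_mask possible)
{
	return allowed & possible;
}

/* Choice (1): cpus_allowed ANDed with the CPUs online now. */
toy_mask getaffinity_online(toy_mask allowed, toy_mask online)
{
	return allowed & online;
}

/* Count the set bits by clearing the lowest one each pass. */
int toy_weight(toy_mask m)
{
	int n = 0;

	while (m) {
		m &= m - 1;
		n++;
	}
	return n;
}
```

With a default all-ones cpus_allowed, 64 possible CPUs and CPUs 0-1 online (mask 0x3), choice (2) reports 64 bits and choice (1) reports 2, which matches what an app saw before hotplug was configured in.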

We should not stumble blindly into changing the behaviour of a system
call in an effort to seem to avoid changing it.

--
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson <[email protected]> 1.925.600.0401

2006-01-28 13:32:10

by Ingo Molnar

Subject: Re: 2.6.16 - sys_sched_getaffinity & hotplug


* Paul Jackson <[email protected]> wrote:

> Jack wrote:
> > Should the following change be made to sched_getaffinity().
> >
> > Index: linux/kernel/sched.c
> > ===================================================================
> > --- linux.orig/kernel/sched.c 2006-01-25 08:50:21.401747695 -0600
> > +++ linux/kernel/sched.c 2006-01-27 16:57:24.504871895 -0600
> > @@ -4031,7 +4031,7 @@ long sched_getaffinity(pid_t pid, cpumas
> > goto out_unlock;
> >
> > retval = 0;
> > - cpus_and(*mask, p->cpus_allowed, cpu_possible_map);
> > + cpus_and(*mask, p->cpus_allowed, cpu_online_map);
>
> Adding Robert Love to the cc list, as he is Mr. sched_getaffinity, I
> believe.

i'm to blame for the syscall, Robert is to blame for the tool side
:-) In any case, Jack's change looks reasonable and obviously correct.

Acked-by: Ingo Molnar <[email protected]>

Ingo

2006-01-28 16:09:03

by Jack Steiner

Subject: Re: 2.6.16 - sys_sched_getaffinity & hotplug

On Sat, Jan 28, 2006 at 02:32:44PM +0100, Ingo Molnar wrote:
>
> * Paul Jackson <[email protected]> wrote:
>
> > Jack wrote:
> > > Should the following change be made to sched_getaffinity().
> > >
> > > Index: linux/kernel/sched.c
> > > ===================================================================
> > > --- linux.orig/kernel/sched.c 2006-01-25 08:50:21.401747695 -0600
> > > +++ linux/kernel/sched.c 2006-01-27 16:57:24.504871895 -0600
> > > @@ -4031,7 +4031,7 @@ long sched_getaffinity(pid_t pid, cpumas
> > > goto out_unlock;
> > >
> > > retval = 0;
> > > - cpus_and(*mask, p->cpus_allowed, cpu_possible_map);
> > > + cpus_and(*mask, p->cpus_allowed, cpu_online_map);
> >
> > Adding Robert Love to the cc list, as he is Mr. sched_getaffinity, I
> > believe.
>
> i'm to blame for the syscall, Robert is to blame for the tool side
> :-) In any case, Jack's change looks reasonable and obviously correct.
>
> Acked-by: Ingo Molnar <[email protected]>
>
> Ingo

Ok, thanks. I'll repost as a patch later today....


--
Thanks

Jack Steiner ([email protected]) 651-683-5302


2006-01-28 19:27:45

by Nathan Lynch

Subject: Re: 2.6.16 - sys_sched_getaffinity & hotplug

Ingo Molnar wrote:
>
> > Jack wrote:
> > > Should the following change be made to sched_getaffinity().
> > >
> > > Index: linux/kernel/sched.c
> > > ===================================================================
> > > --- linux.orig/kernel/sched.c 2006-01-25 08:50:21.401747695 -0600
> > > +++ linux/kernel/sched.c 2006-01-27 16:57:24.504871895 -0600
> > > @@ -4031,7 +4031,7 @@ long sched_getaffinity(pid_t pid, cpumas
> > > goto out_unlock;
> > >
> > > retval = 0;
> > > - cpus_and(*mask, p->cpus_allowed, cpu_possible_map);
> > > + cpus_and(*mask, p->cpus_allowed, cpu_online_map);
> >
> In any case, Jack's change looks reasonable and obviously correct.

Are you sure? Assuming this change is in effect, consider the
following:

Task starts with default affinity.

Task does sched_getaffinity, stashes the result in saved_mask.

Task pins itself to one cpu and does some work.

Meanwhile, more cpus are brought online.

Task finishes work and does sched_setaffinity(saved_mask).

Task will never run on the new cpus.
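
The sequence above can be replayed in a toy model. This is a sketch under stated assumptions: the toy_* types and functions are invented for illustration, masks are 64 bits wide, toy_getaffinity() models the patched behaviour (cpus_allowed ANDed with the online map), and toy_setaffinity() ignores cpusets entirely.

```c
#include <stdint.h>

struct toy_system { uint64_t online; };       /* stand-in for cpu_online_map */
struct toy_task   { uint64_t cpus_allowed; }; /* stand-in for p->cpus_allowed */

/* Patched behaviour: report only allowed CPUs that are online now. */
uint64_t toy_getaffinity(const struct toy_task *t, const struct toy_system *s)
{
	return t->cpus_allowed & s->online;
}

/* Simplified: store the mask as given, with no cpuset checks. */
void toy_setaffinity(struct toy_task *t, uint64_t mask)
{
	t->cpus_allowed = mask;
}

/* Replays the scenario; returns the CPUs the task can run on at the end. */
uint64_t replay_saved_mask_scenario(void)
{
	struct toy_system sys = { .online = 0x3 };            /* CPUs 0-1 online */
	struct toy_task task = { .cpus_allowed = UINT64_MAX }; /* default affinity */
	uint64_t saved_mask;

	saved_mask = toy_getaffinity(&task, &sys); /* stash: 0x3 */
	toy_setaffinity(&task, 0x1);               /* pin to CPU 0, do work */
	sys.online = 0xf;                          /* CPUs 2-3 come online */
	toy_setaffinity(&task, saved_mask);        /* restore the stale mask */

	return task.cpus_allowed & sys.online;
}
```

In this model the function returns 0x3: CPUs 2 and 3 are online but excluded, because the restored mask was computed before they appeared.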


2006-01-28 20:07:10

by Paul Jackson

Subject: Re: 2.6.16 - sys_sched_getaffinity & hotplug

Nathan wrote:
> Task finishes work and does sched_setaffinity(saved_mask).

Stupid task. If task wants to run on -all- cpus on a
hotplug system, task should not pass a saved mask, but
rather construct a mask with all bits set and pass that:

    cpu_set_t mask;
    unsigned int i;

    /* set all bits in mask - code totally untested */
    for (i = 0; i < sizeof(cpu_set_t) / sizeof (__cpu_mask); i++)
        mask.__bits[i] = ~0UL;

    /* pid 0 means the calling task */
    sched_setaffinity(0, sizeof(mask), &mask);

Similar problems exist for a task running in a cpuset under
migration. Saved masks are useless in all but static systems,
having no migration, no hotplug.

That, or use a library on top of this that lets the task work
with relative (to whatever is available) CPU and (for the
mbind/mempolicy calls) Memory Node numbers and that handles
the above details. If all goes well, I should be releasing
such a library in the not distant future.
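
For reference, the "construct a mask with all bits set" approach can also be written with glibc's public CPU_* macros rather than the internal __bits field. This is a sketch, not the library mentioned above: allow_all_cpus() is a made-up name, and it assumes the kernel trims the request to the CPUs the task may actually use instead of rejecting it.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sched.h>

/*
 * Hypothetical helper: ask to run on every CPU that cpu_set_t can
 * name. Returns 0 on success, -1 on error (errno set by the syscall).
 */
int allow_all_cpus(void)
{
	cpu_set_t mask;
	int cpu;

	CPU_ZERO(&mask);
	for (cpu = 0; cpu < CPU_SETSIZE; cpu++)
		CPU_SET(cpu, &mask);

	/* Every bit in the fixed-size set should now be on. */
	assert(CPU_COUNT(&mask) == CPU_SETSIZE);

	/* pid 0 means the calling task. */
	return sched_setaffinity(0, sizeof(mask), &mask);
}
```

A task that pinned itself for a while would call allow_all_cpus() afterwards instead of restoring a mask saved earlier, so CPUs brought online in the meantime are not left out.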

--
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson <[email protected]> 1.925.600.0401

2006-01-28 20:09:56

by Paul Jackson

Subject: Re: 2.6.16 - sys_sched_getaffinity & hotplug

Ingo wrote:
> i'm to blame for the syscall, Robert is to blame for the tool side

And here I've been blaming Robert for that syscall all these years.

My humble apologies, Robert ;).

--
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson <[email protected]> 1.925.600.0401

2006-01-28 20:47:30

by Robert Love

Subject: Re: 2.6.16 - sys_sched_getaffinity & hotplug

On Sat, 2006-01-28 at 12:09 -0800, Paul Jackson wrote:

> And here I've been blaming Robert for that syscall all these years.
>
> My humble apologies, Robert ;).

Well, I actually did do the 2.5 version of the patch and sent it to
Linus, so I do find myself at confession on a monthly basis, begging for
some forgiveness.

Robert Love


2006-01-28 21:01:07

by Paul Jackson

Subject: Re: 2.6.16 - sys_sched_getaffinity & hotplug

Robert wrote:
> so I do find myself at confession ...

yeah ... don't we all ...

--
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson <[email protected]> 1.925.600.0401

2006-01-29 13:01:05

by Ingo Molnar

Subject: Re: 2.6.16 - sys_sched_getaffinity & hotplug


* Robert Love <[email protected]> wrote:

> On Sat, 2006-01-28 at 12:09 -0800, Paul Jackson wrote:
>
> > And here I've been blaming Robert for that syscall all these years.
> >
> > My humble apologies, Robert ;).
>
> Well, I actually did do the 2.5 version of the patch and sent it to
> Linus, so I do find myself at confession on a monthly basis, begging
> for some forgiveness.

ah, indeed, so *you* are the one to be blamed for passing on a mortally
flawed hack, making you guilty of contributory enkludgement of the 2.6
kernel ;)

Ingo

2006-01-29 13:07:16

by Jack Steiner

Subject: Re: 2.6.16 - sys_sched_getaffinity & hotplug

On Fri, Jan 27, 2006 at 08:58:55PM -0600, Nathan Lynch wrote:
> Jack Steiner wrote:
> >
> > It appears if CONFIG_HOTPLUG_CPU is enabled, then all possible
> > cpus (0 .. NR_CPUS-1) are set in the cpu_possible_map on IA64.
>
> That's too bad...

Yes it is! It breaks current applications that expect a set bit
to correspond to a valid cpu that a task can be scheduled on.
We have MPI applications that use sched_getaffinity() to determine
where to place their threads. Placing them on nonexistent cpus
is problematic :-)

>
>
> > sched_getaffinity() returns the cpu_possible_map and'd with the current
> > task p->cpus_allowed. The default cpus_allowed is all ones.
> >
> > This is causing problems for apps that use sched_getaffinity()
> > to determine which cpus they are allowed to run on.
>
> How? Are these apps expecting all set bits to correspond to online
> cpus?

Yes. That is what the man page says. That is what sched_getaffinity()
returns if CONFIG_HOTPLUG_CPU is not enabled.


>
>
> > The call to sched_getaffinity returns:
> >
> > (from strace on a 2 cpu system with NR_CPUS = 512)
> > sched_getaffinity(0, 1024, { ffffffffffffffff, ffffff ...
> >
> >
> >
> > The man page for sched_getaffinity() is ambiguous. It says:
> > - A set bit corresponds to a legally schedulable CPU
> >
> > But it also says:
> > - Usually, all bits in the mask are set.
> >
> >
> > Should the following change be made to sched_getaffinity().
> >
> > Index: linux/kernel/sched.c
> > ===================================================================
> > --- linux.orig/kernel/sched.c 2006-01-25 08:50:21.401747695 -0600
> > +++ linux/kernel/sched.c 2006-01-27 16:57:24.504871895 -0600
> > @@ -4031,7 +4031,7 @@ long sched_getaffinity(pid_t pid, cpumas
> > goto out_unlock;
> >
> > retval = 0;
> > - cpus_and(*mask, p->cpus_allowed, cpu_possible_map);
> > + cpus_and(*mask, p->cpus_allowed, cpu_online_map);
>
>
> I don't think so.
>
> For one, that would be mucking around with a kernel/userspace ABI, I
> guess.

I would argue that CONFIG_HOTPLUG_CPU is what changed the API. The
hotplug code (at least on IA64) has changed the meaning of the bits.

In addition, it does not seem logical that an API should change on IA64
based on whether or not the CONFIG_HOTPLUG_CPU config option is enabled.

>
> Additionally, it would mean that the result of sched_getaffinity would
> vary with the number of online cpus in the system, which I don't think
> is desirable.

OTOH, if sched_getaffinity() does not reflect online cpus, then what
does it reflect? If CONFIG_HOTPLUG_CPU is enabled, sched_getaffinity()
unconditionally returns a mask with NR_CPUS bits set. This conveys
no useful information except for a kernel compile option.

--
Thanks

Jack Steiner ([email protected]) 651-683-5302
Principal Engineer SGI - Silicon Graphics, Inc.



2006-01-29 13:51:15

by Jack Steiner

Subject: [PATCH] - sys_sched_getaffinity & hotplug

Change sched_getaffinity() so that it returns a bitmap that indicates the
legally schedulable cpus that a task is allowed to run on.

Without this patch, if CONFIG_HOTPLUG_CPU is enabled, sched_getaffinity()
unconditionally returns (at least on IA64) a mask with NR_CPUS bits set.
This conveys no useful information except for a kernel compile option.


Signed-off-by: Jack Steiner <[email protected]>
Acked-by: Ingo Molnar <[email protected]>

---
This fixes a breakage we observed running recent kernels. We have MPI jobs
that use sched_getaffinity() to determine where to place their threads.
Placing them on nonexistent cpus is problematic :-)


Index: linux/kernel/sched.c
===================================================================
--- linux.orig/kernel/sched.c 2006-01-28 10:13:01.834293691 -0600
+++ linux/kernel/sched.c 2006-01-29 07:15:11.217227453 -0600
@@ -4031,7 +4031,7 @@ long sched_getaffinity(pid_t pid, cpumas
 		goto out_unlock;

 	retval = 0;
-	cpus_and(*mask, p->cpus_allowed, cpu_possible_map);
+	cpus_and(*mask, p->cpus_allowed, cpu_online_map);

 out_unlock:
 	read_unlock(&tasklist_lock);

2006-01-29 16:06:50

by Robert Love

Subject: Re: 2.6.16 - sys_sched_getaffinity & hotplug

On Sun, 2006-01-29 at 14:01 +0100, Ingo Molnar wrote:

> ah, indeed, so *you* are the one to be blamed for passing on a mortally
> flawed hack, making you guilty of contributory enkludgement of the 2.6
> kernel ;)

To be fair, I should point out that my original patch made the second
argument a pointer, so that the kernel could return the actual length of
the mask if it were too small. This would move the interface from
"flawed hack" to "not too bad". ;-)

Anyhow, Linus said that the interface was stupid and the second
parameter should not be a pointer. So, if we are gonna blame
someone... :)

Robert Love


2006-01-29 17:26:19

by Paul Jackson

Subject: Re: 2.6.16 - sys_sched_getaffinity & hotplug

Shush Ingo and Robert ... the glibc folks have graciously
been covering for us all these years. If you just keep
quiet, no one will notice your minor contributions to this
botch.

--
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson <[email protected]> 1.925.600.0401