The recent optimizations to sys_sched_yield() introduced a bug.
In the current implementation of sys_sched_yield(),
aligned_data and idle_task() are indexed by logical cpu-#.
They should, however, be indexed by physical cpu-#.
Since logical==physical on the x86 platform, it doesn't matter there,
but on other platforms where this is not true it will matter.
Below is the fix.
diff -uwrbBN linux-2.4.3/kernel/sched.c linux-2.4.3-fix/kernel/sched.c
--- linux-2.4.3/kernel/sched.c Thu Mar 22 12:20:45 2001
+++ linux-2.4.3-fix/kernel/sched.c Wed Apr 11 11:27:16 2001
@@ -1024,9 +1024,11 @@
 	int i;
 	// Substract non-idle processes running on other CPUs.
-	for (i = 0; i < smp_num_cpus; i++)
-		if (aligned_data[i].schedule_data.curr != idle_task(i))
+	for (i = 0; i < smp_num_cpus; i++) {
+		int cpu = cpu_logical_map(i);
+		if (aligned_data[cpu].schedule_data.curr != idle_task(cpu))
 			nr_pending--;
+	}
 #else
 	// on UP this process is on the runqueue as well
 	nr_pending--;
Hubertus Franke
Enterprise Linux Group (Mgr), Linux Technology Center (Member Scalability)
email: [email protected]
(w) 914-945-2003 (fax) 914-945-4425 TL: 862-2003
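
To make the failure mode concrete, here is a minimal user-space sketch, not kernel code. Every name in it (slot_idle, slot_curr, logical_map, model_setup, busy_cpus_buggy, busy_cpus_fixed) is a hypothetical stand-in, and it assumes a machine whose two online CPUs sit in physical slots 0 and 2, with per-CPU state stored by physical slot.

```c
#define NR_SLOTS 4                  /* hypothetical number of physical CPU slots */

struct task { int pid; };

struct task  slot_idle[NR_SLOTS];   /* idle task of each physical slot */
struct task *slot_curr[NR_SLOTS];   /* task running on each physical slot */
struct task  user_task = { 42 };    /* one ordinary runnable task */
int logical_map[NR_SLOTS];          /* logical index -> physical slot */
int num_online;                     /* number of online CPUs */

/* Assumed layout: two online CPUs in physical slots 0 and 2.
 * Slot 0 is idle, slot 2 runs user_task, slots 1 and 3 are empty. */
void model_setup(void)
{
    int i;

    for (i = 0; i < NR_SLOTS; i++)
        slot_curr[i] = &slot_idle[i];
    logical_map[0] = 0;
    logical_map[1] = 2;
    num_online = 2;
    slot_curr[2] = &user_task;
}

/* Pre-fix behaviour: the logical index is used directly as the array index. */
int busy_cpus_buggy(void)
{
    int i, busy = 0;

    for (i = 0; i < num_online; i++)
        if (slot_curr[i] != &slot_idle[i])
            busy++;
    return busy;
}

/* Post-fix behaviour: translate logical -> physical before indexing. */
int busy_cpus_fixed(void)
{
    int i, busy = 0;

    for (i = 0; i < num_online; i++) {
        int cpu = logical_map[i];

        if (slot_curr[cpu] != &slot_idle[cpu])
            busy++;
    }
    return busy;
}
```

With this layout the buggy loop inspects slots 0 and 1 and misses the busy CPU in slot 2, while the fixed loop inspects slots 0 and 2 and counts it. When the map is the identity, as on x86, the two loops agree.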
On Wed, Apr 11, 2001 at 03:31:37PM -0400, Hubertus Franke wrote:
> Below is the fix.
Correct. Could you also use cpu_curr(cpu) instead of the longer expression?
(for the mainline it's only a beauty issue of course)
Andrea
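
A rough sketch of how the loop might read with cpu_curr(), written as a standalone mock so it compiles outside the kernel. The macro is defined here on the assumption that it is simply shorthand for the longer expression in the patch; the data structures, logical_map, idle_tasks and count_pending() are hypothetical stand-ins rather than the kernel's definitions.

```c
#define NR_CPUS 4                       /* hypothetical array size */

struct task_struct { int pid; };

struct schedule_data { struct task_struct *curr; };

/* Mock per-CPU scheduler data, shaped after the expression in the patch. */
struct { struct schedule_data schedule_data; } aligned_data[NR_CPUS];

struct task_struct idle_tasks[NR_CPUS]; /* mock idle task per physical CPU */
int logical_map[NR_CPUS];               /* mock logical -> physical map */
int smp_num_cpus;                       /* mock count of online CPUs */

#define cpu_logical_map(i) (logical_map[(i)])
#define idle_task(cpu)     (&idle_tasks[(cpu)])
/* Assumed to be equivalent to the long form used in the patch. */
#define cpu_curr(cpu)      (aligned_data[(cpu)].schedule_data.curr)

/* Subtract one from nr_running for every online CPU running a non-idle task. */
int count_pending(int nr_running)
{
    int nr_pending = nr_running;
    int i;

    for (i = 0; i < smp_num_cpus; i++) {
        int cpu = cpu_logical_map(i);

        if (cpu_curr(cpu) != idle_task(cpu))
            nr_pending--;
    }
    return nr_pending;
}
```

The behaviour is the same as in the posted patch; only the spelling of the comparison changes.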
Hubertus Franke wrote:
>
> In the recent optimizations to sys_sched_yield a bug was introduced.
> In the current implementation of sys_sched_yield()
> the aligned_data and idle_tasks are indexed by logical cpu-#.
>
> They should however be indexed by physical cpu-#.
> Since logical==physical on the x86 platform, it doesn't matter there,
> for other platforms where this is not true it will matter.
> Below is the fix.
>
Uh... I do know about this map, but I wonder if it is at all needed.
What is the real difference between a logical cpu and the physical one?
Or is this only interesting if the machine is not SMP, i.e. all the cpus
are not the same? It just seems to me that introducing an additional
mapping just slows things down and, if all the cpus are the same, does
not really do anything. Of course, I am assuming that ALL usage would
be to the logical :)
George
george anzinger writes:
> Uh... I do know about this map, but I wonder if it is at all needed.
> What is the real difference between a logical cpu and the physical one?
> Or is this only interesting if the machine is not SMP, i.e. all the cpus
> are not the same? It just seems to me that introducing an additional
> mapping just slows things down and, if all the cpus are the same, does
> not really do anything. Of course, I am assuming that ALL usage would
> be to the logical :)
Right. That is not always the case. IA32 is somewhat special. ;) The
logical mapping allows you to, among other things, easily enumerate
over the set of active processors without having to check if a
processor exists at the current processor address.
The difference is apparent when the physical CPU ID is, say, an
address on a processor bus, or worse, an address on a set of processor
busses. Take a look at the IA-64's smp.h. The IA64 physical
processor ID is a 64-bit structure that has two 8-bit ID's; an EID for
what amounts to a "processor bus" ID and an ID that corresponds to a
specific processor on a processor bus. Together, they're a system
global ID for a specific processor. But there is no guarantee that
the set of global ID's will be contiguous.
It's possible to have disjoint (non-contiguous) physical processor
ID's if a processor bus is not completely populated, or there is an
empty processor slot or odd processor numbering in firmware, or
whatever.
--Walt
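
The point about sparse physical IDs can be sketched in a few lines of C. This is an illustration only: the 16-bit packing of EID and ID in make_phys_id() is an assumption made for the example and is not the IA-64 register layout, and phys_id_of, online_cpus, register_cpu() and logical_of() are hypothetical names.

```c
#include <stdint.h>

#define MAX_CPUS 16                 /* hypothetical upper bound */

uint16_t phys_id_of[MAX_CPUS];      /* logical number -> physical ID */
int online_cpus;                    /* logical numbers 0..online_cpus-1 are valid */

/* Pack a bus-level EID and a per-bus ID into one value (assumed packing). */
uint16_t make_phys_id(uint8_t eid, uint8_t id)
{
    return (uint16_t)(((unsigned)eid << 8) | id);
}

/* Give the next free logical number to a CPU discovered at (eid, id).
 * Returns the logical number, or -1 if the table is full. */
int register_cpu(uint8_t eid, uint8_t id)
{
    if (online_cpus >= MAX_CPUS)
        return -1;
    phys_id_of[online_cpus] = make_phys_id(eid, id);
    return online_cpus++;
}

/* Reverse lookup: physical ID -> logical number, or -1 if no such CPU. */
int logical_of(uint16_t phys)
{
    for (int i = 0; i < online_cpus; i++)
        if (phys_id_of[i] == phys)
            return i;
    return -1;
}
```

Registering CPUs at (0,0), (0,2) and (1,1) yields logical numbers 0, 1 and 2 even though the physical IDs have gaps, so a loop over 0..online_cpus-1 never has to test whether a processor exists at a given address.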
Walt Drummond wrote:
>
> george anzinger writes:
> > Uh... I do know about this map, but I wonder if it is at all needed.
> > What is the real difference between a logical cpu and the physical one?
> > Or is this only interesting if the machine is not SMP, i.e. all the cpus
> > are not the same? It just seems to me that introducing an additional
> > mapping just slows things down and, if all the cpus are the same, does
> > not really do anything. Of course, I am assuming that ALL usage would
> > be to the logical :)
>
> Right. That is not always the case. IA32 is somewhat special. ;) The
> logical mapping allows you to, among other things, easily enumerate
> over the set of active processors without having to check if a
> processor exists at the current processor address.
>
> The difference is apparent when the physical CPU ID is, say, an
> address on a processor bus, or worse, an address on a set of processor
> busses. Take a look at the IA-64's smp.h. The IA64 physical
> processor ID is a 64-bit structure that has two 8-bit ID's; an EID for
> what amounts to a "processor bus" ID and an ID that corresponds to a
> specific processor on a processor bus. Together, they're a system
> global ID for a specific processor. But there is no guarantee that
> the set of global ID's will be contiguous.
>
> It's possible to have disjoint (non-contiguous) physical processor
> ID's if a processor bus is not completely populated, or there is an
> empty processor slot or odd processor numbering in firmware, or
> whatever.
>
All that is cool. Still, most places we don't really address the
processor, so the logical cpu number is all we need. Places like
sched_yield, for example, should be using this, not the actual number,
which IMO should only be used when, for some reason, we NEED the hard
address of the cpu. I don't think this ever has to leak out to the
common kernel code, or am I missing something here?
George
george anzinger writes:
> All that is cool. Still, most places we don't really address the
> processor, so the logical cpu number is all we need. Places like
> sched_yield, for example, should be using this, not the actual number,
> which IMO should only be used when, for some reason, we NEED the hard
> address of the cpu. I don't think this ever has to leak out to the
> common kernel code, or am I missing something here?
No, you're not; I was. I completely misinterpreted your question.
Sorry about that.
Hubertus and Kanoj have provided the answer I should have given.
--Walt