I just noticed this on sparc64, as I lost 31 cpus on my
Niagara box due to it :)
boot_cpu_init() sets the boot processor ID in cpu_present_map.
But fixup_cpu_present_map() will only populate the cpu_present_map if
it is empty, which it won't be because of what boot_cpu_init() just
did.
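To make the interaction concrete, here is a minimal user-space sketch, not kernel code. The `toy_` names and the 32-bit mask type are made up for illustration, and it assumes the fixup fills the present map from the possible map when it does act.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_NR_CPUS 32	/* hypothetical: 32 hardware threads, as on the Niagara box */

/* Toy stand-in for the kernel's cpumask_t: one bit per CPU. */
typedef uint32_t toy_cpumask_t;

/* Count the CPUs set in a mask. */
static unsigned int toy_weight(toy_cpumask_t mask)
{
	unsigned int n = 0;

	for (; mask; mask &= mask - 1)
		n++;
	return n;
}

/* Models boot_cpu_init(): mark the boot CPU as present. */
static void toy_boot_cpu_init(toy_cpumask_t *present, unsigned int boot_cpu)
{
	assert(boot_cpu < TOY_NR_CPUS);
	*present |= (toy_cpumask_t)1 << boot_cpu;
}

/*
 * Models fixup_cpu_present_map() as described above: it only acts when
 * the present map is empty. Copying the possible map is an assumption.
 */
static void toy_fixup_cpu_present_map(toy_cpumask_t *present,
				      toy_cpumask_t possible)
{
	if (*present == 0)
		*present = possible;
}

/* Run the modelled sequence and return how many CPUs end up present. */
static unsigned int toy_present_after_boot(int call_boot_cpu_init)
{
	toy_cpumask_t possible = UINT32_MAX;	/* all 32 CPUs possible */
	toy_cpumask_t present = 0;

	if (call_boot_cpu_init)
		toy_boot_cpu_init(&present, 0);
	toy_fixup_cpu_present_map(&present, possible);
	return toy_weight(present);
}
```

In this model, calling `toy_present_after_boot(1)` leaves only the boot CPU present, while `toy_present_after_boot(0)` ends up with all 32.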
"David S. Miller" <[email protected]> wrote:
>
>
> I just noticed this on sparc64, as I lost 31 cpus on my
> Niagara box due to it :)
>
> boot_cpu_init() sets the boot processor ID in cpu_present_map.
>
> But fixup_cpu_present_map() will only populate the cpu_present_map if
> it is empty, which it won't be because of what boot_cpu_init() just
> did.
oops. I guess most architectures set cpu_present_map while bringing up the
APs.
I think it'd be cleanest to require that the arch do that -
fixup_cpu_present_map() looks like a bit of a hack.
I guess if we want to perpetuate fixup_cpu_present_map() then we should
teach it to ignore the boot cpu. (cpus_weight(cpu_present_map) == 1)
would do that.
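A rough user-space sketch of that second option, assuming a toy 32-bit mask in place of cpumask_t. The `toy_` names are hypothetical, and filling from the possible map is an assumption about what the fixup would do.

```c
#include <assert.h>
#include <stdint.h>

/* Toy stand-in for the kernel's cpumask_t: one bit per CPU. */
typedef uint32_t toy_cpumask_t;

/* Count the CPUs set in a mask. */
static unsigned int toy_weight(toy_cpumask_t mask)
{
	unsigned int n = 0;

	for (; mask; mask &= mask - 1)
		n++;
	return n;
}

/*
 * Hypothetical fixup variant: a present map holding exactly one CPU
 * (the boot CPU) is treated as "the arch did not populate it", and is
 * replaced by the possible map. Anything else is left alone.
 */
static toy_cpumask_t toy_fixup_ignoring_boot_cpu(toy_cpumask_t present,
						 toy_cpumask_t possible)
{
	/* Present CPUs must be a subset of the possible ones. */
	assert((present & ~possible) == 0);

	if (toy_weight(present) == 1)
		return possible;
	return present;
}
```

With only bit 0 set on entry, the result is the full possible mask. A map that already has two or more CPUs set is returned unchanged.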
From: Andrew Morton <[email protected]>
Date: Sat, 25 Mar 2006 03:47:44 -0800
> I think it'd be cleanest to require that the arch do that -
> fixup_cpu_present_map() looks like a bit of a hack.
Indeed it does. I'm planning on doing something like this
for sparc64:
diff --git a/arch/sparc64/kernel/smp.c b/arch/sparc64/kernel/smp.c
index 1b6e2ad..7dc28a4 100644
--- a/arch/sparc64/kernel/smp.c
+++ b/arch/sparc64/kernel/smp.c
@@ -1298,6 +1298,7 @@ void __init smp_prepare_cpus(unsigned in
 		while (!cpu_find_by_instance(instance, NULL, &mid)) {
 			if (mid != boot_cpu_id) {
 				cpu_clear(mid, phys_cpu_present_map);
+				cpu_clear(mid, cpu_present_map);
 				if (num_possible_cpus() <= max_cpus)
 					break;
 			}
@@ -1332,8 +1333,10 @@ void __init smp_setup_cpu_possible_map(v
 	instance = 0;
 	while (!cpu_find_by_instance(instance, NULL, &mid)) {
-		if (mid < NR_CPUS)
+		if (mid < NR_CPUS) {
 			cpu_set(mid, phys_cpu_present_map);
+			cpu_set(mid, cpu_present_map);
+		}
 		instance++;
 	}
 }
On Sat, Mar 25, 2006 at 03:47:44AM -0800, Andrew Morton wrote:
> "David S. Miller" <[email protected]> wrote:
> >
> >
> > I just noticed this on sparc64, as I lost 31 cpus on my
> > Niagara box due to it :)
> >
> > boot_cpu_init() sets the boot processor ID in cpu_present_map.
> >
> > But fixup_cpu_present_map() will only populate the cpu_present_map if
> > it is empty, which it won't be because of what boot_cpu_init() just
> > did.
>
> oops. I guess most architectures set cpu_present_map while bringing up the
> APs.
>
> I think it'd be cleanest to require that the arch do that -
> fixup_cpu_present_map() looks like a bit of a hack.
>
> I guess if we want to perpetuate fixup_cpu_present_map() then we should
> teach it to ignore the boot cpu. (cpus_weight(cpu_present_map) == 1)
> would do that.
At setup_arch() time, we initialise cpu_possible_map to contain the CPUs
the system might have.
We then call smp_prepare_boot_cpu() which marks the boot cpu in both
cpu_present_map and cpu_online_map.
Eventually, we call smp_prepare_cpus(), where an architecture may
populate cpu_present_map to indicate which cpus are actually present,
and following this we call fixup_cpu_present_map().
With your proposed change, if an SMP system with 4 possible CPUs
was passed maxcpus=1, cpu_possible_map may well have 4 CPUs, and
cpu_present_map will only contain the one. However,
fixup_cpu_present_map() will then say "oh, only one CPU, we need to
populate the others" and so you'd actually try to boot all 4.
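The scenario can be modelled in user space roughly as follows. This is only a sketch: the `toy_` names are hypothetical, the boot CPU is assumed to be CPU 0, and the arch is assumed to honour maxcpus by limiting what it puts in the present map.

```c
#include <assert.h>
#include <stdint.h>

/* Toy stand-in for the kernel's cpumask_t: one bit per CPU. */
typedef uint32_t toy_cpumask_t;

/* Count the CPUs set in a mask. */
static unsigned int toy_weight(toy_cpumask_t mask)
{
	unsigned int n = 0;

	for (; mask; mask &= mask - 1)
		n++;
	return n;
}

/*
 * Hypothetical arch step: honour maxcpus by marking only the first
 * max_cpus possible CPUs as present.
 */
static toy_cpumask_t toy_arch_prepare_cpus(toy_cpumask_t possible,
					   unsigned int max_cpus)
{
	toy_cpumask_t present = 0;
	unsigned int cpu;

	for (cpu = 0; cpu < 32 && toy_weight(present) < max_cpus; cpu++)
		if (possible & ((toy_cpumask_t)1 << cpu))
			present |= (toy_cpumask_t)1 << cpu;
	return present;
}

/* Fixup variant that treats a single present CPU as "not populated". */
static toy_cpumask_t toy_fixup_ignoring_boot_cpu(toy_cpumask_t present,
						 toy_cpumask_t possible)
{
	if (toy_weight(present) == 1)
		return possible;
	return present;
}

/* Number of CPUs the generic code would then try to boot. */
static unsigned int toy_cpus_to_boot(unsigned int nr_possible,
				     unsigned int max_cpus)
{
	toy_cpumask_t possible, present;

	assert(nr_possible >= 1 && nr_possible <= 32);
	possible = nr_possible == 32 ? UINT32_MAX
				     : ((toy_cpumask_t)1 << nr_possible) - 1;
	present = toy_arch_prepare_cpus(possible, max_cpus);
	present = toy_fixup_ignoring_boot_cpu(present, possible);
	return toy_weight(present);
}
```

In this model `toy_cpus_to_boot(4, 1)` comes back as 4, so the maxcpus=1 limit is silently undone, whereas `toy_cpus_to_boot(4, 2)` stays at 2 because the fixup leaves a two-CPU map alone.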
So no, this doesn't work. Isn't it about time the pre-CPU hotplug SMP
stuff was updated, rather than trying to messily support two different
SMP initialisation methodologies in generic code with band aid plasters
all over?
--
Russell King
Linux kernel 2.6 ARM Linux - http://www.arm.linux.org.uk/
maintainer of: 2.6 Serial core
From: Russell King <[email protected]>
Date: Sat, 25 Mar 2006 12:05:46 +0000
> So no, this doesn't work. Isn't it about time the pre-CPU hotplug SMP
> stuff was updated, rather than trying to messily support two different
> SMP initialisation methodologies in generic code with band aid plasters
> all over?
Agreed.
"David S. Miller" <[email protected]> wrote:
>
> From: Andrew Morton <[email protected]>
> Date: Sat, 25 Mar 2006 03:47:44 -0800
>
> > I think it'd be cleanest to require that the arch do that -
> > fixup_cpu_present_map() looks like a bit of a hack.
>
> Indeed it does. I'm planning on doing something like this
> for sparc64:
Fair enough. fixup_cpu_present_map() is an elaborate no-op now. I'll nuke
it and will send a heads-up to the arch maintainers.
Russell King <[email protected]> wrote:
>
> On Sat, Mar 25, 2006 at 03:47:44AM -0800, Andrew Morton wrote:
> > "David S. Miller" <[email protected]> wrote:
> > >
> > >
> > > I just noticed this on sparc64, as I lost 31 cpus on my
> > > Niagara box due to it :)
> > >
> > > boot_cpu_init() sets the boot processor ID in cpu_present_map.
> > >
> > > But fixup_cpu_present_map() will only populate the cpu_present_map if
> > > it is empty, which it won't be because of what boot_cpu_init() just
> > > did.
> >
> > oops. I guess most architectures set cpu_present_map while bringing up the
> > APs.
> >
> > I think it'd be cleanest to require that the arch do that -
> > fixup_cpu_present_map() looks like a bit of a hack.
> >
> > I guess if we want to perpetuate fixup_cpu_present_map() then we should
> > teach it to ignore the boot cpu. (cpus_weight(cpu_present_map) == 1)
> > would do that.
>
> At setup_arch() time, we initialise cpu_possible_map to contain the CPUs
> the system might have.
OK.
> We then call smp_prepare_boot_cpu() which marks the boot cpu in both
> cpu_present_map and cpu_online_map.
OK.
> Eventually, we call smp_prepare_cpus(), where an architecture may
> populate cpu_present_map to indicate which cpus are actually present,
> and following this we call fixup_cpu_present_map().
OK.
> With your proposed change,
Which proposed change? I proposed two.
> if an SMP system with 4 possible CPUs
> was passed maxcpus=1, cpu_possible_map may well have 4 CPUs, and
> cpu_present_map will only contain the one. However,
> fixup_cpu_present_map() will then say "oh, only one CPU, we need to
> populate the others" and so you'd actually try to boot all 4.
The change we appear to be going with is to remove fixup_cpu_present_map()
which appears to address this.
> So no, this doesn't work.
What doesn't work?
> Isn't it about time the pre-CPU hotplug SMP
> stuff was updated, rather than trying to messily support two different
> SMP initialisation methodologies in generic code with band aid plasters
> all over?
What two methodologies? arch-doing-it and fixup_cpu_present_map() doing it?
On Sat, Mar 25, 2006 at 04:15:59AM -0800, Andrew Morton wrote:
> Russell King <[email protected]> wrote:
> > With your proposed change,
>
> Which proposed change? I proposed two.
The latter change.
> > if an SMP system with 4 possible CPUs
> > was passed maxcpus=1, cpu_possible_map may well have 4 CPUs, and
> > cpu_present_map will only contain the one. However,
> > fixup_cpu_present_map() will then say "oh, only one CPU, we need to
> > populate the others" and so you'd actually try to boot all 4.
>
> The change we appear to be going with is to remove fixup_cpu_present_map()
> which appears to address this.
>
> > So no, this doesn't work.
>
> What doesn't work?
The situation I described, which you've quoted in the bulk of the
above.
> > Isn't it about time the pre-CPU hotplug SMP
> > stuff was updated, rather than trying to messily support two different
> > SMP initialisation methodologies in generic code with band aid plasters
> > all over?
>
> What two methodologies? arch-doing-it and fixup_cpu_present_map() doing it?
What I'm referring to is the pre-CPU hotplug SMP initialisation
methodology and the post-CPU hotplug SMP initialisation methodology,
which I think is covered by "two different SMP initialisation
methodologies".
The two methodologies had entirely different ways of bringing up the
non-boot CPUs to the extent of using the cpu_*_map variables in different
ways. However, now that I come to look again at x86, the situation does
appear to have improved somewhat over the last year or so since I last
looked (which was when I sorted out the ARM SMP support.)
--
Russell King
Linux kernel 2.6 ARM Linux - http://www.arm.linux.org.uk/
maintainer of: 2.6 Serial core