2006-02-14 00:19:23

by Hubertus Franke

[permalink] [raw]
Subject: SMP BUG

Folks the change introduced in 2.6.16-rc2 over 2.6.15
wrt to the SMP initialization are wrong.
Please apply to unroll the change..

Here is the logic ...
sched_init is called from start_kernel before the
architecture specific function cpu_check_smp() is called
which is done as part of rest_init().

On s390 this actually sets the cpu_possible_map, which
is now used in sched_init through the for_each_cpu without
properly being initialized.
As a result bringing 2nd and subsequent cpu online
breaks.

This should be a quick fix, until this chicken and egg
problem is solved otherwise.

-- Hubertus

--- kernel/sched.c.orig 2006-02-13 19:08:28.000000000 -0500
+++ kernel/sched.c 2006-02-13 19:09:08.000000000 -0500
@@ -6111,7 +6111,7 @@ void __init sched_init(void)
runqueue_t *rq;
int i, j, k;

- for_each_cpu(i) {
+ for (i = 0; i < NR_CPUS; i++ ) {
prio_array_t *array;

rq = cpu_rq(i);


2006-02-15 23:07:24

by Russell King

[permalink] [raw]
Subject: Re: SMP BUG

On Mon, Feb 13, 2006 at 07:19:19PM -0500, Hubertus Franke wrote:
> Folks the change introduced in 2.6.16-rc2 over 2.6.15
> wrt to the SMP initialization are wrong.
> Please apply to unroll the change..
>
> Here is the logic ...
> sched_init is called from start_kernel before the
> architecture specific function cpu_check_smp() is called
> which is done as part of rest_init().
>
> On s390 this actually sets the cpu_possible_map, which
> is now used in sched_init through the for_each_cpu without
> properly being initialized.
> As a result bringing 2nd and subsequent cpu online
> breaks.
>
> This should be a quick fix, until this chicken and egg
> problem is solved otherwise.
>
> -- Hubertus
>
> --- kernel/sched.c.orig 2006-02-13 19:08:28.000000000 -0500
> +++ kernel/sched.c 2006-02-13 19:09:08.000000000 -0500
> @@ -6111,7 +6111,7 @@ void __init sched_init(void)
> runqueue_t *rq;
> int i, j, k;
>
> - for_each_cpu(i) {
> + for (i = 0; i < NR_CPUS; i++ ) {
> prio_array_t *array;
>
> rq = cpu_rq(i);

(left most of the message intact because it seems to have been ignored.
Copying Linus and akpm in the vague hope of a response.)

Yes, I'm also seeing an oops caused by exactly this on ARM:

<5>Linux version 2.6.16-rc3-rmk ([email protected]) (gcc version 3.3 20030728 (Red Hat Linux 3.3-16)) #201 SMP Wed Feb 15 22:34:57 GMT 2006
CPU: Some Random V6 Processor [410fb020] revision 0 (ARMv6TEJ)
Machine: ARM-RealView EB
Memory policy: ECC disabled, Data cache writealloc
<7>On node 0 totalpages: 32768
<7> DMA zone: 32768 pages, LIFO batch:7
<7> DMA32 zone: 0 pages, LIFO batch:0
<7> Normal zone: 0 pages, LIFO batch:0
<7> HighMem zone: 0 pages, LIFO batch:0
CPU0: D VIPT write-back cache
CPU0: I cache: 32768 bytes, associativity 4, 32 byte lines, 256 sets
CPU0: D cache: 32768 bytes, associativity 4, 32 byte lines, 256 sets
Built 1 zonelists
<5>Kernel command line: root=/dev/nfs mem=128M console=ttyAMA0,38400 ip=dhcp cachepolicy=writealloc
PID hash table entries: 1024 (order: 10, 16384 bytes)
Console: colour dummy device 80x30
Dentry cache hash table entries: 32768 (order: 5, 131072 bytes)
Inode-cache hash table entries: 16384 (order: 4, 65536 bytes)
<6>Memory: 128MB = 128MB total
<5>Memory: 127104KB available (1992K code, 426K data, 100K init)
<7>Calibrating delay loop... 83.14 BogoMIPS (lpj=415744)
Mount-cache hash table entries: 512
<6>CPU: Testing write buffer coherency: ok
Calibrating local timer... 104.41MHz.
CPU1: Booted secondary processor
CPU1: D VIPT write-back cache
CPU1: I cache: 32768 bytes, associativity 4, 32 byte lines, 256 sets
CPU1: D cache: 32768 bytes, associativity 4, 32 byte lines, 256 sets
<7>Calibrating delay loop... 83.14 BogoMIPS (lpj=415744)
<1>Unable to handle kernel NULL pointer dereference at virtual address 0000001c
<1>pgd = c0004000
<1>[0000001c] *pgd=00000000
Internal error: Oops: 5 [#1]
Modules linked in:
CPU: 0
PC is at enqueue_task+0x1c/0x64
LR is at activate_task+0xcc/0xe4
pc : [<c0034c28>] lr : [<c0034f80>] Not tainted
sp : c7c05ebc ip : c7c05ed0 fp : c7c05ecc
r10: 00000001 r9 : c001b160 r8 : c038a240
r7 : c03fe2e0 r6 : 00000000 r5 : 0095257a r4 : 00000008
r3 : 00000018 r2 : c03fe308 r1 : 00000000 r0 : c03fe2e0
Flags: nZCv IRQs off FIQs on Mode SVC_32 Segment kernel
Control: C5787F Table: 0000400A DAC: 00000017
Process swapper (pid: 1, stack limit = 0xc7c04194)
Stack: (0xc7c05ebc to 0xc7c06000)
5ea0: 00000008
5ec0: c7c05ef0 c7c05ed0 c0034f80 c0034c18 c001b160 00000001 00000000 c03fe2e0
5ee0: c038a240 c7c05f48 c7c05ef4 c0035960 c0034ec0 c0065a28 c00657b4 c7c04000
5f00: c7c0c000 c038a240 00000000 00000001 00000000 00000000 0000000f 80000013
5f20: c021a55c 00000001 00000001 00000000 c0264f8c 00000000 00000000 c7c05f58
5f40: c7c05f4c c00359d4 c00355e0 c7c05f88 c7c05f5c c00395dc c00359c8 c002b0a4
5f60: c021a55c 00000001 00000002 00000000 c0264f8c 00000000 00000000 c7c05fa4
5f80: c7c05f8c c004d544 c00394d4 00000000 00000001 00000001 c7c05fc8 c7c05fa8
5fa0: c005b488 c004d518 00000001 c0264f8c 00000000 00000000 00000000 c7c05fe0
5fc0: c7c05fcc c0008874 c005b3b8 c0262efc 00000000 c7c05ff4 c7c05fe4 c0021120
5fe0: c0008818 00000000 00000000 c7c05ff8 c0040a44 c0021084 2020202d 2d2d2020
Backtrace:
[<c0034c0c>] (enqueue_task+0x0/0x64) from [<c0034f80>] (activate_task+0xcc/0xe4) r4 = 00000008
[<c0034eb4>] (activate_task+0x0/0xe4) from [<c0035960>] (try_to_wake_up+0x38c/0x3e8)
[<c00355d4>] (try_to_wake_up+0x0/0x3e8) from [<c00359d4>] (wake_up_process+0x18/0x1c)
[<c00359bc>] (wake_up_process+0x0/0x1c) from [<c00395dc>] (migration_call+0x114/0x328)
[<c00394c8>] (migration_call+0x0/0x328) from [<c004d544>] (notifier_call_chain+0x38/0x50)
[<c004d50c>] (notifier_call_chain+0x0/0x50) from [<c005b488>] (cpu_up+0xdc/0x104)
[<c005b3ac>] (cpu_up+0x0/0x104) from [<c0008874>] (smp_init+0x68/0xc4)
[<c000880c>] (smp_init+0x0/0xc4) from [<c0021120>] (init+0xa8/0x1c8)
[<c0021078>] (init+0x0/0x1c8) from [<c0040a44>] (do_exit+0x0/0x3f4)
Code: e5903020 e2802028 e0813183 e2833018 (e593c004)
<0>Kernel panic - not syncing: Attempted to kill init!
<2>CPU1: stopping
[<c0027478>] (dump_stack+0x0/0x14) from [<c0028af8>] (ipi_cpu_stop+0x2c/0x64)
[<c0028acc>] (ipi_cpu_stop+0x0/0x64) from [<c0028bec>] (do_IPI+0xbc/0xe8)
[<c0028b30>] (do_IPI+0x0/0xe8) from [<c00219b0>] (__irq_svc+0x30/0xc0)
[<c0023b88>] (default_idle+0x0/0x44) from [<c0023c2c>] (cpu_idle+0x60/0x80)
[<c0023bcc>] (cpu_idle+0x0/0x80) from [<c00285b4>] (secondary_start_kernel+0xc8/0xd8)
[<c00284ec>] (secondary_start_kernel+0x0/0xd8) from [<000080e0>] (0x80e0)

enqueue_task is being called with p = c03fe2e0, array = NULL, leading
to a NULL pointer dereference because rq->array has not been initialised.

--
Russell King
Linux kernel 2.6 ARM Linux - http://www.arm.linux.org.uk/
maintainer of: 2.6 Serial core

2006-02-15 23:18:29

by Andrew Morton

[permalink] [raw]
Subject: Re: SMP BUG

Russell King <[email protected]> wrote:
>
> On Mon, Feb 13, 2006 at 07:19:19PM -0500, Hubertus Franke wrote:
> > Folks the change introduced in 2.6.16-rc2 over 2.6.15
> > wrt to the SMP initialization are wrong.
> > Please apply to unroll the change..
> >
> > Here is the logic ...
> > sched_init is called from start_kernel before the
> > architecture specific function cpu_check_smp() is called
> > which is done as part of rest_init().
> >
> > On s390 this actually sets the cpu_possible_map, which
> > is now used in sched_init through the for_each_cpu without
> > properly being initialized.
> > As a result bringing 2nd and subsequent cpu online
> > breaks.
> >
> > This should be a quick fix, until this chicken and egg
> > problem is solved otherwise.
> >
> > -- Hubertus
> >
> > --- kernel/sched.c.orig 2006-02-13 19:08:28.000000000 -0500
> > +++ kernel/sched.c 2006-02-13 19:09:08.000000000 -0500
> > @@ -6111,7 +6111,7 @@ void __init sched_init(void)
> > runqueue_t *rq;
> > int i, j, k;
> >
> > - for_each_cpu(i) {
> > + for (i = 0; i < NR_CPUS; i++ ) {
> > prio_array_t *array;
> >
> > rq = cpu_rq(i);
>
> (left most of the message intact because it seems to have been ignored.
> Copying Linus and akpm in the vague hope of a response.)

This has already been fixed in s390.

> Yes, I'm also seeing an oops caused by exactly this on ARM:
>
> ...
>
> enqueue_task is being called with p = c03fe2e0, array = NULL, leading
> to a NULL pointer dereference because rq->array has not been initialised.

Is arm's setup_arch() populating cpu_possible_map?

If that's not possible, statically initialising it to CPU_MASK_ALL should
fix it, but that's a lame solution and might lead to wastage of per-cpu
memory on not-possible CPUs.

2006-02-15 23:23:12

by Linus Torvalds

[permalink] [raw]
Subject: Re: SMP BUG



On Wed, 15 Feb 2006, Russell King wrote:
>
> Yes, I'm also seeing an oops caused by exactly this on ARM:

I suspect ARM just initializes the cpu_possible_map too late, like alpha
and s390 did.

That said, nobody seemed to comment on this patch by Rik, which seemed to
be a nice cleanup regardless of any other issues.

Does this fix the ARM oops?

Linus

---
Date: Wed, 8 Feb 2006 20:20:58 -0500 (EST)
From: Rik van Riel <[email protected]>
Subject: Re: [PATCH] percpu data: only iterate over possible CPUs

On Wed, 8 Feb 2006, Rik van Riel wrote:

> On Sun, 5 Feb 2006, Linux Kernel Mailing List wrote:
>
> > [PATCH] percpu data: only iterate over possible CPUs
>
> The sched.c bit breaks Xen, and probably also other architectures
> that have CPU hotplug. I suspect the reason is that during early
> bootup only the boot CPU is online, so nothing initialises the
> runqueues for CPUs that are brought up afterwards.
>
> I suspect we can get rid of this problem quite easily by moving
> runqueue initialisation to init_idle()...

Well, it works. This (fairly trivial) patch makes hotplug cpu
work again, by ensuring that the runqueues of a newly brought
up CPU are initialized just before they are needed.

Without this patch the "spin_lock_irqsave(&rq->lock, flags);"
in init_idle() would oops if CONFIG_DEBUG_SPINLOCK was set.

With this patch, things just work.

Signed-off-by: Rik van Riel <[email protected]>

--- linux-2.6.15.i686/kernel/sched.c.idle_init 2006-02-08 17:56:50.000000000 -0500
+++ linux-2.6.15.i686/kernel/sched.c 2006-02-08 17:58:57.000000000 -0500
@@ -4437,6 +4437,35 @@ void __devinit init_idle(task_t *idle, i
{
runqueue_t *rq = cpu_rq(cpu);
unsigned long flags;
+ prio_array_t *array;
+ int j, k;
+
+ spin_lock_init(&rq->lock);
+ rq->nr_running = 0;
+ rq->active = rq->arrays;
+ rq->expired = rq->arrays + 1;
+ rq->best_expired_prio = MAX_PRIO;
+
+#ifdef CONFIG_SMP
+ rq->sd = NULL;
+ for (j = 1; j < 3; j++)
+ rq->cpu_load[j] = 0;
+ rq->active_balance = 0;
+ rq->push_cpu = 0;
+ rq->migration_thread = NULL;
+ INIT_LIST_HEAD(&rq->migration_queue);
+#endif
+ atomic_set(&rq->nr_iowait, 0);
+
+ for (j = 0; j < 2; j++) {
+ array = rq->arrays + j;
+ for (k = 0; k < MAX_PRIO; k++) {
+ INIT_LIST_HEAD(array->queue + k);
+ __clear_bit(k, array->bitmap);
+ }
+ // delimiter for bitsearch
+ __set_bit(MAX_PRIO, array->bitmap);
+ }

idle->sleep_avg = 0;
idle->array = NULL;
@@ -6110,41 +6139,6 @@ int in_sched_functions(unsigned long add

void __init sched_init(void)
{
- runqueue_t *rq;
- int i, j, k;
-
- for_each_cpu(i) {
- prio_array_t *array;
-
- rq = cpu_rq(i);
- spin_lock_init(&rq->lock);
- rq->nr_running = 0;
- rq->active = rq->arrays;
- rq->expired = rq->arrays + 1;
- rq->best_expired_prio = MAX_PRIO;
-
-#ifdef CONFIG_SMP
- rq->sd = NULL;
- for (j = 1; j < 3; j++)
- rq->cpu_load[j] = 0;
- rq->active_balance = 0;
- rq->push_cpu = 0;
- rq->migration_thread = NULL;
- INIT_LIST_HEAD(&rq->migration_queue);
-#endif
- atomic_set(&rq->nr_iowait, 0);
-
- for (j = 0; j < 2; j++) {
- array = rq->arrays + j;
- for (k = 0; k < MAX_PRIO; k++) {
- INIT_LIST_HEAD(array->queue + k);
- __clear_bit(k, array->bitmap);
- }
- // delimiter for bitsearch
- __set_bit(MAX_PRIO, array->bitmap);
- }
- }
-
/*
* The boot idle thread does lazy MMU switching as well:
*/

2006-02-15 23:31:39

by Andrew Morton

[permalink] [raw]
Subject: Re: SMP BUG

Linus Torvalds <[email protected]> wrote:
>
> That said, nobody seemed to comment on this patch by Rik, which seemed to
> be a nice cleanup regardless of any other issues.

I thought that patch wasn't a good one. The runqueues should be
initialised in sched_init(). init_idle() is called from fork_idle() which
is called from the bowels of arch code. I'm not sure that it gets called
at all if !SMP (which seems strange).

2006-02-15 23:34:16

by Russell King

[permalink] [raw]
Subject: Re: SMP BUG

On Wed, Feb 15, 2006 at 03:17:16PM -0800, Andrew Morton wrote:
> Is arm's setup_arch() populating cpu_possible_map?

Correct, because setup_arch() itself doesn't have _any_ SMP awareness at
all - only the platform code knows about that. Hence, we initialise it
in smp_prepare_cpus(), which is a platform specific function.

> If that's not possible, statically initialising it to CPU_MASK_ALL should
> fix it, but that's a lame solution and might lead to wastage of per-cpu
> memory on not-possible CPUs.

I think that's what we'll have to do, and then later on clear it and
re-populate it with the correct bitmask. Grumble.

--
Russell King
Linux kernel 2.6 ARM Linux - http://www.arm.linux.org.uk/
maintainer of: 2.6 Serial core

2006-02-15 23:37:08

by Russell King

[permalink] [raw]
Subject: Re: SMP BUG

On Wed, Feb 15, 2006 at 03:30:13PM -0800, Andrew Morton wrote:
> Linus Torvalds <[email protected]> wrote:
> > That said, nobody seemed to comment on this patch by Rik, which seemed to
> > be a nice cleanup regardless of any other issues.
>
> I thought that patch wasn't a good one. The runqueues should be
> initialised in sched_init(). init_idle() is called from fork_idle() which
> is called from the bowels of arch code. I'm not sure that it gets called
> at all if !SMP (which seems strange).

Wouldn't it make sense to do this initialisation in a CPU_UP_PREPARE
notifier, if not already done?

--
Russell King
Linux kernel 2.6 ARM Linux - http://www.arm.linux.org.uk/
maintainer of: 2.6 Serial core

2006-02-15 23:47:51

by Andrew Morton

[permalink] [raw]
Subject: Re: SMP BUG

Russell King <[email protected]> wrote:
>
> On Wed, Feb 15, 2006 at 03:30:13PM -0800, Andrew Morton wrote:
> > Linus Torvalds <[email protected]> wrote:
> > > That said, nobody seemed to comment on this patch by Rik, which seemed to
> > > be a nice cleanup regardless of any other issues.
> >
> > I thought that patch wasn't a good one. The runqueues should be
> > initialised in sched_init(). init_idle() is called from fork_idle() which
> > is called from the bowels of arch code. I'm not sure that it gets called
> > at all if !SMP (which seems strange).
>
> Wouldn't it make sense to do this initialisation in a CPU_UP_PREPARE
> notifier, if not already done?
>

Could be - we have a couple of notifier handlers in sched.c already,
although they're inside awkward CONFIG_* wrappers.

But architectures need to initialise cpu_possible_map in setup_arch()
anyway, because we immediately call setup_per_cpu_areas(), which needs to
know which CPUs are possible so it will only allocate memory for them.

Yes, architectures can override the generic setup_per_cpu_areas(), but the
same argument applies: they don't want to be allocating memory for
!possible CPUs.

So I think it's sanest to say that the arch shalt initialise cpu_possible_map
in setup_arch().

2006-02-16 00:14:16

by Russell King

[permalink] [raw]
Subject: Re: SMP BUG

On Wed, Feb 15, 2006 at 03:46:34PM -0800, Andrew Morton wrote:
> Russell King <[email protected]> wrote:
> >
> > On Wed, Feb 15, 2006 at 03:30:13PM -0800, Andrew Morton wrote:
> > > Linus Torvalds <[email protected]> wrote:
> > > > That said, nobody seemed to comment on this patch by Rik, which seemed to
> > > > be a nice cleanup regardless of any other issues.
> > >
> > > I thought that patch wasn't a good one. The runqueues should be
> > > initialised in sched_init(). init_idle() is called from fork_idle() which
> > > is called from the bowels of arch code. I'm not sure that it gets called
> > > at all if !SMP (which seems strange).
> >
> > Wouldn't it make sense to do this initialisation in a CPU_UP_PREPARE
> > notifier, if not already done?
> >
>
> Could be - we have a couple of notifier handlers in sched.c already,
> although they're inside awkward CONFIG_* wrappers.
>
> But architectures need to initialise cpu_possible_map in setup_arch()
> anyway, because we immediately call setup_per_cpu_areas(), which needs to
> know which CPUs are possible so it will only allocate memory for them.
>
> Yes, architectures can override the generic setup_per_cpu_areas(), but the
> same argument applies: they don't want to be allocating memory for
> !possible CPUs.
>
> So I think it's sanest to say that the arch shalt initialise cpu_possible_map
> in setup_arch().

Is that the only thing which needs to be initialised early for SMP, or
are there other changes with early SMP init looming?

The reason I ask is that the cleanest solution for ARM would be to
introduce yet another initialisation function for MP platforms to
implement, which setup_arch() will call after the initial page tables
have been setup. Hence, I'm wondering if the platform specific parts
could be simplified by moving more stuff earlier.

--
Russell King
Linux kernel 2.6 ARM Linux - http://www.arm.linux.org.uk/
maintainer of: 2.6 Serial core

2006-02-16 00:30:00

by Andrew Morton

[permalink] [raw]
Subject: Re: SMP BUG

Russell King <[email protected]> wrote:
>
> > So I think it's sanest to say that the arch shalt initialise cpu_possible_map
> > in setup_arch().
>
> Is that the only thing which needs to be initialised early for SMP, or
> are there other changes with early SMP init looming?

Well, I'm not aware of anything - just the patch "percpu data changes" I
sent to linux-arch the other day, which is related to this issue.

The whole early bootup what-the-arch-should-do-when protocol is rather a
mess, so I'm sure other things will need changing.

> The reason I ask is that the cleanest solution for ARM would be to
> introduce yet another initialisation function for MP platforms to
> implement, which setup_arch() will call after the initial page tables
> have been setup. Hence, I'm wondering if the platform specific parts
> could be simplified by moving more stuff earlier.

That sounds sane. Perhaps make it a generic setup_arch_platform() whose
mandate is "do whatever needs to be done for setup_arch()".

2006-02-16 00:52:38

by Linus Torvalds

[permalink] [raw]
Subject: Re: SMP BUG



On Wed, 15 Feb 2006, Andrew Morton wrote:
>
> I thought that patch wasn't a good one. The runqueues should be
> initialised in sched_init(). init_idle() is called from fork_idle() which
> is called from the bowels of arch code. I'm not sure that it gets called
> at all if !SMP (which seems strange).

sched_init() calls init_idle for the current CPU.

Perhaps more importantly, every _single_ CPU that comes up must call
init-idle pretty much by definition. It's really what starts the whole
scheduling thing - the scheduler itself very much depends on the "idle"
task for each CPU.

So the reason I thought that Rik's patch was a cleanup was not because it
was needed (initializing cpu_possible_map early should fix up the
problems), but because it would actually be a very natural thing to do to
to initialize the scheduler data structures as part of init-idle. The
scheduler really isn't initialized until it has an idle thread anyway.

init_idle() already does part of the scheduler initializations, a pretty
fundamental part, in fact:

rq->curr = rq->idle = idle;

and the fact that "sched_init()" does some _other_ part of scheduler data
structure initialization is actually just ugly.

So I like Rik's patch, but I don't feel _too_ strongly about it. The
people who actually work on the scheduler should be the ones to sign off
(or not) on it.

Linus

2006-02-16 03:29:43

by Nick Piggin

[permalink] [raw]
Subject: Re: SMP BUG

Linus Torvalds wrote:

> So I like Rik's patch, but I don't feel _too_ strongly about it. The
> people who actually work on the scheduler should be the ones to sign off
> (or not) on it.
>

I thought it looked fine. As you say, the scheduler can't work (and
will blow up) if init_idle() isn't called before it is used.

--
SUSE Labs, Novell Inc.
Send instant messages to your online friends http://au.messenger.yahoo.com

2006-02-16 08:39:27

by Ingo Molnar

[permalink] [raw]
Subject: Re: SMP BUG


* Linus Torvalds <[email protected]> wrote:

> That said, nobody seemed to comment on this patch by Rik, which seemed
> to be a nice cleanup regardless of any other issues.
>
> Does this fix the ARM oops?

> Signed-off-by: Rik van Riel <[email protected]>

yep - i agree that pushing runqueue initialization into init_idle() is a
natural thing to do. [We used to do init_idle() much later in the past,
but today we do it straight from sched_init() - so Rik's patch should be
perfectly fine.]

Signed-off-by: Ingo Molnar <[email protected]>

Ingo

2006-02-16 10:21:09

by Russell King

[permalink] [raw]
Subject: Re: SMP BUG

On Wed, Feb 15, 2006 at 03:23:02PM -0800, Linus Torvalds wrote:
> Does this fix the ARM oops?

It fixes that exact oops but only by preventing us getting that far
due to another oops.

We call cpu_up, which sends a CPU_UP_PREPARE event. This causes the
migration thread to be spawned, and rq->migration_thread to be set.

Eventually, we call the architecture __cpu_up(), which ends up
calling init_idle(). Due to this patch, init_idle() then NULLs out
rq->migration_thread.

Later, we send a CPU_ONLINE event, which then tries to do
wake_up_process(rq->migration_thread) - resulting in a NULL pointer
deref in try_to_wake_up().

Hence, with this patch, it looks like rq will be used prior to
initialisation. I could try commenting out the migration_thread
initialisation to NULL, but I suspect that there may be other
problems associated with this patch (eg, rq->migration_queue).

--
Russell King
Linux kernel 2.6 ARM Linux - http://www.arm.linux.org.uk/
maintainer of: 2.6 Serial core

2006-02-16 15:55:07

by Linus Torvalds

[permalink] [raw]
Subject: Re: SMP BUG



On Thu, 16 Feb 2006, Russell King wrote:
>
> It fixes that exact oops but only by preventing us getting that far
> due to another oops.

Thanks for walking through it.

> We call cpu_up, which sends a CPU_UP_PREPARE event. This causes the
> migration thread to be spawned, and rq->migration_thread to be set.
>
> Eventually, we call the architecture __cpu_up(), which ends up
> calling init_idle(). Due to this patch, init_idle() then NULLs out
> rq->migration_thread.

Fair enough.

That actually does point to a real bug, I think. The fact that we
apparently now survive the fact that we spawn the migration thread before
the idle thread works looks like it just hides the bug that we shouldn't
do that. Ingo?

Oh, well. For now the fix is clearly to just leave things well alone, and
just have cpu_possible_map initialized early enough.

Linus