2005-01-24 02:40:58

by Benjamin Herrenschmidt

[permalink] [raw]
Subject: Problem with cpu_rest() change

Hi Ingo !

Could you explain me precisely what is the race you are fixing by adding
local_irq_disable() to rest_init() ?

This patch is causing lockups on boot on various ppc machines. I think
i've found at least one possible reason for that in the ppc cpu_idle()
code, which may not re-enable interrupts in some cases when
need_resched() is not set, assuming they were enabled on entry. However,
I'm wondering precisely what exact race you are trying to fix as my fix
would cause IRQs to be re-enabled before the call to schedule() when
need_resched() is not set, which goes against the comment you added to
rest_init() about letting them be re-enable by schedule() itself...

Ben.



2005-01-25 09:02:11

by Ingo Molnar

[permalink] [raw]
Subject: Re: Problem with cpu_rest() change


* Benjamin Herrenschmidt <[email protected]> wrote:

> Hi Ingo !
>
> Could you explain me precisely what is the race you are fixing by
> adding local_irq_disable() to rest_init() ?

it can be bad for the idle task to hold the BKL and to have preemption
enabled - in such a situation the scheduler will get confused if an
interrupt triggers a forced preemption in that small window. But it's
not necessary to keep IRQs disabled after the BKL has been dropped. In
fact i think IRQ-disabling doesnt have to be done at all, the patch
below ought to solve this scenario equally well, and should solve the
PPC side-effects too.

Tested ontop of 2.6.11-rc2 on x86 PREEMPT+SMP and PREEMPT+!SMP (which
IIRC were the config variants that triggered the original problem), on
an SMP and on a UP system.

Ingo

Signed-off-by: Ingo Molnar <[email protected]>

--- linux/init/main.c.orig
+++ linux/init/main.c
@@ -373,14 +373,9 @@ static void noinline rest_init(void)
{
kernel_thread(init, NULL, CLONE_FS | CLONE_SIGHAND);
numa_default_policy();
- /*
- * Re-enable preemption but disable interrupts to make sure
- * we dont get preempted until we schedule() in cpu_idle().
- */
- local_irq_disable();
- preempt_enable_no_resched();
unlock_kernel();
- cpu_idle();
+ preempt_enable_no_resched();
+ cpu_idle();
}

/* Check for early params. */

2005-01-25 23:54:10

by Benjamin Herrenschmidt

[permalink] [raw]
Subject: Re: Problem with cpu_rest() change

On Tue, 2005-01-25 at 10:01 +0100, Ingo Molnar wrote:

> it can be bad for the idle task to hold the BKL and to have preemption
> enabled - in such a situation the scheduler will get confused if an
> interrupt triggers a forced preemption in that small window. But it's
> not necessary to keep IRQs disabled after the BKL has been dropped. In
> fact i think IRQ-disabling doesnt have to be done at all, the patch
> below ought to solve this scenario equally well, and should solve the
> PPC side-effects too.
>
> Tested ontop of 2.6.11-rc2 on x86 PREEMPT+SMP and PREEMPT+!SMP (which
> IIRC were the config variants that triggered the original problem), on
> an SMP and on a UP system.

Excellent, thanks.

Ben.


2005-01-26 05:45:52

by Kumar Gala

[permalink] [raw]
Subject: Re: Problem with cpu_rest() change

Will these changes cause us to back out the patch already made to
arch/ppc/kernel/idle.c for systems that did not support powersavings?

- kumar

On Jan 25, 2005, at 5:49 PM, Benjamin Herrenschmidt wrote:

> On Tue, 2005-01-25 at 10:01 +0100, Ingo Molnar wrote:
>
> > it can be bad for the idle task to hold the BKL and to have
> preemption
> > enabled - in such a situation the scheduler will get confused if an
> > interrupt triggers a forced preemption in that small window. But
> it's
> > not necessary to keep IRQs disabled after the BKL has been dropped.
> In
> > fact i think IRQ-disabling doesnt have to be done at all, the patch
> > below ought to solve this scenario equally well, and should solve
> the
> > PPC side-effects too.
> >
> > Tested ontop of 2.6.11-rc2 on x86 PREEMPT+SMP and PREEMPT+!SMP (which
> > IIRC were the config variants that triggered the original problem),
> on
> > an SMP and on a UP system.
>
> Excellent, thanks.
>
> Ben.
>
>
>
> -
> To unsubscribe from this list: send the line "unsubscribe
> linux-kernel" in
> the body of a message to [email protected]
> More majordomo info at? http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at? http://www.tux.org/lkml/

2005-01-26 13:46:16

by Benjamin Herrenschmidt

[permalink] [raw]
Subject: Re: Problem with cpu_rest() change

On Tue, 2005-01-25 at 23:45 -0600, Kumar Gala wrote:
> Will these changes cause us to back out the patch already made to
> arch/ppc/kernel/idle.c for systems that did not support powersavings?

Did it already make it upstream ? Ingo's fix should make our workarounds
unnecessary indeed...

Ben.


2005-01-26 14:13:18

by Kumar Gala

[permalink] [raw]
Subject: Re: Problem with cpu_rest() change

Ok, will send a patch to back out the change that Linus already
accepted.

- kumar

On Jan 26, 2005, at 7:44 AM, Benjamin Herrenschmidt wrote:

> On Tue, 2005-01-25 at 23:45 -0600, Kumar Gala wrote:
> > Will these changes cause us to back out the patch already made to
> > arch/ppc/kernel/idle.c for systems that did not support powersavings?
>
> Did it already make it upstream ? Ingo's fix should make our
> workarounds
> unnecessary indeed...
>
> Ben.